S920X20做整机AC测试,BMC版本是2025.12,跑了数圈以后,出现BMC重启的异常告警。跑了2天,共出现2次,非偶然现象,如果持续跑,大概率还会出现。
The iBMC1036 |2026-04-17 05:18:34 |Normal |0x1A00006F |Asserted |The openUBMC is reset, with the cause code of 0x10.
555 |2026-04-16 11:04:30 |Normal |0x1A00006F |Asserted |The openUBMC is reset, with the cause code of 0x10.
通过排查app.log日志发现:
2026-04-17 05:15:47.281590 killall busybox_x
2026-04-17 05:15:47.341601 Remove cdev_veth_drv successfully.
2026-04-17 05:15:47.445149 Remove bmc_tpcm_drv successfully.
2026-04-17 05:15:48.322566 Remove bmc_veth_drv successfully.
2026-04-17 05:15:51.085586 Remove bmc_cdev_drv successfully.
2026-04-17 05:15:51.310982 Remove bmc_edma_drv successfully.
rmmod: can’t unload module ‘bmc_hisport_drv’: Resource temporarily unavailable
rmmod: can’t unload module ‘bmc_hisport_drv’: Resource temporarily unavailable
….(重复N次)
rmmod: can’t unload module ‘bmc_hisport_drv’: Resource temporarily unavailable
2026-04-17 05:16:50.242275 Remove bmc_hisport_drv successfully.
bios ERROR: bios file not exist
1970-01-01 00:00:17.328023 metric_analyzer NOTICE: File size is equal to given size
1970-01-01 00:00:20.620811 file_transfer NOTICE: get_dbus_connection:1905 > Dbus (session bus) connected
…(省略重启日志)
2026-04-17 05:18:35.665326 event NOTICE: software_event.lua(158): [BMC_BMC.BMCRestartWithCause_[“0x10”]___] generate an event [assert].(上报重启异常事件)
通过排查kbox_info日志发现,存在kbox restart事件,可能是因为skynet 进程不断崩溃并产生 core dump,导致看门狗超时触发硬件重启:
[20260417051230][I2C-03-RT<-11|-11>] M<0x5c|0x1|0x0|1|10> L<1|3|0> C<1|0|3> E<0x801|0x0|0x0|0x1000> I<0x0|0x510|0x0>
[20260417051232][I2C-03-RT<-11|-11>] M<0x5d|0x1|0x0|1|10> L<1|3|0> C<1|0|3> E<0x801|0x0|0x0|0x1000> I<0x0|0x510|0x0>
[20260417051233][I2C-03-RT<-11|-11>] M<0x5c|0x1|0x0|1|10> L<1|3|0> C<1|0|3> E<0x801|0x0|0x0|0x1000> I<0x0|0x510|0x0>
[20260417051234][I2C-03-RT<-11|-11>] M<0x5d|0x1|0x0|1|10> L<1|2|0> C<1|0|2> E<0x801|0x0|0x0|0x1000> I<0x0|0x510|0x0>
[20260417051234][I2C-03-RT<-11|-11>] M<0x5d|0x1|0x0|1|10> L<1|3|0> C<1|0|3> E<0x801|0x0|0x0|0x1000> I<0x0|0x510|0x0>
[20260417051235][I2C-03-RT<-11|-11>] M<0x5c|0x1|0x0|1|10> L<1|2|0> C<1|0|2> E<0x801|0x0|0x0|0x1000> I<0x0|0x510|0x0>
[20260417051238]coredump:do_coredump 1029,task skynet 6263 5225:end
[20260417051238][I2C-03-RT<-11|-11>] M<0x5c|0x1|0x0|1|10> L<1|2|0> C<1|0|2> E<0x801|0x0|0x0|0x1000> I<0x0|0x510|0x0>
[20260417051239][I2C-03-RT<-11|-11>] M<0x5c|0x1|0x0|1|10> L<1|3|0> C<1|0|3> E<0x801|0x0|0x0|0x1000> I<0x0|0x510|0x0>
[20260417051244]EXT4-fs (mmcblk0gp1): mounted filesystem 5669ad5d-31d8-4027-8a79-916d6e85c70e ro without journal. Opts: noload
[20260417051244]EXT4-fs (mmcblk0gp1): unmounting filesystem 5669ad5d-31d8-4027-8a79-916d6e85c70e.
[20260417051246]76,2026-04-17 05:12:46,6435,msg_sec_module_cmd_cpy_to_user,523,msg copy_to_user user data error 8
[20260417051247]77,2026-04-17 05:12:46,6435,msg_scm3_ioctl,613,msg cmd(0xc0047301) handle error!
[20260417051309]coredump:do_coredump 767,task skynet 6607 6424:start
[20260417051309]78,2026-04-17 05:13:09,6512,ipmb_cmd_check_data_readable,1285,ipmb check read was interrupted, ret = -4
[20260417051309]coredump:coredump_wait 543,task skynet 6607 6424:star wait
[20260417051309][idump/coredump] tgid=6424 task[6607 skynet], out of the stack, dump size:5f5e000
[20260417051318][I2C-08-RT<-5|-5>] M<0x21|0x1|0x0|1|10> L<1|1|0> C<1|0|1> E<0x801|0x0|0x0|0x1> I<0x0|0x710|0x0>
[20260417051326]coredump:do_coredump 1029,task skynet 6607 6424:end
[20260417051334]EXT4-fs (mmcblk0gp1): mounted filesystem 5669ad5d-31d8-4027-8a79-916d6e85c70e ro without journal. Opts: noload
[20260417051334]EXT4-fs (mmcblk0gp1): unmounting filesystem 5669ad5d-31d8-4027-8a79-916d6e85c70e.
[20260417051336]79,2026-04-17 05:13:36,6759,msg_sec_module_cmd_cpy_to_user,523,msg copy_to_user user data error 8
[20260417051336]80,2026-04-17 05:13:36,6759,msg_scm3_ioctl,613,msg cmd(0xc0047301) handle error!
[20260417051355]coredump:do_coredump 767,task skynet 6871 6737:start
[20260417051355]81,2026-04-17 05:13:55,6827,ipmb_cmd_check_data_readable,1285,ipmb check read was interrupted, ret = -4
[20260417051355]coredump:coredump_wait 543,task skynet 6871 6737:star wait
[20260417051355][idump/coredump] tgid=6737 task[6871 skynet], out of the stack, dump size:5f5e000
[20260417051412]coredump:do_coredump 1029,task skynet 6871 6737:end
[20260417051418][I2C-08-RT<-5|-5>] M<0x21|0x1|0x0|1|10> L<1|1|0> C<1|0|1> E<0x801|0x0|0x0|0x1> I<0x0|0x710|0x0>
[20260417051419]82,2026-04-17 05:14:19,7017,msg_sec_module_cmd_cpy_to_user,523,msg copy_to_user user data error 8
[20260417051419]83,2026-04-17 05:14:19,7017,msg_scm3_ioctl,613,msg cmd(0xc0047301) handle error!
[20260417051419]EXT4-fs (mmcblk0gp1): mounted filesystem 5669ad5d-31d8-4027-8a79-916d6e85c70e ro without journal. Opts: noload
[20260417051420]EXT4-fs (mmcblk0gp1): unmounting filesystem 5669ad5d-31d8-4027-8a79-916d6e85c70e.
[20260417051440]coredump:do_coredump 767,task skynet 7136 7008:start
[20260417051440]coredump:coredump_wait 543,task skynet 7136 7008:star wait
[20260417051440]84,2026-04-17 05:14:40,7099,ipmb_cmd_check_data_readable,1285,ipmb check read was interrupted, ret = -4
[20260417051440][idump/coredump] tgid=7008 task[7136 skynet], out of the stack, dump size:5f5e000
[20260417051458]coredump:do_coredump 1029,task skynet 7136 7008:end
[20260417051505]85,2026-04-17 05:15:05,7298,msg_sec_module_cmd_cpy_to_user,523,msg copy_to_user user data error 8
[20260417051505]86,2026-04-17 05:15:05,7298,msg_scm3_ioctl,613,msg cmd(0xc0047301) handle error!
[20260417051506]EXT4-fs (mmcblk0gp1): mounted filesystem 5669ad5d-31d8-4027-8a79-916d6e85c70e ro without journal. Opts: noload
[20260417051506]EXT4-fs (mmcblk0gp1): unmounting filesystem 5669ad5d-31d8-4027-8a79-916d6e85c70e.
[20260417051518][I2C-08-RT<-5|-5>] M<0x21|0x1|0x0|1|10> L<1|1|0> C<1|0|1> E<0x801|0x0|0x0|0x1> I<0x0|0x710|0x0>
[20260417051527]coredump:do_coredump 767,task skynet 7421 7266:start
[20260417051527]coredump:coredump_wait 543,task skynet 7421 7266:star wait
[20260417051527]87,2026-04-17 05:15:27,7376,ipmb_cmd_check_data_readable,1285,ipmb check read was interrupted, ret = -4
[20260417051527][idump/coredump] tgid=7266 task[7421 skynet], out of the stack, dump size:5f5e000
[20260417051544]coredump:do_coredump 1029,task skynet 7421 7266:end
[20260417051547]88,2026-04-17 05:15:47,5678,gpio_cmd_get_int_event,1321,gpio_get_int_event errs(-4)
[20260417051547]rsyslogd (6301) used greatest stack depth: 10576 bytes left
[20260417051553]89,2026-04-17 05:15:53,7609,msg_sec_module_cmd_cpy_to_user,523,msg copy_to_user user data error 8
[20260417051553]90,2026-04-17 05:15:53,7609,msg_scm3_ioctl,613,msg cmd(0xc0047301) handle error!
[20260417051554]EXT4-fs (mmcblk0gp1): mounted filesystem 5669ad5d-31d8-4027-8a79-916d6e85c70e ro without journal. Opts: noload
[20260417051554]EXT4-fs (mmcblk0gp1): unmounting filesystem 5669ad5d-31d8-4027-8a79-916d6e85c70e.
[20260417051601]coredump:do_coredump 767,task skynet 5015 5015:start
[20260417051601]coredump:coredump_wait 543,task skynet 5015 5015:star wait
[20260417051601][idump/coredump] tgid=5015 task[5015 skynet], out of the stack, dump size:5f5e000
[20260417051607]coredump:do_coredump 767,task skynet 7563 7563:start
[20260417051607]coredump:coredump_wait 543,task skynet 7563 7563:star wait
[20260417051607][idump/coredump] tgid=7563 task[7563 skynet], out of the stack, dump size:5f5e000
[20260417051617]coredump:do_coredump 1029,task skynet 5015 5015:end
[20260417051619][I2C-08-RT<-5|-5>] M<0x21|0x1|0x0|1|10> L<1|1|0> C<1|0|1> E<0x801|0x0|0x0|0x1> I<0x0|0x710|0x0>
[20260417051624]coredump:do_coredump 1029,task skynet 7563 7563:end
[20260417051634]coredump:do_coredump 767,task skynet 5208 5208:start
[20260417051634]coredump:coredump_wait 543,task skynet 5208 5208:star wait
[20260417051634][idump/coredump] tgid=5208 task[5208 skynet], out of the stack, dump size:5f5e000
[20260417051649]coredump:do_coredump 1029,task skynet 5208 5208:end
[20260417051650]usb device-0: detach driver
[20260417051650]usb device-0: gadget disconnect
[20260417051650]usb device-0: reset configuration. gadget state <0>
[20260417051650]usb device-0: gadget unbind
[20260417051650]dwc3 8800000.dwc3: request ffff0f05703a6540 was not queued to ep1in
[20260417051650]dwc3 8800000.dwc3: request ffff0f05703a60c0 was not queued to ep2in
[20260417051650]hisport_cdev: 1342, listen thread end, ret=0
[20260417051652]EXT4-fs (mmcblk0gp3): unmounting filesystem 8109ea91-1457-4c48-81fd-ae0e36fee24d.
[20260417051652]EXT4-fs (loop0): unmounting filesystem 3fe8b642-0f76-409a-9d36-5dc2c195ac65.
[20260417051652]systemd-journald[1273]: Received client request to relinquish /var/log/journal/db12d62a826d4fa9ab8262511bf7b714 access.
[20260417051653]EXT4-fs (mmcblk0p9): unmounting filesystem cc4cc981-2a99-4410-b84b-098c954d89b1.
[20260417051654]systemd-shutdown[1]: Syncing filesystems and block devices.
[20260417051654]systemd-shutdown[1]: Sending SIGTERM to remaining processes…
[20260417051654]systemd-journald[1273]: Received SIGTERM from PID 1 (systemd-shutdow).
[20260417051655]systemd-shutdown[1]: Sending SIGKILL to remaining processes…
[20260417051655]systemd-shutdown[1]: Unmounting file systems.
20260417051655[8798]: Remounting ‘/data/trust’ read-only in with options ‘(null)’.
[20260417051655]EXT4-fs (mmcblk0p7): re-mounted 557ef19a-2dd2-4151-aae0-6964c112a9f6 ro. Opts: (null)
20260417051655[8799]: Unmounting ‘/data/trust’.
[20260417051655]EXT4-fs (mmcblk0p7): unmounting filesystem 557ef19a-2dd2-4151-aae0-6964c112a9f6.
20260417051655[8799]: Unmounted ‘/data/trust’.
20260417051655[8800]: Remounting ‘/data’ read-only in with options ‘(null)’.
20260417051655[8800]: Failed to remount ‘/data’ read-only: Device or resource busy
20260417051655[8801]: Unmounting ‘/data’.
20260417051655[8801]: Failed to unmount /data: Device or resource busy
20260417051655[8802]: Remounting ‘/data’ read-only in with options ‘(null)’.
20260417051655[8802]: Failed to remount ‘/data’ read-only: Device or resource busy
20260417051655[8803]: Unmounting ‘/data’.
20260417051655[8803]: Failed to unmount /data: Device or resource busy
[20260417051655]systemd-shutdown[1]: Not all file systems unmounted, 1 left.
[20260417051655]systemd-shutdown[1]: Deactivating swaps.
[20260417051655]systemd-shutdown[1]: All swaps deactivated.
[20260417051655]systemd-shutdown[1]: Detaching loop devices.
[20260417051655]systemd-shutdown[1]: Detaching loopback /dev/loop0.
[20260417051655]systemd-shutdown[1]: All loop devices detached.
[20260417051655]systemd-shutdown[1]: Stopping MD devices.
[20260417051655]systemd-shutdown[1]: All MD devices stopped.
[20260417051655]systemd-shutdown[1]: Detaching DM devices.
[20260417051655]systemd-shutdown[1]: Detaching DM /dev/dm-0 (253:0).
[20260417051655]device-mapper: ioctl: systemd-shutdow[1]: dm-0 (mmcblksp) is removed successfully
[20260417051655]systemd-shutdown[1]: All DM devices detached.
[20260417051655]systemd-shutdown[1]: Unmounting file systems.
20260417051655[8808]: Remounting ‘/data’ read-only in with options ‘(null)’.
[20260417051655]EXT4-fs (mmcblk0p1): re-mounted 35d433be-5b11-4450-a419-b03ff6d9daba ro. Opts: (null)
20260417051655[8809]: Unmounting ‘/data’.
[20260417051655]EXT4-fs (mmcblk0p1): unmounting filesystem 35d433be-5b11-4450-a419-b03ff6d9daba.
20260417051655[8809]: Unmounted ‘/data’.
[20260417051655]systemd-shutdown[1]: All filesystems unmounted.
[20260417051655]systemd-shutdown[1]: All filesystems, swaps, loop devices, MD devices and DM devices detached.
[20260417051655]systemd-shutdown[1]: Syncing filesystems and block devices.
[20260417051655]systemd-shutdown[1]: Rebooting.
[20260417051655]kvm: exiting hardware virtualization
[20260417051655]Kernel Offset: 0x3cbfe1600000
[20260417051655]Module Offset: 0x3cbf6ce8d000
[20260417051655]Linear Region Offset: 0xf0540000000
[20260417051655]xhci-hcd xhci-hcd.1.auto: Host halt failed, -110
[20260417051655]xhci-hcd xhci-hcd.0.auto: Host halt failed, -110
[20260417051656]91,2026-04-17 05:16:56,1,hi_mci_wait_card_idle,179,Busy; trying anyway
[20260417051657]reboot: Restarting system
[20260417051657]KBOX: [machine_restart:238]
------------------kbox restart:[ 0.586646]----------------
KBOX: load OK
tasklock: sched_ctrl created.
registered taskstats version 1
Loading compiled-in X.509 certificates
page_owner is disabled