BMC service fault during powercycle testing: host did not power on after shutdown

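After about 1700 cycles of powercycle testing, the host did not power back on after the last shutdown. In the BMC framework.log, entries similar to "hardware: A message from [ :00000000 ] to [ :0000000c ] maybe in an endless loop (version = 11498889)" appear, and after them the BMC services restart. The BMC system event log records "iBMC is reset and started."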

Could an expert take a look at what caused this? The relevant part of framework.log is below.

2025-12-08 04:37:29.902889 persistence NOTICE: persistence_db_intf.lua(179): persist values, op_type:delete, tname:t_event, type:protect_reset, key: PerId:Event_FanLowerSpeed_0101030501, name:State
2025-12-08 04:37:29.903623 persistence NOTICE: persistence_db_intf.lua(179): saving primary key, op_type:delete, type:protect_reset, key:PerId:Event_FanLowerSpeed_0101030501, name:PerId
2025-12-08 04:37:29.914281 persistence NOTICE: persistence_db_intf.lua(179): finish persist save for table: t_event, persist_type: protect_reset, op_type: delete, data_size: 221
2025-12-08 04:37:30.035853 persistence NOTICE: persistence_db_intf.lua(179): persist values, op_type:delete, tname:t_event, type:protect_power_off, key: PerId:Event_FanLowerSpeed_0101030201, name:MaskStatePowerOff
2025-12-08 04:37:30.037525 persistence NOTICE: persistence_db_intf.lua(179): saving primary key, op_type:delete, type:protect_power_off, key:PerId:Event_FanLowerSpeed_0101030201, name:PerId
2025-12-08 04:37:30.041334 persistence NOTICE: persistence_db_intf.lua(179): finish persist save for table: t_event, persist_type: protect_power_off, op_type: delete, data_size: 114
2025-12-08 04:37:30.058819 persistence NOTICE: persistence_db_intf.lua(179): persist values, op_type:delete, tname:t_event, type:protect_reset, key: PerId:Event_FanLowerSpeed_0101030201, name:SaveReading
2025-12-08 04:37:30.060384 persistence NOTICE: persistence_db_intf.lua(179): persist values, op_type:delete, tname:t_event, type:protect_reset, key: PerId:Event_FanLowerSpeed_0101030201, name:MaskStateReset
2025-12-08 04:37:30.061951 persistence NOTICE: persistence_db_intf.lua(179): persist values, op_type:delete, tname:t_event, type:protect_reset, key: PerId:Event_FanLowerSpeed_0101030201, name:State
2025-12-08 04:37:30.063961 persistence NOTICE: persistence_db_intf.lua(179): saving primary key, op_type:delete, type:protect_reset, key:PerId:Event_FanLowerSpeed_0101030201, name:PerId
2025-12-08 04:37:30.086460 persistence NOTICE: persistence_db_intf.lua(179): finish persist save for table: t_event, persist_type: protect_reset, op_type: delete, data_size: 221
2025-12-08 04:37:30.607258 hwproxy NOTICE: object_manage.lua(114): start to remove objects, path: /bmc/kepler/ObjectGroup/0101030501, position: 0101030501
2025-12-08 04:37:30.612201 hwproxy NOTICE: object_manage.lua(120): start to DeleteObject MidAvg_FanSpeed_0101030501, owner:hwproxy, path:/bmc/kepler/ObjectGroup/0101030501, position:0101030501, class_name:MidAvg
2025-12-08 04:37:30.613536 hwproxy NOTICE: object_manage.lua(120): start to DeleteObject Scanner_Fan_RSpeed_0101030501, owner:hwproxy, path:/bmc/kepler/ObjectGroup/0101030501, position:0101030501, class_name:Scanner
2025-12-08 04:37:30.640208 hwproxy NOTICE: object_manage.lua(120): start to DeleteObject Scanner_PowerGood_Delay_0101030501, owner:hwproxy, path:/bmc/kepler/ObjectGroup/0101030501, position:0101030501, class_name:Scanner
2025-12-08 04:37:30.645730 hwproxy NOTICE: object_manage.lua(120): start to DeleteObject Smc_ExpBoardSMC_0101030501, owner:hwproxy, path:/bmc/kepler/ObjectGroup/0101030501, position:0101030501, class_name:Smc
2025-12-08 04:37:30.650735 hwproxy NOTICE: object_manage.lua(141): remove objects completely, path: /bmc/kepler/ObjectGroup/0101030501, position: 0101030501
2025-12-08 04:37:30.674894 hwproxy NOTICE: app_objects.lua(401): scanner: Scanner_Adc_0_0101 change scan enabled to: 0
2025-12-08 04:37:30.744253 hwproxy NOTICE: object_manage.lua(114): start to remove objects, path: /bmc/kepler/ObjectGroup/0101030201, position: 0101030201
2025-12-08 04:37:30.744726 hwproxy NOTICE: object_manage.lua(120): start to DeleteObject MidAvg_FanSpeed_0101030201, owner:hwproxy, path:/bmc/kepler/ObjectGroup/0101030201, position:0101030201, class_name:MidAvg
2025-12-08 04:37:30.746035 hwproxy NOTICE: object_manage.lua(120): start to DeleteObject Scanner_PowerGood_Delay_0101030201, owner:hwproxy, path:/bmc/kepler/ObjectGroup/0101030201, position:0101030201, class_name:Scanner
2025-12-08 04:37:30.789059 hwproxy NOTICE: object_manage.lua(120): start to DeleteObject Scanner_Fan_RSpeed_0101030201, owner:hwproxy, path:/bmc/kepler/ObjectGroup/0101030201, position:0101030201, class_name:Scanner
2025-12-08 04:37:30.811538 hwproxy NOTICE: object_manage.lua(120): start to DeleteObject Smc_ExpBoardSMC_0101030201, owner:hwproxy, path:/bmc/kepler/ObjectGroup/0101030201, position:0101030201, class_name:Smc
2025-12-08 04:37:30.898485 hwproxy NOTICE: object_manage.lua(141): remove objects completely, path: /bmc/kepler/ObjectGroup/0101030201, position: 0101030201
2025-12-08 04:37:30.998930 hwproxy NOTICE: app_objects.lua(401): scanner: Scanner_Adc_5_0101 change scan enabled to: 0
2025-12-08 04:37:31.039467 hwproxy NOTICE: app_objects.lua(401): scanner: Scanner_Adc_4_0101 change scan enabled to: 0
2025-12-08 04:37:31.047733 hwproxy NOTICE: app_objects.lua(401): scanner: Scanner_Adc_3_0101 change scan enabled to: 0
2025-12-08 04:37:31.057479 hwproxy NOTICE: app_objects.lua(401): scanner: Scanner_Adc_1_0101 change scan enabled to: 0
2025-12-08 04:37:32.920787 hwdiscovery NOTICE: hwcomponent.lua(236): [self-release] name: Connector_M2Connect_2_0101, group position: 01010D, current: 0, previous: 1
2025-12-08 04:37:33.073425 hwdiscovery NOTICE: component.lua(175): finish to cleanup resource tree, position: 01010D
2025-12-08 04:37:33.347458 persistence NOTICE: persistence_db_intf.lua(179): persist values, op_type:delete, tname:t_pcie_dev_info, type:protect_reset, key: GroupPosition:PCIeDevice_01010D, name:DiagnosticFault
2025-12-08 04:37:33.351363 persistence NOTICE: persistence_db_intf.lua(179): persist values, op_type:delete, tname:t_pcie_dev_info, type:protect_reset, key: GroupPosition:PCIeDevice_01010D, name:BandwidthReduction
2025-12-08 04:37:33.352714 persistence NOTICE: persistence_db_intf.lua(179): persist values, op_type:delete, tname:t_pcie_dev_info, type:protect_reset, key: GroupPosition:PCIeDevice_01010D, name:PredictiveFault
2025-12-08 04:37:33.354033 persistence NOTICE: persistence_db_intf.lua(179): persist values, op_type:delete, tname:t_pcie_dev_info, type:protect_reset, key: GroupPosition:PCIeDevice_01010D, name:FatalError
2025-12-08 04:37:33.355433 persistence NOTICE: persistence_db_intf.lua(179): persist values, op_type:delete, tname:t_pcie_dev_info, type:protect_reset, key: GroupPosition:PCIeDevice_01010D, name:CorrectableError
2025-12-08 04:37:33.357963 persistence NOTICE: persistence_db_intf.lua(179): persist values, op_type:delete, tname:t_pcie_dev_info, type:protect_reset, key: GroupPosition:PCIeDevice_01010D, name:LinkSpeedReduced
2025-12-08 04:37:33.359625 persistence NOTICE: persistence_db_intf.lua(179): persist values, op_type:delete, tname:t_pcie_dev_info, type:protect_reset, key: GroupPosition:PCIeDevice_01010D, name:UncorrectableError
2025-12-08 04:37:33.368028 persistence NOTICE: persistence_db_intf.lua(179): persist values, op_type:delete, tname:t_pcie_dev_info, type:protect_reset, key: GroupPosition:PCIeDevice_01010D, name:UCEByBIOS
2025-12-08 04:37:33.372579 persistence NOTICE: persistence_db_intf.lua(179): saving primary key, op_type:delete, type:protect_reset, key:GroupPosition:PCIeDevice_01010D, name:GroupPosition
2025-12-08 04:37:33.390141 persistence NOTICE: persistence_db_intf.lua(179): finish persist save for table: t_pcie_dev_info, persist_type: protect_reset, op_type: delete, data_size: 541
2025-12-08 04:37:33.817780 [:00000024] framework: May overload, message queue length = 1286
2025-12-08 04:37:36.645264 [:00000024] framework: May overload, message queue length = 1193
2025-12-08 04:37:40.165175 [:00000010] bmc_core: May overload, message queue length = 1386
2025-12-08 04:38:35.536797 [:00000000] hardware: A message from [ :00000000 ] to [ :0000000c ] maybe in an endless loop (version = 11498889)
2025-12-08 04:38:26.064982 [:00000000] ras: A message from [ :00000000 ] to [ :0000000d ] maybe in an endless loop (version = 5103941)
2025-12-08 04:38:24.741403 [:00000000] interface: A message from [ :00000000 ] to [ :0000000f ] maybe in an endless loop (version = 12962429)
2025-12-08 04:38:39.399219 [:00000000] remote_service: A message from [ :00000000 ] to [ :0000000c ] maybe in an endless loop (version = 41856753)
2025-12-08 04:38:39.399231 [:00000000] om: A message from [ :00000000 ] to [ :0000000c ] maybe in an endless loop (version = 51707289)
2025-12-08 04:38:37.795479 [:00000000] energy: A message from [ :00000000 ] to [ :0000000c ] maybe in an endless loop (version = 18575595)
2025-12-08 04:38:53.330460 [:00000000] bmc_core: A message from [ :00000000 ] to [ :00000012 ] maybe in an endless loop (version = 58349881)
2025-12-08 04:38:56.008805 [:00000000] interface: A message from [ :00000000 ] to [ :0000000f ] maybe in an endless loop (version = 12962429)
1970-01-01 08:00:21.383592 [:00000002] framework: LAUNCH snlua bootstrap
1970-01-01 08:00:21.433799 [:00000003] framework: LAUNCH snlua launcher
1970-01-01 08:00:21.443903 [:00000004] framework: LAUNCH snlua cdummy
1970-01-01 08:00:21.458911 [:00000005] framework: LAUNCH harbor 0 4
1970-01-01 08:00:21.464160 [:00000006] framework: LAUNCH snlua datacenterd
1970-01-01 08:00:21.484316 [:00000007] framework: LAUNCH snlua service_mgr
1970-01-01 08:00:21.504800 [:00000008] framework: LAUNCH snlua hica/subsys/framework/service/main
1970-01-01 08:00:21.524918 [:00000009] framework: LAUNCH snlua sd_bus
1970-01-01 08:00:21.968711 [:00000002] unknown: LAUNCH snlua bootstrap
1970-01-01 08:00:21.998437 [:00000003] unknown: LAUNCH snlua launcher
1970-01-01 08:00:22.018693 [:00000004] unknown: LAUNCH snlua cdummy
1970-01-01 08:00:22.028803 [:00000005] unknown: LAUNCH harbor 0 4
1970-01-01 08:00:22.038978 [:00000006] unknown: LAUNCH snlua datacenterd
1970-01-01 08:00:22.049091 [:00000007] unknown: LAUNCH snlua service_mgr
1970-01-01 08:00:22.069352 [:00000008] unknown: LAUNCH snlua sync_time/service/main
1970-01-01 08:00:22.090361 [:0000000a] framework: LAUNCH snlua harbor
1970-01-01 08:00:22.130797 [:0000000b] framework: LAUNCH snlua debug_console 0.0.0.0 40010
1970-01-01 08:00:22.133179 [:0000000c] framework: LAUNCH snlua persistence/service/main
1970-01-01 08:00:22.142433 [:0000000d] framework: LAUNCH snlua soctrl/service/main
1970-01-01 08:00:22.152394 [:0000000e] framework: LAUNCH snlua key_mgmt/service/main
1970-01-01 08:00:22.161951 [:0000000f] framework: LAUNCH snlua maca/service/main
1970-01-01 08:00:22.174142 [:00000010] framework: LAUNCH snlua hwproxy/service/main

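This looks like the framework decided that a message (or something similar) was stuck in an endless loop; after that, the framework most likely restarted the services.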
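
To check whether the same pattern precedes the restart in other cycles, a small script can pull the "May overload" and "endless loop" warnings out of framework.log. This is only a sketch: the regular expressions are written against the line format in the excerpt above, the `summarize` function name is made up for illustration, and treating `1970-01-01 ... LAUNCH` lines as the restart marker is an assumption (the clock appears to be unset when the services come back up).

```python
import re
from collections import Counter

# Patterns derived from the log excerpt in this thread (not from any official spec).
OVERLOAD = re.compile(
    r"^(\S+ \S+) \[:(\w+)\] (\w+): May overload, message queue length = (\d+)"
)
LOOP = re.compile(
    r"^(\S+ \S+) \[:(\w+)\] (\w+): A message from \[ :(\w+) \] to \[ :(\w+) \] "
    r"maybe in an endless loop \(version = (\d+)\)"
)


def summarize(lines):
    """Collect overload warnings, endless-loop warnings and restart markers."""
    overload = []          # (timestamp, process name, queue length)
    loops = Counter()      # (process name, destination handle) -> count
    restart_seen = False
    for raw in lines:
        line = raw.strip()
        m = OVERLOAD.match(line)
        if m:
            overload.append((m.group(1), m.group(3), int(m.group(4))))
            continue
        m = LOOP.match(line)
        if m:
            loops[(m.group(3), m.group(5))] += 1
            continue
        # Assumption: LAUNCH lines stamped 1970-01-01 mean the services
        # were started again before the system time was synchronized.
        if line.startswith("1970-01-01") and "LAUNCH" in line:
            restart_seen = True
    return {"overload": overload, "loops": loops, "restart_seen": restart_seen}


if __name__ == "__main__":
    import sys

    # Usage: python scan_framework_log.py framework.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as f:
        result = summarize(f)
    for ts, proc, qlen in result["overload"]:
        print(f"{ts} {proc}: queue length {qlen}")
    for (proc, dest), count in result["loops"].most_common():
        print(f"{proc} -> :{dest}: {count} endless-loop warning(s)")
    print("restart marker seen:", result["restart_seen"])
```

In the excerpt above, most of the endless-loop warnings target handle `:0000000c`, and the queue-length warnings come first, so it may be worth comparing the queue lengths and destination handles across several failing and passing cycles.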