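Problem Description
During DC stability testing, the RAID card temperature and the Disks Temp temperature intermittently cannot be read, and the NIC temperature cannot be read either.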
Environment
- Operating system: [e.g. Ubuntu 24.04]
- Software version: OpenUBMC2512
- Hardware configuration: [e.g. CPU, memory]
Steps to Reproduce
- Run the Power Cycle command from the OS.
- Wait for the system to finish booting into the OS.
- Query the sensor readings.
Expected Result
No sensor anomalies.
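Actual Result
On the second power cycle, the RAID card temperature and the Disks Temp temperature cannot be read, and an alarm reports that the NIC temperature cannot be read.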
A problem on the link is suspected.
The app log contains the following entries, including get_pd_list failed ret = 4098:
2026-02-26 11:24:28.036599 hardware ERROR: hs_misc.c(678): process_histore_cmd: process_histore_cmd return error 0xffffffff
2026-02-26 11:24:28.036650 hardware ERROR: hs_pd.c(102): histore_get_ctrl_pd_list get_pd_list failed ret = 4098
2026-02-26 11:24:28.038104 storage ERROR: controller_object.lua(947): get_ctrl_pd_list failed and return ./opt/bmc/apps/storage/lualib/sml/init.lua:79: 4098
2026-02-26 11:24:38.997641 general_hardware NOTICE: fructl_handler.lua(76): get_power_state: system[1] get power power ON
2026-02-26 11:24:53.073144 hardware ERROR: hs_misc.c(678): process_histore_cmd: process_histore_cmd return error 0xffffffff
2026-02-26 11:24:53.073194 hardware ERROR: hs_ctrl.c(1308): get_bbu_info failed, ret = 4098
2026-02-26 11:24:58.200837 unknown_service NOTICE: write_service.lua(33): mctp_write_service failed, count: 4, err: write failed, fd = 23, Timer expired
2026-02-26 11:25:18.093377 hardware ERROR: hs_misc.c(678): process_histore_cmd: process_histore_cmd return error 0xffffffff
2026-02-26 11:25:18.093538 hardware ERROR: hs_pd.c(102): histore_get_ctrl_pd_list get_pd_list failed ret = 4098
2026-02-26 11:25:18.094814 storage ERROR: controller_object.lua(947): get_ctrl_pd_list failed and return ./opt/bmc/apps/storage/lualib/sml/init.lua:79: 4098
2026-02-26 11:25:19.219253 general_hardware NOTICE: fructl_handler.lua(76): get_power_state: system[1] get power power ON
2026-02-26 11:25:28.402880 unknown_service NOTICE: write_service.lua(33): mctp_write_service failed, count: 3, err: write failed, fd = 23, Timer expired [repeated 5 times in 297s from 2026-02-26 11:19:15.480722 to 2026-02-26 11:24:12.840341][flush]
2026-02-26 11:25:32.291580 bmc_network NOTICE: dhcp_process.lua(1255): the field composed by dhcpv6 vendor class is empty string
2026-02-26 11:25:32.295989 bmc_network NOTICE: dhcp_process.lua(1016): eth2: no available prefix info, send_rs_and_parse_ra now
2026-02-26 11:25:43.111432 hardware ERROR: hs_misc.c(678): process_histore_cmd: process_histore_cmd return error 0xffffffff
2026-02-26 11:25:43.111484 hardware ERROR: hs_ctrl.c(385): histore_get_ctrl_info get_ctrl_info failed, ret = 4098
2026-02-26 11:25:59.287451 general_hardware NOTICE: fructl_handler.lua(76): get_power_state: system[1] get power power ON
2026-02-26 11:26:08.135842 hardware ERROR: hs_misc.c(678): process_histore_cmd: process_histore_cmd return error 0xffffffff
2026-02-26 11:26:33.160625 hardware ERROR: hs_misc.c(678): process_histore_cmd: process_histore_cmd return error 0xffffffff
2026-02-26 11:26:33.160674 hardware ERROR: hs_pd.c(102): histore_get_ctrl_pd_list get_pd_list failed ret = 4098
2026-02-26 11:26:33.161477 storage ERROR: controller_object.lua(947): get_ctrl_pd_list failed and return ./opt/bmc/apps/storage/lualib/sml/init.lua:79: 4098
2026-02-26 11:26:39.347815 general_hardware NOTICE: fructl_handler.lua(76): get_power_state: system[1] get power power ON
2026-02-26 11:26:46.284816 bmc_network NOTICE: dhcp_process.lua(1255): the field composed by dhcpv6 vendor class is empty string
2026-02-26 11:26:46.290582 bmc_network NOTICE: dhcp_process.lua(1016): eth2: no available prefix info, send_rs_and_parse_ra now
2026-02-26 11:26:58.186340 hardware ERROR: hs_misc.c(678): process_histore_cmd: process_histore_cmd return error 0xffffffff
2026-02-26 11:26:58.186392 hardware ERROR: hs_ctrl.c(1308): get_bbu_info failed, ret = 4098
2026-02-26 11:27:04.201009 unknown_service NOTICE: write_service.lua(33): mctp_write_service failed, count: 1, err: write failed, fd = 23, Timer expired [repeated 45 times in 308s from 2026-02-26 11:21:56.760579 to 2026-02-26 11:27:04.201009]
2026-02-26 11:27:19.562302 general_hardware NOTICE: fructl_handler.lua(76): get_power_state: system[1] get power power ON
2026-02-26 11:27:23.210398 hardware ERROR: hs_misc.c(678): process_histore_cmd: process_histore_cmd return error 0xffffffff
2026-02-26 11:27:23.211368 storage ERROR: controller_object.lua(1127): get_ctrl_ld_list failed and return ./opt/bmc/apps/storage/lualib/sml/init.lua:79: 4098
2026-02-26 11:27:24.412767 event NOTICE: events.lua(106): System minor count change 0 to 1 by [Event_TempFail_0101010103].
2026-02-26 11:27:24.414505 event NOTICE: events.lua(127): System Health change Normal to Minor.
2026-02-26 11:27:24.434997 event NOTICE: hardware_event.lua(570): Event_TempFail_0101010103|{"value":[1],"source":{"properties":[{"Interface":"bmc.kepler.Systems.NetworkAdapter","Path":"/bmc/kepler/Systems/1/NetworkAdapters/NetworkAdapter_1_0101010103","Service":"bmc.kepler.network_adapter","Property":"TemperatureStatus"}]},"type":"synchronization"}
2026-02-26 11:27:24.548909 event NOTICE: abstract_event.lua(241): [Event_TempFail_0101010103] generate an event [assert] while Reading change to [1].
2026-02-26 11:27:24.555878 redfish NOTICE: alarm.lua(600): received a event[6-0].
2026-02-26 11:27:24.557276 event_policy NOTICE: synchronizer.lua(252): received a event[6-0].
2026-02-26 11:27:48.240541 hardware ERROR: hs_misc.c(678): process_histore_cmd: process_histore_cmd return error 0xffffffff
2026-02-26 11:27:48.240588 hardware ERROR: hs_pd.c(102): histore_get_ctrl_pd_list get_pd_list failed ret = 4098
2026-02-26 11:27:48.241422 storage ERROR: controller_object.lua(947): get_ctrl_pd_list failed and return ./opt/bmc/apps/storage/lualib/sml/init.lua:79: 4098
2026-02-26 11:27:59.630960 general_hardware NOTICE: fructl_handler.lua(76): get_power_state: system[1] get power power ON
2026-02-26 11:27:59.640350 unknown_service NOTICE: write_service.lua(33): mctp_write_service failed, count: 2, err: write failed, fd = 23, Timer expired [repeated 12 times in 312s from 2026-02-26 11:22:47.160320 to 2026-02-26 11:27:59.640350]
2026-02-26 11:28:00.148601 bmc_network NOTICE: dhcp_process.lua(1255): the field composed by dhcpv6 vendor class is empty string
2026-02-26 11:28:00.149933 bmc_network NOTICE: dhcp_process.lua(1016): eth2: no available prefix info, send_rs_and_parse_ra now
2026-02-26 11:28:13.260769 hardware ERROR: hs_misc.c(678): process_histore_cmd: process_histore_cmd return error 0xffffffff
2026-02-26 11:28:13.260822 hardware ERROR: hs_ctrl.c(385): histore_get_ctrl_info get_ctrl_info failed, ret = 4098
2026-02-26 11:28:38.282263 hardware ERROR: hs_misc.c(678): process_histore_cmd: process_histore_cmd return error 0xffffffff
2026-02-26 11:28:38.282318 hardware ERROR: hs_ctrl.c(497): histore_get_ctrl_phy_err_count failed, CtrlId = 0, return 0x1002
2026-02-26 11:28:39.783547 general_hardware NOTICE: fructl_handler.lua(76): get_power_state: system[1] get power power ON
2026-02-26 11:29:03.300628 hardware ERROR: hs_misc.c(678): process_histore_cmd: process_histore_cmd return error 0xffffffff
2026-02-26 11:29:03.300683 hardware ERROR: hs_ctrl.c(1308): get_bbu_info failed, ret = 4098
2026-02-26 11:29:09.324152 sensor NOTICE: sensor_instance.lua(1170): sensor [ThresholdSensor_CPU1TADVFS_010101] threshold capability can not be supported. [repeated 33 times in 304s from 2026-02-26 11:24:05.627951 to 2026-02-26 11:29:09.324152]
2026-02-26 11:29:09.324435 sensor NOTICE: sensor_instance.lua(1174): sensor [ThresholdSensor_CPU1TADVFS_010101] hysteresis capability can not be supported. [repeated 33 times in 304s from 2026-02-26 11:24:05.628231 to 2026-02-26 11:29:09.324435]
2026-02-26 11:29:14.163404 bmc_network NOTICE: dhcp_process.lua(1255): the field composed by dhcpv6 vendor class is empty string
2026-02-26 11:29:14.163908 bmc_network NOTICE: dhcp_process.lua(1016): eth2: no available prefix info, send_rs_and_parse_ra now
2026-02-26 11:29:19.821470 general_hardware NOTICE: fructl_handler.lua(76): get_power_state: system[1] get power power ON
2026-02-26 11:29:25.320651 unknown_service NOTICE: write_service.lua(33): mctp_write_service failed, count: 3, err: write failed, fd = 23, Timer expired
2026-02-26 11:29:28.320621 hardware ERROR: hs_misc.c(678): process_histore_cmd: process_histore_cmd return error 0xffffffff
2026-02-26 11:29:28.320674 hardware ERROR: hs_pd.c(102): histore_get_ctrl_pd_list get_pd_list failed ret = 4098
2026-02-26 11:29:28.321695 storage ERROR: controller_object.lua(947): get_ctrl_pd_list failed and return ./opt/bmc/apps/storage/lualib/sml/init.lua:79: 4098
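To make the excerpt above easier to triage, the ERROR entries can be tallied by source location. The following is a minimal sketch, not part of openUBMC: the regular expression is inferred only from the line layout visible in this excerpt (timestamp, module, level, file(line), message), and the function name count_errors and the SAMPLE list are hypothetical. It also checks that the decimal code 4098 printed by most call sites is the same value as the 0x1002 printed by histore_get_ctrl_phy_err_count.

```python
import re
from collections import Counter

# Layout inferred from the excerpt above (an assumption, not a documented format):
# "<date> <time> <module> <LEVEL>: <file>(<line>): <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{6}) "
    r"(?P<module>\S+) (?P<level>[A-Z]+): "
    r"(?P<src>[\w.]+)\((?P<lineno>\d+)\): (?P<msg>.*)$"
)


def count_errors(lines):
    """Count ERROR entries per (module, source file, source line)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and match.group("level") == "ERROR":
            key = (
                match.group("module"),
                match.group("src"),
                int(match.group("lineno")),
            )
            counts[key] += 1
    return counts


# A few lines copied from the app log above.
SAMPLE = [
    "2026-02-26 11:24:28.036599 hardware ERROR: hs_misc.c(678): process_histore_cmd: process_histore_cmd return error 0xffffffff",
    "2026-02-26 11:24:28.036650 hardware ERROR: hs_pd.c(102): histore_get_ctrl_pd_list get_pd_list failed ret = 4098",
    "2026-02-26 11:24:38.997641 general_hardware NOTICE: fructl_handler.lua(76): get_power_state: system[1] get power power ON",
    "2026-02-26 11:24:53.073144 hardware ERROR: hs_misc.c(678): process_histore_cmd: process_histore_cmd return error 0xffffffff",
    "2026-02-26 11:24:53.073194 hardware ERROR: hs_ctrl.c(1308): get_bbu_info failed, ret = 4098",
]

if __name__ == "__main__":
    for (module, src, lineno), n in count_errors(SAMPLE).most_common():
        print(f"{n:3d}  {module}  {src}({lineno})")
    # The decimal and hexadecimal return codes in the log are the same value.
    print(hex(4098))
```

Run against the full attached log, a tally like this would show whether every failing call is preceded by the same process_histore_cmd error at hs_misc.c(678), which is what the excerpt suggests.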
The framework log shows hardware: org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
2026-02-26 11:10:17.163508 [:0000000f] hardware: org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
2026-02-26 11:10:42.192742 [:0000000f] hardware: org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
2026-02-26 11:11:07.217134 [:0000000f] hardware: org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
2026-02-26 11:11:32.242057 [:0000000f] hardware: org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
2026-02-26 11:11:57.271235 [:0000000f] hardware: org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
2026-02-26 11:12:22.319395 [:0000000f] hardware: org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
2026-02-26 11:12:47.340984 [:0000000f] hardware: org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
2026-02-26 11:13:12.359893 [:0000000f] hardware: org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
2026-02-26 11:13:37.381646 [:0000000f] hardware: org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
2026-02-26 11:14:02.400865 [:0000000f] hardware: org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
2026-02-26 11:14:27.423956 [:0000000f] hardware: org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
2026-02-26 11:14:52.444198 [:0000000f] hardware: org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
2026-02-26 11:15:17.465403 [:0000000f] hardware: org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
2026-02-26 11:15:42.498017 [:0000000f] hardware: org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
2026-02-26 11:16:07.523149 [:0000000f] hardware: org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
2026-02-26 11:16:32.550662 [:0000000f] hardware: org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
2026-02-26 11:16:57.585729 [:0000000f] hardware: org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
2026-02-26 11:17:22.608165 [:0000000f] hardware: org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
2026-02-26 11:17:47.637705 [:0000000f] hardware: org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
2026-02-26 11:18:12.662965 [:0000000f] hardware: org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
2026-02-26 11:18:37.680945 [:0000000f] hardware: org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
2026-02-26 11:19:02.711165 [:0000000f] hardware: org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
2026-02-26 11:19:27.733222 [:0000000f] hardware: org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
2026-02-26 11:19:52.761220 [:0000000f] hardware: org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
2026-02-26 11:20:17.784662 [:0000000f] hardware: org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
2026-02-26 11:20:42.809641 [:0000000f] hardware: org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
2026-02-26 11:21:07.834126 [:0000000f] hardware: org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
2026-02-26 11:21:32.855939 [:0000000f] hardware: org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
2026-02-26 11:21:57.875472 [:0000000f] hardware: org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
2026-02-26 11:22:22.900779 [:0000000f] hardware: org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
2026-02-26 11:22:47.931385 [:0000000f] hardware: org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
2026-02-26 11:23:12.960325 [:0000000f] hardware: org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
2026-02-26 11:23:37.980885 [:0000000f] hardware: org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
2026-02-26 11:24:03.007825 [:0000000f] hardware: org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
2026-02-26 11:24:28.036950 [:0000000f] hardware: org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
2026-02-26 11:24:53.073477 [:0000000f] hardware: org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
2026-02-26 11:25:18.093376 [:0000000f] hardware: org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
2026-02-26 11:25:43.111768 [:0000000f] hardware: org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
2026-02-26 11:26:08.136177 [:0000000f] hardware: org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
2026-02-26 11:26:33.160967 [:0000000f] hardware: org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
2026-02-26 11:26:58.186878 [:0000000f] hardware: org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
2026-02-26 11:27:23.210744 [:0000000f] hardware: org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
2026-02-26 11:27:48.240897 [:0000000f] hardware: org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
2026-02-26 11:28:13.261200 [:0000000f] hardware: org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
2026-02-26 11:28:38.282647 [:0000000f] hardware: org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
2026-02-26 11:29:03.300973 [:0000000f] hardware: org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
2026-02-26 11:29:28.320969 [:0000000f] hardware: org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
2026-02-26 11:29:53.350925 [:0000000f] hardware: org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
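The NoReply entries above arrive at a very regular cadence, and each one lands within a millisecond of a process_histore_cmd error in the app log (for example 11:24:28.036950 here versus 11:24:28.036599 there). A minimal sketch for measuring that spacing is shown below. The function name noreply_intervals and the sample list are hypothetical, and the message text in the sample is shortened. A steady gap of about 25 seconds would be consistent with a periodic request that runs into a reply timeout each time, but the log alone does not confirm that.

```python
import re
from datetime import datetime

# Timestamp at the start of each framework log line, as seen in the excerpt above.
TS_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{6}) ")
NOREPLY = "org.freedesktop.DBus.Error.NoReply"


def noreply_intervals(lines):
    """Return the gaps, in seconds, between consecutive NoReply entries."""
    stamps = []
    for line in lines:
        match = TS_RE.match(line)
        if match and NOREPLY in line:
            stamps.append(datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f"))
    return [(later - earlier).total_seconds() for earlier, later in zip(stamps, stamps[1:])]


# First three entries from the framework log above (message text shortened).
SUFFIX = " [:0000000f] hardware: org.freedesktop.DBus.Error.NoReply: Did not receive a reply."
SAMPLE_FRAMEWORK = [
    "2026-02-26 11:10:17.163508" + SUFFIX,
    "2026-02-26 11:10:42.192742" + SUFFIX,
    "2026-02-26 11:11:07.217134" + SUFFIX,
]

if __name__ == "__main__":
    for gap in noreply_intervals(SAMPLE_FRAMEWORK):
        print(f"{gap:.3f} s")
```

The framework entries begin at 11:10:17, earlier than the first app log line quoted in this report, so the full attachment is needed to see when the failures actually started.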
Attempted Solutions
Running the Power Cycle command again restored normal operation.
See the attachment for the full log.
R410KV2_2102315PFSD9S1100004_20260226-1213_网卡温度告警.tar.gz.txt (6.1 MB)