System Management - Other - PCIe card page: slot number and CPU ownership information are not synchronized


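In the pcieaddrinfo object, slotid and socketid display correctly, but in the pcie_device object both slotid and socketid are 0, so I suspect the synchronization did not succeed. We are using non-Tianchi mode with statically configured sr files. Is there a known fix, or any suggestions on how to troubleshoot this?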


Have you tried adding debug logging to pcie_device?
How is the csr configured? Are SlotId and SocketId set by reference or by synchronization?

Here is my csr configuration:

        "Connector_PCIe_17": {
            "Bom": "14140130",
            "Slot": 17,
            "Position": 17,
            "Presence": "<=/Scanner_Slot17Presence.Value|>expr($1 == 1? 0 : 1)",
            "Id": "19e5d500",
            "AuxId": "02000110",
            "Buses": [
                "I2cMux_9548Chan63"
            ],
            "SystemId": "${SystemId}",
            "ManagerId": "${ManagerId}",
            "ChassisId": "${ChassisId}",
            "SilkText": "PCIeSlot${Slot}",
            "IdentifyMode": 2,
            "Container": "Component_Switch",
            "Type": "PCIe"
        },
        "PcieAddrInfo_17": {
            "Location": "PCIeSlot${Slot}",
            "ComponentType": 8,
            "ContainerSlot": "${Slot}",
            "ContainerUID": "00000001040302024342",
            "ContainerUnitType": "IEU",
            "GroupPosition": "PcieAddrInfo_17_${GroupPosition}"
        },
      .............
        "BusinessConnector_23": {
            "Name": "Down_4",
            "Direction": "Downstream",
            "Slot": 18,
            "LinkWidth": "X16",
            "MaxLinkRate": "PCIe 5.0",
            "ConnectorType": "PCIe CEM",
            "UpstreamResources": [
            {
                "Name": "Up_130",
                "ID": 255,
                "Offset": 6,
                "Width": 2
            }
            ],
            "RefMgmtConnector": "#/Connector_PCIE_17",
            "RefPCIeAddrInfo": "#/PcieAddrInfo_17"
        },

This part looks fine. Check the card's configuration, especially PCIeDevice.



These are the property values of pcie_device and pcie_addrinfo.

https://gitcode.com/openUBMC/pcie_device/blob/main/src/lualib/device/class/pcie_device.lua#L162
Could you add some debug logging in this function? This is where the slot is synchronized between PCIeAddrInfo and PCIeDevice.

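After debugging, I narrowed the problem down to the code segment in the screenshot below. The logs also show that bdf information is being returned.
Question: why does my configuration hit "RefMgmtConnector exists but has no path"? The pcie device I am using is a 300I Duo, and the csr file in use is 14140130_19e5d500_02000110.sr.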


Partial log output follows. In this part, function c_biz_topo:get_pcie_addr_info(type, slot, position)
matched a biz_conn:

2025-06-30 14:21:52.847214 pcie_device NOTICE: device_loader.lua(508): [BizTopoLoader] type=PCIeCard, slot_id=1 - DeviceSSBDF=[0x00, 0x00, 0x0d, 0x00, 0x00]
2025-06-30 14:21:52.847446 pcie_device NOTICE: device_loader.lua(508): [BizTopoLoader] type=PCIeCard, slot_id=2 - DeviceSSBDF=[0x00, 0x00, 0x03, 0x00, 0x00]
2025-06-30 14:21:52.847837 pcie_device NOTICE: device_loader.lua(508): [BizTopoLoader] type=PCIeCard, slot_id=3 - DeviceSSBDF=[0x00, 0x00, 0x0d, 0x00, 0x00]
...
...
2025-06-30 14:21:54.945291 pcie_device NOTICE: device_loader.lua(124): [BizTopoLoader] Get id from PMU, vid:4096, did:690, sub_vid:4096, sub_did:260
2025-06-30 14:21:54.945524 pcie_device NOTICE: biz_topo.lua(831): [get_pcie_addr_info] start: type=PCIeCard, slot=14, position=nil
2025-06-30 14:21:54.947039 pcie_device NOTICE: biz_topo.lua(856): [get_pcie_addr_info] matched connector[2]
2025-06-30 14:21:54.947643 pcie_device NOTICE: device_loader.lua(570): [BizTopoLoader] addr_multi_presence=1, sys_id=1
2025-06-30 14:21:54.947970 pcie_device NOTICE: biz_topo.lua(831): [get_pcie_addr_info] start: type=PCIeCard, slot=15, position=nil
2025-06-30 14:21:54.961690 pcie_device NOTICE: biz_topo.lua(856): [get_pcie_addr_info] matched connector[10]
2025-06-30 14:21:54.963233 pcie_device NOTICE: device_loader.lua(570): [BizTopoLoader] addr_multi_presence=0, sys_id=1
2025-06-30 14:21:54.965584 pcie_device NOTICE: biz_topo.lua(877): [get_mgmt_connector] start: type=PCIeCard, slot=14, position=nil
2025-06-30 14:21:54.969380 pcie_device NOTICE: biz_topo.lua(885): [get_mgmt_connector] get_pcie_addr_info success
2025-06-30 14:21:54.969866 pcie_device NOTICE: biz_topo.lua(897): [get_mgmt_connector] RefMgmtConnector exists but has no path
2025-06-30 14:21:54.972339 pcie_device NOTICE: biz_topo.lua(877): [get_mgmt_connector] start: type=PCIeCard, slot=15, position=nil
2025-06-30 14:21:54.982319 pcie_device NOTICE: biz_topo.lua(885): [get_mgmt_connector] get_pcie_addr_info success
2025-06-30 14:21:54.982574 pcie_device NOTICE: biz_topo.lua(897): [get_mgmt_connector] RefMgmtConnector exists but has no path
2025-06-30 14:21:54.993383 pcie_device NOTICE: device_loader.lua(124): [BizTopoLoader] Get id from PMU, vid:4096, did:690, sub_vid:4096, sub_did:260
2025-06-30 14:21:54.993655 pcie_device NOTICE: biz_topo.lua(831): [get_pcie_addr_info] start: type=PCIeCard, slot=16, position=nil
2025-06-30 14:21:55.006525 pcie_device NOTICE: biz_topo.lua(856): [get_pcie_addr_info] matched connector[8]
2025-06-30 14:21:55.008146 pcie_device NOTICE: device_loader.lua(570): [BizTopoLoader] addr_multi_presence=1, sys_id=1
2025-06-30 14:21:55.008656 pcie_device NOTICE: biz_topo.lua(831): [get_pcie_addr_info] start: type=PCIeCard, slot=17, position=nil
2025-06-30 14:21:55.023969 pcie_device NOTICE: biz_topo.lua(856): [get_pcie_addr_info] matched connector[12]
2025-06-30 14:21:55.029746 pcie_device NOTICE: device_loader.lua(570): [BizTopoLoader] addr_multi_presence=0, sys_id=1
2025-06-30 14:21:55.033952 pcie_device NOTICE: biz_topo.lua(877): [get_mgmt_connector] start: type=PCIeCard, slot=16, position=nil
2025-06-30 14:21:55.045875 pcie_device NOTICE: biz_topo.lua(885): [get_mgmt_connector] get_pcie_addr_info success

More partial log output. This part is printed when no PCIeAddrInfo object for the given slot could be matched through the business connector:

2025-06-30 14:21:55.969449 pcie_device NOTICE: biz_topo.lua(881): [get_mgmt_connector] failed to get_pcie_addr_info: 2
2025-06-30 14:21:55.969802 pcie_device NOTICE: device_loader.lua(162): [BizTopoLoader] not ok
2025-06-30 14:21:55.970154 pcie_device NOTICE: biz_topo.lua(877): [get_mgmt_connector] start: type=PCIeCard, slot=22, position=nil
2025-06-30 14:21:56.005530 pcie_device NOTICE: biz_topo.lua(870): [get_pcie_addr_info] connectors with position found, but no matching PCIeAddrInfo (type=PCIeCard, slot=22)
2025-06-30 14:21:56.005745 pcie_device NOTICE: biz_topo.lua(881): [get_mgmt_connector] failed to get_pcie_addr_info: 2
2025-06-30 14:21:56.006215 pcie_device NOTICE: device_loader.lua(162): [BizTopoLoader] not ok
2025-06-30 14:21:56.007098 pcie_device NOTICE: biz_topo.lua(877): [get_mgmt_connector] start: type=PCIeCard, slot=23, position=nil
2025-06-30 14:21:56.042527 pcie_device NOTICE: biz_topo.lua(870): [get_pcie_addr_info] connectors with position found, but no matching PCIeAddrInfo (type=PCIeCard, slot=23)
2025-06-30 14:21:56.043245 pcie_device NOTICE: biz_topo.lua(881): [get_mgmt_connector] failed to get_pcie_addr_info: 2
2025-06-30 14:21:56.043842 pcie_device NOTICE: device_loader.lua(162): [BizTopoLoader] not ok
2025-06-30 14:21:56.044634 pcie_device NOTICE: biz_topo.lua(877): [get_mgmt_connector] start: type=PCIeCard, slot=24, position=nil
2025-06-30 14:21:56.093203 pcie_device NOTICE: biz_topo.lua(870): [get_pcie_addr_info] connectors with position found, but no matching PCIeAddrInfo (type=PCIeCard, slot=24)
2025-06-30 14:21:56.093369 pcie_device NOTICE: biz_topo.lua(881): [get_mgmt_connector] failed to get_pcie_addr_info: 2
2025-06-30 14:21:56.093813 pcie_device NOTICE: device_loader.lua(162): [BizTopoLoader] not ok
2025-06-30 14:21:56.094737 pcie_device NOTICE: biz_topo.lua(877): [get_mgmt_connector] start: type=PCIeCard, slot=25, position=nil
2025-06-30 14:21:56.114714 pcie_device NOTICE: biz_topo.lua(881): [get_mgmt_connector] failed to get_pcie_addr_info: 2
2025-06-30 14:21:56.114877 pcie_device NOTICE: device_loader.lua(162): [BizTopoLoader] not ok
2025-06-30 14:21:56.115530 pcie_device NOTICE: biz_topo.lua(877): [get_mgmt_connector] start: type=PCIeCard, slot=26, position=nil
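
With this many interleaved lines, it can help to group the biz_topo.lua messages by slot before comparing the slots that succeed with the ones that fail. Below is a minimal Python sketch that does this. The function name `outcomes_by_slot` is hypothetical, the message texts are the debug prints added in this thread (so they will differ on other builds), and it assumes every message after a `start:` line belongs to that slot, which only holds if the loader logs sequentially.

```python
import re
from collections import defaultdict

# Matches e.g. "pcie_device NOTICE: biz_topo.lua(877): [get_mgmt_connector] <msg>"
LINE = re.compile(
    r"pcie_device \w+: (?P<file>[\w.]+)\((?P<line>\d+)\): \[(?P<func>\w+)\] (?P<msg>.*)"
)
SLOT = re.compile(r"start: type=\w+, slot=(\d+)")


def outcomes_by_slot(log_lines):
    """Map slot -> list of (function, message) logged after that slot's 'start:' line."""
    results = defaultdict(list)
    current_slot = None
    for raw in log_lines:
        m = LINE.search(raw)
        if not m or m["file"] != "biz_topo.lua":
            continue  # only biz_topo.lua messages are attributed to a slot
        start = SLOT.match(m["msg"])
        if start:
            current_slot = int(start.group(1))
        elif current_slot is not None:
            results[current_slot].append((m["func"], m["msg"]))
    return dict(results)


# Lines copied from the excerpts above.
sample = [
    "2025-06-30 14:21:54.965584 pcie_device NOTICE: biz_topo.lua(877): [get_mgmt_connector] start: type=PCIeCard, slot=14, position=nil",
    "2025-06-30 14:21:54.969380 pcie_device NOTICE: biz_topo.lua(885): [get_mgmt_connector] get_pcie_addr_info success",
    "2025-06-30 14:21:54.969866 pcie_device NOTICE: biz_topo.lua(897): [get_mgmt_connector] RefMgmtConnector exists but has no path",
    "2025-06-30 14:21:56.005745 pcie_device NOTICE: device_loader.lua(162): [BizTopoLoader] not ok",
    "2025-06-30 14:21:55.970154 pcie_device NOTICE: biz_topo.lua(877): [get_mgmt_connector] start: type=PCIeCard, slot=22, position=nil",
    "2025-06-30 14:21:56.005530 pcie_device NOTICE: biz_topo.lua(870): [get_pcie_addr_info] connectors with position found, but no matching PCIeAddrInfo (type=PCIeCard, slot=22)",
    "2025-06-30 14:21:56.005745 pcie_device NOTICE: biz_topo.lua(881): [get_mgmt_connector] failed to get_pcie_addr_info: 2",
]

result = outcomes_by_slot(sample)
for slot, events in sorted(result.items()):
    print(slot, [msg for _, msg in events])
```

To run it against a full log, pass `open(path, encoding="utf-8")` instead of `sample`. The path is whatever file the pcie_device log was dumped to.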

This message means that the RefMgmtConnector of the BusinessConnector on the IEU does not reference the corresponding Connector object.


But my BusinessConnector_8 does have the property "RefMgmtConnector": "#/Connector_PCIE_6", and Connector_PCIE_6 is present. Any ideas on how to resolve this?

That is a bit odd. Connector_PCIE_6 is visible in the hwdiscovery resource tree, right?
We will probably need to look at the complete logs.
@caiyesheng_b48v3 could you take a look?

Yes, Connector_PCIE_6 is visible there.

I traced it to the following log output in dump_info/LogDump/hw_stream.log:

1970-01-01 08:00:30.869311 hwdiscovery WARNING: analyse.lua(123): get reference object failed, object: Connector_PCIE_6

1970-01-01 08:00:30.869707 hwdiscovery WARNING: analyse.lua(408): position: 01010101, ignore property: RefMgmtConnector, value: #/Connector_PCIE_6, object: BusinessConnector_8_01010101,error: analyse property failed

1970-01-01 08:00:30.870275 hwdiscovery WARNING: analyse.lua(123): get reference object failed, object: Connector_PCIE_0

1970-01-01 08:00:30.870500 hwdiscovery WARNING: analyse.lua(408): position: 01010101, ignore property: RefMgmtConnector, value: #/Connector_PCIE_0, object: BusinessConnector_5_01010101,error: analyse property failed
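
These warnings say the `#/...` reference could not be resolved to an object. One thing worth checking is whether each reference matches an object name exactly. In the configuration posted earlier in this thread, the object is named `Connector_PCIe_17` while the BusinessConnector refers to `#/Connector_PCIE_17`, which differs in letter case. Two things are unconfirmed here: whether hwdiscovery compares names case-sensitively, and whether this is the actual cause. The definition of `Connector_PCIE_6` is also not shown. The following Python sketch lists references whose target is not an exact key. It assumes the objects are available as a flat name-to-properties mapping (how they are nested inside the .sr file is not shown here), and `find_unresolved_refs` is a hypothetical helper, not part of openUBMC.

```python
import re

# A reference is a string of the form "#/ObjectName".
REF_PATTERN = re.compile(r"^#/(\w+)$")


def find_unresolved_refs(objects):
    """Return (owner, property, target, suggestion) for each '#/Name' reference
    whose target is not an exact, case-sensitive key of `objects`.
    Only top-level string properties are inspected."""
    by_lower = {name.lower(): name for name in objects}
    problems = []
    for owner, props in objects.items():
        for prop, value in props.items():
            if not isinstance(value, str):
                continue
            match = REF_PATTERN.match(value)
            if not match:
                continue
            target = match.group(1)
            if target not in objects:
                # Suggest an existing name that differs only by letter case, if any.
                problems.append((owner, prop, target, by_lower.get(target.lower())))
    return problems


# Trimmed copy of the objects posted earlier in this thread.
objects = {
    "Connector_PCIe_17": {"Slot": 17, "Type": "PCIe"},
    "PcieAddrInfo_17": {"ContainerSlot": "${Slot}"},
    "BusinessConnector_23": {
        "Slot": 18,
        "RefMgmtConnector": "#/Connector_PCIE_17",
        "RefPCIeAddrInfo": "#/PcieAddrInfo_17",
    },
}

problems = find_unresolved_refs(objects)
for owner, prop, target, suggestion in problems:
    print(f"{owner}.{prop} -> {target} not found; closest name: {suggestion}")
```

On the trimmed sample this reports only `BusinessConnector_23.RefMgmtConnector`; the `RefPCIeAddrInfo` reference resolves because its spelling matches the object name exactly.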

wangzhuwei3@h-partners.com Please forward me a copy of the logs.

This topic has been open for too long, so the thread is now closed. If you have new questions, please open a new post.