M.2 NVMe drive adaptation issue

Related post: How to adapt an onboard drive

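Requirement:
Synchronize the NVMe drive temperature collected by iBMA to the NVMe object and the Drive object, so that the related sensors and thermal control work.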

Problem:
When iBMA updates the NVMe drive information, no NVMe object is found: c_nvme.collection contains no objects. Does the CSR need additional configuration?

CSR configuration

    "Drive_1": {
      "Id": 0,
      "Name": "Disk0",
      "PhysicalLocation": "HDD Plane",
      "NodeId": "HDDPlaneDisk0",
      "Presence": 1,
      "LocateLed": "<=/Scanner_Drive0LocateAccessor.Value",
      "FaultLed": "<=/Scanner_Drive0FalutAccessor.Value",
      "ActivationLed": "<=/Scanner_Drive0ActivationAccessor.Value",
      "TemperatureCelsius": 255,
      "Missing": 0,
      "Health": 0,
      "RebuildState": 0,
      "FirmwareStatus": 255,
      "PredictiveFailure": 0,
      "InAFailedArray": 0,
      "FirmwareStatusError": false,
      "@Default": {
        "PredictedMediaLifeLeftPercent": 255
      },
      "SerialNumber": "",
      "IODeteriorationHealthCode": 0
    },
    "PcieAddrInfo_NVMe_1": {
      "Segment": 0,
      "GroupID": 1,
      "SlotID": 2,
      "SocketID": 0,
      "PortID": 18,
      "Bus": 7,
      "Device": 0,
      "Function": 0,
      "ComponentType": 2,
      "ControllerIndex": 0,
      "ControllerType": 2,
      "ContainerSlot": 1,
      "ContainerUID": "00000001030302021234",
      "ContainerUnitType": "EXU",
      "Location": "EXUBoard1",
      "GroupPosition": "PcieAddrInfo_NVMe_1_${GroupPosition}"
    },

DiskSilk information in silkconfig.json

"DiskSilk":[{"ControlId":1,"PhyId":12,"SlotId":0,"RootBDF":"0000:07:00.0","SocketId":0}]
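
The DiskSilk entry writes the address as a BDF string, while PcieAddrInfo_NVMe_1 writes it as separate Segment/Bus/Device/Function integers. A minimal Python sketch for comparing the two notations is shown here. The names `Bdf`, `parse_bdf`, and `same_function` are hypothetical helpers for illustration and are not part of openUBMC; the values are copied from the configuration above.

```python
import json
import re
from typing import NamedTuple


class Bdf(NamedTuple):
    segment: int
    bus: int
    device: int
    function: int


# "SSSS:BB:DD.F", every field is hexadecimal.
_BDF_RE = re.compile(
    r"^([0-9a-fA-F]{4}):([0-9a-fA-F]{2}):([0-9a-fA-F]{2})\.([0-7])$"
)


def parse_bdf(text: str) -> Bdf:
    """Parse a string such as '0000:07:00.0' into integer fields."""
    m = _BDF_RE.match(text)
    if m is None:
        raise ValueError(f"not a BDF string: {text!r}")
    return Bdf(*(int(g, 16) for g in m.groups()))


def same_function(bdf: Bdf, info: dict) -> bool:
    """Compare a parsed BDF with Segment/Bus/Device/Function integers."""
    return bdf == Bdf(
        info["Segment"], info["Bus"], info["Device"], info["Function"]
    )


# Values copied from the thread.
disk_silk = json.loads(
    '[{"ControlId":1,"PhyId":12,"SlotId":0,'
    '"RootBDF":"0000:07:00.0","SocketId":0}]'
)
addr_info = {"Segment": 0, "Bus": 7, "Device": 0, "Function": 0}

silk_bdf = parse_bdf(disk_silk[0]["RootBDF"])
print(silk_bdf, same_function(silk_bdf, addr_info))
```

With the values above, the DiskSilk RootBDF and the PcieAddrInfo_NVMe_1 Bus/Device/Function fields describe the same address.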

host_agent object

~ ~ # busctl --user introspect bmc.kepler.host_agent /bmc/kepler/Systems/1/Sms/1/ComputerSystem/Systems/1/Storage/1/PCIE_5FSSD/0000_3A00_3A12_2E0_5F0000_3A07_3A00_2E0
NAME                                TYPE      SIGNATURE RESULT/VALUE                             FLAGS
bmc.kepler.sms                      interface -         -                                        -
._40odata_2Econtext                 property  v         s "/redfish/v1/$metadata#Systems/Member… -
._40odata_2Eid                      property  v         s "/bmc/kepler/Systems/1/Sms/1/Computer… -
._40odata_2Etype                    property  v         s "#Storage.v1_0_0.OemPCIE_SSD"          -
bmc.kepler.sms.redfish              interface -         -                                        -
.CapableSpeedGbs                    property  v         x 32                                     -
.CapacityBytes                      property  v         x 0                                      -
.Description                        property  v         s "Device 1d79:2263 (rev 03) (prog-if 0… -
.DeviceID                           property  v         s "0x2263"                               -
.DeviceLocation                     property  v         s "null"                                 -
.DeviceName                         property  v         s "null"                                 -
.DeviceSilkScreen                   property  v         s "null"                                 -
.FirmwareVersion                    property  v         s "X0122B3"                              -
.Id                                 property  v         s "0000:00:12.0_0000:07:00.0"            -
.Manufacturer                       property  v         s "null"                                 -
.MediaType                          property  v         s "SSD"                                  -
.Model                              property  v         s "TS128GMTE672A-VS1"                    -
.Name                               property  v         s "nvme0"                                -
.NegotiatedSpeedGbs                 property  v         x 32                                     -
.Protocol                           property  v         s "NVME"                                 -
.SerialNumber                       property  v         s "J200260993"                           -
.Status                             property  v         s "healthy"                              -
.SubsystemDeviceID                  property  v         s "0x2263"                               -
.SubsystemVendorID                  property  v         s "0x1d79"                               -
.VendorID                           property  v         s "0x1d79"                               -
._40odata_2Econtext                 property  v         s "/redfish/v1/$metadata#Systems/Member… -
._40odata_2Eid                      property  v         s "/bmc/kepler/Systems/1/Sms/1/Computer… -
._40odata_2Etype                    property  v         s "#Storage.v1_0_0.OemPCIE_SSD"          -
bmc.kepler.sms.redfish.BDFNumber    interface -         -                                        -
.BDF                                property  v         s "0000:07:00.0"                         -
.RootBDF                            property  v         s "0000:00:12.0"                         -
bmc.kepler.sms.redfish.DriverInfo   interface -         -                                        -
.DriverName                         property  v         s "nvme"                                 -
.DriverVersion                      property  v         s "1.0"                                  -
bmc.kepler.sms.redfish.SMARTInfo    interface -         -                                        -
.AvailableSpare                     property  v         x 100                                    -
.AvailableSpareThreshold            property  v         x 10                                     -
.ControllerBusyTime                 property  v         x 1291                                   -
.CriticalWarning                    property  v         x 0                                      -
.DataUnitsRead                      property  v         d 62778.8                                -
.DataUnitsWritten                   property  v         d 562580                                 -
.HostReadCommands                   property  v         x 855665                                 -
.HostWriteCommands                  property  v         x 15884788                               -
.MediaErrorCount                    property  v         x 0                                      -
.NumberOfErrorInfoLogEntries        property  v         x 0                                      -
.PercentageUsed                     property  v         x 0                                      -
.PeriodWriteCount                   property  v         s "null"                                 -
.PowerCycles                        property  v         x 111                                    -
.PowerOnHours                       property  v         x 1373                                   -
.Temperature                        property  v         x 43                                     -
.UnsafeShutdowns                    property  v         x 109                                    -
org.freedesktop.DBus.Introspectable interface -         -                                        -
.Introspect                         method    -         s                                        -
org.freedesktop.DBus.ObjectManager  interface -         -                                        -
.GetManagedObjects                  method    -         a{oa{sa{sv}}}                            -
org.freedesktop.DBus.Peer           interface -         -                                        -
.GetMachineId                       method    -         s                                        -
.Ping                               method    -         -                                        -
org.freedesktop.DBus.Properties     interface -         -                                        -
.Get                                method    ss        v                                        -
.GetAll                             method    s         a{sv}                                    -
.Set                                method    ssv       -                                        -
.PropertiesChanged                  signal    sa{sv}as  -                                        -
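
The object path and property names in this output appear to use the usual D-Bus label escaping, where a character that is not allowed in a label is written as `_XX` (two hex digits). A small Python sketch for decoding them is shown here, which makes it easier to compare the path with the Id, BDF, and RootBDF properties. `decode_bus_label` is a hypothetical helper name, and the assumption that every `_XX` sequence is an escape is taken from the output above rather than from the openUBMC source.

```python
import re


def decode_bus_label(label: str) -> str:
    """Undo _XX hex escaping in a D-Bus path label or member name."""
    return re.sub(
        r"_([0-9A-Fa-f]{2})",
        lambda m: chr(int(m.group(1), 16)),
        label,
    )


path = (
    "/bmc/kepler/Systems/1/Sms/1/ComputerSystem/Systems/1/Storage/1/"
    "PCIE_5FSSD/0000_3A00_3A12_2E0_5F0000_3A07_3A00_2E0"
)

# Decode the last two labels of the object path.
collection, object_id = (decode_bus_label(p) for p in path.split("/")[-2:])
print(collection)   # PCIE_SSD
print(object_id)    # 0000:00:12.0_0000:07:00.0

# The decoded id is "<RootBDF>_<BDF>", matching the .Id property.
root_bdf, dev_bdf = object_id.split("_")
root_device_number = int(root_bdf.split(":")[2].split(".")[0], 16)
print(root_bdf, dev_bdf, root_device_number)

print(decode_bus_label("_40odata_2Eid"))  # @odata.id
```

The root port device number decodes to 0x12, which is 18 in decimal, the same value as PortID in PcieAddrInfo_NVMe_1. The DiskSilk RootBDF in silkconfig.json is 0000:07:00.0, which equals the BDF reported here rather than the RootBDF; the thread does not confirm whether that matters for matching.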

Could you print all the objects in c_nvme.collection? Also check whether this drive was recognized as an NVMe drive when the nvme object was loaded.

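c_nvme.collection contains no objects. Do the objects there have to be configured, or are they added only when some condition is met?
In the SMBIOS information this device is treated as a PCIeCard type. Shouldn't it be a Disk type?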
Could you please confirm? @mao_0v0_q7rci

For an NVMe drive you also need to configure the BusinessConnector and Connector_ComVPDConnect objects.
You can refer to 14100665_00000001030302023936.sr.

Are these still required for a drive that does not support out-of-band VPD management?

Probably yes. I could not find where storage triggers the callback in question; it may be triggered at the framework layer.

My guess is that the framework triggers it based on the connector's type?

I printed the contents of obj in the find flow and did not see any BDF information. How can I obtain it?

c_handler_nvme:find_object obj type: table
c_handler_nvme:find_object obj: PCIeLinkSpeed = 0
c_handler_nvme:find_object obj: on_add_object_complete = table: 0x66ba9c137fd8
c_handler_nvme:find_object obj: NegotiatedSpeedGbs = 255
c_handler_nvme:find_object obj: CapableSpeedGbs = 255
c_handler_nvme:find_object obj: last_pre_fail = 0
c_handler_nvme:find_object obj: last_fail = 0
c_handler_nvme:find_object obj: on_smbios_status_changed = table: 0x66ba9c136c10
c_handler_nvme:find_object obj: Status = 255
c_handler_nvme:find_object obj: update_time = 3
c_handler_nvme:find_object obj: on_presence_changed = table: 0x66ba9c136bc8
c_handler_nvme:find_object obj: PowerOnHours = 4294967295
c_handler_nvme:find_object obj: protocol = 255
c_handler_nvme:find_object obj: nvme_mi_mctp_obj = false
c_handler_nvme:find_object obj: pre_failure_debounce = table: 0x66bab42da188
c_handler_nvme:find_object obj: uuid_index = 0
c_handler_nvme:find_object obj: ManufacturerId = 4294967295
c_handler_nvme:find_object obj: ssd_form_factor = false
c_handler_nvme:find_object obj: on_property_changed = table: 0x66ba9d79c0d8
c_handler_nvme:find_object obj: __slots = table: 0x66bab479dbe8
c_handler_nvme:find_object obj: link_fault = false
c_handler_nvme:find_object obj: __dev_obj = false
c_handler_nvme:find_object obj: __mdb_obj = table: 0x66bab2fdf750
c_handler_nvme:find_object obj: SpareBlockPercentage = 255
c_handler_nvme:find_object obj: __position = 010101080101
c_handler_nvme:find_object obj: MediaErrorCount = 4294967295
c_handler_nvme:find_object obj: init_state = true
c_handler_nvme:find_object obj: __task_count = 0
c_handler_nvme:find_object obj: support_hw_defined_smart_log = 255
c_handler_nvme:find_object obj: Manufacturer = N/A
c_handler_nvme:find_object obj: CapacityMiB = 4294967295
c_handler_nvme:find_object obj: __rowid = 1
c_handler_nvme:find_object 4 nil

I'm not sure whether you have configured the pcie_device silkscreen.
The BDF number is obtained by exchanging the silkscreen file with the BIOS.

There is a PCIeDevice object. Do I need to specify the BDF and root BDF manually?

~ ~ # mdbctl lsprop PCIeDevice_1_010101080101
bmc.kepler.Object.Properties
  ClassName="PCIeDevice"
  ObjectIdentifier=[1,"1","","010101080101"]
  ObjectName="PCIeDevice_1_010101080101"
  TraceSamplingRate=0
bmc.kepler.Systems.PCIeDevices.PCIeDevice
  BandwidthReduction=0
  BaseClassCode=0
  Bus=0
  DevBus=0
  DevDevice=0
  DevFunction=0
  Device=0
  DeviceName="Disk2"
  DiagnosticFault=0
  FaultByBios=0
  Function=0
  FunctionClass=4
  FunctionProtocol=""
  FunctionType=""
  LinkSpeedReduced=0
  MaxPCIeType=""
  NegotiatedPCIeType=""
  PCIeDeviceType=""
  PCIeType=""
  Position=""
  PredictiveFault=0
  ProgrammingInterface=0
  Segment=0
  SlotID=0
  SlotType=""
  SocketID=0
  SubClassCode=0
  UCEByBIOS=0
bmc.kepler.Systems.PCIeDevices.PCIeDevice.RAS
  CorrectableError=0
  CorrectableErrorOverfrequencyCount=0
  FatalError=0
  FatalErrorCount=0
  NonFatalErrorCount=0
  ParityError=0
  SystemError=0
  TimeoutError=0
  UncorrectableError=0
Private
  Container=""
  DeviceType=8
  GroupPosition="PCIeDevice_010101080101"
  MultihostPresence=0
  RefComponent=""

Yes. The key is to configure the PortID, SlotID, and SocketID properties of PcieAddrInfo_NVMe_xx.

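After the BIOS restarts, the BMC receives the PcieDiskBDF set by the BIOS.
A few questions:
1. How is PcieDiskBDF propagated to the NVMe obj so that find_object can match on the BDF? Could you point to the code location?
2. From the flow, PcieDiskBDF is reported only when the BIOS boots, so restarting the BMC alone does not obtain this data. What is the impact?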

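Currently the NVMe obj relies on NVMe-MI to update the temperature. To update it with the temperature reported by iBMA, the update flow has to be modified. Updating the NVMe object's temperature property directly in the c_handler_nvme:update_smart_info flow fails with: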

c_handler_nvme:update_smart_info Assignment failed: kepler.class.SetSyncPropertyError: The property TemperatureCelsius of the object Nvme_1_010101080101 is a synchronous property and cannot be set

The error occurs because TemperatureCelsius uses the synchronization syntax in the CSR:

        "Nvme_1": {
            "Slot": "${Slot}",
            "TemperatureCelsius": "<=/Scanner_Temp.Value",
            "MediaType": 1,
            "Protocol": 6,
            "PredictedMediaLifeLeftPercent": "<=/Scanner_Remtime.Value",
            "RefComponent": "#/Component_PCIeCard1",
            "Failure": "<=/Scanner_SSD_Fault.Value",
            "PredictiveFailure": "<=/Scanner_SSD_Pre_Fault.Value",
            "VPDChip": "#/Chip_Virtual_SSD",
            "SSDChip": "#/Chip_SSD"
        }

Changing it to a fixed value allows the iBMA information to update the NVMe object's temperature property:

        "Nvme_1": {
            "Slot": "${Slot}",
            "TemperatureCelsius": 0,
            "MediaType": 1,
            "Protocol": 6,
            "PredictedMediaLifeLeftPercent": "<=/Scanner_Remtime.Value",
            "RefComponent": "#/Component_PCIeCard1",
            "Failure": "<=/Scanner_SSD_Fault.Value",
            "PredictiveFailure": "<=/Scanner_SSD_Pre_Fault.Value",
            "VPDChip": "#/Chip_Virtual_SSD",
            "SSDChip": "#/Chip_SSD"
        }
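
To see which other properties of a CSR object are still bound by the synchronization syntax, and would therefore presumably raise the same SetSyncPropertyError if set from code, the object can be scanned for values that start with `<=/`. This is only a sketch in Python: `sync_bound_properties` is a hypothetical helper, and treating every string that starts with `<=/` as a sync binding is an assumption based on the two configurations above.

```python
SYNC_PREFIX = "<=/"  # sync syntax as it appears in the CSR excerpts above


def sync_bound_properties(csr_object: dict) -> dict:
    """Return the properties whose value is a '<=/...' sync expression."""
    return {
        name: value
        for name, value in csr_object.items()
        if isinstance(value, str) and value.startswith(SYNC_PREFIX)
    }


# Nvme_1 before the change (copied from the thread).
nvme_before = {
    "Slot": "${Slot}",
    "TemperatureCelsius": "<=/Scanner_Temp.Value",
    "MediaType": 1,
    "Protocol": 6,
    "PredictedMediaLifeLeftPercent": "<=/Scanner_Remtime.Value",
    "RefComponent": "#/Component_PCIeCard1",
    "Failure": "<=/Scanner_SSD_Fault.Value",
    "PredictiveFailure": "<=/Scanner_SSD_Pre_Fault.Value",
    "VPDChip": "#/Chip_Virtual_SSD",
    "SSDChip": "#/Chip_SSD",
}

# Nvme_1 after the change: only TemperatureCelsius becomes a fixed value.
nvme_after = dict(nvme_before, TemperatureCelsius=0)

print(sorted(sync_bound_properties(nvme_before)))
print(sorted(sync_bound_properties(nvme_after)))
```

After the change, PredictedMediaLifeLeftPercent, Failure, and PredictiveFailure are still sync-bound, so only the temperature becomes writable from the update flow.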

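The function through which nvme obtains the BDF is get_pcie_device_info. This differs from controller, which uses the synchronization syntax to sync the BDF information directly. When nvme was developed, the .sr file that holds the nvme object had no pcie object, so direct synchronization was not possible. Matching therefore relies on the hierarchical loading logic: the pcie device object is loaded one level before the nvme object, and the two objects are matched by position.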
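
The position matching described here can be modeled roughly as follows. This is a toy Python sketch and not the actual get_pcie_device_info code, which is not shown in the thread: `ToyObject`, `find_by_position`, and `format_bdf` are hypothetical names, and reading the BDF from the DevBus/DevDevice/DevFunction properties is an assumption. The object names and values come from the mdbctl and find_object output earlier in the thread.

```python
from dataclasses import dataclass, field
from typing import Iterable, Optional


@dataclass
class ToyObject:
    name: str
    position: str
    props: dict = field(default_factory=dict)


def find_by_position(
    target: ToyObject, candidates: Iterable[ToyObject]
) -> Optional[ToyObject]:
    """Return the first candidate that shares the target's position."""
    return next(
        (c for c in candidates if c.position == target.position), None
    )


def format_bdf(props: dict) -> str:
    """Format Segment/DevBus/DevDevice/DevFunction as 'SSSS:BB:DD.F'."""
    return "{:04x}:{:02x}:{:02x}.{:x}".format(
        props["Segment"],
        props["DevBus"],
        props["DevDevice"],
        props["DevFunction"],
    )


# PCIe device object, loaded one level before the NVMe object.
pcie_devices = [
    ToyObject(
        "PCIeDevice_1_010101080101",
        "010101080101",
        {"Segment": 0, "DevBus": 0, "DevDevice": 0, "DevFunction": 0},
    )
]
nvme = ToyObject("Nvme_1_010101080101", "010101080101")

dev = find_by_position(nvme, pcie_devices)
if dev is not None:
    print(dev.name, format_bdf(dev.props))
```

With the property values dumped earlier, the matched PCIeDevice object still reports all-zero address fields.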


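The BDF is obtained when the endpoint is created, but the drive does not actually support NVMe-MI, so check_support_mctp will fail and the endpoint cannot be created, right?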

Yes. To obtain the BDF of an nvme drive on its own, you can use the get_pcie_device_info function. Currently only the nvme-mi path uses it.