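PSR configuration causes BMC upgrade failure

Problem description

The following PSR configuration causes the BMC upgrade to fail; after it is deleted, the upgrade succeeds.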

      "ProductName": "",
      "ProductAlias": "",
      "DefaultProductAlias": "",
      "ProductPicture": "",
      "ProductUniqueID": "",
      "ProductVendorID": "",
      "ProductId": 0,
      "ProductVersion": "",
      "DeviceOwnerID": "<=/FruData_Fru0.ProductSerialNumber",
      "DeviceSerialNumber": "<=/FruData_Fru0.SystemSerialNumber",
      "DeviceName": "<=/FruData_Fru0.SystemProductName"
    },
    "Contact_1": {
      "OfficalWeb": "",
      "SupportWeb": "",
      "Email": "",
      "Phone": "",
      "Copyright": "",
      "DefaultCopyright": "",
      "KVMClientDownloadLink": "",
      "QRCodeSupported": false
    },
    "Dimension_1": {
      "HeightU": 1
    }

Error log from the failed upgrade

2026-03-22 16:58:06.590503 firmware_mgmt NOTICE: active_fructl.lua(95): get host type is Singlehost
2026-03-22 16:58:06.592041 firmware_mgmt NOTICE: utils.lua(36): The file path is Local.
2026-03-22 16:58:06.598892 firmware_mgmt NOTICE: init.lua(79): Upgrading_Flag is true
2026-03-22 16:58:06.634233 firmware_mgmt NOTICE: init.lua(40): update status to FS_SIMPLE_UPGRADING.
2026-03-22 16:58:06.657593 firmware_mgmt NOTICE: task_mgmt.lua(287): Create task[Id: 1211934748, StartTime: 2026-03-22T16:58:06+08:00, Progress: 0, State: New] successfully
2026-03-22 16:58:06.662757 firmware_mgmt NOTICE: task_service.lua(59): task create success, task id: 1211934748
2026-03-22 16:58:06.663290 firmware_mgmt NOTICE: task_id_mgmt.lua(30): add serial task id(1211934748) successfully
2026-03-22 16:58:06.664545 firmware_mgmt NOTICE: tasks_scheduling.lua(121): start tasks processer
2026-03-22 16:58:06.759941 firmware_mgmt NOTICE: task_mgmt.lua(418): Update task[Id: 1211934748, StartTime: 2026-03-22T16:58:06+08:00, Progress: 0, State: Running] successfully
2026-03-22 16:58:06.814665 firmware_mgmt NOTICE: file_transfer.lua(141): start to move file [rootfs_openUBMC.hpm] from tmp to shm
2026-03-22 16:58:07.399124 firmware_mgmt NOTICE: file_transfer.lua(146): move_file_s ok:true, err:0
2026-03-22 16:58:09.795654 firmware_mgmt NOTICE: validate_sign.lua(195): verify signature successfully
2026-03-22 16:58:09.796908 firmware_mgmt NOTICE: action.lua(37): Validate signature successfully
2026-03-22 16:58:09.799150 firmware_mgmt NOTICE: hpm_package.lua(757): ManufacturerValidateEnabled is false, there is no need to validate manufacture_id.
2026-03-22 16:58:09.953728 firmware_mgmt NOTICE: hpm_package.lua(562): parse cfg file successfully, Version:25.06.01.07 FileNum:2
2026-03-22 16:58:09.954211 firmware_mgmt NOTICE: hpm_package.lua(450): get obj table: 0x4af25c1fb328 for Id=25
2026-03-22 16:58:09.956094 firmware_mgmt NOTICE: hpm_package.lua(457): get obj table: 0x4af25c1fb328 for Id=25
2026-03-22 16:58:09.957808 firmware_mgmt NOTICE: hpm_package.lua(468): get obj table: 0x4af25c1fb328 for Id=25
2026-03-22 16:58:09.958381 firmware_mgmt NOTICE: hpm_package.lua(415): System product info:ProductId(0), ProductVendorID(), ProductUniqueID()
2026-03-22 16:58:09.958710 firmware_mgmt NOTICE: hpm_package.lua(329): check product id successfully
2026-03-22 16:58:09.959424 firmware_mgmt ERROR: task_instance.lua(159): parse package(rootfs_openUBMC.hpm) failed, ret:./opt/bmc/libmc/lualib/mc/utils.lua:1215: ./opt/bmc/libmc/lualib/mc/logging.lua:253: bad argument #2 to 'format_msg' (number expected, got nil).
2026-03-22 16:58:10.050052 firmware_mgmt NOTICE: task_mgmt.lua(418): Update task[Id: 1211934748, StartTime: 2026-03-22T16:58:06+08:00, Progress: 0, State: Exception] successfully
2026-03-22 16:58:10.185230 firmware_mgmt ERROR: state_handle.lua(115): not find state obj
2026-03-22 16:58:11.299110 firmware_mgmt NOTICE: tasks_scheduling.lua(126): upgrade queue is empty, exit the tasks processer
2026-03-22 16:58:11.299566 firmware_mgmt NOTICE: tasks_scheduling.lua(138): stop tasks processer
2026-03-22 16:58:11.301086 firmware_mgmt NOTICE: active_fructl.lua(95): get host type is Singlehost
2026-03-22 16:58:11.301490 firmware_mgmt NOTICE: active_single_host_fructrl.lua(61): active_single_host_fructrl fructrl get power status
2026-03-22 16:58:11.306161 firmware_mgmt NOTICE: state_simple_upgrading.lua(101): simple upgraded, current active mode is:nil, wait restart seconds:360000
2026-03-22 16:58:11.310380 firmware_mgmt NOTICE: init.lua(40): update status to FS_IDLE.
2026-03-22 16:58:11.311682 firmware_mgmt NOTICE: init.lua(79): Upgrading_Flag is false
2026-03-22 16:58:25.298324 firmware_mgmt NOTICE: task_id_mgmt.lua(40): delete serial task id(1211934748) successfully
2026-03-22 16:58:25.298663 firmware_mgmt NOTICE: task_id_mgmt.lua(42): del serial task id(1211934748) successfully
2026-03-22 16:58:25.298992 firmware_mgmt NOTICE: task_instance.lua(100): destroy task info(id:1211934748) successfully

Why does this configuration affect the upgrade process?

The answer below is AI-generated and for reference only:

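Summary and recommendations

Item: Description
Nature of the problem: The PSR contains empty-value fields and dynamic references, which cause a type error or a parsing failure early in the BMC upgrade, before the environment is fully initialized.
Failure trigger point: During firmware package parsing (hpm_package.lua), the product information is processed and the log formatting function is called with nil.
Key log clue: format_msg received nil where it expected a number, which suggests that ProductId or a related field was not parsed correctly.
Root cause: The system context is incomplete during the upgrade, so the dynamic reference fields in the PSR cannot be resolved; combined with improper handling of empty strings and default values, this raises a Lua runtime error.

Recommended actions

  1. Avoid meaningless empty fields in the PSR: delete every field whose value is an empty string and that is not required.
  2. Use dynamic references with care: make sure the referenced object paths already exist during the upgrade flow, or that a missing path can be handled with a safe fallback.
  3. Validate the configuration before upgrading: run a PSR/CSR validity check before the BMC upgrade to rule out risky items.
  4. Improve error handling: add nil guards to key functions such as format_msg so that a log print cannot interrupt the flow.

Best practice: keep the PSR configuration minimal and retain only the required fields; in test environments, clear all non-functional configuration before upgrading.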
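
The pre-upgrade check in recommendation 3 could be sketched roughly as follows. This is a minimal illustration, not an openUBMC tool: the function name `find_risky_fields` and the object name `Product_1` are hypothetical (the object name is cut off in the excerpt above), and treating every empty string or `<=/` reference as "risky" is an assumption taken from the analysis, not a confirmed rule.

```python
import json

REF_PREFIX = "<=/"  # reference marker as it appears in the PSR excerpt


def find_risky_fields(node, path=""):
    """Recursively collect (path, reason) pairs for empty strings and dynamic references."""
    findings = []
    if isinstance(node, dict):
        for key, value in node.items():
            findings.extend(find_risky_fields(value, f"{path}/{key}"))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            findings.extend(find_risky_fields(value, f"{path}[{index}]"))
    elif isinstance(node, str):
        if node == "":
            findings.append((path, "empty string"))
        elif node.startswith(REF_PREFIX):
            findings.append((path, "dynamic reference"))
    return findings


# "Product_1" is a hypothetical object name used only for this example.
sample = json.loads("""
{
  "Contact_1": {"Email": "", "QRCodeSupported": false},
  "Dimension_1": {"HeightU": 1},
  "Product_1": {"ProductId": 0, "DeviceName": "<=/FruData_Fru0.SystemProductName"}
}
""")

for field_path, reason in find_risky_fields(sample):
    print(f"{field_path}: {reason}")
```

A report like this only points at candidates to review; whether a given empty field is actually required is something the check cannot decide.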


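Analysis of the BMC upgrade failure caused by PSR configuration

Problem description

During a BMC firmware upgrade, the upgrade failed because the PSR contained a specific configuration. The error log shows that an exception occurred while parsing the firmware package rootfs_openUBMC.hpm:

2026-03-22 16:58:09.959424 firmware_mgmt ERROR: task_instance.lua(159): parse package(rootfs_openUBMC.hpm) failed, ret:./opt/bmc/libmc/lualib/mc/utils.lua:1215: ./opt/bmc/libmc/lualib/mc/logging.lua:253: bad argument #2 to 'format_msg' (number expected, got nil)

Failing configuration items

The following PSR configuration was confirmed to cause the BMC upgrade to fail: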

{
  "ProductName": "",
  "ProductAlias": "",
  "DefaultProductAlias": "",
  "ProductPicture": "",
  "ProductUniqueID": "",
  "ProductVendorID": "",
  "ProductId": 0,
  "ProductVersion": "",
  "DeviceOwnerID": "<=/FruData_Fru0.ProductSerialNumber",
  "DeviceSerialNumber": "<=/FruData_Fru0.SystemSerialNumber",
  "DeviceName": "<=/FruData_Fru0.SystemProductName"
},
"Contact_1": {
  "OfficalWeb": "",
  "SupportWeb": "",
  "Email": "",
  "Phone": "",
  "Copyright": "",
  "DefaultCopyright": "",
  "KVMClientDownloadLink": "",
  "QRCodeSupported": false
},
"Dimension_1": {
  "HeightU": 1
}

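Solution: after the configuration above is deleted, the BMC upgrade completes successfully.


Root cause analysis

1. Empty and default values trigger a type error

The error message bad argument #2 to 'format_msg' (number expected, got nil) shows that the log formatting function format_msg was called with a nil value where a number was expected.

This indicates that during firmware package validation or parsing (hpm_package.lua), some numeric product information was nil when it was logged. The log line below shows that ProductId itself was read as 0 and the product ID check then passed, so the nil was most likely passed by a later log call; the log does not identify which field it was. One possibility is that a value such as ProductId, although configured as 0, is treated as nil or invalid in later processing, especially when it appears together with empty-string fields such as ProductVendorID, which leads to the type mismatch.

Related log: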

firmware_mgmt NOTICE: hpm_package.lua(415): System product info:ProductId(0), ProductVendorID(), ProductUniqueID()
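
The failure mode can be illustrated with a Python analogue. This is a sketch of the general pattern only, not the code in logging.lua: the `format_msg` below is a hypothetical stand-in, and the "product id" message is made up, since the log does not say which argument was nil. Like Lua's `%d`, Python's `%d` rejects a missing value, so an unguarded log call can abort the surrounding flow.

```python
def format_msg(fmt, *args):
    """Hypothetical stand-in for a log formatter; '%d' requires a number."""
    return fmt % args


def safe_format_msg(fmt, *args):
    """Guarded variant: report the raw arguments instead of raising."""
    try:
        return fmt % args
    except (TypeError, ValueError):
        return f"{fmt} (unformattable args: {args!r})"


# Unguarded call: a missing value (None here, nil in Lua) raises an error.
try:
    format_msg("product id: %d", None)
    unguarded_error = None
except TypeError as exc:
    unguarded_error = exc

print(type(unguarded_error).__name__)
print(safe_format_msg("product id: %d", None))
print(safe_format_msg("product id: %d", 0))
```

The guarded variant keeps the log line, with the raw arguments attached, so a formatting problem stays visible without stopping the caller.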

2. Evaluation of dynamic reference fields fails

The following fields use path reference syntax:

"DeviceOwnerID": "<=/FruData_Fru0.ProductSerialNumber",
"DeviceSerialNumber": "<=/FruData_Fru0.SystemSerialNumber",
"DeviceName": "<=/FruData_Fru0.SystemProductName"

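During the upgrade, these references may fail to resolve and return nil because the target object has not been loaded yet or the context has not been initialized. Early in the BMC upgrade (the FS_SIMPLE_UPGRADING stage), system resources and the object tree may be incomplete, so an attempt to access the /FruData_Fru0 path at that point would fail.

3. Object resolution dependencies in the upgrade flow

According to the knowledge base, the BMC self-discovery mechanism reads the CSR/PSR configuration from the EEPROM and from local files. If that configuration is abnormal or references an invalid path, object parsing fails, which in turn causes object distribution errors and affects the upgrade flow.

Reference document: "If an incorrect CSR is burned into the EEPROM, object parsing may fail, which in turn causes object distribution errors."

This case is not an EEPROM problem, but the principle is the same: an incorrect or incompatible PSR configuration can cause an exception at a critical stage and interrupt the upgrade flow.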
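
The effect of an incomplete object tree can be sketched as follows. This is an illustration under assumptions: it reads `<=/Object.Property` as "take Property from Object", which is inferred from the field values above rather than from openUBMC documentation, and `resolve_reference`, the dictionary-based tree, and the value `ExampleServer` are all hypothetical.

```python
REF_PREFIX = "<=/"


def resolve_reference(value, object_tree):
    """Return the referenced property, or None when the object or property is absent."""
    if not (isinstance(value, str) and value.startswith(REF_PREFIX)):
        return value  # literal value, nothing to resolve
    object_name, _, property_name = value[len(REF_PREFIX):].partition(".")
    return object_tree.get(object_name, {}).get(property_name)


early_tree = {}  # object tree before FruData_Fru0 has been loaded
loaded_tree = {"FruData_Fru0": {"SystemProductName": "ExampleServer"}}

ref = "<=/FruData_Fru0.SystemProductName"
print(resolve_reference(ref, early_tree))
print(resolve_reference(ref, loaded_tree))
```

Any consumer that assumes the resolved value is always present, such as a log call with a numeric format specifier, would need to handle the missing case explicitly.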

