2026-05-09 01:55:07: HAM7 NPU1 Rst Out detected.
2026-05-09 01:55:08: System Restart [Unknown].
2026-05-09 01:55:10: HAM3 NPU1 Rst Out detected.
2026-05-09 01:59:42: [Minor Warning], CPU0(socket CPU1) NO.2 uncorrectable error rate exceeded threshold by PFAE.(Caused by LLC_SRAM NFE)
2026-05-09 02:44:40: BMC detected system power off.
2026-05-09 02:50:39: BMC detected system power on.
2026-05-09 02:50:42: System Restart [ChassisControlCommand].
2026-05-09 02:55:00: [Minor Warning], CPU0(socket CPU1) NO.1 uncorrectable error rate exceeded threshold by PFAE.(Caused by LLC_SRAM NFE)
2026-05-09 03:15:25: HAM8 NPU1 Rst Out detected.
2026-05-09 03:15:26: HAM5 NPU1 Rst Out detected.
2026-05-09 03:15:26: HAM7 NPU1 Rst Out detected.
2026-05-09 03:15:28: HAM3 NPU1 Rst Out detected.
2026-05-09 03:15:28: HAM4 NPU1 Rst Out detected.
2026-05-09 03:15:28: System Restart [Unknown].
2026-05-09 03:15:29: HAM1 NPU1 Rst Out detected.
2026-05-09 03:15:31: HAM6 NPU1 Rst Out detected.
2026-05-09 03:16:45: [Minor Warning], CPU0(socket CPU1) NO.1 uncorrectable error rate exceeded threshold by PFAE.(Caused by LLC_SRAM NFE)
2026-05-09 03:22:45: HAM1 NPU1 Rst Out detected.
2026-05-09 03:22:45: HAM6 NPU1 Rst Out detected.
2026-05-09 03:22:46: HAM8 NPU1 Rst Out detected.
2026-05-09 03:22:46: HAM4 NPU1 Rst Out detected.
2026-05-09 03:22:47: HAM3 NPU1 Rst Out detected.
2026-05-09 03:22:47: System Restart [Unknown].
2026-05-09 03:22:48: HAM5 NPU1 Rst Out detected.
2026-05-09 03:22:49: HAM7 NPU1 Rst Out detected.
2026-05-09 03:24:07: [Minor Warning], CPU0(socket CPU1) NO.1 uncorrectable error rate exceeded threshold by PFAE.(Caused by LLC_SRAM NFE)
2026-05-09 03:24:37: HAM6 NPU1 Rst Out detected.
2026-05-09 03:24:37: HAM7 NPU1 Rst Out detected.
2026-05-09 03:24:38: HAM3 NPU1 Rst Out detected.
2026-05-09 03:24:38: HAM8 NPU1 Rst Out detected.
2026-05-09 03:24:38: HAM5 NPU1 Rst Out detected.
2026-05-09 03:24:38: HAM4 NPU1 Rst Out detected.
2026-05-09 03:24:39: System Restart [Unknown].
2026-05-09 03:24:40: HAM1 NPU1 Rst Out detected.
2026-05-09 03:26:19: [Minor Warning], CPU0(socket CPU1) NO.1 uncorrectable error rate exceeded threshold by PFAE.(Caused by LLC_SRAM NFE)
2026-05-09 05:28:30: HAM4 NPU1 Rst Out detected.
2026-05-09 05:28:30: HAM3 NPU1 Rst Out detected.
2026-05-09 05:28:30: HAM8 NPU1 Rst Out detected.
2026-05-09 05:28:31: HAM5 NPU1 Rst Out detected.
2026-05-09 05:28:31: HAM7 NPU1 Rst Out detected.
2026-05-09 05:28:33: System Restart [ChassisControlCommand].
2026-05-09 05:28:34: HAM1 NPU1 Rst Out detected.
2026-05-09 05:28:34: HAM6 NPU1 Rst Out detected.
2026-05-09 05:30:09: [Minor Warning], CPU0(socket CPU1) NO.1 uncorrectable error rate exceeded threshold by PFAE.(Caused by LLC_SRAM NFE)
2026-05-09 05:32:50: HAM4 NPU1 Rst Out detected.
2026-05-09 05:32:50: HAM3 NPU1 Rst Out detected.
2026-05-09 05:32:51: HAM5 NPU1 Rst Out detected.
2026-05-09 05:32:51: HAM6 NPU1 Rst Out detected.
2026-05-09 05:32:51: HAM7 NPU1 Rst Out detected.
2026-05-09 05:32:51: HAM8 NPU1 Rst Out detected.
2026-05-09 05:32:51: HAM1 NPU1 Rst Out detected.
2026-05-09 05:32:52: System Restart [Unknown].
2026-05-09 05:36:32: [Minor Warning], CPU0(socket CPU1) NO.1 uncorrectable error rate exceeded threshold by PFAE.(Caused by LLC_SRAM NFE)
2026-05-09 06:10:28: [Minor Warning], CPU0(socket CPU1) NO.2 correctable error rate exceeded threshold by PFAE.(Caused by CPU CORE0 L2C CE)
2026-05-09 06:25:48: [Minor Warning], CPU0(socket CPU1) NO.3 correctable error rate exceeded threshold by PFAE.(Caused by CPU CORE0 L2C CE)
2026-05-09 06:38:23: [Minor Warning], DIMM070 NO.1 correctable error rate exceeded threshold by PFAE.(Caused by DDRC0 CE)
2026-05-09 06:50:18: [Minor Warning], CPU0(socket CPU1) NO.4 correctable error rate exceeded threshold by PFAE.(Caused by PCIE LOCAL CORE1 PORT0_DL CE)
2026-05-09 07:02:38: [Major Warning],DIMM070 uncorrect error (2026-05-09 07:02:28).
                         SERRCODE: 0X0C (Refer to IERRCODE)
                         IERRCODE: 0X48 (HA access non-mirror space uncorrect error(poison enable))
                         Suggest to check and replace DIMM070.
2026-05-09 07:04:42: [Major Warning],DIMM070 uncorrect error (2026-05-09 07:02:28).
                         SERRCODE: 0X0C (Refer to IERRCODE)
                         IERRCODE: 0X48 (HA access non-mirror space uncorrect error(poison enable))
                         Suggest to check and replace DIMM070.
2026-05-09 07:04:43: [Major Warning],DIMM070 uncorrect error (2026-05-09 07:02:28).
                         SERRCODE: 0X0C (Refer to IERRCODE)
                         IERRCODE: 0X48 (HA access non-mirror space uncorrect error(poison enable))
                         Suggest to check and replace DIMM070.
2026-05-09 07:04:44: [Major Warning],DIMM070 uncorrect error (2026-05-09 07:04:33).
                         SERRCODE: 0X0C (Refer to IERRCODE)
                         IERRCODE: 0X48 (HA access non-mirror space uncorrect error(poison enable))
                         Suggest to check and replace DIMM070.
2026-05-09 07:06:58: HAM4 NPU1 Rst Out detected.
2026-05-09 07:06:58: HAM8 NPU1 Rst Out detected.
2026-05-09 07:06:59: HAM5 NPU1 Rst Out detected.
2026-05-09 07:06:59: HAM1 NPU1 Rst Out detected.
2026-05-09 07:06:59: HAM6 NPU1 Rst Out detected.
2026-05-09 07:07:00: HAM7 NPU1 Rst Out detected.
2026-05-09 07:07:00: System Restart [ChassisControlCommand].
2026-05-09 07:07:03: HAM3 NPU1 Rst Out detected.
2026-05-09 07:10:35: [Minor Warning], CPU0(socket CPU1) NO.1 uncorrectable error rate exceeded threshold by PFAE.(Caused by LLC_SRAM NFE)
2026-05-09 07:14:58: IMU triggered OOB collection and report start.
2026-05-09 07:14:58: IMU triggered OOB collection and report start.
2026-05-09 07:14:59: Caterr Signal Assert.
2026-05-09 07:14:59: IMU Heartbeat Lost!
2026-05-09 07:14:59: IMU collection finish but no data.
2026-05-09 07:14:59: BMC begin to collect DFX registers.
2026-05-09 07:16:29: HAM6 NPU1 Rst Out detected.
2026-05-09 07:16:29: HAM8 NPU1 Rst Out detected.
2026-05-09 07:16:30: HAM5 NPU1 Rst Out detected.
2026-05-09 07:16:30: HAM1 NPU1 Rst Out detected.
2026-05-09 07:16:31: HAM3 NPU1 Rst Out detected.
2026-05-09 07:16:31: HAM4 NPU1 Rst Out detected.
2026-05-09 07:16:32: HAM7 NPU1 Rst Out detected.
2026-05-09 07:16:32: System Restart [Unknown].
2026-05-09 07:16:33: BMC collection fail due to System reboot.
2026-05-09 07:16:51: FDM receive BIOS IERR data finish after warmreset but no data.
2026-05-09 07:16:51: FDM receive BIOS IERR data finish after warmreset but no data.
2026-05-09 07:17:55: [Minor Warning], CPU0(socket CPU1) NO.1 uncorrectable error rate exceeded threshold by PFAE.(Caused by LLC_SRAM NFE)
2026-05-09 09:33:50: HAM3 NPU1 Rst Out detected.
2026-05-09 09:33:51: HAM8 NPU1 Rst Out detected.
2026-05-09 09:33:51: HAM5 NPU1 Rst Out detected.
2026-05-09 09:33:51: HAM6 NPU1 Rst Out detected.
2026-05-09 09:33:52: HAM7 NPU1 Rst Out detected.
2026-05-09 09:33:52: HAM4 NPU1 Rst Out detected.
2026-05-09 09:33:53: HAM1 NPU1 Rst Out detected.
2026-05-09 09:33:53: System Restart [ChassisControlCommand].
2026-05-09 09:35:14: [Minor Warning], CPU0(socket CPU1) NO.1 uncorrectable error rate exceeded threshold by PFAE.(Caused by LLC_SRAM NFE)
2026-05-11 05:12:59: BMC detected system power on.
2026-05-11 05:13:03: System Restart [ACRestoreAlwaysPowerUp].
2026-05-11 05:15:24: [Minor Warning], CPU0(socket CPU1) NO.1 uncorrectable error rate exceeded threshold by PFAE.(Caused by LLC_SRAM NFE)
2026-05-12 06:52:16: BMC detected system power on.
2026-05-12 06:52:19: System Restart [ACRestoreAlwaysPowerUp].
2026-05-12 06:54:28: System Restart [Unknown].
2026-05-12 06:54:29: HAM5 NPU1 Rst Out detected.
2026-05-12 06:54:30: HAM7 NPU1 Rst Out detected.
2026-05-12 06:54:31: HAM1 NPU1 Rst Out detected.
2026-05-12 06:54:32: HAM3 NPU1 Rst Out detected.
2026-05-12 06:55:55: [Minor Warning], CPU0(socket CPU1) NO.1 uncorrectable error rate exceeded threshold by PFAE.(Caused by LLC_SRAM NFE)
2026-05-12 09:42:30: BMC detected system power off.
2026-05-12 09:45:16: BMC detected system power on.
2026-05-12 09:45:19: System Restart [ChassisControlCommand].
2026-05-12 09:47:28: [Minor Warning], CPU0(socket CPU1) NO.1 uncorrectable error rate exceeded threshold by PFAE.(Caused by LLC_SRAM NFE)
2026-05-12 09:48:29: HAM3 NPU1 Rst Out detected.
2026-05-12 09:48:30: HAM5 NPU1 Rst Out detected.
2026-05-12 09:48:32: System Restart [Unknown].
2026-05-12 09:48:32: HAM7 NPU1 Rst Out detected.
2026-05-12 09:48:34: HAM1 NPU1 Rst Out detected.
2026-05-12 09:52:06: [Minor Warning], CPU0(socket CPU1) NO.1 uncorrectable error rate exceeded threshold by PFAE.(Caused by LLC_SRAM NFE)
2026-05-13 01:12:38: HAM5 NPU1 Rst Out detected.
2026-05-13 01:12:38: HAM1 NPU1 Rst Out detected.
2026-05-13 01:12:39: HAM3 NPU1 Rst Out detected.
2026-05-13 01:12:40: HAM7 NPU1 Rst Out detected.
2026-05-13 01:12:40: System Restart [Unknown].
2026-05-13 01:14:00: [Minor Warning], CPU0(socket CPU1) NO.1 uncorrectable error rate exceeded threshold by PFAE.(Caused by LLC_SRAM NFE)
2026-05-13 02:01:58: BMC detected system power off.
2026-05-13 02:14:41: BMC detected system power on.
2026-05-13 02:14:44: System Restart [ChassisControlCommand].
2026-05-13 02:16:55: [Minor Warning], CPU0(socket CPU1) NO.1 uncorrectable error rate exceeded threshold by PFAE.(Caused by LLC_SRAM NFE)
2026-05-13 03:07:55: HAM3 NPU1 Rst Out detected.
2026-05-13 03:07:56: System Restart [Unknown].
2026-05-13 03:07:58: HAM1 NPU1 Rst Out detected.
2026-05-13 03:07:58: HAM7 NPU1 Rst Out detected.
2026-05-13 03:08:00: HAM5 NPU1 Rst Out detected.
2026-05-13 03:09:17: [Minor Warning], CPU0(socket CPU1) NO.1 uncorrectable error rate exceeded threshold by PFAE.(Caused by LLC_SRAM NFE)
2026-05-13 03:21:29: HAM5 NPU1 Rst Out detected.
2026-05-13 03:21:30: System Restart [Unknown].
2026-05-13 03:21:33: HAM1 NPU1 Rst Out detected.
2026-05-13 03:21:33: HAM7 NPU1 Rst Out detected.
2026-05-13 03:21:34: HAM3 NPU1 Rst Out detected.
2026-05-13 03:22:50: [Minor Warning], CPU0(socket CPU1) NO.1 uncorrectable error rate exceeded threshold by PFAE.(Caused by LLC_SRAM NFE)
2026-05-13 03:34:37: BMC detected system power on.
2026-05-13 03:34:40: System Restart [ACRestoreAlwaysPowerUp].
2026-05-13 03:36:59: [Minor Warning], CPU0(socket CPU1) NO.1 uncorrectable error rate exceeded threshold by PFAE.(Caused by LLC_SRAM NFE)
2026-05-13 04:00:51: BMC detected system power off.
2026-05-13 05:19:10: BMC detected system power on.
2026-05-13 05:19:12: System Restart [ACRestoreAlwaysPowerUp].
2026-05-13 05:21:52: [Minor Warning], CPU0(socket CPU1) NO.1 uncorrectable error rate exceeded threshold by PFAE.(Caused by LLC_SRAM NFE)
