Network Attached Storage


Both HNAS nodes rebooted due to a network loop detected on the network switch

  • 1.  Both HNAS nodes rebooted due to a network loop detected on the network switch

    Posted 06-14-2023 10:26

    Both HNAS nodes rebooted after a network loop was detected on the network switch. I'm looking for an explanation of why both nodes rebooted.

    I appreciate your help.



    ------------------------------
    Shafeeq Ahmed
    Systems Engineer
    DXC Technology
    ------------------------------


  • 2.  RE: Both HNAS nodes rebooted due to a network loop detected on the network switch

    Posted 06-14-2023 10:32

    3090 Warning     2023-06-06 15:34:14 1 ag2-vlan0333: network loop detected: this event, Id 3090, happened once since reset on the LPAR-Type1.

    2202 Severe      2023-06-06 15:34:11 1 Cluster: Node BR-HNAS-2 (id=2) has stopped responding.

    3006 Warning     2023-06-06 15:34:05 1 c0 link has gone down.

    6833 Severe      2023-06-06 15:34:05 1 File System: Mirroring of NVRAM to partner server is suspended because the inter-cluster links to another cluster node are unavailable.

    5192 Severe      2023-06-06 15:34:03 1 Cluster: Heart beating over management network to Node ID 2 lost.

    5195 Warning     2023-06-06 15:34:03 1 Cluster: Heart beating over high speed interconnect link to Node ID 2 lost.



    ------------------------------
    Shafeeq Ahmed
    Systems Engineer
    DXC Technology
    ------------------------------



  • 3.  RE: Both HNAS nodes rebooted due to a network loop detected on the network switch

    Posted 06-14-2023 10:42
    Hi Shafeeq

    I'm assuming you have a 2-node cluster.
    What is the output of these two commands?

    event-log-show -u 2 -o |grep -i sev

    event-log-show -u 1 -o |grep -i sev
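
    If it helps to focus on the incident itself, the same pipe-through-grep approach should also let you narrow the output to the crash window. A rough sketch (the hour filter is only an example and assumes the timestamp format shown in the output later in this thread):

    event-log-show -u 2 -o | grep -i sev | grep '2023-06-06 15:'
    event-log-show -u 1 -o | grep -i sev | grep '2023-06-06 15:'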

    Andy




  • 4.  RE: Both HNAS nodes rebooted due to a network loop detected on the network switch

    Posted 06-14-2023 12:29

     

    Hi

     

    $ event-log-show -u 2 -o |grep -i sev

    1093 Severe      2023-06-06 15:37:36-04:00 1 Last server shutdown wasn't clean (Software boot reason: Crashed, Unexpected app fail).

    5189 Information 2023-06-06 15:39:51-04:00 1 Cluster: Node ID 2 has taken EVS tcipnasevs1(ID=1) online.

    5189 Information 2023-06-06 15:39:54-04:00 1 Cluster: Node ID 2 has taken EVS tcipnasevs2(ID=2) online.

    5189 Information 2023-06-06 15:39:57-04:00 1 Cluster: Node ID 2 has taken EVS tcipnasevs3(ID=3) online.

    6825 Severe      2023-06-06 15:39:58-04:00 1 File System: The NVRAM log for file system (BRM-FS01) was not preserved on this cluster node.

    6825 Severe      2023-06-06 15:40:00-04:00 1 File System: The NVRAM log for file system (BRM-FS10) was not preserved on this cluster node.

    6825 Severe      2023-06-06 15:40:00-04:00 1 File System: The NVRAM log for file system (BRM-FS14) was not preserved on this cluster node.

    6825 Severe      2023-06-06 15:40:00-04:00 1 File System: The NVRAM log for file system (BRM-FS04) was not preserved on this cluster node.

    6825 Severe      2023-06-06 15:40:00-04:00 1 File System: The NVRAM log for file system (BRM_FS11_ALM) was not preserved on this cluster node.

    6825 Severe      2023-06-06 15:40:00-04:00 1 File System: The NVRAM log for file system (Test_PD_08JAN2020) was not preserved on this cluster node.

    6825 Severe      2023-06-06 15:40:00-04:00 1 File System: The NVRAM log for file system (BRM_POC1) was not preserved on this cluster node.

    6825 Severe      2023-06-06 15:40:02-04:00 1 File System: The NVRAM log for file system (BRM-FS05) was not preserved on this cluster node.

    6825 Severe      2023-06-06 15:40:02-04:00 1 File System: The NVRAM log for file system (BRM-FS07) was not preserved on this cluster node.

    6825 Severe      2023-06-06 15:40:02-04:00 1 File System: The NVRAM log for file system (BRM-FS15) was not preserved on this cluster node.

    6825 Severe      2023-06-06 15:40:02-04:00 1 File System: The NVRAM log for file system (BRM-FS13) was not preserved on this cluster node.

    6825 Severe      2023-06-06 15:40:02-04:00 1 File System: The NVRAM log for file system (BRM-FS06) was not preserved on this cluster node.

    6825 Severe      2023-06-06 15:40:07-04:00 1 File System: The NVRAM log for file system (BRM_POC2) was not preserved on this cluster node.

    6825 Severe      2023-06-06 15:40:10-04:00 1 File System: The NVRAM log for file system (BRM_FS12_DXC) was not preserved on this cluster node.

    6825 Severe      2023-06-06 15:40:10-04:00 1 File System: The NVRAM log for file system (Test_NFS) was not preserved on this cluster node.

    6825 Severe      2023-06-06 15:40:10-04:00 1 File System: The NVRAM log for file system (BRM-FS08) was not preserved on this cluster node.

    6825 Severe      2023-06-06 15:40:10-04:00 1 File System: The NVRAM log for file system (BRM-FS01) was not preserved on this cluster node.

    6825 Severe      2023-06-06 15:40:10-04:00 1 File System: The NVRAM log for file system (BRM-FS09) was not preserved on this cluster node.

    6825 Severe      2023-06-06 15:40:10-04:00 1 File System: The NVRAM log for file system (BRM-FS10) was not preserved on this cluster node.

    6825 Severe      2023-06-06 15:40:12-04:00 1 File System: The NVRAM log for file system (BRM-FS14) was not preserved on this cluster node.

    6825 Severe      2023-06-06 15:40:12-04:00 1 File System: The NVRAM log for file system (BRM-FS04) was not preserved on this cluster node.

    6825 Severe      2023-06-06 15:40:12-04:00 1 File System: The NVRAM log for file system (BRM_FS11_ALM) was not preserved on this cluster node.

    6825 Severe      2023-06-06 15:40:12-04:00 1 File System: The NVRAM log for file system (Test_PD_08JAN2020) was not preserved on this cluster node.

    6825 Severe      2023-06-06 15:40:12-04:00 1 File System: The NVRAM log for file system (BRM_POC1) was not preserved on this cluster node.

    6535 Severe      2023-06-06 15:40:21-04:00 1 CIFS: EVS 1 cannot establish a connection to any DCs.

    6535 Severe      2023-06-06 15:40:24-04:00 1 CIFS: EVS 2 cannot establish a connection to any DCs.

    5503 Severe      2023-06-06 15:40:26-04:00 1 Cluster: The cluster node has experienced an Ethernet failure.

    7767 Severe      2023-06-06 15:41:20-04:00 1 Span 'SP-01' (ID 922C2EDA8AB1B05) has gone offline because System Drive 0 (rack '441934', SD '0000'; used in span 'SP-01', ID 922C2EDA8AB1B05) has failed or become unlicensed.

    8533 Severe      2023-06-06 15:41:20-04:00 1 Filesystems BRM-FS01 (ID 922C3A8CEDE58EE), BRM-FS04 (ID 922CC962C8292E6), BRM-FS10 (ID 905B21956E5A292), BRM-FS14 (ID 91F4878ECECD507), BRM_FS11_ALM (ID 91261F40F6F2F54), BRM_POC1 (ID 2D75921232653B91) and Test_PD_08JAN2020 (ID 9C8E360296CAACD) have gone down.

    7767 Severe      2023-06-06 15:41:20-04:00 1 Span 'SP-02' (ID 922C280D10B36AE) has gone offline because System Drive 16 (rack '441934', SD '0010'; used in span 'SP-02', ID 922C280D10B36AE) has failed or become unlicensed.

    8533 Severe      2023-06-06 15:41:20-04:00 1 Filesystems BRM-FS05 (ID 922CCAC51C452FE), BRM-FS06 (ID 922CD7C746A045B), BRM-FS07 (ID 922CD262A6BFE87), BRM-FS08 (ID 90343FCE2CA36F5), BRM-FS09 (ID 9034394218F8FE5), BRM-FS13 (ID 91F4E834953A4E1), BRM-FS15 (ID 9DBAA20CBE44078), BRM_FS12_DXC (ID 914A6FF286270A5) and BRM_POC2 (ID 2D75922111EB241B) have gone down.

    8533 Severe      2023-06-06 15:41:20-04:00 1 Filesystem Test_NFS (ID 9CF758A3A435136) has gone down.

    8520 Severe      2023-06-06 15:41:20-04:00 1 Spanned primary SDs 0-63 are unhealthy.

    6760 Severe      2023-06-06 15:41:20-04:00 1 File System: file system (BRM-FS10) has failed.

    6760 Severe      2023-06-06 15:41:20-04:00 1 File System: file system (BRM-FS14) has failed.

    6760 Severe      2023-06-06 15:41:20-04:00 1 File System: file system (BRM-FS01) has failed.

    6760 Severe      2023-06-06 15:41:20-04:00 1 File System: file system (BRM_FS11_ALM) has failed.

    6760 Severe      2023-06-06 15:41:20-04:00 1 File System: file system (BRM-FS04) has failed.

    6760 Severe      2023-06-06 15:41:20-04:00 1 File System: file system (Test_PD_08JAN2020) has failed.

    6760 Severe      2023-06-06 15:41:20-04:00 1 File System: file system (BRM_POC1) has failed.

    6760 Severe      2023-06-06 15:41:20-04:00 1 File System: file system (BRM-FS15) has failed.

    6760 Severe      2023-06-06 15:41:20-04:00 1 File System: file system (Test_NFS) has failed.

    6760 Severe      2023-06-06 15:41:20-04:00 1 File System: file system (BRM-FS08) has failed.

    6760 Severe      2023-06-06 15:41:20-04:00 1 File System: file system (BRM-FS06) has failed.

    6760 Severe      2023-06-06 15:41:20-04:00 1 File System: file system (BRM-FS07) has failed.

    6760 Severe      2023-06-06 15:41:20-04:00 1 File System: file system (BRM-FS09) has failed.

    6760 Severe      2023-06-06 15:41:20-04:00 1 File System: file system (BRM-FS05) has failed.

    6760 Severe      2023-06-06 15:41:20-04:00 1 File System: file system (BRM_POC2) has failed.

    6760 Severe      2023-06-06 15:41:20-04:00 1 File System: file system (BRM_FS12_DXC) has failed.

    6760 Severe      2023-06-06 15:41:20-04:00 1 File System: file system (BRM-FS13) has failed.

    5503 Severe      2023-06-06 15:42:15-04:00 1 Cluster: The cluster node has experienced an Ethernet failure.

    1065 Severe      2023-06-06 15:47:46-04:00 1 Recovered fatal on MMB: "HFB vlsi fatal RX/T2_NIB_RX_REG/rx_stuck_block" panic at ./libs/vlsi/assert-monitor/AlteraAssertMonitor.cpp:89: in function void checkAsserts().

    1093 Severe      2023-06-06 15:47:46-04:00 1 Last server shutdown wasn't clean (Software boot reason: Crashed, Unexpected app fail).

    BR-HNAS-1:$

     

     

     

    For some reason, event-log-show -u 1 -o |grep -i sev returns no Severe alerts for cluster node 1.
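
    As a sanity check (assuming the same pipeline works against node 1), something like the following should show whether node 1 simply logged nothing Severe in that window or whether its event log has wrapped; the hour filter is only an example:

    event-log-show -u 1 -o | grep '2023-06-06 15:'
    event-log-show -u 1 -o | grep -iE 'severe|warning' | tail -40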

     

     






  • 5.  RE: Both HNAS nodes rebooted due to a network loop detected on the network switch

    Posted 06-14-2023 13:22

    Your VSP G400 Unified platform does not reach End of Service Life for another year or more, but I do see that your maintenance has not been renewed.

    Because you have a Unified HNAS with embedded NAS modules, there is no "external loop"; the ICC is internal to the "array".

    There is a hint in the output above, but again, you would have to open a GSC case in order to understand what occurred. It could be an internal "burp" on the array that caused this, or it could be an issue because the code on your array/NAS is so old.

    Either way, I have to shut this down on my end due to lack of support contract.



    ------------------------------
    Albert Hagopian
    Software Development Engineer - Specialist
    Hitachi Vantara
    ------------------------------



  • 6.  RE: Both HNAS nodes rebooted due to a network loop detected on the network switch

    Posted 06-14-2023 11:34

    Shafeeq, since you have an active account in Salesforce, the answer is to open a case and provide the diagnostics. Since your system should be connected to Hitachi Remote Ops, reboots usually cause automatic case generation with GSC.



    ------------------------------
    Albert Hagopian
    Software Development Engineer - Specialist
    Hitachi Vantara
    ------------------------------



  • 7.  RE: Both HNAS nodes rebooted due to a network loop detected on the network switch

    Posted 06-14-2023 11:44

    Actually, we are out of support from Hitachi. We did reach out to GSC, and they are saying these devices are not under support.

     






  • 8.  RE: Both HNAS nodes rebooted due to a network loop detected on the network switch

    Posted 06-14-2023 11:57

    That's unfortunate - this forum is not available to diagnose faults outside the auspices of case generation with GSC.

    It does make me wonder why your name/acct (Toyota Canada) is listed as active - I guess you are, but the legacy gear is out of contract support. You can always contact your local sales team and re-up support to get assistance.



    ------------------------------
    Albert Hagopian
    Software Development Engineer - Specialist
    Hitachi Vantara
    ------------------------------



  • 9.  RE: Both HNAS nodes rebooted due to a network loop detected on the network switch

    Posted 06-14-2023 12:12

    Thanks, just for discussion's sake:

    What would be your guess as to why a loop on the switch caused the HNAS nodes to reboot?

    Was the quorum network affected?

    We received a P1 and are working on the RCA.

    Also, can you guide me on which logs to check? That would be appreciated.

    Thanks,
    Shafeeq






  • 10.  RE: Both HNAS nodes rebooted due to a network loop detected on the network switch

    Posted 06-14-2023 12:28

    If a switch port goes into loop mode, then the nodes cannot communicate with each other over the ICC links, and thus there is no NVRAM mirroring.

    Beyond that, there's nothing I can say or do.



    ------------------------------
    Albert Hagopian
    Software Development Engineer - Specialist
    Hitachi Vantara
    ------------------------------



  • 11.  RE: Both HNAS nodes rebooted due to a network loop detected on the network switch

    Posted 06-15-2023 15:20
    >
    > Thanks, just for discussion sake,
    >
    > What will be your guesses, why loop on switch caused, hnas nodes to go reboot.
    >

    Based on the AlteraAssertMonitor event, the loop *might* not be the root cause of the reboot.
    This may have been an FPGA SEU (single event upset).

    That said, a network loop is something that should be completely avoided.
    (Truly answering a question like "can a network loop cause an HNAS crash?" would
    require HNAS internals (developer) knowledge.)
    I would put ops effort into preventing loops.

    Also, your HNAS has some other issues (unhealthy SDs, licensing, ...).


    1093 Severe 2023-06-06 15:37:36-04:00 1 Last server shutdown wasn't clean
    (Software boot reason: Crashed, Unexpected app fail).


    5503 Severe 2023-06-06 15:40:26-04:00 1 Cluster: The cluster node has experienced an Ethernet failure.

    7767 Severe 2023-06-06 15:41:20-04:00 1 Span 'SP-01' (ID 922C2EDA8AB1B05) has gone offline because
    System Drive 0 (rack '441934', SD '0000'; used in span 'SP-01', ID 922C2EDA8AB1B05) has failed or become unlicensed.


    7767 Severe 2023-06-06 15:41:20-04:00 1 Span 'SP-02' (ID 922C280D10B36AE) has gone offline because
    System Drive 16 (rack '441934', SD '0010'; used in span 'SP-02', ID 922C280D10B36AE) has failed or become unlicensed.


    8520 Severe 2023-06-06 15:41:20-04:00 1 Spanned primary SDs 0-63 are unhealthy.

    6760 Severe 2023-06-06 15:41:20-04:00 1 File System: file system (BRM-FS10) has failed.


    5503 Severe 2023-06-06 15:42:15-04:00 1 Cluster: The cluster node has experienced an Ethernet failure.

    1065 Severe 2023-06-06 15:47:46-04:00 1 Recovered fatal on MMB:
    "HFB vlsi fatal RX/T2_NIB_RX_REG/rx_stuck_block"
    panic at ./libs/vlsi/assert-monitor/AlteraAssertMonitor.cpp:89:
    in function void checkAsserts().

    1093 Severe 2023-06-06 15:47:46-04:00 1 Last server shutdown wasn't clean
    (Software boot reason: Crashed, Unexpected app fail).




    > Is quorum network got affected ?
    >
    > We received P1, and working on RCA.
    >
    > And can u guide me to check which logs, that will be appreciated.
    >
    >
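
    On the "which logs to check" question: the event codes already visible in this thread are enough to build an RCA timeline from both nodes. A rough sketch, assuming the same CLI-plus-grep pipeline shown earlier (3090 = loop detected, 5192/5195 = heartbeat lost, 2202 = partner stopped responding, 5503 = Ethernet failure, 6833/6825 = NVRAM mirroring, 1093/1065 = unclean shutdown / recovered fatal):

    event-log-show -u 1 -o | grep -wE '3090|5192|5195|2202|5503|6833|6825|1093|1065'
    event-log-show -u 2 -o | grep -wE '3090|5192|5195|2202|5503|6833|6825|1093|1065'

    Merged by timestamp, that should show whether the loop-detection and heartbeat-loss events on both nodes precede the crash markers.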




  • 12.  RE: Both HNAS nodes rebooted due to a network loop detected on the network switch

    Posted 06-15-2023 15:34

    Thanks Andrew,

    But as soon as we applied some settings on the network switch, we saw multiple alerts on the Hitachi HNAS: cluster node alerts, heartbeat alerts, reboot alerts and file system alerts.

    And as soon as we reverted the settings on the network switch, everything came back to normal on the Hitachi HNAS.

    I am really trying to understand the problem so that it can be avoided in the future.

    Appreciate your help.

    Thanks,
    Shafeeq






  • 13.  RE: Both HNAS nodes rebooted due to a network loop detected on the network switch

    Posted 06-16-2023 09:44

    Whilst I've known Andy since starting at BlueArc in '95 (he's one of the most thorough customers in existence), I can emphatically state that the error he is referencing is not an FPGA SEU.

    1065 Severe 2023-06-06 15:47:46-04:00 1 Recovered fatal on MMB: "HFB vlsi fatal RX/T2_NIB_RX_REG/rx_stuck_block" panic at ./libs/vlsi/assert-monitor/AlteraAssertMonitor.cpp:89: in function void checkAsserts().

    SEUs on Unified platforms are essentially non-existent thanks to more modern FPGAs, and the same goes for the HNAS 5000 series (since that platform leverages the NAS module used on Unified).

    We have exactly one occurrence of that error on the Unified platform, and a fix was placed into a version much earlier than the one being used at this site, which is SVOS 83-05-43, containing HNAS v13.8.6320.10 (currently 2.5 years old).

    The first order of business would be to upgrade the system to the very latest 14.6 build (more modern SVOS as well, obviously), but that can't be done since Shafeeq does not have a valid support contract on this gear.

    Shafeeq should talk to his local account team and see if he can pay a one-time fee to get FSE/CSS support time to upgrade the system. After that, if there is a recurrence when he "tweaks his network", he can provide that info to GSC for analysis, to understand what changes were made and why those changes would cause the server(s) to crash.

    No one on planet earth is going to be able to answer the question as to "why" without diagnostics and details on the exact changes made on the network.

    Now, it's somewhat funny that Shafeeq has his storage array set up to call back into Hitachi Remote Ops (HRO), but not the HNAS portion. That's how I was able to find his SVOS FW version. Had the HNAS portion been set up, I would have been able to look at the diags that are auto-generated and sent to HRO. Even then, I would only be able to shed some additional color.



    ------------------------------
    Albert Hagopian
    Software Development Engineer - Specialist
    Hitachi Vantara
    ------------------------------



  • 14.  RE: Both HNAS nodes rebooted due to a network loop detected on the network switch

    Posted 06-14-2023 12:29

    Hi-Track alert

    Recent Event History

    Code Severity    Timestamp           Text

    5167 Warning     2023-06-06 15:49:05 2 No EVS is running on cluster node 2.
    1368 Warning     2023-06-06 15:49:01 2 The MMB memory size (29440 MB) is invalid.
    6830 Warning     2023-06-06 15:48:59 2 File System: Mirroring of NVRAM to partner server is not in effect.
    1093 Severe      2023-06-06 15:47:46 2 Last server shutdown wasn't clean (Software boot reason: Crashed, Unexpected app fail).
    1065 Severe      2023-06-06 15:47:46 2 Recovered fatal on MMB: "HFB vlsi fatal RX/T2_NIB_RX_REG/rx_stuck_block" panic at ./libs/vlsi/assert-monitor/AlteraAssertMonitor.cpp:89: in function void checkAsserts().
    1621 Warning     2023-06-06 15:45:27 1 PAPI IP address configuration error: Duplicate address found: 2002::afb:3002 already in use in the network.
    1523 Warning     2023-06-06 15:45:27 1 PAPI housekeeping failed in IPAddress.
    5608 Warning     2023-06-06 15:45:13 1 Object Replication: Failed to start replication policy 8ae44200-bc9e-11d3-9025-3428182d4732 schedule 1 on EVS 2 (Source file system has invalid persona object).
    9700 Warning     2023-06-06 15:44:41 1 The live file system usage exceeds 90% for BRM-FS04 on EVS tcipnasevs1 (ID=1).
    7415 Warning     2023-06-06 15:44:37 1 Filesystem 'BRM-FS04' (ID 922CC962C8292E6) on span 'SP-01' (ID 922C2EDA8AB1B05) needs more space but can't auto-expand: the filesystem is confined.
    6833 Severe      2023-06-06 15:44:36 1 File System: Mirroring of NVRAM to partner server is suspended because the cluster configuration has changed.
    2202 Severe      2023-06-06 15:44:32 1 Cluster: Node BR-HNAS-2 (id=2) has stopped responding.
    1523 Warning     2023-06-06 15:44:24 1 PAPI housekeeping failed in SMTP.
    1523 Warning     2023-06-06 15:44:24 1 PAPI housekeeping failed in Hosts.
    5192 Severe      2023-06-06 15:44:24 1 Cluster: Heart beating over management network to Node ID 2 lost.
    3006 Warning     2023-06-06 15:44:20 1 c0 link has gone down.
    1523 Warning     2023-06-06 15:44:15 1 PAPI housekeeping failed in NSOrder.
    1523 Warning     2023-06-06 15:44:15 1 PAPI housekeeping failed in NIS.
    1523 Warning     2023-06-06 15:44:15 1 PAPI housekeeping failed in DNS.
    1523 Warning     2023-06-06 15:44:15 1 PAPI housekeeping failed in MgmntUser.
    5195 Warning     2023-06-06 15:42:33 1 Cluster: Heart beating over high speed interconnect link to Node ID 2 lost.
    5195 Warning     2023-06-06 15:42:33 2 Cluster: Heart beating over high speed interconnect link to Node ID 1 lost.
    5503 Severe      2023-06-06 15:42:15 2 Cluster: The cluster node has experienced an Ethernet failure.
    5167 Warning     2023-06-06 15:42:13 1 No EVS is running on cluster node 1.
    1368 Warning     2023-06-06 15:42:11 1 The MMB memory size (29440 MB) is invalid.
    6830 Warning     2023-06-06 15:42:10 1 File System: Mirroring of NVRAM to partner server is not in effect.
    2105 Warning     2023-06-06 15:42:07 2 Buffer #1972 (SIM_DATA_ECHO_DPV_INLINE) on file system BRM-FS01 has gone for 1.796 min (@2023-06-06 15:40:19.968-04:00) without a response.
    2105 Warning     2023-06-06 15:42:07 2 Buffer #1970 (SIM_READ_FS) on file system BRM-FS08 has gone for 1.715 min (@2023-06-06 15:40:24.804-04:00) without a response.
    2105 Warning     2023-06-06 15:42:07 2 Buffer #1967 (SIM_READ_FS) on file system BRM_FS12_DXC has gone for 1.75 min (@2023-06-06 15:40:22.707-04:00) without a response.
    2105 Warning     2023-06-06 15:42:07 2 Buffer #1965 (SIM_READ_FS) on file system BRM_POC2 has gone for 1.753 min (@2023-06-06 15:40:22.523-04:00) without a response.
    2105 Warning     2023-06-06 15:42:07 2 Buffer #1963 (WLOG_READ_NVRAM) on file system BRM-FS01 has gone for 1.295 min (@2023-06-06 15:40:50.006-04:00) without a response.
    2105 Warning     2023-06-06 15:42:07 2 Buffer #1958 (SIM_READ_FS) on file system BRM-FS10 has gone for 1.714 min (@2023-06-06 15:40:24.875-04:00) without a response.
    2105 Warning     2023-06-06 15:42:07 2 Buffer #1957 (SIM_READ_FS) on file system BRM-FS01 has gone for 1.714 min (@2023-06-06 15:40:24.875-04:00) without a response.
    2105 Warning     2023-06-06 15:42:07 2 Buffer #1956 (WLOG_READ_NVRAM) on file system BRM-FS01 has gone for 1.805 min (@2023-06-06 15:40:19.457-04:00) without a response.
    2105 Warning     2023-06-06 15:42:07 2 Buffer #1952 (WLOG_READ_NVRAM) on file system BRM-FS01 has gone for 1.805 min (@2023-06-06 15:40:19.457-04:00) without a response.
    2105 Warning     2023-06-06 15:42:07 2 Buffer #1945 (WFS_PING) has gone for 1.114 min (@2023-06-06 15:41:00.868-04:00) without a response.
    2105 Warning     2023-06-06 15:42:07 2 Buffer #1943 (WLOG_READ_NVRAM) on file system BRM-FS01 has gone for 1.805 min (@2023-06-06 15:40:19.457-04:00) without a response.
    2105 Warning     2023-06-06 15:42:07 2 Buffer #1939 (WLOG_READ_NVRAM) on file system BRM-FS01 has gone for 1.805 min (@2023-06-06 15:40:19.457-04:00) without a response.
    2105 Warning     2023-06-06 15:42:07 2 Buffer #1938 (SIM_READ_FS) on file system BRM-FS09 has gone for 1.715 min (@2023-06-06 15:40:24.804-04:00) without a response.
    2105 Warning     2023-06-06 15:42:07 2 Buffer #1937 (SIM_READ_FS) on file system Test_NFS has gone for 1.72 min (@2023-06-06 15:40:24.522-04:00) without a response.
    2105 Warning     2023-06-06 15:42:07 2 Buffer #1934 (WLOG_READ_NVRAM) on file system BRM-FS01 has gone for 1.805 min (@2023-06-06 15:40:19.457-04:00) without a response.
    2105 Warning     2023-06-06 15:42:07 2 Buffer #1933 (WLOG_READ_NVRAM) on file system BRM-FS01 has gone for 1.805 min (@2023-06-06 15:40:19.457-04:00) without a response.
    3035 Warning     2023-06-06 15:41:41 1 ag3 link has gone down (member interfaces: tg5, tg6).
    3035 Warning     2023-06-06 15:41:41 1 ag1 link has gone down (member interfaces: tg1, tg2).
    6760 Severe      2023-06-06 15:41:20 2 File System: file system (BRM-FS13) has failed.
    6760 Severe      2023-06-06 15:41:20 2 File System: file system (BRM_FS12_DXC) has failed.
    6760 Severe      2023-06-06 15:41:20 2 File System: file system (BRM_POC2) has failed.
    6760 Severe      2023-06-06 15:41:20 2 File System: file system (BRM-FS05) has failed.
    6760 Severe      2023-06-06 15:41:20 2 File System: file system (BRM-FS09) has failed.
    6760 Severe      2023-06-06 15:41:20 2 File System: file system (BRM-FS07) has failed.
    6760 Severe      2023-06-06 15:41:20 2 File System: file system (BRM-FS06) has failed.
    6760 Severe      2023-06-06 15:41:20 2 File System: file system (BRM-FS08) has failed.
    6760 Severe      2023-06-06 15:41:20 2 File System: file system (Test_NFS) has failed.
    6760 Severe      2023-06-06 15:41:20 2 File System: file system (BRM-FS15) has failed.
    6760 Severe      2023-06-06 15:41:20 2 File System: file system (BRM_POC1) has failed.
    6760 Severe      2023-06-06 15:41:20 2 File System: file system (BRM-FS04) has failed.
    6760 Severe      2023-06-06 15:41:20 2 File System: file system (BRM_FS11_ALM) has failed.
    6760 Severe      2023-06-06 15:41:20 2 File System: file system (BRM-FS01) has failed.
    6760 Severe      2023-06-06 15:41:20 2 File System: file system (BRM-FS14) has failed.
    6760 Severe      2023-06-06 15:41:20 2 File System: file system (BRM-FS10) has failed.
    8520 Severe      2023-06-06 15:41:20 2 Spanned primary SDs 0-63 are unhealthy.
    8533 Severe      2023-06-06 15:41:20 2 Filesystems BRM-FS05 (ID 922CCAC51C452FE), BRM-FS06 (ID 922CD7C746A045B), BRM-FS07 (ID 922CD262A6BFE87), BRM-FS08 (ID 90343FCE2CA36F5), BRM-FS09 (ID 9034394218F8FE5), BRM-FS13 (ID 91F4E834953A4E1), BRM-FS15 (ID 9DBAA20CBE44078), BRM_FS12_DXC (ID 914A6FF286270A5) and BRM_POC2 (ID 2D75922111EB241B) have gone down.
    7767 Severe      2023-06-06 15:41:20 2 Span 'SP-02' (ID 922C280D10B36AE) has gone offline because System Drive 16 (rack '441934', SD '0010'; used in span 'SP-02', ID 922C280D10B36AE) has failed or become unlicensed.
    8533 Severe      2023-06-06 15:41:20 2 Filesystems BRM-FS01 (ID 922C3A8CEDE58EE), BRM-FS04 (ID 922CC962C8292E6), BRM-FS10 (ID 905B21956E5A292), BRM-FS14 (ID 91F4878ECECD507), BRM_FS11_ALM (ID 91261F40F6F2F54), BRM_POC1 (ID 2D75921232653B91) and Test_PD_08JAN2020 (ID 9C8E360296CAACD) have gone down.
    7767 Severe      2023-06-06 15:41:20 2 Span 'SP-01' (ID 922C2EDA8AB1B05) has gone offline because System Drive 0 (rack '441934', SD '0000'; used in span 'SP-01', ID 922C2EDA8AB1B05) has failed or become unlicensed.
    5608 Warning     2023-06-06 15:40:59 2 Object Replication: Failed to start replication policy 42487f94-7b10-11d4-98d5-3428182d4732 schedule 1 on EVS 2 (Source file system has invalid persona object).
    5608 Warning     2023-06-06 15:40:59 2 Object Replication: Failed to start replication policy f549dce8-c671-11d5-955e-040401090304 schedule 1 on EVS 1 (Source file system has invalid persona object).
    1093 Severe      2023-06-06 15:40:55 1 Last server shutdown wasn't clean (Software boot reason: Crashed, Unexpected app fail).
    5503 Severe      2023-06-06 15:40:26 2 Cluster: The cluster node has experienced an Ethernet failure.
    6535 Severe      2023-06-06 15:40:24 2 CIFS: EVS 2 cannot establish a connection to any DCs.
    3035 Warning     2023-06-06 15:40:21 2 ag2 link has gone down (member interfaces: tg3, tg4).
    6535 Severe      2023-06-06 15:40:21 2 CIFS: EVS 1 cannot establish a connection to any DCs.
    3035 Warning     2023-06-06 15:40:21 2 ag1 link has gone down (member interfaces: tg1, tg2).
    3035 Warning     2023-06-06 15:40:21 2 ag3 link has gone down (member interfaces: tg5, tg6).
    2117 Warning     2023-06-06 15:40:20 2 warning assert TDP/T2_PROT_BUFF_MGR/fsm_ni_rx_out_of_buffers from TdpH1 (HFB1): this event, Id 2117, happened once since reset on the HFB1.
    2117 Warning     2023-06-06 15:40:20 2 warning assert TDP/T2_PROT_BUFF_MGR/out_of_buffers from TdpH1 (HFB1): this event, Id 2117, happened once since reset on the HFB1.
    2117 Warning     2023-06-06 15:40:17 2 warning assert TDP/T2_PROT_BUFF_MGR/fsm_ni_hi_pri_rx_out_of_buffers from TdpH1 (HFB1): this event, Id 2117, happened once since reset on the HFB1.
    2117 Warning     2023-06-06 15:40:17 2 warning assert TDP/T2_PROT_BUFF_MGR/fsm_nv_rx_out_of_buffers from TdpH1 (HFB1): this event, Id 2117, happened once since reset on the HFB1.
    2117 Warning     2023-06-06 15:40:17 2 warning assert TDP/T2_PROT_BUFF_MGR/fsm_di_rx_out_of_buffers from TdpH1 (HFB1): this event, Id 2117, happened once since reset on the HFB1.
    2117 Warning     2023-06-06 15:40:17 2 warning assert TDP/T2_PROT_BUFF_MGR/fsm_icc_rx_out_of_buffers from TdpH1 (HFB1): this event, Id 2117, happened once since reset on the HFB1.
    2117 Warning     2023-06-06 15:40:17 2 warning assert TDP/T2_PROT_BUFF_MGR/fsm_ni_tx_out_of_buffers from TdpH1 (HFB1): this event, Id 2117, happened once since reset on the HFB1.
    6825 Severe      2023-06-06 15:40:12 2 File System: The NVRAM log for file system (BRM_POC1) was not preserved on this cluster node.
    6825 Severe      2023-06-06 15:40:12 2 File System: The NVRAM log for file system (Test_PD_08JAN2020) was not preserved on this cluster node.
    6825 Severe      2023-06-06 15:40:12 2 File System: The NVRAM log for file system (BRM_FS11_ALM) was not preserved on this cluster node.
    6825 Severe      2023-06-06 15:40:12 2 File System: The NVRAM log for file system (BRM-FS04) was not preserved on this cluster node.
    6825 Severe      2023-06-06 15:40:12 2 File System: The NVRAM log for file system (BRM-FS14) was not preserved on this cluster node.
    6825 Severe      2023-06-06 15:40:10 2 File System: The NVRAM log for file system (BRM-FS10) was not preserved on this cluster node.
    6825 Severe      2023-06-06 15:40:10 2 File System: The NVRAM log for file system (BRM-FS09) was not preserved on this cluster node.
    6825 Severe      2023-06-06 15:40:10 2 File System: The NVRAM log for file system (BRM-FS01) was not preserved on this cluster node.
    6825 Severe      2023-06-06 15:40:10 2 File System: The NVRAM log for file system (BRM-FS08) was not preserved on this cluster node.
    6825 Severe      2023-06-06 15:40:10 2 File System: The NVRAM log for file system (Test_NFS) was not preserved on this cluster node.
    6825 Severe      2023-06-06 15:40:10 2 File System: The NVRAM log for file system (BRM_FS12_DXC) was not preserved on this cluster node.
    6825 Severe      2023-06-06 15:40:07 2 File System: The NVRAM log for file system (BRM_POC2) was not preserved on this cluster node.
    6825 Severe      2023-06-06 15:40:02 2 File System: The NVRAM log for file system (BRM-FS06) was not preserved on this cluster node.
    6825 Severe      2023-06-06 15:40:02 2 File System: The NVRAM log for file system (BRM-FS13) was not preserved on this cluster node.
    6825 Severe      2023-06-06 15:40:02 2 File System: The NVRAM log for file system (BRM-FS15) was not preserved on this cluster node.
    6825 Severe      2023-06-06 15:40:02 2 File System: The NVRAM log for file system (BRM-FS07) was not preserved on this cluster node.
    6825 Severe      2023-06-06 15:40:02 2 File System: The NVRAM log for file system (BRM-FS05) was not preserved on this cluster node.
    6825 Severe      2023-06-06 15:40:00 2 File System: The NVRAM log for file system (BRM_POC1) was not preserved on this cluster node.
    6825 Severe      2023-06-06 15:40:00 2 File System: The NVRAM log for file system (Test_PD_08JAN2020) was not preserved on this cluster node.
    6825 Severe      2023-06-06 15:40:00 2 File System: The NVRAM log for file system (BRM_FS11_ALM) was not preserved on this cluster node.
    6825 Severe      2023-06-06 15:40:00 2 File System: The NVRAM log for file system (BRM-FS04) was not preserved on this cluster node.
    6825 Severe      2023-06-06 15:40:00 2 File System: The NVRAM log for file system (BRM-FS14) was not preserved on this cluster node.
    6825 Severe      2023-06-06 15:40:00 2 File System: The NVRAM log for file system (BRM-FS10) was not preserved on this cluster node.
    6825 Severe      2023-06-06 15:39:58 2 File System: The NVRAM log for file system (BRM-FS01) was not preserved on this cluster node.
    3090 Warning     2023-06-06 15:39:51 2 ag2-vlan0333: network loop detected: this event, Id 3090, happened once since reset on the LPAR-Type1.
    1368 Warning     2023-06-06 15:39:30 2 The MMB memory size (29440 MB) is invalid.
    6830 Warning     2023-06-06 15:39:26 2 File System: Mirroring of NVRAM to partner server is not in effect.
    3035 Warning     2023-06-06 15:38:22 2 ag3 link has gone down (member interfaces: tg5, tg6).
    1093 Severe      2023-06-06 15:37:36 2 Last server shutdown wasn't clean (Software boot reason: Crashed, Unexpected app fail).
    6535 Severe      2023-06-06 15:34:43 1 CIFS: EVS 2 cannot establish a connection to any DCs.
    6535 Severe      2023-06-06 15:34:25 1 CIFS: EVS 1 cannot establish a connection to any DCs.
    3035 Warning     2023-06-06 15:34:25 1 ag1 link has gone down (member interfaces: tg1, tg2).
    7415 Warning     2023-06-06 15:34:24 1 Filesystem 'BRM-FS07' (ID 922CD262A6BFE87) on span 'SP-02' (ID 922C280D10B36AE) needs more space but can't auto-expand: the filesystem is confined.
    3090 Warning     2023-06-06 15:34:14 1 ag2-vlan0333: network loop detected: this event, Id 3090, happened once since reset on the LPAR-Type1.
    2202 Severe      2023-06-06 15:34:11 1 Cluster: Node BR-HNAS-2 (id=2) has stopped responding.
    3006 Warning     2023-06-06 15:34:05 1 c0 link has gone down.
    6833 Severe      2023-06-06 15:34:05 1 File System: Mirroring of NVRAM to partner server is suspended because the inter-cluster links to another cluster node are unavailable.
    5192 Severe      2023-06-06 15:34:03 1 Cluster: Heart beating over management network to Node ID 2 lost.
    5195 Warning     2023-06-06 15:34:03 1 Cluster: Heart beating over high speed interconnect link to Node ID 2 lost.
    6535 Severe      2023-06-06 15:32:35 1 CIFS: EVS 1 cannot establish a connection to any DCs.
    6568 Warning     2023-06-06 15:32:18 2 SMB client request to server 'TCIADDC1' timed out after 30 s: this event, Id 6568, happened once in the last 19.76 d on the LPAR-Type1.



    ------------------------------
    Shafeeq Ahmed
    Systems Engineer
    DXC Technology
    ------------------------------