RNRP Loop Detection
Good morning. Could you help us with the following issue?
We noticed a warning on one of the nodes of the system. All the nodes went down suddenly because the UPS failed, and after rebooting, the domain controller node shows "Suspect Loop Detection". I am not sure whether this node has an Intel card installed, in which case we could check the RSS feature and disable it. Another option would be to reinstall the driver or install a newer one. Besides the ideas above, do you know of anything else we could try? Also, does the RNRP loop protection feature only work for AC800 CN ports, or does it also work on servers, where it could block a port that has a loop?
Thanks for your support
Best Regards
Ricardo
Answers
Did you try clearing the buffer? I think it is option 4 in the RNRP fault tracer.
Clear 239.239.239.xx (where xx is the network area) using option 11.
After clearing the buffer, re-run the check (option 1) and post the results.
The loop protection applies to all RNRP nodes, not just controllers. It will try to re-enable by itself after some time (I think 60 minutes).
I would not expect RNRP's problem with RSS to strike randomly at the start of a computer; rather, it would cause intermittent false loop detection from boot time until you disable the feature.
The following technical description is a good starting point:
System 800xA, Network Loops and Storm Protection, 3BSE060651
A revision of that document is underway; some minor factual errors will be corrected in the upcoming revision.
After reading it you will understand that ABB equipment has different methods to protect itself against network loops and storms, and that these have changed over time and across versions.
- Loop Detection (all nodes since v4.0)
- Loop Protection
- Storm Protection (only AC 800M since v5.1)
RNRP detects a loop when it sees the *same* routing telegram too many times. Each routing telegram has a sequence number, which allows receivers to tell whether the telegram is a replica of a previous one or is new. Each emitted sequence number should be unique on the network for several minutes, until the counter wraps.
Currently, thirteen (13) replicas within 500 ms are required to detect a loop.
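To make the counting rule concrete, here is a minimal Python sketch of that kind of detector. It is only a toy model, not ABB's implementation: the class name, the sliding-window bookkeeping, and the choice to count copies beyond the first arrival as "replicas" are all my assumptions. Only the two numbers (13 replicas, 500 ms) come from the description above.

```python
from collections import defaultdict, deque

REPLICA_THRESHOLD = 13   # replicas required, as stated above
WINDOW_SECONDS = 0.5     # 500 ms window, as stated above


class LoopDetector:
    """Toy model: suspect a loop when one telegram is seen too often in a short window."""

    def __init__(self, threshold=REPLICA_THRESHOLD, window=WINDOW_SECONDS):
        self.threshold = threshold
        self.window = window
        # (source, sequence number) -> arrival times still inside the window
        self._seen = defaultdict(deque)

    def observe(self, source, seq, now):
        """Record one routing telegram; return True if a loop is suspected."""
        times = self._seen[(source, seq)]
        times.append(now)
        # Forget arrivals that have fallen out of the sliding window.
        while now - times[0] > self.window:
            times.popleft()
        # Assumption: the first arrival is the original, the rest are replicas.
        replicas = len(times) - 1
        return replicas >= self.threshold
```

Feeding it the same (source, sequence number) pair 14 times within 130 ms trips the detector on the 14th arrival, while telegrams with fresh sequence numbers never do, which is why a sender that keeps its sequence numbers unique is not mistaken for a loop.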
RNRP routing telegrams are multicast and will likely enter an endless resending loop if the network contains a loop. Spanning Tree or Rapid Spanning Tree is *way* too slow to detect this problem before the controllers, etc. close their ports. ABB recommends using proprietary L2 redundancy protocols, e.g. HiPER ring of Hirschmann or FRNT of Westermo. STP or RSTP does not really belong in a control system, but could be left enabled in switches, etc. as a *slow acting* loop resolver.
AC 800M Storm Protection is a completely different feature. It measures the rate of telegrams (of any kind or source) and disables the interface when a certain threshold is reached. Different hardware has different thresholds, which vary between 800 and 1600 telegrams per second. Storm Protection has *replaced* Loop Protection in recent AC 800M firmware. Loops are still sensed and reported by RNRP, but do not trigger blocking of any interface.
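For contrast with loop detection, a rate-based guard can be sketched like this in Python. Again, this is a hypothetical illustration: the class name, the one-second sliding window, and the absence of any re-enable logic are my assumptions, and the real firmware may measure the rate differently. The only figures taken from the description are the 800 to 1600 telegrams per second thresholds.

```python
from collections import deque


class StormGuard:
    """Toy model: disable an interface when the telegram rate reaches a threshold."""

    def __init__(self, max_per_second):
        self.max_per_second = max_per_second  # e.g. 800 to 1600, depending on hardware
        self.enabled = True
        self._arrivals = deque()  # arrival times within the last second

    def on_telegram(self, now):
        """Count one telegram of any kind or source; return whether the interface is enabled."""
        if not self.enabled:
            return False
        self._arrivals.append(now)
        # Assumption: the rate is measured over a one-second sliding window.
        while now - self._arrivals[0] >= 1.0:
            self._arrivals.popleft()
        if len(self._arrivals) >= self.max_per_second:
            self.enabled = False
        return self.enabled
```

Note that the guard never looks at sequence numbers or sources. It only counts, so a broadcast storm with no loop at all would trip it, whereas the loop detector would stay quiet.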
Example of looped telegram - I have already identified & filtered on the suspect telegram (172.17.0.27)
The reason was a slow acting ring redundancy protocol. Each time a previously open ring is closed, there is a slight delay before the ring manager detects ring closure. During this delay, telegrams will loop. With redundant networking, traffic will be rerouted to the secondary path while the RNRP blocking takes place. Normally, a short loop will result in a 7 second long block of Ethernet interfaces.
We faced a similar problem: during operation, all nodes suddenly went down due to loop detection.
No physical network loops were found.
After we cleared the RNRP buffer memory, the problem was resolved.