AC800M Control Net Network Quality (Messages lost)
Hi,
We changed our network hardware.
We had unmanaged switches and everything worked fine.
We had one ring and several point-to-point connections.
Now we are using "intelligent" switches from Hirschmann (Greyhounds and Bobcats).
We have one ring and several point-to-point connections.
Now we have problems in our network.
We see high values for Messages lost on path0 and path1, and also high values for path switchover.
The picture shows the highest values in the system. Other controllers have lower values.
I tried to disable all "intelligent" protocols.
I think it is related to multicast settings.
Does anyone know a solution, or does anyone have a hint on how to solve this problem?
"Never touch a running system" is too late :)
Thank you
Best regards,
Florian

Answers
My settings:
- Flow Control: Deactivated at the ring ports, activated at the controller ports
- IGMP Snooping: Off
- Unknown Multicasts: Send to all ports
- GMRP: Off
- GMRP unknown multicasts: Discard
- MRP: Activated on the ring. Ring reconfiguration: 500ms
- Rapid Spanning: Off
You have SERIOUS issues with the network and should consider bringing the production system to a safe state (possibly stopped)!!!
Under normal circumstances (with no restarts, no disruptive network activities, etc.) the previous hour counters should stay below 10.
Something seems to be preventing RNRP's multicast traffic (sent periodically by all nodes to 239.239.239.x on UDP port 2423) from traveling through the network.
Tapping any port of the network for UDP port 2423 traffic should show one such telegram per second for each node in the RNRP area (filter on individual source nodes to track the expected one-second "rhythm" more easily). Note that router nodes send additional telegrams (one extra per remote area and path) at a 1.5-second interval.
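
As a rough illustration of the "rhythm" check described above, here is a minimal Python sketch. It assumes you have already exported the captured UDP port 2423 frames (for example from a Wireshark capture filtered with `udp.port == 2423`) as a list of `(timestamp, source)` pairs. The function name `find_rhythm_gaps`, the 0.5 s tolerance, and the node names are hypothetical and only serve the example.

```python
from collections import defaultdict


def find_rhythm_gaps(packets, expected_interval=1.0, tolerance=0.5):
    """Report gaps in the per-node telegram rhythm.

    packets: iterable of (timestamp_in_seconds, source_id) pairs, one per
             captured UDP port 2423 frame (assumed input format).
    Returns a dict: source_id -> list of (previous_ts, current_ts, missed),
    where 'missed' is an estimate of how many telegrams are absent.
    """
    by_source = defaultdict(list)
    for ts, src in packets:
        by_source[src].append(ts)

    gaps = {}
    for src, stamps in by_source.items():
        stamps.sort()
        src_gaps = []
        for prev, cur in zip(stamps, stamps[1:]):
            delta = cur - prev
            # Extra telegrams from router nodes only shorten the deltas,
            # so only deltas that are too long are reported.
            if delta > expected_interval + tolerance:
                missed = round(delta / expected_interval) - 1
                src_gaps.append((prev, cur, missed))
        if src_gaps:
            gaps[src] = src_gaps
    return gaps


# Hypothetical capture: node-a loses two telegrams, node-b keeps its rhythm.
sample = [
    (0.0, "node-a"), (1.0, "node-a"), (2.0, "node-a"), (5.0, "node-a"),
    (0.1, "node-b"), (1.1, "node-b"), (2.1, "node-b"),
]
print(find_rhythm_gaps(sample))
```

Running it on captures taken at different switches may help narrow down where the telegrams disappear, since a node that looks fine on its own access switch but shows gaps further away points at the path in between.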

The excessive values indicate a global problem (not just one or two controllers with a failing link).
- Ring problem? What happens to the statistics if you break up the ring?
- Why is Flow Control enabled when the Network Configuration User's Guide says to have it off?
- Do the port counters in the switches indicate excessive drops on send/receive? (Read more about that in chapter 4.4.2 of the System Health Check document I have attached.)
- Does AC 800M > Remote System > Controller Diagnosis > Network Information > Get (which gives the traffic counters from the controller's perspective) raise any concern?
- Has autonegotiation resulted in asymmetry on some endpoint or uplink? AC 800M needs 10 HALF (except for PM891, which can do 100FULL).
- Is switch port Storm Control or an Ingress/Egress limitation capping the traffic somewhere?
If you can't resolve this on your own, please file an official support case with your regional ABB support center (the response might be to ask the vendor for an explanation of why multicast traffic is being lost to such an extent).
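
The autonegotiation question in the list above comes down to comparing what each end of a link has settled on. A minimal Python sketch of that comparison follows, assuming the speed/duplex values have been collected by hand from the switch port status pages and the controller diagnostics. The data layout, the function name `find_asymmetric_links`, and the link names are hypothetical.

```python
def find_asymmetric_links(links):
    """Return links whose two ends disagree on speed or duplex.

    links: list of dicts with the (assumed) keys
           'name', 'switch_side', 'device_side',
           where each side is a (speed_mbps, duplex) tuple.
    """
    findings = []
    for link in links:
        switch_side = link["switch_side"]
        device_side = link["device_side"]
        if switch_side != device_side:
            findings.append((link["name"], switch_side, device_side))
    return findings


# Hypothetical inventory: the first controller link is a duplex mismatch.
inventory = [
    {"name": "sw151-port3", "switch_side": (10, "full"), "device_side": (10, "half")},
    {"name": "sw152-port3", "switch_side": (10, "half"), "device_side": (10, "half")},
    {"name": "sw2-uplink", "switch_side": (100, "full"), "device_side": (100, "full")},
]
for name, sw, dev in find_asymmetric_links(inventory):
    print(f"{name}: switch reports {sw}, device reports {dev}")
```

This only flags disagreement between the two ends; whether a matching setting is also the right one for the controller type still has to be checked against the values mentioned in the list.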
I don't know how to insert an image in a comment.
This is my redundant network (primary has the same layout):

By point-to-point I mean a single connection between the main switch and the switch at the controller.
One or more controllers are connected to every switch (172.17.85.151-172.17.85.182).
My servers are connected to the switch 172.17.85.2.
Just a guess ... did Rapid Spanning Tree get activated on any of the network cards in the new server hardware?
Thank you for your hints.
I changed my network:
- I disabled one port on my ring manager and turned off the ring manager. Now I have a network without a ring.
- I disabled the rate limiter (temporarily).
- I disabled Flow Control on every port.
But I still have errors on my network (only Control Network).
I attached a log file from the controller (network diagnosis).
I will open a support case and try to escalate the issue to SE L3. I hope you (Stefan) will get the case on your desk. :)
Best regards,
Florian