Clients disconnected after one of the CSs failed
My system has the standard configuration of 2 CS + 2 AS. In addition, there is one CS for the drive system and one for QCS control. Recently the CS for drives crashed and its services were migrated to the normal CS. When this server died, one ES and one OS were disconnected from the system, and I had to use the "connect remote node" option in PPA to reconnect them. I'm struggling to understand how the failure of a CS can disconnect nodes. Before the server failure, no issues were reported on these nodes, and after reconnecting they are working fine. Could someone please enlighten me on possible reasons for this behaviour?
Voted best answer
It could be a glitch in the Afw-service framework. The framework is responsible for ensuring that all nodes have a valid map of where the currently running Afw-services are located, that is, at which IP address and TCP port number.
It sounds like the ES and the affected OS were "tied" to the failing CS, and when it went belly up, they did not manage to find a new service provider (a switchover that should take place automatically).
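To make the idea concrete, here is a small conceptual sketch in Python. It is only an illustration of a "service location map with automatic rebinding", not how the Afw-framework is actually implemented. All class names, the service name "opc_da", the node names and the IP addresses/ports are made up for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Endpoint:
    """Where one running service instance can be reached (hypothetical)."""
    node: str
    ip: str
    port: int


@dataclass
class ServiceMap:
    """Map of service name -> endpoints currently providing it."""
    providers: Dict[str, List[Endpoint]] = field(default_factory=dict)

    def register(self, service: str, endpoint: Endpoint) -> None:
        self.providers.setdefault(service, []).append(endpoint)

    def drop_node(self, node: str) -> None:
        # Remove every endpoint hosted on the failed node.
        for service in list(self.providers):
            self.providers[service] = [
                e for e in self.providers[service] if e.node != node
            ]

    def lookup(self, service: str) -> Optional[Endpoint]:
        endpoints = self.providers.get(service, [])
        return endpoints[0] if endpoints else None


@dataclass
class Client:
    """A client node (e.g. an ES or OS) bound to one provider per service."""
    name: str
    service_map: ServiceMap
    bound: Dict[str, Endpoint] = field(default_factory=dict)

    def bind(self, service: str) -> bool:
        endpoint = self.service_map.lookup(service)
        if endpoint is None:
            return False
        self.bound[service] = endpoint
        return True

    def on_node_failure(self, node: str) -> List[str]:
        """Rebind services that were served by the failed node.

        Returns the services for which no new provider was found.
        """
        lost = []
        for service, endpoint in list(self.bound.items()):
            if endpoint.node == node:
                del self.bound[service]
                if not self.bind(service):
                    lost.append(service)
        return lost


def demo():
    smap = ServiceMap()
    smap.register("opc_da", Endpoint("CS-DRIVES", "10.0.0.13", 5001))
    smap.register("opc_da", Endpoint("CS-1", "10.0.0.11", 5001))

    client = Client("ES-1", smap)
    client.bind("opc_da")            # first bound to CS-DRIVES

    smap.drop_node("CS-DRIVES")      # the map learns that the node died
    lost = client.on_node_failure("CS-DRIVES")
    return client.bound["opc_da"].node, lost
```

In this toy model the client ends up on the remaining CS as long as its map is updated before it rebinds. If the map is stale, or no other provider is registered, the service lands in the `lost` list, which is roughly the situation where a node stays disconnected until someone reconnects it by hand.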
The Afw-framework has been improved in later versions. If you want to learn more, highlights of the Afw-framework are covered in E143, the Troubleshooting 800xA Expert Workshop.