Some events missing
The system in question is small, with no redundancy: one AS, one CS with Advant Connectivity, workgroup only, 800xA 5.1 Rev E with the German language pack.
The MB300 is large: a lot of controllers, some OS and an older 800xA 5.0. For the new system I use only 3 controllers; all the others are excluded by NODE_DESCR filters.
I have a problem with events: some of them get lost. If I set and reset an alarm, let's say five times, it is random whether the 0->1 event, the 1->0 event, or both are missing from the list.
On the older OS and on the older 800xA system all is fine; all events are in the list!
If I open the RTA board on my new system there are no system messages; all looks well.
But at the 800xA AE OPC Server I get a lot of "Events lost" messages, please see the attached file.
I have restarted the RTA Board and the Event Collector. The Event Collector was "synchronizing" for a long time. Is that expected?
Any ideas or suggestions?
Thank you in advance!
Voted best answer
I could not see any issues with the configuration from the images you posted.
I believe we need to step deeper into the system by logging what is coming into and out from the MB300 OPC AE Server.
But first, verify that LF25 of the RTA DB is populated properly with all custom event texts you may be using in your controllers (LIM_TR larger than 25). A missing record would prevent that alarm or event from reaching the MB300 OPC AE server.
NOTE: The problem of a missing LF25 record would not be random; this contradicts your description (you mention that alarms and events are lost at random). I also recollect that missing LF25 records would cause System Messages while connected to the RTA with RTA Config/ONB (but messages to screen can be disabled in ONB settings...).
Logging is a bit more delicate; try enabling the "Events" log on the AdvMbAeOpcServer process/component in AfwAppLogViewer to see the output. The input side has a name like "RTA event", "Dual port memory" interface, etc. This log would output the raw events as they are received from the RTA/PU410 unit. I would appreciate it if you could send me a personal email with the results, or if you need further instructions to succeed with the logging. I may not be able to provide any detailed feedback this week since I'm currently busy teaching our E143 Expert Workshop here in Västerås.
I believe the "Events lost" event is generated by the Alarm Manager when it senses a disruption in the sequence of the events received from an Event Collector (I believe there is an ever-increasing sequence number in this communication), e.g. if comms have been out for a while. I also recollect such events being issued on false premises under certain circumstances.
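To illustrate the kind of check described above, here is a minimal sketch of gap detection on an ever-increasing sequence number. It is only an illustration of the idea: the function name `find_sequence_gaps` and the sample numbers are made up, and nothing here reflects how the Alarm Manager is actually implemented.

```python
from typing import Iterable, List, Tuple


def find_sequence_gaps(seq_numbers: Iterable[int]) -> List[Tuple[int, int]]:
    """Return (first_missing, last_missing) ranges between consecutive numbers.

    Hypothetical helper: assumes each event carries an integer sequence
    number that increases by exactly 1 per event.
    """
    gaps = []
    previous = None
    for current in seq_numbers:
        # A jump of more than 1 means one or more events never arrived.
        if previous is not None and current > previous + 1:
            gaps.append((previous + 1, current - 1))
        previous = current
    return gaps


# Made-up sequence numbers as a receiver might see them.
received = [1, 2, 3, 6, 7, 10]
print(find_sequence_gaps(received))  # [(4, 5), (8, 9)]
```

A receiver doing something like this would report a loss whenever a range comes back, which would also explain a false report if the sender's counter restarts or jumps for an unrelated reason.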
Does any service toggle state (service, standby, synchronizing, etc) during normal running? Please specify which and when.
To fit the description something ought to be restarting more or less constantly.
I once witnessed how a 3rd party OPC AE client connecting via remote DCOM could block the A&E distribution in the alarm manager by not accepting a data callback made over DCOM from the ABB.OPCEventServer.1 to the 3rd party client. This resulted in system-wide effects. With TCPView.exe (from www.sysinternals.com) I could see the server callback being rejected by the client (stuck in SYN state or rejected immediately). Lots of red/yellow/green events.
Please describe the setup of the system (with focus on A&E clients, alarm manager, event collector, event storage).
I have checked the services in 800xA; they are all stable. I have restarted the whole system without effect. TCPView looks good, with no connections that toggle. The setup of the system is mainly the standard one, please have a look at the screenshots. The only additional AE Client is from PGIM and is connected to the ABB OPC AE Server. I have stopped the services of the additional AE Client but the problem was still the same.
I have added a screenshot of the existing AE Servers that I see.
Could there be a problem with Rev E?
I've done the logging in the way Stefan suggested. I found that the missing events had not reached the AdvMbAeOpcServer (the events were not in the log).
On closer inspection I saw that AdvMbAeOpcServer was running twice! The second instance had been started by the Softing OPC Client. As soon as I stopped this instance, everything worked fine!
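For anyone who wants to check for this without clicking through Task Manager, here is a rough sketch that counts process instances from the CSV output of the Windows `tasklist` command. The sample text, PIDs and the image name `AdvMbAeOpcServer.exe` are assumptions for illustration (I am assuming the image name matches the component name); the helper names are made up.

```python
import csv
import io
from collections import Counter


def count_instances(tasklist_csv: str) -> Counter:
    """Count running instances per image name.

    Expects text in the format produced by `tasklist /FO CSV /NH`,
    where the first column is the image name.
    """
    counts = Counter()
    for row in csv.reader(io.StringIO(tasklist_csv)):
        if row:  # skip blank lines
            counts[row[0].lower()] += 1
    return counts


def is_duplicated(tasklist_csv: str, image_name: str) -> bool:
    """True if more than one instance of image_name is running."""
    return count_instances(tasklist_csv)[image_name.lower()] > 1


# Made-up sample; PIDs, sessions and memory figures are invented.
sample = '''"AdvMbAeOpcServer.exe","4120","Services","0","35,120 K"
"AdvMbAeOpcServer.exe","6988","Console","1","28,004 K"
"explorer.exe","3300","Console","1","60,512 K"
'''
print(is_duplicated(sample, "AdvMbAeOpcServer.exe"))  # True
```

On a live node the input text could come from `subprocess.run(["tasklist", "/FO", "CSV", "/NH"], capture_output=True, text=True).stdout` instead of the sample string.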
Last week we had problems with the PGIM AE Client not getting messages. We used the Softing tool to check some things. It seems that an instance of it kept running. After a reboot of the whole system I used the Softing tool again to check whether we still got these "missing event" messages.
In the end it seems to be a problem if a second AE Client points to the ABB Advant Master OPC AE Server. In this case it is random which client (800xA or the second one) gets the message first; the other one gets nothing!
My thanks to Stefan, his last hint pointed in the right direction!
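The behaviour I saw can be pictured with a small simulation. This is only a model of the symptom, assuming each event is handed to exactly one of the connected clients; it says nothing about how the Advant Master OPC AE Server works internally, and all names in it are made up.

```python
import random
from collections import deque


def distribute(events, clients, rng):
    """Hand each event to exactly one client, chosen at random.

    Models the observed symptom: whichever client gets the event
    first consumes it, and the other client never sees it.
    """
    queue = deque(events)
    received = {client: [] for client in clients}
    while queue:
        event = queue.popleft()
        winner = rng.choice(clients)
        received[winner].append(event)
    return received


# Five set/reset cycles of one alarm, as in the original test.
events = [f"{i}:{edge}" for i in range(1, 6) for edge in ("0->1", "1->0")]
result = distribute(events, ["800xA", "second client"], random.Random())

for client, got in result.items():
    print(client, got)
```

Each run gives a different split, but every event ends up in exactly one list, which matches what I saw: the 0->1, the 1->0, or both could be missing from the 800xA list.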