Many of our servers sometimes have problems reaching our first OpsMgr server:
Event Description: OpsMgr was unable to set up a communications channel to opsmgr01.domain.local. Communication will resume when opsmgr01.domain.local is available and communication from this computer is allowed.
followed by:
OpsMgr has successfully failed over to opsmgr02.domain.local.
I created a monitor for these events and noticed that many servers (e.g. 37) have this issue at almost the same time (within 1/10 of a second). My guess is that opsmgr01 is too busy at that moment and denies any new connections.
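To put a number on how tightly the events cluster, something like the following could be run against the event timestamps. This is only a minimal sketch: it assumes the events have already been exported as (server, timestamp) pairs, and the function name `group_bursts`, the 100 ms window, and the sample data are all hypothetical, not anything OpsMgr provides.

```python
from datetime import datetime, timedelta


def group_bursts(events, window=timedelta(milliseconds=100)):
    """Group (server, timestamp) pairs into bursts.

    Events are sorted by time; a new burst starts whenever the gap
    to the previous event is larger than `window`.
    """
    bursts = []
    for server, ts in sorted(events, key=lambda e: e[1]):
        if bursts and ts - bursts[-1][-1][1] <= window:
            bursts[-1].append((server, ts))
        else:
            bursts.append([(server, ts)])
    return bursts


# Hypothetical sample data, standing in for exported failover events.
base = datetime(2014, 1, 1, 10, 0, 0)
sample = [
    ("srv01", base),
    ("srv02", base + timedelta(milliseconds=30)),
    ("srv03", base + timedelta(milliseconds=80)),
    ("srv04", base + timedelta(minutes=5)),
]

for burst in group_bursts(sample):
    servers = sorted({s for s, _ in burst})
    # Print the burst start time, number of distinct servers, and their names.
    print(burst[0][1].isoformat(), len(servers), servers)
```

A burst containing dozens of distinct servers would support the idea that something on the management server side (or a shared schedule on the agents) triggers the failovers, rather than each agent failing independently.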
How can I troubleshoot this issue and confirm that my guess is right? I already checked the Event Viewer on opsmgr01 for that time but did not notice anything unusual. I also checked the key Windows performance counters, but there are no issues with disk, network, CPU or memory. At the network level there are no denied connections either.
Another odd thing I noticed is that all the servers communicate with OpsMgr at the same time. In the hour before this issue I saw only a couple of connections to OpsMgr. Is this normal behavior?
We have two OpsMgr servers, both OpsMgr 2012 R2, RU2. Our servers (about 300) are running Windows 2008 R2 or Windows 2012 R0, and both versions are experiencing this problem. We do not monitor network devices with SCOM, but we do have NetApp monitoring in place.