Channel: Operations Manager - General forum

False Heartbeat failures for Clustered SQL servers

Hi Experts,

We have an environment in which all the SCOM agents have been multihomed to two management groups. The first is a shared management group, to which the agents in our environment send traps via the Gateway Servers within our domain. The second is an in-house management group with just an RMS, plus the database and data warehouse on SQL instances.

Since multihoming the agents, we are seeing heartbeat failure alerts generated frequently, only for SQL servers, and the alerts also close again soon afterwards. The initial thought was to increase the default heartbeat interval and the maximum number of missed heartbeats specifically for these servers. Later, however, we saw that it is not just heartbeats being missed: the agent is in fact unable to send data to the management servers, and we see a lot of missing data in the real-time graphs from the OpsMgr DB. The next check was to see how the agent is performing on those SQL servers.
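To put numbers on the heartbeat tuning idea, here is a rough sketch of how the two settings combine into the window of silence tolerated before a heartbeat failure is raised. The function name is made up for illustration, and the figures (60 seconds and 3 missed heartbeats as the current values, 120 and 5 as a possible change) are assumptions that should be checked against the actual management group settings.

```python
def heartbeat_failure_window(interval_seconds: int, missed_allowed: int) -> int:
    """Return how many seconds an agent can stay silent before a
    heartbeat failure would be raised (interval x missed heartbeats)."""
    if interval_seconds <= 0 or missed_allowed <= 0:
        raise ValueError("interval and missed-heartbeat count must be positive")
    return interval_seconds * missed_allowed


# Assumed current settings: 60 s interval, 3 missed heartbeats allowed.
current_window = heartbeat_failure_window(60, 3)

# Hypothetical relaxed settings for the clustered SQL servers only.
proposed_window = heartbeat_failure_window(120, 5)

print(f"current window:  {current_window} s")
print(f"proposed window: {proposed_window} s")
```

A wider window would only hide short gaps. It would not explain the missing performance data, which is why the agent itself was checked next.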

For the clustered SQL servers that are generating these frequent false heartbeat failures, the agent performance view is blank, unlike for the other agents running SQL roles.

From the SQL application end everything looks fine, so I am wondering what the cause could be on the SCOM end.

I also found that Send Queue % Used is showing blank as well, with alert-logged tags. Would the resolution therefore be to increase the size of the send queue by changing the registry key?

But in that case I would expect to see event IDs like 2034 or 2023, indicating that the send queue is full or that old data is being dropped from the queue before it is sent on to the MS.

Need your expert advice.

Regards,

Prajul Nambiar

