Hi All,
I have a Root Management Server running SCOM 2007 R2 CU4. All the roles (RMS, SQL, Web console) are on the same server.
About once a day our alerting suddenly stops, even though the SCOM services are running and the management server shows a healthy state. We restart the SDK, HealthService, and System Center Management Configuration services, but we still do not get alerts.
Only a full reboot of the RMS restores alerting, and then it works for about a day. The same issue recurs, so we end up rebooting once a day.
Any idea what the issue is? We cannot afford to reboot this server daily.
We had installed a few security patches on the RMS. We uninstalled them, but the issue persists.
I analysed the event logs and found the following events.
Lots of Event id 2115 events.
Event id 26380 - The System Center Operations Manager SDK Service failed due to an unhandled exception.
The service will attempt to restart. Exception:
System.OutOfMemoryException: Exception of type 'System.OutOfMemoryException' was thrown.
at Bid.TraceError(String fmtPrintfW, Object a1, Object a2)
at Microsoft.EnterpriseManagement.Mom.DataAccess.SqlRetryHandler.Execute[T](ExecuteArguments executeArguments, RetryPolicy retryPolicy, GenericExecute`1 genericExecute)
at Microsoft.EnterpriseManagement.Mom.DataAccess.SqlRetryHandler.ExecuteReaderSingleRow(SqlDataReader sqlDataReader, SqlConnection sqlConnection, IList`1 prologEpilogList, RetryPolicy retryPolicy)
at Microsoft.EnterpriseManagement.Mom.DataAccess.QueryResultsReader.Read()
at Microsoft.EnterpriseManagement.Mom.ServiceDataLayer.ClientReaderManager.GetObjects(Guid id, Int32 count)
at Microsoft.EnterpriseManagement.Mom.ServiceDataLayer.SdkDataAccess.GetObjectsFromReader(Guid readerId, Int32 count)
at Microsoft.EnterpriseManagement.Mom.ServiceDataLayer.SdkDataAccessTieringWrapper.GetObjectsFromReader(Guid readerId, Int32 count)
at Microsoft.EnterpriseManagement.Mom.ServiceDataLayer.SdkDataAccessExceptionTracingWrapper.GetObjectsFromReader(Guid readerId, Int32 count)
at SyncInvokeGetObjectsFromReader(Object , Object[] , Object[] )
at System.ServiceModel.Dispatcher.SyncMethodInvoker.Invoke(Object instance, Object[] inputs, Object[]& outputs)
at System.ServiceModel.Dispatcher.DispatchOperationRuntime.InvokeBegin(MessageRpc& rpc)
at System.ServiceModel.Dispatcher.ImmutableDispatchRuntime.ProcessMessage5(MessageRpc& rpc)
at System.ServiceModel.Dispatcher.ImmutableDispatchRuntime.ProcessMessage4(MessageRpc& rpc)
at System.ServiceModel.Dispatcher.MessageRpc.Process(Boolean isOperationContextSet)
at System.ServiceModel.Dispatcher.ChannelHandler.DispatchAndReleasePump(RequestContext request, Boolean cleanThread, OperationContext currentOperationContext)
at System.ServiceModel.Dispatcher.ChannelHandler.HandleRequest(RequestContext request, OperationContext currentOperationContext)
at System.ServiceModel.Dispatcher.ChannelHandler.AsyncMessagePump(IAsyncResult result)
at System.ServiceModel.Diagnostics.Utility.AsyncThunk.UnhandledExceptionFrame(IAsyncResult result)
at System.ServiceModel.AsyncResult.Complete(Boolean completedSynchronously)
at System.ServiceModel.Channels.FramingDuplexSessionChannel.TryReceiveAsyncResult.OnReceive(IAsyncResult result)
at System.ServiceModel.Diagnostics.Utility.AsyncThunk.UnhandledExceptionFrame(IAsyncResult result)
at System.ServiceModel.AsyncResult.Complete(Boolean completedSynchronously)
at System.ServiceModel.Channels.SynchronizedMessageSource.SynchronizedAsyncResult`1.CompleteWithUnlock(Boolean synchronous, Exception exception)
at System.ServiceModel.Channels.SynchronizedMessageSource.ReceiveAsyncResult.OnReceiveComplete(Object state)
at System.ServiceModel.Channels.SessionConnectionReader.OnAsyncReadComplete(Object state)
at System.ServiceModel.Channels.StreamConnection.OnRead(IAsyncResult result)
at System.ServiceModel.Diagnostics.Utility.AsyncThunk.UnhandledExceptionFrame(IAsyncResult result)
at System.Net.LazyAsyncResult.Complete(IntPtr userToken)
at System.Net.LazyAsyncResult.ProtectedInvokeCallback(Object result, IntPtr userToken)
at System.Net.Security.NegotiateStream.ProcessFrameBody(Int32 readBytes, Byte[] buffer, Int32 offset, Int32 count, AsyncProtocolRequest asyncRequest)
at System.Net.Security.NegotiateStream.ReadCallback(AsyncProtocolRequest asyncRequest)
at System.Net.FixedSizeReader.CheckCompletionBeforeNextRead(Int32 bytes)
at System.Net.FixedSizeReader.ReadCallback(IAsyncResult transportResult)
at System.ServiceModel.AsyncResult.Complete(Boolean completedSynchronously)
at System.ServiceModel.Channels.ConnectionStream.ReadAsyncResult.OnAsyncReadComplete(Object state)
at System.ServiceModel.Channels.SocketConnection.FinishRead()
at System.ServiceModel.Channels.SocketConnection.AsyncReadCallback(Boolean haveResult, Int32 error, Int32 bytesRead)
at System.ServiceModel.Diagnostics.Utility.IOCompletionThunk.UnhandledExceptionFrame(UInt32 error, UInt32 bytesRead, NativeOverlapped* nativeOverlapped)
at System.Threading._IOCompletionCallback.PerformIOCompletionCallback(UInt32 errorCode, UInt32 numBytes, NativeOverlapped* pOVERLAP)
Event id 1103 - Summary: 28 rule(s)/monitor(s) failed and got unloaded, 0 of them reached the failure limit that prevents automatic reload. Management group "My Management Group". This is summary only event, please see other events with descriptions of unloaded rule(s)/monitor(s).
Event id: 26319
The description for Event ID 26319 from source OpsMgr SDK Service cannot be found. Either the component that raises this event is not installed on your local computer or the installation is corrupted. You can install or repair the component on the local computer.
If the event originated on another computer, the display information had to be saved with the event.
The following information was included with the event:
Connect
uuid:6856a52d-56c3-4aa2-92e3-2ca16e644c12;id=1
The creator of this fault did not specify a Reason.
System.ServiceModel.FaultException`1[Microsoft.EnterpriseManagement.Common.SdkServiceNotInitializedException]: The creator of this fault did not specify a Reason. (Fault Detail is equal to Microsoft.EnterpriseManagement.Common.SdkServiceNotInitializedException: Sdk Service has not yet initialized. Please retry).
The handle is invalid
The description for Event ID 26319 from source OpsMgr SDK Service cannot be found. Either the component that raises this event is not installed on your local computer or the installation is corrupted. You can install or repair the component on the local computer.
If the event originated on another computer, the display information had to be saved with the event.
The following information was included with the event:
Connect
uuid:6856a52d-56c3-4aa2-92e3-2ca16e644c12;id=2
The creator of this fault did not specify a Reason.
System.ServiceModel.FaultException`1[Microsoft.EnterpriseManagement.Common.SdkServiceNotInitializedException]: The creator of this fault did not specify a Reason. (Fault Detail is equal to Microsoft.EnterpriseManagement.Common.SdkServiceNotInitializedException: Sdk Service has not yet initialized. Please retry).
The handle is invalid
Event id: 1103
The description for Event ID 1103 from source HealthService cannot be found. Either the component that raises this event is not installed on your local computer or the installation is corrupted. You can install or repair the component on the local computer.
If the event originated on another computer, the display information had to be saved with the event.
The following information was included with the event:
Name of my management group
1
0
The handle is invalid
Event id 29104 - OpsMgr Config Service failed to send the dirty state notifications to the dirty OpsMgr Health Services. This may be happening because the Root OpsMgr Health Service is not running.
Event id 4000 - A monitoring host is unresponsive or has crashed. The status code for the host failure was 2164195371.
Event id 4503 - A module reported an error 0x8007000E from a callback which was running as part of rule "_44E9A997_6A02_4298_8430_8E01952AB6F3_.RaiseAlert" running for instance "Root managementserver FQDN" with id:"{6C3444C3-3990-BB5A-F25E-289D8F427570}" in management group "My Management Group".
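For reference, the status code in event 4000 is printed in decimal and the error in event 4503 in hex. The Python sketch below puts both in the same form, assuming both are 32-bit HRESULT-style values (that is an assumption for the 4000 code; the helper name decode_hresult is made up for illustration). For 0x8007000E the facility is 7 (FACILITY_WIN32) and the code is 14, which is the Win32 ERROR_OUTOFMEMORY and matches the System.OutOfMemoryException in event 26380. What facility 255 in the other code maps to is not something this sketch can tell.

```python
def decode_hresult(value: int) -> dict:
    """Split a 32-bit HRESULT-style status code into its standard fields.

    Assumes the standard HRESULT layout: bit 31 = severity,
    bits 16-26 = facility, bits 0-15 = code.
    """
    value &= 0xFFFFFFFF  # normalise to an unsigned 32-bit value
    return {
        "hex": f"0x{value:08X}",
        "failure": bool(value >> 31),       # severity bit: 1 means failure
        "facility": (value >> 16) & 0x7FF,  # 11-bit facility field
        "code": value & 0xFFFF,             # low 16 bits
    }


if __name__ == "__main__":
    # Values copied from events 4000 and 4503 above.
    for raw in (2164195371, 0x8007000E):
        print(raw, decode_hresult(raw))
```

Run on the two values above, the decimal 2164195371 comes out as 0x80FF002B, which is easier to search for than the decimal form.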
==================================
Can anyone help us?