Guys,
I am working on a SCOM 2012 design. The primary concern is to make SCOM a highly available solution. I have a few questions below.
Environment:
- Two data centers (call them DC1 and DC2), a few km apart. Good network connectivity. Network latency is less than 2ms.
- SQL 2012 AlwaysOn to host the OperationsManager database. One node in DC1, the secondary node in DC2.
- Separate SQL server to host the data warehouse database. No SQL failover.
- Separate SQL Server+SSRS to host SCOM Reports. This will also have Web Console (optional)
- 4 Management Servers (MS), 2 in each data center, members of two resource pools (RS). One RS for each data center.
- 4 Gateway servers, 2 in each data center, to monitor agents in untrusted domains.
For the OperationsManager database, I am using SQL 2012 AlwaysOn (two nodes, automatic failover).
- Should I make the OperationsManager database on the secondary node readable or not? If I make it readable, will it have any impact on SCOM performance?
- The two SQL nodes will be in different data centers (a few km apart). The network is very good, with high bandwidth and latency of less than 1 or 2 ms. I assume both nodes will perform okay and this shouldn't be an issue?
The data warehouse database will be hosted on a separate SQL Server. This database is expected to grow to 1TB within a year. I am proposing not to make it part of any SQL cluster (either AlwaysOn or a traditional SQL cluster) because of limited hardware availability. Questions:
- Does the data warehouse database provide real-time data? The documentation says it keeps both historical and real-time data, but I've seen a few posts where people say it takes about 2 hours to aggregate the data before it shows in reports. We want to write a few reports that pull real-time data every few minutes. Is it possible to get this from the DW, or should it come from the Operations DB?
- In case of disaster, if the data warehouse SQL server dies, I assume SCOM will still work because the OperationsManager database is hosted separately. In other words, in my opinion the data warehouse is not critical for basic SCOM operations like receiving alerts or monitoring agents (only for reporting). Please correct me if I am wrong.
- If the above is correct, and we rebuild the server and restore the data warehouse database from backups within 7 days, I assume the data held in the OperationsManager database will then be written to the data warehouse DB, so the end result is no data loss in the data warehouse database. Correct or not?
- SCOM Reports will be hosted on a 3rd server, separate from the data warehouse. If the data warehouse SQL server is down, will it only impact report availability, with all other operations working okay?
For management servers, I am suggesting two MS in each data center, i.e. the two MS in DC1 will be part of one resource pool and the two MS in DC2 will be part of a second resource pool.
- What is the best way to provide failover for MS - within a data center and across data centers? Should I create one resource pool, or 2 pools with 2 MS in each?
- Agents in DC1 will talk to resource pool 1 in their own data center. Agents in DC2 will talk to resource pool 2. My assumption is that 2 MS in each data center will provide automatic failover within the data center?
- I assume we can assign one resource pool as primary for agents and another as failover, i.e. if both MS in one data center go down, agents will talk to the MS in the other DC?
- As above, connectivity between the two data centers is very good and network latency is less than 2ms (maybe less than 1ms). Will there be any issue with communication between MS across data centers? I have read a lot of posts opposing this solution, but in most cases they only mention network latency - in my case it is below 2ms.
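To make the failover order I am assuming concrete, here is a rough Python sketch. It is only a model of my assumption (local MS first, then the other data center), not SCOM code and not how SCOM necessarily behaves; the server names and the `pick_management_server` function are made up for illustration.

```python
# Hypothetical model of the failover order I am assuming; not SCOM code.
# Server names are placeholders.
POOLS = {
    "DC1": ["MS1-DC1", "MS2-DC1"],
    "DC2": ["MS1-DC2", "MS2-DC2"],
}


def pick_management_server(agent_dc, servers_up):
    """Return the first reachable MS, preferring the agent's own data center."""
    other_dcs = [dc for dc in POOLS if dc != agent_dc]
    # Local servers first, then servers in the other data center(s).
    for dc in [agent_dc] + other_dcs:
        for server in POOLS[dc]:
            if server in servers_up:
                return server
    return None  # no management server reachable at all


# Both DC1 servers down: the DC1 agent should land on a DC2 server.
print(pick_management_server("DC1", {"MS1-DC2", "MS2-DC2"}))
```

If the real behaviour differs from this (for example, if agents cannot be pointed at a second pool as failover), that is exactly what I would like to know.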
The SCOM Report Server will be installed on a separate SQL server hosting SSRS. The Web Console will also be installed on this server.
- If this server goes down, I believe it will only impact report availability (and the Web Console). Correct?
- Are there any known issues with putting SSRS, Reports and the Web Console on the same server? I don't see any.
As with the MS, we are putting two Gateway servers (to manage agents in untrusted domains) in each data center, providing local failover for agents, plus failover across data centers in case of a disaster.
- As with the MS resource pools, can we create and assign two resource pools for the gateway servers, one in each data center? Will this provide a failover mechanism within a data center and across data centers?
Thanks in advance :)