I had a call the other day from a company that had Exchange issues. On investigation it turned out they had a very suspect Active Directory, and no one would admit to what they had actually done to get it into such a state!
One server (DC1) would not talk to the other DCs (Kerberos and replication issues), and the other DCs were missing the Microsoft Exchange Security Groups OU and the groups it contains, as well as other Exchange-related objects – though the schema and configuration were present!
DC1’s event logs were full of errors going back about six days (to when the issue started, though I only got the call a day before we had it fixed). But if I looked back more than six days, the event log showed only entries from almost a year ago. I suspect a snapshot of the server was restored – but as I said, the only thing anyone claimed to have done was attempt to restore a user from a backup!
So the first step was to see if we could isolate DC1 from Exchange and do a setup /PrepareAD to replace the missing items in the domain naming context.
This requires limiting Exchange to DC2 with the Set-ExchangeServer cmdlet in the Exchange Management Shell, but the shell would not start due to AD errors, so out with the registry editor.
To hard-code Exchange to selected DCs you need to visit HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\services\MSExchange ADAccess and create a new key called Instance0. Inside Instance0 create a String called ConfigDCHostName whose value is the FQDN of the DC to use.
Then create a Profiles key under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\services\MSExchange ADAccess, which is the same location as before. Under Profiles create a subkey called Default. For Exchange 2010, create a DWORD called MinUserDC with a value of 1, and under the Default key create two more keys called UserDC1 and UserGC1. MinUserDC is in a different location for Exchange 2007.
Inside the UserDC1 key add a string called HostName (the value being the FQDN of the domain controller to use) and a DWORD called IsGC with a value of 0.
Inside the UserGC1 key add a string called HostName (the value being the FQDN of the global catalog server to use) and a DWORD called IsGC with a value of 1.
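If you would rather not click through the registry editor by hand, the keys above can be written out as a .reg file and reviewed before importing. The following is only a rough sketch of that idea in Python: the function name build_reg and the example.local host names are made up for illustration, and it assumes MinUserDC sits directly under Profiles\Default, which is how I read the Exchange 2010 steps above.

```python
# Sketch: render the registry layout described above as .reg file text.
# build_reg and the host names below are hypothetical placeholders.

BASE = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\services\MSExchange ADAccess"


def build_reg(config_dc: str, user_dc: str, user_gc: str) -> str:
    """Return .reg text that pins Exchange to the given DC and GC."""
    default = BASE + r"\Profiles\Default"
    sections = [
        (BASE + r"\Instance0", [("ConfigDCHostName", config_dc)]),
        # Assumption: MinUserDC lives directly under Profiles\Default (Exchange 2010).
        (default, [("MinUserDC", 1)]),
        (default + r"\UserDC1", [("HostName", user_dc), ("IsGC", 0)]),
        (default + r"\UserGC1", [("HostName", user_gc), ("IsGC", 1)]),
    ]
    lines = ["Windows Registry Editor Version 5.00", ""]
    for key, values in sections:
        lines.append(f"[{key}]")
        for name, value in values:
            if isinstance(value, int):
                # DWORD values are written as eight hex digits.
                lines.append(f'"{name}"=dword:{value:08x}')
            else:
                lines.append(f'"{name}"="{value}"')
        lines.append("")  # blank line between sections
    return "\n".join(lines)


if __name__ == "__main__":
    # Point everything at DC2, as in the scenario described in this post.
    print(build_reg("dc2.example.local", "dc2.example.local", "dc2.example.local"))
```

Save the output to a file, check it against the steps above, and only then import it on the Exchange server. As always, back up the registry first.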
An example is shown in the picture for clarity:
Restart the Microsoft Exchange ADTopology Service to see if it can now connect to the correct server (the MinUserDC value stops Exchange attempting to connect to the PDC emulator as well as the listed domain controllers). In my client’s case, the PDC emulator was DC1, which was effectively unreachable.
If you can get Exchange online now, great! Time to fix the issues with DC1. But if you can’t (and in my example I could not) then it’s time for more troubleshooting – it’s sort of like the MCM Qual Lab, just with real customer data!
To cut a long story short, in my example I decided that DC1 was the more accurate DC and that an authoritative restore of it from the last available AD backup (one month old!) might fix the issues that had crept in since the abortive work done by the client earlier in the week. In this client’s case, I used ntdsutil on DC2 to remove DC1 and then used dcpromo to demote all the DCs so that they went back to being member servers and standalone machines. Then I used ntdsutil to remove DC2 etc. from the copy of AD on DC1, so that I was left with an almost up-to-date copy of AD on DC1. Then I rejoined DC2 etc. to the DC1 replica, so I was back where the client thought they were: a number of DCs, all replicating, and all the Exchange objects present. I needed to rejoin the servers to the domain, but once that was done I had a working Exchange environment. It was only six and a half days since the outage, and the client’s email cloud filtering company held email for seven days – so no loss of email! Just about!
All in a day’s work for a Microsoft Certified Master | Exchange Server 2010.