Rookie Mistake, even though I am an IT ‘veteran’.
As a Systems Administrator, I am always trying to time Windows Automatic Updates perfectly. It’s one thing to install updates on a single PC, laptop, or tablet: you start the updates manually, sit there and watch them install, and hope they go smoothly. These days, Windows updates usually do go smoothly. But for a mid-sized or large company or organization, they need to be automated. The same is true in a UNIX, Linux, or other environment.
Without going into detail, Windows Updates can be automated via a tool called WSUS (Windows Server Update Services). The updates are downloaded to a central server and from there, via group policies that point each machine at that server and set its install schedule, they are rolled out to the local domain servers and workstations. Usually a reboot is involved, which can be scary on critical servers. It can also be a sweaty experience!
Of course, in a perfect world a networked domain environment would be duplicated in a test environment. As I said, ‘a perfect world’. It is easier said than done, and some companies do not allocate resources for a test environment. These companies, oddly enough, rely entirely on MICROSOFT ENGINEERS to test Windows updates properly before pushing them out to the world! But I digress …
The Rookie Mistake was entirely on me. I had properly spread out the updates via my WSUS server: Workstations on one day, then Member Servers over the weekend, spaced apart from the DCs’ time frame. I would check alerts to verify that all servers came back online, that services were up, and so on. Any Sys Admin knows that ugly Netlogon and other errors can be generated while a DC reboots. Member servers start to panic when they cannot contact Active Directory and security-related services. Because the Windows Domain Controller holds the keys to the Kingdom, it is imperative that a DC be up and reachable while member servers restart.
So, the point, and one key item that some administrators may overlook: keep separate Group Policies for Domain Controllers and Member Servers, especially regarding update schedules.
I had done this initially, but one day, carelessly, I tried to move the DC updates to a different time for scheduling purposes. I wanted them done closer to the morning so I could fix things more quickly if there was indeed an issue. In my infinite (non) wisdom, I erroneously set the new day and time for the DCs’ automated updates to be exactly the same as the schedule for the member servers. This may seem fine on the surface, but certain services on member servers NEED to contact a domain controller in order to start successfully. Exchange 2010 and SQL 2008 come to mind! Services like SQL Server and Exchange’s Information Store will literally STOP if there is no contact with a Domain Controller, mainly when all DCs and Member Servers reboot simultaneously.
So, do not forget to schedule Domain Controller Windows Updates at least 1 hour apart from Member Server updates. Brief reboots of the DC(s) while Member Servers are running are OK, but simultaneous reboots can be problematic. And always have service alerts running, in case there is a problem. I use NETIKUS EventSentry for this (it is a solid product), and you can weed out the non-critical services from the critical ones.
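The one-hour rule is easy to sanity-check before a schedule change goes live. Below is a minimal Python sketch of such a check. It does not talk to WSUS or Group Policy at all; the group names, the (day, hour, minute) schedule format, and the helper functions are all made up for illustration, and you would type in the install times from your own policies by hand.

```python
# Hypothetical sanity check: are DC update windows at least an hour
# away from every non-DC update window? Nothing here reads WSUS or GPOs;
# the schedule below is entered by hand and the names are illustrative.

DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
WEEK_MINUTES = 7 * 24 * 60


def minutes_into_week(day, hour, minute=0):
    """Convert a weekly slot into minutes since Monday 00:00."""
    return DAYS.index(day) * 24 * 60 + hour * 60 + minute


def gap_minutes(slot_a, slot_b):
    """Shortest distance between two weekly slots, wrapping around the week."""
    diff = abs(minutes_into_week(*slot_a) - minutes_into_week(*slot_b))
    return min(diff, WEEK_MINUTES - diff)


def find_conflicts(schedules, dc_groups, min_gap=60):
    """Return (dc_group, other_group, gap) for every pair closer than min_gap."""
    conflicts = []
    for dc in dc_groups:
        for name, slot in schedules.items():
            if name in dc_groups:
                continue  # only compare DCs against non-DC groups
            gap = gap_minutes(schedules[dc], slot)
            if gap < min_gap:
                conflicts.append((dc, name, gap))
    return conflicts


if __name__ == "__main__":
    # Example data reproducing the mistake: DCs and Member Servers share a slot.
    schedules = {
        "Domain Controllers": ("Sun", 3, 0),
        "Member Servers": ("Sun", 3, 0),
        "Workstations": ("Thu", 12, 0),
    }
    for dc, other, gap in find_conflicts(schedules, {"Domain Controllers"}):
        print(f"WARNING: {dc} and {other} are only {gap} minutes apart")
```

The wrap-around in gap_minutes matters for slots on either side of the Sunday/Monday boundary, which a plain subtraction would report as almost a week apart. The 60-minute default only mirrors the rule of thumb above; pick whatever gap covers your slowest reboot.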
I readjusted the schedule to separate the updates again, and … lesson (re)learned!