I foolishly thought that I could quickly replace my 2012 domain controllers with 2019 domain controllers, and thus began a weeks-long saga. I have two DCs in my homelab, DC1 and DC2.
I built a new server, joined it to the domain, promoted it to a DC (the promotion ran AD prep for me, nice!), and transferred the FSMO roles (all were on DC1). All looked great! Then I demoted DC1, and all logins failed with ‘Domain Unavailable’.
Uh-oh.
Thankfully I had my Synology backing up my FSMO role holder DC, so I restored DC1 from that backup. I figured I might have missed something obvious, so I ran through the whole process again. Same result.
I ran through all sorts of crazy debugging, including ntdsutil commands to look for old metadata to clean up. I found some old artifacts that I thought might have been causing the issue and repeated the process. Same result.
Several weeks later I realized what had happened: a failing UPS had taken down my Synology multiple times until I replaced it a few days ago. Guess which VM I never restarted? The Enterprise CA. The CA caused all of this. Or at least most of it.
Even after I powered up the CA, I was unable to cleanly transfer all of the FSMO roles. This time everything but the Schema Master transferred cleanly, even though all of the roles had transferred without complaint while the CA was down. I had to seize the Schema Master role and manually delete DC1 from ADUC. Thankfully, current versions of AD do the metadata cleanup for you when you delete a DC from ADUC.
In a hilarious bit of irony, I specifically built the CA on a member server rather than on a domain controller to avoid upgrade problems.
In summary:
- When you don’t administer AD every day, you forget lots of things
- No AD upgrade is easy
- Make sure you have a domain controller backup before you start
- Turn on your CA
- Run repadmin /showrepl and dcdiag BEFORE you start messing with the domain
- Run repadmin /showrepl and dcdiag AFTER you add a domain controller and BEFORE you remove old domain controllers
- ntdsutil is like COBOL – old and verbose
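The before-and-after checks in the list above are easy to skip when you are in a hurry, so here is a minimal sketch of a wrapper that runs them in one go. It assumes you run it on a domain-joined Windows machine that has the AD tools installed. The function name `run_preflight` and the command list are my own invention for illustration, and `netdom query fsmo` is an extra I added to show where the roles currently live; it is not part of the original checklist. Exit codes from these tools are not a reliable pass/fail signal, so treat this as a convenience for collecting output that you still read by hand.

```python
import shutil
import subprocess

# Commands to run before touching the domain, and again after adding a DC.
# The first two come from the checklist; "netdom query fsmo" is an added
# assumption that lists the current FSMO role holders.
PREFLIGHT_COMMANDS = [
    ["repadmin", "/showrepl"],
    ["dcdiag"],
    ["netdom", "query", "fsmo"],
]


def run_preflight(commands=PREFLIGHT_COMMANDS):
    """Run each command and return {command string: exit code or None}.

    None means the tool was not found on PATH, so nothing was run.
    """
    results = {}
    for cmd in commands:
        name = " ".join(cmd)
        if shutil.which(cmd[0]) is None:
            # Tool is missing (for example, not a Windows box with AD tools).
            results[name] = None
            continue
        proc = subprocess.run(cmd, capture_output=True, text=True)
        # Print the raw output so it can be reviewed; do not trust the
        # exit code alone to tell you replication is healthy.
        print(f"=== {name} ===")
        print(proc.stdout)
        results[name] = proc.returncode
    return results


if __name__ == "__main__":
    for name, code in run_preflight().items():
        status = "not found" if code is None else f"exit code {code}"
        print(f"{name}: {status}")
```

Run it once before you start, save the output, then run it again after the new DC is promoted and compare the two before you demote anything.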