No good deed goes unpunished - outage 8/6/2018
As part of upgrading our service, we purchased, configured, and tested a new multi-port, multi-gigabit router to handle residential and small business traffic. The brand new unit is marketed as an Edge Router, meant for internet service providers to deploy in their edge networks. Hey, that's us!
At 11 PM on Sunday 8/5/2018, the router stopped renewing folks' IP address leases. As the leases came due, folks' homes were "disconnected" from the internet.
Your internet gateway device "checks out" an address from our server to put you on the "global internet". This "lease" normally lasts for hours or days, and it takes less than a tenth of a second to renew. Normally, this is invisible to everyone.
The root cause was simple (in hindsight, after hours of scrubbing log files and head scratching). When we configured the new router, we had not set authoritative mode on the DHCP server (the lease master). As a result, when folks' leases came due, the new router assumed some other router was handling the old reservation and ignored the renewal request.
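
For the technically curious, here is a toy Python model of why that one setting matters. It is only a sketch, not our router's actual software or configuration: the class, the function names, and the example address are made up for illustration, and it assumes the new router had no record of leases it did not hand out itself. A non-authoritative server stays silent when asked to renew a lease it does not know, while an authoritative one answers "no" (a NAK), which prompts the gateway to start over and get a fresh lease.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class DhcpServer:
    """Toy lease master (hypothetical, for illustration only)."""
    authoritative: bool
    leases: Dict[str, str] = field(default_factory=dict)  # client -> IP

    def handle_renewal(self, client: str, ip: str) -> Optional[str]:
        # Known lease: renew it.
        if self.leases.get(client) == ip:
            return "ACK"
        # Unknown lease: an authoritative server rejects it outright...
        if self.authoritative:
            return "NAK"
        # ...a non-authoritative one assumes another server owns it
        # and says nothing at all.
        return None

    def handle_new_request(self, client: str, ip: str) -> str:
        # Simplified: grant whatever address the client asks for.
        self.leases[client] = ip
        return "ACK"


def renew(server: DhcpServer, client: str, ip: str) -> str:
    """Simulate a gateway renewing its lease and report its status."""
    reply = server.handle_renewal(client, ip)
    if reply == "NAK":
        # Told "no", the gateway starts over and requests a fresh lease.
        reply = server.handle_new_request(client, ip)
    return "online" if reply == "ACK" else "offline"
```

In this model, `renew(DhcpServer(authoritative=False), "home-1", "203.0.113.10")` leaves the home offline because the request is never answered, while the same call with `authoritative=True` brings it back online.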
We resolved the issue within five minutes of detecting the problem.
We have made two operational adjustments to the router, and we are working to implement a redundant solution to avoid this in the future.
Thanks to two of our customers who sent in detailed messages that corroborated the issue and let us resolve it quickly and confidently. You rock!
-Steve