Sunday, August 14, 2016

LIGHTNING DAMAGE

We lost a power distribution unit (PDU) supplying almost all infrastructure at the Hanson tower location early this morning. It failed at about 6:50am, almost certainly because of cumulative damage from yesterday's and last night's rounds of severe thunderstorms. The fact that it took down multiple network devices simultaneously--including our primary Ripton backhaul link--foiled normal alerting and slowed diagnosis. We were able to retrieve and configure a backup PDU and restore service to most Ripton customers at approximately 11:45am. A single AP failed to come back up, and it was close to 1:00pm before the final handful of services was restored. We apologize for the outage and thank our customers for their understanding.

Sunday, July 24, 2016

BACK UP!

With the restoration of grid power at 11:49am, all nodes came back up immediately, and all customers appear to be logging in successfully. When this month's round of upgrades is complete, we will calibrate our runtime on batteries at the Old Town Road location, make sure that we have a minimum of 24 hours, and put a plan in place for providing temporary generation during more prolonged grid outages. Thank you for your patience and understanding.

STORM UPDATE/RIPTON DOWN

As of 9:00 this morning the entire Ripton network is down. Battery backup power at the Old Town Road relay was exhausted overnight. Ironically, we are about to double the battery backup capacity there, because that location is now carrying the significantly greater electrical load of a radio relay for the Fire Department's emergency services communications--in fact, the new battery cabinet was shipped on Friday, two days ago.
Service will be restored as soon as Green Mountain Power restores grid power. They are now predicting restoral at 11:55am.

Saturday, July 23, 2016

STORM REPORT

The network seems to have survived this afternoon's violent storm without serious incident. At this time (6:09pm) it appears that the minor network nodes at Old Town Road, Brooks (near Breadloaf), and Selden Mill Road remain without power after exhausting their battery backups. We are inspecting those sites to verify and hoping for restoration of grid power as soon as possible.

Monday, June 20, 2016

Last Week



Early last week, our RADIUS server* started experiencing memory sector failures, which accelerated over the next several days. This occurred at the worst possible time, as I was out of town and also out of cell coverage, a very rare circumstance, and the problem was technically beyond the support resources left here.
Although we had the problem controlled to some extent by Wednesday evening, with constant micromanagement allowing almost all users to connect, it was not until mid-day Friday that we had a more durable--but still temporary--fix in place.
We are now in the process of building two new RADIUS servers, and will have redundancy and failover to the backup going forward. We will continue to host the first in our rack, and are trying to decide on a remote location for the backup. This will increase the reliability of the system by several orders of magnitude.
In addition, we will be addressing the problem of high-level local backup support. As always, we thank you for your patience and understanding.



* Our RADIUS service stores subscriber credentials and other data. Every time a user attempts to log in, an intermediary authentication device must check with the RADIUS server to validate the user's credentials before allowing that user to connect to the network. Most of our voice service customers connect differently and would not have been affected, but we had a couple of voice services connecting using RADIUS for monitoring, and those were impacted.
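
For the technically curious, the login check described above, together with the planned failover to a backup server, can be sketched roughly as follows. This is a toy simulation, not our production configuration: the class and function names (RadiusServer, authenticate), the subscriber "alice", and the plain-text password comparison are all made up for illustration, and a real deployment would use the actual RADIUS protocol and properly protected credentials.

```python
from dataclasses import dataclass, field


class ServerUnavailable(Exception):
    """Raised when a (simulated) RADIUS server cannot be reached."""


@dataclass
class RadiusServer:
    """Hypothetical stand-in for a RADIUS server: a name, a credential
    store, and an up/down flag."""
    name: str
    credentials: dict = field(default_factory=dict)
    online: bool = True

    def check(self, username: str, password: str) -> bool:
        if not self.online:
            raise ServerUnavailable(self.name)
        # Plain-text comparison is for illustration only.
        return self.credentials.get(username) == password


def authenticate(servers, username, password):
    """Ask each server in order; the first one that answers decides.

    Returns (accepted, server_name). If no server answers at all, the
    login is refused and server_name is None.
    """
    for server in servers:
        try:
            return server.check(username, password), server.name
        except ServerUnavailable:
            continue  # this server is down, so try the next one
    return False, None


# Made-up subscriber data, copied to both servers.
subscribers = {"alice": "s3cret"}
primary = RadiusServer("primary", dict(subscribers))
backup = RadiusServer("backup", dict(subscribers))

print(authenticate([primary, backup], "alice", "s3cret"))  # (True, 'primary')

primary.online = False  # simulate the primary failing
print(authenticate([primary, backup], "alice", "s3cret"))  # (True, 'backup')
```

With only one server in the list, a failure of that server refuses every login, which is essentially what happened last week; with two, logins continue as long as either one answers.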

Thursday, January 21, 2016

FIBER FU

Last evening, around 7:00pm, when many Riptonites were sitting down to an important School Board presentation, a fiber demarc device belonging to our fiber provider for the Middlebury-Ripton link failed. Normally, we expect immediate notification from that provider in such an eventuality, since we do not have "visibility" into their network segments. Absent that notification, we wasted many hours troubleshooting, tearing down, and reprovisioning multiple devices on our network before finally concluding, by process of elimination, that the problem was not on our end. Support from our fiber provider was able to confirm the failure of their equipment and locate it, and we were able to help them restore connectivity shortly after 3:00am.

Later today, we will confer with that provider's engineering staff, confirm that normal emergency notification procedures are now in place, and ask them to support our process for automatic restoration.

Additionally, we continue work to add a replacement wireless link into Ripton for total redundancy.