Monday, June 20, 2016

Last Week



Early last week, our RADIUS server* began experiencing memory sector failures, which accelerated over the next several days. The timing could not have been worse: I was out of town and, very unusually, out of cell coverage, and the problem was beyond the technical depth of the support staff on site.
Although we had the problem partly under control by Wednesday evening, with constant hands-on management allowing almost all users to connect, it was not until midday Friday that we had a more durable, though still temporary, fix in place.
We are now in the process of building two new RADIUS servers, so that going forward we will have redundancy, with automatic failover to a backup. We will continue to host the primary server in our rack, and are deciding on a remote location for the backup. This should increase the reliability of the system by several orders of magnitude.
In addition, we will be addressing the need for high-level local backup support. As always, we thank you for your patience and understanding.



* Our RADIUS service stores subscriber credentials and other data. Every time a user attempts to log in, an intermediary authentication device must check with the RADIUS server to validate the user's credentials before allowing the user to connect to the network. Most of our voice service customers connect differently and would not have been affected, but a couple of our voice services use RADIUS for monitoring, and those were impacted.
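
For the technically curious, here is a rough sketch of what failover to a backup means for a login attempt. It is only an illustration, not our actual configuration: it does not speak the real RADIUS protocol, and the names `RadiusServer` and `authenticate`, as well as the sample subscriber, are made up for this example. The idea is simply that the authentication device asks the primary server first and, if it gets no answer, asks the backup.

```python
from dataclasses import dataclass, field


@dataclass
class RadiusServer:
    """Hypothetical stand-in for a RADIUS server (not the real protocol)."""
    name: str
    healthy: bool = True
    credentials: dict = field(default_factory=dict)

    def check(self, user: str, password: str) -> bool:
        # A failed server does not answer at all; model that as a timeout.
        if not self.healthy:
            raise TimeoutError(f"{self.name} did not respond")
        return self.credentials.get(user) == password


def authenticate(servers, user, password):
    """Ask each server in order; return (accepted, name of the server that answered)."""
    for server in servers:
        try:
            return server.check(user, password), server.name
        except TimeoutError:
            continue  # no answer: fail over to the next server
    return False, None  # nobody answered, so the user cannot connect


subscribers = {"alice": "s3cret"}  # made-up example data
primary = RadiusServer("primary", healthy=False, credentials=subscribers)
backup = RadiusServer("backup", credentials=subscribers)

print(authenticate([primary, backup], "alice", "s3cret"))
```

With the primary marked as failed, the login above is answered by the backup. With only a single server in the list, the same failure leaves every login unanswered, which is roughly the situation we were in last week.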
