Monitor the Monitoring
#1
It was a week like any other.
But something...something was different.
Was it that the users were quieter than usual? No.
Were we in the middle of a pandemic that affected sensations of time passing? Sure, but that's old news.

Then it was discovered. The first small issue from the first user.
"Huh, that's odd," they probably said. "Weird, that didn't route to the notification system," probably soon followed.

The monitoring system's service had failed.

High-availability monitoring, you say? Too expensive, "too complex", or too annoying.
A backup monitoring system to monitor the primary one? Genius.
But wait, that's already in place. Why did the failure not get detected?

The backup monitors the primary's webserver and uptime...not the monitoring service itself.

Curse you, former admins. The week was too quiet, and I was too full of hope.

(The monitoring system is fixed, and the secondary monitoring system now actually monitors the primary's monitoring service, not just its webserver.)
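
For anyone wanting to avoid the same trap, here is a minimal sketch in Python of what "actually monitoring the primary" could look like. It is not the real setup from this story. It assumes, hypothetically, that the primary's monitoring service records a heartbeat timestamp each time it finishes a round of checks, and that the secondary can read it. The function names and the 300-second threshold are made up for illustration.

```python
import time


def monitoring_service_healthy(last_heartbeat, now=None, max_age_seconds=300):
    """Return True if the monitoring service reported in recently.

    last_heartbeat: Unix timestamp of the last completed check round
    (hypothetical; assumes the primary records one somewhere readable).
    """
    if now is None:
        now = time.time()
    return (now - last_heartbeat) <= max_age_seconds


def check_primary(web_ok, last_heartbeat, now=None, max_age_seconds=300):
    """Return a list of problems found on the primary.

    A reachable webserver is not enough: the heartbeat of the
    monitoring service itself is checked as a separate condition.
    """
    problems = []
    if not web_ok:
        problems.append("web server unreachable")
    if not monitoring_service_healthy(last_heartbeat, now, max_age_seconds):
        problems.append("monitoring service heartbeat is stale")
    return problems
```

The point of the sketch is that the webserver check and the service check are separate conditions. A host can be up and serving pages while the thing that sends the alerts has been dead all week.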

