Keep is an open-source alerting CLI tool that @shaharglazner and I wrote out of a pain we felt throughout our careers as developers and engineering managers. Alerting (aka monitors/alarms) has always felt like a second-class citizen in monitoring, observability, and infrastructure tools: the feature set is very narrow, which in turn results in poor alerts, alert fatigue (yes, your muted Slack channel), an unreliable product, and complete alerting hell.
It's not only that we couldn't create better application and infrastructure alerts; it was also tough to maintain them and make sure they kept working over time.
Organizations today have so many tools they use for alerting that it's becoming an absolute nightmare.
The best way to describe what we had in mind when we first built Keep is how one of our first users puts it:
"Keep is doing to alerting what GitHub Actions did to CI/CD."
There were three main guidelines when we started coding:
We constantly try to improve on our promise:
Try our first mock alert and get it up and running in <5 minutes.
So we're adding many more deployment options, providers, and functions, and we're working on simplifying the syntax even further.
What do you think about the need for this kind of "abstraction"? What do you think about alerts as post-production tests? How do you manage and control your alerting chaos right now?
We would love to hear your thoughts; feel free to comment here, on our GitHub repo, or in our Slack.