What’s this dashboard for?
When you’re building software, particularly a SaaS product, there’s a multitude of systems and processes that need monitoring. Being able to see that everything is running like a well-oiled machine gives you peace of mind. But it’s also crucial to know when something has gone awry so you can explore and fix any issues.
We have various alerts that notify developers of any critical issues with our infrastructure, but there are lots of slightly less urgent processes that we also need to monitor. For these, a TV dashboard is the perfect way to tell the team if there’s something that needs investigating during their day or week. And staying on top of core processes means we can spot bugs before users do – rather than our support team waking up to a mountain of bug reports.
Our product monitoring dashboard pulls data from three different data sources: Amazon CloudWatch, Pingdom and PagerDuty.
The main visualization on the dashboard is Latency, pulled from Amazon CloudWatch. This gives us a high-level overview of how well our systems are performing, and can be the first sign that something needs fixing. We also display CPU Usage for several of our workers, which allows us to identify any unusual spikes.
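As a rough illustration of what an “unusual spike” could mean in practice, the sketch below flags CPU samples that sit well above the recent average. The function name, the sample values and the two-standard-deviation threshold are all assumptions made for this example; it isn’t how CloudWatch or the dashboard itself decides what counts as a spike.

```python
from statistics import mean, pstdev


def find_spikes(samples, threshold=2.0):
    """Return the indices of samples that sit more than `threshold`
    standard deviations above the mean of the whole series.

    `samples` is a list of CPU usage percentages (hypothetical data).
    """
    if len(samples) < 2:
        return []

    avg = mean(samples)
    spread = pstdev(samples)  # population standard deviation

    # A perfectly flat series has no spread, so nothing can be a spike.
    if spread == 0:
        return []

    return [
        index
        for index, value in enumerate(samples)
        if (value - avg) / spread > threshold
    ]


# Hypothetical CPU usage readings (percent) for one worker.
cpu_percent = [22.0, 25.0, 21.0, 24.0, 23.0, 91.0, 22.0, 24.0]
print(find_spikes(cpu_percent))  # → [5]
```

A fixed threshold like this is deliberately simple. On a real worker you would probably compare against a rolling window rather than the whole series, so that a gradual rise in baseline load isn’t mistaken for a spike.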
From Pingdom, we can easily see if something has gone down, or if everything is running smoothly.
Finally, we use PagerDuty to show which engineer is on call at any given time. It’s a really simple way of letting people know who to contact if something critical goes down.
- Lets you check instantly whether everything is operating smoothly or something needs attention
- Helps you solve operational issues faster, by showing who’s on call at any given time