How we monitor our own checkpoints

If you use our services, you probably know that you can monitor your website from more than 150 locations around the world. These checkpoints need to stay online and operate flawlessly so that we can help your websites do the same. Today we would like to reveal how we monitor these checkpoints to make sure they are available when you need them.

How are they monitored?

We don’t just provide monitoring solutions; we also consume them! That’s right, we trust our own product so much that we use it to monitor our own checkpoints. Every checkpoint around the world has its own HTTPS monitor. This works because the checkpoints themselves are web services, a form of API.

When it is time for a checkpoint to be monitored, our system selects which checkpoint will perform the check and sends a command to that location. Because every checkpoint can be reached over HTTPS, we can use our own system to monitor them all. To make sure the monitoring itself stays online, a secondary monitoring system maintains the continuity of our monitoring and alerting subsystems.

What are we checking for?

Every checkpoint has a health webpage that we can monitor. This page acts as a proxy: it reports errors that our monitors can read when a malfunction occurs. Some things in our server environment can’t be read through HTTPS directly, so we program custom checks ourselves and send their results to the health page. This webpage is then monitored by Uptrends in the usual way.
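To make the idea of a health page concrete, here is a minimal sketch of how several custom checks could be rolled up into one page that an HTTPS monitor can read. This is not our actual checkpoint software: the function `run_health_checks`, the check names, and the choice of status codes 200 and 500 are all assumptions made for illustration.

```python
from typing import Callable, Dict, Tuple

# Hypothetical check signature: each check returns (ok, detail).
Check = Callable[[], Tuple[bool, str]]


def run_health_checks(checks: Dict[str, Check]) -> Tuple[int, str]:
    """Run every check and build a plain-text health page.

    Returns an HTTP-style status code (assumed: 200 when all checks
    pass, 500 otherwise) and the page body, one line per check.
    """
    lines = []
    all_ok = True
    for name, check in checks.items():
        try:
            ok, detail = check()
        except Exception as exc:  # a crashing check counts as a failure
            ok, detail = False, f"check raised {exc!r}"
        all_ok = all_ok and ok
        lines.append(f"{name}: {'OK' if ok else 'ERROR'} - {detail}")
    return (200 if all_ok else 500), "\n".join(lines)


# Illustrative checks; real checks would inspect the server environment.
def disk_check() -> Tuple[bool, str]:
    return True, "42% used"


def queue_check() -> Tuple[bool, str]:
    return False, "transaction processor not responding"


status, body = run_health_checks({"disk": disk_check, "queue": queue_check})
```

A monitor that looks at this page only needs to check the status code or search the body for the word ERROR, which is why a single page can cover many internal checks at once.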

This health page lets us monitor all of the essential processes at the same time. We make sure that transactions, Full Page Checks, uptime checks, and IPv4 and IPv6 connectivity (where available) function correctly. We also verify that DNS resolution is working and accurate, and that the internal clock of each checkpoint is in sync with our own.
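As one example of such a check, the clock comparison can be sketched as a simple tolerance test. The 2-second tolerance and the name `clock_in_sync` are assumptions for illustration only; the actual threshold we use is not stated here.

```python
from datetime import datetime, timedelta, timezone

# Assumed tolerance; chosen only to make the example concrete.
MAX_DRIFT = timedelta(seconds=2)


def clock_in_sync(checkpoint_time: datetime,
                  reference_time: datetime,
                  max_drift: timedelta = MAX_DRIFT) -> bool:
    """Return True when the two clocks differ by at most max_drift."""
    return abs(checkpoint_time - reference_time) <= max_drift


reference = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
slightly_off = reference + timedelta(seconds=1)   # within tolerance
far_off = reference - timedelta(seconds=5)        # outside tolerance
```

Both timestamps should be timezone-aware (UTC in this sketch) so that checkpoints in different time zones compare correctly.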

How is Uptrends Infra involved?

In addition to HTTPS monitors, we also use our own internal monitoring product, Uptrends Infra, to monitor vital health statistics. First, we use it to measure system metrics such as CPU, memory and hard disk usage, network traffic, and other processes. Second, Infra also acts as an application performance monitor (APM) that uses Windows performance counters to measure and expose application-related metrics. For example, we ask the web server process on the checkpoint to report the number of incoming requests. By combining the hardware and application metrics from Infra with the health checks coming in through the HTTPS monitors, our internal and external monitoring work together to ensure proper network connectivity, IPv4/IPv6 capabilities, and a healthy request throughput.

How do we choose the locations for our checkpoints?

If you look at the complete map of all of our checkpoint locations, you will notice that we have checkpoints stationed in dozens of countries all over the world. Each checkpoint runs on a server that is physically located at that location. When we want to add a new checkpoint, we research providers that offer web hosting in that location. To verify that the checkpoint is actually in the location we need it to be, we analyse traceroutes and perform various other checks to confirm the location of the data center.

We also explicitly request IPv6 support if it is available. A hosting provider that supports IPv6 gives the Uptrends application a bit of an edge, because native IPv6 support allows native IPv6 monitoring. In the past it was difficult to find IPv6 support, but over the past 6 months we have noticed a shift in mindset, with many more providers supporting it.

Once the system is online, we install our own custom software on it, including the checkpoint software with the health page, the transaction processor, and more. All of our checkpoints are Windows machines, so the checks we perform are executed by a true Windows client, using Windows’ network stack and the security cipher suites available on those machines.

What happens when a checkpoint experiences an error?

If the health page for a checkpoint starts to register a problem, our own Uptrends monitors will recognize that and send an alert to our dedicated Slack channel. If the server goes down completely, rendering even the health page unavailable, we will still receive an alert showing that the checkpoint is unhealthy.

A separate monitoring system, which we’ve dubbed our Checkpoint Robot, can spot an unhealthy checkpoint and immediately pull it out of our network so that no monitor requests go to that location.
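The pull-out step can be pictured as a small filter over the set of active checkpoints. The sketch below is an illustration, not the Checkpoint Robot itself: the function `pull_unhealthy`, the checkpoint names, and the rule that a checkpoint with no health report counts as unhealthy are all assumptions.

```python
from typing import Dict, Set, Tuple


def pull_unhealthy(active: Set[str],
                   health: Dict[str, bool]) -> Tuple[Set[str], Set[str]]:
    """Split the active checkpoints into (still_active, pulled).

    A checkpoint with no entry in `health` is treated as unhealthy
    (assumption: no report means the health page was unreachable).
    """
    still_active = {name for name in active if health.get(name, False)}
    pulled = active - still_active
    return still_active, pulled


# Hypothetical checkpoint names and health results.
active = {"amsterdam", "new-york", "tokyo"}
health = {"amsterdam": True, "new-york": False}  # tokyo did not report

still_active, pulled = pull_unhealthy(active, health)
```

Only the checkpoints in `still_active` would receive monitor requests afterwards. How and when a pulled checkpoint is put back into rotation is outside the scope of this sketch.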

How do the checkpoints benefit me?

To stay on top of your end user experience, you need to know how your site performs from all over the world. That means not just uptime and performance, but also critical elements of your website such as Transactions and web services. Our global network gives you the power to monitor your site from all of these different locations.

Because we have so many checkpoints, we made it an Uptrends standard that any time an error is detected, it is confirmed from a second location before you get an alert. This way, even if one of our checkpoints is unhealthy, it is unlikely that the second checkpoint will be experiencing the same problem. The result is virtually zero false alerts for you, allowing you to focus on making your online presence the best it can be. To learn more about selecting the right set of checkpoints, check out our Knowledge Base!
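The double-check rule can be sketched in a few lines. This is a simplified illustration under stated assumptions: `probe` is a hypothetical callable that returns True when the site looks fine from a given checkpoint, and picking the two checkpoints at random is our own simplification, not a description of how the real selection works.

```python
import random
from typing import Callable, Sequence


def confirmed_error(probe: Callable[[str], bool],
                    checkpoints: Sequence[str],
                    rng=random) -> bool:
    """Return True only if two different checkpoints both see an error.

    `probe(checkpoint)` is a hypothetical callable that returns True
    when the monitored site responds correctly from that checkpoint.
    """
    first, second = rng.sample(list(checkpoints), 2)
    if probe(first):
        return False          # no error detected, nothing to confirm
    return not probe(second)  # alert only if the second location agrees


locations = ["amsterdam", "new-york"]

# The site is really down: every checkpoint sees the error.
site_down = confirmed_error(lambda cp: False, locations)

# Only the (hypothetical) Amsterdam checkpoint is misbehaving.
one_bad_checkpoint = confirmed_error(lambda cp: cp != "amsterdam", locations)
```

In the second case no alert is raised whichever checkpoint runs first, because the healthy checkpoint never agrees that the site is down.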

Interested in getting to know our application a bit better? You can try it today with our 30-day free trial, no credit card required!

 
