Overview

Canary Checker can collect health information about systems in a few different ways:

Active application health checks involve sending periodic requests to the service or application and checking the response to ensure that it is working correctly. Active health checks are proactive and can detect issues quickly, but they can also introduce some load on the system being monitored.
Active infrastructure health checks are similar to application health checks, but instead of sending a request to the application, they send a request to the infrastructure to deploy a new application or infrastructure component, e.g. a new Kubernetes pod or EC2 instance.
Passive health checks rely on monitoring the activity in the system, analysing it, and detecting anomalies or errors. Passive health checks are less intrusive than active health checks, but they may not detect issues as quickly.
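The difference between the active and passive approaches can be illustrated with a minimal Python sketch. It is not Canary Checker's implementation: `CheckResult`, `run_active_check` and `passive_error_rate` are hypothetical names chosen for illustration, and treating status codes of 500 and above as errors is an assumption.

```python
import time
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class CheckResult:
    passed: bool
    duration_ms: float
    message: str = ""


def run_active_check(probe: Callable[[], bool], max_duration_ms: float) -> CheckResult:
    """Actively call the probe once and judge it on outcome and latency."""
    start = time.monotonic()
    try:
        ok = probe()
        message = "" if ok else "probe reported failure"
    except Exception as exc:  # any exception counts as a failed check
        ok = False
        message = str(exc)
    duration_ms = (time.monotonic() - start) * 1000
    if ok and duration_ms > max_duration_ms:
        ok = False
        message = "probe exceeded the allowed duration"
    return CheckResult(ok, duration_ms, message)


def passive_error_rate(status_codes: Iterable[int]) -> float:
    """Passively derive an error rate from traffic that was already observed."""
    codes = list(status_codes)
    if not codes:
        return 0.0
    # Assumption: server-side errors (5xx) are what counts as unhealthy.
    errors = sum(1 for code in codes if code >= 500)
    return errors / len(codes)
```

The active function generates its own request and therefore adds load, while the passive function only inspects data that the system produced anyway, which is why it can lag behind when traffic is sparse.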

Health checks can be defined in 3 different ways:

UI: Navigate to Settings → Health → Click on the button
GitOps: canary-checker is fully GitOps-enabled using Kubernetes Custom Resource Definitions (CRDs).
CLI: For rapid development and feedback, canary-checker can be run as a normal CLI application by specifying the health check definition in a config file.

Canary Checker runs health checks on a pre-defined cron schedule and provides a fully customizable platform that:

Securely references authentication credentials from Kubernetes secrets and configmaps.
Parses and transforms the response using JSONPath, Go templates or JavaScript to validate, extrapolate or aggregate results.
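The idea of parsing a response and turning it into a pass/fail result can be sketched as follows. This is an illustration only: it uses a simple dotted-path lookup as a stand-in for JSONPath, and the `extract` and `evaluate` helpers, the result keys and the sample field names are all hypothetical rather than part of Canary Checker.

```python
import json
from typing import Any


def extract(document: Any, path: str) -> Any:
    """Follow a dotted path such as 'status.replicas' through dicts and lists."""
    current = document
    for part in path.split("."):
        if isinstance(current, list):
            current = current[int(part)]  # list segments are numeric indexes
        else:
            current = current[part]
    return current


def evaluate(body: str, path: str, minimum: float) -> dict:
    """Parse a JSON response, pull out one value and validate it against a minimum."""
    value = extract(json.loads(body), path)
    return {
        "pass": value >= minimum,
        "message": f"{path} = {value} (minimum {minimum})",
    }
```

For example, `evaluate('{"status": {"replicas": 3}}', "status.replicas", 2)` passes because the extracted value 3 is at least 2.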

Check Types

Protocol	Status	Checks
HTTP(s)	GA	Response body, headers and duration
DNS	GA	Response and duration
Ping/ICMP	GA	Duration and packet loss
TCP	GA	Port is open and connectable
Data Sources
SQL (MySQL, Postgres, SQL Server)	GA	Ability to log in, results, duration, health exposed via stored procedures
LDAP	GA	Ability to log in, response time
Elasticsearch / OpenSearch	GA	Ability to log in, response time, size of search results
Mongo	Beta	Ability to log in, results, duration
Redis	GA	Ability to log in, results, duration
Prometheus	GA	Ability to log in, results, duration
Alerts		Prometheus
Prometheus Alert Manager	GA	Pending and firing alerts
AWS CloudWatch Alarms	GA	Pending and firing alerts
DevOps
Git	GA	Query Git and GitHub repositories via SQL
Azure DevOps
Integration Testing
JMeter	Beta	Runs and checks the result of a JMeter test
JUnit	Beta	Runs a pod that saves JUnit test results
File Systems / Batch
Local Disk / NFS	GA	Check folders for files that are: too few/many, too old/new, too small/large
S3	GA	Check contents of AWS S3 Buckets
GCS	GA	Check contents of Google Cloud Storage Buckets
SFTP	GA	Check contents of folders over SFTP
SMB / CIFS	GA	Check contents of folders over SMB/CIFS
Config
AWS Config	GA	Query AWS Config using SQL
AWS Config Rule	GA	AWS Config Rules that are firing, Custom AWS Config queries
Config DB	GA	Custom config queries for Mission Control Config DB
Kubernetes Resources	GA	Kubernetes resources that are missing or are in a non-ready state
Backups
GCP Databases	GA	Backup freshness
Restic	Beta	Backup freshness and integrity
Infrastructure
EC2	GA	Ability to launch new EC2 instances
Kubernetes Ingress	GA	Ability to schedule and then route traffic via an ingress to a pod
Docker/Containerd	Deprecated	Ability to push and pull containers via docker/containerd
Helm	Deprecated	Ability to push and pull helm charts
S3 Protocol	GA	Ability to read/write/list objects on an S3 compatible object store
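
As a rough illustration of what a folder check such as the Local Disk / NFS row in the table above evaluates (too few/many, too old, too small), here is a minimal Python sketch. The `check_folder` function and its parameter names are hypothetical and do not reflect Canary Checker's actual configuration fields.

```python
import os
import time
from typing import List, Optional


def check_folder(
    path: str,
    min_count: int = 0,
    max_count: Optional[int] = None,
    max_age_seconds: Optional[float] = None,
    min_size_bytes: int = 0,
) -> List[str]:
    """Return a list of problems found in the folder; an empty list means healthy."""
    problems: List[str] = []
    now = time.time()
    with os.scandir(path) as entries:
        # Collect (name, stat) pairs for regular files only, sorted for stable output.
        files = sorted(
            (entry.name, entry.stat()) for entry in entries if entry.is_file()
        )
    if len(files) < min_count:
        problems.append(f"too few files: {len(files)} < {min_count}")
    if max_count is not None and len(files) > max_count:
        problems.append(f"too many files: {len(files)} > {max_count}")
    for name, info in files:
        if max_age_seconds is not None and now - info.st_mtime > max_age_seconds:
            problems.append(f"{name} is too old")
        if info.st_size < min_size_bytes:
            problems.append(f"{name} is too small")
    return problems
```

Returning a list of problems rather than a single boolean keeps the reason for a failure visible, which is usually what an operator needs first.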