Learn how to set up data alerts to validate your assumptions about your data.
Introduction
A data alert is usually set up to let a user know when there is an anomaly or a certain threshold has been passed. An example is when a data alert is used for quality assurance (QA). However, there is a lot of flexibility to set up a data alert for any purpose you choose.
Fundamentally, a data alert is based on a SQL select statement. If this select statement returns rows or errors, it will trigger a notification. If a select statement returns zero rows, nothing happens.
For example:
SELECT *
FROM sample_table
WHERE sample_column
IS NULL
This will trigger a notification if the sample_column has null values.
There are two types of data alerts in Mozart Data:
1. Notify-only alerts on any type of table. These alerts will run after a connector sync, after a transform run, or after a snapshot run. They will send you a notification if they trigger.
2. Revert and notify alerts are only on transforms. If this data alert returns rows on the new transform table or an error, this will fail the transform run, revert the transform table back to its previous state, and send a notification.
Alerts can be important for QA and validation of data pipelines. Pipelines involve manipulating data via transforms based on certain assumptions about the data. Assumptions like no null values in a column, no duplicate IDs in a column, are just a few examples. You can write alerts that assert such assumptions.
Writing a Data Alert
You can write an alert by going to the Alerts tab on the Table page. Click on Create Alert button in the top right as shown.
You can provide a descriptive name for the alert and a SQL query. To test the SQL query, click the Run button. If 0 rows are showing, your alert is successful on the current table.
Otherwise, you will see an error message or the rows that were returned.
If you edit an alert’s SQL, but it hasn’t been run yet, you will see a warning message- 'SQL has changed since last run' below the alert.
Running a Data Alert
Alerts are enabled to run when a table changes (transform run, connector sync, snapshot run) by default. You can disable an alert by using the toggle next to it as shown below.
Once an alert is enabled, each time a table changes, either manually or scheduled, the enabled alert(s) will run. However, if any enabled revert-and-notify alert fails, the transform table will not get created/updated. You will be able to see the error in the Runs tab on the Transform Runs page. Notify-only alerts will run after the table change is successful. This means if a transform has revert-and-notify alerts enabled that fail, the notify-only alerts will not run.
It is important to note that if your transform run fails for a non-data alert reason (e.g., bad transform SQL), no alerts will be run. In such cases, the transform run page will show the appropriate error message as shown below.
Viewing Alert Results
When your alert fails, Mozart Data will store up to 2000 failing rows. When your alert passes, there are no results available since it will have returned 0 rows.
To see the Alerts in Mozart Data: You can view them by clicking the button under the Last Results column. If the data check has passed, the Results button will be disabled. Please note that the alert results correspond to last table change. You can view the timestamp for the last table change at the top of the Alerts tab and view further details in the Runs tab.
To receive a CSV file attachment via email: You can do this by enabling the email notification in your Settings page (see here: Notifications). Next, toggle on the 'Attach CSV' for each Data Alert and you will start receiving email alerts every time the table is synced / transform is scheduled and the data alert fails.
To receive the results directly via Slack: You can do this by installing the Mozart Data app into your Slack workspace (see here: Using Slack with Mozart Data).