Tests are important for quality assurance and validation of data pipelines. Pipelines involve manipulating data via transforms based on certain assumptions about the data. Assumptions like no null values in a column, no duplicate IDs in a column, are just a few examples. You can write tests that assert such assumptions.
A test is a SQL query that makes an assertion. When a test is run, the test succeeds if 0 rows are returned. If 1 or more rows are returned, the test will fail. An example of a test is shown below. The test succeeds when 0 rows are returned, indicating there are no null values in the CUSTOMER_ID column.
SELECT * FROM mozart.shopify_customers WHERE customer_id is NULL;
You can write a test by going to the Tests tab on the Transform page. Click on Create Test button as shown.
You can provide a descriptive name for the test and a sql query. To test the SQL query, click the Run button. If 0 rows are show, your test is successful on the current transform table.
If you edit a test’s SQL, but it hasn’t been run yet, you will see a warning message-
SQL has changed since last run below the test.
Tests are enabled to run when a transform runs by default. You can disable a test by using the toggle next to it as shown below.
Once a test is enabled, each time a transform is run, either manually or scheduled, the enabled test(s) will run. If all enabled tests pass, the transform table will be created/updated. However, if any enabled test fails, the transform table will not get created/updated. You will be able to see the error in the Runs tab on the Transform Runs page.
It is important to note that if your transform fails for another reason, your tests will not be run. In such cases, the transform run page will show the appropriate error message as shown below.
Each test has its own run status associated with it. This is indicated under the Last Test Status column on the Tests page. This column can show 3 symbols- grey circle, red cross, and a green checkmark.
If the Last Test Status shows a grey circle, it could be because the test didn't run on the last run of your transform. To run the test, first enable it via the Enabled toggle. You can then either manually run the transform or schedule it. Once the transform is run, all enabled tests will run and their Last Test Status will be updated. A red cross indicates that the test failed. Thus, no changes to the transform table (if it exists) have been made. A green check mark indicates the test passed.
When your test fails, Mozart Data will store up to 2000 failing rows. You can view them by clicking the Results button under the Failed Test Results column. When your test passes, there are no results available since it will have returned 0 rows. In such a case, Results button will be disabled. Please note that the test results correspond to last transform run. You can view the timestamp for the last transform run at the top of the Test tab and view further details in the Run tab.
Updated 9 days ago