Data validation, profiling, quality check using EM

We can add a new action, "Call with another table", that should simplify applying re-usable data quality checks. The action would call another (re-usable) project and pass another table to it. The called project runs whatever data quality check you design and returns a list of errors in whatever format you prefer.

You will be able to place several "Call with another table" actions in a row, each running a different data quality check. Each action will append its result to the result of the previous one, building up a list of the errors returned by all the checks. In the end, you will have either a list of errors or an empty list, and you can decide what to do with it next. Basically, you take a table, pass it through a pipeline of checks, and collect the results into one table.
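To make the idea concrete outside of EM, here is a rough Python sketch of the same pipeline: each check takes a table and returns a list of errors, and the results are appended one after another. Everything in it is hypothetical and only for illustration (the function names, the `id` and `amount` columns, the error format); it is not how the action would be implemented.

```python
from typing import Callable

# A table is modeled as a list of rows; each row is a dict of column values.
Row = dict[str, object]
Table = list[Row]
# A check takes a table and returns a table of errors (empty if none found).
Check = Callable[[Table], Table]


def check_no_missing_ids(table: Table) -> Table:
    """Hypothetical check: report rows whose 'id' is empty."""
    return [
        {"check": "missing_id", "row": i, "message": "id is empty"}
        for i, row in enumerate(table)
        if row.get("id") in (None, "")
    ]


def check_non_negative_amount(table: Table) -> Table:
    """Hypothetical check: report rows with a negative 'amount'."""
    errors: Table = []
    for i, row in enumerate(table):
        amount = row.get("amount")
        if isinstance(amount, (int, float)) and amount < 0:
            errors.append(
                {"check": "negative_amount", "row": i, "message": "amount < 0"}
            )
    return errors


def run_checks(table: Table, checks: list[Check]) -> Table:
    """Pass the same table through every check and append all results."""
    errors: Table = []
    for check in checks:
        errors.extend(check(table))
    return errors


orders: Table = [
    {"id": 1, "amount": 10.0},
    {"id": None, "amount": 5.0},
    {"id": 3, "amount": -2.5},
]

errors = run_checks(orders, [check_no_missing_ids, check_non_negative_amount])
if errors:
    for error in errors:
        print(error)
else:
    print("No data quality issues found")
```

Each check here is independent of the others, so the same check can be re-used against any table that has the columns it expects, which is the point of the proposed action.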

Something like in the sketch below:

With this action, it should be possible to re-use data validation logic and build data quality check pipelines.

How does it sound to you?
