Data Quality Dashboard using uArrow DQ

How-to Guide

uArrow Data Quality helps customers unlock the value of their data and build confidence in it.

According to Gartner, poor data quality hits organizations where it hurts: in 2017 its average annual financial cost was $15 million. And according to IBM, writing about extracting business value from data, businesses lose $3.1 trillion annually due to poor data quality in the US alone.

An enterprise's value is measured by the quality, performance, and accuracy of its data, yet the price paid for poor data quality is routinely underestimated. Poor data quality also leads to reputational damage and lost opportunities, costs that are intangible in this cloud world. As noted in HBR, bad data costs $3 trillion per year in the US alone.

Data quality has historically required a huge investment in money, time, and tools. uArrow Data Quality is a niche player in the data quality tools space that helps you define business rules, then measure, monitor, and act on the results to unlock the intrinsic value of the data.

uArrow DQ takes a push-down approach: the rule engine pushes the checks down to the data source instead of moving the data to a third-party cloud. This way, there is no need to move vast amounts of data across the network or cloud and enter the labyrinth of security hassles.
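
The push-down idea can be sketched in a few lines. This is only an illustration under stated assumptions, not uArrow's actual rule engine: the `customers` table, the `count_missing_pushdown` helper, and the in-memory SQLite database standing in for the data source are all hypothetical. The point is that the check runs as a query inside the source, and only a single number travels back.

```python
import sqlite3

def count_missing_pushdown(conn, table, column):
    """Run the null check inside the database; only one number comes back."""
    # Identifiers are interpolated for illustration only; real code must validate them.
    query = f'SELECT COUNT(*) FROM "{table}" WHERE "{column}" IS NULL'
    return conn.execute(query).fetchone()[0]

# Hypothetical data source: an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, email TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "a@example.com"), (2, None), (3, None)],
)

missing = count_missing_pushdown(conn, "customers", "email")
print(missing)  # → 2
```

The alternative, fetching every row and counting in the client, would give the same answer but would move the whole table over the network first.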

There are multiple lenses through which to view and measure data quality. One of the most common industry practices is to measure the data quality score by data quality dimension.

Data Quality Dimensions

The uArrow Data Quality product offers a wide range of checks across the various data quality dimensions.
1. Completeness check.
This check ensures there are no gaps between the data that was supposed to be collected and the data that was actually collected.

Some of the checks uArrow supports:

  • Missing values in a column
  • Missing values in a table
  • Missing values in a JSON document
  • Missing values w.r.t. a reference population
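
The four completeness checks listed above could look roughly like this in plain Python. It is a minimal sketch, not uArrow's implementation: the sample rows, the helper names, and the choice to treat `None` and blank strings as "missing" are all assumptions made for illustration.

```python
import json

# Hypothetical sample data; "profile" holds a JSON document as text.
rows = [
    {"id": 1, "name": "Ann", "profile": '{"city": "Singapore", "zip": null}'},
    {"id": 2, "name": None, "profile": '{"city": null, "zip": "018956"}'},
    {"id": 4, "name": "", "profile": None},
]

def is_missing(value):
    """Assumption: None and blank strings both count as missing."""
    return value is None or (isinstance(value, str) and value.strip() == "")

def missing_in_column(rows, column):
    return sum(1 for r in rows if is_missing(r.get(column)))

def missing_in_table(rows):
    """Missing-value count for every column of the table."""
    return {col: missing_in_column(rows, col) for col in rows[0]}

def missing_in_json(rows, column, key):
    """Count rows whose JSON document lacks a usable value for `key`."""
    count = 0
    for r in rows:
        raw = r.get(column)
        doc = {} if is_missing(raw) else json.loads(raw)
        if is_missing(doc.get(key)):
            count += 1
    return count

def missing_from_reference(rows, column, reference):
    """Reference values that never appear in the column."""
    present = {r.get(column) for r in rows}
    return sorted(set(reference) - present)

print(missing_in_column(rows, "name"))                   # → 2
print(missing_in_table(rows))                            # → {'id': 0, 'name': 2, 'profile': 1}
print(missing_in_json(rows, "profile", "city"))          # → 2
print(missing_from_reference(rows, "id", [1, 2, 3, 4]))  # → [3]
```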

2. Consistency check.
This check ensures the data is consistent, meaning that it respects a defined set of constraints.

Some of the checks uArrow supports:

  • Comparison of two fields:
      a) by relational operator (<, >, ==, !=)
      b) by inferred relationship
      c) mutually exclusive fields
      d) based on a set of values
  • Comparison of an aggregated field in a table linked to a master table with a field in the master table
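
Three of the consistency checks above (comparison by relational operator, mutual exclusivity, and detail aggregate versus master field) can be sketched as follows. The tables, field names, and helper functions are hypothetical and only illustrate the idea; they are not uArrow's rule definitions.

```python
import operator

# Relational operators named in the list above.
OPS = {"<": operator.lt, ">": operator.gt, "==": operator.eq, "!=": operator.ne}

def compare_fields(rows, left, op, right):
    """Return the rows that violate the rule `left op right`."""
    check = OPS[op]
    return [r for r in rows if not check(r[left], r[right])]

def mutually_exclusive(rows, a, b):
    """Return the rows where both fields are populated at the same time."""
    return [r for r in rows if r[a] is not None and r[b] is not None]

def aggregate_mismatches(master, details, key, master_field, detail_field):
    """Return master keys whose field differs from the summed detail field."""
    totals = {}
    for d in details:
        totals[d[key]] = totals.get(d[key], 0) + d[detail_field]
    return [m[key] for m in master if totals.get(m[key], 0) != m[master_field]]

# Hypothetical master table, detail table, and payments table.
orders = [
    {"order_id": 1, "order_date": "2023-01-05", "ship_date": "2023-01-07", "total": 30},
    {"order_id": 2, "order_date": "2023-02-10", "ship_date": "2023-02-01", "total": 50},
]
order_lines = [
    {"order_id": 1, "amount": 10},
    {"order_id": 1, "amount": 20},
    {"order_id": 2, "amount": 45},
]
payments = [
    {"id": 1, "card_no": "4111", "bank_account": None},
    {"id": 2, "card_no": "4222", "bank_account": "SG01"},
]

# ISO-formatted dates compare correctly as strings.
date_violations = compare_fields(orders, "order_date", "<", "ship_date")
both_populated = mutually_exclusive(payments, "card_no", "bank_account")
mismatched = aggregate_mismatches(orders, order_lines, "order_id", "total", "amount")

print([r["order_id"] for r in date_violations])  # → [2]
print([r["id"] for r in both_populated])         # → [2]
print(mismatched)                                # → [2]
```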

3. Uniqueness check.
This check ensures there are no duplicates in the data that was received.

Some of the checks uArrow supports:

  • Duplicates on a single column
  • Duplicates on a combination of N columns
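
Both uniqueness checks reduce to counting how often each key occurs, where the key is one column or a tuple of N columns. A minimal sketch, with a hypothetical `find_duplicates` helper and made-up sample rows:

```python
from collections import Counter

def find_duplicates(rows, columns):
    """Return each key (a tuple over `columns`) that occurs more than once."""
    counts = Counter(tuple(r[c] for c in columns) for r in rows)
    return {key: n for key, n in counts.items() if n > 1}

# Hypothetical sample data.
rows = [
    {"email": "a@x.com", "country": "SG"},
    {"email": "a@x.com", "country": "MY"},
    {"email": "b@x.com", "country": "SG"},
    {"email": "a@x.com", "country": "SG"},
]

print(find_duplicates(rows, ["email"]))             # → {('a@x.com',): 3}
print(find_duplicates(rows, ["email", "country"]))  # → {('a@x.com', 'SG'): 2}
```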

4. Validity check.
This check verifies whether the information is in a specific format, follows the defined business rules, and is in a usable form.

Some of the checks uArrow supports:

  • Data type check
  • Range check
  • Mandatory check
  • List of values check
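
The four validity checks can be expressed as declarative rules per column. The rule format below (`mandatory`, `type`, `range`, `allowed`) and the `validate_row` helper are assumptions invented for this sketch; uArrow's own rule syntax is not shown in this guide.

```python
def validate_row(row, rules):
    """Return a list of validity errors for one row."""
    errors = []
    for column, rule in rules.items():
        value = row.get(column)
        if value is None:
            # Mandatory check: only flagged when the rule asks for it.
            if rule.get("mandatory"):
                errors.append(f"{column}: mandatory value missing")
            continue
        # Data type check.
        if "type" in rule and not isinstance(value, rule["type"]):
            errors.append(f"{column}: expected {rule['type'].__name__}")
            continue
        # Range check (inclusive bounds).
        if "range" in rule:
            low, high = rule["range"]
            if not low <= value <= high:
                errors.append(f"{column}: {value} outside range {low}..{high}")
        # List of values check.
        if "allowed" in rule and value not in rule["allowed"]:
            errors.append(f"{column}: {value!r} not in list of values")
    return errors

# Hypothetical rules and rows.
RULES = {
    "customer_id": {"mandatory": True, "type": int},
    "quantity": {"type": int, "range": (1, 100)},
    "status": {"allowed": {"NEW", "SHIPPED", "CLOSED"}},
}
good = {"customer_id": 7, "quantity": 3, "status": "NEW"}
bad = {"customer_id": None, "quantity": 250, "status": "LOST"}

print(validate_row(good, RULES))  # → []
for error in validate_row(bad, RULES):
    print(error)
```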

5. Accuracy check.
This check measures the degree to which the data describes the real-world scenario.

Some of the checks uArrow supports:

  • Precision check, for instance a Singapore phone number starts with +65
  • The age of a person cannot be > 200 or < 0
  • The order number must be exactly 10 digits long
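
The three example rules above translate directly into small predicates. This is a sketch with hypothetical function names; it checks only what the bullets state (the +65 prefix, the 0 to 200 age bounds, and the 10-digit length) and nothing more.

```python
import re

def phone_has_sg_prefix(phone):
    """A Singapore phone number starts with +65."""
    return phone.startswith("+65")

def age_is_plausible(age):
    """The age of a person cannot be > 200 or < 0."""
    return 0 <= age <= 200

def order_number_is_valid(order_no):
    """The order number must be exactly 10 ASCII digits."""
    return re.fullmatch(r"[0-9]{10}", order_no) is not None

print(phone_has_sg_prefix("+6561234567"))   # → True
print(age_is_plausible(-3))                 # → False
print(order_number_is_valid("0012345678"))  # → True
print(order_number_is_valid("12345"))       # → False
```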

6. Integrity check.
Over the course of its journey, the data might have been transformed by many downstream systems. The integrity check verifies that the attributes are maintained correctly, even as data gets stored, processed, and used in diverse systems.

Some of the checks uArrow supports:

  • Find an external (foreign) key in a single field of another table.
  • Find an external (foreign) key in another table using a combination of n fields.
  • Find an external (foreign) key in a flat file.
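
All three integrity checks come down to the same lookup: collect the keys that exist on the parent side, then report child rows whose key is not among them. The sketch below assumes the key columns have the same names on both sides; the `orphan_keys` helper, the tables, and the inline CSV text standing in for a flat file are hypothetical.

```python
import csv
import io

def orphan_keys(child_rows, parent_rows, columns):
    """Return child keys (tuples over `columns`) with no match in the parent."""
    parent_keys = {tuple(p[c] for c in columns) for p in parent_rows}
    orphans = []
    for row in child_rows:
        key = tuple(row[c] for c in columns)
        if key not in parent_keys:
            orphans.append(key)
    return orphans

# Hypothetical parent and child tables.
customers = [
    {"customer_id": "C1", "region": "SG"},
    {"customer_id": "C2", "region": "MY"},
]
orders = [
    {"order_id": "O1", "customer_id": "C1", "region": "SG"},
    {"order_id": "O2", "customer_id": "C2", "region": "SG"},
    {"order_id": "O3", "customer_id": "C9", "region": "SG"},
]

# Single-field key and a key made of a combination of two fields.
print(orphan_keys(orders, customers, ["customer_id"]))            # → [('C9',)]
print(orphan_keys(orders, customers, ["customer_id", "region"]))  # → [('C2', 'SG'), ('C9', 'SG')]

# Flat file as the parent: inline CSV text stands in for a file on disk.
FLAT_FILE = "customer_id,region\nC1,SG\nC2,MY\n"
file_rows = list(csv.DictReader(io.StringIO(FLAT_FILE)))
print(orphan_keys(orders, file_rows, ["customer_id"]))            # → [('C9',)]
```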