
Your Ultimate Guide to Data Quality Management in the Digital Age


What is data quality management (DQM) exactly? In a nutshell, DQM is:

  • The acquisition of data
  • The implementation of advanced data processes
  • Effective distribution of data
  • Managerial oversight of data

Engineered to be the “Swiss Army Knife” of data development, DQM processes prepare your organization to face the challenges of digital age data, wherever and whenever they appear. Among the prominent digital age data innovators of today, especially those industry leaders driving the big data evolution, effective DQM is recognized as the key to consistent data analysis.

Exclusive Bonus Content: What is Data Quality Management exactly?
Read our free summary with the key takeaway points of this guide here!

Why Is Data Quality Management Necessary?


While the digital age has been successful in prompting innovation far and wide, it has also facilitated what is referred to as the “data crisis” of the digital age: low-quality data.

Low-quality data is the leading cause of failure for advanced data and technology initiatives, to the tune of $600 billion. That is how much low-quality data costs American businesses each year (not counting businesses in every other country of the world).

We’ll get into some of the consequences of poor-quality data in a moment. However, let’s make sure not to get caught in the “quality trap,” because the ultimate goal of data quality management is not to create subjective notions of what “high-quality” data is. No, the ultimate goal of DQM is to increase return on investment (ROI) for those business segments that depend upon data.

From customer relationship management to supply chain management to enterprise resource planning, the benefits of effective DQM can have a ripple effect on an organization’s performance. With quality data at their disposal, organizations can form data warehouses for the purposes of examining trends and establishing future-facing strategies. Industry-wide, the positive ROI on quality data is well understood. According to recent big data surveys by Accenture, 92% of executives using big data to manage are satisfied with the results, and 89% rate data as “very” or “extremely” important, as it will “revolutionize operations the same way the internet did”.

As you can see, the leaders of big business clearly understand the importance of quality data.

The Consequences of Bad Data Quality

Bad data quality can impact every aspect of an organization, including:

  • How much your marketing campaigns cost and how effective they are
  • How accurately you are able to understand your customers
  • How quickly you can turn prospects into leads into sales
  • How accurately you can make business decisions

RingLead provides us with a very informative infographic that discloses the true costs of bad data as well as the benefits of clean data. Here is an extract:

[Infographic: the consequences of bad data quality. Source: RingLead]

In addition to this infographic, a study by Gartner tells us that bad data quality cost the companies they surveyed an average of $14.2 million a year.

The intangible costs

However much data we can find on the tangible costs of bad data, the intangible costs can’t be examined directly. We can, however, use our intuition and imagination in this area.

Let’s say that you’re striving to create a data-driven culture at your company. You’re spearheading the effort, and currently conducting a pilot program to show the ROI of making data-driven decisions using business intelligence and analytics. If your data isn’t high-quality, you’re going to run into a lot of problems showing other people the benefits of BI. If you blame the data quality “after the fact”, your words will just sound like excuses.

However, if you address things upfront, and make it clear to your colleagues that high data quality is absolutely necessary and is the cornerstone of getting ROI from data, you’ll be in a much better position.

One huge intangible cost: bad decisions

Maybe you’re not trying to convince others of the importance of data-driven decision making. Maybe your company already utilizes data analytics, but isn’t giving due attention to data quality. In that case, you can face an even bigger blowup: making costly decisions based on inaccurate data.

As big data expert Scott Lowe states in his post “Data quality is too important to ignore,” perhaps the worst outcome is that decisions get made with bad data, which can lead to greater and more serious problems in the end. He would rather make a decision based on his gut than take the risk of making one with bad data.

For example, let’s say you have an incorrect data set showing that your current cash flows are healthy. Feeling optimistic, you expand operations significantly. Then, a quarter or two later, you run into cash flow issues and suddenly it’s hard to pay your vendors (or even your employees). This kind of disastrous situation is one that could be prevented by higher-quality data.

The Benefits of High-Quality Data

Let’s examine the benefits of high-quality data in one area: marketing. Imagine you have a purchased list of 10,000 emails, names, phone numbers, businesses, and addresses. Then, imagine that 20% of that list is inaccurate (which is in line with the chart data from RingLead above). That means that 20% of your list has either the wrong email, name, phone number, etc. How does that translate into numbers?

Well, look at it like this: if you run a Facebook ad campaign targeting the names on this list, the cost will be up to 20% higher than it should be – because of those false name entries. If you do physical mail, up to 20% of your letters won’t even reach their recipients. With phone calls, your sales reps will be wasting more of their time on wrong numbers or numbers that won’t pick up. With emails, you might think that it’s no big deal, but your open rates and other metrics will be distorted based on your “dirty” list. All of these costs add up quickly, contributing to the $600 billion annual data problem that U.S. companies face.
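To make the math concrete, here is a quick back-of-the-envelope sketch. The per-letter cost is a hypothetical figure chosen purely for illustration:

```python
# Back-of-the-envelope cost of a 20% inaccurate contact list.
# All campaign figures below are hypothetical, for illustration only.

list_size = 10_000
inaccuracy_rate = 0.20   # share of records with a bad field

cost_per_letter = 0.55   # assumed cost per physical letter, in USD
total_mail_spend = list_size * cost_per_letter
wasted_spend = total_mail_spend * inaccuracy_rate

print(f"Total direct-mail spend: ${total_mail_spend:,.2f}")
print(f"Spend wasted on bad records: ${wasted_spend:,.2f}")
# -> $1,100.00 of a $5,500.00 mailing buys nothing
```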

However, let’s flip the situation: if your data quality is on point, then you’ll be able to:

  • Get Facebook leads at lower costs than your competition
  • Get more ROI from each direct mail, phone call, or email campaign you execute
  • Show C-suite executives better results, making it more likely your ad spend will get increased

All in all, in today’s digital world, having high-quality data is what makes the difference between the leaders of the pack and the “also-rans”.


The 5 Pillars of Data Quality Management

Now that you understand the importance of high-quality data and want to take action to solidify your data foundation, let’s take a look at the 5 essential pillars of data quality management.

Pillar #1 – The people


Technology is only as efficient as the individuals who implement it. We may function within a technologically advanced business society, but human oversight and process implementation have not (yet) been rendered obsolete. Therefore, there are several DQM roles that need to be filled, including:

DQM Program Manager: The program manager role should be filled by a high-level leader who accepts the responsibility of general oversight for business intelligence initiatives. He/she should also oversee the management of the daily activities involving data scope, project budget, and program implementation. The program manager should lead the vision for quality data and ROI.

Organization Change Manager: The change manager does exactly what the title suggests: organizing. He/she assists the organization by providing clarity and insight into advanced data technology solutions. As quality issues are often highlighted with the use of dashboard software, the change manager plays an important role in the visualization of data quality.

Business/Data Analyst: The business analyst is all about the “meat and potatoes” of the business. This individual defines the data quality needs from an organizational perspective. These needs are then quantified into data models for acquisition and delivery. This person (or group of individuals) ensures that the theory behind data quality is communicated to the development team.

Pillar #2 – Data profiling


Data profiling is an essential process in the data quality management lifecycle. It involves:

  1. Reviewing data in detail
  2. Comparing and contrasting the data to its own metadata
  3. Running statistical models
  4. Reporting the quality of the data

This process is initiated for the purpose of developing insight into existing data, with the goal of comparing it to data quality goals. It helps businesses develop a starting point in the DQM process and sets the standard for how to improve data quality. The data quality metrics of completeness and accuracy are imperative to this step: accuracy checks look for disproportionate numbers and outliers, while completeness checks define the data body and ensure that all data points are whole.
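In practice, a first profiling pass can be only a few lines of code. Here is a minimal sketch using pandas; the file name and column names are hypothetical stand-ins for your own data:

```python
import pandas as pd

# Load a hypothetical customer table for profiling.
df = pd.read_csv("customers.csv")

# Completeness: share of non-null values per column.
completeness = df.notna().mean()
print(completeness)

# Accuracy screen: flag disproportionate numbers with a simple z-score.
ages = df["age"]
z_scores = (ages - ages.mean()) / ages.std()
outliers = df[z_scores.abs() > 3]
print(f"{len(outliers)} rows with improbable ages")
```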

Pillar #3 – Defining data quality


The third pillar of data quality management is quality itself. “Quality rules” should be created and defined based on business goals and requirements. These are the business/technical rules with which data must comply in order to be considered viable.

Business requirements are likely to take a front seat in this pillar, as the critical data elements depend on the industry you operate in. The development of quality rules is essential to the success of any DQM process, as the rules will detect and prevent compromised data from infecting the health of the whole set.

Much like antibodies detecting and neutralizing viruses within our bodies, data quality rules will correct inconsistencies among valuable data. When teamed together with business intelligence software, data quality rules can be key in predicting trends and reporting analytics.
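As a rough sketch of what such rules can look like in code, here is a set of declarative checks applied to a single record. The rule names, fields, and thresholds are illustrative, not a prescribed rule set:

```python
import re

# Each rule is a named predicate that a record must satisfy.
RULES = {
    "email_format": lambda r: re.match(r"[^@\s]+@[^@\s]+\.[^@\s]+$",
                                       r.get("email") or "") is not None,
    "age_in_range": lambda r: 0 < (r.get("age") or 0) < 120,
    "id_present":   lambda r: bool(r.get("customer_id")),
}

def violations(record: dict) -> list[str]:
    """Return the names of all rules the record fails."""
    return [name for name, check in RULES.items() if not check(record)]

record = {"customer_id": "C-1042", "email": "jane@example.com", "age": 37}
print(violations(record))  # -> [] for a clean record
```

Keeping rules declarative like this makes them easy to review with business analysts and to extend as requirements evolve.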

Pillar #4 – Data reporting

DQM reporting is the process of removing and recording all compromised data and data quality exceptions. It should be designed to follow naturally from data rule enforcement. Once exceptions have been identified and captured, they should be aggregated so that data quality patterns can be identified.

The captured data points should be modeled and defined based on specific characteristics (e.g., by rule, by date, by source, etc.). Once this data is tallied, it can be applied to a business intelligence solution to report on the state of data quality and the exceptions that exist within a dashboard. If possible, automated and “on-demand” technology solutions should be implemented as well, so dashboard insights can appear in real time.
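As a minimal sketch of that aggregation step, assuming exceptions have already been captured as records tagged with a rule, source, and date:

```python
from collections import Counter

# Hypothetical captured exceptions, one record per data quality violation.
exceptions = [
    {"rule": "email_format", "source": "web_form",   "date": "2018-03-01"},
    {"rule": "email_format", "source": "crm_import", "date": "2018-03-01"},
    {"rule": "age_in_range", "source": "web_form",   "date": "2018-03-02"},
]

by_rule = Counter(e["rule"] for e in exceptions)      # which rules fire most
by_source = Counter(e["source"] for e in exceptions)  # which paths are dirtiest

print(by_rule)
print(by_source)
```

Tallies like these are exactly what a dashboard can consume to show the state of data quality over time.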

Reporting and monitoring are the crux of data quality management ROI, as they provide visibility into the state of data at any moment in real time. Because they allow businesses to identify the location and source of data exceptions, teams of data specialists can begin to strategize remediation processes.

Knowledge of where to begin engaging in proactive data adjustments will help businesses move one step closer to recovering their part of the $600 billion lost each year to low-quality data.

Pillar #5 – Data repair


Data repair is the two-step process of determining:

  1. The best way to remediate data
  2. The most efficient manner in which to implement the change

The most important aspect of data remediation is the performance of a “root cause” examination to determine why, where, and how the data defect originated. Once this examination has been completed, the remediation plan should begin.

Data processes that depended upon the previously defective data will likely need to be re-initiated, especially if their functioning was at risk or compromised by the defective data. These processes could include reports, campaigns, or financial documentation.

This is also the point where data quality rules should be reviewed again. The review process will help determine if the rules need to be adjusted or updated, and it will help begin the process of data evolution. Once data is deemed high-quality, critical business processes and functions should run more efficiently and accurately, with a higher ROI and lower costs.
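As a rough sketch of the two-step flow, assuming a hypothetical root cause (ZIP codes truncated by a spreadsheet export) and hypothetical downstream processes:

```python
# Map each diagnosed root cause to a repair function (illustrative only).
FIXES = {
    "truncated_zip": lambda rec: {**rec, "zip": rec["zip"].zfill(5)},
}

# Processes that consumed the defective data and must be re-run.
DEPENDENT_PROCESSES = ["monthly_revenue_report", "spring_mail_campaign"]

def repair(records, root_cause):
    fix = FIXES[root_cause]                  # step 1: remediate the data
    repaired = [fix(r) for r in records]
    for process in DEPENDENT_PROCESSES:      # step 2: re-initiate consumers
        print(f"re-initiating {process}")
    return repaired

print(repair([{"zip": "7301"}], "truncated_zip"))  # -> [{'zip': '07301'}]
```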


What About Data Quality Metrics?


Data quality metrics are key in determining the overall health of an organization. As we have demonstrated, low-quality data can impact productivity, the bottom line, and overall ROI. In order for organizations to follow the general pattern of the 5 Pillars of DQM, data metrics must be of high quality and clearly defined.

While data analysis can be quite complex, there are a few basic measurements that all key DQM stakeholders should be aware of (a short code sketch showing how to compute them follows the list). They include:

  • Accuracy: This metric refers to business transactions or status changes as they happen in real time. Accuracy should be measured through source documentation (i.e., from the business interactions), but if that is not available, then through confirmation techniques of an independent nature. It will indicate whether data is free of significant errors.
  • Completeness: As a data quality metric, completeness means determining whether or not each data entry is a “full” data entry. All available data entry fields must be complete, and sets of data records should not be missing any pertinent information. Completeness will indicate if there is enough information to draw conclusions.
  • Integrity: Also known as data validation, integrity refers to the structural testing of data to ensure that it complies with procedures. This means there are no unintended data errors and that each value corresponds to its appropriate designation (e.g., date, month, and year).
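Here is the promised sketch. Completeness and integrity can be computed directly from a dataset, while accuracy requires verification against source documents, so it appears only as a comment. The sample table is made up:

```python
import pandas as pd

# A tiny, made-up orders table with one missing value and one bad date.
df = pd.DataFrame({
    "order_date": ["2018-03-01", "2018-03-02", "not a date"],
    "amount": [99.0, None, 42.5],
})

# Completeness: share of rows with every field populated.
completeness = df.notna().all(axis=1).mean()

# Integrity: does each value parse as its designated type (here, a date)?
parsed = pd.to_datetime(df["order_date"], errors="coerce")
integrity = parsed.notna().mean()

print(f"completeness: {completeness:.0%}, date integrity: {integrity:.0%}")
# Accuracy would be checked against source documentation, outside the dataset.
```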

3 Sources of Low-Quality Data

[Illustration: the various processes affecting data quality. Image source: TechTarget]

We’ve just gone through how to clean data that may not be accurate. However, as the saying goes, an ounce of prevention is worth a pound of cure. With that in mind, here are some of the origins of low-quality data, so that you can be mindful about keeping your records accurate as time goes on. Remember: keeping your data high-quality isn’t a one-time job. It’s a continual process that never ends.

Source #1: Mergers and acquisitions

When two companies join together in some way, their data tags along into this new working relationship. However, just like when two people with children from prior marriages form a new relationship, things can sometimes get messy.

For example, it’s very possible, and even probable, that your two companies use entirely different data systems. Maybe one of you has a legacy database, while the other runs something more modern. Or you may use different methods of collecting data. It’s even possible that one partner in the relationship simply has a lot of incorrect data.

Data quality expert Steve Hoberman gives an example of mergers causing difficulty in his article “Thirteen causes of enterprise data quality problems.” He writes that when the two databases disagree with each other, you must set up a winner-loser matrix that states which database’s entries are to be regarded as “true”. As you might expect, these matrices can get exceedingly complex: at some point, “the winner-loser matrix is so complex, that nobody really understands what is going on”, he says. Indeed, programmers can end up arguing with business analysts about trivialities, and, as he quips, “consumption of antidepressants is on the rise”.
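In its simplest form, a winner-loser matrix is just a mapping from each field to the database that “wins” when matched records disagree. The assignments below are hypothetical:

```python
# Which company's database wins for each field (illustrative assignments).
WINNER_BY_FIELD = {
    "email":   "company_a",  # say A's CRM is more recently maintained
    "address": "company_b",  # say B verifies addresses on entry
    "phone":   "company_a",
}

def merge_record(a: dict, b: dict) -> dict:
    """Resolve one matched pair of records field by field."""
    return {
        field: (a.get(field) if winner == "company_a" else b.get(field))
        for field, winner in WINNER_BY_FIELD.items()
    }

a = {"email": "jo@a.com", "address": "1 Old St",  "phone": "555-0100"}
b = {"email": "jo@b.com", "address": "2 New Ave", "phone": None}
print(merge_record(a, b))
# -> {'email': 'jo@a.com', 'address': '2 New Ave', 'phone': '555-0100'}
```

Real matrices add per-record conditions (freshness, source reliability, null handling), which is exactly how they can grow too complex for anyone to follow.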

Action Step: In the event of a planned merger or acquisition, make sure to bring the heads of IT to the table so that these kinds of issues can be planned for in advance, before any deals are signed.

Source #2: Transitioning from legacy systems

To a non-technical user, it may be hard to understand the difficulties inherent in switching from one data system to another. Intuitively, a layman would expect that things are “set up” so that transitions are easy and painless for the end user. This is definitely not in line with reality.

Many companies use so-called “legacy systems” for their databases that are decades old, and when the inevitable transition time comes, there’s a whole host of problems to deal with. This is due to the technical nature of a data system itself. Every data system has three parts:

  1. The database (the data itself)
  2. The “business rules” (the ways in which the data is interpreted)
  3. The user interface (the ways in which the data is presented)

These distinct parts can create distinct challenges during data conversion from one system to another. As Steve Hoberman writes, the data structure tends to be the center of attention during the conversion. But this is a failing approach, as the business rule layers of the source and destination are very different. The converted data ends up inaccurate for practical purposes even though it remains technically correct.
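Here is a toy illustration of that failure mode, with made-up status codes. The raw value survives conversion intact, but its meaning does not:

```python
# The same stored code means different things under each system's
# business rules (both code tables are invented for this example).
LEGACY_STATUS = {"A": "active", "H": "on hold", "C": "closed"}
NEW_STATUS    = {"A": "archived", "O": "open",  "C": "cancelled"}

raw = "A"  # structurally valid in both systems
print(f"legacy meaning: {LEGACY_STATUS[raw]}, after naive conversion: {NEW_STATUS[raw]}")
# Technically correct data, practically wrong: 'active' becomes 'archived'.
```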

Action step: When transitioning from a legacy system to a newer one, it’s not enough that your transition team be experts in one system or the other. They need to be experts in both to ensure that the transition goes smoothly.

Source #3: User error

This is a problem that will probably never go away, due to the fact that humans will always be involved with data entry, and humans make mistakes. People mistype things on a regular basis, and this must be accounted for. In his TechTarget post, Steve Hoberman relates a story of how his team was in charge of “cleansing” a database and correcting all of the wrong entries.

You would think that data cleansing experts would be infallible, right? Well, that wasn’t the case. As Mr. Hoberman states, “still 3% of the corrections were entered incorrectly. This was in a project where data quality was the primary objective!”

Action step: Make all of the forms that your company uses as easy and straightforward to fill out as possible. While this won’t prevent user error entirely, it will at least mitigate it.
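For instance, a form can normalize and validate input at entry time, soaking up common typos before they ever reach the database. This sketch assumes US 10-digit phone numbers; the rule is purely illustrative:

```python
import re

def clean_phone(raw: str) -> str | None:
    """Normalize a phone number; return None if it can't be salvaged."""
    digits = re.sub(r"\D", "", raw)  # strip spaces, dashes, parentheses
    return digits if len(digits) == 10 else None  # assumed US format

print(clean_phone("(555) 013-4467"))  # -> '5550134467'
print(clean_phone("555-0134"))        # -> None: too short, reject at entry
```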


To Conclude…

In this article, we examined what data quality management is through several topics:

  1. The costs of bad data quality
  2. The benefits of data with integrity
  3. The five pillars of data quality management
  4. How to measure the quality of data
  5. The three sources of data error

We hope this post has given you the information and tools you need to keep your data high-quality. We also hope you agree that data quality management is a crucial process for keeping your organization competitive in today’s digital marketplace. While it may seem to be a real pain to maintain high-quality data, consider that other companies also feel like data quality management is a huge hassle. So if your company is the one that takes the pains to make its data sound, you’ll automatically gain a competitive advantage in your market. As the saying goes, “if it were easy, everyone would be doing it.”

Data quality management is the precondition for creating efficient business dashboards that will help your decision-making and bring your business forward. To start building your own company dashboards and benefit from one of the best solutions on the market, start your 14-day free trial here!

 
