Your data scientists just spent three weeks building a customer churn prediction model. The accuracy looks great in testing. But when you deploy it, the predictions are wildly off. After days of debugging, you discover the root cause: your customer interaction timestamps are inconsistent across systems. Some are in UTC, others in local time, and a concerning number are NULL. Sound familiar?
The harsh reality is that most organizations are sitting on a foundation of fractured data. Your data warehouse is likely ingesting data from dozens of sources — each with its own data model, update frequency, and quality standards. Your business users have probably created countless spreadsheet exports that now live their own separate lives. And those “temporary” data processing scripts from three years ago? They’re now mission-critical, running in production with minimal documentation.
This isn’t just a technical headache. When your CEO asks why last quarter’s customer acquisition cost varies by 40% between the Marketing and Finance reports, both teams can defend their numbers. They’re both pulling from “reliable” sources, using “correct” definitions, and following “established” processes. Yet somehow, you’re still debating basic metrics in quarterly business reviews.
In more than 20 years of providing data and analytics solutions, our experts have seen firsthand how poor data quality hampers organizational success. In 2023, Forrester found that among data and analytics workers who reported poor data quality at their organization, more than 25% estimated that their business lost more than $5 million annually as a result, with 7% putting the cost at $25 million or more.
We’ve written this article in our quest to help organizations worldwide improve data integrity. Read on to learn about four battle-tested steps that have helped organizations transform their data from a liability into a strategic asset. You’ll also learn what to include in your data stack and how to overcome common challenges.
Understanding data quality improvement
Every data practitioner knows that successful data quality initiatives start with understanding what you’re trying to improve. Garbage in, garbage out isn’t just a saying – it’s the difference between trusted insights and costly mistakes. Before diving into how to improve data quality, let’s examine what “quality” means for your specific use case.
DAMA (UK) has defined six dimensions of data quality that cover most situations: accuracy, completeness, uniqueness, consistency, timeliness, and validity.
Think of the dimensions of data quality as flavors to pick and choose from to suit the end user, not as a checklist of criteria for all data to meet.
In other words, different industries and internal functions have different data quality requirements. For example, a rough estimate of a team’s travel expenses may be good enough for a manager’s budget forecast but not accurate enough for an audit. Likewise, last year’s stock market data is not timely enough for investment decisions but may be good enough to train an AI model for economic forecasting.
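To make the dimensions concrete, here is a minimal sketch that scores three of them (completeness, uniqueness, and validity) on a handful of records and then compares the scores against different "good enough" thresholds. The records, field names, and threshold values are all invented for illustration, and real scoring rules would depend on your own data.

```python
from datetime import datetime

# Hypothetical sample records; field names are illustrative only.
records = [
    {"customer_id": "C1", "email": "ana@example.com", "signup_date": "2024-01-15"},
    {"customer_id": "C2", "email": None, "signup_date": "2024-02-30"},  # no email, impossible date
    {"customer_id": "C2", "email": "bo@example.com", "signup_date": "2024-03-01"},  # duplicate ID
    {"customer_id": "C4", "email": "cy@example.com", "signup_date": "2024-03-09"},
]


def completeness(rows, field):
    """Share of rows where the field is populated."""
    return sum(1 for r in rows if r.get(field) not in (None, "")) / len(rows)


def uniqueness(rows, field):
    """Ratio of distinct values to total rows for the field."""
    values = [r[field] for r in rows]
    return len(set(values)) / len(values)


def is_valid_date(text):
    """True if text is a real calendar date in YYYY-MM-DD format."""
    try:
        datetime.strptime(text, "%Y-%m-%d")
        return True
    except (TypeError, ValueError):
        return False


def validity(rows, field, check):
    """Share of rows whose field value passes the given check."""
    return sum(1 for r in rows if check(r.get(field))) / len(rows)


scores = {
    "completeness(email)": completeness(records, "email"),
    "uniqueness(customer_id)": uniqueness(records, "customer_id"),
    "validity(signup_date)": validity(records, "signup_date", is_valid_date),
}

# The same scores can pass for one consumer and fail for another.
# These thresholds are made-up examples of "good enough" per use case.
requirements = {"budget forecast": 0.70, "financial audit": 0.99}
verdicts = {}
for use_case, minimum in requirements.items():
    failing = [name for name, score in scores.items() if score < minimum]
    verdicts[use_case] = "ok" if not failing else "below threshold"
    print(use_case, "->", verdicts[use_case], failing)
```

The point of the second half is that quality is judged per consumer: the same dataset is acceptable for the forecast and unacceptable for the audit.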
4 practical steps to improve data quality
Improving data quality is critical to your company’s data maturity journey.
Data maturity is the dominant framework for assessing an organization’s ability to make the best use of data.
A data-naïve organization may only use data passively, store it in silos, and have little in the way of data governance. At the other end of the spectrum, a highly data-mature organization will have a deeply ingrained data culture backed by a unified data architecture, enterprise data management, and comprehensive data governance tools, standards, and practices.
Gartner has outlined 12 actions to improve data quality along your data maturity journey.
[Figure: Gartner’s 12 actions to improve data quality. Source: Gartner]
We’ve distilled this set of actions into four practical steps for improving data quality:
Step 1. Start with targeted data quality principles
There is no cookie-cutter approach to improving data quality because data quality means something different to every organization. You’ll need to choose data quality pillars that make the most sense for your stakeholders.
Depending on your end users’ needs and tolerances, you might focus your data quality initiative on accuracy, or you might decide accuracy is already sufficient and prioritize other dimensions instead. Knowing which data quality dimensions to focus on will help you allocate resources effectively.
Actions to take:
- Establish exactly how data quality improvements will impact business success
- Define “good enough” data quality for your stakeholders
- Set standards for organizational data quality
Step 2. Define a fit-for-purpose data quality strategy
Once you’ve established your data quality pillars, it’s time to build an improvement strategy to ensure data quality and integrity in practice.
One unintuitive aspect of data maturity is letting go of the notion that data quality is binary: “good” or “bad.” Organizations generally start with a truth-based model, in which the goal is total correctness. However, a more mature approach to data quality leans into the greater business value of “good enough.”
If an end user doesn’t need perfect data, pursuing perfection will only create a bottleneck. The answer is shifting to a trust-based model that gives users more immediate and contextualized access to usable, if imperfect, data.
Actions to take:
- Profile your data regularly to detect quality problems
- Build dashboards to monitor the quality of high-priority datasets
- Transition from a truth-based model to a trust-based model
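As one possible starting point for the profiling and monitoring actions above, the sketch below computes a few per-column statistics and then labels each column rather than rejecting the whole dataset, in the spirit of a trust-based model. The `orders` data, the `annotate` helper, and the 25% null-rate tolerance are assumptions made for the example, not a recommended standard.

```python
from collections import Counter


def profile(rows):
    """Return simple per-column statistics for a list of dict records."""
    columns = sorted({key for row in rows for key in row})
    report = {}
    for col in columns:
        values = [row.get(col) for row in rows]
        present = [v for v in values if v is not None]
        report[col] = {
            "null_rate": 1 - len(present) / len(values),
            "distinct": len(set(present)),
            "most_common": Counter(present).most_common(1)[0][0] if present else None,
        }
    return report


def annotate(report, max_null_rate):
    """Trust-based labelling: flag columns instead of blocking the dataset."""
    return {
        col: "usable" if stats["null_rate"] <= max_null_rate else "use with caution"
        for col, stats in report.items()
    }


# Hypothetical order data with gaps in two columns.
orders = [
    {"order_id": 1, "region": "EU", "discount": None},
    {"order_id": 2, "region": "EU", "discount": 0.1},
    {"order_id": 3, "region": "US", "discount": None},
    {"order_id": 4, "region": None, "discount": None},
]

report = profile(orders)
labels = annotate(report, max_null_rate=0.25)  # tolerance chosen for illustration
for col in report:
    print(col, report[col], labels[col])
```

A dashboard could chart `null_rate` and `distinct` over time, while the labels give users immediate context about which columns they can rely on.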
Step 3. Assign accountability for data quality improvement
While data quality initiatives typically get off the ground thanks to high-level data quality champions, those aren’t the people who will do the work. Establish clear ownership of granular data quality responsibilities to ensure data quality plans become a reality.
An accountability structure makes it easier to coordinate efforts, track progress against data quality improvement targets, and verify that improving data quality is having an impact on business KPIs.
Actions to take:
- Integrate data quality into every data and analytics governance board meeting
- Assign specific roles and responsibilities for data quality tasks
- Set up cross-functional working groups focused on data quality
Step 4. Make improving data quality part of your culture
At the most mature organizations, data quality improvement becomes part of the organization’s DNA. These organizations recognize that data quality initiatives aren’t time-bounded projects but ongoing practices embedded into daily operations and aligned with business objectives.
At an organization with high data maturity, data literacy is ubiquitous. Interdepartmental communications emphasize the importance of data quality. Internal communications include data success stories to celebrate colleagues who improve data quality and reinforce data quality best practices.
Actions to take:
- Include a data quality review as a step in your release management process
- Regularly communicate with business departments about the benefits of data quality
- Keep up with data quality best practices among external organizations and peers
Building a comprehensive data stack: A guide to ensuring data quality
Data management experts at Intellias believe that data observability is the key to a data stack that supports your data quality strategy. With end-to-end visibility into enterprise-wide data, you can proactively monitor data collection and data management processes. That way, you can catch any issues during data capture and find and resolve problems that arise later.
Core elements of a comprehensive tech stack for improving data quality include:
- Data integration tools to align data ingested from different sources (such as a CRM and ERP). Integration makes sure data is consistent across the enterprise and prevents silos.
- Data observability platforms to provide real-time monitoring of your data lifecycle, catching issues like missing data or errors as they happen. These platforms also track data lineage, so you’ll always know where your data has been and how it’s changed.
- Data automation tools that use machine learning techniques to handle data validation, cleansing, and transformation, reducing manual effort.
- Data governance tools to enforce rules and standards so your data is easily accessed by data consumers and protected from unauthorized access or damage.
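To illustrate the kind of alignment an integration layer performs, here is a small sketch that reconciles timestamps from two sources into UTC. It assumes a CRM that exports naive local timestamps at a fixed UTC+2 offset and an ERP that exports ISO 8601 strings with an explicit offset; both formats, the offset, and the field names are hypothetical. A fixed offset also ignores daylight saving time, which a production pipeline would need to handle.

```python
from datetime import datetime, timedelta, timezone

# Assumption: CRM timestamps are naive local times at a fixed UTC+2 offset.
CRM_OFFSET = timezone(timedelta(hours=2))


def normalize_crm(raw):
    """Parse a naive CRM timestamp and convert it to UTC; missing stays None."""
    if not raw:
        return None
    local = datetime.strptime(raw, "%Y-%m-%d %H:%M:%S").replace(tzinfo=CRM_OFFSET)
    return local.astimezone(timezone.utc)


def normalize_erp(raw):
    """Parse an offset-aware ERP timestamp and convert it to UTC."""
    if not raw:
        return None
    return datetime.fromisoformat(raw).astimezone(timezone.utc)


crm_rows = [
    {"customer_id": "C1", "last_contact": "2024-05-01 14:30:00"},
    {"customer_id": "C2", "last_contact": None},
]
erp_rows = [{"customer_id": "C1", "last_invoice": "2024-05-01T12:30:00+00:00"}]

# Merge both sources into one record per customer, keyed by customer_id.
unified = {}
for row in crm_rows:
    unified.setdefault(row["customer_id"], {})["last_contact_utc"] = normalize_crm(row["last_contact"])
for row in erp_rows:
    unified.setdefault(row["customer_id"], {})["last_invoice_utc"] = normalize_erp(row["last_invoice"])

for customer_id, fields in unified.items():
    print(customer_id, fields)
```

Missing values are kept as `None` rather than guessed, so downstream completeness checks can still see them.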
Look for tools with robust data version control features, which will help keep your data safe and organized. These features allow you to track changes, test updates without affecting live data, and quickly revert to a previous version of your database if needed. If data gets corrupted or if changes introduce errors, version control can help your team recover quickly and maintain consistent, high-quality data.
Improve data quality by moving logic upstream in your pipeline
One of the most efficient and effective ways to improve data quality is to build your data management system to address issues early in the pipeline, before errors or omissions flow into analytics or production environments.
This aligns with the software development Rule of Ten, which holds that at each step of the process, the cost and time to fix a problem increase tenfold.
[Figure: the Rule of Ten. Source: ResearchGate]
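As a rough illustration of the tenfold arithmetic, the snippet below prints the relative cost of fixing the same defect at successive pipeline stages. The stage names and the base cost of 1 are assumptions chosen for the example; the rule itself only states the multiplier.

```python
# Hypothetical pipeline stages, ordered from earliest to latest.
STAGES = ["data entry", "ingestion", "warehouse", "analytics", "production"]


def relative_cost(stage, base_cost=1.0):
    """Cost of a fix at the given stage, growing tenfold per stage."""
    return base_cost * 10 ** STAGES.index(stage)


for stage in STAGES:
    print(f"{stage:<12} {relative_cost(stage):>10,.0f}")
```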
Data engineering services professionals at Intellias can help you build a data management architecture that moves quality checks upstream, closer to the point of data entry or collection. That way, your team can detect errors at the source and prevent data quality issues (and their costs) from compounding.
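A minimal sketch of what an upstream check can look like: each incoming record is validated at ingestion, and failing records are set aside with a reason instead of flowing into analytics. The rules, field names, and currency list here are invented for illustration and would be replaced by your own standards.

```python
ALLOWED_CURRENCIES = {"USD", "EUR", "GBP"}  # illustrative list


def validate(record):
    """Return a list of problems found in one incoming record."""
    problems = []
    if not record.get("customer_id"):
        problems.append("missing customer_id")
    amount = record.get("amount")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not isinstance(amount, (int, float)) or isinstance(amount, bool) or amount < 0:
        problems.append("amount must be a non-negative number")
    if record.get("currency") not in ALLOWED_CURRENCIES:
        problems.append("unknown currency")
    return problems


def ingest(records):
    """Split incoming records into accepted rows and quarantined rows."""
    accepted, quarantined = [], []
    for record in records:
        problems = validate(record)
        if problems:
            quarantined.append({"record": record, "problems": problems})
        else:
            accepted.append(record)
    return accepted, quarantined


incoming = [
    {"customer_id": "C1", "amount": 120.0, "currency": "EUR"},
    {"customer_id": "", "amount": -5, "currency": "EUR"},
    {"customer_id": "C3", "amount": "12", "currency": "XXX"},
]
accepted, quarantined = ingest(incoming)
print(len(accepted), "accepted;", len(quarantined), "quarantined")
for item in quarantined:
    print(item["record"], "->", item["problems"])
```

Keeping the rejected records together with their reasons gives the source team something actionable to fix, rather than silently dropping data.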
Overcoming challenges while improving data quality for your organization
Managing a big IT project is always an uphill battle. If you’re pushing to improve data quality at your organization, you will encounter all the typical barriers to technology projects – and more.
Here are eight specific challenges you may run into (and how to plan for them):
1. Lack of executive buy-in
Leaders don’t always understand the value of data quality efforts. Without leadership support, you’re unlikely to get the resources you need for success.
What to do: Start speaking their language. Gain the support of leadership by clearly connecting the dots between data quality improvements and business outcomes. Highlight benefits like increased operational efficiency, improved decision-making, and stronger regulatory compliance.
2. Inadequate data governance
Data quality initiatives require a solid governance framework to provide structure and accountability. What should you do if the teams managing data don’t follow consistent processes?
What to do: Establish or enhance data governance policies and practices as part of your data quality initiative. Define clear roles, set data quality standards, and implement processes that guarantee consistent data handling. You can ensure long-term success by tying data quality improvements directly to data governance.
3. Scalability issues
As your volume of data grows, manual processes will become too slow and error-prone to keep up. If your plan doesn’t account for automation at scale, it may not be an effective strategy for long.
What to do: Choose modern, scalable tools that can grow with your organization’s data needs. Automate data quality checks and validation processes to make sure you’ll have the capacity to handle them even as data volumes increase.
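One way to keep automated checks workable as volumes grow is to stream rows through a set of rules and keep only aggregate counts, so memory use does not depend on dataset size. The sketch below assumes a hypothetical rule list and a generator standing in for a large feed; dedicated data quality tools offer far richer versions of the same idea.

```python
from collections import Counter

# Each rule is a (name, predicate) pair; the predicate returns True when a row passes.
RULES = [
    ("id_present", lambda row: bool(row.get("id"))),
    ("qty_positive", lambda row: isinstance(row.get("qty"), int) and row["qty"] > 0),
]


def run_checks(rows, rules=RULES):
    """Stream rows once and count failures per rule without storing the rows."""
    failures = Counter()
    total = 0
    for row in rows:
        total += 1
        for name, passes in rules:
            if not passes(row):
                failures[name] += 1
    return total, dict(failures)


def row_source(n):
    """Stand-in for a large feed: yields rows lazily, every 10th one has qty 0."""
    for i in range(1, n + 1):
        yield {"id": i, "qty": 0 if i % 10 == 0 else 1}


total, failures = run_checks(row_source(1000))
print(f"checked {total} rows, failures per rule: {failures}")
```

Adding a new check means appending one entry to `RULES`, which keeps the cost of extending coverage low as requirements change.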
4. Data complexity
Data flows into your organization from diverse systems in a variety of formats. While some data is structured in tables and lists, an increasing amount is unstructured, including text documents, video and audio files, and social media posts. It arrives in different cadences, in batches or in streams. With all that complexity, ensuring the consistency and high quality of data is anything but straightforward.
What to do: Use data integration tools to standardize formats and ensure consistency across systems. Managing data quality from the start helps prevent errors and inconsistencies from escalating into bigger problems as data moves through your pipeline.
5. Time and resource constraints
Data quality improvement initiatives are resource-intensive. Without outside support, it’s easy for these projects to stall – especially for smaller or less experienced teams.
What to do: If your team is stretched too thin to handle a data quality improvement initiative, call on Intellias for data strategy consulting. We’ll give you as much help as you need to bring your data quality initiative to life without sacrificing normal operations. In the long term, you can make the most of your IT resources by prioritizing high-impact data quality tasks and using automation to handle repetitive processes like data cleansing.
6. Resistance to change
If your organization is early in its data maturity journey, employees may see new data quality processes as unnecessary or time-consuming. You could encounter pushback, or uncooperative colleagues could fail to follow new processes.
What to do: Engage key stakeholders early in the process. Show how improving data quality can reduce errors and inefficiencies. Celebrate quick wins along the way to demonstrate the initiative’s value.
7. Inconsistent data standards
Without consistent data standards across departments, data users will encounter confusion and errors. Trouble sharing and using data across the organization reinforces existing data silos.
What to do: Establish clear definitions of terms and standards for data across the business. A data glossary will keep everyone aligned on how data should be labeled, formatted, and interpreted. Regular audits can help to ensure adherence to these definitions, leading to more accurate and reliable data.
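A data glossary can also be kept in machine-readable form so that audits are repeatable. The following sketch assumes a tiny hypothetical glossary with one format pattern per field and reports every value that does not match. It checks format only: a well-formed but nonexistent country code would still pass.

```python
import re

# A tiny, hypothetical data glossary: one agreed definition per field.
GLOSSARY = {
    "country": {"description": "Two-letter uppercase country code", "pattern": r"[A-Z]{2}"},
    "signup_date": {"description": "Signup date as YYYY-MM-DD", "pattern": r"\d{4}-\d{2}-\d{2}"},
}


def audit(rows, glossary=GLOSSARY):
    """Return (row_index, field, value) for every value that breaks the glossary."""
    findings = []
    for index, row in enumerate(rows):
        for field, definition in glossary.items():
            value = row.get(field)
            if value is None or not re.fullmatch(definition["pattern"], str(value)):
                findings.append((index, field, value))
    return findings


# Hypothetical export from one department that follows its own conventions.
marketing_export = [
    {"country": "DE", "signup_date": "2024-04-02"},
    {"country": "Germany", "signup_date": "02/04/2024"},
]
findings = audit(marketing_export)
for index, field, value in findings:
    print(f"row {index}: {field}={value!r} should be: {GLOSSARY[field]['description']}")
```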
8. Legacy IT systems
If your organization is still running on old systems, they may not support modern data quality tools. Outdated tools make it harder to manage data effectively or implement real-time monitoring.
What to do: Where possible, upgrade legacy systems or implement integration solutions that connect them with newer, more advanced tools. Modernizing your business software and implementing cloud governance will help your organization maintain high data quality, even with older infrastructure.
Let Intellias enhance your data quality improvement process
Improving data quality is one of the most essential actions you can take to support your business – but it’s more nuanced than focusing on 100% complete and accurate data. To progress along your data maturity path, you’ll need to reflect on what constitutes “good enough” data for your end users to achieve their business objectives. That knowledge will enable you to select the data quality strategy that best suits your organization’s needs.
Intellias is passionate about helping businesses succeed in their data quality initiatives, because the business benefits are substantial. Without trustworthy data, companies can’t pivot quickly to avoid risks or seize opportunities. But when an organization uses a trust-based model to ensure data quality and integrity, the sky is the limit.
If you’re ready to improve data quality at your organization, reach out to Intellias for expert support that will speed up the process and set you on the right path for continuous improvement.