What are the norms, standards and frameworks that help us manage data quality in Big data environments?
Keywords: big data, ISO 8000, data quality.
Table of contents:
- A world of data
- What is Big data?
- ISO/IEC 20547 series of standards
- Data Governance
- Data quality management
- The ISO/IEC 8000 standard
- Other frameworks and standards
- Bibliography
A world of data
-
By 2025, every connected person in the world will have a digital interaction about 4,900 times per day, leaving a digital footprint each time - that's one interaction every 18 seconds(1).
"The data-driven world is going to be always on, always keeping track of everything, always monitoring, always listening and always watching - because it's going to be always learning."
Source: The Digitization of the World - From Edge to Core. IDC White Paper. Doc# US44413318. November 2018.
-
Market statistics suggest that the volume of data generated worldwide is growing by around 40% a year. According to the global data and business intelligence platform Statista, more than 180 zettabytes will be created by 2025 https://www.statista.com/statistics/871513/worldwide-data-created/.
-
Statista revised these figures upwards after the COVID-19 pandemic, which shifted much of working and social life into the virtual realm and so increased the amount of data generated.
-
IDC predicted in 2018 that the Global Datasphere will grow from 33 Zettabytes in 2018 to 175 Zettabytes in 2025.
Source: The Digitization of the World - From Edge to Core. IDC White Paper. Doc# US44413318. November 2018.
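As a rough arithmetic check, the annual growth rate implied by the two IDC figures above can be worked out with the compound growth formula. The sketch below is illustrative only: the function names are hypothetical, and it assumes steady compound growth between 2018 and 2025, which the IDC paper does not necessarily claim.

```python
def implied_annual_growth(start_value: float, end_value: float, years: int) -> float:
    """Return the constant yearly growth rate that turns start_value into end_value."""
    if start_value <= 0 or end_value <= 0 or years <= 0:
        raise ValueError("values and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


def project(start_value: float, rate: float, years: int) -> float:
    """Project a value forward assuming the same growth rate every year."""
    return start_value * (1 + rate) ** years


# IDC figures quoted above: 33 ZB in 2018, 175 ZB in 2025.
idc_rate = implied_annual_growth(33, 175, 2025 - 2018)
print(f"Implied annual growth: {idc_rate:.1%}")  # roughly 27% per year

# Round trip: applying that rate for 7 years should give back about 175 ZB.
print(f"Projected 2025 volume: {project(33, idc_rate, 7):.0f} ZB")
```

Under this assumption the IDC forecast corresponds to growth of roughly 27% per year. Other estimates may use different baselines and definitions, so the figures are not directly comparable.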
-
In this context, we talk about Big Data and we are faced with the problem of managing the quality of this data in our organisations.
What is Big data?
-
According to ISO/IEC DIS 20546(en) Information technology - Big data - Overview and vocabulary, the definition of Big data is as follows:
"extensive data sets, primarily in the data characteristics of volume, variety, velocity and/or variability, that require scalable technology for efficient storage, manipulation, management and analysis."
-
The standard adds a note to the entry, however, specifying that this term "is commonly used in many different ways, for example, as the name for scalable technology used to manage large big data datasets".
Source: https://www.iso.org/obp/ui/#iso:std:iso-iec:20546:dis:ed-1:v1:en.
-
Similarly, the consultancy firm Gartner defines it as:
"High-volume, high-velocity and/or high-variety data assets that demand innovative and cost-effective ways of processing information that enable better insight, decision-making and process automation."
Source: Gartner's Glossary. https://www.gartner.com/en/information-technology/glossary/big-data.
-
Among the benefits brought by Big data are:
- Cost reduction.
- Discovery of more efficient ways of doing business.
- Better decision making.
- Creation of new products and services that customers want and need.
Source: What is Big Data? Oracle. https://www.oracle.com/big-data/what-is-big-data.html.

ISO/IEC 20547 series of standards
-
The ISO/IEC 20547 series is intended to provide users with a standardised approach to developing and implementing Big Data architectures and to provide references for approaches.
-
The vocabulary and common concepts are described in ISO/IEC 20546.
Source: Getting big on data https://www.iso.org/news/ref2578.html.
-
Meanwhile, the NIST Big Data Public Working Group (NBD-PWG) https://www.nist.gov/itl/big-data-nist is working to build consensus on the important and fundamental concepts related to Big Data.
-
To this end, it has published the NIST Big Data Interoperability Framework, which consists of nine documents ranging from Volume 1, Definitions. Version 3.0 (https://nvlpubs.nist.gov/nistpubs/SpecialPublications/NIST.SP.1500-1r2.pdf) to Volume 9: Modernization and Adoption (https://nvlpubs.nist.gov/nistpubs/SpecialPublications/NIST.SP.1500-10r1.pdf).
-
Other standards that can be considered when implementing a Big Data solution are:
- ISO 55000 | Asset Management
- ISO 9001 | Quality Management
- ISO/IEC 20000 | IT Service Management
- ISO 31000 | Risk Management
- ISO/IEC 27000 | Information Security Management
Data Governance
- Data governance is a set of principles, standards and practices that are applied end-to-end across the data lifecycle (collection, storage, use, protection, archiving and disposal) to ensure that data is reliable and consistent.
-
To do this, organisations establish organisational structures, appoint data stewards, enforce rules and policies, document processes, and record common metrics and business terms.
Source: https://www.informatica.com/blogs/data-governance-vs-data-management-whats-the-difference.html.
Data quality management
-
In this scenario, the International Organization for Standardization (ISO) has developed a series of technical standards covering Data Governance, Data Quality Management and the quality of data as part of software products, while also taking Data Security and Data Privacy into account.
-
The standards related to data management and data quality are:
- ISO 8000 | Data Quality Management
- ISO/IEC 33000 | SPICE - Software Process Improvement and Capability dEtermination
- ISO/IEC 38505 | Governance of IT - Governance of data
- ISO/IEC 25012 | Data Quality Model*
- ISO/IEC 11179 | Metadata Management
*ISO/IEC 25012 lists the following data quality characteristics: accuracy, completeness, consistency, credibility, currentness, accessibility, compliance, confidentiality, efficiency, precision, traceability, understandability, availability, portability and recoverability.
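To make one of these characteristics concrete, the sketch below shows one possible way to score completeness for a small set of records. It is an illustration under stated assumptions, not a measure taken from the standard: the `completeness` function, the rule that `None` and empty strings count as missing, and the sample records are all hypothetical.

```python
from typing import Any, Mapping, Sequence


def completeness(records: Sequence[Mapping[str, Any]], required_fields: Sequence[str]) -> float:
    """Share of required field values that are present.

    Assumption for this sketch: a value is "missing" when the key is absent,
    or when the value is None or an empty string.
    """
    total = len(records) * len(required_fields)
    if total == 0:
        return 1.0  # nothing was required, so nothing is missing
    present = sum(
        1
        for record in records
        for field in required_fields
        if record.get(field) not in (None, "")
    )
    return present / total


# Hypothetical sample data: two of the twelve required values are missing.
sample_records = [
    {"id": 1, "email": "ana@example.com", "country": "ES"},
    {"id": 2, "email": "", "country": "FR"},
    {"id": 3, "email": "li@example.com", "country": None},
    {"id": 4, "email": "sam@example.com", "country": "DE"},
]

score = completeness(sample_records, ["id", "email", "country"])
print(f"Completeness: {score:.2f}")  # 10 of 12 values present
```

In practice each characteristic needs its own agreed definition and threshold. A score like this is only useful once the organisation has decided which fields are required and what counts as a missing value.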
- The portal Datos.gob.es https://datos.gob.es/es/documentacion/normas-tecnicas-para-un-correcto-gobierno-del-dato provides valuable information on the subject:
- An article on the different technical standards to consider when developing effective data governance.
- The report "Standards for the data economy"(2).
- An infographic with the technical standards for proper data governance published by the Spanish Association for Standardisation (UNE).


