Data quality describes how well specific data meets requirements in terms of accuracy, uniqueness, consistency, completeness, validity, and timeliness. Data quality is an aspect of data governance.
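Some of these dimensions can be expressed as simple ratios. The following is a minimal sketch in Python, not a standard definition: the sample records, the field names and the very rough e-mail rule are assumptions made for illustration only.

```python
from datetime import date

# Hypothetical sample records; the field names are illustrative assumptions.
records = [
    {"id": 1, "email": "ann@example.com", "updated": date(2024, 1, 10)},
    {"id": 2, "email": None, "updated": date(2024, 1, 12)},
    {"id": 2, "email": "bob@example", "updated": date(2023, 6, 1)},
    {"id": 4, "email": "eve@example.com", "updated": date(2024, 1, 11)},
]


def completeness(rows, field):
    """Share of rows in which `field` is present and not None."""
    return sum(1 for r in rows if r.get(field) is not None) / len(rows)


def uniqueness(rows, field):
    """Number of distinct values of `field` divided by the number of rows."""
    return len({r[field] for r in rows}) / len(rows)


def validity(rows, field, is_valid):
    """Share of non-missing values of `field` that pass the rule `is_valid`."""
    values = [r[field] for r in rows if r.get(field) is not None]
    if not values:
        return 1.0
    return sum(1 for v in values if is_valid(v)) / len(values)


def timeliness(rows, field, today, max_age_days):
    """Share of rows whose date in `field` is at most `max_age_days` old."""
    fresh = sum(1 for r in rows if (today - r[field]).days <= max_age_days)
    return fresh / len(rows)


def looks_like_email(value):
    """Deliberately rough rule: something@domain with a dot in the domain."""
    local, sep, domain = value.partition("@")
    return bool(local) and sep == "@" and "." in domain


report = {
    "completeness(email)": completeness(records, "email"),
    "uniqueness(id)": uniqueness(records, "id"),
    "validity(email)": validity(records, "email", looks_like_email),
    "timeliness(updated)": timeliness(records, "updated", date(2024, 1, 15), 30),
}
```

Each value lies between 0 and 1, so the scores can be compared across data sets. Accuracy and consistency are left out of this sketch because they usually require a trusted reference or a second data set to compare against.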
Metadata is data that provides information about other data and is typically stored in a data catalog. There are different types of metadata, e.g. descriptive, structural, administrative, statistical and legal.
Consistency means that the data format is uniform and correct for each user. A consistent database enables increased confidence in analysis results and simplified execution of database transactions. Inconsistent data, on the other hand, can lead to serious analysis errors and, as a result, incorrect business decisions.
Data management comprises all measures needed to use data as a valuable resource. These measures include the recording, storage, organization and maintenance of data.
Data management software
Data management software is a tool that enables the effective organization of data of different origins and characteristics. Comprehensive management of even large amounts of data includes consolidating the data, putting it into context, making it usable for analysis and reporting, and enabling technologies such as artificial intelligence.
Data governance defines the processes and responsibilities that ensure the quality and security of data. The following measures are typical tasks of data governance: assignment and definition of user rights, task and responsibility management, control and monitoring of data.
Master data management
Master data management is a combination of processes and technologies for synchronizing and harmonizing master data in an organization. It enables the creation of a unified foundation that provides accurate and complete master data to all employees in the organization.
Data quality management
Data quality management encompasses measures and practices that ensure the suitability of data for analysis and decision making and aim for the highest possible level of data quality. The better an organization's data quality, the more accurate its analysis results.
A data dictionary, also known as a metadata repository, is a central location where metadata, including information about data and its origin, format, meaning, and relationships, is stored. The data dictionary ensures data consistency when data is processed in different departments of the company.
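As a rough illustration of how such a dictionary can support consistency, the sketch below stores origin, format and meaning for two fields and checks a record against those entries. The names `FieldEntry`, `DATA_DICTIONARY` and `check_record`, as well as the example fields, are hypothetical and not taken from any particular product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldEntry:
    """One data dictionary entry: format, meaning and origin of a field."""
    name: str
    dtype: type
    meaning: str
    origin: str


# Hypothetical entries for illustration.
DATA_DICTIONARY = {
    "customer_id": FieldEntry("customer_id", int, "Unique customer key", "CRM"),
    "country": FieldEntry("country", str, "Two-letter country code", "CRM"),
}


def check_record(record, dictionary):
    """Return a list of human-readable problems found in `record`."""
    problems = []
    # Every documented field must be present and have the documented type.
    for name, entry in dictionary.items():
        if name not in record:
            problems.append(f"missing field: {name}")
        elif not isinstance(record[name], entry.dtype):
            problems.append(
                f"wrong type for {name}: expected {entry.dtype.__name__}"
            )
    # Fields that nobody has documented are reported as well.
    for name in record:
        if name not in dictionary:
            problems.append(f"undocumented field: {name}")
    return problems


ok = check_record({"customer_id": 7, "country": "DE"}, DATA_DICTIONARY)
bad = check_record({"customer_id": "7", "region": "EU"}, DATA_DICTIONARY)
```

If every department validates its records against the same entries, deviations in format or naming become visible before the data is shared.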
A data repository is a unit in which data sets are collected and stored for a specific purpose. A data repository provides users with a consolidated space in which data, in particular data relevant to ongoing operations, can be stored.
A metadata repository, also known as a data dictionary, is a central location where metadata, including information about data and its origin, format, meaning, and relationships, is stored. The metadata repository ensures data consistency when data is processed in different departments of the company.
Data maintenance includes measures whose goal is to improve database performance. Maintenance measures help to improve speed, optimize capacity, and find and fix errors and hardware failures. Good data maintenance ensures that the data remains continuously accessible and usable.
Data vault (modeling)
A data model describes the arrangement and classification of data and separates structural information from its attributes. The advantages of a clean data model are flexible integration options, easier scaling, the ability to inspect historical data and faster adaptation to changing data landscapes.
Metadata management consists of defining and designing strategies and measures that enable the access, maintenance, analysis and integration of data in the company. The role of metadata management is becoming increasingly important as the amount and variety of data increases.
Big data management
Big data management describes the management, organization and control of large volumes of unstructured and structured data. The central goals are to ensure good data accessibility and a high level of data quality.
A data catalog is an inventory of all data sets in an organization. The catalog helps to understand the origin of the data, its use, quality, relationships and business context. An important application area is the search for data for analytical or business purposes.
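The search use case can be sketched with a few lines of Python. This is only an illustration under assumptions: the `CatalogEntry` fields, the sample entries and the simple substring matching are hypothetical and much simpler than a real catalog.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    """Hypothetical catalog record describing one data set."""
    name: str
    owner: str
    description: str
    tags: set = field(default_factory=set)


def search(catalog, term):
    """Return the names of entries whose name, description or tags
    contain `term`, ignoring case."""
    needle = term.lower()
    hits = []
    for entry in catalog:
        haystack = [entry.name, entry.description, *entry.tags]
        if any(needle in text.lower() for text in haystack):
            hits.append(entry.name)
    return hits


# Illustrative inventory; all names are made up.
CATALOG = [
    CatalogEntry("sales_orders", "sales team",
                 "Orders from the web shop", {"revenue", "orders"}),
    CatalogEntry("crm_customers", "marketing team",
                 "Customer master records", {"customers"}),
    CatalogEntry("web_clicks", "analytics team",
                 "Raw clickstream events", {"behaviour"}),
]

customer_sets = search(CATALOG, "customer")
web_sets = search(CATALOG, "web")
```

Because the owner is stored with each entry, a user who finds a data set this way also knows whom to ask about its origin and quality.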
AI-driven data catalog
An AI-driven data catalog is a data catalog that provides a high level of automation based on artificial intelligence (AI) or machine learning (ML). The catalog helps understand the origin of data, its usage, quality, relationships, and business context. Next-generation data catalogs leverage AI to enable automated metadata tagging, data provenance tracking, and data consumption analysis.
A metadata repository is a location where metadata is stored and organized. There are four different types of data in a metadata repository: historical, current, generic, and integrated.
Data owners are people who are responsible for data sets, their accessibility and maintenance. A data owner thus has a direct influence on the quality of a data catalog.
Data stewards are people who are responsible for data sets within the data catalog. They keep the data sets up to date and maintain the state of the metadata. They act as intermediaries between data owners and users.
Data integration comprises technologies that bring together data from different sources and present it in a combined and harmonized view. A typical application of data integration is data mining.
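The idea of a combined and harmonized view can be sketched as follows. The two sources, their differing field names and the target schema are assumptions chosen for illustration; real integration tools handle far more cases.

```python
# Two hypothetical sources that describe customers with different field names.
crm_rows = [
    {"CustomerID": 1, "Name": "Ann"},
    {"CustomerID": 2, "Name": "Bob"},
]
shop_rows = [
    {"cust": 1, "total_eur": 120.0},
    {"cust": 3, "total_eur": 40.0},
]


def integrate(crm, shop):
    """Map both sources to one schema and combine them by customer id.

    Customers known to only one source are kept; the fields that the
    other source would have supplied are set to None.
    """
    view = {}
    for row in crm:
        view[row["CustomerID"]] = {
            "customer_id": row["CustomerID"],
            "name": row["Name"],
            "total_eur": None,
        }
    for row in shop:
        entry = view.setdefault(
            row["cust"],
            {"customer_id": row["cust"], "name": None, "total_eur": None},
        )
        entry["total_eur"] = row["total_eur"]
    # Sort by key so that the result is deterministic.
    return [view[key] for key in sorted(view)]


combined = integrate(crm_rows, shop_rows)
```

Keeping customers that appear in only one source corresponds to a full outer join. Whether such incomplete rows belong in the harmonized view is a design decision that depends on the analysis.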
Data integration tool
Data integration tools merge data from different sources. They act as a link between source systems and analytics platforms. They usually include optimized native connectors for batch loading from various common data sources.
Data virtualization is a data management technique that allows an application to retrieve and manage data without needing technical details about it. Data virtualization does not create a physical copy of the data, but creates a virtual layer to mask the complexity of the underlying data landscape. This approach offers a fast, flexible alternative to classic integration.
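The difference from a physical copy can be shown with a small sketch. The class `VirtualCustomerView` and the two in-memory "source systems" are hypothetical stand-ins; a real virtualization layer would translate queries for remote databases or services.

```python
# Two hypothetical source systems, represented here by in-memory lists.
crm_system = [{"id": 1, "name": "Ann"}]
billing_system = [{"customer": 1, "open_invoices": 2}]


class VirtualCustomerView:
    """Read-only layer that resolves every query against the live sources."""

    def __init__(self, crm, billing):
        # Only references to the sources are kept; no rows are copied.
        self._crm = crm
        self._billing = billing

    def get(self, customer_id):
        """Assemble one combined record at query time."""
        name = next(
            (r["name"] for r in self._crm if r["id"] == customer_id), None
        )
        invoices = next(
            (r["open_invoices"] for r in self._billing
             if r["customer"] == customer_id),
            None,
        )
        return {
            "customer_id": customer_id,
            "name": name,
            "open_invoices": invoices,
        }


view = VirtualCustomerView(crm_system, billing_system)
before = view.get(1)
billing_system[0]["open_invoices"] = 0  # the source system changes ...
after = view.get(1)                     # ... and the next query reflects it
```

The caller only sees `customer_id`, `name` and `open_invoices` and does not need to know that the two sources use different key names. Because nothing is copied, there is no load step that could become outdated, but every query depends on the sources being reachable.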
The field of business intelligence (BI) comprises strategies and technologies used by companies to analyze current and historical business data. The central task of business intelligence is to answer strategic questions on the basis of data.
Business intelligence platform
A business intelligence platform is a tool that supports BI teams in extracting, analyzing and visualizing a company's data. Business decisions are then made based on the results generated. Business intelligence platforms typically cover three main areas: analysis, information delivery and platform integration.
Data profiling describes procedures for analyzing and evaluating data in terms of its key features such as content, characteristics, links and dependencies. Data profiling provides a critical understanding of data that organizations can use to their advantage.
A data profiler is a tool that enables the analysis and testing of data sets and allows the early detection of problems. Applying descriptive statistics to the content and structure of the data makes it easier to estimate the effort required for data preparation.
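What a profiler computes for a single column can be sketched with the Python standard library. The function name `profile_column` and the selection of statistics are assumptions for illustration; actual profilers report many more measures.

```python
import statistics
from collections import Counter


def profile_column(values):
    """Return simple descriptive statistics for one column.

    Missing values are represented by None. Numeric statistics are added
    only if every non-missing value is an int or a float.
    """
    present = [v for v in values if v is not None]
    profile = {
        "count": len(values),
        "missing": len(values) - len(present),
        "distinct": len(set(present)),
        # (value, frequency) of the most frequent value, if there is one
        "most_common": Counter(present).most_common(1)[0] if present else None,
    }
    numeric = [
        v for v in present
        if isinstance(v, (int, float)) and not isinstance(v, bool)
    ]
    if numeric and len(numeric) == len(present):
        profile.update(
            min=min(numeric),
            max=max(numeric),
            mean=statistics.mean(numeric),
            median=statistics.median(numeric),
        )
    return profile


# Hypothetical columns for illustration.
amounts = profile_column([10, 20, None, 20, 30])
cities = profile_column(["Berlin", "Paris", "Berlin", None, None])
```

A high number of missing values or an unexpectedly low number of distinct values in such a profile is an early hint that the column needs cleaning before it is used.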
Data compliance encompasses a range of measures designed to ensure that sensitive data is managed in accordance with internal corporate policies and external legal and regulatory requirements.
The General Data Protection Regulation (GDPR) is a regulation in EU law on data protection and privacy in the European Union. It also covers the transfer of personal data outside the EU.
Data silos are data sets that are controlled and managed solely by a limited group of users in an organization. These data sets cause problems when other departments need the information they contain but do not have access to it.
Big Data is data that consists of a large amount of information and whose volume is growing ever larger and ever faster. Big Data has five main characteristics, the five Vs: volume, velocity, variety, veracity, and value. Big Data makes it possible to gain more valuable insights, but it also leads to new technological challenges.