A data dictionary is a catalogue of metadata that contains the definitions and presentation rules for all of an organization’s application data, together with the relationships between the various data objects, so that the database can be structured uniformly and without redundancy. It is itself an application of a specific data model.
In a relational database, a data dictionary is a set of tables and views that are read-only when queried (for example, with SQL). The data dictionary is structured like a database, but it contains metadata rather than application data, that is, data that describes the structure of the application data and not its content. Such a data catalogue is usually created and maintained through an interactive dialogue or with the help of a data definition language (DDL).
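As an illustration, the following minimal sketch uses SQLite through Python's standard `sqlite3` module. The `customer` table and its columns are invented for the example, and SQLite's catalogue (`sqlite_master` and `PRAGMA table_info`) stands in for the data dictionary of a larger DBMS, whose catalogue views are named and organized differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL statement: it defines structure, and the DBMS records it in its catalogue.
conn.execute(
    "CREATE TABLE customer ("
    " id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " email TEXT)"
)
# Application data: stored in the table itself, not in the catalogue.
conn.execute("INSERT INTO customer (name, email) VALUES ('Ada', 'ada@example.org')")

# sqlite_master is SQLite's catalogue table; it holds metadata only.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]

# PRAGMA table_info yields one row per column:
# (cid, name, type, notnull, dflt_value, pk)
columns = [(name, col_type, bool(notnull), bool(pk))
           for _cid, name, col_type, notnull, _default, pk
           in conn.execute("PRAGMA table_info(customer)")]

# By default, SQLite rejects direct writes to its catalogue.
try:
    conn.execute("DELETE FROM sqlite_master")
    catalogue_is_read_only = False
except sqlite3.Error:
    catalogue_is_read_only = True

print(tables)   # ['customer']
print(columns)  # column name, declared type, NOT NULL flag, primary-key flag
print(catalogue_is_read_only)
```

The queries return the structure of `customer` (column names, declared types, constraints) but never the row that was inserted, which is the distinction between metadata and application data drawn above.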
An active data dictionary reflects the current, detailed state of the data model at all times. Changes to the structure of a database can be made directly in the maintenance interface of the data dictionary or by other means, for example through a DDL command interpreter. Regardless of how the changes are made, an active data dictionary is kept up to date automatically. A passive data dictionary offers no such synchronization: changes to the database structure in the database management system (DBMS) must be carried over manually to the data dictionary (DD), if that is desired and economically feasible. DD products for modelling and documenting the conceptual data model suffer from this problem in particular.
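The difference can be sketched in a few lines of Python. In this hedged example, SQLite's built-in catalogue plays the role of the active data dictionary, while `passive_dd` is a hypothetical, separately maintained copy; the `product` table and the helper `catalogue_columns` are assumptions made for the illustration.

```python
import sqlite3

def catalogue_columns(conn, table):
    """Read column names from the DBMS's own (active) catalogue.

    The table name is interpolated directly, so it must come from trusted code.
    """
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (id INTEGER PRIMARY KEY, title TEXT)")

# Passive dictionary: a snapshot kept outside the DBMS (hypothetical structure).
passive_dd = {"product": catalogue_columns(conn, "product")}

# A structural change made through DDL.
conn.execute("ALTER TABLE product ADD COLUMN price REAL")

# The active catalogue reflects the change without any further action ...
active_view = catalogue_columns(conn, "product")
# ... while the passive copy is stale until someone updates it by hand.
missing = [c for c in active_view if c not in passive_dd["product"]]

print(active_view)            # ['id', 'title', 'price']
print(passive_dd["product"])  # ['id', 'title']
print(missing)                # ['price']
```

A comparison like the one that produces `missing` is also a simple way to detect where a passive dictionary has drifted from the physical schema.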
---
In the development and maintenance of data models, different modeling levels are distinguished:
- Conceptual level (usually related to an area of application, in business informatics often also company-wide or even cross-company)
- Logical level
- Physical level, in which the conceptual/logical data model is mapped to and implemented for a specific DBMS.
Data dictionaries can be distinguished by which of these modelling levels they support. Depending on the level, they differ in the kind, content and data types of the metadata they require, and also in their functions and evaluation options.
A data dictionary for conceptual/logical data modelling includes:
- Definition of entities, data elements, and relationships between entities
- Business definitions and explanations thereof
In addition to the definitions of the essential data objects or elements and their relationships, detailed descriptive texts are typically stored for each entity and linked to one another by hyperlinks. When an organization builds an enterprise-wide data model (UwDM), information about application-related semantics, data type, and data representation is collected for each data element. The semantic information defines the exact meaning of a data element and is written as continuous text. The representation rules determine how data elements are stored and presented (e.g. data type such as integer or text, maximum text length, input format, output formats, allowed value ranges as a check rule, static or dynamic value sets, and so on). This first form of data dictionary is often not a standard function of a DBMS, so stand-alone solutions often have to be used. With respect to the DBMS, however, these are passive data dictionaries: changes to the conceptual data model cannot be applied automatically to the physical data model of the DBMS.
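To make the kinds of metadata listed above concrete, here is a minimal sketch of a single dictionary entry in Python. The class `DataElement`, its field names, and the `country_code` example are all hypothetical; real data dictionary products use their own, usually richer, schemas.

```python
from dataclasses import dataclass
from typing import Any, List, Optional

@dataclass(frozen=True)
class DataElement:
    """One hypothetical data dictionary entry for a data element."""
    name: str
    definition: str                              # semantic information as continuous text
    data_type: type                              # e.g. int or str
    max_length: Optional[int] = None             # maximum text length
    allowed_values: Optional[frozenset] = None   # static value set used as a check rule

    def violations(self, value: Any) -> List[str]:
        """Return the representation rules that the given value breaks."""
        if not isinstance(value, self.data_type):
            return [f"{self.name}: expected {self.data_type.__name__}"]
        problems = []
        if self.max_length is not None and len(value) > self.max_length:
            problems.append(f"{self.name}: longer than {self.max_length} characters")
        if self.allowed_values is not None and value not in self.allowed_values:
            problems.append(f"{self.name}: value not in the allowed set")
        return problems

country = DataElement(
    name="country_code",
    definition="Two-letter code of the customer's country of residence.",
    data_type=str,
    max_length=2,
    allowed_values=frozenset({"DE", "FR", "US"}),
)

print(country.violations("DE"))   # []
print(country.violations("DEU"))  # breaks the length rule and the value-set rule
print(country.violations(49))     # breaks the data type rule
```

Keeping the definition text and the check rules in one entry is what allows the same dictionary to serve both as documentation and as a source of validation logic.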
A data catalogue can also be used as a glossary by treating information objects/entities, data elements/attributes, and relationships as terms whose definitions are stored in the respective description part. A data dictionary can be developed further into complete ontologies, classes, or business process models. If the methods for data transformation are described in addition to the data structure, it is called a repository.
In any case, it makes sense to integrate the metadata from the data dictionary into the integrated development environment (IDE). For the dynamic or generic programming of forms and reports, a data dictionary that is sensibly structured and accessible to application programs is also a necessary prerequisite.
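A rough sketch of such generic form programming, again using SQLite's catalogue from Python: the function `form_fields`, the mapping `WIDGET_BY_TYPE`, and the `employee` table are assumptions made for this example and do not correspond to any particular framework.

```python
import sqlite3

# Hypothetical mapping from declared column types to form widget kinds.
WIDGET_BY_TYPE = {"INTEGER": "number", "REAL": "number", "TEXT": "text"}

def form_fields(conn, table):
    """Derive form field descriptions from the catalogue metadata of a table.

    The table name is interpolated directly, so it must come from trusted code.
    """
    fields = []
    for _cid, name, col_type, notnull, _default, pk in conn.execute(
            f"PRAGMA table_info({table})"):
        if pk:
            continue  # assumption: the key is generated, not typed in by the user
        fields.append({
            "name": name,
            "label": name.replace("_", " ").title(),
            "widget": WIDGET_BY_TYPE.get(col_type.upper(), "text"),
            "required": bool(notnull),
        })
    return fields

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    " id INTEGER PRIMARY KEY,"
    " full_name TEXT NOT NULL,"
    " salary REAL)"
)

fields = form_fields(conn, "employee")
for field in fields:
    print(field)
```

Because the form description is computed from the dictionary at run time, adding a column to `employee` changes the generated form without any change to the application code.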
Conceptual and physical data modelling capabilities are often not integrated in a single data dictionary. More seriously, changes to the detailed database architecture are not reflected in the conceptual data model. Either time-consuming manual follow-up is necessary, or the documented data model falls out of date.