Data integrity refers to the correctness and completeness of the information in a database, data warehouse, data mart or similar structure. Data consistency, the topic discussed previously, covers a narrower scope. In a database, the contents are modified with INSERT, DELETE or UPDATE statements, and these can compromise the integrity of the stored data in several ways. Invalid data may be added to the database. Erroneous changes may result from a system error or a power failure. When a change is only partially applied, the corresponding adjustments to other parts of the database may never be completed. One of the important functions of a relational DBMS is to preserve the integrity of the stored data as far as possible.
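As a rough illustration of a partially applied change, the sketch below uses Python's built-in sqlite3 module to wrap two UPDATE statements in one transaction, so that a failure between them undoes the first. The accounts table, the transfer helper and the fail_midway flag are hypothetical names invented for this example; a real system would follow the same pattern with its own schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(conn, src, dst, amount, fail_midway=False):
    """Move `amount` from src to dst as one all-or-nothing unit (hypothetical helper)."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes the block
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src)
            )
            if fail_midway:
                # Stand-in for a system error between the two halves of the change.
                raise RuntimeError("simulated failure between the two updates")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
        return True
    except RuntimeError:
        return False


def balances(conn):
    """Return the current balances as {account id: balance}."""
    return dict(conn.execute("SELECT id, balance FROM accounts ORDER BY id"))


# The debit is rolled back together with the missing credit.
transfer(conn, 1, 2, 30, fail_midway=True)
print(balances(conn))
```

Without the transaction, the first UPDATE would remain in place and the two balances would no longer add up to the original total.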
Types of Data Integrity
There are two types of data integrity: physical and logical.
Physical integrity concerns storing and fetching the data correctly. It can be threatened by electromechanical faults, material fatigue, corrosion, power outages, natural disasters, ionizing radiation and extreme temperatures. Error-detecting algorithms and error-correcting codes help to protect against such problems. Physical integrity also depends on every layer of the storage stack: a database management system might comply with the ACID properties while the RAID controller or the hard disk drive’s internal write cache does not.
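One common error-detecting technique is a checksum stored next to the data. The sketch below is a minimal example using CRC-32 from Python's standard zlib module; the store and fetch helpers and the record layout (payload followed by a 4-byte checksum) are assumptions made for illustration, not a description of how any particular DBMS or disk controller works.

```python
import zlib


def store(payload: bytes) -> bytes:
    """Append a 4-byte CRC-32 checksum to the payload before it is written."""
    checksum = zlib.crc32(payload)
    return payload + checksum.to_bytes(4, "big")


def fetch(record: bytes) -> bytes:
    """Return the payload, or raise ValueError if the checksum does not match."""
    payload, stored = record[:-4], int.from_bytes(record[-4:], "big")
    if zlib.crc32(payload) != stored:
        raise ValueError("checksum mismatch: record is corrupted")
    return payload


record = store(b"customer=42;balance=100")

# Simulate a single flipped bit, as a hardware fault might cause.
damaged = bytes([record[0] ^ 0x01]) + record[1:]

try:
    fetch(damaged)
except ValueError as err:
    print(err)
```

A checksum like this only detects corruption; recovering the original data requires an error-correcting code or a redundant copy.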
Logical integrity concerns whether the data is correct and makes sense in its context. It can be compromised by software bugs and human error.
Types of Integrity Constraints
- Domain integrity: the validity of the restrictions that a given column of a table must comply with.
  - Required data: the column must hold a non-NULL value. This is defined by declaring the column NOT NULL in the CREATE TABLE statement when the table is first created.
  - Validity check: when a table is created, each column is given a data type, and the DBMS ensures that only data of that type is entered into the column.
- Entity integrity: the primary key of a table must have a unique value for each row; otherwise the database loses its integrity. The primary key is specified in the CREATE TABLE statement. The DBMS automatically checks the uniqueness of the primary key value on every INSERT and UPDATE statement, so an attempt to insert or update a row with an existing primary key value will fail.
- Referential integrity: ensures integrity between foreign and primary keys (parent/child relationships). Four kinds of database update can corrupt referential integrity:
  - Inserting a child row whose foreign key does not match any primary key in the parent table.
  - Updating the foreign key of a child row, with an UPDATE statement, to a value that does not match any primary key.
  - Deleting a parent row that has one or more children, which leaves the child rows orphaned.
  - Updating the primary key of a parent row that has one or more children, which leaves the child rows orphaned.
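The constraints above can be tried out with a small sketch, again using Python's built-in sqlite3 module. The customers and orders tables, their columns and the rejected helper are hypothetical names chosen for this example. Two SQLite-specific assumptions apply: foreign keys are only enforced after PRAGMA foreign_keys = ON, and SQLite is lenient about column types by default, so the type check is spelled out here with a CHECK constraint. Other DBMSs may behave differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE customers (
        id   INTEGER PRIMARY KEY,   -- entity integrity: unique per row
        name TEXT NOT NULL          -- required data
    );
    CREATE TABLE orders (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),  -- referential integrity
        quantity    INTEGER NOT NULL
                    CHECK (typeof(quantity) = 'integer')        -- explicit type check
    );
    INSERT INTO customers VALUES (1, 'Ada');
    INSERT INTO orders VALUES (10, 1, 2);
""")


def rejected(sql):
    """Return True if the DBMS refuses the statement with an integrity error."""
    try:
        with conn:  # roll back the failed statement
            conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


attempts = {
    "NULL in a required column": "INSERT INTO customers VALUES (2, NULL)",
    "wrong data type": "INSERT INTO orders VALUES (12, 1, 'many')",
    "duplicate primary key": "INSERT INTO customers VALUES (1, 'Bob')",
    "child insert without a parent": "INSERT INTO orders VALUES (11, 99, 1)",
    "child foreign key changed to a missing parent":
        "UPDATE orders SET customer_id = 99 WHERE id = 10",
    "delete of a parent that has children": "DELETE FROM customers WHERE id = 1",
    "primary key update on a parent that has children":
        "UPDATE customers SET id = 5 WHERE id = 1",
}

results = {label: rejected(sql) for label, sql in attempts.items()}
for label, was_rejected in results.items():
    print(f"{label}: {'rejected' if was_rejected else 'accepted'}")
```

Each statement violates exactly one constraint, so the tables still hold their original single rows after the loop finishes.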
Conclusion
Data integrity is the overall accuracy and consistency of data. It is a static property of a single schema and is often associated with the relational model.
---