What is a Foreign Key?
Edgar F.Codd as the pioneer of relational database management system (RDBMS) produced thirteen rules (numbered zero to twelve) to define what is required from a database management system in order for it to be considered relational.
With these thirteen rules Codd made sure a RDBMS system complies with the ACID property required from any data base transaction.
In computer science, ACID (Atomicity, Consistency, Isolation, Durability) is a set of properties that guarantee that database transactions are processed reliably.
In the context of relational databases, a foreign key is a field (or collection of fields) in one table that uniquely identifies a row of another table. In other words, a foreign key is a column or a combination of columns that is used to establish and enforce a link between two tables.
Now we got that out of the way, let’s get to the main context.
In a typical Data Warehouse Database design the Dimensions will be joined by Facts via Primary-Foreign Key relation, with the primary key in the Dimensions acting as a foreign key in the Facts. But while doing the ETL load we already check this constraint. During the ETL process we load the Dimensions first and then the Fact tables. While loading the Fact tables we do a look-up on the Dimension and only load the records corresponding to the Dimension. In another words we only load the transaction records for which corresponding master entries are already present. So if our ETL code meets all the requirements we don’t need to check the integrity constraint again at the DB level. Removing foreign key constraint will significantly improve ETL load time which is very important for Global systems where ETL window is very short and real time systems.
But my suggestion would be to remove the Foreign Keys but keep the Primary Keys (which by default are created as Clustered Index), because it improves data base performance.
So the ideal approach should be to start a DWBI application with ETL integrity check along with integrity check at DB level .But few months after the deployment when the system gets stable and there is no error at DB level (that means ETL load is happening properly), remove the Foreign Key constraints.
Edgar F.Codd as the pioneer of relational database management system (RDBMS) produced thirteen rules (numbered zero to twelve) to define what is required from a database management system in order for it to be considered relational.
With these thirteen rules Codd made sure a RDBMS system complies with the ACID property required from any data base transaction.
In computer science, ACID (Atomicity, Consistency, Isolation, Durability) is a set of properties that guarantee that database transactions are processed reliably.
In the context of relational databases, a foreign key is a field (or collection of fields) in one table that uniquely identifies a row of another table. In other words, a foreign key is a column or a combination of columns that is used to establish and enforce a link between two tables.
Now we got that out of the way, let’s get to the main context.
In a typical Data Warehouse Database design the Dimensions will be joined by Facts via Primary-Foreign Key relation, with the primary key in the Dimensions acting as a foreign key in the Facts. But while doing the ETL load we already check this constraint. During the ETL process we load the Dimensions first and then the Fact tables. While loading the Fact tables we do a look-up on the Dimension and only load the records corresponding to the Dimension. In another words we only load the transaction records for which corresponding master entries are already present. So if our ETL code meets all the requirements we don’t need to check the integrity constraint again at the DB level. Removing foreign key constraint will significantly improve ETL load time which is very important for Global systems where ETL window is very short and real time systems.
But my suggestion would be to remove the Foreign Keys but keep the Primary Keys (which by default are created as Clustered Index), because it improves data base performance.
So the ideal approach should be to start a DWBI application with ETL integrity check along with integrity check at DB level .But few months after the deployment when the system gets stable and there is no error at DB level (that means ETL load is happening properly), remove the Foreign Key constraints.
No comments:
Post a Comment