Foreign key
Encyclopedia
In the context of relational database
Relational database
A relational database is a database that conforms to relational model theory. The software used in a relational database is called a relational database management system . Colloquial use of the term "relational database" may refer to the RDBMS software, or the relational database itself...

s, a foreign key is a referential constraint
Referential integrity
Referential integrity is a property of data which, when satisfied, requires every value of one attribute of a relation to exist as a value of another attribute in a different relation ....

 between two tables
Table (database)
In relational databases and flat file databases, a table is a set of data elements that is organized using a model of vertical columns and horizontal rows. A table has a specified number of columns, but can have any number of rows...

.

A foreign key is a field in a relational table that matches a candidate key
Candidate key
In the relational model of databases, a candidate key of a relation is a minimal superkey for that relation; that is, a set of attributes such that# the relation does not have two distinct tuples In the relational model of databases, a candidate key of a relation is a minimal superkey for that...

 of another table. The foreign key can be used to cross-reference tables.

For example, say we have two tables, a CUSTOMER table that includes all customer data, and an ORDERS table that includes all customer orders. The intention here is that all orders must be associated with a customer that is already in the CUSTOMER table. To do this, we will place a foreign key in the ORDERS table and have it relate to the primary key of the CUSTOMER table.

The foreign key identifies a column
Column (database)
In the context of a relational database table, a column is a set of data values of a particular simple type, one for each row of the table. The columns provide the structure according to which the rows are composed....

 or set of columns in one (referencing) table that refers to a column or set of columns in another (referenced) table.
The columns in the referencing table must reference the columns of the primary key or other superkey
Superkey
A superkey is defined in the relational model of database organization as a set of attributes of a relation variable for which it holds that in all relations assigned to that variable, there are no two distinct tuples that have the same values for the attributes in this set...

 in the referenced table.
The values in one row
Row (database)
In the context of a relational database, a row—also called a record or tuple—represents a single, implicitly structured data item in a table. In simple terms, a database table can be thought of as consisting of rows and columns or fields...

 of the referencing columns must occur in a single row in the referenced table.
Thus, a row in the referencing table cannot contain values that don't exist in the referenced table (except potentially NULL).
This way references can be made to link information
Information
Information in its most restricted technical sense is a message or collection of messages that consists of an ordered sequence of symbols, or it is the meaning that can be interpreted from such a message or collection of messages. Information can be recorded or transmitted. It can be recorded as...

 together and it is an essential part of database normalization
Database normalization
In the design of a relational database management system , the process of organizing data to minimize redundancy is called normalization. The goal of database normalization is to decompose relations with anomalies in order to produce smaller, well-structured relations...

.
Multiple rows in the referencing table may refer to the same row in the referenced table.
Most of the time, it reflects the one (parent table or referenced table) to many (child table, or referencing table) relationship.

The referencing and referenced table may be the same table, i.e. the foreign key refers back to the same table. Such a foreign key is known in SQL:2003
SQL:2003
SQL:2003 is the fifth revision of the SQL database query language. The latest revision of the standard is SQL:2008.-Summary:The SQL:2003 standard makes minor modifications to all parts of SQL:1999 , and officially introduces a few new features such as:* XML-related features * Window functions* the...

 as a self-referencing or recursive foreign key.

A table may have multiple foreign keys, and each foreign key can have a different referenced table. Each foreign key is enforced independently by the database system
Database system
A database system is a term that is typically used to encapsulate the constructs of a data model, database Management system and database....

. Therefore, cascading relationships between tables can be established using foreign keys.

Improper foreign key/primary key relationships or not enforcing those relationships are often the source of many database and data modeling
Data modeling
Data modeling in software engineering is the process of creating a data model for an information system by applying formal data modeling techniques.- Overview :...

 problems.

Defining Foreign Keys

Foreign keys are defined in the ANSI SQL Standard, through a FOREIGN KEY constraint. The syntax to add such a constraint to an existing table is defined in SQL:2003
SQL:2003
SQL:2003 is the fifth revision of the SQL database query language. The latest revision of the standard is SQL:2008.-Summary:The SQL:2003 standard makes minor modifications to all parts of SQL:1999 , and officially introduces a few new features such as:* XML-related features * Window functions* the...

 as shown below. Omitting the column list in the REFERENCES clause implies that the foreign key shall reference the primary key of the referenced table.


ALTER TABLE
ADD [ CONSTRAINT ]
FOREIGN KEY ( {, }... )
REFERENCES
[ ( {, }... ) ]
[ ON UPDATE ]
[ ON DELETE ]


Likewise, foreign keys can be defined as part of the CREATE TABLE SQL statement.


CREATE TABLE table_name (
id INTEGER PRIMARY KEY,
col2 CHARACTER VARYING(20),
col3 INTEGER,
...
FOREIGN KEY(col3)
REFERENCES other_table(key_col) ON DELETE CASCADE,
... )


If the foreign key is a single column only, the column can be marked as such using the following syntax:


CREATE TABLE table_name (
id INTEGER PRIMARY KEY,
col2 CHARACTER VARYING(20),
col3 INTEGER REFERENCES other_table(column_name),
... )

Foreign keys can be defined with stored proc statement.


sp_foreignkey tabname, pktabname, col1 [, col2] ... [, col8]


tabname
: is the name of the table or view that contains the foreign key to be defined.
pktabname : is the name of the table or view that has the primary key to which the foreign key applies. The primary key must already be defined.
col1 : is the name of the first column that makes up the foreign key. The foreign key must have at least one column and can have a maximum of eight columns.

Referential Actions

Because the Database Management System enforces referential constraints, it must ensure data integrity if rows in a referenced table are to be deleted (or updated). If dependent rows in referencing tables still exist, those references have to be considered. SQL:2003
SQL:2003
SQL:2003 is the fifth revision of the SQL database query language. The latest revision of the standard is SQL:2008.-Summary:The SQL:2003 standard makes minor modifications to all parts of SQL:1999 , and officially introduces a few new features such as:* XML-related features * Window functions* the...

 specifies 5 different referential actions that shall take place in such occurrences:
  • CASCADE
  • RESTRICT
  • NO ACTION
  • SET NULL
  • SET DEFAULT

CASCADE

Whenever rows in the master (referenced) table are deleted (resp. updated), the respective rows of the child (referencing) table with a matching foreign key column will get deleted (resp. updated) as well. This is called a cascade delete (resp. update).

Example Tables: Customer(customer_id, cname, caddress) and Order(customer_id, products, payment)

Customer is the master table and Order is the child table, where 'customer_id' is the foreign key in Order and represents the customer who placed the order. When a row of Customer is deleted, any Order row matching the deleted Customer's customer_id will also be deleted.

NOTE: In Microsoft SQL, a cascading delete to a self-referencing table is not allowed. You must either use a trigger, create a stored procedure, or handle the cascading delete from the calling application. An example of this is where a single table has an ID as identity and a ParentID with a relationship to ID in the same table.

RESTRICT

A value cannot be updated or deleted when a row exists in a foreign key table that references the value in the referenced table.

Similarly, a row cannot be deleted as long as there is a reference to it from a foreign key table.

NO ACTION

NO ACTION and RESTRICT are very much alike. The main difference between NO ACTION and RESTRICT is that with NO ACTION the referential integrity check is done after trying to alter the table. RESTRICT does the check before trying to execute the UPDATE
Update (SQL)
An SQL UPDATE statement changes the data of one or more records in a table. Either all the rows can be updated, or a subset may be chosen using a condition.The UPDATE statement has the following form:...

 or DELETE
Delete (SQL)
In the database structured query language , the DELETE statement removes one or more records from a table. A subset may be defined for deletion using a condition, otherwise all records are removed.-Usage:The DELETE statement follows the syntax:...

 statement. Both referential actions act the same if the referential integrity check fails: the UPDATE
Update (SQL)
An SQL UPDATE statement changes the data of one or more records in a table. Either all the rows can be updated, or a subset may be chosen using a condition.The UPDATE statement has the following form:...

 or DELETE
Delete (SQL)
In the database structured query language , the DELETE statement removes one or more records from a table. A subset may be defined for deletion using a condition, otherwise all records are removed.-Usage:The DELETE statement follows the syntax:...

 statement will result in an error.

In other words, when an UPDATE
Update (SQL)
An SQL UPDATE statement changes the data of one or more records in a table. Either all the rows can be updated, or a subset may be chosen using a condition.The UPDATE statement has the following form:...

 or DELETE
Delete (SQL)
In the database structured query language , the DELETE statement removes one or more records from a table. A subset may be defined for deletion using a condition, otherwise all records are removed.-Usage:The DELETE statement follows the syntax:...

 statement is executed on the referenced table using the referential action NO ACTION, the DBMS verifies at the end of the statement execution that none of the referential relationships are violated. This is different from RESTRICT, which assumes at the outset that the operation will violate the constraint. Using NO ACTION, the triggers
Database trigger
A database trigger is procedural code that is automatically executed in response to certain events on a particular table or view in a database. The trigger is mostly used for keeping the integrity of the information on the database...

 or the semantics of the statement itself may yield an end state in which no foreign key relationships are violated by the time the constraint is finally checked, thus allowing the statement to complete successfully.

SET NULL

The foreign key values in the referencing row are set to NULL
Null (SQL)
Null is a special marker used in Structured Query Language to indicate that a data value does not exist in the database. Introduced by the creator of the relational database model, E. F. Codd, SQL Null serves to fulfill the requirement that all true relational database management systems support...

 when the referenced row is updated or deleted. This is only possible if the respective columns in the referencing table are nullable. Due to the semantics of NULL, a referencing row with NULLs in the foreign key columns does not require a referenced row.

SET DEFAULT

Similar to SET NULL, the foreign key values in the referencing row are set to the column default when the referenced row is updated or deleted.

Triggers

Referential actions are generally implemented as implied triggers
Database trigger
A database trigger is procedural code that is automatically executed in response to certain events on a particular table or view in a database. The trigger is mostly used for keeping the integrity of the information on the database...

 (i.e. triggers with system-generated names, often hidden.) As such, they are subject to the same limitations as user-defined triggers, and their order of execution relative to other triggers may need to be considered; in some cases it may become necessary to replace the referential action with its equivalent user-defined trigger to ensure proper execution order, or to work around mutating-table limitations.

Another important limitation appears with transaction isolation: your changes to a row may not be able to fully cascade because the row is referenced by data your transaction cannot "see", and therefore cannot cascade onto. An example: while your transaction is attempting to renumber a customer account, a simultaneous transaction is attempting to create a new invoice for that same customer; while a CASCADE rule may fix all the invoice rows your transaction can see to keep them consistent with the renumbered customer row, it won't reach into another transaction to fix the data there; because the database cannot guarantee consistent data when the two transactions commit, one of them will be forced to roll back (often on a first-come-first-served basis.)

Example

As a first example to illustrate foreign keys, suppose an accounts database has a table with invoices and each invoice is associated with a particular supplier. Supplier details (such as address or phone number) are kept in a separate table; each supplier is given a 'supplier number' to identify it. Each invoice record has an attribute containing the supplier number for that invoice. Then, the 'supplier number' is the primary key in the Supplier table. The foreign key in the Invoices table points to that primary key. The relational schema is the following. Primary keys are marked in bold, and foreign keys are marked in italics.

Supplier ( SupplierNumber, Name, Address, Type )
Invoices ( InvoiceNumber, SupplierNumber, Text )

The corresponding Data Definition Language
Data Definition Language
A data definition language or data description language is a syntax similar to a computer programming language for defining data structures, especially database schemas.-History:...

statement is as follows.

CREATE TABLE Supplier (
SupplierNumber INTEGER NOT NULL,
Name VARCHAR(20) NOT NULL,
Address VARCHAR(50) NOT NULL,
Type VARCHAR(10),
CONSTRAINT supplier_pk PRIMARY KEY(SupplierNumber),
CONSTRAINT number_value CHECK (SupplierNumber > 0) )

CREATE TABLE Invoices (
InvoiceNumber INTEGER NOT NULL,
SupplierNumber INTEGER NOT NULL,
Text VARCHAR(4096),
CONSTRAINT invoice_pk PRIMARY KEY(InvoiceNumber),
CONSTRAINT inumber_value CHECK (InvoiceNumber > 0),
CONSTRAINT supplier_fk FOREIGN KEY(SupplierNumber)
REFERENCES Supplier(SupplierNumber)
ON UPDATE CASCADE ON DELETE RESTRICT )

External links

The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 
x
OK