Natural key: Difference between revisions

Content deleted Content added

Inline

Latest revision as of 06:43, 15 May 2024

A natural key (also known as business key^[1] or domain key^[2]) is a type of unique key in a database formed of attributes that exist and are used in the external world outside the database (i.e. in the business domain or domain of discourse).^[3] In the relational model of data, a natural key is a superkey and is therefore a functional determinant for all attributes in a relation.

A natural key serves two complementary purposes:

It provides a means of unique identification for data
It imposes a rule, specifically a uniqueness constraint, to ensure that data remains unique within an information system

The uniqueness constraint assures uniqueness of data within a certain technical context (e.g. a set of values in a table, file or relation variable) by rejecting input of any data that would otherwise violate the constraint. This means that the user can rely on a guaranteed correspondence between facts identified by key values recorded in a system and the external domain of discourse (a single version of the truth according to Kimball).

A natural key differs from a surrogate key which has no meaning outside the database itself and is not based on real-world observation or intended as a statement about the reality being modelled. A natural key therefore provides a certain data quality guarantee whereas a surrogate does not. It is common for elements of data to have several keys, any number of which may be natural or surrogate.

Advantages[edit]

The advantages of using a natural key to uniquely identify records in a relation include less disk space usage, the natural key is an attribute that is related to the business or the real world so in most cases, it is already being stored in the relation which saves disk space as compared to creating a new column for storing the surrogate key.

Another advantage of using natural keys is that it simplifies enforcement of data quality, and they are easier to relate to real life while designing the database system. They simplify the quality of data as using a natural key that is unique in the real world ensures that there cannot be multiple records with the same primary key. Comparing the database schema to a real world scenario is a huge part of designing a database schema and when a natural key is being used in the tables of the database, it makes it easy for the database engineer to engineer the database system.

Disadvantages[edit]

Usage of natural keys as unique identifiers in a table has one main disadvantage which is the change of business rules or the change of rules of the attribute in the real world. The definition of the structure of the natural key attribute might change in the future.

For example if there is a table storing the information about US citizens, the Social Security Number would act as the natural key, Social Security Number being the natural key might pose a problem in the future if the US government changes the structure of the Social Security Number and increases the number of digits in the SSN due to some reason. In that case, the database administrator will have to change the schema of the table and perhaps also update the records of the table. In other cases, this can prevent improvements of the system altogether due to too extensive effort required for the change, e.g., the inability of the knowledge management software Confluence, to represent multiple pages with the same title.^[4]

References[edit]

^ Data modelling: What exactly is a Business Key? by Roy, Rajiv Max. 19 November 2020. Retrieved 1 August 2022.
^ Auto Keys Versus Domain Keys
^ Intelligent Versus Surrogate Keys
^ "[CONFSERVER-2524] Enable creation of same-named pages within a space - Create and track feature requests for Atlassian products".

[1] Data modelling: What exactly is a Business Key? by Roy, Rajiv Max. 19 November 2020. Retrieved 1 August 2022.

[2] Auto Keys Versus Domain Keys

[3] Intelligent Versus Surrogate Keys

[4] "[CONFSERVER-2524] Enable creation of same-named pages within a space - Create and track feature requests for Atlassian products".

[1]

[2]

[3]

[4]

@@ Line 1: / Line 1: @@
+{{Short description|Type of unique key in a database}}
 {{Distinguish|Natural minor}}
-A '''natural key''' (also known as '''business key'''<ref>[https://web.archive.org/web/20121115115247/http://www.dbdebunk.com/2012/11/on-kinds-of-keys.html On Kinds of Keys]</ref>) is a type of [[unique key]] in a [[database]] formed of attributes that exist and are used in the external world outside the database (i.e. in the business domain or [[domain of discourse]]).<ref>[http://www.bcarter.com/intsurr1.htm Intelligent Versus Surrogate Keys]</ref> In the [[relational model]] of data, a natural key is a [[candidate key]] and is therefore a functional determinant for all attributes in a relation. A natural key is sometimes called ''domain key''.<ref>[http://c2.com/cgi/wiki?AutoKeysVersusDomainKeys Auto Keys Versus Domain Keys]</ref>
+A '''natural key''' (also known as '''business key'''<ref>[https://ia.acs.org.au/article/2020/data-modelling--what-exactly-is-a-business-key-.html Data modelling: What exactly is a Business Key?] by Roy, Rajiv Max. 19 November 2020. Retrieved 1 August 2022.</ref> or '''domain key'''<ref>[http://c2.com/cgi/wiki?AutoKeysVersusDomainKeys Auto Keys Versus Domain Keys]</ref>) is a type of [[unique key]] in a [[database]] formed of attributes that exist and are used in the external world outside the database (i.e. in the business domain or [[domain of discourse]]).<ref>[http://www.bcarter.com/intsurr1.htm Intelligent Versus Surrogate Keys]</ref> In the [[relational model]] of data, a natural key is a [[superkey]] and is therefore a [[Functional dependency|functional determinant]] for all attributes in a relation.
+A natural key serves two complementary purposes:
-A natural key serves two complementary purposes: it provides a means of [[Unique identifier|identification]] for data and it imposes a rule, specifically a ''uniqueness constraint'', to ensure that data remains unique within an information system. The uniqueness constraint assures uniqueness of data within a certain technical context (e.g. a set of values in a table, file or relation variable) by rejecting input of any data that would otherwise violate the constraint. This means that the user can rely on a guaranteed correspondence between facts identified by key values recorded in a system and the external domain of discourse (a single version of the truth).
+* It provides a means of [[Unique identifier|unique identification]] for data
+* It imposes a rule, specifically a [[Unique key|uniqueness constraint]], to ensure that data remains unique within an [[information system]]
+The uniqueness constraint assures uniqueness of data within a certain technical context (e.g. a set of values in a table, file or relation variable) by rejecting input of any data that would otherwise violate the constraint. This means that the user can rely on a guaranteed correspondence between facts identified by key values recorded in a system and the external domain of discourse (a [[single version of the truth]] according to [[Ralph Kimball|Kimball]]).
+A natural key differs from a [[surrogate key]] which has no meaning outside the database itself and is not based on real-world observation or intended as a statement about the reality being modelled. A natural key therefore provides a certain [[data quality]] guarantee whereas a surrogate does not. It is common for elements of data to have several keys, any number of which may be natural or surrogate.
-Examples of natural keys could include:
+== Advantages ==
-* Purchase Order number
+The advantages of using a natural key to uniquely identify records in a relation include less disk space usage, the natural key is an attribute that is related to the business or the real world so in most cases, it is already being stored in the relation which saves disk space as compared to creating a new column for storing the [[surrogate key]].
-* [[Flight number]]
-* Login name
-* [[Vehicle registration plate|Vehicle registration number]]
-* [[Social Security number]]
-* Passport number
-* University ID number
+Another advantage of using natural keys is that it simplifies enforcement of data quality, and they are easier to relate to real life while designing the database system. They simplify the quality of data as using a natural key that is unique in the real world ensures that there cannot be multiple records with the same [[primary key]]. Comparing the database schema to a real world scenario is a huge part of designing a database schema and when a natural key is being used in the tables of the database, it makes it easy for the database engineer to engineer the database system.
-The presence of a key guarantees uniqueness within an information system but it is not always necessary that the key values be unique or immutable within some wider population of objects or concepts ''outside'' that system. For example a key on a CITY attribute means that the set of city names assigned to that attribute must be unique at any point in time, so there can only be one city called "Washington" for example. That does not imply that every possible city which might one day be referred to within the system must have a unique name. In logical terms, the proposition being represented by the value "Washington" is that there is a city called Washington ''within the domain of discourse'' at a point in time, not that there is only ''one'' city of that name in every conceivable domain or for all time.
+== Disadvantages ==
-Similarly, the potential occurrence of erroneous or unwanted duplicate information does not necessarily rule out the use of an attribute as a natural key. For example in the US there may be instances of duplicate [[Social Security number]]s mistakenly issued to individuals or other instances of a person fraudulently or mistakenly using another person's SSN. In these situations the use of SSN as a natural key serves the purpose of a data integrity check - detecting potential duplication or fraud by rejecting any duplicate values with the implication that any error should be identified and resolved before entry into the system.
+Usage of natural keys as unique identifiers in a table has one main disadvantage which is the change of [[business rule]]s or the change of rules of the attribute in the real world. The definition of the structure of the natural key attribute might change in the future.
+For example if there is a table storing the information about US citizens, the [[Social Security Number]] would act as the natural key, Social Security Number being the natural key might pose a problem in the future if the US government changes the structure of the Social Security Number and increases the number of digits in the SSN due to some reason. In that case, the [[database administrator]] will have to change the schema of the table and perhaps also update the records of the table. In other cases, this can prevent improvements of the system altogether due to too extensive effort required for the change, e.g., the inability of the [[knowledge management]] software [[Confluence (software)|Confluence]], to represent multiple pages with the same title.<ref>{{Cite web|url=https://jira.atlassian.com/browse/CONFSERVER-2524|title = &#91;CONFSERVER-2524&#93; Enable creation of same-named pages within a space - Create and track feature requests for Atlassian products}}</ref>
-A natural key differs from a [[surrogate key]] which has no meaning outside the database itself and is not based on real-world observation or intended as a statement about the reality being modelled. A natural key therefore provides a certain data quality guarantee whereas a surrogate does not. It is common for elements of data to have several keys, any number of which may be natural or surrogate.
 ==References==