What is the name of the field used to uniquely identify a row in a relational database?

What Does Primary Key Mean?

A primary key is a special relational database table column (or combination of columns) designated to uniquely identify each table record.

A primary key is used as a unique identifier to quickly locate data within the table. A table cannot have more than one primary key, although that key may span several columns.

A primary key’s main features are:

  • It must contain a unique value for each row of data.
  • It cannot contain null values.
  • Every row must have a primary key value.
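
These rules can be sketched with Python's built-in sqlite3 module, chosen here only because it needs no setup. The student table and its columns are hypothetical. NOT NULL is written out explicitly because SQLite, unlike most engines, does not imply it for non-integer primary keys.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: student_id is the primary key.
conn.execute(
    "CREATE TABLE student (student_id TEXT NOT NULL PRIMARY KEY, name TEXT)"
)
conn.execute("INSERT INTO student VALUES ('S001', 'Asha')")


def try_insert(student_id, name):
    """Return True if the row was accepted, False if a constraint rejected it."""
    try:
        conn.execute("INSERT INTO student VALUES (?, ?)", (student_id, name))
        return True
    except sqlite3.IntegrityError:
        return False


duplicate_accepted = try_insert("S001", "Ben")  # key already used by another row
null_accepted = try_insert(None, "Chen")        # key value missing
fresh_accepted = try_insert("S002", "Dana")     # new, non-null key
```

Only the last insert should succeed: the first repeats an existing key and the second supplies no key at all.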

A primary key might use one or more fields already present in the underlying data model, or a specific extra field can be created to be the primary key.

Techopedia Explains Primary Key

The primary key concept is critical to an efficient relational database. Without the primary key and closely related foreign key concepts, relational databases would not work.

In fact, since a table can easily contain thousands of records (including duplicates), a primary key is necessary to ensure that a table record can always be uniquely identified.

All keys that come from real-world observables and attributes are called natural primary keys, as opposed to surrogate primary keys that are, instead, arbitrarily assigned to each record.

Almost all individuals deal with natural primary keys frequently but unknowingly in everyday life.

For example, students are routinely assigned unique identification (ID) numbers, and all U.S. citizens have government-assigned and uniquely identifiable Social Security numbers. Street addresses and driver's license numbers are examples of keys used to uniquely identify locations and drivers, respectively.

As another example, consider a database that holds all of the data stored by a commercial bank. Two of its tables are CUSTOMER_MASTER, which stores basic and static customer data (name, date of birth, address, Social Security number, etc.), and ACCOUNTS_MASTER, which stores various bank account data (account creation date, account type, withdrawal limits or corresponding account information, etc.).

To uniquely identify customers, a column or combination of columns is selected to guarantee that two customers never have the same unique value. Thus, certain columns are immediately eliminated, e.g., surname and date of birth.

A good primary key candidate is the column that is designated to hold Social Security numbers. However, some account holders may not have Social Security numbers, so this column’s candidacy is eliminated.

The next logical option is to use a combination of columns, such as surname, date of birth, and email address together, but this results in a long and cumbersome primary key.

The best option is to create a separate primary key in a new column named CUSTOMER_ID. Then, the database automatically generates a unique number each time a customer is added, guaranteeing unique identification.

As this key is created, the column is designated as the primary key within the SQL script that creates the table, and all null values are automatically rejected.
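
A minimal sketch of this design, assuming SQLite through Python's sqlite3 module. The lowercase table and column names and the sample rows are illustrative, not taken from a real schema. In SQLite, an INTEGER PRIMARY KEY column receives an automatically generated value when the insert omits it; other engines use their own mechanisms for the same purpose.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE customer_master (
        customer_id INTEGER PRIMARY KEY,  -- generated by SQLite when omitted
        surname     TEXT NOT NULL,
        ssn         TEXT                  -- may be missing, so unsuitable as a key
    )
""")

# Hypothetical customers; two share a surname and two have no SSN on file.
customers = [("Smith", "000-00-0001"), ("Smith", None), ("Jones", None)]
for surname, ssn in customers:
    conn.execute(
        "INSERT INTO customer_master (surname, ssn) VALUES (?, ?)",
        (surname, ssn),
    )

ids = [row[0] for row in conn.execute("SELECT customer_id FROM customer_master")]
```

Every row ends up with its own non-null customer_id even though the surname repeats and the Social Security number is sometimes absent.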

The account number associated with each CUSTOMER_ID allows for the secure handling of customer queries and quick search times (as with any indexed table).

For example, a customer may be asked to provide their surname when conducting a bank query. A query on a common surname (such as Smith) is likely to return multiple results.

When querying data, filtering on the primary key guarantees that at most one record is returned.
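
The difference between the two lookups can be sketched as follows, again assuming SQLite via Python's sqlite3 module and a hypothetical three-row customer table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer_master ("
    "customer_id INTEGER PRIMARY KEY, surname TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO customer_master (customer_id, surname) VALUES (?, ?)",
    [(1, "Smith"), (2, "Smith"), (3, "Jones")],
)

# Searching on a non-key column can match several customers.
by_surname = conn.execute(
    "SELECT customer_id FROM customer_master WHERE surname = ?", ("Smith",)
).fetchall()

# Searching on the primary key matches one row at most.
by_key = conn.execute(
    "SELECT surname FROM customer_master WHERE customer_id = ?", (2,)
).fetchall()
```

Here the surname search returns two rows, while the search on customer_id returns exactly one.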

Introduction

SQL keys are the key to your success in analytics!

Data is growing at an exponential rate, and so is the demand for professionals who are well versed in databases.

Organizations all over the world are looking for data scientists and analysts who can draw meaningful insights from the vast amounts of data. And one of the most important languages for handling databases is SQL. That is why those professionals with a background in SQL have an edge over their peers when it comes to working with databases.

An important aspect of working with databases is actually creating one. Creating a database is an altogether different ball game when compared to retrieving data from databases. Why so? Well, creating a database requires a thorough knowledge of how the tables relate to each other within a database and how to handle records so that there is no duplicate data.

A very important aspect of creating databases is understanding the concept of Keys in SQL. A key is simply a column or a group of columns that can help you identify rows in a table uniquely. But like any other entity, there are many kinds of Keys in SQL.

In this article, I will discuss some of the most common SQL keys that any data scientist or analyst should know before they even start working with databases!

Table of Contents

  • What are keys in DBMS?
  • Super Key
  • Candidate Key
  • Primary Key
  • Alternate or Secondary Key
  • Foreign Key
  • Composite Key

What are keys in DBMS?

Databases are used to store massive amounts of information which is stored across multiple tables. Each table might be running into thousands of rows. Needless to say, there will be many duplicate rows with redundant information. How do we deal with that? How do we manage records so that we are storing only unique data? And, how do we relate the multiple tables that are present in the database?

SQL keys are the answer to all these queries.

An SQL key is either a single column (or attribute) or a group of columns that can uniquely identify rows (or tuples) in a table.

SQL keys ensure that there are no rows with duplicate information. Not only that, but they also help in establishing a relationship between multiple tables in the database. Therefore, it becomes imperative to learn about the different keys in SQL.

What is a Super key in SQL?

A Super key is a single attribute or a group of attributes that can uniquely identify tuples in a table.

A Super key can contain attributes that cannot independently identify tuples in a table, but when grouped with certain other attributes, they identify tuples uniquely.

Let me take an example to clarify the above statement. Consider an Employee table with the attributes Id, Name, Email, City, and Gender.

Suppose the Id attribute is unique to every employee. In that case, we can say that the Id attribute can uniquely identify the tuples of this table. So, Id is a Super key of this table. Note that we can have other Super keys too in this table.

For instance – (Id, Name), (Id, Email), (Id, Name, Email), etc. can all be Super keys as they can all uniquely identify the tuples of the table. This is so because of the presence of the Id attribute which is able to uniquely identify the tuples. The other attributes in the keys are unnecessary. Nevertheless, they can still identify tuples.
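
The idea can be checked mechanically. The sketch below uses plain Python and a made-up Employee sample (the rows are hypothetical) to test whether a group of attributes has a distinct value combination in every row. Strictly speaking, a Super key must hold for every possible row of the table, so a check on sample data can only rule a combination out, not prove it.

```python
def is_super_key(rows, attrs):
    """True if no two rows share the same values for the given attributes."""
    projected = [tuple(row[a] for a in attrs) for row in rows]
    return len(set(projected)) == len(projected)


# Hypothetical sample data for the Employee table.
employees = [
    {"Id": 1, "Name": "Ravi", "Email": "ravi@example.com", "City": "Pune", "Gender": "M"},
    {"Id": 2, "Name": "Meera", "Email": "meera@example.com", "City": "Kolkata", "Gender": "F"},
    {"Id": 3, "Name": "Ravi", "Email": "ravi.k@example.com", "City": "Kolkata", "Gender": "M"},
    {"Id": 4, "Name": "Ravi", "Email": "ravi.s@example.com", "City": "Kolkata", "Gender": "M"},
]

id_alone = is_super_key(employees, ["Id"])
id_with_extras = is_super_key(employees, ["Id", "Name", "Email"])
name_city_gender = is_super_key(employees, ["Name", "City", "Gender"])
```

Id alone passes, and so does any group that contains Id, while Name, City, and Gender fail even when taken together.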

What is a Candidate key?

A Candidate key is a single attribute or a group of attributes that uniquely identifies rows in a table.

Candidate keys are a subset of Super keys: a Candidate key is a minimal Super key, devoid of any attributes that are not needed for uniquely identifying tuples.

The value for the Candidate key is unique and non-null for all tuples. And every table has to have at least one Candidate key. But there can be more than one Candidate Key too.

For example, in the Employee table we looked at earlier, both Id and Email can act as Candidate keys for the table as they contain unique and non-null values.
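
Since a Candidate key is a Super key with nothing left to remove, one way to find them in sample data is to try attribute groups from smallest to largest and skip any group that already contains a smaller key. This is a brute-force sketch in plain Python over hypothetical rows. It is meant to illustrate the definition rather than to scale to real tables.

```python
from itertools import combinations


def is_super_key(rows, attrs):
    """True if no two rows share the same values for the given attributes."""
    projected = [tuple(row[a] for a in attrs) for row in rows]
    return len(set(projected)) == len(projected)


def candidate_keys(rows, attributes):
    """Return the minimal attribute groups that are unique across the sample rows."""
    found = []
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            # A group containing an already-found key is not minimal.
            if any(set(key) <= set(attrs) for key in found):
                continue
            if is_super_key(rows, attrs):
                found.append(attrs)
    return found


# Hypothetical sample data for the Employee table.
employees = [
    {"Id": 1, "Name": "Ravi", "Email": "ravi@example.com", "City": "Pune", "Gender": "M"},
    {"Id": 2, "Name": "Meera", "Email": "meera@example.com", "City": "Kolkata", "Gender": "F"},
    {"Id": 3, "Name": "Ravi", "Email": "ravi.k@example.com", "City": "Kolkata", "Gender": "M"},
    {"Id": 4, "Name": "Ravi", "Email": "ravi.s@example.com", "City": "Kolkata", "Gender": "M"},
]

keys = candidate_keys(employees, ["Id", "Name", "Email", "City", "Gender"])
```

For this sample, the search reports Id and Email as the only Candidate keys.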

On the other hand, we cannot use the attributes like City or Gender to retrieve tuples from the table as they have no unique values.

Querying the table on the Id attribute, on the other hand, retrieves unique tuples.

Primary Key in SQL

Primary key is the Candidate key selected by the database administrator to uniquely identify tuples in a table.

Out of all the Candidate keys that can be possible for a table, there can be only one key that will be used to retrieve unique tuples from the table. This Candidate key is called the Primary Key.

There can be only one Primary key for a table. Depending on how the Candidate Key is constructed the primary key can be a single attribute or a group of attributes. But the important point to remember is that the Primary key should be a unique and non-null attribute(s).

There are two ways to create a Primary key for the table. The first way is to alter an already created table to add the Primary key constraint on an attribute.

Now if I try to add a new row with duplicate Id value, it will give me an error message.

The second way of adding a Primary key is during the creation of the table itself. All you have to do is add the Primary Key constraint at the end after defining all the attributes in the table.
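
A minimal sketch of the creation-time form, assuming SQLite through Python's sqlite3 module and a hypothetical employee table. SQLite is used only because it ships with Python. It does not support adding a Primary key to an existing table with ALTER TABLE, so only the second way is shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE employee (
        id    INTEGER NOT NULL,
        name  TEXT,
        email TEXT,
        PRIMARY KEY (id)   -- constraint listed after the column definitions
    )
""")
conn.execute("INSERT INTO employee VALUES (1, 'Ravi', 'ravi@example.com')")

# A second row with the same id violates the constraint.
try:
    conn.execute("INSERT INTO employee VALUES (1, 'Meera', 'meera@example.com')")
    error_message = None
except sqlite3.IntegrityError as exc:
    error_message = str(exc)
```

After the failed insert, error_message holds the database's complaint and the table still contains a single row.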

To define a Primary Key constraint on multiple attributes, you can list all the attributes inside the parentheses of the constraint.

But remember that these attributes should be defined as non-null values otherwise the whole purpose of using the Primary key to identify tuples uniquely gets defeated.

Alternate or Secondary keys in SQL

Alternate keys are those candidate keys which are not the Primary key.

There can be only one Primary key for a table. Therefore all the remaining Candidate keys are known as Alternate or Secondary keys. They can also uniquely identify tuples in a table, but the database administrator chose a different key as the Primary key.

If we look at the Employee table once again, since I have chosen Id as the Primary key, the other Candidate Key (Email), becomes the Alternate key for the table.
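
One common way to keep an Alternate key enforced is a UNIQUE constraint combined with NOT NULL. The sketch below assumes SQLite through Python's sqlite3 module. The employee table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE employee (
        id    INTEGER PRIMARY KEY,     -- the chosen Primary key
        name  TEXT,
        email TEXT NOT NULL UNIQUE     -- Alternate key, still kept unique
    )
""")
conn.execute("INSERT INTO employee VALUES (1, 'Ravi', 'ravi@example.com')")
conn.execute("INSERT INTO employee VALUES (2, 'Meera', 'meera@example.com')")

# A new id with an email that is already taken is rejected.
try:
    conn.execute("INSERT INTO employee VALUES (3, 'Anil', 'ravi@example.com')")
    duplicate_email_accepted = True
except sqlite3.IntegrityError:
    duplicate_email_accepted = False

# The Alternate key can identify a row just as the Primary key can.
found = conn.execute(
    "SELECT id FROM employee WHERE email = ?", ("meera@example.com",)
).fetchall()
```

The lookup by email returns a single employee, exactly as a lookup by Id would.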

Foreign key in SQL

A Foreign key is an attribute in a host table that refers to the Primary key of its parent table.

A Foreign key generates a relationship between the parent table and the host table. For example, in addition to the Employee table containing the personal details of the employees, we might have another table Department containing information related to the department of the employee.

The Primary key in this table is the Department Id. We can add this attribute to the Employee table by making it the Foreign key in that table. We can either do this when we are creating the table or we can alter the table later to add the Foreign Key constraint. Here I have altered the table, but creating a Foreign Key during table creation is similar to that for a Primary Key.

Here, Dep_Id is now the Foreign Key in table Employee while it is a Primary Key in the Department table.
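
A sketch of this relationship, assuming SQLite through Python's sqlite3 module. The tables follow the Employee and Department example, but the column names other than dep_id and all of the rows are hypothetical. For brevity the constraint is declared when the table is created rather than added later. SQLite enforces Foreign keys only after the foreign_keys pragma is switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite

conn.execute(
    "CREATE TABLE department ("
    "dep_id INTEGER PRIMARY KEY, dep_name TEXT, location TEXT)"
)
conn.execute("""
    CREATE TABLE employee (
        id     INTEGER PRIMARY KEY,
        name   TEXT,
        dep_id INTEGER REFERENCES department (dep_id)  -- Foreign key
    )
""")

conn.execute("INSERT INTO department VALUES (10, 'Marketing', 'Kolkata')")
conn.execute("INSERT INTO employee VALUES (1, 'Ravi', 10)")
conn.execute("INSERT INTO employee VALUES (2, 'Meera', 10)")   # repeated value is fine
conn.execute("INSERT INTO employee VALUES (3, 'Anil', NULL)")  # NULL is fine too

# Department 99 does not exist, so this row is rejected.
try:
    conn.execute("INSERT INTO employee VALUES (4, 'Zoya', 99)")
    orphan_accepted = True
except sqlite3.IntegrityError:
    orphan_accepted = False
```

Unlike a Primary key, the Foreign key column accepts repeated and NULL values. What it refuses is a value with no matching row in the parent table.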

The Foreign key allows you to create a relationship between two tables in the database. Each of these tables describes data related to a particular entity (employee and department here). Using the Foreign key we can easily retrieve data from both the tables.

Note: To retrieve data across tables linked by Foreign keys, you need to know about Joins.

Using Foreign keys makes it easier to update the database when required. This is so because we only have to make the necessary changes in limited rows. For example, if the Marketing department shifts from Kolkata to Pune, instead of updating it for all the relevant rows in the Employee table, we can simply update the location in the Department table. This ensures that there are only a few places to update and less risk of having different data in different places.
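
The Marketing example can be sketched as follows, assuming SQLite through Python's sqlite3 module. The column names and rows are hypothetical. One UPDATE on the Department table is enough, and a join then shows the new location for every employee in that department.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute(
    "CREATE TABLE department ("
    "dep_id INTEGER PRIMARY KEY, dep_name TEXT, location TEXT)"
)
conn.execute(
    "CREATE TABLE employee ("
    "id INTEGER PRIMARY KEY, name TEXT, "
    "dep_id INTEGER REFERENCES department (dep_id))"
)
conn.execute("INSERT INTO department VALUES (10, 'Marketing', 'Kolkata')")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Ravi", 10), (2, "Meera", 10)],
)

# The location is stored once, so one row changes no matter how many
# employees work in Marketing.
cursor = conn.execute(
    "UPDATE department SET location = 'Pune' WHERE dep_name = 'Marketing'"
)
updated_rows = cursor.rowcount

# Joining on the Foreign key picks up the new location for each employee.
staff = conn.execute("""
    SELECT e.name, d.location
    FROM employee AS e
    JOIN department AS d ON d.dep_id = e.dep_id
    ORDER BY e.id
""").fetchall()
```

If the location were instead copied into every Employee row, the same move would require one change per employee, with the risk of missing some.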

What are Composite keys?

A Composite key is a Candidate key or Primary key that consists of more than one attribute.

Sometimes it is possible that no single attribute will have the property to uniquely identify tuples in a table. In such cases, we can use a group of attributes to guarantee uniqueness. Combining these attributes will uniquely identify tuples in the table.

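Consider a table that records which products were bought in each transaction, with attributes including Transaction_Id and Product_Id: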

Here, no single attribute contains unique values that identify the tuples on its own. Therefore, we can combine two or more attributes to create a key that can uniquely identify the tuples. For example, we can group Transaction_Id and Product_Id to create a key that can uniquely identify the tuples. Such a key is called a composite key.
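
A sketch of such a composite key, assuming SQLite through Python's sqlite3 module. The table name transaction_item, the quantity column, and the rows are hypothetical. Both key columns are declared NOT NULL so that every row carries a complete key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE transaction_item (
        transaction_id INTEGER NOT NULL,
        product_id     INTEGER NOT NULL,
        quantity       INTEGER,
        PRIMARY KEY (transaction_id, product_id)   -- composite key
    )
""")

# Each column repeats on its own, but every pair is distinct.
conn.executemany(
    "INSERT INTO transaction_item VALUES (?, ?, ?)",
    [(1, 100, 2), (1, 101, 1), (2, 100, 5)],
)

# Repeating an existing (transaction_id, product_id) pair is rejected.
try:
    conn.execute("INSERT INTO transaction_item VALUES (1, 100, 9)")
    duplicate_pair_accepted = True
except sqlite3.IntegrityError:
    duplicate_pair_accepted = False

row_count = conn.execute("SELECT COUNT(*) FROM transaction_item").fetchone()[0]
```

The first three rows are accepted even though transaction 1 and product 100 each appear twice, because uniqueness is judged on the pair.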

  • SQL keys are used to uniquely identify rows in a table.
  • SQL keys can either be a single column or a group of columns.
  • Super key is a single attribute or a group of attributes that can uniquely identify tuples in a table.
  • Super keys can contain redundant attributes that might not be important for identifying tuples.
  • Candidate keys are a subset of Super keys. They contain only those attributes which are required to uniquely identify tuples.
  • All Candidate keys are Super keys, but the converse is not true.
  • Primary key is a Candidate key chosen to uniquely identify tuples in the table.
  • Primary key values should be unique and non-null.
  • There can be multiple Super keys and Candidate keys in a table, but there can be only one Primary key in a table.
  • Alternate keys are those Candidate keys that were not chosen to be the Primary key of the table.
  • Composite key is a Candidate key that consists of more than one attribute.
  • Foreign key is an attribute which is a Primary key in its parent table but is included as an attribute in the host table.
  • Foreign keys may accept non-unique and null values.

Endnotes

In this article, we covered the most common and widely used keys that any professional looking to work with databases should know about.

I hope you enjoyed this article and do connect in the comments if you have any doubts about this topic!

Which is used to identify a row uniquely from database?

The primary key is the minimal set of attributes which uniquely identifies any row of a table.

What does a relational database use to uniquely identify each row in a table?

All tables in a relational database should have a primary key. The primary key is a column, or set of columns, that allows each row in the table to be uniquely identified. No two rows in a table with a primary key can have the same primary key value.

What is a unique field called in a database?

primary key: A field that uniquely identifies a record in a table.

Which key is used to uniquely identify each row of a relation?

In a relational database, a candidate key uniquely identifies each row of data values in a database table. A candidate key comprises a single column or a set of columns in a single database table.