Relationships between tables in a Data Model
Applies to: Excel for Microsoft 365, Excel 2021, Excel 2019, Excel 2016, Excel 2013
Add more power to your data analysis by creating relationships among different tables. A relationship is a connection between two tables that contain data: one column in each table is the basis for the relationship. To see why relationships are useful, imagine that you track data for customer orders in your business. You could track all the data in a single table having a structure like this:
| CustomerID | Name | Email | DiscountRate | OrderID | OrderDate | Product | Quantity |
|---|---|---|---|---|---|---|---|
| 1 | Ashton | chris.ashton@contoso.com | .05 | 256 | 2010-01-07 | Compact Digital | 11 |
| 1 | Ashton | chris.ashton@contoso.com | .05 | 255 | 2010-01-03 | SLR Camera | 15 |
| 2 | Jaworski | michal.jaworski@contoso.com | .10 | 254 | 2010-01-03 | Budget Movie-Maker | 27 |
This approach can work, but it involves storing a lot of redundant data, such as the customer e-mail address for every order. Storage is cheap, but if the e-mail address changes you have to make sure you update every row for that customer. One solution to this problem is to split the data into multiple tables and define relationships between those tables. This is the approach used in relational databases like SQL Server. For example, a database that you import might represent order data by using three related tables:
Customers

| [CustomerID] | Name | Email |
|---|---|---|
| 1 | Ashton | chris.ashton@contoso.com |
| 2 | Jaworski | michal.jaworski@contoso.com |
CustomerDiscounts

| [CustomerID] | DiscountRate |
|---|---|
| 1 | .05 |
| 2 | .10 |
Orders

| [CustomerID] | OrderID | OrderDate | Product | Quantity |
|---|---|---|---|---|
| 1 | 256 | 2010-01-07 | Compact Digital | 11 |
| 1 | 255 | 2010-01-03 | SLR Camera | 15 |
| 2 | 254 | 2010-01-03 | Budget Movie-Maker | 27 |
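To make the split concrete, here is a minimal Python sketch that turns the single flat table into the three related tables. It only illustrates the idea and is not how Excel or Power Pivot imports data. The `Email` column name and the `split_tables` helper are assumptions made for the example.

```python
# Flat order rows copied from the single-table example.
# "Email" is an assumed column name; split_tables is a hypothetical helper.
flat_rows = [
    {"CustomerID": 1, "Name": "Ashton", "Email": "chris.ashton@contoso.com",
     "DiscountRate": 0.05, "OrderID": 256, "OrderDate": "2010-01-07",
     "Product": "Compact Digital", "Quantity": 11},
    {"CustomerID": 1, "Name": "Ashton", "Email": "chris.ashton@contoso.com",
     "DiscountRate": 0.05, "OrderID": 255, "OrderDate": "2010-01-03",
     "Product": "SLR Camera", "Quantity": 15},
    {"CustomerID": 2, "Name": "Jaworski", "Email": "michal.jaworski@contoso.com",
     "DiscountRate": 0.10, "OrderID": 254, "OrderDate": "2010-01-03",
     "Product": "Budget Movie-Maker", "Quantity": 27},
]

ORDER_COLUMNS = ("CustomerID", "OrderID", "OrderDate", "Product", "Quantity")


def split_tables(rows):
    """Split flat rows into Customers, CustomerDiscounts, and Orders."""
    customers, discounts, orders = {}, {}, []
    for row in rows:
        cid = row["CustomerID"]
        # Customer attributes are stored once per CustomerID.
        customers[cid] = {"CustomerID": cid, "Name": row["Name"],
                          "Email": row["Email"]}
        discounts[cid] = {"CustomerID": cid,
                          "DiscountRate": row["DiscountRate"]}
        # Each order keeps CustomerID as the link back to Customers.
        orders.append({col: row[col] for col in ORDER_COLUMNS})
    return list(customers.values()), list(discounts.values()), orders


customers, discounts, orders = split_tables(flat_rows)
```

After the split, each e-mail address is stored in exactly one row, so a change of address touches one place only.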
Relationships exist within a Data Model—one that you explicitly create, or one that Excel automatically creates on your behalf when you simultaneously import multiple tables. You can also use the Power Pivot add-in to create or manage the model. See Create a Data Model in Excel for details.
If you use the Power Pivot add-in to import tables from the same database, Power Pivot can detect the relationships between the tables based on the columns that are in [brackets], and can reproduce these relationships in a Data Model that it builds behind the scenes. For more information, see Automatic Detection and Inference of Relationships in this article. If you import tables from multiple sources, you can manually create relationships as described in Create a relationship between two tables.
Relationships are based on columns in each table that contain the same data. For example, you could relate a Customers table with an Orders table if each contains a column that stores a Customer ID. In the example, the column names are the same, but this is not a requirement. One could be CustomerID and another CustomerNumber, as long as all of the rows in the Orders table contain an ID that is also stored in the Customers table.
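The condition in the last sentence can be checked before you define a relationship. The sketch below is a plain Python illustration, not an Excel feature: `unmatched_keys` is a hypothetical helper, and the order with ID 257 for customer 3 is invented to show a failing case.

```python
def unmatched_keys(source_rows, source_col, lookup_rows, lookup_col):
    """Return source key values that have no match in the lookup column."""
    lookup_values = {row[lookup_col] for row in lookup_rows}
    source_values = {row[source_col] for row in source_rows}
    return sorted(source_values - lookup_values)


# The two key columns deliberately have different names.
customers = [
    {"CustomerNumber": 1, "Name": "Ashton"},
    {"CustomerNumber": 2, "Name": "Jaworski"},
]
orders = [
    {"CustomerID": 1, "OrderID": 256},
    {"CustomerID": 1, "OrderID": 255},
    {"CustomerID": 2, "OrderID": 254},
]

missing = unmatched_keys(orders, "CustomerID", customers, "CustomerNumber")

# Hypothetical extra order for a customer that is not in Customers.
orders_with_orphan = orders + [{"CustomerID": 3, "OrderID": 257}]
missing_after = unmatched_keys(orders_with_orphan, "CustomerID",
                               customers, "CustomerNumber")
```

An empty result means every order points at a known customer; any value in the list is an ID that would need to be added to the Customers table first.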
In a relational database, there are several types of keys. A key is typically a column with special properties. Understanding the purpose of each key can help you manage a multi-table Data Model that provides data to a PivotTable, PivotChart, or Power View report.
Though there are many types of keys, these are the most important for our purpose here:
- Primary key: uniquely identifies a row in a table, such as CustomerID in the Customers table.
- Alternate key (or candidate key): a column other than the primary key that is unique. For example, an Employees table might store an employee ID and a social security number, both of which are unique.
- Foreign key: a column that refers to a unique column in another table, such as CustomerID in the Orders table, which refers to CustomerID in the Customers table.
In a Data Model, the primary key or alternate key is referred to as the related column; this article also calls it the lookup column. If a table has both a primary and alternate key, you can use either one as the basis of a table relationship. The foreign key is referred to as the source column or just the column. In our example, a relationship would be defined between CustomerID in the Orders table (the column) and CustomerID in the Customers table (the lookup column). If you import data from a relational database, by default Excel chooses the foreign key from one table and the corresponding primary key from the other table. However, you can use any column that has unique values for the lookup column.
The relationship between a customer and an order is a one-to-many relationship. Every customer can have multiple orders, but an order can’t have multiple customers. Another important table relationship is one-to-one. In our example here, the CustomerDiscounts table, which defines a single discount rate for each customer, has a one-to-one relationship with the Customers table.
This table shows the relationships between the three tables (Customers, CustomerDiscounts, and Orders):
| Relationship | Type | Lookup Column | Column |
|---|---|---|---|
| Customers-CustomerDiscounts | one-to-one | Customers.CustomerID | CustomerDiscounts.CustomerID |
| Customers-Orders | one-to-many | Customers.CustomerID | Orders.CustomerID |
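The Type column can be illustrated by counting how often each key occurs on the source side. This is a rough sketch based on the data currently in the columns, not a description of how Excel classifies relationships; `relationship_type` is a hypothetical helper.

```python
from collections import Counter


def relationship_type(lookup_values, source_values):
    """Classify a relationship from the values currently in both columns."""
    if len(set(lookup_values)) != len(lookup_values):
        raise ValueError("lookup column must contain unique values")
    counts = Counter(source_values)
    # One-to-one: no key appears more than once on the source side.
    if all(count == 1 for count in counts.values()):
        return "one-to-one"
    return "one-to-many"


customer_ids = [1, 2]          # Customers.CustomerID
discount_ids = [1, 2]          # CustomerDiscounts.CustomerID
order_customer_ids = [1, 1, 2]  # Orders.CustomerID

discount_type = relationship_type(customer_ids, discount_ids)
order_type = relationship_type(customer_ids, order_customer_ids)
```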
Note: Many-to-many relationships are not supported in a Data Model. An example of a many-to-many relationship is a direct relationship between Products and Customers, in which a customer can buy many products and the same product can be bought by many customers.
After any relationship has been created, Excel must typically recalculate any formulas that use columns from tables in the newly created relationship. Processing can take some time, depending on the amount of data and the complexity of the relationships. For more details, see Recalculate Formulas.
A Data Model can have multiple relationships between two tables. To build accurate calculations, Excel needs a single path from one table to the next. Therefore, only one relationship between each pair of tables is active at a time. Though the others are inactive, you can specify an inactive relationship in formulas and queries.
In Diagram View, the active relationship is a solid line and the inactive ones are dashed lines. For example, in AdventureWorksDW2012, the table DimDate contains a column, DateKey, that is related to three different columns in the table FactInternetSales: OrderDate, DueDate, and ShipDate. If the active relationship is between DateKey and OrderDate, that is the default relationship in formulas unless you specify otherwise.
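The effect of an active relationship can be pictured with a small Python sketch: totals are grouped by the active date column unless another one is named explicitly. In DAX, the USERELATIONSHIP function plays that role. Everything here is illustrative: the `SalesAmount` column, the sample rows, and the `total_by_date` helper are assumptions, not part of AdventureWorksDW2012 as described above.

```python
from collections import defaultdict

# One entry per relationship to DateKey; True marks the active one.
RELATIONSHIPS = {"OrderDate": True, "DueDate": False, "ShipDate": False}

# Hypothetical sales rows; SalesAmount is an assumed column.
sales = [
    {"OrderDate": "2010-01-03", "DueDate": "2010-01-10",
     "ShipDate": "2010-01-05", "SalesAmount": 100.0},
    {"OrderDate": "2010-01-03", "DueDate": "2010-01-12",
     "ShipDate": "2010-01-06", "SalesAmount": 50.0},
    {"OrderDate": "2010-01-07", "DueDate": "2010-01-14",
     "ShipDate": "2010-01-09", "SalesAmount": 25.0},
]


def total_by_date(rows, relationships, use=None):
    """Sum SalesAmount per date via the active column, or via `use`."""
    if use is None:
        active = [col for col, is_active in relationships.items() if is_active]
        if len(active) != 1:
            raise ValueError("exactly one relationship must be active")
        use = active[0]
    elif use not in relationships:
        raise KeyError(use)
    totals = defaultdict(float)
    for row in rows:
        totals[row[use]] += row["SalesAmount"]
    return dict(totals)


by_order_date = total_by_date(sales, RELATIONSHIPS)
by_ship_date = total_by_date(sales, RELATIONSHIPS, use="ShipDate")
```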
A relationship can be created when the following requirements are met:
| Criteria | Description |
|---|---|
| Unique Identifier for Each Table | Each table must have a single column that uniquely identifies each row in that table. This column is often referred to as the primary key. |
| Unique Lookup Columns | The data values in the lookup column must be unique. In other words, the column can’t contain duplicates. In a Data Model, nulls and empty strings are equivalent to a blank, which is a distinct data value. This means that you can’t have multiple nulls in the lookup column. |
| Compatible Data Types | The data types in the source column and lookup column must be compatible. For more information about data types, see Data types supported in Data Models. |
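The uniqueness and data type requirements can be pre-checked outside Excel. The following Python sketch treats `None` and the empty string as the same blank value, as described above. Both helpers are hypothetical, and `compatible_types` is only a crude stand-in for the Data Model's actual type compatibility rules.

```python
BLANK = "<blank>"  # stand-in for the single blank value


def duplicate_lookup_values(values):
    """Return lookup values that occur more than once, blanks included."""
    seen, duplicates = set(), []
    for value in values:
        # None and "" both count as the same blank value.
        key = BLANK if value is None or value == "" else value
        if key in seen and key not in duplicates:
            duplicates.append(key)
        seen.add(key)
    return duplicates


def compatible_types(source_values, lookup_values):
    """Crude check: both columns hold one Python type, and it is the same."""
    source_types = {type(v) for v in source_values}
    lookup_types = {type(v) for v in lookup_values}
    return len(source_types) == 1 and source_types == lookup_types


clean = duplicate_lookup_values([1, 2, 3])
two_blanks = duplicate_lookup_values([1, None, ""])
repeated = duplicate_lookup_values([1, 1, 2])
```

A lookup column passes only if the returned list is empty; two blanks fail just as two identical IDs do.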
In a Data Model, you cannot create a table relationship if the key is a composite key. You’re also restricted to creating one-to-one and one-to-many relationships. Other relationship types are not supported.
Composite Keys and Lookup Columns
A composite key is composed of more than one column. Data Models can’t use composite keys: a table must always have exactly one column that uniquely identifies each row in the table. If you import tables that have an existing relationship based on a composite key, the Table Import Wizard in Power Pivot will ignore that relationship because it can’t be created in the model.
To create a relationship between two tables that have multiple columns defining the primary and foreign keys, first combine the values to create a single key column before creating the relationship. You can do this before you import the data, or by creating a calculated column in the Data Model using the Power Pivot add-in.
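If you combine the values before import, the step might look like the Python sketch below. The `order_lines` table, its column names, and the `add_combined_key` helper are hypothetical. The separator is a design choice: without it, the pairs (1, 11) and (11, 1) would both produce "111".

```python
def add_combined_key(rows, key_columns, new_column, sep="|"):
    """Return copies of rows with one extra column built from key_columns."""
    return [
        {**row, new_column: sep.join(str(row[col]) for col in key_columns)}
        for row in rows
    ]


# Hypothetical table whose key is the pair (OrderID, LineNumber).
order_lines = [
    {"OrderID": 256, "LineNumber": 1, "Product": "Compact Digital"},
    {"OrderID": 256, "LineNumber": 2, "Product": "SLR Camera"},
    {"OrderID": 255, "LineNumber": 1, "Product": "SLR Camera"},
]

keyed = add_combined_key(order_lines, ["OrderID", "LineNumber"], "OrderLineKey")
combined = [row["OrderLineKey"] for row in keyed]
```

Apply the same function with the same separator to the table on the other side of the relationship so that both single-column keys match.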
Many-to-Many Relationships
A Data Model cannot have many-to-many relationships. You can’t simply add junction tables in the model. However, you can use DAX functions to model many-to-many relationships.
Self-Joins and Loops
Self-joins are not permitted in a Data Model. A self-join is a recursive relationship between a table and itself. Self-joins are often used to define parent-child hierarchies. For example, you could join an Employees table to itself to produce a hierarchy that shows the management chain at a business.
Excel does not allow loops to be created among relationships in a workbook. In other words, the following set of relationships is prohibited.
Table 1, column a to Table 2, column f
Table 2, column f to Table 3, column n
Table 3, column n to Table 1, column a
If you try to create a relationship that would result in a loop being created, an error is generated.
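The loop rule can be sketched with a union-find structure over table names. This is an illustration under simplifying assumptions, not Excel's actual check: relationships are treated as undirected, and a second relationship between the same pair of tables (which a Data Model allows as an inactive relationship) is also reported as a loop. `creates_loop` is a hypothetical helper.

```python
def creates_loop(existing, new_relationship):
    """Return True if adding new_relationship would close a loop of tables."""
    parent = {}

    def find(table):
        # Follow parent links to the representative of the table's group.
        parent.setdefault(table, table)
        while parent[table] != table:
            parent[table] = parent[parent[table]]  # path halving
            table = parent[table]
        return table

    # Merge the groups joined by each existing relationship.
    for left, right in existing:
        parent[find(left)] = find(right)

    left, right = new_relationship
    # Already connected tables would gain a second path, that is, a loop.
    return find(left) == find(right)


existing = [("Table 1", "Table 2"), ("Table 2", "Table 3")]
closes_loop = creates_loop(existing, ("Table 3", "Table 1"))
adds_new_table = creates_loop(existing, ("Table 3", "Table 4"))
```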
One of the advantages to importing data using the Power Pivot add-in is that Power Pivot can sometimes detect relationships and create new relationships in the Data Model it creates in Excel.
When you import multiple tables, Power Pivot automatically detects any existing relationships among the tables. Also, when you create a PivotTable, Power Pivot analyzes the data in the tables. It detects possible relationships that have not been defined, and suggests appropriate columns to include in those relationships.
The detection algorithm uses statistical data about the values and metadata of columns to make inferences about the probability of relationships.
- Data types in all related columns should be compatible. For automatic detection, only whole number and text data types are supported. For more information about data types, see Data types supported in Data Models.
- For the relationship to be successfully detected, the lookup column must contain every key value that appears in the table on the many side. In other words, the key column on the many side of the relationship must not contain any values that are not in the key column of the lookup table. For example, suppose you have a table that lists products with their IDs (the lookup table) and a sales table that lists sales for each product (the many side of the relationship). If your sales records contain the ID of a product that does not have a corresponding ID in the Products table, the relationship can’t be automatically created, but you might be able to create it manually. To have Excel detect the relationship, you need to first update the Product lookup table with the IDs of the missing products.
- Make sure the name of the key column on the many side is similar to the name of the key column in the lookup table. The names do not need to be exactly the same. For example, in a business setting, you often have variations on the names of columns that contain essentially the same data: Emp ID, EmployeeID, Employee ID, EMP_ID, and so on. The algorithm detects similar names and assigns a higher probability to those columns that have similar or exactly matching names. Therefore, to increase the probability of creating a relationship, you can try renaming the columns in the data that you import to something similar to columns in your existing tables. If Excel finds multiple possible relationships, then it does not create a relationship.
This information might help you understand why not all relationships are detected, or how changes in metadata—such as field name and the data types—could improve the results of automatic relationship detection. For more information, see Troubleshoot Relationships.
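To get a feel for how such criteria could interact, the sketch below scores a candidate column pair in Python. It is not Power Pivot's algorithm, whose details are not given here. The scoring scheme, the use of `difflib.SequenceMatcher`, and all function names are assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def normalize(name):
    """Lowercase a column name and drop spaces, underscores, and punctuation."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def name_similarity(a, b):
    """Return a 0..1 similarity between two normalized column names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def candidate_score(many_values, lookup_values, many_name, lookup_name):
    """Return 0.0 if a hard requirement fails, else the name similarity."""
    lookup_set = set(lookup_values)
    if len(lookup_set) != len(lookup_values):
        return 0.0  # lookup column contains duplicates
    if not set(many_values) <= lookup_set:
        return 0.0  # many side has keys missing from the lookup table
    return name_similarity(many_name, lookup_name)


good = candidate_score([1, 1, 2], [1, 2, 3], "ProductID", "Product ID")
orphaned = candidate_score([1, 4], [1, 2, 3], "ProductID", "Product ID")
partial = name_similarity("EmployeeID", "Emp ID")
```

In this sketch, name variants such as Emp ID and EMP_ID normalize to the same string and score 1.0, while a sales row with an unknown product ID drops the score to zero regardless of the names.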
Automatic Detection for Named Sets
Relationships are not automatically detected between Named Sets and related fields in a PivotTable. You can create these relationships manually. If you want to use automatic relationship detection, remove each Named Set and add the individual fields from the Named Set directly to the PivotTable.
Inference of Relationships
In some cases, relationships between tables are automatically chained. For example, if you create a relationship between the first two sets of tables below, a relationship is inferred to exist between the other two tables, and a relationship is automatically established.
Products and Category — created manually
Category and SubCategory — created manually
Products and SubCategory — relationship is inferred
In order for relationships to be automatically chained, the relationships must go in one direction, as shown above. If the initial relationships were between, for example, Sales and Products, and Sales and Customers, a relationship is not inferred. This is because the relationship between Products and Customers is a many-to-many relationship.
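The chaining rule can be sketched as reachability along relationships that all point the same way. In the Python illustration below, each pair is written as (many-side table, lookup table); that direction convention and the `inferred_relationships` helper are assumptions made for the example, not Excel's implementation.

```python
def inferred_relationships(relationships):
    """Given (many_table, lookup_table) pairs, return pairs implied by chaining."""
    direct = set(relationships)
    lookups = {}
    for many, lookup in direct:
        lookups.setdefault(many, set()).add(lookup)

    inferred = set()
    for start in lookups:
        # Walk forward from start, following relationships in one direction.
        stack = list(lookups[start])
        reached = set()
        while stack:
            table = stack.pop()
            if table in reached:
                continue
            reached.add(table)
            stack.extend(lookups.get(table, ()))
        inferred |= {
            (start, table) for table in reached
            if table != start and (start, table) not in direct
        }
    return inferred


chained = inferred_relationships(
    [("Products", "Category"), ("Category", "SubCategory")]
)
not_chained = inferred_relationships(
    [("Sales", "Products"), ("Sales", "Customers")]
)
```

In the second call both relationships start at Sales, so there is no one-directional path between Products and Customers and nothing is inferred.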
PRIMARY KEY constraint and AUTO INCREMENT in an Excel file database
I’m using an Excel file as a database, and I would like to set the ID column as the primary key and auto-increment the value of this column when a new record is inserted into the database.
You can’t. Excel is not a database management system.
Possible alternative
You can easily copy & paste to/from Excel, or even set up scheduled imports, using MS Access, which ships with most if not all MS Office versions for Windows. This will offer you the DBMS-specific features you are looking for.
Just create a table in Access, leave the first column ID as-is then paste/import data in the other columns. The ID column already has PK and auto-increment by default.
You could also use a more "complete" free DBMS, such as MS SQL Server Express, MySQL or PostgreSQL.
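To show what those DBMS features look like in practice, here is a small sketch using SQLite through Python's built-in `sqlite3` module. SQLite is not one of the systems named above; it is used only because it needs no server. The `records` table and its columns are made up for the example.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE records ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "name TEXT NOT NULL)"
)

# No id is supplied: the database assigns 1, 2, ... on its own.
conn.executemany(
    "INSERT INTO records (name) VALUES (?)",
    [("Ashton",), ("Jaworski",)],
)
rows = conn.execute("SELECT id, name FROM records ORDER BY id").fetchall()

# Reusing an existing id violates the primary key and is rejected.
try:
    conn.execute("INSERT INTO records (id, name) VALUES (1, 'Duplicate')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```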
As per the comments above, Excel isn’t the tool for this, but if you wanted to, you could add a formula to do it.
Following are the steps to create a PRIMARY KEY using Excel custom validation to prevent duplicate entries in an Excel column.
1) Select an entire empty column (a column with no input).
2) Go to DATA >> DATA VALIDATION
3) Choose CUSTOM from the Allow drop-down list
4) Paste =COUNTIF($A:$A,A1)=1 in the Formula field.
Now your column is set up as a primary key: every cell in column A, starting from A1, accepts only unique values.
If you then try to enter a value that already exists in the column, Excel rejects the entry with a validation error.
EDIT: To select the whole column, click the column header (the letter at the top of the column), then follow the steps above from step 2.
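For anyone who wants to see the logic of that validation rule outside Excel, here is a rough Python equivalent. It only approximates COUNTIF: text is compared case-insensitively, as COUNTIF does, but wildcard handling and other matching subtleties are ignored. `allows_entry` is a made-up helper name.

```python
def _normalize(value):
    """Compare text case-insensitively; leave other values unchanged."""
    return value.casefold() if isinstance(value, str) else value


def allows_entry(existing_values, new_value):
    """Accept new_value only if the column would then contain it exactly once."""
    candidate = _normalize(new_value)
    already_there = sum(
        1 for value in existing_values if _normalize(value) == candidate
    )
    return already_there == 0


column_a = ["A100", "A101"]
accepts_new = allows_entry(column_a, "A102")
rejects_duplicate = not allows_entry(column_a, "a100")
```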
How To Create Primary Key In Excel?
How do you create a primary key?
- In Object Explorer, right-click the table to which you want to add a unique constraint, and click Design.
- In Table Designer, click the row selector for the database column you want to define as the primary key. …
- Right-click the row selector for the column and select Set Primary Key.
What is the primary key in Excel?
A primary key column is a column with a unique value for each row. Its purpose is to bind data together across tables without repeating all of the data in every table. For example, if the “EMP_ID” column in an “Employee Data” sheet is the primary key, no two rows can have the same employee ID.
What is a primary key example?
A primary key is a column, or a group of columns, in a table that uniquely identifies the rows in that table. For example, in a customer table, a CustomerNo column that holds the ID number assigned to each customer can serve as the primary key.
Can a table have 2 primary keys?
A table can have only ONE primary key; and in the table, this primary key can consist of single or multiple columns (fields).
What is the primary key?
A primary key is the column or columns that contain values that uniquely identify each row in a table. A database table must have a primary key for Optim to insert, update, restore, or delete data from a database table.
What is a primary key in Access?
The primary key in Microsoft Access is a field or set of fields with unique values throughout the table. It has several characteristics: it uniquely identifies each row, it always contains a value (it is never empty), and its value never changes.
Which is the best example of a primary key?
- an automatically generated number.
- social security number.
- an email address (but only if two users can’t share the same email address)
- vehicle identification number.
- driver licence number.
- some other special code that is unique to each record.
Where is the primary key in Access?
- In the Navigation Pane, right click a table, and select Design View.
- Select the field or fields you want to use as the primary key.
- Select Design > Primary Key.
What is a secondary key? Explain with an example.
A primary key is the field in a database that is used to uniquely identify a record. A secondary key is an additional key, or alternate key, that can be used in addition to the primary key to locate specific data.
How do I make multiple primary keys?
A table can have only one primary key, which may consist of single or multiple fields. When multiple fields are used as a primary key, they are called a composite key. If a table has a primary key defined on any field(s), then you cannot have two records having the same value of that field(s).
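A composite key is easy to try out with SQLite via Python's built-in `sqlite3` module. This is a sketch for illustration only: SQLite is an assumed choice of database, and the `order_lines` table is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE order_lines (
        order_id    INTEGER NOT NULL,
        line_number INTEGER NOT NULL,
        product     TEXT,
        PRIMARY KEY (order_id, line_number)  -- one key, two fields
    )
    """
)

conn.execute("INSERT INTO order_lines VALUES (256, 1, 'Compact Digital')")
# Same order_id but a new line_number: the pair is still unique.
conn.execute("INSERT INTO order_lines VALUES (256, 2, 'SLR Camera')")

# Repeating the full pair (256, 1) violates the composite key.
try:
    conn.execute("INSERT INTO order_lines VALUES (256, 1, 'Budget Movie-Maker')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM order_lines").fetchone()[0]
```

Neither field has to be unique on its own; only the combination does.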
How do I make two primary keys in Access?
To select more than one field to create a composite key, hold down CTRL and then click the row selector for each field. On the Design tab, in the Tools group, click Primary Key. A key indicator is added to the left of the field or fields that you specify as the primary key.
Can a table have 3 primary keys?
A table can only ever have one primary key. It is not possible to create a table with two different primary keys. You can create a table with two different unique indexes (which are much like a primary key) but only one primary key can exist.
Why is a primary key required?
Without the primary key and closely related foreign key concepts, relational databases would not work. In fact, since a table can easily contain thousands of records (including duplicates), a primary key is necessary to ensure that a table record can always be uniquely identified.
What is a primary key class 10?
Answer: A field which uniquely identifies each record in a table is known as primary key.
Which attribute would you designate as the primary key?
To qualify as a primary key for an entity, an attribute must have the following properties: It must have a non-null value for each instance of the entity. The value must be unique for each instance of an entity. The values must not change or become null during the life of each entity instance.
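The first two properties (non-null and unique) can be checked mechanically. The following is a minimal Python sketch, not a feature of Excel or Access. The function name `is_valid_primary_key`, the sample records, and the field names `OrderID` and `LineNo` are all hypothetical. Passing more than one field checks a composite key. The third property (values never change) cannot be verified from a single snapshot of the data, so it is not covered here.

```python
from typing import Iterable, Mapping, Sequence


def is_valid_primary_key(rows: Iterable[Mapping[str, object]],
                         key_fields: Sequence[str]) -> bool:
    """Return True if key_fields are non-null and unique across all rows."""
    seen = set()
    for row in rows:
        # Build the key as a tuple so that composite keys work too.
        key = tuple(row.get(field) for field in key_fields)
        # Property 1: no part of the key may be null (None or empty string).
        if any(part is None or part == "" for part in key):
            return False
        # Property 2: no two records may share the same key value.
        if key in seen:
            return False
        seen.add(key)
    return True


# Hypothetical sample records, for illustration only.
orders = [
    {"OrderID": 1, "LineNo": 1, "Product": "A"},
    {"OrderID": 1, "LineNo": 2, "Product": "B"},
    {"OrderID": 2, "LineNo": 1, "Product": "A"},
]

print(is_valid_primary_key(orders, ["OrderID"]))            # False (1 repeats)
print(is_valid_primary_key(orders, ["OrderID", "LineNo"]))  # True (composite key)
```

In this sample, OrderID alone fails because the value 1 appears twice, while the combination of OrderID and LineNo is unique and could serve as a composite key.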
Primary and Foreign keys are a very simple relational database concept. Until very recently Excel users didn’t have to worry about relationships because Excel didn’t support them.
This all changed when Microsoft added Power Pivot (an OLAP data modeling tool) and Get & Transform (a powerful Extract Transform and Load tool) to Excel. Suddenly Excel users had to understand relational database theory to make use of the new tools.
If you are an Excel user and want to understand how primary and foreign keys fit into the big picture of creating an OLAP solution using Power Pivot, you should read my Excel Power Pivot 2-minute overview article first. But if you are a database designer struggling with relational concepts (but with no interest in Excel), you’ll also get a lot out of this short article.
The article below is an unedited lesson from one of the 35 short focused lessons in my Excel Expert Skills book/e-book that enables students to master the Get & Transform (previously called Power Query) tool that is now included in all current Excel versions.
Leave me a comment if you find the article useful.
- Excel 365
- Excel 2019
- Excel 2016
Last Updated: September 6, 2020
note
Virtually all of the world’s databases work like this
Dr Edgar F Codd (1923-2003) invented the relational database when working for IBM in the 1970s.
The first relational database products came to market in the late 1970s and were quickly adopted by big business.
So good was Codd’s design that nothing better has been developed in over thirty years.
The entire world of commerce is now powered by relational databases.
You’ll usually find a requirement to analyze data from a relational database in all but the very smallest enterprises.
note
Primary and foreign keys should have the same name
Any database designer worthy of the name will use the same name for the primary key column and the related foreign key column.
I’ve found that, in the real world of business, I often need to work with a badly designed database where this is not the case. This makes the data a lot more difficult to work with.
anecdote
How meaningful primary keys almost stopped the Welsh from buying cars
Many years ago, I implemented a Europe-wide Business Intelligence solution for a very large automotive finance company.
I was not pleased (but unsurprised) to find that the database designer had used meaningful primary and foreign keys.
The designer had used a concatenation of the customer’s last name and date of birth as the primary key for the customer table.
The designer believed that the possibility of having two customers with the same last name and date of birth was extremely unlikely to ever happen.
In fact, it transpired that 13.5% of the Welsh population have a last name of Jones, meaning that it was certain to happen in Wales (and actually very likely to happen everywhere).
The database was, of course, re-designed to use meaningless primary keys.
Lesson 11-26: Understand primary and foreign keys
A set of tables that are related to each other are referred to as a relational database. Some of Get & Transform’s most useful features require an understanding of basic relational database theory.
What is a primary key?
A relational database consists of several tables, each containing data. A database table is conceptually no different to an Excel table except that each row must have a unique primary key.
Here’s an example of a table from a relational database:
The above example comes from a database table called Category. It is good relational database design practice to name a primary key as the table name plus the letters ID. This designer has followed best practice and called the primary key CategoryID.
The only important quality of a primary key is that all primary key values must be different (unique) within the primary key column. That’s because the primary key is used (by a relationship) to identify a single row in a table. If there were two table rows with the same primary key, a relationship wouldn’t be able to correctly identify which single row was being referenced.
In the above example the primary key is a number but primary keys can consist of numbers, letters or both.
What is a foreign key?
Here is an example of two related tables from a relational database:
You can see how it is possible to determine that Aniseed Syrup is in the Condiments category:
- In the Product table, Aniseed Syrup has a CategoryID of 2.
- The item in the Category table with a primary key (CategoryID) of 2 is Condiments.
While CategoryID is the primary key column in the Category table, it is a foreign key column within the Product table.
You can think of the CategoryID values in the Product table as belonging to the Category table, making them foreign keys in the Product table.
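The lookup described above can be sketched in a few lines of Python. This is only an illustration of how a foreign key is resolved, not how Excel or Power Pivot implements relationships. Only the Aniseed Syrup row and the Condiments category (CategoryID 2) come from the text. The other rows, the ProductID values, and the function name `category_of` are hypothetical.

```python
# Category table: primary key (CategoryID) -> CategoryName.
# Only CategoryID 2 (Condiments) comes from the lesson; 1 is hypothetical.
category = {
    1: "Beverages",
    2: "Condiments",
}

# Product table: CategoryID is a foreign key pointing into the Category table.
# The ProductID values and the Chai row are hypothetical.
product = [
    {"ProductID": 1, "ProductName": "Chai", "CategoryID": 1},
    {"ProductID": 2, "ProductName": "Aniseed Syrup", "CategoryID": 2},
]


def category_of(product_name: str) -> str:
    """Follow the foreign key from a product row to its category name."""
    for row in product:
        if row["ProductName"] == product_name:
            # The foreign key value identifies exactly one Category row,
            # because primary key values are unique.
            return category[row["CategoryID"]]
    raise KeyError(product_name)


print(category_of("Aniseed Syrup"))  # Condiments
```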
You can probably now see the wisdom of the naming convention for primary keys. It makes it possible to spot which are the primary and foreign keys in a table at a glance.
- The column named with the table name plus ID is the primary key.
- Any other column name that is suffixed with ID (has ID at the end of the name) is a foreign key.
- Any column not suffixed with ID is a regular data field containing information.
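As a rough illustration, the three naming rules above can be written as a small Python function. The function name `classify_columns` and the ProductID and ProductName column names are assumptions made for this sketch. It only works for tables that actually follow the convention.

```python
def classify_columns(table_name: str, columns: list[str]) -> dict[str, str]:
    """Label each column using the 'table name plus ID' naming convention."""
    roles = {}
    for col in columns:
        if col == f"{table_name}ID":
            # Rule 1: table name plus ID is the primary key.
            roles[col] = "primary key"
        elif col.endswith("ID"):
            # Rule 2: any other column ending in ID is a foreign key.
            roles[col] = "foreign key"
        else:
            # Rule 3: everything else is a regular data field.
            roles[col] = "data"
    return roles


print(classify_columns("Product", ["ProductID", "ProductName", "CategoryID"]))
# {'ProductID': 'primary key', 'ProductName': 'data', 'CategoryID': 'foreign key'}
```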
Meaningful and meaningless primary keys
In the above example, the primary key is a meaningless number. The number 2 tells you nothing about any attribute of the Condiments category. It simply provides a way to find where the correct Condiments row is located within the table.
It is also possible to use a meaningful primary key (though a professional database designer would never do this). For example, you could argue that because category names are unique, it is fine to use the CategoryName column in the above table as the primary key.
If you decided to use the category name as the primary key the tables would look like this:
From a database design perspective, this is generally a bad idea.
You’ll often find, however, that you need to create this type of relationship when creating relationships between tables that originate from different databases. In this case, no formal primary key/foreign key relationship will exist.
For example, you may need to join a table from your corporate database containing customer addresses with a table detailing sales tax rates by state that you’ve downloaded from the Internet. In this case you would need to create a primary key/foreign key relationship between the state columns in each table.
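A join on a meaningful key such as a state column might look like the following Python sketch. All of the data here is made up: the state codes are placeholders, the rates are not real tax rates, and the names `customers`, `tax_rates`, and `joined` are hypothetical. In Excel you would normally do this with a relationship or a merged query instead of code.

```python
# Table from the corporate database (hypothetical rows).
customers = [
    {"CustomerID": 1, "Name": "Customer A", "State": "AA"},
    {"CustomerID": 2, "Name": "Customer B", "State": "BB"},
    {"CustomerID": 3, "Name": "Customer C", "State": "CC"},  # no matching rate
]

# Table downloaded from elsewhere (placeholder values, not real tax rates).
tax_rates = [
    {"State": "AA", "Rate": 0.10},
    {"State": "BB", "Rate": 0.00},
]

# State acts as the primary key of tax_rates, so index the table by it.
rate_by_state = {row["State"]: row["Rate"] for row in tax_rates}

# State acts as the foreign key in customers; unmatched states get None.
joined = [{**c, "Rate": rate_by_state.get(c["State"])} for c in customers]

for row in joined:
    print(row["Name"], row["State"], row["Rate"])
```

The unmatched third row shows one risk of meaningful keys: nothing guarantees that every state value in one source exists in the other.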
This lesson is excerpted from the above book.
How to define a PRIMARY KEY column with Excel Custom Validation to prevent duplicate entries in a column
A primary key column is a column with a unique value for each row. The purpose is to bind data together, across tables, without repeating all of the data in every table.
Suppose that in the "Employee Data" sheet, the "EMP_ID" column is the primary key, meaning that no two rows can have the same Employee ID. The Employee ID distinguishes two people even if they have the same name.
If you try to enter a duplicate value in the "EMP_ID" column, such as "EMP_123" or "emp_123", in any cell above or below the existing value "Emp_123", you will get the following error alert message.
How to Set Up Validation:
You can declare the EMP_ID column as a primary key in Excel by using Custom Data Validation as follows:
>> First, select the entire empty column A (a column with no input).
>> Next, go to DATA > DATA VALIDATION.
>> Next, choose CUSTOM from the Allow drop-down list and enter the following formula in the Formula box.
=COUNTIF($A:$A,A1)=1
This means that every cell in column A, starting from A1, accepts a value only if that value appears exactly once in the column.
This validation works for both text and numbers.
It prevents the user from entering a duplicate above or below an existing value.
You can also use different custom validations as per your requirements, and you can customize the error alert message in the Data Validation window.
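The rule that the COUNTIF formula enforces can be approximated outside Excel as well. The Python sketch below uses a hypothetical function name, `is_allowed`, and compares values without regard to case, which matches the behavior described above for "EMP_123" and "emp_123". It does not reproduce every detail of COUNTIF, such as wildcard matching.

```python
def is_allowed(existing_values: list, new_value) -> bool:
    """Approximate =COUNTIF($A:$A,A1)=1: reject a value already in the column.

    The comparison ignores case, so "EMP_123" collides with "Emp_123".
    """
    target = str(new_value).casefold()
    return all(str(value).casefold() != target for value in existing_values)


emp_ids = ["Emp_123"]
print(is_allowed(emp_ids, "EMP_123"))  # False (duplicate, differs only in case)
print(is_allowed(emp_ids, "Emp_124"))  # True (new value)
```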
——————————————————————————————————-
Thanks, TAMATAM ; Business Intelligence & Analytics Professional
———————————————————————————————————
auntiechrissie
-
#1
I know that Excel uses the cell reference as a unique identifier, but can I
define my own «Primary Key» to ensure the uniqueness of a particular field
(for example, National Insurance Number)? I know that I could do this easily
in Access, but the rest of my task is so simple, using Access seems rather
like using a sledgehammer to crack a nut!
Many thanks
JLGWhiz
-
#2
You can name the cell:
Select the cell and, on the Excel menu bar, choose Insert>Name>Define, then enter
the name you want to use and click Add>OK. You can now refer to that name
in formulas and in code to identify that specific cell. See Excel help
files for details of using named ranges.
In VBA you can Use an Object Variable, to do the same thing.
Set myRange = ActiveSheet.Range("A1")
This will allow you to use myRange in code any time you want to refer to
Range(«A1»).
See VBA help files for details of using Object Variables.
JLatham
-
#4
On second thought, it might not be so complex as to require a McGimpsey
solution. Let’s say that your key column will be column C and that C1 has a
label (as «National Insurance Number») in it and your data entries start on
row 2, then in cell C2 put this formula:
=MAX(C$1:C1)+1
which should return 1. Fill that formula on down the sheet and the number
will auto increment.
If you need to start with a ‘seed’ value, then make C2’s formula something
like
=MAX(C$1:C1)+1+9944
where 9944 is your ‘seed’ value. Then in C3 you put the formula:
=MAX(C$1:C2)+1
and fill that on down the sheet.
You can modify that formula to add other things to the number, such as text,
or to format the result to a specific # of digits, as (in C2)
=TEXT(ROW()-ROW(C$1)+9944,"000000")
to get 6-digits displayed and fill that formula on down the sheet to
increment the value displayed.
or get really creative with something like this in C2
="NINABC-22-" & TEXT(ROW()-ROW(C$1)+9944,"000000")
which will display NINABC-22-009945 in the cell.
—Trying to post for 2nd time.
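For readers who want to see the logic of these formulas outside a worksheet, here is a minimal Python sketch. The function names `next_id` and `make_key` and their parameter names are hypothetical. The seed 9944, the prefix NINABC-22-, and the six-digit padding are taken from the formulas above. This is an approximation of what the formulas compute, not a replacement for them.

```python
def next_id(existing: list, seed: int = 0) -> int:
    """Mirror =MAX(C$1:C1)+1: one more than the largest number so far.

    Text entries such as the header label are ignored, as MAX does in Excel.
    The seed only matters while the column holds no numbers yet.
    """
    numbers = [v for v in existing if isinstance(v, (int, float))]
    return max(numbers, default=seed) + 1


def make_key(row: int, header_row: int = 1, seed: int = 9944,
             prefix: str = "NINABC-22-", width: int = 6) -> str:
    """Mirror ="NINABC-22-" & TEXT(ROW()-ROW(C$1)+9944,"000000")."""
    number = row - header_row + seed
    # Zero-pad the number to the requested width, like the "000000" format.
    return f"{prefix}{number:0{width}d}"


print(next_id(["National Insurance Number"]))             # 1
print(next_id(["National Insurance Number"], seed=9944))  # 9945
print(next_id(["National Insurance Number", 9945]))       # 9946
print(make_key(2))                                        # NINABC-22-009945
```

Note that keys built from the row number change if rows are inserted, deleted, or sorted, so the row-based variant suits lists that are only ever appended to.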
JLGWhiz
-
#5
In the VBA help files under both Primary Property and Unique Property, there
are code samples along with descriptive narrative that might be what you are
looking for. It apparently involves DAO and tables, but it sounds like what
you want.