Have you ever used VLOOKUP to bring a column from one table into another table? Now that Excel 2013 has a built-in Data Model, VLOOKUP is obsolete. You can create a relationship between two tables of data, based on matching data in each table. Then you can create Power View sheets and build PivotTables and other reports with fields from each table, even when the tables are from different sources. For example, if you have customer sales data, you might want to import and relate time intelligence data to analyze sales patterns by year and month.
All the tables in a workbook are listed in the PivotTable and Power View Fields lists.
When you import related tables from a relational database, Excel can often create those relationships in the Data Model it’s building behind the scenes. For all other cases, you’ll need to create relationships manually.
- Make sure the workbook contains at least two tables, and that each table has a column that can be mapped to a column in another table.
- Format the data as a table, or
Import external data as a table in a new worksheet.
- Give each table a meaningful name: In Table Tools, click Design > Table Name > enter a name.
- Verify the column in one of the tables has unique data values with no duplicates. Excel can only create the relationship if one column contains unique values.
For example, to relate customer sales with time intelligence, both tables must include dates in the same format (for example, 1/1/2012), and at least one table (time intelligence) lists each date just once within the column.
- Click Data > Relationships.
If Relationships is grayed out, your workbook contains only one table.
- In the Manage Relationships box, click New.
- In the Create Relationship box, click the arrow for Table, and select a table from the list. In a one-to-many relationship, this table should be on the many side. Using our customer and time intelligence example, you would choose the customer sales table first, because many sales are likely to occur on any given day.
- For Column (Foreign), select the column that contains the data that is related to Related Column (Primary). For example, if you had a date column in both tables, you would choose that column now.
- For Related Table, select a table that has at least one column of data that is related to the table you just selected for Table.
- For Related Column (Primary), select a column that has unique values that match the values in the column you selected for Column.
- Click OK.
More about relationships between tables in Excel
Notes about relationships
- You’ll know whether a relationship exists when you drag fields from different tables onto the PivotTable Fields list. If you aren’t prompted to create a relationship, Excel already has the relationship information it needs to relate the data.
- Creating relationships is similar to using VLOOKUPs: you need columns containing matching data so that Excel can cross-reference rows in one table with those of another table. In the time intelligence example, the Customer table would need to have date values that also exist in a time intelligence table.
- In a data model, table relationships can be one-to-one (each passenger has one boarding pass) or one-to-many (each flight has many passengers), but not many-to-many. Many-to-many relationships result in circular dependency errors, such as “A circular dependency was detected.” This error will occur if you make a direct connection between two tables that are many-to-many, or indirect connections (a chain of table relationships that are one-to-many within each relationship, but many-to-many when viewed end to end. Read more about Relationships between tables in a Data Model.
- The data types in the two columns must be compatible. See Data types in Excel Data Models for details.
- Other ways to create relationships might be more intuitive, especially if you are not sure which columns to use. See Create a relationship in Diagram View in Power Pivot.
Example: Relating time intelligence data to airline flight data
You can learn about both table relationships and time intelligence using free data on the Windows Azure Marketplace. Some of these datasets are very large, requiring a fast internet connection to complete the data download in a reasonable period of time.
- Start Power Pivot in Microsoft Excel 2013 add-in and open the Power Pivot window.
- Click Get External Data > From Data Service > From Windows Azure Marketplace. The Windows Azure Marketplace home page opens in the Table Import Wizard.
- Under Price, click Free.
- Under Category, click Science & Statistics.
- Find DateStream and click Subscribe. More about this time intelligence data feed.
- Enter your Microsoft account and click Sign in. A preview of the data should appear in the window.
- Scroll to the bottom and click Select Query.
- Click Next.
- Choose BasicCalendarUS and then click Finish to import the data. Over a fast internet connection, import should take about a minute. When finished, you should see a status report of 73,414 rows transferred. Click Close.
- Click Get External Data > From Data Service > From Windows Azure Marketplace to import a second dataset.
- Under Type, click Data.
- Under Price, click Free.
- Find US Air Carrier Flight Delays and click Select.
- Scroll to the bottom and click Select Query.
- Click Next.
- Click Finish to import the data. Over a fast internet connection, this can take 15 minutes to import. When finished, you should see a status report of 2,427,284 rows transferred. Click Close. You should now have two tables in the data model. To relate them, we’ll need compatible columns in each table.
- Notice that the DateKey in BasicCalendarUS is in the format 1/1/2012 12:00:00 AM. The On_Time_Performance table also has a datetime column, FlightDate, whose values are specified in the same format: 1/1/2012 12:00:00 AM. The two columns contain matching data, of the same data type, and at least one of the columns (DateKey) contains only unique values. In the next several steps, you’ll use these columns to relate the tables.
- In the Power Pivot window, click PivotTable to create a PivotTable in a new or existing worksheet.
- In the Field List, expand On_Time_Performance and click ArrDelayMinutes to add it to the Values area. In the PivotTable, you should see the total amount of time flights were delayed, as measured in minutes.
- Expand BasicCalendarUS and click MonthInCalendar to add it to the Rows area.
- Notice that the PivotTable now lists months, but the sum total of minutes is the same for every month. Repeating, identical values indicate a relationship is needed.
- In the Field List, in “Relationships between tables may be needed”, click Create.
- In Related Table, select On_Time_Performance and in Related Column (Primary) choose FlightDate.
- In Table, select BasicCalendarUS and in Column (Foreign) choose DateKey. Click OK to create the relationship.
- Notice that the sum of minutes delayed now varies for each month.
- In BasicCalendarUS and drag YearKey to the Rows area, above MonthInCalendar.
You can now slice arrival delays by year and month, or other values in the calendar.
Tip By default, months are listed in alphabetical order. Using the Power Pivot add-in, you can change the sort so that months appear in chronological order.
- Make sure that the BasicCalendarUS table is open in the Power Pivot window.
- On the Home table, click Sort by Column.
- In Sort, choose MonthInCalendar
- In By, choose MonthOfYear.
The PivotTable now sorts each month-year combination (October 2011, November 2011) by the month number within a year (10, 11). Changing the sort order is easy because the DateStream feed provides all of the necessary columns to make this scenario work. If you’re using a different time intelligence table, your step will be different.
“Relationships between tables may be needed”
As you add fields to a PivotTable, you’ll be informed if a table relationship is required to make sense of the fields you selected in the PivotTable.
Although Excel can tell you when a relationship is needed, it can’t tell you which tables and columns to use, or whether a table relationship is even possible. Try following these steps to get the answers you need.
Step 1: Determine which tables to specify in the relationship
If your model contains just a few tables, it might be immediately obvious which ones you need to use. But for larger models, you could probably use some help. One approach is to use Diagram View in the Power Pivot add-in. Diagram View provides a visual representation of all the tables in the Data Model. Using Diagram View, you can quickly determine which tables are separate from the rest of the model.
Note It’s possible to create ambiguous relationships that are invalid when used in a PivotTable or Power View report. Suppose all of your tables are related in some way to other tables in the model, but when you try to combine fields from different tables, you get the “Relationships between tables may be needed” message. The most likely cause is that you’ve run into a many-to-many relationship. If you follow the chain of table relationships that connect to the tables you want to use, you will probably discover that you have two or more one-to-many table relationships. There is no easy workaround that works for every situation, but you might try creating calculated columns to consolidate the columns you want to use into one table.
Step 2: Find columns that can be used to create a path from one table to the next
After you’ve identified which table is disconnected from the rest of the model, review its columns to determine whether another column, elsewhere in the model, contains matching values.
For example, suppose you have a model that contains product sales by territory, and that you subsequently import demographic data to find out if there is correlation between sales and demographic trends in each territory. Because the demographic data comes from a different data source, its tables are initially isolated from the rest of the model. To integrate the demographic data with the rest of your model, you’ll need to find a column in one of the demographic tables that corresponds to one you’re already using. For example if the demographic data is organized by region, and your sales data specifies which region the sale occurred, you could relate the two datasets by finding a common column, such as a State, Zip code, or Region, to provide the lookup.
Besides matching values, there are a few additional requirements for creating a relationship:
- Data values in the lookup column must be unique. In other words, the column can’t contain duplicates. In a Data Model, nulls and empty strings are equivalent to a blank, which is a distinct data value. This means that you can’t have multiple nulls in the lookup column.
- Data types of both the source column and lookup column must be compatible. For more information about data types, see Data types in Data Models.
To learn more about table relationships, see Relationships between tables in a Data Model.
Top of Page