Combine data from multiple data sources

In this tutorial, you import data from a local Excel file that contains product information, and from an OData feed that contains product order information. You perform transformation and aggregation steps, and combine data from both sources to produce a Total Sales per Product and Year report.

In order to perform this tutorial, you need the Products and Orders workbook. In the Save As dialog box, name the file Products and Orders.xlsx.

In this tutorial


Task 1: Import products into an Excel workbook

In this task, you import products from the Products and Orders.xlsx file into an Excel workbook. You can download the complete Excel workbook for task 1: Combine data from multiple data sources - Task 1.xlsx.

Step 1: Connect to an Excel workbook

  1. Create an Excel workbook.
  2. In the POWER QUERY ribbon tab, click From File > From Excel.
  3. In the Excel browse dialog box, browse for or type the Products and Orders.xlsx path to import or link to a file.
  4. In the Navigator pane, double click the Products worksheet or click Products and click Edit Query.

Step 2: Promote the first row to table column headers

In the Query Preview grid, the first row of the table does not contain the table column names. To promote the first row to table column headers:

  1. Click the table icon (Table icon ) in the top-left corner of the data preview.
  2. Click Use First Row as Headers.

Promote the first row to table column headers

Step 3: Remove other columns to only display columns of interest   

In this step you remove all columns except ProductID, ProductName, CategoryID, and QuantityPerUnit.

  1. In the Query Preview grid, select the ProductID, ProductName, CategoryID, and QuantityPerUnit columns (use Ctrl+Click or Shift+Click).
  2. In the Query Editor ribbon, click Remove Columns > Remove Other Columns or right-click on a column header, and click Remove Other Columns.

Hide other columns

Power Query steps created

As you perform query activities in Power Query, query steps are created and listed in the Query Settings pane, in the APPLIED STEPS list. Each query step has a corresponding Power Query formula, also known as the "M" language. For more information about the Power Query formula language, see Learn about Power Query formulas.

Task Query step Formula
Connect to an Excel workbook Source Source{[Name="Products"]}[Data]
Promote the first row to table column headers FirstRowAsHeader

Table.PromoteHeaders

(Products)

Remove other columns to only display columns of interest RemovedOtherColumns

Table.SelectColumns

(FirstRowAsHeader,{"ProductID", "ProductName", "CategoryID", "QuantityPerUnit"})

Step 4: Import a products query

In this step, you import the Products query into your Excel workbook.

  1. In the Query Editor ribbon, click Apply & Close. The results appear in a new Excel worksheet.

Top of Page Top of Page

Task 2: Import order data from an OData feed

In this task, you import data into your Excel workbook from the sample Northwind OData feed at http://services.odata.org/Northwind/Northwind.svc. You can download the complete Excel workbook for tasks 1 and 2: Combine data from multiple data sources - Task 2.xlsx.

Step 1: Connect to an OData feed

  1. In the POWER QUERY ribbon tab, click From Other Sources > From OData Feed.
  2. In the OData Feed dialog box, enter the URL for the Northwind OData feed.
  3. Click OK.
  4. In the Navigator pane, double click the Orders table or click Orders and click Edit.

 Note    When you hover your mouse over a table, you will see a table preview fly out.

Hover over a Data Source

Step 2: Expand an Order_Details table   

In this step, you expand the Order_Details table that is related to the Orders table, to combine the ProductID, UnitPrice, and Quantity columns from Order_Details into the Orders table. The Expand operation combines columns from a related table into a subject table. When the query runs, rows from the related table (Order_Details) are combined into rows from the subject table (Orders).

In Power Query, a column containing a link to a related table has an Entry link or Table link. An Entry link navigates to a single related record, and represents a one to one relationship with a subject table. A Table link navigates to a related table, and represents a one-to-many relationship with a subject table. A link represents navigation properties in a data source within a relational model. For an OData feed, navigation properties represent an entity with a foreign key association. In a database, such as SQL Server, navigation properties represent foreign key relationships in the database.

Expand the Order_Details Table link

After you expand the Order_Details table, three new columns and additional rows are added to the Orders table, one for each row in the nested or related table.

  1. In the Query Preview pane, scroll to the Order_Details column.
  2. In the Order_Details column, click the expand icon (Expand ).
  3. In the Expand drop-down:
  1. Click (Select All Columns) to clear all columns.
  2. Click ProductID, UnitPrice, and Quantity.
  3. Click OK.

Expand the Order_Details Table link

 Note    In Power Query, you can expand tables linked from to a column, and also the ability to perform aggregate operations on the columns of the linked table before expanding the data in the subject table. For more information about how to perform aggregate operations, see Aggregate data from a column.

Step 3: Remove other columns to only display columns of interest

In this step you remove all columns except OrderDate, ProductID, UnitPrice, and Quantity columns. In the previous task, you used Remove Other Columns. For this task, you remove selected columns.

Remove selected columns

  1. In the Query Preview pane, select all columns:
  1. Click the first column (OrderID).
  2. Shift+Click the last column (Shipper).
  3. Ctrl+Click the OrderDate, Order_Details.ProductID, Order_Details.UnitPrice, and Order_Details.Quantity columns.
  1. Right-click on a selected column header, and click Remove Columns.

Step 4: Calculate the line total for each Order_Details row

In this step, you create a Custom Column to calculate the line total for each Order_Details row.

Calculate the line total for each Order_Details row

  1. In the Query Preview pane, click the table icon (Table icon ) at the top-left corner of the preview.
  2. Click Insert Column > Custom.
  3. In the Insert Custom Column dialog box, in the Custom Column Formula textbox, enter [Order_Details.UnitPrice] * [Order_Details.Quantity].
  4. In the New column name textbox, enter Line Total.
  5. Click OK.

Calculate the line total for each Order_Details row

Step 5: Transform an OrderDate year column

In this step, you transform the OrderDate column to render the order date year.

  1. In the Preview grid, right-click the OrderDate column, and click Transform > Year.
  2. Rename the OrderDate column to Year:
  1. Double-Click the OrderDate column, and enter Year or
  2. Right-Click on the OrderDate column, click Rename, and enter Year.

Step 6: Group rows by ProductID and Year

  1. In the Query Preview grid, select Year and Order_Details.ProductID.
  2. Right-Click one of the headers, and click Group By.
  3. In the Group By dialog box:
  1. In the New column name textbox, enter Total Sales.
  2. In the Operation drop-down, select Sum.
  3. In the Column drop-down, select Line Total.
  1. Click OK.

Group By Dialog Box for Aggregate Operations

Step 7: Rename a query

Before you import the sales data into Excel, name the query Total Sales:

  1. In the Query Settings pane, in the Name text box enter Total Sales.

Final query results

After you perform each step, you will have a Total Sales query over the Northwind OData feed.

Total Sales

Power Query steps created

As you perform query activities in Power Query, query steps are created and listed in the Query Settings pane, in the APPLIED STEPS list. Each query step has a corresponding Power Query formula, also known as the "M" language. For more information about the Power Query formula language, see Learn about Power Query formulas.

Task Query step Formula
Connect to an OData feed Source Source{[Name="Orders"]}[Data]
Expand the Order_Details table Expand Order_Details

Table.ExpandTableColumn

(Orders, "Order_Details", {"ProductID", "UnitPrice", "Quantity"}, {"Order_Details.ProductID", "Order_Details.UnitPrice", "Order_Details.Quantity"})

Remove other columns to only display columns of interest RemovedColumns

Table.RemoveColumns

(#"Expand Order_Details",{"OrderID", "CustomerID", "EmployeeID", "RequiredDate", "ShippedDate", "ShipVia", "Freight", "ShipName", "ShipAddress", "ShipCity", "ShipRegion", "ShipPostalCode", "ShipCountry", "Customer", "Employee", "Shipper"})

Calculate the line total for each Order_Details row InsertedColumns

Table.AddColumn

(RemovedColumns, "Custom", each [Order_Details.UnitPrice] * [Order_Details.Quantity])

Transform the OrderDate column to render the year

RenamedColumns

Table.RenameColumns

(InsertedCustom,{{"Custom", "Line Total"}})

TransformedColumn

Table.TransformColumns   

(RenamedColumns,{{"OrderDate", Date.Year}})

RenamedColumns1

Table.RenameColumns   

(TransformedColumn,{{"OrderDate", "Year"}})

Group rows by ProductID and Year GroupedRows Table.Group
(RenamedColumns1, {"Year", "Order_Details.ProductID"}, {{"Total Sales", each List.Sum([Line Total]), type number}})

Step 8: Disable query download into an Excel workbook

Since the Total Sales query does not represent the final Total Sales per Product and Year report, you disable query download into the Excel workbook. When the Load to Worksheet option is Off in the Queries settings pane, the data result of this query is not downloaded, but the query can still be combined with other queries in order to build the desired result. You learn how to combine this query with the Products query in the next task.

Disable a query download

  1. In the Query Settings pane, uncheck Load to worksheet.
  2. In the Query Editor ribbon, click Apply & Close. In the Workbook Queries pane, the Total Sales query displays Load is disabled.

Disable a query download

Top of Page Top of Page

Task 3: Combine the Products and Total Sales queries

Power Query enables you to combine multiple queries, by merging or appending them. The Merge operation is performed on any Power Query query with a tabular shape, independent of the data source that the data comes from. For more information about combining data sources, see Combine multiple queries. You can download the complete Excel workbook for tasks 1 to 3: Combine data from multiple data sources - Task 3.xlsx.

In this task, you combine the Products and Total Sales queries by using a Merge and Expand query step.

Step 1: Merge ProductID into a Total Sales query

  1. In the Excel workbook, navigate to the Products query on Sheet2.
  2. In the QUERY ribbon tab, click Merge.
  3. In the Merge dialog box, select Products as the primary table, and select Total Sales as the second or related query to merge. Total Sales will become a new expandable column.
  4. To match Total Sales to Products by ProductID, select the ProductID column from the Products table, and the Order_Details.ProductID column from the Total Sales table.
  5. In the Privacy Levels dialog box:
  1. Select Organizational for your privacy isolation level for both data sources.
  2. Click Save.
  1. Click OK.

Security   Privacy Levels prevent a user from inadvertently combining data from multiple data sources, which might be private or organizational. Depending on the query, a user could inadvertently send data from the private data source to another data source that might be malicious. Power Query analyzes each data source and classifies it into the defined level of privacy: Public, Organizational, and Private. For more information about Privacy Levels, see Privacy Levels.

Merge dialog box

After you click OK, the Merge operation creates a query. The query result contains all columns from the primary table (Products), and a single column containing a navigation link to the related table (Total Sales). An Expand operation adds new columns into the primary or subject table from the related table.

Merge Final

Step 2: Expand a merge column

In this step, you expand the merge column with the name NewColumn to create two new columns in the Products query: Year and Total Sales.

Expand the NewColumn table link

  1. In the Query Preview grid, click the NewColumn expand icon (Expand ).
  2. In the Expand drop-down:
  1. Click (Select All Columns) to clear all columns.
  2. Click Year and Total Sales.
  3. Click OK.
  1. Rename these two columns to Year and Total Sales.
  2. Sort Descending by Total Sales to find out which products and in which years the products got the highest volume of sales.
  3. Rename the query to Total Sales per Product.

Expand table link

Power Query steps created

As you perform Merge query activities in Power Query, query steps are created and listed in the Query Settings pane, in the APPLIED STEPS list. Each query step has a corresponding Power Query formula, also known as the "M" language. For more information about the Power Query formula language, see Learn about Power Query formulas.

Task Query step Formula
Merge ProductID into the Total Sales query Source (data source for Merge operation)

Table.NestedJoin

(Products,{"ProductID"},#"Total Sales",{"Order_Details.ProductID"},"NewColumn")

Expand a merge column ExpandNewColumn

Table.ExpandTableColumn

(Source, "NewColumn", {"Year", "Total Sales"}, {"NewColumn.Year", "NewColumn.Total Sales"})

RenamedColumns

Table.RenameColumns

(#"Expand NewColumn",{{"NewColumn.Year", "Year"}, {"NewColumn.Total Sales", "Total Sales"}})

SortedRows

Table.Sort

(RenamedColumns,{{"Total Sales", Order.Descending}})

Step 3: Load a Total Sales per Product query into an Excel Data Model

In this step, you disable the Load to Worksheet option and load a query into the Excel Data Model, in order to build a report connected to the query result. In addition to loading query results into an Excel worksheet, Power Query enables you to load a query result into an Excel Data Model. After you load data into the Excel Data Model, you can use Power Pivot and Power View to further data analysis.

Load the Total Sales per Product query into the Excel Data Model

  1. In the Query Settings pane, uncheck Load to worksheet and check Load to data model.
  2. To load the query into the Excel Data Model, click Apply & Close.

Load Excel Data Model

Final Total Sales per Product query

After you perform each step, you will have a Total Sales per Product query that combines data from the Products and Orders.xlsx file and Northwind OData feed. This query can be applied to a Power Pivot model. In addition, changes to the query in Power Query modify and refresh the resulting table in the Power Pivot model.

Top of Page Top of Page

 
 
Applies to:
Excel 2013, Power BI for Office 365, Excel 2010