Introduction
This article describes the complete steps for importing Microsoft Excel data into SQL Server using the linked servers technique.
The article describes the steps for all modern platforms:
- Microsoft SQL Server 2005-2016 on the x86/x64 platform.
- Microsoft Excel 2003-2016 files like *.xls, *.xlsx, *.xlsm, *.xlsb.
Table of Contents
- Introduction
- The basics of Excel data import to SQL Server using linked servers
- Configuration steps for Excel data import to SQL Server using linked servers
- Install Microsoft.ACE.OLEDB.12.0 driver
- Grant rights to TEMP directory
- Configure ACE OLE DB properties
- Configure linked servers
- How-To: Import Excel 2003 to SQL Server x86
- How-To: Import Excel 2007 to SQL Server x86
- How-To: Import Excel 2003/2007 to SQL Server x64
- Conclusion
- See Also
The Basics of Excel Data Import to SQL Server Using Linked Servers
To import data from Microsoft Excel 2003 files to 32-bit SQL Server, you can use the Microsoft.Jet.OLEDB.4.0 provider. Use T-SQL code like this to add a linked server for an Excel 2003 workbook:
EXEC sp_addlinkedserver
    @server = 'ExcelServer1',
    @srvproduct = 'Excel',
    @provider = 'Microsoft.Jet.OLEDB.4.0',
    @datasrc = 'C:\Test\excel-sql-server.xls',
    @provstr = 'Excel 8.0;IMEX=1;HDR=YES;'
To import data from Microsoft Excel 2007 files to 32-bit SQL Server, or from any Microsoft Excel files to 64-bit SQL Server, use the Microsoft.ACE.OLEDB.12.0 provider. Use T-SQL code like this:
EXEC sp_addlinkedserver
    @server = 'ExcelServer2',
    @srvproduct = 'Excel',
    @provider = 'Microsoft.ACE.OLEDB.12.0',
    @datasrc = 'C:\Test\excel-sql-server.xlsx',
    @provstr = 'Excel 12.0;IMEX=1;HDR=YES;'
IMEX=1 tells the provider to import all Excel column data, including columns that contain mixed data types.
HDR=YES indicates that the first row of the Excel data contains column headers.
The way to modify a linked server is to drop and create it again. Use the T-SQL code like this:
EXEC sp_dropserver @server = N'ExcelServer1', @droplogins='droplogins'
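Because a change means dropping and recreating the linked server, a rerunnable script can be convenient. The following is a minimal sketch, assuming the ExcelServer1 name and workbook path used above; the existence check against sys.servers is an addition for illustration and is not part of the original steps.

```sql
-- Drop the linked server only if it already exists
-- (sys.servers lists linked servers with is_linked = 1).
IF EXISTS (SELECT 1 FROM sys.servers WHERE name = N'ExcelServer1' AND is_linked = 1)
    EXEC sp_dropserver @server = N'ExcelServer1', @droplogins = 'droplogins';
GO

-- Recreate it with the new settings (here: the same Excel 2003 workbook).
EXEC sp_addlinkedserver
    @server = N'ExcelServer1',
    @srvproduct = N'Excel',
    @provider = N'Microsoft.Jet.OLEDB.4.0',
    @datasrc = N'C:\Test\excel-sql-server.xls',
    @provstr = N'Excel 8.0;IMEX=1;HDR=YES;';
GO
```

The check makes the script safe to run on a server where the linked server has not been created yet.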
There are two ways to use linked server data. The first way is like this:
SELECT * FROM ExcelServer1...[Sheet1$]
and the second one is the use of the OPENQUERY function:
SELECT * FROM OPENQUERY(ExcelServer1, 'SELECT * FROM [Sheet1$]')
The OPENQUERY function is more flexible because the query can target an Excel range, whereas the first form always reads the entire sheet.
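As an illustration of querying a range rather than a whole sheet, here is a short sketch. It assumes the ExcelServer1 linked server from above; the cell range A1:C10 and the named range SalesData are hypothetical and would need to exist in your workbook.

```sql
-- Read only cells A1:C10 of Sheet1 (the range address follows the $ sign).
SELECT * FROM OPENQUERY(ExcelServer1, 'SELECT * FROM [Sheet1$A1:C10]');

-- Read a named range; "SalesData" is a hypothetical name defined in the workbook.
SELECT * FROM OPENQUERY(ExcelServer1, 'SELECT * FROM [SalesData]');
```

With HDR=YES in the provider string, the first row of the range is treated as the column headers.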
Configuration Steps for Excel Data Import to SQL Server Using Linked Servers
Install Microsoft.ACE.OLEDB.12.0 driver
To import Excel 2007-2016 files to SQL Server the Microsoft.ACE.OLEDB.12.0 driver should be installed.
To download the driver use the following link:
Microsoft Access Database Engine 2010 Redistributable
Don’t worry about «Access» in the name.
Warning! The x64 driver cannot be installed if Microsoft Office 2007-2016 x86 is already installed!
So, there is no way to import Excel data to SQL Server x64 using the linked servers technique on a machine with Microsoft Office x86!
The SQL Server Error Message if Microsoft.ACE.OLEDB.12.0 is not installed
OLE DB provider "Microsoft.ACE.OLEDB.12.0" for linked server "ExcelServer2" returned message "The Microsoft Access database engine cannot open or write to the file ''. It is already opened exclusively by another user, or you need permission to view and write its data.".
Msg 7303, Level 16, State 1, Line 1
Cannot initialize the data source object of OLE DB provider "Microsoft.ACE.OLEDB.12.0" for linked server "ExcelServer2".
Grant rights to TEMP directory
This step is required only for 32-bit SQL Server with any OLE DB provider.
The main problem is that an OLE DB provider creates a temporary file during the query in the SQL Server temp directory, using the credentials of the user who runs the query.
The default temp directory for SQL Server is the temp directory of the SQL Server service account.
If SQL Server runs under the Network Service account, the temp directory looks like this:
C:\Windows\ServiceProfiles\NetworkService\AppData\Local\Temp
If SQL Server runs under the Local Service account, the temp directory looks like this:
C:\Windows\ServiceProfiles\LocalService\AppData\Local\Temp
Microsoft recommends two solutions:
- Change the SQL Server TEMP directory and grant all users full rights to the new directory.
- Grant read/write rights on the current SQL Server TEMP directory.
See details: PRB: «Unspecified error» Error 7399 Using OPENROWSET Against Jet Database
Usually, only a few accounts are used for import operations, so we can simply grant the rights to those accounts.
For example, the icacls utility can be used to set up the rights:
icacls C:\Windows\ServiceProfiles\NetworkService\AppData\Local\Temp /grant vs:(R,W)
This command applies if SQL Server runs under Network Service and the login «vs» is used to run the queries.
The SQL Server Error Message if a user has no rights for SQL Server TEMP directory
OLE DB provider "Microsoft.Jet.OLEDB.4.0" for linked server "ExcelServer1" returned message "Unspecified error".
Msg 7303, Level 16, State 1, Line 1
Cannot initialize the data source object of OLE DB provider "Microsoft.Jet.OLEDB.4.0" for linked server "ExcelServer1".
or the message for the Microsoft.ACE.OLEDB.12.0 provider:
OLE DB provider "Microsoft.ACE.OLEDB.12.0" for linked server "ExcelServer2" returned message "Unspecified error".
Msg 7303, Level 16, State 1, Line 1
Cannot initialize the data source object of OLE DB provider "Microsoft.ACE.OLEDB.12.0" for linked server "ExcelServer2".
Configure ACE OLE DB properties
This step is required only if the Microsoft.ACE.OLEDB.12.0 provider is used.
Use the following T-SQL code:
EXEC sp_MSset_oledb_prop N'Microsoft.ACE.OLEDB.12.0', N'AllowInProcess', 1
GO
EXEC sp_MSset_oledb_prop N'Microsoft.ACE.OLEDB.12.0', N'DynamicParameters', 1
GO
The SQL Server Error Messages if OLE DB properties are not configured
Msg 7399, Level 16, State 1, Line 1
The OLE DB provider "Microsoft.ACE.OLEDB.12.0" for linked server "ExcelServer2" reported an error. The provider did not give any information about the error.
Msg 7330, Level 16, State 2, Line 1
Cannot fetch a row from OLE DB provider "Microsoft.ACE.OLEDB.12.0" for linked server "ExcelServer2".
Configure linked servers
The configuring of linked servers is discussed in the Basics topic.
Use T-SQL code like this for Excel 2003 linked servers:
EXEC sp_addlinkedserver
    @server = 'ExcelServer1',
    @srvproduct = 'Excel',
    @provider = 'Microsoft.Jet.OLEDB.4.0',
    @datasrc = 'C:\Test\excel-sql-server.xls',
    @provstr = 'Excel 8.0;IMEX=1;HDR=YES;'
Use T-SQL code like this for Excel 2007 linked servers or on SQL Server x64:
EXEC sp_addlinkedserver
    @server = 'ExcelServer2',
    @srvproduct = 'Excel',
    @provider = 'Microsoft.ACE.OLEDB.12.0',
    @datasrc = 'C:\Test\excel-sql-server.xlsx',
    @provstr = 'Excel 12.0;IMEX=1;HDR=YES;'
How-To: Import Excel 2003 to SQL Server x86
Step 1. Grant rights to TEMP directory
icacls C:\Windows\ServiceProfiles\<SQL Server Account>\AppData\Local\Temp /grant <User>:(R,W)
The most commonly used paths:
C:\Windows\ServiceProfiles\NetworkService\AppData\Local\Temp
C:\Windows\ServiceProfiles\LocalService\AppData\Local\Temp
Step 2. Configure linked server using Microsoft.Jet.OLEDB.4.0 provider
EXEC sp_addlinkedserver
    @server = 'ExcelServer1',
    @srvproduct = 'Excel',
    @provider = 'Microsoft.Jet.OLEDB.4.0',
    @datasrc = 'C:\Test\excel-sql-server.xls',
    @provstr = 'Excel 8.0;IMEX=1;HDR=YES;'
How-To: Import Excel 2007 to SQL Server x86
Step 1. Install the 32-bit Microsoft.ACE.OLEDB.12.0 driver
Microsoft Access Database Engine 2010 Redistributable
Step 2. Grant rights to TEMP directory
icacls C:\Windows\ServiceProfiles\<SQL Server Account>\AppData\Local\Temp /grant <User>:(R,W)
The most commonly used paths:
C:\Windows\ServiceProfiles\NetworkService\AppData\Local\Temp
C:\Windows\ServiceProfiles\LocalService\AppData\Local\Temp
Step 3. Configure ACE OLE DB properties
EXEC sp_MSset_oledb_prop N'Microsoft.ACE.OLEDB.12.0', N'AllowInProcess', 1
GO
EXEC sp_MSset_oledb_prop N'Microsoft.ACE.OLEDB.12.0', N'DynamicParameters', 1
GO
Step 4. Configure linked server using Microsoft.ACE.OLEDB.12.0 provider
EXEC sp_addlinkedserver
    @server = 'ExcelServer2',
    @srvproduct = 'Excel',
    @provider = 'Microsoft.ACE.OLEDB.12.0',
    @datasrc = 'C:\Test\excel-sql-server.xlsx',
    @provstr = 'Excel 12.0;IMEX=1;HDR=YES;'
How-To: Import Excel 2003/2007 to SQL Server x64
Step 1. Install 64-bit Microsoft.ACE.OLEDB.12.0 driver
Microsoft Access Database Engine 2010 Redistributable
Step 2. Configure ACE OLE DB properties
EXEC sp_MSset_oledb_prop N'Microsoft.ACE.OLEDB.12.0', N'AllowInProcess', 1
GO
EXEC sp_MSset_oledb_prop N'Microsoft.ACE.OLEDB.12.0', N'DynamicParameters', 1
GO
Step 3. Configure linked server using Microsoft.ACE.OLEDB.12.0 provider
EXEC sp_addlinkedserver
    @server = 'ExcelServer2',
    @srvproduct = 'Excel',
    @provider = 'Microsoft.ACE.OLEDB.12.0',
    @datasrc = 'C:\Test\excel-sql-server.xlsx',
    @provstr = 'Excel 12.0;IMEX=1;HDR=YES;'
Conclusion
Using the described techniques, you can import data from Microsoft Excel 2003-2016 to SQL Server 2005-2016 on the 32-bit or 64-bit platform.
See Also
- References
- OPENQUERY (Transact-SQL)
- How-To
- How to use Excel with SQL Server linked servers and distributed queries
- Downloads
- Microsoft Access Database Engine 2010 Redistributable
- Microsoft Access Database Engine 2016 Redistributable
Table of Contents
- Introduction
- Building the Environment for Testing
- Creating an Excel File to test
- Installing the necessary components in Windows Server
- Enabling SQL Server Instance to Read File
- Querying and Importing the Spreadsheet
- Conclusion
- References
- See Also
- Other Languages
Introduction
We often have to perform data integration in SQL Server, with different data sources such as «.txt» files (tabular text or with separator character), «.csv» files or «.xls» (Excel) files.
It is not always possible to create an SSIS package for this kind of data import; a useful alternative is to use the OPENROWSET method.
In this article, we will import data from Excel files (.xls and .xlsx).
Building the Environment for Testing
To walk through the steps of importing data from an Excel file into a database table, we need to:
- Create a sample Excel file to import;
- Configure Windows Server, installing the necessary components;
- Configure the permissions the SQL Server instance needs to read the data files.
Let’s prepare the environment for the data import!
Creating an Excel File to test
In this step, we will create a sample Excel file with just a few rows for the demo.
Add a header row to explicitly define the data: «ID», «Item Name» and «Date Created».
The sequential data is only there to make the content being manipulated easier to visualize.
See this Excel file in the image below.
Installing the necessary components in Windows Server
To get the data through a query inside SQL Server, use an OLE DB Data Provider.
Most files can now use the Microsoft.ACE.OLEDB.12.0 Data Provider, which can be obtained for free through the Data Connectivity Components package.
This package provides the ODBC and OLE DB drivers for data manipulation. The Extended Properties value for each file type is listed below:
File Type (extension) | Extended Properties |
---|---|
Excel 97-2003 Workbook (.xls) | Excel 8.0 |
Excel 2007-2010 Workbook (.xlsx) | Excel 12.0 XML |
Excel 2007-2010 Macro-enabled workbook (.xlsm) | Excel 12.0 Macro |
Excel 2007-2010 Non-XML binary workbook (.xlsb) | Excel 12.0 |
There are two versions of this package: «AccessDatabaseEngine.exe» for the x86 platform and «AccessDatabaseEngine_x64.exe» for the x64 platform.
The minimum system requirements for this installation are listed on the same download page.
If you are installing the x86 package, you must ensure that your user is allowed access to the temporary directory of your Windows OS.
To find your temporary directory, open the «Control Panel» and click the «Advanced System Settings» option. In the window that opens, select the «Advanced» tab and click the «Environment Variables» button.
A new window opens with your environment variables, including the «TEMP» and «TMP» variables, which indicate your temporary directory.
See these windows in the image below.
So, if your operating system is 32-bit Windows (x86), you need to grant read and write access on this directory to the user of your SQL Server instance.
It’s important to remember that the user of your SQL Server instance must be a local user or the default «Local System» account to be granted this access.
See the Service Properties window in the image below.
Enabling SQL Server Instance to Read File
Executing a query against external data requires some settings and permissions to be in place before you can get data from Excel files (.xls or .xlsx) and other formats.
Distributed queries such as OPENROWSET can only be executed when the SQL Server instance has the Ad Hoc Distributed Queries configuration enabled. By default, this option is disabled on every SQL Server instance.
Note |
---|
The Advanced Settings should only be changed by an experienced professional or a certified professional in SQL Server. It’s important not to use these commands in production databases without prior analysis. |
To enable this feature, use the sp_configure system stored procedure in your SQL instance: first expose the advanced settings with the show advanced options parameter, then enable the Ad Hoc Distributed Queries setting to allow the use of distributed queries.
USE [master]
GO
-- CONFIGURING SQL INSTANCE TO ACCEPT ADVANCED OPTIONS
EXEC sp_configure 'show advanced options', 1
RECONFIGURE
GO
-- ENABLING USE OF DISTRIBUTED QUERIES
EXEC sp_configure 'Ad Hoc Distributed Queries', 1
RECONFIGURE
GO
These changes in the Advanced settings only take effect after the execution of the RECONFIGURE command.
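To confirm that the setting is active, and to switch it off again when the import is done, something like the following sketch could be used. It assumes the script above has already been run (so that show advanced options is on); the query against sys.configurations is an addition for illustration and is not part of the original article.

```sql
-- value_in_use = 1 means ad hoc distributed queries are currently enabled.
SELECT name, value, value_in_use
FROM sys.configurations
WHERE name = N'Ad Hoc Distributed Queries';
GO

-- Optional: turn the option off again once the import is finished.
EXEC sp_configure 'Ad Hoc Distributed Queries', 0;
RECONFIGURE;
GO
```

Disabling the option after use keeps the instance closer to its default, more restrictive configuration.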
Next, grant the instance permission to use the Data Provider through the sp_MSset_oledb_prop system stored procedure. The AllowInProcess parameter lets SQL Server load the Microsoft.ACE.OLEDB.12.0 provider in-process so that we can use its resources, and the DynamicParameters parameter allows dynamic parameters so that our queries can use T-SQL clauses.
USE [master]
GO
-- ADD DRIVERS IN SQL INSTANCE
EXEC master.dbo.sp_MSset_oledb_prop N'Microsoft.ACE.OLEDB.12.0', N'AllowInProcess', 1
GO
EXEC master.dbo.sp_MSset_oledb_prop N'Microsoft.ACE.OLEDB.12.0', N'DynamicParameters', 1
GO
See the output of this SQL script in the image below.
After setting up your SQL instance to use the Microsoft.ACE.OLEDB.12.0 Data Provider and granting the appropriate access permissions, we can run distributed queries against other data sources, in this case Excel files.
Querying and Importing the Spreadsheet
As this demo is for Excel files (.xls), we will perform a query using the OPENROWSET method against the Excel test file created earlier in this article.
This method takes the following parameters:
- Data Provider: in this case, Microsoft.ACE.OLEDB.12.0
- BULK Options: file version; file location; header (HDR); import mode (IMEX)
- Query: a T-SQL statement, with or without clauses to filter and process the data.
-- CONSULTING A SPREADSHEET
SELECT *
FROM OPENROWSET('Microsoft.ACE.OLEDB.12.0',
    'Excel 12.0; Database=C:\Microsoft\Test.xls; HDR=YES; IMEX=1',
    'SELECT * FROM [Plan1$]')
GO
See the output of this SQL script in the image below.
To group data and perform other data manipulation tasks, it is best to load the data into the database first. You can insert data into an existing table using the INSERT statement, or you can create a new table using the INTO clause of the SELECT statement.
-- IMPORTING A SPREADSHEET INTO A NEW TABLE
SELECT *
INTO TB_EXAMPLE
FROM OPENROWSET('Microsoft.ACE.OLEDB.12.0',
    'Excel 12.0; Database=C:\Microsoft\Test.xls; HDR=YES; IMEX=1',
    'SELECT * FROM [Plan1$]')
GO
SELECT * FROM TB_EXAMPLE
GO
See the output of this SQL script in the image below.
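The INSERT variant mentioned earlier, for loading into a table that already exists, might look like the following sketch. The table TB_EXAMPLE_EXISTING and its column types are hypothetical; the source column names are taken from the sample headers «ID», «Item Name» and «Date Created», and the file path is the one used in this demo.

```sql
-- Hypothetical target table; names and types are assumptions
-- based on the sample headers ID, Item Name and Date Created.
CREATE TABLE dbo.TB_EXAMPLE_EXISTING
(
    ID          INT           NULL,
    ItemName    NVARCHAR(255) NULL,
    DateCreated DATETIME      NULL
);
GO

-- Append the spreadsheet rows to the existing table.
INSERT INTO dbo.TB_EXAMPLE_EXISTING (ID, ItemName, DateCreated)
SELECT [ID], [Item Name], [Date Created]
FROM OPENROWSET('Microsoft.ACE.OLEDB.12.0',
    'Excel 12.0; Database=C:\Microsoft\Test.xls; HDR=YES; IMEX=1',
    'SELECT * FROM [Plan1$]');
GO
```

Listing the columns explicitly, rather than using SELECT *, keeps the load stable if columns are later added to the worksheet.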
It’s also important to check that the SQL Server service user has access to the Windows directory where the Excel files are stored.
Conclusion
Having an alternative way to import data with a T-SQL command is very useful, especially when we have to manipulate files in proprietary formats such as .xlsx, where the appropriate Data Provider is needed to obtain the data correctly and easily.
It’s important to make sure that only users who actually need to manipulate these files can use these resources, minimizing the vulnerability of your environment through permissions in your SQL Server.
References
- OPENROWSET (Transact-SQL)
- Import Bulk Data by Using BULK INSERT or OPENROWSET(BULK…) (SQL Server)
- OLE DB Providers Tested with SQL Server
- Excel Source
See Also
- Transact-SQL Portal
- Wiki: Portal of TechNet Wiki Portals
Other Languages
- Importando uma planilha Excel para um Banco de Dados SQL Server (pt-BR)
This article was awarded the silver medal in the TechNet Guru of April 2014
title | description | author | ms.author | ms.date | ms.service | ms.subservice | ms.topic | monikerRange |
---|---|---|---|---|---|---|---|---|
Import data from Excel to SQL Server or Azure SQL Database |
This article describes methods to import data from Excel to SQL Server or Azure SQL Database. Some use a single step, others require an intermediate text file. |
rwestMSFT |
randolphwest |
03/30/2023 |
sql |
data-movement |
conceptual |
=azuresqldb-current||>=sql-server-2016||>=sql-server-linux-2017||=azuresqldb-mi-current |
Import data from Excel to SQL Server or Azure SQL Database
[!INCLUDE SQL Server Azure SQL Database]
There are several ways to import data from Excel files to [!INCLUDE ssnoversion-md] or to Azure SQL Database. Some methods let you import data in a single step directly from Excel files; other methods require you to export your Excel data as text (CSV file) before you can import it.
This article summarizes the frequently used methods and provides links for more detailed information. A complete description of complex tools and services like SSIS or Azure Data Factory is beyond the scope of this article. To learn more about the solution that interests you, follow the provided links.
List of methods
There are several ways to import data from Excel. You may need to install SQL Server Management Studio (SSMS) to use some of these tools.
You can use the following tools to import data from Excel:
Export to text first ([!INCLUDE ssnoversion-md] and SQL Database) | Directly from Excel ([!INCLUDE ssnoversion-md] on-premises only) |
---|---|
Import Flat File Wizard | SQL Server Import and Export Wizard |
BULK INSERT statement | SQL Server Integration Services (SSIS) |
BCP | OPENROWSET function |
Copy Wizard (Azure Data Factory) | |
Azure Data Factory | |
If you want to import multiple worksheets from an Excel workbook, you typically have to run any of these tools once for each sheet.
[!IMPORTANT]
To learn more, see limitations and known issues for loading data to or from Excel files.
Import and Export Wizard
Import data directly from Excel files by using the [!INCLUDE ssnoversion-md] Import and Export Wizard. You also can save the settings as a SQL Server Integration Services (SSIS) package that you can customize and reuse later.
- In [!INCLUDEssManStudioFull], connect to an instance of the [!INCLUDEssNoVersion] [!INCLUDEssDE].
- Expand Databases.
- Right-click a database.
- Select Tasks.
- Choose to Import Data or Export Data:
:::image type="content" source="../../integration-services/import-export-data/media/start-wizard-ssms.jpg" alt-text="Start wizard SSMS":::
This launches the wizard:
:::image type="content" source="media/excel-connection.png" alt-text="Connect to an Excel data source":::
To learn more, review:
- Start the SQL Server Import and Export Wizard
- Get started with this simple example of the Import and Export Wizard
Integration Services (SSIS)
If you’re familiar with SQL Server Integration Services (SSIS) and don’t want to run the [!INCLUDE ssnoversion-md] Import and Export Wizard, create an SSIS package that uses the Excel Source and the [!INCLUDE ssnoversion-md] Destination in the data flow.
To learn more, review:
- Excel Source
- SQL Server Destination
To start learning how to build SSIS packages, see the tutorial How to Create an ETL Package.
:::image type="content" source="media/excel-to-sql-data-flow.png" alt-text="Components in the data flow":::
OPENROWSET and linked servers
[!IMPORTANT]
In Azure SQL Database, you cannot import directly from Excel. You must first export the data to a text (CSV) file.
[!NOTE]
The ACE provider (formerly the Jet provider) that connects to Excel data sources is intended for interactive client-side use. If you use the ACE provider on [!INCLUDE ssnoversion-md], especially in automated processes or processes running in parallel, you may see unexpected results.
Distributed queries
Import data directly into [!INCLUDE ssnoversion-md] from Excel files by using the Transact-SQL OPENROWSET or OPENDATASOURCE function. This usage is called a distributed query.
[!IMPORTANT]
In Azure SQL Database, you cannot import directly from Excel. You must first export the data to a text (CSV) file.
Before you can run a distributed query, you have to enable the ad hoc distributed queries server configuration option, as shown in the following example. For more info, see ad hoc distributed queries Server Configuration Option.
sp_configure 'show advanced options', 1;
RECONFIGURE;
GO
sp_configure 'ad hoc distributed queries', 1;
RECONFIGURE;
GO
The following code sample uses OPENROWSET to import the data from the Excel Sheet1 worksheet into a new database table.
USE ImportFromExcel;
GO
SELECT * INTO Data_dq
FROM OPENROWSET('Microsoft.ACE.OLEDB.12.0',
    'Excel 12.0; Database=C:\Temp\Data.xlsx', [Sheet1$]);
GO
Here’s the same example with OPENDATASOURCE.
USE ImportFromExcel;
GO
SELECT * INTO Data_dq
FROM OPENDATASOURCE('Microsoft.ACE.OLEDB.12.0',
    'Data Source=C:\Temp\Data.xlsx;Extended Properties=Excel 12.0')...[Sheet1$];
GO
To append the imported data to an existing table instead of creating a new table, use the INSERT INTO ... SELECT ... FROM ... syntax instead of the SELECT ... INTO ... FROM ... syntax used in the preceding examples.
To query the Excel data without importing it, just use the standard SELECT ... FROM ... syntax.
For more info about distributed queries, see the following articles:
- Distributed Queries (Distributed queries are still supported in [!INCLUDE sssql19-md], but the documentation for this feature hasn’t been updated.)
- OPENROWSET
- OPENDATASOURCE
Linked servers
You can also configure a persistent connection from [!INCLUDE ssnoversion-md] to the Excel file as a linked server. The following example imports the data from the Data worksheet on the existing Excel linked server EXCELLINK into a new [!INCLUDE ssnoversion-md] database table named Data_ls.
USE ImportFromExcel;
GO
SELECT * INTO Data_ls FROM EXCELLINK...[Data$];
GO
You can create a linked server from SQL Server Management Studio (SSMS), or by running the system stored procedure sp_addlinkedserver, as shown in the following example.
DECLARE @RC INT;
DECLARE @server NVARCHAR(128);
DECLARE @srvproduct NVARCHAR(128);
DECLARE @provider NVARCHAR(128);
DECLARE @datasrc NVARCHAR(4000);
DECLARE @location NVARCHAR(4000);
DECLARE @provstr NVARCHAR(4000);
DECLARE @catalog NVARCHAR(128);
-- Set parameter values
SET @server = 'EXCELLINK';
SET @srvproduct = 'Excel';
SET @provider = 'Microsoft.ACE.OLEDB.12.0';
SET @datasrc = 'C:\Temp\Data.xlsx';
SET @provstr = 'Excel 12.0';
EXEC @RC = [master].[dbo].[sp_addlinkedserver] @server, @srvproduct, @provider, @datasrc, @location, @provstr, @catalog;
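Once the linked server exists, it can help to check that it connects and to see which worksheets it exposes before importing. The following is a small sketch using the EXCELLINK name from the example; these verification calls are an addition for illustration, not a step from the original article.

```sql
-- Verify that SQL Server can connect to the linked server.
EXEC sp_testlinkedserver N'EXCELLINK';
GO

-- List the tables the provider exposes; for Excel these are
-- worksheets (names ending in $) and named ranges.
EXEC sp_tables_ex @table_server = N'EXCELLINK';
GO
```

The names returned by sp_tables_ex are the ones to use in the four-part form, for example EXCELLINK...[Data$].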
For more info about linked servers, see the following articles:
- Create Linked Servers
- OPENQUERY
For more examples and info about both linked servers and distributed queries, see the following article:
- How to use Excel with SQL Server linked servers and distributed queries
Prerequisite — Save Excel data as text
To use the rest of the methods described on this page — the BULK INSERT statement, the BCP tool, or Azure Data Factory — first you have to export your Excel data to a text file.
In Excel, select File | Save As and then select Text (Tab-delimited) (*.txt) or CSV (Comma-delimited) (*.csv) as the destination file type.
If you want to export multiple worksheets from the workbook, select each sheet, and then repeat this procedure. The Save as command exports only the active sheet.
[!TIP]
For best results with data importing tools, save sheets that contain only the column headers and the rows of data. If the saved data contains page titles, blank lines, notes, and so forth, you may see unexpected results later when you import the data.
The Import Flat File Wizard
Import data saved as text files by stepping through the pages of the Import Flat File Wizard.
As described previously in the Prerequisite section, you have to export your Excel data as text before you can use the Import Flat File Wizard to import it.
For more info about the Import Flat File Wizard, see Import Flat File to SQL Wizard.
BULK INSERT command
BULK INSERT is a Transact-SQL command that you can run from SQL Server Management Studio. The following example loads the data from the Data.csv comma-delimited file into an existing database table.
As described previously in the Prerequisite section, you have to export your Excel data as text before you can use BULK INSERT to import it. BULK INSERT can’t read Excel files directly. With the BULK INSERT command, you can import a CSV file that is stored locally or in Azure Blob storage.
USE ImportFromExcel;
GO
BULK INSERT Data_bi FROM 'C:\Temp\data.csv'
WITH (
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '\n'
);
GO
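If the exported CSV file still contains the column header row, loading it as-is would try to insert the header as data. One common workaround, sketched below for the same Data_bi table and file, is the FIRSTROW option. Treat this as an assumption-laden sketch: FIRSTROW counts rows by row terminator, so it only behaves as expected when the header line has the same field layout as the data rows.

```sql
USE ImportFromExcel;
GO
-- FIRSTROW = 2 starts loading at the second row, skipping one header line.
BULK INSERT Data_bi FROM 'C:\Temp\data.csv'
WITH (
    FIRSTROW = 2,
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '\n'
);
GO
```

If the header cannot be skipped reliably this way, removing it from the exported file before the load is the simpler option.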
For more info and examples for [!INCLUDE ssnoversion-md] and SQL Database, see the following articles:
- Import Bulk Data by Using BULK INSERT or OPENROWSET(BULK…)
- BULK INSERT
BCP tool
BCP is a program that you run from the command prompt. The following example loads the data from the Data.csv comma-delimited file into the existing Data_bcp database table.
As described previously in the Prerequisite section, you have to export your Excel data as text before you can use BCP to import it. BCP can’t read Excel files directly. Use BCP to import into [!INCLUDE ssnoversion-md] or SQL Database from a text (CSV) file saved to local storage.
[!IMPORTANT]
For a text (CSV) file stored in Azure Blob storage, use BULK INSERT or OPENROWSET. For an example, see Example.
bcp.exe ImportFromExcel..Data_bcp in "C:\Temp\data.csv" -T -c -t ,
For more info about BCP, see the following articles:
- Import and Export Bulk Data by Using the bcp Utility
- bcp Utility
- Prepare Data for Bulk Export or Import
Copy Wizard (ADF)
Import data saved as text files by stepping through the pages of the Azure Data Factory (ADF) Copy Wizard.
As described previously in the Prerequisite section, you have to export your Excel data as text before you can use Azure Data Factory to import it. Data Factory can’t read Excel files directly.
For more info about the Copy Wizard, see the following articles:
- Data Factory Copy Wizard
- Tutorial: Create a pipeline with Copy Activity using Data Factory Copy Wizard.
Azure Data Factory
If you’re familiar with Azure Data Factory and don’t want to run the Copy Wizard, create a pipeline with a Copy activity that copies from the text file to [!INCLUDE ssnoversion-md] or to Azure SQL Database.
As described previously in the Prerequisite section, you have to export your Excel data as text before you can use Azure Data Factory to import it. Data Factory can’t read Excel files directly.
For more info about using these Data Factory sources and sinks, see the following articles:
- File system
- SQL Server
- Azure SQL Database
To start learning how to copy data with Azure Data Factory, see the following articles:
- Move data by using Copy Activity
- Tutorial: Create a pipeline with Copy Activity using Azure portal
Common errors
"Microsoft.ACE.OLEDB.12.0" hasn’t been registered
This error occurs because the OLEDB provider isn’t installed. Install it from Microsoft Access Database Engine 2010 Redistributable. Be sure to install the 64-bit version if Windows and [!INCLUDE ssnoversion-md] are both 64-bit.
The full error is:
Msg 7403, Level 16, State 1, Line 3
The OLE DB provider "Microsoft.ACE.OLEDB.12.0" has not been registered.
Cannot create an instance of OLE DB provider "Microsoft.ACE.OLEDB.12.0" for linked server "(null)"
This indicates that the Microsoft OLE DB provider hasn’t been configured properly. Run the following Transact-SQL code to resolve this:
EXEC sp_MSset_oledb_prop N'Microsoft.ACE.OLEDB.12.0', N'AllowInProcess', 1;
EXEC sp_MSset_oledb_prop N'Microsoft.ACE.OLEDB.12.0', N'DynamicParameters', 1;
The full error is:
Msg 7302, Level 16, State 1, Line 3
Cannot create an instance of OLE DB provider "Microsoft.ACE.OLEDB.12.0" for linked server "(null)".
The 32-bit OLE DB provider "Microsoft.ACE.OLEDB.12.0" cannot be loaded in-process on a 64-bit SQL Server
This occurs when a 32-bit version of the OLE DB provider is installed with a 64-bit [!INCLUDE ssnoversion-md]. To resolve this issue, uninstall the 32-bit version and install the 64-bit version of the OLE DB provider instead.
The full error is:
Msg 7438, Level 16, State 1, Line 3
The 32-bit OLE DB provider "Microsoft.ACE.OLEDB.12.0" cannot be loaded in-process on a 64-bit SQL Server.
The OLE DB provider "Microsoft.ACE.OLEDB.12.0" for linked server "(null)" reported an error.
Cannot initialize the data source object of OLE DB provider "Microsoft.ACE.OLEDB.12.0" for linked server "(null)"
Both of these errors typically indicate a permissions issue between the [!INCLUDE ssnoversion-md] process and the file. Ensure that the account that is running the [!INCLUDE ssnoversion-md] service has full access permission to the file. We recommend against trying to import files from the desktop.
The full errors are:
Msg 7399, Level 16, State 1, Line 3
The OLE DB provider "Microsoft.ACE.OLEDB.12.0" for linked server "(null)" reported an error. The provider did not give any information about the error.
Msg 7303, Level 16, State 1, Line 3
Cannot initialize the data source object of OLE DB provider "Microsoft.ACE.OLEDB.12.0" for linked server "(null)".
Next steps
- Get started with this simple example of the Import and Export Wizard
- Import data from Excel or export data to Excel with SQL Server Integration Services (SSIS)
- bcp Utility
- Move data by using Copy Activity
There are many articles about writing code to import an Excel file, but this is a manual/shortcut version:
If you don’t need to import your Excel file programmatically using code, you can do it very quickly using the menu in SQL Server Management Studio (SSMS).
The quickest way to get your Excel file into SQL is by using the import wizard:
- Open SSMS (SQL Server Management Studio) and connect to the database where you want to import your file.
- Import Data: in SSMS, in Object Explorer under ‘Databases’, right-click the destination database and select Tasks, Import Data. An import wizard will pop up (you can usually just click Next on the first screen).
- The next window is ‘Choose a Data Source’. Select Excel:
  - In the ‘Data Source’ dropdown list, select Microsoft Excel (this option should appear automatically if you have Excel installed).
  - Click the ‘Browse’ button to select the path to the Excel file you want to import.
  - Select the version of the Excel file (97-2003 is usually fine for files with a .XLS extension, or use 2007 for newer files with a .XLSX extension).
  - Tick the ‘First Row has headers’ checkbox if your Excel file contains headers.
  - Click Next.
- On the ‘Choose a Destination’ screen, select the destination database:
  - Select the ‘Server name’ and Authentication (typically your SQL username and password), and select a Database as the destination. Click Next.
- On the ‘Specify Table Copy or Query’ window:
  - For simplicity, just select ‘Copy data from one or more tables or views’ and click Next.
- ‘Select Source Tables’: choose the worksheet(s) from your Excel file and specify a destination table for each worksheet. If you don’t have a table yet, the wizard will very kindly create a new table that matches all the columns from your spreadsheet. Click Next.
- Click Finish.
With ODBC, you can summarise, and select just the data you need, in an Excel workbook before importing it into SQL Server. You can join data from different areas or worksheets. You can even get data from the result of a SQL Server SELECT statement into an Excel spreadsheet. Phil Factor shows how, and warns of some of the pitfalls.
Why Use ODBC?
It is reasonably easy to insert data from Excel into SQL Server, or the reverse, from any other ODBC database to any other, using PowerShell. The most important direction is from Excel to SQL Server, of course. It is quicker than automating Excel and you can do it without requiring a copy of Excel. It is neater than SSIS too, and more versatile. The most important thing, though, is that you can aggregate before you send the data. It is possible to do a lot of filtering and aggregation of data before it ever gets to SQL Server, since you can turn an existing Excel Workbook into a poor-man’s relational database, or even create one. This article will aim to show how this is done.
I always feel slightly awkward in talking about ODBC. It is a Once and Future technology, developed before its time, but now showing its value for processing large volumes of data, despite its quirks, poor documentation and lackluster support. If you use the ODBC driver, then your Excel workbook becomes a little SQL-based relational database. Worksheets, or areas within worksheets, become tables. There are some features missing, of course, but you can do joins between tables, filter rows to taste, do aggregations and some string manipulations. This means that you need to pull far less data into SQL Server because you can do a lot of selection and pre-processing before the data gets anywhere near it. If, for example, you only need the total, count, and variance of a day’s readings, then why on earth would you want to import more than those aggregated figures? Even if you do, these aggregations, performed on the original data, can be used as a ‘reconciliation’ check that you’ve gulped all the data into their final destination without error.
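The ‘reconciliation’ idea can be sketched in a few lines. This is only an illustration in Python, not part of the article’s PowerShell routines: the function names and the sample readings are hypothetical, and it assumes you have already fetched the same column from the workbook and from the destination table as two lists of numbers.

```python
import math
from statistics import pvariance


def summarise(readings):
    """Return (count, total, population variance) for a list of numeric readings."""
    values = [float(v) for v in readings]
    return len(values), math.fsum(values), pvariance(values)


def reconciles(source, imported, tolerance=1e-9):
    """True when both data sets give the same count, total and variance."""
    src, dst = summarise(source), summarise(imported)
    if src[0] != dst[0]:          # row counts must match exactly
        return False
    return all(
        math.isclose(a, b, rel_tol=tolerance, abs_tol=tolerance)
        for a, b in zip(src[1:], dst[1:])
    )


# Hypothetical sample values standing in for a day's readings.
source_readings = [12.1, 13.4, 9.8, 15.0]
imported_readings = [12.1, 13.4, 9.8, 15.0]
print(reconciles(source_readings, imported_readings))
```

In practice the three figures would come from a `SELECT COUNT(*), SUM(...)` style query on each side; the comparison logic stays the same.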
I also prefer to use ODBC and the sequential data reader to read data from Excel, or any other ODBC source, because it is fast; and I like to use the bulk copy library to insert ODBC ‘reader’ data into a SQL Server table because it is extremely fast, so we’ll use that. When you have a large number of big spreadsheets to insert as a chore, then speed matters.
The ODBC Excel driver (ACE)
ODBC was conceived as a way of making it easy to connect to a particular data source, such as a relational database, text file, data document (e.g. XML), web-based data or spreadsheet.
Currently, the state of the art in ODBC for Access and Excel is the Microsoft Access Database Engine 2010 Redistributable which can be downloaded here. This includes the more popular OLEDB drivers which run well in PowerShell too. These drivers enable you to access a range of data files via SQL as if they were a relational database. Formats include Access, CSV, delimited, DBase and Excel.
For developing on a general-purpose 64-bit desktop computer, you’re likely to hit a very silly Microsoft muddle. Microsoft recommends that you install the 32-bit version of Office 2010, even on 64-bit machines, since many of the common Office Add-ins did not run in the 64-bit Office environment. This advice has become baked-in ‘best practice’. If you are using 64-bit PowerShell, as most of us are, then you need to use the 64-bit version of the drivers. If you only have the 32-bit Office on your machine, then it will already have the 32-bit drivers, which won’t be visible to 64-bit PowerShell, and won’t work. You can’t install the 64 bit drivers when you already have the 32-bit drivers and I don’t think you can get anything good to happen by uninstalling the 32-bit drivers. Nope. All three (or four if you include Visual Studio) must be 64 bit. I gather that one way out of this Catch 22 is to first install the 64-bit Office 2010 ODBC/OleDB drivers and after that the (32-bit) Office, but there is a better fix that involves tweaking the registry. See this for the full frustrating story.
The ODBC Excel driver in ACE works with the latest Excel spreadsheet format up to 2010 as well as the old ones. I suspect that the latest version will work with Office 2013, though I haven’t yet tried it.
This driver is valuable because of the flexibility it gives. It actually executes ODBC SQL, which is a subset of SQL92, so you can put in column aliases, change the column order, and filter out rows that you don’t want to import. In effect, it presents you with SQL tables, which can be named ranges if it is an existing worksheet that you’ve added named ranges to.
Select * from MyNamedRange
More commonly, you can specify with a delimited worksheet name followed by a range, the range being a specification of the area of the worksheet just sufficient to enable the driver to find the data you want. If you leave out the range spec entirely, the entire worksheet becomes the table.
Select * from [MyWorksheet$]
If, for example, you wanted the data in the range from C3 to L8, you’d use the statement
Select * from [MyWorksheet$C3:L8]
In ODBC, if you specified, say, row 8 as the end of the table, you can only select rows up to row 8, even if you have inserted more rows beyond that limit, as ODBC allows. If you use some flavours, such as the old MDAC ‘JET’ database engine, then you cannot add new rows beyond the defined limits of a range, otherwise you will get the Exception: "Cannot expand named range" message.
If you wanted to define your table as being between the columns C and L, starting at row 3, you’d use
Select * from [NameOfExcelSheet$C3:L]
If you do this, then there is no limit to the length of the table so you can insert as many rows as you like. The ODBC provider adds new rows to the existing rows in the defined area as space allows.
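The three forms of table reference described above (whole worksheet, bounded range, open-ended range) follow one pattern, which the sketch below assembles. It is a hypothetical Python helper for illustration only; the function name and the simple range check are assumptions, and a named range needs no helper because it is used as a plain table name.

```python
import re

# Accepts forms such as "C3:L8", "C3:L" or "A:B"; a deliberately loose check.
_RANGE = re.compile(r"^[A-Za-z]{1,3}\d*(:[A-Za-z]{1,3}\d*)?$")


def excel_table_ref(worksheet, cell_range=""):
    """Build the bracketed [Worksheet$Range] table name used in the queries above."""
    if "[" in worksheet or "]" in worksheet:
        raise ValueError("worksheet name must not contain brackets")
    if cell_range and not _RANGE.match(cell_range):
        raise ValueError(f"unrecognised range: {cell_range!r}")
    # The $ separates the worksheet name from the (optional) range.
    return f"[{worksheet}${cell_range}]"


print("Select * from " + excel_table_ref("MyWorksheet"))
print("Select * from " + excel_table_ref("MyWorksheet", "C3:L8"))
print("Select * from " + excel_table_ref("NameOfExcelSheet", "C3:L"))
```

Validating the range before it is concatenated into SQL is a design choice of this sketch: the worksheet and range usually come from configuration, and a typo is easier to diagnose here than in a driver error.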
The dreaded connection string
Now, before we start doing interesting things with the ACE drivers, I ought to explain a bit about their connection strings. These contain the specification of the ODBC driver you wish to use, and the settings that you wish to transmit to the driver.
Ignoring, for the time being, the extended property settings, for Microsoft Office Access data, set the Connection String to
"Driver={Microsoft Access Driver (*.mdb, *.accdb)};DBQ=MyPath/MyFile"
For Excel data, use
"Driver={Microsoft Excel Driver (*.xls, *.xlsx, *.xlsm, *.xlsb)};DBQ=MyPath/MyFile"
For dBASE data, use
"Driver={Microsoft Access dBASE Driver (*.dbf, *.ndx, *.mdx)};DBQ=MyPath/MyFile"
For text data, use
"Driver={Microsoft Access Text Driver (*.txt, *.csv)};DBQ=MyPath"
But you’re likely to want some extended properties for the settings to add a few details about the way that the ODBC provider should tackle this particular connection. Because the defaults can be changed globally in the registry, it is rather better to specify these extended properties rather than to rely on the defaults.
These extended properties are only relevant for the driver that you’re using. They are not always reliable and are poorly documented by Microsoft. I’ll only mention the essentials.
The driver needs to know if the first row of the table holds the name of the column. “HDR=Yes;” indicates that the first row contains column names, not data. It will actually just use the first 64 characters of the header. “HDR=No;” treats the first row as data, but then the columns are named F1 onwards and you’d want to alias them in your SQL statements to give them meaningful column names.
The Excel ODBC driver doesn’t keep a detailed schema definition of the tables (the Text and Access drivers, by contrast, do). The ODBC Excel driver will try to make sense of the data it finds by testing it to see what datatype it can use for the result. It does so by testing a number of rows before doing the import, and you can specify how many rows it tests before deciding the data type of the column by using MaxScanRows in the extended properties. By default the value of this is 8. You can specify any value from 1 to 16 to scan that number of rows. You can also set the value to 0 so that it searches all existing rows before deciding the data type, but this slows things down.
This is fine for a relational table, but Excel often has mixed types in a column. The ODBC Provider will try to return the data of the majority type, but return NULL values for the rest that won’t convert. If the two types are equally mixed in the column, the provider chooses numeric over text, and you lose all the text. Also, it will judge the length of the character datatype in the column from the first rows and if the first rows are less than 255 characters long it will truncate all the subsequent data to 255 characters even if cell values below are longer.
By setting the Import Mode (IMEX=1), you can force mixed data to be converted to text, but only when it finds mixed values on the rows that it checks.
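To make the consequences of this rule concrete, here is a small Python simulation of the behaviour as described in the preceding paragraphs. It is not the driver’s code, and the function name and return shape are invented for illustration; it only models the scan window, the majority vote, the tie going to numeric, and the IMEX=1 override.

```python
def guess_column(values, max_scan_rows=8, imex=False):
    """Mimic the described type-guessing rule for one worksheet column.

    Returns (kind, converted_values), with None standing in for NULL.
    """
    # MaxScanRows=0 means "scan every row"; otherwise only the first N rows.
    sample = values if max_scan_rows == 0 else values[:max_scan_rows]
    numeric = sum(isinstance(v, (int, float)) for v in sample)
    text = sum(isinstance(v, str) for v in sample)

    # IMEX=1 only helps when the scanned rows actually contain both types.
    if imex and numeric > 0 and text > 0:
        return "text", [str(v) for v in values]

    if numeric >= text:               # a tie goes to numeric
        kind, keep = "numeric", (int, float)
    else:
        kind, keep = "text", str
    # Values of the losing type cannot be converted and come back as NULL.
    return kind, [v if isinstance(v, keep) else None for v in values]


column = [1, 2, 3, "n/a", 5, 6, 7, 8, "late text"]
print(guess_column(column))
print(guess_column(column, imex=True))
print(guess_column([1] * 8 + ["x"], imex=True))
```

The last call shows the caveat: when the only text value sits beyond the scanned rows, IMEX=1 never sees mixed data, so the column is still treated as numeric and the text is lost.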
You can also open the Excel workbook in read-only mode by specifying ReadOnly=true; by default the ReadOnly attribute is false, so you can modify data within your workbook. However, this will lock the entire workbook from access until you close the connection.
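Since it is better to state these settings explicitly than to rely on registry defaults, it can help to assemble the connection string in one place. The Python sketch below is a hypothetical helper: it follows the layout of the connection strings used in this article (ReadOnly as a top-level keyword, the rest inside Extended Properties), and whether a given driver version honours every keyword in that position is an assumption carried over from those strings rather than something verified here.

```python
def excel_odbc_connection_string(path, header=True, read_only=True,
                                 imex=None, max_scan_rows=None):
    """Assemble an Excel ODBC connection string with explicit settings."""
    driver = "{Microsoft Excel Driver (*.xls, *.xlsx, *.xlsm, *.xlsb)}"

    extended = ["HDR=" + ("YES" if header else "NO")]
    if imex is not None:
        extended.append(f"IMEX={imex}")
    if max_scan_rows is not None:
        # The text above gives 0 (scan all rows) or 1 to 16 as valid values.
        if not 0 <= max_scan_rows <= 16:
            raise ValueError("MaxScanRows must be between 0 and 16")
        extended.append(f"MaxScanRows={max_scan_rows}")

    props = ";".join(extended)
    read_only_text = "true" if read_only else "false"
    return (f"Driver={driver};DBQ={path};ReadOnly={read_only_text};"
            f'Extended Properties="{props}"')


# Hypothetical path, shown only to illustrate the resulting string.
print(excel_odbc_connection_string(r"C:\data\pubs.xlsx", imex=1, max_scan_rows=0))
```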
Let’s try it out.
Just so you can prove all this to yourself, I’ve supplied an Excel workbook that represents the old PUBS database that used to be distributed with SQL Server and Sybase. This means that you can use SQL from old examples that use PUBS and see what works. All you need to do is to convert the SQL Server version slightly by altering the names of the tables slightly to tell the driver that you want the entire worksheet of that name (the $ is the separator between the worksheet name and the range specification)
So let’s pop together a very simple test-rig to try things out in PowerShell. Be warned, I’ve set this up in read-write mode so it will update your spreadsheet in some circumstances (CUD). To play along, you’ll need to download my Excel version of the PUBS database and alter the path to the excel file.
set-psdebug -strict
$ErrorActionPreference = "stop"

$ExcelFilePath = 'MyPath\pubs.xlsx' #the full path of the excel workbook
if (!(Test-Path $ExcelFilePath))
{
    Write-Error "Can't find '$($ExcelFilePath)'. Sorry, can't proceed because of this"
    exit
}

try
{
    $Connection = New-Object system.data.odbc.odbcconnection
    $Connection.ConnectionString = 'Driver={Microsoft Excel Driver (*.xls, *.xlsx, *.xlsm, *.xlsb)};DBQ=' + $ExcelFilePath + '; Extended Properties="Mode=ReadWrite;ReadOnly=false; HDR=YES"'
    $Connection.Open()
}
catch
{
    $ex = $_.Exception
    Write-Error "whilst opening connection to $ExcelFilePath : $($ex.Message) Sorry, can't proceed because of this"
    exit
}

try
{
    $Query = New-Object system.data.odbc.odbccommand
    $Query.Connection = $connection
    $Query.CommandText = @'
SELECT title, SUM(qty) AS sales,
  COUNT(*) AS orders
FROM [titles$] t
INNER JOIN [sales$] s ON t.title_id=s.title_id
WHERE title like '%?'
GROUP BY title
ORDER BY SUM(qty) DESC
'@
    $Reader = $Query.ExecuteReader([System.Data.CommandBehavior]::SequentialAccess) #get the datareader and just get the result in one gulp
}
catch
{
    $ex = $_.Exception
    #the reader was never created if we get here, so only the connection needs closing
    $Connection.Close()
    Write-Error "whilst executing the query '$($Query.CommandText)' $($ex.Message) Sorry, but we can't proceed because of this!"
    Exit;
}

Try
{
    $Counter = $Reader.FieldCount #get it just once
    $result = @() #initialise the empty array of rows
    while ($Reader.Read())
    {
        $Tuple = New-Object -TypeName 'System.Management.Automation.PSObject'
        foreach ($i in (0..($Counter - 1)))
        {
            Add-Member `
                -InputObject $Tuple `
                -MemberType NoteProperty `
                -Name $Reader.GetName($i) `
                -Value $Reader.GetValue($i).ToString()
        }
        $Result += $Tuple
    }
    $result | Format-Table
}
catch
{
    $ex = $_.Exception
    Write-Error "whilst reading the data from the datatable. $($ex.Message)"
}
$Reader.Close()
$Connection.Close()
All these work
Inner joins
SELECT logo, pr_info, pub_name, city, state, country FROM [pub_info$] pif INNER JOIN [publishers$] p ON p.pub_id=pif.pub_id
Left or right outer joins
SELECT title, stor_id, ord_num, qty, ord_date FROM [titles$] t LEFT OUTER JOIN [sales$] s ON t.title_id=s.title_id
Expressions using columns
SELECT fname+' '+ minit+' '+lname AS name, job_desc FROM [jobs$] d INNER JOIN [employee$] e ON d.job_id=e.job_id
Simple GROUP BY expression
SELECT COUNT(*) FROM [sales$] GROUP BY stor_ID
More complex aggregation with ORDER BY clause and a WHERE clause
SELECT title, SUM(qty) AS sales, COUNT(*) AS orders FROM [titles$] t INNER JOIN [sales$] s ON t.title_id=s.title_id WHERE title like '%?' GROUP BY title ORDER BY SUM(qty) DESC
String functions
SELECT title, left(notes,20)+'…' as [note] FROM [titles$]
UNION and UNION ALL
SELECT au_fname FROM [authors$] UNION ALL SELECT lname FROM [employee$]
One could go on and on; even subqueries work, but I think I’ve made the point that there is far more power in this ODBC Excel driver than just the facility for pulling out raw data. The same is true of the TEXT driver for OLEDB. It will do all this as well. To conform with the minimum syntax for ODBC, a driver must be able to execute CREATE TABLE, DELETE FROM (searched), DROP TABLE, INSERT INTO, SELECT, SELECT DISTINCT, and UPDATE (searched). SELECT statements can have WHERE and ORDER BY clauses. ACE does a bit better than this, since even the text driver allows SELECT INTO, and SELECT statements allow GROUP BY and HAVING.
Creating a spreadsheet
You can, of course, use the ODBC driver to create an Excel spreadsheet and write data into it. Here is the simplest working demo I can write without blushing. Be careful to ensure that the spreadsheet doesn’t exist as the whole point of the demo is to prove to you that it can create an entire spreadsheet workbook with several worksheets.
$ExcelFilePath = 'MyPath\New.xlsx' #the full path of the excel workbook
$Header = $true # we want your first row to be column headers
try
{
    $Connection = New-Object system.data.odbc.odbcconnection
    $TheConnectionString = 'Driver={Microsoft Excel Driver (*.xls, *.xlsx, *.xlsm, *.xlsb)};DBQ=' + $ExcelFilePath + ';Mode=ReadWrite;ReadOnly=false;Extended Properties="HDR=' + "$(if ($Header) { 'YES' } else { 'NO' })" + '"'
    $Connection.ConnectionString = $TheConnectionString
    $Connection.Open()
}
catch
{
    $ex = $_.Exception
    Write-Error "whilst opening connection to $ExcelFilePath using '$($TheConnectionString)' : $($ex.Message)"
}
$Mycommand = $connection.CreateCommand()
$MyCommand.CommandText = "create table MyTable (MyColumn varchar, MyOtherColumn varchar)"
if ($Mycommand.ExecuteNonQuery() -eq -1)
{
    $MyCommand.CommandText = "insert into MyTable (MyColumn, MyOtherColumn) select 'myfirstRowCol','myFirstRowCol2'"
    $rows = $Mycommand.ExecuteNonQuery()
    "$rows rows inserted into worksheet MyTable"
}
$connection.Close()
Notice that I can’t create the table and do the insert in one batch as a command. One statement only can be used in the commandText.
Exploring your Excel metadata
You can find out what datatypes are available for any ODBC source by using the OdbcConnection.GetSchema(string) method.
$Datatypes=$connection.GetSchema('DATATYPES').TypeName
Which with my connection gives only the LOGICAL, CURRENCY, NUMBER, VARCHAR and DATETIME datatypes. More useful is..
$tables=$connection.GetSchema('TABLES').Table_Name
… that gives you a list of the available worksheets. The complete list, if you wish to peep at them, is
$connection.GetSchema('TABLES')
$connection.GetSchema('DATATYPES')
$connection.GetSchema('DataSourceInformation')
$connection.GetSchema('Restrictions')
$connection.GetSchema('ReservedWords')
$connection.GetSchema('Columns')
$connection.GetSchema('Indexes')
$connection.GetSchema('Views')
Hmm. This is beginning to look a bit more like a database. With the Columns MetadataCollection, you can find out as much as you’d ever want to know about the data that is available in the spreadsheet so if you want to read all the worksheets straight into SQL Server, this is a wide-open goal.
Creating Worksheets
Going back to the PUBS Excel database, let’s create a people table and populate it with both authors and salespeople. This has to be done in three gulps since the driver seems to dislike the idea of doing a batch, and it kicks when I try to UNION the two results.
$ExcelFilePath = 'C:\Users\Administrator\Documents\POSHScripts\Pubs.xlsx' #the full path of the excel workbook
$Header = $true # true if you want your first row to be read as column headers
if (!(Test-Path $ExcelFilePath))
{
    Write-Error "Can't find '$($ExcelFilePath)'. Sorry, can't proceed because of this"
    exit
}
try
{
    $Connection = New-Object system.data.odbc.odbcconnection
    $TheConnectionString = 'Driver={Microsoft Excel Driver (*.xls, *.xlsx, *.xlsm, *.xlsb)};DBQ=' + $ExcelFilePath + ';Mode=ReadWrite;ReadOnly=false;Extended Properties="HDR=' + "$(if ($Header) { 'YES' } else { 'NO' })" + '"'
    $Connection.ConnectionString = $TheConnectionString
    $Connection.Open()
}
catch
{
    $ex = $_.Exception
    Write-Error "whilst opening connection to $ExcelFilePath using '$($TheConnectionString)' : $($ex.Message)"
}
$Mycommand = $connection.CreateCommand()
#single-quoted here-strings, so that the $ in the worksheet names is taken literally
$MyCommand.CommandText = @'
CREATE TABLE people (Person varchar)
'@
if ($Mycommand.ExecuteNonQuery() -eq -1)
{
    $MyCommand.CommandText = @'
INSERT into [people$](person)
SELECT lname FROM [employee$]
'@
    $rows = $Mycommand.ExecuteNonQuery()
    $MyCommand.CommandText = @'
INSERT into [people$](person)
SELECT au_fname FROM [authors$]
'@
    $rows = $rows + $Mycommand.ExecuteNonQuery()
}
"$rows rows inserted into table"
$connection.Close()
You’ll find you can UPDATE, INSERT and DELETE data perfectly happily this way. If you connect up a spreadsheet to a SQL Server database, then you can have a lot of fun copying entire databases into spreadsheets, and back again. Robyn and I show how to do this here.
The problem is in the Workbook you create. Whether you name it XLS or XLSX it produces an XLSX spreadsheet, in the latest zipped Office Open XML form. The trouble is that, with my version of the driver, I can only get Excel to read it with the XLS filetype, since it says that there is an error if you try to open it as an .XLSX file. I suspect that the ODBC driver hasn’t been that well tested by Microsoft.
Getting data into SQL Server from Excel using PowerShell
Now, what about using PowerShell and ODBC to copy the data, maybe filtered, sorted and aggregated, into SQL Server? In this direction we can save a lot of time by using the BCP library. We’ll now describe the routine.
We’ll keep this unpacked, as a script rather than a function, since this is designed to illustrate the process.
We’ll start by defining our credentials, preferences, sources and destinations. We’ll read in the data from an Excel spreadsheet and then spit it out into SQL Server, creating a table if necessary. To create the destination table (some of these spreadsheets are rather wide and therefore easier to import automatically), we’ll need to examine the metadata, and to interpret this to the SQL Server equivalent, so we’ll do that. To use the BCP library, it is good to have an indication of progress so I’ll show how you do that.
I’ve provided the sample data so that you don’t have to scramble around to find something suitable. This is some climate data, which is handy for checking things like date conversion.
You will notice that although you can render numbers in a variety of ways, there is only one way of storing numbers in Excel, in the ‘NUMBER’ datatype (the other datatypes in Excel are LOGICAL, CURRENCY, VARCHAR and DATETIME). I’ve therefore had to specify the precision of numeric data, which is tough if you have some columns with integers and others with real decimal data with numbers after the decimal point (scale). Remember that this routine is just creating a staging table, not the final destination. All you need to do is to add your own statements to transfer the data to their final table with the CAST to the correct internal data type!
set-psdebug -strict
$ErrorActionPreference = "stop"

$ExcelFilePath = 'MyPath\CambridgeWeatherData.xlsx' #the full path of the excel workbook
$Worksheet = 'cambridgedata' #this is the actual worksheet where the data is
$DataRange = '' #e.g. 'A2:M33' this is the range of the cells that make up the table. leave blank to read the whole worksheet
# leave out the second row number to read all rows from the column range
$Header = $true # true if you want your first row to be read as column headers
# If you aren't reading column headers, the columns are labelled F1..Fn. You can easily specify them
#$ColumnNames='''2011'' as year,[F1] as Day'
$ColumnNames = '*' #If you don't have fieldnames in the header of your worksheet, you can specify $Header= $false and use F1..Fn instead.
$DestinationTable = 'CambridgeClimateData' #the name of the SQL Server table where you want to put the data
$Destinationinstance = 'MyInstance' #the name of the server or instance
$Destinationdatabase = 'MyDataBase' #the name of the database where you want to put the data
$DestinationWindowsSecurity = $true #or $False if you aren't using Windows security
$DestinationUserID = '' #the name of the SQL Server user if not integrated security
$DeleteContentsOfTableBeforeCopy = $false
$PrecisionForNumericData = 1

if (!(Test-Path $ExcelFilePath))
{
    Write-Error "Can't find '$($ExcelFilePath)'. Sorry, can't proceed because of this"
    exit
}

try
{
    $Connection = New-Object system.data.odbc.odbcconnection
    $TheConnectionString = 'Driver={Microsoft Excel Driver (*.xls, *.xlsx, *.xlsm, *.xlsb)};DBQ=' + $ExcelFilePath + '; Extended Properties="READONLY=TRUE; HDR=' + "$(if ($Header) { 'YES' } else { 'NO' })" + '"'
    $Connection.ConnectionString = $TheConnectionString
    $Connection.Open()
}
catch
{
    $ex = $_.Exception
    Write-Error "whilst opening connection to $ExcelFilePath using '$($TheConnectionString)' : $($ex.Message). Sorry, can't proceed because of this"
    exit
}

# get the types via $Connection.GetSchema('DataTypes')|select TypeName, DataType,SQLType
try
{
    $Query = New-Object system.data.odbc.odbccommand
    $Query.Connection = $connection
    #note the space after 'Select', so that a list of column names doesn't run into the keyword
    $Query.CommandText = 'Select ' + $columnNames + ' from [' + $Worksheet + '$' + $DataRange + ']'
    $Reader = $Query.ExecuteReader([System.Data.CommandBehavior]::SequentialAccess) #get the datareader and just get the result in one gulp
}
catch
{
    $ex = $_.Exception
    Write-Error "whilst making the query '$($Query.CommandText)' $($ex.Message) Sorry, but we can't proceed because of this!"
    Exit;
}

$columns = $reader.GetSchemaTable() | select columnName, datatype
if ($DeleteContentsOfTableBeforeCopy) { $deletionScript = "ELSE DELETE from $DestinationTable " }
else { $deletionScript = '' }
$CreateScript = @"
IF NOT EXISTS
(select TABLE_NAME from information_schema.tables
  where TABLE_NAME like '$DestinationTable')
CREATE TABLE $DestinationTable (
"@
#map each column's .NET type onto a SQL Server type for the staging table
$CreateScript += $columns | foreach-object{
    $datatype = "$($_.dataType)"
    $sqlType = switch ($datatype)
    {
        'double' { "numeric(18,$PrecisionForNumericData)" }
        'boolean' { 'int' }
        'decimal' { 'Money' }
        'datetime' { 'DateTime' }
        default { 'NVARCHAR(MAX)' }
    }
    "`n`t[$($_.columnName.Trim())] $sqlType,"
}
#trim the trailing comma and close the column list
$CreateScript = $CreateScript.Substring(0, $CreateScript.Length - 1) + "`n`t)`n $deletionScript"
$DestinationConnectionString = "Data Source=$Destinationinstance;Initial Catalog=$Destinationdatabase;$(
    if ($DestinationWindowsSecurity) { 'integrated security=true' }
    else { 'User Id=' + $DestinationUserID + ';Password=' + "$(((Get-Credential $DestinationUserID).GetNetworkCredential()).Password)" + ';integrated security=false' }
)"
try
{
    #test to see if the table is there. If it isn't, then create it. If it is, then delete the contents
    $SqlCommand = new-object ('Data.SqlClient.SqlCommand') $CreateScript, $DestinationConnectionString;
    $SqlCommand.Connection.Open();
    $handler = [System.Data.SqlClient.SqlInfoMessageEventHandler] { param ($sender, $event) Write-Host "Message: $($event.Message)" };
    $SqlCommand.Connection.add_InfoMessage($handler);
    $success = $SqlCommand.ExecuteNonQuery();
    #now squirt the data in using the bulk copy library.
    $bulkCopy = new-object ("Data.SqlClient.SqlBulkCopy") $DestinationConnectionString
    $bulkCopy.DestinationTableName = $DestinationTable
    $bulkCopy.BatchSize = 5000 #The number of rows in each batch sent to the server
    $bulkcopy.NotifyAfter = 200 #The number of rows to copy before firing a notification
    $bulkCopy.BulkCopyTimeout = 0 #the number of seconds before a time-out
    $objectEvent = Register-ObjectEvent $bulkcopy SqlRowsCopied -Action { write-host "Copied $($eventArgs.RowsCopied) rows " }
    $bulkCopy.WriteToServer($reader) #copy all rows to the server
}
catch
{
    $ex = $_.Exception
    Write-Error "Whilst doing the bulk copy '$($Query.CommandText)' $($ex.Message) Sorry, but we can't proceed because of this!"
}
$Reader.Close()
$SqlCommand.Connection.Close()
$Connection.Close()
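The step in the script above that turns the reader’s column metadata into a staging table can be looked at in isolation. The following Python sketch mirrors the same mapping (double to numeric, boolean to int, decimal to money, datetime to datetime, everything else to nvarchar(max)); the function name and the input format of (column name, type name) pairs are assumptions made for illustration. Note that the second argument of numeric(18, n) is, strictly, the scale, which is what the script’s $PrecisionForNumericData setting controls.

```python
def staging_table_ddl(table, columns, numeric_scale=1):
    """Generate a CREATE TABLE statement for a staging table.

    columns is a list of (column_name, type_name) pairs, where type_name is
    the simple type name reported for the column, e.g. 'double'.
    """
    type_map = {
        "double": f"numeric(18,{numeric_scale})",
        "boolean": "int",
        "decimal": "money",
        "datetime": "datetime",
    }
    lines = []
    for name, type_name in columns:
        # Anything unrecognised is staged as text and CAST later.
        sql_type = type_map.get(type_name.lower(), "nvarchar(max)")
        # Double any closing bracket so the name stays a valid delimited identifier.
        safe_name = name.strip().replace("]", "]]")
        lines.append(f"    [{safe_name}] {sql_type}")
    safe_table = table.replace("]", "]]")
    return f"CREATE TABLE [{safe_table}] (\n" + ",\n".join(lines) + "\n)"


# Hypothetical column list, for illustration only.
print(staging_table_ddl(
    "CambridgeClimateData",
    [("ReadingDate", "datetime"), ("MaxTemp", "double"), ("Notes", "string")],
))
```

Joining the column definitions with a separator, rather than appending a comma to each and trimming the last one, avoids the Substring step the script needs.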
OK, but does it work with real data? Off to the Health and Social Care Information Centre for some realistic data in spreadsheet form. I’ve included some data just so you don’t have to go to the site to play along, but it is far better to use the latest version of this data from the site. I’m sure I don’t have to tell you how easy this is to do in a script via PowerShell.
$ExcelFilePath='MyPath\hosp-epis-stat-admi-tot-ops-11-12-tab.xlsx' #the full path of the excel workbook
$Worksheet='Total procedures' #this is the actual worksheet where the data is
$DataRange= 'A16:J1509' #e.g. 'A2:M33' this is the range of the cells that make up the table. leave blank to read the whole worksheet
Also
$DestinationTable='Hosp' # or whatever you want. The name of the SQL Server table where you want to put the data
…and
$PrecisionForNumericData=0
Try it. Whoosh. In it goes. If you were doing this as a routine, you’d be wanting to wrap this script into a function with parameters by now, but you know how to do this already, I’m sure. I’m trying to give you the ‘workbench’ narrative here.
Writing to Excel from SQL Server.
The process of going from SQL Server to Excel via ODBC is, I think, needlessly complicated, especially if you use parameterised queries (excellent for SQL Server but they add very little for writing to Excel). In this example, I’ll do the old and horrible approach of using insert statements. There are other ways, including even using a dataset, but this is the most obvious.
I’m not particularly happy with this sample because Excel whines a bit when it opens it, saying that it is in the wrong format, (which it is, but you try naming it XLSX) but it deigns to open it.
“The file you are trying to open, ‘MyExcelFile.xls’, is in a different format than specified by the file extension. Verify that the file is not corrupted and is from a trusted source before opening the file. Do you want to open the file now?”
More seriously, it complains that the numbers in the columns are ‘formatted as text’. It turns out that the data is saved in the correct format, but the next time the file is opened, all columns revert to varchar.
Seasoned users of ODBC get used to the bugs, but if anyone knows of a workaround to this, I’d be grateful.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 |
set-psdebug -strict $ErrorActionPreference = «stop» $Sourceinstance = ‘MyServerOrInstance’ #the name of the server or instance $Sourcedatabase = ‘AdventureWorks’ #the name of the datatabase where you want to get the data #here is where we put the SQL command to get the result from the database $SelectStatementForDatabase =@» SELECT ProductNumber, p.Name AS ProductName, color, SafetyStockLevel, ReorderPoint, StandardCost, ListPrice, NonDiscountSales = (OrderQty * UnitPrice), Discounts = ((OrderQty * UnitPrice) * UnitPriceDiscount) FROM Production.Product AS p INNER JOIN Sales.SalesOrderDetail AS sod ON p.ProductID = sod.ProductID where ((OrderQty * UnitPrice) * UnitPriceDiscount)>0 ORDER BY ProductName DESC; «@ $SourceWindowsSecurity = $false #or $True if you are using Windows security $SourceUserID = ‘SA’ #the name of the SQL Server user if not integrated security $DestinationTable = ‘ProductWithDiscounts’ $DestinationExcelFilePath = ‘MyPathMyName.xls’ #the full path of the excel workbook $DestinationHeader = $true # true if you want your first row to be read as column headers #firstly, we create a connection string ‘on the fly’ #connect to the datanbase #…and get the DataReader object $SourceConnectionString = «Data Source=$Sourceinstance;Initial Catalog=$Sourcedatabase;$( if ($SourceWindowsSecurity) { ‘integrated security=true’ } else { ‘User Id=’ + $SourceUserID + ‘;Password=’ + «$(((Get-Credential $SourceUserID).GetNetworkCredential()).Password)» + ‘;integrated security=false’ })» try { #here we open a connection to the SQL Server source database $SqlCommand = new-object (‘Data.SqlClient.SqlCommand’) $SelectStatementForDatabase, $SourceConnectionString; $SqlCommand.Connection.Open(); #we open the connection $handler = [System.Data.SqlClient.SqlInfoMessageEventHandler] { param ($sender, $event) Write-Host «Message: $($event.Message)» }; $SqlCommand.Connection.add_InfoMessage($handler); $Reader = 
$SqlCommand.ExecuteReader([System.Data.CommandBehavior]::SequentialAccess) #get the datareader and just get the result in one gulp
}
catch
{
    $ex = $_.Exception
    Write-Error "whilst getting data from $Sourceinstance $Sourcedatabase : $($ex.Message)"
    exit
}
# Excel has only the LOGICAL, CURRENCY, NUMBER, VARCHAR and DATETIME datatypes
# according to $connection.GetSchema('DATATYPES').TypeName
# let's work out what the Excel datatype would be…
$columns = $reader.GetSchemaTable() |
    select columnName, datatype, @{
        name = 'ExcelDatatype'
        expression = {
            switch ($_.datatype)
            {
                { @('float', 'decimal', 'Numeric') -contains $_ } { 'Number' }
                'bit' { 'logical' }
                'int16' { 'Int' }
                { @('smallmoney', 'money') -contains $_ } { 'currency' }
                'DateTime' { 'datetime' }
                default { 'VarChar' }
            }
        }
    }
# now we need to create an equivalent worksheet in the Workbook.
# If there is no workbook, it will create it
$CreateScript = @"
CREATE TABLE $DestinationTable (
"@
$CreateScript += $columns | foreach-object { "`n`t$($_.ColumnName.Trim()) $($_.ExcelDataType)," }
# trim the trailing comma and close the column list
$CreateScript = $CreateScript.Substring(0, $CreateScript.Length - 1) + "`n`t)"
# and make a columnlist for the insert statement.
$columnList = '[' + $columns[0].ColumnName + ']'
for ($ii = 1; $ii -le $columns.Length - 1; $ii++)
{
    $params += ',?'
    $columnList += ' ,[' + $columns[$ii].ColumnName + ']'
}
try
{
    # to open the destination workbook or create it if not exist
    $Connection = New-Object system.data.odbc.odbcconnection
    $TheConnectionString = 'Driver={Microsoft Excel Driver (*.xls, *.xlsx, *.xlsm, *.xlsb)};DBQ=' + $DestinationExcelFilePath + ';Mode=ReadWrite;ReadOnly=false;Extended Properties="HDR=' + "$(if ($DestinationHeader) { 'YES' } else { 'NO' })" + '"'
    $Connection.ConnectionString = $TheConnectionString
    $Connection.Open()
    $insertionCommand = $Connection.CreateCommand()
}
catch
{
    $ex = $_.Exception
    Write-Error "whilst opening connection to $DestinationExcelFilePath using '$($TheConnectionString)' : $($ex.Message)"
    exit
}
try
{
    # if the table doesn't exist we create it.
    $CreateTableCommand = $Connection.CreateCommand()
    $CreateTableCommand.CommandText = $CreateScript
    if ($connection.GetSchema('TABLES').Table_Name -notcontains $DestinationTable)
    {
        if ($CreateTableCommand.ExecuteNonQuery() -eq -1)
        {
            write-host "created table (worksheet) $DestinationTable"
        }
    }
}
catch
{
    $ex = $_.Exception
    Write-Error "couldn't create table with command $CreateScript : $($ex.Message)"
    exit
}
$rows = 0
try
{
    # now we create each insert statement on the fly! Developers look away, please
    while ($Reader.Read())
    {
        $insertcommand = "INSERT INTO [$destinationTable" + '$] (' + "$columnList) VALUES("
        for ($i = 0; $i -lt $Reader.FieldCount; $i++)
        {
            # quote VarChar values, doubling any embedded single quotes
            $insertcommand += "$(if ($i -eq 0) { '' } else { ',' }) $(if ($columns[$i].ExcelDataType -eq 'VarChar') { "'$($reader.GetValue($i) -replace "'", "''")'" } else { "$($reader.GetValue($i))" }) "
        }
        $insertioncommand.CommandText = $insertcommand + ')'
        $rows += $insertionCommand.ExecuteNonQuery()
    }
}
catch
{
    $ex = $_.Exception
    Write-Error "whilst writing to column $i of file $DestinationExcelFilePath : $($ex.Message)"
}
# we report what we've done.
write-host "Wrote $rows rows of $($columns.count) columns to worksheet $destinationTable"
$Reader.Close()
$SqlCommand.Connection.Close()
$connection.Close()
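The loop above builds every INSERT statement by string concatenation, which is why developers are asked to look away. A gentler alternative is to build the statement once, with ODBC's positional `?` parameter markers, and bind the values for each row. What follows is only a sketch of that idea and not the routine used in this article: `New-ParameterizedInsert` is a hypothetical helper name, the table and column names in the example call are made up, and it assumes the same `[Sheet$]` table naming and bracketed column names as the script above.

```powershell
# Hypothetical helper (not part of the original script): builds one INSERT
# statement with a positional '?' marker for each column.
function New-ParameterizedInsert {
    param (
        [Parameter(Mandatory = $true)][string]$TableName,
        [Parameter(Mandatory = $true)][string[]]$ColumnNames
    )
    # Bracket each column name, as the script above does
    $columnList = ($ColumnNames | ForEach-Object { "[$_]" }) -join ', '
    # One marker per column
    $markers = (@('?') * $ColumnNames.Count) -join ', '
    # Single-quoted pieces keep the worksheet '$' suffix literal
    'INSERT INTO [' + $TableName + '$] (' + $columnList + ') VALUES (' + $markers + ')'
}

# Made-up table and column names, for illustration only
$sql = New-ParameterizedInsert -TableName 'Sales' -ColumnNames 'Id', 'Title'
$sql
```

With the result assigned to `$insertionCommand.CommandText`, each row's values could then be bound in column order, for example with `$insertionCommand.Parameters.AddWithValue()`, before calling `ExecuteNonQuery()`. ODBC binds parameters by position, so any names given to them are only labels. How well the Excel driver copes with each datatype when bound this way would need testing.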
CSV and Delimited ODBC Sources: Text AdventureWorks.
Although the ACE drivers are used mostly by people reading Excel files, I must emphasize that there are drivers for a number of other formats. It is pretty easy, for example, to turn a bunch of CSV files into a relational database. Just to prove it, I’ve created a CSV/Text version of AdventureWorks, together with its schema.ini. This was originally created for the article The TSQL of CSV: Comma-Delimited of Errors. With this text-based database, you can run many of the sample AdventureWorks SQL examples with only minor modifications.
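For anyone who has not met a schema.ini before: it is a plain text file, kept in the same directory as the data files, with one section per file that tells the text driver how to read it. The sketch below renders one such section. The helper name `New-SchemaIniSection`, the file name and the columns are all hypothetical and are not taken from Text AdventureWorks; the sketch assumes a comma-delimited file with a header row.

```powershell
# Hypothetical helper: renders one schema.ini section for a delimited file.
function New-SchemaIniSection {
    param (
        [Parameter(Mandatory = $true)][string]$FileName,
        # Ordered 'Name Type' entries, one per column
        [Parameter(Mandatory = $true)][string[]]$Columns
    )
    # Section header, then the settings assumed here:
    # first row holds column names, fields are comma-delimited
    $lines = @("[$FileName]", 'ColNameHeader=True', 'Format=CSVDelimited')
    for ($i = 0; $i -lt $Columns.Count; $i++) {
        # Columns are numbered from 1: Col1=..., Col2=...
        $lines += "Col$($i + 1)=$($Columns[$i])"
    }
    $lines -join "`r`n"
}

# Made-up file and columns, for illustration only
$section = New-SchemaIniSection -FileName 'Example.csv' -Columns 'Id Long', 'LoggedAt DateTime', 'Message Text Width 125'
$section
```

The returned text could be appended to the schema.ini in the database directory, for instance with `Add-Content`. The real file that ships with Text AdventureWorks is the better guide to the settings that database actually uses.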
Once you’ve installed the ACE drivers, you can use a modified version of the routine I showed you for exploring the PUBS Excel database to play along.
All you have to do is to unzip Text AdventureWorks into a new directory with the name of your database (AdventureWorks) and point your connection string at it by giving the full path to the directory. I just altered two lines:
#set the directory in which your database should go.
$TextFilePath = 'MyPathToTheDirectoryTextAdventureWorks' #the path to the database
… and
$Connection.ConnectionString = 'Driver={Microsoft Access Text Driver (*.txt, *.csv)};DBQ=' + $TextFilePath
Now you should be ready with your text-based relational database.
You can, of course, create tables and write to them using the INSERT statement.
create table [Log#csv] (MyInteger int, TheDate date, TheMessage char(125))
…and do insert statements into it. You can SELECT INTO as well, which is new to me. I didn’t notice this in previous incarnations of this driver.
With CREATE statements, you can use BIT, BYTE, LONGCHAR, CURRENCY, INTEGER, SMALLINT, REAL, FLOAT, CHAR or DATETIME. (Out of curiosity, the OLEDB driver allows Long, Single, Double, Currency, DateTime, Bit, Byte, GUID, BigBinary, LongBinary, VarBinary, LongText, VarChar, Char and Decimal.)
# You can list out the tables
$Connection.GetSchema("tables") | select table_name
And the schema
$Connection.GetSchema("columns") | select TABLE_NAME, COLUMN_NAME, ORDINAL_POSITION
Here are a few of the SQL Statements that work
SELECT * into [gloves#csv]
  FROM [Production_ProductModel#csv]
  WHERE ProductModelID IN (3, 4)

SELECT count(*) as [discounted]
  FROM [Production_Product#csv] AS p
  INNER JOIN [Sales_SalesOrderDetail#csv] AS sod ON p.ProductID = sod.ProductID
  where ((OrderQty * UnitPrice) * UnitPriceDiscount)>0

SELECT Name, ProductNumber, ListPrice AS Price
  FROM [Production_Product#csv]
  WHERE ProductLine = 'R' AND DaysToManufacture < 4
  ORDER BY Name DESC

SELECT p1.ProductModelID
  FROM [Production_Product#csv] AS p1
  GROUP BY p1.ProductModelID
  having p1.ProductModelID >100

SELECT p1.ProductModelID
  FROM [Production_Product#csv] AS p1
  GROUP BY p1.ProductModelID
  HAVING MAX(p1.ListPrice) >= ALL
    (SELECT AVG(p2.ListPrice)
       FROM [Production_Product#csv] AS p2
       WHERE p1.ProductModelID = p2.ProductModelID)

SELECT top 50 SalesOrderID, SUM(LineTotal) AS SubTotal
  FROM [Sales_SalesOrderDetail#csv]
  GROUP BY SalesOrderID
  ORDER BY SalesOrderID;

SELECT ProductModelID, Name
  FROM [Production_ProductModel#csv]
  WHERE ProductModelID IN (3, 4)
union all
SELECT ProductModelID, Name
  FROM [Production_ProductModel#csv]
  WHERE ProductModelID NOT IN (3, 4)
Conclusions
If only Microsoft put some energy into their whole range of ODBC drivers, including all the possible datastores that can be mapped to relational databases, they’d be the obvious way of transferring data, and would put Microsoft in great shape for providing ‘big data’ solutions. As it is, they are extraordinarily useful, but marred by quirks and oddities.
For me, ODBC is the obvious way to script data from Excel or Access into SQL Server, for doing data imports.