In this article, we show how to access data in Excel files directly from PowerShell. Direct access to Excel data from PowerShell opens up broad possibilities for inventory and for building reports on computers, servers, infrastructure, Active Directory, and so on.
Contents:
- Accessing Excel data from the PowerShell console
- How to get data from Active Directory and save it to an Excel workbook?
PowerShell accesses Excel through a separate Component Object Model (COM) object. This requires Excel to be installed on the computer.
Before showing how to access data in a cell of an Excel file, we need to look at the architecture of presentation layers in an Excel document. The following figure shows the 4 nested levels of the Excel object model:
- Application Layer: the running Excel application;
- WorkBook Layer: several workbooks (Excel documents) can be open at the same time;
- WorkSheet Layer: each xlsx file can contain several sheets;
- Range Layer: here you can access data in a specific cell or a range of cells.
Accessing Excel data from the PowerShell console
Let's use a simple example to see how to access, from PowerShell, data in an Excel file that contains a list of employees.
First, start the Excel application (application layer) on the computer through the COM object:
$ExcelObj = New-Object -comobject Excel.Application
After this command runs, Excel starts on the computer in the background. To make the Excel window visible, change the Visible property of the COM object:
$ExcelObj.visible=$true
You can list all properties of the Excel object like this:
$ExcelObj| fl
Now you can open an Excel file (a workbook):
$ExcelWorkBook = $ExcelObj.Workbooks.Open("C:\PS\ad_users.xlsx")
Each Excel file can contain several worksheets. Let's list the sheets in the current Excel workbook:
$ExcelWorkBook.Sheets| fl Name, index
Now you can open a specific sheet (by name or by index):
$ExcelWorkSheet = $ExcelWorkBook.Sheets.Item("AD_User_List")
You can find out the current (active) Excel sheet with this command:
$ExcelWorkBook.ActiveSheet | fl Name, Index
Now you can get values from the cells of the Excel document. Cells in an Excel workbook can be addressed in several ways: through a range (Range), a cell (Cells), a column (Columns), or a row (Rows). Below are several ways to read data from the same cell:
$ExcelWorkSheet.Range("B2").Text
$ExcelWorkSheet.Range("B2:B2").Text
$ExcelWorkSheet.Range("B2","B2").Text
$ExcelWorkSheet.cells.Item(2, 2).text
$ExcelWorkSheet.cells.Item(2, 2).value2
$ExcelWorkSheet.Columns.Item(2).Rows.Item(2).Text
$ExcelWorkSheet.Rows.Item(2).Columns.Item(2).Text
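The examples above mix two addressing styles: A1 notation ("B2") and numeric row and column indexes (2, 2). When a script computes indexes in a loop but needs an A1 address for Range(), a small conversion helper can bridge the two. The following is a minimal sketch; the function names ConvertTo-ColumnLetter and ConvertTo-CellAddress are hypothetical helpers, not part of Excel or PowerShell.

```powershell
# Hypothetical helper: converts a 1-based column index to Excel column letters.
function ConvertTo-ColumnLetter {
    param(
        [Parameter(Mandatory)]
        [ValidateRange(1, 16384)]   # 16384 (XFD) is the last column in an xlsx sheet
        [int]$Index
    )
    # Work on a copy: reassigning a validated parameter would re-run the validation.
    [int]$n = $Index
    $letters = ''
    while ($n -gt 0) {
        $remainder = ($n - 1) % 26
        $letters = [string][char](65 + $remainder) + $letters
        $n = [int][math]::Floor(($n - 1) / 26)
    }
    $letters
}

# Hypothetical helper: builds an A1-style address from row and column indexes.
function ConvertTo-CellAddress {
    param(
        [Parameter(Mandatory)][int]$Row,
        [Parameter(Mandatory)][int]$Column
    )
    '{0}{1}' -f (ConvertTo-ColumnLetter -Index $Column), $Row
}

$address = ConvertTo-CellAddress -Row 2 -Column 2
# With a worksheet object at hand, the result could be passed to Range():
# $ExcelWorkSheet.Range($address).Text
```

With this helper, Cells.Item(2, 2) and Range((ConvertTo-CellAddress -Row 2 -Column 2)) refer to the same cell.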
How to get data from Active Directory and save it to an Excel workbook?
Let's look at a practical example of accessing Excel data from PowerShell. Suppose that for each user in the Excel file we need to get information from Active Directory: for example, the phone number (the telephoneNumber attribute), the department (department), and the email address (mail).
# Import the Active Directory module into the PowerShell session
import-module activedirectory
# Open the Excel workbook first:
$ExcelObj = New-Object -comobject Excel.Application
$ExcelWorkBook = $ExcelObj.Workbooks.Open("C:\PS\ad_users.xlsx")
$ExcelWorkSheet = $ExcelWorkBook.Sheets.Item("AD_User_List")
# Get the number of filled rows in the xlsx file
$rowcount=$ExcelWorkSheet.UsedRange.Rows.Count
# Loop through all rows in column 1, starting from row 2 (these cells contain the domain user name)
for($i=2;$i -le $rowcount;$i++){
    $ADusername=$ExcelWorkSheet.Columns.Item(1).Rows.Item($i).Text
    # Get the values of the user's attributes in AD
    $ADuserProp = Get-ADUser $ADusername -properties telephoneNumber,department,mail|select-object name,telephoneNumber,department,mail
    # Fill the cells with data from AD
    $ExcelWorkSheet.Columns.Item(4).Rows.Item($i) = $ADuserProp.telephoneNumber
    $ExcelWorkSheet.Columns.Item(5).Rows.Item($i) = $ADuserProp.department
    $ExcelWorkSheet.Columns.Item(6).Rows.Item($i) = $ADuserProp.mail
}
# Save the xlsx file, close the workbook and quit Excel
$ExcelWorkBook.Save()
$ExcelWorkBook.close($true)
$ExcelObj.Quit()
As a result, columns with information from AD were added for each user in the Excel file.
Let's look at another example of building a report with PowerShell and Excel. Suppose you need to build an Excel report on the state of the Print Spooler service on all servers in the domain.
# Create the Excel object
$ExcelObj = New-Object -comobject Excel.Application
$ExcelObj.Visible = $true
# Add a workbook
$ExcelWorkBook = $ExcelObj.Workbooks.Add()
$ExcelWorkSheet = $ExcelWorkBook.Worksheets.Item(1)
# Rename the sheet
$ExcelWorkSheet.Name = 'Spooler Service Status'
# Fill in the table header
$ExcelWorkSheet.Cells.Item(1,1) = 'Server Name'
$ExcelWorkSheet.Cells.Item(1,2) = 'Service Name'
$ExcelWorkSheet.Cells.Item(1,3) = 'Service Status'
# Make the table header bold, set the font size and the column width
$ExcelWorkSheet.Rows.Item(1).Font.Bold = $true
$ExcelWorkSheet.Rows.Item(1).Font.size=14
$ExcelWorkSheet.Columns.Item(1).ColumnWidth=25
$ExcelWorkSheet.Columns.Item(2).ColumnWidth=25
$ExcelWorkSheet.Columns.Item(3).ColumnWidth=25
# Get the list of all Windows Server computers in the domain
$computers = (Get-ADComputer -Filter 'operatingsystem -like "*Windows server*" -and enabled -eq "true"').Name
$counter=2
# Connect to each computer and get the service status
foreach ($computer in $computers) {
    $result = Invoke-Command -Computername $computer -ScriptBlock { Get-Service spooler | select Name, status }
    # Fill the Excel cells with data from the server
    $ExcelWorkSheet.Columns.Item(1).Rows.Item($counter) = $result.PSComputerName
    $ExcelWorkSheet.Columns.Item(2).Rows.Item($counter) = $result.Name
    $ExcelWorkSheet.Columns.Item(3).Rows.Item($counter) = $result.Status
    $counter++
}
# Save the report, close the workbook and quit Excel:
$ExcelWorkBook.SaveAs('C:\ps\service-report.xlsx')
$ExcelWorkBook.close($true)
$ExcelObj.Quit()
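The report above writes each cell with a separate COM call. For larger reports, one possible alternative is to collect the results in PowerShell first, build a two-dimensional array (header row plus one row per object), and hand that array to Excel in a single assignment. The sketch below shows only the array-building step on in-memory sample data; ConvertTo-CellMatrix is a hypothetical helper, the server names are made up, and the commented Range line is an untested assumption about how the array would be written to a sheet.

```powershell
# Hypothetical helper: turns objects into a 2-D array with a header row.
function ConvertTo-CellMatrix {
    param(
        [Parameter(Mandatory)][object[]]$InputObject,
        [Parameter(Mandatory)][string[]]$Property
    )
    $rowCount = $InputObject.Count + 1          # +1 for the header row
    $matrix = [object[,]]::new($rowCount, $Property.Count)
    for ($c = 0; $c -lt $Property.Count; $c++) {
        $matrix[0, $c] = $Property[$c]
    }
    for ($r = 0; $r -lt $InputObject.Count; $r++) {
        for ($c = 0; $c -lt $Property.Count; $c++) {
            # Values are converted to text here, so numbers arrive as strings.
            $matrix[($r + 1), $c] = [string]$InputObject[$r].($Property[$c])
        }
    }
    # The leading comma keeps PowerShell from unrolling the array on output.
    , $matrix
}

# Made-up sample results standing in for the Invoke-Command output:
$results = @(
    [pscustomobject]@{ Server = 'srv01'; Name = 'Spooler'; Status = 'Running' }
    [pscustomobject]@{ Server = 'srv02'; Name = 'Spooler'; Status = 'Stopped' }
)
$matrix = ConvertTo-CellMatrix -InputObject $results -Property Server, Name, Status

# Assumption: with a worksheet object, the whole block could be written at once:
# $ExcelWorkSheet.Range('A1').Resize($matrix.GetLength(0), $matrix.GetLength(1)).Value2 = $matrix
```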
PowerShell access to Excel has a very wide range of applications, from simple reports, for example from Active Directory, to PowerShell scripts that update data in AD from Excel.
For example, you can have an HR employee maintain a register of users in Excel. Then, with a PowerShell script that calls Set-ADUser, the employee can automatically update user data in AD (it is enough to delegate the permissions to change these AD user attributes and show the employee how to run the PS script). This way you can keep an up-to-date address book with current phone numbers and job titles.
May 8th, 2018
The Goal:
Import data from XLSX files conveniently like import-csv lets you do with simpler data.
The preamble:
Excel is a mainstay of the business world at this point, which means a lot of the data you might have to work with will come at you as an XLSX file or need to be one. This can be a bit annoying when scripting.
If we’re just working in PowerShell-land and we can choose to use simple CSV data we have the handy import-csv and export-csv cmdlets, and those CSV files can open up in excel just fine. However, when we are forced to work with XLSX files it can lead to headaches.
If you search around online, or have worked with excel in PowerShell before, you have probably found solutions involving COM objects that tend to start a little like this:
$ExcelFile = New-Object -ComObject Excel.Application
Then there are lots of methods and properties we can interact with on that COM object. This lets us do what we need, but ultimately leverages excel and can cause huge performance issues when working with a lot of data. Often you might see the excel process stop responding for a while and then finally it finishes its work. This is clunky, confusing and un-fun.
The motivation:
I recently had a folder full of XLSX files that I needed to read in. I only cared about a couple columns, and I wanted to import them as objects like I could with import-csv and then pull only unique values out.
I leveraged the PowerShell Gallery and just searched for “Excel”. There are actually quite a few different options there now, but at the time I saw the description for a module called PSExcel, by a fellow named RamblingCookieMonster. The description was simple:
Work with Excel without installing Excel
That sounded good enough to me, so I figured I’d take it for a spin.
The meat:
Install-module PSExcel
Get-command -module psexcel

CommandType Name                      Version Source
----------- ----                      ------- ------
Function    Add-PivotChart            1.0.2   psexcel
Function    Add-PivotTable            1.0.2   psexcel
Function    Add-Table                 1.0.2   psexcel
Function    Close-Excel               1.0.2   psexcel
Function    ConvertTo-ExcelCoordinate 1.0.2   psexcel
Function    Export-XLSX               1.0.2   psexcel
Function    Format-Cell               1.0.2   psexcel
Function    Get-CellValue             1.0.2   psexcel
Function    Get-Workbook              1.0.2   psexcel
Function    Get-Worksheet             1.0.2   psexcel
Function    Import-XLSX               1.0.2   psexcel
Function    Join-Object               1.0.2   psexcel
Function    Join-Worksheet            1.0.2   psexcel
Function    New-Excel                 1.0.2   psexcel
Function    Save-Excel                1.0.2   psexcel
Function    Search-CellValue          1.0.2   psexcel
Function    Set-CellValue             1.0.2   psexcel
Function    Set-FreezePane            1.0.2   psexcel
Import-XLSX sounds like exactly what I wanted.
Here I’ve generated a simple XLSX with fake people and companies. What I want to do is pull out just a list of unique company names from the company column. In my real world example the XLSX was a bit more complicated and I had dozens to read in on a loop, but that just involved scaling up these actions.
$path = "$PSScriptRoot\fakepeople.xlsx"
import-module psexcel #it wasn't auto loading on my machine
$people = new-object System.Collections.ArrayList
foreach ($person in (Import-XLSX -Path $path -RowStart 1)) {
    $people.add($person) | out-null #I don't want to see the output
}
$people.company | select -unique
Contoso
ContosoSuites
Fabrikam
Parnell Aerospace
Humongous Insurance
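Since the imported rows are ordinary objects, the same result can probably be had without the intermediate ArrayList by piping straight into Select-Object. Here is a minimal sketch using in-memory sample objects as a stand-in for the Import-XLSX output (the names and rows are made up, and the commented lines that use the module are untested assumptions):

```powershell
# Made-up rows standing in for what Import-XLSX would return:
$people = @(
    [pscustomobject]@{ Name = 'Ann';  Company = 'Contoso' }
    [pscustomobject]@{ Name = 'Bob';  Company = 'Fabrikam' }
    [pscustomobject]@{ Name = 'Cara'; Company = 'Contoso' }
)

# Expand the Company column and keep each value once:
$companies = $people | Select-Object -ExpandProperty Company -Unique
$companies

# Assumed equivalent with the module, for one file and for a folder of files:
# Import-XLSX -Path $path | Select-Object -ExpandProperty Company -Unique
# Get-ChildItem -Path $folder -Filter *.xlsx |
#     ForEach-Object { Import-XLSX -Path $_.FullName } |
#     Select-Object -ExpandProperty Company -Unique
```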
Just a quick note, if you’re looking to do a bit more with Excel, I found this module as well, which seems a bit more robust.
That’s all for now. Hopefully this helps you if you need to grab some data from XLSX files!
In this article we’ll show how to read and write data from Excel worksheets directly from PowerShell scripts. You can use Excel along with PowerShell to inventory and generate various reports on computers, servers, infrastructure, Active Directory, etc.
Contents:
- How to Read Data from an Excel Spreadsheet using PowerShell?
- Exporting Active Directory User Info to Excel Spreadsheet using PowerShell
You can access Excel sheets from PowerShell via a separate COM object (Component Object Model). This requires Excel to be installed on the computer.
Before showing how to access data in an Excel cell, it is worth understanding the architecture of presentation layers in the Excel file. The figure below shows 4 nested presentation layers in the Excel object model:
- Application Layer – deals with the running Excel app;
- WorkBook Layer – multiple workbooks (Excel files) may be open at the same time;
- WorkSheet Layer – each XLSX file can contain several sheets;
- Range Layer – here you can access data in the specific cell or cell range.
How to Read Data from an Excel Spreadsheet using PowerShell?
Let’s take a look at a simple example of how to use PowerShell to access data in an Excel file containing a list of employees.
First, run the Excel app (application layer) on your computer using the COM object:
$ExcelObj = New-Object -comobject Excel.Application
After running the command, Excel will be launched on your computer in the background. To show the Excel window, change the Visible property of the COM object:
$ExcelObj.visible=$true
You can display all Excel object properties as follows:
$ExcelObj| fl
Then you can open an Excel file (a workbook):
$ExcelWorkBook = $ExcelObj.Workbooks.Open("C:\PS\corp_ad_users.xlsx")
Each Excel file can contain several worksheets. Let’s display the list of worksheets in the current Excel workbook:
$ExcelWorkBook.Sheets| fl Name, index
Then you can open a sheet you want (by its name or index):
$ExcelWorkSheet = $ExcelWorkBook.Sheets.Item("CORP_users")
You can get the name of the current (active) Excel worksheet using this command:
$ExcelWorkBook.ActiveSheet | fl Name, Index
Then you can get values from cells in Excel worksheet. You can use different methods to get the cell values on the current Excel worksheet: using a range, a cell, a column or a row. See the examples of how to get data from the same cell below:
$ExcelWorkSheet.Range("B4").Text
$ExcelWorkSheet.Range("B4:B4").Text
$ExcelWorkSheet.Range("B4","B4").Text
$ExcelWorkSheet.cells.Item(4, 2).text
$ExcelWorkSheet.cells.Item(4, 2).value2
$ExcelWorkSheet.Columns.Item(2).Rows.Item(4).Text
$ExcelWorkSheet.Rows.Item(4).Columns.Item(2).Text
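Reading cells one at a time means one COM call per cell. A common alternative is to read a whole range in a single call through Value2, which returns a two-dimensional array for a multi-cell range, and then turn that array into objects in PowerShell. The sketch below shows only the conversion step, with an in-memory array standing in for the worksheet data. ConvertFrom-CellMatrix and the sample values are hypothetical, and the sketch assumes the first row holds unique, non-empty column headers.

```powershell
# Hypothetical helper: converts a 2-D array (first row = headers) into objects.
function ConvertFrom-CellMatrix {
    param(
        [Parameter(Mandatory)]
        [Array]$Matrix
    )
    # Use the array bounds instead of fixed indexes, because arrays returned
    # by Excel are 1-based while arrays created in PowerShell are 0-based.
    $firstRow = $Matrix.GetLowerBound(0)
    $lastRow  = $Matrix.GetUpperBound(0)
    $firstCol = $Matrix.GetLowerBound(1)
    $lastCol  = $Matrix.GetUpperBound(1)

    $headers = @(foreach ($c in $firstCol..$lastCol) { [string]$Matrix[$firstRow, $c] })

    for ($r = $firstRow + 1; $r -le $lastRow; $r++) {
        $record = [ordered]@{}
        $i = 0
        foreach ($c in $firstCol..$lastCol) {
            $record[$headers[$i]] = $Matrix[$r, $c]
            $i++
        }
        [pscustomobject]$record
    }
}

# Made-up data standing in for something like $ExcelWorkSheet.Range("A1:B3").Value2
$sample = [object[,]]::new(3, 2)
$sample[0, 0] = 'Name'; $sample[0, 1] = 'Department'
$sample[1, 0] = 'Ann';  $sample[1, 1] = 'HR'
$sample[2, 0] = 'Bob';  $sample[2, 1] = 'IT'

$rows = @(ConvertFrom-CellMatrix -Matrix $sample)
$rows | Format-Table
```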
Exporting Active Directory User Info to Excel Spreadsheet using PowerShell
Let’s see a practical example of how to access Excel data from PowerShell. Suppose we want to get some information from Active Directory for each user in an Excel file: for instance, their phone number (the telephoneNumber attribute), department (department), and e-mail address (mail).
# Importing Active Directory module into PowerShell session
import-module activedirectory
# Open an Excel workbook first:
$ExcelObj = New-Object -comobject Excel.Application
$ExcelWorkBook = $ExcelObj.Workbooks.Open("C:\PS\corp_ad_users.xlsx")
$ExcelWorkSheet = $ExcelWorkBook.Sheets.Item("CORP_Users")
# Get the number of filled in rows in the XLSX worksheet
$rowcount=$ExcelWorkSheet.UsedRange.Rows.Count
# Loop through all rows in Column 1 starting from Row 2 (these cells contain the domain usernames)
for($i=2;$i -le $rowcount;$i++){
$ADusername=$ExcelWorkSheet.Columns.Item(1).Rows.Item($i).Text
# Get the values of user attributes in AD
$ADuserProp = Get-ADUser $ADusername -properties telephoneNumber,department,mail|select-object name,telephoneNumber,department,mail
# Fill in the cells with the data received from AD
$ExcelWorkSheet.Columns.Item(4).Rows.Item($i) = $ADuserProp.telephoneNumber
$ExcelWorkSheet.Columns.Item(5).Rows.Item($i) = $ADuserProp.department
$ExcelWorkSheet.Columns.Item(6).Rows.Item($i) = $ADuserProp.mail
}
# Save the XLSX file, close the workbook and quit Excel
$ExcelWorkBook.Save()
$ExcelWorkBook.close($true)
$ExcelObj.Quit()
As a result, the columns containing AD information have been added for each user in the Excel file.
Let’s consider another example of making a report using PowerShell and Excel. Suppose, you want to make an Excel report about Print Spooler service state on all domain servers.
# Create an Excel object
$ExcelObj = New-Object -comobject Excel.Application
$ExcelObj.Visible = $true
# Add a workbook
$ExcelWorkBook = $ExcelObj.Workbooks.Add()
$ExcelWorkSheet = $ExcelWorkBook.Worksheets.Item(1)
# Rename a worksheet
$ExcelWorkSheet.Name = 'Spooler Service Status'
# Fill in the head of the table
$ExcelWorkSheet.Cells.Item(1,1) = 'Server Name'
$ExcelWorkSheet.Cells.Item(1,2) = 'Service Name'
$ExcelWorkSheet.Cells.Item(1,3) = 'Service Status'
# Make the table head bold, set the font size and the column width
$ExcelWorkSheet.Rows.Item(1).Font.Bold = $true
$ExcelWorkSheet.Rows.Item(1).Font.size=15
$ExcelWorkSheet.Columns.Item(1).ColumnWidth=28
$ExcelWorkSheet.Columns.Item(2).ColumnWidth=28
$ExcelWorkSheet.Columns.Item(3).ColumnWidth=28
# Get the list of all Windows Servers in the domain
$computers = (Get-ADComputer -Filter 'operatingsystem -like "*Windows server*" -and enabled -eq "true"').Name
$counter=2
# Connect to each computer and get the service status
foreach ($computer in $computers) {
$result = Invoke-Command -Computername $computer -ScriptBlock { Get-Service spooler | select Name, status }
# Fill in Excel cells with the data obtained from the server
$ExcelWorkSheet.Columns.Item(1).Rows.Item($counter) = $result.PSComputerName
$ExcelWorkSheet.Columns.Item(2).Rows.Item($counter) = $result.Name
$ExcelWorkSheet.Columns.Item(3).Rows.Item($counter) = $result.Status
$counter++
}
# Save the report, close the workbook and quit Excel:
$ExcelWorkBook.SaveAs('C:\ps\Server_report.xlsx')
$ExcelWorkBook.close($true)
$ExcelObj.Quit()
You can use PowerShell to access Excel in a variety of scenarios. For example, you can create handy Active Directory reports or create PowerShell scripts to update AD data from Excel.
For example, you can ask an employee of your HR department to keep the user register in Excel. Then, using a PowerShell script and the Set-ADUser cmdlet, the employee can automatically update user info in AD (just delegate the employee the permissions to change AD user attributes and show how to run the PowerShell script). Thus you can keep an up-to-date address book with the relevant phone numbers, job titles and departments.
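One way such an update script could prepare its input is to turn each worksheet row into a parameter hashtable for Set-ADUser, skipping empty cells so that blank values do not overwrite existing attributes. This is a minimal sketch under stated assumptions: the column names (SamAccountName, Phone, Department, Title), the helper name ConvertTo-ADUserUpdate, and the sample row are all hypothetical, and the actual Set-ADUser call is left commented out because it needs the ActiveDirectory module and a domain.

```powershell
# Hypothetical helper: maps one register row to Set-ADUser parameters.
function ConvertTo-ADUserUpdate {
    param(
        [Parameter(Mandatory)]
        [psobject]$Row
    )
    # Assumed mapping: worksheet column name -> Set-ADUser parameter name
    $map = [ordered]@{
        Phone      = 'OfficePhone'
        Department = 'Department'
        Title      = 'Title'
    }
    $params = @{ Identity = $Row.SamAccountName }
    foreach ($column in $map.Keys) {
        $value = $Row.$column
        # Skip empty cells so they do not blank out attributes in AD.
        if (-not [string]::IsNullOrWhiteSpace($value)) {
            $params[$map[$column]] = "$value".Trim()
        }
    }
    $params
}

# Made-up row standing in for one line of the HR register:
$row = [pscustomobject]@{
    SamAccountName = 'jsmith'
    Phone          = ' 555-0100 '
    Department     = 'HR'
    Title          = ''
}
$update = ConvertTo-ADUserUpdate -Row $row

# With the ActiveDirectory module loaded, the hashtable can be splatted;
# -WhatIf previews the change without applying it:
# Set-ADUser @update -WhatIf
```

Running the script with -WhatIf first gives the HR employee a chance to review the planned changes before anything is written to AD.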
Excel files (.xlsx) are a very important data exchange format for a number of reasons:
- Human Readable: Excel files can easily be opened and read by non-IT staff. It is trivial to browse the data or make changes.
- Type Support: Excel files support basic data types like string, dates, and numeric values
- Not Platform-Specific: You can exchange excel data across platforms and locales. Unlike with text-based formats, encoding and special character support are no issue
PowerShell does not come with native support for .xlsx files though. That’s why users previously resorted to exporting excel data to csv and then using Import-Csv to read the exported data into PowerShell.
This workaround produces extra work and has a number of other disadvantages. Thanks to the free module ImportExcel, going the extra route via csv is not required anymore. You now can directly read and write .xlsx data. Microsoft Office is not required.
In this article, you’ll learn how to read and write .xlsx and .xlsm files in just a line of code. Plus I provide you with Convert-XlsToXlsx, a clever function that auto-converts .xls files to .xlsx and .xlsm file types. That’s important because ImportExcel can only deal with the modern .xlsx and .xlsm file types. The older .xls excel files use a proprietary binary format that only excel knows how to read.
Convert-XlsToXlsx may be highly useful in its own right when you need to bulk-convert older excel files to modern formats. It also illustrates how to access the excel object model, and more importantly, how to release COM objects so you don’t end up with memory leaks and ghost processes.
Adding Excel Support to PowerShell
Thanks to Doug Finke and his awesome free module ImportExcel, reading and writing .xlsx files is a snap now — no Office installation required. Simply download and install this free module from the PowerShell Gallery:
Install-Module -Name ImportExcel -Scope CurrentUser -Force
If you have Administrator privileges at hand, you might want to install the module for All Users instead. This makes sure the module is available to all users and, more importantly, makes it available in both Windows PowerShell and PowerShell 7.
Install-Module -Name ImportExcel -Force
When you install modules in the scope CurrentUser, modules are available only for the PowerShell edition you used to do the install, so you would have to potentially install the module twice in different locations.
Reading And Writing Excel Files
The two most important cmdlets from this module are:
- Import-Excel: takes a path to a .xlsx file and returns all data from the default worksheet. Use the parameter -WorksheetName to specify a given worksheet. Example:
  # import excel file and show in gridview (make sure file exists!)
  $Path = "c:\path\to\some\excel.xlsx"
  Import-Excel -Path $Path | Out-GridView
- Export-Excel: saves all piped data to a *.xlsx file. Use the parameter -WorksheetName to specify a given worksheet. By default, existing data on the worksheet will be overwritten. Example:
  # create an excel sheet with all local user accounts
  Get-LocalUser | Export-Excel
Playing With Sample Data
Let’s play with the new excel commands! Writing excel files is simple: pipe data to Export-Excel to create new excel files:
$Path = "$env:temp\listOfServices.xlsx"
Get-Service | Export-Excel -Path $Path -AutoSize -AutoFilter -FreezeTopRow -BoldTopRow -ClearSheet -WorksheetName 'List of Services' -Show
To play with Import-Excel, let’s retrieve some real-world sample data files first.
Downloading Sample Data
Finding excel sample data is easy: just google for Download Excel Sample Data to come up with urls. They come as individual files and ZIP archives. To make downloading a pleasant experience, I created a bunch of helper functions.
To download files, simply use Download-File and Download-Zip:
# use TLS1.2 with HTTPS:
[Net.ServicePointManager]::SecurityProtocol = [Net.SecurityProtocolType]::Tls12
# creates folder if it does not yet exist:
filter Assert-FolderExists
{
$exists = Test-Path -Path $_ -PathType Container
if (!$exists) {
Write-Warning "$_ did not exist. Folder created."
$null = New-Item -Path $_ -ItemType Directory
}
}
# download, unblock and extract zip files
filter Download-Zip($Path)
{
# download to temp file:
$temp = "$env:temp\temp.zip"
Invoke-WebRequest -Uri $_ -OutFile $temp
# unblock:
Unblock-File -Path $temp
# extract archive content:
Expand-Archive -Path $temp -DestinationPath $Path -Force
# report
$zip = [System.IO.Compression.ZipFile]::OpenRead($temp)
$zip.Entries | ForEach-Object { Write-Warning "Download: $_" }
$zip.Dispose()
# remove temp file:
Remove-Item -Path $temp
}
# test whether filename is valid:
function Test-ValidFileName($FileName)
{
$FileName.IndexOfAny([System.IO.Path]::GetInvalidFileNameChars()) -eq -1
}
# download and unblock file:
filter Download-File($Path, $FileName)
{
# does the url specify a filename?
if ([string]::IsNullOrWhiteSpace($FileName))
{
# take filename from url:
$FileName = $_.Split('/')[-1]
# remove url parameters:
$FileName = $FileName.Split('?')[0]
# test for valid file name:
$isValid = Test-ValidFileName -FileName $FileName
if (!$isValid)
{
throw "Url contains no valid file name. $FileName is not valid. Use parameter -FileName to specify a valid filename."
}
}
$filePath = Join-Path -Path $Path -ChildPath $FileName
Invoke-WebRequest -Uri $_ -OutFile $filePath
# unblock the downloaded file (not the target folder):
Unblock-File -Path $filePath
Write-Warning "Download: $FileName"
}
# create local folder for downloaded files:
($OutPath = "$env:temp\excelsampledata") | Assert-FolderExists
# download various excel sample files:
'https://www.contextures.com/SampleData.zip' | Download-Zip -Path $OutPath
'https://go.microsoft.com/fwlink/?LinkID=521962' | Download-File -Path $OutPath -FileName financial.xlsx
'http://www.principlesofeconometrics.com/excel/theories.xls' | Download-File -Path $OutPath
'http://www.principlesofeconometrics.com/excel/food.xls' | Download-File -Path $OutPath
'https://www.who.int/healthinfo/statistics/whostat2005_mortality.xls?ua=1' | Download-File -Path $OutPath
'https://www.who.int/healthinfo/statistics/whostat2005_demographics.xls?ua=1' | Download-File -Path $OutPath
When you run this code, it downloads a bunch of excel sample files:
WARNING: Download: SampleData.xlsx
WARNING: Download: financial.xlsx
WARNING: Download: theories.xls
WARNING: Download: food.xls
WARNING: Download: whostat2005_mortality.xls
WARNING: Download: whostat2005_demographics.xls
Reading Excel Files
To read data directly from excel files, use Import-Excel. For example, to get the financial data for December only, try this:
# path with excel files
# (assuming you downloaded the sample data as instructed before)
Set-Location -Path "$env:temp\excelsampledata"
Import-Excel -Path .\financial.xlsx | Where-Object 'Month Number' -eq 12 | Out-GridView
By default, Import-Excel reads data from the first worksheet. If your file contains more than one worksheet, use the parameter -WorksheetName to specify its name.
To group the countries for December, simply use the common PowerShell pipeline cmdlets:
Obviously, you can do this with excel directly as well. This is about automation (in case you need to do these kinds of analysis regularly), and it is for PowerShell home boys who may not know how to pivot in excel but do know their tools in PowerShell.
And it is about learning: there is no better way to learn the PowerShell pipeline cmdlets!
# path with excel files
# (assuming you downloaded the sample data as instructed before)
Set-Location -Path "$env:temp\excelsampledata"
Import-Excel -Path .\financial.xlsx | Where-Object 'Month Number' -eq 12 | Group-Object -Property Country -NoElement | Sort-Object -Property Count -Descending
Here is the result:
Count Name
----- ----
21 Germany
21 United States of America
21 Canada
21 France
21 Mexico
Accessing XLS Files
The bad news is: .xls files cannot be accessed. They use a proprietary binary format that can only be read by excel.
The good news is: provided you have excel installed, it is trivial to convert .xls files to .xlsx files. If you are really still using .xls files, you should consider this transform for good. .xls is really outdated and should no longer be used.
Converting XLS To XLSX
Above I downloaded a bunch of .xls files that can’t be processed by Import-Excel. Bummer.
Below is a function Convert-XlsToXlsx that auto-converts .xls files to .xlsx and .xlsm files, though. The script requires Microsoft Office to be installed on your box because only excel knows how to open the binary format used in .xls files:
function Convert-XlsToXlsx
{
param
(
# Path to the xls file to convert:
[Parameter(Mandatory,ValueFromPipeline,ValueFromPipelineByPropertyName)]
[string[]]
[Alias('FullName')]
$Path,
# overwrite file if it exists:
[switch]
$Force,
# show excel window during conversion. This can be useful for diagnosis and debugging.
[switch]
$Visible
)
# do this before any file can be processed:
begin
{
# load excel assembly (requires excel to be installed)
Add-Type -AssemblyName Microsoft.Office.Interop.Excel
# open excel in a hidden window
$excel = New-Object -ComObject Excel.Application
$workbooks = $excel.Workbooks
if ($Visible) { $excel.Visible = $true }
# disable interactive dialogs
$excel.DisplayAlerts = $False
$excel.WarnOnFunctionNameConflict = $False
$excel.AskToUpdateLinks = $False
# target file formats
$xlsx = [Microsoft.Office.Interop.Excel.XlFileFormat]::xlOpenXMLWorkbook
$xlsm = [Microsoft.Office.Interop.Excel.XlFileFormat]::xlOpenXMLWorkbookMacroEnabled
}
# do this for each file:
process
{
foreach($_ in $Path)
{
# check for valid file extension:
$extension = [System.Io.Path]::GetExtension($_)
if ($extension -ne '.xls')
{
Write-Verbose "No xls file, skipping: $_"
continue
}
# open file in excel:
$workbook = $workbooks.Open($_)
# test for macros:
if ($workbook.HasVBProject)
{
$extension = 'xlsm'
$type = $xlsm
}
else
{
$extension = 'xlsx'
$type = $xlsx
}
# get destination path
$outPath = [System.Io.Path]::ChangeExtension($_, $extension)
# does it exist?
$exists = (Test-Path -Path $outPath) -and !$Force
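if ($exists)
{
Write-Verbose "File exists and -Force was not specified, skipping: $_"
Write-Warning "File exists. Use -Force to overwrite. $_"
# close and release the workbook before skipping, otherwise its reference leaks:
$workbook.Close($false)
$null = [System.Runtime.InteropServices.Marshal]::ReleaseComObject($workbook)
continue
}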
# save in new format:
$workbook.SaveAs($outPath, $type)
# close document
$workbook.Close()
# release COM objects to prevent memory leaks:
$null = [System.Runtime.InteropServices.Marshal]::ReleaseComObject($workbook)
Write-Verbose "File successfully converted: '$_' -> '$outPath'"
}
}
# do this once all files have been processed
end
{
# quit excel and clean up:
$excel.Quit()
# release COM objects to prevent memory leaks:
$null = [System.Runtime.InteropServices.Marshal]::ReleaseComObject($workbooks)
$null = [System.Runtime.InteropServices.Marshal]::ReleaseComObject($excel)
$excel = $workbooks = $null
# clean up:
[GC]::Collect()
[GC]::WaitForPendingFinalizers()
Write-Verbose "Done."
}
}
It is beyond the scope of this article to discuss the function in detail. I’d like to point out though that the code illustrates important aspects when using COM objects in PowerShell:
When using COM objects like Excel.Application, it can be challenging to free all object references at the end. When you do this wrong, references will stay alive, and so does the excel process in memory. Of course you can always kill the process after use but this might damage excel, and next time you launch it, it starts in recovery mode.
A better approach is to make sure you are storing each object reference in a dedicated variable. Next, make sure you actively release each reference after use by calling ReleaseComObject().
When you did that right, no open reference should survive, and when you call Quit(), excel should be removed from your process list.
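The same release pattern can be wrapped in try/finally so that cleanup also runs when the work in between throws. The following is a minimal sketch, not the article's function: Invoke-WithExcel is a hypothetical helper name, it only runs on Windows with Excel installed, and any COM objects created inside the script block (workbooks, sheets, ranges) would still need to be released by the caller.

```powershell
# Hypothetical helper: runs a script block against Excel and always cleans up.
function Invoke-WithExcel {
    param(
        [Parameter(Mandatory)]
        [scriptblock]$Action
    )
    $excel = $null
    $workbooks = $null
    try {
        $excel = New-Object -ComObject Excel.Application
        $excel.DisplayAlerts = $false
        # Keep the Workbooks collection in its own variable so it can be released.
        $workbooks = $excel.Workbooks
        & $Action $excel $workbooks
    }
    finally {
        # Release references in reverse order of creation, then quit Excel.
        if ($null -ne $workbooks) {
            $null = [System.Runtime.InteropServices.Marshal]::ReleaseComObject($workbooks)
        }
        if ($null -ne $excel) {
            $excel.Quit()
            $null = [System.Runtime.InteropServices.Marshal]::ReleaseComObject($excel)
        }
        [GC]::Collect()
        [GC]::WaitForPendingFinalizers()
    }
}

# Usage sketch (Windows with Excel installed only):
# Invoke-WithExcel { param($excel, $workbooks) $excel.Version }
```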
Now it’s trivial to convert all downloaded .xls files to the appropriate new formats:
# path with excel files.
# assuming you created this folder and downloaded files to it:
$Path = "$env:tmp\Excelsampledata"
# get all xls files and convert them:
Get-ChildItem -Path $Path -Filter *.xls -File | Convert-XlsToXlsx -Verbose
ImportExcel is a PowerShell module that allows you to import data from or export data to Excel spreadsheets without having Microsoft Excel installed on your computer. In this tutorial, you’ll learn to work with Import-Excel and Export-Excel. The ImportExcel module runs on Windows, Linux, or Mac and can now be used in Azure Functions and GitHub Actions. Simply put, if you need to generate reports for work, you must learn this module.
Contents
- Importing data from Excel
- Export data to Excel
- Adding data to an existing spreadsheet
- Exporting data with formatting
- Creating charts
- Editing existing data in an Excel spreadsheet
- Conclusion and links
Mike Kanakos is a Cloud and Datacenter Microsoft MVP, tech blogger and PowerShell community leader. He writes about infrastructure management and cloud automation. You can follow Mike on his blog https://www.commandline.ninja or on Twitter at @MikeKanakos.
Doug Finke, a Microsoft MVP since 2009, builds and maintains the module. Doug is constantly improving the module and releases new module updates frequently. As of this writing, the module is at v7.1.3 and is continually being developed. His module is nearing 1 million downloads since its first release! Installing the module is a simple task with PowerShell code.
Install-Module -Name ImportExcel
Excel is not required to be installed for this module to work. The module installs a .net DLL named epplus.dll that allows the module to import Excel data or export to Excel format. This allows you to install the module on a server without having to install Office on the server.
Importing data from Excel
Getting started with the module is very easy. Let’s start by importing some data from Excel. In this first demo, I’ll be importing some simple data I have from a table in Excel.
Sample Excel table data for import
To import data, I use the Import-Excel cmdlet and specify the path. In this example, I will also save the data to a variable called "Fruit" for later use.
Import-Excel "c:\temp\ExcelDemo.xlsx" -OutVariable Fruit
Excel data import in PowerShell
Now, we have a simple table with data organized in columns and rows. The table properties reveal that PowerShell has created a PSCustomObject with two note properties for the two columns.
Excel table properties
But what if I have a large table of data? I can specify which data gets imported without having to pull in the entire table. Let’s look at how that works.
I have created a new tab in my spreadsheet that contains all the process info from my machine. I have named the tab "Processes". The spreadsheet has 69 columns of data. I could import all these columns and filter the data, but for this demonstration I just want the Name, ProcessName, CPU, and Memory columns.
Process info data in Excel
Using the Import-Excel cmdlet, I can pull in just the data I am interested in. Let’s pull in the columns I mentioned earlier (Name, ProcessName, CPU, and Memory). For this demo, I only want 6 rows of data. To accomplish this, I use the -ImportColumns, -StartRow and -EndRow parameters.
To pick the columns, I simply count columns from left to right in my spreadsheet starting at 1. I know you can’t see the full spreadsheet, but I have already counted out the columns that I need. To select the columns I want, I will need columns 1, 6, 12, and 46. But if I want to keep them in the order I mentioned above, then the order would have to be 1, 46, 12, and 6.
Import-Excel C:\temp\ExcelDemo.xlsx -WorksheetName Processes -ImportColumns @(1, 46, 12, 6) -StartRow 1 -EndRow 7
Process info imported into PowerShell
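For comparison, the other approach mentioned earlier (import everything, then filter) might look like the sketch below. It assumes the same demo file and the column names used in this article. It reads all 69 columns before discarding most of them, so it is simpler to write but does more work on large sheets.

```powershell
# Alternative sketch: import the whole worksheet, then filter by column name.
# The path and the column names are the ones used in this article's demo file.
Import-Excel -Path 'C:\temp\ExcelDemo.xlsx' -WorksheetName 'Processes' |
    Select-Object -Property Name, ProcessName, CPU, Memory -First 6
```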
Export data to Excel
As with the process of importing data, I can also export data to Excel easily with just one line of code. Let’s go back to my previous example: getting the process data. If I want to export all the process info on my machine, all I need to do is type one line:
Get-process | Export-Excel
This results in the Export-Excel cmdlet creating a spreadsheet. If I have Excel installed, it launches Excel and presents the file output to me.
Exporting data to Excel using default values
Notice that I didn’t specify a filename or any other formatting information. However, the Export-Excel cmdlet created the spreadsheet and applied some default formatting (see callout 2) and created a temporary file for me (callout 1).
Of course, I can choose a filename and path on export, if I so desire, by using the -Path parameter and supplying a value like so:
Get-Process | Export-Excel -Path C:\temp\ProcessList.xlsx
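Exporting every process property produces a very wide sheet. As a rough sketch, you could trim the output before exporting; the property selection and the worksheet name here are my own choices for illustration, not something the module requires:

```powershell
# Sketch: export only a few process properties to a named worksheet.
# The selected properties and the worksheet name 'ProcessList' are illustrative choices.
Get-Process |
    Select-Object -Property Name, Id, CPU, WorkingSet |
    Export-Excel -Path 'C:\temp\ProcessList.xlsx' -WorksheetName 'ProcessList' -AutoSize
```

The -AutoSize switch asks Export-Excel to fit the column widths to the data.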
Adding data to an existing spreadsheet
At some point, you will need to add data to an existing spreadsheet. The -Append parameter does exactly that. With the -WorksheetName parameter, I can either name an existing worksheet to add to or start a new worksheet by picking a new tab name.
So far, I have been working on a spreadsheet named "ExcelDemo.xlsx," which contains the Fruit and Processes worksheets. I want to add a new tab named "People" and copy in data from a small table I created.
Table of person and city info saved to the People variable
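If you want to follow along without the screenshot, a table like this can be built from custom objects. This is only a hypothetical reconstruction: the column names (Name, City) and every row except the first are assumptions made for illustration.

```powershell
# Hypothetical reconstruction of the table in the screenshot.
# Column names (Name, City) and all rows except the first are assumptions.
$People = @(
    [PSCustomObject]@{ Name = 'Jeremy'; City = 'Loxahatchee' }
    [PSCustomObject]@{ Name = 'Anna';   City = 'Orlando' }
    [PSCustomObject]@{ Name = 'Luis';   City = 'Tampa' }
)

# Show the table in the console.
$People | Format-Table -AutoSize
```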
Exporting this data to my existing Excel spreadsheet and creating a new worksheet would look like this:
$People | Export-Excel C:\temp\ExcelDemo.xlsx -Append -WorksheetName "People"
People table export
This is easy and doesn’t require much code. Below, we can see the worksheet tabs that have been created from Export-Excel.
Excel worksheet tabs created by Export Excel
When you look at the table, you’ll see that it has none of the familiar Excel spreadsheet formatting. I would like to add some formatting to my data. Let me show you how this can be done.
Exporting data with formatting
The Export-Excel cmdlet offers many options for formatting my data on export. I’ll highlight a few options, but make sure you review the parameters available for the Export-Excel cmdlet for a full list of formatting options.
I would like to export the data again. This time, however, I will add a table style and a title for my table, and I would like the table title to be bold. This is possible with Export-Excel. The code used to do this is slightly different from the previous example:
$People | Export-Excel C:\temp\ExcelDemo.xlsx -Append -WorksheetName "PeopleFormatted" -TableStyle Medium16 -Title "Demo of Table Formatting" -TitleBold
Formatted version of the People table in Excel
You might wonder what the table style I selected (Medium16) in the last example is. The Export-Excel cmdlet has table styles built in that correspond to the table styles you see in Excel.
Export Excel table styles available
The table styles in Excel are the same. In the screenshot below, I clicked on "Format As Table" at the top of the spreadsheet, which then displays the table styles. If you hover your mouse over a style, you’ll see some text that provides you the style details. The #1 callout is the style I hovered over. Notice that it says Medium16. This is how I got the name that I used in my previous code example for the table style parameter.
Corresponding Excel table styles
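You can also list the style names from PowerShell instead of hovering in Excel. The sketch below assumes the ImportExcel module is installed and that the -TableStyle values map to the EPPlus enum OfficeOpenXml.Table.TableStyles, which is my understanding of how the module is built:

```powershell
# Assumes the ImportExcel module is installed; importing it loads EPPlus.
Import-Module ImportExcel

# Assumption: the names accepted by -TableStyle come from this EPPlus enum.
[enum]::GetNames([OfficeOpenXml.Table.TableStyles])
```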
Creating charts
Export-Excel does more than just make spreadsheets. The cmdlet can export table data and turn that data into a chart inside an Excel spreadsheet. For my next example, I have created a table of some simple inventory items and sales data.
Sales data
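A table like this could be created as follows. The Item and TotalSold column names match the chart definition used in this section; the items and numbers are made-up sample values, not the data from the screenshot.

```powershell
# Sample sales data. Item and TotalSold are the column names used by the
# chart definition in this article; the values themselves are invented.
$data = @(
    [PSCustomObject]@{ Item = 'Apples';  TotalSold = 120 }
    [PSCustomObject]@{ Item = 'Bananas'; TotalSold = 95 }
    [PSCustomObject]@{ Item = 'Oranges'; TotalSold = 60 }
)

# Show the table in the console.
$data | Format-Table -AutoSize
```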
I would like to chart these sales in a simple bar graph that depicts units sold. First, I need to define the properties I want for my chart, which I do with the New-ExcelChartDefinition cmdlet.
$ChartData = New-ExcelChartDefinition -XRange Item -YRange TotalSold -ChartType ColumnClustered -Title "Total Fruit Sales"
This line of code defines my chart properties. It tells Excel to use the Item column for the x values and the TotalSold column for the y values. Then, I specify a chart type. There are 69 chart types available in the cmdlet, all of which correspond to the chart types in Excel. I chose the "ColumnClustered" type for my example.
I then add a chart title, although this is not required. These values are saved to a variable named $ChartData. The next piece to add to the export cmdlet is this chart definition:
$data | Export-Excel C:\temp\ExcelDemo.xlsx -Append -WorksheetName FruitSalesChart -ExcelChartDefinition $ChartData -AutoNameRange -Show -Title "Fruit Sales"
Let’s walk through this example. First, I send the $data variable, which holds our sales data, to the Export-Excel cmdlet. The syntax for Export-Excel is a continuation of my previous example. I export and append this to a spreadsheet named "ExcelDemo.xlsx" and create a new worksheet tab named FruitSalesChart. This is all code we saw in the previous examples.
Then, I add in the chart definition I created earlier by passing the $ChartData variable. Finally, I tell Export-Excel that I want an auto name range. The -Show parameter opens the spreadsheet automatically after it is created.
Fruit Sales exported to Excel as a table and chart
Editing existing data in an Excel spreadsheet
I find it so easy to export data from PowerShell to Excel that I default to the Export-Excel cmdlet for much of my work. However, you can also update individual data values in an existing spreadsheet. I will connect to the spreadsheet that I used in the previous examples. To connect, use the Open-ExcelPackage cmdlet.
$ExcelPkg = Open-ExcelPackage -Path "C:\temp\ExcelDemo.xlsx"
I can start to work with the data after opening the file.
Spreadsheet info in PowerShell
The first five rows constitute the worksheet tabs I created earlier in the spreadsheet. I can view the data in any of the tabs with some simple code.
# Let's access the data in the "PeopleFormatted" worksheet
$WorkSheet = $ExcelPkg.Workbook.Worksheets["PeopleFormatted"].Cells
$WorkSheet[3,1] | select value

Value
-----
Jeremy

$WorkSheet[3,2] | select value

Value
-----
Loxahatchee

The code above probably doesn’t make much sense without a visual reference. Have a look at the screenshot below, which should help explain the code.
In the first code example, I called $WorkSheet[3,1]. If you look at the Excel spreadsheet, "3" represents the third row and "1" represents the first column (counting from the left, so column A).
In the second code example, I called $WorkSheet[3,2], which is row 3, column 2 (column B in the spreadsheet).
Example of accessing Excel data values
Inserting a new value into an Excel cell is done with a similar set of code. I will replace the name "Jeremy" with the name "Robert".
$WorkSheet[3,1].Value = "Robert"
$WorkSheet[3,1] | select value

Value
-----
Robert

It’s that easy to update a field in Excel! However, there’s one catch. The change I just made exists only in memory inside PowerShell. The package needs to be "closed" for the data to be written back to the file.
Close-ExcelPackage $ExcelPkg
Updated spreadsheet value
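Because nothing is written until the package is closed, it can be worth making sure the package is always closed, and closed without saving if the edit failed. The following is a minimal sketch of that pattern, not the only way to do it. It assumes the demo file and worksheet from this article and uses the A1-style address 'A3' for the same cell as [3,1].

```powershell
# Sketch: open, edit, and always close the package.
# Path and worksheet name are the ones used in this article's demo.
$ExcelPkg = Open-ExcelPackage -Path 'C:\temp\ExcelDemo.xlsx'
$editSucceeded = $false
try {
    $Cells = $ExcelPkg.Workbook.Worksheets['PeopleFormatted'].Cells
    # 'A3' addresses the same cell as [3,1] (row 3, column 1).
    $Cells['A3'].Value = 'Robert'
    $editSucceeded = $true
}
finally {
    if ($editSucceeded) {
        # Closing the package writes the change to disk.
        Close-ExcelPackage -ExcelPackage $ExcelPkg
    }
    else {
        # Discard the in-memory changes if the edit failed.
        Close-ExcelPackage -ExcelPackage $ExcelPkg -NoSave
    }
}
```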
Conclusion and links
Today, I showed you how to import data from an Excel spreadsheet, create a spreadsheet, create a simple chart, and edit data in an existing Excel spreadsheet. The ImportExcel module makes these and other operations simple to complete.
I have touched upon just a few of the many complex tasks you can perform with this module. If you would like to learn more, please visit Doug Finke’s GitHub page for many more examples of demo code you can try for yourself. He has a page dedicated to FAQs and a thorough set of examples that you should definitely check out.
Many of the code examples in Doug’s module come from community members looking to use Excel in unique ways. If you have ideas for new ways to use his module, please submit a pull request to his repo so that others can learn from your use case.