If you want it to search all of columns A through Z, specify the range:
$Range = $Worksheet.Range("A:Z")
Then you should be able to call $Range.Find($SearchText). If the text is found, it returns the first cell that contains it; otherwise it returns nothing. So start Excel like you did, then run a ForEach loop, and inside it open a workbook and search for your text. If it is found, close the workbook, move the file, and stop the loop. If it is not found, close the workbook and move on to the next file. The following worked just fine for me:
$Destination = 'C:\Temp\Backup'
$SearchText = '3/23/2015 10:12:19 AM'
$Excel = New-Object -ComObject Excel.Application
$Files = Get-ChildItem "$env:USERPROFILE\Documents\*.xlsx" | Select -Expand FullName
$counter = 1
ForEach($File in $Files){
    Write-Progress -Activity "Checking: $file" -Status "File $counter of $($files.count)" -PercentComplete ($counter*100/$files.count)
    $Workbook = $Excel.Workbooks.Open($File)
    If($Workbook.Sheets.Item(1).Range("A:Z").Find($SearchText)){
        # Close the workbook before moving it so the file is not locked
        $Workbook.Close($false)
        Move-Item -Path $File -Destination $Destination
        "Moved $file to $destination"
        break
    }
    $workbook.close($false)
    $counter++
}
# Shut down the Excel process once the loop is done
$Excel.Quit()
I even got ambitious enough to add a progress bar in there so you can see how many files it has to potentially look at, how many it’s done, and what file it’s looking at right then.
Now this does all assume that you know exactly what the string is going to be (at least a partial) in that cell. If you’re wrong, then it doesn’t work. Checking for ambiguous things takes much longer, since you can’t use Excel’s matching function and have to have PowerShell check each cell in the range one at a time.
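To give an idea of what that slower, cell-by-cell check could look like, here is a minimal sketch. Find-CellMatch is a hypothetical helper, not part of Excel or PowerShell, and it assumes the cell values have already been pulled into a two-dimensional array (with Excel that would be something like UsedRange.Value2 on a multi-cell range). The sample data is a stand-in so the snippet runs without Excel. Keep in mind that Value2 returns underlying values, so date cells would come back as numbers rather than formatted text.

```powershell
# Hypothetical helper: scan a 2-D array of cell values for a regex pattern.
function Find-CellMatch {
    param(
        [object[,]] $Values,   # e.g. $Worksheet.UsedRange.Value2 (assumed to be a 2-D array)
        [string]    $Pattern   # regular expression, not an Excel wildcard
    )
    # Arrays handed back by Excel are usually 1-based, so read the bounds instead of assuming 0
    $rowLo = $Values.GetLowerBound(0); $rowHi = $Values.GetUpperBound(0)
    $colLo = $Values.GetLowerBound(1); $colHi = $Values.GetUpperBound(1)
    for ($r = $rowLo; $r -le $rowHi; $r++) {
        for ($c = $colLo; $c -le $colHi; $c++) {
            $cell = $Values[$r, $c]
            if ($null -ne $cell -and "$cell" -match $Pattern) {
                # Report 1-based row/column positions within the array
                [pscustomobject]@{
                    Row    = $r - $rowLo + 1
                    Column = $c - $colLo + 1
                    Text   = "$cell"
                }
            }
        }
    }
}

# Stand-in data so the sketch runs without Excel
$sample = New-Object 'object[,]' 2, 2
$sample[0,0] = 'Order 1001'; $sample[0,1] = 'Shipped to Omaha, Nebraska'
$sample[1,0] = 'Order 1002'; $sample[1,1] = $null
$hits = Find-CellMatch -Values $sample -Pattern 'Nebraska|Kansas'
$hits
```

Reading the whole range into an array once and looping in PowerShell is generally much faster than touching each cell through COM, but it is still slower than letting Excel's own Find do the work.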
PowerShell Pipeline
Find Data in Excel Using PowerShell
For those times when converting Excel files to .CSV just doesn’t make sense.
Sometimes you need to scan some files for a piece of data like a string, phrase or some number, and one of those files just happens to be an Excel spreadsheet.
You could open up the file, launch the search window and begin looking at and taking note of every instance that you find. Or you could automate this using PowerShell by converting the file to a .CSV, then using Import-Csv to work through the data that way.
If neither of those approaches does what you want — perhaps you don’t want to convert the file to .CSV because there’s more than one sheet in the file, or you just feel like converting it is too much of a manual step in your automation process — then you are in luck! I will show you how to take a workbook in Excel and go through each spreadsheet in the file and search for data that not only shows you the data, but also the row and column number that it is located in.
First, I need an Excel spreadsheet with some data in it. Fortunately, I was able to download a sample spreadsheet that has some data that I can use for my searches. The file is called "Sample - Superstore.xls" and contains three sheets with various information in them.
Now that we know what we are working with, we can connect to the Excel COM object so that we can begin working with the object to search for some data.
$Excel = New-Object -ComObject Excel.Application
Now we will open up the Excel workbook by using the Workbooks.Open() method. Make sure that when you do this, you are providing the full path name for the file. Not doing this will result in the method throwing an error.
$Workbook = $Excel.Workbooks.Open('C:\users\proxb\Downloads\Sample - Superstore.xls')
I will now connect to the first spreadsheet and display its name so we can make sure we are looking at the Orders sheet.
$workSheet = $Workbook.Sheets.Item(1)
$WorkSheet.Name
When working with Excel, we use 1 to refer to the first sheet, because Excel collections are 1-based, unlike PowerShell arrays, which are zero-indexed.
We will begin searching for our data (in this case, "Nebraska") using the Cells.Find() method and supplying the search string as the method parameter. If it finds a match, the result is returned as an object that we can inspect and later pass as a parameter to another method.
$Found = $WorkSheet.Cells.Find('Nebraska')
In this case, I found a cell that has "Nebraska" in it. Now we can take the returned object, get the address of the match, and save it as the beginning address. The arguments (0,0,1,1) ask for a relative, A1-style address that includes the workbook and worksheet names. At some point the search will wrap around to the first match, so we want to stop when we get back to that address.
$BeginAddress = $Found.Address(0,0,1,1)
$BeginAddress
I now have enough information to build a PowerShell object that contains the data found, as well as the column and row information and the address.
[pscustomobject]@{
    WorkSheet = $Worksheet.Name
    Column = $Found.Column
    Row = $Found.Row
    Text = $Found.Text
    Address = $BeginAddress
}
From here, we want to continue the search until we reach the top of our matches.
Do {
    $Found = $WorkSheet.Cells.FindNext($Found)
    $Address = $Found.Address(0,0,1,1)
    If ($Address -eq $BeginAddress) {
        BREAK
    }
    [pscustomobject]@{
        WorkSheet = $Worksheet.Name
        Column = $Found.Column
        Row = $Found.Row
        Text = $Found.Text
        Address = $Address
    }
} Until ($False)
Here, you can see that I use the $Found object from the initial search as the parameter value in the Cells.FindNext() method. I keep track of the address of each subsequent match, and as soon as I come across the first address again, I break out of the Do loop so I do not have duplicate data showing up. Once this completes, I find a total of 38 matches on this particular sheet. Now I know how many times "Nebraska" shows up in the spreadsheet, and where.
I also want to add that you can use wildcard searches in these examples. An asterisk (*) matches any run of characters, and a question mark (?) matches exactly one character.
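To get a rough feel for how the two wildcards behave, here is a small sketch that uses PowerShell's own -like operator as a stand-in. This is an assumption made for illustration: -like happens to use the same * and ? characters, but Excel's matching also depends on Find options (such as whether it looks at whole or partial cell contents) that are not modeled here. The list of state names is made up.

```powershell
# Made-up sample values standing in for cell contents
$states = 'Nebraska', 'Nevada', 'New York', 'Kansas'

# * stands for any run of characters (including none)
$starMatches = @($states | Where-Object { $_ -like 'Ne*' })

# ? stands for exactly one character
$questionMatches = @($states | Where-Object { $_ -like 'Ne?ada' })

$starMatches
$questionMatches
```

With Excel itself, the same kind of pattern would simply be passed as the search string, for example to Cells.Find().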
Something like this would make for a great function, right? Well, I agree, and wrote a small function that you can use to search a full workbook for some data and have it return the data I provided above, but also across all of the worksheets. The full code, called "Search-Excel," is available at the end of this article and takes parameters of the full file path and the search string.
Search-Excel -Source 'C:\users\proxb\Downloads\Sample - Superstore.xls' -SearchText Nebraska | Format-Table
It finds all of the "Nebraska" entries under the Orders sheet but doesn’t find anything on the other two sheets.
You can use PowerShell to search for various pieces of data within an Excel workbook to include all of the worksheets, which can be useful to quickly determine how much of a particular piece of data is in the workbook. Or you can use it to quickly scan an Excel document for sensitive data. Either way, this will hopefully be a useful tool for your PowerShell toolbox!
Search-Excel Source Code
Function Search-Excel {
    [cmdletbinding()]
    Param (
        [parameter(Mandatory)]
        [ValidateScript({
            Try {
                If (Test-Path -Path $_) {$True}
                Else {Throw "$($_) is not a valid path!"}
            }
            Catch {
                Throw $_
            }
        })]
        [string]$Source,
        [parameter(Mandatory)]
        [string]$SearchText
        #You can specify wildcard characters (*, ?)
    )
    # Resolve the full path first; Workbooks.Open() needs it
    Try {
        $Source = Convert-Path $Source
    }
    Catch {
        Write-Warning "Unable locate full path of $($Source)"
        Return
    }
    $Excel = New-Object -ComObject Excel.Application
    $Workbook = $Excel.Workbooks.Open($Source)
    ForEach ($Worksheet in @($Workbook.Sheets)) {
        # Find Method https://msdn.microsoft.com/en-us/vba/excel-vba/articles/range-find-method-excel
        $Found = $WorkSheet.Cells.Find($SearchText)
        If ($Found) {
            # Address Method https://msdn.microsoft.com/en-us/vba/excel-vba/articles/range-address-property-excel
            $BeginAddress = $Found.Address(0,0,1,1)
            #Initial Found Cell
            [pscustomobject]@{
                WorkSheet = $Worksheet.Name
                Column = $Found.Column
                Row = $Found.Row
                Text = $Found.Text
                Address = $BeginAddress
            }
            Do {
                $Found = $WorkSheet.Cells.FindNext($Found)
                $Address = $Found.Address(0,0,1,1)
                If ($Address -eq $BeginAddress) {
                    BREAK
                }
                [pscustomobject]@{
                    WorkSheet = $Worksheet.Name
                    Column = $Found.Column
                    Row = $Found.Row
                    Text = $Found.Text
                    Address = $Address
                }
            } Until ($False)
        }
        Else {
            Write-Warning "[$($WorkSheet.Name)] Nothing Found!"
        }
    }
    $workbook.close($false)
    # Quit Excel before releasing the COM object so no process is left behind
    $Excel.Quit()
    [void][System.Runtime.InteropServices.Marshal]::ReleaseComObject([System.__ComObject]$excel)
    [gc]::Collect()
    [gc]::WaitForPendingFinalizers()
    Remove-Variable excel -ErrorAction SilentlyContinue
}
About the Author
Boe Prox is a Microsoft MVP in Windows PowerShell and a Senior Windows System Administrator. He has worked in the IT field since 2003, and he supports a variety of different platforms. He is a contributing author in PowerShell Deep Dives with chapters about WSUS and TCP communication. He is a moderator on the Hey, Scripting Guy! forum, and he has been a judge for the Scripting Games. He has presented talks on the topics of WSUS and PowerShell as well as runspaces to PowerShell user groups. He is an Honorary Scripting Guy, and he has submitted a number of posts to Microsoft’s Hey, Scripting Guy! He also has a number of open source projects available on Codeplex and GitHub. His personal blog is at http://learn-powershell.net.
Question
Hi all,
I am trying to figure out how to loop this script so that it replaces all the text that I am searching for in my Excel file. I am able to get it to run but it only replaces the text in the first column and then ends. Any help is appreciated and thanks in advance.
-Gaz
$File = "D:\test.xlsx"
# Setup Excel, open $File and set the first worksheet
$Excel = New-Object -ComObject Excel.Application
$Excel.visible = $true
$Workbook = $Excel.workbooks.open($file)
$Worksheets = $Workbook.worksheets
$Worksheet = $Workbook.Worksheets.Item(1)
$SearchString = "TEMP" # This is the value that I will be searching for
$Range = $Worksheet.Range("A1").EntireColumn
$Search = $Range.find($SearchString)
$Search.value() = "ABSENT" # This is the value that I want to replace the text with.
Answers
Hi,
You can use the FindNext and FindPrevious methods to repeat the search.
$File = "c:\test\book2.xlsx"
# Setup Excel, open $File and set the first worksheet
$Excel = New-Object -ComObject Excel.Application
$Excel.visible = $true
$Workbook = $Excel.workbooks.open($file)
$Worksheets = $Workbook.worksheets
$Worksheet = $Workbook.Worksheets.Item(1)
$SearchString = "TEMP" # This is the value that I will be searching for
$Range = $Worksheet.Range("A1").EntireColumn
$Search = $Range.find($SearchString)
if ($search -ne $null) {
    $FirstAddress = $search.Address
    do {
        # Replace the current match, then move on to the next one
        $Search.value() = "ABSENT"
        $search = $Range.FindNext($search)
    } while ( $search -ne $null -and $search.Address -ne $FirstAddress)
}
$WorkBook.Save()
$WorkBook.Close()
[void]$excel.quit()
If you want to work on multiple columns, let’s say "A1:B10", please see the example below:
$Worksheet.Range("A1","B10")
rgds,
Edited Wednesday, February 19, 2014 8:22 AM
Proposed as answer by Yan Li_, Thursday, February 20, 2014 3:39 AM
Marked as answer by Yan Li_, Friday, February 28, 2014 7:08 AM
Sometimes we need to search notes (such as quick .txt files) or text-based configuration files spread over our system to find something specific, but sifting through many files manually is tedious. Fortunately, scripting is on our side, and PowerShell scripting in particular can help with this usually very time-consuming activity.
Let’s consider a directory, "C:\Temp", with many text files in it. Each of the files has random text data inside. We’re looking for only the files that contain one particular text string. Additionally, since we don’t know how many matches we are going to find, we’re going to create an array to store them. To search for strings or string patterns, we’re going to use the Select-String cmdlet.
$Path = "C:\temp"
$Text = "This is the data that I am looking for"
$PathArray = @()
$Results = "C:\temp\test.txt"
# The following code snippet gets all the files in $Path that end in ".txt".
Get-ChildItem $Path -Filter "*.txt" |
    Where-Object { $_.Attributes -ne "Directory" } |
    ForEach-Object {
        If (Get-Content $_.FullName | Select-String -Pattern $Text) {
            $PathArray += $_.FullName
        }
    }
Write-Host "Contents of ArrayPath:"
$PathArray | ForEach-Object {$_}
Here’s the breakdown: this will search the directory $Path for any items whose names end in ".txt" and that are not directories.
Get-ChildItem $Path -Filter "*.txt" | Where-Object { $_.Attributes -ne "Directory"}
For every match it finds, it will check the contents of the match using Get-Content and verify any matches with $Text by using Select-String. If it finds a match, it puts the full name of the match into the $PathArray array.
ForEach-Object {
    If (Get-Content $_.FullName | Select-String -Pattern $Text) {
        $PathArray += $_.FullName
    }
}
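As a variation on the loop above, the same list of matching files can be collected by piping the files straight into Select-String and letting it stop at the first hit in each file with -List. This is only a sketch: the temporary folder and the a.txt and b.txt files are made up so that the snippet runs on its own, and -SimpleMatch is used on the assumption that the text should be matched literally rather than as a regular expression.

```powershell
# Build a throwaway folder with two sample files (hypothetical names)
$demoDir = Join-Path ([System.IO.Path]::GetTempPath()) ("select-string-demo-" + [guid]::NewGuid())
New-Item -ItemType Directory -Path $demoDir | Out-Null
Set-Content -Path (Join-Path $demoDir 'a.txt') -Value 'nothing to see', 'This is the data that I am looking for'
Set-Content -Path (Join-Path $demoDir 'b.txt') -Value 'unrelated text'

$Text = 'This is the data that I am looking for'

# -List returns only the first match per file, so each path appears once
$PathArray = @(
    Get-ChildItem -Path $demoDir -Filter '*.txt' -File |
        Select-String -Pattern $Text -SimpleMatch -List |
        Select-Object -ExpandProperty Path
)
$PathArray

# Clean up the throwaway folder
Remove-Item -Path $demoDir -Recurse -Force
```

Collecting the pipeline output in one assignment also avoids growing the array with += on every match.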
There you have it. If you want to export the results to a file instead of showing them on the screen, simply pipe them into the Out-File cmdlet. For example:
$PathArray | % {$_} | Out-File "C:\Some Folder\Some File.txt"
Searching through all subfolders: If you want to search all subfolders for additional *.txt files as well, then add -Recurse to the original Get-ChildItem command:
Get-ChildItem $Path -Filter "*.txt" -Recurse
The -Include approach
Get-ChildItem includes two additional parameters, -Include and -Exclude. Their functions are simple, and they can be very useful when searching for specific file types.
The -Include parameter says, "Show me only these files in the search", and -Exclude says, "Keep that stuff out of my way."
As a bonus tip, remember that to include hidden folders and files in the search, we need to add the -Force parameter.
Get-ChildItem -Path C:\ -Recurse -Force -ErrorAction SilentlyContinue
We can now use the same command to show, for example, only the items that include the word "software" in their names.
Get-ChildItem -Path C:\ -Include *software* -Recurse -ErrorAction SilentlyContinue
The above command will return everything with "software" in its name, including folders. We can tell it to show only files by adding the -File switch, which was introduced in PowerShell 3.0.
Get-ChildItem -Path C:\ -Include *software* -File -Recurse -ErrorAction SilentlyContinue
We can also use the -Exclude parameter to say, "Don’t show me any TMP, MP3, or JPG files."
Get-ChildItem -Path C:\ -Include *software* -Exclude *.JPG,*.MP3,*.TMP -File -Recurse -ErrorAction SilentlyContinue
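To see -Include, -Exclude and -File working together without scanning a whole drive, here is a small self-contained sketch. The temporary folder and the three file names are invented for the demonstration; only the Get-ChildItem call mirrors the commands above.

```powershell
# Throwaway folder tree with made-up file names
$demoDir = Join-Path ([System.IO.Path]::GetTempPath()) ("include-demo-" + [guid]::NewGuid())
$subDir  = Join-Path $demoDir 'sub'
New-Item -ItemType Directory -Path $subDir -Force | Out-Null
Set-Content -Path (Join-Path $demoDir 'software-notes.txt') -Value 'x'
Set-Content -Path (Join-Path $subDir  'software-logo.jpg')  -Value 'x'
Set-Content -Path (Join-Path $subDir  'readme.txt')         -Value 'x'

# Keep names containing "software", drop JPGs, files only, all subfolders
$found = @(
    Get-ChildItem -Path $demoDir -Include *software* -Exclude *.jpg -File -Recurse |
        Select-Object -ExpandProperty Name
)
$found

# Clean up the throwaway folder
Remove-Item -Path $demoDir -Recurse -Force
```

Trying a filter on a small folder like this first is a cheap way to confirm it does what you expect before pointing it at C:\.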
Filtering by date
Finally, in PowerShell we can also filter files by date and time quite easily.
First, let’s store the date we want to search from, for example 1 November 2020. Get-Date fills in the current time of day for any part we do not specify, so we take the .Date property to get midnight at the start of that day.
$FindDate = (Get-Date -Year 2020 -Month 11 -Day 1).Date
With this in place, we can start with the basic search: show me all Word documents, files only, on the entire C: drive, and keep the error messages to yourself, PowerShell.
Get-ChildItem -Path C:\ -Include *.doc,*.docx -File -Recurse -ErrorAction SilentlyContinue
In the example below, we use Where-Object to show only files that were modified on or after the day stored in $FindDate. This includes everything since 12:00 AM that morning. We compare against the LastWriteTime property, which is the last time the file was written to.
Get-ChildItem -Path C:\ -Include *.doc,*.docx -File -Recurse -ErrorAction SilentlyContinue | Where-Object { $_.LastWriteTime -ge $FindDate }
We can also filter on one specific day by calling the AddDays() method on $FindDate to define a 24-hour range:
Get-ChildItem -Path C:\ -Include *.doc,*.docx -File -Recurse -ErrorAction SilentlyContinue | Where-Object { $_.LastWriteTime -ge $FindDate -and $_.LastWriteTime -lt $FindDate.AddDays(1) }
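The boundaries of that one-day window are easy to get wrong, so here is a minimal sketch that checks them against a few hand-picked timestamps instead of real files. The sample timestamps are made up, and the window is treated as half-open (midnight included, the following midnight excluded), which is an assumption about the intended behavior. Note the .Date call: without it, Get-Date keeps the current time of day.

```powershell
# Midnight at the start of 1 November 2020
$FindDate = (Get-Date -Year 2020 -Month 11 -Day 1).Date
$NextDay  = $FindDate.AddDays(1)

# Made-up timestamps standing in for LastWriteTime values
$samples = @(
    [datetime]'2020-10-31T23:59:59',   # just before the window
    [datetime]'2020-11-01T00:00:00',   # first instant of the day
    [datetime]'2020-11-01T18:30:00',   # during the day
    [datetime]'2020-11-02T00:00:00'    # first instant of the next day
)

# Same comparison as the Where-Object filter, applied to the samples
$inDay = @($samples | Where-Object { $_ -ge $FindDate -and $_ -lt $NextDay })
$inDay
```

Only the two timestamps that fall on 1 November pass the filter.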
Finally, we can also search several paths at once by passing them as a comma-separated list, as in the command below:
Get-ChildItem -Path C:\Users, D:\Example1, E:\Example2 -Include *.doc,*.docx -File -Recurse -ErrorAction SilentlyContinue | Where-Object { $_.LastWriteTime -ge $FindDate -and $_.LastWriteTime -lt $FindDate.AddDays(1) }
Searching content in files
A first way to search for a text string inside files (not in filenames) is the Select-String cmdlet. Its -Path and -Pattern parameters both expect strings, so simple values do not need quotation marks. For example, we can use the following command to search the C:\fso folder for files that have the .txt file extension and contain a match for the "success" text string:
Select-String -Path C:\fso\*.txt -Pattern success
The output of this command will be zero or more lines in the format file.txt:1:success, which shows, in this example, that the "success" text string was found on line 1 of file.txt.
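Behind that text output, Select-String returns one object per matching line, and its properties can be picked out directly. The sketch below shows this with a made-up temporary file so that it runs anywhere; Filename, LineNumber and Line are properties of the match objects that Select-String emits.

```powershell
# Made-up sample file with two matching lines
$demoFile = Join-Path ([System.IO.Path]::GetTempPath()) ("fso-demo-" + [guid]::NewGuid() + ".txt")
Set-Content -Path $demoFile -Value 'success on first line', 'nothing here', 'another success'

# Keep only the file name, line number and line text of each match
$hits = @(
    Select-String -Path $demoFile -Pattern success |
        Select-Object Filename, LineNumber, Line
)
$hits

# Clean up the sample file
Remove-Item -Path $demoFile
```

Working with these properties is usually easier than parsing the file:line:text output as a string.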
The Select-String cmdlet also accepts arrays of strings for -Path and -Pattern, so we can, for example, run commands like the following to search multiple file types or files, for one or more patterns:
Select-String -Path C:\fso\*.txt, C:\fso\*.log -Pattern success
Select-String -Path C:\fso\*.txt, C:\fso\*.log -Pattern success,failure
In addition to directly using the -Path parameter in the Select-String cmdlet, we can use the Get-Childitem cmdlet for more granular control over the files to be parsed.
In the following commands, we use Get-ChildItem (dir is an alias for it, and the -R and -I switches are abbreviations of -Recurse and -Include) and provide the path C:\fso. We include only .txt and .log files, then pipe the results to the Select-String cmdlet and look for the pattern success. In the short form, -Pattern is omitted because it is the first positional parameter. The long version of the command is shown here:
Get-ChildItem -Path C:\fso -Include *.txt, *.log -Recurse | Select-String -Pattern success
Here is an example of the shorter form of the command:
dir C:\fso -I *.txt, *.log -R | Select-String success
Search in Office files
The last part of this article focuses on how to search content in Microsoft Office files, because the Select-String approach does not work for these kinds of files.
The following is a COM (ComObject) approach in PowerShell for reading and searching worksheets in Excel. Keep in mind that the COM approach is usually slow and requires Microsoft Office to be installed on the system where the script runs:
Set-StrictMode -Version latest
Function Search-Excel {
    [cmdletbinding()]
    Param (
        [parameter(Mandatory)]
        [ValidateScript({
            Try {
                If (Test-Path -Path $_) {$True}
                Else {Throw "$($_) is not a valid path!"}
            }
            Catch {
                Throw $_
            }
        })]
        [string]$Source,
        [parameter(Mandatory)]
        [string]$SearchText
        #You can specify wildcard characters (*, ?)
    )
    $Excel = New-Object -ComObject Excel.Application
    $Files = Get-Childitem $Source -Include *.xlsx,*.xls -Recurse | Where-Object { !($_.psiscontainer) }
    Foreach ($File In $Files) {
        Try {
            $Source = Convert-Path $File.FullName
        }
        Catch {
            Write-Warning "Unable locate full path of $($File.FullName)"
            Continue
        }
        # Workbooks.Open() needs the full path as a string
        $Workbook = $Excel.Workbooks.Open($Source)
        ForEach ($Worksheet in @($Workbook.Sheets)) {
            # Find Method https://msdn.microsoft.com/en-us/vba/excel-vba/articles/range-find-method-excel
            $Found = $WorkSheet.Cells.Find($SearchText)
            If ($Found) {
                # Address Method https://msdn.microsoft.com/en-us/vba/excel-vba/articles/range-address-property-excel
                $BeginAddress = $Found.Address(0,0,1,1)
                #Initial Found Cell
                [pscustomobject]@{
                    WorkSheet = $Worksheet.Name
                    Column = $Found.Column
                    Row = $Found.Row
                    Text = $Found.Text
                    Address = $BeginAddress
                    Path = $File.FullName
                }
                Do {
                    $Found = $WorkSheet.Cells.FindNext($Found)
                    $Address = $Found.Address(0,0,1,1)
                    If ($Address -eq $BeginAddress) {
                        BREAK
                    }
                    [pscustomobject]@{
                        WorkSheet = $Worksheet.Name
                        Column = $Found.Column
                        Row = $Found.Row
                        Text = $Found.Text
                        Address = $Address
                        Path = $File.FullName
                    }
                } Until ($False)
            }
            Else {
                Write-Warning "[$($WorkSheet.Name)] Nothing Found!"
            }
        }
        $workbook.close($false)
    }
    # Quit Excel before releasing the COM object so no process is left behind
    $Excel.Quit()
    [void][System.Runtime.InteropServices.Marshal]::ReleaseComObject([System.__ComObject]$excel)
    [gc]::Collect()
    [gc]::WaitForPendingFinalizers()
    Remove-Variable excel -ErrorAction SilentlyContinue
}
The following examples show how to use the above function with .xlsx and .xls files. If you give it a folder path instead of a specific Excel file, the function recurses through all folders and subfolders:
# Search for the text string in the specified Excel file:
Search-Excel -Source "C:\Results\Try.xlsx" -SearchText Try
# Recurse through all Excel files and search for the text string:
Search-Excel -Source "C:\Results" -SearchText Try
Here, instead, is a way to search for a specific string in multiple Word files, written as a function like the Excel one above:
Set-StrictMode -Version latest
Function Search-Word {
    [cmdletbinding()]
    Param (
        [parameter(Mandatory)]
        [ValidateScript({
            Try {
                If (Test-Path -Path $_) {$True}
                Else {Throw "$($_) is not a valid path!"}
            }
            Catch {
                Throw $_
            }
        })]
        [string]$Source,
        [parameter(Mandatory)]
        [string]$SearchText
        #Used with -Match below, so this is treated as a regular expression
    )
    $Word = New-Object -ComObject Word.Application
    $Files = Get-Childitem $Source -Include *.docx,*.doc -Recurse | Where-Object { !($_.psiscontainer) }
    Foreach ($File In $Files) {
        Try {
            $Source = Convert-Path $File.FullName
        }
        Catch {
            Write-Warning "Unable locate full path of $($File.FullName)"
            Continue
        }
        # Open the document read-only and without a conversion prompt
        $Document = $Word.Documents.Open($File.FullName,$false,$true)
        $Range = $Document.Content
        If($Range.Text -Match $SearchText){
            [pscustomobject]@{
                File = $File.FullName
                Match = $SearchText
                Text = $Matches[0]
            }
        }
        Else {
            Write-Warning "[$($File.FullName)]: Nothing Found!"
        }
        $Document.close($false)
    }
    # Quit Word before releasing the COM object so no process is left behind
    $Word.Quit()
    [void][System.Runtime.InteropServices.Marshal]::ReleaseComObject([System.__ComObject]$Word)
    [gc]::Collect()
    [gc]::WaitForPendingFinalizers()
    Remove-Variable Word -ErrorAction SilentlyContinue
}
The following examples show how to use the above function with .docx and .doc files. If you give it a folder path instead of a specific Word file, the function recurses through all folders and subfolders:
# Search for the text string in the specified Word file:
Search-Word -Source "C:\Results\Try.docx" -SearchText computer
# Recurse through all Word files and search for the text string:
Search-Word -Source "C:\Results" -SearchText computer
The DocumentFormat.OpenXml approach
Another way of searching and extracting data with PowerShell is to import the DocumentFormat.OpenXml assembly and use its properties and methods to manipulate Word, Excel and PowerPoint files. This will be explored further in a future article.
The main features of DocumentFormat.OpenXml are:
- High-performance generation of word-processing documents, spreadsheets, and presentations.
- Populating content in Word files from an XML data source.
- Splitting up (shredding) a Word or PowerPoint file into multiple files, and combining multiple Word/PowerPoint files into a single file.
- Extraction of data from Excel documents.
- Searching and replacing content in Word/PowerPoint using regular expressions.
- Updating cached data and embedded spreadsheets for charts in Word/PowerPoint.
- Document modification, such as removing tracked revisions or removing unacceptable content from documents.
Summary
Article name: Searching through files for matching text strings (PowerShell)
Description: Let’s see some PowerShell ways to search for text or files, including in Word and Excel files, on a Windows system (server and client). Scripting this common, tedious activity (searching for files and for text in files) can save a lot of time.
Publisher: Heelpbook.net
Have you ever wanted or needed to search an Excel file for specific values from within a PowerShell script? I know I have. The last time I needed to, I just set up a loop and checked every cell in the column for a match until I encountered a blank cell (or until I exhausted the specified range). This is hardly an elegant solution. I figured there had to be a find method somewhere in Excel that I could use.
With some searching on Google and a bit of reading on MSDN, I’ve figured out how to accomplish it. There is a find method in the Excel NamedRange class. The implementation of this method I use is still relatively simple; I do not pass any of the optional parameters to the find method.
In this example, I open an Excel file, search the first column (column A) for the string “Some Value” and then replace it with the string “Another Value.”
$File = "$pwd\test.xlsx"
# Setup Excel, open $File and set the first worksheet
$Excel = New-Object -ComObject Excel.Application
$Excel.visible = $true
$Workbook = $Excel.workbooks.open($file)
$Worksheets = $Workbook.worksheets
$Worksheet = $Workbook.Worksheets.Item(1)
$SearchString = "Some Value"
$Range = $Worksheet.Range("A1").EntireColumn
$Search = $Range.find($SearchString)
$Search.value() = "Another Value"
The line:
$Range = $Worksheet.Range("A1").EntireColumn
sets a range for the entire A column. If you prefer, you can specify a fixed range like so:
$Range = $Worksheet.Range("A1","A5")
We then set the output from the find method to the $Search variable in this line:
$Search = $Range.find($SearchString)
We simply pass the search string, in this case $SearchString, as a parameter in the find method on a NamedRange object. Note that the returned value will be $null if the search does not find a match.
And, finally, we change the value of the found cell:
$Search.value() = "Another Value"
There are also FindNext and FindPrevious methods that can be called if you want to find all instances.
Take a look at the following MSDN article for more information on the find method: http://msdn.microsoft.com/en-us/library/microsoft.office.tools.excel.namedrange.find(VS.80).aspx. For more on automating Excel in PowerShell (or other language), check out: http://msdn.microsoft.com/en-us/library/microsoft.office.tools.excel(VS.80).aspx. I also found the Developers Guide to the Excel 2007 Application Object to be useful. And, finally, to work with ranges, check out Using the Excel Range Function with PowerShell.
~Daniel