title | keywords | f1_keywords | ms.prod | api_name | ms.assetid | ms.date | ms.localizationpriority |
---|---|---|---|---|---|---|---|
Table.ConvertToText method (Word) | vbawd10.chm156303379 | vbawd10.chm156303379 | word | Word.Table.ConvertToText | 750db54e-faca-f1eb-8eb8-3a5c0dbb2c25 | 06/08/2017 | medium |
Table.ConvertToText method (Word)
Converts a table to text and returns a Range object that represents the delimited text.
Syntax
expression.ConvertToText(_Separator_, _NestedTables_)
expression Required. A variable that represents a ‘Table’ object.
Parameters
Name | Required/Optional | Data type | Description |
---|---|---|---|
Separator | Optional | Variant | The character that delimits the converted columns (paragraph marks delimit the converted rows). Can be any WdTableFieldSeparator constants. |
NestedTables | Optional | Variant | True if nested tables are converted to text. This argument is ignored if Separator is not wdSeparateByParagraphs. The default value is True. |
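The delimiting rule in the table above can be modeled outside Word. The following is a plain-Python sketch, not Word's API: `table_to_text` is a hypothetical helper, and `"\r"` is assumed here as a stand-in for the paragraph mark that ends each converted row.

```python
def table_to_text(rows, separator="\t"):
    """Join each row's cells with `separator`; end every row with a paragraph mark."""
    # "\r" stands in for Word's paragraph mark in this model.
    return "".join(separator.join(cells) + "\r" for cells in rows)


table = [["Cell 1", "Cell 2"], ["Cell 3", "Cell 4"]]
print(repr(table_to_text(table)))        # tab-delimited columns
print(repr(table_to_text(table, ", ")))  # any other delimiter string
```

The model only illustrates how columns and rows are delimited; it does not reproduce nested-table handling.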
Remarks
When you apply the ConvertToText method to a Table object, the object is deleted. To maintain a reference to the converted contents of the table, you must assign the Range object returned by the ConvertToText method to a new object variable. In the following example, the first table in the active document is converted to text and then formatted as a bulleted list.
Dim tableTemp As Table
Dim rngTemp As Range
Set tableTemp = ActiveDocument.Tables(1)
Set rngTemp = _
    tableTemp.ConvertToText(Separator:=wdSeparateByParagraphs)
rngTemp.ListFormat.ApplyListTemplate _
    ListTemplate:=ListGalleries(wdBulletGallery).ListTemplates(1)
Example
This example creates a table and then converts it to text by using tabs as separator characters.
Dim docNew As Document
Dim tableNew As Table
Dim intTemp As Integer
Dim cellLoop As Cell
Dim rngTemp As Range
Set docNew = Documents.Add
Set tableNew = docNew.Tables.Add(Range:=Selection.Range, _
    NumRows:=3, NumColumns:=3)
intTemp = 1
For Each cellLoop In tableNew.Range.Cells
    cellLoop.Range.InsertAfter "Cell " & intTemp
    intTemp = intTemp + 1
Next cellLoop
MsgBox "Click OK to convert table to text."
Set rngTemp = _
    tableNew.ConvertToText(Separator:=wdSeparateByTabs)
This example converts the table that contains the selection to text, with spaces between the columns.
If Selection.Information(wdWithInTable) = True Then
    Selection.Tables(1).ConvertToText Separator:=" "
Else
    MsgBox "The insertion point is not in a table."
End If
See also
Table Object
score:9
Accepted answer
Note that an excellent source of information for Microsoft Office applications is the Object Browser. You can access it via Tools → Macro → Visual Basic Editor. Once you are in the editor, press F2 to browse the interfaces, methods, and properties provided by Microsoft Office applications.
Here is an example using Win32::OLE:
#!/usr/bin/perl
use strict;
use warnings;
use File::Spec::Functions qw( catfile );
use Win32::OLE;
use Win32::OLE::Const 'Microsoft Word';
$Win32::OLE::Warn = 3;
my $word = get_word();
$word->{Visible} = 0;
my $doc = $word->{Documents}->Open(catfile $ENV{TEMP}, 'test.docx');
$doc->SaveAs(
catfile($ENV{TEMP}, 'test.txt'),
wdFormatTextLineBreaks
);
$doc->Close(0);
sub get_word {
my $word;
eval {
$word = Win32::OLE->GetActiveObject('Word.Application');
};
die "$@\n" if $@;
unless(defined $word) {
$word = Win32::OLE->new('Word.Application', sub { $_[0]->Quit })
or die "Oops, cannot start Word: ",
Win32::OLE->LastError, "\n";
}
return $word;
}
__END__
score:0
You can’t do it in VBA if you don’t want to start Word (or another Office application). Even if you meant VB, you’d still have to start a (hidden) instance of Word to do the processing.
score:0
I need a way to convert .doc or .docx extensions to .txt without installing anything
for I in *.doc?; do mv "$I" "$(echo "$I" | sed 's/\.docx\?$/.txt/')"; done
Just joking.
You could use antiword for the older versions of Word documents, and try to parse the xml of the new ones.
score:0
With docxtemplater, you can easily get the full text of a Word document (works with docx only).
Here’s the code (Node.js):
DocxTemplater=require('docxtemplater');
doc=new DocxTemplater().loadFromFile("input.docx");
result=doc.getFullText();
This is just three lines of code and doesn’t depend on any Word instance (all plain JS).
score:1
.doc’s that use the WordprocessingML and .docx’s XML format can have their XML parsed to retrieve the actual text of the document. You’ll have to read their specifications to figure out which tags contain readable text.
score:1
The method of Sinan Ünür works well.
However, I got some crashes with the files I was converting.
Another method is to use Win32::OLE and Win32::Clipboard as such:
- Open the Word document
- Select all the text
- Copy in the Clipboard
- Print the content of Clipboard in a txt file
- Empty the Clipboard and close the Word document
Based on the script given by Sigvald Refsu in http://computer-programming-forum.com/53-perl/c44063de8613483b.htm, I came up with the following script.
Note: I chose to save the txt file with the same basename as the .docx file and in the same folder but this can easily be changed
###########################################
use strict;
use File::Spec::Functions qw( catfile );
use FindBin '$Bin';
use Win32::OLE qw(in with);
use Win32::OLE::Const 'Microsoft Word';
use Win32::Clipboard;
my $monitor_word=0; #set 1 to watch MS Word being opened and closed
sub docx2txt {
##Note: the path shall be in the form "C:\dir with space\file.docx";
my $docx_file=shift;
#MS Word object
my $Word = Win32::OLE->new('Word.Application', 'Quit') or die "Couldn't run Word";
#Monitor what happens in MS Word
$Word->{Visible} = 1 if $monitor_word;
#Open file
my $Doc = $Word->Documents->Open($docx_file);
with ($Doc, ShowRevisions => 0); #Turn off revision marks
#Select the complete document
$Doc->Select();
my $Range = $Word->Selection();
with ($Range, ExtendMode => 1);
$Range->SelectAll();
#Copy selection to clipboard
$Range->Copy();
#Create txt file
my $txt_file=$docx_file;
$txt_file =~ s/\.docx$/.txt/;
open(TextFile,">$txt_file") or die "Error while trying to write in $txt_file ($!)";
printf TextFile ("%s\n", Win32::Clipboard::Get());
close TextFile;
#Empty the Clipboard (to prevent warning about "huge amount of data in clipboard")
Win32::Clipboard::Set("");
#Close Word file without saving
$Doc->Close({SaveChanges => wdDoNotSaveChanges});
# Disconnect OLE
undef $Word;
}
Hope it helps.
score:3
Note that you can also use OpenOffice to perform miscellaneous document, drawing, spreadsheet, etc. conversions on both Windows and *nix platforms.
You can access OpenOffice programmatically (in a way analogous to COM on Windows) via UNO from a variety of languages for which a UNO binding exists, including from Perl via the OpenOffice::UNO module.
On the OpenOffice::UNO page you will also find a sample Perl scriptlet which opens a document; all you then need to do is export it to txt by using the document.storeToURL() method. See a Python example, which can be easily adapted to your Perl needs.
score:4
I strongly recommend AsposeWords if you can do Java or .NET. It can convert, without Word installed, between all major text file types.
score:4
If you have some flavour of unix installed, you can use the ‘strings’ utility to find and extract all readable strings from the document. There will be some mess before and after the text you are looking for, but the results will be readable.
score:5
For .doc, I’ve had some success with the linux command line tool antiword. It extracts the text from .doc very quickly, giving a good rendering of indentation. Then you can pipe that to a text file in bash.
For .docx, I’ve used the OOXML SDK as some other users mentioned. It is just a .NET library to make it easier to work with the OOXML that is zipped up in an OOXML file. There is a lot of metadata that you will want to discard if you are only interested in the text. Some other people have already written the code I see: DocXToText.
I have also found that Aspose.Words has a very simple API with great support.
There is also this bash command from commandlinefu.com which works by unzipping the .docx:
unzip -p some.docx word/document.xml | sed -e 's/<[^>]\{1,\}>//g; s/[^[:print:]]\{1,\}//g'
score:13
A simple Perl only solution for docx:
- Use Archive::Zip to get the word/document.xml file from your docx file. (A docx is just a zipped archive.)
- Use XML::LibXML to parse it.
- Then use XML::LibXSLT to transform it into text or HTML format. Search the web to find a nice docx2txt.xsl file.
Cheers !
J.
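The unzip-then-parse approach from this answer can be sketched in Python with only the standard library. This is a minimal sketch, not a full converter: `docx_to_text` is a hypothetical helper, and it assumes the body text lives in `<w:t>` elements inside `<w:p>` paragraphs of word/document.xml, ignoring tabs, line breaks, headers, and footnotes.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used by word/document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def docx_to_text(source):
    """Return the text of a .docx (path or file-like object), one line per paragraph."""
    with zipfile.ZipFile(source) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for para in root.iter(f"{{{W_NS}}}p"):
        # Text runs live in <w:t> elements; tabs and breaks are ignored here.
        runs = [node.text or "" for node in para.iter(f"{{{W_NS}}}t")]
        paragraphs.append("".join(runs))
    return "\n".join(paragraphs)
```

Calling `docx_to_text("input.docx")` on a real file (the file name is only an example) returns the paragraphs joined by newlines. Paragraphs nested inside other paragraphs, such as text boxes, may appear more than once with this simple traversal.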
The Table object in Excel and Word is underutilized by most users and developers of Excel and Word, in my not-so-humble opinion. For example, an Excel Table object with formulas calculates faster and takes less storage space than does the comparable normal range. As useful as tables are in Excel and Word, there are times you prefer normal text.
This article explains how to convert a table to normal text manually and using VBA for both Excel and Word. These instructions apply to Office 365/Office 2019 for Windows. These instructions have not changed much from legacy versions of Word and Excel for Windows and Mac OS since 2007.
Convert an Excel Table to a Normal Range
The steps to convert a Table object to text are almost the same for Word and Excel.
1. Position the cursor anywhere in the table. This is usually done by clicking a cell.
2. From the Table Design tab, choose Convert to Range (Tools group).
3. Choose Yes to confirm.
Convert a Word Table to Text
Like Excel, a table cell in Word is the intersection of a row and column, as shown here.
1. Position the cursor anywhere in the table. This is usually done by clicking text.
2. From the Layout tab, choose Convert to Text (Data group).
3. Choose how to separate the text between table cells.
3a. If you choose Tabs, the table layout is preserved with tab stops between cells, as shown. This option also works for saving a text file with Tab as the delimiter.
3b. If you choose Paragraphs, the text forms a single list, starting with the left column.
3c. The choice Commas resembles Tabs, except commas are inserted instead of tabs. Choose this option to save the text as a CSV file (Comma-Separated Values). The choice Other gives you a similar result: the character you choose becomes the delimiter.
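One thing to check before saving comma-separated text as CSV is whether any cell text itself contains a comma, because a plain comma join then becomes ambiguous. The sketch below illustrates the difference in Python; `rows_to_csv` is a hypothetical helper and the sample rows are made up.

```python
import csv
import io


def rows_to_csv(rows):
    """Serialize rows as CSV text, quoting cells that contain commas or quotes."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


rows = [["Name", "City"], ["Smith, Ann", "Oslo"]]
naive = "\n".join(",".join(cells) for cells in rows)
print(naive)              # second line splits into three fields when re-read
print(rows_to_csv(rows))  # second line keeps "Smith, Ann" as one quoted field
```

If the table text has embedded commas, choosing Tabs or another character under Other avoids the ambiguity.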
Convert an Excel Table to a Normal Range Using VBA
The Record Macro command does not record converting a table to a normal range, so you have to type it yourself. The simplest approach is to add one statement to your procedure. The statement uses the Unlist method of the ListObject object. Reference the table by number to convert the first or only table on the active sheet:
ActiveSheet.ListObjects(1).Unlist
Reference the table by name to convert one table on the active sheet with more than one table. Use the name of the table shown on Table Design tab (Properties group):
ActiveSheet.ListObjects("MyTable").Unlist
You can convert all the tables on the active sheet to a range using this procedure:
Sub ConvertTablesToRange()
    Dim ws As Worksheet
    Dim tbl As ListObject
    Set ws = ActiveWorkbook.ActiveSheet
    For Each tbl In ws.ListObjects
        tbl.Unlist
    Next
End Sub
Adding a nested For-Next loop converts all the tables in the active workbook:
Sub ConvertTablesToRange()
    Dim ws As Worksheet
    Dim tbl As ListObject
    For Each ws In ActiveWorkbook.Worksheets
        For Each tbl In ws.ListObjects
            tbl.Unlist
        Next
    Next
End Sub
Convert a Word Table to Text Using VBA
The VBA code to convert a Word table object is similar to that of Excel. The comparable statement uses the ConvertToText method of the Table object.
ActiveDocument.Range.Tables(1).ConvertToText Separator:=wdSeparateByTabs
Word does not name its tables, so manually assign a bookmark to the table if you have to use VBA to convert a specific table to text. To assign a bookmark, first position the cursor in a table cell. From the Insert tab, choose the Links down-arrow button, Bookmark (Step 1, below). For Bookmark Name, type a unique name with no spaces and choose Add (Step 2).
Step 1. Bookmark command
Step 2. Add new bookmark
Here is the statement to convert a “named” table in a document to text. Tables(1) refers to the first table in the range of the “MyTable” bookmark, not the first table in the document.
ActiveDocument.Bookmarks("MyTable").Range.Tables(1).ConvertToText Separator:=wdSeparateByTabs
The VBA code to convert all the tables in a Word document resembles the code to convert all the tables on an Excel worksheet:
Sub ConvertTablesToText()
    Dim tbl As Table
    For Each tbl In ActiveDocument.Tables
        tbl.ConvertToText Separator:=wdSeparateByTabs
    Next
End Sub
Remember to include the Separator parameter, unless you want to accept the default delimiter of hyphen (-):
- wdSeparateByTabs
- wdSeparateByCommas
- wdSeparateByParagraphs
- wdSeparateByDefaultListSeparator
If you need to convert multiple Word files to other formats, like TXT, RTF, HTML or PDF, run the script below.
Option Explicit
Sub ChangeDocsToTxtOrRTFOrHTML()
'with export to PDF in Word 2007
Dim fs As Object
Dim oFolder As Object
Dim tFolder As Object
Dim oFile As Object
Dim strDocName As String
Dim intPos As Integer
Dim locFolder As String
Dim fileType As String
On Error Resume Next
locFolder = InputBox("Enter the folder path to DOCs", "File Conversion", "C:\Users\your_path_here\")
Select Case Application.Version
Case Is < 12
Do
fileType = UCase(InputBox("Change DOC to TXT, RTF, HTML", "File Conversion", "TXT"))
Loop Until (fileType = "TXT" Or fileType = "RTF" Or fileType = "HTML")
Case Is >= 12
Do
fileType = UCase(InputBox("Change DOC to TXT, RTF, HTML or PDF(2007+ only)", "File Conversion", "TXT"))
Loop Until (fileType = "TXT" Or fileType = "RTF" Or fileType = "HTML" Or fileType = "PDF")
End Select
Application.ScreenUpdating = False
Set fs = CreateObject("Scripting.FileSystemObject")
Set oFolder = fs.GetFolder(locFolder)
Set tFolder = fs.CreateFolder(locFolder & "Converted")
Set tFolder = fs.GetFolder(locFolder & "Converted")
For Each oFile In oFolder.Files
Dim d As Document
Set d = Application.Documents.Open(oFile.Path)
strDocName = ActiveDocument.Name
intPos = InStrRev(strDocName, ".")
strDocName = Left(strDocName, intPos - 1)
ChangeFileOpenDirectory tFolder
Select Case fileType
Case Is = "TXT"
strDocName = strDocName & ".txt"
ActiveDocument.SaveAs FileName:=strDocName, FileFormat:=wdFormatText
Case Is = "RTF"
strDocName = strDocName & ".rtf"
ActiveDocument.SaveAs FileName:=strDocName, FileFormat:=wdFormatRTF
Case Is = "HTML"
strDocName = strDocName & ".html"
ActiveDocument.SaveAs FileName:=strDocName, FileFormat:=wdFormatFilteredHTML
Case Is = "PDF"
strDocName = strDocName & ".pdf"
ActiveDocument.ExportAsFixedFormat OutputFileName:=strDocName, ExportFormat:=wdExportFormatPDF
End Select
d.Close
ChangeFileOpenDirectory oFolder
Next oFile
Application.ScreenUpdating = True
End Sub
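The macro above opens every file in the folder and saves it under the same base name in a "Converted" subfolder. The naming logic can be sketched on its own in Python; `plan_conversions` and `WORD_SUFFIXES` are hypothetical names, and filtering to .doc/.docx files is an assumption added here (the macro itself tries every file it finds).

```python
from pathlib import Path

# Extensions this sketch treats as Word documents (an assumption, not from the macro).
WORD_SUFFIXES = {".doc", ".docx"}


def plan_conversions(folder, file_type):
    """Return (source, target) path pairs for converting every Word file in `folder`."""
    folder = Path(folder)
    target_dir = folder / "Converted"
    suffix = "." + file_type.lower()
    pairs = []
    for source in sorted(folder.iterdir()):
        # Skip subfolders, non-Word files, and Word's "~$" lock files.
        if not source.is_file() or source.suffix.lower() not in WORD_SUFFIXES:
            continue
        if source.name.startswith("~$"):
            continue
        pairs.append((source, target_dir / (source.stem + suffix)))
    return pairs
```

The function only computes names; the conversion itself still has to be done by Word or another tool.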
This is somewhat a continuation on my previous post VBA – Convert XLS to XLSX in which I provided a simple little procedure to upgrade an older xls file to the newer xlsx file format.
I thought to myself, wouldn’t it be nice to have a more versatile function that could migrate between various other common file formats?
So I set out to take my original function and transform it to enable the user to specify the desired output format, and came up with a nice function that enables anyone to convert Excel compatible files to another Excel compatible format.
Then I said to myself, it must be possible to do something similar for Word, and set out to create a function that would enable people to convert files between the various Word compatible formats.
Below are the 2 functions I came up with.
Excel File Format Conversion Function
The following function can be used to convert files between:
- csv -> xlsx
- xls -> xlsx
- xls -> xlsm
- xls -> txt
- xlsx -> txt
- xlsx -> csv
- and so on…
Enum XlFileFormat
    'Ref: https://msdn.microsoft.com/en-us/vba/excel-vba/articles/xlfileformat-enumeration-excel
    xlAddIn = 18 'Microsoft Excel 97-2003 Add-In *.xla
    xlAddIn8 = 18 'Microsoft Excel 97-2003 Add-In *.xla
    xlCSV = 6 'CSV *.csv
    xlCSVMac = 22 'Macintosh CSV *.csv
    xlCSVMSDOS = 24 'MSDOS CSV *.csv
    xlCSVWindows = 23 'Windows CSV *.csv
    xlCurrentPlatformText = -4158 'Current Platform Text *.txt
    xlDBF2 = 7 'Dbase 2 format *.dbf
    xlDBF3 = 8 'Dbase 3 format *.dbf
    xlDBF4 = 11 'Dbase 4 format *.dbf
    xlDIF = 9 'Data Interchange format *.dif
    xlExcel12 = 50 'Excel Binary Workbook *.xlsb
    xlExcel2 = 16 'Excel version 2.0 (1987) *.xls
    xlExcel2FarEast = 27 'Excel version 2.0 far east (1987) *.xls
    xlExcel3 = 29 'Excel version 3.0 (1990) *.xls
    xlExcel4 = 33 'Excel version 4.0 (1992) *.xls
    xlExcel4Workbook = 35 'Excel version 4.0. Workbook format (1992) *.xlw
    xlExcel5 = 39 'Excel version 5.0 (1994) *.xls
    xlExcel7 = 39 'Excel 95 (version 7.0) *.xls
    xlExcel8 = 56 'Excel 97-2003 Workbook *.xls
    xlExcel9795 = 43 'Excel version 95 and 97 *.xls
    xlHtml = 44 'HTML format *.htm; *.html
    xlIntlAddIn = 26 'International Add-In No file extension
    xlIntlMacro = 25 'International Macro No file extension
    xlOpenDocumentSpreadsheet = 60 'OpenDocument Spreadsheet *.ods
    xlOpenXMLAddIn = 55 'Open XML Add-In *.xlam
    xlOpenXMLStrictWorkbook = 61 '(&H3D) Strict Open XML file *.xlsx
    xlOpenXMLTemplate = 54 'Open XML Template *.xltx
    xlOpenXMLTemplateMacroEnabled = 53 'Open XML Template Macro Enabled *.xltm
    xlOpenXMLWorkbook = 51 'Open XML Workbook *.xlsx
    xlOpenXMLWorkbookMacroEnabled = 52 'Open XML Workbook Macro Enabled *.xlsm
    xlSYLK = 2 'Symbolic Link format *.slk
    xlTemplate = 17 'Excel Template format *.xlt
    xlTemplate8 = 17 ' Template 8 *.xlt
    xlTextMac = 19 'Macintosh Text *.txt
    xlTextMSDOS = 21 'MSDOS Text *.txt
    xlTextPrinter = 36 'Printer Text *.prn
    xlTextWindows = 20 'Windows Text *.txt
    xlUnicodeText = 42 'Unicode Text No file extension; *.txt
    xlWebArchive = 45 'Web Archive *.mht; *.mhtml
    xlWJ2WD1 = 14 'Japanese 1-2-3 *.wj2
    xlWJ3 = 40 'Japanese 1-2-3 *.wj3
    xlWJ3FJ3 = 41 'Japanese 1-2-3 format *.wj3
    xlWK1 = 5 'Lotus 1-2-3 format *.wk1
    xlWK1ALL = 31 'Lotus 1-2-3 format *.wk1
    xlWK1FMT = 30 'Lotus 1-2-3 format *.wk1
    xlWK3 = 15 'Lotus 1-2-3 format *.wk3
    xlWK3FM3 = 32 'Lotus 1-2-3 format *.wk3
    xlWK4 = 38 'Lotus 1-2-3 format *.wk4
    xlWKS = 4 'Lotus 1-2-3 format *.wks
    xlWorkbookDefault = 51 'Workbook default *.xlsx
    xlWorkbookNormal = -4143 'Workbook normal *.xls
    xlWorks2FarEast = 28 'Microsoft Works 2.0 far east format *.wks
    xlWQ1 = 34 'Quattro Pro format *.wq1
    xlXMLSpreadsheet = 46 'XML Spreadsheet *.xml
End Enum

'---------------------------------------------------------------------------------------
' Procedure : XLS_ConvertFileFormat
' Author    : Daniel Pineault, CARDA Consultants Inc.
' Website   : http://www.cardaconsultants.com
' Purpose   : Converts an Excel compatible file format to another format
' Copyright : The following is release as Attribution-ShareAlike 4.0 International
'             (CC BY-SA 4.0) - https://creativecommons.org/licenses/by-sa/4.0/
' Req'd Refs: Uses Late Binding, so none required
'
' Input Variables:
' ~~~~~~~~~~~~~~~~
' sOrigFile     : String - Original file path, name and extension to be converted
' lNewFileFormat: New File format to save the original file as
' bDelOrigFile  : True/False - Should the original file be deleted after the conversion
'
' Usage:
' ~~~~~~
' Convert an xls file into a txt file and delete the xls once completed
'   Call XLS_ConvertFileFormat("C:\Temp\Test.xls", xlTextWindows, True)
' Convert an xls file into a xlsx file and NOT delete the xls once completed
'   Call XLS_ConvertFileFormat("C:\Temp\Test.xls", xlWorkbookDefault, False)
' Convert a csv file into a xlsx file and delete the csv once completed
'   Call XLS_ConvertFileFormat("C:\Temp\Test.csv", xlWorkbookDefault, True)
'
' Revision History:
' Rev       Date(yyyy/mm/dd)        Description
' **************************************************************************************
' 1         2018-02-27              Initial Release
' 2         2020-12-31              Fixed typo xlDBF24 -> xlDBF4
'---------------------------------------------------------------------------------------
Function XLS_ConvertFileFormat(ByVal sOrigFile As String, _
                               Optional lNewFileFormat As XlFileFormat = xlOpenXMLWorkbook, _
                               Optional bDelOrigFile As Boolean = False) As Boolean
    '#Const EarlyBind = True 'Use Early Binding, Req. Reference Library
    #Const EarlyBind = False 'Use Late Binding
    #If EarlyBind = True Then
        'Early Binding Declarations
        Dim oExcel As Excel.Application
        Dim oExcelWrkBk As Excel.Workbook
    #Else
        'Late Binding Declaration/Constants
        Dim oExcel As Object
        Dim oExcelWrkBk As Object
    #End If
    Dim bExcelOpened As Boolean
    Dim sOrigFileExt As String
    Dim sNewFileExt As String

    'Determine the file extension associated with the requested file format
    'for properly renaming the output file
    Select Case lNewFileFormat
        Case xlAddIn, xlAddIn8
            sNewFileExt = ".xla"
        Case xlCSV, xlCSVMac, xlCSVMSDOS, xlCSVWindows
            sNewFileExt = ".csv"
        Case xlCurrentPlatformText, xlTextMac, xlTextMSDOS, xlTextWindows, xlUnicodeText
            sNewFileExt = ".txt"
        Case xlDBF2, xlDBF3, xlDBF4
            sNewFileExt = ".dbf"
        Case xlDIF
            sNewFileExt = ".dif"
        Case xlExcel12 'Excel Binary Workbook *.xlsb
            sNewFileExt = ".xlsb"
        Case xlExcel2, xlExcel2FarEast, xlExcel3, xlExcel4, xlExcel5, xlExcel7, _
             xlExcel8, xlExcel9795, xlWorkbookNormal
            sNewFileExt = ".xls"
        Case xlExcel4Workbook 'Excel version 4.0. Workbook format (1992) *.xlw
            sNewFileExt = ".xlw"
        Case xlHtml 'HTML format *.htm; *.html
            sNewFileExt = ".html"
        Case xlIntlAddIn, xlIntlMacro
            sNewFileExt = ""
        Case xlOpenDocumentSpreadsheet 'OpenDocument Spreadsheet *.ods
            sNewFileExt = ".ods"
        Case xlOpenXMLAddIn 'Open XML Add-In *.xlam
            sNewFileExt = ".xlam"
        Case xlOpenXMLStrictWorkbook, xlOpenXMLWorkbook, xlWorkbookDefault
            sNewFileExt = ".xlsx"
        Case xlOpenXMLTemplate 'Open XML Template *.xltx
            sNewFileExt = ".xltx"
        Case xlOpenXMLTemplateMacroEnabled 'Open XML Template Macro Enabled *.xltm
            sNewFileExt = ".xltm"
        Case xlOpenXMLWorkbookMacroEnabled 'Open XML Workbook Macro Enabled *.xlsm
            sNewFileExt = ".xlsm"
        Case xlSYLK 'Symbolic Link format *.slk
            sNewFileExt = ".slk"
        Case xlTemplate, xlTemplate8 ' Template 8 *.xlt
            sNewFileExt = ".xlt"
        Case xlTextPrinter 'Printer Text *.prn
            sNewFileExt = ".prn"
        Case xlWebArchive 'Web Archive *.mht; *.mhtml
            sNewFileExt = ".mhtml"
        Case xlWJ2WD1 'Japanese 1-2-3 *.wj2
            sNewFileExt = ".wj2"
        Case xlWJ3, xlWJ3FJ3 'Japanese 1-2-3 format *.wj3
            sNewFileExt = ".wj3"
        Case xlWK1, xlWK1ALL, xlWK1FMT 'Lotus 1-2-3 format *.wk1
            sNewFileExt = ".wk1"
        Case xlWK3, xlWK3FM3 'Lotus 1-2-3 format *.wk3
            sNewFileExt = ".wk3"
        Case xlWK4 'Lotus 1-2-3 format *.wk4
            sNewFileExt = ".wk4"
        Case xlWKS, xlWorks2FarEast 'Lotus 1-2-3 format *.wks
            sNewFileExt = ".wks"
        Case xlWQ1 'Quattro Pro format *.wq1
            sNewFileExt = ".wq1"
        Case xlXMLSpreadsheet 'XML Spreadsheet *.xml
            sNewFileExt = ".xml"
    End Select

    'Determine the original file's extension for properly renaming the output file
    sOrigFileExt = "." & Right(sOrigFile, Len(sOrigFile) - InStrRev(sOrigFile, "."))

    'Start Excel
    On Error Resume Next
    Set oExcel = GetObject(, "Excel.Application") 'Bind to existing instance of Excel
    If Err.Number <> 0 Then 'Could not get instance of Excel, so create a new one
        Err.Clear
        On Error GoTo Error_Handler
        Set oExcel = CreateObject("Excel.Application")
    Else 'Excel was already running
        bExcelOpened = True
    End If
    On Error GoTo Error_Handler

    oExcel.ScreenUpdating = False
    oExcel.Visible = False 'Keep Excel hidden until we are done with our manipulation
    Set oExcelWrkBk = oExcel.Workbooks.Open(sOrigFile) 'Open the original file

    'Save it as the requested new file format
    oExcelWrkBk.SaveAs Replace(sOrigFile, sOrigFileExt, sNewFileExt), lNewFileFormat, , , , False
    XLS_ConvertFileFormat = True 'Report back that we managed to save the file in the new format

    oExcelWrkBk.Close False 'Close the workbook
    If bExcelOpened = False Then
        oExcel.Quit 'Quit Excel only if we started it
    Else
        oExcel.ScreenUpdating = True
        oExcel.Visible = True
    End If

    If bDelOrigFile = True Then Kill (sOrigFile) 'Delete the original file if requested

Error_Handler_Exit:
    On Error Resume Next
    Set oExcelWrkBk = Nothing
    Set oExcel = Nothing
    Exit Function

Error_Handler:
    MsgBox "The following error has occurred" & vbCrLf & vbCrLf & _
           "Error Number: " & Err.Number & vbCrLf & _
           "Error Source: XLS_ConvertFileFormat" & vbCrLf & _
           "Error Description: " & Err.Description & _
           Switch(Erl = 0, "", Erl <> 0, vbCrLf & "Line No: " & Erl) _
           , vbOKOnly + vbCritical, "An Error has Occurred!"
    oExcel.ScreenUpdating = True
    oExcel.Visible = True 'Make excel visible to the user
    Resume Error_Handler_Exit
End Function
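The function builds the output name with Replace(sOrigFile, sOrigFileExt, sNewFileExt), which substitutes every occurrence of the old extension in the path, not just the last one. The Python sketch below illustrates the difference; `swap_extension` is a hypothetical helper and the sample path is made up.

```python
import ntpath  # Windows path rules, regardless of the OS running this sketch


def swap_extension(path, new_ext):
    """Replace only the final extension of a Windows-style path."""
    stem, _old_ext = ntpath.splitext(path)
    return stem + new_ext


path = r"C:\Temp\archive.xls\report.xls"
print(path.replace(".xls", ".xlsx"))   # rewrites the folder name too
print(swap_extension(path, ".xlsx"))   # touches only the file's extension
```

In VBA, the same effect could likely be had by taking Left(sOrigFile, InStrRev(sOrigFile, ".") - 1) and appending the new extension, but that variant is untested here.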
Word File Format Conversion Function
The following function can be used to convert files between:
- doc -> docx
- docx -> dotx
- docx -> pdf
- docx -> html
- and so on…
Enum WdSaveFormat
    'Ref: https://msdn.microsoft.com/en-us/vba/word-vba/articles/wdsaveformat-enumeration-word
    wdFormatDocument = 0 'Microsoft Office Word 97 - 2003 binary file format.
    wdFormatDOSText = 4 'Microsoft DOS text format. *.txt
    wdFormatDOSTextLineBreaks = 5 'Microsoft DOS text with line breaks preserved. *.txt
    wdFormatEncodedText = 7 'Encoded text format. *.txt
    wdFormatFilteredHTML = 10 'Filtered HTML format.
    wdFormatFlatXML = 19 'Open XML file format saved as a single XML file.
    ' wdFormatFlatXML = 20 'Open XML file format with macros enabled saved as a single XML file.
    wdFormatFlatXMLTemplate = 21 'Open XML template format saved as a XML single file.
    wdFormatFlatXMLTemplateMacroEnabled = 22 'Open XML template format with macros enabled saved as a single XML file.
    wdFormatOpenDocumentText = 23 'OpenDocument Text format. *.odt
    wdFormatHTML = 8 'Standard HTML format. *.html
    wdFormatRTF = 6 'Rich text format (RTF). *.rtf
    wdFormatStrictOpenXMLDocument = 24 'Strict Open XML document format.
    wdFormatTemplate = 1 'Word template format.
    wdFormatText = 2 'Microsoft Windows text format. *.txt
    wdFormatTextLineBreaks = 3 'Windows text format with line breaks preserved. *.txt
    wdFormatUnicodeText = 7 'Unicode text format. *.txt
    wdFormatWebArchive = 9 'Web archive format.
    wdFormatXML = 11 'Extensible Markup Language (XML) format. *.xml
    wdFormatDocument97 = 0 'Microsoft Word 97 document format. *.doc
    wdFormatDocumentDefault = 16 'Word default document file format. For Word, this is the DOCX format. *.docx
    wdFormatPDF = 17 'PDF format. *.pdf
    wdFormatTemplate97 = 1 'Word 97 template format.
    wdFormatXMLDocument = 12 'XML document format.
    wdFormatXMLDocumentMacroEnabled = 13 'XML document format with macros enabled.
    wdFormatXMLTemplate = 14 'XML template format.
    wdFormatXMLTemplateMacroEnabled = 15 'XML template format with macros enabled.
    wdFormatXPS = 18 'XPS format. *.xps
End Enum

'---------------------------------------------------------------------------------------
' Procedure : Word_ConvertFileFormat
' Author    : Daniel Pineault, CARDA Consultants Inc.
' Website   : http://www.cardaconsultants.com
' Purpose   : Converts a Word compatible file format to another format
' Copyright : The following is release as Attribution-ShareAlike 4.0 International
'             (CC BY-SA 4.0) - https://creativecommons.org/licenses/by-sa/4.0/
' Req'd Refs: Uses Late Binding, so none required
'
' Input Variables:
' ~~~~~~~~~~~~~~~~
' sOrigFile     : String - Original file path, name and extension to be converted
' lNewFileFormat: New File format to save the original file as
' bDelOrigFile  : True/False - Should the original file be deleted after the conversion
'
' Usage:
' ~~~~~~
' Convert a doc file into a pdf file but retain the original copy
'   Call Word_ConvertFileFormat("C:\Users\Daniel\Documents\Resume.doc", wdFormatPDF)
' Convert a doc file into a pdf file and delete the original doc once converted
'   Call Word_ConvertFileFormat("C:\Users\Daniel\Documents\Resume.doc", wdFormatPDF, True)
'
' Revision History:
' Rev       Date(yyyy/mm/dd)        Description
' **************************************************************************************
' 1         2018-02-27              Initial Release
'---------------------------------------------------------------------------------------
Function Word_ConvertFileFormat(ByVal sOrigFile As String, _
                                Optional lNewFileFormat As WdSaveFormat = wdFormatDocumentDefault, _
                                Optional bDelOrigFile As Boolean = False) As Boolean
    '#Const EarlyBind = True 'Use Early Binding, Req. Reference Library
    #Const EarlyBind = False 'Use Late Binding
    #If EarlyBind = True Then
        'Early Binding Declarations
        Dim oWord As Word.Application
        Dim oDoc As Word.Document
    #Else
        'Late Binding Declaration/Constants
        Dim oWord As Object
        Dim oDoc As Object
    #End If
    Dim bWordOpened As Boolean
    Dim sOrigFileExt As String
    Dim sNewFileExt As String

    'Determine the file extension associated with the requested file format
    'for properly renaming the output file
    Select Case lNewFileFormat
        Case wdFormatDocument, wdFormatDocument97 'Both constants are 0, so they share one branch
            sNewFileExt = ".doc"
        Case wdFormatDOSText, wdFormatDOSTextLineBreaks, wdFormatEncodedText, _
             wdFormatText, wdFormatTextLineBreaks, wdFormatUnicodeText
            sNewFileExt = ".txt"
        Case wdFormatOpenDocumentText 'OpenDocument Text format. *.odt
            sNewFileExt = ".odt"
        Case wdFormatFilteredHTML, wdFormatHTML
            sNewFileExt = ".html"
        Case wdFormatFlatXML, wdFormatXML, wdFormatXMLDocument
            sNewFileExt = ".xml"
        Case wdFormatFlatXMLTemplate
            sNewFileExt = "."
        Case wdFormatFlatXMLTemplateMacroEnabled
            sNewFileExt = "."
        Case wdFormatRTF
            sNewFileExt = ".rtf"
        Case wdFormatStrictOpenXMLDocument
            sNewFileExt = "."
        Case wdFormatTemplate, wdFormatTemplate97 'Both constants are 1, so they share one branch
            sNewFileExt = "."
        Case wdFormatWebArchive
            sNewFileExt = "."
        Case wdFormatDocumentDefault
            sNewFileExt = ".docx"
        Case wdFormatPDF
            sNewFileExt = ".pdf"
        Case wdFormatXMLDocumentMacroEnabled
            sNewFileExt = ".docm"
        Case wdFormatXMLTemplate
            sNewFileExt = ".dotx"
        Case wdFormatXMLTemplateMacroEnabled
            sNewFileExt = "."
        Case wdFormatXPS
            sNewFileExt = ".xps"
    End Select

    'Determine the original file's extension for properly renaming the output file
    sOrigFileExt = "." & Right(sOrigFile, Len(sOrigFile) - InStrRev(sOrigFile, "."))

    'Start Word
    On Error Resume Next
    Set oWord = GetObject(, "Word.Application") 'Bind to existing instance of Word
    If Err.Number <> 0 Then 'Could not get instance of Word, so create a new one
        Err.Clear
        On Error GoTo Error_Handler
        Set oWord = CreateObject("Word.Application")
    Else 'Word was already running
        bWordOpened = True
    End If
    On Error GoTo Error_Handler

    oWord.Visible = False 'Keep Word hidden until we are done with our manipulation
    Set oDoc = oWord.Documents.Open(sOrigFile) 'Open the original file

    'Save it as the requested new file format
    oDoc.SaveAs2 Replace(sOrigFile, sOrigFileExt, sNewFileExt), lNewFileFormat
    Word_ConvertFileFormat = True 'Report back that we managed to save the file in the new format

    oDoc.Close False 'Close the document
    If bWordOpened = False Then
        oWord.Quit 'Quit Word only if we started it
    Else
        oWord.Visible = True 'Since it was already open, ensure it is visible
    End If

    If bDelOrigFile = True Then Kill (sOrigFile) 'Delete the original file if requested

Error_Handler_Exit:
    On Error Resume Next
    Set oDoc = Nothing
    Set oWord = Nothing
    Exit Function

Error_Handler:
    MsgBox "The following error has occurred" & vbCrLf & vbCrLf & _
           "Error Number: " & Err.Number & vbCrLf & _
           "Error Source: Word_ConvertFileFormat" & vbCrLf & _
           "Error Description: " & Err.Description & _
           Switch(Erl = 0, "", Erl <> 0, vbCrLf & "Line No: " & Erl) _
           , vbOKOnly + vbCritical, "An Error has Occurred!"
    oWord.Visible = True 'Make Word visible to the user
    Resume Error_Handler_Exit
End Function
Missing File Extensions
Unlike the Excel function, the Word function is currently missing some of the file extensions. I created the general framework, but could not easily find the file extensions associated with some of the file formats. You need only complete the missing entries and it will work. So simply update the sNewFileExt = "." entries as applicable.
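As a starting point for filling those gaps, here is a lookup table sketched in Python. The numeric keys are the WdSaveFormat values from the enum above; the extensions are the ones Word conventionally uses as far as I know, so treat each one as an assumption to verify in Word's Save As dialog. `WD_SAVE_FORMAT_EXT` and `extension_for` are hypothetical names.

```python
# WdSaveFormat value -> conventional file extension (assumptions to verify).
WD_SAVE_FORMAT_EXT = {
    0: ".doc",    # wdFormatDocument / wdFormatDocument97
    1: ".dot",    # wdFormatTemplate / wdFormatTemplate97
    2: ".txt",    # wdFormatText
    3: ".txt",    # wdFormatTextLineBreaks
    4: ".txt",    # wdFormatDOSText
    5: ".txt",    # wdFormatDOSTextLineBreaks
    6: ".rtf",    # wdFormatRTF
    7: ".txt",    # wdFormatEncodedText / wdFormatUnicodeText
    8: ".html",   # wdFormatHTML
    9: ".mht",    # wdFormatWebArchive
    10: ".html",  # wdFormatFilteredHTML
    11: ".xml",   # wdFormatXML
    12: ".docx",  # wdFormatXMLDocument
    13: ".docm",  # wdFormatXMLDocumentMacroEnabled
    14: ".dotx",  # wdFormatXMLTemplate
    15: ".dotm",  # wdFormatXMLTemplateMacroEnabled
    16: ".docx",  # wdFormatDocumentDefault
    17: ".pdf",   # wdFormatPDF
    18: ".xps",   # wdFormatXPS
    19: ".xml",   # wdFormatFlatXML
    21: ".xml",   # wdFormatFlatXMLTemplate
    22: ".xml",   # wdFormatFlatXMLTemplateMacroEnabled
    23: ".odt",   # wdFormatOpenDocumentText
    24: ".docx",  # wdFormatStrictOpenXMLDocument
}


def extension_for(save_format):
    """Look up the extension for a WdSaveFormat value; fail loudly if unknown."""
    try:
        return WD_SAVE_FORMAT_EXT[save_format]
    except KeyError:
        raise ValueError(f"No extension known for WdSaveFormat value {save_format}")
```

Raising an error for unknown values is a deliberate choice in this sketch: a silent fallback such as "." would produce files without a usable extension.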