I use the CarlosAG DLL, which creates an XML Excel file for me (inside a MemoryStream).
Response.ContentType = "application/vnd.ms-excel";
Response.AppendHeader("content-disposition", "myfile.xml");
memory.WriteTo(Response.OutputStream);
My problem is that on the client side I get a myfile.xls (IE) or a myfile.xml.xls (FF), and therefore an annoying security warning from Excel.
I also tried application/vnd.openxmlformats-officedocument.spreadsheetml.sheet (xlsx), but then the file won't even open.
So I need to either cut the .xml and send it as vnd.ms-excel (how?) or use another MIME type (but which one?).
edit: I found a bug description here
I wonder if this is still open and why?
asked Nov 17, 2011 at 11:17
UNeverNo
Use it like this:
Response.ContentType = "application/vnd.ms-excel";
Response.AppendHeader("content-disposition", "attachment; filename=myfile.xls");
For Excel 2007 and above the MIME type differs
Response.ContentType = "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet";
Response.AppendHeader("content-disposition", "attachment; filename=myfile.xlsx");
See list of MIME types
Office 2007 File Format MIME Types
EDIT:
If the content is not a native Excel file format, but is instead a text-based format (such as CSV, TXT, XML), then the web site can add the following HTTP header to their GET response to tell IE to use an alternate name, and in the name you can set the extension to the right content type:
Response.AddHeader "Content-Disposition", "Attachment;Filename=myfile.csv"
For more details see this link
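The header pair used in this answer can be sketched in a language-neutral way. The following Python snippet is only an illustration: the function name `download_headers` and its parameters are hypothetical, and it assumes a plain ASCII file name without quotes or backslashes.

```python
def download_headers(filename, content_type, inline=False):
    """Return the two response headers for a file download (illustrative helper)."""
    # "attachment" asks the browser to save the file, "inline" to display it.
    disposition = "inline" if inline else "attachment"
    return {
        "Content-Type": content_type,
        # The file name, including its extension, is what the client saves.
        "Content-Disposition": f'{disposition}; filename="{filename}"',
    }


headers = download_headers("myfile.xls", "application/vnd.ms-excel")
print(headers["Content-Disposition"])
```

The point is that the disposition type (attachment or inline) and the filename parameter travel together in one header value; sending only a bare file name, as in the question, is not a valid Content-Disposition value.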
answered Nov 17, 2011 at 11:52
Prasanth
If your document is an Excel 2003 XML document, you should use the text/xml content type.
Response.ContentType = "text/xml";
Do not specify content-disposition.
This technique works well with a handler, but not with a WebForm.
answered Jul 3, 2013 at 13:09
The security warning is NOT about the MIME type; it is a client-side security setting that you can't disable from the server side!
Another point: change Response.AppendHeader("content-disposition", "myfile.xml");
to:
Response.AppendHeader("content-disposition", "attachment; filename=myfile.xlsx");
OR
Response.AppendHeader("content-disposition", "inline; filename=myfile.xlsx");
For reference see http://www.ietf.org/rfc/rfc2183.txt
EDIT, as per comment:
If the format is not XLSX (Excel 2007 and up), then use myfile.xls in the above code.
answered Nov 17, 2011 at 11:22
Yahia
This topic lists the most common MIME types with corresponding document types, ordered by their common extensions.
The following two important MIME types are the default types:
text/plain is the default value for textual files. A textual file should be human-readable and must not contain binary data.
application/octet-stream is the default value for all other cases. An unknown file type should use this type. Browsers are particularly careful when manipulating these files to protect users from software vulnerabilities and possible dangerous behavior.
IANA is the official registry of MIME media types and maintains a list of all the official MIME types. This table lists important MIME types for the Web:
Extension | Kind of document | MIME Type |
---|---|---|
.aac | AAC audio | audio/aac |
.abw | AbiWord document | application/x-abiword |
.arc | Archive document (multiple files embedded) | application/x-freearc |
.avif | AVIF image | image/avif |
.avi | AVI: Audio Video Interleave | video/x-msvideo |
.azw | Amazon Kindle eBook format | application/vnd.amazon.ebook |
.bin | Any kind of binary data | application/octet-stream |
.bmp | Windows OS/2 Bitmap Graphics | image/bmp |
.bz | BZip archive | application/x-bzip |
.bz2 | BZip2 archive | application/x-bzip2 |
.cda | CD audio | application/x-cdf |
.csh | C-Shell script | application/x-csh |
.css | Cascading Style Sheets (CSS) | text/css |
.csv | Comma-separated values (CSV) | text/csv |
.doc | Microsoft Word | application/msword |
.docx | Microsoft Word (OpenXML) | application/vnd.openxmlformats-officedocument.wordprocessingml.document |
.eot | MS Embedded OpenType fonts | application/vnd.ms-fontobject |
.epub | Electronic publication (EPUB) | application/epub+zip |
.gz | GZip Compressed Archive | application/gzip |
.gif | Graphics Interchange Format (GIF) | image/gif |
.htm, .html | HyperText Markup Language (HTML) | text/html |
.ico | Icon format | image/vnd.microsoft.icon |
.ics | iCalendar format | text/calendar |
.jar | Java Archive (JAR) | application/java-archive |
.jpeg, .jpg | JPEG images | image/jpeg |
.js | JavaScript | text/javascript (Specifications: HTML and RFC 9239) |
.json | JSON format | application/json |
.jsonld | JSON-LD format | application/ld+json |
.mid, .midi | Musical Instrument Digital Interface (MIDI) | audio/midi, audio/x-midi |
.mjs | JavaScript module | text/javascript |
.mp3 | MP3 audio | audio/mpeg |
.mp4 | MP4 video | video/mp4 |
.mpeg | MPEG Video | video/mpeg |
.mpkg | Apple Installer Package | application/vnd.apple.installer+xml |
.odp | OpenDocument presentation document | application/vnd.oasis.opendocument.presentation |
.ods | OpenDocument spreadsheet document | application/vnd.oasis.opendocument.spreadsheet |
.odt | OpenDocument text document | application/vnd.oasis.opendocument.text |
.oga | OGG audio | audio/ogg |
.ogv | OGG video | video/ogg |
.ogx | OGG | application/ogg |
.opus | Opus audio | audio/opus |
.otf | OpenType font | font/otf |
.png | Portable Network Graphics | image/png |
.pdf | Adobe Portable Document Format (PDF) | application/pdf |
.php | Hypertext Preprocessor (Personal Home Page) | application/x-httpd-php |
.ppt | Microsoft PowerPoint | application/vnd.ms-powerpoint |
.pptx | Microsoft PowerPoint (OpenXML) | application/vnd.openxmlformats-officedocument.presentationml.presentation |
.rar | RAR archive | application/vnd.rar |
.rtf | Rich Text Format (RTF) | application/rtf |
.sh | Bourne shell script | application/x-sh |
.svg | Scalable Vector Graphics (SVG) | image/svg+xml |
.tar | Tape Archive (TAR) | application/x-tar |
.tif, .tiff | Tagged Image File Format (TIFF) | image/tiff |
.ts | MPEG transport stream | video/mp2t |
.ttf | TrueType Font | font/ttf |
.txt | Text, (generally ASCII or ISO 8859-n) | text/plain |
.vsd | Microsoft Visio | application/vnd.visio |
.wav | Waveform Audio Format | audio/wav |
.weba | WEBM audio | audio/webm |
.webm | WEBM video | video/webm |
.webp | WEBP image | image/webp |
.woff | Web Open Font Format (WOFF) | font/woff |
.woff2 | Web Open Font Format (WOFF) | font/woff2 |
.xhtml | XHTML | application/xhtml+xml |
.xls | Microsoft Excel | application/vnd.ms-excel |
.xlsx | Microsoft Excel (OpenXML) | application/vnd.openxmlformats-officedocument.spreadsheetml.sheet |
.xml | XML | application/xml is recommended as of RFC 7303 (section 4.1), but text/xml is still used sometimes. You can assign a specific MIME type to a file with .xml extension depending on how its contents are meant to be interpreted. For instance, an Atom feed is application/atom+xml, but application/xml serves as a valid default. |
.xul | XUL | application/vnd.mozilla.xul+xml |
.zip | ZIP archive | application/zip |
.3gp | 3GPP audio/video container | video/3gpp; audio/3gpp if it doesn't contain video |
.3g2 | 3GPP2 audio/video container | video/3gpp2; audio/3gpp2 if it doesn't contain video |
.7z | 7-zip archive | application/x-7z-compressed |
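As a small illustration of how such a table is typically used, the following Python sketch looks up a MIME type by file extension and falls back to application/octet-stream for unknown types, as described above. The dictionary holds only a handful of rows from the table, and the names `MIME_BY_EXTENSION` and `mime_type_for` are made up for this example.

```python
import os

# A few rows copied from the table above; extend as needed.
MIME_BY_EXTENSION = {
    ".csv": "text/csv",
    ".txt": "text/plain",
    ".xls": "application/vnd.ms-excel",
    ".xlsx": "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
    ".xml": "application/xml",
}


def mime_type_for(filename):
    """Return the MIME type for a file name, defaulting to octet-stream."""
    extension = os.path.splitext(filename)[1].lower()
    # Unknown or missing extensions use the documented default type.
    return MIME_BY_EXTENSION.get(extension, "application/octet-stream")


print(mime_type_for("Report.XLSX"))
print(mime_type_for("archive.unknown"))
```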
Microsoft Excel makes it easy to import Extensible Markup Language (XML) data that is created from other databases and applications, to map XML elements from an XML schema to worksheet cells, and to export revised XML data for interaction with other databases and applications. Think of these XML features as turning Office Excel into an XML data file generator with a familiar user interface.
In this article
- Why use XML in Excel?
- XML data and schema files
- Key XML and Excel scenarios
- The basic process of using XML data in Excel
- Working with XML maps
- Using the XML Source task pane
- Element types and their icons
- Working with single-mapped cells
- Working with repeating cells in XML tables
- XML map security considerations
- Importing XML data
- Working with an inferred schema
- Exporting XML data
- Using the Excel macro-enabled Office Open XML Format file
Why use XML in Excel?
XML is a technology that is designed for managing and sharing structured data in a human-readable text file. XML follows industry-standard guidelines and can be processed by a variety of databases and applications. Using XML, application designers can create their own customized tags, data structures, and schemas. In short, XML greatly eases the definition, transmission, validation, and interpretation of data between databases, applications, and organizations.
XML data and schema files
Excel works primarily with two types of XML files:
- XML data files (.xml), which contain the custom tags and structured data.
- Schema files (.xsd), which contain schema tags that enforce rules, such as data type and validation.
The XML standard also defines Extensible Stylesheet Language Transformation (XSLT) (.xslt) files, which are used to apply styles and transform XML data into different presentation formats. You can apply these transforms before you import XML files into Excel and after you export XML files from Excel. If XSLT files are linked to XML data files that you import into Excel, you do have the option to apply or not apply the formatting before the data is added to the worksheet, but only when you open an XML file by using the Open command from within Excel. Choose the XML Files (*.xml) file type before you click the Open button to see the XML files in the folder.
Key XML and Excel scenarios
By using XML and Excel, you can manage workbooks and data in ways that were previously impossible or very difficult. By using XML maps, you can easily add, identify, and extract specific pieces of business data from Excel documents. For example, an invoice that contains the name and address of a customer or a report that contains last quarter’s financial results are no longer just static reports. You can easily import this information from databases and applications, revise it, and export it to the same or other databases and applications.
The following are key scenarios that the XML features are designed to address:
- Extend the functionality of existing Excel templates by mapping XML elements onto existing cells. This makes it easier to get XML data into and out of your templates without having to redesign them.
- Use XML data as input to your existing calculation models by mapping XML elements onto existing worksheets.
- Import XML data files into a new workbook.
- Import XML data from a Web service into your Excel worksheet.
- Export data in mapped cells to XML data files independent from other data in the workbook.
The basic process of using XML data in Excel
The following diagram shows how the different files and operations work together when you use XML with Excel. Essentially, there are five phases to the process:
1. Adding an XML schema file (.xsd) to a workbook
2. Mapping XML schema elements to individual cells or XML tables
3. Importing an XML data file (.xml) and binding the XML elements to mapped cells
4. Entering data, moving mapped cells, and leveraging Excel functionality, while preserving XML structure and definitions
5. Exporting revised data from mapped cells to an XML data file
Working with XML maps
You can create or open a workbook in Excel, attach an XML schema file (.xsd) to the workbook, and then use the XML Source task pane to map XML elements of the schema to individual cells or tables. After you map the XML elements to your worksheet, you can import and export XML data into and out of the mapped cells.
When you add an XML schema file (.xsd) to your workbook, you create an XML map. In general, XML maps are used to create mapped cells and to manage the relationship between mapped cells and individual elements in the XML schema. In addition, these XML maps are used to bind the contents of mapped cells to elements in the schema when you import or export XML data files (.xml).
There are two kinds of mapped cells that you can create: single-mapped cells and repeating cells (which appear as XML tables). To make designing your worksheet more flexible, you can drag the mapped cells anywhere on a worksheet and into any order — even one different from the XML schema. You can also choose which elements to map and not map.
The following rules about using XML maps are important to know:
- A workbook can contain one or more XML maps.
- You can only map one element to one location in a workbook at a time.
- Each XML map is an independent entity, even if multiple XML maps in the same workbook refer to the same schema.
- An XML map can only contain one root element. If you add a schema that defines more than one root element, you are prompted to choose the root element to use for the new XML map.
Using the XML Source task pane
You use the XML Source task pane to manage XML maps. To open it, on the Developer tab, in the XML group, click Source. The following diagram shows the main features of this task pane.
1. Lists XML maps that were added to the workbook
2. Displays a hierarchical list of XML elements in the currently listed XML map
3. Sets options when working with the XML Source task pane and the XML data, such as how to preview the data and control headings
4. Opens the XML Maps dialog box, which you can use to add, delete, or rename XML maps
5. Verifies whether you can export XML data through the current XML map
Element types and their icons
The following table summarizes each type of XML element that Excel can work with and the icon that is used to represent each type of element.
Element type | Icon |
---|---|
Parent element | |
Required parent element | |
Repeating parent element | |
Required repeating parent element | |
Child element | |
Required child element | |
Repeating child element | |
Required repeating child element | |
Attribute | |
Required attribute | |
Simple content in a complex structure | |
Required simple content in a complex structure | |
Working with single-mapped cells
A single-mapped cell is a cell that has been mapped to a nonrepeating XML element. You create a single-mapped cell by dragging a nonrepeating XML element from the XML Source task pane onto a single cell in your worksheet.
When you drag a nonrepeating XML element onto the worksheet, you can use a smart tag to choose to include the XML element name as a heading above or just to the left of the single-mapped cell, or you can use an existing cell value as a heading.
You can also use a formula in a single-mapped cell, if the cell is mapped to an XML element with an XML Schema Definition (XSD) data type that Excel interprets as a number, date, or time.
Working with repeating cells in XML tables
XML tables are similar in appearance and functionality to Excel tables. An XML table is an Excel table that has been mapped to one or more XML repeating elements. Each column in the XML table represents an XML element.
An XML table is created when you:
- Use the Import command (in the XML group on the Developer tab) to import an XML data file.
- Use the Open command from within Excel to open an XML data file, and then select As an XML table in the Open XML dialog box.
- Use the From XML Data Import command (from the From Other Sources command button, in the Get External Data group, on the Data tab) to import an XML data file, and then select XML table in existing worksheet or New worksheet in the Import Data dialog box.
- Drag one or more repeating elements from the XML Source task pane to a worksheet.
When you create an XML table, the XML element names are automatically used as column headings. You can change these to any column headings that you want. However, the original XML element names are always used when you export data from the mapped cells.
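To make the idea of an XML table concrete, here is a rough Python sketch of the same mapping done outside Excel: each repeating element becomes a row, and the child element names become the column headings. It is not how Excel implements XML tables; the sample data and the function name `xml_to_table` are invented for illustration.

```python
import xml.etree.ElementTree as ET


def xml_to_table(xml_text, row_tag):
    """Turn repeating <row_tag> elements into (headings, rows)."""
    root = ET.fromstring(xml_text)
    rows = root.findall(f".//{row_tag}")
    # Column headings are the child element names, in first-seen order.
    headings = []
    for row in rows:
        for child in row:
            if child.tag not in headings:
                headings.append(child.tag)
    # Missing or empty elements become empty cells.
    table = [[(row.findtext(name) or "") for name in headings] for row in rows]
    return headings, table


SAMPLE = """<Orders>
  <Order><Id>1</Id><Customer>Ann</Customer></Order>
  <Order><Id>2</Id><Customer>Bob</Customer></Order>
</Orders>"""

headings, table = xml_to_table(SAMPLE, "Order")
print(headings)
print(table)
```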
Two options under the Options button in the XML Source task pane are useful when you work with XML tables:
- Automatically Merge Elements When Mapping: When selected, Excel creates one XML table from multiple fields as they are dropped onto the worksheet. This option works as long as the multiple fields are dropped on the same row, one adjacent to the other. When this option is cleared, each element appears as its own XML table.
- My Data Has Headings: When selected, existing heading data is used as column headings for repeating elements that you map to your worksheet. When this option is cleared, the XML element names are used as column headings.
Using XML tables, you can easily import, export, sort, filter, and print data based on an XML data source. However, XML tables do have some limitations regarding how they can be arranged on the worksheet.
- XML tables are row-based, meaning that they grow from the header row down. You cannot add new entries above existing rows.
- You cannot transpose an XML table so that new entries will be added to the right.
You can use formulas in columns that are mapped to XML elements with an XML Schema Definition (XSD) data type that Excel interprets as a number, date, or time. Just as in an Excel table, formulas in an XML table are filled down the column when new rows are added to the table.
XML map security considerations
An XML map and its data source information are saved with the Excel workbook, not a specific worksheet. A malicious user can view this map information by using a Microsoft Visual Basic for Applications (VBA) macro. Furthermore, if you save your workbook as a macro-enabled Excel Office Open XML Format File, this map information can be viewed through Microsoft Notepad or through another text-editing program.
If you want to keep using the map information but remove the potentially sensitive data source information, you can delete the data source definition of the XML schema from the workbook, but still export the XML data, by clearing the Save data source definition in workbook check box in the XML Map Properties dialog box, which is available from the Map Properties command in the XML group on the Developer tab.
If you delete a worksheet before you delete a map, the map information about the data sources, and possibly other sensitive information, is still saved in the workbook. If you are updating the workbook to remove sensitive information, make sure that you delete the XML map before you delete the worksheet, so that the map information is permanently removed from the workbook.
Importing XML data
You can import XML data into an existing XML map in your workbook. When you import data, you bind the data from the file to an XML map that is stored in your workbook. This means that each data element in the XML data file has a corresponding element, in the XML schema, that you mapped from an XML Schema file or inferred schema. Each XML map can only have one XML data binding, and an XML data binding is bound to all of the mappings that were created from a single XML map.
You can display the XML Map Properties dialog box (click Map Properties in the XML group on the Developer tab), which has three options, all selected by default, that you can set or clear to control the behavior of an XML data binding:
- Validate data against schema for import and export: Specifies whether Excel validates data against the XML map when importing data. Click this option when you want to ensure that the XML data that you import conforms to the XML schema.
- Overwrite existing data with new data: Specifies whether data is overwritten when you import data. Click this option when you want to replace the current data with new data, for example, when up-to-date data is contained in the new XML data file.
- Append new data to existing XML tables: Specifies whether the contents of the data source are appended to the existing data on the worksheet. Click this option, for example, when you are consolidating data from several similar XML data files into an XML table, or you do not want to overwrite the contents of a cell that contains a function.
When you import XML data, you may want to overwrite some mapped cells but not others. For example, some mapped cells may contain formulas and you don’t want to overwrite the formula when you import an XML file. There are two approaches that you can take:
- Unmap the elements that you don't want overwritten, before you import the XML data. After you import the XML data, you can remap the XML element to the cells containing the formulas, so that you can export the results of the formulas to the XML data file.
- Create two XML maps from the same XML schema. Use one XML map for importing the XML data. In this "Import" XML map, don't map elements to the cells that contain formulas or other data that you don't want overwritten. Use another XML map for exporting the data. In this "Export" XML map, map the elements that you want to export to an XML file.
Note: The ability to import XML data from a Web service by using a Data Retrieval Service Connection (.uxdc) file to connect to a data source is no longer supported in versions later than Excel 2003 through the user interface. If you open a workbook that was created in Excel 2003, you can still view the data, but you cannot edit or refresh the source data.
Working with an inferred schema
If you import XML data without first adding a corresponding XML schema to create an XML map, Excel tries to infer a schema for you based on the tags that are defined in the XML data file. The inferred schema is stored with the workbook, and the inferred schema allows you to work with XML data if an XML schema file isn’t associated with the workbook.
When you work with imported XML data that has an inferred schema, you can also customize the XML Source task pane. Select the Preview Data in Task Pane option from the Options button to display the first row of data as sample data in the element list, if you imported XML data associated with the XML map in the current session of Excel.
You cannot export the Excel inferred schema as a separate XML schema data file (.xsd). Although there are XML schema editors and other methods for creating an XML schema file, you may not have convenient access to them or know how to use them. As an alternative, you can use the Excel 2003 XML Tools Add-in Version 1.1, which can create a schema file from an XML map. For more information, see Using the Excel 2003 XML Tools Add-in Version 1.1.
Exporting XML data
You export XML data by exporting the contents of mapped cells on the worksheet. When you export data, Excel applies the following rules to determine what data to save and how to save it:
- Empty items are not created when blank cells exist for an optional element, but empty items are created when blank cells exist for a required element.
- Unicode Transformation Format-8 (UTF-8) encoding is used to write the data.
- All namespaces are defined in the Root XML element.
- Excel overwrites existing namespace prefixes. The default namespace is assigned a prefix of ns0. Successive namespaces are designated ns1, ns2 to ns<count> where <count> is the number of namespaces written to the XML file.
- Comment nodes are not preserved.
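The first two rules can be imitated in a few lines of Python. This is only a sketch of the described behavior, not Excel's implementation: the function `export_rows`, the `required` set, and the tag names are hypothetical, and namespace handling is left out.

```python
import xml.etree.ElementTree as ET


def export_rows(rows, required, root_tag="Root", row_tag="Row"):
    """Serialize a list of dicts to UTF-8 XML using the blank-cell rules above."""
    root = ET.Element(root_tag)
    for row in rows:
        item = ET.SubElement(root, row_tag)
        for name, value in row.items():
            blank = value is None or value == ""
            if blank and name not in required:
                continue  # blank optional element: no item is created
            child = ET.SubElement(item, name)
            if not blank:
                child.text = str(value)  # blank required element stays empty
    # The result is a bytes object encoded as UTF-8.
    return ET.tostring(root, encoding="utf-8")


exported = export_rows(
    [{"Name": "Ann", "Phone": "", "Id": ""}],
    required={"Id"},
)
print(exported.decode("utf-8"))
```

Here the blank optional Phone cell produces no element, while the blank required Id cell produces an empty element.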
You can display the XML Map Properties dialog box (click Map Properties in the XML group on the Developer tab) and then use the Validate data against schema for import and export option (active by default) to specify whether Excel validates data against the XML map when exporting data. Click this option when you want to ensure that the XML data you export conforms to the XML schema.
Using the Excel Macro-enabled Office Open XML Format File
You can save an Excel workbook in a variety of file formats, including the Excel macro-enabled Office Open XML Format File (.xlsm). Excel has a defined XML schema that defines the contents of an Excel workbook, including XML tags that store all workbook information, such as data and properties, and define the overall structure of the workbook. Custom applications can use this Excel macro-enabled Office Open XML Format File. For example, developers may want to create a custom application to search for data in multiple workbooks that are saved in this format and create a reporting system based on the data found.
See Also
Import XML data
Map XML elements to cells in an XML Map
Export XML data
Append or overwrite mapped XML data
The content type for .xlsx files is:
application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
Or use this:
Response.ContentType = "application/vnd.ms-excel";
Response.AppendHeader("content-disposition", "attachment; filename=myfile.xls");
For Excel 2007 and above the MIME type differs:
Response.ContentType = "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet";
Response.AppendHeader("content-disposition", "attachment; filename=myfile.xlsx");
Or if you are trying to read the file then try this:
// Requires: using System.Data; using System.Data.OleDb;
// FileName and FileExtension are assumed to be set by the caller.
DataSet objds = new DataSet();
string ConnStr;
if (FileExtension == ".xlsx")
{
    ConnStr = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + FileName + ";Extended Properties=\"Excel 12.0 Xml;HDR=No;IMEX=1\";";
}
else
{
    ConnStr = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + FileName + ";Extended Properties=\"Excel 8.0;HDR=No;IMEX=1\";";
}
string strSQL = "SELECT * FROM [Sheet1$]";
// The using blocks close the connection even if Fill throws.
using (OleDbConnection connection = new OleDbConnection(ConnStr))
using (OleDbCommand cmd = new OleDbCommand(strSQL, connection))
using (OleDbDataAdapter da = new OleDbDataAdapter(cmd))
{
    connection.Open();
    da.Fill(objds);
}
All the best.
—Amit
Table of Contents
- Problem Exploration
- The Excel XLSX file format
- .NET classes to create real Excel XLSX file from scratch
- Anatomy of a minimal Excel XLSX package file
- Minimal package structure
- Minimal package parts
- Required the document “start part”: workbook.xml
- Required: one (main) relationship part: .rels
- Required one worksheet: sheet1.xml
- Required: workbook relationship part
- Worksheet content
- The PowerShell code
- Links:
- Office Open XML Format Links:
- PowerShell and the Excel COM Object Model:
- Tips:
- See Also
Problem Exploration
I needed to store data in a Microsoft Excel compatible file.
Attempt 1: Use the Excel COM object model.
This is not a good solution because:
PowerShell very often runs on servers or clients without a Microsoft Office / Excel installation.
The use of the Excel COM object can cause errors inside a Scheduled Task.
Excel can read and store CSV data.
Attempt 2: Use CSV data (with Export-CSV)
This is not a good solution either because:
CSV is not just another type of Excel file. On opening a CSV data file, Microsoft Excel converts data automatically. This is not acceptable.
If Microsoft Excel writes an Excel worksheet to a CSV file, the output does not always follow the CSV format rules. Excel only places quotes around certain fields, not around all fields. This leads to unreadable CSV files.
I had the following requirements:
- The solution must work in PowerShell 2.0 and 3.0 (and later)
- Create an Excel compatible file without having Excel
- (do not use the Excel COM object model)
- The solution must work without 3rd party tools
- Should work similarly to the Export-CSV cmdlet
- Should be able to append a worksheet with data (-append parameter)
My Internet research turned up no solution that fits these requirements.
But I found C# code that does the job. So here is my translation of this code into PowerShell.
For C# code see here:
How to use the Office XML file format and the packaging components from the .NET Framework 3.0 to create a simple Excel 2007 workbook or a simple Word 2007 document
http://support.microsoft.com/kb/931866/en-us
The Excel XLSX file format
Starting with Microsoft Office 2007, Microsoft changed the default application file formats from the old, proprietary, closed formats (DOC, XLS, PPT) to the new, open, and standardized Open XML formats (DOCX, XLSX, and PPTX).
Office Open XML (also informally known as OOXML or OpenXML) is a zipped, XML-based file format used to represent spreadsheets, charts, presentations, and word processing documents.
Office Open XML was standardized by the European Computer Manufacturers Association (ECMA) as ECMA-376 and, in later versions, by ISO and IEC (as ISO/IEC 29500).
Every Open XML file is a zip file (a package) typically containing a number of UTF-8 encoded XML files ("parts").
Inside the XML parts of the package Multipurpose Internet Mail Extensions (MIME) types and Namespaces are used as metadata.
The XML parts (files) of the package are encoded in specialized markup languages. In the case of Microsoft Excel, this is the markup language called SpreadsheetML.
The package also contains relationship files (parts). The relationship parts have the extension .rels. They can be found in a folder with the name _rels.
The relationship parts define the relationships between the parts inside the package (internal) and to resources outside of the package (external).
The package may also contain other (binary) media files such as sounds or images.
The structure of the package is organized according to the Open Packaging Conventions as outlined in the OOXML standard.
You can look at the file structure and the files that comprise an XLSX file by simply unzipping the .xlsx file.
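For example, any zip library can list the parts of a package without unpacking it. The following Python sketch uses the standard zipfile module; the function name `list_package_parts` is made up, and the tiny in-memory archive only stands in for a real .xlsx file so that the example is self-contained.

```python
import io
import zipfile


def list_package_parts(package):
    """Return the part names in an Open XML package (path or file object)."""
    with zipfile.ZipFile(package) as archive:
        return sorted(archive.namelist())


# Stand-in for a real workbook: a small zip built in memory.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("[Content_Types].xml", "<Types/>")
    archive.writestr("xl/workbook.xml", "<workbook/>")

parts = list_package_parts(buffer)
print(parts)
```

With a real file you would pass its path instead of the buffer.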
.NET classes to create real Excel XLSX file from scratch
With .NET Framework 3.0, Microsoft introduced the System.IO.Packaging namespace, which lives inside WindowsBase.dll.
WindowsBase.dll is one of the core assemblies used for Windows Presentation Foundation (WPF).
(WPF is Microsoft's UI framework for creating applications with a rich user experience.)
So you don't have to worry that WindowsBase.dll moves around or goes away.
WindowsBase.dll can be found in:
C:\Program Files\Reference Assemblies\Microsoft\Framework\v3.0\WindowsBase.dll
The System.IO.Packaging namespace provides classes that support Office Open XML Zip compressed containers and other formats, which store multiple data objects in a single container.
System.IO.Packaging contains the ZipPackage class to work with Zip compressed package files.
See the Microsoft developer network (MSDN) documentation for this namespace and classes.
http://msdn.microsoft.com/en-US/library/System.IO.Packaging.aspx
PowerShell can use this .NET namespace and can easily deal with XML files, so here is the way to go.
PowerShell code to load the WindowsBase.dll assembly:
$Null = [Reflection.Assembly]::LoadWithPartialName("WindowsBase")
Anatomy of a minimal Excel XLSX package file
The number and types of the XLSX package parts will vary based on what is in the spreadsheet. I will describe the minimal XLSX needs here:
Minimal package structure
Example of the minimal structure of an XLSX package file with the one mandatory worksheet:
./[Content_Types].xml
./_rels/.rels
./xl/workbook.xml
./xl/_rels/workbook.xml.rels
./xl/worksheets/sheet1.xml
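As a cross-check of this structure, the five parts can also be written with nothing but a zip library. The Python sketch below is not the article's PowerShell approach; it assumes the standard Open XML namespace URIs and content types, uses an empty sheet, and the names `PARTS` and `write_minimal_xlsx` are invented for illustration.

```python
import io
import zipfile

HEAD = '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
PKG = "http://schemas.openxmlformats.org/package/2006"
DOC = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"
SML = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
CT = "application/vnd.openxmlformats-officedocument.spreadsheetml"

# One entry per part of the minimal package listed above.
PARTS = {
    "[Content_Types].xml": HEAD
    + f'<Types xmlns="{PKG}/content-types">'
    + '<Default Extension="rels" ContentType="application/vnd.openxmlformats-package.relationships+xml"/>'
    + '<Default Extension="xml" ContentType="application/xml"/>'
    + f'<Override PartName="/xl/workbook.xml" ContentType="{CT}.sheet.main+xml"/>'
    + f'<Override PartName="/xl/worksheets/sheet1.xml" ContentType="{CT}.worksheet+xml"/>'
    + "</Types>",
    "_rels/.rels": HEAD
    + f'<Relationships xmlns="{PKG}/relationships">'
    + f'<Relationship Id="rId1" Type="{DOC}/officeDocument" Target="xl/workbook.xml"/>'
    + "</Relationships>",
    "xl/workbook.xml": HEAD
    + f'<workbook xmlns="{SML}" xmlns:r="{DOC}">'
    + '<sheets><sheet name="Table0" sheetId="1" r:id="rId1"/></sheets>'
    + "</workbook>",
    "xl/_rels/workbook.xml.rels": HEAD
    + f'<Relationships xmlns="{PKG}/relationships">'
    + f'<Relationship Id="rId1" Type="{DOC}/worksheet" Target="worksheets/sheet1.xml"/>'
    + "</Relationships>",
    "xl/worksheets/sheet1.xml": HEAD
    + f'<worksheet xmlns="{SML}"><sheetData/></worksheet>',
}


def write_minimal_xlsx(target):
    """Write the five parts into a zip package (target: path or file object)."""
    with zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as package:
        for name, xml in PARTS.items():
            package.writestr(name, xml)


buffer = io.BytesIO()
write_minimal_xlsx(buffer)
with zipfile.ZipFile(buffer) as package:
    written = sorted(package.namelist())
print(written)
```

Passing a file path such as test.xlsx instead of the buffer writes the package to disk.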
Minimal package parts
Required: the main file [Content_Types].xml
This part is required for all Open XML documents. Three content types must be defined:
1. SpreadsheetML main document (for the start part)
2. Worksheet
3. Package relationships (for the required relationships)
The [Content_Types].xml part (file) is generated automatically by the ZipPackage class on creation of the Excel XLSX package file.
Here is the PowerShell code to create the package file on disk:
# create the main package on disk with filemode create
$exPkg = [System.IO.Packaging.Package]::Open("C:\test.xlsx", [System.IO.FileMode]::Create)
The [Content_Types].xml file contains definitions of the content types included in the ZIP package, such as the main document, the document theme, and the file properties. This file also stores definitions of the file extensions used in the ZIP package, such as file formats like .png or .wav, so you can store pictures or sounds inside a document.
Example of a minimal [Content_Types].xml part for a package that contains a workbook with one worksheet:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">
  <Default Extension="bin" ContentType="application/vnd.openxmlformats-officedocument.spreadsheetml.printerSettings" />
  <Default Extension="rels" ContentType="application/vnd.openxmlformats-package.relationships+xml" />
  <Default Extension="xml" ContentType="application/xml" />
  <Override PartName="/xl/workbook.xml" ContentType="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet.main+xml" />
  <Override PartName="/xl/worksheets/sheet1.xml" ContentType="application/vnd.openxmlformats-officedocument.spreadsheetml.worksheet+xml" />
</Types>
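As a quick check of the statement that the ZipPackage class writes [Content_Types].xml on its own, the sketch below creates a package with a single part and then lists the raw ZIP entries. It is an illustration rather than part of the script described here: the temp-file name, the placeholder part content, the loading fallback for PowerShell 7 and the use of System.IO.Compression.ZipFile (which needs .NET Framework 4.5 or later) are all assumptions.

```powershell
# Load System.IO.Packaging. WindowsBase hosts it on Windows PowerShell; the
# fallback to a separate System.IO.Packaging assembly is an assumption for
# PowerShell 7+.
try { Add-Type -AssemblyName WindowsBase -ErrorAction Stop } catch { }
if (-not ('System.IO.Packaging.Package' -as [type])) {
    Add-Type -AssemblyName System.IO.Packaging
}
# ZipFile is only used to inspect the result (assumes .NET Framework 4.5+).
Add-Type -AssemblyName System.IO.Compression.FileSystem

# Hypothetical output file in the temp folder.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'content-types-demo.xlsx'

$exPkg = [System.IO.Packaging.Package]::Open($path, [System.IO.FileMode]::Create)
$uri   = New-Object System.Uri -ArgumentList ('/xl/workbook.xml', [System.UriKind]::Relative)
$part  = $exPkg.CreatePart($uri, 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet.main+xml')

# Write placeholder content so the part is not empty.
$stream = $part.GetStream([System.IO.FileMode]::Create, [System.IO.FileAccess]::Write)
$bytes  = [System.Text.Encoding]::UTF8.GetBytes('<workbook />')
$stream.Write($bytes, 0, $bytes.Length)
$stream.Close()

# Closing the package flushes the parts and the content-types file to disk.
$exPkg.Close()

# List the raw ZIP entries of the finished package.
$zip = [System.IO.Compression.ZipFile]::OpenRead($path)
try     { $entryNames = @($zip.Entries | ForEach-Object { $_.FullName }) }
finally { $zip.Dispose() }
$entryNames
```

The list should include [Content_Types].xml even though the code never creates that file explicitly.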
Required: the document “start part” workbook.xml
workbook.xml requires one relationship part, workbook.xml.rels, which mainly links to the worksheets
Example of a minimal workbook.xml part:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<workbook xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships">
  <sheets>
    <sheet name="Table0" sheetId="1" r:id="rId1" />
  </sheets>
</workbook>
I use the .NET XML classes to create the XML document part from scratch. Here is the PowerShell Code:
# create the workbook.xml part XML document
# Office Open XML namespaces used by the workbook
$nsMain = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
$nsRel = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"
# create empty XML document
$xl_Workbook_xml = New-Object System.Xml.XmlDocument
# add the XML declaration at the top of the document
$XmlDeclaration = $xl_Workbook_xml.CreateXmlDeclaration("1.0", "UTF-8", "yes")
$Null = $xl_Workbook_xml.InsertBefore($XmlDeclaration, $xl_Workbook_xml.DocumentElement)
# create the workbook node in the SpreadsheetML namespace
$workBookElement = $xl_Workbook_xml.CreateElement("workbook", $nsMain)
# add the relationships namespace (prefix r) and append the node to the document
$Null = $workBookElement.SetAttribute("xmlns:r", $nsRel)
$Null = $xl_Workbook_xml.AppendChild($workBookElement)
# create and append the sheets node to the workbook node
$Null = $xl_Workbook_xml.DocumentElement.AppendChild($xl_Workbook_xml.CreateElement("sheets", $nsMain))
The URI is defined as a path relative to the package root. It determines the part name and the folder(s) to create.
The content type passed to the CreatePart() method declares what kind of content the part holds, as defined by the applicable Office Open XML schema.
The GetStream() method returns the destination stream into which the XML document is written.
# create the workbook.xml package part
# create URI for workbook.xml package part
$Uri_xl_workbook_xml = New-Object System.Uri -ArgumentList ("/xl/workbook.xml", [System.UriKind]::Relative)
# create workbook.xml part
$Part_xl_workbook_xml = $exPkg.CreatePart($Uri_xl_workbook_xml, "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet.main+xml")
# get writeable stream from workbook.xml part
$dest = $Part_xl_workbook_xml.GetStream([System.IO.FileMode]::Create, [System.IO.FileAccess]::Write)
# write workbook.xml XML document to part stream, then release the stream
$xl_Workbook_xml.Save($dest)
$dest.Close()
Required: one (main) relationship part: .rels
Must be in a _rels folder.
After you have created the workbook.xml part, you have to create the relationship from the package root to the document body workbook.xml.
The .rels file in the _rels folder is the top-level relationship file of an Office Open XML package.
This file defines relationships between core files in the ZIP package and the applicable Office Open XML schema.
The main relationship file “.rels” and its folder “_rels” are created automatically by a call to the CreateRelationship() method of the ZipPackage class.
The Target of a relationship is the location of the referenced file. The target can be within the XLSX ZIP package (internal) or outside of it (external). We store all files and information inside the ZIP package, so we use the target mode Internal.
The relationship type is a namespace URI from the applicable Office Open XML schema. In this case, the file workbook.xml is defined as type officeDocument. This information tells Excel that the file workbook.xml contains the document body.
The relationship Id (rId1 in this case) simply provides a unique identifier for the referenced file.
PowerShell code to create the relationship between the package and the main document workbook.xml:
# create package general main relationships
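A minimal sketch of this call, using the target mode, relationship type and Id described above. So that the block runs on its own, it recreates $exPkg and $Uri_xl_workbook_xml against a hypothetical temp file; that file name and the loading fallback for PowerShell 7 are assumptions, not part of the original script.

```powershell
# Load System.IO.Packaging. WindowsBase hosts it on Windows PowerShell; the
# fallback to a separate System.IO.Packaging assembly is an assumption for
# PowerShell 7+.
try { Add-Type -AssemblyName WindowsBase -ErrorAction Stop } catch { }
if (-not ('System.IO.Packaging.Package' -as [type])) {
    Add-Type -AssemblyName System.IO.Packaging
}

# Recreate the package and the workbook.xml part (hypothetical temp file).
$path  = Join-Path ([System.IO.Path]::GetTempPath()) 'root-rel-demo.xlsx'
$exPkg = [System.IO.Packaging.Package]::Open($path, [System.IO.FileMode]::Create)
$Uri_xl_workbook_xml = New-Object System.Uri -ArgumentList ('/xl/workbook.xml', [System.UriKind]::Relative)
$Null = $exPkg.CreatePart($Uri_xl_workbook_xml, 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet.main+xml')

# Package root -> workbook.xml: target mode Internal, type officeDocument, Id rId1.
$Null = $exPkg.CreateRelationship(
    $Uri_xl_workbook_xml,
    [System.IO.Packaging.TargetMode]::Internal,
    'http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument',
    'rId1')

# Read the relationship back before closing the package.
$rootRels = @($exPkg.GetRelationships())
$exPkg.Close()
$rootRels | Select-Object Id, TargetMode, RelationshipType
```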
Required: one worksheet, sheet1.xml
Inside the worksheet XML part, the <sheetData> node is required, but may be empty
Example of a minimal worksheet XML part:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships">
  <sheetData />
</worksheet>
I use the .NET XML classes to create the XML document part from scratch. The name of the worksheet part used in the URI is generated dynamically with the pattern Sheet + number + .xml in the $NewWorkSheetPartName variable (example names: Sheet1.xml, Sheet2.xml, Sheet3.xml and so on).
# create worksheet XML document
# Office Open XML namespaces used by the worksheet
$nsMain = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
$nsRel = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"
# create empty XML document
$New_Worksheet_xml = New-Object System.Xml.XmlDocument
# add the XML declaration at the top of the document
$XmlDeclaration = $New_Worksheet_xml.CreateXmlDeclaration("1.0", "UTF-8", "yes")
$Null = $New_Worksheet_xml.InsertBefore($XmlDeclaration, $New_Worksheet_xml.DocumentElement)
# create the worksheet node in the SpreadsheetML namespace
$workSheetElement = $New_Worksheet_xml.CreateElement("worksheet", $nsMain)
# add the relationships namespace (prefix r) and append the node to the document
$Null = $workSheetElement.SetAttribute("xmlns:r", $nsRel)
$Null = $New_Worksheet_xml.AppendChild($workSheetElement)
# create and append the sheetData node to the worksheet node
$Null = $New_Worksheet_xml.DocumentElement.AppendChild($New_Worksheet_xml.CreateElement("sheetData", $nsMain))
As with workbook.xml, the URI is a path relative to the package root and determines the part name and the folder(s) to create.
The content type passed to the CreatePart() method declares what kind of content the part holds, here a worksheet.
The GetStream() method returns the destination stream into which the XML document is written.
# create the worksheet package part
# create URI for worksheet package part
$Uri_xl_worksheets_sheet_xml = New-Object System.Uri -ArgumentList ("/xl/worksheets/$NewWorkSheetPartName", [System.UriKind]::Relative)
# create worksheet part
$Part_xl_worksheets_sheet_xml = $exPkg.CreatePart($Uri_xl_worksheets_sheet_xml, "application/vnd.openxmlformats-officedocument.spreadsheetml.worksheet+xml")
# get writeable stream from part
$dest = $Part_xl_worksheets_sheet_xml.GetStream([System.IO.FileMode]::Create, [System.IO.FileAccess]::Write)
# write $New_Worksheet_xml XML document to part stream, then release the stream
$New_Worksheet_xml.Save($dest)
$dest.Close()
Required: workbook relationship part
Every folder in an XLSX ZIP package can contain its own _rels folder to define relationships within that folder. The main document folder “xl” always contains a “_rels” folder with relationship parts.
The relationship part for workbook.xml is named workbook.xml.rels.
The workbook.xml.rels part is created by calling the CreateRelationship() method on the workbook.xml part; the “_rels” folder that contains it is created automatically from the URI.
So first you have to create the XML package part files and then you can create the relationships between them.
The unique ID of the relationship is determined from workbook.xml and generated dynamically with the pattern rId + number in the variable $NewWorkBookRelId (examples: rId1, rId2, rId3 and so on).
# create workbook to worksheet relationship
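A minimal sketch of this step, placed in an end-to-end run so it can be tried on its own: it writes the two XML parts, creates the root relationship and then links workbook.xml to the worksheet with $NewWorkBookRelId. The Add-XmlPart helper, the temp-file name, the use of LoadXml on literal XML instead of building the nodes one by one, and the loading fallback for PowerShell 7 are assumptions made for brevity and are not taken from the original script.

```powershell
# Load System.IO.Packaging. WindowsBase hosts it on Windows PowerShell; the
# fallback to a separate System.IO.Packaging assembly is an assumption for
# PowerShell 7+.
try { Add-Type -AssemblyName WindowsBase -ErrorAction Stop } catch { }
if (-not ('System.IO.Packaging.Package' -as [type])) {
    Add-Type -AssemblyName System.IO.Packaging
}

$nsMain   = 'http://schemas.openxmlformats.org/spreadsheetml/2006/main'
$nsRel    = 'http://schemas.openxmlformats.org/officeDocument/2006/relationships'
$internal = [System.IO.Packaging.TargetMode]::Internal
$NewWorkBookRelId     = 'rId1'
$NewWorkSheetPartName = 'sheet1.xml'

# Hypothetical helper: store an XML string as a package part and return the part.
function Add-XmlPart($Package, [string]$PartPath, [string]$ContentType, [string]$Xml) {
    $uri  = New-Object System.Uri -ArgumentList ($PartPath, [System.UriKind]::Relative)
    $part = $Package.CreatePart($uri, $ContentType)
    $doc  = New-Object System.Xml.XmlDocument
    $doc.LoadXml($Xml)
    $stream = $part.GetStream([System.IO.FileMode]::Create, [System.IO.FileAccess]::Write)
    try { $doc.Save($stream) } finally { $stream.Close() }
    return $part
}

# The <sheet> element's r:id must match the Id of the relationship created below.
$workbookXml = @"
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<workbook xmlns="$nsMain" xmlns:r="$nsRel">
  <sheets><sheet name="Table0" sheetId="1" r:id="$NewWorkBookRelId" /></sheets>
</workbook>
"@

$worksheetXml = @"
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<worksheet xmlns="$nsMain" xmlns:r="$nsRel">
  <sheetData />
</worksheet>
"@

# Hypothetical output file in the temp folder.
$path  = Join-Path ([System.IO.Path]::GetTempPath()) 'minimal-demo.xlsx'
$exPkg = [System.IO.Packaging.Package]::Open($path, [System.IO.FileMode]::Create)

$wbPart = Add-XmlPart $exPkg '/xl/workbook.xml' 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet.main+xml' $workbookXml
$wsPart = Add-XmlPart $exPkg "/xl/worksheets/$NewWorkSheetPartName" 'application/vnd.openxmlformats-officedocument.spreadsheetml.worksheet+xml' $worksheetXml

# Package root -> workbook.xml (stored in /_rels/.rels).
$Null = $exPkg.CreateRelationship($wbPart.Uri, $internal, "$nsRel/officeDocument", 'rId1')

# create workbook to worksheet relationship (stored in /xl/_rels/workbook.xml.rels)
$Null = $wbPart.CreateRelationship($wsPart.Uri, $internal, "$nsRel/worksheet", $NewWorkBookRelId)

# Read the relationship back before the package is closed.
$wbRels = @($wbPart.GetRelationships())
$exPkg.Close()

# Reopen the file read-only and list the parts that were written.
$check = [System.IO.Packaging.Package]::Open($path, [System.IO.FileMode]::Open, [System.IO.FileAccess]::Read)
$partNames = @($check.GetParts() | ForEach-Object { $_.Uri.ToString() })
$check.Close()
$partNames
```

Calling CreateRelationship() on the part rather than on the package is what places the entry in workbook.xml.rels instead of the top-level .rels file.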
Everything else is optional
Worksheet content
If you put data into a Microsoft Excel worksheet, Excel automatically converts some of it into the format it thinks is best.
For example, Excel removes leading zeros from numbers, changes date/time formats, or switches to scientific notation for large numbers.
This can go unnoticed in large data sets.
To prevent Excel from converting the data, you must tell Excel to import/store the data in text format.
There are two ways to store data of type text in an Excel XLSX worksheet package part:
1. Inline strings which are stored inside the XML worksheet package part (file)
• Provided for ease of translation/conversion
• Useful in XSLT scenarios
• Excel and other consumers may convert to shared strings
• Easy to use when exporting data programmatically into the worksheet
2. Using a shared-strings XML package part as a table with unique strings
• All worksheets link to the strings stored in the shared-strings package part
• Each unique string is stored once (reduced file size, improved performance)
• Cells store the 0-based index of the string
Both approaches may be mixed/combined
I will use the inline string approach here in my PowerShell solution because it is easier to create and maintain.
Example of an Excel XML worksheet part which contains only inline content, formatted as Type of text:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships">
  <sheetData>
    <row>
      <c t="inlineStr">
        <is>
          <t>Name</t>
        </is>
      </c>
    </row>
    <row>
      <c t="inlineStr">
        <is>
          <t>acrotray</t>
        </is>
      </c>
    </row>
    <row>
      <c t="inlineStr">
        <is>
          <t>Name</t>
        </is>
      </c>
    </row>
  </sheetData>
</worksheet>
A row is represented by a <row> element.
A cell is represented by a <c> element. The type of the cell is defined by the “t” attribute, here “inlineStr”, which means text stored inline.
If the cell has the type “inlineStr”, the <c> node must contain an <is> node.
For a simple string (text) without formatting, the <is> node contains a <t> node with the value of the string.
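The element nesting described above can be generated with the same .NET XML classes. The following is a minimal sketch, not the Export-WorkSheet function itself; the sample values (one cell per row) are hypothetical, and the value 00042 is included to show a string whose leading zeros must survive.

```powershell
$nsMain = 'http://schemas.openxmlformats.org/spreadsheetml/2006/main'

# Build an empty worksheet document with a sheetData node.
$doc = New-Object System.Xml.XmlDocument
$Null = $doc.AppendChild($doc.CreateXmlDeclaration('1.0', 'UTF-8', 'yes'))
$worksheet = $doc.AppendChild($doc.CreateElement('worksheet', $nsMain))
$sheetData = $worksheet.AppendChild($doc.CreateElement('sheetData', $nsMain))

# Hypothetical sample data: a header and two values, one cell per row.
$values = 'Name', 'acrotray', '00042'

foreach ($value in $values) {
    # <row><c t="inlineStr"><is><t>value</t></is></c></row>
    $row  = $sheetData.AppendChild($doc.CreateElement('row', $nsMain))
    $cell = $row.AppendChild($doc.CreateElement('c', $nsMain))
    $cell.SetAttribute('t', 'inlineStr')
    $is   = $cell.AppendChild($doc.CreateElement('is', $nsMain))
    $t    = $is.AppendChild($doc.CreateElement('t', $nsMain))
    $t.InnerText = $value
}

# The serialized worksheet part.
$xmlText = $doc.OuterXml
$xmlText
```

Every element is created in the SpreadsheetML namespace; otherwise the child nodes would be serialized with an empty xmlns attribute and would no longer count as worksheet elements.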
Warning:
By default, Excel itself stores strings in the shared-strings XML package part.
When Excel saves the file, it moves the inline strings into the shared-strings part!
After that conversion the worksheet part holds only index numbers, so the text can no longer be read directly from the worksheet XML; it has to be looked up in the shared-strings part.
The PowerShell code
There are several golden rules for code design.
Two of them are:
A function should concentrate on solving a single task and not be a Swiss army knife.
Functions and scripts should always return well-defined, structured objects.
So I have divided my PowerShell code into several functions.
New-XLSXWorkBook
Function to create a new empty Excel .xlsx workbook (XLSX package) without a worksheet
Add-XLSXWorkSheet
Function to append a new empty Excel worksheet to an existing Excel .xlsx workbook
Export-WorkSheet
Function to fill an empty existing Excel worksheet with the string typed data
These functions are only used internally, so it is best to hide them.
To hide functions, you have these options in PowerShell:
• Nest the functions inside another function (for advanced functions, put them inside the begin block)
• Nest the functions inside the begin block of an advanced script
• Create a module and specify the public module members with the Export-ModuleMember cmdlet
I don’t want to force the user of my script to import it as a module.
Because a PowerShell script can look and behave like a function, I decided to nest the functions inside the begin block of the script.
So you can use this script by simply calling it (by its path) and passing the parameters.
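The nesting technique can be sketched with an advanced function, which behaves the same way as an advanced script in this respect. The names Export-Demo and ConvertTo-Upper are hypothetical and serve only as an illustration.

```powershell
function Export-Demo {
    [CmdletBinding()]
    param(
        [Parameter(ValueFromPipeline = $true)]
        [string]$InputObject
    )
    begin {
        # Helper defined in the begin block: it exists only while Export-Demo runs.
        function ConvertTo-Upper([string]$Text) { $Text.ToUpper() }
    }
    process {
        # The process block shares the scope of the begin block, so the helper is visible here.
        ConvertTo-Upper $InputObject
    }
}

$result = 'a', 'b' | Export-Demo

# Outside Export-Demo the helper cannot be found.
$helperVisible = [bool](Get-Command ConvertTo-Upper -ErrorAction SilentlyContinue)
```

The caller sees only Export-Demo; the helper never appears in the session, which is the hiding effect described above.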
You can download the full code from the Microsoft Code Repository:
Links:
Office Open XML Format Links:
ISO and IEC standards
http://standards.iso.org/ittf/PubliclyAvailableStandards/index.html
Ecma standard 376
http://www.ecma-international.org/publications/standards/Ecma-376.htm
Office Open XML Learn resources:
Exploring the Office Open XML Formats
http://office.microsoft.com/en-us/training/office-open-xml-i-exploring-the-office-open-xml-formats-RZ010243529.aspx?section=1
Editing Documents in the XML
http://office.microsoft.com/en-us/training/open-xml-ii-editing-documents-in-the-xml-RZ010357030.aspx?CTT=1
Good Open XML XLSX Link:
Read and write Open XML files (MS Office 2007)
http://www.developerfusion.com/article/6170/read-and-write-open-xml-files-ms-office-2007/
SpreadsheetML or XLSX
http://officeopenxml.com/anatomyofOOXML-xlsx.php
PowerShell and the Excel COM Object Model:
For documentation of the Excel object model search the Microsoft Developer Network (MSDN) for:
“Excel Object Model Reference”
Excel 2003 and 2007: http://msdn.microsoft.com/en-us/library/bb149081%28v=office.12%29.aspx
Excel 2003 and 2007: http://msdn.microsoft.com/en-Us/library/wss56bz7%28v=vs.90%29.aspx
Excel 2013: http://msdn.microsoft.com/en-us/library/office/ff194068.aspx
Article series: Integrating Microsoft Excel with PowerShell by Jeffery Hicks:
http://www.petri.co.il/export-to-excel-with-powershell.htm
http://www.petri.co.il/export-to-excel-with-powershell-part-2.htm
http://www.petri.co.il/export-to-excel-with-powershell-part-3.htm
How Can I Use Windows PowerShell to Automate Microsoft Excel? By Ed Wilson:
http://blogs.technet.com/b/heyscriptingguy/archive/2006/09/08/how-can-i-use-windows-powershell-to-automate-microsoft-excel.aspx
Tips:
WindowsBase.dll can even be used in PowerShell 2.0 and 3.0 to create ZIP Files:
PowerShell-ZIP
http://thewalkingdev.blogspot.de/2012/07/powershellzip.html