Mso application progid excel sheet

Good afternoon.
If anyone has run into something similar, or can spot an obvious mistake, please tell me what I am doing wrong.

I am exporting data to Excel by means of an XSLT transformation.
First I convert an internal table to XML with the standard ID transformation, then I convert that XML to Excel-compatible XML with my own XSLT transformation ZST_1, and finally I save the result on the workstation as a file with the .XLS extension.

When the file is opened, Excel warns that its format differs from the one indicated by the extension. If the warning is ignored, the file opens.
If I save the file with the .XML extension instead, Excel does not open it, because during the ZST_1 transformation the line <?mso-application progid="Excel.Sheet"?>, which marks the XML as Excel-compatible, disappears for some reason.
How can I get rid of this message?

Also, is there a way to specify in the transformation that the output must be saved as UTF-16? My transformation saves it as UTF-8, which in some cases means the text is not displayed correctly.

Program

Code:

REPORT Z_TEST1.

* Row structure and internal table with the sample partners
data:
  BEGIN OF z_PARTNER,
    LIFNR TYPE LIFNR,
    NAME(25),
  end OF z_PARTNER,
  T_PARTNER like STANDARD TABLE OF z_PARTNER.

CLEAR: T_PARTNER, z_PARTNER.
z_PARTNER-LIFNR = '1'.
z_PARTNER-NAME = 'номер 1'.
APPEND z_PARTNER to t_PARTNER.
z_PARTNER-LIFNR = '2'.
z_PARTNER-NAME = 'номер 2'.
APPEND z_PARTNER to t_PARTNER.

* ABAP-to-XML transformation
TYPES: z_xml(1024) TYPE x.
DATA: lt_xml TYPE STANDARD TABLE OF z_xml,
      lt_xml_xls LIKE lt_xml.
CALL TRANSFORMATION id
  SOURCE data_node = t_PARTNER
  RESULT XML lt_xml.

* XML-to-XML (Excel) transformation
CALL TRANSFORMATION ZST_1
  SOURCE XML lt_xml[]
  RESULT XML lt_xml_XLS.

* Download
* Temporary directory (SAP GUI working directory)
DATA:
  l_filename TYPE string,
  l_dirname TYPE string.
CALL METHOD cl_gui_frontend_services=>get_sapgui_workdir
  CHANGING
    sapworkdir = l_dirname
  EXCEPTIONS
    OTHERS = 0.
CHECK sy-subrc = 0.

* File name for the download
CLEAR: l_filename.
CONCATENATE l_dirname 'TST' '_' sy-datum '_' sy-uzeit '.xls' INTO l_filename.
* Download the file as binary data
CALL METHOD cl_gui_frontend_services=>gui_download
  EXPORTING
    filename = l_filename
    filetype = 'BIN'
  CHANGING
    data_tab = lt_xml_XLS.

* Open the downloaded Excel XML file
cl_gui_frontend_services=>execute( document = l_filename operation = 'OPEN' ).

Transformation ZST_1

Code:

<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:sap="http://www.sap.com/sapxsl" xmlns:asx="http://www.sap.com/abapxml" version="1.0">
<xsl:strip-space elements="*"/>
<xsl:template match="/">
<?mso-application progid="Excel.Sheet"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:x="urn:schemas-microsoft-com:office:excel" xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet" xmlns:html="http://www.w3.org/TR/REC-html40">
<DocumentProperties xmlns="urn:schemas-microsoft-com:office:office">
<Version>12.00</Version>
</DocumentProperties>
<ExcelWorkbook xmlns="urn:schemas-microsoft-com:office:excel">
<WindowHeight>8580</WindowHeight>
<WindowWidth>17100</WindowWidth>
<WindowTopX>360</WindowTopX>
<WindowTopY>45</WindowTopY>
<ProtectStructure>False</ProtectStructure>
<ProtectWindows>False</ProtectWindows>
</ExcelWorkbook>
<Styles>
<Style ss:ID="Default" ss:Name="Normal">
<Alignment ss:Vertical="Bottom"/>
<Borders/>
<Font ss:Color="#000000" ss:FontName="Calibri" ss:Size="11" x:CharSet="204" x:Family="Swiss"/>
<Interior/>
<NumberFormat/>
<Protection/>
</Style>
<Style ss:ID="s62">
<Borders>
<Border ss:LineStyle="Continuous" ss:Position="Bottom" ss:Weight="1"/>
<Border ss:LineStyle="Continuous" ss:Position="Left" ss:Weight="1"/>
<Border ss:LineStyle="Continuous" ss:Position="Right" ss:Weight="1"/>
<Border ss:LineStyle="Continuous" ss:Position="Top" ss:Weight="1"/>
</Borders>
</Style>
</Styles>

<WorksheetOptions xmlns="urn:schemas-microsoft-com:office:excel">
<PageSetup>
<Header x:Margin="0.3"/>
<Footer x:Margin="0.3"/>
<PageMargins x:Bottom="0.75" x:Left="0.7" x:Right="0.7" x:Top="0.75"/>
</PageSetup>
<Selected/>
<Panes>
<Pane>
<Number>3</Number>
</Pane>
</Panes>
<ProtectObjects>False</ProtectObjects>
<ProtectScenarios>False</ProtectScenarios>
</WorksheetOptions>
</Worksheet>
</Workbook>

</xsl:template>
</xsl:transform>
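
A note on the two symptoms described in the question. In XSLT 1.0, a processing instruction written literally inside a stylesheet is not copied to the result; the Oracle stylesheet later in this compilation emits it with xsl:processing-instruction instead, and XSLT 1.0 also defines an encoding attribute on xsl:output. Whether the SAP XSLT processor honors that attribute for this kind of result is not confirmed here. To see what the target file should look like independently of SAP, here is a minimal sketch in Python (not ABAP); the function name to_excel_xml and the sheet name "Sheet1" are assumptions made for illustration.

```python
from xml.sax.saxutils import escape

NS = "urn:schemas-microsoft-com:office:spreadsheet"


def to_excel_xml(rows, encoding="utf-16"):
    """Serialize rows (sequences of values) as SpreadsheetML bytes."""
    lines = [
        # The declared encoding must match the encoding of the bytes written.
        '<?xml version="1.0" encoding="%s"?>' % encoding.upper(),
        # Processing instruction that marks the XML as an Excel workbook.
        '<?mso-application progid="Excel.Sheet"?>',
        '<Workbook xmlns="%s" xmlns:ss="%s">' % (NS, NS),
        ' <Worksheet ss:Name="Sheet1">',
        "  <Table>",
    ]
    for row in rows:
        # Every value is written as a String cell; special characters are escaped.
        cells = "".join(
            '<Cell><Data ss:Type="String">%s</Data></Cell>' % escape(str(value))
            for value in row
        )
        lines.append("   <Row>%s</Row>" % cells)
    lines += ["  </Table>", " </Worksheet>", "</Workbook>"]
    # Python's "utf-16" codec prepends a byte order mark to the output.
    return "\n".join(lines).encode(encoding)


data = to_excel_xml([["1", "номер 1"], ["2", "номер 2"]])
print(len(data), "bytes")
```

Writing the returned bytes to a file with the .xml extension gives a document that carries both the processing instruction and a UTF-16 declaration, which can be compared with the output of ZST_1.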

Source

NovaInfo 2
Published 13 July 2010
Section: Technical sciences

Keywords

XML, TESTCOMPLETE, MS EXCEL

Text of the article

At present, parsing and analyzing Excel tables in TestComplete causes difficulties when developing test scripts. One way to solve this problem is to analyze Excel tables saved in XML format. MS Office can save tables in the "XML Spreadsheet" format. In this article we look at how to parse and analyze tables in this format. More detailed information about the format is available on the Microsoft website. Information about the MS XML methods and properties used in this article can be found in the article "Parsing and analyzing an XML file in TestComplete".

Let us create an Excel table, for example one like this:

Figure 1. Example of an Excel table to be parsed and analyzed using XML

When saved in XML format, this table looks as follows:

<?xml version="1.0"?>
<?mso-application progid="Excel.Sheet"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
          xmlns:o="urn:schemas-microsoft-com:office:office"
          xmlns:x="urn:schemas-microsoft-com:office:excel"
          xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet"
          xmlns:html="http://www.w3.org/TR/REC-html40">
 <DocumentProperties xmlns="urn:schemas-microsoft-com:office:office">
  <Author>Долганов Алексей Александрович</Author>
  <LastAuthor>Долганов Алексей Александрович</LastAuthor>
  <Created>2010-07-13T09:35:04Z</Created>
  <Version>12.00</Version>
 </DocumentProperties>
 <ExcelWorkbook xmlns="urn:schemas-microsoft-com:office:excel">
  <WindowHeight>11820</WindowHeight>
  <WindowWidth>15315</WindowWidth>
  <WindowTopX>120</WindowTopX>
  <WindowTopY>45</WindowTopY>
  <ProtectStructure>False</ProtectStructure>
  <ProtectWindows>False</ProtectWindows>
 </ExcelWorkbook>
 <Styles>
  <Style ss:ID="Default" ss:Name="Normal">
   <Alignment ss:Vertical="Bottom"/>
   <Borders/>
   <Font ss:FontName="Calibri" x:CharSet="204" x:Family="Swiss" ss:Size="11" ss:Color="#000000"/>
   <Interior/>
   <NumberFormat/>
   <Protection/>
  </Style>
  <Style ss:ID="s62">
   <NumberFormat ss:Format="#,##0.00&quot;р.&quot;"/>
  </Style>
 </Styles>
 <Worksheet ss:Name="Лист1">
  <Table ss:ExpandedColumnCount="2" ss:ExpandedRowCount="3" x:FullColumns="1" x:FullRows="1" ss:DefaultRowHeight="15">
   <Row ss:AutoFitHeight="0">
    <Cell><Data ss:Type="String">Молоко</Data></Cell>
    <Cell ss:StyleID="s62"><Data ss:Type="Number">10</Data></Cell>
   </Row>
   <Row ss:AutoFitHeight="0">
    <Cell><Data ss:Type="String">Мясо</Data></Cell>
    <Cell ss:StyleID="s62"><Data ss:Type="Number">50</Data></Cell>
   </Row>
   <Row ss:AutoFitHeight="0">
    <Cell><Data ss:Type="String">Яблоки</Data></Cell>
    <Cell ss:StyleID="s62"><Data ss:Type="Number">20</Data></Cell>
   </Row>
  </Table>
  <WorksheetOptions xmlns="urn:schemas-microsoft-com:office:excel">
   <PageSetup>
    <Header x:Margin="0.3"/>
    <Footer x:Margin="0.3"/>
    <PageMargins x:Bottom="0.75" x:Left="0.7" x:Right="0.7" x:Top="0.75"/>
   </PageSetup>
   <Unsynced/>
   <Print>
    <ValidPrinterInfo/>
    <PaperSizeIndex>9</PaperSizeIndex>
    <HorizontalResolution>600</HorizontalResolution>
    <VerticalResolution>600</VerticalResolution>
   </Print>
   <Selected/>
   <ProtectObjects>False</ProtectObjects>
   <ProtectScenarios>False</ProtectScenarios>
  </WorksheetOptions>
 </Worksheet>
</Workbook>

The structure of the XML file starts with the root element Workbook, which represents the workbook. Its child elements are:

DocumentProperties (document properties). This element holds no table data, so we will not consider it;
ExcelWorkbook (Excel workbook). This element also holds nothing important for us, so we skip it as well;
Styles (styles). Holds the formatting of the table. We look at this element only briefly, on the assumption that the data matters more to us than its formatting;
Worksheet (sheet). There can be several of these elements, one per sheet in the workbook. They can be selected with the XML method getElementsByTagName.
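
The article works with MS XML inside TestComplete; as a stand-in, the same selection can be sketched with Python's xml.dom.minidom, which offers a DOM method of the same name. The sample document and the helper name worksheet_names below are assumptions made for illustration only.

```python
from xml.dom import minidom

# A reduced workbook with two sheets (hypothetical sample data).
SAMPLE = """<?xml version="1.0"?>
<?mso-application progid="Excel.Sheet"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
          xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">
 <Worksheet ss:Name="Sheet1"><Table/></Worksheet>
 <Worksheet ss:Name="Sheet2"><Table/></Worksheet>
</Workbook>"""


def worksheet_names(xml_text):
    """Return the ss:Name of every Worksheet element in document order."""
    doc = minidom.parseString(xml_text)
    # getElementsByTagName collects all descendants with the given tag name.
    sheets = doc.getElementsByTagName("Worksheet")
    return [sheet.getAttribute("ss:Name") for sheet in sheets]


print(worksheet_names(SAMPLE))
```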

Worksheet

Required attributes:

ss:Name (sheet name).

Optional attributes:

ss:Protected (sheet protection information);
ss:RightToLeft (text direction).

Child elements:

Table (table). The data;
WorksheetOptions (options). Holds no data and will not be considered.

Table

Required attributes: none

Optional attributes:

ss:DefaultColumnWidth (default column width). Given in pt (1pt = 4/3px);
ss:DefaultRowHeight (default row height). Given in pt (1pt = 4/3px);
ss:ExpandedColumnCount (total number of columns in this table);
ss:ExpandedRowCount (total number of rows in this table);
ss:LeftCell (where the table starts on the left);
ss:StyleID (table style). Refers to the Styles element (more on this below);
ss:TopCell (where the table starts at the top).

Child elements:

Column (columns);
Row (rows).

Column

Required attributes: none

Optional attributes:

c:Caption (caption);
ss:AutoFitWidth (automatic column width). True if it contains the value 1;
ss:Hidden (whether the column is hidden);
ss:Index (column index);
ss:Span (number of adjacent columns that follow this one and share its formatting);
ss:StyleID (column style);
ss:Width (column width). Given in pt (1pt = 4/3px).

Let us look more closely at the ss:Index and ss:Span attributes. Suppose there are 5 columns:

Width 100pt;
Width 20pt;
Width 20pt;
Width 20pt;
Width 50pt.

In the XML file the columns must be described as follows (element and attribute names are case-sensitive):

<Column ss:Width="100"/>
<Column ss:Span="2" ss:Width="20"/>
<Column ss:Index="5" ss:Width="50"/>
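
To make the interplay of ss:Index and ss:Span concrete, here is a small Python sketch that expands Column elements back into one width per column. It assumes that ss:Span counts the extra columns following the current one and that a column without ss:Index takes the next free position; the helper name column_widths is hypothetical.

```python
import xml.etree.ElementTree as ET

SS = "urn:schemas-microsoft-com:office:spreadsheet"

# The five-column example, wrapped in a Table element so it can be parsed.
TABLE = """<Table xmlns="urn:schemas-microsoft-com:office:spreadsheet"
       xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">
 <Column ss:Width="100"/>
 <Column ss:Span="2" ss:Width="20"/>
 <Column ss:Index="5" ss:Width="50"/>
</Table>"""


def column_widths(table_xml):
    """Map each 1-based column position to its width in pt."""
    table = ET.fromstring(table_xml)
    widths = {}
    position = 1  # position the next Column takes when it has no ss:Index
    for col in table.findall("{%s}Column" % SS):
        index = col.get("{%s}Index" % SS)
        if index is not None:
            position = int(index)  # an explicit index overrides the counter
        raw_width = col.get("{%s}Width" % SS)
        width = float(raw_width) if raw_width is not None else None
        span = int(col.get("{%s}Span" % SS, "0"))
        # The column itself plus `span` following columns share the width.
        for offset in range(span + 1):
            widths[position + offset] = width
        position += span + 1
    return widths


print(column_widths(TABLE))
```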

Row

Required attributes: none

Optional attributes:

c:Caption (caption);
ss:AutoFitHeight (automatic row height). True if it contains the value 1;
ss:Height (row height). Given in pt (1pt = 4/3px);
ss:Hidden (whether the row is hidden);
ss:Index (row index);
ss:Span (number of adjacent rows that follow this one and share its formatting);
ss:StyleID (row style).

Child elements:

Cell (cell).

Cell

The attribute and child elements listed below belong to the Data element nested inside each Cell, as in the sample above (<Cell><Data ss:Type="String">...</Data></Cell>).

Required attributes:

ss:Type (cell type). Possible values: Number; DateTime (date and time); Boolean; String; Error. Possible values of an error cell: #NULL!; #DIV/0!; #VALUE!; #REF!; #NAME?; #NUM!; #N/A; #CIRC!

Optional attributes: none

Child elements:

B (bold);
Font (font);
I (italic);
S (strikethrough);
Span (formatted span);
Sub (subscript);
Sup (superscript);
U (underline).

Element value: the value of the cell
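
As an illustration of how ss:Type can drive the analysis, the following Python sketch converts each Data element into a native value. It assumes that Boolean cells hold 1 or 0 and that DateTime cells hold an ISO 8601 timestamp; cells skipped through ss:Index are not handled. The names cell_value, read_rows and SAMPLE are hypothetical.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

SS = "urn:schemas-microsoft-com:office:spreadsheet"

SAMPLE = """<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
          xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">
 <Worksheet ss:Name="Sheet1">
  <Table>
   <Row>
    <Cell><Data ss:Type="String">Milk</Data></Cell>
    <Cell><Data ss:Type="Number">10</Data></Cell>
   </Row>
   <Row>
    <Cell><Data ss:Type="Boolean">1</Data></Cell>
    <Cell><Data ss:Type="DateTime">2010-07-13T09:35:04.000</Data></Cell>
   </Row>
  </Table>
 </Worksheet>
</Workbook>"""


def cell_value(data):
    """Convert one Data element according to its ss:Type attribute."""
    kind = data.get("{%s}Type" % SS)
    # itertext() also collects text nested in formatting children such as B or I.
    text = "".join(data.itertext())
    if kind == "Number":
        return float(text)
    if kind == "Boolean":
        return text == "1"  # assumption: stored as 1 or 0
    if kind == "DateTime":
        return datetime.fromisoformat(text)
    return text  # String and Error values stay as text


def read_rows(workbook_xml):
    """Return every row of the workbook as a list of converted values."""
    root = ET.fromstring(workbook_xml)
    return [
        [cell_value(data) for data in row.iter("{%s}Data" % SS)]
        for row in root.iter("{%s}Row" % SS)
    ]


print(read_rows(SAMPLE))
```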

Cite as

Долганов, А.А. Разбор и анализ таблиц Excel с помощью MS XML в TestComplete (Parsing and analyzing Excel tables with MS XML in TestComplete) / А.А. Долганов. Electronic text // NovaInfo, 2010. No. 2. URL: https://novainfo.ru/article/158 (accessed 14.04.2023).

Source

Here is a guide to creating an Excel file as an XML document.

The first step is the XML declaration, which defines the XML version and the encoding. It is followed by the processing instruction that associates the file with Excel.

<?xml version="1.0" encoding="ISO-8859-1"?>
<?mso-application progid="Excel.Sheet"?>

Next come the root tag and the schemas for Excel.

Under the <Workbook> tag there are elements that are constant for the Excel format.

The first element is <DocumentProperties>. This tag sets workbook properties such as the author, the title, the creation date and time, and so on.

<DocumentProperties xmlns="urn:schemas-microsoft-com:office:office">

As for the child nodes of <DocumentProperties>.

The next element is <ExcelWorkbook>. Below are its format and its child nodes.

<ExcelWorkbook>
<WindowHeight>8700</WindowHeight>
<WindowWidth>12315</WindowWidth>
<WindowTopY>120</WindowTopY>
<WindowTopX>60</WindowTopX>
<ProtectStructure>False</ProtectStructure>
<ProtectWindows>False</ProtectWindows>
</ExcelWorkbook>

Now for the styles that control how the data is presented in the spreadsheet. This is similar to CSS.

The container element is <Styles>, and each child node is a <Style>. The elements inside a <Style> node define how the data is formatted.

Those are some example styles. You can add other styles depending on how you want your data to be presented.

The next node is <Worksheet>. This node holds the table information, the data, and the style of each cell.

<Worksheet ss:Name="Sheet Name">

Then, under the <Worksheet> node, comes the <Table> node.

<Table ss:ExpandedColumnCount="256" ss:ExpandedRowCount="21" x:FullColumns="1" x:FullRows="1">

We will create a table with 5 columns.

The first width is the height of the cell and the second width is the actual width of the cell.

The StyleID is the ID we declared in the <Styles> node.

<Row Num="1">
<Cell ss:Index="2" ss:StyleID="Header" ss:MergeAcross="4"><Data ss:Type="String">This is the first row</Data></Cell>
</Row>

The ss:Index="2" means that the data is placed in the 2nd column, which is column B. The ss:MergeAcross="4" merges that cell with the four cells to its right, so the merged range runs from column B to column F.
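
The arithmetic can be checked with a few lines of Python. The sketch assumes the SpreadsheetML convention that ss:MergeAcross counts the cells merged in addition to the starting cell; the helper names column_letter and merged_columns are made up for this example.

```python
def column_letter(index):
    """Convert a 1-based column index to its Excel letter (1 -> A, 27 -> AA)."""
    letters = ""
    while index > 0:
        index, remainder = divmod(index - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def merged_columns(index, merge_across):
    """Return the first and last column letters covered by a merged cell."""
    # The starting cell plus `merge_across` additional cells to its right.
    return column_letter(index), column_letter(index + merge_across)


print(merged_columns(2, 4))  # ('B', 'F')
```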

<Row Num="3" ss:Index="3">
<Cell ss:Index="2"><Data ss:Type="String">DefaultNumber</Data></Cell>
<Cell><Data ss:Type="String">BoldItalic</Data></Cell>
<Cell><Data ss:Type="String">SimpleUnderline</Data></Cell>
<Cell><Data ss:Type="String">Currency3Decimals</Data></Cell>
</Row>
<Row Num="4" ss:Index="4">
<Cell ss:Index="2"><Data ss:Type="Number">123456</Data></Cell>
<Cell><Data ss:Type="String">Bold and Italic</Data></Cell>
<Cell><Data ss:Type="String">Underline</Data></Cell>
<Cell><Data ss:Type="Number">123456</Data></Cell>
</Row>

The last node of the worksheet is <WorksheetOptions>. It contains Print, Panes, Selected, and so on.

<WorksheetOptions xmlns="urn:schemas-microsoft-com:office:excel">
<Print>
<ValidPrinterInfo/>
<HorizontalResolution>200</HorizontalResolution>
<VerticalResolution>200</VerticalResolution>
<NumberofCopies>0</NumberofCopies>
</Print>
<Selected/>
<Panes>
<Pane>
<Number>3</Number>
<ActiveRow>1</ActiveRow>
</Pane>
</Panes>
<ProtectObjects>False</ProtectObjects>
<ProtectScenarios>False</ProtectScenarios>
</WorksheetOptions>

The <Worksheet> node can be repeated if you want more than one sheet in your Excel file.

Now we can close the XML with the </Workbook> closing tag.

</Workbook>

I hope this guide helps you with your project.

Source

I get the following warning when opening an XML file that has the extension .xls, but I want to use it as xls:

"The file you are trying to open, '[filename]', is in a different format than specified by the file extension. Verify that the file is not corrupted and is from a trusted source before opening the file. Do you want to open the file now?" (Yes | No | Help)

Quoted from the MSDN blog article 'Excel 2007 Extension Warning On Opening Excel Workbook from a Web Site'.

How can I solve this?

I use .xls with this source code:

<?xml version="1.0" encoding="utf-8"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet" xmlns:x="urn:schemas-microsoft-com:office:excel" xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet" xmlns:html="http://www.w3.org/TR/REC-html40">
<Worksheet ss:Name="Export">
<Table>
<Row>
<Cell><Data ss:Type="Number">3</Data></Cell>
<Cell><Data ss:Type="Number">22123497</Data></Cell>
</Row>
</Table>
</Worksheet>
</Workbook>

User5910


asked Sep 14, 2011 at 12:12

As the commenters already mentioned, your example document is definitely not an xls file (those are binary), and Excel rightly complains about that, because a document could trick you with a wrong extension.

What you should do is save the document with the file extension xml and add the processing instruction for an Office document (in this case SpreadsheetML, as opposed to the original binary, proprietary Excel format):

<?xml version="1.0"?>
<?mso-application progid="Excel.Sheet"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
...

This used to work, but I just noticed that with Office 2007 the XML-processing component ("XML Editor") doesn't seem to be installed as the default application for XML files. That component sent XML files to the correct application when they were opened, according to the processing instruction. Maybe on your machine this works as intended; otherwise you might have to change this behavior.

So this is basically what the other commenters already said. Still, I hope this helps.
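
Before renaming a generated file to .xml, it can be worth checking whether the processing instruction is actually present. The Python sketch below does that with xml.dom.minidom; the function name excel_progid and both sample strings are assumptions for illustration, not part of the original answer.

```python
from xml.dom import minidom

WITH_PI = (
    '<?xml version="1.0"?>\n'
    '<?mso-application progid="Excel.Sheet"?>\n'
    '<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"/>'
)
WITHOUT_PI = (
    '<?xml version="1.0"?>\n'
    '<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"/>'
)


def excel_progid(xml_text):
    """Return the data of a top-level mso-application instruction, or None."""
    doc = minidom.parseString(xml_text)
    # Processing instructions placed before the root element are children
    # of the document node itself.
    for node in doc.childNodes:
        if (node.nodeType == node.PROCESSING_INSTRUCTION_NODE
                and node.target == "mso-application"):
            return node.data
    return None


print(excel_progid(WITH_PI))
print(excel_progid(WITHOUT_PI))
```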

Mads Hansen


answered Nov 16, 2011 at 16:17

Andreas J


Source

Office Open XML (OOX) became the default format with the release of Office 2007, but back in the Office 2003 days Microsoft had already developed a format to store Excel workbooks as XML.
A comprehensive overview is available here :
Dive into SpreadsheetML (Part 1 of 2)
Dive into SpreadsheetML (Part 2 of 2)

Contrary to OOX, where data and metadata are stored in a multipart archive, an Excel workbook file in SpreadsheetML 2003 format consists of a single XML instance and is therefore easily managed using built-in Oracle XML functions and XML DB features.
In this article, I'll focus on how to create and read such files with the help of SQL/XML functions, XSLT and XQuery.

1. Writing a file

The minimum valid structure for an instance looks like this :

<?xml version="1.0" encoding="UTF-8"?>
<?mso-application progid="Excel.Sheet"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet" xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">
 <Worksheet ss:Name="Sheet1">
  <Table>
   <Row>
    <Cell>
     <Data ss:Type="String">Employee No</Data>
    </Cell>
    <Cell>
     <Data ss:Type="String">Employee Name</Data>
    </Cell>
    <Cell>
     <Data ss:Type="String">Job</Data>
    </Cell>
   </Row>
   <Row>
    <Cell>
     <Data ss:Type="Number">7839</Data>
    </Cell>
    <Cell>
     <Data ss:Type="String">KING</Data>
    </Cell>
    <Cell>
     <Data ss:Type="String">PRESIDENT</Data>
    </Cell>
   </Row>
   <!-- More rows -->
  </Table>
 </Worksheet>
</Workbook>

It can be generated this way, with SQL/XML functions :

SELECT XMLConcat(
         XMLPi("mso-application", 'progid="Excel.Sheet"')
       , XMLElement("Workbook",
           XMLAttributes(
             'urn:schemas-microsoft-com:office:spreadsheet' as "xmlns"
           , 'urn:schemas-microsoft-com:office:spreadsheet' as "xmlns:ss"
           )
         , XMLElement("Worksheet",
             XMLAttributes('Sheet1' as "ss:Name")
           , XMLElement("Table",
               XMLElement("Row",
                 XMLForest(
                   XMLElement("Data", XMLAttributes('String' as "ss:Type"), 'Employee No') as "Cell"
                 , XMLElement("Data", XMLAttributes('String' as "ss:Type"), 'Employee Name') as "Cell"
                 , XMLElement("Data", XMLAttributes('String' as "ss:Type"), 'Job') as "Cell"
                 )
               )
             , XMLAgg(
                 XMLElement("Row",
                   XMLForest(
                     XMLElement("Data", XMLAttributes('Number' as "ss:Type"), e.empno) as "Cell"
                   , XMLElement("Data", XMLAttributes('String' as "ss:Type"), e.ename) as "Cell"
                   , XMLElement("Data", XMLAttributes('String' as "ss:Type"), e.job) as "Cell"
                   )
                 )
                 order by e.empno
               )
             )
           )
         )
       )
FROM scott.emp e
;
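
For readers without an Oracle instance at hand, the mapping performed by the query (one Row per record, one Cell/Data pair per column, ss:Type chosen per value) can be sketched in Python. This is only an analogy to the SQL/XML functions, not a replacement for them; the helper names data_cell and table_xml are hypothetical.

```python
from xml.sax.saxutils import escape


def data_cell(value):
    """Render one value as a Cell, choosing ss:Type from the Python type."""
    # bool is a subclass of int, so it is excluded from the numeric branch.
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return '<Cell><Data ss:Type="Number">%s</Data></Cell>' % value
    return '<Cell><Data ss:Type="String">%s</Data></Cell>' % escape(str(value))


def table_xml(header, records):
    """Render a header row followed by one Row per record."""
    rows = [header] + list(records)
    body = "\n".join(
        "   <Row>%s</Row>" % "".join(data_cell(value) for value in row)
        for row in rows
    )
    return "  <Table>\n%s\n  </Table>" % body


print(table_xml(["Employee No", "Employee Name", "Job"],
                [(7839, "KING", "PRESIDENT")]))
```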

Although this query is relatively simple and efficient, we can imagine how cumbersome it could get to write queries for more complex requirements.
So this is where XSLT comes into play. By creating a stylesheet working on a canonical XML input, we can hide the transformation logic and separate the data layer from the presentation layer.

The following example generates a multisheet workbook with some additional Excel-specific formatting (frozen headers, and the tab color set to red for total salaries higher than 10,000).
Here, I first stored the XSLT stylesheet in the XML DB repository. That’s not mandatory, we can also declare the stylesheet inline in the PL/SQL block, but it’s a good practice to keep the stylesheets in the database (repository or XMLType column in a relational table) if we intend to use them with the internal XSLT processor.

The stylesheet (test.xsl) :

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns="urn:schemas-microsoft-com:office:spreadsheet"  
 xmlns:x="urn:schemas-microsoft-com:office:excel" 
 xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">
 <xsl:output method="xml" encoding="UTF-8"/>
 <xsl:template match="/">
  <xsl:processing-instruction name="mso-application">progid="Excel.Sheet"</xsl:processing-instruction>
  <Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet" 
            xmlns:x="urn:schemas-microsoft-com:office:excel" 
            xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">
   <Styles>
    <Style ss:ID="h">
     <Interior ss:Color="#C0C0C0" ss:Pattern="Solid"/>
    </Style>
   </Styles>
   <xsl:apply-templates/>
  </Workbook>
 </xsl:template>
 <xsl:template match="ROWSET">
  <Worksheet ss:Name="{@name}">
   <Table>
    <Row>
     <xsl:for-each select="ROW[1]/*">
      <Cell ss:StyleID="h">
       <Data ss:Type="String">
        <xsl:value-of select="translate(local-name(), '_', ' ')"/>
       </Data>
      </Cell>
     </xsl:for-each>
    </Row>
    <xsl:apply-templates/>
   </Table>
   <x:WorksheetOptions>
    <x:FrozenNoSplit/>
    <x:SplitHorizontal>1</x:SplitHorizontal>
    <x:TopRowBottomPane>1</x:TopRowBottomPane>
    <x:ActivePane>2</x:ActivePane>
    <xsl:if test="@color">
     <x:TabColorIndex><xsl:value-of select="@color"/></x:TabColorIndex>
    </xsl:if>
   </x:WorksheetOptions>
  </Worksheet>
 </xsl:template>
 <xsl:template match="ROW">
  <Row>
   <xsl:apply-templates/>
  </Row>
 </xsl:template>
 <xsl:template match="ROW/*">
  <Cell>
   <Data ss:Type="String">
    <xsl:value-of select="."/>
   </Data>
  </Cell>
 </xsl:template>
</xsl:stylesheet>

The transformation code :

DECLARE
  
  xmldoc CLOB;
 
BEGIN
 
  select xmlserialize(document
           xmltransform(
             xmlelement("ROOT",
               xmlagg(
                 xmlelement("ROWSET",
                   xmlattributes(
                     d.dname as "name"
                   , case when sum(e.sal) > 10000 then '2' end as "color"
                   )
                 , xmlagg(
                     xmlelement("ROW",
                       xmlforest(
                         e.empno as "Employee_No"
                       , e.ename as "Employee_Name"
                       , e.job   as "Job"
                       , e.sal   as "Salary"
                       )
                     ) order by e.empno
                   )
                 ) order by d.deptno
               )
             )
           , xdburitype('/office/excel/stylesheets/out/test.xsl').getXML()
           )
           as clob
         )
  into xmldoc
  from scott.dept d
       join scott.emp e on e.deptno = d.deptno
  group by d.deptno
         , d.dname
  ;
  
  dbms_xslprocessor.clob2file(xmldoc, 'TEST_DIR', 'test.xml');
 
END;
/

The output file :

One of the most used Excel features is the Pivot Table generator. Creating such content is also possible directly from the database, using XSLT.
For instance, here’s some “raw” data :

SQL> select employee_id
          , first_name
          , last_name
          , extract(year from hire_date) as hire_year
          , job_id
     from hr.employees
     ;
 
EMPLOYEE_ID FIRST_NAME           LAST_NAME                  HIRE_YEAR JOB_ID
----------- -------------------- ------------------------- ---------- ----------
        198 Donald               OConnell                        2007 SH_CLERK
        199 Douglas              Grant                           2008 SH_CLERK
        200 Jennifer             Whalen                          2003 AD_ASST
        201 Michael              Hartstein                       2004 MK_MAN
        202 Pat                  Fay                             2005 MK_REP
        203 Susan                Mavris                          2002 HR_REP
        204 Hermann              Baer                            2002 PR_REP
        205 Shelley              Higgins                         2002 AC_MGR
        206 William              Gietz                           2002 AC_ACCOUNT
        100 Steven               King                            2003 AD_PRES
        101 Neena                Kochhar                         2005 AD_VP
        102 Lex                  De Haan                         2001 AD_VP
        103 Alexander            Hunold                          2006 IT_PROG

...

        195 Vance                Jones                           2007 SH_CLERK
        196 Alana                Walsh                           2006 SH_CLERK
        197 Kevin                Feeney                          2006 SH_CLERK
 
107 rows selected

and we want to display, for a given job, the number of employees hired per year.
In SQL, that’s called a dynamic pivot but it’s not possible – with conventional methods – to produce such a result set out of a single SELECT statement (because the number of columns has to be known at parse time).

The PIVOT XML operator (11g) provides a partial answer to the problem by generating an XMLType containing aggregated “columns” (actually XML “elements”). The same functionality can be simulated in 10g too with XMLAgg and a partitioned outer join.
But with that method, we still have to build the pivot in SQL, in the database.

What I describe below lets Excel do the job for us, through its standard pivot table functionality. We just have to generate a tab containing the raw data (hereafter named “DataSource”), and a tab (“PivotTable”) containing the minimum pivot table definition, i.e. no data and no cache.

The stylesheet (pivot.xsl) :

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns="urn:schemas-microsoft-com:office:spreadsheet"
 xmlns:x="urn:schemas-microsoft-com:office:excel"
 xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">
 <xsl:output method="xml" encoding="UTF-8"/>
 <xsl:param name="filename"/>
 <xsl:template match="/">
  <xsl:processing-instruction name="mso-application">progid="Excel.Sheet"</xsl:processing-instruction>
  <Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
            xmlns:x="urn:schemas-microsoft-com:office:excel"
            xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">
   <xsl:apply-templates/>
  </Workbook>
 </xsl:template>
 <xsl:template match="ROWSET">
  <Worksheet ss:Name="DataSource">
   <Table>
    <Row>
     <Cell><Data ss:Type="String">EMPLOYEE_ID</Data></Cell>
     <Cell><Data ss:Type="String">FIRST_NAME</Data></Cell>
     <Cell><Data ss:Type="String">LAST_NAME</Data></Cell>
     <Cell><Data ss:Type="String">HIRE_YEAR</Data></Cell>
     <Cell><Data ss:Type="String">JOB_ID</Data></Cell>
    </Row>
    <xsl:apply-templates/>
   </Table>
  </Worksheet>
  <Worksheet ss:Name="PivotTable">
   <Table/>
   <x:PivotTable>
    <x:Name>My Pivot Table</x:Name>
    <x:ImmediateItemsOnDrop/>
    <x:ShowPageMultipleItemLabel/>
    <x:GrandTotalString>Total</x:GrandTotalString>
    <x:Location>R3C1:R4C2</x:Location>
    <x:PivotField>
     <x:Name>EMPLOYEE_ID</x:Name>
     <x:DataType>Integer</x:DataType>
    </x:PivotField>
    <x:PivotField>
     <x:Name>FIRST_NAME</x:Name>
    </x:PivotField>
    <x:PivotField>
     <x:Name>LAST_NAME</x:Name>
    </x:PivotField>
    <x:PivotField>
     <x:Name>HIRE_YEAR</x:Name>
     <x:Orientation>Column</x:Orientation>
     <x:AutoSortOrder>Ascending</x:AutoSortOrder>
     <x:Position>1</x:Position>
     <x:DataType>Integer</x:DataType>
     <x:PivotItem>
      <x:Name/>
     </x:PivotItem>
    </x:PivotField>
    <x:PivotField>
     <x:Name>JOB_ID</x:Name>
     <x:Orientation>Row</x:Orientation>
     <x:AutoSortOrder>Ascending</x:AutoSortOrder>
     <x:Position>1</x:Position>
     <x:PivotItem>
      <x:Name/>
     </x:PivotItem>
    </x:PivotField>
    <x:PivotField>
     <x:DataField/>
     <x:Name>Data</x:Name>
     <x:Orientation>Row</x:Orientation>
     <x:Position>-1</x:Position>
    </x:PivotField>
    <x:PivotField>
     <x:Name>Number of Employees</x:Name>
     <x:ParentField>EMPLOYEE_ID</x:ParentField>
     <x:Orientation>Data</x:Orientation>
     <x:Function>Count</x:Function>
     <x:Position>1</x:Position>
    </x:PivotField>
    <x:PTLineItems>
     <x:PTLineItem>
      <x:Item>0</x:Item>
     </x:PTLineItem>
    </x:PTLineItems>
    <x:PTLineItems>
     <x:Orientation>Column</x:Orientation>
     <x:PTLineItem>
      <x:Item>0</x:Item>
     </x:PTLineItem>
    </x:PTLineItems>
    <x:PTSource>
     <x:RefreshOnFileOpen/>
     <x:ConsolidationReference>
      <x:FileName>[<xsl:value-of select="$filename"/>]DataSource</x:FileName>
      <x:Reference>R1C1:R<xsl:value-of select="count(ROW)+1"/>C5</x:Reference>
     </x:ConsolidationReference>
    </x:PTSource>
   </x:PivotTable>
  </Worksheet>
 </xsl:template>
 <xsl:template match="ROW">
  <Row>
   <Cell><Data ss:Type="Number"><xsl:value-of select="EMPLOYEE_ID"/></Data></Cell>
   <Cell><Data ss:Type="String"><xsl:value-of select="FIRST_NAME"/></Data></Cell>
   <Cell><Data ss:Type="String"><xsl:value-of select="LAST_NAME"/></Data></Cell>
   <Cell><Data ss:Type="Number"><xsl:value-of select="HIRE_YEAR"/></Data></Cell>
   <Cell><Data ss:Type="String"><xsl:value-of select="JOB_ID"/></Data></Cell>
  </Row>
 </xsl:template>
 <xsl:template name="PivotTable">
 </xsl:template>
</xsl:stylesheet>

The transformation code :

DECLARE

  res clob;
  v_filename varchar2(260) := 'test_pivot.xml';

BEGIN

  select xmlserialize(document
           xmltransform(
             xmlelement("ROWSET",
               xmlagg(
                 xmlelement("ROW",
                   xmlforest(
                     employee_id
                   , first_name
                   , last_name
                   , extract(year from hire_date) as hire_year
                   , job_id
                   )
                 )
               )
             )
           , xdburitype('/office/excel/stylesheets/out/pivot.xsl').getXML()
           , 'filename="'''||v_filename||'''"'
           )
         )
  into res
  from hr.employees
  ;

  dbms_xslprocessor.clob2file(res, 'TEST_DIR', v_filename);

END;
/

The output file :

2. Reading a file

I’ll divide this section into two parts : querying and optimizing.

a) “One-shot” queries

Let’s say we want to read this document (saved as XML 2003 format) as if it were a relational table :

As usual, we’ll use an XMLType table to store the original file and then query from it.
Examples in the present article were tested on 11g XE (11.2.0.2) so storage is Binary XML by default :

create table tmp_xml of xmltype;

insert into tmp_xml values(
  xmltype(
    bfilename('XML_DIR','test.xml')
  , nls_charset_id('AL32UTF8')
  )
);

The query involves two XMLTable() functions, the first one to break the document into separate worksheets, and the second to extract each row from them :

SQL> select x1.sheetname
          , x2.id
          , x2.comments
          , x2.dt
     from tmp_xml t
        , xmltable(
            xmlnamespaces( default 'urn:schemas-microsoft-com:office:spreadsheet'
                         , 'urn:schemas-microsoft-com:office:spreadsheet' as "ss" )
          , '/Workbook/Worksheet'
            passing t.object_value
            columns sheetname varchar2(31) path '@ss:Name'
                  , rowset    xmltype      path 'Table/Row'
          ) x1
        , xmltable(
            xmlnamespaces(default 'urn:schemas-microsoft-com:office:spreadsheet')
          , '/Row[position()>1]'
            passing x1.rowset
            columns id        number         path 'Cell[1]/Data'
                  , comments  varchar2(2000) path 'Cell[2]/Data'
                  , dt        timestamp      path 'substring-before(Cell[3]/Data,".")'
          ) x2
     where x1.sheetname = 'MyData-1'
     ;
 
SHEETNAME               ID COMMENTS                                           DT
--------------- ---------- -------------------------------------------------- -------------------------
MyData-1                 1 This is a comment for line #1                      09/02/12 12:09:37,000000
MyData-1                 2 This is a comment for line #2                      10/02/12 12:09:36,000000
MyData-1                 3 This is a comment for line #3                      11/02/12 12:09:36,000000
MyData-1                 4 This is a comment for line #4                      12/02/12 12:09:36,000000
MyData-1                 5 This is a comment for line #5                      13/02/12 12:09:36,000000
MyData-1                 6 This is a comment for line #6                      14/02/12 12:09:36,000000
MyData-1                 7 This is a comment for line #7                      15/02/12 12:09:36,000000
MyData-1                 8 This is a comment for line #8                      16/02/12 12:09:36,000000
MyData-1                 9 This is a comment for line #9                      17/02/12 12:09:36,000000
MyData-1                10 This is a comment for line #10                     18/02/12 12:09:36,000000
MyData-1                11 This is a comment for line #11                     19/02/12 12:09:36,000000
MyData-1                12 This is a comment for line #12                     20/02/12 12:09:36,000000
MyData-1                13 This is a comment for line #13                     21/02/12 12:09:36,000000
MyData-1                14 This is a comment for line #14                     22/02/12 12:09:36,000000
MyData-1                15 This is a comment for line #15                     23/02/12 12:09:36,000000
MyData-1                16 This is a comment for line #16                     24/02/12 12:09:36,000000
 
16 rows selected
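
The same two-level extraction can be tried outside the database. The following is only a sketch in Python using the standard `xml.etree.ElementTree` module : the inline sample workbook, its header labels and the function name `read_sheet` are assumptions made up for the example, and the timestamp format is assumed to be the ISO-like `2012-02-09T12:09:37.000` form. The outer loop plays the role of the first XMLTable() (one iteration per worksheet), the inner loop that of the second (one iteration per row, header skipped) :

```python
import xml.etree.ElementTree as ET
from datetime import datetime

SS = "urn:schemas-microsoft-com:office:spreadsheet"
NS = {"ss": SS}  # prefix used in the ElementTree paths below

# Hypothetical workbook: one header row followed by two data rows.
DOC = f"""<?mso-application progid="Excel.Sheet"?>
<Workbook xmlns="{SS}" xmlns:ss="{SS}">
 <Worksheet ss:Name="MyData-1">
  <Table>
   <Row>
    <Cell><Data ss:Type="String">ID</Data></Cell>
    <Cell><Data ss:Type="String">COMMENTS</Data></Cell>
    <Cell><Data ss:Type="String">DT</Data></Cell>
   </Row>
   <Row>
    <Cell><Data ss:Type="Number">1</Data></Cell>
    <Cell><Data ss:Type="String">This is a comment for line #1</Data></Cell>
    <Cell><Data ss:Type="DateTime">2012-02-09T12:09:37.000</Data></Cell>
   </Row>
   <Row>
    <Cell><Data ss:Type="Number">2</Data></Cell>
    <Cell><Data ss:Type="String">This is a comment for line #2</Data></Cell>
    <Cell><Data ss:Type="DateTime">2012-02-10T12:09:36.000</Data></Cell>
   </Row>
  </Table>
 </Worksheet>
 <Worksheet ss:Name="Other"><Table/></Worksheet>
</Workbook>"""

def read_sheet(xml_text, sheet_name):
    """Return (id, comments, dt) tuples for the data rows of one worksheet."""
    root = ET.fromstring(xml_text)
    result = []
    # Level 1: one iteration per worksheet.
    for ws in root.findall("ss:Worksheet", NS):
        # Attributes must be addressed by their fully qualified name.
        if ws.get(f"{{{SS}}}Name") != sheet_name:
            continue
        # Level 2: one iteration per row, skipping the header row.
        for row in ws.findall("ss:Table/ss:Row", NS)[1:]:
            cells = [d.text for d in row.findall("ss:Cell/ss:Data", NS)]
            # Similar in spirit to substring-before(...,"."): drop fractional seconds.
            dt = datetime.strptime(cells[2].split(".")[0], "%Y-%m-%dT%H:%M:%S")
            result.append((int(cells[0]), cells[1], dt))
    return result

rows = read_sheet(DOC, "MyData-1")
for r in rows:
    print(r)
```

Like the SQL version, this addresses cells by position, so it assumes every row carries all three cells in order.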

b) Optimized access of the document

If loading these documents into the database is a recurring task then, provided the structure doesn’t change, queries on the data can be optimized by creating a structured XML index on the XMLType table.
With such an index in place, and depending on the size of the document, there could be a significant overhead at insert time, but it’s a trade-off : subsequent queries will be considerably faster.

Here’s a small test case based on the following document (a 50,000-row worksheet, no header) :

Document properties (I’ll define a virtual column to hold the title property) :

Set up and query plan :

-- Table creation : 
create table ext_smldata of xmltype
xmltype store as binary xml
virtual columns (
  title as (
    XMLCast(
      XMLQuery(
      'declare default element namespace "urn:schemas-microsoft-com:office:spreadsheet"; (::)
       declare namespace o = "urn:schemas-microsoft-com:office:office"; (::)
       /Workbook/o:DocumentProperties/o:Title'
      passing object_value returning content
      )
      as varchar2(200)
    )
  )
);

-- Index on the "TITLE" virtual column : 
create index ext_smldata_title_idx on ext_smldata (title);

-- Structured XML index on the table : 
create index ext_smldata_sxi on ext_smldata (object_value)
indextype is xdb.xmlindex
parameters (q'#
 XMLTable ext_smldata_xtb
   XMLNamespaces (default 'urn:schemas-microsoft-com:office:spreadsheet')
 , '/Workbook/Worksheet/Table/Row'
   COLUMNS rec_id      NUMBER       PATH 'Cell[1]/Data/text()'
         , description VARCHAR2(80) PATH 'Cell[2]/Data/text()'
         , rec_value   VARCHAR2(30) PATH 'Cell[3]/Data/text()'
#');

-- Insert : 
insert into ext_smldata values(
  xmltype(
    bfilename('XML_DIR','smldata.xml')
  , nls_charset_id('AL32UTF8')
  )
);

set timing on
set autotrace traceonly

SELECT x.*
FROM ext_smldata t
   , XMLTable(
       XMLNamespaces (default 'urn:schemas-microsoft-com:office:spreadsheet')
     , '/Workbook/Worksheet/Table/Row'
       PASSING t.object_value
       COLUMNS rec_id      NUMBER       PATH 'Cell[1]/Data/text()'
             , description VARCHAR2(80) PATH 'Cell[2]/Data/text()'
             , rec_value   VARCHAR2(30) PATH 'Cell[3]/Data/text()'
     ) x
WHERE t.title = 'SampleData1'
;

50000 rows selected.

Elapsed: 00:00:01.69

Execution Plan
----------------------------------------------------------
Plan hash value: 3987672269

------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name                  | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                       | 59520 |  5405K|   174   (2)| 00:00:03 |
|*  1 |  HASH JOIN                   |                       | 59520 |  5405K|   174   (2)| 00:00:03 |
|   2 |   TABLE ACCESS BY INDEX ROWID| EXT_SMLDATA           |     1 |    29 |     2   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN          | EXT_SMLDATA_TITLE_IDX |     1 |       |     1   (0)| 00:00:01 |
|   4 |   TABLE ACCESS FULL          | EXT_SMLDATA_XTB       | 59520 |  3720K|   171   (1)| 00:00:03 |
------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("T"."SYS_NC_OID$"="SYS_SXI_0"."OID")
   3 - access("T"."TITLE"='SampleData1')


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       3962  consistent gets
         77  physical reads
       4796  redo size
    2823321  bytes sent via SQL*Net to client
      37083  bytes received via SQL*Net from client
       3335  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      50000  rows processed

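The execution plan shows that the data is retrieved from EXT_SMLDATA_XTB, the relational table supporting the XML index (operation 4), joined to the base table row located through the index on the TITLE virtual column (operations 2 and 3). The XML document itself is not parsed at query time.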
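
To illustrate the trade-off, here is a rough analogy in Python with the standard `sqlite3` module. It is not how Oracle implements a structured XMLIndex : the table names `docs` and `doc_rows`, the helpers `make_doc`, `insert_doc` and `query_by_title`, and the sample data are all assumptions made up for the sketch. The idea it shows is the same, though : the rows are projected into a relational table once, at insert time, and later queries read that table without parsing the XML again :

```python
import sqlite3
import xml.etree.ElementTree as ET

SS = "urn:schemas-microsoft-com:office:spreadsheet"
OFFICE = "urn:schemas-microsoft-com:office:office"
NS = {"ss": SS, "o": OFFICE}

def make_doc(title, n):
    """Build a small hypothetical workbook with a title property and n rows."""
    rows = "".join(
        f'<Row><Cell><Data ss:Type="Number">{i}</Data></Cell>'
        f'<Cell><Data ss:Type="String">desc {i}</Data></Cell>'
        f'<Cell><Data ss:Type="String">val {i}</Data></Cell></Row>'
        for i in range(1, n + 1)
    )
    return (
        f'<Workbook xmlns="{SS}" xmlns:ss="{SS}">'
        f'<DocumentProperties xmlns="{OFFICE}"><Title>{title}</Title></DocumentProperties>'
        f'<Worksheet ss:Name="Sheet1"><Table>{rows}</Table></Worksheet>'
        f'</Workbook>'
    )

conn = sqlite3.connect(":memory:")
conn.execute("create table docs (doc_id integer primary key, title text, body text)")
conn.execute("create index docs_title_idx on docs (title)")
conn.execute(
    "create table doc_rows (doc_id integer, rec_id integer, description text, rec_value text)"
)

def insert_doc(conn, xml_text):
    """Store the document and shred its rows: the cost is paid once, at insert time."""
    root = ET.fromstring(xml_text)
    title = root.findtext("o:DocumentProperties/o:Title", namespaces=NS)
    cur = conn.execute("insert into docs (title, body) values (?, ?)", (title, xml_text))
    doc_id = cur.lastrowid
    shredded = []
    for row in root.findall("ss:Worksheet/ss:Table/ss:Row", NS):
        data = [d.text for d in row.findall("ss:Cell/ss:Data", NS)]
        shredded.append((doc_id, int(data[0]), data[1], data[2]))
    conn.executemany("insert into doc_rows values (?, ?, ?, ?)", shredded)

def query_by_title(conn, title):
    """Read the pre-shredded rows; no XML parsing happens here."""
    return conn.execute(
        "select r.rec_id, r.description, r.rec_value"
        " from docs d join doc_rows r on r.doc_id = d.doc_id"
        " where d.title = ? order by r.rec_id",
        (title,),
    ).fetchall()

insert_doc(conn, make_doc("SampleData1", 3))
insert_doc(conn, make_doc("SampleData2", 2))
result = query_by_title(conn, "SampleData1")
print(result)
```

In Oracle the projection table is maintained by the index itself, so none of this bookkeeping has to be written by hand; the sketch only makes the insert-time cost and the query-time benefit visible.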
