Apache poi excel xml

I have a need to convert my excel files into XML. Currently I have a tool that uses POI that spits out 2010 excel files (xlsx), but I would like to extend that further and have it spit out XML as well.

I can’t seem to find any explicit examples of how to do that in POI, but searching suggests it is possible. Was hoping someone would have some direct experience with this?

thanks!


asked Dec 10, 2012 at 16:42 by rkd80

I’ve never used POI directly, though I’ve used Apache Tika (which uses POI under the hood for Office formats) to do something similar. Its parser/handler interface converts a document into XML (XHTML SAX events), which you should be able to adjust for your own purposes.

https://tika.apache.org/1.2/parser.html
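As a minimal sketch of the XML-writing half of the problem, assuming the cell values have already been read into lists of strings (for example with POI's row and cell iterators), the JDK's own StAX writer is enough. The class name `RowsToXml` and the `sheet`/`row`/`cell` element names are illustrative choices, not anything defined by POI or Tika.

```java
import java.io.StringWriter;
import java.util.List;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical helper, not part of POI or Tika: turns already-read cell values into XML.
final class RowsToXml {

    // The first row is treated as the header; every later row becomes a <row> element.
    static String toXml(List<List<String>> rows) {
        StringWriter out = new StringWriter();
        try {
            XMLStreamWriter xml = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            xml.writeStartDocument();
            xml.writeStartElement("sheet");
            List<String> header = rows.isEmpty() ? List.of() : rows.get(0);
            for (int r = 1; r < rows.size(); r++) {
                List<String> row = rows.get(r);
                xml.writeStartElement("row");
                for (int i = 0; i < row.size(); i++) {
                    xml.writeStartElement("cell");
                    // Column names go into an attribute, so headers need not be valid XML names
                    String col = i < header.size() ? header.get(i) : "col" + i;
                    xml.writeAttribute("col", col);
                    // Special characters such as & and < are escaped by the writer
                    xml.writeCharacters(row.get(i));
                    xml.writeEndElement();
                }
                xml.writeEndElement();
            }
            xml.writeEndElement();
            xml.writeEndDocument();
            xml.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not write XML", e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(toXml(List.of(
                List.of("ID", "NAME"),
                List.of("1", "Amit"))));
    }
}
```

Putting the column name in an attribute rather than using it as the element name avoids having to normalize headers that contain spaces or start with digits.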

answered Dec 10, 2012 at 17:07 by winchella

Check out www.servingxml.com. It converts Excel/CSV to XML with no Java code involved. All you have to do is create a transformation XML file, which is very easy. There are lots of examples on the site.

answered Dec 10, 2012 at 17:19 by mavrav

Learn to read excel, write excel, evaluate formula cells and apply custom formatting to the generated excel files using Apache POI library with examples.

If we are building software for the HR or Finance domain, there is usually a requirement for generating excel reports across management levels. Apart from reports, we can also expect some input data for the applications coming in the form of excel sheets and the application is expected to support this requirement.

Apache POI is a well-trusted open-source library for handling such use cases involving Excel files. Note that we can also read and write MS Word and MS PowerPoint files using the Apache POI library.

This Apache POI tutorial will discuss some everyday excel operations in real-life applications.

  1. Maven Dependency
  2. Important Classes in POI Library
  3. Writing an Excel File
  4. Reading an Excel File
  5. Add and Evaluate Formula Cells
  6. Formatting the Cells
  7. Conclusion

1. Maven Dependency

If we are working on a maven project, we can include the Apache POI dependencies in pom.xml file using this:

<dependency>
  <groupId>org.apache.poi</groupId>
  <artifactId>poi</artifactId>
  <version>5.2.2</version>
</dependency>

<dependency>
  <groupId>org.apache.poi</groupId>
  <artifactId>poi-ooxml</artifactId>
  <version>5.2.2</version>
</dependency>

2. Important Classes in POI Library

  1. HSSF, XSSF and SXSSF classes

    Apache POI main classes usually start with either HSSF, XSSF or SXSSF.

    • HSSF – is the POI Project’s pure Java implementation of the Excel 97(-2007) file format. e.g., HSSFWorkbook, HSSFSheet.
    • XSSF – is the POI Project’s pure Java implementation of the Excel 2007 OOXML (.xlsx) file format. e.g., XSSFWorkbook, XSSFSheet.
    • SXSSF (since 3.8-beta3) – is an API-compatible streaming extension of XSSF to be used when huge spreadsheets have to be produced and heap space is limited. e.g., SXSSFWorkbook, SXSSFSheet. SXSSF achieves its low memory footprint by limiting access to the rows within a sliding window, while XSSF gives access to all rows in the document.
  2. Row and Cell

    Apart from the above classes, Row and Cell are used to interact with a particular row and a particular cell in an excel sheet.

  3. Styling Related Classes

    A wide range of classes like CellStyle, BuiltinFormats, ComparisonOperator, ConditionalFormattingRule, FontFormatting, IndexedColors, PatternFormatting, SheetConditionalFormatting etc. are used when you have to add formatting to a sheet, primarily based on some rules.

  4. FormulaEvaluator

    Another helpful class FormulaEvaluator is used to evaluate the formula cells in an excel sheet.

3. Writing an Excel File

I am taking this example first so we can reuse the excel sheet created by this code in further examples.

Writing excel using POI is very simple and involves the following steps:

  1. Create a workbook
  2. Create a sheet in workbook
  3. Create a row in sheet
  4. Add cells to the row
  5. Repeat steps 3 and 4 to write more data

It seems very simple, right? Let’s have a look at the code doing these steps.

Java program to write an excel file using Apache POI library.

package com.howtodoinjava.demo.poi;
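import java.io.File;
import java.io.FileOutputStream;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.xssf.usermodel.XSSFSheet;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;
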
public class WriteExcelDemo 
{
    public static void main(String[] args) 
    {
        //Blank workbook
        XSSFWorkbook workbook = new XSSFWorkbook(); 
         
        //Create a blank sheet
        XSSFSheet sheet = workbook.createSheet("Employee Data");
          
        //This data needs to be written (Object[])
        Map<String, Object[]> data = new TreeMap<String, Object[]>();
        data.put("1", new Object[] {"ID", "NAME", "LASTNAME"});
        data.put("2", new Object[] {1, "Amit", "Shukla"});
        data.put("3", new Object[] {2, "Lokesh", "Gupta"});
        data.put("4", new Object[] {3, "John", "Adwards"});
        data.put("5", new Object[] {4, "Brian", "Schultz"});
          
        //Iterate over data and write to sheet
        Set<String> keyset = data.keySet();
        int rownum = 0;
        for (String key : keyset)
        {
            Row row = sheet.createRow(rownum++);
            Object [] objArr = data.get(key);
            int cellnum = 0;
            for (Object obj : objArr)
            {
               Cell cell = row.createCell(cellnum++);
               if(obj instanceof String)
                    cell.setCellValue((String)obj);
                else if(obj instanceof Integer)
                    cell.setCellValue((Integer)obj);
            }
        }
        //Write the workbook to the file system; try-with-resources closes the stream even on failure
        try (FileOutputStream out = new FileOutputStream(new File("howtodoinjava_demo.xlsx")))
        {
            workbook.write(out);
            System.out.println("howtodoinjava_demo.xlsx written successfully on disk.");
        } 
        catch (Exception e) 
        {
            e.printStackTrace();
        }
    }
}

See Also: Appending Rows to Excel

4. Reading an Excel File

Reading an excel file using POI is also very simple if we divide this into steps.

  1. Create workbook instance from an excel sheet
  2. Get to the desired sheet
  3. Increment row number
  4. Iterate over all cells in a row
  5. Repeat steps 3 and 4 until all data is read

Let’s see all the above steps in code. I am writing the code to read the excel file created in the above example. It will read all the column names and the values in it – cell by cell.

Java program to read an excel file using Apache POI library.

package com.howtodoinjava.demo.poi;
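import java.io.File;
import java.io.FileInputStream;
import java.util.Iterator;

import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.xssf.usermodel.XSSFSheet;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;
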
public class ReadExcelDemo 
{
    public static void main(String[] args) 
    {
        try
        {
            FileInputStream file = new FileInputStream(new File("howtodoinjava_demo.xlsx"));
 
            //Create Workbook instance holding reference to .xlsx file
            XSSFWorkbook workbook = new XSSFWorkbook(file);
 
            //Get first/desired sheet from the workbook
            XSSFSheet sheet = workbook.getSheetAt(0);
 
            //Iterate through each rows one by one
            Iterator<Row> rowIterator = sheet.iterator();
            while (rowIterator.hasNext()) 
            {
                Row row = rowIterator.next();
                //For each row, iterate through all the columns
                Iterator<Cell> cellIterator = row.cellIterator();
                 
                while (cellIterator.hasNext()) 
                {
                    Cell cell = cellIterator.next();
                    //Check the cell type and format accordingly
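                    //getCellType() returns the CellType enum in POI 5.x,
                    //so the old int constants Cell.CELL_TYPE_* are not used here
                    switch (cell.getCellType()) 
                    {
                        case NUMERIC:
                            System.out.print(cell.getNumericCellValue() + "\t");
                            break;
                        case STRING:
                            System.out.print(cell.getStringCellValue() + "\t");
                            break;
                        default:
                            break;
                    }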
                }
                System.out.println("");
            }
            file.close();
        } 
        catch (Exception e) 
        {
            e.printStackTrace();
        }
    }
}

Program Output:

ID      NAME        LASTNAME
1.0     Amit        Shukla  
2.0     Lokesh      Gupta   
3.0     John        Adwards 
4.0     Brian       Schultz 
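
The IDs print as 1.0 because getNumericCellValue() returns a double. A small sketch of one way to print whole numbers without the trailing ".0" follows; the helper name `NumericCellText` is an assumption for illustration. POI also ships a DataFormatter class for cell-aware formatting, which may be the better fit when the cell's own number format matters.

```java
import java.math.BigDecimal;

// Hypothetical helper: renders a numeric cell value without a redundant ".0".
final class NumericCellText {

    // 1.0 -> "1", 9.25 -> "9.25", 14500.0 -> "14500"
    static String format(double value) {
        if (Double.isNaN(value) || Double.isInfinite(value)) {
            return String.valueOf(value);
        }
        // stripTrailingZeros may switch to scientific notation, toPlainString undoes that
        return BigDecimal.valueOf(value).stripTrailingZeros().toPlainString();
    }

    public static void main(String[] args) {
        System.out.println(format(1.0));
        System.out.println(format(9.25));
    }
}
```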

See Also: Apache POI – Read an Excel File using SAX Parser

5. Add and Evaluate Formula Cells

When working on complex excel sheets, we encounter many cells with formulas to calculate their values. These are formula cells. Apache POI also has excellent support for adding formula cells and evaluating already present formula cells.

Let’s see an example of how to add formula cells in Excel.

The sheet has four cells in a row, and the fourth one is the product of the previous three cells. So the formula will be: A2*B2*C2 (in the second row).

Java program to add formula in an excel file using Apache POI library.

public static void main(String[] args) 
{
    XSSFWorkbook workbook = new XSSFWorkbook();
    XSSFSheet sheet = workbook.createSheet("Calculate Simple Interest");
  
    Row header = sheet.createRow(0);
    header.createCell(0).setCellValue("Pricipal");
    header.createCell(1).setCellValue("RoI");
    header.createCell(2).setCellValue("T");
    header.createCell(3).setCellValue("Interest (P r t)");
      
    Row dataRow = sheet.createRow(1);
    dataRow.createCell(0).setCellValue(14500d);
    dataRow.createCell(1).setCellValue(9.25);
    dataRow.createCell(2).setCellValue(3d);
    dataRow.createCell(3).setCellFormula("A2*B2*C2");
      
    //try-with-resources closes the stream even if the write fails
    try (FileOutputStream out = new FileOutputStream(new File("formulaDemo.xlsx"))) {
        workbook.write(out);
        System.out.println("Excel with formula cells written successfully");
    } catch (FileNotFoundException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    }
}

Similarly, to read a file with formula cells, we can use the following logic to evaluate them.

Java program to evaluate formula in an excel file using Apache POI library.

public static void readSheetWithFormula()
{
    try
    {
        FileInputStream file = new FileInputStream(new File("formulaDemo.xlsx"));
 
        //Create Workbook instance holding reference to .xlsx file
        XSSFWorkbook workbook = new XSSFWorkbook(file);
 
        FormulaEvaluator evaluator = workbook.getCreationHelper().createFormulaEvaluator();
         
        //Get first/desired sheet from the workbook
        XSSFSheet sheet = workbook.getSheetAt(0);
 
        //Iterate through each rows one by one
        Iterator<Row> rowIterator = sheet.iterator();
        while (rowIterator.hasNext()) 
        {
            Row row = rowIterator.next();
            //For each row, iterate through all the columns
            Iterator<Cell> cellIterator = row.cellIterator();
             
            while (cellIterator.hasNext()) 
            {
                Cell cell = cellIterator.next();
                //Check the cell type after evaluating formulae
                //If it is a formula cell, it will be evaluated; otherwise no change will happen
                //getCellType() returns the CellType enum in POI 5.x
                switch (evaluator.evaluateInCell(cell).getCellType()) 
                {
                    case NUMERIC:
                        System.out.print(cell.getNumericCellValue() + "\t\t");
                        break;
                    case STRING:
                        System.out.print(cell.getStringCellValue() + "\t\t");
                        break;
                    case FORMULA:
                        //Not again
                        break;
                    default:
                        break;
                }
            }
            System.out.println("");
        }
        file.close();
    } 
    catch (Exception e) 
    {
        e.printStackTrace();
    }
}

Program Output:

Pricipal        RoI         T       Interest (P r t)        
14500.0         9.25        3.0     402375.0  
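
As a quick sanity check of the evaluated value, the same arithmetic as the sheet formula A2*B2*C2 can be reproduced in plain Java. The class name `InterestCheck` is only illustrative.

```java
// Hypothetical check: repeats the sheet formula A2*B2*C2 outside of Excel.
final class InterestCheck {

    static double product(double principal, double rate, double years) {
        return principal * rate * years;
    }

    public static void main(String[] args) {
        // prints 402375.0, matching the evaluated formula cell
        System.out.println(product(14500d, 9.25, 3d));
    }
}
```

Note that the formula multiplies the rate as entered. If 9.25 is meant as a percentage, a conventional simple-interest figure would also divide by 100.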

6. Formatting the Cells

So far we have seen examples of reading and writing Excel files using Apache POI. But when creating a report in an Excel file, it is often essential to format cells that meet pre-determined criteria.

This formatting can be a different coloring based on a specific value range, expiry date limit etc.

In the below examples, we are taking a couple of such cell formatting examples for various purposes.

6.1. Cell value in a specific range

This code colors cells in a range based on their value: cells greater than 70 get a blue fill and cells less than 50 get a green fill, so values from 50 to 70 stay uncolored.

static void basedOnValue(Sheet sheet) 
{
    //Creating some random values
    sheet.createRow(0).createCell(0).setCellValue(84);
    sheet.createRow(1).createCell(0).setCellValue(74);
    sheet.createRow(2).createCell(0).setCellValue(50);
    sheet.createRow(3).createCell(0).setCellValue(51);
    sheet.createRow(4).createCell(0).setCellValue(49);
    sheet.createRow(5).createCell(0).setCellValue(41);
 
    SheetConditionalFormatting sheetCF = sheet.getSheetConditionalFormatting();
 
    //Condition 1: Cell Value Is   greater than  70   (Blue Fill)
    ConditionalFormattingRule rule1 = sheetCF.createConditionalFormattingRule(ComparisonOperator.GT, "70");
    PatternFormatting fill1 = rule1.createPatternFormatting();
    fill1.setFillBackgroundColor(IndexedColors.BLUE.index);
    fill1.setFillPattern(PatternFormatting.SOLID_FOREGROUND);
 
    //Condition 2: Cell Value Is  less than      50   (Green Fill)
    ConditionalFormattingRule rule2 = sheetCF.createConditionalFormattingRule(ComparisonOperator.LT, "50");
    PatternFormatting fill2 = rule2.createPatternFormatting();
    fill2.setFillBackgroundColor(IndexedColors.GREEN.index);
    fill2.setFillPattern(PatternFormatting.SOLID_FOREGROUND);
 
    CellRangeAddress[] regions = {
            CellRangeAddress.valueOf("A1:A6")
    };
 
    sheetCF.addConditionalFormatting(regions, rule1, rule2);
}
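
To see which of the six sample values the two rules would color, the comparisons can be mirrored in plain Java. This is only a sketch of the rule logic, with a hypothetical class name `ValueRuleCheck`. It does not touch POI and does not render anything.

```java
// Hypothetical check: mirrors the two conditional formatting rules as plain comparisons.
final class ValueRuleCheck {

    // Rule 1: greater than 70 -> blue; rule 2: less than 50 -> green; otherwise no fill
    static String fillFor(double value) {
        if (value > 70) {
            return "BLUE";
        }
        if (value < 50) {
            return "GREEN";
        }
        return "NONE";
    }

    public static void main(String[] args) {
        for (double v : new double[] {84, 74, 50, 51, 49, 41}) {
            System.out.println(v + " -> " + fillFor(v));
        }
    }
}
```

With the sample data, 84 and 74 come out blue, 49 and 41 green, and the boundary values 50 and 51 stay unfilled because both comparisons are strict.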

6.2. Highlight Duplicate Values

Highlight all cells which have duplicate values in observed cells.

static void formatDuplicates(Sheet sheet) {
    sheet.createRow(0).createCell(0).setCellValue("Code");
    sheet.createRow(1).createCell(0).setCellValue(4);
    sheet.createRow(2).createCell(0).setCellValue(3);
    sheet.createRow(3).createCell(0).setCellValue(6);
    sheet.createRow(4).createCell(0).setCellValue(3);
    sheet.createRow(5).createCell(0).setCellValue(5);
    sheet.createRow(6).createCell(0).setCellValue(8);
    sheet.createRow(7).createCell(0).setCellValue(0);
    sheet.createRow(8).createCell(0).setCellValue(2);
    sheet.createRow(9).createCell(0).setCellValue(8);
    sheet.createRow(10).createCell(0).setCellValue(6);
 
    SheetConditionalFormatting sheetCF = sheet.getSheetConditionalFormatting();
 
    // Condition 1: Formula Is   =COUNTIF($A$2:$A$11,A2)>1   (Blue Font)
    ConditionalFormattingRule rule1 = sheetCF.createConditionalFormattingRule("COUNTIF($A$2:$A$11,A2)>1");
    FontFormatting font = rule1.createFontFormatting();
    font.setFontStyle(false, true);
    font.setFontColorIndex(IndexedColors.BLUE.index);
 
    CellRangeAddress[] regions = {
            CellRangeAddress.valueOf("A2:A11")
    };
 
    sheetCF.addConditionalFormatting(regions, rule1);
 
    sheet.getRow(2).createCell(1).setCellValue("<== Duplicates numbers in the column are highlighted.  " +
            "Condition: Formula Is =COUNTIF($A$2:$A$11,A2)>1   (Blue Font)");
}
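
The COUNTIF condition flags a cell when its value occurs more than once in the range. A plain-Java sketch of the same test, using the sample values from the method above, could look like this. The class name `DuplicateCheck` is an assumption for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical check: mirrors COUNTIF(range, value) > 1 for every value in the range.
final class DuplicateCheck {

    static List<Boolean> flagDuplicates(List<Integer> values) {
        // First pass: count how often each value occurs
        Map<Integer, Integer> counts = new HashMap<>();
        for (Integer v : values) {
            counts.merge(v, 1, Integer::sum);
        }
        // Second pass: a value is highlighted when its count exceeds one
        List<Boolean> flags = new ArrayList<>();
        for (Integer v : values) {
            flags.add(counts.get(v) > 1);
        }
        return flags;
    }

    public static void main(String[] args) {
        System.out.println(flagDuplicates(List.of(4, 3, 6, 3, 5, 8, 0, 2, 8, 6)));
    }
}
```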

6.3. Alternate Color Rows in Different Colors

A simple code to color each alternate row in a different color.

static void shadeAlt(Sheet sheet) {
    SheetConditionalFormatting sheetCF = sheet.getSheetConditionalFormatting();
 
    // Condition 1: Formula Is   =MOD(ROW(),2)   (Light Green Fill)
    ConditionalFormattingRule rule1 = sheetCF.createConditionalFormattingRule("MOD(ROW(),2)");
    PatternFormatting fill1 = rule1.createPatternFormatting();
    fill1.setFillBackgroundColor(IndexedColors.LIGHT_GREEN.index);
    fill1.setFillPattern(PatternFormatting.SOLID_FOREGROUND);
 
    CellRangeAddress[] regions = {
            CellRangeAddress.valueOf("A1:Z100")
    };
 
    sheetCF.addConditionalFormatting(regions, rule1);
 
    sheet.createRow(0).createCell(1).setCellValue("Shade Alternating Rows");
    sheet.createRow(1).createCell(1).setCellValue("Condition: Formula Is  =MOD(ROW(),2)   (Light Green Fill)");
}
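
The rule relies on Excel treating a non-zero formula result as true and on ROW() being 1-based, so odd-numbered rows get the fill. A tiny sketch of that logic, with a hypothetical class name:

```java
// Hypothetical check: mirrors MOD(ROW(),2), where a non-zero result counts as TRUE.
final class AltRowCheck {

    // Excel row numbers start at 1, so rows 1, 3, 5, ... are shaded
    static boolean isShaded(int oneBasedRow) {
        return oneBasedRow % 2 != 0;
    }

    public static void main(String[] args) {
        for (int row = 1; row <= 4; row++) {
            System.out.println("row " + row + " shaded: " + isShaded(row));
        }
    }
}
```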

6.4. Color amounts that are going to expire in the next 30 days

A handy code for financial projects which keeps track of deadlines.

static void expiryInNext30Days(Sheet sheet) 
{
    CellStyle style = sheet.getWorkbook().createCellStyle();
    style.setDataFormat((short)BuiltinFormats.getBuiltinFormat("d-mmm"));
 
    sheet.createRow(0).createCell(0).setCellValue("Date");
    sheet.createRow(1).createCell(0).setCellFormula("TODAY()+29");
    sheet.createRow(2).createCell(0).setCellFormula("A2+1");
    sheet.createRow(3).createCell(0).setCellFormula("A3+1");
 
    for(int rownum = 1; rownum <= 3; rownum++) sheet.getRow(rownum).getCell(0).setCellStyle(style);
 
    SheetConditionalFormatting sheetCF = sheet.getSheetConditionalFormatting();
 
    // Condition 1: Formula Is   =AND(A2-TODAY()>=0,A2-TODAY()<=30)   (Blue Font)
    ConditionalFormattingRule rule1 = sheetCF.createConditionalFormattingRule("AND(A2-TODAY()>=0,A2-TODAY()<=30)");
    FontFormatting font = rule1.createFontFormatting();
    font.setFontStyle(false, true);
    font.setFontColorIndex(IndexedColors.BLUE.index);
 
    CellRangeAddress[] regions = {
            CellRangeAddress.valueOf("A2:A4")
    };
 
    sheetCF.addConditionalFormatting(regions, rule1);
 
    sheet.getRow(0).createCell(1).setCellValue("Dates within the next 30 days are highlighted");
}
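
The AND(...) condition is true for dates from today up to and including 30 days ahead. A sketch of the same test with java.time follows; the class and method names are illustrative, and today's date is passed in explicitly so the logic can be checked.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;

// Hypothetical check: mirrors AND(A2-TODAY()>=0, A2-TODAY()<=30).
final class ExpiryCheck {

    static boolean expiresWithin30Days(LocalDate date, LocalDate today) {
        long days = ChronoUnit.DAYS.between(today, date);
        return days >= 0 && days <= 30;
    }

    public static void main(String[] args) {
        LocalDate today = LocalDate.now();
        // The sample sheet holds TODAY()+29, +30 and +31
        for (int offset = 29; offset <= 31; offset++) {
            System.out.println("+" + offset + " days: "
                    + expiresWithin30Days(today.plusDays(offset), today));
        }
    }
}
```

For the three sample cells this means the first two dates are highlighted and the third (31 days ahead) is not.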

I am ending this apache poi tutorial here to keep the post within a limit.

7. Conclusion

In this tutorial, we learned to read excel, write excel, set and evaluate formula cells, and format the cells with color codings using the Apache POI library.

Happy Learning !!

Source Code on Github

Today we’re going to show how to read an XML file and convert its entries to rows in an Excel file.

The XML file is located at https://github.com/jbaysolutions/xml-to-excel/blob/master/Publication1.xml?raw=true.

The XML file’s main nodes are «Substances», each one has a few properties «Name», «entry_force», «directive» and a list of «Product». We’re going to create an excel row for each Product. Each row will also have the Product parent Substance details.

Below is a sample of the XML structure:

<?xml version="1.0" encoding="UTF-8"?>
<Pesticides>
<Header>
    <Creation_Date>09/07/2015 13:45</Creation_Date>
</Header>
<Substances>
    <Name>Garlic extract (++)</Name>
    <entry_force>01/09/2008</entry_force>
    <directive>Reg. (EC) No 839/2008</directive>
    <Product>
        <Product_name>FRUITS, FRESH or FROZEN; TREE NUTS</Product_name>
        <Product_code>0100000</Product_code>
        <MRL/>
        <ApplicationDate>01/09/2008</ApplicationDate>
    </Product>
    <Product>
        <Product_name>Oranges (Bergamots, Bitter oranges/sour oranges, Blood oranges, Cara caras, Chinottos,
            Trifoliate oranges, Other hybrids of Citrus sinensis, not elsewhere mentioned,)
        </Product_name>
        <Product_code>0110020</Product_code>
        <MRL/>
        <ApplicationDate>01/09/2008</ApplicationDate>
    </Product>
    <Product>
        <Product_name>Lemons (Buddha's hands/Buddha's fingers, Citrons,)</Product_name>
        <Product_code>0110030</Product_code>
        <MRL/>
        <ApplicationDate>01/09/2008</ApplicationDate>
    </Product>
    <Product>
        <Product_name>Limes (Indian sweet limes/Palestine sweet limes, Kaffir limes, Sweet limes/mosambis, Tahiti
            limes,)
        </Product_name>
        <Product_code>0110040</Product_code>
        <MRL/>
        <ApplicationDate>01/09/2008</ApplicationDate>
    </Product>
</Substances>
<Substances>
(...)
</Substances>

As usual, we use Apache POI to create the Excel file.

You can get the sample project used in this post at GitHub.

Downloading the file

We start by downloading the file from its original URL location:

File xmlFile = File.createTempFile("substances", "tmp");
String xmlFileUrl = "http://ec.europa.eu/food/plant/pesticides/eu-pesticides-database/public/?event=Execute.DownLoadXML&id=1";
URL url = new URL(xmlFileUrl);
System.out.println("downloading file from " + xmlFileUrl + " ...");
FileUtils.copyURLToFile(url, xmlFile);
System.out.println("downloading finished, parsing...");

Preparing the Excel file

To create the Excel file where we’re writing, we start by creating a new workbook, an empty sheet and writing the first line with the column headers:

workbook = new XSSFWorkbook();

CellStyle style = workbook.createCellStyle();
Font boldFont = workbook.createFont();
boldFont.setBold(true);
style.setFont(boldFont);
style.setAlignment(CellStyle.ALIGN_CENTER);

Sheet sheet = workbook.createSheet();
rowNum = 0;
Row row = sheet.createRow(rowNum++);
Cell cell = row.createCell(SUBSTANCE_NAME_COLUMN);
cell.setCellValue("Substance name");
cell.setCellStyle(style);

cell = row.createCell(SUBSTANCE_ENTRY_FORCE_COLUMN);
cell.setCellValue("Substance entry_force");
cell.setCellStyle(style);

cell = row.createCell(SUBSTANCE_DIRECTIVE_COLUMN);
cell.setCellValue("Substance directive");
cell.setCellStyle(style);

cell = row.createCell(PRODUCT_NAME_COLUMN);
cell.setCellValue("Product name");
cell.setCellStyle(style);

cell = row.createCell(PRODUCT_CODE_COLUMN);
cell.setCellValue("Product code");
cell.setCellStyle(style);

cell = row.createCell(PRODUCT_MRL_COLUMN);
cell.setCellValue("MRL");
cell.setCellStyle(style);

cell = row.createCell(APPLICATION_DATE_COLUMN);
cell.setCellValue("Application Date");
cell.setCellStyle(style);

Parsing

For this sample, the XML file is parsed using DOM.

We get the reference to the excel file sheet:

Sheet sheet = workbook.getSheetAt(0);

We start by loading the XML document using DOM and getting the Substances node list:

DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(xmlFile);

NodeList nList = doc.getElementsByTagName("Substances");

Then we iterate through the Substances list and get the Substance properties:

for (int i = 0; i < nList.getLength(); i++) {
    System.out.println("Processing element " + (i+1) + "/" + nList.getLength());
    Node node = nList.item(i);
    if (node.getNodeType() == Node.ELEMENT_NODE) {
        Element element = (Element) node;
        String substanceName = element.getElementsByTagName("Name").item(0).getTextContent();
        String entryForce = element.getElementsByTagName("entry_force").item(0).getTextContent();
        String directive = element.getElementsByTagName("directive").item(0).getTextContent();

        NodeList prods = element.getElementsByTagName("Product");

When we get to the Product elements, we get them as a NodeList and iterate over it to get each product's details:

for (int j = 0; j < prods.getLength(); j++) {
    Node prod = prods.item(j);
    if (prod.getNodeType() == Node.ELEMENT_NODE) {
        Element product = (Element) prod;
        String prodName = product.getElementsByTagName("Product_name").item(0).getTextContent();
        String prodCode = product.getElementsByTagName("Product_code").item(0).getTextContent();
        String lmr = product.getElementsByTagName("MRL").item(0).getTextContent();
        String applicationDate = product.getElementsByTagName("ApplicationDate").item(0).getTextContent();

Now that we have all the details we want to write on the excel file, we create a row with all the details:

Row row = sheet.createRow(rowNum++);
Cell cell = row.createCell(SUBSTANCE_NAME_COLUMN);
cell.setCellValue(substanceName);

cell = row.createCell(SUBSTANCE_ENTRY_FORCE_COLUMN);
cell.setCellValue(entryForce);

cell = row.createCell(SUBSTANCE_DIRECTIVE_COLUMN);
cell.setCellValue(directive);

cell = row.createCell(PRODUCT_NAME_COLUMN);
cell.setCellValue(prodName);

cell = row.createCell(PRODUCT_CODE_COLUMN);
cell.setCellValue(prodCode);

cell = row.createCell(PRODUCT_MRL_COLUMN);
cell.setCellValue(lmr);

cell = row.createCell(APPLICATION_DATE_COLUMN);
cell.setCellValue(applicationDate);
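
The parsing fragments above are shown piece by piece, so here is a self-contained sketch of just the DOM part that flattens each Product into one row together with its parent Substance details. It uses only the JDK and returns string arrays instead of writing POI cells. The class name `SubstanceRows` and the helper `text` are assumptions for illustration, not part of the sample project.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper: one String[] per Product, prefixed with its Substance details.
final class SubstanceRows {

    // Text of the first matching descendant, or "" when the tag is missing or empty
    static String text(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? "" : nodes.item(0).getTextContent().trim();
    }

    static List<String[]> flatten(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            List<String[]> rows = new ArrayList<>();
            NodeList substances = doc.getElementsByTagName("Substances");
            for (int i = 0; i < substances.getLength(); i++) {
                // getElementsByTagName only returns elements, so the cast is safe
                Element substance = (Element) substances.item(i);
                String name = text(substance, "Name");
                String entryForce = text(substance, "entry_force");
                String directive = text(substance, "directive");
                NodeList products = substance.getElementsByTagName("Product");
                for (int j = 0; j < products.getLength(); j++) {
                    Element product = (Element) products.item(j);
                    rows.add(new String[] {
                            name, entryForce, directive,
                            text(product, "Product_name"),
                            text(product, "Product_code"),
                            text(product, "MRL"),
                            text(product, "ApplicationDate")});
                }
            }
            return rows;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<Pesticides><Substances><Name>Garlic extract (++)</Name>"
                + "<entry_force>01/09/2008</entry_force><directive>Reg. (EC) No 839/2008</directive>"
                + "<Product><Product_name>Lemons</Product_name><Product_code>0110030</Product_code>"
                + "<MRL/><ApplicationDate>01/09/2008</ApplicationDate></Product>"
                + "</Substances></Pesticides>";
        for (String[] row : flatten(xml)) {
            System.out.println(String.join(" | ", row));
        }
    }
}
```

Each returned array lines up with the seven columns of the header row, so writing it out would be a loop of createCell/setCellValue calls like the ones shown above.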

When all the elements are written, we write the excel to the filesystem:

FileOutputStream fileOut = new FileOutputStream("C:/Temp/Excel-Out.xlsx");
workbook.write(fileOut);
workbook.close();
fileOut.close();

Finally, we delete the downloaded XML file:

if (xmlFile.exists()) {
    System.out.println("delete file-> " + xmlFile.getAbsolutePath());
    if (!xmlFile.delete()) {
        System.out.println("file '" + xmlFile.getAbsolutePath() + "' was not deleted!");
    }
}

Conclusion

The sample project used in this post at GitHub has a main class XmlToExcelConverter to download, parse the file and create the excel file.

Feel free to copy and adapt the code to read other XML files!
Hope it helped anyone having the same issues as us!


By the author of the blog javarevisited.blogspot.ru

This article shows how to write data to and read data from Excel files in Java (both the XLS and XLSX formats are covered). We will use the Apache POI library and focus on the String and Date types; working with the latter is rather tricky. Working with numbers was already covered in another article.

The poi-XX.jar library can be used for all the old Microsoft Office file formats (xls, doc, ppt); for the new ones (xlsx, docx, pptx) you will need poi-ooxml-XX.jar. It is important to understand which is which, because the classes differ as well: for the old extensions it is HSSFWorkbook, and for the new ones it is XSSFWorkbook.

Preparation: downloading libraries and dependencies

There are, of course, quite a few open-source libraries for working with Excel files in Java, for example JXL, but we will use the most popular one with the most extensive API: Apache POI. To use it, you can either download the jar files and add them manually through Eclipse, or leave that to Maven.

In the second case you only need to add the following two dependencies:

<dependencies>
    <dependency>
        <groupId>org.apache.poi</groupId>
        <artifactId>poi</artifactId>
        <version>3.12</version>
    </dependency>
    <dependency>
        <groupId>org.apache.poi</groupId>
        <artifactId>poi-ooxml</artifactId>
        <version>3.12</version>
    </dependency>
  </dependencies>

The most convenient thing about Maven is that it downloads not only the poi.jar and poi-ooxml.jar you asked for, but also all the jar files they use internally, that is, xmlbeans-2.6.0.jar, stax-api-1.0.1.jar, poi-ooxml-schemas-3.12.jar and commons-codec-1.9.jar.

If you add the libraries manually, do not forget the files listed above. You can download everything from here. Remember: if you download only poi-XX.jar, your code will compile without errors but will later fail with java.lang.NoClassDefFoundError: org/apache/xmlbeans/XmlObject, because xmlbeans.jar is called internally.

Writing

In this example we write the following data to an xls file: a string with a name in the first cell and a date of birth in the second. Here are the steps:

  • Create an HSSFWorkbook object;
  • Create a sheet by calling createSheet() on the object from the previous step;
  • Create a row on the sheet with createRow();
  • Create a cell in the row with createCell();
  • Set the cell value with setCellValue();
  • Write the workbook to a File through a FileOutputStream;
  • Close the workbook by calling close().

That is enough for writing strings or numbers, but to write a date we need a few more steps:

  • Create a DataFormat;
  • Create a CellStyle;
  • Set the DataFormat on the CellStyle;
  • Set the CellStyle on the cell;
  • Now a Date object can be written to this cell through the same setCellValue();
  • To make the date fit in the cell, let the column resize automatically: sheet.autoSizeColumn(1).

Put together, it looks like this:

@SuppressWarnings("deprecation")
    public static void writeIntoExcel(String file) throws FileNotFoundException, IOException{
        Workbook book = new HSSFWorkbook();
        Sheet sheet = book.createSheet("Birthdays");

        // Row numbering starts at zero
        Row row = sheet.createRow(0); 
        
        // We write a name and a date into two columns:
        // the name is a String and the birth date is a Date
        // formatted as dd.mm.yyyy
        Cell name = row.createCell(0);
        name.setCellValue("John");
        
        Cell birthdate = row.createCell(1);
        
        DataFormat format = book.createDataFormat();
        CellStyle dateStyle = book.createCellStyle();
        dateStyle.setDataFormat(format.getFormat("dd.mm.yyyy"));
        birthdate.setCellStyle(dateStyle);
        
 
        // Years are counted from 1900 and months from 0, so this is 10 November 2010
        birthdate.setCellValue(new Date(110, 10, 10));
        
        // Resize the column to fit its content
        sheet.autoSizeColumn(1);
        
        // Write everything to the file; try-with-resources closes the stream
        try (FileOutputStream out = new FileOutputStream(file)) {
            book.write(out);
        }
        book.close();
    }

Reading

Now let's read back what we have just written to the file.

  • First create an HSSFWorkbook, passing a FileInputStream to the constructor;
  • Get the sheet by passing its index to getSheetAt() or its name to getSheet();
  • Get the row with getRow();
  • Get the cell with getCell();
  • Find out the cell type by calling getCellType() on it;
  • Depending on the cell type, read its value with getStringCellValue(), getNumericCellValue() or getDateCellValue();
  • Close the workbook with close().

Recall that Excel stores dates as numbers, so the cell type will still be CELL_TYPE_NUMERIC.

In code it looks like this:

public static void readFromExcel(String file) throws IOException{
        HSSFWorkbook myExcelBook = new HSSFWorkbook(new FileInputStream(file));
        HSSFSheet myExcelSheet = myExcelBook.getSheet("Birthdays");
        HSSFRow row = myExcelSheet.getRow(0);
        
        if(row.getCell(0).getCellType() == HSSFCell.CELL_TYPE_STRING){
            String name = row.getCell(0).getStringCellValue();
            System.out.println("name : " + name);
        }
        
        if(row.getCell(1).getCellType() == HSSFCell.CELL_TYPE_NUMERIC){
            Date birthdate = row.getCell(1).getDateCellValue();
            System.out.println("birthdate :" + birthdate);
        }
        
        myExcelBook.close();
        
    }
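
To illustrate the point that a date cell is just a number, here is a JDK-only sketch that converts between a calendar date and the serial day number Excel uses in its default 1900 date system. The class name `ExcelDateSerial` is an assumption for illustration, and the fixed epoch below is only valid for dates from 1 March 1900 onward. When reading real workbooks, POI's DateUtil.isCellDateFormatted(cell) can be used to tell date cells apart from ordinary numeric cells.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;

// Hypothetical helper: date <-> Excel serial day number (1900 date system).
final class ExcelDateSerial {

    // Effective day zero for dates on or after 1 March 1900
    private static final LocalDate EPOCH = LocalDate.of(1899, 12, 30);

    static long toSerial(LocalDate date) {
        return ChronoUnit.DAYS.between(EPOCH, date);
    }

    static LocalDate fromSerial(long serial) {
        return EPOCH.plusDays(serial);
    }

    public static void main(String[] args) {
        // The birth date written in the example above: 10 November 2010
        System.out.println(toSerial(LocalDate.of(2010, 11, 10)));
    }
}
```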

In conclusion

As mentioned above, reading from xlsx files is not fundamentally different: you only need to use XSSFWorkbook, XSSFSheet, XSSFRow (and so on) from poi-ooxml-XX.jar instead of HSSFWorkbook, HSSFSheet, HSSFRow from poi-XX.jar. That is all you need to know to read and write Excel files. Of course, the Apache POI library can do much more, but this article should help you get started with it faster.

Translation of the article "How to Read Write Excel file in Java - POI Example"

Working with Excel in Java via Apache POI

This article covers writing and reading data in Excel files from Java (both the XLS and the XLSX format). We will use the Apache POI library and focus on the String and Date types; working with the latter is rather tricky. As a reminder, working with numbers was already covered in another article.

You can use poi-XX.jar for all the old Microsoft Office files (xls, doc, ppt); for the new ones (xlsx, docx, pptx) you will need poi-ooxml-XX.jar. It is very important to understand which is which, because the classes differ as well: HSSFWorkbook for the old extensions and XSSFWorkbook for the new ones.

Preparation: downloading libraries and dependencies

There are of course quite a few open-source libraries for working with Excel files in Java, such as JXL, but we will use Apache POI, which has the most extensive API and is the most popular. To use it, either download the jar files and add them manually through Eclipse, or leave this to Maven.

In the second case, you only need to add two dependencies, poi and poi-ooxml.

The most convenient thing about Maven is that it downloads not only the specified poi.jar and poi-ooxml.jar but also every jar file they use internally, namely xmlbeans-2.6.0.jar, stax-api-1.0.1.jar, poi-ooxml-schemas-3.12.jar and commons-codec-1.9.jar.

If you add the libraries manually, do not forget the files listed above. You can download everything from here. Remember: if you load only poi-XX.jar, your code will compile without errors but then fail with java.lang.NoClassDefFoundError: org/apache/xmlbeans/XmlObject, because xmlbeans.jar is called internally.

Writing

In this example we will write the following data to an xls file: a string with a name in the first cell and a date of birth in the second. Here are the steps:

  • Create an HSSFWorkbook object;
  • Create a sheet by calling createSheet() on the object from the previous step;
  • Create a row on the sheet with createRow();
  • Create a cell in the row with createCell();
  • Set the cell value through setCellValue();
  • Write the workbook to a File through a FileOutputStream;
  • Close the workbook by calling close().

This is enough for writing strings or numbers, but writing a date takes a few more steps:

  • Create a DateFormat;
  • Create a CellStyle;
  • Set the DateFormat on the CellStyle;
  • Set the CellStyle on the cell;
  • Now a Date object can be written to this cell through the same setCellValue;
  • To make the date fit into the cell, make the column resize automatically: sheet.autoSizeColumn(1).

Put together, it looks like this:
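
The write steps above can be sketched roughly as follows. This is an illustration rather than the article's original listing: the class name, the "dd.mm.yyyy" format string and the wrapping of IOException are assumptions, while the sheet name "Birthdays" matches the one used by readFromExcel. It relies on Workbook being closeable, which the close() step above implies.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Date;

import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.CellStyle;
import org.apache.poi.ss.usermodel.DataFormat;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;

final class BirthdayWriter {
    static void writeIntoExcel(String file, String name, Date birthdate) {
        // try-with-resources closes the workbook and the output stream
        try (Workbook book = new HSSFWorkbook();
             FileOutputStream out = new FileOutputStream(file)) {
            Sheet sheet = book.createSheet("Birthdays");
            Row row = sheet.createRow(0);

            // A plain string needs no extra formatting
            row.createCell(0).setCellValue(name);

            // A date needs a cell style with a date format, otherwise Excel shows a number
            DataFormat format = book.createDataFormat();
            CellStyle dateStyle = book.createCellStyle();
            dateStyle.setDataFormat(format.getFormat("dd.mm.yyyy"));
            Cell birthdateCell = row.createCell(1);
            birthdateCell.setCellStyle(dateStyle);
            birthdateCell.setCellValue(birthdate);

            book.write(out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The sheet.autoSizeColumn(1) step is left out so that the sketch also runs on machines without fonts installed; add it before book.write(out) if the column should be resized.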

Reading

Now we will read back what we wrote from the file we have just created.

  • First, create an HSSFWorkbook, passing a FileInputStream to the constructor;
  • Get the sheet by passing its index to getSheetAt() or its name to getSheet();
  • Get the row with getRow();
  • Get the cell with getCell();
  • Determine the cell type by calling getCellType() on it;
  • Depending on the cell type, read its value with getStringCellValue(), getNumericCellValue() or getDateCellValue();
  • Close the workbook with close().

Remember that Excel stores dates as numbers, so the type of a date cell will still be CELL_TYPE_NUMERIC.

The readFromExcel method shown earlier implements these steps.

Apache POI – Read and Write Excel File in Java

Last Updated: October 1, 2022

Learn to read excel, write excel, evaluate formula cells and apply custom formatting to the generated excel files using Apache POI library with examples.

If we are building software for the HR or Finance domain, there is usually a requirement for generating excel reports across management levels. Apart from reports, we can also expect some input data for the applications coming in the form of excel sheets and the application is expected to support this requirement.

Apache POI is a well-trusted library, among many other open-source libraries, for handling such use cases involving Excel files. Note that we can also read and write MS Word and MS PowerPoint files with the Apache POI library.

This Apache POI tutorial will discuss some everyday excel operations in real-life applications.

1. Maven Dependency

If we are working on a maven project, we can include the Apache POI dependencies in pom.xml file using this:

2. Important Classes in POI Library

HSSF, XSSF and SXSSF classes

Apache POI main classes usually start with either HSSF, XSSF or SXSSF.

  • HSSF – is the POI Project’s pure Java implementation of the Excel 97(-2007) file format. e.g., HSSFWorkbook, HSSFSheet.
  • XSSF – is the POI Project’s pure Java implementation of the Excel 2007 OOXML (.xlsx) file format. e.g., XSSFWorkbook, XSSFSheet.
  • SXSSF (since 3.8-beta3) – is an API-compatible streaming extension of XSSF to be used when huge spreadsheets have to be produced and heap space is limited. e.g., SXSSFWorkbook, SXSSFSheet. SXSSF achieves its low memory footprint by limiting access to the rows within a sliding window, while XSSF gives access to all rows in the document.

Row and Cell

Apart from the above classes, Row and Cell are used to interact with a particular row and a particular cell in an Excel sheet.

FormulaEvaluator

Another helpful class FormulaEvaluator is used to evaluate the formula cells in an excel sheet.

3. Writing an Excel File

I am taking this example first so we can reuse the excel sheet created by this code in further examples.

Writing excel using POI is very simple and involves the following steps:

  1. Create a workbook
  2. Create a sheet in workbook
  3. Create a row in sheet
  4. Add cells to the row
  5. Repeat steps 3 and 4 to write more data

It seems very simple, right? Let’s have a look at the code doing these steps.

Java program to write an excel file using Apache POI library.

4. Reading an Excel File

Reading an excel file using POI is also very simple if we divide this into steps.

  1. Create workbook instance from an excel sheet
  2. Get to the desired sheet
  3. Increment row number
  4. iterate over all cells in a row
  5. repeat steps 3 and 4 until all data is read

Let’s see all the above steps in code. I am writing the code to read the excel file created in the above example. It will read all the column names and the values in it – cell by cell.

Java program to read an excel file using Apache POI library.
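
The read steps can be sketched as below. To stay self-contained, the sketch first builds a small workbook in memory following the write steps from section 3 and then reads it back. The class name, the sheet name "Data" and the choice of DataFormatter to turn every cell into text are assumptions made for illustration, not the tutorial's original listing.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.DataFormatter;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

final class ExcelRoundTrip {
    /** Write steps: workbook, sheet, one row per data row, one cell per value. */
    static byte[] write(Object[][] data) {
        try (Workbook workbook = new XSSFWorkbook();
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            Sheet sheet = workbook.createSheet("Data");
            for (int r = 0; r < data.length; r++) {
                Row row = sheet.createRow(r);
                for (int c = 0; c < data[r].length; c++) {
                    Cell cell = row.createCell(c);
                    Object value = data[r][c];
                    if (value instanceof Number) {
                        cell.setCellValue(((Number) value).doubleValue());
                    } else {
                        cell.setCellValue(String.valueOf(value));
                    }
                }
            }
            workbook.write(out);
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Read steps: open the workbook, take the first sheet, iterate rows and cells. */
    static List<List<String>> read(byte[] xlsx) {
        DataFormatter formatter = new DataFormatter();
        List<List<String>> rows = new ArrayList<>();
        try (Workbook workbook = new XSSFWorkbook(new ByteArrayInputStream(xlsx))) {
            Sheet sheet = workbook.getSheetAt(0);
            for (Row row : sheet) {
                List<String> values = new ArrayList<>();
                for (Cell cell : row) {
                    // Formats the cell the way Excel would display it, whatever its type
                    values.add(formatter.formatCellValue(cell));
                }
                rows.add(values);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return rows;
    }
}
```

Using DataFormatter avoids branching on the cell type, which is convenient when the goal is simply to dump every column name and value as text.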

5. Add and Evaluate Formula Cells

When working on complex excel sheets, we encounter many cells with formulas to calculate their values. These are formula cells. Apache POI also has excellent support for adding formula cells and evaluating already present formula cells.

Let’s see an example of how to add formula cells in Excel.

The sheet has four cells in a row, and the fourth one is the product of the previous three cells. So the formula will be: A2*B2*C2 (in the second row)

Java program to add formula in an excel file using Apache POI library.

Similarly, we want to read a file with formula cells and use the following logic to evaluate formula cells.

Java program to evaluate formula in an excel file using Apache POI library.
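
Both formula steps can be sketched in one small in-memory example: it sets the A2*B2*C2 formula described above on the fourth cell of the second row and then evaluates it with a FormulaEvaluator. The class name, method name and sheet name are assumptions made for illustration, and nothing is written to disk.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.CellValue;
import org.apache.poi.ss.usermodel.FormulaEvaluator;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

final class FormulaSketch {
    static double multiplyRow(double a, double b, double c) {
        try (Workbook workbook = new XSSFWorkbook()) {
            Sheet sheet = workbook.createSheet("Calculation");
            sheet.createRow(0).createCell(0).setCellValue("Header row");

            // Row index 1 is row 2 in Excel, so the inputs land in A2, B2 and C2
            Row row = sheet.createRow(1);
            row.createCell(0).setCellValue(a);
            row.createCell(1).setCellValue(b);
            row.createCell(2).setCellValue(c);

            // The formula text is given without the leading '='
            Cell product = row.createCell(3);
            product.setCellFormula("A2*B2*C2");

            FormulaEvaluator evaluator = workbook.getCreationHelper().createFormulaEvaluator();
            CellValue value = evaluator.evaluate(product);
            return value.getNumberValue();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

For example, multiplyRow(2, 3, 4) evaluates the formula to 24.0. When reading an existing file, the same evaluator call can be applied to any cell that holds a formula.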

6. Formatting the Cells

So far we have seen examples of reading and writing Excel files using Apache POI. But when creating a report in an Excel file, it is often essential to format the cells that match some pre-determined criteria.

This formatting can be a different coloring based on a specific value range, expiry date limit etc.

In the below examples, we are taking a couple of such cell formatting examples for various purposes.

6.1. Cell value in a specific range

This code will color any cell in a cell range whose value falls between two configured bounds [e.g., between 50 and 70].

6.2. Highlight Duplicate Values

Highlight all cells which have duplicate values in observed cells.

6.3. Alternate Color Rows in Different Colors

A simple code to color each alternate row in a different color.

6.4. Color amounts that are going to expire in the next 30 days

A handy code for financial projects which keeps track of deadlines.

I am ending this apache poi tutorial here to keep the post within a limit.

7. Conclusion

In this tutorial, we learned to read excel, write excel, set and evaluate formula cells, and format the cells with color codings using the Apache POI library.

Web technologies

Excel is a human readable and writable format, and XML is an important machine language. We need an efficient bridge between those two technologies.

Since 2007, Excel files have been zip archives containing XML data. Those XML files must be serialized into a minimal XML document that holds the cell data inside tags named after the column headers. The resulting XML will look like this:

Here is the code to do it.

Read Excel file

In this post, I only cover the recent XLSX format. I will use Apache POI, and I want it to handle very big files, so I have to use an event API.

The POI event API only handles the old XLS format. For XLSX, you have to use the SAX API, which only gives you access to a stream of the XML of the sheets. Reading the tags must be coded in your own SAX handler; the given example is very useful but not complete.

The code in order to open a file and trigger the SAX reading is :

Write XML

As I said, I want the serializer to be fast, so I will not use an XML API to write the XML output. With a few lines of code, I write the content as XML with a namespace for reusability.

Here is the code :
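
The post's own serializer is not reproduced here. As a rough illustration of the idea, here is a minimal sketch that writes rows as XML with a StringBuilder instead of an XML API. The class name, the row element and the namespace URI urn:example:xlsx2xml are hypothetical, and the sheet name and headers are assumed to be valid XML names already.

```java
import java.util.List;

final class SimpleXmlWriter {
    // Hypothetical namespace, used only for this illustration
    private static final String NAMESPACE = "urn:example:xlsx2xml";

    /** Escapes the characters that are not allowed as-is in XML text content. */
    static String escape(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char ch = text.charAt(i);
            switch (ch) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                default: sb.append(ch);
            }
        }
        return sb.toString();
    }

    /** Writes one element per sheet, one row element per row, one tag per column header. */
    static String toXml(String sheetName, List<String> headers, List<List<String>> rows) {
        StringBuilder xml = new StringBuilder("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
        xml.append('<').append(sheetName).append(" xmlns=\"").append(NAMESPACE).append("\">\n");
        for (List<String> row : rows) {
            xml.append("  <row>\n");
            for (int i = 0; i < headers.size() && i < row.size(); i++) {
                String tag = headers.get(i);
                xml.append("    <").append(tag).append('>')
                   .append(escape(row.get(i)))
                   .append("</").append(tag).append(">\n");
            }
            xml.append("  </row>\n");
        }
        xml.append("</").append(sheetName).append(">\n");
        return xml.toString();
    }
}
```

Writing strings directly is fast, but it makes the caller responsible for escaping values and for supplying valid tag names, which is why the names are normalized first.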

Normalizing sheet names and column headers

To avoid issues with XML tag names, I normalize the sheet names and the column headers I find. The normalization removes accents, normalizes spaces, and omits any character that is not a letter, ‘_’ or a digit.

Here is the normalizing code :
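
The original normalizing code is not reproduced here; the following is a minimal sketch of the rules just described, using java.text.Normalizer from the JDK. Two details are assumptions rather than the post's behavior: runs of whitespace become a single underscore, and a leading underscore is added when the result would not start with a letter, since XML names cannot start with a digit.

```java
import java.text.Normalizer;

final class NameNormalizer {
    static String normalize(String raw) {
        // Decompose accented characters, then drop the combining marks
        String name = Normalizer.normalize(raw, Normalizer.Form.NFD).replaceAll("\\p{M}+", "");
        // Collapse whitespace runs into one underscore (assumption)
        name = name.trim().replaceAll("\\s+", "_");
        // Keep only ASCII letters, digits and '_'
        name = name.replaceAll("[^A-Za-z0-9_]", "");
        // Make sure the result can start an XML name (assumption)
        if (name.isEmpty() || (!Character.isLetter(name.charAt(0)) && name.charAt(0) != '_')) {
            name = "_" + name;
        }
        return name;
    }
}
```

With these rules a header such as "Prénom  de l'élève" becomes Prenom_de_leleve, and "2023 total" becomes _2023_total.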

Quick overview of XLSX format

The contents of the cells are inside the tag of the XML files of the sheets.
In the files I use, I saw 3 different structures:

    • one with the attribute t="inlineStr":
    • a structure with the value in clear, with the attribute t="n":
    • and a last case, which uses the “shared strings”. In that case the value is the index of the String in the list of shared strings:

I did not see any tag , as is expected in the example from Apache POI. And in the case of the attribute t="inlineStr", the value is in a tag instead of , as expected in the example.
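
The three cases can be handled in a single SAX handler. The sketch below uses only the JDK's SAX parser, so it runs on a sheet XML fragment without POI. The element names used here (c for a cell, v for a value, t for inline text) are an assumption based on the SpreadsheetML format, since the tag names did not survive in the text above, and the class name is made up.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

final class CellValueHandler extends DefaultHandler {
    private final List<String> sharedStrings;
    private final List<String> values = new ArrayList<>();
    private final StringBuilder text = new StringBuilder();
    private String cellType;   // value of the t attribute of the current cell
    private boolean capturing; // true while inside a v or t element

    CellValueHandler(List<String> sharedStrings) {
        this.sharedStrings = sharedStrings;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        if ("c".equals(qName)) {
            cellType = attributes.getValue("t");
        } else if ("v".equals(qName) || "t".equals(qName)) {
            text.setLength(0);
            capturing = true;
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (capturing) {
            text.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("v".equals(qName)) {
            String raw = text.toString();
            // t="s": the value is an index into the shared strings list
            values.add("s".equals(cellType) ? sharedStrings.get(Integer.parseInt(raw)) : raw);
            capturing = false;
        } else if ("t".equals(qName)) {
            // t="inlineStr": the text is stored directly in the cell
            values.add(text.toString());
            capturing = false;
        }
    }

    static List<String> parse(String sheetXml, List<String> sharedStrings) {
        CellValueHandler handler = new CellValueHandler(sharedStrings);
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                    new ByteArrayInputStream(sheetXml.getBytes(StandardCharsets.UTF_8)), handler);
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse sheet XML", e);
        }
        return handler.values;
    }
}
```

This sketch ignores formulas and rich text; an inline string split into several text runs would produce one value per run.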

Whole code

You can download the whole code here :
FastXlsx2XmlSerializer

How to convert Excel to XML using java?

I want to convert my input Excel file into an output XML file.

If anybody has a solution in Java for reading an input Excel file and writing XML as output, please share any code, URL or other solution.

5 Answers

Look into the jexcel or Apache POI libraries for reading in the Excel file.

Creating an XML file is simple, either just write the XML out to a file directly, or append to an XML Document and then write that out using the standard Java libs or Xerces or similar.
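
As a minimal sketch of the second option, the following builds an XML Document and serializes it with the standard Java libraries only. The class name and the rows/row element names are made up for illustration, and the headers are assumed to be valid XML names. The cell values would come from whichever Excel library reads the file.

```java
import java.io.StringWriter;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;

import org.w3c.dom.Document;
import org.w3c.dom.Element;

final class RowsToXml {
    /** Builds one row element per data row, with one child element per column header. */
    static String toXml(String[] headers, String[][] rows) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element root = doc.createElement("rows");
            doc.appendChild(root);
            for (String[] row : rows) {
                Element rowElement = doc.createElement("row");
                root.appendChild(rowElement);
                for (int i = 0; i < headers.length && i < row.length; i++) {
                    Element cell = doc.createElement(headers[i]);
                    cell.setTextContent(row[i]); // special characters are escaped on output
                    rowElement.appendChild(cell);
                }
            }
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not build the XML document", e);
        }
    }
}
```

Going through a Document costs more memory than writing strings directly, but escaping and well-formedness are handled by the library.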

I recently converted Excel (xlsx) to XML in Java. I treated each row in the Excel file as a single object. Here are the steps I followed:

  1. Read the Excel file using Apache POI
  2. Created an xsd file and generated the corresponding classes
  3. Read each row, created the corresponding objects and initialized their values using the generated getter/setter methods in the classes
  4. Added the objects to an ArrayList that holds only objects of the same type
  5. Marshalled the ArrayList object to an output file using JAXB

JExcel was easy for me to use. Put jxl.jar on the classpath and code something like:

Download jxl and use this code

How to read/write XML maps from/in excel with Apache POI in Java?

A little bit of context, in excel there is a tab named Developer, where you can see/add XML maps in the current workbook:

I am working with Apache POI and I want to read and also write XML maps in excel.

Do you know where I can find documentation on how to read/write XML maps in Excel using Apache POI?

1 Answer

To read XML mappings from existing workbooks there are API methods available.

There is XSSFWorkbook.getMapInfo which gets the MapInfo. And there is XSSFWorkbook.getCustomXMLMappings which gets a List of all the XSSFMap. So reading should not be the problem.

But so far there is nothing to create a new MapInfo or to put additional schemas and/or maps into that MapInfo. So to create a new workbook with XML mappings, using the low-level underlying objects is necessary.

The following complete example shows this. It provides methods to create a MapInfo and add schemas and maps to it. It uses the following class.xsd file as schema definition:

It creates a MapInfo, adds the schema and the map, and creates an XSSFMap. Then it creates an XSSFTable in the first sheet which refers to the map. This makes it possible to collect data in that table and then export it as XML.
