Read excel node js

read-excel-file

Read small to medium *.xlsx files in a browser or Node.js. Parse to JSON with a strict schema.

Demo

Also check out write-excel-file for writing simple *.xlsx files.

Install

npm install read-excel-file --save

If you’re not using a bundler then use a standalone version from a CDN.

Use

Browser

<input type="file" id="input" />
import readXlsxFile from 'read-excel-file'

// File.
const input = document.getElementById('input')
input.addEventListener('change', () => {
  readXlsxFile(input.files[0]).then((rows) => {
    // `rows` is an array of rows
    // each row being an array of cells.
  })
})

// Blob.
fetch('https://example.com/spreadsheet.xlsx')
  .then(response => response.blob())
  .then(blob => readXlsxFile(blob))
  .then((rows) => {
    // `rows` is an array of rows
    // each row being an array of cells.
  })

// ArrayBuffer.
// https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/ArrayBuffer
//
// Could be obtained from:
// * File
// * Blob
// * Base64 string
//
readXlsxFile(arrayBuffer).then((rows) => {
  // `rows` is an array of rows
  // each row being an array of cells.
})
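
The comment above lists a Base64 string as one possible source of an ArrayBuffer. A minimal sketch of that conversion, assuming an environment with a global atob (browsers and recent Node.js versions). The helper name base64ToArrayBuffer is hypothetical and not part of read-excel-file:

```javascript
// Hypothetical helper: decodes a Base64 string into an ArrayBuffer.
function base64ToArrayBuffer(base64) {
  // `atob` returns a "binary string": one character per byte.
  const binary = atob(base64)
  const bytes = new Uint8Array(binary.length)
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i)
  }
  return bytes.buffer
}

// "UEsDBA==" is Base64 for the 4 bytes that open a ZIP archive
// (an *.xlsx file is a ZIP archive).
const arrayBuffer = base64ToArrayBuffer('UEsDBA==')
console.log(arrayBuffer.byteLength) // 4
```

The resulting arrayBuffer could then be passed to readXlsxFile(arrayBuffer) as shown above; a real spreadsheet would of course be much longer than four bytes.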

Note: Internet Explorer 11 requires a Promise polyfill.

Node.js

const fs = require('fs')
const readXlsxFile = require('read-excel-file/node')

// File path.
readXlsxFile('/path/to/file').then((rows) => {
  // `rows` is an array of rows
  // each row being an array of cells.
})

// Readable Stream.
readXlsxFile(fs.createReadStream('/path/to/file')).then((rows) => {
  // `rows` is an array of rows
  // each row being an array of cells.
})

// Buffer.
readXlsxFile(Buffer.from(fs.readFileSync('/path/to/file'))).then((rows) => {
  // `rows` is an array of rows
  // each row being an array of cells.
})

Web Worker

const worker = new Worker('web-worker.js')

worker.onmessage = function(event) {
  // `event.data` is an array of rows
  // each row being an array of cells.
  console.log(event.data)
}

worker.onerror = function(event) {
  console.error(event.message)
}

const input = document.getElementById('input')

input.addEventListener('change', () => {
  worker.postMessage(input.files[0])
})

web-worker.js

import readXlsxFile from 'read-excel-file/web-worker'

onmessage = function(event) {
  readXlsxFile(event.data).then((rows) => {
    // `rows` is an array of rows
    // each row being an array of cells.
    postMessage(rows)
  })
}

JSON

To read spreadsheet data and then convert it to an array of JSON objects, pass a schema option when calling readXlsxFile(). In that case, instead of returning an array of rows of cells, it will return an object of shape { rows, errors } where rows is an array of JSON objects created from the spreadsheet data according to the schema, and errors is an array of errors encountered while converting spreadsheet data to JSON objects.

Each property of a JSON object should be described by an «entry» in the schema. The key of the entry should be the column’s title in the spreadsheet. The value of the entry should be an object with properties:

  • prop — The name of the object’s property.
  • required — (optional) Required properties can be marked as required: true.
  • validate(value) — (optional) Cell value validation function. Is only called on non-empty cells. If the cell value is invalid, it should throw an error with the error message set to the error code.
  • type — (optional) The type of the value. Defines how the cell value will be parsed. If no type is specified then the cell value is returned «as is»: as a string, number, date or boolean. A type could be a:
    • Built-in type:
      • String
      • Number
      • Boolean
      • Date
    • «Utility» type exported from the library:
      • Integer
      • Email
      • URL
    • Custom type:
      • A function that receives a cell value and returns a parsed value. If the value is invalid, it should throw an error with the error message set to the error code.
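
The validate() and custom type contracts described above can be tried out without the library at all, since both are plain functions that throw an Error whose message is the error code. In the sketch below, the column titles, property names and error codes are made up for illustration:

```javascript
// Illustrative schema; 'AGE' and 'TAGS' are hypothetical column titles.
const schema = {
  'AGE': {
    prop: 'age',
    type: Number,
    required: true,
    // Only called for non-empty cells.
    validate(value) {
      if (value < 0 || value > 150) {
        // The error message serves as the error code.
        throw new Error('out_of_range')
      }
    }
  },
  'TAGS': {
    prop: 'tags',
    // Custom type: parses "a, b, c" into ['a', 'b', 'c'].
    type(value) {
      if (typeof value !== 'string') {
        throw new Error('not_a_string')
      }
      return value.split(',').map(tag => tag.trim())
    }
  }
}

// The functions can be exercised directly, for example in a unit test.
console.log(schema['TAGS'].type('red, green'))
```

Keeping these functions small and free of side effects makes them easy to test separately from any spreadsheet.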

Sidenote: When converting cell values to object properties, by default, it skips all null values (skips all empty cells). That’s for simplicity. In some edge cases though, it may be required to keep all null values for all the empty cells. For example, that’s the case when updating data in an SQL database from an XLSX spreadsheet using Sequelize ORM library that requires a property to explicitly be null in order to clear it during an UPDATE operation. To keep all null values, pass includeNullValues: true option when calling readXlsxFile().

errors

If there were any errors while converting spreadsheet data to JSON objects, the errors property returned from the function will be a non-empty array. An element of the errors property contains properties:

  • error: string — The error code. Examples: "required", "invalid".
    • If a custom validate() function is defined and it throws a new Error(message) then the error property will be the same as the message value.
    • If a custom type() function is defined and it throws a new Error(message) then the error property will be the same as the message value.
  • reason?: string — An optional secondary error code providing more details about the error. Currently, it’s only returned for «built-in» types. Example: { error: "invalid", reason: "not_a_number" } for type: Number means that «the cell value is invalid because it’s not a number».
  • row: number — The row number in the original file. 1 means the first row, etc.
  • column: string — The column title.
  • value?: any — The cell value.
  • type?: any — The schema type for this column.
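
Because every item in errors has the shape listed above, the list is easy to post-process. A minimal sketch that groups error codes by row number, using hand-written sample errors. The helper name groupErrorsByRow is hypothetical:

```javascript
// Hypothetical helper: groups error descriptions by spreadsheet row number.
function groupErrorsByRow(errors) {
  const byRow = new Map()
  for (const { row, column, error, reason } of errors) {
    const code = reason ? `${error}:${reason}` : error
    if (!byRow.has(row)) {
      byRow.set(row, [])
    }
    byRow.get(row).push(`${column}: ${code}`)
  }
  return byRow
}

// Sample items in the documented shape.
const errors = [
  { error: 'invalid', reason: 'not_a_number', row: 2, column: 'NUMBER OF STUDENTS', value: 'ten' },
  { error: 'required', row: 2, column: 'CONTACT' },
  { error: 'required', row: 5, column: 'CONTACT' }
]

for (const [row, messages] of groupErrorsByRow(errors)) {
  console.log(`Row ${row}: ${messages.join('; ')}`)
}
```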

An example of using a schema

// An example *.xlsx document:
// -----------------------------------------------------------------------------------------
// | START DATE | NUMBER OF STUDENTS | IS FREE | COURSE TITLE |    CONTACT     |  STATUS   |
// -----------------------------------------------------------------------------------------
// | 03/24/2018 |         10         |   true  |  Chemistry   | (123) 456-7890 | SCHEDULED |
// -----------------------------------------------------------------------------------------

const schema = {
  'START DATE': {
    // JSON object property name.
    prop: 'date',
    type: Date
  },
  'NUMBER OF STUDENTS': {
    prop: 'numberOfStudents',
    type: Number,
    required: true
  },
  // Nested object example.
  // 'COURSE' here is not a real Excel file column name,
  // it can be any string — it's just for code readability.
  'COURSE': {
    // Nested object path: `row.course`
    prop: 'course',
    // Nested object schema:
    type: {
      'IS FREE': {
        prop: 'isFree',
        type: Boolean
      },
      'COURSE TITLE': {
        prop: 'title',
        type: String
      }
    }
  },
  'CONTACT': {
    prop: 'contact',
    required: true,
    // A custom `type` can be defined.
    // A `type` function only gets called for non-empty cells.
    type: (value) => {
      const number = parsePhoneNumber(value)
      if (!number) {
        throw new Error('invalid')
      }
      return number
    }
  },
  'STATUS': {
    prop: 'status',
    type: String,
    oneOf: [
      'SCHEDULED',
      'STARTED',
      'FINISHED'
    ]
  }
}

readXlsxFile(file, { schema }).then(({ rows, errors }) => {
  // `errors` list items have shape: `{ row, column, error, reason?, value?, type? }`.
  errors.length === 0

  rows === [{
    date: new Date(2018, 2, 24),
    numberOfStudents: 10,
    course: {
      isFree: true,
      title: 'Chemistry'
    },
    contact: '+11234567890',
    status: 'SCHEDULED'
  }]
})

Tips and Features

Custom type example.

{
  'COLUMN_TITLE': {
    // This function will only be called for a non-empty cell.
    type: (value) => {
      try {
        return parseValue(value)
      } catch (error) {
        console.error(error)
        throw new Error('invalid')
      }
    }
  }
}

Ignoring empty rows.

By default, it ignores any empty rows. To disable that behavior, pass ignoreEmptyRows: false option.

readXlsxFile(file, {
  schema,
  ignoreEmptyRows: false
})

How to fix spreadsheet data before schema parsing. For example, how to ignore irrelevant rows.

Sometimes, a spreadsheet doesn’t exactly have the structure required by this library’s schema parsing feature: for example, it may be missing a header row, or contain some purely presentational / irrelevant / «garbage» rows that should be removed. To fix that, one could pass an optional transformData(data) function that would modify the spreadsheet contents as required.

readXlsxFile(file, {
  schema,
  transformData(data) {
    // Either add a missing header row:
    //   return [['ID', 'NAME', ...]].concat(data)
    // or remove irrelevant rows (here: rows in which every cell is empty):
    return data.filter(row => row.filter(column => column !== null).length > 0)
  }
})

The function for converting data to JSON objects using a schema is exported from this library too, if anyone wants it.

import convertToJson from "read-excel-file/schema"

// `data` is an array of rows, each row being an array of cells.
// `schema` is a "to JSON" conversion schema (see above).
const { rows, errors } = convertToJson(data, schema)

A React component for displaying errors that occurred during schema parsing/validation.

import { parseExcelDate } from 'read-excel-file'

function ParseExcelError({ children }) {
  const { type, value, error, reason, row, column } = children

  // Error summary.
  return (
    <div>
      <code>"{error}"</code>
      {reason && ' '}
      {reason && <code>("{reason}")</code>}
      {' for value '}
      <code>{stringifyValue(value)}</code>
      {' in column '}
      <code>"{column}"</code>
      {' in row '}
      <code>{row}</code>
      {' of spreadsheet'}
    </div>
  )
}

function stringifyValue(value) {
  // Wrap strings in quotes.
  if (typeof value === 'string') {
    return '"' + value + '"'
  }
  return String(value)
}

JSON (mapping)

Same as above, but simpler: without any parsing or validation.

Sometimes, a developer might want to use some other (more advanced) solution for schema parsing and validation (like yup). If a developer passes a map option instead of a schema option to readXlsxFile(), then it would just map each data row to a JSON object without doing any parsing or validation. Cell values will remain «as is»: as a string, number, date or boolean.

// An example *.xlsx document:
// ------------------------------------------------------------
// | START DATE | NUMBER OF STUDENTS | IS FREE | COURSE TITLE |
// ------------------------------------------------------------
// | 03/24/2018 |         10         |   true  |  Chemistry   |
// ------------------------------------------------------------

const map = {
  'START DATE': 'date',
  'NUMBER OF STUDENTS': 'numberOfStudents',
  'COURSE': {
    'course': {
      'IS FREE': 'isFree',
      'COURSE TITLE': 'title'
    }
  }
}

readXlsxFile(file, { map }).then(({ rows }) => {
  rows === [{
    date: new Date(2018, 2, 24),
    numberOfStudents: 10,
    course: {
      isFree: true,
      title: 'Chemistry'
    }
  }]
})
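
To make the shape of the map option concrete, here is a simplified re-implementation of the mapping step in plain JavaScript. It illustrates the idea only and is not the library's actual code; mapRow and mapRows are hypothetical names:

```javascript
// Simplified illustration of the `map` idea; not the library's actual code.
function mapRow(header, row, map) {
  const result = {}
  for (const [key, value] of Object.entries(map)) {
    if (typeof value === 'string') {
      // `key` is a column title, `value` is the property name.
      const index = header.indexOf(key)
      if (index >= 0 && row[index] !== null) {
        result[value] = row[index]
      }
    } else {
      // Nested entry of shape `{ propertyName: nestedMap }`.
      const [property, nestedMap] = Object.entries(value)[0]
      result[property] = mapRow(header, row, nestedMap)
    }
  }
  return result
}

function mapRows(data, map) {
  // The first row holds the column titles.
  const [header, ...rows] = data
  return rows.map(row => mapRow(header, row, map))
}

const data = [
  ['START DATE', 'NUMBER OF STUDENTS', 'IS FREE', 'COURSE TITLE'],
  [new Date(2018, 2, 24), 10, true, 'Chemistry']
]

const map = {
  'START DATE': 'date',
  'NUMBER OF STUDENTS': 'numberOfStudents',
  'COURSE': {
    'course': {
      'IS FREE': 'isFree',
      'COURSE TITLE': 'title'
    }
  }
}

console.log(mapRows(data, map))
```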

Multiple Sheets

By default, it reads the first sheet in the document. If you have multiple sheets in your spreadsheet then pass either a sheet number (starting from 1) or a sheet name in the options argument.

readXlsxFile(file, { sheet: 2 }).then((data) => {
  ...
})
readXlsxFile(file, { sheet: 'Sheet1' }).then((data) => {
  ...
})

By default, options.sheet is 1.

To get the names of all sheets, use readSheetNames() function:

readSheetNames(file).then((sheetNames) => {
  // sheetNames === ['Sheet1', 'Sheet2']
})

Dates

XLSX format originally had no dedicated «date» type, so dates are in almost all cases stored simply as numbers (the count of days since 01/01/1900) along with a «format» description (like "d mmm yyyy") that instructs the spreadsheet viewer software to format the date in the cell using that certain format.

When using readXlsxFile() with a schema parameter, all schema columns having type Date are automatically parsed as dates. When using readXlsxFile() without a schema parameter, this library attempts to guess whether a cell contains a date or just a number by examining the cell’s «format»: if the «format» is one of the built-in date formats then such cells’ values are automatically parsed as dates. In other cases, when date cells use a non-built-in format (like "mm/dd/yyyy"), one can pass an explicit dateFormat parameter to instruct the library to parse numeric cells having such «format» as dates:

readXlsxFile(file, { dateFormat: 'mm/dd/yyyy' })
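
To illustrate what "dates stored as numbers" means, the sketch below converts such a serial number to a JavaScript Date by hand. It assumes the common 1900 date system and a serial number later than February 1900, and it is not the library's own parsing code; excelSerialToDate is a hypothetical helper:

```javascript
// Serial number that corresponds to 1970-01-01 in the 1900 date system.
const EXCEL_SERIAL_OF_UNIX_EPOCH = 25569
const MS_PER_DAY = 24 * 60 * 60 * 1000

// Hypothetical helper: converts an Excel serial date to a Date (in UTC).
function excelSerialToDate(serial) {
  return new Date(Math.round((serial - EXCEL_SERIAL_OF_UNIX_EPOCH) * MS_PER_DAY))
}

console.log(excelSerialToDate(43183).toISOString()) // 2018-03-24T00:00:00.000Z
```

The fractional part of a serial number, if any, encodes the time of day, which is why the helper multiplies before rounding.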

Trim

By default, it automatically trims all string values. To disable this feature, pass trim: false option.

readXlsxFile(file, { trim: false })

Transform

Sometimes, a spreadsheet doesn’t exactly have the structure required by this library’s schema parsing feature: for example, it may be missing a header row, or contain some purely presentational / empty / «garbage» rows that should be removed. To fix that, one could pass an optional transformData(data) function that would modify the spreadsheet contents as required.

readXlsxFile(file, {
  schema,
  transformData(data) {
    // Either add a missing header row:
    //   return [['ID', 'NAME', ...]].concat(data)
    // or remove empty rows:
    return data.filter(row => row.filter(column => column !== null).length > 0)
  }
})
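
Since transformData is just a function from rows to rows, it can be written and checked on its own before being passed in the options. A small sketch that does both things mentioned above; the header titles and sample rows are made up for illustration:

```javascript
// Drops rows in which every cell is empty (null),
// then prepends a header row.
function transformData(data) {
  const nonEmptyRows = data.filter(row => row.some(cell => cell !== null))
  return [['ID', 'NAME']].concat(nonEmptyRows)
}

const data = [
  [1, 'Alice'],
  [null, null],
  [2, 'Bob']
]

console.log(transformData(data))
```

Such a function would then be passed as readXlsxFile(file, { schema, transformData }).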

Limitations

Performance

There have been some reports about performance issues when reading very large *.xlsx spreadsheets using this library. It’s true that this library’s main focus has been usability and convenience, not performance when handling huge datasets. For example, parsing a file with 2000 rows / 20 columns takes about 3 seconds. So, for reading huge datasets, perhaps use something like the xlsx package instead. There are no comparative benchmarks between the two, so if you make one, share it in the Issues.

Formulas

Dynamically calculated cells using formulas (SUM, etc) are not supported.

TypeScript

I’m not a TypeScript expert, so the community has to write the typings (and test those). See example index.d.ts.

CDN

One can use any npm CDN service, e.g. unpkg.com or jsdelivr.net

<script src="https://unpkg.com/read-excel-file@5.x/bundle/read-excel-file.min.js"></script>

<script>
  var input = document.getElementById('input')
  input.addEventListener('change', function() {
    readXlsxFile(input.files[0]).then(function(rows) {
      // `rows` is an array of rows
      // each row being an array of cells.
    })
  })
</script>

TypeScript

This library comes with TypeScript «typings». If you happen to find any bugs in those, create an issue.

References

Uses xmldom for parsing XML.

GitHub

On March 9th, 2020, GitHub, Inc. silently banned my account (erasing all my repos, issues and comments, even in my employer’s private repos) without any notice or explanation. Because of that, all source codes had to be promptly moved to GitLab. The GitHub repo is now only used as a backup (you can star the repo there too), and the primary repo is now the GitLab one. Issues can be reported in any repo.

License

MIT

Node.js is an open-source, cross-platform JavaScript runtime environment that can also be used to read from and write to files in formats such as txt, ods, xlsx and docx.

The following example shows how to read an Excel (.xlsx) file, convert its contents into JSON, and write data back to it, using a package called xlsx.

Module Installation: You can install xlsx module using the following command:

npm install xlsx

Note: For the following example, test.xlsx is a dummy data file that has been used.

Filename: test.xlsx 

So the excel file test.xlsx has 2 sheets, one having Student details and another having lecturer details.

Read Operation

Filename: read.js

const reader = require('xlsx')

const file = reader.readFile('./test.xlsx')

let data = []

const sheets = file.SheetNames

for (let i = 0; i < sheets.length; i++) {
  const temp = reader.utils.sheet_to_json(file.Sheets[file.SheetNames[i]])
  temp.forEach((res) => {
    data.push(res)
  })
}

console.log(data)

Explanation: First, the xlsx module is required in the read.js file, and then the Excel file is read into a workbook, i.e. the constant file in the above program.

The names of the sheets in the workbook are available in its SheetNames property. It can be accessed as follows:

const sheets = file.SheetNames  // Here sheets is an array of 2 sheet names, so sheets.length is 2

A for loop runs over every sheet in the workbook, starting from the first one. One of the most important functions used in the code above is the sheet_to_json() function in the utils object of the xlsx package. It accepts a worksheet object as a parameter and returns an array of JSON objects.

There is a forEach loop which iterates through every JSON object present in the array temp and pushes it into a variable data which would contain all the data in JSON format.

Finally, the data is printed or any other modification can be performed on the array of JSON objects.
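
One consequence of pushing every sheet's records into a single data array is that student and lecturer records end up mixed together. A small sketch of one way to keep track of where each record came from, using plain objects in place of the output of sheet_to_json(). The sheet names and records are invented for illustration:

```javascript
// Stand-in for the per-sheet results of sheet_to_json().
const rowsBySheet = {
  Students: [{ Name: 'A', Age: 20 }],
  Lecturers: [{ Name: 'B', Subject: 'Math' }]
}

// Tag every record with the name of the sheet it came from.
function mergeSheets(rowsBySheet) {
  return Object.entries(rowsBySheet).flatMap(([sheet, rows]) =>
    rows.map(row => ({ sheet, ...row }))
  )
}

console.log(mergeSheets(rowsBySheet))
```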

Step to run the application:

Run the read.js file using the following command:

node read.js

Write Operation

In the following example, we will convert an array of JSON objects into an Excel sheet and append it to the file.

Filename: write.js

const reader = require('xlsx')

const file = reader.readFile('./test.xlsx')

let student_data = [{
    Student: 'Nikhil',
    Age: 22,
    Branch: 'ISE',
    Marks: 70
},
{
    Student: 'Amitha',
    Age: 21,
    Branch: 'EC',
    Marks: 80
}]

const ws = reader.utils.json_to_sheet(student_data)

reader.utils.book_append_sheet(file, ws, "Sheet3")

reader.writeFile(file, './test.xlsx')

Explanation: Here we have an array of JSON objects called student_data. We use two main functions in this program: json_to_sheet(), which accepts an array of objects and converts them into a worksheet, and book_append_sheet(), which appends the worksheet to the workbook under the given sheet name.

Finally, all the changes are written to the test.xlsx file using the writeFile() function, which takes a workbook and the path of an Excel file as input parameters.
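
Conceptually, turning an array of objects into a worksheet means deriving a header row from the object keys and then emitting one row of values per object. The sketch below illustrates that idea in plain JavaScript. It is not how the xlsx package is implemented, and objectsToRows is a hypothetical name:

```javascript
// Conceptual illustration only; not the xlsx package's implementation.
function objectsToRows(objects) {
  // Collect keys in first-seen order so that the columns are stable.
  const header = []
  for (const object of objects) {
    for (const key of Object.keys(object)) {
      if (!header.includes(key)) {
        header.push(key)
      }
    }
  }
  // Missing properties become empty (null) cells.
  const rows = objects.map(object =>
    header.map(key => (key in object ? object[key] : null))
  )
  return [header, ...rows]
}

const student_data = [
  { Student: 'Nikhil', Age: 22, Branch: 'ISE', Marks: 70 },
  { Student: 'Amitha', Age: 21, Branch: 'EC', Marks: 80 }
]

console.log(objectsToRows(student_data))
```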

Step to run the application:

Run the write.js file using the following command:

node write.js

Output: The final test.xlsx file contains the two original sheets plus a third sheet, Sheet3, appended by the script.

This article shows you how to read and extract content from an Excel (.xlsx) file by using Node.js. Without any further ado, let’s get our hands dirty and start coding.

Getting Things Ready

We are going to work with a simple Excel file that contains some information as follows:

You can download the file from the link below to your computer:

https://www.kindacode.com/wp-content/uploads/2021/11/KindaCode.com-Example.xlsx.zip

Writing Code

There are many great libraries that can help us easily read Excel files, such as xlsx (SheetJS), exceljs (ExcelJS), node-xlsx (Node XLSX). In the sample project below, we will use xlsx. It’s super popular and supports TypeScript out-of-the-box.

1. Create a new folder named example (the name doesn’t matter and is totally up to you), then initialize a new Node.js project by running:

npm init

2. In your project directory, create a subfolder called src, then add an empty index.js file. Copy the Excel file you’ve downloaded before into the src folder.

Here’s the file structure:

.
├── node_modules
├── package-lock.json
├── package.json
└── src
    ├── KindaCode.com Example.xlsx
    └── index.js

3. Install the xlsx package:

npm i xlsx

4. Below are the code snippets for index.js. There are 2 different code snippets. The first one uses the CommonJS syntax (with require), while the second one uses the ES Modules syntax (with import). Choose the one that fits your need.

CommonJS:

const path = require("path");
const xlsx = require("xlsx");

// The file name must match the file in the src folder exactly (including letter case)
const filePath = path.resolve(__dirname, "KindaCode.com Example.xlsx");

const workbook = xlsx.readFile(filePath);
const sheetNames = workbook.SheetNames;

// Get the data of the first sheet ("Sheet1")
const data = xlsx.utils.sheet_to_json(workbook.Sheets[sheetNames[0]]);

// Do what you need with the received data
data.forEach(person => {
  console.log(`${person.Name} is ${person.Age} years old`);
});

ES Modules:

// import with ES Modules syntax
// (requires "type": "module" in package.json, or an .mjs file extension)
import path from 'path';
import xlsx from 'xlsx';

import { fileURLToPath } from 'url';

// __dirname is not defined in ES modules, so derive it from import.meta.url
const __filename = fileURLToPath(import.meta.url);
const __dirname = path.dirname(__filename);

// The file name must match the file in the src folder exactly (including letter case)
const filePath = path.resolve(__dirname, "KindaCode.com Example.xlsx");

const workbook = xlsx.readFile(filePath);
const sheetNames = workbook.SheetNames;

// Get the data of the first sheet ("Sheet1")
const data = xlsx.utils.sheet_to_json(workbook.Sheets[sheetNames[0]]);

// Do what you need with the received data
data.forEach(person => {
  console.log(`${person.Name} is ${person.Age} years old`);
});

5. Test our project:

node src/index.js

Output:

John Doe is 37 years old
Jane Doe is 37 years old
Captain is 72 years old
Voldermort is 89 years old

Conclusion

We’ve written a simple Node.js program to retrieve the data from a sheet of an Excel workbook.

In this quick guide, we show how to read Excel file data in a Node.js application using a third-party package.

An Excel file holds data in rows and columns, and Excel is useful for more than one thing. It offers the following benefits:

  • A convenient way to store tabular data.
  • Great for calculations.
  • Transform and clean data.
  • Makes data analysis easy.
  • Supports data visualization with charts.
  • Gather or print reports effortlessly.

Enough background on Excel; the goal of this post is different.

We will show how to install and configure the read-excel-file module in a Node app, and how to parse Excel data row by row in a Node.js environment.

Node Js Extract Row by Row Data from Excel File Example

  • Step 1: Build Folder
  • Step 2: Make Package JSON
  • Step 3: Formulate Server File
  • Step 4: Add Excel Module
  • Step 5: Read Excel or XLSX File
  • Step 6: Display Excel Data

Build Folder

Open a terminal and run the command below.

It creates a new, empty folder on your system.

mkdir node-vlog

Step inside the newly created directory.

cd node-vlog

Make Package JSON

Now, we have to create the package.json file, where the project-related information will reside.

This information includes scripts, commands and other project details.

To create it, run the following command from the command prompt.

npm init

Formulate Server File

In this step, we need to create the app.js file.

You also need to add this script to the scripts section of the package.json file.

...
...
  "scripts": {
    "start": "node app.js"
  },
...
...

Add Excel Module

In this step, we need to add the read-excel-file module. Type and execute the given command.

npm install read-excel-file

Read Excel or XLSX File

Now, you have to open the app.js file and insert the following code line by line within the file.

const readExcel = require('read-excel-file/node')

readExcel('./sample.xlsx').then((data) => {
  // `data` is an array of rows, each row being an array of cells.
  for (const row of data) {
    for (const cell of row) {
      console.log(cell)
    }
  }
})

In order to read the excel file, we need a sample file that has some data to be read. Consequently, create a sample.xlsx file and keep it at the root of our node app directory.

Display Excel Data

Head over to the console and run the given command.

node app.js

After the script is executed, the following output will appear on your console screen.

Price
Payment Type
Name
City
1200
Visa
Sophie
Newtown
1200
Mastercard
asuncion
Centennial
1200
Mastercard
Sandrine
Walnut Creek
3600
Amex
Brittany
Orlando
1200
Visa
Carmen
Arlington
1200
Amex
Corinne
Anthem
1200
Amex
Francoise
Danville
1200
Visa
Katherine
Marietta
1200
Mastercard
Laura
Fairfield
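
The output above prints one cell per line, with the header row first. When objects are more convenient than nested arrays, the first row can be used as property names. A minimal sketch on a hard-coded copy of a few of the rows shown above; rowsToObjects is a hypothetical helper and not part of read-excel-file:

```javascript
// Hypothetical helper: uses the first row as property names
// and turns every following row into an object.
function rowsToObjects(data) {
  const [header, ...rows] = data
  return rows.map(row =>
    Object.fromEntries(header.map((title, index) => [title, row[index]]))
  )
}

const data = [
  ['Price', 'Payment Type', 'Name', 'City'],
  [1200, 'Visa', 'Sophie', 'Newtown'],
  [3600, 'Amex', 'Brittany', 'Orlando']
]

console.log(rowsToObjects(data))
```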

Conclusion

How to Read Contents from Excel File in Node Application

In this short guide, we set up a basic Node app, created the files it needs, and used the read-excel-file module to extract data from an Excel file.

Remember to keep the sample.xlsx file at the root of your Node app; otherwise the script cannot read the Excel data and print it on the terminal.

  • Eric Cabrel TIOGO

Read an Excel file in Node.js and Typescript

Photo by Lance Anderson / Unsplash

Reading data from a data source is very common when building web applications. Among the many possible data sources, Excel is widely used because its data are laid out in rows and columns, which makes the files easy to parse.

In this tutorial, we will see how to read the content of an Excel file and then parse its content for further usage in the application.

Dataset to use

For the tutorial, we need a sample file with data. I found an Excel file containing the Hockey players of the USA and Canada for the 2018 Olympic Games. You can download this file at this link.

If you work with Excel and want to level up your skills, I recommend the guide "36 Excel Tutorials To Help You Master Spreadsheets" by Akshat Biyani on Digital.com, which contains 36 tutorials covering Excel basics, functions, and advanced formulas.

Let’s open the sample file and see what is inside:

Content of the Excel file to read.

So here, our goal is to read these data and convert them to a Typescript object to use inside the application, like saving in a database or returning it as a JSON response.

Excel Column | Typescript property | Typescript type
-------------|---------------------|--------------------
ID           | id                  | number
Team         | team                | M or W
Country      | country             | Canada or USA
NameF        | firstName           | string
NameL        | lastName            | string
Weight       | weight              | number
Height       | height              | number
DOB          | dateOfBirth         | string (YYYY-MM-DD)
Hometown     | hometown            | string
Prov         | province            | string
Pos          | position            | enum
Age          | age                 | number
HeightFt     | heightFt            | number
Htln         | htln                | number
BMI          | bmi                 | number

From the table above, the type Player will look like this:

type Team = 'M' | 'W';
type Country = 'Canada' | 'USA';
type Position = 'Goalie' | 'Defence' | 'Forward';

type Player = {
  id: number;
  team: Team;
  country: Country;
  firstName: string;
  lastName: string;
  weight: number;
  height: number;
  dateOfBirth: string; // (YYYY-MM-DD)
  hometown: string;
  province: string;
  position: Position;
  age: number;
  heightFt: number;
  htln: number;
  bmi: number;
};

Set up the project

Initialize a Node.js project with Typescript

mkdir node-excel-read

cd node-excel-read

yarn init -y

yarn add -D typescript ts-node @types/node

yarn tsc --init

touch index.ts

Download the Excel file and copy it into the project directory; this is the file whose content we will read.

Install exceljs, the Node package we will use to read the file.

yarn add exceljs

Add the code below to the file index.ts

import * as path from 'path';
import Excel from 'exceljs';

const filePath = path.resolve(__dirname, 'olympic-hockey-player.xlsx');

type Team = 'M' | 'W';
type Country = 'Canada' | 'USA';
type Position = 'Goalie' | 'Defence' | 'Forward';

type Player = {
  id: number;
  team: Team;
  country: Country;
  firstName: string;
  lastName: string;
  weight: number;
  height: number;
  dateOfBirth: string; // (YYYY-MM-DD)
  hometown: string;
  province: string;
  position: Position;
  age: number;
  heightFt: number;
  htln: number;
  bmi: number;
};

const getCellValue = (row:  Excel.Row, cellIndex: number) => {
  const cell = row.getCell(cellIndex);
  
  return cell.value ? cell.value.toString() : '';
};

const main = async () => {
  const workbook = new Excel.Workbook();
  const content = await workbook.xlsx.readFile(filePath);

  const worksheet = content.worksheets[1];
  const rowStartIndex = 4;
  const numberOfRows = worksheet.rowCount - 3;

  const rows = worksheet.getRows(rowStartIndex, numberOfRows) ?? [];

  const players = rows.map((row): Player => {
    return {
      // @ts-ignore
      id: getCellValue(row,1),
      // @ts-ignore
      team: getCellValue(row, 2),
      // @ts-ignore
      country: getCellValue(row, 3),
      firstName: getCellValue(row, 4),
      lastName: getCellValue(row, 5),
      // @ts-ignore
      weight: getCellValue(row, 6),
      height: +getCellValue(row, 7),
      dateOfBirth: getCellValue(row, 8), // (YYYY-MM-DD)
      hometown: getCellValue(row, 9),
      province: getCellValue(row, 10),
      // @ts-ignore
      position: getCellValue(row, 11),
      age: +getCellValue(row, 12),
      heightFt: +getCellValue(row, 13),
      htln: +getCellValue(row, 14),
      bmi: +getCellValue(row, 15),
    }
  });

  console.log(players);
};

main().then();

Save and run the code with the command yarn start (this assumes package.json defines a start script, for example "ts-node index.ts").

Display the content of the Excel file in the console.

As you can see, we retrieved the content of the excel file as expected, but there are improvements to make.

  • There are many // @ts-ignore comments in the code, meaning Typescript complains about the values we get from the cells.
  • The values of the properties team, height and dateOfBirth aren’t well-formed.
  • The properties age, heightFt, htln and bmi are NaN, for a reason explained below along with the fix.

Fix the value of the property «team»

The value must be M or W, yet the output returns Men or Women. We will write the code to transform the value:

const transformTeam = (value: string): Team => {
  return value === 'Men' ? 'M' : 'W';
};

The property team will be now:

team: transformTeam(getCellValue(row, 2)),
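
Note that transformTeam maps every value other than 'Men' to 'W', including typos and empty cells. A stricter variant is sketched here in plain JavaScript, under the assumption that the only valid cell values are 'Men' and 'Women'; transformTeamStrict is a hypothetical name:

```javascript
// Assumed set of valid cell values and their codes.
const TEAM_CODES = { Men: 'M', Women: 'W' }

// Hypothetical stricter variant: rejects anything that is not a known team.
function transformTeamStrict(value) {
  if (!Object.prototype.hasOwnProperty.call(TEAM_CODES, value)) {
    throw new Error(`Unexpected team value: "${value}"`)
  }
  return TEAM_CODES[value]
}

console.log(transformTeamStrict('Men'))   // M
console.log(transformTeamStrict('Women')) // W
```

Failing loudly on an unexpected value makes data problems in the spreadsheet visible instead of silently filing them under 'W'.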

Fix the value of the property «height»

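The value is NaN because the cell holds a string containing an apostrophe, which cannot be converted to a number directly. The code below fixes it by replacing the apostrophe with a dot before the conversion: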

const transformHeight = (value: string): number => {
  return +value.replace("'", ".");
};
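
Replacing the apostrophe with a dot turns a value such as 5'11 into the number 5.11, which is parseable but is not a decimal number of feet. If a real measurement is needed, the two parts can be converted separately. A sketch in plain JavaScript, assuming the cell holds feet and inches separated by an apostrophe (the sample file itself was not inspected here, and feetInchesToCm is a hypothetical helper):

```javascript
// Hypothetical helper: converts a string like 5'11 (feet'inches) to centimetres.
function feetInchesToCm(value) {
  const match = /^(\d+)'(\d+)"?$/.exec(value.trim())
  if (!match) {
    throw new Error(`Unexpected height format: "${value}"`)
  }
  const feet = Number(match[1])
  const inches = Number(match[2])
  // 1 foot = 12 inches, 1 inch = 2.54 cm.
  return (feet * 12 + inches) * 2.54
}

console.log(feetInchesToCm("5'11").toFixed(2)) // 180.34
```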

Retrieve value from cell formula

The properties age, heightFt, htln and bmi are NaN because these cells contain formulas: the cell’s value is computed from other cells, so it is not returned as a plain value.

If we look at the value of those cells, we have this output:

View the structure of a formula cell.

Unlike simple cells, which hold a plain value, formula cells hold an object, and the value we want is in its result property. We will create a function getCellFormulaValue for this specific case.
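
The difference between the two kinds of cell values can also be handled in one place. A sketch in plain JavaScript, using hand-written objects in place of real ExcelJS cell values. The sample formula and the helper name cellValueToString are made up for illustration, and the sharedFormula key for shared-formula cells is an assumption:

```javascript
// Hypothetical helper: returns a string for both plain and formula cell values.
function cellValueToString(value) {
  if (value === null || value === undefined) {
    return ''
  }
  // Formula cells hold an object whose computed value is in `result`.
  // (`sharedFormula` is assumed to mark shared-formula cells.)
  if (typeof value === 'object' && ('formula' in value || 'sharedFormula' in value)) {
    return value.result === undefined || value.result === null
      ? ''
      : String(value.result)
  }
  return String(value)
}

console.log(cellValueToString('Canada'))
console.log(cellValueToString({ formula: 'B2*2', result: 42 }))
```

Checking explicitly for null and undefined, rather than for truthiness, keeps a legitimate value of 0 from being treated as an empty cell.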

Here is the final code below:

import * as path from 'path';
import Excel from 'exceljs';

const filePath = path.resolve(__dirname, 'olympic-hockey-player.xlsx');

type Team = 'M' | 'W';
type Country = 'Canada' | 'USA';
type Position = 'Goalie' | 'Defence' | 'Forward';

type Player = {
  id: number;
  team: Team;
  country: Country;
  firstName: string;
  lastName: string;
  weight: number;
  height: number;
  dateOfBirth: string; // (YYYY-MM-DD)
  hometown: string;
  province: string;
  position: Position;
  age: number;
  heightFt: number;
  htln: number;
  bmi: number;
};

const getCellValue = (row:  Excel.Row, cellIndex: number) => {
  const cell = row.getCell(cellIndex);

  return cell.value ? cell.value.toString() : '';
};

const getCellFormulaValue = (row: Excel.Row, cellIndex: number) => {
  const value = row.getCell(cellIndex).value as Excel.CellFormulaValue;

  return value.result ? value.result.toString() : '';
};

const transformTeam = (value: string): Team => {
  return value === 'Men' ? 'M' : 'W';
};

const transformHeight = (value: string): number => {
  return +value.replace("'", ".");
};

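const transformDateOfBirth = (value: string) => {
  const date = new Date(value);
  // Zero-pad month and day so the result matches YYYY-MM-DD.
  const month = String(date.getMonth() + 1).padStart(2, '0');
  const day = String(date.getDate()).padStart(2, '0');

  return `${date.getFullYear()}-${month}-${day}`;
};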

const main = async () => {
  const workbook = new Excel.Workbook();
  const content = await workbook.xlsx.readFile(filePath);

  const worksheet = content.worksheets[1];
  const rowStartIndex = 4;
  const numberOfRows = worksheet.rowCount - 3;

  const rows = worksheet.getRows(rowStartIndex, numberOfRows) ?? [];

  const players = rows.map((row): Player => {
    return {
      id: +getCellValue(row, 1),
      team: transformTeam(getCellValue(row, 2)),
      country: getCellValue(row, 3) as Country,
      firstName: getCellValue(row, 4),
      lastName: getCellValue(row, 5),
      weight: +getCellValue(row, 6),
      height: transformHeight(getCellValue(row, 7)),
      dateOfBirth: transformDateOfBirth(getCellValue(row, 8)), // (YYYY-MM-DD)
      hometown: getCellValue(row, 9),
      province: getCellValue(row, 10),
      position: getCellValue(row, 11) as Position,
      age: +getCellFormulaValue(row, 12),
      heightFt: +getCellFormulaValue(row, 13),
      htln: +getCellFormulaValue(row, 14),
      bmi: +getCellFormulaValue(row, 15),
    }
  });

  console.log(players);
};

main().then();

Re-run the command yarn start to see the output:

The content of the Excel is well extracted and displayed in the console.

Wrap up

exceljs makes it easy to read and parse an Excel file. While using it, make sure to handle both simple cell values and formula values. There are other cell value types as well, such as RichText, Hyperlink, Date, and Shared Formula.
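To give an idea of what handling those other types could look like, here is a minimal sketch. It is an assumption, not part of the code above: cellValueToText is a hypothetical helper, and the object shapes it checks for (a richText array of parts with a text field, a hyperlink object with text and hyperlink fields, a formula object with a result field) follow the shapes described in the ExcelJS documentation.

```typescript
const isObject = (value: unknown): value is Record<string, unknown> =>
  typeof value === 'object' && value !== null;

// Hypothetical helper that flattens the common cell value shapes to text.
const cellValueToText = (value: unknown): string => {
  if (value === null || value === undefined) return '';
  // Dates are formatted as YYYY-MM-DD using the UTC calendar day.
  if (value instanceof Date) return value.toISOString().slice(0, 10);
  if (isObject(value)) {
    const richText = value.richText;
    if (Array.isArray(richText)) {
      // Rich text: concatenate the text of every formatted part.
      return richText.map((part: { text: string }) => part.text).join('');
    }
    if (typeof value.hyperlink === 'string') {
      // Hyperlink: keep the visible label, drop the URL.
      return String(value.text ?? '');
    }
    if ('result' in value) {
      // Formula or shared formula: use the computed result.
      return cellValueToText(value.result);
    }
  }
  return String(value);
};

console.log(cellValueToText({ richText: [{ text: 'Ice ' }, { text: 'hockey' }] }));
console.log(cellValueToText({ text: 'Team site', hyperlink: 'https://example.com' }));
```

Values that match none of these shapes fall through to String(value), so unknown object types would still need their own branch.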

You can do a lot more with this library, so check out the documentation to see how far you can go with it.

You can find the source code in the GitHub repository.

My favourite library for reading, writing, and manipulating Excel files in Node.js is ExcelJS. In this tutorial, we’ll look at how to install the library and how to dynamically read an Excel file.

In the business world, there’s a joke that businesses run off Excel files. Sadly, this can be very true – I’ve received a lot of Excel files over the years and had to do data imports or build visualizations off of them. I’ve used code like we’ll look at today in a lot of applications.

Installing ExcelJS

Installing ExcelJS is easy. In the root of your project, run “npm i exceljs” from the terminal or command prompt.

Setting Up Our Code for ExcelJS

We can load ExcelJS by importing or requiring exceljs. If you need ES5 transpiled code, import ‘exceljs/dist/es5’ instead and add the core-js dependencies too.

In Microsoft Excel, a workbook is a collection of worksheets (or “spreadsheets”) in a single file. In the Microsoft Excel API, operations usually begin from the workbook object.

One thing to keep in mind when working with Excel is that everything is one-based: rows and columns are numbered starting from 1, not 0.

Reading a Particular Worksheet

There are a few different ways to access worksheets: by iteration, by Id, or by the worksheet tab name.

I prefer to access worksheets by iteration or by name because these methods are a lot more predictable. When accessing a worksheet by Id, keep in mind that the numbers aren’t necessarily sequential, because a worksheet may have been deleted.

Accessing Worksheets Through Iteration

As mentioned above, I find accessing worksheets through iteration to be the easiest and most predictable way of reading an Excel sheet.

With ExcelJS we have a few different ways to iterate through the worksheets: a for loop over workbook.worksheets, workbook.worksheets.forEach, or workbook.eachSheet.

I find workbook.eachSheet to be the easiest to read.
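As a rough sketch of the eachSheet approach, the snippet below collects the Id and tab name of every worksheet. To keep it runnable without a file, it uses small structural types (SheetLike, WorkbookLike) and an in-memory fakeWorkbook; these names, like the listSheets helper, are my own assumptions, and a real ExcelJS workbook is expected to fit the same shape.

```typescript
// Minimal structural types covering only what this sketch uses
// (assumption: a real ExcelJS Workbook satisfies them).
interface SheetLike {
  id: number;
  name: string;
}
interface WorkbookLike {
  eachSheet(callback: (worksheet: SheetLike, sheetId: number) => void): void;
}

// Hypothetical helper: list every worksheet with its Id and tab name.
const listSheets = (workbook: WorkbookLike): SheetLike[] => {
  const sheets: SheetLike[] = [];
  workbook.eachSheet((worksheet, sheetId) => {
    sheets.push({ id: sheetId, name: worksheet.name });
  });
  return sheets;
};

// In-memory stand-in for a workbook whose second sheet was deleted,
// so the Ids are 1 and 3 rather than 1 and 2 (made-up sample data).
const fakeWorkbook: WorkbookLike = {
  eachSheet(callback) {
    [{ id: 1, name: 'Players' }, { id: 3, name: 'Teams' }].forEach((sheet) =>
      callback(sheet, sheet.id)
    );
  },
};

console.log(listSheets(fakeWorkbook));
```

The gap in the Ids is exactly why looking a sheet up by number can surprise you, while iteration simply visits whatever sheets exist.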

Built-In Metrics

Within ExcelJS, there are a few properties that we can use to make life easier.

We can use worksheet.actualColumnCount to get the number of columns that potentially have any data in them.

We can use worksheet.actualRowCount to get the number of rows that potentially have data in them.

As we’re iterating through rows, we can use row.cellCount to get the number of cells.

Reading a Particular Cell

In Excel, a cell is one of the boxes you see in the grid of an Excel worksheet. Cells are traditionally referenced by column letter and row number. For example, “A1” is the first cell on an Excel worksheet.

Within ExcelJS, we can also use a number instead of a letter as the column reference. For example “1,1” could also be a reference to “A1”.

If we need the value of a cell, we call .getCell() and then read the .value property.

It’s much easier to use a number as a reference, although it can be confusing when you are comparing the code to a spreadsheet. And of course, when looping this is a lot easier than trying to translate a number to a letter.
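If you ever do need to translate a column number to a letter, for example to compare a value in your code with what you see in the spreadsheet, a small helper is enough. This is a sketch of my own rather than part of ExcelJS; columnNumberToLetter and toAddress are hypothetical names.

```typescript
// Convert a one-based column number to its Excel letters (1 is A, 27 is AA).
const columnNumberToLetter = (column: number): string => {
  if (!isFinite(column) || Math.floor(column) !== column || column < 1) {
    throw new RangeError(`Column must be a positive integer, got ${column}`);
  }
  let letters = '';
  let remaining = column;
  while (remaining > 0) {
    // Shift by one because column letters have no zero digit.
    const offset = (remaining - 1) % 26; // 0 for A ... 25 for Z
    letters = String.fromCharCode(65 + offset) + letters;
    remaining = Math.floor((remaining - 1) / 26);
  }
  return letters;
};

// Build an "A1"-style address from one-based row and column numbers.
const toAddress = (row: number, column: number): string =>
  `${columnNumberToLetter(column)}${row}`;

console.log(toAddress(1, 1));
console.log(toAddress(4, 28));
```

The subtraction of one before each division is what makes Z roll over to AA instead of producing a zero digit.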

Dynamically Reading an Entire Document

When I work with Excel documents in Node.js I like to convert them into a JavaScript array of objects so I can easily manipulate them. Each data row in the Excel document ends up as one object in that array.

I usually start with a helper function that takes in the path to the Excel file, and then it reads it and loops through the various worksheets.

For each worksheet it assumes that the first row is a header row. In a lot of cases, this is a good assumption but it never hurts to check. 😊

For each worksheet, we loop through the rows and the available columns, building an object called “theRow” that we eventually add to the “theData” array.

At the end, we return theData which is our array of objects.
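The header-to-object conversion itself can be sketched without touching a file. The version below is an assumption about the shape of such a helper, not the exact function described above: rowsToObjects is a hypothetical name, and it expects the cell values of one worksheet as a plain, zero-based array of rows, however you collected them from ExcelJS. I kept the theRow and theData names from the description.

```typescript
type CellValue = string | number | boolean | Date | null;
type RowObject = { [header: string]: CellValue };

// Hypothetical helper: turn worksheet rows into an array of objects keyed
// by the header row. `rows` is a zero-based array of cell values per row.
const rowsToObjects = (rows: CellValue[][]): RowObject[] => {
  if (rows.length === 0) return [];

  // Header cells become property names; blank headers fall back to
  // "column<N>", where N is the one-based column number.
  const headers = rows[0].map((cell, index) => {
    const label = cell === null ? '' : String(cell).trim();
    return label !== '' ? label : `column${index + 1}`;
  });

  const theData: RowObject[] = [];
  for (let rowIndex = 1; rowIndex < rows.length; rowIndex++) {
    const theRow: RowObject = {};
    headers.forEach((header, columnIndex) => {
      // Missing trailing cells are stored as null so every object has the same keys.
      const cell: CellValue | undefined = rows[rowIndex][columnIndex];
      theRow[header] = cell === undefined ? null : cell;
    });
    theData.push(theRow);
  }
  return theData;
};

// Made-up sample data: one header row and two data rows.
const sample: CellValue[][] = [
  ['First Name', 'Country', 'Weight'],
  ['Marie', 'Canada', 70],
  ['Hilary', 'USA'],
];
console.log(rowsToObjects(sample));
```

Keeping this step separate from the file reading makes it easy to test with plain arrays, and it is also the natural place to check whether the first row really is a header row.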

Wrapping It Up

As you can see, knowing loops and the ExcelJS library lets us do some pretty advanced things without having a human do a lot of data entry.

In this blog post on ExcelJS we covered how to read in an Excel file and convert it to a JavaScript array that contains objects based on headers built from the first row.
