I’m looking for a way to convert xlsx files to csv files on Linux.
I do not want to use PHP/Perl or anything like that since I’m looking at processing several millions of lines, so I need something quick. I found a program on the Ubuntu repos called xls2csv but it will only convert xls (Office 2003) files (which I’m currently using) but I need support for the newer Excel files.
Any ideas?
asked May 11, 2012 at 19:18
The Gnumeric spreadsheet application comes with a command line utility called ssconvert that can convert between a variety of spreadsheet formats:
$ ssconvert Book1.xlsx newfile.csv
Using exporter Gnumeric_stf:stf_csv
$ cat newfile.csv
Foo,Bar,Baz
1,2,3
123.6,7.89,
2012/05/14,,
The,last,Line
To install on Ubuntu:
apt-get install gnumeric
To install on Mac:
brew install gnumeric
answered May 14, 2012 at 9:34
jmcnamara
You can do this with LibreOffice:
libreoffice --headless --convert-to csv "$filename" --outdir "$outdir"
For reasons not clear to me, you might need to run this with sudo. You can make LibreOffice work with sudo without requiring a password by adding a line like this to your sudoers file (sudoers expects the full path to the command, so adjust the user name and path for your system):
users ALL=(ALL) NOPASSWD: /usr/bin/libreoffice
answered Feb 13, 2013 at 14:54
spiffytech
If you already have a desktop environment then I’m sure Gnumeric or LibreOffice would work well, but on a headless server (e.g. any cloud-based environment), they require dozens of dependencies that you also need to install.
I found this Python alternative: xlsx2csv
easy_install xlsx2csv
xlsx2csv file.xlsx > newfile.csv
It took two seconds to install and works like a charm.
If you have multiple sheets, you can export all at once, or one at a time:
xlsx2csv file.xlsx --all > all.csv
xlsx2csv file.xlsx --all -p '' > all-no-delimiter.csv
xlsx2csv file.xlsx -s 1 > sheet1.csv
The xlsx2csv author also links to several alternatives built in Bash, Python, Ruby, and Java.
answered Feb 14, 2014 at 18:34
andrewtweber
In Bash, I used this LibreOffice command (executable libreoffice) to convert all my .xlsx files in the current directory:
for i in *.xlsx; do libreoffice --headless --convert-to csv "$i" ; done
Close all open LibreOffice instances before executing, or it will fail silently.
The command takes care of spaces in the filename.
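The same loop can be sketched in Python, which may be handy if the conversion is one step of a larger script. This is only a sketch under assumptions: the function names build_convert_command and convert_all are made up for illustration, and it assumes the libreoffice executable is on the PATH and accepts the flags used in the command above.

```python
import subprocess
from pathlib import Path


def build_convert_command(xlsx_path, outdir):
    """Return the argument list for one headless LibreOffice conversion."""
    return [
        "libreoffice", "--headless",
        "--convert-to", "csv",
        "--outdir", str(outdir),
        str(xlsx_path),
    ]


def convert_all(directory, outdir, dry_run=False):
    """Convert every .xlsx file in `directory`; return the commands built."""
    commands = []
    for path in sorted(Path(directory).glob("*.xlsx")):
        cmd = build_convert_command(path, outdir)
        commands.append(cmd)
        if not dry_run:
            # Passing a list (no shell involved) keeps file names with
            # spaces intact, like the quoted "$i" in the Bash loop.
            subprocess.run(cmd, check=True)
    return commands
```

Calling convert_all(".", "csv_out") would convert the current directory; with dry_run=True it only returns the commands so you can inspect them first.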
I tried it again some years later, and it didn’t work. This question gives some tips, but the quickest solution was to run it as root (or with sudo libreoffice). It is not elegant, but it is quick.
Use the command scalc.exe in Windows.
answered Feb 8, 2014 at 20:54
neves
Another option would be to use R via a small Bash wrapper for convenience:
xlsx2txt(){
echo '
require(xlsx)
write.table(read.xlsx2(commandArgs(TRUE)[1], 1), stdout(), quote=F, row.names=FALSE, col.names=T, sep="\t")
' | Rscript --vanilla - "$1" 2>/dev/null
}
xlsx2txt file.xlsx > file.txt
answered Sep 2, 2014 at 15:03
Holger Brandl
If the .xlsx file has many sheets, the -s flag can be used to get the sheet you want. For example:
xlsx2csv "my_file.xlsx" -s 2 second_sheet.csv
second_sheet.csv would contain the data of the second sheet in my_file.xlsx.
answered Nov 12, 2014 at 21:43
Akavall
Using the Gnumeric spreadsheet application, which comes with a command-line utility called ssconvert, is indeed super simple:
find . -name '*.xlsx' -exec ssconvert -T Gnumeric_stf:stf_csv {} \;
and you’re done!
answered Jun 11, 2016 at 15:45
If you are OK with running a Java command line, you can do it with Apache POI HSSF’s Excel Extractor. It has a main method that is described as the command-line extractor. This one seems to just dump everything out. They point to this example that converts to CSV. You would have to compile it before you can run it, but it too has a main method, so you should not have to do much coding per se to make it work.
Another option that might fly, but will require some work on the other end, is to have your Excel files come to you as Excel XML Data or XML Spreadsheet or whatever MS calls that format these days. It will open a whole new world of opportunities for you to slice and dice it the way you want.
answered May 11, 2012 at 19:42
Pavel Veller
You can use the libreoffice executable to convert your .xlsx files to csv:
libreoffice --headless --convert-to csv ABC.xlsx
The --headless argument indicates that we don’t need a GUI.
answered Dec 30, 2021 at 4:17
Udesh
As others said, the libreoffice executable can convert Excel files (.xls) to CSV. The problem for me was the sheet selection.
This LibreOffice Python script does a fine job at converting a single sheet to CSV.
Usage is:
./libreconverter.py File.xls:"Sheet Name" output.csv
The only downside (on my end) is that --headless doesn’t seem to work. I have a LibreOffice window that shows up for a second and then quits.
That’s OK with me; it’s the only tool that does the job rapidly.
answered Dec 16, 2016 at 10:22
Benoit Duffez
You can use the getsheets.py script. Install its dependencies first:
pip3 install pandas xlrd openpyxl
Then call the script: python3 getsheets.py <file.xlsx>
answered Apr 1, 2022 at 12:33
kaiya
XLSX is the file extension of the Open XML Spreadsheet file format used by Microsoft Excel. Transforming xlsx to csv (comma-separated values) is easy via the command line. Here are a few methods to convert xlsx to csv format.
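For background on what all of these converters have to do: an .xlsx file is a ZIP archive of XML parts, with cell values in a worksheet part and most text stored in a shared-strings part. The following is a minimal sketch using only the Python standard library, not a replacement for the tools below. It assumes the first sheet lives at xl/worksheets/sheet1.xml and that every cell carries an r reference attribute (both are common but not guaranteed), it ignores inline strings and formatting, and dates come out as raw serial numbers. The function names are made up for illustration.

```python
import csv
import io
import re
import zipfile
import xml.etree.ElementTree as ET

# Namespace of the main SpreadsheetML parts inside an .xlsx archive.
MAIN_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
NS = {"m": MAIN_NS}


def column_index(cell_ref):
    """Turn a cell reference such as 'C7' into a zero-based column index."""
    letters = re.match(r"[A-Z]+", cell_ref).group()
    index = 0
    for ch in letters:
        index = index * 26 + (ord(ch) - ord("A") + 1)
    return index - 1


def sheet_to_csv(xlsx_bytes, sheet_path="xl/worksheets/sheet1.xml"):
    """Return the CSV text of one worksheet stored in an .xlsx archive."""
    with zipfile.ZipFile(io.BytesIO(xlsx_bytes)) as archive:
        shared = []
        if "xl/sharedStrings.xml" in archive.namelist():
            root = ET.fromstring(archive.read("xl/sharedStrings.xml"))
            for item in root.findall("m:si", NS):
                # A shared string may be split into several <t> runs.
                runs = item.iter("{%s}t" % MAIN_NS)
                shared.append("".join(t.text or "" for t in runs))
        sheet = ET.fromstring(archive.read(sheet_path))

    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in sheet.iterfind(".//m:row", NS):
        values = []
        for cell in row.findall("m:c", NS):
            col = column_index(cell.get("r"))
            values.extend([""] * (col - len(values)))  # pad skipped cells
            v = cell.find("m:v", NS)
            text = v.text if v is not None and v.text else ""
            if cell.get("t") == "s" and text:
                text = shared[int(text)]  # index into the shared strings
            values.append(text)
        writer.writerow(values)
    return out.getvalue()
```

Reading a real file would look like sheet_to_csv(open("SampleData.xlsx", "rb").read()). The dedicated tools handle many details this sketch skips, which is a good reason to prefer them for real data.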
Gnumeric Spreadsheet Program
It is a GNOME-based spreadsheet application that replicates the basic features of a popular commercial program like Excel. It can import and export data to and from multiple formats, including CSV, Microsoft Excel, HTML, OpenDocument, Quattro Pro, and LaTeX.
To install Gnumeric on Linux, use the apt-get command in a terminal to install the gnumeric package.
$ sudo apt-get install gnumeric
Now convert the xlsx file to csv format using Gnumeric's ssconvert command.
$ ssconvert --export-type=Gnumeric_stf:stf_csv SampleData.xlsx convert.csv
To check the csv file, view its contents using the cat command.
xlsx2csv converter
A python application to convert XLSX/XLS files to CSV format. There is also an option to convert a specific sheet or all the sheets from the data. xlsx2csv has a feature to export all sheets at once.
To install xlsx2csv, Python must be installed on the Linux machine.
pip install xlsx2csv
Alternatively, install it from the Linux terminal with apt (Kali Linux ships with the package pre-installed):
$ sudo apt install xlsx2csv
Now convert the file from xlsx to csv and view the contents:
xlsx2csv file_name.xlsx > New_file.csv
Here we see that xlsx2csv has converted only a single sheet, but our workbook has multiple sheets. To convert all of them, xlsx2csv offers several parameters:
- -a, --all – export all sheets
- -d – DELIMITER, the column delimiter in the csv
- -p – SHEETDELIMITER, the delimiter used to separate sheets; pass '' if you do not need a delimiter
- -s – SHEETID, the number of the sheet to convert
To convert all the sheets from the file, execute the following command:
$ xlsx2csv SampleData.xlsx --all > Output.csv
On viewing the contents of the output file we see that all 4 sheets (sheet 1: Instructions, sheet 2: sales order, sheet 3: Sample number, and sheet 4: my link) have been converted from xlsx to csv format.
csvkit Tool
csvkit is a Python library for working with csv files; with it we can manipulate, organize and analyze data. Its in2csv command converts files to csv. The tool is light and user-friendly. Install csvkit from the Linux CLI using the command given below:
$ sudo apt install csvkit
Convert the xlsx file to csv by executing the following command:
$ in2csv SampleData.xlsx > sample.csv
Viewing contents of the file using cat command:
$ cat sample.csv
unoconv
A command-line program to convert between office document file formats. It uses LibreOffice to do the conversion, so it can convert from any format LibreOffice can import to any format LibreOffice can export. To install it on a Linux machine, use the following command:
$ sudo apt install unoconv
To convert the Microsoft Excel file to csv format, run the command given below. The same command can convert many file formats; the desired output format and the output file are given with the parameters listed after it.
$ unoconv -f csv -o data.csv SampleData.xlsx
unoconv options:
- -f, --format=format: Specify the output format
- -o, --output=name: Output basename, filename or directory
To check that the file was converted successfully, view the output file with the cat command:
$ cat data.csv
LibreOffice Headless
LibreOffice can be run from the command line: the --headless flag lets the application work without a user interface. It converts a single file or a group of files from one format to another. We can use it directly from the CLI; no additional tool needs to be installed. Indicate the desired output format (csv) with the --convert-to parameter, followed by the file to convert, as done below. Afterwards, use the ls command to check that the output file was created.
$ libreoffice --headless --convert-to csv --outdir conv/ SampleData.xlsx
Use cat command to view contents of the file:
$ cat conv/SampleData.csv
Converting a Microsoft Excel sheet (XLS file) to a comma-separated file (CSV) is relatively easy in an office product, but it can be a tedious task for programmers to do on the command line. The situation may arise when you have an XLS file and need to fill a database from it after formatting the data. Converting the XLS to CSV is the ideal way here, as CSV is a format that can easily be manipulated in any language, be it Shell, Perl, Ruby, Python or Java. In this post, we will see the best ways to convert an XLS file to CSV, and we will also discuss the pros and cons of these methods.
catdoc (in C)
The first command line tool we are going to talk about is catdoc. The tool is written in C by V.B. Vagner.
1.1 How to install it:
Download the tool from here. Go to your downloads directory and untar it. You can use the following commands (in case you run into problems):
gunzip catdoc-0.94.2.tar.gz
tar xvf catdoc-0.94.2.tar
Now we have a catdoc-0.94.2 directory. Go inside this directory and run the following commands to install it:
./configure
make
make install
The installation is an easy process and you should not face any problems here.
1.2 How to use it:
There are several options for running the command. I’ll give the options that work best for Microsoft Excel conversion:
xls2csv -x "Path_of_Your_XLS_File" -s cp1252 -d 8859-1 > "Path_of_Your_CSV_File"
Note the options “-s” and “-d” (source and destination). They specify the character encoding of the source file and the character encoding to use for the destination file. Here I have used cp1252, a Microsoft (Windows) character encoding, and 8859-1, which is used for Western European languages. You can list the other available options with the help command.
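To make the source and destination encodings concrete, here is a small Python sketch of the same kind of re-encoding step. It is not what xls2csv does internally; the function name recode is made up, and the choice to replace unrepresentable characters with "?" is an assumption for illustration.

```python
def recode(data, source="cp1252", dest="iso-8859-1", errors="replace"):
    """Decode bytes from `source` and re-encode them in `dest`.

    Characters that `dest` cannot represent are handled according to
    `errors`; with "replace" they become a question mark.
    """
    return data.decode(source).encode(dest, errors=errors)


# "café" is the same byte sequence in both encodings.
cafe = recode(b"caf\xe9")

# The euro sign exists in cp1252 (byte 0x80) but not in ISO 8859-1,
# so it cannot survive the conversion unchanged.
euro = recode(b"\x80 5")
```

Characters present in the source encoding but missing from the destination are one way such a conversion can lose data, so it is worth choosing the destination encoding to match what the consumer of the CSV expects.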
1.3 Pros and Cons:
Pros: Straightforward installation
Cons: No selective conversion when there are multiple sheets; it converts all the sheets present in the xls file (one workaround would be to explicitly specify a footer in each sheet and then use option -b in the command); problems with a few European characters; problems with date fields (they get badly mangled); messes with quotes.
xls2csv (in Perl)
The second tool we are going to talk about is xls2csv, a Perl script written by Ken Prows.
2.1 How to install it:
Download the script here. Gunzip and untar it as we did in the previous section, go to the extracted directory and use the following commands to install it:
perl Makefile.PL
make
make test
make install
Remember this Perl script uses a number of other Perl modules:
Locale::Recode
Unicode::Map
Spreadsheet::ParseExcel
Text::CSV_XS
While installing xls2csv, you will get an error if these Perl modules have not been installed, and you will be asked to download them. Download and install the modules when asked. Installing them requires root privileges. If you do not have root access, follow the instructions given here to install a Perl module.
2.2 How to use it:
The following command can be used to convert Microsoft excel to csv:
xls2csv -x "Path_of_Your_XLS_File" -b cp1252 -w WorkSheetName -c "Path_of_Your_CSV_File" -a 8859-1
Options x and c (for xls and csv) specify the input and output files, whereas b and a (for before and after) specify the respective character encodings. We have used the same character encodings as with the previous tool.
2.3 Pros and cons:
Pros: Good with Western European character conversion and date fields; supports selective conversion of multiple sheets
Cons: Several Perl modules need to be installed; the first cell should not be empty (otherwise it skips the whole row); messes with quotes
There are a couple of other ways as well. Some scripts in Python and Java are also available, but they are not as good as the two discussed here. I hope the article solves your problem. Questions and suggestions are always welcome. Cheers
Abhishek Prakash
Excel is a Microsoft spreadsheet program whose files use the .xls or .xlsx extension. As Linux users we can convert xls files to csv, and csv files to xls, on the command line. This quick guide shows how to convert an xls file to a csv file in Linux using the programs SSConvert and Unoconv.
1. Installing SSConvert to convert xls file to csv file in Linux
For Red Hat, CentOS and Fedora:
yum install gnumeric
For Ubuntu and Debian:
apt-get install gnumeric
Now for a demo of how to convert an xls file to a csv file in Linux. I am creating a sample csv file as below:
# cat example.csv
surname,name,age
Manmohan,Mirkar,23
Vikrant,Shinde,46
Nilesh,Patil,69
Vipul,Marathe,23
First, we will convert this file to xls.
# ssconvert example.csv myexcel.xls
In case you need xlsx file extension use below command:
# ssconvert example.csv myexcel.xlsx
You can also confirm the file types using the file command as below:
# file example.csv
example.csv: ASCII text, with CRLF line terminators
# file myexcel.xls
myexcel.xls: Composite Document File V2 Document, Little Endian, Os: Windows, Version 4.10, Code page: 1252, Create Time/Date: Tue May 2 13:30:10 2017
# file myexcel.xlsx
myexcel.xlsx: Microsoft OOXML
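The file command tells these formats apart largely by their leading bytes. As a rough illustration, the check can be sketched in Python; the function name sniff_spreadsheet and the returned labels are made up, and a ZIP signature only shows that the file is a ZIP container, which is what xlsx uses, not that it really is a spreadsheet.

```python
# Leading bytes ("magic numbers") of the two container formats.
XLS_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"  # OLE2 compound document
ZIP_MAGIC = b"PK\x03\x04"                        # ZIP local file header


def sniff_spreadsheet(header):
    """Guess the container type from the first bytes of a file."""
    if header.startswith(XLS_MAGIC):
        return "xls (compound document)"
    if header.startswith(ZIP_MAGIC):
        return "zip container (possibly xlsx)"
    return "unknown (possibly plain text such as csv)"


def sniff_file(path):
    """Read the first 8 bytes of `path` and classify them."""
    with open(path, "rb") as handle:
        return sniff_spreadsheet(handle.read(8))
```

For example, sniff_file("myexcel.xls") should report a compound document for a file like the one produced above, while a csv file falls through to the last case.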
Now let's convert this xls back to a new csv file.
# ssconvert myexcel.xls mynewcsv.csv
# file mynewcsv.csv
mynewcsv.csv: ASCII text
# cat mynewcsv.csv
surname,name,age
Manmohan,Mirkar,23
Vikrant,Shinde,46
Nilesh,Patil,69
Vipul,Marathe,23
You can even use a delimiter other than the comma if you want. So let's take “;” as the delimiter.
# ssconvert -O 'separator=; format=raw' myexcel.xls new_delimitor.txt
# file new_delimitor.txt
new_delimitor.txt: ASCII text
# cat new_delimitor.txt
surname;name;age
Manmohan;Mirkar;23
Vikrant;Shinde;46
Nilesh;Patil;69
Vipul;Marathe;23
2. Installing unoconv program to convert xls file to csv file in Linux
For Red Hat, CentOS and Fedora:
yum install unoconv
For Ubuntu and Debian:
apt-get install unoconv
For this demo of converting an xls file to a csv file in Linux, we will copy our old example file to example1.csv.
# cp example.csv example1.csv
# cat example1.csv
surname,name,age
Manmohan,Mirkar,23
Vikrant,Shinde,46
Nilesh,Patil,69
Vipul,Marathe,23
Now let's convert this “example1.csv” to an xls file.
# unoconv --format xls example1.csv
This will create a new “example1.xls” file in the same directory. In case you want to convert to xlsx format, the command would be:
# unoconv --format xlsx example1.csv
Now let's convert the xls file to a csv file in Linux.
# unoconv --format csv example1.xls
Now let's check the file types of the generated files:
# file example1.xls
example1.xls: Composite Document File V2 Document, Little Endian, Os: Windows, Version 1.0, Code page: -535, Revision Number: 0
# file example1.xlsx
example1.xlsx: Microsoft OOXML
# file example1.csv
example1.csv: ASCII text
I realize that this is not an entirely unix/linux related question. But since this is something I’ll do on linux, I hope someone has an answer.
I have an online Excel file (.xlsx) which gets updated periodically (by someone else). I want to write a script and run it as a cron job in order to process that Excel sheet. But to do that, I need to convert it into a text file (so a .csv) with semicolon-separated columns. It can’t be comma separated, unfortunately, since some columns have commas in them. Is it at all possible to do this conversion from the shell? I have OpenOffice installed and I can do this using its GUI, but I want to know if it is possible to do this from the command line. Thanks!
PS: I have a Mac machine as well, so if some solution can work there, that’s good as well.
peterph
asked Nov 1, 2011 at 3:45
OpenOffice comes with the unoconv program to perform format conversions on the command line.
unoconv -f csv filename.xlsx
For more complex requirements, you can parse XLSX files with Spreadsheet::XLSX in Perl or openpyxl in Python. For example, here’s a quickie script to print out a worksheet as a semicolon-separated CSV file (warning: untested, typed directly in the browser):
perl -MSpreadsheet::XLSX -e '
$\ = "\n"; $, = ";";
my $workbook = Spreadsheet::XLSX->new()->parse($ARGV[0]);
my $worksheet = ($workbook->worksheets())[0];
my ($row_min, $row_max) = $worksheet->row_range();
my ($col_min, $col_max) = $worksheet->col_range();
for my $row ($row_min..$row_max) {
print map {$worksheet->get_cell($row,$_)->value()} ($col_min..$col_max);
}
' filename.xlsx >filename.csv
answered Nov 1, 2011 at 22:13
I’m using Perl’s xls2csv to convert xls files to csv.
Not sure, though, if it works with xlsx too.
About:
It can’t be comma separated unfortunately since some columns have commas in them
That’s why quoting was introduced:
1,2,"data,data, more data"
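As a quick check of that quoting rule, Python's standard csv module produces and reads back exactly this kind of line. This is only an illustration; the helper name to_csv_line is made up.

```python
import csv
import io


def to_csv_line(fields, delimiter=","):
    """Serialize one row, quoting only the fields that need it."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter=delimiter, lineterminator="\n")
    writer.writerow(fields)
    return out.getvalue()


# A field containing the delimiter is wrapped in double quotes.
quoted = to_csv_line([1, 2, "data,data, more data"])

# With a semicolon delimiter, the embedded commas need no quoting.
semicolon = to_csv_line([1, 2, "data,data, more data"], delimiter=";")

# A CSV-aware reader recovers the original three fields.
parsed = next(csv.reader(['1,2,"data,data, more data"']))
```

The catch is that whatever consumes the file must parse it with a CSV-aware reader; splitting lines on commas with cut or a naive awk script would break such fields.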
answered Nov 1, 2011 at 22:22
neurino
I use PHP. Just install the PHPExcel library from http://phpexcel.codeplex.com/
and you will probably need the XML functions too.
This is my code :
<?php
error_reporting(E_ALL);
date_default_timezone_set('Europe/London');
/** PHPExcel_IOFactory */
require_once '/home/markov/Downloads/1.7.6/Classes/PHPExcel/IOFactory.php';
$file = "RIF394305.xlsx"; // path to the XLSX file
// Check prerequisites
if (!file_exists($file)) {
    exit("Input file $file not found.\n");
}
// 'Excel2007' is the PHPExcel reader for .xlsx files
// ('Excel2003XML' reads the older SpreadsheetML format instead)
$objReader = PHPExcel_IOFactory::createReader('Excel2007');
$objPHPExcel = $objReader->load($file);
// Write the loaded workbook back out as CSV next to the input file
$objWriter = PHPExcel_IOFactory::createWriter($objPHPExcel, 'CSV');
$objWriter->save(str_replace('.xlsx', '.csv', $file));
?>
You can reverse the process or use a different Excel/CSV format. Look at the different PHP files in the PHPExcel directory.
Mat
answered Jan 28, 2012 at 16:54