Convert Excel to CSV in Python

This article covers converting an Excel (.xlsx) file into a .csv file. Excel mainly uses two file formats:

  1. (*.xlsx) : Excel Microsoft Office Open XML Format Spreadsheet file.
  2. (*.xls) : Excel Spreadsheet (Excel 97-2003 workbook).

Let's consider a dataset from a shopping store that holds the Customer Serial Number, Customer Name, Customer ID, and Product Cost, stored in an Excel file.


Python3

import pandas as pd

df = pd.DataFrame(pd.read_excel("Test.xlsx"))

df

Output : 

shopping dataframe

Now, let’s see different ways to convert an Excel file into a CSV file :

Method 1: Convert Excel file to CSV file using the pandas library.

Pandas is an open-source software library built for data manipulation and analysis for Python programming language. It offers various functionality in terms of data structures and operations for manipulating numerical tables and time series. It can read, filter, and re-arrange small and large datasets and output them in a range of formats including Excel, JSON, CSV.

To read an Excel file, use the pandas read_excel() function. To write the resulting data frame to a CSV file, use the data frame's to_csv() method.

Code:

Python3

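import pandas as pd

# Read the first sheet of the workbook into a DataFrame
read_file = pd.read_excel("Test.xlsx")

# Write the DataFrame to CSV without the index column, keeping the header row
read_file.to_csv("Test.csv", index=False, header=True)

# Read the CSV back to check the result
df = pd.read_csv("Test.csv")

df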

 Output: 

shopping dataframefile show
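The read-back step at the end of the listing is a quick sanity check that nothing was lost on the way to CSV. A minimal sketch of that check follows. It uses a small in-memory DataFrame with made-up values in place of Test.xlsx (an assumption, so that it runs without any Excel file) and an io.StringIO buffer in place of a file on disk.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for the contents of Test.xlsx
original = pd.DataFrame(
    {
        "Customer Name": ["Ann", "Bob"],
        "Product Cost": [120, 80],
    }
)

# Write the DataFrame as CSV text into an in-memory buffer
buffer = io.StringIO()
original.to_csv(buffer, index=False, header=True)

# Rewind and read the CSV text back into a new DataFrame
buffer.seek(0)
restored = pd.read_csv(buffer)

# Raises AssertionError if the two frames differ
pd.testing.assert_frame_equal(original, restored)
print(restored)
```

The same comparison works with real files by replacing the buffer with a path such as "Test.csv".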

Method 2: Convert Excel file to CSV file using xlrd and CSV library.

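xlrd is a library for reading Excel files. Versions 2.0 and later read only the legacy .xls format, so the .xlsx example in this method requires an xlrd version older than 2.0.

csv is a standard-library module for reading and writing CSV files.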

Code:

Python3

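import xlrd
import csv
import pandas as pd

# Open the first sheet of the workbook (xlrd older than 2.0 is needed for .xlsx)
sheet = xlrd.open_workbook("Test.xlsx").sheet_by_index(0)

# newline="" lets the csv module control the line endings
with open("T.csv", "w", newline="") as f:
    col = csv.writer(f)
    for row in range(sheet.nrows):
        col.writerow(sheet.row_values(row))

# Read the CSV back to check the result
df = pd.read_csv("T.csv")

df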

 Output: 

shopping dataframefile show

Method 3: Convert Excel file to CSV file using openpyxl and CSV library.

openpyxl is a library for reading and writing Excel 2010 xlsx/xlsm/xltx/xltm files. It was created because no existing library could natively read and write the Office Open XML format from Python.

Code:

Python3

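import openpyxl
import csv
import pandas as pd

# Load the workbook and select its active sheet
excel = openpyxl.load_workbook("Test.xlsx")
sheet = excel.active

# newline="" lets the csv module control the line endings
with open("tt.csv", "w", newline="") as f:
    col = csv.writer(f)
    for r in sheet.rows:
        col.writerow([cell.value for cell in r])

# Read the CSV back to check the result
df = pd.read_csv("tt.csv")

df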

 Output: 

shopping dataframefiles show
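The following is a self-contained sketch of the same openpyxl approach that can be run without an existing Test.xlsx. It first builds a small workbook with made-up rows in a temporary directory (the data, the temporary paths, and the variable names are all assumptions for illustration), then converts it. It uses iter_rows(values_only=True), which yields plain cell values instead of cell objects.

```python
import csv
import os
import tempfile

from openpyxl import Workbook, load_workbook

# Work in a temporary directory so no existing files are touched
workdir = tempfile.mkdtemp()
xlsx_path = os.path.join(workdir, "Test.xlsx")
csv_path = os.path.join(workdir, "Test.csv")

# Build a small hypothetical workbook to convert
wb = Workbook()
ws = wb.active
ws.append(["Customer Name", "Product Cost"])
ws.append(["Ann", 120])
ws.append(["Bob", 80])
wb.save(xlsx_path)

# Convert the active sheet to CSV, one row at a time
sheet = load_workbook(xlsx_path).active
with open(csv_path, "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    for row in sheet.iter_rows(values_only=True):
        writer.writerow(row)

# Show the resulting CSV text
with open(csv_path, encoding="utf-8") as f:
    print(f.read())
```

Opening the output inside a with block ensures the file is flushed and closed before anything else reads it.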

In this article, we will show you how to convert an Excel file to a CSV (Comma Separated Values) file using Python.

Assume we have an Excel file named sampleTutorialsPoint.xlsx containing some sample data. We will convert it into a CSV file.

sampleTutorialsPoint.xlsx

| Player Name          | Age | Type        | Country      | Team                        | Runs | Wickets |
|----------------------|-----|-------------|--------------|-----------------------------|------|---------|
| Virat Kohli          | 33  | Batsman     | India        | Royal Challengers Bangalore | 6300 | 20      |
| Bhuvaneshwar Kumar   | 34  | Batsman     | India        | Sun Risers Hyderabad        | 333  | 140     |
| Mahendra Singh Dhoni | 39  | Batsman     | India        | Chennai Super Kings         | 4500 | 0       |
| Rashid Khan          | 28  | Bowler      | Afghanistan  | Gujarat Titans              | 500  | 130     |
| Hardik Pandya        | 29  | All rounder | India        | Gujarat Titans              | 2400 | 85      |
| David Warner         | 34  | Batsman     | Australia    | Delhi Capitals              | 5500 | 12      |
| Kieron Pollard       | 35  | All rounder | West Indies  | Mumbai Indians              | 3000 | 67      |
| Rohit Sharma         | 33  | Batsman     | India        | Mumbai Indians              | 5456 | 20      |
| Kane Williamson      | 33  | Batsman     | New Zealand  | Sun Risers Hyderabad        | 3222 | 5       |
| Kagiso Rabada        | 29  | Bowler      | South Africa | Lucknow Capitals            | 335  | 111     |

Method 1: Converting Excel to CSV using Pandas Module

Algorithm (Steps)

Following are the Algorithm/steps to be followed to perform the desired task −

  • Import the pandas module (Pandas is a Python open-source data manipulation and analysis package)

  • Create a variable to store the path of the input excel file.

  • Read the given Excel file using the pandas read_excel() function (reads an Excel file into a data frame object).

  • Convert the data frame into a CSV file using the to_csv() method (writes the object to a CSV file), passing the output CSV file name, index as None, and header as True as arguments.

  • Read the output CSV file with the read_csv() function (loads a CSV file as a pandas data frame).

  • Show/display the data frame object.

Example

The following program converts an excel file into a CSV file and returns a new CSV file

import pandas as pd

inputExcelFile = "sampleTutorialsPoint.xlsx"

excelFile = pd.read_excel(inputExcelFile)
excelFile.to_csv("ResultCsvFile.csv", index=None, header=True)

dataframeObject = pd.DataFrame(pd.read_csv("ResultCsvFile.csv"))
dataframeObject

Output

On executing, the above program will generate the following output −

|  index | Player Name         | Age | Type      | Country          | Team                      |Runs  | Wickets |
|--------|---------------------|-----|-----------|------------------|---------------------------|----- |---------|
|   0    |Virat Kohli          |   33|Batsman    |   India          |Royal Challengers Bangalore| 6300 |   20    |
|   1    |Bhuvaneshwar Kumar   |   34|Batsman    |   India          |Sun Risers Hyderabad       | 333  |   140   |
|   2    |Mahendra Singh Dhoni |   39|Batsman    |   India          |Chennai Super Kings        | 4500 |    0    |
|   3    |Rashid Khan          |   28|Bowler     |   Afghanistan    |Gujarat Titans             | 500  |   130   |
|   4    |Hardik Pandya        |   29|All rounder|   India          |Gujarat Titans             | 2400 |    85   |
|   5    |David Warner         |   34|Batsman    |   Australia      |Delhi Capitals             | 5500 |    12   |
|   6    |Kieron Pollard       |   35|All rounder|   West Indies    |Mumbai Indians             | 3000 |    67   | 
|   7    |Rohit Sharma         |   33|Batsman    |   India          |Mumbai Indians             | 5456 |    20   |
|   8    |Kane Williamson      |   33|Batsman    |   New Zealand    |Sun Risers Hyderabad       | 3222 |     5   |
|   9    |Kagiso Rabada        |   29|Bowler     |   South Africa   |Lucknow Capitals           | 335  |    111  |

In this program, we use the pandas read_excel() function to read an Excel file containing some sample data, and then we use the to_csv() method to write it out as CSV. If we pass index as None or False, the final CSV file does not contain the index column at the start of each row. Then we read the CSV back into a data frame to check that the values from the Excel file were copied into the CSV file.
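To make the effect of the index argument concrete, here is a small sketch that writes the same DataFrame twice, once with the default setting and once with index=False. The two-row sample frame and the variable names are assumptions for illustration. The output goes to in-memory buffers so that no files are created.

```python
import io

import pandas as pd

# Hypothetical two-row sample in the spirit of the player table
df = pd.DataFrame(
    {
        "Player Name": ["Virat Kohli", "Rashid Khan"],
        "Runs": [6300, 500],
    }
)

# Default behaviour: the row index is written as an unnamed first column
with_index = io.StringIO()
df.to_csv(with_index)

# index=False: only the data columns are written
without_index = io.StringIO()
df.to_csv(without_index, index=False)

print(with_index.getvalue())
print(without_index.getvalue())
```

In the first output every line starts with the row number (and the header starts with an empty field). In the second output the lines start directly with the player name.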

Method 2: Converting Excel to CSV using openpyxl and CSV Modules

Algorithm (Steps)

Following are the Algorithm/steps to be followed to perform the desired task −

  • Import the openpyxl and csv modules. openpyxl is a Python package for reading and managing Excel files; it supports Excel 2010 and later files with the xlsx/xlsm/xltx/xltm extensions and is used for tasks such as data analysis, copying data, drawing charts, styling sheets, and adding formulas. If it is not installed, install it with: pip install openpyxl

  • Create a variable to store the path of the input excel file.

  • To create/load a workbook object, pass the input excel file to the openpyxl module’s load_workbook() function (loads a workbook).

  • Opening an output CSV file in write mode with open() and writer() functions to convert an input excel file into a CSV file.

  • Using the for loop, traverse each row of the worksheet.

  • Use the writerow() function, to write cell data of the excel file into the result CSV file row-by-row.

Example

The following program converts an excel file into a CSV file and returns a new CSV file −

import openpyxl
import csv

inputExcelFile = 'sampleTutorialsPoint.xlsx'

# Load the workbook and select its active worksheet
newWorkbook = openpyxl.load_workbook(inputExcelFile)
firstWorksheet = newWorkbook.active

# newline="" prevents blank lines between rows on Windows
with open("ResultCsvFile.csv", 'w', newline="") as csvFile:
    OutputCsvFile = csv.writer(csvFile, delimiter=",")
    for eachrow in firstWorksheet.rows:
        OutputCsvFile.writerow([cell.value for cell in eachrow])

Output

On executing the above program, a new CSV file (ResultCsvFile.csv) will be created containing the data from the Excel file.

In this program, we have an Excel file with some sample data, which we load as an openpyxl workbook; we then select its worksheet using the active attribute. Next, we create a new CSV file opened in write mode, go through the worksheet row by row, and copy the data into the newly created CSV file.

Conclusion

In this tutorial, we learned how to convert an Excel file to CSV with pandas (leaving out the index column), how to read the resulting CSV file back into a pandas data frame, and how to do the same conversion by loading the file as an openpyxl workbook and writing its rows with the csv module.

Need to convert an Excel file to a CSV file using Python?

If so, you may use the following template to convert your file:

import pandas as pd

read_file = pd.read_excel(r'Path where the Excel file is stored\File name.xlsx')
read_file.to_csv(r'Path to store the CSV file\File name.csv', index=None, header=True)

And if you have a specific Excel sheet that you’d like to convert, you may then use this template:

import pandas as pd

read_file = pd.read_excel(r'Path where the Excel file is stored\File name.xlsx', sheet_name='Your Excel sheet name')
read_file.to_csv(r'Path to store the CSV file\File name.csv', index=None, header=True)

In the next section, you’ll see the complete steps to convert your Excel file to a CSV file using Python.

Step 1: Install the Pandas Package

If you haven’t already done so, install the Pandas package. You may use the following command to install Pandas (under Windows):

pip install pandas

Step 2: Capture the Path where the Excel File is Stored

Next, capture the path where the Excel file is stored on your computer.

Here is an example of a path where an Excel file is stored:

C:\Users\Ron\Desktop\Test\Product_List.xlsx

Where ‘Product_List‘ is the Excel file name, and ‘xlsx‘ is the file extension.

Step 3: Specify the Path where the New CSV File will be Stored

Now you’ll need to specify the path where the new CSV file will be stored. For example:

C:\Users\Ron\Desktop\Test\New_Products.csv

Where ‘New_Products‘ is the new file name, and ‘csv‘ is the file extension.

Step 4: Convert the Excel to CSV using Python

For the final part, use the following template to assist you in the conversion of Excel to CSV:

import pandas as pd

read_file = pd.read_excel(r'Path where the Excel file is stored\File name.xlsx')
read_file.to_csv(r'Path to store the CSV file\File name.csv', index=None, header=True)

This is how the code would look in the context of our example (you'll need to modify the paths to reflect the location where the files will be stored on your computer):

import pandas as pd

read_file = pd.read_excel(r'C:\Users\Ron\Desktop\Test\Product_List.xlsx')
read_file.to_csv(r'C:\Users\Ron\Desktop\Test\New_Products.csv', index=None, header=True)

Once you run the code (adjusted to your paths), you'll get the new CSV file at your specified location.
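Windows paths are easy to get wrong because of the backslashes, which is why the templates use raw strings (the r prefix). As an alternative sketch, the standard-library pathlib module can assemble the input and output paths from their parts. The folder below assumes that the example files live in C:\Users\Ron\Desktop\Test. PureWindowsPath is used only so that the snippet prints Windows-style paths on any operating system; on a real Windows machine, pathlib.Path would be the usual choice.

```python
from pathlib import PureWindowsPath

# Assumed folder holding the example files
folder = PureWindowsPath(r"C:\Users\Ron\Desktop\Test")

# Join the folder and file names without typing separators by hand
excel_path = folder / "Product_List.xlsx"
csv_path = folder / "New_Products.csv"

# Alternative: derive a CSV name from the Excel name by swapping the extension
same_name_csv = excel_path.with_suffix(".csv")

print(excel_path)
print(csv_path)
print(same_name_csv)
```

The resulting path objects can be converted with str() and passed to read_excel() and to_csv().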


XLSX is a file extension for Microsoft Excel spreadsheets, while CSV is a Comma-Separated Value file.

This article discusses using Python to convert XLSX into CSV using two methods.

  • Method 1: Using the pandas package and,
  • Method 2: Using openpyxl and csv modules.

We will use the employees.xlsx Excel with two worksheets – names and roles. See the Figure below.

The objective is to learn how to use the two methods stated above to convert any or all of the sheets in the XLSX file into CSV.

Method 1: Using pandas Package

This method involves reading the XLSX file into a pandas DataFrame using the pandas.read_excel() function and then writing the DataFrame to a CSV file using DataFrame.to_csv().

For this method, you may need to install pandas and openpyxl packages using pip as follows:

pip install openpyxl

pip install pandas

Let’s see an example.

# You may need to install pandas and its dependency openpyxl
# using pip or conda:
# pip install openpyxl
# pip install pandas

import pandas as pd

# Change the path of the XLSX file accordingly.
df_xlsx = pd.read_excel("employees.xlsx")

# Convert the first sheet of the XLSX into employees.csv
df_xlsx.to_csv("employees.csv", index=False)

The code snippet above converts the first sheet only. You can also specify the XLSX worksheet you want to load and convert.

# Convert specific sheet on XLSX into CSV

import pandas as pd

# For pandas < 0.21.0, use sheetname argument, not sheet_name.
df_xlsx = pd.read_excel("employees.xlsx", sheet_name="roles")

df_xlsx.to_csv("employees_roles.csv")

Lastly, you can implement a for-loop to convert each sheet into a CSV. We can do that as follows.

import pandas as pd

# Convert all sheets through a for-loop.

# Create xlsx_file handler
filepath = "employees.xlsx"
xlsx_file = pd.ExcelFile(filepath)

# List of all sheet names
sheets = xlsx_file.sheet_names

# Loop through each sheet, load it, and convert it to CSV.
for sheet in sheets:
    # read the worksheet on the XLSX
    df = pd.read_excel(filepath, sheet_name=sheet)
    # Convert the sheet into CSV naming it like the worksheet.
    df.to_csv(f"{sheet}.csv", index=False)
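As a variation on the loop, read_excel() also accepts sheet_name=None, in which case it loads every sheet in one call and returns a dictionary that maps sheet names to DataFrames. The sketch below is self-contained: it first creates a small employees.xlsx with names and roles sheets in a temporary directory. The cell values and the temporary location are assumptions made up for illustration, and writing the workbook requires an Excel writer engine such as openpyxl to be installed.

```python
import os
import tempfile

import pandas as pd

# Work in a temporary directory so no existing files are touched
workdir = tempfile.mkdtemp()
filepath = os.path.join(workdir, "employees.xlsx")

# Build a hypothetical two-sheet workbook to convert
with pd.ExcelWriter(filepath) as writer:
    pd.DataFrame({"name": ["Ann", "Bob"]}).to_excel(
        writer, sheet_name="names", index=False
    )
    pd.DataFrame({"role": ["Engineer", "Analyst"]}).to_excel(
        writer, sheet_name="roles", index=False
    )

# sheet_name=None loads all sheets into a dict: {sheet name: DataFrame}
all_sheets = pd.read_excel(filepath, sheet_name=None)

# Write each sheet to its own CSV file named after the sheet
written = []
for sheet_name, df in all_sheets.items():
    out_path = os.path.join(workdir, f"{sheet_name}.csv")
    df.to_csv(out_path, index=False)
    written.append(out_path)

print(list(all_sheets))
```

This reads the workbook once instead of once per sheet, which may matter for large files.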

Method 2: Using openpyxl and csv packages

This method involves opening the XLSX file and writing its content into a CSV row by row. If the openpyxl package is not installed, you can do that using pip by running the following command line.

pip install openpyxl

The following code snippet shows how to convert the first worksheet (or any other sheet) on the XLSX file into CSV.

import openpyxl
import csv

# Load in the workbook
workbook = openpyxl.load_workbook(filename="employees.xlsx")

# Grab the active worksheet (often the first sheet).
# Use workbook[<sheet>] to get a specific sheet, e.g. workbook["roles"].
worksheet = workbook.active

# Create the CSV file and write the rows of the XLSX file into it.
with open("results.csv", "w", newline="") as outfile:
    c = csv.writer(outfile)
    # Loop through each row of the XLSX file and write the result into CSV
    for row in worksheet.rows:
        c.writerow([cell.value for cell in row])
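To illustrate picking a sheet by name rather than relying on the active one, here is a short sketch that builds a two-sheet workbook in memory and converts only the roles sheet. The sheet contents are made-up sample values, and the CSV is written to an io.StringIO buffer instead of a file so the result can be inspected directly. It also shows that an empty cell (None) ends up as an empty field in the CSV.

```python
import csv
import io

from openpyxl import Workbook

# Build a hypothetical workbook with two sheets, entirely in memory
wb = Workbook()
names = wb.active
names.title = "names"
names.append(["id", "name"])
names.append([1, "Ann"])

roles = wb.create_sheet("roles")
roles.append(["id", "role"])
roles.append([1, "Engineer"])
roles.append([2, None])  # an empty cell

# List the available sheet names, then select one by name
print(wb.sheetnames)
selected = wb["roles"]

# Write the selected sheet as CSV text into a buffer
buffer = io.StringIO()
writer = csv.writer(buffer)
for row in selected.iter_rows(values_only=True):
    writer.writerow(row)

roles_csv = buffer.getvalue()
print(roles_csv)
```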

Like in Method 1, we can convert all the sheets on the XLSX file into CSV through a for-loop, as shown below.


# Convert all sheets

from openpyxl import load_workbook
import csv

# Load in the workbook
workbook = load_workbook(filename="employees.xlsx")

# get all sheets on the XLSX file.
sheets = workbook.sheetnames

# Loop through each sheet and convert it to CSV
for sheet in sheets:
    # Create the CSV file and write the rows to it
    with open(f"{sheet}.csv", "w", newline="") as outfile:
        c = csv.writer(outfile)
        # Loop all rows of a given sheet.
        for row in workbook[sheet]:
            c.writerow([cell.value for cell in row])

Conclusion

This article discussed two methods of converting XLSX to CSV in Python: using pandas and openpyxl.

You can choose one of the methods based on the task at hand or the data size.

If you are dealing with many data manipulation tasks, you can go for pandas because it is a great tool for that purpose. Otherwise, if you need to read and write excel files and maintain excel format, you should use openpyxl.

Note also that the method using pandas is slightly faster than openpyxl when converting a large XLSX into CSV.


  1. The XLSX and CSV File Formats
  2. Use the Pandas Library to Convert XLSX to CSV File in Python
  3. Use the xlrd and csv Modules to Convert XLSX to CSV File in Python
  4. Use the openpyxl and csv Modules to Convert XLSX to CSV File in Python
  5. Conclusion

Convert XLSX to CSV File in Python

This tutorial will demonstrate converting an XLSX file to CSV in Python.

The XLSX and CSV File Formats

The default format of an Excel file is XLSX. It stores the workbook data along with formulas, charts, and formatting.

We can also store an Excel workbook as a CSV file.

A CSV is a comma-separated text file. This text file can be accessed using a simple text editor as well.

A CSV file is plain text, so it is simple to parse and can be read by almost any tool. However, a CSV file only stores the data.

All the formulas, charts, and pivots will be lost if an Excel workbook is stored as CSV.

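XLSX is the current format of Excel workbooks. Up to Excel 2003, the file format was XLS.

Not every method discussed below works for both formats. pandas handles both, provided the matching reader library is installed. openpyxl reads only XLSX-family files, and xlrd 2.0 and later reads only XLS files.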

Use the Pandas Library to Convert XLSX to CSV File in Python

The pandas module allows us to create and work with DataFrame objects. The data is organized into rows and columns in a DataFrame.

We can read XLSX and CSV files into a DataFrame using the Pandas library.

To convert XLSX to CSV using Pandas, we will read an XLSX file into a DataFrame and export this as a CSV file.

To read excel files, we can use the pandas.read_excel() function. This stores the data in a DataFrame.

Then, this is saved as a CSV file using the DataFrame.to_csv() method.

Example:

import pandas as pd
df = pd.read_excel('sample.xlsx')
df.to_csv('sample.csv')

Use the xlrd and csv Modules to Convert XLSX to CSV File in Python

The xlrd module provides an efficient way to read Excel files. The file's contents can be written to a CSV file using the csv module.

Let us discuss how.

The xlrd.open_workbook() function reads a workbook. Note that xlrd 2.0 and later supports only the .xls format, so opening an XLSX workbook this way requires an xlrd version older than 2.0. We assume that we only want to convert the first sheet of the workbook to CSV.

This sheet is accessed using the sheet_by_index() function. The index of the first sheet, which is zero, is passed to this function.

We will create a CSV file using the open() function, and create a writer object using the csv.writer() constructor. This object will allow us to write data to the CSV file.

We will iterate the total number of rows in the file and write each row using the writer object with the writerow() function. We get the row’s content using the row_values() function.

We will implement this in the following example.

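import xlrd
import csv

# Reading .xlsx with xlrd requires a version older than 2.0
data = xlrd.open_workbook('sample.xlsx').sheet_by_index(0)

# The with block closes the CSV file once all rows are written
with open("sample.csv", 'w', newline="") as f:
    ob = csv.writer(f)
    for r in range(data.nrows):
        ob.writerow(data.row_values(r))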
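Since xlrd 2.0 and later reads only .xls files while openpyxl reads only the XLSX family, it can help to pick the reader from the file extension. The helper below, choose_engine, is a hypothetical function written for this sketch and is not part of any library. The names it returns are values accepted by the engine parameter of pandas.read_excel().

```python
from pathlib import Path


def choose_engine(path):
    """Return a pandas read_excel engine name based on the file extension."""
    suffix = Path(path).suffix.lower()
    if suffix == ".xls":
        # Legacy Excel 97-2003 workbooks are read with xlrd
        return "xlrd"
    if suffix in {".xlsx", ".xlsm", ".xltx", ".xltm"}:
        # Office Open XML workbooks are read with openpyxl
        return "openpyxl"
    raise ValueError(f"Unsupported Excel extension: {suffix!r}")


print(choose_engine("sample.xlsx"))
print(choose_engine("legacy.xls"))
```

A call such as pd.read_excel(path, engine=choose_engine(path)) would then make the choice explicit, assuming the corresponding library is installed.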

Use the openpyxl and csv Modules to Convert XLSX to CSV File in Python

The openpyxl module is used in Python to perform reading and writing operations on Excel files. We can use this module with the csv library in a similar approach as we did previously.

The openpyxl module will be used to read the XLSX file using the load_workbook() function. We will only convert the current sheet to CSV.

This sheet is accessed using the active attribute.

We will write the contents of this sheet to the CSV file using the csv.writer object, as done previously. We will iterate through the sheet and read the contents of the row using list comprehension.

These contents will be written to the CSV file.

See the code below.

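import openpyxl
import csv

# Load the workbook and take its active sheet
data = openpyxl.load_workbook('sample.xlsx').active

# The with block closes the CSV file once all rows are written
with open("sample.csv", 'w', newline="") as f:
    ob = csv.writer(f)
    for r in data.rows:
        row = [a.value for a in r]
        ob.writerow(row)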
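The newline="" argument passed to open() in these examples is there because csv.writer emits its own line endings. The following sketch, which uses made-up rows and an in-memory buffer, shows the raw text the writer produces: each row ends with \r\n, and a field containing a comma is quoted automatically. If the file were opened on Windows without newline="", the \n would be translated again and blank lines would appear between rows.

```python
import csv
import io

# Write two hypothetical rows into an in-memory text buffer
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["Name", "Role"])
writer.writerow(["Ann", "Engineer, Senior"])

# repr() makes the line endings and quoting visible
raw = buffer.getvalue()
print(repr(raw))
```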

Conclusion

This tutorial discussed the methods to convert XLSX files to CSV using Python.

The pandas module provides the simplest way to achieve this in three lines of code. The other methods require reading XLSX files using the xlrd and openpyxl modules and writing them to CSV files using the csv module.
