Improve Article
Save Article
Like Article
Improve Article
Save Article
Like Article
XlsxWriter is a Python module for writing files in the XLSX file format. It can be used to write text, numbers, and formulas to multiple worksheets. Also, it supports features such as formatting, images, charts, page setup, auto filters, conditional formatting and many others.
Use this command to install xlsxwriter module:
pip install xlsxwriter
Note: Throughout XlsxWriter, rows and columns are zero indexed. The first cell in a worksheet, A1 is (0, 0), B1 is (0, 1), A2 is (1, 0), B2 is (1, 1) ..similarly for all.
Let’s see how to create and write to an excel-sheet using Python.
Code #1 : Using A1 notation(cell name) for writing data in the specific cells.
Python3
import
xlsxwriter
workbook
=
xlsxwriter.Workbook(
'hello.xlsx'
)
worksheet
=
workbook.add_worksheet()
worksheet.write(
'A1'
,
'Hello..'
)
worksheet.write(
'B1'
,
'Geeks'
)
worksheet.write(
'C1'
,
'For'
)
worksheet.write(
'D1'
,
'Geeks'
)
workbook.close()
Output:
Code #2 : Using the row-column notation(indexing value) for writing data in the specific cells.
Python3
import
xlsxwriter
workbook
=
xlsxwriter.Workbook(
'Example2.xlsx'
)
worksheet
=
workbook.add_worksheet()
row
=
0
column
=
0
content
=
[
"ankit"
,
"rahul"
,
"priya"
,
"harshita"
,
"sumit"
,
"neeraj"
,
"shivam"
]
for
item
in
content :
worksheet.write(row, column, item)
row
+
=
1
workbook.close()
Output:
Code #3 : Creating a new sheet with the specific name
Python3
import
xlsxwriter
workbook
=
xlsxwriter.Workbook(
'Example3.xlsx'
)
worksheet
=
workbook.add_worksheet(
"My sheet"
)
scores
=
(
[
'ankit'
,
1000
],
[
'rahul'
,
100
],
[
'priya'
,
300
],
[
'harshita'
,
50
],
)
row
=
0
col
=
0
for
name, score
in
(scores):
worksheet.write(row, col, name)
worksheet.write(row, col
+
1
, score)
row
+
=
1
workbook.close()
Output:
XlsxWriter has some advantages and disadvantages over the alternative Python modules for writing Excel files.
Advantages:
- It supports more Excel features than any of the alternative modules.
- It has a high degree of fidelity with files produced by Excel. In most cases the files produced are 100% equivalent to files produced by Excel.
- It has extensive documentation, example files and tests.
- It is fast and can be configured to use very little memory even for very large output files.
Disadvantages:
- It cannot read or modify existing Excel XLSX files.
Like Article
Save Article
Last Updated on March 11, 2022 by
We can use the openpyxl library to create an Excel File in Python. This ability is an essential step for any Excel process automation. The best part – we don’t even need to have the Excel program installed.
Library
Open up a command prompt, and use pip install to download openpyxl library:
pip install openpyxl
Create Excel Workbook and Add Worksheets using Python
For this step, we are creating an Excel file in Python and don’t need to open the Excel program. The below code creates an Excel file in memory, then adds two sheets, namely “Sheet_one” and “Sheet_two”.
Note:
- When a Workbook is created, by default it contains at least one sheet. So this file will contain three (3) sheets in total.
- This Excel file doesn’t exist on the hard drive yet and it doesn’t have a name.
from openpyxl import Workbook
wb = Workbook()
wb.create_sheet("Sheet_one")
wb.create_sheet("Sheet_two")
wb.sheetnames
['Sheet', 'Sheet_one', 'Sheet_two']
We can change the sheet name any time by setting the ws.title attribute:
wb['Sheet'].title = 'Sheet_zero'
wb.sheetnames
['Sheet_zero', 'Sheet_one', 'Sheet_two']
Copy A Worksheet
To copy a worksheet, call the wb.copy_worksheet() method:
ws0 = wb['Sheet_zero']
wb.copy_worksheet(ws0)
wb.sheetnames
['Sheet_zero', 'Sheet_one', 'sheet_two', 'Sheet_zero Copy']
Note that openpyxl has limitations and can not copy everything on a sheet. For example, Images and Charts will not be copied.
Changing Sheet/tab Color
We can use the .sheet_properties.tabColor attribute to set tab colors. The value has to be a RGB hex value, and the easiest way to get those values is from the “Hex” box on Excel’s color palette.
wb['Sheet_zero'].sheet_properties.tabColor ='00FF00'
Write Some Data Into Excel
Check out this tutorial to learn how to access Sheets and Cells.
Let’s put some values into “Sheet_one”. As the below code shows, it’s easy to access a cell using the column-row notation.
ws1 = wb['Sheet_one']
ws1['B2'] = 'Fruits'
ws1['C2'] = 'Sales'
To enter values into Sheet programmatically, it’s easier to use the cell() method, which takes two arguments: row and column, both starting from an index of 1.
fruits = ['Apple','Peach','Cherry','Watermelon']
sales = [100,200,300,400]
row_start = 3 #start below the header row 2
col_start = 2 #starts from column B
for i in range(0,len(fruits)):
ws1.cell(row_start+i, col_start).value = fruits[i]
ws1.cell(row_start+i, col_start+1).value = sales[i]
Saving Excel File
Up to this point, the entire Excel file including everything we’ve done is still inside our computer’s memory. We need to save the file to our disk by calling the wb.save() method. If we don’t do this step, our Excel file will disappear after shutting down the Python environment.
wb.save('book_eg.xlsx')
This is how we can create an Excel file using Python. Check the below for more resources.
Additional Resoruces
How to Access Excel File In Python
How to Use Python to Read Excel Formula
Work with Excel Named Range in Python
Watch Now This tutorial has a related video course created by the Real Python team. Watch it together with the written tutorial to deepen your understanding: Editing Excel Spreadsheets in Python With openpyxl
Excel spreadsheets are one of those things you might have to deal with at some point. Either it’s because your boss loves them or because marketing needs them, you might have to learn how to work with spreadsheets, and that’s when knowing openpyxl
comes in handy!
Spreadsheets are a very intuitive and user-friendly way to manipulate large datasets without any prior technical background. That’s why they’re still so commonly used today.
In this article, you’ll learn how to use openpyxl to:
- Manipulate Excel spreadsheets with confidence
- Extract information from spreadsheets
- Create simple or more complex spreadsheets, including adding styles, charts, and so on
This article is written for intermediate developers who have a pretty good knowledge of Python data structures, such as dicts and lists, but also feel comfortable around OOP and more intermediate level topics.
Before You Begin
If you ever get asked to extract some data from a database or log file into an Excel spreadsheet, or if you often have to convert an Excel spreadsheet into some more usable programmatic form, then this tutorial is perfect for you. Let’s jump into the openpyxl
caravan!
Practical Use Cases
First things first, when would you need to use a package like openpyxl
in a real-world scenario? You’ll see a few examples below, but really, there are hundreds of possible scenarios where this knowledge could come in handy.
Importing New Products Into a Database
You are responsible for tech in an online store company, and your boss doesn’t want to pay for a cool and expensive CMS system.
Every time they want to add new products to the online store, they come to you with an Excel spreadsheet with a few hundred rows and, for each of them, you have the product name, description, price, and so forth.
Now, to import the data, you’ll have to iterate over each spreadsheet row and add each product to the online store.
Exporting Database Data Into a Spreadsheet
Say you have a Database table where you record all your users’ information, including name, phone number, email address, and so forth.
Now, the Marketing team wants to contact all users to give them some discounted offer or promotion. However, they don’t have access to the Database, or they don’t know how to use SQL to extract that information easily.
What can you do to help? Well, you can make a quick script using openpyxl
that iterates over every single User record and puts all the essential information into an Excel spreadsheet.
That’s gonna earn you an extra slice of cake at your company’s next birthday party!
Appending Information to an Existing Spreadsheet
You may also have to open a spreadsheet, read the information in it and, according to some business logic, append more data to it.
For example, using the online store scenario again, say you get an Excel spreadsheet with a list of users and you need to append to each row the total amount they’ve spent in your store.
This data is in the Database and, in order to do this, you have to read the spreadsheet, iterate through each row, fetch the total amount spent from the Database and then write back to the spreadsheet.
Not a problem for openpyxl
!
Learning Some Basic Excel Terminology
Here’s a quick list of basic terms you’ll see when you’re working with Excel spreadsheets:
Term | Explanation |
---|---|
Spreadsheet or Workbook | A Spreadsheet is the main file you are creating or working with. |
Worksheet or Sheet | A Sheet is used to split different kinds of content within the same spreadsheet. A Spreadsheet can have one or more Sheets. |
Column | A Column is a vertical line, and it’s represented by an uppercase letter: A. |
Row | A Row is a horizontal line, and it’s represented by a number: 1. |
Cell | A Cell is a combination of Column and Row, represented by both an uppercase letter and a number: A1. |
Getting Started With openpyxl
Now that you’re aware of the benefits of a tool like openpyxl
, let’s get down to it and start by installing the package. For this tutorial, you should use Python 3.7 and openpyxl 2.6.2. To install the package, you can do the following:
After you install the package, you should be able to create a super simple spreadsheet with the following code:
from openpyxl import Workbook
workbook = Workbook()
sheet = workbook.active
sheet["A1"] = "hello"
sheet["B1"] = "world!"
workbook.save(filename="hello_world.xlsx")
The code above should create a file called hello_world.xlsx
in the folder you are using to run the code. If you open that file with Excel you should see something like this:
Woohoo, your first spreadsheet created!
Reading Excel Spreadsheets With openpyxl
Let’s start with the most essential thing one can do with a spreadsheet: read it.
You’ll go from a straightforward approach to reading a spreadsheet to more complex examples where you read the data and convert it into more useful Python structures.
Dataset for This Tutorial
Before you dive deep into some code examples, you should download this sample dataset and store it somewhere as sample.xlsx
:
This is one of the datasets you’ll be using throughout this tutorial, and it’s a spreadsheet with a sample of real data from Amazon’s online product reviews. This dataset is only a tiny fraction of what Amazon provides, but for testing purposes, it’s more than enough.
A Simple Approach to Reading an Excel Spreadsheet
Finally, let’s start reading some spreadsheets! To begin with, open our sample spreadsheet:
>>>
>>> from openpyxl import load_workbook
>>> workbook = load_workbook(filename="sample.xlsx")
>>> workbook.sheetnames
['Sheet 1']
>>> sheet = workbook.active
>>> sheet
<Worksheet "Sheet 1">
>>> sheet.title
'Sheet 1'
In the code above, you first open the spreadsheet sample.xlsx
using load_workbook()
, and then you can use workbook.sheetnames
to see all the sheets you have available to work with. After that, workbook.active
selects the first available sheet and, in this case, you can see that it selects Sheet 1 automatically. Using these methods is the default way of opening a spreadsheet, and you’ll see it many times during this tutorial.
Now, after opening a spreadsheet, you can easily retrieve data from it like this:
>>>
>>> sheet["A1"]
<Cell 'Sheet 1'.A1>
>>> sheet["A1"].value
'marketplace'
>>> sheet["F10"].value
"G-Shock Men's Grey Sport Watch"
To return the actual value of a cell, you need to do .value
. Otherwise, you’ll get the main Cell
object. You can also use the method .cell()
to retrieve a cell using index notation. Remember to add .value
to get the actual value and not a Cell
object:
>>>
>>> sheet.cell(row=10, column=6)
<Cell 'Sheet 1'.F10>
>>> sheet.cell(row=10, column=6).value
"G-Shock Men's Grey Sport Watch"
You can see that the results returned are the same, no matter which way you decide to go with. However, in this tutorial, you’ll be mostly using the first approach: ["A1"]
.
The above shows you the quickest way to open a spreadsheet. However, you can pass additional parameters to change the way a spreadsheet is loaded.
Additional Reading Options
There are a few arguments you can pass to load_workbook()
that change the way a spreadsheet is loaded. The most important ones are the following two Booleans:
- read_only loads a spreadsheet in read-only mode allowing you to open very large Excel files.
- data_only ignores loading formulas and instead loads only the resulting values.
Importing Data From a Spreadsheet
Now that you’ve learned the basics about loading a spreadsheet, it’s about time you get to the fun part: the iteration and actual usage of the values within the spreadsheet.
This section is where you’ll learn all the different ways you can iterate through the data, but also how to convert that data into something usable and, more importantly, how to do it in a Pythonic way.
Iterating Through the Data
There are a few different ways you can iterate through the data depending on your needs.
You can slice the data with a combination of columns and rows:
>>>
>>> sheet["A1:C2"]
((<Cell 'Sheet 1'.A1>, <Cell 'Sheet 1'.B1>, <Cell 'Sheet 1'.C1>),
(<Cell 'Sheet 1'.A2>, <Cell 'Sheet 1'.B2>, <Cell 'Sheet 1'.C2>))
You can get ranges of rows or columns:
>>>
>>> # Get all cells from column A
>>> sheet["A"]
(<Cell 'Sheet 1'.A1>,
<Cell 'Sheet 1'.A2>,
...
<Cell 'Sheet 1'.A99>,
<Cell 'Sheet 1'.A100>)
>>> # Get all cells for a range of columns
>>> sheet["A:B"]
((<Cell 'Sheet 1'.A1>,
<Cell 'Sheet 1'.A2>,
...
<Cell 'Sheet 1'.A99>,
<Cell 'Sheet 1'.A100>),
(<Cell 'Sheet 1'.B1>,
<Cell 'Sheet 1'.B2>,
...
<Cell 'Sheet 1'.B99>,
<Cell 'Sheet 1'.B100>))
>>> # Get all cells from row 5
>>> sheet[5]
(<Cell 'Sheet 1'.A5>,
<Cell 'Sheet 1'.B5>,
...
<Cell 'Sheet 1'.N5>,
<Cell 'Sheet 1'.O5>)
>>> # Get all cells for a range of rows
>>> sheet[5:6]
((<Cell 'Sheet 1'.A5>,
<Cell 'Sheet 1'.B5>,
...
<Cell 'Sheet 1'.N5>,
<Cell 'Sheet 1'.O5>),
(<Cell 'Sheet 1'.A6>,
<Cell 'Sheet 1'.B6>,
...
<Cell 'Sheet 1'.N6>,
<Cell 'Sheet 1'.O6>))
You’ll notice that all of the above examples return a tuple
. If you want to refresh your memory on how to handle tuples
in Python, check out the article on Lists and Tuples in Python.
There are also multiple ways of using normal Python generators to go through the data. The main methods you can use to achieve this are:
.iter_rows()
.iter_cols()
Both methods can receive the following arguments:
min_row
max_row
min_col
max_col
These arguments are used to set boundaries for the iteration:
>>>
>>> for row in sheet.iter_rows(min_row=1,
... max_row=2,
... min_col=1,
... max_col=3):
... print(row)
(<Cell 'Sheet 1'.A1>, <Cell 'Sheet 1'.B1>, <Cell 'Sheet 1'.C1>)
(<Cell 'Sheet 1'.A2>, <Cell 'Sheet 1'.B2>, <Cell 'Sheet 1'.C2>)
>>> for column in sheet.iter_cols(min_row=1,
... max_row=2,
... min_col=1,
... max_col=3):
... print(column)
(<Cell 'Sheet 1'.A1>, <Cell 'Sheet 1'.A2>)
(<Cell 'Sheet 1'.B1>, <Cell 'Sheet 1'.B2>)
(<Cell 'Sheet 1'.C1>, <Cell 'Sheet 1'.C2>)
You’ll notice that in the first example, when iterating through the rows using .iter_rows()
, you get one tuple
element per row selected. While when using .iter_cols()
and iterating through columns, you’ll get one tuple
per column instead.
One additional argument you can pass to both methods is the Boolean values_only
. When it’s set to True
, the values of the cell are returned, instead of the Cell
object:
>>>
>>> for value in sheet.iter_rows(min_row=1,
... max_row=2,
... min_col=1,
... max_col=3,
... values_only=True):
... print(value)
('marketplace', 'customer_id', 'review_id')
('US', 3653882, 'R3O9SGZBVQBV76')
If you want to iterate through the whole dataset, then you can also use the attributes .rows
or .columns
directly, which are shortcuts to using .iter_rows()
and .iter_cols()
without any arguments:
>>>
>>> for row in sheet.rows:
... print(row)
(<Cell 'Sheet 1'.A1>, <Cell 'Sheet 1'.B1>, <Cell 'Sheet 1'.C1>
...
<Cell 'Sheet 1'.M100>, <Cell 'Sheet 1'.N100>, <Cell 'Sheet 1'.O100>)
These shortcuts are very useful when you’re iterating through the whole dataset.
Manipulate Data Using Python’s Default Data Structures
Now that you know the basics of iterating through the data in a workbook, let’s look at smart ways of converting that data into Python structures.
As you saw earlier, the result from all iterations comes in the form of tuples
. However, since a tuple
is nothing more than an immutable list
, you can easily access its data and transform it into other structures.
For example, say you want to extract product information from the sample.xlsx
spreadsheet and into a dictionary where each key is a product ID.
A straightforward way to do this is to iterate over all the rows, pick the columns you know are related to product information, and then store that in a dictionary. Let’s code this out!
First of all, have a look at the headers and see what information you care most about:
>>>
>>> for value in sheet.iter_rows(min_row=1,
... max_row=1,
... values_only=True):
... print(value)
('marketplace', 'customer_id', 'review_id', 'product_id', ...)
This code returns a list of all the column names you have in the spreadsheet. To start, grab the columns with names:
product_id
product_parent
product_title
product_category
Lucky for you, the columns you need are all next to each other so you can use the min_column
and max_column
to easily get the data you want:
>>>
>>> for value in sheet.iter_rows(min_row=2,
... min_col=4,
... max_col=7,
... values_only=True):
... print(value)
('B00FALQ1ZC', 937001370, 'Invicta Women's 15150 "Angel" 18k Yellow...)
('B00D3RGO20', 484010722, "Kenneth Cole New York Women's KC4944...)
...
Nice! Now that you know how to get all the important product information you need, let’s put that data into a dictionary:
import json
from openpyxl import load_workbook
workbook = load_workbook(filename="sample.xlsx")
sheet = workbook.active
products = {}
# Using the values_only because you want to return the cells' values
for row in sheet.iter_rows(min_row=2,
min_col=4,
max_col=7,
values_only=True):
product_id = row[0]
product = {
"parent": row[1],
"title": row[2],
"category": row[3]
}
products[product_id] = product
# Using json here to be able to format the output for displaying later
print(json.dumps(products))
The code above returns a JSON similar to this:
{
"B00FALQ1ZC": {
"parent": 937001370,
"title": "Invicta Women's 15150 ...",
"category": "Watches"
},
"B00D3RGO20": {
"parent": 484010722,
"title": "Kenneth Cole New York ...",
"category": "Watches"
}
}
Here you can see that the output is trimmed to 2 products only, but if you run the script as it is, then you should get 98 products.
Convert Data Into Python Classes
To finalize the reading section of this tutorial, let’s dive into Python classes and see how you could improve on the example above and better structure the data.
For this, you’ll be using the new Python Data Classes that are available from Python 3.7. If you’re using an older version of Python, then you can use the default Classes instead.
So, first things first, let’s look at the data you have and decide what you want to store and how you want to store it.
As you saw right at the start, this data comes from Amazon, and it’s a list of product reviews. You can check the list of all the columns and their meaning on Amazon.
There are two significant elements you can extract from the data available:
- Products
- Reviews
A Product has:
- ID
- Title
- Parent
- Category
The Review has a few more fields:
- ID
- Customer ID
- Stars
- Headline
- Body
- Date
You can ignore a few of the review fields to make things a bit simpler.
So, a straightforward implementation of these two classes could be written in a separate file classes.py
:
import datetime
from dataclasses import dataclass
@dataclass
class Product:
id: str
parent: str
title: str
category: str
@dataclass
class Review:
id: str
customer_id: str
stars: int
headline: str
body: str
date: datetime.datetime
After defining your data classes, you need to convert the data from the spreadsheet into these new structures.
Before doing the conversion, it’s worth looking at our header again and creating a mapping between columns and the fields you need:
>>>
>>> for value in sheet.iter_rows(min_row=1,
... max_row=1,
... values_only=True):
... print(value)
('marketplace', 'customer_id', 'review_id', 'product_id', ...)
>>> # Or an alternative
>>> for cell in sheet[1]:
... print(cell.value)
marketplace
customer_id
review_id
product_id
product_parent
...
Let’s create a file mapping.py
where you have a list of all the field names and their column location (zero-indexed) on the spreadsheet:
# Product fields
PRODUCT_ID = 3
PRODUCT_PARENT = 4
PRODUCT_TITLE = 5
PRODUCT_CATEGORY = 6
# Review fields
REVIEW_ID = 2
REVIEW_CUSTOMER = 1
REVIEW_STARS = 7
REVIEW_HEADLINE = 12
REVIEW_BODY = 13
REVIEW_DATE = 14
You don’t necessarily have to do the mapping above. It’s more for readability when parsing the row data, so you don’t end up with a lot of magic numbers lying around.
Finally, let’s look at the code needed to parse the spreadsheet data into a list of product and review objects:
from datetime import datetime
from openpyxl import load_workbook
from classes import Product, Review
from mapping import PRODUCT_ID, PRODUCT_PARENT, PRODUCT_TITLE,
PRODUCT_CATEGORY, REVIEW_DATE, REVIEW_ID, REVIEW_CUSTOMER,
REVIEW_STARS, REVIEW_HEADLINE, REVIEW_BODY
# Using the read_only method since you're not gonna be editing the spreadsheet
workbook = load_workbook(filename="sample.xlsx", read_only=True)
sheet = workbook.active
products = []
reviews = []
# Using the values_only because you just want to return the cell value
for row in sheet.iter_rows(min_row=2, values_only=True):
product = Product(id=row[PRODUCT_ID],
parent=row[PRODUCT_PARENT],
title=row[PRODUCT_TITLE],
category=row[PRODUCT_CATEGORY])
products.append(product)
# You need to parse the date from the spreadsheet into a datetime format
spread_date = row[REVIEW_DATE]
parsed_date = datetime.strptime(spread_date, "%Y-%m-%d")
review = Review(id=row[REVIEW_ID],
customer_id=row[REVIEW_CUSTOMER],
stars=row[REVIEW_STARS],
headline=row[REVIEW_HEADLINE],
body=row[REVIEW_BODY],
date=parsed_date)
reviews.append(review)
print(products[0])
print(reviews[0])
After you run the code above, you should get some output like this:
Product(id='B00FALQ1ZC', parent=937001370, ...)
Review(id='R3O9SGZBVQBV76', customer_id=3653882, ...)
That’s it! Now you should have the data in a very simple and digestible class format, and you can start thinking of storing this in a Database or any other type of data storage you like.
Using this kind of OOP strategy to parse spreadsheets makes handling the data much simpler later on.
Appending New Data
Before you start creating very complex spreadsheets, have a quick look at an example of how to append data to an existing spreadsheet.
Go back to the first example spreadsheet you created (hello_world.xlsx
) and try opening it and appending some data to it, like this:
from openpyxl import load_workbook
# Start by opening the spreadsheet and selecting the main sheet
workbook = load_workbook(filename="hello_world.xlsx")
sheet = workbook.active
# Write what you want into a specific cell
sheet["C1"] = "writing ;)"
# Save the spreadsheet
workbook.save(filename="hello_world_append.xlsx")
Et voilà, if you open the new hello_world_append.xlsx
spreadsheet, you’ll see the following change:
Notice the additional writing on cell C1
.
Writing Excel Spreadsheets With openpyxl
There are a lot of different things you can write to a spreadsheet, from simple text or number values to complex formulas, charts, or even images.
Let’s start creating some spreadsheets!
Creating a Simple Spreadsheet
Previously, you saw a very quick example of how to write “Hello world!” into a spreadsheet, so you can start with that:
1from openpyxl import Workbook
2
3filename = "hello_world.xlsx"
4
5workbook = Workbook()
6sheet = workbook.active
7
8sheet["A1"] = "hello"
9sheet["B1"] = "world!"
10
11workbook.save(filename=filename)
The highlighted lines in the code above are the most important ones for writing. In the code, you can see that:
- Line 5 shows you how to create a new empty workbook.
- Lines 8 and 9 show you how to add data to specific cells.
- Line 11 shows you how to save the spreadsheet when you’re done.
Even though these lines above can be straightforward, it’s still good to know them well for when things get a bit more complicated.
One thing you can do to help with coming code examples is add the following method to your Python file or console:
>>>
>>> def print_rows():
... for row in sheet.iter_rows(values_only=True):
... print(row)
It makes it easier to print all of your spreadsheet values by just calling print_rows()
.
Basic Spreadsheet Operations
Before you get into the more advanced topics, it’s good for you to know how to manage the most simple elements of a spreadsheet.
Adding and Updating Cell Values
You already learned how to add values to a spreadsheet like this:
>>>
>>> sheet["A1"] = "value"
There’s another way you can do this, by first selecting a cell and then changing its value:
>>>
>>> cell = sheet["A1"]
>>> cell
<Cell 'Sheet'.A1>
>>> cell.value
'hello'
>>> cell.value = "hey"
>>> cell.value
'hey'
The new value is only stored into the spreadsheet once you call workbook.save()
.
The openpyxl
creates a cell when adding a value, if that cell didn’t exist before:
>>>
>>> # Before, our spreadsheet has only 1 row
>>> print_rows()
('hello', 'world!')
>>> # Try adding a value to row 10
>>> sheet["B10"] = "test"
>>> print_rows()
('hello', 'world!')
(None, None)
(None, None)
(None, None)
(None, None)
(None, None)
(None, None)
(None, None)
(None, None)
(None, 'test')
As you can see, when trying to add a value to cell B10
, you end up with a tuple with 10 rows, just so you can have that test value.
Managing Rows and Columns
One of the most common things you have to do when manipulating spreadsheets is adding or removing rows and columns. The openpyxl
package allows you to do that in a very straightforward way by using the methods:
.insert_rows()
.delete_rows()
.insert_cols()
.delete_cols()
Every single one of those methods can receive two arguments:
idx
amount
Using our basic hello_world.xlsx
example again, let’s see how these methods work:
>>>
>>> print_rows()
('hello', 'world!')
>>> # Insert a column before the existing column 1 ("A")
>>> sheet.insert_cols(idx=1)
>>> print_rows()
(None, 'hello', 'world!')
>>> # Insert 5 columns between column 2 ("B") and 3 ("C")
>>> sheet.insert_cols(idx=3, amount=5)
>>> print_rows()
(None, 'hello', None, None, None, None, None, 'world!')
>>> # Delete the created columns
>>> sheet.delete_cols(idx=3, amount=5)
>>> sheet.delete_cols(idx=1)
>>> print_rows()
('hello', 'world!')
>>> # Insert a new row in the beginning
>>> sheet.insert_rows(idx=1)
>>> print_rows()
(None, None)
('hello', 'world!')
>>> # Insert 3 new rows in the beginning
>>> sheet.insert_rows(idx=1, amount=3)
>>> print_rows()
(None, None)
(None, None)
(None, None)
(None, None)
('hello', 'world!')
>>> # Delete the first 4 rows
>>> sheet.delete_rows(idx=1, amount=4)
>>> print_rows()
('hello', 'world!')
The only thing you need to remember is that when inserting new data (rows or columns), the insertion happens before the idx
parameter.
So, if you do insert_rows(1)
, it inserts a new row before the existing first row.
It’s the same for columns: when you call insert_cols(2)
, it inserts a new column right before the already existing second column (B
).
However, when deleting rows or columns, .delete_...
deletes data starting from the index passed as an argument.
For example, when doing delete_rows(2)
it deletes row 2
, and when doing delete_cols(3)
it deletes the third column (C
).
Managing Sheets
Sheet management is also one of those things you might need to know, even though it might be something that you don’t use that often.
If you look back at the code examples from this tutorial, you’ll notice the following recurring piece of code:
This is the way to select the default sheet from a spreadsheet. However, if you’re opening a spreadsheet with multiple sheets, then you can always select a specific one like this:
>>>
>>> # Let's say you have two sheets: "Products" and "Company Sales"
>>> workbook.sheetnames
['Products', 'Company Sales']
>>> # You can select a sheet using its title
>>> products_sheet = workbook["Products"]
>>> sales_sheet = workbook["Company Sales"]
You can also change a sheet title very easily:
>>>
>>> workbook.sheetnames
['Products', 'Company Sales']
>>> products_sheet = workbook["Products"]
>>> products_sheet.title = "New Products"
>>> workbook.sheetnames
['New Products', 'Company Sales']
If you want to create or delete sheets, then you can also do that with .create_sheet()
and .remove()
:
>>>
>>> workbook.sheetnames
['Products', 'Company Sales']
>>> operations_sheet = workbook.create_sheet("Operations")
>>> workbook.sheetnames
['Products', 'Company Sales', 'Operations']
>>> # You can also define the position to create the sheet at
>>> hr_sheet = workbook.create_sheet("HR", 0)
>>> workbook.sheetnames
['HR', 'Products', 'Company Sales', 'Operations']
>>> # To remove them, just pass the sheet as an argument to the .remove()
>>> workbook.remove(operations_sheet)
>>> workbook.sheetnames
['HR', 'Products', 'Company Sales']
>>> workbook.remove(hr_sheet)
>>> workbook.sheetnames
['Products', 'Company Sales']
One other thing you can do is make duplicates of a sheet using copy_worksheet()
:
>>>
>>> workbook.sheetnames
['Products', 'Company Sales']
>>> products_sheet = workbook["Products"]
>>> workbook.copy_worksheet(products_sheet)
<Worksheet "Products Copy">
>>> workbook.sheetnames
['Products', 'Company Sales', 'Products Copy']
If you open your spreadsheet after saving the above code, you’ll notice that the sheet Products Copy is a duplicate of the sheet Products.
Freezing Rows and Columns
Something that you might want to do when working with big spreadsheets is to freeze a few rows or columns, so they remain visible when you scroll right or down.
Freezing data allows you to keep an eye on important rows or columns, regardless of where you scroll in the spreadsheet.
Again, openpyxl
also has a way to accomplish this by using the worksheet freeze_panes
attribute. For this example, go back to our sample.xlsx
spreadsheet and try doing the following:
>>>
>>> workbook = load_workbook(filename="sample.xlsx")
>>> sheet = workbook.active
>>> sheet.freeze_panes = "C2"
>>> workbook.save("sample_frozen.xlsx")
If you open the sample_frozen.xlsx
spreadsheet in your favorite spreadsheet editor, you’ll notice that row 1
and columns A
and B
are frozen and are always visible no matter where you navigate within the spreadsheet.
This feature is handy, for example, to keep headers within sight, so you always know what each column represents.
Here’s how it looks in the editor:
Notice how you’re at the end of the spreadsheet, and yet, you can see both row 1
and columns A
and B
.
Adding Filters
You can use openpyxl
to add filters and sorts to your spreadsheet. However, when you open the spreadsheet, the data won’t be rearranged according to these sorts and filters.
At first, this might seem like a pretty useless feature, but when you’re programmatically creating a spreadsheet that is going to be sent and used by somebody else, it’s still nice to at least create the filters and allow people to use it afterward.
The code below is an example of how you would add some filters to our existing sample.xlsx
spreadsheet:
>>>
>>> # Check the used spreadsheet space using the attribute "dimensions"
>>> sheet.dimensions
'A1:O100'
>>> sheet.auto_filter.ref = "A1:O100"
>>> workbook.save(filename="sample_with_filters.xlsx")
You should now see the filters created when opening the spreadsheet in your editor:
You don’t have to use sheet.dimensions
if you know precisely which part of the spreadsheet you want to apply filters to.
Adding Formulas
Formulas (or formulae) are one of the most powerful features of spreadsheets.
They gives you the power to apply specific mathematical equations to a range of cells. Using formulas with openpyxl
is as simple as editing the value of a cell.
You can see the list of formulas supported by openpyxl
:
>>>
>>> from openpyxl.utils import FORMULAE
>>> FORMULAE
frozenset({'ABS',
'ACCRINT',
'ACCRINTM',
'ACOS',
'ACOSH',
'AMORDEGRC',
'AMORLINC',
'AND',
...
'YEARFRAC',
'YIELD',
'YIELDDISC',
'YIELDMAT',
'ZTEST'})
Let’s add some formulas to our sample.xlsx
spreadsheet.
Starting with something easy, let’s check the average star rating for the 99 reviews within the spreadsheet:
>>>
>>> # Star rating is column "H"
>>> sheet["P2"] = "=AVERAGE(H2:H100)"
>>> workbook.save(filename="sample_formulas.xlsx")
If you open the spreadsheet now and go to cell P2
, you should see that its value is: 4.18181818181818. Have a look in the editor:
You can use the same methodology to add any formulas to your spreadsheet. For example, let’s count the number of reviews that had helpful votes:
>>>
>>> # The helpful votes are counted on column "I"
>>> sheet["P3"] = '=COUNTIF(I2:I100, ">0")'
>>> workbook.save(filename="sample_formulas.xlsx")
You should get the number 21
on your P3
spreadsheet cell like so:
You’ll have to make sure that the strings within a formula are always in double quotes, so you either have to use single quotes around the formula like in the example above or you’ll have to escape the double quotes inside the formula: "=COUNTIF(I2:I100, ">0")"
.
There are a ton of other formulas you can add to your spreadsheet using the same procedure you tried above. Give it a go yourself!
Adding Styles
Even though styling a spreadsheet might not be something you would do every day, it’s still good to know how to do it.
Using openpyxl
, you can apply multiple styling options to your spreadsheet, including fonts, borders, colors, and so on. Have a look at the openpyxl
documentation to learn more.
You can also choose to either apply a style directly to a cell or create a template and reuse it to apply styles to multiple cells.
Let’s start by having a look at simple cell styling, using our sample.xlsx
again as the base spreadsheet:
>>>
>>> # Import necessary style classes
>>> from openpyxl.styles import Font, Color, Alignment, Border, Side
>>> # Create a few styles
>>> bold_font = Font(bold=True)
>>> big_red_text = Font(color="00FF0000", size=20)
>>> center_aligned_text = Alignment(horizontal="center")
>>> double_border_side = Side(border_style="double")
>>> square_border = Border(top=double_border_side,
... right=double_border_side,
... bottom=double_border_side,
... left=double_border_side)
>>> # Style some cells!
>>> sheet["A2"].font = bold_font
>>> sheet["A3"].font = big_red_text
>>> sheet["A4"].alignment = center_aligned_text
>>> sheet["A5"].border = square_border
>>> workbook.save(filename="sample_styles.xlsx")
If you open your spreadsheet now, you should see quite a few different styles on the first 5 cells of column A
:
There you go. You got:
- A2 with the text in bold
- A3 with the text in red and bigger font size
- A4 with the text centered
- A5 with a square border around the text
You can also combine styles by simply adding them to the cell at the same time:
>>>
>>> # Reusing the same styles from the example above
>>> sheet["A6"].alignment = center_aligned_text
>>> sheet["A6"].font = big_red_text
>>> sheet["A6"].border = square_border
>>> workbook.save(filename="sample_styles.xlsx")
Have a look at cell A6
here:
When you want to apply multiple styles to one or several cells, you can use a NamedStyle
class instead, which is like a style template that you can use over and over again. Have a look at the example below:
>>>
>>> from openpyxl.styles import NamedStyle
>>> # Let's create a style template for the header row
>>> header = NamedStyle(name="header")
>>> header.font = Font(bold=True)
>>> header.border = Border(bottom=Side(border_style="thin"))
>>> header.alignment = Alignment(horizontal="center", vertical="center")
>>> # Now let's apply this to all first row (header) cells
>>> header_row = sheet[1]
>>> for cell in header_row:
... cell.style = header
>>> workbook.save(filename="sample_styles.xlsx")
If you open the spreadsheet now, you should see that its first row is bold, the text is aligned to the center, and there’s a small bottom border! Have a look below:
As you saw above, there are many options when it comes to styling, and it depends on the use case, so feel free to check openpyxl
documentation and see what other things you can do.
Conditional Formatting
This feature is one of my personal favorites when it comes to adding styles to a spreadsheet.
It’s a much more powerful approach to styling because it dynamically applies styles according to how the data in the spreadsheet changes.
In a nutshell, conditional formatting allows you to specify a list of styles to apply to a cell (or cell range) according to specific conditions.
For example, a widespread use case is to have a balance sheet where all the negative totals are in red, and the positive ones are in green. This formatting makes it much more efficient to spot good vs bad periods.
Without further ado, let’s pick our favorite spreadsheet—sample.xlsx
—and add some conditional formatting.
You can start by adding a simple one that adds a red background to all reviews with less than 3 stars:
>>>
>>> from openpyxl.styles import PatternFill
>>> from openpyxl.styles.differential import DifferentialStyle
>>> from openpyxl.formatting.rule import Rule
>>> red_background = PatternFill(fgColor="00FF0000")
>>> diff_style = DifferentialStyle(fill=red_background)
>>> rule = Rule(type="expression", dxf=diff_style)
>>> rule.formula = ["$H1<3"]
>>> sheet.conditional_formatting.add("A1:O100", rule)
>>> workbook.save("sample_conditional_formatting.xlsx")
Now you’ll see all the reviews with a star rating below 3 marked with a red background:
Code-wise, the only things that are new here are the objects DifferentialStyle
and Rule
:
DifferentialStyle
is quite similar toNamedStyle
, which you already saw above, and it’s used to aggregate multiple styles such as fonts, borders, alignment, and so forth.Rule
is responsible for selecting the cells and applying the styles if the cells match the rule’s logic.
Using a Rule
object, you can create numerous conditional formatting scenarios.
However, for simplicity sake, the openpyxl
package offers 3 built-in formats that make it easier to create a few common conditional formatting patterns. These built-ins are:
ColorScale
IconSet
DataBar
The ColorScale gives you the ability to create color gradients:
>>>
>>> from openpyxl.formatting.rule import ColorScaleRule
>>> color_scale_rule = ColorScaleRule(start_type="min",
... start_color="00FF0000", # Red
... end_type="max",
... end_color="0000FF00") # Green
>>> # Again, let's add this gradient to the star ratings, column "H"
>>> sheet.conditional_formatting.add("H2:H100", color_scale_rule)
>>> workbook.save(filename="sample_conditional_formatting_color_scale.xlsx")
Now you should see a color gradient on column H
, from red to green, according to the star rating:
You can also add a third color and make two gradients instead:
>>>
>>> from openpyxl.formatting.rule import ColorScaleRule
>>> color_scale_rule = ColorScaleRule(start_type="num",
... start_value=1,
... start_color="00FF0000", # Red
... mid_type="num",
... mid_value=3,
... mid_color="00FFFF00", # Yellow
... end_type="num",
... end_value=5,
... end_color="0000FF00") # Green
>>> # Again, let's add this gradient to the star ratings, column "H"
>>> sheet.conditional_formatting.add("H2:H100", color_scale_rule)
>>> workbook.save(filename="sample_conditional_formatting_color_scale_3.xlsx")
This time, you’ll notice that star ratings between 1 and 3 have a gradient from red to yellow, and star ratings between 3 and 5 have a gradient from yellow to green:
The IconSet allows you to add an icon to the cell according to its value:
>>>
>>> from openpyxl.formatting.rule import IconSetRule
>>> icon_set_rule = IconSetRule("5Arrows", "num", [1, 2, 3, 4, 5])
>>> sheet.conditional_formatting.add("H2:H100", icon_set_rule)
>>> workbook.save("sample_conditional_formatting_icon_set.xlsx")
You’ll see a colored arrow next to the star rating. This arrow is red and points down when the value of the cell is 1 and, as the rating gets better, the arrow starts pointing up and becomes green:
The openpyxl
package has a full list of other icons you can use, besides the arrow.
Finally, the DataBar allows you to create progress bars:
>>>
>>> from openpyxl.formatting.rule import DataBarRule
>>> data_bar_rule = DataBarRule(start_type="num",
... start_value=1,
... end_type="num",
... end_value="5",
... color="0000FF00") # Green
>>> sheet.conditional_formatting.add("H2:H100", data_bar_rule)
>>> workbook.save("sample_conditional_formatting_data_bar.xlsx")
You’ll now see a green progress bar that gets fuller the closer the star rating is to the number 5:
As you can see, there are a lot of cool things you can do with conditional formatting.
Here, you saw only a few examples of what you can achieve with it, but check the openpyxl
documentation to see a bunch of other options.
Adding Images
Even though images are not something that you’ll often see in a spreadsheet, it’s quite cool to be able to add them. Maybe you can use it for branding purposes or to make spreadsheets more personal.
To be able to load images to a spreadsheet using openpyxl
, you’ll have to install Pillow
:
Apart from that, you’ll also need an image. For this example, you can grab the Real Python logo below and convert it from .webp
to .png
using an online converter such as cloudconvert.com, save the final file as logo.png
, and copy it to the root folder where you’re running your examples:
Afterward, this is the code you need to import that image into the hello_word.xlsx
spreadsheet:
from openpyxl import load_workbook
from openpyxl.drawing.image import Image
# Let's use the hello_world spreadsheet since it has less data
workbook = load_workbook(filename="hello_world.xlsx")
sheet = workbook.active
logo = Image("logo.png")
# A bit of resizing to not fill the whole spreadsheet with the logo
logo.height = 150
logo.width = 150
sheet.add_image(logo, "A3")
workbook.save(filename="hello_world_logo.xlsx")
You have an image on your spreadsheet! Here it is:
The image’s left top corner is on the cell you chose, in this case, A3
.
Adding Pretty Charts
Another powerful thing you can do with spreadsheets is create an incredible variety of charts.
Charts are a great way to visualize and understand loads of data quickly. There are a lot of different chart types: bar chart, pie chart, line chart, and so on. openpyxl
has support for a lot of them.
Here, you’ll see only a couple of examples of charts because the theory behind it is the same for every single chart type:
For any chart you want to build, you’ll need to define the chart type: BarChart
, LineChart
, and so forth, plus the data to be used for the chart, which is called Reference
.
Before you can build your chart, you need to define what data you want to see represented in it. Sometimes, you can use the dataset as is, but other times you need to massage the data a bit to get additional information.
Let’s start by building a new workbook with some sample data:
1from openpyxl import Workbook
2from openpyxl.chart import BarChart, Reference
3
4workbook = Workbook()
5sheet = workbook.active
6
7# Let's create some sample sales data
8rows = [
9 ["Product", "Online", "Store"],
10 [1, 30, 45],
11 [2, 40, 30],
12 [3, 40, 25],
13 [4, 50, 30],
14 [5, 30, 25],
15 [6, 25, 35],
16 [7, 20, 40],
17]
18
19for row in rows:
20 sheet.append(row)
Now you’re going to start by creating a bar chart that displays the total number of sales per product:
22chart = BarChart()
23data = Reference(worksheet=sheet,
24 min_row=1,
25 max_row=8,
26 min_col=2,
27 max_col=3)
28
29chart.add_data(data, titles_from_data=True)
30sheet.add_chart(chart, "E2")
31
32workbook.save("chart.xlsx")
There you have it. Below, you can see a very straightforward bar chart showing the difference between online product sales online and in-store product sales:
Like with images, the top left corner of the chart is on the cell you added the chart to. In your case, it was on cell E2
.
Try creating a line chart instead, changing the data a bit:
1import random
2from openpyxl import Workbook
3from openpyxl.chart import LineChart, Reference
4
5workbook = Workbook()
6sheet = workbook.active
7
8# Let's create some sample sales data
9rows = [
10 ["", "January", "February", "March", "April",
11 "May", "June", "July", "August", "September",
12 "October", "November", "December"],
13 [1, ],
14 [2, ],
15 [3, ],
16]
17
18for row in rows:
19 sheet.append(row)
20
21for row in sheet.iter_rows(min_row=2,
22 max_row=4,
23 min_col=2,
24 max_col=13):
25 for cell in row:
26 cell.value = random.randrange(5, 100)
With the above code, you’ll be able to generate some random data regarding the sales of 3 different products across a whole year.
Once that’s done, you can very easily create a line chart with the following code:
28chart = LineChart()
29data = Reference(worksheet=sheet,
30 min_row=2,
31 max_row=4,
32 min_col=1,
33 max_col=13)
34
35chart.add_data(data, from_rows=True, titles_from_data=True)
36sheet.add_chart(chart, "C6")
37
38workbook.save("line_chart.xlsx")
Here’s the outcome of the above piece of code:
One thing to keep in mind here is the fact that you’re using from_rows=True
when adding the data. This argument makes the chart plot row by row instead of column by column.
In your sample data, you see that each product has a row with 12 values (1 column per month). That’s why you use from_rows
. If you don’t pass that argument, by default, the chart tries to plot by column, and you’ll get a month-by-month comparison of sales.
Another difference that has to do with the above argument change is the fact that our Reference
now starts from the first column, min_col=1
, instead of the second one. This change is needed because the chart now expects the first column to have the titles.
There are a couple of other things you can also change regarding the style of the chart. For example, you can add specific categories to the chart:
cats = Reference(worksheet=sheet,
min_row=1,
max_row=1,
min_col=2,
max_col=13)
chart.set_categories(cats)
Add this piece of code before saving the workbook, and you should see the month names appearing instead of numbers:
Code-wise, this is a minimal change. But in terms of the readability of the spreadsheet, this makes it much easier for someone to open the spreadsheet and understand the chart straight away.
Another thing you can do to improve the chart readability is to add an axis. You can do it using the attributes x_axis
and y_axis
:
chart.x_axis.title = "Months"
chart.y_axis.title = "Sales (per unit)"
This will generate a spreadsheet like the below one:
As you can see, small changes like the above make reading your chart a much easier and quicker task.
There is also a way to style your chart by using Excel’s default ChartStyle
property. In this case, you have to choose a number between 1 and 48. Depending on your choice, the colors of your chart change as well:
# You can play with this by choosing any number between 1 and 48
chart.style = 24
With the style selected above, all lines have some shade of orange:
There is no clear documentation on what each style number looks like, but this spreadsheet has a few examples of the styles available.
Here’s the full code used to generate the line chart with categories, axis titles, and style:
import random
from openpyxl import Workbook
from openpyxl.chart import LineChart, Reference
workbook = Workbook()
sheet = workbook.active
# Let's create some sample sales data
rows = [
["", "January", "February", "March", "April",
"May", "June", "July", "August", "September",
"October", "November", "December"],
[1, ],
[2, ],
[3, ],
]
for row in rows:
sheet.append(row)
for row in sheet.iter_rows(min_row=2,
max_row=4,
min_col=2,
max_col=13):
for cell in row:
cell.value = random.randrange(5, 100)
# Create a LineChart and add the main data
chart = LineChart()
data = Reference(worksheet=sheet,
min_row=2,
max_row=4,
min_col=1,
max_col=13)
chart.add_data(data, titles_from_data=True, from_rows=True)
# Add categories to the chart
cats = Reference(worksheet=sheet,
min_row=1,
max_row=1,
min_col=2,
max_col=13)
chart.set_categories(cats)
# Rename the X and Y Axis
chart.x_axis.title = "Months"
chart.y_axis.title = "Sales (per unit)"
# Apply a specific Style
chart.style = 24
# Save!
sheet.add_chart(chart, "C6")
workbook.save("line_chart.xlsx")
There are a lot more chart types and customization you can apply, so be sure to check out the package documentation on this if you need some specific formatting.
Convert Python Classes to Excel Spreadsheet
You already saw how to convert an Excel spreadsheet’s data into Python classes, but now let’s do the opposite.
Let’s imagine you have a database and are using some Object-Relational Mapping (ORM) to map DB objects into Python classes. Now, you want to export those same objects into a spreadsheet.
Let’s assume the following data classes to represent the data coming from your database regarding product sales:
from dataclasses import dataclass
from typing import List
@dataclass
class Sale:
quantity: int
@dataclass
class Product:
id: str
name: str
sales: List[Sale]
Now, let’s generate some random data, assuming the above classes are stored in a db_classes.py
file:
1import random
2
3# Ignore these for now. You'll use them in a sec ;)
4from openpyxl import Workbook
5from openpyxl.chart import LineChart, Reference
6
7from db_classes import Product, Sale
8
9products = []
10
11# Let's create 5 products
12for idx in range(1, 6):
13 sales = []
14
15 # Create 5 months of sales
16 for _ in range(5):
17 sale = Sale(quantity=random.randrange(5, 100))
18 sales.append(sale)
19
20 product = Product(id=str(idx),
21 name="Product %s" % idx,
22 sales=sales)
23 products.append(product)
By running this piece of code, you should get 5 products with 5 months of sales with a random quantity of sales for each month.
Now, to convert this into a spreadsheet, you need to iterate over the data and append it to the spreadsheet:
25workbook = Workbook()
26sheet = workbook.active
27
28# Append column names first
29sheet.append(["Product ID", "Product Name", "Month 1",
30 "Month 2", "Month 3", "Month 4", "Month 5"])
31
32# Append the data
33for product in products:
34 data = [product.id, product.name]
35 for sale in product.sales:
36 data.append(sale.quantity)
37 sheet.append(data)
That’s it. That should allow you to create a spreadsheet with some data coming from your database.
However, why not use some of that cool knowledge you gained recently to add a chart as well to display that data more visually?
All right, then you could probably do something like this:
38chart = LineChart()
39data = Reference(worksheet=sheet,
40 min_row=2,
41 max_row=6,
42 min_col=2,
43 max_col=7)
44
45chart.add_data(data, titles_from_data=True, from_rows=True)
46sheet.add_chart(chart, "B8")
47
48cats = Reference(worksheet=sheet,
49 min_row=1,
50 max_row=1,
51 min_col=3,
52 max_col=7)
53chart.set_categories(cats)
54
55chart.x_axis.title = "Months"
56chart.y_axis.title = "Sales (per unit)"
57
58workbook.save(filename="oop_sample.xlsx")
Now we’re talking! Here’s a spreadsheet generated from database objects and with a chart and everything:
That’s a great way for you to wrap up your new knowledge of charts!
Bonus: Working With Pandas
Even though you can use Pandas to handle Excel files, there are few things that you either can’t accomplish with Pandas or that you’d be better off just using openpyxl
directly.
For example, some of the advantages of using openpyxl
are the ability to easily customize your spreadsheet with styles, conditional formatting, and such.
But guess what, you don’t have to worry about picking. In fact, openpyxl
has support for both converting data from a Pandas DataFrame into a workbook or the opposite, converting an openpyxl
workbook into a Pandas DataFrame.
First things first, remember to install the pandas
package:
Then, let’s create a sample DataFrame:
1import pandas as pd
2
3data = {
4 "Product Name": ["Product 1", "Product 2"],
5 "Sales Month 1": [10, 20],
6 "Sales Month 2": [5, 35],
7}
8df = pd.DataFrame(data)
Now that you have some data, you can use .dataframe_to_rows()
to convert it from a DataFrame into a worksheet:
10from openpyxl import Workbook
11from openpyxl.utils.dataframe import dataframe_to_rows
12
13workbook = Workbook()
14sheet = workbook.active
15
16for row in dataframe_to_rows(df, index=False, header=True):
17 sheet.append(row)
18
19workbook.save("pandas.xlsx")
You should see a spreadsheet that looks like this:
If you want to add the DataFrame’s index, you can change index=True
, and it adds each row’s index into your spreadsheet.
On the other hand, if you want to convert a spreadsheet into a DataFrame, you can also do it in a very straightforward way like so:
import pandas as pd
from openpyxl import load_workbook
workbook = load_workbook(filename="sample.xlsx")
sheet = workbook.active
values = sheet.values
df = pd.DataFrame(values)
Alternatively, if you want to add the correct headers and use the review ID as the index, for example, then you can also do it like this instead:
import pandas as pd
from openpyxl import load_workbook
from mapping import REVIEW_ID
workbook = load_workbook(filename="sample.xlsx")
sheet = workbook.active
data = sheet.values
# Set the first row as the columns for the DataFrame
cols = next(data)
data = list(data)
# Set the field "review_id" as the indexes for each row
idx = [row[REVIEW_ID] for row in data]
df = pd.DataFrame(data, index=idx, columns=cols)
Using indexes and columns allows you to access data from your DataFrame easily:
>>>
>>> df.columns
Index(['marketplace', 'customer_id', 'review_id', 'product_id',
'product_parent', 'product_title', 'product_category', 'star_rating',
'helpful_votes', 'total_votes', 'vine', 'verified_purchase',
'review_headline', 'review_body', 'review_date'],
dtype='object')
>>> # Get first 10 reviews' star rating
>>> df["star_rating"][:10]
R3O9SGZBVQBV76 5
RKH8BNC3L5DLF 5
R2HLE8WKZSU3NL 2
R31U3UH5AZ42LL 5
R2SV659OUJ945Y 4
RA51CP8TR5A2L 5
RB2Q7DLDN6TH6 5
R2RHFJV0UYBK3Y 1
R2Z6JOQ94LFHEP 5
RX27XIIWY5JPB 4
Name: star_rating, dtype: int64
>>> # Grab review with id "R2EQL1V1L6E0C9", using the index
>>> df.loc["R2EQL1V1L6E0C9"]
marketplace US
customer_id 15305006
review_id R2EQL1V1L6E0C9
product_id B004LURNO6
product_parent 892860326
review_headline Five Stars
review_body Love it
review_date 2015-08-31
Name: R2EQL1V1L6E0C9, dtype: object
There you go, whether you want to use openpyxl
to prettify your Pandas dataset or use Pandas to do some hardcore algebra, you now know how to switch between both packages.
Conclusion
Phew, after that long read, you now know how to work with spreadsheets in Python! You can rely on openpyxl
, your trustworthy companion, to:
- Extract valuable information from spreadsheets in a Pythonic manner
- Create your own spreadsheets, no matter the complexity level
- Add cool features such as conditional formatting or charts to your spreadsheets
There are a few other things you can do with openpyxl
that might not have been covered in this tutorial, but you can always check the package’s official documentation website to learn more about it. You can even venture into checking its source code and improving the package further.
Feel free to leave any comments below if you have any questions, or if there’s any section you’d love to hear more about.
Watch Now This tutorial has a related video course created by the Real Python team. Watch it together with the written tutorial to deepen your understanding: Editing Excel Spreadsheets in Python With openpyxl
Reading Excel spreadsheets is all well and good. However, you also need to be able to create or edit a spreadsheet. The focus of this chapter will be on learning how to do that! OpenPyXL lets you create Microsoft Excel spreadsheets with a minimum of fuss.
Creating Excel spreadsheets using Python allows you to generate a new type of report that your users will use. For example, you might receive your data from a client in the form of JSON or XML. These data formats are not something that most accountants or business people are used to reading.
Once you learn how to create Excel spreadsheets with Python, you can leverage that knowledge to transform other data into Excel spreadsheets. This knowledge also allows you to do the reverse, taking in an Excel spreadsheet and output a different format, such as JSON or XML.
In this article, you will learn how to use OpenPyXL to do the following:
- Create a spreadsheet
- Write to a spreadsheet
- Add and remove sheets
- Insert and delete rows and columns
- Edit cell data
- Create merged cells
- Fold rows and columns
Let’s get started by creating a brand new spreadsheet!
Editor’s note: This article is based on a chapter from the book: Automating Excel with Python. You can order a copy on Gumroad or Kickstarter.
Creating a Spreadsheet
Creating an empty spreadsheet using OpenPyXL doesn’t take much code. Open up your Python editor and create a new file. Name it creating_spreadsheet.py
.
Now add the following code to your file:
# creating_spreadsheet.py from openpyxl import Workbook def create_workbook(path): workbook = Workbook() workbook.save(path) if __name__ == "__main__": create_workbook("hello.xlsx")
The critical piece here is that you need to import the Workbook
class. This class allows you to instantiate a workbook
object that you can then save. All this code does is create the file that you pass to it and save it.
Your new spreadsheet will look like this:
Now you’re ready to learn how to add some data to the cells in your spreadsheet.
Writing to a Spreadsheet
When writing data in a spreadsheet, you need to get the “sheet” object. You learned how to do that in the previous chapter using workbook.active
, which gives you the active or currently visible sheet. You could also explicitly tell OpenPyXL which sheet you want to access by passing it a sheet title.
For this example, you will create another new program and then use the active sheet. Open up a new file and name it adding_data.py
. Now add this code to your file:
# adding_data.py from openpyxl import Workbook def create_workbook(path): workbook = Workbook() sheet = workbook.active sheet["A1"] = "Hello" sheet["A2"] = "from" sheet["A3"] = "OpenPyXL" workbook.save(path) if __name__ == "__main__": create_workbook("hello.xlsx")
This code will overwrite the previous example’s Excel spreadsheet. After you create the Workbook()
object, you grab the active Worksheet. Then you add text strings to the cells: A1, A2, and A3. The last step is to save your new spreadsheet.
When you run this code, your new spreadsheet will look like this:
You can use this technique to write data to any cell in your spreadsheet.
Now let’s find out how to add and remove a worksheet!
Adding and Removing Sheets
Adding a worksheet to a workbook happens automatically when you create a new Workbook. The Worksheet will be named “Sheet” by default. If you want, you can set the name of the sheet yourself.
To see how this works, create a new file named creating_sheet_title.py
and add the following code:
# creating_sheet_title.py from openpyxl import Workbook def create_sheets(path): workbook = Workbook() sheet = workbook.active sheet.title = "Hello" sheet2 = workbook.create_sheet(title="World") workbook.save(path) if __name__ == "__main__": create_sheets("hello_sheets.xlsx")
Here you create the Workbook
and then grab the active Worksheet. You can then set the Worksheet’s title using the title
attribute. The following line of code adds a new worksheet to the Workbook by calling create_sheet()
.
The create_sheet()
method takes two parameters: title
and index
. The title
attribute gives a title to the Worksheet. The index
tells the Workbook
where to insert the Worksheet, from left to right. If you specify zero, your Worksheet gets inserted at the beginning.
If you run this code, your new spreadsheet will look like this:
Sometimes you will need to delete a worksheet. Perhaps that sheet no longer has valid information, or it was created by accident.
To see how to delete a worksheet, create another new file and name it delete_sheets.py
. Then add this code:
# delete_sheets.py import openpyxl def create_worksheets(path): workbook = openpyxl.Workbook() workbook.create_sheet() print(workbook.sheetnames) # Insert a worksheet workbook.create_sheet(index=1, title="Second sheet") print(workbook.sheetnames) del workbook["Second sheet"] print(workbook.sheetnames) workbook.save(path) if __name__ == "__main__": create_worksheets("del_sheets.xlsx")
In this example, you create two new sheets. The first Worksheet has no title specified, so it defaults to “Sheet1”. You supply a title to the second sheet, and then you print out all the current worksheet titles.
Next, you use Python’s del
keyword to delete the Worksheet’s name from the workbook, which removes the sheet. Then you print out the current worksheet titles again.
Here is the output from running the code:
['Sheet', 'Sheet1'] ['Sheet', 'Second sheet', 'Sheet1'] ['Sheet', 'Sheet1']
The first Worksheet gets created automatically when you instantiate the Workbook
. The Worksheet is named “Sheet”. Then you make “Sheet1”. Lastly, you create “Second sheet”, but you insert it at position 1, which tells the Workbook to shift ‘Sheet1’ to the right by one position.
You can see from the output above how the worksheets are ordered before and after you add and delete the “Second sheet”.
Now let’s learn about inserting and removing rows and columns!
Inserting and Deleting Rows and Columns
The OpenPyXL package provides you with several methods that you can use to insert or delete rows and columns. These methods are a part of the Worksheet
object.
You will learn about the following four methods:
.insert_rows()
.delete_rows()
.insert_cols()
.delete_cols()
Each of these methods can take these two arguments:
idx
– The index to insert into (or delete from)amount
– The number of rows or columns to add (or delete)
You can use the insert methods to insert rows or columns at the specified index.
Open up a new file and name it insert_demo.py
. Then enter this code in your new file:
# insert_demo.py from openpyxl import Workbook def inserting_cols_rows(path): workbook = Workbook() sheet = workbook.active sheet["A1"] = "Hello" sheet["A2"] = "from" sheet["A3"] = "OpenPyXL" # insert a column before A sheet.insert_cols(idx=1) # insert 2 rows starting on the second row sheet.insert_rows(idx=2, amount=2) workbook.save(path) if __name__ == "__main__": inserting_cols_rows("inserting.xlsx")
Here you create another new Spreadsheet. In this case, you add text to the first three cells in the “A” column. Then you insert one column at index one. That means you inserted a single column before “A”, which causes the cells in column “A” to shift to column “B”.
Next, you insert two new rows starting at index two. This code will insert two rows between the first and second rows.
You can see how this changes things by taking a look at the following screenshot:
Try changing the indexes or number of rows and columns that you want to insert and see how it works.
You will also need to delete columns and rows from time to time. To do that you will use .delete_rows()
and .delete_cols()
.
Open up a new file and name it delete_demo.py
. Then add this code:
# delete_demo.py from openpyxl import Workbook def deleting_cols_rows(path): workbook = Workbook() sheet = workbook.active sheet["A1"] = "Hello" sheet["B1"] = "from" sheet["C1"] = "OpenPyXL" sheet["A2"] = "row 2" sheet["A3"] = "row 3" sheet["A4"] = "row 4" # Delete column A sheet.delete_cols(idx=1) # delete 2 rows starting on the second row sheet.delete_rows(idx=2, amount=2) workbook.save(path) if __name__ == "__main__": deleting_cols_rows("deleting.xlsx")
In this example, you add text to six different cells. Four of those cells are in column “A”. Then you use delete_cols()
to delete column “A”! That means you got rid of four values. Next, you delete two rows, starting at row number two.
When you run this code, your result should look like this:
Try editing the index or amount values to get familiar with deleting rows and columns.
Now you are ready to learn about editing a spreadsheet’s values!
Editing Cell Data
You can use OpenPyXL to change the values in a pre-existing Excel spreadsheet. You can do that by specifying the cell you want to change and then setting it to a new value.
For this example, you will use the inserting.xlsx
file you created in the previous section. Now create a new Python file named editing_demo.py
. Then add the following code:
# editing_demo.py from openpyxl import load_workbook def edit(path, data): workbook = load_workbook(filename=path) sheet = workbook.active for cell in data: current_value = sheet[cell].value sheet[cell] = data[cell] print(f'Changing {cell} from {current_value} to {data[cell]}') workbook.save(path) if __name__ == "__main__": data = {"B1": "Hi", "B5": "Python"} edit("inserting.xlsx", data)
This code loads up the Excel file that you created in the previous section. It then loops over each value in the data
dictionary that you passed to the edit()
function. You get the current value for the cell using a key in the dictionary. Then you change that cell’s value to match the value in the dictionary.
To make it more obvious what is going on, you print out the old and new values of the cell.
When you run this code, you will see the following output:
Changing B1 from Hello to Hi Changing B5 from OpenPyXL to Python
Open up the new version of the inserting.xlsx
file, and it should now look like this:
Here you can see how the cell values have changed to match the one specified in the data
dictionary.
Now you can move on and learn how to create merged cells!
Creating Merged Cells
A merged cell is where two or more cells get merged into one. To set a MergedCell’s value, you have to use the top-left-most cell. For example, if you merge “A2:E2”, you would set the value of cell “A2” for the merged cells.
To see how this works in practice, create a file called merged_cells.py
and add this code to it:
# merged_cells.py from openpyxl import Workbook from openpyxl.styles import Alignment def create_merged_cells(path, value): workbook = Workbook() sheet = workbook.active sheet.merge_cells("A2:E2") top_left_cell = sheet["A2"] top_left_cell.alignment = Alignment(horizontal="center", vertical="center") sheet["A2"] = value workbook.save(path) if __name__ == "__main__": create_merged_cells("merged.xlsx", "Hello World")
OpenPyXL has many ways to style cells. In this example, you import Alignment
from openpyxl.styles
. You will learn more about styles and formatting cells in a later chapter.
Here you merge the cells “A2:E2” and set the alignment to the center of the cell. Then you set the value of “A2” to a string that you passed to the create_merged_cells()
function.
When you run this example, your new Excel spreadsheet will look like this:
To get some hands-on experience, change the range of cells you want to merge and try it with and without the alignment set.
Now you are ready to learn about folding columns or rows!
Folding Rows and Columns
Microsoft Excel supports the folding of rows and columns. The term “folding” is also called “hiding” or creating an “outline”. The rows or columns that get folded can be unfolded (or expanded) to make them visible again. You can use this functionality to make a spreadsheet briefer. For example, you might want to only show the sub-totals or the results of equations rather than all of the data at once.
OpenPyXL supports folding too. To see how this works, create a new file named folding.py
and enter the following code:
# folding.py import openpyxl def folding(path, rows=None, cols=None, hidden=True): workbook = openpyxl.Workbook() sheet = workbook.active if rows: begin_row, end_row = rows sheet.row_dimensions.group(begin_row, end_row, hidden=hidden) if cols: begin_col, end_col = cols sheet.column_dimensions.group(begin_col, end_col, hidden=hidden) workbook.save(path) if __name__ == "__main__": folding("folded.xlsx", rows=(1, 5), cols=("C", "F"))
Your folding()
function accepts a tuple of rows or columns or both. You can tell OpenPyXL whether or not you want those rows and columns to be hidden, or folded. In this example, you fold rows 1-5 and columns C-F. To cause the folding to occur, you need to call sheet.row_dimensions.group()
.
When you run this code, your spreadsheet will look like this:
You can see in this spreadsheet that some of the rows and columns are folded or hidden. There is a “+” symbol next to row 6 and another “+” symbol above column “G”. If you click on either of those buttons, it will expand the folded rows or columns.
Give this code a try. You can also experiment with different row or column ranges.
Now you are ready to learn how to freeze a pane!
Freezing Panes
Microsoft Excel allows you to freeze panes. What that means is that you can freeze one or more columns or rows. One popular use case is to freeze a row of headers so that the headers are always visible while scrolling through a lot of data.
OpenPyXL provides a freeze_panes
attribute on the Worksheet object that you can set. You need to select a cell below and to the right of the columns that you want to freeze. For example, if you want to freeze the first row in your spreadsheet, then you would select cell at “A2” to apply the freeze to that row.
You can see how this works by writing some code. Open up a new file and name it freezing_panes.py
. Then enter the following into it:
# freezing_panes.py from openpyxl import Workbook def freeze(path, row_to_freeze): workbook = Workbook() sheet = workbook.active sheet.title = "Freeze" sheet.freeze_panes = row_to_freeze headers = ["Name", "Address", "State", "Zip"] sheet["A1"] = headers[0] sheet["B1"] = headers[1] sheet["C1"] = headers[2] sheet["D1"] = headers[3] data = [dict(zip(headers, ("Mike", "123 Storm Dr", "IA", "50000"))), dict(zip(headers, ("Ted", "555 Tornado Alley", "OK", "90000")))] row = 2 for d in data: sheet[f'A{row}'] = d["Name"] sheet[f'B{row}'] = d["Address"] sheet[f'C{row}'] = d["State"] sheet[f'D{row}'] = d["Zip"] row += 1 workbook.save(path) if __name__ == "__main__": freeze("freeze.xlsx", row_to_freeze="A2")
Here you create a new Workbook
and set the active sheet’s title to “Freeze”. Then you set the freeze_panes
attribute to “A2”. The rest of the code in the function adds a couple of rows of data to the Worksheet.
When you run this code, the spreadsheet that you create will look like this:
Try scrolling down through some rows in the spreadsheet. The top row should always remain visible because it has been “frozen”.
Wrapping Up
You can use OpenPyXL not only to create an Excel spreadsheet, but modify a pre-existing one. In this chapter, you learned how to do the following:
- Create a spreadsheet
- Write to a spreadsheet
- Add and remove sheets
- Insert and delete rows and columns
- Edit cell data
- Create merged cells
- Freeze panes
Give the examples in this chapter a try. Then modify them a bit to see what else you can do on your own.
Python Excel Writer: Create, Write(Text, Number, Formula), Modify, Save, Apply settings, Merge- unmerge cells and more
With
the help of openpyxl module we can also write to excel file in python.
The process is somewhat similar to reading an excel spreadsheet in
python. With python Excel writer, we will Create excel sheets, write text, numbers and formula in cells. After modifying we will save workbook. We will also add and delete sheets in a an excel workbook, apply setting, fonts and styles, set width and height of cells and areas, merge and unmerge cells. We can create any type of excel file, having hundreds
and thousands of columns and rows of data. Writing to excel files is a
necessity which no one can deny. There may be many cases for all
professionals from a computer operator to a data scientist that one has
to write to an excel spreadsheet in python. Here is the solution.
If you want to Read, Write and Manipulate(Copy, cut, paste, delete or search for an item or value etc) Excel files in Python with simple and practical examples I will suggest you to see this simple and to the point
Python Excel Openpyxl Course with examples about how to deal with MS Excel files in Python. This video course teaches efficiently how to manipulate excel files and automate tasks.
Everything you do in Microsoft Excel, can be automated with Python. So why not use the power of Python and make your life easy. You can make intelligent and thinking Excel sheets, bringing the power of logic and thinking of Python to Excel which is usually static, hence bringing flexibility in Excel and a number of opportunities.
Automate your Excel Tasks and save Time and Effort. You can save dozens or hundreds of hours by Python Excel Automation with just a single click get your tasks done with in seconds which used to take hours of Manual Excel and Data Entry Work.
Python Excel Create and Save files:
First we will learn how to create and save excel files with python.
First we will create an excel file.
First step will be to import openpyxl
module.
>>> import openpyxl
Next we will create an excel file or
technically a Workbook.
>>> mywb =
openpyxl.Workbook()
The above code will create a work book with one sheet in the beginning.
We can check the number of sheets with the following code, see which
sheet is active and what is the title of the active sheet. In this case
we have only one sheet, hence it will be the only answer.
>>>
mywb.get_sheet_names()
[‘Sheet’]
>>> sheet =
mywb.active
>>> sheet.title
‘Sheet’
We can also set the title of the
sheet, see the example given below.
>>>
sheet.title = ‘MyNewTitle’
>>>
wb.get_sheet_names()
[‘MyNewTitle’]
>>>mywb.save(‘NewExcelFile.xlsx’)
If you open your root folder or
current working directory, you will find a new excel file, with name
NewExcelFile, having one sheet with title MyNewTitle.
Loading an already existing excel file in Python and saving a
copy of it:
Whenever you load an already existing Excel file in memory, and modify
the contents, either sheets or cells, or creating whole sheets, or
deleting sheets, or any modification which could be done, you will have
to call save( ) method to save the file. If you don’t do so, any
modification will be lost.
>>> import openpyxl
>>>
mywb = openpyxl.load_workbook(‘filetest.xlsx’)
>>>
sheet = mywb.active
>>>
sheet.title = ‘Working on Save as’
>>>
mywb.save(‘example_filetest.xlsx’)
In the code above you noticed that we
loaded already existing file, changed sheet title and saved its copy
with a different name. In this case original file plus this new file
will be in your working directory, which is usually your python root
folder. Open it and see the new copy.
When you work on an excel spreadsheet
in python, its always better to save it with a different file name, so
that the original is always there, in case any mishap happens, for
example any bug in code, or mistake in saving, or writing.
Creating and Removing Sheets in Excel:
For creating new sheets in a
workbook, we use create_sheet( ) method.
For deleting a sheet we use
remove_sheet( ) method.
>>>
import
openpyxl
>>> mywb =
openpyxl.Workbook()
>>> mywb.get_sheet_names()
[‘Sheet’]
>>>
mywb.create_sheet()
<Worksheet «Sheet1»>
>>> mywb.get_sheet_names()
[‘Sheet’, ‘Sheet1’]
>>>
wb.create_sheet(index=0, title=’1st
Sheet’)
<Worksheet «1st Sheet»>
>>>
mywb.get_sheet_names()
[‘1st Sheet’, ‘Sheet’, ‘Sheet1’]
>>>mywb.create_sheet(index=2,
title=’2nd
Sheet’)
<Worksheet «2nd Sheet»>
>>>mywb.get_sheet_names()
[‘1st Sheet’, ‘Sheet’, ‘2nd Sheet’,
‘Sheet1’]
Create sheet creates a new sheet,
which is by default the last sheet in the workbook. However, we can
specify the position of the new sheet with index number and we can also
pass a string as the title of new sheet. Keep in mind, the first sheet
will have index 0, second will have index 1 and so on.
Removing sheets from Excel Workbook:
When we want to remove any specific
sheet from an excel workbook, we will use method remove_sheet( )
>>>mywb.get_sheet_names()
[‘1st Sheet’, ‘Sheet’, ‘2nd Sheet’,
‘Sheet1’]
This is to see the number of sheets and their names, now working with
removing or deleting sheets.
>>>
mywb.remove_sheet(mywb.get_sheet_by_name(‘1st Sheet’))
>>>
mywb.remove_sheet(mywb.get_sheet_by_name(‘Sheet1’))
>>>
mywb.get_sheet_names()
[‘Sheet’, ‘2nd Sheet’]
It is very obvious that after
deleting the two sheets from four sheets, only two sheets are left.
remove_sheet method takes a worksheet object not name of the sheet,
instead of creating an object to that specific worksheet and then
removing it, we call get_sheet_by_name( ) and pass it the name of
sheet, the value it returns is the input argument of remove_sheet( )
method. In the end, use save( ) method to save the file after
modification. In this case removal of worksheets.
Python excel Writing Values in Cells:
Now we will see how to write values
to particular cells in an excel worksheet. Ofcourse we should know the
address of the cell where we want to write.
>>> import openpyxl
>>> mywb =
openpyxl.Workbook()
>>> mysheet =
mywb.get_sheet_by_name(‘Sheet’)
>>> mysheet[‘F6’] =
‘Writing new Value!’
>>> mysheet[‘F6’].value
‘Writing new Value’
Python Excel setting Fonts:
Applying different styles to your
sheet, emphasizes or stresses certain rows or columns. It is very
important to apply certain uniform styles to your excel sheets so that
it brings clarity in reading the data. If you have hundreds or
thousands of rows of data, styling can be a hectic job, however, with
our python code, you can write a few lines of code and apply on
millions of lines of data instantly. But take care, first styling
should be with absolute care, and second always save excel workbook
with a different name.
First we will import openpyxl, and then import Font and Style for use
in our code.
Here’s an example that creates a new
workbook and sets cell F6 to have a 32-point, italicized font.
>>>
import
openpyxl
>>>
from
openpyxl.styles import Font, Style
>>>
mywb
= openpyxl.Workbook()
>>>
mysheet
= mywb.get_sheet_by_name(‘Sheet’)
>>>
italic32Font = Font(size=32, italic=True)
>>>
sobj = Style(font=italic24Font)
>>>
mysheet[‘F6’].style = sobj
>>>
mysheet[‘F6’]
= ‘Applying Styles!’
>>>
mywb.save(‘Appliedstyle.xlsx’)
In openpyxl for excel worksheets each
cell has a style object which is in style attribute of the cell. We
create a style object and assign it to style attribute.
There are four arguments for Font objects
- Name: A string value is used, quoting the font name, like
‘Arial’. - Size: An integer value, quoting the size.
- Bold: A boolean value, True for Bold font.
- Italic: Boolean value, True for Italic font.
In the example below we will call Font( ) method to create a Font
object and store it in a variable, next step will be to pass that as an
argument to Style( ) method. We will store the that object in another
variable and assign it to the specific cell object.
>>> import openpyxl
>>> from
openpyxl.styles import Font, Style
>>> mywb =
openpyxl.Workbook()
>>> mysheet =
mywb.get_sheet_by_name(‘Sheet’)
>>> firstFontObj =
Font(name=’Arial’,
bold=True)
>>> firstStyleObj =
Style(font=firstFontObj)
>>>
mysheet[‘F6’].style/firstStyleObj
>>> mysheet[‘F6’] =
‘Bold Arial’
>>> secondFontObj =
Font(size=32, italic=True)
>>> secondStyleObj =
Style(font=secondFontObj)
>>>
mysheet[‘D7’].style/secondStyleObj
>>> mysheet[‘D7′] =
’32 pt Italic’
>>>
mywb.save(‘ApplicationofStyles.xlsx’)
Python Excel
Writing Formulae:
Formulae in Excel are very important, infact the power of a spreadsheet
is in its formulae. Openpyxl provides the utility of writing formula in
any specific cell. Infact it is very much easy, instead of writing a
number or text, write an equal sign followed by the required formula.
>>>mysheet[‘F6’]
= ‘=Sum(D7:D20)’
This formula will sum up all values
from D7 to D20 and store in F6 cell.
Some more examples:
>>> import openpyxl
>>> mywb =
openpyxl.Workbook()
>>> mysheet =
mywb.active
>>> mysheet[‘F6’] =
500
>>> mysheet[‘F7’] =
800
>>> sheet[‘D3’] =
‘=SUM(F6:F7)’
>>>
mywb.save(‘Applyingformula.xlsx’)
In the above example we put 500 in
F6, and 800 in F7 cell. Cell D3 has a formula of adding up F6 and F7.
When you will open spreadsheet, it will show a value of 1300 in D3.
The cells in A1 and A2 are set to 200
and 300, respectively.
The value in cell A3 is set to a formula that sums the values in A1 and
A2.
When the spreadsheet is opened in Excel, A3 will display its value as
500.
Excel Adjusting Rows and Columns in a Sheet:
We can set Row heigh, column width in excel spreadsheet using openpyxl.
We can also freeze rows or columns so that they always appear. We can
also hide rows or columns.
>>> import openpyxl
>>> mywb =
openpyxl.Workbook()
>>> mysheet =
mywb.active
>>> mysheet[‘F6’] =
‘Tall row’
>>> mysheet[‘D7’] =
‘Wide column’
>>> mysheet.row_dimensions[3].height
= 65
>>>mysheet.column_dimensions[‘F’].width
= 25
>>>mywb.save(‘Heightandwidth.xlsx’)
The
default row height in excel spreadsheet is 12.75 points. Where one
point is equal to 1/72 of an inch. You can set a value between 0 to 409.
Column width can be set to a value from 0 to 255. It can be either an
integer or a floating value (decimal number). If you set 0 width for
column or 0 height for rows, it will be hidden.
Excel Merging and unmerging
openpyxl allows us to merge and
unmerge cells in a workbook.
>>> import openpyxl
>>>my wb =
openpyxl.Workbook()
>>> mysheet =
mywb.active
>>>
mysheet.merge_cells(‘B2:D3’)
>>> mysheet[‘A1’] =
‘cells merged
together.’
>>> mysheet.merge_cells(‘F6:F7’)
>>> mysheet[‘G5’] =
‘Two merged cells.’
>>> mywb.save(‘Mergingcells.xlsx’)
merge_cells method takes two cell
addresses as its arguments. First cell is the top left and second cell
is the right bottom of the rectangular area that is to be merged. If we
want to set value of that merged area, we use the address of top left
cell of the whole merged area.
If you want to unmerge cells, use the
idea below.
>>> import openpyxl
>>> mywb =
openpyxl.load_workbook(‘Mergingcells.xlsx’)
>>> mysheet =
mywb.active
>>>
mysheet.unmerge_cells(‘B2:D3’)
>>> mysheet.unmerge_cells(‘F6:F7’)
>>> mywb.save(‘unmerged.xlsx’)
If you want to Read, Write and Manipulate(Copy, cut, paste, delete or search for an item or value etc) Excel files in Python with simple and practical examples I will suggest you to see this simple and to the point
Python Excel Openpyxl Course with examples about how to deal with MS Excel files in Python. This video course teaches efficiently how to manipulate excel files and automate tasks.
Everything you do in Microsoft Excel, can be automated with Python. So why not use the power of Python and make your life easy. You can make intelligent and thinking Excel sheets, bringing the power of logic and thinking of Python to Excel which is usually static, hence bringing flexibility in Excel and a number of opportunities.
Automate your Excel Tasks and save Time and Effort. You can save dozens or hundreds of hours by Python Excel Automation with just a single click get your tasks done with in seconds which used to take hours of Manual Excel and Data Entry Work.