Improve Article
Save Article
Like Article
Improve Article
Save Article
Like Article
Python is a high-level, general-purpose, and very popular programming language. Python programming language (latest Python 3) is being used in web development, Machine Learning applications, along with all cutting-edge technology in Software Industry.
In this article, we will learn how to convert an Excel File to PDF File Using Python
Here we will use the win32com module for file conversion.
What is win32com?
Python extensions for Microsoft Windows Provide access to much of the Win32 API, the ability to create and use COM objects, and the Pythonwin environment.
pip install pywin32
Approach:
- First, we will create a COM Object Using Dispatch() method.
- Then we will read the Excel file pass “Excel.Application” inside the Dispatch method
- Pass Excel file path
- Then we will convert it into PDF using the ExportAsFixedFormat() method
ExportAsFixedFormat():- The ExportAsFixedFormat method is used to publish a workbook in either PDF or XPS format.
Syntax:
ExportAsFixedFormat (Type, FileName, Quality, IncludeDocProperties, IgnorePrintAreas, From, To,
OpenAfterPublish, FixedFormatExtClassPtr)
Excel File click here
Input:
EXCEL FILE
Below is the Implementation:
Python3
from
win32com
import
client
excel
=
client.Dispatch(
"Excel.Application"
)
sheets
=
excel.Workbooks.
Open
(
'Excel File Path'
)
work_sheets
=
sheets.Worksheets[
0
]
work_sheets.ExportAsFixedFormat(
0
,
'PDF File Path'
)
Output:
PDF FILE
Like Article
Save Article
In this post, you’ll learn how to convert Excel files to PDFs in your Python application using PSPDFKit’s XLSX to PDF Python API. With our API, you can convert up to 100 PDF files per month for free. All you need to do is create a free account to get access to your API key.
PSPDFKit API
Document conversion is just one of our 30+ PDF API tools. You can combine our conversion tool with other tools to create complex document processing workflows. You’ll be able to convert various file formats into PDFs and then:
-
Merge several resulting PDFs into one
-
OCR, watermark, or flatten PDFs
-
Remove or duplicate specific PDF pages
Once you create your account, you’ll be able to access all our PDF API tools.
Step 1 — Creating a Free Account on PSPDFKit
Go to our website, where you’ll see the page below, prompting you to create your free account.
Once you’ve created your account, you’ll be welcomed by the page below, which shows an overview of your plan details.
As you can see in the bottom-left corner, you’ll start with 100 documents to process, and you’ll be able to access all our PDF API tools.
Step 2 — Obtaining the API Key
After you’ve verified your email, you can get your API key from the dashboard. In the menu on the left, click API Keys. You’ll see the following page, which is an overview of your keys:
Copy the Live API Key, because you’ll need this for the Excel to PDF API.
Step 3 — Setting Up Folders and Files
Now, create a folder called excel_to_pdf
and open it in a code editor. For this tutorial, you’ll use VS Code as your primary code editor. Next, create two folders inside excel_to_pdf
and name them input_documents
and processed_documents
.
Now, copy your Excel file to the input_documents
folder and rename it to document.xlsx
. You can use our demo document as an example.
Then, in the root folder, excel_to_pdf
, create a file called processor.py
. This is the file where you’ll keep your code.
Your folder structure will look like this:
excel_to_pdf ├── input_documents | └── document.xlsx ├── processed_documents └── processor.py
Step 4 — Writing the Code
Open the processor.py
file and paste the code below into it:
import requests import json instructions = { 'parts': [ { 'file': 'document' } ] } response = requests.request( 'POST', 'https://api.pspdfkit.com/build', headers = { 'Authorization': 'Bearer YOUR API KEY HERE' }, files = { 'document': open('input_documents/document.xlsx', 'rb') }, data = { 'instructions': json.dumps(instructions) }, stream = True ) if response.ok: with open('processed_documents/result.pdf', 'wb') as fd: for chunk in response.iter_content(chunk_size=8096): fd.write(chunk) else: print(response.text) exit()
ℹ️ Note: Make sure to replace
YOUR_API_KEY_HERE
with your API key.
Code Explanation
In the code above, you first import the requests
and json
dependencies. After that, you create the instructions for the API call.
You then use the requests
module to make the API call, and once it succeeds, you store the result in the processed_documents
folder.
Output
To execute the code, use the command below:
Once the code has been executed, you’ll see a new processed file under the processed_documents
folder called result.pdf
.
The folder structure will look like this:
excel_to_pdf ├── input_documents | └── document.xlsx ├── processed_documents | └── result.pdf └── processor.py
Final Words
In this post, you learned how to easily and seamlessly convert Excel files to PDF documents for your Python application using our Excel to PDF Python API.
You can integrate these functions into your existing applications. With the same API token, you can also perform other operations, such as merging several documents into a single PDF, adding watermarks, and more. To get started with a free trial, sign up here.
With the help of this .doc to pdf using python
Link I am trying for excel (.xlsx and xls formats)
Following is modified Code for Excel:
import os
from win32com import client
folder = "C:\Oprance\Excel\XlsxWriter-0.5.1"
file_type = 'xlsx'
out_folder = folder + "\PDF_excel"
os.chdir(folder)
if not os.path.exists(out_folder):
print 'Creating output folder...'
os.makedirs(out_folder)
print out_folder, 'created.'
else:
print out_folder, 'already exists.n'
for files in os.listdir("."):
if files.endswith(".xlsx"):
print files
print 'nn'
word = client.DispatchEx("Excel.Application")
for files in os.listdir("."):
if files.endswith(".xlsx") or files.endswith('xls'):
out_name = files.replace(file_type, r"pdf")
in_file = os.path.abspath(folder + "\" + files)
out_file = os.path.abspath(out_folder + "\" + out_name)
doc = word.Workbooks.Open(in_file)
print 'Exporting', out_file
doc.SaveAs(out_file, FileFormat=56)
doc.Close()
It is showing following error :
>>> execfile('excel_to_pdf.py')
Creating output folder...
C:ExcelXlsxWriter-0.5.1PDF_excel created.
apms_trial.xlsx
~$apms_trial.xlsx
Exporting C:ExcelXlsxWriter-0.5.1PDF_excelapms_trial.pdf
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "excel_to_pdf.py", line 30, in <module>
doc = word.Workbooks.Open(in_file)
File "<COMObject <unknown>>", line 8, in Open
pywintypes.com_error: (-2147352567, 'Exception occurred.', (0, u'Microsoft Excel
', u"Excel cannot open the file '~$apms_trial.xlsx' because the file format or f
ile extension is not valid. Verify that the file has not been corrupted and that
the file extension matches the format of the file.", u'xlmain11.chm', 0, -21468
27284), None)
>>>
There is problem in
doc.SaveAs(out_file, FileFormat=56)
What should be FileFormat file format?
Please Help
In this tutorial, we will learn to convert Excel Files to PDF in python. We will learn a bit about PyWin32 library.
PyWin32 is a Python library for Microsoft Windows that enables the Win32 application programming interface (API) features on Python. To install the PyWin32 library, open command prompt/ Terminal and write
pip install pywin32
Approach
Let’s see a fundamental approach to convert Excel files (.xlsx or .xls format) to PDF format.
- Create a COM object using Dispatch() function.
- Read the Excel file pass “Excel.Application” inside the Dispatch function.
- Write Excel file path.
- Convert Excel into PDF format using ExportAsFixedFormat() and pass the path to which you want to save the pdf file.
Import Client submodule from win32com library
# Import Library from win32com import client
client.Dispatch() function will help to extract the excel application from windows system. Workbook methods will help to read the excel file.
# Opening Microsoft Excel excel = client.Dispatch("Excel.Application") # Read Excel File sheets = excel.Workbooks.Open('Enter Excel file path') work_sheets = sheets.Worksheets[0]
Converting Excel file to PDF file. ExportAsFixedFormat() method is used to convert a workbook into either PDF format.
# Converting into PDF File work_sheets.ExportAsFixedFormat(0, 'Enter PDF file path')
Combining all code
# Import Library from win32com import client # Opening Microsoft Excel excel = client.Dispatch("Excel.Application") # Read Excel File sheets = excel.Workbooks.Open('Enter Excel file path') work_sheets = sheets.Worksheets[0] # Converting into PDF File work_sheets.ExportAsFixedFormat(0, 'Enter PDF file path')
We have successfully converted Excel file to PDF file format using Python.