I am trying to read a macro-enabled Excel worksheet using pandas.read_excel
with the xlrd library. It’s running fine in local, but when I try to push the same into PCF, I am getting this error:
2020-12-11T21:09:53.441+05:30 [APP/PROC/WEB/0] [ERR] df1=pd.read_excel(os.path.join(APP_PATH, os.path.join("Data", "aug_latest.xlsm")),sheet_name=None)
2020-12-11T21:09:53.441+05:30 [APP/PROC/WEB/0] [ERR] return open_workbook(filepath_or_buffer)
2020-12-11T21:09:53.441+05:30 [APP/PROC/WEB/0] [ERR] File "/home/vcap/deps/0/python/lib/python3.8/site-packages/xlrd/__init__.py", line 170, in open_workbook
2020-12-11T21:09:53.441+05:30 [APP/PROC/WEB/0] [ERR] raise XLRDError(FILE_FORMAT_DESCRIPTIONS[file_format]+'; not supported')
2020-12-11T21:09:53.441+05:30 [APP/PROC/WEB/0] [ERR] xlrd.biffh.XLRDError: Excel xlsx file; not supported
How can I resolve this error?
asked Dec 11, 2020 at 15:53
1
As noted in the release email, linked to from the release tweet and noted in large orange warning that appears on the front page of the documentation, and less orange, but still present, in the readme on the repository and the release on pypi:
xlrd has explicitly removed support for anything other than xls files.
In your case, the solution is to:
- make sure you are on a recent version of Pandas, at least 1.0.1,
and preferably the latest release. 1.2 will make his
even clearer. - install
openpyxl
: https://openpyxl.readthedocs.io/en/stable/ - change your Pandas code to be:
df1 = pd.read_excel( os.path.join(APP_PATH, "Data", "aug_latest.xlsm"), engine='openpyxl', )
answered Dec 12, 2020 at 14:49
Chris WithersChris Withers
10.7k4 gold badges33 silver badges50 bronze badges
10
The previous version, xlrd 1.2.0, may appear to work, but it could also expose you to potential security vulnerabilities. With that warning out of the way, if you still want to give it a go, type the following command:
pip install xlrd==1.2.0
answered Dec 11, 2020 at 16:47
tryhardtryhard
2,7171 gold badge3 silver badges3 bronze badges
7
Problem
You are have xlrd installed on your cluster and are attempting to read files in the Excel .xlsx format when you get an error.
XLRDError: Excel xlsx file; not supported
Cause
xlrd 2.0.0 and above can only read .xls files.
Support for .xlsx files was removed from xlrd due to a potential security vulnerability.
Solution
Use openpyxl to open .xlsx files instead of xlrd.
- Install the openpyxl library on your cluster (AWS | Azure | GCP).
- Confirm that you are using pandasversion 1.0.1 or above.
%python import pandas as pd print(pd.__version__)
- Specify openpyxl when reading .xlsx files with pandas.
%python import pandas df = pandas.read_excel(`<name-of-file>.xlsx`, engine=`openpyxl`)
Refer to the openpyxl documentation for more information.
Problem:
When trying to read an .xlsx
file using pandas pd.read_excel()
you see this error message:
XLRDError: Excel xlsx file; not supported
Solution:
The xlrd library only supports .xls
files, not .xlsx
files. In order to make pandas able to read .xlsx
files, install openpyxl
:
sudo pip3 install openpyxl
After that, retry running your script (if you are running a Jupyter Notebook, be sure to restart the notebook to reload pandas!).
If the error still persists, you have two choices:
Choice 1 (preferred): Update pandas
Pandas 1.1.3 doesn’t automatically select the correct XLSX reader engine, but pandas 1.3.1 does:
sudo pip3 install --upgrade pandas
If you are running a Jupyter Notebook, be sure to restart the notebook to load the updated pandas version!
Choice 2: Explicitly set the engine in pd.read_excel()
Add engine='openpyxl'
to your pd.read_excel()
command, for example:
pd.read_excel('my.xlsx', engine='openpyxl')
Pandas provide methods to read different file formats using a single line of code. When reading an xlsx file, rd.biffh.XLRDError: Excel xlsx file; not supported error might occur.
You can solve xlrd.biffh.XLRDError: Excel xlsx file; not supported Error by updating the Pandas library version to the latest version.
Now let us see the reason for this error and how to solve it.
Reason For the Error
The Pandas cannot open an Excel (.xlsx) file when you use the read_excel()
method available in the Pandas library version earlier than V1.2.0.
Because the versions older than 1.2.0 internally uses the xlrd library to read the excel files. The xlrd
library supports ONLY reading the .xls
files. It doesn’t support reading the .xlsx
or .xlsm
files any longer.
Hence, you need to use a different approach to solve this problem.
Solution 1 — Upgrading Pandas To the Latest Version
You can solve the xlrd.biffh.XLRDError: Excel xlsx file; not supported
error by upgrading the Pandas version to the latest version.
By default, the latest version uses the openpyxl library. This library supports reading the .xlsx
file and .xlsm
files.
To know the current Pandas library version, use the below snippet.
Code
pd.__version__
Output
'1.1.2'
You can upgrade the Pandas library to the latest version using the below statement.
Code
%pip install --upgrade pandas
Prefixing the %
in PIP lets you update the packages directly from Jupyter.
Output
Successfully installed pandas-1.4.2
Note: you may need to restart the kernel to use updated packages.
The Pandas library is upgraded to the latest version, and also the dependent libraries are updated.
With the latest library, you can use the read_excel() method directly to read the excel files.
Use the code below to read the xlsx file or xlsm file.
Code
import pandas as pd
df = pd.read_excel(
"sample.xlsx”
)
The file will be read, and the dataframe will be populated. You can print the dataframe to see the values in the excel file.
Code
df
Dataframe Will Look Like
Column 0 | Column 1 | Column 2 | |
---|---|---|---|
0 | NaN | NaN | NaN |
1 | NaN | NaN | NaN |
2 | NaN | NaN | NaN |
3 | NaN | NaN | NaN |
This is how you need to solve the xlrd.biffh.XLRDError: Excel xlsx file; not supported Error.
Solution 2 — Using Openpyxl Engine
You can use the openpyxl engine to read the xlsx file.
This is an alternate way to solve the 8xlrd.biffh.XLRDError: Excel xlsx file; not supported* Error.
If you do not want to upgrade the Pandas library to the latest version, you shall use this solution. However, it is always recommended to use the latest library versions to avoid security threats to your application.
Install the openpyxl
library using the below snippet.
Code
%pip install openpyxl
Prefixing the %
in PIP lets you install the packages directly from Jupyter.
Output
Successfully installed openpyxl v3.0.9
Note: you may need to restart the kernel to use updated packages.
Now, specify the openpyxl
engine in the pandas read_excel()
method to read the xlsx file or an xlsm file.
The code below demonstrates how to specify the openpyxl
engine in the read_excel()
method.
Code
df = pd.read_excel(
"sample.xlsm",
engine='openpyxl')
The file will be read successfully, and you can print the dataframe to see the values.
df
Conclusion
Pandas cannot open an Excel (.xlsx) file using the read_excel()
method when you’re using the Pandas version earlier than V1.2.0.
You’ve learned how to solve the xlrd.biffh.XLRDError: Excel xlsx file; not supported error by updating the pandas library or using the alternate engine openpyxl
.
It is also recommended to update the libraries to the latest versions to avoid any security threats to your application.
You May Also Like
- How To Solve Truth Value Of A Series Is Ambiguous. Use A.Empty, A.Bool(), A.Item(), A.Any() Or A.All()
- How To Solve Python Pandas Error Tokenizing Data Error?
- How to write pandas dataframe to CSV
@keiv-fly you would have to open with a reproducible example and version info
A reproducible example with the latest stable pandas and xlrd:
import pandas as pd
pd.read_excel("file.xlsx")
file.xlsx is a new completely empty excel sheet.
Error:
---------------------------------------------------------------------------
XLRDError Traceback (most recent call last)
<ipython-input-1-4c247bba163b> in <module>
1 import pandas as pd
----> 2 pd.read_excel("file.xlsx")
~anaconda3envspy38libsite-packagespandasutil_decorators.py in wrapper(*args, **kwargs)
294 )
295 warnings.warn(msg, FutureWarning, stacklevel=stacklevel)
--> 296 return func(*args, **kwargs)
297
298 return wrapper
~anaconda3envspy38libsite-packagespandasioexcel_base.py in read_excel(io, sheet_name, header, names, index_col, usecols, squeeze, dtype, engine, converters, true_values, false_values, skiprows, nrows, na_values, keep_default_na, na_filter, verbose, parse_dates, date_parser, thousands, comment, skipfooter, convert_float, mangle_dupe_cols)
302
303 if not isinstance(io, ExcelFile):
--> 304 io = ExcelFile(io, engine=engine)
305 elif engine and engine != io.engine:
306 raise ValueError(
~anaconda3envspy38libsite-packagespandasioexcel_base.py in __init__(self, path_or_buffer, engine)
865 self._io = stringify_path(path_or_buffer)
866
--> 867 self._reader = self._engines[engine](self._io)
868
869 def __fspath__(self):
~anaconda3envspy38libsite-packagespandasioexcel_xlrd.py in __init__(self, filepath_or_buffer)
20 err_msg = "Install xlrd >= 1.0.0 for Excel support"
21 import_optional_dependency("xlrd", extra=err_msg)
---> 22 super().__init__(filepath_or_buffer)
23
24 @property
~anaconda3envspy38libsite-packagespandasioexcel_base.py in __init__(self, filepath_or_buffer)
351 self.book = self.load_workbook(filepath_or_buffer)
352 elif isinstance(filepath_or_buffer, str):
--> 353 self.book = self.load_workbook(filepath_or_buffer)
354 elif isinstance(filepath_or_buffer, bytes):
355 self.book = self.load_workbook(BytesIO(filepath_or_buffer))
~anaconda3envspy38libsite-packagespandasioexcel_xlrd.py in load_workbook(self, filepath_or_buffer)
35 return open_workbook(file_contents=data)
36 else:
---> 37 return open_workbook(filepath_or_buffer)
38
39 @property
~anaconda3envspy38libsite-packagesxlrd__init__.py in open_workbook(filename, logfile, verbosity, use_mmap, file_contents, encoding_override, formatting_info, on_demand, ragged_rows, ignore_workbook_corruption)
168 # files that xlrd can parse don't start with the expected signature.
169 if file_format and file_format != 'xls':
--> 170 raise XLRDError(FILE_FORMAT_DESCRIPTIONS[file_format]+'; not supported')
171
172 bk = open_workbook_xls(
XLRDError: Excel xlsx file; not supported
pip freeze:
The most important:
pandas==1.2.0
xlrd==2.0.1 (pip freeze gives a directory, but the version is this one)
openpyxl==3.0.5 (pip freeze gives a directory, but the version is this one)
All output:
alabaster==0.7.12
anaconda-client==1.7.2
anaconda-project==0.8.3
appdirs==1.4.4
argh==0.26.2
argon2-cffi @ file:///C:/ci/argon2-cffi_1596828585465/work
asn1crypto @ file:///tmp/build/80754af9/asn1crypto_1596577642040/work
astroid @ file:///C:/ci/astroid_1592487315634/work
astropy @ file:///C:/ci/astropy_1606904903785/work
async-generator==1.10
atomicwrites==1.4.0
attrs @ file:///tmp/build/80754af9/attrs_1604765588209/work
autopep8 @ file:///tmp/build/80754af9/autopep8_1596578164842/work
Babel @ file:///tmp/build/80754af9/babel_1607110387436/work
backcall==0.2.0
backports.shutil-get-terminal-size @ file:///tmp/build/80754af9/backports.shutil_get_terminal_size_1608222128777/work
bcrypt @ file:///C:/ci/bcrypt_1597936263757/work
beautifulsoup4 @ file:///tmp/build/80754af9/beautifulsoup4_1601924105527/work
bitarray @ file:///C:/ci/bitarray_1605065210072/work
bkcharts==0.2
black==19.10b0
bleach @ file:///tmp/build/80754af9/bleach_1600439572647/work
bokeh @ file:///C:/ci/bokeh_1603297934731/work
boto==2.49.0
Bottleneck==1.3.2
brotlipy==0.7.0
certifi==2020.12.5
cffi @ file:///C:/ci/cffi_1606255208697/work
chardet @ file:///C:/ci/chardet_1607690654534/work
click==7.1.2
cloudpickle @ file:///tmp/build/80754af9/cloudpickle_1598884132938/work
clyent==1.2.2
colorama @ file:///tmp/build/80754af9/colorama_1607707115595/work
comtypes==1.1.8
contextlib2==0.6.0.post1
cryptography @ file:///C:/ci/cryptography_1607637849569/work
cycler==0.10.0
Cython @ file:///C:/ci/cython_1605457704149/work
cytoolz==0.11.0
dask @ file:///tmp/build/80754af9/dask-core_1607706933335/work
decorator==4.4.2
defusedxml==0.6.0
diff-match-patch @ file:///tmp/build/80754af9/diff-match-patch_1594828741838/work
distributed @ file:///C:/ci/distributed_1607714134337/work
docutils==0.16
entrypoints==0.3
et-xmlfile==1.0.1
fastcache==1.1.0
filelock==3.0.12
flake8 @ file:///tmp/build/80754af9/flake8_1601911421857/work
Flask==1.1.2
fsspec @ file:///tmp/build/80754af9/fsspec_1602684995936/work
future==0.18.2
gevent @ file:///C:/ci/gevent_1609093188435/work
glob2==0.7
gmpy2==2.0.8
greenlet @ file:///C:/ci/greenlet_1600885448389/work
h5py==2.10.0
HeapDict==1.0.1
html5lib @ file:///tmp/build/80754af9/html5lib_1593446221756/work
idna @ file:///tmp/build/80754af9/idna_1593446292537/work
imagecodecs @ file:///C:/ci/imagecodecs_1603271628972/work
imageio @ file:///tmp/build/80754af9/imageio_1594161405741/work
imagesize==1.2.0
importlib-metadata @ file:///tmp/build/80754af9/importlib-metadata_1602276842396/work
iniconfig @ file:///tmp/build/80754af9/iniconfig_1602780191262/work
intervaltree @ file:///tmp/build/80754af9/intervaltree_1598376443606/work
ipykernel @ file:///C:/ci/ipykernel_1596190155316/work/dist/ipykernel-5.3.4-py3-none-any.whl
ipython @ file:///C:/ci/ipython_1604083276484/work
ipython-genutils @ file:///tmp/build/80754af9/ipython_genutils_1606773439826/work
ipywidgets @ file:///tmp/build/80754af9/ipywidgets_1608834360725/work
isort @ file:///tmp/build/80754af9/isort_1602603989581/work
itsdangerous==1.1.0
jdcal==1.4.1
jedi @ file:///C:/ci/jedi_1606914528444/work
Jinja2==2.11.2
joblib @ file:///tmp/build/80754af9/joblib_1607970656719/work
json5==0.9.5
jsonschema @ file:///tmp/build/80754af9/jsonschema_1602607155483/work
jupyter==1.0.0
jupyter-client @ file:///tmp/build/80754af9/jupyter_client_1601311786391/work
jupyter-console @ file:///tmp/build/80754af9/jupyter_console_1598884538475/work
jupyter-core @ file:///C:/ci/jupyter_core_1606149073702/work
jupyterlab==2.2.6
jupyterlab-pygments @ file:///tmp/build/80754af9/jupyterlab_pygments_1601490720602/work
jupyterlab-server @ file:///tmp/build/80754af9/jupyterlab_server_1594164409481/work
keyring @ file:///C:/ci/keyring_1608756176625/work
kiwisolver @ file:///C:/ci/kiwisolver_1604014703538/work
lazy-object-proxy @ file:///C:/b/work
libarchive-c==2.9
line-profiler==2.1.2
llvmlite==0.34.0
locket==0.2.0
lxml @ file:///C:/ci/lxml_1606498900999/work
MarkupSafe==1.1.1
matplotlib==3.3.3
mccabe==0.6.1
menuinst==1.4.16
mistune==0.8.4
mkl-fft==1.2.0
mkl-random==1.1.1
mkl-service==2.3.0
mock @ file:///tmp/build/80754af9/mock_1607622725907/work
more-itertools @ file:///tmp/build/80754af9/more-itertools_1605111547926/work
mpmath==1.1.0
msgpack @ file:///C:/ci/msgpack-python_1607965861637/work
multipledispatch==0.6.0
mypy-extensions==0.4.3
nbclient @ file:///tmp/build/80754af9/nbclient_1602783176460/work
nbconvert @ file:///C:/ci/nbconvert_1601914925608/work
nbformat @ file:///tmp/build/80754af9/nbformat_1602783287752/work
nest-asyncio @ file:///tmp/build/80754af9/nest-asyncio_1606153767164/work
networkx @ file:///tmp/build/80754af9/networkx_1598376031484/work
nltk @ file:///tmp/build/80754af9/nltk_1592496090529/work
nose @ file:///tmp/build/80754af9/nose_1606773131901/work
notebook @ file:///C:/ci/notebook_1608817539935/work
numba==0.51.2
numexpr @ file:///C:/ci/numexpr_1607675674463/work
numpy @ file:///C:/ci/numpy_and_numpy_base_1603466732592/work
numpydoc @ file:///tmp/build/80754af9/numpydoc_1605117425582/work
olefile==0.46
openpyxl @ file:///tmp/build/80754af9/openpyxl_1598113097404/work
packaging @ file:///tmp/build/80754af9/packaging_1607971725249/work
pandas==1.2.0
pandocfilters @ file:///C:/ci/pandocfilters_1605102497129/work
paramiko @ file:///tmp/build/80754af9/paramiko_1598886428689/work
parso==0.7.0
partd==1.1.0
path @ file:///C:/ci/path_1607537314965/work
pathlib2 @ file:///C:/ci/pathlib2_1607025069150/work
pathspec==0.7.0
pathtools==0.1.2
patsy==0.5.1
pep8==1.7.1
pexpect @ file:///tmp/build/80754af9/pexpect_1605563209008/work
pickleshare @ file:///tmp/build/80754af9/pickleshare_1606932040724/work
Pillow @ file:///C:/ci/pillow_1603823068645/work
pkginfo==1.6.1
plotly==4.14.1
pluggy==0.13.1
ply==3.11
prometheus-client @ file:///tmp/build/80754af9/prometheus_client_1606344362066/work
prompt-toolkit @ file:///tmp/build/80754af9/prompt-toolkit_1602688806899/work
psutil @ file:///C:/ci/psutil_1598370330503/work
ptyprocess @ file:///tmp/build/80754af9/ptyprocess_1605560620615/work/dist/ptyprocess-0.6.0-py2.py3-none-any.whl
py @ file:///tmp/build/80754af9/py_1607971587848/work
pyarrow==1.0.1
pycodestyle==2.6.0
pycosat==0.6.3
pycparser @ file:///tmp/build/80754af9/pycparser_1594388511720/work
pycurl==7.43.0.6
pydocstyle @ file:///tmp/build/80754af9/pydocstyle_1598885001695/work
pyerfa @ file:///C:/ci/pyerfa_1606860299350/work
pyflakes==2.2.0
Pygments @ file:///tmp/build/80754af9/pygments_1607368905949/work
pylint @ file:///C:/ci/pylint_1598617153160/work
pyls-black @ file:///tmp/build/80754af9/pyls-black_1607553132291/work
pyls-spyder @ file:///tmp/build/80754af9/pyls-spyder_1608134179673/work
PyNaCl @ file:///C:/ci/pynacl_1595000047588/work
pyodbc===4.0.0-unsupported
pyOpenSSL @ file:///tmp/build/80754af9/pyopenssl_1608057966937/work
pyparsing==2.4.7
pyreadline==2.1
pyrsistent @ file:///C:/ci/pyrsistent_1600141795814/work
PySocks @ file:///C:/ci/pysocks_1605287845585/work
pytest==0.0.0
python-dateutil==2.8.1
python-jsonrpc-server @ file:///tmp/build/80754af9/python-jsonrpc-server_1600278539111/work
python-language-server @ file:///tmp/build/80754af9/python-language-server_1607972495879/work
pytz @ file:///tmp/build/80754af9/pytz_1608922264688/work
PyWavelets @ file:///C:/ci/pywavelets_1601658407916/work
pywin32==227
pywin32-ctypes==0.2.0
pywinpty==0.5.7
PyYAML==5.3.1
pyzmq==20.0.0
QDarkStyle==2.8.1
QtAwesome @ file:///tmp/build/80754af9/qtawesome_1602272867890/work
qtconsole @ file:///tmp/build/80754af9/qtconsole_1600870028330/work
QtPy==1.9.0
regex @ file:///C:/ci/regex_1606691183008/work
requests @ file:///tmp/build/80754af9/requests_1608241421344/work
retrying==1.3.3
rope @ file:///tmp/build/80754af9/rope_1602264064449/work
rstcheck==3.3.1
Rtree==0.9.4
ruamel-yaml==0.15.87
scikit-image==0.17.2
scikit-learn @ file:///C:/ci/scikit-learn_1598377018496/work
scipy @ file:///C:/ci/scipy_1604596260408/work
seaborn @ file:///tmp/build/80754af9/seaborn_1608578541026/work
selenium==3.141.0
Send2Trash @ file:///tmp/build/80754af9/send2trash_1607525499227/work
simplegeneric==0.8.1
singledispatch @ file:///tmp/build/80754af9/singledispatch_1602523705405/work
sip==4.19.13
six @ file:///C:/ci/six_1605187374963/work
snowballstemmer==2.0.0
sortedcollections==1.2.1
sortedcontainers @ file:///tmp/build/80754af9/sortedcontainers_1606865132123/work
soupsieve @ file:///tmp/build/80754af9/soupsieve_1607965878077/work
Sphinx @ file:///tmp/build/80754af9/sphinx_1608836605859/work
sphinxcontrib-applehelp==1.0.2
sphinxcontrib-devhelp==1.0.2
sphinxcontrib-htmlhelp==1.0.3
sphinxcontrib-jsmath==1.0.1
sphinxcontrib-qthelp==1.0.3
sphinxcontrib-serializinghtml==1.1.4
sphinxcontrib-websupport @ file:///tmp/build/80754af9/sphinxcontrib-websupport_1597081412696/work
spyder @ file:///C:/b/work
spyder-kernels @ file:///C:/ci/spyder-kernels_1608560699887/work
SQLAlchemy @ file:///C:/ci/sqlalchemy_1608243831000/work
statsmodels==0.12.1
sympy @ file:///C:/ci/sympy_1608119646897/work
tables==3.6.1
tblib @ file:///tmp/build/80754af9/tblib_1597928476713/work
terminado==0.9.1
testpath==0.4.4
threadpoolctl @ file:///tmp/tmp9twdgx9k/threadpoolctl-2.1.0-py3-none-any.whl
three-merge @ file:///tmp/build/80754af9/three-merge_1607553261110/work
tifffile @ file:///tmp/build/80754af9/tifffile_1607624018440/work
toml @ file:///tmp/build/80754af9/toml_1592853716807/work
toolz @ file:///tmp/build/80754af9/toolz_1601054250827/work
tornado @ file:///C:/ci/tornado_1606942392901/work
tqdm @ file:///tmp/build/80754af9/tqdm_1608922753628/work
traitlets @ file:///tmp/build/80754af9/traitlets_1602787416690/work
typed-ast==1.4.1
typing-extensions @ file:///tmp/build/80754af9/typing_extensions_1598376058250/work
ujson @ file:///C:/ci/ujson_1602505266082/work
unicodecsv==0.14.1
urllib3 @ file:///tmp/build/80754af9/urllib3_1606938623459/work
watchdog @ file:///C:/ci/watchdog_1606939226617/work
wcwidth @ file:///tmp/build/80754af9/wcwidth_1593447189090/work
webencodings==0.5.1
Werkzeug==1.0.1
widgetsnbextension==3.5.1
win-inet-pton @ file:///C:/ci/win_inet_pton_1605306167264/work
win-unicode-console==0.5
wincertstore==0.2
wrapt==1.11.2
xlrd @ file:///tmp/build/80754af9/xlrd_1608072521494/work
XlsxWriter @ file:///tmp/build/80754af9/xlsxwriter_1602692860603/work
xlwings==0.21.4
xlwt==1.3.0
yapf @ file:///tmp/build/80754af9/yapf_1593528177422/work
zict==2.0.0
zipp @ file:///tmp/build/80754af9/zipp_1604001098328/work
zope.event==4.5.0
zope.interface @ file:///C:/ci/zope.interface_1606940376312/work
Output of the python
command:
Python 3.8.5 (default, Sep 3 2020, 21:29:08) [MSC v.1916 64 bit (AMD64)] :: Anaconda, Inc. on win32
Type "help", "copyright", "credits" or "license" for more information.