From Excel to databases with Python

I just finished a basic Python script for a client that I’d like to share with you. He needed an easy means of moving data back and forth between MySQL and Excel, and sometimes he needed to do a bit of manipulation along the way. In the past I may have relied solely on VBA for this, but I have found it to be much easier with Python. In this post and the accompanying video, I show just part of the project: importing data from Excel into MySQL via Python. Let’s get started.

Be sure to check out the accompanying video!

Download the dependencies

Assuming you have Python installed (I’m using version 2.7), download and install the xlrd library and the MySQLdb module:

  • http://pypi.python.org/pypi/xlrd
  • http://sourceforge.net/projects/mysql-python/

Develop the script

Then tailor the following script to fit your needs:

import xlrd
import MySQLdb

# Open the workbook and define the worksheet
book = xlrd.open_workbook("pytest.xls")
sheet = book.sheet_by_name("source")

# Establish a MySQL connection
database = MySQLdb.connect (host="localhost", user = "root", passwd = "", db = "mysqlPython")

# Get the cursor, which is used to traverse the database, line by line
cursor = database.cursor()

# Create the INSERT INTO sql query
query = """INSERT INTO orders (product, customer_type, rep, date, actual, expected, open_opportunities, closed_opportunities, city, state, zip, population, region) VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s)"""

# Create a For loop to iterate through each row in the XLS file, starting at row 2 to skip the headers
for r in range(1, sheet.nrows):
		product		= sheet.cell(r,0).value
		customer	= sheet.cell(r,1).value
		rep			= sheet.cell(r,2).value
		date		= sheet.cell(r,3).value
		actual		= sheet.cell(r,4).value
		expected	= sheet.cell(r,5).value
		open		= sheet.cell(r,6).value
		closed		= sheet.cell(r,7).value
		city		= sheet.cell(r,8).value
		state		= sheet.cell(r,9).value
		zip			= sheet.cell(r,10).value
		pop			= sheet.cell(r,11).value
		region	= sheet.cell(r,12).value

		# Assign values from each row
		values = (product, customer, rep, date, actual, expected, open, closed, city, state, zip, pop, region)

		# Execute sql Query
		cursor.execute(query, values)

# Close the cursor
cursor.close()

# Commit the transaction
database.commit()

# Close the database connection
database.close()

# Print results
print ""
print "All Done! Bye, for now."
print ""
columns = str(sheet.ncols)
rows = str(sheet.nrows - 1)  # the header row is not imported
print "I just imported " + columns + " columns and " + rows + " rows to MySQL!"
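
One detail the script above does not handle: xlrd returns date cells as floats (Excel serial day numbers), so the date column reaches the INSERT as a number unless it is converted first. Below is a minimal sketch of that conversion, assuming the workbook uses Excel's default 1900 date system. The function name excel_serial_to_datetime is hypothetical; xlrd itself offers xlrd.xldate_as_tuple(value, book.datemode), which also covers the 1904 date mode.

```python
from datetime import datetime, timedelta

# Epoch for Excel's 1900 date system. The two-day offset absorbs Excel's
# fictitious 1900-02-29, so the result is only valid for serials >= 61.
EXCEL_EPOCH_1900 = datetime(1899, 12, 30)


def excel_serial_to_datetime(serial):
    """Convert an Excel serial day number (1900 date system) to a datetime."""
    return EXCEL_EPOCH_1900 + timedelta(days=serial)


# Whole numbers are days; the fractional part is the time of day
print(excel_serial_to_datetime(43831.0))  # 2020-01-01 00:00:00
print(excel_serial_to_datetime(43831.5))  # 2020-01-01 12:00:00
```

In the import loop this would be applied to the value read from column 3 before it is added to the values tuple.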

Hope this is useful. More to come!

Are you trying to create a database from an Excel file? Here is a quick Python tutorial that shows how to create a SQLite database from an Excel file.

The easiest way to convert an Excel file into a database table in Python is the pandas df.to_sql() method, which writes a data frame to a SQL database table. But first, you have to load your Excel file into a data frame.

Follow these steps to convert an Excel file into a SQLite database

Step No 1: Convert Excel file into Dataframe

The first step in converting an Excel file to a SQLite database is to load the Excel file into a data frame. The simplest way to do that is the read_excel() function from the pandas module, which reads an Excel file into a pandas data frame.


import pandas as pd
import sqlite3
df = pd.read_excel('excel_file.xls')
print(df)

Step No 2: Convert Dataframe to SQL database

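Once we have the pandas data frame, we can use the df.to_sql() method to write it to a SQL database.

Below is the code to convert an Excel file to a SQLite database using pandas. Passing sheet_name=None makes read_excel() return a dictionary that maps each sheet name to its data frame, so the loop creates one table per sheet.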


import pandas as pd
import sqlite3

db = sqlite3.connect('sqlite.db')
dfs = pd.read_excel('excel_file.xls', sheet_name=None)
for table, df in dfs.items():
    df.to_sql(table, db)
    print(f'{table} inserted successfully')

Output of the code:

 inserted successfully

Convert Multiple Excel Worksheets to Database Tables in Python

This method suits an Excel file that has multiple worksheets. First, get the names of the worksheets and then insert each one into the database as its own table.

Check the following code to convert an Excel file to a SQLite database.


import sqlite3
import pandas as pd

con = sqlite3.connect('database.db')
wb = pd.ExcelFile('excel_file.xls')
for sheet in wb.sheet_names:
    df = pd.read_excel('excel_file.xls', sheet_name=sheet)
    df.to_sql(sheet, con, index=False, if_exists="replace")
con.commit()
con.close()
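
To confirm which tables such a loop created, you can query SQLite's built-in sqlite_master catalog. The following is a minimal sketch that uses only the standard library. The in-memory database and the table names Sheet1 and Sheet2 are assumptions standing in for whatever database.db contains after to_sql() has run, and table_row_counts is a hypothetical helper.

```python
import sqlite3

con = sqlite3.connect(":memory:")  # stand-in for sqlite3.connect('database.db')

# Stand-in tables; in the tutorial these would be created by df.to_sql()
con.execute('CREATE TABLE "Sheet1" (name TEXT, qty INTEGER)')
con.execute('CREATE TABLE "Sheet2" (name TEXT, qty INTEGER)')
con.executemany('INSERT INTO "Sheet1" VALUES (?, ?)', [("a", 1), ("b", 2)])


def table_row_counts(connection):
    """Return {table_name: row_count} for every table in the database."""
    names = [row[0] for row in connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    counts = {}
    for name in names:
        # Table names cannot be bound as parameters, so quote them as identifiers
        quoted = '"{}"'.format(name.replace('"', '""'))
        counts[name] = connection.execute(
            "SELECT COUNT(*) FROM " + quoted).fetchone()[0]
    return counts


counts = table_row_counts(con)
print(counts)  # {'Sheet1': 2, 'Sheet2': 0}
con.close()
```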

Summary and Conclusion

We have learned how to convert Excel files to database tables in Python. Either of these two approaches gets the job done. If you have any questions, please let me know in the comment section.

Here I give an outline and explanation of the process, including links to the relevant documentation. As some details were missing from the original question, the approach needs to be tailored to particular needs.

The solution

There are two steps in the process:

1) Import the Excel workbook as Pandas data frame

Here we use the standard approach, pandas.read_excel, to get the data out of the Excel file. If there is a specific sheet we want, it can be selected using sheet_name. If the first column of the sheet holds row labels, we can use it as the data frame index with the parameter index_col.

import pandas as pd
# Let's select the Characters sheet and use its first column as the row index
df = pd.read_excel("copybook.xls", sheet_name = "Characters", index_col = 0)

df now contains the following imaginary data frame, which represents the data in the original Excel file

first   last
0   John    Snow
1   Sansa   Stark
2   Bran    Stark

2) Write records stored in a DataFrame to a SQL database

Pandas has a neat method, pandas.DataFrame.to_sql, for interacting with SQL databases through the SQLAlchemy library. The original question mentioned MySQL, so here we assume we already have a running instance of MySQL. To connect to the database, we use create_engine. Lastly, we write the records stored in the data frame to the SQL table called characters.

from sqlalchemy import create_engine
engine = create_engine('mysql://USERNAME:PASSWORD@localhost/copybook')
# Write records stored in a DataFrame to a SQL database
df.to_sql("characters", con = engine)

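We can check that the data has been stored. (engine.execute works in SQLAlchemy 1.x; it was removed in SQLAlchemy 2.0, where the query has to be run on a connection obtained from engine.connect().)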

engine.execute("SELECT * FROM characters").fetchall()

Out:
[(0, 'John', 'Snow'), (1, 'Sansa', 'Stark'), (2, 'Bran', 'Stark')]

or better, use pandas.read_sql_table to read the data back directly as a data frame

pd.read_sql_table("characters", engine)

Out:
index   first   last
0   0   John    Snow
1   1   Sansa   Stark
2   2   Bran    Stark

No MySQL instance available?

You can test the approach by using an in-memory SQLite database. Just copy and paste the following code to play around:

import pandas as pd
from sqlalchemy import create_engine
# Create a new SQLite instance in memory
engine = create_engine("sqlite://")
# Create a dummy data frame for testing or read it from Excel file using pandas.read_excel
df = pd.DataFrame({'first' : ['John', 'Sansa', 'Bran'], 'last' : ['Snow', 'Stark', 'Stark']})
# Write records stored in a DataFrame to a SQL database
df.to_sql("characters", con = engine)
# Read SQL database table into a DataFrame
pd.read_sql_table('characters', engine)

In this article you will learn how to work with Excel data in Python: importing Excel data into MySQL and exporting MySQL data to Excel.

Excel to MySQL in Python

To work with Excel from Python code, we need to import the following libraries

import xlrd
import xlwt
import os
import pandas.io.sql as sql
from configparser import ConfigParser
import mysql.connector

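You may need to install three packages: xlrd and xlwt for reading and writing Excel files, and pandas. Note that xlrd 2.0 and later read only the legacy .xls format, so opening an .xlsx file with xlrd as shown in this article needs an older xlrd release (or a different reader such as openpyxl).

Commands to install the modules: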

pip install xlrd
pip install xlwt
pip install pandas

How to import Excel data to MySQL in Python

In the following example we read data from an Excel sheet and insert it into a MySQL database

Here are the steps

  1. Create an Excel sheet, enter some data, save the file and close it
  2. Keep the file in a folder under the root of your project
  3. Now write the following code that will insert data from Excel into your database
  4. Note: you need to create a custom class and build a list of objects of that class, holding all names and email ids, to pass to the database method.
    The custom class EmailInfo may look like

    class EmailInfo(object):

        def __init__(self, fullname, email):
            self.FullName = fullname
            self.Email = email
    

I have created a folder called «excelFolder», which holds a file named «emailList.xlsx»

rootPath = os.getcwd()
rootPath = rootPath + "/excelFolder/"
loc = rootPath + "emailList.xlsx"

Now read the XLSX file and select the first sheet, then loop through all rows.

wb = xlrd.open_workbook(loc)
sheet = wb.sheet_by_index(0)
for i in range(sheet.nrows):
    print(sheet.cell_value(i, 0), sheet.cell_value(i, 1))

Here is the function for loading the Excel sheet and reading the data.

def ReadFromExcel(self):
    rootPath = os.getcwd()
    rootPath = rootPath + "/excelFolder/"
    loc = rootPath + "emailList.xlsx"

    wb = xlrd.open_workbook(loc)
    sheet = wb.sheet_by_index(0)
    emailList = []
    for i in range(sheet.nrows):
        # Column 0 holds the full name, column 1 the email id
        emailList.append(EmailInfo(sheet.cell_value(i, 0), sheet.cell_value(i, 1)))
    print("Successfully retrieved all excel data")
    # Return the objects so they can be passed to BulkInsert
    return emailList

The following function will insert all the data into the MySQL database

def BulkInsert(self, emailList):
    mycursor = self.myConnection.cursor()
    # create the table
    # mycursor.execute("CREATE TABLE tbEmailList (tid INT AUTO_INCREMENT PRIMARY KEY, FullName VARCHAR(255), EmailId VARCHAR(255))")
    # %s placeholders let the driver escape the values; a query built with
    # str.format breaks on names that contain a quote, such as O'Brien
    query = "INSERT INTO tbEmailList (FullName, EmailId) VALUES (%s, %s)"
    for obj in emailList:
        print(obj.FullName, obj.Email)
        mycursor.execute(query, (obj.FullName, obj.Email))
    self.myConnection.commit()
    mycursor.close()
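
Building an INSERT statement with str.format and binding the values through driver placeholders behave differently as soon as a value contains a quote. The sketch below shows the difference with the standard-library sqlite3 module as a stand-in for MySQL (sqlite3 uses ? placeholders where mysql.connector uses %s). The sample name and address are made up.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE tbEmailList (FullName TEXT, EmailId TEXT)")

name, email = "Pat O'Brien", "pat@example.com"  # made-up sample row

# 1) String formatting: the apostrophe ends the SQL string literal early
formatted = ("INSERT INTO tbEmailList (FullName, EmailId) "
             "VALUES ('{}', '{}')").format(name, email)
try:
    con.execute(formatted)
    format_failed = False
except sqlite3.OperationalError as exc:
    format_failed = True
    print("formatted query failed:", exc)

# 2) Placeholders: the values travel separately from the SQL text
con.execute("INSERT INTO tbEmailList (FullName, EmailId) VALUES (?, ?)",
            (name, email))
con.commit()

rows = con.execute("SELECT FullName, EmailId FROM tbEmailList").fetchall()
print(rows)
con.close()
```

Only the placeholder version stores the row; the formatted statement is rejected as a syntax error.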

Export MySQL data to Excel in Python

You can retrieve MySQL data and save it to an Excel file; this can be done in several ways in Python.

The following example follows this sequence:

  1. Specify the file name and where to save it (the full path)
  2. Create an instance of a workbook
  3. Create a MySQL connection with the required database, username, and password
  4. Fetch data from the SQL database table
  5. Use the to_excel method to save the data to an Excel file

def WriteToExcel2(self):
    rootPath = os.getcwd()
    rootPath = rootPath + "/excelFolder/newFile.xlsx"

    workbook = xlwt.Workbook(encoding='utf-8')
    worksheet = workbook.add_sheet("mysheet1", cell_overwrite_ok=True)
    worksheet.Title = "Email List"

    # Fetch the rows into a data frame; to_excel then writes its own file
    # (ds.xls in the current directory), independent of the xlwt workbook above
    df = sql.read_sql('SELECT firstName, LastName, RegDate FROM tbStudent', self.myConnection)
    df.to_excel('ds.xls')

    print("Successfully created excel file")

Now let’s look at another example of exporting MySQL data to an Excel sheet. This example writes the header row to the worksheet and saves the file in a specific folder; rows fetched from SQL would be written the same way, with one worksheet.write call per cell.

def WriteToExcel(self):
    rootPath = os.getcwd()
    rootPath = rootPath + "/excelFolder/newFile.xlsx"

    workbook = xlwt.Workbook(encoding='utf-8')
    worksheet = workbook.add_sheet("mysheet1", cell_overwrite_ok=True)
    worksheet.Title = "Email List"

    # Write the column headers into row 0
    fields = [u'ID', u'Name', u'Email']
    for col in range(0, len(fields)):
        worksheet.write(0, col, fields[col])

    workbook.save(rootPath)

    print("Successfully created excel file")

Here is the complete code

import xlrd
import xlwt
import os
import pandas.io.sql as sql
from configparser import ConfigParser
import mysql.connector


class ExcelExample(object):

    def __init__(self):
        self.Title = "Work with Excel Example"
        # Read the connection settings from the [dbinfo] section of mypy.ini
        config = ConfigParser()
        config.read('mypy.ini')
        database = config.get('dbinfo', 'database')
        dbusername = config.get('dbinfo', 'dbusername')
        dbpassword = config.get('dbinfo', 'dbpassword')
        dbhost = config.get('dbinfo', 'dbhost')

        self.myConnection = mysql.connector.connect(
            host=dbhost,
            user=dbusername,
            password=dbpassword,
            database=database)

    def ReadFromExcel(self):
        rootPath = os.getcwd()
        rootPath = rootPath + "/excelFolder/"
        loc = rootPath + "emailList.xlsx"

        wb = xlrd.open_workbook(loc)
        sheet = wb.sheet_by_index(0)
        for i in range(sheet.nrows):
            print(sheet.cell_value(i, 0), sheet.cell_value(i, 1))
        print("Successfully retrieved all excel data")

    def WriteToExcel(self):
        rootPath = os.getcwd()
        rootPath = rootPath + "/excelFolder/newFile.xlsx"

        workbook = xlwt.Workbook(encoding='utf-8')
        worksheet = workbook.add_sheet("mysheet1", cell_overwrite_ok=True)
        worksheet.Title = "Email List"

        # Write the column headers into row 0
        fields = [u'ID', u'Name', u'Email']
        for col in range(0, len(fields)):
            worksheet.write(0, col, fields[col])

        workbook.save(rootPath)

        print("Successfully created excel file")

    def WriteToExcel2(self):
        rootPath = os.getcwd()
        rootPath = rootPath + "/excelFolder/newFile.xlsx"

        workbook = xlwt.Workbook(encoding='utf-8')
        worksheet = workbook.add_sheet("mysheet1", cell_overwrite_ok=True)
        worksheet.Title = "Email List"

        # to_excel writes its own file (ds.xls), independent of the xlwt workbook
        df = sql.read_sql('SELECT firstName, LastName, RegDate FROM tbStudent', self.myConnection)
        df.to_excel('ds.xls')

        print("Successfully created excel file")
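
The class above reads its connection settings from mypy.ini, which the article does not show. Judging from the config.get calls, the file needs a [dbinfo] section with the keys database, dbusername, dbpassword and dbhost. Here is a minimal sketch of such a file; the values are placeholders and pure assumptions, and the text is parsed from a string so the example runs without a file on disk.

```python
from configparser import ConfigParser

# Hypothetical contents of mypy.ini. Only the section and key names come
# from the config.get() calls in the class; the values are placeholders.
SAMPLE_INI = """
[dbinfo]
database = mydatabase
dbusername = myuser
dbpassword = secret
dbhost = localhost
"""

config = ConfigParser()
config.read_string(SAMPLE_INI)  # the article calls config.read('mypy.ini')

# Collect the settings under the keyword names passed to
# mysql.connector.connect() in the class
db_settings = {
    "host": config.get("dbinfo", "dbhost"),
    "user": config.get("dbinfo", "dbusername"),
    "password": config.get("dbinfo", "dbpassword"),
    "database": config.get("dbinfo", "database"),
}
print(db_settings)
```

Keeping credentials in a separate ini file like this means the script itself can be shared without exposing them.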

Quite often you need to fill a database from some external source. This example shows how to populate a database (SQLite) with data from an xlsx file.

For development I use PyCharm 2018.3.3 Professional and Python 3.7.

Libraries used: sqlite3 for working with the database, openpyxl for working with Excel.

1. Create a new Python project and add a .py file

2. Add a database to the project and make a test connection.

3. The program can be split into three parts:

1. Connecting to the database and creating the table

2. Reading the xlsx file with the data

3. Writing to the database and closing the connection

The source file that needs to be transferred to SQL

After running the program, you should see a cars table with exactly the same data

import os
import sqlite3
import openpyxl


def export_to_sqlite():
    '''Export data from xlsx to sqlite'''

    # 1. Create and connect to the database

    # Get the current project folder
    prj_dir = os.path.abspath(os.path.curdir)

    # Database name
    base_name = 'auto.sqlite3'

    # sqlite3.connect creates the database automatically if it does not exist
    connect = sqlite3.connect(prj_dir + '/' + base_name)
    # A cursor is a special object that runs queries and fetches their results
    cursor = connect.cursor()

    # Create the table if it does not exist
    cursor.execute('CREATE TABLE IF NOT EXISTS cars (brand text, model text, distance int , year int)')

    # 2. Work with the xlsx file

    # Read the file and Sheet1 of the Excel workbook
    file_to_read = openpyxl.load_workbook('Cars.xlsx', data_only=True)
    sheet = file_to_read['Sheet1']

    # Loop over the rows starting from the second one (the first holds the headers)

    for row in range(2, sheet.max_row + 1):
        # Declare the list
        data = []
        # Loop over columns 1 to 4 (5 is excluded)
        for col in range(1, 5):
            # value holds the value of the cell at coordinates row, col
            value = sheet.cell(row, col).value
            # The list that we will insert later
            data.append(value)

        # 3. Write to the database and close the connection

        # Insert the data into the table's fields
        cursor.execute("INSERT INTO cars VALUES (?, ?, ?, ?);", (data[0], data[1], data[2], data[3]))

    # Save the changes
    connect.commit()
    # Close the connection
    connect.close()


def clear_base():
    '''Clear the sqlite database'''

    # Get the current project folder
    prj_dir = os.path.abspath(os.path.curdir)

    # Database name
    base_name = 'auto.sqlite3'

    connect = sqlite3.connect(prj_dir + '/' + base_name)
    cursor = connect.cursor()

    # Write to the database, save and close the connection
    cursor.execute("DELETE FROM cars")
    connect.commit()
    connect.close()


# Run the function
export_to_sqlite()
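
The loop above issues one INSERT per spreadsheet row. For larger files, the same rows can be collected first and written in a single call with cursor.executemany. The following is a minimal sketch that uses an in-memory database; the rows list is made-up sample data standing in for the values read from Cars.xlsx.

```python
import sqlite3

# Made-up sample rows (brand, model, distance, year) standing in for the
# cell values that the openpyxl loop collects from Cars.xlsx
rows = [
    ("Toyota", "Corolla", 120000, 2012),
    ("Ford", "Focus", 85000, 2015),
    ("Lada", "Vesta", 40000, 2018),
]

connect = sqlite3.connect(":memory:")  # in-memory database, nothing is written to disk
cursor = connect.cursor()
cursor.execute("CREATE TABLE IF NOT EXISTS cars (brand text, model text, distance int, year int)")

# One call inserts every row; each tuple fills the four ? placeholders
cursor.executemany("INSERT INTO cars VALUES (?, ?, ?, ?)", rows)
connect.commit()

inserted = cursor.execute("SELECT COUNT(*) FROM cars").fetchone()[0]
newest = cursor.execute(
    "SELECT brand, model FROM cars ORDER BY year DESC LIMIT 1").fetchone()
print(inserted, newest)
connect.close()
```

In the original function, this would mean appending each data list to a rows list inside the loop and calling executemany once after the loop ends.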
