Win32com client dispatch word application

Based on the script in ".doc to pdf using python", I've got a semi-working script that exports .docx files from C:\Export_to_pdf to PDF in a new folder.

The problem is that it gets through the first couple of documents and then fails with:

(-2147352567, 'Exception occurred.', (0, u'Microsoft Word', u'Command failed', u'wdmain11.chm', 36966, -2146824090), None)

This is apparently an unhelpful, general error message. If I step through the script slowly using pdb, I can loop through all the files and export them successfully. If I also keep an eye on the processes in Windows Task Manager, I can see that WINWORD starts and ends when it is supposed to, but on the larger files it takes longer for the memory usage to stabilise. This makes me think that the script is tripping up when WINWORD doesn't have time to initialize or quit before the next method is called on the client.Dispatch object.

Is there a way with win32com or comtypes to identify and wait for a process to start or finish?

My script:

import os
from win32com import client

folder = r"C:\Export_to_pdf"
file_type = 'docx'
out_folder = folder + r"\PDF"

os.chdir(folder)

if not os.path.exists(out_folder):
    print 'Creating output folder...'
    os.makedirs(out_folder)
    print out_folder, 'created.'
else:
    print out_folder, 'already exists.\n'

for files in os.listdir("."):
    if files.endswith(".docx"):
        print files

print '\n\n'

try:
    for files in os.listdir("."):
        if files.endswith(".docx"):
            out_name = files.replace(file_type, r"pdf")
            in_file = os.path.abspath(folder + "\\" + files)
            out_file = os.path.abspath(out_folder + "\\" + out_name)
            word = client.Dispatch("Word.Application")
            doc = word.Documents.Open(in_file)
            print 'Exporting', out_file
            doc.SaveAs(out_file, FileFormat=17)
            doc.Close()
            word.Quit()
except Exception, e:
    print e

The working code: I just replaced the try block with this. Note that I moved the DispatchEx statement outside the for loop and moved word.Quit() to a finally clause to ensure Word closes.

try:
    word = client.DispatchEx("Word.Application")
    for files in os.listdir("."):
        if files.endswith(".docx") or files.endswith('doc'):
            out_name = files.replace(file_type, r"pdf")
            in_file = os.path.abspath(folder + "\\" + files)
            out_file = os.path.abspath(out_folder + "\\" + out_name)
            doc = word.Documents.Open(in_file)
            print 'Exporting', out_file
            doc.SaveAs(out_file, FileFormat=17)
            doc.Close()
except Exception, e:
    print e
finally:
    word.Quit()
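
For readers on Python 3, the same fix could be sketched roughly as follows. This is only a sketch under stated assumptions, not the poster's code: the helper names pdf_path_for and export_all are made up for illustration, and the Word application object is passed in as a parameter so that the whole loop reuses a single instance. Deriving the output name with os.path.splitext also covers .doc files, which a plain replace of 'docx' would leave unchanged.

```python
import os

WD_FORMAT_PDF = 17  # same value as FileFormat=17 in the script above


def pdf_path_for(src, out_folder):
    """Return the PDF path for a .doc/.docx file (hypothetical helper)."""
    base, _ext = os.path.splitext(os.path.basename(src))
    return os.path.join(out_folder, base + ".pdf")


def export_all(word, folder, out_folder):
    """Export every .doc/.docx in folder using one shared Word instance."""
    os.makedirs(out_folder, exist_ok=True)
    exported = []
    for name in sorted(os.listdir(folder)):
        if not name.lower().endswith((".doc", ".docx")):
            continue
        src = os.path.abspath(os.path.join(folder, name))
        dst = os.path.abspath(pdf_path_for(src, out_folder))
        doc = word.Documents.Open(src)
        try:
            doc.SaveAs(dst, FileFormat=WD_FORMAT_PDF)
        finally:
            doc.Close()  # close the document even if the export fails
        exported.append(dst)
    return exported


def main():
    # Imported here so the helpers above can be used without pywin32.
    from win32com import client

    word = client.DispatchEx("Word.Application")
    try:
        export_all(word, r"C:\Export_to_pdf", r"C:\Export_to_pdf\PDF")
    finally:
        word.Quit()  # runs only after Word was created successfully
```

Creating the Word object before the try block means the finally clause never runs against an undefined name if DispatchEx itself fails.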

Last Updated on January 15, 2022

This tutorial walks through how to automate Word documents with python-docx and send emails with the win32com library. Imagine that we have a list of customer information stored inside an Excel file (or a database). The process looks like the following:

  • Automatically generate an invoice in MS Word for each client
  • Convert the Word document to PDF format
  • Send (using MS Outlook App) the PDF invoice to customers with a customized greeting message

Required Libraries

We’ll need three libraries for this project. We use pandas to read data from an Excel file, but the pandas library is not a must-have if your data is elsewhere or if you prefer to extract customer data another way.

python-docx for automating .docx files (the format used by MS Word, Google Docs, etc.)

pywin32 for interacting with Windows APIs

pip install pandas python-docx pywin32

The library is usually referred to as docx; however, the name to install is python-docx.

So, note the following difference:

pip install python-docx

import docx

Since the docx library creates .docx files, you don’t have to use MS Word. Both Google Docs and LibreOffice are free alternatives that support .docx files, and they are as good as the MS Office suite.

To create a .docx file, we need to create a Document object first. Then inside the document object, we can add various elements such as headings, paragraphs, pictures, etc. In the code below, the Inches object is used to define the size of an element, e.g. a picture.

from docx import Document
from docx.shared import Inches

document = Document()
document.add_picture('brand_logo.png', width = Inches(1))
document.add_heading('Invoice', 0)

The Run Object

The Run object represents any text – it can be a letter, a word, a sentence, or a full paragraph. Visually, each red box in the below picture represents a separate Run. We use .add_paragraph() to start a new sentence/paragraph “This is a “. Then we can keep adding new Runs to the existing Paragraph object.

Once we add a Run, we can also modify its properties such as font, size, color, etc.

The following code will create the above sentence with shown styles.

from docx import Document
from docx.shared import Pt, RGBColor

document = Document()
p1 = document.add_paragraph('This is a ')
p1.add_run('MS WORD ').bold = True
p1.add_run('document ')
eg = p1.add_run('example')
eg.font.size = Pt(20)
eg.font.color.rgb = RGBColor(0,128,0)

Create Invoices

Our sample data inside Excel looks like the following:

Of course, we don’t want to send to these guys’ actual email addresses, so I’m using my own test email address.

Essentially, this is our company’s sales data, for example, the first record means: We sold 10 units of Falcon 9 rockets to Elon Musk at a unit price of $1m. Let’s create an invoice for each customer 🙂

Since this is part of a streamlined process, we'll write functions that each do only one thing. The first step is creating invoices in .docx format. This function takes the following arguments: customer name, email, the product sold to them, number of units, and the unit price.

In the code below:

  • line 6 inserts a customer name
  • line 10 inserts the number of units
  • line 12 inserts the product name
  • line 15 uses a list comprehension to add two blank lines
  • lines 17 – 30 creates a table to summarize the invoice
  • line 38 saves the document with the client’s name

Let's test the function. It looks good!

make_client_invoice('Elon Musk', 'amznbotnotification@gmail.com','Falcon 9',10, 1000000)

Convert MS Word Document To PDF Format

Now we have our invoice in Word, let’s convert it to PDF since that’s the standard format for business documents.

We'll use the pywin32/win32com library. This one also has a mismatch between its installation name and its import name. Note the difference below:

pip install pywin32

import win32com.client

The win32com is a great library that allows us to control lots of things in the Windows operating system. It can control the Office Suite of apps for example.

To convert Word (.docx) to PDF format, we essentially open the document using win32com, then Save As PDF format. Easy peasy!

The code below takes an input file path src, then converts and saves a pdf to file path dst.

win32com.client.Dispatch("Word.Application") creates an MS Word instance/object inside Python. Replace Word with Excel, and you'll have an Excel instance!

wdFormatPDF = 17 is the value of the wdFormatPDF member of Word's WdSaveFormat enumeration, the same constant VBA uses. See the list of WdSaveFormat values for the other file types we can save to.
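
To make the role of the number 17 more concrete, here is a small sketch that picks the FileFormat value from the target file extension. The numeric values are members of Word's WdSaveFormat enumeration; the dictionary names and the file_format_for helper are hypothetical and only cover a handful of formats.

```python
import os

# A few members of Word's WdSaveFormat enumeration.
WD_SAVE_FORMATS = {
    "wdFormatDocument": 0,      # legacy .doc
    "wdFormatText": 2,          # plain text
    "wdFormatRTF": 6,
    "wdFormatHTML": 8,
    "wdFormatXMLDocument": 12,  # .docx
    "wdFormatPDF": 17,
    "wdFormatXPS": 18,
}

# Hypothetical mapping from a target extension to the constant name.
EXTENSION_TO_FORMAT = {
    ".doc": "wdFormatDocument",
    ".txt": "wdFormatText",
    ".rtf": "wdFormatRTF",
    ".html": "wdFormatHTML",
    ".docx": "wdFormatXMLDocument",
    ".pdf": "wdFormatPDF",
    ".xps": "wdFormatXPS",
}


def file_format_for(dst):
    """Return the FileFormat value to pass to doc.SaveAs for path dst."""
    ext = os.path.splitext(dst)[1].lower()
    try:
        return WD_SAVE_FORMATS[EXTENSION_TO_FORMAT[ext]]
    except KeyError:
        raise ValueError(f"no save format known for extension {ext!r}")
```

With this helper, a call such as doc.SaveAs(dst, FileFormat=file_format_for(dst)) would save as PDF whenever dst ends in .pdf.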

Automate Sending Email Using Outlook App

Next, we’ll send out the invoice to our customers! win32com is again our helper to interact with the Outlook App. Note – not the web-based Outlook, but the actual app that’s installed on our computer. This step requires you have Office (especially Outlook) installed on your computer, and logged into an Outlook account. User name and password are not required as long as your Outlook App stays logged in.

In the code above, line 3 CreateItem(0) means to create a Mail object. See the below table on other possible objects we can create inside Outlook.
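
As a stand-in for that table, the values accepted by CreateItem are the members of Outlook's OlItemType enumeration. The sketch below lists them as a Python IntEnum; the create_item wrapper is a hypothetical convenience, not part of win32com.

```python
from enum import IntEnum


class OlItemType(IntEnum):
    """Members of Outlook's OlItemType enumeration."""
    olMailItem = 0
    olAppointmentItem = 1
    olContactItem = 2
    olTaskItem = 3
    olJournalItem = 4
    olNoteItem = 5
    olPostItem = 6
    olDistributionListItem = 7


def create_item(outlook, item_type=OlItemType.olMailItem):
    """Call outlook.CreateItem with a named item type (hypothetical helper)."""
    return outlook.CreateItem(int(item_type))
```

For example, create_item(outlook, OlItemType.olAppointmentItem) would ask Outlook for a calendar appointment instead of a mail item.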

To add an attachment, simply pass in the file location similar to line 8.

It appears that we don’t even need to have Outlook App open to send an email using Python. As long as we have previously logged into our Outlook App, it’s good to go. The best part – no credentials or passwords are required, the win32com will just interact with Outlook with your existing settings.

Putting It Together

Now I offer the three functions for the three steps of the invoicing system. It’s your turn to try putting it together. You can use a loop to send invoices one by one or build them with other processes. Enjoy!

from docx import Document
from docx.shared import Inches
import pandas as pd
import win32com.client




def make_client_invoice(name, email, product, unit, price):
    document = Document()
    document.add_picture('brand_logo.png', width=Inches(1))
    document.add_heading('Invoice', 0)
    p1 = document.add_paragraph('Dear ')
    p1.add_run(name).bold=True
    p1.add_run(',')

    p2 = document.add_paragraph('Please find attached invoice for your recent purchase of ')
    p2.add_run(str(unit)).bold = True
    p2.add_run(' units of ')
    p2.add_run(product).bold=True
    p2.add_run('.')

    [document.add_paragraph('') for _ in range(2)]
    
    table = document.add_table(rows=1, cols=4)
    hdr_cells = table.rows[0].cells
    hdr_cells[0].text = 'Product Name'
    hdr_cells[1].text = 'Units'
    hdr_cells[2].text = 'Unit Price'
    hdr_cells[3].text = 'Total Price'
    for i in range(4):
        hdr_cells[i].paragraphs[0].runs[0].font.bold = True
        
    row_cells = table.add_row().cells
    row_cells[0].text = product
    row_cells[1].text = f'{unit:,.2f}'
    row_cells[2].text = f'{price:,.2f}'
    row_cells[3].text = f'{unit * price:,.2f}'
    
    [document.add_paragraph('') for _ in range(10)]

    document.add_paragraph('We appreciate your business and please come again!')
    document.add_paragraph('Sincerely')
    document.add_paragraph('Jay')

    document.save(f'{name}.docx')

def docx_to_pdf(src, dst):
    word = win32com.client.Dispatch("Word.Application")
    wdFormatPDF = 17
    doc = word.Documents.Open(src)
    doc.SaveAs(dst, FileFormat=wdFormatPDF)
    doc.Close()
    word.Quit()

def send_email(name, to_addr, attachment):
    outlook = win32com.client.Dispatch("Outlook.Application")
    mail = outlook.CreateItem(0)
    mail.To = to_addr #'amznbotnotification@gmail.com'
    mail.Subject = 'Invoice from PythonInOffice'
    mail.Body = f'Dear {name}, Please find attached invoice'
    mail.Attachments.Add(attachment)
    mail.Send()
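
One possible way to put the three functions together is sketched below. It is only a sketch: send_all_invoices is a made-up name, the three steps are passed in as parameters, and each record is assumed to be a dict with the keys name, email, product, unit and price (the real column names in your Excel file may differ).

```python
import os


def send_all_invoices(records, make_invoice, to_pdf, send):
    """Run the three steps for each customer record and return the PDF paths."""
    sent = []
    for rec in records:
        make_invoice(rec["name"], rec["email"], rec["product"],
                     rec["unit"], rec["price"])
        # make_client_invoice saves '<name>.docx' in the current directory;
        # Word is given absolute paths here.
        src = os.path.abspath(f"{rec['name']}.docx")
        dst = os.path.splitext(src)[0] + ".pdf"
        to_pdf(src, dst)
        send(rec["name"], rec["email"], dst)
        sent.append(dst)
    return sent
```

With the functions above it could be called as send_all_invoices(pd.read_excel('sales.xlsx').to_dict('records'), make_client_invoice, docx_to_pdf, send_email), where the file name sales.xlsx is an assumption.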

I have a DOCX file, which has the following info

            pages of section   header
section 1   1                  None
section 2   2                  Abstract. current page ??, total IV pages
section 3   5                  paper body, current page ??, total 35 pages

If the line oSec.Headers(1).Range.Fields.Update is used in the following VBA code, the correct header text is shown:

Function myTrim(s)
    a = Replace(s, vbLf, "")
    myTrim = Trim(a)
End Function

Sub displayHeader()
    idx = 1
    For Each oSec In ActiveDocument.Sections
        oSec.Headers(1).Range.Fields.Update 'this line must be called
        MsgBox "sec " & idx & " " & myTrim(oSec.Headers(1).Range.Text)
        idx = idx + 1
    Next
End Sub

Then I wrote a Python version, which closely mirrors the original VBA:

import win32com
from win32com.client import Dispatch, constants

#~ word = win32com.client.Dispatch('Word.Application')

word = win32com.client.gencache.EnsureDispatch('Word.Application')

word.Visible = 1
word.DisplayAlerts = 0

word.Documents.Open('r:/test.docx')

for idx, oSec in enumerate(word.ActiveDocument.Sections):
    #~ oSec.Headers(1).Range.Fields.Update()
    print(f'sec {idx+1}', oSec.Headers(1).Range.Text.strip())

word.Documents.Close(constants.wdDoNotSaveChanges)
word.Quit()

However the Python code does not give the same corrected header text:

                                  Dispatch('Word.Application')                      gencache.EnsureDispatch('Word.Application')
use Range.Fields.Update()         sec 1                                             sec 1
                                  sec 2 Abstract. current page I, total I pages     sec 2 Abstract. current page I, total I pages
                                  sec 3 paper body, current page 1, total 1 pages   sec 3 paper body, current page 1, total 1 pages
do not use Range.Fields.Update()  sec 1                                             sec 1
                                  sec 2 Abstract. current page III, total IV pages  sec 2 Abstract. current page I, total I pages
                                  sec 3 paper body, current page 1, total 35 pages  sec 3 paper body, current page 1, total 35 pages

So what is the problem, and how can I fix it? Thank you in advance.

In order to implement COM objects in the Windows version of Python, you need a set of extensions developed by Mark Hammond and Greg Stein. Part of the win32com package, these extensions enable you to do everything that is COM-related, including writing COM clients and COM servers.

The following link takes you to the download page of these extensions:

http://www.python.org/download/download_windows.html

All the Win32 extensions (including the COM extensions) are part of the win32all installation package. This package also installs the PythonWin IDE in your machine.

The latest version of this whole package is located at the win32all home page. Search for the win32all.exe file:

http://www.python.org/windows/win32all/

You can also go directly to Mark Hammond’s starship home page, which might have more recent beta releases of this package:

http://starship.python.net/crew/mhammond/

After installing the package in your machine, take a look at the readme.htm file, which is stored at the win32com directory.

COM support for Python is composed of the core PythonCOM module, which supports the C++ code, and the other modules that implement helper code in Python. The whole package is known as win32com.

The win32com Package

The win32com support is standalone, as it does not require PythonWin. The win32com package itself does not provide any functionality. Some of the modules contained in this package are:

win32com.pythoncom: Provides core C++ support for COM objects and exposes COM object methods, such as QueryInterface() and Invoke(), just as the C++ API does. Note that all the reference counting is automatically done for you. Programmers rarely access this module directly. Instead, they usually use the win32com wrapper classes and functions written in Python to provide a nice, programmable interface.

win32com.client: Provides support for COM clients (for example, using Python to start Microsoft Excel and create a spreadsheet). The COM client support enables Python to manipulate other COM objects via their exposed interfaces. All client-side IUnknown-derived objects, including IDispatch, are supported.

win32com.server: Provides support for COM servers (for example, creating and registering a COM server object in Python and using a language such as Visual Basic or Delphi to access the Python objects). The COM server support enables Python to create COM servers, which can be manipulated by another COM client. All server-side IUnknown-derived objects are supported.

win32com.axscript: This is the ActiveX Scripting implementation for Python.

win32com.axdebug: This is the Active Debugging implementation for Python.

win32com.mapi: Provides utilities for working with MAPI and the Microsoft Exchange Server.

Talking to Windows Applications

The COM technology has been part of the Windows world for a long time. The COM genealogy can be traced back to DDE (Dynamic Data Exchange). DDE was the first device for transferring data between various applications in a multi-tasking computer. After some time, DDE was expanded to Object Linking and Embedding (OLE); note that COM was invented as part of OLE. The creation of the Visual Basic Extensions (VBXs) enhanced the OLE technology for visual components, originating a new standard called OLE2, which was based on top of COM. Soon, the OLE2 technology became more integrated with COM, which is a general-purpose mechanism. Nowadays, COM is mostly known, in part, because of the ActiveX technology.

Professional applications such as Microsoft Office and the Netscape browser enable you to control their objects using COM. Therefore, programs written in Python can be easily used to control those applications.

COM passes string objects as Unicode characters. Before using these objects in Python, it’s necessary to convert them to strings. The Python-2.0 Unicode string type is not the same as the string type, but it is easy to convert between the two.

PythonWin comes with a basic COM browser (Python Object browser). This program helps you to identify the current objects in your system that implement COM interfaces.

To run the browser, select it from the PythonWin Tools menu, or double-click on the file win32com\client\combrowse.py.

Note that there are other COM browsers available, such as the one that comes with the Microsoft Visual C++.

If you study the file python\win32com\servers\interp.py, which is installed as part of your PythonWin distribution, you will learn how to implement a very simple COM server. This server exposes the Python interpreter by providing a COM object that handles both the exec and eval methods. Before using this object, register it by running the module from Python.exe. Then, from Visual Basic, use CreateObject("Python.Interpreter") to initialize the object, and you can start calling the methods.

Word and Excel

Let’s quit talking and get to some practicing. Our objective here is to open and manipulate Microsoft applications from Python.

The first thing that you need to do is to import the COM client and dispatch the right object. In the next example, a variable is assigned a reference to an Excel application:

>>> import win32com.client
>>> xl = win32com.client.Dispatch("Excel.Application")

The following does the same thing, but this time the reference is to a Word application.

>>> wd = win32com.client.Dispatch("Word.Application")

Excel.Application and Word.Application are the Program IDs (progid), which are the names of the objects for which you want to create an instance. Internally, these objects have a Class ID (clsid) that uniquely registers them in the Windows Registry. The matching table between progids and clsids is stored in the Windows Registry and the matching is performed by the COM mechanism.

It is not an easy job to identify an application progid, or to find out object methods and attributes. You can use COM browsers to see what applications have COM interfaces in your system.

For the Microsoft Products, you can take a look at the documentation; it is a good source of information.

Not necessarily every COM object implements the same interface. However, there are similarities.

For example, if the previous assignments have just created the objects and you want to make them visible, you have to type

>>> xl.Visible = 1 # Sets the visible property for the Excel application
>>> wd.Visible = 1 # Sets the visible property for the Word application

To close both programs and release the memory, you need to say

>>> xl = None
>>> wd = None

or, you could use

>>> del xl, wd

These were simple examples of implementing COM clients in Python. Next, we will see how to implement a Python COM server by creating a Python interface that exposes an object. The next block of code registers the interface in the Windows Registry.

Note that every new COM object that you create must have a unique clsid, but you don’t have to worry about it. The complex algorithm that works behind the scenes is ready to generate a unique identification, as shown here:

>>> import pythoncom

>>> print pythoncom.CreateGuid()
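
When pywin32 is not at hand, a fresh identifier in the same brace-wrapped form can also be produced with the standard library. Note that this swaps in Python's uuid module rather than pythoncom.CreateGuid(), and the function name new_clsid is made up for this sketch.

```python
import uuid


def new_clsid():
    """Return a fresh GUID in the brace-wrapped, upper-case registry form."""
    return "{" + str(uuid.uuid4()).upper() + "}"


if __name__ == "__main__":
    # Prints a different value on every run.
    print(new_clsid())
```

The resulting string has the same shape as the _reg_clsid_ values used in this chapter and can be pasted into a server class.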

Your COM server is defined next. You have to execute the program in order to make the COM object available in the system. Store it on a file, and double-click on it.

1: class TaxApplication:
3:     _reg_progid_ = "Tax.Application"
4:     _reg_clsid_ = "{D2DEB6E1-3C6D-11D4-804E-0050041A5111}"
5:
10:     print "Registering COM server"
11:     import win32com.server.register
12:     win32com.server.register.UseCommandLine(TaxApplication)

Line 2: Exposes the method to be exported.

Line 3: Defines the name that the COM client application must use to connect to the object.

Line 4: Defines the unique Class ID (clsid) used by the object.

Line 12: Registers the TaxApplication class.
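
Several lines of the listing above did not survive, so here is one way the complete server might look in Python 3. Treat it as a reconstruction under assumptions rather than the original listing: the PAtax body and its 7% rate are inferred from the result of 107 that the text reports for PAtax(100), and the registration code is wrapped in a register() function so that the class can be imported without pywin32.

```python
class TaxApplication:
    # Methods exposed to COM clients (the role the text assigns to line 2).
    _public_methods_ = ["PAtax"]
    _reg_progid_ = "Tax.Application"
    _reg_clsid_ = "{D2DEB6E1-3C6D-11D4-804E-0050041A5111}"

    def PAtax(self, amount):
        # Assumed 7% rate, inferred from PAtax(100) giving 107.
        return amount * 1.07


def register():
    """Register the server; requires pywin32 on Windows."""
    import win32com.server.register

    print("Registering COM server")
    win32com.server.register.UseCommandLine(TaxApplication)
```

Calling register() from a script run on Windows would then make Tax.Application available to COM clients.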

In order to test the program, we need to have an external COM client. Let’s use the Visual Basic for Applications Editor, which is present in both Excel and Word.

Open your Microsoft application, press ALT+F8 to open the Macro dialog box, and select the option that creates a macro. Now, you need to type the following block of code:

Set TaxApplication = CreateObject("Tax.Application")
newamount = TaxApplication.PAtax(100)
MsgBox newamount
Set TaxApplication = Nothing
End Sub

Now, if you press F5, Visual Basic should display a message box showing the result of our simple tax operation, which, in our case, is 107.

To unregister your COM object you can either pass the argument --unregister when calling your script, or you can use the following line of code inside your Python program:

>>> win32com.server.register.UnregisterClasses(TaxApplication)

A very comprehensive example of using Microsoft Word and Excel is stored in the testMSOffice.py file, which is part of your PythonWin distribution. It’s worth checking out!!!

Word

The following code implements a simple wrapper for the Microsoft Word Application. To test it you need to create a Word document and replace its path in the code. The program will open this file, replace the first occurrence of the string "#name#" within the file, add a small bit of text to the end of the line, and print the file.

import win32com.client

wdLine = 5  # value of Word's wdLine constant

class WordApp:
    def __init__(self):
        self.app = win32com.client.Dispatch("Word.Application")

    def open(self, document_file):
        self.app.Documents.Open(document_file)

    def replace(self, source_selection, new_text):
        # Find the first occurrence and type the new text over it
        self.app.Selection.HomeKey(Unit=wdLine)
        self.app.Selection.Find.Text = source_selection
        self.app.Selection.Find.Execute()
        self.app.Selection.TypeText(Text=new_text)

    def addtext(self, new_text):
        # Append text at the end of the current line
        self.app.Selection.EndKey(Unit=wdLine)
        self.app.Selection.TypeText(Text=new_text)

    def printdoc(self):
        self.app.Application.PrintOut()

    def close(self):
        # The built-in False replaces the old "False = 0" definition
        self.app.ActiveDocument.Close(SaveChanges=False)

worddoc = WordApp()
worddoc.open(r"s:\template.doc")
worddoc.replace("#name#", "Andre Lessa")
worddoc.addtext(" What do you want to learn ?")
worddoc.printdoc()
worddoc.close()

If you evaluate the attribute that holds the object returned by Dispatch (self.app in the class above), the result is the COM object name:

<COMObject Word.Application>

This object is an example of a dynamic dispatch object. The provided name indicates that the object is a generic COM object, and affirms that Python doesn’t know anything about it, except the name that you used to create it. All the information about this object is built dynamically.

Besides dynamic dispatches, you can also use static dispatches, which involve the generation of a .py file that contains support for the specific COM object. In CORBA speak, this is called stub generation, or IDL compilation.

In order to generate the Python files that support a specific COM object, you need to execute win32com\client\makepy.py. A list of Type Libraries will be displayed. Select one (for example, 'Microsoft Word 8.0 Object Library') and click OK. You can also call the makepy.py program directly from the command prompt by typing makepy.py "Microsoft Word 8.0 Object Library".

Now, Python knows exactly how to handle the interfaces before invoking the COM object. Although, you can’t see any differences, you can check that Python really knows something else now by querying the COM object:

>>> import win32com.client
>>> wd = win32com.client.Dispatch("Word.Application")
>>> wd
<win32com.gen_py.Microsoft Word 8.0 Object Library._Application>

Note that Python knows the explicit type of the object now.

All the compiled information is stored in a file in the win32com/gen_py directory. You probably won’t understand the filename because it is encoded. Actually, you don’t need to use this file at all. All the interface information is made available via win32com.client.Dispatch and win32com.client.constants.

If you really need to identify the name of the module that was generated, you can use the win32com.client.gencache module. This module has two functions: GetModuleForCLSID and GetModuleForProgID that return Python module objects you can use in your code.

makepy.py also automatically installs all the constants generated from a type library in an object called win32com.client.constants. After creating the object, all the constants become available to you.

In the previous example, we had to initialize the constant wdLine, because the constants were not available. Now, after running makepy.py, you can replace the line

self.app.Selection.EndKey(Unit=wdLine)

with

self.app.Selection.EndKey(Unit=win32com.client.constants.wdLine)

and remove the initialization line

wdLine = 5

The next example uses the wdWindowStateMaximize constant to maximize Microsoft Word:

>>> wd.WindowState = win32com.client.constants.wdWindowStateMaximize

Excel

Next, we’ll see how to create COM clients using Microsoft Excel. The principle is very simple. Actually, it is the same one used previously for wrapping Microsoft Word, as it is demonstrated in the following example.

>>> import win32com.client
>>> excelapp = win32com.client.Dispatch("Excel.Application")
>>> excelapp.Visible = 1

Note that we have to change the Visible property in order to see the Excel application. The default behavior is to hide the application window because it saves processor cycles. However, the object is available to any COM client that asks for it.

As you can see in the example, Excel’s progid is Excel.Application.

After you create the Excel object, you are able to call its methods and set its properties. Keep in mind that the Excel Object Model has the following hierarchy: Application, WorkBook, Sheet, Range, and Cell.

Let’s play a little with Excel. The following statements write to the workbook:

>>> excelapp.Range("A1:C1").Value = "Hello", "Python", "World"
>>> excelapp.Range("A2:A2").Value = 'SPAM! SPAM! SPAM!'

Note that you can also use tuples to transport values:

>>> excelapp.Range("A1:C1").Value = ('Hello', 'Python', 'World')

To print a selected area, you need to use the PrintOut() method:

>>> excelapp.Range("A1:C1").PrintOut()

What about entering date and time information? The following examples will show you how to set the Date/Time format for Excel cells.

First, call Excel’s time function:

>>> excelapp.Cells(4,3).Value = "=Now()"
>>> excelapp.Columns("C").EntireColumn.AutoFit()

The AutoFit() function is required in order to display the information, instead of showing "#######". Now, use Python to set the time you want:

>>> import time, pythoncom
>>> excelapp.Cells(4,1).Value = pythoncom.MakeTime(time.time())
>>> excelapp.Range("A4:A4").NumberFormat = "d/mm/yy h:mm"
>>> excelapp.Columns("A:C").EntireColumn.AutoFit()

Note that the Cells() structure works like a numeric array. That means that instead of using Excel’s notation of letters and numbers, you need to think of the spreadsheet as a numeric matrix.
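
To move between Excel's letter-and-number notation and the (row, column) pairs that Cells() expects, a small pure-Python converter can help. The function names below are made up for this sketch and are not part of win32com or Excel.

```python
import re


def a1_to_row_col(ref):
    """Convert a reference such as 'C4' to a (row, column) pair, both 1-based."""
    match = re.fullmatch(r"([A-Za-z]+)([1-9][0-9]*)", ref)
    if match is None:
        raise ValueError(f"not a simple A1-style reference: {ref!r}")
    letters, digits = match.groups()
    col = 0
    for ch in letters.upper():
        # Column letters form a base-26 number without a zero digit.
        col = col * 26 + (ord(ch) - ord("A") + 1)
    return int(digits), col


def row_col_to_a1(row, col):
    """Convert a 1-based (row, column) pair back to A1 notation."""
    if row < 1 or col < 1:
        raise ValueError("row and col must be >= 1")
    letters = ""
    while col:
        col, rem = divmod(col - 1, 26)
        letters = chr(ord("A") + rem) + letters
    return f"{letters}{row}"
```

For instance, a1_to_row_col("C4") gives (4, 3), which matches the Cells(4,3) call used above for cell C4.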

Visual Basic

In order to implement a COM object using Python you need to implement a Python class that exposes the functionality to be exported. It is also necessary to assign two special attributes to this class, as required by the Python COM implementation.

The first attribute is the Class ID (_reg_clsid_). This attribute must contain a UUID, which can be generated by calling the pythoncom.CreateGuid() function. The other attribute is a friendly string that you will use to call the COM object (_reg_progid_), as follows:

class COMCalcServer:
    _reg_clsid_ = '{C76BEA61-3B39-11D4-8A7C-444553546170}'
    _reg_progid_ = 'COMCALCSERVER.VERSION1'
    _public_methods_ = ['mul', 'div', 'add', 'sub']

Other interesting attributes are

• _public_methods_: A list of all method names that you want to publicly expose to remote COM clients.

• _public_attrs_: A list of all attribute names to be exposed to remote COM clients.

• _readonly_attrs_: A list of all attributes that can be accessed, but not set. This list should be a subset of the list exposed by _public_attrs_.
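
As an illustration of how these attributes fit together, here is a hypothetical class (Thermostat is invented for this sketch and is not registered, so it carries no _reg_clsid_) together with a small helper that checks the subset rule for read-only attributes.

```python
class Thermostat:
    """Hypothetical server class showing the three lists side by side."""
    _public_methods_ = ["reset"]
    _public_attrs_ = ["target", "current"]
    _readonly_attrs_ = ["current"]  # readable by clients, but not settable

    def __init__(self):
        self.target = 20.0
        self.current = 18.5

    def reset(self):
        self.target = 20.0
        return self.target


def readonly_outside_public(cls):
    """Return read-only attribute names that are missing from _public_attrs_."""
    public = set(getattr(cls, "_public_attrs_", []))
    readonly = set(getattr(cls, "_readonly_attrs_", []))
    return sorted(readonly - public)
```

An empty list from readonly_outside_public(Thermostat) means every read-only name is also listed as public, as the last bullet recommends.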

After creating the class, you need to register your COM object. The general technique is to run the module that implements the COM object as a script, in order to register the object:

import win32com.server.register
win32com.server.register.UseCommandLine(COMCalcServer)

Notice that you need to pass the class object, and not a class instance. After the UseCommandLine() function has been successfully executed, the following message is printed:

Registered: COMCALCSERVER.VERSION1

When you have your COM object up and running, any automation-capable language, such as Python, Visual Basic, Delphi, or Perl, can use it.

The following example is a complete program that implements a calculator. First, you need to collect the unique IDs for your class:

Python 1.5.2 (#0, Apr 13 1999, 10:51:12) [MSC 32 bit (Intel)] on Win32

Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam

>>> import pythoncom

>>> print pythoncom.CreateGuid()

<iid:{C76BEA60-3B39-11D4-8A7C-444553546170}>

After assigning the new clsid value to the _reg_clsid_ attribute, we have the following program:

# File: comcalcserver.py
class COMCalcServer:
    _reg_clsid_ = '{C76BEA61-3B39-11D4-8A7C-444553546170}'
    _reg_progid_ = 'COMCALCSERVER.VERSION1'
    _public_methods_ = ['mul', 'div', 'add', 'sub']

    def mul(self, arg1, arg2):
        return arg1 * arg2

    def div(self, arg1, arg2):
        return arg1 / arg2

    def add(self, arg1, arg2):
        return arg1 + arg2

    def sub(self, arg1, arg2):
        return arg1 - arg2

import win32com.server.register
win32com.server.register.UseCommandLine(COMCalcServer)

Make sure that all methods are included in the _public_methods_. Otherwise, the program will fail. Now, go to the DOS prompt and execute the program to register the COM object:

C:\python>c:\progra~1\python\python comcalcserver.py
Registered: COMCALCSERVER.VERSION1
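
Before registering, it can be handy to check that the _public_methods_ list and the methods actually defined on the class agree. The two helpers below are hypothetical, plain-Python checks and are not part of win32com.

```python
import inspect


def unlisted_methods(cls):
    """Public methods defined on cls that are missing from _public_methods_."""
    listed = set(getattr(cls, "_public_methods_", []))
    defined = {
        name
        for name, member in vars(cls).items()
        if inspect.isfunction(member) and not name.startswith("_")
    }
    return sorted(defined - listed)


def unknown_methods(cls):
    """Names in _public_methods_ that do not match a callable on cls."""
    return sorted(
        name
        for name in getattr(cls, "_public_methods_", [])
        if not callable(getattr(cls, name, None))
    )
```

Both functions returning an empty list for a server class suggests that the exposed list and the implementation are in step.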

To create the Visual Basic COM client, you need to create a Visual Basic Form that contains all the implementation details (see Figure 7.1).

Figure 7.1. A design for creating the Visual Basic Form.


Most of the time, the initialization steps are stored in the Form_Load section in order to be executed when the application starts:

Dim COMCalcServer as Object

Set COMCalcServer = CreateObject("COMCALCSERVER.VERSION1")

Remember to always deallocate the objects before exiting the application. It's good practice to do it in the Form_Unload section:

Set COMCalcServer = Nothing

Public COMCalcServer As Object

Private Sub Form_Unload(Cancel As Integer)
    Set COMCalcServer = Nothing
End Sub

Sub InitCOMCalcServer()
    Set COMCalcServer = CreateObject("COMCALCSERVER.VERSION1")
    Exit Sub
End Sub

Private Sub Command1_Click()
    Dim result As Double
    result = COMCalcServer.Mul(Val(Text1), Val(Text2))
    MsgBox Text1 & "*" & Text2 & "=" & Str(result)
End Sub

Private Sub Command2_Click()
    Dim result As Double
    result = COMCalcServer.Div(Val(Text1), Val(Text2))
    MsgBox Text1 & "/" & Text2 & "=" & Str(result)
End Sub

Private Sub Command3_Click()
    Dim result As Double
    result = COMCalcServer.Add(Val(Text1), Val(Text2))
    MsgBox Text1 & "+" & Text2 & "=" & Str(result)
End Sub

Private Sub Command4_Click()
    Dim result As Double
    result = COMCalcServer.Sub(Val(Text1), Val(Text2))
    MsgBox Text1 & "-" & Text2 & "=" & Str(result)
End Sub

Private Sub Form_Load()
    Text1 = 0
    Text2 = 0
    Command1.Caption = "Mul"
    Command2.Caption = "Div"
    Command3.Caption = "Add"
    Command4.Caption = "Sub"
    InitCOMCalcServer
End Sub

While executing the application (see Figure 7.2), your Visual Basic application will be talking to the Python COM object behind the scenes.

Figure 7.2. A Visual Basic executable running.


The next example is based on the previous one. This one implements a callback function. The VB program calls a Python function that clearly manipulates the Visual Basic Form object.

You need to add or replace the following functions in the Visual Basic code:

Sub InitCOMCalcServer()
    Set COMCalcServer = CreateObject("COMCALCSERVER.VERSION2")
    Exit Sub
End Sub

Private Sub Form_Load()
    Text1 = 0
    Text2 = 0
    Command1.Caption = "Mul"
    Command2.Caption = "Div"
    Command3.Caption = "Add"
    Command4.Caption = "Sub"
    InitCOMCalcServer
    COMCalcServer.updatecaption Me
End Sub

The following new function must be created in the Python code, too. The VB function call uses the keyword Me to send a reference of the Form object to Python’s updatecaption() method:

def updatecaption(self, object):
    Form = win32com.client.Dispatch(object)
    Form.Caption = "Python COM Routine is Active"

The following code is a full replacement to be used with this example. Remember to create a new _reg_clsid_ for this new example.

# File: comcalcserver2.py
class COMCalcServer:
    _reg_clsid_ = '{C76BEA64-3B39-11D4-8A7C-444553546170}'
    _reg_progid_ = 'COMCALCSERVER.VERSION2'
    _public_methods_ = ['mul', 'div', 'add', 'sub', 'updatecaption']

    def mul(self, arg1, arg2):
        return arg1 * arg2

    def div(self, arg1, arg2):
        return arg1 / arg2

    def add(self, arg1, arg2):
        return arg1 + arg2

    def sub(self, arg1, arg2):
        return arg1 - arg2

    def updatecaption(self, object):
        import win32com.client
        # Wrap the Form reference passed from VB and change its caption.
        Form = win32com.client.Dispatch(object)
        Form.Caption = "Python COM Routine is Active"

import win32com.server.register
win32com.server.register.UseCommandLine(COMCalcServer)

The result of running this example is shown in Figure 7.3.

Figure 7.3. Python/Visual Basic callback implementation.

Every script that defines a COM class can also be used to unregister that class. When you pass the --unregister argument to the script, the registration code removes all references to the class from the Windows Registry.

C:\python>python comcalcserver2.py --unregister
Unregistered: COMCALCSERVER.VERSION2

Handling Numbers and Strings

When a Python method that is part of a COM server interface returns a number or a string, as in the next few lines of code:

def GetNumber(self):
    return 25

def GetString(self, name):
    return 'Your name is %s' % name

the COM client written in Visual Basic must handle the methods as follows:

Dim num as Variant
num = Server.GetNumber

Dim str as Variant
str = Server.GetString("Andre")

MsgBox str

Python and Unicode do not really work well together in the current version of Python. All strings that come from COM are actually Unicode objects rather than string objects. To make the previous code work in a COM environment, the last line of the GetString() method must become:

return 'Your name is %s' % str(name)

Converting name to str(name) forces the Unicode object into a native Python string object. In Python 2.0, if win32com starts using native Python Unicode strings, the str() call will cause the Unicode string to be re-encoded in UTF-8.

Handling Lists and Tuples

When a Python method that is part of a COM server interface returns a list or a tuple, the COM client written in Visual Basic must handle the method as follows:

Dim arry as Variant
arry = Server.GetList
Debug.Print UBound(arry)
For Each item in arry
    Debug.Print item
Next
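The Python side of this case is not shown here. A minimal sketch of what such a server method might look like follows; the class name ListServer and the returned values are hypothetical, only the GetList name comes from the Visual Basic call, and the COM registration attributes are left out so the class can be tried as plain Python (written for Python 3):

```python
class ListServer:
    """Hypothetical server class; only the GetList name comes from the VB client."""

    _public_methods_ = ['GetList']

    def GetList(self):
        # A list or tuple returned from a COM method reaches the VB client
        # as a Variant array, which is why VB uses UBound and For Each on it.
        return [1, 2, 3, 4]


server = ListServer()
items = server.GetList()

# Rough Python equivalents of the VB client code above.
upper_bound = len(items) - 1   # plays the role of UBound(arry) for a zero-based array
print(upper_bound)
for item in items:
    print(item)
```

Returning a tuple instead of a list should make no difference to the client, since both are sequences on the Python side.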

Delphi

Using Delphi to implement a COM client is very similar to using Visual Basic. First, you need to register the COM class. The following code is similar to the one used for the Visual Basic example.

# File: comcalcserver.py
class COMCalcServer:
    _reg_clsid_ = '{C76BEA61-3B39-11D4-8A7C-444553546170}'
    _reg_progid_ = 'COMCALCSERVER.VERSION1'
    _public_methods_ = ['mul', 'div', 'add', 'sub']

    def mul(self, arg1, arg2):
        return arg1 * arg2

    def div(self, arg1, arg2):
        return arg1 / arg2

    def add(self, arg1, arg2):
        return arg1 + arg2

    def sub(self, arg1, arg2):
        return arg1 - arg2

import win32com.server.register
win32com.server.register.UseCommandLine(COMCalcServer)

Now, you need to create a Delphi form to support all the COM client activities (see Figure 7.4).

Figure 7.4. Delphi design: A form with three Edit boxes and four buttons.

unit Calcform;

interface

uses
  Windows, Messages, SysUtils, Classes, Graphics, Controls, Forms, Dialogs,
  StdCtrls, OLEAuto;

type
  TForm1 = class(TForm)
    Button1: TButton;
    Edit1: TEdit;
    Edit2: TEdit;
    Edit3: TEdit;
    Button2: TButton;
    Button3: TButton;
    Button4: TButton;
    procedure FormCreate(Sender: TObject);
    procedure Button1Click(Sender: TObject);
    procedure Button4Click(Sender: TObject);
    procedure Button3Click(Sender: TObject);
    procedure Button2Click(Sender: TObject);
  private
    { Private declarations }
  public
    { Public declarations }
  end;

var
  Form1: TForm1;
  COMCalcServer: Variant;

implementation

procedure TForm1.FormCreate(Sender: TObject);
begin
  try
    COMCalcServer := CreateOleObject('COMCALCSERVER.VERSION1');
    Form1.Caption := 'Python COM Routine is Active';
  except
    MessageDlg('An error has happened!', mtError, [mbOk], 0);
    Application.Terminate;
  end;
end;

procedure TForm1.Button1Click(Sender: TObject);
var
  tmp1float, tmp2float : Real;
  tmp3string : String;
begin
  tmp1float := StrToFloat(Edit1.text);
  tmp2float := StrToFloat(Edit2.text);
  tmp3string := FloatToStr(COMCalcServer.mul(tmp1float, tmp2float));
  Edit3.text := tmp3string;
end;

procedure TForm1.Button2Click(Sender: TObject);
var
  tmp1float, tmp2float : Real;
  tmp3string : String;
begin
  tmp1float := StrToFloat(Edit1.text);
  tmp2float := StrToFloat(Edit2.text);
  tmp3string := FloatToStr(COMCalcServer.div(tmp1float, tmp2float));
  Edit3.text := tmp3string;
end;

procedure TForm1.Button3Click(Sender: TObject);
var
  tmp1float, tmp2float : Real;
  tmp3string : String;
begin
  tmp1float := StrToFloat(Edit1.text);
  tmp2float := StrToFloat(Edit2.text);
  tmp3string := FloatToStr(COMCalcServer.add(tmp1float, tmp2float));
  Edit3.text := tmp3string;
end;

procedure TForm1.Button4Click(Sender: TObject);
var
  tmp1float, tmp2float : Real;
  tmp3string : String;
begin
  tmp1float := StrToFloat(Edit1.text);
  tmp2float := StrToFloat(Edit2.text);
  tmp3string := FloatToStr(COMCalcServer.sub(tmp1float, tmp2float));
  Edit3.text := tmp3string;
end;

end.

After compiling and running the application, you should see the interface shown in Figure 7.5.

Figure 7.5. Delphi Calculator Application.

Question:

I wanted to convert a Word document to text, so I used this script.

import win32com.client

app = win32com.client.Dispatch('Word.Application')
doc = app.Documents.Open(r'C:\Users\SBYSMR10\Desktop\New folder (2)\GENERAL DATA.doc')
content=doc.Content.Text
app.Quit()
print content

This gives me the following result:

[image]

Now I want to convert this text into a list containing all of its items. I used:

content = " ".join(content.replace(u"\xa0", " ").strip().split())

EDIT

When I do this, I get:

[image]

This is not a list. What is the problem? What is that big character?

Best answer:

Word documents are not text, they are documents: they contain control information (such as formatting) as well as text. If you ignore the control information, the text becomes useless.

So you need to dig into the details of how to navigate the document's control structure to find the pieces of text you are interested in, and then get the text content of those structures.

Note: you will find that Word is very complex. If you can, also consider these two approaches:

  • Save the Word document as HTML from Word. It will lose some formatting, but lists will stay intact. HTML is much easier to parse and understand than Word.

  • Save the document as OOXML (exists since at least Office 10; the extension is .docx). This is a ZIP archive with XML documents inside. The XML is again easier to parse and understand than the full Word document, but harder than the HTML version.
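The second approach can be sketched with the standard library alone. The following is a minimal illustration, not a full parser: it assumes the main text lives in word/document.xml under the WordprocessingML namespace, the helper names docx_paragraphs and make_sample_docx are made up for this example, and the sample archive built in memory is not a complete .docx that Word would open (written for Python 3):

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by word/document.xml
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def docx_paragraphs(source):
    """Return the text of each paragraph in a .docx (path or file object)."""
    with zipfile.ZipFile(source) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for para in root.iter("{%s}p" % W_NS):
        # The text of one paragraph is spread over one or more <w:t> elements.
        runs = [node.text or "" for node in para.iter("{%s}t" % W_NS)]
        paragraphs.append("".join(runs))
    return paragraphs


def make_sample_docx(texts):
    """Build a minimal in-memory archive holding only word/document.xml.

    Hypothetical test fixture: enough to exercise docx_paragraphs,
    not a valid document for Word itself.
    """
    body = "".join("<w:p><w:r><w:t>%s</w:t></w:r></w:p>" % t for t in texts)
    xml = '<w:document xmlns:w="%s"><w:body>%s</w:body></w:document>' % (W_NS, body)
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("word/document.xml", xml)
    buffer.seek(0)
    return buffer


sample = make_sample_docx(["Item one", "Item two"])
print(docx_paragraphs(sample))
```

Because the result is already a Python list with one entry per paragraph, no further splitting of a flat text string is needed.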

Answer #1

Now I want to convert this text into a list containing all of its items. I used

content = " ".join(content.replace(u"\xa0", " ").strip().split())

This is not a list. What is the problem?

The .join method always returns a string. It expects you to pass it a list, and it then joins the items of that list using the given separator (" " in your case).

Besides that, see what Aaron Digulla said.
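The difference between the two calls is easy to see side by side. A small sketch, using a made-up sample string in place of the real document text (written for Python 3):

```python
# Hypothetical stand-in for doc.Content.Text; Word separates paragraphs with "\r".
raw = u"GENERAL\xa0DATA \r  item one\ritem two  "

# split() with no arguments splits on any run of whitespace and returns a list.
words = raw.replace(u"\xa0", u" ").strip().split()
print(type(words).__name__, words)

# join() glues the list back together, so the result is a single string again.
joined = u" ".join(words)
print(type(joined).__name__, joined)
```

To end up with a list, stop after split() and drop the join() call.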

Answer #3

You can simply parse the Word document line by line. It is not elegant, and it is certainly not pretty, but it works. Here is a snippet from something similar that I did in Python 3.3.

import os
directory='your/path/to/file/'
file='yourword.doc'
# Open the raw .doc file in binary mode and iterate over its byte lines.
doc=open(directory+file,'r+b')
for line in doc:
    line2=str(line)
    print(line2)

I used a regular expression to get what I needed. This code reads every line of your Word document (formatting and all) and converts it into plain strings that you can work with. I am not sure whether this is still useful (this post is a couple of years old), but at least it parses the Word document. After that, it is just a matter of getting rid of the lines you do not want before writing to a txt file.

This code

import win32com.client as client
import pythoncom, os

pythoncom.CoInitialize()
word = client.Dispatch('Word.Application')
doc = word.Documents.Open(os.path.abspath(document_word))
doc.SaveAs(os.path.abspath(document_pdf), FileFormat=17)
doc.Close()
word.Quit()
pythoncom.CoUninitialize()

does not work on Windows Server 2012, which I connect to via Remote Desktop. Locally, on my own computer, it works in two cases: when I run it with Django's built-in server, or under Apache provided I do what is described here: http://stackoverflow.com/questions/26991609/python… That advice fixes everything locally. On the server, however, where everyone works under their own domain account, the problem persists even after I specified myself in the DCOM settings and ran Apache under my own credentials. Note that Word does start, but I cannot get access to it; as far as I understand, this is a permissions issue. This advice, stackoverflow.com/questions/4803850/win32com-excel…, has no effect.

Here are two ways:

Use win32com
Use docx

1. Use the win32com extension package

Only works on Windows.

Code:

#coding=utf-8
import win32com
from win32com.client import Dispatch, DispatchEx
word = Dispatch("Word.Application")  # open the Word application
# word = DispatchEx("Word.Application")  # start a separate process
word.Visible = 0  # run in the background, do not show the window
word.DisplayAlerts = 0  # no warnings
path = "g:/workspace/python/tmp/test.docx"  # Word file path
doc = word.Documents.Open(FileName=path, Encoding="gbk")
# content = doc.Range(doc.Content.Start, doc.Content.End)
# content = doc.Range()
print "----------------"
print "Number of paragraphs:", doc.Paragraphs.Count
# Use a subscript to traverse paragraphs
for i in range(len(doc.Paragraphs)):
  para = doc.Paragraphs[i]
  print para.Range.Text
print "-------------------------"
# Iterate through paragraphs directly
for para in doc.Paragraphs:
  print para.Range.Text
  # print para  # can only be used when the document content is entirely English
doc.Close()  # close the Word document
# word.Quit()  # close the Word program

2. Use the docx extension package

Advantage: it does not depend on the operating system, so it is cross-platform.

Installation:

pip install python-docx

Code:

import docx
def read_docx(file_name):
  doc = docx.Document(file_name)
  # Join the text of all paragraphs, one paragraph per line.
  content = "\n".join([para.text for para in doc.paragraphs])
  return content

Create a table

#coding=utf-8
import docx
doc = docx.Document()
table = doc.add_table(rows=1, cols=3)  # create a table with one header row
hdr_cells = table.rows[0].cells  # get all cells in row 0
hdr_cells[0].text = "name"
hdr_cells[1].text = "id"
hdr_cells[2].text = "desc"
# Add three rows of data
data_lines = 3
for i in range(data_lines):
  cells = table.add_row().cells
  cells[0].text = "name%s" % i
  cells[1].text = "id%s" % i
  cells[2].text = "desc%s" % i
# Add a second table and fill it with numbers
rows = 2
cols = 4
table = doc.add_table(rows=rows, cols=cols)
val = 1
for i in range(rows):
  cells = table.rows[i].cells
  for j in range(cols):
    cells[j].text = str(val * 10)
    val += 1
doc.save("tmp.docx")

Read a table

#coding=utf-8
import docx
doc = docx.Document("tmp.docx")
for table in doc.tables:  # iterate over all tables
  print "---- table ------"
  for row in table.rows:  # traverse all rows of the table
    # row_str = "\t".join([cell.text for cell in row.cells])  # one row of data
    # print row_str
    for cell in row.cells:
      print cell.text, "\t",
    print
