Microsoft office interop word visual studio

  • Question

  • I have read about many ways to solve this problem, but I still can’t find the Microsoft.Office.Interop.Word reference.

    My PC runs Windows 10 with Visual Studio 2017.

    I can find it in C:\Windows\assembly. Does that mean it is installed?

Answers

  • Did you install the Visual Studio Tools for Office?

    — Wayne


All replies

  • Hi,

    The reference is available in projects that target .NET Framework 4.0 or later. If your project targets an earlier version, it may not be listed; in that case you can download Microsoft.Office.Interop.Word.dll yourself.

    In your project, right-click «References», select «Add», and then import the namespace:

    Imports Microsoft.Office.Interop.Word

    Best Regards,

    Alex



  • With VS 2015 and Office 2016, I add the Object Library (as a COM reference).

  • There are times when an assembly will not show up and you need to browse for it. Browse under the following folder, where the version-specific parts (bold in the original post) reflect which version of Office is installed.

    C:\Program Files (x86)\Microsoft Visual Studio 14.0\Visual Studio Tools for Office\PIA\Office15

    You can also add the assemblies to «Custom Component Set» in Object Browser (the default hot key is F2). Once added, select the library and press the button that adds it to the project references.

    I do this for all the ones shown, as I use them often.



  • Hi, Alex-Li-MSFT

    I checked, and my project targets .NET 4.5.2, but I can’t find Microsoft.Office.Interop.Word in the VS assemblies list.

    Is it possible that the problem is caused by my Office being Office 365?

    Thanks for your reply.


  • Hi, Casorix31

    I have already added this COM reference, but there are still some code errors.

    I tried to add a table to the document, but the error says that Tables is not a member of Document.

    I searched for this problem and was told that I need to add Microsoft.Office.Interop.Word first.

    Thanks for your reply.
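
For readers who hit the same error: Tables is a member of the Document type in Microsoft.Office.Interop.Word, so the message may mean the code is compiled against some other Document type. Below is a minimal C# sketch, assuming the interop reference has been added and Word is installed; the class name and the output path are hypothetical.

```csharp
using System.Runtime.InteropServices;
using Word = Microsoft.Office.Interop.Word;

class AddTableSketch
{
    static void Main()
    {
        Word.Application app = null;
        try
        {
            app = new Word.Application();
            Word.Document doc = app.Documents.Add();

            // Insert a 2 x 3 table at the very start of the new document.
            Word.Range start = doc.Range(0, 0);
            Word.Table table = doc.Tables.Add(start, 2, 3);
            table.Cell(1, 1).Range.Text = "Header";

            // Hypothetical output path; SaveAs2 needs the Word 2010 (or later) interop assembly.
            doc.SaveAs2(@"C:\Temp\TableDemo.docx");
            doc.Close();
        }
        finally
        {
            if (app != null)
            {
                app.Quit();
                Marshal.FinalReleaseComObject(app);
            }
        }
    }
}
```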



  • Hi, WayneAKing

    Thanks for your help, I finally found it.

  • Hi,

    I am glad you found a solution. We appreciate that you shared it and marked it as the answer.

    Best Regards,

    Alex



I got the sources of a .NET project that I am trying to compile. However, the project references the Microsoft.Office.Interop.Word namespace from Office 2010, which I cannot find anywhere.
I was able to download a microsoft.office.interop.word.dll file, but it is apparently the Office 2007 version, because the project still doesn’t compile: it calls Document.SaveAs2, which comes from the Office 2010 library.
I have Office 2007 on my computer and Visual Studio 2012 Express for Desktop.

Could you please explain how this works? How was I able to download the DLL, yet I cannot find the one from Office 2010? How was my client able to compile the project without this DLL? Does Visual Studio automatically «connect» to the Microsoft Office libraries, if they are installed, when compiling?

Thank you for your help.


asked Mar 9, 2015 at 14:24

mentinet

You shouldn’t need to search for the DLL on your local system yourself if the assemblies are installed correctly. See the following links: the first explains how to download and install the Office interop libraries without installing Office, and the second details how to add the assemblies to your project correctly.

Install Office Primary Interop Assemblies

Office Primary Interop Assemblies

For a further reference here are some pictures detailing how to add the dll correctly:

In your project, right-click on «References» and select «Add» and then «Reference».

Add References

Next, select «Extensions» in the Reference Manager and scroll to find the correct DLL. For Office 2010, the matching Microsoft.Office.Interop.Word.dll is the version 14 one.

Add dll
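
To confirm which interop version a project actually compiled against, the assembly can be inspected at run time. This is a minimal sketch, not part of the original answer; it assumes the reference has «Embed Interop Types» set to False (with embedded types the interop type lives in your own assembly and the check is not meaningful), and the class name is hypothetical.

```csharp
using System;
using System.Reflection;
using Word = Microsoft.Office.Interop.Word;

class InteropVersionSketch
{
    static void Main()
    {
        // Name and version of the assembly that defines the Word interop types.
        AssemblyName name = typeof(Word.Application).Assembly.GetName();
        Console.WriteLine("{0} {1}", name.Name, name.Version);

        // SaveAs2 is declared on the _Document interface in the Office 2010 (version 14) assembly.
        bool hasSaveAs2 = typeof(Word._Document).GetMethod("SaveAs2") != null;
        Console.WriteLine("SaveAs2 available: " + hasSaveAs2);
    }
}
```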

answered Mar 9, 2015 at 14:55

Bilal Bashir

answered Jul 19, 2018 at 20:31

Matheus Miranda

Now that Visual Studio 2019 is out, you can install Microsoft Office interop libraries as part of an optional bundled component called Visual Studio Tools for Office (VSTO).

Microsoft has made this much easier, and you don’t need to reference them in the GAC!

answered May 7, 2019 at 4:34

CrazyTim

If you have Office 2016 installed, you can get the file Microsoft.Office.Interop.Word.dll here:

C:\Program Files (x86)\Microsoft Visual Studio\Shared\Visual Studio Tools for Office\PIA\Office15\Microsoft.Office.Interop.Word.dll

answered Oct 19, 2020 at 13:00

Ashok Jingar

The C# programming language includes capabilities that make working with Microsoft Office API objects easier. With the advent of named and optional arguments, introduction of the dynamic type in .NET, and the ability to pass arguments to the reference parameters in COM methods, C# 4.0 quickly became the language of choice for working with COM and Interop objects.

This article talks about office interop objects in C# and how you can use them to interact with Microsoft Word and Microsoft Excel. Code examples are also provided to illustrate the concepts covered.

Prerequisites for working with Interop Objects

Visual Studio 2019 or Visual Studio 2022 must be installed on your computer to work with the code samples demonstrated in this C# tutorial. In this example, we will be using Visual Studio 2022. If you don’t have it installed in your computer, you can download it from here.

As of this writing, Visual Studio 2022 RC 2 has been released. You should also have Microsoft Office Excel 2007 or Microsoft Office Word 2007 (or their later versions) installed on your computer.


How to Create a New Console Application in Visual Studio

In this section we will examine how we can create a new console application project in Visual Studio 2022. Assuming Visual Studio 2022 is installed on your system, adhere to the steps given below to create a new Console Application project:

  • Start the Visual Studio 2022 IDE.
  • Click on “Create new project.”
  • In the “Create new project” page, select C# in the language drop down list, Windows from the Platforms list and Console from the “Project types” list.
  • Select Console App (.NET Framework) from the project templates displayed.

Create New Project in Visual Studio

  • Click Next.
  • In the “Configure your new project” screen, specify the project’s name and the location where you would want the project to be created.
  • Before you move on to the next screen, you can optionally select the “Place solution and project in the same directory” checkbox.

Configure Visual Studio Projects

  • Click Next.
  • In the Additional Information screen, specify the Framework version you would like to use. We will use .NET Framework 4.8 in this example.

Configure VS Projects

  • Click Create to complete the process.

This will create a new .NET Framework Console application project in Visual Studio 2022. We will use this project in the sections that follow.

Install NuGet Packages

Install the following libraries from NuGet using the NuGet Package Manager or from the NuGet Package Manager Console:

Microsoft.Office.Interop.Word
Microsoft.Office.Interop.Excel


How to Program Office Interop Objects in C#

In this section we will examine how to work with Office Interop objects and use them to connect to Microsoft Word and Excel and read/write data.

You must add the following using directives in your program for working with Word and Excel respectively when using Office interop objects:

using Microsoft.Office.Interop.Excel;
using Microsoft.Office.Interop.Word;
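
Importing both namespaces at once makes unqualified names such as Application and Range ambiguous, because both libraries define them. One common way around this is a pair of namespace aliases. The sketch below is an illustration rather than the article's own code; it assumes both interop references are present and both applications are installed, and the class name is hypothetical.

```csharp
using System;
using System.Runtime.InteropServices;
using Excel = Microsoft.Office.Interop.Excel;
using Word = Microsoft.Office.Interop.Word;

class AliasSketch
{
    static void Main()
    {
        Excel.Application excelApplication = null;
        Word.Application wordApplication = null;
        try
        {
            // The alias prefix makes it explicit which Application type is meant.
            excelApplication = new Excel.Application();
            wordApplication = new Word.Application();
            Console.WriteLine(excelApplication.Name);
            Console.WriteLine(wordApplication.Name);
        }
        finally
        {
            if (wordApplication != null)
            {
                wordApplication.Quit();
                Marshal.FinalReleaseComObject(wordApplication);
            }
            if (excelApplication != null)
            {
                excelApplication.Quit();
                Marshal.FinalReleaseComObject(excelApplication);
            }
        }
    }
}
```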

Working with Excel Interop Objects in C#

To begin, create a sample Excel document named Test.xlsx in the root directory of the D:\ drive. We will use this file in the following example.

To communicate with Excel, create an instance of the Application class from the Microsoft.Office.Interop.Excel library. Because the Word library also defines an Application type, the name is fully qualified here:

Microsoft.Office.Interop.Excel.Application excelApplication = new Microsoft.Office.Interop.Excel.Application();

The next step is to create an instance of the Workbook class to access a Workbook in Excel. You can create an instance of Workbook using the following code:

Workbook excelWorkBook = excelApplication.Workbooks.Open(@"D:\Test.xlsx");

To read the name of the workbook, you can use the Name property of the workbook instance as shown in the code snippet given below:

string workbookName = excelWorkBook.Name;

The following code listing illustrates how you can display the value of a cell in the first worksheet of the Excel document. Here row and column are 1-based integer indexes, so both are 1 for the first cell:

int worksheetcount = excelWorkBook.Worksheets.Count;
if (worksheetcount > 0) {
  Worksheet worksheet = (Worksheet) excelWorkBook.Worksheets[1];
  string worksheetName = worksheet.Name;
  // Range is qualified because the Word library defines a Range type as well.
  var data = ((Microsoft.Office.Interop.Excel.Range) worksheet.Cells[row, column]).Value;
  Console.WriteLine(data);
} else {
  Console.WriteLine("No worksheets available");
}

Here’s the complete code listing for your reference:

using Microsoft.Office.Interop.Excel;
using Microsoft.Office.Interop.Word;
using System;
using System.Runtime.InteropServices;

namespace OfficeInteropDemoApp
{
    class Program
    {
        static void Main(string[] args)
        {
            string filename = @"D:\Test.xlsx";
            DisplayExcelCellValue(filename, 1, 1);
            Console.Read();
        }

        static void DisplayExcelCellValue(string filename, 
        int row, int column)
        {
            Microsoft.Office.Interop.Excel.Application 
            excelApplication = null;
            try
            {
                excelApplication = new 
                Microsoft.Office.Interop.Excel.Application();
                Workbook excelWorkBook = 
                excelApplication.Workbooks.Open(filename);
                string workbookName = excelWorkBook.Name;
                int worksheetcount = excelWorkBook.Worksheets.Count;

                if (worksheetcount > 0)
                {
                    Worksheet worksheet = 
                   (Worksheet)excelWorkBook.Worksheets[1];
                    string firstworksheetname = worksheet.Name;
                    var data = ((Microsoft.Office.Interop.Excel.Range)
                    worksheet.Cells[row, column]).Value;
                    Console.WriteLine(data);
                }
                else
                {
                    Console.WriteLine("No worksheets available");
                }
            }
            catch (Exception ex)
            {
                Console.WriteLine(ex.Message);
            }
            finally
            {
                if (excelApplication != null)
                {
                    excelApplication.Quit();
                    Marshal.FinalReleaseComObject(excelApplication);
                }
            }
        }
    }
}

Refer to the code listing given above and note the finally block of the DisplayExcelCellValue method. The Quit method is called on the Excel application instance to stop the application. Finally, a call to Marshal.FinalReleaseComObject sets the reference count of the Excel application's runtime callable wrapper to 0.
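
The listing releases only the Application object. Intermediate COM objects such as the Workbooks collection and the Workbook itself also hold references, and a lingering Excel process is a frequently reported symptom when they are not released. The following is a hedged sketch of a more thorough cleanup, not the article's own method; the class name is hypothetical and the file path is the sample file assumed above.

```csharp
using System;
using System.Runtime.InteropServices;
using Excel = Microsoft.Office.Interop.Excel;

class ReleaseSketch
{
    static void Main()
    {
        Excel.Application app = null;
        Excel.Workbooks books = null;
        Excel.Workbook book = null;
        try
        {
            app = new Excel.Application();
            // Keep the collection in a variable so that it can be released later.
            books = app.Workbooks;
            book = books.Open(@"D:\Test.xlsx");
            Console.WriteLine(book.Name);
        }
        finally
        {
            // Release in reverse order of acquisition.
            if (book != null)
            {
                book.Close(false); // close without saving changes
                Marshal.ReleaseComObject(book);
            }
            if (books != null)
            {
                Marshal.ReleaseComObject(books);
            }
            if (app != null)
            {
                app.Quit();
                Marshal.ReleaseComObject(app);
            }
        }
    }
}
```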

The following code listing illustrates how you can create a new Excel document using Office Interop in C#. Note how a new workbook has been created:

static void CreateExcelDocument()
{
    Microsoft.Office.Interop.Excel.Application excelApplication = null;

    try {
        excelApplication = new Microsoft.Office.Interop.Excel.Application();
        // Add() creates a new, empty workbook.
        Workbook excelWorkBook = excelApplication.Workbooks.Add();
        Worksheet worksheet = (Worksheet) excelWorkBook.Worksheets[1];
        // Header row
        worksheet.Cells[1, 1] = "Product Id";
        worksheet.Cells[1, 2] = "Product Name";
        // Data rows
        worksheet.Cells[2, 1] = "1";
        worksheet.Cells[2, 2] = "Lenovo Laptop";
        worksheet.Cells[3, 1] = "2";
        worksheet.Cells[3, 2] = "DELL Laptop";
        // The workbook is saved in Excel's default format, so the .xlsx extension is used.
        excelWorkBook.SaveAs(@"D:\Test.xlsx");
    }
    catch (Exception ex) {
        Console.WriteLine(ex.Message);
    }
    finally {
        if (excelApplication != null) {
            excelApplication.Quit();
            Marshal.FinalReleaseComObject(excelApplication);
        }
    }
}

When you run this code, a new Excel document will be created at the path specified with the following content inside:

Product Id    Product Name
1             Lenovo Laptop
2             DELL Laptop


Working with Word Interop Objects in C#

To work with Microsoft Word, you would need to create an instance of Microsoft.Office.Interop.Word.Application. Like Excel, this instance would be used to communicate with a Word document.

Microsoft.Office.Interop.Word.Application wordApplication = new Microsoft.Office.Interop.Word.Application();

The next step is to create a document instance using the Documents property of the Microsoft.Office.Interop.Word.Application instance we just created, as shown in the C# code snippet given below:

var document = wordApplication.Documents.Add();

Next, you can create a paragraph and add some text to it, as shown in the code snippet below:

var paragraph = document.Paragraphs.Add();
paragraph.Range.Text = "This is a sample text to demonstrate how Interop works...";

Then you can save the Word document using this code:

wordApplication.ActiveDocument.SaveAs(@"D:\Test.doc", WdSaveFormat.wdFormatDocument);

Here is the complete code listing showing how to work with Microsoft Word Interop Objects in C# for your reference:

using Microsoft.Office.Interop.Excel;
using Microsoft.Office.Interop.Word;
using System;
using System.Runtime.InteropServices;

namespace OfficeInteropDemoApp
{
    class Program
    {
        static void Main(string[] args)
        {
            string filename = @"D:\Test.doc";
            CreateWordDocument(filename);
            Console.Read();
        }

        static void CreateWordDocument(string filename)
        {
            Microsoft.Office.Interop.Word.Application 
            wordApplication = null;
            try
            {
                wordApplication = new 
                Microsoft.Office.Interop.Word.Application();
                var document = wordApplication.Documents.Add();
                var paragraph = document.Paragraphs.Add();
                paragraph.Range.Text = "This is a sample text to demonstrate how Interop works...";
                wordApplication.ActiveDocument.SaveAs(filename, 
                WdSaveFormat.wdFormatDocument);
                document.Close();

            }
            finally
            {
                if (wordApplication != null)
                {
                    wordApplication.Quit();
                    Marshal.FinalReleaseComObject(wordApplication);
                }
            }
        }
    }
}

To read a Word document and display each word of the document you can use the following C# code:

static void ReadWordDocument(string filename)
        {
            Microsoft.Office.Interop.Word.Application 
            wordApplication = null;
            try
            {
                wordApplication = new 
                Microsoft.Office.Interop.Word.Application();
                Document document = 
                wordApplication.Documents.Open(filename);

                int count = document.Words.Count;
                for (int i = 1; i <= count; i++)
                {
                    string text = document.Words[i].Text;
                    Console.WriteLine(text);
                }
            }
            catch(Exception ex)
            {
                Console.Write(ex.Message);
            }
            finally
            {
                if (wordApplication != null)
                {
                    wordApplication.Quit();
                    Marshal.FinalReleaseComObject(wordApplication);
                }
            }
        }

Note how the Words property of the Document instance has been used to retrieve the words contained in the document.
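
Iterating the Words collection makes one COM call per word, which can be slow for large documents. A possible alternative, sketched here under the assumption that whitespace-separated tokens are good enough, is to read the whole text once through Content.Text and split it in managed code. The class name and file path are hypothetical, and the tokens will differ from the Words collection, which treats punctuation and paragraph marks as separate items.

```csharp
using System;
using System.Runtime.InteropServices;
using Word = Microsoft.Office.Interop.Word;

class ReadTextSketch
{
    static void Main()
    {
        Word.Application wordApplication = null;
        try
        {
            wordApplication = new Word.Application();
            // Hypothetical path; the document is opened read-only.
            Word.Document document =
                wordApplication.Documents.Open(@"D:\Test.doc", ReadOnly: true);

            // One COM call fetches the text of the main document body.
            string text = document.Content.Text;

            // Passing a null separator array splits on any whitespace.
            string[] tokens = text.Split((char[])null,
                StringSplitOptions.RemoveEmptyEntries);
            foreach (string token in tokens)
            {
                Console.WriteLine(token);
            }

            document.Close(Word.WdSaveOptions.wdDoNotSaveChanges);
        }
        finally
        {
            if (wordApplication != null)
            {
                wordApplication.Quit();
                Marshal.FinalReleaseComObject(wordApplication);
            }
        }
    }
}
```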

C# Interop Objects Tutorial

In this article we have examined how we can access Microsoft Office Interop objects using C#. Since there is still no support for working with Interop objects in .NET Core, we have created a .NET Framework Console Application in this example.

title: Word object model overview
description: The Word object model consists of classes and interfaces that are provided in the primary interop assembly for Word and are defined in the Word namespace.
ms.date: 02/02/2017
ms.topic: conceptual
dev_langs: VB, CSharp
helpviewer_keywords: Word object model; Word [Office development in Visual Studio], object model; object models [Office development in Visual Studio], Office; object models [Office development in Visual Studio], Word; objects [Office development in Visual Studio], Office object models; Office object models
author: John-Hart
ms.author: johnhart
manager: jmartens
ms.technology: office-development
ms.workload: office

Word object model overview

When you develop Word solutions in Visual Studio, you interact with the Word object model. This object model consists of classes and interfaces that are provided in the primary interop assembly for Word, and are defined in the Microsoft.Office.Interop.Word namespace.


This topic provides a brief overview of the Word object model. For resources where you can learn more about the entire Word object model, see Use the Word object model documentation.

For information about using the Word object model to perform specific tasks, see the following topics:

  • Work with documents

  • Work with text in documents

  • Work with tables

Understand the Word object model

Word provides hundreds of objects with which you can interact. These objects are organized in a hierarchy that closely follows the user interface. At the top of the hierarchy is the Application object (Microsoft.Office.Interop.Word.Application). This object represents the current instance of Word. The Application object contains the Document, Selection, Bookmark, and Range objects. Each of these objects has many methods and properties that you can access to manipulate and interact with the object.

The following illustration shows one view of these objects in the hierarchy of the Word object model.

Word Object Model graphic

At first glance, objects appear to overlap. For example, the Document and Selection objects are both members of the Application object, but the Document object is also a member of the Selection object. Both the Document and Selection objects contain Bookmark and Range objects. The overlap exists because there are multiple ways you can access the same type of object. For example, you apply formatting to a Range object; but you may want to access the range of the current selection, of a particular paragraph, of a section, or of the entire document.

The following sections briefly describe the top-level objects and how they interact with each other. These objects include the following five:

  • Application object

  • Document object

  • Selection object

  • Range object

  • Bookmark object

    In addition to the Word object model, Office projects in Visual Studio provide host items and host controls that extend some objects in the Word object model. Host items and host controls behave like the Word objects they extend, but they also have additional functionality such as data-binding capabilities and extra events. For more information, see Automate Word by using extended objects and Host items and host controls overview.

Application object

The Application object (Microsoft.Office.Interop.Word.Application) represents the Word application, and is the parent of all of the other objects. Its members usually apply to Word as a whole. You can use its properties and methods to control the Word environment.

In VSTO Add-in projects, you can access the Application object by using the Application field of the ThisAddIn class. For more information, see Program VSTO Add-ins.

In document-level projects, you can access the Application object by using the Application property of the ThisDocument class (Microsoft.Office.Tools.Word.Document.Application).

Document object

The Document object (Microsoft.Office.Interop.Word.Document) is central to programming Word. It represents a document and all of its contents. When you open a document or create a new document, you create a new Document object, which is added to the Documents collection of the Application object. The document that has the focus is called the active document. It is represented by the ActiveDocument property of the Application object.

The Office development tools in Visual Studio extend the Document object by providing the Microsoft.Office.Tools.Word.Document type. This type is a host item that gives you access to all features of a Document object, and adds additional events and the ability to add managed controls.

When you create a document-level project, you can access Microsoft.Office.Tools.Word.Document members by using the generated ThisDocument class in your project. You can access members of the Microsoft.Office.Tools.Word.Document host item by using the Me or this keywords from code in the ThisDocument class, or by using Globals.ThisDocument from code outside the ThisDocument class. For more information, see Program document-level customizations. For example, to select the first paragraph in the document, use the following code.

C#

:::code language="csharp" source="../vsto/codesnippet/CSharp/Trin_VstcoreWordAutomationCS/ThisDocument.cs" id="Snippet120":::

VB

:::code language="vb" source="../vsto/codesnippet/VisualBasic/Trin_VstcoreWordAutomationVB/ThisDocument.vb" id="Snippet120":::
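
The snippet includes above reference external files that are not reproduced here. As a stand-in, the following is a minimal sketch of what selecting the first paragraph might look like. It is written against the plain interop Document type rather than the generated ThisDocument class, and the class and method names are hypothetical.

```csharp
using Word = Microsoft.Office.Interop.Word;

static class ParagraphSketch
{
    // Selects the first paragraph of the given document.
    // Word collections are 1-based, so index 1 is the first paragraph.
    public static void SelectFirstParagraph(Word.Document document)
    {
        if (document.Paragraphs.Count > 0)
        {
            document.Paragraphs[1].Range.Select();
        }
    }
}
```

Inside a ThisDocument class, the same member chain would presumably start from this instead of a document parameter.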

In VSTO Add-in projects, you can generate Microsoft.Office.Tools.Word.Document host items at run time. You can use the generated host item to add controls to the associated document. For more information, see Extend Word documents and Excel workbooks in VSTO Add-ins at run time.

Selection object

The Selection object (Microsoft.Office.Interop.Word.Selection) represents the area that is currently selected. When you perform an operation in the Word user interface, such as bolding text, you select, or highlight, the text and then apply the formatting. The Selection object is always present in a document. If nothing is selected, then it represents the insertion point. In addition, a selection can encompass multiple blocks of text that are not contiguous.
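
As an illustration of working through the selection, the sketch below types bold text at the current insertion point. It assumes an Application instance with at least one open document (otherwise the Selection property may be null); the class and method names are hypothetical.

```csharp
using Word = Microsoft.Office.Interop.Word;

static class SelectionSketch
{
    // Types the given text in bold at the current selection or insertion point.
    public static void TypeBoldText(Word.Application application, string text)
    {
        Word.Selection selection = application.Selection;
        if (selection == null)
        {
            return; // no document window is open
        }

        selection.Font.Bold = 1;   // Font.Bold is an integer: nonzero turns bold on
        selection.TypeText(text);
        selection.Font.Bold = 0;   // switch bold off again for subsequent typing
    }
}
```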

Range object

The Range object (Microsoft.Office.Interop.Word.Range) represents a contiguous area in a document, and is defined by a starting character position and an ending character position. You are not limited to a single Range object. You can define multiple Range objects in the same document. A Range object has the following characteristics:

  • It can consist of the insertion point alone, a range of text, or the entire document.

  • It includes non-printing characters such as spaces, tab characters, and paragraph marks.

  • It can be the area represented by the current selection, or it can represent an area different from the current selection.

  • It is not visible in a document, unlike a selection, which is always visible.

  • It is not saved with a document and exists only while the code is running.

    When you insert text at the end of a range, Word automatically expands the range to include the inserted text.
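
The expansion behavior can be observed with a short experiment. This is a minimal sketch that assumes Word is installed and works on a scratch document that is discarded at the end; the class name is hypothetical.

```csharp
using System;
using System.Runtime.InteropServices;
using Word = Microsoft.Office.Interop.Word;

class RangeSketch
{
    static void Main()
    {
        Word.Application app = null;
        try
        {
            app = new Word.Application();
            Word.Document doc = app.Documents.Add();

            // A collapsed range (start == end) is just an insertion point.
            Word.Range range = doc.Range(0, 0);
            range.Text = "Hello";

            // InsertAfter appends at the end of the range.
            range.InsertAfter(", world");

            // Print the range boundaries and text to see how far the range extends.
            Console.WriteLine("{0}..{1}: {2}", range.Start, range.End, range.Text);

            doc.Close(Word.WdSaveOptions.wdDoNotSaveChanges);
        }
        finally
        {
            if (app != null)
            {
                app.Quit();
                Marshal.FinalReleaseComObject(app);
            }
        }
    }
}
```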

Content control objects

A ContentControl (Microsoft.Office.Interop.Word.ContentControl) provides a way for you to control the input and presentation of text and other types of content in Word documents. A ContentControl can display several different types of UI that are optimized for use in Word documents, such as a rich text control, a date picker, or a combo box. You can also use a ContentControl to prevent users from editing sections of the document or template.

Visual Studio extends the ContentControl object into several different host controls. Whereas the ContentControl object can display any of the different types of UI that are available for content controls, Visual Studio provides a different type for each content control. For example, you can use a Microsoft.Office.Tools.Word.RichTextContentControl to create a rich text control, or you can use a Microsoft.Office.Tools.Word.DatePickerContentControl to create a date picker. These host controls behave like the native ContentControl, but they have additional events and data-binding capabilities. For more information, see Content controls.

Bookmark object

The Bookmark object (Microsoft.Office.Interop.Word.Bookmark) represents a contiguous area in a document, with both a starting position and an ending position. You can use bookmarks to mark a location in a document, or as a container for text in a document. A Bookmark object can consist of the insertion point, or be as large as the entire document. A Bookmark has the following characteristics that set it apart from the Range object:

  • You can name the bookmark at design time.

  • Bookmark objects are saved with the document, and thus are not deleted when the code stops running or your document is closed.

  • Bookmarks can be hidden or made visible by setting the ShowBookmarks property of the View object to false or true.

    Visual Studio extends the Bookmark object by providing the Microsoft.Office.Tools.Word.Bookmark host control. The Microsoft.Office.Tools.Word.Bookmark host control behaves like a native Bookmark, but has additional events and data-binding capabilities. You can bind data to a bookmark control on a document in the same way that you bind data to a text box control on a Windows Form. For more information, see Bookmark control.
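
A native bookmark can also be created from code. The sketch below bookmarks the first paragraph of a document and makes bookmarks visible in the active window; the bookmark name, class name, and method name are hypothetical, and the document is assumed to be open in a window.

```csharp
using Word = Microsoft.Office.Interop.Word;

static class BookmarkSketch
{
    // Adds a named bookmark over the first paragraph and shows bookmark markers.
    public static Word.Bookmark MarkFirstParagraph(Word.Document document)
    {
        Word.Range range = document.Paragraphs[1].Range;

        // The bookmark is stored in the document and persists after it is saved.
        Word.Bookmark bookmark = document.Bookmarks.Add("FirstParagraph", range);

        // ShowBookmarks is a property of the View object of the document's window.
        document.ActiveWindow.View.ShowBookmarks = true;
        return bookmark;
    }
}
```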

Use the Word object model documentation

For complete information about the Word object model, you can refer to the Word primary interop assembly (PIA) reference and the Visual Basic for Applications (VBA) object model reference.

Primary interop assembly reference

The Word PIA reference documentation describes the types in the primary interop assembly for Word. This documentation is available from the following location: Word 2010 primary interop assembly reference.

For more information about the design of the Word PIA, such as the differences between classes and interfaces in the PIA and how events in the PIA are implemented, see Overview of classes and interfaces in the Office primary interop assemblies.

VBA object model reference

The VBA object model reference documents the Word object model as it is exposed to VBA code. For more information, see Word 2010 object model reference.

All of the objects and members in the VBA object model reference correspond to types and members in the Word PIA. For example, the Document object in the VBA object model reference corresponds to the Microsoft.Office.Interop.Word.Document object in the Word PIA. Although the VBA object model reference provides code examples for most properties, methods, and events, you must translate the VBA code in this reference to Visual Basic or Visual C# if you want to use them in a Word project that you create by using Visual Studio.

See also

  • Office primary interop assemblies
  • Automate Word by using extended objects
  • Work with documents
  • Work with text in documents
  • Work with tables
  • Host items and host controls overview
  • Programmatic limitations of host items and host controls
  • Optional parameters in Office solutions

Время прочтения: 6 мин.

В
процессе аудита часто возникает потребность в обработке данных из документов в
форматах MS Word или Excel. В своей статье я хочу поделиться опытом считывания
информации из таких файлов с использованием языка C#.

Для
работы с файлами Word и Excel
я решил выбрать библиотеки Microsoft.Office.Interop.Word и Microsoft.Office.Interop.Excel, предоставляющие программные интерфейсы для
взаимодействия с объектами MS
Word и
Excel.

Преимущества
использования этих библиотек:

  • созданы корпорацией Microsoft, следовательно, взаимодействие с объектами программ пакета MS Office реализовано наиболее оптимально,
  • нужный пакет Visual Studio Tool for Office поставляется вместе с Visual Studio (достаточно отметить его при установке VS).

Также
следует заметить, что у такого похода есть и недостаток: для того, чтобы
написанная программа работала на ПК пользователя необходимо, чтобы на нём были
установлены программы MS
Office и
MS Excel. Поэтому такой
подход плохо подходит для серверных решений. Также такая программа не будет
являться кроссплатформенной.

Adding the libraries to a Visual Studio project

The libraries ship with the Visual Studio Tools for Office package (.NET Framework platform).

To use a library:

  • add a reference to it: in Solution Explorer, right-click References (Fig. 1), choose Add Reference, and search for the library by keyword (once added, the reference appears in the list),

  • specify the namespace in the program file (in the example it is given the alias Word) (Fig. 2).
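As a rough sketch, the using directives for the listings in this article could look like the following. The Word alias comes from the article itself; the Excel alias and the System.Runtime.InteropServices import (needed for the Marshal class) are assumptions inferred from the names the listings use.

```csharp
using System;                                  // Exception
using System.Runtime.InteropServices;          // Marshal.ReleaseComObject
using Word = Microsoft.Office.Interop.Word;    // alias used in the Word example
using Excel = Microsoft.Office.Interop.Excel;  // alias assumed for the Excel example
```

The aliases keep type names short and avoid clashes between types that exist in both libraries, such as Range and Application.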

Example: parsing an MS Word file

The main formats can be read: .doc, .docx, .rtf.

The listing below shows an example of reading text from an MS Word document:

// Requires: using System; using System.Runtime.InteropServices;
//           using Word = Microsoft.Office.Interop.Word;
object FileName = @"C:\test.doc";
object rOnly = true;
object SaveChanges = false;
object MissingObj = System.Reflection.Missing.Value;

Word.Application app = new Word.Application();
Word.Document doc = null;
Word.Range range = null;
try
{
    // Open the document read-only; every other optional argument is left as Missing
    doc = app.Documents.Open(ref FileName, ref MissingObj, ref rOnly, ref MissingObj,
    ref MissingObj, ref MissingObj, ref MissingObj, ref MissingObj,
    ref MissingObj, ref MissingObj, ref MissingObj, ref MissingObj,
    ref MissingObj, ref MissingObj, ref MissingObj, ref MissingObj);

    object StartPosition = 0;
    object EndPosition = doc.Characters.Count;
    range = doc.Range(ref StartPosition, ref EndPosition);

    // Get the main body text (excluding footnotes, headers, and footers)
    string MainText = range?.Text;
    if (MainText != null)
    {
        /* Process the main text of the document */
    }

    // Get the text of headers and footers
    foreach (Word.Section section in doc.Sections)
    {
        // Footers
        foreach (Word.HeaderFooter footer in section.Footers)
        {
            string FooterText = footer.Range?.Text;
            if (FooterText != null)
            {
                /* Process the text */
            }
        }

        // Headers
        foreach (Word.HeaderFooter header in section.Headers)
        {
            string HeaderText = header.Range?.Text;
            if (HeaderText != null)
            {
                /* Process the text */
            }
        }
    }
    // Get the text of footnotes
    if (doc.Footnotes.Count != 0)
    {
        foreach (Word.Footnote footnote in doc.Footnotes)
        {
            string FootnoteText = footnote.Range?.Text;
            if (FootnoteText != null)
            {
                /* Process the text */
            }
        }
    }
} catch (Exception ex)
{
    /* Handle exceptions */
}
finally
{
    /* Release unmanaged resources */
    if(range != null)
    {
        Marshal.ReleaseComObject(range);
        range = null;
    }
    if(doc != null)
    {
        doc.Close(ref SaveChanges);
        Marshal.ReleaseComObject(doc);
        doc = null;
    }
    if(app != null)
    {
        app.Quit();
        Marshal.ReleaseComObject(app);
        app = null;
    }
}

Notes:

  • the code shows how to read the main text of the document, the text of headers and footers, and the text of footnotes,
  • the code releases unmanaged resources using the Marshal class.
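The repeated "check for null, release, set to null" pattern in the finally block could be wrapped in a small helper. The sketch below is only an illustration: the class name ComCleanup and the method Release are hypothetical and do not come from the article or from the interop libraries. It relies only on Marshal.IsComObject and Marshal.ReleaseComObject.

```csharp
using System;
using System.Runtime.InteropServices;

static class ComCleanup
{
    // Hypothetical helper (not part of the article): releases the COM wrapper
    // behind the reference, if there is one, and clears the caller's variable
    // so the released object cannot be used by mistake.
    public static void Release<T>(ref T comObject) where T : class
    {
        if (comObject != null && Marshal.IsComObject(comObject))
        {
            Marshal.ReleaseComObject(comObject);
        }
        comObject = null;
    }
}
```

With such a helper, a cleanup step becomes a single call such as ComCleanup.Release(ref range). Closing the document and quitting the application still have to be done explicitly before their objects are released.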

Example: parsing an MS Excel file

The main formats can be read: .xls, .xlsx.

The listing below shows an example of reading text from an MS Excel document (cell by cell):

// Requires: using System; using System.Runtime.InteropServices;
//           using Excel = Microsoft.Office.Interop.Excel;
string FileName = @"C:\Users\bee\Downloads\test.xlsx";
object rOnly = true;
object SaveChanges = false;
object MissingObj = System.Reflection.Missing.Value;

Excel.Application app = new Excel.Application();
Excel.Workbooks workbooks = null;
Excel.Workbook workbook = null;
Excel.Sheets sheets = null;
try
{
    workbooks = app.Workbooks;
    // Open the workbook read-only; every other optional argument is left as Missing
    workbook  = workbooks.Open(FileName, MissingObj, rOnly, MissingObj, MissingObj,
                                MissingObj, MissingObj, MissingObj, MissingObj, MissingObj,
                                MissingObj, MissingObj, MissingObj, MissingObj, MissingObj);

    // Get all sheets of the document
    sheets = workbook.Sheets;

    foreach(Excel.Worksheet worksheet in sheets)
    {
        // Get the range of cells used on the sheet
        Excel.Range UsedRange = worksheet.UsedRange;
        // Get the rows of the used range
        Excel.Range urRows = UsedRange.Rows;
        // Get the columns of the used range
        Excel.Range urColumns = UsedRange.Columns;

        // Numbers of rows and columns
        int RowsCount = urRows.Count;
        int ColumnsCount = urColumns.Count;
        for(int i = 1; i <= RowsCount; i++)
        {
            for(int j = 1; j <= ColumnsCount; j++)
            {
                Excel.Range CellRange = UsedRange.Cells[i, j];
                // Read the cell value once and convert it to text
                object CellValue = CellRange?.Value2;
                string CellText = CellValue?.ToString();

                if(CellText != null)
                {
                    /* Process the text */
                }
                // Release the cell's COM wrapper before moving to the next cell
                if (CellRange != null) Marshal.ReleaseComObject(CellRange);
            }
        }
        // Release unmanaged resources on every iteration
        if (urRows != null) Marshal.ReleaseComObject(urRows);
        if (urColumns != null) Marshal.ReleaseComObject(urColumns);
        if (UsedRange != null) Marshal.ReleaseComObject(UsedRange);
        if (worksheet != null) Marshal.ReleaseComObject(worksheet);
    }
} catch (Exception ex)
{
    /* Handle exceptions */
}
finally
{
    /* Release the remaining unmanaged resources */
    if (sheets != null) Marshal.ReleaseComObject(sheets);
    if (workbook != null)
    {
        workbook.Close(SaveChanges);
        Marshal.ReleaseComObject(workbook);
        workbook = null;
    }

    if (workbooks != null)
    {
        workbooks.Close();
        Marshal.ReleaseComObject(workbooks);
        workbooks = null;
    }
    if (app != null)
    {
        app.Quit();
        Marshal.ReleaseComObject(app);
        app = null;
    }
}

Notes:

  • to process the text of each cell, you need to know in advance the number of used rows and columns on the current sheet of the document,
  • this traversal is not optimal (it visits every cell, so its time complexity is O(n²) for an n × n range): if needed, it can be sped up (for example, by splitting the processing across several threads); this article only shows an example of getting the text of each cell,
  • with this kind of traversal, unmanaged resources must be released on every iteration to avoid memory leaks (as in the previous example, the Marshal class is used).
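One common way to reduce the per-cell overhead is to read the whole used range with a single Value2 call instead of one call per cell. This is a different technique from the multithreading the notes mention. For a multi-cell range, Value2 returns a two-dimensional object array with 1-based indices; for a single cell it returns the value itself. The sketch below shows only the conversion step; the names RangeValues and ToCellTexts are hypothetical and not part of the interop library.

```csharp
using System;
using System.Collections.Generic;

static class RangeValues
{
    // Hypothetical helper: turns the object returned by Range.Value2 into
    // a flat list of non-empty cell texts, row by row.
    public static List<string> ToCellTexts(object rangeValue)
    {
        var texts = new List<string>();
        if (rangeValue == null)
        {
            return texts; // a single empty cell
        }

        var grid = rangeValue as object[,];
        if (grid == null)
        {
            // A single-cell range yields the value itself, not an array
            texts.Add(rangeValue.ToString());
            return texts;
        }

        // Use the array's own bounds instead of assuming a 0-based or 1-based index
        for (int i = grid.GetLowerBound(0); i <= grid.GetUpperBound(0); i++)
        {
            for (int j = grid.GetLowerBound(1); j <= grid.GetUpperBound(1); j++)
            {
                object value = grid[i, j];
                if (value != null)
                {
                    texts.Add(value.ToString());
                }
            }
        }
        return texts;
    }
}
```

In the listing above, this would be used as RangeValues.ToCellTexts(UsedRange.Value2), which replaces the two nested loops and the per-cell Range objects with one COM call per sheet.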

The examples above are well suited to building an application that processes Word and Excel documents on the .NET Framework platform.

These libraries can be used not only to read text from documents but also to create new MS Word and Excel files.
