Regular expressions in Microsoft Word


August 20, 2018 — by Suat M. Ozgur

How to Use RegEx in Microsoft Word

Lissa asks:

Is there a way to change a number (always a random number) after the word fox? Example: fox 23, bear 1, fox 398, frog 12, fox 15. I want to change the number to the same color of the word fox.

We can find and replace by format in Microsoft Word. This feature quickly finds text with a given format and can even replace that format throughout the document.

Select Advanced Find on the ribbon.

Find and Replace Dialog

Enter the text to find, click the More button to see the advanced options, and then click the Format button.

Advanced Find Options

Select the Font option, then set the text color that you would like to find in the document. Click OK to close the Find Font dialog window.


Select text color in Find Font dialog.

Click Find Next, and the first occurrence of the search text in the chosen color is selected.

Find Next to find the first occurrence.

We can also make more complicated searches by using wildcards. However, Word’s native search module doesn’t let us run the kind of search Lissa asked for.

That’s where we can call RegEx into the game!

VBScript Regular Expressions Library

VBA doesn’t ship with built-in regular expression support. However, the Microsoft VBScript library contains powerful regular expression capabilities. This library is part of Internet Explorer 5.5 and later, so it is available on all computers running Windows XP, Vista, 7, 8, 8.1, or 10.

Mac Users

Since Internet Explorer is not a Mac application, this library doesn’t exist on the Mac. Therefore, the VBA samples below don’t work on a Mac.

To use this library in VBA, switch to the VBE, choose Tools > References in the VBE menu, then scroll down the list to find the item «Microsoft VBScript Regular Expressions 5.5», and tick it to include it in the project.

VBScript Regular Expressions Library

Insert a new module, and copy and paste the following code into this module.

Sub doRegexFind()
    Dim strSample As String
    Dim objRegex As RegExp
    Dim matches As MatchCollection
    Dim fnd As Match

    strSample = "First product code is fox 12, second one is fox 22, and third product is fox 45."

    Set objRegex = New RegExp

    With objRegex
        ' "fox", a single space, then one or more digits
        .Pattern = "fox \d+"
        .Global = True        ' return every match, not only the first
        .IgnoreCase = True
        Set matches = .Execute(strSample)
        For Each fnd In matches
            Debug.Print fnd
        Next fnd
    End With
End Sub

This procedure takes the sample text, finds the product codes that match the given pattern («fox», a single space, and a number), and prints them in the Immediate window (press Ctrl + G in the VBE if it is not already visible).

Matched product codes printed in the Immediate window.

The \d+ term in the pattern matches one or more digits, so the whole pattern is the «fox» prefix, followed by a space, followed by a number.
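The pattern can also be checked outside Word. The following is only a minimal sketch: it uses Python's re module as a stand-in for VBScript.RegExp (an assumption of this example, not part of the macro) and applies the same expression to the same sample text.

```python
import re

# Sample text taken from the VBA macro.
sample = ("First product code is fox 12, second one is fox 22, "
          "and third product is fox 45.")

# "fox", one space, then one or more digits.
# re.IGNORECASE plays the role of .IgnoreCase = True.
pattern = re.compile(r"fox \d+", re.IGNORECASE)

# findall returns every non-overlapping match, similar to
# calling .Execute with .Global = True.
codes = pattern.findall(sample)
print(codes)
```

Without the backslash, «d+» would match one or more literal letters d, so «fox d+» would find nothing in this text.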

Copy and paste the following code to see RegEx in action, removing the spaces from the product codes.

Sub doRegexFindReplace()
    Dim objRegex As RegExp
    Dim strSample As String

    strSample = "First product code is fox 12, second one is fox 22, and third product is fox 45."

    Set objRegex = New RegExp

    With objRegex
        ' Two capture groups: (fox) is $1 and (\d+) is $2
        .Pattern = "(fox) (\d+)"
        .Global = True
        .IgnoreCase = True
        ' Join the two groups without the space between them
        strSample = .Replace(strSample, "$1$2")
    End With

    Debug.Print strSample
End Sub

This procedure replaces the sample text content by removing the spaces from the product codes matched with the given pattern, and prints the result text in the Immediate window.

Replaced text printed in the Immediate window.

Note that the pattern differs slightly from the first one. Each term is enclosed in parentheses, which makes it a capture group, and the Replace method refers to the groups in order as $1 and $2. The replacement string simply joins the two groups without the space.
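To see the capture groups at work in isolation, here is a small sketch in Python (again only a stand-in for VBScript.RegExp). The one syntactic difference to keep in mind is that Python writes the backreferences as \1 and \2 where VBScript.RegExp uses $1 and $2.

```python
import re

sample = ("First product code is fox 12, second one is fox 22, "
          "and third product is fox 45.")

# Group 1 captures the prefix, group 2 captures the number.
# The replacement glues them together, dropping the space.
result = re.sub(r"(fox) (\d+)", r"\1\2", sample, flags=re.IGNORECASE)
print(result)
```

Swapping the replacement to r"\2\1" would put the number before the prefix, which shows that the groups can be reused in any order.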

Back to the Question

Let’s go back to the sample text we used at the beginning of this article.

Sample Text

We need to find «fox» followed by numeric characters, and change the match by using the color of the «fox» section in the matched text.

Although RegEx is very good at matching a given pattern, it cannot change the color of text in a Word document. So we will combine RegEx and Word VBA methods in the following procedure.

Here are the steps:

  1. Find the matches with RegEx.
  2. Search each matched text by using Word Find method.
  3. Find the color of the first word in the found range.
  4. Change the color of the found range with the color in the previous step.

Switch to VBE, and insert a new module. Make sure VBScript Regular Expressions library is added to the project, and copy and paste the following code into this new module.

Sub doRegexMagic()
    Dim str As String
    Dim objRegex As RegExp
    Dim matches As MatchCollection
    Dim fnd As Match

    Set objRegex = New RegExp

    str = "fox"

    ' Select the whole document so its text can be passed to RegEx
    With Selection
        .HomeKey wdStory
        .WholeStory
    End With

    With objRegex
        .Pattern = str & " \d+"
        .Global = True
        .IgnoreCase = True
        Set matches = .Execute(Selection.Text)
    End With

    With Selection
        .HomeKey wdStory
        With .Find
            .ClearFormatting
            .Forward = True
            .Format = False
            .MatchCase = True
            For Each fnd In matches
                ' Locate the next occurrence of the matched text in the document
                .Text = fnd
                .Execute
                With Selection
                    ' Copy the color of the first word to the whole match
                    .Font.Fill.ForeColor = .Range.Words(1).Font.TextColor
                    .MoveRight wdCharacter
                End With
            Next fnd
        End With
        .HomeKey wdStory
    End With
End Sub

Run the code, and here is the result.

Result

RegEx in Excel?

Regex is completely missing from Excel. However, we can still use VBScript Regular Expressions in Excel VBA.

Launch Excel, open a new workbook, and create the content as shown below.

Sample data in Excel

Switch to VBE, and insert a new module. Make sure VBScript Regular Expressions library is added to the project just like you did in Word, and copy and paste the following code into this new module.

Sub doRegexMagicInExcel()
    Dim str As String
    Dim objRegex As RegExp
    Dim rng As Range
    Dim cll As Range

    Set objRegex = New RegExp
    Set rng = Selection

    str = "fox"

    With objRegex
        .Pattern = "(" & str & ") (\d+)"
        .Global = True
        .IgnoreCase = True
        For Each cll In rng.Cells
            ' With .Global = True a single Replace call handles every match in the cell
            If .Test(cll.Value) Then
                cll.Value = .Replace(cll.Value, "$1$2")
            End If
        Next cll
    End With
End Sub

Return to the worksheet and select the range with the sample text. Run the macro and see the result.

Result in Excel

This procedure loops through the cells in the selected range and replaces the text in each cell, removing the spaces from the product codes that match the given RegEx pattern.

Title Photo: tito pixel / Unsplash


Prerequisites

To run the sample application, the Microsoft .NET Framework 2.0 or higher must be installed. In addition, Microsoft Office 2003 or higher must be installed along with the Microsoft Office 2003 Primary Interop Assemblies (PIAs) redistributable. The PIAs are installed by a full install of Microsoft Office 2003, and they can also be obtained for free from Microsoft.

For more information on how to install and use the Primary Interop Assemblies in .NET programs, please refer to this link.

I would like to emphasize that Visual Studio Tools for Office is not needed to run or modify this program.

Introduction

Regular Expressions are a very powerful tool for text processing. Sophisticated expressions can find all kinds of text patterns, and Regular Expression engines are integrated into many text editors. Most Regular Expression examples show how to manipulate either ASCII or Unicode text. Beyond those standard text formats, however, there are millions (probably billions) of documents encoded in one of Microsoft’s many Office formats, such as Word (DOC), Rich Text Format (RTF), and Excel (XLS). While one can search Microsoft Office documents with Regular Expressions through Smart Tags, that implementation is cumbersome for many document processing purposes.

In this article, I will present a simple methodology for applying the power of Regular Expressions to Microsoft Word documents through the Microsoft .NET Framework. The methodology uses the System.Text.RegularExpressions namespace and the Microsoft Word interop assemblies. In addition, through dynamically loadable assemblies, every Regular Expression match can be validated to ensure that it is correct. For example, it is quite easy to write a Regular Expression for a numerical date of the form 02/07/2007 for February 7, 2007. But making the Regular Expression itself reject invalid dates such as 04/31/2002 or 02/30/2007 is quite difficult without code that performs such checks.
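The idea of matching first and validating afterwards can be illustrated in a few lines. This is a sketch in Python rather than the article's C#, and both the pattern and the function name valid_us_dates are hypothetical; they are not taken from Searches.XML or from the plug-in assemblies.

```python
import re
from datetime import date

# Hypothetical pattern for numeric US dates (MM/DD/YYYY).
US_DATE = re.compile(r"\b(\d{2})/(\d{2})/(\d{4})\b")

def valid_us_dates(text):
    """Return the matched dates that also exist on the calendar."""
    found = []
    for m in US_DATE.finditer(text):
        month, day, year = (int(g) for g in m.groups())
        try:
            # date() raises ValueError for impossible dates
            # such as 04/31 or 02/30.
            date(year, month, day)
        except ValueError:
            continue
        found.append(m.group(0))
    return found

text = "Due 02/07/2007, not 04/31/2002 or 02/30/2007."
print(valid_us_dates(text))
```

The regular expression stays short because it only describes the shape of a date; the calendar rules, including leap years, live in ordinary code.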

In future articles, I plan to present ways of using Regular Expressions to perform sophisticated text search and replace through the Microsoft Office interop assemblies and .NET technologies. I will also apply these techniques to other Microsoft Office documents such as Excel workbooks.

Background

Support for Regular Expressions in Microsoft applications first appeared in Word 97. That implementation was tedious to use because its syntax differed significantly from standard Regular Expression syntax. Microsoft recognized the shortfalls and reintroduced Regular Expressions as part of the Smart Tags library 2.0, first available with Microsoft Office 2003. Smart Tags, of which Regular Expression operations form a small part, were a generalized, integrated way for users to present data from their documents. However, because Smart Tags are non-intuitive and complicated, Microsoft itself admits on its MSDN website that a poll showed developers have not taken the necessary steps to develop them or to use the Microsoft .NET Framework to do so. Please refer to this MSDN article for more information: Realize the Potential of Office 2003 by Creating Smart Tags in Managed Code. The focus of this article is devising a simple yet powerful way of using Regular Expressions (along with validation code).

Using the Code

On startup, the program reads the XML file Searches.XML. This file contains information for all built-in Regular Expression searches. Included in this XML file are searches for URLs, IP addresses, US dates, European dates, US phone numbers, and email addresses. One can add as many search options as she or he wants to this file. Each search option can be activated by placing a check by the desired search.

Each search group contains the following information in the XML file:

  • Search Regex – The Regular Expression used in the search
  • Identifier – The search title that appears in the check listbox
  • FindColor – The color used to highlight the found text in the document
  • Action – The operation used (this version only supports Find)
  • PlugInName – The name of the assembly associated with the search. If no assembly is associated, “None” is used.
  • PlugInFunction – The function called for this search block that is found in its plug-in assembly
  • Description – The description text that is displayed in the check list box

Finding the Text

MSWordRegExDemo contains methods which manipulate the Microsoft Word or RTF document using automation by way of the Microsoft Word interop assembly. All of these methods are contained in the DocumentEngine class. The two main Microsoft Word objects that are used in this application are:

Word.Application app;
Word.Document theDoc;

To open the document, we perform the following call which is triggered by the file open event in the GUI:

public void OpenDocument(string documentName)
{
    object optional = Missing.Value;
    object visible = true;
    object fileName = documentName;
    if (app == null)
        app = new Word.Application();

    app.Visible = true;

    try
    {
        
        theDoc = app.Documents.Open(ref fileName, ref optional,
            ref optional, ref optional, ref optional, ref optional, ref optional,
            ref optional, ref optional, ref optional, ref optional, ref visible,
            ref optional, ref optional, ref optional, ref optional);

        paraCount = theDoc.Paragraphs.Count;
    }
    catch(Exception ex)
    {
        MessageBox.Show(ex.Message + ": Error opening document");
    }
}

The first step is converting the content of the Word document into plain text. Once we have the document in the text domain, we can run a Regular Expression search on it and see if there are any matches. See below:

docText = docEngine.GetRng(currentParaNum).Text;

If one or more matches occur, we take the match text and feed it to the Microsoft Word Find function. To search, we need to choose a range of the document to extract as text. I have chosen the paragraph as the range: we loop through the document paragraph by paragraph and run our searches on each one. For short documents, we could use the entire document range. If we wanted to iterate through footnotes, Word provides a footnote range. The following function returns the range of each paragraph:
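The paragraph-by-paragraph loop can be sketched independently of Word. In the Python sketch below, a list of strings stands in for the text of each paragraph range; the function name find_in_paragraphs and the sample data are made up for illustration and do not appear in the demo project.

```python
import re

def find_in_paragraphs(paragraphs, pattern):
    """Yield (paragraph_number, start_offset, matched_text) for every match."""
    regex = re.compile(pattern)
    # Word numbers paragraphs from 1, so the sketch does the same.
    for number, text in enumerate(paragraphs, start=1):
        for m in regex.finditer(text):
            yield number, m.start(), m.group(0)

# Plain-text stand-ins for the text of each paragraph.
paragraphs = [
    "Contact: 10.0.0.1 is the gateway.",
    "No addresses here.",
    "Backup hosts: 10.0.0.7 and 10.0.0.9.",
]

hits = list(find_in_paragraphs(paragraphs, r"\d+\.\d+\.\d+\.\d+"))
print(hits)
```

Keeping the paragraph number and the offset with each hit is what later allows the match to be located again inside the document range.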


public Word.Range GetRng(int nParagraphNumber)
{
    try
    {
        return theDoc.Paragraphs[nParagraphNumber].Range;
    }
    catch (System.Runtime.InteropServices.COMException ex)
    {
        MessageBox.Show(ex.Message + "\nParagraph Number: " +
            nParagraphNumber.ToString() + " does not exist.");
        return null;
    }
}

The main function which performs the «find» of text is RegularExpressionFind.

public void RegularExpressionFind(int paraNum, string docText,
       SearchStruct selSearchStruct, out List<HitInfo> hits)
{
    HitInfo hitInfo = new HitInfo();
    hits = new List<HitInfo>();
    System.Text.RegularExpressions.Regex r;
    Word.WdColor color = GetSearchColor(selSearchStruct.TextColor);

    r = new Regex(selSearchStruct.RegExpression);
    MatchCollection matches = r.Matches(docText);

    
    if (matches.Count == 0)
        return;

    
    try
    {
        if (!LoadSearchAssembly(selSearchStruct.PlugInName,
                                selSearchStruct.PlugInFunction))
            return;
    }
    catch (Exception)
    {
        throw; // rethrow without resetting the stack trace
    }

    int index = 0;

    
    int startSearchPos = GetRng(paraNum).Start;

    foreach (Match match in matches)
    {
        
        if (hasValidationAssembly)
        {
            Object[] objList = new Object[1];
            objList[0] = (Object)match;
            if (!Convert.ToBoolean(validationMethod.Invoke
                (assemblyInstance, objList)))
                continue;
        }
        index = docText.IndexOf(match.Value, index);

        
        string matchStr = docText.Substring(index, match.Value.Length);
        index += matchStr.Length - 1;

        
        FindTextInDoc(OperationMode.DotNetRegExMode, paraNum,
        matchStr, color, startSearchPos, out  startSearchPos,
        out hitInfo.StartDocPosition);

        
        hitInfo.Text = match.Value;
        hits.Add(hitInfo);
   }
}

First, we search for the Regular Expression in the imported paragraph, by using the Regex .NET functions.

r = new Regex(selSearchStruct.RegExpression);
MatchCollection matches = r.Matches(docText);


if (matches.Count == 0)
    return;

If there is a match, we load the search assembly if it has not already been loaded, and perform additional validation on the match.

try
{
  if (!LoadSearchAssembly(selSearchStruct.PlugInName,
            selSearchStruct.PlugInFunction))
            return;
}

The following method dynamically loads the validation assembly for the Regular Expression, if one exists. If the assembly was previously loaded, the LoadFrom method will return it.

 public bool LoadSearchAssembly(string plugginName, string plugInFunction)
 {
     try
     {
        
        if (plugginName.ToLower() == "none")
        {
            hasValidationAssembly = false;
            return true;
        }
        hasValidationAssembly = true;

        
        
        string plugginPath = Path.GetDirectoryName
        (Application.ExecutablePath) + @"\Plugins\" + plugginName;
        if (!File.Exists(plugginPath))
            throw new Exception("Cannot find path to assembly: " +
                                plugginName);

        Assembly a = Assembly.LoadFrom(plugginPath);
        
        Type[] types = a.GetTypes();

        
        validationMethod = types[0].GetMethod(plugInFunction);
        
        assemblyInstance = Activator.CreateInstance(types[0]);

        return true;
    }
    catch (Exception ex)
    {
        MessageBox.Show(ex.Message);
        return false;
    }
}

Below is the assembly that validates a numerical date:

using System.Text.RegularExpressions;

namespace SaelSoft.RegExPlugIn.NumericalDateValidator
{
    public class NumericalDateValidatorClass
    {
        int month = 0;
        int day = 0;
        int year = 0;
        public bool ValidateUSDate(Match matchResult)
        {
            if (matchResult.Groups.Count < 3)
                return false;
            int nResult = 0;

            if (int.TryParse(matchResult.Groups[1].ToString(), out nResult))
                month = nResult;
            else
                return false;
            if (int.TryParse(matchResult.Groups[2].ToString(), out nResult))
                day = nResult;
            else
                return false;

            if (int.TryParse(matchResult.Groups[3].ToString(), out nResult))
                year = nResult;
            else
                return false;

            return CommonDateValidation();
        }

        public bool ValidateEuropeanDate(Match matchResult)
        {
            if (matchResult.Groups.Count < 3)
                return false;
            int nResult = 0;

            // European order: the day comes first, then the month
            if (int.TryParse(matchResult.Groups[1].ToString(), out nResult))
                day = nResult;
            else
                return false;
            if (int.TryParse(matchResult.Groups[2].ToString(), out nResult))
                month = nResult;
            else
                return false;

            if (int.TryParse(matchResult.Groups[3].ToString(), out nResult))
                year = nResult;
            else
                return false;

            return CommonDateValidation();
        }

        private bool CommonDateValidation()
        {
            
            if (day == 31 && (month == 4 || month == 6 || month == 9 || month == 11))
            {
                return false; 
            }
            
            else if (day >= 30 && month == 2)
            {
                return false; 
            }
            
            else if (month == 2 && day == 29 && !(year % 4 == 0
                                && (year % 100 != 0 || year % 400 == 0)))
            {
                return false;
            }
            else
            {
                return true; 
            }
        }
    }
}

Finally, if we have a real match, we perform a search for the match string in the Word document by calling the DocumentEngine function, FindTextInDoc.

internal bool FindTextInDoc(OperationMode opMode, int currentParaNum,
         string textToFind, Word.WdColor color, int start, out int end,
         out int textStartPoint)
{
    string strFind = textToFind;
    textStartPoint = 0;

    
    Word.Range rngDoc = GetRng(currentParaNum);

    
    if (start >= rngDoc.End)
    {
        end = 0;
        return false;
    }
    rngDoc.Start = start;

    
    
    rngDoc.Find.ClearFormatting();
    rngDoc.Find.Forward = true;
    rngDoc.Find.Text = textToFind;

    
    object caseSensitive = "1";
    object missingValue = Type.Missing;

    
    object matchWildCards = Type.Missing;

    
    if (opMode == OperationMode.Word97Mode)
        matchWildCards = "1";

    
    // Arguments 2 and 4 are MatchCase and MatchWildcards
    rngDoc.Find.Execute(ref missingValue, ref caseSensitive,
        ref missingValue, ref matchWildCards, ref missingValue,
        ref missingValue, ref missingValue, ref missingValue,
        ref missingValue, ref missingValue, ref missingValue,
        ref missingValue, ref missingValue, ref missingValue,
        ref missingValue);

    
    if (hilightText)
        rngDoc.Select();

    end = rngDoc.End + 1;
    textStartPoint = rngDoc.Start;

    
    if (rngDoc.Find.Found)
    {
        rngDoc.Font.Color = color;
        
        return true;
    }
    return false;
}

Points of Interest

The DocumentEngine class makes use of Microsoft Office events in order to detect the situation when the user closes the Microsoft Word document that was loaded by the application. When the Quit event is invoked, the app and the document objects are set to NULL. They are reinitialized when the user opens a new document.

public DocumentEngine()
{
  app = new Word.Application();
  
  ((Word.ApplicationEvents4_Event)app).Quit += new Microsoft.Office.
  Interop.Word.ApplicationEvents4_QuitEventHandler(App_Quit);
}


private void App_Quit()
{
   app = null;
   theDoc = null;
}

This project can serve as the first step of a complex document processing application for Microsoft Word and RTF documents. Basically, everything that can be accomplished with Regular Expressions with ASCII or UNICODE files can now be done almost as easily for *.doc and *.rtf files. In my next article, I will show how, by means of dynamic assemblies, we can perform complex formatting using Regular Expressions.

For more online information on Microsoft Office Interop Assemblies, please refer to MSDN.

For Further Investigation

For those who would like to find out more about regular expressions and Microsoft Office automation, I recommend the following excellent books: Mastering Regular Expressions by Jeffrey E. F. Friedl, and Visual Studio Tools for Office: Using C# with Excel, Word, Outlook, and InfoPath by Eric Carter and Eric Lippert.

History

  • 13th June, 2008: First version
  • 14th June, 2008: Fixed the *.sln (solution files) so it is a bit tidier
  • 16th June, 2008: Added a ColorCheckedBoxList component (subclassed from CheckedListBox) so that one can see which color corresponds to which Regular Expression match.
    Drag and drop functionality was also added.
Anyone who has ever worked with wildcards knows that they are a rather poor attempt to give VBA a mechanism similar to the regular expressions found in other, more developed languages. Besides being less capable (not to mention that a «zero or one» quantifier cannot be expressed), wildcards also limit the complexity of the expressions you can build, and those who have tried to solve moderately complex tasks have more than once run into the error «The Find What text contains a Pattern Match expression which is too complex». Hence the need for a more powerful tool: regular expressions.

Dim objRegExp, matches, match

Set objRegExp = CreateObject("VBScript.RegExp")

With objRegExp
    .Global = True
    .IgnoreCase = False
    .Pattern = "pattern"
End With

At this point every coder will rejoice: just call Replace and you are done!

Set matches = objRegExp.Replace(ActiveDocument.Content, "replacementstring")

But running it will, of course, raise an error. The Replace method of the VBScript.RegExp object takes a string as its first argument, not an object (ActiveDocument.Content in our case), and it returns the modified string rather than changing the input in place. Hence the workaround:

Set matches = objRegExp.Execute(ActiveDocument.Content.Text)

For Each match In matches
    ' Search the whole document for the text of this match
    Set matchRange = ActiveDocument.Content
    With matchRange.Find
        .Text = match.Value
        .Replacement.Text = "replacementstring"
        .MatchWholeWord = True
        .MatchCase = True
        .Forward = True
        .Wrap = wdFindStop
        .Format = False
        .MatchAllWordForms = False
        .MatchSoundsLike = False
        .MatchWildcards = False
        .Execute Replace:=wdReplaceOne
    End With
Next

Fine, you may say, but what if the data has to be reformatted with expressions such as $1-$3-$2 (so-called backreferences), for example to turn 926-5562214 into +7 (926) 556-22-14? That is simple too, because backreferences are supported here as well. The only difference is that group numbering starts at one ($1), not zero. Let us set the document aside for a moment and see how this works with an ordinary string variable:

Dim objRegExp, matches, match

Set objRegExp = CreateObject("VBScript.RegExp")

Dim strSearch As String
Dim strResult As String
strSearch = "Suppose we have several phone numbers: 8495-3584512, 8800-4852620 and, for example, 8950-5628585"

With objRegExp
    .Global = True
    .IgnoreCase = False

    .Pattern = "8(\d{3})-(\d{3})(\d{2})(\d{2})"

End With

strResult = objRegExp.Replace(strSearch, "+7 ($1) $2-$3-$4")

Debug.Print strResult

Note:

The .Pattern line is set apart to show how the number is split into subgroups ($2, $3 and $4): the expression (\d{3})(\d{2})(\d{2}) matches the same text as (\d{7}), but with the latter a single backreference would contain all 7 digits.

Study regular expressions!
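The effect of splitting the seven digits into three groups is easy to check on a plain string. The sketch below uses Python's re module as a stand-in for VBScript.RegExp (an assumption of this example), so the backreferences are written \1 to \4 rather than $1 to $4; the shortened sample sentence is made up.

```python
import re

text = "Phones: 8495-3584512, 8800-4852620 and 8950-5628585"

# Group 1 is the area code; groups 2, 3 and 4 split the
# seven-digit number as 3 + 2 + 2 digits.
pattern = r"8(\d{3})-(\d{3})(\d{2})(\d{2})"

formatted = re.sub(pattern, r"+7 (\1) \2-\3-\4", text)
print(formatted)

# With a single (\d{7}) group there is nothing to put dashes between.
unsplit = re.sub(r"8(\d{3})-(\d{7})", r"+7 (\1) \2", text)
print(unsplit)
```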

But since, as noted above, we have the ActiveDocument.Content object instead of an input string, this method will not work on the document directly. We have to resort to a trick and combine the two previous snippets:

Set objRegExp = CreateObject("VBScript.RegExp")

With objRegExp
    .Global = True
    .IgnoreCase = False
    .Pattern = "8(\d{3})-(\d{3})(\d{2})(\d{2})"
End With

Set matches = objRegExp.Execute(ActiveDocument.Content.Text)

Dim strReplacement As String

For Each match In matches
    Set matchRange = ActiveDocument.Content

    ' Build the replacement text for this particular match
    strReplacement = objRegExp.Replace(match.Value, "+7 ($1) $2-$3-$4")

    With matchRange.Find
        .Text = match.Value
        .Replacement.Text = strReplacement
        .MatchWholeWord = True
        .MatchCase = True
        .Forward = True
        .Wrap = wdFindStop
        .Format = False
        .MatchAllWordForms = False
        .MatchSoundsLike = False
        .MatchWildcards = False
        .Execute Replace:=wdReplaceOne
    End With
Next

Wrap it in a procedure and, voilà:

Sub ВыполнитьГруппуПреобразований_RegExp()
    Call Выполнить_RegExp("8(\d{3})-(\d{3})(\d{2})(\d{2})", "+7 ($1) $2-$3-$4")
End Sub

Private Sub Выполнить_RegExp(pattern As String, patternExpr As String)
    Set objRegExp = CreateObject("VBScript.RegExp")

    With objRegExp
        .Global = True
        .IgnoreCase = False
        .Pattern = pattern
    End With

    Set matches = objRegExp.Execute(ActiveDocument.Content.Text)

    Dim strReplacement As String

    For Each match In matches
        Set matchRange = ActiveDocument.Content

        ' Build the replacement text for this particular match
        strReplacement = objRegExp.Replace(match.Value, patternExpr)

        With matchRange.Find
            .Text = match.Value
            .Replacement.Text = strReplacement
            .MatchWholeWord = True
            .MatchCase = True
            .Forward = True
            .Wrap = wdFindStop
            .Format = False
            .MatchAllWordForms = False
            .MatchSoundsLike = False
            .MatchWildcards = False
            .Execute Replace:=wdReplaceOne
        End With
    Next
End Sub

It must be said that, unfortunately, the VBScript.RegExp object in VBA has some limitations in its regular expression syntax. These limitations trigger Run-time error ‘5017’: Application-defined or object-defined error. Here are some of them:

  • the start-of-text and end-of-text anchors \A and \Z are missing; the anchors ^ and $ can be used instead;
  • lookbehind assertions, (?<=…) and the negative form (?<!…), are missing (lookahead, (?=…) and (?!…), is supported);
  • a number of modifiers are missing.
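A missing lookbehind can usually be worked around with a capture group: match the prefix as well, then keep only the group. The sketch below shows the equivalence in Python, where both forms run; with VBScript.RegExp only the second form is expected to work. The sample sentence is the one from the question at the top of this page.

```python
import re

text = "fox 23, bear 1, fox 398, frog 12, fox 15"

# Lookbehind version: digits that are preceded by "fox ".
with_lookbehind = re.findall(r"(?<=fox )\d+", text)

# Portable version: match the prefix too and keep only group 1.
with_group = [m.group(1) for m in re.finditer(r"fox (\d+)", text)]

print(with_lookbehind)
print(with_group)
```

The trade-off is that the match itself now includes the prefix, so any replacement has to put the prefix back explicitly.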

How do I replace «Surname I.O.» with «I.O.Surname» throughout a document in one go?

Why do this?
Suppose you have many documents in which «Surname I.O.» appears a great many times. It might be a class list, or a textbook for which new formatting rules have been issued. You could sit and fix everything by hand, but that takes a very long time and invites human error (typos).
So the task is to replace «Surname I.O.» with «I.O.Surname».

How to use it?
To use regular expressions in the text editor window:
1) press Ctrl+F
2) go to the «Replace» tab
3) click the «More >>» button
4) tick the «Use wildcards» checkbox

In the «Find what» field enter: ([A-Я]{1;1}[a-я]{2;11}) ([A-Я]{1;1}) ([A-Я]{1;1})
Replace with: \2.\3.^s\1
() : groups, so that several different values can be output
[] : a range of letters
! : except; for example, [!А-Я] means everything except capital letters
{x;y} : repeat the preceding item from x to y times (written {x,y} where the system list separator is a comma)
\2 : take the second pair of parentheses, that is, move the first initial from second place to first
^s : a non-breaking space
the periods in the «Replace with» field are literal and join the parts together
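The same swap can be tried on a plain string before running it on a document. This is a sketch in Python's regex syntax, not Word's wildcard syntax, and it assumes that each initial in the input is followed by a period (the wildcard pattern above does not show the periods, so that part is a guess). The names are sample data.

```python
import re

NBSP = "\u00a0"  # the character that ^s inserts in Word

# Assumed input shape: capitalized surname, a space,
# then two initials, each followed by a period.
pattern = re.compile(r"([А-ЯЁ][а-яё]{2,11}) ([А-ЯЁ])\.([А-ЯЁ])\.")

def swap(text):
    """Move the initials in front of the surname."""
    return pattern.sub(r"\2.\3." + NBSP + r"\1", text)

swapped = swap("Иванов И.О., Петров П.С.")
print(swapped)
```

The non-breaking space keeps the initials and the surname on the same line when the text wraps.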
Alternatives?
By the way, LibreOffice Writer also has regular expressions. Unlike Microsoft, which named its creation «wildcards», the authors of the free editor labelled theirs plainly: «Regular expressions».

P.S. Information about these regular expressions is only available on the internet, since the help in both editors is online-only :(
Which regular expressions do you use?


I want to remove leading and trailing tags from country names.
In my example those tags are <li> and <a>.

<li><a href="http://afghanistan.makaan.com/">Afghanistan</a></li>
<li><a href="http://albanie.makaan.com/">Albanie</a></li>
<li><a href="http://algérie.makaan.com/">Algérie</a></li>

Result should be:

Afghanistan
Albanie
Algérie

In Microsoft Word, I want to use the Find and Replace feature with a regular expression to do this.

How can I use regular expressions in MS Word?
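Outside Word, the requested transformation amounts to deleting every tag. The sketch below shows that in Python as a reference for the expected result. In Word itself, a wildcard search for \<*\> with «Use wildcards» ticked and an empty replacement is the commonly suggested equivalent, though that is an assumption that has not been tested here.

```python
import re

lines = [
    '<li><a href="http://afghanistan.makaan.com/">Afghanistan</a></li>',
    '<li><a href="http://albanie.makaan.com/">Albanie</a></li>',
    '<li><a href="http://algérie.makaan.com/">Algérie</a></li>',
]

# "<", then anything that is not ">", then ">" matches one tag at a time.
TAG = re.compile(r"<[^>]+>")

names = [TAG.sub("", line) for line in lines]
print(names)
```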
