Word read by element

I am trying to read some specific text from a Word document using the Open XML SDK.
The structure of the document looks like this.

1.1.1 Title of the Chapter

Author’s notes: Some text here
sometimes Author’s notes are more than one line.

1.1.1.1 Title of the sub-chapter

Some text here

1.1.1.2 Title of the next sub-chapter

Some text here.

End of the Chapter

Note: Author’s note may not be there in every Chapter

My requirement: whenever a chapter contains "1.1.1.2 Title of the next sub-chapter", I need to get the text under that sub-chapter, the Title of the Chapter, and, if the Author’s notes are present, the text under them as well.

This is what I tried:

WordprocessingDocument myDoc = WordprocessingDocument.Open(wordfile, true);
MainDocumentPart mainPart = myDoc.MainDocumentPart;
//Some functions to get the title of the Chapters// 
IEnumerable<Wp.Paragraph> paraList = ParagraphsByStyleName(mainPart, paraStyle1, paraStyle2, paraStyle3);
var purposeParas = paraList.Where(p => Regex.Match(p.InnerText.ToUpper(), "TITLE OF THE NEXT SUB-CHAPTER").Success).ToList();
var myHeaders = purposeParas.Select(p => p.Parent).Distinct().ToList();

When I tried to iterate through myHeaders, each element gave me the entire document as its InnerText, so I had no way to get to the text I need.

So I tried this:

var purposeParas = paraList.Where(p => Regex.Match(p.InnerText.ToUpper(), "TITLE OF THE NEXT SUB-CHAPTER").Success).ToList();
var applicability = purposeParas.Select(p => p.NextSibling()).Distinct().ToList();
var myHeader1 = purposeParas.Select(p => p.PreviousSibling()).Distinct().ToList();
var myHeader2 = myHeader1.Select(p => p.PreviousSibling()).Distinct().ToList();
var myHeader3 = myHeader2.Select(p => p.PreviousSibling()).Distinct().ToList();
var myHeaders = myHeader3.Select(p => p.PreviousSibling()).Distinct().ToList();

This way I was able to read some Chapters. But when the Author’s notes were more than one line long, this method failed.
Any suggestions will be greatly appreciated.
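
One possible approach, sketched below, is to stop counting siblings and instead walk the body paragraphs once in document order, tracking which chapter and which section the current paragraph belongs to. This is only a sketch under assumptions: the style IDs "Heading3" and "Heading4" are guesses for the chapter and sub-chapter styles, the Chapter and ChapterReader types are hypothetical names, and every body paragraph between a chapter title and its first sub-chapter is treated as Author’s notes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

// Hypothetical container for what the question asks for.
public sealed class Chapter
{
    public string Title { get; set; }
    public List<string> AuthorNotes { get; } = new List<string>();
    public List<string> TargetText { get; } = new List<string>();
    public bool HasTarget { get; set; }
}

public static class ChapterReader
{
    // Assumed style IDs; inspect the w:pStyle values in your own document.
    private const string ChapterStyle = "Heading3";
    private const string SubChapterStyle = "Heading4";

    private enum Section { None, Notes, Target }

    public static List<Chapter> Read(string path, string targetTitle)
    {
        var chapters = new List<Chapter>();
        Chapter current = null;
        var section = Section.None;

        using (var doc = WordprocessingDocument.Open(path, false))
        {
            // Direct children of the body, in document order
            // (paragraphs nested inside tables are not visited).
            foreach (var p in doc.MainDocumentPart.Document.Body.Elements<Paragraph>())
            {
                string style = p.ParagraphProperties?.ParagraphStyleId?.Val?.Value;
                string text = p.InnerText.Trim();
                if (text.Length == 0) continue;

                if (style == ChapterStyle)
                {
                    // A new chapter starts; text that follows is notes until a sub-chapter begins.
                    current = new Chapter { Title = text };
                    chapters.Add(current);
                    section = Section.Notes;
                }
                else if (current == null)
                {
                    continue; // text before the first chapter heading
                }
                else if (style == SubChapterStyle)
                {
                    bool isTarget = text.IndexOf(targetTitle, StringComparison.OrdinalIgnoreCase) >= 0;
                    if (isTarget) current.HasTarget = true;
                    section = isTarget ? Section.Target : Section.None;
                }
                else if (section == Section.Notes)
                {
                    current.AuthorNotes.Add(text);
                }
                else if (section == Section.Target)
                {
                    current.TargetText.Add(text);
                }
            }
        }

        // Keep only the chapters that actually contain the wanted sub-chapter.
        return chapters.Where(c => c.HasTarget).ToList();
    }
}
```

A call such as ChapterReader.Read(wordfile, "TITLE OF THE NEXT SUB-CHAPTER") would then return one entry per matching chapter, however many lines the notes span. Paragraphs that follow the target sub-chapter without a heading style of their own (for example an "End of the Chapter" line) would also land in TargetText, so an extra stop condition may be needed for real documents.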


  •  I have a Word document that contains a table.
     I need to be able to read this data into my C# ASP.NET application.

     Does anyone know how to successfully read a Word document?

    Many thanks.


    shobha

    • Moved by

      Tuesday, December 1, 2009 12:05 PM
      Moving thread to correct Category. OP Wrongly posted in SQL Server Section (From:Database Mirroring)

Answers

  • Hello Shobha,

    As far as I know, you need to add a reference to the Microsoft Word 12.0 (or 11.0) Object Library on the COM tab. If this library is not listed on the COM tab, the Office 2007 or 2003 PIA (Primary Interop Assemblies) needs to be installed on your computer. You can download the Office PIA from the Microsoft Download Center. Here are the links: Office 2007 PIA and Office 2003 PIA. Then use code like this:

                Word.Application wordApp = new Word.Application();
                object filename = @"C:\Temp\11.docx";
                object missing = Type.Missing;
                Word.Document doc = wordApp.Documents.Open(ref filename, ref missing, ref missing, ref missing, ref missing,
                    ref missing, ref missing, ref missing, ref missing, ref missing,
                    ref missing, ref missing, ref missing, ref missing, ref missing, ref missing);
                // Word collections are 1-based, so the first table is Tables[1]; change the index to pick another table.
                Word.Table table = doc.Tables[1];

                doc.Close(ref missing, ref missing, ref missing);
                // Quit Word before releasing the COM object; otherwise the WINWORD process keeps running.
                wordApp.Quit(ref missing, ref missing, ref missing);
                Marshal.ReleaseComObject(wordApp);

    Best regards,
    Bessie



    • Proposed as answer by
      Geert van Horrik
      Monday, December 7, 2009 12:05 PM
    • Marked as answer by
      Bessie Zhao
      Thursday, December 10, 2009 9:44 AM
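
The answer above stops once it has the table object. As a rough sketch of the remaining step, the code below reads the text of each cell through the same Word interop library. It assumes C# 4 or later (so the ref and optional COM arguments can be omitted), takes the document path from the first command-line argument, and assumes a table without vertically merged cells, since enumerating rows can fail on such tables. The class name WordTableDump is hypothetical.

```csharp
using System;
using System.Runtime.InteropServices;
using Word = Microsoft.Office.Interop.Word;

static class WordTableDump
{
    static void Main(string[] args)
    {
        var wordApp = new Word.Application();
        Word.Document doc = null;
        try
        {
            // Open read-only; args[0] is the full path to the document.
            doc = wordApp.Documents.Open(args[0], ReadOnly: true);

            // Word collections are 1-based: Tables[1] is the first table.
            Word.Table table = doc.Tables[1];

            foreach (Word.Row row in table.Rows)
            {
                foreach (Word.Cell cell in row.Cells)
                {
                    // Cell text ends with an end-of-cell marker ("\r\a"); strip it.
                    string text = cell.Range.Text.TrimEnd('\r', '\a');
                    Console.Write(text + "\t");
                }
                Console.WriteLine();
            }
        }
        finally
        {
            // Close the document and quit Word so no WINWORD process is left behind.
            if (doc != null) ((Word._Document)doc).Close(SaveChanges: false);
            ((Word._Application)wordApp).Quit(SaveChanges: false);
            Marshal.ReleaseComObject(wordApp);
        }
    }
}
```

Automating Word this way requires Word to be installed on the machine that runs the code, which is worth checking before choosing it for an ASP.NET server.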

With GemBox.Document you can open and read many Word file formats (like DOCX, DOC, RTF, ODT and HTML) in the same manner. The documents can be loaded using one of the DocumentModel.Load methods from your C# and VB.NET application. These methods enable you to work with a physical file (when providing the file’s path) or with an in-memory file (when providing the file’s Stream).

You can specify the format of your Word file by providing an object from the LoadOptions derived class (like DocxLoadOptions, DocLoadOptions, RtfLoadOptions, and HtmlLoadOptions). Or you can let GemBox.Document choose the appropriate options for you when opening the file by omitting the LoadOptions.

The following example shows the easiest way to read the document’s text from a Word file.

Opening and reading Word document's text in C# and VB.NET

using System;
using System.Linq;
using GemBox.Document;

class Program
{
    static void Main()
    {
        // If using the Professional version, put your serial key below.
        ComponentInfo.SetLicense("FREE-LIMITED-KEY");

        // Load Word document from file's path.
        var document = DocumentModel.Load("%InputFileName%");

        // Get Word document's plain text.
        string text = document.Content.ToString();

        // Get Word document's count statistics.
        int charactersCount = text.Replace(Environment.NewLine, string.Empty).Length;
        int wordsCount = document.Content.CountWords();
        int paragraphsCount = document.GetChildElements(true, ElementType.Paragraph).Count();
        int pageCount = document.GetPaginator().Pages.Count;

        // Display file's count statistics.
        Console.WriteLine($"Characters count: {charactersCount}");
        Console.WriteLine($"     Words count: {wordsCount}");
        Console.WriteLine($"Paragraphs count: {paragraphsCount}");
        Console.WriteLine($"     Pages count: {pageCount}");
        Console.WriteLine();

        // Display file's text content.
        Console.WriteLine(text);
    }
}
Imports System
Imports System.Linq
Imports GemBox.Document

Module Program

    Sub Main()

        ' If using the Professional version, put your serial key below.
        ComponentInfo.SetLicense("FREE-LIMITED-KEY")

        ' Load Word document from file's path.
        Dim document = DocumentModel.Load("%InputFileName%")

        ' Get Word document's plain text.
        Dim text As String = document.Content.ToString()

        ' Get Word document's count statistics.
        Dim charactersCount As Integer = text.Replace(Environment.NewLine, String.Empty).Length
        Dim wordsCount As Integer = document.Content.CountWords()
        Dim paragraphsCount As Integer = document.GetChildElements(True, ElementType.Paragraph).Count()
        Dim pageCount As Integer = document.GetPaginator().Pages.Count

        ' Display file's count statistics.
        Console.WriteLine($"Characters count: {charactersCount}")
        Console.WriteLine($"     Words count: {wordsCount}")
        Console.WriteLine($"Paragraphs count: {paragraphsCount}")
        Console.WriteLine($"     Pages count: {pageCount}")
        Console.WriteLine()

        ' Display file's text content.
        Console.WriteLine(text)

    End Sub
End Module

Reading Word document’s elements

Besides reading the text of the whole document, you can also read just some part of it, like a specific Section element or HeaderFooter element. Each element has a Content property with which you can extract its text via the Content.ToString method.

The following example shows how you can open a document and traverse through all Paragraph elements and their child Run elements, and read their text and formatting. To read more about the visual information of the content elements, see the Formattings and Styles help page.

Opening and reading Word document's text and formatting in C# and VB.NET

using System;
using System.IO;
using System.Linq;
using GemBox.Document;

class Program
{
    static void Main()
    {
        // If using the Professional version, put your serial key below.
        ComponentInfo.SetLicense("FREE-LIMITED-KEY");

        var document = DocumentModel.Load("%InputFileName%");
        using (var writer = File.CreateText("Output.txt"))
        {
            // Iterate through all Paragraph elements in the Word document.
            foreach (Paragraph paragraph in document.GetChildElements(true, ElementType.Paragraph))
            {
                // Iterate through all Run elements in the Paragraph element.
                foreach (Run run in paragraph.GetChildElements(true, ElementType.Run))
                {
                    string text = run.Text;
                    CharacterFormat format = run.CharacterFormat;

                    // Replace text with bold formatting to 'Mathematical Bold Italic' Unicode characters.
                    // For instance, "ABC" to "𝑨𝑩𝑪".
                    if (format.Bold)
                    {
                        text = string.Concat(text.Select(
                            c => c >= 'A' && c <= 'Z' ? char.ConvertFromUtf32(119847 + c) :
                                 c >= 'a' && c <= 'z' ? char.ConvertFromUtf32(119841 + c) :
                                 c.ToString()));
                    }

                    writer.Write(text);
                }

                writer.WriteLine();
            }
        }
    }
}
Imports System
Imports System.IO
Imports System.Linq
Imports GemBox.Document

Module Program

    Sub Main()

        ' If using the Professional version, put your serial key below.
        ComponentInfo.SetLicense("FREE-LIMITED-KEY")

        Dim document = DocumentModel.Load("%InputFileName%")
        Using writer = File.CreateText("Output.txt")

            ' Iterate through all Paragraph elements in the Word document.
            For Each paragraph As Paragraph In document.GetChildElements(True, ElementType.Paragraph)

                ' Iterate through all Run elements in the Paragraph element.
                For Each run As Run In paragraph.GetChildElements(True, ElementType.Run)

                    Dim text As String = run.Text
                    Dim format As CharacterFormat = run.CharacterFormat

                    ' Replace text with bold formatting to 'Mathematical Bold Italic' Unicode characters.
                    ' For instance, "ABC" to "𝑨𝑩𝑪".
                    If format.Bold Then
                        text = String.Concat(text.Select(
                            Function(c)
                                Return If(c >= "A"c AndAlso c <= "Z"c, Char.ConvertFromUtf32(119847 + AscW(c)),
                                       If(c >= "a"c AndAlso c <= "z"c, Char.ConvertFromUtf32(119841 + AscW(c)),
                                       c.ToString()))
                            End Function))
                    End If

                    writer.Write(text)
                Next

                writer.WriteLine()
            Next
        End Using

    End Sub
End Module

By combining these two examples you can achieve various tasks, like selecting only the Table elements and reading their text content, or selecting only the Picture elements and extracting their images, or reading the Run.Text property of only the highlighted elements (the ones that have CharacterFormat.HighlightColor).
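
As a minimal sketch of the first of those tasks, the code below selects only the Table elements and prints their text. It reuses only calls that appear in the examples above (GetChildElements, ElementType, and Content.ToString); the file name "Input.docx" is a placeholder, not a file shipped with the component.

```csharp
using System;
using GemBox.Document;

class Program
{
    static void Main()
    {
        // If using the Professional version, put your serial key below.
        ComponentInfo.SetLicense("FREE-LIMITED-KEY");

        // "Input.docx" is a placeholder path for illustration.
        var document = DocumentModel.Load("Input.docx");

        int index = 0;

        // Visit every Table element in the document, including nested ones.
        foreach (var table in document.GetChildElements(true, ElementType.Table))
        {
            index++;
            Console.WriteLine($"Table {index}:");

            // Each element exposes its text through the Content property.
            Console.WriteLine(table.Content.ToString());
        }
    }
}
```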

Reading Word document’s pages

Word files (DOCX, DOC, RTF, HTML, etc.) don’t have a page concept, which means they don’t contain information about how many pages they occupy nor which element is on which page.

They are of a flow document type and their content is written in a flow-able manner. The page concept is specific to the Word application(s) that renders or displays the document.

On the other hand, files of fixed document type (PDF, XPS, etc.) do have a page concept. Their content is fixed: it’s defined on which exact page location the elements are rendered.

GemBox.Document uses its rendering engine to paginate and render the document’s content when saving to PDF, XPS, or image format. So, the best and the easiest way to read the text content of some specific page is to convert a Word document to a PDF file (or save a specific Word page as a PDF) with GemBox.Document and then read the PDF page’s text content with our other component, GemBox.Pdf.

Nevertheless, the following example shows how you can use GemBox.Document’s rendering engine to retrieve each document page as a FrameworkElement object from the WPF framework and then extract text from it using the provided FrameworkElement.ToText extension method.

Opening and reading Word document's page in C# and VB.NET

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Windows;
using System.Windows.Media;
using GemBox.Document;

class Program
{
    [STAThread]
    static void Main()
    {
        // If using the Professional version, put your serial key below.
        ComponentInfo.SetLicense("FREE-LIMITED-KEY");

        var document = DocumentModel.Load("Reading.docx");
        var pages = document.GetPaginator().Pages;

        for (int i = 0, count = pages.Count; i < count; ++i)
        {
            Console.WriteLine(new string('-', 50));
            Console.WriteLine($"Page {i + 1} of {count}");
            Console.WriteLine(new string('-', 50));

            // Get FrameworkElement object from Word document's page.
            DocumentModelPage page = pages[i];
            FrameworkElement pageContent = page.PageContent;

            // Extract text from FrameworkElement object.
            Console.WriteLine(pageContent.ToText());
        }
    }
}

/// <summary>
/// Contains methods that are used to extract text out of a FrameworkElement object.
/// </summary>
public static class GemBoxDocumentHelper
{
    public static string ToText(this FrameworkElement root)
    {
        var builder = new StringBuilder();

        foreach (var visual in root.GetSelfAndDescendants().OfType<DrawingVisual>())
        {
            GlyphRun previousRun = null;

            // Order runs first vertically (Y), then horizontally (X).
            foreach (var currentRun in visual.Drawing
                .GetSelfAndDescendants()
                .OfType<GlyphRunDrawing>()
                .Select(glyph => glyph.GlyphRun)
                .OrderBy(run => run.BaselineOrigin.Y)
                .ThenBy(run => run.BaselineOrigin.X))
            {
                if (previousRun != null)
                {
                    // If base-line of current text segment is left from base-line of previous text segment, then assume that it is new line.
                    if (currentRun.BaselineOrigin.X <= previousRun.BaselineOrigin.X)
                    {
                        builder.AppendLine();
                    }
                    else
                    {
                        Rect currentRect = currentRun.ComputeAlignmentBox();
                        Rect previousRect = previousRun.ComputeAlignmentBox();

                        double spaceWidth = currentRun.BaselineOrigin.X + currentRect.Left - previousRun.BaselineOrigin.X - previousRect.Right;
                        double spaceHeight = (currentRect.Height + previousRect.Height) / 2;

                        // If space between successive text segments has width greater than a sixth of its height, then assume that it is a word (add a space).
                        if (spaceWidth > spaceHeight / 6)
                            builder.Append(' ');
                    }
                }

                builder.Append(currentRun.Characters.ToArray());
                previousRun = currentRun;
            }
        }

        return builder.ToString();
    }

    private static IEnumerable<DependencyObject> GetSelfAndDescendants(this DependencyObject parent)
    {
        yield return parent;

        for (int i = 0, count = VisualTreeHelper.GetChildrenCount(parent); i < count; i++)
            foreach (var descendant in VisualTreeHelper.GetChild(parent, i).GetSelfAndDescendants())
                yield return descendant;
    }

    private static IEnumerable<Drawing> GetSelfAndDescendants(this DrawingGroup parent)
    {
        yield return parent;

        foreach (var child in parent.Children)
        {
            var drawings = child as DrawingGroup;
            if (drawings != null)
                foreach (var descendant in drawings.GetSelfAndDescendants())
                    yield return descendant;
            else
                yield return child;
        }
    }
}
Imports System
Imports System.Collections.Generic
Imports System.Linq
Imports System.Text
Imports System.Windows
Imports System.Windows.Media
Imports GemBox.Document

Module Program

    <STAThread>
    Sub Main()

        ' If using the Professional version, put your serial key below.
        ComponentInfo.SetLicense("FREE-LIMITED-KEY")

        Dim document = DocumentModel.Load("Reading.docx")
        Dim pages = document.GetPaginator().Pages
        Dim count = pages.Count

        For i = 0 To count - 1
            Console.WriteLine(New String("-"c, 50))
            Console.WriteLine($"Page {i + 1} of {count}")
            Console.WriteLine(New String("-"c, 50))

            ' Get FrameworkElement object from Word document's page.
            Dim page As DocumentModelPage = pages(i)
            Dim pageContent As FrameworkElement = page.PageContent

            ' Extract text from FrameworkElement object.
            Console.WriteLine(pageContent.ToText())
        Next

    End Sub
End Module

''' <summary>
''' Contains methods that are used to extract text out of a FrameworkElement object.
''' </summary>
Module GemBoxDocumentHelper
    <Runtime.CompilerServices.Extension>
    Function ToText(ByVal root As FrameworkElement) As String
        Dim builder As New StringBuilder()

        For Each visual In root.GetSelfAndDescendants().OfType(Of DrawingVisual)()
            Dim previousRun As GlyphRun = Nothing

            ' Order runs first vertically (Y), then horizontally (X).
            For Each currentRun In visual.Drawing _
                .GetSelfAndDescendants() _
                .OfType(Of GlyphRunDrawing)() _
                .Select(Function(glyph) glyph.GlyphRun) _
                .OrderBy(Function(run) run.BaselineOrigin.Y) _
                .ThenBy(Function(run) run.BaselineOrigin.X)

                If previousRun IsNot Nothing Then
                    ' If base-line of current text segment is left from base-line of previous text segment, then assume that it is new line.
                    If currentRun.BaselineOrigin.X <= previousRun.BaselineOrigin.X Then
                        builder.AppendLine()
                    Else
                        Dim currentRect As Rect = currentRun.ComputeAlignmentBox()
                        Dim previousRect As Rect = previousRun.ComputeAlignmentBox()

                        Dim spaceWidth As Double = currentRun.BaselineOrigin.X + currentRect.Left - previousRun.BaselineOrigin.X - previousRect.Right
                        Dim spaceHeight As Double = (currentRect.Height + previousRect.Height) / 2

                        ' If space between successive text segments has width greater than a sixth of its height, then assume that it is a word (add a space).
                        If spaceWidth > spaceHeight / 6 Then builder.Append(" "c)
                    End If
                End If

                builder.Append(currentRun.Characters.ToArray())
                previousRun = currentRun
            Next
        Next

        Return builder.ToString()
    End Function

    <Runtime.CompilerServices.Extension>
    Private Iterator Function GetSelfAndDescendants(ByVal parent As DependencyObject) As IEnumerable(Of DependencyObject)
        Yield parent

        Dim count = VisualTreeHelper.GetChildrenCount(parent)
        For i = 0 To count - 1
            For Each descendant In VisualTreeHelper.GetChild(parent, i).GetSelfAndDescendants()
                Yield descendant
            Next
        Next
    End Function

    <Runtime.CompilerServices.Extension>
    Private Iterator Function GetSelfAndDescendants(ByVal parent As DrawingGroup) As IEnumerable(Of Drawing)
        Yield parent

        For Each child In parent.Children
            Dim drawings = TryCast(child, DrawingGroup)
            If drawings IsNot Nothing Then
                For Each descendant In drawings.GetSelfAndDescendants()
                    Yield descendant
                Next
            Else
                Yield child
            End If
        Next
    End Function
End Module

GemBox.Document is a .NET component that enables you to read, write, edit, convert, and print document files from your .NET applications using one simple API. How about testing it today?


Published: September 14, 2018 | Modified: December 19, 2022 | Author: Mario Zorica

Office Open XML

Office Open XML is a document format introduced by Microsoft. With the Open XML SDK you can work with such documents from code, for example to read and write MS Word documents.

Prerequisites

  • Visual Studio
  • MS Word Document
  • Open XML SDK

Execute the command below to install the DocumentFormat.OpenXml SDK in your project:

  Install-Package DocumentFormat.OpenXml
Note

In this example, I am using a Windows Forms application to interact with the Word document and display the result on the screen.

How to find the first table in a Word document?

  Table table = doc.MainDocumentPart.Document.Body.Elements<Table>().First();

Here .First() is an extension method that returns the first table in the Word document.

The WordprocessingDocument Methods

EXTENSION METHOD                               DESCRIPTION
.Elements<Table>().First()                     Gets the first table in the Word document.
.Elements<Table>().Last()                      Gets the last table in the Word document.
.Elements<Table>().FirstOrDefault()            Gets the first table in the Word document, or the default value (null) if the document contains no table.
.Elements<Table>().LastOrDefault()             Gets the last table in the Word document, or the default value (null) if the document contains no table.
.Elements<Table>().ElementAt(index)            Gets the table at the given index. The index is zero-based.
Table.Elements<TableRow>().ElementAt(index)    Gets the row at the given index in the selected table. The index is zero-based.
Row.Elements<TableCell>().ElementAt(index)     Gets the cell at the given index in the selected row. The index is zero-based.

 

Execute the following code to read the first table from the Word document:

using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;
using System.Windows.Forms;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

namespace ReadTable
{
    public partial class Form1 : Form
    {
        public Form1()
        {
            InitializeComponent();
        }

        // Let the user pick the Word document to read.
        private void btnBrowse_Click(object sender, EventArgs e)
        {
            DialogResult result = this.openDB.ShowDialog();
            if (result == DialogResult.OK)
            {
                txtBrowse.Text = openDB.FileName;
            }
        }

        private void btnReadTable_Click(object sender, EventArgs e)
        {
            // Open the document read-only.
            using (var doc = WordprocessingDocument.Open(txtBrowse.Text.Trim(), false))
            {
                // Holds the table contents for display.
                DataTable dt = new DataTable();
                int rowCount = 0;

                // Find the first table in the document.
                Table table = doc.MainDocumentPart.Document.Body.Elements<Table>().First();

                // Get all rows of that table.
                IEnumerable<TableRow> rows = table.Elements<TableRow>();

                // The first row supplies the column names; the rest are data rows.
                foreach (TableRow row in rows)
                {
                    if (rowCount == 0)
                    {
                        foreach (TableCell cell in row.Descendants<TableCell>())
                        {
                            dt.Columns.Add(cell.InnerText);
                        }
                        rowCount += 1;
                    }
                    else
                    {
                        dt.Rows.Add();
                        int i = 0;
                        foreach (TableCell cell in row.Descendants<TableCell>())
                        {
                            dt.Rows[dt.Rows.Count - 1][i] = cell.InnerText;
                            i++;
                        }
                    }
                }

                // Show the result in the grid.
                dgvTable.DataSource = dt;
            }
        }
    }
}

Output

In the above example, I used the .First() extension method to get the first table from the Word document. You can use the other extension methods to find a specific table, or use a for/foreach loop over the tables to find the one you need.
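
As an illustration of looking up a specific table, row, and cell by index, here is a small sketch. It uses LINQ's ElementAtOrDefault rather than ElementAt so that an out-of-range index yields null instead of an exception; the TableLookup class and ReadCell method are hypothetical names, and all indexes are zero-based.

```csharp
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

public static class TableLookup
{
    // Returns the text of one cell, or null if the table, row, or cell does not exist.
    public static string ReadCell(string path, int tableIndex, int rowIndex, int cellIndex)
    {
        using (var doc = WordprocessingDocument.Open(path, false))
        {
            var body = doc.MainDocumentPart.Document.Body;

            // Each lookup returns null when the index is out of range.
            Table table = body.Elements<Table>().ElementAtOrDefault(tableIndex);
            TableRow row = table?.Elements<TableRow>().ElementAtOrDefault(rowIndex);
            TableCell cell = row?.Elements<TableCell>().ElementAtOrDefault(cellIndex);

            return cell?.InnerText;
        }
    }
}
```

For example, TableLookup.ReadCell(path, 1, 0, 2) would read the third cell of the first row of the second table, assuming the document has one.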

Summary

In this article, I discussed how to read a table from a Word document using C#. I hope it helps you read tables from your own Word documents.

This article is for people with visual or cognitive impairments who use a screen reader program such as Microsoft’s Narrator, JAWS, or NVDA with the Microsoft 365 products. This article is part of the Microsoft 365 screen reader support content set where you can find more accessibility information on our apps. For general help, visit Microsoft Support home or Fixes or workarounds for recent office issues.

Use Word with your keyboard and a screen reader to explore and navigate the different views and move between them. We’ve tested it with Narrator, JAWS, and NVDA, but it might work with other screen readers as long as they follow common accessibility standards and techniques.

Need instructions on how to get started with Word, but not using a screen reader? See Word help & learning.

Notes: 

  • New Microsoft 365 features are released gradually to Microsoft 365 subscribers, so your app might not have these features yet. To learn how you can get new features faster, join the Office Insider program.

  • To learn more about screen readers, go to How screen readers work with Microsoft 365.

In this topic

  • Navigate the main view

  • Navigate between views

  • Explore a document

  • Use Search

Navigate the main view

When you open a Word document for editing, you land on the main view. To cycle between the elements in the main view, press F6 (forward) or Shift+F6 (backward). The main elements are, in order:

  • The main content area, which shows the document content. You hear the name of the document, followed by "Editing," when the focus is on the document content.

  • The status bar at the bottom of the screen where you can find document statistics such as page count, word count, text language, and zoom level. You hear the current page number when the focus is on the status bar, for example, "Page two of three."

    • To navigate the status bar, use the Right and Left arrow keys.

  • The row of ribbon tabs, which includes tabs such as File, Home, Insert, Review, View, and the Share and Comments buttons. The ribbon containing the options specific to the currently selected tab is located below the row of ribbon tabs.

    • When the focus moves to the ribbon tabs, you hear "Ribbon tabs," followed by the currently selected tab. To navigate the row of ribbon tabs, press the Left or Right arrow key until you hear the name of the tab or control you want, and press Enter to select it.

    • To navigate from the row of ribbon tabs to the ribbon, press the Tab key or the Down arrow key once. You hear the name of the first option on the ribbon. To navigate between options on the ribbon, press the Tab key, Shift+Tab, or the Right or Left arrow key. You can also use keyboard shortcuts to select options directly. For the ribbon keyboard shortcuts, refer to Use the keyboard to work with the ribbon in Word.

Navigate between views

In addition to the main view, Word has the following commonly used views and areas:

The File menu

The File menu contains commands such as New, Open, and Save a Copy. You can also access your accounts and the app settings from the File menu. The File menu consists of a tab pane on the left and the contents of a selected tab on the right.

  • To open the File menu, press Alt+F. You hear: "File, home." The focus is on the Home tab in the tab pane.

  • To navigate between the tabs in the tab pane, press the Up or Down arrow key until you hear the tab you want to open, for example, "New." Press Enter to open the tab. The content pane of the selected tab opens to the right of the tab pane. To move the focus to the content pane, press the Tab key once.

  • To navigate within a tab content pane, press the Tab key, Shift+Tab, or the arrow keys.

  • To exit the File menu and return to the main view, press Esc.

For the File menu keyboard shortcuts, refer to Keyboard shortcuts for the File menu in Microsoft 365 for Windows.

The Quick Access Toolbar

The Quick Access Toolbar and title bar at the top of the screen contain the document name, buttons for AutoSave, ribbon display options, and your account, and buttons for minimizing, restoring the size, and closing the active window.

  • To navigate to the Quick Access Toolbar and title bar, press Alt once. You hear: "Ribbon tabs." Then press Shift+Tab once. The focus is now on the Quick Access Toolbar.

  • To browse the available options, press Shift+Tab repeatedly. You can add and remove Quick Access Toolbar buttons and change their order on the toolbar. For instructions, refer to Use a keyboard to customize the Quick Access Toolbar.

The Options window

The Options window contains Word settings, such as personalization, proofreading, and language preferences. The Options window consists of an options category pane on the left and the content pane of the selected category on the right.

  • To open the Options window, press Alt+F, T. You hear: "Word options."

  • To navigate the options categories, press the Down arrow key until you hear the name of the category you want, then press the Tab key to move the focus to the content pane.

  • To navigate the content pane, press the Tab key, Shift+Tab, or the Up and Down arrow keys.

  • To exit the Options window and return to the main view, press the Tab key until you hear "OK," and press Enter. To return to the main view without making changes, press Esc.

Explore a document

Use the Narrator scan mode

To navigate the content of your document by elements, you can use the Narrator scan mode. To turn on the scan mode, press the SR key+Spacebar.

With the scan mode enabled, you can use the Up and Down arrow keys and keyboard shortcuts to navigate your document and cycle between paragraphs, other elements, areas, and landmarks. For detailed information on how to use the Narrator scan mode, refer to Chapter 3: Using scan mode.

To find the JAWS cursor that suits your needs, refer to So Many Cursors, So Little Time, Understanding Cursors in JAWS. To learn how to use the NVDA Browse mode which is also optionally available for Word, refer to 6. Browse Mode.

Use the Navigation pane

You can use the Navigation pane to quickly navigate between parts of the document such as headings or graphics.

  1. To turn on the Navigation pane, press Alt+W, K. You hear: «Navigation, search document, edit box.»

  2. Do one of the following:

    • To navigate the headings in the document, press the Tab key until you hear «Heading tab item,» press the Tab key until you hear the heading you want, and then press Enter. The focus moves to the beginning of the heading row in the document body.

    • To navigate by certain elements in your document, such as graphics, press the Tab key until you hear «Search, split button,» and then press Alt+Down arrow key to expand the menu. Press the Down arrow key until you hear the element you want, for example, «Graphic,» and then press Enter to select. The focus moves to the next result button. Press Enter repeatedly to move through the results.

  3. To close the Navigation pane, press Alt+W, K.

Use Read Mode

Read Mode is designed to make reading text easier and includes reading tools such as Read Aloud.

  1. To enable Read Mode, press Alt+W, F.

  2. Do one or more of the following:

    • To access the Read Mode toolbar, press Alt, and then press the Tab key until you hear the name of the menu you want, and then press Enter to select it. Press the Down arrow key to move down on the list of available options, and press Enter to select an option.

    • To use Read Aloud, press Alt+W, R. To access the reading controls, press the Tab key until you reach the option you want, and then press Enter to select it.

      Tip: For the best results, it might be helpful to turn off your screen reader when you use Read Aloud.

  3. To exit Read Mode, press Esc.

Use the Immersive Reader view

With Immersive Reader, you can improve focus, declutter the text you’re reading, read scanned texts more easily, and decode complex texts.

  1. To turn on Immersive Reader, press Alt+W, L, 2.

  2. To access the Immersive Reader ribbon and options, press Alt. You hear: «Immersive, Immersive Reader, tab.» Press the Tab key to move between the options on the ribbon and press Enter to select an option.

  3. To turn off Immersive Reader, press Alt+W, L, 2.

Use the Focus mode

The Focus mode can help you minimize distractions and concentrate on writing, creating, and collaborating in Word. The Focus mode hides the ribbon and status bar, showing just the Word document itself.

  • To turn on the Focus mode, press Alt+W, O.

  • To turn off the Focus mode, press Esc.

Navigate between floating shapes 

  1. To quickly move the focus to the first floating shape such as a text box or a chart, press Ctrl+Alt+5.

  2. To cycle between the floating shapes, press the Tab key or Shift+Tab.

  3. To return to the normal navigation, press Esc.

Zoom in or out

Zoom in to get a close-up of your document or zoom out to get an overview of the page at a reduced size.

  1. Press Alt+W, Q. You hear “Zoom dialog” or “Zoom window.”

  2. Press the Tab key until you reach the Percent spinner, and then type a percentage or use the Up or Down arrow key to change the percentage.

  3. Press the Tab key until you reach the OK button and press Enter.

Use Search

To find an option or perform an action quickly, use the Search text field. To learn more about the Search feature, go to Find what you need with Microsoft Search.

Note: Depending on the version of Microsoft 365 you are using, the Search text field at the top of the app window might be called Tell Me instead. Both offer a largely similar experience, but some options and search results can vary.

  1. Select the item or place in your document, presentation, or spreadsheet where you want to perform an action.

  2. To go to the Search text field, press Alt+Q.

  3. Type the search words for the action that you want to perform. For example, if you want to add a bulleted list, type bullets.

  4. Press the Down arrow key to browse through the search results.

  5. Once you’ve found the result that you want, press Enter to select it and to perform the action.

See also

Use a screen reader to insert and change text in Word

Use a screen reader to insert a picture or image in Word

Keyboard shortcuts in Word

Basic tasks using a screen reader with Word

Set up your device to work with accessibility in Microsoft 365

Make your Word documents accessible to people with disabilities

What’s new in Microsoft 365: Release notes for Current Channel

Use Word for Mac with your keyboard and VoiceOver, the built-in macOS screen reader, to explore and navigate the different views and move between them.

Need instructions on how to get started with Word, but not using a screen reader? See Word help & learning.

Notes: 

  • New Microsoft 365 features are released gradually to Microsoft 365 subscribers, so your app might not have these features yet. To learn how you can get new features faster, join the Office Insider program.

  • This topic assumes that you are using the built-in macOS screen reader, VoiceOver. To learn more about using VoiceOver, go to VoiceOver Getting Started Guide.

In this topic

  • Navigate the main view

  • Navigate between views

  • Explore a document

Navigate the main view

When you open a Word document for editing, you land on the main view. To navigate the main view, press F6 (forward) and Shift+F6 (backward). The focus moves through the following elements in the main view, in order:

  • The main content area, which shows the document content. This is where you edit the document. When the focus is on the content area, you hear the page you’re on, followed by the location of the text insertion point.

  • The status bar at the bottom of the screen, which contains document statistics such as page count, word count, text language, and the zoom level. When the focus moves to the status bar, you hear the current page number, followed by the total number of pages, for example, «Page six of fourteen.» To browse the options on the status bar, press Control+Option+Right or Left arrow key.

  • The quick access toolbar at the top of the screen, which contains, for example, the AutoSave, Print, and Undo buttons and the name of the document. You hear «Autosave» when the focus moves to the quick access toolbar. To move between the options on the toolbar, press Control+Option+Right or Left arrow key.

  • The row of ribbon tabs, which includes tabs such as Home, Insert, Review, View, and the Share and Comments buttons. When the focus moves to the ribbon tabs, you hear the currently selected tab, for example, «Home, selected tab.»

    • To navigate the row of ribbon tabs, press Control+Option+Right or Left arrow key until you hear the name of the tab or control you want, and press Control+Option+Spacebar to select it and display the ribbon.

    • The ribbon containing the options specific to the currently selected tab is located below the row of ribbon tabs. To navigate from the row of ribbon tabs to the ribbon, press the Tab key until you hear the ribbon you’re entering and the first option on the ribbon. For example, with the View tab selected, you hear: «Entering View tab commands scroll area.» To navigate between the options on the ribbon, press Control+Option+Right or Left arrow key.

Navigate between views

In addition to the main view, Word has the following commonly used views and areas:

Word start page

When you open the Word app, you land on the start page. On the start page, you can create a new document, browse templates, open an existing document, and access your account info. The start page consists of a tab pane on the left and the contents of a selected tab on the right.

  • To navigate the tab pane, press the Tab key. To display the tab contents, press Control+Option+Spacebar.

  • To navigate from the tab pane to the content pane of the selected tab, press the Tab key until you hear: «Entering scroll area.» To browse the available sections in the content pane, press Control+Option+Right arrow key. To interact with a section, press Control+Option+Shift+Down arrow key. To stop interacting with a section, press Control+Option+Shift+Up arrow key.

  • To navigate to the start page from the main view, press Shift+Command+P.

The app menu bar

The app menu bar contains additional options and controls, for example, for editing text and formatting tables. You can also access the File menu with options to start a new document or open an existing one.

  1. To move the focus to the app menu bar, press Control+Option+M.

  2. To browse the options on the menu bar, press Control+Option+Right arrow key.

  3. To exit the menu bar, press Esc.

The Word Preferences window

In the Word Preferences window, you can access Word settings such as AutoCorrect and ribbon options.

  1. To open the Word Preferences window, press Command+Comma (,).

  2. To navigate the window, press the Tab key. To select a setting, press Spacebar. The setting dialog box opens. To navigate within a setting dialog box, press the Tab key or the arrow keys.

  3. To close the Word Preferences window and return to your document, press Esc.

Explore a document

To navigate around a Word document, you can use the keyboard shortcuts, the VoiceOver features such as rotor, or the Navigation Pane.

Use the keyboard shortcuts

One of the quickest ways to move around in a document is to use the keyboard shortcuts. For a full list of keyboard shortcuts for navigating a document, refer to the «Navigate the document» section in Keyboard shortcuts in Word.

Use the VoiceOver rotor, Quick Nav, or Item Chooser

You can use the VoiceOver features such as rotor, Quick Nav, or Item Chooser to navigate directly to an item, for example, a section heading or link.

  • To open the rotor, press Control+Option+U. For more info, refer to Use the VoiceOver rotor on Mac.

  • To open the Item Chooser, press Control+Option+I. For more info, refer to Use the VoiceOver Item Chooser to navigate on Mac.

  • To activate Quick Nav, press the Left and Right arrow keys at the same time. For more info, refer to Use VoiceOver Quick Nav in apps and webpages on Mac.

Use the Navigation Pane

Use the Navigation Pane to quickly navigate between parts of the document such as headings.

  1. To turn on the Navigation Pane, press Command+F6 until you hear the name of the current ribbon tab. Press Control+Option+Right arrow key until you hear «View, tab,» and press Control+Option+Spacebar. Press the Tab key until you hear «Navigation pane, toggle button,» and press Control+Option+Spacebar.

  2. To move the focus to the navigation pane, press Command+F6 until you hear: «Thumbnails pane, selected.»

  3. Press Control+Option+Right arrow key until you hear «Document map, tab,» and then press Control+Option+Spacebar.

  4. Press Control+Option+Right arrow key until you hear «Table,» and then press Control+Option+Shift+Down arrow key to open the headings table.

  5. To move between the headings, press the Down or Up arrow key until you find the heading you want, and then press Control+Option+Spacebar to move the focus to the beginning of the heading in the document body.

Use Word for iOS with VoiceOver, the built-in iOS screen reader, to explore and navigate the different views and move between them.

Need instructions on how to get started with Word, but not using a screen reader? See Word help & learning.

Notes: 

  • New Microsoft 365 features are released gradually to Microsoft 365 subscribers, so your app might not have these features yet. To learn how you can get new features faster, join the Office Insider program.

  • This topic assumes that you are using the built-in iOS screen reader, VoiceOver. To learn more about using VoiceOver, visit Apple accessibility.

In this topic

  • Navigate the main view

  • Navigate between views

  • Explore a document

  • Use VoiceOver with an external keyboard

Navigate the main view

When you open a Word document for editing, you land on the main view. It contains the following main elements:

  • The top menu bar, which contains options such as Close file, Share, and File.

    • To move the focus to the top menu bar, tap near the top of the screen with four fingers. You hear the name of the document. Then swipe right once. The focus is now on the top menu bar. To browse the available options, swipe right repeatedly.

  • The document content area, which appears under the top menu and takes up the majority of the screen.

    • To move the focus to the content area, swipe right or left until you hear the name of the document, followed by the file extension, for example, «Docx,» and the current page. VoiceOver starts to read the page content.

  • The quick toolbar, which appears at the bottom of the screen when you’ve selected an editable element in the content area. It contains document formatting options for the selected element.

    • To go to and navigate the quick toolbar, select an editable element in the document, and swipe right until you reach the toolbar buttons.

  • The ribbon menu, which pops up from the bottom of the screen and contains tabs with different sets of tools and options. The ribbon options specific to the selected tab are displayed below the tab name.

    • To go to the ribbon menu, tap near the top of the screen with four fingers, swipe right until you hear «Show ribbon,» and double-tap the screen. You hear the currently selected tab.

    • To switch to another tab, double-tap the screen, swipe left or right until you hear the name of the tab you want, and then double-tap the screen.

    • To navigate the ribbon options, swipe left or right.

Navigate between views

In addition to the main view, Word has the following commonly used views:

The Home, New, and Open tabs

When you open the Word app, you land on the Home tab. This tab lists the documents that you’ve recently worked on and documents that others have shared with you. At the top of the Home tab, you can find the Search text field to search for a document.

The New tab is where you can start a new document. Here you can also find the available templates.

On the Open tab, you can access the document storage locations that are available to you, such as OneDrive, SharePoint, and your iPhone. You can browse for a file in each location and open it for editing.

  • To navigate the contents of each tab, swipe left or right. To select a file, folder, or storage location, double-tap the screen.

  • To switch between the Home, New, and Open tabs, tap near the bottom of the screen with four fingers, swipe left or right until you hear the tab you want, and then double-tap the screen.

  • To navigate to a tab when you’re editing a document in the main view, tap near the top of the screen with four fingers, swipe right until you hear «Close file,» and double-tap the screen. The focus moves to the tab from where you opened the document you just closed.

The File menu

The File menu contains options such as Save a Copy, Export, and Print.

  1. To open the File menu, tap near the top of the screen with four fingers, swipe right until you hear «File,» and double-tap the screen.

  2. To navigate the menu, swipe left or right.

  3. To exit the menu, swipe left until you hear «Done,» and double-tap the screen.

The Search view

In the Search view, you search the currently open document and browse search results.

  1. To navigate to the Search view when you’re editing a document, swipe left until you hear «Find,» and double-tap the screen. Use the on-screen keyboard to type the search words.

  2. To browse the search results, tap near the top of the screen with four fingers, swipe left until you hear «Next search result» or «Previous search result,» and double-tap the screen.

  3. To exit the Search view, swipe right until VoiceOver starts to read the document content, and then double-tap the screen.

Explore a document

  • To explore the text of a document, swipe right or left until you hear VoiceOver announce the currently open page, followed by «Content.» Swipe up or down to change the screen reader navigation mode, for example, to headings, paragraphs, lines, or words, and then swipe right or left to navigate.

  • Use the VoiceOver rotor to choose how you want to move through a document when you swipe up or down. For example, if you choose «Words,» the focus moves through the document word by word with each swipe.

    • To use the rotor, rotate two fingers on your phone screen as if you’re turning a dial. You hear the first rotor option. Keep rotating your fingers until you hear the option you want, and lift your fingers to select the option. To navigate by the selected element, swipe up or down.

  • To scroll through a document, swipe up or down with three fingers. When you lift your fingers off the screen, VoiceOver announces the page you’re on.

  • Use the Mobile view to simplify the page layout, which could make it easier to read and edit text on your phone’s screen. Swipe left or right until you reach the Mobile view button, and then double-tap the screen. To return to the Print view, swipe left until you hear «Print view,» and double-tap the screen.

Use VoiceOver with an external keyboard

If you use VoiceOver with an external keyboard and you want to use keyboard shortcuts to navigate and edit your document, make sure Quick Nav is turned off. To turn Quick Nav off, on your external keyboard, press the Left and Right arrow keys simultaneously. To turn Quick Nav back on, press the Left and Right arrow keys again.

For the keyboard shortcuts, refer to Keyboard shortcuts in Word.

See also

Use a screen reader to insert and change text in Word

Use a screen reader to insert a picture or image in Word 

Basic tasks using a screen reader with Word

Set up your device to work with accessibility in Microsoft 365

Make your Word documents accessible to people with disabilities

What’s new in Microsoft 365: Release notes for Current Channel

Use Word for Android with TalkBack, the built-in Android screen reader, to explore and navigate the different views and move between them.

Need instructions on how to get started with Word, but not using a screen reader? See Word help & learning.

Notes: 

  • New Microsoft 365 features are released gradually to Microsoft 365 subscribers, so your app might not have these features yet. To learn how you can get new features faster, join the Office Insider program.

  • This topic assumes that you are using the built-in Android screen reader, TalkBack. To learn more about using TalkBack, go to Android accessibility.

In this topic

  • Navigate the main view

  • Navigate between views

  • Explore a document

  • Use the TalkBack menu

Navigate the main view

When you open a Word document for editing, you land on the main view. It contains the following main elements:

  • The top menu bar, which contains buttons such as More options to open the ribbon, Search, Undo, and Menu to open options for saving and sharing.

    • To go to the top menu bar from the document content, swipe left until you hear «Menu.»

    • To navigate the top menu, swipe left and right.

  • The main content area, which appears under the top menu and takes up the majority of the screen. To move the focus to the content area, swipe right until you hear the name of the document and its file extension, for example, «Docx.»

  • The quick toolbar, which appears at the bottom of the screen when you’ve selected an editable element in the content area. It contains document formatting options for the selected element.

    • To go to and navigate the quick toolbar, select an editable element in the document, and swipe right until you reach the toolbar buttons.

  • The ribbon, which pops up from the bottom of the screen and contains tabs with different editing tools and options.

    • To go to the ribbon, slide one finger near the top of the screen until you hear «More options, button,» and double-tap the screen. You hear the currently selected ribbon tab. The ribbon options specific to the selected tab are displayed below the tab name.

    • To switch to another tab, double-tap the screen, swipe left or right until you hear the name of the tab you want, and double-tap the screen.

    • To navigate the ribbon options, swipe left or right.

Navigate between views

In addition to the main editing view, Word has the following commonly used views and areas:

The Recent, Shared, and Open views

When you open the Word app, you land on the Recent view. It lists the documents that you’ve recently worked on. To browse the list, swipe right or left. To select a document, double-tap the screen. The document opens in the editing view.

In the Shared view, you can find the documents that others have shared with you. To browse the list, swipe right or left. To select a document, double-tap the screen. The document opens in the editing view.

In the Open view, you can browse the available file storage locations or navigate to a document you want to open.

  • To switch between the Recent, Shared, and Open views, slide one finger near the bottom of the screen until you hear the view you want, and double-tap the screen.

  • At the top of each view, you can find buttons for accessing your account info and creating a new document. Slide one finger at the top of the screen until you hear «New button,» or «Signed in as,» followed by your username. In the Recent and Shared views, you can also find a button to search for a document. Swipe right or left until you hear «Search, button,» and double-tap the screen.

  • To navigate to the Recent view when you’re editing your document in the main view, swipe left or slide one finger near the upper-left corner of the screen until you hear «Back button,» and double-tap the screen.

The Word menu

The Word menu contains options for saving, sharing, and printing your document. From here you can also access the Word for Android settings.

  1. To open the Word menu, swipe left until you hear «Menu,» and then double-tap the screen.

  2. To navigate the Word menu, swipe left or right until you hear the option you want, and then double-tap the screen.

  3. To exit the menu, swipe down-then-left.

The Find bar

Use the Find bar to search the currently open document and browse the search results.

  1. To navigate to the Find bar when you’re editing a document, swipe left until you hear «Find,» and then double-tap the screen. Use the on-screen keyboard to type your search words.

  2. To browse the search results, swipe left until you hear «Find previous» or «Find next,» and double-tap the screen.

  3. To close the Find bar, swipe right until you hear «Close Find bar,» and then double-tap the screen.

Explore a document

To explore the text of a document, swipe right or left until you hear the screen reader announce the currently open page, followed by «Content.» You can change the screen reader navigation mode, also known as the reading control, for example, to headings, paragraphs, lines, or words. The gestures to change the mode depend on the Android version of your phone. For more information, refer to Use TalkBack gestures.

Use Word for the web with your keyboard and a screen reader to explore and navigate the different views and move between them. We have tested it with Narrator in Microsoft Edge and JAWS and NVDA in Chrome, but it might work with other screen readers and web browsers as long as they follow common accessibility standards and techniques.

Need instructions on how to get started with Word, but not using a screen reader? See Word help & learning.

Notes: 

  • New Microsoft 365 features are released gradually to Microsoft 365 subscribers, so your app might not have these features yet. To learn how you can get new features faster, join the Office Insider program.

  • To learn more about screen readers, go to How screen readers work with Microsoft 365.

  • When you use Word for the web with a screen reader, switch to the full screen mode. Press F11 to toggle the full screen mode on and off.

  • When you use Word for the web, we recommend that you use Microsoft Edge as your web browser. Because Word for the web runs in your web browser, the keyboard shortcuts are different from those in the desktop program. For example, you’ll use Ctrl+F6 instead of F6 for jumping in and out of the commands. Also, common shortcuts like F1 (Help) and Ctrl+O (Open) apply to the web browser – not Word for the web.

In this topic

  • Navigate the main view

  • Navigate between views

  • Explore a document

  • Use Search

Navigate the main view

When you open a Word for the web document, you land on the main view. To cycle between the elements in the Word for the web main view, press Ctrl+F6 (forward) or Shift+Ctrl+F6 (backward). The main elements are, in order:

  • The main content area, which contains the document content. This is where you edit the document. You hear «Document contents, editing» when the focus is on the main content area.

  • The status bar at the bottom of the screen, which contains document statistics such as page count, word count, text language, and the zoom level. When the focus is on the status bar, you hear the number of the page you’re currently on and the total number of pages in the document, for example, «Page one of three, button.» To navigate within the status bar, press the Tab key or Shift+Tab.

  • The title banner at the top of the screen, which contains the App Launcher button for launching other applications, the name and file path of the currently open file, the Search text field, and buttons for accessing the settings and your account info. You hear «Banner, App launcher» when the focus is on the title banner. To navigate the title banner, press the Tab key or Shift+Tab.

  • The row of ribbon tabs, which includes tabs such as File, Home, Insert, View, and Help. When the focus moves to the row of ribbon tabs, you hear «Ribbon tabs,» followed by the currently selected tab. To navigate the row of ribbon tabs, use the Right and Left arrow keys.

    • The ribbon containing buttons specific to the currently selected tab is located immediately below the row of ribbon tabs. To navigate from a ribbon tab to the ribbon, press the Tab key once. You hear the name of the first button in the ribbon. To navigate between buttons on the ribbon, use the Right and Left arrow keys.

    • The row of ribbon tabs also contains controls for additional actions such as switching between modes, sharing the document, displaying the Comments pane, and more. To access the additional controls, press Ctrl+F6 until you hear «Ribbon tabs,» and then press the Tab key until you hear «Additional controls, Mode menu,» followed by the currently selected mode, for example, «Editing, selected.» To browse the additional controls, press the Right arrow key.

Navigate between views

In addition to the main view, Word for the web has the following commonly used views and areas:

The File menu

The File menu contains options such as New, Open, and Save a Copy. You can also access your account info and the app settings. The File menu consists of a tab pane on the left and the contents of a selected tab on the right.

  • To open the File menu, press Alt+Windows logo key+F. You hear: «Close.» The Home tab is selected and its contents are displayed in the content pane.

  • To navigate between the tabs in the tab pane, press the Up or Down arrow key until you hear the tab you want to open, for example, «New.» Press Enter to open the tab. The content pane of the selected tab opens to the right of the tab pane. The focus moves to the first item in the content pane.

  • To navigate within a content pane, press the Tab key, Shift+Tab, or the arrow keys.

  • To exit the File menu and return to the main view, press Esc.

Reading View

Reading View is designed to make reading text easier for everyone. In Reading View, Word for the web also offers Accessibility Mode, which can make reading a document easier for people who use a screen reader. In Accessibility Mode, Word for the web presents a tagged Portable Document Format (PDF) version of the file in your browser. Your screen reader reads the text and its formatting from the PDF version in the browser.

  • To turn on Reading View, press Alt+Windows logo key, W, F.

  • To navigate to the Reading View toolbar, press Ctrl+F6 until you hear «Accessibility Mode,» and then press the Tab key to explore the options on the toolbar.

  • In Reading View, to turn Accessibility Mode on or off, press Ctrl+F6 until you hear «Accessibility Mode,» and press Enter.

  • To exit Reading View, press Ctrl+F6 until you hear «Accessibility Mode,» press the Tab key until you hear «Edit document,» and press Enter. You hear: «Edit document, Make quick changes right here in Word.» Then do one of the following:

    • To continue editing and reading the document in Word for the web, press Enter.

    • To open the document in the full desktop version of Word, press the Down arrow key until you hear «Open in desktop app,» and press Enter.

Explore a document

Use the Navigation Pane

You can use the Navigation Pane to quickly navigate between headings in a document.

  1. To turn on the Navigation Pane, press Alt+Windows logo key, W, K. You hear: «Navigation, search for.»

  2. Press the Tab key or Shift+Tab until you hear «Headings,» and press Enter. Press the Tab key until you hear the currently selected tab, for example, «Find tab item,» and then press the Right arrow key until you hear: «Heading tab item.»

  3. Press the Tab key until you hear the heading you want, and press Enter. The focus moves to the beginning of the heading row in the document body.

Use zoom

  1. Press Ctrl+F6 until you hear the current page number followed by the total number of pages in the document, for example, «Page one of three.»

  2. To zoom in, press the Tab key until you hear «Zoom in,» and press Enter. You hear the new zoom percentage, for example, «90 percent.» To zoom out, press the Tab key or Shift+Tab until you hear «Zoom out,» and press Enter.

Use Search

To find an option or perform an action quickly, use the Search text field. To learn more about the Search feature, go to Find what you need with Microsoft Search.

Note: Depending on the version of Microsoft 365 you are using, the Search text field at the top of the app window might be called Tell Me instead. Both offer a largely similar experience, but some options and search results can vary.

  1. Select the item or place in your document, presentation, or spreadsheet where you want to perform an action. For example, in an Excel spreadsheet, select a range of cells.

  2. To go to the Search text field, press Alt+Q.

  3. Type the search words for the action that you want to perform. For example, if you want to add a bulleted list, type bullets.

  4. Press the Down arrow key to browse through the search results.

  5. When you’ve found the result that you want, press Enter to select it and to perform the action.

February 04, 2015

This page provides Apache POI-XWPF API examples for reading the header, footer, paragraphs and tables of an MS Word DOCX file. Start with XWPFDocument to open the DOCX file. Different POI-XWPF classes extract different kinds of data: the header and footer are read with XWPFHeader and XWPFFooter respectively, XWPFParagraph is used to read paragraphs, and XWPFTable is used to read tables. We can also read the complete text of a DOCX in one go with XWPFWordExtractor.

Read Header and Footer using XWPFHeader and XWPFFooter

Apache POI provides XWPFHeader and XWPFFooter to read the header and footer respectively. First create an XWPFDocument object from the DOCX file. Then create an XWPFHeaderFooterPolicy object by passing it the XWPFDocument instance, and fetch the XWPFHeader and XWPFFooter instances from the policy using the methods below.

XWPFHeaderFooterPolicy.getFirstPageHeader(): Provides the header of first page.

XWPFHeaderFooterPolicy.getDefaultHeader(): Provides the default header of DOCX file given to each and every page.

XWPFHeaderFooterPolicy.getFirstPageFooter(): Provides the footer of first page.

XWPFHeaderFooterPolicy.getDefaultFooter(): Provides the default footer of DOCX file given to each and every page.

Find the sample demo for getDefaultHeader and getDefaultFooter.

ReadDOCXHeaderFooter.java

package com.concretepage;
import java.io.FileInputStream;
import org.apache.poi.openxml4j.opc.OPCPackage;
import org.apache.poi.xwpf.model.XWPFHeaderFooterPolicy;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFFooter;
import org.apache.poi.xwpf.usermodel.XWPFHeader;
public class ReadDOCXHeaderFooter {
   public static void main(String[] args) {
     try {
	 FileInputStream fis = new FileInputStream("D:/docx/read-test.docx");
	 XWPFDocument xdoc=new XWPFDocument(OPCPackage.open(fis));
	 XWPFHeaderFooterPolicy policy = new XWPFHeaderFooterPolicy(xdoc);
	 //read header
	 XWPFHeader header = policy.getDefaultHeader();
	 System.out.println(header.getText());
	 //read footer
	 XWPFFooter footer = policy.getDefaultFooter();
	 System.out.println(footer.getText());
     } catch(Exception ex) {
	ex.printStackTrace();
     } 
  }
}

Find the output.

This is header
This is footer 

Read Paragraph using XWPFParagraph

Apache POI provides XWPFParagraph class to fetch paragraph text. Using XWPFDocument.getParagraphs(), we get the list of all paragraphs of the document. Find the example.

ExtractParagraphDOCX.java

package com.concretepage;
import java.io.FileInputStream;
import java.util.List;
import org.apache.poi.openxml4j.opc.OPCPackage;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFParagraph;
public class ExtractParagraphDOCX {
   public static void main(String[] args) {
     try {
       FileInputStream fis = new FileInputStream("D:/docx/read-test.docx");
       XWPFDocument xdoc=new XWPFDocument(OPCPackage.open(fis));
       List<XWPFParagraph> paragraphList =  xdoc.getParagraphs();
       for (XWPFParagraph paragraph: paragraphList){
	   System.out.println(paragraph.getText());
       }
     } catch(Exception ex) {
	   ex.printStackTrace();
     } 
   }
}  

Find the output.

This is body content of Page One.

This is body content of Page Two. 

Read Table using XWPFTable

Apache POI provides the XWPFTable class to fetch table data. We can get this object in two ways.
The first is by using XWPFDocument directly.

List<XWPFTable> tables = XWPFDocument.getTables() 

And second by using IBodyElement.

IBodyElement.getBody().getTables() 

Now find the example to extract data of tables within DOCX.

ExtractTableDOCX.java

package com.concretepage;
import java.io.FileInputStream;
import java.util.Iterator;
import java.util.List;
import org.apache.poi.openxml4j.opc.OPCPackage;
import org.apache.poi.xwpf.usermodel.IBodyElement;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFTable;
public class ExtractTableDOCX {
   public static void main(String[] args) {
    try {
	FileInputStream fis = new FileInputStream("D:/docx/read-test.docx");
	XWPFDocument xdoc=new XWPFDocument(OPCPackage.open(fis));
	Iterator<IBodyElement> bodyElementIterator = xdoc.getBodyElementsIterator();
	while(bodyElementIterator.hasNext()) {
	  IBodyElement element = bodyElementIterator.next();
          if("TABLE".equalsIgnoreCase(element.getElementType().name())) {
	     List<XWPFTable> tableList =  element.getBody().getTables();
	     for (XWPFTable table: tableList){
	        System.out.println("Total Number of Rows of Table:"+table.getNumberOfRows());
		System.out.println(table.getText());
	     }
	  }
        }
    } catch(Exception ex) {
	ex.printStackTrace();
    } 
   }
}  

Find the output.

Total Number of Rows of Table:2
Row 1- column 1	Row 1- column 2
Row 2- column 1	Row 2- column 2 

Extract Complete Data using XWPFWordExtractor

Apache POI provides XWPFWordExtractor class to fetch complete data of every page of a DOCX.

BasicTextExtractor.java

package com.concretepage;
import java.io.FileInputStream;
import org.apache.poi.openxml4j.opc.OPCPackage;
import org.apache.poi.xwpf.extractor.XWPFWordExtractor;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
public class BasicTextExtractor {
   public static void main(String[] args) {
      try {
        FileInputStream fis = new FileInputStream("D:/docx/read-test.docx");
	XWPFDocument xdoc = new XWPFDocument(OPCPackage.open(fis));
	XWPFWordExtractor extractor = new XWPFWordExtractor(xdoc);
	System.out.println(extractor.getText());
      } catch(Exception ex) {
	ex.printStackTrace();
      } 
   }
} 

Find the output.

This is header
This is body content of Page One.

Row 1- column 1	Row 1- column 2
Row 2- column 1	Row 2- column 2

This is body content of Page Two.

This is footer 

Gradle file for Apache POI-XWPF

Find the gradle file to resolve JAR for Apache POI-XWPF.

build.gradle

apply plugin: 'java'
apply plugin: 'eclipse'
archivesBaseName = 'ApachePOI'
version = '1' 
repositories {
    mavenCentral()
}
dependencies {
    compile 'org.apache.poi:poi-ooxml:3.11'
} 

The input DOCX file read-test.docx has been attached in ZIP file.

POSTED BY

ARVIND RAI

In this article we discuss ways to read Word documents in Java using the Apache POI library. A Word document may contain images, tables or plain text, and a standard Word file has headers and footers too. In the following examples we parse a Word document by reading its paragraphs, runs, images and tables along with its headers and footers. We also look at identifying the styles associated with paragraphs, such as font size, font family and font color.

Maven Dependencies

The following POI Maven dependency is required to read Word documents.

pom.xml

	<dependencies>
		<dependency>
                     <groupId>org.apache.poi</groupId>
                     <artifactId>poi-ooxml</artifactId>
		     <version>3.16</version>
                 </dependency>
	</dependencies>

Reading Complete Text from Word Document

The class XWPFDocument has many methods for reading and extracting the contents of a .docx file. To read all the text in a .docx Word document, wrap the XWPFDocument in an XWPFWordExtractor and call its getText() method. Following is an example.

TextReader.java

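import java.io.FileInputStream;
import org.apache.poi.openxml4j.opc.OPCPackage;
import org.apache.poi.xwpf.extractor.XWPFWordExtractor;
import org.apache.poi.xwpf.usermodel.XWPFDocument;

public class TextReader {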
	
	public static void main(String[] args) {
	 try {
		   FileInputStream fis = new FileInputStream("test.docx");
		   XWPFDocument xdoc = new XWPFDocument(OPCPackage.open(fis));
		   XWPFWordExtractor extractor = new XWPFWordExtractor(xdoc);
		   System.out.println(extractor.getText());
		} catch(Exception ex) {
		    ex.printStackTrace();
		}
 }

}

Reading Headers and Footers of Word Document

Apache POI provides built-in methods to read the headers and footers of a Word document. Following is an example that reads and prints the header and footer of a Word document. The example .docx file is available in the source, which can be downloaded at the end of this article.

HeaderFooterReader.java

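import java.io.FileInputStream;
import org.apache.poi.openxml4j.opc.OPCPackage;
import org.apache.poi.xwpf.model.XWPFHeaderFooterPolicy;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFFooter;
import org.apache.poi.xwpf.usermodel.XWPFHeader;

public class HeaderFooterReader {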

	public static void main(String[] args) {
		
		try {
			FileInputStream fis = new FileInputStream("test.docx");
			XWPFDocument xdoc = new XWPFDocument(OPCPackage.open(fis));
			XWPFHeaderFooterPolicy policy = new XWPFHeaderFooterPolicy(xdoc);

			XWPFHeader header = policy.getDefaultHeader();
			if (header != null) {
				System.out.println(header.getText());
			}

			XWPFFooter footer = policy.getDefaultFooter();
			if (footer != null) {
				System.out.println(footer.getText());
			}
		} catch (Exception ex) {
			ex.printStackTrace();
		}

	}

}

Output

This is Header

This is footer

Read Each Paragraph of a Word Document

Among the many methods defined in the XWPFDocument class, we can use getParagraphs() to read a .docx Word document paragraph by paragraph. This method returns a list of all the paragraphs (XWPFParagraph) of a Word document. XWPFParagraph in turn has many utility methods for extracting information about a paragraph, such as its text alignment and the style associated with it.

To give more control over reading the text of a Word document, each paragraph is further divided into multiple runs. A run defines a region of text with a common set of properties. Following is an example that reads paragraphs from a .docx Word document.

ParagraphReader.java

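import java.io.FileInputStream;
import java.util.List;
import org.apache.poi.openxml4j.opc.OPCPackage;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFParagraph;

public class ParagraphReader {

	public static void main(String[] args) {
		try {
			FileInputStream fis = new FileInputStream("test.docx");
			XWPFDocument xdoc = new XWPFDocument(OPCPackage.open(fis));

			List<XWPFParagraph> paragraphList = xdoc.getParagraphs();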

			for (XWPFParagraph paragraph : paragraphList) {

				System.out.println(paragraph.getText());
				System.out.println(paragraph.getAlignment());
				// Number of runs in this paragraph
				System.out.println(paragraph.getRuns().size());
				System.out.println(paragraph.getStyle());

				// Returns numbering format for this paragraph, eg bullet or lowerLetter.
				System.out.println(paragraph.getNumFmt());

				System.out.println(paragraph.isWordWrapped());

				System.out.println("********************************************************************");
			}
		} catch (Exception ex) {
			ex.printStackTrace();
		}
	}
}

Reading Tables from Word Document

Following is an example that reads the tables present in a Word document. It prints the text of every cell, row by row.

TableReader.java

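import java.io.FileInputStream;
import java.util.Iterator;
import java.util.List;
import org.apache.poi.openxml4j.opc.OPCPackage;
import org.apache.poi.xwpf.usermodel.IBodyElement;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFTable;

public class TableReader {

	public static void main(String[] args) {
		try {
			FileInputStream fis = new FileInputStream("test.docx");
			XWPFDocument xdoc = new XWPFDocument(OPCPackage.open(fis));
			Iterator<IBodyElement> bodyElementIterator = xdoc.getBodyElementsIterator();
			while (bodyElementIterator.hasNext()) {
				IBodyElement element = bodyElementIterator.next();

				if ("TABLE".equalsIgnoreCase(element.getElementType().name())) {
					List<XWPFTable> tableList = element.getBody().getTables();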
					for (XWPFTable table : tableList) {
						System.out.println("Total Number of Rows of Table:" + table.getNumberOfRows());
						for (int i = 0; i < table.getRows().size(); i++) {

							for (int j = 0; j < table.getRow(i).getTableCells().size(); j++) {
								System.out.println(table.getRow(i).getCell(j).getText());
							}
						}
					}
				}
			}
		} catch (Exception ex) {
			ex.printStackTrace();
		}
	}
}

Reading Styles from Word Document

Styles are associated with the runs of a paragraph. The XWPFRun class has many methods for identifying the styles applied to the text, for example whether it is bold, highlighted or capitalized.

StyleReader.java

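import java.io.FileInputStream;
import java.util.List;
import org.apache.poi.openxml4j.opc.OPCPackage;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFParagraph;
import org.apache.poi.xwpf.usermodel.XWPFRun;

public class StyleReader {

	public static void main(String[] args) {
		try {
			FileInputStream fis = new FileInputStream("test.docx");
			XWPFDocument xdoc = new XWPFDocument(OPCPackage.open(fis));

			List<XWPFParagraph> paragraphList = xdoc.getParagraphs();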

			for (XWPFParagraph paragraph : paragraphList) {

				for (XWPFRun rn : paragraph.getRuns()) {

					System.out.println(rn.isBold());
					System.out.println(rn.isHighlighted());
					System.out.println(rn.isCapitalized());
					System.out.println(rn.getFontSize());
				}

				System.out.println("********************************************************************");
			}
		} catch (Exception ex) {
			ex.printStackTrace();
		}

	}

}

Reading Image from Word Document

Following is an example to read image files from a word document.

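import java.io.FileInputStream;
import java.util.List;
import org.apache.poi.openxml4j.opc.OPCPackage;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFPictureData;

public class ImageReader {

	public static void main(String[] args) {

		try {
			FileInputStream fis = new FileInputStream("test.docx");
			XWPFDocument xdoc = new XWPFDocument(OPCPackage.open(fis));
			List<XWPFPictureData> pic = xdoc.getAllPictures();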
			if (!pic.isEmpty()) {
				System.out.print(pic.get(0).getPictureType());
				System.out.print(pic.get(0).getData());
			}

		} catch (Exception ex) {
			ex.printStackTrace();
		}
	}

}

Conclusion

I hope this article gave you what you were looking for. If you have anything to add or share, please do so in the comment section below.

Hi Team,

I have a requirement in my current project to replace some customized words in every element of the document where they appear. The piece of code below has some issues when replacing the words.

function checkWord(word,color){

local rtn = 1;

  local logmsg = oid_first();

# Set disable continue search

  set wrapscan = off;

   # Cursor goto root of document

  goto_oid(oid_caret());

  # Find word & changing text color

  while(rtn == 1) {

  rtn = find(word, 0x0008|0x0200|0x0020);

  if(rtn == 1) {

  selectText = main::selection;

  EditDelete;

  str = '<?Pub _font FontColor="'.$color.'" ?>'.selectText.'<?Pub /_font?>';

#insert(str);

  local findWithin; findWithin = V['FindWithin'];

  local replFlags = flags|0x0020

  replace(word, insert(str), replFlags, findWithin, current_doc())

  }

}

  }

Example doc:

<SimpleText>

  <para>Be aware of the fire hazard, especially when working near flammable substances or vapours</para>

  <para2>I am WelCome you vapours</para2>

</SimpleText>

After replacing, the output document is:

<SimpleText>

  <para>Be aware of the fire hazard, especially when working near flammable substances or vapours</para>

  <para2>I am WelCome you 1</para2> // here the word vapours has been replaced with 1.

</SimpleText>

Could anyone please help with this?

Thanks

Prashant

Hi, thanks for this repository, and sorry for my bad English. You have really helped me! But I can't get data from one of the columns. How can I get all of the data? This column holds normal text of no more than 5000 characters, encoded as UTF-8. Instead of the text I get ***, which means empty. Thanks!
Example data:
        column1   column2   column3   column4
row1    4122      П         empty     А.Кожем’якін
row2    4069      Д         empty     А.Кожем’якін

Instead of the empty value, column 3 should contain data like:
Проект про внесення змін до Бюджетного кодексу (щодо упорядкування системи надходження та використання коштів із рентної плати за користування надрами для видобування нафти, природного газу та газового конденсату) (проект н.д. О Ивкан надано 09.09.2015, проект н.д. О.Ивкан — 11.09.2015, подання Комітету – 19.02.2016)

python read word document

This post will talk about how to read Word Documents with Python. We’re going to cover three different packages – docx2txt, docx, and my personal favorite: docx2python.

The docx2txt package

Let’s talk about docx2txt first. This is a Python package that allows you to scrape text and images from Word Documents. The example below reads in a Word Document containing the Zen of Python. Once we’ve imported docx2txt, all we need is one line of code to read in the text from the Word Document. We read in the document using a function in the package called process, which takes the name of the file as input. Regular text, listed items, hyperlink text, and table text will all be returned in a single string.

import docx2txt

# read in word file
result = docx2txt.process("zen_of_python.docx")

What if the file has images? In that case we just need a minor tweak to our code. When we run the process method, we can pass an extra parameter that specifies the name of an output directory. Running docx2txt.process will extract any images in the Word Document and save them into this specified folder. The text from the file will still also be extracted and stored in the result variable.

import docx2txt

result = docx2txt.process("zen_of_python_with_image.docx", "C:/path/to/store/files")

docx2txt will also scrape any text from tables. Again, this will be returned in a single string with any other text found in the document, which means this text can be more difficult to parse. Later in this post we’ll talk about docx2python, which allows you to scrape tables in a more structured format.

The docx package

The source code behind docx2txt is derived from code in the docx package, which can also be used to scrape Word Documents. docx is a powerful library for manipulating and creating Word Documents, but can also (with some restrictions) read in text from Word files.

In the example below, we open our sample Word file using the docx.Document function. Here we just input the name of the file we want to read. Then, we can scrape the text from each paragraph in the file using a list comprehension in conjunction with doc.paragraphs. This will include scraping separate lines defined in the Word Document for listed items. Unlike docx2txt, docx cannot scrape images from Word Documents. Also, doc.paragraphs will not return hyperlinks or text in tables defined in the Word Document.

import docx

# open connection to Word Document
doc = docx.Document("zen_of_python.docx")

# read in each paragraph in file
result = [p.text for p in doc.paragraphs]
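
To make the idea of reading a document paragraph by paragraph more concrete, here is a minimal sketch that uses only the Python standard library. It relies on the fact that a .docx file is a ZIP archive whose body is stored in word/document.xml, with paragraphs as w:p elements and text as w:t elements. This is not how the docx package is implemented; paragraph_texts and make_sample_docx are hypothetical helper names, and the sample archive built here holds only the document part, so it is good enough for this demonstration but is not a file Word would open.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by the elements in word/document.xml
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def paragraph_texts(docx_file):
    """Return the text of every w:p element found in word/document.xml."""
    with zipfile.ZipFile(docx_file) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    texts = []
    for para in root.iter(f"{{{W_NS}}}p"):
        # A paragraph's text is spread over one or more w:t elements
        texts.append("".join(t.text or "" for t in para.iter(f"{{{W_NS}}}t")))
    return texts


def make_sample_docx(lines):
    """Build an in-memory archive holding only a document part (demo only)."""
    body = "".join(f"<w:p><w:r><w:t>{line}</w:t></w:r></w:p>" for line in lines)
    xml = f'<w:document xmlns:w="{W_NS}"><w:body>{body}</w:body></w:document>'
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("word/document.xml", xml)
    buffer.seek(0)
    return buffer


sample = make_sample_docx(["Beautiful is better than ugly.",
                           "Explicit is better than implicit."])
print(paragraph_texts(sample))
```

Because this sketch walks every w:p element in the XML, it also picks up paragraphs nested inside table cells, which is one way it differs from iterating over doc.paragraphs.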

The docx2python package

docx2python is another package we can use to scrape Word Documents. It has some additional features beyond docx2txt and docx. For example, it is able to return the text scraped from a document in a more structured format. Let’s test out our Word Document with docx2python. We’re going to add a simple table in the document so that we can extract that as well (see below).

docx2python contains a method with the same name. If we call this method with the document’s name as input, we get back an object with several attributes.

from docx2python import docx2python

# extract docx content
doc_result = docx2python('zen_of_python.docx')

Each attribute provides either text or information from the file. For example, consider that our file has three main components – the text containing the Zen of Python, a table, and an image. If we call doc_result.body, each of these components will be returned as separate items in a list.

# get separate components of the document
doc_result.body

# get the text from Zen of Python
doc_result.body[0]

# get the image
doc_result.body[1]

# get the table text
doc_result.body[2]

Scraping a word document table with docx2python

The table text result is returned as a nested list, as you can see below. Each row (including the header) gets returned as a separate sub-list. The 0th element of the list refers to the header – or 0th row of the table. The next element refers to the next row in the table and so on. In turn, each value in a row is returned as an individual sub-list within that row’s corresponding list.

We can convert this result into a tabular format using pandas. The data frame is still a little messy: each cell in the data frame is a list containing a single value. This value also has quite a few “\t”s (which represent tab characters).

pd.DataFrame(doc_result.body[1][1:])

Here, we use the applymap method to apply the lambda function below to every cell in the data frame. This function takes the individual value from the list in each cell and strips the tab characters (“\t”) from both ends.

import pandas as pd


df = pd.DataFrame(doc_result.body[1][1:]).applymap(lambda val: val[0].strip("\t"))


Next, let’s change the column headers to what we see in the Word file (which was also returned to us in doc_result.body).


df.columns = [val[0].strip("t") for val in doc_result.body[1][0]]
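
The table clean-up steps above can also be sketched in plain Python, without pandas, which makes it easier to see what happens to each cell. This is only an illustration under stated assumptions: raw_table is made-up sample data shaped like the nested list described above (rows, then cells, then a list of strings per cell), and clean_cell and split_table are hypothetical helper names, not part of docx2python.

```python
def clean_cell(cell):
    """Take the first string in a cell's list and strip tabs from both ends."""
    return cell[0].strip("\t") if cell else ""


def split_table(table):
    """Split a nested table into a cleaned header row and cleaned data rows."""
    header = [clean_cell(cell) for cell in table[0]]
    rows = [[clean_cell(cell) for cell in row] for row in table[1:]]
    return header, rows


# Hypothetical sample data: each cell is a list holding one tab-prefixed string
raw_table = [
    [["\tName"], ["\tScore"]],
    [["\tAlice"], ["\t10"]],
    [["\tBob"], ["\t7"]],
]

header, rows = split_table(raw_table)
# One dictionary per data row, keyed by the cleaned header values
records = [dict(zip(header, row)) for row in rows]
print(records)
```

A list of dictionaries like records can be passed straight to pd.DataFrame if a data frame is still wanted at the end.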


Extracting images

We can extract the Word file’s images using the images attribute of our doc_result object. doc_result.images consists of a dictionary where the keys are the names of the image files (not automatically written to disk) and the corresponding values are the images files in binary format.

type(doc_result.images) # dict

doc_result.images.keys() # dict_keys(['image1.png'])

We can write the binary-formatted image out to a physical file like this:


for key, val in doc_result.images.items():
    # the with block closes the file even if the write fails
    with open(key, "wb") as f:
        f.write(val)

Above we’re just looping through the keys (image file names) and values (binary images) in the dictionary and writing each out to file. In this case, we only have one image in the document, so we just get one written out.
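
If the images should go into a specific folder rather than the current working directory, the loop can be wrapped in a small helper. The sketch below is one possible way to do that with pathlib; save_images is a hypothetical name, and fake_images is stand-in data with the same shape as the dictionary described above (file name mapped to bytes), not real output from docx2python.

```python
import tempfile
from pathlib import Path


def save_images(images, out_dir):
    """Write each name -> bytes pair into out_dir and return the written paths."""
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    written = []
    for name, data in images.items():
        # Keep only the file name so a key with a directory part stays in out_dir
        target = out_path / Path(name).name
        target.write_bytes(data)
        written.append(target)
    return written


# Stand-in for doc_result.images: a dict of file name -> binary content
fake_images = {"image1.png": b"\x89PNG\r\n\x1a\n"}

with tempfile.TemporaryDirectory() as tmp:
    paths = save_images(fake_images, tmp)
    print([p.name for p in paths])
```

With a real result object, the call would be save_images(doc_result.images, "some/output/folder"), where the folder name is whatever location suits your project.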

Other attributes

The docx2python result has several other attributes we can use to extract text or information from the file. For example, if we want to just get all of the file’s text in a single string (similar to docx2txt) we can run doc_result.text.

# get all text in a single string
doc_result.text

In addition to text, we can also get metadata about the file using the properties attribute. This returns information such as the creator of the document, the created / last modified dates, and number of revisions.

doc_result.properties

If the document you’re scraping has headers and footers, you can also scrape those out like this (note the singular version of “header” and “footer”):

# get the headers
doc_result.header

# get the footers
doc_result.footer

Footnotes can also be extracted like this:

doc_result.footnotes

Getting HTML returned with docx2python

We can also specify that we want to get an HTML object returned with the docx2python method that supports a few types of tags including font (size and color), italics, bold, and underline text. We just need to specify the parameter “html = True”. In the example below we see The Zen of Python in bold and underlined print. Corresponding to this, we can see the HTML version of this in the second snapshot below. The HTML feature does not currently support table-related tags, so I would recommend using the method we went through above if you’re looking to scrape tables from Word documents.


doc_html_result = docx2python('zen_of_python.docx', html = True)


Hope you enjoyed this post! Please check out other Python posts of mine below or by clicking here.
