Is a Word document a text file?

If you don’t come from a programming background, it might not yet be clear what a file really is.
What is a binary file, and what makes something a text file?

Is a Microsoft Word document a text file or a binary file?

Is a spreadsheet a text file or a binary file?

Let’s try to explain this.

A little disclaimer:

There is actually a lot more variation to this, but I’ll focus on files on Unix/Linux systems, Windows and Mac.
Wikipedia has some more to say about text files
and binary files, so if this article does not
satisfy your curiosity, then please check out those articles.

What is a file?

Basically every file is just a series of bytes one after the other. That is, numbers between 0 and 255.
To suit the storage device it is on, a file might be spread out over several areas of
that device, but from our point of view each file is just a series of bytes.
In general every file is a binary file, but if the data in it contains only text (letters, numbers
and other symbols one would use in writing) and if it consists of lines, then we consider it a text file.
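
To make this concrete, here is a minimal Python sketch. The file name example.txt and its content are made up for illustration. It writes a short text into a temporary directory and then reads the same file back as raw bytes:

```python
import os
import tempfile

# Hypothetical sample content; any short ASCII text would do.
text = "Hi\nyou\n"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.txt")
    # newline="" stops Python from translating "\n" on Windows.
    with open(path, "w", encoding="ascii", newline="") as fh:
        fh.write(text)
    # Binary mode returns the bytes exactly as they are stored.
    with open(path, "rb") as fh:
        raw = fh.read()

numbers = list(raw)
print(numbers)  # [72, 105, 10, 121, 111, 117, 10]
```

Every value in the list is between 0 and 255, and nothing in it looks like a "line" yet. That only appears once a program decides how to display the numbers.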

What is a text file?

(I am going to simplify here a bit for clarity and for now assume that the files only use
ASCII characters.)

When you open a text file with Notepad or some other simple text editor, you will see several lines of text.
The file on the disk, on the other hand, isn’t broken up into such lines. It is a series of numbers
one after the other. When you open the file using Notepad, it translates each number to a visual representation.
For example, if it encounters the number 97, it will show the letter a. We can say that the
character a is represented by the number 97 in ASCII.
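
The same lookup can be tried in Python, used here only as a convenient calculator. The built-in ord() and chr() functions convert between a character and its number:

```python
# ord() gives the number behind a character, chr() goes the other way.
code = ord("a")
letter = chr(97)
print(code, letter)  # 97 a

# Decoding bytes as ASCII is the same lookup done for every byte.
word = bytes([99, 97, 116]).decode("ascii")
print(word)  # cat
```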

The reason that you see several lines in your editor is that some of the bytes in the file,
the ones called newlines, are actually instructions to the editor to go to the beginning of the next line.
Thus the character that was in the file after the newline character will be displayed on the next line.

What is a newline?

Which number represents the newline?

Actually none of the characters in the ASCII table is called a newline.
When we say newline we usually mean the sign that tells the computer to go to the beginning of the next row.

There are various sequences of bytes that represent a newline depending on the operating system.
In the operating systems we care about in this article a newline is always represented by one or both
of two characters that in the ASCII table are called LF, line feed (hexadecimal 0x0A or decimal 10), and
CR, carriage return (hexadecimal 0x0D or decimal 13).

If you have ever seen a typewriter, you will
remember that in order to go to the next line, the user had to pull a handle towards the beginning of
the line. (Usually to the left side of the paper.)
This movement first pushed the «carriage» to the beginning of the paper, and when it arrived at the
beginning (and got stuck), further pulling of the handle turned the paper a bit so the
carriage would point to the next line. Then the typewriter was ready for the next line.

That is, they used two operations:
carriage return, pushing the «carriage» to the beginning of the paper, and
line feed, going to the next line.

Therefore on MS Windows, a newline is represented by two characters: CRLF.
A Carriage return followed by a line feed.

On Unix/Linux systems and on Mac OS X, the newline is represented by a single
LF (line feed).

Just for curiosity, Mac OS Classic (before OS X),
Commodore, ZX Spectrum and
TRS-80
all used a CR (carriage return) to represent a newline.

(I learned programming on a HT-1080Z which was a TRS-80 clone and later switched
to a ZX Spectrum.)

Wikipedia has even more to say about newline.

So if you have a file filled with ASCII printable characters
with a few «newlines» sprinkled in, then you have a text file.
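
The three conventions are easy to tell apart once you look at the bytes. Below is a rough sketch. The function name newline_style is my own invention, and it assumes the data uses one convention consistently, so a file with mixed line endings would simply be reported as CRLF:

```python
def newline_style(data: bytes) -> str:
    """Guess which newline convention a chunk of bytes uses."""
    # Check CRLF first, because it contains both CR and LF.
    if b"\r\n" in data:
        return "CRLF"
    if b"\r" in data:
        return "CR"
    if b"\n" in data:
        return "LF"
    return "none"

print(newline_style(b"one\r\ntwo\r\n"))  # CRLF (MS Windows)
print(newline_style(b"one\ntwo\n"))      # LF (Unix/Linux, Mac OS X)
print(newline_style(b"one\rtwo\r"))      # CR (Mac OS Classic)
```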

Encoding

Of course if you looked at the ASCII table you saw that
only very few languages could be written with those letters. Mostly the Latin-based languages.
Many languages that use those characters have a few extra letters.
(For example in Hungarian
there are a few more vowels: aáeéiíoóöőuúüű. The 5 from Latin and 9 extra. For fun.)
You cannot represent them within the ASCII table.
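
You can check this claim with a few lines of Python. This is only a sketch, and it relies on the fact that ASCII assigns numbers 0 to 127 and nothing more:

```python
vowels = "aáeéiíoóöőuúüű"

# Split the vowels by whether their number fits in the 128-entry ASCII table.
ascii_only = [ch for ch in vowels if ord(ch) < 128]
extra = [ch for ch in vowels if ord(ch) >= 128]
print(ascii_only)  # ['a', 'e', 'i', 'o', 'u']
print(len(extra))  # 9

# Asking for an ASCII representation of an accented vowel fails.
try:
    "á".encode("ascii")
    encodable = True
except UnicodeEncodeError:
    encodable = False
print(encodable)  # False
```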

Therefore people have invented other Encodings,
besides ASCII. Without going into the details, each encoding is a mapping between numbers that
can be saved in a computer file and «drawings» that should be displayed on the screen.

Remember, even in ASCII, you don’t have a letter a in a file. You have a decimal number 97
saved that your computer knows to display as the letter a. The computer will display the
letter a if it thinks that your file is in ASCII encoding,
or in any of the ASCII-based or ASCII-compatible encodings,
such as Latin-1
or UTF-8.

So in the ancient times people used various encodings to represent their own language,
but these encodings overlapped. The same number was used to represent different characters (drawings)
in the different languages. That did not allow the mixing of these languages in the same
file, and if the application used the incorrect encoding to display a file, all you got
was an unintelligible mix of characters from some other language.
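
Here is a small illustration of the overlap. The three encodings are my own pick from the ones that ship with Python, and the byte value 225 is just an example:

```python
raw = bytes([0xE1])  # a single byte, decimal 225

as_latin1 = raw.decode("latin-1")    # Western European
as_greek = raw.decode("iso8859_7")   # Greek
as_cyrillic = raw.decode("cp1251")   # Windows Cyrillic
print(as_latin1, as_greek, as_cyrillic)  # á α б
```

The byte on the disk never changed. Only the table used to interpret it did.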

You can still see this problem when a web page is written in one of these ancient-time
encodings, but the browser uses a different encoding to show it. The solution would be
to include a hint about the encoding in the HTML page, but at times people forget to do this.

The other good solution is to use UTF-8 encoding
as this encoding maps out all the characters in the known universe. Unfortunately
Klingon is not yet included.

UTF-8 is one of the good ways to
map Unicode characters to numbers.
As Unicode currently includes more than 110,000 characters it cannot be represented in one byte
which can hold only numbers between 0 and 255. So in UTF-8, every character is represented by 1 to 4
bytes. If you open a file that was written using the UTF-8 encoding, with a tool
that can only handle ASCII characters, you will see lots of «garbage». That’s because in UTF-8
every non-ASCII character is represented by bytes with values between 128 and 255, which have no meaning in ASCII.

So to the casual viewer, the file would be indistinguishable from a binary file.
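
The variable length is easy to observe. In this sketch the four sample characters are arbitrary choices that happen to need 1, 2, 3 and 4 bytes:

```python
samples = ["a", "á", "€", "😀"]

# Encode each character and count the bytes it takes in UTF-8.
lengths = [len(ch.encode("utf-8")) for ch in samples]
print(lengths)  # [1, 2, 3, 4]

print(list("á".encode("utf-8")))  # [195, 161]

# An ASCII-only tool cannot make sense of bytes above 127.
garbled = "á".encode("utf-8").decode("ascii", errors="replace")
print(len(garbled))  # 2 replacement characters instead of one letter
```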

Binary file

A binary file is basically any file that is not «line-oriented». Any file where besides the
actual written characters and newlines there are other symbols as well.
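
There is no official test for this, but many tools use a heuristic along the following lines. This is only a sketch under my own assumptions: the name looks_like_text is made up, and the rule (no NUL bytes, and valid UTF-8) would misjudge text saved as UTF-16, for example.

```python
def looks_like_text(data: bytes) -> bool:
    """Rough heuristic: no NUL bytes, and the bytes decode as UTF-8."""
    if b"\x00" in data:
        return False
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True

print(looks_like_text(b"hello\nworld\n"))     # True
print(looks_like_text(b"\xd0\xcf\x11\xe0"))   # False
print(looks_like_text(b"PK\x03\x04\x00\x00")) # False
```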

So a program written in the C programming language is a text file, but after you compiled it, the
compiled version is binary.

A Perl program is a text file, but if you package it with
PAR::Packer it will be a binary file.

A Microsoft Word file is a binary file as besides the actual text, it also contains various
characters representing font size and color.

An OpenOffice Writer file is binary
as it is a zipped set of XML files, but the XML files inside are considered text files,
even though they contain both text and characters that represent font size and color.
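
The "binary container with text inside" idea can be imitated with Python's zipfile module. Note that this builds a tiny stand-in archive in memory. It is not a valid Writer document, and the member name and XML content are placeholders:

```python
import io
import zipfile

# Build a small ZIP archive in memory as a stand-in for a real document.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("content.xml", "<text>Hello</text>")

container = buffer.getvalue()
print(container[:2])  # b'PK' - the signature every ZIP file starts with

# The member inside the archive is plain, readable text again.
with zipfile.ZipFile(io.BytesIO(container)) as archive:
    inner = archive.read("content.xml").decode("utf-8")
print(inner)  # <text>Hello</text>
```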

An HTML file is a text file too, even though it contains lots of characters that are
invisible when viewed in a browser. It is considered a text file even though a newline,
as described above, won’t cause the next character to be displayed on the next line
when viewed through a browser.
It is considered a text file because all the «control characters» are themselves
«printable characters» when viewed in a regular text editor.

Word Document

Filename extension: .doc
Internet media type: application/msword[1]
Uniform Type Identifier (UTI): com.microsoft.word.doc[2][3]
Developed by: Microsoft
Latest release: 10.1, 17 May 2022[4]
Type of format: document file format
Container for: text, image, table
Extended from: Compound File Binary Format (since 97)
Extended to: Microsoft Office XML formats, Office Open XML
Open format?: Yes

.doc (an abbreviation of «document») is a filename extension used for word processing documents stored in Microsoft’s proprietary Microsoft Word Binary File Format.[4] Microsoft has used the extension since 1983.

Microsoft Word Binary File Format

Binary DOC files often contain more text formatting information (as well as scripts and undo information) than some other document file formats like Rich Text Format and Hypertext Markup Language, but are usually less widely compatible.

DOC files created by different versions of Microsoft Word differ. Microsoft Word versions before Word 97 («8.0») used a different format from the OLE and CFBF-based format of Microsoft Word 97 – 2003.

In Microsoft Word 2007 and later, the binary file format was replaced as the default format by the Office Open XML format, though Microsoft Word can still produce DOC files.

Application support

The DOC format is native to Microsoft Word. Other word processors, such as OpenOffice.org Writer, IBM Lotus Symphony, Apple Pages and AbiWord, can also create and read DOC files, although with some limitations. Command line programs for Unix-like operating systems that can convert files from the DOC format to plain text or other standard formats include the wv library, which itself is used directly by AbiWord.

Specification

Because the DOC file format was a closed specification for many years, inconsistent handling of the format persists and may cause some loss of formatting information when handling the same file with multiple word processing programs. Some specifications for Microsoft Office 97 binary file formats were published in 1997 under a restrictive license, but these specifications were removed from online download in 1999.[5][6][7][8] Specifications of later versions of Microsoft Office binary file formats were not publicly available. The DOC format specification was available from Microsoft on request[9] since 2006[10] under restrictive RAND-Z terms until February 2008. Sun Microsystems and OpenOffice.org reverse engineered the file format.[11] On February 15, 2008, Microsoft released a .DOC format specification[4][12][13] under the Microsoft Open Specification Promise.[14][15] However, this specification does not describe all of the features used by DOC format and reverse engineered work remains necessary.[16] Since 2008 the specification has been updated several times; the latest change was made in May 2022.

The formats used in earlier, pre-97 («1.0» 1989 through «7.0» 1995) versions of Word are less well known, but both OpenOffice and LibreOffice contain open-source code for reading these formats. The format is probably related to the «Stream» format found in similar Excel versions.[17] Word 95 also seems to have an OLE-wrapped form.

Other file formats

Some historical documentation may use the DOC filename extension for plain-text files, indicating documentation for software or hardware. The DOC filename extension was also used during the 1980s by WordPerfect for its proprietary format.

DOC is sometimes used by users of Palm OS as shorthand for PalmDoc, an unrelated format (commonly using PDB filename extension) used to encode text files such as ebooks.

See also

  • docx, the file format used by modern versions of Word
  • De facto standard
  • Dominant design

References

  1. ^ «IME Content-Type/Subtype — application/msword». IANA. 1993-07-22. Retrieved 2012-06-20.
  2. ^ Uniform Type Identifiers Reference (PDF), Apple, retrieved 2012-06-20
  3. ^ «System-Declared Uniform Type Identifiers (Mac OS X v10.4)». Apple Developer Connection. Apple Inc. 2008-04-08.
  4. ^ a b c MS-DOC: Word (.doc) Binary File Format, 2019-11-19, retrieved 2020-02-25
  5. ^ «Comparing ODF and OOXML» (pdf). 2006. Retrieved 2011-05-23.
  6. ^ Beware of Geeks Bearing Gifts, 2006, retrieved 2011-05-23
  7. ^ «A Word 8 converter for Unix». Retrieved 2011-05-23.
  8. ^ «Microsoft Word 97 Binary File Format». Retrieved 2011-05-23.
  9. ^ «Royalty-free specifications for Microsoft Office binary file formats». Retrieved 2011-05-23.
  10. ^ «Mapping documents in the binary format (.doc; .xls; .ppt) to the Open XML format». 2008-01-16. Retrieved 2011-05-23.
  11. ^ «Microsoft Compound Document Format» (PDF). OpenOffice.org. 2007-08-07.
  12. ^ Microsoft Office Binary (doc, xls, ppt) File Formats, 2008-02-15, archived from the original on 2008-02-18
  13. ^ «Microsoft Office Word 97 — 2007 Binary File Format Specification (*.doc)» (PDF). Microsoft Corporation. 2008.
  14. ^ «Microsoft Open Specification Promise». Microsoft Corporation. March 23, 2009.
  15. ^ «How to extract information from Office files by using Office file formats and schemas». Retrieved 2011-05-23.
  16. ^ Joel Spolsky. «Why are the Microsoft Office file formats so complicated? (And some workarounds)». Archived from the original on 2013-10-14. Retrieved 2011-05-23.
  17. ^ «LibreOffice/core». GitHub.

External links

  • DOC, XLS, and PPT specifications
  • Microsoft Compound Document Format — OpenOffice.org

Microsoft Word is a word processor developed by Microsoft. It was first released on October 25, 1983,[9] under the name Multi-Tool Word for Xenix systems.[10][11][12] Subsequent versions were later written for several other platforms including: IBM PCs running DOS (1983), Apple Macintosh running the Classic Mac OS (1985), AT&T UNIX PC (1985), Atari ST (1988), OS/2 (1989), Microsoft Windows (1989), SCO Unix (1990), macOS (2001), Web browsers (2010), iOS (2014) and Android (2015). Using Wine, versions of Microsoft Word before 2013 can be run on Linux.

Microsoft Word

Developer(s): Microsoft
Initial release: October 25, 1983 (as Multi-Tool Word)
Stable release: 2209 (16.0.15629.20208), October 11, 2022[1]
Written in: C++ (back-end)[2]
Operating system: Windows 10 and later, Windows Server 2016 and later; Office 365 only: Windows 7 SP1, Windows Server 2008 R2 and later[3]
Platform: IA-32, x64, ARM, ARM64
Type: Word processor
License: Trialware
Website: microsoft.com/en-us/microsoft-365/word

Microsoft Word for Mac

Developer(s): Microsoft
Stable release: 16.64 (Build 22081401), August 16, 2022[4]
Written in: C++ (back-end), Objective-C (API/UI)[2]
Operating system: macOS
Type: Word processor
License: Proprietary software plus services
Website: products.office.com/word

Microsoft Word for Android

Original author(s): Microsoft Corporation
Developer(s): Microsoft Corporation
Initial release: January 29, 2015[5]
Stable release: 16.0.15427.20090, July 14, 2022[6]
Operating system: Android Pie and later
License: Proprietary commercial software
Website: products.office.com/word

Microsoft Word for iOS

Developer(s): Microsoft Corporation
Initial release: March 27, 2014[7]
Stable release: 2.63.2, July 18, 2022[8]
Operating system: iOS 14 or later, iPadOS 14 or later
License: Proprietary commercial software
Website: products.office.com/word

Word Mobile for Windows 10

Developer(s): Microsoft
Operating system: Windows 10 and later, Windows 10 Mobile
Type: Word processor
License: Freemium
Website: www.microsoft.com/store/productId/9WZDNCRFJB9S

Commercial versions of Word are licensed as a standalone product or as a component of Microsoft Office suite of software, which can be purchased either with a perpetual license or as part of a Microsoft 365 subscription.

History

Origins

In 1981, Microsoft hired Charles Simonyi, the primary developer of Bravo, the first GUI word processor, which was developed at Xerox PARC.[13] Simonyi started work on a word processor called Multi-Tool Word and soon hired Richard Brodie, a former Xerox intern, who became the primary software engineer.[13][14][15]

Microsoft announced Multi-Tool Word for Xenix[13] and MS-DOS in 1983.[16] Its name was soon simplified to Microsoft Word.[10] Free demonstration copies of the application were bundled with the November 1983 issue of PC World, making it the first to be distributed on-disk with a magazine.[10][17] That year Microsoft demonstrated Word running on Windows.[18]

Unlike most MS-DOS programs at the time, Microsoft Word was designed to be used with a mouse.[16] Advertisements depicted the Microsoft Mouse and described Word as a WYSIWYG, windowed word processor with the ability to undo and display bold, italic, and underlined text,[19] although it could not render fonts.[10] It was not initially popular, since its user interface was different from the leading word processor at the time, WordStar.[20] However, Microsoft steadily improved the product, releasing versions 2.0 through 5.0 over the next six years. In 1985, Microsoft ported Word to the classic Mac OS (known as Macintosh System Software at the time). This was made easier by Word for DOS having been designed for use with high-resolution displays and laser printers, even though none were yet available to the general public.[21] It was also notable for its very fast cut-and-paste function and unlimited number of undo operations, which are due to its usage of the piece table data structure.[22]

Following the precedents of LisaWrite and MacWrite, Word for Mac OS added true WYSIWYG features. It fulfilled a need for a word processor that was more capable than MacWrite.[23] After its release, Word for Mac OS’s sales were higher than its MS-DOS counterpart for at least four years.[13]

The second release of Word for Mac OS, shipped in 1987, was named Word 3.0 to synchronize its version number with Word for DOS; this was Microsoft’s first attempt to synchronize version numbers across platforms. Word 3.0 included numerous internal enhancements and new features, including the first implementation of the Rich Text Format (RTF) specification, but was plagued with bugs. Within a few months, Word 3.0 was superseded by a more stable Word 3.01, which was mailed free to all registered users of 3.0.[21] After MacWrite Pro was discontinued in the mid-1990s, Word for Mac OS never had any serious rivals. Word 5.1 for Mac OS, released in 1992, was a very popular word processor owing to its elegance, relative ease of use, and feature set. Many users say it is the best version of Word for Mac OS ever created.[21][24]

In 1986, an agreement between Atari and Microsoft brought Word to the Atari ST[25] under the name Microsoft Write. The Atari ST version was a port of Word 1.05 for the Mac OS[26][27] and was never updated.

The first version of Word for Windows was released in 1989. With the release of Windows 3.0 the following year, sales began to pick up and Microsoft soon became the market leader for word processors for IBM PC-compatible computers.[13] In 1991, Microsoft capitalized on Word for Windows’ increasing popularity by releasing a version of Word for DOS, version 5.5, that replaced its unique user interface with an interface similar to a Windows application.[28][29] When Microsoft became aware of the Year 2000 problem, it made Microsoft Word 5.5 for DOS available for free downloads. As of February 2021, it is still available for download from Microsoft’s website.[30]
In 1991, Microsoft embarked on a project code-named Pyramid to completely rewrite Microsoft Word from the ground up. Both the Windows and Mac OS versions would start from the same code base. It was abandoned when it was determined that it would take the development team too long to rewrite and then catch up with all the new capabilities that could have been added at the same time without a rewrite. Instead, the next versions of Word for Windows and Mac OS, dubbed version 6.0, both started from the code base of Word for Windows 2.0.[24]

With the release of Word 6.0 in 1993, Microsoft again attempted to synchronize the version numbers and coordinate product naming across platforms, this time across DOS, Mac OS, and Windows (this was the last version of Word for DOS). It introduced AutoCorrect, which automatically fixed certain typing errors, and AutoFormat, which could reformat many parts of a document at once. While the Windows version received favorable reviews (e.g., from InfoWorld[31]), the Mac OS version was widely derided. Many accused it of being slow, clumsy, and memory intensive, and its user interface differed significantly from Word 5.1.[24] In response to user requests, Microsoft offered Word 5 again, after it had been discontinued.[32] Subsequent versions of Word for macOS are no longer direct ports of Word for Windows, instead featuring a mixture of ported code and native code.

Word for Windows

Word for Windows is available stand-alone or as part of the Microsoft Office suite. Word contains rudimentary desktop publishing capabilities and is the most widely used word processing program on the market. Word files are commonly used as the format for sending text documents via e-mail because almost every user with a computer can read a Word document by using the Word application, a Word viewer or a word processor that imports the Word format (see Microsoft Word Viewer).

Word 6 for Windows NT was the first 32-bit version of the product,[33] released with Microsoft Office for Windows NT around the same time as Windows 95. It was a straightforward port of Word 6.0. Starting with Word 95, each release of Word was named after the year of its release, instead of its version number.[34]

Word 2007 introduced a redesigned user interface that emphasized the most common controls, dividing them into tabs, and adding specific options depending on the context, such as selecting an image or editing a table.[35] This user interface, called Ribbon, was included in Excel, PowerPoint and Access 2007, and would be later introduced to other Office applications with Office 2010 and Windows applications such as Paint and WordPad with Windows 7, respectively.[36]

The redesigned interface also includes a toolbar that appears when selecting text, with options for formatting included.[37]

Word 2007 also included the option to save documents as Adobe Acrobat or XPS files,[37] and upload Word documents like blog posts on services such as WordPress.

Word 2010 allows the customization of the Ribbon,[38] adds a Backstage view for file management,[39] has improved document navigation, allows creation and embedding of screenshots,[40] and integrates with online services such as Microsoft OneDrive.[41]

Word 2019 added a dictation function.

Word 2021 added co-authoring, a visual refresh on the start experience and tabs, automatic cloud saving, dark mode, line focus, an updated draw tab, and support for ODF 1.3.

Word for Mac

The Mac was introduced on January 24, 1984, and Microsoft introduced Word 1.0 for Mac a year later, on January 18, 1985. The DOS, Mac, and Windows versions are quite different from each other. Only the Mac version was WYSIWYG and used a graphical user interface, far ahead of the other platforms. Each platform restarted its version numbering at «1.0».[42] There was no version 2 on the Mac, but version 3 came out on January 31, 1987, as described above. Word 4.0 came out on November 6, 1990, and added automatic linking with Excel, the ability to flow text around graphics, and a WYSIWYG page view editing mode. Word 5.1 for Mac, released in 1992, ran on the original 68000 CPU and was the last to be specifically designed as a Macintosh application. The later Word 6 was a Windows port and was poorly received. Word 5.1 continued to run well until the last Classic Mac OS. Many people continue to run Word 5.1 to this day under an emulated Mac classic system for some of its excellent features, such as document generation and renumbering, or to access their old files.

In 1997, Microsoft formed the Macintosh Business Unit as an independent group within Microsoft focused on writing software for Mac OS. Its first version of Word, Word 98, was released with Office 98 Macintosh Edition. Document compatibility reached parity with Word 97,[32] and it included features from Word 97 for Windows, including spell and grammar checking with squiggles.[43] Users could choose the menus and keyboard shortcuts to be similar to either Word 97 for Windows or Word 5 for Mac OS.

Word 2001, released in 2000, added a few new features, including the Office Clipboard, which allowed users to copy and paste multiple items.[44] It was the last version to run on classic Mac OS and, on Mac OS X, it could only run within the Classic Environment. Word X, released in 2001, was the first version to run natively on, and required, Mac OS X,[43] and introduced non-contiguous text selection.[45]

Word 2004 was released in May 2004. It included a new Notebook Layout view for taking notes either by typing or by voice.[46] Other features, such as tracking changes, were made more similar with Office for Windows.[47]

Word 2008, released on January 15, 2008, included a Ribbon-like feature, called the Elements Gallery, that can be used to select page layouts and insert custom diagrams and images. It also included a new view focused on publishing layout, integrated bibliography management,[48] and native support for the new Office Open XML format. It was the first version to run natively on Intel-based Macs.[49]

Word 2011, released in October 2010, replaced the Elements Gallery in favor of a Ribbon user interface that is much more similar to Office for Windows,[50] and includes a full-screen mode that allows users to focus on reading and writing documents, and support for Office Web Apps.[51]

Word 2021 added real-time co-authoring, automatic cloud saving, dark mode, immersive reader enhancements, line focus, a visual refresh, the ability to save pictures in SVG format, and a new Sketched style outline.

File formats

Native file formats

  • DOC: Legacy Word document
  • DOT: Legacy Word templates
  • WBK: Legacy Word document backup
  • DOCX: XML Word document
  • DOCM: XML Word macro-enabled document
  • DOTX: XML Word template
  • DOTM: XML Word macro-enabled template
  • DOCB: XML Word binary document

Filename extensions

Microsoft Word’s native file formats are denoted either by a .doc or .docx filename extension.

Although the .doc extension has been used in many different versions of Word, it actually encompasses four distinct file formats:

  1. Word for DOS
  2. Word for Windows 1 and 2; Word 3 and 4 for Mac OS
  3. Word 6 and Word 95 for Windows; Word 6 for Mac OS
  4. Word 97 and later for Windows; Word 98 and later for Mac OS

(The classic Mac OS of the era did not use filename extensions.)[52]

The newer .docx extension signifies the Office Open XML international standard for Office documents and is used by default by Word 2007 and later for Windows as well as Word 2008 and later for macOS.[53]

Binary formats (Word 97–2007)

During the late 1990s and early 2000s, the default Word document format (.DOC) became a de facto standard of document file formats for Microsoft Office users.[citation needed] There are different versions of «Word Document Format» used by default in Word 97–2007.[54] Each binary Word file is a Compound File,[55] a hierarchical file system within a file. According to Joel Spolsky, the Word Binary File Format is extremely complex mainly because its developers had to accommodate an overwhelming number of features and prioritize performance over anything else.

As with all OLE Compound Files, Word Binary Format consists of «storages», which are analogous to computer folders and «streams», which are similar to computer files. Each storage may contain streams or other storage. Each Word Binary File must contain a stream called the «WordDocument» stream and this stream must start with a File Information Block (FIB).[57] FIB serves as the first point of reference for locating everything else, such as where the text in a Word document starts, ends, what version of Word created the document and other attributes.
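
A quick way to tell the two container types apart is to look at the first few bytes of a file. The sketch below makes some assumptions that are not part of the text above: the 8-byte signature D0 CF 11 E0 A1 B1 1A E1 is the header signature of Compound File Binary files, "PK\x03\x04" is the usual start of a ZIP archive (the container used by .docx), and the function name sniff_word_container is made up. It only inspects the signature. It does not parse storages, streams or the FIB.

```python
# Header signature of a Compound File Binary (OLE) container.
CFB_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")
# Local file header signature at the start of a typical ZIP archive.
ZIP_MAGIC = b"PK\x03\x04"

def sniff_word_container(head: bytes) -> str:
    """Classify a file by its first bytes; says nothing about validity."""
    if head.startswith(CFB_MAGIC):
        return "compound file"
    if head.startswith(ZIP_MAGIC):
        return "zip archive"
    return "unknown"

print(sniff_word_container(CFB_MAGIC + b"\x00" * 8))  # compound file
print(sniff_word_container(b"PK\x03\x04rest"))        # zip archive
print(sniff_word_container(b"Plain text\n"))          # unknown
```

In practice you would pass in the first 8 bytes read from a file opened in binary mode. A match only means the container type fits, since other Office formats such as .xls share the same compound file signature.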

Word 2007 and later continue to support the DOC file format, although it is no longer the default.

XML Document (Word 2003)

The XML format introduced in Word 2003[58] was a simple, XML-based format called WordProcessingML or WordML.

The Microsoft Office XML formats are XML-based document formats (or XML schemas) introduced in versions of Microsoft Office prior to Office 2007. Microsoft Office XP introduced a new XML format for storing Excel spreadsheets and Office 2003 added an XML-based format for Word documents.

These formats were succeeded by Office Open XML (ECMA-376) in Microsoft Office 2007.

Cross-version compatibility

Opening a Word Document file in a version of Word other than the one with which it was created can cause an incorrect display of the document. The document formats of the various versions change in subtle and not-so-subtle ways (such as changing the font or the handling of more complex tasks like footnotes). Formatting created in newer versions does not always survive when viewed in older versions of the program, nearly always because that capability does not exist in the previous version.[59] Rich Text Format (RTF), an early effort to create a format for interchanging formatted text between applications, is an optional format for Word that retains most formatting and all content of the original document.

Third-party formats

Plugins permitting the Windows versions of Word to read and write formats it does not natively support, such as international standard OpenDocument format (ODF) (ISO/IEC 26300:2006), are available. Up until the release of Service Pack 2 (SP2) for Office 2007, Word did not natively support reading or writing ODF documents without a plugin, namely the SUN ODF Plugin or the OpenXML/ODF Translator. With SP2 installed, ODF format 1.1 documents can be read and saved like any other supported format in addition to those already available in Word 2007.[59][60][61][62][63] The implementation faces substantial criticism, and the ODF Alliance and others have claimed that the third-party plugins provide better support.[64] Microsoft later declared that the ODF support has some limitations.[65]

In October 2005, one year before the Microsoft Office 2007 suite was released, Microsoft declared that there was insufficient demand from Microsoft customers for the international standard OpenDocument format support and that therefore it would not be included in Microsoft Office 2007. This statement was repeated in the following months.[66][67][68][69] As an answer, on October 20, 2005, an online petition was created to demand ODF support from Microsoft.[70]

In May 2006, the ODF plugin for Microsoft Office was released by the OpenDocument Foundation.[71] Microsoft declared that it had no relationship with the developers of the plugin.[72]

In July 2006, Microsoft announced the creation of the Open XML Translator project – tools to build a technical bridge between the Microsoft Office Open XML Formats and the OpenDocument Format (ODF). This work was started in response to government requests for interoperability with ODF. The goal of the project was not to add ODF support to Microsoft Office, but only to create a plugin and an external toolset.[73][74] In February 2007, this project released a first version of the ODF plugin for Microsoft Word.[75]

In February 2007, Sun released an initial version of its ODF plugin for Microsoft Office.[76] Version 1.0 was released in July 2007.[77]

Microsoft Word 2007 (Service Pack 1) supports (for output only) PDF and XPS formats, but only after manual installation of the Microsoft ‘Save as PDF or XPS’ add-on.[78][79] On later releases, this was offered by default.

Features and flaws

Among its features, Word includes a built-in spell checker, a thesaurus, a dictionary, and utilities for manipulating and editing text. It supports creating tables. Depending on the version, it can perform simple calculations, and supports formatting formulas and equations.

The following are some aspects of its feature set.

Templates

Several later versions of Word let users create their own formatting templates, that is, files in which the title, heading, paragraph, and other element designs differ from the standard Word templates.[80] Users can find how to do this under the Help section located near the top right corner (Word 2013 on Windows 8).

For example, Normal.dotm is the master template from which all Word documents are created. It determines the margin defaults as well as the layout of the text and font defaults. Although Normal.dotm is already set with certain defaults, the user can change it to new defaults. This will change other documents which were created using the template.[81] In earlier versions, the file was named Normal.dot.[82]

Image formats

Word can import and display images in common bitmap formats such as JPG and GIF. It can also be used to create and display simple line art. Microsoft Word added support[83] for the common SVG vector image format in 2017 for Office 365 ProPlus subscribers and this functionality was also included in the Office 2019 release.

WordArt

An example image created with WordArt

WordArt enables drawing text in a Microsoft Word document such as a title, watermark, or other text, with graphical effects such as skewing, shadowing, rotating, stretching in a variety of shapes and colors, and even including three-dimensional effects. Users can apply formatting effects such as shadow, bevel, glow, and reflection to their document text as easily as applying bold or underline. Users can also spell-check text that uses visual effects and add text effects to paragraph styles.

Macros

A macro is a rule or pattern that specifies how a certain input sequence (often a sequence of characters) should be mapped to an output sequence according to a defined process. Frequently used or repetitive sequences of keystrokes and mouse movements can be automated. Like other Microsoft Office documents, Word files can include advanced macros and even embedded programs. The language was originally WordBasic, but changed to Visual Basic for Applications as of Word 97.

This extensive functionality can also be used to run and propagate viruses in documents. The tendency for people to exchange Word documents via email, USB flash drives, and floppy disks made this an especially attractive vector in 1999. A prominent example was the Melissa virus, but countless others have existed.

These macro viruses were the only known cross-platform threats between Windows and Macintosh computers and they were the only infection vectors to affect any macOS system up until the advent of video codec trojans in 2007.[citation needed] Microsoft released patches for Word X and Word 2004 that effectively eliminated the macro problem on the Mac by 2006.

Word’s macro security setting, which regulates when macros may execute, can be adjusted by the user, but in the most recent versions of Word, it is set to HIGH by default, generally reducing the risk from macro-based viruses, which have become uncommon.

Layout issues

Before Word 2010 (Word 14) for Windows, the program was unable to correctly handle ligatures defined in OpenType fonts.[84] Those ligature glyphs with Unicode codepoints may be inserted manually, but are not recognized by Word for what they are, breaking spell checking, while custom ligatures present in the font are not accessible at all. Since Word 2010, the program has advanced typesetting features that can be enabled:[85] OpenType ligatures,[86] kerning, and hyphenation (previous versions already had the latter two features). Other layout deficiencies of Word include the inability to set crop marks or thin spaces. Various third-party workaround utilities have been developed.[87]

In Word 2004 for Mac OS X, support of complex scripts was inferior even to Word 97[88] and Word 2004 did not support Apple Advanced Typography features like ligatures or glyph variants.[89]

Issues with technical documents

Microsoft Word is only awkwardly suited to some kinds of technical writing, specifically writing that requires mathematical equations,[90] figure placement, table placement, and cross-references to any of these items.[citation needed] The usual workaround for equations is to use a third-party equation typesetter.[citation needed] Figures and tables must be placed manually. There is an anchor mechanism, but it is not designed for fully automatic figure placement; editing text after placing figures and tables often requires re-placing those items by moving the anchor point, and even then the placement options are limited.[citation needed] This problem has been deeply baked into Word’s structure since 1985, as it does not know where page breaks will occur until the document is printed.[citation needed]

Bullets and numbering

Microsoft Word supports bullet lists and numbered lists. It also features a numbering system that helps add correct numbers to pages, chapters, headers, footnotes, and entries of tables of content; these numbers automatically change to correct ones as new items are added or existing items are deleted. Bullets and numbering can be applied directly to paragraphs and converted to lists.[91] Word 97 through 2003, however, had problems adding correct numbers to numbered lists. In particular, a second, unrelated numbered list might not have started with number one but instead resumed numbering after the last numbered list. Although Word 97 supported a hidden marker that said the list numbering must restart afterward, the command to insert this marker (Restart Numbering command) was only added in Word 2003. However, if one were to cut the first item of the list and paste it as another item (e.g. fifth), then the restart marker would have moved with it and the list would have restarted in the middle instead of at the top.[92]

Word continues to default to non-Unicode characters and non-hierarchical bulleting, despite user preference for PowerPoint-style symbol hierarchies (e.g., filled circle/em dash/filled square/en dash/empty circle) and universal compatibility.

AutoSummarize

Available in certain versions of Word (e.g., Word 2007), AutoSummarize highlights passages or phrases that it considers valuable and can be a quick way of generating a crude abstract or an executive summary.[93] The amount of text to be retained can be specified by the user as a percentage of the current amount of text.

According to Ron Fein of the Word 97 team, AutoSummarize cuts wordy copy to the bone by counting words and ranking sentences. First, AutoSummarize identifies the most common words in the document (barring «a» and «the» and the like) and assigns a «score» to each word – the more frequently a word is used, the higher the score. Then, it «averages» each sentence by adding the scores of its words and dividing the sum by the number of words in the sentence – the higher the average, the higher the rank of the sentence. «It’s like the ratio of wheat to chaff,» explains Fein.[94]
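
The scoring scheme Fein describes can be sketched in a few lines of Python. This is only an illustration of the idea, not Word's actual implementation: the stop-word list, the naive sentence splitter, and the function names (`split_sentences`, `tokenize`, `summarize`) are all assumptions made for the example, and the percentage is applied to the number of sentences rather than to the amount of text.

```python
import re
from collections import Counter

# Assumed stop-word list; which words Word really ignores is not stated in the text.
STOP_WORDS = {"a", "an", "the", "and", "or", "of", "to", "in", "is", "it"}

def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def tokenize(sentence):
    """Return the lowercase words of a sentence, ignoring punctuation."""
    return re.findall(r"[a-z']+", sentence.lower())

def summarize(text, percent=25):
    """Keep roughly `percent` percent of the sentences, highest average score first."""
    sentences = split_sentences(text)
    # A word's score is its frequency in the document; stop words are not counted.
    freq = Counter(w for s in sentences for w in tokenize(s) if w not in STOP_WORDS)

    def average(sentence):
        words = tokenize(sentence)
        if not words:
            return 0.0
        # Sum of word scores divided by the number of words in the sentence.
        return sum(freq[w] for w in words) / len(words)

    keep = max(1, round(len(sentences) * percent / 100))
    ranked = sorted(range(len(sentences)), key=lambda i: average(sentences[i]), reverse=True)
    # Return the chosen sentences in their original document order.
    return [sentences[i] for i in sorted(ranked[:keep])]
```

Calling `summarize(text, percent=34)` on a three-sentence text keeps the single sentence with the highest ratio of frequent words to total words, which is the «wheat to chaff» ratio from the quote.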

AutoSummarize was removed from Microsoft Word for Mac OS X 2011, although it was present in Word for Mac 2008. AutoSummarize was removed from the Office 2010 release version (14) as well.[95]

Other platforms

Word for mobile

Word Mobile[96] is a word processor that allows creating and editing documents. It supports basic formatting, such as bolding, changing font size, and changing colors (from red, yellow, or green). It can add comments, but can’t edit documents with tracked changes. It can’t open password-protected documents; change the typeface, text alignment, or style (normal, heading 1); create bulleted lists; insert pictures; or undo.[97][98][99] Word Mobile can neither display nor insert footnotes, endnotes, page headers, page footers, page breaks, certain indentation of lists, and certain fonts while working on a document, but it retains them if the original document has them.[100] In addition to the features of the 2013 version, the 2007 version on Windows Mobile can also save documents in the Rich Text Format and open legacy PSW (Pocket Word) files.[100] Furthermore, it includes a spell checker, word count tool, and a «Find and Replace» command. In 2015, Word Mobile became available for Windows 10 and Windows 10 Mobile on Windows Store.[101]

Support for the Windows 10 Mobile version ended on January 12, 2021.[102]

Word for the web

Word for the web is a free lightweight version of Microsoft Word available as part of Office on the web, which also includes web versions of Microsoft Excel and Microsoft PowerPoint.

Word for the web lacks some Ribbon tabs, such as Design and Mailings. Mailings allows users to print envelopes and labels and manage mail merge printing of Word documents.[103][104] Word for the web cannot edit certain objects, such as equations, shapes, text boxes, or drawings, but a placeholder may be present in the document. Certain advanced features like table sorting or columns will not be displayed but are preserved as they were in the document. Other views available in the Word desktop app (Outline, Draft, Web Layout, and Full-Screen Reading) are not available, nor are side-by-side viewing, split windows, and the ruler.[105]

Password protection

Three password types can be set in Microsoft Word:

  • Password to open a document[106]
  • Password to modify a document[106]
  • Password restricting formatting and editing[107]

The second and third password types were developed by Microsoft for convenient shared use of documents rather than for their protection. Documents protected by such passwords are not encrypted, and the Microsoft Office protection system saves a hash of the password in the document’s header, where it can be easily accessed and removed by specialized software. The password to open a document offers much tougher protection, which has been steadily enhanced in subsequent editions of Microsoft Office.

Word 95 and all preceding editions had the weakest protection, which converted a password to a 16-bit key.

In Word 97 and 2000, the key length was increased to 40 bits. However, modern cracking software can remove such a password very quickly: an exhaustive search takes one week at most, and the use of rainbow tables reduces password removal time to several seconds. Some password recovery software can not only remove a password but also find the actual password that the user chose to encrypt the document, using a brute-force attack. The chance of recovering the password this way depends on the password’s strength.
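
The key sizes mentioned above translate into search-space sizes as follows. This is a rough sketch of the arithmetic only; the rate of 2,000,000 keys per second is an assumed figure chosen for illustration, not a measurement of any real cracking tool, and the function names are hypothetical.

```python
SECONDS_PER_DAY = 86_400

def keyspace(bits):
    """Number of distinct keys for a key of the given length in bits."""
    return 2 ** bits

def worst_case_days(bits, keys_per_second):
    """Days needed to try every key at a constant (assumed) testing rate."""
    return keyspace(bits) / keys_per_second / SECONDS_PER_DAY

ASSUMED_RATE = 2_000_000  # keys per second; illustrative assumption only

for bits in (16, 40):
    print(bits, "bits:", keyspace(bits), "keys,",
          round(worst_case_days(bits, ASSUMED_RATE), 4), "days at the assumed rate")
```

A 16-bit key allows only 65,536 possibilities, while a 40-bit key allows about 1.1 trillion. At the assumed rate, an exhaustive search of the 40-bit space finishes in under a week.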

In Word 2003/XP, the default protection remained the same, but an option was added that allowed advanced users to choose a Cryptographic Service Provider (CSP).[108] If a strong CSP is chosen, guaranteed document decryption becomes unavailable and, therefore, the password can’t be removed from the document. Nonetheless, a password can be found fairly quickly with a brute-force attack, because the attack speed remains high regardless of the CSP selected. Moreover, since the CSPs are not active by default, their use is limited to advanced users.

Word 2007 offers significantly more secure document protection, which uses the Advanced Encryption Standard (AES) with a 128-bit key derived from the password by applying the SHA-1 hash function 50,000 times. This makes password removal impossible (as of today, no computer exists that can find the key in a reasonable amount of time) and drastically slows brute-force attacks, down to several hundred passwords per second.

Word 2010’s protection algorithm was unchanged apart from an increase in the number of SHA-1 iterations to 100,000; consequently, the brute-force attack speed was halved again.
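
The idea of hashing a password many times to obtain a key can be sketched as follows. This is a simplified illustration of iterated hashing in general and not the exact procedure Word uses: the salt handling, the UTF-8 encoding, the way the counter is mixed in, and the name `derive_key` are all assumptions made for the example.

```python
import hashlib

def derive_key(password, salt, iterations=50_000, key_bytes=16):
    """Stretch a password into a key by applying SHA-1 repeatedly (illustrative only)."""
    # Start from a hash of the salt and the password bytes.
    digest = hashlib.sha1(salt + password.encode("utf-8")).digest()
    # Feed each result back into SHA-1; every round adds the same cost to each guess.
    for counter in range(iterations):
        digest = hashlib.sha1(counter.to_bytes(4, "little") + digest).digest()
    # SHA-1 produces 20 bytes; keep the first 16 to get a 128-bit key.
    return digest[:key_bytes]

salt = bytes(16)  # an all-zero salt, used here only to keep the example reproducible
key_2007_style = derive_key("correct horse", salt, iterations=50_000)
key_2010_style = derive_key("correct horse", salt, iterations=100_000)
```

Because an attacker has to repeat the whole loop for every candidate password, raising the iteration count from 50,000 to 100,000 roughly doubles the work per guess.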

Reception

Initial releases of Word were met with criticism. Byte in 1984 criticized the documentation for Word 1.1 and 2.0 for DOS, calling it «a complete farce». It called the software «clever, put together well and performs some extraordinary feats», but concluded that «especially when operated with the mouse, has many more limitations than benefits … extremely frustrating to learn and operate efficiently».[109] PC Magazine’s review was very mixed, stating: «I’ve run into weird word processors before, but this is the first time one’s nearly knocked me down for the count» but acknowledging that Word’s innovations were the first that caused the reviewer to consider abandoning WordStar. While the review cited an excellent WYSIWYG display, sophisticated print formatting, windows, and footnoting as merits, it criticized many small flaws, very slow performance, and «documentation produced by Madame Sadie’s Pain Palace». It concluded that Word was «two releases away from potential greatness».[110]

Compute!’s Apple Applications in 1987 stated that «despite a certain awkwardness», Word 3.01 «will likely become the major Macintosh word processor» with «far too many features to list here». While criticizing the lack of true WYSIWYG, the magazine concluded that «Word is marvelous. It’s like a Mozart or Edison, whose occasional gaucherie we excuse because of his great gifts».[111]

Compute! in 1989 stated that Word 5.0’s integration of text and graphics made it «a solid engine for basic desktop publishing». The magazine approved of improvements to text mode, described the $75 price for upgrading from an earlier version as «the deal of the decade» and concluded that «as a high-octane word processor, Word is worth a look».[112]

During the first quarter of 1996, Microsoft Word accounted for 80% of the worldwide word processing market.[113]

Release history

Microsoft Word 2010 running on Windows 7

Microsoft Word for Windows release history

Year released | Name | Version | Comments
1989 | Word for Windows 1.0 | 1.0 | Code-named Opus[114]
1990 | Word for Windows 1.1 | 1.1 | For Windows 3.0.[115] Code-named Bill the Cat[citation needed]
1990 | Word for Windows 1.1a | 1.1a | On March 25, 2014, Microsoft made the source code to Word for Windows 1.1a available to the public via the Computer History Museum.[116][117]
1991 | Word for Windows 2.0 | 2.0 | Included in Office 3.0.
1993 | Word for Windows 6.0 | 6.0 | Version numbers 3, 4, and 5 were skipped, to bring Windows version numbering in line with that of DOS, Mac OS, and WordPerfect (the main competing word processor at the time). Also, a 32-bit version for Windows NT only. Included in Office 4.0, 4.2, and 4.3.
1995 | Word for Windows 95 | 7.0 | Included in Office 95
1997 | Word 97 | 8.0 | Included in Office 97
1998 | Word 98 | 8.5 | Included in Office 97
1999 | Word 2000 | 9.0 | Included in Office 2000
2001 | Word 2002 | 10.0 | Included in Office XP
2003 | Microsoft Word 2003 | 11.0 | Included in Office 2003
2006 | Microsoft Word 2007 | 12.0 | Included in Office 2007; released to businesses on November 30, 2006, released worldwide to consumers on January 30, 2007. Extended support until October 10, 2017.
2010 | Word 2010 | 14.0 | Included in Office 2010; skipped 13.0 due to triskaidekaphobia.[118]
2013 | Word 2013 | 15.0 | Included in Office 2013
2016 | Word 2016 | 16.0 | Included in Office 2016
2019 | Word 2019 | 16.0 | Included in Office 2019
2021 | Word 2021 | 16.0 | Included in Office 2021

Microsoft Word for classic Mac OS and macOS release history

Year released | Name | Version | Comments
1985 | Word 1 | 1.0 |
1987 | Word 3 | 3.0 |
1989 | Word 4 | 4.0 | Part of Office 1.0 and 1.5
1991 | Word 5 | 5.0 | Part of Office 3.0; requires System 6.0.2, 512 KB of RAM (1 MB for 5.1, 2 MB to use spell check and thesaurus), 6.5 MB available hard drive space[21]
1992 | Word 5.1 | 5.1 | Part of Office 3.0; last version to support 68000-based Macs[21]
1993 | Word 6 | 6.0 | Part of Office 4.2; shares code and user interface with Word for Windows 6; requires System 7.0, 4 MB of RAM (8 MB recommended), at least 10 MB available hard drive space, 68020 CPU[21]
1998 | Word 98 | 8.5 | Part of Office 98 Macintosh Edition; requires PowerPC-based Macintosh; renumbered alongside contemporary Windows version
2000 | Word 2001 | 9.0 | Part of Microsoft Office 2001; the last version that is compatible with Classic Mac OS (Mac OS 9 or earlier)
2001 | Word v. X | 10.0 | Part of Office v. X; first version for Mac OS X only
2004 | Word 2004 | 11.0 | Part of Office 2004
2008 | Word 2008 | 12.0 | Part of Office 2008
2010 | Word 2011 | 14.0 | Part of Office 2011; skipped 13.0 due to triskaidekaphobia.[118]
2015 | Word 2016 | 16.0 | Part of Office 2016; skipped 15.0
2019 | Word 2019 | 16.0 | Part of Office 2019
2021 | Word 2021 | 16.0 | Included in Office 2021

Word for MS-DOS release history

Year released | Name | Version | Comments
1983 | Word 1 | 1.0 | Initial version of Word
1985 | Word 2 | 2.0 |
1986 | Word 3 | 3.0 | Removed copy protection
1987 | Word 4 | 4.0 |
1989 | Word 5 | 5.0 |
1991 | Word 5.1 | 5.1 |
1991 | Word 5.5 | 5.5 | First DOS version to use a Windows-like user interface
1993 | Word 6 | 6.0 | Last DOS version.

Word release history on other platforms

Platform | Year released | Name | Comments
Atari ST | 1988 | Microsoft Write | Based on Microsoft Word 1.05 for Mac OS
OS/2 | 1989 | Microsoft Word 5.0 | Word 5.0 ran both under DOS and OS/2 dual-mode as a native OS/2 application
OS/2 | 1991 | Microsoft Word 5.5 | Word 5.5 ran both under DOS and OS/2 dual-mode as a native OS/2 application
OS/2 | 1990 | Microsoft Word for OS/2 Presentation Manager version 1.1 |
OS/2 | 1991 | Microsoft Word for OS/2 Presentation Manager version 1.2[citation needed] |
SCO Unix | 1990 | Microsoft Word for Unix version 5.0[119] |
SCO Unix | 1991 | Microsoft Word for Unix version 5.1[120] |

References

  1. «Update history for Microsoft Office 2019». Microsoft Docs. Retrieved April 13, 2021.
  2. «C++ in MS Office». cppcon. July 17, 2014. Archived from the original on November 7, 2019. Retrieved June 25, 2019.
  3. «System requirements for Office». Office.com. Microsoft. Retrieved March 30, 2019.
  4. «Update history for Office for Mac». Microsoft Docs.
  5. Lardinois, Frederic (January 29, 2015). «Microsoft’s Office For Android Tablets Comes Out Of Preview». TechCrunch. Retrieved January 28, 2023.
  6. «Microsoft Word: Write, Edit & Share Docs on the Go APKs». APKMirror.
  7. Cunningham, Andrew (March 27, 2014). «Microsoft brings Office to iPad, makes iPhone version free to all». Ars Technica. Retrieved January 27, 2023.
  8. «Microsoft Word». App Store.
  9. «Version 1.0 of today’s most popular applications, a visual tour – Pingdom Royal». Pingdom. June 17, 2009. Archived from the original on August 13, 2018. Retrieved April 12, 2016.
  10. A. Allen, Roy (October 2001). «Chapter 12: Microsoft in the 1980s» (PDF). A History of the Personal Computer: The People and the Technology (1st ed.). Allan Publishing. pp. 12/25–12/26. ISBN 978-0-9689108-0-1. Retrieved November 7, 2010.
  11. «Microsoft Office online, Getting to know you…again: The Ribbon». Archived from the original on May 11, 2011.
  12. «The history of branding, Microsoft history». Archived from the original on May 28, 2009.
  13. Edwards, Benj (October 22, 2008). «Microsoft Word Turns 25». PC World. Archived from the original on July 4, 2012. Retrieved November 7, 2010.
  14. Tsang, Cheryl (1999). Microsoft First Generation. John Wiley & Sons. ISBN 978-0-471-33206-0.
  15. Schaut, Rick (May 19, 2004). «Anatomy of a Software Bug». MSDN Blogs. Archived from the original on February 1, 2010. Retrieved December 2, 2006.
  16. Markoff, John (May 30, 1983). «Mouse and new WP program join Microsoft product lineup». InfoWorld. p. 10. Retrieved November 7, 2010.
  17. Pollack, Andrew (August 25, 1983). «Computerizing Magazines». The New York Times. Retrieved April 24, 2013.
  18. Lemmons, Phil (December 1983). «Microsoft Windows». BYTE. p. 48. Retrieved October 20, 2013.
  19. Advertisement (December 1983). «Undo. Windows. Mouse. Finally». BYTE. pp. 88–89. Retrieved October 20, 2013.
  20. Peterson, W.E. Pete (1994). Almost Perfect: How a Bunch of Regular Guys Built Wordperfect Corporation. Prima Publishing. ISBN 0-7881-9991-9.
  21. Knight, Dan (May 22, 2008). «Microsoft Word for Mac History». Low End Mac. Retrieved November 7, 2010.
  22. «The Piece Table».
  23. Brand, Stewart (1989). Whole Earth Software Catalog. ISBN 9780385233019. For a year, I waited for a heavier-duty word processor than MACWRITE. I finally got it— WORD.
  24. Schaut, Rick (February 26, 2004). «Mac Word 6.0». Buggin’ My Life Away. MSDN Blogs. Archived from the original on May 14, 2004. Retrieved June 21, 2010.
  25. «Atari announces agreement with Microsoft». Atarimagazines.com. April 25, 2008. Retrieved June 21, 2010.
  26. «Feature Review: Microsoft Write». Atarimagazines.com. April 25, 2008. Retrieved June 21, 2010.
  27. «Today’s Atari Corp.: A close up look inside». Atarimagazines.com. April 25, 2008. Retrieved June 21, 2010.
  28. Miller, Michael J. (November 12, 1990). «First Look: Microsoft Updates Look of And Adds Pull-Down Menus to Character-Based Word 5.5». InfoWorld. p. 151. Retrieved November 7, 2010.
  29. Needleman, Raphael (November 19, 1990). «Microsoft Word 5.5: Should You Fight or Switch?». InfoWorld. p. 106. Retrieved November 7, 2010.
  30. «Microsoft Word 5.5 for MS-DOS (EXE format)». Microsoft Download Center. Retrieved August 19, 2011.
  31. «War of the Words». InfoWorld. February 7, 1994. pp. 66–79. Retrieved November 7, 2010.
  32. Lockman, James T.W. (May 15, 1998). «UGeek Software Review: Microsoft Office 98 Gold for Macintosh». Archived from the original on December 3, 2010. Retrieved November 7, 2010.
  33. Rose, Daniel. «Microsoft Office for Windows NT». DanielSays.com – Daniel’s Legacy Computer Collections. Archived from the original on January 27, 2015. Retrieved May 15, 2015.
  34. Ericson, Richard (October 11, 2006). «Final Review: The Lowdown on Office 2007». Computerworld. Retrieved November 8, 2010.
  35. Lowe, Scott (December 11, 2006). «An introduction to the Microsoft Office 2007 ribbon interface». TechRepublic. Retrieved December 14, 2021.
  36. Shultz, Greg (February 25, 2009). «Be ready for new and improved applets in Windows 7». TechRepublic. Archived from the original on December 14, 2021. Retrieved December 14, 2021.
  37. Lowe, Scott (January 26, 2007). «Explore what is new and different in Microsoft Word 2007». TechRepublic. Retrieved December 14, 2021.
  38. Mendelson, Edward (May 11, 2010). «Microsoft Office 2010». PC Magazine. Retrieved November 8, 2010.
  39. Mendelson, Edward (May 11, 2010). «Microsoft Office 2010: Office 2010’s Backstage View». PC Magazine. Archived from the original on December 2, 2010. Retrieved November 8, 2010.
  40. Mendelson, Edward (May 11, 2010). «Microsoft Office 2010: Lots of Graphics Options». PC Magazine. Archived from the original on April 24, 2010. Retrieved December 14, 2021.
  41. «Introduction to Word Web App». Microsoft. Retrieved November 8, 2010.
  42. «Microsoft Word 1.x (Mac)». WinWorld. Retrieved December 22, 2021.
  43. McLean, Prince (November 12, 2007). «Road to Mac Office 2008: an introduction (Page 3)». AppleInsider. Archived from the original on July 7, 2011. Retrieved November 7, 2010.
  44. Tetrault, Gregory (January 2001). «Review: Microsoft Office 2001». ATPM: About This Particular Macintosh. Retrieved November 7, 2010.
  45. Negrino, Tom (February 1, 2002). «Review: Microsoft Office v. X». MacWorld. Archived from the original on August 18, 2010. Retrieved November 7, 2010.
  46. Lunsford, Kelly; Michaels, Philip; Snell, Jason (March 3, 2004). «Office 2004: First Look». MacWorld. Archived from the original on June 25, 2010. Retrieved November 7, 2010.
  47. Friedberg, Steve (May 25, 2004). «Review: Microsoft Office». MacNN. Archived from the original on April 5, 2010. Retrieved November 7, 2010.
  48. McLean, Prince (November 14, 2007). «Road to Mac Office 2008: Word ’08 vs Pages 3.0». AppleInsider. Retrieved November 7, 2010.
  49. McLean, Prince (November 12, 2007). «Road to Mac Office 2008: an introduction (Page 4)». AppleInsider. Archived from the original on July 7, 2011. Retrieved November 7, 2010.
  50. McLean, Prince (March 29, 2010). «New Office 11 for Mac sports dense ribbons of buttons». AppleInsider. Retrieved November 7, 2010.
  51. Dilger, Daniel Eran (October 25, 2010). «Review: Microsoft’s Office 2011 for Mac (Page 2)». Apple Insider. Archived from the original on October 28, 2010. Retrieved November 7, 2010.
  52. Oakley, Howard (May 2, 2015). «.why .the .extensions? Quirks in the naming of files and folders». The Eclectic Light Company. Archived from the original on February 26, 2020. Retrieved February 26, 2020. Macs used to be the only computers that did not need filename extensions…on classic Mac systems, you can name applications, documents, and most other files almost anything that you like, as the name is not linked in any way to the type of thing that file is.
  53. «DOCX Transitional (Office Open XML), ISO 29500:2008-2016, ECMA-376, Editions 1-5». loc.gov. January 20, 2017. Retrieved July 9, 2019.
  54. «5 Appendix A: Product Behavior» (PDF). [MS-DOC]: Word (.doc) Binary File Format (PDF). Redmond, WA: Microsoft. Archived from the original on January 10, 2015. Retrieved January 10, 2015.
  55. «2.1 File Structure» (PDF). [MS-DOC]: Word (.doc) Binary File Format (PDF). Redmond, WA: Microsoft. Archived from the original on January 10, 2015. Retrieved January 10, 2015.
  56. «2.1.1 WordDocument Stream» (PDF). [MS-DOC]: Word (.doc) Binary File Format (PDF). Redmond, WA: Microsoft. Archived from the original on January 10, 2015. Retrieved January 10, 2015.
  57. «What You Can Do with Word XML [Word 2003 XML Reference]». MSDN. 2004.
  58. Casson, Tony; Ryan, Patrick S. (May 1, 2006). «Open Standards, Open Source Adoption in the Public Sector, and Their Relationship to Microsoft’s Market Dominance». In Bolin, Sherrie (ed.). Standards Edge: Unifier or Divider?. Sheridan Books. p. 87. SSRN 1656616.
  59. «Microsoft Expands List of Formats Supported in Microsoft Office, May 21, 2008». News Center. Microsoft. May 21, 2008. Retrieved April 24, 2013.
  60. Fulton, Scott M. III (May 21, 2008). «Next Office 2007 service pack will include ODF, PDF support options». Betanews.
  61. Andy Updegrove (May 21, 2008). «Microsoft Office 2007 to Support ODF – and not OOXML, May 21, 2008». Consortiuminfo.org. Retrieved June 21, 2010.
  62. «Microsoft: Why we chose ODF support over OOXML, 23 May 2008». Software.silicon.com. Archived from the original on July 21, 2009. Retrieved June 21, 2010.
  63. «Fact-sheet Microsoft ODF support» (PDF). odfalliance. Archived from the original (PDF) on June 11, 2009. Retrieved May 24, 2009. Microsoft Excel 2007 will process ODF spreadsheet documents when loaded via the Sun Plug-In 3.0 for Microsoft Office or the SourceForge «OpenXML/ODF Translator Add-in for Office,» but will fail when using the «built-in» support provided by Office 2007 SP2.
  64. Microsoft. «What happens when I save a Word 2007 document in the OpenDocument Text format?». Archived from the original on March 18, 2010. Retrieved April 5, 2010.
  65. Goodwins, Rupert (October 3, 2005). «Office 12 to support PDF creation, 3 October 2005». News.zdnet.co.uk. Archived from the original on July 23, 2009. Retrieved June 21, 2010.
  66. Marson, Ingrid (October 6, 2005). «Microsoft ‘must support OpenDocument’, 6 October 2005». News.zdnet.co.uk. Archived from the original on July 25, 2009. Retrieved June 21, 2010.
  67. March 23, 2006, Gates: Office 2007 will enable a new class of application Mass. holding tight to OpenDocument – ZDNet. Archived July 21, 2009, at the Wayback Machine.
  68. «May 08, 2006 – Microsoft Office to get a dose of OpenDocument». Zdnet.com.au. Archived from the original on July 22, 2009. Retrieved June 21, 2010.
  69. OpenDocument Fellowship (October 20, 2005). «OpenDocument Support: Tell Microsoft You Want It!, 20 October 2005». Opendocumentfellowship.com. Archived from the original on March 23, 2008. Retrieved June 21, 2010.
  70. «Coming soon: ODF for MS Office, May 04, 2006». Linux-watch.com. May 4, 2006. Retrieved June 21, 2010.
  71. LaMonica, Martin (May 5, 2006). «Microsoft Office to get a dose of OpenDocument». CNET News. Retrieved June 21, 2010.
  72. «Microsoft Expands Document Interoperability, July 5, 2006». Microsoft.com. July 5, 2006. Archived from the original on February 4, 2007. Retrieved June 21, 2010.
  73. Jones, Brian; Rajabi, Zeyad (July 6, 2006). «Open XML Translator project announced (ODF support for Office)». Brian Jones: Office Solutions. Microsoft. Archived from the original on January 18, 2010. Retrieved April 24, 2013.
  74. LaMonica, Martin (February 1, 2007). «Microsoft to release ODF document converter». CNet News. Retrieved April 24, 2013.
  75. Lombardi, Candace (February 7, 2007). «Sun to release ODF translator for Microsoft Office». CNET. Retrieved June 21, 2010.
  76. Paul, Ryan (July 7, 2007). «Sun releases ODF Plugin 1.0 for Microsoft Office, July 07, 2007». Arstechnica.com. Retrieved June 21, 2010.
  77. «Download details: 2007 Microsoft Office Add-in: Microsoft Save as PDF or XPS». Microsoft.com. November 8, 2006. Retrieved June 21, 2010.
  78. «Microsoft to remove PDF support from Office 2007 in wake of Adobe dispute». TG Daily. June 2, 2006. Archived February 1, 2009, at the Wayback Machine.
  79. Klein, Matt. «Word Formatting: Mastering Styles and Document Themes». How-To Geek. Retrieved July 9, 2019.
  80. «Change the Normal template (Normal.dotm )». support.microsoft.com. Retrieved May 20, 2021.
  81. in-depth explanation of Normal.dot. Archived June 20, 2005, at the Wayback Machine.
  82. «Edit SVG images in Microsoft Office 365». Office Support. Microsoft. Retrieved February 4, 2019.
  83. What’s new in Word 2010. Retrieved July 1, 2010.
  84. Improving the look of papers written in Microsoft Word. Retrieved May 30, 2010.
  85. How to Enable OpenType Ligatures in Word 2010, Oreszek Blog, May 17, 2009.
  86. Such as «How to delete a blank page in Word». Sbarnhill.mvps.org. Archived from the original on May 5, 2010. Retrieved June 21, 2010.
  87. Alan Wood. «Unicode and Multilingual Editors and Word Processors for Mac OS X».
  88. Neuburg, Matt (May 19, 2004). «TidBITS : Word Up! Word 2004, That Is». Db.tidbits.com. Archived from the original on July 8, 2012. Retrieved June 21, 2010.
  89. «Automatically numbering equations and other equation-related questions in Word for Mac 2011». Microsoft Community. February 6, 2013.
  90. McGhie, John (March 26, 2011). «Word’s numbering explained». word.mvps.org.
  91. Aldis, Margaret (March 26, 2011). «Methods for restarting list numbering». Word.mvps.org.
  92. «How To Access Auto Summarize in Microsoft Word 2007». Sue’s Word Tips. December 14, 2011. Retrieved July 9, 2019.
  93. Gore, Karenna (February 9, 1997). «Cognito Auto Sum». Slate. Retrieved June 21, 2010.
  94. Changes in Word 2010 (for IT pros). Technet.microsoft.com (May 16, 2012). Retrieved July 17, 2013.
  95. Word Mobile
  96. Ralph, Nate. «Office for Windows Phone 8: Your handy starter guide». TechHive. Archived from the original on October 15, 2014. Retrieved August 30, 2014.
  97. Wollman, Dana. «Microsoft Office Mobile for iPhone hands-on». Engadget. Retrieved August 30, 2014.
  98. Pogue, David (June 19, 2013). «Microsoft Adds Office for iPhone. Yawn». The New York Times. Retrieved August 30, 2014.
  99. Unsupported Features in Word Mobile. Microsoft. Retrieved September 21, 2007.
  100. Koenigsbauer, Kirk (July 29, 2015). «Office Mobile apps for Windows 10 are here!». Microsoft 365 Blog. Retrieved July 11, 2020.
  101. Office Apps for Windows 10 Mobile: End of Support for Windows Phones
  102. Bradley, Tony (February 2, 2015). «Office Online vs. Office 365: What’s free, what’s not, and what you really need». PC World. Archived from the original on July 24, 2017. Retrieved July 16, 2020.
  103. Ansaldo, Michael (September 28, 2017). «Microsoft Office Online review: Work with your favorite Office formats for free». PC World. Retrieved October 31, 2019.
  104. «Differences between using a document in the browser and in Word». Office Support. Microsoft. Archived from the original on November 7, 2017. Retrieved November 1, 2017.
  105. «Password protect documents, workbooks, and presentations». Microsoft Office website. Microsoft. Retrieved April 24, 2013.
  106. «How to Restrict Editing in Word 2010/2007». Trickyways. June 22, 2010. Retrieved April 24, 2010.
  107. «How safe is Word encryption. Is it secure?». Oraxcel.com. Archived from the original on April 17, 2013. Retrieved April 24, 2013.
  108. Cameron, Janet (September 1984). «Word Processing Revisited». BYTE (review). p. 171. Retrieved October 23, 2013.
  109. Manes, Stephen (February 21, 1984). «The Unfinished Word». PC Magazine. p. 192. Retrieved October 19, 2021.
  110. McNeill, Dan (December 1987). «Macintosh: The Word Explosion». Compute!’s Apple Applications. pp. 54–60. Retrieved September 14, 2016.
  111. Nimersheim, Jack (December 1989). «Compute! Specific: MS-DOS». Compute!. pp. 11–12.
  112. «Data Stream». Next Generation. No. 21. Imagine Media. September 1996. p. 21.
  113. Opus Development Postmortem
  114. ^ «Microsoft Word 1.x (Windows) – Stats, Downloads and Screenshots :: WinWorld». WinWorld. Retrieved July 3, 2016.
  115. ^ Shustek, Len (March 24, 2014). «Microsoft Word for Windows Version 1.1a Source Code». Retrieved March 29, 2014.
  116. ^ Levin, Roy (March 25, 2014). «Microsoft makes source code for MS-DOS and Word for Windows available to public». Official Microsoft Blog. Archived from the original on March 28, 2014. Retrieved March 29, 2014.
  117. ^ a b «Office 14». Office Watch. June 1, 2007. For the sake of superstition the next version of Office won’t be called ’13’.{{cite web}}: CS1 maint: url-status (link)
  118. ^ Marshall, Martin (January 8, 1990). «SCO Begins Shipping Microsoft Word 5.0 for Unix and Xenix». InfoWorld. p. 6. Retrieved May 20, 2021.
  119. ^ «Microsoft Word: SCO announces Word for Unix Systems Version 5.1». EDGE: Work-Group Computing Report. March 11, 1991. p. 33. Retrieved May 20, 2021 – via Gale General OneFile.

Further reading

  • Tsang, Cheryl. Microsoft: First Generation. New York: John Wiley & Sons, Inc. ISBN 978-0-471-33206-0.
  • Liebowitz, Stan J. & Margolis, Stephen E. Winners, Losers & Microsoft: Competition and Antitrust in High Technology Oakland: Independent Institute. ISBN 978-0-945999-80-5.

External links

  • Microsoft Word – official site
  • Find and replace text by using regular expressions (Advanced) — archived official support website

The DOCX is an XML-based document format, stored as a compressed package, that is highly editable, easy to use and manageable in size. The popularity of the DOCX document file ensures developers will continue to create specifically for it. Learn how to open, convert and utilize the DOCX with this guide.

What is a DOCX document file?

A DOCX file is a Microsoft Word document that typically contains text. DOCX is the newer version of DOC, the original official Microsoft Word file format. Both are opened using Microsoft Word, though alternative software programs open them as well. A DOCX stores its content as XML inside a ZIP package, which has helped make it incredibly popular.

The DOCX was introduced by Microsoft with Office 2007 as an upgrade from the previous DOC format. Though it's mostly used to create and edit text and hyperlinks, it also holds other media such as images. This file format remains one of the most widely used and is accessible through numerous programs.

How to open it

The DOCX is a smaller document file format than the DOC, making it convenient to send via email and store on a hard drive. The DOCX is a compressed file, meaning it’s shrunken in size to reduce its impact on storage space. A DOCX is opened either using Microsoft Word or alternative, third-party programs.
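To make the "compressed file" point concrete, here is a minimal Python sketch. It assumes only that a DOCX is a ZIP package with XML parts inside. The helper names `build_toy_docx` and `list_parts` are made up for illustration, and the toy archive mimics the layout of a DOCX without being a valid Word document.

```python
import io
import zipfile


def build_toy_docx() -> bytes:
    """Build a tiny ZIP archive laid out like a DOCX (not a valid Word file)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("[Content_Types].xml", "<Types/>")
        archive.writestr("word/document.xml", "<document>Hello</document>")
    return buffer.getvalue()


def list_parts(data: bytes) -> list[str]:
    """Return the names of the files stored inside a ZIP-based document."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        return archive.namelist()


toy = build_toy_docx()
print(toy[:2])          # the first two bytes of the ZIP signature
print(list_parts(toy))  # the parts stored inside the package
```

Running `list_parts` on the bytes of a real .docx file should list parts such as word/document.xml in the same way.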

It’s one of the most popular document file types, which is convenient when sharing with others. In fact, most users have Microsoft Word and can easily open and edit the file. In a team setting, a DOCX is ideal due to its editing capabilities. Team members can quickly share and edit the DOCX amongst one another, making it perfect for projects and campaigns.

When to use a different document file

Articles, newsletters and advertisements are well suited to DOCX. A DOCX can also be used for resumes and cover letters, but it's not ideal because it is so easy to edit. For professional documents like these, consider a format such as PDF, which is harder to accidentally edit or change.

To convert a DOCX file:

  1. Open the file in Microsoft Word
  2. Click ‘file’
  3. Click ‘save as’
  4. From the dropdown menu, select the file type you wish to save it to (PDF, etc)

Alternatively, if you prefer to let a converter program handle the conversion process, those are widely available. Zamzar’s browser-based system has an easy-to-use interface and doesn’t require installation of a program to use. Give it a shot if you have multiple files to convert, as it might save you time.

Consider the ways in which a DOCX benefits different companies and teams. As technology progressed, Microsoft changed the way its document file type is constructed. Look for similar developments in the future.

What to Know

  • A text file contains just text (versus other content like images).
  • Open one with any text editor, such as Notepad or TextEdit.
  • Convert to other text-based formats with Notepad++ and similar tools.

This article describes what a text file is and how to open one or convert one to a different format.

What Is a Text File?

A text file is a file containing text, but there are several ways to think about that, so it’s important to know the kind of text document you have before dealing with a program that can open or convert it.

Some text files use the .TXT file extension and don’t contain any images. Others might contain both images and text, but still be called a text file or even abbreviated as a «txt file,» which can be confusing.

Types of Text Files

In the general sense, a text file refers to any file that has only text and is void of images and other non-text characters. These sometimes use the TXT file extension but don’t necessarily need to. For example, a Word document that is an essay containing just text can be in the DOCX file format but still be called a text file.

Another kind of text file is the «plain text» file. This is a file that contains zero formatting (unlike RTF files), meaning nothing is bold, italic, underlined, colored, using a special font, etc. Several examples of plain text file formats include ones that end in these file extensions: XML, REG, BAT, PLS, M3U, M3U8, SRT, IES, AIR, STP, XSPF, DIZ, SFM, THEME, and TORRENT.

Of course, files with the .TXT extension are text files, too, and are commonly used to store things that can be easily opened with any text editor or written to with a simple script. Examples might include storing step-by-step instructions for how to use something, a place to hold temporary information, or logs generated by a program (though those are usually stored in a LOG file).

«Plaintext,» or cleartext files, are different than «plain text» files (with a space). If file storage encryption or file transfer encryption isn’t used, the data can be said to exist in plaintext or be transferred over plaintext. This can be applied to anything that should be secured but isn’t, be it emails, messages, plain text files, passwords, etc., but it’s usually used in reference to cryptography.

How to Open a Text File

All text editors should be able to open any text file, especially if there isn't any special formatting being used. For example, TXT files can be opened with the built-in Notepad program in Windows by right-clicking the file and choosing Edit. TextEdit works similarly on a Mac.

Another free program that can open any text file is Notepad++. Once installed, you can right-click the file and choose Edit with Notepad++.

Most web browsers and mobile devices can open text files as well. However, since most of them aren't built to load text files using the various extensions you might find them using, you might need to first rename the file extension to .TXT if you want to use those applications to read the file.

Some other text editors and viewers include Microsoft Word, TextPad, Notepad2, Geany, and Microsoft WordPad.

Additional text editors for macOS include BBEdit and TextMate. Linux users can also try Leafpad, gedit, and KWrite.

Open Any File as a Text Document

Something else to understand here is that any file can be opened as a text document, even if it doesn’t contain readable text. Doing this is useful when you’re not sure what file format it’s really in, like if it’s missing a file extension, or you think it’s been identified with an incorrect file extension.

For example, you can open an MP3 audio file as a text file by plugging it into a text editor like Notepad++. You can’t play the MP3 this way, but you can see what it’s made up of in text form, since the text editor is only able to render the data as text.

With MP3s in particular, the very first line often includes ID3, which indicates an ID3 metadata container that might store information like the artist, album, track number, etc.

Another example is the PDF file format; every file starts off with the %PDF text on the first line, even though the rest of the document is completely unreadable.
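The same check can be done from a script instead of a text editor. The sketch below, in Python, looks only at the first few bytes of a file. The function names `sniff` and `sniff_file` are hypothetical, and the table of signatures is deliberately tiny: it covers the two examples above plus the ZIP signature used by formats such as DOCX.

```python
def sniff(data: bytes) -> str:
    """Guess a file type from its leading bytes (a small, incomplete table)."""
    if data.startswith(b"%PDF"):
        return "pdf"
    if data.startswith(b"ID3"):
        return "mp3 with ID3 tag"
    if data.startswith(b"PK\x03\x04"):
        return "zip container"
    return "unknown"


def sniff_file(path: str) -> str:
    """Read only the first bytes of a file on disk and classify them."""
    with open(path, "rb") as handle:
        return sniff(handle.read(8))


print(sniff(b"%PDF-1.7\n"))
print(sniff(b"ID3\x04\x00\x00"))
print(sniff(b"just some text"))
```

An "unknown" result only means the file did not match this short list. It says nothing about whether the file is text or binary.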

How to Convert Text Files

The only real purpose for converting text files is to save them into another text-based format like CSV, PDF, XML, HTML, XLSX, etc. You can do this with most advanced text editors but not the simpler ones since they generally only support basic export formats like TXT, CSV, and RTF.

For example, the Notepad++ program mentioned above is capable of saving to a huge number of file formats, like HTML, TXT, NFO, PHP, PS, ASM, AU3, SH, BAT, SQL, TEX, VGS, CSS, CMD, REG, URL, HEX, VHD, PLIST, JAVA, XML, and KML.

Other programs that export to a text format can probably save to a few different kinds, typically TXT, RTF, CSV, and XML. So if you need a file from a specific program to be in a new text format, consider returning to the application that made the original text file, and export it to something else.

All that said, text is text so long as it’s plain text, so simply renaming the file, swapping one extension for another, might be all you need to do to «convert» the file.

Still Can’t Open Your File?

Are you seeing jumbled text when you open your file? Maybe most of it, or all of it, is completely unreadable. The most likely reason for this is that the file isn’t plain text.

Like we mentioned above, you can open any file with Notepad++, but like with the MP3 example, it doesn’t mean that you can actually use the file there. If you try your file in a text editor and it’s not rendering like you think it should, rethink how it should open; it’s probably not in a file format that can be explained in human-readable text.
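A rough way to test in advance whether a file will look jumbled is to check its bytes. The Python sketch below uses a common heuristic, not a definitive rule: data counts as plain text if it contains no NUL bytes and decodes cleanly as UTF-8. The function name `looks_like_text` is made up, and the heuristic wrongly rejects text saved as UTF-16.

```python
def looks_like_text(data: bytes) -> bool:
    """Heuristic: no NUL bytes and valid UTF-8 means 'probably plain text'."""
    if b"\x00" in data:
        return False
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


print(looks_like_text(b"first line\nsecond line\n"))
print(looks_like_text(b"\xff\xd8\xff\xe0\x00\x10JFIF"))  # start of a JPEG file
```

For a file on disk, read a sample first, for example `open(path, "rb").read(4096)`, and pass that to the function. Be aware that a sample cut off in the middle of a multi-byte character will be rejected.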

If you have no idea how your file should open, consider trying some popular programs that work with a wide variety of formats. For example, while Notepad++ is great for seeing the text version of a file, try dragging your file into VLC media player to check if it’s a media file that contains video or sound data.

FAQ

  • How do I open TXT files on an Android?

    Some Android phones or tablets have built-in office apps that can open TXT files as well as other types of documents and spreadsheets. If your device’s office app can’t open a text file, try a third-party Android text editor. For example, download Text Editor from the Google Play Store and use it to open and read your text files.

  • How do I make TXT files?

    On Windows, right-click any open space on the Desktop > New > Text Document. On a Mac, open Finder and navigate to the folder where you want the TXT file, then launch Terminal and enter touch MyTextFile.txt. On any system, you can also open a word processing application such as Microsoft Word, create your document, and then save it as a Plain Text (.txt) file.

  • How do you convert a text file to Excel?

    In Excel, select the Data tab > From Text/CSV > choose your text file > Import. Next, select Delimited > choose a delimiter > Next > General > Finish. Then, to ensure that your data starts with Row 1, Column A, select Existing Worksheet, and type «=$A$1» in the field.

  • How do I create a text file that lists the contents of a folder?
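
    One possible approach is a short script. The Python sketch below is an illustration, not the only method: it writes the names in a folder, one per line, to a text file. The function name `write_folder_listing` and the example paths are hypothetical.

```python
from pathlib import Path


def write_folder_listing(folder: str, output: str) -> int:
    """Write the sorted names found in `folder` to `output`, one per line."""
    names = sorted(entry.name for entry in Path(folder).iterdir())
    text = "".join(name + "\n" for name in names)
    Path(output).write_text(text, encoding="utf-8")
    return len(names)


if __name__ == "__main__":
    # Hypothetical example: list the current folder into listing.txt.
    count = write_folder_listing(".", "listing.txt")
    print(f"Wrote {count} names")
```

    Subfolders are listed by name only. Their contents are not included.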

Updated: 11/06/2021 by

Microsoft Word

Sometimes called Winword, MS Word, or Word, Microsoft Word is a word processor published by Microsoft. It is one of the office productivity applications included in the Microsoft Office suite. Originally developed by Charles Simonyi and Richard Brodie, it was first released in 1983.

Microsoft Word is available for Microsoft Windows, Apple macOS, Android, and Apple iOS. It can also run on the Linux operating system using WINE.

What is Microsoft Word used for?

Microsoft Word lets you create professional-quality documents, reports, letters, and résumés. Unlike a plain text editor, Microsoft Word has features including spell check, grammar check, text and font formatting, HTML support, image support, advanced page layout, and more.

What does the Microsoft Word editor look like?

Below is an overview of a Microsoft Word 2010 document.

Microsoft Word document diagram

Where do you find or start Microsoft Word?

If you have Microsoft Word or the entire Microsoft Office package installed on Microsoft Windows, you can access Microsoft Word in your Start menu.

Keep in mind that new computers do not include Microsoft Word. It must be purchased and installed before running it on your computer. If you do not want (or cannot afford) to purchase Microsoft Word, you can use a limited version for free at the Microsoft Office website.

If Microsoft Word is installed on your computer, but you can’t find it in your Start menu, use the following steps to launch Microsoft Word manually.

  1. Open My Computer or File Explorer.
  2. Click or select the C: drive. If Microsoft Office is installed on a drive other than the C: drive, select that drive instead.
  3. Find and open the Program Files (x86) or Program Files folder.
  4. Open the Microsoft Office folder.
  5. In the Microsoft Office folder, open the root folder. Then open the OfficeXX folder, where XX is the version of Microsoft Office (e.g., Office16 for Microsoft Office 2016) installed on your computer.

Tip

If there is no root folder, look for and open the folder with Office in the folder name.

  6. Find and double-click the file named WINWORD.EXE to start the Microsoft Word program.

How to open Microsoft Word without using a mouse

  1. Press the Windows key.
  2. Type Word and select the Microsoft Word entry in the search results.
  3. If Microsoft Word does not open after selecting it in the search results, press Enter to launch it.

What are the uses of Microsoft Word?

Microsoft Word is a word processor, and, like other word processors, it’s capable of helping users create a variety of different types of documents. For example, users can create a résumé, business contract, instruction document, or a letter to another person. We’ve included a list of the top uses of a word processor on our word processor page.

How many lines are there on a page in Microsoft Word?

By default, there are 29 lines on one page in Microsoft Word.

What type of files can Microsoft Word create and use?

Early versions of Microsoft Word primarily created and used the .doc file extension, while newer versions of Word create and use the .docx file extension.

More recent versions of Microsoft Word support the creation and opening of these types of files:

  • .doc, .docm, .docx
  • .dot, .dotm, .dotx
  • .htm, .html
  • .mht, .mhtml
  • .odt
  • .pdf
  • .rtf
  • .txt
  • .wps
  • .xps
  • .xml

Example of a Microsoft Word .doc file

We created a Microsoft Word document that you can download and open in most word processor programs, including Microsoft Word. Click the link below to download the example Word document and experiment more with a word processing document.

  • Download example.doc

Why use Word instead of a plain-text editor?

Microsoft Word offers many features not found in a traditional text editor or a plain-text file. Some advantages include changing the formatting (e.g., center), editing the font type, size, and color, inserting pictures, and more.

Tip

The features above are also available in a rich-text editor, such as WordPad, which is included with Microsoft Windows.

Why use Word instead of WordPad?

A rich-text editor, like WordPad, offers many of the same basic features as Microsoft Word. Where Microsoft Word differs is in its more advanced features, which include mail merges, spellchecker, styles, tables, headers & footers, WordArt, columns, margins, and more.

What are the different versions of Microsoft Word?

Microsoft Word has had several versions throughout its history. The different releases with release dates are listed below.

Windows versions

  • Word 2019, released in 2018
  • Word 2016 (also offered through Office 365), released in 2015
  • Word 2013, released in 2013
  • Word 2010, released in 2010
  • Word 2007, released in 2006
  • Word 2003, released in 2003
  • Word 2002, released in 2001
  • Word 2000, released in 1999
  • Word 98, released in 1998
  • Word 97, released in 1997
  • Word 95, released in 1995
  • Word 6.0, released in 1993
  • Word 2.0, released in 1991
  • Word 1.1, released in 1990
  • Word 1.0, originally developed for MS-DOS and Xenix in 1983 by Charles Simonyi and Richard Brodie, working for Bill Gates and Paul Allen. Word for Windows was first released in 1989

Mac versions

  • Word 2019, released in 2018
  • Word 2016, released in 2015
  • Word 2011, released in 2010
  • Word 2008, released in 2008
  • Word 2004, released in 2004
  • Word v. X, released in 2001
  • Word 2001, released in 2000
  • Word 98, released in 1998
  • Word 6, released in 1993
  • Word 5.1, released in 1992
  • Word 5, released in 1991
  • Word 4, released in 1989
  • Word 3, released in 1987
  • Word 1, released in 1985

Introduction

In this tip, I’ll explain how to convert a Microsoft Word document to a text file in C#. To do this, Word must be installed.

Adding a reference to the Microsoft Word Object Library

The first step is to add a reference to the Microsoft Word Object Library. In Visual Studio, choose «Add Reference…», go to «COM», and select «Microsoft Word [version number here] Object Library».
I use the Microsoft Word 15.0 Object Library, which is the library of Word 2013. Your version number may differ from 15.0.

The code

At the top of the code file, we will add the following using [namespace] statements:

using System.IO;
using Word = Microsoft.Office.Interop.Word;

Now, we can just write Word.Document instead of Microsoft.Office.Interop.Word.Document for example.
Now, we will ask the user which file (s)he wants to convert, using the following code:

Console.WriteLine("Please enter the full file path of your Word document (without quotes):");
object path = Console.ReadLine();
Console.WriteLine("Please enter the file path of the text document in which you want to store the text of your word document (without quotes):");
string txtPath = Console.ReadLine();

As you can read in the code, the full path of the Word document is required. If you just write test.docx, then you'll actually try to convert C:\Windows\system32\test.docx instead of the test.docx file in the folder of the converter. For the file path of the text file, it is OK to write test.txt, because the test.txt file will then be created in the folder of the converter. The path to the Word file must also be an object, not a string, because the parameters of the Open method are passed by reference as objects.
Now, we’ll open the Word file and retrieve the text using the following code:

Word.Application app = new Word.Application();
Word.Document doc;
object missing = Type.Missing;
object readOnly = true;
try
{
    doc = app.Documents.Open(ref path, ref missing, ref readOnly, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing);
    string text = doc.Content.Text;
    File.WriteAllText(txtPath, text);
    Console.WriteLine("Converted!");
}

Here, we create a Word Application that opens the document. The first argument of the Open method is the file path, the third argument is whether we want to open the file as read-only (yes in this case). The text is stored in Content.Text, and then we use the File.WriteAllText method to write the text to a file. Now, we’ll create the catch and finally blocks:

catch
{
    Console.WriteLine("An error occurred. Please check the file path to your word document, and whether the word document is valid.");
}
finally
{
    object saveChanges = Word.WdSaveOptions.wdDoNotSaveChanges;
    app.Quit(ref saveChanges, ref missing, ref missing);
}

Because we don’t want to save the changes (we didn’t even make changes), we use WdSaveOptions.wdDoNotSaveChanges. The Application.Quit method closes all open documents, and quits the Word Application.
If we merge all code snippets, we get this:

using System;
using System.IO;
using Word = Microsoft.Office.Interop.Word;

class Program
{
    static void Main()
    {
        Console.WriteLine("Please enter the full file path of your Word document (without quotes):");
        object path = Console.ReadLine();
        Console.WriteLine("Please enter the file path of the text document in which you want to store the text of your word document (without quotes):");
        string txtPath = Console.ReadLine();
        // Start Word through COM automation; Word must be installed.
        Word.Application app = new Word.Application();
        Word.Document doc;
        object missing = Type.Missing;
        object readOnly = true;
        try
        {
            // Open the document read-only; unused optional parameters are passed as Type.Missing.
            doc = app.Documents.Open(ref path, ref missing, ref readOnly, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing);
            // Content.Text holds the text of the whole document.
            string text = doc.Content.Text;
            File.WriteAllText(txtPath, text);
            Console.WriteLine("Converted!");
        }
        catch
        {
            Console.WriteLine("An error occurred. Please check the file path to your word document, and whether the word document is valid.");
        }
        finally
        {
            // Quit closes all open documents without saving and exits Word.
            object saveChanges = Word.WdSaveOptions.wdDoNotSaveChanges;
            app.Quit(ref saveChanges, ref missing, ref missing);
        }
    }
}
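
For readers who do not have Word installed, the following Python sketch shows a different technique from the COM automation used in this tip: it reads the XML inside a .docx package directly. It is a rough illustration under several assumptions. It only handles .docx (not .doc), it ignores headers, footnotes and most formatting, and text in nested structures such as text boxes may be repeated. The function names `docx_to_text` and `make_sample_docx` are made up, and the sample package is only a stand-in for a real document.

```python
import io
import xml.etree.ElementTree as ET
import zipfile

# XML namespace used by WordprocessingML elements such as w:p and w:t.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def docx_to_text(data: bytes) -> str:
    """Join the text runs of each paragraph in word/document.xml."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for paragraph in root.iter(f"{{{W_NS}}}p"):
        runs = [node.text or "" for node in paragraph.iter(f"{{{W_NS}}}t")]
        paragraphs.append("".join(runs))
    return "\n".join(paragraphs)


def make_sample_docx(paragraphs: list[str]) -> bytes:
    """Build a minimal stand-in package holding only word/document.xml."""
    body = "".join(f"<w:p><w:r><w:t>{text}</w:t></w:r></w:p>" for text in paragraphs)
    xml = f'<w:document xmlns:w="{W_NS}"><w:body>{body}</w:body></w:document>'
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("word/document.xml", xml)
    return buffer.getvalue()


sample = make_sample_docx(["First paragraph", "Second paragraph"])
print(docx_to_text(sample))
```

To convert a file on disk, read its bytes with `open(path, "rb").read()`, pass them to `docx_to_text`, and write the result to a .txt file.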

History

  • 5 Jan 2014: First version
