Word open not as html

I would like to open an HTML file in MS Word 2007 directly so I can use some of its search and replace features. I don’t care about the file’s encoding and will not actually use the final thing as HTML (I’m going to be extracting interesting parts). I want to see raw HTML. When I open the document, instead Word renders everything.

I am hoping someone knows how to do this from within Word without having to resort to copying and pasting or modifying the file to change the html tags to no longer look like html (such as {html> or something). I have been hit by this multiple times over the years and wish that I could fix the problem instead of always being forced to find a workaround.

Is there some option or setting, or something in the Open dialog that can change Word’s behavior?

asked Dec 29, 2012 at 1:03

ErikE


You can turn off automatic file-type conversions in Word, and instead it will ask you what type of file you’re trying to open, at which point you can pick Text.

To turn it off (or rather turn on Confirmation of conversion):

  1. Open the Word Options. (Word 2007: click the Office button and then click Word Options. Word 2010: display the File tab of the ribbon and then click Options)
  2. At the left side of the dialog box click Advanced.
  3. Scroll through the options until you see the General section.
  4. Make sure the Confirm File Format Conversion On Open check box is selected.
  5. Click on OK.

If you never want to open rendered HTML in Word again, you can also uninstall the HTML Filter by running Office Setup and de-selecting it from the installed options.

Alternatively and probably easiest: Use something else for the task, like a good Plain Text editor. I personally use and suggest Notepad++.
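If the goal is only to search the raw markup and pull out the interesting parts, a short script can stand in for an editor. The following is a minimal Python sketch, not part of the original answer: the function name `extract_tag_contents` and the sample markup are made up for illustration, and a regular expression like this only copes with simple, non-nested tags.

```python
import re
from typing import List

def extract_tag_contents(raw_html: str, tag: str) -> List[str]:
    """Return the raw text between every <tag ...> and </tag>, inner markup included."""
    pattern = re.compile(
        rf"<{re.escape(tag)}\b[^>]*>(.*?)</{re.escape(tag)}\s*>",
        flags=re.IGNORECASE | re.DOTALL,
    )
    return pattern.findall(raw_html)

# Illustrative sample only; real input would be the text of your own file.
SAMPLE = (
    "<html><body><h1>Title</h1>"
    "<p class='x'>First</p><P>Second <b>bold</b></P>"
    "</body></html>"
)

paragraphs = extract_tag_contents(SAMPLE, "p")
```

To run it against a file, read the file as plain text first, for example with `pathlib.Path("page.html").read_text(encoding="utf-8", errors="replace")` (the file name here is a placeholder), and pass the result in place of `SAMPLE`.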

answered Dec 29, 2012 at 1:18

Ƭᴇcʜιᴇ007


Go to your HTML file, right-click it, and choose Open with to open it in Word. Do the editing and click Save. (Note: file format conversion should be disabled.)

answered Oct 31, 2015 at 4:09

Shafiq Ahmad


If you need to save a Word document as a webpage, your best bet is to use the Web Page, Filtered option.

When you save your document as a filtered webpage, Word keeps only the content, style instructions, and some other information. The file is small, without a lot of extra code.

  1. Click File > Save As and choose the location where you want to save your document.

  2. Name your file.

  3. In the Save as type list, choose Web Page, Filtered.

  4. Click Change Title and type the title you want to display in a web browser’s title bar.

  5. Click Save.

Tips

To save document properties and more Word information with the webpage, choose the Single File Web Page type. But your file will be larger—almost 10 times larger.

To save pictures in a separate folder from the text part of the webpage, choose the Web Page type. When you post the webpage to a website, post the pictures folder, too.

To see the webpage’s HTML code, browse to the file in Windows Explorer, right-click the file, point to Open with, and click Internet Explorer. Then right-click the page in Internet Explorer and click View Source.

Other ways to share a document online

Word was originally designed to create and print documents. In Word 2013, you now have other options for sharing your thoughts and your work online.

Save a document as a blog post

If you’re blogging and you want to write your post in Word, you can save your document as a blog post. Word keeps the least amount of information with your content. And the published document uses the blog’s styles.

  1. Click File > Share > Publish as Blog Post.

  2. Click Publish as Blog Post.


The first time you post a document to your blog, Word guides you through registering your blog account.

Save onto OneDrive and share

Share your document with friends and colleagues by saving it to OneDrive and inviting them to view it.

  1. Click File > Save As > OneDrive.

  2. Choose a location in your OneDrive folders.

  3. Type a file name, and then click Save.

Then invite others to view your document. If people don’t have Word, the document opens automatically in Word for the web.

  1. Click File > Share > Invite People.

  2. Add their email addresses.

  3. Click Share.

For more information, see Share a document using SharePoint or OneDrive.

Save as a PDF

To convert your document to a PDF you can post to a website, click File > Save As. In the Save as type list, click PDF.


Please Note:
This article is written for users of the following Microsoft Word versions: 2007 and 2010. If you are using an earlier version (Word 2003 or earlier), this tip may not work for you. For a version of this tip written specifically for earlier versions of Word, click here: Turning Off HTML Conversions.

Written by Allen Wyatt (last updated February 21, 2023)
This tip applies to Word 2007 and 2010


Word includes a feature that allows you to open HTML documents and have them appear on your screen as formatted text. For some people this is great, while others see it as a big bother. If you don’t want your HTML documents formatted by Word, but instead want them opened as straight text, you have two general ways you can do this.

First, you can remove the HTML file filter used by Word. This is done by running the Word Setup program and then making sure the HTML filter is removed from the system. (You do this by making sure the option is explicitly NOT selected in the Setup program.)

The other way is less drastic, but can be just as helpful. Follow these steps:

  1. Display the Word Options dialog box. (In Word 2007 click the Office button and then click Word Options. In Word 2010 display the File tab of the ribbon and then click Options.)
  2. At the left side of the dialog box click Advanced.
  3. Scroll through the options until you see the General section. (See Figure 1.)

     Figure 1. The advanced options of the Word Options dialog box.

  4. Make sure the Confirm File Format Conversion On Open check box is selected.
  5. Click on OK.

Now, whenever you open a document with an HTML (or HTM) extension, Word displays the Convert File dialog box. Here you are being asked how you want Word to treat the file you are opening. The HTML Document option is selected, since Word detected the file contained HTML code. You can select the Text Only option, and then Word will treat the file as plain text, without doing any formatting.

You should note that the procedure just described only works if you use the Open dialog box to open your file. If you later use the MRU file list to open the file, or the Documents list from Windows, then Word doesn’t ask you how it should do the file conversion—it straightaway opens the file as a formatted HTML document. If you do quite a bit of this type of file opening, then your best option is to remove the HTML file filter as first described in this tip.

This tip (6275) applies to Microsoft Word 2007 and 2010. You can find a version of this tip for the older menu interface of Word here: Turning Off HTML Conversions.


Save your .docx as an .html web page file



If you have Microsoft Word on your computer, you can resave the DOC/DOCX file as an HTML file without installing additional software. If you don’t have Word or prefer a free online option, you can upload the document to Google Drive and save it as an HTML file. Or, paste the contents of the Word file into a converter like Word 2 Clean HTML. Since Word documents and HTML files are very different, the finished HTML webpage may not contain the same formatting as the original. This wikiHow will show you how to convert a Word document to HTML on your Windows PC or Mac.

Things You Should Know

  • In Microsoft Word, go to File > Save As. Change the file type to Web Page.
  • For Google Drive, upload and open the Word file in Google Docs. Then, go to File > Download > Web Page.
  • Try an HTML conversion web app like Word 2 Clean HTML for additional automatic formatting options.
Using Microsoft Word

  1. Open the document in Microsoft Word. Word has a built-in feature to convert .docx documents to HTML files. Although the resulting HTML code may be a bit bulkier than if you’d written the HTML from scratch, the conversion is quick and can be used for simpler projects.[1]

    • If you’re looking for general HTML tips, check out how to create a simple web page, create a link, and make radio buttons.
  2. Click the File menu. It’s at the top-left corner of Word.
  3. Click Save As. A list of locations will appear.
  4. Select a location. You can save the file to any folder on your computer (or a cloud drive).
  5. Type a name for the file. Enter the name in the textbox next to “File name:”.
  6. Select Web Page from the “Save as type” dropdown menu. This will save the file in HTML format.

    • If you’re okay with losing some of the advanced layout code in favor of a simpler file, select Web Page, Filtered instead. This keeps only the style instructions, content, and some other info.
  7. Click Save. A new version of the file is now saved in the HTML format.

Using Google Drive

  1. Go to https://www.google.com/drive in a web browser. Then click Go to Drive. As long as you have a Google account, you can use Google Drive to convert a Word document to a web page.
  2. Click the + New button. It’s at the top-left corner of Google Drive.
  3. Click File upload. It’s the second option.
  4. Select your Word document and click Open. This uploads the Word document to your Google Drive.
  5. Right-click the Word document in Google Drive. A pop-up context menu will open.
  6. Click Open with. Another menu will expand.
  7. Click Google Docs. The contents of your Word document will display in Google Docs.
  8. Click the File menu in Google Docs. It’s just below the file name at the top-left corner of the document.
  9. Click Download. Additional menu options will appear.
  10. Click Web Page. This allows you to save the .docx as an HTML zipped file. If prompted to do so, click Save or OK to start the download.

Using Word 2 Clean HTML

  1. Go to https://word2cleanhtml.com in a web browser. Word 2 Clean HTML is a free, easy-to-use tool that will take the contents of a Word document and convert it to HTML code.
  2. Open the Word document you want to convert. If you have Microsoft Word, open the document in that application. If not, you can either use the free version of Word located at https://www.office.com to open the file, or a Word alternative like Google Drive.
  3. Copy the contents of the Word file to the clipboard. Press the Control and A keys (PC) or Command and A keys (Mac) at the same time to highlight everything in the file, right-click the highlighted area, and then click Copy.
  4. Paste the copied text into the Word to Clean HTML field. Right-click the typing area and select Paste to paste the selected content.
  5. Adjust your HTML preferences below the form. Use the checkboxes at the bottom of the page to toggle conversion preferences, such as converting Word’s Smart Quotes to regular ASCII quotes.
  6. Click the convert to clean html button. It’s the button below the form. This converts the content to the HTML format and displays it in the text area.

    • To see the regular HTML (not “cleaned up”) from the conversion, click the Original HTML tab.
    • To see a preview of how the code would look in a web browser, click the Preview tab.
    • To copy the code so you can paste it elsewhere, click the Copy cleaned HTML to clipboard link at the top of the page.


  • Question

    What do I do if I did this accidentally and really want to delete it now?

    Community Answer

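    If you want to delete it, right-click on it and click Delete. If you want to change it back, open the .html file in Word and use File > Save As to save it as a Word Document (.docx); renaming the extension from randomfile.html to randomfile.docx does not convert the file.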

  • Question

    I want to convert a Word document with controls (text box) to an HTML file, which has those controls. How do I do this?

    Community Answer

    Change the ending from whatever it is (ex: .txt) to .html (ex: .html).

  • Question

    If I save a Word document as a web page using HTML, will it have an URL?

    Community Answer

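    Not by itself. Saving as a web page creates an HTML file on your computer, which a browser can open from its file path. It only gets a web URL once you upload it to a web server or hosting service.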


  • If you have to convert hundreds of files to HTML, use commercial software that can convert them all at once. Some options are Doc Converter Pro (formerly Word Cleaner) and NCH Doxillion.

  • It is not always possible to keep all of your Word formatting and styles during the conversion, and still have the HTML file display consistently on all browsers. You might need to use CSS to achieve this on your website.


Article Summary

To use Microsoft Word to convert a Word document to HTML, start by opening the document in Word. Click the File menu and choose Save As. Choose where you want to save the file, and then give it a name. Click the “Save as type” menu and select Web Page. Click Save to save your new HTML code to the desired location.
To use Google Drive, start by signing in to Google Drive in a web browser. Click the New button and select File upload. Select the Word document and click Open to add it to your Drive. Once the upload is complete, right-click the document in Drive, select Open with, and then select Google Docs. When you see the document, click the File menu, select Download, and choose the Web Page option. This downloads a ZIP file of your new HTML to your computer.


Using MS Word’s built-in Save as HTML option

  1. Go to the File menu.
  2. Select Save As.
  3. In the file type drop-down box, select Web Page, Filtered.
  4. Click Save.

Contents

  • 1 Can you convert Word to HTML?
  • 2 How do I convert a Word document to HTML without losing formatting?
  • 3 How do I convert text to HTML?
  • 4 How do I save a document as HTML?
  • 5 How do I turn a Word document into a link?
  • 6 How do I open a text file in HTML?
  • 7 How do you convert to HTML?
  • 8 Can we save a Word document as a website?
  • 9 How do I convert a PDF to HTML?
  • 10 How do you link a text file in HTML?
  • 11 What are the basic HTML commands?
  • 12 How do I open a chrome HTML document?
  • 13 How do I edit a chrome HTML document?
  • 14 What is an HTML file?
  • 15 Why is word not suitable to make HTML files for websites?
  • 16 How can I convert a PDF to HTML for free?
  • 17 How do I create a link from a PDF?
  • 18 How do you create an HTML file?
  • 19 How add SWF to HTML?

Can you convert Word to HTML?

To quickly convert a Word document to HTML or web page format: Open the Word document you want to convert to HTML. Or, open a new, blank document and enter the text you want to convert to an HTML file. Go to the File tab and choose Save As or Save a Copy to save the document.

How do I convert a Word document to HTML without losing formatting?

To convert a Word file to HTML using Word Clear Formatting,

  1. Open the file in Word.
  2. Click inside the document and select all of the content. Use Ctrl + A or use the menu.
  3. With the text selected, Click the More option from the drop-down list for the Styles group. Select Clear Formatting.

How do I convert text to HTML?

Open your file in Notepad, click ‘Save As’, type in the name of your file, and add .html at the end. Then, in the drop-down menu, change ‘Text Documents’ to ‘All Files’ (set the encoding to UTF-8 if that option is available at the bottom right). Then click Save!
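Saving with an .html extension only changes the file name; the text itself still has no markup. As a rough illustration of what a real conversion involves, here is a minimal Python sketch that wraps plain text in a basic HTML page. The function name `text_to_html` and the choice to treat blank lines as paragraph breaks are assumptions made for this example, not part of the original instructions.

```python
import html

def text_to_html(text: str, title: str = "Untitled") -> str:
    """Wrap plain text in a minimal HTML page, one <p> per blank-line-separated block."""
    # Normalise Windows line endings so paragraph splitting works on Notepad files.
    text = text.replace("\r\n", "\n")
    blocks = [block.strip() for block in text.split("\n\n") if block.strip()]

    paragraphs = []
    for block in blocks:
        # Escape <, > and & so the text is shown literally rather than read as tags.
        escaped = html.escape(block)
        # Keep single line breaks inside a paragraph visible.
        paragraphs.append("<p>" + escaped.replace("\n", "<br>") + "</p>")

    body = "\n".join(paragraphs)
    return (
        "<!DOCTYPE html>\n<html>\n<head>\n"
        '<meta charset="utf-8">\n'
        "<title>" + html.escape(title) + "</title>\n"
        "</head>\n<body>\n" + body + "\n</body>\n</html>\n"
    )
```

The returned string can then be written to a file whose name ends in .html, using UTF-8 to match the `charset` declared in the page.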

How do I save a document as HTML?

Save a document in HTML format

  1. Choose File > Save As and choose HTML from the drop-down list.
  2. Give the filename an extension of .html, specify the file location, and click Save.
  3. Open the HTML file in a Web browser to examine the converted file. If it meets with your approval, you are done.

How do I turn a Word document into a link?

Select the text you want to format as a hyperlink. Select the Insert tab, then click the Hyperlink command. The Insert Hyperlink dialog box will appear. Using the options on the left side, you can choose to link to a file, webpage, email address, document, or a place in the current document.

How do I open a text file in HTML?

How to convert TXT to HTML

  1. Upload txt-file(s) Select files from Computer, Google Drive, Dropbox, URL or by dragging it on the page.
  2. Choose “to html” Choose html or any other format you need as a result (more than 200 formats supported)
  3. Download your html.

How do you convert to HTML?

On a Windows computer, open an HTML web page in Internet Explorer, Google Chrome, or Firefox. On a Mac, open an HTML web page in Firefox. Click the “Convert to PDF” button in the Adobe PDF toolbar to start the PDF conversion. Enter a file name and save your new PDF file in a desired location.

Can we save a Word document as a website?

If you need to save a Word document as a webpage, your best bet is to use the Web Page, Filtered option.Click File > Save As and choose the location where you want to save your document. Name your file. In the Save as type list, choose Web Page, Filtered.

How do I convert a PDF to HTML?

How to convert a PDF into HTML. The quickest way to convert your PDF is to open it in Acrobat. Go to the File menu, navigate down to Export To, and select HTML Web Page. Your PDF will automatically convert and open in your default web browser.

How do you link a text file in HTML?

Linking Documents
A link is specified using HTML tag <a>. This tag is called anchor tag and anything between the opening <a> tag and the closing </a> tag becomes part of the link and a user can click that part to reach to the linked document. Following is the simple syntax to use <a> tag.
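In its simplest form the anchor tag looks like `<a href="notes.txt">Read the notes</a>`, where `href` names the linked document and the text between the tags is what the user clicks. The Python sketch below builds such a tag from a target and a label; the helper name `make_link` and the file name `notes.txt` are made up for illustration.

```python
import html

def make_link(target: str, label: str) -> str:
    """Build an <a> element pointing at `target`, with `label` as the clickable text."""
    # Escape both parts so characters like & and " cannot break the markup.
    safe_target = html.escape(target, quote=True)
    safe_label = html.escape(label)
    return '<a href="' + safe_target + '">' + safe_label + "</a>"

link = make_link("notes.txt", "Read the notes")
```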

What are the basic HTML commands?

Basic HTML commands

  • The HTML tag. Although not currently required by all clients, the <html> tag signals the point where text should start being interpreted as HTML code.
  • The head tag.
  • Titles.
  • The body tag.
  • Headers.
  • Paragraphs.
  • Preformatted text.
  • Boldface and Italics.

How do I open a chrome HTML document?

Open a new tab in Chrome, then press Ctrl (Windows) or Cmd (Mac) + O. It will bring up the same Open File menu.
Open HTML File From Within Chrome

  1. Choose File from the Chrome ribbon menu.
  2. Navigate to your HTML file location, highlight the document and click Open.

How do I edit a chrome HTML document?

By right-clicking on the HTML in the “Elements” tab and selecting “Edit as HTML,” you can make live edits to the markup of a webpage that Chrome will immediately render once you’re done editing.

What is an HTML file?

HTML is a HyperText Markup Language file format used as the basis of a web page. HTML is a file extension used interchangeably with HTM. HTML consists of tags surrounded by angle brackets. The HTML tags can be used to define headings, paragraphs, lists, links, quotes, and interactive forms.
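To make the idea of tags in angle brackets concrete, here is a minimal Python sketch that lists the opening tags found in a piece of markup using the standard library's `html.parser` module. The class name `TagCollector`, the helper `list_tags`, and the sample markup are assumptions for this example.

```python
from html.parser import HTMLParser
from typing import List

class TagCollector(HTMLParser):
    """Record the name of every opening tag, in document order."""

    def __init__(self) -> None:
        super().__init__()
        self.tags: List[str] = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag names in lower case.
        self.tags.append(tag)

def list_tags(markup: str) -> List[str]:
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags

tags = list_tags("<html><body><h1>Hi</h1><p>Text <a href='x'>link</a></p></body></html>")
```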

Why is word not suitable to make HTML files for websites?

Because Word displays the Web page similar to the way the page would be displayed in a Web browser (Microsoft Internet Explorer), certain types of formatting and other items that are not supported by HTML or by the Web page authoring environment, are not displayed in Word (or in Web browsers).

How can I convert a PDF to HTML for free?

PDF to HTML conversion.

  1. Open the file you want to convert in your PDF editor.
  2. Select the Create & Edit button on the right-side toolbar.
  3. Click Export PDF at the top of the window.
  4. Choose HTML Web Page and select your options.
  5. Click Export and choose the folder where you want to save your new page.

How do I create a link from a PDF?

Method 1. Create a URL for PDF using a File-Sharing Service

  1. In the “Home” interface, click the “Upload a file” button.
  2. Import the PDF you want to create a URL for.
  3. Go to “Documents” > “Your documents.” Check the PDF file and click “Share.” And this will generate a URL for PDF.

How do you create an HTML file?

Create Your HTML Document

  1. Start Microsoft Word.
  2. In the New Document task pane, click Blank Web Page under New.
  3. On the File menu, click Save. NOTE: The Save as type box defaults to Web Page (*.htm; *.html).
  4. In the File name box, type the file name that you want for your document, and then click Save.

How add SWF to HTML?

Inserting Flash into HTML

  1. Choose File > Open and open the Flash movie.
  2. Choose File > Export Movie.
  3. Name the file “yourmovie.swf”. Choose the location where the file is to be stored (in your Web folder) and click OK.
  4. Open the HTML page where you want to insert your Flash movie. Insert this code:


How to Convert a Word Document to HTML

Three options for turning a Word doc into HTML

Updated on February 25, 2021

What To Know

  • File > Save As. Select a location. Name the file, and select .html as the type. Press Save.
  • Editors like Dreamweaver can convert a Word document to HTML.

This article explains how to use Microsoft Word to save a document as an HTML web page. Instructions in this article apply to Word for Microsoft 365, Word 2019, Word 2016, Word 2013, and Word 2010.

How to Save a Word Document as a Web Page

To quickly convert a Word document to HTML or web page format:

  1. Open the Word document you want to convert to HTML. Or, open a new, blank document and enter the text you want to convert to an HTML file.

  2. Go to the File tab and choose Save As or Save a Copy to save the document.

  3. Select the location where you want to save the HTML file.

  4. In the Enter file name here text box, enter a name for the document.

  5. Select the Save as Type drop-down arrow and choose Web Page (*.htm; *.html).

  6. Select Save.

Word is a convenient way to convert pages when you need them up on a website quickly, but it’s not the best long-term solution for online publishing. When used as a web page editor, Word adds strange styles and tags to the HTML code. These tags impact how cleanly coded your site is, how it works for mobile devices, and how quickly it downloads.
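The extra markup in Word-generated HTML typically includes things like `mso-` style properties, `Mso...` class names, `<o:p>` elements, and conditional comments. As a rough illustration of what cleaning it up involves, here is a minimal Python sketch using regular expressions. The function name `strip_word_markup` is made up, the patterns cover only these common cases, and a dedicated converter or HTML parser would be more robust.

```python
import re

def strip_word_markup(markup: str) -> str:
    """Remove some common Word-specific markup from an HTML string."""
    flags = re.IGNORECASE | re.DOTALL
    # Conditional comments such as <!--[if gte mso 9]> ... <![endif]-->
    markup = re.sub(r"<!--\[if[^\]]*\]>.*?<!\[endif\]-->", "", markup, flags=flags)
    # Office namespace elements such as <o:p> and </o:p>
    markup = re.sub(r"</?o:[^>]*>", "", markup, flags=flags)
    # class attributes whose value starts with Mso, quoted or unquoted
    markup = re.sub(r"\s+class=(\"Mso[^\"]*\"|'Mso[^']*'|Mso\w*)", "", markup, flags=flags)
    # style attributes that contain mso- properties; note this drops the whole
    # attribute, including any ordinary CSS properties that share it
    markup = re.sub(r"\s+style=(\"[^\"]*mso-[^\"]*\"|'[^']*mso-[^']*')", "", markup, flags=flags)
    return markup

cleaned = strip_word_markup(
    "<!--[if gte mso 9]><xml>x</xml><![endif]-->"
    "<p class=MsoNormal style='mso-margin-top-alt:auto'>Hello<o:p></o:p></p>"
)
```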

Another option is to create the document in Word, save the file with the DOC or DOCX extension, upload the DOC file to your website, and set up a download link on a web page so visitors can download the file.

Notepad++ is a simple text editor that offers some HTML features that make authoring website pages easier than converting documents to HTML in Word.

Use a Web Editor to Convert DOC Files to HTML

Most web editors have the ability to convert Word documents to HTML. For example, Dreamweaver converts DOC files to HTML in a few steps. And, Dreamweaver removes the strange styles that Word-generated HTML adds.

When using a web editor to convert Word documents to HTML, the pages don’t look like the Word document. The Word document looks like a web page.

Convert the Word Document to a PDF

If converting the Word document to HTML didn’t produce the desired result, convert the document to a PDF. A PDF file appears exactly like the Word document, and it can be displayed inline in a web browser.

The downside to using PDF files is that to search engines, a PDF is a flat file. Search engines don’t search PDF files for content and don’t rank PDFs for keywords and phrases that potential site visitors may be looking for, which might or might not be an issue for you. If you simply want a document you created in Word to display on a website, a PDF file is a good option to consider.


PopS said:

Hi,

I was going to set f’ups, but I’m not sure a couple of
those groups might not be good sources. FWIW, I’ve
noticed that a lot of folk in the .public.word.newusers
also frequent this group,

ah, the affinity group complication. :)

and there are probably others
I’m not aware of. Usually someone posting here will
get a response, if anyone knows the answer/s, that is.

===> Which text editor,

editpadlite, which as far as i can find in options has no unix lf-less etc settings.

i’ve used editpadlite for years, and it always produces standard «DOS» txt files as far as word or notepad opening editpadlite’s
txt files. (metapad is more puzzling in this regard)

and how was the info in it conceived; paste, manually written,

just those two.
mostly manually typed, plus paste-lets from View Source in mozilla (colored text that pastes as txt in editpadlite)

downloaded via IE,
etc.. Lots of ways to do it, and one can get differing
results.
snip

===> Is Word doing conversions when it opens these
files? Maybe setting it to confirm or ask permission
before converting a file would help here.

at times when i’m looking at «weird» files on the hard disc, Word will sometimes ask me to pick which among many text-like files.
often this dialog’s radio dot has pre-chosen «utf-8». I mention this in context of today’s misrecognized txt file, because none of
those Word open-as dialogs popup when opening *many* other editpadlite created txt files.
(i believe the other dialog — for opening wordperfect, word 2, etc formats — emanates from the converter add-ons?)

Depending on how the data gets into the txt file,
perhaps it’s only getting some but not all of the tags?
That’s why I asked about how it was «conceived» above.

yes, it looks as if Word is being ridiculously forgiving about «sloppy html». there’s very little html in the file.

===> Can you give an example of why the macros won’t
work? What goes wrong?

ah, i meant that i’d like word to properly open the file as txt, so i can work on the file using my handydandy Word macros. i
imagine the macros *would work* just fine, but
1 some of the text which Word has selectively recognized as HTML, is now «hidden» from me.
2 Word would probably force the Save to save as html and thereby *add* little piles of bizarre tagging throughout what
had originally been a txt file.

(i long ago gave up using word for any html.)

Perhaps, if it’s simple enough pastes, you could
filter out the problems tags? If that’s the problem, I
mean.

situation is otherwise. :) i’m trying to work with regex, which acts weird in editpadlite, because some parts of editpadlite use
(its own?) regex.

===> Here is where turning off or at least making Word
ask before converting a document may well help.
In Word 2002 and earlier that setting is at Tools;
Options; the General Tab and tick «Confirm Conversions
at Open». Then if you open a .txt but windows wants to
render html, you can tell it not to.

will try (after restart, because this computer’s been running for hours. else, memory will probably floop if i start Word now)

snip

===> I don’t mind reasonable cross-posting and most
other reasonable people don’t. It’s the multi-posters
that get the flames, it seems. You should, however,
set f’ups so the info from all the groups will end up in
one place for you (and others interested) to read.
Makes it a lot more convenient and still reaches all
the groups once you’ve made the first post or two.

yes.. the concept is good, but i can imagine most people would not hunt down the followups, in groups they never visit, to
add later response. still, i’ll transition to the busy group :)

um, even weaker, the «excess» groups still have to be manually trimmed in the even later replies, if i assume correctly.
thanks for your reply.

You can try with Microsoft.Office.Interop.Word;

    using System;
    using Word = Microsoft.Office.Interop.Word;

    public static void ConvertDocToHtml(object Sourcepath, object TargetPath)
    {
        // Placeholder passed for every optional COM argument we do not set.
        object Unknown = Type.Missing;
        object format = Word.WdSaveFormat.wdFormatHTML;
        object noSave = Word.WdSaveOptions.wdDoNotSaveChanges;

        // Start a Word instance through COM automation.
        Word._Application newApp = new Word.Application();
        try
        {
            Word.Documents d = newApp.Documents;
            Word._Document od = d.Open(ref Sourcepath, ref Unknown,
                                     ref Unknown, ref Unknown, ref Unknown,
                                     ref Unknown, ref Unknown, ref Unknown,
                                     ref Unknown, ref Unknown, ref Unknown,
                                     ref Unknown, ref Unknown, ref Unknown, ref Unknown);

            // Save the document we just opened (rather than whatever is active) as HTML.
            od.SaveAs(ref TargetPath, ref format,
                        ref Unknown, ref Unknown, ref Unknown,
                        ref Unknown, ref Unknown, ref Unknown,
                        ref Unknown, ref Unknown, ref Unknown,
                        ref Unknown, ref Unknown, ref Unknown,
                        ref Unknown, ref Unknown);

            od.Close(ref noSave, ref Unknown, ref Unknown);
        }
        finally
        {
            // Always quit Word so no hidden WINWORD.EXE process is left running.
            newApp.Quit(ref noSave, ref Unknown, ref Unknown);
        }
    }

In this guide, we will explore all the various ways for you to convert Word documents to HTML. We are experts at converting Microsoft Word documents to HTML with over 20 years of experience. We have several tools to help you convert and process your documents to clean HTML:

  1. Word To HTML – our Web App is great for quick clean-ups (paste content and process), uploading Word documents, and converting them to HTML
  2. Doc Converter Pro Desktop – the perfect choice for document batch converting on Windows Desktop
  3. Doc Converter Pro Web App – the best choice if you need to bulk convert and process documents online
  4. Doc Converter Pro API – for programmers looking to add document conversion to their Apps

Using MS Word built-in save as HTML option

If you have MS Word installed you can use the built-in save to HTML option. All you need to do is:

  • Go to the file menu
  • Select Save as
  • In the file type drop-down box, select Web Page, Filtered
  • Click Save

Easy, eh? Not so fast: there are two significant issues with using Word to save your HTML. The file sizes are large, and the quality of the HTML is not very good.

Large file sizes when using MS Word to convert documents to HTML

If you create a simple test document like the one shown below, then save it to HTML you will see that the resulting web page has more than 100 lines of code.

If we upload and convert the same file with one of our tools, such as WordToHTML.net, you will get around 40 lines of code with full page mode enabled, which is less than half the size of the Word version. If you copy and paste content from an MS Word document into the Visual Editor, you will get as few as 19 lines of code, which is less than one-fourth of the size of the Word version. You can try this experiment yourself or view our test files:

    • MS Word test file
    • test file converted with MS Word
    • test file converted with WordToHTML.net converter via Upload and convert document
    • test file converted with WordToHTML.net via paste into Visual Editor and keep formatting

This example is for a simple document; for complex files, the file size difference can be even larger.
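To repeat this kind of comparison on your own files, a short script can count lines and bytes. The following is a minimal Python sketch and not part of any of the tools mentioned here; the function names `html_size` and `size_ratio` and the sample strings are made up for illustration.

```python
def html_size(html: str) -> dict:
    """Return the number of non-empty lines and UTF-8 bytes in an HTML string."""
    lines = [line for line in html.splitlines() if line.strip()]
    return {"lines": len(lines), "bytes": len(html.encode("utf-8"))}


def size_ratio(original: str, converted: str) -> float:
    """Size of `converted` relative to `original`, measured in bytes."""
    original_bytes = html_size(original)["bytes"]
    if original_bytes == 0:
        raise ValueError("original HTML is empty")
    return html_size(converted)["bytes"] / original_bytes


if __name__ == "__main__":
    # Hypothetical sample strings; substitute the contents of your own exported files.
    word_style = "<p class=MsoListParagraph><span>\u00b7</span>Apples</p>\n" * 3
    clean_style = "<ul>\n" + "<li>Apples</li>\n" * 3 + "</ul>\n"
    print(html_size(word_style))
    print(html_size(clean_style))
    print(f"clean / word = {size_ratio(word_style, clean_style):.2f}")
```

Line counts depend heavily on how a tool wraps its output, so the byte count is usually the more meaningful of the two numbers.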

Why does it matter if my HTML code is large?

Apart from practical issues like server space, web pages with lots of code take longer to download. Google penalizes sites with slow download times, and more importantly, users are impatient. According to research, nearly half of web users expect a site to load in 2 seconds or less, and they tend to abandon a site that has not loaded within 3 seconds. On the modern internet, speed is one of the most important factors.

Word generates messy non-standard HTML when converting from Word to HTML

The other big issue with using Word is it creates overly complex non-standard HTML. Now to be fair to Microsoft, we imagine that the main reason they do this is to try to keep the layout of your files as similar as possible, but it does create issues. In our example file if we look at how Word handles the list items, this is the code it generates:

<p class=MsoListParagraphCxSpFirst style='text-indent:-18.0pt'><span
style='font-family:Symbol'>·<span style='font:7.0pt "Times New Roman"'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
</span></span>Apples</p>

That is a lot of code just to display the word "Apples" in a list. You will also notice that Word does not use the standard HTML li and ul list tags. It uses CSS to format the lists, so if you strip out this CSS, the list becomes normal text. In comparison, this is the code generated with WordToHTML.net when you upload a document and convert it.

<li style="margin-left:28.06pt; padding-left:7.94pt; font-family:serif;"><span style="font-family:Calibri;">Apples</span></li>

If you just paste content from the MS Word test file into Visual Editor and choose to keep formatting you will be left with this very clean code:

<li>Apples</li>

So you can see that if you want the cleanest HTML possible, you need to use a proper tool to convert your Word files to HTML.
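To make the difference concrete, here is a rough Python sketch of the kind of cleanup involved: it rewrites Word's `MsoListParagraph` paragraphs as a standard `<ul>` list. It is only an illustration under simplifying assumptions (bulleted lists only, no nesting, markup shaped like the Word sample above) and does not describe how WordToHTML.net or any other product works. The names `word_lists_to_ul` and `list_item_text` are hypothetical, and regular expressions are a fragile way to process HTML in general.

```python
import re
from html import escape, unescape

# One Word "list paragraph": <p class=MsoListParagraph...> ... </p>
LIST_PARAGRAPH = re.compile(
    r"<p\s+class=[\"']?MsoListParagraph\w*[\"']?[^>]*>(.*?)</p>",
    re.IGNORECASE | re.DOTALL,
)
# The bullet: a Symbol-font span that wraps a nested spacing span.
BULLET_SPAN = re.compile(
    r"<span[^>]*font-family:\s*Symbol[^>]*>.*?</span>\s*</span>",
    re.IGNORECASE | re.DOTALL,
)
TAG = re.compile(r"<[^>]+>")


def list_item_text(paragraph_inner: str) -> str:
    """Drop the bullet span and any remaining tags, then normalize whitespace."""
    without_bullet = BULLET_SPAN.sub("", paragraph_inner)
    text = unescape(TAG.sub("", without_bullet))
    return " ".join(text.split())


def word_lists_to_ul(html: str) -> str:
    """Replace each run of consecutive Word list paragraphs with one <ul> block."""
    pieces = []
    items = []
    pos = 0

    def flush():
        # Emit the list items collected so far as a single <ul>.
        if items:
            lis = "".join(f"<li>{item}</li>\n" for item in items)
            pieces.append(f"<ul>\n{lis}</ul>\n")
            items.clear()

    for match in LIST_PARAGRAPH.finditer(html):
        gap = html[pos:match.start()]
        if gap.strip():
            # Other content between list paragraphs ends the current list.
            flush()
            pieces.append(gap)
        items.append(escape(list_item_text(match.group(1))))
        pos = match.end()
    flush()
    pieces.append(html[pos:])
    return "".join(pieces)


if __name__ == "__main__":
    sample = (
        "<p class=MsoListParagraphCxSpFirst style='text-indent:-18.0pt'><span\n"
        "style='font-family:Symbol'>\u00b7<span style='font:7.0pt \"Times New Roman\"'>"
        "&nbsp;&nbsp;\n</span></span>Apples</p>"
    )
    print(word_lists_to_ul(sample))
```

All inline formatting inside a list item is discarded here, which is acceptable for a sketch but is exactly the kind of detail a real converter has to preserve.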

What is the best way to convert Word to HTML?

Here are our recommendations based on your various needs.

I want an online tool that will quickly convert my text or single documents to clean HTML:

Our online WordToHTML.net tool lets you paste your text into the Visual Editor and your text gets converted to HTML instantly. It is super easy to use and free for basic features.

For users who need more, our WordToHTML.net Pro version is only $10 a month and gives you the ability to upload Word (DOCX/DOC), PDF, and other file types and have them converted to HTML. You also have the ability to save your files, create conversion templates, and more cleanup features. You can try the Pro version for free.

Try our WordToHTML.net converter tool.

I need a Windows application to convert lots of Word or PDF files to HTML:

We have a Windows desktop product called Doc Converter Pro. It is an easy way for you to convert your Word, PDF, and other file formats to HTML. You can convert one file or batch convert hundreds of files in one go. You can also create your own custom templates to give you full control over your conversions.

The advantage of a desktop application is that it will be faster if you are converting lots of files. It is also a good option if your files are private, as nothing will leave your system.

Find out more or Download your free trial for Windows…

I need a web app to batch convert my Word documents to HTML:

Our web app version of Doc Converter Pro Online gives you all the great features of our desktop version, but as it is web-based, you can work from any computer and any browser. You can also share accounts with your team.

How does Doc Converter Pro Web App differ from WordToHTML.net? Doc Converter Pro is designed for batch converting lots of documents whereas WordToHTML.net is better for cleaning up pasted text or converting single documents.

Check out the free trial of our web app…

I am a programmer who needs an API solution to convert our Word files to HTML:

No problem. We have a Web API version of Doc Converter Pro Online. With a few lines of code, we take all the hard work out of converting your documents. Try our free trial here…

We are here to help…

If you need more advice, feel free to contact us anytime. We can advise you on the best strategy for your needs.
