Converting Word to LaTeX

Convert Word files to LaTeX online for free with a powerful, easy-to-use document converter. No desktop software such as Microsoft Word, OpenOffice, or Adobe Acrobat needs to be installed. All conversions can be done online from any platform: Windows, Linux, macOS, and Android. No registration is required, and the tool is completely free.
In terms of accessibility, you can use our online Word to LaTeX conversion tools to process various file formats and file sizes on any operating system. Whether you are on a MacBook, a Windows computer, or even a handheld mobile device, the Word to LaTeX converter is always available online.

Fast and easy conversion

Upload your document, choose the output format, and click the "Convert" button. You will receive a download link as soon as the file has been converted.

Convert from anywhere

It works on all platforms, including Windows, Mac, Android, and iOS. All files are processed on our servers. No plugins or software need to be installed.

Conversion quality

All files are processed using Aspose APIs, which are used by many Fortune 100 companies in 114 countries.


Convert DOCX (Word) to LATEX

Convert DOCX documents to LATEX format online and free.


Microsoft Word Document (.docx)

DOCX is the file extension of the Office Open XML documents, an XML-based, zipped file format developed by Microsoft for its word processing program, Microsoft Word. DOCX files can contain formatted text, charts, tables, images, and other document elements.
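
Because a DOCX file is a ZIP archive of XML parts, it can be inspected with nothing more than a ZIP reader. The following is a minimal sketch in Python using only the standard library; the helper names are hypothetical, and `make_toy_docx` builds a stand-in archive for illustration rather than a valid Word file. The part name `word/document.xml` is where the main document body lives in a real DOCX.

```python
import io
import zipfile


def list_docx_parts(docx_bytes: bytes) -> list[str]:
    """Return the names of the parts stored inside a DOCX container."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as archive:
        return archive.namelist()


def read_main_document_xml(docx_bytes: bytes) -> str:
    """Return the raw XML of the main document part (word/document.xml)."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as archive:
        return archive.read("word/document.xml").decode("utf-8")


def make_toy_docx() -> bytes:
    """Build a tiny ZIP laid out like a DOCX (a stand-in, not a valid Word file)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("[Content_Types].xml", "<Types/>")
        archive.writestr("word/document.xml", "<w:document>Hello</w:document>")
    return buffer.getvalue()


if __name__ == "__main__":
    data = make_toy_docx()
    print(list_docx_parts(data))
    print(read_main_document_xml(data))
```

For a real file, read it with `open(path, "rb").read()` and pass the bytes to the same helpers.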


How to convert DOCX to LATEX?

1. Upload DOCX File

Click the Choose File button to select the DOCX file.

2. Select DOCX Tools

To customize the DOCX to LATEX conversion, use the available tools before clicking the Convert button.

3. Download Your LATEX

After the conversion is complete, click the Download button to get your LATEX markup document.


Frequently Asked Questions

How to change DOCX format to LATEX?

To change DOCX format to LATEX, upload your DOCX file to proceed to the preview page. Use any available tools if you want to edit or manipulate your DOCX file. Click the Convert button and wait for the conversion to complete. Then download the converted LATEX file.

Convert Files on Desktop

Convert DOCX to LATEX on macOS

Follow the steps below if you have installed Vertopal CLI on your macOS system.

  1. Open macOS Terminal.
  2. Either cd to the DOCX file location or include the path to your input file.
  3. Paste and execute the command below, substituting your DOCX_INPUT_FILE name or path.

    $ vertopal convert DOCX_INPUT_FILE --to latex

Convert DOCX to LATEX on Windows

Follow the steps below if you have installed Vertopal CLI on your Windows system.

  1. Open Command Prompt or Windows PowerShell.
  2. Either cd to the DOCX file location or include the path to your input file.
  3. Paste and execute the command below, substituting your DOCX_INPUT_FILE name or path.

    $ vertopal convert DOCX_INPUT_FILE --to latex

Convert DOCX to LATEX on Linux

Follow the steps below if you have installed Vertopal CLI on your Linux system.

  1. Open Linux Terminal.
  2. Either cd to the DOCX file location or include the path to your input file.
  3. Paste and execute the command below, substituting your DOCX_INPUT_FILE name or path.

    $ vertopal convert DOCX_INPUT_FILE --to latex
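
The same command can be issued for a whole folder of documents from a script. Below is a minimal sketch in Python, assuming the Vertopal CLI is installed and available on the PATH as `vertopal`; the function names and the `dry_run` switch are hypothetical and only wrap the command shown above.

```python
import subprocess
from pathlib import Path


def build_vertopal_command(input_file: str, output_format: str = "latex") -> list[str]:
    """Build the argument list for: vertopal convert INPUT --to FORMAT."""
    return ["vertopal", "convert", input_file, "--to", output_format]


def convert_all(folder: str, dry_run: bool = True) -> list[list[str]]:
    """Build (and, unless dry_run is set, execute) one command per .docx file in folder."""
    commands = [
        build_vertopal_command(str(path))
        for path in sorted(Path(folder).glob("*.docx"))
    ]
    if not dry_run:
        for command in commands:
            # check=True raises CalledProcessError if the CLI reports a failure.
            subprocess.run(command, check=True)
    return commands


if __name__ == "__main__":
    for command in convert_all(".", dry_run=True):
        print(" ".join(command))
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.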


DOCX

In an effort to create an open document standard, Microsoft, in collaboration with ISO/IEC and Ecma, developed the Office Open XML standard in 2006. One of the filename extensions supported by this specification is .docx, used for text documents. The .docx extension was introduced in Microsoft Office Word 2007 and has been supported in every later version. It has become the default filename extension for text documents produced with Microsoft Office Word. Because the XML specification is open, many alternative document processing applications can read and write documents saved with the .docx extension. This contrasts with the .doc filename extension, a proprietary format owned by Microsoft.

TEX

LaTeX Source Document

How to convert DOCX to TEX

STEP 1

Select the DOCX document to convert to the TEX format. You can select a file from your computer or from your Google Drive or Dropbox account.

STEP 2

Choose the TEX format from the drop-down list as the output format and click the Convert button. You can convert up to 5 files at the same time, with a maximum size of up to 300 MB.

STEP 3

Wait until your file is uploaded and converted into the TEX document format. You can download the converted file up to 5 times, and you can also delete the file from the Download page.


Greetings, Habr readers!

I was prompted to write this article by a Habr post that describes a method for converting files from MS Word to LaTeX. Retyping the formulas evidently took the author a good deal of time, and he shares his impressions:

And, of course, retyping the formulas that were originally in the DOC file as LaTeX commands was a special pleasure.

This article offers a method that is faster than retyping the formulas by hand.

In the most general case, the input is a file created in MS Office that mixes formulas typed with Equation and formulas typed with MathType.

We will need:

- MS Word with MathType installed (6.5 or later);
- LibreOffice with the Writer2LaTeX extension installed.

Stage 1

Open the file in MS Word. Select the MathType tab and click the Toggle TeX button.

As a result, all formulas typed in MathType are converted, while Equation formulas remain untouched. Note that clicking Toggle TeX again restores everything to its original state. Save the resulting document and move on to the second stage.

Stage 2

Open the file in LibreOffice Writer. If we export the file as it is now, all backslashes, dollar signs, carets, and curly braces will be treated as text. We get the following code:

Формула в \foreignlanguage{english}{MathType
${\textbackslash}frac{-b{\textbackslash}pm
{\textbackslash}sqrt{{{b}^{}{2}}-4ac}}{2a}$}

Working out which backslashes are needed and which are not would take some time. To avoid this, run find-and-replace over the file and substitute backslashes, curly braces, and similar characters with sequences of punctuation marks that do not normally occur in text (for example, replace the backslash with three colons, ":::").

Next, choose "Export" from the File menu, set the format to "LaTeX2e.txt", tick the options you want, and export.

In the resulting text, perform the reverse replacements, turning the punctuation sequences back into backslashes and the other characters.
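
The forward and reverse replacements can also be scripted instead of done through the find-and-replace dialog. The sketch below is an illustration in Python, not part of the original workflow: only the ":::" placeholder for the backslash comes from the text above, while the other placeholder strings and the function names are assumptions chosen for the example.

```python
# Placeholder table: only ":::" for the backslash comes from the article;
# the remaining tokens are hypothetical choices for this sketch.
PLACEHOLDERS = {
    "\\": ":::",
    "{": "(((",
    "}": ")))",
    "$": ";;;",
    "^": "!!!",
}


def protect(text: str) -> str:
    """Replace TeX special characters with placeholder tokens before export."""
    for char, token in PLACEHOLDERS.items():
        text = text.replace(char, token)
    return text


def restore(text: str) -> str:
    """Turn placeholder tokens back into TeX special characters after export."""
    for char, token in PLACEHOLDERS.items():
        text = text.replace(token, char)
    return text


if __name__ == "__main__":
    formula = r"$\frac{-b\pm\sqrt{b^{2}-4ac}}{2a}$"
    protected = protect(formula)
    print(protected)
    print(restore(protected) == formula)
```

The round trip is only safe when the placeholder tokens do not already occur in the document and the characters they are built from (here `:`, `(`, `)`, `;`, `!`) never sit directly next to a protected character; otherwise the reverse replacement can match at the wrong position.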

The text is now ready to compile.

If you, dear readers, have suggestions for automating this method further, be sure to write them in the comments.

If you’re running an AppleScript-compatible operating system, I’ve written a script to do this. It has many limitations as far as pictures go (totally unsupported), but it handles the essentials (bold, italics, underscores, percent signs, dollar signs, tables (in tabu)). Note that it keeps everything in unicode, therefore the fontspec package is recommended with xelatex. It is a work in progress.

You can find the latest version at:
https://gist.github.com/macmadness86/5582426

Note that if you have TeXShop installed, you can optionally uncomment two lines:

- `--my openInTeXShop()`
- `--my closeWordDocByName(myDup, false)`

This will automatically copy the «texified» Word Document into a new TeXShop Document and close the duplicated Word Document.
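
To make the escaping step concrete outside of Word: the script escapes ampersands, dollar signs, underscores, and percent signs so that LaTeX treats them as literal text. Here is a minimal sketch of the same idea in Python (not the script's own code; the names `LATEX_ESCAPES` and `escape_latex` are made up for this illustration). It walks the text once, so an already inserted backslash is never escaped a second time within the same call.

```python
# Characters that the AppleScript below escapes via findReplace, mapped to
# their LaTeX-safe forms. Other LaTeX specials (#, ~, ^, braces, backslash)
# are deliberately left out to mirror what the script handles.
LATEX_ESCAPES = {
    "&": r"\&",
    "$": r"\$",
    "_": r"\_",
    "%": r"\%",
}


def escape_latex(text: str) -> str:
    """Escape the four characters above in a single pass over the text."""
    return "".join(LATEX_ESCAPES.get(ch, ch) for ch in text)


if __name__ == "__main__":
    print(escape_latex("R&D: 50%_off $5"))
```

Running the function twice on the same text would escape the characters again, so it should be applied exactly once to unescaped input.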

For the sake of keeping stackexchange self-contained, I will post the latest version as of this post here:

(*
Notes:
Ver. 2.11
Created by macmadness86 on 29.12.2013

Author of TeX Tutorials on YouTube
http://www.youtube.com/user/XeTeXTutorials?feature=watch

StackExchange User
http://stackoverflow.com/users/1236128/macmadness86

Instructions for use:
Have Microsoft Word document open. The frontmost document will be processed. The script creates a replica before processing, in order to avoid losing data. This document remains open when the script is finished and its contents can be copied to a tex editor e.g. TeXShop and compiled.

Version Notes:
26.12.2013 version 2.0 improved the table support. Now tables are coded as centered tabu tables.
29.12.2013 version 2.1 added list support using standard bullet or simple numbering buttons on the Word GUI. Supports only 1 embedded list.

Issues:
This script depends on a paragraph before a table. Therefore, a table must not be located at paragraph 1. There is a glitch in MS Word, preventing a script from adding a paragraph before a table (as far as I know).
*)

set myDup to my duplicateDoc()
--set outputPathAL to (path to desktop folder as string) & "Temporary Saved Doc for Latex Conversion.doc"
--my saveWordDoc(outputPathAL)


tell application "Microsoft Word"
if (count of documents) is greater than or equal to 1 then
tell document 1
-- Edit sectionTags and inlineTags for
set sectionTags to {{"Title", "title"}, {"Heading 1", "section"}, {"Heading 2", "subsection"}, {"Heading 3", "subsubsection"}}
set inlineTags to {{"Paragraph", "paragraph"}}
global stylesList
set stylesList to (get name local of Word styles)
-- Automated List
set sectionStyles to {}
repeat with itemStep from 1 to count of sectionTags
set end of sectionStyles to item 1 of item itemStep of sectionTags
end repeat
set {xpath, xname, xext, xbodytext, paraCount, wordCount} to {(get default file path file path type documents path), name, (get name extension), (get contents of text object), get count of paragraphs, get count of words}
--set allStyles to Word styles
-- Takes care of Section Tags
repeat with paraStep from 1 to paraCount
set paraStyle to (get style of paragraph paraStep)
set paraContent to (get content of text object of paragraph paraStep)
repeat with itemStep from 1 to (count of sectionTags)
if (paraContent as string) does not contain "{" then
set {wordTag, texTag} to {item 1 of item itemStep of sectionTags, item 2 of item itemStep of sectionTags}
try --
--return paraStyle
if Word style paraStyle is Word style wordTag then
my texifyHeading(paraContent, wordTag, texTag, paraStep)
end if
end try
end if
end repeat
end repeat
-- Handle Bold and Italics
end tell
end if
end tell


my texifyBoldItalicsQuotes()
my findReplace("&", "\\&")
my findReplace("$", "\\$")
my findReplace("_", "\\_")
my findReplace("%", "\\%")
my texifyLists()
my texifyTables()
set preBody to "
\\documentclass[10pt]{article}
\\usepackage{fontspec}
\\usepackage{tabu}
\\begin{document}
"
set postBody to "
\\end{document}"

addTextToFrontOfDoc(preBody)
addTextToEndOfDoc(postBody)
--my openInTeXShop()
--my closeWordDocByName(myDup, false)

on texifyHeading(para_Content, word_style, tex_style, para_num)
tell application "Microsoft Word"
tell active document
--set content of text object of paragraph paraNUM to "poop"
--select text object of paragraph paraNUM
--set orig_text to content of text object of paragraph paraNUM
-- Strip the trailing carriage return from the heading text
set sedFix to do shell script "echo " & quoted form of (para_Content as string) & " | sed \"s/$(printf '\\r')\\$//\""
try
if last character of (para_Content as string) is (ASCII character 13) then
set returnText to "\\" & tex_style & "{" & sedFix & "}" & "
"
else
set returnText to "\\" & tex_style & "{" & para_Content & "}"
end if
set content of text object of paragraph para_num to returnText
set style of paragraph para_num to word_style
--end if
end try
end tell
end tell
end texifyHeading

on texifyWord(word_content, tex_style, word_num)
tell application "Microsoft Word"
try
set bold_Style to make new Word style at active document with properties ¬
{name local:"Bold Tagged", style type:style type character}
set bold of font object of bold_Style to true
end try
tell active document
set wordRange to word word_num
--ASCII character 32 is space
if last character of (word_content as string) is (ASCII character 32) then
set wordOnly to set range wordRange start ((start of content of wordRange)) ¬
end ((end of content of wordRange) - 1)
set word_content to content of wordOnly
end if
set newContent to "\\" & tex_style & "{" & word_content & "}"
set style of word word_num to "Bold Tagged"
set content of word word_num to newContent --("\\" & tex_style & "{" & word_content & "}")
--set style of word word_num to
end tell
end tell
end texifyWord


on texifyBoldItalicsQuotes()
tell application "Microsoft Word"
if (count of documents) is greater than or equal to 1 then
set stylesList to (get name local of Word styles of active document)
if stylesList does not contain "Bold Tagged" then
set bold_Style to make new Word style at active document with properties ¬
{name local:"Bold Tagged", style type:style type character}
set bold of font object of bold_Style to true
end if
if stylesList does not contain "Italic Tagged" then
set italic_Style to make new Word style at active document with properties ¬
{name local:"Italic Tagged", style type:style type character}
set italic of font object of italic_Style to true
end if
--save as active document file name "Temp.doc"
set curly to false -- change to true if curly quotes desired
set wasSmartQuotes to auto format as you type replace quotes of settings
set auto format as you type replace quotes of settings to curly
set myFind to find object of selection
tell myFind
clear formatting myFind
execute find find text "&" replace with "\\&" replace replace all
execute find find text "_" replace with "\\_" replace replace all
end tell
-- mark up italics and take out of italics
clear formatting of myFind
set forward of myFind to true
set wrap of myFind to find continue
set style of myFind to "Normal"
set italic of font object of myFind to true
set content of myFind to ""
clear formatting replacement of myFind
set content of replacement of myFind to "\\emph{^&}"
set italic of font object of replacement of myFind to false
set style of replacement of myFind to "Italic Tagged"
execute find myFind replace replace all
-- mark up bold and take out of bold
clear formatting of myFind
set forward of myFind to true
set wrap of myFind to find continue
set bold of font object of myFind to true
set style of myFind to "Normal"
set content of myFind to ""
clear formatting replacement of myFind
set content of replacement of myFind to "\\textbf{^&}"
set bold of font object of replacement of myFind to false
set style of replacement of myFind to "Bold Tagged"
execute find myFind replace replace all
clear formatting of myFind
set forward of myFind to true
set wrap of myFind to find continue
set style of myFind to "quotation"
set content of myFind to ""
clear formatting replacement of myFind
set content of replacement of myFind to "\\begin{quotation}^&\\end{quotation}"
--set style of replacement of myFind to "normal"
execute find myFind replace replace all
set auto format as you type replace quotes of settings to wasSmartQuotes
end if
end tell
end texifyBoldItalicsQuotes

on findReplace(textToFind, replacementText)
tell application "Microsoft Word"
if (count of documents) is greater than or equal to 1 then
-- mark up italics and take out of italics
set myFind to find object of selection
clear formatting of myFind
set forward of myFind to true
set wrap of myFind to find continue
--set style of myFind to "Normal"
--set italic of font object of myFind to true
set content of myFind to textToFind
clear formatting replacement of myFind
set content of replacement of myFind to replacementText --\emph{^&}
set italic of font object of replacement of myFind to false
execute find myFind replace replace all
end if
end tell
end findReplace

on texifyLists()
tell application "Microsoft Word"
--tell active document
--end tell
--set listFormatProps to properties of list format of text object of selection
--set paraStep to GetParagraph() of me
set paraCompensator to 0
repeat with paraStep from 1 to count of paragraphs of active document
set paraStep to paraStep + paraCompensator
set listFormatProps to properties of list format of text object of paragraph paraStep of active document
set styleName to name local of style of text object of paragraph paraStep of active document
if styleName is "List paragraph" then
get list type of listFormatProps
-- ## Setup Itemize Environment
if list type of listFormatProps is list bullet then
-- ## Prefix List Items with item
if content of text object of paragraph paraStep of active document does not contain "\\item" then
insert text "\\item " at first word of paragraph paraStep of active document
end if
if list type of list format of text object of paragraph (paraStep - 1) of active document is list no numbering then
insert text return & "\\begin{itemize}" at the last word of paragraph (paraStep - 1) of active document
set paraCompensator to paraCompensator + 1
set paraStep to paraStep + 1
end if
-- ## Look for start of embedded list (Level 2 Indent)
if list type of list format of text object of paragraph (paraStep + 1) of active document is list bullet then
if list level number of list format of text object of paragraph (paraStep + 1) of active document is 2 then
insert text return & "\\begin{itemize}" at last word of paragraph (paraStep) of active document
set paraCompensator to paraCompensator + 1
set paraStep to paraStep + 1
end if
end if
-- ## Look for end of embedded list (Level 2 Indent)
if list type of list format of text object of paragraph (paraStep + 1) of active document is list bullet then
if list level number of list format of text object of paragraph (paraStep) of active document is 2 then
if list level number of list format of text object of paragraph (paraStep + 1) of active document is 1 then
insert text "\\end{itemize}" & return at first word of paragraph (paraStep + 1) of active document
set paraCompensator to paraCompensator + 1
set paraStep to paraStep + 1
end if
end if
end if
-- ## Detect end of list (Level 1 Indent) and tag with end{itemize}
try
if list type of list format of text object of paragraph (paraStep + 1) of active document is list no numbering then
insert text "\\end{itemize}" & return at first word of paragraph (paraStep + 1) of active document
set paraCompensator to paraCompensator + 1
end if
on error
set lastItem to true
insert text return & "\\end{itemize}" at last word of paragraph (paraStep) of active document
log "Reached end of document checking for list at paragraph nr. " & paraStep
end try
end if
end if
-- ## Setup Enumerate Environment
if list type of listFormatProps is list simple numbering then
-- ## Prefix List Items with item
if content of text object of paragraph paraStep of active document does not contain "\\item" then
insert text "\\item " at first word of paragraph paraStep of active document
end if
if list type of list format of text object of paragraph (paraStep - 1) of active document is list no numbering then
insert text return & "\\begin{enumerate}" at the last word of paragraph (paraStep - 1) of active document
set paraCompensator to paraCompensator + 1
set paraStep to paraStep + 1
end if
-- ## Look for start of embedded list (Level 1 Indent)
if list type of list format of text object of paragraph (paraStep + 1) of active document is list simple numbering then
if list level number of list format of text object of paragraph (paraStep + 1) of active document is 2 then
insert text "\\begin{enumerate}" & return at first word of paragraph (paraStep + 1) of active document
set paraCompensator to paraCompensator + 1
set paraStep to paraStep + 1
end if
end if
-- ## Look for end of embedded list (Level 1 Indent)
if list type of list format of text object of paragraph (paraStep + 1) of active document is list simple numbering then
if list level number of list format of text object of paragraph (paraStep + 1) of active document is 1 then
insert text return & "\\end{enumerate}" at last word of paragraph (paraStep) of active document
set paraCompensator to paraCompensator + 1
set paraStep to paraStep + 1
end if
end if
-- ## Detect end of list and tag with end{enumerate}
try
if list type of list format of text object of paragraph (paraStep + 1) of active document is list no numbering then
set myInsert to insert text "\\end{enumerate}" & return at first word of paragraph (paraStep + 1) of active document
set paraCompensator to paraCompensator + 1
end if
on error
set lastItem to true
insert text return & "\\end{enumerate}" at last word of paragraph (paraStep) of active document
log "Reached end of document checking for list at paragraph nr. " & paraStep
end try
end if
log "Paragraph: " & paraStep
end repeat
repeat with paraStep from 1 to count of paragraphs of active document
set listFormatProps to properties of list format of text object of paragraph paraStep of active document
set styleName to name local of style of text object of paragraph paraStep of active document
set paraContent to content of text object of paragraph paraStep of active document
if styleName is "List Paragraph" then
set myStyle to Word style "Normal" of active document -- replace "Normal" with name of your style in quotations
set content of text object of paragraph paraStep of active document to tab & paraContent
select text object of paragraph paraStep of active document
set style of paragraph format of selection to myStyle
end if
end repeat
end tell
end texifyLists

on texifyTables()
tell application "Microsoft Word"
set tableCount to (count of tables of active document)
if tableCount is greater than or equal to 1 then
repeat with tableNum from 1 to tableCount
set thisTable to table tableNum of active document
set cellCount to (count of cells of text object of thisTable)
set rowCount to (count of rows of text object of thisTable)
--set columnCount to (count of columns of text object of thisTable) --DOES NOT WORK, RESULTS IN 0
set rowCount to number of rows of thisTable
set columnCount to number of columns of thisTable
repeat with rowIncr from 1 to rowCount
repeat with columnIncr from 1 to columnCount
--set myRange to create range active document start (start of content of text object of cell) end ((end of content of text object of cell columnIncr))
end repeat
end repeat
set rowList to {}
repeat with rowIncr from 1 to rowCount
--set columnIncr to 1
repeat with columnIncr from 1 to columnCount --(get cells of row rowIncr of thisTable)
--set myRange to create range active document start (start of content of text object of column columnIncr of thisTable) end ((end of content of text object of column columnIncr of thisTable) - 1)
set cellContent to (get content of text object of (get cell from table thisTable row rowIncr column columnIncr))
-- Remove "end-of-cell marker" AKA remove ASCII character 13 from end of cell content
set cellcontentList to {}
set orig_delims to AppleScript's text item delimiters
set AppleScript's text item delimiters to ASCII character 13 -- (a carriage return)
set cellcontentList to text items of cellContent
set cellItems to items of cellcontentList
set AppleScript's text item delimiters to orig_delims
set cellContent to item 1 of cellcontentList
-- Enable Script to be re-run without adding extra "&" or "\" to tables
if (count of characters of cellContent) is greater than 0 then
if last character of (cellContent) is not "&" then
if columnIncr = columnCount then
if last character of (cellContent) is not "\\" then
set content of text object of (get cell from table thisTable row rowIncr column columnIncr) to cellContent & " \\\\"
end if
else
set content of text object of (get cell from table thisTable row rowIncr column columnIncr) to cellContent & " &"
end if
set columnIncr to columnIncr + 1
end if
else
if columnIncr = columnCount then
--if last character of (cellContent) is not "\" then
set content of text object of (get cell from table thisTable row rowIncr column columnIncr) to cellContent & " \\\\"
else
set content of text object of (get cell from table thisTable row rowIncr column columnIncr) to cellContent & " &"
end if
set columnIncr to columnIncr + 1
end if
end repeat
end repeat
end repeat -- table loop
-- ## loop for pre and post table code
tell active document
set selPara to GetParagraph() of me
get content of text object of paragraph selPara
end tell
if (count of tables of active document) is greater than 0 then
repeat with tableStep from 1 to count of tables of active document
--tell active document
set thisTable to table tableStep of active document
set cellCount to (count of cells of text object of thisTable)
set rowCount to (count of rows of text object of thisTable)
--set columnCount to (count of columns of text object of thisTable) --DOES NOT WORK, RESULTS IN 0
set rowCount to number of rows of thisTable
set columnCount to number of columns of thisTable
set colList to {}
repeat (columnCount) times
if (count of colList) is less than 1 then
set end of colList to "X[l]"
else
set end of colList to "X[m]"
end if
end repeat
set colString to colList as string
-- ## Deal with Pre-Table Code
select text object of (get cell from table thisTable row 1 column 1)
set preParaNum to (GetParagraph() of me)
set myRange to create range active document start (start of content of content of text object of paragraph preParaNum of active document) end ((end of content of content of text object of paragraph preParaNum of active document) - 1)
select myRange
set origContent to content of myRange
set preTableContent to "
\\begin{table}
\\centering
{\\extrarowsep=1mm
\\begin{tabu}{" & colString & "}
\\tabucline[.4mm,black]1"
set content of myRange to origContent & return & preTableContent
-- ## Deal with Post-Table Code
select text object of (get cell from table thisTable row rowCount column columnCount)
set postParaNum to (GetParagraph() of me) + 1 + 1 + 1
set myRange to create range active document start (start of content of content of text object of paragraph postParaNum of active document) end ((end of content of content of text object of paragraph postParaNum of active document))
select myRange
set origContent to content of myRange
--return origContent
set postTableContent to "\\tabucline[.4mm,black]1
\\end{tabu}}
\\end{table}"
set content of myRange to postTableContent & return & origContent
-- Add Line After Top Row (Heading Row)
select text object of (get cell from table thisTable row (2) column 1)
insert rows selection position above number of rows 1
select text object of (get cell from table thisTable row (2) column 1)
set content of selection to "\\tabucline[.1mm,black]1"
--end tell --doc
end repeat --table loop
end if --tables exist
--end tell --doc
end if -- if table
(*
set myTable to table 1 of the active document
set aRange to convert row to text (row 1 of myTable) ¬
separator separate by tabs
set style of aRange to "normal"
set rowContent to content of aRange
set rowItems to {}
set orig_delims to AppleScript's text item delimiters
set AppleScript's text item delimiters to ASCII character 13 -- (a carriage return)
set rowContent_items to text items of rowContent
set rowItems to items of rowContent_items
set AppleScript's text item delimiters to orig_delims
repeat with thisItem in rowItems
set item thisItem to item thisItem & "&"
end repeat
set content of aRange to (rowItems as string)
*)
end tell
end texifyTables

on GetParagraph()
-- NOTE: If you select a paragraph including the first character of the first word, it will count up to the previous paragraph only!
tell application "Microsoft Word"
set myDoc to active document
set myRange to create range myDoc start 0 end (start of content of text object of selection)
set paragraphNum to (count paragraphs in myRange)
return paragraphNum
end tell
end GetParagraph

on openInTeXShop()
tell application "Microsoft Word"
get properties of settings
get RTF in clipboard of settings
tell active document
set paraCount to count of paragraphs
set myRange to create range start (start of content of text object of paragraph 1) end (end of content of text object of paragraph paraCount)
select myRange
end tell
if selection type of selection is selection normal then
copy object selection
end if
end tell
tell application "TeXShop"
make new document
activate
set thisDoc to the front document
(*
set preBody to "
\documentclass[10pt]{article}
\usepackage{fontspec}
\begin{document}
"
set postBody to "
\end{document}"
*)
set content of selection of thisDoc to (the clipboard)
end tell
end openInTeXShop

on saveWordDoc(inputPathAL)
tell application "Microsoft Word"
save as active document file name inputPathAL
end tell
end saveWordDoc

on duplicateDoc()
tell application "Microsoft Word"
tell active document
set paraCount to count of paragraphs
set myRange to create range start (start of content of text object of paragraph 1) end (end of content of text object of paragraph paraCount)
select myRange
end tell
if selection type of selection is selection normal then
copy object selection
make new document
paste object selection
return name of active document
end if
end tell
end duplicateDoc

--my closeWordDocByName(myDup, false)
on closeWordDocByName(docName, savingBOOL)
if savingBOOL is true then
tell application "Microsoft Word"
close document docName saving yes
end tell
else
tell application "Microsoft Word"
close document docName saving no
end tell
end if
end closeWordDocByName


on addTextToFrontOfDoc(preBody)
tell application "Microsoft Word"
tell active document
insert text preBody & return at beginning of text object of active document
end tell -- doc
end tell --prog
end addTextToFrontOfDoc


on addTextToEndOfDoc(postBody)
tell application "Microsoft Word"
tell active document
insert text postBody at end of text object of active document
end tell -- doc
end tell --prog
end addTextToEndOfDoc

Looking for a free text converter? Look no more, upload your Word files and convert them to LaTeX files. Yes, it’s that easy.

Converting from Word

Not sure if the world would be a better place without Microsoft Word, but I guess we’ll never find out. It’s here and it’s here to stay. Every fricking office computer has Word on it. Techies hate it, because it’s not really machine-readable, it’s proprietary and there is no documented standard. Office people love it though. There are no limits. Put text in it, fine. Add images, no problem. Want to switch the font to Comic Sans? Sure! Make a creative layout, amazing! Do whatever you like. But don’t forget to convert it to a proper file format before you send it to a techie.

The files end with .docx by default.



Converting to LaTeX

LaTeX was developed in 1984 and no, that’s not a typo. It’s nearly 40 years old. It started as a writing tool for mathematicians and computer scientists, but has quickly been taken up by scholars who wanted to write documents with math expressions or non-Latin scripts (Arabic or Chinese for example). As with a lot of other text document formats, it’s used to structure the content, not style it. LaTeX is used directly or as an intermediate format to produce files for printing or digital distribution. It supports highlighting (such as bold or italic), citations and cross-references. Or to make it short: It’s the most powerful format to structure your texts. Convert all your files to LaTeX.

The files end with .tex by default.




docx2tex

Converts Microsoft Word’s DOCX to LaTeX. Developed by le-tex and based on the transpect framework. The main author of docx2tex and the underlying xml2tex is @mkraetke.

get docx2tex

Download the latest docx2tex release

…or get the source via Git. Please note that you have to add the --recursive option in order to clone docx2tex together with its submodules.

git clone https://github.com/transpect/docx2tex --recursive

requirements

  • Java 1.7 up to 1.15 (more recent versions not yet tested). Java 11 has a bug with file URIs and should be avoided; Java 13 is safe again.
  • works on Windows, Linux and Mac OS X
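A quick way to check an installed Java against the requirements above is to classify its version string. The following is only a sketch: the helper names java_major and docx2tex_java_status are hypothetical, and the verdict for Java 12 is an assumption, since the requirements only state that Java 11 should be avoided and that Java 13 is safe again.

```python
import re


def java_major(version: str) -> int:
    """Return the major Java version from strings such as '1.8.0_292' or '13.0.2'."""
    match = re.match(r"(\d+)(?:\.(\d+))?", version)
    if not match:
        raise ValueError(f"unrecognized Java version: {version!r}")
    first = int(match.group(1))
    # Old-style versions are written 1.x (for example 1.8 means Java 8).
    if first == 1 and match.group(2) is not None:
        return int(match.group(2))
    return first


def docx2tex_java_status(version: str) -> str:
    """Classify a Java version against the stated docx2tex requirements."""
    major = java_major(version)
    if major < 7:
        return "unsupported"
    if major == 11:
        return "avoid"        # bug with file URIs, per the requirements
    if major == 12:
        return "unverified"   # assumption: the requirements do not mention Java 12
    if major <= 15:
        return "ok"
    return "untested"         # more recent versions are not yet tested


if __name__ == "__main__":
    for v in ("1.8.0_292", "11.0.2", "13.0.2", "17"):
        print(v, "->", docx2tex_java_status(v))
```

The version string itself can be read from the output of java -version.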

run docx2tex

You can run docx2tex with a Bash script (Linux, Mac OSX, Cygwin) or the Windows batch script whose options are somewhat limited, compared to the Bash script.

Linux/MacOSX

./d2t [options ...] myfile.docx

Option  Description
-o      path to custom output directory
-c      path to custom docx2tex configuration file
-m      choose MathType source (ole|wmf|ole+wmf)
-f      path to custom fontmaps directory
-p      generate PDF with pdflatex
-t      choose table model (tabularx|tabular|htmltabs)
-e      custom XSLT stylesheet for evolve-hub overrides
-x      custom XSLT stylesheet for postprocessing the evolve-hub results
-d      debug mode
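If you call d2t from another program, the options can be assembled programmatically. Here is a minimal sketch in Python: build_d2t_command is a hypothetical helper (not part of docx2tex), it covers only a few of the options listed above, and it assumes that -p and -d are plain switches without a value.

```python
import subprocess
from typing import List, Optional

TABLE_MODELS = ("tabularx", "tabular", "htmltabs")


def build_d2t_command(docx: str,
                      out_dir: Optional[str] = None,
                      conf: Optional[str] = None,
                      table_model: Optional[str] = None,
                      make_pdf: bool = False,
                      debug: bool = False,
                      script: str = "./d2t") -> List[str]:
    """Assemble the argument list for the d2t Bash script."""
    cmd = [script]
    if out_dir:
        cmd += ["-o", out_dir]       # custom output directory
    if conf:
        cmd += ["-c", conf]          # custom configuration file
    if table_model:
        if table_model not in TABLE_MODELS:
            raise ValueError(f"unknown table model: {table_model!r}")
        cmd += ["-t", table_model]   # table model
    if make_pdf:
        cmd.append("-p")             # also generate a PDF with pdflatex
    if debug:
        cmd.append("-d")             # debug mode
    cmd.append(docx)                 # the input file comes last
    return cmd


if __name__ == "__main__":
    cmd = build_d2t_command("myfile.docx", out_dir="out", make_pdf=True)
    print(" ".join(cmd))
    # To actually run the conversion from the docx2tex directory:
    # subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with file names that contain spaces.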

Windows

via XML Calabash

Linux/Mac OSX

calabash/calabash.sh -o result=myfile.tex -o hub=myfile.xml xpl/docx2tex.xpl docx=myfile.docx conf=conf/conf.xml

Windows

calabash\calabash.bat -o result=myfile.tex -o hub=myfile.xml xpl/docx2tex.xpl docx=myfile.docx conf=conf/conf.xml

configure

The docx2tex pipeline consists of 3 macroscopic steps:

  • docx2hub. This step is hardly configurable. It transforms a docx file to a Hub XML representation.
  • evolve-hub. This is a bag of XSLT modes that, among other things, transform paragraphs with list markers and hanging indentation to proper nested lists, create a nested section hierarchy, group images with their figure titles, etc. Only some of the modes are used by docx2tex, orchestrated by evolve-hub.xpl and configured in detail by evolve-hub-driver.xsl.
  • xml2tex

There are five major hooks for adding your own processing: a CSV configuration; an xml2tex configuration; XSLT that is applied between evolve-hub and xml2tex; XSLT that modifies what happens in evolve-hub; and fontmaps.


You can specify a custom configuration file for docx2tex. There are two different formats to write a configuration.

  • The CSV-based configuration format permits a simple way to map from MS Word styles to LaTeX commands.
  • The xml2tex configuration format is recommended for a deeper level of configuration but requires basic knowledge of XML and XPath.

CSV

For each MS Word style name, create a line with three semicolon separated values.

  • MS Word style name
  • LaTeX start statement
  • LaTeX end statement

Just follow this example:

Heading 1   ; \chapter{     ; }
Heading 2   ; \section{     ; }
Heading 3   ; \subsection{  ; }
Quote       ; \begin{quote} ; \end{quote}

You can edit CSV files either with a simple text editor or with a spreadsheet application.
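To make the three-column format concrete, here is a small Python sketch that reads such a mapping and wraps a piece of text in the configured start and end statements. It only illustrates the file format and is not how docx2tex processes the file internally; parse_style_map and wrap are hypothetical names, and the sample assumes LaTeX commands written with their leading backslash.

```python
from typing import Dict, Tuple

SAMPLE = r"""
Heading 1   ; \chapter{     ; }
Heading 2   ; \section{     ; }
Quote       ; \begin{quote} ; \end{quote}
"""


def parse_style_map(text: str) -> Dict[str, Tuple[str, str]]:
    """Map each Word style name to its (start, end) LaTeX statements."""
    mapping: Dict[str, Tuple[str, str]] = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        fields = [field.strip() for field in line.split(";")]
        if len(fields) != 3:
            raise ValueError(f"expected 3 fields, got {len(fields)}: {line!r}")
        style, start, end = fields
        mapping[style] = (start, end)
    return mapping


def wrap(style: str, content: str, mapping: Dict[str, Tuple[str, str]]) -> str:
    """Surround content with the statements configured for the given style."""
    start, end = mapping[style]
    return f"{start}{content}{end}"


if __name__ == "__main__":
    styles = parse_style_map(SAMPLE)
    print(wrap("Heading 1", "Introduction", styles))
```

Note that the padding around the semicolons is only there for readability; the sketch strips it before use.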

xml2tex

docx2tex can also be configured by means of an xml2tex configuration file. docx2tex applies the configuration to the intermediate Hub XML file and generates the LaTeX output.

The configuration in conf/conf.xml is used by default and works with the styles defined in Microsoft Word’s normal.dot. If you want to configure docx2tex for other styles, you can edit this file or pass a custom configuration file with the conf option.

The xml2tex documentation explains how to edit this file.

XSLT between evolve-hub and xml2tex

You can provide an XSLT that works on the result of evolve-hub (if debugging is enabled, on the file [basename].debug/evolve-hub/70.docx2tex-postprocess.xml). The location of this XSLT file (an absolute URI or a path relative to the main directory that d2t and d2t.bat reside in) may be passed to d2t via the -x option. d2t.bat does not have all the flags; if you are confined to Windows and don’t have Cygwin, WSL, or MinGW, you may invoke calabash/calabash.bat yourself, see above. In that case, the additional XSLT’s URI may be provided by the custom-xsl option. This processing is applied before the xml2tex configuration, so your XSLT should transform Hub (DocBook namespace) to Hub.

During evolve-hub

In case you need to influence what evolve-hub does, you can provide a custom stylesheet for this. Contrary to custom-xsl which is passed as an option, this is passed to the pipeline on the input port custom-evolve-hub-driver, or using the -e option of d2t. There is an example for such an XSLT that retains empty paragraphs that will otherwise be removed by default, in one of the XSLT passes that comprise evolve-hub. This example was created in response to a user request. If you want to create chapter, section, etc. headings from arbitrary docx paragraphs, you should add a template that sets the paragraph’s @role attribute to Heading1, Heading2, etc. (For paragraphs that are not removed during evolve-hub, this can also be done in the -x stylesheet.) It is strongly advised to xsl:import the default evolve-hub customization (see example).

fontmaps

The docx conversion supports individual fontmaps for mapping non-Unicode characters to Unicode. Please note that this is only needed for fonts that are not Unicode-compatible. If you want to map characters from Unicode to LaTeX, please use the character map in the xml2tex configuration instead.

Please find further documentation on how to create a fontmap here.

After you created your fontmap, store it in a directory and pass the path of the directory to docx2tex with the -f option.

If you invoke the docx2tex XProc pipeline (xpl/docx2tex.xpl), you can specify the fontmap directory with the option custom-font-maps-dir.
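For direct Calabash invocations, the command line can be put together as in the sketch below. build_calabash_command is a hypothetical helper; the sketch assumes that custom-xsl and custom-font-maps-dir are passed in the same name=value form that the invocation shown earlier uses for docx and conf.

```python
from typing import List, Optional


def build_calabash_command(docx: str,
                           tex: str,
                           hub: str,
                           conf: str = "conf/conf.xml",
                           custom_xsl: Optional[str] = None,
                           custom_font_maps_dir: Optional[str] = None,
                           windows: bool = False) -> List[str]:
    """Assemble an XML Calabash call for xpl/docx2tex.xpl."""
    launcher = r"calabash\calabash.bat" if windows else "calabash/calabash.sh"
    cmd = [
        launcher,
        "-o", f"result={tex}",   # output port: the LaTeX file
        "-o", f"hub={hub}",      # output port: the intermediate Hub XML
        "xpl/docx2tex.xpl",
        f"docx={docx}",
        f"conf={conf}",
    ]
    # Assumption: pipeline options use the same name=value syntax as docx and conf.
    if custom_xsl:
        cmd.append(f"custom-xsl={custom_xsl}")
    if custom_font_maps_dir:
        cmd.append(f"custom-font-maps-dir={custom_font_maps_dir}")
    return cmd


if __name__ == "__main__":
    print(" ".join(build_calabash_command(
        "myfile.docx", "myfile.tex", "myfile.xml",
        custom_font_maps_dir="fontmaps")))
```

Run the resulting command from the docx2tex main directory so that the relative paths to xpl/ and conf/ resolve.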

language tagging

You may have noticed some obscure \foreignlanguage{} or \selectlanguage{} code that doesn’t match the actual language used in your TeX document. We have no fancy AI™-based natural language algorithms at work. Instead, docx2tex evaluates the original document language, which typically follows your system settings, and the language setting of the paragraph or character style, which Word uses for auto-correction and hyphenation. docx2tex evaluates these settings and filters redundant markup, e.g. by detecting the main language from the character count of each style and its respective language setting. However, when you copy and paste from the World Wide Web, Microsoft Word usually copies the language of the original website as well. This causes most of the weird language markup you may have noticed. So we recommend copying and pasting as plain text, and creating new paragraph and character styles when you intentionally want to change the language of a text fragment.
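The character-count idea can be illustrated with a few lines of Python. This is a simplified sketch of the principle only, not the actual docx2tex implementation (which works in XSLT on the Hub XML); main_language and tag_runs are hypothetical names, and the language labels are assumed to be babel language names.

```python
from collections import Counter
from typing import List, Tuple

Run = Tuple[str, str]  # (language, text)


def main_language(runs: List[Run]) -> str:
    """Pick the language that covers the most characters."""
    counts: Counter = Counter()
    for lang, text in runs:
        counts[lang] += len(text)
    if not counts:
        raise ValueError("no text runs given")
    return counts.most_common(1)[0][0]


def tag_runs(runs: List[Run]) -> str:
    """Emit LaTeX, marking only the runs that differ from the main language."""
    main = main_language(runs)
    parts = []
    for lang, text in runs:
        if lang == main:
            parts.append(text)  # markup for the main language would be redundant
        else:
            parts.append(f"\\foreignlanguage{{{lang}}}{{{text}}}")
    return "".join(parts)


if __name__ == "__main__":
    sample = [
        ("english", "This is mostly English text. "),
        ("german", "Guten Tag"),
        ("english", " and more English."),
    ]
    print(tag_runs(sample))
```

A fragment pasted from a website in another language would show up here as an extra run with its own label, which is exactly the kind of unexpected markup described above.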
