How to convert Word to PDF in PHP

Here are my 2 cents on Word 2007 .docx, Word 97-2004 .doc, PDF and all the other MS Office formats that are supposed to be «converted from y to z» but in reality don't want to be. In my experience so far, conversion with LibreOffice or OpenOffice can't be relied on, although .doc documents tend to be better supported than Word 2007's .docx. In general it's very hard to convert .docx to .doc without breaking anything.

.docx files also tend to be extremely useful for templating, whereas .doc files are not, because they are binary.

The conversion from .doc to PDF was quite reliable most of the time. If you can still influence the design or content of the Word document, this might be satisfactory. In my situation, however, documents were supplied by foreign companies, and in some scenarios the .docx generated from a template had to be slightly modified with supplementary text before it was converted to PDF.


WINDOWS BASED!

All these hiccups led me to conclude that the only truly reliable conversion method I found was to use the COM class in PHP and let the MS Word or Excel application do all the work. I'll just give an example of converting .docx to .doc and/or PDF. If you do not have MS Office installed, you can download a 60-day trial version, which gives you enough room for testing.

The COM/.NET extension is commented out by default in php.ini; just search for the line php_com_dotnet.dll and uncomment it like so:

  extension=php_com_dotnet.dll

Restart the web server (IIS is not a prerequisite; Apache will work just as well).

The code below demonstrates how easy it is.

  $word = new COM("Word.Application") or die ("Could not initialise Object.");
  // set it to 1 to see the MS Word window (the actual opening of the document)
  $word->Visible = 0;
  // recommend to set to 0, disables alerts like "Do you want MS Word to be the default .. etc"
  $word->DisplayAlerts = 0;
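  // open the word 2007-2013 document (use a full path; Word may not resolve relative paths against the script's directory)
  $word->Documents->Open('yourdocument.docx');
  // save it as word 2003 (0 = wdFormatDocument; without it the file keeps the .docx format)
  $word->ActiveDocument->SaveAs('newdocument.doc', 0);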
  // convert word 2007-2013 to PDF
  $word->ActiveDocument->ExportAsFixedFormat('yourdocument.pdf', 17, false, 0, 0, 0, 0, 7, true, true, 2, true, true, false);
  // quit the Word process
  $word->Quit(false);
  // clean up
  unset($word);

This is just a small demonstration. All I can say is that, when it comes to conversion, this was the only really reliable option I could use and even recommend.
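A slightly more defensive variant of the demonstration above, as a sketch only. The function name convertWithWord is made up for this example, and it assumes a Windows machine with MS Word installed and the COM extension enabled. It wraps the calls in try/finally so the Word process is closed even when opening or exporting fails.

```php
<?php
// Hypothetical helper; returns false when COM is unavailable (e.g. on Linux)
// or when the source file cannot be found.
function convertWithWord(string $source, string $target): bool
{
    if (!class_exists('COM')) {
        return false;
    }
    $sourcePath = realpath($source); // absolute path to the input document
    if ($sourcePath === false) {
        return false;
    }
    $word = new COM('Word.Application');
    try {
        $word->Visible = 0;
        $word->DisplayAlerts = 0;
        $document = $word->Documents->Open($sourcePath);
        // 17 = PDF export format, as in the example above;
        // $target should also be an absolute path.
        $document->ExportAsFixedFormat($target, 17);
        $document->Close(false);
        return true;
    } finally {
        // Always quit, otherwise WINWORD.EXE processes pile up.
        $word->Quit(false);
    }
}
```

If Word cannot open or export the document, the COM extension raises an exception; the finally block still runs, so no Word process should be left behind.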


Wouldn't it be great if you could simply take normal Word documents and convert them to PDF using PHP? Yes, it would, and this article explains how to achieve that: we set up a service that uses PHP to convert Word documents delivered by clients or other project stakeholders into PDF documents.

How to edit Word document templates with PHP and convert them to PDF

Creating custom PDF documents is a very time-consuming task in projects. There are various approaches to improving this process, one of the most common being to create an HTML document that gets converted to PDF. This solution is not always the best, because every new PDF has to be implemented as an HTML document again before you can start.

Intro

You can find a few solutions for editing Word document templates with PHP on the Internet, the most popular ones being PHP Word, phpdocx and LiveDocx. Since PHP Word is the only one that is free, I gave it a go.

PHP Word authors say about PHP Word: “PHPWord is a library written in pure PHP that provides a set of classes to write to and read from different document file formats. The current version of PHPWord supports Microsoft Office Open XML (OOXML or OpenXML), OASIS Open Document Format for Office Applications (OpenDocument or ODF), Rich Text Format (RTF), HTML, and PDF”.

The tested version is v0.12.1. The supported formats, better known by their extensions, are Office Open XML (.docx and .docm) and OpenDocument (.odt and .fodt).

Although PHP Word can be found on both CodePlex and GitHub, it moved to GitHub some time ago. GitHub is also used for reporting bugs. The documentation is on Read the Docs.

Also, in case someone needs a free solution for manipulating spreadsheets, the same team, PHPOffice, has a PHP Excel library.
I will now explain how to set up and use PHP Word for editing .docx templates.


PHP Word Setup

This part explains how to set up PHP Word. I start with a list of prerequisites, then explain how to install PHP Word on a Linux server and, finally, how to use it.

Requirements

Before we go through the individual steps, we need to consider the prerequisites. First, you need to be able to set up PHP Word on your Linux web server. For that you need:

  • PHP 5.3+
  • PHP XML Parser extension (it is enabled by default)
  • Composer (optional, but recommended)
  • PHP Phar extension
    • comes pre-installed with required version of PHP
    • to enable the Phar extension, uncomment or add following line if it is not present in php.ini file:
  • OpenSSL package
  • PHP Openssl extension (comes pre-installed with PHP)
    • comes with OpenSSL package
    • to enable OpenSSL extension, uncomment or add following line if it is not present in php.ini file:

Installation

The recommended way to install PHP Word is with Composer. If you prefer not to use Composer, even though you know you should, you can download or clone the project from GitHub.

“Composer is a tool for dependency management in PHP. It allows you to declare the libraries your project depends on and it will manage (install/update) them for you.” “It deals with ‘packages’ or libraries, but it manages them on a per-project basis, installing them in a directory (e.g. vendor) inside your project. By default it does not install anything globally.”² Composer uses the composer.json file to know which packages or libraries it needs to download.

To install PHP Word with Composer, add "phpoffice/phpword": "dev-master" to your composer.json file under the require key. If you don't have a composer.json file, simply create one in your project root with the following content:

{
    "require": {
        "phpoffice/phpword": "dev-master"
    }
}

In both cases, after the composer.json file is saved, issue the composer install command in terminal in the directory where you placed it.

Including PHP Word to your specific project

To include PHP Word in your project, you need to require the PHP Word Autoloader.php from the src/PhpWord folder and register it:

require_once 'src/PhpWord/Autoloader.php';
\PhpOffice\PhpWord\Autoloader::register();
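
If PHP Word was installed through Composer, the Composer-generated autoloader (vendor/autoload.php) can be used instead of the library's own Autoloader.php. A minimal sketch of handling both cases follows; the function name findAutoloader and the directory layout are assumptions made for this example.

```php
<?php
// Hypothetical helper: return whichever autoloader exists under $projectRoot.
function findAutoloader(string $projectRoot): ?string
{
    $candidates = [
        $projectRoot . '/vendor/autoload.php',        // Composer install
        $projectRoot . '/src/PhpWord/Autoloader.php', // manual download or clone
    ];
    foreach ($candidates as $path) {
        if (is_file($path)) {
            return $path;
        }
    }
    return null;
}

$autoloader = findAutoloader(__DIR__);
if ($autoloader !== null) {
    require_once $autoloader;
    // Only the bundled autoloader has to be registered by hand.
    if (basename($autoloader) === 'Autoloader.php') {
        \PhpOffice\PhpWord\Autoloader::register();
    }
}
```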

Editing templates

You have probably found yourself in a situation where you need template documents in which only a few values differ for each user, receipt or something else. That is exactly the situation where PHP Word comes in very handy and makes your work effortless.

The only requirement is that the document is saved with the .docx extension.
Placeholders for variables or arrays in the template are defined the same way, ${variableName} or ${arrayName}. A placeholder for an array MUST be in a table cell, so that its row can be cloned later. This is good, because it lets you insert as many entries as you need for one placeholder programmatically.

To replace the placeholders with your variables, you need to:

1. Open the template:
$template = new \PhpOffice\PhpWord\TemplateProcessor('folder/file.docx');

2. Replace your placeholders with your variables:
$template->setValue('variableName', 'MyVariableValue');

3. Replace your placeholders with your arrays. First clone the row holding the array placeholder once per array element:
$template->cloneRow('arrayName', count($array));

Then give each of the cloned placeholders a value:
for ($number = 0; $number < count($array); $number++) {
    $template->setValue('arrayName#' . ($number + 1), htmlspecialchars($array[$number], ENT_COMPAT, 'UTF-8'));
}

4. Save the modified file:
$template->saveAs('folder/result.docx');
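
The row-cloning step numbers the cloned placeholders from 1 (arrayName#1, arrayName#2, and so on). The sketch below isolates that numbering and the escaping in a small helper; rowPlaceholders is a name invented for this example and is not part of PHP Word.

```php
<?php
// Hypothetical helper: map a list of values onto the numbered placeholders
// (name#1, name#2, ...) that exist after the row has been cloned.
function rowPlaceholders(string $name, array $values): array
{
    $placeholders = [];
    foreach (array_values($values) as $index => $value) {
        // Escape XML-sensitive characters before they go into the document.
        $placeholders[$name . '#' . ($index + 1)] =
            htmlspecialchars((string) $value, ENT_COMPAT, 'UTF-8');
    }
    return $placeholders;
}

// Usage, assuming $template is the template object opened earlier:
// $template->cloneRow('arrayName', count($array));
// foreach (rowPlaceholders('arrayName', $array) as $key => $value) {
//     $template->setValue($key, $value);
// }
```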

Converting to PDF

There is an option to integrate dompdf with PHP Word to convert the file to PDF after editing the template, but it won't give the expected result of a well-formatted document. If you want the best results, the easiest thing to do is to install LibreOffice on your server. On shared hosting you probably won't be able to do that, so you'll need either a Virtual Private Server (VPS) or dedicated hosting. VPS hosting is not that expensive these days, and if you need to convert files to PDF, you will have to get one and set it up. If you run into problems when converting with LibreOffice, you still have a great backup: unoconv.

Universal Office Converter (unoconv) is a command line tool to convert any document format that LibreOffice can import to any document format that LibreOffice can export. It makes use of the LibreOffice’s UNO bindings for non-interactive conversion of documents.

After you have installed LibreOffice on your server, you can see if it works by doing the following: Navigate to folder where you have some of your document files and issue following command:

libreoffice --headless --convert-to pdf /path/to/document/file --outdir /desired/output/directory

or if you want to issue it from PHP:

shell_exec('libreoffice --headless --convert-to pdf /path/to/document/file --outdir /desired/output/directory');
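
When the file name comes from an upload or a database, it may contain spaces or shell metacharacters. A minimal sketch of building the same command with PHP's escapeshellarg() is shown below; the function name libreOfficeCommand is an assumption for this example.

```php
<?php
// Hypothetical helper: build the LibreOffice command shown above,
// quoting both paths so that spaces or special characters are safe.
function libreOfficeCommand(string $inputFile, string $outputDir): string
{
    return sprintf(
        'libreoffice --headless --convert-to pdf %s --outdir %s',
        escapeshellarg($inputFile),
        escapeshellarg($outputDir)
    );
}

// Usage:
// $output = shell_exec(libreOfficeCommand('/path/to/document/file', '/desired/output/directory'));
```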

If you get PDF after issuing that command, you are good to go and everything works. However, there are two possible problems I have already warned you about.

The first problem is:

“[Java framework] Error in function createSettingsDocument (elements.cxx).javaldx failed!

Warning: failed to read path from javaldx”

This one can be solved. You need to create a .config folder in the home folder of your Apache user. For CentOS, Debian and Ubuntu this is /var/www. You will need a user with sudo rights (or the root user, which is not recommended) to issue the required commands. Navigate to the required folder:

cd /var/www

and create folder:

mkdir .config    

After that, you need to make apache user and group owner of that newly created folder. Apache user and group are apache for CentOS and www-data for Ubuntu and Debian. For CentOS

sudo chown -R apache:apache /var/www/.config

or for Debian/Ubuntu:

sudo chown -R www-data:www-data /var/www/.config
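
To confirm the fix from the web server's point of view, a small check can be run from a PHP page, which executes as the Apache user. This is only a sketch; the function name hasWritableConfigDir is invented for this example, and /var/www is the home folder mentioned above.

```php
<?php
// Hypothetical check: does the given home directory contain a writable .config folder?
function hasWritableConfigDir(string $homeDir): bool
{
    $configDir = rtrim($homeDir, '/') . '/.config';
    return is_dir($configDir) && is_writable($configDir);
}

if (!hasWritableConfigDir('/var/www')) {
    error_log('LibreOffice may fail: /var/www/.config is missing or not writable');
}
```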

The second problem starts with:

“Error: Please verify input parameters…”

For this problem, I didn’t find any working solutions, even though there are some proposed online. If you have time, check them, but I wouldn’t hope too much. Should you be able to find a solution, please let me know in the comments on this article.

If you had no luck with LibreOffice, you can still use unoconv for the same result, with a bit more tuning.

After you install it, you will need to grant sudo rights to the Apache user and group for unoconv (only for unoconv!), since it requires sudo rights to be run. Again, you will need to run the following command as a user with sudo rights:

sudo visudo

When it opens, scroll down to the bottom and find a line which looks like this:

root ALL=(ALL:ALL) ALL

You should add following below that line for CentOS:

apache ALL=NOPASSWD:/usr/bin/unoconv
%apache ALL=NOPASSWD:/usr/bin/unoconv

or for Debian/Ubuntu:

www-data ALL=NOPASSWD:/usr/bin/unoconv
%www-data ALL=NOPASSWD:/usr/bin/unoconv

This enables your apache user and group to run only unoconv with sudo rights with no password required.

With that done, unoconv is set up. You can try it by issuing the command:

sudo unoconv -f pdf /path/to/input/file

This command will convert the input file to PDF and save it to the current folder. Or if you want to use it from PHP, just run it with shell_exec:

shell_exec('sudo unoconv -f pdf /path/to/input/file');
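
shell_exec() returns only the command's output, so a failed conversion is easy to miss. The sketch below uses PHP's exec() instead to capture the exit code as well; runConversion is a hypothetical name, and treating exit code 0 as success is an assumption about the converter.

```php
<?php
// Hypothetical wrapper: run a command and report exit status plus combined output.
function runConversion(string $command): array
{
    $outputLines = [];
    $exitCode = 0;
    // 2>&1 folds error messages into the captured output.
    exec($command . ' 2>&1', $outputLines, $exitCode);
    return [
        'ok' => $exitCode === 0,
        'output' => implode("\n", $outputLines),
    ];
}

// Usage:
// $result = runConversion('sudo unoconv -f pdf ' . escapeshellarg('/path/to/input/file'));
// if (!$result['ok']) {
//     error_log('Conversion failed: ' . $result['output']);
// }
```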

Results

As a result of this whole process, we get a pixel-perfect pdf document that we can send to clients.

The biggest advantage of this solution is that it saves hours of development. Anyone who has produced PDFs, whether by writing (or drawing) directly to PDF or by building an HTML template first and converting it, knows how much time it consumes and that the results are not always what we expect. And if you have already spent time creating a PDF and the client then wants changes, it is a nightmare. With this solution, you can let clients create their own template and change it at any time without giving you a headache.

Earlier we also wasted so much time on aligning PDFs, but now we established this as a team-wide service which all our future projects can rely on and we can use our saved time to do something smarter – anything is better than wasting hours on creating PDFs :)

I hope this tutorial helps you to improve your development of PDFs in PHP. If you have any questions, feel free to contact us and leave us a message.

Furthermore, if there are things you consider to be important for this tutorial, please feel free to let us know.

1 https://github.com/PHPOffice/PHPWord

2 https://getcomposer.org/doc/00-intro.md

3 https://github.com/dagwieers/unoconv

25 Jun 2020

In this post, you’ll learn how to convert DOCX files to PDFs using PSPDFKit’s DOCX to PDF PHP API. With our API, you can convert up to 100 PDF files per month for free. All you need to do is create a free account to get access to your API key.

PSPDFKit API

Document conversion is just one of our 30+ PDF API tools. You can combine our conversion tool with other tools to create complex document processing workflows. You’ll be able to convert various file formats into PDFs and then:

  • Merge several resulting PDFs into one

  • OCR, watermark, or flatten PDFs

  • Remove or duplicate specific PDF pages

Once you create your account, you’ll be able to access all our PDF API tools.

Step 1 — Creating a Free Account on PSPDFKit

Go to our website, where you’ll see the page below, prompting you to create your free account.


Once you’ve created your account, you’ll be welcomed by the page below, which shows an overview of your plan details.


As you can see in the bottom-left corner, you’ll start with 100 documents to process, and you’ll be able to access all our PDF API tools.

Step 2 — Obtaining the API Key

After you’ve verified your email, you can get your API key from the dashboard. In the menu on the left, click API Keys. You’ll see the following page, which is an overview of your keys:


Copy the Live API Key, because you’ll need this for the DOCX to PDF API.

Step 3 — Setting Up Folders and Files

Now, create a folder called docx_to_pdf and open it in a code editor. For this tutorial, you’ll use VS Code as your primary code editor. Next, create two folders inside docx_to_pdf and name them input_documents and processed_documents.

Next, copy your DOCX file to the input_documents folder and rename it to document.docx. You can use our demo document as an example.

Then, in the root folder, docx_to_pdf, create a file called processor.php. This is the file where you’ll keep your code.

Your folder structure will look like this:

docx_to_pdf
├── input_documents
|    └── document.docx
├── processed_documents
└── processor.php

Step 4 — Writing the Code

Open the processor.php file and paste the code below into it:

<?php

$FileHandle = fopen('processed_documents/result.pdf', 'w+');

$curl = curl_init();

$instructions = '{
  "parts": [
    {
      "file": "document"
    }
  ]
}';

curl_setopt_array($curl, array(
  CURLOPT_URL => 'https://api.pspdfkit.com/build',
  CURLOPT_CUSTOMREQUEST => 'POST',
  CURLOPT_RETURNTRANSFER => true,
  CURLOPT_ENCODING => '',
  CURLOPT_POSTFIELDS => array(
    'instructions' => $instructions,
    'document' => new CURLFILE('input_documents/document.docx')
  ),
  CURLOPT_HTTPHEADER => array(
    'Authorization: Bearer YOUR_API_KEY_HERE'
  ),
  CURLOPT_FILE => $FileHandle,
));

$response = curl_exec($curl);

curl_close($curl);

fclose($FileHandle);

ℹ️ Note: Make sure to replace YOUR_API_KEY_HERE with your API key.
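
Because the response body is written straight into result.pdf, an error response from the API would end up in that file too. A minimal sanity check, assuming only that a valid PDF starts with the bytes %PDF-, might look like this; looksLikePdf is a name made up for the example.

```php
<?php
// Hypothetical check: a real PDF begins with the "%PDF-" signature.
function looksLikePdf(string $path): bool
{
    $handle = @fopen($path, 'rb');
    if ($handle === false) {
        return false; // file missing or unreadable
    }
    $header = fread($handle, 5);
    fclose($handle);
    return $header === '%PDF-';
}

// Usage after the request has finished and the file handle is closed:
// if (!looksLikePdf('processed_documents/result.pdf')) {
//     error_log('Conversion failed; result.pdf probably contains an error message');
// }
```

The HTTP status can be read as well with curl_getinfo($curl, CURLINFO_HTTP_CODE), as long as that call is made before curl_close().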

Code Explanation

In the code above, you create a FileHandle variable that will allow you to save the file in the processed_documents folder.

Then, you create the instructions variable, where all the instructions for the API will be stored in the form of a JSON string. Finally, you make a CURL request to process the target file.
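
As an alternative to writing the JSON string by hand, the same instructions can be built from a PHP array with json_encode(), which avoids quoting mistakes once more parts are added. This is a sketch of that idea rather than the approach used in the code above.

```php
<?php
// Build the instructions as an array and let PHP produce the JSON.
$instructions = json_encode([
    'parts' => [
        ['file' => 'document'], // refers to the 'document' field of the POST body
    ],
]);

echo $instructions, "\n";
```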

Output

To execute the code, run the command below from the docx_to_pdf folder:

php processor.php

On successful execution, you’ll see a new processed file, result.pdf, located in the processed_documents folder.

The folder structure will look like this:

docx_to_pdf
├── input_documents
|    └── document.docx
├── processed_documents
|    └── result.pdf
└── processor.php

Final Words

In this post, you learned how to easily and seamlessly convert DOCX files to PDF documents for your PHP application using our DOCX to PDF PHP API.

You can integrate all of these functions into your existing applications. With the same API token, you can also perform other operations, such as merging several documents into a single PDF, adding watermarks, and more. To get started with a free trial, sign up here.

transformDocument

ADVANCED / PREMIUM
BASIC

Transforms documents into other formats (DOCX, PDF, (X)HTML, DOC, RTF, PNG, TXT).

public transformDocument (string $source, string $target [, string $method [, array $options]])

This method transforms a document, whether or not it was generated with phpdocx, into DOCX, PDF, HTML, DOC, ODT, RTF, PNG or TXT, preserving the original formatting options as far as possible.

You may find more info regarding this method in the Conversion plugin section.

Due to format limitations, the PNG transformation only generates the first page of the document.

source

Path to the document that you want to convert to a different format.

target

Path to the resulting transformed document (PDF, HTML, XHTML, DOCX, DOC, ODT, RTF, PNG or TXT).

method

Method used to transform the document: ‘native’, ‘libreoffice’, ‘msword’, ‘openoffice’

‘native’ method options

The possible keys and values are:

Key                   Type    Description
dompdf                DOMPDF  dompdf instance.
addHeadersAndFooters  bool    True as default. If true, add header/footer default type.
stream                bool    False as default. If true, returns the document as stream.

‘libreoffice’ method options

The possible keys and values are:

Key           Type    Description
comments      bool    False by default. Export comments.
debug         bool    False by default. Returns debug information about the conversion plugin.
extraOptions  string  Extra parameters to be used when doing the conversion.
formsfields   bool    False by default. Export form fields.
homeFolder    string  Set a custom home folder to be used for the conversions.
lossless      bool    False by default. Lossless compression.
outdir        string  Set the outdir path. Useful when the PDF output path is not the same as that of the running script.
pdfa1         bool    False by default. Generate PDF/A-1 document.
toc           bool    False by default. If true, updates the TOC before transforming the document.

‘msword’ method options

The possible keys and values are:

Key              Type    Description
selectedContent  string  Scope: ‘active’ (default) or ‘documents’.
toc              bool    False by default. If true, updates the TOC before transforming the document.

‘openoffice’ method options

The possible keys and values are:

Key           Type    Description
debug         bool    False by default. Returns debug information about the conversion plugin.
homeFolder    string  Set a custom home folder to be used for the conversions.
odfconverter  bool    True by default. If set to false the conversion plugin does not use ODFConverter package. This may give better results in some cases.
tempDir       string  Set a custom temp folder to be used for the conversions.
version       string  32-bit or 64-bit architecture. 32, 64 or null (default). If null autodetect.
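
To tie the reference together, here is a hedged sketch of preparing a call with the ‘libreoffice’ method. The helper isSupportedTarget and the file names are invented for this example, the option keys come from the table above, and the call itself is left as a comment because it assumes a phpdocx object named $docx that exposes transformDocument().

```php
<?php
// Target extensions listed in the description of the target parameter.
const SUPPORTED_TARGETS = ['pdf', 'html', 'xhtml', 'docx', 'doc', 'odt', 'rtf', 'png', 'txt'];

// Hypothetical helper: reject targets the method is not documented to produce.
function isSupportedTarget(string $target): bool
{
    $extension = strtolower(pathinfo($target, PATHINFO_EXTENSION));
    return in_array($extension, SUPPORTED_TARGETS, true);
}

// Options for the 'libreoffice' method, using keys from the table above.
$options = [
    'pdfa1'  => true,   // generate a PDF/A-1 document
    'toc'    => true,   // update the TOC before transforming
    'outdir' => '/tmp', // assumed output directory for this example
];

// Assumed call site:
// if (isSupportedTarget('report.pdf')) {
//     $docx->transformDocument('report.docx', 'report.pdf', 'libreoffice', $options);
// }
```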


To convert your first file with the Zamzar API, send an HTTP request to POST https://sandbox.zamzar.com/v1/jobs containing your source file and your desired target format. If the source file is on the web or in S3, send us the URL: the source file doesn’t need to hit your servers.

<?php

$endpoint = "https://sandbox.zamzar.com/v1/jobs";
$apiKey = "GiVUYsF4A8ssq93FR48H";
$sourceFile = "https://s3.amazonaws.com/zamzar-samples/sample.docx";
$targetFormat = "PDF";

$postData = array(
  "source_file" => $sourceFile,
  "target_format" => $targetFormat
);

$ch = curl_init(); // Init curl
curl_setopt($ch, CURLOPT_URL, $endpoint); // API endpoint
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'POST');
curl_setopt($ch, CURLOPT_POSTFIELDS, $postData);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // Return response as a string
curl_setopt($ch, CURLOPT_USERPWD, $apiKey . ":"); // Set the API key as the basic auth username
$body = curl_exec($ch);
curl_close($ch);

$response = json_decode($body, true);

echo "Response:\n---------\n";
print_r($response);
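
The next request needs the ID of the job that was just created. The sketch below pulls it out of the decoded response; it assumes the job ID is returned in a top-level id field, and extractJobId is a name invented for this example.

```php
<?php
// Hypothetical helper: return the job ID from the decoded response,
// or null when the response is missing or carries no ID (e.g. an error reply).
function extractJobId(?array $response): ?int
{
    if ($response === null || !isset($response['id'])) {
        return null;
    }
    return (int) $response['id'];
}

// Usage with the $response variable from the code above:
// $jobID = extractJobId($response);
// if ($jobID === null) {
//     exit("Job could not be created\n");
// }
```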

Your source file is now being converted. Send an HTTP request to GET https://sandbox.zamzar.com/v1/jobs/$jobId to check its progress. The response will also give you details about your converted file.

<?php

$jobID = 15;
$endpoint = "https://sandbox.zamzar.com/v1/jobs/$jobID";
$apiKey = "GiVUYsF4A8ssq93FR48H";

$ch = curl_init(); // Init curl
curl_setopt($ch, CURLOPT_URL, $endpoint); // API endpoint
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // Return response as a string
curl_setopt($ch, CURLOPT_USERPWD, $apiKey . ":"); // Set the API key as the basic auth username
$body = curl_exec($ch);
curl_close($ch);

$job = json_decode($body, true);

echo "Job:\n----\n";
print_r($job);
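
Since conversion takes time, the status request usually has to be repeated. The following sketch polls until the job is successful and returns the ID of the first converted file. It assumes the job response carries a status field and a target_files list with id entries, and that failed and cancelled are terminal states; the function name and the injected $fetchJob callable are inventions of this example.

```php
<?php
// Hypothetical polling helper. $fetchJob is any callable that returns the
// decoded job array, e.g. a closure wrapping the cURL request above.
function waitForTargetFileId(callable $fetchJob, int $maxAttempts = 10, int $delaySeconds = 2): ?int
{
    for ($attempt = 0; $attempt < $maxAttempts; $attempt++) {
        $job = $fetchJob();
        $status = $job['status'] ?? null;
        if ($status === 'successful') {
            // ID of the first converted file, if present
            return $job['target_files'][0]['id'] ?? null;
        }
        if ($status === 'failed' || $status === 'cancelled') {
            return null; // no point in polling further
        }
        sleep($delaySeconds);
    }
    return null; // gave up after $maxAttempts
}
```

Passing the request in as a callable keeps the loop independent of cURL, so it can be exercised with canned responses.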

Once the status of your job is successful, your converted file is ready to download. Send an HTTP request to GET https://sandbox.zamzar.com/v1/files/$fileId/content to download it. We store your files for a day by default, and for longer on our paid plans.

<?php

$fileID = 3;
$localFilename = "converted.pdf";
$endpoint = "https://sandbox.zamzar.com/v1/files/$fileID/content";
$apiKey = "GiVUYsF4A8ssq93FR48H";

$ch = curl_init(); // Init curl
curl_setopt($ch, CURLOPT_URL, $endpoint); // API endpoint
curl_setopt($ch, CURLOPT_USERPWD, $apiKey . ":"); // Set the API key as the basic auth username
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);

$fh = fopen($localFilename, "wb");
curl_setopt($ch, CURLOPT_FILE, $fh);

$body = curl_exec($ch);
curl_close($ch);
fclose($fh); // flush and close the downloaded file

echo "File downloaded\n";

If you like what you see and want to start converting files under your own API account then please click the «Get Started Now» button to signup for your own API account. Please feel free to get in touch with us should you have any specific questions or refer to our extensive docs and FAQ for further information.
