Document Conversion on Heroku

If you're converting documents to PDF on Heroku, you're going to run into a few gotchas. LibreOffice and ImageMagick both need some coaxing to work in Heroku's environment. Here's what I've learned deploying apps that do document conversion.

LibreOffice

Heroku doesn't include LibreOffice by default, so you need a buildpack. I tried a few existing buildpacks, but none of them worked on the Heroku-24 stack, so we created our own LibreOffice buildpack, which installs exactly what you need.

Add the buildpack at index 1 so that it runs before the Ruby buildpack:

heroku buildpacks:add --index 1 https://github.com/velocity-labs/heroku-buildpack-libreoffice -a your-app-name

Deploy your code to rebuild everything, and then verify it's working:

heroku run "soffice --version" -a your-app-name
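
Once that command prints a version, the conversion itself is a headless soffice call. Here's a minimal Ruby sketch of it, assuming soffice is on the PATH the way the check above expects. The helper names (soffice_command, expected_pdf_path, convert_to_pdf) are mine for illustration and aren't part of the buildpack or any gem.

```ruby
require "open3"

# Builds the argument list for a headless LibreOffice conversion to PDF.
def soffice_command(input_path, out_dir)
  ["soffice", "--headless", "--convert-to", "pdf", "--outdir", out_dir, input_path]
end

# LibreOffice names the output after the input: same base name, .pdf extension.
def expected_pdf_path(input_path, out_dir)
  File.join(out_dir, File.basename(input_path, ".*") + ".pdf")
end

# Runs the conversion and returns the PDF path, raising if no PDF appeared.
def convert_to_pdf(input_path, out_dir)
  _stdout, stderr, status = Open3.capture3(*soffice_command(input_path, out_dir))
  pdf_path = expected_pdf_path(input_path, out_dir)
  unless status.success? && File.exist?(pdf_path)
    raise "soffice conversion failed: #{stderr.strip}"
  end
  pdf_path
end
```

The sketch checks for the output file as well as the exit status, so a run that exits cleanly without writing a PDF still counts as a failure.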

ImageMagick

Heroku includes ImageMagick by default, but some stacks ship with a restrictive policy.xml that blocks PDF operations. This is a security measure (ImageMagick has had PDF-related CVEs). Heroku-24 seems to be fine out of the box, but older stacks may block image-to-PDF conversion.

To check whether your stack restricts PDF operations, run:

heroku run "identify -list policy" -a your-app-name

Look for a line like rights: None ... pattern: PDF. If you see that, PDF operations are blocked and you'll get errors like:

Magick::ImageMagickError: not authorized `output.pdf'
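
If you'd rather catch this at boot than in an error tracker, you can scan the policy listing from Ruby. This is a rough sketch that assumes the listing is laid out as "Policy: Coder" followed by indented "rights:" and "pattern:" lines. That layout is an assumption and may differ between ImageMagick versions, and pdf_blocked? is a hypothetical helper name.

```ruby
# Returns true when the policy listing contains a Coder policy with
# rights "None" whose pattern mentions PDF.
# Assumes the "Policy: / rights: / pattern:" layout described above.
def pdf_blocked?(policy_listing)
  # Split into one chunk per "Policy:" entry; the first chunk is the header.
  policy_listing.split(/^\s*Policy:/i).drop(1).any? do |entry|
    next false unless entry.match?(/\A\s*coder\b/i)

    rights  = entry[/^\s*rights:\s*(.+)$/i, 1].to_s.strip
    pattern = entry[/^\s*pattern:\s*(.+)$/i, 1].to_s.strip
    rights.casecmp?("none") && pattern.upcase.include?("PDF")
  end
end
```

You could feed it the output of `identify -list policy` captured with backticks or Open3, and log a warning when it returns true.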

Fix: Override policy.xml

Create a .magick/policy.xml in your project root that overrides the restrictive defaults:

<policymap>
  <policy domain="coder" rights="read|write" pattern="PDF" />
</policymap>

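Then set the MAGICK_CONFIGURE_PATH environment variable so ImageMagick picks up your policy file. A relative path such as ./.magick is resolved against the process's working directory, which is the app root (/app) on Heroku dynos; if your code changes directory before calling ImageMagick, use the absolute path /app/.magick instead: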

heroku config:set MAGICK_CONFIGURE_PATH=./.magick -a your-app-name
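
A quick sanity check helps here, because a wrong path fails silently: ImageMagick just falls back to the system policy. The sketch below resolves the variable against a working directory and confirms that policy.xml is really there. It assumes the variable holds a single directory, and resolved_policy_file is a hypothetical helper name.

```ruby
# Resolves MAGICK_CONFIGURE_PATH against a working directory and returns the
# absolute path of the policy.xml inside it, or nil if the variable is unset
# or the file is missing. Assumes the variable names a single directory.
def resolved_policy_file(env: ENV, cwd: Dir.pwd)
  configured = env["MAGICK_CONFIGURE_PATH"]
  return nil if configured.nil? || configured.empty?

  candidate = File.join(File.expand_path(configured, cwd), "policy.xml")
  File.file?(candidate) ? candidate : nil
end
```

Calling it from an initializer and logging a warning on nil is usually enough to catch a typo in the config var or a policy file that never made it into the slug.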

Ghostscript

ImageMagick delegates PDF operations to Ghostscript. If Ghostscript isn't available, you'll get an error. Check if it's installed:

heroku run "gs --version" -a your-app-name

If not, you can install it with the apt buildpack (heroku-community/apt): add an Aptfile to your project root containing the single line ghostscript.
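
The same check can run inside the app, so a missing binary shows up at boot rather than in the middle of a conversion. This is a small sketch that looks tools up on the PATH, much like the shell's which. The helper names and the default tool list are my own choices for illustration.

```ruby
# Returns the full path of the first executable called `name` on the given
# PATH string, or nil when none is found.
def find_executable(name, path: ENV.fetch("PATH", ""))
  path.split(File::PATH_SEPARATOR).each do |dir|
    next if dir.empty?

    candidate = File.join(dir, name)
    return candidate if File.file?(candidate) && File.executable?(candidate)
  end
  nil
end

# Lists the required tools that cannot be found on PATH.
def missing_tools(tools = %w[soffice gs], path: ENV.fetch("PATH", ""))
  tools.reject { |tool| find_executable(tool, path: path) }
end
```

An empty array from missing_tools means both binaries are reachable from the dyno's PATH.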

Memory considerations

LibreOffice is not lightweight. The installed package takes up a few hundred megabytes of disk, which eats into Heroku's slug size limit. But the real constraint is runtime memory. A single conversion can use 50-150MB of RAM depending on the document. On a Standard-1X dyno (512MB) with a Rails app already using 200-300MB, that leaves very little headroom: one large document, or two conversions running at once, is enough to push you into R14 (memory quota exceeded) errors.
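
To put numbers on that, here's the back-of-the-envelope arithmetic as a Ruby sketch. It uses the rough figures from this post (they're estimates, not measurements of your app), and conversions_that_fit is a hypothetical helper name.

```ruby
# How many simultaneous conversions fit in the memory left over after the
# app's own baseline usage. All values are in megabytes.
def conversions_that_fit(dyno_mb:, app_mb:, per_conversion_mb:)
  headroom = dyno_mb - app_mb
  return 0 if headroom <= 0

  headroom / per_conversion_mb # integer division rounds down
end

conversions_that_fit(dyno_mb: 512, app_mb: 300, per_conversion_mb: 150)  # => 1
conversions_that_fit(dyno_mb: 512, app_mb: 200, per_conversion_mb: 150)  # => 2
conversions_that_fit(dyno_mb: 2560, app_mb: 300, per_conversion_mb: 150) # => 15
```

With a 300MB app on a 512MB dyno, a second concurrent worst-case conversion already goes over quota, while a 2.5GB dyno has room to spare.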

A few strategies:

Use a worker dyno. Offload conversions to a background job (Sidekiq, GoodJob, etc.) running on a separate dyno. This keeps your web dynos responsive.
Use a Performance dyno. Performance-M gives you 2.5GB, which is comfortable for most conversion workloads.
Process one at a time. If you're on a smaller dyno, use a queue with concurrency of 1 for conversion jobs to avoid multiple LibreOffice processes competing for memory.
Cache the result. Document conversion is expensive. If there's any chance the same file will be converted more than once, cache the result. Store the converted PDF in S3 (or wherever your app keeps files) keyed by a digest of the input. Before converting, check if a cached version already exists.
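
The caching strategy can be sketched in a few lines of Ruby. This assumes a Hash-like store standing in for S3, so the ConversionCache class and the conversions/ key prefix are illustrative. A real app would swap in its own storage client.

```ruby
require "digest/sha2"

# Caches converted PDFs keyed by a SHA-256 digest of the input bytes.
# `store` is any Hash-like object; here a plain Hash stands in for S3.
class ConversionCache
  def initialize(store = {})
    @store = store
  end

  # Identical input bytes always map to the same key.
  def key_for(input_bytes)
    "conversions/#{Digest::SHA256.hexdigest(input_bytes)}.pdf"
  end

  # Returns the cached PDF, or runs the block to convert and stores the result.
  def fetch(input_bytes)
    key = key_for(input_bytes)
    return @store[key] if @store.key?(key)

    @store[key] = yield(input_bytes)
  end
end
```

Keying on the content digest rather than the filename means a re-uploaded copy of the same file hits the cache, and an edited file with the same name does not.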

Timeouts

Heroku enforces a 30-second request timeout for web dynos. Document conversion can take anywhere from 2 to 20 seconds depending on the file. For anything but the simplest conversions, don't convert inside a web request: run the conversion in a background job, and have the client either poll for the result or get a WebSocket notification when it's done.
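
Here's the enqueue-then-poll shape as a self-contained Ruby sketch. It uses an in-process thread and an in-memory Hash purely as stand-ins. On Heroku the work would run in Sidekiq or GoodJob on a worker dyno, with job state in Redis or the database. ConversionTracker and its methods are hypothetical names.

```ruby
require "securerandom"

# Accepts conversion requests, processes them one at a time on a single
# worker thread, and exposes each job's status for polling.
class ConversionTracker
  def initialize(&converter)
    @converter = converter
    @jobs = {}
    @lock = Mutex.new
    @queue = Queue.new
    @worker = Thread.new { work_loop }
  end

  # Returns immediately with a job id the client can poll with.
  def enqueue(input_path)
    id = SecureRandom.uuid
    @lock.synchronize { @jobs[id] = { status: :queued } }
    @queue << [id, input_path]
    id
  end

  # Raises KeyError for an unknown id.
  def status(id)
    @lock.synchronize { @jobs.fetch(id).dup }
  end

  private

  def work_loop
    loop do
      id, input_path = @queue.pop
      update(id, status: :working)
      begin
        update(id, status: :done, output: @converter.call(input_path))
      rescue StandardError => e
        update(id, status: :failed, error: e.message)
      end
    end
  end

  def update(id, attrs)
    @lock.synchronize { @jobs[id] = attrs }
  end
end
```

The web request only calls enqueue and returns the id, and a separate status endpoint reads status(id) until it reports :done or :failed. The single worker thread also processes jobs one at a time, which keeps only one conversion in memory at once.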

Putting it together

A typical setup for a Rails app doing document conversion on Heroku:

LibreOffice buildpack (and Ghostscript if needed)
Check ImageMagick policy and fix if PDF operations are blocked
Background job for conversions (not inline in web requests)
Performance-M or larger dyno for the worker

It takes a bit of setup, but once it's working, it's reliable. The main thing is knowing about the gotchas upfront rather than discovering them in production.

And, if you're looking for a clean way to handle the conversion itself, check out DocPDF. It's a Ruby gem that wraps LibreOffice, ImageMagick, and PDF generation behind a simple API with zero hard dependencies.