Merge PDFs within Boomi

Apache PDFBox is an open-source Java library for working with PDFs. Boomi can load the library as a custom scripting library and use it to merge several PDFs into one. This article covers how to set that up and includes the merge script.

First, download the library so that it can be uploaded to your Boomi account. The jar file is available from the Maven Repository: https://mvnrepository.com/artifact/org.apache.pdfbox/pdfbox/2.0.29. Click bundle next to Files to download the jar file.

Figure 1. Download PDFBox from Maven.

Next, upload the jar file to your Boomi account: go to Settings -> Account Information and Setup -> Account Libraries -> Upload a File. Once uploaded, the jar file must be deployed as a custom library. On the Build tab, create a custom library, set the Custom Library Type to Scripting, and select the pdfbox-2.0.29.jar file. Click Save, then deploy the custom library to your desired environment. Additional information on creating a custom library can be found in Boomi’s documentation.

Figure 2. Create a Boomi Custom Library for PDFBox.

Create a process that consumes PDFs. Within the process, add a Data Process shape with a Custom Scripting step and paste in the code below. The script combines one or more PDFs that reach the Data Process shape together, so the output of the shape is a single PDF. The document properties of the first document are retained on the combined output.

Figure 3. Data Process Shape to Merge PDFs.

// Groovy 2.4

/*
The script will combine 1 or more PDFs into a single PDF. All documents going into the script will
get combined. The properties will be retained from the first document.

Doc:
https://pdfbox.apache.org/

Required jars:
pdfbox-2.0.29.jar
https://mvnrepository.com/artifact/org.apache.pdfbox/pdfbox/2.0.29
 */

import com.boomi.execution.ExecutionUtil;
import java.util.Properties;
import java.io.InputStream;
import org.apache.pdfbox.multipdf.PDFMergerUtility


// Collect the merged PDF in memory
ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
PDFMergerUtility merger = new PDFMergerUtility();
merger.setDestinationStream(outputStream);

// Add every incoming document as a merge source, in arrival order
for (int i = 0; i < dataContext.getDataCount(); i++) {
    InputStream is = dataContext.getStream(i);
    merger.addSource(is);
}

// Merge the PDFs and output a single PDF (a null MemoryUsageSetting buffers in main memory)
merger.mergeDocuments(null);
// Use the properties from the first document
dataContext.storeStream(new ByteArrayInputStream(outputStream.toByteArray()), dataContext.getProperties(0));
outputStream.close();
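
The script above assumes that every incoming document is a PDF; a non-PDF document would most likely make the merge fail. As a minimal sketch of one way to guard against that, the helper below checks whether a stream starts with the `%PDF-` marker that opens a PDF file. The class `PdfHeaderCheck` and the method `looksLikePdf` are hypothetical names chosen for this illustration and are not part of Boomi or PDFBox. The code is plain Java written to stay compatible with Groovy syntax, and it has not been run inside a Boomi process.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of Boomi or PDFBox.
final class PdfHeaderCheck {

    // A PDF file begins with the ASCII marker "%PDF-" followed by the version.
    private static final byte[] MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    // Returns true when the stream starts with "%PDF-".
    // The stream must support mark/reset; it is rewound before returning,
    // so the caller can still read the whole document afterwards.
    static boolean looksLikePdf(InputStream stream) throws IOException {
        if (!stream.markSupported()) {
            throw new IllegalArgumentException("stream must support mark/reset");
        }
        stream.mark(MAGIC.length);
        try {
            for (int i = 0; i < MAGIC.length; i++) {
                if (stream.read() != MAGIC[i]) {
                    return false;
                }
            }
            return true;
        } finally {
            stream.reset();
        }
    }

    public static void main(String[] args) throws IOException {
        InputStream pdf = new BufferedInputStream(new ByteArrayInputStream(
                "%PDF-1.7\n".getBytes(StandardCharsets.US_ASCII)));
        InputStream text = new BufferedInputStream(new ByteArrayInputStream(
                "hello".getBytes(StandardCharsets.US_ASCII)));
        System.out.println(looksLikePdf(pdf));
        System.out.println(looksLikePdf(text));
    }
}
```

In the merge script, one possible use is to wrap each `dataContext.getStream(i)` in a `BufferedInputStream`, call the check on it, and pass that same wrapped stream to `merger.addSource` only when the check succeeds. Whether rejected documents should be skipped or should fail the process is a design choice the original script does not address.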

The complete process, with the Data Process shape in place, is shown in Figure 4.

Figure 4. Example of the process with a data process shape.

The article was originally posted at Boomi Community.

Published Oct 9, 2023
