In a previous article, Generating PDF files using iText JAR, Kiran Hegde described a simple way of generating PDF files in Java using the iText JAR. It is a great starter tutorial for those who want to start working with iText.
For one requirement, I had to merge two or more PDF files into a single PDF file. I thought of implementing the functionality from scratch in iText, but then decided to google it and see whether someone had already written the code I was looking for. As expected, I found a nice Java implementation that merges two or more PDF files using the iText JAR. In this post I dissect that code and give credit to its original author.
Merge PDF files in Java using iText JAR
So here we go. First let us see the code.
package net.viralpatel.itext.pdf;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

import com.lowagie.text.Document;
import com.lowagie.text.pdf.BaseFont;
import com.lowagie.text.pdf.PdfContentByte;
import com.lowagie.text.pdf.PdfImportedPage;
import com.lowagie.text.pdf.PdfReader;
import com.lowagie.text.pdf.PdfWriter;

public class MergePDF {

    public static void main(String[] args) {
        try {
            List<InputStream> pdfs = new ArrayList<InputStream>();
            pdfs.add(new FileInputStream("c:\\1.pdf"));
            pdfs.add(new FileInputStream("c:\\2.pdf"));
            OutputStream output = new FileOutputStream("c:\\merge.pdf");
            MergePDF.concatPDFs(pdfs, output, true);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    public static void concatPDFs(List<InputStream> streamOfPDFFiles,
            OutputStream outputStream, boolean paginate) {
        Document document = new Document();
        try {
            List<InputStream> pdfs = streamOfPDFFiles;
            List<PdfReader> readers = new ArrayList<PdfReader>();
            int totalPages = 0;
            Iterator<InputStream> iteratorPDFs = pdfs.iterator();

            // Create Readers for the pdfs.
            while (iteratorPDFs.hasNext()) {
                InputStream pdf = iteratorPDFs.next();
                PdfReader pdfReader = new PdfReader(pdf);
                readers.add(pdfReader);
                totalPages += pdfReader.getNumberOfPages();
            }

            // Create a writer for the outputstream
            PdfWriter writer = PdfWriter.getInstance(document, outputStream);
            document.open();
            BaseFont bf = BaseFont.createFont(BaseFont.HELVETICA,
                    BaseFont.CP1252, BaseFont.NOT_EMBEDDED);
            PdfContentByte cb = writer.getDirectContent(); // Holds the PDF data

            PdfImportedPage page;
            int currentPageNumber = 0;
            int pageOfCurrentReaderPDF = 0;
            Iterator<PdfReader> iteratorPDFReader = readers.iterator();

            // Loop through the PDF files and add to the output.
            while (iteratorPDFReader.hasNext()) {
                PdfReader pdfReader = iteratorPDFReader.next();

                // Create a new page in the target for each source page.
                while (pageOfCurrentReaderPDF < pdfReader.getNumberOfPages()) {
                    document.newPage();
                    pageOfCurrentReaderPDF++;
                    currentPageNumber++;
                    page = writer.getImportedPage(pdfReader,
                            pageOfCurrentReaderPDF);
                    cb.addTemplate(page, 0, 0);

                    // Code for pagination.
                    if (paginate) {
                        cb.beginText();
                        cb.setFontAndSize(bf, 9);
                        cb.showTextAligned(PdfContentByte.ALIGN_CENTER, ""
                                + currentPageNumber + " of " + totalPages, 520,
                                5, 0);
                        cb.endText();
                    }
                }
                pageOfCurrentReaderPDF = 0;
            }
            outputStream.flush();
            document.close();
            outputStream.close();
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            if (document.isOpen())
                document.close();
            try {
                if (outputStream != null)
                    outputStream.close();
            } catch (IOException ioe) {
                ioe.printStackTrace();
            }
        }
    }
}
What the code does is pretty simple:
- In the main() method, we create a List of InputStream objects that point to the input PDF files we need to merge.
- We call the static MergePDF.concatPDFs() method, passing the list of input PDFs, the OutputStream for the merged output PDF, and a boolean flag that says whether to print page numbers at the bottom of each page.
- In concatPDFs(), the first while loop converts the List of InputStream objects into a List of PdfReader objects and keeps a count of the total pages across all the input PDF files.
- Next we create the output objects for the merged PDF file using a Document object and PdfWriter.getInstance().
- Next we create a BaseFont object using BaseFont.createFont(). This is the font used for writing page numbers.
- Finally, two nested while loops write the input PDFs into the merged output: the outer loop iterates over the PDFs and the inner loop copies each page.
- And then we close all the streams and flush all the buffers. Good boys do this ;-)
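The page-counting part of concatPDFs() is easy to check on its own. Here is a minimal sketch of just that bookkeeping, using plain integers instead of PdfReader objects. The PageLabels class and its footerLabels() method are hypothetical names made up for this illustration; they are not part of iText.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: models only the counting done in concatPDFs(), no iText involved.
class PageLabels {

    // pageCounts holds the number of pages of each input PDF, in merge order.
    static List<String> footerLabels(List<Integer> pageCounts) {
        // First pass: total pages over all inputs (the first while loop in concatPDFs).
        int totalPages = 0;
        for (int count : pageCounts) {
            totalPages += count;
        }

        // Second pass: the per-document counter restarts for every input,
        // the global counter keeps running across inputs.
        List<String> labels = new ArrayList<String>();
        int currentPageNumber = 0;
        for (int count : pageCounts) {
            for (int pageOfCurrentReaderPDF = 1; pageOfCurrentReaderPDF <= count; pageOfCurrentReaderPDF++) {
                currentPageNumber++;
                labels.add(currentPageNumber + " of " + totalPages);
            }
        }
        return labels;
    }

    public static void main(String[] args) {
        // Two inputs with 3 and 2 pages.
        for (String label : footerLabels(Arrays.asList(3, 2))) {
            System.out.println(label);
        }
    }
}
```

The total has to be known before the first page is written, which is why the readers are created in a separate loop up front.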
Split PDF files in Java using iText JAR
Let us see the code.
/**
 * @author viralpatel.net
 *
 * @param inputStream Input PDF file
 * @param outputStream Output PDF file
 * @param fromPage start page from input PDF file
 * @param toPage end page from input PDF file
 */
public static void splitPDF(InputStream inputStream,
        OutputStream outputStream, int fromPage, int toPage) {
    Document document = new Document();
    try {
        PdfReader inputPDF = new PdfReader(inputStream);
        int totalPages = inputPDF.getNumberOfPages();

        // Make fromPage equal to toPage if it is greater
        if (fromPage > toPage) {
            fromPage = toPage;
        }
        if (toPage > totalPages) {
            toPage = totalPages;
        }

        // Create a writer for the outputstream
        PdfWriter writer = PdfWriter.getInstance(document, outputStream);
        document.open();
        PdfContentByte cb = writer.getDirectContent(); // Holds the PDF data
        PdfImportedPage page;

        while (fromPage <= toPage) {
            document.newPage();
            page = writer.getImportedPage(inputPDF, fromPage);
            cb.addTemplate(page, 0, 0);
            fromPage++;
        }
        outputStream.flush();
        document.close();
        outputStream.close();
    } catch (Exception e) {
        e.printStackTrace();
    } finally {
        if (document.isOpen())
            document.close();
        try {
            if (outputStream != null)
                outputStream.close();
        } catch (IOException ioe) {
            ioe.printStackTrace();
        }
    }
}
In the above code, we have created a method splitPDF() that can be used to extract pages from a PDF and write them into another PDF. The code is pretty much self-explanatory and is similar to the code that merges PDF files.
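One thing splitPDF() does not guard against is a fromPage below 1 or beyond the last page. In the second case the while loop copies nothing, and as far as I know iText then refuses to close a document that has no pages. A minimal sketch of a stricter range check follows. PageRange and normalize() are hypothetical names for this illustration, and unlike splitPDF() the sketch rejects a reversed range instead of collapsing it to a single page.

```java
// Hypothetical helper: validates a 1-based, inclusive page range before any PDF work starts.
class PageRange {

    // Returns {from, to} clamped to [1, totalPages], or null when no page is left to copy.
    static int[] normalize(int fromPage, int toPage, int totalPages) {
        if (totalPages < 1) {
            return null;                      // nothing to split
        }
        int from = Math.max(1, fromPage);     // PDF page numbers start at 1
        int to = Math.min(totalPages, toPage);
        if (from > to) {
            return null;                      // reversed, or entirely past the last page
        }
        return new int[] { from, to };
    }

    public static void main(String[] args) {
        int[] range = normalize(13, 50, 20);
        System.out.println(range == null ? "empty" : range[0] + "-" + range[1]);
    }
}
```

A caller could run this check first and skip the split entirely when it returns null.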
Thus, if you need to split an input.pdf (having 20 pages) into output1.pdf (pages 1-12 of input.pdf) and output2.pdf (pages 13-20 of input.pdf), you can call the above method as follows:
public static void main(String[] args) {
    try {
        MergePDF.splitPDF(new FileInputStream("C:\\input.pdf"),
                new FileOutputStream("C:\\output1.pdf"), 1, 12);
        MergePDF.splitPDF(new FileInputStream("C:\\input.pdf"),
                new FileOutputStream("C:\\output2.pdf"), 13, 20);
    } catch (Exception e) {
        e.printStackTrace();
    }
}
Feel free to bookmark the code and share it if you feel it will be useful to you :)
Hi ,
I want to split the PDF if it exceeds a given size.
For example,
if my PDF is 12MB, I want to split it into 5MB parts.
So I need to split the PDF into 3 parts of 5MB, 5MB and 2MB respectively.
Please let me know whether this is possible.
Regards,
Prem
@Prem, This example only splits the PDF on basis of page numbers. I will definitely try to write code for splitting pdf on basis of size and update you.
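One possible way to approach a size-based split is to first decide which page ranges go into which part and then call splitPDF() once per range. The sketch below covers only that planning step. It assumes you already have an estimated byte size for every page, which iText does not hand you directly, and the real output files will not add up exactly because pages can share fonts and images. SizeSplitPlanner and plan() are hypothetical names for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical planner: groups consecutive pages so each group stays under a byte limit.
class SizeSplitPlanner {

    // pageSizes[i] is an ESTIMATED size in bytes of page i + 1 (an assumption of this sketch).
    // Returns inclusive, 1-based {fromPage, toPage} ranges.
    static List<int[]> plan(long[] pageSizes, long maxBytes) {
        List<int[]> ranges = new ArrayList<int[]>();
        int start = 1;
        long running = 0;
        for (int i = 0; i < pageSizes.length; i++) {
            int page = i + 1;
            // Close the current part when the next page would push it over the limit.
            // A single page larger than the limit still gets a part of its own.
            if (running > 0 && running + pageSizes[i] > maxBytes) {
                ranges.add(new int[] { start, page - 1 });
                start = page;
                running = 0;
            }
            running += pageSizes[i];
        }
        if (pageSizes.length > 0) {
            ranges.add(new int[] { start, pageSizes.length });
        }
        return ranges;
    }

    public static void main(String[] args) {
        long[] sizes = new long[12];
        java.util.Arrays.fill(sizes, 1000000L);   // twelve pages of roughly 1MB each
        for (int[] r : plan(sizes, 5000000L)) {
            System.out.println(r[0] + "-" + r[1]);
        }
    }
}
```

Each returned range can then be passed to splitPDF() as fromPage and toPage.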
@Viral Patel, thanks for the great, comprehensive tutorial on merging two PDFs. It works great, but the second PDF is always concatenated at the end of the merged PDF. How do I insert a PDF in the middle of the merged PDF, say at a given page number? Please give me directions. Thanks
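Inserting one PDF into the middle of another comes down to the order in which pages are copied. Here is a minimal sketch of that ordering with no iText calls, assuming two documents called A and B. InsertPlanner and order() are hypothetical names for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical planner: decides the page order when document B is inserted into document A.
class InsertPlanner {

    // Each entry is "A:n" or "B:n", meaning page n of the first or the second document.
    // All pages of B are placed right after page insertAfterPage of A.
    static List<String> order(int pagesInA, int pagesInB, int insertAfterPage) {
        // Clamp so that 0 means "before everything" and anything too large means "at the end".
        int split = Math.max(0, Math.min(pagesInA, insertAfterPage));
        List<String> result = new ArrayList<String>();
        for (int p = 1; p <= split; p++) {
            result.add("A:" + p);
        }
        for (int p = 1; p <= pagesInB; p++) {
            result.add("B:" + p);
        }
        for (int p = split + 1; p <= pagesInA; p++) {
            result.add("A:" + p);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(order(4, 2, 2));
    }
}
```

The copy loop from concatPDFs() could then walk this list and import each page from the matching reader instead of going through the readers one after another.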
I would like to paste my code here to split a PDF based on bookmarks using iText. All my bookmarks are on level1. I based my code sample on the example on this page, and that’s why I would like to place my code here. I couldn’t really find any example of splitting a PDF based on bookmarks with iText, so I thought it could help other people if I put my code on this page:
Thanks Goblin for the code. I appreciate your effort of sharing this here.
Hi Goblin_Queen
NOT getting “page” key in the bookmark hashmap.
I am getting the page key only for first and last bookmark & not able to see the page key in the map for other than first and last bookmark. Thats why i cant navigate the pages of bookmarks.
The above two lines gives the following result…
Please suggest some solution
Jitendra
hi,
I have to split pdf based on chapters i am not able to get some of the childs of please help me
i had downloaded iText jar 5.0.1 but it do not contain com.lowagie.text.table package what should i do?? i want to create table dynamically in pdf page..
Viral,
I was searching for a way to split a PDF on the basis of size (exceeding a given size, say > 10 MB). Prem was also looking for the same thing last year. Just checking whether you got an opportunity to write about this.
Thanks for your help
Puru
Viral
Thanks for posting the code here.
i was searching for the code to read a image and parse it can you post the code
Viral : If you already found the script for splitting pdfs by page size, it would be VERY handy
Thx in advance
Hi Viral,
By any chance is it possible to split the pages randomly? E.g. I have a PDF document of 20 pages and now I want a new PDF containing only the 2nd, 5th, 11th and 16th pages of the original document.
Thanks in advance.
Vishnu.
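Picking arbitrary pages needs only a list of page numbers in place of the fromPage/toPage pair. Here is a small sketch that turns a selection such as "2,5,11,16" into that list. PageSelection and parse() are hypothetical names for this illustration, and the "a-b" range syntax is simply a convention chosen for the sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical parser: turns "2,5,11,16" or "1-3,19-25" into a list of 1-based page numbers.
class PageSelection {

    // Pages outside [1, totalPages] are silently dropped; order follows the input string.
    static List<Integer> parse(String spec, int totalPages) {
        List<Integer> pages = new ArrayList<Integer>();
        for (String part : spec.split(",")) {
            part = part.trim();
            if (part.isEmpty()) {
                continue;
            }
            int dash = part.indexOf('-');
            int from;
            int to;
            if (dash < 0) {
                from = Integer.parseInt(part);   // single page
                to = from;
            } else {
                from = Integer.parseInt(part.substring(0, dash).trim());
                to = Integer.parseInt(part.substring(dash + 1).trim());
            }
            for (int p = from; p <= to; p++) {
                if (p >= 1 && p <= totalPages) {
                    pages.add(p);
                }
            }
        }
        return pages;
    }

    public static void main(String[] args) {
        System.out.println(parse("2, 5, 11, 16", 20));
    }
}
```

The while loop in splitPDF() could then iterate over this list and import each listed page in turn.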
Thank for sharing
hi viral…can you provide sample code to generate thumbnails to a pdf page
nice coding…:)its wrks for me always in 1 shoot viral
@Swati – Thanks :)
Hi Viral Patel,
Did you manage to develop sample code to split PDFs based on size? i.e. Splitting a 8MB file into 2MB PDFs.
Regards,
Edwin
Hi Viral , I got the following errors:
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at java.io.ByteArrayOutputStream.write(Unknown Source)
at com.itextpdf.text.pdf.RandomAccessFileOrArray.InputStreamToArray(RandomAccessFileOrArray.java:182)
at com.itextpdf.text.pdf.RandomAccessFileOrArray.<init>(RandomAccessFileOrArray.java:172)
at com.itextpdf.text.pdf.PdfReader.<init>(PdfReader.java:237)
at com.itextpdf.text.pdf.PdfReader.<init>(PdfReader.java:248)
at MergePDF.splitPDF(split.java:38)
at ShivaPdf.main(split.java:87)
How to resolve them.
Have you resolved java.lang.OutOfMemoryError: Java heap space???
Dimitri asked:
“Viral : If you already found the script for splitting pdfs by page size, it would be VERY handy”
I found the answer here:
http://java-x.blogspot.com/2006/11/merge-pdf-files-with-itext.html#comment-3167256311373072651
document.setPageSize(pdfReader.getPageSizeWithRotation(pageOfCurrentReaderPDF));
document.newPage();
I.e., immediately before creating a new page, set the page size to the page size (and orientation) of the page you are reading in. The only issue to be aware of is the PDF has to be (I believe) version 1.5 or higher. You’ll get errors trying to open PDFs saved as 1.4.
pdfWriter.setPdfVersion(PdfWriter.PDF_VERSION_1_5);
Hey
I have used your code for merging few pdf files but I still have a problem to merge files that carry different sizes!
I have tried to add a command to set the page size:
document.newPage();
pageOfCurrentReaderPDF++;
currentPageNumber++;
document.setPageSize(pdfReader.getPageSize(pageOfCurrentReaderPDF));
Thanks, for the helpful code!
Natalie
Very good code for merging & splitting….
good work itext…………exactly suits my requirement.
hi,thanks for your coding ,but I still have a problem to merge files that carry different sizes!
how could i solve it? some page is large while other is small.I am tired with it.help!
Does anybody have any examples for splitting a PDF at bookmarks that exist at a level 2? The code provided by Goblin_Queen works great for level 1 bookmarks, but I’m not quite sure how to expand that to level 2 bookmarks as the resulting hashmap for the “Kids” do not contain page information.
HashMap bm = bookmarks.get(i);
List kids = (List) bm.get("Kids");
The kids hashmap does not have the “Page” defined.
You could reduce the number of lines by using PdfStamper class.
PdfReader pdfReader = new PdfReader(fileDialog.FileName);
FileOutputStream fout = new FileOutputStream("C:\\output1.pdf");
PdfStamper splitter = new PdfStamper(pdfReader, fout);
pdfReader.selectPages("1-10");
splitter.close();
pdfReader.close();
very useful code for all guys who r in initial stage thank u very much
I tried the concatPDFs code. It concatenates two files OK. But, it does not copy the signatures that were in the files. It also does not copy text data fields.
thank you for the great tutorial and comments. They help me a lot, to write my own split function
I used the PDF merging code, but I have a problem when I merge a PDF file with A3 paper size: the code converts those pages to A4 paper size and cuts off part of the page content.
This works, but not for a lot of documents. I still used it as a good starting point, though. Since it opens the PDF FileInputStreams in the main method, they have to stay open the entire time and never get closed. A better approach would be to create a List<String> to hold the PDFs' filenames (instead of a List<InputStream>), then pass this to the concatPDFs method. Method concatPDFs would then need to iterate through the list (which it already does) and, for each entry, open the InputStream, read the PDF, add it to readers, then close the InputStream. That is a much better use of resources when a large number of files is involved.
I tried it the Viral’s way and it works for up to a few hundred PDF’s, but anything more and I got a “FileNotFoundException (Too many open files)”.
@CMil, That would be a better solution. Thanks for pointing that out. I will update the post with your suggestion. :)
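A rough sketch of CMil's idea, using only java.io: the caller passes file names, and each stream is opened, handed to a callback and closed again before the next file is touched. OneAtATime, StreamHandler and totalBytes() are hypothetical names for this illustration. In the merge code, the callback would be the place where the PdfReader for that file gets created.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: keeps at most one input file open at any time.
class OneAtATime {

    // Callback that receives one open stream; the stream is closed right after it returns.
    interface StreamHandler {
        void handle(String fileName, InputStream in) throws IOException;
    }

    static void forEach(List<String> fileNames, StreamHandler handler) throws IOException {
        for (String fileName : fileNames) {
            // try-with-resources closes the stream even if the handler throws
            try (InputStream in = new FileInputStream(fileName)) {
                handler.handle(fileName, in);
            }
        }
    }

    // Demo handler: adds up the sizes of all files by reading them one after another.
    static long totalBytes(List<String> fileNames) throws IOException {
        final long[] total = { 0 };
        forEach(fileNames, new StreamHandler() {
            public void handle(String fileName, InputStream in) throws IOException {
                byte[] buffer = new byte[8192];
                int n;
                while ((n = in.read(buffer)) != -1) {
                    total[0] += n;
                }
            }
        });
        return total[0];
    }

    public static void main(String[] args) throws IOException {
        System.out.println(totalBytes(Arrays.asList(args)) + " bytes read");
    }
}
```

This keeps the number of open file handles at one no matter how many PDFs are merged, which is what the "Too many open files" error calls for.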
The above method does split the pdf however it does not preserve the orientation of the pages and will try and print them all as either landscape or portrait. This is fine unless you have a scanned document that has landscape pages turned on end in which case it tries to write out landscape in portrait format and clips the sides. The below is a method using pdfCopy that accomplishes the same thing but preserves orientation.
Thank you very much!
This is amazing blog… you guys are really putting nice solutions in the forum which I am sure helpful for thousands of people.
Hi, I'm using iText to merge PDFs, but I'm new to this and I'm facing the following issue:
I want to merge each page of a pdf file lets call it A with a header pdf file lets call it B.
The issue is that when I merge pages that are portrait it works fine but now i have a page on A that is in landscape and the header page on B that is on portrait so my code is not working .
this is my code … Thanks in advance for your help
Hi Guys,
I need to develop a program which should compare a set of pdf files stored in one folder with a set of pdf files stored in another folder. Both folders should contain the same no. of pdf files with same name in both folders. I am able to compare the pdf files text by storing it in a String array, bt i am not able to compare the changed FONTS , and changed images. Also, i need to write the generate a new pdf file which should highlight the difference of the pdf file by changing its background color, and when we place the mouse cursor over it, it should pop-up the contents of the next pdf file which is not matching. I am able to generate the pdf file, bt not able to highlight the difference. There can be n number of pdf files in the folders. we need to do 1 by 1 comparing of pdf files.
GUYS, please suggest me some appropriate ways, how i need to implement. I need to do it in my project.
The basic requirement is to match a pdf file with another pdf file, and it should generate a pdf file with the same contents as of first pdf file ,bt it should highlight the difference i.e. text,font,image,etc difference with a pop-up(if possible) in order to show the content of next pdf file
I would appreciate any help
Thanks in ADVANCE!!!
how to extract references from a research paper such as ieee,
and while reading ieee pdf research paper using itext pdf ,
i am able to get 1st line of 1st col then 1st row of second column,
(merging of two colums in a page),
and plz help me to guide on how to extract references from a pdf file
Viral,
I am using your code posted here to merge PDFs. It works good. Thanks for posting the code. In my PDF I have html hyper links. Those hyper links don’t work on the merged pdf, though it works on the individual PDFs. any thoughts on this?
Thanks
Vel
how to add header in iText ?
i tried with
HeaderFooter header = new HeaderFooter(new Phrase(chunk),true);
header.setAlignment(Element.ALIGN_CENTER);
header.setBorder(Rectangle.NO_BORDER);
document.setHeader(header);
but i t gives me error like Header Footer (Phrase,boolean) is undefined.
and setHeader properties is not there..
comment on it highly appreciate.
How to split a single PDF into multiple individual PDFs?
Neeed all of your help.
I need to compare two PDFs (say pdf1 and pdf2: exactly the same format, same form, just different variable data field values like interest rate) and highlight the difference in pdf2, let's say with a yellow background color on just the interest rate. Can I do it with this? Can anyone suggest anything?
I cannot thank enough if anyone can give me some idea. I am not a techie person, but I manage tech teams.
Regards, Nirban
I have 2 pdf files to merge using the MergePDF class:
first file contains A4(paper size) and the second file contains A3(paper size).
result from the newly generated PDF file:
the A3(paper size) page is cutting off, the portion of A4 size is merged.
how to fix this?
i am creating a pdf. in the second page of the pdf i need to insert another page from an exsisting pdf. How can i do this.
Could you please explain.
Thank you so much for your help for splitting pdf. I would like to get have code for adding page numbers to output split file.
This code gives an error when I try to use it. Can anybody find the error and paste the updated code?
I was unable to find a solution on the internet.
But finally I was able to find a trick to create a single PDF file with different page sizes.
Below code is the sample to achieve the above task (This is not the full code).
public static void main(String args[]) {
Map rateDetails = new HashMap();
rateDetails.put("2M", "256");
rateDetails.put("3M", "512");
rateDetails.put("7M", "450");
Document document1 = new Document();
try {
PDFRateCardBuilder2 ref = new PDFRateCardBuilder2();
PdfWriter.getInstance(document1, new FileOutputStream(FILE1));
document1.setPageSize(new Rectangle(1000, 1000));
document1.setMarginMirroring(true);
document1.setMargins(50, 50, 50, 50);
document1.open();
document1.add(ref.createFirstTable());
document1.add(ref.createSecondTable());
document1.add(ref.createThirdTable());
document1.add(ref.getStaticText());
document1.newPage();
document1
.setPageSize(new Rectangle(3000,
600 + (new PDFRateCardBuilder2()
.getDummyCustomerDTOs().length) * 100));
document1.setMarginMirroring(true);
document1.setMargins(50, 50, 50, 50);
document1.open();
document1.add(ref.createFirstTableForPage2());
document1.add(ref.createSecondTable());
document1.add(ref.createThirdTable());
document1.add(ref.getStaticText());
document1.close();
System.out.println("Hi");
Runtime.getRuntime().exec(
"rundll32 url.dll,FileProtocolHandler " + FILE1);
} catch (Exception e) {
e.printStackTrace();
}
}
The code is working fine for splitting pdf files. Thanks for posting.
I want to generate a hash for the PDF files after splitting them. But if I run the same file 15 times, I get a different output every time. Why? Code:
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.logging.Level;
import java.util.logging.Logger;
/**
*
* @author admin
*/
public class FingerprintGenerator {
public FingerprintGenerator() {
}
public String getFingerprint(String file_name) {
String fingerprint = null;
FileInputStream fis = null;
try {
MessageDigest md = MessageDigest.getInstance("SHA-256");
fis = new FileInputStream(file_name);
byte[] dataBytes = new byte[2048];
int nread = 0;
while ((nread = fis.read(dataBytes)) != -1) {
md.update(dataBytes, 0, nread);
}
byte[] mdbytes = md.digest();
StringBuilder hexString = new StringBuilder();
for (int i = 0; i < mdbytes.length; i++) {
hexString.append(Integer.toHexString(0xFF & mdbytes[i]));
}
fingerprint = hexString.toString();
// System.out.println("Fingerprint : " + fingerprint);
} catch (FileNotFoundException ex) {
Logger.getLogger(FingerprintGenerator.class.getName()).log(Level.SEVERE, null, ex);
} catch (IOException ex) {
Logger.getLogger(FingerprintGenerator.class.getName()).log(Level.SEVERE, null, ex);
} catch (NoSuchAlgorithmException ex) {
Logger.getLogger(FingerprintGenerator.class.getName()).log(Level.SEVERE, null, ex);
} finally {
try {
fis.close();
return fingerprint;
} catch (IOException ex) {
Logger.getLogger(FingerprintGenerator.class.getName()).log(Level.SEVERE, null, ex);
}
}
return fingerprint;
}
}
Please reply
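Two things are probably at play in the fingerprint code above. First, Integer.toHexString(0xFF & b) drops the leading zero for byte values below 0x10, so the hex string is not always 64 characters long. Second, as far as I know iText writes creation and modification timestamps plus a fresh document ID into every PDF it generates, so two runs of splitPDF() on the same input rarely produce byte-identical files, and a hash over the file bytes will differ accordingly. The sketch below only addresses the first point. Sha256Hex and hex() are hypothetical names for this illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: SHA-256 digest rendered as a fixed-width, lower-case hex string.
class Sha256Hex {

    static String hex(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(data);
            StringBuilder sb = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                // %02x pads to two digits, so 0x0a becomes "0a" rather than "a"
                sb.append(String.format("%02x", b & 0xFF));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this should not happen
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(hex("abc".getBytes(StandardCharsets.UTF_8)));
    }
}
```

To get a stable fingerprint of the split output, one option is to hash something that does not change between runs, for example the extracted page content, instead of the raw file bytes.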
HI Viral,
Thanks for the nice tutorial on creating and splitting PDF files. I have a question: how do I save the generated PDF file inside my project folder?
thanks
hi viral
i would like to create many number of bar charts and save it in a single pdf. can u help me out with ur same example for the bar and pie chart..
Viral how do i extract/divide a part of pdf page(single page) and add it to different pages of pdf
Hello Viral,
I have the same problem as Vel Rajagounder.
When I try to split my PDF source file that has html hyper links. Those hyper links don’t work on the split pdf.
Do you have the solution?
Thanks for your help
Vel
Hi Viral,
I have requirement like i am getting bunch(some times small data) of data in to the String and adding into new pdf page
doc.newPage(),
but i want split that data into no of page dynamically based on String length or size (which is coming from data base).
Please help me on this asap.
Thanks in advance
Hi Viral, Thanks, Nice Code working as expected.
I have a great problem, i have to merge two or more PDFs that have AcroFields but using your solution i lose all AcroFields, any suggestion?
Hi Viral, Good tutorial on “Merge PDF files in Java using iText JAR”, this works fine for me if I need to add (all pages) the second.pdf at the end of first.pdf…
How can I add second.pdf at a particular page number of first.pdf? Please give me directions. Thx.
hey How to use com.lowagie.text.pdf.PdfImportedPage it show an error in netbeans is it related to iText java please respond me thanksss.
hello
good day
I want to create a site for converting Persian PDFs to Word.
I want to do it with Java.
I'm having a problem reading the PDF:
it works right for English, but Persian text is a problem.
thank you