iText tutorial: Merge & Split PDF files using iText JAR

In the previous article, Generating PDF files using iText JAR, Kiran Hegde described a simple way of generating PDF files in Java using the iText JAR. It is a good starter tutorial for those who want to start working with iText. For one requirement, I had to merge two or more PDF files into a single PDF file. I considered implementing the functionality from scratch with iText, but first searched to see whether someone had already written the code I was looking for. As expected, I found a nice Java implementation that merges two or more PDF files using the iText JAR. In this post I dissect that code and give credit to the original author.

Merge PDF files in Java using iText JAR

So here we go. First let us see the code.
    package net.viralpatel.itext.pdf;

    import java.io.FileInputStream;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.io.InputStream;
    import java.io.OutputStream;
    import java.util.ArrayList;
    import java.util.Iterator;
    import java.util.List;

    import com.lowagie.text.Document;
    import com.lowagie.text.pdf.BaseFont;
    import com.lowagie.text.pdf.PdfContentByte;
    import com.lowagie.text.pdf.PdfImportedPage;
    import com.lowagie.text.pdf.PdfReader;
    import com.lowagie.text.pdf.PdfWriter;

    public class MergePDF {

        public static void main(String[] args) {
            try {
                List<InputStream> pdfs = new ArrayList<InputStream>();
                pdfs.add(new FileInputStream("c:\\1.pdf"));
                pdfs.add(new FileInputStream("c:\\2.pdf"));
                OutputStream output = new FileOutputStream("c:\\merge.pdf");
                MergePDF.concatPDFs(pdfs, output, true);
            } catch (Exception e) {
                e.printStackTrace();
            }
        }

        public static void concatPDFs(List<InputStream> streamOfPDFFiles,
                OutputStream outputStream, boolean paginate) {

            Document document = new Document();
            try {
                List<InputStream> pdfs = streamOfPDFFiles;
                List<PdfReader> readers = new ArrayList<PdfReader>();
                int totalPages = 0;
                Iterator<InputStream> iteratorPDFs = pdfs.iterator();

                // Create Readers for the pdfs.
                while (iteratorPDFs.hasNext()) {
                    InputStream pdf = iteratorPDFs.next();
                    PdfReader pdfReader = new PdfReader(pdf);
                    readers.add(pdfReader);
                    totalPages += pdfReader.getNumberOfPages();
                }

                // Create a writer for the outputstream
                PdfWriter writer = PdfWriter.getInstance(document, outputStream);

                document.open();
                BaseFont bf = BaseFont.createFont(BaseFont.HELVETICA,
                        BaseFont.CP1252, BaseFont.NOT_EMBEDDED);
                PdfContentByte cb = writer.getDirectContent(); // Holds the PDF data

                PdfImportedPage page;
                int currentPageNumber = 0;
                int pageOfCurrentReaderPDF = 0;
                Iterator<PdfReader> iteratorPDFReader = readers.iterator();

                // Loop through the PDF files and add to the output.
                while (iteratorPDFReader.hasNext()) {
                    PdfReader pdfReader = iteratorPDFReader.next();

                    // Create a new page in the target for each source page.
                    while (pageOfCurrentReaderPDF < pdfReader.getNumberOfPages()) {
                        document.newPage();
                        pageOfCurrentReaderPDF++;
                        currentPageNumber++;
                        page = writer.getImportedPage(pdfReader, pageOfCurrentReaderPDF);
                        cb.addTemplate(page, 0, 0);

                        // Code for pagination.
                        if (paginate) {
                            cb.beginText();
                            cb.setFontAndSize(bf, 9);
                            cb.showTextAligned(PdfContentByte.ALIGN_CENTER,
                                    "" + currentPageNumber + " of " + totalPages,
                                    520, 5, 0);
                            cb.endText();
                        }
                    }
                    pageOfCurrentReaderPDF = 0;
                }
                outputStream.flush();
                document.close();
                outputStream.close();
            } catch (Exception e) {
                e.printStackTrace();
            } finally {
                if (document.isOpen())
                    document.close();
                try {
                    if (outputStream != null)
                        outputStream.close();
                } catch (IOException ioe) {
                    ioe.printStackTrace();
                }
            }
        }
    }
What the code does is fairly simple:
  1. In the main() method, we create a List of InputStream objects that point to all the input PDF files we need to merge.
  2. We call the static MergePDF.concatPDFs() method, passing the list of input PDFs, an OutputStream for the merged output PDF, and a boolean flag that indicates whether page numbers should be printed at the bottom of each page.
  3. In the concatPDFs() method, the first while loop converts the List of InputStream objects into a List of PdfReader objects and also counts the total pages across all the input PDF files.
  4. Next we create the output objects for the merged PDF file using a Document object and the PdfWriter.getInstance() method.
  5. Then we create a BaseFont object using the BaseFont.createFont() method. This is the font used for writing page numbers.
  6. Finally we write all the input PDFs into the merged output PDF, iterating over each PDF and then over each of its pages in two nested while loops.
  7. And then we flush and close all the streams. Good boys do this ;-)
Now that we know how to merge PDF files into one, let us see how to split a PDF file, or extract part of a PDF into another PDF.

Split PDF files in Java using iText JAR

Let us see the code.
    /**
     * @author viralpatel.net
     *
     * @param inputStream Input PDF file
     * @param outputStream Output PDF file
     * @param fromPage start page from input PDF file
     * @param toPage end page from input PDF file
     */
    public static void splitPDF(InputStream inputStream,
            OutputStream outputStream, int fromPage, int toPage) {
        Document document = new Document();
        try {
            PdfReader inputPDF = new PdfReader(inputStream);

            int totalPages = inputPDF.getNumberOfPages();

            //make fromPage equals to toPage if it is greater
            if (fromPage > toPage) {
                fromPage = toPage;
            }
            if (toPage > totalPages) {
                toPage = totalPages;
            }

            // Create a writer for the outputstream
            PdfWriter writer = PdfWriter.getInstance(document, outputStream);

            document.open();
            PdfContentByte cb = writer.getDirectContent(); // Holds the PDF data
            PdfImportedPage page;

            while (fromPage <= toPage) {
                document.newPage();
                page = writer.getImportedPage(inputPDF, fromPage);
                cb.addTemplate(page, 0, 0);
                fromPage++;
            }
            outputStream.flush();
            document.close();
            outputStream.close();
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            if (document.isOpen())
                document.close();
            try {
                if (outputStream != null)
                    outputStream.close();
            } catch (IOException ioe) {
                ioe.printStackTrace();
            }
        }
    }
In the above code, we have created a method splitPDF() that extracts a range of pages from a PDF and writes them into another PDF. The code is largely self-explanatory and is similar to the code that merges PDF files. Thus, if you need to split input.pdf (20 pages) into output1.pdf (pages 1-12 of input.pdf) and output2.pdf (pages 13-20 of input.pdf), you can call the above method as follows:
    public static void main(String[] args) {
        try {
            MergePDF.splitPDF(new FileInputStream("C:\\input.pdf"),
                    new FileOutputStream("C:\\output1.pdf"), 1, 12);
            MergePDF.splitPDF(new FileInputStream("C:\\input.pdf"),
                    new FileOutputStream("C:\\output2.pdf"), 13, 20);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
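The range checks at the top of splitPDF() only cover two cases (fromPage greater than toPage, and toPage beyond the last page). As a minimal sketch, the same idea can be pulled into a small helper that also handles a start page below 1 or beyond the end of the document. The class PageRange and its clamp() method are hypothetical names for illustration and are not part of iText or of the code above.

```java
/** Hypothetical helper: a 1-based, inclusive page range that is safe to pass to splitPDF(). */
final class PageRange {
    final int from;
    final int to;

    private PageRange(int from, int to) {
        this.from = from;
        this.to = to;
    }

    /**
     * Clamps the requested range to a document with totalPages pages.
     * The end page is limited to [1, totalPages] and the start page to [1, end page],
     * so the result always contains at least one page.
     */
    static PageRange clamp(int fromPage, int toPage, int totalPages) {
        if (totalPages < 1) {
            throw new IllegalArgumentException("document has no pages");
        }
        int to = Math.max(1, Math.min(toPage, totalPages));
        int from = Math.max(1, Math.min(fromPage, to));
        return new PageRange(from, to);
    }
}
```

With this, a request for pages 13 to 25 of a 20-page file becomes 13 to 20, and a request starting at page 0 starts at page 1 instead.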
Feel free to bookmark the code and share it if you feel it will be useful to you :)

60 Comments

  1. prem says:

    Hi ,
    I want to split the pdf if my pdf exceeds the given size.
    For Example,
    If my pdf size is 12MB,I want to split the pdf in to 5MB parts.
    So i need to split the pdf in to 3, 5MB,5MB,2MB respectively.
    Pls let me know whether this is possible.

    Regards,
    Prem

    • @Prem, This example only splits the PDF on basis of page numbers. I will definitely try to write code for splitting pdf on basis of size and update you.
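One possible way to approach splitting by size, sketched here under loud assumptions: if you have an estimate of how many bytes each page contributes, you can group consecutive pages greedily and then call splitPDF() once per group. The class SizeSplitPlanner is hypothetical, and the per-page sizes are an input you would have to measure yourself (for example by writing each page to a ByteArrayOutputStream with splitPDF() and checking its size). Because fonts and images can be shared between pages, the real output files may come out smaller or larger than the estimate.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical planner: groups consecutive pages so each group stays within a size budget. */
final class SizeSplitPlanner {

    /**
     * @param pageBytes estimated size of each page; pageBytes[i] belongs to page i + 1
     * @param maxBytes  size budget for one output file
     * @return list of {fromPage, toPage} pairs, 1-based and inclusive
     */
    static List<int[]> plan(long[] pageBytes, long maxBytes) {
        if (maxBytes <= 0) {
            throw new IllegalArgumentException("maxBytes must be positive");
        }
        List<int[]> ranges = new ArrayList<>();
        int start = 1;      // first page of the group being built
        long running = 0;   // estimated size of the group so far
        for (int i = 0; i < pageBytes.length; i++) {
            int page = i + 1;
            // Close the current group if adding this page would exceed the budget.
            // A single oversized page still gets a group of its own.
            if (page > start && running + pageBytes[i] > maxBytes) {
                ranges.add(new int[] {start, page - 1});
                start = page;
                running = 0;
            }
            running += pageBytes[i];
        }
        if (pageBytes.length > 0) {
            ranges.add(new int[] {start, pageBytes.length});
        }
        return ranges;
    }
}
```

Each returned pair can then be passed as fromPage and toPage to splitPDF(), with a fresh input stream per call.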

      • Ramas says:

        @Viral Patel, Thanks for the great comprehensive tutorial on merger two PDF’s, it does work great, and always 2nd pdf concatenates at the end of the merge PDF, How to add PDF in between the merge PDF @say, given particular page number…Please give me directions, Thanks
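For inserting one PDF in the middle of another, one option is to work out the output page order first and only then import pages. The sketch below is plain Java with no iText calls; InsertPlan is a hypothetical name. Each entry says which source to read (0 for the main PDF, 1 for the inserted PDF) and which page of it. In concatPDFs() you would then loop over the plan and call writer.getImportedPage(readers.get(source), page) for each entry, instead of looping over each reader from start to end.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: computes the page order when one PDF is inserted into another. */
final class InsertPlan {

    /**
     * @param mainPages     number of pages in the main PDF
     * @param insertedPages number of pages in the PDF to insert
     * @param afterPage     the inserted PDF goes after this page of the main PDF (0 = at the front)
     * @return list of {source, page} pairs; source 0 = main PDF, 1 = inserted PDF; pages are 1-based
     */
    static List<int[]> plan(int mainPages, int insertedPages, int afterPage) {
        if (afterPage < 0 || afterPage > mainPages) {
            throw new IllegalArgumentException("afterPage must be between 0 and " + mainPages);
        }
        List<int[]> order = new ArrayList<>();
        for (int p = 1; p <= afterPage; p++) {
            order.add(new int[] {0, p});
        }
        for (int p = 1; p <= insertedPages; p++) {
            order.add(new int[] {1, p});
        }
        for (int p = afterPage + 1; p <= mainPages; p++) {
            order.add(new int[] {0, p});
        }
        return order;
    }
}
```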

  2. Goblin_Queen says:

    I would like to paste my code here to split a PDF based on bookmarks using iText. All my bookmarks are on level1. I based my code sample on the example on this page, and that’s why I would like to place my code here. I couldn’t really find any example of splitting a PDF based on bookmarks with iText, so I thought it could help other people if I put my code on this page:

    public static void splitPDFByBookmarks(String pdf, String outputFolder){
            try
            {
                PdfReader reader = new PdfReader(pdf);
                //List of bookmarks: each bookmark is a map with values for title, page, etc
                List<HashMap> bookmarks = SimpleBookmark.getBookmark(reader);
                for(int i=0; i<bookmarks.size(); i++){
                    HashMap bm = bookmarks.get(i);
                    HashMap nextBM = i==bookmarks.size()-1 ? null : bookmarks.get(i+1);
                    //In my case I needed to split the title string
                    String title = ((String)bm.get("Title")).split(" ")[2];
                    
                    log.debug("Titel: " + title);
                    String startPage = ((String)bm.get("Page")).split(" ")[0];
                    String startPageNextBM = nextBM==null ? "" + (reader.getNumberOfPages() + 1) : ((String)nextBM.get("Page")).split(" ")[0];
                    log.debug("Page: " + startPage);
                    log.debug("------------------");
                    extractBookmarkToPDF(reader, Integer.valueOf(startPage), Integer.valueOf(startPageNextBM), title + ".pdf",outputFolder);
                }
            }
            catch (IOException e)
            {
                log.error(e.getMessage());
            }
        }
        
        private static void extractBookmarkToPDF(PdfReader reader, int pageFrom, int pageTo, String outputName, String outputFolder){
            Document document = new Document();
            OutputStream os = null;
            
            try{
                os = new FileOutputStream(outputFolder + outputName);
    
                // Create a writer for the outputstream
                PdfWriter writer = PdfWriter.getInstance(document, os);
                document.open();
                PdfContentByte cb = writer.getDirectContent(); // Holds the PDF data
                PdfImportedPage page;
        
                while(pageFrom < pageTo) {
                    document.newPage();
                    page = writer.getImportedPage(reader, pageFrom);
                    cb.addTemplate(page, 0, 0);
                    pageFrom++;
                }
                
                os.flush();
                document.close();
                os.close();
            }catch(Exception ex){
                log.error(ex.getMessage());
            }finally {
                if (document.isOpen())
                    document.close();
                try {
                    if (os != null)
                        os.close();
                } catch (IOException ioe) {
                    log.error(ioe.getMessage());
                }
            }
        }
    
    • Thanks Goblin for the code. I appreciate your effort of sharing this here.

    • jitendra says:

      Hi Goblin_Queen

      NOT getting “page” key in the bookmark hashmap.

      I am getting the page key only for first and last bookmark & not able to see the page key in the map for other than first and last bookmark. Thats why i cant navigate the pages of bookmarks.

      List bookmarks = SimpleBookmark.getBookmark(reader);
      System.out.println(bookmarks.size()+" bookmarks  =>"+bookmarks );   
      


      The above two lines gives the following result…

      [{Action=GoTo, Page=1 FitH 669, Title=Front cover}, {Action=GoTo, Named=G1.108174, Title=brief contents}, {Action=GoTo, Named=G1.108372, Title=contents}, {Action=GoTo, Named=G1.107920, Title=preface}, {Action=GoTo, Named=G1.107932, Title=acknowledgments}, {Open=false, Action=GoTo, Kids=[{Action=GoTo, Named=G1.107944, Title=Who should read this book}, {Action=GoTo, Named=G1.107946, Title=Roadmap}, {Action=GoTo, Named=G1.107954, Title=Code conventions and downloads}, {Action=GoTo, Named=G1.107957, Title=Software requirements}, {Action=GoTo, Named=G1.107961, Title=Author Online}], Named=G1.107942, Title=about this book}, {Action=GoTo, Named=G1.107965, Title=about the cover illustration}, {Open=false, Action=GoTo, Kids=[{Open=false, Action=GoTo, Kids=[{Action=GoTo, Named=G2.998644, Title=1.1.1 Seeing objects as services}], Named=G2.998545, Title=1.1 Every solution needs a problem}, {Open=false, Action=GoTo, Kids=[{Action=GoTo, Named=G2.999513, Title=1.2.1 Construction by hand}, {Action=GoTo, Named=G2.1000151, Title=1.2.2 The Factory pattern}, {Action=GoTo, Named=G2.1002774, Title=1.2.3 The Service Locator pattern}], Named=G2.998957, Title=1.2 Pre-DI solutions}, {Open=false, Action=GoTo, Kids=[{Action=GoTo, Named=G2.1003053, Title=1.3.1 The Hollywood Principle}, {Action=GoTo, Named=G2.1003538, Title=1.3.2 Inversion of Control vs. 
dependency injection}], Named=G2.1003003, Title=1.3 Embracing dependency injection}, {Open=false, Action=GoTo, Kids=[{Action=GoTo, Named=G2.1003844, Title=1.4.1 Java}, {Action=GoTo, Named=G2.1004442, Title=1.4.2 DI in other languages and libraries}], Named=G2.1003819, Title=1.4 Dependency injection in the real world}, {Action=GoTo, Named=G2.1004528, Title=1.5 Summary}], Named=G2.998457, Title=Dependency injection: what’s all the hype?}, {Open=false, Action=GoTo, Kids=[{Action=GoTo, Named=G3.998617, Title=2.1 Bootstrapping the injector}, {Action=GoTo, Named=G3.998809, Title=2.2 Constructing objects with dependency injection}, {Open=false, Action=GoTo, Kids=[{Action=GoTo, Named=G3.1000106, Title=2.3.1 XML injection in Spring}, {Action=GoTo, Named=G3.1001142, Title=2.3.2 From XML to in-code configuration}, {Action=GoTo, Named=G3.1002177, Title=2.3.3 Injection in PicoContainer}, {Action=GoTo, Named=G3.1002827, Title=2.3.4 Revisiting Spring and autowiring}], Named=G3.999981, Title=2.3 Metadata and injector configuration}, {Open=false, Action=GoTo, Kids=[{Action=GoTo, Named=G3.1004269, Title=2.4.1 Identifying by string keys}, {Action=GoTo, Named=G3.1006207, Title=2.4.2 Limitations of string keys}, {Action=GoTo, Named=G3.1007033, Title=2.4.3 Identifying by type}, {Action=GoTo, Named=G3.1007751, Title=2.4.4 Limitations of identifying by type}, {Action=GoTo, Named=G3.1007909, Title=2.4.5 Combinatorial keys: a comprehensive solution}], Named=G3.1003588, Title=2.4 Identifying dependencies for injection}, {Action=GoTo, Named=G3.1009438, Title=2.5 Separating infrastructure and application logic}, {Action=GoTo, Named=G3.1009541, Title=2.6 Summary}], Named=G3.998448, Title=Time for injection}, {Open=false, Action=GoTo, Kids=[{Open=false, Action=GoTo, Kids=[{Action=GoTo, Named=G4.998758, Title=3.1.1 Constructor injection}, {Action=GoTo, Named=G4.998880, Title=3.1.2 Setter injection}, {Action=GoTo, Named=G4.1000807, Title=3.1.3 Interface injection}, {Action=GoTo, Named=G4.1001788, 
Title=3.1.4 Method decoration (or AOP injection)}], Named=G4.998736, Title=3.1 Injection idioms}, {Open=false, Action=GoTo, Kids=[{Action=GoTo, Named=G4.1003219, Title=3.2.1 Constructor vs. setter injection}, {Action=GoTo, Named=G4.1004405, Title=3.2.2 The constructor pyramid problem}, {Action=GoTo, Named=G4.1004791, Title=3.2.3 The circular reference problem}, {Action=GoTo, Named=G4.1006827, Title=3.2.4 The in-construction problem}, {Action=GoTo, Named=G4.1008163, Title=3.2.5 Constructor injection and object validity}], Named=G4.1003190, Title=3.2 Choosing an injection idiom}, {Open=false, Action=GoTo, Kids=[{Action=GoTo, Named=G4.1009765, Title=3.3.1 The reinjection problem}, {Action=GoTo, Named=G4.1010385, Title=3.3.2 Reinjection with the Provider pattern}, {Action=GoTo, Named=G4.1011357, Title=3.3.3 The contextual injection problem}, {Action=GoTo, Named=G4.1011857, Title=3.3.4 Contextual injection with the Assisted Injection pattern}, {Action=GoTo, Named=G4.1013405, Title=3.3.5 Flexible partial injection with the Builder pattern}], Named=G4.1009724, Title=3.3 Not all at once: partial injection}, {Open=false, Action=GoTo, Kids=[{Action=GoTo, Named=G4.1015816, Title=3.4.1 Injecting with externalized metadata}, {Action=GoTo, Named=G4.1016872, Title=3.4.2 Using the Adapter pattern}], Named=G4.1015786, Title=3.4 Injecting objects in sealed code}, {Action=GoTo, Named=G4.1017327, Title=3.5 Summary}], Named=G4.998430, Title=Investigating DI}, {Open=false, Action=GoTo, Kids=[{Action=GoTo, Named=G5.998515, Title=4.1 Understanding the role of an object}, {Open=false, Action=GoTo, Kids=[{Action=GoTo, Named=G5.998776, Title=4.2.1 Perils of tight coupling}, {Action=GoTo, Named=G5.999775, Title=4.2.2 Refactoring impacts of tight coupling}, {Action=GoTo, Named=G5.1001749, Title=4.2.3 Programming to contract}, {Action=GoTo, Named=G5.1002620, Title=4.2.4 Loose coupling with dependency injection}], Named=G5.998640, Title=4.2 Separation of concerns (my pants are too tight!)}, 
{Open=false, Action=GoTo, Kids=[{Action=GoTo, Named=G5.1003221, Title=4.3.1 Out-of-container (unit) testing}, {Action=GoTo, Named=G5.1003492, Title=4.3.2 I really need my dependencies!}, {Action=GoTo, Named=G5.1003727, Title=4.3.3 More on mocking dependencies}, {Action=GoTo, Named=G5.1004357, Title=4.3.4 Integration testing}], Named=G5.1003135, Title=4.3 Testing components}, {Open=false, Action=GoTo, Kids=[{Action=GoTo, Named=G5.1005095, Title=4.4.1 Rebinding dependencies}, {Action=GoTo, Named=G5.1005162, Title=4.4.2 Mutability with the Adapter pattern}], Named=G5.1005054, Title=4.4 Different deployment profiles}, {Action=GoTo, Named=G5.1006093, Title=4.5 Summary}], Named=G5.1012426, Title=Building modular applications}, {Open=false, Action=GoTo, Kids=[{Action=GoTo, Named=G6.998572, Title=5.1 What is scope?}, {Action=GoTo, Named=G6.998953, Title=5.2 The no scope (or default scope)}, {Open=false, Action=GoTo, Kids=[{Action=GoTo, Named=G6.1000438, Title=5.3.1 Singletons in practice}, {Action=GoTo, Named=G6.1002768, Title=5.3.2 The singleton anti-pattern}], Named=G6.999945, Title=5.3 The singleton scope}, {Open=false, Action=GoTo, Kids=[{Action=GoTo, Named=G6.1004465, Title=5.4.1 HTTP request scope}, {Action=GoTo, Named=G6.1008619, Title=5.4.2 HTTP session scope}], Named=G6.1004208, Title=5.4 Domain-specific scopes: the web}, {Action=GoTo, Named=G6.1010476, Title=5.5 Summary}], Named=G6.998430, Title=Scope: a fresh breath of state}, {Open=false, Action=GoTo, Kids=[{Open=false, Action=GoTo, Kids=[{Action=GoTo, Named=G7.998584, Title=6.1.1 A quick primer on transactions}, {Action=GoTo, Named=G7.998675, Title=6.1.2 Creating a custom transaction scope}, {Action=GoTo, Named=G7.999599, Title=6.1.3 A custom scope in Guice}, {Action=GoTo, Named=G7.1001081, Title=6.1.4 A custom scope in Spring}], Named=G7.998524, Title=6.1 Defining a custom scope}, {Open=false, Action=GoTo, Kids=[{Action=GoTo, Named=G7.1002711, Title=6.2.1 Singletons must be thread-safe}, {Action=GoTo, 
Named=G7.1003715, Title=6.2.2 Perils of scope-widening injection}], Named=G7.1002677, Title=6.2 Pitfalls and corner cases in scoping}, {Open=false, Action=GoTo, Kids=[{Action=GoTo, Named=G7.1007992, Title=6.3.1 Cache scope}, {Action=GoTo, Named=G7.1008059, Title=6.3.2 Grid scope}, {Action=GoTo, Named=G7.1008382, Title=6.3.3 Transparent grid computing with DI}], Named=G7.1007946, Title=6.3 Leveraging the power of scopes}, {Action=GoTo, Named=G7.1008801, Title=6.4 Summary}], Named=G7.998430, Title=More use cases in scoping}, {Open=false, Action=GoTo, Kids=[{Open=false, Action=GoTo, Kids=[{Action=GoTo, Named=G8.998530, Title=7.1.1 Object creation}, {Action=GoTo, Named=G8.998974, Title=7.1.2 Object destruction (or finalization)}], Named=G8.998452, Title=7.1 Significant events in the life of objects}, {Open=false, Action=GoTo, Kids=[{Action=GoTo, Named=G8.999629, Title=7.2.1 Contrasting lifecycle scenarios: servlets vs. database connections}, {Action=GoTo, Named=G8.1001494, Title=7.2.2 The Destructor anti-pattern}, {Action=GoTo, Named=G8.1001859, Title=7.2.3 Using Java’s Closeable interface}], Named=G8.999558, Title=7.2 One size doesn’t fit all (domain-specific lifecycle)}, {Action=GoTo, Named=G8.1002277, Title=7.3 A real-world lifecycle scenario: stateful EJBs}, {Action=GoTo, Named=G8.1002948, Title=7.4 Lifecycle and lazy instantiation}, {Action=GoTo, Named=G8.1003216, Title=7.5 Customizing lifecycle with postprocessing}, {Action=GoTo, Named=G8.1005146, Title=7.6 Customizing lifecycle with multicasting}, {Action=GoTo, Named=G8.1005897, Title=7.7 Summary}], Named=G8.998385, Title=From birth to death: object lifecycle}, {Open=false, Action=GoTo, Kids=[{Open=false, Action=GoTo, Kids=[{Action=GoTo, Named=G9.998940, Title=8.1.1 A tracing interceptor with Guice}, {Action=GoTo, Named=G9.999710, Title=8.1.2 A tracing interceptor with Spring}, {Action=GoTo, Named=G9.1000392, Title=8.1.3 How proxying works}, {Action=GoTo, Named=G9.1001800, Title=8.1.4 Too much advice can be 
dangerous!}], Named=G9.998792, Title=8.1 Intercepting methods and AOP}, {Open=false, Action=GoTo, Kids=[{Action=GoTo, Named=G9.1002580, Title=8.2.1 Transactional methods with warp-persist}, {Action=GoTo, Named=G9.1004109, Title=8.2.2 Securing methods with Spring Security}], Named=G9.1002540, Title=8.2 Enterprise use cases for interception}, {Open=false, Action=GoTo, Kids=[{Action=GoTo, Named=G9.1005451, Title=8.3.1 Sameness tests are unreliable}, {Action=GoTo, Named=G9.1006230, Title=8.3.2 Static methods cannot be intercepted}, {Action=GoTo, Named=G9.1006559, Title=8.3.3 Neither can private methods}, {Action=GoTo, Named=G9.1007280, Title=8.3.4 And certainly not final methods!}, {Action=GoTo, Named=G9.1008012, Title=8.3.5 Fields are off limits}, {Action=GoTo, Named=G9.1008841, Title=8.3.6 Unit tests and interception}], Named=G9.1005419, Title=8.3 Pitfalls and assumptions about interception and proxying}, {Action=GoTo, Named=G9.1009874, Title=8.4 Summary}], Named=G9.998403, Title=Managing an object’s behavior}, {Open=false, Action=GoTo, Kids=[{Open=false, Action=GoTo, Kids=[{Action=GoTo, Named=G10.999803, Title=9.1.1 Safe publication}, {Action=GoTo, Named=G10.1000274, Title=9.1.2 Safe wiring}], Named=G10.998572, Title=9.1 Objects and visibility}, {Open=false, Action=GoTo, Kids=[{Action=GoTo, Named=G10.1001069, Title=9.2.1 On data and services}, {Action=GoTo, Named=G10.1003319, Title=9.2.2 On better encapsulation}], Named=G10.1001045, Title=9.2 Objects and design}, {Open=false, Action=GoTo, Kids=[{Action=GoTo, Named=G10.1005361, Title=9.3.1 More on mutability}, {Action=GoTo, Named=G10.1007571, Title=9.3.2 Synchronization vs. 
concurrency}], Named=G10.1005326, Title=9.3 Objects and concurrency}, {Action=GoTo, Named=G10.1008397, Title=9.4 Summary}], Named=G10.1013488, Title=Best practices in code design}, {Open=false, Action=GoTo, Kids=[{Action=GoTo, Named=G11.998530, Title=10.1 Fragmentation of DI solutions}, {Open=false, Action=GoTo, Kids=[{Action=GoTo, Named=G11.999753, Title=10.2.1 Rigid configuration anti-patterns}, {Action=GoTo, Named=G11.1001768, Title=10.2.2 Black box anti-patterns}], Named=G11.999657, Title=10.2 Lessons for framework designers}, {Open=false, Action=GoTo, Kids=[{Action=GoTo, Named=G11.1003603, Title=10.3.1 Case study: JSR-303}], Named=G11.1003574, Title=10.3 Programmatic configuration to the rescue}, {Action=GoTo, Named=G11.1007253, Title=10.4 Summary}], Named=G11.998431, Title=Integrating with third-party frameworks}, {Open=false, Action=GoTo, Kids=[{Open=false, Action=GoTo, Kids=[{Action=GoTo, Named=G12.998588, Title=11.1.1 Crosstalk’s requirements}], Named=G12.998536, Title=11.1 Crosstalk: a Twitter clone!}, {Action=GoTo, Named=G12.998679, Title=11.2 Setting up the application}, {Action=GoTo, Named=G12.1000527, Title=11.3 Configuring Google Sitebricks}, {Action=GoTo, Named=G12.1001393, Title=11.4 Crosstalk’s modularity and service coupling}, {Open=false, Action=GoTo, Kids=[{Action=GoTo, Named=G12.1002594, Title=11.5.1 The HomePage template}, {Action=GoTo, Named=G12.1003550, Title=11.5.2 The Tweet domain object}, {Action=GoTo, Named=G12.1004785, Title=11.5.3 Users and sessions}, {Action=GoTo, Named=G12.1005775, Title=11.5.4 Logging in and out}], Named=G12.1001567, Title=11.5 The presentation layer}, {Open=false, Action=GoTo, Kids=[{Action=GoTo, Named=G12.1008403, Title=11.6.1 Configuring the persistence layer}], Named=G12.1007373, Title=11.6 The persistence layer}, {Action=GoTo, Named=G12.1009427, Title=11.7 The security layer}, {Action=GoTo, Named=G12.1009867, Title=11.8 Tying up to the web lifecycle}, {Action=GoTo, Named=G12.1010116, Title=11.9 Finally: up and 
running!}, {Action=GoTo, Named=G12.1010349, Title=11.10 Summary}], Named=G12.998440, Title=Dependency injection in action!}, {Open=false, Action=GoTo, Kids=[{Action=GoTo, Named=G13.998449, Title=A.1 A DSL for dependency injection}, {Open=false, Action=GoTo, Kids=[{Action=GoTo, Named=G13.998538, Title=A.2.1 The DSL basics}, {Action=GoTo, Named=G13.998750, Title=A.2.2 The Factory chain}, {Action=GoTo, Named=G13.998972, Title=A.2.3 Contextual injection via input parameters}, {Action=GoTo, Named=G13.999260, Title=A.2.4 Reinjection via factory injection}], Named=G13.998509, Title=A.2 Application configuration}], Named=G13.1000235, Title=appendix A: The Butterfly Container}, {Open=false, Action=GoTo, Kids=[{Open=false, Action=GoTo, Kids=[{Action=GoTo, Named=G14.998506, Title=B.1.1 Injection annotations}], Named=G14.998483, Title=B.1 The class+name key}, {Open=false, Action=GoTo, Kids=[{Action=GoTo, Named=G14.998832, Title=B.2.1 But how do I kickstart the whole thing?}], Named=G14.998656, Title=B.2 Injector rules}], Named=G14.998427, Title=appendix B: SmartyPants for Adobe Flex}, {Open=false, Action=GoTo, Kids=[{Action=GoTo, Named=G15.1659028, Title=Symbols}, {Action=GoTo, Named=G15.1659087, Title=A}, {Action=GoTo, Named=G15.1659219, Title=B}, {Action=GoTo, Named=G15.1659269, Title=C}, {Action=GoTo, Named=G15.1659485, Title=D}, {Action=GoTo, Named=G15.1659632, Title=E}, {Action=GoTo, Named=G15.1659682, Title=F}, {Action=GoTo, Named=G15.1659743, Title=G}, {Action=GoTo, Named=G15.1659854, Title=H}, {Action=GoTo, Named=G15.1659920, Title=I}, {Action=GoTo, Named=G15.1660036, Title=J}, {Action=GoTo, Named=G15.1660155, Title=K}, {Action=GoTo, Named=G15.1660171, Title=L}, {Action=GoTo, Named=G15.1660249, Title=M}, {Action=GoTo, Named=G15.1660316, Title=N}, {Action=GoTo, Named=G15.1660343, Title=O}, {Action=GoTo, Named=G15.1660432, Title=P}, {Action=GoTo, Named=G15.1660567, Title=R}, {Action=GoTo, Named=G15.1660629, Title=S}, {Action=GoTo, Named=G15.1660968, Title=T}, 
{Action=GoTo, Named=G15.1661074, Title=U}, {Action=GoTo, Named=G15.1661097, Title=V}, {Action=GoTo, Named=G15.1661109, Title=W}, {Action=GoTo, Named=G15.1661146, Title=X}, {Action=GoTo, Named=G15.1661154, Title=Z}], Named=G15.1010277, Title=index}, {Action=GoTo, Page=354 FitH 608, Title=Back cover}]
      

      Please suggest some solution

      Jitendra

    • caba says:

      hi,

      I have to split pdf based on chapters i am not able to get some of the childs of please help me
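The dump above shows why the "Page" key is missing: most bookmarks carry a "Named" key (a named destination) instead of a "Page" key, and child bookmarks sit in a nested "Kids" list. A rough sketch of one way to handle both is below. It is plain Java over Map objects, so it makes no iText calls itself; BookmarkPages is a hypothetical name. It assumes you can obtain a map from destination name to a destination string that starts with the page number, in the same form as the "Page" values above ("354 FitH 608"). In iText that map may be available from SimpleNamedDestination.getNamedDestination(reader, false), but treat that call and its value format as something to verify against the iText version you use.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: walks nested bookmarks and resolves each one to a page number. */
final class BookmarkPages {

    /**
     * @param bookmarks         bookmark maps with "Title" and either "Page" or "Named", plus optional "Kids"
     * @param namedDestinations destination name to destination string (assumed to start with the page number)
     * @return {title, page} pairs in document order, including all nested levels
     */
    static List<String[]> flatten(List<? extends Map<String, Object>> bookmarks,
                                  Map<String, String> namedDestinations) {
        List<String[]> out = new ArrayList<>();
        collect(bookmarks, namedDestinations, out);
        return out;
    }

    @SuppressWarnings("unchecked")
    private static void collect(List<? extends Map<String, Object>> bookmarks,
                                Map<String, String> named, List<String[]> out) {
        for (Map<String, Object> bm : bookmarks) {
            String target = (String) bm.get("Page");
            if (target == null && bm.get("Named") != null) {
                // No direct page: look the named destination up instead.
                target = named.get((String) bm.get("Named"));
            }
            if (target != null) {
                // Destination strings look like "354 FitH 608": the page number comes first.
                String page = target.trim().split("\\s+")[0];
                out.add(new String[] {(String) bm.get("Title"), page});
            }
            Object kids = bm.get("Kids");
            if (kids instanceof List) {
                collect((List<? extends Map<String, Object>>) kids, named, out);
            }
        }
    }
}
```

Bookmarks whose destination cannot be resolved are skipped rather than guessed. The resulting list can drive extractBookmarkToPDF() in the same way the level 1 loop does.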

  3. dharmendra says:

    i had downloaded iText jar 5.0.1 but it do not contain com.lowagie.text.table package what should i do?? i want to create table dynamically in pdf page..

  4. Puru says:

    Viral,
    I was searching pdf splittig on the basis of size(excedding a given size say > 10 MB). Prem was also looking for the same last year.Just checking if you got an opportunity to write on this.

    Thanks for your help
    Puru

  5. Kunu says:

    Viral

    Thanks for posting the code here.

  6. Mohamed nazar says:

    i was searching for the code to read a image and parse it can you post the code

  7. Dimitri says:

    Viral : If you already found the script for splitting pdfs by page size, it would be VERY handy
    Thx in advance

  8. Vishnu says:

    Hi Viral,

    By any chance is it possible to split the pages randomly? E.g. I have a PDF document of 20 pages and now I was a new PDF containing only 2nd, 5th, 11th and 16th page of the original document.

    Thanks in advance.
    Vishnu.
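Picking arbitrary pages should be possible with the same import loop as splitPDF(), as long as you iterate over a list of page numbers instead of a from/to range. As a minimal sketch, the helper below turns a selection such as "2,5,11,16" (ranges like "1-3" also work) into a list of page numbers and validates it against the page count. PageSelection is a hypothetical name and the syntax is my own simplification. iText's PdfReader also has a selectPages(String) method that accepts a similar range string, which may be the shorter route.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: parses "2,5,11,16" or "1-3,7" into a list of 1-based page numbers. */
final class PageSelection {

    static List<Integer> parse(String spec, int totalPages) {
        List<Integer> pages = new ArrayList<>();
        for (String part : spec.split(",")) {
            String token = part.trim();
            if (token.isEmpty()) {
                continue;
            }
            int from;
            int to;
            int dash = token.indexOf('-');
            if (dash < 0) {
                from = Integer.parseInt(token);
                to = from;
            } else {
                from = Integer.parseInt(token.substring(0, dash).trim());
                to = Integer.parseInt(token.substring(dash + 1).trim());
            }
            if (from < 1 || to > totalPages || from > to) {
                throw new IllegalArgumentException("bad page selection: " + token);
            }
            for (int p = from; p <= to; p++) {
                pages.add(p);
            }
        }
        return pages;
    }
}
```

Inside splitPDF(), the while loop would then become a for loop over the returned list, calling document.newPage(), writer.getImportedPage(inputPDF, p) and cb.addTemplate(page, 0, 0) for each page number p.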

  9. ManhHD says:

    Thank for sharing

  10. maddireddy says:

    hi viral…can you provide sample code to generate thumbnails to a pdf page

  11. Swati says:

    nice coding…:)its wrks for me always in 1 shoot viral

  12. Edwin Tan says:

    Hi Viral Patel,

    Did you manage to develop sample code to split PDFs based on size? i.e. Splitting a 8MB file into 2MB PDFs.

    Regards,
    Edwin

  13. Shivakumar says:

    Hi Viral , I got the following errors:

    Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at java.io.ByteArrayOutputStream.write(Unknown Source)
    at com.itextpdf.text.pdf.RandomAccessFileOrArray.InputStreamToArray(RandomAccessFileOrArray.java:182)
    at com.itextpdf.text.pdf.RandomAccessFileOrArray.<init>(RandomAccessFileOrArray.java:172)
    at com.itextpdf.text.pdf.PdfReader.<init>(PdfReader.java:237)
    at com.itextpdf.text.pdf.PdfReader.<init>(PdfReader.java:248)
    at MergePDF.splitPDF(split.java:38)
    at ShivaPdf.main(split.java:87)

    How to resolve them.

    • Sandeep says:

      Have you resolved java.lang.OutOfMemoryError: Java heap space???

  14. Luke Pacholski says:

    Dimitri asked:

    “Viral : If you already found the script for splitting pdfs by page size, it would be VERY handy”

    I found the answer here:

    http://java-x.blogspot.com/2006/11/merge-pdf-files-with-itext.html#comment-3167256311373072651

    document.setPageSize(pdfReader.getPageSizeWithRotation(pageOfCurrentReaderPDF));
    document.newPage();

    I.e., immediately before creating a new page, set the page size to the page size (and orientation) of the page you are reading in. The only issue to be aware of is the PDF has to be (I believe) version 1.5 or higher. You’ll get errors trying to open PDFs saved as 1.4.

    pdfWriter.setPdfVersion(PdfWriter.PDF_VERSION_1_5);

  15. natalie Afota says:

    Hey

    I have used your code for merging few pdf files but I still have a problem to merge files that carry different sizes!
    I have try to had a command to set the page size

    document.newPage();
    pageOfCurrentReaderPDF++;
    currentPageNumber++;
    document.setPageSize(pdfReader.getPageSize(pageOfCurrentReaderPDF));

    Thanks, for the helpful code!

    Natalie

  16. uday says:

    Very good code for merging & splitting….
    good work itext…………exactly suits my requirement.

  17. teakey says:

    hi,thanks for your coding ,but I still have a problem to merge files that carry different sizes!
    how could i solve it? some page is large while other is small.I am tired with it.help!

  18. fryguy says:

    Does anybody have any examples for splitting a PDF at bookmarks that exist at a level 2? The code provided by Goblin_Queen works great for level 1 bookmarks, but I’m not quite sure how to expand that to level 2 bookmarks as the resulting hashmap for the “Kids” do not contain page information.

    HashMap bm = bookmarks.get(i);
    List kids = (List) bm.get("Kids");

    The kids hashmap does not have the “Page” defined.

  19. Searock says:

    You could reduce the number of lines by using PdfStamper class.

    PdfReader pdfReader = new PdfReader(fileDialog.FileName);
    FileOutputStream fout = new FileOutputStream("C:\\output1.pdf");
    PdfStamper splitter = new PdfStamper(pdfReader, fout);
    pdfReader.SelectPages("1-10");
    splitter.Close();
    pdfReader.Close();

  20. mani says:

    very useful code for all guys who r in initial stage thank u very much

  21. Dennis Lindeman says:

    I tried the concatPDFs code. It concatenates two files OK. But, it does not copy the signatures that were in the files. It also does not copy text data fields.

  22. to split says:

    thank you for the great tutorial and comments. They help me a lot, to write my own split function

  23. dhaval says:

    I used merging of pdf code but i got problem while i am going to merge A3 paper size pdf file. this code convert that files into A4 paper size with cut of the pdf image area.

  24. CMil says:

    This works, but not for a large number of documents. I still used it as a good starting point, though. Since it opens the PDF FileInputStreams in the main method, they have to stay open the entire time and never get closed. A better approach would be to create a List<String> to hold the PDFs' filenames (instead of a List<InputStream>), then pass this to the concatPDFs method. concatPDFs would then need to iterate through the list (which it already does), open each InputStream, read the PDF, add it to readers, then close the InputStream. That is a much better use of resources when a large number of files is involved.
    I tried it Viral's way and it works for up to a few hundred PDFs, but with anything more I got a "FileNotFoundException (Too many open files)".

    • @CMil, That would be a better solution. Thanks for pointing that out. I will update the post with your suggestion. :)
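A minimal sketch of CMil's suggestion, using only the Java standard library: each file is opened, read fully, and closed before the next one is touched, so only one file handle is open at a time. The class and method names (`PdfBytesLoader`, `readFully`, `loadAll`) are hypothetical. The trade-off is that every file's bytes are held in memory.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

final class PdfBytesLoader {

    // Reads one file fully into memory; try-with-resources closes the
    // stream before the method returns, even if reading fails.
    static byte[] readFully(String fileName) throws IOException {
        try (InputStream in = Files.newInputStream(Paths.get(fileName));
             ByteArrayOutputStream buffer = new ByteArrayOutputStream()) {
            byte[] chunk = new byte[8192];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
            return buffer.toByteArray();
        }
    }

    // Loads the files one after another, in list order.
    static List<byte[]> loadAll(List<String> fileNames) throws IOException {
        List<byte[]> contents = new ArrayList<byte[]>();
        for (String fileName : fileNames) {
            contents.add(readFully(fileName));
        }
        return contents;
    }
}
```

Each byte array could then be handed to iText's `PdfReader(byte[])` constructor inside concatPDFs in place of the open InputStreams.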

  25. Lucas Voelk says:

    The above method does split the PDF; however, it does not preserve the orientation of the pages and will try to print them all as either landscape or portrait. This is fine unless you have a scanned document with landscape pages turned on end, in which case it tries to write landscape pages in portrait format and clips the sides. Below is a method using PdfCopy that accomplishes the same thing but preserves orientation.

    public static void copyPages(InputStream inputStream, OutputStream outputStream, int fromPage, int toPage) {
        try {
            PdfReader reader = new PdfReader(inputStream);
            Document document = new Document();
            PdfCopy copy = new PdfCopy(document, outputStream);
            int n = reader.getNumberOfPages();
            document.open();
            // iText page numbers are 1-based, so walk pages 1..n inclusive
            for (int i = 1; i <= n; i++) {
                if (i >= fromPage && i <= toPage) {
                    copy.addPage(copy.getImportedPage(reader, i));
                }
            }
            document.close();
        } catch (IOException e) {
            e.printStackTrace();
        } catch (DocumentException e) {
            e.printStackTrace();
        }
    }
    

  26. Fernando Bona says:

    Thank you very much!

  27. Prasad kamat says:

    This is an amazing blog. You guys are really putting nice solutions in the forum, which I am sure are helpful for thousands of people.

  28. Nelson says:

    Hi, I'm using iText to merge PDFs, but I'm new to this and I'm facing the following issue:

    I want to merge each page of a PDF file (let's call it A) with a header PDF file (let's call it B).
    When I merge pages that are portrait it works fine, but now I have a page in A that is landscape while the header page in B is portrait, so my code is not working.
    This is my code. Thanks in advance for your help.

    public static void iTextMerge(File baseFile, File bgFile, File outFile) throws IOException, DocumentException
    	{
    		
    		PdfReader bgReader = new PdfReader(bgFile.getAbsolutePath());
    		PdfReader baseReader = new PdfReader(baseFile.getAbsolutePath());
    		FileOutputStream out = new FileOutputStream(outFile);
    		
    		PdfStamper stamp = null;
    		
    		stamp = new PdfStamper(baseReader, out);
    		Rectangle bgSize = bgReader.getCropBox(1);
    		PdfImportedPage bgContent = stamp.getImportedPage(bgReader, 1);
    		
    		PdfGState blend = new PdfGState();
    		blend.setFillOpacity(0.5f);
    	//	blend.setStrokeOpacity(0.5f);
    	//	blend.setBlendMode(PdfGState.BM_DARKEN);
    		for (int i = 1; i <= baseReader.getNumberOfPages(); i++)
    		{
    			Rectangle pageSize;
    			PdfContentByte content = stamp.getOverContent(i);
    			 
    			
    		//	content.setGState(blend);
    			
    			pageSize = baseReader.getCropBox(i);
    			pageSize.rotate();
    			
    			float hScale = pageSize.getWidth() / bgSize.getWidth(),
    				  vScale = pageSize.getHeight() / bgSize.getHeight(),
    				  dominantScale = hScale < vScale ? hScale : vScale,
    				  hTrans = (float) (pageSize.getLeft() - bgSize.getLeft() * dominantScale + (pageSize.getWidth() - bgSize.getWidth() * dominantScale) / 2.0),
    				  vTrans = (float) (pageSize.getBottom() - bgSize.getBottom() * dominantScale + (pageSize.getHeight() - bgSize.getHeight() * dominantScale) / 2.0);
    			
    			
    			content.addTemplate(bgContent, dominantScale, 0, 0, dominantScale, hTrans, vTrans);
    			
    		}
    		
    		stamp.close();
    		
    		
    	}  
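The scale and offset arithmetic in the loop above can be checked in isolation. The sketch below repeats the same formulas on plain floats, with no iText dependency; the class name `FitTransform` and its fields are hypothetical. It assumes the page width and height passed in are the dimensions as displayed, that is, after any page rotation has been taken into account. As far as I know, `Rectangle.rotate()` in iText returns a new rectangle rather than changing the one it is called on, so the rotation in the posted code may be getting discarded, and `PdfReader.getPageSizeWithRotation(i)` is one way to obtain the displayed size.

```java
final class FitTransform {
    final float scale;   // uniform scale applied to the background
    final float hTrans;  // horizontal offset after scaling
    final float vTrans;  // vertical offset after scaling

    FitTransform(float scale, float hTrans, float vTrans) {
        this.scale = scale;
        this.hTrans = hTrans;
        this.vTrans = vTrans;
    }

    // Picks the largest uniform scale at which the background box still fits
    // inside the page box, then centers the scaled background on the page.
    static FitTransform fit(float pageLeft, float pageBottom, float pageWidth, float pageHeight,
                            float bgLeft, float bgBottom, float bgWidth, float bgHeight) {
        float hScale = pageWidth / bgWidth;
        float vScale = pageHeight / bgHeight;
        float scale = Math.min(hScale, vScale);
        float hTrans = pageLeft - bgLeft * scale + (pageWidth - bgWidth * scale) / 2f;
        float vTrans = pageBottom - bgBottom * scale + (pageHeight - bgHeight * scale) / 2f;
        return new FitTransform(scale, hTrans, vTrans);
    }
}
```

For a 200 x 100 landscape page and a 100 x 200 portrait background, both with their origin at (0, 0), this gives a scale of 0.5, a horizontal offset of 75 and a vertical offset of 0: the background is shrunk to 50 x 100 and centered horizontally.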

  29. Sunil Kumar says:

    Hi Guys,
    I need to develop a program that compares a set of PDF files stored in one folder with a set of PDF files stored in another folder. Both folders contain the same number of PDF files, with the same names. I am able to compare the text of the PDF files by storing it in a String array, but I am not able to compare changed fonts and changed images. I also need to generate a new PDF file that highlights the differences by changing the background color, and when the mouse cursor is placed over a highlight, it should pop up the content of the other PDF file that does not match. I am able to generate the PDF file, but not to highlight the differences. There can be any number of PDF files in the folders, and they need to be compared one by one.
    Guys, please suggest some appropriate ways to implement this. I need it for my project.

    The basic requirement is to match one PDF file against another and generate a PDF file with the same contents as the first file, but with the differences (text, font, image, etc.) highlighted, with a pop-up (if possible) showing the content of the other PDF file.

    I would appreciate any help

    Thanks in ADVANCE!!!

  30. Nikhil raj kumar says:

    How can I extract the references from a research paper, such as an IEEE paper?
    While reading an IEEE research paper PDF using iText,
    I get the first line of the first column followed by the first line of the second column
    (the two columns of a page are merged).
    Please guide me on how to extract the references from a PDF file.

  31. Vel Rajagounder says:

    Viral,
    I am using the code you posted here to merge PDFs. It works well. Thanks for posting it. My PDFs contain HTML hyperlinks. Those hyperlinks don't work in the merged PDF, though they work in the individual PDFs. Any thoughts on this?
    Thanks
    Vel

  32. Owais says:

    How do I add a header in iText?

    I tried with:
    HeaderFooter header = new HeaderFooter(new Phrase(chunk),true);
    header.setAlignment(Element.ALIGN_CENTER);
    header.setBorder(Rectangle.NO_BORDER);
    document.setHeader(header);

    But it gives me an error saying that HeaderFooter(Phrase, boolean) is undefined,
    and the setHeader method is not there.
    Any comment on this would be highly appreciated.

  33. raju says:

    How do I split a single PDF into multiple individual PDFs?

  34. Nirban says:

    I need all of your help.

    I need to compare two PDFs (say pdf1 and pdf2: exactly the same format and form, just different variable data field values, such as the interest rate) and highlight the difference in pdf2, let's say with a yellow background color on just the interest rate. Can I do it with this? Can anyone suggest anything?

    I cannot thank you enough if anyone can give me some idea. I am not a techie person, but I manage tech teams.

    Regards, Nirban

  35. Bert says:

    I have 2 PDF files to merge using the MergePDF class:
    the first file has A4 pages and the second file has A3 pages.

    Result in the newly generated PDF file:
    the A3 page is cut off, and only the A4-sized portion of it is merged.

    How can I fix this?

  36. siddam says:

    I am creating a PDF. On the second page of the PDF I need to insert a page from an existing PDF. How can I do this?

    Could you please explain?

  37. Shruti says:

    Thank you so much for your help with splitting PDFs. I would like to have code for adding page numbers to the output split files.

  38. lisa says:

    public static void iTextMerge(File baseFile, File bgFile, File outFile) throws IOException, DocumentException
       {
            
           PdfReader bgReader = new PdfReader(bgFile.getAbsolutePath());
           PdfReader baseReader = new PdfReader(baseFile.getAbsolutePath());
           FileOutputStream out = new FileOutputStream(outFile);
            
           PdfStamper stamp = null;
            
           stamp = new PdfStamper(baseReader, out);
           Rectangle bgSize = bgReader.getCropBox(1);
           PdfImportedPage bgContent = stamp.getImportedPage(bgReader, 1);
            
           PdfGState blend = new PdfGState();
           blend.setFillOpacity(0.5f);
       //  blend.setStrokeOpacity(0.5f);
       //  blend.setBlendMode(PdfGState.BM_DARKEN);
           for (int i = 1; i <= baseReader.getNumberOfPages(); i++)
           {
               Rectangle pageSize;
               PdfContentByte content = stamp.getOverContent(i);
                 
                
           //  content.setGState(blend);
                
               pageSize = baseReader.getCropBox(i);
               pageSize.rotate();
                
               float hScale = pageSize.getWidth() / bgSize.getWidth(),
                     vScale = pageSize.getHeight() / bgSize.getHeight(),
                     dominantScale = hScale < vScale ? hScale : vScale,
                     hTrans = (float) (pageSize.getLeft() - bgSize.getLeft() * dominantScale + (pageSize.getWidth() - bgSize.getWidth() * dominantScale) / 2.0),
                     vTrans = (float) (pageSize.getBottom() - bgSize.getBottom() * dominantScale + (pageSize.getHeight() - bgSize.getHeight() * dominantScale) / 2.0);
                
                
               content.addTemplate(bgContent, dominantScale, 0, 0, dominantScale, hTrans, vTrans);
                
           }
            
           stamp.close();
            
            
       }

    This code gives an error when I try to use it. Can anybody find the error and paste the updated code?

  39. I was unable to find a solution on the internet.
    But finally I found a trick to create a single PDF file with different page sizes.
    The code below is a sample that achieves this (it is not the full code).

    public static void main(String args[]) {

        Map rateDetails = new HashMap();
        rateDetails.put("2M", "256");
        rateDetails.put("3M", "512");
        rateDetails.put("7M", "450");

        Document document1 = new Document();
        try {
            PDFRateCardBuilder2 ref = new PDFRateCardBuilder2();

            PdfWriter.getInstance(document1, new FileOutputStream(FILE1));
            document1.setPageSize(new Rectangle(1000, 1000));
            document1.setMarginMirroring(true);
            document1.setMargins(50, 50, 50, 50);
            document1.open();
            document1.add(ref.createFirstTable());
            document1.add(ref.createSecondTable());
            document1.add(ref.createThirdTable());
            document1.add(ref.getStaticText());
            document1.newPage();

            document1.setPageSize(new Rectangle(3000,
                    600 + (new PDFRateCardBuilder2().getDummyCustomerDTOs().length) * 100));
            document1.setMarginMirroring(true);
            document1.setMargins(50, 50, 50, 50);
            document1.open();
            document1.add(ref.createFirstTableForPage2());
            document1.add(ref.createSecondTable());
            document1.add(ref.createThirdTable());
            document1.add(ref.getStaticText());
            document1.close();

            System.out.println("Hi");

            Runtime.getRuntime().exec(
                    "rundll32 url.dll,FileProtocolHandler " + FILE1);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

  40. Java Programs says:

    The code is working fine for splitting pdf files. Thanks for posting.

  41. Archana says:

    I want to generate a hash for the PDF files after splitting them. But if I run the same file 15 times, I get a different output every time. Why? Code:

    import java.io.FileInputStream;
    import java.io.IOException;
    import java.security.MessageDigest;
    import java.security.NoSuchAlgorithmException;
    import java.util.logging.Level;
    import java.util.logging.Logger;

    /**
     * @author admin
     */
    public class FingerprintGenerator {

        public FingerprintGenerator() {
        }

        public String getFingerprint(String file_name) {
            String fingerprint = null;
            // try-with-resources closes the stream even when reading fails
            try (FileInputStream fis = new FileInputStream(file_name)) {
                MessageDigest md = MessageDigest.getInstance("SHA-256");
                byte[] dataBytes = new byte[2048];
                int nread;
                while ((nread = fis.read(dataBytes)) != -1) {
                    md.update(dataBytes, 0, nread);
                }
                byte[] mdbytes = md.digest();
                StringBuilder hexString = new StringBuilder();
                for (int i = 0; i < mdbytes.length; i++) {
                    // %02x keeps the leading zero of bytes below 0x10
                    hexString.append(String.format("%02x", 0xFF & mdbytes[i]));
                }
                fingerprint = hexString.toString();
            } catch (IOException ex) {
                // also covers FileNotFoundException
                Logger.getLogger(FingerprintGenerator.class.getName()).log(Level.SEVERE, null, ex);
            } catch (NoSuchAlgorithmException ex) {
                Logger.getLogger(FingerprintGenerator.class.getName()).log(Level.SEVERE, null, ex);
            }
            return fingerprint;
        }
    }
    Please reply
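One detail worth checking when turning digest bytes into a hex string is zero padding: `Integer.toHexString` drops the leading zero of any byte below 0x10, so two different byte sequences can produce the same text. The sketch below, with the hypothetical class name `HexFingerprint`, contrasts an unpadded and a padded conversion using only the standard library. This is separate from the question of why the hash changes between runs. A likely cause there, stated as an assumption rather than a diagnosis, is that each run writes fresh metadata such as a creation date and document ID into the split file, so the file bytes really do differ.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class HexFingerprint {

    // Unpadded conversion: bytes below 0x10 lose their leading zero,
    // so the result can be ambiguous.
    static String unpaddedHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(Integer.toHexString(0xFF & b));
        }
        return sb.toString();
    }

    // Padded conversion: always two hex digits per byte.
    static String paddedHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", 0xFF & b));
        }
        return sb.toString();
    }

    // SHA-256 of the given data as a padded, 64-character hex string.
    static String sha256Hex(byte[] data) throws NoSuchAlgorithmException {
        return paddedHex(MessageDigest.getInstance("SHA-256").digest(data));
    }
}
```

For example, the byte pairs {0x01, 0x10} and {0x11, 0x00} both come out as "110" without padding, but as "0110" and "1100" with it.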

  42. Pankaj says:

    Hi Viral,
    Thanks for the nice tutorial on creating and splitting PDF files. I have a question: how do I save the generated PDF file inside my project folder?

    thanks

  43. Srinivasan says:

    Hi Viral,
    I would like to create a number of bar charts and save them in a single PDF. Can you help me out by extending your example to bar and pie charts?

  44. Tanuja says:

    Viral, how do I extract a part of a single PDF page and add it to different pages of a PDF?

  45. honey says:

    I tried the concatPDFs code. It concatenates two files OK. But, it does not copy the signatures that were in the files. It also does not copy text data fields.

  46. Angie says:

    Hello Viral,

    I have the same problem as Vel Rajagounder.
    When I try to split my source PDF file, which has HTML hyperlinks, those hyperlinks don't work in the split PDF.
    Do you have the solution?
    Thanks for your help
    Vel

  47. shankar says:

    Hi Viral,
    I have a requirement where I receive a batch of data (sometimes only a small amount) as a String and add it to a new PDF page with
    doc.newPage(),
    but I want to split that data across a number of pages dynamically, based on the length or size of the String (which comes from the database).
    Please help me with this as soon as possible.

    Thanks in advance

  48. Mahesh says:

    Hi Viral, Thanks, Nice Code working as expected.

  49. Michele says:

    I have a big problem: I have to merge two or more PDFs that have AcroFields, but using your solution I lose all the AcroFields. Any suggestions?

  50. ramas says:

    Hi Viral, good tutorial on "Merge PDF files in Java using iText JAR". It works fine for me if I need to add all pages of second.pdf at the end of first.pdf.

    How can I add second.pdf at a particular page number of first.pdf? Please give me directions. Thanks.

  51. ayat baloach says:

    Hey, how do I use com.lowagie.text.pdf.PdfImportedPage? It shows an error in NetBeans. Is it related to the iText JAR? Please respond. Thanks.

  52. hamed says:

    Hello,
    good day.
    I want to create a site that converts Persian PDFs to Word.
    I want to do it with Java.
    I'm having a problem reading the PDF:
    it works correctly for English, but Persian text is a problem.
    Thank you.
