Bug 340111 - PDF Report Generation Consumes Memory per Page
Summary: PDF Report Generation Consumes Memory per Page
Status: VERIFIED FIXED
Alias: None
Product: z_Archived
Classification: Eclipse Foundation
Component: BIRT
Version: 2.6.1
Hardware: All All
Importance: P3 major with 22 votes
Target Milestone: 3.7.0 RC2
Assignee: Birt-ReportEngine-inbox@eclipse.org
QA Contact: Xiaoying Gu
Whiteboard: Obsolete
Reported: 2011-03-16 01:15 EDT by Scott Hamilton
Modified: 2011-05-19 23:20 EDT
Description Scott Hamilton 2011-03-16 01:15:26 EDT
Build Identifier: 2.6.1

When a report is generated with the standard PDFEmitter, memory appears to be retained for each page (or maybe each row?) of the report.  At first I thought this was caused by the PDFRender bookmarks, similar to bug #340109 for the HTMLReportEmitter, but even when I copy and customize this emitter/renderer so that it does not use or store bookmarks, memory consumption is still huge for a large report.

Additional memory profiling suggests this is because the iText PdfBody class keeps accumulating cross-reference entries so that (at the end?) it can write its cross-reference table.

The downside is that this data stays on the heap for the entire run, and for a report with a huge number of rows it consumes a ton of memory.

I'm not familiar enough with the PDF format to know if this xref data is required, but if it is not, having the option to configure the PDFEmitter not to create this would be great.

For PDF reports, a possible workaround would be to make the maximum number of pages that a PDFEmitter generates configurable. If this maximum were exceeded, the emitter could throw a specific exception, letting us close out the report (so it still generates a valid PDF) and tell the user that we had to cut the report short.

Alternatively, the emitter could accept an event handler that responds to events such as next page and has the ability to cancel the report.

Bottom line, we need some way to manage the memory for reports that will end up generating many, many pages.  In a practical sense, no one will be able to handle a 1 million page PDF, so we should be able to stop its generation early enough to prevent an out of memory condition on the server.
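
To make the two proposals above concrete, here is a minimal sketch in plain Java. Every name in it (PageLimitSketch, PageListener, PageLimitExceededException, render) is hypothetical and exists only for illustration; none of it is BIRT or iText API, and the render loop is a stand-in for a real emitter.

```java
public class PageLimitSketch {

    /** Hypothetical exception signalling that the page limit was hit. */
    public static class PageLimitExceededException extends RuntimeException {
        public PageLimitExceededException(int maxPages) {
            super("Report truncated at " + maxPages + " pages");
        }
    }

    /** Hypothetical callback; returning false cancels the report. */
    public interface PageListener {
        boolean onNextPage(int pageNumber);
    }

    /** Listener that enforces a maximum page count by throwing. */
    public static PageListener limit(int maxPages) {
        return pageNumber -> {
            if (pageNumber > maxPages) {
                throw new PageLimitExceededException(maxPages);
            }
            return true;
        };
    }

    /** Stand-in for an emitter's render loop; returns the number of pages written. */
    public static int render(int pagesRequested, PageListener listener) {
        int written = 0;
        try {
            for (int page = 1; page <= pagesRequested; page++) {
                if (!listener.onNextPage(page)) {
                    break; // the handler asked to cancel
                }
                // A real emitter would lay out and write the page here.
                written++;
            }
        } catch (PageLimitExceededException e) {
            // Fall through: the caller can tell the user the report was cut short.
        }
        // A real emitter would close the document here so the partial PDF stays valid.
        return written;
    }
}
```

With this shape, the server decides up front how many pages it is willing to produce, and memory use is bounded by that limit rather than by the size of the data set.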

Reproducible: Always

Steps to Reproduce:
Generate a PDF report consisting of 100,000 to 1,000,000 pages.  Watch the heap memory usage of the Java process.
Comment 1 Scott Hamilton 2011-03-16 11:34:44 EDT
Some more thoughts on this...

I don't know if the cross-reference data is strictly necessary, but if it is, the TreeSet that it is stored in (com.lowagie.text.pdf.PdfWriter.PdfBody.xrefs) might be converted to a more memory-efficient structure, e.g. something that overflows to disk after a certain number of elements have been added to the set. I understand that such an operation might be horribly slow by comparison, especially if the ordering of the elements in the set is not consistent with the order in which they are added (don't know - didn't debug it). Another idea would be to build the cross-reference table as elements are added to it, streaming it out to a temporary file, and then just append it to the PDF once done. Again, if the order of the elements is an issue, this could be inefficient.
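
As a rough illustration of the temporary-file idea, the sketch below keeps cross-reference entries on disk instead of on the heap. It relies on the fact that an entry in a classic PDF cross-reference table is always 20 bytes wide, so the entry for object number N can be written at byte N * 20 and the order in which objects arrive does not matter. The class DiskXrefSketch and its methods are hypothetical and not part of iText; the sketch also ignores details such as the free-list entry for object 0, and any slot that was never written simply reads back as zero bytes.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Locale;

/** Hypothetical disk-backed cross-reference store; not an iText class. */
public class DiskXrefSketch implements AutoCloseable {
    // Fixed width of one entry in a classic PDF cross-reference table.
    private static final int ENTRY_SIZE = 20;

    private final Path tempFile;
    private final RandomAccessFile file;

    public DiskXrefSketch() throws IOException {
        tempFile = Files.createTempFile("xref", ".tmp");
        file = new RandomAccessFile(tempFile.toFile(), "rw");
    }

    /** Records the byte offset of an in-use object; insertion order does not matter. */
    public void put(int objectNumber, long byteOffset, int generation) throws IOException {
        // 10-digit offset, space, 5-digit generation, space, 'n', space, newline = 20 bytes.
        String entry = String.format(Locale.ROOT, "%010d %05d n \n", byteOffset, generation);
        file.seek((long) objectNumber * ENTRY_SIZE);
        file.write(entry.getBytes(StandardCharsets.US_ASCII));
    }

    /** Reads one entry back, e.g. while copying the table to the end of the PDF. */
    public String get(int objectNumber) throws IOException {
        byte[] buf = new byte[ENTRY_SIZE];
        file.seek((long) objectNumber * ENTRY_SIZE);
        file.readFully(buf);
        return new String(buf, StandardCharsets.US_ASCII);
    }

    /** Closes and deletes the temporary file. */
    @Override
    public void close() throws IOException {
        file.close();
        Files.deleteIfExists(tempFile);
    }
}
```

Heap use stays constant regardless of the number of objects, at the cost of one seek and one small write per object; whether that trade-off is acceptable would need measuring against the in-memory TreeSet.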

I also understand this is getting into a change to iText itself - assuming the community reaches a viable solution to suggest/contribute back to them, perhaps that is the way to resolve this bug.
Comment 2 Gang Liu 2011-05-18 05:42:43 EDT
Fixed.

Added the PDF render option IPDFRenderOption.PDF_PAGE_LIMIT; users can set this render option to limit the PDF page count.
Comment 3 Scott Hamilton 2011-05-18 10:42:18 EDT
Can you point me to where in the CVS repository this fix might be so I can back-port it for our version?

Thanks!
Comment 4 Xiaoying Gu 2011-05-19 23:20:46 EDT
Verified in BIRT report engine 3.7.0.v20110520-0630 that the PDF render option IPDFRenderOption.PDF_PAGE_LIMIT works.