Build Identifier: M20080221-1800

I have been using BIRT 2.6 for report generation, but the Excel files it generates are huge (160 MB, 50 MB). When I opened one of the Excel files in the EditPlus text editor, I noticed that more than 80% of the file size is taken up by style tags. Whenever a cell's data type is number, the style ID is incremented by 1 or 2 and a new style tag is created with that ID. Normally a column's cells should share one style across all rows, but that is not happening. The result is duplicated style information and a huge file size. Please find the attached Excel sheet for reference.

Reproducible: Always

Steps to Reproduce:
1. Have at least one column with data type number.
2. Generate the report using the Excel emitter (try with a large data set).
3. Check the styles associated with the columns (cells) across rows.
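The check described in the report (looking at how many style tags the file contains) can be roughly automated. The following is only a sketch: the class name `StyleTagCounter` is hypothetical, it assumes the output is SpreadsheetML text with `<Style ...>` or `<ss:Style ...>` elements, and it works on an in-memory string, so a file of 160 MB would need to be processed in chunks instead.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of BIRT: counts opening style tags in SpreadsheetML text.
final class StyleTagCounter {
    // Matches "<Style" or "<ss:Style" followed by a word boundary,
    // so "<Styles>" and "</Style>" are not counted.
    private static final Pattern STYLE_OPEN = Pattern.compile("<(?:ss:)?Style\\b");

    static int countStyleTags(String xml) {
        Matcher m = STYLE_OPEN.matcher(xml);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String sample = "<Styles>"
                + "<Style ss:ID=\"20\"/>"
                + "<Style ss:ID=\"21\"/>"
                + "<Style ss:ID=\"22\"/>"
                + "</Styles>";
        System.out.println(countStyleTags(sample)); // prints 3
    }
}
```

A style count that grows with the number of rows, rather than with the number of distinct formats, would be consistent with the duplication described above.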
Created attachment 200044 [details]
My solution for the big files problem

This is my solution for this problem. With this change, the files are 70% smaller. I hope this solution helps.
Hi Roland,

Thanks for your help. The fix was checked in on 2011/07/04 into 2.6.2 and head. The cause was that NumberFormatValue did not generate a correct hash code. As your fix is in binary form, I'm not sure whether your solution is the same. I hope you find the latest code working.
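To illustrate why an incorrect hash code can lead to duplicated styles, here is a minimal sketch. It is not the BIRT code: `FormatKey` and `StyleIdCache` are hypothetical stand-ins, and the assumption is that style IDs are looked up in a hash-based map keyed by the format value. If `hashCode()` is not consistent with `equals()`, equal keys can miss each other in the map and each lookup can allocate a new ID.

```java
import java.math.RoundingMode;
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical stand-in for a number-format value; not the BIRT class.
final class FormatKey {
    private final String format;
    private final int fractionDigits;
    private final RoundingMode roundingMode;

    FormatKey(String format, int fractionDigits, RoundingMode roundingMode) {
        this.format = format;
        this.fractionDigits = fractionDigits;
        this.roundingMode = roundingMode;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }
        if (!(obj instanceof FormatKey)) {
            return false;
        }
        FormatKey o = (FormatKey) obj;
        return fractionDigits == o.fractionDigits
                && Objects.equals(format, o.format)
                && roundingMode == o.roundingMode;
    }

    // Built from the same fields as equals(), so equal keys share a hash code.
    @Override
    public int hashCode() {
        return Objects.hash(format, fractionDigits, roundingMode);
    }
}

// Hypothetical cache that hands out one style ID per distinct format.
final class StyleIdCache {
    private final Map<FormatKey, Integer> ids = new HashMap<>();

    int idFor(FormatKey key) {
        Integer id = ids.get(key);
        if (id == null) {
            id = ids.size() + 1;
            ids.put(key, id);
        }
        return id;
    }

    int size() {
        return ids.size();
    }
}
```

With a consistent `hashCode()`, two cells that use the same format resolve to the same ID, so the cache stays as small as the number of distinct formats.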
Created attachment 200136 [details]
source code without libs

Hi Raghava,

The solution is partly that, but the generated files were still very large, so I added a filter to prevent repeated styles. I have attached the source code for you to look at. I modified the following files:

ExcelLayoutEngine.java (line 765): new method "private int getOtherStyleId(int style)", used in "protected void OutputData(Page page, SheetData data, int start, int span)"

NumberFormatValue.java (line 102):

    public boolean equals(Object obj) {
        if (obj instanceof NumberFormatValue) {
            NumberFormatValue o = (NumberFormatValue) obj;
            boolean equals = getFormat() != null
                    && getFormat().equals(o.getFormat())
                    && getFractionDigits() == o.getFractionDigits();
            if (equals) {
                if (getRoundingMode() == null && o.getRoundingMode() == null) {
                    return true;
                }
                if (getRoundingMode() != null
                        && getRoundingMode().equals(o.getRoundingMode())) {
                    return true;
                }
            }
        }
        return false;
    }
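The filter itself is only in the attachment, so the following is a guess at the general idea rather than the attached code. It assumes each style can be reduced to a content string and that a raw style ID is remapped to the first ID registered with the same content; `StyleDeduplicator` and its methods are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a repeated-style filter; not the code from the attachment.
final class StyleDeduplicator {
    // First ID seen for each distinct style content.
    private final Map<String, Integer> canonicalByContent = new HashMap<>();
    // Mapping from every registered ID to the ID that should be written out.
    private final Map<Integer, Integer> canonicalById = new HashMap<>();

    /** Registers a style and returns the ID to use for it in the output. */
    int register(int styleId, String styleContent) {
        Integer canonical = canonicalByContent.get(styleContent);
        if (canonical == null) {
            canonical = styleId;
            canonicalByContent.put(styleContent, canonical);
        }
        canonicalById.put(styleId, canonical);
        return canonical;
    }

    /** Returns the replacement ID for a registered style, or the ID itself if unknown. */
    int canonicalId(int styleId) {
        Integer canonical = canonicalById.get(styleId);
        return canonical == null ? styleId : canonical;
    }

    /** Number of style definitions that actually need to be written. */
    int distinctStyles() {
        return canonicalByContent.size();
    }

    public static void main(String[] args) {
        StyleDeduplicator dedup = new StyleDeduplicator();
        dedup.register(20, "format=#,##0.00");
        dedup.register(22, "format=#,##0.00"); // same content as style 20
        System.out.println(dedup.canonicalId(22)); // prints 20
    }
}
```

In a sketch like this, only the canonical styles would be emitted, and cells would reference the remapped ID when the rows are written.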
Raghava and others,

Is this issue resolved for you? Are you still seeing large Excel files generated by the BIRT Excel emitter? Did the upgrade to 3.6.1 work for you? How big are the files now? How much reduction in size were you able to achieve?
Yes, upgrading to version 3.7.1 solved the problem for me. The file size is now approximately 20 percent of what it was before, although this may depend on how many number formats are used in the report. The bug produced a new Style tag in the SpreadsheetML for every format and row. This led to long loading times in Excel (10 minutes for a 15 MB file). Now loading takes a fraction of a second.
Created attachment 234396 [details]
The jar built on the 2.6.2 release, which includes the patch.

Attaching the jar built on the 2.6.2 release, which includes the patch.
Yu Chen, for the purpose of the source review, can you post the changed source code (e.g. in the form of file differences) as attachment(s)? Thanks.