JavaRanch » Java Forums » Engineering » XML and Related Technologies
XSL-FO Multiple Page Sequences

Kamohelo Mofokeng
Greenhorn

Joined: Oct 07, 2009
Posts: 8
Hi All,

I'm trying to implement a solution that creates multiple page sequences (following this thread: http://www.coderanch.com/t/125494/XML/Mulitple-fo-page-sequence) in my XSL-FO file. I get a java.lang.OutOfMemoryError when I transform large data sets.

Although I increased -Xms/-Xmx (via java_arguments) to 2 GB, the problem still persists.

Here's how my XSL file looks:


The PROVIDER template returns a lot of data, and this is where FOP runs out of memory.

And the input XML file looks as follows:


I appreciate all the help.

THANK YOU


_______________________________________

SCJP 6
g tsuji
Ranch Hand

Joined: Jan 18, 2011
Posts: 511
>in my xsl-fo file. I get a java.lang.OutOfMemoryError when I transform large data.
I take it to mean there is no OutOfMemoryError when you transform a smaller set of data? If that has not been checked yet, I suppose the first thing to do is to test the templates against a smaller data set.

More particularly, if it is already established that PROVIDER is where the problem arises, you have to load a normal data set and a smaller data set with
ROOT/MEM-STMT/COUNTRY ='MU'
and
ROOT/MEM-STMT/FIN-BAL/FIN-BAMT !='0'
satisfied.
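For instance (a hypothetical fragment, with the element names taken from the paths above), the restriction could be expressed in the stylesheet as:

```xml
<!-- Hypothetical test harness: process only the subset of statements
     matching the two conditions above. -->
<xsl:apply-templates
    select="ROOT/MEM-STMT[COUNTRY = 'MU'][FIN-BAL/FIN-BAMT != '0']"/>
```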

If a smaller data set still provokes the out-of-memory error, you have to look at the named template PROVIDER. It is easy to produce an out-of-memory error if, for instance, somewhere somehow there is a recursive call... In that case the source of the trouble is the algorithm rather than the data set.

If you are satisfied that there is no recursion or algorithmic problem, and that a smaller data set for the PROVIDER call is free of memory errors, you can then vary the strategy for preparing the Source for the transformation. A DOMSource has the bigger memory footprint; if you are using one, change the code to feed a SAXSource to the transformer instead...
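As a sketch of that suggestion (JDK-only; the identity transform here stands in for the real stylesheet), the input can be fed to the transformer as a SAXSource built over a streaming parse:

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.InputSource;

public class SaxSourceDemo {
    // Run a transform with the input supplied as a SAXSource: the parser
    // streams events to the transformer instead of the caller building a
    // whole DOM tree up front. (Identity transform used here as a stand-in
    // for the real stylesheet.)
    static String transform(String xml) {
        try {
            Transformer t = TransformerFactory.newInstance().newTransformer();
            SAXSource source = new SAXSource(new InputSource(new StringReader(xml)));
            StringWriter out = new StringWriter();
            t.transform(source, new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform("<ROOT><MEM-STMT/></ROOT>"));
    }
}
```

The point of the change is that no separate in-memory DOM copy of the input sits alongside the processor's own working representation.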

Those are what I would look into...
Kamohelo Mofokeng
Greenhorn

Joined: Oct 07, 2009
Posts: 8
Hi g tsuji,

Thanks for the response - much appreciated.

>>> I take it to mean there is no OutOfMemoryError when you transform a smaller set of data?

Yes, the transformation works fine for small data sets - the memory problem arises when dealing with large data sets.

>>> A DOMSource has the bigger memory footprint; if you are using one, change the code to feed a SAXSource to the transformer instead...

I'm in the process of changing my code to make use of DOMSource; currently it looks as follows:


Any ideas how to go about it? I was thinking of:


THANKS

g tsuji
Ranch Hand

Joined: Jan 18, 2011
Posts: 511
I could sketch a well stripped-down layout of what should appear in the code. Nevertheless, I would prefer to dissect the process into two stages: first outputting the fo, and then processing the fo. I think you can handle the second part adequately. The first part can look something like the code below. With this, one can get a good idea of which part eats up memory out of control.

If that remains within the capacity of the available memory, you can then start looking into how memory-hungry fop itself is.
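A minimal sketch of that first stage (the inline stylesheet below is a stand-in for the real one): run the XSLT on its own and capture the fo as plain text, so the transform's memory use can be observed separately from FOP's.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class FoStageDemo {
    // A trivial stand-in stylesheet (the real one is assumed elsewhere):
    // it just wraps the input's text content in an fo:root.
    static final String XSL =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
      + " xmlns:fo='http://www.w3.org/1999/XSL/Format'>"
      + "<xsl:template match='/'>"
      + "<fo:root><xsl:value-of select='.'/></fo:root>"
      + "</xsl:template></xsl:stylesheet>";

    // Stage 1 only: run the XSLT and capture the fo output as text.
    // In the real pipeline the StreamResult would target a .fo file,
    // which stage 2 (FOP) would then read.
    static String toFo(String xml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSL)));
            StringWriter fo = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)),
                        new StreamResult(fo));
            return fo.toString();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toFo("<ROOT>statement data</ROOT>"));
    }
}
```

If stage 1 alone already blows the heap, the XSLT is the culprit; if not, the saved .fo can be handed to FOP separately to isolate its consumption.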
Kamohelo Mofokeng
Greenhorn

Joined: Oct 07, 2009
Posts: 8
Thanks g tsuji, I really appreciate the help

I have changed my code as follows:

Three scenarios play out when I run this code.

When I use:

I run out of memory and the PDF file is not created (for large data sets).

When I use:

the PDF file is not created (for large data sets) and I get this exception:

When I use:

a PDF file is created, but when I try to open it I get the error:
There was an error opening this document. The file is damaged and could not be repaired.

What am I missing here?

THANKS
g tsuji
Ranch Hand

Joined: Jan 18, 2011
Posts: 511
I read the code with some difficulty following the logic, and I do not necessarily agree with some of the constructions (the out passed to BufferedOutputStream(out)...). Some of the code's progression lost me completely, and I may well be reading it wrong. But since the code is not mine, I may be excused for not giving it enough thought to understand it. So I have decided not to annotate it line by line with comments, but only to offer a wholesale modification below.

This is a version in which I have changed and moved code lines around. Watch carefully, including the ordering of the lines that successively establish the objects. I have not invented any new variable names; they are all yours.
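The ordering matters because the Fop instance must be handed its output stream before the transform runs and writes into it. A sketch of the standard FOP embedding pattern (variable names assumed; requires Apache FOP on the classpath, so this is illustration only):

```java
// Sketch only -- FopFactory, Fop, MimeConstants come from Apache FOP.
FopFactory fopFactory = FopFactory.newInstance();
OutputStream out = new BufferedOutputStream(
        new FileOutputStream(new File("statement.pdf"))); // created BEFORE the transform
try {
    Fop fop = fopFactory.newFop(MimeConstants.MIME_PDF, out);
    Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new File("statement.xsl")));
    // FOP consumes the transformation result directly as SAX events:
    transformer.transform(new StreamSource(new File("data.xml")),
                          new SAXResult(fop.getDefaultHandler()));
} finally {
    out.close(); // close only after the transform has finished
}
```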
Kamohelo Mofokeng
Greenhorn

Joined: Oct 07, 2009
Posts: 8
Hi g tsuji,

Please forgive my not-so-neat code - I've been chopping and changing while trying to get to a solution. Interestingly though, this is the code that's been working all along, and it still works for smaller data sets.

Anyway, I've made the changes according to the code you provided. I see that the major difference in your rearranged code is that you create both the file and bos objects before the transformation, whereas I created them after the transformation.

With these changes in place I still get OutOfMemoryError, stack trace:

I think the problem lies with this line:
I don't know why that is or how to move past it.

THANKS
g tsuji
Ranch Hand

Joined: Jan 18, 2011
Posts: 511
[0] It is unavoidable that the FOP engine consumes heap space as it processes. There are notes out there suggesting that a page-sequence consumes heap space without releasing it... until processing switches to another one. Hence the piece of urban wisdom that says you should try not to use the same page-sequence master for a whole lot of pages. This is worth considering: maybe you can alternate, say, one fo:page-sequence for even pages and another, identical in terms of fo:flow and the rest, for odd pages, etc... I say this taking into consideration that you have already pushed -Xms and -Xmx up into the gigabytes...
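Sketched in fo terms (a hypothetical fragment; the master name and content are invented), the idea is to close one fo:page-sequence and open another at regular intervals, e.g. one per statement, since FOP can finish and release a sequence before starting the next:

```xml
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <fo:layout-master-set>
    <fo:simple-page-master master-name="stmt-page"
        page-height="29.7cm" page-width="21cm" margin="2cm">
      <fo:region-body/>
    </fo:simple-page-master>
  </fo:layout-master-set>
  <!-- One fo:page-sequence per statement, instead of one enormous
       sequence spanning the whole document. -->
  <fo:page-sequence master-reference="stmt-page">
    <fo:flow flow-name="xsl-region-body">
      <fo:block>Statement 1 ...</fo:block>
    </fo:flow>
  </fo:page-sequence>
  <fo:page-sequence master-reference="stmt-page">
    <fo:flow flow-name="xsl-region-body">
      <fo:block>Statement 2 ...</fo:block>
    </fo:flow>
  </fo:page-sequence>
</fo:root>
```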

[1] I have tried to emulate the problem with 2K+ pages in a single fo:page-sequence, and FOP could handle it with -Xms on the order of 512m. But the page was relatively simple, without graphics or user-supplied fonts. It is conceivable that page content keeps growing until it breaks every possible resource limit of the machine... Hence, maybe the memory model of FOP will be redesigned one day by those in that specialized field.

[2] If you are using fop-1.0 (it seems not to be available in fop-0.95), there is a ConserveMemoryPolicy setting for FOUserAgent, something like this:

You can try whether that serves any purpose. My testing on fop-1.0 showed no noticeable effect memory-wise in this regard.
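In code, the setting in question looks roughly like this (FOP 1.0 API; the surrounding variable names are placeholders, and the snippet assumes Apache FOP on the classpath):

```java
// Sketch only -- requires Apache FOP 1.0+.
FopFactory fopFactory = FopFactory.newInstance();
FOUserAgent userAgent = fopFactory.newFOUserAgent();
userAgent.setConserveMemoryPolicy(true); // trade some speed for a smaller footprint
Fop fop = fopFactory.newFop(MimeConstants.MIME_PDF, userAgent, out);
```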

[3] It is a certainty that however large a resource pool FOP can handle, it can be exhausted by an arbitrarily large data set and fo document. Hence, I am not able to help beyond this physical limit.
Kamohelo Mofokeng
Greenhorn

Joined: Oct 07, 2009
Posts: 8
>>> [0] It is unavoidable that the FOP engine consumes heap space as it processes. There are notes out there suggesting that a page-sequence consumes heap space without releasing it... until processing switches to another one. Hence the piece of urban wisdom that says you should try not to use the same page-sequence master for a whole lot of pages. This is worth considering: maybe you can alternate, say, one fo:page-sequence for even pages and another, identical in terms of fo:flow and the rest, for odd pages, etc... I say this taking into consideration that you have already pushed -Xms and -Xmx up into the gigabytes...


I honestly thought this was a better option after I investigated all the other alternatives. I was also following this example: http://www.scriptorium.com/whitepapers/xslfo/xslfo_4.html but I couldn't get it to work in my environment. At this point, I have no preference over which alternative I implement, as long as it works.

>>> [1] I have tried to emulate the problem with 2K+ pages in a single fo:page-sequence, and FOP could handle it with -Xms on the order of 512m. But the page was relatively simple, without graphics or user-supplied fonts. It is conceivable that page content keeps growing until it breaks every possible resource limit of the machine... Hence, maybe the memory model of FOP will be redesigned one day by those in that specialized field.

If I can generate 1K+ pages, that would be ideal. As it is, I'm struggling to generate 500 pages, mainly because of graphics/PNG images (max. size 231 KB) and user-defined fonts. Is there a way I can change the fonts to make the transformation less resource intensive? The problem is I don't have control over the stylesheet format.


>>> [2] If you are using fop-1.0 (it seems not to be available in fop-0.95), there is a ConserveMemoryPolicy setting for FOUserAgent, something like this:

I'm using fop-0.93, but I think this approach is worth checking out. I'll upgrade and test - hopefully this works out.

>>> [3] It is a certainty that however large a resource pool FOP can handle, it can be exhausted by an arbitrarily large data set and fo document. Hence, I am not able to help beyond this physical limit.

I'm hoping it won't come to this - I don't have any other viable alternatives.
 