I'm writing an app that allows users to select a stylesheet and a group of XML files to run it on. It works properly, but in some cases, when I have a very large file, the app just hangs.
In other apps I've gotten a heap space error, or something similar when the app has too many files open, but this one doesn't throw any error at all; it just hangs there.
Any idea how to approach that problem?
Keep in mind that your XSLT engine is most likely using a DOM tree internally. This is the traditional implementation, although there might be a few newer, prototype implementations that don't use DOM. What does this mean? Building the DOM for a large file takes up a considerable amount of the JRE's heap memory. One popular BI tool, for example, fails XML processing on a 700 MB instance document. In these cases the SAX API is the best choice for processing XML, because it streams the document instead of holding it all in memory. I don't know what alternatives there are for XSLT processing, though.
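To illustrate the difference, here is a minimal sketch of SAX-style streaming with the JDK's built-in parser. It is not an XSLT replacement; it only shows that a handler can see every element without a tree being built. The class name `SaxElementCounter` and the sample document are made up for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class SaxElementCounter {

    /** Counts start tags with the given qualified name without building a tree. */
    public static long count(InputStream in, String elementName) throws Exception {
        final long[] counter = {0};
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(in, new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs) {
                // Called once per opening tag as the parser streams through the input.
                if (elementName.equals(qName)) {
                    counter[0]++;
                }
            }
        });
        return counter[0];
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample document; a real run would pass a FileInputStream.
        String xml = "<orders><order id=\"1\"/><order id=\"2\"/></orders>";
        InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        System.out.println("order elements: " + count(in, "order"));
    }
}
```

Memory use here stays roughly flat as the input grows, because only the current parsing event is held at any time. A DOM-based tool would hold the whole document.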
It sounds like you are hitting a memory-related DOM processing fault, but haven't completely written the error/exception handling code in your application, so the failure never surfaces. NetBeans IDE gives you two tools that work pretty well for this: the profiler built into the IDE, and JConsole, which can be launched separately.
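As a rough sketch of what that error handling could look like, the following wraps a `javax.xml.transform` run so that both transformer failures and `OutOfMemoryError` are reported instead of disappearing. The class name `SafeTransform` and the "FAILED:" message format are assumptions for illustration; your application would report the problem in its own way (a dialog, a log entry, and so on).

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class SafeTransform {

    /** Returns the transformed text, or a message describing why the run failed. */
    public static String run(String xslt, String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            // Covers bad stylesheets as well as failures during the transform.
            return "FAILED: " + e.getMessageAndLocation();
        } catch (OutOfMemoryError e) {
            // An Error is not an Exception, so "catch (Exception e)" would miss it.
            return "FAILED: out of heap space";
        }
    }

    public static void main(String[] args) {
        String xslt =
            "<xsl:stylesheet version=\"1.0\" "
          + "xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
          + "<xsl:output method=\"text\"/>"
          + "<xsl:template match=\"/\"><xsl:value-of select=\"/greeting\"/></xsl:template>"
          + "</xsl:stylesheet>";
        System.out.println(run(xslt, "<greeting>hello</greeting>"));
    }
}
```

If the transform runs on a background thread, an uncaught error there can kill that thread silently while the UI keeps waiting, which would look exactly like a hang. Catching it at the top of the worker and reporting it makes the failure visible.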
JConsole is a monitoring and profiling application that ships with the JDK. I call it from within NetBeans IDE. You could get a heap dump this way. The profiler built into NetBeans IDE gives a bit more data than just a heap dump: there is a graph screen that shows the memory usage within the JRE as the application is executing.
Increasing the heap memory of the JRE helped with the files you are working with today. Keep in mind that if the file sizes grow enough, you will hit the same memory problem again.
The -Xmx argument is an option for the JRE itself. There is no way to include the argument in the JAR file that I know of. If you have a BAT file which executes the JAR, then you could add the argument to the command line in the BAT file.
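One way to confirm that the setting actually reached the JRE is to have the program print its own heap limit. This is a small sketch; the class name `HeapInfo` is made up for the example.

```java
public class HeapInfo {

    /** Maximum heap the JVM will attempt to use, in MiB. */
    public static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + maxHeapMiB() + " MiB");
    }
}
```

Running it as `java -Xmx1g HeapInfo` and then without the option should show two different limits. The reported value is typically a little below the requested size, depending on the JVM and collector.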
Also, there are some downsides to increasing memory that you should be aware of. Since you have more space for "garbage objects", the intervals between garbage collection cycles usually get longer, and then there is more garbage to collect, so each collection takes more time as well. While the JRE is collecting garbage, your application is paused. This might not matter much for a standalone application executed from the command line, but it is a significant issue when increasing the memory of a Java application server. There are other related JRE settings that usually go with the one you used, involving the generations of objects. That is something to research in the future.
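If you want a rough number for how much time goes to garbage collection before and after changing the heap size, the standard `java.lang.management` API exposes per-collector totals. This is only a sketch (the class name `GcStats` is made up), and the reported time is accumulated collection time, which is not necessarily the same as pause time for collectors that do part of their work concurrently.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcStats {

    /** Sum of accumulated collection time across all collectors, in milliseconds. */
    public static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 if the collector does not report it
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": " + gc.getCollectionCount()
                    + " collections, " + gc.getCollectionTime() + " ms");
        }
        System.out.println("Total GC time: " + totalGcMillis() + " ms");
    }
}
```

Calling `totalGcMillis()` at the end of a large transform, once with the default heap and once with a larger -Xmx, gives a simple comparison without attaching a profiler.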