XML Parsing and Validation with SAX parser shoots CPU to 100% utilization

RajaVivekanandhan SA
Greenhorn

Joined: Sep 14, 2005
Posts: 4
I am using the Xerces implementation Xerces-J_2_3_0. I need to validate as well as parse an XML document. What I observe is that, irrespective of the size of the input XML and schema (.xsd), parsing and validating with the SAX parser shoots the CPU to 100% utilization. We tried setting the parser's input buffer size like this:
parser.setProperty("http://apache.org/xml/properties/input-buffer-size", new Integer(2));
but it doesn't make a significant difference.
Here is my code snippet for parsing the XML.

public boolean validateResource(String inputXML, InputStream schemaStream) throws ClientApplicationDataProcessingException {
    if (logger.isInfoEnabled()) {
        logger.info("Validating Resource");
    }
    boolean xmlValid = true;
    if ((inputXML == null) || ("".equals(inputXML.trim()))) {
        xmlValid = false;
        return xmlValid;
    }
    Reader inputXMLReader = null;
    SAXParserFactory factory = SAXParserFactory.newInstance();
    factory.setNamespaceAware(true);
    factory.setValidating(true);
    SAXParser parser = null;
    try {
        inputXMLReader = new StringReader(inputXML);
        parser = factory.newSAXParser();
        parser.setProperty(SCHEMA_LANGUAGE, XML_SCHEMA);
        parser.setProperty(SCHEMA_SOURCE, schemaStream);
        parser.setProperty("http://apache.org/xml/properties/input-buffer-size", new Integer(2));
        parser.parse(new InputSource(inputXMLReader), this);
    } catch (SAXNotRecognizedException _snrex) {
        throw new ClientApplicationDataProcessingException("Sax Parser Not recognised while validating XML", _snrex);
    } catch (ParserConfigurationException _pcex) {
        throw new ClientApplicationDataProcessingException("Parser Configuration exception during validation of XML", _pcex);
    } catch (IOException _ioex) {
        throw new ClientApplicationDataProcessingException("IOException while validating XML", _ioex);
    } catch (SAXNotSupportedException _snsex) {
        throw new ClientApplicationDataProcessingException("SAX Parser does not support this operation during validation of xml", _snsex);
    } catch (SAXParseException _spex) {
        StringBuffer buffer = new StringBuffer();
        buffer.append("XML is not valid as per the Schema ");
        buffer.append("Error occurred at : ");
        buffer.append("Column number = ");
        buffer.append(_spex.getColumnNumber());
        buffer.append("Line Number = ");
        buffer.append(_spex.getLineNumber());
        buffer.append("Error Message = ");
        buffer.append(_spex.getMessage());
        logger.error(buffer.toString(), _spex);
        xmlValid = false;
    } catch (SAXException _sex) {
        throw new ClientApplicationDataProcessingException("SAX Exception while parsing xml", _sex);
    }
    return xmlValid;
}
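
For readers who want to run something similar, here is a minimal, self-contained sketch of the same JAXP validation approach. It is not the poster's class: the name SchemaValidationSketch, the RethrowingHandler, and the tiny SAMPLE_XSD are hypothetical, and the three property strings are the standard JAXP values that the snippet above presumably refers to through its undefined SCHEMA_LANGUAGE, SCHEMA_SOURCE and XML_SCHEMA constants. One assumption worth flagging: DefaultHandler.error() does nothing by default, so this sketch overrides it to rethrow; otherwise validation errors would never reach the SAXParseException catch block.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

public class SchemaValidationSketch {
    // Standard JAXP property names and schema-language value.
    static final String SCHEMA_LANGUAGE = "http://java.sun.com/xml/jaxp/properties/schemaLanguage";
    static final String SCHEMA_SOURCE = "http://java.sun.com/xml/jaxp/properties/schemaSource";
    static final String XML_SCHEMA = "http://www.w3.org/2001/XMLSchema";

    // Hypothetical one-element schema, used only to exercise the sketch.
    static final String SAMPLE_XSD =
        "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
        + "<xs:element name=\"note\" type=\"xs:string\"/></xs:schema>";

    // DefaultHandler ignores validation errors unless error() is overridden.
    static class RethrowingHandler extends DefaultHandler {
        @Override
        public void error(SAXParseException e) throws SAXException {
            throw e;
        }
    }

    static InputStream stream(String text) {
        return new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8));
    }

    // Returns true when the XML is well-formed and valid against the schema.
    static boolean isValid(String xml, InputStream schema) {
        if (xml == null || xml.trim().isEmpty()) {
            return false;
        }
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.setValidating(true);
            SAXParser parser = factory.newSAXParser();
            // The schema language must be set before the schema source.
            parser.setProperty(SCHEMA_LANGUAGE, XML_SCHEMA);
            parser.setProperty(SCHEMA_SOURCE, schema);
            parser.parse(new InputSource(new StringReader(xml)), new RethrowingHandler());
            return true;
        } catch (SAXParseException e) {
            // Malformed or schema-invalid document.
            return false;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            // Configuration or I/O problems are not a verdict on the document.
            throw new IllegalStateException("Validation could not be performed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<note>hello</note>", stream(SAMPLE_XSD)));
        System.out.println(isValid("<other/>", stream(SAMPLE_XSD)));
    }
}
```

The sketch leaves out the input-buffer-size property, since that is a Xerces-specific setting and not needed for validation itself.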

It would be great if anyone could suggest a way to reduce the 100% CPU utilization.

Thanks
Raja
Michael Ernest
High Plains Drifter
Sheriff

Joined: Oct 25, 2000
Posts: 7292

Moving this to the Performance forum.


Make visible what, without you, might perhaps never have been seen.
- Robert Bresson
Ernest Friedman-Hill
author and iconoclast
Marshal

Joined: Jul 08, 2003
Posts: 24187

So what does your SAX handler ("this") do?


[Jess in Action][AskingGoodQuestions]
William Brogden
Author and all-around good cowpoke
Rancher

Joined: Mar 22, 2000
Posts: 12792
Exactly what is your problem? Do you mean the CPU goes to 100% but the process never finishes? Is your event handler getting SAX events?
Why did you think that fiddling with the input buffer size would do anything?
Bill
(I suspect you have coded an infinite loop in one of the event handlers so the code you show has nothing to do with the problem.)
[ September 16, 2005: Message edited by: William Brogden ]
Ilja Preuss
author
Sheriff

Joined: Jul 11, 2001
Posts: 14112
What's wrong with 100% CPU utilization? How much CPU utilization did you expect?


The soul is dyed the color of its thoughts. Think only on those things that are in line with your principles and can bear the light of day. The content of your character is your choice. Day by day, what you do is who you become. Your integrity is your destiny - it is the light that guides your way. - Heraclitus
RajaVivekanandhan SA
Greenhorn

Joined: Sep 14, 2005
Posts: 4
"this" is a DefaultHandler object.

The CPU goes to 100% and the process finishes. But what confuses me is this: does 100% CPU utilization mean the application performs badly? When the CPU goes to 100%, no other process can run simultaneously.

Won't 100% CPU utilization reduce performance (response time increases for simultaneous validations)?
Why does validation and parsing alone use 100% of the CPU?
Ernest Friedman-Hill
author and iconoclast
Marshal

Joined: Jul 08, 2003
Posts: 24187

Any code that actually does something efficiently will make the CPU spike to 100%. If it doesn't, then you're wasting throughput, right?

But the question is, does validating a short XML file with a simple XSD using DefaultHandler really make the CPU go to 100% for a measurable time? No offense, but the only explanation for that is a really pitiful computer. So what kind of machine are we talking about here?
Ilja Preuss
author
Sheriff

Joined: Jul 11, 2001
Posts: 14112
Originally posted by RajaVivekanandhan SA:
When the CPU goes to 100%, no other process can run simultaneously.


No, that's not true. The process just uses all the CPU power it is allowed to use, and when no other process needs the CPU, that is 100%. When another process needs CPU time too, the operating system should simply give it some. Your validation will then be slower, of course.
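
One way to see this distinction for yourself is to compare the CPU time a thread consumes with the wall-clock time it takes. The following is only a sketch: the class name CpuVersusWallClockSketch and the busyWork loop (a stand-in for parsing) are hypothetical, while ThreadMXBean is the standard java.lang.management API. On an idle machine the two figures should be close; when other processes compete for the CPU, the wall-clock figure grows while the CPU time stays roughly the same.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

public class CpuVersusWallClockSketch {
    // Hypothetical CPU-bound stand-in for a parse; returns a value so the loop is not optimized away.
    static long busyWork(int iterations) {
        long acc = 0;
        for (int i = 0; i < iterations; i++) {
            acc += (acc * 31 + i) % 7;
        }
        return acc;
    }

    // Returns {cpuNanos, wallNanos}; cpuNanos is -1 when the JVM cannot measure thread CPU time.
    static long[] measure(int iterations) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        boolean supported = threads.isCurrentThreadCpuTimeSupported();
        long cpuStart = supported ? threads.getCurrentThreadCpuTime() : -1;
        long wallStart = System.nanoTime();

        long result = busyWork(iterations);

        long wallNanos = System.nanoTime() - wallStart;
        long cpuNanos = -1;
        if (supported && cpuStart >= 0) {
            cpuNanos = threads.getCurrentThreadCpuTime() - cpuStart;
        }
        if (result < 0) {
            // Never true for this loop; the check just keeps the result in use.
            System.out.println("unexpected result " + result);
        }
        return new long[] {cpuNanos, wallNanos};
    }

    public static void main(String[] args) {
        long[] times = measure(200_000_000);
        System.out.println("CPU time (ms):        " + times[0] / 1_000_000);
        System.out.println("Wall-clock time (ms): " + times[1] / 1_000_000);
    }
}
```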
RajaVivekanandhan SA
Greenhorn

Joined: Sep 14, 2005
Posts: 4
I am using an Intel P4 with a 4-processor CPU and 1 GB of RAM. The Java program is running on Windows 2000 Professional.
We did a load test on the application with 10 concurrent users. The average CPU utilization was 99%, and the average response time goes up as the request rate (load) on the application increases.
I have a standalone Java program that does XML parsing and validation using the Xerces SAX implementation; every time it runs, it touches 100% CPU for a second and then comes down.
Ilja Preuss
author
Sheriff

Joined: Jul 11, 2001
Posts: 14112
Originally posted by RajaVivekanandhan SA:
I am using an Intel P4 with a 4-processor CPU and 1 GB of RAM. The Java program is running on Windows 2000 Professional.
We did a load test on the application with 10 concurrent users. The average CPU utilization was 99%, and the average response time goes up as the request rate (load) on the application increases.
I have a standalone Java program that does XML parsing and validation using the Xerces SAX implementation; every time it runs, it touches 100% CPU for a second and then comes down.


Yes, sounds reasonable. So?
RajaVivekanandhan SA
Greenhorn

Joined: Sep 14, 2005
Posts: 4
Since the load test report shows an average CPU utilization of 99%, this CPU issue is being treated as a risk for when the application goes to production.
Isn't it a risk to have an application running in production at 100% CPU?
William Brogden
Author and all-around good cowpoke
Rancher

Joined: Mar 22, 2000
Posts: 12792
I REALLY do not understand why you think this is a problem. If your CPU were only 50% used, it would indicate a bottleneck in the network or in the hard disk data transfer rate. A figure close to 100% means you have a very efficient IO system.
If you want to measure something significant, measure the total time to convert a document of size X. With a megabytes-per-second statistic you can estimate how the system will respond to your real customer load.
Bill
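
A rough sketch of the megabytes-per-second measurement suggested above might look like the following. Everything specific here is an assumption made for illustration: the class name ParseThroughputSketch, the synthetic document built by buildDocument, and the choice to parse without schema validation and with a fresh parser per run (as the original snippet does per request). A real measurement would use representative documents, the real schema, and a warm-up pass before timing.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class ParseThroughputSketch {
    // Builds a synthetic, well-formed document with the given number of child elements.
    static byte[] buildDocument(int items) {
        StringBuilder xml = new StringBuilder("<items>");
        for (int i = 0; i < items; i++) {
            xml.append("<item id=\"").append(i).append("\">value ").append(i).append("</item>");
        }
        xml.append("</items>");
        return xml.toString().getBytes(StandardCharsets.UTF_8);
    }

    // Parses the document 'runs' times and returns megabytes parsed per second of elapsed time.
    static double megabytesPerSecond(byte[] xml, int runs) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            DefaultHandler handler = new DefaultHandler();

            long start = System.nanoTime();
            for (int i = 0; i < runs; i++) {
                // A new parser per run, so parser creation cost is included in the figure.
                SAXParser parser = factory.newSAXParser();
                parser.parse(new ByteArrayInputStream(xml), handler);
            }
            long elapsedNanos = Math.max(1, System.nanoTime() - start);

            double megabytes = (double) xml.length * runs / (1024.0 * 1024.0);
            return megabytes / (elapsedNanos / 1_000_000_000.0);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Measurement failed", e);
        }
    }

    public static void main(String[] args) {
        byte[] document = buildDocument(50_000);
        megabytesPerSecond(document, 5); // warm-up pass, result discarded
        System.out.printf("Throughput: %.1f MB/s%n", megabytesPerSecond(document, 20));
    }
}
```

Dividing the expected request volume in megabytes per second by this figure gives a first estimate of how much CPU the parsing alone would need under load.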
Avianu Sud
Ranch Hand

Joined: Jan 20, 2002
Posts: 55
Throughput, waiting processes, and memory usage are better indicators of performance. 100% CPU usage is fine as long as it's not hurting throughput or caused by runaway processes.
 