SAX parser problem

Rob MacKay
Ranch Hand

Joined: Apr 06, 2007
Posts: 35
Hi,

I will try to explain the problem I am having when parsing with SAX. In the startElement method of my content handler, I check for a specific localName and, when it is found, store that element's attributes in a vector. The idea is that each time that localName is encountered in the XML file, another entry is added to the vector: the attributes found for that element.




Here is where it gets weird. If the element I am searching for is not the last element in the list, the attributes that I get are not correct. Let's say you have just two elements in your xml file:

<myElement name="thisName" data="myData"></myElement>
<newElement stuff="myStuff" otherStuff="myOtherStuff"></newElement>

If I use the code above and populate the vector based on finding "myElement", for some reason when I access the vector later I get the attributes from "newElement" instead.

I inserted debug code in the if statement to output the attributes, and it displayed the correct attributes at that time, but the vector was always populated with the attributes from whatever the final element in the XML file was.

Does anyone know why this would happen and how I can prevent that from happening? Any other possible solution would be nice as well. I am basically trying to write a class that uses SAX to parse a given xml file but only return a vector of attributes from a specifically named element in the XML file, ignoring other elements.

Thanks in advance,

Rob
Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 41572
Rob, welcome to JavaRanch.

On your way in you may have missed that we have a policy on screen names here at JavaRanch. Basically, it must consist of a first name, a space, and a last name, and not be obviously fictitious. Since yours does not conform with it, please take a moment to change it, which you can do right here.

As to your problem, I'm fairly certain that there is something the matter with the logic in the code, since SAX itself is time-tested and proven. Can you post a minimal code section and XML document that exhibits the problem?


Rob MacKay
Ranch Hand

Joined: Apr 06, 2007
Posts: 35
Thanks Ulf.

I have a class called ParseXMLFile.

The ParseXMLFile class has three member variables:

public static Vector elementVector = new Vector();
public static String fileToParse;
public static String searchElement;

There is also a method called:

public static Vector parseFileForElement(String filename, String element)

This method is called from some external class that needs the information from the XML file.

The method sets the two class member variables and instantiates the parser, setting up the content handler and the error handler.

The startElement method of the content handler simply checks to see if localName is the element we are looking for:

if (localName.equals(ParseXMLFile.searchElement))
{
    ParseXMLFile.elementVector.addElement(atts);
}

If it matches, the attributes are added to the vector. Once parsing is complete, the parseFileForElement method returns the Vector to the class that called it.

As I mentioned, I did insert some general debug code to check that the search element was correctly encountered and that the attributes matched what they should have been in the xml file. However, the vector is getting bad data and it only has the attributes which come from whatever the last element of the XML file contains, regardless of the name of that element.

I hope my explanation makes sense.

Rob
Henry Wong
author
Sheriff

Joined: Sep 28, 2004
Posts: 18745

This question is probably better answered in the XML forum, where all the XML experts hang out. Moving this topic to the XML forums...

Henry


Paul Clapham
Bartender

Joined: Oct 14, 2005
Posts: 18541

Your code doesn't say what "atts" is but I am guessing it is the Attributes object that is passed as a parameter to your startElement() method.

From what you say it appears that the SAX parser only has one Attributes object, and it passes you that same object every time, only filled with whatever attributes actually belong to the element. You could test that idea with some code like "if (elementVector.get(0) == elementVector.get(1))" to see if you're always getting a reference to the same Attributes object.

If you are, you'll have to work around that idiosyncrasy somehow. If not, then it's some kind of bug in your code.
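The identity test suggested above could be tried with something like the following. This is only a sketch: the class name AttributesIdentityCheck and the helper collectReferences are made up for illustration, and the two sample elements are wrapped in a hypothetical <root> element so that the document is well-formed. Whether the comparison prints true depends on the parser implementation in use.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical test class, not part of the original poster's code.
class AttributesIdentityCheck {

    /** Stores the Attributes reference seen for every element, without copying. */
    static List<Attributes> collectReferences(String xml) {
        final List<Attributes> seen = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                seen.add(atts); // keeps the reference, as the code in question does
            }
        };
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        return seen;
    }

    public static void main(String[] args) {
        // Assumed sample document: the two elements from the question inside a root.
        String xml = "<root><myElement name=\"thisName\" data=\"myData\"/>"
                   + "<newElement stuff=\"myStuff\" otherStuff=\"myOtherStuff\"/></root>";
        List<Attributes> seen = collectReferences(xml);
        // Index 0 is <root>; 1 and 2 are the two child elements.
        // "true" would mean the parser reuses one Attributes object for every event.
        System.out.println("same object: " + (seen.get(1) == seen.get(2)));
    }
}
```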
Rob MacKay
Ranch Hand

Joined: Apr 06, 2007
Posts: 35
atts is referring to the Attributes object passed into the startElement method.

A workaround I found was to create a Properties object and then iterate through the attributes, populating the Properties object with each attribute and then once the element was parsed, add that Properties object to the vector instead of adding the Attributes object to the vector. This worked properly and gave me the data I was expecting.
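For anyone reading later, the Properties workaround described above might look roughly like this. It is a sketch, not the actual code from my class: the name PropertiesCollector and its parse helper are invented here, and it assumes a namespace-aware parser so that localName is populated.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: copies the attributes of each matching element into Properties.
class PropertiesCollector extends DefaultHandler {
    private final String searchElement;
    private final List<Properties> matches = new ArrayList<>();

    PropertiesCollector(String searchElement) {
        this.searchElement = searchElement;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if (localName.equals(searchElement)) {
            // Copy names and values now; atts is not kept after this method returns.
            Properties props = new Properties();
            for (int i = 0; i < atts.getLength(); i++) {
                props.setProperty(atts.getLocalName(i), atts.getValue(i));
            }
            matches.add(props);
        }
    }

    /** Parses the XML text and returns one Properties object per matching element. */
    static List<Properties> parse(String xml, String element) {
        PropertiesCollector handler = new PropertiesCollector(element);
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // needed for localName to be filled in
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        return handler.matches;
    }
}
```

Calling PropertiesCollector.parse(xml, "myElement") on a document containing the two sample elements should return a single Properties object holding the name and data attributes.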

Strange behavior.

Is there another way in a SAX content handler to parse an XML file, only getting the data for each instance of a particular element? That's ultimately what I am trying to do anyway.
Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 41572
I think the problem is hinted at in this section of the AttributeList javadocs:
"When an attribute list is supplied as part of a startElement event, the list will return valid results only during the scope of the event; once the event handler returns control to the parser, the attribute list is invalid. To save a persistent copy of the attribute list, use the SAX1 AttributeListImpl helper class."

Even though this is no longer mentioned in the javadocs for Attributes, it may still apply (AttributeList in SAX 1 was replaced by Attributes in SAX 2). Instead of keeping a reference to the Attributes object, try copying the ones you're interested in to some other data structure.
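One possible way to take such a copy in SAX 2 is the org.xml.sax.helpers.AttributesImpl helper class, whose constructor accepts an existing Attributes object and copies its contents. The sketch below assumes that approach; the class name SnapshotCollector and its parse method are invented for illustration and are not taken from the code under discussion.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: keeps a snapshot of the attributes of each matching element.
class SnapshotCollector {

    static List<Attributes> parse(String xml, final String searchElement) {
        final List<Attributes> matches = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                if (localName.equals(searchElement)) {
                    // AttributesImpl(Attributes) copies the data, so the snapshot
                    // stays valid after the parser moves on to the next element.
                    matches.add(new AttributesImpl(atts));
                }
            }
        };
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // so that localName is populated
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        return matches;
    }
}
```

Compared with copying into Properties, this keeps the full Attributes interface (namespace URI, qualified name, and type) for each stored element.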
Rob MacKay
Ranch Hand

Joined: Apr 06, 2007
Posts: 35
Yes - that's exactly what I ended up doing. Each time my element is encountered, I store the attribute list in a Properties object and then add that object to a vector, so that the end result is a vector of Properties objects, one for each occurrence of the element I was looking for. Thanks for the assistance.
 
 