Ok, so I thought I knew Java ... but I guess I don't. I grabbed some code from a Google search, altered it to fit my needs and came up with this:
import java.util.Hashtable;
import java.util.Enumeration;

public class EntityResolver2
{
    // Maps each illegal XML character to its entity.
    private static Hashtable<String, String> XMLEntityLookupTable;
    private static String temp = null;

    static
    {
        XMLEntityLookupTable = new Hashtable<String, String>();
        XMLEntityLookupTable.put("&", "&amp;");
        XMLEntityLookupTable.put("'", "&apos;");
        XMLEntityLookupTable.put(">", "&gt;");
        XMLEntityLookupTable.put("<", "&lt;");
        XMLEntityLookupTable.put("\"", "&quot;");
    }

    public static String replaceIllegalCharacters(String str)
    {
        // Escape "&" first: Hashtable key order is unspecified, and escaping it
        // after another key would re-escape the "&" of entities already inserted.
        temp = replace(str, "&", XMLEntityLookupTable.get("&"));
        // get the remaining keys:
        Enumeration<String> illegalChars = XMLEntityLookupTable.keys();
        while (illegalChars.hasMoreElements())
        {
            String illegalChar = illegalChars.nextElement();
            if (illegalChar.equals("&"))
            {
                continue; // already handled above
            }
            String entityReplacement = XMLEntityLookupTable.get(illegalChar);
            temp = replace(temp, illegalChar, entityReplacement);
        }
        return temp;
    }

    // Replaces every occurrence of pattern in str with replace.
    private static String replace(String str, String pattern, String replace)
    {
        int s = 0;
        int e = 0;
        StringBuffer result = new StringBuffer();
        while ((e = str.indexOf(pattern, s)) >= 0)
        {
            result.append(str.substring(s, e));
            result.append(replace);
            s = e + pattern.length();
        }
        result.append(str.substring(s));
        return result.toString();
    }
}
After looking at it for a while I decided, hey, this could be optimized. After all, I'm using a Hashtable (synchronized), an Enumeration over that Hashtable, a StringBuffer (another one that's synchronized), a couple of while loops, two method calls within the class and a whole bunch of method calls on objects of those other (synchronized) classes. So I remembered my techniques from the old C programming days, decided to go "old school" on this thing and used char arrays like this:
public class EntityResolver
{
    private static char[] cdataArray;
    private static char[] temp;

    public static String replaceIllegalCharacters(String cdata)
    {
        cdataArray = cdata.toCharArray();
        int size = cdataArray.length;
        for (int i = 0; i < size; i++)
        {
            char c = cdataArray[i];
            switch (c)
            {
                case '<':
                    replace('<', i, "&lt;");
                    break;
                case '>':
                    replace('>', i, "&gt;");
                    break;
                case '\'':
                    replace('\'', i, "&apos;");
                    break;
                case '"':
                    replace('"', i, "&quot;");
                    break;
                case '&':
                    replace('&', i, "&amp;");
                    break;
            }
            // replace() may have swapped in a longer array
            size = cdataArray.length;
        }
        return new String(cdataArray);
    }

    // Builds a new array in which the single char at index is replaced by
    // replacement, then makes it the current cdataArray.
    private static void replace(char replace, int index, String replacement)
    {
        temp = new char[cdataArray.length - 1 + replacement.length()];
        int tempLength = temp.length;
        int i = 0;
        // copy everything before the illegal character
        for (; i < index; i++)
        {
            temp[i] = cdataArray[i];
        }
        // copy the entity in its place
        char[] replacementChars = replacement.toCharArray();
        int replacementCharsLength = replacementChars.length;
        for (int j = 0; j < replacementCharsLength; j++)
        {
            temp[i] = replacementChars[j];
            i++;
        }
        // skip the illegal character and copy the rest
        index++;
        for (int k = i; k < tempLength; k++)
        {
            temp[k] = cdataArray[index];
            index++;
        }
        cdataArray = temp;
    }
}
I mean, this has got to be faster, right? I'm not using objects, I've severely culled my method calls (which can be quite costly in performance) and I've created local variables for the array lengths to cut down on referencing. The compare values are char literals instead of sitting in a synchronized hashing class (à la Hashtable). Maybe I've just been up too long, but I have no idea why the char array version is slower.
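For anyone who wants to reproduce the comparison, a minimal timing sketch could look like the following. The class name EscapeTimingSketch, its methods time and buildInput, the sample input and the warm-up loop are all assumptions made for illustration, not part of the code above, and a proper benchmarking tool would give more trustworthy numbers than this.

```java
import java.util.function.UnaryOperator;

public class EscapeTimingSketch
{
    // Keeps results "used" so the JIT is less likely to discard the calls.
    private static long sink = 0;

    // Returns the elapsed nanoseconds for `runs` calls of `escaper` on `input`,
    // measured after an equally long warm-up pass.
    public static long time(UnaryOperator<String> escaper, String input, int runs)
    {
        for (int i = 0; i < runs; i++)
        {
            sink += escaper.apply(input).length();
        }
        long start = System.nanoTime();
        for (int i = 0; i < runs; i++)
        {
            sink += escaper.apply(input).length();
        }
        return System.nanoTime() - start;
    }

    // Builds a hypothetical test string containing all five illegal characters.
    public static String buildInput(int repeats)
    {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < repeats; i++)
        {
            sb.append("<a href=\"x\">Tom & Jerry's</a>");
        }
        return sb.toString();
    }
}
```

Each version can then be passed in as a method reference, for example EscapeTimingSketch.time(EntityResolver::replaceIllegalCharacters, EscapeTimingSketch.buildInput(1000), 100), and the same call repeated with EntityResolver2.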
They both functionally perform the same operation and they both work. By the way, what they are doing is accepting a string (of any size), scanning it for illegal XML characters ('<', '>', '&', ''', '"'), replacing them with entities ('&lt;', '&gt;', '&amp;', '&apos;', '&quot;') and returning the "legal" string.
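To pin down the expected behaviour, here is a minimal sketch of the same scan-and-replace written as a single pass that appends to a StringBuilder. The class name EntityEscapeSketch and the method name escape are made up for illustration; this is a third variant for comparison under those assumptions, not either of the two classes above.

```java
public class EntityEscapeSketch
{
    // Returns a copy of cdata with the five illegal XML characters
    // replaced by their entities; every other character is copied as-is.
    public static String escape(String cdata)
    {
        // Start a little larger than the input to leave room for a few entities.
        StringBuilder out = new StringBuilder(cdata.length() + 16);
        for (int i = 0; i < cdata.length(); i++)
        {
            char c = cdata.charAt(i);
            switch (c)
            {
                case '<':
                    out.append("&lt;");
                    break;
                case '>':
                    out.append("&gt;");
                    break;
                case '&':
                    out.append("&amp;");
                    break;
                case '\'':
                    out.append("&apos;");
                    break;
                case '"':
                    out.append("&quot;");
                    break;
                default:
                    out.append(c);
            }
        }
        return out.toString();
    }
}
```

Because each input character is looked at once and written once, an "&" that belongs to an entity already written is never revisited, so the order of the cases does not matter here.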
Bored? Any takers?
a.