I'm new to Java but experienced as a developer. I'm working on a string parser that has to fire every second and scan a full page of text rendered as HTML. I'm wondering if I can make it run faster with better code. Here's what I'm using (see code). Is there a better/faster way to do this? Would I get better performance using RegEx?
Something that may help: HTML_STAT_SUMMARY gets refreshed about every 5 seconds. I query that page of HTML for about 30 substrings, and I have 30 functions set up just like the one below. Is this a good approach, or should I leverage my position in the document and move forward on each search? I know roughly where each value will be in HTML_STAT_SUMMARY, so I think I could step through more efficiently by saving my current position and doing forward lookups.
It is rather surprising that the string to parse is not passed in as a parameter. Pulling it into function scope through the global variable HTML_STAT_SUMMARY is odd.
At one point you use TAG_CARRIER_INFO.length(); elsewhere you hardcode 10.
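To make the point concrete, here is a minimal sketch; the tag value and the helper name are hypothetical, since the asker's actual constants aren't shown. Deriving the skip distance from the constant keeps the offset correct if the tag text ever changes:

```java
public class TagOffset {
    // Hypothetical tag value; the real constant lives in the asker's code.
    static final String TAG_CARRIER_INFO = "<carrier>";

    /** Returns the text between the tag and the next '<', or null if absent. */
    static String valueAfterTag(String html) {
        int i = html.indexOf(TAG_CARRIER_INFO);
        if (i < 0) return null;
        // Use the constant's length instead of a hardcoded 10,
        // so this offset stays in sync with the tag definition.
        int start = i + TAG_CARRIER_INFO.length();
        int end = html.indexOf('<', start);
        return end < 0 ? html.substring(start) : html.substring(start, end);
    }
}
```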
If the 30 tags appear in a fixed order, it would definitely pay to search for them in order and avoid scanning the whole string 30 times.
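An ordered scan can be sketched like this, using `String.indexOf(String, int)` to resume each search where the previous one left off; the tag names and value-delimiting logic are illustrative assumptions, not the asker's actual markup:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class OrderedScanner {
    // Hypothetical tags, listed in the order they appear in the page.
    static final String[] TAGS = { "<carrier>", "<signal>", "<uptime>" };

    /** Scans html once, front to back, remembering the position after each hit. */
    public static Map<String, String> extract(String html) {
        Map<String, String> values = new LinkedHashMap<>();
        int pos = 0;
        for (String tag : TAGS) {
            int start = html.indexOf(tag, pos);  // resume from last position
            if (start < 0) continue;             // tag missing; keep current pos
            start += tag.length();               // value begins right after the tag
            int end = html.indexOf('<', start);  // value ends at the next markup
            if (end < 0) end = html.length();
            values.put(tag, html.substring(start, end).trim());
            pos = end;                           // next search starts here
        }
        return values;
    }
}
```

Because each `indexOf` starts at `pos`, the whole page is traversed once per refresh instead of 30 times, at the cost of requiring the tags to actually occur in the listed order.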
Finally, a shameless plug: searching a document for 30 (or many more) strings (or, more generally, regular expressions) to extract surrounding text is the perfect use case for monq.jfa (GPL software). It lets you stick pattern/action pairs into a finite automaton that reads your text and calls an action whenever its pattern matches. Throughput is 1.5 MB/s on a 2.6 GHz Pentium. Setting up the automaton looks like this: