I have simple String data (NOT XML) which might contain HTML special character codes (like &amp;amp; or &amp;#38;).
I am looking for a parser that can scan the input string for such codes and replace them with the corresponding special characters.
You can just use java.util.regex.Pattern and java.util.regex.Matcher for this. Create a Pattern for the placeholders (&.+?; where the .+? is a non-greedy catch-all) and keep looking for occurrences for as long as the Matcher's find() method returns true. Inspect each match and, if it's one you're looking for, replace it. Matcher's appendReplacement writes each replacement into the result, and appendTail finishes off your String. In a bit of pseudo code:
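The find()/appendReplacement/appendTail loop described above could be sketched roughly as follows. This is only an illustration under a few assumptions: the class name EntityDecoder, the decode method and the four-entry lookup table are made up for the example, and the pattern is deliberately tightened from &.+?; to &([A-Za-z0-9]+); so that a bare & in the text cannot swallow an entity that follows it.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, named only for this example.
class EntityDecoder {
    // Tighter than &.+?; so a stray '&' does not match up to a later ';'.
    private static final Pattern ENTITY = Pattern.compile("&([A-Za-z0-9]+);");

    // Illustrative subset only; the full HTML entity list is much larger.
    private static final Map<String, String> KNOWN = new HashMap<>();
    static {
        KNOWN.put("amp", "&");
        KNOWN.put("lt", "<");
        KNOWN.put("gt", ">");
        KNOWN.put("quot", "\"");
    }

    static String decode(String input) {
        Matcher m = ENTITY.matcher(input);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String replacement = KNOWN.get(m.group(1));
            if (replacement == null) {
                replacement = m.group(); // not in the table: keep the text as it was
            }
            // quoteReplacement stops '$' and '\' from being read as group references.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out); // copy whatever follows the last match
        return out.toString();
    }

    public static void main(String[] args) {
        // prints: Tom & Jerry <b> &unknown;
        System.out.println(decode("Tom &amp; Jerry &lt;b&gt; &unknown;"));
    }
}
```

The input is scanned once, so a string such as &amp;lt; comes out as &lt; rather than being decoded a second time.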
When we know beforehand that only a limited number of HTML entities may appear, a regex approach may do just fine. But often, because the complete set of HTML entities is large, falling back on a library or utility class seems necessary.
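For the functionality sought here, Perl, say, has the HTML::Entities module to help. In Java we can, for instance, call upon org.apache.commons.lang.StringEscapeUtils. For a quite arbitrary but valid HTML case study it may go like this: pass the string to StringEscapeUtils.unescapeHtml(String), which returns a copy with the entities replaced by the characters they stand for.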