| Author |
split a string and differentiate the elements in string array
|
Krishna Chaitanya Reddy Balam
Greenhorn
Joined: Feb 19, 2008
Posts: 22
|
|
Hi all: I am trying to split a string based on some HTML tags inside it. If I have String s = "I am <b> bold </b>"; I want it converted to a string array, and I must be able to tell which element of the array was inside the tags. Is there any way to do this?
|
 |
Campbell Ritchie
Sheriff
Joined: Oct 13, 2005
Posts: 32689
|
|
Depends what you want to split on. There is a split() method in the String class which does what you want, but it takes a regular expression as its parameter. If you are not familiar with regular expressions, try here to start you off.
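To illustrate the point that split() takes a regular expression, here is a minimal sketch. The class and method names (SplitDemo, splitOnWhitespace, splitOnDot) are made up for this example and are not from the thread.

```java
import java.util.Arrays;

final class SplitDemo {
    // The argument is a regex: "\\s+" means "one or more whitespace characters".
    static String[] splitOnWhitespace(String text) {
        return text.split("\\s+");
    }

    // A literal dot must be escaped, because an unescaped "." matches any character.
    static String[] splitOnDot(String text) {
        return text.split("\\.");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitOnWhitespace("I am   bold"))); // [I, am, bold]
        System.out.println(Arrays.toString(splitOnDot("a.b.c")));              // [a, b, c]
        // Unescaped dot: every character is a delimiter, so nothing is left.
        System.out.println("a.b.c".split(".").length);                         // 0
    }
}
```

The last line is the usual surprise for people who expect split() to take a plain string.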
|
 |
Krishna Chaitanya Reddy Balam
Greenhorn
Joined: Feb 19, 2008
Posts: 22
|
|
|
But will I be able to differentiate the string in between the bold tags in the string array?
|
 |
Campbell Ritchie
Sheriff
Joined: Oct 13, 2005
Posts: 32689
|
|
|
You can probably design a regular expression which will match <b> and </b> tags, so you should be able to do that, yes.
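One possible sketch of that idea, assuming the tags are plain <b> and </b>, are balanced, and are not nested: split on the regex "</?b>", and the pieces at odd indexes are then the ones that were inside the tags. The names BoldSplit and label are invented for this example.

```java
import java.util.ArrayList;
import java.util.List;

final class BoldSplit {
    // Splits on <b> and </b>. With balanced, non-nested tags the pieces
    // alternate: even index = outside the tags, odd index = inside them.
    static List<String> label(String text) {
        String[] pieces = text.split("</?b>");
        List<String> labelled = new ArrayList<>();
        for (int i = 0; i < pieces.length; i++) {
            String piece = pieces[i].trim();
            if (piece.isEmpty()) {
                continue; // e.g. the empty piece before a leading <b>
            }
            labelled.add((i % 2 == 1 ? "bold: " : "plain: ") + piece);
        }
        return labelled;
    }

    public static void main(String[] args) {
        System.out.println(label("I am <b> bold </b>")); // [plain: I am, bold: bold]
    }
}
```

This relies entirely on the alternation, so it stops working as soon as tags are nested or unbalanced.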
|
 |
Krishna Chaitanya Reddy Balam
Greenhorn
Joined: Feb 19, 2008
Posts: 22
|
|
I think this is the pattern for <b> tags:
<b\b[^>]*>(.*?)</b>
and this is the one for general HTML tags:
<([A-Z][A-Z0-9]*)\b[^>]*>.*?</\1>
But suppose I write something like this:
String s4 = "I am <b>bold</b> and I am <i>italic</i> and I am <b><i>bold italic</i></b>";
Pattern htmlTag = Pattern.compile("<([A-Z][A-Z0-9]*)\\b[^>]*>.*?</\\1>", Pattern.CASE_INSENSITIVE);
int length = s4.length();
Matcher matcher = htmlTag.matcher(s4);
String result = matcher.find() ? matcher.group() : null;
I need to get the output into a string array, String[] sa, and sa should contain {"I am", "bold", "and I am", "italic", "and I am", "bold italic"}. I know I can get this, but after storing the pieces in the array I need to tell that sa[1] was between bold tags, sa[3] was in italic tags, and sa[5] was in bold and italic tags. Is there any way to do this? Right now I am parsing the string character by character, but I need something more generic, as it is difficult to handle nested tags with character-by-character logic. Please help.
|
 |
Campbell Ritchie
Sheriff
Joined: Oct 13, 2005
Posts: 32689
|
|
Difficult to be sure just looking at the code, but you appear to be matching everything from a <b> tag to the next </b> tag. I think you want to match only the <b> and </b>. You might do well to Google for HTML parsers, as well, if you are looking for more than one kind of tag. Why spend hours and hours re-inventing the wheel?
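As a rough sketch of "match only the tags": the regex below matches a single opening or closing tag, the text between matches becomes the array elements, and a stack of currently open tag names records what each element was inside. TagTokenizer and Segment are hypothetical names, and this is not a real HTML parser (no comments, entities, or attribute values containing ">"), which is why a proper parser is still the better choice for anything serious.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TagTokenizer {
    // One piece of text plus the tags that were open around it.
    static final class Segment {
        final String text;
        final List<String> tags;

        Segment(String text, List<String> tags) {
            this.text = text;
            this.tags = tags;
        }

        @Override
        public String toString() {
            return tags + " " + text;
        }
    }

    // Group 1 is "/" for a closing tag, group 2 is the tag name.
    private static final Pattern TAG =
            Pattern.compile("<(/?)([a-z][a-z0-9]*)\\b[^>]*>", Pattern.CASE_INSENSITIVE);

    static List<Segment> tokenize(String html) {
        List<Segment> segments = new ArrayList<>();
        Deque<String> open = new ArrayDeque<>();
        Matcher m = TAG.matcher(html);
        int last = 0;
        while (m.find()) {
            emit(segments, open, html.substring(last, m.start()));
            String name = m.group(2).toLowerCase(Locale.ROOT);
            if (m.group(1).isEmpty()) {
                open.addLast(name);            // opening tag
            } else {
                open.removeLastOccurrence(name); // closing tag
            }
            last = m.end();
        }
        emit(segments, open, html.substring(last));
        return segments;
    }

    // Skips whitespace-only text; copies the open tags so later changes do not affect it.
    private static void emit(List<Segment> out, Deque<String> open, String text) {
        String trimmed = text.trim();
        if (!trimmed.isEmpty()) {
            out.add(new Segment(trimmed, new ArrayList<>(open)));
        }
    }

    public static void main(String[] args) {
        String s4 = "I am <b>bold</b> and I am <i>italic</i> and I am <b><i>bold italic</i></b>";
        for (Segment s : tokenize(s4)) {
            System.out.println(s);
        }
    }
}
```

For the string from the question this gives six segments, with element 1 tagged [b], element 3 tagged [i], and element 5 tagged [b, i]; the rest have an empty tag list.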
|
 |
Krishna Chaitanya Reddy Balam
Greenhorn
Joined: Feb 19, 2008
Posts: 22
|
|
|
I did and nothing helps.
|
 |
Campbell Ritchie
Sheriff
Joined: Oct 13, 2005
Posts: 32689
|
|
Sorry to hear that. List of HTML parsers here. Try myString.split("</.+>") or myString.split("<b>").
[Campbell@queeg applications]$ java BoldSplitter
I am bold</b> and I am <i>italic</i> and I am <i>bold italic</i></b>
[Campbell@queeg applications]$
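The BoldSplitter source was not posted, so the following is only a guess at something similar, using the sample string from earlier in the thread; SplitOnTags and SAMPLE are made-up names. It also shows one thing to watch for: ".+" is greedy, so "</.+>" matches from the first "</" to the last ">" in the string, whereas the reluctant "</.+?>" stops at the end of each closing tag.

```java
import java.util.Arrays;

final class SplitOnTags {
    static final String SAMPLE =
            "I am <b>bold</b> and I am <i>italic</i> and I am <b><i>bold italic</i></b>";

    public static void main(String[] args) {
        // Splitting on the literal opening tag leaves the closing tags in place.
        System.out.println(Arrays.toString(SAMPLE.split("<b>")));

        // Greedy: one match swallows everything after "bold", leaving a single piece.
        System.out.println(Arrays.toString(SAMPLE.split("</.+>")));

        // Reluctant: splits at each closing tag; opening tags remain in the pieces.
        System.out.println(Arrays.toString(SAMPLE.split("</.+?>")));
    }
}
```

Neither variant tells you which piece was inside which tag, so they only help with the splitting half of the question.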
|
 |