
removing a sequence from ANY string

 
James Palmer
Ranch Hand
Posts: 36
Hi, I'm wondering if anyone could help me out here.
I have HTML source code in a String called urlString.
Somewhere in that source there is a tag like this: <meta name=keywords content="key,HTML,cat,dog">
How do I find this in the string? I want to be able to do this for any HTML source code I have.
So basically I want: aString = "key,HTML,cat,dog"
But the keywords are different in every HTML document. How can I allow for this?
I'm really struggling, so any help will be greatly appreciated.
 
Nathaniel Stoddard
Ranch Hand
Posts: 1258
String.indexOf(String, int) seems to be the best choice for you.
Given that you have a variable urlString of type String, build a second String at runtime that holds the text you want to find. Then call indexOf in a loop over urlString, each time passing as the int parameter the position from which to start looking for the next match.
That way you will be able to find all the occurrences in urlString.
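As a rough illustration of that loop, here is a minimal sketch. The class name OccurrenceFinder and the method findAll are made up for this example; only String.indexOf(String, int) comes from the suggestion above. It collects non-overlapping matches, which is an assumption about what is wanted.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not from the thread: collects every match position.
class OccurrenceFinder {

    // Returns the start index of each non-overlapping occurrence of target in text.
    static List<Integer> findAll(String text, String target) {
        List<Integer> positions = new ArrayList<>();
        if (target.isEmpty()) {
            return positions; // an empty target would never advance the loop
        }
        int from = 0;
        while (true) {
            int hit = text.indexOf(target, from); // search from the current position
            if (hit < 0) {
                break; // no further match
            }
            positions.add(hit);
            from = hit + target.length(); // resume just past this match
        }
        return positions;
    }

    public static void main(String[] args) {
        String urlString = "<html><meta name=keywords content=\"key,HTML,cat,dog\"></html>";
        System.out.println(findAll(urlString, "html"));
    }
}
```

If overlapping matches should count as well, advancing with from = hit + 1 instead would be the change to make.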
 
Dirk Schreckmann
Sheriff
Posts: 7023
Along the lines of what Nathaniel has suggested, perhaps you'll first want to find the index of "<meta name=keywords content=" inside your input String.
After that, figuring out the index of the following ">" shouldn't be too tough (if you use the method suggested above).
Then, extracting the part in between (accounting for the double quotes) with a call to substring(int, int) just might do the trick.
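Those three steps could be sketched roughly as follows. The names KeywordExtractor, PREFIX and extractKeywords are invented for illustration, and the sketch assumes the tag is written exactly as in the original question (lower case, no quotes around keywords) and that the content contains no ">" character.

```java
// Hypothetical helper, not from the thread: indexOf + indexOf + substring.
class KeywordExtractor {

    // Assumed exact spelling of the start of the tag, as given in the question.
    static final String PREFIX = "<meta name=keywords content=";

    // Returns the keyword list without its quotes, or null if the tag is not found.
    static String extractKeywords(String html) {
        int start = html.indexOf(PREFIX);
        if (start < 0) {
            return null; // tag start not present
        }
        int valueStart = start + PREFIX.length();
        int end = html.indexOf(">", valueStart); // first '>' after the prefix
        if (end < 0) {
            return null; // tag never closes
        }
        String value = html.substring(valueStart, end).trim();
        // Account for the surrounding double quotes, if both are there.
        if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
            value = value.substring(1, value.length() - 1);
        }
        return value;
    }

    public static void main(String[] args) {
        String urlString = "<html><meta name=keywords content=\"key,HTML,cat,dog\"></html>";
        System.out.println(extractKeywords(urlString));
    }
}
```

Note that indexOf is case-sensitive, so a page that writes the tag as <META NAME="keywords" ...> would not match this prefix.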
If you'd like a bit more nudging in the right direction, just ask. And don't hesitate to post any relevant code that you're working on.
[ March 16, 2004: Message edited by: Dirk Schreckmann ]
 
James Palmer
Ranch Hand
Posts: 36
Hi, got this so far

It's compiling fine, but it's not grabbing what I want out of the website.
<META NAME="keywords" CONTENT="02, o2, o2.co.uk, mobile phone, wap, media messaging, free text message, free sms>
I'm trying to put those keywords into a String.
When I run the code it doesn't do anything. I'm confused. Can anyone enlighten me please?
 
James Palmer
Ranch Hand
Posts: 36
Running the following code, the command prompt gives me:
Error in method getKeywords() java.lang.NullPointerException
I'm catching the exception, but what's my problem?
[ March 16, 2004: Message edited by: James Palmer ]
 
Michael Dunn
Ranch Hand
Posts: 4632
I know zip about web pages, but if your string ENDING is *always* in this format
"key,HTML,cat,dog">
i.e. keywords enclosed in quotes (separated by commas) and line ends with >
then this might work (you may need to throw in a trim() if there can be trailing spaces).
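A minimal sketch of one way to do that, under exactly the assumption stated above: the line, once trimmed, ends with a closing quote followed by ">". The class and method names are hypothetical and chosen only for this example.

```java
// Hypothetical helper, not from the thread: works backwards from the end of the line.
class TrailingQuoteExtractor {

    // Returns the text between the last pair of double quotes,
    // or null if the trimmed line does not end with "> as assumed.
    static String keywordsFromLine(String line) {
        String trimmed = line.trim(); // drop any trailing (and leading) spaces
        if (!trimmed.endsWith("\">")) {
            return null; // line is not in the assumed format
        }
        int close = trimmed.length() - 2; // index of the closing quote
        int open = trimmed.lastIndexOf('"', close - 1); // nearest quote before it
        if (open < 0) {
            return null; // no opening quote found
        }
        return trimmed.substring(open + 1, close);
    }

    public static void main(String[] args) {
        String line = "<meta name=keywords content=\"key,HTML,cat,dog\">   ";
        String keywords = keywordsFromLine(line);
        System.out.println(keywords);
        // The individual words can then be split on the commas.
        for (String word : keywords.split(",")) {
            System.out.println(word.trim());
        }
    }
}
```

Because it only looks at the quotes at the end of the line, this does not depend on how the tag name or attribute names are capitalised.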
 
James Palmer
Ranch Hand
Posts: 36
Jeez, man, I owe you a big thank you.
I wonder if you could help me with this bit now.
I put HTML source code into a String, say: stringH
I need to track down this tag: <meta name=keywords content="any,words,at,all"> in any given stringH.
The tag will be present in every stringH, but the words in the tag will change.
How can I strip <meta name=keywords content="any,words,at,all"> from stringH?
If this is confusing, please let me know. Thanks.
 
Michael Dunn
Ranch Hand
Posts: 4632
This *seems* to work OK, but needs testing (major testing)
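For readers who want something concrete to test against, here is a minimal sketch of stripping the tag with indexOf and substring. It is an illustration under stated assumptions, not a definitive implementation: the names MetaTagStripper, TAG_START and stripKeywordsTag are made up, the tag is assumed to start exactly as written in the question, and the keyword list is assumed not to contain a ">" character.

```java
// Hypothetical helper, not from the thread: cuts the keywords tag out of the source.
class MetaTagStripper {

    // Assumed exact spelling of the start of the tag, as given in the question.
    static final String TAG_START = "<meta name=keywords content=";

    // Returns stringH with the first keywords meta tag removed,
    // or stringH unchanged if no complete tag is found.
    static String stripKeywordsTag(String stringH) {
        int start = stringH.indexOf(TAG_START);
        if (start < 0) {
            return stringH; // nothing to strip
        }
        int end = stringH.indexOf('>', start + TAG_START.length());
        if (end < 0) {
            return stringH; // tag never closes, leave the input alone
        }
        // Keep everything before the tag and everything after its closing '>'.
        return stringH.substring(0, start) + stringH.substring(end + 1);
    }

    public static void main(String[] args) {
        String stringH = "<head><meta name=keywords content=\"any,words,at,all\"><title>x</title></head>";
        System.out.println(stripKeywordsTag(stringH));
    }
}
```

Only the first occurrence is removed; calling the method in a loop until the result stops changing would handle repeated tags.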
 