removing a sequence from ANY string

James Palmer
Ranch Hand

Joined: Mar 15, 2004
Posts: 36
hi, wondering if anyone could help me out here.
I have HTML source code in a string called urlString.
In the code, there is a sequence: <meta name=keywords content="key,HTML,cat,dog">
How do I find this in the string? I want to be able to do this for any HTML source code I have.
So basically I want: aString = "key,HTML,cat,dog"
But in every HTML document, these keywords can change. How can I compensate for this?
I'm really struggling so any help will be greatly appreciated
Nathaniel Stoddard
Ranch Hand

Joined: May 29, 2003
Posts: 1258
String.indexOf(String, int) seems to be the best choice for you.
Given that you have a variable urlString of type String, create another String at runtime holding the text you want to find. Call indexOf in a loop over urlString, changing the int parameter each time so the search starts just past the previous match.
That way you will be able to find all the occurrences in urlString.
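A minimal sketch of the indexOf loop described above. The class name OccurrenceFinder, the method findAll, and the sample HTML in main are made up for illustration; only String.indexOf(String, int) comes from the post.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: collects the start index of every match of target in text.
class OccurrenceFinder {

    static List<Integer> findAll(String text, String target) {
        List<Integer> positions = new ArrayList<>();
        if (target.isEmpty()) {
            return positions; // an empty target would "match" at every position
        }
        int index = text.indexOf(target);
        while (index >= 0) {
            positions.add(index);
            // Restart the search just past the match we found (non-overlapping matches).
            index = text.indexOf(target, index + target.length());
        }
        return positions;
    }

    public static void main(String[] args) {
        // Sample input, assumed for the demo.
        String urlString = "<meta name=keywords content=\"key,HTML,cat,dog\"><p>cat</p>";
        System.out.println(findAll(urlString, "cat"));
    }
}
```

Note that this only finds text you already know. For keywords that change from page to page, you search for the fixed part of the tag instead and read what follows it.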


Nathaniel Stodard
SCJP, SCJD, SCWCD, SCBCD, SCDJWS, ICAD, ICSD, ICED
Dirk Schreckmann
Sheriff

Joined: Dec 10, 2001
Posts: 7023
Along the lines of what Nathaniel has suggested, perhaps you'll first want to find the index of "<meta name=keywords content=" inside your input String.
After that, figuring out the index of the following ">" shouldn't be too tough (if you use the method suggested above).
Then, extracting the part in between (accounting for the double quotes) with a call to substring(int, int) just might do the trick.
If you'd like a bit more nudging in the right direction, just ask. And don't hesitate to post any relevant code that you're working on.
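For reference, the three steps above (find the tag prefix, find the following ">", take the substring in between without the quotes) could be sketched roughly like this. The class and method names are hypothetical, and the sketch assumes the tag is written exactly as in the original question: lower case, with no quotes around the name attribute.

```java
// Hypothetical helper illustrating the indexOf / substring steps.
class MetaKeywords {

    // Assumed to appear exactly like this in the page source.
    private static final String PREFIX = "<meta name=keywords content=";

    // Returns the keyword list, or null when the tag cannot be found.
    static String extract(String html) {
        int tagStart = html.indexOf(PREFIX);
        if (tagStart < 0) {
            return null; // no such tag in this document
        }
        int valueStart = tagStart + PREFIX.length();
        int tagEnd = html.indexOf(">", valueStart);
        if (tagEnd < 0) {
            return null; // tag is never closed
        }
        String value = html.substring(valueStart, tagEnd).trim();
        // Account for the double quotes around the attribute value.
        if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
            value = value.substring(1, value.length() - 1);
        }
        return value;
    }

    public static void main(String[] args) {
        String urlString = "<html><meta name=keywords content=\"key,HTML,cat,dog\"></html>";
        System.out.println(extract(urlString));
    }
}
```

Because indexOf is case sensitive, a page that writes the tag as <META NAME="keywords" ...> will not match this prefix and extract returns null. Checking for that null before using the result avoids a NullPointerException further on.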
[ March 16, 2004: Message edited by: Dirk Schreckmann ]

James Palmer
Ranch Hand

Joined: Mar 15, 2004
Posts: 36
Hi, got this so far

It's compiling fine, but not grabbing what I want out of the website.
<META NAME="keywords" CONTENT="02, o2, o2.co.uk, mobile phone, wap, media messaging, free text message, free sms>
I'm trying to put those keywords into a string.
When I run the code it doesn't do anything. I'm confused. Can anyone enlighten me please?
James Palmer
Ranch Hand

Joined: Mar 15, 2004
Posts: 36
Running the following code, the command prompt gives me:
Error in method getKeywords() java.lang.NullPointerException
I'm catching the exception, but what's my problem?
[ March 16, 2004: Message edited by: James Palmer ]
Michael Dunn
Ranch Hand

Joined: Jun 09, 2003
Posts: 4632
I know zip about web pages, but if your string ENDING is *always* in this format
"key,HTML,cat,dog">
i.e. keywords enclosed in quotes (separated by commas) and line ends with >
then this might work (you may need to throw in a trim() if there can be trailing spaces).
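The code originally posted here is not preserved. One way to do what is described, assuming the line really does end with the quoted keyword list followed by >, is sketched below. The names TagLineKeywords and keywordsFromLine are invented for this example.

```java
// Hypothetical helper: pulls the quoted keyword list off the end of a tag line.
class TagLineKeywords {

    // Returns the text between the last pair of double quotes,
    // or null if the line does not end with a closing quote followed by '>'.
    static String keywordsFromLine(String line) {
        String trimmed = line.trim(); // drop trailing spaces so the endsWith check holds
        if (!trimmed.endsWith("\">")) {
            return null;
        }
        int closingQuote = trimmed.length() - 2;
        int openingQuote = trimmed.lastIndexOf('"', closingQuote - 1);
        if (openingQuote < 0) {
            return null; // no opening quote before the closing one
        }
        return trimmed.substring(openingQuote + 1, closingQuote);
    }

    public static void main(String[] args) {
        String line = "<meta name=keywords content=\"key,HTML,cat,dog\">  ";
        String keywords = keywordsFromLine(line);
        System.out.println(keywords);
        // The individual words can then be separated on the commas.
        for (String word : keywords.split(",")) {
            System.out.println(word.trim());
        }
    }
}
```

Working backwards from the end of the line means the spelling and case of the attribute names in front do not matter, but a line with a missing closing quote gives null.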
James Palmer
Ranch Hand

Joined: Mar 15, 2004
Posts: 36
Jeez man, I owe you a big thank you.
Wonder if you could help me with this bit now.
I put html source code into a string, say: stringH
I need to track this <meta name=keywords content="any,words,at,all"> from any given stringH.
This tag will be present in any stringH, the words in the tag will change though.
How can I strip <meta name=keywords content="any,words,at,all"> from stringH?
If this is confusing please let me know. Thanks
Michael Dunn
Ranch Hand

Joined: Jun 09, 2003
Posts: 4632
This *seems* to work OK, but needs testing (major testing)
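The code from this post was lost, so here is only a rough sketch of one way to strip the tag, under the assumption that it always starts with <meta name=keywords content= in exactly that form and that the keyword list never contains a > character. MetaTagStripper and strip are made-up names.

```java
// Hypothetical helper: removes the keywords meta tag from an HTML string.
class MetaTagStripper {

    // Assumed fixed beginning of the tag; the keywords after it may vary.
    private static final String TAG_START = "<meta name=keywords content=";

    // Returns stringH without the tag, or stringH unchanged if the tag is absent.
    static String strip(String stringH) {
        int start = stringH.indexOf(TAG_START);
        if (start < 0) {
            return stringH;
        }
        int end = stringH.indexOf('>', start);
        if (end < 0) {
            return stringH; // unterminated tag, leave the input alone
        }
        // Keep everything before the tag and everything after its closing '>'.
        return stringH.substring(0, start) + stringH.substring(end + 1);
    }

    public static void main(String[] args) {
        String stringH = "<head><meta name=keywords content=\"any,words,at,all\"><title>x</title></head>";
        System.out.println(strip(stringH));
    }
}
```

Only the first occurrence is removed. Calling strip again on the result would handle a second tag if one can occur.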
 
 