Hello, I have a problem. I want to get all the source code from an HTML page and then put it into a text file. Any help would be really appreciated.
Hi James, I am not sure what you need. Maybe the following will help: right-click the HTML page in the browser and choose View Source. This opens the source in Notepad (or your default text editor). You can then save the file as .txt or in whatever other format you need. Hope that helps! Cheerio!
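If the goal is to do this from a program rather than by hand, a minimal sketch in Java might look like the following. This is not code from the thread: the class name `PageSaver`, the method `savePage`, and the command-line arguments are all hypothetical, and the sketch simply copies whatever bytes the URL serves into a file without worrying about character encoding or HTTP errors.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

// Hypothetical helper class, written for illustration only.
public class PageSaver {

    // Copies the raw bytes served at pageUrl into target,
    // replacing the target file if it already exists.
    static void savePage(String pageUrl, Path target) throws IOException {
        try (InputStream in = URI.create(pageUrl).toURL().openStream()) {
            Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumed usage: java PageSaver <page-url> <output-file>
        if (args.length != 2) {
            System.out.println("usage: java PageSaver <page-url> <output-file>");
            return;
        }
        savePage(args[0], Paths.get(args[1]));
    }
}
```

Because the bytes are copied unchanged, the text file holds exactly the markup the server sent, which is the same thing View Source shows.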
Thanks a lot for your reply, Tom, very helpful. I would now like to create a new string from the original, but this time I only want the string to contain words that are not inside <>. How can I do that?
Sorry, my previous post was wrong. What I wanted to know is: if I also want to leave out things like '&gt;' and '&nbsp;', how can the code above be made to include those?
[ March 15, 2004: Message edited by: James Palmer ]
James, play with the code within the while loop where you examine each character. When you see the start character of what you want to exclude, stop appending to the buffer. After you see the ending character of your exclusion string, start appending again. Also see https://coderanch.com/forums/ for a way to use regular expressions to remove unwanted information from the string using the first piece of code. Tom
Tom Blough
Cum catapultae proscriptae erunt tum soli proscripti catapultas habebunt.
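The code Tom refers to is not shown in this thread, so the following is only a rough sketch of the two ideas he describes. The class `TagStripper` and both method names are hypothetical. `stripTags` walks the string in a while loop and stops appending to the buffer between '<' and '>'. `stripTagsAndEntities` uses regular expressions instead and also drops entity references such as `&nbsp;` and `&gt;`. Neither is a real HTML parser: a '>' inside an attribute value, comments, and script blocks will trip them up.

```java
// Hypothetical helper class, written for illustration only.
public class TagStripper {

    // Character-by-character approach: stop appending at '<',
    // resume after the matching '>'.
    static String stripTags(String html) {
        StringBuilder buffer = new StringBuilder();
        boolean insideTag = false;
        int i = 0;
        while (i < html.length()) {
            char c = html.charAt(i);
            if (c == '<') {
                insideTag = true;          // start of excluded text
            } else if (c == '>' && insideTag) {
                insideTag = false;         // end of excluded text
            } else if (!insideTag) {
                buffer.append(c);          // ordinary text: keep it
            }
            i++;
        }
        return buffer.toString();
    }

    // Regular-expression approach: remove tags first, then remove
    // named or numeric entity references such as &nbsp; or &#160;.
    static String stripTagsAndEntities(String html) {
        String withoutTags = html.replaceAll("<[^>]*>", "");
        return withoutTags.replaceAll("&#?\\w+;", "");
    }

    public static void main(String[] args) {
        System.out.println(stripTags("<b>Hello</b>, <i>world</i>!"));
        System.out.println(stripTagsAndEntities("<p>a&nbsp;&gt;&nbsp;b</p>"));
    }
}
```

The entity pattern deletes the references outright. If a space should remain where `&nbsp;` was, replace that entity with " " before running the general pattern.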