• Post Reply Bookmark Topic Watch Topic
  • New Topic
programming forums Java Mobile Certification Databases Caching Books Engineering Micro Controllers OS Languages Paradigms IDEs Build Tools Frameworks Application Servers Open Source This Site Careers Other Pie Elite all forums
this forum made possible by our volunteer staff, including ...
Marshals:
  • Campbell Ritchie
  • Jeanne Boyarsky
  • Ron McLeod
  • Paul Clapham
  • Liutauras Vilda
Sheriffs:
  • paul wheaton
  • Rob Spoor
  • Devaka Cooray
Saloon Keepers:
  • Stephan van Hulst
  • Tim Holloway
  • Carey Brown
  • Frits Walraven
  • Tim Moores
Bartenders:
  • Mikalai Zaikin

HTML source code into a text file

 
Ranch Hand
Posts: 36
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Hello, i have a problem with something.I want to get all the source code from an HTML page and then put it into a text file. Any help will be really appreciated.
 
Ranch Hand
Posts: 47
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Hi James,
I am not sure what I need. Maybe the following will help:
Right click on the html page in the browser and click on View Source. This will open the source in notepad (0r default text editor). You can now save this file as txt or anything else that you need.
Hope that helps!
Cheerio!
 
James Palmer
Ranch Hand
Posts: 36
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
thanks for the help
What i really need though is to do this via java!!Sorry if that wasn't clear before.Thanks
 
Ranch Hand
Posts: 263
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Try this:


Hope that helps,
Tom Blough
 
James Palmer
Ranch Hand
Posts: 36
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
thanks a lot pal you ve saved my life!!
 
James Palmer
Ranch Hand
Posts: 36
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Thanks a lot for your reply Tom very helpful.I would now like to create a new string from the original, but this time I only want the string to contain words that are not inside <>.How can i do that?
 
Tom Blough
Ranch Hand
Posts: 263
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
James,
Try this:


Tom
 
Sheriff
Posts: 7023
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Moving this to the Intermediate forum...
 
James Palmer
Ranch Hand
Posts: 36
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
thanks Tom
If i want to leave out things inside > and  
how can that be added to the code above?Thanks in advance
 
James Palmer
Ranch Hand
Posts: 36
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Sorry my previous post was wrong.What i wanted to know is if i want to leave out things inside '& gt' and '& nbsp' how can the code above be made to include those?

[ March 15, 2004: Message edited by: James Palmer ]
 
Tom Blough
Ranch Hand
Posts: 263
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
James,
Play with the code within the while loop where you examine each character. When you see the start character of what you want to exclude, stop appending to the buffer. After you see the ending character of your exclusion string, start appending again.
Also see https://coderanch.com/forums/ for a way to use regular expressions to remove unwanted information from the string using the first piece of code.
Tom
 
The harder you work, the luckier you get. This tiny ad brings luck - just not good luck or bad luck.
a bit of art, as a gift, the permaculture playing cards
https://gardener-gift.com
reply
    Bookmark Topic Watch Topic
  • New Topic