
Reading in HTML file, having trouble finding tags

 
Greenhorn
Posts: 24
Hi, I'm reading some pages from the BBC.co.uk web site. All of the news articles have the tags <!-- S BO --> and, later on, <!-- E BO --> to indicate where the main body of text is. I'm trying to locate the first tag and then add everything I find to a string until I reach the second tag.

This code I have pasted in is meant to find the first tag and print out confirmation when it does. The problem is that a lot of the time it cannot find the first tag.

For instance, running it on this link, it can't find the tag:

http://news.bbc.co.uk/sport1/hi/football/teams/t/tottenham_hotspur/6225089.stm

even though, if I go edit->view source in my browser and do edit->find, I can see the tag I'm looking for.

However, if I run it on this article, it finds the tag:
http://news.bbc.co.uk/1/hi/education/6224801.stm




I cannot see what is going wrong with my program at all. If anyone can offer any advice, thanks a lot!
 
Rancher
Posts: 3742
Could it be a timing problem? If nothing has been returned from the server the first time you call in.ready(), it may never go into the while loop. I ran it on both URLs and neither of them worked. I then put a sleep in before the while loop and they both worked.
Try putting some debug statements in to see exactly what it is doing.


BTW - I believe it will be more efficient to use a StringBuffer/StringBuilder to build up your string rather than continuously concatenating.
[ January 02, 2007: Message edited by: Joanne Neal ]
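As a rough illustration of the StringBuilder suggestion, here is a minimal sketch. The class and method names (LineJoiner, join) are made up for this example and are not from the original program. StringBuilder needs Java 5 or later; StringBuffer is the older, synchronized equivalent with the same append() calls.

```java
class LineJoiner {
    /** Joins the given lines, appending a newline after each one. */
    static String join(String[] lines) {
        // One growing buffer, instead of creating a new String on every "+=".
        StringBuilder sb = new StringBuilder();
        for (String line : lines) {
            sb.append(line).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(join(new String[] {"first line", "second line"}));
    }
}
```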
 
Chris Blanchard
Greenhorn
Posts: 24
Hey, thanks for the quick reply.

Whereabouts would the sleep call go? The first time I call in.ready() is within the while loop. (I'll probably use an empty for loop for the time being.)

Also, is putting in a sleep function an ideal solution? Or is it a bit of a hack that'll work but isn't necessarily ideal?

I'll have a look into the StringBuffer/StringBuilder later on as well.

Cheers.
 
Joanne Neal
Rancher
Posts: 3742
Yes, it is definitely a hack. I only put it in to see if that was what the problem was.

You could just loop until in.ready() returns true. You may have to put a timeout into the loop as well, though, in case you never get a response.
 
author
Posts: 14112
You shouldn't use ready() at all, as it can return false at any time in the middle of the page.

Instead, check the return value of read() for -1, which indicates the end of the stream.
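A minimal sketch of that approach, assuming the page is available through a java.io.Reader. The original program is not shown in the thread, so the class and method names here (BodyExtractor, readAll, extractBody) are hypothetical; only the two marker strings come from the question. The loop stops only when read() reports -1 and never consults ready().

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

class BodyExtractor {
    static final String START = "<!-- S BO -->";
    static final String END = "<!-- E BO -->";

    /** Reads everything from the reader; -1 from read() is the only end-of-stream test. */
    static String readAll(Reader in) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[4096];
        int n;
        while ((n = in.read(buf)) != -1) {
            sb.append(buf, 0, n);
        }
        return sb.toString();
    }

    /** Returns the text between the two markers, or null if either marker is missing. */
    static String extractBody(String page) {
        int start = page.indexOf(START);
        if (start < 0) {
            return null;
        }
        int from = start + START.length();
        int end = page.indexOf(END, from);
        if (end < 0) {
            return null;
        }
        return page.substring(from, end);
    }

    public static void main(String[] args) throws IOException {
        // A StringReader stands in for the network stream so the sketch runs offline.
        String html = "<html><!-- S BO --><p>Story text</p><!-- E BO --></html>";
        try (Reader in = new StringReader(html)) {
            System.out.println(extractBody(readAll(in)));
        }
    }
}
```

For a live page, the same methods could be given an InputStreamReader wrapped around the stream returned by URL.openStream(). The right character set for these pages is not stated in the thread, so that choice is left open here.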
 
Wanderer
Posts: 18671
Yes, I believe Ilja has identified the problem.

You can see a discussion of a very similar issue here:

AvailableDoesntDoWhatYouThinkItDoes

The available() method works much like ready(): available() can return 0, or ready() can return false, when there simply isn't any data available immediately, but if you wait a millisecond or two, maybe there will be. Neither is a reliable way to detect the end of a file.
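The difference can be shown without a network connection. In the sketch below, SlowStream is a hypothetical stand-in for a stream whose data has not arrived yet: it always reports available() == 0 even though bytes can still be read. All names are invented for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

class AvailableDemo {
    /** Simulated slow source: data exists, but available() always reports 0. */
    static final class SlowStream extends FilterInputStream {
        SlowStream(InputStream in) {
            super(in);
        }

        @Override
        public int available() {
            return 0;
        }
    }

    static InputStream slow(String text) {
        return new SlowStream(new ByteArrayInputStream(text.getBytes(StandardCharsets.US_ASCII)));
    }

    /** Unreliable: stops as soon as nothing is immediately available. */
    static int countWithAvailable(InputStream in) throws IOException {
        int count = 0;
        while (in.available() > 0) {
            in.read();
            count++;
        }
        return count;
    }

    /** Reliable: stops only when read() returns -1 at the end of the stream. */
    static int countUntilEof(InputStream in) throws IOException {
        int count = 0;
        while (in.read() != -1) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(countWithAvailable(slow("hello"))); // prints 0
        System.out.println(countUntilEof(slow("hello")));      // prints 5
    }
}
```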
 
Chris Blanchard
Greenhorn
Posts: 24
Yeah, thanks, that seems to have fixed it. Putting in a sleep call did make it more reliable, but testing for -1 fixed it.
 