
Locating Block of String - Brain Hurts

 
Gregg Bolinger
Ranch Hand
Posts: 15304
6
Mac OS X IntelliJ IDE Chrome
I am having a hard time with this and just need a shove in the right direction. I need to scrape a web page here at work and locate a specific value. The page is all HTML. There is a TD element that looks like:



I need to get the number for Trigger Count. The number is not always a single digit. Is this even going to be possible? I have no access to the method that is giving this page that number because the web server that is running it is located on a power hardware device we are using to reset equipment remotely. Thanks.
 
Stan James
(instanceof Sidekick)
Posts: 8791
Are you OK with regular expressions? In JRE 1.4 or later, look at the JavaDoc for Pattern and see if you can make one that matches the literal ">Trigger count =", then one or more digits 0-9, then the literal "<". You can use Matcher's find() and group() to get the matching string, then break it apart by splitting on spaces, taking substrings, or whatever. With "capturing groups" you should be able to get the number out directly.

Let us know if that's too vague ... we try to avoid giving the answer straight out but don't want to make it too hard to find either.
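For anyone landing here later, the hint above can be sketched roughly as follows. This is only a sketch under assumptions: the original TD markup is not shown in the thread, so the label spelling, the spacing around "=", the sample HTML, and the names TriggerCountScraper and findTriggerCount are all made up for illustration. Only Pattern, Matcher.find() and Matcher.group(int) come from the JDK.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TriggerCountScraper {

    // Assumed layout: ">" then the label, "=", one or more digits, then "<".
    // \s* tolerates optional whitespace; CASE_INSENSITIVE covers both
    // "Trigger Count" and "Trigger count". Adjust to the real page.
    private static final Pattern TRIGGER_COUNT = Pattern.compile(
            ">\\s*Trigger Count\\s*=\\s*(\\d+)\\s*<",
            Pattern.CASE_INSENSITIVE);

    // Returns the number from the first match, or -1 if the label is not found.
    static int findTriggerCount(String html) {
        Matcher m = TRIGGER_COUNT.matcher(html);
        if (m.find()) {
            // group(1) is the capturing group: just the digits, no label.
            return Integer.parseInt(m.group(1));
        }
        return -1;
    }

    public static void main(String[] args) {
        // Hypothetical sample; the real TD element may look different.
        String sample = "<tr><td>Trigger Count = 42</td></tr>";
        System.out.println(findTriggerCount(sample)); // prints 42
    }
}
```

Because the digits are matched with `\d+`, multi-digit counts work the same way as single-digit ones. The -1 return is just one way to signal "not found"; throwing an exception would work as well.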
 
Gregg Bolinger
Ranch Hand




Thanks Stan. I'll look into that.

we try to avoid giving the answer straight out...

Really? Is that what we do?
 
Gregg Bolinger
Ranch Hand
Well, I've got some refactoring to do but it works. Thanks Stan.
 
Jesper de Jong
Java Cowboy
Posts: 16084
88
Android Scala IntelliJ IDE Spring Java
I don't know how you're reading the HTML page in your application - are you doing it by parsing it yourself, with lots of string manipulations etc.?

There are easier ways - you could use a HTML parser library like this one, for example: http://htmlparser.sourceforge.net/
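To give a feel for the parser-based approach without pulling in a third-party jar, here is a rough sketch that swaps in the small HTML parser bundled with the JDK (javax.swing.text.html) instead of the library linked above. It is not that library's API. The class name, method names, label text, and page layout are assumptions for illustration only.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

class TriggerCountCellParser {

    // Assumed label text; the real page may spell or space it differently.
    private static final Pattern LABEL = Pattern.compile(
            "Trigger Count\\s*=\\s*(\\d+)", Pattern.CASE_INSENSITIVE);

    // Collects the text content of every <td> element in the page.
    static List<String> cellTexts(String html) {
        final List<String> cells = new ArrayList<>();
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            private StringBuilder current; // non-null only while inside a <td>

            @Override
            public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
                if (tag == HTML.Tag.TD) {
                    current = new StringBuilder();
                }
            }

            @Override
            public void handleText(char[] data, int pos) {
                if (current != null) {
                    current.append(data);
                }
            }

            @Override
            public void handleEndTag(HTML.Tag tag, int pos) {
                if (tag == HTML.Tag.TD && current != null) {
                    cells.add(current.toString().trim());
                    current = null;
                }
            }
        };
        try {
            // true = ignore any charset declared in the page; we already have a String.
            new ParserDelegator().parse(new StringReader(html), callback, true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return cells;
    }

    // Returns the count from the first cell carrying the label, or -1 if none does.
    static int findTriggerCount(String html) {
        for (String cell : cellTexts(html)) {
            Matcher m = LABEL.matcher(cell);
            if (m.find()) {
                return Integer.parseInt(m.group(1));
            }
        }
        return -1;
    }
}
```

For a single value on a single page this is more machinery than a regex, but it may hold up better if the surrounding markup changes, since it only looks at the text inside table cells.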
 
Gregg Bolinger
Ranch Hand




Thanks. I didn't need to parse all the HTML for any reason other than to find one little part, and regex worked perfectly for it.
 
Stan James
(instanceof Sidekick)
we try to avoid giving the answer straight out...

Really? Is that what we do?

Wow, guess I didn't see who that was from.
 