Copy HTML content into Excel by Java

 
Greenhorn
Posts: 1
I am a Java beginner and would be grateful if you could provide some sample code or guidelines for the situation below.

I have a large number of HTML files, each containing information about one school. The files sit at different depths of the folder hierarchy, but each one is always in the lowest-level folder of its path. Some folders contain no school HTML files.

For example

C:\schools\england\london\hampstead\school_A.html [1 html in 1 folder]
C:\schools\england\london\southwark\school_B.html [multiple files in 1 folder]
C:\schools\england\london\southwark\school_C.html
C:\schools\england\london\southwark\school_D.html
C:\schools\wales\monmouth\school_E.html [file at different path level]
C:\schools\scotland\aberdeen\aberdeen [folder has no file]
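
One way to collect the files from a layout like the one above is to walk the whole tree and keep only the .html files, so the depth of each folder and any empty folders stop mattering. This is a minimal sketch using only the standard library; the class and method names (SchoolFileFinder, findHtmlFiles) are made up for illustration, and C:\schools is simply the root taken from the example paths.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper class, not part of any library.
class SchoolFileFinder {

    // Returns every regular file under root whose name ends in ".html"
    // (case-insensitive), at any depth, sorted by path.
    static List<Path> findHtmlFiles(Path root) throws IOException {
        // Files.walk holds directory handles open, so close the stream when done.
        try (Stream<Path> paths = Files.walk(root)) {
            return paths
                    .filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString().toLowerCase().endsWith(".html"))
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumed root folder; pass a different one as the first argument.
        Path root = Paths.get(args.length > 0 ? args[0] : "C:\\schools");
        for (Path file : findHtmlFiles(root)) {
            System.out.println(file);
        }
    }
}
```

Folders without any HTML file, such as the aberdeen one, simply contribute nothing to the list, so they need no special handling.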


HTML CONTENT TO BE COPIED
<h1 id="MainControl_CustomFunctionality_ZoneMain_EmbeddedUserControlPlaceholderControl1_ctl01_schoolName" class="schoolName">school_A</h1>

<li id="MainControl_CustomFunctionality_ZoneMain_EmbeddedUserControlPlaceholderControl1_ctl01_boardingTypeContainer" style="list-style: none;">Day/boarding type: Day, full boarding and weekly boarding</li>

<li id="MainControl_CustomFunctionality_ZoneMain_EmbeddedUserControlPlaceholderControl1_ctl01_boardingFeeContainer" style="list-style: none;">Boarding fees per term: £7,317 to £8,370</li>
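
Since the three elements above are identified by the end of their id attributes, the values could be pulled out roughly as follows. This sketch deliberately uses a regular expression from the standard library rather than a real HTML parser, which would be the more robust choice for arbitrary pages. It assumes each element holds plain text with no nested tags and does not decode entities such as &pound;. The names SchoolHtmlExtractor, textOfElement and afterLabel are invented for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of any library.
class SchoolHtmlExtractor {

    // Returns the text that follows the first opening tag whose id ends with
    // idSuffix, up to the next '<'. Returns "" when no such tag is found.
    static String textOfElement(String html, String idSuffix) {
        Pattern p = Pattern.compile(
                "<[^>]*\\bid=\"[^\"]*" + Pattern.quote(idSuffix) + "\"[^>]*>([^<]*)");
        Matcher m = p.matcher(html);
        return m.find() ? m.group(1).trim() : "";
    }

    // Drops a leading "Label:" prefix, e.g. "Day/boarding type: Day" -> "Day".
    // Text without a colon is returned unchanged.
    static String afterLabel(String text) {
        int colon = text.indexOf(':');
        return colon < 0 ? text : text.substring(colon + 1).trim();
    }

    public static void main(String[] args) {
        // Shortened ids; only the suffix matters for the lookup.
        String html =
                "<h1 id=\"X_ctl01_schoolName\" class=\"schoolName\">school_A</h1>\n"
              + "<li id=\"X_ctl01_boardingTypeContainer\" style=\"list-style: none;\">"
              + "Day/boarding type: Day, full boarding and weekly boarding</li>\n"
              + "<li id=\"X_ctl01_boardingFeeContainer\" style=\"list-style: none;\">"
              + "Boarding fees per term: \u00a37,317 to \u00a38,370</li>";

        System.out.println(textOfElement(html, "schoolName"));
        System.out.println(afterLabel(textOfElement(html, "boardingTypeContainer")));
        System.out.println(afterLabel(textOfElement(html, "boardingFeeContainer")));
    }
}
```

When reading the real files, the character encoding matters for the pound sign, so it is worth checking which charset the pages were saved in before converting their bytes to a String.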

EXPECTED RESULTS IN EXCEL TABLE
3 column headers: "SCHOOL" "BOARDING TYPE" "BOARDING FEES PER TERM"

Row 1: "school_A" "Day, full boarding and weekly boarding" "£7,317 to £8,370"
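
If a file that Excel can open is enough, the table above could be produced as CSV with the standard library alone. Note that this swaps in CSV for a native Excel workbook; writing a real .xlsx file would need a library such as Apache POI. The sketch below quotes every field because the values contain commas, and it writes a UTF-8 byte order mark, which Excel generally uses to detect the encoding so that the pound sign displays correctly. The class name SchoolCsvWriter and the output file name schools.csv are assumptions made for the example.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class, not part of any library.
class SchoolCsvWriter {

    // Wraps a field in double quotes and doubles any quotes inside it,
    // so commas in the value do not split the column.
    static String quote(String field) {
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    // Joins the quoted fields of one row with commas.
    static String toCsvLine(List<String> fields) {
        List<String> quoted = new ArrayList<>();
        for (String field : fields) {
            quoted.add(quote(field));
        }
        return String.join(",", quoted);
    }

    // Writes the header row followed by one line per school.
    static void write(Path target, List<List<String>> rows) throws IOException {
        StringBuilder sb = new StringBuilder("\uFEFF"); // UTF-8 byte order mark
        sb.append(toCsvLine(Arrays.asList(
                "SCHOOL", "BOARDING TYPE", "BOARDING FEES PER TERM"))).append("\r\n");
        for (List<String> row : rows) {
            sb.append(toCsvLine(row)).append("\r\n");
        }
        Files.write(target, sb.toString().getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        List<List<String>> rows = new ArrayList<>();
        rows.add(Arrays.asList("school_A",
                "Day, full boarding and weekly boarding",
                "\u00a37,317 to \u00a38,370"));
        write(Paths.get("schools.csv"), rows); // assumed output file name
    }
}
```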

Thank you very much for your help
 
Marshal
Posts: 79177
Welcome to the Ranch

That is not a beginner's project. You can find HTML parsers to read the files, and you can use the Apache POI project to write directly to spreadsheets. We do not have a bank of code which we give to users.
 