
Relative URLs

 
Farakh khan
Ranch Hand
Posts: 833
How can my Java code get all the relative URLs, e.g.
www.yahoo.com/aa
www.yahoo.com/bb
www.yahoo.com/cc
www.yahoo.com/dd
www.yahoo.com/ee
etc.

Thanks & best regards
 
Rob Spoor
Sheriff
Posts: 20495
If all those URLs are located in an HTML page you can parse the page and look for all HREF and SRC attributes.
 
Ulf Dittmer
Rancher
Posts: 42967
What do you mean by "relative URLs"? URLs are always absolute; paths within a web site may be relative.

Can you give an example of an input and an output of what you're trying to do?
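
To illustrate the distinction between an absolute URL and a relative path, here is a minimal sketch that resolves references against the URL of the page they appear on, using java.net.URI. The class name ResolveDemo and the helper resolve are hypothetical, chosen only for this example.

```java
import java.net.URI;

public class ResolveDemo {

    // Resolves a reference found in a page against that page's own URL.
    // An absolute reference is returned unchanged.
    public static String resolve(String pageUrl, String reference) {
        return URI.create(pageUrl).resolve(reference).toString();
    }

    public static void main(String[] args) {
        String page = "http://www.coderanch.com/forums/user/edit";

        // Relative to the current "directory" of the page
        System.out.println(resolve(page, "login"));
        // prints http://www.coderanch.com/forums/user/login

        // Relative to the root of the site
        System.out.println(resolve(page, "/forums"));
        // prints http://www.coderanch.com/forums

        // Already absolute, so the page URL is ignored
        System.out.println(resolve(page, "http://faq.javaranch.com/Watch/"));
        // prints http://faq.javaranch.com/Watch/
    }
}
```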
 
Farakh khan
Ranch Hand
Posts: 833
Originally posted by Ulf Dittmer:
What do you mean by "relative URLs"? URLs are always absolute; paths within a web site may be relative.

Can you give an example of an input and an output of what you're trying to do?


http://www.javaranch.com has many other related URLs e.g.
http://www.coderanch.com/forums/user/edit
http://www.coderanch.com/forums/user/login
http://faq.javaranch.com/Watch/
http://www.javaranch.com
etc.

How can my Java code read the related URLs of http://www.javaranch.com?

Thanks again & best regards
 
Ulf Dittmer
Rancher
Posts: 42967
So the input would be a web page, and the output would be a list of all URLs on that web page?
 
Farakh khan
Ranch Hand
Posts: 833
Originally posted by Ulf Dittmer:
So the input would be a web page, and the output would be a list of all URLs on that web page?


Yes, but how could I achieve this?

Thanks again
 
Rob Spoor
Sheriff
Posts: 20495
Like I said, parse the page and filter out the right attributes.



Of course HREF is not the only attribute that can hold a URL. The following could also be used:
ACTION (forms)
BACKGROUND
CODEBASE
SRC (images, iframes, etc)

Plus possibly others.
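
A minimal sketch of this approach, assuming the HTML parser that ships with the JDK (javax.swing.text.html) is acceptable for the pages involved. The class name LinkExtractor and the method extractLinks are hypothetical names for this example, and the attribute list is only the one mentioned in this thread, not a complete one. Each value found is resolved against the page's own URL, so relative paths come out as absolute URLs.

```java
import java.io.IOException;
import java.io.Reader;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

public class LinkExtractor {

    // Attributes named in this thread; other attributes may carry URLs too.
    private static final HTML.Attribute[] URL_ATTRIBUTES = {
        HTML.Attribute.HREF, HTML.Attribute.SRC, HTML.Attribute.ACTION,
        HTML.Attribute.BACKGROUND, HTML.Attribute.CODEBASE
    };

    // Parses the HTML from the reader and returns every URL-bearing attribute
    // value, resolved against the URL of the page (base).
    public static List<String> extractLinks(Reader html, URI base) throws IOException {
        List<String> links = new ArrayList<>();

        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
                collect(attrs); // tags with content, e.g. <a>, <form>
            }

            @Override
            public void handleSimpleTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
                collect(attrs); // empty tags, e.g. <img>
            }

            private void collect(MutableAttributeSet attrs) {
                for (HTML.Attribute name : URL_ATTRIBUTES) {
                    Object value = attrs.getAttribute(name);
                    if (value == null) {
                        continue;
                    }
                    try {
                        links.add(base.resolve(value.toString().trim()).toString());
                    } catch (IllegalArgumentException e) {
                        // Not a syntactically valid URI reference; skip it.
                    }
                }
            }
        };

        // true = ignore any charset declared inside the document
        new ParserDelegator().parse(html, callback, true);
        return links;
    }
}
```

To run it against a live page, one option is to wrap the stream from `new URL(address).openStream()` in an InputStreamReader and pass the same address as the base URI.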
 
Ilja Preuss
author
Sheriff
Posts: 14112
Originally posted by Ulf Dittmer:
So the input would be a web page, and the output would be a list of all URLs on that web page?


Mhhh, my initial understanding was that the input would be a website address, and the output would be the URLs of all pages that belong to that site.

To which the answer would have been: not possible in general, not with Java or any other language. The HTTP protocol simply doesn't provide the necessary information.
 
Farakh khan
Ranch Hand
Posts: 833
Originally posted by Rob Prime:
Like I said, parse the page and filter out the right attributes.



Of course HREF is not the only attribute that can hold a URL. The following could also be used:
ACTION (forms)
BACKGROUND
CODEBASE
SRC (images, iframes, etc)

Plus possibly others.


Great!

Thanks a lot. I am trying to understand it.

Thanks & best regards
 