| Author |
Getting link attachment
|
J Lalit
Greenhorn
Joined: May 19, 2008
Posts: 15
|
|
Hi,
I have a requirement to extract the links from the body of some content. I have that part working, but I am stuck on verifying whether a given HTTP link in the content is a simple link to a page/site or an attachment.
I have searched Google quite a bit, without success.
Any ideas, links, or sample code?
Thanks & Regards,
Lalit
|
|
 |
Steve Luke
Bartender
Joined: Jan 28, 2003
Posts: 3026
|
|
To get it right, you pretty much need to send a request to the URL and read the Content-Type response header. You may be able to get away with sending a HEAD request, which asks the server to respond with only the headers for that resource, without the body. You would send the request, read the Content-Type value in the response, and determine the type of link from it.
More on HTTP methods (including the HEAD method)
HttpURLConnection API, Tutorial on URLConnections
Example of reading Headers with URLConnection
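As a rough illustration of the HEAD approach described above, here is a minimal sketch using HttpURLConnection. The class name ContentTypeProbe, the method names, and the 5-second timeouts are assumptions made for this example. It also assumes the link is an http or https URL, since anything else would fail the cast.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.util.Locale;

// Hypothetical helper, not part of any library.
final class ContentTypeProbe {

    // Strips parameters such as "; charset=UTF-8" and lower-cases the media type.
    // Returns null when the server sent no Content-Type header.
    static String baseType(String headerValue) {
        if (headerValue == null) {
            return null;
        }
        int semicolon = headerValue.indexOf(';');
        String type = semicolon >= 0 ? headerValue.substring(0, semicolon) : headerValue;
        return type.trim().toLowerCase(Locale.ROOT);
    }

    // Sends a HEAD request and returns the normalized media type of the response.
    // Assumes an http/https link; other schemes would fail the cast below.
    static String probe(String link) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(link).toURL().openConnection();
        try {
            conn.setRequestMethod("HEAD");   // ask for headers only, no body
            conn.setConnectTimeout(5000);    // assumed timeout, in milliseconds
            conn.setReadTimeout(5000);
            conn.getResponseCode();          // forces the request to be sent
            return baseType(conn.getContentType());
        } finally {
            conn.disconnect();
        }
    }
}
```

Calling ContentTypeProbe.probe on a link returns a value such as "text/html" or "application/pdf", or null if the server sent no Content-Type. Some servers do not handle HEAD well, so falling back to a GET request may be necessary in practice.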
|
Steve
|
 |
J Lalit
Greenhorn
Joined: May 19, 2008
Posts: 15
|
|
Hi,
Thanks for the reply. How can I tell whether a particular link is a normal HTTP link or an attachment? Would this work:
A normal link will return "text/html".
An attachment will return application/type, where type can be msword, pdf, etc.
Does this rule always hold?
|
 |
Henry Wong
author
Sheriff
Joined: Sep 28, 2004
Posts: 16681
|
|
Well, it depends on what you define as a "normal link". Are HTML pages the only thing you consider a normal link? What about regular text pages (text/plain)? What about XML pages (text/xml)?
The list of possible content types is actually pretty big, and growing. You need to decide where to draw the line.
BTW, I did a quick Google search: http://www.iana.org/assignments/media-types/
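One way to "draw the line" is to keep an explicit list of the media types you treat as pages and call everything else an attachment. The sketch below is only an illustration. The class name LinkClassifier, the Kind enum, and the particular set of page types are assumptions for this example, and the set should be adjusted to whatever your requirement considers a normal link.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical classifier, not part of any library.
final class LinkClassifier {

    enum Kind { PAGE, ATTACHMENT, UNKNOWN }

    // Assumption: the media types this application chooses to treat as pages.
    private static final Set<String> PAGE_TYPES = new HashSet<>(Arrays.asList(
            "text/html", "application/xhtml+xml", "text/plain", "text/xml"));

    // Classifies a raw Content-Type header value such as "text/html; charset=UTF-8".
    static Kind classify(String contentTypeHeader) {
        if (contentTypeHeader == null) {
            return Kind.UNKNOWN;
        }
        // Drop parameters after ';' and normalize case before the lookup.
        String type = contentTypeHeader.split(";", 2)[0].trim().toLowerCase(Locale.ROOT);
        if (type.isEmpty()) {
            return Kind.UNKNOWN;
        }
        return PAGE_TYPES.contains(type) ? Kind.PAGE : Kind.ATTACHMENT;
    }
}
```

With this set, "text/html; charset=UTF-8" is classified as PAGE and "application/pdf" as ATTACHMENT. Some servers also mark downloads with a Content-Disposition: attachment header, which could be checked in addition to the content type.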
Henry
|
Books: Java Threads, 3rd Edition, Jini in a Nutshell, and Java Gems (contributor)
|