
Getting link attachment

 
J Lalit
Greenhorn
Posts: 15
Hi,

I have a requirement to extract the links from the body of some content. I have done that, but I am stuck on verifying whether a given HTTP link in the content is a simple link to a page/site or an attachment.

I did a fair amount of googling, but with no success.

Any idea or link or piece of code?

Thanks & Regards,
Lalit
 
Steve Luke
Bartender
Posts: 4181
To get it right, you pretty much need to send a request to the URL and read the Content-Type header. You may be able to get away with sending a HEAD request, which asks the server to respond with only the headers for a particular page. You would send the request, read the Content-Type response header, and determine the type of link from that.

More on HTTP methods (including the HEAD method)
HttpURLConnection API, Tutorial on URLConnections
Example of reading Headers with URLConnection
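
As a rough illustration of the HEAD approach described above, here is a minimal sketch using HttpURLConnection. The class name HeadProbe, the helper names, and the 5-second timeouts are made up for this example. It assumes the server answers HEAD requests sensibly; some servers do not, in which case a GET may be needed instead.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.util.Locale;

// Hypothetical helper class, not part of any library.
class HeadProbe {

    /** Sends a HEAD request and returns the raw Content-Type header, or null if absent. */
    public static String fetchContentType(String link) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) URI.create(link).toURL().openConnection();
        try {
            conn.setRequestMethod("HEAD");   // ask for headers only, no body
            conn.setConnectTimeout(5000);    // assumed timeouts, tune as needed
            conn.setReadTimeout(5000);
            conn.connect();
            return conn.getContentType();    // e.g. "text/html; charset=UTF-8"
        } finally {
            conn.disconnect();
        }
    }

    /** Strips parameters such as "; charset=UTF-8" and lower-cases the media type. */
    public static String mediaType(String contentTypeHeader) {
        if (contentTypeHeader == null) {
            return null;
        }
        int semi = contentTypeHeader.indexOf(';');
        String type = semi >= 0 ? contentTypeHeader.substring(0, semi) : contentTypeHeader;
        return type.trim().toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) throws IOException {
        // Pass one or more URLs on the command line.
        for (String link : args) {
            System.out.println(link + " -> " + mediaType(fetchContentType(link)));
        }
    }
}
```

The header often carries a charset parameter, so comparing against the bare media type returned by mediaType is safer than comparing the raw header string.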
 
J Lalit
Greenhorn
Posts: 15
Hi,

Thanks for the reply. How can I tell whether a particular link is a normal HTTP link or an attachment? Does this work:

A normal link will return "text/html".
An attachment will return application/<type>, where <type> can be msword, pdf, etc.

Does this rule always hold?

 
Henry Wong
author
Marshal
Posts: 20831
Well, it depends on what you define as a "normal link". Are HTML pages the only thing you consider a normal link? What about regular text pages (text/plain)? What about XML pages (text/xml)?

The list of possible content types is actually pretty big. And growing. You need to decide where to draw the line.


BTW, a quick Google search turned up the IANA list of media types: http://www.iana.org/assignments/media-types/
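
One way to "draw the line" is to keep an explicit set of media types that count as pages and treat everything else as an attachment. The sketch below is only an assumption about where that line might go: the class name LinkClassifier and the contents of PAGE_TYPES are invented for illustration. It also checks the Content-Disposition header, which was not mentioned earlier in this thread but which servers can use to mark a response as an attachment explicitly.

```java
import java.util.Locale;
import java.util.Set;

// Hypothetical classifier; the set of "page" types is an assumption, adjust to taste.
class LinkClassifier {

    static final Set<String> PAGE_TYPES = Set.of(
            "text/html", "application/xhtml+xml", "text/plain", "text/xml");

    /**
     * Returns true if the response headers suggest an attachment.
     * Both arguments are raw header values and may be null.
     */
    public static boolean isAttachment(String contentType, String contentDisposition) {
        // An explicit "Content-Disposition: attachment" wins regardless of media type.
        if (contentDisposition != null
                && contentDisposition.trim().toLowerCase(Locale.ROOT).startsWith("attachment")) {
            return true;
        }
        if (contentType == null) {
            return false; // assumption: with no type information, treat it as a page
        }
        // Drop parameters such as "; charset=UTF-8" before comparing.
        int semi = contentType.indexOf(';');
        String media = (semi >= 0 ? contentType.substring(0, semi) : contentType)
                .trim().toLowerCase(Locale.ROOT);
        return !PAGE_TYPES.contains(media);
    }
}
```

With this set, LinkClassifier.isAttachment("application/pdf", null) is true and LinkClassifier.isAttachment("text/html; charset=UTF-8", null) is false. Whether images, JSON, or other types belong in PAGE_TYPES depends on your requirement.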

Henry
 