
URL validation

 
Ming Chen
Greenhorn
Posts: 13
I need to check the URL that the user enters in my servlet.
The code below works fine for a simple URL such as "www.google.com", but if urlString has a query string, such as "http://www.google.com/search?sourceid=navclient&q=velocity+apache", url.openStream() throws an exception. I don't know why: a URL with a query string is valid, but it just won't work. Could anyone help? Is there an alternative?
// Open a connection to the URL to get the size of the page
URL url = null;
DataInputStream input = null;
try {
    url = new URL(urlString);           // throws MalformedURLException if the syntax is bad
    InputStream in = url.openStream();  // throws IOException if the page cannot be read
    input = new DataInputStream(in);
} catch (Exception e) {
    out.println("The URL you provided cannot be accessed, please try again.");
    return;
}
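To see why openStream() fails, it can help to separate two questions: is the string a well-formed URL, and does the server accept the request? The following is only a rough sketch of that idea. The class name UrlCheckSketch and the methods isWellFormed and responseCode are made up for illustration and are not part of the original servlet. It relies on HttpURLConnection.getResponseCode(), which returns the HTTP status (for example 403) instead of throwing the way openStream() does.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper class, for illustration only.
class UrlCheckSketch {

    // Syntax check only, no network access.
    // Accepts absolute http/https URLs that have a host part.
    static boolean isWellFormed(String urlString) {
        try {
            URI uri = new URI(urlString);
            String scheme = uri.getScheme();
            return uri.getHost() != null
                    && ("http".equalsIgnoreCase(scheme) || "https".equalsIgnoreCase(scheme));
        } catch (URISyntaxException e) {
            return false;
        }
    }

    // Network check: returns the HTTP status code sent by the server.
    // An IOException here means the server could not be reached at all.
    static int responseCode(String urlString) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(urlString).toURL().openConnection();
        conn.setConnectTimeout(5000); // milliseconds; arbitrary value for this sketch
        conn.setReadTimeout(5000);
        try {
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) {
        if (args.length == 0) {
            System.out.println("usage: java UrlCheckSketch <url>");
            return;
        }
        String urlString = args[0];
        if (!isWellFormed(urlString)) {
            System.out.println("Not a well-formed http/https URL: " + urlString);
            return;
        }
        try {
            System.out.println("Server answered with HTTP status " + responseCode(urlString));
        } catch (IOException e) {
            System.out.println("Could not reach the server: " + e);
        }
    }
}
```

With a check like this, the servlet could tell the user whether the URL was mistyped or whether the server simply refused the request, rather than printing the same message for every exception.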
 
Michael Target
Greenhorn
Posts: 12
Ming Chen, for the query "http://www.google.com/search?sourceid=navclient&q=velocity+apache" your program received HTTP response code 403 (Forbidden) from Google. The reason is in the Google Terms of Service for personal use:
"No Automated Querying
You may not send automated queries of any sort to Google's system without express permission in advance from Google. Note that "sending automated queries" includes, among other things:
using any software which sends queries to Google to determine how a website or webpage "ranks" on Google for various queries;
"meta-searching" Google; and
performing "offline" searches on Google.
Please do not write to Google to request permission to "meta-search" Google for a research project, as such requests will not be granted"
 
Ming Chen
Greenhorn
Posts: 13
Thanks, Michael Target.
You are very helpful. I tried a URL with a query string for another web page and it works fine. But how does Google know that the query was not sent from a web browser?
Ming
 
Michael Target
Greenhorn
Posts: 12
Google can probably tell web browsers apart from other clients by the User-Agent header of the HTTP request.
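To illustrate the point about the user agent: by default, Java's HttpURLConnection identifies itself with a User-Agent of the form "Java/<version>", and a program can set its own value with setRequestProperty. The sketch below only shows where that header is set. The class name UserAgentSketch, the method prepare, and the agent string are invented for this example, and whether any particular server treats the request differently is an assumption that has not been tested here. Changing the header also does not change what the terms of service quoted above allow.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;

// Hypothetical helper class, for illustration only.
class UserAgentSketch {

    // Made-up agent string that names the program honestly.
    static final String SKETCH_AGENT = "UrlCheckSketch/1.0";

    // Creates a connection object with an explicit User-Agent header.
    // Nothing is sent over the network until the caller reads from the connection.
    static HttpURLConnection prepare(String urlString, String userAgent) {
        try {
            HttpURLConnection conn =
                    (HttpURLConnection) URI.create(urlString).toURL().openConnection();
            conn.setRequestProperty("User-Agent", userAgent);
            conn.setConnectTimeout(5000); // milliseconds; arbitrary value for this sketch
            conn.setReadTimeout(5000);
            return conn;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java UserAgentSketch <url>");
            return;
        }
        HttpURLConnection conn = prepare(args[0], SKETCH_AGENT);
        System.out.println("User-Agent to be sent: " + conn.getRequestProperty("User-Agent"));
        try {
            System.out.println("HTTP status: " + conn.getResponseCode());
        } finally {
            conn.disconnect();
        }
    }
}
```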
 