


Hide Servlet Response from appearing in Google Search

ujjwal soni
Ranch Hand

Joined: Mar 28, 2007
Posts: 403
I have a servlet that generates a PDF report, and the report contents now appear in Google search results. Is there any way I can prevent this?


Cheers!!!
Ujjwal B Soni <baroda, gujarat, india> <+919909981973>
"Helping hands are better than praying lips......"
ujjwal soni
Ranch Hand

Joined: Mar 28, 2007
Posts: 403
Update:

I added two meta tags, which I hope will work. I am now waiting for Google to reindex.

Jagdish Hatagale
Ranch Hand

Joined: Apr 07, 2010
Posts: 33
Hi, you have to use a JSP page that generates the PDF report with the dynamic fields, so that the data will never be displayed on Google.
ujjwal soni
Ranch Hand

Joined: Mar 28, 2007
Posts: 403
Thanks for your reply. I am generating the dynamic PDF using a servlet; I do not use JSP here. Do you think a PDF generated in a JSP would not be indexed by Google?

I am storing the PDF as a BLOB and then serving it from a servlet by reading the BLOB.
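
For readers following along, here is a minimal sketch of the "read the BLOB and write it to the response" step described above. The class and method names are hypothetical, not taken from the poster's code. In a servlet, the input would presumably come from `java.sql.Blob.getBinaryStream()` and the output from `HttpServletResponse.getOutputStream()`, after calling `response.setContentType("application/pdf")`.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper; the original post does not show its servlet code.
final class PdfBlobStreamer {
    private PdfBlobStreamer() {}

    /**
     * Copies every byte from the BLOB stream to the response stream
     * and returns the number of bytes copied.
     */
    static long copy(InputStream blobStream, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int read;
        // read() returns -1 once the BLOB stream is exhausted
        while ((read = blobStream.read(buffer)) != -1) {
            out.write(buffer, 0, read);
            total += read;
        }
        out.flush();
        return total;
    }
}
```

Note that nothing in this step affects indexing: the crawler only sees the bytes and headers that come back, not how they were produced.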

Tim Moores
Rancher

Joined: Sep 21, 2011
Posts: 2408
If something is accessible without restriction, then it's liable to get indexed. How it was generated does not matter. The drawback of a PDF is that you can't embed a NOINDEX meta tag in it. You can set up a robots.txt file for your site, though.

Note that both these approaches rely on the spider cooperating. Google does so, but other spiders may not. If you want to be sure that your information is safe, don't make it publicly available.
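
As a rough illustration of the crawler hints discussed in this post, the sketch below builds a robots.txt body and a set of response headers. The class name and the `/reports/private/` path are assumptions made up for the example. The `X-Robots-Tag` header is not mentioned in the thread; it is added here because Google documents it as a way to send a noindex directive for non-HTML resources such as PDFs. In a servlet, each header would be applied with `response.setHeader(name, value)`.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; names and paths are illustrative only.
final class CrawlerHints {
    private CrawlerHints() {}

    /** Builds a robots.txt body asking all crawlers to skip one path prefix. */
    static String robotsTxt(String disallowedPrefix) {
        return "User-agent: *\n"
             + "Disallow: " + disallowedPrefix + "\n";
    }

    /** Headers for a PDF response that cooperating crawlers should not index. */
    static Map<String, String> noIndexHeaders() {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("Content-Type", "application/pdf");
        // Header equivalent of the NOINDEX meta tag, usable for non-HTML content
        headers.put("X-Robots-Tag", "noindex, nofollow");
        return headers;
    }
}
```

For example, `CrawlerHints.robotsTxt("/reports/private/")` yields a two-line file to serve at `/robots.txt`. One caveat: a crawler that is blocked from a URL by robots.txt never fetches it, so it never sees the header on that URL. As the post says, both mechanisms depend on the spider cooperating.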
ujjwal soni
Ranch Hand

Joined: Mar 28, 2007
Posts: 403
Thanks for your reply, Tim. I will password-protect the PDFs now.
Devaka Cooray
ExamLab Creator
Saloon Keeper

Joined: Jul 29, 2008
Posts: 3164
    

One more thing you can do besides setting up a robots.txt is to check the User-Agent header to see where the request is coming from. However, it should be noted that some stupid spiders present themselves as Firefox or IE. If you need to prevent all spiders from grabbing your sensitive data, you should never expose it on a publicly accessible page.


Author of ExamLab - a free SCJP / OCPJP exam simulator
Tim Moores
Rancher

Joined: Sep 21, 2011
Posts: 2408
"checking the user-agent header"

That can be spoofed, just like the Referer header, so it can't be relied upon for anything that matters (like security).
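
To make the spoofing point concrete, here is a small sketch of the kind of User-Agent check being discussed. The class name and the keyword list are assumptions for illustration. The check can only see what the client chooses to send, so a spider that sends a browser's User-Agent string is indistinguishable from a real browser.

```java
import java.util.Locale;

// Hypothetical helper; the keyword list is illustrative, not exhaustive.
final class UserAgentCheck {
    private UserAgentCheck() {}

    /** Returns true only if the client *claims* to be a crawler. */
    static boolean claimsToBeCrawler(String userAgent) {
        if (userAgent == null) {
            return false; // header missing: nothing to match against
        }
        String ua = userAgent.toLowerCase(Locale.ROOT);
        return ua.contains("googlebot")
            || ua.contains("bingbot")
            || ua.contains("spider")
            || ua.contains("crawler");
    }
}
```

A request carrying Googlebot's usual User-Agent string is flagged, but a spider sending a Firefox string passes straight through, which is why this is at best a courtesy filter and not an access control.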
ujjwal soni
Ranch Hand

Joined: Mar 28, 2007
Posts: 403
It's better to password-protect the PDFs, then. In my case, there are two types of PDFs: public ones, which are searchable in Google, and private ones, which must not be indexed. So I created two servlets, one that serves the public PDFs and another for the private PDFs. The private PDF servlet is now protected by SSO login, so Google won't be able to crawl it.

I am going to test this tomorrow morning on the live server, but it is currently working fine on my test system.

Thanks for the help.

Michael Cropper
Ranch Hand

Joined: Sep 30, 2009
Posts: 137
Just following on from what everyone else has said: if something is accessible without restriction, then Google will find and index it. I work in SEO professionally and can honestly say that if you don't want Google to see something, then it must be password protected. Personally, I prefer to put all sensitive information in a /secure/ directory that requires authentication before anything in it can be accessed, as opposed to placing a password on the document itself, as described above.

All of the techniques mentioned above (robots.txt, noindex, etc.) are guidelines, which Google may choose to ignore, and often does.

Thanks
Michael
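
The /secure/ directory idea can be sketched as a simple decision function. This is only an illustration under stated assumptions: the class name is hypothetical, the `authenticated` flag stands in for whatever login or SSO check the application uses, and a real deployment would more likely enforce this with a container security constraint or a servlet filter.

```java
// Hypothetical gate; "authenticated" stands in for the application's real login check.
final class SecureAreaGate {
    static final String SECURE_PREFIX = "/secure/";

    private SecureAreaGate() {}

    /**
     * Returns the HTTP status to answer with: 401 for an unauthenticated
     * request under /secure/, 200 otherwise. Assumes requestPath is
     * non-null and already normalized by the container.
     */
    static int statusFor(String requestPath, boolean authenticated) {
        boolean protectedPath = requestPath.startsWith(SECURE_PREFIX);
        if (protectedPath && !authenticated) {
            return 401; // crawlers never log in, so they stop here
        }
        return 200;
    }
}
```

Unlike robots.txt or noindex, this does not depend on the crawler cooperating: an unauthenticated request for `/secure/report.pdf` simply never receives the document.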
 