


Making GWT crawlable

Mike Cheung
Ranch Hand

Joined: Feb 01, 2013
Posts: 85
Hi, we know GWT is not crawlable without some work, because the page source often doesn't contain the same content the browser displays. However, I'm wondering whether we can have, say, Tomcat detect when Googlebot is visiting and return a snapshot of the page to it. The way I'm thinking of doing this is to add a few lines to every controller class that check whether the visitor is a bot and, if so, use HtmlUnit to retrieve the page and return it to the view. Are there any issues with doing this?
Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 42030
Yes. Treating the Google bot differently than human visitors is a big no-no that will be penalized by Google when detected.


Ping & DNS - my free Android networking tools app
Mike Cheung
Ranch Hand

Joined: Feb 01, 2013
Posts: 85
Thanks. But Google's site recommends changing AJAX-powered sites to use '#!' instead of '#' in the URL, so that Googlebot will make a query with the '_escaped_fragment_' parameter, as documented here:
https://developers.google.com/webmasters/ajax-crawling/docs/getting-started?csw=1

If I have mis-interpreted the above please let me know.
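
For readers following along, here is a minimal sketch of the URL rewrite that scheme describes: the crawler takes a '#!' URL and requests the same page with the fragment moved into an '_escaped_fragment_' query parameter. The class and method names are hypothetical, and the set of escaped characters is my reading of the linked document, so treat it as an illustration rather than a reference implementation.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper: maps a "#!" URL to the form a crawler would request.
final class HashBangUrls {

    static String toCrawlerUrl(String prettyUrl) {
        int idx = prettyUrl.indexOf("#!");
        if (idx < 0) {
            return prettyUrl; // no hash-bang fragment, nothing to rewrite
        }
        String base = prettyUrl.substring(0, idx);
        String fragment = prettyUrl.substring(idx + 2);
        // Append to an existing query string if there is one.
        String separator = base.contains("?") ? "&" : "?";
        return base + separator + "_escaped_fragment_=" + escape(fragment);
    }

    // Percent-escapes control characters, space, non-ASCII bytes and # % & +.
    // Assumption: this matches the character set named in Google's document.
    static String escape(String fragment) {
        StringBuilder sb = new StringBuilder();
        for (byte b : fragment.getBytes(StandardCharsets.UTF_8)) {
            int c = b & 0xFF;
            if (c <= 0x20 || c >= 0x7F || c == '#' || c == '%' || c == '&' || c == '+') {
                sb.append(String.format("%%%02X", c));
            } else {
                sb.append((char) c);
            }
        }
        return sb.toString();
    }
}
```

So a link such as http://www.example.com/ajax.html#!key=value would be fetched by the crawler as http://www.example.com/ajax.html?_escaped_fragment_=key=value, and the server is expected to answer that request with the rendered HTML.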

And then there is also GWT Platform (GWTP), and I think one of its features is to make sites built with this framework crawlable. How, I still haven't figured out yet. But given that we can tell which URL is being requested, and that we can load web pages in full via HtmlUnit, I'm trying to find out whether it can be done.

Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 42030
Yes, that's the way to make GWT sites crawlable. Besides the pages you found, more information about that is at http://code.google.com/p/gwt-platform/wiki/CrawlerSupport.
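
On the server side, the idea can be sketched roughly as follows: spot the '_escaped_fragment_' parameter in the query string, rebuild the '#!' URL a browser would have used, and hand that URL to a headless browser such as HtmlUnit to produce the snapshot. The class and method names below are hypothetical and this is not how GWTP itself implements it; the HtmlUnit step is left as a comment.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: recovers the "#!" URL from a crawler request.
final class CrawlerRequests {

    private static final String PARAM = "_escaped_fragment_=";

    // Returns the "#!" URL to render, or null if this is not a crawler request.
    static String toPrettyUrl(String path, String query) {
        if (query == null) {
            return null;
        }
        boolean first = query.startsWith(PARAM);
        int amp = first ? -1 : query.indexOf("&" + PARAM);
        if (!first && amp < 0) {
            return null; // ordinary request, serve the normal GWT host page
        }
        String otherParams = first ? "" : query.substring(0, amp);
        int valueStart = (first ? 0 : amp + 1) + PARAM.length();
        int valueEnd = query.indexOf('&', valueStart);
        if (valueEnd < 0) {
            valueEnd = query.length();
        }
        // URLDecoder.decode(String, Charset) requires Java 10 or later.
        String fragment = URLDecoder.decode(
                query.substring(valueStart, valueEnd), StandardCharsets.UTF_8);
        String url = otherParams.isEmpty() ? path : path + "?" + otherParams;
        // Empty value: there is no fragment to restore.
        return fragment.isEmpty() ? url : url + "#!" + fragment;
        // A filter or controller would now load this URL in a headless browser
        // (e.g. HtmlUnit), wait for the scripts to finish, and return the HTML.
    }
}
```

Because the snapshot is served only for the special query parameter and contains the same content users see, this differs from sniffing the user agent and serving the bot something else.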
Mike Cheung
Ranch Hand

Joined: Feb 01, 2013
Posts: 85
Hi, okay, I've looked at that before and just tried looking at it again, but I'm not getting it. It talks about MVP, AppEngine, Guice, etc. Unfortunately I don't know enough about GWT and the rest of what is mentioned to make any sense of it.

But basically, is it saying that with GWTP any GWT-based page becomes searchable?

I'm looking to create a public web site and am wondering if I should find someone to do this using JSP or GWT. Both are Java based but quite different.

If there are certain aspects of GWT that would make it less crawlable than JSP, then I wouldn't want to use it, but unfortunately I'm not in a position to tell. So any advice is appreciated.
Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 42030
GWT certainly means extra effort, and the pages still are not as easily indexable. What's more, just because Google honors the URL trick doesn't mean that Bing or other search engines do.

But GWT is fundamentally meant for web apps, not content pages. For those I'd advise against GWT (and possibly for a CMS - depends on how much "app" functionality is involved).
Mike Cheung
Ranch Hand

Joined: Feb 01, 2013
Posts: 85
Okay, thanks. Also, is it true that even if I am using GWT to create simple dynamic web pages (i.e. not an app), the load time is still slower than for JSP-based pages?
Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 42030
I don't know that you can state it with such generality. GWT sites do seem to be slower than other sites. Then again, there are also some very fast GWT sites; it just seems to take more wizardry to make them fast than it does for non-GWT apps. And wizards are generally in short supply :-)
salvin francis
Ranch Hand

Joined: Jan 12, 2009
Posts: 928

A few inputs...
Sure, the dynamic content isn't crawlable, but the static content is.
Use a <noscript> tag to add content, or even topics related to your page.
Use meta tags with keywords and a description.
Use alt text on all your static images.
Use a descriptive page <title>.

As far as I understand, you need your site to be searchable only for a few key points or phrases, which you can include via the points above. They are in fact encouraged by Google.
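
To make the tips above concrete, here is a rough sketch of a helper that renders a static host page with a descriptive title, a meta description and a <noscript> fallback around the GWT bootstrap script. The class name and the module path app/app.nocache.js are made up for illustration; a real project would use its own module name and would more likely do this in a JSP or template.

```java
// Hypothetical helper: builds the static part of a GWT host page.
final class HostPage {

    static String render(String title, String description, String fallbackHtml) {
        return "<!DOCTYPE html>\n"
                + "<html>\n<head>\n"
                + "  <title>" + escape(title) + "</title>\n"
                + "  <meta name=\"description\" content=\"" + escape(description) + "\">\n"
                // Assumed module name "app"; GWT generates <module>.nocache.js.
                + "  <script src=\"app/app.nocache.js\"></script>\n"
                + "</head>\n<body>\n"
                // fallbackHtml is trusted markup shown when scripts do not run.
                + "  <noscript>" + fallbackHtml + "</noscript>\n"
                + "</body>\n</html>\n";
    }

    // Minimal HTML escaping for text and attribute values.
    static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }
}
```

The static parts are visible in the page source whether or not the GWT script runs, which is what makes them available to a crawler.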


My Website: [Salvin.in] Cool your mind:[Salvin.in/painting] My Sally:[Salvin.in/sally]
 