Making GWT crawlable

 
Mike Cheung
Ranch Hand
Posts: 113
Hi. We know GWT apps are not crawlable without some extra work, because the page source often doesn't contain the content that the browser ends up displaying. I'm wondering whether we could have, say, Tomcat detect when Googlebot is visiting and return a snapshot of the page to it. My idea is to add a few lines to every controller class that check whether the visitor is a bot and, if so, use HtmlUnit to retrieve the rendered page and return it as the view. Are there any issues with doing this?
 
Ulf Dittmer
Rancher
Posts: 42967
Yes. Treating the Google bot differently than human visitors is a big no-no that will be penalized by Google when detected.
 
Mike Cheung
Ranch Hand
Posts: 113
Thanks. But Google's own site recommends changing the URLs of AJAX-powered sites from '#' to '#!', so that Googlebot will request the page with the '_escaped_fragment_' query parameter, as documented here:
https://developers.google.com/webmasters/ajax-crawling/docs/getting-started?csw=1

If I have misinterpreted the above, please let me know.

There is also GWT Platform (GWTP), and I think one of its features is making sites built with that framework crawlable, though I haven't figured out how yet. Since we can tell which URL is being requested, and HtmlUnit can load web pages in full, I'm trying to find out whether it can be done.

 
Ulf Dittmer
Rancher
Posts: 42967
Yes, that's the way to make GWT sites crawlable. Besides the pages you found, more information about that is at http://code.google.com/p/gwt-platform/wiki/CrawlerSupport.
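
As a rough illustration of the '#!' scheme discussed above: the crawler turns a URL such as http://example.com/app#!products/42 into http://example.com/app?_escaped_fragment_=products/42, and the server has to map that request back to the original '#!' URL before it can produce a snapshot. The sketch below shows only that mapping, using the standard library. The class name EscapedFragmentUrls, its method names, and the example.com URLs are made up for illustration and are not part of GWT, GWTP, or HtmlUnit. Rendering the snapshot itself (for example with HtmlUnit, as suggested earlier in the thread) is left out.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical helper, not part of GWT, GWTP, or HtmlUnit.
final class EscapedFragmentUrls {
    private static final String PARAM = "_escaped_fragment_";

    private EscapedFragmentUrls() {}

    /** True when the raw query string contains the crawler's parameter. */
    static boolean isCrawlerRequest(String queryString) {
        if (queryString == null) {
            return false;
        }
        for (String pair : queryString.split("&")) {
            if (pair.equals(PARAM) || pair.startsWith(PARAM + "=")) {
                return true;
            }
        }
        return false;
    }

    /**
     * Rebuilds the '#!' URL that a browser would have used.
     * baseUrl is the URL without query string; queryString is the raw,
     * still percent-encoded query (may be null).
     */
    static String toHashBangUrl(String baseUrl, String queryString) {
        StringBuilder otherParams = new StringBuilder();
        String fragment = null;
        if (queryString != null) {
            for (String pair : queryString.split("&")) {
                if (pair.equals(PARAM)) {
                    fragment = "";
                } else if (pair.startsWith(PARAM + "=")) {
                    // The crawler percent-encodes special characters in the fragment.
                    fragment = decode(pair.substring(PARAM.length() + 1));
                } else if (!pair.isEmpty()) {
                    // Ordinary query parameters are passed through unchanged.
                    if (otherParams.length() > 0) {
                        otherParams.append('&');
                    }
                    otherParams.append(pair);
                }
            }
        }
        StringBuilder url = new StringBuilder(baseUrl);
        if (otherParams.length() > 0) {
            url.append('?').append(otherParams);
        }
        if (fragment != null) {
            url.append("#!").append(fragment);
        }
        return url.toString();
    }

    private static String decode(String value) {
        try {
            return URLDecoder.decode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        String query = "lang=en&_escaped_fragment_=products/42";
        if (isCrawlerRequest(query)) {
            System.out.println(toHashBangUrl("http://example.com/app", query));
        }
    }
}
```

In a servlet filter, the two arguments would typically come from request.getRequestURL() and request.getQueryString(). Because the decision is keyed on the URL parameter rather than on the visitor's user agent, any client requesting the '_escaped_fragment_' URL gets the same snapshot.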
 
Mike Cheung
Ranch Hand
Posts: 113
Hi, okay. I've looked at that page before and just tried looking at it again, but I'm not getting it. It talks about MVP, App Engine, Guice, etc. Unfortunately I don't know enough about GWT and the rest of what is mentioned to make sense of it.

But basically, is it saying that by using GWTP, any GWT-based pages will become searchable?

I'm looking to create a public web site and am wondering whether I should find someone to build it using JSP or GWT. Both are Java-based but quite different.

If there are certain aspects of GWT that would make it less crawlable than JSP, then I wouldn't want to use it, but unfortunately I'm not in a position to judge. So any advice is appreciated.
 
Ulf Dittmer
Rancher
Posts: 42967
GWT certainly means extra effort, and the pages still are not as easily indexable. What's more, the fact that Google honors the URL trick doesn't mean that Bing or other search engines do.

But GWT is fundamentally meant for web apps, not content pages. For those I'd advise against GWT (and possibly for a CMS - depends on how much "app" functionality is involved).
 
Mike Cheung
Ranch Hand
Posts: 113
Okay, thanks. Also, is it true that even if I use GWT to create simple dynamic web pages (i.e. not an app), the load time is still slower than for JSP-based pages?
 
Ulf Dittmer
Rancher
Posts: 42967
I don't know that you can state it with such generality, but GWT sites do seem to be slower than other sites. There are also some very fast GWT sites, though it seems to take more wizardry to make them fast than it does for non-GWT apps. And wizards are generally in short supply :-)
 
salvin francis
Bartender
Posts: 1263
A few suggestions:
Sure, the dynamic content isn't crawlable, but the static content is.
Use a <noscript> tag to add content, or even topics related to your page.
Use meta tags with keywords and a description.
Use alt text on all your static images.
Use a descriptive page <title>.

As far as I understand, you need your site to be searchable only for a few key points or phrases, which you can include via the points above. They are in fact encouraged by Google.
 