"spider" is effectively a synonym for "webcrawler". Both describe a piece of software which finds links in a web page and follows them to other web pages. In the context of a search engine, imagine that you wish to set up a search engine for your own web site. For any kind of decent performance you really need an index, but there are several ways of creating that index. You could manually run some program on each web page, but you might forget a page, or forget to regenerate it when a page changes. You might also accidentally index a page that should not be publicly visible. A better way might be to set some software pointing at your "home" page which indexes all the words on that page, then looks for all the links to other pages on your site and repeats the process. When it has finished it will have indexed all the publicly visible pages on your site. You can then set this "indexing spider" to run from time to time (say first thing every monrning) so that whatever changes you make to the structure or content of your web site, the search index will always be up to date.
Hi Frank! Thanks for your reply. Now I have a rough idea about spiders. What I understood is that it's a program which follows all the links, stores them, and finally those results are used by the search engine. I'm sorry I wasn't able to get a clearer picture of this. Anyway, thanks once again. Have a nice day! Preethi.