How do search engines work?

chaitanya karthikk
Ranch Hand

Joined: Sep 15, 2009
Posts: 806

Hi all, I am Chaitanya. I want to know how Google finds the links for a given search. Suppose I search for "cat": all the websites related to cats are displayed. How does Google know about the related websites? One more doubt on the same topic: if I run the same search on Yahoo, I get almost the same results. How exactly is this done?

Thank you all in advance. Have a good day.


Love all, trust a few, do wrong to none.
Rahul Sudip Bose
Ranch Hand

Joined: Jan 21, 2011
Posts: 637

use google


SCJP 6. Learning more now.
chaitanya karthikk
Ranch Hand

Joined: Sep 15, 2009
Posts: 806

Rahul Sudip Bose wrote:use google

The entire history is given there. I need a straightforward answer, dude. It's time-consuming to read all of that.
Ernest Friedman-Hill
author and iconoclast
Marshal

Joined: Jul 08, 2003
Posts: 24187

Google once posted this very accessible article about their core technology.


[Jess in Action][AskingGoodQuestions]
Bear Bibeault
Author and ninkuma
Marshal

Joined: Jan 10, 2002
Posts: 61761

That's so 2002! They've updated with Birds of Paradise since.


[Asking smart questions] [Bear's FrontMan] [About Bear] [Books by Bear]
Ernest Friedman-Hill
author and iconoclast
Marshal

Joined: Jul 08, 2003
Posts: 24187

Bear Bibeault wrote:That's so 2002! They've updated with Birds of Paradise since.



Sorry, Bear, I'm not really a hardware guy. The important thing is that the underlying code is very similar (ATCGATATGC...)
Bear Bibeault
Author and ninkuma
Marshal

Joined: Jan 10, 2002
Posts: 61761

Thank you for truncating the entire code. The Internet thanks you as well!
chaitanya karthikk
Ranch Hand

Joined: Sep 15, 2009
Posts: 806

I finally came to understand how this works after reading this article and discussing it with my friend.

I will explain what I understood. Please tell me if I missed anything or if I am wrong anywhere.

Suppose there is a website whose domain was bought from Yahoo. Yahoo asks the website owner whether to submit the site to the popular search engines. While submitting, you are asked to enter keywords, which will be used as search keys. Yahoo is not the only one that does this; everyone who sells domains does the same. In this case it's Yahoo.

When you hit submit, a request is sent to all the popular search engines you chose. Each search engine runs a few programs called spiders. These spiders read the requests, visit the sites, download all the static pages to their discs, and index each keyword. They also store the address of the web page each download came from. This process is called web crawling. Don't worry, crawling is not done all day long; the crawling process is scheduled to run at set times. Many search engines run their spider programs at night because traffic is low.
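The indexing step described above can be sketched in a few lines of Python. This is only a minimal illustration, not how any real search engine is built: the `DOWNLOADED_PAGES` dictionary, the `example.com` addresses, and the function names are all made up for the example, and the pages are hard-coded instead of being fetched by a spider.

```python
import re
from collections import defaultdict

# Hypothetical stand-in for pages a spider has already downloaded:
# address -> page text. A real spider would fetch these over HTTP.
DOWNLOADED_PAGES = {
    "http://example.com/cats": "Cats are small furry pets",
    "http://example.com/java": "Why main in Java is static",
    "http://example.com/pets": "Dogs and cats make good pets",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map every word to the set of addresses whose page contains it."""
    index = defaultdict(set)
    for address, text in pages.items():
        for word in tokenize(text):
            index[word].add(address)
    return index


index = build_index(DOWNLOADED_PAGES)
print(sorted(index["cats"]))
```

Storing the address next to each word is what lets a later search go from a keyword straight to the pages that mention it, without rereading every downloaded page.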

From the next search onwards, your site is also included in the search process.

Suppose you have now searched for "Why main in java is static?" The search engine's algorithms search their file systems for the downloaded pages whose key is "Why main in java is static?", extract the associated website addresses, build a web page consisting of all the links, and send that page to the user. The user then clicks on any link based on his interest and is redirected to the particular site and the respective page.
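A rough Python sketch of that lookup step follows. Everything in it is an assumption made for illustration: the `INDEX` contents, the `example.com` addresses, and the function names are invented, and it matches individual query words and orders pages by how many of them match, which is only one simple way to do it and not a description of what Google or Yahoo actually run.

```python
import re
from collections import Counter

# Hypothetical keyword index, as a spider run might have produced it:
# word -> addresses of downloaded pages containing that word.
INDEX = {
    "why": {"http://example.com/java-main"},
    "main": {"http://example.com/java-main", "http://example.com/c-main"},
    "java": {"http://example.com/java-main", "http://example.com/java-intro"},
    "static": {"http://example.com/java-main"},
}


def search(index, query):
    """Return addresses ordered by how many query words each page matches."""
    words = set(re.findall(r"[a-z0-9]+", query.lower()))
    hits = Counter()
    for word in words:
        for address in index.get(word, ()):
            hits[address] += 1
    # Most matching words first; ties are broken alphabetically by address.
    return sorted(hits, key=lambda address: (-hits[address], address))


def results_page(addresses):
    """Build the plain-text list of links that is sent back to the user."""
    return "\n".join(f"{i}. {a}" for i, a in enumerate(addresses, start=1))


results = search(INDEX, "Why main in java is static?")
print(results_page(results))
```

Words that are not in the index, such as "in" and "is" here, simply contribute no hits, so the query still returns pages that match the remaining words.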

Note: The web pages, or anything else the spider programs download, are not saved in a database. All information is saved in flat files, because searching a database takes more time than searching the file system.

Each search engine employs its own spider programs. Google has its own disc space to store all the static files, whereas Yahoo does not. Yahoo depends on another organization (I think netlap or netapp, or maybe another) to run the searching programs. Those organizations do the web crawling, and Yahoo just uses their discs, searches them, and builds a web page consisting of many links.

Please tell me if I am wrong or if I miss anything. Thank you all in advance.
 