Building a search engine
Interesting article at Kuro5hin:
bq. Without a doubt this has been one of the most absurd and strangest projects I have started so far. Not long ago the idea that I could build a search engine capable of indexing the Internet as a whole seemed so far away. Now it is becoming a reality. Without further ado I wish to announce the early release of mozdex.com an Open Search Engine.
bq. Mozdex.com was dreamed up from the belief that searching should be more of a science and a factual process rather than a proprietary and secretive process. Through the beauty of open source and the hard work of the Nutch team we have been able to use Nutch to build a beta test index of nearly 50 million pages.
bq. What we want to do is provide a search system where you can see how the algorithm ranks pages. The ability to see incoming anchors and references to the pages gives more insight into the results. We feel that by working with an open API and Algorithm that the mass of great minds on the Internet can work together to come up with an algorithm that doesn’t lend itself so much to being cheated by “spammy” sites. The premise being that a well thought out algorithm can understand the basic tricks of the trade and more quickly react to new hacks & cheats used to "spam" indexes.
This is a severe case of reinventing the wheel, but it is an interesting one. Google publishes details of its PageRank system but not the source code. I don't know that this project has the financial resources Google has to maintain the server farm and the bandwidth, but it will be interesting to see how it fares.
Another search engine I like is here: Kartoo
Posted by DaveH at May 3, 2004 10:22 AM