
Google vs Jimmy Wales & Open Source Search

Jimmy Wales, the founder of not-for-profit Wikipedia and for-profit, San Mateo, Calif.-based Wikia, is one of a growing number of people who are uncomfortable with Google's growing control over search. And he is doing something about it. His company, Wikia, last week bought the distributed crawler Grub from LookSmart and plans to make it available as open source. Not that LookSmart was really using it anyway, not to mention the ad business they got from Wikia.

His bet: just as Linux became a migraine for the monopolist of the last generation, open source search tools will keep companies like Google honest. It is not an easy task, for it is nearly impossible to get away from a Google that is firmly embedded in our digital lives.

“Search is part of the fundamental infrastructure of the Internet. And, it is currently broken,” Wales said back in December 2006, when Wikia launched its Search Wikia effort. “Why is it broken? It is broken for the same reason that proprietary software is always broken: lack of freedom, lack of community, lack of accountability, lack of transparency.”

Wales launched Search Wikia earlier this year, and the Grub acquisition is part of that strategy. (You can run Grub on your Windows or Linux-based PC, either in the background or as a screensaver.) Following the announcement, we spoke with Wales, who explained that with Grub and other tools such as Lucene, open source indexing software, innovation around search can thrive.

By marrying these search results with the human context provided by, say, Wikia wikis, the final search results could actually become useful once again. Grub, Lucene and Nutch (a web crawler based on Lucene) are the powder and spark of the open search revolution.

Grub is by no means the final move, and should be viewed as a first concrete step in a long-term strategy. Jeremie Miller, inventor of Jabber and the XMPP protocol (and also CTO of Wikia), gave a talk at OSCON about the architecture of open source search. In his talk, Miller pointed out that monolithic search can be broken into three components, and that interested parties could implement one or more of them.

The three components are: factories that crawl and present content; collectors that rate and rank content from multiple sources; and brokers that direct user queries to the collectors or factories. Miller believes that this is a five-year process. Grub is one of the many first components that will be needed for building a truly open source search infrastructure. The biggest hindrance to any search start-up taking on Google (or Microsoft, Ask or Yahoo for that matter) is the high cost of infrastructure.
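
The three-part split Miller describes can be illustrated with a toy sketch. This is not Wikia's or Miller's actual design: every class, method and URL below is hypothetical, the "crawl" is an in-memory dictionary, and the ranking is a naive term count. For simplicity the broker here routes only to collectors, although the talk allows brokers to send queries to factories as well.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Document:
    url: str
    text: str


class Factory:
    """Produces content. A real factory would crawl; this one serves fixed pages."""

    def __init__(self, pages: dict[str, str]):
        self.pages = pages

    def fetch(self, query: str) -> list[Document]:
        q = query.lower()
        return [Document(url, text) for url, text in self.pages.items()
                if q in text.lower()]


class Collector:
    """Rates and ranks content gathered from several factories."""

    def __init__(self, factories: list[Factory]):
        self.factories = factories

    def rank(self, query: str) -> list[Document]:
        q = query.lower()
        seen: dict[str, Document] = {}
        for factory in self.factories:
            for doc in factory.fetch(query):
                seen.setdefault(doc.url, doc)  # drop duplicate URLs
        # Toy scoring: more occurrences of the query first, ties broken by URL.
        return sorted(seen.values(),
                      key=lambda d: (-d.text.lower().count(q), d.url))


class Broker:
    """Directs a user query to one of the registered collectors."""

    def __init__(self, collectors: dict[str, Collector], default: str):
        self.collectors = collectors
        self.default = default

    def search(self, query: str, topic: Optional[str] = None) -> list[str]:
        collector = self.collectors.get(topic, self.collectors[self.default])
        return [doc.url for doc in collector.rank(query)]


# Hypothetical data: two independent factories feeding one collector.
news = Factory({"a.example/1": "open search and more open search",
                "a.example/2": "closed systems"})
wiki = Factory({"b.example/1": "an open wiki"})
broker = Broker({"general": Collector([news, wiki])}, default="general")
print(broker.search("open"))  # ['a.example/1', 'b.example/1']
```

The point of the split is that each role sits behind a small interface, so different parties could run a factory, a collector or a broker without owning the whole stack.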

Sure, Amazon’s EC2 service has helped, but it isn’t enough. Google, thanks to its money machine, has been able to build an infrastructure that lets it crawl, index and show results at a faster pace. Even if a start-up comes up with a better algorithm, it still needs to sink millions into infrastructure just to get into the business and offer the fast experience most people associate with Google.

Grub, on the other hand, is a way to build a massive, distributed, user-contributed processing network, and can help offset those infrastructure costs. With the power of a wiki to form social consensus, the open source Search Wikia project has taken the next major step towards a future where search is open and transparent. Another nascent but promising open source P2P search engine is YaCy, coming out of Germany. (Also check out Faroo, a German P2P search start-up.)

Can it work?

Wales faces an uphill climb. First, he has to ensure that enough people are using Grub and, more importantly, hacking enhancements to the software. At the same time, he has to address other concerns, as pointed out by this commentator on Search Engine Land and other blogs.

While Google might be impossible to beat in a full frontal assault, it is vulnerable to smaller, more focused attacks. While Linux may not have been able to kill Microsoft, it has stolen opportunities from the OS giant. It has been particularly effective in Internet infrastructure (data centers).

Open source search can do precisely the same: take away opportunities from large search engines. Perhaps, as with Linux, we will see a shift away from Google, and venture capitalists, long scared by the prospect of competing with Google, will loosen their purse strings.

If Linux ended up spawning devices as diverse as TiVo and mobile phones, open source search can lead to many more specialized search engines, also called vertical search engines. Today, building a good vertical search engine costs millions of dollars. Even so, building and operating a vertical search engine is not for the faint of heart.

In an interview with Fast Company magazine earlier this year, Wales quipped:

“The other thing we’re looking to is some of the second-tier search companies,” he admits. “We’ve talked to–I can’t say who–different people, asking, would they be better off participating in a project that helps quality search results to become a commodity?”

Put another way, Wales is hoping to inflict death by a thousand cuts on the search incumbents.

More @ Resource Shelf.


Original post by Om Malik
