Thursday, June 21, 2007

Local Search in India: Search Relevance and Experience.

Search in India has a unique problem. Most locality names in India are words from local languages, and when written in English the same name can be spelled in multiple ways. Resolving all these spellings to the one intended location is a non-trivial task. Most search engines at the moment do no such resolution, and a few, like Onyomo and Burrp, have taken the easy way out by showing a drop-down of localities the moment the user starts typing a locality name. Guruji resolves locations to some extent but still has a long way to go. Justdial doesn’t resolve locations at all and looks only for direct matches.
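To make the spelling problem concrete, here is a minimal sketch of one way such resolution could work: normalize the typed name and fuzzy-match it against a list of canonical locality names. The locality list, the function names and the 0.8 similarity cutoff are all assumptions chosen for illustration; this is not how any of the sites mentioned actually does it.

```python
import difflib

# Hypothetical canonical locality list; a real engine would load its own gazetteer.
CANONICAL_LOCALITIES = ["Koramangala", "Indiranagar", "Malleswaram", "Jayanagar"]


def normalize(name: str) -> str:
    """Lowercase and keep only letters, so spacing and punctuation differences vanish."""
    return "".join(ch for ch in name.lower() if ch.isalpha())


# Map each normalized form back to its canonical spelling.
_INDEX = {normalize(loc): loc for loc in CANONICAL_LOCALITIES}


def resolve_locality(query: str, cutoff: float = 0.8):
    """Return the canonical locality closest to the query, or None if nothing is close enough."""
    matches = difflib.get_close_matches(normalize(query), list(_INDEX), n=1, cutoff=cutoff)
    return _INDEX[matches[0]] if matches else None


print(resolve_locality("Koramangla"))
print(resolve_locality("Indira Nagar"))
print(resolve_locality("Whitefield"))
```

Plain string similarity like this only goes so far. Phonetic variants that differ by many letters would probably need transliteration-aware rules or a hand-built alias table on top.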

Beyond this, Local Search suffers from all the other problems common to a categorized directory search, ranging from inadequate keyword and category aliases to incorrect categorization to search tuning by category parameters. Technically, a site like Ilaakaa wouldn’t even qualify as a Local Search site because its search merely uses the Indiacom Yellow Pages search.

Since the key to success lies largely in the comprehensiveness and searchability of the data, many of the Local Search players don’t seem to have invested much time or effort in user experience and navigation. Justdial, despite having the best data, has one of the poorest user experiences. Onyomo and Burrp have the best user experience and navigation among all the sites, with minimal clutter and an ordered presentation.


Sunday, June 17, 2007

Local Search in India - Data Woes

The Local Search market in India is heating up. The space is largely taken up by startups at the moment. Considerable hype has been created of late in this space with Guruji getting backed by Sequoia Capital and Onyomo unveiling its SMS search platform.

The primary factors that determine the effectiveness of a Local Search engine are data quality, search relevance and ease of navigation. One of the biggest challenges that players face in this space is the lack of rich local data. Unlike in the US, the local data market in India is highly fragmented, and most players are Yellow Page companies whose data is largely outdated. Arguably, the best database of Local listings currently rests with JustDial, a company which serves Local Information primarily on the phone but has recently entered the online fray as well. Ever since its launch, www.justdial.com has had the maximum traffic in this space, rapidly gaining over Guruji.

Most of the other players rely on Yellow Page companies for Local Data. Guruji, a recent entrant, sources its data from Infomedia and hence suffers from the problem of outdated listings. Ilaaka, another player in the local space, redirects its search to the Indiacom Yellow Pages site, while MapMyIndia sources its local listings from GetIt.

Onyomo has adopted a different approach towards data. They have feet-on-street teams which have been collecting local data by street surveys. This is similar to A9’s effort at collecting pictures of Local Listings by feet-on-street except that the economics for such an exercise work far better in India where labor comes much cheaper.

AOL has also launched a local site but its data is very sparse and largely sourced from websites and web directories.

Clearly, the data problem has been solved only by JustDial and remains a huge barrier to entry for any other players who do not wish to take the Yellow Pages route.

One of the ways of solving the data problem over time is to have the user community contribute towards editing and adding new listings. JustDial and Onyomo have already implemented features to facilitate this process.

Over time, data will definitely be one of the key factors to determine success in Local Search.

Wednesday, March 07, 2007

What the hell is Web 2.0 all about anyway?

Web 2.0 has heralded a shift. Existing wisdom on the net on which previous generation products were based has been rudely challenged and effectively proven to be outdated.

Catering to the Long Tail

Niche websites are on the rise and account for the majority of websites out there now. Initiatives like blogging have just spurred on the trend. The long tail has vital implications for advertisers since niche sites are visited by a very specific user group and provide great targeting opportunities and better ROI on the eyeballs.

Data as a differentiator

The Internet is built around specialized databases today. Whether it is an index of crawled websites or digitized offline data, the web is driven by data, and the comprehensiveness and searchability of that data differentiate products and services on the net. The need for control and ownership of data is so critical that online media players have even started backward integrating into content creation.

The importance of the User Base

In today’s internet economy, the user base is the key to success. In the past, a user base was all about eyeballs and advertising money. However, users are fast changing from being mere content consumers to being content creators as well. With the growth of annotations and what is popularly known as ‘folksonomy’, the user base has fast come up as a vital source for data enrichment. The era of categorization taxonomy is in the past now and online companies are fast moving towards an "architecture of participation" to allow users to actively help grow the data and hence the business.

Goodbye IPR

The days of IPR are fast coming to an end. IPR restricts users and makes the content economy very centralized. Web 2.0 is all about a distributed economy, as epitomized by wiki-based products which thrive on user-generated content and, unencumbered by IPR, promote regeneration and reuse. The growing discontent with DRM in online music may herald a similar era in that field.

The Phased Launch

Online products are no longer released one fine day as fully tested, fool-proof products. Phased launches are common, and the beta phase typically extends across almost the entire life-cycle, with users actively chipping in as testers too. Features are added incrementally and the product stays in a perpetual state of Beta.

Mash-up era

Mash-ups are the order of the day. Take existing products and come up with an entirely new product, a new user experience, and hence a new way to get money, all this, without having to invest too much on the product creation itself.

Delivery on multiple platforms

Delivery platforms for the internet have extended beyond the PC to include a variety of handhelds. Products that launch well across all platforms go down well with the consumer. Consumers love freedom of use and portability, and multi-platform compatibility is the way of the future.

Tuesday, January 09, 2007

The Challenge of Content Management

A large number of online media companies source their content from the traditional content-producing media companies like news agencies, recording labels etc. Consequently, handling content and presenting it in a particular common format is a major challenge for these companies especially when content is sourced from several providers with differing presentation patterns. Content management at this level involves two broad tasks:

  • Mapping all content to a common format for presentation and for ease of mining.
  • Categorization of content by user intent.

Storing content in a common schema is essential if content mining is to be automated to some extent. However, the bigger challenge surfaces after this when the content is to be categorized. Categorization is normally done to aid the user in browsing the content and also in presenting relevant search results.
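As a rough illustration of the first task, the sketch below maps records from two providers, each with its own field names, onto one common schema. The provider names, field names and the `ContentItem` type are all hypothetical and made up for this example.

```python
from dataclasses import dataclass


@dataclass
class ContentItem:
    """Common schema that every provider's record is mapped to."""
    title: str
    body: str
    source: str


# Hypothetical per-provider mappings: common field name -> provider's field name.
FIELD_MAPS = {
    "agency_a": {"title": "headline", "body": "text"},
    "agency_b": {"title": "Title", "body": "Story"},
}


def to_common(provider: str, raw: dict) -> ContentItem:
    """Translate one provider-specific record into the common schema."""
    mapping = FIELD_MAPS[provider]
    return ContentItem(
        title=raw[mapping["title"]].strip(),
        body=raw[mapping["body"]].strip(),
        source=provider,
    )


item = to_common("agency_b", {"Title": " Monsoon arrives ", "Story": "Rain hit the coast."})
print(item)
```

Keeping the mappings as data rather than code means adding a new provider is mostly a matter of adding one more entry to the table.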

Categorization can be done based on several parameters. In the case of music, the problem is simplified to a great extent since most music is categorized by genre and further by artist, where an artist can fall under several genres. In this case, the categorization can be fully automated once the taxonomy of genres and artists is in place. However, in the case of language content which needs to be categorized by subject matter, a good deal of social intelligence is required to understand the topic to which a certain item belongs. In such cases, machine learning and classification will always yield errors. Errors may dip with more training but will never disappear entirely. Hence, some amount of manual effort is essential in such categorization. The trick then is to figure out which parts should be automated and which parts manual, since there is a trade-off between cost and accuracy here.
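For the music case, the automation can be as simple as a lookup once the taxonomy exists. The sketch below assumes a hypothetical artist-to-genres table and a track record with an `artist` field; both are invented for illustration.

```python
# Hypothetical taxonomy: each artist maps to one or more genres.
ARTIST_GENRES = {
    "Artist A": {"rock", "blues"},
    "Artist B": {"classical"},
}


def categorize_track(track: dict) -> set:
    """Return the genres for a track via its artist; an empty set means the artist is not in the taxonomy."""
    return set(ARTIST_GENRES.get(track["artist"], set()))


print(categorize_track({"title": "Song 1", "artist": "Artist A"}))
print(categorize_track({"title": "Song 2", "artist": "Unknown"}))
```

Tracks that come back with an empty set are the ones that would still need a human to extend the taxonomy.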

Categorization essentially can be broken down into two further tasks:

  • Finding categories ‘related’ to an item.
  • Deciding which of those categories is/are actually ‘relevant’.

In the case of every item being mapped to only one category (one-to-one mapping), the second problem can essentially be combined with the first. However, in a one-to-many mapping, one needs to use social intelligence (possible with a manual workforce) to determine which categories are relevant.

As a general rule (with exceptions), finding related categories should be automated, especially in a one-to-many mapping, since an algorithm will do a more exhaustive job than a human. However, in deciding which of these categories are relevant, only a manual workforce can bring in the level of social intelligence required to determine the subject matter of a certain item and categorize it accordingly.
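The split described above can be sketched as a two-stage pipeline: an automated pass proposes every related category, and a reviewer decides which of them are relevant. The keyword lists and function names are assumptions, and the `reviewer` callback merely stands in for the manual step.

```python
# Hypothetical keyword lists per category.
CATEGORY_KEYWORDS = {
    "sports": {"match", "team", "score"},
    "business": {"market", "shares", "profit"},
    "politics": {"election", "minister", "vote"},
}


def find_related(text: str, min_hits: int = 1) -> list:
    """Automated stage: list every category sharing at least min_hits keywords with the text."""
    words = set(text.lower().split())
    hits = {cat: len(words & kws) for cat, kws in CATEGORY_KEYWORDS.items()}
    related = [cat for cat, n in hits.items() if n >= min_hits]
    # Strongest overlap first, ties broken alphabetically.
    return sorted(related, key=lambda cat: (-hits[cat], cat))


def categorize(text: str, reviewer) -> list:
    """Manual stage: keep only the candidates the reviewer judges relevant."""
    return [cat for cat in find_related(text) if reviewer(text, cat)]


story = "The team owner saw shares jump after the match"
print(find_related(story))
print(categorize(story, lambda text, cat: cat == "business"))
```

In practice the reviewer would be a person working through a queue rather than a function, but the division of labor is the same: the machine is exhaustive, the human is selective.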

Wednesday, January 03, 2007

Dogster: Thinking outside the Kennel!

I stumbled upon Dogster on an idle afternoon while trying Google’s search results for some weird queries. Just when you thought you’d seen it all, you land up at a social networking site for dogs! Or, to be technically correct, for Dog Lovers! The very idea seems incredible and you might wonder who’d be goofy enough to pose as his own dog and write blogs from his dog’s perspective.

That is, until you realize that it is actually a neat business idea to target and serve a distinct segment of the consumer population. By catering to a niche segment, Dogster gives advertisers in the $36 bn pet industry highly relevant eyeballs. And the results speak for themselves.

Dogster and Catster are community sites for owners to share pictures and diaries of their pets. The founders of Dogster Inc apparently started the sites as a parody of Friendster and the like. Dogster's advertising rate, at around $5 CPM, is almost 40 times that of MySpace, a much more general social network. Dogster makes 95% of its revenue from advertising and sponsorships and the rest from premium subscriptions. Advertisers include names as big as Disney, Target, PetSmart, Gap and Warner Brothers.

What makes a seemingly crazy idea like Dogster click? Dogster creates a new user experience since it orients itself around dogs rather than actual people. Users can stay anonymous and yet identify with something as dear as their pet. Since users share a common interest, canine-related nomenclature and puns are plentiful. People love to talk about stuff they are really passionate about.

The site is doing great with more than 300,000 registered users. CEO Ted Rheingold followed up Dogster’s success by launching Catster and now wants to extend this to cover every pet and ultimately every hobby.

Monday, January 01, 2007

Netflix: Success of Simplicity

Netflix is a potent example of simplicity working wonders. It operates on a very simple business model: subscription-based home delivery of DVD movies. Why does it work? Primarily because it answers the single biggest hassle of DVD rentals: sky-rocketing fines for returning a DVD late. By offering various plans that depend on the subscription period and the number of rentals during that period, it allows users to keep DVDs as long as they want without worrying about late fines. The favorable consumer experience is extended by online ordering and pre-paid return envelopes for the DVDs. Additionally, Netflix mines user behavior to recommend movies to users based on their previous choices.

The flexibility, ease of use and comprehensiveness of choices with Netflix have given users a sense of faux ownership. Any system which relaxes constraints on the user experience is going to achieve greater acceptance. Beyond this, a few external factors have also helped the company make hay. Sales of DVD players took off around the time Netflix switched to a subscription service in late 2000, and Hollywood studios signed deals with the company around that time to counter the growing power of Blockbuster.

Netflix’s business model has redefined the DVD rental industry in such a big way that retail giant Wal-Mart has backward integrated to incorporate such a system, and long-time competitor Blockbuster has also worked towards a similar model. Back home in India, www.seventymm.com has worked on a similar model with some success.