Intelliseek CEO Mahendra Vora says it's a shame Firestone wasn't able to use his company's online rumor-detection software when its tires started blowing up. Firestone's media nightmare peaked in August 2000, the month it recalled 6.5 million failure-prone tires. But a search after the fact, using Intelliseek's software, "found evidence of the Firestone problem as early as August 1998," boasts Vora. Firestone might have begun its damage control efforts a lot earlier had it known that people were already beginning to talk about its tires on the Net.
Intelliseek's system, called Corporate Intelligence Services, combs through areas of the Web that don't show up in search engines, such as archives of Usenet forums, message boards and chat-room discussions, to ferret out discussions relating to a brand's products or services.
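The article does not say how Corporate Intelligence Services works internally. As a rough illustration only, the idea of scanning archived posts for a brand name alongside problem words, then finding the earliest hit, might look like the Python sketch below. The `Post` type, the `find_mentions` function, the brand "Acme" and the sample posts are all invented for the example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Post:
    source: str   # e.g. "usenet", "message-board", "chat"
    posted: date
    text: str


def find_mentions(posts, brand, problem_terms):
    """Return posts naming the brand together with any problem term, oldest first."""
    brand_lc = brand.lower()
    terms = [t.lower() for t in problem_terms]
    hits = []
    for post in posts:
        body = post.text.lower()
        if brand_lc in body and any(t in body for t in terms):
            hits.append(post)
    return sorted(hits, key=lambda p: p.posted)


# Invented sample data; not real posts about any real company.
SAMPLE_POSTS = [
    Post("usenet", date(1999, 3, 2), "My Acme tread separated on the highway."),
    Post("message-board", date(1998, 8, 14), "Anyone else had an Acme tire blowout?"),
    Post("chat", date(1999, 1, 5), "Great deal on Acme tires this week."),
]

mentions = find_mentions(SAMPLE_POSTS, "Acme", ["blowout", "separated", "recall"])
earliest = mentions[0]
print(f"Earliest complaint: {earliest.posted} via {earliest.source}")
```

Plain substring matching like this would miss misspellings and pick up false positives, so a real monitoring service would presumably need something more robust.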
Intelliseek, along with its chief competitor, BrightPlanet, is coming up with ways to work around the limitations of standard search engines and let users plumb what has been dubbed the "deep Web" or "invisible Web," the formerly unsearchable depths of the Internet. Even consumer-oriented engines such as Google are beginning to offer findings from parts of the Web that were previously hard to mine.
"There's a lot of material you'll never find using a general search engine, no matter how hard you search," says Gary Price, an independent research consultant and co-author of the forthcoming The Invisible Web.
While some 1.5 billion Web pages are currently available to the average searcher, BrightPlanet estimates that another 550 billion documents never show up in a search engine's index. This includes material such as Salary.com's comparative compensation statistics, the U.S. Patent and Trademark Office's full-text and full-page image databases, Securities and Exchange Commission records, academic papers, census data, Library of Congress records, medical research and untold numbers of art images and music files.
The problem is that the software "spiders" used by search companies such as Lycos and AltaVista to crawl around the Web and generate indexes are too stupid to access most databases or information stored in formats other than HTML (the formatting language used to construct ordinary Web pages). Intelliseek and BrightPlanet have designed software agents that can automatically extract requested information from multiple invisible databases at the same time and present the results in customizable reports.
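Neither company's agent design is described here, but the general pattern of sending one query to several databases at once and merging the answers into a report can be sketched in a few lines of Python. Everything named below is an assumption made for illustration: the in-memory `SOURCES` dictionary stands in for real databases, and `query_source`, `federated_search` and `format_report` are hypothetical helpers, not either vendor's API.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for databases that a crawler cannot index.
SOURCES = {
    "patents": ["tire tread bonding method", "valve stem design"],
    "filings": ["annual report: tire recall costs", "quarterly earnings"],
    "research": ["study of tread separation in hot climates"],
}


def query_source(name, term):
    """Run the query against one source and return (source name, matching records)."""
    records = SOURCES[name]
    return name, [r for r in records if term.lower() in r.lower()]


def federated_search(term, names):
    """Query every named source concurrently and collect the results by source."""
    with ThreadPoolExecutor(max_workers=len(names)) as pool:
        results = pool.map(lambda n: query_source(n, term), names)
        return dict(results)


def format_report(term, results):
    """Render the merged results as a simple plain-text report."""
    lines = [f"Report for '{term}'"]
    for name in sorted(results):
        lines.append(f"{name}: {len(results[name])} match(es)")
        lines.extend(f"  - {r}" for r in results[name])
    return "\n".join(lines)


results = federated_search("tread", list(SOURCES))
report = format_report("tread", results)
print(report)
```

A real agent would replace `query_source` with code that fills in each site's own search form or calls its query interface, which is the part a generic spider cannot do.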
Chris Sherman, associate editor of the online site Search Engine Watch, and Price's co-author, says the deep Web is "of huge value as a business. Look at the market that traditional proprietary database or information service providers like Dow Jones or Lexis-Nexis have." (That would be $2.2 billion and $1.8 billion in sales, respectively, for 2000.) "There are a huge number of really authoritative resources out there that are either free or very low-cost. I think that they pose a real threat to some of these more established concerns."
But if the information is cheap, or free, how's money to be made from it? One way is by directing people to it and making it meaningful. That's BrightPlanet's mission. In May, the 2-year-old Sioux Falls, S.D., company launched a subscription-only Web site that lets corporate clients set up automated search queries and generate reports covering both the deep Web and the surface Web.





