When a user enters a search term, say, "osteoporosis," a "direct query engine" automatically configures the request to conform to the syntaxes of the various deep site forms, sending the same query out to multiple databases at once. Of the 40,000 deep Web databases that BrightPlanet can access, only those determined to be relevant are queried. "That's part of what our automated technology does," says BrightPlanet Chairman Michael Bergman. "We actually evaluate the search request, match that against our profiles of the search sites and make a selection in the background as to where the query gets directed."
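The routing Bergman describes, matching a request against site profiles and then rephrasing it for each site's form, might look something like the sketch below. It is an illustration only, not BrightPlanet's actual method: the site names, URLs, profile fields and the keyword-overlap test for relevance are all assumptions made up for the example, and nothing is sent over the network.

```python
from dataclasses import dataclass
from urllib.parse import urlencode


@dataclass(frozen=True)
class SiteProfile:
    """Hypothetical profile of one deep Web database and its search form."""
    name: str
    form_url: str            # where the site's search form submits (made up)
    query_param: str         # the form field that carries the search term
    topics: frozenset        # subjects the database is assumed to cover
    extra_params: tuple = () # fixed fields the form also expects


# Illustrative profiles; none of these sites or fields are real.
PROFILES = [
    SiteProfile("med-db", "https://med.example/search", "term",
                frozenset({"osteoporosis", "health", "medicine"}),
                (("db", "all"),)),
    SiteProfile("patents-db", "https://patents.example/find", "q",
                frozenset({"patent", "invention"})),
    SiteProfile("nutrition-db", "https://nutrition.example/lookup", "keywords",
                frozenset({"calcium", "osteoporosis", "diet"}),
                (("format", "html"),)),
]


def select_sites(query, profiles):
    """Keep only profiles whose topics overlap the query words (assumed relevance test)."""
    words = set(query.lower().split())
    return [p for p in profiles if words & p.topics]


def build_request(query, profile):
    """Rewrite the query in the syntax the site's form expects."""
    params = [(profile.query_param, query), *profile.extra_params]
    return f"{profile.form_url}?{urlencode(params)}"


def fan_out(query, profiles=PROFILES):
    """Map each relevant site to the request that would be sent to it."""
    return {p.name: build_request(query, p) for p in select_sites(query, profiles)}
```

Calling `fan_out("osteoporosis")` would build one request each for the two health-related profiles and skip the patent database, each request using that site's own field name for the search term.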
Intelliseek also is dredging up treasure from the bottom of the deep Web. Founded in 1997 with backing from Ford Ventures and Nokia (NOK) Ventures, the Cincinnati-based company uses its technology to collect and analyze what people on the Web are saying about its clients' products and services. Companies such as Ford, Goldman Sachs, Nokia and Procter & Gamble (PG) schedule recurring searches to monitor Usenet forums, message boards, archived chat-room discussions and news articles, grabbing anything related to a particular trademark or company.
Intelliseek's Vora says the technology's real strength is not how it finds information from disparate sources but the way it can take that information and generate reports and charts to monitor competitors and identify trends in customer attitudes.
"Assume that I'm a Lincoln brand manager," he says. "I can do a search on 10 different engines all day long and end up with 300 documents. I don't have time to read 300 documents. All I care about is that 37 percent of the people prefer my interior over the Chrysler 300M."
Because general consumers are unwilling to pay for search services, businesses are likely to be the main market for these new search capabilities for now. But BrightPlanet and Intelliseek might need a new business model before long: At least one consumer search engine is diving into the deep Web.
In late February, Mountain View, Calif.-based Google added 13 million Portable Document Format (PDF) files to its search engine index. Adobe Systems' (ADBE) electronic file format is popular for online publications, including white papers, academic articles and business reports. Google now has 70 percent of the publicly available PDF documents on the Web, with more to come, says David Krane, a Google spokesman.
In recent weeks, Google also obtained access to Usenet archives - containing 500 million discussion messages on every topic imaginable - dating back to 1995.
Krane says Google has made the deep Web a "top priority" and may even endow its crawler with a bigger brain. If Google succeeds in advancing its spider on the evolutionary tree, then the other search engine crawlers will have to evolve or perish, their remains lurking somewhere in the depths of the deep Web.
Mark Frauenfelder is a frequent contributor to The Standard.





