Sunday, March 29, 2009

Google Making Progress With Deep Web Content

In an article in Computerworld, Alfred Spector, Google’s VP of research, states that Google has developed technologies that enable the Google crawler to reach content hidden behind web forms.

This was in response to the following question:

Do you have plans to go after that huge body of information on the Internet that is not currently searched?


His answer:

“There is stuff on the Web, the so-called Deep Web, that is only ‘materialized’ when a particular query is given by filling fields in a form. Since crawlers only follow HTML links, they cannot get to that ‘hidden’ content. We have developed technologies to enable the Google crawler to get content behind forms and therefore expose it to our users. In general, this kind of Deep Web tends to be tabular in nature. It covers a very broad set of topics. It’s a challenge, but we’ve made progress.”
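
To make the idea of content that only appears when a form is filled in more concrete, here is a minimal sketch in Python. It is not Google's method, which the interview does not describe. It only illustrates one simple way a crawler could turn a form into ordinary links: take a handful of candidate values for each field and build the GET URLs that submitting the form would produce. The function name, the example.com address, and the field names and values are all hypothetical.

```python
from itertools import product
from urllib.parse import urlencode


def surface_form_urls(action_url, field_values):
    """Build one GET URL per combination of candidate form-field values.

    action_url   -- the URL the form submits to (hypothetical here)
    field_values -- dict mapping each field name to a list of candidate values
    """
    names = sorted(field_values)  # fixed field order so output is repeatable
    urls = []
    for combo in product(*(field_values[name] for name in names)):
        query = urlencode(list(zip(names, combo)))
        urls.append(f"{action_url}?{query}")
    return urls


# Hypothetical search form with two fields and two candidate values each.
urls = surface_form_urls(
    "http://example.com/search",
    {"make": ["honda", "ford"], "year": ["2008", "2009"]},
)
for url in urls:
    print(url)
```

Each generated URL is a plain link that a crawler could fetch and index like any other page. The number of combinations grows quickly as fields and values are added, which is presumably part of why Spector calls this a challenge.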

Several companies are currently developing technologies to access what is known as the Deep Web. We profiled Deep Peep recently. Google’s purchase of Transformic is helping it compete in what will be an important aspect of online search.
