Google reads forms and fills them out - sometimes

Posted by Marshall on April 11, 2008

I thought this news was kind of interesting: Google's crawlers have started filling out forms, which used to not be possible. In other words, if you had a site with protected content that you had to log in to access, Google would stop there and not crawl the site past that page.

But now Google is actually trying to fill out forms by putting something into their fields and, if that succeeds, crawling the pages that the submission generates.

According to Google, as quoted in Danny Sullivan's post "Google Now Fills Out Forms & Crawls Results":

"… In the past few months we have been exploring some HTML forms to try to discover new web pages and URLs that we otherwise couldn't find and index for users who search on Google. Specifically, when we encounter a <FORM> element on a high-quality site, we might choose to do a small number of queries using the form. For text boxes, our computers automatically choose words from the site that has the form; for select menus, check boxes, and radio buttons on the form, we choose from among the values of the HTML."
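To get a feel for what that could look like, here is a rough Python sketch of the idea in the quote. It is only my guess at the general approach, not Google's actual code: the names FormProbe and candidate_queries, the sample page, and the "most frequent words of four letters or more" rule for text boxes are all assumptions made up for illustration.

```python
from collections import Counter
from html.parser import HTMLParser
import re


class FormProbe(HTMLParser):
    """Hypothetical helper: collects form fields and page words from HTML."""

    def __init__(self):
        super().__init__()
        self.text_fields = []   # names of free-text inputs
        self.choices = {}       # field name -> values offered in the HTML
        self.words = Counter()  # words seen anywhere in the page text
        self._select = None     # name of the <select> being parsed, if any

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        name = a.get("name")
        if tag == "input" and name:
            kind = (a.get("type") or "text").lower()
            if kind == "text":
                self.text_fields.append(name)
            elif kind in ("checkbox", "radio"):
                # Browsers submit "on" when no value attribute is given.
                self.choices.setdefault(name, []).append(a.get("value") or "on")
        elif tag == "select" and name:
            self._select = name
            self.choices.setdefault(name, [])
        elif tag == "option" and self._select and a.get("value") is not None:
            self.choices[self._select].append(a["value"])

    def handle_endtag(self, tag):
        if tag == "select":
            self._select = None

    def handle_data(self, data):
        # Assumption: "words from the site" means frequent words on the page.
        self.words.update(re.findall(r"[a-z]{4,}", data.lower()))


def candidate_queries(html, max_words=2):
    """Return {field name: values a crawler might try} for the given page."""
    probe = FormProbe()
    probe.feed(html)
    probe.close()
    top_words = [word for word, _ in probe.words.most_common(max_words)]
    queries = {name: list(top_words) for name in probe.text_fields}
    queries.update(probe.choices)  # selects, check boxes, radio buttons
    return queries


# Made-up page used only to exercise the sketch.
SAMPLE = """
<p>Used bikes, bikes for sale, and bikes for rent.</p>
<form action="/search">
  <input type="text" name="q">
  <select name="size"><option value="s">Small</option><option value="l">Large</option></select>
  <input type="radio" name="cond" value="new">
  <input type="radio" name="cond" value="used">
</form>
"""

if __name__ == "__main__":
    print(candidate_queries(SAMPLE, max_words=1))
```

For the text box, the sketch guesses words from the page itself; for the select menu and radio buttons, it only reuses values already present in the HTML, which is the split the quote describes. A real crawler would presumably also cap the number of submissions and look at the form's method, which this sketch ignores.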

I have to admit, I'm curious to know what Googlebot would put into a form to generate a page. Depending on how this works out, perhaps Google will ask site owners to create a dummy account so it can log in and crawl protected content, which, all things being equal, ought to be an option.


