Google Asks Feds For Better Document Access
The numerous agencies of the federal government possess thousands of documents and pieces of information that can’t be found by Google’s crawlers.
Because Google is the leading search engine, people who rely on it probably won’t find what they need if it sits behind an online search form on an agency’s site.
Today, the Google Public Policy blog noted the company’s testimony before the Senate Homeland Security and Governmental Affairs Committee about this problem. Google’s J.L. Needham called out forms as an obstacle to indexing content effectively:
The most common barrier is the search form for a database that asks users to input several fields of information to find what they’re looking for. Our crawlers cannot effectively follow the links to reach behind the search form.
Google recommended the Sitemaps protocol, which all of the major search engines support, as a way for government sites to guide crawlers to the content that citizens want to discover in search. The protocol is also in use at the government’s main information portal, USA.gov.
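To illustrate the idea, here is a minimal sketch of how a site might generate a Sitemaps file that lists database records directly, so a crawler does not need to submit a search form to reach them. The domain `agency.example.gov`, the record IDs, and the `build_sitemap` helper are hypothetical and are not taken from Google's testimony or from any agency's actual setup.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the Sitemaps protocol (sitemaps.org), version 0.9.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a Sitemaps-protocol XML string listing the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc in urls:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical record pages that would otherwise be reachable only
# through an agency's search form.
record_urls = [
    f"https://agency.example.gov/records/{record_id}"
    for record_id in (1001, 1002, 1003)
]

sitemap_xml = build_sitemap(record_urls)
print(sitemap_xml)
```

A site would typically save this output as a file such as `sitemap.xml` and make it known to search engines, for example through a `Sitemap:` line in `robots.txt`.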
The search ad company also cited the release of a Center for Democracy & Technology (CDT) report on how the government has published information and made it available to searchers. The report criticized how poorly search engines answer some important queries:
A search for "New York radiation" does not find basic FEMA and DHS information about current conditions and monitoring.
A search to help grandparents with a question about visitation of their grandchildren in any search engine does not turn up an article of the same title located on the Web site of the Administration for Children & Families.
A search for "small farm loans" turns up the commercial offers for loans, and statistics about government loans, but not most of the major federal government programs designed to help fund small farms.
Like Google, CDT exhorted the Feds to pass the E-Government Reauthorization Act and to take steps to enable search crawlers to find content more efficiently.