Is Google Bugged?
I have implemented a search facility on this web site that uses Google. It is supposed to help you find the material you are looking for. But I've noticed that Google returns some odd results. Instead of displaying the most relevant results first, it displays them last, with no title and no excerpt.
The screen shot below shows what I am talking about. A search for "rio carbon" returns four results. (To make it easy for you, click here to do the search.) The first result is the index to the discussion forum that contains Rio Carbon threads. The second result is a blog syndication. The third and fourth results are actually the more relevant information. The third result is "Rio Carbon Internals", while the fourth is "Rio Carbon works with Linux".
As you can see, both articles are shown only by their URLs, with neither a title nor a text excerpt. To make things worse, the URLs are truncated.
Is Google search buggy?
This bug was quite intriguing to me, so I sent the following bug report to Google on Tuesday:
"I have integrated Google search into my web site, located here:
The search seems to identify web pages with no title and no text excerpt. I am wondering if that is a bug. I documented the problem extensively at [this page that you are viewing].
If you don't mind taking a look at it, I would really appreciate it."
Today, I got the following response from Google:
"Thank you for your note. The Google index contains two types of pages: fully indexed and partially indexed pages. A site that's listed by its URL and appears without a cached copy and a detailed title is partially indexed. When a site is partially indexed, it's because our robots were unable to completely review its content during a recent crawl.
Also, while the Google index does include dynamically generated webpages, including .asp pages, .php pages, and pages with question marks in their URLs, these pages can cause problems for our crawler and may be ignored. If you're concerned that your dynamically generated pages are being ignored, you may want to consider creating static copies of these pages for our crawler. If you do this, please be sure to include a robots.txt file that disallows the dynamic pages in order to ensure that these pages are not seen as having duplicate content.
We're always working to increase the number of fully indexed pages in our index. While we can't guarantee that pages in our search results will always be fully indexed, "crawler-friendly" pages have a greater chance of being fully indexed. For more guidelines on creating a crawler-friendly site, please visit http://www.google.com/webmasters/guidelines.html
Lastly, you may want to comb http://groups.google.com/groups?q=google.public.support.general for suggestions from our users and webmasters or to post a question of your
Although I do appreciate Google's detailed response to my bug report and its effort to provide tips that help its search engine index our web sites, I am still not sure what to make of the response. Should I change my web site because Google's search engine is having problems indexing dynamic pages? Or should Google fix its search engine's inability to accurately index dynamic pages?
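As a rough illustration of the robots.txt advice in Google's reply, here is a minimal Python sketch that checks whether a set of rules blocks the dynamic pages while leaving static copies crawlable. The paths (`/forum.php`, `/blog.php`, `/static/`) and the `example.com` host are hypothetical and are not this site's actual layout; the helper names are also made up for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block the dynamic scripts, leave static copies open.
# These paths are assumptions for illustration, not the real site layout.
ROBOTS_TXT = """\
User-agent: *
Disallow: /forum.php
Disallow: /blog.php
"""


def build_parser(text):
    """Parse robots.txt rules from a string instead of fetching them."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def can_crawl(parser, url, agent="Googlebot"):
    """Return True if the given user agent may fetch the URL."""
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    parser = build_parser(ROBOTS_TXT)
    for url in (
        "http://example.com/forum.php?thread=12",   # dynamic page
        "http://example.com/static/thread-12.html",  # static copy
    ):
        print(url, "->", "allowed" if can_crawl(parser, url) else "blocked")
```

The standard library parser only does simple prefix matching on paths, so this sketch assumes the dynamic pages share a common script path. Whether a particular crawler honors the rules the same way is something to verify against that crawler's own documentation.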
I have been monitoring this problem on and off. Up until a few weeks ago, the problem still persisted. But today, Google has provided valid results for the search mentioned in this thread. After four months, Google has finally fixed the bug.
Today, if you do the search, the result is slightly better. There are actually two relevant results, as shown in the picture below. The "Rio Carbon Internals" thread that I mentioned earlier is one of the results. But the "Rio Carbon works with Linux" thread is no longer part of the search results. Did Google's search index go backwards?
Almost exactly a year after I started this thread, Google's search index is still buggy. Or maybe it is just selectively indexing what it wants from GearHack, which makes me wonder what other relevant web sites I am missing when I do a search on Google. The pages I mentioned have been sitting around for more than a year now. There hasn't been much improvement.
Copyright © 2004 - 2022. All Rights Reserved.