Search Engines Today brings you the latest news on Google, Yahoo, MSN and most other major Web search engines.




Google

Google updates its Robots.txt and Meta Tags features

Aug. 16, 2007

Google today improved its robots.txt analysis tool so that it better recognizes sitemap declarations and relative URLs.

Earlier versions of the tool weren't aware of sitemaps at all and understood only absolute URLs. Anything else was reported as 'Syntax not understood'.

The improved tool now tells a site owner or webmaster whether the sitemap's URL and scope are valid.

According to Google, test URLs can now also be entered in relative form, which means a lot less typing.

Line reporting is better, too. Webmasters are now told of every problem on a line, whereas earlier versions reported only the first problem encountered.

Google has also made other general improvements to the robots.txt analysis tool and its overall validation.

For instance, suppose a webmaster or developer managing the domain www.site.com wants search engines to index everything on the site except the /images folder. The webmaster also wants to make sure the sitemap gets noticed, and so saves the following as the new robots.txt file, a draft that contains mistakes of the kind the tool is designed to flag:

disalow images

user-agent: *
Disallow:

sitemap: http://www.example.com/sitemap.xml
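To illustrate what reporting several problems per line can look like, here is a minimal Python sketch of a line checker run against the draft file above. It is not Google's tool: the function names, the list of accepted fields, the wording of the messages, and the rule that a sitemap is "in scope" only when it sits on the site's own host are all assumptions made for this example.

```python
from urllib.parse import urlparse

# Field names this sketch accepts; an assumption, not Google's actual list.
KNOWN_FIELDS = {"user-agent", "disallow", "allow", "sitemap"}


def check_line(line):
    """Return every problem found on one robots.txt line, not just the first."""
    problems = []
    text = line.strip()
    if not text or text.startswith("#"):
        return problems  # blank lines and comments are fine

    if ":" in text:
        field, _, value = text.partition(":")
    else:
        problems.append("missing ':' after the field name")
        field, _, value = text.partition(" ")
    field, value = field.strip().lower(), value.strip()

    if field not in KNOWN_FIELDS:
        problems.append(f"unknown field '{field}'")
    if field == "sitemap":
        if not value.startswith(("http://", "https://")):
            problems.append("sitemap must be an absolute URL")
    elif field != "user-agent" and value and not value.startswith("/"):
        problems.append(f"path '{value}' should start with '/'")
    return problems


def sitemap_in_scope(sitemap_url, site_host):
    """Assumed scope rule: the sitemap must live on the site's own host."""
    return urlparse(sitemap_url).hostname == site_host


# The draft robots.txt from the article, one entry per line.
DRAFT = [
    "disalow images",
    "",
    "user-agent: *",
    "Disallow:",
    "",
    "sitemap: http://www.example.com/sitemap.xml",
]

for number, line in enumerate(DRAFT, start=1):
    for problem in check_line(line):
        print(f"line {number}: {problem}")
```

In this sketch the first line alone produces three findings: the missing colon, the misspelled field name and the path without a leading slash. The remaining lines pass the syntax check, although the empty "Disallow:" blocks nothing, so the draft would not actually keep crawlers out of /images. Under the assumed scope rule, sitemap_in_scope would also reject the sitemap URL for www.site.com, because it points at a different host.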

Webmasters can then visit Google's Webmaster Central and test the site with the robots.txt analysis tool, using these two test URLs:

http://www.site.com
/archives
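A similar test can be approximated locally with Python's standard urllib.robotparser module. The sketch below is only an approximation, and its results may differ from what Google's tool or Googlebot reports. The corrected file contents are an assumption about what the webmaster intended (block /images, allow everything else, sitemap on the site's own host), and the helper name allowed is made up for this example.

```python
from urllib import robotparser

# Hypothetical corrected robots.txt for www.site.com (assumed intent).
CORRECTED = """\
User-agent: *
Disallow: /images

Sitemap: http://www.site.com/sitemap.xml
"""

parser = robotparser.RobotFileParser()
parser.parse(CORRECTED.splitlines())


def allowed(url_or_path):
    """True if a generic crawler ('*') may fetch the given URL or path."""
    return parser.can_fetch("*", url_or_path)


# The two test URLs from the article, plus one path inside /images.
for candidate in ("http://www.site.com", "/archives", "/images/logo.png"):
    verdict = "allowed" if allowed(candidate) else "blocked"
    print(f"{candidate}: {verdict}")
```

With these rules both of the article's test URLs are allowed, while anything under /images is blocked. Like the improved tool, can_fetch accepts either an absolute URL or a relative path.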

Google also wants to make sure webmasters understand the new "unavailable_after" META tag, announced on the Google Blog on August 2. It allows for a more dynamic relationship between a website and Googlebot.

Continuing the www.site.com example: whenever a site has a temporarily available news story, or a limited-time sale or promotion page, the site owner can now specify the exact date and time at which that page should stop being crawled and indexed by Google.

For instance, for a site promotion that expires at the end of 2007, a site manager can place the following in the head section of the page www.site.com/2007promotion.html:

<META NAME="GOOGLEBOT" CONTENT="unavailable_after: 31 Dec 2007 23:59:59 EST">

Another new feature is the "X-Robots-Tag" directive, which adds Robots Exclusion Protocol (REP) META tag support for non-HTML pages. Webmasters and site owners now have the same control over videos, spreadsheets and other indexed file types.

Using the example above, if the promotion page is a PDF at www.site.com/2007promotion.pdf, a developer would now have the server send the following in the file's HTTP headers:

X-Robots-Tag: unavailable_after: 31 Dec 2007 23:59:59 EST
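For sites that generate pages or headers programmatically, the directive can be built from an expiry date. The following Python sketch uses made-up helper names and assumes the date layout shown in the header above (day, abbreviated month, year, time, time zone abbreviation). The META tag name "googlebot" is likewise an assumption for this example and should be checked against Google's documentation.

```python
from datetime import datetime

# Fixed English month abbreviations, so the output does not depend on locale.
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def unavailable_after(when, tz_abbrev="EST"):
    """Format the directive value, e.g. 'unavailable_after: 31 Dec 2007 23:59:59 EST'."""
    clock = when.strftime("%H:%M:%S")
    stamp = f"{when.day:02d} {MONTHS[when.month - 1]} {when.year} {clock}"
    return f"unavailable_after: {stamp} {tz_abbrev}"


def x_robots_header(when, tz_abbrev="EST"):
    """HTTP response header (name, value) pair for non-HTML files such as PDFs."""
    return ("X-Robots-Tag", unavailable_after(when, tz_abbrev))


def googlebot_meta(when, tz_abbrev="EST"):
    """META tag for the head section of an HTML page (tag name is assumed)."""
    return f'<meta name="googlebot" content="{unavailable_after(when, tz_abbrev)}">'


expiry = datetime(2007, 12, 31, 23, 59, 59)
name, value = x_robots_header(expiry)
print(f"{name}: {value}")
print(googlebot_meta(expiry))
```

The header pair would be attached to the response that serves the PDF, and the META tag would go into the HTML page itself. How the header is attached depends on the web server or framework in use.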

Google underlines that REP META tags are useful for giving page-level instructions such as noarchive, nosnippet and now unavailable_after. The robots.txt file, by contrast, is controlled at the root of the domain.

Google says bloggers and webmasters have been requesting these features.

To learn more, visit Google's Webmaster Help Group.

This article was featured on the Business 5.0 portal.

Source: Microsoft







Copyright © Search Engines Today. All rights reserved.