Apr 1, 2009
Please refer to my earlier post (April 2007) on XML sitemaps. The search engines have agreed on a joint XML sitemap protocol.
I personally like adding some formatting to my XML sitemap. I use the Google sitemaps toolbox, which lets me see the total number of URLs in the sitemap and sort the sitemap by each of its columns.
An XML sitemap is a listing of the URLs in your website, with extra information about each one:
- when the page was last updated,
- the priority you want Google to give the page, compared to your other pages, when spidering,
- how often you want the page spidered, i.e. all the time, daily, weekly or monthly.
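To make the three fields above concrete, here is a minimal sketch that writes them into a sitemap using Python's standard library. The element names (loc, lastmod, changefreq, priority) and the namespace come from the sitemaps.org protocol; the function name `build_sitemap` and the example.com pages are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(pages):
    """Return sitemap XML as a string for a list of page dicts."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = page["loc"]
        ET.SubElement(url, "lastmod").text = page["lastmod"]
        ET.SubElement(url, "changefreq").text = page["changefreq"]
        ET.SubElement(url, "priority").text = page["priority"]
    return ET.tostring(urlset, encoding="unicode")

# Hypothetical pages on a placeholder domain, with different priorities.
pages = [
    {"loc": "http://www.example.com/", "lastmod": "2009-04-01",
     "changefreq": "daily", "priority": "1.0"},
    {"loc": "http://www.example.com/about.html", "lastmod": "2009-01-15",
     "changefreq": "monthly", "priority": "0.5"},
]

sitemap_xml = build_sitemap(pages)
print(sitemap_xml)
```

In practice a content management system would fill in the page list from its own database rather than from a hard-coded list like this one.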
You can submit an XML sitemap, an RSS feed, or even a list of URLs in a text file.
The information can be submitted to Google via Google Webmaster Tools, and you can also include a link to the sitemap in your robots.txt file.
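The robots.txt link is a single "Sitemap:" line giving the full URL of the sitemap, as described in the sitemaps.org protocol. As a rough sketch, assuming you hold the robots.txt content in a string, a helper like the one below adds the line only if it is not already there. The helper name and the example.com URL are placeholders.

```python
def add_sitemap_to_robots(robots_txt, sitemap_url):
    """Return robots.txt content with a Sitemap line, added only if missing."""
    directive = "Sitemap: " + sitemap_url
    lines = robots_txt.splitlines()
    # Leave the file untouched when the exact line is already present.
    if directive in (line.strip() for line in lines):
        return robots_txt
    lines.append(directive)
    return "\n".join(lines) + "\n"

# Hypothetical existing robots.txt content.
robots = "User-agent: *\nDisallow: /admin/\n"
updated = add_sitemap_to_robots(robots, "http://www.example.com/sitemap.xml")
print(updated)
```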
Why a Google sitemap
A sitemap is best when you have a content management system that automatically creates updated "last modified" entries, and automatically adds new pages. You set things up so that Google is automatically pinged to let them know of the new entries.
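For the pinging step, the sitemaps.org protocol describes an HTTP request of the form <searchengine_URL>/ping?sitemap=sitemap_url. The sketch below only builds that URL and does not send anything. The search engine address and the function name are placeholders, so check each search engine's own documentation for whether it accepts pings and at which address.

```python
from urllib.parse import urlencode

def build_ping_url(engine_base, sitemap_url):
    """Build a ping URL following the sitemaps.org pattern.

    engine_base is a placeholder for the search engine's address.
    """
    query = urlencode({"sitemap": sitemap_url})  # URL-encodes the sitemap address
    return engine_base.rstrip("/") + "/ping?" + query

# Hypothetical search engine and site addresses.
ping_url = build_ping_url("http://searchengine.example",
                          "http://www.example.com/sitemap.xml")
print(ping_url)
```

A content management system could request this URL each time it regenerates the sitemap.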
In general, Google will spider your website as often as it schedules it, based on the highest Google PageRank (PR) of your pages. Having a sitemap will not in itself get your pages spidered more often. However, Google is more likely to respider your latest additions and changes if you notify it by pinging, and if the sitemap differs from the version it saw the last time it spidered it.
So don't spam Google by telling it that all pages have changed when in fact only one page has. Make sure you have pages with different priorities; otherwise you are in effect giving Google no information.
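One way to keep the "last modified" dates honest is to store a hash of each page and only move the date forward when the hash changes. This is a minimal sketch of that idea, not how any particular content management system does it. The function name, the data layout and the sample pages are all invented for illustration.

```python
import hashlib

def refresh_lastmod(entries, current_content, today):
    """Update 'lastmod' only for pages whose content has changed.

    entries: dict mapping url -> {"hash": ..., "lastmod": ...}, updated in place
    current_content: dict mapping url -> current page content (str)
    Returns the list of URLs that are new or changed.
    """
    changed = []
    for url, content in current_content.items():
        digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
        entry = entries.get(url)
        if entry is None or entry["hash"] != digest:
            entries[url] = {"hash": digest, "lastmod": today}
            changed.append(url)
    return changed

# Hypothetical site with two pages; only the home page is edited later.
entries = {}
refresh_lastmod(entries, {"/": "home v1", "/about": "about v1"}, "2009-03-01")
changed = refresh_lastmod(entries, {"/": "home v2", "/about": "about v1"}, "2009-04-01")
print(changed)
```

After the second call only the home page carries the new date, so the sitemap reports one change rather than claiming that everything was updated.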
If you have a large website with thousands of pages, prioritised spidering will get your new information updated faster than otherwise. With a small website there is little benefit, as Google would be spidering the whole lot in one go anyway.
Submitting your sitemap to Google lets Google know the total number of pages on your website. You then get the stat "Total URL's:x Indexed url's: y", which tells you how many of your pages are in Google. A very useful stat.
There are some great sitemap generators that will spider your website and create an XML sitemap. The risk is that you will not update it with new pages, nor use it to tell Google about updates to pages, so such a sitemap does not give that much benefit.