Google SiteMaps - Get ALL your pages listed!
by Robert Fuess
Google has a new trick! Don't get left behind! It's the hottest thing
since RSS.
I was working on my Google AdWords campaign for SchoolAndTeacher.com and I
noticed that Google had a new feature called "Google Sitemap". It was a
perfect fit with my needs. I was concerned with how deep Google would be
able to get into my site. Teachers are able to have homework posted daily
on my site, but will Google see the daily changes? Now it can. When a new
teacher signs up, how many months will it be before Google notices the new
pages? With Google Sitemap, Google can be told quickly about new pages and
about when each page last changed.
If you have a website and want Google to know about ALL the pages in your
site, build a Google Sitemap. That's it. Keep reading, since I will show
you how to do so, and where it proves most useful.
Webmasters (like me) have been frustrated by the slow, methodical way
in which Google gradually finds the pages in your site. If your site
starts out big and has a lot of dynamic pages, then this is unacceptable.
Hurray for Google! They are listening.
Many webmasters like using DHTML menus, but are concerned about the search
engines finding all the pages. A Google Sitemap will help Google to find
them. (In reality, I would still recommend having a regular site map to
help the other search engines find the rest of your pages. Google may drive
the most traffic, but customers coming in through any search engine are
welcome. Don't throw that traffic away.)
DO WE KEEP OUR OLD SITE MAPS?
Yes! The Google Sitemap is not useful to your normal human user, only to
Google. (It is in XML format, not HTML.) Also remember that other search
engines (like Yahoo and MSN) don't use this type of site map yet.
Hopefully they will.
WHAT DOES A GOOGLE SITEMAP LOOK LIKE?
It is XML. For XML gurus, here is the schema:
http://www.google.com/schemas/sitemap/0.84/sitemap.xsd.
But if you are new to XML, here is a sample. I will go through it. Don't
let the tags scare you.
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.google.com/schemas/sitemap/0.84"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://www.google.com/schemas/sitemap/0.84
        http://www.google.com/schemas/sitemap/0.84/sitemap.xsd">
  <url>
    <loc>http://www.SchoolAndTeacher.com/OneClass.aspx?ClassId=1</loc>
    <lastmod>2005-07-04</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.7</priority>
  </url>
  <url>
    <loc>http://www.SchoolAndTeacher.com/OneClass.aspx?ClassId=2</loc>
    <lastmod>2005-07-01</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.5</priority>
  </url>
</urlset>
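For a dynamically built site, a file like the sample above would normally be generated by code rather than typed by hand. The following is a minimal Python sketch of that idea, not a definitive implementation. The function name build_sitemap and the tuple layout are hypothetical, and the sketch leaves out the optional xsi:schemaLocation attribute for brevity.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Namespace taken from the sample sitemap in the article.
SITEMAP_NS = "http://www.google.com/schemas/sitemap/0.84"


def build_sitemap(pages):
    """Return sitemap XML text for an iterable of
    (url, last_modified, changefreq, priority) tuples.

    last_modified is a datetime.date; priority is a float from 0.0 to 1.0.
    """
    urlset = ET.Element("urlset", {"xmlns": SITEMAP_NS})
    for url, last_modified, changefreq, priority in pages:
        entry = ET.SubElement(urlset, "url")
        # strip() removes stray spaces around the URL; ElementTree
        # escapes characters such as & when it serializes the text.
        ET.SubElement(entry, "loc").text = url.strip()
        # isoformat() produces zero-padded YYYY-MM-DD.
        ET.SubElement(entry, "lastmod").text = last_modified.isoformat()
        ET.SubElement(entry, "changefreq").text = changefreq
        ET.SubElement(entry, "priority").text = f"{priority:.1f}"
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    print(build_sitemap([
        ("http://www.SchoolAndTeacher.com/OneClass.aspx?ClassId=1",
         date(2005, 7, 4), "monthly", 0.7),
    ]))
```

In practice the list of pages would come from your database or content system, and the returned text would be written to the sitemap file on your web server.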
EXPLANATION OF TAGS:
<?xml version="1.0" encoding="UTF-8"?>
This should be at the top of the document. It states that this is an XML
document, what version of XML is being used, and how the file is encoded.
<urlset xmlns="http://www . . . ">
This tag is the wrapper for all the URLs in the sitemap. Just copy it from
the full example above. If you are new to XML, don't fret. Just recognize
that this is how GOOGLE wants it.
<url> .. </url>
This wraps the set of XML elements (pairs of opening and closing XML tags)
for each URL you want to tell Google about.
<loc>YourUrl</loc>
Here is where you put the URL to the webpage you want Google to know
about. Try to avoid extra spaces.
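<lastmod>YYYY-MM-DD</lastmod>
This tells Google when you last modified your web page. This is a really
important tag. Use two digits for the month and the day (for example,
2005-07-04). (You can put the time in also if you feel the need to.)
<changefreq>frequency</changefreq>
This is a hint about how often the page is likely to change. The allowed
values are always, hourly, daily, weekly, monthly, yearly, and never.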
<priority>#.#</priority>
This tells Google which pages you feel are the most important to crawl.
These are relative numbers from 0.0 to 1.0, with 1.0 being the most
important. The default value is 0.5 (even if you leave the tag out). THIS
HAS NO IMPACT ON PAGERANK!!! This is just a relative weight for Google to
crawl YOUR site. If you have all of them at 0.9, it would be no different
than all of them being 0.1.
Look at it this way. If Google were super busy one day and had time to
crawl only 3 pages in your site, which ones should it crawl? I would want
them to crawl the three with the highest priority (to me) that have
changed recently.
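The thought experiment above can be written out in a few lines of Python. This is purely an illustration of how a relative priority plus a last-modified date could be used to choose a few pages. It is not Google's actual crawl logic, which is not described in this article, and the function name pick_pages_to_crawl and the dictionary keys are hypothetical.

```python
from datetime import date


def pick_pages_to_crawl(pages, budget=3):
    """Return the URLs of the `budget` pages to visit first.

    Pages are ordered by priority (highest first); pages with the same
    priority are ordered by lastmod (most recently changed first).
    """
    ranked = sorted(
        pages,
        key=lambda page: (page["priority"], page["lastmod"]),
        reverse=True,
    )
    return [page["loc"] for page in ranked[:budget]]


if __name__ == "__main__":
    pages = [
        {"loc": "/a", "priority": 0.9, "lastmod": date(2005, 7, 1)},
        {"loc": "/b", "priority": 0.5, "lastmod": date(2005, 7, 4)},
        {"loc": "/c", "priority": 0.9, "lastmod": date(2005, 7, 3)},
        {"loc": "/d", "priority": 0.7, "lastmod": date(2005, 6, 1)},
    ]
    print(pick_pages_to_crawl(pages))  # ['/c', '/a', '/d']
```

Because only the ordering matters here, giving every page the same priority leaves the choice to the dates alone, which is why setting everything to the same value tells the crawler nothing.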
BUT I DON'T WANT TO USE XML
You can just provide a text document with the list of URLs. This will
help Google find your pages, but will not help Google efficiently decide
on what to spider. I would strongly encourage you to build an XML version
of the Google Sitemap.
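If you do go the text-file route, the file can be produced with a few lines of Python. This sketch assumes the list is laid out as one URL per line with no blank lines, which is an assumption here and should be checked against Google's documentation. The function name build_text_sitemap is hypothetical.

```python
def build_text_sitemap(urls):
    """Return the contents of a plain-text URL list, one URL per line.

    Surrounding whitespace is trimmed and empty entries are skipped.
    """
    cleaned = [url.strip() for url in urls if url.strip()]
    return "\n".join(cleaned) + "\n"


if __name__ == "__main__":
    print(build_text_sitemap([
        "http://www.SchoolAndTeacher.com/OneClass.aspx?ClassId=1",
        "http://www.SchoolAndTeacher.com/OneClass.aspx?ClassId=2",
    ]), end="")
```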
BENEFITS OF USING THE XML VERSION:
Google can know what pages have changed and not have to re-crawl those
pages that haven't changed.
If you have a lot of pages and Google doesn't have time to crawl all of
them right away, it will focus on the ones that changed, according to
your priority.
Google is your friend. If they want information on how to efficiently
crawl your site, give it to them.
WHAT TO SUBMIT?
You may submit a sitemap, or an index of your sitemaps. Either will do.
Google has documentation on both. (Don't worry about the index yet. That
is addressed at the end of this article.)
I BUILT ONE - NOW HOW DO I SUBMIT IT?
First, TELL GOOGLE IT EXISTS
Upload your Sitemap to the top-level (root) folder of your website.
Sign into Google Sitemaps with your Google Account. (Use this link: Google
Sitemap Login)
Click on the "Add a Sitemap" link.
Type the URL of your Sitemap.
Congratulations! Google now knows about it!
Now tell Google whenever something changes. It will check this sitemap to
see where the changes are.
Quick and Easy Way:
Type the following into the address section of your browser:
http://www.google.com/webmasters/sitemaps/ping?sitemap=http://www.YourDomain.com/YourSitemap.xml
Of course, you should replace the YourDomain.com with your domain and
YourSitemap.xml with your sitemap.
If you have a dynamically built site, then you would want to automate this
by having your code request that same URL, using screen scraping
techniques or a plain HTTP request.
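As a rough sketch of that automation, the Python below builds the ping address from the article and requests it. The function names build_ping_url and ping_google are hypothetical. Percent-encoding the sitemap address inside the query string is an assumption (it is standard practice for query values, but the article shows the address typed in plainly), and the sketch assumes the ping address given above is still the one Google accepts.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Ping address as given in the article.
PING_ENDPOINT = "http://www.google.com/webmasters/sitemaps/ping"


def build_ping_url(sitemap_url):
    """Return the full ping URL for the given sitemap address."""
    # urlencode percent-encodes the sitemap address (an assumption, see above).
    return PING_ENDPOINT + "?" + urlencode({"sitemap": sitemap_url})


def ping_google(sitemap_url, timeout=10):
    """Request the ping URL and return the HTTP status code."""
    with urlopen(build_ping_url(sitemap_url), timeout=timeout) as response:
        return response.status


if __name__ == "__main__":
    # Only prints the URL; call ping_google(...) to actually send the request.
    print(build_ping_url("http://www.YourDomain.com/YourSitemap.xml"))
```

Keeping the URL construction separate from the network call makes it easy to check the address you are about to request before sending anything.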
HOW OFTEN TO SUBMIT?
Ideally, it should be submitted when changes are made. Personally, I would
avoid doing so more than once per day. However, we will look to Google as
they may provide further guidance on this. Search engines are our friends,
and we should avoid abusing any service they provide or making
them process things needlessly.
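If the submission is automated, the once-per-day courtesy limit suggested above can be enforced in code. This is a minimal sketch under that assumption. The one-day interval is the article's personal suggestion rather than a documented Google rule, and the function name should_ping is hypothetical.

```python
from datetime import datetime, timedelta

# The article's suggested courtesy limit: at most one submission per day.
MIN_INTERVAL = timedelta(days=1)


def should_ping(last_ping, site_changed, now):
    """Decide whether to resubmit the sitemap.

    last_ping is the datetime of the previous submission, or None if the
    sitemap has never been submitted. Nothing is sent unless the site
    has actually changed.
    """
    if not site_changed:
        return False
    return last_ping is None or now - last_ping >= MIN_INTERVAL


if __name__ == "__main__":
    now = datetime(2005, 7, 5, 12, 0)
    print(should_ping(datetime(2005, 7, 5, 6, 0), True, now))  # False
    print(should_ping(datetime(2005, 7, 4, 12, 0), True, now))  # True
```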
WILL THIS IMPROVE MY GOOGLE RANKING?
Google doesn't make any promises of this. This is mainly a way for Google
to find your pages, and to efficiently know what pages need to be
re-crawled on your website. If your site makes frequent changes, this
feature helps Google to know about them more quickly. It won't have to
spider through your whole site to find the changes.
HOW MANY URLs CAN I HAVE IN A GOOGLE SITEMAP?
According to Google documentation, you may have up to 50,000. If you
anticipate more than this, then you should build several sitemaps and use
a Google Sitemap Index. This Index will point to the several sitemaps. If
you want more information on the Sitemap Index, go to
http://www.google.com/webmasters/sitemaps/docs/en/protocol.html#sitemapFileRequirements
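To make the 50,000-URL limit concrete, here is a small Python sketch that splits a long URL list into sitemap-sized groups and builds an index pointing at the resulting files. The element names sitemapindex, sitemap, and loc are an assumption based on the 0.84 protocol and should be checked against the Google documentation linked above. The function names and the sitemapN.xml file names are hypothetical.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.google.com/schemas/sitemap/0.84"
MAX_URLS_PER_SITEMAP = 50000  # limit quoted in the article


def split_into_sitemaps(urls, limit=MAX_URLS_PER_SITEMAP):
    """Split a list of URLs into groups of at most `limit` entries."""
    return [urls[i:i + limit] for i in range(0, len(urls), limit)]


def build_sitemap_index(sitemap_urls):
    """Return index XML text that points at each sitemap file."""
    index = ET.Element("sitemapindex", {"xmlns": SITEMAP_NS})
    for sitemap_url in sitemap_urls:
        entry = ET.SubElement(index, "sitemap")
        ET.SubElement(entry, "loc").text = sitemap_url
    body = ET.tostring(index, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    urls = [f"http://www.YourDomain.com/page{i}" for i in range(120000)]
    groups = split_into_sitemaps(urls)
    names = [f"http://www.YourDomain.com/sitemap{n}.xml"
             for n in range(1, len(groups) + 1)]
    print(len(groups))  # 3
    print(build_sitemap_index(names))
```

Each group would be written out as its own sitemap file, and only the index would then be submitted to Google.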
For more information, please refer to the Google documentation at
http://www.google.com/webmasters/sitemaps/docs/en/about.html
About The Author: Robert Fuess, owner of SpiderwebLogic.com, has
been building web pages for several years. He has a lot of expertise in
ASP.Net, building shopping carts, SEO, and integrating with databases.