The Fubra Blog

Sitemap Validation

Posted Wednesday 17th September 2008 by Alex

Introduction

Hello, my name is Alex Buell. I am profoundly deaf and work as a Linux
system administrator within the Fubra infosphere. I usually spend most
of my time working on open source projects, giving extra value back to
the community through tools that help us do our jobs.

What are sitemaps?

They provide a way for webmasters (people who run
websites) to publish information about the content of their websites.
Search engines (e.g. www.google.co.uk) look (‘crawl’) through
websites to build up indexes that let people search for the things
they are interested in.

Essentially, a sitemap is just an XML file containing URLs, along with
additional metadata about each URL (when it was last updated, how
often it changes, and how important it is relative to the other URLs
on the site). This enables search engines to make more intelligent
decisions about how to crawl the pages on the web site.
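
To make the format concrete, here is a minimal Python sketch that reads a small sitemap and pulls out each URL with its metadata. The tag names and namespace come from the sitemaps.org protocol; the sample content, the example.com URLs and the function name parse_sitemap are made up for illustration and are not taken from our tool.

```python
import xml.etree.ElementTree as ET

# Namespace used by the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sitemap content, for illustration only.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.example.com/</loc>
    <lastmod>2008-09-17</lastmod>
    <changefreq>daily</changefreq>
    <priority>0.8</priority>
  </url>
  <url>
    <loc>http://www.example.com/about</loc>
  </url>
</urlset>
"""


def parse_sitemap(xml_text):
    """Return one dict per <url> entry; missing optional tags become None."""
    root = ET.fromstring(xml_text)
    entries = []
    for url in root.findall("sm:url", NS):
        entries.append({
            "loc": url.findtext("sm:loc", default=None, namespaces=NS),
            "lastmod": url.findtext("sm:lastmod", default=None, namespaces=NS),
            "changefreq": url.findtext("sm:changefreq", default=None, namespaces=NS),
            "priority": url.findtext("sm:priority", default=None, namespaces=NS),
        })
    return entries


for entry in parse_sitemap(SAMPLE):
    print(entry)
```

Only the loc tag is required for each URL; the other three are optional, which is why the sketch falls back to None when they are absent.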

Sitemaps have to be updated regularly, so we have a tool that
automatically generates them and stores them within the web site.
Search engines can then download these files and read through all the
URLs.

A new tool for validating sitemaps

We at Fubra have developed a web tool to help with validating the
sitemaps on our websites. The tool does two things: it reads the sitemap
files off the website and presents them to us in a human-readable form
and, if asked, it checks through the URLs, displaying the HTTP status
code (e.g. 404, 301 or 200) in the status icons on the right side.

Sitemap Validator

How to use the Validator webtool

To use the tool, point your browser at the
Sitemap Validator site, type in the URL of a website
(e.g. www.talkfootball.co.uk), and see what the tool does with it.

As it stands, the validator tool will look through the URLs and
their associated metadata for validity and colour the icon on the right
accordingly. A red icon means that the URL is a duplicate, orange means
that the date and time associated with the URL are invalid, whilst green
means that the URL and its metadata are valid. The tool sorts the URLs
so that all invalid URLs come first and all valid URLs come
last. This makes it easy to see what is wrong with the sitemap.
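
The colour rules above can be sketched in a few lines of Python. This is only an illustration of the idea, not the validator's actual code: the function names, the dict layout with "loc" and "lastmod" keys, and the choice to treat a missing date as valid are all assumptions of mine. The date check accepts a year, a year and month, a full date, or a full timestamp, which is a simplified reading of the W3C date formats that sitemaps use.

```python
from datetime import datetime


def lastmod_is_valid(value):
    """Accept a missing value, YYYY, YYYY-MM, YYYY-MM-DD or an ISO timestamp."""
    if value is None:
        return True  # assumption: an absent lastmod is not an error
    for fmt in ("%Y", "%Y-%m"):
        try:
            datetime.strptime(value, fmt)
            return True
        except ValueError:
            pass
    try:
        # Older Python versions do not accept a trailing "Z", so normalise it.
        datetime.fromisoformat(value.replace("Z", "+00:00"))
        return True
    except ValueError:
        return False


def classify(entries):
    """Return (colour, url) pairs with invalid entries sorted first."""
    seen = set()
    results = []
    for entry in entries:
        if entry["loc"] in seen:
            colour = "red"      # duplicate URL
        elif not lastmod_is_valid(entry.get("lastmod")):
            colour = "orange"   # invalid date and time
        else:
            colour = "green"    # URL and metadata look fine
        seen.add(entry["loc"])
        results.append((colour, entry["loc"]))
    # Stable sort: False (not green) sorts before True (green).
    results.sort(key=lambda pair: pair[0] == "green")
    return results


# Hypothetical entries, for illustration only.
entries = [
    {"loc": "http://www.example.com/", "lastmod": "2008-09-17"},
    {"loc": "http://www.example.com/news", "lastmod": "not-a-date"},
    {"loc": "http://www.example.com/", "lastmod": "2008-09-17"},
]
for colour, url in classify(entries):
    print(colour, url)
```

In this sketch the first occurrence of a URL keeps its green icon and only the repeats turn red; the real tool may well flag duplicates differently.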

Additionally, there is a ‘Check’ button in the top row which, when
clicked, runs a check on all the URLs in the sitemap and displays
the HTTP status code for each one.
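
For the curious, a check like this can be approximated in Python with nothing but the standard library. Again, this is a sketch under my own assumptions rather than how the tool is implemented: fetch_status and check_urls are hypothetical names, and I use a HEAD request through http.client because it does not follow redirects, so a 301 is reported as a 301 instead of the status of the page it points to.

```python
import http.client
from urllib.parse import urlsplit


def fetch_status(url, timeout=10):
    """Send a HEAD request and return the HTTP status code, without following redirects."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        return conn.getresponse().status
    finally:
        conn.close()


def check_urls(urls, fetch=fetch_status):
    """Map each URL to its status code; 'fetch' can be swapped out for testing."""
    return {url: fetch(url) for url in urls}
```

Calling check_urls(["http://www.example.com/"]) would contact the server and return a dict of URL to status code. Some servers answer HEAD differently from GET, so a real checker might need to fall back to GET.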

Any other business?

All comments, flames and feedback are welcome. Please do drop me an
email at alex at fubra dot com.
