Search Engine Site Map Creator
The major search engines can now base their indexing on XML site maps. For further information see sitemaps.org.
In a site map, each page of the web site requires its own XML entry in the format:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.braemoor.co.uk/software/sitemapper.shtml</loc>
    <lastmod>2009-04-01</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.8</priority>
  </url>
</urlset>
The <lastmod>, <changefreq>, and <priority> elements are all optional.
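As an illustration of the entry format, here is a minimal sketch of how one <url> entry might be assembled in PHP, leaving out any optional element that has no value. The function name build_url_entry is hypothetical and is not taken from sitemap.php.

```php
<?php
// Hypothetical helper (not taken from sitemap.php): builds one <url> entry,
// emitting the optional child elements only when a value is supplied.
function build_url_entry(string $loc, ?string $lastmod = null,
                         ?string $changefreq = null, ?string $priority = null): string
{
    // <loc> is mandatory; escape characters such as & for XML.
    $xml = "  <url>\n    <loc>"
         . htmlspecialchars($loc, ENT_XML1 | ENT_QUOTES, 'UTF-8')
         . "</loc>\n";

    $optional = array(
        'lastmod'    => $lastmod,
        'changefreq' => $changefreq,
        'priority'   => $priority,
    );
    foreach ($optional as $tag => $value) {
        if ($value !== null && $value !== '') {
            $xml .= "    <$tag>"
                  . htmlspecialchars($value, ENT_XML1 | ENT_QUOTES, 'UTF-8')
                  . "</$tag>\n";
        }
    }
    return $xml . "  </url>\n";
}

echo build_url_entry(
    'http://www.braemoor.co.uk/software/sitemapper.shtml',
    '2009-04-01',
    'monthly',
    '0.8'
);
```

Calling the function with only the first argument produces an entry containing just the <loc> element.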
There are a number of online facilities which create site maps for you, but they leave all the optional elements at their default values, so any specific values have to be added manually. When you recreate the site map, any additions you have made are lost.
This facility also creates site maps, but it uses a PHP script held on your own web site. It also understands three metatags held in the header of each web page, which may be used to specify the optional elements <changefreq>, <priority>, and <lastmod>:
<meta name="sitemap-priority" content="1.0" />
<meta name="sitemap-changefreq" content="monthly" />
<meta name="sitemap-lastmod" content="2008-12-31" />
Their values take the same form as the corresponding elements in the XML format. This means that the optional values associated with a URL entry are kept with the source of the page, and the site map can be readily recreated.
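To show how metatags of this kind can be picked up from a page, here is a minimal sketch using PHP's DOM extension. It is not the code used by sitemap.php: the function name read_sitemap_meta is hypothetical, and the sketch assumes that every metatag of interest has a name beginning with "sitemap-".

```php
<?php
// Hypothetical helper: collects <meta name="sitemap-..."> values from a page's HTML.
function read_sitemap_meta(string $html): array
{
    $found = array();
    if (trim($html) === '') {
        return $found; // nothing to parse
    }

    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // tolerate non-strict HTML
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('meta') as $meta) {
        $name = strtolower($meta->getAttribute('name'));
        if (strpos($name, 'sitemap-') === 0) {
            // 'sitemap-priority' is stored under the key 'priority', and so on.
            $found[substr($name, 8)] = trim($meta->getAttribute('content'));
        }
    }
    return $found;
}

$page = '<html><head><title>Example</title>'
      . '<meta name="sitemap-priority" content="1.0" />'
      . '</head><body></body></html>';
print_r(read_sitemap_meta($page));
```

Any metatag that is absent simply has no key in the returned array, so the caller can decide whether to fall back to a default or omit the element.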
If the "sitemap-lastmod" metatag is missing, the date on which the associated file was last modified is used. This is normally accurate, but if the page is dynamic this date will not reflect when the data was last updated. This can be overcome by giving the "sitemap-lastmod" metatag an explicit date value such as "2009-01-31", or by giving it the value "default", in which case the <lastmod> element is omitted from this URL's entry in the site map.
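The three cases described above can be sketched as a small function. This is only an illustration: the name resolve_lastmod is hypothetical, and formatting the file date in UTC is an assumption rather than something the script is known to do.

```php
<?php
// Hypothetical helper illustrating the three "sitemap-lastmod" cases.
// Returns a YYYY-MM-DD string, or null when <lastmod> should be left out.
function resolve_lastmod(?string $metaValue, int $fileMtime): ?string
{
    if ($metaValue === null || trim($metaValue) === '') {
        // Metatag missing: fall back to the file's modification time
        // (formatted in UTC here, which is an assumption).
        return gmdate('Y-m-d', $fileMtime);
    }
    if (strtolower(trim($metaValue)) === 'default') {
        // "default": omit the <lastmod> element for this URL.
        return null;
    }
    // Explicit date: use it as given.
    return trim($metaValue);
}

// Example: no metatag on this page, so the file's own date is used.
var_dump(resolve_lastmod(null, filemtime(__FILE__)));
```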
The PHP script spiders its way through the web site, taking into account the robots.txt file and any robots metatags, sorts the URLs into directory order, and generates the standard XML site map file.
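As a rough illustration of the filtering and sorting steps, the sketch below applies a deliberately simplified reading of robots.txt and then sorts the remaining paths. All function names are hypothetical. It honours only "Disallow:" path prefixes in a "User-agent: *" group, ignores robots metatags, and assumes that "directory order" means a plain string sort of the paths; the real script may well differ on each point.

```php
<?php
// Hypothetical helper: returns the Disallow prefixes that apply to all robots.
function disallowed_prefixes(string $robotsTxt): array
{
    $prefixes = array();
    $applies = false;
    foreach (preg_split('/\r\n|\r|\n/', $robotsTxt) as $line) {
        $line = trim(preg_replace('/#.*$/', '', $line)); // strip comments
        if (stripos($line, 'User-agent:') === 0) {
            $applies = trim(substr($line, 11)) === '*';
        } elseif ($applies && stripos($line, 'Disallow:') === 0) {
            $path = trim(substr($line, 9));
            if ($path !== '') {
                $prefixes[] = $path;
            }
        }
    }
    return $prefixes;
}

// Hypothetical helper: a path is allowed unless it starts with a disallowed prefix.
function is_allowed(string $path, array $prefixes): bool
{
    foreach ($prefixes as $prefix) {
        if (strpos($path, $prefix) === 0) {
            return false;
        }
    }
    return true;
}

// Keep the permitted paths and sort them so that pages in the same
// directory end up next to each other.
function filter_and_sort(array $paths, string $robotsTxt): array
{
    $prefixes = disallowed_prefixes($robotsTxt);
    $kept = array_values(array_filter($paths, function ($p) use ($prefixes) {
        return is_allowed($p, $prefixes);
    }));
    sort($kept, SORT_STRING);
    return $kept;
}

$robots = "User-agent: *\nDisallow: /private/\n";
print_r(filter_and_sort(
    array('/software/b.html', '/private/x.html', '/index.html', '/software/a.html'),
    $robots
));
```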
It also creates a second site map, mysitemap.xml, which is in the format:
<urlset>
  <url>
    <loc>http://braemoor.co.uk/</loc>
    <title>Braemoor: Home Page</title>
    <description>Braemoor Home Page</description>
  </url>
</urlset>
This is constructed from the <title> element and the description metatag in the page header, and may be used to construct your own user-friendly site map.
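One possible way to turn this second file into a user-friendly page is sketched below, using PHP's SimpleXML extension to produce an HTML list of links. The function name render_sitemap_list and the choice of markup are illustrative assumptions, not part of the download.

```php
<?php
// Hypothetical helper: turns the second site map into an HTML list of links.
function render_sitemap_list(string $xml): string
{
    $urlset = simplexml_load_string($xml);
    if ($urlset === false) {
        return ''; // not well-formed XML
    }

    $html = "<ul>\n";
    foreach ($urlset->url as $url) {
        // Escape each value before placing it in the HTML.
        $loc = htmlspecialchars((string) $url->loc, ENT_QUOTES, 'UTF-8');
        $title = htmlspecialchars((string) $url->title, ENT_QUOTES, 'UTF-8');
        $description = htmlspecialchars((string) $url->description, ENT_QUOTES, 'UTF-8');
        $html .= "  <li><a href=\"$loc\" title=\"$description\">$title</a></li>\n";
    }
    return $html . "</ul>\n";
}

$sample = '<urlset><url><loc>http://braemoor.co.uk/</loc>'
        . '<title>Braemoor: Home Page</title>'
        . '<description>Braemoor Home Page</description></url></urlset>';
echo render_sitemap_list($sample);
```

In practice the XML string would be read from mysitemap.xml, for example with file_get_contents().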
Once the downloaded file has been unzipped and uploaded to the root directory of the web site, the script, sitemap.php, is ready to run. Depending on the size of your web site, it can take a few minutes. There is, however, one line of code which specifies the file extensions of the pages to be processed:
$extensions = array('shtml', 'html', 'htm', 'php');
This may need adapting to suit your own requirements.
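For illustration, a list like this might be applied to each path as in the sketch below. The function name has_wanted_extension is hypothetical, and the case-insensitive comparison is an assumption; how the script itself treats upper-case extensions or URLs without a file name is not stated here.

```php
<?php
// The extension list as it appears in the script.
$extensions = array('shtml', 'html', 'htm', 'php');

// Hypothetical helper: true when the path ends in one of the listed extensions.
function has_wanted_extension(string $path, array $extensions): bool
{
    $ext = strtolower(pathinfo($path, PATHINFO_EXTENSION));
    return in_array($ext, $extensions, true);
}

var_dump(has_wanted_extension('/software/sitemapper.shtml', $extensions));
var_dump(has_wanted_extension('/images/logo.png', $extensions));
```

Adding an entry such as 'asp' to the array would cause pages with that extension to be accepted by the same check.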
Download compressed PHP file (5.91Kb)