WebAssert HTML & CSS validation testing library v0.1 released

A while back I posted about automating the checking of HTML validity of your ASP.NET site using unit tests that leverage the W3C Markup Validation Service. I’ve shown the technique in a number of presentations since then and used it on several projects to good effect.

In an effort to make it easier to consume in your own project and allow for future expansion with new features, I’ve refactored it and rolled it into a new open-source library called WebAssert, up on CodePlex.

Thanks to some scripting help from the ever-talented Tatham Oddie, I’m happy to announce the release of WebAssert v0.1 (beta).

This initial release supports checking the markup and CSS validity of URLs using the W3C-hosted validators or your own hosted instances. It supports the MSTest framework in Visual Studio, and there is already a fork containing a wrapper for NUnit, which I plan to integrate soon. You can also test sites hosted using the AspNetDevelopmentServer attribute under MSTest.

If you have any feedback, please let me know.


Adding HTML validity checking to your ASP.NET web site via unit tests

[UPDATED AGAIN: I’ve updated the test helper code to ensure that the W3C validation service is not hit more than once a second in accordance with their API guidelines. Thanks to Scott Baldwin for the tip and Mitch Denny for the threading code.]

[UPDATED: I’ve amended the helper method to return an object to represent the W3C validation check result rather than a simple boolean. This provides more flexibility to the test writer to check the number of errors or warnings returned. I’ve also changed the method to retrieve these values from the custom headers the validation service provides rather than searching the returned HTML.]

When developing standards-compliant web sites, it is important to regularly check your mark-up for validity to ensure you are adhering to the standards your HTML documents declare they use (via a DOCTYPE declaration). While the W3C provides an excellent online validator for checking documents, it can be cumbersome to use regularly with internal, dynamically generated web sites, like those under development with ASP.NET.

We generally write unit tests against our .NET code to ensure it functions as expected. Therefore, it makes sense to me to test the HTML validity of our ASP.NET applications and web sites via a unit test. Doing so allows you to easily check the validity of your site while under development using the same tools integrated into Visual Studio that you use to test other parts of your solution. This also allows it to be easily integrated into your automated build and test processes, and break the build, if necessary, when your site’s HTML doesn’t validate.

The following class can be used in conjunction with the unit testing framework in Visual Studio 2008 to test the validity of your site’s runtime HTML against the W3C Markup Validation Service:

using System;
using System.Collections.Specialized;
using System.IO;
using System.Net;
using System.Text;
using Microsoft.VisualStudio.TestTools.UnitTesting;
using System.Threading;

namespace UnitTests
{
    /// <summary>
    /// Represents a result from the W3C Markup Validation Service.
    /// </summary>
    public class W3CValidityCheckResult
    {
        public bool IsValid { get; set; }
        public int WarningsCount { get; set; }
        public int ErrorsCount { get; set; }
        public string Body { get; set; }
    }

    static class TestHelper
    {
        private static AutoResetEvent _w3cValidatorBlock = new AutoResetEvent(true);

        private static void ResetBlocker(object state)
        {
            // Ensures that W3C Validator service is not called more than once a second
            Thread.Sleep(1000);
            _w3cValidatorBlock.Set();
        }

        /// <summary>
        /// Determines whether the ASP.NET page returns valid HTML by checking the response against the W3C Markup Validator.
        /// </summary>
        /// <param name="testContext">The test context.</param>
        /// <param name="aspNetServerName">Name of the ASP.NET server.</param>
        /// <param name="path">The relative path of the resource to check.</param>
        /// <returns>
        /// An object indicating whether the generated HTML is valid, along with the error and warning counts.
        /// </returns>
        public static W3CValidityCheckResult ReturnsValidHtml(TestContext testContext, string aspNetServerName, string path)
        {
            var result = new W3CValidityCheckResult();
            WebHeaderCollection w3cResponseHeaders = new WebHeaderCollection();

            using (var wc = new WebClient())
            {
                string url = String.Format("{0}{1}",
                    testContext.Properties["AspNetDevelopmentServer." + aspNetServerName].ToString(),
                    path);
                string html = GetPageHtml(wc, url);

                // Send to W3C validator
                string w3cUrl = "http://validator.w3.org/check";
                wc.Encoding = System.Text.Encoding.UTF8;
                var values = new NameValueCollection();
                values.Add("fragment", html);
                values.Add("prefill", "0");
                values.Add("group", "0");
                values.Add("doctype", "inline");

                try
                {
                    _w3cValidatorBlock.WaitOne();
                    byte[] w3cRawResponse = wc.UploadValues(w3cUrl, values);
                    result.Body = Encoding.UTF8.GetString(w3cRawResponse);
                    w3cResponseHeaders.Add(wc.ResponseHeaders);
                }
                finally
                {
                    ThreadPool.QueueUserWorkItem(ResetBlocker); // Reset on background thread
                }
            }

            // Extract result from response headers
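            // int.TryParse sets its out argument to 0 on failure, so fall back to -1
            // explicitly when a header is missing or malformed
            int warnings;
            int errors;
            if (!int.TryParse(w3cResponseHeaders["X-W3C-Validator-Warnings"], out warnings)) warnings = -1;
            if (!int.TryParse(w3cResponseHeaders["X-W3C-Validator-Errors"], out errors)) errors = -1;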
            string status = w3cResponseHeaders["X-W3C-Validator-Status"];

            result.WarningsCount = warnings;
            result.ErrorsCount = errors;
            result.IsValid = (!String.IsNullOrEmpty(status) && status.Equals("Valid", StringComparison.InvariantCultureIgnoreCase));

            return result;
        }

        private static string GetPageHtml(WebClient wc, string url)
        {
            // Pretend to be Firefox 3 so that ASP.NET renders compliant HTML
            wc.Headers["User-Agent"] = "Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.0.1) Gecko/2008070208 Firefox/3.0.1 (.NET CLR 3.5.30729)";
            wc.Headers["Accept"] = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
            wc.Headers["Accept-Language"] = "en-au,en-us;q=0.7,en;q=0.3";

            // Read page HTML
            // The using blocks dispose the reader and stream, so no explicit Close() is needed
            using (Stream responseStream = wc.OpenRead(url))
            using (var sr = new StreamReader(responseStream))
            {
                return sr.ReadToEnd();
            }
        }
    }
}

You use it from your unit test like so:

[TestMethod]
[Description("Tests that the HTML output by the home page is valid using the W3C validator")]
[AspNetDevelopmentServer("WebApplication", "WebApplication")]
public void HomePageIsValidHtml()
{
    Assert.IsTrue(TestHelper.ReturnsValidHtml(this.TestContext, "WebApplication", "Default.aspx").IsValid,
        "The home page failed W3C Markup Validation (http://validator.w3.org).");
}

Note the use of the AspNetDevelopmentServer attribute on the test method. This tells the unit testing framework to spin up an instance of Visual Studio’s built-in ASP.NET web server (Cassini) with the name and at the path you specify. The runtime URL of that instance is then retrieved from the test class’s property bag (on the TestContext property) by the helper method above.
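
The property bag lookup described above can be sketched in isolation. This is only an illustration, not part of the helper class: DevServerUrl and Resolve are hypothetical names, a plain IDictionary stands in for TestContext.Properties, and the slash handling is an addition that the original String.Format call does not perform.

```csharp
using System;
using System.Collections;

public static class DevServerUrl
{
    // Builds the absolute URL of a page hosted by a named development server.
    // "properties" is a stand-in for TestContext.Properties (an assumption of this sketch).
    public static string Resolve(IDictionary properties, string serverName, string path)
    {
        string key = "AspNetDevelopmentServer." + serverName;
        if (!properties.Contains(key) || properties[key] == null)
        {
            throw new InvalidOperationException(
                "No development server named '" + serverName + "' was found in the test context.");
        }

        // Join base URL and path with exactly one slash between them
        string baseUrl = properties[key].ToString().TrimEnd('/');
        return baseUrl + "/" + path.TrimStart('/');
    }
}
```

Failing early with a descriptive message should make a mistyped server name easier to spot than a null reference raised inside the helper.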

This sample could be easily extended to do more thorough checking against numerous endpoints automatically if need be, perhaps by reading the website’s .sitemap file, or crawling the hyperlinks found in the response to a given depth.
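
As one possible starting point for the sitemap idea, the sketch below pulls page paths out of an ASP.NET .sitemap document so that each could be passed to ReturnsValidHtml in turn. SitemapPaths and FromXml are hypothetical names, and the sketch assumes the file uses the standard SiteMap-File-1.0 namespace and app-relative URLs beginning with ~/.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class SitemapPaths
{
    // Returns the distinct page paths listed in a .sitemap document,
    // with any leading "~/" removed so they are relative to the site root.
    public static List<string> FromXml(string sitemapXml)
    {
        XNamespace ns = "http://schemas.microsoft.com/AspNet/SiteMap-File-1.0";

        return XDocument.Parse(sitemapXml)
            .Descendants(ns + "siteMapNode")
            .Select(node => (string)node.Attribute("url"))
            .Where(url => !string.IsNullOrEmpty(url)) // grouping nodes may have no url
            .Select(url => url.StartsWith("~/") ? url.Substring(2) : url)
            .Distinct()
            .ToList();
    }
}
```

A test could read the site’s .sitemap file, call FromXml on its contents and assert validity for each path returned. Because the helper waits a second between validator calls, a large sitemap will make for a slow test run.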

So now you have no excuses! You can easily incorporate the checking of your ASP.NET site’s runtime HTML for validity into your normal development cycle.