Adding HTML validity checking to your ASP.NET web site via unit tests

[UPDATED AGAIN: I've updated the test helper code to ensure that the W3C validation service is not hit more than once a second in accordance with their API guidelines. Thanks to Scott Baldwin for the tip and Mitch Denny for the threading code.]

[UPDATED: I've amended the helper method to return an object to represent the W3C validation check result rather than a simple boolean. This provides more flexibility to the test writer to check the number of errors or warnings returned. I've also changed the method to retrieve these values from the custom headers the validation service provides rather than searching the returned HTML.]

When developing standards-compliant web sites, it is important to check your mark-up for validity regularly, to ensure you are adhering to the standards your HTML documents declare they use (via a DOCTYPE declaration). While the W3C provides an excellent online validator for checking documents, it can be cumbersome to use regularly with internal, dynamically generated web sites, like those under development with ASP.NET.

We generally write unit tests against our .NET code to ensure it functions as expected. Therefore, it makes sense to me to test the HTML validity of our ASP.NET applications and web sites via a unit test. Doing so lets you check the validity of your site during development with the same Visual Studio tools you use to test the rest of your solution. It also means the check can run as part of your automated build and test process and, if necessary, break the build when your site’s HTML doesn’t validate.

The following class can be used in conjunction with the unit testing framework in Visual Studio 2008 to test the validity of your site’s runtime HTML against the W3C Markup Validation Service:

using System;
using System.Collections.Specialized;
using System.IO;
using System.Net;
using System.Text;
using Microsoft.VisualStudio.TestTools.UnitTesting;
using System.Threading;

namespace UnitTests
{
    /// <summary>
    /// Represents a result from the W3C Markup Validation Service.
    /// </summary>
    public class W3CValidityCheckResult
    {
        public bool IsValid { get; set; }
        public int WarningsCount { get; set; }
        public int ErrorsCount { get; set; }
        public string Body { get; set; }
    }

    static class TestHelper
    {
        private static AutoResetEvent _w3cValidatorBlock = new AutoResetEvent(true);

        private static void ResetBlocker(object state)
        {
            // Ensures that W3C Validator service is not called more than once a second
            Thread.Sleep(1000);
            _w3cValidatorBlock.Set();
        }

        /// <summary>
        /// Determines whether the ASP.NET page returns valid HTML by checking the response against the W3C Markup Validator.
        /// </summary>
        /// <param name="testContext">The test context.</param>
        /// <param name="aspNetServerName">Name of the ASP.NET server.</param>
        /// <param name="path">The relative path of the resource to check.</param>
        /// <returns>
        /// An object representing the validation result, including whether the generated HTML is valid.
        /// </returns>
        public static W3CValidityCheckResult ReturnsValidHtml(TestContext testContext, string aspNetServerName, string path)
        {
            var result = new W3CValidityCheckResult();
            WebHeaderCollection w3cResponseHeaders = new WebHeaderCollection();

            using (var wc = new WebClient())
            {
                string url = String.Format("{0}{1}",
                    testContext.Properties["AspNetDevelopmentServer." + aspNetServerName].ToString(),
                    path);
                string html = GetPageHtml(wc, url);

                // Send to W3C validator
                string w3cUrl = "http://validator.w3.org/check";
                wc.Encoding = System.Text.Encoding.UTF8;
                var values = new NameValueCollection();
                values.Add("fragment", html);
                values.Add("prefill", "0");
                values.Add("group", "0");
                values.Add("doctype", "inline");

                try
                {
                    _w3cValidatorBlock.WaitOne();
                    byte[] w3cRawResponse = wc.UploadValues(w3cUrl, values);
                    result.Body = Encoding.UTF8.GetString(w3cRawResponse);
                    w3cResponseHeaders.Add(wc.ResponseHeaders);
                }
                finally
                {
                    ThreadPool.QueueUserWorkItem(ResetBlocker); // Reset on background thread
                }
            }

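            // Extract result from response headers.
            // int.TryParse sets its out parameter to 0 on failure, so fall back to -1
            // explicitly when a header is missing or cannot be parsed.
            int warnings;
            int errors;
            if (!int.TryParse(w3cResponseHeaders["X-W3C-Validator-Warnings"], out warnings)) warnings = -1;
            if (!int.TryParse(w3cResponseHeaders["X-W3C-Validator-Errors"], out errors)) errors = -1;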
            string status = w3cResponseHeaders["X-W3C-Validator-Status"];

            result.WarningsCount = warnings;
            result.ErrorsCount = errors;
            result.IsValid = (!String.IsNullOrEmpty(status) && status.Equals("Valid", StringComparison.InvariantCultureIgnoreCase));

            return result;
        }

        private static string GetPageHtml(WebClient wc, string url)
        {
            // Pretend to be Firefox 3 so that ASP.NET renders compliant HTML
            wc.Headers["User-Agent"] = "Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.0.1) Gecko/2008070208 Firefox/3.0.1 (.NET CLR 3.5.30729)";
            wc.Headers["Accept"] = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
            wc.Headers["Accept-Language"] = "en-au,en-us;q=0.7,en;q=0.3";

            // Read page HTML (the using blocks close the reader and stream)
            using (Stream responseStream = wc.OpenRead(url))
            using (var sr = new StreamReader(responseStream))
            {
                return sr.ReadToEnd();
            }
        }
    }
}

You use it from your unit test like so:

[TestMethod]
[Description("Tests that the HTML outputted by the home page is valid using the W3C validator")]
[AspNetDevelopmentServer("WebApplication", "WebApplication")]
public void HomePageIsValidHtml()
{
    Assert.IsTrue(TestHelper.ReturnsValidHtml(this.TestContext, "WebApplication", "Default.aspx").IsValid,
        "The home page failed W3C Markup Validation (http://validator.w3.org).");
}

Note the use of the AspNetDevelopmentServer attribute on the test method. This tells the unit testing framework to spin up an instance of Visual Studio’s inbuilt ASP.NET web server (Cassini) with the name and at the path you specify. The runtime URL of that instance is then retrieved from the test class’ property bag (on the TestContext property) by the helper method above.
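
For context, here is a minimal sketch of what the surrounding test class might look like, assuming the test project references the Visual Studio unit testing framework and contains the TestHelper class above. The class name HomePageTests and the second test's name are hypothetical, and the using directive for the Web namespace reflects where I believe the AspNetDevelopmentServer attribute lives. The sketch also shows how the result object lets you assert on the error count rather than just the IsValid flag.

```csharp
using Microsoft.VisualStudio.TestTools.UnitTesting;
using Microsoft.VisualStudio.TestTools.UnitTesting.Web; // assumed home of AspNetDevelopmentServerAttribute

namespace UnitTests
{
    [TestClass]
    public class HomePageTests // hypothetical class name
    {
        // The test framework assigns this property before each test runs.
        // The helper reads the development server URL from its Properties bag.
        public TestContext TestContext { get; set; }

        [TestMethod]
        [AspNetDevelopmentServer("WebApplication", "WebApplication")]
        public void HomePageHasNoValidationErrors()
        {
            W3CValidityCheckResult result =
                TestHelper.ReturnsValidHtml(this.TestContext, "WebApplication", "Default.aspx");

            // Assert on the count so a failure message shows how many errors were reported.
            Assert.AreEqual(0, result.ErrorsCount,
                "The W3C validator reported errors for the home page.");
        }
    }
}
```

Asserting on ErrorsCount rather than IsValid is a judgment call: it gives a more informative failure message, at the cost of depending on the validator's custom headers being present.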

This sample could be easily extended to do more thorough checking against numerous endpoints automatically if need be, perhaps by reading the website’s .sitemap file, or crawling the hyperlinks found in the response to a given depth.
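
As a rough sketch of the .sitemap idea, the following helper pulls the app-relative page paths out of a standard ASP.NET Web.sitemap document so that each one can be passed to the validity check. The class and method names (SiteMapPaths, FromXml) are hypothetical, and the sketch assumes the file uses the usual SiteMap-File-1.0 namespace with "~/"-prefixed url attributes; nodes without a url, or with an external URL, are skipped.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

static class SiteMapPaths // hypothetical helper, not part of the original sample
{
    // XML namespace used by standard ASP.NET Web.sitemap files.
    private static readonly XNamespace Ns =
        "http://schemas.microsoft.com/AspNet/SiteMap-File-1.0";

    /// <summary>
    /// Returns the distinct app-relative paths (without the leading "~/")
    /// found in the url attributes of the site map's nodes.
    /// </summary>
    public static IList<string> FromXml(string siteMapXml)
    {
        return XDocument.Parse(siteMapXml)
            .Descendants(Ns + "siteMapNode")
            .Select(node => (string)node.Attribute("url")) // null when the attribute is absent
            .Where(url => !String.IsNullOrEmpty(url)
                          && url.StartsWith("~/", StringComparison.Ordinal))
            .Select(url => url.Substring(2))               // "~/About.aspx" -> "About.aspx"
            .Distinct(StringComparer.OrdinalIgnoreCase)
            .ToList();
    }
}
```

A test could then load the site's Web.sitemap with File.ReadAllText, loop over the returned paths, and call TestHelper.ReturnsValidHtml for each one. Because the helper waits a second between validator calls, a large site map will make for a slow test run.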

So now you have no excuses! You can easily incorporate the checking of your ASP.NET site’s runtime HTML for validity into your normal development cycle.

5 Comments on “Adding HTML validity checking to your ASP.NET web site via unit tests”

  1. jens says:

    Hi,

    Great job!
    Here’s an equivalent in java using jUnit4 and hamcrest for validation:

    import static org.hamcrest.CoreMatchers.is;
    import static org.junit.Assert.assertThat;

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.io.OutputStreamWriter;
    import java.net.HttpURLConnection;
    import java.net.URL;
    import java.net.URLConnection;
    import java.net.URLEncoder;

    import org.junit.Test;

    public class W3CValidityCheckerIntTest {

        private final String OUR_URL = "http://localhost:8080/";
        private final String VALIDATOR_URL = "http://validator.w3.org/check";

        @Test
        public void testValidity() throws Exception {

            // Fetch the HTML of the page under test
            URLConnection our_url = (new URL(OUR_URL)).openConnection();
            BufferedReader br = new BufferedReader(new InputStreamReader(our_url.getInputStream()));
            String l;
            StringBuffer buff = new StringBuffer();
            while ((l = br.readLine()) != null) {
                buff.append(l).append("\n");
            }

            // Build the form-encoded request body for the validator
            String data = URLEncoder.encode("fragment", "UTF-8") + "=" + URLEncoder.encode(buff.toString(), "UTF-8");
            data += "&" + URLEncoder.encode("prefill", "UTF-8") + "=" + URLEncoder.encode("0", "UTF-8");
            data += "&" + URLEncoder.encode("group", "UTF-8") + "=" + URLEncoder.encode("0", "UTF-8");
            data += "&" + URLEncoder.encode("doctype", "UTF-8") + "=" + URLEncoder.encode("Inline", "UTF-8");

            URL url = new URL(VALIDATOR_URL);
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();

            conn.setDoOutput(true);
            conn.setDoInput(true);
            conn.setRequestMethod("POST");

            OutputStreamWriter wr = new OutputStreamWriter(conn.getOutputStream());
            wr.write(data);
            wr.flush();

            // Read the result from the validator's custom response headers
            String valid = conn.getHeaderField("X-W3C-Validator-Status");
            System.out.println("Warnings: " + conn.getHeaderField("X-W3C-Validator-Warnings"));
            System.out.println("Errors: " + conn.getHeaderField("X-W3C-Validator-Errors"));
            System.out.println("Status: " + valid);

            BufferedReader rd = new BufferedReader(new InputStreamReader(conn.getInputStream()));
            String line;
            while ((line = rd.readLine()) != null) {
                System.out.println("line: " + line);
            }
            wr.close();
            rd.close();

            boolean is_valid = (!valid.equalsIgnoreCase("invalid"));
            assertThat("The page is valid", is_valid, is(true));
        }
    }

  2. [...] had this exact problem and solved it’. I put my googles on and found that indeed someone had for dot net.. I needed the same test for a java based project and thus wrote [...]

  3. [...] page in the site using the W3C Markup Validation Service (inspired by Damian Edwards’ excellent Adding HTML validity checking to your ASP.net web site via unit tests post). Over the course of building this web site, I learned a lot about what you can and can’t do [...]

  4. Finlay says:

    Hi

    Many thanks for this. Exactly what I was looking for. Made a few tweaks and up and running in VB.net.

