OK, this is not new; I’ve written about it a few times in the past. The thing is that removing whitespace is a tricky discipline that differs from site to site. At least that was what I thought until very recently.

For some unexplained reason I started working on a simple little method that removes whitespace in a way that works on all websites without breaking any HTML. Maybe not so unexplained, since I’ve written about it so many times that it would seem I’ve got a secret obsession.

Obsession or not, here is the code I ended up with after a few hours of hacking. Just copy the code onto your base page or master page and watch the magic.

[code:c#]

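// Requires "using System.Text.RegularExpressions;" and "using System.Web.UI;" at the top of the file.

// Matches any run of whitespace between a closing '>' and the next '<'.
private static readonly Regex REGEX_BETWEEN_TAGS = new Regex(@">\s+<", RegexOptions.Compiled);
// Matches a line break together with the whitespace that follows it (typically indentation).
private static readonly Regex REGEX_LINE_BREAKS = new Regex(@"\n\s+", RegexOptions.Compiled);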
 
/// <summary>
/// Initializes the <see cref="T:System.Web.UI.HtmlTextWriter"></see> object and calls on the child
/// controls of the <see cref="T:System.Web.UI.Page"></see> to render.
/// </summary>
/// <param name="writer">The <see cref="T:System.Web.UI.HtmlTextWriter"></see> that receives the page content.</param>
protected override void Render(HtmlTextWriter writer)
{
  using (HtmlTextWriter htmlwriter = new HtmlTextWriter(new System.IO.StringWriter()))
  {
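    // Render the page into an in-memory writer so the markup can be post-processed.
    base.Render(htmlwriter);
    string html = htmlwriter.InnerWriter.ToString();
 
    // Collapse any whitespace between two tags into a single space.
    html = REGEX_BETWEEN_TAGS.Replace(html, "> <");
    // Drop each line break together with the indentation that follows it.
    html = REGEX_LINE_BREAKS.Replace(html, string.Empty);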
 
    writer.Write(html.Trim());
  }
}

[/code]
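
To see what the two expressions do in isolation, here is a minimal console sketch that applies the same replacements to a hard-coded string. The WhitespaceDemo class and its Strip method are names made up for this illustration; they are not part of the page code above.

[code:c#]

using System;
using System.Text.RegularExpressions;

public static class WhitespaceDemo
{
    // Same patterns as in the Render override above.
    private static readonly Regex BetweenTags = new Regex(@">\s+<", RegexOptions.Compiled);
    private static readonly Regex LineBreaks = new Regex(@"\n\s+", RegexOptions.Compiled);

    // Hypothetical helper that mirrors the two replacements and the final Trim().
    public static string Strip(string html)
    {
        html = BetweenTags.Replace(html, "> <");
        html = LineBreaks.Replace(html, string.Empty);
        return html.Trim();
    }

    public static void Main()
    {
        string html = "<ul>\n    <li>One</li>\n    <li>Two</li>\n</ul>\n";
        Console.WriteLine(Strip(html));
        // prints: <ul> <li>One</li> <li>Two</li> </ul>
    }
}

[/code]

One thing worth checking on your own pages: the second expression also fires inside text content, so text wrapped over several indented lines is joined without a space, and indentation inside <pre> or <textarea> elements is dropped.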

Remember that whitespace removal speeds up rendering, especially in IE, and reduces the overall weight of your page.

One of the things that has always seemed a little weird to me is that ASP.NET auto-generates JavaScript and injects it into the rendered HTML. The JavaScript is needed to handle validation, postbacks, callbacks etc., but why does it have to write the same static functions on every page when they could just as well be placed in a referenced .js file? If all the static functions were placed in an external .js file, they would be downloaded once and cached instead of being sent every time a page loads.

I thought I’d do something about it and wrote an HttpModule that removes and rewrites some of the auto-generated JavaScript. Then I put the static functions into an external .js file and referenced that from the <head> section instead. The module also changes all document.getElementById(id) calls to $(id).
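
The document.getElementById rewrite is the easiest part to picture. Below is a rough sketch of that single step, not the code from the download: ScriptRewriteSketch and ShortenLookups are hypothetical names, and it assumes the external .js file defines a $ function that simply returns document.getElementById(id).

[code:c#]

using System;

public static class ScriptRewriteSketch
{
    // Hypothetical helper: swaps the long DOM lookup for the short alias.
    // Assumes the referenced .js file defines $(id) as a wrapper around
    // document.getElementById(id).
    public static string ShortenLookups(string html)
    {
        // Plain string replacement, so the "$" needs no escaping here.
        return html.Replace("document.getElementById(", "$(");
    }

    public static void Main()
    {
        string script = "var f = document.getElementById('form1');";
        Console.WriteLine(ShortenLookups(script));
        // prints: var f = $('form1');
    }
}

[/code]

A plain string.Replace is used rather than a Regex because "$" has a special meaning in .NET regex replacement strings.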

The result is a smaller and cleaner HTML output.

I’ve implemented it on this website, and if you take a peek at the HTML source you’ll notice that functions such as __doPostBack and ValidatorOnSubmit, along with some other JavaScript logic, are no longer there. They have been moved to my global external .js file instead.

Implementation

Download the zip file below. It holds two files – an HttpModule and a JavaScript file. You need to reference the .js file or copy its contents into your own externally referenced .js file. Then hook the HttpModule up in the web.config.
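
For the last step, a classic httpModules registration in web.config looks roughly like this. The name and type values are placeholders, since the actual class name is whatever the module in the zip file declares.

[code:xml]

<configuration>
  <system.web>
    <httpModules>
      <!-- "CleanPage" and "CleanPageModule" are placeholder values; use the class name of the module from the download. -->
      <add name="CleanPage" type="CleanPageModule" />
    </httpModules>
  </system.web>
</configuration>

[/code]

If the site runs in the IIS 7 integrated pipeline, the same add element goes under system.webServer/modules instead.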

CleanPage.zip (1.63 kb)