In light of the immense popularity AJAX has gained over the last couple of months, and emerging tools like Atlas and AJAX.NET, I thought it was the right time to talk about how AJAX-enabled websites affect search engine behaviour.

In this post I’ll split websites into two categories – the public and the protected. A public website is accessible to everyone and does not require a login of any kind; it’s the most common type of website out there. A protected website could be an intranet site or a password-protected membership site. The search engines index the public sites, not the protected ones.

When developing a protected site, you can do just about anything without worrying about search engine ranking. When developing a public website, you do not have that kind of liberty. The public site has to be search engine friendly.

So, when AJAX-enabling a public website, you have to keep the search engines happy at all times. If they aren’t happy with your website, neither should you be.

That’s why I’ve made a quick little list of Do’s and Don’ts for AJAX-enabling your public website without losing the search engines in the process.

Do’s

Do use AJAX for user-specific actions
Set cookies, track sessions and log actions with AJAX as long as the content isn’t dependent on it. Search engines will have no trouble indexing your content.

Do use AJAX to save content
When a user enters information in a form field and hits the save button, you can use AJAX as much as you like. Search engines will never push the save button anyway and are therefore unaware of the use of AJAX.

Do use AJAX to do form field validation
When validating form fields, you can use AJAX to check the input without disturbing the search engines. Search engines do not fill out forms, so that won’t be a problem (a server-side sketch follows this list).

Do use AJAX to display status messages
Displaying status messages of any kind based on user actions is no problem, because search engines do not execute the JavaScript needed to show them, and status messages aren’t content worth indexing anyway.
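To make the validation point concrete, here is a rough sketch of what the server side of such a call could look like – a plain HTTP handler that returns "true" or "false" for the AJAX call to read. The handler name ValidateUsername.ashx, the username parameter and the length rule are placeholders for whatever you actually need to check:

using System.Web;

// Hypothetical handler, e.g. mapped to ValidateUsername.ashx
public class ValidateUsername : IHttpHandler
{
    public void ProcessRequest(HttpContext context)
    {
        // The query string parameter name is just an example
        string username = context.Request.QueryString["username"];

        // Replace this with your real validation rule
        bool isValid = !string.IsNullOrEmpty(username) && username.Length >= 3;

        context.Response.ContentType = "text/plain";
        context.Response.Write(isValid ? "true" : "false");
    }

    public bool IsReusable
    {
        get { return true; }
    }
}

The page’s JavaScript simply requests the handler and shows the result next to the form field. The search engines never make that request, so they never notice it.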

Don’ts

Don’t use AJAX for displaying static text content
By static content I mean the main text content of a page, not simple information like the number of currently active sessions or something like that. The main text content of a page is the single most important thing for search engines, so never load it with AJAX.

Don’t use AJAX for paging a table or list
If the table is filled with numbers of no relevance to search engines, you can skip this point. But if your table or list contains book reviews, chances are you want them indexed correctly. If your paging is AJAX-enabled, the search engines will only ever index the first page of the table (a crawler-friendly alternative is sketched below).

Don’t use AJAX for navigational purposes
This is not AJAX-specific; the same rule applies to plain JavaScript as well. Search engines don’t follow JavaScript links, so they will get stuck on the entry page and leave again without indexing the rest of your site.
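For both of the last two points, the safe pattern is to render the links as ordinary hyperlinks with real URLs that the search engines can follow, and only layer AJAX on top of them in the browser. Here is a minimal sketch of crawlable paging links built in Page_Load – the page name reviews.aspx, the page query string parameter and the pagerPlaceHolder control are my own examples:

// pagerPlaceHolder is assumed to be a PlaceHolder control on the page
int pageCount = 5; // however many pages your data source has
for (int i = 1; i <= pageCount; i++)
{
    HyperLink link = new HyperLink();
    link.Text = "Page " + i.ToString();
    link.NavigateUrl = "reviews.aspx?page=" + i.ToString();
    pagerPlaceHolder.Controls.Add(link);
}

Because every link has a real URL, the search engines can reach every page of the list, while your own JavaScript can still intercept the clicks and load the content with AJAX for regular visitors.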

The list isn’t complete, but I think it covers the basics and will help you avoid the biggest pitfalls.

For some strange reason, Firefox caches ASP.NET pages even if you tell it not to. This can create problems with the back-button. Normally you can tell the browser how to cache your web page in two ways: you can use meta tags or set some HTTP headers. To be absolutely sure that all browsers understand your cache policy, you can combine the two.

However, when you don’t want the browser to cache your page, Firefox caches it anyway. It does so because it cannot see any difference between how the page looked before and after you pressed the back-button. This issue drove me absolutely nuts today, and it took me a while to come up with a solution.

To make the back-button work with the cache in Firefox, add an ETag header to your page. Give it a random value; that tells Firefox that every response is different, even when you use the back-button to view a previously viewed page. Here’s an example that tells Firefox not to cache your page. Just add these lines to the Page_Load event:

// Give every response a unique ETag so Firefox treats the page as changed
Random rd = new Random();
Response.AddHeader("ETag", rd.Next(1111111, 9999999).ToString());

// The usual no-cache directives, for good measure
Response.AddHeader("Pragma", "no-cache");
Response.CacheControl = "no-cache";
Response.Cache.SetNoStore();
Response.Expires = -1;

I don’t know why the ETag header is necessary in Firefox. I find it rather disturbing that the big browsers cannot agree on network issues like this.