The Internet Marketing Driver: Glenn Gabe's goal is to help marketers build powerful and measurable web marketing strategies.

Wednesday, March 17, 2010

.htaccess for Windows Server: How To Use ISAPI Rewrite To Handle Canonicalization and Redirects For SEO


If you’ve read previous blog posts of mine, then you know how important I think having a clean and crawlable website structure is for SEO. When performing SEO audits, it’s usually not long before the important topic of canonicalization comes up. Canonicalization is the process of ensuring that you don’t provide the same content at more than one URL. It’s also one of the hardest words in SEO to pronounce. :) If you don’t address canonicalization, you can end up with identical content at multiple URL’s, which can present duplicate content issues. And you don’t want duplicate content. For example, you don’t want your site to resolve at both non-www and www, at both http and https, using mixed case, having folders resolve with and without trailing slashes, etc.
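
To make the idea concrete, here is a minimal Python sketch of canonicalization. It is not ISAPI Rewrite syntax, just an illustration of the logic. The preferred form (http, www, lowercase path, trailing slash on folders) and the host www.example.com are assumptions for this example, and your own site may make different choices.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed canonical form (a site-specific choice, not from the post):
# http scheme, www host, lowercase path, trailing slash on folders.
CANONICAL_SCHEME = "http"
CANONICAL_HOST = "www.example.com"


def canonical_url(url):
    """Return the single preferred URL for any variant of the same page."""
    parts = urlsplit(url)
    path = parts.path.lower() or "/"
    # Heuristic: a last segment without a file extension is treated as a folder.
    last_segment = path.rsplit("/", 1)[-1]
    if last_segment and "." not in last_segment:
        path += "/"
    # The query string is kept as-is; any fragment is dropped.
    return urlunsplit((CANONICAL_SCHEME, CANONICAL_HOST, path, parts.query, ""))


def needs_redirect(url):
    """True when the requested URL differs from its canonical form."""
    return url != canonical_url(url)


print(canonical_url("https://example.com/Products"))
```

Any request where needs_redirect returns True is a candidate for a 301 redirect to the canonical URL.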

In addition to handling canonicalization, you also want to have a system in place for handling 301 redirects. A 301 redirect is a permanent redirect and will safely pass PageRank from one URL to another. This comes in handy in several situations. For example, if you go through a website redesign and your URL’s change, if you remove campaign landing pages, if you remove old pieces of content, etc. If you don’t 301 redirect these pages, you could end up paying dearly in organic search. Imagine hundreds, thousands, or millions of URL’s changing without 301 redirects in place. The impact could be catastrophic from an SEO standpoint.

Enter ISAPI Rewrite, .htaccess for Windows Server
So I’m sure you are wondering, what’s the best way to handle canonicalization and redirects for SEO? If you conduct some searches in Google, you’ll find many references to .htaccess and mod_rewrite. Using mod_rewrite is a great solution, but it’s only for Apache Server, which is mainly run on Linux servers. What about Windows hosting? Is there a solution for .NET-driven websites?

The good news is that there is a solid solution and it’s called ISAPI Rewrite. ISAPI Rewrite is an IIS filter that enables you to handle URL rewriting and redirects via regular expressions. It’s an outstanding tool to have in your SEO arsenal and I have used it now for years. There are two versions of ISAPI Rewrite (versions 2 and 3) and both enable you to handle most of what .htaccess can do. Actually, I think so much of ISAPI Rewrite that it’s the topic of my latest post on Search Engine Journal.

So, to learn more about ISAPI Rewrite, the two versions available, and how to use it (including examples), please hop over to Search Engine Journal to read my post.

ISAPI Rewrite: Addressing Canonicalization and Redirects on Windows Server

GG


Tuesday, October 13, 2009

SEO and AJAX: Taking a Closer Look at Google’s Proposal to Crawl AJAX


Last week at SMX, Google announced a proposal to crawl AJAX. Although it was great to hear the official announcement, you had to know it was coming. Too many web applications are using AJAX for Google to ignore it! After the news was released, I received a lot of questions about what the proposal actually means, how it works, and what the impact could be. There seemed to be a lot of confusion, even among people in the Search industry. And I can understand why. If you don’t have a technical background, then Google’s blog post detailing the proposal to crawl AJAX can be a bit confusing. The mention of URL fragments, stateful pages, and headless browsers can end up being confusing for a lot of people, to say the least. And if you’ve never heard of a headless browser, fear not! Since it’s close to Halloween and I grew up near Sleepy Hollow, I’ll spend some time in this post talking about what a headless browser is.

So based on my observations over the past week or so, I decided to write this post to take a closer look at what Google is proposing. My hope is to clear up some of the confusion so you can be prepared to have your AJAX crawled. And to reference AJAX’s original slogan, let’s find out if this proposal is truly Stronger Than Dirt. :)

Some Background Information About SEO and AJAX:
So why all the fuss about AJAX and SEO? AJAX stands for Asynchronous JavaScript and XML, and when used properly, it can create extremely engaging web applications. In a nutshell, a webpage using AJAX can load additional data from the server on-demand without the page needing to refresh. For example, if you were viewing product information for a line of new computers, you could dynamically load the information for each computer when someone wants to learn more. That might sound unimpressive, but instead of triggering a new page and having to wait as the page loads all of the necessary images, files, etc., the page uses AJAX to dynamically (and quickly) supply the information. As a user, you could quickly see everything you need without an additional page refresh. Ten or more pages of content can now be viewed on one… This is great for functionality, but not so great for SEO. More on that below.

Needless to say, this type of functionality has become very popular with developers wanting to streamline the user experience for visitors. Unfortunately, the search engines haven’t been so nice to AJAX-based sites. Until this proposal, most AJAX-based content was not crawlable. The original content that loaded on the page was crawlable, but you had to use a technique like HIJAX to make sure the bots could find all of your dynamically loaded content. Or, you had to create alternative pages that didn’t use AJAX (which added a lot of rework.) Either way, it took careful planning and extra work by your team. On that note, I’ve yet to be part of a project where AJAX developers jump up and down with joy about having to do this extra work. There just had to be a better solution, and based on what I explained above, Google’s proposal is an important step forward.

What is Google’s Proposal to Crawl AJAX?
When hearing about the proposal, I think experienced SEO’s and developers knew there would be challenges ahead. It probably wasn’t going to be a simple solution. And for the most part, we were right. The proposal is definitely a step forward, but webmasters need to cooperate (and share the burden of making sure their AJAX can be crawled). In a nutshell, Google wants webmasters to process AJAX content on the server and provide the search engines with a snapshot of what the page would look like with the AJAX content loaded. Then Google can crawl and index that snapshot and provide it in the search results as a stateful URL (a URL that visitors can access directly to see the page with the AJAX-loaded content).

If the last line threw you off, don’t worry. We are going to take a closer look at the process that’s being proposed below.

Getting Your AJAX Crawled: Taking a closer look at the steps involved:

1. Adding a token to your URL:
Let’s say you are using AJAX on your site to provide additional information about a new line of products. A URL might look like:

example.com/productid.aspx#productname

Google is proposing that you use a token (in this case an exclamation point !) to make sure Google knows that it’s an AJAX page that should be crawled. So, your new URL would look like:

example.com/productid.aspx#!productname

When Google comes across this URL using the token, it would recognize that it’s an AJAX page and take further action.

2. The Headless Browser (Scary name, but important functionality.)
Now that Google recognizes you are using AJAX, we need to make sure it can access the AJAX page (and the dynamically loaded content). That’s where the headless browser comes in. Now if you just said, “What the heck is a headless browser?”, you’re not alone. That’s probably the top question I’ve received after Google announced their proposal. A headless browser is a GUI-less browser (a browser with no graphical user interface) that will run on your server. The headless browser will process the request for the dynamic version of the webpage in question. In the blog post announcing this proposal, Google referenced a headless browser called HTMLUnit and you can read more about it on the website.

Why would Google require this? Well, Google knows that it would take enormous amounts of power and resources to execute and crawl all of the JavaScript being used today on the web. So, if webmasters help out and process the AJAX for Google, then it will cut down on the amount of resources needed and provide a quick way to make sure the page gets properly crawled.

To continue our example from above, let’s say you already provided a token in your URL so Google will recognize that it’s an AJAX page. Google would then request the AJAX page from the headless browser on your server by escaping the state. Basically, URL fragments (an anchor with additional information at the end of a URL) are not sent with requests to the server. Therefore, Google needs to change that URL to request the AJAX page from the headless browser (see below).

Google would end up requesting the page like this:
example.com/productid.aspx?_escaped_fragment=productname
Note: It would make this request only after it finds a URL using the token explained above (the exclamation point !)
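
If it helps to see the mapping as code, here is a rough Python sketch of the URL conversion described above. The function names are my own for illustration, and the query parameter is spelled exactly as written in this post, so treat that spelling as an assumption and check Google’s documentation before relying on it.

```python
from urllib.parse import quote, unquote

TOKEN = "#!"
# Parameter name as written in this post (an assumption; verify the
# exact spelling against Google's documentation).
ESCAPED_PARAM = "_escaped_fragment"


def to_crawler_url(stateful_url):
    """Map page#!state to page?_escaped_fragment=state (None if no token)."""
    if TOKEN not in stateful_url:
        return None  # no token, so the crawler makes no extra request
    base, state = stateful_url.split(TOKEN, 1)
    separator = "&" if "?" in base else "?"
    return f"{base}{separator}{ESCAPED_PARAM}={quote(state, safe='')}"


def to_stateful_url(crawler_url):
    """Reverse mapping: recover the #! URL from the crawler's request."""
    marker = ESCAPED_PARAM + "="
    if marker not in crawler_url:
        return None
    base, state = crawler_url.split(marker, 1)
    return base.rstrip("?&") + TOKEN + unquote(state)


print(to_crawler_url("example.com/productid.aspx#!productname"))
```

Notice that a URL without the token returns None, which mirrors the note above: no token, no extra request.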

This would tell the server to use the headless browser to process the page and return html code to Google (or any search engine that chooses to participate). That’s why the token is important. If you don’t use the token, the page will be processed normally (AJAX-style). If that’s the case, then the headless browser will not be triggered and Google will not request additional information from the server.
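
On the server side, the decision boils down to a simple check on the incoming request. The sketch below is only an illustration in Python: render_snapshot and render_ajax_shell are hypothetical stand-ins (a real setup would call a headless browser such as HTMLUnit), and the parameter name again follows the spelling used in this post.

```python
from urllib.parse import urlsplit, parse_qs

ESCAPED_PARAM = "_escaped_fragment"  # spelling taken from this post


def render_snapshot(path, state):
    """Hypothetical stand-in for the headless browser: returns static HTML."""
    return f"<html><body>Snapshot of {path} in state '{state}'</body></html>"


def render_ajax_shell(path):
    """Hypothetical stand-in for the normal page that loads content via AJAX."""
    return f"<html><body><script>/* AJAX app for {path} */</script></body></html>"


def handle_request(url):
    """Serve a snapshot to crawler requests and the AJAX page to everyone else."""
    parts = urlsplit(url)
    params = parse_qs(parts.query, keep_blank_values=True)
    if ESCAPED_PARAM in params:
        return render_snapshot(parts.path, params[ESCAPED_PARAM][0])
    return render_ajax_shell(parts.path)


print(handle_request("http://example.com/productid.aspx?_escaped_fragment=productname"))
```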

3. Stateful AJAX Pages Displayed in the Search Results
Now that you provided Google a way to crawl your AJAX content (using the process above), Google could now provide that URL in the search results. The page that Google displays in the SERPs will enable visitors to see the same content as if they were traversing your AJAX content on your site. i.e. They will access the AJAX version of the page versus the default content (which is what would normally be crawled). And since there is now a stateful URL that contains the AJAX content, Google can check to ensure that the indexable content matches what is returned to users.

Using our example from above, here is what the process would look like:
Your original URL:
example.com/productid.aspx#productname

You would change the URL to include a token:
example.com/productid.aspx#!productname

Google would recognize this as an AJAX page and request the following:
example.com/productid.aspx?_escaped_fragment=productname

The headless browser (on your server) would process this request and return a snapshot of the AJAX page. The engines would then provide the content at the stateful URL in the search results:
example.com/productid.aspx#!productname

Barriers to Acceptance
This all sounds great, right? It is, but there are some potential obstacles. I’m glad Google has offered this proposal, but I’m worried about how widely it’s going to be accepted. Putting some of the workload on webmasters presents some serious challenges. When you ask webmasters to add something like a headless browser to their setup, you never know how many will actually agree to participate.

As an example, I’ve helped a lot of clients with Flash SEO, which typically involves using SWFObject 2.x to provide alternative and crawlable content for your flash movies. This is a relatively straightforward process and doesn’t require any server-based changes. It’s all client side. However, it does require some additional work from developers and designers. Even though it’s relatively painless to implement, I still see a lot of unoptimized flash content out there… And again, it doesn’t require setting up a headless browser on the server! There are some web architects I’ve worked with over the years that would have my head for requesting to add anything to their setup, no pun intended. :) To be honest, the fact that I even had to write this post is a bad sign… So again, I’m sure there are challenges ahead.

But, there is an upside for those webmasters that take the necessary steps to make sure their AJAX is crawlable. It’s called a competitive advantage! Take the time to provide Google what it wants, and you just might reap the benefits. That leads to my final point about what you should do now.

Wrapping Up: So What Should You Do?
Prepare. I would spend some time getting ready to test this out. Speak with your technical team, bring this up during meetings, and start thinking about ways to test it out without spending enormous amounts of time and energy. As an example, one of my clients agreed to wear a name tag that says, “Is Your AJAX Crawlable?” to gain attention as he walks the halls of his company. It sounds funny, but he said it has sparked a few conversations about the topic. My recommendation is to not blindside people at your company when you need this done. Lay the groundwork now, and it will be easier to implement when you need to.

Regarding actual implementation, I’m not sure when this will start happening. However, if you use AJAX on your website (or plan to), then this is an important advancement for you to consider. If nothing else, you now have a great idea for a Halloween costume, The Headless Browser. {And don’t blame me if nobody understands what you are supposed to be… Just make sure there are plenty of SEO’s at the Halloween party.} :)

GG

Related Posts:
The Critical Last Mile for SEO: Your Copywriters, Designers and Developers
Using SWFObject 2.0 to Embed Flash While Providing SEO Friendly Alternative Content
6 Questions You Should Ask During a Website Redesign That Can Save Your Search Engine Rankings
SEO, Forms, and Hidden Content - The Danger of Coding Yourself Into Search Obscurity


Friday, April 11, 2008

LiveHttpHeaders and SEO, How to Check Your HTTP Response Headers for Red Flags


Does your website throw a 302 when it should throw a 301? Does it throw a 200 when it should be a 404? Are there 500’s thrown on your site that look like 404’s? Do you think I’m insane yet? Hear me out…

Whether you understand the introduction above or don’t know what I’m talking about, there’s still something extremely important for you in this post. Every time you load a webpage, your browser REQUESTS a file and then the server provides a RESPONSE to that request (also called a Response Header). Response headers can help you identify critical issues on your site (especially from an SEO standpoint). Now, you probably have a few key questions.

1) How do I check my response headers?
2) What should I be looking for?

Although I can’t cover everything about response headers in this post, I will answer the two questions listed above and provide some examples along the way.

Let me start by answering the first question since it’s the easiest… I highly recommend using LiveHttpHeaders, an add-on for Firefox that displays http headers in real time (as you browse webpages). This tool can save you a lot of time and possibly help you diagnose some serious SEO-related issues. I will answer the second question later in the post.

Install LiveHttpHeaders Now:
First, visit the LiveHttpHeaders project website and install the add-on. You will need to restart Firefox after installing LiveHttpHeaders. Once restarted, you can trigger LiveHttpHeaders in two ways. You can click Tools, LiveHttpHeaders, which will trigger a new window where you can view header responses in real time as you browse the web. You can also click View, Sidebar, LiveHttpHeaders to view response headers in a sidebar within Firefox. I prefer the new window, since I have dual monitors and it doesn’t take up any browser space. :-) Either way works fine.

To quickly test it out before we go any further, go and visit Google with LiveHttpHeaders running. When you hit the homepage of Google, you will see a bunch of information scroll by in LiveHttpHeaders. For our purposes, let’s look at the top of the window (the first piece of information sent back to you). I have stripped out some of the information you don’t need to focus on for this example.

http://www.google.com/

GET / HTTP/1.1
Host: www.google.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.11) Gecko/20071127 Firefox/2.0.0.11
Keep-Alive: 300
Connection: keep-alive

HTTP/1.x 200 OK
Content-Type: text/html; charset=UTF-8
Content-Length: 2547


The response code appears on the first line of the response (HTTP/1.x 200 OK), which shows 200 (or ok). 200 is a good thing... and now I can explain more about response headers and codes. By the way, where you see 200 in the window is where you would also see other response codes like 301, 302, 404, etc.
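
If you ever save a LiveHttpHeaders capture to a text file, pulling out the response code is easy to script. Here is a small Python sketch that does it. The function name and the shortened sample capture are my own for illustration.

```python
import re

# Matches a status line such as "HTTP/1.x 200 OK" or "HTTP/1.1 301 Moved Permanently".
STATUS_LINE = re.compile(r"^HTTP/\S+\s+(\d{3})\s*(.*)$")


def response_code(header_text):
    """Return (code, reason) from the first HTTP status line found, else None."""
    for line in header_text.splitlines():
        match = STATUS_LINE.match(line.strip())
        if match:
            return int(match.group(1)), match.group(2)
    return None


# Shortened version of the capture shown above.
captured = """GET / HTTP/1.1
Host: www.google.com

HTTP/1.x 200 OK
Content-Type: text/html; charset=UTF-8
"""

print(response_code(captured))
```

The request line (GET / HTTP/1.1) is skipped because it doesn’t start with HTTP/, so only the server’s status line is picked up.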

Back to our two questions for a minute. The second question was “What should I be looking for?” In a nutshell, you should be looking for the file requested and the response code sent by the server. Let’s start with a definition of a response header and identify some core http response codes.

As I mentioned earlier, when you load a webpage, two things happen. There is an http request (by your browser) and then an http response is sent (by the server). There is a response code returned as part of the http response. Some of the following codes might be familiar to you, and others aren’t. If you focus on SEO, then get to know them…they can really help you diagnose problems on your website.

Some of the most common http response codes are:

200 – ok (the webpage was returned ok)
301 – Permanent redirect (seo friendly) :)
302 – Temporary redirect (don’t use this unless you absolutely have to!)
400 – Bad request (uh, not good)
401 – Unauthorized (you need to be authenticated)
403 – Forbidden (doesn’t matter who you are, it’s forbidden)
404 – Not found (not necessarily a bad thing…I’ll explain more later.)
500 – Internal server error (something went very wrong processing the webpage…)
502 – Bad gateway (also not a good thing)
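
As a quick reference, Python’s standard library already knows the official reason phrase for each of these codes. The short sketch below prints them. Note that the official phrase for 302 is "Found", which is the code commonly used for temporary redirects.

```python
from http import HTTPStatus

# The codes listed above.
CODES = [200, 301, 302, 400, 401, 403, 404, 500, 502]


def describe(code):
    """Return the code with its standard reason phrase, e.g. '404 Not Found'."""
    status = HTTPStatus(code)
    return f"{status.value} {status.phrase}"


for code in CODES:
    print(describe(code))
```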

Why are http response codes important? One of your goals as an SEO is to enable the bots to easily index your site. You don’t want them to get caught up in any way, shape or form. For example, 302 redirects are not the SEO-friendly way to tell Google where a page you removed now resides (you should use a 301 redirect instead). So constantly providing 302’s would be a very bad thing to do. Or how about throwing a 200 (ok) when you really should be throwing a 404. For example, the page isn’t there, but you just told Google that it is. Again, not a good thing to do. Therefore, finding 302’s, 404’s, 403’s, 500’s, etc. is critical to creating a clean path for the bots, which means you can have more of your content indexed and at a solid frequency. Let’s take a look at how LiveHttpHeaders can help you out.

Checking Your Response Headers:
Let’s take a look at a hypothetical situation. One morning you wake up and decide that you want to increase your natural search rankings. You launch an SEO initiative and get moving quickly. The first thing you want to do is to audit your current site structure (since you know that without a sound and clean structure, you’re dead in the water). So as part of your audit, you want to ensure your response headers and response codes look ok.

You fire up LiveHttpHeaders and visit your website:

* You hit the Homepage, 200 returned,
* You visit a Top Level Product Category, 200 returned,
* Then you try and visit a product detail page and you hit a 302 redirect. Hold on… You find that all links to your product detail page go through a 302 redirect. This was implemented as part of a recent code change. This is something you would want to change ASAP. The content on your product detail page is obviously important so you wouldn’t want to be throwing 302’s prior to the bots hitting those pages…
* But it doesn’t stop there. You know that you changed dozens of older product pages recently and created new URL’s. You check out the old URL’s and find 302 redirects to the new product pages. You would want to change that too… and provide a 301 redirect from the old page to the new page, safely passing link power from the old page to the new one.
* Then you check out some product categories on your site that have been removed completely (you won’t be selling those products anymore), but you find 200’s instead of 404’s. A 404 (page not found) is the proper response code to throw in this situation, as it will tell the engines that the page has been removed and that it should be de-indexed. You don’t want the page to be indexed if it’s not actually there, right?
* Last, you check some newly added pages and find they are not displaying correctly. It looks like they aren’t on the site, which means you should see a 404. But…the server returns a 500 (or internal server error). Again, not a good thing as the bots traverse your website content, and this is something invisible to the naked eye as you test your website. You would need to be checking response codes to find this issue…
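
The walkthrough above can be boiled down to a simple comparison of expected versus actual response codes. Here is a hedged Python sketch of that idea. The situation labels, the URLs, and the observed codes are all made up for illustration, so swap in what you actually see in LiveHttpHeaders.

```python
# Expected codes per situation, following the walkthrough above.
EXPECTED = {
    "live": 200,     # page exists and should be indexed
    "moved": 301,    # old URL whose content now lives somewhere else
    "removed": 404,  # content that is no longer on the site
}


def red_flags(observations):
    """observations: iterable of (url, situation, actual_code) tuples."""
    flags = []
    for url, situation, actual in observations:
        expected = EXPECTED[situation]
        if actual != expected:
            flags.append(f"{url}: got {actual}, expected {expected}")
    return flags


# Hypothetical audit notes in the spirit of the list above.
audit = [
    ("/", "live", 200),
    ("/products/widget", "live", 302),
    ("/old-widget.html", "moved", 302),
    ("/discontinued/", "removed", 200),
    ("/missing-page.html", "removed", 500),
]

for flag in red_flags(audit):
    print(flag)
```

Only the homepage passes here. Everything else is a red flag worth fixing.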

OK, I think you get the picture! Keep in mind that a full SEO assessment covers much more than just checking response codes, but it’s an important part to revealing SEO-related issues. And you know what? Sometimes it’s darn easy to find a serious issue that can be resolved fairly quickly. For example, providing a 302 redirect right on your homepage! Or throwing 200’s for any page that’s been removed from your site.

Think Like a Bot:
I’ll end this post with an analogy. Imagine you needed to check every room in a 10 story hotel to document the type of TV that’s in each room. But the elevators don’t work properly, some of the staircase entrances are locked, and every now and then the room numbers change on you. Would you have an easy time completing your task? Would you keep trying to come back to “index” each room? Or would you stop a few rooms in and say, “Hey look, Lost is on.” And then sit back and watch the show….and forget about the TV’s (or your content). ;-)

GG


Tuesday, February 19, 2008

Using SWFObject 2.0 to Embed Flash While Providing SEO Friendly Alternative Content


Providing Alt Content for Flash Using SWFObject 2.0
Or is it friendly? More on that later... While mapping out and building your website content, chances are you have come across a situation where you really want to utilize flash versus html content (for some functionality). Although flash can provide an extraordinary level of interactivity, the problem (SEO-wise), is that flash content cannot be indexed by the search engines (at least for now). So, you might find yourself wanting to use flash for a given task, but you might also be struggling with the lack of search engine friendly content. I have been developing with flash for over 10 years and I also work extensively on Natural Search projects, so believe me, I feel your pain. :-) I wanted to write this post to introduce and explain SWFObject 2.0, the latest and greatest version of the popular flash replacement library.

A Quick SWFObject Disclaimer:
Unfortunately, I (or anyone else for that matter) cannot tell you that using SWFObject is entirely search engine safe. In a perfect world, providing an accurate, alternative html version of your flash content is extremely beneficial. I’m sure that Google and the other engines would have no problem with developers using it that way. But…and it’s an important but, there will always be those who exploit something like SWFObject for cloaking.

Let’s define cloaking: Providing one version of your content to the search engines while providing a different version of content to visitors. i.e. Altered content meant to trick the search engines.

You can easily see why this could be problematic for the search engines… There has been much debate about whether SWFObject is search engine safe or not, and I cannot give you the answer. That said, I think if you utilize SWFObject to provide alternative content that directly reflects your flash content, then you should be fine. I will show you how to do this later in the post.

SWFObject 2.0 Versus SWFObject 1.5
So what’s the difference between SWFObject 2.0 and SWFObject 1.5? Well, 2.0 is the latest version of the package (thank you Captain Obvious), which enables you to provide alt html content for your flash content using standards compliant markup. Version 2.0 will replace 1.5 and other forms of flash replacement like the flash player detection kit and UFO. That said, SWFObject 1.5 is still a great solution and you may choose to keep using 1.5 until you feel comfortable using 2.0. However, you will probably want to use version 2.0 based on the benefits of the new process. :)

Static Versus Dynamic Publishing
There are 2 ways to use SWFObject 2.0: providing alt content using standards compliant markup (called static publishing) and inserting alt content using unobtrusive JavaScript (called dynamic publishing). Using dynamic publishing with SWFObject 2.0 is very similar to using SWFObject 1.5, whereas static publishing is the new process. In this post, I will cover the standards compliant way (static) to use SWFObject 2.0 to embed flash content in your webpage. Let’s get started.

Download SWFObject 2.0
First, visit the Google Code Project for SWFObject 2.0 and download the zipfile containing the files you need. (FYI, you should download swfobject_2_0_rc2.zip). You can also download the official documentation and always have it handy. Extract the files to your hard drive and then copy the contents to your working directory. That way, you always have the original as a backup….good lesson from my programming days. :) View the screenshot below to see which files and folders your swfobject2 directory should contain.

Folder Contents for SWFObject 2.0

SWFObject and Static Publishing
Let’s implement the standards compliant version of the package to replace your flash content with alternative html content. The alt content should directly reflect the content contained in your flash movie.

1. In your swfobject2 directory, open the index.htm file, which uses the static version of swfobject 2.0. Use this file as the template for your own implementation.
2. Look at the source code to follow along. In the head of the document, you will notice the following line of code:

Adding the SWFObject JavaScript Library to Your HTML Document

3. This line of code adds the SWFObject JavaScript library in your document. Including this code is a necessary component for the package to work properly.
4. Next, let’s hop down to the html portion of the document. Note, I have changed the code below to reflect my own flash movie and alt content. You can still easily follow along, though:

The Nested Object Tags When Using the SWFObject Static Method

5. The code above includes a series of nested object tags, which enables the SWFObject package to provide cross-browser support. When adding your own content, you will need to replace a few items:

a. Replace “swfobject2-exampleb.swf” with the name of your actual flash movie. Note, the swfobject download includes a file named “test.swf”, so if you want to run the page using that flash movie, you should be good to go.

b. Change the width and height to match your actual flash movie’s width and height. My flash movie is 400x300.

6. About half way down the page, you will find a div tag for your alternate content. This is where you will provide alternate html content that directly reflects your flash movie's content. Feel free to use any html tags here to provide your alternative content. As you can see in the image below, I described my flash movie content in HTML.

Providing Alternative HTML Content for Your Flash Movie

7. Let’s move back to the head of your html document for a second. You will need to register your flash movie with the swfobject library. Note, my page uses "exampleID" for the outer object tag id. You can use whatever you like or just keep the current id. You will see the following lines of code:

Register Your Flash Movie with SWFObject

8. The three parameters contain:

a. The id of the outermost object tag (myID). Note, you can change the id of the outermost object tag, but it must match what you enter in the JavaScript code when you register your flash movie. So, if you entered “flashID” instead, then you would need to enter “flashID” when you register your flash movie in the code above. Again, I used "exampleID".

b. The version of the flash plugin you are targeting (9.0.0), and

c. The name of the express install flash movie (if you wish to use one). Note, express install will display a standard dialog box that will enable your visitors without the required plugin version to download the flash plugin. I have noticed some buggy behavior with the express install functionality, so I just provide my own link to the flash plugin. Therefore, I enter false as the third parameter.

SWFObject 2.0 Code Generator
That’s all you need to do in order to use the standards compliant version of SWFObject 2.0. I know that opening the hood and working with code directly can be tough for non-programmers, so the creators of SWFObject have been nice enough to create a code generator for you. I didn’t want to mention it until after you went through the code so you can get a good feel for how this works. :-) I know…tough love! You can download the generator from the Google Code Project. The generator presents a form where you can enter the necessary information about your projects and then it generates the right code for you. I actually find it easier to drill into the code, but that’s what I’m used to!

A Working Example
Here is a simple example of using the standards compliant version of SWFObject 2.0. After viewing the flash content, you can click View, and then Source in your browser to see the alt content in the html. I also uploaded a webpage where I am forcing the browser to show you the alt content. This is what visitors would see if they didn't have the required version of the flash plugin. In addition, the static version of SWFObject 2.0 doesn’t rely on JavaScript to provide your flash content, so your visitors will see your flash content even if they have JavaScript turned off. A nice benefit. When you look at the source code, you can see an additional parameter I added for turning off the standard right click menu. You need to add this in two locations (both object tags) as you’ll see in the code. You can use a number of flash parameters and the SWFObject 2.0 documentation lists them for you. i.e. menu, loop, quality, wmode, etc.

Adding parameters within your object tags.

Click to Activate this Control
I know…Ugh. I won’t go into how or why Internet Explorer 6+ users must click to activate a flash movie, but it’s extremely annoying (especially for flash developers that work hard on creating killer flash movies!) Unfortunately, the standards compliant version of SWFObject 2.0 doesn’t alleviate this problem, where the dynamic versions of both SWFObject 2.0 and 1.5 alleviate the problem! Go figure. If you are looking to get rid of the dreaded “click to activate” message, then use the dynamic version of SWFObject 2.0 or 1.5 (not covered in this post). I may detail using the dynamic version of SWFObject in future posts, but this post is already getting too long! ;-)

Summary
OK, that was a lot to cover, but now you have a way to provide alternative html content for your killer flash content…and the search engines can index the alt content to boot! Again, nobody can guarantee that this is 100% search engine safe…thanks to some bad people who exploit this functionality. That said, if your alt content directly reflects your flash content, you should be ok. Used properly, this enhances the accessibility and usability of your site and will enable your killer content to be found by the search engines.

Just don’t go nuts when providing your alt content… :)

GG


Tuesday, January 08, 2008

301 Redirect HTML Files Without Using ISAPI Rewrite


Using 301 Redirects When All Else Fails

When you run a website, there are times when you'll need to redirect older webpages to newer webpages, or you might want to redirect multiple domain names to a single domain name. There are two key ways to accomplish this task: issuing a 301 redirect or a 302 redirect. What you might not know is that a 301 redirect is search engine friendly and a 302 redirect is not. 301’s will safely tell the search engines that one page has been permanently moved to a new location, while 302’s tell the search engines that it’s a temporary redirect (which can cause problems down the line). This shouldn’t be news for anyone working in the search industry, but might be news for website owners outside of the industry. My post today isn’t about what 301’s and 302’s are; it’s about a unique challenge I ran into recently with one of my clients. We needed to 301 redirect several HTML files to new pages on the website without using the standard methods of issuing a 301 redirect. Also, the website was running on a shared server, which was an added barrier. By writing this post, my hope is that I can help some of you who might run into the same situation. More on this soon. Let’s start with a quick review of redirects.

Let’s Define 301 and 302 redirects:
A 301 redirect is a permanent redirect and tells the search engines that the old webpage has been permanently moved to a new location. It basically tells Google and the other engines that you have permanently moved one page from HERE to THERE. If you need to redirect a file on your website, then you should always use a 301 redirect.

A 302 redirect is a temporary redirect, and is not search engine friendly. It basically tells Google and the other engines that the file in question has temporarily moved from HERE to THERE. There have been vulnerabilities in the past with using 302 redirects, which is a reason that 302’s aren’t trusted. If you need to redirect one page to another on your website, then don’t use a 302. Always use a 301 redirect when possible.

The 301 Challenge
Back to the redirect challenge that I recently faced. Again, my hope is that the solution can help some of you who might run into the same situation. One of my clients has a website that’s running on a Windows server and contains a combination of HTML, ASP, and ASP.net files. We needed to redirect several older HTML pages to new ASP.net pages, which at first glance would be relatively simple to do. If you are on a Windows server, I highly recommend using ISAPI Rewrite to issue 301 redirects. This is similar to using an .htaccess file on a Linux or Unix server. You can issue one-line commands using a text file named httpd.ini that sits at the root level of your website. It easily enables you to issue 301 redirects, rewrite URLs, etc. It’s a great utility to have installed…

The Shared Server Problem
Here was the problem. We couldn’t use ISAPI Rewrite. The website was running on a shared server and the web hosting company would not install ISAPI Rewrite on the server. Some hosting companies will and others won’t…this specific hosting provider wouldn’t, even after several requests.

Issue the 301 Via ASP.net Code
So, my next move was to issue the 301 redirects via code (either through ASP or ASP.net). There was also a problem with using this technique. The files we needed to redirect were HTML files and not ASP or ASP.net files, so I couldn’t add the necessary VB or VBScript code to the pages that needed to be redirected. Moving on…

Run HTML Files Through ASP.net
My third idea was to run all HTML files on the website through ASP.net, which would enable me to add ASP.net code to each of the HTML files. Basically, when an HTML file is requested, it would run through the ASP.net engine. Then I could issue the 301 redirect via ASP.net code instead of using ISAPI Rewrite. Cool, right? The hosting provider made the change on the server (running HTML files through ASP.net), but to our dismay, some of the HTML files on the site were not rendering properly. So, we reverted to the original setup (where HTML files were not run through ASP.net). Again, moving on…

The Fourth Time is a Charm…
My fourth idea finally worked. The hosting provider basically said we were out of luck, but I wasn’t ready to give up so fast… I knew that Classic ASP is still supported on Windows Server, even when running ASP.net. Classic ASP was the original version of Microsoft’s server-side scripting framework. The next version of the framework was ASP.net, which has also gone through its own upgrades over the years. So, I posed the question…couldn’t we try to run HTML files through Classic ASP instead of ASP.net? My client’s hosting provider made the change and bingo, it worked like a charm. We could now issue search engine friendly 301 redirects on HTML pages. Just to clarify, this meant that I could add Classic ASP code to any HTML file running on the website. For our purposes, I could issue a 301 redirect via Classic ASP code, the HTML file would be run through the Classic ASP engine, and everyone would be happy. :)
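
For anyone who wants to see roughly what this looks like, here is a minimal sketch. It assumes your hosting provider has already mapped .html files to the Classic ASP engine, and the file name and destination URL are placeholders I made up for illustration. The redirect code sits at the very top of the old HTML file:

```asp
<%@ Language="VBScript" %>
<%
' old-page.html (hypothetical file name), served through the Classic ASP engine.
' Note: Response.Redirect on its own sends a temporary 302,
' so the 301 status and the Location header are set by hand instead.
Response.Status = "301 Moved Permanently"
Response.AddHeader "Location", "http://www.example.com/new-page.aspx"

' Stop processing so nothing else in this file is sent to the browser.
Response.End
%>
```

Since Response.End stops the page right there, any HTML still sitting below this block in the old file never reaches the visitor or the search engines. It's worth checking the response headers afterwards to confirm the server really returns a 301 and not a 302.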

The Added Benefits of Using This Solution:
The obvious benefit is that we can now use 301 redirects with any HTML file on the website, when needed. The added benefit is that we can now also use Classic ASP code within any HTML file running on the website. Typically, HTML files can only contain HTML code (no server-side functionality). But with this solution, I can make database calls, provide dynamic content, use session variables, and tap into any other Classic ASP functionality available. It’s a flexible solution, to say the least.
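
To give you a feel for it, here is a small sketch of an ordinary HTML file that uses a session variable and some dynamic content. It assumes .html files are being run through the Classic ASP engine and that session state is enabled on the site. The session key name (pageViews) is hypothetical, chosen only for this example:

```asp
<%@ Language="VBScript" %>
<%
' Keep a simple per-visitor counter in a session variable.
' "pageViews" is a made-up key name for this sketch.
If IsEmpty(Session("pageViews")) Then
    Session("pageViews") = 0
End If
Session("pageViews") = Session("pageViews") + 1
%>
<html>
<body>
<p>You have viewed <%= Session("pageViews") %> page(s) during this visit.</p>
<p>Copyright <%= Year(Now()) %></p>
</body>
</html>
```

The file still looks and behaves like a regular HTML page to your visitors. The only difference is that the server fills in the dynamic pieces before sending it.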

In closing, please remember the following items when you need to redirect HTML files on your website:

1. If you need to redirect a webpage or domain name, use a 301 redirect.

2. Don’t use 302 redirects. If you do, use them at your own peril. {cue mad scientist laughter}.

3. If your website is hosted on a Windows server, use ISAPI Rewrite to issue your 301 redirects. It's a great utility.

4. If you can’t use ISAPI Rewrite and you are in a shared environment, try to issue the redirect via ASP or ASP.net code. If you are trying to redirect HTML files, you’ll need to skip to #5 below.

5. If you can’t add ASP.net or Classic ASP code because you are working with HTML files, then try running your HTML files through the ASP.net or Classic ASP engine. Then you’ll be able to add the 301 redirect code to your HTML files.

Happy Redirecting!

GG
