The Internet Marketing Driver: Glenn Gabe's goal is to help marketers build powerful and measurable web marketing strategies.

Friday, April 11, 2008

LiveHttpHeaders and SEO, How to Check Your HTTP Response Headers for Red Flags


Does your website throw a 302 when it should throw a 301? Does it throw a 200 when it should be a 404? Are there 500’s thrown on your site that look like 404’s? Do you think I’m insane yet? Hear me out…

Whether you understand the introduction above or don’t know what I’m talking about, there’s still something extremely important for you in this post. Every time you load a webpage, your browser REQUESTS a file and then the server provides a RESPONSE to that request (which starts with a response code and a set of response headers). Response headers can help you identify critical issues on your site (especially from an SEO standpoint). Now, you probably have a few key questions.

1) How do I check my response headers?
2) What should I be looking for?

Although I can’t cover everything about response headers in this post, I will answer the two questions listed above and provide some examples along the way.

Let me start by answering the first question since it’s the easiest… I highly recommend using LiveHttpHeaders, an add-on for Firefox that displays http headers in real time (as you browse webpages). This tool can save you a lot of time and possibly help you diagnose some serious SEO-related issues. I will answer the second question later in the post.

Install LiveHttpHeaders Now:
First, visit the LiveHttpHeaders project website and install the add-on. You will need to restart Firefox after installing LiveHttpHeaders. Once restarted, you can trigger LiveHttpHeaders in two ways. You can click Tools, LiveHttpHeaders, which will trigger a new window where you can view header responses in real time as you browse the web. You can also click View, Sidebar, LiveHttpHeaders to view response headers in a sidebar within Firefox. I prefer the new window, since I have dual monitors and it doesn’t take up any browser space. :-) Either way works fine.

To quickly test it out before we go any further, go and visit Google with LiveHttpHeaders running. When you hit the homepage of Google, you will see a bunch of information scroll by in LiveHttpHeaders. For our purposes, let’s look at the top of the window (the first piece of information sent back to you). I have stripped out some of the information you don’t need to focus on for this example.

http://www.google.com/

GET / HTTP/1.1
Host: www.google.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.11) Gecko/20071127 Firefox/2.0.0.11
Keep-Alive: 300
Connection: keep-alive

HTTP/1.x 200 OK
Content-Type: text/html; charset=UTF-8
Content-Length: 2547


Look at the first line of the server’s response: HTTP/1.x 200 OK. That line carries the response code, which shows 200 (or OK). 200 is a good thing... and now I can explain more about response headers and codes. By the way, where you see 200 in the window is where you would also see other response codes like 301, 302, 404, etc.
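If you save a capture like the one above as text, the response code and headers can also be pulled out with a few lines of Python. This is only a minimal sketch: the function name `parse_response_head` is made up for illustration, and it assumes the text starts at the status line (the `HTTP/1.x 200 OK` part), not at the request.

```python
def parse_response_head(raw):
    """Split a raw HTTP response head into (version, code, reason, headers)."""
    lines = raw.strip().splitlines()
    # Status line looks like: HTTP/1.x 200 OK
    version, code, *rest = lines[0].split(" ", 2)
    reason = rest[0] if rest else ""
    headers = {}
    for line in lines[1:]:
        if not line.strip():
            break  # a blank line ends the header block
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, int(code), reason, headers


# The response portion of the Google capture shown above.
captured = """HTTP/1.x 200 OK
Content-Type: text/html; charset=UTF-8
Content-Length: 2547"""

version, code, reason, headers = parse_response_head(captured)
print(code, reason)
print(headers["content-type"])
```

Header names are lowercased here because HTTP header names are case-insensitive, which makes lookups like `headers["location"]` predictable.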

Back to our two questions for a minute. The second question was “What should I be looking for?” In a nutshell, you should be looking for the file requested and the response code sent by the server. Let’s start with a definition of a response header and identify some core http response codes.

As I mentioned earlier, when you load a webpage, two things happen. There is an http request (by your browser) and then an http response is sent (by the server). There is a response code returned as part of the http response. Some of the following codes might be familiar to you, and others might not be. If you focus on SEO, then get to know them… they can really help you diagnose problems on your website.

Some of the most common http response codes are:

200 – ok (the webpage was returned ok)
301 – Permanent redirect (seo friendly) :-)
302 – Temporary redirect (don’t use this unless you absolutely have to!)
400 – Bad request (uh, not good)
401 – Unauthorized (you need to be authenticated)
403 – Forbidden (doesn’t matter who you are, it’s forbidden)
404 – Not found (not necessarily a bad thing…I’ll explain more later.)
500 – Internal server error (something went very wrong processing the webpage…)
502 – Bad gateway (also not a good thing)
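For reference, Python’s standard library ships these codes with their official reason phrases in `http.HTTPStatus`. The sketch below looks up each code from the list above and labels it by its class (2xx, 3xx, 4xx, 5xx). The `describe` helper and the `CLASSES` labels are my own names for illustration, not part of any library. Note that the official phrase for 302 is "Found", even though it is commonly described as a temporary redirect.

```python
from http import HTTPStatus

# Rough labels for each class of response code (first digit).
CLASSES = {2: "success", 3: "redirection", 4: "client error", 5: "server error"}


def describe(code):
    """Return a one-line description such as '301 Moved Permanently (redirection)'."""
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        phrase = "Unknown"  # not a code the standard library knows about
    kind = CLASSES.get(code // 100, "other")
    return f"{code} {phrase} ({kind})"


for code in (200, 301, 302, 400, 401, 403, 404, 500, 502):
    print(describe(code))
```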

Why are http response codes important? One of your goals as an SEO is to enable the bots to easily index your site. You don’t want them to get caught up in any way, shape or form. For example, 302 redirects are not the SEO-friendly way to tell Google where a page you removed now resides (you should use a 301 redirect instead). So constantly providing 302’s would be a very bad thing to do. Or how about throwing a 200 (ok) when you really should be throwing a 404? In that case, the page isn’t there, but you just told Google that it is. Again, not a good thing to do. Therefore, finding 302’s, 404’s, 403’s, 500’s, etc. is critical to creating a clean path for the bots, which means you can have more of your content indexed and at a solid frequency. Let’s take a look at how LiveHttpHeaders can help you out.

Checking Your Response Headers:
Let’s take a look at a hypothetical situation. One morning you wake up and decide that you want to increase your natural search rankings. You launch an SEO initiative and get moving quickly. The first thing you want to do is to audit your current site structure (since you know that without a sound and clean structure, you’re dead in the water). So as part of your audit, you want to ensure your response headers and response codes look ok.

You fire up LiveHttpHeaders and visit your website:

* You hit the Homepage, 200 returned,
* You visit a Top Level Product Category, 200 returned,
* Then you try to visit a product detail page and you hit a 302 redirect. Hold on… You find that all links to your product detail pages go through a 302 redirect. This was implemented as part of a recent code change. This is something you would want to change ASAP. The content on your product detail pages is obviously important so you wouldn’t want to be throwing 302’s prior to the bots hitting those pages…
* But it doesn’t stop there. You know that you changed dozens of older product pages recently and created new URL’s. You check out the old URL’s and find 302 redirects to the new product pages. You would want to change that too… and provide a 301 redirect from the old page to the new page, safely passing link power from the old page to the new one.
* Then you check out some product categories on your site that have been removed completely (you won’t be selling those products anymore), but you find 200’s instead of 404’s. A 404 (page not found) is the proper response code to throw in this situation, as it will tell the engines that the page has been removed and that it should be de-indexed. You don’t want the page to be indexed if it’s not actually there, right?
* Last, you check some newly added pages and find they are not displaying correctly. It looks like they aren’t on the site, which means you should see a 404. But… the server returns a 500 (or internal server error). Again, not a good thing as the bots traverse your website content, and this is something invisible to the naked eye as you test your website. You would need to be checking response codes to find this issue…
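The walkthrough above can also be scripted. The sketch below is one possible approach, not a finished tool: you list the response code you expect for each URL, fetch each one without following redirects, and print the mismatches. The URLs, the `fetch_status` and `audit` names, and the canned responses are all hypothetical and only mirror the scenario above. It uses Python’s standard `http.client`, which does not follow redirects, so a 302 shows up as a 302.

```python
import http.client
from urllib.parse import urlsplit


def fetch_status(url, timeout=10.0):
    """Request a URL once, without following redirects; return (status, Location)."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.hostname, parts.port, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.hostname, parts.port, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("GET", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def audit(expected, fetch=fetch_status):
    """Compare observed response codes with the expected ones; return mismatches."""
    problems = []
    for url, want in expected.items():
        got, location = fetch(url)
        if got != want:
            target = f" -> {location}" if location else ""
            problems.append(f"{url}: expected {want}, got {got}{target}")
    return problems


# Hypothetical URLs and the codes we would want to see for each.
EXPECTED = {
    "http://www.example.com/": 200,
    "http://www.example.com/old-product": 301,
    "http://www.example.com/removed-category": 404,
    "http://www.example.com/new-page": 200,
}

# Canned responses mirroring the scenario above, so the sketch runs offline.
CANNED = {
    "http://www.example.com/": (200, None),
    "http://www.example.com/old-product": (302, "http://www.example.com/new-product"),
    "http://www.example.com/removed-category": (200, None),
    "http://www.example.com/new-page": (500, None),
}

for line in audit(EXPECTED, fetch=lambda url: CANNED[url]):
    print(line)
```

To run it against a live site, replace the URLs with your own and call `audit(EXPECTED)` without the `fetch` argument, so the real `fetch_status` is used.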

OK, I think you get the picture! Keep in mind that a full SEO assessment covers much more than just checking response codes, but it’s an important part of revealing SEO-related issues. And you know what? Sometimes it’s darn easy to find a serious issue that can be resolved fairly quickly. For example, providing a 302 redirect right on your homepage! Or throwing 200’s for any page that’s been removed from your site.

Think Like a Bot:
I’ll end this post with an analogy. Imagine you needed to check every room in a 10-story hotel to document the type of TV that’s in each room. But the elevators don’t work properly, some of the staircase entrances are locked, and every now and then the room numbers change on you. Would you have an easy time completing your task? Would you keep trying to come back to “index” each room? Or would you stop a few rooms in and say, “Hey look, Lost is on.” And then sit back and watch the show… and forget about the TV’s (or your content). ;-)

GG



4 Comments:

  • At 9:55 PM, Anonymous Mr. SEO said…

    Ah! Redirects and header codes. My favorite things to see when I wake up in the morning knowing I have to climb the stairs of some new CMS system with a busted elevator.

    People need to make sure that a redirection plan is part of their CMS database driven website. Otherwise they might lose some of their valuable SEO visibility.

    The best way to check if the redirection plan is working correctly is to check the HTTP Response Headers. Even for the Firefox haters out there or the infrequent users there are lots of free tools that work just as well.

    When I am not using my normal machine I use SEO Consultants free checker. It works equally well. I don't use the Firefox plugin you use. Firefox Firebug works well for me. There is an endless amount of tools out there to use. Just as long as you use one of them during your SEO morning plan you should be good!

     
  • At 5:43 AM, Blogger Glenn Gabe said…

    Great points Chris. Redirects are one of the most overlooked elements and utilizing a tool in order to check header codes is always eye-opening, to say the least. :) I also use the server header tool at SEOConsultants to check response headers. It includes a cool bulk URL option too. Very handy… Depending on the job at hand and the size of the site, I go back and forth with LiveHttpHeaders and SEOConsultants.

    I’d like to hear more of your experiences with cms systems with busted elevators and blocked staircases! Unfortunately, I have some stories too... ;-)

    GG

     
  • At 10:47 AM, Anonymous christian said…

    Hey Glen, is it possible to make a custom 403 page on an IIS system? I've got a custom 404 all set up, but running into a wall with the 403.

     
  • At 9:10 AM, Blogger Glenn Gabe said…

    Hi Christian. I believe you can set this up in your web.config file. Are the restricted pages aspx pages or html pages?

    GG

     
