A few weeks ago, I was contacted by a small business owner about my SEO services. And what started out as a simple check of a website turned into an interesting case study about hidden SEO dangers. The company has been in business for a long time (30+ years), and the owner was looking to boost the site’s SEO performance over the long term. From the email and voicemail I received, it sounded like they were struggling to rank well across important target queries and wanted to address that ASAP. I also knew they were running AdWords to provide air cover for SEO (which is smart, but definitely not a long-term plan for their business).
Unfortunately, my schedule has been crazy and I knew I couldn’t take them on as a longer-term client. But, I still wanted to quickly check out their website to get a better feel for what was going on. And it took me about three minutes to notice a massive problem (one that is killing their efforts to rank for many queries). And that’s a shame because they probably should rank for those keywords based on their history, services, content, etc.
Surfacing a Giant SEO Problem
As I browsed the site, I noticed they had a good amount of content for a small business. The site had a professional design, it was relatively clean from a layout perspective, and provided strong content about their business, their history, news about the organization, the services they provided, and more.
But then it hit me. Actually, it was staring me right in the face. I noticed a small 404 icon when hitting one of their service pages (via the Redirect Path Chrome extension). OK, so that’s odd… The page renders fine, the content and design show up perfectly, but the page 404s (returning a Page Not Found error). It’s like the opposite of a soft 404, which is a page that looks like a 404 but actually returns a 200 code. In this situation, the page looks like a 200 but returns a 404 instead. I guess you can call it a “soft 200”.
So I started to visit other pages on the site and more 404 header response codes followed. Actually, almost every single page on the site was throwing a 404 header response code. Holy cow, the initial 404 was just the tip of the iceberg.
After seeing 404s pop up all over the site, I quickly decided to crawl the website via Screaming Frog. I wanted to see how widespread the problem was. And it turns out that my initial assessment was spot on. Almost every page on the site returned a 404 header response code. The only pages that didn’t were the homepage and some PDFs. But every other page, including the services pages, news pages, about page, contact, etc. returned a 404.
If you’re familiar with SEO, then you know how this problem can impact a website. But for those of you unfamiliar with 404s and how they impact SEO, I’ll provide a quick rundown next. Then I’ll jump back to the story.
What is a 404 Header Response Code?
Every time a webpage is requested, the server returns a header response code (also known as an HTTP status code). There are many that can be returned, but there are some standard codes you’ll come across. For example, 200 means the page returned OK, 301 is a permanent redirect, 302 is a temporary redirect, 403 is forbidden, 404 means page not found, and 500 is an internal server error.
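If you ever need to look up what a code means, here’s a quick sketch in Python. It relies only on the standard library’s http.HTTPStatus enum, and the helper name describe is my own, used purely for illustration.

```python
from http import HTTPStatus


def describe(code: int) -> str:
    """Return the code and its standard reason phrase, e.g. '404 Not Found'."""
    status = HTTPStatus(code)  # raises ValueError for codes the enum doesn't know
    return f"{status.value} {status.phrase}"


# Print the codes mentioned in this post alongside their standard names.
for code in (200, 301, 302, 403, 404, 500):
    print(describe(code))
```

Note that the official phrase for a 302 is “Found”, even though it’s commonly described as a temporary redirect.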
Header response codes are extremely important to understand for SEO. If you want a webpage indexed, then you definitely want it to return a 200 response code (which again, means OK, the request has succeeded). But if the page returns a 404, then that tells the engines that the page was not found and that it should be removed from the index. Yes, read that last line again. 404s basically inform Google and Bing that the page is gone and that it can be removed from each respective index. That means it will have no shot of ranking for target keywords.
And from an inbound links perspective, 404s are a killer. If a page 404s, then it cannot benefit from any inbound links pointing at the url. And the domain itself cannot benefit either (at an aggregate level). So 404s will get urls removed from Google’s index and can hamper your link equity (at the url level and at the domain level). Not good, to say the least.
Side Note: Checking Response Codes
Based on what I’ve explained, some of you reading this post might be wondering how to easily check your header response codes. And you definitely should. I won’t cover the process in detail in this post, but I will point you in the right direction. There are several tools to choose from and I’ll include a few below.
You could Fetch as Google in Google Webmaster Tools to check the response sent to Googlebot (which includes the header response code). You can also use a browser plugin like Web Developer Tools or Redirect Path to quickly check header response codes on a url by url basis.
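If you’d rather script the same kind of check for a handful of urls, here’s a minimal Python sketch using only the standard library. The function names (fetch_status, check_urls, summarize) and the User-Agent string are my own placeholders, not part of any tool mentioned in this post. One caveat: urllib follows redirects automatically, so a 301 or 302 will show up as the status of the final destination rather than the redirect itself.

```python
import urllib.error
import urllib.request
from collections import Counter


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code for url without raising on 4xx/5xx."""
    request = urllib.request.Request(
        url, headers={"User-Agent": "status-check-sketch"}  # placeholder value
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises for 4xx/5xx responses; the code is on the exception.
        return error.code


def check_urls(urls):
    """Map each url to the status code it returned."""
    return {url: fetch_status(url) for url in urls}


def summarize(results):
    """Count how many urls returned each status code."""
    return Counter(results.values())
```

Calling check_urls with a short list of your own pages and passing the result to summarize gives you a quick count per status code. If most of your pages land under 404 while they render fine in the browser, you’re looking at the same problem described in this post.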
Fetch as Google and browser plugins are great, but they only let you process one url at a time. But what if you wanted to check your entire site in one shot? For situations like that, you could use a tool that crawls an entire website (or sections of a site). For example, you could use Xenu or Screaming Frog for small to medium sized sites and then a tool like Deep Crawl for larger-scale sites. All three will return a boatload of information about your pages, including the header response codes. Now back to the case study.
Dangerous, But Invisible to the Naked Eye
Remember, the entire site was returning 404 header response codes, other than the homepage and a few PDFs. But this 404 situation was sinister since the webpages looked like they resolved OK. You didn’t see a standard 404 page, but instead, you saw the actual page and content. But, the pages were actually 404ing and not being indexed. Like I said, it was a sinister problem.
Based on what I just explained, you could tell why an SMB owner would be baffled and simply not understand why their website wasn’t ranking well. They could see their site, their content, the various pages resolving, but they couldn’t see the underlying problem. Header response codes are hidden to the naked eye, and most people don’t even realize they are being returned at all. But the response code returned is critically important for how the search engines process your webpages.
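To make the mismatch concrete, here’s a rough Python sketch that compares a page’s status code with what its content looks like. It’s a heuristic of my own for illustration only: the phrase list, the 500-character threshold, and the function name classify_page are all assumptions, and you’d want to tune them for a real site.

```python
# Hypothetical markers of an error page; adjust for your own templates.
NOT_FOUND_PHRASES = ("page not found", "not found")


def classify_page(status: int, body_text: str, min_content_chars: int = 500) -> str:
    """Flag pages whose status code disagrees with how the content looks."""
    text = body_text.lower().strip()
    looks_like_error = any(phrase in text for phrase in NOT_FOUND_PHRASES)
    has_real_content = len(text) >= min_content_chars and not looks_like_error

    if status == 404 and has_real_content:
        # Full page of content, but the server says it doesn't exist.
        return "possible soft 200"
    if status == 200 and looks_like_error:
        # Error message on the page, but the server says everything is OK.
        return "possible soft 404"
    return "consistent"
```

A service page with plenty of copy that comes back as a 404 would be flagged as a “possible soft 200”, which is exactly the situation this site was in.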
My Response – “You’re At SEO Defcon 2”
This was a tough situation for me. I absolutely wanted to help the business longer-term, but couldn’t based on my schedule. Still, I wanted to make sure they understood the problem I came across while quickly checking out their website.
So I crafted a quick email explaining that I couldn’t help them at this time, but that I found a big problem on their site. As quickly and concisely as I could, I explained the 404 situation, provided a few screenshots, and explained they should get in touch with their designer, developer, or hosting provider to rectify the situation ASAP. That means ensuring their webpages return the proper header response codes. Basically, I told them that if their webpages should be indexed, then they should return a 200 header response code and not the 404s being returned now.
I hit “Send” and the ball was in their court.
Their Response – “We hear you and we’re on the right track – we think.”
I heard back from the business owner who explained they started working with someone to rectify the problem. They clearly didn’t know this was going on and they were hoping to have the situation fixed soon.
But as of today, the problem is still there. The site still returns 404 header response codes on almost every page. That’s unfortunate, since again, the pages returning a 404 have no chance at all of ranking in search and cannot help them from a link equity standpoint. The pages aren’t indexed and the site is basically telling Google and Bing to not index any of the core pages on the site.
I’m going to keep an eye on the situation to see when the changes take hold. And I hope that’s soon. It’s a great example of how hidden technical dangers can destroy SEO.
Opening Up The Site – How Will The Engines Respond?
My hope is that when the pages return the proper response codes, Google and Bing will begin indexing the pages and ranking them appropriately. And that will help on several levels. The website can drive more prospective customers via organic search, while the business can probably pull back on AdWords spend. And the site can grow its power from an inbound link standpoint as well, now that the pages are being indexed properly.
But as I often say about SEO, it’s all about the execution. If they don’t implement the necessary changes, then their situation will remain as-is. I’ll try to update this post if the situation improves.
Summary – Know Your Header Response Codes
Although hidden to the naked eye, header response codes are critically important for SEO. The right codes will enable the engines to properly crawl and index your webpages, while the wrong codes could lead to SEO disaster. I recommend checking your site today (via both manual checks and a crawl). You might find you’re in the clear with 200s, but you also might find some sinister 404s. So check now.