The Internet Marketing Driver: Glenn Gabe's goal is to help marketers build powerful and measurable web marketing strategies.

Tuesday, October 13, 2009

SEO and AJAX: Taking a Closer Look at Google’s Proposal to Crawl AJAX

Last week at SMX, Google announced a proposal to crawl AJAX. Although it was great to hear the official announcement, you had to know it was coming. Too many web applications are using AJAX for Google to ignore it! After the news was released, I received a lot of questions about what the proposal actually means, how it works, and what the impact could be. There seemed to be a lot of confusion, even among people in the Search industry. And I can understand why. If you don’t have a technical background, then Google’s blog post detailing the proposal to crawl AJAX can be a bit confusing. The mention of URL fragments, stateful pages, and headless browsers can lose a lot of people, to say the least. And if you’ve never heard of a headless browser, fear not! Since it’s close to Halloween and I grew up near Sleepy Hollow, I’ll spend some time in this post talking about what a headless browser is.

So based on my observations over the past week or so, I decided to write this post to take a closer look at what Google is proposing. My hope is to clear up some of the confusion so you can be prepared to have your AJAX crawled. And to reference AJAX’s original slogan, let’s find out if this proposal is truly Stronger Than Dirt. :)

Some Background Information About SEO and AJAX:
So why all the fuss about AJAX and SEO? AJAX stands for Asynchronous JavaScript and XML, and when used properly, it can create extremely engaging web applications. In a nutshell, a webpage using AJAX can load additional data from the server on-demand without the page needing to refresh. For example, if you were viewing product information for a line of new computers, the page could dynamically load the information for each computer when you want to learn more. That might sound unimpressive, but instead of triggering a new page and having to wait as the page loads all of the necessary images, files, etc., the page uses AJAX to dynamically (and quickly) supply the information. As a user, you could quickly see everything you need without an additional page refresh. Ten or more pages of content can now be viewed on one… This is great for functionality, but not so great for SEO. More on that below.

Needless to say, this type of functionality has become very popular with developers wanting to streamline the user experience for visitors. Unfortunately, the search engines haven’t been so nice to AJAX-based sites. Until this proposal, most AJAX-based content was not crawlable. The original content that loaded on the page was crawlable, but you had to use a technique like HIJAX to make sure the bots could find all of your dynamically loaded content. Or, you had to create alternative pages that didn’t use AJAX (which added a lot of rework.) Either way, it took careful planning and extra work by your team. On that note, I’ve yet to be part of a project where AJAX developers jump up and down with joy about having to do this extra work. There just had to be a better solution, and based on what I explained above, Google’s proposal is an important step forward.

What is Google’s Proposal to Crawl AJAX?
When hearing about the proposal, I think experienced SEO’s and developers knew there would be challenges ahead. It probably wasn’t going to be a simple solution. And for the most part, we were right. The proposal is definitely a step forward, but webmasters need to cooperate (and share the burden of making sure their AJAX can be crawled). In a nutshell, Google wants webmasters to process AJAX content on the server and provide the search engines with a snapshot of what the page would look like with the AJAX content loaded. Then Google can crawl and index that snapshot and provide it in the search results as a stateful URL (a URL that visitors can access directly to see the page with the AJAX-loaded content).

If the last line threw you off, don’t worry. We are going to take a closer look at the process that’s being proposed below.

Getting Your AJAX Crawled: Taking a closer look at the steps involved:

1. Adding a token to your URL:
Let’s say you are using AJAX on your site to provide additional information about a new line of products. A URL might look like:

example.com/productid.aspx#productname

Google is proposing that you use a token (in this case an exclamation point !) to make sure Google knows that it’s an AJAX page that should be crawled. So, your new URL would look like:

example.com/productid.aspx#!productname

When Google comes across this URL using the token, it would recognize that it’s an AJAX page and take further action.

2. The Headless Browser (Scary name, but important functionality.)
Now that Google recognizes you are using AJAX, we need to make sure it can access the AJAX page (and the dynamically loaded content). That’s where the headless browser comes in. Now if you just said, “What the heck is a headless browser?”, you’re not alone. That’s probably the top question I’ve received since Google announced its proposal. A headless browser is a GUI-less browser (a browser with no graphical user interface) that will run on your server. The headless browser will process the request for the dynamic version of the webpage in question. In the blog post announcing this proposal, Google referenced a headless browser called HtmlUnit, and you can read more about it on the HtmlUnit website.

Why would Google require this? Well, Google knows that it would take enormous amounts of power and resources to execute and crawl all of the JavaScript being used today on the web. So, if webmasters help out and process the AJAX for Google, then it will cut down on the amount of resources needed and provide a quick way to make sure the page gets properly crawled.

To continue our example from above, let’s say you already provided a token in your URL so Google will recognize that it’s an AJAX page. Google would then request the AJAX page from the headless browser on your server by escaping the state. Basically, URL fragments (the anchor portion after the # at the end of a URL) are not sent with requests to the server. Therefore, Google needs to change that URL to request the AJAX page from the headless browser (see below).

Google would end up requesting the page like this:
example.com/productid.aspx?_escaped_fragment=productname
Note: Google would make this request only after it finds a URL using the token explained above (the exclamation point !).

This would tell the server to use the headless browser to process the page and return html code to Google (or any search engine that chooses to participate). That’s why the token is important. If you don’t use the token, the page will be processed normally (AJAX-style). If that’s the case, then the headless browser will not be triggered and Google will not request additional information from the server.
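To make the URL rewrite concrete, here is a minimal Python sketch of what the crawler side of this step might look like. The function name `to_crawler_url` and the `http://` scheme are my own assumptions for illustration. The parameter name is copied from the example above; the scheme Google later published spells it `_escaped_fragment_` (with a trailing underscore), so treat the constant as something to confirm before relying on it.

```python
from typing import Optional
from urllib.parse import quote, urlsplit, urlunsplit

TOKEN = "!"  # the token that marks an AJAX state as crawlable
# Assumption: name copied from this post's example. The published scheme
# spells it "_escaped_fragment_", so keep it configurable.
ESCAPED_PARAM = "_escaped_fragment"


def to_crawler_url(url: str) -> Optional[str]:
    """Return the URL a crawler would request, or None if there is no #! token."""
    parts = urlsplit(url)
    if not parts.fragment.startswith(TOKEN):
        # No token: the fragment is ignored and the page is treated normally.
        return None
    state = parts.fragment[len(TOKEN):]
    extra = f"{ESCAPED_PARAM}={quote(state, safe='')}"
    # Keep any existing query string and append the escaped state to it.
    query = f"{parts.query}&{extra}" if parts.query else extra
    # The fragment is dropped because fragments are never sent to the server.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))


print(to_crawler_url("http://example.com/productid.aspx#!productname"))
print(to_crawler_url("http://example.com/productid.aspx#productname"))
```

The first call prints the rewritten request URL, and the second prints None because the URL lacks the token.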

3. Stateful AJAX Pages Displayed in the Search Results
Now that you provided Google a way to crawl your AJAX content (using the process above), Google could now provide that URL in the search results. The page that Google displays in the SERPs will enable visitors to see the same content as if they were traversing your AJAX content on your site. In other words, they will access the AJAX version of the page versus the default content (which is what would normally be crawled). And since there is now a stateful URL that contains the AJAX content, Google can check to ensure that the indexable content matches what is returned to users.

Using our example from above, here is what the process would look like:
Your original URL:
example.com/productid.aspx#productname

You would change the URL to include a token:
example.com/productid.aspx#!productname

Google would recognize this as an AJAX page and request the following:
example.com/productid.aspx?_escaped_fragment=productname

The headless browser (on your server) would process this request and return a snapshot of the AJAX page. The engines would then provide the content at the stateful URL in the search results:
example.com/productid.aspx#!productname
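On the server side, the flow above boils down to one decision: does the request carry the escaped state or not? The Python sketch below shows that branch only. The names `requested_state`, `handle_request`, `render_snapshot`, and `serve_ajax_page` are hypothetical, and the two callables stand in for whatever your headless browser and normal page handler actually do. The parameter name again follows the example URL in this post.

```python
from typing import Callable, Optional
from urllib.parse import parse_qs, urlsplit

# Assumption: parameter name as written in the example above.
ESCAPED_PARAM = "_escaped_fragment"


def requested_state(request_url: str) -> Optional[str]:
    """Return the AJAX state a crawler asked for, or None for a normal visit."""
    query = parse_qs(urlsplit(request_url).query, keep_blank_values=True)
    values = query.get(ESCAPED_PARAM)
    return values[0] if values else None


def handle_request(
    request_url: str,
    render_snapshot: Callable[[str], str],  # hypothetical: runs the headless browser
    serve_ajax_page: Callable[[], str],     # hypothetical: returns the normal page
) -> str:
    """Serve an HTML snapshot to crawlers and the regular AJAX page to everyone else."""
    state = requested_state(request_url)
    if state is None:
        return serve_ajax_page()
    return render_snapshot(state)


html = handle_request(
    "http://example.com/productid.aspx?_escaped_fragment=productname",
    render_snapshot=lambda state: f"<html>snapshot for {state}</html>",
    serve_ajax_page=lambda: "<html>ajax shell</html>",
)
print(html)
```

Keeping the snapshot renderer behind a plain callable means you can swap in a real headless browser later without touching the routing logic.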

Barriers to Acceptance
This all sounds great, right? It is, but there are some potential obstacles. I’m glad Google has offered this proposal, but I’m worried about how widely it’s going to be accepted. Putting some of the workload on webmasters presents some serious challenges. When you ask webmasters to add something like a headless browser to their setup, you never know how many will actually agree to participate.

As an example, I’ve helped a lot of clients with Flash SEO, which typically involves using SWFObject 2.x to provide alternative and crawlable content for your Flash movies. This is a relatively straightforward process and doesn’t require any server-based changes. It’s all client side. However, it does require some additional work from developers and designers. Even though it’s relatively painless to implement, I still see a lot of unoptimized Flash content out there… And again, it doesn’t require setting up a headless browser on the server! There are some web architects I’ve worked with over the years who would have my head for requesting to add anything to their setup, no pun intended. :) To be honest, the fact that I even had to write this post is a bad sign… So again, I’m sure there are challenges ahead.

But, there is an upside for those webmasters that take the necessary steps to make sure their AJAX is crawlable. It’s called a competitive advantage! Take the time to provide Google what it wants, and you just might reap the benefits. That leads to my final point about what you should do now.

Wrapping Up: So What Should You Do?
Prepare. I would spend some time getting ready to test this out. Speak with your technical team, bring this up during meetings, and start thinking about ways to test it out without spending enormous amounts of time and energy. As an example, one of my clients agreed to wear a name tag that says, “Is Your AJAX Crawlable?” to gain attention as he walks the halls of his company. It sounds funny, but he said it has sparked a few conversations about the topic. My recommendation is to not blindside people at your company when you need this done. Lay the groundwork now, and it will be easier to implement when you need to.

Regarding actual implementation, I’m not sure when this will start happening. However, if you use AJAX on your website (or plan to), then this is an important advancement for you to consider. If nothing else, you now have a great idea for a Halloween costume, The Headless Browser. {And don’t blame me if nobody understands what you are supposed to be… Just make sure there are plenty of SEO’s at the Halloween party.} :)

GG

Related Posts:
The Critical Last Mile for SEO: Your Copywriters, Designers and Developers
Using SWFObject 2.0 to Embed Flash While Providing SEO Friendly Alternative Content
6 Questions You Should Ask During a Website Redesign That Can Save Your Search Engine Rankings
SEO, Forms, and Hidden Content - The Danger of Coding Yourself Into Search Obscurity


Monday, September 28, 2009

SEO Technical Audits - A Logical First Step for Improving SEO Results

When I begin assisting new SEO clients, I typically start each engagement by completing a thorough SEO technical audit. Actually, I believe technical audits are so important that it's rare for me not to complete one. The reason is simple. An extensive audit identifies the strengths, weaknesses, and opportunities that a client has in natural search. It’s essentially a full analysis of a website and it takes into account several key factors that impact organic search. Needless to say, it's an important part of my SEO services.

When speaking with new clients about natural search, I often refer to the four pillars of SEO: structure (a clean and crawlable structure), content (ensuring you have the right content and that it’s optimized), links (inbound links are the lifeblood of SEO), and analytics (ensuring you track and analyze your natural search efforts). Then I typically jump back to pillar one and explain that without a clean and crawlable structure, you’re dead in the water. You can essentially forget about the other three pillars if your content can’t be crawled and indexed... For example, I was helping a site that already had over 1.3 million inbound links, yet the site ranked for almost no target keywords. The site had a massive structural problem, which was wreaking havoc on a number of important factors for SEO. The site could have built another 1.3 million links and nothing would have changed. The structure and architecture needed to be addressed before any impact would be seen. That’s a good example of when a technical audit was desperately needed (and you better believe I started one quickly to identify all of the barriers present on the site.)

The Core Benefits of an SEO Technical Audit
SEO technical audits yield several key benefits for clients looking to improve their results in natural search. The first benefit is that the audit yields an actionable remediation plan, which is a deliverable that documents each of the findings from the audit (along with how to address each issue.) To me, it’s one of the most important deliverables in SEO (especially in the beginning phases of an SEO engagement.) The remediation plan enables clients to fully understand where their website (or network of websites) stands SEO-wise. They get a lay of the land, understand the core problems impacting their website, and identify key opportunities in natural search (some of which can be tackled immediately). For example, I once helped a website jump from 250K pages indexed to 1.1 million in less than a month based on relatively painless changes to the site’s structure. That opened up a massive amount of content that was essentially hidden from the search engines. Without the audit, they probably would have stayed at 250K pages indexed and missed a huge opportunity…

Another benefit is that the audit helps build an SEO roadmap, which is a critical plan for how a client is going to achieve its goals in natural search. You know where the site stands, what needs to be addressed, what the key opportunities are, and how long each step will take. Working directly with a client’s team (executives, marketers, programmers, designers, copywriters, etc.) you can map out the necessary steps to remediate the site and expand your efforts. Everyone should have a solid feel for what needs to be completed, and every person on the team is involved. In case you haven’t read my previous posts, I typically refer to a company’s team of developers, designers, and copywriters as The Critical Last Mile for SEO. Without their input and cooperation, you’re going to have a heck of a time getting things done and seeing success.

What Can You Learn From an SEO Technical Audit?
Extensive audits produce a wealth of knowledge about the website in question. Although there are some people who might want to charge the (SEO) hill without conducting a thorough audit, I think that's a dangerous proposition. Thorough research and analysis are critically important when trying to determine obstacles in natural search. Without fully understanding what you are facing, you risk wasting time and a massive amount of effort (from everyone involved) and burning through budget, all while producing few results. Don’t charge the hill without a solid plan in place.

So, what can you find when performing a technical audit? To answer that question, let’s take a look at a hypothetical situation. Imagine you’re a VP or Director of Marketing that has a serious SEO problem. How important would finding the following things be for you?

Your SEO website audit revealed:

* Your company was using seven domains, and splitting your content across all of them. All seven have built up their own amount of SEO power (and none of them are very powerful).
* A website redesign was just completed, but without a proper migration strategy in place. This left thousands of pages, and possibly hundreds of thousands of inbound links, in limbo.
* Your website just added a killer web application, but that same application is hiding 90% of your content.
* Your website houses 750 videos across 30 categories, but none of them are indexed and ranking.
* Your navigation is half as robust as it needs to be, and uses several 302 redirects to link to each page.
* Every campaign landing page you launch disappears after the campaign ends (wasting thousands of powerful links.)
* Your new product pages are beautiful, but they contain a heavy amount of Flash content and almost no text. And to add insult to injury, your Flash content isn’t even optimized.
* 600 pages on your website are optimized the same exact way.
* Your site contains 200 pages, but over 2000 are indexed. Huh? What does that even mean?
* Your 404 page looks great, but it issues 200 codes (telling the engines the pages in question loaded successfully).
* At any given time, thousands of URLs can change, wasting all of the SEO power they have built up over time.
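A couple of the findings above are easy to spot-check with a few lines of code. The Python sketch below is only an illustration, not a full audit tool: it probes for a 404 page that returns a 200 code and compares indexed pages against real pages. The probe path, the function names, and the idea of passing in a `fetch_status` callable are my own assumptions; `http_status` shows one way to supply it with the standard library.

```python
import urllib.error
import urllib.request
from typing import Callable


def http_status(url: str) -> int:
    """Return the final HTTP status code for a URL (urlopen follows redirects)."""
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urlopen raises for 4xx/5xx responses; the code is on the exception.
        return err.code


def is_soft_404(fetch_status: Callable[[str], int], base_url: str) -> bool:
    """True when a URL that should not exist still answers with a 200 code."""
    # Hypothetical probe path: anything that cannot be a real page on the site.
    probe = base_url.rstrip("/") + "/this-page-should-not-exist-8c1f2e"
    return fetch_status(probe) == 200


def index_ratio(indexed_pages: int, real_pages: int) -> float:
    """Indexed pages per real page; a value well above 1 hints at duplicate URLs."""
    return indexed_pages / real_pages


# The "200 pages, but over 2000 indexed" finding from the list above:
print(index_ratio(2000, 200))
# Against a live site you would call: is_soft_404(http_status, "http://example.com")
```

Passing the fetcher in as a parameter keeps the checks testable without touching the network.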

I can keep going here... and you can probably start to see why I think SEO technical audits are so important. :) You never know what you’ll find, and many times these little gremlins are severely impacting your natural search efforts. Without conducting an extensive audit, you might only identify a small percentage of the problems impacting the website. That could leave the most important, and deepest structural problems hidden and unaddressed. And those deeper structural problems might be causing 90% of your SEO issues. By tackling only 10% of your problems, you might not make a dent in your efforts and performance in natural search.

SEO Audit Details: Deliverables, Cost, and Length of Time
In case you are wondering what a technical audit looks like, the deliverable is typically a PowerPoint presentation. Using PowerPoint enables you to provide visuals, screenshots, callouts, etc. It also works well when you need to present to larger groups of people. There are times a Word document will suffice, but unless your audience is extremely familiar with the technical aspects you will be referring to in the remediation plan, I recommend going with PowerPoint. The length of time for completing an audit (and subsequent cost) completely depends on the size and complexity of the website. For example, larger, more complex sites might yield a 70 or 80 slide deck where smaller websites might yield 25-30 slides. I’ve seen audits completed in less than a week and others that take 6-8 weeks to complete. It makes sense if you think about it. You might have one website that has fewer than 50 pages and another site that has millions of webpages… The two presentations might look very different.

A Critical Component: The Analyst Completing Your Audit
It’s important that you find a consultant or agency that matches well with your business, industry, and the type of content you provide. You definitely don’t want to spend time and money on an audit that produces few results. So it's important that you choose a consultant or agency that can produce a remediation plan that's technically sound, thorough, and actionable. Find out how many audits the agency or analyst has completed. Find out which verticals they have focused on, and then ask for results based on their audits. For example, if you're a small business, find out if the SEO focuses on SMBs and local search. If you have expanded internationally, then ask if the SEO understands international SEO. If you focus on video, make sure the SEO has in-depth experience with Video SEO. If you have 10 million webpages, then find out the largest website the consultant has worked on. You get the picture.

A quick example: Not all technical audits are created equal:
I was asked to analyze a website last year and give the site a score for SEO (0-100, where 100 would be the best possible SEO situation). Before presenting my findings, I was told that the site was previously audited and was given a score of 75%. I was pretty shocked to hear that score. I had given the website a score of 35%. From my perspective, the site needed serious help… There's a big difference between the two scores, right? But, there’s also a reason the company had chosen to have a second audit performed. They weren’t seeing results after the first was completed. A score of 35% was accurate, and we were quickly able to identify projects to tackle and develop a roadmap.

Unfortunately, technical audits that provide a shallow or incomplete view of your website can be dangerous. That type of audit could yield what I call “the snake oil effect”. That’s when internal employees become desensitized to SEO, don’t believe it can actually work, and focus their attention on less powerful initiatives. Think about it, if you’re an executive that allocated significant budget for several SEO efforts but never saw results, then your view of SEO will probably be skewed. Don’t let that happen! Natural search is too important.

The Most SEO Bang for Your Buck
If you are unhappy with your natural search results and you are determining where to begin, don’t overlook the power of an SEO technical audit. As I mentioned above, an audit can yield a detailed remediation plan in a relatively short amount of time. The remediation plan can yield a roadmap for your efforts, which can include projects that improve your overall SEO performance (including crawlability, indexation, content optimization, rankings, and targeted traffic.) That’s why I consider technical SEO audits a logical first step for most companies. It can provide serious SEO bang for your buck.

GG

Related Posts:
6 Questions You Should Ask During a Website Redesign That Can Save Your Search Engine Rankings
The Critical Last Mile for SEO, Your Designers, Developers, and Copywriters
SEO, Forms, and Hidden Content - The Danger of Coding Yourself Into Search Obscurity


Tuesday, September 08, 2009

SEO, Forms, and Hidden Content - The Danger of Coding Yourself Into Search Obscurity

When I perform a competitive analysis for a client, I often uncover important pieces of information about the range of websites they are competing with online. Sometimes that information is about traffic, campaigns, keywords, content, inbound links, etc. There are also times I uncover specific practices that are either beneficial or problematic for the competitor. For example, they might be doing something functionality-wise that could be inhibiting the overall performance of the site. If I do uncover something like that, I usually dig much deeper to learn more about that problem to ensure my clients don’t make the same mistakes. So, I was analyzing a website last week and I uncovered an interesting situation. On the surface, the functionality the site was providing was robust and was a definite advantage for the company, but that same functionality was a big problem SEO-wise. Needless to say, I decided to dig deeper to learn more.

Slick Web Application Yielding Hidden Content
As part of the competitive analysis I was completing, I came across a powerful web application for finding a variety of services based on a number of criteria. The application heavily used forms to receive information from users. The application included pretty elaborate pathing and prompted me to clarify answers in order to provide the best recommendations possible. After gathering enough information, I was provided with dozens of targeted service listings with links to more information (to more webpages on the site). So you might be thinking, “That sounds like a good thing Glenn, what’s the problem?” The problem is that the web application, including the robust form functionality, essentially hid all of the content from the search engines. In this case, we are talking about more than 2000 pages of high quality, high demand content. I say “high demand”, because I completed extensive keyword research for this category and know what people are searching for. Unfortunately for this company, the application yielded results that are simply not crawlable, which means the site has no chance to rank for competitive keywords related to the hidden pages. And by all means, the site should rank for those competitive keywords. For those of you asking, “but isn’t Google crawling forms?” I’ll explain more about that below. For this application, none of the resulting content was indexed.

Losing Visitors From Natural Search and Missing Opportunities For Gaining Inbound Links
Let’s take a closer look at the problem from an SEO standpoint. Forms often provide a robust way to receive user input and then provide tailored information based on the data collected. However, forms can also hide that content from the search engine bots. Although Google has made some strides in executing forms to find more links and content, it’s still not a perfect situation. Google isn’t guaranteeing that your forms will be crawled, it limits what it will crawl to GET forms (versus POST), and some of the form input is generated from common keywords on the page (for text boxes). That’s not exactly a perfect formula.
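Since the GET versus POST distinction matters here, a quick way to start reviewing your own pages is to list each form and the method it uses. The Python sketch below does that with the standard library's HTML parser. The class and function names and the sample markup are hypothetical, and this only inspects static HTML, so forms built by JavaScript would not show up.

```python
from html.parser import HTMLParser
from typing import List, Tuple


class FormMethodScanner(HTMLParser):
    """Collects (action, method) for every <form> tag in a page."""

    def __init__(self) -> None:
        super().__init__()
        self.forms: List[Tuple[str, str]] = []

    def handle_starttag(self, tag, attrs):
        if tag != "form":
            return
        attributes = dict(attrs)
        # HTML forms default to GET when no method attribute is present.
        method = (attributes.get("method") or "get").lower()
        self.forms.append((attributes.get("action") or "", method))


def list_forms(html: str) -> List[Tuple[str, str]]:
    scanner = FormMethodScanner()
    scanner.feed(html)
    return scanner.forms


# Hypothetical markup for illustration only.
sample = """
<form action="/search" method="get"><input name="q"></form>
<form action="/finder.aspx" method="POST"><input name="zip"></form>
"""
for action, method in list_forms(sample):
    note = "may be crawled" if method == "get" else "not crawled as a form"
    print(f"{action}: {method.upper()} ({note})")
```

A GET form is only eligible for crawling, not guaranteed to be crawled, so treat the output as a list of places to investigate further.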

Using forms, you might provide an incredible user experience, but you might also be limiting the exposure and subsequent traffic levels to your web application from natural search. I come across this often when conducting both SEO technical audits and competitive analyses for clients. In this case, over 2000 pages of content remain unindexed. And if the content is not indexed, then there is no way for the engines to rank it highly (or at all).

The Opportunity Cost
Based on the keyword research I performed, a traffic analysis of competing websites, and then comparing that data to the 2000 pages or so of hidden content, I estimate that the site in question is missing out on approximately 10-15K highly targeted visitors per day. That additional traffic could very easily yield 300-400 conversions per day, if not higher, based on the type of content the site provides.
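For anyone who wants to sanity-check those numbers, the estimate implies a conversion rate of roughly 2% to 4%. The short Python sketch below shows the arithmetic. Pairing the low conversion figure with the high traffic figure (and vice versa) to get the widest range is my own assumption for illustration.

```python
def conversion_rate(conversions_per_day: float, visitors_per_day: float) -> float:
    """Conversions divided by visitors, as a fraction."""
    return conversions_per_day / visitors_per_day


# Widest range implied by 10-15K visitors and 300-400 conversions per day.
lowest = conversion_rate(300, 15_000)
highest = conversion_rate(400, 10_000)
print(f"Implied conversion rate: {lowest:.1%} to {highest:.1%}")
```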

In addition to losing targeted traffic, the site is missing a huge opportunity to gain powerful inbound links, which can boost its search power. The content provided (yet hidden) is so strong and in demand, that I can’t help but think the 2000 pages would gain many valuable inbound links. This would obviously strengthen both the domain’s SEO power, as well as the power of the specific pages (since the more powerful and relevant inbound links your site receives, the more powerful it is going to become SEO-wise.)

Some Usability Also Hindered
Let’s say you find this form and take the time to answer all the questions. After you complete the final steps of the form, you are provided with a list of quality results based on your input. You find the best result, click through to more information, and then you want to bookmark it so you can return later. But unfortunately you can’t… This is due to the web application, which doesn’t provide permanent URLs for each result. Yes, the form is slick and its algorithm is great, but you don’t have a static page that you can bookmark, email to someone else, etc. How annoying is that? So if you want to return to the listing in question, you are forced to go back through the form again! It’s another example of how SEO and usability are sometimes closely related.

SEO and Forms, A Developer's Perspective
I started my career as a developer, so I fully understand why you would want to create a dynamic and powerful form-based application. This specific form was developed using ASP.NET, which utilizes postback (where the form actually posts back information to the same page). The URL doesn’t change, and the information submitted is posted back to the same page where the programmer can access all of the variables. Coding-wise, this is great. SEO-wise, this produces one URL that handles thousands of different pieces of content. Although you might have read that Google started crawling HTML forms in 2008, it’s a work in progress and you can’t guarantee that all of your forms will be crawled (to say the least…) On that note, you should really perform a thorough analysis of your own forms to see what Google is crawling and indexing. You might be surprised what you find (good or bad). So, the application I analyzed (including the forms) isn’t being crawled, the URL never changes, the page optimization never changes, and the content behind the form is never found. This is not good, to say the least.

If I were advising the company using this application, I would absolutely recommend providing another way to get the bots to all of this high quality content. They should definitely keep their robust web application, but they should also provide an alternative path for the bots. Then they should optimize all of those resulting webpages so they can rank for targeted queries. I would also disallow the application in robots.txt, blocking the bots from crawling any URLs that would be generated via the form (just in case). With the right programmer, this wouldn’t take very long and could produce serious results from natural search…
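As a rough illustration of the robots.txt piece, the Python sketch below holds a two-line disallow rule and uses the standard library's robots.txt parser to confirm that it blocks the application while leaving the alternative pages open. The paths `/finder.aspx` and `/services/` are hypothetical stand-ins, since I can't share the actual site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rule: block every URL that starts with the application's path.
ROBOTS_TXT = """\
User-agent: *
Disallow: /finder.aspx
"""


def is_blocked(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """True when the given robots.txt rules stop this user agent from crawling the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


# Form-generated URLs are blocked; the crawlable alternative pages are not.
print(is_blocked(ROBOTS_TXT, "http://example.com/finder.aspx?step=2"))
print(is_blocked(ROBOTS_TXT, "http://example.com/services/plumbing.html"))
```

Checking the rule in code before deploying it is a cheap way to avoid accidentally blocking the very pages you want indexed.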

The Most Basic SEO Requirement: Your Content Needs to be Found In Order to Rank
It sounds obvious, but I run into this problem often as I perform SEO technical audits. Your killer content will not rank just because it’s killer content. The content needs to be crawled and indexed in order to rank highly for target keywords. In this case, the site should definitely keep providing its outstanding functionality, but they should seriously think about the search implications (and provide an easy way for the bots to find optimized content.)

The bad news for my client's competitor is that I believe they aren’t aware of the severity of the problem and how badly it’s impacting their natural search traffic. However, the good news for my client is that they know about the problem now, and won’t make the same mistake as their competitor. That’s the power of a competitive analysis. :)

GG

Related Posts:
6 Questions You Should Ask During a Website Redesign To Save Your Search Engine Rankings
The Critical Last Mile for SEO, Your Copywriters, Designers, and Developers


Monday, December 01, 2008

The Critical Last Mile for SEO: Your Copywriters, Designers and Developers

As I was mapping out a half day SEO training course for creative and technical employees, I started to think about the importance of the last mile in SEO. In the telecommunications industry, the last mile (or final mile) refers to the final connection to end users (usually referring to data connectivity to businesses and consumers). It’s often an area where issues can arise. In SEO, there’s also a last mile, although it’s slightly different. The last mile in SEO includes your copywriters, designers and developers. Let me give you a quick example. Let’s say you were hired to help a company with a large SEO project. Your job was to enhance the company’s SEO efforts by removing technical barriers, optimizing important categories of content, and increasing quality inbound links. You start by performing an extensive technical audit and you identify key barriers to indexation. Then you map out a full remediation plan. Your client is excited, you’ve built up some well-deserved credibility, and everyone involved believes that better rankings and targeted traffic are on their way. But hold on a second... Your changes still need to be implemented successfully. Enter the critical last mile for SEO, or your designers and developers that need to implement those changes. Needless to say, your technical and creative teams are extremely important to your SEO efforts.

Why The Last Mile In SEO Is So Important
It is critical that your creative and technical teams successfully implement your SEO changes. If they don’t, then your changes run the risk of having no impact at all (or worse, having a negative impact). That’s right, imagine you’re brought in to fix a problem and you end up making things worse! It’s definitely possible. Keep in mind that problems typically arise in the last mile of SEO on larger sites, where more people are involved (for example, a 500,000 page website with 75 people working on it). However, whether you hand off technical SEO changes to a single developer or a team of developers, you’re relying on them to implement something they might not be very familiar with. And you need to understand that without your designers and developers, it’s going to be extremely hard to get your SEO changes implemented swiftly and accurately. Like I said earlier, they encompass the critical last mile… That said, your designers and developers also need to understand that your SEO changes are important to the success of the website. It’s a symbiotic relationship and each party needs to understand the value that the other brings to the table.

Let’s take a look at some quick examples of last mile SEO breakdowns, and more importantly, how you can make sure this doesn’t happen in the future:
(Note, I’ve included just a few examples below and not an exhaustive list.)

Search Engine-Friendly Redirects
The Breakdown: Instead of search engine-friendly 301 redirects, 302 redirects or meta refresh redirects were implemented on the website. Neither 302’s nor meta refresh redirects are search engine friendly, and neither will safely pass the link popularity from the old pages to the new ones. Needless to say, this is not good. If your redirects are implemented incorrectly, then you could waste thousands of inbound links and the search power they provide. In addition, you could have wasted countless hours of inbound link analysis.
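
To make the difference between the three redirect types concrete, here is a minimal Python sketch that labels a response based on its status code and HTML. The function name, the labels, and the regular expression are illustrative assumptions, not part of any SEO tool, and the meta refresh check is only a rough heuristic.

```python
import re

# Rough heuristic (an assumption for this sketch): finds a
# <meta http-equiv="refresh" ...> tag regardless of attribute order or case.
META_REFRESH = re.compile(r"<meta[^>]+http-equiv\s*=\s*[\"']?refresh", re.IGNORECASE)


def classify_redirect(status_code, body=""):
    """Label a response by the kind of redirect it represents."""
    if status_code == 301:
        return "301 permanent (search engine friendly)"
    if status_code == 302:
        return "302 temporary (replace with a 301)"
    if status_code == 200 and META_REFRESH.search(body):
        return "meta refresh (replace with a 301)"
    return "not a redirect"
```

You would feed it the status code and body returned when requesting an old URL without following redirects. Anything other than the first label is worth a conversation with your developers.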

XML Sitemaps Throwing Errors
The Breakdown: The database administrator generating your XML sitemap files didn’t know that each XML file cannot exceed 50,000 URL’s or 10MB in uncompressed file size. The files released to the website exceeded those limits, and the engines wouldn’t process the files. Unfortunately, he didn’t know that the files were throwing errors until your SEO Coordinator received the errors in Google Webmaster Tools.

--I worked on a site with over 20 million webpages last year, and we definitely went through a few iterations of sitemap files before we settled on the final result.
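
One way to avoid this breakdown is to have the generation script enforce both limits itself. The Python sketch below is a rough illustration, not the script used on any real site. The function name is hypothetical, and the limits are simply the ones quoted above, so check the current sitemap protocol before relying on them.

```python
from xml.sax.saxutils import escape

MAX_URLS = 50_000                # per-file URL limit quoted above
MAX_BYTES = 10 * 1024 * 1024     # per-file uncompressed size limit quoted above

HEADER = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
)
FOOTER = "</urlset>\n"


def split_sitemaps(urls, max_urls=MAX_URLS, max_bytes=MAX_BYTES):
    """Return a list of sitemap documents, each kept within both limits."""
    overhead = len(HEADER.encode("utf-8")) + len(FOOTER.encode("utf-8"))
    files, entries, size = [], [], overhead
    for url in urls:
        entry = f"  <url><loc>{escape(url)}</loc></url>\n"
        entry_size = len(entry.encode("utf-8"))
        # Start a new file when adding this entry would break either limit.
        if entries and (len(entries) >= max_urls or size + entry_size > max_bytes):
            files.append(HEADER + "".join(entries) + FOOTER)
            entries, size = [], overhead
        entries.append(entry)
        size += entry_size
    if entries:
        files.append(HEADER + "".join(entries) + FOOTER)
    return files
```

With 120,000 short URL’s, for instance, this produces three files holding 50,000, 50,000 and 20,000 entries. The point is that the limits live in the code, so nobody has to remember them.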

Content Optimization, Keyword Research, and Wasted Opportunities
The Breakdown: Important new sections of content went live without being optimized based on keyword research. You’ve lost a great opportunity to provide optimized content and to possibly rank for target keywords. For example, a new product section goes live and it unfortunately contains generic title tags, non-descriptive links, no heading tags, a lack of target keywords, etc.

Canonicalization
As part of your technical audit, you might find URL canonicalization issues, which could cause duplicate content problems. For example, you might find URL’s that resolve using mixed case, querystring parameters, index files and root URL’s. One URL might look like five to the search engines (all with the same exact content).

For example:
www.yourwebsite.com
yourwebsite.com/
yourwebsite.com
yourwebsite.com/index.htm
yourwebsite.com/index.htm?value=duplicatecontent

The Breakdown: Your developers fix the most obvious problem, www and non-www versions of each page, but don’t tackle the other canonicalization problems, including trailing slashes and mixed case. You will unfortunately still have an issue although the action item might be checked off by project management.
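
To show what tackling all of the variants at once might look like, here is a small Python sketch that collapses the example URL’s above into a single canonical form. It rests on several assumptions that you would need to confirm for your own site: the www version is the preferred host, paths are not case sensitive, index.htm and index.html are the default documents, and the querystring does not change the content.

```python
from urllib.parse import urlsplit, urlunsplit

INDEX_FILES = {"index.htm", "index.html"}   # assumed default documents


def canonicalize(url):
    """Collapse common duplicate-URL variants into one canonical form."""
    if "://" not in url:
        url = "http://" + url               # the examples above omit the scheme
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if not host.startswith("www."):
        host = "www." + host                # assumption: www is the preferred host
    path = parts.path.lower()               # assumption: paths are not case sensitive
    head, _, last = path.rpartition("/")
    if last in INDEX_FILES:
        path = head + "/"                   # /index.htm and / are the same page
    if not path:
        path = "/"
    # Assumption: the querystring does not change the content, so it is dropped.
    return urlunsplit((parts.scheme.lower(), host, path, "", ""))
```

Running all five example URL’s through this function (plus a mixed case variant) yields one URL, which is the version your 301 redirects should point to.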

Flash and AJAX
Let’s say you have a killer promotion going live along with campaign landing pages. There’s a lot of good content to optimize and you have a feeling this promotion will gain some valuable inbound links. You hand off your content optimization spreadsheet, excited to see the pages go live.

The Breakdown: Your new campaign landing page goes live, but the entire page was developed in Flash or using AJAX. If you’ve read my blog before, then you know I’m a big fan of using Flash and AJAX, when needed. That said, entire webpages or applications should not be developed using Flash or AJAX (at least at this point). They should only be used for elements that require their power. If you do use Flash or AJAX for entire webpages, then you run the risk of essentially hiding a lot of your content from the search engines.

Graceful Degradation and Progressive Enhancement
The Breakdown: User Experience wants to take 6 distinct sections of content on a product detail page and provide a tabbed structure instead (for usability). If the tabbed content launches without using Graceful Degradation or Progressive Enhancement, then you run the risk of hiding 5 out of 6 sections of content. For example, the search engines would only find the initial content on the page and not the additional five pieces of content. However, making sure your web developers use Graceful Degradation or Progressive Enhancement to expose the content would still put you in a good place SEO-wise.
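
A quick way to test for this is to look at the page the way a crawler that does not run JavaScript would. The Python sketch below is only an illustration under that assumption. The class name, function name and sample phrases are all hypothetical. It strips out script and style blocks and reports which expected pieces of content are missing from the remaining markup.

```python
from html.parser import HTMLParser


class TextWithoutScripts(HTMLParser):
    """Collects the text present in the markup, ignoring script and style blocks."""

    def __init__(self):
        super().__init__()
        self._skip = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.chunks.append(data)


def missing_sections(html, expected_phrases):
    """Return the expected phrases that do not appear in the non-script text."""
    parser = TextWithoutScripts()
    parser.feed(html)
    parser.close()
    text = " ".join(" ".join(parser.chunks).split()).lower()
    return [phrase for phrase in expected_phrases if phrase.lower() not in text]
```

Pass in the raw HTML of the product detail page along with one distinctive phrase from each of the six sections. If the tabs were built with Progressive Enhancement, the list that comes back should be empty.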

So How Do You Prevent a Breakdown in the Last Mile of SEO?
Reading the examples above, you might think that SEO can be frustrating. It is sometimes, but there is a way to nip these last mile problems in the bud. Did you notice a common thread in the examples listed above? The common thread was simply a lack of information. So how do you make sure your designers and developers know about SEO best practices? The answer is training. SEO Training is critical to ensuring technical changes go live using SEO best practices.

In my experience, most designers and developers want to learn SEO best practices. Sure, there will be some push back (and I’m being nice with the term “push back”). But, it’s a great skill for your designers and developers to add to their skillset. They can still create killer applications and websites, but those sites will also launch using SEO best practices. SEO Training can also overcome conflict in the future by ensuring everyone developing a project understands SEO best practices. For example, there should be no surprises when reviewing projects if everyone understands how sites get crawled and indexed.

The Definition of Insanity
I’ll end this post with the definition of insanity. It’s doing the same thing over and over again and expecting different results. Don’t become an insane SEO. :) Introduce SEO training, best practices, examples, etc. and you can make your life easier while helping everyone involved improve their skillset.

Now I need to get back to fleshing out my half day SEO training course. Actually, I think writing this post has helped me create a better training course. I’ll let you know how it goes.

GG


Thursday, October 09, 2008

6 Questions You Should Ask During a Website Redesign That Can Save Your Search Engine Rankings

If you are currently involved in or are planning a website redesign, then I’m sure the title of my post caught your attention. I’m not one to strike fear into people about SEO, but in my experience, website redesigns (or even website updates) have a knack for hurting Natural Search rankings. It actually makes a lot of sense if you think about it. During website redesigns, many companies try to make noticeable and impactful changes. You might add more interactivity and rich media, you might use the latest coding techniques to enhance the user experience, you might remove older webpages that you don’t believe need to be on the site anymore, you might change your URL structure, and so on and so forth. But, and this is a significant but, if you don’t look at your redesign through the lens of SEO, then you have a distinct possibility of hurting your search rankings. Actually, you can crush your rankings if you aren’t careful.

So, I decided to write this post to help you stand out as the person that saves the day. The person that flies in with SEO on your chest, swoops down and identifies SEO issues with your redesign and then corrects a potential disaster in the making.
--BTW, these are actual SEO scenarios I have come across. Also, there are many more issues that can pop up, but I decided to focus on these 6 for the post. And don’t laugh when you read each item, this might be happening as part of your next redesign. :-)

Without further ado, here are 6 questions you can ask during your website redesign that can save your search engine rankings:

1. Are we using Flash in the right ways and only when we need its unique power?
If you know me at all, then you know I’m a big advocate of Flash (having developed with it for over 10 years). But, replacing HTML content with full Flash pages or a significant amount of Flash can really cause problems SEO-wise. Run a cache command on a full Flash webpage and you’ll see the problem quickly. That is unless you want to rank for “big blank white space”! ;-) If you do add more Flash content to your site, then definitely utilize SWFObject 2.0 to provide search engine friendly alternative HTML content. I’ve written an in-depth post about how to use SWFObject 2.0 here. And for those of you that are saying, “We’ll be ok since the engines are now crawling Flash...”, please read my other post about Google crawling Flash. There are several variables that can impact how Google and Yahoo crawl your SWFs (the two engines working with Adobe now). My tests and recommendations were backed up this week at SMX during the Flash and SEO session with Adobe, Google, Yahoo, and Live Search. What’s my rule of thumb with Flash? Use it where you need the unique power of Flash. Do not, I repeat, do not use Flash for your entire site or for entire pages of content. Use it for webpage elements only.

2. Did we analyze the Search Equity of webpages marked for removal?
If you will be removing content from your site, make sure you determine the Search Equity of your pages. Your current rankings are heavily based on the quality and relevance of your inbound links. You’ve worked hard to build those links, so why would you throw them away?? This happens all too often when you don’t take into account which pages are important from a Natural Search standpoint.

Campaign landing pages are a great example of this. Let’s say you launch a new product and use a wide range of marketing channels to promote the new product and landing page. When the campaign ends, you decide the page isn’t needed anymore, so you just delete it. But hold on… if you had taken a look at the Search Equity of the page, you would have realized it built more than 5,000 links for you, mostly from industry-relevant blogs and websites! It earned a PageRank of 5 and you just threw away all of those links by deleting the page! I hate when I see this happen. Do your homework before deleting pages.

So what should you do? You should either keep the page as-is or 301 redirect the page to a corresponding page on your site. That might be the product category page or a similar product page. 301 redirects are the proper way to pass link power from one URL to another. A 301 is a permanent redirect and tells the engines that Page A has moved permanently to a new location (Page B). Tip: Do not use 302 redirects when you remove a page. 302's are temporary redirects and are not search engine friendly. I could write an entire post about redirects, but just remember that 301’s are good and 302’s are bad.

3. Are we changing our URL structure during the redesign? If we are, did we make sure the engines know where the old pages will reside on the new website?
Similar to the bullet above, be careful if you decide to change your URL structure. If you change a URL from abcd.asp to efgh.asp, the engines will look at the page as NEW, even though the same content has been around for a long time (and has built up links and search power). Basically, the new page won’t automatically inherit the search power of the original page. Now imagine the impact if you change thousands of URL’s, tens of thousands of URL’s or even more?

For example, let’s say you decide to include target keywords in your URL’s, such as a product name and category. The old URL’s that have built up a nice amount of Search Equity will all be changed to your new taxonomy during the redesign. That’s great, but again, all of that search power will unfortunately be lost unless you tell the engines where the new URL’s are. Based on what I mentioned above, you can probably guess that it’s Mr. 301 redirect to the rescue again. You can redirect your old URL’s to your new ones and safely pass their link power. I’ve seen this overlooked plenty of times, and again, the results can be devastating.
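
When you are redirecting thousands of URL’s, it helps to keep the old-to-new mapping in one place and sanity check it before launch. The Python sketch below is a hypothetical helper, not a feature of any web server. It flattens redirect chains so every old URL points straight at its final destination, and it raises an error if the mapping contains a loop. Flattening is a design choice made for this example, so that each old URL needs only a single 301.

```python
def resolve_redirects(mapping):
    """Point every old URL straight at its final destination (no chains)."""
    resolved = {}
    for old in mapping:
        target, seen = mapping[old], {old}
        # Follow the chain until the target is no longer itself redirected.
        while target in mapping:
            if target in seen:
                raise ValueError(f"redirect loop involving {target}")
            seen.add(target)
            target = mapping[target]
        resolved[old] = target
    return resolved
```

For example, if abcd.asp was moved to efgh.asp in an earlier update and efgh.asp now moves to a keyword-based URL, both old URL’s end up mapped directly to the new one. The resolved mapping can then be handed to your developers to implement as 301 redirects.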

4. Are we using Vanity URL’s or custom domains for our campaign microsites?
Note, this doesn't fall under something that will crush your current rankings, but it sure can impact how your site builds more power based on your hard work.

Let's say you have a new marketing campaign going live soon and someone on your team wants to register a bunch of new domain names for the microsite. You know, something like www.TheBestDarnBagelOnThePlanet.com or something catchy like that… Here’s the problem. It will be a brand new domain that needs to build its own search power versus inheriting the trust from your core domain, which is why I’m a bigger fan of using subdirectories, such as yourdomain.com/campaigntitle. Then your campaign will leverage your trusted domain, rank faster, and help build links for your trusted domain. It’s a win-win.

5. Are we replacing keyword-rich text content with images or Flash in order to achieve an aesthetic advantage? AKA, we want things to look pretty…
Your design team went nuts with the redesign, the new site looks incredible, and it uses all sorts of images and Flash content in place of text content. You know, because the standard browser fonts aren’t sexy enough. I get that, I really do... but the SEO impact can be serious. For example, the team might take keyword-rich text content on each page and throw it into images to get a desired look, or take your text navigation and place it in Flash or in images. Again, this happens all too often. Text links are still the best way to get the bots to all of your content. And, using descriptive anchor text, you can tell the engines what they will find at the other end of the link. For example, using a text link with the anchor text Adidas Running Sneakers is much more powerful than using an image that holds the text Adidas Running Sneakers. Even if you use alt text with that image, it’s a much better idea to use descriptive text links. And, if you use Flash, then you’ll run into even more problems, which is why you should use SWFObject to provide an HTML version of your navigation. And for those of you who are saying, “I’ll just provide an XML sitemap to the engines and I’ll be fine”, keep in mind that the optimal way to get the engines to your pages is via a traditional crawl (as noted by a Google engineer at SMX this week). :) XML Sitemaps are a great supplement and help with more than just content discovery, but they don’t replace text links and navigation as the best way to get the bots to your website pages.

6. Did we do such a good job at coding that we essentially removed key pages from our website? i.e. Where one page now handles the equivalent of 10 pages. The URL doesn’t change, but the content does big time!
Your developers did a great job of streamlining your code. They did such a good job that 10 pages of content can now be handled dynamically by just one page. That one page posts back to itself and dynamically provides the content of 10 pages from your old site. Code-wise this might be outstanding, but SEO-wise, it’s a nightmare. Beyond removing 10 pages from your site that might have built up Search Equity, you cannot optimize a page for each of the 10 items that will be presented on the fly. You are going to have a heck of a time getting those products to rank if they cannot be crawled! In addition, you cannot optimize the typical HTML elements like you normally would (for example, the title tag, h1, h2, body copy, inline links, etc.) since the information will be loaded dynamically. Coming from a development background, I totally understand why you would want to code this way. However, from an SEO standpoint, it can cause all sorts of issues. I would make sure you can present each of the 10 pieces of content in an optimized webpage with a distinct URL. You can still use code to streamline the process and delivery, but try not to handle everything at one URL.

A quick example would be a category page that dynamically presents each product within that category. This might happen when you click each product image (and this all happens at one URL). The engines would only see one URL and crawl the initial content. Not good.
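
To illustrate the alternative, here is a minimal Python sketch that gives each product in a category its own URL. One template can still drive all of the pages, which keeps the streamlined code. The function names, the URL pattern and the sample product names are all assumptions made for the example.

```python
import re


def slugify(text):
    """Lowercase the text and collapse anything that is not a letter or digit into hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def product_url(category, product):
    """Build a distinct, keyword-based URL for one product (hypothetical pattern)."""
    return f"/{slugify(category)}/{slugify(product)}/"
```

With a pattern like this, ten products in one category become ten crawlable URL’s, and each page can carry its own title tag, headings and body copy.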

So there you have it, 6 ways you can save the day during your next website redesign or website update. Keep in mind that you will probably have a challenging time when you first introduce these questions. There will be pushback and requests to back up your recommendations. But once you do, and everyone involved starts to understand SEO best practices, the problems I mentioned will be less likely to occur. If they are less likely to occur, then you have a better chance of keeping your organic search power. If you keep your organic search power then you can keep driving natural search traffic to your site. If you keep driving natural search traffic to your site, then you can reap the benefits of that traffic, which can be increased exposure, customers, and revenue.

So don't be afraid to speak up!

GG
