Wednesday, March 2nd, 2011

Successfully Revealing the Iceberg – Avoid These Common Pitfalls When Opening Up New Content for SEO

by Glenn Gabe


When I’m helping clients with SEO, I often look for opportunities to expand the amount of optimized content they provide on their sites by leveraging information they already own (or that they already have developed). I see opportunities to do this often, since it’s easy to overlook data that’s right under your nose (if you aren’t looking at the situation from an SEO standpoint). This can sometimes be low hanging fruit for companies SEO-wise, and can greatly expand the amount of content indexed by the search engines.

For example, you might have a database full of information relevant to a specific topic, category, or industry vertical, but it’s only used as part of an application (and isn’t accessible to the search engine bots). Or maybe you find content that’s only used for print material or for training purposes. SEO-wise, I consider this “low hanging fruit” because the content is already there, but you might just have it in a form that’s not crawlable or search engine friendly. Once this data is revealed and a plan is developed, it might only take a short amount of time to open up that content and have indexation boom on the site. And if mapped out the right way, all of that content can be optimized based on keyword research. I’ve seen this approach work extremely well for my clients.

They Beat Us To The Punch, Or Did They?
I’ve been helping a client develop a plan to open up thousands of pages of content on their site, based on data that’s currently only used in applications. Based on my estimates, the new content can increase indexation by a factor of 20, which is a huge jump in the amount of content on the site. The plan is to roll out the new content over time, and ensure each page is optimized based on keyword research (and SEM intelligence). This could be a huge win for them, to say the least.

As the project was being developed and nearing completion, I received an email from my client that read, “They beat us to the punch!” with a link to a competitor that made a similar move. It looked like they opened up a lot of content for Search (ahead of my client), which put a damper on things. So I decided to check out their solution in detail. About an hour later, I sent an email back to my client that read, “Don’t worry. None of their new content can be crawled, and to add insult to injury, even if it could, it’s not optimized. Full steam ahead.”

When analyzing the new content on the competitor’s site, it didn’t take me long to realize that they had structured a solution that simply couldn’t be crawled easily. All of the links to their new content were in JavaScript, the implementation included some AJAX that wasn’t crawlable, the content wasn’t optimized, and there was a serious lack of drilldown into the content (even if they had used straight text links). Needless to say, I was happy for my client. The competitor obviously didn’t have an SEO involved when mapping out the project, which is unfortunately a common occurrence when developing websites or web applications.

How To Open Content Up The Right Way, and Avoid The Madness
So, if you’re ready to leverage content you already own and have stored away, how do you ensure that new content benefits your SEO efforts? You definitely don’t want to waste time, resources, budget, etc. on a solution that does nothing for you in organic search (especially if SEO is an important reason for opening that content in the first place). Below, I’ve listed some key points to consider while opening up your content for Search. By no means is this the full list, but the following points can definitely help you have a greater chance of success, and avoid the potential madness of what I saw in the example above.

1. Make It Clean and Crawlable
If you’ve read previous posts of mine about SEO technical audits, then you know how important I believe a clean and crawlable structure is. When you look to open up a lot of content on your site that’s currently databased, you need to make sure the bots can easily crawl and index that content. This sounds simple, but I can’t tell you how many times I’ve seen solutions that throw serious barriers up to the search engine bots. The result is a lot of new content that never finds its way into Google’s index. The worst part is that the companies implementing the new content don’t know that it’s not crawlable until nothing changes SEO-wise. The answer usually comes out during an audit, months down the line (or longer).

In order to accomplish what you need with the new content, you should develop a strong information architecture to ensure that new content is organized logically. For example, depending on the content, you might organize it by category, subcategory, location, vertical, or other dimensions that make sense. Then you can use a robust internal linking structure to ensure the bots get to your content using descriptive text links. Then depending on the content at hand, you can provide relevant links from deeper pages to other pages you are opening up. The goal is to ensure both your users and the search engine bots can find all of the new content, while also influencing that new content via other pages on your site (more on that below).

2. XML Sitemaps Will Help, But They Won’t Save You
If you think that simply providing all the new URLs in an XML sitemap will instantly give you SEO power, think again. XML sitemaps are important, but they are a supplement to a traditional web crawl. You should definitely use them, but you shouldn’t rely on them in the same way you rely on traditional links from other pages on your site. You can’t influence your new pages via an XML sitemap. For example, you won’t be passing any PageRank to the new pages by simply adding them to an XML sitemap. But you can pass PageRank by linking to your pages via a strong internal linking structure. I find a lot of people don’t realize how you can influence other pages on your site via smart linking. And by the way, this typically helps both users and SEO. I’m definitely not saying to add a bunch of links to the new content just for SEO. A smart internal linking structure is good for usability and natural search performance.
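To make the supplement part concrete, here’s a minimal sketch of generating an XML sitemap for newly opened pages using Python’s standard library. The URLs and the build_sitemap function name are hypothetical, made up for illustration. The only fixed pieces are the urlset, url, and loc elements and the namespace from the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(urls):
    """Return sitemap XML (as a string) with one <url><loc> entry per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body

# Hypothetical URLs for newly opened pages
new_pages = [
    "https://www.example.com/widgets/blue-widget",
    "https://www.example.com/widgets/red-widget",
]
sitemap_xml = build_sitemap(new_pages)
print(sitemap_xml)
```

Keep in mind that a file like this only tells the engines the URLs exist. Those same pages still need real links from other pages on the site.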

3. Avoid JavaScript-based Links, and Make Sure Your AJAX is Crawlable
If you take my advice and map out a robust internal linking structure for your new content, do not use JavaScript-based links to drill into that content. Use direct text links whenever possible. The reason is that you cannot guarantee that those JavaScript-based links will be crawled effectively. Worst case scenario, all of the links to your new content won’t be crawled at all. And that could leave most of that new content with no way of ranking. To clarify, if it can’t be crawled, it won’t be indexed. If it can’t be indexed, you have no way of ranking. If it can’t rank, you can’t drive targeted traffic via SEO.
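One rough way to spot this problem before launch is to scan a page’s markup and separate plain text links from links that depend on JavaScript. The sketch below is only an illustration: the LinkAudit class, the sample HTML, and the loadProduct function in it are all hypothetical, and the check is a simple heuristic, not a model of how any search engine actually crawls.

```python
from html.parser import HTMLParser

class LinkAudit(HTMLParser):
    """Sort <a> tags into plain hrefs and JavaScript-dependent links."""

    def __init__(self):
        super().__init__()
        self.crawlable = []    # links with a real URL in href
        self.script_only = []  # links that only work by running JavaScript

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        href = (attr.get("href") or "").strip()
        if not href or href == "#" or href.lower().startswith("javascript:"):
            # Record the script call (or a placeholder) for the report
            self.script_only.append(attr.get("onclick") or href or "(no href)")
        else:
            self.crawlable.append(href)

# Hypothetical markup mixing one text link with two JavaScript-based links
sample_html = """
<a href="/widgets/blue-widget">Blue Widget</a>
<a href="javascript:loadProduct(42)">Red Widget</a>
<a href="#" onclick="loadProduct(43)">Green Widget</a>
"""

audit = LinkAudit()
audit.feed(sample_html)
print("Crawlable:", audit.crawlable)
print("JavaScript-only:", audit.script_only)
```

If most of the links to your new content land in the second list, that’s a sign the drilldown needs to be rebuilt with direct text links.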

Also, in order to create powerful ways to access new content, some companies utilize AJAX in their implementation. That’s fine, but you need to ensure your AJAX is crawlable. If not, you can run into a situation similar to the one I described above with JavaScript-based links. Your content simply won’t be crawled. To overcome situations like this, Google developed a method for ensuring your AJAX gets crawled. The problem is that many companies don’t know that it’s possible, how to implement it, etc. If you choose to use AJAX for usability purposes when opening up new content, make sure you follow Google’s guidelines. If not, you might end up with a lot of new content in theory, when in reality, none of it gets crawled, which of course means it can’t rank.

4. Dynamic Optimization – Optimize Your New Content Programmatically
If you are taking the time to open up thousands of pages (or more) of new content, make sure you take the time to optimize that content. The solution I mentioned earlier (my client’s competitor) implemented the same exact metadata for each new page (across thousands of pages). Needless to say, that isn’t going to help them at all. When you open up a lot of content, you can work with your development team to create a formula for dynamic optimization. You can analyze the database structure and utilize those fields to help optimize the title tag, meta description, heading tags, internal links, etc. If you come up with the right formula, then you can optimize all of your new content programmatically. That’s an awesome way to go for database-driven content. Think about it: are you ready to optimize 12K new pages of content manually? Instead, have a developer write code that can leverage the information you have already databased to uniquely optimize each piece of content. Awesome.
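Here’s a minimal sketch of what such a formula could look like. Everything in it is an assumption for illustration: the field names (name, category, city, summary), the “Example Co” brand, the title pattern, and the 65 and 155 character limits. Your own formula should come out of your database structure and keyword research.

```python
from html import escape

def truncate(text, limit):
    """Cut text at a word boundary so the result fits within limit characters."""
    if len(text) <= limit:
        return text
    return text[:limit - 3].rsplit(" ", 1)[0].rstrip(",.;:") + "..."

def build_metadata(record):
    """Build a unique title, meta description, and H1 from one database row."""
    # Hypothetical title pattern; the real one should follow keyword research
    title = f"{record['name']} - {record['category']} in {record['city']} | Example Co"
    description = f"{record['name']}: {record['summary']}"
    return {
        "title": escape(truncate(title, 65)),
        "description": escape(truncate(description, 155)),
        "h1": escape(f"{record['name']} ({record['city']})"),
    }

# Hypothetical rows standing in for records pulled from the database
rows = [
    {"name": "Acme Plumbing", "category": "Plumbers", "city": "Boston",
     "summary": "Licensed residential and commercial plumbing services."},
    {"name": "Bay Electric", "category": "Electricians", "city": "Salem",
     "summary": "Emergency electrical repair and panel upgrades."},
]

pages = [build_metadata(row) for row in rows]
for page in pages:
    print(page["title"])
```

Because every value is pulled from the row itself, each page ends up with its own title, description, and heading, which is exactly what the competitor’s identical metadata was missing.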

5. Avoid (Creating) Duplicate Content
If you don’t map out a sound structure and navigation for your new content, you can run into duplicate content problems. I won’t go into great detail about duplicate content in this post, but it’s not a good thing for SEO. Duplicate content is when you have the same content resolve at more than one URL. As you can guess, this usually isn’t intended. For example, imagine you had one product that’s part of six different categories. When opening up this content, you could very easily have six different product pages versus one canonical product page. Each page holds the same exact content, but resolves at six different URLs. That’s a good example of duplicate content. If possible, you definitely want to ensure each piece of content resolves at one canonical URL. Using the example from above, it would be smart to have each of the category pages link to one product page.
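The six-category example can be sketched in a few lines. The idea is to build the product URL from the product alone, never from the category the visitor came through. The URL patterns, function names, and sample data below are hypothetical.

```python
def slugify(text):
    """Lowercase text and join its alphanumeric words with hyphens."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in text.lower())
    return "-".join(cleaned.split())

def product_url(product):
    """One URL per product, independent of the category it is listed in."""
    return f"/products/{slugify(product['name'])}"

def category_links(product, categories):
    """Map each category page to the single product URL it should link to."""
    target = product_url(product)
    return {f"/category/{slugify(name)}": target for name in categories}

# Hypothetical product that belongs to six categories
product = {"id": 17, "name": "Blue Widget 3000"}
categories = ["Widgets", "Garden Tools", "Gifts", "New Arrivals",
              "Best Sellers", "Clearance"]

links = category_links(product, categories)
for category_page, target in links.items():
    print(category_page, "->", target)
```

All six category pages point at the same product URL, so there is only one page for the engines to index for that product.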

Bonus: Watch Out For Session IDs
One issue I’ve seen pop up in projects involving application-driven content is the dreaded session ID. Make sure you are not appending session IDs to your URLs when implementing new content. If you do, then you will certainly be creating a lot of duplicate content on your site, which based on what I explained earlier, can be a bad thing SEO-wise. You should never have session IDs resolve in the URL, and instead, you should use a cookie-based approach to maintaining state. If session IDs end up in your URLs, you can end up with thousands of pages of duplicate content (since you might have many URLs for each piece of content). In a nutshell, the planning stage is critically important to ensuring you don’t run into a canonicalization problem.
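To see how quickly session IDs multiply URLs, here’s a small sketch that strips session parameters from a list of URLs and shows them collapsing to one page. The parameter names in SESSION_PARAMS are common examples I’m assuming, not a complete list, and the URLs are hypothetical. This is an audit aid for a crawl report or log file. It doesn’t replace keeping session IDs out of URLs in the first place.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumed session parameter names; adjust to match your own platform
SESSION_PARAMS = {"sid", "sessionid", "phpsessid", "jsessionid"}

def strip_session_id(url):
    """Return the URL with any session ID query parameters removed."""
    parts = urlsplit(url)
    kept = [(key, value)
            for key, value in parse_qsl(parts.query, keep_blank_values=True)
            if key.lower() not in SESSION_PARAMS]
    return urlunsplit((parts.scheme, parts.netloc, parts.path,
                       urlencode(kept), parts.fragment))

# Hypothetical crawled URLs: three addresses, one piece of content
crawled = [
    "https://www.example.com/widgets/blue-widget?sid=abc123",
    "https://www.example.com/widgets/blue-widget?sid=xyz789",
    "https://www.example.com/widgets/blue-widget",
]

unique_pages = {strip_session_id(url) for url in crawled}
print(len(crawled), "crawled URLs ->", len(unique_pages), "unique page(s)")
```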

Open New Content, Don’t Bottle It
I hope this post provided some guidelines for ensuring you don’t waste your time when opening up new content on your site. If you find data that’s not being utilized, and choose to implement new content based on that data, then make sure it can help you SEO-wise. Don’t make what could be a boom of new content turn into a squeak of SEO technical issues. Make it crawlable, avoid duplicate content, map out a robust internal linking structure, and make sure your AJAX is crawlable. If you do, you can reap great rewards SEO-wise. If you don’t, you’ll keep the iceberg of great content underneath the water, where nobody can find it.

GG

2 Responses to “Successfully Revealing the Iceberg – Avoid These Common Pitfalls When Opening Up New Content for SEO”

  1. Glenn,

    You make excellent points here. Very helpful. I totally agree with you, the more relevant, well written and good content you can put on your site the better. The more content that is on your subject and on your site the more indexation you get in the Search Engines.

    Keep up the good work Glenn!

    -Seth

  2. Glenn Gabe says:

    Absolutely Seth, companies can often increase the amount of quality and optimized content on their sites by finding data and information that’s already created. Then after developing a plan, they need to implement that new content so it’s crawlable. Unfortunately, this is where the project can break down. Hopefully the tips I provided in the post can help more companies succeed.

    GG