Automatically post friendly URLs to Twitter and Facebook

January 12th, 2008 Jason

In a fit of self-propagation, I set about this week to explore making WordPress post to my Twitter account any time I update.

I found a basic, but functional, plugin called Twitpress, which does exactly what I wanted. Except…I’m also using the All In One SEO Pack, which rewrites page URLs into an SEO-friendly format. (Really, a must-have plugin.) Twitpress by default will tweet the stock version of a post URL:

http://RelevantText.com?p=24

instead of the format I want to show:

http://RelevantText.com/making-the-most-of-server-errors-20080111/

Now, I know that a) Twitter links are nofollowed, so this doesn’t really matter for the spiders, and b) Twitter also automatically turns long links into tinyurls, but it still bothered me (more on why in a minute). So, I set about to fix the plugin.

After reading through what the plugin code was doing, I surfed through the WP database tables a little bit, and discovered that I needed to change one line in Twitpress. Hooray!

In the twitpress.php code, replace line 85:

$proto = str_replace("[link]", get_option('home')."?p=".$postID, $proto);

with

$proto = str_replace("[link]", $post->guid, $proto);

‘guid’ is a field in the wp_posts table, if you care.

Bingo. I’m very pleased with myself.

So why, you may ask, do I care about how the links look in Twitter if they aren’t spiderable? Because I’ve also installed the Twitter App on Facebook, so any time I update Twitter, my Facebook status updates as well…which means the link is then being pushed out along the newsfeeds of all my contacts there. The link is still not spiderable, but it is potentially much more likely to get seen, followed, and possibly linked to. Through the tinyurl redirect, it now goes to the right version of the URL, and when people subsequently link to the post, I want them using the right one. This, I think, will help that along.

Jan 14 Update: After my initial excitement, I’ve discovered that this is still slightly buggy: notifications occasionally appear on Twitter with the p= URL, and sometimes with no URL at all. This seems to happen only when a post is first published, not when it is later edited, but I’m not clear why, since the ‘guid’ field is populated when a post is first published. So, this is cool when it works, but I’m still looking into it.

Posted in Facebook, Twitter, geek, php, plugins, seo, site, wordpress

Making the most of server errors

January 11th, 2008 Jason

Nobody thinks twice about planning for and dealing with 404 errors on their website. It’s going to happen, right? Not because you didn’t properly redirect when you moved a page or something, of course! But you expect that somebody will mistype a URL someday, and you plan for it and have your fancy or funny 404 page in place on launch day.

I was reminded today that people often don’t deal with 500 server errors at all, but on a large dynamic site these errors are just as bound to happen as 404s, and they’re far more troublesome. They are unpredictable, untrackable (unless you want to trawl through server logs, which I for one don’t), and harbingers of doom for your site because more often than not, they are indicators of something very bad going on behind the scenes…and you can bet your AdSense check that if a user sees them, a search spider does, too. When a spider hits a server error, it’s usually dead in the water, and that spells disaster for your rankings.

The good news is, it’s actually not too hard to deal with them properly.

One of my large corporate sites was having some massive issues with server response time last year, and as a result we were seeing a significant uptick in the number of 500 errors being reported in the crawl stats in Google Webmaster Tools (WMT).

For the most part, it seemed like simply backing up and reloading the page usually got past the error, but GoogleBot isn’t going to do that. We really had no way of knowing just how pervasive the problem was, but we knew it was affecting the user experience, and clearly killing GoogleBot on a regular basis. While the technology group worked on the backend issues, we stemmed the problem from the front end by creating a custom error page to display any time a 500 error occurred.

The criteria were minimal:

  1. Improve the user experience when an error occurs
  2. Provide search spiders a way to continue through the site, and
  3. Be able to solidly track the number of server errors being delivered as part of our overall statistics

Fortunately, both .NET and Apache make it very easy to define a custom page to display when a server error happens.

In Apache, it’s dead simple - add a line to your .htaccess file like this:

ErrorDocument 500 /friendly500.html

(the nice thing here is that you don’t need to tweak the server config file, which you probably can’t do if you don’t manage your own servers…)

Microsoft servers are a little more involved. For a friendly error page in .NET, IIS tells you to edit the web.config file to include this code:

<customErrors mode="On" defaultRedirect="errors/friendly500.html">
</customErrors>

As noted on TechRepublic, you may define different pages for different errors:

<customErrors mode="RemoteOnly" defaultRedirect="errors/ErrorPage.aspx">
  <error statusCode="400" redirect="errors/friendly400.html" />
  <error statusCode="401" redirect="errors/friendly401.html" />
  <error statusCode="403" redirect="errors/friendly403.html" />
  <error statusCode="404" redirect="errors/friendly404.html" />
  <error statusCode="408" redirect="errors/friendly408.html" />
  <error statusCode="500" redirect="errors/friendly500.html" />
  <error statusCode="503" redirect="errors/friendly503.html" />
</customErrors>

The page can have either an .aspx or .html extension, but keep in mind that if the server is having problems there’s no sense in trying to deliver another dynamic page. Keep it static.

One caveat: IE will try to display a friendly error message of its own unless the error page is over 512 bytes, so put some text on it.
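A quick way to sanity-check the page size before deploying is a few lines of Python. This is only a sketch: the function name is made up for illustration, and the 512-byte default is the threshold commonly documented for IE's friendly HTTP error messages on a 500 response, so treat it as an assumption and adjust it if your case differs.

```python
import os

# Assumed threshold (in bytes) below which IE swaps in its own error page.
IE_FRIENDLY_ERROR_THRESHOLD = 512


def exceeds_ie_threshold(path, threshold=IE_FRIENDLY_ERROR_THRESHOLD):
    """Return True if the file at `path` is strictly larger than `threshold` bytes."""
    return os.path.getsize(path) > threshold


if __name__ == "__main__":
    # Hypothetical path; point this at your own static error page.
    page = "friendly500.html"
    if os.path.exists(page):
        print(page, "is large enough:", exceeds_ie_threshold(page))
```

If the check comes back False, a paragraph or two of helpful text on the page is usually enough to push it over the line.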

As our existing 404 page is essentially a sitemap, we quickly realized that we could simply duplicate it as ‘error.html’ and with a few text changes, use that. Users now get a friendly “Oops!” message, and spiders and users alike have a variety of useful links enabling them to continue navigating the site instead of going elsewhere.

Results?

A snapshot report from Google WMT in July showed 218 server errors that happened during their crawls in the previous two weeks. Today, there are none listed at all. (To be fair, the tech guys have been doing loads of work to make things run better as well, and credit where credit is due.) But we can also now see in our statistics that regardless of what GoogleBot is seeing, the error page has actually loaded…um…let’s just say “rather a lot” this month so far, and we can now start assembling solid numbers on how much the server issues are affecting the user experience, and use them to argue for even more improvement work on the backend.
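For anyone who would rather pull numbers straight from the server logs than from page-view statistics, here is a rough sketch of the counting step in Python. It assumes Apache's common/combined access log format, and the function name is made up for illustration; it is not what we actually ran.

```python
import re
from collections import Counter

# Matches the request and status portion of a common/combined log line,
# e.g. "GET /page HTTP/1.1" 500
LOG_LINE = re.compile(r'"(?:[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) ')


def count_server_errors(lines):
    """Count 5xx responses per requested path from an iterable of log lines."""
    errors = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match and match.group("status").startswith("5"):
            errors[match.group("path")] += 1
    return errors


if __name__ == "__main__":
    import sys

    # Usage: python count_errors.py < access.log
    for path, hits in count_server_errors(sys.stdin).most_common(10):
        print(hits, path)
```

Run against a day's access log, this gives a per-URL tally of server errors that can sit alongside the error-page hit counts.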

Posted in geek, seo

Does new equal better at Google now?

January 2nd, 2008 Jason

An interesting question on the Google Operating System blog, related to my post about Google’s quick indexing: does the new hi-speed indexing mean that newer pages are being artificially weighted to rank higher?

The argument is that a brand-new page won’t have a bunch of backlinks pointing to it, so there’s no reason it should appear near the top of the SERPs directly after being indexed…unless Google is giving greater importance - at least temporarily - to newer content.

Maybe it’s only true for whatever they currently identify as ‘hot topics,’ but it’ll be worth watching. I may try a little experiment to document later today or next week; I’ve certainly seen Google give a nice big boost to a newly indexed page before it fell off into a more stable position. If, by interpreting ‘more recent’ content as ‘more relevant’ content, Google has slipped up here, it’s only a matter of time before the blogspammers start capitalizing on it by simply posting more crap more frequently to maintain consistent high rankings.

Posted in Google, seo

5 important search developments of 2007

December 31st, 2007 Jason

I’m not saying these are the ‘top 5 most important’ changes of 2007; I’m just pointing out some things I’ve seen as significant. There are certainly more (like the whole paid links debate), but I’m on holiday, so I’m stopping at 5 I find worth mentioning. In no particular order:

  1. No more supplemental index
    Supposedly, this means more relevant results for all searches, all the time. (Interesting, since I thought that was the goal anyway…) But probably a key thing here is more relevance for foreign language queries as well, which may ‘translate’ into Google getting a bigger slice of the bits of foreign search they don’t already have. I think it will also mean less confusion about just how deeply/thoroughly a site is indexed.
  2. Sphinn
    Sphinn is no Digg. Only SEOs are going to see any traffic boost from Sphinn; it’s not something that Bob’s Widgets is going to try to game for traffic and links. But it has quickly become invaluable as a means of connecting the vast network of search marketers out there and bringing attention to important and interesting news or opinion…without having to monitor eight hundred blogs every day.
  3. Universal search results
    Of course, with Google’s acquisition of YouTube happening this year as well, it followed that YouTube content would start getting a higher visibility in the SERPs, but Google and Yahoo! both started integrating video, news, and image results into the ‘main’ results page this year, and it all seemed conspicuously timed as a response to Ask’s big facelift. But it’s much more than a presentational change; it’s really completely affected how search marketing works and shifted the focus of what’s important to get noticed and rank well.
  4. Facebook?
    Sure, Facebook has been around for a few years, but it was this year that anybody with an email address (i.e., not just an academic one) could join, and it blew up into the place to be. And now everybody and their dog’s company thinks they need to build a Facebook app. I think it remains to be seen whether a good Facebook app has real SEO benefit, but it can be a big deal for brand recognition, which of course can have a real downstream impact on what people are searching for.
  5. I got a job
    Okay, this is a cheese-out, but it’s true. Landing in SEO seems to have really taken all the bits of technology and marketing and general geekery I’ve been cobbling together over the years and focused them all into a very clear path.

Posted in seo, social networking

How quick is Google?

December 8th, 2007 Jason

I was at PubCon’s “Tools of the Trade” session yesterday afternoon. I took some notes, but missed a URL that I found myself wanting to check out this evening.

Todd Malicoat (aka Stuntdubl) was speaking, and mentioned a bookmarklet he used which would give a listing of the sites associated with an IP block. Right, so, Google: “stuntdubl number of sites on IP”

Number two is a blog entry titled “Tools of the Trade,” which is the seoroundtable liveblog entry from the session. With a reference to the tool I’m after: seolog’s Reverse IP domain tool. Boom.

I know that Google has gotten really good in the last few months with basically “instant indexing,” but this is the first time I’ve really seen it in action. Nice. A little scary, but impressive and powerful as well.

Posted in Google, pubcon, seo, tools

Experimenting with H1 headers

October 10th, 2007 Jason

I’m driving a little experiment on one of our sites, based on something I noticed on the W3C home page.

With images on, the page looks like this:

W3C home page

With images off, you get this:

W3C home page - images off

“Basic image-replacement CSS,” you may think, but it’s not even that involved. The text is, indeed, just the alt text for the image.

<h1 id="logo"><img alt="The World Wide Web Consortium (W3C)" height="48" width="315" src="/Icons/w3c_main" /></h1>

OK, big deal, right?

Well, what caught my eye was the <h1> wrapping around the logo image. With images off, the alt text renders as an <h1>-level heading. In this case, it’s completely proper for the site, since it is at the very top of the page. But in my understanding, this is also what a search spider would see when crawling the page.
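To get a rough feel for what a text-only reader of the markup ends up with, here is a small Python sketch that pulls the text of each <h1>, substituting an image's alt text where the image sits. It only approximates the idea: the class and function names are invented for the example, and how real search spiders actually treat alt text inside headings is an assumption this sketch can't confirm.

```python
from html.parser import HTMLParser


class HeadingTextExtractor(HTMLParser):
    """Collect the text of each <h1>, using alt text in place of images."""

    def __init__(self):
        super().__init__()
        self.in_h1 = False
        self.parts = []      # text fragments of the <h1> currently open
        self.headings = []   # finished heading strings

    def handle_starttag(self, tag, attrs):
        if tag == "h1":
            self.in_h1 = True
            self.parts = []
        elif tag == "img" and self.in_h1:
            alt = dict(attrs).get("alt")
            if alt:
                self.parts.append(alt.strip())

    def handle_data(self, data):
        if self.in_h1 and data.strip():
            self.parts.append(data.strip())

    def handle_endtag(self, tag):
        if tag == "h1" and self.in_h1:
            self.headings.append(" ".join(self.parts))
            self.in_h1 = False


def heading_text(html):
    """Return a list with the text-only rendering of every <h1> in `html`."""
    parser = HeadingTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.headings
```

Feeding it the W3C snippet above yields a single heading whose text is the logo's alt attribute, which is the same thing the images-off view shows.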

I did loads of digging around to see if anyone was talking about the potential to exploit this. Surely, if the header logo of your site could be replaced by a big fat relevant <h1> keyword or two at the top of every page, this would be something everybody knew about, right?

Apparently not.

So, we’re trying it out. One of our sites uses an image for the header tagline, and the alt text needed attention anyway. We’ve now wrapped the logo and tagline images in an <h1> tag, so with images off we have a keyword-rich, branded header appearing sitewide. I’m not convinced that this will be any kind of SEO silver bullet, but it will be interesting to see what - if anything - happens with traffic for the keyword. I don’t think it will hurt, although there is some question whether the header logo link will come off as any more spammy than normal. If nothing else, though, it’s a big step for usability.

Posted in seo