Google's April Algorithm Changes: Panda 3.5 19th April and Penguin 24th April
Google has released a number of algorithm updates this month. The following is a brief description of the changes Google made in April. I wrote this summary for our internal users and thought it might be useful for the wider internet.
Penguin 24th April
Google's latest update, the Penguin update, launched on April 24. It is a change to Google's search results designed to remove pages that have been spamming Google. Spamming in this case means doing things that violate Google's guidelines, such as keyword stuffing, "hiding text" or cloaking.
Panda 3.5 19th April
On the 19th an update to the Panda algorithm was launched. Panda is an algorithm designed to promote higher quality sites over lower quality ones.
Parked Domains Problem April 17th
Google also made a rare admission that it had made a mistake. On the 17th April it had a problem that incorrectly identified sites as parked domains. A parked domain is one that you own but that has no content apart from a holding page.
This is an executive summary of a longer post at Search Engine Land here

Best AdSense Fail or Scary Devil Nunnery Recruiting and an SEO Fail on jobs.guardian.co.uk
Whilst perusing the Guardian's job section to analyse the platform they use, I found a number of ways to completely mess up that entire section of the site, and I also found what must be the strangest AdSense advert of all time.
Having managed to create arbitrary pages on the job site, I took a look at the AdSense served up at the base of the page, which is shown here (note the faked page I created was IT related).

Though I must say holding SEO audits in the style of the “Congregation for the Doctrine of the Faith” does appeal sometimes, especially when one comes across pages whose markup could best be described as “Your aving a laugh mate”. Though I suspect that HR might whinge if we took people down to the basement for the “shewing of the instruments”.

HPCC High Performance Computer Cluster Open Sourced
I love my job. I get to play with large amounts of data and some cool new cutting-edge cloud-based toys such as Map Reduce and Mahout, plus some interesting Web 2.0 machine learning and AI type algorithms.
Map Reduce is a software framework developed by Google to allow processing of large datasets on clusters of commodity computers. In an odd coincidence, the Map stage of Map Reduce is effectively the same approach we used at Telecom Gold to process the large logs in the Telecom Gold billing system, with a system called GLE, Generic Log Extract (written in PL/1).
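To make the map and reduce stages a little more concrete, here is a minimal single-machine sketch in Python. It is only an illustration of the idea, not how Hadoop or GLE actually work: the "user, seconds" log format and all the function names are made up for the example.

```python
from collections import defaultdict


def map_phase(log_lines):
    """Map: turn each raw log line into a (key, value) pair.

    The 'user, seconds' line format is a hypothetical example.
    """
    for line in log_lines:
        user, seconds = line.split(",")
        yield user.strip(), int(seconds)


def shuffle(pairs):
    """Group all values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce: collapse each key's list of values into a single total."""
    return {key: sum(values) for key, values in grouped.items()}


def run_job(log_lines):
    """Run map, shuffle and reduce in sequence on one machine."""
    return reduce_phase(shuffle(map_phase(log_lines)))


sample_log = ["alice, 30", "bob, 12", "alice, 5"]
totals = run_job(sample_log)
print(totals)
```

On a real cluster the map calls run in parallel on different chunks of the log and the framework handles the grouping, but the shape of the job is the same.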
After some hacking I have got a small test cluster up and running to try out Map Reduce for some interesting work on clustering documents, in this case web pages on some well-known large websites.
I was having some difficulty getting Mahout, an open source set of algorithms, to perform clustering of documents using Map Reduce, when almost by chance I found that our parent company has its own system. HPCC (High Performance Computing Cluster) is a massively parallel processing computing platform that solves the same Big Data problems that Map Reduce is used for.
HPCC used to be just an internal system developed by LexisNexis and has been used for LexisNexis customers for the past decade. But recently, i.e. last week, HPCC was open sourced. As with Hadoop there is a web-based interface.

ECL Watch the Web front end for HPCC
There is also a Windows IDE which connects directly to an HPCC cluster to allow you to run ECL, the declarative, non-procedural language used to program jobs to be run on your HPCC cluster.

There is a test virtual machine available for download here to allow people to test HPCC and learn ECL. Binaries for CentOS and Red Hat are available here, and source should be available in a few weeks.

Panda Update Hits the UK and All English Queries
It looks like the infamous Google Panda update has arrived outside of the USA. According to Google, the Panda update (sometimes called Farmer/Panda) is meant to better identify low-quality pages and sites.
These are the sort of pages (often seen on content farms) with text that is automatically tuned to match the query but that may not provide the best user experience. (Google apparently calls it a high quality sites algorithm.)
I am due to help give a presentation on SEO to a group of RBI's developers on Wednesday, so guess who's going to be quickly revamping the presentation deck tomorrow, as well as grovelling in the Analytics data to see if any of our sites have been hit.
Though it's my boss that will be fielding the calls from senior management, I am glad to say. Coverage here and Google's own blog here

New Functionality in GWT: Non Informative Title Tags and Non Indexable Content
Google have just launched some new functionality in GWT (Google Webmaster Tools): two new items in the HTML suggestions, Non Informative Title Tags and Non Indexable Content.

Could be useful in diagnosing problems in sites that need fixing, especially as a non-informative title tag is a big low quality signal.
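If you want a rough first pass over your own pages before GWT flags them, something like the Python sketch below would do. Note that this is purely my own guess at what "non-informative" might mean: the list of generic titles is an assumption for illustration and has nothing to do with whatever Google actually checks.

```python
from html.parser import HTMLParser

# Hypothetical list of titles treated as non-informative (an assumption,
# not Google's criteria). An empty string covers a missing or blank title.
GENERIC_TITLES = {"", "home", "index", "untitled", "untitled document", "new page"}


class TitleParser(HTMLParser):
    """Collects the text found inside the <title> element."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def extract_title(html):
    """Return the page title with surrounding whitespace removed."""
    parser = TitleParser()
    parser.feed(html)
    return parser.title.strip()


def looks_non_informative(html):
    """True when the title is missing, blank, or in the generic list."""
    return extract_title(html).lower() in GENERIC_TITLES
```

Run it over a crawl of the site and eyeball whatever it flags.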

Steam Punk Sarah Palin
Comics, or graphic novels if we are being pretentious, have had some odd one-offs and crossovers, and recently a genre called steam punk, which mixes science with Jules Verne and HG Wells era SF, has become popular.
Today I was browsing some Gawker properties to see if they had fixed the major JavaScript snafu they had.
And what did I find…. Drum roll please! Ladies and Gentlemen I give you Steam Punk Sarah Palin.

One reviewer commented:
“Steampunk Palin defies classification into any literary genre, unless there’s a genre I’m unaware of simply called ‘WTF?!?’”
It seems to be in so-bad-it's-good territory. I can't wait for the film. A review is here
and thus been fairly or unfairly labelled as “Goths who decided it might be fun to wear brown for a change.”

RIP Gladys Horton of The Marvelettes
Sad to see that Gladys Horton, one of the founders of the Marvelettes, has recently passed away. I saw her obit in the Guardian the other day. I thought I should post a link to one of my favourite Marvelettes tracks, from an early Harlem Apollo show in '63.
As you can see they were doing the moonwalk years before Michael Jackson, and in high heels!

Bing Copying Google Throws Toys out of Pram
Oh dear, it sounds like Google is getting upset over Bing using Google's results to improve its own, going by the write-up on Search Engine Land here.
Google has run a sting operation that it says proves Bing has been watching what people search for on Google and the sites they select from Google's results, then using that information to improve Bing's own search listings. Bing doesn't deny this.
Reverse engineering is legal, otherwise we would all still be using IBM PCs. Google should just man up and take it as a compliment.

Black Templar Space Marines WH40K
I have been thinking about doing some WH40K gaming and on a visit to the mother ship at
Home