Statistically Improbable Phrases

In December of 2005, we introduced the concept of Statistically Improbable Keywords (SIKs).

Today, we’re extending that concept to “Statistically Improbable Key Phrases” (SIPs). These can be found only by clicking on the EzinePublisher view of any article. Consider this feature a beta: we wanted to get it out there so that we can move on to phase 2, making it more valuable.

Let’s look at an example:

Carrie Reeder’s article, “3 Things to Look For in a Debt Management Company Online,” returns this for the SIKs/SIPs:

Statistically Improbable Keywords In This Article:
debt – 5%
company – 4%
companies – 3%
online – 1%
amount – 1%
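
One plausible reading of these percentages is plain keyword density: each word's share of all the words in the article. The sketch below illustrates that reading only. It is an assumption, not the routine that produced the numbers above, and the function name `keyword_density` and the sample text are hypothetical.

```python
import re
from collections import Counter

def keyword_density(text, top=5):
    """Return the `top` most frequent words with their share of all words, in percent.

    Assumption: "percentage" means occurrences of a word divided by total words.
    """
    words = re.findall(r"[a-z']+", text.lower())
    total = len(words)
    if total == 0:
        return []
    counts = Counter(words)
    return [(word, round(100 * n / total)) for word, n in counts.most_common(top)]

# Hypothetical sample text, not the article discussed above.
sample = "debt company debt online debt company amount debt"
print(keyword_density(sample, top=2))  # [('debt', 50), ('company', 25)]
```

The list above contains no words such as “the” or “with”, so the real routine presumably filters common words or compares against other articles in some way; this sketch does neither.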

Statistically Improbable Phrases In This Article:
online debt management company with so
companies negotiate with credit card companies to
make sure that the company you deal with
off of the debt shop around for a
your credit card debt it
enough with the credit card companies or they
amount of your debt
to shop around for a company you are
companies online that
of debt so you
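
The phrases above are runs of consecutive words from the article. As a rough illustration of how such phrases can be pulled out of text, the sketch below counts word n-grams and keeps those that repeat. It is not the routine behind this feature: the function name `repeated_phrases`, its parameters, and the sample text are hypothetical, and it does not compare against other articles, which a truly “statistically improbable” measure would need to do.

```python
import re
from collections import Counter

def repeated_phrases(text, n=2, min_count=2):
    """Return word n-grams that occur at least `min_count` times, most frequent first."""
    words = re.findall(r"[a-z']+", text.lower())
    # Slide a window of n words across the text and join each window into a phrase.
    ngrams = (" ".join(words[i:i + n]) for i in range(len(words) - n + 1))
    counts = Counter(ngrams)
    return [(phrase, c) for phrase, c in counts.most_common() if c >= min_count]

# Hypothetical sample text, not the article discussed above.
sample = ("Shop around for a debt management company. "
          "A good debt management company will negotiate with credit card companies.")
print(repeated_phrases(sample, n=2))
# [('debt management', 2), ('management company', 2)]
```

Raising `n` yields longer phrases like the ones listed above; lowering it yields shorter, denser ones.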

We’re really not sure what to do with this data, but if you can help us understand how it could be useful, we’ll be able to tweak the settings that produce this SIP meta-data (data about data).

[March 22, 2006 UPDATE: Thanks to Phil for catching our spelling error in this blog entry. Fixed. -CK]


Phil writes:

As a writer and editor, you really should have noticed that the title of this piece – Statistically Improbably Phrases – is nonsense. You compound the felony by repeating it in another form – “Statistically Probably key Phrases”.

‘Improbable’ is the adjective; ‘improbably’ the adverb.

Comment provided March 22, 2006 at 5:00 PM


Programmer #1 writes:

This routine has been adjusted to produce shorter, denser words/phrases, which I think gives better results.

The new results for this article, for example, are:
debt management
online debt
debt shop
companies online
your debt
the company
company you
make sure
credit card

Comment provided March 27, 2006 at 12:58 PM


Jan Verhoeff writes:

I use the long tail data as information specifically for searching my articles when I think they may be misused.

Other than that I think the long tail data may be important for web crawlers.


Comment provided August 30, 2007 at 9:36 PM

