Making Better Use of Document Frequency

When you use SEO Hero, you work with a metric called document frequency (DF). We are not going to write a brain-breaking blog post about it, but SEO Hero users will find some useful advice here on making better use of DF.

At a statistical level, it is pretty simple to understand. The document frequency of a word is the percentage of URLs where the given word occurs, within a set of web pages.
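To make the definition concrete, here is a minimal Python sketch. It is not how SEO Hero computes DF internally; the function name document_frequency, the simple tokenizer and the four sample pages are assumptions made up for illustration.

```python
import re

def document_frequency(pages):
    """Return {word: percentage of pages that contain the word}."""
    counts = {}
    for text in pages:
        # A set counts each word once per page, however often it repeats.
        words = set(re.findall(r"[a-z0-9']+", text.lower()))
        for word in words:
            counts[word] = counts.get(word, 0) + 1
    total = len(pages)
    return {word: 100.0 * n / total for word, n in counts.items()}

# Hypothetical page texts standing in for a set of crawled URLs.
pages = [
    "Google Penguin targets bad links",
    "Recover from Google Penguin by removing toxic links",
    "Google ranking basics",
    "Anchor text and Google results",
]

df = document_frequency(pages)
print(df["google"])   # 100.0 (appears on 4 of 4 pages)
print(df["penguin"])  # 50.0  (appears on 2 of 4 pages)
print(df["toxic"])    # 25.0  (appears on 1 of 4 pages)
```

Note that DF only asks whether a word appears on a page, not how many times it appears there.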

Many people focus only on words that show a strong DF (above 40%, for example), thinking that these words are the ones that really matter. Yes, they matter, but focusing on them alone is a big SEO mistake. If you were doing it, we’re sure you won’t do it anymore.

Of course, words with a DF above 40% are very important, but if you make these terms your primary focus, you are sure to miss the topic entities that really matter.

If you really want to outrank your competitors on Google results pages, here’s a little secret: gold mine terms are usually not the ones with a really high document frequency.

This is also pretty easy to understand: words with a high DF are the ones that make your content coherent from Google’s point of view. In fact, it would be strange to try to rank a web page about “Google Penguin” without mentioning “Google”. Right?

Now, missing one important word is something that can happen, but imagine you are also missing “website”, “SEO”, “ranking” and “results”. A page that is built to rank for Google Penguin is unlikely to miss all of these obvious terms. Without these entities, the page does not seem coherent, and consequently your page is simply not a candidate for ranking. So what’s the point?

Because these words are so obvious, they are not enough to earn your content high organic rankings. Think about it for a second: if these words have a strong DF, it means that a lot of your competitors are using them. The conclusion? These words are not the ones that make your content stand out.
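Before hunting for rarer terms, it can still be worth checking that a draft covers the obvious ones. The sketch below is only an illustration: missing_obvious_terms is a hypothetical helper, the 40% threshold is the example figure used earlier in this post, and the DF values are invented, not real SEO Hero data.

```python
import re

def missing_obvious_terms(df, page_text, threshold=40.0):
    """List high-DF single words that the page text never mentions."""
    page_words = set(re.findall(r"[a-z0-9']+", page_text.lower()))
    return sorted(
        word for word, pct in df.items()
        if pct >= threshold and word not in page_words
    )

# Invented DF percentages for a "Google Penguin" query (illustrative only).
df = {
    "google": 95.0,
    "seo": 80.0,
    "website": 70.0,
    "ranking": 60.0,
    "results": 55.0,
    "recovery": 20.0,
}

draft = "Our guide to Penguin recovery for your website"
missing = missing_obvious_terms(df, draft)
print(missing)  # ['google', 'ranking', 'results', 'seo']
```

An empty list here only means the draft covers the basics. As explained above, that makes the page a candidate, not a winner.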

By using SEO Hero more carefully and spending a little more time on our latent semantic tables, you will easily find “Google Penguin-specific entities”. In fact, by looking closer and deeper (including at terms with a low DF), you will find what we like to call “gold mine terms”, such as “toxic links”, “anchor text”, “bad links” and “recovery”.
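If you export a DF table, a few lines of Python can pull out the lower-DF terms for review. This is a rough sketch under stated assumptions: low_df_candidates is a hypothetical helper, the 40% cut-off is the example figure from this post, and the DF values are invented. DF alone cannot tell you which low-DF terms are truly topic-specific, so the output is a shortlist to read through, not a final answer.

```python
def low_df_candidates(df, threshold=40.0):
    """Return (term, DF) pairs below the threshold, highest DF first."""
    below = [(term, pct) for term, pct in df.items() if pct < threshold]
    return sorted(below, key=lambda pair: pair[1], reverse=True)

# Invented DF percentages (illustrative only, not real SEO Hero output).
df = {
    "google": 95.0,
    "seo": 80.0,
    "anchor text": 30.0,
    "toxic links": 25.0,
    "bad links": 20.0,
    "recovery": 15.0,
}

candidates = low_df_candidates(df)
for term, pct in candidates:
    print(f"{term}: {pct}%")
# anchor text: 30.0%
# toxic links: 25.0%
# bad links: 20.0%
# recovery: 15.0%
```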

Remember why these words are really important: writing coherent content is not enough. Your content must also be specific and stand out.

