Nearly every SEO article you will ever read gives the same advice:
Create good content and you’ll rank higher.
How might Google define “good content” for SEO?
A better question: how would Google grade the millions of pages that could match a search query, and decide whether the content is “good”, “bad”, or “mediocre”?
Also, the Google ranking algorithm must judge what a good result is without the experience of a human subject-matter expert, across millions of subjects, some of which are new and emerging.
This article looks at how the ranking algorithm works (in a very basic sense), and what factors Google might use to measure content quality for any given query.
The ranking algorithm must determine what is a good match for any given query, in any given subject. How would a machine learning algorithm determine “good content” across a potentially infinite number of subjects?
One Component: Natural Language API
Google has made advances in understanding content through the Natural Language API. This machine learning component of their product processes words, extracts meaning, classifies “entities” (more on that in a minute), and analyzes sentiment in a web page or document. The goal of this initiative is to understand language as a human would understand it. By analyzing sentence structure, Google can look at a page of text and get semantic meaning from the page.
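To make this concrete, here is a rough sketch of the kind of output entity analysis produces. The field names below mirror the public Cloud Natural Language API's entity-analysis response (entities with a name, a type, and a salience score from 0.0 to 1.0 estimating how central the entity is to the text), but the response itself is fabricated for illustration; a real call would go through Google's client library with credentials.

```python
# Parse a response shaped like the Natural Language API's entity
# analysis output. The sample response is made up for illustration.

def top_entities(response, min_salience=0.1):
    """Return (name, type) pairs for entities above a salience cutoff,
    most salient first."""
    entities = sorted(response["entities"],
                      key=lambda e: e["salience"], reverse=True)
    return [(e["name"], e["type"]) for e in entities
            if e["salience"] >= min_salience]

sample = {
    "entities": [
        {"name": "Chicago", "type": "LOCATION", "salience": 0.62},
        {"name": "Illinois", "type": "LOCATION", "salience": 0.21},
        {"name": "pizza", "type": "OTHER", "salience": 0.04},
    ]
}
print(top_entities(sample))
# [('Chicago', 'LOCATION'), ('Illinois', 'LOCATION')]
```

The salience score is the interesting part for SEO: it suggests the machine isn't just spotting entities, it's estimating which ones a page is actually about.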
Nouns, such as business names, corporations, services, and cities, are what Google calls entities. Google defines an entity as:
“A thing or concept that is singular, unique, well-defined and distinguishable.”
Entities have connections to other entities in the real world. They can be things like a person, a business, an idea, or a concept. By understanding these connections through information gathered on the web, Google has put together what is known as the Knowledge Graph.
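The basic idea, stripped of all of Google's scale, is just a graph: entities as nodes, labeled relationships as edges. The entities and relation names below are my own illustrative choices, not Google's actual data model:

```python
# Toy sketch of a knowledge graph: entities as nodes, labeled
# relationships as edges. Names and relations are illustrative only.
from collections import defaultdict

class EntityGraph:
    def __init__(self):
        # entity -> list of (relation, related entity)
        self.edges = defaultdict(list)

    def relate(self, subject, relation, obj):
        self.edges[subject].append((relation, obj))

    def connections(self, entity):
        return self.edges[entity]

kg = EntityGraph()
kg.relate("Chicago", "located_in", "Illinois")
kg.relate("Chicago", "has_point_of_interest", "Willis Tower")
kg.relate("Chicago (band)", "is_a", "rock band")

print(kg.connections("Chicago"))
# [('located_in', 'Illinois'), ('has_point_of_interest', 'Willis Tower')]
```

Note that “Chicago” the city and “Chicago (band)” are separate nodes; disambiguating which one a searcher means is exactly the problem the real Knowledge Graph helps solve.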
Connecting the Dots: The Knowledge Graph
You can see part of the Knowledge Graph when you search for certain entities. For example, when you Google “Chicago”, the Knowledge Graph panel for the city of Chicago comes up on the right-hand side of the page.
You’ll see different statistical information about the city. Points of interest (which are also entities), other similar cities, and the State of Illinois are included as links.
Google knows that the majority of people are searching for the city, but on the chance that people are searching for the rock band Chicago, they include a link to that Knowledge Graph at the bottom.
Businesses like yours are also entities. If you have a Google Business Profile/Google Maps listing, your company should have a Knowledge Graph. When you search your business name, you should see a Knowledge Graph panel on the right-hand side of the desktop page. If you don’t see one, or you see the panel for a similarly named business, Google either has low confidence that your business is what people are looking for, or it doesn’t have adequate information on your company.
As Google collects information from around the web, it should be able to “know” where you are headquartered, what business category you operate in, and other details.
If you are a published author, of either a physically printed and distributed book or an e-book on Amazon, Google will have started a Knowledge Graph for your author information.
Famous business people, entertainers, athletes, and other celebrities usually have a more extensive Knowledge Graph, as there is more information about them on the web. Some veteran Internet marketers have Knowledge Graphs because they created a profile on Freebase, a public database launched in 2007 and acquired by Google in 2010, which helped form the foundation of today’s Knowledge Graph.
To recap what we’ve covered so far, Google uses machine learning to understand the context of text on a page, or spoken words in YouTube videos through the Natural Language API. The search engine also uses the Knowledge Graph to make connections between entities, assigning more meaning to the content of a page.
Another way Google may evaluate the quality of content on a web page is through task completion.
Search Intent and Task Completion
In order to deliver the best results for each search query, Google must understand what people are trying to find with a query, and what goal they’re trying to complete. That is search intent.
If you search Google for “auto parts store”, most likely you are looking for a parts store that is nearby, or an online store (e-commerce). It is very unlikely that you are looking for a Wikipedia page on a company like Napa Auto Parts or O’Reilly.
The search intent is what the majority of people are trying to accomplish when they type in a search.
When Google understands what the majority of people are looking for with a query, the search results page will be satisfying to most searchers. As a result, most of the results on page one of Google will satisfy the same search intent. If Google is unsure of what people want, the first page of search results may be mixed. There may be six or seven results that fit the same pattern, two or three results with an alternate task completion, and maybe one result with a third search intent.
For example, if you search “medical billing” from San Jose, California, there are a few different types of search intent on page one of Google.
- Six results are information on what medical billers do, with links to find a school. (Search intent: “I want to research medical billing as a career and find a school”.)
- Two results are information on medical billing and career outlook. (Search intent: “I want to know if medical billing is a viable career for me”.)
- One result is a job board for medical billing jobs. (Search intent: “I want to find a medical billing job”.)
- One result is a video showing a demo of medical billing software. (Search intent: “I want to find medical billing software”.)
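That mix of intents can be described as a simple distribution. Here is a sketch using the counts from the example above; the idea of scoring “how settled” a query's intent is belongs to me, not Google:

```python
# Estimate the dominant intent of a query from the mix of intents on
# page one. The counts are from the "medical billing" example above.
from collections import Counter

def dominant_intent(result_intents):
    """Return the most common intent and its share of page-one results."""
    counts = Counter(result_intents)
    intent, n = counts.most_common(1)[0]
    return intent, n / len(result_intents)

page_one = (["research career / find a school"] * 6
            + ["career outlook"] * 2
            + ["find a job"]
            + ["find software"])

intent, share = dominant_intent(page_one)
print(intent, share)  # research career / find a school 0.6
```

A share of 0.6 means the leading intent holds six of ten results; for an unambiguous query, you would expect that share to approach 1.0.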
Some search queries are less ambiguous than this, and the search intent is more clear-cut. The more specific the search query, the more likely Google is to return a single search intent across the first page of results. Those longer searches are usually referred to as “long tail keywords”.
Let’s look at another example. If someone searches “how to tie a Windsor knot”, they are looking for instructions on how to tie a formal necktie. A piece of content that has a video showing how to tie this knot, along with written step-by-step instructions and illustrations, might be considered “high quality”. It would make it easy for searchers to complete their intended goal, and give them different visual aids for completing their task.
As you can see, it is vital that Google can understand the implied search intent, even for queries it has not seen before.
Can Google Understand Queries They Haven’t Seen Previously?
Google says that 15% of the search queries it sees each day have never been seen before. So it must have a system for understanding these keyword searches.
In 2019, Google brought Bidirectional Encoder Representations from Transformers, or BERT for short, into Search. BERT is a neural-network-based natural language processing technique that works in conjunction with the Natural Language API. It was designed to better understand search queries and deliver the results people are looking for.
Now’s a good time to mention that what people want to find with the same keyword search can change over time.
Search Intent of a Query Can Change Over Time
Here’s a good example I can show you of search intent changing as time goes on. There’s a page on this site called The Ultimate Guide to Google Review Stars.
At the time it was originally published, when people searched “google review stars”, most were trying to figure out why their star-rating average on Google Business Profile wasn’t moving. A few years ago, Google used a Bayesian average for star ratings, so if you had four 5-star reviews, you’d still show a 4.8 average, and that confused people.
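A Bayesian average blends a page's own ratings with a prior, so a handful of reviews can't pin the average at a perfect 5.0. Google never published the prior it used; the prior mean and weight below are my own illustrative guesses, chosen so the arithmetic reproduces the 4.8 figure above:

```python
# Bayesian (smoothed) star-rating average: blend observed ratings with
# a prior. prior_mean and prior_weight are assumed values for
# illustration; Google never disclosed the ones it used.

def bayesian_average(ratings, prior_mean=4.0, prior_weight=1.0):
    """Weighted blend of a prior rating and the observed ratings."""
    total = prior_mean * prior_weight + sum(ratings)
    count = prior_weight + len(ratings)
    return total / count

# Four perfect 5-star reviews still display below 5.0 with this prior:
print(round(bayesian_average([5, 5, 5, 5]), 1))  # 4.8
```

The design intent is sensible: with no reviews at all, the score sits at the prior, and each new review pulls the average toward the true mean rather than letting one or two reviews dominate.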
The other thing people were searching for was when their review count would go up. In 2016, if you got a big influx of reviews on Google Business Profile, the review count wouldn’t move for a week or two, and people wanted to know why.
In September 2019, Google started taking away rating stars from organic results. Up to that point, people were using Schema markup or third-party review aggregators to make rating stars appear in normal search results. This was an advantage, as searchers would be drawn to results that stood out from the normal search results, and those would get more clicks.
Google saw that some people were manipulating Schema markup to get stars, so it shut the feature down for everyone. The one exception was third-party review sites like Yelp or HomeAdvisor, because regular business owners can’t manipulate those results.
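For reference, the markup involved was schema.org structured data with an AggregateRating, typically embedded as JSON-LD. The sketch below builds a minimal example; the business name and rating figures are made up:

```python
# Build a minimal schema.org AggregateRating JSON-LD block, the kind of
# markup that used to produce rating stars in organic results.
# The business name and rating figures are fabricated.
import json

markup = {
    "@context": "https://schema.org",
    "@type": "LocalBusiness",
    "name": "Example Plumbing Co.",
    "aggregateRating": {
        "@type": "AggregateRating",
        "ratingValue": "4.8",
        "reviewCount": "24",
    },
}

# The serialized string would go inside a
# <script type="application/ld+json"> tag in the page's HTML.
print(json.dumps(markup, indent=2))
```

Because anyone could publish numbers like these about their own site, it was an easy signal to fake, which is why Google stopped showing the stars.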
So that is one example of how the meaning of a query can change over time.
Be sure to check your content if your traffic or ranking is dropping for a page. It’s possible the meaning of the keyword search has changed.
A Factor Few People Talk About
In a June 2018 Google Webmasters Hangout, Google representative John Mueller confirmed that Google uses real user experience metrics from Chrome to evaluate speed and other factors. Mueller has also said in the past that Google has a method for measuring the search experience end to end without using Google Analytics. This is because Google Analytics is installed on only about half the websites on Earth.
The implication is that Google uses real user signals from Chrome to evaluate a web page, which would give it a good way to measure whether users are satisfied with the content.
Mythical Ranking Factors and Content
There are many factors that are often claimed to be ranking or content-quality signals but are not. In most cases, that’s because these signals can be manipulated or spammed. Let’s debunk a few of them.
Google Doesn’t Use Bounce Rate
Many SEOs have taught that a low bounce rate is connected to higher rankings. The “conventional wisdom” is that if a page is “high-quality”, people will go on to more pages on your site. But if searchers accomplish the goal they came for on that page and then leave, a high bounce rate shouldn’t be a harmful SEO factor.
Google has said several times that they don’t use bounce rate. Google representative Gary Illyes said this in 2017.
Could automatically generated content be considered quality content if it answers the query/intent of user? @methode Asking for a friend…
— William Wright, April 4, 2017 via Twitter
CONT: IF that content was automatically generated on a site based on previous site use/queries/actions?
— William Wright, April 4, 2017 via Twitter
Lines are super blurry there, I can't really answer in 140 characters. Try to create metrics that gauge user satisfaction & go from there
— Gary 鯨理／경리 Illyes, April 4, 2017 via Twitter
(bounce rate is not a good signal)
— Gary 鯨理／경리 Illyes, April 4, 2017 via Twitter
Matt Cutts said this in 2008 as well, stating that bounce rate is a noisy, spammable signal. Cutts was head of Google’s web spam team for many years.
These signals can be manipulated, and they are not good indicators of quality. Word count is also not necessarily a good indicator of quality, though it can correlate with quality for some informational queries.
Is Dwell Time a Sign of High Quality Content?
Google has never said one way or the other whether they use dwell time for ranking or as a sign of high quality. Dwell time is the average length of time users stay on a page.
But if bounce rate is a spammable signal, then dwell time is also a spammable signal.
I’m a firm believer that any ranking signal that can be manipulated with a Fiverr or Amazon Mechanical Turk gig is a bad signal. These are platforms where you can hire people to do small tasks for small sums of money. Where these become spammable signals is if an unscrupulous SEO hires several people to spend a lot of time on specific pages (dwell time) or go to several pages in a website (bounce rate).
Some SEOs believe dwell time is a ranking signal, because if people are on the page longer, it means they’re reading the article, finding value in that page.
As far as signals go, dwell time is a bit better than bounce rate. But it’s still a signal which can be manipulated, and it doesn’t tell whether the page content is high-quality or not.
Is Word Count a Signal of High Quality Content?
Several high-profile SEOs have published studies implying that higher word count is an indicator of content quality. Some studies do seem to show a correlation between longer content and higher rankings. Other SEOs argue that longer content does not cause higher rankings; it merely correlates with them. Another valid argument is that longer content is not necessary for every search query. In many cases, people want to find the information they are looking for and leave, and helping them do that quickly is itself an indicator of quality.
Depending on the search query, longer content may or may not be necessary in order to rank higher.
“Longer content is better content” has become such a mantra that there’s even a term for it: The Skyscraper Technique. The idea is to look at the highest-ranking articles for a keyword phrase, then make an article that is even more in-depth and longer. On the surface, this isn’t a terrible idea, but there are limits to how long articles can become before the technique is impractical. And length of content, by itself, is not an indicator of quality.
Most of the time, people don’t need, or want, a 10,000-word article; they just want the information they’re looking for.
There are certainly many cases where a long piece of content is more helpful, but only if the resource author is answering all the questions that users might have when they’re typing in that query.
In the end, if the content helps users get to their desired goal, easily and efficiently, that is the surest sign of quality.
Final Tips on Quality Content
SEO isn’t remotely the same as it was 15 years ago. The content that is the most helpful for a particular keyword search, and that delivers its information with the least friction, is going to rise to the top over time.
With that in mind, here’s a recap of what you should do to have “high quality” content for any given search query.
- Determine the search intent for a keyword phrase, and have a page that matches that intent.
- For each web page targeting a keyword phrase, include elements found on other top performing pages.
- Address and answer all questions a searcher might have in the page content.
- Double check that search query intentions have not changed in the last few years. Adjust the content of the page accordingly.
- Make the page easily scannable. Break up the content with headlines that direct people to the information they need, and use an intuitive visual hierarchy in your typography.
- Make the content as long as it needs to be, and as short as it can be, while answering every question and satisfying the intent of the search.
- Include any functionality that is necessary for searchers to reach their goals (tools, calculators, booking forms, contact form, e-commerce).
- The website must be user-friendly. Text should be easily readable by people with less than perfect vision, all color contrast should be accessible, and the content should be easy to scan and use both on mobile and desktop.
- Eliminate grammatical errors, and make the content easy to understand.
- Original research is a plus. If you reference authoritative studies, that is also a plus.
- Visual aids like images, illustrations, or videos can also make the page more valuable to users.
If you do these things and keep all these principles in mind, you will have higher quality content than the majority of your competition.
The quality of content is not about avoiding mistakes. It’s about helping people get to their goal with the least friction and the most clarity.