Top Lessons from the Google Leaked Algorithm

Top 10 Insights and Revelations for SEOs from the Google Algorithm Leak

The SEO world is abuzz with revelations from a massive leak of Google’s internal search ranking documents. This unprecedented leak, which includes over 14,000 potential ranking features, provides an extraordinary glimpse into the intricacies of Google’s closely guarded search algorithms. For years, SEO professionals have speculated, tested, and theorized about what really drives rankings in Google’s search results. Now, with this leak, we have concrete evidence that confirms some long-held suspicions and introduces new, surprising factors that influence search rankings. This blog post will delve into ten of the most important revelations from the leak, each of which has significant implications for SEO strategies.

1. NavBoost and Click Data

One of the most groundbreaking revelations from the leak is the confirmation of NavBoost, a system Google uses to track and analyze click data on search results. This system has been in place since at least 2005 and has evolved over the years to become a critical component of Google’s ranking algorithm. NavBoost collects data on various types of clicks, including good clicks, bad clicks, long clicks, and short clicks, to determine the quality and relevance of a search result.

Google representatives have repeatedly denied that click data is used as a ranking factor. For example, Google’s Search Liaison Danny Sullivan has stated on numerous occasions that click-through rate (CTR) is not a direct ranking signal. However, the leaked documents reveal that Google does, in fact, use click data extensively to influence rankings. This includes data from Chrome and Android devices, meaning that user interactions on these platforms can directly impact search result rankings.

NavBoost’s impact extends beyond individual clicks; it also analyzes patterns of user behavior. For instance, if many users search for a particular keyword, click on a specific result, and stay on that page for a significant amount of time (indicating satisfaction), that result is likely to receive a boost in rankings. Conversely, if users quickly return to the search results page after clicking a result (a behavior known as pogo-sticking), the result may be demoted.

This revelation underscores the importance of user engagement and satisfaction in SEO strategies. To leverage this, website owners should focus on creating high-quality, engaging content that meets the needs and expectations of their audience. This involves optimizing page load times, improving user experience (UX), and ensuring that content is both relevant and valuable to visitors.
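The leak names the click categories (good, bad, long, short) but not the thresholds Google applies. As a minimal sketch of how dwell-time-based click classification might work, here is a toy model in which every threshold and weight is invented for illustration:

```python
# Hypothetical sketch of dwell-time-based click classification.
# The category names come from the leak; the thresholds and weights
# below are invented for illustration only.

def classify_click(dwell_seconds: float, returned_to_serp: bool) -> str:
    """Label a click on a search result by how long the user stayed."""
    if returned_to_serp and dwell_seconds < 10:
        return "bad_click"       # pogo-sticking: quick bounce back to results
    if dwell_seconds >= 60:
        return "long_click"      # strong satisfaction signal
    if dwell_seconds >= 30:
        return "good_click"
    return "short_click"

def navboost_score(clicks: list[tuple[float, bool]]) -> float:
    """Aggregate a result's clicks into a crude boost/demote score in [-1, 1]."""
    weights = {"long_click": 1.0, "good_click": 0.5,
               "short_click": 0.0, "bad_click": -1.0}
    if not clicks:
        return 0.0
    total = sum(weights[classify_click(d, r)] for d, r in clicks)
    return total / len(clicks)
```

In this toy model, a result that accumulates long clicks moves toward +1 while one that accumulates pogo-sticking moves toward -1; the real system is certainly far more nuanced, segmenting clicks by query, locale, and device.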

2. Seven Types of PageRank


The leak also uncovers the existence of seven different types of PageRank, a far cry from the single PageRank metric that has been widely discussed in the SEO community. These variations include the deprecated ToolBarPageRank, which was once visible to webmasters through the Google Toolbar, and other types like pageRank_NS (Nearest Seed), which is associated with document understanding.

Historically, Google has downplayed the importance of PageRank since the public removal of the ToolBarPageRank metric. The common narrative has been that while PageRank is still a factor, it is just one of many and not as influential as it once was. However, the leaked documents suggest otherwise, revealing that PageRank in its various forms remains a foundational element of Google’s ranking algorithm.

One particularly interesting type is the Nearest Seed PageRank, which focuses on the relevance and proximity of a webpage to a localized subset of the network around seed nodes. This method ensures that pages within a specific topic cluster are considered more relevant, effectively replacing the broad, web-wide PageRank formula with a more focused approach.

For SEOs, this means that building high-quality backlinks is still crucial. However, the emphasis should be on obtaining links from authoritative sites within relevant niches. This approach not only boosts the traditional PageRank but also enhances the Nearest Seed PageRank by situating the website within a robust network of relevant, high-quality links.
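The leak does not include the Nearest Seed formula, but the description resembles topic-sensitive (personalized) PageRank, where the random jump returns to a set of trusted seed pages rather than to any page on the web. A small power-iteration sketch under that assumption, with an illustrative four-page graph:

```python
# Sketch of seed-biased (personalized) PageRank via power iteration.
# The leak names pageRank_NS but not its exact formula; the graph,
# seed set, and damping factor here are illustrative assumptions.

def seeded_pagerank(links, seeds, damping=0.85, iters=50):
    """links: {page: [outlinked pages]}; seeds: pages the random jump returns to."""
    pages = list(links)
    jump = {p: (1.0 / len(seeds) if p in seeds else 0.0) for p in pages}
    rank = {p: 1.0 / len(pages) for p in pages}
    for _ in range(iters):
        nxt = {p: (1 - damping) * jump[p] for p in pages}
        for p, outs in links.items():
            if outs:
                share = damping * rank[p] / len(outs)
                for q in outs:
                    nxt[q] += share
            else:  # dangling page: redistribute its mass to the seeds
                for q in seeds:
                    nxt[q] += damping * rank[p] / len(seeds)
        rank = nxt
    return rank

# Pages linked from the seed's neighborhood keep rank; isolated
# pages with no inlinks from that neighborhood decay toward zero.
web = {"seed": ["a", "b"], "a": ["b"], "b": ["a"], "off_topic": ["a"]}
ranks = seeded_pagerank(web, seeds={"seed"})
```

The key difference from classic PageRank is the `jump` vector: instead of teleporting uniformly across the whole web, the walk restarts only at seed nodes, so authority radiates outward from the trusted cluster.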

3. Site-wide Authority Metrics


Another significant discovery is Google’s use of site-wide authority metrics. This includes data collected from Chrome browsers, which Google had previously denied using for ranking purposes. The leaked documents show that Google tracks various site-wide signals, such as traffic from Chrome users, to assess the overall authority and credibility of a website.

This contradicts Google’s long-standing claims that they do not use domain authority metrics. For instance, John Mueller, a Google Webmaster Trends Analyst, has often stated that Google does not have a domain authority metric similar to those offered by third-party tools like Moz’s Domain Authority (DA). However, the leak reveals that Google indeed uses metrics that evaluate the overall authority of an entire domain.

These site-wide authority signals are essential for determining the trustworthiness and reliability of a website. Factors such as consistent traffic, user engagement, and overall site performance contribute to this assessment. Websites with high site-wide authority are more likely to rank well across various queries, not just for individual pages.

To optimize for these metrics, SEOs should focus on building a strong, authoritative domain. This involves regularly publishing high-quality content, earning backlinks from reputable sources, and maintaining a positive user experience site-wide. Additionally, ensuring that the site performs well in terms of speed, mobile-friendliness, and security (e.g., HTTPS) can enhance its authority signals.

4. Site Embeddings and Page Embeddings

The concept of embeddings, which Google uses for both sites and individual pages, is another intriguing revelation. Embeddings are a way of representing text and other data in a high-dimensional space, allowing Google to measure the relevance and context of content more accurately. The leaked documents mention several embedding-related attributes, including siteFocusScore, siteRadius, and siteEmbeddings.

The siteFocusScore attribute indicates how focused a site is on a specific topic, while siteRadius measures how far page embeddings deviate from the site’s overall embedding. This means that Google creates a topical identity for a website and evaluates each page’s content against this identity. Pages that closely align with the site’s main topic are likely to rank higher, while those that deviate may not perform as well.

Google’s use of embeddings highlights the importance of topical relevance and consistency in content creation. SEOs should aim to establish a clear topical focus for their websites and ensure that all content supports this focus. This involves conducting thorough keyword research to identify relevant topics and creating content clusters that comprehensively cover these topics.

For example, a website focused on digital marketing should create content around various subtopics like SEO, social media marketing, email marketing, and PPC advertising. Each piece of content should not only be relevant to the main topic but also interlinked to create a cohesive content network. This approach helps Google understand the topical identity of the site and improves its overall relevance and ranking potential.
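The leak names siteFocusScore and siteRadius without defining how they are computed. One plausible reading, sketched below with toy two-dimensional vectors, is that the site embedding is the centroid of its page embeddings and the radius is the average cosine distance of pages from that centroid; the math here is an assumption, not the documented formula:

```python
import math

# Hypothetical sketch: siteEmbeddings as the centroid of page
# embeddings, siteRadius as the mean cosine distance of pages from
# that centroid. Attribute names are from the leak; the computation
# is an illustrative assumption.

def centroid(vectors):
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def cosine_distance(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def site_radius(page_embeddings):
    site = centroid(page_embeddings)
    return sum(cosine_distance(p, site) for p in page_embeddings) / len(page_embeddings)

# A tightly focused site: page embeddings point in nearly the same direction.
focused = [[1.0, 0.1], [1.0, 0.0], [0.9, 0.1]]
# A scattered site: page embeddings point in very different directions.
scattered = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.2]]
```

Under this reading, a small radius corresponds to a tight topical identity, which is exactly what the content-cluster strategy above aims to produce: every new page should pull the centroid less and sit closer to it.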

5. Quality Rater Feedback Integration


One of the more surprising revelations from the Google algorithm leak is the direct integration of Quality Rater Feedback into the ranking algorithms. Quality raters, using a platform known as EWOK, provide human evaluations of websites, which are then incorporated into Google’s search systems. This feedback includes detailed assessments of a page’s relevance, usability, and overall quality.

Historically, Google has downplayed the significance of quality raters, suggesting their evaluations are mainly used for training purposes and to validate updates. However, the leaked documents indicate that rater feedback is more directly influential. Modules in the leak mention a “per document relevance rating” sourced from these human evaluations, suggesting that their input has a tangible impact on search rankings.

For SEO professionals, this means that optimizing for human evaluation is more critical than ever. Websites should strive to meet the criteria outlined in Google’s Search Quality Evaluator Guidelines. These guidelines emphasize Experience, Expertise, Authoritativeness, and Trustworthiness (E-E-A-T), along with overall user experience. Ensuring content is well-researched, accurately cited, and presented in a user-friendly manner can help improve a site’s ratings from quality raters, which in turn can positively influence its search rankings.

6. Whitelists for Specific Queries

The leak also reveals the existence of whitelists that Google uses for certain sensitive queries. These whitelists ensure that specific, trusted sites are prominently displayed for searches related to critical or controversial topics, such as travel, COVID-19, and elections. For example, Google maintains a list of “Good Quality Travel Sites” that are given preferential treatment in travel-related searches.

This practice contradicts Google’s public stance that their algorithms are impartial and do not favor specific sites. In reality, these whitelists act as a safeguard to ensure that authoritative and reliable sources are surfaced for queries that can significantly impact public perception and behavior.

For SEOs, understanding the existence of whitelists is crucial, especially for sites operating in sensitive or high-stakes industries. While it may be challenging to get on such whitelists, focusing on becoming a trusted authority in your niche is essential. This involves building a strong reputation through consistent, high-quality content and earning trust signals from other reputable sources. Additionally, ensuring your site complies with Google’s quality standards and guidelines can improve your chances of being recognized as a reliable source.

7. Fresh Links Over Existing Links

Another notable insight from the leak is Google’s preference for fresh links over existing ones. This is contrary to the longstanding belief that older, established links hold more value. According to the documents, links from newer webpages carry a higher value, especially when these pages are receiving significant traffic and engagement.

Google has previously suggested that link age does not diminish its value as long as the link remains relevant and the page authoritative. However, the emphasis on fresh links indicates a shift towards valuing recent content and current user interest more highly.

For SEO practitioners, this underscores the importance of continuously building new backlinks. Strategies should include regularly publishing fresh, high-quality content that can attract new links. Engaging in outreach efforts to secure mentions from authoritative sites can also help maintain a steady flow of new links. Additionally, updating existing content to keep it relevant and adding new internal links can further enhance the overall link profile of a website.

8. Spam Signals and Penalties


The leaked documents also detail the extensive use of spam signals and penalties in Google’s algorithm. These include specific scores and penalties for issues like keyword stuffing, anchor text spam, and content quality. For instance, the phraseAnchorSpamPenalty targets spammy anchor text, while the keywordStuffingScore assesses the likelihood of keyword stuffing within a document.

Google has always maintained a strong stance against spam, emphasizing the importance of high-quality, user-focused content. However, the leak provides more granular details on how these spam signals are calculated and applied. It reveals that Google employs a sophisticated array of signals to detect and penalize spammy behavior, which can significantly impact a site’s rankings.

To avoid these penalties, SEOs should adhere strictly to best practices for content creation and link building. This includes avoiding over-optimization tactics such as keyword stuffing and ensuring that anchor texts are natural and relevant. Additionally, maintaining high content quality and avoiding practices like content spinning or using filler text is crucial. Regularly auditing your site for potential spam signals and correcting any issues promptly can help protect against these penalties.
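The leak names keywordStuffingScore and phraseAnchorSpamPenalty but gives no formulas. As a crude illustration of how a density-based stuffing score might behave, here is a toy heuristic in which the "natural" density cap and the scoring curve are entirely invented:

```python
from collections import Counter

# Hypothetical sketch of a keyword-stuffing score. The attribute name
# keywordStuffingScore appears in the leak, but its formula does not;
# this toy version measures how far the most frequent term exceeds an
# invented "natural" density cap.

def keyword_stuffing_score(text: str, natural_density: float = 0.05) -> float:
    """Return ~0.0 for natural text, rising toward 1.0 as one term dominates."""
    words = text.lower().split()
    if not words:
        return 0.0
    _, top_count = Counter(words).most_common(1)[0]
    density = top_count / len(words)
    excess = max(0.0, density - natural_density)
    return min(1.0, excess / (1.0 - natural_density))

natural = "our guide covers research writing and editing for beginners"
stuffed = "seo seo seo best seo tips seo tricks seo seo guide seo"
```

A real detector would account for stemming, phrase repetition, and document length, but even this toy version separates the two examples cleanly: the stuffed string scores far higher than the natural one.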

9. Image Quality Signals


One of the fascinating insights from the leak is how Google evaluates image quality based on user engagement metrics. The metric called ImageQualityClickSignals is used to assess the quality of images on a webpage. This evaluation includes signals such as usefulness, presentation, appealingness, and engagingness of images, which are collected through user interactions like clicks.

Historically, Google has emphasized the importance of alt text for images and overall relevance to the content, but it has not explicitly stated that user engagement with images significantly affects rankings. The leaked documents, however, reveal that the quality of images and how users interact with them are critical factors in Google’s ranking algorithms.

For SEOs, this means optimizing images is more important than ever. This involves using high-quality, relevant images that enhance user experience and engagement. Additionally, ensuring that images load quickly and are properly tagged with descriptive alt text can improve their chances of being favorably ranked. Implementing structured data for images, such as Schema.org markup, can also help Google better understand the context and quality of the images.

10. Host NSR (Normalized Site Rank)

The concept of Host NSR, or Normalized Site Rank, is another significant revelation. This metric evaluates the overall quality of a website by breaking it down into smaller sitechunks, each of which is analyzed for quality and relevance. The Host NSR incorporates various factors, including site_pr and new_nsr, to determine the aggregate quality score of a website.

Google has often stated that individual page quality is paramount, and while site-wide signals were acknowledged, the emphasis was typically on page-level factors. However, the Host NSR indicates that Google does assess the quality at the host level, meaning that the overall performance and quality of a website can influence the rankings of individual pages.

To optimize for Host NSR, SEOs should ensure that their entire site adheres to best practices for quality and relevance. This includes maintaining consistent quality across all pages, avoiding low-quality or spammy content, and ensuring a cohesive and user-friendly site structure. Regularly updating and auditing the site to remove or improve underperforming pages can also help enhance the overall site quality.
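The leak indicates that Host NSR is computed over "sitechunks" and combines inputs such as site_pr and new_nsr, without specifying the aggregation. A minimal sketch under the assumption of a weighted average over chunk scores (the weights and blending scheme are illustrative, not documented):

```python
# Hypothetical sketch of host-level quality aggregation over sitechunks.
# The names host NSR and site_pr come from the leak; the weights and
# the averaging scheme below are assumptions for illustration.

def host_nsr(chunk_scores, site_pr, w_chunks=0.7, w_pr=0.3):
    """Blend average sitechunk quality with a site-level PageRank signal."""
    if not chunk_scores:
        return w_pr * site_pr
    avg_chunk = sum(chunk_scores) / len(chunk_scores)
    return w_chunks * avg_chunk + w_pr * site_pr

# Under this model, one weak section drags down the host-level score,
# which is why pruning or improving underperforming pages matters.
strong = host_nsr([0.90, 0.85, 0.90], site_pr=0.8)
mixed  = host_nsr([0.90, 0.85, 0.20], site_pr=0.8)
```

Whatever the real aggregation looks like, the practical takeaway is the same: if host-level quality is an average over site sections, every low-quality section taxes the rankings of every page on the host.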

Comment on Google’s Previous Indications

Google has long suggested that while they consider site-wide factors, the primary focus is on individual page quality. Statements from Google representatives have downplayed the extent to which site-wide metrics influence rankings. However, the revelations about Host NSR and site-wide authority metrics show that Google does place significant emphasis on the overall quality and performance of a website.

Additionally, Google has consistently emphasized content quality and relevance for images but has not highlighted the importance of user engagement with images in ranking decisions. The leaked information on ImageQualityClickSignals demonstrates that user interaction with images plays a crucial role in how those images and, by extension, the pages containing them are ranked.

Final Thoughts


The insights from the Google algorithm leak offer a deeper understanding of the complexities behind search rankings. Key takeaways, such as the importance of user engagement metrics, the role of site-wide authority, and the specific evaluation of image quality, provide SEOs with valuable guidance for optimizing their strategies. These revelations also highlight discrepancies between Google’s public statements and the actual factors that influence search rankings.

To stay competitive, SEOs must continuously adapt to these evolving insights and focus on creating high-quality, engaging content across their entire site. By understanding and leveraging the factors revealed in the leak, they can better navigate the intricacies of Google’s ranking algorithms and achieve more sustainable, long-term success in their SEO efforts.


Mark Whitaker

Author

Mark Whitaker is a seasoned digital marketing expert and author known for his in-depth knowledge of search engine optimization strategies. With over a decade of experience in the industry, Mark has helped numerous businesses enhance their online visibility and drive organic traffic through effective SEO practices. His insightful articles and blog posts on topics such as keyword research, on-page optimization, and link building have garnered a loyal following among digital marketers and business owners alike. Mark's passion for staying abreast of the latest SEO trends and algorithms, combined with his knack for simplifying complex concepts, makes his content both informative and actionable for readers looking to improve their search engine rankings.