Google has reportedly authenticated thousands of internal documents that were allegedly leaked earlier in May. The data reportedly includes information about how Search works and about the user data Google collects to rank websites. The company initially declined to respond to the leak, but it is now reported to have acknowledged it while advising caution against “incorrect assumptions”.
Google confirms Search data leak
In an email to The Verge, Google spokesman Davis Thompson said, “We would caution against making incorrect assumptions about Search based on out-of-context, out-of-date, or incomplete information.” Thompson also claimed that Google is working to protect the integrity of search results from manipulation, adding that the company has “shared extensive information about how Search works and the types of factors that our systems weigh.”
The leak reportedly came to light when search engine optimization (SEO) experts Rand Fishkin and Mike King published analyses of 14,014 attributes described in internal API documents that were leaked from Google’s search department and shared with them by a source.
These documents are said to be part of the “Content API Warehouse” used by company employees as a repository. The analyses further state that the documents’ code was uploaded to GitHub on March 27th and was not removed from the platform until May 7th.
Contradictory information
In a blog post, Fishkin argued that the information provided by the source contradicts many of the claims Google has made over the years. According to him, the documents suggest that Google considers click-through rate (CTR) as a ranking signal and treats subdomains as separate entities.
In another apparent contradiction, the documents allegedly mention data from Chrome in connection with ranking websites in Search. The tech giant has repeatedly said that it doesn’t use Chrome data to rank websites.
According to Fishkin, many details in the documents also overlap with what Google revealed in its testimony during the US Department of Justice’s antitrust case, and other details suggest insider knowledge. While most of the information would be better understood by SEO professionals, Fishkin’s analysis indicates what data Google actually collects from searches, web pages, and websites.