- Google surreptitiously updated its policy when it comes to AI model training, now giving itself the authority to use public data.
- The training would apply to platforms like Bard, Google Translate, and Cloud AI.
- Now anything users post publicly online could be used by Google to train AI.
In the big tech equivalent of the Obama-giving-himself-a-medal meme, Google has updated its policy to give itself the authority to train AI models with public data. The change took effect on 1st July 2023.
Google can now use publicly available online data not only to build features for existing tools and platforms like Bard, Translate, and Cloud AI, but also to develop entirely new offerings.
“We may collect information that’s publicly available online or from other public sources to help train Google’s AI models and build products and features like Google Translate, Bard, and Cloud AI capabilities. Or, if your business’s information appears on a website, we may index and display it on Google services,” one of the key paragraphs in its updated policy explained.
“Many websites and apps partner with Google to improve their content and services… These services may share information about your activity with Google and, depending on your account settings and the products in use (for instance, when a partner uses Google Analytics in conjunction with our advertising services), this data may be associated with your personal information,” added a section regarding user activity on other sites and apps.
As such, Google appears ready to pull information, data, and content from myriad sources in order to improve its existing AI offerings, as well as create new ones.
While we must commend the company for being transparent about the change, the fact that it was not officially communicated in a press release or blog post does strike us as odd. Added to this is the fact that Google has espoused the need to protect user privacy in recent years, and it remains to be seen how this policy change affects that stance.
For now, though, as has often been the case for any company that wants to train AI models, all bets are seemingly off when it comes to publicly available data.
As Engadget points out, it is also unclear at this stage what this means for platforms like Twitter and Reddit, which have taken measures to restrict others from scraping or accessing their data for free.
Either way, regulators and watchdogs will no doubt be keeping tabs on how the likes of OpenAI, Microsoft, Google, and others are training their AI.