Google accused of using ‘hundreds of millions’ of people’s data to train AI bot

A group of US citizens is suing Google for $5 billion, saying it is using the public’s digital footprint to develop tech such as Bard
The claimants say Google Bard’s development benefited from the use of private and copyrighted information available online
Mojahid Mottakin/Pexels
Andrew Williams, 13 July 2023

A class action lawsuit has been filed in the US, claiming Google has used “hundreds of millions” of people’s data in order to feed its AI tech, including the Bard chatbot.

The lawsuit was filed by eight claimants “on behalf” of the rest of the US population.

It claims: “Google has been secretly stealing everything ever created and shared on the internet by hundreds of millions of Americans” including “personal and professional information, our creative and copywritten works, our photographs, and even our emails – virtually the entirety of our digital footprint”.

The eight people who filed the suit are identified only by their initials, but details in the filing, which has been rehosted by the Register, reveal that one of the plaintiffs is six years old and another is 13.

One of the claimants is an “actor and professor”, one is an author, and another regularly posts on YouTube and TikTok.

The documents detail how the author’s work can be reproduced when using Google Bard. “On demand, Bard will offer not only to summarize the book in detail, chapter by chapter, but it also offers to regenerate the text of her book verbatim,” the case document reads.

The group is suing Google parent company Alphabet for $5 billion (£3.8 billion).

Their lead claim is that “publicly available” has never meant free to use for any purpose, and that “Google must understand, once and for all: it does not own the internet, it does not own our creative works, it does not own our expressions of our personhood, pictures of our families and children, or anything else simply because we share it online.”

Google is attacked on 10 fronts in the case, including violations of the Digital Millennium Copyright Act, the 1998 US law that provides the mechanisms by which copyright-infringing content is removed from platforms like YouTube.

Earlier this month, Google updated the wording of its privacy policy to state more clearly that it does indeed use user data in the development of its products.

“Google uses information to improve our services and to develop new products, features and technologies that benefit our users and the public. For example, we use publicly available information to help train Google’s AI models and build products and features like Google Translate, Bard, and Cloud AI capabilities,” its policy read in early July, according to Search Engine Journal, although the wording appears to have changed again since.

Google’s privacy policy still says your data is used in “developing new products and features that are useful for our users”.

This is not the first case to be levelled at the companies developing generative AI technologies.

In January, Getty Images sued Stable Diffusion creator Stability AI in London, claiming it had used Getty’s copyrighted pictures to train its generative AI models. This was followed in February by a similar filing in the US.
