The Red Queen Effect
The suit cites the “unlawful and harmful conduct” of the Defendants (OpenAI) in developing, marketing, and operating their AI products (ChatGPT 3.5, 4.0, Dall-E, and Vall-E) “by using stolen private information, including personally identifiable information, from hundreds of millions of internet users, including children of all ages, without their consent or knowledge.” While the suit recognizes that said products “undoubtedly have the potential to do much good in the world, like aiding life-saving scientific research and ushering in discoveries that can improve the lives of everyday Americans,” it notes that OpenAI was originally founded as a non-profit research organization with a mandate to create AI that would be used to benefit humanity. However, in 2019 OpenAI restructured and formed a for-profit business to pursue commercial opportunities.
As part of the ChatGPT training process, the suit alleges that OpenAI secretly harvested massive amounts of personal data from the internet, including private information and private conversations, medical data, and information about children – essentially every piece of data exchanged on the internet that it could take – without notifying the owners or users of such information, much less obtaining anyone’s permission. The allegations go on to assert that OpenAI used this ‘stolen’ information, which included copyrighted material, to create its LLM (Large Language Model) and the algorithms that generate human-like language for use in a wide range of applications, with no additional safeguards. As applications and products incorporating such ‘stolen’ information were developed, OpenAI’s valuation increased to between $27 billion and $29 billion, and, the suit contends, an ‘economic dependency within our society’ has been created as OpenAI products are embedded in other applications.
The practice of ‘web scraping’, which violates the ‘Terms of Use’ of many websites, has been litigated for years, usually resulting in commercial scrapers being forced to register as data brokers, something OpenAI has not done. Studies have valued an individual’s online information at between $15 and $40, although more specific online identity information can be sold on the dark web for over $1,000. The suit also goes on for many pages (159 in total) about how OpenAI has violated the rights of children.
The suit calls for an injunction, essentially a temporary freeze on commercial access to OpenAI products; the establishment of an independent body responsible for approving the use of products before release; the establishment of ethical principles and guidelines; and the implementation of appropriate transparency concerning data collection, including the ability for users to opt out of data collection. Of course, there is also the establishment of a monetary fund to compensate class members for OpenAI’s misconduct, along with statutory, punitive, and exemplary damages and “…reasonable attorneys’ fees”.
We expect that this suit will also wander through the courts for years without final rulings that might serve as actual case precedent, and that legal challenges, regardless of the outcome, will continue ad infinitum, but there is certainly a necessity to resolve such issues legally. Right now, it is open season on data used for AI training, and as systems are developed that push parameter counts past the 2 trillion level, almost all data sources will be harvested without regard for ownership rights. Privacy issues aside, the implications for copyright protection and digital asset ownership are so vast that the more than 300 years of written copyright law seem insignificant. Time is of the essence.