OpenAI has pushed back against a court order requiring it to retain output logs. The order is part of an ongoing lawsuit between the AI company and The New York Times. Filed in 2023, the lawsuit claims that OpenAI used The New York Times’ copyrighted works to train its models, and that those models now compete with the publication.
In May this year, The New York Times urged the court to order OpenAI to preserve output logs on a “wholesale basis.” In response, the company pointed to user data deletion requests and to the fact that numerous privacy laws in the US provide data deletion rights for users.
The court, however, ordered OpenAI to preserve and segregate all output log data that would otherwise be deleted going forward, until a further court order, even if a user asks the company to delete it or privacy laws require the company to do so. “We strongly believe this [demand for data retention] is an overreach by the New York Times. We’re continuing to appeal this order,” OpenAI says.
Who does the order affect?
In a post discussing the development, OpenAI says that the order affects users on ChatGPT Free, Plus, Pro, and Team plans, as well as users who access the OpenAI API without a Zero Data Retention agreement. Previously, OpenAI deleted chats from ChatGPT Free, Plus, Pro, and Team accounts within 30 days of a user’s request. A similar policy applies to OpenAI API users, except those using Zero Data Retention APIs, where the company never logs inputs and outputs. Users with ChatGPT Enterprise and ChatGPT Edu subscriptions also remain unaffected.
The company explained that it will not automatically share the data retained as a result of the order with the New York Times or anyone else. “It’s locked under a separate legal hold, meaning it’s securely stored and can only be accessed under strict legal protocols,” OpenAI explained. It says that if the New York Times demands this data, the company will “protect user privacy at every step.”
What is the rationale behind the demand for data retention?
The demand for preserving user logs first came before the court in January 2025, in a separate case in which OpenAI is defending itself against the Authors Guild. There, the guild demanded that OpenAI carry out wholesale preservation of user logs (end-user prompts and outputs).
At the time, the court denied the demand, but later in May, a group of news organisations (including The New York Times, The Center for Investigative Reporting, and the Chicago Tribune Company) asked the court to compel OpenAI to identify the output logs it has destroyed since The Times filed its Original Complaint on December 27, 2023. They argued that the volume of user conversations OpenAI has destroyed since the Original Complaint was filed is substantial.
OpenAI says the news organisations wanted this data retained in the hope of finding something that may support their case. It argues that the court’s data retention order implies that OpenAI carried out targeted deletions in response to litigation events, a suggestion the company calls “unequivocally false.”
Why it matters:
In a letter asking the court to reconsider its stance on data retention, OpenAI points out that this requirement adversely affects its users. “Millions of individuals, businesses, and other organisations use OpenAI’s services in a way that implicates uniquely private information—including sensitive personal information, proprietary business data, and internal government documents,” the company explains. It gives the example of conversations a user may have had with OpenAI’s AI models about a family member’s health condition or immigration status. From the company’s perspective, then, the data retention demand risks undermining user trust.
From the publishers’ perspective, given the millions of interactions users likely have with ChatGPT each day, the data could include examples of alleged copyright violations, strengthening their case. However, as OpenAI noted in its letter, it currently retains tens of billions of user interactions—far more than any expert could review—making it challenging for the publishers to find evidence.