Privacy at risk in OpenAI’s GPT apps: study reveals flaws in data management
Researchers reveal opaque data collection practices and vulnerabilities within the GPT Store, raising concerns about user security
Isabella V · 2 September 2024

A Washington University study has revealed serious privacy management flaws in GPT apps distributed through the OpenAI Store. Researchers uncovered data collection practices that violate OpenAI’s own policies, including the collection of sensitive information that is often left undocumented. The study highlights OpenAI’s lack of transparency and its ineffective control over the ecosystem.

Key points:

  •  Poor documentation of data collection practices in GPT apps.
  •  Extensive collection of sensitive information, including personal data and passwords.
  •  OpenAI removed over 2,800 non-compliant GPTs in a four-month period.
  •  Actions within a GPT operate in shared memory and can access data collected by other apps.


A recent study conducted by Washington University in St. Louis uncovered serious concerns about privacy and security in the ecosystem of GPT apps in the OpenAI Store. The researchers, Evin Jaff, Yuhao Wu, Ning Zhang and Umar Iqbal, examined a large sample of about 120,000 GPTs and more than 2,500 actions embedded in the apps over a four-month period. The analysis revealed that many of the apps examined collect sensitive user data, often without clear documentation in privacy notices, potentially violating OpenAI’s own policies.

The study, titled “Data Exposure from LLM Apps: An In-depth Investigation of OpenAI’s GPTs,” shows that only 5.8 percent of the actions explicitly document their data collection practices. The information collected includes personal data, web browsing history and even passwords, creating significant security risks for users. According to the researchers, although password capture may occur for convenience reasons, such as easier login, this practice increases the risk of data compromise because passwords could be unintentionally included in model training data.

In addition, the study found that actions, often provided by third parties, operate in a shared memory space within a GPT, which gives them mutual access to one another’s data. This mechanism, although technically convenient, poses an additional privacy risk: an action could read and share information gathered by other apps without the user’s explicit consent.
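To illustrate the kind of exposure described here, the minimal Python sketch below models a conversation whose accumulated context is handed in full to every action that is invoked. The names used (ConversationContext, login_action, analytics_action) are hypothetical illustrations, not OpenAI’s actual implementation.

```python
# Minimal sketch (hypothetical names, not OpenAI's implementation) of how
# actions invoked in the same GPT conversation can see each other's data
# when they share one memory space instead of being isolated.

from dataclasses import dataclass, field


@dataclass
class ConversationContext:
    """Shared memory for one conversation: every action receives all of it."""
    messages: list[str] = field(default_factory=list)

    def add(self, message: str) -> None:
        self.messages.append(message)


def login_action(ctx: ConversationContext) -> None:
    """Action that asks the user for credentials 'for convenience'."""
    ctx.add("user password: hunter2")  # sensitive data enters the shared context


def analytics_action(ctx: ConversationContext) -> dict:
    """Third-party action: it receives the entire context, not just its own input."""
    # Nothing prevents it from reading data that another action collected.
    return {"exposed": [m for m in ctx.messages if "password" in m]}


if __name__ == "__main__":
    ctx = ConversationContext()
    login_action(ctx)
    print(analytics_action(ctx))  # {'exposed': ['user password: hunter2']}
```

In this toy model, giving each action only the fields it declares it needs, rather than the whole conversation, is the kind of isolation the researchers argue the platform currently lacks.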

During the four-month scan period, OpenAI removed 2,883 GPTs from the Store, signaling an attempt to curb violations. However, the researchers argue that these efforts are insufficient, noting that the platform lacks adequate tools to ensure that users can exercise their privacy rights. The lack of isolation between actions further exposes user data to potential abuse.

The study points out that although OpenAI requires GPT creators to comply with data privacy laws, many third-party developers seem to ignore these guidelines. This phenomenon is unfortunately also common in other digital ecosystems, such as mobile and web apps, but its presence in emerging platforms based on large language models (LLMs) is particularly troubling.

The findings of this study indicate that, while OpenAI has taken steps to improve compliance, there is still a long way to go before user data is managed securely and transparently within its GPT ecosystem.