Abstract
Large Language Models (LLMs) such as ChatGPT and LLaMA have demonstrated immense potential across a wide range of applications. However, these models primarily reflect the broader internet and do not inherently account for the nuances of specific communities or private data. We introduce the "Secure Community Transformers: Private Pooled Data for LLMs" project, a novel approach to augmenting LLMs with private community and personal data in a secure, privacy-preserving manner. By combining traditional privacy transformations, LLM-enabled privacy transformations, trusted execution environments, custodial control of data, and consent-based privacy choices, we enable the continuous updating of community data within a privately hosted LLM, yielding a tailored Q&A tool that reflects community values and individual circumstances.
Our solution addresses the limitations of LLMs that stem from their reliance on historical public data and their lack of secure contextualization. The Community Transformers project empowers communities and organizations to securely and privately pool local data, enabling LLMs to provide contextually relevant answers tailored to the specific needs of the community. This approach not only enhances the utility of LLMs but also ensures the protection of sensitive community and personal information.
Tobin South, Guy Zyskind, Robert Mahari, Thomas Hardjono, Alex 'Sandy' Pentland