LLMs like ChatGPT have enormous potential, but we need to think pragmatically about how to minimize their risks.
- Alex 'Sandy' Pentland

At MIT Connection Science, we're addressing the challenges of AI that matter to organizations through a holistic approach across the AI supply chain. Our work spans three interconnected areas: enhancing data transparency and provenance, securing AI deployment and data sharing, and navigating the complex landscape of AI policy and law. These initiatives collectively aim to create a more responsible, secure, and ethically sound AI ecosystem.

Data Provenance Initiative

This initiative conducts large-scale audits of the massive datasets that power state-of-the-art AI models. It has audited more than 4,000 popular text, speech, and video datasets, tracing each from origin to creation and cataloging data sources, licenses, creators, and other metadata. It has also analyzed 14,000 web domains to understand the evolving provenance and consent signals behind AI data. The goal of this work is to map the landscape of AI data and to improve transparency, documentation, and informed use of data.


Securing AI Deployment and Data Sharing

Securing AI Deployment and Data Sharing develops and implements robust methods to ensure that AI technologies are deployed securely and that data shared across systems remains protected. This initiative addresses the growing need for end-to-end security in AI systems, leveraging complementary technologies across the AI supply chain in the face of increasingly sophisticated threats. The focus is on techniques such as private retrieval-augmented generation (RAG), verifiable model evaluations, and secure credentials to safeguard digital spaces.


Policy, AI, and the Law

Policy, AI, and the Law explores the intersection of artificial intelligence, policy-making, and legal frameworks. This initiative focuses on developing guidelines and legal mechanisms that ensure AI systems are designed, deployed, and evaluated within a framework that prioritizes safety, fairness, and accountability. The work emphasizes the concept of "Regulation by Design," advocating for AI systems to be built in compliance with legal standards from the ground up. It also addresses the need for safe harbors in AI evaluation and red teaming, allowing for robust testing and assessment of AI models without legal repercussions. Through these efforts, the initiative aims to foster a legal environment that encourages innovation while safeguarding the public interest and ethical standards in AI.


The Team
