At MIT Connection Science, we're addressing the challenges of AI that matter to organizations through a holistic approach across the AI supply chain. Our work spans three interconnected areas: enhancing data transparency and provenance, securing AI deployment and data sharing, and navigating the complex landscape of AI policy and law. These initiatives collectively aim to create a more responsible, secure, and ethically sound AI ecosystem.
Data Provenance Initiative
The Data Provenance Initiative conducts large-scale audits of the massive datasets that power state-of-the-art AI models. It has audited over 4,000 popular text, speech, and video datasets, tracing each back to its origin and cataloging data sources, licenses, creators, and other metadata. It has also analyzed 14,000 web domains to understand the evolving provenance and consent signals behind AI data. The goal of this work is to map the landscape of AI data, improving transparency, documentation, and the informed use of data.
New York Times Coverage
Read about our work in the New York Times.
Data Provenance Homepage
Decline of the AI Data Commons
Explore our research on the changing landscape of AI data.
Audits of Dataset Licenses
Learn about our comprehensive audits of AI dataset licenses.
Data Authenticity, Consent, and Provenance for AI Are All Broken
Discover the challenges in AI data management.
Comment to U.S. Copyright Office
Read our submission on Data Provenance and Copyright.
Securing AI Deployment and Data Sharing
Securing AI Deployment and Data Sharing involves developing and implementing robust methods to ensure that AI technologies are deployed securely and that data shared across systems remains protected. This initiative addresses the growing need for end-to-end security in AI systems, particularly in the face of increasingly sophisticated threats, by combining complementary technologies across the AI supply chain. The focus is on innovative techniques such as private retrieval-augmented generation (RAG), verifiable model evaluations, and secure credentials that safeguard digital spaces.
Private Retrieval Augmented Generation
Learn about our innovative RAG techniques.
Verifiable Model Evaluations
Discover our methods for ensuring AI model integrity.
Open Problems in Technical AI Governance
Explore key challenges in AI governance.
Personhood Credentials
Learn about our work on protecting digital spaces.
Zero-Knowledge Tax Disclosures
Discover our research on privacy-preserving financial disclosures.
Two-Party Private Data Sharing
Explore our work on private, decentralised data sharing for health feeds.
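To give a flavor of the private RAG idea mentioned above, here is a minimal, self-contained sketch of the retrieval step kept on the data owner's side, so only a selected snippet (never the full corpus) would be passed to a remote model. The function names and the toy bag-of-words scoring are illustrative assumptions, not the group's actual implementation.

```python
from collections import Counter
import math

def embed(text):
    """Toy bag-of-words 'embedding' used only for illustration."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, corpus, k=1):
    """Rank local documents against the query; the corpus never leaves the client."""
    q = embed(query)
    ranked = sorted(corpus, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

corpus = [
    "Dataset licenses constrain downstream AI training.",
    "Verifiable evaluations prove model integrity.",
]
context = retrieve("which licenses apply to training data?", corpus)
print(context[0])  # prints: Dataset licenses constrain downstream AI training.
```

In a real private RAG system this local ranking would be replaced by cryptographic retrieval (for example, over an encrypted index), but the data-flow boundary is the same: only the retrieved context crosses to the model.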
Policy, AI, and the Law
Policy, AI, and the Law explores the intersection of artificial intelligence, policy-making, and legal frameworks. This initiative develops guidelines and legal mechanisms to ensure that AI systems are designed, deployed, and evaluated within a framework that prioritizes safety, fairness, and accountability. The work emphasizes the concept of "Regulation by Design," advocating that AI systems be built to comply with legal standards from the ground up. It also addresses the need for safe harbors in AI evaluation and red teaming, allowing robust testing and assessment of AI models without legal repercussions. Through these efforts, the initiative aims to foster a legal environment that encourages innovation while safeguarding the public interest and ethical standards in AI.
Regulation by Design for AI Systems
Explore our approach to integrating regulation into AI development.
A Safe Harbor for AI Evaluation and Red Teaming
Learn about our proposals for secure AI testing environments.
Discit ergo est: Training Data Provenance And Fair Use
Discover our work on data provenance and fair use in AI.
Generative AI for Pro-Democracy Platforms
Explore how AI can enhance online political discourse and foster mutual understanding.
Competition Between AI Foundation Models
Explore the dynamics of competition in AI development.
Art and the science of generative AI
Learn about the intersection of art and AI technology.
The Team
Alex 'Sandy' Pentland
MIT Professor and Stanford HAI Fellow
Tobin South
Senior PhD and AI Security Research Lead
Robert Mahari
Harvard Law School JD and MIT Senior PhD
Shayne Longpre
Senior PhD and Data Provenance Leader
Gabriele Mazzini
Chief Architect of EU AI Act, MIT Connection Science Fellow
Guy Zyskind
Serial Crypto & Web3 Founder, MIT PhD Alum
Thomas Hardjono
CTO of MIT Connection Science