Rohan Taori

I'm a researcher on the multimodal pretraining team at Anthropic. I like studying the foundations of machine learning in the context of real-world data and systems.

I recently received my PhD in Computer Science at Stanford, advised by the wonderful Tatsu Hashimoto. I was supported by the NSF GRFP and OSV Fellowship. I previously graduated with my BS in EECS from UC Berkeley, where I was fortunate to work with Ludwig Schmidt. I also had an amazing time teaching at Machine Learning @ Berkeley.

Google Scholar / GitHub / Twitter / Blog

Email: rtaori_at_cs_dot_stanford_dot_edu

Research

An Interactive Agent Foundation Model
Zane Durante, Bidipta Sarkar, Ran Gong, Rohan Taori, Yusuke Noda, Paul Tang, Ehsan Adeli, Shrinidhi Kowshika Lakshmikanth, Kevin Schulman, Arnold Milstein, Demetri Terzopoulos, Ade Famoti, Noboru Kuno, Ashley Llorens, Hoi Vo, Katsu Ikeuchi, Fei-Fei Li, Jianfeng Gao, Naoki Wake, Qiuyuan Huang
Paper

Agent AI: Surveying the Horizons of Multimodal Interaction
Zane Durante, Qiuyuan Huang, Naoki Wake, Ran Gong, Jae Sung Park, Bidipta Sarkar, Rohan Taori, Yusuke Noda, Demetri Terzopoulos, Yejin Choi, Katsushi Ikeuchi, Hoi Vo, Fei-Fei Li, Jianfeng Gao
Paper

Benchmarking Multi-Domain Active Learning on Image Classification
Jiayi Li, Rohan Taori, Tatsunori B Hashimoto
Paper

VisIT-Bench: A Benchmark for Vision-Language Instruction Following Inspired by Real-World Use
Yonatan Bitton*, Hritik Bansal*, Jack Hessel*, Rulin Shao, Wanrong Zhu, Anas Awadalla, Josh Gardner, Rohan Taori, Ludwig Schmidt
NeurIPS Benchmarks and Datasets, 2023.
Paper / Website / Code & Data

AlpacaEval: An Automatic Evaluator for Instruction-following Language Models
Tianyi Zhang*, Xuechen Li*, Yann Dubois*, Rohan Taori*, Ishaan Gulrajani, Jimmy Ba, Carlos Guestrin, Percy Liang, Tatsunori B Hashimoto
Website / Code & Data

AlpacaFarm: A Simulation Framework for Methods that Learn from Human Feedback
Yann Dubois*, Xuechen Li*, Rohan Taori*, Tianyi Zhang*, Ishaan Gulrajani, Jimmy Ba, Carlos Guestrin, Percy Liang, Tatsunori B Hashimoto
Spotlight at NeurIPS, 2023.
Paper / Blog Post / Code & Data

Stanford Alpaca: A Strong, Replicable Instruction-Following Model
Rohan Taori*, Ishaan Gulrajani*, Tianyi Zhang*, Yann Dubois*, Xuechen Li*, Carlos Guestrin, Percy Liang, Tatsunori B Hashimoto
Blog Post / Code & Data

Data Feedback Loops: Model-driven Amplification of Dataset Biases
Rohan Taori, Tatsunori B Hashimoto
Oral at ICML, 2023. Spotlight at NeurIPS Distribution Shifts Workshop, 2022.
Paper / Code

Is a Caption Worth a Thousand Images? A Controlled Study for Representation Learning
Shibani Santurkar, Yann Dubois, Rohan Taori, Percy Liang, Tatsunori B Hashimoto
ICLR, 2023.
Paper

OpenCLIP: An Open-Source Implementation of OpenAI's CLIP
Gabriel Ilharco, Mitchell Wortsman, Ross Wightman, Cade Gordon, Nicholas Carlini, Rohan Taori, Achal Dave, Vaishaal Shankar, Hongseok Namkoong, John Miller, Hannaneh Hajishirzi, Ali Farhadi, Ludwig Schmidt
Code

On the Opportunities and Risks of Foundation Models
Rishi Bommasani, Drew A. Hudson, ...<et al>..., Rohan Taori, ...<et al>..., Percy Liang
Paper

Are We Learning Yet? A Meta Review of Evaluation Failures Across Machine Learning
Thomas Liao, Rohan Taori, Inioluwa Deborah Raji, Ludwig Schmidt
NeurIPS Benchmarks and Datasets, 2021.
Paper

Accuracy on the Line: On the Strong Correlation Between Out-of-Distribution and In-Distribution Generalization
John Miller, Rohan Taori, Aditi Raghunathan, Shiori Sagawa, Pang Wei Koh, Vaishaal Shankar, Percy Liang, Yair Carmon, Ludwig Schmidt
ICML, 2021.
Paper / Talk / Interactive Plotting

Measuring Robustness to Natural Distribution Shifts in Image Classification
Rohan Taori, Achal Dave, Vaishaal Shankar, Nicholas Carlini, Benjamin Recht, Ludwig Schmidt
Spotlight at NeurIPS, 2020.
Paper / Website / Talk / Code & Data / Interactive Plotting

Transposer: Universal Texture Synthesis Using Feature Maps as Transposed Convolution Filter
Guilin Liu, Rohan Taori, Ting-Chun Wang, Zhiding Yu, Shiqiu Liu, Fitsum Reda, Karan Sapra, Andrew Tao, Bryan Catanzaro
Paper / Short Video / Long Video

Targeted Adversarial Examples for Black Box Audio Systems
Rohan Taori, Amog Kamsetty, Brenton Chu, Nikita Vemuri
IEEE Deep Learning and Security Workshops, 2019.
Paper / Code

Teaching

I ran or was heavily involved with Machine Learning @ Berkeley's Data Science Class for a number of semesters. You can find complete course content (lecture slides, demos, & homeworks) from those semesters online: Fall '17, Spring '18, Fall '18. Some lectures I gave:

Data Science 101 Lecture on analyzing 2016 campaign finance donations using numpy, pandas, and matplotlib.
Clustering Lecture on common clustering techniques and a very cool flower compression demo.
Linear Regression Lecture on the basics of linear regression.
SVM Lecture on how Support Vector Machines (SVMs) work along with a cool demo.
GAN Lecture on how Generative Adversarial Networks (GANs) work.
Image Captioning Lecture on how common image captioning models are trained.
Intro to Deep Learning Workshop I gave with Sajel Shah at CalHacks 5.0.
Intro to Reinforcement Learning Workshop I gave with Brenton Chu.

Website template from Jon Barron.