VP of AI at micro1 · Founder at DeepForest.org · Stanford CS
I am VP of AI at micro1, leading development of data-centric ML systems that produce training data, evaluations, and expert feedback for frontier AI labs. I also founded DeepForest.org, which directly conserves forest land in California. DeepForest operates living laboratories to explore how robotics, AI, and ecological science can increase wildfire resilience while using forests as a powerful tool for atmospheric carbon capture. At Stanford I teach CS 224S: Spoken Language Processing and support AI efforts within the StartX founder network.
Prior to micro1, I founded Pointable, developing optimized retrieval systems for enterprise LLM applications. Apple acquired Pointable in 2025 (the Pointable website and blog are no longer active as a result). I served as a Director of Engineering at Apple working on deep learning architectures for speech technology as well as LLM-based approaches to spoken dialogue systems. Earlier at Apple's Special Projects Group I led teams building data-centric deep learning approaches and tooling for robot autonomy. From 2013 to 2019 I was co-founder and chief scientist at Roam Analytics (acquired by Parexel), building a healthcare natural-language-processing platform. In 2015–2016 I was a research scientist in spoken-language deep learning at Semantic Machines (acquired by Microsoft).
I completed my PhD in Computer Science at Stanford in 2015, advised by Andrew Ng and Dan Jurafsky, with dissertation work on deep neural networks in speech recognition. My research was supported by a National Science Foundation graduate research fellowship. I hold a BS in Computer Science and Cognitive Science from Carnegie Mellon (2009). My research sits at the intersection of machine learning, natural language processing, machine perception, and cognitive science. Human perception and learning are remarkable given the complexity of the data entering our senses. Developing algorithms that automatically find structure in audio, text, images, and other data will help autonomous systems integrate into everyday life and positively transform the ways we live and work.
The AI platform for human intelligence — connecting domain experts with AI labs to produce training data and evaluations that advance frontier models.
Exploring how robotics, AI, and forest science can increase wildfire resilience while using forests as a powerful tool for atmospheric carbon capture.
Spoken Language Processing — traditional and deep-learning approaches to conversational agents and spoken language understanding systems.
50,000 polarized movie reviews introduced in our ACL 2011 paper. A standard binary sentiment-classification benchmark used by ULMFiT, ELMo, BERT, RoBERTa, XLNet, and many others. 179K monthly downloads on Hugging Face.
Code: Deep recurrent denoising auto-encoder for noise reduction in robust ASR. Reference implementation for the Interspeech 2012 and CHiME 2013 papers.
Code: Demo code for the NIPS 2010 deep-learning workshop paper introducing a probabilistic latent-variable model for semantic word vectors.
Code: Reference code accompanying the COLING 2018 paper on functional retrofitting of distributional embeddings to knowledge-graph (KG) relations.
Video demos accompanying the maximum-entropy inverse-RL navigation work (2008–2009).