Assistant Professor
CS Department
Yale University

Research Scientist
Allen Institute for AI (AI2)

Arman Cohan

I am an Assistant Professor in the Computer Science Department at Yale University.
I am also a Research Scientist at the Allen Institute for AI (AI2).

At Yale, I lead the Natural Language Processing lab. My research spans various problems at the intersection of NLP and Machine Learning, including language modeling, representation learning, retrieval, and applications in specialized domains.

🔍 Current Research Focus

Below are some of the problems we are currently working on (in no particular order). For more details, please visit the lab website.

  • Understanding generalization capabilities of LLMs
  • Evaluation and science of language models
  • Post-training and its evaluation
  • Language modeling for complex document-level tasks and long sequences
  • Multi-document and extreme multi-document language processing
  • Scientific and scholarly document processing

Teaching
  • [Spring 2025] - CPSC 477/577: Natural Language Processing -- Language Modeling
    This edition of the course focuses on the fundamentals of natural language processing (NLP), with a strong emphasis on language modeling. Topics include foundational concepts in NLP, neural networks for NLP, CNNs, RNNs, transformers, early language models, tokenization, text processing, word embeddings, and NLP applications such as summarization, translation, and generation. Students will explore recent advances in scaling language models, post-training, complex reasoning, and efficient fine-tuning, along with safety and practical considerations. The course is designed for upper-level undergraduates and graduate students and requires prior knowledge of introductory machine learning or artificial intelligence.
  • [Spring 2024] - CPSC 477/577: Natural Language Processing
  • [Fall 2023] - CPSC 488/588: AI Foundation Models
    This course focuses on the building blocks of foundation models. While the course primarily covers advances in LLMs, we will also cover foundation models in computer vision, as well as multi-modal foundation models.
  • [Spring 2023] - CPSC 670: Topics in Natural Language Processing
    This seminar course focuses on Large Language Models and other recent advances in NLP.

Information about my lab
For more information about my lab and our research, including available opportunities, please check out the Yale NLP lab website.


Publications
  • For the most recent list of publications, please refer to my Google Scholar page.


Contact