Cold Spring Harbor Laboratory

Anirban Sarkar
Interpretable machine learning · Generative design for biology

I am a computational postdoctoral fellow in the Koo Lab at Cold Spring Harbor Laboratory. My research makes AI systems transparent, from attribution methods in computer vision to generative models that design DNA with tunable regulatory activity.

01 — About

I aim to bridge human and machine intelligence, creating AI that is robust in the wild and transparent enough to be a trusted partner in scientific discovery.

Deep Generative Modeling · Explainable AI · Inference-time Control · Genomic Sequence Design · Multimodal Medical AI · Counterfactual Generation

At Cold Spring Harbor Laboratory, I am part of the Koo Lab within the Simons Center for Quantitative Biology, where my research with Dr. Peter Koo centers on generative modeling for genomics: designing regulatory DNA with tunable activity (D3) and developing inference-time, model-agnostic methods that push sequence generation beyond the activities seen during training (GPA).

Before CSHL, I was a postdoctoral associate in the Sinha Lab for Developmental Research within MIT's Brain and Cognitive Sciences department, where Dr. Xavier Boix, Dr. Pawan Sinha, and I studied explainability under distribution shift.

My Ph.D., in Computer Science & Engineering at IIT Hyderabad under Dr. Vineeth N Balasubramanian, focused on interpretability and robustness for computer vision: self-explaining networks, and defenses against unseen, adversarial, and attributional attacks.

02 — Research

From interpreting models to designing with them

One thread runs through my work: first understand what a model knows, then turn that understanding into action: to design, to steer, to reach past the limits of the training distribution (GPA), and increasingly to imagine data that does not exist yet.

  1. 2016–2022 · PhD

    Interpretable computer vision

    Attribution and self-explaining models that reveal what a vision network sees and why: Grad-CAM++, ante-hoc concept models, attributional robustness, and more.

  2. 2022–2023 · MIT

    Explainability under distribution shift

    Understanding and debugging vision models when the test world drifts away from the world they were trained on.

  3. 2023–now · CSHL

    Generative modeling for genomics

    Turning understanding into design: generating regulatory DNA with tunable activity through discrete diffusion (D3).

  4. Current focus

    Inference-time, model-agnostic design

    Pushing generation beyond the activities seen in training, at inference time, with no retraining and no model-specific hooks, so it drops onto frozen generators (GPA).

Where I'm headed

Concept-aligned, cross-modal counterfactual generation

An open problem I'm drawn to: aligning different medical modalities (histopathology, regulatory DNA, transcriptomes) around a shared, interpretable space of concepts. Once they share that language, one modality can be generated from another, concept for concept, then interrogated with counterfactuals: what changes if a single concept does?

It ties together the threads I care about most (interpretability, generative modeling, and causal intervention), with real stakes for biology and medicine.

// early and unproven, but the kind of problem worth the next few years.

03 — Highlights

Recognition

  • IKDD Best Doctoral Dissertation in Data Science Award 2022 (winner).
  • NASSCOM AI Gamechanger Award 2022, AI Research — DL Algorithms & Architecture (runner-up).
  • CVPR 2022 travel grant to present at the conference.
  • WACV 2021 Doctoral Consortium participant.
  • ICML 2019 travel grant to present at the conference.
  • Research Intern, IBM India Research Lab — AI Explainability & Active Learning (May–Aug 2019).
  • Machine Learning Summer School, Universidad Autónoma de Madrid (Aug–Sep 2018).
  • Sakura Science Plan award — internship at University of Tokyo (Jun–Jul 2017).

Updates

  • 2026

    “GPA: Generative Population Annealing for Test-Time Sequence Design” accepted as a poster at GenBio 2026, the ICML Workshop on Generative & Agentic AI for Biology.

  • Apr 2025

    Presented “Understanding DNA Discrete Diffusion for Engineering Regulatory DNA Sequences” at the Workshop on AI for Nucleic Acids, ICLR 2025.

  • Sep 2024

    Presented “Designing DNA With Tunable Regulatory Activity Using Discrete Diffusion” as a long oral at MLCB 2024; the work was also featured at the NeurIPS 2024 Workshop on AI for New Drug Modalities.

  • Apr 2023

    Started as a computational postdoctoral fellow at Cold Spring Harbor Laboratory.

  • Oct 2022

    Presented ongoing work on explainability under distribution shift at Fujitsu Limited, Japan.

  • Apr 2022

    Started as a postdoctoral associate at MIT.

  • Mar 2022

    Successfully defended my Ph.D. dissertation.

04 — Trajectory

Experience

Computational Postdoctoral Fellow, CSHL · 2023–present

Generative modeling for genomic sequence design.

Postdoctoral Associate, MIT · 2022–2023

Explainability under distribution shift.

Research Intern, IBM India Research Lab · 2019

Self-explaining neural networks with meaningful concepts as building blocks.

Research Intern, University of Tokyo · 2017

Causal inference and applications of causality in machine learning.

Education

Ph.D., Computer Science, IIT Hyderabad · 2016–2022

Rational Deep Machines: toward explainable, trustworthy, and robust deep learning systems.

M.Tech, NIT Rourkela · 2014–2016

Source camera identification: classifier learning, learning curves, and their interpretation.

05 — Publications

Selected work

06 — Talks & Workshop Papers

07 — Contact

Let's talk: explainability, generative modeling, or the mix.
