Reconstructing Visual Experience from Brain Activity

AI-based decoding of visual experience from fMRI using biologically constrained generative video models

Can we reconstruct what someone is seeing — or imagining — directly from their brain activity? This long-standing challenge in systems neuroscience has seen dramatic progress with the advent of large-scale generative AI models.

The question. How faithfully can we decode visual experience from fMRI BOLD responses, and what constraints should biological plausibility impose on the generative model?

Our approach. In collaboration with Dr. Jack Gallant (UC Berkeley), we are developing AI-based pipelines that reconstruct visual experience from fMRI data using biologically constrained, state-of-the-art generative video models. These pipelines combine:

  • Neural decoding models that map fMRI responses to latent representations aligned with the visual hierarchy (a ridge-regression decoding sketch follows this list)
  • Generative video diffusion models constrained to respect known properties of visual cortical processing (see the guided-sampling sketch below)
  • Quantitative evaluation frameworks for assessing reconstruction fidelity across spatial, temporal, and semantic dimensions (see the metrics sketch below)
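
A minimal sketch of the decoding stage, assuming BOLD responses have been preprocessed into a trial-by-voxel matrix and each training clip is paired with a target latent vector (e.g. from the video model's encoder). Cross-validated ridge regression is a common baseline for this mapping, not necessarily our exact method; all names, shapes, and the synthetic data below are illustrative.

```python
# Sketch of a linear decoding stage. Assumes BOLD responses form a
# (n_trials, n_voxels) matrix and each training clip is paired with a
# target latent vector. Everything here is an illustrative stand-in.
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_trials, n_voxels, latent_dim = 600, 2000, 128

# Synthetic stand-ins for real data: voxel responses and paired latents.
bold = rng.standard_normal((n_trials, n_voxels))
w_true = rng.standard_normal((n_voxels, latent_dim)) / np.sqrt(n_voxels)
latents = bold @ w_true + 0.1 * rng.standard_normal((n_trials, latent_dim))

X_train, X_test, y_train, y_test = train_test_split(
    bold, latents, test_size=0.2, random_state=0
)

# Cross-validated ridge handles the n_voxels >> n_trials regime; a single
# shared alpha across latent dimensions keeps the sketch simple.
decoder = RidgeCV(alphas=np.logspace(0, 5, 11))
decoder.fit(X_train, y_train)

pred = decoder.predict(X_test)
# Per-dimension Pearson r between predicted and true latents.
r = [np.corrcoef(pred[:, d], y_test[:, d])[0, 1] for d in range(latent_dim)]
print(f"chosen alpha = {decoder.alpha_:.1f}, median r = {np.median(r):.3f}")
```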
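One way the biological constraint can enter is at sampling time. The sketch below shows encoding-model guidance as a hedged illustration: at each denoising step, the sample is nudged so that a forward encoding model's predicted BOLD response moves toward the measured response. The denoiser, encoding model, and noise schedule are untrained toy stand-ins, not the actual pipeline components, which would use a pretrained video diffusion model and a fitted voxelwise encoding model.

```python
# Sketch of encoding-model guidance during diffusion sampling. All modules
# below are toy stand-ins: a real pipeline would use a pretrained video
# diffusion model, a fitted encoding model, and a principled schedule.
import torch

latent_dim, n_voxels, n_steps, guidance_scale = 64, 256, 50, 0.1

denoiser = torch.nn.Sequential(           # toy stand-in for a diffusion UNet
    torch.nn.Linear(latent_dim, 128), torch.nn.SiLU(),
    torch.nn.Linear(128, latent_dim),
)
encoder = torch.nn.Linear(latent_dim, n_voxels)  # toy voxelwise encoding model
measured_bold = torch.randn(n_voxels)            # response we reconstruct from

x = torch.randn(latent_dim)  # start from pure noise
for step in range(n_steps):
    x = x.detach().requires_grad_(True)
    x0_hat = denoiser(x)  # toy estimate of the clean latent
    # Guidance signal: how badly the predicted brain response fits the data.
    loss = torch.sum((encoder(x0_hat) - measured_bold) ** 2)
    grad = torch.autograd.grad(loss, x)[0]
    with torch.no_grad():
        t = 1.0 - step / n_steps                   # crude linear schedule
        x = x0_hat + t * torch.randn(latent_dim)   # re-noise for the next step
        x = x - guidance_scale * grad              # biologically informed nudge

print("final latent norm:", float(x.norm()))
```

The design point of the guidance term is that it ties the reconstruction to the measured cortical responses rather than leaving it to the diffusion prior alone.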
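A sketch of the three evaluation axes, assuming reconstructions and ground truth are grayscale videos of shape (n_frames, H, W) with values in [0, 1]; the `embed` function is a hypothetical placeholder for a real semantic image encoder (e.g. a CLIP-style model).

```python
# Sketch of spatial, temporal, and semantic fidelity metrics. Shapes and
# the `embed` placeholder are illustrative assumptions.
import numpy as np
from skimage.metrics import structural_similarity as ssim

def spatial_score(recon, truth):
    """Mean per-frame SSIM (spatial fidelity)."""
    return np.mean([ssim(r, t, data_range=1.0) for r, t in zip(recon, truth)])

def temporal_score(recon, truth):
    """Pearson r between frame-to-frame change magnitudes (temporal dynamics)."""
    dr = np.abs(np.diff(recon, axis=0)).mean(axis=(1, 2))
    dt = np.abs(np.diff(truth, axis=0)).mean(axis=(1, 2))
    return np.corrcoef(dr, dt)[0, 1]

def semantic_score(recon, truth, embed):
    """Mean cosine similarity between per-frame semantic embeddings."""
    er, et = embed(recon), embed(truth)
    er = er / np.linalg.norm(er, axis=1, keepdims=True)
    et = et / np.linalg.norm(et, axis=1, keepdims=True)
    return float(np.mean(np.sum(er * et, axis=1)))

rng = np.random.default_rng(1)
truth = rng.random((16, 64, 64))
recon = np.clip(truth + 0.05 * rng.standard_normal(truth.shape), 0.0, 1.0)
embed = lambda video: video.reshape(len(video), -1)  # stand-in encoder

print(f"spatial SSIM   : {spatial_score(recon, truth):.3f}")
print(f"temporal r     : {temporal_score(recon, truth):.3f}")
print(f"semantic cosine: {semantic_score(recon, truth, embed):.3f}")
```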

Significance. Beyond its basic-science value, faithful neural decoding has potential clinical applications in brain-computer interfaces and communication devices for individuals with severe motor impairments.