Designing ultrastable carbonic anhydrase with deep generative models and high-throughput assays

Successfully Funded
$49,200 raised of $49,200 goal (100%)
Funded on 11/18/23

About This Project

To minimize the impact of CO2 emissions on life on earth, we need technologies for carbon capture exceeding even that of CO2-metabolizing organisms. One path is to enhance industry-scale chemistry via efficient biological catalysts. Using machine learning and high-throughput screening, we will engineer a pivotal enzyme, carbonic anhydrase, for use in CO2 capture technologies.

What is the context of this research?

Atmospheric CO2 removal (CDR) and point-source capture (PSC) of CO2 are widely accepted as necessary for decarbonizing within climate goals (1). Direct air capture (DAC) is a CDR pathway with ideal verifiability and durability. Both DAC and PSC are cost-constrained, primarily by the capital expenditure (CapEx) of the gas contactor and by the energy required to drive the large swings in temperature or pH that regenerate CO2 from the capture material (2).

These high cost and energy requirements are driven by a thermodynamic trade-off between the rate of CO2 absorption and the CO2 regeneration energy: capture materials with a high absorption rate, which lower cost by shrinking the gas contactor, typically have a high regeneration energy, and vice versa (3).

What is the significance of this project?

Carbonic anhydrases (CAs) catalyze fast CO2 absorption in solvents with low CO2 regeneration energy, resolving the trade-off described above (4). CA could reduce DAC and PSC cost by reducing the size of the parameter swing or of the gas contactor, provided it is stable in DAC or PSC processes, which may involve high pH, temperature, or ionic strength. For example, thermostable CA produced by protein engineering (PE) can already reduce PSC cost by more than 30% (5,3).

AI-driven PE and screens of many natural variants are revolutionizing PE but haven’t been applied to CA. Ultrastable CAs produced using those tools likely could reduce DAC and PSC cost substantially. While modeling is needed to quantify application-specific benefits and target CA properties, PE for ultrastable CA can begin now and later be adapted to specific uses.

What are the goals of the project?

Modeling analyses are ultimately required to set target properties for ultrastable CAs used in novel CA-enhanced DAC and PSC processes. In the meantime, initial efforts using AI-based PE and screens of many variants should target many-fold improvements in CA stability over the state of the art while retaining high activity (kcat/KM ~10^8 M^-1 s^-1). For a comprehensive discussion of state-of-the-art CA engineering and performance, see (6) and (4).

Examples of the state of the art are:

  • temperature stability

    • 203-day half-life at 60˚C in Tris buffer (7)

    • 73% activity retention after 24 hrs at 80˚C in K2CO3 (4,8)

  • pH stability

    • 90% activity retention after 24 hrs at pH 11.0 (9)

Stability demonstrations should be performed in solvents relevant to DAC and PSC, such as 10-20% K2CO3.
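
The benchmarks above are reported in different forms (a half-life in one case, percent activity retained after a fixed time in the others), which makes them hard to compare at a glance. The sketch below shows one rough way to put them on a common scale. It assumes first-order (single-exponential) inactivation, which is an illustrative assumption and not something the cited studies are stated to follow; the function names are hypothetical. Because the benchmarks were measured in different buffers and conditions, the converted numbers are only indicative.

```python
import math


def half_life_from_retention(fraction_retained: float, elapsed_hours: float) -> float:
    """Half-life (hours) implied by one retention measurement.

    ASSUMPTION: activity decays as A(t) = A0 * exp(-k * t) (first-order decay).
    """
    if not 0.0 < fraction_retained < 1.0:
        raise ValueError("fraction_retained must be strictly between 0 and 1")
    if elapsed_hours <= 0:
        raise ValueError("elapsed_hours must be positive")
    k = -math.log(fraction_retained) / elapsed_hours  # decay rate, per hour
    return math.log(2) / k


def retention_from_half_life(half_life_hours: float, elapsed_hours: float) -> float:
    """Fraction of activity left after elapsed_hours, under the same assumption."""
    if half_life_hours <= 0:
        raise ValueError("half_life_hours must be positive")
    return 0.5 ** (elapsed_hours / half_life_hours)


if __name__ == "__main__":
    # Benchmarks quoted in the list above, converted to an implied half-life.
    t_half_80c = half_life_from_retention(0.73, 24.0)   # 73% after 24 hrs at 80˚C
    t_half_ph11 = half_life_from_retention(0.90, 24.0)  # 90% after 24 hrs at pH 11.0
    print(f"Implied half-life at 80˚C in K2CO3: {t_half_80c:.1f} hrs")
    print(f"Implied half-life at pH 11.0: {t_half_ph11:.1f} hrs")

    # The 203-day half-life at 60˚C, expressed as retention after 24 hrs.
    kept_60c = retention_from_half_life(203 * 24.0, 24.0)
    print(f"Retention after 24 hrs at 60˚C in Tris: {kept_60c:.2%}")
```

Under this assumption, a "many-fold" stability improvement can be read as a many-fold longer half-life under the same solvent, temperature, and pH.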



(see solution statement)

Project Timeline


Oct 31, 2024

Project completion

Meet the Team

  • Pascal Notin, Harvard Medical School
  • Nathan Rollins, Seismic Therapeutic
  • Yarin Gal, University of Oxford
  • Debora Marks, Harvard Medical School

Team Bio

We are an interdisciplinary team of scientists at the forefront of machine learning and biology research. We've pioneered new ML approaches for generative modeling, active learning, Bayesian optimization, uncertainty quantification, and more. These methods have been instrumental in predicting protein structures and function, as well as advancing protein engineering.

For more info about us:

- Marks lab:…


Pascal Notin

I am a computer scientist and researcher at the intersection of AI and computational biology. My mission is to develop novel AI-driven models to support the engineering of enzymes, targeting climate change mitigation and sustainability.

I have recently joined the Marks lab, where I lead the development of ML approaches for protein engineering. My academic journey took shape with a PhD in the Oxford Applied and Theoretical Machine Learning Group, under the supervision of Yarin Gal.

This journey was marked by breakthroughs in protein modeling, mutation effects prediction, viral escape anticipation, and drug design. My findings have been published in premier scientific journals and machine learning conferences, including Nature, NeurIPS, ICML, and ICLR.

I have co-created and am the lead organizer for the Machine Learning for Drug Discovery (MLDD) workshop at ICLR, co-organized the GeneDisco challenges, and co-organize the Workshop on Computational Biology (WCB) at ICML.

I have several years of applied machine learning experience developing AI solutions, primarily within the healthcare and pharmaceutical industries. Before going back to academia, I was a Senior Manager at McKinsey & Company in the New York and Paris offices, where I was leading cross-disciplinary teams on fast-paced analytics engagements.

For my academic endeavors, I am the recipient of an award and recognized as a Google Cloud Innovator.

More info at:

Nathan Rollins

My mission is to translate the immense capabilities of biomolecules into human applications.

While keeping one foot in academia, I am a senior scientist at Seismic Therapeutic, a biomedicine startup launched out of my PhD and post-doc work with Debora Marks. At Seismic, my role is to develop machine learning methods for accelerating discovery and improving properties of protein therapeutics. To this end, as throughout my career, I work in equal parts as a synthetic biologist and as a computer scientist.

The themes of my work are symmetric:
(1) Identify challenges in bioscience & design ML solutions in response.
(2) Identify opportunities in bioscience unlocked by ML & design wetlab experiments to take full advantage.

I started on this path ten years ago, designing de novo proteins at the UW Institute for Protein Design while I studied chemical engineering and biochemistry. I then pursued a PhD in Engineering and Physics in Biology, supervised jointly by computational biologist Debora Marks and synthetic biologist Pamela Silver. In that time, I gained a fantastic breadth of experience, advancing projects to design enzymes [1], drugs [2], and more [3,4,5]; to predict virus evolution [6,7,8,9,10]; and to maximize what is learned from next-gen wetlab methods such as deep mutational scanning [11,12].

To learn more, see my papers on Google Scholar.

Yarin Gal

I am an Associate Professor of Machine Learning at the University of Oxford, where I lead the Oxford Applied and Theoretical Machine Learning (OATML) group. I have made several contributions to modern Bayesian deep learning that support the quantification of uncertainty in deep learning models. These methods have been widely used in industry and academia across application domains such as medicine, robotics, computer vision, and astronomy.

Beyond my academic work, I have worked with various industrial partners on deploying robust ML tools safely and responsibly. I am a co-chair for the NASA FDL AI committee, an advisor with Canadian medical imaging company Imagia and Japanese robotics company Preferred Networks, and have recently been appointed as the Research Director for the UK AI taskforce.

More info at:

Debora Marks

I established my laboratory eight years ago after a career in industry and more recent degrees in mathematics and computational biology, aiming to accelerate fundamental discoveries in biomedicine and biotechnology. I have developed statistical methods, including novel machine learning methods, for biological data, with an emphasis on interpretability and causality. My lab was able to predict 3-dimensional protein structures from sequence alone; predict the fitness effects of human genetic variation; build robust generative models for protein therapeutics, antibody design, and deimmunization design; and discover new causes of microbial resistance using whole-genome epistatic models. My mission is to develop machine learning for the design of biological interventions for environmental and human health, now focusing on viral genome forecasting and infectious disease pathophysiology.

My work has been published in premier biological venues (e.g. Nature, Nature Biotechnology) as well as premier machine learning/AI venues (NeurIPS, ICML). In 2016, I received the ISCB Overton Award for outstanding accomplishment in the field of computational biology; in 2018, the Chan Zuckerberg Initiative Ben Barres Early Career Acceleration Award in the Neurodegeneration Challenge Design; and in 2020, an NIH Director's Transformative award for antibody design. In 2022, I became a Fellow of ISCB for outstanding contribution to the fields of computational biology and bioinformatics.

More info at:
