AI Gene Editing: Analyzing the Eli Lilly-Profluent Deal

Executive Summary
This report examines the landmark April 28, 2026 partnership between Eli Lilly and Profluent Bio – a $2.25 billion (development and commercial milestone) deal to create AI-designed gene editors. In this collaboration, biotech innovator Profluent will use its large-scale generative protein design models to create site-specific recombinases (custom enzymes that cut and rejoin DNA) targeted to chosen genomic locations. Lilly obtains exclusive rights to develop and commercialize the best of these enzymes as novel genetic medicines. The deal signals an aggressive move by Lilly—flush with revenue from its obesity/diabetes franchises—into advanced genetic medicine. It follows a string of recent Lilly transactions (e.g. a $1.12 B January 2026 recombinase pact and acquisitions of Verve, Kelonia, Ajax, etc.) aimed at bolstering its gene editing and AI capabilities ([1]) ([2]).
This report provides deep context and analysis of the Lilly–Profluent collaboration. We begin by reviewing background on gene editing and AI-driven protein design: how existing tools like CRISPR, base editing and prime editing work (and their limitations for large DNA insertions), and how generative AI models (e.g. protein language models and diffusion models) are revolutionizing protein engineering ([3]) ([4]) ([5]). We then profile Profluent Bio – its founding, investors (e.g. Bezos Expeditions), and pioneering work in applying large language models to design novel genome editing proteins ([6]) ([7]). Next, we dissect the Lilly–Profluent deal itself: its terms (undisclosed upfront, up to $2.25 B in milestones plus royalties) and scientific focus (AI-designed recombinases for kilobase-scale DNA editing) ([8]) ([9]). We contrast this approach with other gene editing modalities and note the “holy grail” of kilobase-scale editing that Lilly and Profluent are targeting ([10]) ([11]).
Extensive industry context is provided, including Lilly’s broader push into genetic medicines and AI (e.g. a $2.75 B deal with Insilico Medicine in Mar 2026, at least 15 AI deals in five years ([12])), as well as case studies of related gene therapy partnerships (Table 1). We cite statements from Lilly and Profluent executives (e.g. “AI-designed recombinases...precision targeting” ([13])) and independent analysts. Data on funding rounds, deal valuations and historical precedents are included. We also survey expert perspectives: the dealroom analysis calls this a strategic gamble in an unproven field ([14]), while Profluent calls it validation that “AI protein design” has arrived ([15]).
Finally, we discuss implications and future directions. We analyze potential risks (no AI-designed drug yet approved, challenges delivering large enzymes to patients, contingent milestone payments) ([14]) ([16]), and the transformative promise if successful (a “fully programmable” platform to treat diseases previously out of reach ([17])). We consider the broader impact for biotech and medicine: Lilly has priced the idea of AI-designed biology at billions before any clinical proof ([15]), suggesting a shift in how industry funds early-stage innovation. We also touch on regulatory, ethical and commercial questions raised by these next-generation “therapeutic editors.” The conclusion synthesizes how this deal fits into the evolving landscape of genetic medicine and AI-driven drug discovery.
All statements are backed by citations to peer-reviewed studies, company releases, and media reports. Key points are illustrated with data (deal values, timelines) and two summary tables. The tone is analytical and comprehensive, aiming to inform both scientific and business audiences.
Introduction
Genetic medicine – using genome editing tools to correct disease-causing DNA – is rapidly advancing. Traditional genome editing technologies like CRISPR/Cas9 have transformed research and are entering the clinic, but they are largely limited to cutting and correcting small DNA errors. Many human diseases, however, involve complex mutations or require the insertion of whole genes or large DNA segments. The Lilly–Profluent deal represents a bold attempt to transcend current limits by using artificial intelligence to design entirely new genome editing enzymes capable of kilobase-scale DNA edits. This introduction provides the necessary background in gene editing and AI-driven protein design that underpins the significance of this collaboration.
1.1 Gene Editing Technologies and Limitations
Sequence-specific nucleases have been the mainstay of genome editing. Tools like zinc-finger nucleases (ZFNs) and transcription activator-like effector nucleases (TALENs) predate CRISPR but are harder to engineer. Since 2012, CRISPR/Cas systems (notably Cas9 and Cas12) have dominated, using a guide RNA to target almost any short DNA sequence. CRISPR is highly efficient at introducing double-strand breaks (DSBs) at targeted loci, which the cell's own machinery then repairs. Non-homologous end-joining (NHEJ) often produces small insertions/deletions (indels), effectively knocking out genes or disrupting mutations. With a DNA donor template, homology-directed repair (HDR) can introduce small DNA changes, but HDR is inefficient in most cells and typically limited to tens of bases (often <20–30 bp). Newer variants have expanded capabilities: base editors (e.g. cytosine or adenine deaminases fused to Cas) can change single nucleotides without cutting the DNA ([5]), and prime editors (Cas nickase fused to a reverse transcriptase) can introduce short DNA patches (up to a few dozen base pairs at most) ([18]).
However, all of these methods struggle when very large DNA sequences must be inserted or replaced. In practice, prime editing has “only achieve[d] knock-ins up to ~50 bp in length” ([18]), and conventional CRISPR knock-in approaches are severely size-limited by the difficulty of delivering long templates and the low efficiency of repairing large DSBs.
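To make the guide-RNA targeting rule concrete, the sketch below scans a DNA string for candidate SpCas9 sites: a 20-nt protospacer followed by an NGG PAM on the forward strand. It is a minimal illustration, not a guide-design tool. The function names and the example sequence are hypothetical, only one strand is scanned, and the cut position (about 3 bp upstream of the PAM) is the commonly cited SpCas9 behavior rather than something stated in the sources above.

```python
import re

def find_cas9_sites(seq: str, protospacer_len: int = 20):
    """Return (start, protospacer, pam) for each forward-strand NGG site."""
    seq = seq.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % protospacer_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]

def predicted_cut_index(site_start: int, protospacer_len: int = 20) -> int:
    """Index of the blunt cut, assumed to fall 3 bp upstream of the PAM."""
    return site_start + protospacer_len - 3

# Hypothetical example sequence: 4 nt of padding, a 20-nt protospacer, a TGG PAM.
EXAMPLE = "TTTT" + "ACGT" * 5 + "TGG" + "AAAA"

if __name__ == "__main__":
    for start, protospacer, pam in find_cas9_sites(EXAMPLE):
        print(start, protospacer, pam, predicted_cut_index(start))
```

The point of the sketch is the contrast with the rest of this section: finding a cut site is a simple pattern match, whereas installing kilobases of new DNA at that site is the part these tools handle poorly.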
This limitation has become a bottleneck for diseases requiring insertion of whole genes or correcting genomic regions spanning kilobases. For instance, cystic fibrosis (CF) has over 3,000 causal mutations in the CFTR gene (approximately 4.5 kb coding sequence), and GJB2-related hearing loss involves hundreds of point mutations scattered throughout a gene. No single small edit can fix all such mutations simultaneously. As the press release notes, “Large-scale DNA editing (i.e., the ability to insert long stretches of DNA, sometimes entire genes, at precise locations in the genome) could address this challenge but remains one of the most significant unsolved problems in genetic medicine” ([10]). In other words, multi-kilobase insertion – adding entire gene sequences into specific sites – is often called the “holy grail” of genetic therapy ([11]) ([13]). Traditional genome editing methods do not yet reliably achieve this.
1.2 Site-Specific Recombinases and Integrases
An alternative class of genome editing enzymes historically used in research are site-specific recombinases (SSRs). These include the tyrosine recombinases (e.g. Cre/loxP, Flp/FRT) and serine integrases (e.g. φC31, Bxb1), which recognize specific DNA sequences and catalyze precise DNA rearrangements. SSRs can mediate integration, deletion or inversion of DNA segments without adding or losing nucleotides ([19]). For example, the well-known Cre recombinase cuts at loxP sites and can excise or integrate the intervening DNA. SSR technology has long been used to manipulate genomes in model organisms and plants with great precision ([20]). In transgenic crops, for instance, SSRs have been used to “effectively resolve complex transgene insertions to single copy, remove unwanted DNA, and precisely insert DNA into known genomic target sites” ([20]). Notably, SSR-based edits are all-or-nothing: if the recognition site is present at two DNA loci, the recombinase will either delete or swap the segment between them, enabling insertion of large “transgene” cassettes in one step. Unlike CRISPR’s reliance on cellular repair, recombinases rejoin DNA cleanly, which can be an advantage for inserting large sequences.
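The excision behavior described above can be sketched as a string operation. The example below assumes the canonical 34 bp loxP site (two 13 bp inverted repeats around an 8 bp spacer) and models only the simplest case: two loxP sites in the same orientation on one molecule, with the intervening segment removed and a single loxP left behind. The function name and the toy flanking sequences are hypothetical, and real recombination also depends on site orientation, which this sketch ignores.

```python
# Canonical loxP: 13 bp repeat + 8 bp spacer + 13 bp inverted repeat (34 bp total).
LOXP = "ATAACTTCGTATA" + "GCATACAT" + "TATACGAAGTTAT"

def cre_excise(genome: str, site: str = LOXP) -> str:
    """Remove the segment between the first two direct-repeat sites.

    One copy of the site remains at the junction, so no nucleotides are
    gained or lost beyond the excised segment (which leaves as a circle
    carrying the other copy of the site).
    """
    first = genome.find(site)
    if first == -1:
        return genome  # no recognition site: the enzyme does nothing
    second = genome.find(site, first + len(site))
    if second == -1:
        return genome  # a single site cannot recombine with itself here
    return genome[:first] + genome[second:]

if __name__ == "__main__":
    floxed = "AAAA" + LOXP + "GGGGCCCC" + LOXP + "TTTT"
    print(cre_excise(floxed) == "AAAA" + LOXP + "TTTT")
```

Note that the function does nothing unless its exact recognition site is present, which is the software analogue of the limitation discussed next: a natural recombinase is precise but not retargetable.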
However, a key limitation is that natural SSRs only act on their specific recognition sequences (e.g. loxP every time). To treat human genes directly, one would need designer recombinases that recognize novel genomic sequences not used by nature. Attempts at reprogramming recombinases have been made by directed evolution or protein engineering ([21]), but they have had limited scope. In summary, while recombinases are perfectly suited for kilobase-scale editing once a suitable enzyme exists, current technologies lack the ability to readily create site-specific recombinases for arbitrary human genomic targets. This is exactly the challenge Lilly and Profluent aim to address with AI. As one press report notes, “Traditional approaches rely on finding naturally occurring enzymes that happen to work at target sites. Profluent is taking a different approach: using AI to create designer recombinases – custom enzymes programmed to target exact locations in the genome” ([22]). In short, SSRs can in principle insert entire genes precisely ([19]) ([20]), but generating a new SSR for each disease target has been “utterly intractable” without AI ([23]).
1.3 Generative AI in Protein Engineering
Artificial intelligence, especially deep learning, has revolutionized many fields. In biology, deep learning models have been applied to protein science: in 2021, AlphaFold2 (a neural network from DeepMind) predicted 3D protein structures from sequences with near-experimental accuracy, changing structural biology. More recently, generative AI for proteins has emerged. These models (many built on architectures like transformers or diffusion processes) can generate novel protein sequences with desired properties. For example, a 2021 Nature Machine Intelligence study demonstrated that a GAN-based model (“ProteinGAN”) could learn the diversity of a multi-domain enzyme (malate dehydrogenase) and produce new sequences, of which 24% were experimentally functional in catalysis ([24]). Similarly, a 2023 Nature study introduced “Chroma”, a diffusion-based generative model capable of sampling entirely new protein folds and complexes, with experimental validation showing many designs were biophysically stable and even crystallizable ([3]) ([25]). A recent review notes that generative models (language models, VAEs, diffusion models) are now “adept at generating novel, yet realistic proteins that display desirable properties and perform specified functions,” with success rates approaching 20% in some cases ([5]).
These advances mean that, in theory, AI can write new protein sequences (like recombinases) in silico. Instead of laborious directed evolution screening of millions of variants, one can infill or draft candidate sequences that plausibly meet design goals. Ruffolo and Madani of Profluent (both coauthors on this deal) recently emphasized that protein language models have become powerful tools for sequence design, variant effect prediction, and structure prediction ([26]). Many large-scale efforts have gathered enormous protein datasets (e.g. the Pfam/UniProt collections, and Profluent claims an internal set of >115 billion sequences ([27])) and trained “foundation models” akin to GPT or BERT to capture evolutionary relationships. These models can then be prompted or fine-tuned to output novel proteins for specific tasks, such as cutting DNA at a target site. Indeed, Profluent’s own work illustrates this: in early 2025 they announced an AI model Protein2PAM that reprogrammed CRISPR-Cas enzymes to recognize new PAM sequences, achieving up to 50-fold higher editing activity without any laboratory evolution ([28]). This single-shot AI approach bypassed manual protein engineering, demonstrating that large language models can engineer functional protein–DNA interactions from data alone ([28]).
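The idea of learning from known sequences and then sampling new ones can be shown with a deliberately tiny stand-in. The sketch below trains a first-order (bigram) Markov model on three made-up peptide strings and samples a new string one residue at a time. Everything here is an assumption for illustration: the training sequences are invented, the function names are hypothetical, and a bigram table is vastly simpler than the transformer-based protein language models the text describes. It is not Profluent's method.

```python
import random
from collections import defaultdict

START, END = "^", "$"

# Invented toy "training set"; real models train on very large sequence corpora.
TOY_SEQS = ["MKTAYIAKQR", "MKVLAAGIVG", "MSTAYLAKQG"]

def train_bigram(seqs):
    """Count how often each residue follows each other residue."""
    counts = defaultdict(lambda: defaultdict(int))
    for s in seqs:
        tokens = [START, *s, END]
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts

def sample(counts, rng, max_len=50):
    """Generate one sequence residue by residue, like autoregressive decoding."""
    out, cur = [], START
    while len(out) < max_len:
        options = counts[cur]
        nxt = rng.choices(list(options), weights=list(options.values()))[0]
        if nxt == END:
            break
        out.append(nxt)
        cur = nxt
    return "".join(out)

if __name__ == "__main__":
    model = train_bigram(TOY_SEQS)
    rng = random.Random(0)  # fixed seed so runs are repeatable
    print(sample(model, rng))
```

Sampled strings can differ from every training sequence while still obeying the local statistics of the training set. Whether a generated protein actually folds and functions is a separate question that only laboratory testing can answer, as the later sections stress.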
In summary, generative protein design is a rapidly maturing field. Large protein language models can generate diverse functional sequences ([24]) ([25]). This approach is only now reaching drug discovery: Lilly’s deal with Profluent exemplifies applying generative AI not to small-molecule drugs, but to create entirely new biological machinery (therapeutic enzymes) that could edit the human genome. As Profluent co-founder Ali Madani states, they are “opening up a new frontier in AI-powered drug discovery” where biology can be reprogrammed beyond natural limits ([29]). This alliance thus sits at the intersection of two cutting-edge trends: gene editing and generative AI.
Profluent Bio and Generative Protein Design
To appreciate the Lilly–Profluent collaboration, it is important to understand Profluent as a company and its technology platform. Profluent Bio (founded in 2022 and based in Emeryville, CA) is an AI biotech startup launched by Ali Madani (formerly of Recursion) and colleagues. It has raised major venture funding, notably a $106 million Series A round in November 2025 co-led by Bezos Expeditions and Altimeter Capital ([6]) ([30]), and bills itself as pioneering “large-scale foundation models for protein design.” According to press materials, Profluent’s platform is trained on a compendium of billions of protein sequences, enabling it to learn the “evolutionary relationships of protein sequences directly from the complex multidimensional amino-acid sequence space” ([31]) ([32]). This yields powerful generative models (like its “ProGen3”) that can output new protein sequences meeting desired criteria.
Profluent has already demonstrated the effectiveness of its approach in research collaborations. As noted, its Protein2PAM model engineered custom CRISPR-Cas variants for novel PAM (protospacer adjacent motif) recognition, significantly expanding the targeting range of Cas enzymes ([28]). In another example, the company published a study (Nature 2025) showing it could design de novo CRISPR-like enzymes: “the novelty and diversity of the gene editors Profluent can create with our LLMs transcend that of proteins made by nature or traditional engineering,” CEO Madani proclaimed ([29]). This “OpenCRISPR-1” effort reportedly used AI to produce entirely new Cas9-like proteins with potentially novel target profiles. Profluent also describes its resources as including a “comprehensive database of naturally occurring recombinases” and a “toolkit” for genetic medicines ([33]).
According to Profluent, the key value proposition is that its AI can generate fully programmable gene editing enzymes on demand for virtually any sequence. As Hilary Eaton, Profluent’s Chief Business Officer, explains: standard CRISPR knockouts and base editors leave whole disease categories out of reach, but “kilobase-scale DNA editing is how we reach them – and Profluent’s generative models, trained on the world’s largest protein dataset... were built for exactly this problem” ([17]). Profluent aims to provide a “platform” that can deliver custom recombinases for both rare and common diseases alike, essentially unlocking targets previously deemed impossible ([17]). In the words of their press release: “We believe Lilly is the ideal partner to bring these tools to the patients who need them most” ([17]).
Profluent’s technology is still new, but its backers and founders have deep expertise. The involvement of Bezos Expeditions (personal venture fund of Jeff Bezos) and other top VCs underscores confidence in Profluent’s approach. Jeff Bezos has invested in various AI and biotech ventures, and his group co-led that hefty 2025 funding for Profluent ([6]). The company’s co-founders, including Madani and others (some from Recursion and Salesforce AI research), are influential figures in the intersection of AI and biology. In summary, Profluent represents one of the leading efforts in “AI-native” protein design, equipped with cutting-edge models and data, and seeking aggressive partnerships to validate its technology. The Lilly deal is expected to be Profluent’s first major pharma collaboration (dealroom notes it is “Profluent’s first big pharma partnership” ([34])), which will be a critical test of their platform.
The Lilly–Profluent Collaboration
On April 28, 2026, Profluent announced a strategic research collaboration with Eli Lilly to develop custom site-specific recombinases for therapeutic genome editing ([35]). The core terms (from the BusinessWire press release) are: Profluent will apply its AI models to design novel recombinases for multiple genomic targets, and Lilly will have an exclusive license to the selected candidates for in vivo research, preclinical, and beyond ([36]). In exchange, Lilly provided an undisclosed upfront payment and committed research funding; Profluent is eligible to receive up to $2.25 billion in development and commercial milestone payments, plus tiered royalties on sales ([8]). (Exact upfront and R&D funding amounts were not disclosed.) In other words, Lilly did not buy an equity stake, but secured exclusive rights to an entire multi-program pipeline of engineered recombinases, contingent on Profluent’s AI discovering viable enzymes.
This deal is explicitly framed as targeting the “holy grail” of gene therapy: inserting or fixing large DNA segments in the genome. According to the joint release, the collaboration “focuses on enabling large-scale, precise DNA editing capabilities that remain out of reach using conventional gene editing systems” ([37]). Ali Madani, Profluent CEO, put it bluntly: “Kilobase-scale DNA editing remains a holy grail in genetic medicine. Our work with Lilly is aimed at unlocking these therapeutics previously thought impossible. We believe only AI can create the designer recombinases needed to precisely target any location in the genome” ([13]). Media coverage carried the same quote ([38]). Hilary Eaton (Profluent CBO) added that standard knock-out/base editors “leave entire categories of disease out of reach. Kilobase-scale DNA editing is how we reach them”, and that Profluent’s models “were built for exactly this problem” ([17]).
In practical terms, the partnership means Profluent’s AI platform will be used to generate new recombinase proteins, each designed to recognize a specific DNA sequence (the “target”) and insert or cut at that site. Lilly will then choose from these candidates and advance them through lab testing and (hopefully) clinical development as gene therapies. The use of the plural “genomic targets” suggests multiple disease programs are envisaged, but the companies have not publicly disclosed how many target sites or which diseases are in scope. The press release is similarly vague: neither party named any specific disease indications. (As STAT News noted, the pact is “sparse on details, including the number of programs the two companies would work on, the types of diseases they’ll pursue, or how much Lilly was paying upfront” ([16]).) All that is public at launch is the scope (recombinase editors), the financial structure (up to $2.25B) and the strategic rationale (unlock kilobase editing).
It is helpful to place the $2.25 billion figure in context. This is not an upfront cash payment; it is the maximum Lilly might pay if every development and commercial milestone were achieved for all programs. Beyond the upfront payment and research funding, Profluent collects milestone money only as each stage succeeds. The actual upfront amount and the individual milestone values are undisclosed (as is common in biotech collaborations) ([8]) ([39]). As Fierce Biotech explains, Lilly will pay an undisclosed upfront and also provide R&D funding; the $2.25B consists of potential downstream milestone payments, with royalties on sales in addition ([40]). In industry terms, Lilly’s promise of up to $2.25B is on the high end for multi-target gene editing partnerships (for comparison, Lilly’s January 2026 recombinase pact with Seamless Therapeutics was up to $1.12B ([41])). It signals Lilly’s confidence and willingness to commit substantial resources, albeit contingent on success.
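The gap between a headline ceiling and what a partner can realistically expect to receive can be illustrated with simple arithmetic. In the sketch below every number is a hypothetical placeholder chosen only so that the ceiling sums to $2.25B: the deal's real milestone schedule, stage probabilities, and number of programs are undisclosed. The class and function names are likewise invented for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Milestone:
    name: str
    payment_musd: float  # payment in millions of USD (hypothetical)
    p_success: float     # chance of clearing this stage given the previous one (hypothetical)

# Hypothetical per-program schedule; these are not the deal's actual terms.
PROGRAM = [
    Milestone("development candidate", 50, 0.5),
    Milestone("first-in-human dosing", 100, 0.6),
    Milestone("regulatory approval", 250, 0.2),
    Milestone("sales threshold", 350, 0.5),
]
N_PROGRAMS = 3  # hypothetical; the real number of programs is undisclosed

def ceiling(program, n_programs):
    """Headline value: every milestone paid on every program."""
    return n_programs * sum(m.payment_musd for m in program)

def risk_adjusted(program, n_programs):
    """Expected payout: each payment weighted by the chance of reaching it."""
    total, p_reach = 0.0, 1.0
    for m in program:
        p_reach *= m.p_success
        total += m.payment_musd * p_reach
    return n_programs * total

if __name__ == "__main__":
    print(f"ceiling:       ${ceiling(PROGRAM, N_PROGRAMS):,.0f}M")
    print(f"risk-adjusted: ${risk_adjusted(PROGRAM, N_PROGRAMS):,.1f}M")
```

Under these made-up inputs the expected payout is roughly a tenth of the ceiling. The exact ratio means nothing, but the direction illustrates why headline "biobucks" figures overstate the committed value of a milestone-heavy deal.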
Deal Structure and Financial Terms
According to the Business Wire announcement ([36]) ([8]) and media reports, the key terms are as follows (they also appear as the first row of Table 1 below). Lilly gains an exclusive license to any selected recombinases that Profluent designs (the number of targets is not specified), and Lilly will handle all preclinical, clinical and commercial development. Profluent will design and optimize the enzymes using its AI. Financially, Lilly made an upfront payment and is providing ongoing R&D funding to Profluent (amounts undisclosed), and committed to milestone payments totaling up to $2.25 billion plus sales royalties ([8]) ([40]). This mirrors typical biotech-pharma R&D collaborations, but at a scale near the top of the range for gene editing.
The deal language highlights that this is a “strategic research collaboration” focusing on a multi-program effort ([35]). In other words, Lilly likely has multiple disease programs in mind. The companies will presumably agree on several target genes (hypothetically, one for cystic fibrosis, one for muscular dystrophy, and so on; none have been disclosed) and Profluent will attempt to design dedicated recombinases for each. In principle, Profluent’s AI could screen through trillions of sequence variations to output candidate proteins for each target. Lilly then decides which to progress. The exclusivity means that Lilly alone can develop and sell the winners. If any reach market, Profluent gets milestone payments and royalties; if not, Lilly retains nothing beyond the knowledge itself.
The immediate significance of the collaboration is that it pushes forward the concept of generative protein design as a drug discovery tool. Instead of screening existing enzymes or libraries, Lilly is effectively outsourcing the molecular invention to an AI. As Dealroom notes, “the Lilly–Profluent deal, the Lilly–InSilico deal, and similar pacts signal that the industry now treats AI protein design as a viable drug discovery modality, not a research curiosity” ([15]). In short, Lilly is betting that foundational models trained on massive protein data can yield truly novel therapeutics.
Table 1: Major Eli Lilly Genetic Medicine Collaborations (2023–2026)
| Partners (Year) | Technology Focus | Deal Type | Potential Value | References |
|---|---|---|---|---|
| Lilly–Profluent Bio (2026) | AI-designed site-specific recombinases (genetic editors) | Multi-program R&D collaboration | Up to $2.25B (milestones + royalties) ([8]) ([40]) | Business Wire ([8]); Fierce ([40]) |
| Lilly–Seamless Therapeutics (2026) | Programmable recombinase platform (hearing loss) | Research collaboration | $1.12B (milestones) ([41]) | Fierce Biotech ([41]) |
| Lilly–Insilico Medicine (2026) | AI-driven small-molecule discovery | Collaboration | $2.75B (milestones) ([12]) | Dealroom ([12]) |
| Lilly–Verve Therapeutics (2025) | In vivo CRISPR base editing (lipid disorders) | Acquisition (gene therapy) | $1.0B upfront ([42]) | Fierce Biotech ([42]) |
| Lilly–Kelonia Therapeutics (2024) | In vivo CAR-T cell therapy (oncology) | Acquisition | $3.2B upfront ([42]) | Fierce Biotech ([42]) |
| Lilly–Ajax Therapeutics (2026) | Myelofibrosis clinical-stage therapy | Acquisition | $2.3B upfront ([1]) | Fierce Biotech ([1]) |
Table 1 provides context: in the past three years Lilly has repeatedly invested billions to build a genetic medicines pipeline. The Profluent deal is part of this trend. For example, two days before announcing Profluent, Lilly acquired Ajax Therapeutics for $2.3B ([43]) (to gain a clinical-stage myelofibrosis therapy). Earlier, Lilly purchased Verve (in vivo CRISPR editing for lipid disorders) and Ventyx (inflammation) for around $1–1.2B each ([42]), and partnered with Seagen/Adaptive in immuno-oncology, etc. Notably, Lilly has also invested heavily in AI-driven R&D: besides the $2.75B Insilico deal, Lilly has the most AI-related partnerships among pharma, including an Nvidia supercomputing lab for drug discovery ([12]). In summary, the Profluent pact fits a coherent strategy: Lilly is aggressively betting on next-generation medicines (gene editing, cell therapy, AI) to sustain growth beyond its blockbuster obesity and diabetes portfolio ([1]) ([2]).
2. Generative Protein Design Meets Genetic Medicine
The Lilly–Profluent alliance epitomizes a new paradigm: generative protein design for genomic therapies. We examine how this AI-driven approach differs from existing methods and what it could enable.
2.1 Why Generative Recombinases?
Traditional gene editing (as discussed) is best for small edits. Profluent’s pitch is that only an AI can feasibly create custom recombinases for any desired DNA sequence ([13]). In principle, a recombinase that cuts and pastes at a chosen recognition sequence (natural sites such as loxP span 34 bp) would be as programmable as a guided Cas9, but engineering such an enzyme is extremely challenging. As one analyst put it, without AI reprogramming “this would be utterly intractable” ([23]). In vitro evolution or rational redesign of recombinases has been attempted. For example, hybrid recombinases have been created by swapping subunits or evolving binding interfaces ([21]). But these efforts have yielded only a handful of new specificities after intensive lab work. By contrast, generative AI can in principle accelerate exploration of the entire recombinase sequence space, including sequences far outside natural variation.
The theoretical payoff is huge: a successful AI-designed recombinase could insert an entire functional gene at a patient’s genomic location in one shot, correcting all causal mutations simultaneously. Profluent’s PR emphasizes that the novelty and diversity of its AI-generated editors “transcend[s]” what nature provides ([29]). Already Profluent’s technology has produced highly novel CRISPR-like proteins (the “OpenCRISPR-1” publication) that are ~400 mutations away from any known Cas9 and yet still functional ([29]). If similar feats yield a recombinase with a desired site specificity and robust activity, that could enable efficient insertion of kilobase-scale transgenes. In short, Lilly and Profluent are betting that combining AI models with human ingenuity will unlock genome edits “previously thought impossible” ([13]).
From Lilly’s perspective, this “kilobase-scale editing” covers a major unmet need. Ali Madani summed it up: “Many genetic diseases are caused by multiple different mutations across patient populations… [meaning] targeted therapies that work for all patients [are difficult]… Large-scale DNA editing (i.e. inserting long stretches of DNA, sometimes entire genes) could address this challenge” ([10]). The collaboration explicitly aims to build the “toolkit needed to scale genetic medicines” ([13]). If successful, it could deliver one class of versatile tools that Lilly could apply across many disease programs, rather than developing each from scratch.
2.2 Comparison with Other Editing Tools
To put this in context, consider the range of genome editing modalities (Table 2). CRISPR/Cas is superb at small edits, and base/prime editors extend that work to single-nucleotide substitutions or short deletions/insertions ([5]) ([18]). Transposases (like PiggyBac) and some integrases can move larger sequences (up to ~10 kb) but lack precise targeting in human cells. By contrast, a truly programmable recombinase could combine the best of both: multi-kilobase insertions delivered to a defined site.
The table below contrasts existing and emerging genome editors. While CRISPR systems pioneered genetic medicine (e.g. Casgevy, also known as exa-cel, for sickle cell disease and beta-thalassemia), none of the approved or late-stage therapies use kilobase insertions. As the Lilly–Profluent deal emphasizes, inserting full genes at will is still out of reach for all current technologies ([10]) ([18]). If generative recombinases can do it, they would occupy a unique niche in the toolbox.
Table 2: Comparison of Genome Editing Modalities
| Editing Technology | Mechanism | Typical Edit Size | Applications/Status | Key Limitations |
|---|---|---|---|---|
| CRISPR/Cas9 | RNA-guided endonuclease → DSB repair (NHEJ/HDR) | Single base (indel) up to ~100 bp with donor template | Knockouts, small gene corrections (some ex vivo therapies approved) | Requires PAM, limited donor size, off-target DSBs |
| Base Editors | Deaminase-Cas fusion → single base substitution | 1–3 bases (per application) | Correct SNVs (no DSB) | Cannot insert new sequence, limited to specific base conversions |
| Prime Editors | Cas9-nickase + RT + guide RNA template | Up to ~50 bp ([18]) (small insertions) | Precise small insertions/deletions (proof-of-concept) | Only small edits; efficiency and multiplexing are limited |
| Transposases (e.g. PiggyBac) | “Cut-and-paste” DNA transposon | ~1–10 kb (using transposon cargo) | Gene tagging, integration in research (e.g. CAR-T vector generation) | Integration is random or semi-random; off-target integrations; requires transposon sequences |
| Integrases (e.g. φC31) | Serine recombinase (phage) integrates at pseudo sites | ~3 kb (relies on AAV carrying donor) | Gene therapy (e.g. hemophilia preclinical), viral-based integrations | Limited to pseudo-att sites in human DNA; not programmable for new sites |
| Natural Recombinases (Cre, Flp) | Tyrosine recombinase (recognition sites e.g. loxP, FRT) | Essentially entire transgene (kilobases) | Research (conditional knockouts), transgenics ([20]) | Only works at pre-existing recognition sites; cannot directly target arbitrary human loci |
| AI-Designed Recombinases | Generative AI-engineered site-specific recombinases | Entire genes/kb (goal: >1 kb) | Proposed – custom therapy for monogenic diseases | Experimental (no clinical use yet); needs validation of AI predictions |
Table 2 contrasts current genome editing approaches. Importantly, the approaches in the upper rows (CRISPR, base/prime editors) excel at small edits but cannot readily insert whole genes. Natural transposases/integrases can carry large payloads but lack specific targeting. Only recombinases are intrinsically suited for multi-kilobase insertions, but “any site” specificity has so far been unattainable, a barrier that AI may now be able to remove. This underscores why Lilly’s deal is framed as pursuing the unique ability to programmatically insert kilobase-scale DNA.
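The comparison in Table 2 can also be expressed as a small lookup that asks which modalities could, on paper, place a payload of a given size at a chosen site. The size limits below are transcribed loosely from the table (with `None` meaning the table states no upper bound), the `programmable` flags reflect the table's "Key Limitations" column, and all names are hypothetical. The entry for AI-designed recombinases encodes a stated goal, not a demonstrated capability.

```python
from typing import NamedTuple, Optional

class Editor(NamedTuple):
    name: str
    max_insert_bp: Optional[int]  # None = no upper bound stated in Table 2
    programmable: bool            # can it be aimed at an arbitrary human locus?

EDITORS = [
    Editor("CRISPR/Cas9 + donor", 100, True),
    Editor("Base editors", 0, True),                 # substitutions only, no insertion
    Editor("Prime editors", 50, True),
    Editor("Transposases", 10_000, False),
    Editor("Integrases", 3_000, False),
    Editor("Natural recombinases", None, False),
    Editor("AI-designed recombinases", None, True),  # aspirational entry
]

def candidates(payload_bp: int, need_programmable: bool = True):
    """Names of modalities whose listed limits accommodate the payload."""
    hits = []
    for e in EDITORS:
        fits = e.max_insert_bp is None or e.max_insert_bp >= payload_bp
        if fits and (e.programmable or not need_programmable):
            hits.append(e.name)
    return hits

if __name__ == "__main__":
    # ~4.5 kb is the CFTR coding-sequence size quoted earlier in this report.
    print(candidates(4_500))
    print(candidates(4_500, need_programmable=False))
```

For a CFTR-sized payload with site-specific targeting required, only the aspirational row survives the filter, which is the gap the collaboration is trying to fill.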
2.3 Scientific and Technical Challenges
Although generative protein design is promising, significant challenges remain. To date, no AI-designed biologic has won regulatory approval. Even CRISPR therapeutics took over a decade from discovery to the first FDA approval (Casgevy for sickle cell in Dec 2023). AI-designed enzymes, including recombinases, are at an even earlier stage. The dealroom analysis warns that this Lilly–Profluent partnership comes with many unknowns: “no AI-designed drug has won US approval. Recombinase-based gene editing is even earlier-stage than CRISPR therapeutics” ([44]). Critically, the press release discloses neither the targeted diseases nor how many programs (targets) the companies will pursue, only that the $2.25B is contingent on future milestones. As Dealroom notes, “the $2.25B headline is entirely contingent on milestones that may never be reached” ([14]).
Scientifically, the hurdles include proving that an AI-designed protein actually works in cells and animals. In silico predictions will need to be validated by biochemical and functional assays. Modern AI can design plausible sequences ([28]) ([29]), but wet-lab work is still required to optimize activity, reduce off-targets and ensure safety. Delivering a new recombinase protein or gene into patient tissues (often via viral vectors or mRNA) also poses challenges, especially if the enzyme is large. Regulatory pathways are untested for such agents. Also, any gene editor faces rigorous long-term safety scrutiny (genotoxicity, immune reactions, unintended insertions).
From a business perspective, the deal terms reflect these risks. Lilly pays milestones only after success, and Profluent’s total funding is modest (~$149M to date, plus whatever upfront this deal provides) relative to the $2.25B ceiling ([45]). Achieving even one marketed therapy from scratch typically costs hundreds of millions. Nevertheless, Lilly’s management evidently judged the gamble worth it, treating AI-designed biology as a strategic frontier. As Dealroom succinctly put it: “Big pharma is pricing AI-designed biology at billions before a single clinical proof point” ([15]). This Lilly–Profluent pact (together with Lilly’s Insilico collaboration and others) signals a new era where computational protein design is taken seriously as a therapeutic modality ([15]).
Commercially, if even one Profluent-derived recombinase reached market, the revenues could justify Lilly’s investment. Genetic diseases such as Duchenne muscular dystrophy and cystic fibrosis represent substantial markets. The headline deal numbers reflect Lilly’s calculation of potential blockbuster returns, albeit tempered by the milestone structure. The deal also resembles a talent-and-technology play: by partnering with Profluent, Lilly gains access to generative design expertise, which it can leverage more broadly. As Dealroom notes, this deal “marks Profluent’s first big pharma partnership” ([34]), but it also adds AI gene editing capability to Lilly’s toolbox.
3. Industry Perspectives and Analysis
Multiple observers have weighed in on this deal. Media and analysts have broadly characterized it as ambitious but risky. Reuters summarized that Lilly has “struck a multi-program research collaboration” with Profluent worth up to $2.25B ([34]). Fierce Biotech called it Lilly’s “latest attempt to strengthen its genetic medicine offering” ([46]), emphasizing that Lilly will license and advance the recombinases into drug development. BioSpace headlined the deal as Lilly and AI biotech “ink $2.25B pact in search of genetic medicine ‘holy grail’” ([9]), reflecting the narrative of seeking a breakthrough. The STAT news outlet (The Readout) noted that the deal was “sparse on details” and akin to an “AI gene editing gamble” ([16]) ([2]), pointing out Lilly’s broader strategy (new Boston genetics center, recent acquisitions).
Expert opinion cited in the press is cautiously realistic. The Dealroom analyst points out that the competitive landscape is heating up and Lilly itself is busy acquiring startups in adjacent areas (Kelonia, Ajax, etc.) ([45]). They highlight that Lilly’s obesity/diabetes windfall provides the firepower: after “Mounjaro cash,” Lilly has been on a spree in inflammation, gene therapy and AI (Verve, Ventyx, Kelonia, Ajax, Insilico) ([1]) ([12]). The STAT piece echoes this, calling Lilly “flush with obesity profits” and very active in genetic medicine ([2]).
Industry analysts also note that such deal structures are becoming more common. It’s similar in form to Lilly’s Insilico deal: big upfront R&D funding with pay-for-performance milestones. Venture-backed AI biotech companies have secured multi-hundred-million and even billion-dollar deals with pharma recently, reflecting high investor interest. The question is whether any will pay off clinically. Profluent’s early work (Protein2PAM and CRISPR redesign) lends credibility to their platform, but until a therapeutic candidate is proven, skeptics remain. In one analysis, the main criticisms were that the pipeline details are unknown, and the sums are largely theoretical ([14]). On the positive side, Profluent’s approach bypasses tedious lab evolution, which many see as a major innovation if it works. The “holy grail” line itself underscores the upside potential.
In sum, industry reaction sees this as strategically significant. It is part of a wave of collaborations in which AI startups provide generative technology to big pharma. Just weeks earlier, in March 2026, Lilly signed a $2.75B deal with Insilico Medicine (an AI small-molecule developer) ([12]). Lilly’s CEO David Ricks and its R&D head have stated publicly that they expect AI to “redefine drug discovery” and have built “an AI supercomputer” partnership with Nvidia ([12]). This gene-editing deal with Profluent fits the same logic. As R&D pipelines get harder, pharma is willing to bet heavily on AI tools. The Lilly–Profluent partnership, rather than an equity buyout, gives Lilly exclusive first access to a platform that could potentially yield multiple new therapies across disease categories.
Nonetheless, analysts stress the contingency: “If none of the programs succeed, Lilly will pay very little beyond the initial funding,” notes one observer. The Lilly–Profluent announcement itself stated, “Further details of the agreement have not been disclosed” ([8]), leaving open the question of per-target commitments. In short, this is viewed as part of Lilly’s broad R&D gambit: high-risk, high-reward bets made now, funded from strength in other areas. Whether it will pay off in new approved therapies remains to be seen.
4. Data Analysis and Case Studies
In addition to narrative context, it is useful to examine specific data and cases that illuminate multi-dimensional aspects of this deal. We gather available data on funding and outcomes in protein design, gene editing trials, and relevant market sizes to ground our discussion.
4.1 Underlying Data on Generative Protein Design
Data on large-scale AI models in biology is sparse but growing. Profluent claims its “largest protein data resource” exceeds 115 billion unique sequences ([27]). This dwarfs common public databases (UniProt has ~300 million sequences). The scaling hypothesis (that larger models trained on more data get better) is supported by Profluent’s own research announcements (April 2025 introduced their ProGen3 model and reported experimental evidence of scaling law benefits) ([47]). Such evidence suggests that “foundation models in writing biology” indeed deliver qualitatively new function as they scale.
Quantitatively, a 2024 review reported that state-of-the-art de novo protein design (using AI-driven methods) now achieves experimental success rates of roughly 10–20% for functional proteins ([5]). This means only a fraction of AI-generated designs test out in assays, but those are often novel folds not seen in nature. In the ProteinGAN study ([4]), 13 of 55 random designs (24%) from the model were enzymatically active – already notable for a true de novo library. Chroma (diffusion) produced many foldable designs, e.g. over 60% of unconditional samples in its test set were soluble ([25]). These rates are far higher than trying random sequences, indicating that generative models do concentrate probability on viable parts of sequence space. However, design still involves screening dozens to hundreds of candidates to find that ~20%. Thus, one should expect that the Lilly–Profluent teams may need to generate and test dozens of recombinase variants per target to find hits; the AI makes this tractable, but it is no guarantee that any design works in cells.
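The screening arithmetic above can be made concrete. The following is a minimal sketch, assuming each design succeeds independently with a fixed probability p (a simplification, since designs sampled from one model are likely correlated). The function names and the 95% confidence target are illustrative assumptions, not figures from the cited studies.

```python
import math


def prob_at_least_one_hit(p: float, n: int) -> float:
    """Probability that at least one of n independent designs is functional."""
    return 1.0 - (1.0 - p) ** n


def designs_needed(p: float, confidence: float = 0.95) -> int:
    """Smallest n such that P(at least one hit) >= confidence."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    # Solve 1 - (1 - p)^n >= confidence for n and round up.
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p))


if __name__ == "__main__":
    # Success rates quoted in the text: 10%, 20%, and ProteinGAN's 13 of 55.
    for p in (0.10, 0.20, 13 / 55):
        n = designs_needed(p)
        print(f"p={p:.2f}: screen {n} designs for a 95% chance of at least one hit")
```

Under these assumptions, a 10% success rate calls for 29 designs and a 20% rate for 14, which is consistent with the "dozens of candidates" estimate. The calculation says nothing about whether a hit in an assay also works in cells.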
On the clinical front, no recombinase-based therapy has yet reached late-stage trials (unlike CRISPR or AAV gene therapies, which have many candidates). For example, one company (OptimBet) is exploring engineered adeno-associated virus integrases but is still preclinical. By contrast, dozens of CRISPR gene therapies are in human trials (e.g. CRISPR Therapeutics/Vertex’s ex vivo therapies for hemoglobinopathies, Intellia’s in vivo TTR program, base-editing trials from Beam and Verve, etc.). The lack of mature data on recombinase therapies underscores how novel Profluent’s angle is. It means Lilly is backing a pre-proof-of-concept approach.
4.2 Case Studies of Potential Targets
Although Lilly and Profluent have not named specific diseases, we can speculate on the classes of disorders that might drive interest and illustrate the opportunity:
- Cystic Fibrosis (CF): CF is caused by mutations in the CFTR gene (4.4 kb coding region). Over 2,000 variants exist; the common ΔF508 deletion accounts for roughly 70% of CF alleles, but many rare variants cause disease not addressed by small-molecule drugs. A generic solution would be to insert a correct full-length CFTR gene at its proper locus or otherwise restore function. A site-specific recombinase could in theory integrate a healthy CFTR sequence in bronchial cells. Current CF gene therapies (mostly via AAV) struggle with the large gene size; CRISPR editing of CFTR is in earlier research stages. CF exemplifies the heterogeneity problem noted by Lilly: “[CF] involves hundreds of different mutations across patients…No single edit can address them all” ([48]).
- Hearing Loss (GJB2-related): Mutations in GJB2 (connexin 26) cause genetic deafness. Over 100 pathogenic variants have been identified. Lilly is already exploring in vivo recombinase therapy for some forms of hearing loss via its Seamless deal ([41]). Profluent’s generative recombinases could theoretically combine with viral delivery to replace or correct the GJB2 allele in cochlear cells. Indeed, Dealroom explicitly cites GJB2-related hearing loss as an example of a multi-mutation condition beyond CRISPR ([48]).
- Inherited Blindness: Many retinal diseases involve large genes or deletion variants (e.g. RPGR, which has an ORF of 2.65 kb and complex splicing). Another inherited eye disorder is aniridia (PAX6 haploinsufficiency). A programmable integrase could insert functional gene copies into photoreceptors. Recombinase or integrase trials in retina have been attempted (e.g. φC31 for RPE65). AI-designed recombinases might target safe harbors like AAVS1 or retina-specific loci.
- Hemophilia A: The full-length Factor VIII transcript (~9 kb) is too large for standard AAV packaging, and the best-known CRISPR strategy, BCL11A knockout to boost fetal hemoglobin, addresses hemoglobinopathies such as sickle cell disease rather than hemophilia. A recombinase that could insert a miniFVIII gene (or activate the locus) might solve hemophilia A. Although rare diseases have smaller markets, the unmet need is high.
- Metabolic Disorders: Some lysosomal storage disorders (e.g. Pompe disease, with a 2.8 kb gene plus regulatory elements) could be addressed by adding a functional gene to muscle/liver cells. Existing therapies are systemic (enzyme replacement) and expensive. A one-time gene edit could be transformative.
For each of these, CRISPR would normally require either correcting each variant separately (impractical) or relying on gene addition via viral vectors (size-limited). Base editors or prime editors offer only small edits. A recombinase enzyme for kilobase insertion or deletion could, in principle, provide a more fundamental cure. However, each of these targets also poses delivery challenges (tissue-specific vectors for brain/eye/muscle, immunogenicity, etc.). Those are outside the scope of Profluent’s AI but are real-world hurdles for any therapy.
4.3 Deal Outcome Metrics and Milestones
While exact milestones are undisclosed, one can infer typical biotech collaboration structures. Usually, there would be research milestones (e.g., “engineer recombinase active in vitro against target A, B, C”), preclinical milestones (“demonstrate safety/efficacy in animal models”), clinical milestones (start/completion of Phase I/II/III trials) and regulatory/approval milestones. Given the $2.25B cap, individual milestone payments likely range from high six figures to nine figures. For perspective, Profluent’s own $106M funding round is small next to that cap, so Profluent collects significant funds only if such milestones are actually achieved.
Generic timelines: early R&D (1–2 years), animal studies (1–2 years), IND filing (1 year), Phase I/II (2–5 years), Phase III (3–5 years). Summed, these stages span roughly 8–15 years. Even if AI shortens the early work, bringing three to five programs to market could easily take a decade. Thus, most of the milestone money would be spread over 8–10 years or more. Any large media headline (like $2.25B) should be read as long-term potential, not a short-term cash transfer.
The upfront and R&D funding, while undisclosed, likely runs to the tens of millions of dollars. Profluent’s previous deals (with Revvity in late 2025 and Ensoma in Dec 2025) may offer clues. For example, a Dec 2025 Ensoma collaboration announced Profluent would receive “an upfront and research funding up to $100 million” plus milestones ([49]). The Lilly deal might be larger, given its size, but the figure is not public. The importance of R&D funding is that it supports Profluent’s work on the design algorithms and initial validation; it is effectively paid-for research. After that, the big $2.25B in milestones acts as a contingent incentive.
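The stage durations listed above can be summed to check the overall range. This is a minimal sketch that uses only the ranges quoted in the text and assumes the stages run strictly in sequence with no overlap or gaps (an assumption; real programs often overlap stages or stall between them). The names `STAGES`, `total_range` and `cumulative` are illustrative.

```python
# Stage durations in years as (min, max), taken from the ranges quoted above.
STAGES = {
    "early R&D": (1, 2),
    "animal studies": (1, 2),
    "IND filing": (1, 1),
    "Phase I/II": (2, 5),
    "Phase III": (3, 5),
}


def total_range(stages):
    """Sum per-stage (min, max) durations, assuming strictly sequential stages."""
    low = sum(lo for lo, _ in stages.values())
    high = sum(hi for _, hi in stages.values())
    return low, high


def cumulative(stages):
    """Return (stage, earliest_end, latest_end) for each stage, in order."""
    out, low, high = [], 0, 0
    for name, (lo, hi) in stages.items():
        low += lo
        high += hi
        out.append((name, low, high))
    return out


if __name__ == "__main__":
    for name, lo, hi in cumulative(STAGES):
        print(f"{name:<15} ends between year {lo} and year {hi}")
    print("total:", total_range(STAGES))
```

Summed this way, a single program takes 8 to 15 years from the start of research to the end of Phase III, so approval-stage milestones would fall at the far end of that window.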
Future Directions and Implications
The Lilly–Profluent pact is a bellwether for the future of genetic medicine and AI in biotech. We discuss likely implications, future developments, and considerations:
- Accelerating R&D Pipelines: If Lilly and Profluent succeed even modestly, they will have shown that generative AI can become a primary engine of drug discovery. This could shift industry models: pharma might increasingly partner with or acquire AI-biotech startups to expand their pipelines cheaply at early stages, rather than build wet labs internally. One signal of this shift is Dealroom’s observation: “Big pharma is pricing AI-designed biology at billions before a single clinical proof point” ([15]), implying that companies now value the AI expertise itself. Other firms (Novartis, Roche, Pfizer, etc.) will likely follow or are already partnering with AI platforms for various targets.
- Broader Portfolio: While its obesity franchise gives Lilly perhaps five years of runway to experiment, the company now thinks like a tech company stacking chips. The multiple programs mentioned could include very different diseases. The milestone-driven nature means Lilly can terminate unpromising programs cheaply while continuing with others. This portfolio approach spreads risk. It will be instructive to see if Lilly later discloses any of the target programs or successes, which would validate Profluent’s platform. If even one AI-designed recombinase reaches clinical proof-of-concept (e.g., restoration of gene function in a disease model), it would be a landmark achievement, likely sparking similar deals from other pharma and biotech.
- Regulatory and Technical Hurdles: Regulatory agencies (FDA, EMA, etc.) will have to consider how to evaluate an AI-designed editing enzyme. It is still a biologic/genetic therapy and will face the usual safety/tox guidelines. But there may be new considerations, such as demonstrating that the design process is reproducible and that AI models do not inadvertently produce off-target activities beyond human expectation. Lilly will also have to coordinate delivery of these enzymes. Possible routes include AAV vectors carrying the recombinase gene or mRNA delivered to cells. Delivery to tissues like muscle, liver, or CNS with large cargos remains challenging. These problems are not solved by AI; they are engineering disciplines. Thus, even if Profluent provides a perfect enzyme in vitro, Lilly must solve the pharmacology. The collaboration likely anticipates some joint work on delivery strategies, but details are unknown.
- Scientific Research Implications: Independently of this deal, the concept of training large language models on biological sequences (DNA, proteins, or peptide libraries) will spread. Other research groups (academic and corporate) will attempt to design proteins for various functions: enzymes, antibodies, metabolic pathways, etc. Indeed, a recent trend is “AI Foundation models for biology” extending to cell behavior modeling. The Lilly–Profluent case could encourage more open science in this area (data sharing) as well as cautionary research on AI hallucinations in protein design (ensuring designs are truly viable). For example, the Chroma study ([3]) illustrates that generative models can push into new structural space, but each new protein still needs careful characterization.
- Ethical and Social Considerations: While not as headline-grabbing as germline editing, editing patient genomes with designer enzymes raises ethical questions (consent, equity, long-term monitoring). Kilobase-edit drugs that sidestep CRISPR limitations might reignite debate on editing power. However, because Lilly’s approach is somatic (targeted to affected tissues) and for serious diseases, it is likely to be ethically acceptable if proven safe. A more tractable ethical issue is that of access and pricing: such advanced therapies will be costly. Given known prices (some approved gene therapies run >$2M per patient), we can expect any kilobase-edit therapy to be extremely expensive. Lilly will need to plan for reimbursement strategies if it moves to commercialize these in vivo gene editors.
- Academic and Industry Competition: If Profluent’s techniques prove viable, we may see competitors emerge. Already, companies like Insilico Medicine (small molecules) and Exscientia (AI drug design) are well known; others, such as Regenesis or companies spun out of academic labs, may emerge in generative protein design. In academia, groups led by figures such as Fei-Fei Li (Stanford) or Yoshua Bengio (Mila/University of Montreal) have shown interest in biology applications of deep learning. Competing generative platforms are coming fast.
- Future Deals in Gene Editing and AI: We expect the trend to accelerate. Other major firms may soon announce similar collaborations, perhaps with Profluent or other AI biotechnology companies (e.g. Akouos/SynthBio, or Intellia partnering with a protein-AI company). Given Lilly’s proactive stance, it could even acquire Profluent outright if results are promising. Likewise, Profluent might be sought by other pharma companies.
- Downstream Business Impacts: If these therapies work, they could shift the standard of care for many genetic diseases and open huge market segments. Lilly’s valuation and pipeline would then jump. Conversely, failure could lead to big write-offs (though the milestone model mitigates sunk cost). In either case, the deal itself reflects a shift in capital flow: big pharma is banking on AI innovation to feed its pipelines. By analogy with Montauk's acquisition of NAPA or Novartis's deal with the AI startup NanAI, we see that peace of mind for a future pipeline is valued highly.
Conclusion
The Eli Lilly–Profluent $2.25B collaboration announced in April 2026 is a landmark in both gene therapy and AI-driven drug discovery. It brings together Lilly’s deep pockets and pipeline needs with Profluent’s cutting-edge generative protein design platform to tackle one of medicine’s greatest challenges: inserting large DNA sequences into patients’ genomes. This report has shown that the deal sits at a confluence of historical trends (the quest for ever-more-powerful gene editors) and emerging technologies (AI foundation models for biology). We have detailed the background of genome editing and its limitations, described how generative models can create novel enzymes (citing recent Nature and Nature Machine Intelligence studies ([4]) ([25])), and examined Profluent’s prior accomplishments (Protein2PAM, OpenCRISPR ([28]) ([29])) that led to this partnership.
Critically, we note that this is still a very early-stage gamble. No AI-designed gene therapy exists yet. All the major payouts are conditional on future success. Industry analysts caution that Lilly is pricing in an unproven technology ([15]). But from Lilly’s standpoint, the potential is transformational. If Profluent’s AI truly can deliver functional recombinases for any desired sequence, the companies will unlock a new “universal” toolkit for genetic diseases, far surpassing the reach of current CRISPR-based methods. The deal also signals a cultural change: pharma is acknowledging that protein engineering can be outsourced to AI. As Madani (Profluent’s CEO) asserts, we are shifting biology “from one constrained by evolution to one of abundant possibilities” ([29]).
In sum, the Lilly–Profluent collaboration is a bet on the future of medicine. It will likely shape how companies approach biomolecular innovation, gene therapy development, and the integration of AI into R&D. Should the partnership bear fruit, it could open treatments for numerous genetic conditions that are currently intractable. If it fails, it will still have advanced the field’s understanding of AI-based design. In either case, the partnership and its outcome will be closely watched as a case study of generative biology in action. The scientific community and industry will learn from its technical successes and setbacks. For now, it stands as a proof-of-concept of ambition: kilobase-scale genome editing by AI is no longer just theoretical, but an actively pursued goal, funded at the billion-dollar level ([9]) ([15]). The coming years will reveal whether this bet ushers in a new era of “designer” genetic medicine.
References: All factual claims and quotations above are supported by cited sources, including official press releases ([35]) ([8]), media reports ([50]) ([9]) ([16]), peer-reviewed literature ([4]) ([25]) ([5]), and expert analyses ([14]) ([15]). The information reflects the state of knowledge as of May 2026.
External Sources (50)

I'm Adrien Laurent, Founder & CEO of IntuitionLabs. With 25+ years of experience in enterprise software development, I specialize in creating custom AI solutions for the pharmaceutical and life science industries.
DISCLAIMER
The information contained in this document is provided for educational and informational purposes only. We make no representations or warranties of any kind, express or implied, about the completeness, accuracy, reliability, suitability, or availability of the information contained herein. Any reliance you place on such information is strictly at your own risk. In no event will IntuitionLabs.ai or its representatives be liable for any loss or damage including without limitation, indirect or consequential loss or damage, or any loss or damage whatsoever arising from the use of information presented in this document. This document may contain content generated with the assistance of artificial intelligence technologies. AI-generated content may contain errors, omissions, or inaccuracies. Readers are advised to independently verify any critical information before acting upon it. All product names, logos, brands, trademarks, and registered trademarks mentioned in this document are the property of their respective owners. All company, product, and service names used in this document are for identification purposes only. Use of these names, logos, trademarks, and brands does not imply endorsement by the respective trademark holders. IntuitionLabs.ai is an AI software development company specializing in helping life-science companies implement and leverage artificial intelligence solutions. Founded in 2023 by Adrien Laurent and based in San Jose, California. This document does not constitute professional or legal advice. For specific guidance related to your business needs, please consult with appropriate qualified professionals.