Federated AI Can’t Be Ignored: How Qubigen Is Doing It Differently

At Qubigen, we do something fundamentally different: we build AI Engines for drug design using our secure Federated AI Drug Design platform (FedAIDD). The platform allows our clients to generate truly novel compounds without their data ever leaving their own servers. FedAIDD can also predict reaction pathways and optimize for desired ADME-T properties, ultimately reducing medicinal chemistry costs. Our mission is to unlock the full potential of pharmaceutical data without compromising on confidentiality, intellectual property, or regulatory compliance.


Organizations are increasingly exploring one or two off-the-shelf AI tools to help with their workflows. In contrast, Qubigen builds custom AI Engines: end-to-end intelligent systems that learn safely from both your in-house data and global federated knowledge, without intermixing the two.


1. Customized for You

Qubigen securely tailors and deploys AI Engines that draw on your internal datasets. Our patented AI methods can identify the right data, curate and cleanse it, and train stable, high-performance AI models [1-2], all without moving, copying, or even seeing the data.


2. Draw on Global Data Without Mixing It

Our patented FedAIDD platform can virtually access our curated, federated database without combining it with your data. The platform orchestrates AI training to extract underlying patterns from diverse, distributed sources, pulling in only the relevant insights to supplement your drug development capabilities. This enables a richer, more complete model than any single organization could build in isolation.


3. State-of-the-Art AI, Beyond CADD

While many companies still rely on outdated computer-aided drug design (CADD) techniques, our AI and software teams integrate the latest advances in deep learning, quantum chemistry, and generative modeling. These capabilities allow our AI Engines to predict reaction pathways, optimize for ADME-T properties, and generate truly novel compounds that move beyond iterative scaffold tweaking, toward genuine ‘cognitive leaps’ that interlace molecular aspects from diverse assay types.



Why Federated AI is an Inevitable Future, and the Moat for New Players


Federated AI isn’t optional; it’s the inevitable future of secure, cutting-edge drug design. With increasing pressure to protect proprietary R&D data and the enormous inefficiency of early-stage discovery, pharma now faces a clear choice: adopt IP-protecting acceleration or be left behind.

Recent strategic alliances between Eli Lilly and NVIDIA [6], and between Bristol Myers Squibb (BMS) and Takeda [7], underscore where the field is heading: toward distributed AI ecosystems where models learn securely across institutional boundaries.


Why Centralized Approaches Can’t Compete


Using traditional approaches, it might be possible to circumvent the need for Federated AI by buying datasets to train large language models (LLMs) centrally. But this approach falls short in five critical ways:


  1. Insecurity: Purchased or shared datasets expose proprietary molecular and assay data to third parties. Federated AI trains models across decentralized data without data transfer, keeping IP secure behind firewalls.

  2. Incompleteness: Centralized datasets are often fragmented, or standardized in ways that don’t adequately reflect specific research needs. Qubigen’s Federated AI system can access diverse, siloed data across organizations, delivering far greater specificity and coverage.

  3. Inflexibility: Drug design is iterative. Purchased datasets are static snapshots, while Qubigen’s federated models can be updated dynamically, supporting adaptive optimization and collaboration between partners.

  4. Non-Compliance: Centralized aggregation can violate confidentiality clauses or other contractual restrictions. Federated learning, as demonstrated in the landmark, non-commercial MELLODDY study [3], enables cross-institutional learning while keeping each organization’s proprietary data private, a model fully aligned with standard industry IP requirements.

  5. Expense: Buying multiple proprietary datasets and training LLMs can cost millions, with diminishing returns as data acquisition escalates. Qubigen’s Federated AI is up to 187x more cost-effective than traditional federated learning built on distributed systems, and it scales efficiently without the overhead of centralization [4]. This supports efficient commercialization of new technologies beyond drug design alone [5].
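
For readers new to the idea, the core mechanism behind points 1 to 5 can be illustrated with a toy example. The sketch below uses generic federated averaging (FedAvg), a textbook technique, and is not Qubigen’s patented method. The function names, the two "sites", and their data are all hypothetical. Each site fits a small linear model on its own data and shares only the resulting weights; the raw data points never leave the site.

```python
from typing import List, Tuple

Sample = Tuple[float, float]   # one (x, y) observation held privately by a site
Weights = List[float]          # [slope, intercept] of a toy linear model


def local_update(weights: Weights, data: List[Sample],
                 lr: float = 0.05, epochs: int = 20) -> Tuple[Weights, int]:
    """Train on one site's private data; return only weights and sample count."""
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        # Gradients of mean squared error for y ~ w * x + b
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return [w, b], n


def federated_average(updates: List[Tuple[Weights, int]]) -> Weights:
    """Combine site weights, weighting each site by its number of samples."""
    total = sum(n for _, n in updates)
    dim = len(updates[0][0])
    return [sum(wts[i] * n for wts, n in updates) / total for i in range(dim)]


def train(sites: List[List[Sample]], rounds: int = 50) -> Weights:
    """Coordinator loop: send weights out, collect updates, average them."""
    global_weights = [0.0, 0.0]
    for _ in range(rounds):
        updates = [local_update(global_weights, data) for data in sites]
        global_weights = federated_average(updates)
    return global_weights


# Hypothetical private datasets, both drawn from y = 2x + 1.
site_a = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
site_b = [(3.0, 7.0), (4.0, 9.0)]

weights = train([site_a, site_b])
print(weights)  # slope and intercept should approach 2 and 1
```

Only the `[slope, intercept]` pairs cross the boundary between a site and the coordinator. A production system would add protections this sketch omits, such as secure aggregation and handling of sites whose data follow different distributions.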


Alternatively, it might be possible to use existing public data to create a pharmaceutical ‘ChatGPT’. Public data is very messy, but it can be valuable when carefully curated, standardized, and validated. However, even with significant effort, public data represents only part of the information needed for modern drug design. The most recent in vitro studies, in vivo assays, and high-quality experimental results remain sequestered within the vaults of competing commercial entities, with no traditional mechanism for safe access. Crucial negative data (the failed experiments, abandoned compounds, and unreported optimization attempts) rarely appear in publications, patents, or public databases.


A Smarter, Safer Path Forward


Pharma’s future will not be built behind walls of proprietary data; it will be built on federated intelligence. Qubigen’s platform enables global learning without compromising confidentiality, accelerating hit-to-lead optimization to generate novel, optimized drug candidates that traditional AI cannot reach.


Yet Federated AI opens the door to a greater impact on pharma than novel drug designs alone. It potentially upends the entire industry, bringing with it a whole new paradigm for data utilization. With secure ecosystems for cross-institutional learning, data previously locked behind institutional firewalls can now be learned from, securely, without ever leaving those firewalls.


The industry’s next breakthroughs won’t come from training solely on public data sources, or from buying more and more data that is available to anyone with a large enough pocketbook. They are also unlikely to come from training on data confined to any single company. They will most likely come from supremely strong AI methods applied in a secure, Federated AI framework, such as Qubigen’s FedAIDD.


With our validated market and case studies, Qubigen is creating a new competitive moat and opening the door to Federated AI solutions for therapeutic challenges once thought impossible.


  1. Dakka, M.A., Nguyen, T.V., Hall, J.M.M., Diakiw, S.M., VerMilyea, M.D., Linke, R., Perugini, M., Perugini, D. Automated detection of poor-quality data: case studies in healthcare. Sci. Rep. 11, 18005 (2021).

  2. Nguyen, T.V., Diakiw, S.M., VerMilyea, M.D., Dinsmore, A.W., Perugini, M., Perugini, D., Hall, J.M.M. Efficient automated error detection in medical data using deep-learning and label-clustering. Sci. Rep. 13, 19587 (2023).

  3. Innovative Health Initiative. Can computers learn to think like chemists? MELLODDY have shown that ‘federated learning’ can be used to pool datasets from multiple pharma companies without revealing any valuable secrets. Innovative Health Initiative https://wayback.archive-it.org/12090/20240426112748/https://www.imi.europa.eu/news-events/newsroom/can-computers-learn-think-chemists (19 November 2020).

  4. Nguyen, T.V., Dakka, M.A., Diakiw, S.M., VerMilyea, M.D., Perugini, M., Hall, J.M.M., Perugini, D. A novel decentralized federated learning approach to train on globally distributed, poor quality, and protected private medical data. Sci. Rep. 12, 8888 (2022).

  5. Hall, J.M.M., Nguyen, T.V., Dinsmore, A.W., Perugini, D., Perugini, M., Fukunaga, N., Asada, Y., Schiewe, M., Lim, A.Y.X., Lee, C., Patel, N., Bhadarka, H., Chiang, J., Bose, D.P., Mankee-Sookram, S., Minto-Bain, C., Bilen, E., Diakiw, S.M. Use of Federated Learning on distributed data to develop an artificial intelligence for predicting usable blastocyst formation from pre-ICSI oocyte images. Reprod. BioMed. Online 49, 6, 104403 (2024).

  6. Constantino, A. K. Eli Lilly, Nvidia partner to build supercomputer, AI factory for drug discovery and development. CNBC https://www.cnbc.com/2025/10/28/eli-lilly-nvidia-supercomputer-ai-factory-drug-discovery.html (28 October 2025).

  7. Sneha, S. K. Bristol Myers, Takeda to pool data for AI-based drug discovery. Reuters https://www.reuters.com/business/healthcare-pharmaceuticals/bristol-myers-takeda-pool-data-ai-based-drug-discovery-2025-10-01/ (1 October 2025).


Qubigen: accelerate drug design without exposing secrets


Whether you're advancing active programs, reviving dormant data, or starting from scratch, Qubigen’s secure Federated AI platform and virtual screening capabilities can help you identify, optimize, and accelerate the path to promising lead drug candidates. Get in touch to explore how we can support your next development.


 
 

enquiries@qubigen.com

Bio21 Institute

30 Flemington Road

Parkville VIC 3052

Australia


© 2025 Qubigen. All rights reserved.
