Boltz: A Revolutionary Model Family for Biomolecular Interaction Prediction
Introduction
In biomolecular research, accurately predicting the interactions between biomolecules has long been a goal for scientists, with crucial significance for drug development and for understanding biological processes. The emergence of the Boltz model family has brought new breakthroughs and hopes to this field. This article introduces the Boltz model family, including its features, installation, usage, and future development directions, to give you a deeper understanding of this cutting-edge model.
What is the Boltz Model Family?
Boltz is a family of models for biomolecular interaction prediction. Boltz-1 was the first fully open-source model to approach the accuracy of AlphaFold3. The team's latest work, Boltz-2, is a new biomolecular foundation model that goes beyond AlphaFold3 and Boltz-1 by jointly modeling complex structures and binding affinities, a critical component for accurate molecular design.
The Unique Advantages of Boltz-2
Boltz-2 is the first deep-learning model to approach the accuracy of physics-based free-energy perturbation (FEP) methods, while running 1000 times faster. This advantage makes accurate in-silico screening practical in the early stages of drug discovery.
Obtaining Model Resources and Licensing
Getting the Code and Weights
All the code and weights are provided under the MIT license, which means they can be freely used for both academic and commercial purposes. You can obtain the Boltz code in the following two ways:
- Installation via PyPI (Recommended): This method is simple and convenient. Enter the following command in the command line:
pip install boltz -U
- Direct Download from GitHub: If you want the daily-updated version, clone the code repository from GitHub and install it:
git clone https://github.com/jwohlwend/boltz.git
cd boltz; pip install -e .
Citation Instructions
If you use the Boltz code or models in your research, you need to cite the following two papers:
@article{passaro2025boltz2,
author = {Passaro, Saro and Corso, Gabriele and Wohlwend, Jeremy and Reveiz, Mateo and Thaler, Stephan and Somnath, Vignesh Ram and Getz, Noah and Portnoi, Tally and Roy, Julien and Stark, Hannes and Kwabi-Addo, David and Beaini, Dominique and Jaakkola, Tommi and Barzilay, Regina},
title = {Boltz-2: Towards Accurate and Efficient Binding Affinity Prediction},
year = {2025},
doi = {},
journal = {}
}
@article{wohlwend2024boltz1,
author = {Wohlwend, Jeremy and Corso, Gabriele and Passaro, Saro and Getz, Noah and Reveiz, Mateo and Leidal, Ken and Swiderski, Wojtek and Atkinson, Liam and Portnoi, Tally and Chinn, Itamar and Silterra, Jacob and Jaakkola, Tommi and Barzilay, Regina},
title = {Boltz-1: Democratizing Biomolecular Interaction Modeling},
year = {2024},
doi = {10.1101/2024.11.19.624167},
journal = {bioRxiv}
}
In addition, if you use the automatic multiple sequence alignment (MSA) generation function, you also need to cite the following paper:
@article{mirdita2022colabfold,
title={ColabFold: making protein folding accessible to all},
author={Mirdita, Milot and Sch{\"u}tze, Konstantin and Moriwaki, Yoshitaka and Heo, Lim and Ovchinnikov, Sergey and Steinegger, Martin},
journal={Nature methods},
year={2022},
}
Installing Boltz
Environment Recommendations
Before installing Boltz, it is recommended that you work in a fresh Python environment to avoid conflicts with other installed libraries.
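One way to get such a fresh environment is Python's standard-library `venv` module. The sketch below is illustrative only: the helper names `create_fresh_env` and `install_command` are hypothetical, and the install command is built rather than executed, so it can be inspected first.

```python
import sys
import venv
from pathlib import Path


def create_fresh_env(env_dir, with_pip=True):
    """Create an isolated virtual environment and return its Python executable."""
    env_path = Path(env_dir)
    # clear=True empties the directory first, so the environment starts clean.
    venv.create(env_path, with_pip=with_pip, clear=True)
    # Virtual environments use a different layout on Windows than on POSIX.
    if sys.platform == "win32":
        return env_path / "Scripts" / "python.exe"
    return env_path / "bin" / "python"


def install_command(python_exe):
    """Build (but do not run) the PyPI install command for that environment."""
    return [str(python_exe), "-m", "pip", "install", "boltz", "-U"]
```

Calling `create_fresh_env("boltz-env")` (the directory name is arbitrary) and passing the result to `install_command` yields a command list that can be handed to `subprocess.run` once you are happy with it.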
Installation Steps
Method 1: Installation Using PyPI
This is the recommended installation method. Open a command-line terminal and enter the following command:
pip install boltz -U
The -U option means that if Boltz is already installed, it will be updated to the latest version.
Method 2: Installation from GitHub
If you want to get the latest development version or participate in code development and contribution, you can clone the code repository from GitHub and install it. The specific steps are as follows:
- Clone the code repository: Enter the following command in the command line to clone the Boltz code repository to your local machine:
git clone https://github.com/jwohlwend/boltz.git
- Enter the code directory: Use the cd command to enter the cloned code directory:
cd boltz
- Install the code: In the code directory, use the following command to install:
pip install -e .
The -e option installs in editable mode, so changes you make to the code take effect without reinstalling.
Using Boltz for Inference
Basic Command
After installation, you can use the following command to run Boltz for inference:
boltz predict input_path --use_msa_server
Parameter Explanation
- input_path: This parameter should point to a YAML file or a directory containing YAML files. If it points to a directory, Boltz processes all the YAML files in it as a batch. These YAML files describe the biomolecules you want to model and the properties you want to predict (such as affinity).
- --use_msa_server: This option enables automatic multiple sequence alignment (MSA) generation through a remote MSA server, so you do not have to supply precomputed alignments yourself.
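If you prefer to drive the command from a script, the two parameters above can be wrapped as in the following sketch. The helper names are hypothetical, and the `.yaml`/`.yml` filter in `find_yaml_inputs` is only an assumption used to preview a directory; Boltz performs its own file discovery when given a directory.

```python
import subprocess
from pathlib import Path


def find_yaml_inputs(input_path):
    """Preview the inputs: the file itself, or every YAML file in a directory."""
    path = Path(input_path)
    if path.is_dir():
        # Assumption: treat .yaml and .yml files as inputs for this preview.
        return sorted(p for p in path.iterdir() if p.suffix in {".yaml", ".yml"})
    return [path]


def build_predict_command(input_path, use_msa_server=True):
    """Build the `boltz predict` command line described above."""
    cmd = ["boltz", "predict", str(input_path)]
    if use_msa_server:
        cmd.append("--use_msa_server")
    return cmd


def run_prediction(input_path, use_msa_server=True):
    """Run the command; requires Boltz to be installed and on PATH."""
    return subprocess.run(build_predict_command(input_path, use_msa_server), check=True)
```

Keeping command construction separate from execution makes it easy to print and check the exact command before starting a long run.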
Viewing Available Options
If you want to view all available options, you can use the following command:
boltz predict --help
Input Format Explanation
For more information about the input formats, refer to the Prediction Instructions. By default, the boltz command runs the latest version of the model.
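To give a rough idea of what such an input might look like, the sketch below writes a minimal YAML file describing one protein chain, one small-molecule ligand, and an affinity request. The field names (`version`, `sequences`, `protein`, `ligand`, `smiles`, `properties`, `affinity`, `binder`) are an assumption about the Boltz input schema and should be checked against the Prediction Instructions; the protein sequence is a made-up placeholder, and the helper name is hypothetical.

```python
from pathlib import Path

# Assumed schema; verify against the Prediction Instructions before use.
EXAMPLE_YAML = """\
version: 1
sequences:
  - protein:
      id: A
      sequence: MKTAYIAKQRQISFVKSHFSRQ  # placeholder sequence, not a real target
  - ligand:
      id: B
      smiles: "CC(=O)Oc1ccccc1C(=O)O"  # aspirin, used only as an example ligand
properties:
  - affinity:
      binder: B  # chain id of the ligand whose affinity is requested
"""


def write_example_input(path):
    """Write the example YAML to `path` and return the path."""
    path = Path(path)
    path.write_text(EXAMPLE_YAML, encoding="utf-8")
    return path
```

The resulting file could then be passed as input_path to boltz predict, with --use_msa_server if no precomputed MSA is provided.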
Evaluation and Training of Boltz
Evaluation
The updated evaluation code for Boltz-2 is coming soon. To encourage reproducibility and facilitate comparison with other models, the team will soon provide, in addition to the existing Boltz-1 evaluation pipeline, evaluation scripts and structural predictions for Boltz-2, Boltz-1, Chai-1, and AlphaFold3 on their test benchmark dataset, as well as their affinity predictions on the FEP+ benchmark, CASP16, and their MF-PCBA test set.
Training
Similarly, the updated training code for Boltz-2 is coming soon. If you are interested in retraining the model (currently Boltz-1, soon Boltz-2), refer to the Training Instructions.
Community Participation and Hardware Support
Community Participation
The Boltz team welcomes external contributions and is eager to interact with the community. You can join their [Slack channel](https://join.slack.com/t/boltz-community/shared_invite/zt-34qg8uink-V1LGdRRUf3avAUVaRvv93w), where you can discuss the model’s progress, share insights, and foster collaboration around Boltz-2 with team members and other researchers.
Hardware Support
It is worth mentioning that Boltz can also run on Tenstorrent hardware, thanks to a [fork](https://github.com/moritztng/tt-boltz) by Moritz Thüning.
FAQ
1. What fields is the Boltz model family suitable for?
The Boltz model family is mainly suited to biomolecular interaction prediction and has important applications in drug development and in understanding biological processes. For example, in the early stages of drug development, predicted binding affinities can be used to screen large numbers of molecules and prioritize promising drug candidates.
2. What should I pay attention to when installing Boltz?
It is recommended to install Boltz in a fresh Python environment to avoid conflicts with other installed libraries. You can install it using PyPI or download it directly from GitHub. The specific installation steps can be found in the installation section of this article.
3. How do I use Boltz for inference?
Use the boltz predict input_path --use_msa_server command for inference, where input_path points to a YAML file or a directory containing YAML files. To view all available options, use the boltz predict --help command. For more information on input formats, refer to the Prediction Instructions.
4. Can I use the Boltz code and models in my research?
Yes. The Boltz code and models are provided under the MIT license and can be used for academic and commercial purposes. However, when using them in your research, you need to cite the relevant papers as required.
5. When will the evaluation and training code for Boltz-2 be available?
No specific release date has been announced; both the evaluation and training code are described only as coming soon. You can follow the official channels or the Boltz Slack community for the latest news.
6. How can I participate in the Boltz community?
You can join the [Slack channel](https://join.slack.com/t/boltz-community/shared_invite/zt-34qg8uink-V1LGdRRUf3avAUVaRvv93w) to communicate with team members and other researchers, share insights, and foster collaboration.
7. On what hardware can Boltz run?
Boltz can run on common computing hardware. In addition, thanks to the [fork](https://github.com/moritztng/tt-boltz) by Moritz Thüning, it can also run on Tenstorrent hardware.
Conclusion
The Boltz model family has brought new vitality and breakthroughs to the field of biomolecular interaction prediction. As the first fully open-source model to approach the accuracy of AlphaFold3, Boltz-1 laid the foundation for subsequent research. Boltz-2 adds binding affinity prediction that approaches FEP accuracy at far greater speed, making accurate in-silico screening practical in the early stages of drug discovery. With freely available code and weights and active community participation, Boltz is expected to play a greater role in biomolecular research. Although the evaluation and training code for Boltz-2 are not yet available, there is good reason to be optimistic about its future. Whether you are a researcher or a practitioner in a related industry, you can follow the development of the Boltz model family and apply it in appropriate scenarios to contribute to biomolecular research and drug development.
As we look into the future, the potential of the Boltz model family is vast. In the realm of drug discovery, it could lead to the identification of novel drug candidates more efficiently. Traditional drug discovery methods are often time-consuming and costly. With the high-speed and accurate prediction capabilities of Boltz-2, researchers can quickly screen a large number of potential molecules, reducing the time and resources required for pre-clinical trials.
In the study of biological processes, understanding how biomolecules interact is key to uncovering the mysteries of life. Boltz can provide detailed insights into these interactions, helping scientists to better understand the mechanisms behind diseases and normal physiological functions. For example, by predicting the binding affinity between a protein and a ligand, we can gain a better understanding of how a drug might work at the molecular level.
Moreover, the open-source nature of Boltz encourages collaboration among researchers worldwide. Scientists from different institutions can contribute to the improvement of the model, share their data, and jointly develop new applications. This collaborative environment can accelerate the progress of biomolecular research and lead to more significant breakthroughs.
In terms of technological development, the fact that Boltz can run on Tenstorrent hardware shows its adaptability and potential for further optimization. As hardware technology continues to evolve, we can expect Boltz to become even more powerful and efficient.
In conclusion, the Boltz model family is a remarkable achievement in the field of biomolecular interaction prediction. It has the potential to transform drug discovery, biological research, and related fields. By staying updated on its development and actively participating in the community, researchers and practitioners can make the most of this powerful tool and contribute to the advancement of science.
To further explore the applications of Boltz, let’s consider some specific scenarios. In the field of personalized medicine, where treatments are tailored to an individual’s genetic makeup, Boltz can play a crucial role. By predicting the interactions between a patient’s unique set of biomolecules and potential drugs, doctors can select the most effective treatment options with fewer side effects.
In the development of biopharmaceuticals, such as monoclonal antibodies, accurately predicting the binding affinity between the antibody and its target antigen is essential. Boltz can provide valuable insights during the design and optimization process, leading to the development of more potent and specific biopharmaceuticals.
Another area where Boltz can have a significant impact is in the study of protein-protein interactions. These interactions are involved in many biological processes, including signal transduction, gene regulation, and immune response. By accurately predicting these interactions, we can gain a deeper understanding of the complex networks within cells and potentially develop new therapies to target these networks.
As we continue to expand our knowledge of biomolecules and their interactions, the need for accurate and efficient prediction models like Boltz will only increase. The research community should continue to support the development of Boltz, contribute to its improvement, and explore its full potential. With the combination of cutting-edge technology and collaborative efforts, we can look forward to a future where biomolecular research and drug development are more efficient and effective.
In addition, the data generated by Boltz can also be used for further analysis and validation. For example, the predicted binding affinities can be compared with experimental results to improve the accuracy of the model. This iterative process of prediction, experimentation, and improvement is essential for the continuous development of the field.
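As a minimal sketch of that comparison step, the functions below compute two common agreement measures between predicted and experimentally measured affinities: the Pearson correlation coefficient and the mean absolute error. The function names are hypothetical, and both lists are assumed to hold values in the same units and in the same compound order; nothing here is part of Boltz itself.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences of numbers."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def mean_absolute_error(predicted, measured):
    """Average absolute difference between predictions and measurements."""
    if len(predicted) != len(measured) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(p - m) for p, m in zip(predicted, measured)) / len(predicted)
```

Correlation shows whether the model ranks compounds in the right order, while the mean absolute error shows how far the predicted values are from the measurements; the two can disagree, so it is useful to report both.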
Furthermore, the availability of the Boltz code under the MIT license allows for customization and adaptation to specific research needs. Researchers can modify the code to incorporate new algorithms or data sources, further enhancing the capabilities of the model.
The future of biomolecular interaction prediction looks bright with the Boltz model family. It offers a powerful and accessible tool for researchers and practitioners in the field. By leveraging its capabilities, we can make significant progress in understanding the complex world of biomolecules and developing new therapies to improve human health.
In summary, the Boltz model family represents a significant step forward in biomolecular research. Its unique features, such as high accuracy, speed, and open-source nature, make it a valuable asset for the scientific community. As we continue to explore its applications and potential, we can expect to see more breakthroughs in drug discovery, biological understanding, and personalized medicine.
Let’s now take a closer look at the technical aspects of Boltz. The underlying algorithms of Boltz are designed to capture the complex relationships between biomolecules. These algorithms take into account various factors, such as the three-dimensional structure of the molecules, the chemical properties of their components, and the electrostatic interactions between them.
The use of deep learning in Boltz allows the model to learn from large amounts of data. By training on a diverse set of biomolecular structures and interaction data, Boltz can improve its prediction accuracy over time. The data used for training can come from various sources, including experimental databases, simulation results, and literature.
The development of Boltz also involves continuous optimization and improvement. The research team is constantly working on refining the algorithms, adding new features, and improving the computational efficiency of the model. This iterative process ensures that Boltz remains at the forefront of biomolecular interaction prediction.
In terms of the evaluation of Boltz, the upcoming evaluation code for Boltz-2 will provide a more comprehensive and accurate assessment of its performance. This will allow researchers to compare Boltz with other existing models and make informed decisions about its application in their research.
The training of Boltz is also an important aspect. The future availability of the training code for Boltz-2 will enable researchers to retrain the model on their own datasets or with specific parameters. This flexibility can lead to the development of customized models for different research needs.
In conclusion, the Boltz model family is a multi-faceted and powerful tool in biomolecular research. Its combination of advanced technology, open-source availability, and potential for customization makes it a valuable asset for the scientific community. As we continue to explore its capabilities and applications, we can look forward to significant advancements in the field of biomolecular interaction prediction.
Now, let’s consider the impact of Boltz on the scientific community from a broader perspective. The open-source nature of Boltz promotes knowledge sharing and collaboration. It allows researchers from different backgrounds and institutions to work together, sharing their expertise and resources. This collaborative environment can lead to the development of new research methods and the discovery of novel biological insights.
In addition, the availability of Boltz can also lower the barriers to entry for researchers in developing countries or smaller research institutions. These researchers may not have access to expensive proprietary software or high-end computing resources. With Boltz, they can conduct their own biomolecular interaction studies and contribute to the global scientific community.
The development of Boltz also has implications for the education of future scientists. By providing an open-source and accessible model, students can learn about biomolecular interaction prediction in a practical way. They can experiment with the code, modify the algorithms, and gain a deeper understanding of the underlying principles. This hands-on learning experience can better prepare them for future research careers.
Furthermore, the use of Boltz in research can lead to the development of new teaching materials and courses. Universities and research institutions can incorporate Boltz into their curricula, teaching students about the latest advancements in biomolecular research and the role of computational models in this field.
In summary, the Boltz model family not only has a significant impact on biomolecular research but also on the scientific community as a whole. Its open-source nature, potential for collaboration, and educational value make it a game-changer in the field. As we continue to support its development and application, we can expect to see a more inclusive and innovative scientific community.
Let’s also discuss the challenges and limitations of the Boltz model family. Although Boltz-2 shows great promise in terms of accuracy and speed, there are still some areas that need improvement. For example, the model’s performance may be affected by the quality and quantity of the input data. If the input data is incomplete or inaccurate, the prediction results may also be unreliable.
Another challenge is the interpretation of the prediction results. While Boltz can provide numerical values for binding affinities and other properties, understanding the biological significance of these values can be complex. Researchers need to combine the model’s predictions with experimental results and biological knowledge to draw meaningful conclusions.
In addition, the computational resources required for running Boltz can still be a limitation for some researchers. Although Boltz-2 is faster than FEP methods, it still needs a certain amount of computing power, especially when dealing with large-scale datasets or complex biomolecular systems.
To address these challenges, the research team behind Boltz is likely to continue to work on improving the model. They may develop better data pre-processing techniques to ensure the quality of the input data. They may also provide more tools and guidelines for interpreting the prediction results.
In terms of computational resources, the team may explore ways to optimize the model for different hardware platforms, making it more accessible to a wider range of researchers. They may also collaborate with cloud computing providers to offer more cost-effective and scalable computing solutions.
Despite these challenges, the potential benefits of the Boltz model family far outweigh the limitations. With continuous improvement and innovation, Boltz has the potential to become an indispensable tool in biomolecular research.
In conclusion, the Boltz model family is a revolutionary development in the field of biomolecular interaction prediction. It offers high-accuracy and high-speed predictions, open-source availability, and great potential for collaboration and application. Although there are some challenges and limitations, the future looks bright for Boltz. As the scientific community continues to support and develop this model family, we can expect to see significant advancements in drug discovery, biological understanding, and personalized medicine.
Let’s now explore the potential applications of Boltz in more specific industries. In the agricultural industry, understanding the interactions between plant biomolecules and pathogens or pesticides can be crucial for developing more effective pest control strategies and improving crop yields. Boltz can be used to predict these interactions, helping farmers and researchers to select the most appropriate pesticides and develop resistant crop varieties.
In the food industry, the study of biomolecular interactions can improve food quality and safety. For example, predicting the interactions between food components and additives can help to ensure the stability and nutritional value of food products. Boltz can provide valuable insights in this area, leading to the development of better-formulated food products.
In the environmental industry, understanding the interactions between pollutants and biomolecules in the environment can help to develop more effective pollution control strategies. Boltz can be used to predict the binding affinity between pollutants and environmental biomolecules, such as proteins in soil or water. This information can be used to design more efficient remediation methods.
In the cosmetic industry, predicting the interactions between cosmetic ingredients and skin biomolecules can help to develop more effective and safe cosmetic products. Boltz can provide insights into how different ingredients interact with the skin, allowing cosmetic companies to formulate products that are more suitable for different skin types.
As we can see, the potential applications of Boltz are not limited to the traditional fields of biomolecular research and drug development. Its ability to accurately predict biomolecular interactions makes it a valuable tool in a wide range of industries. By exploring these diverse applications, we can further expand the impact of the Boltz model family and contribute to the development of various sectors.
In addition, the development of Boltz also has implications for the ethical and legal aspects of biomolecular research. As the model becomes more widely used, issues such as data privacy, intellectual property rights, and the responsible use of predictive models need to be carefully considered.
For example, when using Boltz to analyze patient-specific biomolecular data in personalized medicine, ensuring the privacy and security of this data is of utmost importance. Researchers and healthcare providers need to comply with relevant data protection regulations to safeguard patient information.
In terms of intellectual property rights, the open-source nature of Boltz needs to be balanced with the protection of the original developers’ contributions. Clear guidelines and agreements should be established to ensure that the code and models are used and shared in a legal and ethical manner.
The responsible use of predictive models like Boltz also requires careful consideration. Researchers should be aware of the limitations of the model and not over-rely on its predictions. They should always validate the results with experimental data and use the model as a tool to support, rather than replace, scientific judgment.
In conclusion, while the Boltz model family offers great potential in many fields, we also need to address the associated ethical and legal issues. By establishing appropriate guidelines and regulations, we can ensure the responsible and beneficial use of this powerful tool.
As we continue to look ahead, the future of the Boltz model family is full of possibilities. With the continuous development of artificial intelligence and deep learning technologies, we can expect further improvements in the accuracy and efficiency of Boltz.
The integration of Boltz with other emerging technologies, such as blockchain for data security and quantum computing for enhanced computational power, may also open up new frontiers in biomolecular research.
In addition, the expansion of the Boltz community and the increasing number of users and contributors will drive the continuous innovation and improvement of the model. The exchange of ideas and the sharing of data among researchers will lead to the development of more advanced applications and the discovery of new biological phenomena.
In summary, the Boltz model family is a dynamic and evolving tool in the field of biomolecular interaction prediction. Its potential for future development is vast, and it will continue to play an important role in advancing our understanding of biomolecules and improving human health and well-being.