Most machine learning models fail in formulation for one simple reason: the data they need doesn’t exist. You can’t generate shelf-life data across a ten-year horizon within a development timeline, or measure in real time what happens inside a droplet during spray drying.
At amofor, we approach the problem differently. Instead of training models on patterns we can’t observe, we simulate behavior based on physical laws. This blog explains why we’ve made physical modeling the anchor of our digital formulation workflow and what this means for the future.
Two Modeling Worlds: Machine Learning vs. Physical Models
Today, pharmaceutical development relies on two different modeling approaches:
- Machine learning models, built on pattern recognition and trained on large datasets
- Physics-based models, built on equations that describe molecular behavior
Machine learning can detect correlations across large variable spaces. It recognizes patterns, even those that humans may never detect. But these models don’t explain why things happen, and they fail when little data exists.
Physical models operate differently. They determine behavior based on physical laws, which describe how materials respond to changes in pressure, temperature, and volume. These models remain valid when data is sparse and can explain the underlying mechanisms within their defined boundaries.
This distinction matters in formulation, where key behaviors like crystallization and solvent evaporation occur at time or length scales that can’t be directly measured. A very simple and well-known model is the ideal gas law, a principle every STEM student learns in early semesters. It provides a way to predict how gases behave under different conditions, using first principles.
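To make the idea concrete, the ideal gas law pV = nRT can be evaluated directly, with no training data at all. The short sketch below is only an illustration. The function name and the input values are chosen for this example.

```python
# Ideal gas law: p * V = n * R * T
R = 8.314462618  # molar gas constant in J/(mol*K)


def ideal_gas_pressure(n_mol: float, temperature_k: float, volume_m3: float) -> float:
    """Return the pressure in Pa of an ideal gas."""
    if temperature_k <= 0 or volume_m3 <= 0:
        raise ValueError("temperature and volume must be positive")
    return n_mol * R * temperature_k / volume_m3


# One mole at 0 degrees Celsius in 22.414 litres: roughly atmospheric pressure.
pressure_pa = ideal_gas_pressure(1.0, 273.15, 0.022414)
print(f"{pressure_pa / 1000:.1f} kPa")
```

The same function answers questions for conditions that were never measured, for example a different temperature or volume, which is the property that makes law-based models useful when data is sparse.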
At amofor, we apply the same principle to more complex systems, such as amorphous solid dispersions (ASDs). We use PC-SAFT, a thermodynamic model that extends the logic of the ideal gas law to describe molecular interactions in multi-component mixtures. PC-SAFT represents molecules as chains of spherical segments, a simplified but effective abstraction. Think of it as a pile of spaghetti, where each strand represents one molecule’s chain of segments. The model calculates how these strands interact with each other in a formulation.
PC-SAFT doesn’t require a full chemical structure or massive datasets. Instead, we use five carefully selected solubility measurements, chosen to represent key solvent classes. This gives us a reliable fit for the API’s intermolecular interaction parameters. From there, the model predicts:
- Phase behavior
- Polymer miscibility
- Drug loading limits
- Crystallization onset and shelf life
- Solvent dynamics during processes like spray drying
And it does this even in formulation spaces where no prior data exists, where machine learning models typically reach their limits.
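To show how a solubility measurement connects to a thermodynamic model, here is a minimal sketch based on the simplified textbook solid-liquid equilibrium relation ln(x·γ) = −(Δh_fus / (R·T))·(1 − T/T_m). In a full workflow, the activity coefficient γ would come from a model such as PC-SAFT and would itself depend on composition. In this sketch it is a fixed input, and all numbers are hypothetical. Whether amofor’s implementation uses exactly this form is an assumption.

```python
import math

R = 8.314462618  # molar gas constant in J/(mol*K)


def solubility_mole_fraction(
    temperature_k: float,
    melting_temperature_k: float,
    heat_of_fusion_j_mol: float,
    activity_coefficient: float = 1.0,
) -> float:
    """Mole-fraction solubility from the simplified solid-liquid equilibrium relation.

    The heat-capacity correction is neglected, and the activity coefficient is
    treated as a constant (1.0 means an ideal solution).
    """
    exponent = -(heat_of_fusion_j_mol / (R * temperature_k)) * (
        1.0 - temperature_k / melting_temperature_k
    )
    return math.exp(exponent) / activity_coefficient


# Hypothetical API: melting point 430 K, heat of fusion 25 kJ/mol.
x_ideal = solubility_mole_fraction(298.15, 430.0, 25_000.0)
x_nonideal = solubility_mole_fraction(298.15, 430.0, 25_000.0, activity_coefficient=2.0)
print(f"ideal: {x_ideal:.4f}, with gamma = 2: {x_nonideal:.4f}")
```

Fitting interaction parameters then amounts to adjusting the model behind γ until the calculated solubilities match the measured ones.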
Shelf Life and Spray Drying: Where Physics Matters Most
Certain formulation decisions must be made long before large datasets are available. Two common examples are shelf life and spray drying, both of which illustrate the potential of physical modeling.
Shelf life
Predicting whether a formulation will crystallize in 5 days or 5 years is not an AI task; it’s a physical one. Using SOLCALC, we combine PC-SAFT with models for molecular mobility and nucleation kinetics to estimate crystallization onset under specific humidity and temperature conditions. “Our shelf-life predictions have a range of +/-20%, which is extremely accurate. And these predictions have been validated across more than 150 experimental data points, with shelf life ranging from one to three years,” says Christian Lübbert, co-founder of amofor. This gives formulators a clear risk profile, early enough to adapt strategy while it still matters.
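As a quick worked example of what a ±20% range means in practice, the sketch below turns a predicted shelf life into a lower and upper bound. It assumes the range is a relative band around the predicted value. The function name and the 24-month input are hypothetical.

```python
def shelf_life_band(
    predicted_months: float, relative_uncertainty: float = 0.20
) -> tuple[float, float]:
    """Return (lower, upper) bounds in months around a predicted shelf life."""
    if predicted_months <= 0:
        raise ValueError("predicted shelf life must be positive")
    if not 0 <= relative_uncertainty < 1:
        raise ValueError("relative uncertainty must be in [0, 1)")
    lower = predicted_months * (1.0 - relative_uncertainty)
    upper = predicted_months * (1.0 + relative_uncertainty)
    return lower, upper


# Hypothetical prediction of 24 months.
low, high = shelf_life_band(24.0)
print(f"{low:.1f} to {high:.1f} months")
```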
Spray drying
Inside a drying droplet, the ratio of solvents changes every millisecond. Concentrations shift, solvents evaporate and accumulate, and phase separation risks emerge. These transitions happen too quickly and at too small a scale to be measured in real time. With PC-SAFT, we simulate how evaporation unfolds, when separation might occur, and under which conditions sticking becomes likely. We’ve also used PC-SAFT to design scale-up strategies that maintain quality across production phases. Clients using this approach have reported fewer crystallization issues and significantly reduced clean-up times, translating into faster turnaround and lower cost of failure.
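For a feel of the time scales involved, the classical d²-law of droplet evaporation gives a rough estimate: the squared diameter shrinks linearly in time, d(t)² = d₀² − K·t. This is a deliberately simplified single-solvent stand-in, not the PC-SAFT-based simulation described above. It ignores dissolved solids and changing solvent ratios, and the evaporation constant K and droplet size below are hypothetical values.

```python
import math


def droplet_diameter(
    initial_diameter_m: float, evaporation_constant_m2_s: float, time_s: float
) -> float:
    """Droplet diameter in m after time_s according to the d^2-law."""
    d_squared = initial_diameter_m**2 - evaporation_constant_m2_s * time_s
    # Once d^2 reaches zero the droplet has fully evaporated.
    return math.sqrt(max(d_squared, 0.0))


def drying_time(initial_diameter_m: float, evaporation_constant_m2_s: float) -> float:
    """Time in s until the droplet diameter reaches zero."""
    return initial_diameter_m**2 / evaporation_constant_m2_s


# Hypothetical 50 micrometre droplet with K = 1e-8 m^2/s.
d0 = 50e-6
K = 1e-8
print(f"drying time: {drying_time(d0, K) * 1000:.0f} ms")
print(f"diameter after 100 ms: {droplet_diameter(d0, K, 0.1) * 1e6:.1f} micrometres")
```

Even this crude estimate puts the whole process at a fraction of a second, which is why the composition inside the droplet is so hard to observe experimentally.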
What This New Workflow Means for Formulators
Physical modeling changes how formulation scientists, pharmacists, and chemical engineers work. Instead of navigating by trial-and-error, they operate through hypothesis-driven design, supported by predictive, mechanistic modeling.
The scientists we train aren’t just software users; they are formulation thinkers. They learn to work with digital twins of their experiments, understanding the connections between molecular mobility, solvent evaporation, crystallization risk, and shelf life.
They use these connections to plan better, test smarter, and decide faster.
In our view, this dual-track approach, with modeling and lab work running in parallel, defines what’s next for pharmaceutical development: digital biology, grounded in powerful computations and on-the-spot experiments.
Why Physics and AI Belong Together
We’re often asked how physical modeling compares to AI. Our view is that they’re complementary.
AI models can help when enough data is available, for example, identifying patterns in excipient selection across many formulations. But when data is sparse, or when long-term predictions are needed, only physical models provide the necessary reliability.
That’s why we’ve built our platform and services around both capabilities. SOLCALC draws from structured data and thermodynamic models to create a foundation of physical insight. In select cases, we layer in physics-powered AI to optimize formulations at the molecular level and extend those insights further.
We start small so that our clients can innovate bigger, faster, and with more confidence.
Contact us to schedule a session and turn modeling into a decision-making tool.