Fast and powerful optimization of elastomer properties using DOE
Benjamin Cassidy, Ph.D.
May 24th, 2022
Recently, H&T helped a client optimize the properties of an elastomeric product. This product consists of a base polymer and three additives. Our aim was to develop a model that outputs the amount of each additive needed to achieve a desired compression modulus. Using advanced statistical methods, we achieved this with a minimum of experiments, saving both time and resources. As a direct result, we were able to reduce the manufacturing cost of the product by up to 14%. This general method can solve many market-relevant manufacturing or materials problems.
Design of Experiments
Of course, performing experiments costs money and time, so we want to design a useful model in a minimum number of experimental runs. To do this, we used design of experiments (DOE), and in particular, partial factorial experimental design. This is a method used to determine the response of an experiment (here, compression modulus) based on several inputs (additives A, B, and C). In simple terms, the advantage of using a partial factorial experimental design is that it allows us to determine whether any of the additives interact with each other, or have compounding effects when added together. DOE is routinely used in many industries such as semiconductor manufacturing, clinical trials, process chemistry, and more.
When exploring an experimental space, common practice is to change variables one at a time, while all others are held constant. This is a useful method and easy to understand, but it is inefficient and wastes valuable resources. Worst of all, it can hide important interactions between variables, leading to incorrect conclusions. A compelling example of the importance of determining how variables impact each other, in this case in drug discovery, can be found in Lendrem et al. in Drug Discovery Today.
In contrast, DOE allows for extracting the maximum amount of information from a minimum of experimental runs. We can observe interactions between variables by including runs in which we change multiple variables simultaneously and use robust statistical approaches to determine exactly which interactions are important. Fortunately, it turns out mathematically that we don’t need to run all possible combinations in order to observe these interactions – we only need to run a subset of these (hence the “partial” in partial factorial DOE). This allows us to make more observations and extract stronger conclusions using fewer experiments.
Our Design
For this example, I worked together with the client to select the above partial factorial design. In this graphic, the X, Y, and Z axes indicate the percent by mass of additives A, B, and C, respectively. The maximum values for this study were 20%, 2%, and 2%, respectively, chosen based on prior knowledge of the elastomer. We tested the points marked in red, and each point was tested twice. In short, we tested the elastomer without any additives (0, 0, 0), each vertex containing only one additive, and the vertex containing all three additives (20, 2, 2). In addition, we included two points not shown on the diagram: the center point (10, 1, 1) and the point (15, 1, 1). Adding the center point to the design allows us to detect curvature in the data and increases the power of our model. The properties of (15, 1, 1) were already known by the client, so we in effect acquired that data “for free”. We’re not limited to testing at the vertices for DOE, but testing at the vertices allows us to get more insight over the experimental range for a given number of runs.
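To make the run count concrete, here is a minimal Python sketch that lists the points described above and compares them with a full two-level factorial over the same limits. The variable names are illustrative and not part of the original analysis; the levels and the replicate count are taken from the description.

```python
from itertools import product

# Upper limits (percent by mass) for additives A, B, C, as stated in the text.
MAX_LEVELS = (20, 2, 2)

# Full two-level factorial: every combination of "absent" and "maximum".
full_factorial = list(product(*[(0, hi) for hi in MAX_LEVELS]))

# Vertices actually tested: no additives, each additive alone,
# and all three additives together.
tested_vertices = [
    (0, 0, 0),
    (20, 0, 0),
    (0, 2, 0),
    (0, 0, 2),
    (20, 2, 2),
]
center_point = tuple(hi / 2 for hi in MAX_LEVELS)  # midpoint of each range

REPLICATES = 2  # each point was tested twice
new_points = tested_vertices + [center_point]
run_list = [pt for pt in new_points for _ in range(REPLICATES)]

# Cube vertices that were not run (two additives present, the third absent).
skipped = [v for v in full_factorial if v not in tested_vertices]

print(f"full factorial vertices: {len(full_factorial)}")
print(f"vertices tested:         {len(tested_vertices)}")
print(f"new experimental runs:   {len(run_list)}")
print(f"skipped vertices:        {skipped}")
```

The previously known point (15, 1, 1) is left out of `run_list` because it did not require a new experiment.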
Results
After running all tests, we performed multiple linear regression to generate the following model:
In this model, Y is the modulus of compression (in MPa). What this equation says at a glance is that additive A makes the elastomer softer, whereas B and C make the elastomer firmer. Without any additives, the elastomer’s modulus of compression is 0.0535 MPa.
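In general form, a main-effects model of this kind can be written as follows. The β symbols are notation assumed here for illustration; only the intercept value and the signs of the effects come from the description above, with A, B, and C in percent by mass.

```latex
Y = \beta_0 + \beta_A A + \beta_B B + \beta_C C,
\qquad \beta_0 = 0.0535\ \text{MPa},
\quad \beta_A < 0,\quad \beta_B > 0,\quad \beta_C > 0
```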
However, this model leaves something to be desired – we haven’t considered interactions between additives! We can do a quick evaluation of whether including interactions is useful for our model by asking three questions:
1. Does including a particular interaction meaningfully decrease error/increase R²?
2. Does the interaction have a reasonable p-value?
3. Does the interaction make chemical sense?
R² is a common measure of how well the model matches the observed data. An R² closer to 1 means that more of the variation in the data is captured in the model, and that there will be less error. Next, a lower p-value generally means the interaction is more meaningful. I don’t typically enforce a hard threshold on p-values when I’m first exploring a model, but interactions with values higher than p = 0.2 require a strong chemical justification to keep in the model. Finally, interactions must be chemically plausible. Any useful model should aim to reflect reality.
These questions are not exhaustive, and only scratch the surface of ways to analyze data using DOE and regression analysis. That being said, they are powerful, easy to use, and importantly, easy to understand.
Above we can see the difference in standard error between our original model (“No interactions”) and a model containing all three two-variable interactions (AxB, AxC, and BxC). We can see that including all interactions improves the R² value and lowers the standard error of the model, satisfying question 1.
However, two of the interactions are disqualified by question 2. The interactions highlighted in red have p-values that are quite large, and are likely irrelevant to the real-world phenomena underlying this model. Including them in the final model would be a mistake.
Removing these two interactions leaves us with the final model below:
When comparing to our first model without interactions, this equation shows that B and C individually impact the modulus more than we initially thought, but that this effect is lessened by an interaction between the two.
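One way to read this is through the general form of a model with a single B×C interaction term. The β notation is again an assumption for illustration, and the negative sign of the interaction coefficient is inferred from the statement that the interaction lessens the individual effects.

```latex
\begin{aligned}
Y &= \beta_0 + \beta_A A + \beta_B B + \beta_C C + \beta_{BC}\, B C \\
\frac{\partial Y}{\partial B} &= \beta_B + \beta_{BC}\, C,
\qquad
\frac{\partial Y}{\partial C} = \beta_C + \beta_{BC}\, B
\end{aligned}
```

With β_BC < 0, the stiffening contributed by each unit of B shrinks as more C is present, and vice versa.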
Taking another look at the regression statistics, we see that this model has less error than the first model and does not include any obviously suspicious interactions.
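The comparison between the two models can be sketched in plain Python. Everything numeric below is hypothetical except the 0.0535 MPa intercept: the "true" coefficients and the scatter are invented so the example runs, and the known point (15, 1, 1) is assumed to contribute a single observation. The original analysis was done in Excel, so this is an illustration of the idea and not the author's workflow. P-values are omitted because they need a t-distribution routine from a statistics package.

```python
import math

def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_ols(rows, y):
    """Least squares via the normal equations; returns (beta, R^2, std error)."""
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    mean_y = sum(y) / len(y)
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1 - ss_res / ss_tot
    std_err = math.sqrt(ss_res / (len(y) - p))  # residual standard error
    return beta, r2, std_err

# Design points (A, B, C) in percent by mass: six replicated points plus (15, 1, 1).
replicated = [(0, 0, 0), (20, 0, 0), (0, 2, 0), (0, 0, 2), (20, 2, 2), (10, 1, 1)]
points = [pt for pt in replicated for _ in range(2)] + [(15, 1, 1)]

def true_modulus(a, b, c):
    # HYPOTHETICAL coefficients; only the 0.0535 MPa intercept is from the article.
    return 0.0535 - 0.002 * a + 0.02 * b + 0.02 * c - 0.01 * b * c

noise = [0.0004, -0.0004] * len(replicated) + [0.0002]  # fixed, made-up scatter
y = [true_modulus(*pt) + e for pt, e in zip(points, noise)]

main_rows = [[1.0, a, b, c] for a, b, c in points]          # no interactions
inter_rows = [[1.0, a, b, c, b * c] for a, b, c in points]  # adds B x C

beta_main, r2_main, se_main = fit_ols(main_rows, y)
beta_inter, r2_inter, se_inter = fit_ols(inter_rows, y)

print(f"no interactions: R^2 = {r2_main:.4f}, standard error = {se_main:.5f}")
print(f"with B x C:      R^2 = {r2_inter:.4f}, standard error = {se_inter:.5f}")
print(f"estimated B x C coefficient: {beta_inter[4]:.4f}")
```

Because the synthetic data were generated with a B×C term, the second fit should show a higher R² and a lower standard error, which mirrors the check in question 1.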
Including this interaction makes sense according to question 3 – B and C are chemically similar, so they may compete for the same spaces within the elastomer matrix. This would lead to an apparent interaction between them in the model. This logic leads to the idea that B and C are “self-competing”, in that they may (in reality) display a non-linear trend with respect to modulus of compression. Further exploration of this is outside the scope of this project, but is important to note for future work.
It is always important to have a hypothesis about what chemical phenomena affect your model when performing any DOE for materials development. Without an understanding of the underlying chemistry, it’s easy to succumb to overfitting, and make decisions based on experimental noise. This can lead to nasty and potentially expensive surprises when new formulations don’t fit within your model!
Takeaways
This example shows how powerful DOE can be when used correctly. We were able to obtain a useful model for the client with only 12 additional experimental runs. Additionally, since this model was designed with real-life chemistry in mind, the customer is now able to respond quickly to changes in required compression properties – all they need to do is use the model to design a material with the desired modulus, then verify it with a quick pilot study. This dramatically shortens the time-intensive materials design phase of any new project.
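As a rough sketch of how such a model might be used in the design direction, the snippet below solves for the amount of additive A that hits a target modulus at fixed levels of B and C. The coefficients are hypothetical placeholders (only the 0.0535 MPa intercept appears in this post), and the function names are invented for illustration.

```python
# Model form: Y = b0 + bA*A + bB*B + bC*C + bBC*B*C
# Y in MPa; A, B, C in percent by mass.
# HYPOTHETICAL coefficients except b0, which is the intercept quoted in the text.
COEF = {"b0": 0.0535, "bA": -0.002, "bB": 0.02, "bC": 0.02, "bBC": -0.01}
MAX_A = 20.0  # upper limit of A in the studied range

def predict_modulus(a, b, c, coef=COEF):
    """Predicted modulus of compression for a formulation (A, B, C)."""
    return (coef["b0"] + coef["bA"] * a + coef["bB"] * b
            + coef["bC"] * c + coef["bBC"] * b * c)

def additive_a_for_target(target, b, c, coef=COEF, max_a=MAX_A):
    """Percent of A that gives `target` modulus at fixed B and C.

    Raises ValueError if the answer falls outside the studied range,
    since the model should not be extrapolated.
    """
    offset = predict_modulus(0.0, b, c, coef)  # modulus with no A
    a = (target - offset) / coef["bA"]
    if not 0.0 <= a <= max_a:
        raise ValueError(f"required A = {a:.2f}% is outside 0-{max_a}%")
    return a

a_needed = additive_a_for_target(target=0.06, b=1.0, c=1.0)
print(f"A needed: {a_needed:.2f}%")
print(f"check:    {predict_modulus(a_needed, 1.0, 1.0):.4f} MPa")
```

Any formulation suggested this way would still need the quick pilot study mentioned above before being adopted.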
In addition, this analysis allows for an immediate reduction in manufacturing cost. Each additive has a different material cost, and all are less expensive than the base polymer. With this model in hand, it is possible to design different formulations with equivalent performance, but with dramatic differences in price. For example, moving from (0, 0, 0) to (20, 2, 2) retains a similar modulus of compression while lowering the material cost by 14%.
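The cost side of this comparison is simple mass-weighted arithmetic. The sketch below uses made-up prices per kilogram, chosen only so that every additive is cheaper than the base polymer as stated above. It is not meant to reproduce the 14% figure, which depends on the client's actual material costs.

```python
# HYPOTHETICAL prices per kg; the real prices are not given in this post.
PRICE_PER_KG = {"base": 10.0, "A": 4.0, "B": 6.0, "C": 8.0}

def formulation_cost(a, b, c, prices=PRICE_PER_KG):
    """Material cost per kg for additive levels a, b, c (percent by mass).

    The remainder of the formulation is assumed to be base polymer.
    """
    base = 100.0 - a - b - c
    if base < 0:
        raise ValueError("additive percentages exceed 100%")
    total = (base * prices["base"] + a * prices["A"]
             + b * prices["B"] + c * prices["C"])
    return total / 100.0

baseline = formulation_cost(0, 0, 0)   # unfilled elastomer
filled = formulation_cost(20, 2, 2)    # all three additives at maximum
saving_pct = 100.0 * (baseline - filled) / baseline

print(f"baseline cost: {baseline:.2f} per kg")
print(f"filled cost:   {filled:.2f} per kg")
print(f"saving:        {saving_pct:.1f}%")
```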
Despite how quickly we were able to generate actionable data with this experimental design, there are certainly improvements to the model that can be made depending on future requirements. Since this is a mixture, it would have been more accurate to use a mixture design DOE model, such as a simplex centroid design. Over the range studied, though, the standard multiple linear regression works well enough for our purposes. Also, in this example, we did not optimize for variables other than modulus of compression, which may end up being important to the final product. Finally, this example did not account for how material properties change with temperature of use. All measurements were taken near room temperature, but if the product is used in very hot or very cold conditions, it’s conceivable that different formulations may change properties at different rates.
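For readers unfamiliar with the simplex centroid design mentioned above, here is a minimal sketch of how its points are generated: one blend for every non-empty subset of components, with the subset's members in equal proportions. The proportions are fractions of an abstract three-component blend. Mapping them onto the constrained mass-percent ranges used in this study would be an extra step that is not shown, and this design was not used in the project.

```python
from fractions import Fraction
from itertools import combinations

def simplex_centroid(components):
    """Return the simplex centroid design as a list of {component: proportion}."""
    points = []
    for size in range(1, len(components) + 1):
        for subset in combinations(components, size):
            share = Fraction(1, size)  # equal split among the subset's members
            points.append({
                name: (share if name in subset else Fraction(0))
                for name in components
            })
    return points

design = simplex_centroid(["A", "B", "C"])
for point in design:
    print({name: str(value) for name, value in point.items()})
```

For three components this gives the three pure components, the three 50/50 binary blends, and the overall centroid.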
I personally enjoy doing this type of work, and I enjoy it even more when it results in improved properties or lower costs for H&T’s clients! If these services can help your product, don’t hesitate to contact us.
Further Reading
A thorough explanation of DOE from the National Institute of Standards and Technology can be found in the NIST Engineering Statistics Handbook, Design of Experiments section.
Regression and Other Stories, Gelman, Hill and Vehtari. An excellent introductory textbook for regression with a focus on applying it to real-world problems. Available as a free pdf.
Data to Decisions, Chris Mack. Dr. Mack is a giant in the field of lithography, and taught me statistics, including basic DOE, at UT-Austin. His online class can be found on YouTube, and class materials can be found on his personal website.
Lost in space: design of experiments and scientific exploration in a Hogarth Universe, Lendrem et al., Drug Discov. Today. 2015, 20(11):1365-71. doi: 10.1016/j.drudis.2015.09.015. This is the paper linked in the “Design of Experiments” section above. It illustrates how common multivariate problem spaces can possess hidden patterns that can lead researchers astray, colorfully termed “Hogarth” or “wicked” universes.
Results shown here were generated using the Real Statistics Resource Pack for Microsoft Excel. This is a useful tool for making data analysis in Microsoft Excel easy and clear.