Landscapes with moderate degrees of ruggedness share a striking feature: it is the highest peaks that can be scaled from the greatest number of initial positions!
(Stuart Kauffman, At Home in the Universe)
In the previous step, we quickly found that even with a relatively modest selection of factors and levels, the combinatorial explosion can be daunting. Hence the need to find a right-sized experimental space. This space should have:
- Not too many factors
- Not too many levels for each factor
- No one factor having a very large number of levels
Looking for Interactions
This thinking is based on the idea that the discovery, or “hit”, will come as an interaction of no more than three factors, and that indications of the existence of that hit will come as two-factor interactions. Furthermore, the “combinatorial explosion” that occurs when four- or five-way interactions are considered makes them unattainable in practice anyway.
I find that it is useful to think of an HT experimental program as a search on a rugged landscape, with some interesting high peaks and many boring flat areas. Fortunately, experience has shown that there are far more peaks in many systems than we expected! Further, mathematical study (see quote above) has shown that it is frequently easier to approach these peaks than we expect.
We do not have to land an experimental run squarely on a peak! A foothill, ridge, cwm, talus pile, or even a cairn will do. So a systematic but relatively sparse search is quite likely to provide more than a few good leads.
Let’s look at a large (but real) example:
| Formulation Factors | Type | No. of Levels |
|---|---|---|
| Amount of Cocatalyst | Quantitative | 3 |
| Amount of Ligand | Quantitative | 3 |
| Amount of Anion | Quantitative | 3 |

Total number of combinations: 2,916,000
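The total is simply the product of the level counts of all factors. As a quick check, here is a minimal sketch that assumes the level counts quoted in the discussion that follows (three qualitative factors with 20, 20, and 10 levels, and six quantitative factors at 3 levels each); the variable names are illustrative only.

```python
from math import prod

# Level counts as described in the text: three qualitative factors
# (20, 20 and 10 levels) and six quantitative factors at 3 levels each.
qualitative_levels = [20, 20, 10]
quantitative_levels = [3] * 6

# A full factorial needs one run per combination of levels.
total_combinations = prod(qualitative_levels) * prod(quantitative_levels)
print(f"{total_combinations:,}")  # 2,916,000
```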
Latin Square Designs
However, we can reformulate this problem into something much more reasonable. First, there are only three multilevel qualitative factors with 20, 20, and 10 levels. A reasonable approach to this part of the problem is a “Latin Square” type design. The figure shows how a 4x4x4 system can be sampled for all 2-ways with only 4 x 4 = 16 runs. In the example, the smallest design that will sample all 2-ways needs 20 x 20 = 400 runs.
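One way to build such a design can be sketched as follows. This is an illustrative cyclic construction, not necessarily the one used in the original figure: the first two factors are fully crossed, and the level of the third is assigned as `(a + b) mod n_c`. It assumes the third factor has no more levels than either of the other two; the function names are hypothetical.

```python
from itertools import combinations

def latin_square_design(n_a, n_b, n_c):
    """Cross factors A and B fully; assign C cyclically as (a + b) mod n_c."""
    return [(a, b, (a + b) % n_c) for a in range(n_a) for b in range(n_b)]

def covers_all_pairs(runs, sizes):
    """Check that every level pair of every two factors appears in some run."""
    for i, j in combinations(range(len(sizes)), 2):
        seen = {(run[i], run[j]) for run in runs}
        if len(seen) != sizes[i] * sizes[j]:
            return False
    return True

small = latin_square_design(4, 4, 4)      # the 4x4x4 illustration
large = latin_square_design(20, 20, 10)   # the 20x20x10 example

print(len(small), covers_all_pairs(small, (4, 4, 4)))     # 16 True
print(len(large), covers_all_pairs(large, (20, 20, 10)))  # 400 True
```

Each run is a tuple of zero-based level indices, so mapping the indices back to actual catalysts, ligands, or solvents is a simple lookup.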
Second, there are six quantitative factors (assuming we are not treating the “amount” factors as a mixture design). These can be combined in a modern “definitive screening design” (JMP) in as few as 13 runs, or in a highly fractionated factorial plus a center point in 9 runs. Thus a very good screen of this system can be performed with anywhere from 9 x 400 = 3,600 runs to 13 x 400 = 5,200 runs. These are still large numbers, but easily within the range of modern high-throughput equipment.
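To make the 9-run option concrete, here is a minimal sketch of a 2^(6-3) fractional factorial with an added center point, using coded levels -1, 0, and +1 for the three levels of each quantitative factor. The generators D = AB, E = AC, F = BC are a common textbook choice and are an assumption here; the text does not say which fraction was used. The 13-run definitive screening design is not constructed, only its run count from the text is used.

```python
from itertools import product

def fractional_factorial_6_in_8():
    """2^(6-3) design: A, B, C form a full factorial; D=AB, E=AC, F=BC."""
    runs = []
    for a, b, c in product((-1, 1), repeat=3):
        runs.append((a, b, c, a * b, a * c, b * c))
    return runs

# Eight factorial runs plus one center point (all factors at mid level).
quant_runs = fractional_factorial_6_in_8() + [(0,) * 6]
latin_square_runs = 20 * 20  # qualitative design from the previous step

print(len(quant_runs))                      # 9
print(len(quant_runs) * latin_square_runs)  # 3600
print(13 * latin_square_runs)               # 5200 (definitive screening design)
```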
If this is still too large a problem, there are two major approaches:
- Reduce the number of levels of the qualitative factors. In the example, reducing the qualitative design to 10x10x10 requires only 100 runs.
- Set all the quantitative factors to their centerpoints or “optima” and run just the Latin square design.
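Either approach can reduce the total effort to fewer than 1,000 runs: 100 x 9 = 900 runs for the reduced qualitative design crossed with the 9-run fractional factorial, or 400 runs for the Latin square alone.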
Techniques like these are used in an ongoing dialog between the “enthusiasts” (mostly the chemists) and the “realists” (those who actually do the work, or pay for it). Generally a satisfactory equilibrium can be reached.
You now have a DOE strategy that
- Samples the experimental space in a systematic fashion
- Has a workable number of runs
In the next section, we’ll look at some of the issues found when the design meets the robot.
If you want to jump right to the whole strategy, contact me at +1 413 822 5006 or firstname.lastname@example.org!