Experimental Design in Chemistry: A Review of Pitfalls (Guest Post)

This blog post is from James Cawse, Consultant and Principal at Cawse and Effect, LLC. Jim uses his unique blend of chemical knowledge, statistical skills, industrial process experience, and quality commitment to find solutions for his clients’ difficult experimental and process problems. He received his Ph.D. in Organic Chemistry from Stanford University. On top of all that, he’s a great guy! Visit his website (link above) to find out more about Jim, his background, and his company.

Introduction

Getting the best information from chemical experimentation using design of experiments (DOE) is a concept that has been around for decades, although it is still painfully underused in chemistry. In a recent article Leardi¹ pointed this out with an excellent tutorial on basic DOE for chemistry. The classic DOE text Statistics for Experimenters² also used many chemical illustrations of DOE methodology. In my consulting practice, however, I have encountered numerous situations where ‘vanilla’ DOE (whether from a book, software, or a Six Sigma course) struggles mightily because of the inherent complications of chemistry.

The basic rationale for using a statistically based DOE in any science is straightforward. The DOE method provides:

  • Points distributed in a rational fashion throughout “experimental space”.
  • Noise reduction by averaging and application of efficient statistical tools.
  • ‘Synergy’, typically the result of the interactions of two or more factors – easily determined in a DOE.
  • An equation (model) that can then be used to predict further results and optimize the system.

All of these are provided in a typical DOE, which generally starts simply with a factorial design.
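As a rough illustration of what "starts simply with a factorial design" means, the sketch below lists every run of a two-level full factorial in coded units (-1 for the low setting, +1 for the high setting). The factor names are hypothetical and chosen only for the example; DOE software would normally also randomize the run order.

```python
from itertools import product


def full_factorial(factor_names, levels=(-1, 1)):
    """Return every combination of coded levels as a list of dicts."""
    runs = []
    for combo in product(levels, repeat=len(factor_names)):
        runs.append(dict(zip(factor_names, combo)))
    return runs


# Hypothetical factors, used for illustration only.
design = full_factorial(["temperature", "concentration", "time"])

for run_number, run in enumerate(design, start=1):
    print(run_number, run)
```

Three factors at two levels give 2 × 2 × 2 = 8 runs, and each factor spends half of the runs at each level, which is what spreads the points evenly through the experimental space.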

DOE works so well in most scientific disciplines because Mother Nature is kind. In general:

  • Most experiments can be performed with small numbers of ‘well behaved’ factors, typically simple numeric or qualitative at 2-3 levels.
  • Interactions typically involve only 2 factors. Three-factor and higher-order interactions can usually be ignored.
  • The experimental space is relatively smooth; there are no cliffs (e.g. phase changes).

As a result, additive models are a good fit to the space and can be determined by straightforward regression.

$Y = B_0 + B_1 x_1 + B_2 x_2 + B_{12} x_1 x_2 + B_{11} x_1^2 + \dots$
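To make "determined by straightforward regression" concrete, here is a minimal sketch that fits the first four terms of the model above (intercept, two main effects, and the two-factor interaction) by ordinary least squares. The coefficient values are invented, and the response is generated from them without noise, so the fit simply recovers what was put in. The squared term is left out because a two-level design cannot estimate it.

```python
import numpy as np

# Coded settings of a two-factor, two-level factorial (four runs).
x1 = np.array([-1.0, 1.0, -1.0, 1.0])
x2 = np.array([-1.0, -1.0, 1.0, 1.0])

# Invented coefficients B0, B1, B2, B12, used only to create synthetic data.
true_b = np.array([50.0, 4.0, -2.0, 3.0])

# Model matrix: one column per term in the model.
X = np.column_stack([np.ones_like(x1), x1, x2, x1 * x2])

# Synthetic, noise-free response.
y = X @ true_b

# Ordinary least squares estimate of the coefficients.
fitted_b, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.round(fitted_b, 6))
```

With real data the response would carry noise, and replicate runs would be needed to judge which coefficients are distinguishable from it.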

In contrast, chemistry offers unique challenges to the team of experimenter and statistician. Chemistry is a science replete with nonlinearities, complex interactions, and nonquantitative factors and responses. Chemical experiments require more forethought and better planning than most DOEs. Chemistry-specific elements must be considered.

Mixtures

Above all, chemists make mixtures of ‘stuff’. These may be catalysts, drugs, personal care items, petrochemicals, or others. A beginner trying to apply DOE to a mixture system may think to start with a conventional cubic factorial design. It soon becomes clear, however, that this leads to an impossible situation when the (+1, +1, +1) corner requires 100% of A and B and C! The actual experimental space of a three-component mixture is a triangular simplex. This can be rotated into the plane to show a simplex design, and it can easily be extended to higher dimensions, such as a tetrahedron for four components.
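The simplex idea can be sketched with a simplex-lattice design, one of the standard mixture designs: every blend whose component proportions are multiples of 1/m and sum to 1. The function name and the choice of three components with m = 2 are assumptions made for illustration; this is not a reproduction of any particular software's output.

```python
from fractions import Fraction
from itertools import product


def simplex_lattice(n_components, m):
    """All blends with proportions in {0, 1/m, ..., 1} that sum to exactly 1."""
    points = []
    for counts in product(range(m + 1), repeat=n_components):
        if sum(counts) == m:
            points.append(tuple(Fraction(c, m) for c in counts))
    return points


# Three components (A, B, C) with proportions in steps of 1/2.
lattice = simplex_lattice(3, 2)

for point in lattice:
    print([float(p) for p in point])
```

For three components and m = 2 this gives the three pure components and the three 50:50 binary blends. Every point satisfies the constraint that a cubic factorial violates: the proportions always add up to 100%.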

It is rare that a real mixture experiment will actually use 100% of the components as points. A real experiment will be constrained by upper and lower bounds, or by proportionality requirements. The active ingredients may also be tiny amounts in a solvent. The response to a mixture may be a function of the amount used (fertilizers or insecticides, for example). And the conditions of the process in which the mixture is used may also be important, as in baking a cake or optimizing a pharmaceutical reaction. All of these will require special designs.

Fortunately, all of these simple and complex mixture designs have been extensively studied and are covered by Cornell³, Anderson et al.⁴, and Design-Expert® software.

Kinetics

The goal of a kinetics study is an equation which describes the progress of the reaction. The fundamental reality of chemical kinetics is

Rate = f(concentrations, temperature).

However, the form of the equation is highly dependent on the details of the reaction mechanism! The very simplest reaction has the first-order form

Rate = k*C1

which is easily treated by regression. The next most complex reaction has the form

Rate = k*C1*C2

in which the critical factors are multiplied – no longer the additive form of a typical linear model. The complexity continues to increase with multistep reactions.
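One common way to bring a multiplicative rate law back into additive form is to take logarithms: ln(Rate) = ln(k) + a·ln(C1) + b·ln(C2), which is linear in the unknowns ln(k), a, and b. The sketch below is an illustration under stated assumptions, not a method prescribed by this post: the rate constant and concentrations are invented, and the rates are generated without noise from Rate = k*C1*C2, so the fit should return reaction orders of 1 for both components.

```python
import numpy as np

# Invented rate constant and concentrations, used only to create synthetic data.
k_true = 0.25
c1 = np.array([0.1, 0.1, 0.4, 0.4, 0.2])
c2 = np.array([0.2, 0.8, 0.2, 0.8, 0.4])

# Noise-free rates from the second-order form Rate = k*C1*C2.
rate = k_true * c1 * c2

# Log-transformed model: ln(rate) = ln(k) + a*ln(C1) + b*ln(C2).
X = np.column_stack([np.ones_like(c1), np.log(c1), np.log(c2)])
coef, *_ = np.linalg.lstsq(X, np.log(rate), rcond=None)

k_fit = np.exp(coef[0])
order_c1, order_c2 = coef[1], coef[2]
print(k_fit, order_c1, order_c2)
```

The transformation only works while the rate law is a pure product of powers. The multistep mechanisms mentioned above generally produce sums of such terms, where nonlinear fitting is needed instead.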

Catalysis studies are chemical kinetics taken to the highest degree of complication! In industry, catalysts are often improved over years or decades. This process frequently results in increasingly complex catalyst formulations with components which interact in increasingly complex ways. A basic catalyst may have as many as five active co-catalysts. We now find multiple 2-factor interactions pointing to 3-factor interactions. As the catalyst is further refined, the Law of Diminishing Returns sets in. As you get closer to the theoretical limit, any improvement disappears in the noise!

Chemicals are not Numbers

As we look at the actual chemicals which may appear as factors in our experiments, we often find numbers appearing as part of their names. Often the only difference among these molecules is the length of the chain (C-12, 14, 16, 18) and it is tempting to incorporate this as numeric levels of the factor. Actually, this is a qualitative factor; calling it numeric invites serious error! The correct description, now available in Design-Expert, is ‘Discrete Numeric’.
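One generic way to keep chain length from being treated as a continuous number is indicator (dummy) coding, where each level other than a reference level gets its own 0/1 column. The sketch below shows that general statistical technique only; it is not a description of how Design-Expert handles such factors internally. The run list is hypothetical.

```python
def indicator_columns(values, levels):
    """One 0/1 column per level except the first, which is the reference level."""
    return [[1 if value == level else 0 for level in levels[1:]] for value in values]


chain_levels = ["C12", "C14", "C16", "C18"]

# Hypothetical sequence of runs, for illustration only.
runs = ["C12", "C16", "C18", "C14"]

coded = indicator_columns(runs, chain_levels)
for run, row in zip(runs, coded):
    print(run, row)
```

Coded this way, the model estimates a separate effect for each chain length instead of forcing a straight line or curve through 12, 14, 16, and 18.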

The real message, however, is that the experimenters must never take off their ‘chemist hat’ when putting on a ‘statistics hat’!


Reference Materials:

  1. Leardi, R., “Experimental design in chemistry: A tutorial.” Anal Chim Acta 2009, 652 (1-2), 161-72.
  2. Box, G. E. P.; Hunter, J. S.; Hunter, W. G., Statistics for Experimenters. 2nd ed.; Wiley-Interscience: Hoboken, NJ, 2005.
  3. Cornell, J. A., Experiments with Mixtures. 3rd ed.; John Wiley and Sons: New York, 2002.
  4. Anderson, M. J.; Whitcomb, P. J.; Bezener, M. A., Formulation Simplified. Routledge: New York, 2018.

Experimental Design in Chemistry

There is a continuous and growing demand for new and existing organic and inorganic chemical products. These include pharmaceuticals, agrochemicals, polymers and other functional materials, flavors and fragrances, food supplements, cosmetics, fuels, cleaning products, personal care products, and many more.

The identification, extraction, and synthesis of new products require chemists and biochemists to continually redesign experiments for specified outcomes. Experimental procedures are often required to carry out quality control and to identify why unwanted by-products are produced in existing manufacturing processes. Experiments take up valuable time and resources and seldom give instantaneous results.

Research and industrial chemists, biochemists, and chemical engineers are often called upon to devise experiments that optimize chemical processes for greater yield and product purity. This requires scientists and engineers to regularly design and redesign experiments. As the chemistry involved becomes more complex, so does the design of the experiment, and experimental design in chemistry is almost a science of its own.

Traditional Experiment Design

Traditionally, the process of experimental design started with a hypothesis or a defined research target. The chemist would then make a prediction and work out a way to test it. To design the experiment the chemist would:

  • Identify the variables
  • Arrange the conditions
  • Decide which variables could be manipulated
  • Carry out a risk assessment
  • Experiment and make observations
  • Change one or more variables and repeat

The team would then repeat the process of making observations and adjustments until the required solution had been achieved or ruled out. This process was often run on an OVAT (One Variable At a Time) basis, which is simple but laborious.
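The sketch below compares an OVAT search with a full two-level factorial on a made-up response surface. The response function and its coefficients are hypothetical and chosen so that the two factors interact strongly, which is the situation where changing one variable at a time can stop short of the best conditions.

```python
from itertools import product


def response(x1, x2):
    """Hypothetical yield surface with a strong x1*x2 interaction."""
    return 50 + 2 * x1 + 2 * x2 + 10 * x1 * x2


def ovat(start=(-1, -1)):
    """Flip one factor at a time, keeping a change only if the response improves."""
    best = list(start)
    for i in range(len(best)):
        trial = best.copy()
        trial[i] = -trial[i]
        if response(*trial) > response(*best):
            best = trial
    return tuple(best), response(*best)


def factorial():
    """Run every combination of low and high settings and pick the best."""
    runs = list(product((-1, 1), repeat=2))
    best = max(runs, key=lambda run: response(*run))
    return best, response(*best)


ovat_result = ovat()
factorial_result = factorial()
print("OVAT:     ", ovat_result)
print("Factorial:", factorial_result)
```

On this invented surface each single change from the starting point looks worse, so OVAT stays at the start (response 56), while the factorial finds the combination with both factors high (response 64).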

Design Of Experiments

Design of Experiments is a phrase introduced by the statistician and geneticist Ronald Fisher in 1935. Using Design of Experiments (DOE), several factors can be varied together and the relationships between them systematically investigated using statistical analysis. This process increases the manpower required, because designing an experiment now calls for at least one mathematician, and probably a statistician, in addition to a chemist.

With the enhanced team, specialist software, and computing power, the required parameters can be designed within an hour or two rather than over many days or weeks, which offsets the increased personnel costs.

Artificial Intelligence (AI) or machine learning can also help, as it can be used to predict molecule properties, molecular structure, and reaction outcomes as well as optimum experimental conditions. The use of AI will potentially add an IT specialist to the team but will reduce the time required to design the experiment. Using computers allows many iterations of any design to be carried out very quickly.

AI can also be used for “retrosynthesis”. This is the process of working backward from a target molecule to a commercially available material that can be used in the manufacture of the target molecule. Performing this manually would be time-consuming, but AI makes the process quick so that the design of the experiment can begin sooner.

Control and Monitoring

The ability to identify and isolate products is continually improving and developing. The use of chromatography, mass spectrometry, spectroscopy, microscopy, and many other sophisticated techniques means that more products, and more complicated ones, can be identified.

The process of identifying them can be increasingly complicated, but the development of instrument and control technology opens up the opportunity to design more sophisticated experiments. Digital data capture allows more data to be captured more accurately, so the scientist designing the experiment can acquire and analyze data more quickly and reach more precise conclusions.

New equipment and techniques also allow smaller and smaller samples to be analyzed, down to picoliters (trillionths of a liter). Improved manufacturing techniques allow very precisely built equipment that operates under harsh conditions, increasing the scope of experimental designs. Enhanced control and monitoring capability allows a greater range of experimental conditions to be considered and also allows closer control of conditions within the experiment.

Chemometrics

In 1971 Svante Wold coined the term chemometrics, which is the science of extracting information from chemical systems using applied mathematics and computer science. There are now at least three peer-reviewed journals devoted to chemometrics.

Chemometrics is a multidisciplinary approach to experimental design which allows the chemist to design experiments that get closer to the solution before even starting the experimental process.

Software

There are dozens of open source, free, and proprietary software packages listed on the internet for statistical analysis and chemical reaction prediction. Many of the software suppliers offer existing templates as well as the facility to custom design experiments to fit any particular set of experimental conditions or a specified outcome.

Conclusions

Experimental design in chemistry now opens new horizons for chemists to imagine and produce all kinds of new products. The use of modern techniques and technologies not only allows an experimental chemist to design more sophisticated experiments but also allows the chemist to start from a position where many of the unproductive options have been eliminated and a likely outcome has been predicted. DOE can also be used to optimize and improve existing chemical processes for the benefit of all concerned. The science of designing experiments will continue to develop and offer opportunities in the future.