My Process Data is a Mess! What do I do?

August 13, 2014

I was recently contacted by a potential client who was swamped with old process data. His process had been running more or less profitably for decades without real scientific attention. His management finally started pushing him to make it more profitable. His first item of concern was the state of the data.

Why is his data a mess, and why might yours be too?

  • First: this is a NORMAL situation. An experienced statistician once told me that he figured 70% of his time was spent on data cleanup!
  • A process that’s running “normally” in a plant gets little attention, and its record keeping tends to be casual.
  • Years-old data could be in notebooks rather than in digital form.
  • Almost every real plant has a “master chef” (often one on each shift) with individual opinions on the ideal control settings, who often makes unrecorded tweaks.
  • Over the years, “other” things will happen that are not recorded: “minor” changes in raw materials, “incidental” equipment upgrades, “inconsequential” changes in measurement procedures or specs, and so on.

Your goal is to have data that will allow you to run the process at its optimal efficiency: less waste, happier customers, more profit.

So how are you going to do this?

You can struggle with the existing process data. If you do, remember that the data is probably contaminated with “nuisance” variables; the factors that actually are important are probably correlated with each other; and the measurement equipment may not have been calibrated (or even working). Nevertheless, your management will probably be saying “Come on, you’ve got plenty of data! You should be able to figure this out!” If you must:

  • Don’t try to eat the whole elephant. Take it in bites.
  • For each bite of data you take, look it over for obvious blunders, data in the wrong format, and so on, and clean those up.
  • Find a recent period when the process was running “smoothly”. Get an idea of the distribution of the various process variables (“factors”) and measurements (“responses”). Look for simple correlations.
  • Do the same for a period of non-smooth operation. Compare.
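
To make those bites concrete, here is a minimal Python sketch of one pass: drop the obvious blunders, summarize each variable, and check simple correlations against the response for a smooth period and a rough one. Everything specific in it is an assumption for illustration: the column names (temp_c, feed_rate, yield_pct), the invented records, and the helper names clean, pearson, and summarize. It uses only the Python standard library.

```python
import math
import statistics

def clean(rows):
    """Keep only records whose fields all parse as finite numbers."""
    cleaned = []
    for row in rows:
        try:
            values = {name: float(raw) for name, raw in row.items()}
        except (TypeError, ValueError):
            continue  # wrong format or missing entry: an obvious blunder
        if all(math.isfinite(v) for v in values.values()):
            cleaned.append(values)
    return cleaned

def pearson(xs, ys):
    """Pearson correlation coefficient; NaN if either series is constant."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)

def summarize(rows, response):
    """Mean, standard deviation, and correlation with the response, per column."""
    resp = [r[response] for r in rows]
    summary = {}
    for name in rows[0]:
        vals = [r[name] for r in rows]
        summary[name] = {
            "mean": statistics.fmean(vals),
            "stdev": statistics.stdev(vals),
            "r_vs_response": pearson(vals, resp),
        }
    return summary

# Invented records standing in for two "bites" of plant data (hypothetical).
smooth_raw = [
    {"temp_c": "180.2", "feed_rate": "12.1", "yield_pct": "91.5"},
    {"temp_c": "181.0", "feed_rate": "12.0", "yield_pct": "92.1"},
    {"temp_c": "179.5", "feed_rate": "11.9", "yield_pct": "90.8"},
    {"temp_c": "n/a", "feed_rate": "12.2", "yield_pct": "91.0"},  # bad entry
    {"temp_c": "180.6", "feed_rate": "12.3", "yield_pct": "91.9"},
]
rough_raw = [
    {"temp_c": "176.0", "feed_rate": "12.8", "yield_pct": "86.2"},
    {"temp_c": "184.5", "feed_rate": "11.2", "yield_pct": "88.9"},
    {"temp_c": "", "feed_rate": "12.0", "yield_pct": "87.0"},  # missing entry
    {"temp_c": "172.3", "feed_rate": "13.1", "yield_pct": "84.0"},
    {"temp_c": "188.1", "feed_rate": "10.9", "yield_pct": "89.5"},
]

smooth = clean(smooth_raw)
rough = clean(rough_raw)

for label, rows in (("smooth", smooth), ("rough", rough)):
    print(label, "period,", len(rows), "usable records")
    for name, stats in summarize(rows, "yield_pct").items():
        print(f"  {name}: mean={stats['mean']:.2f} "
              f"stdev={stats['stdev']:.2f} r={stats['r_vs_response']:+.2f}")
```

Keep in mind that a correlation found this way is only a hint. In historical data the factors tend to move together, so a strong correlation does not tell you which factor is actually driving the response.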


Or you can do it right and use the “Six Sigma” approach:

  • Clean up your measurement systems first! Check the calibration of your process variable monitors and your response measurements. Find out whether you can actually measure your response precisely enough to tell where it falls within the specification limits.
  • Evaluate which of your (many) process factors MAY be having an effect on the result.
  • Plan and carry out an experimental strategy that will reduce that long list of candidates to the “critical few”, and determine the effects of those factors.
  • Develop a control strategy to optimize the process.
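
Two of these steps lend themselves to a small worked sketch in Python. The first function computes a precision-to-tolerance ratio (six measurement standard deviations divided by the width of the spec) from repeat readings of one sample. The other two build a two-level full factorial design and estimate each factor's main effect. The factor names, spec limits, readings, and the synthetic response formula are all invented for illustration. A full factorial is only practical for a handful of factors; with a long candidate list you would normally start with a fractional design instead.

```python
import itertools
import statistics

def precision_to_tolerance(repeat_readings, lsl, usl):
    """Six measurement standard deviations as a fraction of the spec width.

    Repeat readings of a single sample capture repeatability only, not
    differences between operators or instruments.
    """
    return 6 * statistics.stdev(repeat_readings) / (usl - lsl)

def full_factorial(factors):
    """Every combination of coded low (-1) and high (+1) levels."""
    return [
        dict(zip(factors, levels))
        for levels in itertools.product((-1, 1), repeat=len(factors))
    ]

def main_effects(runs, responses):
    """Mean response at the high level minus mean response at the low level."""
    effects = {}
    for name in runs[0]:
        high = [y for run, y in zip(runs, responses) if run[name] == 1]
        low = [y for run, y in zip(runs, responses) if run[name] == -1]
        effects[name] = statistics.fmean(high) - statistics.fmean(low)
    return effects

# Hypothetical repeat readings of one reference sample, spec limits 9.5 to 10.5.
readings = [10.02, 9.98, 10.01, 9.99, 10.00]
pt_ratio = precision_to_tolerance(readings, lsl=9.5, usl=10.5)
print(f"P/T ratio: {pt_ratio:.1%}")

# Hypothetical candidate factors; 2**3 = 8 runs.
design = full_factorial(["temp", "pressure", "catalyst"])

# Synthetic responses standing in for real measurements (assumption): temp and
# catalyst matter, pressure does not. In practice these come from the plant.
responses = [50 + 5 * run["temp"] + 2 * run["catalyst"] for run in design]

effects = main_effects(design, responses)
for name, effect in sorted(effects.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name}: effect = {effect:+.1f}")
```

A common rule of thumb treats a ratio under about 10% as good and over about 30% as a sign that the measurement system needs work before any experiment is worth running. In the factorial part, the factors with the largest effects are your candidates for the “critical few”.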

The choice is yours… but the second choice is far more likely to produce the results you want!

I’d be delighted to help you carry it out. I have over 30 years of experience working with teams at GE, Dow Corning, Valspar, and others, leading them through the process of planning effective experiments and understanding their data. In several cases I took on a project that had gotten nowhere for several years and led the team to a commercial product in six months or less. Their customers got a better product and their bottom line was markedly improved.

Download my “Design of Experiments Master Guide” and call me at 413-822-5006 for a free hour of consultation.
