The data collection plan ensures that data are collected in a way that later allows cause-effect relationships to be analyzed, which requires measuring the Ys and Xs at the same time. By deciding up front which “data we want to have”, the plan keeps us from simply using whatever data happen to be available. The collected data must be representative of the process as it currently is.

Typical questions to ask when creating a data collection plan:

  • What data should be collected?
  • How are they measured, and what is the operational definition of each measurement?
  • Which conditions should be recorded at the same time?
  • What is the “unit of analysis”? Why do the data on X and Y belong together?
  • Once I have these data, will I be able to prove or disprove the cause-effect relationships between the Xs and Ys from Define and Measure?
  • How will the data be displayed, and what patterns are expected?
  • What is the right time frame, scope, etc. to ensure data are representative?
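
Some of these questions can be checked mechanically once the plan is written down. The following is a minimal sketch in Python, not a prescribed method: the `PlanEntry` fields, the `check_plan` function, and the example variables (delivery time, warehouse, staffing level) are all hypothetical and chosen only for illustration. It flags a plan that lacks a Y or an X, that mixes units of analysis (so the X and Y data would not belong together), or that leaves an operational definition empty.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlanEntry:
    """One row of a data collection plan (hypothetical structure)."""
    name: str
    role: str                    # "Y" for an output, "X" for an input or condition
    operational_definition: str  # how exactly the value is measured
    unit_of_analysis: str        # what one record refers to, e.g. "order"


def check_plan(entries):
    """Return a list of problems found in the plan (empty if none)."""
    problems = []

    # Cause-effect analysis needs at least one Y and one X.
    roles = {e.role for e in entries}
    if "Y" not in roles:
        problems.append("no output (Y) is measured")
    if "X" not in roles:
        problems.append("no input (X) is measured")

    # Xs and Ys only belong together if they share a unit of analysis.
    units = {e.unit_of_analysis for e in entries}
    if len(units) > 1:
        problems.append(
            "Xs and Ys use different units of analysis: "
            + ", ".join(sorted(units))
        )

    # Every measurement needs an operational definition.
    for e in entries:
        if not e.operational_definition.strip():
            problems.append(f"{e.name} has no operational definition")

    return problems


# Illustrative plan; all names and definitions are made up.
plan = [
    PlanEntry("delivery_time_h", "Y",
              "hours from order confirmation to handover", "order"),
    PlanEntry("warehouse", "X",
              "site code of the shipping warehouse", "order"),
    PlanEntry("staffing_level", "X", "", "shift"),
]

for problem in check_plan(plan):
    print(problem)
```

In this made-up plan, the staffing level is recorded per shift while the other variables are recorded per order, so the check reports a unit-of-analysis mismatch along with the missing operational definition. Questions about representativeness, time frame, and expected patterns still need judgment and cannot be reduced to a check like this.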

Template of a Data Collection Plan