18 Data collection plan

Once you have specified your outcomes of interest and selected the indicators you will use to measure outcomes, you need to specify how the data will be collected.

18.1 Finding existing data sources or collecting new data

You must decide between using existing administrative data sources and collecting your own data through surveys or non-survey instruments. Administrative data are records collected by governments or civil society organisations, usually in the context of program administration. Collecting your own data allows you to create a custom data set covering exactly the outcomes you are interested in. The most common method is to administer surveys. In some cases, you can rely on non-survey instruments such as random spot checks (for example, to check the attendance and participation of program participants) or mystery clients.

18.1.1 Specifying the time and frequency of data collection

You must decide when and how often to collect data. Should you conduct a baseline survey before the program is rolled out? Should you collect data throughout program implementation? How long do you wait before doing an endline survey?

A baseline survey might be worthwhile when we want to show that our treatment and comparison groups look similar before the program (this is called a balance check), or when we want to analyse the data by subgroup or conduct regression analysis in the final analysis.
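A balance check of this kind can be sketched in a few lines of Python. The example below is only an illustration under stated assumptions: the baseline values are made up, the function name standardized_difference is hypothetical, and the 0.1 threshold is a rule of thumb assumed here rather than something prescribed by this module. It compares the two groups on one baseline characteristic by dividing the difference in means by the pooled standard deviation.

```python
from math import sqrt
from statistics import mean, stdev


def standardized_difference(treatment, comparison):
    """Difference in group means divided by the pooled standard deviation."""
    pooled_sd = sqrt((stdev(treatment) ** 2 + stdev(comparison) ** 2) / 2)
    return (mean(treatment) - mean(comparison)) / pooled_sd


# Made-up baseline household sizes, for illustration only.
baseline_treatment = [4, 5, 6, 5, 4, 6, 5, 6]
baseline_comparison = [5, 4, 6, 5, 5, 6, 4, 5]

diff = standardized_difference(baseline_treatment, baseline_comparison)

# Assumed rule-of-thumb threshold; choose one appropriate to your study.
THRESHOLD = 0.1
looks_balanced = abs(diff) < THRESHOLD

print(f"Standardized difference: {diff:.3f}")
print(f"Within threshold: {looks_balanced}")
```

In practice you would repeat this for every baseline characteristic of interest, and many evaluations also report a formal statistical test alongside the standardized difference.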

The more frequently we collect data, the more likely we are to capture intermediate outcomes. The downside is cost. In addition, too-frequent surveying can itself become an intervention and change outcomes. To find out more, see the influential article by Zwane et al. (2011), published in the Proceedings of the National Academy of Sciences of the United States of America and cited in full below.

The decision of when to collect endline data is usually a trade-off between capturing long-term outcomes and getting results in time for the findings to inform decision-making.

18.1.2 Ensuring identical data collection between the groups

The data must be collected in an identical manner in the treatment and comparison groups. Everything about the data collection (who does it, how it is done, how frequently, and with what tools) must be the same for both groups. This ensures that the data collection itself does not create any differences between the treatment and comparison groups.
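One way to monitor this during fieldwork is to summarise the data collection protocol actually used in each group and flag any field on which the groups differ. The sketch below is a minimal illustration, assuming a hypothetical log of survey visits; the field names (group, round, instrument, mode) and the sample records are invented for the example.

```python
from collections import defaultdict

# Hypothetical log of survey visits; records and field names are illustrative.
visits = [
    {"group": "treatment", "round": 1, "instrument": "household_v2", "mode": "in_person"},
    {"group": "treatment", "round": 2, "instrument": "household_v2", "mode": "in_person"},
    {"group": "comparison", "round": 1, "instrument": "household_v2", "mode": "in_person"},
    {"group": "comparison", "round": 2, "instrument": "household_v2", "mode": "phone"},
]

PROTOCOL_FIELDS = ("round", "instrument", "mode")


def protocol_by_group(visits):
    """Collect the set of values seen for each protocol field, per group."""
    summary = defaultdict(lambda: {field: set() for field in PROTOCOL_FIELDS})
    for visit in visits:
        for field in PROTOCOL_FIELDS:
            summary[visit["group"]][field].add(visit[field])
    return dict(summary)


def protocol_mismatches(visits):
    """Return the protocol fields whose values differ between groups."""
    groups = list(protocol_by_group(visits).values())
    return [
        field
        for field in PROTOCOL_FIELDS
        if any(group[field] != groups[0][field] for group in groups[1:])
    ]


print(protocol_mismatches(visits))
```

Here the comparison group was partly surveyed by phone while the treatment group was surveyed only in person, so the check flags the mode field. A real check would also cover who collected the data (for example, enumerator teams) and the timing of visits.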

The effects of being surveyed

Read the following abstract (Zwane et al., 2011) to learn about the effects of being surveyed on the later behaviour of household members. For more detail on the study, read the full article.

Abstract

Does completing a household survey change the later behaviour of those surveyed? In three field studies of health and two of micro-lending, we randomly assigned subjects to be surveyed about health and/or household finances and then measured subsequent use of a related product with data that does not rely on subjects’ self-reports. In the three health experiments, we find that being surveyed increases use of water treatment products and take-up of medical insurance. Frequent surveys on reported diarrhoea also led to biased estimates of the impact of improved source water quality. In two micro-lending studies, we do not find an effect of being surveyed on borrowing behaviour. The results suggest that limited attention could play an important but context-dependent role in consumer choice, with the implication that researchers should reconsider whether, how, and how much to survey their subjects.

Zwane, A. P., Zinman, J., Van Dusen, E., Pariente, W., Null, C., Miguel, E., Kremer, M., Karlan, D. S., Hornbeck, R., Giné, X., Duflo, E., Devoto, F., Crepon, B., & Banerjee, A. (2011). Being surveyed can change later behavior and related parameter estimates. Proceedings of the National Academy of Sciences, 108(5), 1821–1826.

Activity

What are the advantages and disadvantages of using survey data over administrative records?