Big Data Framework

With input from the Signal Program on Human Security and Technology at the Harvard Humanitarian Initiative.

Accountability & M&E

  • Who should be in charge of monitoring the use of the data, and whether the data is being used in the way for which it was collected?
  • Should your organisation implement a principled policy or guideline in relation to data use? Should there be an institutional review board? An ethics committee?
  • Will there be feedback to communities on how that data is being represented and used?

What next:

  • Outline a response to the questions above in relation to your work/project


  • Which of your organisation’s ethics, principles, and values will be important to this project?
  • How do you align your organisational policies to this project? Do you need to update them to ensure they apply usefully to your data-driven project? For example, how will you address privacy and consent?
  • Who could potentially be adversely affected by data use in your project, and how?
  • What would happen if the project needed to be stopped? How would you deal with the data and with communication with the data subjects? Who is responsible for ‘pressing stop’ on these projects?
  • Who in the project/organisation will have oversight of and accountability for use of this data?

What next:

  • Develop an ethics/responsibility plan for the project and a monitoring plan for ensuring compliance.
  • Develop a data security plan and "Red Buttons" checklist for dealing with breaches, including circumstances for suspending/ending the program.

Risk-Benefit Analysis

  • What are the risks versus benefits of engaging with the data sets? What is your theory of change?
  • Who are the people that should be consulted in the first, exploratory phase? Local NGOs? Civil Society Organisations?
  • Who are the best people to collect, manage & work with the data?
  • Who is responsible for monitoring responsible data issues?
  • Have you envisioned the life cycle of the data?
  • When and under what conditions should the program be stopped?

What next:

  • Develop a map of the benefits and risks associated with your project, taking into consideration the questions outlined above


  • Have you examined the data quality and viability, including:
    • accuracy?
    • potential bias?
    • specific interests on the side of the party that collected the data?
    • specific purposes for which the data was collected?
  • Have you examined the data sensitivity, including:
    • personally identifiable information?
    • community identifiable information?
    • demographically identifiable information?
  • Could you carry out the project if you received data that had already been de-identified?

What next:

  • Make a data use plan for your project, including what kind of data you will use and for what purpose

What Data

  • What data do you actually need to reach the objectives in your Theory of Change? How will you use that data to reach your objectives?
  • What level of granularity and what time span of data do you need?
  • Are you using secure methods of storing & transferring data?
  • What data protection regimes or national laws should you be considering? Who will you consult with to better understand the application of laws and systems?
  • What is the chain of custody from collection to you, and is it certifiable? When you can’t prove chain of custody, you need to show provenance.

What next:

  • Outline a response to the questions above in relation to your work/project

Identifying skill sets needed

  • Do you trust / understand the methodology of the data collection? (particularly if you are working with data collected by others)
  • Do you know who collected the data?
  • Do you know how they collected it and what assumptions they were making?
  • Are you an expert in the subject matter or do you need help with verification?
  • Are the subjects of the data research contributing to verification?
  • What process will you use to make decisions around de-identification/anonymization of data sets? Should you build in resources to engage with external experts?

What next:

  • Develop a plan of action for verifying and cleaning the data