Open and the Better Deal for Data

Jim Fruchterman | July 1, 2025

Note: In the following essay we reference the eight Better Deal for Data™ commitments originally proposed in our April 2024 white paper. In December 2025, we released the BD4D™ Commitments as a set of seven refined commitments.

Introduction

An important objective of the Better Deal for Data is to better steward data which should not be made open. Many of the people and communities served by nonprofits are concerned about how their data gets used: they don’t want their data to be made openly available, or sold to the highest bidder. The Better Deal for Data (BD4D) is intended as a positive alternative.

Using BD4D does not cause the data of individuals (and other confidential data) to become open; instead, it commits organizations adopting the Better Deal to keep that data confidential and respect applicable data privacy laws. Any data sharing is also subject to these requirements, and must be intended to benefit the people or communities whose data is being shared.

We at Tech Matters are not against open data as a concept; we think it’s great for many types of data which are not about people, or which represent aggregated data from large numbers of people. Examples of open data include weather station data, city or state population data, and Wikipedia. Open means open: anyone can use the data for any purpose, and in the open case, the use restrictions in the BD4D commitments would cease to apply. As a result, making personal, sensitive, or confidential data open is generally not permitted under the BD4D. This leads us to the Major Question in this paper: When is open data consistent with the Better Deal for Data?

We will first talk about agency: individuals and communities have the power to decide to make their data open, assuming this is an intentional and well-informed decision. Next, we’ll explore the concept of downstream data uses, such as when the data is aggregated and the data of individuals or users cannot be accessed. The related concept of data use in research is also covered. Finally, we remind the reader of the truism that open means open, and is not a decision to be made lightly when it comes to data.

This working paper is one of a series we are publishing on major questions we are being asked about implementing the Better Deal for Data. It is not intended to be the final word on where the Better Deal for Data standard ends up on these questions. Instead, these Major Questions papers are encouraging us to explore each specific question, and think through our initial answers. We are hoping to engage in more extensive debates over any controversial points.

Agency

The Better Deal for Data is about shifting power away from data collecting organizations and the tech industry, and towards individuals and communities. Part of this power is having decision-making authority over Your Data (in the BD4D Commitments, “Your Data” means the data of data subjects: the people, the farms, the nonprofit organizations, and the communities described by the data). This includes the power to decide to allow Your Data to be made openly available, even if most of the time that power is not exercised. For example, farmers might decide to make the locations of illegal mining, dumping, or toxic discharges into waterways available as open data, even if those locations correspond to land they own or occupy.

A data collection project that would generally keep its data confidential may offer the option for a data subject to choose to make their data open. The data subject (a farmer, for example) might choose to share the GPS location where some data was collected, data which might often be correlated with the location of that data subject. Of course, this would have to be opt-in (not opt-out) and there would have to be a clear explanation about the risks of such open disclosure. Under the Better Deal for Data, if there is not a presumption that data will be kept private, then it’s not a fit for the Better Deal. If all the data in a project was always going to be open, and the choice of open was appropriate, then the data collectors should not use the BD4D, and should instead be clear that participation in the project means open data, all the time.
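The opt-in rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of the BD4D Commitments: the Observation record, its field names, and the releasable_as_open function are all hypothetical. The point it shows is that confidentiality is the default, and a record becomes eligible for open release only when the data subject has both opted in and been shown an explanation of the risks.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Observation:
    """One collected record. Every field name here is hypothetical."""
    subject_id: str
    description: str
    gps: Tuple[float, float]
    open_opt_in: bool = False         # confidential unless the subject opts in
    risks_acknowledged: bool = False  # subject was shown a risk explanation


def releasable_as_open(records: List[Observation]) -> List[Observation]:
    """Keep only records whose subject opted in after reviewing the risks."""
    return [r for r in records if r.open_opt_in and r.risks_acknowledged]


records = [
    Observation("farmer-1", "illegal dumping site", (1.25, 36.80),
                open_opt_in=True, risks_acknowledged=True),
    Observation("farmer-2", "soil sample", (1.30, 36.90)),
    Observation("farmer-3", "toxic discharge", (1.28, 36.70), open_opt_in=True),
]

open_records = releasable_as_open(records)
print([r.subject_id for r in open_records])
```

In this sketch, a record with no recorded choice stays confidential, and so does a record where the subject opted in without acknowledging the risks.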

Processed Data

We expect that much of the detailed data collected under the Better Deal for Data will not be shared in its original form, which often includes personal, sensitive or confidential data. BD4D puts requirements on the sharing of such data, which include keeping it secure, complying with applicable privacy laws, and restricting intent (the primary use must be to benefit the data subject, the community, humanity, and the planet, not for private gain).

However, when has the data been processed enough that it can no longer be identified with the data subject? When is “Your Data” no longer “Your Data”? This question is a big one, and is relevant for any decision to make processed versions of the original data open.

For example, most nonprofits need to raise money constantly to deliver their programs to the communities they serve. Nonprofits regularly report aggregate impact numbers publicly about their work as a way to justify it to the community, donors, the press, policymakers, and the rest of society.

Publishing a number openly like “our nonprofit helped 637 families with our food program last month” is clearly in line with the BD4D Commitments, since it does not contain the data of an identifiable person or family, and the intent is clearly social benefit, not private gain. We discuss questions about nonprofits and funding at greater length in our Major Questions paper: Nonprofits and Funding.

A similar argument goes for other uses of data processed to the point where it can be made open in some form, and the original, confidential data cannot be extracted in practice. For example, a nonprofit might learn that a particular approach is effective in city A, but not in city B. Publishing city-wide effectiveness numbers for programs is clearly BD4D-aligned. The same ideas are applicable when evaluating the use of data collected under the BD4D Commitments to train an open source AI model. An example might be building a simple chatbot in an underserved language, using chat data that was collected and then anonymized before being used to train the chatbot model. This question was important enough to make up a considerable portion of our Major Questions paper: AI and the Better Deal for Data.

The key elements of making a decision about downstream use under the Better Deal are that sensitive information (especially that of individuals) is protected from disclosure (BD4D Commitment Six), and that the intended use is for social benefit and not private gain (Commitment One).
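As a rough illustration of the aggregate reporting discussed in this section, the Python sketch below turns confidential per-family records into city-level counts of the kind a nonprofit might publish. Everything in it is an assumption made for the example: the row fields, the city_counts_for_release function, and especially the MIN_GROUP_SIZE threshold, which does not come from the BD4D Commitments. Withholding very small groups is one common precaution against a published count pointing back to identifiable families; it is not presented here as a BD4D requirement.

```python
# Hypothetical threshold: groups smaller than this are withheld.
# This value is an assumption for illustration, not a BD4D rule.
MIN_GROUP_SIZE = 10


def city_counts_for_release(rows, min_group_size=MIN_GROUP_SIZE):
    """Count distinct families helped per city, withholding small groups."""
    families_by_city = {}
    for row in rows:
        # A set ensures a family served several times is counted once.
        families_by_city.setdefault(row["city"], set()).add(row["family_id"])

    released = {}
    for city, families in families_by_city.items():
        if len(families) >= min_group_size:
            released[city] = len(families)
    return released


# Made-up sample data: 12 families in city A, 3 families in city B.
rows = [{"city": "A", "family_id": f"a{i}"} for i in range(12)]
rows += [{"city": "B", "family_id": f"b{i}"} for i in range(3)]
rows.append({"city": "A", "family_id": "a0"})  # repeat visit, same family

summary = city_counts_for_release(rows)
print(summary)
```

Only the city-level totals leave the function; the family identifiers stay in the confidential dataset. Whether a given threshold is protective enough depends on the project and the community, so it should be treated as a decision to make deliberately rather than a default to copy.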

Research

The Better Deal for Data is centered on using data for public rather than private benefit. Scientific research is a great example of this, and scientists are generally far ahead of the nonprofit sector in terms of responsible data practices. Human subjects research is generally overseen by an independent Institutional Review Board (IRB), and subject to informed consent requirements. Large repositories of data are made available for responsible research under tight controls to protect sensitive data. Although the nonprofit sector generally doesn’t have access to these resources, we believe that there are still ethical (and practical) options available for data governance, and have proposed the Better Deal for Data to help fill that gap.

We should note that the Better Deal can be thought of as a set of requirements that creates a floor: there are often additional steps needed (as noted above for human subjects research) that are even more protective of the interests of data subjects than BD4D. The Better Deal for Data is not intended as a substitute for these more extensive requirements.

Some research papers are published in proprietary journals that do not share the papers freely, but a sizable fraction of current research is published under “Open Access” policies, where the resulting papers are available for free to the general public. Research based on data collected under the BD4D Commitments needs to be made available to the people and communities whose data was collected, and the most straightforward way to do this is to publish the research as Open Access. However, this is not the same thing at all as making the original datasets open.

One final point is that in order to publish, researchers are often required to make their research data available for transparency and reproducibility, thus disclosing processed data. We believe that the BD4D requirements should not be in conflict with these data requirements in science. After all, when medical research is published, the underlying data must be handled in a way that protects the privacy of individual patients.

Open Means Open

Before we conclude, we need to underscore that while open data is generally in society’s interest, once something is made openly available it is not possible to restrict all of the possible uses. So, even if the primary intent is to do social good, there needs to be an awareness of possible misuse when considering an open release of processed data. If there is an anticipated substantial threat to the interests of data subjects from the release of downstream (processed) data, then it should not be done under the BD4D without a strong consensus in the community that the benefit to them outweighs the potential harms.

An example of this is a data collection project where even the release of processed data will likely cause harm to community members or their interests, violating the promise of BD4D Commitment One. Even if the project is conducted in the interest of science, that does not mean it can claim to be conducted under the Better Deal for Data. Consider an effort to map mineral deposits in a rural area where an Indigenous community resides (and where their land tenure is under threat). Making that data available publicly (or privately to mining companies!) is probably not going to be compliant with the BD4D Commitments, even if the data about individuals is kept confidential. The risk to the interests of the community is likely to be too high for the release of this specific geological data. After all, an Indigenous community should have the power to decide what projects are in their best interests. We are currently considering a Major Questions paper covering the concept that a project which would be negatively received by a significant fraction of the community is not going to be compliant with the BD4D.

We also recognize that even when the release of research or downstream data is compliant with the BD4D, there may be individuals who benefit less from that release, or who may even be harmed. For example, a statistical analysis of a number of cities on a given social issue which enables ranking of those cities is going to show some cities ranked much better or much worse than their peers. Such a report might be intended to create political pressure in poorly ranked cities to do more to remedy the issue in question.

Conclusion: What’s Missing?

The Major Questions papers are intended to explore the big issues we hear about from the many collaborators who have contributed their data use cases, their feedback, and their support to the Better Deal for Data. The foregoing distills our initial thinking about nonprofits and open data. At this point, we would like to get even more feedback from the community, especially:

  • What additional issues come to mind about this subject?
  • What did we get wrong?
  • What examples do you have of nonprofit data use which should be inside or outside the Better Deal for Data, or are simply puzzles to consider?

While we believe that openness is generally a good thing (especially as developers of open source software), Commitment Six of the Better Deal for Data requires projects to protect and steward the data and comply with applicable privacy laws. Our goal in this working paper was to explore the limited number of cases where disclosing data under an open license is likely to be compliant with the Better Deal Commitments.

We are looking forward to many new questions and ideas as we work together to craft a usable Better Deal for Data.
