Participatory Data Stewardship Models

In Brief Research Read

Katy McKinney-Bock | October 3, 2023

A full version of this report is available:

A Better Deal for Data is focusing on empowering people to participate in shared governance of their data, and is about making commitments to data subjects, to communities, and for the public good. When data is collected about you, we want to make a set of promises about how that data will be used ahead of time.

Part of this involves research on community-centric governance, or participatory models of engagement around data. In doing this, we have found that the term stewardship is used in various ways to describe the processes and people involved in making decisions about data.

For September, we are asking the broader question, what is data stewardship? In the back of our minds are some other, more specific questions, such as what kinds of agreements do communities enter into when sharing their data? How challenging are these models to implement, and are they resilient over time?

It turns out that pinning down a single notion of data stewardship is challenging. The UN Working Group on Data Stewardship reviewed 34 documents from a variety of sectors, and concluded that data stewardship “was not clearly defined and was often discussed in vague and abstract terms”. Here are some quotable definitions:

    • The responsible use, collection and management of data in a participatory and rights-preserving way. (Ada Lovelace Institute, 2021a, p. 12)
    • Data stewardship /ˈst(y)oo͞ərdˌSHip/ : (noun) Storage of data that also provides the data owner with control (Gorr & Zawacki, 2020)
    • Data stewardship involves collecting, maintaining and sharing data, and, in particular, determining who has access to it, for what purpose and to whose benefit. (Wilson & Thereaux, 2020, p. 4)
    • Stewardship [is] a broader paradigm that pushes the envelope of data governance beyond matters of compliance. Through stewardship, we can envisage new ways to prioritize and embed individual and collective forms of empowerment, agency, and participation in the data economy. (Soni, 2021)
    • Beyond proper collection, annotation, and archival, data stewardship includes the notion of ‘long-term care’ of valuable digital assets, with the goal that they should be discovered and re-used for downstream investigations, either alone, or in combination with newly generated data. (Wilkinson et al., 2016)

Embedded in these various definitions is a sense of long-term care about data and the people data is about. But there are also other values, such as empowering people/creating agency, and enabling participation in processes around data. Mozilla even describes preservation of ‘control’ over the data owner’s data. Several different parts of the data ‘lifecycle’ are described above: collection, storage, maintenance, management, use, access, discovery, re-use.

As varied as the definitions are, some guiding principles have been referenced repeatedly in exploring participatory mechanisms around data. One is Elinor Ostrom’s Design Principles of the Commons, Nobel-prize-winning work that sets out eight principles for governing common natural resources. Following these principles increases the likelihood that the resources are managed successfully, rather than meeting the outcomes predicted in Hardin’s tragedy of the commons, such as overuse and ruin. The Ada Lovelace Institute and Mozilla Foundation have applied the principles to data, with the intent of increasing the successful use of data for public good. Another applied theory is Arnstein’s Ladder of Participation, which describes empowering citizens through increasing participation in democratic processes. Data stewardship can be described in a similar way, from participation via reading a datasheet or model card (“Informing”, the lowest rung of the ladder) to a data cooperative model where ownership and control of data are shared (“Empowerment”, the highest rung of the ladder of participation).
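As a rough illustration of the ladder idea, participation levels can be modeled as an ordered scale so that stewardship mechanisms can be compared. This is a minimal sketch, not a published scheme: only the lowest and highest rungs come from the text above, the three intermediate rung names are an assumption borrowed from common five-level participation spectrums, and the names `Participation`, `MECHANISM_RUNG`, and `more_participatory` are hypothetical.

```python
from enum import IntEnum


class Participation(IntEnum):
    """Rungs ordered from least to most participatory."""
    INFORM = 1       # lowest rung named in the text
    CONSULT = 2      # assumed intermediate rung
    INVOLVE = 3      # assumed intermediate rung
    COLLABORATE = 4  # assumed intermediate rung
    EMPOWER = 5      # highest rung named in the text


# Mechanisms mentioned in the text, mapped to the rung the text assigns them.
MECHANISM_RUNG = {
    "datasheet": Participation.INFORM,
    "model_card": Participation.INFORM,
    "data_cooperative": Participation.EMPOWER,
}


def more_participatory(a: str, b: str) -> bool:
    """Return True if mechanism `a` sits on a higher rung than mechanism `b`."""
    return MECHANISM_RUNG[a] > MECHANISM_RUNG[b]


print(more_participatory("data_cooperative", "datasheet"))  # True
```

Using an ordered enumeration keeps the comparison explicit: a data cooperative ranks above a datasheet because its rung is higher, not because of how the mechanisms happen to be labeled.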

For practitioners, or people who have data and want to make sure it is used for public benefit, several organizations have created questionnaires that can be used to get started or to evaluate your current stewardship practices:

The Data Economy Lab (Aapti Institute) has also set out a framework for understanding models of data stewardship, which they have applied to over 100 instances of stewardship in practical use. The initial (non-exhaustive) set of ten models includes: Data Trust, Data Marketplace, Personal Data Store, Data Exchange, Account Aggregator, Ecosystem Enabler, Data Collaborative, Data Repository, Data Cooperative.

We also examined case studies from both the Ada Lovelace Institute and the Data Economy Lab that are recent examples of participatory data stewardship models.

Data Cooperative and Empowerment:

The Salus Coop is a health data cooperative based in Spain, which developed a health data license to enable donation of pseudonymized health data for use in health research for common good. The Salus CG license makes five commitments to the data donor: (1) Data will only be used for health/social science, (2) no commercial use, (3) shared results (free of charge and accessible), (4) maximum privacy (pseudonymized), and (5) total control (can change access to your data at any time).
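To make the five commitments concrete, here is a minimal sketch of how a data access request could be checked against them. It is an illustration under stated assumptions, not the Salus Coop's actual implementation: the `AccessRequest` fields, the purpose labels, and the `violated_commitments` function are all hypothetical names introduced for this example.

```python
from dataclasses import dataclass

# Purposes permitted under commitment (1); the labels are illustrative assumptions.
ALLOWED_PURPOSES = {"health_research", "social_science_research"}


@dataclass(frozen=True)
class AccessRequest:
    """A hypothetical request to use donated data (field names are assumptions)."""
    purpose: str
    commercial: bool
    results_open_and_free: bool
    data_pseudonymized: bool
    donor_consent_active: bool  # the donor has not withdrawn or restricted access


def violated_commitments(req: AccessRequest) -> list[str]:
    """Return the commitments from the text that this request would break."""
    problems = []
    if req.purpose not in ALLOWED_PURPOSES:
        problems.append("1: health/social science use only")
    if req.commercial:
        problems.append("2: no commercial use")
    if not req.results_open_and_free:
        problems.append("3: shared results")
    if not req.data_pseudonymized:
        problems.append("4: maximum privacy")
    if not req.donor_consent_active:
        problems.append("5: total control")
    return problems


request = AccessRequest(
    purpose="health_research",
    commercial=True,
    results_open_and_free=True,
    data_pseudonymized=True,
    donor_consent_active=True,
)
print(violated_commitments(request))  # ['2: no commercial use']
```

Returning the list of broken commitments, rather than a single yes/no answer, lets a steward tell the requester exactly which promise to the donor the request conflicts with.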

Databank for Youth Collaboration:

The Global Mental Health Databank / the MindKind Study (the Wellcome Trust and Sage Bionetworks) piloted a collaborative approach to creating a data bank of information about anxiety and depression treatment for young people in different settings worldwide. Youth from the UK, India, and South Africa worked through an app to participate in the 12-week study.

Data Collaborative for Australian Agriculture:

The Data Stewardship Navigator lists AgReFed, which is a platform developed for the “sharing and reuse of Australian agricultural research datasets, metadata, and data-related products.” AgReFed is a federated community, organized as a data cooperative, that enables FAIR agricultural data to be shared.

Although the notion of data stewardship is not new, newer data stewardship models are still being piloted, and their resilience is still being established. The approaches above are designing principles-based solutions to encourage participation and empowerment around data for public good. In October, we turn to examples of licenses that have been developed under participatory models of governance/stewardship, and to cases of data sharing.


 

This work is licensed under CC BY 4.0. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/

This work is supported by a subaward from OpenTEAM as an initiative of Wolfe’s Neck Center for Agriculture and the Environment, specifically funded by the U.S. Department of Agriculture under agreement number NR233A750004G032. Any opinions, findings, conclusions, or recommendations expressed in this publication are those of the author and do not necessarily reflect the views of any funder. In addition, any reference to specific brands or types of products or services does not constitute or imply an endorsement.

Please send feedback via email to [email protected].

References

Abrams, M., Abrams, J., Cullen, P., & Goldstein, L. (2019). Artificial Intelligence, Ethics, and Enhanced Data Stewardship. IEEE Security & Privacy, 17(2), 17–30. https://doi.org/10.1109/MSEC.2018.2888778

Ada Lovelace Institute. (2021a). Participatory data stewardship. https://www.adalovelaceinstitute.org/report/participatory-data-stewardship/

Ada Lovelace Institute. (2021b, March 4). Disambiguating data stewardship. https://www.adalovelaceinstitute.org/blog/disambiguating-data-stewardship/

Arnstein, S. R. (1969). A Ladder Of Citizen Participation. Journal of the American Institute of Planners, 35(4), 216–224. https://doi.org/10.1080/01944366908977225

Baker, K. S., & Yarmey, L. (2009). Data Stewardship: Environmental Data Curation and a Web-of-Repositories. International Journal of Digital Curation, 4(2), Article 2. https://doi.org/10.2218/ijdc.v4i2.90

Bloom, G. (2020). The Principles of Governing Open Source Commons. SustainOSS: Exploring Sustainability for Open Source Communities. https://sustainoss.pubpub.org/pub/jqngsp5u/release/1

Boeckhout, M., Zielhuis, G. A., & Bredenoord, A. L. (2018). The FAIR guiding principles for data stewardship: Fair enough? European Journal of Human Genetics, 26(7), Article 7. https://doi.org/10.1038/s41431-018-0160-0

Ensuring the Integrity, Accessibility, and Stewardship of Research Data in the Digital Age. (2009). National Academies Press. https://doi.org/10.17226/12615

Borokini, F., & Saturday, B. (n.d.). Exploring the Future of Data Governance in Africa: Data Stewardship, Collaboratives, Trusts and More [White Paper]. Pollicy.

Gorr, K., & Zawacki, K. (2020, February 17). Data Stewardship—What is it and why does it matter? Mozilla Foundation. https://foundation.mozilla.org/en/blog/data-stewardship-what-it-and-why-does-it-matter/

Hilgartner, S., & Brandt-Rauf, S. I. (1994). Data Access, Ownership, and Control: Toward Empirical Studies of Access Practices. Knowledge, 15(4), 355–372. https://doi.org/10.1177/107554709401500401

Information Commissioner’s Office of the UK. (2023, May 19). Legal definitions. ICO. https://ico.org.uk/for-organisations/data-protection-fee/legal-definitions-fees/

Joshi, D., & Singh, A. (2021). The Histories, Practices, and Policies of Community Data Governance in the ‘Global South’ (SSRN Scholarly Paper 4506644). https://doi.org/10.2139/ssrn.4506644

Kapoor, A., & Whitt, R. S. (2021). Nudging Towards Data Equity: The Role of Stewardship and Fiduciaries in the Digital Economy (SSRN Scholarly Paper 3791845). https://doi.org/10.2139/ssrn.3791845

Lutz, R., & Greene, S. (1999). Data Stewardship: The Care and Handling of Named Entities. Proceedings of the ASIST Annual Meeting, 36. https://www.learntechlib.org/p/87517/

MindKind: A mixed-methods protocol for the feasibility of global digital mental health studies in young people. (2022). Wellcome Open Research, 6, 275. https://doi.org/10.12688/wellcomeopenres.17167.2

National Research Council, Division on Earth and Life Studies, Board on Atmospheric Sciences and Climate, & Committee on Climate Data Records from NOAA Operational Satellites. (2005). Review of NOAA’s Plan for the Scientific Data Stewardship Program. National Academies Press.

O’Hara, K. (2019). Data Trusts: Ethics, Architecture and Governance for Trustworthy Data Stewardship. WSI White Papers, University of Southampton, 1. https://doi.org/10.5258/SOTON/WSI-WP001

Ostrom, E. (1990). Governing the commons: The evolution of institutions for collective action. Cambridge University Press.

Peng, G. (2018). The State of Assessing Data Stewardship Maturity – An Overview. Data Science Journal, 17, 7–7. https://doi.org/10.5334/dsj-2018-007

Ramdeen, S., & Hills, D. J. (2013). ESIP’s Emerging Provenance and Context Content Standard Use Cases: Developing Examples and Models for Data Stewardship. 2013, IN53C-1578.

Rosenbaum, S. (2010). Data Governance and Stewardship: Designing Data Stewardship Entities and Advancing Data Access. Health Services Research, 45(5p2), 1442–1455. https://doi.org/10.1111/j.1475-6773.2010.01140.x

Sagli, J. R., & Egeland, O. (1991). Dynamic coordination and actuator efficiency using momentum control for macro-micro manipulators. 1201–1206. https://doi.org/10.1109/ROBOT.1991.131773

Saxena, S. (2023, September 25). Data Sandboxes: Managing the Open Data Spectrum. Data Stewards Network. https://medium.com/data-stewards-network/data-sandboxes-managing-the-open-data-spectrum-6ef3bf9c5133

Soni, S. (2021a, September 28). Building the Stewardship Navigator: Our Approach and Methodology. The Data Economy Lab. https://thedataeconomylab.com/2021/09/28/building-the-stewardship-navigator-our-approach-methodology/

Soni, S. (2021b, November 22). Empowering communities with data stewardship. The Data Economy Lab. https://thedataeconomylab.com/2021/11/22/empowering-communities-with-data-stewardship/

Strasser, C. (2013). DataUp: Enabling data stewardship for researchers. https://doi.org/10.9776/13300

Toczydlowski, R. H., Liggins, L., Gaither, M. R., Anderson, T. J., Barton, R. L., Berg, J. T., Beskid, S. G., Davis, B., Delgado, A., Farrell, E., Ghoojaei, M., Himmelsbach, N., Holmes, A. E., Queeno, S. R., Trinh, T., Weyand, C. A., Bradburd, G. S., Riginos, C., Toonen, R. J., & Crandall, E. D. (2021). Poor data stewardship will hinder global genetic diversity surveillance. Proceedings of the National Academy of Sciences, 118(34), e2107934118. https://doi.org/10.1073/pnas.2107934118

United Nations Economic Commission for Africa. (2023, May 25). StatsTalk-Africa: Data Stewardship in Africa. Events, UNECA. https://www.uneca.org/eca-events/statstalk-africa-data-stewardship-africa

van den Hoven, J. (1999). Information Resource Management: Stewards of Data. Information Systems Management, 16(1), 88–90. https://doi.org/10.1201/1078/43187.16.1.19990101/31167.13

Verhulst, S. G. (2021). Reimagining data responsibility: 10 new approaches toward a culture of trust in reusing data to address critical public needs. Data & Policy, 3, e6. https://doi.org/10.1017/dap.2021.4

Verhulst, S. G. (2023, March 13). Wanted: Data Stewards — Drafting the Job Specs for A Re-imagined Data Stewardship Role. Data Stewards Network. https://medium.com/data-stewards-network/wanted-data-stewards-drafting-the-job-specs-for-a-re-imagined-data-stewardship-role-f7cd28a83379

Wilkinson, M. D., Dumontier, M., Aalbersberg, Ij. J., Appleton, G., Axton, M., Baak, A., Blomberg, N., Boiten, J.-W., da Silva Santos, L. B., Bourne, P. E., Bouwman, J., Brookes, A. J., Clark, T., Crosas, M., Dillo, I., Dumon, O., Edmunds, S., Evelo, C. T., Finkers, R., … Mons, B. (2016). The FAIR Guiding Principles for scientific data management and stewardship. Scientific Data, 3(1), Article 1. https://doi.org/10.1038/sdata.2016.18

Wilson, R., & Thereaux, O. (2020). Designing Trustworthy Data Institutions. The Open Data Institute.
