"Developing responsible AI practices at the Smithsonian Institution" published in Research Ideas and Outcomes

October 25, 2023

A team of Smithsonian staff and post-doctoral fellows from several units across the Smithsonian Institution co-authored the paper "Developing responsible AI practices at the Smithsonian Institution" in the journal Research Ideas and Outcomes. The paper (available at https://doi.org/10.3897/rio.9.e113334) details the process we followed to develop a values statement for responsibly applying AI to research and existing workflows at the Smithsonian.

The Spring 2022 version of the AI Values Statement can be found below:

AI Values Statement -- Spring 2022

Technology is not neutral.

The use of Artificial Intelligence (AI) tools [1] to describe, analyze, visualize, or aid discovery of information from Smithsonian collections, libraries, archives, and research data reflects the biases and positionality of the people and systems who built each tool, as well as of those who collected, cataloged, and described any data used for their training. These tools may hold great value for the Smithsonian, but the way they were planned and created introduces issues that will limit the applicability and reliability of their use.

We seek to begin only AI projects [2] that implement tools and algorithms that are respectful of the individuals and communities represented by the information in our museum, library, and archival collections. We aim to be proactive in identifying and documenting biases and methodologies when building and implementing such tools, and in making that documentation available to the audiences who will interact with the resulting products. We recognize that technology evolves over time and that our efforts must also evolve to keep our ethical framework relevant and robust. We encourage any person, community, or stakeholder involved with or affected by these tools and algorithms to provide feedback and raise any concerns.

We acknowledge the opportunities that AI tools present for cultural heritage organizations:

  • As digitization of museum, library, and archival collections has become more prevalent, there is a need for tools that make this digitized data available to our audiences.

  • AI tools can be used to make museum, library, and archival collections more discoverable to the public by efficiently extracting, summarizing, and visualizing vast amounts of data.

  • AI tools can help us become more representative of our audiences by surfacing the histories of marginalized people and groups.

We urge anyone contemplating an AI project to consider:

  • Is it the appropriate technology to solve the problem?

  • The development of AI tools often requires the use of specialized computational hardware, the production of which relies on mining of rare earth metals, and the operation of which can have a large carbon footprint. What is the environmental impact of choosing this technology or tool?

  • There are no unbiased methodologies, datasets, collections, algorithms, or tools. Therefore, what are the biases in the methodologies, datasets, collections, algorithms, or tools you wish to use?

We strive to promote the following actions when implementing AI tools:

  • Documentation of the biases in any methodologies, datasets, collections, algorithms, or tools.

  • Creation of transparent data statements that outline the intent of methodologies, datasets, collections, algorithms, or tools.

  • Creation of positionality statements by the creators of datasets or algorithms behind AI tools.

  • Documentation of potential risks and regular updating of these risks as technology changes.

  • Solicitation and inclusion of feedback from relevant members of the community.

  • Documentation of how AI content was produced.

  • Clear labeling of AI-generated content, so it is not confused with human-generated content.

We strive to recognize the following when implementing AI tools:

  • Everyone at the Smithsonian is a stakeholder in data collection, creation, dissemination, and analysis.

  • If any community or individual is harmed by the use of a technology, that is one harm too many.

We strive to promote the following when partnering with outside organizations on AI tools or projects:

  • We should seek projects and partnerships that adhere to our institutional values.

  • We should not enter into contracts or collaborations with industry or other partners for the use of tools with unspecified or undisclosed methods and biases.

  • We should require potential partners who create AI and machine learning tools to explicitly evaluate and state whether the datasets or data descriptions used in these tools were collected without consent, or contain offensive or racist descriptions, before we agree to use these tools.

 

[1] The term “AI tools” includes a variety of technologies that seek to create decision-making software. Some examples include facial and speech recognition, machine learning-based optical character recognition, language translation, natural language processing, image recognition, object detection and segmentation, and data clustering. Common commercial examples include virtual assistants such as Siri or Alexa, website search and recommendation algorithms, and tagging and identification of people in images on social media platforms.

[2] The term “AI project” refers to an intentional effort to utilize or create an AI tool in research or in an existing workflow.

References 

Bender, E. M., Friedman, B. 2018. Data Statements for Natural Language Processing: Toward Mitigating System Bias and Enabling Better Science. Transactions of the Association for Computational Linguistics 6: 587–604. https://aclanthology.org/Q18-1041

Denton, E., Hanna, A., Amironesei, R., Smart, A., Nicole, H., Scheuerman, M. K. 2020. Bringing the People Back In: Contesting Benchmark Machine Learning Datasets. Proceedings of the ICML Workshop on Participatory Approaches to Machine Learning. https://arxiv.org/pdf/2007.07399.pdf

Murphy, O., Villaespesa, E. 2020. AI: A Museum Planning Toolkit. https://themuseumsainetwork.files.wordpress.com/2020/02/20190317_museums-and-ai-toolkit_rl_web.pdf

Schwartz, R., Dodge, J., Smith, N. A., Etzioni, O. 2019. Green AI. https://doi.org/10.48550/arXiv.1907.10597

Stanford Special Collections and University Archives Statement on Potentially Harmful Language in Cataloging and Archival Description. https://library.stanford.edu/spc/using-our-collections/stanford-special-collections-and-university-archives-statement-potentially

Version composed in Spring 2022 by members of the Smithsonian AI & ML community. This statement is considered a living document and is expected to change with time and technology. Comments and suggestions are welcome at SI-DataScience@si.edu.