

Museum APIs

Last edited by Mia 5 months, 1 week ago

Museum, gallery, library, archive, archaeology and assorted sources for machine-readable data


If you know of an open data service that's not listed, just sign up to add it. If you have some free time, the list of APIs needs to be tidied up a bit, and open cultural data releases (such as images online) could be put on a separate page - feel free to dive in and help out! You can reach me at


If you use this to find data for a project or to build another list, it'd be lovely if you credited the wiki/Mia Ridge as its maintainer, as it's been a lot of work over the years.


Please add APIs and other data sets you're aware of, ideally with a comment on what's useful or not about how they've been implemented or documented.  If you know of or have written libraries for use with these APIs, then please add them to help other people get started.  It's fine to add your own GLAM API or open cultural dataset - in fact, it's encouraged, whether it's tiny or huge, or whether it's a simple text file or a fancy endpoint. The layout is simple - H2 heading for the title, a brief summary of what's available with which licences and formats, and as many links as you like.


Information about schemas/data structures used by other projects is also useful, even if they don't provide an API.  If you're looking for images and books rather than machine-readable data, here's a lot of 'open collections'.  If there's a public-facing collections online site related to the museum API, feel free to include it.  It's also helpful if you can give a rough estimate of the number of records and supported access methods.


Designing an API? Check out (and add to) Making good APIs. Thinking of having a hack day to road-test your API?  Check out hackdaymanifesto.com for lots of useful tips.  If you're preparing a list of data sets for a hackday, why not add them to this page and share them more widely?


Other sites listing data sources include:


British Library and Alan Turing Institute Living with Machines project

The project (2018-23) produced a large number of datasets. Highlights include:


University of British Columbia Open Collections API Documentation

UBC Library Open Collections provides direct programmatic access to all of the metadata and much of the media available on the website through our APIs.

Three different Open Collections API Endpoints are available to meet a variety of use cases:

  • The Open Collections REST API provides access to all collection and item-level metadata and full text transcripts, as well as the ElasticSearch search index for complex queries.
  • The Open Collections OAI-PMH API provides access to item-level metadata programmatically or for harvesting by compatible software.
  • The Open Collections IIIF API provides access to images and associated metadata programmatically or for use with IIIF-compatible image viewers.
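As a rough illustration of what harvesting over OAI-PMH involves, the sketch below builds a request URL from the protocol's standard arguments (`verb`, `metadataPrefix`, `set`). The base URL and set name are placeholders, not the real UBC endpoint, and the function name `oai_pmh_url` is made up for this example; check the Open Collections documentation for the actual address.

```python
from urllib.parse import urlencode


def oai_pmh_url(base_url, verb="ListRecords", metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH request URL from standard protocol arguments.

    metadata_prefix can be set to None for verbs that do not take it
    (for example Identify or ListSets).
    """
    params = {"verb": verb}
    if metadata_prefix:
        params["metadataPrefix"] = metadata_prefix
    if set_spec:
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


# Placeholder base URL and set name (assumptions, not a real endpoint).
print(oai_pmh_url("https://example.org/oai", set_spec="maps"))
```

A harvester would fetch this URL, parse the XML response, and repeat the request with any `resumptionToken` the server returns until the list is complete.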


National Library of Singapore Datasets

'The National Library’s digital collection comprises of various formats such as digitised books, magazines, maps, newspapers, images, sound and video recordings and websites. We have made the metadata of this collection available in the form of open datasets for non-commercial visualisation and analysis. '


Victoria and Albert Museum API


The V&A makes available over a million collections records and over half a million images for you to explore, research, share, and play with.

This information is made available for re-use through our APIs, under our terms and conditions (in particular Section 9), for exploring the collections metadata (information about the objects in our collections) and for displaying the collection images (images of objects in our collections).


To use our APIs, take a look at our API Guide and API Reference. If you would like to see some worked examples exploring the collections data, look through our Data Explorations notebooks.


You can also download JSON and IIIF manifest links from collections item pages.

National Gallery of Art (USA) Open Data Program

'The National Gallery of Art serves the United States by welcoming all people to explore and experience art, creativity, and our shared humanity. In pursuing our mission, we are making certain data about our collection available to scholars, educators, and the general public in CSV format...'

Released under the Creative Commons Zero designation.

The dataset provides data records [metadata] relating to the 130,000+ artworks in our collection and the artists who created them. ...

The dataset is published in CSV format and uses UTF-8 encoding. It is compressed with LZW compression to permit the files to be downloaded without any special tools and can be decompressed with commonly available file unzipping tools. A data dictionary fully describes the dataset.'

Whitney Museum of American Art | Open Access

'As the preeminent institution devoted to the art of the United States, the Whitney Museum of American Art presents the full range of twentieth-century and contemporary American art, with a special focus on works by living artists. ...

The Whitney’s collection includes over 25,000 works created by more than 3,600 artists in the United States during the twentieth and twenty-first centuries. The datasets in this repository reflect those works and artists that have been catalogued and publicly released on the Museum’s website.

At this time, the datasets are available in CSV format, encoded in UTF-8.

These datasets are placed in the public domain using a CC0 License.'


Further reading: Stepping into open access at the Whitney

Smithsonian Open Access

'download, share, and reuse millions of the Smithsonian’s images—right now, without asking. With new platforms and tools, you have easier access to nearly 3 million 2D and 3D digital items from our collections—with many more to come. This includes images and data from across the Smithsonian’s 19 museums, nine research centers, libraries, archives, and the National Zoo.'


Landing page: https://www.si.edu/openaccess

Metadata: https://github.com/Smithsonian/OpenAccess 

Smithsonian Institution Open Access API documentation http://edan.si.edu/openaccess/apidocs/ https://edan.si.edu/openaccess/docs/ 

Hungarian photo archive Fortepan

Almost 120,000 searchable images for free use and download. https://beta.fortepan.hu/ is currently under development; the old site with English language options is at http://fortepan.hu/?language=en-US (HT @doctor_gwen for the link).

Museums of the City of Paris / Paris Musées

100,000 open access works and an API.


Collections Portal: http://parismuseescollections.paris.fr/


Collections API: http://apicollections.parismusees.paris.fr/ 

Using OCLC APIs in OpenRefine

An article on using APIs without writing code, based on OCLC APIs that provide access to library data

Australian GLAM-related CSV-formatted datasets

Collected by Tim Sherratt @wragge, who says: after an initial sweep of Australian gov data portals I have a list of more than 400 GLAM-related CSV-formatted datasets.

Browse here: https://docs.google.com/spreadsheets/d/1gCcLZEe-pdYEn8DfLrhM9WwfJ2jgfPV39ZRiodzhm78/edit#gid=1192241706

Harvesting code in the notebook here: https://github.com/wragge/ozglam-data

Museum für Kunst und Gewerbe Hamburg (MKG) Collection Online

With approximately 500,000 objects from 4,000 years of human history, the Museum für Kunst und Gewerbe Hamburg (MKG) is one of Europe’s most important museums of art and design. The digitised and published parts of the collection are accessible online via MKG Collection Online since 2015: http://sammlungonline.mkg-hamburg.de/en

The website features more than 20,000 artworks and artifacts. The LIDO-XML dataset that is provided and updated here contains metadata of all the published records including links to the connected images, if available.

Wellcome Collection


As of April 2018, you’ll find the following:

  • Catalogue: Unified search across our museum and library collections, using the same API that we use on wellcomecollection.org.
  • IIIF: Access to 110k images from our collections, using the standard International Image Interoperability Framework (IIIF) Image API.
  • Datasets: A full daily snapshot of the Catalogue and around 5800 Medical Officer of Health (MOH) reports from the Greater London area.
  • Ontologies: A set of OWL ontologies that describe our collections, exhibitions and editorial content as a rich graph of linked data.
  • Cardigan: Our UX design system, including the visual architecture and reusable components we use to build wellcomecollection.org.
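For anyone new to the IIIF Image API mentioned in the list above, image requests follow a fixed URL pattern: `{base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}`. The sketch below assembles such a URL. The base URL and identifier are placeholders rather than real Wellcome Collection values, and `iiif_image_url` is a name invented for this example.

```python
from urllib.parse import quote


def iiif_image_url(base_url, identifier, region="full", size="max",
                   rotation="0", quality="default", fmt="jpg"):
    """Assemble a IIIF Image API request URL.

    The identifier is percent-encoded so that characters such as '/'
    do not get mistaken for path separators.
    """
    ident = quote(identifier, safe="")
    return f"{base_url.rstrip('/')}/{ident}/{region}/{size}/{rotation}/{quality}.{fmt}"


# Placeholder server and identifier (assumptions); "300," asks for a 300px-wide image.
print(iiif_image_url("https://example.org/iiif", "abc/123", size="300,"))
```

Appending `/info.json` to `{base}/{identifier}` instead returns the image's technical description, which lists the sizes and formats that the server supports.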


Biodiversity Heritage Library developer tools and API

BHL is a treasure trove of information about global biodiversity. In addition to the various programmatic methods - an API supporting http and SOAP queries and an OAI-PMH interface, plus R and other libraries - there's also a data export page.

British Library datasets

The British Library is making copies of some of its datasets available for research and creative purposes. Hugely varied datasets from the collections of the British Library, ranging from archived web pages to Hebrew manuscripts.

Library of Congress MARC records

'Our open-access service includes nearly 25 million MARC records, as distributed in the unabridged 2014 Retrospective file sets. These MDS record sets have been made available primarily for research and development usage. Records are available in three file formats - UTF8, MARC8 and XML.'


Library of Congress for robots

'We hope this list of APIs, bulk downloads, and tutorials will help you begin exploring the many ways the Library of Congress provides machine-readable access to its digital collections.'


'Discussing Library of Congress API documentation with Laura Wrubel and Patrick Rourke' - a June 2023 interview with two people who've worked extensively with the API


Congress.gov API

The beta Congress.gov Application Programming Interface (API) provides a method for Congress and the public to view, retrieve, and re-use machine-readable data from collections available on Congress.gov.


Documentation: https://github.com/LibraryOfCongress/api.congress.gov/

Background information and more from its beta launch in 2022.

Heritage Index for the United Kingdom

Downloadable dataset that combines data about heritage assets (from the expected buildings etc to 'shipwrecks, ancient trees, war memorials and more') and activities (volunteering etc) collected under headings for historic built environment, museums and archives, industrial heritage, parks and greenspace, natural heritage, cultures and memories, general – assets and activities for each heading:


Carnegie Hall's performance history as linked open data

Documentation and SPARQL endpoint

'The initial release encompasses performance history data from 1891 through the end of the 2015-16 concert season (July 15, 2016).'


Bluestocking Corpus: Letters by Elizabeth Montagu, 1730s-1780s

243 manuscript letters, written by the ‘Queen of the Blues’ Elizabeth Montagu between the 1730s and the 1780s. Elizabeth Montagu (née Robinson, 1718-1800) was one of the key figures of the learning-oriented Bluestocking Circle in eighteenth-century England. Available as XLSX or XML download.


Williams College Museum of Art (WCMA) Collection

The WCMA collection includes works of art in all media ranging from ancient Egyptian, Assyrian, and Greco-Roman objects, Indian painting, African sculpture, photography, art of the U.S., and international modern and contemporary art. This research dataset contains 15,635 records, representing all currently accessioned works of art in WCMA’s collection. It includes basic metadata for each artwork, including accession number, title, date, classification, medium, and dimensions. This dataset is placed in the public domain using a CC0 License.


At this time, the dataset is available in CSV format, encoded in UTF-8. While UTF-8 is the standard for multilingual character encodings, it is not correctly interpreted by Excel on a Mac. Users of Excel on a Mac can convert the UTF-8 to UTF-16 so the file can be imported correctly.
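One possible way to do the UTF-8 to UTF-16 conversion described above is a few lines of Python. This is a minimal sketch, not WCMA's recommended procedure: the function name and the file names in the comment are hypothetical, and it reads the whole file into memory, which should be fine for a dataset of this size.

```python
def convert_csv_utf8_to_utf16(src_path, dst_path):
    """Re-encode a UTF-8 text file as UTF-16 (with a byte order mark).

    newline="" keeps the original line endings untouched, which matters
    for CSV files that contain line breaks inside quoted fields.
    "utf-8-sig" also accepts input that already starts with a UTF-8 BOM.
    """
    with open(src_path, "r", encoding="utf-8-sig", newline="") as src:
        text = src.read()
    with open(dst_path, "w", encoding="utf-16", newline="") as dst:
        dst.write(text)


# Example call with hypothetical file names:
# convert_csv_utf8_to_utf16("wcma-collection.csv", "wcma-collection-utf16.csv")
```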

ACMI (Australian Centre for the Moving Image) 

'The ACMI API provides JSON-formatted data as a REST-style service that allows you to explore and integrate our museum’s public data into your projects': https://www.acmi.net.au/api/


An older GitHub-based site: ACMI Collection Data, CC0 1.0 Universal, files in TSV and JSON. This GitHub page describes the data structures and provides instructions for opening files.


ACMI Historic Film Screening Data, CC0, a single TSV file listing dates and times of film screenings at the Australian Centre for the Moving Image from 1st April 2004 to 24th January 2016.

Data Foundry, National Library of Scotland

The National Library of Scotland's Digital Scholarship Service publishes data collections on the Data Foundry. Collections on the Data Foundry include: digitised collections (text and images); metadata collections; map data; and organisational data. The Library updates and adds to these data collections on a regular basis.


TIB AV-Portal

The German National Library of Science and Technology (TIB) aims to promote the use and distribution of its collections. In this context, TIB publishes the authoritative and time-based, automatically generated metadata of videos of the TIB AV-Portal as Linked Open Data. Only metadata and thumbnails of videos which allow usage of their respective metadata and thumbnails under the Creative Commons License CC0 1.0 Universal are made available. Please note that the data was partially generated by an automatic process and may therefore contain errors or might be incomplete.


The Federico Zeri Photo Archive

From http://data.fondazionezeri.unibo.it/: Available as linked open data through the Zeri & LODE project: 'Data mostly regard artworks of Modern Art (15th-16th centuries): about 19.000 works of art and more than 30.000 photographs depicting such works are accurately described by means of like 11 million of RDF statements'. It uses 'two Italian metadata content standards, Scheda F, for Scheda di fotografia (photograph) and Scheda OA, for Scheda Opera d’Arte (work of art), both issued by the ICCD (Istituto Centrale per il Catalogo e la Documentazione, Central Institute for the Cataloguing and Documentation) of the Italian Ministry of Cultural Heritage'.


Access data and connect your application to our SPARQL endpoint or use the web interface to directly query it.

The latest version of the RDF dataset can be downloaded from the University of Bologna data repository AMSActa: DOI:10.6092/unibo/amsacta/5157

Open dataset of the Central Library of the Hungarian National Museum

The Central Library[1] of the Hungarian National Museum has published its entire catalogue as open data under a CC0 licence. The data can be accessed and downloaded from the online catalogue and from the OAI-PMH and Z39.50 servers. A zip file containing all data is also available. The data can be accessed in many formats (HUNMARC, MARC21, MARCXML) and in many character sets, including Unicode. The documentation can be found here: http://hnm.hu/hu/muzeum/konyvtar/nyilt-bibliografiai-adatok

[1]: http://mnm.hu/konyvtar

The Museum of Modern Art (MoMA) Exhibition and Staff Histories

Exhibitions data on github: 'The exhibition index dataset was compiled by a project team from the MoMA Archives as part of their work to preserve, describe, and open to the public over 22,000 folders of exhibition records dating from 1929 to 1989 from its registrar and curatorial departments. ... This research dataset lists 1,788 exhibitions, representing all of the known exhibitions held at the museum from 1929 through 1989. All known curators and organizers, artists and other participants are listed for each exhibition. A total of 11,550 constituents are represented in this dataset, approximately 5,900 of them not currently represented in MoMA’s permanent collection of artworks.'


Their 'data centre' provides 'public access to high quality, open-licensed data that has been used or produced by MicroPasts crowd-sourcing or crowd-funding projects'. This includes 3D Models of various objects as well as data transcribed from museum records.


A useful guide to cultural datasets in data.gov.au and the Australian National Data Service (ANDS).


'Using correspSearch you can search through indexes of different letter collections (digital or print) by sender, addressee, location written, location sent, and date. To this purpose a website and a technical interface are provided. The web service collects and evaluates TEI-XML data in the ‘Correspondence Metadata Interchange’ format.' 



Structured data about military units in the First World War

The site is work-in-progress but useful data can already be obtained via http://collaborativecollections.org/WorldWarOne/How_to_access_structured_data


The site contains 'manuscripts written in the Arabic script from all subject areas, and of various geographical origins, dating from the rise of Islam up to the 19th century'. Contributing organisations include Bodleian Libraries, Oxford and Cambridge University Library. Further information:



Europeana data: full text Latvian, Austrian, Estonian, Finnish, French, Italian, Polish newspapers

Downloadable datasets.

Biblioteca Virtual Miguel de Cervantes

'The catalogue of the Biblioteca Virtual Miguel de Cervantes contains about 200,000 records which were originally created in compliance with the MARC21 standard. The entries in the catalogue have been recently migrated to a new relational database whose data model adheres to the conceptual models promoted by the International Federation of Library Associations and Institutions (IFLA), in particular, to the FRBR and FRAD specifications.

The database content has been later mapped, by means of an automated procedure, to RDF triples which employ mainly the RDA vocabulary (Resource Description and Access) to describe the entities, as well as their properties and relationships. In contrast to a direct transformation, the intermediate relational model provides tighter control over the process —for example through referential integrity—, and therefore enhanced validation of the output. This RDF-based semantic description of the catalogue is now accessible online.'

Museet for Søfart (Maritime Museum of Denmark)

JSON queries for their museum, library, image and exhibit databases.

American Museum of Natural History's Digital Universe Data (Star API)

'We provide access to positions, luminosity, color, and other data on over 100,000 stars as well as constellations, exo-planets, clusters and others.'

Docs for JSON access.


'The Artsy API provides access to images of historic artwork and related information on artsy.net for educational and other non-commercial purposes. It's currently available for playing, testing, and learning only, and not for production'

API: https://api.artsy.net/api


The journal has released various ontologies implemented in SKOS and two datasets:

  • Articles Dataset A dataset comprising the full set of articles on nature.com (1845-2015)
  • Contributors Dataset A dataset comprising the full set of contributors on nature.com (1845-2015)

The Museum of Modern Art (MoMA) Collection

CSV files available via GitHub, under a CC0 License. Over 120,000 records, 'representing all of the works that have been accessioned into MoMA’s collection and cataloged in our database. It includes basic metadata for each work, including title, artist, date made, medium, dimensions, and date acquired by the Museum'. Not all records are complete.


Includes historical data such as Hansard volumes, and the UK Parliament Ontology.

Smithsonian American Art Museum Linked Open Data

'The American Art Museum has made use of the CIDOC Conceptual Reference Model (CIDOC-CRM) to map out the concepts and relationships that exist within our collection. With the use of the W3C open data standard, RDF and SPARQL Protocol and RDF Query Language (SPARQL), linked open data gives researchers and developers outside the museum the ability to develop alternative applications, devices, and interfaces.'


Includes artwork and artist data with a SPARQL endpoint. Data available under CC0.


This will also contribute to the American Art Collaborative (AAC) which aims to establish a 'critical mass of LOD on the subject of American Art'.

Auckland Museum open data

Almost 1 million records from the Auckland Museum’s natural sciences, human history, documentary heritage and Cenotaph collections are available as linked open data.

BBC Ontologies

Including creative works, curriculum, news and wildlife.

Manuscripts Online API

Manuscripts Online (http://www.manuscriptsonline.org/) enables you to search a diverse body of online primary resources relating to written and early printed culture in Britain during the period 1000 to 1500. The resources include literary manuscripts, historical documents and early printed books which are located on websites owned by libraries, archives, universities and publishers. The Manuscripts Online API enables users to connect programmatically to the search engine, using GET parameters, and retrieve search results in an XML format.

Yale Center for British Art


The Yale Center for British Art's Linked Open Data service provides machine-readable access to the Center’s collections data. The data in the Linked Open Data service is expressed in RDF to allow our data set to link to other data sets without the need for database integration. We have organized our data using the CIDOC CRM (Conceptual Reference Model). The CRM is a powerful and robust ontology that represents our data set granularly to permit semantic integration with other data sets. This resource exists to support scholarly and creative activities, and to facilitate interdisciplinary projects.

The data, and sample SPARQL queries, are available through our SPARQL endpoint. Browse examples of our data using Pubby.

When accessing or using the Center's data and services, please be mindful that they are subject to the Center's Open Data And Data Services Terms of Use.

A description of our internal architecture is available on our In Depth page.

See related projects: researchspace.org, collection.britishmuseum.org, and cidoc-crm.org.


Swiss Heritage Data

A list of 'open heritage data from and/or about Switzerland' compiled for a hackathon.

Institute of Museum and Library Services (IMLS) data catalogue 

Data available includes the Museum Universe Data File FY 2015 Q1 ('a list of known museums in the United States maintained by the Institute of Museum and Library Services'), as well as information on grants and public libraries.

The UK National Archives API

'Discovery holds more than 32 million descriptions of records held by The National Archives and more than 2,500 archives and institutions across the United Kingdom as well as a much smaller number of archives around the world. The information in Discovery is made up of record descriptions provided by or derived from the catalogues of the different archives. Although some of The National Archives records have been digitised and can be read online, Discovery can't search the words within them - only their description and title.'


See also: The National Archives Labs Datasets (downloadable data in XLS format, licensed under the Open Government Licence).

The Internet Archive Metadata API

'The Metadata API is intended for fast, flexible, and reliable reading and writing of Internet Archive items.' Uses JSON. See also 'Internet Archive's S3 like server API'


Lincoln Mullen has shared an R Client for the Internet Archive API.


North Rhine-Westphalian Library Service Center (hbz)'s Linked Open Data

OhioLINK Collection and Circulation Analysis—Circulation Data

Provides access to a database of library 'circulation data compiled by OhioLINK and corresponding bibliographic data from the WorldCat database maintained by OCLC' under the Open Data Commons Attribution License


Available in Excel or tab-delimited text. They request that users adhere to the OCLC Community Norms.

Harvard Art Museums API

The Harvard Art Museums API is a REST-style service designed for developers who wish to explore and integrate the museums’ collections in their projects. The API provides direct access to detailed JSON-formatted records for over 220,000 art objects, people, exhibitions, publications, and more. Documentation can be found on GitHub. The museums also provide access to over 230,000 IIIF presentation manifests for objects, exhibitions, and galleries via a separate service. IIIF-specific documentation can be found on GitHub as well.

See also: How We Learned to Stop Worrying and Love Open Data: A Case Study in the Harvard Art Museums’ API

Harvard Library Bibliographic Dataset

'This dataset contains over 12 million bibliographic records for materials held by the Harvard Library, including books, journals, electronic resources, manuscripts, archival materials, scores, audio, video and other materials.'


MARC21 records are available for download from openmetadata.lib.harvard.edu/bibdata/data under a CC0 licence. (At the time of writing the file is approx. 3.8 GB.) They request that users 'comply with a simple set of Harvard Library community norms.  These norms request attribution and that if others improve this data, they make those improvements equally freely available'.


It's also available via the DPLA API.

Feeding America: The Historic American Cookbook Project

'The "Feeding America: The Historic American Cookbook" dataset contains transcribed and encoded text from 76 influential American cookbooks held by MSU Libraries Special Collections. Features encoded within the text include but are not limited to recipes, types of recipes, cooking implements, and ingredients. The 76 texts were chosen among more than 7000 cookbooks that MSU Libraries holds as representative of periods and themes in American cookbook history spanning the late 18th to early 20th century.'


Data available as a simple download: 'The "Feeding America: The Historic American Cookbook" dataset contains 76 plain text files of transcribed cookbook text, 76 XML files of encoded cookbook text, 1 XML file that includes metadata records for each cookbook in the dataset, and 1 DTD file that describes the schema that was used to encode the cookbooks.'

Queensland Art Gallery | Gallery of Modern Art (QAGOMA)

Catalogue data of artworks in the QAGOMA Collection. CSV format with a CC-BY licence, available via CKAN-style API or download.

They've also put attendance and other data online in the Queensland Government data repository.

Natural History Museum’s research and collections data (UK)

'2,499,355 of the Museum's 80 million specimens are now available online.'

Includes information on how to cite the dataset. Data is available through a CKAN-style API or as a download.

European Library Open Dataset

'Two mechanisms are available for interacting or obtaining the dataset is available. It can be obtained as bulk file downloadable files and according to the specifications of the W3C Linked Data Platform 1.0.

Bulk files for all open data collections can be downloaded at the data sets page.


The linked open data set is available under the Creative Commons CC0 1.0 Universal license.

Data Model and Vocabularies

A complete description of the data model used for the dataset can be found in the document: “ Linked Data at The European Library: Data Model and Vocabularies”.'

Biblioteca Nacional de España


Balboa Park Commons

'The Balboa Park Commons site uses a simple rest style API that allows the retrival of our data in a lightwieght JSON data format. There is no authentication required and the data cannot be changed or manipulated on the host side. Developers and enthusiasts may retrive the JSON data and manipulate however they see fit on the client side. In this example the JSON data for the page of a Featured Set from url http://www.balboaparkcommons.org/objectview/listview/14149813/Animals is retrieved, itenerated through and echoed out on to the page using two simple PHP function called "file_get_contents" and "json_decode".'

Europeana API and Europeana Labs


Re-launched April 2014. 'We currently offer two APIs for use. The first is a REST-API that is suited for dynamic search and retrieval of our data. This API offers exactly the same data as the Europeana Portal for end-users and in many ways the Portal can be viewed as an advanced API-implementation.

The second API is more experimental and supports download of complete datasets and advanced semantic search and retrieval of our data via the SPARQL query language. The Linked Open Data Downloads and SPARQL-endpoint currently includes only a sub-set of all Europeana data, about 20 million of the in-total nearly 31 millions records.'


REST API Standard REST calls over HTTP. Responses returned in JSON.
Linked Open Data Query and retrieve data in SPARQL 1.1.


Europeana Search Widget is a ready-made search box suitable for organisations who want to enable search in Europeana collections with the least possible effort. The widget is easily styled, configured and quickly embedded by simply copying and pasting an HTML-snippet into your website.

Danish/Copenhagen police records

Search for more than 1.7 million people, their positions, children and spouses, and more than 4 million addresses (of which 1.7 million are geotagged).

All data is returned from our RESTful API in JSON format.


The data concerns all people living in Copenhagen between 1890 and 1923. The law required citizens of Copenhagen to report personal information, such as occupation, place of birth and family details, to the police.

The 1.4 million registrations have been digitized by Copenhagen City Archives and tagged by volunteers.


See more at http://www.politietsregisterblade.dk/api/1/info.html


The Walters Art Museum Collections API

Documentation on GitHub: 'There are 5 objects that you can get via the Walters API. Each of these has documentation available via the links below.


Bonus 'sandbox' for trying it out.

Canadiana Discovery Portal API

Canada's national aggregator, 'Canadiana.org operates the Canadiana Discovery Portal, a federated search platform that collects metadata (cataloguing information) from our partners and connects it into a single searchable portal. Some 40 memory institutions have joined, providing access to 65 million pages in total'.

'Any search query can be turned into a Web service request by appending the parameter fmt=json to retrieve a JSON object.'
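The quoted rule (append `fmt=json` to any search query) can be sketched as a small helper. The search URL below is a placeholder rather than the real Canadiana address, and `as_json_request` is a name invented for this example. It replaces any existing `fmt` parameter instead of adding a second one.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def as_json_request(search_url):
    """Return the search URL with fmt=json set in its query string."""
    parts = urlsplit(search_url)
    # Keep every existing parameter except a previous fmt value.
    query = [(key, value)
             for key, value in parse_qsl(parts.query, keep_blank_values=True)
             if key != "fmt"]
    query.append(("fmt", "json"))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Placeholder search URL (an assumption, not the portal's real address).
print(as_json_request("https://example.org/search?q=railway"))
```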


And here's a short example of how to use it from Ian Milligan.

CSTMC Artifact Collection

'This data set includes all artifacts in the collection of the Canada Agriculture and Food Museum, Canada Aviation and Space Museum, and Canada Science and Technology Museum. These artifacts represent the products and processes of all areas of science and technology, including communications; non-renewable resources and industrial design; physical sciences and medicine; renewable resources, including agriculture and forestry; and transportation, including land, marine, and aviation and Space flight. The data set is available in XML, and includes images for almost every artifact in the collection.'

See also: Tips for Using the Artifact Open Data Set.


Data about museum organisations and other data appears to be available through the Canadian Government's Open Data Portal.

Sharing Ancient Wisdoms

'The focus of the SAWS project is on collections of ideas and opinions – ranging from pithy sayings to short passages from longer philosophical texts - which make up the ancient genre of Wisdom Literature.' 

The SAWS RDF data is available as an RDF/XML file dump.  'We have set up a public SPARQL endpoint for SAWS RDF data (through a SNORQL interface). Alternatively to run SPARQL queries through the browser, use http://www.ancientwisdoms.ac.uk/sesame/repositories/saws?query= and append your query to the end.'
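Appending a query to that URL means percent-encoding it first. A minimal Python sketch, assuming the endpoint prefix quoted above is still current; the function name is illustrative, and the query is a generic "any ten triples" pattern that assumes nothing about the SAWS ontology.

```python
from urllib.parse import quote

# Prefix taken from the quoted SAWS documentation.
SAWS_ENDPOINT = "http://www.ancientwisdoms.ac.uk/sesame/repositories/saws?query="


def saws_query_url(sparql: str) -> str:
    """Append a percent-encoded SPARQL query to the endpoint prefix."""
    return SAWS_ENDPOINT + quote(sparql, safe="")


# Generic query asking for any ten triples.
url = saws_query_url("SELECT * WHERE { ?s ?p ?o } LIMIT 10")
```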

Les collections du musée départemental Albert-Kahn

Offers image galleries, simple visualisations and the ability to export data in CSV, JSON, GeoJSON, KML. More information: https://opendata.hauts-de-seine.fr/explore/dataset/archives-de-la-planete/information/

Deutsche Digitale Bibliothek API

Press release in English; API der Deutschen Digitalen Bibliothek documentation (in German)

'Over the next months, Deutsche Digitale Bibliothek plans to announce a programming competition for API applications and will host a series of workshops for developers. 

We expect that the most convincing and creative ideas for the application of the API will come from DDB’s community of users and data providers.' Metadata released under CC0 licence.

The Tate Collection


'metadata for around 70,000 artworks that Tate owns or jointly owns with the National Galleries of Scotland as part of ARTIST ROOMS. Metadata for around 3,500 associated artists is also included.

The metadata here is released under the Creative Commons Public Domain CC0 licence. Please see the enclosed LICENCE file for more detail.

Images are not included and are not part of the dataset. Use of Tate images is covered on the Copyright and permissions page. You may also license images for commercial use.'

We offer two data formats:

  1. A richer dataset is provided in the JSON format, which is organised by the directory structure of the Git repository. JSON supports more hierarchical or nested information such as subjects.

  2. We also provide CSVs of flattened data, which is less comprehensive but perhaps easier to grok. The CSVs provide a good introduction to overall contents of the Tate metadata and create opportunities for artistic pivot tables.


Finnish National Gallery API

The largest art museum organisation in Finland, including the Ateneum Art Museum, the Museum of Contemporary Art Kiasma, the Sinebrychoff Art Museum, and the Central Art Archives. In addition to the API, The Finnish National Gallery provides a data-package that contains all artwork information in a single data-file. Released under Creative Commons CC0 1.0 Universal (CC0 1.0). You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission.


Units of the Finnish National Gallery manage more than 36 000 artworks. Files can be downloaded, or are available through REST calls over HTTP as Dublin Core XML, Dublin Core JSON or Dublin Core Text. 

US National Archives for Developers Developer Hub

'Archives.gov/developer connects citizen developers with the tool they need to unlock government data' and includes information on:



US National Archives Catalog API


  • Archival description metadata
  • Authority record (persons, organizations, geographic references, topical subjects, and specific records types) metadata
  • Digital object metadata (including technical metadata)
  • Public contributions to the catalog (like tags and transcriptions)
  • Metadata for crawled web pages of archives.gov and presidential libraries
  • Some metadata for user accounts


See also: Introductory blog post and github link: Using the National Archives Catalog API, 'a basic FAQ about NARA's online catalog API'.

Penn Museum

Downloadable 'metadata about the objects in the collection. This usually includes information about what the object is, a brief physical description, dates, where it is from and what it is made of. Some records will have more information than others. As a work in progress, the Museum continuously strives to improve both the quantity and quality of the data contained within the online database.' Available under a Creative Commons Attribution 3.0 Unported License as CSV, XML or JSON.

Collections from the University of Pennsylvania Museum of Archaeology and Anthropology

PastPlace Historical Gazetteer API

The PastPlace API (Applications Programming Interface) offers a simple, free to use, web service. It allows you to search our database of historical information directly in a variety of ways, returning data in a range of structured formats. It is designed to allow web servers and software components to speak to each other more easily. Human beings looking for data should generally use our main website, A Vision of Britain through Time.

A Vision of Britain through Time

Digital boundaries (shapefiles), Historical statistics by location or topic, and data from the GB1900 crowdsourcing project.

Digital Public Library of America (DPLA) want you to 'make something awesome'


Their REST API (free key required) returns JSON-LD.


Bulk download

'All DPLA data in the DPLA repository is available for download as zipped json files. These include the standard DPLA fields, as well as the complete record received from the partner.'


As far as I can tell, the relevant licences are discussed on the Terms and Policies pages, and their metadata (PDF) is released under CC0 and content ('any images, video and audio') is under CC BY 3.0 License.


They've also outlined their philosophy and listed their namespaces: dc: Dublin Core elements; dcterms: Dublin Core terms; dcmitype: Dublin Core Metadata Initiative types; dpla: Hey, that’s us!; edm: Europeana Data Model; ore: Open Archives Initiative Object Reuse and Exchange; rdf: Resource Description Framework; rdfs: Resource Description Framework Schema; skos: Simple Knowledge Organization System; owl: Web Ontology Language.


New York Public Library (NYPL) Digital Collections API

Announcing the NYPL Digital Collections API

The New York Public Library Digital Collections API: 'For more than a century, The New York Public Library has amassed an extraordinary trove of rare and unique material covering the full spectrum of recorded knowledge. Now, for the first time, significant portions of the Library's digitized collections are available as machine-readable data: over 1 million objects and records for you to search, crawl and compute.' Page includes sample access code.

Formats: XML, JSON in the MODS (Metadata Object Description Schema) format.


And here's a post from Dot Porter on 'How to get MODS using the NYPL Digital Collections API'.


See also: NYPL Digital Collections Public Domain Item Data and Tools


And now - Public Domain Collections: Free to Share & Reuse

Royal Albert Memorial Museum & Art Gallery (RAMM)

'RAMM cares for a wonderful and diverse collection consisting of over one million individual objects and specimens from all over the globe. They are divided into the following curatorial departments: antiquities; ethnography; natural history, decorative and fine arts.

The collections contain items of local, national and international importance, and many are of outstanding historical or cultural significance. Our ethnography collection has achieved Designated Collection status and RAMM as a whole won the title Museum of the Year in 2012.'

REST API, key required

Corpus Christi Museum of Science and History

'The Portal to Texas History provides public access to a number of application programming interfaces (APIs) to the collections within the system. Below are examples of APIs available for Corpus Christi Museum of Science and History that can be used openly by those interested in programmatically accessing data from this system. You do not need to apply for a special key to use these APIs.


Note that all example URLs below use the same protocol and server name, http://texashistory.unt.edu/explore/partners/CCMS/. We only show the URL paths and parameters below to save space.'


They offer access via OAI-PMH, SRU, and OpenSearch.

Full list at APIs for Corpus Christi Museum of Science and History

Exploring Surrey's Past

The site contains content about 'the People, Places, and Times that have helped shape the county' of Surrey, England, and databases with 'descriptions of archive documents, archaeological sites, museum objects or photographs'.


From Developers and API:

The API is based upon OpenSearch, and by default takes a Lucene style query syntax.  The API is available via a rate limiting proxy, API search requests are limited to 900 requests per hour.  The data made available through the API is licenced as Creative Commons Attribution Non-Commercial 2.0 UK and you are asked to provide a credit and link back to Exploring Surrey’s Past website in your application or mashup.   For non-rate limited, or commercial use, of the data please contact Exploring Surrey’s Past for licensing terms.  Our open data feed must not be used for any commercial planning or development use.  You agree to your use of our API being monitored.
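The limit of 900 requests per hour works out to one request every four seconds. A minimal client-side throttle sketch in Python, assuming evenly spaced requests are acceptable; the `Throttle` class is illustrative and not part of the Exploring Surrey's Past API.

```python
import time

MAX_REQUESTS_PER_HOUR = 900  # limit quoted above
MIN_INTERVAL = 3600 / MAX_REQUESTS_PER_HOUR  # 4.0 seconds between requests


class Throttle:
    """Space successive calls at least `interval` seconds apart."""

    def __init__(self, interval=MIN_INTERVAL, clock=time.monotonic, sleep=time.sleep):
        self.interval = interval
        self.clock = clock   # injectable so the class can be tested without waiting
        self.sleep = sleep
        self.last = None     # time of the previous request, if any

    def wait(self):
        """Block until the next request is allowed, then record its time."""
        now = self.clock()
        if self.last is not None:
            remaining = self.interval - (now - self.last)
            if remaining > 0:
                self.sleep(remaining)
                now = self.clock()
        self.last = now
```

Calling `wait()` immediately before each search request keeps a long-running script under the hourly limit.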

The National Library of the Netherlands (Koninklijke Bibliotheek)

The KB (Nationale bibliotheek van Nederland) has several SRU and OAI-PMH based APIs available on their Overview page for their Dataservices and APIs including:


British Library

Free Data Services

British National Bibliography: 'The initial offering includes published books (including monographs published over time) and serial publications, with future releases extending coverage to include multipart works, integrating resources (e.g. loose leaf publications), kits and forthcoming publications' in Linked Open BNB, Basic RDF/XML, and MARC21 via Z39.50.

Old Bailey API


'This facility allows you to work directly with the text of both the individual trials and sessions published as part of the Proceedings' of the Old Bailey (London's Central Criminal Court, 1674-1913).


'You can either use the Old Bailey API Demonstrator to build queries and export texts to Voyant Tools; or else address the underlying text directly through the API.

As a part of the Datamining with Criminal Intent project, a new statistics facility that allows more complex graphing and visualisation of trial data and text has also been created:


See also OBO-APIextraction-and-Analysis, 'Tools for downloading offence and punishment data from the Old Bailey Online web-API, then modelling and analysing this data. These tools were developed as part of a masters project in which the author also learnt Python'.


See also this impressive list of 'Projects Which Have Used Old Bailey Online Data'.


Locating London's Past API

'Locating London's Past (http://www.locatinglondon.org/) is a website allowing analysis of historical data using the Google Maps API. It allows a user to search data relating to early modern and eighteenth-century London, and to map the results on historical or modern maps, with a degree of GIS-style functionality.' JSON, XML and HTML.


Connected Histories API

'Connected Histories (http://www.connectedhistories.org/) brings together a range of digital resources related to early modern and nineteenth century Britain with a single federated search that allows sophisticated searching of names, places and dates, as well as the ability to save, connect and share resources within a personal workspace. The Connected Histories API enables users to connect programmatically to the search engine, using GET parameters, and retrieve search results in an XML format.'



Danish Cultural Agency

(Via Google translate, excuse any mistranslations)

The Danish Heritage Agency has records from four national datasets of cultural heritage in the landscape and the country's museums. The records are updated daily by museums and municipalities. There is public access to search all registers.


http://www.kulturstyrelsen.dk/kulturarv/databaser/hoest-data-via-oai-pmh/ (OAI-PMH)

http://www.kulturstyrelsen.dk/kulturarv/databaser/webservices/ including map services, SOAP and possibly a REST service.


Via http://hack4dk.tumblr.com/datasets 

Danish Historical Atlas API

http://blog.historiskatlas.dk/api and some documentation/examples at http://service.historiskatlas.dk/ 


Via http://hack4dk.tumblr.com/datasets 

COMET (Cambridge Open METadata)


From the Project blog:

data.lib.cam.ac.uk - Our first run at a library-centric open data service. It includes:


Datasets (see also the FAQ)

#3 Cambridge University Library OCLC recordset

Graph: http://data.lib.cam.ac.uk/context/dataset/cambridge3/bib (sample entries)

This dataset contains over 500,000 bibliographic records originating from the OCLC Worldcat database. The data has been enriched by OCLC with links from the FAST subject and VIAF name authority services. ODC-BY licence.


#2 Cambridge University Library major UK vendors recordset


This dataset contains over 1.7 million bibliographic records from the British Library and Research Libraries UK. These records are currently available as a bulk download only. The data has been enriched by OCLC with links to the FAST subject and VIAF name authority services. PDDL licence.


#1 Cambridge University Library Recordset

Graph: http://data.lib.cam.ac.uk/context/dataset/cambridge/bib (sample entries)


This dataset of around 1.3 million records represents work over a 20+ year period which contains a number of changes in practices and cataloguing tools. No attempt has been made to screen for quality of records other than the Voyager export process. Both MARC and RDF versions have been enriched by OCLC with links to the FAST subject and VIAF name authority services. This data also includes the 180,000 'Tower Project' records published under the JISC Open Bibliography Project. PDDL licence.


Cambridge University Fitzwilliam Museum

They provide basic record data (under CC0), some extended records, and some images (under CC-ShareAlike-NonCommercial-NoDerivatives) via OAI-PMH, SPARQL and Application Programming Interface (API) interfaces.

Dutch Open Cultuur Data

Lots of different datasets listed at http://www.opencultuurdata.nl/datasets/ including many with images.

Flanders Open Cultuur Data

Same as above: lots of different datasets listed at http://www.opencultuurdata.be/datasets/ including many with images.

Finnish HelMet network (city libraries of Helsinki, Espoo, Kauniainen, and Vantaa, Finland)


The open data (CC0) includes library catalogues in MARC and MARCXML with examples in JSON and MARCXML at http://data.kirjastot.fi. At the moment you can perform searches by author, title or ISBN. Depending on the format you want the search result to be in, you can change the file extension in the URL to either json or marcxml.
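Switching formats by file extension can be sketched as below. This is a hedged illustration only: the function name and the example URL are hypothetical (the real search paths are not given here), and the only facts taken from the description above are the two extensions, json and marcxml.

```python
import posixpath
from urllib.parse import urlsplit, urlunsplit

ALLOWED_FORMATS = {"json", "marcxml"}  # the two extensions named above


def with_format(url: str, fmt: str) -> str:
    """Return url with the file extension of its path replaced by fmt."""
    if fmt not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    parts = urlsplit(url)
    root, _old_extension = posixpath.splitext(parts.path)
    return urlunsplit(parts._replace(path=f"{root}.{fmt}"))


# Hypothetical search URL, for illustration only.
marcxml_url = with_format(
    "http://data.example.org/search/author.json?query=Waltari", "marcxml"
)
```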


More info about HelMet: http://www.helmet.fi/en-US/Info/What_is_HelMet 

Swiss National Library

Metadata from the Helveticat electronic catalogue has 'been made available under Creative Commons License CC0 1.0 and can be freely used by third parties without restriction and without citing the source'.


The data can be obtained in MARC21 format via the Z39.50 interface and on request via OAI-PMH.


More information: http://www.nb.admin.ch/aktuelles/03147/03950/04156/index.html?lang=en 

The catalogue: https://www.e-helvetica.nb.admin.ch/pages/main.jsf (though it's not clear how the data is downloaded)


What's on the Menu data (New York Public Library)


There's a lot of data behind What's on the Menu?: a mix of simple bibliographic description of the menus (created by The New York Public Library) and the culinary and economic content of the menus themselves (transcribed by you). Now we're opening it up.

All data generated through What's on the Menu? is available in two ways:

Spreadsheet Exports

On the 1st and 16th of every month, we'll post a complete export of all menu and dish data collected so far (menus, dishes, prices, and more).



As the first project of NYPL Labs, we're happy to announce that Menus is also the first NYPL project to have a public API. In fact, we use this exact same API to build many of the features of this site.


Royal Pavilion and Brighton Museums



Like any museum, one of our core functions is to make our collections accessible. This applies to the digital data about our collections as much as it does to the physical objects themselves. ...there are many people who would like to use our digitised collections in other ways, whether through analysis of the data we hold, or by re-using this information for other purposes. In recognition of this, we have just released several datasets relating to our collections, along with accompanying images, which are available to download via the Open Data section of our Image Store.


Although this only represents a fraction of our total records, this data can be considered ‘clean’ — that is, to the best of our knowledge, the data can be considered accurate, can be illustrated, should be relatively easy to understand, and should not infringe the intellectual property rights of any third party.


The datasets are taken from eight of our collections:

  • Archaeology (17 records)
  • Costume (193 records)
  • Craft (33 records)
  • Decorative Art (1914 records)
  • Fine Art (422 records)
  • Local Photographs (1777 records)
  • Natural Science (40 records)
  • Toys (41 records)

Related images are gathered in a zip file. A link to this can be found by each dataset in the Image Store.

K-samsök / SOCH (Swedish Open Cultural Heritage)

From http://www.ksamsok.se/in-english/: "a web service used to search and fetch data from any organization that holds information or pictures related to the Swedish cultural heritage. ... SOCH functions as an exchange/aggregator where data from many local databases are made searchable and visible to the public and to the research community. ... The beta version was released in February 2009 and by then there were already 1.78 million objects available through SOCH. The number is growing as more content providers hook up, and below you can see the actual number of objects available through SOCH today. The set of objects include archaeological, ethnological and religious objects, as well as ancient monuments, historical buildings and places, and natural objects."


API documentation and demos (in Swedish).


SMK (Statens Museum for Kunst), The National Gallery of Denmark (art history, open data)

SMK site page about their Free download of art works

Download all the 159 images here. Please note that the file is 5 GB.


Case study: Highlights from SMK, The National Gallery of Denmark

A pilot with digital images of 159 collection highlights, and 100 educational videos on YouTube under the CC BY license.  The 159 images of highlights from SMK's collections are released for free download in the highest available resolution that the museum currently has at its disposal. The image files range from approximately 10 MB up to 440 MB.


Hat tip: @MSanderhoff.

State Records NSW API (archives, API)

Documentation: http://api.records.nsw.gov.au/usage and Making sense of the catalogue data.


Downloads (from http://data.records.nsw.gov.au/?page_id=32) 

This dataset is available for download as:

  1) a set of XML files:

  2) an SQLite database:

  • ai_sqlite.zip (contains the combined content of the XML files) [18.8Mb]


Licensed under a Creative Commons Attribution-NonCommercial 3.0 Australia License.


See also http://data.nsw.gov.au/ 'the central catalogue of NSW public sector data published online.'


Hat tip: @abigailbelfrage, @CassPF, @richardlehane.

Victorian Government Data Directory


Downloadable datasets including The Victorian Heritage Database, shipping lists (immigration records useful for family history, more info at http://data.vic.gov.au/blog/find-your-family/283).


General site Terms of use

TSO Open Up Labs (UK government)

Apparently TSO is 'the leading provider of information management and publishing solutions to the public sector'

The full list of datasets available in RDF via their SPARQL endpoint.


They provide APIs and some documentation on using them:

Meketre (Egyptology, linked data)

"reliefs and paintings of Middle Kingdom tombs of Ancient Egypt. The project targets two- dimensional art of the Middle Kingdom (11th to 13th Dynasty, ca. 2040 - 1640 B.C.) and one of its main aims is to map and elaborate the development of the scenes and their content in comparison to the Old Kingdom."


http://meketre.org/index.php?page=about: "Multimedia content within the MEKETREpository is organized by means of one main taxonomy (in English), which embraces the various themes depicted by the reliefs and paintings in a hierarchical fashion and multiple additional concept schemes (ontologies / controlled vocabularies) that further describe the content. The concept schemes that describe other aspects of the multimedia content (not the depicted persons and things but for example colours, location, age, etc.) are constantly developed in the course of the project, using collaborative ontology building methods. The applied concept schemes are technically represented using standardized web based knowledge organization systems (KOS), in particular the Simple Knowledge Organization System (SKOS) and the Web Ontology Language (OWL).
The implementation of the MEKETREpository software solution utilizes industry-standard technology to ensure both reliability and maintainability. Since the project is expected to develop beyond the three-year limit, it is important to build on a well-proven foundation of software components and a clean, well documented implementation. Therefore we decided to organize the MEKETREpository in a 3-tier style which is very common for many enterprise software solutions.
As stated above, the main focus lies in providing an easy-to-use interface to the collected data. For the user browsing the MEKETREpository, an up-to-date webbrowser (e.g. Firefox 3.0, Internet Explorer 7.0 or Safari 4.0 and, of course all versions above) is sufficient. If the user needs to edit and update information in the repository, he needs to enable Javascript in the browser."


Hat tip: @bhaslhofer.


Trove and National Library of Australia APIs

Trove offers over 289,384,834 Australian and online resources including books, images, historic newspapers, maps, music, archives and more.



The Trove API provides programmatic access to the metadata and some full text in Trove.

You can use the API to search across Trove’s books, images, maps, music, sound, video, archives, journal articles, newspaper articles and lists created by other users (archived websites, people and organisations may be added at a later date) and retrieve metadata records for the items you find. You can also download the full text for most digitised newspaper articles.


Response encoding: XML, JSON

Base URL: http://api.trove.nla.gov.au


https://wiki.nla.gov.au/display/ARDCPIP/National+Library+of+Australia+APIs has links to:

  • NLA Party Infrastructure (Trove People and Organisations zone) available as OAI-PMH, SRU and OpenSearch
  • Libraries Australia data as z39.50 (subscribers only) or OpenSearch (RSS or ATOM)
  • Picture Australia z39.50, OpenSearch (RSS), OAI (with API key)


Discuss your ideas or progress with others at their new Forum: Reusing Trove resources.


There's also an API console you can use to try queries without an API key: http://desolate-bastion-1864.herokuapp.com/


Yvonne Perkins has written a very user-friendly guide to using the Trove API, which is also a good introduction to APIs: An Introduction to the Trove API.


Cooper Hewitt

'Cooper Hewitt, Smithsonian Design Museum Collections provides a REST-ish style application programming interface (API) for developers to use in their products and services.'




See also blog posts labelled CH 3.0.


Also available on GitHub, with tombstone data released under a CC0 licence.



Amsterdam Museum Metadata API Collection

[Google-translated from http://www.appsforamsterdam.nl/wp-content/uploads/2011/02/AmsterdamMuseum.txt]


images: http://ahm.adlibsoft.com/ahmimages/

Replace ..\..\dat\collectie\images\ in the object records with the above (and replace slashes)  
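The replacement described above can be sketched in Python. The base URL and the local prefix come from the text; the function name and the example file name are hypothetical.

```python
IMAGE_BASE = "http://ahm.adlibsoft.com/ahmimages/"
LOCAL_PREFIX = "..\\..\\dat\\collectie\\images\\"  # as it appears in the object records


def image_url(record_path: str) -> str:
    """Turn a Windows-style image path from an object record into a web URL."""
    if not record_path.startswith(LOCAL_PREFIX):
        raise ValueError(f"unexpected image path: {record_path!r}")
    relative = record_path[len(LOCAL_PREFIX):]
    # Any remaining backslashes become forward slashes.
    return IMAGE_BASE + relative.replace("\\", "/")


# Hypothetical file name, for illustration only.
url = image_url("..\\..\\dat\\collectie\\images\\sub\\example.jpg")
```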


Adlib API description: http://api.adlibsoft.com/site/  


The database of the museum collection is called Amcollect. An example search: 


AMlibrary library database:




Thesaurus AMterms:



The linked open data from the Amsterdam Museum: http://ckan.net/package/amsterdam-museum-as-edm-lod


Rijksmuseum API



The Rijksmuseum Application Programming Interface (API) is a Rijksmuseum service for partners and application developers. Register for an API key for access to structural metadata and images of the Rijksmuseum’s collection, which can then be used to develop applications or simply to enrich your collection.


Once you register for an API key, you will be emailed a code with which you can access data sets. You will be granted access to the Rijksmuseum’s general collection, consisting of over 100,000 objects, The Night Watch included! Digital images are available for all objects. The images are 330 dpi JPEG images (approx 3 to 5 MB, produced in a colour-managed environment).


Our datasets are made available as an XML web service based on the OAI/PMH protocol. Each object is included in the XML as a record. The record field definitions are based on the Dublin Core field definitions: www.dublincore.org. An example of one of our objects: The Nightwatch (SK-C-5).
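Reading Dublin Core fields out of an OAI-PMH response might look like the sketch below. The namespaces are the standard OAI-PMH 2.0 and Dublin Core ones; the sample XML is hand-written in the generic oai_dc shape and is not real Rijksmuseum output, whose exact record layout may differ.

```python
import xml.etree.ElementTree as ET

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hand-written sample in the generic OAI-PMH / oai_dc shape (an assumption).
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>example:object-1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example title</dc:title>
          <dc:creator>Example creator</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def dublin_core_records(xml_text):
    """Yield one dict per record, mapping Dublin Core element names to value lists."""
    root = ET.fromstring(xml_text)
    dc_prefix = "{" + NS["dc"] + "}"
    for record in root.iterfind(".//oai:record", NS):
        fields = {}
        metadata = record.find("oai:metadata", NS)
        if metadata is not None:
            for element in metadata.iter():
                if element.tag.startswith(dc_prefix):
                    name = element.tag[len(dc_prefix):]
                    fields.setdefault(name, []).append((element.text or "").strip())
        yield fields


records = list(dublin_core_records(SAMPLE))
```

Values are kept as lists because Dublin Core elements such as creator or subject may repeat within one record.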


A complete description of our XML records is available in Dutch at: http://www.rijksmuseum.nl/api/uitleg.


For information or comments please contact: collectieinfo@rijksmuseum.nl


Update October 2012: 'The Rijksmuseum in Amsterdam is going to place its entire collection of works of art online on October 30'. 125,000 objects! More at http://www.rnw.nl/english/video/tinkering-rijksmuseum%E2%80%99s-collection 

Government Art Collection (crawled version of website on kasabi)


'This dataset is a re-publication of the metadata contained in the Government Art Collection website. The data was obtained by crawling the website to collect information about all of the works, artists and subjects.

The dataset contains over 10,000 art works from more than 3,000 artists. The art works depict nearly 2,000 places and over 1,000 different people. The majority of the people mentioned in the paintings (over 700) have been linked to their description in Dbpedia. Over time the dataset will be updated to include links to other datasets.

While this dataset and the original website are available for re-use under the Open Government License, the images on the site have their own copyright terms. Refer to the GAC website for guidance on licensing of images. The images themselves are not copied into this dataset, but links have been made to the source images to facilitate legal re-use. Copyright statements have been preserved where available, so these can be queried from the dataset.'


Includes SPARQL endpoint, sample queries, search, lookup, reconciliation, augmentation and attribution links.

National Library of France/Bibliothèque nationale de France


'les informations issues de ses différents catalogues, ainsi que de sa bibliothèque numérique Gallica' or content from various catalogues and Gallica, their digital library.

English Heritage Places


This dataset contains metadata for about 400,000 nationally important places as recorded by English Heritage, the UK Government's statutory adviser on the historic environment.

This dataset covers features located in England and is divided into various types:

Listed Buildings – buildings of special architectural or historic interest
Scheduled Monuments – nationally important sites and monuments from all periods of history
Registered Parks & Gardens – landscapes and naturally occurring features of national importance
Historic Battlefields – sites of important battles in English history
Protected Wreck Sites – sites of shipwrecks of national importance


Includes SPARQL endpoint, sample queries, search, lookup, reconciliation, augmentation and attribution links.

National Maritime Museum (Royal Museums Greenwich)


API structure
The search interface is powered by SOLR (http://lucene.apache.org/solr/), an open source enterprise search platform developed by the Apache foundation.

The system is currently under development and therefore the schema and indexes described in this document are subject to change.

Base URL
The base URL for SOLR searching is http://collections.nmm.ac.uk/solr


Sample queries
Find objects made in 1914 and return the description and collection:
http://collections.nmm.ac.uk/solr/?q=type:object AND dateMade:1914&fl=description,collection

Find all records with ‘Nelson’ in the title, return the title and description and provide a facet count of each record type:
http://collections.nmm.ac.uk/solr/?q=name:Nelson&facet=on&facet.field=type&fl=name, description
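The sample queries contain spaces and colons, so they need URL-encoding before being sent. A minimal sketch that rebuilds both samples in Python: the base URL and the parameters (`q`, `fl`, `facet`, `facet.field`) are taken from the examples above, the helper name is illustrative, and the endpoint may no longer be live.

```python
from urllib.parse import urlencode

SOLR_BASE = "http://collections.nmm.ac.uk/solr/"


def solr_url(query, fields, facet_field=None):
    """Build an encoded Solr search URL returning only the listed fields."""
    params = [("q", query)]
    if facet_field:
        # Ask Solr for a count of results per value of facet_field.
        params += [("facet", "on"), ("facet.field", facet_field)]
    params.append(("fl", ",".join(fields)))
    return SOLR_BASE + "?" + urlencode(params)


# Objects made in 1914, returning description and collection.
objects_1914 = solr_url("type:object AND dateMade:1914", ["description", "collection"])

# Records with 'Nelson' in the name, with a facet count per record type.
nelson = solr_url("name:Nelson", ["name", "description"], facet_field="type")
```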



Extract: "Collection images must always credit ‘National Maritime Museum, Greenwich, London’ and link to the original collection record on the NMM collections website. Some collection images have a more restricted licence than CC BY-NC-SA, which you must comply with. Please see our full API documentation for advice on how to exclude such images"

British Museum Semantic Web Collection Online



"...Linked Data and SPARQL service. It provides access to the same collection data available through the Museum’s web presented Collection Online, but in a computer readable format. The use of the W3C open data standard, RDF, allows the Museum’s collection data to join and relate to a growing body of linked data published by other organisations around the world interested in promoting accessibility and collaboration.


The data has also been organised using the CIDOC-CRM (Conceptual Reference Model) crucial for harmonising with other cultural heritage data. The current version is beta and development work continues to improve the service. We hope that the service will be used by the community to develop friendly web applications that are freely available to the community."


There are links to further resources related to the use of the collection from the British Museum Research Space case study, and British Museum Collections discussion and feedback pages on this wiki


See also https://scraperwiki.com/scrapers/british_museum_object_thesaurus/ - an attempt to automatically map the British Museum Object Thesaurus published on the Collections Link website (http://www.collectionslink.org.uk/assets/thesaurus/) to the relevant URIs from the British Museum Linked Data, and also to terms/URIs on the Portable Antiquities website (http://finds.org.uk/database/terminology/objects)


The National Gallery (UK)


National Gallery API


'A SPARQL end-point to an rdf description of the National Gallery Collection is being developed. The data will be stored in a triple store from Garlik called 4Store and will be published in the form of URIs with data in human readable and machine readable form. ... The data will potentially be mapped to more than one schema/ontology, though we will be specifically developing a mapping to the CIDOC CRM.'


RDF at the National Gallery


'These pages represent RDF descriptions of National Gallery related information, some linked to the CIDOC CRM and other external information sources.'


Raphael API

The Raphael Research Resource began to examine how complex conservation, scientific and art historical research could be combined in a flexible digital form, exploring the presentation of interrelated high resolution images and text, along with how the data could be stored in relation to an event driven ontology in the form of RDF triples. In addition to the main user interface, the data stored within the system is now also accessible in the form of open linkable data combined with a SPARQL end-point.


Web-page: http://rdf.ng-london.org.uk/raphael.

Further detail relating to the worked examples, along with the code used to build them, can be seen at: http://rdf.ng-london.org.uk/workshops/interface2011/


Science Museum Group (2017)

'Explore over 250,000 objects and archives from the Science Museum, Museum of Science and Industry, National Media Museum and National Railway Museum.'

Collections Online API - documentation and examples for accessing collections data via JSONAPI


Science Museum, National Railway Museum, Media Museum (NMSI) object records as CSV (2011)



We’ve published three data sets:

  • 218,822 object records
  • 40,596 media records
  • 173 event records


See also NMSI Collections list and NMSI term lists and the NMSI museum developers blog for updates on things people have made.


Things other people have done:

Ant Beck (@AntArch) at Culture Hack Day North, November 2011: We have cleaned up the National Railway data. Please use! Data: http://dl.dropbox.com/u/393477/chn11KEEP/NMSI_Object_Cleaned_With_Grouping.txt Processing http://dl.dropbox.com/u/393477/chn11KEEP/NMSI_Processing.docx.





Code libraries, snippets: some PHP that handles JSON from the CultureGrid API at https://github.com/mialondon/mmg-import/blob/master/mmg_import_functions.php 


Open API 

Open API is a Ruby-based program that allows users to create a public API for any MySQL database, including digital collections databases. 


Visit our GitHub page (http://github.com/OpenExhibits/OpenAPI) to download the latest build or learn more about Open API on our blog (http://openexhibits.org/blog/api-generator-collection-database.html). 


Powerhouse API - no longer live


Documentation: http://api.powerhousemuseum.com/api/v1/documentation/

Code libraries, snippets: some PHP that handles XML from the Powerhouse Museum API at https://github.com/mialondon/mmg-import/blob/master/mmg_import_functions.php


Open Context API

Open Context has some museum content, as well as data from a variety of field-based (mainly archaeological) projects. RESTful web services for Open Context are described here:




Essentially, query results are available in paged Atom(+GeoRSS) feeds, KML, and JSON formats. Open Context also makes summaries of query results (basically facets) available in these formats.


Query parameters include

  • GeoSpatial (Lat / Lon bounding box)
  • Context (Named locations, archaeological contexts)
  • Date Ranges (Start and end dates)
  • Project / Collection name
  • Type / category of item (site, small finds, sculpture, coins, etc.)
  • Related People
  • Descriptive properties (including numeric range searches)
  • Linked Data URIs: Currently Open Context supports queries over URIs to geographic places defined by Pleiades and to biological taxa defined by the Encyclopedia of Life. A description of how to query over URIs and a few examples can be found at: http://opencontext.org/about/services#query-rel


A demonstration of the use of the Open Context API is at:

http://bade.psr.edu/content/tell-en-nasbeh-database (Bade Museum website)



'A community-built gazetteer and graph of ancient places...

Pleiades is a historical gazetteer and more. It associates names and locations in time and provides structured information about the quality and provenance of these entities. There is also a graph in Pleiades: names and locations are collected within places and these collections are associated with other geographically connected places. Pleiades also serves as a vocabulary for talking about the geography of the ancient world within Linked Data sets and is referenced by research projects such as Google Ancient Places and PELAGIOS.


CSV, KML and RDF datasets can be downloaded from http://pleiades.stoa.org/downloads


Pelagios is a Digital Classics network which aims to interconnect scholarly resources from the domain of Ancient World research through the places they refer to. In principle, any dataset that contains references to places in the ancient world can join the Pelagios data network. Some examples for this are:

  • Text collections that include mentions of ancient places
  • Image collections that depict ancient places in photographs, drawings or maps
  • Database records that describe objects that are located at - or which originate from - an ancient place (e.g. archaeological finds)
  • Any other piece of data that bears a relation to a particular ancient place

To find out more about the Pelagios project, visit our blog at http://pelagios-project.blogspot.com.


The Pelagios HTTP API enables you to search and browse the Pelagios network of place references. It features a basic HTML interface, and provides API responses in JSON or RDF (XML and Turtle). 


Get started with the Pelagios API here: http://pelagios.dme.ait.ac.at/api

Documentation is available here: https://github.com/pelagios/pelagios-cookbook/wiki/Using-the-Pelagios-API

Open Images - Newsreel collection


Open Images (www.openimages.eu) is an open media platform that offers online access to audiovisual archive material from various sources to stimulate -creative- reuse. Footage from audiovisual collections can be downloaded and remixed into new works. Users also have the opportunity to add their own material to the Open Images and thus expand the collection. Open Images provides an API (http://www.openimages.eu/api), making it easy for third parties to develop mashups.


The platform currently (April 2011) offers access to over 1,500 items from various sources, including a considerable number of newsreels from the collection of the Netherlands Institute for Sound and Vision. This number will grow substantially over the coming years as new items are uploaded continuously and as new partners join the initiative.

OPenn - Images and Data from the University of Pennsylvania Special Collections


This website contains complete sets of high-resolution archival images of manuscripts from the collection of the University of Pennsylvania Libraries and other institutions, along with machine-readable TEI P5 descriptions and technical metadata. All materials on this site are in the public domain or released under Creative Commons licenses as Free Cultural Works.


Files are available through a simple HTML interface or, in bulk, through rsync. 


Introduction: http://openn.library.upenn.edu/ReadMe.html

Technical details (including how to bulk download objects and collections): http://openn.library.upenn.edu/TechnicalReadMe.html 

Science Museum 'Cosmic Collections' API

A subset of our data, produced as a beta API for our mashup competition.


Brooklyn Museum API

"A set of services that you can use to display Brooklyn Museum collection images and data in your own applications."



  • collection.search
  • collection.getItem
  • collection.getImages



The methods provided by the API take a range of parameters specific to their individual functions, but all methods rely on these three principal parameters:

method (Required)
Specifies what method to perform (e.g., "collection.search", "collection.getItem")
api_key (Required)
Your personal API Key.
format (Optional)
The response format. Valid formats are xml, json, html. Default xml.
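A minimal Python sketch of how the three principal parameters might be combined into a request URL. The base URL is a placeholder (the real endpoint is not given here), and the extra `keyword` parameter name is only an illustration of a method-specific parameter.

```python
from urllib.parse import urlencode

# Hypothetical base URL: the real endpoint is not listed on this page.
BASE_URL = "https://example.org/brooklyn-api/"

VALID_FORMATS = ("xml", "json", "html")


def build_request_url(method, api_key, fmt=None, **params):
    """Combine method, api_key and optional format with any extra parameters."""
    if not method or not api_key:
        raise ValueError("method and api_key are required")
    query = {"method": method, "api_key": api_key}
    if fmt is not None:
        if fmt not in VALID_FORMATS:
            raise ValueError("format must be one of xml, json, html")
        query["format"] = fmt
    # Method-specific parameters; their names here are assumptions.
    query.update(params)
    return BASE_URL + "?" + urlencode(query)


url = build_request_url("collection.search", "YOUR_KEY", fmt="json", keyword="cat")
print(url)
```

Leaving out `fmt` omits the format parameter entirely, so the service would fall back to its default of xml.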


A write up here: Hack the Brooklyn Museum:


"The API is for non-commercial use with a limit of 3000 API calls a day. Naturally the museum must be respectful of artist copyrights, and requires proper attribution for any display of results. The API also permits only session-based caching with no retention of copies, and the images returned are no greater than 500 pixels in width or height, with many that may be smaller.

The API has a simple REST format with three methods - search, getitem (given the id returned from a search), and getimages for a particular item. The search can be filtered by date, keyword, and whether or not there are images present."


[And I just saw that they'd linked here at the end - cool!]


Museum of London API


A REST interface to the following data sources:

  • publications from the Archaeology Service
  • events at the Museum of London, Museum in Docklands, and London Archaeological Archive and Resource Centre
  • a converter from OSGB grid references to latitude/longitude

Base URL is: http://www.museumoflondon.org.uk/MuseumofLondon/food/rest.aspx?source=events

Output formats: plain XML; RSS2; and RSS2 with xCal, DC and geo extensions as used by Upcoming.

Set the output format by adding mode={rss2 or xCal or xml}; leaving it out returns the plain XML. KML output may also be possible, and the code appears to contain hooks for plain text and JSON renderings, though these may not be implemented.
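A small Python sketch of selecting the output format by adding the mode parameter to the base URL given above. The helper name is made up for illustration, and whether the service still responds at this address is not something the sketch checks.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

BASE_URL = "http://www.museumoflondon.org.uk/MuseumofLondon/food/rest.aspx?source=events"

# Modes listed on this page; leaving mode out returns the plain XML.
MODES = ("rss2", "xCal", "xml")


def with_mode(url, mode=None):
    """Return url with mode set, replacing any mode value already present."""
    if mode is None:
        return url
    if mode not in MODES:
        raise ValueError("mode must be one of rss2, xCal, xml")
    parts = urlsplit(url)
    query = [(k, v) for k, v in parse_qsl(parts.query) if k != "mode"]
    query.append(("mode", mode))
    return urlunsplit(parts._replace(query=urlencode(query)))


rss_url = with_mode(BASE_URL, "rss2")
print(rss_url)
```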



Celtic Coin Index API


The Celtic Coin Index is the online incarnation of the Celtic Coin collection at the Institute of Archaeology at Oxford University. The collection began in 1960 and contains photographs and information about coins held in Britain's museums. Coins are still being found and added to the collection.

The API is RESTful and returns responses in JSON, XML, geoRSS, KML, RSS, and CSV.


Powerhouse Museum Collection Search (RSS/Opensearch available)


Image search option also available.

Good example of RESTful query construction and Opensearch/result URI generation.


DigitalNZ API


DigitalNZ brings together the metadata of nearly 200 organisations from around New Zealand, and international organisations with New Zealand-related content. Content partners come from the cultural and heritage, broadcasting, education, and government sectors; as well as local community sources and individuals. This metadata is made available via the DigitalNZ API.

The current version of the DigitalNZ API is version 3; versions 1 and 2 have been deprecated. A subset of the metadata has been made available for commercial use.


The custom search records method is perhaps a little unusual. DigitalNZ has a visual search builder tool that allows users to construct complex search queries. This method provides access to that refined dataset.



Response formats: xml, json, rss. Interesting selection of metadata returns. Minimum metadata response elements include:

  • dc:title - the title of the item, or a brief description if no title exists.
  • dnz:category - the high-level DigitalNZ category(s) that the item belongs to.
  • dnz:content_partner - the organisation who provided the item
  • dnz:landing_url - the preferred URL for linking to the item.
  • dnz:thumbnail_url - the URL of a thumbnail of the item (required for Images, optional for other categories).


There's a good post on how to get started with the API, including sample PHP code, here: http://digitalnz.org/blog/posts/building-a-website-on-the-digitalnz-api 



Reciprocal Research Network API


The RRN API is currently providing programmatic access to Northwest Coast items from 8 Institutions (Brooklyn Museum, McCord Museum, MOA at UBC, NMNH, Pitt Rivers Museum, Sto:lo Research and Resource Management Centre, The Burke at UW, The MAA at Cambridge). We provide a very simple API where any item or list of items can be represented as XML or JSON by appending '.xml' or '.json' to the end of the URI. We'd be open to expanding the scope of the API should there be any interest.
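The suffix convention described above could be sketched in Python as follows. The item URI is a placeholder, and placing the suffix on the path (ahead of any query string) is an assumption about how such URIs are handled.

```python
from urllib.parse import urlsplit, urlunsplit


def as_format(item_uri, fmt):
    """Append '.xml' or '.json' to the path of an item or item-list URI."""
    if fmt not in ("xml", "json"):
        raise ValueError("fmt must be 'xml' or 'json'")
    parts = urlsplit(item_uri)
    # Assumption: the suffix belongs on the path, before any query string.
    path = parts.path.rstrip("/") + "." + fmt
    return urlunsplit(parts._replace(path=path))


# Hypothetical item URI, used only to show the shape of the result.
json_uri = as_format("https://example.org/items/123", "json")
print(json_uri)
```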


Transport Archive data structures

"The Metadata elements used in the project are divided into three categories: Content, Intellectual Property and Instantiation."  These pages appear to date from about 2003 - no time at all in museum years, a few centuries in internet years.


Content Data Structure

REFNUM None None
TITLE Title Alternative DC.TITLE
THEME Technology, Environment, Community, Economy None
TYPE Detail Qualifier NONE
Cartoon for
Copy after
Copy of
Derived from
Document for
Document of
Facsimile of
Larger context for
Larger entity
Model for
Part of
Plan for
Prototype for
Referenced by
Sketch for
Source for
Study for
Version of
Creation site location
Current repository location
Current site location
Discovery site location
Former repository location
Former site
Period point
CREATOR Attribution
Corporate name
Personal name


Intellectual Property Data Structure

DATE Alteration
FORMAT Dimension D
Dimension W
Dimension H
Image Format


Technical Metadata

CAPTURE_DEVICE Equipment used
CAPTURE_DETAILS Methodology used - i.e. batch scanning
CHANGE_HISTORY Record of changes made to Master and Web Delivery files
COMPRESSION Level or type of compression used, if any
RESOLUTION PPI or number of pixels on each side
COLOUR Colour depth of the image, i.e. 24 bit
COLOUR_MANAGEMENT Details of embedded colour profiles


Museum Victoria History and Technology Collections API

The API root uri is located at http://museumvictoria.com.au/collections/api/v1/


Currently only requests made via REST are supported, and there are two supported response formats: XML and JSON.



Like the V&A, we're using the API for our own purposes, but would love to see what other folks might make with our "stuff".


Api Documentation

Overview | Requests | Responses | Image URLs


Api Methods







Indianapolis Museum of Art OpenSearch

Artworks on http://www.imamuseum.org are first class content, so they come up as search results in the site the same way a page would. We have exposed the entire site, including collections, through OpenSearch using the Drupal module, http://drupal.org/project/opensearch. As of the 6.x-1.3 version, there were a few bugs that had to be fixed by hand. The OpenSearch endpoint is declared in the HTML headers throughout the site. Here is the primary endpoint: http://www.imamuseum.org/opensearch/ima.


Description Schema for IMA OpenSearch



The apachesolr:filters declaration was a custom addition, and relies on the Drupal module's filtering ability.


Example Searches:


American Numismatic Society Collection Database (MANTIS) APIs

The American Numismatic Society's collection of nearly 600,000 objects is accessible through a Solr-based API that allows for querying the collection with the Lucene search syntax and responds in Atom XML.  The Atom feed contains references to the URI for the object as an HTML representation and also alternate links to the XML source of the object as well as RDF, KML, and Atom representations.  The RDF is rather rudimentary and will be improved over time.  A similar Solr-to-KML API exists.


Atom: http://numismatics.org/search/feed/?q=*:*
Object KML: http://numismatics.org/search/query.kml?q=*:*
Mint KML: http://numismatics.org/search/mints.kml?q=*:*


The functionality and Solr fields are discussed further in the Numishare (the open-source framework for managing and publishing coin collections) blog post: http://numishare.blogspot.com/2011/04/world-of-numismatic-data-at-your.html



http://numismatics.org/search/feed/?q=department_facet:"Islamic" AND weight_sint:[4 TO 5] - All Islamic coins that weigh between 4 and 5 grams

http://numismatics.org/search/feed/?q=type_text:temple AND imagesavailable:true AND department_facet:"Greek"
- Greek coins depicting temples that have been photographed
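Because the example queries above contain spaces, quotes and brackets, they need URL-encoding when issued from code. A minimal Python sketch, using the feed URL and field names from the examples above. The helper name is made up for illustration.

```python
from urllib.parse import quote

FEED_URL = "http://numismatics.org/search/feed/"


def feed_url(*clauses):
    """Join Lucene clauses with AND and percent-encode the whole query."""
    query = " AND ".join(clauses)
    return FEED_URL + "?q=" + quote(query, safe="")


# Islamic coins weighing between 4 and 5 grams (first example above).
islamic_url = feed_url('department_facet:"Islamic"', "weight_sint:[4 TO 5]")
print(islamic_url)
```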


Portable Antiquities Scheme

"The Portable Antiquities Scheme is a partnership project which records archaeological objects found by the public in order to advance our understanding of the past."

"PAS is run by the British Museum ... The data gathered by the Scheme is published on an online database (www.finds.org.uk)."


Search results from the PAS database can be accessed as xml, json, kml, atom and rss by the addition of '/format/{format}' to the end of any result set URL 

Records from the PAS database can be accessed as xml, json and csv by the addition of '/format/{format}' to the end of a record URL

Thesaurus terms from the PAS database can be accessed as xml and json by the addition of '/format/{format}' to the end of the URL


In the case of search results and thesaurus terms the results are accessed in pages of 30 results. The sets can be paged through by appending /page/{number} at the end of the URL (before or after the '/format/{format}')



Search results: http://finds.org.uk/database/search/results/objecttype/ADZE/format/xml

Record: http://finds.org.uk/database/artefacts/record/id/464434/format/xml 

Thesaurus term: http://finds.org.uk/database/terminology/objects/format/xml
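The URL conventions above can be sketched in Python as a small helper. The function name and the "kind" labels are made up for illustration, and the format lists are simply those given in the text.

```python
# Formats listed on this page for each kind of PAS URL.
FORMATS = {
    "search": {"xml", "json", "kml", "atom", "rss"},
    "record": {"xml", "json", "csv"},
    "thesaurus": {"xml", "json"},
}

# Only search results and thesaurus terms are paged (30 results per page).
PAGED = {"search", "thesaurus"}


def pas_url(base_url, kind, fmt, page=None):
    """Append /page/{number} (optional) and /format/{format} to a PAS URL."""
    if kind not in FORMATS:
        raise ValueError("kind must be search, record or thesaurus")
    if fmt not in FORMATS[kind]:
        raise ValueError(f"{fmt} is not listed for {kind} URLs")
    if page is not None and kind not in PAGED:
        raise ValueError("record URLs are not paged")
    url = base_url.rstrip("/")
    if page is not None:
        url += f"/page/{page}"
    return f"{url}/format/{fmt}"


adze_page_2 = pas_url(
    "http://finds.org.uk/database/search/results/objecttype/ADZE",
    "search", "xml", page=2,
)
print(adze_page_2)
```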

Black Country History


Black Country History is a searchable website which allows users to find information about documents, maps, photographs, art works, objects and more held by archives and museums services within the Black Country.

The eight partners involved in this website are:

  • Dudley Archives and Local History Service
  • Dudley Museums Service
  • Sandwell Community History and Archives Service
  • Sandwell Museums Service
  • Walsall Local History Centre
  • Walsall Museums Service
  • Wolverhampton Archives and Local Studies
  • Wolverhampton Arts and Museums Service



The API is based upon OpenSearch, and by default takes a Lucene style query syntax.  The API is available via a rate limiting proxy, and we don’t issue usage keys: by using the API you agree to having your use of it monitored.  API search requests are limited to 100 requests per hour.  The data made available through the API is licenced as Creative Commons Attribution Non-Commercial 2.0 UK: England & Wales, and you are asked to provide a credit and link back to Black Country History in your application or mashup.
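Since requests are limited to 100 per hour, a client may want to pace itself rather than rely on the proxy to refuse calls. The following is a minimal Python sketch of a client-side sliding-window limiter. It is an assumption about how a polite client could behave, not something the service provides or requires.

```python
import time
from collections import deque


class HourlyLimiter:
    """Track request times and report how long to wait before the next one."""

    def __init__(self, max_calls=100, period=3600.0, clock=time.monotonic):
        self.max_calls = max_calls
        self.period = period
        self.clock = clock
        self.calls = deque()

    def wait_time(self):
        """Seconds to wait before another request fits inside the window."""
        now = self.clock()
        # Drop requests that have aged out of the window.
        while self.calls and now - self.calls[0] >= self.period:
            self.calls.popleft()
        if len(self.calls) < self.max_calls:
            return 0.0
        return self.period - (now - self.calls[0])

    def record(self):
        """Note that a request has just been sent."""
        self.calls.append(self.clock())


limiter = HourlyLimiter()
delay = limiter.wait_time()  # 0.0 until 100 requests have been recorded
```

A caller would sleep for `wait_time()` seconds when it is non-zero, send the request, then call `record()`.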


Western Australian Museum Sandbox


The API uses REST methods, and can be queried via JSON and XML.


Western Australian Shipwrecks

http://www.museum.wa.gov.au/maritime-archaeology-db/wrecks-xml - this will dynamically generate an XML file for download - this contains only Shipwrecks information (no artefact data) - roughly 1500 wrecks.


Western Australian Maritime Archaeology Artefacts - Complete API

Full API - http://www.museum.wa.gov.au/maritime-archaeology-db/rest/node/

This contains all Maritime Archaeology data, including numismatics, artefacts and wrecks (roughly 47,200 items).


If you'd like to be involved in the beta development process and be one of the first to get your hands on our API and do something interesting with it, we'd love to hear from you.  Send us an email at: onlineservices [at] museum [dot] wa [dot] gov [dot] au

eHive - software as a service collections management system


eHive is a hosted collections management system used around the world, particularly by small museums and communities. As at May 2013 there were 600 museum accounts with 200,000 records. The site includes public access to the collection records. Developer tools to support this were expanded in early 2013.


The eHive Developers website documents the REST/JSON API, Open Archives Initiative Protocol for Metadata Harvesting support, and a suite of WordPress plugins.


The API supports searching, requesting tag clouds, reading object records, adding/removing tags and adding comments. Searching is based on SOLR/Lucene so supports the Lucene query syntax.


The OAI-PMH support provides Dublin Core object record metadata for a given museum account.
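For readers new to OAI-PMH, a harvest of Dublin Core records is normally a ListRecords request with metadataPrefix=oai_dc, followed by further requests carrying only the resumptionToken returned by the previous page. A minimal Python sketch of building those URLs follows. The base URL is a placeholder, and using a set to select a museum account is an assumption rather than documented eHive behaviour.

```python
from urllib.parse import urlencode


def list_records_url(base_url, set_spec=None, resumption_token=None):
    """Build an OAI-PMH ListRecords URL for Dublin Core (oai_dc) records."""
    if resumption_token:
        # Per the OAI-PMH spec, resumptionToken is an exclusive argument.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": "oai_dc"}
        if set_spec:
            # Assumption: a set identifies the museum account to harvest.
            params["set"] = set_spec
    return base_url + "?" + urlencode(params)


# Hypothetical base URL, used only to show the shape of the request.
first_page = list_records_url("https://example.org/oai")
print(first_page)
```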


The WordPress plugins support search, object detail pages, museum profile pages, tag clouds, tagging & commenting widgets and object image grids. Shortcuts are provided to allow features like image grids to be embedded in any WordPress page or post.    


Harn Museum of Art, University of Florida


The Harn Museum of Art at the University of Florida has their digital collections powered by the SobekCM software, and their digital collections are here: http://ufdc.ufl.edu/iharn


The open source SobekCM software is in use by many galleries, libraries, archives, and museums around the world. It supports harvesting by OAI-PMH, searches and browses as XML, and a JSON API; all documentation and all records are freely and openly available:



Archaeology Data Service (ADS), UK


The Archaeology Data Service supports research, learning and teaching with freely available, high quality and dependable digital resources. It does this by preserving digital data in the long term, and by promoting and disseminating a broad range of data in archaeology, using a variety of avenues, including Linked Open Data.


Linked Open Data at the ADS was initially made available through the STELLAR project, a joint project between the University of South Wales, the ADS and Historic England.  The STELLAR project developed an enhanced mapping tool for non-specialist users to map and extract archaeological datasets into RDF/XML, conforming to the CRM-EH ontology (an extension of CIDOC CRM for archaeology). For a full description of the ADS datasets used in this evaluation please see the ADS STELLAR Research Page. The results of the STELLAR project are published from the ADS SPARQL endpoint.


ADS also consumes LOD from other sources (Library of Congress, Ordnance Survey, GeoNames, DBpedia and the vocabularies developed as part of the SENESCHAL project) to populate the metadata held within our Collection Management System with URIs, and then publishes the resource discovery metadata for all our archives via our SPARQL endpoint. Details of this process can be found here and here.


The ADS SPARQL endpoint is available here:






The purpose of the Archives of the Planet is to account for “the aspects and the practices of human activity which will inevitably disappear over time”. This large iconographic collection consists of 72,000 Autochrome plates (color photographs on glass plates) and hundreds of hours worth of black and white films. This is the result of the work of photographers recruited by Albert Kahn and sent all around the world. Between 1909 and 1931, the photographers have been to more than fifty countries to record the everyday lives of the inhabitants of the planet.


The complete archive is available as Open Data with all the OpenDataSoft features: map, image gallery, filters, API.



Collections from the Museum des Augustins - Toulouse, France


The whole Museum des Augustins collection in low resolution in Open Data.






Digitalt Museum (DiMu) API


The Digitalt Museum (DiMu) API provides access to collections from several Norwegian and Swedish museums. The API is documented by Nasjonalgalleriet & Nordiska museet on Github. There is a front-end available at digitaltmuseum.org.



Cleveland Museum of Art Open Access API


The Cleveland Museum of Art provides datasets of information on more than 63,000 artwork records in its Collection for unrestricted commercial and noncommercial use. Additionally, the museum provides image assets for as many as 34,000 works, which are made available under the same terms. Links to the web, print, and full-sized, uncompressed versions of these images are included in the dataset where applicable.


To the extent possible under law, The Cleveland Museum of Art has waived all copyright and related or neighboring rights to this dataset using Creative Commons Zero. This work is published from: The United States Of America. You can also find the text of the CC Zero deed in the file LICENSE in this repository. These select datasets are now available for use in any media without permission or fee; they also include identifying data for artworks under copyright. The datasets support the search, use, and interaction with the Museum’s collection.


For more information about CMA's Open Access initiative, please visit:



To explore CMA's Open Access API:



Comments (15)

Mia said

at 12:00 am on Mar 26, 2009

Tony, you're a star! I've got some others ones at work, I'll add them when I get a moment.

Mia said

at 6:08 pm on May 1, 2009

A quick dump to get it off my PC clipboard - http://id.loc.gov/authorities/ Library of Congress!

Mia said

at 2:55 pm on Feb 19, 2010

Welcome Eric, and thanks for listing your data service.

Eric Kansa said

at 9:02 pm on Feb 22, 2010

Thanks Mia, and thanks for starting this. Any feedback about the Open Context API / web services would be hugely appreciated.

Mia said

at 12:34 am on Mar 10, 2010

Nice example - thanks Eric.

Mia said

at 12:30 pm on Jul 19, 2010

Thanks for sharing the link, Jonny. Have you had many enquiries from developers? And what are the terms of use?

Mia said

at 9:50 pm on Oct 27, 2010

Hi Erin, and thanks for adding info about your app!

Mia said

at 12:15 am on Feb 1, 2011

Thanks Ryan! Some beautiful objects and nice easy access xml.

Mia said

at 1:24 pm on Aug 4, 2011

Some information about a Japanese initiative by Tetsuro Kamura, The Graduate University for Advanced Studies; Hideaki Takeda, Ikki Ohmukai and Fumihiro Kato, The National Institute of Informatics; Toru Takahashi, ATR Media Information Science Laboratories; Hiroshi Ueda, ATR-Promotions.inc, at http://conference.archimuse.com/mw2011/papers/building_linked_data_for_cultural_information_ and http://lod.ac/ (in Japanese) and the British Museum's work at http://www.researchspace.org/home

Trevor Owens said

at 2:03 pm on Aug 4, 2011

Not sure if you want more Library of Congress APIs, but here are a few others that can be added to the doc if they seem appropriate. :)

Prints and Photographs Online Catalog: More than a Million digitized historical prints and photographs http://www.loc.gov/pictures/api
Chronicling America: Historical American Newspapers 1860-1922: http://chroniclingamerica.loc.gov/about/api/
This last one isn't really (to my knowledge) documented yet. But if you append fo=json to a search in LoC's new cross library search you can get some nice looking JSON. For example, this link gives you the JSON for page 2 of results for digitized maps in American Memory. http://www.loc.gov/search/?q=&fa=digitized%3Atrue|original_format%3Acartographic|site%3Aammem&sp=2&st=grid&fo=json

Mia said

at 6:33 pm on Aug 4, 2011

Thanks Trevor - the more the merrier!

Mia said

at 1:46 pm on Mar 13, 2012

Potentially a bunch more sources of data listed at http://wiki.creativecommons.org/GLAM

Monique Szpak said

at 9:40 am on Apr 17, 2012

Hi, I do Drupal (7) and helped to build the new iwm.org.uk website last year.

I have been playing with some of the APIs listed here (great list btw!) and thought I'd share.

For Solr APIs see Search API and Search API Solr modules. We used these to pipe in the IWM collections data. You will need to write some code though, especially to handle the little gotcha's e.g. the NMM Solr interface not using the standard 'select' query as expected by the SolrPHPClient library.

I am about half way through building an Adlib module for Drupal, just a sandbox atm although there is a live demo. http://drupal.org/sandbox/zenlan/1512290 When its a bit more robust I will be looking for feedback, testing and feature requests. (Its quite nice if I say so myself, you type in the url and a bare-bones search system is built for you, after that its a matter of tweaking the views. Still a long way to go though.)

I had a play with the BM's Sparql interface, not being a Sparql expert I had a bit of a learning curve and found that queries timed-out quite a lot. I stopped after a couple of days in case they got annoyed. ;)

I am also helping to test and review a Sphinx module, in case any museums out there use Sphinx (alternative to Solr) for indexing.

Given time and if I win the lottery it would be cool to build modules to talk to all of these APIs. Once an object is piped into a Drupal 'entity class' the possibilities for rendering and usage are vast!

Some time soon I will update my demo site to feature the Amsterdam Museum's Adlib API and the National Maritime Museum's Solr API.

Mia said

at 2:42 am on Apr 22, 2012

Thanks for sharing, Monique, it sounds like you're doing some really useful work!

Mia said

at 4:01 pm on Jun 6, 2012

I think this is a future announcement rather than an already available service: Linked Data Service of the German National Library http://www.dnb.de/EN/Service/DigitaleDienste/LinkedData/linkeddata_node.html
