MW2010 machine-readable data unconference session

Page history last edited by Mia 10 years, 6 months ago

This is a rough report from an unconference session on RDFa, microformats and museum data held during Museums and the Web 2010.


I'm writing it up later than I intended (blame the volcano), so please excuse any mistakes, misattributions, etc. You can sign in to edit them yourself, leave a comment or drop me a line (contact details on the register your interest page).


I'm also writing it up just before I head to the airport, so this first version won't be complete. Do jump in and add your own notes if you were there (or wanted to be).


We started by introducing ourselves and briefly describing our interest in the session.


Those present were: Richard Urban, Nate Solas, Paul Hagon, Peter Goodall, Bart Grob (?), Ilya, Piotr Adamczyk, Richard Morgan, Paul Rowe, Darren Scott, Erich Schroeder, Patrick Schmitz, Gunter Waibel...


Interests included: inference rules based on metadata, embedding metadata in webpages, breaking through the 'analysis paralysis' and choosing a standard to implement (even if it wasn't perfect).


What problems are people having? Picking a standard!


What issues arose during the unconference?

There was an interesting tension between the 'just do it, near enough is good enough' and the 'let's wait until we've got the standard right' impulses - as museum technologists I guess many of us are a mixture of both. But there was also a feeling that we should find a way to move beyond the questions to the point where we start implementing something, with an eye to having a demonstrator project available by this time next year (so April 2011).


We made a useful distinction between a lightweight shared 'standard' that aimed to increase the discoverability of content, and more heavyweight standards that might be used internally or implemented with particular uses in mind. This distinction allows us to keep working through the issues to come up with a suitable (usable, robust, sustainable, implementable, accurate) long-term solution while trying out existing or ad hoc standards in the shorter term.


The voices of reason

One of the reasons I was so happy with this unconference session is that all kinds of people contributed commonsense warnings from their various domains and experiences.  Piotr and Richard said they were still looking for the things that could be done in RDFa that couldn't be done with existing infrastructure.


The use cases

Providing use cases helps everyone understand what we each want to do with the data as well as what we have in our collections.


Peter Goodall wants to make it easy for museums to do mashup collections.


Piotr is still looking for what can be done in RDFa that can't be done with existing infrastructure...


Ilya - neighbourhood project - Open Source Software Foundry - implemented RDFa as a demonstration - FOAF is a format to describe social networks and DOAP is a description of a project. What kind of aggregation service could we endorse to harvest from our collections?


One of mine: Caroline Herschel (1750-1848) was an astronomer, and there's content about her in lots of museums across the world. I've encountered her in Brooklyn Museum, the National Maritime Museum, the National Portrait Gallery... I'd love to link to images and content from all those other museums from our page about her - but how would I find that content, and how could I reliably link to it?
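
A minimal sketch of how this use case might work if each museum published a shared identifier for the person alongside its pages. Everything here is an assumption for illustration: the example.org URLs, the identifier and the record layout are invented, and no such shared authority was agreed in the session.

```python
from collections import defaultdict

# Hypothetical shared identifier for the person (not a real authority URI).
HERSCHEL_ID = "http://example.org/authority/person/caroline-herschel"

# Hypothetical records as an aggregator might harvest them; the page URLs are invented.
records = [
    {"museum": "Brooklyn Museum", "url": "http://example.org/brooklyn/1", "about": HERSCHEL_ID},
    {"museum": "National Maritime Museum", "url": "http://example.org/nmm/2", "about": HERSCHEL_ID},
    {"museum": "National Portrait Gallery", "url": "http://example.org/npg/3", "about": HERSCHEL_ID},
    {"museum": "National Portrait Gallery", "url": "http://example.org/npg/4",
     "about": "http://example.org/authority/person/someone-else"},
]


def links_by_person(records):
    """Group (museum, page URL) pairs by the person identifier each record points at."""
    index = defaultdict(list)
    for record in records:
        index[record["about"]].append((record["museum"], record["url"]))
    return dict(index)


related = links_by_person(records)[HERSCHEL_ID]
for museum, url in related:
    print(f"{museum}: {url}")
```

The point of the sketch is that the linking becomes a simple lookup once pages agree on an identifier; finding or agreeing that identifier is the hard part.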


Erich from Illinois State Museum - was working on an oral history project on agriculture, indexed to a really detailed level - wants to provide users with a proper citation for an interview clip. Found Zotero but only got as far as that.


Gunter: OAI-PMH and CDWA-Lite on last project; writing tips for museums working on stuff like this.


FOAF? Richard, V&A - just done collections online with an API that wasn't really standards-based. Agrees with Piotr - we should just be able to do this stuff with NLP and text mining - also interested in FOAF. FOAF sounds like a winner as we know there are people out there looking for people's names.

Peter Goodall - large db of people to disambiguate names.  Paul - playing with FOAF - someone made a FOAF generator from their API.  Paul Rowe - NZ museums project - looking at terminologies and overlaps.
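
To make the 'FOAF generator from an API' idea concrete, here is a rough sketch of turning one person record into FOAF expressed as RDF/XML. The input dictionary and the example.org URI are hypothetical stand-ins for whatever a collections API returns; only the RDF and FOAF namespace URIs and the foaf:Person and foaf:name terms come from the vocabularies themselves.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
FOAF_NS = "http://xmlns.com/foaf/0.1/"

# Ask ElementTree to serialise these namespaces with their usual prefixes.
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("foaf", FOAF_NS)


def person_to_foaf(person):
    """Serialise a {'uri': ..., 'name': ...} record as a FOAF person in RDF/XML."""
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    node = ET.SubElement(
        root, f"{{{FOAF_NS}}}Person", {f"{{{RDF_NS}}}about": person["uri"]}
    )
    name = ET.SubElement(node, f"{{{FOAF_NS}}}name")
    name.text = person["name"]
    return ET.tostring(root, encoding="unicode")


# Hypothetical API record; the URI is invented for illustration.
record = {
    "uri": "http://example.org/people/caroline-herschel",
    "name": "Caroline Herschel",
}
foaf_xml = person_to_foaf(record)
print(foaf_xml)
```

This only covers a name; as Patrick notes below, roles and relationship types are where FOAF starts to run out.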


Or maybe not FOAF... Patrick from CollectionSpace and UC Berkeley - in a past life has done lots of semantic work but has reservations about RDFa. Worries about vocabs, e.g. Dublin Core, that turn out to be irreconcilable but, once embedded, make it hard to do more serious things. Interested in reasoning and inferencing across collections. Ontologies are a point of view; doesn't believe you can have a universal point of view. Use NLP (natural language processing) to index collections from a given community. Interesting to explore the use cases more specifically, e.g. compelling cases around events. FOAF doesn't let you model the different types of relationships and roles that one person may fulfil - an example of how it's hard to shift a community to something more refined once a model is in place. Potential to generate multiple points of view with different vocabs; use cases will help him understand.


What next? AKA, getting on with it

Testing standards - I'm really up for implementing something on our existing pages - I was thinking that a comparison of two different standards, both marked up as RDFa on existing Science Museum/NMSI web pages (Dublin Core on Ingenious and LIDO on Making the Modern World), would help provide some useful data on the utility of the approach and the beginning of a comparison between standards. I've written about it a bit at http://museum-api.pbworks.com/Science-Museum-linked-data - it's a very unfinished document but if you've got suggestions for making it better I'd love to hear them.


[My notes get sketchy from here on in because I'm returning to them after a few months, and some use cases may have ended up in this section, but that's probably ok]

It was suggested that versioning could be a way of dealing with the fact that we don't have a perfect standard right now - it could allow us to iterate through various prototypes and demonstrators until we get something good, while not breaking projects that are built in the meantime.
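
One way the versioning suggestion might look in practice, as a sketch only: each published record declares which version of the (still evolving) shared format it uses, and a consumer picks a parser accordingly, so projects built against an earlier prototype keep working. The field names and both version layouts are invented for illustration; nothing like this was specified in the session.

```python
def parse_v1(data):
    """Hypothetical version 1 layout: a single 'creator' string."""
    return {"title": data["title"], "maker": data.get("creator")}


def parse_v2(data):
    """Hypothetical version 2 layout: a list of 'creators'; keep the first."""
    makers = data.get("creators", [])
    return {"title": data["title"], "maker": makers[0] if makers else None}


# Older parsers stay registered so existing publishers are not broken.
PARSERS = {"1": parse_v1, "2": parse_v2}


def read_record(record):
    """Dispatch on the declared version; records with no version are treated as v1."""
    version = record.get("version", "1")
    try:
        parser = PARSERS[version]
    except KeyError:
        raise ValueError(f"unsupported record version: {version!r}")
    return parser(record["data"])


old = read_record({"version": "1", "data": {"title": "Telescope", "creator": "Unknown maker"}})
new = read_record({"version": "2", "data": {"title": "Telescope", "creators": ["Unknown maker"]}})
print(old == new)
```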


Microformats - Paul Hagon has used them on events (and other stuff?), Nate pointed out that they're used by Google and Yahoo.


Richard - maybe work on a new 'do one thing' challenge.


Dublin Core is 'messy'.  Patrick: 'is a little better than tagging'.
Peter - interested in using really dumb taxa cos people catalogue inconsistently anyway.
Patrick - taxa even in life sciences don't agree.
Something that's good enough vs something perfect.
Map to shared system with mapping to the authorities used to back things up.
PS: instead of describing a free concept, e.g. a pig, say 'a pig', and when we say pig, we mean it as defined in this name authority.
GW: identifier-based systems.
How much do we aim for perfection?
PS: don't tie yourself to a syntax that doesn't allow for that.
NS: What can we solve today?
PS: don't want to say figure everything out before you start but consider later options.
NS: let's do something lightweight - add RDFa to marked up pages.
Peter G: interested in something really simple... really interesting thing is the objects - being able to refer to the identity of an object from a pictorial representation.
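
As an illustration of the lightweight option in the notes above (add RDFa to existing pages, and say which authority a term comes from), here is a rough sketch that renders one object record as an HTML fragment with Dublin Core terms in RDFa. The record, the example.org URIs and the function name are hypothetical; the sketch assumes the RDFa 1.1 `prefix` attribute, and the Dublin Core namespace is the standard element set one.

```python
from html import escape

DC_NS = "http://purl.org/dc/elements/1.1/"  # Dublin Core element set namespace


def object_to_rdfa(obj):
    """Render one object record as an HTML fragment carrying RDFa attributes."""
    uri = escape(obj["uri"], quote=True)
    title = escape(obj["title"])
    subject_uri = escape(obj["subject_uri"], quote=True)
    subject_label = escape(obj["subject_label"])
    return "\n".join([
        # 'about' names the object itself; 'prefix' declares the dc: shorthand.
        f'<div prefix="dc: {DC_NS}" about="{uri}">',
        f'  <h1 property="dc:title">{title}</h1>',
        # The subject is a link to an authority term, not just the word "pig".
        f'  <a rel="dc:subject" href="{subject_uri}">{subject_label}</a>',
        "</div>",
    ])


# Hypothetical object record and authority term, invented for illustration.
example = {
    "uri": "http://example.org/objects/123",
    "title": "Pig & piglets, ceramic figure",
    "subject_uri": "http://example.org/authority/terms/pig",
    "subject_label": "pig",
}
markup = object_to_rdfa(example)
print(markup)
```

The visible page barely changes; the extra attributes are what a harvester would read.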

LIDO as vocab that works for social history museums and not just art galleries; Dublin Core as quick win.

NS: if we provide enough good enough markup... PA: satisficing approach.
WordNet as term, authority list.
Grappling with issues around how lightweight/heavyweight to go that allows useful exchange of records/assertions.
PS: can I pivot across museums based on some RDFa tags?
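
A rough sketch of what pivoting across museums on RDFa tags could mean for a consumer: harvest pages, pull out the subject links, and index pages by the term they point at. This is a deliberate simplification, not an RDFa processor: it matches the literal attribute value `dc:subject` and does not resolve prefixes. The page URLs and markup are invented for illustration.

```python
from collections import defaultdict
from html.parser import HTMLParser


class SubjectLinkParser(HTMLParser):
    """Collect href values of elements whose rel attribute includes dc:subject."""

    def __init__(self):
        super().__init__()
        self.subjects = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        rels = (attributes.get("rel") or "").split()
        if "dc:subject" in rels and attributes.get("href"):
            self.subjects.append(attributes["href"])


def pivot_by_subject(pages):
    """Map each subject URI to the sorted list of page URLs that point at it."""
    index = defaultdict(set)
    for page_url, html_text in pages.items():
        parser = SubjectLinkParser()
        parser.feed(html_text)
        parser.close()
        for subject in parser.subjects:
            index[subject].add(page_url)
    return {subject: sorted(urls) for subject, urls in index.items()}


# Hypothetical harvested pages from two museums.
pages = {
    "http://example.org/museum-a/objects/1":
        '<div><a rel="dc:subject" href="http://example.org/authority/terms/pig">pig</a></div>',
    "http://example.org/museum-b/objects/9":
        '<p><a rel="dc:subject" href="http://example.org/authority/terms/pig">swine</a></p>',
    "http://example.org/museum-b/objects/10":
        '<p><a href="http://example.org/other">no markup</a></p>',
}
subject_index = pivot_by_subject(pages)
print(subject_index)
```

Note that the two museums use different labels ('pig', 'swine') but still land in the same bucket, because the pivot is on the identifier rather than the text.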


[So as you can see, there were no solid conclusions and we didn't leave with an agreement "let's all try implementing x".  I still like the idea of an MW2010 challenge, ideally something you can participate in as a publisher or consumer of data... Suggestions?]
