RDFa

Last edited by Nate Solas 11 years, 7 months ago

I'm working on adding some RDFa markup to the object details on www.artsconnected.org, and hope to use this page to document my research and decisions - and get feedback along the way.

 

My primary struggle at the moment is picking a vocabulary. There is some good brainstorming on a work-of-art microformat here, but I'm not sold on any of those at the moment. I think CDWAlite is a good enough standard, but its excessive nesting makes it prohibitive when the appeal of RDFa is its relative simplicity. Richard Urban has a draft of an OWL model of CDWAlite, so if I go this route a lot of the thinking has already been done.

 

Next: why bother with RDFa at all? I'd love to do it because it's "correct", and go full-on with everything we know from CDWAlite, etc., but really I want to do this primarily to help search engines find and understand our collection. And I really doubt search engines have bothered to understand CDWAlite... So this has pushed my decision in the direction of plain old Dublin Core, with maybe some qualifiers, and using a few other well-understood vocabularies for the edges (comments, tags, etc.). ("Well understood" = on this page).

 

Thoughts?

 

 

Comments (12)

Nate Solas said

at 4:58 pm on Feb 13, 2010

It occurs to me that my "helping search engines" approach is missing the bigger picture - properly linked data would be usable by new apps, or browser-based plugins that could recognize when you were looking at a work of art and offer to search other repositories for the same artist, or define styles and movements, etc. It seems the real long-term potential is in stuff like that.

I still like my DC idea, but this makes me think I should "double-tag" most things using the DC attribute but also CDWAlite where appropriate. From what I understand this is still valid RDFa..?
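For what it's worth, the RDFa `property` attribute accepts a whitespace-separated list of terms, so one element can carry a DC property and a second one at the same time. Here is a minimal Python sketch of generating that kind of double-tagged markup. The `rdfa_span` helper, the `cdwa:` prefix and the `cdwa:nameCreator` term are all hypothetical names made up for illustration, not an established RDF vocabulary for CDWAlite; `dc:creator` is a real Dublin Core element.

```python
from html import escape


def rdfa_span(value, properties):
    """Wrap a value in a <span> carrying one or more RDFa properties.

    `properties` is a list of prefixed terms; RDFa lets the property
    attribute hold several of them separated by whitespace.
    """
    prop_attr = " ".join(properties)
    return '<span property="{}">{}</span>'.format(
        escape(prop_attr, quote=True), escape(value)
    )


# "cdwa:nameCreator" is a hypothetical term used only for illustration.
markup = rdfa_span("Example Artist", ["dc:creator", "cdwa:nameCreator"])
print(markup)
```

Both prefixes would still need to be declared on an enclosing element (for example with `xmlns:dc="http://purl.org/dc/elements/1.1/"`) for a parser to resolve them.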

Mia said

at 4:21 pm on Feb 14, 2010

Have you looked at VRA? http://www.vraweb.org/projects/vracore4/

I keep meaning to get in contact with Sebastian Heath about it as he appears to have done some interesting things e.g. http://mediterraneanceramics.blogspot.com/2008/12/rdfa-at-ilion.html and http://mediterraneanceramics.blogspot.com/2010/01/rdfa-patterns-for-ancient-world.html

Other relevant work might be discoverable via http://efoundations.typepad.com/efoundations/2007/10/index.html though I assume those projects have moved on or even completed since then.

re: double-tagging - I've been categorising my collections plans as 'public-facing' and 'machine-facing' but perhaps I need to go further - 'search engine-facing' and 'peer-facing' machine-readable data.

Nate Solas said

at 3:17 pm on Feb 16, 2010

I did poke at VRA and CIDOC a bit, both seem worth spending more time on. This is quickly becoming a bigger bite than I'd planned to chew!

Our data is not (yet) normalized to good vocabularies like AAT & ULAN, etc. We have misspelled artist names in places, and we have things classified as "ceramic" one time and "pottery" or "clay" the next. To get any real benefit from a formal markup like VRA, CDWA, or CIDOC, I think we need to clean our data first.
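As a rough idea of what that cleanup could look like, here is a small Python sketch that folds variant classification terms into one preferred term and flags anything it does not recognise. The `PREFERRED_TERMS` table and the `normalize_classification` function are hypothetical and hand-built for illustration; nothing here is taken from AAT, and whether "clay" really belongs with "ceramics" is a cataloguing decision, not something the code can settle.

```python
# Hypothetical hand-built lookup: variant spelling -> preferred local term.
# A real project would map to identifiers from a vocabulary such as AAT.
PREFERRED_TERMS = {
    "ceramic": "ceramics",
    "ceramics": "ceramics",
    "pottery": "ceramics",
    "clay": "ceramics",
}


def normalize_classification(raw):
    """Return (term, known) for a raw classification string.

    `known` is False when the value is not in the lookup, so unmapped
    values can be collected for manual review instead of silently kept.
    """
    key = raw.strip().lower()
    if key in PREFERRED_TERMS:
        return PREFERRED_TERMS[key], True
    return key, False


for value in ["Ceramic", " pottery ", "clay", "bronze"]:
    print(value, "->", normalize_classification(value))
```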

I had found some of Sebastian Heath's posts but your links led me to some other examples where he is doing what I mentioned: basically marking up the data with every standard that might apply. It's frustrating to think that's the future of machine-readable data, though, where we the developers are doing all the crosswalks upfront..? So, to get good "coverage" in any potential uses for this markup, I'd need to use VRA, CDWAlite, and CIDOC (and DC!) to describe the works. Hmm.

This must be where everyone typically says "I'm not doing this until there's a standard and some clear benefits I can understand!"... :)

I still think I can turn around a minimally marked-up version pretty quickly, and may go ahead and push out our non-standard vocabularies in a more fully marked-up version, but that might take a week or two.

Mia said

at 5:33 pm on Feb 20, 2010

I've just realised that we're in the middle of migrating a site full of data that was mapped to Dublin Core (http://ingenious.org.uk/) - it might not be too late to get simple RDFa into the markup.

Raj said

at 9:14 am on Mar 1, 2010

Nate said, "I still like my DC idea, but this makes me think I should "double-tag" most things using the DC attribute but also CDWAlite where appropriate. From what I understand this is still valid RDFa..?"

I think you're right. You'll have to support multiple formats if you want to please multiple audiences. But don't think about it as duplicating work. Instead, think about it as providing different output formats that are programmatically generated from a single, (internal) representation of your metadata.
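A minimal Python sketch of that idea, under stated assumptions: one internal record, one crosswalk table, and two renderers that produce different outputs from the same values. The record fields, the `DC_CROSSWALK` table, the function names and the `http://example.org/object/1` URI are all hypothetical; the three `dc:` terms are real Dublin Core elements.

```python
import json
from html import escape

# Hypothetical internal record; field names are made up for illustration.
RECORD = {"title": "Untitled vase", "maker": "Example Artist", "made": "1920"}

# Internal field -> Dublin Core element.
DC_CROSSWALK = {"title": "dc:title", "maker": "dc:creator", "made": "dc:date"}


def to_rdfa(record, about):
    """Render the record as an HTML fragment with Dublin Core RDFa."""
    lines = [
        '<div xmlns:dc="http://purl.org/dc/elements/1.1/" about="{}">'.format(
            escape(about, quote=True)
        )
    ]
    for field, prop in DC_CROSSWALK.items():
        if field in record:
            lines.append(
                '  <span property="{}">{}</span>'.format(prop, escape(record[field]))
            )
    lines.append("</div>")
    return "\n".join(lines)


def to_json(record):
    """Render the same values as a JSON object keyed by DC term."""
    mapped = {prop: record[f] for f, prop in DC_CROSSWALK.items() if f in record}
    return json.dumps(mapped, sort_keys=True)


print(to_rdfa(RECORD, "http://example.org/object/1"))
print(to_json(RECORD))
```

Adding another output (CDWAlite, VRA, and so on) would then mean adding another crosswalk table and renderer, not re-entering the data.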

Mia said

at 5:34 pm on Mar 21, 2010

LIDO is the new CIDOC, apparently.

Mia said

at 11:10 am on Mar 22, 2010

I should have known I'd get myself in trouble for being flippant! A bit more accurately - LIDO is being proposed for use by museums online, rather than CIDOC-CRM e.g. http://www.athenaeurope.org/index.php?en/149/athena-deliverables-and-documents

jeremy said

at 12:53 pm on Mar 22, 2010

Still more pedantically, LIDO is a partial implementation of CIDOC-CRM, which is a reference model and not a "format". It also incorporates CDWALite* since it's an attempt to bridge these two world-views (and the Atlantic). Whether that attempt will work out or not remains to be seen! Still, it's not as scary-looking as it might have been. Richard Light says it shouldn't be too hard to express LIDO as RDF(a), though I'll wait until he's done it before I believe him ;-)

*better give an honourable mention to SPECTRUM too

Richard said

at 3:50 pm on Mar 22, 2010

I'd second Raj's comments that this shouldn't be duplicated work (on the values side at least). In "Moving Towards Shareable Metadata" (http://is.gd/aSWfh), Shreeves et al. discuss thinking about "views" of metadata that may be expressed in different formats (or even the same format) for different audiences. I'm not sure that we need to provide our data in every format under the sun; rather, we should make an argument for specific use cases. If those records are consistent and conform to the standard, someone who wants to use them in another context can shoulder that burden. It may be worth a quick exercise to see if your chosen format supports useful migration to other related formats (I don't know, is it easier to go from VRA to CDWA? Or CDWA to VRA? Maybe LIDO will trump them all). Perhaps Dublin Core RDFa would be best to embed in an HTML page, but these simple records could reference metadata in other formats available via another service (such as an API).

The other challenge is not only "crosswalking" properties from one standard to the next, but trying to figure out how to map from one view of the world to another. In my poster I mention a few things that are unresolved in CDWA's model, that are partly addressed by MuseumDAT via CIDOC CRM (and perhaps LIDO, haven't had a chance to sink my teeth into it yet). I can't say that I resolved them either, as they are choices for the community to make together.

At Google they talk about "eating the dog food" - I'd also suggest not only developing your own RDFa and services, but also going out and trying to use other people's LOD. Right now the IMLS DCC project is aggregating more than a million records from 300+ institutions. To build common services across all of this metadata we have to do a lot of work, and it really changes your perspective when you look at metadata at this scale. Going out and doing stuff with other museums' LOD can inform the choices you make about your own.

Nate Solas said

at 1:45 pm on Mar 25, 2010

Brian Kelly blogs a bit about RDFa: http://ukwebfocus.wordpress.com/2010/03/25/microformats-and-rdfa/

Sorry to have neglected this page, things have gotten a bit busy lately. Hope to reconnect shortly.

Mia said

at 1:37 pm on Mar 26, 2010

No worries, I know the feeling! I'm snatching moments here and there but I won't really be free to do much until June. Pesky gallery builds!

Mia said

at 5:49 pm on Apr 25, 2010

I've tweeted this but thought I should add it here too - Google have a really useful tool to test your rich data snippets (microformats, RDFa): http://www.google.com/webmasters/tools/richsnippets (courtesy Daniel Pett)
