LOD-LAM crowdsourcing session notes


Attendees - Tim Wragge, Asa Tourneau, Mia, William Gunn, Martin Kaltatovic, Kris Carpenter, Marcus G, Jo Pugh, Layna White, Matt Z, Jon Voss, Peter Brinkley.

 

Themes from intros - discovery, sharing, re-use - crowdsourcing as a way to make things happen (and as a form of public access to historical material) and issues around building tools for crowdsourcing.

 

Specially designed crowdsourcing interfaces and projects; emergent behaviours for self-organising groups like machine tags; capturing what people are doing with material in their own research; meta-level of linking - stuff that happens outside your own digital presence.

Not only creating but using structured crowdsourced data like machine tags
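The "machine tags" mentioned above usually refers to the Flickr-style triple-tag convention, where a tag carries structure as namespace:predicate=value. A minimal sketch of extracting that structure (the example tags are hypothetical):

```python
# Sketch: parsing a Flickr-style machine tag ("triple tag") of the
# form namespace:predicate=value. Example tags are illustrative.

def parse_machine_tag(tag):
    """Split a machine tag into (namespace, predicate, value).

    Returns None if the tag doesn't follow the
    namespace:predicate=value convention.
    """
    head, eq, value = tag.partition("=")
    namespace, colon, predicate = head.partition(":")
    if not (eq and colon and namespace and predicate and value):
        return None
    return namespace, predicate, value

print(parse_machine_tag("sighting:species=red_kite"))
print(parse_machine_tag("geo:lat=51.5"))
print(parse_machine_tag("plain tag"))  # not a machine tag -> None
```

Parsing is the easy half; the session point is that tags like these are structured data contributors create as a side effect of organising material, so interfaces can harvest them rather than asking for structure explicitly.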

 

Importance of quality interface in getting quality crowdsourced data

Also creating an ecosystem of interfaces to deal with the problems that crowdsourcing projects throw up - e.g. one interface to correct errors that might have crept in through other interfaces

 

Experience design - small satisfying interactions that add up.

 

The 'crowd' is not monolithic - can be specialists with energy and enthusiasm and specialist skills.

 

Crowdsourcing as way of empowering the specialists who get excited about the availability of resources and want more.

 

Trust and crowdsourcing - could rely on being able to identify specialists and authoritative people - but how do you identify them?

 

Discussion re information provenance and expert, authoritative crowdsourced content vs general public contributions - is institutional confidence in expertise sometimes misplaced, given that e.g. archival records may themselves have been created by volunteers?

 

Also from legacy data in LAMs - big quality and trust issues.

 

So linked open data and crowdsourcing - what are the opportunities?  What's the minimum viable task that creates LOD?

Recording relationships - useful nexus between need and opportunities.

Getting structured data from unstructured data... Open tasks then build on emergent behaviours?
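One way to picture the "minimum viable task" above: a contributor asserts a single relationship between two records, and that assertion serialises directly as one RDF triple. A sketch using N-Triples syntax (all URIs here are hypothetical placeholders, except the Dublin Core predicate):

```python
# Sketch: one crowdsourced relationship = one RDF triple.
# The record and entity URIs are hypothetical; dcterms:subject is
# a real Dublin Core predicate.

def triple_to_ntriples(subject, predicate, obj):
    """Serialise three URIs as one N-Triples statement."""
    return f"<{subject}> <{predicate}> <{obj}> ."

line = triple_to_ntriples(
    "http://example.org/photo/123",           # a crowdsourced record
    "http://purl.org/dc/terms/subject",       # Dublin Core 'subject'
    "http://example.org/person/ada-lovelace", # entity it depicts
)
print(line)
```

Even without a full vocabulary worked out, capturing the raw pairing ("this record relates to this entity") preserves the opportunity; labelling the relationship type can itself be crowdsourced later, as the notes below suggest.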

 

Licensing - more straightforward in purpose-built crowdsourcing interfaces, but harder when contribution is built into other work people are doing, e.g. taking notes or making links within a tool (e.g. Zotero, Mendeley, etc)...

 

Demand for guidance on structured names for people and places.

 

Capturing data when people are using records... even if you're not sure yet how it can be used - but aim for a positive feedback cycle where people can see their actions make a difference, their data creating new possibilities.

 

Work with the things that people are already doing... Also start with existing communities and what they're doing...

 

Context in which someone encounters and creates data about an object or record... how to record it and what can be done with that data?

 

Being smart about avoiding lost opportunities.

 

How much information do you need for re-usable, useful data vs structured data?  Though you can always crowdsource the application of labels to relationships once they're created.

 

Reputation and motivation... democratising access and measuring quality of what's contributed. e.g. if using a game for crowdsourcing, people could try to game it; balance between encouraging participation and avoiding disincentives.

 

Design metrics for user content trustworthiness that match the content - e.g. how long an edit remains un-reverted on Wikipedia. Proxies for accuracy...
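The edit-survival idea above can be sketched as a simple scoring rule: treat an edit that goes un-reverted past some window as implicitly accepted, and score a contributor by the survival rate of their mature edits. The window length and edit records here are illustrative assumptions, not a tested metric:

```python
# Sketch of an accuracy proxy: an edit surviving un-reverted past a
# window counts as implicitly accepted. Threshold is an assumption.

SURVIVAL_WINDOW_HOURS = 72  # assumed: older edits count as "judged"

def trust_score(edits, now_hours):
    """Fraction of a contributor's mature edits not reverted.

    `edits`: list of dicts with 'made_at' (hours) and 'reverted' (bool).
    Edits younger than the window are ignored as not yet judged.
    Returns None when there is no evidence yet.
    """
    mature = [e for e in edits
              if now_hours - e["made_at"] >= SURVIVAL_WINDOW_HOURS]
    if not mature:
        return None
    surviving = sum(1 for e in mature if not e["reverted"])
    return surviving / len(mature)

history = [
    {"made_at": 0,   "reverted": False},
    {"made_at": 10,  "reverted": True},
    {"made_at": 200, "reverted": False},  # too recent to judge at now=250
]
print(trust_score(history, now_hours=250))  # 0.5: one of two mature edits survived
```

As the notes caution, any such proxy can be gamed (e.g. many trivial safe edits), so it suits ranking or triage better than hard accept/reject decisions.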

 

Action point around lost opportunities for getting semantic data...

 

Designing for interoperability, structured or semantic data... What questions should you ask before designing a project - design backends for future uses.

 

Thinking about context of audiences, material and content and mapping audience motivation, opportunity, ability against your material.

 

Designing user experiences that hide any complexity but that elicit contextualised, structured content...

 

Don't conflate highly structured tasks (which can be easy and satisfying) with complex tasks that require judgement (that can feel like work).

 

Meta-level links between collections, across LAM domains...

 

Discoverability of other structured data projects for linking across - Freebase?

 

Potential action items:

recommendations for crowdsourcing projects (backend/front-end)

gotchas

design choices on a spectrum e.g. barriers to entry - low for higher participation vs high for specialist requirements

list of crowdsourced projects

experiences from people using crowdsourced data