HCLSIG BioRDF Subgroup/Meetings/2009-06-08 Conference Call
Conference Details
- Date of Call: Monday June 8, 2009
- Time of Call: 11:00 am Eastern Time
- Dial-In #: +1.617.761.6200 (Cambridge, MA)
- Dial-In #: +33.4.89.06.34.99 (Nice, France)
- Dial-In #: +44.117.370.6152 (Bristol, UK)
- Participant Access Code: 4257 ("HCLS")
- IRC Channel: irc.w3.org port 6665 channel #hcls (see W3C IRC page for details, or see Web IRC)
- Duration: ~1 hour
- Frequency: bi-weekly
- Convener: Kei Cheung
- Scribe: Lena Deus and Eric Prud'hommeaux
Attendees
Satya Sahoo, Olivier Bodenreider, Scott Marshall, Lena Deus, Jun Zhao, Kei Cheung, Eric Prud'hommeaux, Rob Frost
Regrets
Matthias Samwald
Agenda
- Introduction and Roll Call (Kei)
- Provenance/workflow presentation (Satya): MS PowerPoint slideshow, PDF version
- Image data (Rob)
- SPARQL control access (Eric, Lena)
- Shared name -- pathway use case (Eric, Scott, Lena)
- AIDA (Scott)
- TCM data (Jun)
- Atags (Matthias)
Minutes
<kei> introduction and going through the agenda
<kei> start with Satya's presentation on provenance
<kei> update on HCLS KB's (Matthias emailed an update to the group)
<kei> Rob will report on looking for image data in KB's
<kei> incorporating control access in sparql
<LenaDeus> Kei: security is a concern on the semantic web
<kei> Jun will give a brief update on tcm data
<kei> Scott will give an update on AIDA
<kei> Matthias emailed the group an update on aTags
<LenaDeus> Satya presents his slides: http://esw.w3.org/topic/HCLSIG_BioRDF_Subgroup/Meetings/2009-06-08_Conference_Call
<LenaDeus> Topic: Provenance research
<LenaDeus> How can provenance be queried efficiently for different applications?
<LenaDeus> Example use case: Try to understand the tasks that mediate a gene and a cloned sample
<LenaDeus> provenance can be used to assess whether data is trustworthy
<LenaDeus> a provenance ontology is used to represent provenance
<LenaDeus> Provenir ontology - the goal was to establish the minimum set of classes to describe provenance but also to enable its extension
<LenaDeus> agent and data are not connected to data directly - they are instead connected through a "process" (see slides)
<LenaDeus> The provenir ontology has 2 main classes: data and parameter
<LenaDeus> There is no differentiation between dataset and provenance information in the data store
<LenaDeus> As such, both provenance and data can be queried using the same mechanism
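The point above, that data and provenance sit in one store and answer to one query mechanism, can be illustrated with a minimal sketch. It is not the Provenir implementation: the `ex:` names, the `match` helper, and the `agents_for` helper are all hypothetical, chosen only to show an agent reaching a data item through a process rather than through a direct link.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]

# One store holds both domain data and provenance.
# All ex: identifiers are hypothetical, not terms from the Provenir ontology.
STORE: List[Triple] = [
    # Domain data
    ("ex:sample42", "rdf:type", "ex:ClonedSample"),
    ("ex:sample42", "ex:derivedFromGene", "ex:geneA"),
    # Provenance: the agent is linked to the data only through a process
    ("ex:cloningRun7", "rdf:type", "ex:Process"),
    ("ex:cloningRun7", "ex:hasInput", "ex:geneA"),
    ("ex:cloningRun7", "ex:hasOutput", "ex:sample42"),
    ("ex:cloningRun7", "ex:performedBy", "ex:technician1"),
]


def match(store: List[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> List[Triple]:
    """Return every triple that fits the pattern; None acts as a wildcard."""
    return [
        t for t in store
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def agents_for(store: List[Triple], data_item: str) -> List[str]:
    """Find agents tied to a data item via the process that produced it."""
    agents: List[str] = []
    for process, _, _ in match(store, p="ex:hasOutput", o=data_item):
        agents.extend(o for _, _, o in match(store, s=process, p="ex:performedBy"))
    return agents
```

Calling `match(STORE, s="ex:sample42")` retrieves domain data and `agents_for(STORE, "ex:sample42")` retrieves provenance, both through the same pattern-matching function over the same list of triples.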
<LenaDeus> Provenance is classified into 3 categories
<LenaDeus> (slide 9)
<LenaDeus> Provenance Metadata; Specific Dataset; Operations in the Provenance Metadata
<LenaDeus> If the data has some set of characteristic attributes, the queries can be oriented by those attributes
<LenaDeus> 4 query operators were defined (see slide 10)
<LenaDeus> A query engine has been implemented on Oracle 10g
<LenaDeus> the query engine was developed as a plugin
<LenaDeus> Query optimization was found necessary - the query was taking 5-6 days to be completed
<LenaDeus> Provenance information is, by definition, historic information - it can therefore be used for optimization of queries
<LenaDeus> Using this model, the query time was reduced to 5-6 seconds
<mscottm> I think that the answer to Kei's question is 5 or 6 days...?
<LenaDeus> Conclusion (slide 15): 1) A common model of provenance that can be re-used within collaborations;
<LenaDeus> 2) Decision making support by use of standard reasoning rules
<LenaDeus> 3) A provenance query engine
<LenaDeus> 4) Verification and validation of data via provenance
<kei> Lena: is provenance info represented in RDF?
<kei> Satya: yes in RDF.
<kei> Kei: Is named graph used?
<Satya> no.
<LenaDeus> Kei: can provenance information be integrated into the query federation scenario?
<@ericP> q+ to talk about proof languages
Zakim sees ericP on the speaker queue
<mscottm> http://twiki.ipaw.info/bin/view/Challenge/ThirdProvenanceChallenge
<LenaDeus> (I have to leave in 5 min: can anyone take over scribing, please :) )
<@ericP> satya: we published a workflow system in IEEE
<@ericP> ... many workflow-based systems miss provenance info
<@ericP> ack me
<Zakim> ericP, you wanted to talk about proof languages
Zakim sees no one on the speaker queue
<ssahoo2> Semantic Provenance workshop at ISWC 2009: http://wiki.knoesis.org/index.php/SWPM-2009
<kei> ericP: a proof chain needs to be shown when federating provenance data
<@ericP> mscottm: for federation, you can use provenance to inform the query choreography
<@ericP> satya: named graph can help me direct my queries
<ssahoo2> I agree, there was a paper in WWW2005 by Jeremy Carroll: http://www4.wiwiss.fu-berlin.de/bizer/pub/Carroll_etall-WWW2005.pdf
<ssahoo2> discussing provenance, named graph and trust
<@ericP> satya: your provenance may be data for me
<@ericP> ... provenance info varies by query and domain requirements
<@ericP> mscottm: example to help evaluate
<@ericP> ... we have a workflow which produces textmined protein interactions
<@ericP> ... at the VoID level, you could say "this has a list of protein pairs"
<@ericP> ... then the provenance info would tell you where that data came from
<@ericP> kei: need to exchange now and the next call
<@ericP> topic: image datasets
<@ericP> rob: was looking at Allen Brain image data
<rfrost> http://www.w3.org/TR/hcls-kb/#aba
<rfrost> http://neurocommons.org/page/Bundles/aba
<@ericP> ... there is a bundle from 2007 incorporated in the neurocommons db
<rfrost> graph <http://sw.neurocommons.org/2007/aba-20070226> { ?aba_gene_record aba:refersToSameGeneAs ?mouse_gene. ?aba_mouse_expression_record aba:measuresGeneIdentifiedWith ?aba_gene_record . ?aba_mouse_expression_record aba:hasSectionSeries ?section_series . ?section_series aba:hasSection ?section. ?section aba:hasImagePyramids
<@ericP> ABA properties:
<@ericP> aba:refersToSameGeneAs
<@ericP> aba:measuresGeneIdentifiedWith
<@ericP> aba:hasSectionSeries
<@ericP> aba:hasSection
<@ericP> aba:hasImagePyramids
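As a rough illustration of how the pasted graph pattern could be turned into a full query, the sketch below assembles a SPARQL string from the four triple patterns that appear complete in the fragment. The final `aba:hasImagePyramids` pattern is truncated in the log, so it is left out. The namespace bound to `aba:` is a placeholder because the minutes do not give it, and `build_query`, the chosen SELECT variables, and the LIMIT are assumptions made for the example.

```python
from typing import List, Tuple

# Placeholder: the real IRI behind the aba: prefix is not given in the minutes.
ABA_PREFIX = "http://example.org/aba#"

# Named graph quoted in the pasted fragment.
GRAPH_IRI = "http://sw.neurocommons.org/2007/aba-20070226"

# The four complete triple patterns from the fragment (the truncated
# aba:hasImagePyramids pattern is deliberately omitted).
PATTERNS: List[Tuple[str, str, str]] = [
    ("?aba_gene_record", "aba:refersToSameGeneAs", "?mouse_gene"),
    ("?aba_mouse_expression_record", "aba:measuresGeneIdentifiedWith", "?aba_gene_record"),
    ("?aba_mouse_expression_record", "aba:hasSectionSeries", "?section_series"),
    ("?section_series", "aba:hasSection", "?section"),
]


def build_query(patterns: List[Tuple[str, str, str]],
                graph_iri: str = GRAPH_IRI,
                prefix_iri: str = ABA_PREFIX,
                limit: int = 10) -> str:
    """Assemble a SELECT query that walks from mouse gene to image section."""
    body = "\n".join(f"    {s} {p} {o} ." for s, p, o in patterns)
    return (
        f"PREFIX aba: <{prefix_iri}>\n"
        "SELECT ?mouse_gene ?section\n"
        "WHERE {\n"
        f"  GRAPH <{graph_iri}> {{\n"
        f"{body}\n"
        "  }\n"
        "}\n"
        f"LIMIT {limit}"
    )


if __name__ == "__main__":
    print(build_query(PATTERNS))
```

The resulting string would still need the correct `aba:` namespace before it could be sent to an endpoint holding this bundle.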
<rfrost> http://neurocommons.org/page/RDF_library/All_relations
<@ericP> this corpus is now old
<rfrost> http://developingmouse.brain-map.org/docs/ReferenceAtlas.pdf
<@ericP> rfrost: with this, we can model development
<@ericP> kei: would image data serve as a good use case for query federation scenarios?
<@ericP> ... do these contain provenance info?
<rfrost> not certain
<@ericP> ... for instance, go from region to sequences and vice versa
<@ericP> rfrost: ABA offers web apis accessing image data by region or by gene
<@ericP> ericP: i think the neurocommons data added some value (image processing) to what's offered in the ABA web api
<@ericP> topic: DILS 09 abstract
<@ericP> junzhao: i've been logging @@1 into our SPARQL endpoint
<@ericP> ... grabbing herbs, clinical trials,
<@ericP> s/paper/poster/
<@ericP> junzhao: need input from matthias -- some "related" data is positive, other negative
<@ericP> ... would like 1 week before 22 july
<@ericP> ... so finish implementation work by end of june and start poster production beginning of july