HCLSIG BioRDF Subgroup/Meetings/2009-07-06 Conference Call
Conference Details
- Date of Call: Monday July 6, 2009
- Time of Call: 11:00 am Eastern Time
- Dial-In #: +1.617.761.6200 (Cambridge, MA)
- Dial-In #: +33.4.89.06.34.99 (Nice, France)
- Dial-In #: +44.117.370.6152 (Bristol, UK)
- Participant Access Code: 4257 ("HCLS")
- IRC Channel: irc.w3.org port 6665 channel #hcls (see W3C IRC page for details, or see Web IRC)
- Duration: ~1 hour
- Frequency: bi-weekly
- Convener: Kei Cheung
- Scribe: Rob Frost
Attendees
Rob Frost, Kei Cheung, M. Scott Marshall, Lena Deus, Satya Sahoo, Jun Zhao
Regrets
Matthias Samwald
Agenda
- TCM (Traditional Chinese Medicine) update
- Query Federation -- Microarray data and its integration with other data sets
Minutes
<rfrost> First agenda item, TCM update
<rfrost> Jun taking lead on project
<rfrost> collaboration between BioRDF and LODD
<rfrost> Jun: poster is in good shape, decided which portions each person will fill in
<rfrost> Kei: exploring linked data approach and how it can be used to connect Western drug data and Chinese herbal data
<rfrost> Kei: link using string matching on common gene names
<rfrost> Kei: various issues on linking genes by gene names between various datasets
<rfrost> Kei: can also link by drug names
<rfrost> kei: interesting outcome is establishing best practices regarding this type of dataset integration
<rfrost> Kei: integration of herbal medicine is a complementary therapeutic approach to western medicine
<rfrost> Scott: linking by gene name is difficult due to large number of synonyms
<rfrost> mscottm: when you link by drug names do you then retrieve compounds?
<rfrost> kei: yes
<rfrost> mscottm: chemspider, potentially relevant dataset
<rfrost> mscottm: may be useful for recognizing drug compounds referenced in literature
<rfrost> mscottm: drug name->compound and then identify the compound name via chemspider
<mscottm> http://aida.science.uva.nl:9999/search/
<mscottm> http://www.csw.inf.fu-berlin.de:4039/sesame
<mscottm> Here's another: http://ws.adaptivedisclosure.org/search/
<rfrost> mscottm: can use AIDA to connect to Berlin endpoint - can see SenseLab dataset, TCM is not appearing
<rfrost> Jun: will forward SPARQL endpoints and other information to Scott to help further investigate issue
<rfrost> kei: discussed possible future directions for TCM during last call
<rfrost> kei: one option was to contact Huajun about the possibility of integrating TCM datasets that are written mainly in Chinese
<rfrost> kei: did contact him, and he is interested in joining the BioRDF call and giving a presentation
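The gene-name linking discussed under the TCM item (string matching on common gene names, complicated by synonyms) can be sketched roughly as follows. This is a minimal illustration, not the method the group used: the record identifiers (`tcm:1`, `drug:A`, and so on), the normalization rule, and the function names are all assumptions made for the example. It flags a link as ambiguous whenever one record matches more than one record on the other side, which is the situation Scott's synonym concern describes.

```python
from collections import defaultdict


def normalize(name):
    """Lower-case and drop non-alphanumerics so 'IL-6' and 'il6' compare equal."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def build_index(records):
    """Map each normalized name or synonym to the set of record ids that use it."""
    index = defaultdict(set)
    for rec_id, names in records.items():
        for name in names:
            index[normalize(name)].add(rec_id)
    return index


def link(left, right):
    """Return (left_id, right_id, ambiguous) for records sharing a gene name.

    'ambiguous' is True when the left record matched more than one right record.
    """
    right_index = build_index(right)
    links = []
    for left_id, names in left.items():
        matches = set()
        for name in names:
            matches |= right_index.get(normalize(name), set())
        for right_id in sorted(matches):
            links.append((left_id, right_id, len(matches) > 1))
    return links


# Hypothetical records: ids are invented for illustration only.
herb_genes = {"tcm:1": ["IL-6"], "tcm:2": ["p53"]}
drug_genes = {
    "drug:A": ["IL6", "interleukin 6"],
    "drug:B": ["TP53", "p53"],
    "drug:C": ["P-53"],
}

for row in link(herb_genes, drug_genes):
    print(row)
```

Exact matching after normalization only catches spelling variants; a curated synonym list per gene would still be needed for names that differ entirely.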
<rfrost> Query federation topic
<rfrost> kei: is there a publication date for the BMC special issue?
<rfrost> mscottm: does not know if a specific publication date has been set
<rfrost> kei: mentioned possible future directions for query federation in paper
<rfrost> kei: one potential area for investigation is the integration of new data sources
<kei> http://np2.ctrl.ucla.edu/np2/viewProject.do?action=viewProject&projectId=433773
<rfrost> kei: sent out email outlining microarray data
<rfrost> kei: link for microarray experiment "Project steph-affy-human-433773"
<mscottm> got it
<rfrost> kei: one of multiple neuroscience-related microarray experiments; each experiment includes links to associated publications
<rfrost> kei: how to represent this data in RDF/OWL?
<rfrost> kei: which experimental data can be considered metadata/provenance?
<rfrost> mscottm: create RDF triples from HTML markup?
<rfrost> Project data included MAGE-ML
<rfrost> should investigate mapping from MAGE-ML to RDF
<rfrost> will require multiple existing ontologies to model associated data
<rfrost> kei: references to NIF ontologies
<rfrost> Jun: large overlap between MAGE ontology and NCI Thesaurus
<rfrost> kei: task: identify appropriate ontologies for modeling experimental information
<mscottm> http://bioportal.bioontology.org/
<mscottm> http://www.bioontology.org/wiki/index.php/Using_NCBO_Technology_In_Your_Project
<rfrost> rfrost: attempt to use Annotator on experiment HTML/publication/MAGE-ML data
<rfrost> Lena: has experimented with bioportal webservices
<kei> Rob, Satya, and Lena may start exploring the use of ncbo ontology services in the context of the microarray example
<rfrost> lena: RESTful services
<rfrost> kei: raw microarray data may not be appropriate for use on query federation task
<rfrost> kei: attempt to integrate gene expression data in these publications with ABA and GENSAT
<rfrost> lena: mage has concept of levels of data
<kei> differentially expressed genes should be focused on first
<rfrost> mscottm: microarray data not as reliable as other methods
<rfrost> lena: appears that gene list in paper is not complete
<rfrost> (gene list in fig 3 & 4 in paper referenced in Kei's email)
<kei> lena will contact the author for genelists
<rfrost> kei: there are other papers/experiments covering other brain regions/conditions
<kei> possible use of atags?
<rfrost> mscottm: does MAGE-ML contain sufficient data to enable integration with other datasets?
<mscottm> <OntologyEntry category="CellType" value="layer III neurons" description="">
<rfrost> lena: MAGE-ML does not cover level 2 or level 3
<mscottm> <OntologyEntry category="OrganismPartRegion" value="Entorhinal Cortex" description="">
<rfrost> need to further investigate exact structure/content of MAGE-ML and alignment with other datasets/ontologies of interest
<LenaDeus> http://mged.sourceforge.net/ontologies/index.php
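As a starting point for the MAGE-ML to RDF mapping raised above, the `OntologyEntry` elements pasted into the log could be turned into simple triples along these lines. This is only a sketch under stated assumptions: the `BioSample` wrapper is a simplified stand-in for the real MAGE-ML nesting, the `http://example.org/mage/` namespace is hypothetical, and values are emitted as plain literals rather than resolved to ontology terms (which is the alignment work still to be investigated).

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote

# Hypothetical namespace for illustration only; not an official MGED or HCLS URI.
BASE = "http://example.org/mage/"

# Simplified fragment: the two OntologyEntry elements come from the log,
# the enclosing BioSample element and its identifier are assumed.
SAMPLE = """
<BioSample identifier="sample-1">
  <OntologyEntry category="CellType" value="layer III neurons" description=""/>
  <OntologyEntry category="OrganismPartRegion" value="Entorhinal Cortex" description=""/>
</BioSample>
"""


def ontology_entries_to_triples(xml_text, subject_uri):
    """Turn each OntologyEntry into a (subject, predicate, literal) triple."""
    root = ET.fromstring(xml_text)
    triples = []
    for entry in root.iter("OntologyEntry"):
        category = entry.get("category")
        value = entry.get("value")
        if not category or not value:
            continue  # skip entries lacking a usable category/value pair
        triples.append((subject_uri, BASE + quote(category), value))
    return triples


def to_ntriples(triples):
    """Serialize triples with literal objects as N-Triples lines."""
    lines = []
    for s, p, o in triples:
        escaped = o.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'<{s}> <{p}> "{escaped}" .')
    return "\n".join(lines)


triples = ontology_entries_to_triples(SAMPLE, BASE + "sample-1")
print(to_ntriples(triples))
```

Using the `category` attribute as the predicate keeps the mapping mechanical; replacing the literal values with URIs from the MGED Ontology or NCI Thesaurus would be a separate step.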
<rfrost> mscottm: federation scenario: federated query across existing HCLS datasets, SWAN, etc.
<rfrost> mscottm: talk to Paola from scientific discourse group about potentially relevant experiences
<rfrost> kei: could be good opportunity for collaboration with other task forces
<mscottm> http://www.ebi.ac.uk/gxa/
<kei> scott: atlas of gene expression
<rfrost> mscottm: EBI just launched an atlas of gene expression; may be a relevant dataset