HCLSIG/SWANSIOC/Actions/Alignment

From W3C Wiki

Discussion on SWAN / SIOC Integration

  • Paolo Ciccarese - paolo.ciccarese@gmail.com
  • Marco Ocana - marco.ocana@balboasystems.com
  • Matthias Samwald - matthias.samwald@deri.org
  • Alexandre Passant - alexandre.passant@deri.org

Ontologies

SWAN

The SWAN (Semantic Web Applications in Neuromedicine) ontology is an ontology for modeling scientific discourse. It has been developed in the context of building a series of applications for biomedical researchers, as well as through extensive discussions and collaborations with the larger bio-ontologies community. The SWAN project is currently tightly connected with the SCF (Science Collaboration Framework) project, which will use the SWAN ontology for sharing part of its content.

The core of the SWAN ontology is representing the scientific discourse through:

  • Research Statements: hypotheses (statements of a provisional nature proposing one or more ideas that concern the cause or development of, for instance, Alzheimer disease) or claims (research statements that are accepted by the author as reasonable; unlike hypotheses, their author considers them proven rather than provisional)
  • Research Questions: topics under investigation
  • Structured Comments: the structured representation of a comment published in a digital resource

These three kinds of entities are each represented by a number of RDF triples. More precisely, they are collections of RDF graphs.

Another important set of entities in SWAN is the citations, which make it possible to connect pieces of scientific discourse with evidence. The SWAN ontology currently represents records for: journal articles, journal comments, journal news, web articles, web news, web comments and newspaper articles. The records contain the basic information needed to build a citation in the SWAN applications.

The current version of the ontology is version 1.1 ( http://purl.org/swan/1.1/ ) and can be found at http://purl.org/swan/ . Version 1.2 is currently a work in progress and will take into account the results of this integration process.

-- Remark: alex - be careful to keep a standardized prefix for classes / properties so that they will not change with future versions of the ontology, to ensure backwards compatibility of existing tools

SCF is organized more like a classic forum, with scientific articles that people can comment on. It can be considered an online book ( http://www.stembook.org ) or an online journal where the article is the first post and users can comment on it.

SIOC

The core SIOC (Semantically-Interlinked Online Communities) ontology is the foundation for Semantically-Interlinked Online Communities. Developers can use this ontology to express information contained within community sites in a simple and extensible way. Some of the entities in the SIOC core ontology that are relevant for the integration process are:

  • Container: An area in which content Items are contained.
  • Item: A content Item that can be posted to or created within a Container.
  • Post: An article or message that can be posted to a Forum.
  • Thread: A container for a series of threaded discussion Posts or Items.

SIOC modules are used to extend the available terms and to avoid making the SIOC Core Ontology too complex and unreadable. The SIOC module that is relevant for the integration process is:

  • Types Module: contains sub-classes for different types of Forums and Posts.
    1. Container: AddressBook, AnnotationSet, AudioChannel, BookmarkFolder, Briefcase, EventCalendar, ImageGallery, ProjectDirectory, ResumeBank, ReviewArea, SubscriptionList, SurveyCollection, VideoChannel, Wiki.
    2. Item: Poll
    3. Forum: ArgumentativeDiscussion, ChatChannel, MailingList, MessageBoard, Weblog.
    4. Post: BlogPost, BoardPost, Comment, InstantMessage, MailMessage, WikiArticle.

The current version of the core ontology can be found here: http://rdfs.org/sioc/ns# . The Types module is available at http://rdfs.org/sioc/types# .

Alignment

Given the current state of the ontologies, there are two main interaction points for ontology alignment:

  • Containers / Items
  • Discussions / Argumentative model

Questions / issues:

  • Can the 'Tag' part of SWAN be replaced by using the Tag Ontology directly ( http://www.holygoat.co.uk/projects/tags/ )? It would also allow SCOT and MOAT to be used later.
  • Will there be some Hypothesis / Claims subclasses for ResearchStatements (as it seems that those objects have a different 'trust' level)?
    • We are not defining a subclass because the distinction is pretty arbitrary. Thus we consider Hypothesis and Claims to be roles of ResearchStatement.
  • There are 3 citesAsXXX properties; what about a super-property? The cites property has range LSE, while the citesAsXXX properties have range Citation. Is that OK?
    • We can define a super-property cites, with sub-properties that also include citesLses
  • Can the creation / title / description properties be deprecated and replaced by DC elements (as we did in SIOC)? Moreover, if we model a DiscourseElement as a subclass of sioc:Item, it will be automatically inferred from the top classes.
    • As Dublin Core elements are defined as Annotation Properties, I have some problems with OWL DL, as I cannot define domain-ranges and subproperties. But I think we can do this anyway.
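
The super-property suggestion in the list above can be illustrated with a minimal sketch of rdfs:subPropertyOf inference over plain Python tuples. This is not how SWAN defines its properties: "swan:citesAsXXX" is kept as the placeholder used in the discussion, the swan: prefix on property names is an assumption, and the ex: resources are invented for illustration.

```python
SUBPROP = "rdfs:subPropertyOf"

# Property hierarchy suggested in the reply above. "swan:citesAsXXX" is the
# placeholder from the discussion, not a real property name.
schema = {
    ("swan:citesAsXXX", SUBPROP, "swan:cites"),
    ("swan:citesLses", SUBPROP, "swan:cites"),
}

# Hypothetical instance data, invented for illustration only.
data = {
    ("ex:statement1", "swan:citesAsXXX", "ex:citation1"),
    ("ex:statement1", "swan:citesLses", "ex:lse1"),
}


def infer_super_properties(data, schema):
    """Add (s, super, o) for every (s, sub, o) where sub rdfs:subPropertyOf super."""
    parents = {}
    for sub, p, sup in schema:
        if p == SUBPROP:
            parents.setdefault(sub, set()).add(sup)
    inferred = set(data)
    frontier = list(data)
    while frontier:
        s, p, o = frontier.pop()
        for sup in parents.get(p, ()):
            triple = (s, sup, o)
            if triple not in inferred:
                inferred.add(triple)
                frontier.append(triple)
    return inferred


def cited(subject, triples):
    """Return everything the subject is linked to through swan:cites."""
    return {o for s, p, o in triples if s == subject and p == "swan:cites"}


print(cited("ex:statement1", infer_super_properties(data, schema)))
```

Because every use of a sub-property entails the super-property, a single query on cites returns both Citation and LSE targets. This is also why the range question raised above matters: the range of the super-property would have to cover both kinds of object.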

Containers/Items

A straightforward way to integrate the content of the SCF project would be to consider scientific articles as the first posts of discussions in a sioc:Forum. A scientific article can then be defined as a new type of sioc:Post in the Types Module.

  • swan:Citation rdfs:subClassOf sioc:Item sounds good to me. Some subclasses of swan:Citation can also subclass some more specific types (e.g. JournalComment being a subclass of both sioc:Comment and swan:Citation)
  • I'd also consider Discourse Element as a subClassOf sioc:Item (as that's more generic, less 'physical' than a post)
  • Regarding the properties, discourse2discourse or citation2discourse can be subproperties of sioc:related_to
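
The subclass proposals above can be checked with a small sketch that follows rdfs:subClassOf links transitively. It is only an illustration under stated assumptions: the sioct: prefix for the Types module, the swan:DiscourseElement and swan:JournalComment spellings, and the placement of Comment under sioc:Post (taken from the Types module listing) are choices made for this example, not agreed axioms.

```python
SUBCLASS = "rdfs:subClassOf"

# Candidate alignment axioms as (subject, predicate, object) triples.
# Prefixes and class spellings are assumptions made for this sketch.
axioms = {
    ("swan:Citation", SUBCLASS, "sioc:Item"),
    ("swan:DiscourseElement", SUBCLASS, "sioc:Item"),
    ("swan:JournalComment", SUBCLASS, "swan:Citation"),
    ("swan:JournalComment", SUBCLASS, "sioct:Comment"),
    ("sioct:Comment", SUBCLASS, "sioc:Post"),
    ("sioc:Post", SUBCLASS, "sioc:Item"),
}


def superclasses(cls, triples):
    """Return every class reachable from cls via rdfs:subClassOf."""
    found = set()
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for s, p, o in triples:
            if s == current and p == SUBCLASS and o not in found:
                found.add(o)
                frontier.append(o)
    return found


print(sorted(superclasses("swan:JournalComment", axioms)))
```

With these axioms a JournalComment reaches sioc:Item by two routes (through swan:Citation and through the SIOC comment and post classes), so any SIOC tool that handles Items would also see SWAN citations.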

It is also possible to define new kinds of sioc:Container, such as OnlineBook (a website where immutable chapters are published and comments are allowed on them) or OnlineJournal (a website where immutable articles are published and comments are allowed on them). OnlineBook would probably better fit the content of SCF, as one of its implementations ( http://www.stembook.org ) has an ISBN.

  • Good idea
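
As a rough sketch of what the OnlineBook proposal could look like at the instance level, the example below flattens a container, one article and one comment into triples. The ex:OnlineBook and ex:ScientificArticle class names and all ex: URIs are hypothetical, and the sioct: prefix for the Types module is an assumption; sioc:has_container and sioc:has_reply are existing SIOC properties.

```python
from dataclasses import dataclass, field


@dataclass
class Post:
    """Stand-in for a sioc:Post (or subtype) with its replies."""
    uri: str
    rdf_type: str
    replies: list = field(default_factory=list)


@dataclass
class Container:
    """Stand-in for a sioc:Container (or subtype) with its items."""
    uri: str
    rdf_type: str
    items: list = field(default_factory=list)


def to_triples(container):
    """Flatten a container, its items and their replies into triples."""
    triples = [(container.uri, "rdf:type", container.rdf_type)]
    for item in container.items:
        triples.append((item.uri, "rdf:type", item.rdf_type))
        triples.append((item.uri, "sioc:has_container", container.uri))
        for reply in item.replies:
            triples.append((reply.uri, "rdf:type", reply.rdf_type))
            triples.append((item.uri, "sioc:has_reply", reply.uri))
    return triples


# Hypothetical data: one book, one chapter (the "first post"), one comment.
comment = Post("ex:comment1", "sioct:Comment")
chapter = Post("ex:chapter1", "ex:ScientificArticle", replies=[comment])
book = Container("ex:book", "ex:OnlineBook", items=[chapter])

for triple in to_triples(book):
    print(triple)
```

The chapter plays the role of the first post and each user comment hangs off it as a reply, which matches the forum-like reading of SCF described earlier.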

The SWAN scientific discourse elements, as pointed out above, are defined as collections of triples. Thus, it is necessary to define a way of connecting such collections of triples to SIOC.

Discussions / argumentative

Questions: citations ?

As soon as the SWAN 'items' are aligned to SIOC, the current SWAN properties may be aligned with existing SIOC ones. Considering that SWAN:ScientificDiscourse is a subClassOf sioc:Item, the broadest SIOC property to align them with would be sioc:related_to.

Other ontologies

Bibliographic Ontology

The Bibliographic Ontology Specification provides main concepts and properties for describing citations and bibliographic references (i.e. quotes, books, articles, etc.) on the Semantic Web. There are two categories of entities that are of interest in relation to SWAN: the Documents (such as Article, Letter, Book...) and the Collections (Journal, Newspaper...). Concepts that are used by SWAN and seem to be missing are "News" and "Published Comment". Examples of both are published, for instance, at http://www.alzforum.org .

Currently, there is no intention to integrate the Bibliographic Ontology into SWAN ontology version 1.2 (in progress). Such integration will be considered for the following version.

Useful links