WhenBrowsableAndUnambiguousCollide

From W3C Wiki

Renamed and rewritten as an anti-pattern, OverloadedUri.

What follows below is older stuff.


GoodURIs are both browsable and unambiguous. These qualities work well together for URIs naming web pages, but what about other things?

  • Suppose we want to represent a real world object with a URL. We put a
web page at that URL describing the object. Now, when we make statements 
in RDF (or DAML, or OWL) with that URL as the subject or object, there 
is a potential confusion about whether the statement is about the web 
page or about the object that the web page describes. For instance: if a 
web page describes a particular copy of "War and Peace" and we have an 
RDF statement that the author of 
"http://www.mybook.net/WarAndPeace.html" is "Leo Tolstoy", does that 
mean that he wrote the book, or the web page?
  - Steven Gollery (2003-06-02)

The short answer is: don't do that! Don't use one URI for both a real-world object and a web page. If you do, you are introducing a lot of ambiguity, and you no longer have GoodURIs.

But then what URIs can you use for real-world objects? How can you make a URI for, say, a particular chair and have it be browsable without introducing ambiguity between the chair itself and the web page the user sees?

The short answer is that when a user tries to web-browse the chair, you say "No, but here's a page about the chair, instead." This indirection can be done with either HashURIs or SlashRedirection.
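The two kinds of indirection can be sketched roughly as follows. This is only an illustration under stated assumptions: the example.org URIs, the DESCRIBED_BY table and both function names are hypothetical, and the sketch assumes SlashRedirection is done with an HTTP 303 See Other response, which this page does not itself specify.

```python
from urllib.parse import urldefrag

# Hypothetical URIs for a chair; none of these appear in the original page.
CHAIR_HASH_URI = "http://example.org/furniture#chair42"
CHAIR_SLASH_URI = "http://example.org/id/chair42"

# Hypothetical table mapping a "thing" URI to a page about that thing.
DESCRIBED_BY = {
    CHAIR_SLASH_URI: "http://example.org/doc/chair42",
}


def document_for_hash_uri(uri):
    """Return the URI a client actually fetches for a hash URI.

    The fragment is stripped before the request is made, so the
    document and the thing named by the full URI stay distinct.
    """
    document_uri, _fragment = urldefrag(uri)
    return document_uri


def respond_to_slash_uri(uri):
    """Return (status, headers) a server might send for a thing URI.

    Assumption: the "No, but here's a page about it" answer is
    expressed as a 303 redirect to the describing page.
    """
    if uri in DESCRIBED_BY:
        return 303, {"Location": DESCRIBED_BY[uri]}
    return 404, {}


if __name__ == "__main__":
    print(document_for_hash_uri(CHAIR_HASH_URI))
    print(respond_to_slash_uri(CHAIR_SLASH_URI))
```

In both cases the URI of the chair and the URI of the page about the chair are different strings, so an RDF statement can be about one without being about the other.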

One theory about where to draw the line between "web pages" and "other things" ("real-world objects") is that URIs for "other things" can have MultipleCorrectBehaviors. That may not be quite right, but it is important to recognize when indirection is necessary. Perhaps the simplest question is "Are users experiencing the thing itself in their browser (no indirection needed), or are they receiving information about the thing (indirection needed)?"

Older version of this page, with interesting comments

I think the real issue here is that "identifying" is ambiguous and two very different senses of that word are colliding. In one sense, like in programming languages, the identifier can be used as a way of getting to the thing identified, and it is that use which makes it identify what it does. In the other sense it is simply a name, and there is absolutely no assumption that a name can be used to actually get hold of or to access the thing named, in general. Names can be assigned completely arbitrarily, as when a mathematical proof says "call the set S3..." or when parents decide to call their son "Roger". So the kind of relationship between a name and thing named is completely different from that between an identifier and the thing it locates in some framework. The way that RDF uses URIs (well, URIrefs) is like names, but the way that HTML uses them is like addresses. But these need have nothing to do with one another! Some have argued that it makes good sense to say that whenever a URI is meaningful as a URL, that it be understood to name the thing retrieved by HTTP (whatever that is), which would provide a useful link between them; but they shouldn't be simply identified. --PatHayes

That doesn't make any sense to me (DanConnolly). Names take on meaning by use. A parent calling out a child's name to get their attention is just as much accessing the thing denoted by the name as a program expression using an identifier to refer to the variable it refers to. A child's name also takes on meaning by use in conventional settings like birth announcements, birth certificates, and such.

For example, consider as a resource the American Civil War. There are many web pages about it (perhaps 2 million!), each with its own URI. But none of those are URIs for the war itself; they are for web pages about the war. If you tried to use one of those URIs to refer instead to the war, it would be ambiguous: some people would still think it was a URI for a web page.

But "use to refer to" is a little vague. "Treat as denoting" is different from "Use in referring expressions", since the latter allows reference-by-description strategies, where the URI is used within a piece of descriptive RDF, rather than simply as a name. --DanBri

If you had two URIs for the war, you would want to be able to say they (that is: what they refer to) were owl:sameAs each other, without saying that two information sources were also identical.
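A small sketch of that distinction, using plain tuples as RDF triples. The four example.org and example.net URIs and the same_thing helper are hypothetical, and the helper only checks a direct owl:sameAs link; it is not an OWL reasoner.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

# Hypothetical hash URIs naming the war itself.
WAR_A = "http://example.org/history#AmericanCivilWar"
WAR_B = "http://example.net/events#CivilWar"

# The documents those hash URIs live in (the fragment removed).
PAGE_A = "http://example.org/history"
PAGE_B = "http://example.net/events"

# One statement: the two names denote the same war.
triples = {(WAR_A, OWL_SAME_AS, WAR_B)}


def same_thing(a, b, graph):
    """True if a and b are the same URI or directly linked by owl:sameAs."""
    return (
        a == b
        or (a, OWL_SAME_AS, b) in graph
        or (b, OWL_SAME_AS, a) in graph
    )


if __name__ == "__main__":
    print(same_thing(WAR_A, WAR_B, triples))    # the two war URIs
    print(same_thing(PAGE_A, PAGE_B, triples))  # the two pages
```

Because the war and the pages have separate URIs, asserting that the two war URIs are owl:sameAs says nothing about the two information sources being identical.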

This problem occurs when you name something with a browsable URI and the thing itself is not a "web page". Without getting into exactly where the line around "web page" is (see TagIssue:httpRange-14, RDF Core?, ...), it is clear some things are over the line (an actual argument here, rather than assertion, sure would be nice. This text does refute the DansCar position.). People, places, historical events, physical objects, numbers, RDF classes and properties, fictional characters, .... these are all things which are on the other side of the line. Each of these is like the American Civil War: if you give it a browsable URI, people who use that URI to get useful web content may well consider the URI to identify a useful source of web content instead of (or in addition to) the thing itself. For things on this side of the line, browsability and unambiguity are at odds.

Seems to me that this only follows if we confuse naming with being a web address of. Maybe that will be a source of confusion for a while but that is probably more because using URIs as genuine names is itself a new idea. But one can see perfectly sensible conventions emerging, eg if a URI is the name of a thing then what it browses to ought to be a document about that thing. And if Tim is right, then this would be impossible. Either way, there is no ambiguity. --PatHayes

Perhaps "web page" is the wrong term to use here. There are http:-named Web services, and streams that will send you bytes forever. Sometimes "information resource" or "document" seems more appropriate than "page". It has been suggested (by TimBL?) that abstract works (e.g. The Bible) are http:-namable things. The Bible isn't a page, by most stretches.

We ought to talk about HTTP's content negotiation here too (LinkMe/WriteMe)...

There are some ways of addressing this conflict:

  • HashURIs use fragment syntax to separate documents from terms used in the documents
  • rdfs:isDefinedBy links
  • a new URI scheme, tdb: for "thing described by" (LinkMe: Masinter to rdf-interest or uri or tag or some such)