URI Fragments - No Fluff Just Stuff

URI Fragments

Posted by: Brian Gilstrap on March 28, 2010

I learned something interesting when reading the Architecture of the World Wide Web, Volume One. It turns out that URI fragments (the part of a URI after a '#' character) are not interpreted as part of a URI:

Note that the HTML implementation in Emma's browser did not need to understand the syntax or semantics of the SVG fragment (nor does the SVG implementation have to understand HTML, WebCGM, RDF ... fragment syntax or semantics; it merely had to recognize the # delimiter from the URI syntax [URI] and remove the fragment when accessing the resource). This orthogonality (ยง5.1) is an important feature of Web architecture; it is what enabled Emma's browser to provide a useful service without requiring an upgrade.

The semantics of a fragment identifier are defined by the set of representations that might result from a retrieval action on the primary resource. The fragment's format and resolution are therefore dependent on the type of a potentially retrieved representation, even though such a retrieval is only performed if the URI is dereferenced. If no such representation exists, then the semantics of the fragment are considered unknown and, effectively, unconstrained. Fragment identifier semantics are orthogonal to URI schemes and thus cannot be redefined by URI scheme specifications.

Interpretation of the fragment identifier is performed solely by the agent that dereferences a URI; the fragment identifier is not passed to other systems during the process of retrieval. This means that some intermediaries in Web architecture (such as proxies) have no interaction with fragment identifiers and that redirection (in HTTP [RFC2616], for example) does not account for fragments.

While I had an intuitive understanding of how browsers work with URI fragments in HTML documents (retrieve the document, find the fragment, display the document starting at the fragment), I hadn't considered the semantic split, nor the implications of URI fragments being applied to other kinds of representations.

I find the handling of fragments interesting in a number of ways. For one thing, it means that as new content types become part of the web, the creators of those content types are free to map URI fragments into that content type. So in HTML the format of URI fragments is generally a textual name that appears in the html. But in a 3D modeling format, it might take the form of [position,orientation,scale] to define a location from which the model is being viewed, the direction the camera is facing, and the scaling factor. That's nice because it allows URI fragments to be structured in a manner most appropriate for the kind of representation being retrieved.

One possibly surprising consequence of this split is that URI fragments are not considered in URI resolution activities such as interacting with proxies or redirection. It also means that you shouldn't try to use URI's with fragments as if they represented actual resources, since the web isn't allowed to cache individual fragments and no semantic interpretation is allowed from the '#' character onward.
Brian Gilstrap

About Brian Gilstrap

Brian Gilstrap is a Principal Software Engineer at Object Computing, Inc. where he has spent the last eleven of his 20+ years in the industry. In those years, he has worked with many languages and many technologies. He writes and blogs frequently, and has been on the steering committee of the St. Louis Java User's Group more than a decade. With OCI he provides consulting to companies in many industries and countries, and develops & delivers training courses for Washington University's Center for Applied Information Technology.

Brian has a passion for building software that is easy to use and robust while still meeting the rapid development requirements in today's industry. He has expertise in distributed systems, object oriented analysis and design, secure computing, and many languages and frameworks.

Why Attend the NFJS Tour?

  • » Cutting-Edge Technologies
  • » Agile Practices
  • » Peer Exchange

Current Topics:

  • Languages on the JVM: Scala, Groovy, Clojure
  • Enterprise Java
  • Core Java, Java 8
  • Agility
  • Testing: Geb, Spock, Easyb
  • REST
  • NoSQL: MongoDB, Cassandra
  • Hadoop
  • Spring 4
  • Cloud
  • Automation Tools: Gradle, Git, Jenkins, Sonar
  • HTML5, CSS3, AngularJS, jQuery, Usability
  • Mobile Apps - iPhone and Android
  • More...
Learn More »