Infrastructure-Agnostic Hypertext

Authors
Jakob Voß
  • Verbundzentrale des GBV (VZG)
Published
2019-04-17
Modified
2019-06-29 (version 1.0.0)
Identifier
https://arxiv.org/abs/1907.00259 , https://jakobib.github.io/hypertext2019/
Status
Published rejected manuscript with minor changes at arXiv.org
Repository
https://github.com/jakobib/hypertext2019/
Feedback
Annotate via hypothes.is
Open a GitHub issue
License
CC BY 4.0

Abstract

This paper presents a novel and formal interpretation of the original vision of hypertext: infrastructure-agnostic hypertext is independent of specific standards such as data formats and network protocols. Its model is illustrated with examples and references to existing technologies that allow implementation and integration within current information infrastructures such as the Internet.

Categories and Subject Descriptors

  • CCS Information systems → Hypertext languages
  • CCS Information systems → Document representation
  • CCS Human-centered computing → Hypertext / hypermedia

Keywords

Introduction

The original vision of hypertext as proposed by Ted Nelson [14, 22] still waits to be realized. His influence is visible in the people who, shaped by his works, built the computer world of today, not least the Web [17].1 Nelson’s core idea, a network of visibly connected documents called Xanadu, goes beyond the Web in several respects. In particular it promises non-breaking links, and it uses links to build documents (with versions, quotations, overlay markup…) instead of using documents to build links [16]. The concept of hypertext, or more generally ‘hypermedia’, has also been used in ways that differ from Nelson’s, both in the literary community (which focused on simple links) and in the hypertext research community (which focused on tools) [30].

This paper tries to get back to the original vision of hypertext by specifying a formal model that puts transcludeable documents at its heart. Apart from Nelson’s works [14, 15, 18–20, 22], this paper draws on research on data format analysis [11, 28], content-based identifiers [12, 26], existing transclusion technologies for the Web [1, 2, 6, 25], and hypertext systems beyond link-based models [3].

Limited by the state of document processing tools and by submission guidelines, this paper is not a demo of hypertext.2 On closer look, however, there are traces of transclusion links that were processed into this paper. See the figure for an overview and https://github.com/jakobib/hypertext2019 for details and sources.

Outline

The architecture of an infrastructure-agnostic hypertext system consists of four basic elements and their relations:

  1. documents include all finite, digital objects

  2. document identifiers reference individual documents

  3. content locators reference segments within documents

  4. edit lists combine parts of existing documents into new ones

Instances of each element can further be grouped by data formats. The elements and their relations are each described in the following sections after a formal definition.

Formal model

A hypertext system is a tuple ⟨D, I, C, E, S, R, U, T, A⟩ where:

D
is a set of documents
I
is a set of document identifiers with I ⊂ D
C
is a set of content locators with C ⊂ D
E
is a set of edit lists with E ⊂ D
S
is a set of document segments with S ⊂ C × D

Document sets can each be grouped into (possibly overlapping) data formats. The hypertext system further consists of:

R
is a retrieval function with R: I → D
U
is a segments usage function with U: E → 𝒫(S)
T
is a transclusion function with T: S → D
A
is a hypertext assemble function with A: E → D

A practical hypertext system needs executable implementations of the functions R, U, T, A and a method to tell whether a given combination ⟨c, d⟩ ∈ C × D is part of S, to allow its use with T.
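
As a minimal sketch (not part of the formal definition), the tuple can be expressed in a programming language. The following Python type aliases and class are illustrative assumptions only, with documents and their subsets represented as byte strings.

    # Minimal sketch of the formal model (illustrative names only).
    # Documents, identifiers, content locators, and edit lists are byte strings.
    from typing import Callable, Set, Tuple

    Document = bytes
    Identifier = bytes                   # I ⊂ D
    Locator = bytes                      # C ⊂ D
    EditList = bytes                     # E ⊂ D
    Segment = Tuple[Locator, Document]   # S ⊂ C × D

    class HypertextSystem:
        """Bundles the functions R, U, T, A of the formal model."""
        def __init__(self,
                     retrieve: Callable[[Identifier], Document],        # R: I → D
                     usage: Callable[[EditList], Set[Segment]],         # U: E → 𝒫(S)
                     transclude: Callable[[Segment], Document],         # T: S → D
                     assemble: Callable[[EditList], Document],          # A: E → D
                     is_segment: Callable[[Locator, Document], bool]):  # membership in S
            self.retrieve = retrieve
            self.usage = usage
            self.transclude = transclude
            self.assemble = assemble
            self.is_segment = is_segment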

Documents

A document is a finite sequence of bytes. This definition roughly coincides with the definition of data as documents [8, 29]. The notion of hypertext used in this paper therefore subsumes all kinds of hyperdata (datasets that transclude other datasets). Documents are static in content [24] but they may be processed dynamically. Documents are grouped by non-disjoint data formats such as UTF-8, CSV, SVG, PDF, and many more.

Document identifiers

A document identifier is a relatively short document that refers to another document. Identifiers have properties depending on the particular identifier system they belong to [29]. Identifiers in infrastructure-agnostic hypertext must above all be unambiguous (an identifier must reference only one document), persistent (the reference must not change over time), and actionable (hypertext systems should provide methods to retrieve documents via the retrieval function R). Properties that should be fulfilled at least to some degree include uniqueness (a document should not be referenced by too many identifiers), performance (identifiers should be easy to compute and to validate), and distributedness (identifiers should not require a central institution). The actual choice of an identifier system depends on how these requirements are weighted. A promising choice is the application of content-based identifiers, as proposed at least once in the context of hypertext systems [12].
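
As an illustration of the content-based approach, the following Python sketch derives identifiers from SHA-1 hashes of document bytes; the ‘sha1:’ prefix is an assumption for illustration, not a standardized identifier syntax.

    # Content-based document identifiers (the "sha1:" syntax is an assumption).
    import hashlib

    def identify(document: bytes) -> bytes:
        """Derive an unambiguous identifier from the document content itself."""
        return b"sha1:" + hashlib.sha1(document).hexdigest().encode("ascii")

    def validate(identifier: bytes, document: bytes) -> bool:
        """Anyone can check an identifier against a document, without a registry."""
        return identify(document) == identifier

    doc = b"My name is Alice"
    i = identify(doc)        # unambiguous and distributed by construction
    assert validate(i, doc)  # easy to compute and to validate (performance)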

Content locators

A content locator is a document that can be used to select parts of another document via transclusion. Nelson refers to these locators as “reference pointers” [20], exemplified with spans of bytes or characters in a document. Content locators depend on data formats and document models. For instance, the locator languages XPath, XPointer, and XQuery act on XML documents, which can be serialized in different forms (so it makes no sense to locate parts of an XML document by byte positions). Other locator languages apply to tabular data (SQL, RFC 7111), to graphs (SPARQL, GraphQL), or to two-dimensional images (IIIF), to name a few. Whether and how parts of a document can be selected with a content locator language depends on which data format the document is interpreted in. For instance, an SVG image file can be processed at least as an image, as an XML document, or as a Unicode string, each with its own methods of locating document segments. Content locators can be extended to all executable programs that reproducibly process some documents into other documents. This generalization can be useful to track data processing pipelines as hyperdata, as discussed for executable papers and reproducible research. Restricting content locators to less powerful query languages might make sense from a security point of view.
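
A very small locator language can serve as illustration: the following sketch implements the transclusion function T for character-range locators in the style of RFC 5147 (‘char=start,end’); parsing and error handling are simplified assumptions.

    # Transclusion function T for simple character-range locators in the
    # style of RFC 5147 ("char=start,end"); parsing is deliberately simplified.
    def transclude(locator: str, document: str) -> str:
        if not locator.startswith("char="):
            raise ValueError("unsupported locator format")
        positions = locator[len("char="):].split(",")
        start = int(positions[0])
        end = int(positions[1]) if len(positions) > 1 else start
        return document[start:end]

    print(transclude("char=0,5", "Hello, Alice!"))  # prints "Hello"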

Edit Lists

An edit list is a document that describes how to construct a new document from parts of other documents. Edit lists are known as Edit Decision Lists in Xanadu; the idea was borrowed from film making [18]. Simplified forms of edit lists are implemented in version control systems and in collaborative tools such as wikis and real-time editing. Hypertext edit lists go beyond this one-dimensional case by supporting multiple source documents and more flexible methods of document processing in addition to basic operations such as insert, delete, and replace. The actual processing steps tracked by an edit list depend on the data formats of the transcluded documents. Just like content locators, edit lists could be extended to arbitrary executable programs that implement the hypertext assemble function A for some subset of edit lists. To ensure reproducibility and reliable transclusion,3 these programs must not access unstable external information such as documents that may change [24].
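
A possible, purely illustrative edit list format and assemble function A might look like the following sketch, where an edit list is a sequence of operations that either copy a literal string or transclude a located segment of a retrieved document; the operation names are assumptions.

    # Illustrative edit list format and assemble function A. Each operation
    # either copies a literal string or transcludes a located segment of a
    # document retrieved by its (stable, content-based) identifier.
    from typing import Callable, List, Tuple

    def assemble(edit_list: List[Tuple[str, ...]],
                 retrieve: Callable[[str], str],
                 transclude: Callable[[str, str], str]) -> str:
        parts = []
        for op in edit_list:
            if op[0] == "copy":                # ("copy", literal)
                parts.append(op[1])
            elif op[0] == "transclude":        # ("transclude", identifier, locator)
                document = retrieve(op[1])     # retrieved document must not change
                parts.append(transclude(op[2], document))
            else:
                raise ValueError(f"unknown operation {op[0]!r}")
        return "".join(parts)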

Data formats

An often neglected fundamental property of digital documents is their grounding in data formats. A data format is a set of documents that share a common data model (also known as their document model) and a common serialization (see the figure below). Models define the elements of a document in terms of sets, strings, tuples, graphs, or similar structures. These structures are mathematically rigorous in theory [24] but in practice are based more on descriptive patterns [28]. The meaning of these elements (for instance “words”, “sentences”, and “paragraphs” in a document model) is based on ideas that we at least assume to be consistent among different people.

levels of data modeling

Data modeling, the act of mapping between ideas, models, and formats, is an unsolved problem because ideas can be expressed in many models and models can be interpreted in many ways [11]. Data models can further be expressed in multiple formats, although these formats should be fully convertible between each other, at least in theory.4 Formats can further be serialized in multiple forms, which for their part are based on other data models. For instance the RDF model can be serialized in the RDF/XML format, which is based on the XML model; XML can be serialized in XML syntax, which is based on the Unicode model; and Unicode can be serialized in UTF-8. At the end of these chains of abstraction, all documents can eventually be serialized as sequences of bytes. Serializations are seldom simple mappings, as most serialization formats allow insignificant variances such as additional whitespace. To check whether a data object conforms to a serialization, formats are often described with a formal grammar that can also give insights into the format’s model.5 Infrastructure-agnostic hypertext does not impose any limits on possible data formats and their models.
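
The practical consequence of such insignificant variance can be illustrated with two serializations of the same JSON data: they denote the same model instance but are different documents (different byte sequences and hashes), which is why normalization to canonical forms matters (see the challenges below). The following Python lines are for illustration only.

    # Two serializations of the same JSON data are different documents.
    import hashlib, json

    a = b'{"name": "Alice"}'
    b = b'{ "name" : "Alice" }'  # only insignificant whitespace differs

    assert json.loads(a) == json.loads(b)                              # same model
    assert hashlib.sha1(a).hexdigest() != hashlib.sha1(b).hexdigest()  # different documents

    # A canonical serialization maps both back to a single document:
    canonical = json.dumps(json.loads(a), separators=(",", ":"), sort_keys=True).encode()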

Example

The following example may illustrate the formal model and its elements. Let ⟨D, I, C, E, S, R, U, T, A⟩ be a hypertext system with D the set of printable ASCII character strings and some documents:

d1
= ‘My name is Alice’
d2
= ‘Alice’
c1
= ‘char=11,15’
d3
= ‘Hello, !’
c2
= ‘char=7’
d4
= ‘Hello, Alice!’

If c1 and c2 are read in the content locator syntax defined by RFC 5147, so that T(⟨c1, d1⟩) = d2, then d4 can be constructed by transcluding a document segment of d1 into d3 at position c2. The corresponding edit list e1 ∈ E with A(e1) = d4 could look like this:

 take      995f37f2e066b7d8893873ca4d780da5bf017184
 insert at 48ba94c47b45390b6dd27824cfc0d8468c2cbc71
 from      fcb59267e2e6641140578235c8cb6d38eaf6abc1
 segment   c5b794c7ae5d490f52a414d9d19311b9a19f61b3

The values in e1 are the SHA-1 hashes of d3, c2, d1, and c1, respectively.6 The retrieval function R maps them back to strings. Hyperlinks are given by U(e1) = {⟨c2, d3⟩, ⟨c1, d1⟩}, used for editing d3 to d4 (versioning) and for referencing a segment of d1 in d4 (transclusion).
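
The identifiers in e1 can be recomputed with a few lines of code. The exact values depend on the precise byte serialization that is hashed (character encoding, quoting, trailing newlines), so the following Python sketch only illustrates the principle.

    # Recompute content-based identifiers for the example documents.
    import hashlib

    documents = {
        "d3": "Hello, !",
        "c2": "char=7",
        "d1": "My name is Alice",
        "c1": "char=11,15",
    }
    for name, content in documents.items():
        print(name, hashlib.sha1(content.encode("ascii")).hexdigest())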

Implementations

One of the problems faced by project Xanadu was that it long required new developments such as computer networks, document processing, and graphical user interfaces ahead of their time. Today we can build on many existing technologies:

networks:
storage and communication networks are ubiquitous with several protocols (HTTP, IPFS, BitTorrent…).
identifier systems:
document identifiers should be part of the URI/IRI identifier system. More specific candidates of relevant identifier systems include URLs and content-based identifiers.
formats:
hypertext systems should not be limited to their own document formats (such as the Web’s focus on HTML/DOM) but allow for integration of all kinds of digital objects.
content locators:
as shown above, several content locator and query formats exist, at least for some document models.

Access to documents via a retrieval function R can be implemented with existing network and identifier technologies. Obvious solutions build on top of HTTP and URLs, but these identifiers are far from unambiguous and persistent. Content-based identifiers are guaranteed to always reference the same document, but they require network and storage systems to be actionable.7 The set of supported data formats is only limited by the availability of applications to view and to edit documents. Full integration into a hypertext system, however, requires appropriate content locator formats to select, transclude, and link to segments from these documents. Existing content locator technologies include URI fragment identifiers [25], patch formats (JSON Patch, XML Patch, LD Patch…), and domain-specific query languages, as long as they can guarantee reproducible builds. The IIIF Image API, with its focus on content locators in images [2], and hypothes.is, with its combination of locator methods [6], have popularized at least simple forms of transclusion on the Web.
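
A retrieval function R spanning several identifier systems could be sketched as follows; the dispatch on a hypothetical ‘sha1:’ scheme, the local content-addressed store, and the plain HTTP branch are illustrative assumptions rather than a proposed standard.

    # Retrieval function R dispatching on the identifier system.
    # The "sha1:" scheme and the local store are illustrative assumptions.
    import hashlib
    import urllib.request

    local_store = {}  # maps hex SHA-1 digests to document bytes

    def retrieve(identifier: str) -> bytes:
        if identifier.startswith("sha1:"):
            digest = identifier[len("sha1:"):]
            document = local_store[digest]
            # content-based identifiers can be verified after retrieval:
            assert hashlib.sha1(document).hexdigest() == digest
            return document
        if identifier.startswith(("http://", "https://")):
            # URLs are actionable but neither unambiguous nor persistent
            with urllib.request.urlopen(identifier) as response:
                return response.read()
        raise ValueError("unsupported identifier system")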

Challenges

Despite the availability of technologies to build on, creation of a xanalogical hypertext system is challenging for several reasons. The general problems involved with transclusion have been identified [1]. Other or more specific challenges include (ordered by severity):

storage:
data storage is cheap, but someone has to pay for it.
normalization:
most documents (including identifiers) can be serialized in different forms. To support unique document identifiers, a hypertext system should support normalization of documents to canonical forms.
link services:
databases of links have been proposed as a central part of Open Hypermedia Systems [3] but they are not available for the Web because of commercial interests.8 Links are ideally derived from edit lists with the segments usage function U. Recent developments such as Webmention and OpenCitations may help to improve the collection of links.
visualization and navigation:
this most recognizable element of hypertext has mostly been reduced to simple links, while Nelson’s ideas seem to have been forgotten or ignored [27]. Nevertheless, the creation of tools for visualization and navigation in hypertext structures is less challenging than getting hold of the underlying documents and hyperlinks.
edit list formats:
despite edit lists being the very core of the idea of hypertext [18], they have rarely been implemented in reusable data formats. Proper hypertext implementations therefore require establishing new formats that support the hypertext assemble function A and the segments usage function U.
editing tools:
applications to create and modify digital objects that track changes don’t provide this information in the form of reusable edit lists, if at all. Hypermedia authoring needs to be integrated into existing editing tools to succeed [7].
copyright and control:
who should be allowed to use which documents under which conditions? The answers primarily depend on legal, social, and political requirements.

Differences to Xanadu

Project Xanadu promised a comprehensive hypertext system including elements for content (xanadocs), network (servers), rights (micropayment), and interfaces (viewers) – years before each of these concepts made it into the computing mainstream. Today a xanalogical hypertext system can build far more on existing technologies. The infrastructure-agnostic model of hypertext tries to capture the core parts of the original vision of hypertext by concentrating on its documents and document formats. For this reason some requirements listed by Xanadu Australia [23] or mentioned by Nelson in other publications are not incorporated explicitly:

  a. identified servers as canonical sources of documents
  b. identified users and access control
  c. copyright and royalty system via micropayment
  d. user interfaces to navigate and edit hypertexts

Meeting these requirements in actual implementations is nevertheless possible. Identified servers (a) and users (b) were part of Tumbler identifiers (which combined document identifiers and content locators) [19], but the current OpenXanadu implementation uses plain URLs as part of its Xanadoc edit list format [21]. Canonical sources of documents (a) could also be implemented by blockchains or alternative technologies to prove that a specific document existed on a specific server at a specific time. Such knowledge of a document’s first insertion into the hypertext system would also allow for royalty systems (c).9 Identification of users and access control (b) could also be implemented in several ways, but this feature depends much more on network infrastructures and socio-technical environments, including rules of privacy, intellectual property, and censorship. Last but not least, a hypertext system needs applications to visualize, navigate, and edit hypermedia (d).10 Several user interfaces have been invented in the history of hypertext [13] and there is unlikely to be one final application because user interfaces depend on use cases and file formats.

Differences to other hypertext models

The focus of models from the hypertext research community [3] is more on services and tools than on Nelson’s requirements [30]. This paper looks at the neglected “within-component layer”11 of the Dexter Hypertext Reference Model [10] rather than at issues of storage, presentation, and interaction with a hypertext system. Extension with content locators (“locSpecs” in [9]) could further align Dexter with infrastructure-agnostic hypertext, but existing models rarely put traceable edit lists and transclusion at their core.

Summary and conclusion

This paper presents a novel interpretation of the original vision of hypertext [14, 18]. Its infrastructure-agnostic model neither requires nor excludes specific data formats or network protocols. Abstracting from these ever-changing technologies, the focus is on hypermedia content (documents) and connections (hyperlinks). The core elements of hypertext systems are identified as documents, document identifiers, content locators, and edit lists. A formal model defines their relations based on knowledge of data formats and models. It is shown which technologies can be used to implement such a hypertext system integrated into current information infrastructures (especially the Internet and the Web) and which challenges still exist (in particular support for edit lists in editing tools).

Proto-transclusion links of this paper

References

[1] Akscyn, R. 2015. The Future of Transclusion. Intertwingled. (2015), 113–122. DOI:https://doi.org/10.1007/978-3-319-16925-5_15.

[2] Appleby, M. et al. eds. 2017. IIIF Image API. IIIF Consortium. https://iiif.io/api/image/.

[3] Atzenbeck, C. et al. 2017. Revisiting Hypertext Infrastructure. Proceedings of the 28th ACM Conference on Hypertext and Social Media (2017), 35–44.

[4] Berners-Lee, T. 1989. Information Management: A Proposal. CERN. https://www.w3.org/History/1989/proposal.html.

[5] Capadisli, S. et al. 2015. This “Paper” is a Demo. The Semantic Web: ESWC 2015 Satellite Events. (2015), 26–30. http://csarven.ca/this-paper-is-a-demo. DOI:https://doi.org/10.1007/978-3-319-25639-9_5.

[6] Csillag, K. 2013. Fuzzy Anchoring. Hypothesis. https://web.hypothes.is/blog/fuzzy-anchoring/.

[7] Di Iorio, A. and Vitali, F. 2005. From the writable web to global editability. Proceedings of the sixteenth ACM conference on Hypertext and hypermedia. (2005), 35–45. DOI:https://doi.org/10.1145/1083356.1083365.

[8] Furner, J. 2016. “Data”: The data. Information Cultures in the Digital Age: a Festschrift in honor of Rafael Capurro. (2016), 287–306. http://www.jonathanfurner.info/wp-content/uploads/2016/12/Furner-Final-Proof-18.4.16.pdf.

[9] Grønbæk, K. and Trigg, R.H. 1996. Toward a Dexter-based model for open hypermedia: Unifying embedded references and link objects. Proceedings of the the seventh ACM conference on Hypertext (1996), 149–160.

[10] Halasz, F.G. and Schwartz, M.D. 1990. The Dexter Hypertext Reference Model.

[11] Kent, W. 1988. The Many Forms of a Single Fact. Proceedings of IEEE COMPCON 89 (1988), 438–443.

[12] Lukka, T. and Fallenstein, B. 2002. Freenet-like GUIDs for implementing xanalogical hypertext. Proceedings of the thirteenth ACM conference on Hypertext and hypermedia. (2002), 194–195. DOI:https://doi.org/10.1145/513338.513386.

[13] Müller-Prove, M. 2002. Vision and Reality of Hypertext and Graphical User Interfaces. https://mprove.de/visionreality/.

[14] Nelson, T. 1965. Complex information processing: a file structure for the complex, the changing and the indeterminate. Proceedings of the 1965 20th national conference (1965), 84–100.

[15] Nelson, T. 1974. Computer Lib / Dream Machines.

[16] Nelson, T. 1997. Embedded Markup Considered Harmful. World Wide Web Journal. 2, 4 (1997), 129–134.

[17] Nelson, T. 2008. Geeks Bearing Gifts. Mindful Press.

[18] Nelson, T. 1967. Getting It Out of Our System. Information Retrieval: A Critical Review (1967), 191–210.

[19] Nelson, T. 1980. Literary Machines. Mindful Press.

[20] Nelson, T. 1999. Xanalogical structure, needed now more than ever: parallel documents, deep links to content, deep versioning, and deep re-use. ACM Computing Surveys. 31, 4es (Dec. 1999). DOI:https://doi.org/10.1145/345966.346033.

[21] Nelson, T. and Levin, N. 2014. OpenXanadu. http://xanadu.com/xanademos/MoeJusteOrigins.html.

[22] Nelson, T. et al. 2007. Back to the future: hypertext the way it used to be. Proceedings of the eighteenth conference on Hypertext and hypermedia (2007), 227–228.

[23] Pam, A.D. 2002. Xanadu FAQ. http://www.aus.xanadu.com/xanadu/faq.html.

[24] Renear, A.H. et al. 2009. When digital objects change — exactly what changes? Proceedings of the American Society for Information Science and Technology. 45, 1 (Jun. 2009). DOI:https://doi.org/10.1002/meet.2008.14504503143.

[25] Tennison, J. 2012. Best Practices for Fragment Identifiers and Media Type Definitions. World Wide Web Consortium. http://www.w3.org/TR/fragid-best-practices/.

[26] Trask, B. 2016. Hash URI Specification (Initial Draft). https://github.com/hash-uri/hash-uri.

[27] Viégas, F. et al. 2004. Studying Cooperation and Conflict between Authors with history flow Visualizations. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI ’04) (2004), 575–582.

[28] Voss, J. 2013. Describing Data Patterns. http://aboutdata.org/.

[29] Voss, J. 2013. Was sind eigentlich Daten? LIBREAS. 23 (2013). https://libreas.eu/ausgabe23/02voss/. DOI:https://doi.org/10.18452/9038.

[30] Wardrip-Fruin, N. 2004. What hypertext is. Proceedings of the fifteenth ACM conference on Hypertext and hypermedia (2004), 126–127.


  1. Tim Berners-Lee references Nelson both in the proposal that led to the Web [4, 18] and on the early W3C homepage http://www.w3.org/Xanadu.html (1992/93) [19].

  2. See [5] for a good example of what an actual demo paper might look like.

  3. If document models/segments change, locators may not be applicable anymore [6].

  4. In practice it’s often unknown whether two data formats actually share the same model, especially if models are only given implicitly by definition of their formats.

  5. Formats may also exist purely implicitly, in the form of sample instances from which grammar, model, and ideas must be guessed by reverse engineering [28].

  6. A more practical edit list syntax E could also allow the direct embedding of small document instances from which SHA-1 hashes can be computed. If implemented carefully, this could also reconcile transclusion with copy-and-paste.

  7. New standards such as IPFS Multihashes and BitTorrent Merkle hashes look promising but these types of identifiers are not specified as part of the URI system (yet) [26].

  8. In particular by search engines and by spammers. A criterion to judge the success of a hypertext system is whether it is popular enough to attract link spam.

  9. Copyright detection was easy to implement with mandatory registration such as partly required in the United States until 1976. Authors might also register documents with cryptographic hashes without making them public in the first place.

  10. Tim Berners-Lee’s first Web browser originally supported editing.

  11. “It would be folly to attempt a generic model covering all of these data types.” [10]

Changelog

See source code repository for detailed changes.

1.0.0 (2019-06-29)
Published rejected manuscript with minor changes at arXiv.org
0.1.0 (2019-05-02)
Submitted to HT’2019: ACM Conference on Hypertext and Social Media

Interactions

anonymous reviewer 1 replied on

The author proposes and briefly describes a post-Xanadu hypertext model (as a high-level formalism) to run on any network infrastructure. The author re-introduces Ted Nelson’s idea of transclusion as the main mode of linking (as opposed to the current html notion of embedded span-to-doc links), and brings it forward into the current networked environment (e.g. → the distributed web). The author provides a high level definition of the basic elements of the model, and identifies a handful of challenges (some of which might represent a career’s worth of research like ownership and control, hypertext visualization, user interfaces).

It’s great to see someone take Ted Nelson’s ideas and run with them, especially to examine which are practical in the current setting, and which are not. Yet I came away from the paper unsure about its contribution (in other words, the author claims novelty of vision, but I’m not quite sure where the novelty lies). It might be best for the author to focus on what they feel is the most important contribution of the work, and to go deeper on what is specifically new, and—most importantly—justify it with use (in other words, what aspects of current stakeholder behavior motivate this formalism). That’s the problem with models: it’s fun to noodle around, and come up with more expressive notions of hypertext. It’s also easy to say, ‘we can slap a UI on it, implement the right services around it, and it’ll take off like a rocket.’ What will drive this change?

Before the Web dominated the hypertext field, there were a number of different models that contributed to OHS specs. Particularly, there were a number of different views on how links should work: how they should be represented and stored (multi-headed links, links as relations, links as full-fledged objects on par with documents, computed links, etc.), how links should fail gracefully, how ownership should figure in, and so on. How something should work can be a dangerous way of talking about systems in the absence of constraints derived from practice. E.g., one look at copyright, and you’ll see a place where practice and law have gone asunder; technology, by itself, probably won’t be able to span the growing chasm. That’s why so many communities insist on some form of study, evaluation, simulation, or careful analysis.

But—I’ll ease up!—this is a short paper. I don’t see intrinsic harm in proposing new models in abbreviated form; I’m just not sure where to look for the novelty here. Perhaps the author could offer a tighter focus, and justify parts of the model with observed phenomena.

anonymous reviewer 2 replied on

The paper purports to present a novel and formal interpretation of Nelson’s view of hypertext. There are several reasonable ideas here. That said, I feel the paper tries to do too much, and consequently doesn’t treat any of its subjects to more than a cursory examination.

The paper begins with a formal model of a hypertext system. In general, I often find formal models to be unmotivated. There are oodles of existing models for hypertext systems already – how does this further the field? What was either inexpressible or only awkwardly expressible previously that now has an elegant representation? Furthermore, I think the converse is quite clear: the model presented is static. A document as byte stream is wholly incompatible with the bulk of OHS work in which most elements were (at least in theory) computable. (This implies that a document is a function f that maps some input and some world state [e.g., current time] to some output.)

The section on data formats was puzzling. Isn’t the whole point that we don’t care about data formats? I don’t see what this section contributes to the paper, other than the straightforward observation that the model “does not impose any limits on possible data formats”. This statement seems sufficient on its own.

I feel like there are some nice ideas here, and I realize this is a short paper, but the paper as currently constructed doesn’t seem to make any compelling point well. My recommendation is that you focus on one aspect (perhaps the model) and really explore this. How does this compare to Dexter, or Groenbaek & Trigg’s ’96 work? What does the model do for us? As it stands, the model feels insufficient for dynamic data and structure, and is essentially a restatement of Dexter.

anonymous reviewer 3 replied on

This paper presents an agnostic infrastructure model for hypertext and uses the Xanadu model/system and its capabilities as a reference for discussion.

It would be interesting for the paper to discuss (even if only briefly) how the proposed formal interpretation/model compares to other models that have been proposed for hypertext/hypermedia.

Jakob Voß replied on

Thanks for the reviews. I tried to squeeze the whole idea into a short paper of four pages (PDF version), which turned out to be too dense. The reviews point out that “the paper tries to do too much” and that the author should “focus on what they feel is the most important contribution of the work”. Ironically, a similar lack of focus, partly caused by the extent of his vision, can also be found in Ted Nelson’s works. Anyway, my paper needs to be extended:

  • The model should be compared with existing hypertext models, in particular the Dexter Hypertext Reference Model (there is another irony in the timing of creation of this model and creation of the WWW, but that’s another story).

  • The novelty of the vision of infrastructure-agnostic hypertext must be presented more convincingly. The model alone is indeed of little interest; it must be justified with benefits compared to existing hypertext systems. Nelson spent decades arguing for transclusion and non-breaking links.

  • The role of data formats needs to be explained more clearly. Data formats are crucial for understanding documents, and we don’t want to limit hypertext to a specific hypertext document format. Instead, all kinds of data should be persistently transcludeable.

  • The examples need to be presented in more detail. The edit list example introduces hashes as document identifiers too early; the descriptions of edit lists and (content-based) document identifiers should be better separated. The paper might gain focus by explicitly choosing content-based identifiers as the best approach.