The Digital Object, as shown in Figure 4-10, is itself composed of one or more bit sequences. The purpose of the Representation Information object is to convert the bit sequences into more meaningful information. It does this by describing the format, or data structure concepts, that are to be applied to the bit sequences and that in turn yield more meaningful values such as characters, numbers, pixels, arrays, tables, etc. These common computer data types, aggregations of these data types, and the mapping rules that map from the underlying data types to the higher-level concepts needed to understand the Digital Object are referred to as the Structure Information of the Representation Information object. These structures are commonly identified by name or by relative position within the associated bit sequences. The Structure Information is often referred to as the ‘format’ of the Digital Object.
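As an illustration only, the following minimal sketch shows how Structure Information might be applied to a bit sequence to yield named, typed values. The byte values, the field names (count, label, scale), and the helper apply_structure are hypothetical and are not part of this reference model; the sketch assumes big-endian encoding and uses Python's standard struct module.

```python
import struct

# Hypothetical 8-byte bit sequence standing in for a Digital Object.
digital_object = bytes([0x00, 0x2A, 0x41, 0x42, 0x3F, 0x80, 0x00, 0x00])

# Hypothetical Structure Information: fields identified by name and by
# relative position, each mapped to a common computer data type.
structure_information = [
    ("count", "H"),   # 16-bit unsigned integer
    ("label", "2s"),  # two bytes, to be read as characters
    ("scale", "f"),   # IEEE 754 single-precision real
]

def apply_structure(bits, fields):
    """Interpret a bit sequence according to the given field descriptions."""
    fmt = ">" + "".join(code for _, code in fields)  # ">" means big-endian
    values = struct.unpack(fmt, bits)
    return dict(zip((name for name, _ in fields), values))

record = apply_structure(digital_object, structure_information)
print(record)
```

Without the field descriptions, the same eight bytes could equally be read as a single 64-bit integer or as eight characters, which is why the Structure Information must be preserved along with the bits.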

The Representation Information provided by the Structure Information is seldom sufficient. Even in the case where the Digital Object is interpreted as a sequence of text characters, and described as such in the Structure Information, the additional information as to which language is being expressed should be provided. This type of additional required information is referred to as the Semantic Information. When dealing with scientific data, for example, the information in the Semantic Information can be quite varied and complex. It will include special meanings associated with all the elements of the Structure Information, operations that may be performed on each data type, and their interrelationships. Figure 4-11 emphasizes the fact that Representation Information contains both Structure Information and Semantic Information, although in some implementations the distinction is subjective. It is useful to remember that the Semantic Information associated with parts of some digitally encoded information is independent of the format. For example, the meaning of numbers in a data file is independent of whether they are encoded as scaled integers or as IEEE Reals; the meaning of words in a document is independent of whether the document is Word or PDF.
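The independence of Semantic Information from format can be sketched as follows. This is a hypothetical example, not a prescribed encoding: the quantity, the unit, the divisor of 100, and the function names are all assumptions made for illustration. The same value is encoded once as a scaled 16-bit integer and once as an IEEE 754 double; the Structure Information differs, while the Semantic Information is shared.

```python
import struct

# Hypothetical Semantic Information, shared by both encodings.
semantics = {"quantity": "sea surface temperature", "unit": "degrees Celsius"}

SCALE_DIVISOR = 100  # hypothetical: stored integer = value * 100

def decode_scaled_integer(bits):
    """Structure Information A: big-endian signed 16-bit scaled integer."""
    (raw,) = struct.unpack(">h", bits)
    return raw / SCALE_DIVISOR

def decode_ieee_real(bits):
    """Structure Information B: big-endian IEEE 754 double."""
    (value,) = struct.unpack(">d", bits)
    return value

as_scaled = struct.pack(">h", 1850)  # 2 bytes
as_real = struct.pack(">d", 18.5)    # 8 bytes

for bits, decode in ((as_scaled, decode_scaled_integer),
                     (as_real, decode_ieee_real)):
    print(decode(bits), semantics["unit"], "-", semantics["quantity"])
```

The two bit sequences have different lengths and contents, yet both decode to the same number with the same meaning.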

Figure 4-11 also shows that Representation Information may contain Other Representation Information. This indicates that the taxonomy of Representation Information presented here is far from complete. For example, software, algorithms, encryption, written instructions and many other things may be needed to understand the Content Data Object, all of which therefore would be, by definition, Representation Information, yet would not obviously be either Structure or Semantics. Information defining how the Structure Information and the Semantic Information relate to each other, or software needed to process a database file, would be regarded as Other Representation Information.

Structure Information, Semantic Information and Other Representation Information are both sub-types and components of Representation Information.

Representation Information is an Information Object that may have its own Data Object and its own Representation Information associated with understanding each Data Object, as shown in a compact form by the ‘interpreted using’ association. The resulting set of objects can be referred to as a Representation Network.

As an example, ISO 9660 (reference [D10]) describes text as conforming to the ASCII standard, but it does not actually describe how ASCII is to be implemented. It simply references the ASCII standard, which is additional Representation Information that is needed for a full understanding. Therefore the ASCII standard is a part of the Representation Network associated with ISO 9660 and needs to be obtained by the OAIS in some form, or the OAIS needs to track the availability of this standard so that it may take appropriate steps in the future to ensure its ISO 9660 Representation Information is fully understandable.
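The ISO 9660 example can be sketched as a walk over ‘interpreted using’ associations. The sketch below is illustrative only: the table of associations, the starting object name, the set of items held by the archive, and the function representation_network are hypothetical, and a real OAIS would record these associations in whatever form suits its implementation.

```python
# Hypothetical 'interpreted using' associations: each object maps to the
# Representation Information needed to understand it.
interpreted_using = {
    "CD-ROM image": ["ISO 9660"],
    "ISO 9660": ["ASCII standard"],
    "ASCII standard": [],
}

def representation_network(start, links):
    """Collect all Representation Information reachable from 'start'."""
    seen = []
    stack = list(links.get(start, []))
    while stack:
        item = stack.pop()
        if item not in seen:  # guards against cycles and repeated entries
            seen.append(item)
            stack.extend(links.get(item, []))
    return seen

network = representation_network("CD-ROM image", interpreted_using)

# Hypothetical holdings: the archive has ISO 9660 but not the ASCII standard.
held = {"ISO 9660"}
missing = [item for item in network if item not in held]
print("Network:", network)
print("To obtain or track:", missing)
```

Items reported as missing are those the OAIS would need either to obtain or to track for continued availability.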

Figure 4-11: Representation Information Object

