The entity is the central object type used in (the database serialization of) the Omega system. The space of entities is divided into concepts, lexical items, and senses. Lexical items correspond to actual words used to express a given concept; concept-to-lexical-item relationships are reified as senses (which allows us to express relationships, e.g., WordNet derived-from annotations, on the senses themselves). Each entity has a name (identifier); furthermore, names are segmented into namespaces. In the current implementation, individual namespaces consist of a concatenation of two name components (the so-called prefix and suffix components). This allows for managing potential name clashes between, say, concepts and lexical items, between two different lexical item spaces (e.g., the English noun "linear" is not the same as the Spanish verb "linear"!), or between two different concept spaces (e.g., coming from two different Omega constituents in the process of being merged).
Perhaps an example will make things a little clearer. As Omega includes WordNet and Mikrokosmos, consider the state of the system when these two components are both resident and as yet unmerged. We might create a namespace scheme with prefixes for WordNet and Mikrokosmos, and suffixes for concepts, senses, and English and Spanish lexical items. We would then have one namespace for each prefix/suffix combination.
Given something like this, we could have entities called, say, "BIOLOGY" in the WordNet concept namespace, the WordNet lexical item namespace, and even in the Mikrokosmos concept and lexical item namespaces, without any tricky naming conventions. Of course, ontologists often want to merge everything into one big artifact, but this representation allows that to proceed as gradually as needed.
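The namespace idea can be sketched roughly as follows, assuming a namespace is identified by a (prefix, suffix) pair and an entity by its namespace plus its name. The class names and the prefix/suffix strings ("WN", "MK", "concept", "lex-en") are hypothetical and chosen only for illustration; they are not Omega's actual identifiers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Namespace:
    prefix: str  # hypothetical: source component, e.g. "WN" for WordNet
    suffix: str  # hypothetical: entity space, e.g. "concept" or "lex-en"


@dataclass(frozen=True)
class Entity:
    namespace: Namespace
    name: str


# The same name can live in several namespaces without clashing.
wn_concept = Entity(Namespace("WN", "concept"), "BIOLOGY")
wn_lexical = Entity(Namespace("WN", "lex-en"), "BIOLOGY")
mk_concept = Entity(Namespace("MK", "concept"), "BIOLOGY")

entities = {wn_concept, wn_lexical, mk_concept}
```

Because equality is defined over the namespace and the name together, the three "BIOLOGY" entities stay distinct until someone decides to merge them.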
In the database schema, the entity table, together with the satellite tables entityType (encoding concept/sense/lexical item) and namespace, is used to serialize entities.
Given a set of entities as described above, the other main component of the ontology framework is the relationships between objects. At the current time, relationships are always binary and may have multiple values. In Omega, entity/entity relations are termed links; relationships between entities and other objects (we call non-entities literals) are termed attributes. We don't consider relationships between literals and literals, nor do we consider relationships that are expressed as connecting from literals to entities (the entity-to-literal inverse would be used instead).
Unlike in (say) OWL, relation names in Omega are not currently interpreted relative to a namespace. If two portions of the namespace have different ideas about the meaning of a relation R, we might have to encode that into the relation name itself. This might change as well should the need arise.
The database table link is a simple 3-column table linking two entities using a given relation. Typically, we don't materialize the transitive closure of relations in the database (e.g., concept-ancestors) but instead compute it dynamically using PowerLoom (maintaining only concept-parents). The satellite table linkType is used to encode the link relations. Note that linkType doesn't encode the range type of the link. So, for example, it would be legal to have a link relation DERIVED-FROM which maps from a lexical item to a lexical item or from a sense to a sense.
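To illustrate computing ancestors on demand from stored parent links, here is a minimal sketch in plain Python. It is only a stand-in for the idea, not PowerLoom's API; the relation name "PARENT", the tuple layout, and the sample rows are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical link rows: (source entity, relation, target entity).
links = [
    ("DOG", "PARENT", "CANINE"),
    ("CANINE", "PARENT", "MAMMAL"),
    ("MAMMAL", "PARENT", "ANIMAL"),
    ("DOG", "PARENT", "PET"),
]


def ancestors(entity, rows, relation="PARENT"):
    """Return every entity reachable from `entity` via `relation`."""
    # Index the direct parents of each entity for the chosen relation.
    parents = defaultdict(set)
    for source, rel, target in rows:
        if rel == relation:
            parents[source].add(target)

    # Depth-first walk; `seen` also guards against cycles.
    seen = set()
    stack = [entity]
    while stack:
        current = stack.pop()
        for parent in parents[current]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen
```

Only the direct parent rows need to be stored; the closure is derived whenever it is asked for, so adding or removing one parent link never requires rewriting a materialized ancestor table.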
The database table attribute is the most complicated part of the serialization scheme. Recall that an attribute is a relationship from an entity to a literal. For ease of representation, the range of literals has been constrained to one of six types. Strings, integers, and floating point numbers are all represented immediately using the natural database implementation type. Boolean-valued attributes are encoded with small integers (also immediately). S-expressions, i.e., Lisp lists which allow shared or even circular structure and reference to other objects, are represented as the printed representation of the object; this means that we can only use such Lisp objects that have valid printable representations, given a suitable Lisp runtime environment. This includes strings, numbers, lists, symbols, structures, and arrays, but excludes packages, hash-tables, and CLOS objects. Extending this would not be hard, but in fact s-expression data is rarely used anyway. The sixth type is the Lisp keyword, a typical encoding of enumerated types. As such, keywords have object identity in a way that garden-variety strings do not (they are more like immutable Java interned strings).
In Omega, attributes have a defined range type and thus are not polymorphic in the same way that the DERIVED-FROM link described above is. Each entity attribute can thus be serialized into the attribute table using the identity of the entity, the attributeType (encoded as an id), and precisely one of the six value slots (according to the data type of the specified attribute), the other five being NULL. For immediate attribute values of data type string, integer, float, boolean (1/0), or s-expression, the associated value field is used. For keyword attributes, the index of the keyword (from table keyword) is stored in keywordVal.
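The "one slot out of six" rule can be sketched as a small encoding function. This is an illustration under stated assumptions, not the actual Omega loader: apart from keywordVal, which the text names, the column names below are hypothetical, as are the row keys "entity" and "attributeType" and the sample ids.

```python
from typing import Any, Dict, Optional

# Hypothetical column names; only keywordVal is named in the text above.
VALUE_SLOTS = {
    "string": "stringVal",
    "integer": "intVal",
    "float": "floatVal",
    "boolean": "boolVal",
    "s-expression": "sexprVal",
    "keyword": "keywordVal",
}


def encode_attribute(entity_id: int, attribute_type_id: int,
                     data_type: str, value: Any) -> Dict[str, Optional[Any]]:
    """Build one attribute row with exactly one non-NULL value slot."""
    if data_type not in VALUE_SLOTS:
        raise ValueError(f"unsupported data type: {data_type}")
    row = {"entity": entity_id, "attributeType": attribute_type_id}
    # Start with all six value slots set to NULL (None).
    row.update({slot: None for slot in VALUE_SLOTS.values()})
    if data_type == "boolean":
        value = 1 if value else 0  # booleans are stored as small integers
    row[VALUE_SLOTS[data_type]] = value
    return row


row = encode_attribute(17, 3, "boolean", True)
```

For a keyword attribute, the caller would pass the keyword's index from the keyword table as `value`; for an s-expression, the printed representation as a string.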
Namespace files have five columns:
EntityType files have two columns:
DataType files have two columns:
Keyword files have two columns:
LinkType files have two columns:
AttributeType files have three columns:
Entity files have four columns:
Link files have three columns:
Attribute files have eight columns; two are always used, and only one of the other six is used in any one record, much like a "union type".