On Utypes

Comments on Utypes note

(http://wiki.ivoa.net/internal/IVOA/Utypes/WD-Utypes-0.3-20090522.pdf)

@2.1 "data model elements": we need to agree on what these may be, which means we need to know the syntax of data modelling. In UML this might mean deciding to use class diagrams, or choosing a particular UML profile. Alternatively, it could be a custom language defined for "data modelling in the VO". This is what we have done in SimDB, and it is why we were able to define ("BNF") and implement (XSLT) rules for deriving Utypes from the data model.

Definitions:

VOTable v1.10

In some contexts, it can be important that FIELDs or PARAMeters are explicitly designed as being the parameter performing some well-defined role in some external data model. For instance, it might be important for an application to know that a given FIELD expresses the surface brightness processed by an explicit method. None of the existing name, ID or ucd attributes can fill this role, and the utype (usage-specific or unique type) attribute has been added in VOTable 1.1 to fill this gap. By extension, most elements may refer to some external data model, and the utype attribute is legal also in RESOURCE, TABLE and GROUP elements.

In order to avoid name collisions, the data model identification should be introduced following the XML namespace conventions, as utype="datamodel_identifier:role_identifier". The mapping of "datamodel_identifier" to an xml-type attribute is recommended, but not required.

VOTable v1.20:

In many contexts, it is important to specify that FIELDs or PARAMeters do convey the values defined in an external data model. For instance, it can be fundamental for an application to be aware that a given FIELD expresses the surface brightness measured with a specific filter and within a 12x6arcsec elliptical aperture. None of the other name, ID or ucd attributes can fill this role, and the utype (usage-specific or unique type) attribute has been introduced in VOTable 1.1 to fill this gap. By extension, most elements may refer to some external data model, and the utype attribute is legal also in RESOURCE, TABLE and GROUP elements.

In order to avoid name collisions, the data model identification should be introduced following the XML namespace conventions, as utype="datamodel_identifier:role_identifier". The mapping of "datamodel_identifier" to an xml-type attribute is recommended, by means of the xmlns convention which specifies the URI of the data model quoted, as done in the example of section 3.1.
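The "datamodel_identifier:role_identifier" convention quoted above can be sketched in a few lines of Python. This is only an illustration: the function names, the "ssa" prefix, and the example URI are assumptions made for this sketch and are not taken from either VOTable document.

```python
def split_utype(utype):
    """Split 'prefix:role' into (prefix, role).

    The prefix is None when the utype carries no data model identifier.
    """
    prefix, sep, role = utype.partition(":")
    if not sep:
        return None, utype
    return prefix, role


def resolve_utype(utype, namespaces):
    """Replace the prefix by the URI declared for it (xmlns-style mapping).

    `namespaces` is a plain dict from prefix to URI; an undeclared prefix
    raises KeyError so that the caller notices the missing declaration.
    """
    prefix, role = split_utype(utype)
    if prefix is None:
        return None, role
    return namespaces[prefix], role


# Hypothetical prefix-to-URI mapping, for illustration only.
NAMESPACES = {"ssa": "http://example.org/dm/ssa"}
```

With this mapping, resolve_utype("ssa:DataID.Title", NAMESPACES) yields the URI together with the role "DataID.Title", so two data models that both define a "Title" role remain distinguishable.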

The utype attribute is especially useful to specify the spatial and temporal coordinates present in the table when it contains astronomical events: these parameters are essential to most applications which process multi-wavelength data. Within the IVOA, the spatial and temporal frames are described in the STC data model (see Rots [10]), and it is expected that this STC-referencing replaces the usage of the COOSYS defined in the version 1.0 of VOTable.

The example given above (see section 3.1) gives an illustration of the recommended way of linking a VOTable document to the STC model. Other examples and details are presented in the dedicated note ``Referencing STC in VOTable''[9].

SSA, v1.04, 2.3

UTYPE tags are used to provide a uniform means to identify the elements of a data model in any language or environment. For example, given the component data model “DataID“, the UTYPE “DataID.Title“ identifies the data model field containing the title string for the dataset; “DataID.Collection“ identifies the parent data collection, and so forth.

SSA, v1.04, 2.11 (incl UCD)

A UTYPE is a fixed string which uniquely identifies a field of a data model regardless of representation. UTYPEs are strings such as "Target.Name", using embedded period characters to delimit the fields of the UTYPE. A simple way to think of a UTYPE is as a reference to a field of a data structure in a language such as C. The effect is to flatten a hierarchical data model so that all fields of the data model are represented by fixed strings in a flat name space, allowing a wide variety of software to be used to manipulate or use the model. Of course if a data model becomes complex enough this will no longer be possible, but the approach has significant benefits for a wide variety of data. UTYPEs are defined within a single name space identifying the data model, and are unique only within the context of the specified data model.
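The "flattening" described here can be illustrated with a short Python sketch. It assumes, purely for illustration, that a data model instance is held as nested dictionaries; the function name and the sample values are not part of SSA.

```python
def flatten(model, prefix=""):
    """Map a nested dict onto a flat dict keyed by dotted, UTYPE-like paths."""
    flat = {}
    for name, value in model.items():
        path = f"{prefix}.{name}" if prefix else name
        if isinstance(value, dict):
            # Descend into the component and extend the dotted path.
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


# Sample instance with made-up values.
INSTANCE = {
    "Target": {"Name": "M31"},
    "DataID": {"Title": "Example spectrum", "Collection": "Example survey"},
}
```

Calling flatten(INSTANCE) produces the keys "Target.Name", "DataID.Title" and "DataID.Collection", each of which addresses exactly one leaf of the hierarchy.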

Note that while a UTYPE is always a fixed string which uniquely identifies a data model element, if there are multiple instances of the data model in a container (name space), then multiple data elements may have the same UTYPE. For example, in a VOTable representation, multiple table FIELDs may have the same UTYPE if there are multiple instances of a component data model (e.g., Association) in the table. In this case the GROUP construct is used to separately identify the data model instances. Within each GROUP, the UTYPE values still uniquely identify the field of the data model. Multiple instances of individual table FIELDs (e.g., Curation.Reference) are also possible.
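A minimal sketch of how a client might separate such instances, assuming a simplified VOTable fragment without XML namespaces in which each instance is a GROUP of PARAMs. The fragment, the parameter names and the values are invented for this illustration; only the GROUP/PARAM/utype mechanism is taken from the text above.

```python
import xml.etree.ElementTree as ET

# Simplified, namespace-free fragment with two instances of one component.
FRAGMENT = """
<TABLE>
  <GROUP utype="Association">
    <PARAM name="a1_type" datatype="char" arraysize="*"
           utype="Association.Type" value="type-a"/>
    <PARAM name="a1_id" datatype="char" arraysize="*"
           utype="Association.ID" value="obs-1"/>
  </GROUP>
  <GROUP utype="Association">
    <PARAM name="a2_type" datatype="char" arraysize="*"
           utype="Association.Type" value="type-b"/>
    <PARAM name="a2_id" datatype="char" arraysize="*"
           utype="Association.ID" value="obs-2"/>
  </GROUP>
</TABLE>
"""


def instances(xml_text, group_utype):
    """Return one {utype: value} dict per GROUP whose utype matches."""
    root = ET.fromstring(xml_text.strip())
    found = []
    for group in root.iter("GROUP"):
        # Compare case-insensitively, as SSA prescribes for UTYPEs.
        if group.get("utype", "").lower() != group_utype.lower():
            continue
        found.append({p.get("utype"): p.get("value")
                      for p in group.findall("PARAM")})
    return found
```

The same UTYPE "Association.ID" occurs twice in the table, but within each returned dictionary it identifies a single value.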

A UCD identifies the semantic type of a data value or data model element, saying what type of quantity, in a physical sense, is stored in the value. UCDs are defined globally, independent of how they are used. UCDs may be used independently of any data model. Multiple data models may define fields which share the same UCD, or multiple fields of a single data model may share the same UCD. Since multiple fields even within a single data model may share the same UCD value, UCDs cannot be used to uniquely identify data model fields. UCDs however provide a unique capability to identify or associate similar types of fields in independent data models or data instances.
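The difference can be made concrete with a small Python sketch that indexes fields by UCD. The models "ModelA" and "ModelB" and their fields are hypothetical, and the UCD assigned to each field is an assumption made for the example.

```python
from collections import defaultdict

# (utype, ucd) pairs; both the field names and the pairings are invented.
FIELDS = [
    ("ModelA.StartTime", "time.epoch"),
    ("ModelA.StopTime", "time.epoch"),
    ("ModelB.ObsTime", "time.epoch"),
    ("ModelA.Title", "meta.title"),
]


def by_ucd(fields):
    """Group UTYPEs by UCD, comparing UCDs case-insensitively."""
    index = defaultdict(list)
    for utype, ucd in fields:
        index[ucd.lower()].append(utype)
    return dict(index)
```

Here by_ucd(FIELDS)["time.epoch"] lists three fields from two models: the UCD associates similar quantities across models, but it cannot serve as a key for any one of them.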

Both UTYPEs and UCDs are case-insensitive, and case should be ignored when comparing string values for equality.
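In code, this rule amounts to normalising case before comparing or looking up. The following is a sketch only; the helper names are assumptions for this illustration.

```python
def same_utype(a, b):
    """True when two UTYPE (or UCD) strings are equal ignoring case."""
    return a.casefold() == b.casefold()


def find_by_utype(columns, utype):
    """Return the first column value whose UTYPE key matches, else None.

    `columns` is a dict from UTYPE string to any column description.
    """
    wanted = utype.casefold()
    for key, column in columns.items():
        if key.casefold() == wanted:
            return column
    return None
```

A consumer that compares strings exactly would treat "Target.Name" and "target.name" as different fields, which the rule above forbids.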

Semantics of UTYPEs

If a column in a table is declared to have a particular Utype it is possible to assume that that column stores a value that has been obtained from the corresponding attribute in an "exact instance" of the data model.

What about the table containing that column, though? If it is given a utype corresponding to a Class in the model, how is it related to an exact instance of that class? Should it be an exact representation? Should it contain columns for all attributes and references?

Possible use cases:

  • Annotating results of ADQL queries on SimDB/TAP
  • Annotating elements in the TAP_SCHEMA representation of SimDB/DM
  • Annotating elements in the XML schema representation of SimDB/DM
  • Identifying elements in models derived from SimDB/DM, for example in views
  • ...

Links


Topic revision: r6 - 2009-06-15 - GerardLemson
 