Units associated with literals - ADQL considerations.

Comments supplied by JeffLusted. These will be incorporated as appropriate (modified where necessary) in the Units Working Draft.

For queries it is important to avoid ambiguity. For example, km/s (or km / s) would be difficult on its own, because the forward slash is ambiguous wherever expressions routinely contain arithmetic. The same is true of + - * % and of potential bit operators within a query language (~ ^ &). The full stop is also awkward where dot-qualification is employed, as in database.schema.table, or where columns are qualified by a table alias, as in a.ra.

Care must be taken to consider the possibility of a unit name (or part of a unit name) being the same as a column or table name, or as a reserved word, although the latter is easier to control.

None of this is impossible to overcome. I think the easiest way is to have some standard syntactic marker. For example:

  • 40 [m+2]
  • 40 ?m+2
  • 40 u:m+2

In fact, a number of plausible alternatives could be supported. If then a parser complained of ambiguity, one of the alternatives could be used.
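As a rough illustration of how a parser might recognise the three marker forms listed above, here is a minimal sketch in Python. The function name parse_unit_literal, the regular expressions and the set of characters allowed inside a unit are all assumptions made for this example; none of them come from ADQL or from the Units Working Draft. White space inside the unit string is deliberately not handled.

```python
import re

# Hypothetical grammar fragments, for illustration only.
_NUMBER = r"[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?"
_UNIT_BODY = r"[A-Za-z][A-Za-z0-9+\-*/.]*"

# A number followed by exactly one of the marker forms:
#   40 [m+2]    40 ?m+2    40 u:m+2
_PATTERN = re.compile(
    rf"\s*(?P<value>{_NUMBER})\s*"
    rf"(?:\[\s*(?P<bracket>{_UNIT_BODY})\s*\]"
    rf"|\?\s*(?P<question>{_UNIT_BODY})"
    rf"|u:\s*(?P<prefix>{_UNIT_BODY}))\s*"
)


def parse_unit_literal(text):
    """Return (value, unit_string) for a marked literal, or None.

    A literal without a marker (e.g. "40 m+2") is rejected, because
    without the marker it cannot be told apart from arithmetic.
    """
    match = _PATTERN.fullmatch(text)
    if match is None:
        return None
    unit = (match.group("bracket")
            or match.group("question")
            or match.group("prefix"))
    return float(match.group("value")), unit
```

All three marked spellings yield the same (value, unit) pair, while the unmarked "40 m+2" is refused rather than guessed at, which is the point of having a marker at all.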

The presence or absence of white space is probably immaterial: in most languages m + 2 is the same as m+2, so attempting to use white space to resolve the ambiguity is misguided.

Units associated with Columns or Expressions involving Columns. I assume a column has a UCD and units associated. Therefore, either a column value has a unit or is unit-less. It is only the unit-less ones which are problematic; see later.

For expressions, automatic divining of units is a non-starter, in my opinion. Take, for example:

  • X = expressionA * sqrt( expressionB) / expressionC

where each of the individual expressions could be arbitrarily complex. Even if the individual expressions were amenable to divining a unit from their composed columns, the overall effect is still problematic. I would say it was an open problem, or undecidable (at least by a machine).

Solution. There is a CAST operator in SQL. We adapt this to cast a value into some desired unit. If the value has no unit (e.g. an arbitrarily complex expression), then we are simply assigning a unit to a value. If, on the other hand, the value has a defined unit, then the cast implies a conversion wherever the two units disagree. A parser could vet this for correctness and invoke a library function under the covers to effect the conversion.
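The two behaviours of such a cast (assign a unit to a unit-less value, convert a value that already has one) can be sketched as follows. This is only an illustration under stated assumptions: the Quantity class, the cast_unit function and the tiny conversion table are hypothetical, and a real implementation would delegate to a proper units library rather than a hand-written table.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical conversion table: unit -> (dimension, factor to base unit).
_TO_BASE = {
    "m": ("length", 1.0),
    "km": ("length", 1000.0),
    "s": ("time", 1.0),
    "h": ("time", 3600.0),
}


@dataclass(frozen=True)
class Quantity:
    value: float
    unit: Optional[str] = None  # None means the value is unit-less


def cast_unit(q, target):
    """Cast q to the target unit.

    Unit-less value: the unit is simply assigned.
    Value with a unit: converted, or rejected if the dimensions differ.
    """
    if target not in _TO_BASE:
        raise ValueError(f"unknown target unit: {target}")
    if q.unit is None or q.unit == target:
        return Quantity(q.value, target)
    if q.unit not in _TO_BASE:
        raise ValueError(f"unknown source unit: {q.unit}")
    src_dim, src_factor = _TO_BASE[q.unit]
    dst_dim, dst_factor = _TO_BASE[target]
    if src_dim != dst_dim:
        raise ValueError(f"cannot cast {q.unit} to {target}")
    return Quantity(q.value * src_factor / dst_factor, target)
```

The rejection of a cast between incompatible dimensions corresponds to the vetting a parser could perform before any conversion library is invoked.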

One recommendation. If the above sounds sensible, then a good rule to impose might be that every column value or expression in the select list of a query always specifies units. These could default for columns whose units are known, but otherwise a parser could insist on them, or at least issue warnings. The units could then form part of the results VOTable.
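A minimal sketch of that rule, assuming the parser has already reduced each select item to a triple of expression text, the unit known from column metadata (if any) and the unit given by an explicit cast (if any). The function name resolve_select_units and the triple layout are hypothetical and chosen only for this example.

```python
def resolve_select_units(items):
    """Resolve a unit for every select item.

    items: list of (expression_text, known_unit, cast_unit) triples,
    where either unit may be None.
    Returns (resolved, warnings): resolved is a list of
    (expression_text, unit) pairs and warnings lists the items
    for which no unit could be determined.
    """
    resolved = []
    warnings = []
    for expr, known_unit, cast_unit in items:
        # An explicit cast wins; otherwise default to the column's unit.
        unit = cast_unit if cast_unit is not None else known_unit
        if unit is None:
            warnings.append(f"no unit specified for select item: {expr}")
        resolved.append((expr, unit))
    return resolved, warnings
```

A stricter parser could raise an error instead of collecting warnings; the resolved units are what would be carried into the results VOTable.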


Topic revision: r1 - 2009-03-18 - AnitaRichards
 