Data Product Type
- Should we specify all possible values? The current document suggests allowing the value 'other' when no available category fits a given science data product.
- If we allow 'other', how do we specify more information? It could go in free format in Subtype, but then interoperability is not guaranteed.
- How could we suggest a possible set of (type, subtype) pairs for a data product, and provide examples?
Comments
In my opinion, in an ideal world this would be something like a set of tags
(e.g., it's easy to imagine spectrum-image, or spectrum-timeseries, or
image-timeseries). A taxonomy of such classes (the "tree") could then be
used for query expansion by clients. However, there are no set-valued
columns in ADQL, and faking them using, e.g., strings and SQL patterns
("=AND producttype like '%/spectrum/%'=") would defeat indexing and simply
be ugly. So, I'd say make a catalog, ideally using the
StandardKeyEnumeration from StandardsRegExt, and "other" is simply the SQL
NULL. Tell both data providers and consumers not to fill in the type if
there's not a good match. Ideally, the StandardKeyEnumeration would in the
descriptions try to cover as many real data products as possible ("This
category covers objective prism exposures").
My feeling is that that's the Pareto-correct way of doing things, actually
covering probably around 98% of the actual queries rather than just 80%.
-- MarkusDemleitner - 04 Mar 2011
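To make the proposal above concrete, here is a minimal sketch of a single-valued producttype column in which "other" is stored as NULL, with the query expansion done on the client side. Everything in it is an assumption for illustration: the table name, the rows, and the small taxonomy are hypothetical, and SQLite stands in for a real TAP/ADQL service. The point is only that an expanded IN list keeps the column indexable, unlike a LIKE pattern with a leading wildcard.

```python
import sqlite3

# Hypothetical taxonomy: each product type maps to its narrower types.
# A real one would presumably come from a StandardKeyEnumeration.
TAXONOMY = {
    "spectrum": ["spectrum-image", "spectrum-timeseries"],
    "image": ["spectrum-image", "image-timeseries"],
    "timeseries": ["spectrum-timeseries", "image-timeseries"],
}


def expand(term):
    """Return the term plus all narrower terms (client-side expansion)."""
    seen = []
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.append(current)
            stack.extend(TAXONOMY.get(current, []))
    return sorted(seen)


def find_products(conn, term):
    """Query with an IN list of expanded terms; plain equality matches
    like these can use an ordinary index on producttype."""
    terms = expand(term)
    placeholders = ", ".join("?" for _ in terms)
    rows = conn.execute(
        "SELECT name FROM products"
        f" WHERE producttype IN ({placeholders}) ORDER BY name",
        terms,
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (name TEXT, producttype TEXT)")
conn.execute("CREATE INDEX products_type ON products (producttype)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?)",
    [
        ("a", "spectrum"),
        ("b", "spectrum-image"),
        ("c", "image"),
        ("d", None),  # no good match: leave the type NULL, not 'other'
    ],
)
```

With these hypothetical rows, find_products(conn, "spectrum") picks up both the plain spectrum and the spectrum-image product, while the NULL row is only found by an explicit IS NULL test.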