Space-Time Coordinate Metadata RFC
This document will act as RFC centre for the Space-Time Metadata V1.30 Proposed Recommendation.
In order to add a comment to the document, please edit this page and add your comment to the list below in the format used for the example (include your WikiName so authors can contact you for further information). When the author(s) of the document have considered the comment, they will provide a response after the comment.
Discussion about any of the comments or responses should be conducted on the data model mailing list, dm@ivoa.net.
See also historic comments on V1.21.
Comments
Here is a note from Arnold describing the status of implementations.
I believe the implementations are sufficient to warrant approval of the PR. - Jonathan
Applications using STC
1. VOEvent
The WhereWhen information of a VOEvent uses the ObsDataLocation
construct from STC to indicate the spatial and temporal metadata of
the event. That description is complete and unambiguous.
VOEvent recognizes a limited collection of standard coordinate
systems.
See:
http://ivoa.net/Documents/latest/VOEvent.html
2. JHU Footprint Service
The footprint service developed at JHU and used for SDSS and HLA
deploys STC regions and coordinate positions for describing footprints
and search regions. It represents a complete implementation of STC
regions and provides a complete set of operative functions.
See:
http://voservices.net/footprint
Coordinate transformations are not supported, although it would not be
difficult to add them.
3. Registry
The Coverage component of the Registry's resource description is
expressed through STC metadata. These descriptions are complete and
unambiguous. An issue arising from the concatenation of documents, as
performed by registry services, appears to be headed for imminent
resolution; otherwise there is an acceptable work-around. It should be
stressed that this is not a specific STC issue, but the general
problem of handling unambiguous associations between elements when
concatenating documents.
See:
http://ivoa.net/Documents/latest/RM.html
4. David Berry's prototype
David Berry implemented a prototype coordinate transformation service
for STC version 1.20.
Although this service is not operative for the present version, it
demonstrates that the coordinate system metadata in STC pass the test
of allowing coordinate transformations to be executed unambiguously.
See:
http://www.starlink.ac.uk/~dsb/ast/ast.html#stc_demo
5. Examples
Ten validated sample documents show that the schema meets its
requirements of being able to represent the coordinate metadata for a
wide range of applications.
See also my STC web page:
http://hea-www.harvard.edu/~arots/nvometa/STC/
- Arnold Rots
- (2) How does STC fit in IVOA?
I have every confidence that STC can represent complex and rigorous points and regions. But I want to do less rigorous things in a simple way. STC forces the specification of sky position to include observer position, which is not usual in astronomy. STC conflates wavelength and sky position, also not usual. STC does not allow the use of the vernacular "J2000 coordinate frame", also not usual. I worry that the complex syntax and divergence from usual astronomical practice will make STC a niche standard only.
There are already several ways within the IVOA framework for specifying positions and regions. The cone search uses a simple string such as POS=180,20 SR=1 to represent a disk region on the sky, and the registry has used expressions like polygon(FK5, 53.29, -27.97, 53.29, -27.7, ....) and Skynode uses Region('CIRCLE J2000 182.5 -0.89 8'). Is there an intention in the IVOA to deprecate these simple notations in favour of the more complex STC system?
Finally, I note that my cursory search of the registry records (30 minutes) found very few records that have a spatial extent of greater complexity than "AllSky". I found no records with an STC representation of coverage. Can the registry really be considered an interoperable implementation of STC?
--
RoyWilliams - 30 Apr 2007
In reply to Roy's first paragraph, it is important to emphasize the difference between interfaces and metadata.
Interfaces may state explicit defaults or restrictions in: (a) how users enter information or make requests and (b) how information is presented to the users. Metadata that accompany and describe data that are being exchanged, on the other hand, should be complete and unambiguous (in the style of STC). This means that there is a requirement that we are able to perform unambiguous mapping between the two.
--
ArnoldRots - 09 May 2007
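The mapping from a terse interface request to a complete description can be illustrated with a minimal Python sketch. It parses the SkyNode-style string quoted above, `CIRCLE J2000 182.5 -0.89 8`, and makes every implied default explicit. The function name `expand_circle`, the dictionary keys, and the defaults table are hypothetical; in particular, the frame, reference position and radius unit attached to `J2000` are assumptions made for illustration, not values taken from any IVOA specification.

```python
# Hypothetical defaults that an interface might leave implicit.
# These values are illustrative assumptions, not taken from a standard.
INTERFACE_DEFAULTS = {
    "J2000": {"frame": "FK5", "equinox": "J2000", "refpos": "UNKNOWN"},
}


def expand_circle(region: str, radius_unit: str = "arcmin") -> dict:
    """Expand 'CIRCLE <system> <lon> <lat> <radius>' into explicit metadata."""
    shape, system, lon, lat, radius = region.split()
    if shape.upper() != "CIRCLE":
        raise ValueError(f"unsupported shape: {shape}")
    if system not in INTERFACE_DEFAULTS:
        raise ValueError(f"no defaults registered for system: {system}")
    description = dict(INTERFACE_DEFAULTS[system])  # copy, do not mutate the table
    description.update(
        shape="Circle",
        center=(float(lon), float(lat)),
        radius=float(radius),
        radius_unit=radius_unit,  # assumed; the terse string does not say
    )
    return description


print(expand_circle("CIRCLE J2000 182.5 -0.89 8"))
```

The point of the sketch is only that each default has to be written down somewhere once; whether that table lives in the interface specification or in a client library is a separate question.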
Referencing mechanisms within STC
There is a potential problem with STC's use of xml ID and IDREF types for identifying and referring to coordinate systems within the xml instance, when individual instances are concatenated, as might happen within a registry for instance.
This issue was discussed at length in the registry mailing list, with the thread starting at http://www.ivoa.net/forum/registry/0612/1764.htm, and in section 6.2.1 of the standard, where the conclusion of the discussion is effectively deferred by saying that the naming conventions for the IDs will have to be agreed.
I believe that for a schema that is primarily intended to be used as an imported component of other schemas it is fundamentally wrong to impose document-wide constraints that are inherited (ID and IDREF), as it restricts how STC can be used within instance documents of the inheriting schema whilst at the same time trying to make sure that the resulting documents are XML valid.
In fact, just about the only possible result of retaining ID and IDREF is that there will be many more xml fragments in the world that are all trying to describe the same physical coordinate system, but each with a unique identifier - this cannot be a good thing, and is basically equivalent to Roy's complaint above that STC is verbose in describing the common vernacular.
Something based purely on the xlink mechanism could be more flexible (because the xml parser does not validate these references) and ultimately simpler because the majority of STC instance documents will use standard coordinate systems. There is really no need to have the two level identifier system that currently is defined in STC where an <AstroCoords> uses an IDREF to the ID of an <AstroCoordSystem> which can then use an xlink:href to a standard definition - it can all be done with just a direct xlink:href at the <AstroCoords> level.
This xlink only mechanism would positively encourage the reuse of the standard URIs for coordinate systems and the publishing of new coordinate system definitions and observatory locations at well known URIs, promoting reuse and reducing verbosity.
For the smaller set of cases where a novel coordinate system does need to be defined within the STC document, there is already a convention to refer to a coordinate system defined only locally within the document using the # 'fragment' notation of URIs.
--
PaulHarrison - 04 May 2007
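The xlink-only referencing described above can be sketched in Python with the standard library. The instance fragment, the plain `id` attribute used for the local definition, and the helper name `resolve_coord_system` are hypothetical and do not follow the STC schema; only the XLink namespace URI and the library identifier `ivo://STClib/CoordSys#UTC-FK5-TOPO` (quoted elsewhere on this page) are taken as given. A reference starting with `#` is looked up inside the same document; anything else is returned as an external URI for the client to dereference or to recognise from its own knowledge of the standard libraries.

```python
import xml.etree.ElementTree as ET

XLINK_HREF = "{http://www.w3.org/1999/xlink}href"

# Hypothetical instance: element names mimic STC, the layout does not.
DOC = """
<root xmlns:xlink="http://www.w3.org/1999/xlink">
  <AstroCoordSystem id="MyLocalSystem"/>
  <AstroCoords xlink:href="#MyLocalSystem"/>
  <AstroCoords xlink:href="ivo://STClib/CoordSys#UTC-FK5-TOPO"/>
</root>
"""


def resolve_coord_system(root, coords):
    """Return ('local', element) or ('external', uri) for an AstroCoords element."""
    href = coords.get(XLINK_HREF)
    if href is None:
        raise ValueError("AstroCoords carries no xlink:href")
    if href.startswith("#"):
        wanted = href[1:]
        for candidate in root.iter("AstroCoordSystem"):
            if candidate.get("id") == wanted:
                return "local", candidate
        raise LookupError(f"no local coordinate system named {wanted!r}")
    return "external", href


root = ET.fromstring(DOC)
for coords in root.iter("AstroCoords"):
    kind, target = resolve_coord_system(root, coords)
    print(kind, target if kind == "external" else target.get("id"))
```

Because the local lookup is done by the application rather than by the XML parser, two concatenated documents that reuse the same local name do not make the combined document invalid, although the application still has to scope the lookup to the right fragment.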
As was pointed out in that discussion, the proposed solution does not really solve the problem and, more importantly, this is an IVOA-wide issue, not just an STC quirk. The problem is simply that any unambiguous association mechanism in XML documents runs afoul of the no-ambiguity requirement when those documents are included in a larger document by a concatenating service (such as the Registry). We do need a mechanism that allows unambiguous association in XML documents and the ID/IDREF pairs are intended for that purpose and fit the bill precisely. The onus is on the concatenating services to ensure that unique associations remain unique.
On the other hand, the criticism that this presents a significant burden on those services, since it requires checking all relevant schemata, is justified. What is needed is a convention that enables document readers to identify ID and IDREF attributes without having to resort to the schema.
Two candidate conventions have been discussed:
1. By signifying it in the attribute's name:
- Attributes of type ID will be identified by having their names
start with "ID_".
- Attributes of type IDREF will be identified by having their names
start with "IDREF_".
This option has the virtue of being simple.
coord_system_id="MyCoordSystem"
would become:
IDREF_coord_system="MyCoordSystem"
2. By identifying specific attributes to be of a certain type through
a specific attribute (minOccurs=0, maxOccurs="unbounded"):
- The attribute ID_type (string) will contain the name of an
attribute that is of type ID.
- The attribute IDREF_type (string) will contain the name of an
attribute that is of type IDREF.
This option has the advantage that it is backward compatible, can be retrofitted, and
makes it easier to fix documents; it's also easily extensible.
coord_system_id="MyCoordSystem"
would become:
coord_system_id="MyCoordSystem", IDREF_type="coord_system_id"
--
ArnoldRots - 09 May 2007
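The first convention can be sketched in Python: a concatenating service that relies only on the `ID_`/`IDREF_` name prefixes can keep associations unique without consulting any schema. The element names, the attribute names `ID_coord_system` and `IDREF_coord_system`, and the functions `make_ids_unique` and `concatenate` are hypothetical, and prefixing values with a per-document tag is just one possible renaming strategy.

```python
import xml.etree.ElementTree as ET


def make_ids_unique(root, tag):
    """Prefix every ID_*/IDREF_* attribute value in one document with `tag`.

    Relies purely on the attribute-name convention, so no schema is needed.
    """
    for element in root.iter():
        for name, value in list(element.attrib.items()):
            if name.startswith(("ID_", "IDREF_")):
                element.set(name, f"{tag}-{value}")
    return root


def concatenate(documents):
    """Merge several XML documents under one root, keeping associations unique."""
    merged = ET.Element("Collection")
    for index, text in enumerate(documents):
        merged.append(make_ids_unique(ET.fromstring(text), f"doc{index}"))
    return merged


# Two hypothetical records that both use the identifier "MyCoordSystem".
RECORD = (
    '<Record>'
    '<AstroCoordSystem ID_coord_system="MyCoordSystem"/>'
    '<AstroCoords IDREF_coord_system="MyCoordSystem"/>'
    '</Record>'
)

merged = concatenate([RECORD, RECORD])
print(ET.tostring(merged, encoding="unicode"))
```

Under the second convention the loop would instead read the attribute names listed in `ID_type` and `IDREF_type` on each element; the renaming step itself would stay the same.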
This issue was discussed today and the conclusion was that STC can go forward as currently defined. A more comprehensive solution involving a generally applicable convention will be worked out for a future version.
--
ArnoldRots - 18 May 2007
Data model element names
The UML diagrams and chapter names in the STC document refer to an item called AstroCoordSys, and the schema defines a (presumably corresponding) element called AstroCoordSystem. There may be other instances of this kind of minor discrepancy; I haven't checked. I don't know if this is deliberate to distinguish the data model from the schema, but it can be confusing, especially if one is attempting to decide what utype string one should use to refer to this item.
--
MarkTaylor - 14 May 2007
Thank you for pointing this out; I have corrected this in my copy of the document; AstroCoordSystem is the correct name.
--
ArnoldRots - 15 May 2007
URIs and URLs of standard libraries
In Appendix C, the definitions of standard libraries of CoordSys and
ObservatoryLocations are given along with suggested URIs for the examples. I
think that there is a mismatch between the content of the examples and the URIs
assigned. The URIs are given from the ivo: scheme for IVOA resource identifiers,
which have the registry as the only locator; however, the XML content shown would
not be legal for the registry.
The XML content shown could however be served (possibly more naturally/easily for the Observatory)
by an http server as, say, http://www.noao.edu/kpno/STCObservatoryLocation.xml. So I would recommend
- altering the coordinate system content to be registry legal.
- altering the observatory location example URI to be an http: URL.
-- PaulHarrison - 15 May 2007
I may be mistaken, but I believe the registry only provides direction to the authority ID in the identifier, which in this case is STClib; from there on that resource takes over. It does mean, of course, that STClib must be registered as a resource. In that sense it is no different from the dataset identifiers that exist under ADS's authority. As to the observatory locations, I would much rather see them all gathered in a single library, identified by a proper IVOA identifier.
-- ArnoldRots - 15 May 2007
The starting point is that the standard library components are imported into
instance documents with xlink:href. My reading of xlink is that this is a single
dereferencing, so if there were an XML parser that could understand how to
dereference the ivo: URI scheme, the best that it could do would be to fetch the
registry record itself. According to the URI conventions you could use the
fragment (#) notation to convey that you wanted only a portion of the referenced
record. With this in mind there is a way in the V1.0 registry schema to register
STCResourceProfiles - see http://wiki.ivoa.net/internal/IVOA/RegUpgradeToV10/stc.xml for an
example. The URI of the FK5 system in this case is
ivo://STClib/CoordSys#UTC-FK5-TOPO. The rest of the coordinate systems listed in
your Table 6 could also be placed into the same registry record.
I might have missed something, but I do not think that there is a field in a
registry resource that you could fill to point to the actual STC instance
(presumably at an http accessible location)
without at least abusing the spirit of the intended use of the registry metadata
items. So I think that the double dereferencing that you are assuming can
happen is not representable, as well as undesirable and unnecessary, with the
currently published V1.0 registry schema.
There is nothing to stop the observatory locations being published in the
registry in the same fashion as the coordinate systems above; however, the direct http: URL method seems to me
a potentially useful option for registryless publishing of such STC coordinate
system/location fragments, which might provide a lower barrier to people actually doing so.
-- PaulHarrison - 15 May 2007
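The single dereferencing step described above can be illustrated by splitting the identifier into the part that locates the registry record and the fragment that selects a portion of it. This is a minimal sketch using Python's standard `urllib.parse`; the function name `split_ivo_reference` is hypothetical, and no actual registry lookup is attempted.

```python
from urllib.parse import urlsplit


def split_ivo_reference(uri: str):
    """Split an ivo: reference into (registry record identifier, fragment)."""
    parts = urlsplit(uri)
    if parts.scheme != "ivo":
        raise ValueError(f"not an ivo: identifier: {uri}")
    record = f"ivo://{parts.netloc}{parts.path}"
    # An empty fragment means the whole record is requested.
    return record, parts.fragment or None


record, fragment = split_ivo_reference("ivo://STClib/CoordSys#UTC-FK5-TOPO")
print(record)    # identifier of the registry record to fetch
print(fragment)  # portion of that record to extract
```

A client would fetch the record once and then select the named coordinate system inside it, which is why all the systems of one library can share a single registry record.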
I believe I understand what Paul is getting at with regard to defining standard systems. Start by noting:
1. The use of an IVOA identifier (a URI that starts with ivo:) requires that there is an associated record in an IVOA registry (ref: IVOA Identifiers Recommendation).
2. The record must be of type Resource from the VOResource schema or a legal extension (ref: Registry Interfaces WD).
Thus, since your example does not satisfy 2., it is not sufficient for the legal use of the IVOA identifier.
In addition to the example instance document that Paul points to (stc.xml), please have a look at the VODataService schema (http://www.ivoa.net/xml/VODataService/v1.0): find the definition of the StandardSTC type, an extension of the Resource type. As Paul points out, the intended mechanism for resolving an ivo:-based xlink via the registry is a complete solution based on existing services (though, as a discussion with Arnold revealed, it needs to be better documented). Thus I would suggest that Appendix C.1. is not needed.
-- RayPlante - 18 May 2007
We have discussed this issue and the resolution is simply that Registry has put the standard coordinate systems library in place. The only things that need to be changed in the document are the examples in Appendix C.
-- ArnoldRots - 18 May 2007
Various comments on the document
- Section numbering: a six-level depth, that's really huge! And many intermediate levels are most likely not required, e.g. there is 4.2.1, followed by 4.2.1.1.1; at least one level could be dropped there. Not fundamental, but it makes the overall structure hard to read.
- 4.2.1.1.2.1 Scalar reference frame: strange to find a proposal for a "logarithmic projection" there: why only for 1D data? And why not other functions? For 2D or higher dimension, only linear (by a transformation matrix) or spherical projections are proposed.
- 4.2.2.1.2 "vextor"
- 4.4.1.1.2 there are already many acronyms, why is it necessary to add unofficial ones (IAT)?
- 4.4.1.2.2 the special Units "deg deg m" should really be removed, this is not a unit. A physical unit can only qualify a number; for a vector, the unit classically qualifies the number along one axis. The recommendation should be to assign units to the individual components.
- 4.4.1.2.2 (cont.) Why wait to add CYLINDRICAL? And some names can have any dimensionality (cartesian), but others (SPHERICAL) can only accept a unique dimensionality (3). Listing the names and the acceptable values of coord_naxes would be useful (isn't POLAR with naxes=3 identical to SPHERICAL?)
- 4.5 there is no recursion there -- it's just a composition law (the result of an operation on regions is a region). In mathematical terms it's an endomorphism.
- 4.5 (cont.) the fill_factor is really difficult to define when the region refers to a set of point sources (a large number of points has still a measure of its area equal to zero, hence would be assigned a fill_factor=0)
- Appendix C: given the highly probable extensive usage of the STClib described in Appendix C, at least a reference to this Appendix in section 4.4 would help a reader trying to understand an STC serialization
- Appendix C: as far as I know, the Hipparcos catalog (which defines the ICRS system) uses the barycentric origin and a terrestrial time. This fundamental association (TT-ICRS-BARY) does not appear in the list?
- utypes: how to refer to the STC schema via utypes could be mentioned e.g. by a reference to a note (which has to be written...)
-- FrancoisOchsenbein - 21 May 2007
Response:
- Yes, six-level deep is unfortunate, but unavoidable. You are right about 4.2.1.1.1 and it will be corrected; thank you for pointing it out.
- This is tricky; I don't assume you are advocating a log-log spatial x-y scaling. The only application I can imagine is one where one of the spatial coordinates (e.g., radius, or latitude) gets scaled differently. The schema can always be extended, so we will leave this for later when the need arises.
- Thank you - will be corrected.
- IAT is actually in use in astronomical data; that's why it is included.
- "deg deg m" may well be removed in a future version, but let's await the development of standard units.
- CYLINDRICAL could be added without effort. SPHERICAL may be 2-D or 3-D. POLAR in 3-D counts zenith distance opposite to SPHERICAL's latitude, so it is not the same.
- This may be sloppy terminology; I will remove the sentence.
- You are not calculating the fill factor correctly here. In the context of, say, a catalog, the fill factor refers to the fraction of the region that is covered by the catalog (which, one would hope, would be 100%), not the fraction that is actually covered by the sources.
- I will add a reference; thank you.
- Are you sure they are not using TDB?
- I will add a comment on utypes.
-- ArnoldRots - 11 Jun 2007
- Final bullet point on page 8 - "a Pixel Coordinate System is needed for pixilated data". Simple typo where "pixilated" should be "pixelated". This is repeated in 6.1.4 Observational Metadata and 7.2.4 Observational Data.
- 4 Design - "The detailed implementation requirements that follow from this design are presented in Section 6.2.1.". I think that all of section 6.2 is relevant here.
- 4.2.2.1.1 CoordName (and others in this section). The element in the XML schema is called "Name", and it would be helpful to make this explicit.
- 4.2.3.1 ScalarInterval and 4.4.3.1 TimeInterval - "[ScalarInterval/TimeInterval] has Boolean attributes LoLimInclude and HiLimInclude". These are actually called lo_include and hi_include.
- 4.3.3 Pixel Coordinate Area - "The Pixel Coordinate Area has the same properties are the generic Coordinate Area". Simple typo, "are" should be "as".
- 4.4.2.2.2 CoordName. It would be good to state explicitly whether this uses same mechanism as the following subsection (list of elements), or XSD list type.
- 4.5.2.4 Difference. It would be better to define this explicitly (i.e. what is calculated, and which argument is which). Perhaps something like "the portion of the first region that is not in the second region". Also, is a note needed about contained boundaries?
- 6.1.1.1 Coordinate System - "with respect to an other,". Simple typo - should be "another".
- 6.1.1.1 Coordinate System - "origin, c.q., translation". I am not familiar with c.q. (Casu Quo?), does the context make this clear enough for expected audience?
- 6.1.2.1 Astronomical Coordinate System "also space velocity of this reference position needs to be provided". Rather than "needs" could we use one of MUST or SHOULD?
- 6.1.2.2 Astronomical Coordinates - "The Time Instant (Section 4.4.2.1.1) in a Time Coordinate (Section 4.4.2.1) MUST contain: Absolute time value, in JD, MJD, or ISO-8601, as specified in Section 4.4.2.1.1.1". Perhaps add "with appropriate Unit specification".
- 6.1.2.2 Astronomical Coordinates - "A Redshift Coordinate (Section 4.4.2.4) MUST be 1-dimensional and MUST have a Spatial Position Unit and a Time Unit if it is expressed as a Doppler velocity". I think the logic here could be more clear if split into two sentences. eg. "A Redshift Coordinate (Section 4.4.2.4) MUST be 1-dimensional and MUST have a Spatial Position Unit. If it is expressed as a Doppler velocity it MUST also have a Time Unit." - if this is what is meant.
- 6.1.2.3 Astronomical Coordinate Area - "An Astronomical Coordinate Area object MAY contain: - Any number of Coordinate Intervals for each Coordinate Frame - Coordinate Intervals for the following Coordinate Frames: - Generic (Section 6.1.1.3) - Time". I don't understand what the first bullet point "Any number of Coordinate Intervals for each Coordinate Frame" is adding to the list.
- 6.1.3.3 Pixel Coordinate Area - "A lower and upper limit value of appropriate dimensionality". According to 4.3.3 Pixel Coordinate Area, it has the same properties as the generic Coordinate Area (see Section 4.2.3). As such is only one of lower and upper limit required, and the other optional?
- 6.2.1 Referencing Mechanism - "there is no obligation for the client to go and substitute an element that is referenced through Xlink (though it should always be permissible). We will provide some standard libraries (e.g., of common coordinate systems and observatory locations; see Appendix C) with a standard naming convention; a client that is familiar with such libraries may use its inside knowledge instead.". Is it permissible for a client to ignore an XLink and treat the element value as UNKNOWN? I think this is implied but perhaps should be explicit to avoid any confusion.
- 6.2.2 Versioning "The STC schema will be versioned using three numbers: i.jk". This has the consequence of limiting to at most 10 minor versions between major versions; is this intended? A second separator (eg i.j-k or i.j.k) would add in more flexibility. See also my point below about namespace and schema versioning.
- UML diagrams. Section 7.2 emphasises that we can have dual values in many situations, implying a range rather than a single value. I don't think this is reflected in the UML diagrams which label these relationships (eg CoordValue -> Coordinate) as 0..1 -> 1. Should they be changed to 0..2?
- Appendix B.9 Using a FITS File. This doesn't have a Mapping element, but 7.2.4 Observational Data, suggests that it should, saying "there is a third Coordinate metadata element in the case of pixilated data. A Mapping element will define the transformation between the Observation Location and the Pixel Space."
- XSD Schema - "targetNamespace="http://www.ivoa.net/xml/STC/stc-v1.30.xsd" ". Using the namespace URI as a pointer to the schema is generally not advised. There are a number of reasons to do with future evolution of the namespace, and multiple schemas sharing a namespace. Perhaps the simplest reason however is that it will confuse XML developers working with the standard. I would suggest using http://www.ivoa.net/xml/STC/ This does also raise the issue of versioning for the namespaces, and schemas, and instance documents. Given the comment that "All versions with identical major and minor version numbers SHALL be downward compatible. Users are encouraged to reference the STC version only by these numbers..." (6.2.2) it may make sense for the namespace to be http://www.ivoa.net/xml/STC/1.3/. However this would introduce issues with having to update any XPaths to use the new namespace any time a new version was released. An alternative is to use a version attribute on the root element of the instance documents (STCResourceProfile, ObsDataLocation etc) to identify the version. We could have each version of the schema enforce that each instance match that version by including eg. <xs:attribute name="version" use="required" fixed="1.3" /> for version 1.3, or we can provide this attribute but not mandate that it be used, which adds flexibility.
- XSD Schema - "<xs:annotation> <xs:documentation>Region is the base type for everything</xs:documentation> </xs:annotation>". This isn't within any element below schema, and seems too general an annotation to belong there - I assume it used to be within the schema elt of a region.xsd schema. Perhaps it is best moved within Region element, or removed altogether.
-- ReubenWright - 29 May 2007
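The consequence of the i.jk scheme noted in the versioning comment above can be made concrete with a short Python sketch. It assumes, as that comment does, that j and k are single digits; the class `StcVersion`, its field names, and the function `parse_stc_version` are hypothetical.

```python
import re
from typing import NamedTuple


class StcVersion(NamedTuple):
    major: int     # i
    minor: int     # j
    revision: int  # k


def parse_stc_version(text: str) -> StcVersion:
    """Parse a version written as i.jk, where j and k are single digits."""
    match = re.fullmatch(r"(\d+)\.(\d)(\d)", text)
    if match is None:
        raise ValueError(f"not of the form i.jk: {text!r}")
    return StcVersion(*(int(group) for group in match.groups()))


current = parse_stc_version("1.30")
previous = parse_stc_version("1.21")
print(current, previous)
# Same major and minor numbers would indicate compatible revisions.
print(current[:2] == previous[:2])
```

With a second separator (i.j.k) the pattern would no longer need to restrict j and k to one digit each, which is the extra flexibility the comment asks for.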
Response
Thank you for your careful reading; here are the individual responses:
- Thank you - pixilated does have a rather different meaning.
- Actually, it is Section 6.1.
- Will be fixed.
- Will be fixed.
- Will be fixed.
- Will be fixed.
- Will be clarified; boundaries are a hornet's nest.
- Will be fixed.
- Yes, casu quo; I would hope and trust this is understood.
- Will be fixed, although it is somewhat debatable since the sentence is a note on the requirement.
- No, these do not need units; the errors, etc., do, though.
- No, both are only needed for Doppler velocities, neither for redshifts.
- Although the three bullets are not entirely orthogonal, they play different roles. The first bullet says that areas may be made up of multiple intervals; the second says how they are implemented for different frames; and the third provides specifics.
- No, pixel areas do require upper AND lower limits.
- It would be permissible, but definitely not recommended, since UNKNOWN requires the client to assume a default, which might easily conflict with what is in the referenced element.
- This is an IVOA-wide issue; I am following the official guidelines here.
- I know; but see the caveat on p. 42.
- This will be reworded, replacing the Mapping Element by the Transformation Element (on p. 80).
- Like item 16, this is a matter of IVOA-wide conventions.
- Will be fixed.
-- ArnoldRots - 11 Jun 2007
Here are some additional comments that were offered (well) after the close of the RFC period (by RayPlante).
- Table 4 - Projections: I think that ZPN should be added - Zenithal Polynomial projections. The ZPN projection is now a standard projection as defined in Greisen & Calabretta (2002)
-- NicholasWalton - 18 Sep 2007
Comments from the Working Group Chairs and Interest Group Chairs
Chairs should add their comments under their name.
Mark Allen and Mark Taylor (Applications WG)
Approved with the following comments:
The STC formalism allows precise description of a wide variety of coordinate
information of the kind which is needed in the VO, and we congratulate the author for attention to detail in this work.
We note however that real
use of STC in applications will require a significant amount of effort to
develop libraries and functions for coordinate transformations, and to ease
its implementation and use within tools. The standard looks sufficiently
complicated that it appears as though it will be very hard to take a general
STC description and do something useful with it. Questions like "tell me the
name of the reference frame" may be straightforward to answer by searching for the
relevant XML attribute. But performing coordinate transformations, calculating
distances, answering questions about coverage etc., where different coordinate
systems are involved, will be difficult.
It seems unlikely that applications could attain full STC compliance. The
flexibility of having a 'CustomFrame' is one aspect of this, but just
considering the list of 'Standard Reference Frames' (Table 3), it would
require significant effort to write a library which could perform general
transformations between all coordinate systems specified by STC. The document
does acknowledge that 'there are a few common coordinate systems that will
serve the bulk of our data and we will make an effort to make their use as
simple as possible'. We highlight the need for these ideas to be expanded and
clearly explained 'somewhere' to guide implementation of STC for the common
subset of coordinate systems. It is important that these common cases do not
require too much overhead imposed by the flexibility of the STC structure. It
is difficult to assess this from the document.
The reference implementations provide some example usage, but are somewhat
weak in terms of coordinate transformations. The comment concerning the
reference implementation of the JHU Footprint Service, "Coordinate
transformation are not supported, although it would not be difficult to add"
seems to be overstated, as transformations for example between systems with
reference frames of URANUS_G_III and SUPER_GALACTIC may be quite challenging.
We approve the current STC because it provides the precise description of
coordinates. From an applications point of view there is still much work
required for general libraries which can parse a general STC description and
perform astronomically non-trivial operations on it.
Minor comments:
Section 5.
meat ?
'Such a Projection or Mapping metadata object is very much like the meat
in the FITS WCS'
Sec 6.1.2.3:
I think the bullet point nesting has lost a negative indent
near "All non-Generic Coordinate Areas MUST...". But I could be
wrong.
Appendix A says:
"The subsequent pages show the design of the Coordinates and
Coordinate Area classes, though these represent the design of
versions 1.2 and 1.3, and are therefore slightly outdated at
this time; in particular, the Pixel Space is missing."
It would be better if these were updated for an IVOA REC.
B.4A ucd="whoknows" is not a good thing to see in an IVOA document.
Response:
There is no question that libraries need to be developed (we intend
to start this at CfA once we have the resources) and that applications
will start simple, to be expanded as the need arises and resources
are available. Something like a transformation between URANUS_G_III
and SUPER_GALACTIC may come in at the tail end of this development.
Would "substance" be clearer than "meat"?
No, there should not be a negative indent.
It would be better if those UML diagrams were updated, but it will
have to wait for a next version.
I'll try to think of a less offensive alternative for "whoknows".
-- ArnoldRots - 16 Aug 2007
Christophe Arviset (TCG vice Chair)
In general, the document offers a very exhaustive description of the STC metadata for the VO. Its intent to cover all areas may make it too complex, but my understanding is that specific bits of it can be used, not necessarily the complete set. The section 7 conclusion and usage notes as well as the examples given in Appendix B are very useful to understand how to use this standard.
Nonetheless, the document contains the description of the metadata, which is fine, but it also covers some other areas which I don't believe belong to STC metadata but rather to other IVOA standards, in particular:
- 4.5.2 operations (probably more linked to ADQL specs)
- 7.2.2 query constraints (probably more linked to ADQL specs)
- 7.2.3 catalog entry (probably more linked to other DM)
- 7.2.4 observational data (probably more linked to other DM)
I believe these sections (and their corresponding examples if any) should be removed or at least a clear reference to the other IVOA standards should be mentioned.
The document describes all possible STC metadata but does not indicate how to "convert" one coordinate / coordinate system to another. So if 2 services implement STC, but using different coordinates / coordinate systems, how can these 2 services be interoperable?
Response:
Since a Region may be a Shape or the result of specific operations
performed on one or more Regions, 4.5.2 is legitimately part of
the definition of a Region.
Section 7.2 states the contexts in which the STC metadata function.
Therefore, providing a full metadata description for those contexts
does belong in this document.
This document identifies various astronomical coordinate systems,
it does not define them. There are several coordinate transformation
packages around, properly documented, that can be used for performing
coordinate transformations.
-- ArnoldRots - 16 Aug 2007
Matthew Graham (Grid & Web Services WG)
I approve this document with the following comments:
(1) It is not clear to me what the precise scope of STC is. Is it just for representing astronomical coordinates or is it for any coordinate system - is it, in fact, the Coordinate data model? Can I legitimately report my data set in some multidimensional phase space in STC? Can STC be used for theoretical simulations to represent Eulerian or Lagrangian systems and should it be?
(2) I would have liked at least one reference library supporting region handling and coordinate transformations that I can install on my laptop and use as one of the implementations.
(3) It is not clear to me that STC can handle complex coordinate transformations: e.g. defining a coordinate system using cosmological comoving coordinates from the known RA, Dec, redshift system. How would I specify the cosmological model I am using and the functional form of the metric?
Response:
(1) Yes (to the last part of the question); yes; and yes, with
extensions.
(2) Me too; and we intend to build that.
(3) This may need an extension. The main thing is, though, that
the standard is defined in such a way that these extensions are
possible - and they are.
-- ArnoldRots - 16 Aug 2007
Bob Hanisch (Data Curation & Preservation IG)
I approve this document. This is perhaps the most complicated of the VO standards, and only through more extensive use will we be able to ascertain properly how it needs to evolve further. We should move forward and deal with those revisions through the standard process.
Gerard Lemson (Theory IG)
I approve of the model, but I must admit this is partly because we need a standard for STC "right now" and there is no alternative.
I will not comment on the contents too much as I am not an expert. But it is clear that Arnold has taken
great care in including all possible situations and has produced as comprehensive a model as he could.
I want to congratulate him on that.
My main comments are with the structure of the presented model.
I find it worrisome that the only result one can see as a standard that allows applications to be built against it etc
is the XML schema. It seems to me that we can not consider the UML diagrams as normative as they are way less detailed
than the schema, even though the latter is supposed to be only one serialisation of the model.
It is unclear how the various UML diagrams have to be interpreted. The colored ones are surely closer to an
understandable model than the white ones (which can be removed) and do indeed allow one to understand the structure
of the model better than reading the XML schema does.
I refer further to my TCG comment in the spectrum datamodel for more comments on the use of UML in the DM working group.
One problem I have with the schema is that it is very monolithic. It would be nice if the schema could be
modularised in separate namespaces, with their interdependency organised as a DAG, so that when wanting to
use only part of the model, say region, one needs not import all the stuff one is not interested in.
This has several advantages. As a natural first step in validating an XML document one should check it against
the corresponding schema. If the schema is a smaller module it is now not necessary to import the whole schema to do so.
Similarly, code generators, which form the simplest way to create an XML parser, do not need to create
400+ different classes corresponding to all the different complex and simple types and root elements if one
needs only a (small) subset of the whole model.
Another aspect I cannot fail to mention (as I did before) is the design choices Arnold made in the XML schema.
His use of ref="" in element definitions instead of name=""+type="" has as a consequence that way more root elements
have to be introduced than a straightforward mapping from his UML diagrams would imply.
As an example, only because of this design choice do we now have root elements named Region and Region2,
as well as Union and Union2 etc., for they are to be referenced by the two components in a union or an
intersection etc. I counted that there are about 300+ root elements, to the 100+ complex types.
Another potential problem with this choice is that a strict XML validation will allow STC documents that consist
of a single element . This may be ok in certain contexts (?), but not in others. Any other schema
that imports the current, non-modular STC schema will automatically validate such a document as well.
This makes schema based validation hard and requires extra work on the side of potential users of the STC model.
This same discussion has been held in the registry and votable working groups, and in both it was decided
to stay away from this design for the reasons as mentioned above. It might be nice if an attempt could be made
to have the STC schema conform to these same design choices.
This said I do also think that we need to start using the model and not hesitate to change it in future versions.
Response:
This has been a long-standing discussion. Breaking the schema up
(as it was originally, for development purposes), became unwieldy,
too. The large number of root elements is unfortunate, I agree.
But the needs of this schema are somewhat different from those of the
Registry. For STC it was important to take advantage of certain
aspects of the schema definition that allowed us to enforce a number
of consistency issues. Had we not done that, the burden on the
potential users would have been at least as large, albeit in a
different area. I grant you that this is a design decision that is
based on judgment concerning the trade-offs.
-- ArnoldRots - 16 Aug 2007
Anita Richards in agreement with Mireille Louys (Data Models WG)
I am very happy to approve STC as we need to carry on using the model and also referring instrument/data handling tool developers to its definitions. I have some comments which could be attended to in a future version. In some places it may be that I have overlooked or misunderstood what is already in the document, if so I apologise but maybe that indicates a need for clarification in those places. In general it's quite verbose. This can make it difficult for people who are slow reading English (although it is beautifully written!). A summary of the meat of Sections 1-3 in a couple of paras might help.
1) Specifying the handedness of axes
Although RA and Dec and some other coordinate systems have an unambiguous
'handedness', this is not always the case. For example, if I need to use the
positions of telescopes in a synthesis array, these may be given in
array-centred coordinates in xyz metres offset from a reference position, but
some arrays use left-handed and some arrays use right-handed coordinates.
Bitter experience has shown that expecting users or publishers to adjust to a
common convention does not work; you have to be able to specify the
convention.
I think that GEO_D as in Table 3 p.20 covers this, or if not Generic 6.1.1 -
in any case, I am not sure how to specify handedness.
(The reference position should of course be in a well-defined conventional
system).
2) In FK4, you need to be able to specify the epoch as well as the equinox for
full accuracy and I don't see how to do that.
3) In the long term, generic coordinates should be allowed to be more than 3
dimensional, e.g. sometimes cosmologists want 4 interchangeable space-time
axes, or even more dimensions.
4) 4.4.2.2.4 error in RA always in units of arc angle... is this essential? My
experience e.g. with people who publish data to Astroscope, is that people
won't convert. Hence it would be more foolproof to insist that units are
given or at least allow them, so that if the errors are in time-like seconds
there is no confusion.
5) 6.1.3.1 Is there some place to specify the units for Scaling?
AnitaRichards for Mireille
Response:
Sections 1-3 were gradually expanded in response to requests for
more clarification. Maybe I could add something to the abstract.
(1) You are absolutely right. It is not hard to fix: I would
propose to add an optional "handedness" attribute to the Flavor
element that can be "RIGHT" or "LEFT".
(2) Epoch is there, actually, as an optional attribute.
(3) Yes, this is correct, and it should be expanded in a future
version.
(4) Time-like units for angle coordinates are not allowed - at
least not at this time.
(5) Yes.
-- ArnoldRots - 16 Aug 2007
Keith Noddle (Data Access Layer WG)
A formidable body of work! I recommend approval; some details need attention (how could it be otherwise?) but no show stoppers.
Francois Ochsenbein (VOTable WG)
First the document is rich, and tries to propose a model that is potentially able to describe any possible frame -- which also results in a quite complex document, especially for the 99% users who just expect to quote accurately the space and time frames in which their data are expressed. We badly require a standard to specify this, thanks to Arnold for this document.
From the comments posted during the RFC period, I understand that many points will be fixed -- does it mean that what we are approving is a virtual final document including these fixes? Will the updated document be posted somewhere before its insertion as an "IVOA Recommendation"?
And also about the STC-S and/or STC-X and/or utypes associated documents -- will these documents be just IVOA notes or will they complete the STC document?
Finally I would like to insist one more time on the importance of a coherent usage of units across all IVOA conventions -- and I'm sure I'm not the only one worried by this problem. In particular a unit can only be associated with a scalar value; introducing a unit associated with a 3D-vector (as done in section 4.4.1.2.2) introduces an unacceptable confusion with the measure of a 3D volume, which is a fundamentally different quantity.
Response:
The comments from the RFC period have been included in the current
version of the document (14 June 2007) and schema.
The STC-X and STC-S documents will be re-issued after this document
has been finalized; they will remain Notes, serving as explanatory
papers.
I agree that usage of units should be uniform; once we agree on
this, I will update STC to comply.
-- ArnoldRots - 17 Aug 2007
Pedro Osuna (VOQL WG)
The document contains a lot of information on the description, from the Astronomical point of view, of Space Time Coordinates, and Arnold has done a great job in compiling all this information.
In general what impresses me most, and what I would consider the core of the doc and the part that should be taken care of and expanded over, is the whole of Chapter 4, where the description of the different attributes of possible Space Time Coordinates is done.
We know of the need to release the STC so we can start making use of it and learn about its good and bad points, however I still have concerns on the scope and usability of the current document for interoperability between different parties within the VO. The reasons are the following:
1. What does the "STC Data Model" model? A POSITION in the Sky? A REGION in the Sky? Both? Why? Why should REGION be defined in the STC itself and not in a protocol making use of STC Coordinates to build REGIONS? Or is it defining QUERIES, as it says in the examples? (c.f. example B.5). Why? Or Catalogues (c.f. example B.8 and STC element called CatalogueEntryLocation)
In my opinion, the problem is that it is not clear what pertains to the MODEL and what to how to make use of it in different applications. I think I have already expanded about this in other docs, so I won't do it again here.
2. A lot of emphasis throughout the document is given to "How to describe" a set of data using STC in XML, but never on "How to discover" making use of STC. In particular, from the point of view of the VOQL, it is not clear at all how the STC will be used. Arnold said that the answer is in the STC-S. However, this is again a serialisation of the model, and reading the STC-S document I have seen that it even imposes restrictions in the order in which a certain "phrase" or "subphrase" must be written. Syntactic rules of this type do not belong to the STC, whether S or X, but to the ADQL language itself. The STC should just give the user the possibility to build whichever ADQL query using the STC Model Attributes.
In this sense, I still can not figure how to make a simple query where I make use of an FK5 RA of 123.4. Trying to read the UML doc I should write something like:
GenCoordinate.2DCoord.Value=123.4 & STCBase.Frame.?.SpaceRefFrame=FK5 & (...)
Looks like quite a big and obscure (still not clear to me and many others) thing just to describe an attribute called "Right Ascension in FK5".
Moreover, despite the fact of the main STC doc being full of examples (more on this later), no proper example is given for the "string" case.
3. The UML diagrams are hardly readable. They are put in the Appendix section, while for me they should form the Core of the document, matching the great part of chapter 4 as I was saying before, while the serialisations should go in appendices, or examples, or simply disappear. The UMLs are also overcomplicated. In the diagram on page 47 (STC Astronomical Coordinate Systems) there is hardly any attribute in any of the "boxes", which normally represent classes. Nearly all of them are empty boxes which will contain probably a value. Too many aggregations as opposed to classes with attributes make the UML clumsy and very difficult to read and implement. I question that implementors currently outside the VO and wanting to play the game will be willing to implement such a complex model as in pages 38 to 49.
4. Examples B.5 and B.8 are counterproductive and should be removed from the document. They give the impression that everything can be done with the STC: from Querying archives to Querying and handling Catalogues. This is something being defined in other parts of the VO.
5. The doc makes reference to other five documents. This is an overkill. Their content should either be included in the doc if relevant, or be relegated to the "references" section otherwise.
I also agree with some other people that the document should be open to other hands and views, as the rest of IVOA documents. Sometimes sixteen eyes see more than two, probably not on the content (knowledge about Space Time Coordinates in Astronomy, of which Arnold is probably the most knowledgeable anywhere) but on the other aspects.
Having said all this, I would recommend the document for approval to Recommendation status if references to queries and catalogues are removed from it, in particular the aforementioned examples B.5 and B.8.
Response:
(1) STC provides a model that defines the metadata required to specify
all relevant information on coordinate axes. So, it includes ways to
specify coordinate axes, coordinate systems, positions, areas, and
volumes. It further identifies four specific contexts in which
coordinate metadata needs to be provided and shows a preferred way
to do so.
(2) STC-S pretends to be nothing more than a string serialization
of STC. It was developed after discussions with Jim Gray to enable
applications that employ pure string serialization (such as SQL),
where XML constructs would be alien, to incorporate STC elements.
In ADQL this could be done by defining a boolean function STC_S_contains that takes as arguments an STC-S string and a position:
select ra, dec from mytable
where STC_S_contains("Circle ICRS 123 45 0.5", ra, dec)
This could be extended later to a suite of functions that can handle other STC functionality, as well, but I would imagine that this would do for now.
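As a rough illustration only (not part of STC, STC-S, or ADQL), the behaviour such a function might have for the Circle case can be sketched in Python. The function name stc_s_contains, the assumption that the string has exactly the form "Circle <frame> <ra> <dec> <radius>" in degrees, and the assumption that the position is already in the same frame as the string are all hypothetical choices made for this sketch.

```python
import math

def stc_s_contains(stc_s: str, ra: float, dec: float) -> bool:
    """Hypothetical sketch: is (ra, dec) inside an STC-S style circle?

    Assumes the string is "Circle <frame> <ra0> <dec0> <radius>", all
    angles in degrees, and that (ra, dec) is given in the same frame.
    No coordinate transformation is attempted.
    """
    shape, _frame, ra0, dec0, radius = stc_s.split()
    if shape != "Circle":
        raise ValueError("this sketch only handles Circle")
    ra0, dec0, radius = float(ra0), float(dec0), float(radius)

    # Angular separation on the sphere via the haversine formula.
    phi1, phi2 = math.radians(dec0), math.radians(dec)
    dlam = math.radians(ra - ra0)
    a = (math.sin((phi2 - phi1) / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    separation = math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))
    return separation <= radius
```

A real implementation would also have to handle the other STC-S shapes and check the frame named in the string against the frame of the table columns.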
(3) I know it is complicated. But it would have looked worse if all attributes were included. The great advantage of the aggregations,
as opposed to the use of attributes, is that the schema is extensible
with custom extensions.
(4) I respectfully disagree. They show how the coordinate metadata
can be provided for these use-contexts.
(5) This may be a matter of preference or style. I don't think it
is a stumbling block, though. These documents serve as explanatory
add-ons to this document.
-- ArnoldRots - 28 Aug 2007
Ray Plante (Resource Registry WG)
I recommend the approval of this document to Recommendation status; however, I would strongly encourage the following changes:
- Clarifying statements are made about the as-yet non-existent documents listed in Section 1. In particular, the answers to the following questions should be made clear:
- Are any of these documents a constituent of the standard being approved or is this document complete in and of itself?
Preceding the list is the statement that STC-X "is an integral part of this standard." What does this mean?
- What kind of documents will those on STC-X and STC-S be? Standards that go through the process, or simply IVOA Notes?
- Which document will standardize the STC-X schema: this one or the forthcoming STC-X document?
- What is the purpose of the STC document on Arnold's web site? This document is described as "full documentation" for STC (whereas the STC-X document just "documents STC-X"). Is it authoritative? Or just helpful? Either way, if it is important enough to reference as a related document, it probably should be published as an IVOA Note, rather than on a private web site.
- A clarifying statement is added indicating whether the UML representation should be considered part of the standard, or is just provided for clarification.
- The UML Representation is regenerated, edited, annotated, or whatever to make it consistent with the text in the main sections.
The introduction to Appendix A admits the UML diagrams are "slightly outdated at this time", and I found this a hindrance to understanding the model.
This document may not be perfect. In particular, I'm concerned that
the semantics of the model have not been sufficiently spelled out. I
would have preferred to have seen for every term in the model a clear,
semantic definition (i.e. that says what it means). This document
emphasizes structure and syntax. I'm also concerned about the lack of
a reference library (that can do at least some of the things that Mark
& Mark mention above); more information about the model is probably
needed to know how to use the STC information in such a library.
Nevertheless, I think STC-X is understood well enough that people are
beginning to make use of it. I think the most important thing for
moving things forward, is to get the standard finalized so people can
start using it.
Finally, if Arnold finds himself making any last minute revisions, perhaps he could consider my additional comments that really should have been entered during the RFC period. I certainly understand, though, if it's too late in the process to take some or any of them up.
Response:
- Sub-items:
- The standard is contained in this document and the XML schema. The comment that STC-X is an integral part refers to STC-X as the XML schema implementation of STC (as opposed to the string implementation STC-S), not the note that will further clarify that implementation (serialization).
- Therefore, the STC-X and STC-S documents will be Notes intended to clarify these two serializations.
- See item 1.a.
- This "full documentation" is the 500-page, or so, HTML document that XMLSpy generates for the schema. It provides full documentation of the schema, but is rather bulky and at this point we are not in the habit of providing this level of documentation with IVOA schemata. However, if it is deemed useful, I'd be happy to upload a tarfile.
- The UML diagrams are for clarification.
- I know, I know. At some point I will try to fix it, when I have more time.
- A glossary would not be a bad idea. In the current form of the document, though, Section 6.1 might be helpful in this regard.
- I wasn't aware, yet, of this extra page, but will look at it.
-- ArnoldRots - 21 Aug 2007
RayPlante responds...
- 1.d. does not quite answer my question, is it authoritative? That is, is that the document that controls what is compliant and what is not? I gather from your other comments that the answer is no. I would suggest not referring to it as "_the_ full documentation" but rather a "detailed documenting of the schema based on automated diagram and documentation generation from XMLSpy."
- Ultimately what I was hoping for was not just answers to my questions but some clarifications on these questions added to the document.
-- RayPlante - 10 Sep 2007
Andrea Preite-Martinez (Semantics WG)
I congratulate the author for his remarkable work. I approve it for Recommendation.
Roy Williams (VOEvent WG)
It is not clear to me what, precisely, an STC instance is actually describing. It is a point in the sky, it is the orbit and/or ephemeris of a comet, it is the coverage of a survey, it is a query on a catalog. In spite of the great detail of the STC document, it does not mention relationships with other IVOA standards, does not talk of translations or equivalences, does not define where STC should and should not be used in the IVOA context. It is difficult for me to accept a recommendation that covers SO MUCH ground, but has had only ONE HAND in its creation.
However I see no practical alternative, so I enthusiastically recommend STC as an IVOA standard.
Having said this, two points that I believe should be corrected are:
(1) Some of the examples in the document have Schema Location at Harvard. These should be copied to ivoa.net and the examples changed.
(2) Use of textually meaningful ID/IDREF structures in IVOA protocols is now deprecated, and the STC examples do not show this. This line for example from Appendix C:
<AstroCoordSystem id="TT-ICRS-TOPO" xlink:type="simple"
xlink:href="ivo://STClib/CoordSys#TT-ICRS-TOPO"/>
implies that a consumer of STC can detect the coordinate system by checking the coord_system_id, by splitting that into parts TT, ICRS, TOPO. But this is not true, since this token may have been changed in the registry. The correct check is on the href attribute of the xlink, and the example should change to imply this:
<AstroCoordSystem id="b57ccd870cf4b" xlink:type="simple"
xlink:href="ivo://STClib/CoordSys#TT-ICRS-TOPO"/>
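To illustrate the point about checking the href rather than the id, here is a minimal Python sketch of a consumer reading the xlink:href attribute. It uses only the standard library; the xmlns:xlink declaration is added so the fragment parses on its own, the STC namespace is left out for brevity, and the helper name coord_system_ref is hypothetical.

```python
import xml.etree.ElementTree as ET

# Standard W3C XLink namespace URI.
XLINK = "http://www.w3.org/1999/xlink"

# The fragment from above, with the xlink prefix declared so that it
# is well-formed in isolation (the STC namespace is omitted here).
doc = """<AstroCoordSystem xmlns:xlink="http://www.w3.org/1999/xlink"
    id="b57ccd870cf4b" xlink:type="simple"
    xlink:href="ivo://STClib/CoordSys#TT-ICRS-TOPO"/>"""

def coord_system_ref(xml_text: str) -> str:
    """Return the xlink:href of the root element, ignoring its id."""
    elem = ET.fromstring(xml_text)
    # ElementTree exposes namespaced attributes as "{uri}localname".
    return elem.get(f"{{{XLINK}}}href")

print(coord_system_ref(doc))
```

The id value plays no role in the lookup, so it can be any arbitrary string.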
Response:
This document defines the metadata items and their structure that
specify coordinate properties in the VO. It identifies four major
contexts in which these metadata play a crucial role and it shows
how the metadata description can be constructed in these contexts,
in a form that is preferred from the perspective of STC. However,
the precise syntax of the incorporation of the STC metadata in the
various VO interfaces and applications is left to their controlling
documents.
- I will correct that and may try to use default schema location rather than explicit version. (the problem with all of this is, of course, that examples need to be validated before posting, both the example and the schema, which in turn means that the definitive location of the schema cannot be used at that time)
- I am puzzled by this. I don't think use of meaningful IDs is deprecated (it can be very helpful!), but users need to be aware that these are in principle arbitrary strings. May one not assume that users who create XML documents based on a schema are aware of the XML grammar?
-- ArnoldRots - 21 Aug 2007