STC version 2.0
Goal
Why STC-2.0?
Version 1 of STC was developed in 2007, prior to the development and adoption of vo-dml modeling practices. As we progress to the development of vo-dml compliant component models, it is necessary to revisit those models which define core content. Additionally, the scope of the STC-1.0 model is very broad, which makes a complete implementation, and the development of validators, very difficult. As such, it may be prudent to break the content of STC-1.0 into component models which, as a group, cover the scope of the original.
This effort will start from first principles with respect to defining a specific project use-case, from which requirements will be drawn, satisfied by the model, and implemented in the use-case. We will make use of the original model to ensure that the coverage of concepts is complete and that the models will be compatible. However, the form and structure may be quite different. This model will use vo-dml modeling practices, and model elements may be structured differently.
Context and Scope
Measurement: Describes measured or determined data
- associates the coordinate value with errors
Coordinates: Describes the coordinate domain space
- the coordinate space; axes and domain ranges
- coordinate frames with metadata describing the origin and orientation of the coordinate space
- a general model for specifying coordinate values within the coordinate space
- simple specialized coordinates for the most common cases
- coordinate systems associating related coordinate frames
Transforms: Describes the mechanism to define data as a function of other data, i.e. to transform data from one 'frame' to another
- atomic transform operations
- operations which combine operations into sequences; either in series or in parallel
- operations which facilitate dimensional manipulation
- add, delete, duplicate dimensions
- shuffle axis order
Participants
domain experts: Jim Bosch (LSST), Ian Evance (SAO);
ArnoldRots (retired)
data modeler:
MarkCresitelloDittmar
editor(s):
MarkCresitelloDittmar
author(s)
- Coords: ArnoldRots, GerardLemson, OmarLaurino
- Trans: ArnoldRots, DavidBerry, StevenCrawford, NadiaDencheva, PerryGreenfield, TimJenness, OmarLaurino, StuartMumford, ErikTollerud
- Meas: ArnoldRots, GerardLemson, OmarLaurino
contributor(s):
Many thanks to those who contributed to the quality of the models through review and assessments. In particular,
- MarkusDemleitner - thorough reviews with attention to usability and impact on users/clients of various representations
- AdaNebot - review and assessment in the context of TimeSeries data
- FrancoisBonnarel - review and detailed assessments of Transform model
- LaurentMichel - using the models in a context other than their target (Mango model), organizing and participating in the DM workshop to demonstrate usability of the models, and implementation work.
Use cases
1) The primary use case for this work is in support of the CubesDM
- The focus of the CubesDM is to represent N-Dimensional data, both pixelated images and sparse data cubes. With a common data model, one or more Mappings may be applied, associating data elements with model concepts. This annotated dataset is the key to interoperability of the data products themselves, facilitating a wide array of model-based science threads.
- The following summarizes the concepts and criteria from the CubesDM which are relevant to these models.
- General
- knowledge of the pixel and physical domain spaces provided at a high level
- definition of the domain space includes the following criteria
- dimensionality (typically 1,2 or 3 for physical domain), pixel domain may be of any dimension
- axis configuration (for spatial domain which has >1D). The most common configurations for astronomical data are Cartesian and Spherical, but others may be used as well.
- domain range along each axis, typically +/- Inf, but may be limited due to physical constraints (e.g. physical size of a detector, sensitivity limitations, etc)
- association with additional metadata further describing the nature of the domain space ( Frame ). This is especially true for the Spatial and Temporal domains, but may apply to others as well.
- reference position (location of origin)
- reference frame (orientation of the domain space)
- planetary ephemeris
- equinox
- Pixelated Image Cube
- complete specification of pixel coordinate domain; number of axes, number of pixels per axis
- image axes
- in pixel domain, are a binned coordinate space with integerized values (pixel indexes)
- mapped to various 'physical' coordinate spaces via transform operations
- any combination of pixel axes may be involved in transform to any given 'physical' space
- any pixel axis may be involved in more than one mapping
- mappings often involve multiple steps executed in sequence
- mappings may define a progressive migration in coordinate space (e.g. pixel -> ccd -> detector -> sky -> wcs )
- intermediate stages may or may not be explicitly defined. Therefore, mappings must be stackable in series.
- transform operations should be flexible in covering the n-dimensional space. e.g. Application of Scale operations to 1D, 2D, nD axes.
- pixel axis mappings are typically to a continuous domain, but may also be to a discrete domain such as Polarization state.
- image cubes may have any number of dimensions, but are typically separable into co-dependent axes of 1, 2, or 3 dimensions.
- spatial domain typically 2-3 dimensions
- other domains (time, spectral, polarization), are typically 1 dimensional
- image data value is typically given in a physical domain, but may itself be mapped to other domains
- Sparse Cube
- data axes cover a wide array of physical domains including, but not limited to Spatial, Temporal, Spectral, Polarization,
- individual domains may be represented multiple times in different frames ( ccd, detector, sky; pha, energy )
- data values may have associated errors
- typical error forms include: symmetric( +/- a ), asymmetric( -a:+b ), interval ( a:b ), matrix
- for multi-dimensional: errors could also be represented by a shape (eg: elliptical, polynomial )
- ie: associated errors may be separable or correlated
- quality indicators:
- global status, typically numeric
- bit array, where each bit is associated with a particular quality state
- data axes may be virtual, defined as a mapping from other data axes
- here, the originating space is not pixelated, but an arbitrary space.
- axes involved in a mapping need not be associated with the same physical domain.
- X,Y = Map(x,y,temp); Transform with spatial and thermal dependence
- dimensionality may change between operations
- Physical Data (Observables)
- Our initial focus is on the following domains which are frequently included in astronomical data cubes. Domains: Spatial, Spectral, Temporal, Polarization.
- Spatial
- Cartesian space: chip, detector, sky
- Spherical space: Equatorial, Ecliptic, Galactic, LonLat
- Time
- 1Dimensional: JD, MJD, ISOTime, TimeOffset
- Polarization
- Discrete space: Polarization states (Stokes, Linear, Circular, Vector )
- Spectral
- 1Dimensional: energy, frequency, wavelength
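As a rough illustration of the error forms and the bit-array quality indicator listed for the Sparse Cube case above, the sketch below models symmetric, asymmetric, and interval errors as small Python classes and decodes a quality bit array. All class names, the `bounds` method, and the bit assignments are hypothetical; none of them are taken from the Measurements model itself.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Symmetric:
    """Symmetric error: value +/- radius."""
    radius: float

    def bounds(self, value: float) -> Tuple[float, float]:
        return (value - self.radius, value + self.radius)


@dataclass(frozen=True)
class Asymmetric:
    """Asymmetric error: value -minus : +plus."""
    minus: float
    plus: float

    def bounds(self, value: float) -> Tuple[float, float]:
        return (value - self.minus, value + self.plus)


@dataclass(frozen=True)
class Interval:
    """Interval error: absolute range lo:hi, independent of the value."""
    lo: float
    hi: float

    def bounds(self, value: float) -> Tuple[float, float]:
        return (self.lo, self.hi)


# Hypothetical assignment of quality states to bit positions.
QUALITY_BITS = {0: "saturated", 1: "cosmic_ray", 2: "edge_of_detector"}


def decode_quality(flags: int) -> List[str]:
    """Return the names of all quality states whose bit is set in flags."""
    return [name for bit, name in sorted(QUALITY_BITS.items())
            if flags & (1 << bit)]
```

Each error form answers the same question (what range does the value lie in) through one method, which is one way a generic client could use errors without knowing the specific form. Correlated forms such as matrix or elliptical errors are not covered by this sketch.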
2) Transform Interoperability
An implementation project focused on the Transform model has been undertaken by members of the LSST and STScI communities to evaluate the usability and applicability of the model to their missions. The focus of this project is to exercise the Transform model through a workflow consisting of:
- serialization in YAML of complete WCS metadata, including source/target frames and the various Transform operation sequences between them.
- the generation and passing thereof between two Transform library implementations
- This use case emphasizes the workflow and combination of atomic operations.
- combining operations in parallel to cover the dimension space
- combining operations in series to accomplish multi-stage mappings
- management and direction of axes through the operation sequence, for example:
- duplicate axes x and y to send pair into 2D-Polynomial transforms, generating x',y'; in reverse direction, select axes 3 and 2
- from 4D axis set, send axes 1,3 into operation A, axis 2 into operation B, axis 4 into operation C
- send 2D axis set into 3D operation, adding axis 3 with default value.
- handling of both forward and inverse operations
- for operations with no natural inverse, must be able to assign (optionally) an independent operation spec to be used in that direction.
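The workflow items above (series and parallel combination, axis routing, and an independently assigned inverse) can be sketched in a few lines of plain Python. This is a minimal sketch under stated assumptions: the `Op` class and the `series`, `parallel`, `permute`, `scale`, and `shift` helpers are hypothetical names invented for illustration, axes are 0-based here, and nothing below reflects the API of AST, GWCS, or the Transform model.

```python
import math
from typing import Callable, Optional, Sequence, Tuple

Coords = Tuple[float, ...]
Func = Callable[[Coords], Sequence[float]]


class Op:
    """Hypothetical operation with n_in inputs, n_out outputs, optional inverse."""

    def __init__(self, n_in: int, n_out: int, forward: Func,
                 inverse: Optional[Func] = None):
        self.n_in, self.n_out = n_in, n_out
        self._forward, self._inverse = forward, inverse

    def __call__(self, coords: Sequence[float]) -> Coords:
        return tuple(self._forward(tuple(coords)))

    def invert(self, coords: Sequence[float]) -> Coords:
        if self._inverse is None:
            raise ValueError("no inverse assigned to this operation")
        return tuple(self._inverse(tuple(coords)))


def scale(f: float) -> Op:
    return Op(1, 1, lambda c: (c[0] * f,), lambda c: (c[0] / f,))


def shift(d: float) -> Op:
    return Op(1, 1, lambda c: (c[0] + d,), lambda c: (c[0] - d,))


def series(*ops: Op) -> Op:
    """Apply ops one after another; the inverse runs them in reverse order."""
    def fwd(c):
        for op in ops:
            c = op(c)
        return c

    def inv(c):
        for op in reversed(ops):
            c = op.invert(c)
        return c
    return Op(ops[0].n_in, ops[-1].n_out, fwd, inv)


def parallel(*ops: Op) -> Op:
    """Split the axis set among ops, each consuming its own slice."""
    def fwd(c):
        out, i = [], 0
        for op in ops:
            out.extend(op(c[i:i + op.n_in]))
            i += op.n_in
        return tuple(out)

    def inv(c):
        out, i = [], 0
        for op in ops:
            out.extend(op.invert(c[i:i + op.n_out]))
            i += op.n_out
        return tuple(out)
    return Op(sum(o.n_in for o in ops), sum(o.n_out for o in ops), fwd, inv)


def permute(forward_order: Sequence[int], inverse_order: Sequence[int]) -> Op:
    """Select (and possibly duplicate) axes; each direction has its own order."""
    return Op(len(inverse_order), len(forward_order),
              lambda c: tuple(c[i] for i in forward_order),
              lambda c: tuple(c[i] for i in inverse_order))


# From a 4D axis set: axes 0,2 go to A, axis 1 to B, axis 3 to C.
op_a = parallel(scale(2.0), scale(2.0))
pipeline = series(permute([0, 2, 1, 3], [0, 2, 1, 3]),
                  parallel(op_a, shift(10.0), scale(0.5)))

# Duplicate (x, y); in the reverse direction pick axes 2 and 1 of the four.
duplicate = permute([0, 1, 0, 1], [2, 1])

# No natural inverse in general, so one is assigned explicitly.
square = Op(1, 1, lambda c: (c[0] ** 2,), lambda c: (math.sqrt(c[0]),))
```

Keeping the forward and inverse axis orders as separate arguments to `permute` is what lets a duplicated axis state which copy is used on the way back.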
3) Data Model Use Case workshop
In November of 2020, the Datamodel working group conducted a workshop to demonstrate and test the usability and effectiveness of the data models and annotation syntax.
The list of cases includes:
- Column Grouping:
- maps data to the Mango, Meas, Coords data models.
- this case uses the models to define associated groups from columns of a flat table
- eg: Doppler Velocity with associated Quality flag and reference documentation.
- Combined Data:
- maps Catalog data from multiple providers to the Mango, Cube, Meas, Coords data models
- defines Sources, Detections, and TimeSeries instances
- the annotation associates Sources with their corresponding Detections and TimeSeries instances
- Native Frames:
- maps Catalog data from multiple providers to the Mango, Meas, Coords models
- the datasets provided positional data in different coordinate systems
- the case involves using the model annotation to identify the native frames of the positional data, and demonstrates the ability to easily reconcile these to a common frame, to make a plot.
- Proper Motions:
- maps Vizier data to the Meas and Coords models
- the case involves using the model annotation to extract the position and associated proper motion data, and generate plots
- simple cartesian plot with source positions and arrows indicating direction and magnitude of proper motion
- converting to AstroPy instances, create a graphic plot of sources moving according to their proper motion.
- Standard Properties:
- maps data from multiple providers to Mango, Meas, Coords models
- the case involves identifying the properties contained in each dataset and displaying that information.
- this case is a hook for science threads where one needs to identify and extract certain properties from different datasets to execute the thread.
- Time Series:
- maps Chandra Source Catalog, GAIA, GAVO and ZTF datasets as SparseCube Time series instances.
- the internal representation of the data, and complexity vary greatly among these datasets
- the case involves using the model annotation to extract the TimeSeries instances and plot them.
- the same simple script is used to interpret and plot each of the datasets.
Requirements
Examination and implementation of the above cases leads to the following set of requirements distributed through the various STC component models.
- Structure
- [vodml.001] The model shall be vo-dml compliant, producing a validated vo-dml XML description.
- [vodml.002] shall re-use, or refer to, dependent models for objects and concepts already defined in other models
- [vodml.003] shall produce documentation in vo-dml HTML format
- [vodml.004] shall produce documentation in standard PDF format
- Application/Usage
- [user.001] Users should be able to identify and use basic content with minimal specialized information.
- in other words, a generic utility should be able to find and use core elements without knowing a lot about the various extensions and uses of those elements.
- [user.002] When applicable, the model should support usability by simplifying common scenarios.
- i.e. keep common things simple, and complex things possible
- Domains
- [dom.001] Shall accommodate the description of data in any observable domain
- [dom.002] Shall provide enhanced/specialized description for data pertaining to
- [dom.002.1] Pixel domain: binned, integerized, n-dimensional domain
- [dom.002.2] Spatial domain: continuous domain, typically in 2-3 dimensional cartesian or spherical spaces
- [dom.002.3] Time domain: continuous 1D domain, typically provided in JD, MJD, ISO, or as an Offset from a zero point
- [dom.002.4] Polarization domain: discrete 1D domain of polarization states.
- Measurements
- [meas.001] Shall relate a coordinate value with associated errors
- [meas.002] Shall support multiple error associations per value to describe errors from different sources
- [meas.003] Any specific error source may appear only once
- [meas.004] Errors may be correlated between component values ( ie: may apply to coordinate set as a whole )
- [meas.005] Values associated with different domains may have correlated errors (ie: components of coordinate tuple may refer to different domains, and have non-separable errors)
- [meas.006] Shall support the most common error forms, including, but not limited to: Symmetrical, Asymmetrical, Interval, Elliptical, Matrix
- [meas.007] Shall provide specialized objects related to measurements in the priority domains ( Spatial, Spectral, Temporal, Polarization ); leveraging [user.002] where possible
- [meas.008] Shall allow for the representation of data outside the priority domains
- Coordinates
- Coordinate Spaces:
- [coords.001] Shall facilitate the description of the domain space
- [coords.001.1] Coordinate space shall consist of 1 to N dimensional axes
- [coords.001.2] Shall support the description of axes which are continuous, binned, and discrete in nature
- [coords.001.3] Each dimensional axis shall define the domain range of that axis as appropriate for its nature
- Coordinate frames:
- [coords.002] Shall facilitate the specification of the nature of the domain, providing additional metadata relevant to the interpretation of coordinates in that domain.
- Coordinates:
- [coords.003] Shall identify a location within the coordinate domain space
- [coords.004] Shall be associated with a corresponding coordinate frame providing metadata relevant to the interpretation of the coordinate
- [coords.005] Shall be associated with a particular axis of the coordinate space to provide context for the coordinate and facilitate the application of mapping Transforms
- [coords.006] Shall be complete quantities, including value and units as appropriate
- [coords.007] Shall support the association of atomic coordinates into a multi-dimensional compound grouping
- Coordinate systems:
- [coords.008] Shall provide for encapsulating the description of the entire domain space
- [coords.009] for Pixel domain, this must include the full coordinate space description
- [coords.010] for Physical domains, this must include the Frame specifications, as it is this metadata that is more relevant to users. The coordinate space is typically well defined or implied by the coordinate itself.
- Transforms:
- [trans.001] Shall facilitate the relation of two coordinate frames through a mathematical formula (Transforms)
- [trans.001.1] Shall facilitate the transport of same independent of any actual data
- [trans.002] Shall define a set of atomic Transform operations commonly used in astronomical applications
- [trans.002.1] at a minimum, will accommodate common operations found in FITS images and data cubes, including but not limited to:
- Linear, Matrix, FITS WCS projection, Lookup table, Polynomial (1D and 2D)
- [trans.002.2] shall accommodate and be compatible with established implementation packages AST, and gWCS
- [trans.003] Shall allow the combination of operations in sequence, to form complex, multi-stage transforms.
- [trans.004] Shall allow the combination of operations in parallel to cover the appropriate domain space
- [trans.005] Shall support bi-directional workflow (forward and inverse), including the explicit assignment of independent operations for types which have no natural inverse.
- [trans.006] Shall provide operations to facilitate a work flow that requires manipulation of the dimensional axes through the process
- [trans.006.1] duplicate axes, e.g. to send axis pair (x,y) into 2 Poly2D operations to form (x',y')
- [trans.006.2] shuffle axis order [x,y,z] => [x,z,y]
- [trans.006.3] add or drop dimensions
- [trans.006.4] allow explicit control of flow in both forward and inverse directions
- [trans.006.4.1] preferential selection of source in reverse direction for duplicated input axes
- [trans.006.4.2] one-to-one axis mappings are not, necessarily, bi-directional
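To make the coordinate space requirements above ([coords.001.1] through [coords.001.3]) more concrete, the following sketch describes a space as a set of axes that are continuous, binned, or discrete, each carrying its own domain range. The class names, the `contains` method, and the choice of 0-based pixel indexes are assumptions made for illustration and are not taken from the Coordinates model document.

```python
import math
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class ContinuousAxis:
    """Continuous axis; the domain range defaults to +/- Inf."""
    name: str
    lo: float = -math.inf
    hi: float = math.inf

    def contains(self, value) -> bool:
        return self.lo <= value <= self.hi


@dataclass(frozen=True)
class BinnedAxis:
    """Binned axis with integer indexes 0 .. length-1 (0-based is an assumption)."""
    name: str
    length: int

    def contains(self, value) -> bool:
        return isinstance(value, int) and 0 <= value < self.length


@dataclass(frozen=True)
class DiscreteAxis:
    """Discrete axis whose domain is an enumerated set of states."""
    name: str
    states: Tuple[str, ...]

    def contains(self, value) -> bool:
        return value in self.states


@dataclass(frozen=True)
class CoordSpace:
    """A coordinate space made of 1 to N axes."""
    axes: tuple

    def contains(self, coord: Sequence) -> bool:
        # A coordinate is valid if it has one value per axis, each in range.
        return (len(coord) == len(self.axes)
                and all(a.contains(v) for a, v in zip(self.axes, coord)))


# Example spaces: a 2D pixel grid and a 1D polarization axis.
pixel_space = CoordSpace((BinnedAxis("x", 1024), BinnedAxis("y", 512)))
stokes_space = CoordSpace((DiscreteAxis("stokes", ("I", "Q", "U", "V")),))
detector_x = ContinuousAxis("detx", lo=0.0, hi=30.0)
```

Here each axis type decides for itself what "in range" means, which is one reading of "as appropriate for its nature" in [coords.001.3].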
Documents
Latest Document:
- IVOA Repository:
- Measurements model: Proposed Recommendation (RFC-2)
- Coordinates model: Proposed Recommendation (RFC-2)
- Transform model: Working Draft
- Development Version: Current revision of the document, including all images and source document, and Issue tracking.
- Measurements model:
- Coordinates model:
- Transform model:
Discussion Topics
Significant discussion threads from dm working group mailing list:
STC2 and VO-DML compliance:
Discussion on conflicts between stc2 model and vo-dml rules, specifically regarding the multiplicity of attributes.
Cube dependencies
Working Draft Review:
- 2017-05-15 version:
- Coords:
- 2019-03-20 version:
- 2018-11-30 version:
RFC Review:
RFC-2 Review:
- RFC-2 comments can be found at
Compatibility with existing packages
Meas/Coords - AstroPy comparison:
There have been requests for a formal comparison of the Meas/Coords models to the AstroPy implementation. The attached PDF outlines the model editor's interpretation of the AstroPy design, and compatibility with the Meas/Coords models. A color coded element map PNG image is also available, showing which AstroPy elements are served by which Model elements.
- In short, we find the model contains all information necessary to instantiate the corresponding AstroPy instances. The organization of the information is not identical, but certainly compatible. AstroPy migrates the values into the Coordinate space to be more efficient when performing calculations on large coordinate sets, whereas the model has more explicit control/definition of the coordinate spaces to satisfy Cube model requirements.
- The main differences appear to be:
- Epoch is a Time type (Representation) in AstroPy.
- I'll note that this has been mentioned in the past by Francois Bonnarel but was not adopted in the model, primarily on the grounds that it was not a time type in STC1
- AstroPy contains representations of Point in both Space-centric and Frame-centric modes, these are more extensive than what is currently supported by the Coords model.
- a Space-centric Point (lon, lat, dist), (x, y, z), (rho, phi, z)
- and access via frame-centric names (ra, dec), (l, b)
- Note: these conclusions are reinforced by the successful implementation of AstroPy hooks in multiple packages developed to parse and interpret annotated VOTables.
Transform - WCS libraries
The Transform model is compatible with three popular and well established Transform implementation libraries.
The table below shows the coverage of the various libraries to the model elements:
- AST: Starlink's Library for handling World Coordinate Systems in Astronomy. (Python, Perl, Java, and C )
- GWCS: Generalized WCS implementation, with basis in the astropy modeling package. (Python)
- WCSLIB: implements the "World Coordinate System" (WCS) standard in FITS (C, Fortran)
| WCS Transform Model Element | AST | GWCS | WCSLIB |
| TransformSet | FrameSet | WCS.pipeline | |
| TransNode | | WCS.step | |
| Mapping | Mapping | Model | |
| CompoundMap | CmpMap | CompoundModel | |
| ComposeMap | CmpMap | Model composition | |
| ConcatenateMap | CmpMap | Model concatenation | |
| BiDirectionalMap | TranMap | Model | |
| BiDirectionalMap.forwardMap | TranMap.map1 | wcs.forward_transform | |
| BiDirectionalMap.inverseMap | TranMap.map2 | wcs.backward_transform | |
| Permute | PermMap | Mapping | |
| Unit | UnitMap | Identity | |
| Shift | ShiftMap | Shift | lin.h |
| Scale | ZoomMap | Scale | lin.h |
| Rotate2D | | Rotation2D | |
| EulerRotation | | EulerAngleRotation | |
| Matrix | MatrixMap | | |
| Projection | | | |
| o SkyProjection | WCSMap | Projection - each alg a separate class | prj.h |
| o SkyProjRotate | | RotateNative2Celestial, RotateCelestial2Native | sph.h |
| o SpectralProjection | SpecMap | | spx.h |
| Polynomial1D | PolyMap | Polynomial1D | |
| Polynomial2D | PolyMap | Polynomial2D | |
| Lookup | LutMap | Tabular1D, Tabular2D | tab.h |
Implementations
Serializations:
- VOTable COOSYS
- this represents a standardized serialization of a Coordinate model SpaceFrame
- COOSYS => SpaceFrame
- COOSYS.system => SpaceFrame.spaceRefFrame
- COOSYS.equinox => SpaceFrame.equinox
- COOSYS.epoch => would map to epoch of a particular measurement set, outside the scope of SpaceFrame
- NOTE: COOSYS lacks the 'refPosition' present in SpaceFrame. This is on the list as a probable enhancement to COOSYS
- VOTable 1.4: TIMESYS
- this is, similarly, a standardized serialization of the Coordinate model TimeFrame
- TIMESYS => TimeFrame
- TIMESYS.timescale => TimeFrame.timescale
- TIMESYS.refposition => TimeFrame.refPosition
- TIMESYS.timeorigin => TimeOffset.time0; centralizing this information high in the serialization
- Example serializations:
- Annotated VOTables:
- all model elements as VOTable files annotated to the VODML Mapping Syntax ( WD:20170323), produced by Jovial software package.
- coordinates model elements: here
- includes xml, and jovial dsl files
- measurement model elements: here
- includes xml, and jovial dsl files
- transform model elements: here
- includes xml, and jovial dsl files
- Various Formats: (To be updated for post-RFC2 model changes)
- independent python code, generated example serializations spanning all elements of the models in 4 formats:
- *.vot: VOTable-1.3 standard syntax
- *.avot: VOTable-1.3 annotated with VO-DML/Mapping syntax
- Validates using xmllint to a VOTable-1.3 schema enhanced with an imported VO-DML mapping syntax schema
- *.xml: XML format
- Validates against the model schema
- *.xxx: An internal DOC format
- XML/DOM structure representing the instances generated when interpreting the templates.
- measurement model elements: here
- coordinates model elements: here
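The COOSYS and TIMESYS attribute mappings listed at the top of this Serializations list can be sketched with the Python standard library alone. The VOTable fragment below is invented for illustration, the function names are hypothetical, and plain dictionaries stand in for SpaceFrame and TimeFrame instances; only the attribute-to-attribute correspondences come from the list above.

```python
import xml.etree.ElementTree as ET

# Made-up fragment; a real VOTable carries an XML namespace and more content.
VOTABLE_FRAGMENT = """
<RESOURCE>
  <COOSYS ID="sys" system="FK5" equinox="J2000" epoch="J1991.25"/>
  <TIMESYS ID="ts" timescale="TCB" refposition="BARYCENTER"
           timeorigin="2455197.5"/>
</RESOURCE>
"""


def space_frame_from_coosys(coosys):
    """Map COOSYS attributes onto SpaceFrame attribute names."""
    return {
        "spaceRefFrame": coosys.get("system"),
        "equinox": coosys.get("equinox"),
        # COOSYS has no counterpart for SpaceFrame.refPosition.
        "refPosition": None,
        # COOSYS.epoch is deliberately not mapped: it belongs to a
        # particular measurement set, outside the scope of SpaceFrame.
    }


def time_frame_from_timesys(timesys):
    """Map TIMESYS attributes onto TimeFrame names, plus TimeOffset.time0."""
    frame = {
        "timescale": timesys.get("timescale"),
        "refPosition": timesys.get("refposition"),
    }
    time0 = timesys.get("timeorigin")  # feeds TimeOffset.time0, not the frame
    return frame, time0


root = ET.fromstring(VOTABLE_FRAGMENT)
space_frame = space_frame_from_coosys(root.find("COOSYS"))
time_frame, time0 = time_frame_from_timesys(root.find("TIMESYS"))
```

Returning `time0` separately from the frame mirrors the note that TIMESYS.timeorigin maps to TimeOffset.time0 and not to an attribute of TimeFrame.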
Usage:
- Data Model Workshop - May 2021
- This Git repository contains original implementations from all participants.
- DM Case Implementations
- This Git repository contains a set of use case implementations maintained to the current model suite.
- each workshop usecase implemented using Jovial and Rama tools.
- Jupyter notebook illustrates the case thread.
- Highlights
- Transform - WCS usage
- The following python scripts (provided by David Berry 20201111), illustrate example usage of the two implementations exchanging/transferring WCS information. These will be ported to a Jupyter notebook for better presentation:
- yamlchan_demo.py:
- Reads a sample file containing an AST FrameSet representing a typical LSST image, including polynomial distortion.
- Writes out the WCS object as an equivalent ASDF file (lsst_wcs.asdf).
- It then converts a pixel position to sky coords, and then back to pixel coords, using both AST and GWCS packages
- Prints the results for comparison.
- fits_to_asdf.py:
- Reads a sample FITS image file to extract the WCS information from the headers
- Creates an AST FrameSet
- Then writes the FrameSet out as an ASDF wcs object in yaml.
- AstroPy Wrapper
- Using an AstroPy wrapper in the ModelInstanceInVOT code (see below)
- This Git repository holds case implementations
- Meas/Coords model elements are mapped in VOTable
- parser interprets annotation to generate model instances, and converts them to SkyCoord instances.
- Threads:
- Extract positions, parallax and proper motions from ESAC archive; generate 3D plot of source positions
- using direct Measurements model instances
- using converted AstroPy SkyCoord instances
- Identify annotated source positions and reconcile the coordinate frames.
- Extract observation history of a source from ESAC XMM TAP archive, track source movement over 20 year period.
- Examples
- Notebook
- ADASS 2021 BoF - TAP and the Data Models
- This BoF discussed the possibility and benefits for TAP services to apply on-the-fly annotation of the query responses to serve not only the data, but real model instances.
- Annotated TAP responses can be consumed by software such as those described below, to interpret the content in terms of IVOA data models, greatly enhancing the interoperability of manipulating query responses from various services in science threads.
- Conclusions of the BoF include: "This session and the following discussions highlighted that TAP services can already serve hierarchical data and that serving legacy data with annotations or even Provenance instances is within our reach."
- Resources
Tools:
- Jovial: A Java toolset that helps build and generate serializations for VODML compliant data models.
- Rama: Python package, parses annotation and instantiates instances of model classes. Includes adaptors to AstroPy classes.
- ModelInstanceInVOT Code: Python package for processing annotated VOTables
- TDIG: Working project of Time Series as Cube.
- An effort to enhance SPLAT to load/interpret/analyze TimeSeries data using data annotation
- the tool was enhanced to use new annotations (eg: TIMESYS, UTypes) to identify and interpret the data automatically.
- Delays in settling on a standard annotation syntax have hindered progress on this project and kept it from fully realizing the possibilities. This is a high priority for upcoming work.
- pyVO: extract_skycoord_from_votable()
- Demonstrated in Paris, this product of the hack-a-thon generates AstroPy SkyCoord instances from VOTables using various elements embedded in the VOTable.
- Interrogates a VOTable, identifies key information and uses that to automatically generate instances of SkyCoord.
- UCD: 'pos.eq.ra', 'pos.eq.dec'
- COOSYS.system: "ICRS", "FK4", "FK5"
- COOSYS.equinox
- The COOSYS maps directly to SpaceFrame, and the value of the system
- The UCD 'pos.eq' maps directly to meas:EquatorialPosition; with 'pos.eq.ra|dec' identifying the corresponding attributes (EquatorialPosition.ra|dec) as coordinates coords:Longitude and coords:Latitude.
- This illustrates that even with minimal annotation, this sort of automatic discovery/instantiation can take place. With a defined annotation syntax, this utility could be expanded to generate other AstroPy objects very easily.
- Transform model: These libraries are compatible with the Transform model (see above comparison table), and played an instrumental role in the development of that model. They share a common serialization format (ASDF) which enables them to exchange WCS information between them.
- AST - A Library for Handling World Coordinate Systems in Astronomy
- AST Project: documentation and download page.
- version 9.2.3 - 2020: contains a new class of Channel called YamlChan, which allows AST to read and write WCS information as ASDF. It does not yet include any support for conversion of spectral or time axes, and does not support all ASDF transform classes (see comparison chart above).
- GWCS - Generalized World Coordinate System: provides tools for managing WCS
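The kind of automatic discovery described for the pyVO utility in the list above, where columns are identified from the UCDs 'pos.eq.ra' and 'pos.eq.dec', can be illustrated with a small standard-library sketch. This is not the pyVO code: the function name, the sample FIELD elements, and the decision to look only at the first word of each UCD are all assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Made-up table header; a real VOTable carries an XML namespace and data.
TABLE_FRAGMENT = """
<TABLE>
  <FIELD name="id" datatype="char" arraysize="*" ucd="meta.id;meta.main"/>
  <FIELD name="RAJ2000" datatype="double" unit="deg" ucd="pos.eq.ra;meta.main"/>
  <FIELD name="DEJ2000" datatype="double" unit="deg" ucd="pos.eq.dec;meta.main"/>
  <FIELD name="Vmag" datatype="float" unit="mag" ucd="phot.mag;em.opt.V"/>
</TABLE>
"""

# UCD primary word -> attribute of an equatorial position.
UCD_ROLES = {"pos.eq.ra": "ra", "pos.eq.dec": "dec"}


def find_equatorial_columns(table):
    """Return {role: (column index, column name)} for the ra/dec columns."""
    found = {}
    for index, field in enumerate(table.findall("FIELD")):
        # Compare only the first UCD word, ignoring qualifiers like meta.main.
        primary = field.get("ucd", "").split(";")[0].strip().lower()
        role = UCD_ROLES.get(primary)
        if role is not None:
            # Keep the first matching column if several carry the same UCD.
            found.setdefault(role, (index, field.get("name")))
    return found


columns = find_equatorial_columns(ET.fromstring(TABLE_FRAGMENT))
```

With the column indexes in hand, a client would still need the frame information (for example from COOSYS) before building a position object, which is the step the minimal-annotation point above refers to.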
Representing Properties
The primary driver behind the Measurements and Coordinates model work has been to facilitate the description of various Astronomical properties which will be used by other models to describe more complex entities such as an N-Dimensional data cube, or Catalogue of Source properties. The following list itemizes significant astronomical properties, and how they are currently supported/represented by the Meas/Coords models. Rather than defining a long list of property types, we expect most properties to be served by the GenericMeasure, with appropriate assignment of the UCD attribute to indicate the physical nature of the measure.
Coordinate in a spherical space; any spherical reference frame other than those listed above |