Source and Catalog Contribution Page
Rationale
This page gathers contributions and discussions from the Source and Catalogs focused session, which took place at the Paris Interop 2019.
The goal of this session was to ask people involved in data curation and client development (and anyone else interested) what they require to make source-related data interoperable. This requires both modeling effort and serialization mechanisms, and it is complex work:
- There is no clear common definition of what a source is
  - Can be a detection by a telescope
  - Can be one row in a catalogue of compiled data
  - Can be anything related to a sky location emitting a signal detectable by astronomers
- Source data are multi-faceted
  - Something that modelers hate
- Is a model relevant or possible or even desirable?
  - The scope of a global source model is huge
  - It covers a large part of our science
  - It likely could not be described by a single model
This work should be shared among multiple working groups. The models need both domain and modelling expertise.
- The effects on VOTable make it clearly of interest to Apps.
- The data models themselves, with appropriate use cases, are the primary driver, led by DM.
Challenges (on behalf of TD)
- Data model complexities, and the lack of actual data models, have often been conflated with mapping complexities and the lack of successful demonstrations. The mapping is seen as the problem, when it is only a part, sometimes a small part.
- Pierre Fernique's very fair question in Victoria 2018 was a great example of this. "But what about my proper motions?"
- This is a complex problem, and we haven't been able to bring a commensurate amount of focused resources to it.
- In my opinion, we lack specific, documented, user stories, use cases, etc. Pierre's case would be a good example if written up, and it's a case that must be demonstrated as clearly working in order to move on.
Straw Man Steps Forward
To address some of those issues, we suggest something like the following steps to get moving forward. It may be that we don't have the resources to do all that, in which case maybe we settle on per-topic work-arounds (like with proper motion), or find a way to get more involvement from the greater community.
- Get the TCG/Exec to agree on, then coordinate (mandate?) some steps like these.
- Get a group of people to commit to work on this.
  a. This can be worked on in parallel with the other steps, but it is important to know who is out there with the time, interest, commitment and expertise to help.
  b. Keep sharing, presenting, and inviting participation throughout.
- Define what problem(s) we are solving with a short-ish list of concrete user stories. For now, only include very important stories, and make them very specific, including what data sources will be used and what we will do with the data.
- From those stories, decide what data we need to model.
- Model the necessary data and map these models onto real data.
  - We first need to define the mapping from the model(s) to Astropy objects.
  - An Astropy mapping is not sufficient. Explore different mapping syntaxes by implementing the user stories.
    a. This can be done in parallel with the modelling effort, and can evolve with it.
  - Knowing the pencil-and-paper mapping is a prerequisite for each experiment. Before trying the mapping syntax, we must know unambiguously how the model maps to an implemented data structure (hopefully including Astropy).
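As an illustration of what a pencil-and-paper mapping could look like once written down, here is a minimal sketch for a proper-motion user story. Everything in it is an assumption made for the example: the role names, the column names (`RAJ2000`, `pmRA`, ...), the units, and the `Quantity` and `apply_mapping` helpers are hypothetical and do not come from any IVOA model or mapping syntax.

```python
from dataclasses import dataclass

# Hypothetical "paper" mapping: model role -> (catalogue column, unit).
# Role names, column names and units are illustrative assumptions only.
PAPER_MAPPING = {
    "position.longitude": ("RAJ2000", "deg"),
    "position.latitude": ("DEJ2000", "deg"),
    "proper_motion.longitude": ("pmRA", "mas/yr"),
    "proper_motion.latitude": ("pmDE", "mas/yr"),
}


@dataclass(frozen=True)
class Quantity:
    """A value with its unit; stands in for a real quantity type."""
    value: float
    unit: str


def apply_mapping(row, mapping):
    """Return {role: Quantity} for one catalogue row.

    Raises KeyError if the row lacks a column the mapping refers to,
    so that an ambiguous or incomplete mapping is caught early.
    """
    missing = [col for col, _ in mapping.values() if col not in row]
    if missing:
        raise KeyError(f"columns missing from row: {missing}")
    return {
        role: Quantity(float(row[col]), unit)
        for role, (col, unit) in mapping.items()
    }


# One made-up catalogue row.
row = {"RAJ2000": 10.684, "DEJ2000": 41.269, "pmRA": 1.5, "pmDE": -0.7}
source = apply_mapping(row, PAPER_MAPPING)
```

In an actual experiment the last step would presumably build an Astropy object rather than a local dataclass; the point of the sketch is only that the role-to-column table is explicit and checkable before any mapping syntax is tried.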
Mango Project
The MANGO model proposes a flexible way to expose data related to astronomical source objects in an interoperable way.
- The MANGO model attaches an identifier to an astronomical source and associates with it all related data: observed physical quantities, called parameters in this context, and other information for that source, such as spectra, time series or preview images.
- Parameters usually appear in the columns of a source catalogue. Additional data products are bound to the source to contribute to the science analysis and enhance data understanding.
- Parameters are modeled by the IVOA MCT DM reusing both native and extended classes. Parameters' roles are given by UCDs and semantic tags.
- Associated data can be simple URLs, VO service endpoints or VO data model instances. Their roles are also qualified by semantic tags.
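The structure described in the list above can be pictured with a small data-structure sketch. This is not the MANGO model itself: the class names (`MangoSource`, `Parameter`, `AssociatedData`), the semantic tag values and the URL are hypothetical placeholders chosen for the example; only the overall shape (an identifier, parameters qualified by UCDs and semantic tags, and tagged associated data) follows the description.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Parameter:
    """An observed physical quantity, typically one catalogue column."""
    ucd: str           # role given by a UCD
    semantic_tag: str  # role refined by a semantic tag (made-up values here)
    value: float
    unit: str


@dataclass
class AssociatedData:
    """Extra data bound to the source, e.g. a spectrum or a preview image."""
    semantic_tag: str
    url: str  # a simple URL; a service endpoint would fit the same slot


@dataclass
class MangoSource:
    """A source identifier with everything attached to it."""
    identifier: str
    parameters: List[Parameter] = field(default_factory=list)
    associated_data: List[AssociatedData] = field(default_factory=list)

    def parameters_with_ucd(self, ucd):
        """Return the parameters whose UCD matches exactly."""
        return [p for p in self.parameters if p.ucd == ucd]


# Made-up source with two parameters and one associated spectrum.
src = MangoSource(
    identifier="src-001",
    parameters=[
        Parameter("pos.eq.ra;meta.main", "#position", 10.684, "deg"),
        Parameter("phot.mag;em.opt.V", "#magnitude", 12.3, "mag"),
    ],
    associated_data=[
        AssociatedData("#spectrum", "https://example.org/spectra/src-001.fits"),
    ],
)
magnitudes = src.parameters_with_ucd("phot.mag;em.opt.V")
```

A client could then select parameters by role without knowing the catalogue's column names, which is the kind of interoperability the description is aiming at.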
MANGO stands for Metadata ANnotation for Generic Objects (in astronomy).
The project can be followed on GitHub.
MANGO comes with a VOTable annotation proposal that can be followed on GitHub.
Workshop DM2021
For two years, various actions have been undertaken to satisfy the DM user requirements in the VO, on the following basis:
- There is a clear demand for using models
- There is a clear demand for the models to be simple and modular.
- The distinction between what is modelling and what is serialisation is unclear.
A workshop was organised in spring 2021 to try to settle, once and for all, on a common framework for model usage.
Resources
Former contributions
Some proposals were made in previous years
Wiki pages focused on this issue
Interop Contributions
Title | Authors | Link |
Focused Session | Interop Paris 2019 | Session page |
Presentation | Interop Groningen 2019 | pdf |
Presentation | LockTown May 2020 | pdf |