Schedule
There is only one Semantics session: Wednesday 21 May, 12:00–13:30, in room D052.
| Speaker | Title | Time | Slides |
| Nicolas Moreau | Vocabulary developments in Paris | 12:00–12:20 | pdf |
| Baptiste Cecconi | UCD extension for planetary data | 12:20–12:40 | pdf |
| Norman Gray | Machine learning, and the identification of UCDs | 12:40–13:00 | pdf |
| Norman/Sébastien | VOUnits discussion | 13:00–13:05 | |
| ALL | WG next projects | 13:05–13:30 | |
Notes from the Semantics session
Thanks to MireilleLouys for the session notes.
Nicolas Moreau: Vocabulary developments in Paris
Discussion of mappings between Paris's astronomy thesaurus and other thesauri, and of the length of time it takes to do the mapping. Some of the mapping is done by NIST.
Links with the Unified Astronomy Thesaurus (UAT). SebastienDerriere will follow up on these links. The contact is AlbertoAccomazzi.
Several of the thesauri are maintained as structured text, from which SKOS is generated straightforwardly.
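As an illustration of the point above, here is a minimal sketch of how SKOS might be generated from a structured-text thesaurus. The input format (one concept per line, two-space indentation for a narrower concept), the sample concepts, and the `http://example.org/thesaurus#` namespace are all assumptions made for this example; they are not the format or terms actually used in Paris.

```python
# Hypothetical structured-text thesaurus: one concept per line,
# two-space indentation marks a narrower concept.
THESAURUS_TEXT = """\
star
  variable star
    cepheid
  binary star
galaxy
"""

BASE = "http://example.org/thesaurus#"  # placeholder namespace, not a real vocabulary


def slug(label):
    """Turn a label into a local name usable after the ':' prefix."""
    return label.strip().lower().replace(" ", "-")


def parse_concepts(text, indent=2):
    """Return a list of (label, broader_label_or_None) pairs."""
    stack = []  # stack[d] holds the most recent label seen at depth d
    concepts = []
    for line in text.splitlines():
        if not line.strip():
            continue
        depth = (len(line) - len(line.lstrip(" "))) // indent
        label = line.strip()
        del stack[depth:]  # drop labels at this depth or deeper
        broader = stack[-1] if stack else None
        concepts.append((label, broader))
        stack.append(label)
    return concepts


def to_skos_turtle(concepts, base=BASE):
    """Serialise the concepts as SKOS in Turtle syntax."""
    lines = [
        "@prefix skos: <http://www.w3.org/2004/02/skos/core#> .",
        f"@prefix : <{base}> .",
        "",
    ]
    for label, broader in concepts:
        parts = [f":{slug(label)} a skos:Concept", f'    skos:prefLabel "{label}"@en']
        if broader is not None:
            parts.append(f"    skos:broader :{slug(broader)}")
        lines.append(" ;\n".join(parts) + " .")
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_skos_turtle(parse_concepts(THESAURUS_TEXT)))
```

Only `skos:Concept`, `skos:prefLabel` and `skos:broader` are emitted here; a real conversion would presumably also carry alternative labels, definitions and the mappings to other thesauri.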
Baptiste Cecconi: UCD extension for planetary data
A new document has been uploaded to the IVOA page.
Various proposed UCDs were discussed in the session:
- em.mol.line observed: could be changed to "present", "considered", etc.
- FG: em below 10 MHz: what does it mean? Should the UCDs avoid interval overlaps?
- NG: doesn't the context help to disentangle them?
- Planetary science needs a category below 20 MHz.
- Should we adapt to the community usage, and if so suggest em.pw for plasma wave?
- The 20 MHz limit is really tied to the atmospheric limit, so it is physically motivated.
- Keep em.radio.below10MHz, or at least this: 0–1 kHz or 20 kHz is the typical range for space.
Draft version of the UCDs for Solar System and Planets: SolarSystemUCD-V05.pdf
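The question raised above about interval overlaps can be made concrete with a small check. This is only a sketch: `em.radio.below10MHz` is the word discussed in the session and `em.radio.20-100MHz` is an existing UCD word, but `em.radio.below20MHz` is a hypothetical label and all band edges are assumed values chosen for illustration.

```python
# Band edges in Hz, treated as half-open intervals [low, high).
# "em.radio.below20MHz" is a hypothetical label; edges are illustrative.
BANDS = {
    "em.radio.below10MHz": (0.0, 10e6),
    "em.radio.below20MHz": (0.0, 20e6),
    "em.radio.20-100MHz": (20e6, 100e6),
}


def overlaps(a, b):
    """True if half-open intervals a and b share any frequency."""
    return a[0] < b[1] and b[0] < a[1]


def overlapping_pairs(bands):
    """List every pair of band names whose intervals overlap."""
    names = sorted(bands)
    return [
        (m, n)
        for i, m in enumerate(names)
        for n in names[i + 1:]
        if overlaps(bands[m], bands[n])
    ]


if __name__ == "__main__":
    for pair in overlapping_pairs(BANDS):
        print("overlap:", pair)
```

With these assumed edges the only overlap reported is between the two "below" bands; the 20 MHz boundary is shared but not overlapping because the intervals are half-open.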
Norman Gray: Machine learning, and the identification of UCDs
- MG: do you remove some words from the text description? NG: the tool is made for larger portions of text, but works rather nicely on simple sentences. Units may help to veto some assignments (instead of being used as weights for assignments).
- MG: use a validation step to prevent overfitting. It might be interesting to use a validation data set that is completely independent.
- Which training set? The UCD list, for instance.
- MG: how can the column name information be used as a complementary feature? This could help, since these names are chosen with some sense of logic.
- SD: can you expect the system to propose a suggestion for a UCD (use it as a predictive assignment tool)? NG: No, because it takes the whole string as one, and does not tokenize the different parts of the UCD string.
- PS: what if we use other kinds of classifiers, like Self-Organizing Maps? MG: Such methods have been used for the object types SKOS list; a similarity measure should be defined for this kind of approach.
- MG, NG: the size of the training set should be increased: several thousand items would make a valuable test configuration.
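Several of the points above (treating the whole UCD string as a single label, using units as a veto rather than a weight, and checking against an independent held-out set) can be sketched together. This is not the tool presented in the session: it is a naive-Bayes-style bag-of-words scorer written for illustration, and the training descriptions, held-out examples and `ALLOWED_UNITS` table are all invented for the example.

```python
from collections import Counter, defaultdict
import math

# Toy training data: (column description, UCD). Invented for illustration.
TRAIN = [
    ("right ascension of the source", "pos.eq.ra"),
    ("right ascension J2000", "pos.eq.ra"),
    ("declination of the source", "pos.eq.dec"),
    ("declination J2000", "pos.eq.dec"),
    ("magnitude in the V band", "phot.mag;em.opt.V"),
    ("V band apparent magnitude", "phot.mag;em.opt.V"),
]

# Independent held-out examples: (description, UCD, unit). Also invented.
HELD_OUT = [
    ("source right ascension", "pos.eq.ra", "deg"),
    ("apparent magnitude", "phot.mag;em.opt.V", "mag"),
]

# Units assumed plausible for each UCD; used only to veto candidates.
ALLOWED_UNITS = {
    "pos.eq.ra": {"deg", "rad"},
    "pos.eq.dec": {"deg", "rad"},
    "phot.mag;em.opt.V": {"mag"},
}


def tokens(text):
    return text.lower().split()


def train(examples):
    """Count words per UCD; each whole UCD string is one class label."""
    model = defaultdict(Counter)
    for description, ucd in examples:
        model[ucd].update(tokens(description))
    return model


def classify(model, description, unit=None):
    """Pick the best-scoring UCD, skipping any vetoed by the unit."""
    vocab_size = len({w for counts in model.values() for w in counts})
    best, best_score = None, -math.inf
    for ucd, counts in model.items():
        if unit is not None and unit not in ALLOWED_UNITS.get(ucd, {unit}):
            continue  # unit veto: this UCD cannot carry that unit
        total = sum(counts.values())
        # Log-likelihood with add-one smoothing and a uniform class prior.
        score = sum(
            math.log((counts[w] + 1) / (total + vocab_size))
            for w in tokens(description)
        )
        if score > best_score:
            best, best_score = ucd, score
    return best


def accuracy(model, examples):
    """Fraction of held-out examples assigned the expected UCD."""
    hits = sum(classify(model, d, unit) == ucd for d, ucd, unit in examples)
    return hits / len(examples)


if __name__ == "__main__":
    model = train(TRAIN)
    print(classify(model, "declination in degrees", unit="deg"))
    print("held-out accuracy:", accuracy(model, HELD_OUT))
```

Because each UCD is an opaque label here, the sketch can only return UCDs it has seen in training, which is the limitation NG describes in reply to SD.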
Norman/Sébastien: VOUnits discussion
VOUnits is at the end of the RFC process.
ALL: WG next projects
What are the next topics?
- MG: A "dmtype" attribute is going to be included in the VOTable specification on GROUP and FIELDref elements. Is there an impact on the Semantics WG?
- Do we have to use SKOS concepts for consistency checks of this semantic tag?
- ML: should it be a data model tag, leading to consistency with a DM context?
- NG: the LINK element of VOTable could be used to convey data model information (like Utypes).
Update of UCD VO list
Action on MireilleLouys to contact the members of the UCD maintenance committee and circulate the list of proposed additions and corrections. NG and MG agreed to join the committee.
-- NormanGray - 2014-05-22