The second edition of the SemanticsCalls took place on Wed, 2020-12-09 at 15:00 UTC.
Agenda
- Review of the last minutes
- Remaining stumbling blocks for Vocabularies 2
- Review of open VEPs: VEP-006
Minutes
Vocabularies 2: Open questions before PR
CMZ: The VOC2 document is not designed to cover every use of semantic artifacts within the VO, only how to deal with consensus vocabularies and their definition and evolution. To clarify this, it may be worthwhile to:
- Put "consensus" in the title of the document.
- Markus: I'm not too wild about changing the title, as this really should obsolete version 1. If there's really material in version 1 that theory or someone else would miss, we should adopt it for version 2.
- CMZ: We have an exec/TCG procedure for defining new standards. What about the procedure for "obsoleting" an existing recommendation?
- MD: such a thing actually exists (see "Obsolete IVOA documents" on the doc repo), but I'd say that really doesn't apply here; we're just doing a major version change, and to keep the total number of standards down, we should normally obsolete rather than fork the prior version.
- CMZ: Insist on this aspect in the introduction and when positioning this new document with respect to the previous IVOA semantic recommendation.
- Markus: There's now new language (volute rev 5911); is that about enough?
- CMZ: Give some guidelines for dictionary/thesaurus producers: how can they decide whether their dictionary needs to become a consensus one or not?
- Markus: I'd say implementation will tell you; anyway, it's hard to come up with hard guidelines. VOEvent example: we're currently introducing a controlled vocabulary for access protocols that's in the StandardsRegExt record -- in that case, that seems the reasonable thing to do, because standardIDs traditionally use ivoids. And, of course, enumerating everything a given standard is not useful for is an endless task.
- CMZ: Say something about how to deal with "non-consensus" semantic artifacts within the IVOA (extending the example given in the VO-DML paragraph?)
- MD: What needs to be said there? It's not easy to find something that will apply to all of, say, people who want to do full ontologies, people just wanting to re-use some external practice, and people just wanting some more or less random list of words...
- CMZ: What would be the optimal balance for semantic artifacts used in the "Subject" field in VOResource? Restricting them to consensus vocabularies will simplify client-side implementations but may be too restrictive for service providers. On the other hand, a generic semantic pointer (including to non-consensus vocabularies, or vocabularies produced outside the IVOA) may be practical for service providers but hard to handle for client implementors. What is the right balance?
- MD: Yeah, that's always the difficult part. In VOResource, UAT terms are a "should" only anyway, and a "should" that right now is almost universally ignored (for good reasons, because so far it's been unclear what it actually meant). For this particular use case, I'd say the right place to discuss this is in the review of https://ivoa.net/documents/uat-as-upstream/.
BC: section 2.2.12, second paragraph: The DataCite-like way of solving this would be to add a `value_uri` attribute to the `subject` element. You would use that attribute to point to the UAT URI, and use the plain human-readable string as the content of `subject`.
MD: There are many ways to split up base URI and term: CURIEs, default vocabulary (as in datalink), saying "use terms from this vocabulary" (as in VOTable timescales and friends), explicit vocabulary URI in an attribute. For subject, having an explicit vocabulary URI (perhaps defaulting to IVOA UAT) sounds like a viable option. On the other hand: I'd like to see plausible actual usage for such a facility before I feel comfortable going for it; it will make things a good deal more complicated for clients.
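To make the client-side cost of the explicit-URI option concrete, here is a minimal sketch of what a client would have to do. It assumes a hypothetical `value_uri` attribute on `subject` (not part of VOResource today), assumes that term URIs are formed as vocabulary URI plus `#` plus term, and assumes that the default vocabulary is the IVOA UAT at http://www.ivoa.net/rdf/uat. The sample record and the function name `parse_subjects` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: the IVOA UAT serves as the default vocabulary for subjects.
DEFAULT_VOCABULARY = "http://www.ivoa.net/rdf/uat"

# Hypothetical record fragment; value_uri is the attribute proposed in the
# discussion, not something VOResource currently defines.
SAMPLE = """
<resource>
  <subject value_uri="http://www.ivoa.net/rdf/uat#galaxies">Galaxies</subject>
  <subject>variable-stars</subject>
</resource>
"""


def parse_subjects(xml_text):
    """Return one dict per subject element with label, vocabulary and term."""
    subjects = []
    for element in ET.fromstring(xml_text.strip()).iter("subject"):
        label = (element.text or "").strip()
        value_uri = element.get("value_uri")
        if value_uri is None:
            # No explicit URI: treat the content as a term of the default vocabulary.
            vocabulary, term = DEFAULT_VOCABULARY, label
        else:
            # Explicit URI: split it into vocabulary URI and term at the "#".
            vocabulary, _, term = value_uri.partition("#")
            term = term or None
        subjects.append({"label": label, "vocabulary": vocabulary, "term": term})
    return subjects


if __name__ == "__main__":
    for subject in parse_subjects(SAMPLE):
        print(subject)
```

Even in this small form the client has two code paths and must decide what to do with vocabularies it does not know, which is the extra complexity MD is wary of.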
CMZ: About the location of IVOA consensus dictionaries: The location proposed for dictionaries is http://www.ivoa.net/rdf. However, the output format is not necessarily RDF (it may be human-readable HTML, desise JSON, or something else), and the format depends on the parameters of the query. I (Carlo) would suggest something like http://www.ivoa.net/vocs or http://www.ivoa.net/vocabularies.
MD: RDF is a framework, and all VocInVO2 vocabularies follow it. It's even the foundation for desise, even though desise tries to hide that as well as possible. Also, of course, changing the location would at least in theory break existing usage (though in reality probably few clients actually look at the URIs). Incidentally, using, say, JSON-LD wouldn't work as the sort of declarative API that desise is. Still, I've not used a json media type (but rather application/x-desise+json) for desise exactly because we might one day want to offer our vocabularies in some JSON serialisation of RDF triples in addition to what we have now.
CMZ suggestion: Should we adopt the terms coming from [ISO-19135], as the W3C DCAT standard has done (accepted / not accepted, deprecated, experimental, reserved, retired, stable, submitted, superseded, valid / invalid), in place of what is defined in section 4.4 of VOC2.0?
ML: experimental = preliminary, accepted = in the list and not preliminary; deprecated is already used in desise.
MD: Or perhaps preliminary = submitted (or experimental) would work; but anyway, I'm not particularly wild about adopting an ISO upstream, because ISO standards are essentially inaccessible for almost everyone. And: for all I know, there aren't any RDF resources actually defining these terms. Let's check where this ISO standard is used and how it is used / action Carlo?
BC Comments on version "draft 2020-06-12":
BC: section 2.1.8: offline operations: To me "Offline operation" is not a requirement for the VO. It is a requirement on the libraries or tools, if the developers want to support offline operations.
MD: Yeah, but the standard needs to be designed such that clients and libraries don't have an unnecessarily hard time doing this; also, if we, for instance, allow just random URIs in our fields, clients who'd like to figure out, say, hierarchies, wouldn't have a choice but to go out and retrieve the extra vocabularies – with all the security, robustness, and privacy problems inherent in that.
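As an illustration of what "figuring out hierarchies" offline could look like, the following sketch walks the wider terms in a vocabulary the client already holds locally. It assumes a desise-like layout in which a `terms` mapping gives each term an optional `wider` list; the vocabulary content and the function name `all_wider` are invented for the example and do not come from any IVOA vocabulary.

```python
# Hypothetical vocabulary in an assumed desise-like layout: a "terms" mapping
# in which each term may list its directly wider terms.
VOCABULARY = {
    "terms": {
        "animal": {"label": "Animal"},
        "mammal": {"label": "Mammal", "wider": ["animal"]},
        "cat": {"label": "Cat", "wider": ["mammal"]},
    }
}


def all_wider(vocabulary, term):
    """Return all direct and indirect wider terms of term, nearest first."""
    terms = vocabulary["terms"]
    found = []
    queue = list(terms[term].get("wider", []))
    while queue:
        current = queue.pop(0)
        if current in found:
            continue  # already visited; also guards against cycles
        found.append(current)
        queue.extend(terms.get(current, {}).get("wider", []))
    return found


if __name__ == "__main__":
    print(all_wider(VOCABULARY, "cat"))
```

No network access is involved once the vocabulary is on disk; with arbitrary URIs in a field, the `terms.get(current, {})` lookup would instead have to turn into a retrieval of a vocabulary the client has never seen.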
BC: section 2.2.8: Simple cases: Not convinced that it is better to keep "simple" out of RDF. How do we define what is "simple"? What if the "simple" case becomes "large" or more "complex" as it develops? Having a single way of dealing with any type of vocabulary is more interoperable than having to guess which vocabulary is "simple" (not using RDF) or not (then using RDF).
MD: The idea is exactly to make simple things simple and complex things possible: All our vocabularies are available as both desise (for naive clients) and in RDF/XML and turtle (for RDF-enabled clients).
CMZ: Content negotiation: shall we put examples into the document? +1 from me (ML) [Added after conf: there's now an example in the document for how to fetch a vocabulary using python requests -- MD].
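For readers of these minutes, a minimal sketch of content negotiation follows. It is not the example from the document: it uses the standard library's `urllib` rather than `requests`, and the function names and the short keys in `MEDIA_TYPES` are made up. The desise media type is the one mentioned above; `application/rdf+xml` and `text/turtle` are the usual media types for RDF/XML and Turtle. The vocabulary URL in the usage note is an assumption.

```python
import json
import urllib.request

# Media types to put into the Accept header for each serialisation.
MEDIA_TYPES = {
    "desise": "application/x-desise+json",
    "rdfxml": "application/rdf+xml",
    "turtle": "text/turtle",
}


def build_request(vocabulary_url, serialisation="desise"):
    """Build a GET request asking for the given serialisation (nothing is sent)."""
    return urllib.request.Request(
        vocabulary_url, headers={"Accept": MEDIA_TYPES[serialisation]}
    )


def fetch_vocabulary(vocabulary_url, serialisation="desise", timeout=10):
    """Send the request; return parsed JSON for desise, else the raw text."""
    request = build_request(vocabulary_url, serialisation)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = response.read().decode("utf-8")
    return json.loads(body) if serialisation == "desise" else body
```

A call such as `fetch_vocabulary("http://www.ivoa.net/rdf/datalink/core")` would then ask for desise, assuming a vocabulary is published at that URL; passing `"turtle"` as the second argument asks the same URL for Turtle instead.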
BC: section 2.2.11: Follow up of my comment on section 2.1.8: I don't think caching is required for the entire vocabularies, or at least there should be a clear case for this. And again, not sure that should be a requirement for the VO.
MD: Well, it's easy to do, you just pull one single file (UAT is a mere 500k at this time)
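A sketch of how little is needed to cache a whole vocabulary, under stated assumptions: the vocabulary arrives as one text file, the actual download is passed in as a hypothetical `fetch` callable, and the cache location, file naming, and one-day `max_age` are arbitrary choices made for the example.

```python
import os
import time


def cached_vocabulary(name, fetch, cache_dir, max_age=86400):
    """Return the vocabulary text, downloading it only if the cache is stale.

    fetch is any zero-argument callable returning the vocabulary as a string.
    If the download fails with an OSError (which includes urllib's URLError),
    a stale cached copy is returned when one exists.
    """
    path = os.path.join(cache_dir, name + ".json")
    has_copy = os.path.exists(path)
    if has_copy and time.time() - os.path.getmtime(path) < max_age:
        with open(path, "r", encoding="utf-8") as handle:
            return handle.read()
    try:
        text = fetch()
    except OSError:
        if has_copy:
            # Offline or server down: fall back to the stale copy.
            with open(path, "r", encoding="utf-8") as handle:
                return handle.read()
        raise
    os.makedirs(cache_dir, exist_ok=True)
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(text)
    return text
```

The fallback branch is what gives a client offline operation: once a copy is on disk, a failed download no longer stops it from resolving terms.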
BC: section 2.2.12, third paragraph: I don't see the "human readable" requirement in 2.2.1
MD: Oops, that's a bug. [Note added after conf: Turns out it's not; I've added some explanatory language to the requirement]
Review of open VEPs: VEP-006
Nobody had issues with the VEP.