The GAVO STC Library¶
A library to process VO STC specifications¶
- Author:
Markus Demleitner
- Email:
- Date:
2023-11-30
- Copyright:
Waived under CC-0
This library aims to ease processing specifications of space-time coordinates (STC) according to the IVOA STC data model with the XML and string serializations. Note that it is at this point an early beta at best. To change this, I welcome feedback, even if it’s just “I’d need X and Y”. Honestly.
More specifically, the library is intended to help in:
supporting ADQL region specifications and conforming them
generating registry coverage specifications from simple STC-S
generating utypes for VOTable embedding of STC information and parsing from them
The implementation should conform to STC-S 1.33; what STC-X is supported conforms to STC-X 1.00 (but see Limitations).
Installation¶
If you are running a Debian-derived distribution, see Adding the GAVO repository. When you follow that recipe,
aptitude install python-gavostc
is enough.
Otherwise, you will have to install the source distribution. Unpack the .tar.gz and run:
python setup.py install
You will normally need to do this as root for a system-wide installation. There are, however, alternatives, first and foremost a virtual python that will keep your managed directories clean.
This library’s setup is based on setuptools. Thus, it will generally obtain all necessary dependencies from the net. For this to be successful, you will have to have net access.
If all this bothers you, contact the authors.
Usage¶
Command Line¶
For experiments, we provide a simple command line tool. Try:
gavostc help
to see what operations it exposes. Here are some examples:
$ gavostc help
Usage: gavostc [options] <command> {<command-args}
Use command 'help' to see commands available.
Options:
-h, --help show this help message and exit
-e, --dump-exception Dump exceptions.
Commands include:
conform <srcSTCS>. <dstSTCS> -- prints srcSTCS in the system of dstSTCS.
help -- outputs help to stdout.
parseUtypes --- reads the output of utypes and prints quoted STC for it.
parseX <srcFile> -- read STC-X from srcFile and output it as STC-S, - for stdin
resprof <srcSTCS> -- make a resource profile for srcSTCS.
utypes <QSTCS> -- prints the utypes for the quoted STC string <QSTCS>.
$ gavostc resprof "Polygon ICRS 20 20 21 19 18 17" | xmlstarlet fo
<?xml version="1.0"?>
<STCResourceProfile
xmlns="http://www.ivoa.net/xml/STC/stc-v1.30.xsd"
xmlns:xlink="http://www.w3.org/1999/xlink"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.ivoa.net/xml/STC/stc-v1.30.xsd http://vo.ari.uni-heidelberg.de/docs/schemata/stc-v1.30.xsd http://www.w3.org/1999/xlink http://vo.ari.uni-heidelberg.de/docs/schemata/xlink.xsd">
<AstroCoordSystem id="thgloml">
<SpaceFrame id="thdbgwl">
<ICRS/>
<UNKNOWNRefPos/>
<SPHERICAL coord_naxes="2"/>
</SpaceFrame>
</AstroCoordSystem>
<AstroCoordArea coord_system_id="thgloml">
<Polygon frame_id="thdbgwl" unit="deg">
<Vertex>
<Position>
<C1>20.0</C1>
<C2>20.0</C2>
</Position>
</Vertex>
<Vertex>
<Position>
<C1>21.0</C1>
<C2>19.0</C2>
</Position>
</Vertex>
<Vertex>
<Position>
<C1>18.0</C1>
<C2>17.0</C2>
</Position>
</Vertex>
</Polygon>
</AstroCoordArea>
</STCResourceProfile>
$ gavostc resprof "Circle FK5 -10 340 3" | gavostc parseX -
Circle FK5 -10.0 340.0 3.0
$ gavostc conform "Position GALACTIC 3 4 VelocityInterval Velocity 0.01 -0.002 unit deg/cy" "Position FK5"
Position FK5 264.371974024 -24.2795040403 VelocityInterval Velocity 0.00768930497899 0.00737459624525 unit deg/cy
$ gavostc utypes 'Redshift TOPOCENTER VELOCITY "z" Error "e_z" PixSize "p_z"'
AstroCoordSystem.RedshiftFrame.value_type = VELOCITY
AstroCoordSystem.RedshiftFrame.DopplerDefinition = OPTICAL
AstroCoordSystem.RedshiftFrame.ReferencePosition = TOPOCENTER
AstroCoords.Redshift.Error -> e_z
AstroCoords.Redshift.Value -> z
AstroCoords.Redshift.PixSize -> p_z
$ gavostc utypes 'Redshift TOPOCENTER VELOCITY "z" Error "e_z" PixSize "p_z"'\
| gavostc parseUtypes
Redshift TOPOCENTER VELOCITY "z" Error "e_z" PixSize "p_z"
Programmatic¶
The library is written in python, and thus currently can only be operated from python programs. It should not be too hard to embed it into C or even Java programs. If you have such needs, contact the author.
See API for details.
Limitations¶
Internally, all dates and times are represented as datetimes, and all information whether they were JDs or MJDs before is discarded. Thus, you cannot generate STC with M?JDTime.
All stc:DataModel utypes are ignored. On output and request, only stc:DataModel.URI is generated, fixed to uri stc.STCNamespace.
“Library” coordinate systems for ECLIPTIC coordinates are not supported since it is unclear to me how the equinox of those is expressed.
On system transformations, ellipses are not rotated, just moved. No “wiggles” (errors, etc) are touched at all.
There currently is no real API for “bulk” transforms, i.e., computing a transformation once and then applying it to many coordinates. The code is organized to make it easy to add such a thing, though.
Serialization of floats and friends is with a fixed format that may lose precision for very accurate values. The solution will probably be a floatFormat attribute on the frame/metadata object, but I’m open to other suggestions.
Reference positions are not supported in any meaningful way. In particular, when transforming STCs, transformations between all reference positions are identities. This won’t hurt much for galactic or extragalactic objects but of course makes the whole thing useless for solar-system work. If someone points me to a concise collection of pertinent formulae, adding real reference positions transformations should not be hard.
The behaviour of some transforms (in particular FK5<->FK4) close to the poles needs some attention.
Empty coordinate values (e.g., 2D data with just one coordinate) are not really supported. Processing them will, in general, work, but will, in general, not yield the expected result. This is fixable, but may require changes in the data model.
No generic coordinates. Those can probably be added relatively easily, but it would definitely help if someone had a clear use case for them.
Spectral errors and their “wiggles” (error, size, etc) must be in the same “flavor”, i.e., either frequency, wavelength, or energy. If they are not, the library will silently fail. This is easily fixable, but there’s too much special casing in the code as is, and I consider this a crazy corner case no one will encounter.
No reference on posAngles, always assumed to be ‘X’.
Spatial intervals are system-conformed analogous to geometries, so any distance information is disregarded. This will be fixed on request.
No support for Area.
Frame handling currently is a big mess; in particular, the system changing functions assume that the frames on positions, velocities and geometries are identical. I’ll probably move towards requiring astroCoords being in astroSystem.
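To illustrate the first limitation above (dates and times all end up as datetimes), here is a minimal sketch using only the standard library. The function names are made up for this example and are not part of the STC library; time scales are ignored entirely.

```python
from datetime import datetime, timedelta

# MJD 0 corresponds to 1858-11-17T00:00:00, i.e., JD 2400000.5.
MJD_EPOCH = datetime(1858, 11, 17)
JD_MJD_OFFSET = 2400000.5


def mjdToDatetime(mjd):
    """returns a datetime for a modified julian date (hypothetical helper)."""
    return MJD_EPOCH + timedelta(days=mjd)


def jdToDatetime(jd):
    """returns a datetime for a julian date (hypothetical helper)."""
    return mjdToDatetime(jd - JD_MJD_OFFSET)


# Both inputs describe the same instant, so the resulting datetimes are
# equal, and nothing records which notation was used originally.
fromMJD = mjdToDatetime(51544.5)
fromJD = jdToDatetime(2451545.0)
print(fromMJD, fromJD, fromMJD == fromJD)
```

Once a value is a datetime, there is no way to tell whether it should be written out as JDTime or MJDTime, which is why neither can be generated.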
Extensions to STC-S¶
After ECLIPTIC, FK4, or FK5, an equinox specification is allowed. This is either J<number> or B<number>.
For velocities, arbitrary combinations of spaceUnit/timeUnit are allowed.
To allow the notation of STC Library coordinate systems, you can give a System keyword with an STC Library tag at the end of a phrase (e.g., System TT-ICRS-TOPO). This overwrites all previous system information (e.g., Time ET Position FK4 System TT-ICRS-TOPO will result in TT time scale and ICRS spatial frame). We admit it’s not nice, and are open to suggestions for better solutions.
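The equinox extension mentioned above (J<number> or B<number>) could be recognized along the following lines. This is only a sketch: the pattern and the function name are assumptions made for illustration, and the library's actual grammar in stcs.py may accept a different set of number formats.

```python
import re

# Hypothetical pattern, not taken from the library's grammar.
_equinoxPat = re.compile(r"^([JB])(\d+(?:\.\d*)?)$")


def parseEquinox(token):
    """returns (kind, year) for tokens like J2000.0 or B1950, None for
    anything else.
    """
    mat = _equinoxPat.match(token)
    if mat is None:
        return None
    return mat.group(1), float(mat.group(2))


print(parseEquinox("B1900.0"))
print(parseEquinox("J2000"))
print(parseEquinox("FK5"))
```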
Other Deviations from the Standard¶
Units on geometries default to deg, deg when parsing from STC-X.
The equinox to be used for ECLIPTIC isn’t quite clear from the specs. The library will use a specified equinox if given, else the time value if given, else the equinox will be None (which probably is not terribly useful).
Bugs¶
Conversions between TT and TCB are performed using the rough approximation of the explanatory supplement rather than the more exact expression.
TT should be extended to ET prior to 1973, but this is not done yet.
STC-S parse errors are frequently not very helpful.
Invalid STC-X documents may be accepted and yield nonsensical ASTs (this will probably not be fixed since it would require running a validating parser, which with XSD is not funny, but I’m open to suggestions).
API¶
The public API to the STC library is obtained by:
from gavo import stc
This is assumed for all examples below.
The Data Model¶
The STC library turns all input into a tree called AST (“Abstract Syntax Tree”, since it abstracts away the details for parsing from whatever serialisation you employ).
The ASTs are following the STC data model quite closely. However, it turned out that – even with the changes already in place – this is quite inconvenient to work with, so we will probably change it after we’ve gathered some experience. It is quite likely that we will enforce a much stricter separation between data and metadata, i.e., unit, error and such will go from the positions to what is now the frame object.
Thus, we don’t document the data model fully yet. The gory details are in dm.py. Meanwhile, we will try to maintain the following properties:
All objects in ASTs are considered immutable, i.e., nobody is supposed to change them once they are constructed.
An AST object has attributes time, place, freq, redshift, velocity referring to objects describing quantities, or None if not given. These are called “positions” in the following.
An AST object has attributes timeAs, areas, freqAs, redshiftAs, velocityAs containing sequences of intervals or geometries of the respective quantities. These sequences are empty if nothing is specified. They are called areas in the following.
Both positions and areas have a frame attribute giving the frame (for spatial coordinates, these have flavor, nDim, refFrame, equinox, and refPos attributes, quite like in STC).
Positions have a values attribute containing either a python float or a tuple of floats (for spatial and velocity coordinates). For time coordinates, a datetime.datetime object is used instead of a float.
Positions have a unit attribute. We will keep this even if all other metadata move to the frame object. The unit attribute follows the coordinate values, i.e., they are tuple-valued when the values are tuples. For velocities and redshifts, there is a velTimeUnit as well.
ASTs have a cooSystem attribute with, in turn, spaceFrame, timeFrame, spectralFrame, and redshiftFrame attributes.
NULL is consistently represented as None, except when the values would be sequences, in which case NULL is an empty tuple.
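The conventions in this list (immutable objects, tuple-valued units following the values, None or an empty tuple for NULL) can be mimicked with a frozen dataclass. The class below is a stand-in invented for this sketch; it is not one of the classes in dm.py, and the library may well implement immutability differently.

```python
import math
from dataclasses import dataclass, replace, FrozenInstanceError
from typing import Optional, Tuple


@dataclass(frozen=True)
class SketchPosition:
    """A stand-in for a spatial position; NOT a class from dm.py."""
    values: Tuple[float, ...] = ()   # NULL for sequences is an empty tuple
    unit: Tuple[str, ...] = ()       # tuple-valued because values is
    frame: Optional[str] = None      # NULL for everything else is None


pos = SketchPosition(values=(12.0, 12.0), unit=("deg", "deg"), frame="ICRS")

try:
    pos.unit = ("rad", "rad")
except FrozenInstanceError:
    print("cannot change a position in place")

# Build a changed copy instead; values and unit are kept consistent.
inRad = replace(pos,
    values=tuple(math.radians(v) for v in pos.values),
    unit=("rad", "rad"))
print(pos.unit, inRad.unit)
```

Code working on ASTs should follow the same pattern: never assign to attributes of existing nodes, always construct new ones.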
Parsing STC-X¶
To parse an STC-X document, use stc.parseSTCX(literal). You pass in a string containing STC-X and receive the ASTs for the STC elements found in it.
Since STC documents should in general be rather small, there should be no necessity for a streaming API. If you want to read directly from a file, you could use something like:
def parseFromFile(fName):
    # STC-X documents are small, so reading them in one go is fine
    with open(fName) as f:
        stcxLiteral = f.read()
    return stc.parseSTCX(stcxLiteral)
The return value is a sequence of pairs of (tagName, ast), where tagName is the namespace qualified name of the root element of the STC element. The tagName is present since multiple STC trees may be present in one STC-X document. The qualification is in standard W3C form, i.e., {<namespace URI>}<element name>. If you do not care about versioning (and you should not need to with this library), you could find a specific element using a construct like:
def getSTCElement(literal, elementName):
    # returns None if no element of that name is in the document
    for rootName, ast in stc.parseSTCX(literal):
        if rootName.endswith('}'+elementName):
            return ast

getSTCElement(open("M81.xml").read(), "ObservationLocation")
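If you want to see what such qualified names look like without involving the STC library, python's standard ElementTree uses the same {namespace}name notation. The following sketch takes the namespace URI from the resprof output shown earlier; localName is a helper made up for this example.

```python
import xml.etree.ElementTree as ET

# Namespace URI as it appears in the resprof example output.
STC_NS = "http://www.ivoa.net/xml/STC/stc-v1.30.xsd"

literal = ('<ObservationLocation xmlns="%s">'
    '<AstroCoordSystem id="x"/></ObservationLocation>') % STC_NS

root = ET.fromstring(literal)
print(root.tag)


def localName(qName):
    """returns the element name from a {namespace}name string
    (hypothetical helper).
    """
    return qName.rsplit("}", 1)[-1]


print(localName(root.tag))
```

Matching on the part after the closing brace, as getSTCElement does, is what makes the lookup independent of the schema version encoded in the namespace URI.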
Note that the STC library does not contain a validating parser. Invalid STC-X documents will at best give you rather incomprehensible error messages, at worst an AST that has little to do with what was in the document. If you are not sure whether the STC-X you receive is valid, run a schema validator before parsing.
We currently understand a subset of STC-X that matches the expressiveness of STC-S. Most STC-X features that cannot be mapped to STC-S are silently ignored.
Generating STC-X¶
To generate STC-X, use the stc.getSTCX(ast, rootElement) -> str function. Since there are quite a few root elements possible, you have to explicitly pass one. You can find root elements in stc.STC. It is probably a good idea to only use ObservatoryLocation, ObservationLocation, and STCResourceProfile right now. Ask the authors if you need something else.
There is the shortcut stc.getSTCXProfile(ast) -> str that is equivalent to stc.getSTCX(ast, stc.STC.STCResourceProfile).
Parsing STC-S¶
To parse an STC-S string into an AST, use stc.parseSTCS(str) -> ast. The most common exception this may raise is stc.STCSParseError, though others are conceivable.
Generating STC-S¶
To turn an AST into STC-S, use stc.getSTCS(ast) -> str. If you pass in ASTs that use features not supported by STC-S, you should get an STCNotImplementedError or an STCValueError.
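Parsing and generating can be combined into a small round-trip helper that catches the exceptions named above. This is a sketch only: normalizeSTCS is a made-up name, and it assumes that all three exception classes are available from the stc namespace, which you should check before relying on it. The module is passed in as an argument so the helper itself does not depend on how you import it.

```python
def normalizeSTCS(stc, literal):
    """returns literal as re-serialized by the library, or None if it
    cannot be parsed or cannot be expressed in STC-S.

    stc is the module obtained through "from gavo import stc".
    """
    try:
        ast = stc.parseSTCS(literal)
        return stc.getSTCS(ast)
    except (stc.STCSParseError, stc.STCNotImplementedError,
            stc.STCValueError):
        # assumption: these three names are all exported by stc
        return None
```

With the library installed, normalizeSTCS(stc, "Circle FK5 -10 340 3") would be a typical call. Since other exceptions are conceivable, you may want a broader except clause in production code.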
Generating Utypes¶
For embedding STC into VOTables, utypes are used. To turn an AST object into utypes, use stc.getUtypes(ast) -> dict, dict. The function returns a pair of dictionaries:
the first dictionary, the “system dict”, maps utypes to values. All utypes belong to AstroCoordSystem and into this group.
the second dictionary, the “columns dict”, maps values to utypes.
Of course, the columns dict doesn’t make much sense with ASTs actually containing values. To use it in a way useful for VOTables, you can define your columns’ STC using “quoted STC-S”. In this format, you have identifiers in double quotes instead of normal STC-S values. Despite the double quotes, only python-compatible identifiers are allowed, i.e., these are not quoted identifiers in the SQL sense. The stc.parseQSTCS(str) -> ast function parses such strings.
Consider:
In [5]:from gavo import stc
In [6]:stc.getUtypes(stc.parseQSTCS(
...:'Position ICRS "ra" "dec" Error "e_p" "e_p"'))
Out[6]:
({'AstroCoordSystem.SpaceFrame.CoordFlavor': 'SPHERICAL',
'AstroCoordSystem.SpaceFrame.CoordRefFrame': 'ICRS',
'AstroCoordSystem.SpaceFrame.ReferencePosition': 'UNKNOWNRefPos'},
{'dec': 'AstroCoords.Position2D.Value2.C2',
'e_p': 'AstroCoords.Position2D.Error2Radius',
'ra': 'AstroCoords.Position2D.Value2.C1'})
Note that there is no silly “namespace prefix” here. Nobody really knows what those prefixes really mean with utypes. When sticking these things into VOTables, you will currently need to stick an “stc:” in front of those.
Parsing Utypes¶
When parsing a VOTable, you can gather the utypes encountered into dictionaries as returned by getUtypes. You can then pass these to parseFromUtypes(sysDict, colDict) -> ast. The function does not expect any namespace prefixes on the utypes.
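Since getUtypes produces bare utypes, VOTables currently want an “stc:” in front of them, and parseFromUtypes again expects bare utypes, you will probably need a pair of small helpers to add and remove the prefix. The following is a sketch of what these might look like; addPrefix and stripPrefix are names invented here and are not part of the library. Note that utypes are the keys of the system dict but the values of the columns dict.

```python
def addPrefix(sysDict, colDict, prefix="stc:"):
    """returns copies of the getUtypes dictionaries with prefix
    prepended to every utype.
    """
    return (
        dict((prefix+utype, value) for utype, value in sysDict.items()),
        dict((colName, prefix+utype) for colName, utype in colDict.items()))


def stripPrefix(sysDict, colDict, prefix="stc:"):
    """returns copies of the dictionaries with a leading prefix removed
    from every utype.
    """
    def strip(utype):
        if utype.startswith(prefix):
            return utype[len(prefix):]
        return utype
    return (
        dict((strip(utype), value) for utype, value in sysDict.items()),
        dict((colName, strip(utype)) for colName, utype in colDict.items()))


# The dictionaries from the getUtypes example above.
sysDict = {
    'AstroCoordSystem.SpaceFrame.CoordFlavor': 'SPHERICAL',
    'AstroCoordSystem.SpaceFrame.CoordRefFrame': 'ICRS',
    'AstroCoordSystem.SpaceFrame.ReferencePosition': 'UNKNOWNRefPos'}
colDict = {
    'dec': 'AstroCoords.Position2D.Value2.C2',
    'e_p': 'AstroCoords.Position2D.Error2Radius',
    'ra': 'AstroCoords.Position2D.Value2.C1'}

prefixed = addPrefix(sysDict, colDict)
print(prefixed[1]["ra"])
print(stripPrefix(*prefixed) == (sysDict, colDict))
```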
Conforming¶
You can force two ASTs to be expressed in the same frames, which we call “conforming”. As mentioned above, only reference frames and equinoxes are conformed right now, i.e., the conversion from Galactic to FK5 1980.0 coordinates should work correctly. Reference positions are ignored, i.e., conforming ICRS TOPOCENTER to ICRS BARYCENTER will not change values.
To convert coordinates in ast1 to the frame defined by ast2, use the stc.conformTo(ast1, ast2) -> ast function. This could look like this:
>>> p = stc.parseSTCS("Circle ICRS 12 12 1")
>>> stc.conformTo(p, stc.parseSTCS("Position GALACTIC"))
>>> stc.conformTo(p, stc.parseSTCS("Position GALACTIC")).areas[0].center
(121.59990883115164, -50.862855782323962)
Conforming also works for units:
>>> stc.conformTo(p, stc.parseSTCS("Position GALACTIC unit rad")).areas[0].center
(2.1223187792285256, -0.8877243003685894)
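As a plausibility check that does not need the library, the two results above should differ only by the degree-to-radian conversion. A quick sketch with the numbers copied from the two sessions:

```python
import math

inDeg = (121.59990883115164, -50.862855782323962)
inRad = (2.1223187792285256, -0.8877243003685894)

for deg, rad in zip(inDeg, inRad):
    # the tolerance allows for rounding in the library's own conversion
    print(math.isclose(math.radians(deg), rad, rel_tol=1e-9))
```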
Transformation¶
For simple transformations, you can ask DaCHS to give you a function just turning simple positions into positions. For instance,
from gavo import stc
toICRS = stc.getSimple2Converter(
stc.parseSTCS("Position FK4 B1900.0"),
stc.parseSTCS("Position ICRS"))
print(toICRS(30, 40))
shows how to turn positions given in the B1900 equinox (don’t sweat the reference system for data that old) to ICRS.
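Because the converter is built once and is then an ordinary function, you can apply it to many positions yourself until there is a real bulk API. A minimal sketch, with convertAll being a name made up for this example; it makes no assumption on what the converter returns and simply collects the results:

```python
def convertAll(converter, positions):
    """returns a list of converter(lon, lat) for each (lon, lat) pair
    in positions.

    converter is a function as returned by stc.getSimple2Converter.
    """
    return [converter(lon, lat) for lon, lat in positions]
```

With toICRS from the example above, this would be called as convertAll(toICRS, [(30, 40), (31, 41)]).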
Equivalence¶
For some applications it is necessary to decide if two STC specifications are equivalent. Python’s built-in equivalence operator requires all values in two ASTs to be identical except for the values of id attributes.
Frequently, you want to be more lenient:
you might decide that unspecified values match anything
you may ignore certain keys entirely (e.g., the reference position when you’re doing extragalactic work or when a parallax error doesn’t matter)
you may want to view certain combinations as equivalent (e.g., ICRS and J2000 are quite close)
To support this, the STC library lets you define EquivalencePolicy objects. There is a default equivalence policy ignoring the reference position, defining ICRS and FK5 J2000 as equivalent, and matching Nones to anything. This default policy is available as stc.defaultPolicy. It has a single method, match(sys1, sys2) -> boolean with the obvious semantics. Note, however, that you pass in systems, i.e., ast.cooSystem rather than ASTs themselves.
You can define your own equivalence policies. Tell us if you want that and we’ll document it. In the meantime, check stc/eq.py.
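To make the three kinds of leniency concrete, here is a toy policy working on plain dictionaries rather than on real system objects. Everything in it (the class, its constructor arguments, the dictionary keys) is invented for this sketch; the actual EquivalencePolicy in stc/eq.py is organized differently.

```python
class SketchPolicy:
    """A toy equivalence policy over plain dictionaries.

    This is NOT stc.EquivalencePolicy.
    """
    def __init__(self, ignoredKeys=(), equivalents=()):
        self.ignoredKeys = set(ignoredKeys)
        # each item of equivalents is a set of values regarded as equal
        self.equivalents = [frozenset(s) for s in equivalents]

    def _valuesMatch(self, v1, v2):
        if v1 is None or v2 is None:
            # unspecified values match anything
            return True
        if v1 == v2:
            return True
        return any(v1 in s and v2 in s for s in self.equivalents)

    def match(self, sys1, sys2):
        for key in set(sys1) | set(sys2):
            if key in self.ignoredKeys:
                continue
            if not self._valuesMatch(sys1.get(key), sys2.get(key)):
                return False
        return True


policy = SketchPolicy(
    ignoredKeys=["refPos"],
    equivalents=[{("ICRS", None), ("FK5", "J2000.0")}])

icrs = {"spaceFrame": ("ICRS", None), "refPos": "BARYCENTER",
    "timeScale": "TT"}
fk5 = {"spaceFrame": ("FK5", "J2000.0"), "refPos": "TOPOCENTER",
    "timeScale": None}
galactic = {"spaceFrame": ("GALACTIC", None)}

print(policy.match(icrs, fk5))
print(policy.match(icrs, galactic))
```

The first comparison succeeds because the reference position is ignored, the missing time scale matches anything, and the two space frames are declared equivalent; the second fails on the space frame.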
Hacking¶
For those considering contributing code, here is a short map of the source code:
cli – the command line interface
common – exceptions, some constants, definition of the AST node base class
conform – high-level code for transformations between reference systems, units, etc.
spherc.py, sphermath.py – low-level transformations for spherical coordinate systems used by conform
times – helpers for converting time formats, plus transformations between time scales used by conform.
dm – the core data model, i.e. definitions of the classes of the objects making up the ASTs
stcsast.py, stcxast.py – tree transformers from STC-S and STC-X concrete syntax trees to ASTs.
stcsgen.py, stcxgen.py – serializers from ASTs to STC-S and STC-X
utypegen.py, utypeast.py – code generating and parsing utype dictionaries. These are thin wrappers around the STC-X code.
stcs.py, stcsdefaults.py – a grammar for STC-S and a definition of the defaults used during parsing and generation of STC-S.
units.py – units defined by STC, and transformations between them
Since the STC serializations and the sheer size of STC are not really amenable to a straightforward implementation, the stc*[gen|ast] code is not exactly easy to read. There’s quite a bit of half-assed metaprogramming going on, and thus these probably are not modules you’d want to touch if you don’t want to invest substantial amounts of time.
The conform, spherc, sphermath, units and times combo, though, shouldn’t be too opaque. Start in conform.py, which contains the “master” code for the transformations (which may need some reorganization when we transform spectral and redshift coordinates as well).
Then, things get fanned out; in the probably most interesting case of spherical coordinates, this is to spherc.py. That module defines lots of transformations and getTrafoFunction. All the spherical coordinate stuff uses an internal representation of STC, six vectors and frame triples; see conform.conformSystems on how to obtain these.
To introduce a new transformation, write a function or a matrix implementing it and enter it into the list in the construction of _findTransformsPath.
Either way: If you’re planning to hack on the library, please let us know at gavo@ari.uni-heidelberg.de. We’ll be delighted to help out with further hints.
Extending STC-S¶
Here’s an example for an extension to STC-S: Let’s handle the planetary ephemeris element.
Checking the schema, you’ll see only two literals are allowed for the ephemeris: JPL-DE200 and JPL-DE405. So, in stcs._getSTCSGrammar, near the definition of refpos, add:
plEphemeris = Keyword("JPL-DE200") | Keyword("JPL-DE405")
The plan is to allow the optional specification of the ephemeris used after refpos. Now grep for the occurrences of refpos and notice that there are quite a number of them. So, rather than fixing all those rules, we change the refpos rule from:
refpos = (Regex(_reFromKeys(stcRefPositions)))("refpos")
to:
refpos = ((Regex(_reFromKeys(stcRefPositions)))("refpos")
    + Optional( plEphemeris("plEphemeris") ))
We can test this. In stcstest.STCSSpaceParsesTest, let’s add the sample:
("position", "Position ICRS TOPOCENTER JPL-DE200"),
Now, the refpos nodes are handled in the _makeRefpos function, looking like this:
def _makeRefpos(node):
    refposName = node.get("refpos")
    if refposName=="UNKNOWNRefPos":
        refposName = None
    return dm.RefPos(standardOrigin=refposName)
The node passed in here is a pyparsing node. Since in our data model, None is always null/ignored, we can just take the planetary ephemeris if it’s present, and the system will do the right thing if it’s not there:
def _makeRefpos(node):
    refposName = node.get("refpos")
    if refposName=="UNKNOWNRefPos":
        refposName = None
    return dm.RefPos(standardOrigin=refposName,
        planetaryEphemeris=node.get("plEphemeris"))
Let’s test this; testing STC-S to AST parsing takes place in stctest.py, so let’s add a method to CoordSysTest:
def testPlanetaryEphemeris(self):
    ast = stcsast.parseSTCS("Time TT TOPOCENTER JPL-DE200")
    self.assertEqual(ast.astroSystem.timeFrame.refPos.planetaryEphemeris,
        "JPL-DE200")
Thus, we can parse the ephemeris spec from STC-S. To generate it, two things need to be done: The DM item must be transformed into the CST the STC-S is built from, and the part of the CST must be flattened out. Both things happen in stcsgen.py. The CST is just nested dictionaries. Refpos handling happens in refPosToCST, so replace:
def refPosToCST(node):
    return {"refpos": node.standardOrigin}
with:
def refPosToCST(node):
    return {
        "refpos": node.standardOrigin,
        "planetaryEphemeris": node.planetaryEphemeris,}
To flatten that out to the finished string, the flatteners need to be told that you want that key noticed. Grepping for refpos shows that it’s used in several places. So, let’s define a “common flattener”, which is a function taking a value and the CST node (i.e., a dictionary) the value was taken from and returning a string ready for inclusion into the STC-S. The flattener here would look like this:
def _flattenRefPos(val, node):
    return _joinWithNull([node["refpos"], node["planetaryEphemeris"]])
The _joinWithNull call makes sure that empty specifications do not show up in the result.
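In case you wonder what such a helper has to do, here is a sketch of the idea. It is a stand-in written for this text under the assumption that “empty” means None or an empty string; the actual _joinWithNull in stcsgen.py may differ in details.

```python
def joinSkippingNulls(items, separator=" "):
    """returns the non-empty members of items joined by separator.

    A hypothetical stand-in for stcsgen._joinWithNull.
    """
    return separator.join(
        str(item) for item in items
        if item is not None and item != "")


print(joinSkippingNulls(["BARYCENTER", "JPL-DE405"]))
print(joinSkippingNulls(["BARYCENTER", None]))
```

This way, a reference position without an ephemeris serializes exactly as it did before the extension.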
This “global” flattener is now entered into _commonFlatteners, a dictionary mapping specific CST keys to flatten functions:
_commonFlatteners = {
    ...
    "refpos": _flattenRefPos,
}
The most convenient way to test this is to define a round-trip test. These again reside in stcstest. Use BaseGenerationTest and add a sample pair like this:
("Redshift BARYCENTER JPL-DE405 3.5",
"Redshift BARYCENTER JPL-DE405 3.5")
With this, you should be done.