The Collins Economics Result Object (CERO)

A core concept in the operation of ConCERO is that of a ‘Collins Economic Results Object’ - a CERO - which serves as a standard format for data interchange between economic modelling programs. Conceptually, the CERO is a set of instances of a ‘fundamental data type’, which is discussed in ConCERO’s Design Philosophy documentation.

Software-wise, the CERO is a pandas.DataFrame with some additional constraints. Those constraints are:

  • cero.index must be an instance of the pandas.Index class, and
  • cero.columns must be an instance of the pandas.DatetimeIndex class, and
  • the values of both cero.index and cero.columns must be unique, and
  • all index values must be valid identifiers (see below), and
  • all cero data (array) values must be of a 32-bit floating-point type (specifically, instances of a subclass of the numpy.float32 class),

where cero is a CERO. The values of cero.index are referred to as identifiers.
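The structural constraints above can be sketched as a small check on a pandas.DataFrame. This is an illustration only, not ConCERO's own validation code: the function name check_cero_structure and the sample data are hypothetical, and identifier validity is deliberately left out here.

```python
import numpy as np
import pandas as pd


def check_cero_structure(df):
    """Return a list of violated structural constraints (empty if none).

    Hypothetical helper for illustration; not part of ConCERO.
    """
    problems = []
    if not isinstance(df.index, pd.Index):
        problems.append("index is not a pandas.Index")
    if not isinstance(df.columns, pd.DatetimeIndex):
        problems.append("columns is not a pandas.DatetimeIndex")
    if not df.index.is_unique:
        problems.append("index values are not unique")
    if not df.columns.is_unique:
        problems.append("column values are not unique")
    if not all(dtype == np.float32 for dtype in df.dtypes):
        problems.append("data values are not all numpy.float32")
    return problems


# Made-up sample data: two identifiers (one string, one tuple), two years.
cero = pd.DataFrame(
    np.array([[1.0, 2.0], [3.0, 4.0]], dtype=np.float32),
    index=pd.Index(["GDP", ("L_OUTPUT", "Electricity", "AUS")], tupleize_cols=False),
    columns=pd.to_datetime(["2017-01-01", "2018-01-01"]),
)

print(check_cero_structure(cero))                     # no violations expected
print(check_cero_structure(cero.astype(np.float64)))  # dtype violation expected
```

Returning a list of problems rather than a single boolean makes it easier to see which constraint a candidate object fails.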

CERO Identifiers

As mentioned previously, the values of the index of a CERO are referred to as identifiers. Identifiers are subject to two restrictions:

  • The identifier must be unique - that is, no other value of cero.index can be exactly the same.
  • The identifier must be either:
    • a string (str) with no commas, or
    • a tuple of strings, where each string does not have any commas.
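As a rough sketch, the two restrictions can be expressed in a few lines of Python. The function names is_valid_identifier and all_identifiers_valid are hypothetical and are not taken from ConCERO itself.

```python
def is_valid_identifier(value):
    """Check one index value: a comma-free str, or a tuple of comma-free strs."""
    if isinstance(value, str):
        return "," not in value
    if isinstance(value, tuple):
        return all(isinstance(part, str) and "," not in part for part in value)
    return False


def all_identifiers_valid(index_values):
    """Check that every value is a valid identifier and that none is repeated."""
    values = list(index_values)
    # Validity is tested first, so only hashable values reach set().
    return all(is_valid_identifier(v) for v in values) and len(set(values)) == len(values)


print(is_valid_identifier(("L_OUTPUT", "Electricity", "AUS")))
print(is_valid_identifier("hello,world"))
print(all_identifiers_valid(["GDP", "GDP"]))
```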

The comma constraint is a result of how ConCERO interprets commas when reading YAML files - ConCERO interprets commas as a string-splitting character. Thus, if a configuration file contains the string:

"hello,world"

in the context of CERO identifiers, then this will be interpreted as the Python tuple:

('hello','world')

Note also that any whitespace is stripped when the string is split, so the string:

"hello, world"

also becomes:

('hello','world')

and this:

" L_OUTPUT, Electricity, AUS"

becomes:

("L_OUTPUT","Electricity","AUS")
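The splitting behaviour described above can be mimicked with a short function. This is a sketch of the described behaviour, not ConCERO's actual parser. The name parse_identifier is hypothetical, and it is an assumption that a string without commas stays a plain (stripped) string rather than becoming a one-element tuple.

```python
def parse_identifier(text):
    """Split on commas and strip surrounding whitespace from each part."""
    parts = tuple(part.strip() for part in text.split(","))
    # Assumption: a comma-free string is kept as a str, not a 1-tuple.
    return parts if len(parts) > 1 else parts[0]


print(parse_identifier("hello,world"))
print(parse_identifier("hello, world"))
print(parse_identifier(" L_OUTPUT, Electricity, AUS"))
```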

The advantage of the tuple form of identifier is that it preserves ordered relationships, even though that ordered relationship has no meaning within the CERO itself. This is necessary to store data that is more than 2-dimensional in nature in 2 dimensions. It also allows for the implementation of sets (see Sets), which provide the user with significant flexibility and power with respect to selecting identifiers of interest. In summary, sets allow the user to select a large number of identifiers by listing sets rather than listing every identifier.
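To illustrate the idea, the sketch below flattens made-up three-dimensional data (variable, commodity, region) into the rows of a 2-dimensional frame using tuple identifiers, then selects rows by membership of one tuple position in a set. It does not use ConCERO's set syntax: the helper select_by_position and all names and values are hypothetical.

```python
import numpy as np
import pandas as pd

variables = ["L_OUTPUT"]
commodities = ["Electricity", "Coal"]
regions = ["AUS", "USA"]

# One tuple identifier per (variable, commodity, region) combination.
identifiers = [(v, c, r) for v in variables for c in commodities for r in regions]

cero = pd.DataFrame(
    np.arange(8, dtype=np.float32).reshape(4, 2),
    index=pd.Index(identifiers, tupleize_cols=False),
    columns=pd.to_datetime(["2017-01-01", "2018-01-01"]),
)


def select_by_position(df, position, members):
    """Keep rows whose tuple identifier has an element of `members` at `position`."""
    mask = [
        isinstance(ident, tuple) and len(ident) > position and ident[position] in members
        for ident in df.index
    ]
    return df.loc[mask]


aus_rows = select_by_position(cero, 2, {"AUS"})
print(list(aus_rows.index))
```

Because the position of each element within the tuple is preserved, a single set such as {"AUS"} selects every matching identifier without listing each one.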

Created on Wed Dec 20 10:20:32 2017

@author: Lyle Collins @email: Lyle.Collins@csiro.au