Data

class listmode.data.BitProcessor(info)[source]

Bases: listmode.data.ColProcessor

process(in_list, out, t_front, ev_count)[source]
Parameters
  • in_list – list of data_dicts, one per channel

  • out – list of initialized output data arrays

  • t_front – current position in data

  • ev_count – number of hits per channel in the event

  • ev_num – number of hits per channel in the event

Returns

class listmode.data.ColProcessor(info)[source]

Bases: object

Simple class for aggregating data in event building. It is initialized with the data info (like in extras definition) including what happens when multiple events are found within the same time window.

The process method is given input events, channel mask and output data structure. Output data is modified in-place. Each instance of a class is only updating its own part of the data (energy, timing, coord, etc.) and is supposed to be run in a pipeline for every event.

process(in_list, out, t_front, ev_count)[source]
Parameters
  • in_list – list of data_dicts, one per channel

  • out – list of initialized output data arrays

  • t_front – current position in data

  • ev_count – number of hits per channel in the event

  • ev_num – number of hits per channel in the event

Returns

class listmode.data.Data(config)[source]

Bases: object

Sort of generic data class, with plug-in dataloader for vendor specific raw data and extra data items configurable via configuration file.

Timestamp is always present in all kinds of data. It stores the event time in nanoseconds since the start of the data. It is always a 64-bit unsigned integer and is handled in a special way by listmode. All other data is defined by info dictionaries that are of the form:

info_dict = {"name": "some_data", "type": "u1", "num_col": 2, "aggregate": "col", "ch_mask": [1, 1, 0, 0], "multi": "mean"}
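As a minimal sketch of how an info dictionary of this form might be consumed, the snippet below allocates an output array from the "type" and "num_col" keys and lists the channels flagged in "ch_mask". The helper name allocate_output is hypothetical and not part of listmode, and reading a 1 in ch_mask as "channel contributes" is an assumption. Only the dictionary keys come from the description above.

```python
import numpy as np

# Info dictionary in the form described above.
info_dict = {
    "name": "some_data",
    "type": "u1",            # numpy type string: 1-byte unsigned integer
    "num_col": 2,            # number of columns in the output array
    "aggregate": "col",
    "ch_mask": [1, 1, 0, 0],
    "multi": "mean",
}


def allocate_output(info, num_events):
    """Hypothetical helper: zero-filled array of shape (num_events, num_col)."""
    return np.zeros((num_events, info["num_col"]), dtype=info["type"])


out = allocate_output(info_dict, 5)

# Assumption: a 1 in ch_mask marks a channel that contributes to this data item.
active_channels = [ch for ch, flag in enumerate(info_dict["ch_mask"]) if flag]
```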

Data is held in a data dictionary, with data name as key and memmap of the data as the value. (Currently the data is also held as members with same name as the data for backward compatibility using _update method).

All data is stored in numpy arrays in the data dict. The data dict always contains time_vec and data_mat: time and energy information of events. data_mat is defined by ‘events’ info dict in the configuration file.

Data dict can also contain extra data, defined by the ‘extras’ list of info dicts in the configuration file. A few types of extras are hardcoded into Listmode and are handled in a special way if they are present:

  • coord – Coordinate information. Correspondence of channels to coordinate columns is given by config.det[‘coordinates’]. This is used for data selection and plots.

  • latency – Timing information. Each column is the time difference between the ‘main’ channel and the other channels in the event. Used to tune the latency and coincidence window.

  • multihit – A flag that is raised if a channel has several hits per event. A type of nondestructive pileup. The energy value of a multihit event is calculated using a function defined by the ‘multi’ keyword.

All other extras are just carried along with the data and can be plotted (not quite yet) or used for event selection (not there either).

get_data_block(t_slice=None)[source]

Get data and time vectors, but processed in chunks of 1M events to save memory. Optionally, define a slice in time. The method should be called in a loop to read everything. All data including extras is returned.

The last return value, isdata, indicates whether there is more data to come. When it is False the loop should be stopped, but the data returned with it is still valid.

Parameters

t_slice – A tuple of start and stop times in nanoseconds. Full data is set to be read if this is None. The time slice should never be changed while reading the data in a loop.

Returns

A tuple of (data_dict, isdata) for current chunk.
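The read loop described above can be sketched as follows. FakeData is a hypothetical stand-in written only to mimic the documented (data_dict, isdata) contract so that the example runs without listmode. The chunk size and the contents of the dictionary are assumptions.

```python
import numpy as np


class FakeData:
    """Hypothetical stand-in that mimics the get_data_block contract."""

    def __init__(self, num_events, chunk_size):
        self._time_vec = np.arange(num_events, dtype=np.uint64)
        self._chunk_size = chunk_size
        self._pos = 0

    def get_data_block(self, t_slice=None):
        # t_slice is accepted for signature parity but ignored in this sketch.
        start = self._pos
        stop = min(start + self._chunk_size, self._time_vec.shape[0])
        self._pos = stop
        data_dict = {"time_vec": self._time_vec[start:stop]}
        isdata = stop < self._time_vec.shape[0]  # False on the final chunk
        return data_dict, isdata


data = FakeData(num_events=10, chunk_size=4)
chunk_sizes = []
isdata = True
while isdata:
    block, isdata = data.get_data_block()
    # The block returned together with isdata == False is still valid,
    # so it is processed before the loop condition is checked again.
    chunk_sizes.append(block["time_vec"].shape[0])
```

Processing the block before testing isdata is what keeps the final, partial chunk from being dropped.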

get_dead_time(t_slice=None)[source]

Get dead time for the data or a time_slice of data.

Parameters

t_slice – a tuple of start and stop times in nanoseconds. Full dead time is retrieved if this is set to None.

Returns

The dead times in [s] for all channels as a vector of floats.

get_end_time()[source]
load_data(data_path_str, name=None, reset=False)[source]

Loads data preferably from event mode .dat files. If this fails, then channel data is searched for. (Channel data may be saved as intermediary step when doing slow conversion from other data formats.) Otherwise _read_raw_data method is called. Native format has no raw data and will fail.

Parameters
  • data_path_str – Path to data directory. It has to be either a string or a pathlib Path object.

  • name – Optional name, if data file does not share the same base_name as the directory.

  • reset (Bool) – The raw data parsing can be forced with reset=True.

Returns

class listmode.data.EventBuilder(num_ch, coinc_win, latency, extras, event_info, max_datasize=8192)[source]

Bases: object

Painful way of walking through the data and trying to build events by seeking coincidences between channel times.

Ideally works on shortish arrays of data returned by the digitizer, but should manage big savefiles in chunks.

run_batch(data_dict, timing_list)[source]

The time front is a list of the lowest unbuilt indices for each channel. (The t0 is the times, E0 the energies) The channel which has lowest time in the front is put to an event and if other channels in the front have time within the window, then they are included. The front is incremented for all the channels that were included and the iteration is started again.

Parameters
  • data_dict – list of data_dicts for each channel

  • timing_list – list holding timing information for each channel

Returns

data_dict, timing_data
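The time-front walk described for run_batch can be illustrated with a simplified sketch. It is not the EventBuilder implementation: the function name build_events is hypothetical, it takes only one hit per channel into each event (multihits are ignored), and it skips latency correction, energies and timing data.

```python
import numpy as np


def build_events(times, coinc_win):
    """Group hits into events.

    times: list of sorted 1-D integer arrays of hit times, one per channel.
    coinc_win: coincidence window in the same units as the times.
    Returns a list of (event_time, [channels in the event]).
    """
    num_ch = len(times)
    front = [0] * num_ch  # lowest unbuilt index for each channel
    events = []
    while True:
        # Channels that still have unbuilt hits.
        active = [ch for ch in range(num_ch) if front[ch] < len(times[ch])]
        if not active:
            break
        # The channel with the lowest time in the front starts the event.
        first = min(active, key=lambda ch: times[ch][front[ch]])
        t0 = times[first][front[first]]
        # Other channels join if their front time is within the window.
        members = [ch for ch in active if times[ch][front[ch]] - t0 <= coinc_win]
        events.append((int(t0), members))
        # Advance the front for every channel that was included.
        for ch in members:
            front[ch] += 1
    return events


times = [np.array([100, 500]), np.array([105, 900]), np.array([520])]
events = build_events(times, coinc_win=50)
```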

class listmode.data.LatencyProcessor(info)[source]

Bases: listmode.data.ColProcessor

LatencyProcessor is a specialized processor used to visualize the timing properties of the input data. Each output column is equal to the time difference between the event in the main channel and the event in each other channel (so the output of the main channel is always zero), calculated from latency-corrected time data. The smallest possible value is returned if there was no coincidence between the channels. All channels should show zero-centered distributions in a properly tuned detector. The width of the distributions shows how large a coincidence window is needed.
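A minimal sketch of one output row of such a latency calculation is given below. The function name latency_row, the int64 output type, the sign convention (other channel minus main channel) and the use of the int64 minimum as the "smallest possible value" are all assumptions made for illustration.

```python
import numpy as np

# Assumed sentinel for "no coincidence": the smallest int64 value.
NO_COINCIDENCE = np.iinfo(np.int64).min


def latency_row(times, hit_mask, main_ch):
    """Time difference of every channel relative to the main channel.

    times: latency-corrected hit times of one event, one entry per channel.
    hit_mask: boolean array, True where the channel has a hit in the event.
    """
    out = np.full(len(times), NO_COINCIDENCE, dtype=np.int64)
    if hit_mask[main_ch]:
        # Channels without a hit keep the sentinel value.
        out[hit_mask] = times[hit_mask] - times[main_ch]
    return out


times = np.array([1000, 1012, 0], dtype=np.int64)
hit_mask = np.array([True, True, False])
row = latency_row(times, hit_mask, main_ch=0)
```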

process(in_list, out, t_front, ev_count)[source]
Parameters
  • in_list – list of data_dicts, one per channel

  • out – list of initialized output data arrays

  • t_front – current position in data

  • ev_count – number of hits per channel in the event

  • ev_num – number of hits per channel in the event

Returns

class listmode.data.Metadata(parent)[source]

Bases: object

Metadata is responsible for the saving, loading and generation of metadata within data.

Under normal circumstances the metadata is present in a json file, and is loaded by the metadata class. If, however, metadata is missing or needs to be changed the metadata class provides methods for updating, validating and saving the changes.

calculate()[source]

Generates metadata from parent data and members. Calculate should not touch values that are set by the loader.

Returns

property counts
property events
get(key, channel)[source]

Get a metadata item that is not one of the properties. This is simply wrapping the dict indexing.

Parameters
  • key – keyword to get

  • channel – channel

Returns

value

property input_counts
load()[source]

Loads metadata from json files. If incomplete metadata is loaded it is updated from the data.

Returns

None

property name
property notes
property run_id
save()[source]

Save metadata back to json.

Returns

None

set(key, value, channel)[source]

Set a metadata item for one or all channels. For example, some sample-related information can be retrieved from a database and added to metadata after the data is created. This method exists to give easy access to metadata for the loader functions of vendor-specific data. This method should not be used to set the minimal metadata handled by the properties of the Metadata class. ListModeMetadataSetError is raised if this is attempted.

Parameters
  • key – Key to _run_data dict

  • value – A value to set.

  • channel – Channel to modify. If channel is less than 0, then all channels are updated.

Returns

property start
property stop
property total_time
class listmode.data.MultiHitProcessor(info)[source]

Bases: listmode.data.BitProcessor

MultiHitProcessor calculates a bitmask where channels with multiple hits per event are set to 1.
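As a rough sketch of the idea, the per-channel hit counts of an event can be turned into flags as shown below. The function name multihit_flags is hypothetical, and returning one uint8 flag per channel (rather than a packed integer) is an assumption about the output layout.

```python
import numpy as np


def multihit_flags(ev_count):
    """Return 1 for every channel with more than one hit in the event."""
    return (np.asarray(ev_count) > 1).astype(np.uint8)


# Hits per channel in one event: channels 1 and 3 have multiple hits.
flags = multihit_flags([1, 3, 0, 2])
```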

process(in_list, out, t_front, ev_count)[source]
Parameters
  • in_list – list of data_dicts, one per channel

  • out – list of initialized output data arrays

  • t_front – current position in data

  • ev_count – number of hits per channel in the event

  • ev_num – number of hits per channel in the event

Returns

class listmode.data.StreamData(path, data_name, method='event', raw=False, channels=None, extra_name=None)[source]

Bases: object

StreamData is a manager that pushes list mode data to disk as it becomes available. Every kind of data (time + energy for channel, time + energy matrix for events, timing data and extra data) needs to have its own streamer.

Channel mode data is stored as raw binary files, with one file holding time (uint64), one the energy (uint16). Note: there is no reason to save data in channel mode after latency and coincidence window are set.

Event data is stored as raw binary with timestamps (uint64), energy matrix (uint16 x num_ch)

Timing data is a row of timing info (uint32 idx + 2xuint32 x num_ch).

Extra data can be given via the extras dictionary (keys: ‘name’, ‘type’, ‘num_col’). Extras can include pile-up flags (Type x num_ch) or coordinates (Type x N), where N is the number of coordinates.
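The event data layout described above (uint64 timestamps and a uint16 energy matrix with num_ch columns, both as raw binary) can be sketched with plain numpy file I/O. The file names are hypothetical and this does not use StreamData itself. It only illustrates that raw binary files carry no shape information, so the reader has to know num_ch.

```python
import os
import tempfile

import numpy as np

num_ch = 4
times = np.array([10, 25, 40], dtype=np.uint64)
energies = np.arange(3 * num_ch, dtype=np.uint16).reshape(3, num_ch)

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical file names, one file per kind of data.
    t_path = os.path.join(tmp_dir, "example_timestamps.dat")
    e_path = os.path.join(tmp_dir, "example_events.dat")
    with open(t_path, "wb") as t_file, open(e_path, "wb") as e_file:
        times.tofile(t_file)
        energies.tofile(e_file)

    # Raw binary has no header: dtype and column count must be supplied.
    read_times = np.fromfile(t_path, dtype=np.uint64)
    read_energies = np.fromfile(e_path, dtype=np.uint16).reshape(-1, num_ch)
```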

close()[source]
new_files()[source]
class listmode.data.TimeCache(parent)[source]

Bases: object

TimeCache controls timing data and provides the dead/live time of the detector. It also maintains lists of index-time pairs for the insertion times of the timing information, so that quick retrieval of time periods is possible. Each interval holds a variable number of events. Because both indices and timestamps are monotonically increasing, they can both be used to find intervals from the data. The timing data file is saved with the data and should be read only. It contains the index of insertion (int64) plus the dead_time delta of the interval for each channel in float32 type. The first row always points to the first event with zero dead times for all channels. The first real dead time value is stored in the second row.

find(t_slice, ch=None)[source]

Finding indices in self.timing that contain the t_slice time.

Parameters
  • t_slice – tuple of nanosecond values defining a slice in time

  • ch – If specified will return only indices in which dead time has been given for ch. This is mainly used by get_timing to interpolate the dead time.

Returns

indices to self.timing containing the time slice.

get_dead_time(t_slice=None)[source]

Return the dead time of the time slice.

Parameters

t_slice

Returns

All dead times in a numpy array. Live and dead times are float values of seconds.

get_indices(t_slice=None)[source]

Return start and stop event indices (endpoint not inclusive) that contain the time slice fully using timing info as a hash.

Parameters

t_slice – tuple of nanosecond values defining a slice in time. If None full data is returned.

Returns

indices to event data containing the time slice
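Because timestamps are monotonically increasing, a slice in time maps to a half-open index range. The sketch below shows this with a binary search directly on the time vector. It is a simplification: the function name slice_indices is hypothetical, it does not use the timing data as a coarse lookup the way get_indices is described to do, and treating the end of the slice as exclusive is an assumption.

```python
import numpy as np


def slice_indices(time_vec, t_slice=None):
    """Return (start, stop) event indices, stop not inclusive."""
    if t_slice is None:
        return 0, time_vec.shape[0]
    start = int(np.searchsorted(time_vec, t_slice[0], side="left"))
    stop = int(np.searchsorted(time_vec, t_slice[1], side="left"))
    return start, stop


time_vec = np.array([0, 10, 20, 30, 40], dtype=np.uint64)
start, stop = slice_indices(time_vec, (10, 30))
selected = time_vec[start:stop]  # events with 10 <= t < 30
```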

get_live_time(t_slice=None)[source]

Return live time of the slice.

Parameters

t_slice

Returns

All live times in a numpy array. Live and dead times are float values of seconds.

get_timing(t_slice=None)[source]

Return dead time data for a slice. The first entry is zeros and the second one is interpolated to start from t_slice[0]. Last one is an extra row interpolated to t_slice[1].

If t_slice is not defined this method returns timing data as it is.

Parameters

t_slice – Time slice (in ns)

Returns

(interpolated) timing data for time slice

get_total_time(t_slice=None)[source]

Return total time of the time slice or total time of measurement, if t_slice is None.

Parameters

t_slice

Returns

Total time value in nanoseconds

set(timing)[source]
Parameters

timing – an opened np.memmap instance containing the timing data (retrieved by read_binary_data)

listmode.data.data_info(info, ch_list)[source]

Fills data_info dict with defaults for parts that are missing. Hardcoded settings for energy, multihit and latency data will be overwritten if defined in config. A warning is printed if setup is overwritten.

Parameters
  • info – info dict

  • ch_list – info dict

Returns

dict with missing keys filled with defaults.

listmode.data.fill_default_data(cfg)[source]

Will generate reasonable defaults for parameters omitted for ‘events’ and ‘extras’ data_info dictionaries. It will overwrite incompatible parameters. Does not work yet.

Parameters

cfg – Configuration of the detector.

Returns

data_info dictionary

listmode.data.generate_timing(chfile, pulse_dead_time, t_vec)[source]

Utility function to generate timing vector if it does not exist. Takes pathlib type filename, pulse dead time for the channel and t_vec.

Returns nothing, just writes the data.

listmode.data.ipoly2(y, *p)[source]

Estimates the inverse of 2nd degree polynomial above by dropping the 2nd degree term: returns ~x given y. Here the larger root is always returned.

Parameters
  • y – An energy value or a numpy list of energy values.

  • p – Calibration coefficients, starting from 0th degree coefficient.

Returns

Channel values

listmode.data.kill_combinator(in_dict, idx, ev_count, name)[source]

Event is set to zero.

Parameters
  • in_dict – A dictionary including all data of the channel.

  • idx – Index of the first hit in the event

  • ev_count – Number of hits in the event

  • name – Name of the data

Returns

A single value for the hit

listmode.data.load_calibration(config)[source]

Loads calibration for the detector. Calibration gives the 2nd degree function coefficients for calibration for each channel and for each data type. The data is organized as a dictionary with data types as keys and each data as numpy arrays with channel in first axis and three coefficients (a, b and c) in second axis.

Missing data is fixed with dummy calibration ([0,1,0] coefficients), but incompatible data (e.g. wrong number of channels) will raise an exception.
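The fallback behaviour described above can be sketched as follows. The function name fill_dummy_calibration and the use of ValueError are assumptions. The source only states that missing data gets [0,1,0] coefficients and that incompatible data raises an exception.

```python
import numpy as np


def fill_dummy_calibration(cal, names, num_ch):
    """Return a copy of cal with a (num_ch, 3) array for every data name."""
    fixed = dict(cal)
    for name in names:
        if name not in fixed:
            # Dummy calibration: a=0, b=1, c=0 leaves values unchanged.
            fixed[name] = np.tile([0.0, 1.0, 0.0], (num_ch, 1))
        elif np.asarray(fixed[name]).shape != (num_ch, 3):
            raise ValueError("incompatible calibration for " + name)
    return fixed


cal = {"energy": np.array([[0.5, 2.0, 0.0], [0.1, 1.9, 0.0]])}
fixed = fill_dummy_calibration(cal, ["energy", "coord"], num_ch=2)
```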

Old calibration data had keys for peaks used for calibration, but they have been dropped.

Parameters

config – The detector config object (obviously missing the calibration info)

Returns

The calibration dictionary read from disk. Missing data is fixed with dummy calibration, but incompatible data raises an exception.

listmode.data.load_config(det_name, local_cfg, data_name=None)[source]

Detector configuration object is a namespace with:

  • Paths into configuration directories and optionally to data.

  • Contents of the detector configuration file.

  • Calibration for the detector. Calibration gives the 2nd degree function coefficients for calibration for each channel and for each data type. The data is organized as a dictionary with data types as keys and each data as numpy arrays with channel in first axis and three coefficients (a, b and c) in second axis. Omitted calibration data is replaced with [0,1,0] coefficients.

Parameters
  • det_name – Name of the detector configuration file without the _cfg.json

  • local_cfg – Paths needed to find configurations and data

  • data_name – Optional path to data, that will be added as “home” into config.path

Returns

detector configuration object

listmode.data.max_combinator(in_dict, idx, ev_count, name)[source]

Returns the hit that has highest value.

Parameters
  • in_dict – A dictionary including all data of the channel.

  • idx – Index of the first hit in the event

  • ev_count – Number of hits in the event

  • name – Name of the data

Returns

A single value for the hit

listmode.data.max_e_combinator(in_dict, idx, ev_count, name)[source]

Returns the hit that has highest energy value.

Parameters
  • in_dict – A dictionary including all data of the channel.

  • idx – Index of the first hit in the event

  • ev_count – Number of hits in the event

  • name – Name of the data

Returns

A single value for the hit

listmode.data.mean_combinator(in_dict, idx, ev_count, name)[source]

Returns the mean of all hits in the event.

Parameters
  • in_dict – A dictionary including all data of the channel.

  • idx – Index of the first hit in the event

  • ev_count – Number of hits in the event

  • name – Name of the data

Returns

A single value for the hit

listmode.data.min_combinator(in_dict, idx, ev_count, name)[source]

Returns the hit that has smallest value.

Parameters
  • in_dict – A dictionary including all data of the channel.

  • idx – Index of the first hit in the event

  • ev_count – Number of hits in the event

  • name – Name of the data

Returns

A single value for the hit

listmode.data.poly2(x, *p)[source]

Model function for 2nd degree polynomial fit for energy calibration.

Parameters
  • x – A channel value or a numpy list of channel values.

  • p – Calibration coefficients, starting from 0th degree coefficient.

Returns

Calibrated x.
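A minimal sketch of such a model function, assuming the coefficients are applied as p[0] + p[1]*x + p[2]*x**2 (consistent with "starting from 0th degree coefficient" and with the [0,1,0] dummy calibration leaving values unchanged). The name poly2_sketch is used to make clear this is not the library function itself.

```python
import numpy as np


def poly2_sketch(x, *p):
    """Second degree polynomial, coefficients from 0th degree upwards."""
    return p[0] + p[1] * x + p[2] * x ** 2


channels = np.array([0.0, 100.0, 200.0])

# Dummy calibration [0, 1, 0] returns the channel values unchanged.
identity = poly2_sketch(channels, 0.0, 1.0, 0.0)

# Example coefficients chosen only for illustration.
calibrated = poly2_sketch(channels, 1.0, 0.5, 0.001)
```

The (x, *p) signature matches what curve fitting routines such as scipy.optimize.curve_fit expect from a model function when an initial guess is supplied, which is presumably why the coefficients are passed as separate arguments.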

listmode.data.read_binary_data(data_path, base_name, cfg, mode='event')[source]
Parameters
  • data_path – Path to the data directory

  • base_name – Base name of the data

  • cfg – The detector config dictionary

  • mode – What mode of data to read: ‘event’ or ‘channel’.

Returns

The detector configuration is needed for defining the extras:

List of dicts defining extra data files, type and number of columns. extras = {“name”:’x’, “type”:’t’, “num_col”:’n’}, where type is a numpy type string of the data. Several extras can be defined in det_cfg (coord, ch_flags). These are handled automatically if they are present.

Some extras, such as coord, need to have additional definitions in the config. For coord, it is the ‘coordinates’ list, which defines the number of coordinates, the channels in which the data is found and the order of the coordinates in i, j notation.

listmode.data.strip_cal(data_mat, coord, strip_cal, coord_ch)[source]

Calculates strip calibration for coordinate data.

Parameters
  • data_mat – data

  • coord – coordinates

  • strip_cal – calibration matrix

  • coord_ch – order of coordinate channels

Returns

listmode.data.sum_combinator(in_dict, idx, ev_count, name)[source]

Returns the sum of all hits in the event.

Parameters
  • in_dict – A dictionary including all data of the channel.

  • idx – Index of the first hit in the event

  • ev_count – Number of hits in the event

  • name – Name of the data

Returns

A single value for the hit
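The combinators above share one signature, so their behaviour can be sketched side by side. This is an illustration under an assumption, not the library code: it assumes the hits of an event are stored consecutively in in_dict[name], starting at idx and spanning ev_count entries. The function names with the _sketch suffix are hypothetical.

```python
import numpy as np


def _hits(in_dict, idx, ev_count, name):
    # Assumption: the hits of one event are consecutive entries.
    return in_dict[name][idx:idx + ev_count]


def sum_sketch(in_dict, idx, ev_count, name):
    return _hits(in_dict, idx, ev_count, name).sum()


def mean_sketch(in_dict, idx, ev_count, name):
    return _hits(in_dict, idx, ev_count, name).mean()


def max_sketch(in_dict, idx, ev_count, name):
    return _hits(in_dict, idx, ev_count, name).max()


def min_sketch(in_dict, idx, ev_count, name):
    return _hits(in_dict, idx, ev_count, name).min()


# One channel with four hits; the event covers the first three.
in_dict = {"energy": np.array([5, 7, 9, 100], dtype=np.uint16)}
args = (in_dict, 0, 3, "energy")
results = {
    "sum": sum_sketch(*args),
    "mean": mean_sketch(*args),
    "max": max_sketch(*args),
    "min": min_sketch(*args),
}
```

Which of these is applied to a multihit event is selected by the ‘multi’ keyword of the info dictionary, as described for the Data class.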