primrose.configuration package¶
Submodules¶
primrose.configuration.configuration module¶
Module to implement a Configuration parser which enhances parsing functionality of configparser
- Author(s):
Michael Skarlinski (michael.skarlinski@weightwatchers.com)
Carl Anderson (carl.anderson@weightwatchers.com)
-
class
primrose.configuration.configuration.
Configuration
(config_location, is_dict_config=False, dict_config=None)¶ Bases:
object
Stores user-defined configuration for a primrose job
-
check_config
()¶ check the configuration as much as we can as early as we can
- Raises
various exceptions if any checks fail –
-
check_metadata
()¶ checks some dependencies among metadata keys
- Raises
ConfigurationError if issues found –
-
check_sections
()¶ Check that all the sections in implementation are supported ones. Either the user supplied metadata.section_registry, or they are using default sections
- Raises
ConfigurationError if declaring metadata.section_registry and sections from implementation were not found in metadata –
or vice versa, or if using default operations but sections found that were not supported –
-
config_for_instance
(instance_name)¶ get the configuration for a given node / instance_name
- Returns
JSON chunk for this instance
-
dict_raise_on_duplicates
(ordered_pairs)¶ Reject duplicate keys in JSON string, i.e. sections and node names.
- Parameters
ordered_pairs (list) – list of key:values from the config. Examples:
ordered_pairs [('class', 'CsvReader'), ('filename', 'data/tennis.csv'), ('destinations', ['write_output'])]
ordered_pairs [('read_data', {'class': 'CsvReader', 'filename': 'data/tennis.csv', 'destinations': ['write_output']})]
- Returns
dictionary of key (node type) and value (node name)
- Return type
dictionary (dict)
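The name and the ordered_pairs argument suggest a hook for json.loads, which passes each decoded JSON object to object_pairs_hook as a list of (key, value) pairs. The following is a minimal sketch of that standard-library technique, not the primrose implementation: the function name raise_on_duplicates is hypothetical, and ValueError stands in for whatever exception the library raises.

```python
import json


def raise_on_duplicates(ordered_pairs):
    """Build a dict from (key, value) pairs, rejecting repeated keys.

    Hypothetical helper for illustration; raises ValueError as a stand-in.
    """
    result = {}
    for key, value in ordered_pairs:
        if key in result:
            raise ValueError(f"duplicate key: {key!r}")
        result[key] = value
    return result


# A config with unique keys parses as usual.
config_text = '{"read_data": {"class": "CsvReader", "destinations": ["write_output"]}}'
parsed = json.loads(config_text, object_pairs_hook=raise_on_duplicates)

# A config that repeats a node name is rejected instead of silently
# keeping the last occurrence, which is what plain json.loads would do.
duplicate_text = '{"read_data": {}, "read_data": {}}'
try:
    json.loads(duplicate_text, object_pairs_hook=raise_on_duplicates)
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True
```

The hook is called for nested objects first, so a duplicate inside a node's own settings is caught as well as a duplicate node name.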
-
static
perform_any_config_fragment_substitution
(config_str)¶ Given the content of a configuration file as a string, look for substitutions of the form $$FILE=path/to/config/file/fragment.json$$ and replace each one with the content of the named file. For example:
{
  $$FILE=/tmp/metadata.json$$
  "implementation_config": {
    $$FILE= config/read_write_fragment.json $$
  }
}
will inject the content of /tmp/metadata.json into the 2nd line of that config.
- Parameters
config_str (str) – content of some configuration file that may or may not contain substitution variables
- Returns
the post-substituted configuration string
- Return type
config_str (str)
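One way to picture the $$FILE=...$$ replacement is as a regular-expression substitution whose replacement text is read from disk. This is a sketch under assumptions rather than the library's code: FRAGMENT_PATTERN and substitute_fragments are hypothetical names, and the pattern simply tolerates the optional spaces around the path seen in the example above.

```python
import os
import re
import tempfile

# Matches $$FILE=some/path$$, tolerating spaces around the path (assumed syntax).
FRAGMENT_PATTERN = re.compile(r"\$\$FILE=\s*(.+?)\s*\$\$")


def substitute_fragments(config_str):
    """Replace every $$FILE=path$$ marker with the content of that file."""

    def read_fragment(match):
        with open(match.group(1)) as handle:
            return handle.read()

    # With a function as the replacement, the returned text is inserted literally.
    return FRAGMENT_PATTERN.sub(read_fragment, config_str)


# Usage: write a fragment to a temporary file and splice it into a template.
with tempfile.TemporaryDirectory() as tmp_dir:
    fragment_path = os.path.join(tmp_dir, "metadata.json")
    with open(fragment_path, "w") as handle:
        handle.write('"metadata": {},')
    template = "{ $$FILE=" + fragment_path + '$$ "implementation_config": {} }'
    substituted = substitute_fragments(template)
```

Because the substitution happens on the raw string, the fragment does not need to be valid JSON on its own; only the final, post-substitution string does.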
-
sections_in_order
()¶ Return list of section names in order, either explicitly from metadata or from default Enum order
Note
If there is a non-empty section_run list in metadata, return that; else if there is a non-empty section_registry in metadata, return that; otherwise return the sections present from the default OperationType enum.
We need this method because the config sections are a dictionary not a list so we can’t guarantee order of keys. This method imposes an expected order.
- Returns
tuple containing:
section names (list): list of sections
source (str): where did the list come from? section_run, section_registry, or default?
- Return type
(tuple)
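The precedence described in the note can be sketched as a small function. This is an illustration only: the free-function form, the argument names, and the default_order list passed in the usage lines are assumptions, since the actual default Enum order is not stated here.

```python
def sections_in_order(metadata, implementation_sections, default_order):
    """Return (section names, source) following the documented precedence.

    metadata: dict that may hold 'section_run' and/or 'section_registry'
    implementation_sections: section names present in the implementation config
    default_order: assumed default ordering of section names (hypothetical input)
    """
    section_run = metadata.get("section_run")
    if section_run:
        return list(section_run), "section_run"

    section_registry = metadata.get("section_registry")
    if section_registry:
        return list(section_registry), "section_registry"

    # Fall back to the default order, keeping only sections actually present.
    present = [name for name in default_order if name in implementation_sections]
    return present, "default"


# Hypothetical default order, for illustration only.
assumed_default = ["reader_config", "pipeline_config", "writer_config"]
implementation = {"writer_config": {}, "reader_config": {}}

from_default = sections_in_order({}, implementation, assumed_default)
from_registry = sections_in_order(
    {"section_registry": ["phase1", "phase2"]}, implementation, assumed_default
)
from_run = sections_in_order(
    {"section_registry": ["phase1", "phase2"], "section_run": ["phase2"]},
    implementation,
    assumed_default,
)
```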
-
primrose.configuration.configuration_dag module¶
A class that creates a directed acyclic graph (DAG) and performs a number of checks, such as detecting cycles, orphans, and unrecognized nodes
- Author(s):
Carl Anderson (carl.anderson@weightwatchers.com)
-
class
primrose.configuration.configuration_dag.
ConfigurationDag
(config)¶ Bases:
object
-
static
add_edge
(G, G2, node_names, key, destination)¶ add an edge to the DAG
- Parameters
G (networkx bidirectional graph) – bidirectional graph instance
G2 (networkx directional graph) – directional graph instance
key (str) – starting node name
destination (str) – destination node name
- Returns
nothing. Side effect is to add the edge
-
check_connected_components
()¶ Count the number of connected components. More than one is a problem
- Raises
ConfigurationError if multiple connected components –
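To illustrate what counting connected components means for a configuration graph, here is a standard-library sketch that ignores edge direction and counts groups of mutually reachable nodes. The parameters above indicate the class works with networkx graphs; this breadth-first version is a swapped-in illustration, and count_connected_components is a hypothetical name.

```python
from collections import deque


def count_connected_components(nodes, edges):
    """Count components of the graph, treating every edge as undirected."""
    neighbours = {node: set() for node in nodes}
    for start, end in edges:
        neighbours[start].add(end)
        neighbours[end].add(start)

    seen = set()
    components = 0
    for node in nodes:
        if node in seen:
            continue
        # Each unseen node starts a new component; flood-fill from it.
        components += 1
        seen.add(node)
        queue = deque([node])
        while queue:
            current = queue.popleft()
            for other in neighbours[current]:
                if other not in seen:
                    seen.add(other)
                    queue.append(other)
    return components


connected = count_connected_components(
    ["read_data", "write_output"], [("read_data", "write_output")]
)
with_orphan = count_connected_components(
    ["read_data", "write_output", "orphan"], [("read_data", "write_output")]
)
```

A result greater than one means some node, such as the orphan above, is not wired into the rest of the job.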
-
check_dag
()¶ check that it is a DAG
Note
Checks for cycles; checks that there is only 1 connected component (no orphans); checks that all edges point to known nodes
- Raises
Exceptions if cycles found or multiple connected components –
-
check_for_cycles
()¶ check for cycles
- Raises
ConfigurationError if cycles found
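As a rough illustration of cycle detection on a mapping of node names to their destinations, the sketch below uses graphlib from the standard library (Python 3.9 or later) instead of networkx. The function name has_cycle and the destinations mapping are assumptions made for the example.

```python
from graphlib import CycleError, TopologicalSorter


def has_cycle(destinations):
    """Return True if the node -> destinations mapping contains a cycle.

    TopologicalSorter expects node -> predecessors, so passing destinations
    reverses every edge. A reversed graph has a cycle exactly when the
    original does, so the answer is unaffected.
    """
    sorter = TopologicalSorter(destinations)
    try:
        sorter.prepare()  # raises CycleError if no topological order exists
    except CycleError:
        return True
    return False


acyclic = has_cycle({"read_data": ["clean"], "clean": ["write_output"], "write_output": []})
cyclic = has_cycle({"read_data": ["clean"], "clean": ["read_data"]})
```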
-
static
check_node_exists
(node_names, key)¶ check that some specified destination is a node on the graph
- Parameters
node_names (list) – list of node names
key (str) – name of node to check
- Raises
-
create_dag
()¶ Create the DAG
- Returns
nothing. Side effect is to set up graphs and node map
-
descendents
(source)¶ Get the list of descendents from source, i.e. subgraph below source
- Parameters
source (str) – name of source
- Returns
list of descendents of source node
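The idea of "the subgraph below source" can be sketched with a plain traversal over a node -> destinations mapping. This is an illustrative stand-in written without networkx; the free function, its name, and the mapping format are assumptions.

```python
def descendants(destinations, source):
    """Return every node reachable from source, excluding source itself."""
    found = []
    seen = set()
    stack = list(destinations.get(source, []))
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        found.append(node)
        # Continue the walk through this node's own destinations.
        stack.extend(destinations.get(node, []))
    return found


graph = {
    "read_data": ["clean"],
    "clean": ["write_output", "plot"],
    "write_output": [],
    "plot": [],
}
below_clean = sorted(descendants(graph, "clean"))
below_reader = sorted(descendants(graph, "read_data"))
```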
-
nodes_of_type
(operation_type)¶ get set of nodes of a given operation type (OperationType.reader, OperationType.writer etc)
- Parameters
operation_type (OperationType) – OperationType
- Returns
set of keys, if any, of the given operation type
-
paths
(source, target)¶ return the paths, if any, from a given source node to a given target node
- Parameters
source (str) – name of node which is starting point of path
target (str) – name of node which is end point of path
- Returns
list of list of nodes (in order) forming the paths, or None if no path
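A minimal sketch of enumerating all paths between two nodes, returning None when there is none as described above. It uses a depth-first walk over a hypothetical node -> destinations mapping; all_paths is an illustrative name, not the library's method.

```python
def all_paths(destinations, source, target):
    """Return every simple path from source to target, or None if none exist."""
    found = []

    def walk(node, trail):
        if node == target:
            found.append(trail)
            return
        for nxt in destinations.get(node, []):
            if nxt not in trail:  # never revisit a node within one path
                walk(nxt, trail + [nxt])

    walk(source, [source])
    return found or None


# A diamond: two routes lead from the reader to the writer.
diamond = {
    "read_data": ["clean_a", "clean_b"],
    "clean_a": ["write_output"],
    "clean_b": ["write_output"],
    "write_output": [],
}
forward = all_paths(diamond, "read_data", "write_output")
backward = all_paths(diamond, "write_output", "read_data")
```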
-
plot_dag
(filename, traverser, node_size=500, label_font_size=12, text_angle=0, image_width=16, image_height=12)¶ plot the DAG to image file
- Parameters
filename (str) – path to write image to
title (str) – title to add to chart
node_size (int) – node size
label_font_size (int) – font size
text_angle (int) – angle to rotate, in degrees counterclockwise from east
image_width (int) – width of image in inches
image_height (int) – height of image in inches
- Returns
nothing. Saves image to file
-
starting_nodes
()¶ Where does the DAG start? Compute list of starting (level 0) nodes
- Returns
list of node names
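Starting (level 0) nodes can be thought of as nodes that no other node points to. A minimal sketch under that assumption, again using a hypothetical node -> destinations mapping rather than the class's networkx graphs:

```python
def starting_nodes(destinations):
    """Return nodes that are never listed as a destination of another node."""
    targets = {dest for dests in destinations.values() for dest in dests}
    return [node for node in destinations if node not in targets]


pipeline = {
    "read_a": ["merge"],
    "read_b": ["merge"],
    "merge": ["write_output"],
    "write_output": [],
}
starts = starting_nodes(pipeline)
```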
-
upstream_keys
(instance_name)¶ get list of keys (names of nodes in the DAG) that feed into instance_name node
- Parameters
instance_name (str) – name of instance
- Returns
list of nodes
-
upstream_nodes_of_type
(target_node_name, operation_type)¶ get set of nodes of a given operation type (OperationType.reader, OperationType.writer etc.) upstream of some given target node
- Parameters
operation_type (OperationType) – OperationType
- Returns
set of keys, if any, of the given operation type
-
upstream_typed_keys
(instance_name)¶ get dictionary of the upstream keys with Operation types as values
- Parameters
instance_name (str) – name of instance
- Returns
dictionary of {name: node type}
-
primrose.configuration.util module¶
set of utility methods and enum for configurations
- Author(s):
Carl Anderson (carl.anderson@weightwatchers.com)
-
exception
primrose.configuration.util.
ConfigurationError
¶ Bases:
Exception
named error specifically for configuration errors
-
class
primrose.configuration.util.
ConfigurationSectionType
¶ Bases:
enum.Enum
set of top-level sections in config
-
IMPLEMENTATION_CONFIG
= 'implementation_config'¶
-
METADATA
= 'metadata'¶
-
values
= <function ConfigurationSectionType.values>¶
-
-
class
primrose.configuration.util.
OperationType
¶ Bases:
enum.Enum
set of operation type identifiers
-
cleanup
= 'cleanup_config'¶
-
dataviz
= 'dataviz_config'¶
-
model
= 'model_config'¶
-
names
= <function OperationType.names>¶
-
pipeline
= 'pipeline_config'¶
-
postprocess
= 'postprocess_config'¶
-
reader
= 'reader_config'¶
-
values
= <function OperationType.values>¶
-
values_to_names
= <function OperationType.values_to_names>¶
-
writer
= 'writer_config'¶
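The listing shows helper functions such as values and names defined on the enum itself. The sketch below shows how an Enum can carry such helpers without them becoming members. It is an assumption-laden illustration: SectionKind is a hypothetical class, and the return types of the real helpers are not documented here.

```python
from enum import Enum


class SectionKind(Enum):
    """Hypothetical enum mirroring two of the documented members."""

    reader = "reader_config"
    writer = "writer_config"

    @staticmethod
    def values():
        # Static methods are descriptors, so Enum does not treat them as members.
        return [member.value for member in SectionKind]

    @staticmethod
    def names():
        return [member.name for member in SectionKind]


all_values = SectionKind.values()
all_names = SectionKind.names()
looked_up = SectionKind("reader_config")  # look a member up by its value
```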
-