primrose.models package¶
Submodules¶
primrose.models.minimal_search_engine module¶
concrete class that performs TFIDF on lemmatized tokens with optional ngrams
- Author(s):
Carl Anderson (carl.anderson@weightwatchers.com)
-
class
primrose.models.minimal_search_engine.
MinimalSearchEngine
(configuration, instance_name)¶ Bases:
primrose.base.search_engine.AbstractSearchEngine
simple TFIDF search engine
-
tokenize
(s, stopwords=[], add_ngrams=True)¶ - tokenize a string document, optimized for recipe names given default stopwords and other
string cleanup operations
- Parameters
s (str) – some document string
stopwords (list) – list of stopwords
add_ngrams (bool) – whether to include ngrams on tokens
- Returns
tokens (list): list of cleaned, standardized tokens from document
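As a rough illustration of the kind of processing tokenize describes, the sketch below lowercases a string, strips punctuation, drops stopwords, and optionally appends bigrams. It is not primrose's implementation: the helper name simple_tokenize, the regex cleanup, and the choice of space-joined bigrams are assumptions, and lemmatization is left out to keep the example dependency-free.

```python
import re


def simple_tokenize(s, stopwords=None, add_ngrams=True):
    """Hypothetical stand-in for a tokenize step (not primrose's code)."""
    stopwords = set(stopwords or [])
    # Lowercase and keep only runs of letters/digits as tokens.
    words = re.findall(r"[a-z0-9]+", s.lower())
    tokens = [w for w in words if w not in stopwords]
    if add_ngrams:
        # Append bigrams built from adjacent remaining tokens.
        bigrams = [f"{a} {b}" for a, b in zip(tokens, tokens[1:])]
        tokens = tokens + bigrams
    return tokens


print(simple_tokenize("Grilled Chicken with Rice", stopwords=["with"]))
# ['grilled', 'chicken', 'rice', 'grilled chicken', 'chicken rice']
```

The sketch defaults stopwords to None rather than a list literal, which avoids sharing one mutable default across calls.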
-
primrose.models.sklearn_classifier_model module¶
Module to run a basic decision tree model
- Author(s):
Carl Anderson (carl.anderson@weightwatchers.com), Mike Skarlinski (michael.skarlinski@weightwatchers.com)
-
class
primrose.models.sklearn_classifier_model.
SklearnClassifierModel
(configuration, instance_name)¶ Bases:
primrose.models.sklearn_model.SklearnModel
-
eval_model
(data_object)¶ Evaluate model performance on a labeled testing dataset
- Returns
instance of DataObject
- Return type
data_object (DataObject)
-
static
necessary_config
(node_config)¶ Return the set of necessary configuration keys
- Parameters
node_config (dict) – set of parameters / attributes for the node
Notes
model_parameters (dict): parameters that mirror the sklearn kwargs for the user’s model
mode: train, eval or predict (see AbstractModel)
sklearn_classifier_name: sklearn submodule and model name (submodule.model_name) of the user’s model
grid_search_scoring: scoring function name from sklearn CV docs
cv_folds: number of CV folds
- Returns
set of required keys
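A node configuration carrying the keys listed in the Notes might look like the sketch below. The key names come from the Notes; the values, the REQUIRED_KEYS constant, and the missing_keys helper are illustrative assumptions, not part of primrose.

```python
# Key names are taken from the Notes above; every value is an illustrative guess.
node_config = {
    "mode": "train",
    "sklearn_classifier_name": "tree.DecisionTreeClassifier",
    "model_parameters": {"max_depth": 3},
    "grid_search_scoring": "accuracy",
    "cv_folds": 3,
}

# Assumption: all five keys named in the Notes are required.
REQUIRED_KEYS = {
    "mode",
    "sklearn_classifier_name",
    "model_parameters",
    "grid_search_scoring",
    "cv_folds",
}


def missing_keys(config, required=REQUIRED_KEYS):
    """Hypothetical check: return the required keys absent from config."""
    return set(required) - set(config)


print(missing_keys(node_config))  # set()
```

Returning a set, as necessary_config is documented to do, makes this kind of check a single set difference.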
-
predict
(data_object)¶ Make distance-based predictions using the prebuilt matrix
- Parameters
data_object – DataObject instance
load_model – load model object from gcs or not
- Returns
data_object with prediction data added
-
train_model
(data_object)¶ train the model using CV, according to user specified options
- Parameters
data_object (DataObject) – instance of DataObject
- Returns
Nothing
-
primrose.models.sklearn_cluster_model module¶
Module to run a basic clustering model
- Author(s):
Carl Anderson (carl.anderson@weightwatchers.com)
-
class
primrose.models.sklearn_cluster_model.
SklearnClusterModel
(configuration, instance_name)¶ Bases:
primrose.models.sklearn_model.SklearnModel
-
fit_training_data
()¶ fit training data to model
-
get_scores
()¶ get the scores for X_test
- Returns
dictionary of scores
-
static
necessary_config
(node_config)¶ Return the set of necessary configuration keys
Note
X (list): list of columns to use
While you might expect model here, it is needed only in train mode; in predict or eval mode the model is cached.
- Returns
set of required keys
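The note above implies that which keys matter depends on the mode. One way to picture that is the hypothetical helper below. It is a sketch under assumptions, not primrose's necessary_config: the function name keys_needed and the exact key names "mode" and "model" are guesses based on the notes in this package.

```python
def keys_needed(node_config):
    """Hypothetical: keys a cluster node would need for the given mode."""
    keys = {"mode", "X"}
    # Assumption: a model spec is only needed when training,
    # because predict and eval reuse the cached model.
    if node_config.get("mode") == "train":
        keys.add("model")
    return keys


print(sorted(keys_needed({"mode": "train"})))    # ['X', 'mode', 'model']
print(sorted(keys_needed({"mode": "predict"})))  # ['X', 'mode']
```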
-
primrose.models.sklearn_model module¶
A primrose model based around a sklearn model
- Author(s):
Carl Anderson (carl.anderson@weightwatchers.com)
-
class
primrose.models.sklearn_model.
SklearnModel
(configuration, instance_name)¶ Bases:
primrose.base.model.AbstractModel
-
eval_model
(data_object, load_model=False)¶ evaluate a model by getting the scores
- Returns
instance of DataObject
- Return type
data_object (DataObject)
-
static
evaluate_no_ground_truth_classifier_metrics
(X, labels)¶ Compute a set of metrics for a classifier where there is no ground truth
- Parameters
X (dataframe) – the data
labels – the predicted classes
- Returns
dictionary of score name: value
-
static
evaluate_regression_metrics
(actual, predictions)¶ compute set of metrics for a regression
- Parameters
actual – vector of actual values
predictions – vector of predictions
- Returns
dictionary of score name: value
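To make the "dictionary of score name to value" shape concrete, here is a minimal pure-Python sketch. Which metrics primrose actually computes is not stated here, so the choice of MAE, MSE, and R², the key names, and the helper name regression_metrics are all assumptions.

```python
def regression_metrics(actual, predictions):
    """Hypothetical helper: return a dict mapping score name to value."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predictions)]
    ss_res = sum(e * e for e in errors)  # residual sum of squares
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return {
        "mae": sum(abs(e) for e in errors) / n,
        "mse": ss_res / n,
        # R^2 is undefined when the actual values are constant.
        "r2": 1 - ss_res / ss_tot if ss_tot else float("nan"),
    }


scores = regression_metrics([1.0, 2.0, 3.0], [1.0, 2.0, 4.0])
print(scores["r2"])  # 0.5
```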
-
load_model
(data_object)¶ finds an upstream sklearn model
- Parameters
data_object (DataObject) – instance of DataObject
- Returns
Sklearn model
-
predict
(data_object, load_model=False, use_serial=False)¶ Predict y_test from X_test
- Parameters
data_object (DataObject) – instance of DataObject
load_model – load model object from gcs or not
- Returns
instance of DataObject
- Return type
data_object (DataObject)
-
train_model
(data_object)¶ train the model
- Parameters
data_object (DataObject) – instance of DataObject
- Returns
instance of DataObject
- Return type
data_object (DataObject)
-
primrose.models.sklearn_regression_model module¶
Module to run a basic regression model
- Author(s):
Carl Anderson (carl.anderson@weightwatchers.com)
-
class
primrose.models.sklearn_regression_model.
SklearnRegressionModel
(configuration, instance_name)¶ Bases:
primrose.models.sklearn_model.SklearnModel
-
fit_training_data
()¶ fit training data to model
-
get_scores
()¶ get the scores for y_test
- Returns
dictionary of scores
-
static
necessary_config
(node_config)¶ Return the set of necessary configuration keys
Note
While you might expect model here, it is needed only in train mode; in predict or eval mode the model is cached.
- Returns
set of required keys
-