Models
Machine learning models
-
jamie.models.get_model(n)
Return the model object corresponding to the named parameter.
- Parameters
n (str) – The name of the model
- Returns
Model object
-
jamie.models.nested_cross_validation(models, X, y, scoring_value, snapshot, oversampling=False, nbr_folds=5, random_state=100)
Perform nested cross validation and return the best model. The set of models is defined in jamie.models. This function is generally not invoked directly; it is called through train().
- Parameters
models (List[str]) – List of models to use, specify None for all models
X (numpy.ndarray) – Feature matrix. This can be obtained by calling fit_transform() on a Features object
y (numpy.ndarray) – Binary labels, should have the same number of rows as X
scoring_value (score) – Which score type to use, same as in GridSearchCV
oversampling (bool) – Whether to perform oversampling to balance the dataset (default: False)
nbr_folds (int) – Number of folds for cross validation (default: 5)
random_state (int) – Seed to initialise the random state (default: 100)
snapshot (str) – Snapshot within which this is being run, used only for logging.
- Returns
best_params (dict) – Best parameters for the final model
final_model (model) – Final model
score_for_outer_cv (pd.DataFrame) – Scores for outer cross validation for the various models
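The general shape of nested cross validation can be sketched with scikit-learn as follows. This is a minimal illustration, not jamie's implementation: the helper name `nested_cv_sketch`, the `candidates` dictionary and the toy dataset are all hypothetical, and the way jamie selects the final model may differ.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score
from sklearn.tree import DecisionTreeClassifier


def nested_cv_sketch(candidates, X, y, scoring="f1", nbr_folds=5, random_state=100):
    """Hypothetical helper: return (best_name, best_params, final_model, outer_scores)."""
    # Inner folds tune hyperparameters; outer folds estimate generalisation.
    inner = StratifiedKFold(n_splits=nbr_folds, shuffle=True, random_state=random_state)
    outer = StratifiedKFold(n_splits=nbr_folds, shuffle=True, random_state=random_state + 1)

    outer_scores = {}
    for name, (estimator, grid) in candidates.items():
        search = GridSearchCV(estimator, grid, scoring=scoring, cv=inner)
        # Each outer fold reruns the full grid search on its training part only.
        outer_scores[name] = cross_val_score(search, X, y, scoring=scoring, cv=outer)

    # Pick the model family with the best mean outer score, then refit on all data.
    best_name = max(outer_scores, key=lambda k: outer_scores[k].mean())
    estimator, grid = candidates[best_name]
    final_search = GridSearchCV(estimator, grid, scoring=scoring, cv=inner).fit(X, y)
    return best_name, final_search.best_params_, final_search.best_estimator_, outer_scores


# Toy binary classification data, for illustration only.
X, y = make_classification(n_samples=200, n_features=8, random_state=0)
candidates = {
    "logreg": (LogisticRegression(max_iter=1000), {"C": [0.1, 1.0, 10.0]}),
    "tree": (DecisionTreeClassifier(random_state=0), {"max_depth": [2, 4]}),
}
best_name, best_params, final_model, outer_scores = nested_cv_sketch(
    candidates, X, y, nbr_folds=3
)
```

The outer scores play the role of score_for_outer_cv above: they are computed on data that the inner grid search never saw, so they are less optimistic than the inner search's own best score.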
-
jamie.models.parse_model_description(model_description, models=None, random_state=100)
Parse a model description. This function expands configuration values such as hyperparameter ranges from a string description to Python objects. The following interpositions are supported for parameter types:
=<start>:<stop>[:<step>] becomes range(start, stop, step)
=e<start>:<stop>[:<step>] becomes np.logspace(start, stop, num)
- Parameters
- Returns
dict – Model description with parameters interposed using the above substitutions
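The two substitutions can be illustrated with a small stand-alone function. This is a sketch under assumptions, not the library's parser: `expand_value` is a hypothetical name, the third field of the `=e` form is treated as the `num` argument of `np.logspace`, and values that do not start with `=` are assumed to pass through unchanged.

```python
import numpy as np


def expand_value(value):
    """Hypothetical helper: expand one parameter value from its string form."""
    # Anything that is not a string starting with "=" is left as it is.
    if not (isinstance(value, str) and value.startswith("=")):
        return value
    body = value[1:]
    if body.startswith("e"):
        # "=e<start>:<stop>[:<num>]" -> logarithmically spaced values
        parts = [float(p) for p in body[1:].split(":")]
        start, stop = parts[0], parts[1]
        if len(parts) > 2:
            return np.logspace(start, stop, int(parts[2]))
        return np.logspace(start, stop)  # NumPy's default number of samples
    # "=<start>:<stop>[:<step>]" -> integer range
    parts = [int(p) for p in body.split(":")]
    return range(*parts)


depths = expand_value("=1:10:3")       # range(1, 10, 3)
c_values = expand_value("=e-3:3:7")    # seven values from 1e-3 to 1e3
criterion = expand_value("gini")       # returned unchanged
```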
-
jamie.models.train(config, snapshot, featureset, models, prediction_field, oversampling, scoring, random_state=100)
Train models and save model snapshots. Called when using jamie train.
- Parameters
config (jamie.config.Config) – Configuration object
snapshot (jamie.snapshots.TrainingSnapshot) – Training snapshot to use
featureset (str) – Featureset to use
models (Optional[List[str]]) – List of models to train on, by default all models are selected
prediction_field (str) – Which column of the training set data to use for prediction.
oversampling (bool) – Whether to oversample to form a balanced set, passed to nested_cross_validation().
scoring (str) – Scoring method to use for grid search, passed to nested_cross_validation().
random_state (int) – Seed to initialise the random state (default: 100)
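To make the oversampling option concrete, the following sketch shows one common way to form a balanced set: random oversampling with replacement. Which oversampling method jamie actually uses is not stated here, so the function `random_oversample` and the toy data are assumptions made purely for illustration.

```python
import numpy as np


def random_oversample(X, y, random_state=100):
    """Hypothetical helper: duplicate minority-class rows until classes are balanced."""
    rng = np.random.default_rng(random_state)
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    indices = []
    for cls, count in zip(classes, counts):
        cls_idx = np.flatnonzero(y == cls)
        # Draw the missing rows at random, with replacement, from the same class.
        extra = rng.choice(cls_idx, size=target - count, replace=True)
        indices.append(np.concatenate([cls_idx, extra]))
    indices = np.concatenate(indices)
    return X[indices], y[indices]


# Toy imbalanced data: eight rows of class 0, two rows of class 1.
X = np.arange(20).reshape(10, 2)
y = np.array([0] * 8 + [1] * 2)
X_bal, y_bal = random_oversample(X, y)
```

In a cross-validation setting, resampling is usually applied to the training portion of each fold only, so that duplicated rows do not leak into the held-out data.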