Command Line Interface
Python API for the Jamie command line interface (CLI). This module allows the Jamie CLI to be operated from a Python environment such as a Jupyter notebook. For information about parameters, see Workflow. You can also pass a custom configuration path to jamie.Jamie.
Example
Running a standard pipeline in a Python environment:
>>> from jamie import Jamie
>>> jobs = Jamie()
>>> print(jobs.cf) # show current configuration
>>> jobs.scrape()
>>> jobs.load()
>>> jobs.train("2020-01-01T12-00-00") # train using specified snapshot
>>> jobs.predict()
>>> jobs.report()
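The same sequence of steps can be wrapped in a small helper so that it is easy to rerun from a notebook. The following is a minimal sketch, not part of Jamie itself: run_pipeline and RecordingJobs are hypothetical names introduced here, and RecordingJobs is a stand-in that only records calls so that the sketch runs without Jamie installed. It assumes only the method names and the train(snapshot=None) signature listed in the API section.

```python
def run_pipeline(jobs, training_snapshot=None):
    """Call the standard pipeline steps in order on `jobs`.

    `jobs` is any object exposing scrape(), load(), train(snapshot),
    predict() and report(), as jamie.Jamie does according to the API
    listing. Returns the names of the steps that completed.
    """
    steps = [
        ("scrape", ()),
        ("load", ()),
        ("train", (training_snapshot,)),  # None matches the documented default
        ("predict", ()),
        ("report", ()),
    ]
    completed = []
    for name, args in steps:
        getattr(jobs, name)(*args)
        completed.append(name)
    return completed


class RecordingJobs:
    """Hypothetical stand-in for jamie.Jamie that records each call."""

    def __init__(self):
        self.calls = []

    def scrape(self):
        self.calls.append(("scrape", None))

    def load(self):
        self.calls.append(("load", None))

    def train(self, snapshot=None):
        self.calls.append(("train", snapshot))

    def predict(self):
        self.calls.append(("predict", None))

    def report(self):
        self.calls.append(("report", None))


demo = RecordingJobs()
print(run_pipeline(demo, "2020-01-01T12-00-00"))
```

With Jamie installed, run_pipeline(Jamie(), "2020-01-01T12-00-00") should perform the same calls as the session above, assuming the steps do not need any further arguments.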
API
class jamie.Jamie(config=None)
    Job Analysis by Machine Information Extraction

    config(field=None, value=None)
        Reads and sets jamie configuration

    distribution(kind)
        Distribution of jobs in database: monthly or yearly

    features()
        List possible features (job types)

    information_gain(training_snapshot=None, text_column='description', output_column='aggregate_tags')
        Calculates information gain for text ngrams in training snapshot

    list_jobids()
        List job ids from jobs database

    load(dry_run=False)
        Read scraped jobs into MongoDB

    predict(snapshot=None)
        Predict using specified snapshot

    random_sample_prediction(snapshot=None, n_each_class=100, random_state=100)
        Generates a random sample of positive and negative classes

    readjob(fn, save=False)
        Reads a job HTML and prints in JSON format, with option to save

    report(snapshot=None)
        Generate report using specified snapshot

    scrape()
        Scrapes jobs from jobs.ac.uk

    setup()
        Initial setup for Jamie

    snapshots(kind, instance=None)
        Show saved snapshots (models/training)

    train(snapshot=None, featureset='rse', models=None, prediction_field='aggregate_tags', oversampling=False, scoring='precision', random_state=100)
        Train using specified snapshot (default: last)

    version()
        Version information for jamie

    view_report(snapshot=None, port=8000)
        Starts a local webserver to display reports
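Several of the methods above accept a snapshot argument, and the example passes "2020-01-01T12-00-00" to train. That string looks like a timestamp with hyphens in place of colons. The sketch below assumes that format, which is inferred from that single example and not confirmed by this page; SNAPSHOT_FORMAT, snapshot_name, parse_snapshot and latest_snapshot are hypothetical helpers, not part of Jamie.

```python
from datetime import datetime

# Assumed snapshot naming scheme, inferred from "2020-01-01T12-00-00".
SNAPSHOT_FORMAT = "%Y-%m-%dT%H-%M-%S"


def snapshot_name(moment):
    """Format a datetime as a snapshot-style name."""
    return moment.strftime(SNAPSHOT_FORMAT)


def parse_snapshot(name):
    """Parse a snapshot-style name back into a datetime."""
    return datetime.strptime(name, SNAPSHOT_FORMAT)


def latest_snapshot(names):
    """Return the most recent name from an iterable of snapshot names."""
    return max(names, key=parse_snapshot)


names = ["2019-12-31T23-59-59", "2020-01-01T12-00-00"]
print(latest_snapshot(names))
```

If the naming assumption holds, a name chosen this way could be passed as the snapshot argument of train, predict or report to pin a run to a particular snapshot instead of relying on the default.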