Command Line Interface

API description of the command-line interface to Jamie. This module lets you run the Jamie CLI from a Python environment such as a Jupyter notebook. For information about parameters, see Workflow. You can also pass a custom configuration path to jamie.Jamie.

Example

Running a standard pipeline in a Python environment:

>>> from jamie import Jamie
>>> jobs = Jamie()
>>> print(jobs.cf)  # show current configuration
>>> jobs.scrape()
>>> jobs.load()
>>> jobs.train("2020-01-01T12-00-00")  # train using specified snapshot
>>> jobs.predict()
>>> jobs.report()
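
The same sequence of stages can be wrapped in a small helper when it is run repeatedly, for example from several notebook cells. The following is a minimal sketch: `run_pipeline` and `RecordingJobs` are hypothetical names introduced here for illustration and are not part of jamie. The helper only assumes an object that exposes the `scrape`, `load`, `train`, `predict` and `report` methods listed in the API section.

```python
def run_pipeline(jobs, snapshot=None):
    """Run the standard stages in order on a Jamie-like object.

    Hypothetical helper, not part of jamie. `jobs` is assumed to expose
    scrape, load, train, predict and report.
    """
    jobs.scrape()         # fetch job adverts
    jobs.load()           # read scraped jobs into the database
    jobs.train(snapshot)  # None leaves the choice of snapshot to train()
    jobs.predict()
    jobs.report()
    return jobs


class RecordingJobs:
    """Stand-in that records method calls, so the sketch runs without jamie."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        # Only reached for attributes that do not exist, i.e. the stage methods.
        def method(*args, **kwargs):
            self.calls.append((name, args))
        return method


stub = RecordingJobs()
run_pipeline(stub, "2020-01-01T12-00-00")
print([name for name, _ in stub.calls])
```

With jamie installed, `run_pipeline(Jamie(), "2020-01-01T12-00-00")` would be the corresponding call, assuming the methods behave as documented here.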

API

class jamie.Jamie(config=None)

Job Analysis by Machine Information Extraction

config(field=None, value=None)

Reads and sets jamie configuration

distribution(kind)

Distribution of jobs in database: monthly or yearly
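
To illustrate what a monthly or yearly distribution amounts to, here is a minimal standalone sketch that counts dated records per period. It does not show how jamie implements `distribution`. The function name `count_by_period`, the input of plain `date` objects, and the exact strings `"monthly"` and `"yearly"` for `kind` are assumptions made for this example.

```python
from collections import Counter
from datetime import date


def count_by_period(dates, kind):
    """Count dates per month or per year (hypothetical helper, not jamie's code)."""
    if kind == "monthly":
        fmt = "%Y-%m"
    elif kind == "yearly":
        fmt = "%Y"
    else:
        raise ValueError("kind must be 'monthly' or 'yearly'")
    counts = Counter(d.strftime(fmt) for d in dates)
    # Sort by period label so the result reads chronologically.
    return dict(sorted(counts.items()))


posted = [date(2020, 1, 5), date(2020, 1, 20), date(2020, 2, 1), date(2021, 1, 1)]
print(count_by_period(posted, "monthly"))
print(count_by_period(posted, "yearly"))
```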

features()

List possible features (job types)

information_gain(training_snapshot=None, text_column='description', output_column='aggregate_tags')

Calculates information gain for text ngrams in training snapshot
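
For readers unfamiliar with the measure, information gain for a term is the entropy of the labels minus the entropy that remains once documents are split by whether they contain the term. The sketch below computes it for a single ngram in pure Python. It is an illustration of the general technique, not jamie's implementation: the function names are hypothetical, and ngram presence is approximated by a plain substring test rather than proper tokenisation.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def ngram_information_gain(documents, labels, ngram):
    """Information gain of the labels given presence or absence of `ngram`.

    Hypothetical helper: presence is a substring test, a simplification.
    """
    if not labels:
        raise ValueError("labels must not be empty")
    present = [y for doc, y in zip(documents, labels) if ngram in doc]
    absent = [y for doc, y in zip(documents, labels) if ngram not in doc]
    total = len(labels)
    # Weighted entropy that remains after splitting on the ngram.
    remaining = (len(present) / total) * entropy(present) \
        + (len(absent) / total) * entropy(absent)
    return entropy(labels) - remaining


docs = ["research software engineer", "software developer",
        "lecturer in history", "professor of law"]
tags = [1, 1, 0, 0]
print(ngram_information_gain(docs, tags, "software"))
```

An ngram that separates the classes perfectly yields a gain equal to the label entropy, while one that appears in every document yields zero.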

list_jobids()

List job ids from jobs database

load(dry_run=False)

Read scraped jobs into MongoDB

predict(snapshot=None)

Predict using specified snapshot

random_sample_prediction(snapshot=None, n_each_class=100, random_state=100)

Generates a random sample of positive and negative classes
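
The idea of drawing a fixed-size, reproducible sample from each class can be sketched as follows. This is an assumption-laden illustration rather than jamie's code: the helper name `sample_each_class`, the record layout (dicts with a boolean `"prediction"` key) and the `label_key` parameter are all hypothetical. Only `n_each_class` and `random_state` mirror the documented signature.

```python
import random


def sample_each_class(records, n_each_class=100, random_state=100,
                      label_key="prediction"):
    """Return up to n_each_class positive and negative records (hypothetical helper)."""
    rng = random.Random(random_state)  # seeded, so repeated calls give the same sample
    positives = [r for r in records if r[label_key]]
    negatives = [r for r in records if not r[label_key]]
    # Cap the sample size at the number of available records in each class.
    return (rng.sample(positives, min(n_each_class, len(positives)))
            + rng.sample(negatives, min(n_each_class, len(negatives))))


records = [{"id": i, "prediction": i % 3 == 0} for i in range(30)]
sample = sample_each_class(records, n_each_class=5)
print(len(sample))
```

Seeding a dedicated `random.Random` instance keeps the sample reproducible without touching the global random state.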

readjob(fn, save=False)

Reads a job HTML file and prints it in JSON format, with an option to save


report(snapshot=None)

Generate report using specified snapshot

scrape()

Scrapes jobs from jobs.ac.uk

setup()

Initial setup for Jamie

snapshots(kind, instance=None)

Show saved snapshots (models/training)

train(snapshot=None, featureset='rse', models=None, prediction_field='aggregate_tags', oversampling=False, scoring='precision', random_state=100)

Train using specified snapshot (default: last)
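
The example at the top of this page names a snapshot as `"2020-01-01T12-00-00"`. Assuming snapshot names follow that timestamp pattern, and assuming "last" means the most recent one, picking the default could look like the sketch below. How jamie actually resolves the default is not stated here, so `latest_snapshot` and `SNAPSHOT_FORMAT` are hypothetical names used only for illustration.

```python
from datetime import datetime

# Assumed format, inferred from the snapshot name used in the example above.
SNAPSHOT_FORMAT = "%Y-%m-%dT%H-%M-%S"


def latest_snapshot(names):
    """Return the most recent snapshot name (hypothetical helper, not jamie's code)."""
    if not names:
        raise ValueError("no snapshots available")
    # Parsing validates each name and orders by actual time, not by string.
    return max(names, key=lambda name: datetime.strptime(name, SNAPSHOT_FORMAT))


names = ["2020-01-01T12-00-00", "2019-12-31T23-59-59", "2020-01-02T00-00-00"]
print(latest_snapshot(names))
```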

version()

Version information for jamie

view_report(snapshot=None, port=8000)

Starts a local webserver to display reports