Types¶

Custom types used in Jamie and data schemas.

class jamie.types.Alert(value)¶: Alert levels for reporting

class jamie.types.Contract(value)¶: Contract type: Fixed Term or Permanent

class jamie.types.JobPrediction(prediction)¶

Represents prediction for a single job

jobid¶

JobID from jobs.ac.uk

Type: str

job_title¶

Job title

Type: str

snapshot¶

Model snapshot used for prediction

Type: str

closes¶

Closing date for job applications

Type: datetime.date

contract¶

Contract type

Type: Contract

department¶

Department of the academic institution that the job is associated with

Type: str

employer¶

Job employer

Type: str

date¶

Date of the job. This is usually the posted date; if that is not available, it defaults to the closing date for applications or the earliest date found in the job description. Use this attribute when computing time series.

Type: datetime.date
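The fallback described for date can be sketched as below. This is a minimal illustration, not Jamie's implementation: the helper resolve_job_date and its argument description_dates are hypothetical names, and treating the fallbacks as a strict order of preference (posted, then closes, then the earliest date in the description) is an assumption about how the sentence above should be read.

```python
import datetime
from typing import Iterable, Optional


def resolve_job_date(
    posted: Optional[datetime.date],
    closes: Optional[datetime.date],
    description_dates: Iterable[datetime.date] = (),
) -> Optional[datetime.date]:
    """Hypothetical helper mirroring the documented fallback for `date`."""
    if posted is not None:
        return posted
    if closes is not None:
        return closes
    # Last resort: the earliest date mentioned in the description, if any.
    return min(description_dates, default=None)
```

With this reading, a job with no posted date but a known closing date is placed on the time series at its closing date.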

posted¶

Date job was posted

Type: datetime.date

extra_location¶

Broad geographical location of job position

Type: str

salary_min¶

Minimum salary associated with the job. Sometimes jobs have a range of salaries depending on the experience of the applicant.

Type: Optional[int]

salary_max¶

Maximum salary associated with the job. Sometimes jobs have a range of salaries depending on the experience of the applicant.

Type: Optional[int]

salary_median¶

Median salary associated with the job.

Type: Optional[int]

probability¶

Probability that the job is classified in the positive class

Type: float

probability_lower¶

Lower bound of the confidence interval for the probability

Type: float

probability_upper¶

Upper bound of the confidence interval for the probability

Type: float

Parameters: prediction (dict) – Dictionary representing a single prediction from the JSONL file generated by Predict
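Because the constructor takes one dictionary per prediction, a JSONL file can be read line by line and each parsed line handed on. The sketch below covers only the parsing step, using the standard library. The function read_predictions is a hypothetical helper, the sample records are invented for illustration, and the exact keys written by Predict are an assumption.

```python
import io
import json
from typing import Dict, Iterator, TextIO


def read_predictions(fp: TextIO) -> Iterator[Dict]:
    """Yield one dict per non-empty line of a JSONL stream."""
    for line in fp:
        line = line.strip()
        if line:
            yield json.loads(line)


# Invented sample standing in for a predictions file.
sample = io.StringIO(
    '{"jobid": "ABC123", "job_title": "Research Software Engineer", "probability": 0.87}\n'
    '{"jobid": "DEF456", "job_title": "Lecturer in History", "probability": 0.04}\n'
)
records = list(read_predictions(sample))
```

Each complete record from a real file could then be passed on as JobPrediction(record).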

class jamie.types.JobType(value)¶: An enumeration.

class jamie.types.PrecisionRecall(value)¶: An enumeration.

class jamie.types.TrainingData(description: str, job_title: str, aggregate_tags: int, placed_on: datetime.date, jobid: str, job_ref: str, contract: str, department: str, duration_ad_days: int, employer: str, enhanced: str, extra_location: str, final_bool: int, funding_amount: Optional[str], funding_for: Optional[str], hours: str, in_uk: bool, invalid_code: Optional[List[str]], json: Optional[str], location: str, not_student: bool, original: int, original_proba: float, qualification_type: str, reference: str, region: str, run_tag: str, salary: str, salary_max: Optional[float], salary_min: Optional[float], salary_median: Optional[float], subject_area: List[str], tags: List[str], tags_1: str, tags_2: str, tags_3: Optional[str], tag_count: int, agg_tags: float, multi_agg_tags: str, consensus_tags: str, diff_consensus_tags: str)¶

Schema for the training dataset.

Required columns for model training are ‘description’, ‘job_title’ and ‘aggregate_tags’. The attribute ‘placed_on’ is additionally required for time series graphs of the training data.
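A quick way to check a row against the required training columns is sketched below. The helper missing_training_columns is hypothetical and independent of TrainingData.validate(), whose behaviour is not described on this page; treating a None value as missing is also an assumption.

```python
from typing import Dict, List

# Columns the schema names as required for model training.
REQUIRED_FOR_TRAINING = ("description", "job_title", "aggregate_tags")


def missing_training_columns(row: Dict[str, object]) -> List[str]:
    """Return required training columns that are absent or None in a row."""
    return [name for name in REQUIRED_FOR_TRAINING if row.get(name) is None]


# Invented example row lacking a label.
row = {"description": "Develop simulation code", "job_title": "Research Associate"}
missing = missing_training_columns(row)
```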

Parameters

description (str) – Job description
job_title (str) – Job title
aggregate_tags (int) – Integer equal to 0 or 1
placed_on (datetime.date) – Date job was placed on
jobid (str) – Unique jobid given by jobs.ac.uk
job_ref (str) – Job reference, possibly used internally by the employer
contract (str) – Contract type, fixed term or permanent, full-time or part-time
department (str) – Department of the employer
duration_ad_days (int) – Duration of the job advertisement in days, from placed_on to closes
employer (str) – Employer name
enhanced (str) – HTML content can be “enhanced” or “normal”, which alters the parsing
extra_location (str) – Region of UK where job is from
final_bool (int) – Unknown boolean type
funding_amount (Optional[str]) – Funding amount text if for a PhD position
funding_for (Optional[str]) – Specifies whether funding is for UK, EU, international or self-funded students
hours (str) – Specifies whether job is full time or part time
in_uk (bool) – Specifies whether job is actually in the UK. Some jobs are by UK institutions but located overseas
invalid_code (Optional[List[str]]) – List of job attributes that could not be parsed
json (Optional[str]) – JSON representation of job
location (str) – City where job is located
not_student (bool) – Whether job is a PhD level position
original (int) – Unknown boolean type
original_proba (float) – Unknown probability
qualification_type (str) – Type of qualification required for the job in terms of education level
reference (str) – Unknown field
region (str) – Unknown field, possibly country of the UK where job is
run_tag (str) – Whether job was classified in first or second run
salary (str) – Text fragment which has information on salary
salary_max (Optional[float]) – If a salary range is specified, higher end of the salary range, otherwise same as median salary
salary_min (Optional[float]) – If a salary range is specified, lower end of the salary range, otherwise same as median salary
salary_median (Optional[float]) – Median of salary_min, salary_max if both are present, otherwise equals the salary value
subject_area (List[str]) – List of academic fields for the job
tags (List[str]) – List of tags (labels) given by coders to the job
tags_1 (str) – Label given to the job by coder 1, one of {‘No’, ‘Some’, ‘Insufficient Evidence’, ‘Most’}, when answering the question “How much time would be spent in this job developing software?”
tags_2 (str) – Label given to the job by coder 2, one of {‘No’, ‘Some’, ‘Insufficient Evidence’, ‘Most’}, when answering the question “How much time would be spent in this job developing software?”
tags_3 (Optional[str]) – Label given to job by coder 3 when coder 1 and coder 2 disagreed
tag_count (int) – Number of coders who classified the job
agg_tags (float) – Aggregate score from coders
aggregate_tags (int) – Classification of whether the job is in the target class or not (1 indicating it is, 0 otherwise)
multi_agg_tags (str) – Unknown field
consensus_tags (str) – [tentative] Consensus of tags_1 and tags_2
diff_consensus_tags (str) – Unknown field
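The relationship between salary_min, salary_max and salary_median described in the parameter list above can be sketched as follows. The helper derive_salary_fields is hypothetical, and it assumes the figures have already been parsed out of the salary text; how Jamie does that parsing is not covered here.

```python
from statistics import median
from typing import Dict, Optional, Sequence


def derive_salary_fields(amounts: Sequence[float]) -> Dict[str, Optional[float]]:
    """Hypothetical helper: `amounts` holds the figures parsed from the salary text."""
    if not amounts:
        return {"salary_min": None, "salary_max": None, "salary_median": None}
    low, high = min(amounts), max(amounts)
    # With a single figure, min, max and median all coincide.
    return {
        "salary_min": float(low),
        "salary_max": float(high),
        "salary_median": float(median([low, high])),
    }
```

For a range, the median of the two end points is their midpoint; for a single figure all three fields carry the same value, matching the descriptions above.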

static reliability(data, coders=3)¶: Returns DataFrame which can be used to compute reliability

validate()¶: Validates a single row of training set data
