Types
Custom types used in Jamie and data schemas.
- class jamie.types.Alert(value)
Alert levels for reporting.
- class jamie.types.Contract(value)
Contract type: Fixed Term or Permanent.
- class jamie.types.JobPrediction(prediction)
Represents the prediction for a single job.
- closes
Close date for the job.
- Type
- date
Date of the job. This is usually the same as the posted date; if that is not available, it defaults to the closing date for job applications, or to the earliest date found in the job description. Use this attribute when computing time series.
- Type
- posted
Date the job was posted.
- Type
- salary_min
Minimum salary associated with the job. Some jobs advertise a salary range that depends on the experience of the applicant.
- Type
Optional[int]
- salary_max
Maximum salary associated with the job. Some jobs advertise a salary range that depends on the experience of the applicant.
- Type
Optional[int]
- Parameters
prediction (dict) – Dictionary representing a single prediction from the JSONL file generated by Predict
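The fallback described for the date attribute can be sketched as below. This is a minimal illustration, not the code used by JobPrediction: the function name resolve_job_date and its arguments are hypothetical, and the order between the closing date and the earliest description date is an assumption read from the attribute description.

```python
from datetime import date
from typing import Iterable, Optional


def resolve_job_date(
    posted: Optional[date],
    closes: Optional[date],
    description_dates: Iterable[date] = (),
) -> Optional[date]:
    """Pick a date for a job using the documented fallback order.

    Hypothetical helper for illustration; not part of jamie.types.
    """
    # Prefer the posted date when it is available
    if posted is not None:
        return posted
    # Otherwise fall back to the date applications close
    if closes is not None:
        return closes
    # Last resort: earliest date found in the description, or None
    return min(description_dates, default=None)
```

A job with no posted date but a known closing date would resolve to the closing date, so it still lands somewhere on a time series rather than being dropped.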
- class jamie.types.JobType(value)
An enumeration.
- class jamie.types.PrecisionRecall(value)
An enumeration.
- class jamie.types.TrainingData(description: str, job_title: str, aggregate_tags: int, placed_on: datetime.date, jobid: str, job_ref: str, contract: str, department: str, duration_ad_days: int, employer: str, enhanced: str, extra_location: str, final_bool: int, funding_amount: Optional[str], funding_for: Optional[str], hours: str, in_uk: bool, invalid_code: Optional[List[str]], json: Optional[str], location: str, not_student: bool, original: int, original_proba: float, qualification_type: str, reference: str, region: str, run_tag: str, salary: str, salary_max: Optional[float], salary_min: Optional[float], salary_median: Optional[float], subject_area: List[str], tags: List[str], tags_1: str, tags_2: str, tags_3: Optional[str], tag_count: int, agg_tags: float, multi_agg_tags: str, consensus_tags: str, diff_consensus_tags: str)
Schema for the training dataset.
Required columns for model training are ‘description’, ‘job_title’, ‘aggregate_tags’. The attribute ‘placed_on’ is required for timeseries graphs of the training data.
- Parameters
description (str) – Job description
job_title (str) – Job title
aggregate_tags (int) – Integer equal to 0 or 1
placed_on (datetime.date) – Date job was placed on
jobid (str) – Unique jobid given by jobs.ac.uk
job_ref (str) – Job reference, possibly used internally by the employer
contract (str) – Contract type, fixed term or permanent, full-time or part-time
department (str) – Department of the employer
duration_ad_days (int) – Duration of the job advertisement in days, from placed_on to closes
employer (str) – Employer name
enhanced (str) – HTML content can be “enhanced” or “normal”, which alters the parsing
extra_location (str) – Region of UK where job is from
final_bool (int) – Unknown boolean type
funding_amount (Optional[str]) – Funding amount text if for a PhD position
funding_for (Optional[str]) – Specifies whether funding is for UK, EU, international or self-funded students
hours (str) – Specifies whether job is full time or part time
in_uk (bool) – Specifies whether job is actually in the UK. Some jobs are by UK institutions but located overseas
invalid_code (Optional[List[str]]) – List of job attributes that could not be parsed
json (Optional[str]) – JSON representation of job
location (str) – City where job is located
not_student (bool) – Whether job is a PhD level position
original (int) – Unknown boolean type
original_proba (float) – Unknown probability
qualification_type (str) – Type of qualification required for the job in terms of education level
reference (str) – Unknown field
region (str) – Unknown field, possibly country of the UK where job is
run_tag (str) – Whether job was classified in first or second run
salary (str) – Text fragment which has information on salary
salary_max (Optional[float]) – If a salary range is specified, higher end of the salary range, otherwise same as median salary
salary_min (Optional[float]) – If a salary range is specified, lower end of the salary range, otherwise same as median salary
salary_median (Optional[float]) – Median of salary_min, salary_max if both are present, otherwise equals the salary value
subject_area (List[str]) – List of academic fields for the job
tags (List[str]) – List of tags (labels) given by coders to the job
tags_1 (str) – Label given to the job by coder 1, one of {‘No’, ‘Some’, ‘Insufficient Evidence’, ‘Most’} when answering the question “How much time would be spent in this job developing software?”
tags_2 (str) – Label given to the job by coder 2, one of {‘No’, ‘Some’, ‘Insufficient Evidence’, ‘Most’} when answering the question “How much time would be spent in this job developing software?”
tags_3 (Optional[str]) – Label given to job by coder 3 when coder 1 and coder 2 disagreed
tag_count (int) – Number of coders who classified the job
agg_tags (float) – Aggregate score from coders
aggregate_tags – Classification of whether the job is in the target class or not (1 indicating it is, 0 otherwise)
multi_agg_tags (str) – Unknown field
consensus_tags (str) – [tentative] Consensus of tags_1 and tags_2
diff_consensus_tags (str) – Unknown field
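The relationship between salary_min, salary_max and salary_median, and the definition of duration_ad_days, can be sketched as follows. The helper names salary_fields and ad_duration_days are hypothetical and only mirror the parameter descriptions above; how the dataset was actually produced is not shown in this reference.

```python
from datetime import date
from typing import Optional, Tuple


def salary_fields(
    low: Optional[float],
    high: Optional[float],
    single: Optional[float] = None,
) -> Tuple[Optional[float], Optional[float], Optional[float]]:
    """Return (salary_min, salary_max, salary_median) for one job.

    Hypothetical helper for illustration; not part of jamie.types.
    """
    if low is not None and high is not None:
        # With exactly two values, the median is their midpoint
        return low, high, (low + high) / 2
    # No complete range: all three fields take the single salary value
    return single, single, single


def ad_duration_days(placed_on: date, closes: date) -> int:
    """Days from placed_on to closes, as described for duration_ad_days."""
    return (closes - placed_on).days
```

For a job advertised with a range of 30000 to 40000, this gives a median of 35000; for a job with a single salary figure, all three fields share that figure.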
- static reliability(data, coders=3)
Returns a DataFrame which can be used to compute reliability.
- validate()
Validates a single row of training set data.
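As a rough idea of what checking a single row against the required columns could look like, here is a minimal sketch on a plain dictionary. It is not the implementation of TrainingData.validate, whose behaviour is not described here; the function check_training_row, the constant REQUIRED_COLUMNS and the need_timeseries flag are assumptions made for illustration.

```python
from datetime import date
from typing import Any, List, Mapping

# Columns the schema description lists as required for model training
REQUIRED_COLUMNS = ("description", "job_title", "aggregate_tags")


def check_training_row(
    row: Mapping[str, Any], need_timeseries: bool = False
) -> List[str]:
    """Return a list of problems found in one training row (empty if none).

    Hypothetical helper for illustration; not part of jamie.types.
    """
    problems = []
    for column in REQUIRED_COLUMNS:
        if row.get(column) is None:
            problems.append(f"missing required column: {column}")
    # aggregate_tags is documented as an integer equal to 0 or 1
    tag = row.get("aggregate_tags")
    if tag is not None and tag not in (0, 1):
        problems.append("aggregate_tags must be 0 or 1")
    # placed_on only matters when timeseries graphs are wanted
    if need_timeseries and not isinstance(row.get("placed_on"), date):
        problems.append("placed_on must be a datetime.date for timeseries graphs")
    return problems
```

Returning a list of messages rather than raising on the first failure makes it easier to report every issue in a row at once.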