
Graduate Outcomes - a summary of the year five SOC coding assurance

This year’s summary report refers to results of the assessments that have taken place on the year five (C21072) dataset only. This is to ensure providers can quickly navigate to the detail relevant to the current year.

You can find further detail about the overall approach we take to quality assurance under Considerations - quality assurance, which has been reviewed and improved.


Results of the random sampling process

A sample of records was selected at random from each cohort and checked for accuracy and consistency by HESA. Further details are available under Considerations - quality assurance.

As a result of the random sampling assessment, 23 broad occupation groups were raised with the coding supplier, Oblong, for checking this year. As is the case with provider feedback, a broad occupation group could include multiple job titles that were impacted by an issue, and may lead to changes for multiple records. Due to the nature of SOC coding and the updates that occur to the ONS indexes during a collection, some of these areas may already have been picked up in existing consistency checking or quality exercises. We still highlighted them to Oblong for completeness.

Results of HESA’s additional quality checks

This section outlines some of the additional quality checks that HESA carries out as part of the process and the outcomes from these checks. 

Following the completion of Oblong’s in-depth consistency checking exercise across the entire dataset, a further set of assessments was carried out on the updated SOC data. These attempted to identify non-random anomalies in the dataset.

Job title and major group discrepancies

Job titles with records coded into multiple major groups were extracted. Those where the level of potential error could exceed the error threshold were checked to confirm that the assigned major groups were as expected. Where the major groups used for coding appeared unexpected, the dataset was checked to identify whether issues were present.

No occupation areas were found to have issues as a result of this investigation this year. 
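As a rough illustration of the first step of this check, the following Python sketch groups records by job title and lists the titles coded into more than one major group. It assumes that the major group is the first digit of the four-digit SOC 2020 code; the record layout, the sample data and the function names are hypothetical and are not taken from HESA's or Oblong's actual tooling.

```python
from collections import defaultdict

UNCODABLE = "0001"

# Hypothetical sample records: (job title, four-digit SOC 2020 code).
records = [
    ("Care coordinator", "3560"),
    ("care coordinator", "6136"),
    ("Learning support assistant", "6113"),
    ("Learning support assistant", "6113"),
    ("Researcher", "2162"),
    ("Researcher", "2115"),
    ("Unknown", UNCODABLE),
]


def major_groups_by_title(rows):
    """Map each normalised job title to the set of major groups it was coded into."""
    groups = defaultdict(set)
    for title, soc in rows:
        if soc == UNCODABLE:  # uncodable records carry no major group
            continue
        groups[title.strip().lower()].add(soc[0])
    return groups


def titles_with_multiple_major_groups(rows):
    """Return the titles coded into more than one major group, for manual review."""
    return {
        title: sorted(groups)
        for title, groups in major_groups_by_title(rows).items()
        if len(groups) > 1
    }


flagged = titles_with_multiple_major_groups(records)
```

In this sample, only "care coordinator" is listed, because its records fall into major groups 3 and 6. A title appearing in the list is a prompt for review rather than evidence of an error, since a job title can legitimately span major groups depending on duties.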

0001 (uncodable) records

An assessment of all 0001 (uncodable) records was carried out by HESA to determine if there were any additional records that could have been assigned a code. The vast majority of uncodable records could not be assigned a code due to a lack of sufficient information available to allow accurate coding. 

However, 56 records were returned to Oblong to be checked again and five were successfully assigned a code. Because these records are checked against similar records in the wider dataset, the exercise also identified some records that had previously been coded despite lacking sufficient information. These were therefore reassigned to 0001.

Same activity consistency checking

In cases where graduates selected two instances of employment, but marked that these were the same activity, a code will be assigned separately to both instances of employment. Although a difference in employment type can alter a code, instances where the assigned codes did not match were checked for inconsistencies, along with instances with one coded record and one 0001 (uncodable) record. 

These were returned to Oblong for checking and led to 25 coding changes. 
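The comparison described in this subsection could be sketched in Python along the following lines. The graduate identifiers, the pair layout and the status labels are hypothetical; the only details taken from the text are that mismatched codes, and pairs with one coded record and one 0001 record, are selected for checking.

```python
UNCODABLE = "0001"


def same_activity_status(code_a, code_b):
    """Classify the two SOC codes assigned to one 'same activity' pair."""
    if code_a == code_b:
        return "consistent"
    if UNCODABLE in (code_a, code_b):
        return "one_uncodable"  # one coded record and one 0001 record
    return "mismatch"           # two different codes for the same activity


# Hypothetical pairs: (graduate id, code for first instance, code for second instance).
pairs = [
    ("G1", "2222", "2222"),
    ("G2", "2222", "6131"),
    ("G3", "0001", "4131"),
]

to_review = [gid for gid, a, b in pairs if same_activity_status(a, b) != "consistent"]
```

Because a difference in employment type can legitimately alter a code, a "mismatch" status only marks the pair for checking and does not by itself indicate miscoding.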

SOC major group by subject

The distribution of SOC major groups was assessed by subject, using the Common Aggregation Hierarchy level 1 (CAH level 1). This was used to assess large subject-occupation groupings with a view to identifying unexpected trends. For example, these checks would flag an error if a large number of medicine and dentistry graduates were coded as major group 7 under sales and customer service occupations.

No issues were identified as the cause of any unexpected trends, as the occupations within the groups checked were coded correctly.
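A minimal Python sketch of a subject by major group check is shown below. It assumes that the major group is the first digit of the SOC code. The sample rows, the watchlist of subject and major group pairs, and the 20% threshold are all hypothetical and are used only to mirror the medicine and dentistry example above.

```python
from collections import Counter

# Hypothetical rows: (CAH level 1 subject, four-digit SOC code).
rows = [
    ("Medicine and dentistry", "2211"),
    ("Medicine and dentistry", "2211"),
    ("Medicine and dentistry", "2253"),
    ("Medicine and dentistry", "7111"),
    ("Business and management", "7111"),
    ("Business and management", "3554"),
]


def major_group_shares(data):
    """Share of each subject's records falling into each SOC major group."""
    counts = Counter((subject, soc[0]) for subject, soc in data)
    totals = Counter(subject for subject, _ in data)
    return {key: n / totals[key[0]] for key, n in counts.items()}


def flag_unexpected(data, watchlist, min_share):
    """Return watchlist pairs whose share reaches the (assumed) threshold."""
    shares = major_group_shares(data)
    return sorted(pair for pair in watchlist if shares.get(pair, 0.0) >= min_share)


# Hypothetical watchlist: combinations that would be surprising if they were large.
watchlist = {("Medicine and dentistry", "7")}
flagged = flag_unexpected(rows, watchlist, min_share=0.2)
```

A flagged pair would then be inspected at occupation level, since an unexpected trend can still consist of correctly coded records.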

Distribution of SOC major groups

The distribution of SOC major groups across survey years was checked to ensure that there were no major differences that may be cause for concern. Overall distributions were similar and offered reassurance that there were no major discrepancies. 
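One simple way to compare distributions between two survey years is to compute each major group's share and look at the largest absolute difference. The Python sketch below assumes this approach; the sample codes are hypothetical and deliberately tiny, and the text does not state which comparison method HESA actually uses.

```python
from collections import Counter

UNCODABLE = "0001"


def major_group_distribution(codes):
    """Share of coded records in each SOC major group (first digit of the code)."""
    groups = [code[0] for code in codes if code != UNCODABLE]
    if not groups:
        return {}
    counts = Counter(groups)
    return {group: n / len(groups) for group, n in counts.items()}


def largest_shift(dist_a, dist_b):
    """Return (absolute difference in share, major group) for the biggest change."""
    keys = set(dist_a) | set(dist_b)
    return max((abs(dist_a.get(k, 0.0) - dist_b.get(k, 0.0)), k) for k in keys)


# Hypothetical SOC codes for two survey years.
year_four = ["2112", "2222", "3560", "6113"]
year_five = ["2112", "2222", "2237", "2162"]

shift, group = largest_shift(
    major_group_distribution(year_four),
    major_group_distribution(year_five),
)
```

What counts as a "major difference" is a judgement call; any threshold applied to the resulting shift would be an assumption rather than something defined in this report.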

Results of HE provider feedback

This is the fifth year of the optional feedback process, which received a total of 1,055 queries from 29 higher education (HE) providers. To allow providers time to review their data after the closure of the final cohort, the cut-off date for feedback was 8 January 2024. We have outlined the assessment process in more detail under Considerations – provider feedback (optional).

A summary of the feedback and outcomes is shown in the table below. As with previous years, many providers raised similar queries or returned multiple rows of feedback for an occupation group, therefore the issues raised with Oblong were grouped by occupation. Although some of the individual queries raised by providers may not be examples of miscoding in themselves, the entire occupation group was checked and where a systemic issue or inconsistency was identified, it was raised with Oblong. 

Outcome                          Number of occupation groups reviewed by Oblong
Systemic                         1
Inconsistent                     9
Non-systemic / Not actionable    3

The results from this year’s assessment highlight a continued reduction in the number of issues identified as a result of HE provider feedback. In year one, 66 issues were identified as either inconsistent or systemic, reducing to 42 in year two, 40 in year three, 17 in year four and 10 in year five. The number of systemic issues identified in recent years is low, with only one systemic issue resulting from the process in each of years four and five, compared to six in year three and 12 in year two.
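The trend can be checked with some simple arithmetic. The short Python sketch below uses only the figures quoted in this section; the variable and function names are illustrative.

```python
# Issues identified as inconsistent or systemic, by survey year (from the text).
issues_by_year = {1: 66, 2: 42, 3: 40, 4: 17, 5: 10}

# Year five total from the table: 1 systemic + 9 inconsistent.
systemic, inconsistent = 1, 9
year_five_total = systemic + inconsistent


def pct_change(old, new):
    """Percentage change from old to new, rounded to one decimal place."""
    return round(100 * (new - old) / old, 1)


year_on_year = {
    year: pct_change(issues_by_year[year - 1], issues_by_year[year])
    for year in range(2, 6)
}
overall = pct_change(issues_by_year[1], issues_by_year[5])
```

The year five total matches the table (1 + 9 = 10). The year-on-year changes are -36.4%, -4.8%, -57.5% and -41.2%, and the overall change from year one to year five is -84.8%.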

The coding supplier was carrying out consistency checks on the entire collection as the HE provider feedback process was ongoing, and some of the systemic or inconsistent issues had already been rectified in their assurance work. If these were still systemic or inconsistent in the iteration of the dataset HESA was assessing, they have been included in the inconsistent group. 

This simultaneous quality checking is required to ensure that provider feedback is reviewed and incorporated into the dataset so that the additional quality checks and final data delivery can occur in a timely way. This means that as providers were basing their assessment on earlier versions of the dataset, several queries had already been resolved in later iterations of the ‘provisional’ or ‘raw’ dataset. These were marked as not actionable queries. 

Further detail - examples by coding group 

We’re committed to ensuring this process is as transparent as possible. To facilitate a better understanding of the outcome groups highlighted under Considerations - quality assurance, we have provided an example for each group. These are taken directly from the queries raised by the sector during the HE provider feedback process. 

1 Systemic: Care coordinator

There was only a single systemic issue identified in this year’s checks. This is an area that has been reviewed each year and that was being coded according to the ONS indexes. This systemic issue is related to changes in the job duties being seen in the dataset. 

Care coordinator is a job title which encompasses a variety of roles. Within the indexes, there are some clear codes that relate to the job title and also have industry qualifiers related to them. The code 3560 includes the direct job title of care coordinator, but only applies with an industry qualifier of ‘hospital service’. The code 6136 is the other index option with industry qualifiers including ‘social, welfare services’, ‘care/ residential home’ and ‘nursing home’. In the past, the group has been reviewed and coding has allowed for variation where evidence suggests these codes are not related to the role (e.g. duties, course title and qualification requirements indicating that they were clearly a mental health nurse would have led to code 2237). However, where appropriate, the index options had to be used according to the ONS guidance. 

HESA has seen an increase in responses where the existing index options did not seem sufficient for the job duties presented by graduates. Therefore, we have broadened the coding to allow 3560 for care coordinators within the NHS if duties indicate that this is most appropriate, even if they are not in a hospital service. We have also introduced a code of 4131 for those in more administrative roles, as this was best suited to the duties described by graduates. 

To clarify, this does not impact coding where evidence suggests that the role is not related to the above coding options. In these cases, alternative codes will be used. 
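To make the broadened approach easier to follow, the Python sketch below expresses it as a small decision function. This is an interpretation for illustration only: the codes and industry qualifiers come from the text above, but the `duties` labels, the `in_nhs` flag, the order in which the conditions are tested and the `None` fallback are all assumptions. Actual coding relies on a coder's judgement of the evidence.

```python
from dataclasses import dataclass
from typing import Optional

# Industry qualifiers quoted in the text for code 6136.
INDUSTRIES_6136 = {"social, welfare services", "care/ residential home", "nursing home"}


@dataclass
class CareCoordinator:
    industry: str                      # industry qualifier, e.g. "hospital service"
    in_nhs: bool = False               # hypothetical flag: role sits within the NHS
    duties: str = "care coordination"  # hypothetical label summarising described duties


def suggest_code(rec: CareCoordinator) -> Optional[str]:
    """Suggest a SOC code, or None where an alternative code needs manual review."""
    if rec.duties == "mental health nursing":
        return "2237"  # evidence points to a different role entirely
    if rec.duties == "administrative":
        return "4131"  # newly introduced option for more administrative roles
    if rec.duties != "care coordination":
        return None    # not related to the coding options above
    if rec.industry == "hospital service" or rec.in_nhs:
        return "3560"  # broadened: NHS roles outside a hospital service
    if rec.industry in INDUSTRIES_6136:
        return "6136"
    return None
```

For example, `suggest_code(CareCoordinator("nursing home"))` gives "6136", while the same record with `in_nhs=True` gives "3560" under the assumed ordering.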

2 Inconsistent: Research roles (various)

Occupations can be marked as inconsistent for a number of reasons. Commonly with inconsistent issues, an occupation group is split across two or more valid codes depending on job duties, and some records have not been coded into the most appropriate code and therefore need to be moved from one valid code to another. 

A number of research roles were checked as a result of HE provider feedback. Each research occupation group included various job titles. The two groups that showed slight inconsistencies can be broadly described as research scientists and researchers. These are reviewed each year by the coding supplier in their consistency checks, but as a few inconsistencies were present in the version of the dataset checked by HESA, they have been included as inconsistent in the feedback review.

Whilst levels of miscoding were not high, the complexity of the rules for various researchers and the variety of codes applied to the different roles had led to some respondents being assigned a code that may not be the most appropriate option. Further contextual information is used to code these roles, as job title and duties alone do not provide sufficient context. In cases where this information is not definitive, but evidence is strong that the graduate is within a professional research role, default codes are applied according to the ONS indexes.

Many of the queries raised were not changed, as information was not sufficient to apply a different code and the relevant major group 2 default code was already applied. For example, some researchers at universities were coded into Other researchers, unspecified discipline (2162), as the job duties only mentioned that the research was health related. Where no other contextual information could be applied, the default code was used, as various research codes could have applied including Biological scientists (2112), Biochemists and biomedical scientists (2113) or Social and humanities scientists (2115). 

However, some instances across the dataset were identified where either a more specific code had been applied in some more ambiguous instances, or where details could have been utilised to apply a more specific code. As a result, the group was raised with Oblong who ensured that the records had been amended as appropriate. 

3 Non-systemic: Principal occupation therapist

Many of the non-systemic issues found in HE provider feedback were fixed in Oblong’s consistency checks. Where they weren’t fixed, the fact that they were non-systemic meant that no further action was taken.

An instance determined to be non-systemic was a graduate with the job title of principal occupation therapist. This graduate should have been coded into Occupational therapists (2222) but was erroneously coded into Nursing auxiliaries and assistants (6131), which includes roles such as occupational therapy technical instructor, occupational therapy technician and others. This role was checked against various occupational/occupation therapists with a variety of job titles in order to ascertain if there were any other instances of miscoding within similar occupations. This was the only instance and was therefore deemed to be a non-systemic instance of miscoding. 

4 Not an issue: Learning support assistant

Many instances raised were determined to have no issues. It is worth noting that some of these are due to mismatches resulting from two sets of codes when graduates have two instances of employment. 

Learning support assistants were raised in HE provider feedback and were determined to have no issues with coding. Records were correctly assigned to the group Educational support assistants (6113) which includes the job title Learning support assistant in the indexes. It was requested that these be moved to codes including Special and additional needs education teaching professionals (2316) and Higher level teaching assistant (3231), as the respondents support those with additional needs. 

However, the indexes specifically include this job title under code 6113, and in previous assessments carried out in conjunction with the ONS, the guidance to HESA has been to code learning support assistants to 6113. This code encompasses roles that work with children with particular learning needs, and the duties and requirements of the roles presented by the graduates also indicate that this is the most appropriate code according to the indexes. As a result, it was determined that this was not an issue.

Feedback to HE providers and common misunderstandings

Each year, we include a section to feed back to HE providers about the process of receiving their coding queries. We hope that this continues to be useful in communicating why certain decisions are made and promotes continuous improvement.

  • The feedback provided to HESA via the template should be for the relevant survey year only. All other queries should be raised separately and will be used to inform the coding of the next survey year.
  • Activity Type referenced in the SIC/SOC feedback template is present in the SIC/SOC download as 0 (self-employment/own business/portfolio) or 1 (paid work for employer/ voluntary or unpaid) and does not refer to the graduate’s ALLACT or MIMPACT result.
  • We ask that providers do not provide uncodable (0001) records, as these will be checked automatically at the end of the collection by HESA and Oblong.
  • Graduates can have two distinct SOC codes, one for employment and the other for self-employment. Please provide the relevant job title and corresponding SOC code when submitting queries.
  • We need sufficient evidence for a code to be changed, including reference to the SOC 2020 coding framework, as mentioned above. Please consider these additional points when explaining the reason why the change is requested:
    • Salary should not be the only justification for a change in code as this variable cannot be used reliably and consistently to determine SOC codes.
    • A degree being required for a role should not be the only justification for a change in code as it is not a reliable indicator of SOC according to the framework.
    • Employer name should not be the only justification for a change in code, as job title and/or duties are the primary determinants.
    • Please refer to coding rules published by ONS to explain why some records have been coded in a certain way. For example, Assistant auditor will be coded as Auditor, but Auditor's assistant is an assistant role.
  • Information collected from sources outside the survey, such as a graduate's social media platforms, cannot be used as justification for a change.
  • Issues with the SOC indexes cannot be rectified by HESA. These can be raised with the ONS through the appropriate channels.
