University of Leeds

Our internship initiative presents a unique opportunity for students to participate in cutting-edge computational research projects during their undergraduate study period or following graduation.


Application deadline: Monday 20 April, 2026


Prospective projects

Below is a list of prospective projects you can apply for, complete with a short explanation and the lead supervisor's name and department. Please contact them before making your application; you will be asked if you have done this on the application form.

If you would like to learn more about a specific project before applying, download the full project proposal at the bottom of the page.

The impact of low-allergy formula prescribing on children’s oral health in Bradford. Peter Day, School of Dentistry.

Since 2010, the prescription of low‑allergy infant formula has increased 3.8‑fold. However, nine in ten of these children do not have a cow’s milk allergy. Low‑allergy formulas typically substitute lactose with glucose syrup, which may pose a risk to children’s oral health. Using linked datasets within the Bradford Secure Data Environment, this internship will estimate the causal effect of prescribed low‑allergy infant formula on the likelihood of children requiring a dental general anaesthetic by the age of six. In a supportive environment, the intern will learn to manage and analyse large, complex datasets using high‑performance computing.

Where did the English novel come from? Experimenting with computational approaches to investigate language in seventeenth-century prose fiction. Mel Evans, School of English.

This project seeks to identify and interpret key properties that characterise early literary prose fiction in English to enhance our understanding of how a new form of narrative communication (“the novel”) emerges and stabilises. The internship project will develop and investigate a corpus of English fiction from before 1710, using computational tools for text digitisation, data cleaning, analysis and visualisation. The intern will experiment with emerging techniques, such as OCR and vibe-coding, to identify new and best practice in corpus research.

Interpreting Readability Evaluation with Explainable AI. Nouran Khallaf, School of Languages, Cultures and Societies

This project investigates how explainability methods and LLM-based evaluation can be used as an evaluation framework for readability and text simplification systems. It examines whether model explanations align with linguistic transformations observed in human-authored simplifications. The project will produce an explainability- and LLM-based evaluation framework, an analysis of complexity triggers in readability prediction, and a reproducible research codebase with documentation, supporting interpretable and reproducible NLP research using HPC resources.

Scaling Rhetorical Move-Step Analysis: An HPC Pipeline for the Royal Society Corpus. Charles Lam, School of Languages, Cultures and Societies

This project develops an HPC pipeline for rhetorical move-step analysis, targeting the Royal Society Corpus 6.0, which contains 17,520 articles from 1665 to 1920 (78 million tokens). While previous work can map rhetorical structures in contemporary journals, scaling to historical corpora requires HPC to automate context-dependent classification. This internship falls within the Digital Humanities and Machine Learning themes. The intern will create a robust workflow to process XML data and feed a specialized visualization dashboard, transforming how we trace the evolution of scientific discourse over time.

Population-Scale Biomechanics: Modelling of a Spinal Unit using HPC. Gavin Day, Mechanical Engineering

This Digital Health project uses advanced simulations to study how the spine handles physical stress. Current models often rely on "average" geometry, ignoring population-level variability. We developed an automated Statistical Shape Model of the spine to address this. Using high-performance computing, the intern will advance this framework by implementing realistic disc properties across distinct spine variations. They will evaluate exactly how individual anatomy affects spinal movement and overall function.

Creating a Historical Sentiment Lexicon Using Johnson’s Dictionary and HPC. Emily Middleton, Lecturer in Digital Humanities

This project will build a historical sentiment lexicon using Johnson’s Dictionary Online with Google’s BERT model: the 50,000 headwords, definitions and examples will be used to calculate a sentiment score for each word, which can then be applied to historical texts. The aim is to try out different ways of developing scores and test their usefulness on articles from thousands of historical newspapers using HPC. While there have been some small-scale efforts to build lexica specific to datasets, this project represents a major step forward in making sentiment analysis historically rigorous.


Application Process

Please follow the link below to apply for the internships through the University of Leeds portal.

If you are having problems with your application, please contact us at enquiries@n8cir.org.uk


Application Deadlines

Application deadline: Monday 20 April, 2026

Shortlisting completed by: Friday 24 April, 2026

Interview dates: w/c 4 May, 2026


Download a full project proposal


