Publisher

Kingston University

Multi-Language Statistical Classification of Natural and Computer-Generated Texts, Kingston University, United Kingdom

Degree Level

PhD

Field of study

Computer Science

Funding

Funded PhD Project (Students Worldwide)

Deadline

Expired

Country

United Kingdom

University

Kingston University

Keywords

Computer Science
Mathematics
Natural Language Processing
Corpus Linguistics
Anomaly Detection
Content Analysis
Statistics
Linguistics
Machine learning

About this position

This PhD project at Kingston University investigates the statistical properties of written documents, drawing analogies to other type-token systems such as city populations and personal incomes. Large-scale studies of the Project Gutenberg corpus have revealed significant statistical variation between texts: some documents fit established models, while others present anomalies that merit deeper exploration. The research aims to analyze differences between languages, such as Finnish and English, using statistical methods, with the potential to classify languages based solely on their statistical characteristics.

A key aspect of the project is the comparison between AI-generated and natural texts across multiple languages. The study will compile a large, balanced corpus representing significant language groups, identify and separate statistical outliers, and generate a corresponding set of AI-generated texts. Each corpus will be characterized using existing type-token models, and these models will be ranked by their effectiveness in distinguishing between languages and between AI and natural texts. The outcomes could lead to the development of a robust language classifier or inform the creation of improved statistical models for document analysis.
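To make the idea of characterizing a text with a type-token model more concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the listing does not say which type-token models the project will use, so Zipf's law (frequency roughly proportional to a power of rank) stands in as a well-known example, words are split with a simple regular expression, and the exponent is estimated by an ordinary least-squares fit on log-log axes. The function names `type_token_counts` and `zipf_exponent` are hypothetical.

```python
import math
import re
from collections import Counter


def type_token_counts(text):
    """Return (token count, type count, per-type frequencies) for a text.

    Assumption: tokens are runs of word characters, lowercased.
    """
    tokens = re.findall(r"\w+", text.lower())
    freqs = Counter(tokens)
    return len(tokens), len(freqs), freqs


def zipf_exponent(freqs):
    """Estimate a Zipf exponent from type frequencies.

    Fits log(frequency) against log(rank) by least squares and returns
    the negated slope, so a text obeying f(r) = C / r**s yields s.
    """
    ranked = sorted(freqs.values(), reverse=True)
    if len(ranked) < 2:
        raise ValueError("need at least two distinct types to fit a slope")
    xs = [math.log(rank) for rank in range(1, len(ranked) + 1)]
    ys = [math.log(freq) for freq in ranked]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx


if __name__ == "__main__":
    sample = "the cat and the dog and the bird"
    n_tokens, n_types, freqs = type_token_counts(sample)
    print(n_tokens, n_types, zipf_exponent(freqs))
```

In a study of the kind described, a summary statistic like this exponent could be computed per document and compared across languages or between natural and generated texts. A log-log least-squares fit is a rough estimator, and a real analysis would likely use something more careful.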

The project is supervised by Dr M Tunnicliffe and is part of the Graduate School studentships competition for October 2026 entry. Funding may be available, covering tuition and stipend, subject to the outcome of the competition. Applicants should have a strong background in mathematics, statistics, computer science, or linguistics, with excellent quantitative and analytical skills. Experience in statistical modeling, programming, or corpus linguistics is advantageous. English language proficiency must meet Kingston University standards.

To apply, candidates should consult the Graduate School Studentships information and the Faculty of Engineering, Computing and the Environment research page at Kingston University. The application deadline is March 4, 2026. This opportunity is ideal for students interested in statistical modeling, natural language processing, and the intersection of AI and linguistics.

What's required

Applicants should hold a good undergraduate or master's degree in a relevant discipline such as mathematics, statistics, computer science, or linguistics. Strong quantitative and analytical skills are required. Experience with statistical modeling, programming, or corpus linguistics is desirable. English language proficiency must meet Kingston University requirements.

How to apply

Review the Graduate School Studentships information at Kingston University London. Visit the Faculty of Engineering, Computing and the Environment research page for further details. Prepare your application according to the guidance provided on these pages.
