Postdoc in Large Language Model Inferencing
This postdoctoral position at KTH Royal Institute of Technology is part of the Wallenberg Scholar project "Scalable and adaptive inferencing for democratizing AI," which has a budget of 18 million SEK. The project aims to significantly reduce the cost and power consumption of serving large language models, such as those behind ChatGPT, by advancing scalable and adaptive inferencing techniques. The research will focus on the design, implementation, and evaluation of distributed systems and networks for machine learning inference, including the development of agentic frameworks. As a postdoc, you will join a dynamic research lab, collaborate with faculty members and doctoral students, and contribute to both scientific publications and new grant applications.
The position is supervised by Professor Dejan Kostic and Associate Professor Marco Chiesa, offering opportunities for experimental systems work and interdisciplinary collaboration. The ideal candidate will have a strong background in distributed systems, networking, programming, and operating systems, with proficiency in C++, Python, Linux, and scripting languages. Experience with machine learning, GPU programming, or large-scale inference systems is highly valued. The role requires excellent English communication skills, critical thinking, and a willingness to experiment and explore new ideas. Awareness of diversity and equal opportunity issues, particularly gender equality, is also important.
KTH is a leading international technical university located in Stockholm, Sweden, known for its commitment to education, research, and innovation. The university provides a creative and dynamic environment, attractive benefits, and good working conditions. The postdoctoral appointment is a full-time, temporary position (up to two years) with a monthly salary. Applicants must submit a complete application, including a CV, diplomas, grades, translations if necessary, and a brief statement of research interests and goals. The deadline for applications is May 25, 2026.
This position is ideal for candidates seeking to advance their research career in AI, distributed systems, and scalable machine learning inference, while contributing to a high-impact project at a renowned institution.