Machine Learning Engineer – Large-Scale Systems & Real-Time Inference
An opportunity for a Machine Learning Engineer to design and deliver large-scale systems supporting the training, optimisation and deployment of advanced models in high-performance environments. You will collaborate with researchers, software engineers and hardware specialists to build solutions that push the limits of GPU acceleration, distributed computing and automated experimentation.
Key Responsibilities
- Design and implement distributed training pipelines handling high-volume data and complex model architectures
- Develop low-latency inference systems providing real-time, high-accuracy predictions in production
- Optimise and extend machine learning frameworks to improve training and inference performance
- Leverage GPU programming (CUDA, cuDNN, TensorRT) to maximise efficiency
- Automate model experimentation, tuning and retraining in partnership with research teams
- Work with infrastructure specialists to optimise workflows and reduce compute costs
- Assess and integrate emerging open-source tools to enhance ML development and deployment
Skills and Experience
- 5+ years’ experience in machine learning with a focus on training and inference systems
- Strong programming expertise in Python and in either C++ or CUDA
- Proficiency with PyTorch, TensorFlow or JAX
- Hands-on experience with GPU acceleration and distributed training (Horovod, NCCL or similar)
- Background in real-time, low-latency ML pipelines
- Familiarity with cloud and orchestration technologies
- Contributions to open-source ML or distributed systems projects are advantageous
What’s on Offer
- Work on technically demanding, high-impact ML systems at global scale
- Collaborate with a multidisciplinary team of experts in research, engineering and HPC
- Join a culture that prizes innovation, precision and technical excellence
- Competitive remuneration, benefits and clear scope for progression