LLM Serving Engineer (Cloud AI Engineering), Senior / Staff Engineer
Qualcomm
Location
Markham, Canada
Posted
June 08, 2026
Job Description
Company: Qualcomm Technologies, Inc.
Job Area: Engineering Group, Engineering Group > Machine Learning Engineering
LLM Serving Engineer (Cloud AI Engineering)
Qualcomm is utilizing its traditional strengths in digital wireless technologies to play a central role in the evolution of Cloud AI. We are investing in several supporting technologies, including Deep Learning. The Qualcomm Cloud AI team is developing hardware and software solutions for Inference Acceleration.
We are hiring LLM Serving Engineers at multiple levels to join our dynamic, collaborative team. This role spans the full product lifecycle—from cutting‑edge research and development to commercial deployment—and demands strategic thinking, strong execution, and excellent communication skills.
This Role Involves The Following Activities
Building a scalable LLM inference platform using inference techniques (e.g. disaggregated serving and KV‑Cache management, advanced parallelism, spec...