Qualcomm operates one of the largest High-Performance Computing (HPC) environments in the world catering to Electronic Design Automation (EDA) workloads. Given the number of SoCs Qualcomm develops across various industries, the HPC ecosystem is the core supporting fabric of Qualcomm’s ASIC design execution.

We are looking for a Platform Engineering intern to help design, deploy, and operate foundational components of our internal high-performance computing platform. The ideal candidate will have a deep understanding of every layer of a Linux-based distributed computing system and sufficient programming skills to develop solutions that automate routine operations.

Principal duties and responsibilities

As a Platform Engineering intern, you’ll be part of a global cross-functional Platform Engineering team that is responsible for:

  • Designing, developing and evolving the foundational HPC infrastructure to support large-scale EDA and AI workloads.
  • Evaluating and selecting the optimal hardware and software components for the foundational HPC infrastructure.
  • Continuously observing, assessing and optimizing the performance of the HPC environments.
  • Designing and implementing solutions to improve the resilience, scalability and security of the HPC environments.
  • Providing the highest tier of support to users of the HPC environments.

Minimum qualifications

  • Bachelor of Science degree (or equivalent) in computer science, engineering, or a relevant field.
  • Experience in operating distributed computing infrastructure.
  • Strong knowledge of computing, networking and storage system architecture.
  • Strong knowledge of the Linux operating system.
  • Proficiency in a systems programming language (C/C++, Rust, etc.) and scripting with Python/Perl/Bash.

Preferred qualifications

  • Solid understanding of Linux kernel space and user space. Ability to develop kernel modules and to analyze application core files and kernel crash dumps.
  • In-depth skills in assessing and optimizing all aspects of computational performance on Linux-based HPC systems.
  • Demonstrated experience with configuration management systems (Ansible, Chef, Puppet, etc.).
  • Demonstrated experience with high-performance storage systems.
  • Demonstrated experience with industry-standard interconnects and network fabrics, such as InfiniBand and Ethernet, and their impact on HPC system performance.
  • Demonstrated experience with job schedulers (LSF, Slurm, PBS, etc.).
  • Demonstrated experience with containers (Docker, Podman, Singularity, Apptainer, etc.) and container orchestration (Kubernetes).
  • Demonstrated experience with virtualization (KVM, VMware, Hyper-V, etc.).
  • Excellent communication and collaboration skills and the ability to work effectively in cross-functional teams.
  • Strong analytical and problem-solving skills, combined with a can-do attitude and a commitment to continuous learning.

*References to a particular number of years of experience are for indicative purposes only. Applications from candidates with equivalent experience will be considered, provided that the candidate can demonstrate an ability to fulfill the principal duties of the role and possesses the required competencies.