Looking for a role with plenty of growth opportunities? Join a pioneering company at the forefront of AI infrastructure development, powered by the world's leading GPU manufacturer, NVIDIA. The AI revolution demands advanced infrastructure (compute, storage, platforms, tools, and services for developers), which is precisely what this company is building. With an R&D core of approximately 850 top-tier AI engineers, many with extensive experience in big tech, the company specializes in creating world-class AI infrastructure. Headquartered in Amsterdam and listed on Nasdaq, it operates worldwide, with R&D hubs across Europe, North America, and Israel.
The Nasdaq-listed AI infrastructure provider is searching for a skilled Senior HPC Cluster Engineer to join the team. If you would like to learn more about this opportunity, feel free to reach out and apply today!
Responsibilities:
- Improve infrastructure supporting GPU-accelerated computing.
- Analyze root causes of performance and reliability issues across various scales and suggest effective solutions.
- Add support for new hardware across the infrastructure software stack.
- Proactively detect and resolve issues to ensure platform stability and efficiency.
Skills/Must have:
- 5+ years of professional software development experience.
- 3+ years working with Linux systems.
- Strong system-level understanding of server architecture, PCIe devices, NICs, and kernel drivers.
- Proficiency in performance-oriented programming languages (e.g., C, C++, Go, Java, Python).
Nice to have:
- Experience tuning performance for HPC workloads.
- Familiarity with RDMA, RoCE, and InfiniBand networking.
- Knowledge of software-defined networking (SDN) and HPC cluster networking.
- Understanding of the QEMU/KVM virtualization stack.
- Experience with deep learning frameworks (e.g., PyTorch, TensorFlow).
- Familiarity with collective communication libraries (e.g., MPI, NCCL).
- Willingness to complete a coding interview as part of the hiring process.
Benefits:
- Competitive salary and full benefits package.
- Opportunities for professional growth and internal mobility.
- Hybrid work environment with flexibility.
- Collaborative and forward-thinking engineering culture.
- Contribute to the infrastructure that powers next-generation AI computing.
- Collaborate with experts in virtualization, hardware acceleration, and high-performance clusters.
- Gain exposure to advanced technologies like RDMA, RoCE, InfiniBand, and QEMU/KVM.
Salary:
- Competitive and based on experience.
- €70,000.00 - €100,000.00 monthly
