Infrastructure Engineer

closed

Pallon

πŸ“Remote - Germany, Worldwide

Summary

Join Pallon, a spin-off from ETH Zurich, as a seasoned infrastructure engineer to take full ownership of our infrastructure, from our high-performance GPU cluster to our cloud systems. You will lead critical decisions around architecture, performance, and scale, while also solving real-world issues. Collaborate closely with our platform and computer vision teams to ensure tools run fast, reliably, and securely. This hands-on role offers autonomy to shape the infrastructure. You will design and build a custom GPU cluster, manage and scale infrastructure, and keep systems running smoothly and securely. The ideal candidate has 5+ years of experience owning infrastructure end-to-end, strong Linux fundamentals, and excellent communication skills.

Requirements

  • Have 5+ years of experience owning infrastructure end-to-end, ideally in startup environments
  • Be comfortable at every layer, from bare-metal servers and NVMe drives to container orchestration and cloud-native tools
  • Have strong Linux fundamentals, and know your way around networking, storage, and distributed systems
  • Code well enough to automate, debug, and build tooling across a variety of languages
  • Communicate clearly and collaborate well, especially with engineers who aren't infra specialists
  • Thrive with autonomy and manage your own priorities effectively
  • Be curious and fast-learning, especially when tackling new tools or challenges
  • Have a university degree in Computer Science or a related field

Responsibilities

  • Design and build a custom GPU cluster for deep learning workloads
  • Decide how we manage and scale our infrastructure, both on-prem and in the cloud
  • Keep systems running smoothly and securely, from data pipelines to distributed training jobs
  • Troubleshoot weird kernel errors, configure systemd units, or debug Kubernetes evictions
  • Make calls on when to script, when to automate, and when to just fix the thing

Preferred Qualifications

  • Have experience with machine learning infrastructure or HPC clusters
  • Have familiarity with data engineering workflows and ETL pipelines

Benefits

  • Contribute to a positive impact on society and the environment
  • Develop a novel product that changes a whole industry
  • Be part of a motivated, smart, fun, and supportive team of software engineers and AI researchers
  • Own a part of Pallon and have a part in our success with our Employee Stock Option Plan (ESOP)
  • Work from home or enjoy access to our beautiful office space located in Zürich
This job is filled or no longer available
