Phantom AI is hiring a Linux System Engineer in the United States
![Logo of Phantom AI](https://cdn.jobscollider.com/logo/phantom-ai-7ba9.webp)
Linux System Engineer (closed)
🏢 Phantom AI
💵 ~$120k-$160k
📍United States
📅 Posted on Jun 11, 2024
Summary
This is an AI/ML cluster infrastructure support role at Phantom AI, a company specializing in cost-effective L2/L3 solutions for the automotive industry. The position involves systems automation, configuration management, cluster health monitoring, debugging application performance issues, and more. The role can be remote or in-office, and the company provides equal employment opportunities.
Requirements
- Bachelor’s degree in computer science, electrical engineering or related field
- Strong understanding of Linux fundamentals and performance optimizations (Ubuntu)
- Advanced experience configuring and managing SLURM, including setting it up from scratch
- Demonstrable knowledge of TCP/IP, Linux operating system internals, filesystems, disk/storage technologies and storage protocols
- Experience in collaborating with network and data center teams for large scale cluster builds
- Experience with configuration management software, systems monitoring and alerting (Prometheus, Grafana, Telegraf, Splunk, etc.), and/or administering HPC workload managers (SLURM)
- Experience with high-throughput low-latency networks, GPU-based computing systems, and/or high performance storage systems
- Experience with Slurm and storage management of distributed parallel file systems is a plus
- 3+ years of additional equivalent experience or evidence of exceptional ability related to the position
Responsibilities
- Support the GPU-based AI/ML cluster infrastructure, focusing on systems automation, configuration management, and deployment at scale
- Improve our cluster health monitoring and auto-recovery pipeline
- Work with users on debugging application performance issues
- Work with hardware and storage vendors to tune and optimize our servers, TrueNAS storage, and network
- Automate and deploy GPU clusters with Ansible
- Performance tuning and OS provisioning on Linux systems
- Manage HPC clusters, workloads and applications
Benefits
- This is a contract position
- Office snacks & reimbursable meals* when in-office
This job is filled or no longer available