Senior Systems Engineer HPC

Rackspace Technology

💵 $116k-$198k
📍 Remote - United States

Summary

Join Rackspace as an HPC System Engineer and contribute to the success of our flagship clients. You will be responsible for designing, implementing, maintaining, and optimizing high-performance computing (HPC) infrastructure for a leading client. This role involves collaborating with researchers, scientists, and other engineers to ensure the smooth operation of HPC systems. The position is 100% remote, with a preference for candidates in the PST or CST time zones. Minimal travel may be required to San Antonio, TX or Seattle, WA.

Requirements

  • Bachelor's degree in computer science, engineering, or a related field
  • Experience may substitute for the degree
  • Minimum of 10 years of experience working with systems; 5 years specifically with HPC
  • Strong knowledge of Linux operating systems (e.g., Rocky, Ubuntu)
  • Experience with cluster management tools (e.g., Slurm, PBS)
  • Familiarity with high-speed interconnects (e.g., InfiniBand, Ethernet)
  • Knowledge of parallel file systems (e.g., Lustre, Ceph, GPFS)
  • Proficiency in scripting languages (e.g., R, Python, Bash)
  • Understanding of HPC hardware architectures and technologies (e.g., CPUs, GPUs, memory)
  • Strong demonstrated experience with a major configuration management tool (e.g., Terraform, Ansible), including application packaging and installation
  • Must have strong knowledge of Linux security and Linux shell scripting
  • Strong communication and interpersonal skills
  • Knowledge of data transfer protocols and large-scale storage solutions

Responsibilities

  • Install, configure, and maintain HPC clusters, including hardware and software components
  • Monitor system performance, identify bottlenecks, and implement solutions to optimize performance
  • Manage user accounts, permissions, and resource allocation
  • Perform regular system maintenance, updates, and patching
  • Troubleshoot and resolve hardware and software issues in a timely manner
  • Participate in the design and planning of HPC infrastructure upgrades and expansions
  • Evaluate and recommend hardware and software solutions to meet evolving computational needs
  • Implement and manage storage systems, networking infrastructure, and interconnects (e.g., InfiniBand)
  • Optimize system configurations and application performance for HPC workloads
  • Profile and analyze application performance to identify areas for improvement
  • Implement and utilize performance monitoring tools and techniques
  • Provide technical support and training to HPC users
  • Collaborate with researchers and scientists to understand their computational requirements
  • Work closely with HPC architects and engineers to ensure that research needs are met
  • Document system configurations, procedures, and best practices
  • Assist HPC engineers and architects with day-to-day operations and ticket management
  • Implement and maintain security measures to protect HPC infrastructure and data
  • Ensure compliance with relevant security policies and regulations
  • Manage data backups and disaster recovery procedures

Benefits

  • The anticipated starting pay range for Colorado is: $116,100 - $170,280
  • The anticipated starting pay range for the states of Hawaii and New York (not including NYC) is: $123,600 - $181,280
  • The anticipated starting pay range for California, New York City and Washington is: $135,300 - $198,440
  • Unless already included in the posted pay range and based on eligibility, the role may include variable compensation in the form of bonus, commissions, or other discretionary payments
  • These discretionary payments are based on company and/or individual performance and may change at any time
  • Actual compensation is influenced by a wide array of factors including but not limited to skill set, level of experience, licenses and certifications, and specific work location
