Senior Infrastructure Engineer

KoBold Metals
Summary
Join KoBold, a leading mineral exploration company, and contribute to building reliable and scalable infrastructure for turning data and models into real-world exploration insights. Partner with exploration and engineering teams to improve observability, streamline MLOps workflows, and maintain shared tools. This role involves designing, building, and operating scalable compute infrastructure, embedding observability and security in the software development process, and creating automation for monitoring and deployments. You will lead capacity planning, performance reviews, and system tuning, participate in on-call rotations and incident response, and establish disaster recovery practices. Continuous learning about mineral exploration is essential, including field work with geologists. This position offers a unique opportunity to contribute to a fast-growing company and make a real-world impact on the mining industry.
Requirements
- 5+ years of experience as an Infrastructure Engineer, Site Reliability Engineer or in a similar role
- Strong scripting and programming skills (Python, Go, Java, or JavaScript/Node.js)
- Experience with IaC tools like Terraform, containers (Docker), and container orchestration tools like Kubernetes
- Experience with cloud platforms such as AWS
- Experience operating or administering JupyterHub in a multi-user environment
- Understanding of MLOps workflows, including model training, deployment, and related tooling
- Excellent communication & collaboration skills and a continuous improvement mindset
- Proven ability to troubleshoot complex issues and implement effective solutions
- Proven ability to thrive in dynamic and evolving environments, effectively navigating uncertainty and incomplete information
- Proven ability to grow your own expertise and to influence and educate others
- Comfortable making informed decisions with limited data, adapting quickly to new circumstances, and maintaining focus on strategic objectives while driving clarity for the team
- Intellectual curiosity and eagerness to learn about all aspects of mineral exploration, particularly geology. You enjoy constant learning, use our tools to drive exploration insights, and are willing to work directly with geologists in the field
- Ability to explain technical problems to and collaborate on solutions with domain experts who are not infrastructure engineers. A strong communicator who enjoys working with colleagues across the company
- Excitement about joining a fast-growing early-stage company, comfort with a dynamic work environment, and eagerness to take on an evolving range of responsibilities
- Keen not just to build cool technology, but to figure out what technical product to build to best achieve the business objectives of the company
Responsibilities
- Design, build, and operate compute infrastructure that is both scalable and reliable to support critical services
- Work closely with engineering teams to embed observability, reliability, and security throughout the software development process
- Create and maintain automation for monitoring, deployments, and incident response to keep operations efficient and predictable
- Lead or support capacity planning, performance reviews, and system tuning to ensure stable and efficient systems
- Join the on-call rotation and take part in incident response, troubleshooting, and resolution
- Develop and refine monitoring and alerting to catch issues early and reduce downtime
- Establish and maintain disaster recovery and business continuity practices that protect the organization against failures
- Regularly review and improve our tools and processes to strengthen system visibility and reliability
- Investigate points of fragility in distributed systems and understand how complex systems behave under stress in order to improve resilience
- Continually learn about mineral exploration through reading, discussions with exploration team members, periodic rotation on an exploration team and time in the field with geologists
Benefits
Remote. Candidates can be located anywhere in the United States or Canada.

