Summary

Join Abnormal Security's IT team as an Infrastructure/DevOps Engineer to build and maintain reliable, scalable, and secure infrastructure for AI software engineers. You will collaborate with IT, security, and AI/ML engineering teams to support experimentation, deployment, and monitoring of advanced AI tools and solutions. This fully remote role (US and Canada) is perfect for someone passionate about systems engineering, AI enablement, and solving complex operational challenges. You will architect and manage infrastructure, implement containerization and orchestration, develop CI/CD systems, and collaborate on security and compliance. The ideal candidate thrives in collaborative environments and values automation, self-service tools, and building reliable systems. This role requires strong communication and a customer-first mindset.

Requirements

4+ years of experience in DevOps, SRE, or Infrastructure Engineering roles
Proficiency with cloud providers (AWS preferred), Kubernetes, and Docker
Experience with infrastructure as code tools (Terraform, Ansible, or Pulumi)
Strong scripting skills in Python, Bash, or similar
Familiarity with CI/CD systems such as GitHub Actions, Jenkins, or CircleCI
Understanding of networking, security, and identity management in cloud environments
Experience supporting ML workloads and GPU-based infrastructure
Ability to troubleshoot complex system issues in a distributed environment
Comfortable working with cross-functional teams and communicating with both technical and non-technical stakeholders

Responsibilities

Architect and manage infrastructure that supports AI/ML pipelines, tools, and data platforms
Implement and maintain containerization (e.g., Docker) and orchestration (e.g., Kubernetes) environments
Develop CI/CD systems that integrate with ML workflows and ensure reproducible AI experiments
Collaborate with security and compliance teams to ensure infrastructure meets data protection standards
Automate provisioning and deployment using IaC tools like Terraform or Pulumi
Monitor and troubleshoot infrastructure issues with tools like Prometheus, Grafana, and the ELK stack
Partner with AI and software engineers to optimize platform performance and resource utilization
Maintain clear, accessible documentation to scale platform knowledge across the org

Preferred Qualifications

Familiarity with MLOps tools like MLflow, Kubeflow, or SageMaker
Experience with AI platform infrastructure (e.g., model serving, feature stores)
Knowledge of logging and monitoring frameworks (e.g., Fluentd, Loki)
Background in supporting data platforms like Snowflake, Databricks, or Hadoop
AWS certification
Experience working in high-growth startups or tech companies

Benefits

Bonus
Restricted stock units (RSUs)

Infrastructure/DevOps Engineer

Abnormal Security

Remote

DevOps

Mid-level
