DevOps Engineer

TrustFlight
Summary
Join TrustFlight, a global leader in mission-critical software systems, as a DevOps Engineer to shape the backbone of our cloud-native SaaS platform. You will architect, modernize, and optimize our Azure-based infrastructure, ensuring high availability and performance. Responsibilities include driving infrastructure automation, enhancing CI/CD pipelines, and championing DevSecOps. You will collaborate with engineering and product teams, participate in a rotating support schedule, and contribute to incident response. This remote position, based in South Africa, offers opportunities for growth and development in a dynamic and challenging environment. Your work will directly impact the safety and efficiency of aviation operations, affecting hundreds of thousands of passengers and crew monthly.
Requirements
- Proven experience building and evolving internal platform capabilities—including CI/CD pipelines, deployment tooling, and automation frameworks—that empower engineering teams to deliver software reliably and efficiently
- Hands-on experience with the Azure ecosystem, including deep familiarity with resource provisioning, monitoring, performance tuning, and cost optimization strategies
- Strong scripting and automation skills using tools like PowerShell, Bash, or other languages relevant to cloud operations
- Experience with, or working knowledge of, SQL database administration, particularly backup/recovery strategies and high-availability configurations (MySQL and PostgreSQL preferred)
- Strong understanding of cloud infrastructure architecture, distributed systems, and the operational challenges of supporting SaaS platforms at scale
- Demonstrated application of security best practices and DevSecOps principles across infrastructure and deployment lifecycles
- Experience applying modern AI tools to enhance observability, operational workflows, or support processes—paired with a solid understanding of their capabilities and limitations
- Deep understanding of containerization, orchestration, and virtualization technologies, including Kubernetes, Docker, and related tools
- Proficiency with CI/CD tools and workflows (preferably GitLab CI or Azure DevOps), and familiarity with release orchestration and environment management
- Experience with Infrastructure-as-Code practices and tooling (preferably Terraform), and strong infrastructure lifecycle management skills
- Familiarity with modern cloud service architectures such as web apps, microservices, API gateways, and service meshes
- Solid knowledge of networking fundamentals, including DNS, load balancing, VPNs, firewalls, and reverse proxies (NGINX preferred)
- Excellent organization and documentation practices that support reliable operations and knowledge sharing across teams
- Excellent communication and collaboration skills; proven ability to work effectively with cross-functional engineering and product teams
Responsibilities
- Architect, modernize, and optimize our Azure-based infrastructure to support platform growth, resilience, and scalability
- Ensure high availability and performance across cloud environments (primarily Azure, with some exposure to GCP) and shared services
- Monitor and analyze system health, usage patterns, and cost metrics to ensure reliable operations and cost-effective scaling
- Manage disaster recovery readiness, including backup procedures, failover testing, and incident preparedness
- Drive infrastructure automation and standardization to improve consistency, repeatability, and deployment efficiency across environments
- Continuously assess and enhance CI/CD pipelines, deployment tooling, and delivery workflows to enable fast and safe software releases
- Develop internal platform capabilities and self-service tools that empower engineering teams to ship code reliably
- Champion a DevSecOps mindset, embedding security best practices into infrastructure and deployment lifecycles
- Lead and contribute to incident response efforts, including real-time resolution, root cause analysis, and post-incident reviews
- Implement strategies for operational excellence, proactively identifying and resolving performance and stability bottlenecks
- Work closely with software engineers, product teams, and platform stakeholders to align infrastructure initiatives with business and technical goals
- Document architecture and operational processes clearly to promote knowledge sharing and reliable team operations
- Participate in a rotating support schedule, including coverage for European time zones and emergency response as needed
Preferred Qualifications
- Experience with GCP or multi-cloud environments
- Exposure to GitOps workflows and tools like ArgoCD or Kustomize
- Knowledge of .NET applications in cloud settings
- Familiarity with observability stacks (e.g., Grafana, ELK, Prometheus)
- Understanding of compliance frameworks like SOC 2 or ISO 27001
- Use of AI tools for enhancing operational efficiency
- Experience with SIEM integration and incident response tooling
- Comfort with remote collaboration and working across time zones
- Familiarity with SLOs, SLIs, error budgets, and operational KPIs
Benefits
- Opportunities for growth and development
- A dynamic and challenging scale-up environment
