Senior Infrastructure Engineer, SRE

closed

Flex

📍 Remote

Summary

Join Flex, a NYC-based FinTech company revolutionizing rent payments, as a Senior Infrastructure Engineer. You will be a key member of a small team responsible for building and maintaining scalable, reliable infrastructure on AWS and GCP. This remote role requires at least 5 years of cloud infrastructure experience and proficiency in technologies like Terraform, Kubernetes, and CI/CD. You will collaborate with service engineering teams, automate processes, and ensure optimal system performance. Flex offers a competitive compensation package, including comprehensive health insurance, 401(k), unlimited PTO, and parental leave.

Requirements

  • Proven experience in building, scaling and monitoring cloud infrastructure on AWS, especially EKS, S3, RDS, API Gateway, Load Balancers, VPC, Lambdas, DocumentDB and DynamoDB
  • Proven experience using Terraform to update and maintain cloud infrastructure
  • Proven experience with containerized applications, Kubernetes, and microservice deployments
  • Strong knowledge of GitHub Actions and CI/CD best practices
  • Experience with developer productivity tools: designing CI/CD workflows, building internal tools, and creating self-service solutions to streamline software development
  • Knowledge of monitoring and observability tools and frameworks, with working knowledge of Datadog being a plus
  • Familiarity with networking concepts (DNS, load balancing, firewalls, VPNs)
  • Strong collaboration skills with the ability to work effectively across teams and communicate technical ideas clearly
  • Experience writing and reading code in an industry-standard language such as Java, Python, or TypeScript
  • Minimum of 5 years of cloud infrastructure experience

Responsibilities

  • Collaborate with service engineering teams to design, implement, and maintain scalable and resilient infrastructure solutions optimizing for performance, resilience, and cost
  • Ensure infrastructure aligns with business requirements and industry standards
  • Leverage Terraform to automate infrastructure provisioning and configurations
  • Implement SRE principles to improve system reliability and reduce downtime
  • Improve developer workflows by creating self-service tools, optimizing CI/CD pipelines, and enhancing deployment processes to remove friction
  • Develop and maintain robust monitoring and alerting systems to proactively identify and resolve issues
  • Lead incident responses, manage on-call rotations, and facilitate post-incident reviews to drive continuous improvement and resilience
  • Automate everything: drive adoption of Infrastructure as Code (IaC) and build automated pipelines for testing, monitoring, and deployments
  • Use your excellent written and verbal communication skills to inform teams of upcoming changes and how those changes affect them

Benefits

  • Competitive pay
  • 100% company-paid medical, dental, and vision
  • 401(k) + company equity
  • Unlimited paid time off with a PTO minimum + 13 company paid holidays
  • Parental leave
  • Flex Cares Program: Non-profit company match + pet adoption coverage
  • Free Flex subscription
This job is filled or no longer available