Senior Infrastructure Engineer, SRE

Flex

📍 Remote

Job highlights

Summary

Join Flex, a NYC-based FinTech company revolutionizing rent payments, as a Senior Infrastructure Engineer. You will be a key member of a small team responsible for building and maintaining a scalable and reliable infrastructure on AWS and GCP. This remote role requires at least 5 years of cloud infrastructure experience and proficiency in technologies like Terraform, Kubernetes, and CI/CD. You will collaborate with service engineering teams, automate processes, and ensure optimal system performance. Flex offers a competitive compensation package, including comprehensive health insurance, 401k, unlimited PTO, and parental leave.

Requirements

  • Proven experience in building, scaling and monitoring cloud infrastructure on AWS, especially EKS, S3, RDS, API Gateway, Load Balancers, VPC, Lambdas, DocumentDB and DynamoDB
  • Proven experience using Terraform to update and maintain cloud infrastructure
  • Proven experience with containerized applications, Kubernetes, and microservice deployments
  • Strong knowledge of GitHub Actions and CI/CD best practices
  • Experience with developer productivity tools: designing CI/CD workflows, building internal tools, and creating self-service solutions to streamline software development
  • Knowledge of monitoring and observability tools and frameworks, with working knowledge of Datadog being a plus
  • Familiarity with networking concepts (DNS, load balancing, firewalls, VPNs)
  • Strong collaboration skills with the ability to work effectively across teams and communicate technical ideas clearly
  • Experience writing and reading code in at least one industry-standard language, such as Java, Python, or TypeScript
  • Minimum of 5 years of cloud infrastructure experience

Responsibilities

  • Collaborate with service engineering teams to design, implement, and maintain scalable and resilient infrastructure solutions optimizing for performance, resilience, and cost
  • Ensure infrastructure aligns with business requirements and industry standards
  • Leverage Terraform to automate infrastructure provisioning and configurations
  • Implement SRE principles to improve system reliability and reduce downtime
  • Improve developer workflows by creating self-service tools, optimizing CI/CD pipelines, and enhancing deployment processes to remove friction
  • Develop and maintain robust monitoring and alerting systems to proactively identify and resolve issues
  • Lead incident responses, manage on-call rotations, and facilitate post-incident reviews to drive continuous improvement and resilience
  • Automate everything: drive adoption of Infrastructure as Code (IaC) and build automated pipelines for testing, monitoring, and deployments
  • Leverage your excellent written and verbal communication skills to create communications on upcoming changes and how they affect teams

Benefits

  • Competitive pay
  • 100% company-paid medical, dental, and vision
  • 401(k) + company equity
  • Unlimited paid time off with a PTO minimum + 13 company paid holidays
  • Parental leave
  • Flex Cares Program: Non-profit company match + pet adoption coverage
  • Free Flex subscription
