Summary
Join Fearless Digital as a Site Reliability Engineer III and lead the design and implementation of reliable infrastructure solutions. You will collaborate with a talented team, mentor others, and drive all phases of the infrastructure lifecycle. This 100% remote position requires 8+ years of Software Engineering/DevOps experience, including at least 4 years as a DevOps Engineer/SRE, and strong AWS experience. You'll need expertise in Kubernetes, Terraform, and monitoring tools. Fearless offers a competitive salary, comprehensive benefits, and a flexible work environment.
Requirements
- Ability to obtain security clearance required by the project: Public Trust
- A minimum of 8 years of total Software Engineering/DevOps experience, with at least the 4 most recent years as a DevOps Engineer/SRE. Programming language experience with Java, Python, and Bash
- Prior experience leading a small team of SRE/DevOps Engineers (served as a Tech Lead or Project Lead)
- Strong demonstrated working experience managing Kubernetes Clusters using EKS within an AWS environment
- Strong experience using Terraform and CloudWatch to manage and monitor an AWS infrastructure
- Strong experience with Monitoring and Observability tools such as New Relic, Grafana, and Splunk
- Strong source code management experience with Git/GitHub
- Implementing best practices for maximizing uptime and debugging issues quickly
- Developing resiliency in infrastructure to handle outages
- Cloud Security best practices to prevent unauthorized access to resources
- Understanding of all layers of software engineering and system architecture
- Proficiency in securing systems on the application, network, and infrastructure layers
- Shall have experience in designing and implementing end-to-end continuous delivery pipelines
- Shall have deep AWS cloud experience in a production environment (e.g., network, security, deployment, automation, serverless technologies)
- Shall have experience and understanding in SRE principles for highly scalable and reliable systems
- Shall have strong experience with Configuration Management and Infrastructure as Code
- Experience with core infrastructure capabilities: operating systems, networking, identity, and access
- Understanding of CI/CD and related concepts
- Expert ability to execute advanced git actions like rebasing and squashing
- Ability to assist other engineers with source code management in git
- Basic understanding of software development and web application development concepts
- Ability to discuss technical tasks and team process topics with team members
- Ability to operate and manage work, strategically reason, and build relationships and influence others
Responsibilities
- Synthesize business requirements and objectives and drive the development of infrastructure solutions
- Collaborate with talented designers, product managers, and fellow engineers to plan and build new features
- Coach and mentor others to develop their professional skills
- Drive all phases of the infrastructure and operations lifecycle from task creation to production deployment of new system components
- Design and implement effective, secure infrastructure solutions that meet the business and technology requirements of the project
- Articulate business needs and translate them into technology solutions
- Develop pipelines and automate workflows and processes through code and tooling to reduce technical debt and improve the efficiency of the team
- Troubleshoot technical issues in infrastructure like software-defined networks, databases, and compute resources
- Develop and implement plans for the continuous improvement and vulnerability management of the system
- Identify opportunities and articulate suggestions to improve the technology strategies beyond the scope of a team or project
- Make decisions that are consistent with the organization's business strategy
- Demonstrate deep knowledge of products/workflows within the businesses you support
- Review other developers' code and provide specific, constructive feedback
Preferred Qualifications
- Current or prior local, state, or federal government project experience
- Experience with tools such as SQL, Matomo, and Kafka
- BS/MS/MEng in Computer Science, Information Systems, Information Technology, Mathematics, Electrical Engineering, Computer Engineering, or similar technology-related degree
- Experience working with government or large industry clients
- Proficient in at least one programming language and web applications framework such as Node.js/Express, Python/Django, Go, Java 8+/Spring, Ruby/Rails, etc
- Holds a current AWS Certified Developer Associate, Solutions Architect Associate, or Solutions Architect Professional certification, or a similar certification in another cloud platform
- Holds a current Certified Scrum Master certification
- Holds a current CompTIA Security+ certification
Benefits
- We cover 100% of your premium for our medical HSA plan + the deductible portion of HSA contributions, 80% of your premium for our HMO or PPO plans, and offer competitive dependent coverage. We cover 100% of dental and vision premiums for you and your dependents and offer medical and dependent care FSA options. We also offer life insurance, short- and long-term disability coverage, and legal planning and support insurance
- Tech, education / training, and wellness allowances
- Safe Harbor 401(k) plan with employer contributions (current match = 4%) and immediate vesting
- Referral bonus: Bring your friends! If someone you refer is hired, you'll get a bonus of $6–12k!
- Total Pet Plan
- Employee Assistance Program
- Up to 12 weeks of FMLA paid at 100%
- PTO is provided to team members as a lump sum allowance, not an accrual. PTO is prorated on a quarterly basis based on your start date (see table below), with tenure-based increases. Leave also includes 8.75 days of sick leave, 11 federal holidays, your birthday (8 hours), up to 15 days for jury duty, and up to 3 days (24 hours) of bereavement leave per eligible instance
- Life-friendly schedules
- Family-friendly workplace