Associate Production Support Engineer

Updater

💵 $70k-$95k
📍 Remote - United States

Summary

Join Updater's Production Support team and help revolutionize moving experiences through operational excellence. As an Associate Production Support Engineer, you will safeguard revenue-generating systems while growing into a reliability engineering partner who prevents incidents rather than only responding to them. You will work with a dedicated team as the company shifts from reactive incident response to a proactive reliability engineering practice. Over the next year, your primary mission is to strengthen the team's operational capabilities and master the foundations of a reliability engineering role. The role balances operational excellence with growth responsibilities, focusing on incident management, communication, and SRE development.

Requirements

  • 2+ years of experience troubleshooting production systems, networks, or applications
  • Understanding of fundamental programming concepts and basic scripting abilities
  • Experience with SQL and relational databases for data analysis and troubleshooting
  • Familiarity with API testing tools (Postman) and web service troubleshooting
  • Basic understanding of Git concepts and collaborative development workflows
  • AWS Certified Cloud Practitioner level knowledge or equivalent cloud platform experience
  • Proven ability to work effectively under pressure during system outages or critical incidents
  • Experience with ticketing systems and structured escalation procedures
  • Strong problem-solving skills and the ability to analyze complex technical issues
  • Excellent time management and ability to prioritize multiple concurrent issues
  • Professional communication style with both technical and non-technical stakeholders
  • Experience providing status updates and technical explanations during incident response
  • Ability to write clear documentation and incident reports
  • Comfort participating in bridge calls and leading technical discussions
  • Bachelor's degree in Computer Science, Information Technology, Engineering, or a related technical field, or equivalent work experience (an additional 2+ years of hands-on technical experience in lieu of a degree)
  • 2+ years of experience in technical support, system administration, DevOps, or related operational roles
  • Demonstrated ability to learn new technologies quickly and adapt to changing technical environments

Responsibilities

  • Monitor critical production systems that directly generate company revenue using DataDog dashboards and synthetic tests
  • Respond to incidents with speed and precision, following established escalation procedures to minimize business impact
  • Manage escalations from internal and external partner call center agents
  • Partner with Updater engineering, support teams, and our providers to resolve escalated issues
  • Participate in 24x7 on-call rotation, ensuring someone is always watching our systems
  • Triage and resolve production issues through JIRA workflows, maintaining clear communication with stakeholders
  • Lead incident response for production outages, coordinating across teams to restore service quickly
  • Document incidents thoroughly and participate in blameless postmortem processes that focus on system improvement
  • Communicate effectively with internal teams, external partners, and leadership during high-stress situations
  • Build relationships across the organization, assuming positive intentions and celebrating team successes
  • Learn to implement Service Level Objectives (SLOs) and Service Level Indicators (SLIs) for critical services
  • Develop basic automation scripts to reduce manual operational tasks (Python, Bash, AWS CLI)
  • Gain familiarity with Infrastructure as Code concepts using Terraform
  • Collaborate with the DevOps team to understand CI/CD pipelines and deployment automation
  • Contribute to developer self-service initiatives that reduce operational dependencies
  • Support the Platform Engineering mission by identifying opportunities to simplify and standardize operational processes
  • Help develop documentation and "golden path" procedures that enable teams to operate more independently
  • Participate in cross-team collaboration to advance platform maturity and developer experience
  • Learn observability best practices including custom metrics, alerting, and Real User Monitoring (RUM)

Preferred Qualifications

  • Basic familiarity with Infrastructure as Code concepts (Terraform)
  • Experience with monitoring and observability tools (DataDog, Prometheus, Grafana)
  • Understanding of containerization and Kubernetes fundamentals
  • Knowledge of CI/CD pipeline concepts and deployment automation
  • Experience with configuration management or automation tools
  • Previous experience in customer-facing technical support roles
  • Background in system administration or DevOps practices
  • Experience with incident management frameworks (ITIL, SRE practices)
  • Understanding of SLA/SLO concepts and reliability engineering principles

Benefits

  • Medical, Dental, and Vision Insurance
  • Unlimited PTO
  • 13 paid company holidays annually
  • Updater Stock Options
  • 401(k)
  • Commuter Benefits
  • Personal Wellbeing Subsidy
  • New Hire Subsidy
  • One Medical Membership
  • Short Term Disability Insurance
  • Supplemental Life Insurance
  • 12 weeks of Primary Caregiver Parental Leave
