Senior DevOps Engineer, Monitoring & Control

NBCUniversal
Summary
Join NBCUniversal as a creative engineer and contribute to the engineering, operations, deployment, and maintenance of core Distribution Engineering Monitoring and Control systems. You will utilize scripting and automation to develop and enhance monitoring/alerting tools, interact with automated monitoring infrastructure, create system dashboards, and query data stores. Responsibilities include developing proof-of-concept deployments, utilizing modern frameworks, providing support to operations teams, and participating in on-call rotation. This fully remote position requires a Bachelor’s degree in Computer Science or a related field and 5+ years of relevant experience. The ideal candidate will possess strong skills in scripting languages, frontend technologies, cloud platforms, and containerization.
Requirements
- Bachelor’s degree in Computer Science or related degree
- 5+ years of experience with monitoring and alerting tools, e.g. Grafana, Splunk, ELK Stack, DataMiner
- Ability to develop end-to-end monitoring dashboards, alerts and reports for enterprise-level environments
- Ability to collect data from various systems using COTS APIs
- Experience with scripting languages and tools, e.g. C#, Python, Bash
- Experience with modern frontend technologies like Vite, React, Node.js, TypeScript
- Experience with configuration management technology, e.g. Ansible, Salt, and/or Chef
- Experience with public cloud platforms such as AWS, GCP or Azure
- Experience with networking and cloud-based network environments
- Experience with containerization (Docker and Kubernetes)
- Experience with CI/CD build (GitHub Actions), deployment practices, and Infrastructure as Code (Terraform)
- Experience administering Linux and Windows environments
- Ability to use Agile processes for project management, development and tracking
Responsibilities
- Utilize scripting and automation to develop, customize and enhance monitoring/alerting tools for “on-air” environments
- Interact with automated monitoring infrastructure to ensure healthy environments
- Create system dashboards that improve system availability and reliability
- Query data stores to quantify the scope of reported issues
- Create new metrics and identify monitoring deliverables to improve site reliability
- Administer monitoring and control systems within the “on-air” environments
- Develop proof-of-concept deployments for evaluation of products and architectures
- Utilize modern frameworks and scripting languages to develop products and services for NBCU's IP video distribution environment
- Provide support for Operations/On-Air Engineering teams
- Participate in on-call rotation for 24/7 support coverage
Preferred Qualifications
- Experience with a variety of software and hardware operating environments
- Experience in troubleshooting complex technical issues
- Experience with IP video and broadcast technologies is a major plus
- Experience with broadcast monitoring products like TAG and MediaProxy
- Experience with SMPTE standards and implementation
- Experience with PTP implementation
- Good communicator, able to clearly articulate complex issues and technologies
- Great design and problem-solving skills
- Willing to take ownership of problems and see them through to resolution
- Comfortable working in a fast-paced, agile environment where requirements change quickly and the team must constantly adapt to moving targets
- Experience with DevSecOps principles
- Ability to create user interface designs based on client workflows
- Ability to intake project requirements from Operational partners and work with vendors to meet their needs
Benefits
- Medical, dental and vision insurance
- 401(k)
- Paid leave
- Tuition reimbursement
- A variety of other discounts and perks
