Senior DevOps Engineer at Binance

Summary

Join Binance, a leading global blockchain ecosystem, and become a key player in maintaining our ultra-low-latency infrastructure. You will own and optimize EC2 fleets, ensuring high-throughput messaging and network integrity. Responsibilities include performance tuning, building immutable infrastructure, and implementing robust observability measures. You will also participate in reliability testing, incident response, and capacity planning. Collaboration with cross-functional teams is crucial for identifying and resolving performance bottlenecks. This role requires extensive experience in Linux low-latency tuning, AWS operations, and high-throughput messaging systems.

Requirements

Linux low-latency tuning – CPU pinning, NUMA awareness, IRQ affinity, TCP/UDP stack tweaks, hugepages
AWS operations at scale – EKS, EC2, VPC, NLB/ALB, Auto Scaling, multi-AZ fail-over, cost & quota management
Infrastructure as Code / GitOps – Terraform (modular state)
CI/CD pipelines – GitLab CI or Jenkins; blue-green / canary deploys, sub-2-minute rollbacks, latency smoke-test gates
Observability – Prometheus + Grafana, Alertmanager, high-cardinality metrics, centralized log aggregation, eBPF tracing for µs-level hotspots
High-throughput messaging – Kafka cluster operations (partition strategy, ISR tuning, < 3 ms end-to-end latency), Nginx WebSocket termination
Trading-grade networking – ENA/SR-IOV, packet-loss analysis, security-group hardening
Performance & reliability engineering – perf, FlameGraph, chaos/load testing, p95/p99 latency SLO ownership
Automation & scripting – Python or Go for tooling, incident remediation, environment bootstrap

Responsibilities

Own ultra-low-latency EC2 fleets - Design cluster placement groups with ENA / SR-IOV networking
Kernel-level performance tuning - Apply CPU pinning, NUMA alignment, IRQ affinity, hugepages, and TCP/UDP sysctl tweaks to flatten tail latency
Immutable infrastructure & automated rollouts - Build Packer AMIs and Terraform Auto Scaling Groups; run GitLab/Jenkins pipelines with blue-green or canary deploys and sub-2-minute automatic rollbacks
High-throughput messaging & gateways - Operate Kafka clusters (partition/ISR tuning, rack awareness) and Nginx WebSocket edges serving 100k+ clients with single-digit-ms fan-out
Network integrity - Run packet-loss analysis and MTU/ECN/queue-depth tuning; enforce least-privilege security-group micro-segmentation
Observability & SLO stewardship - Instrument Prometheus/Grafana dashboards for order-ack latency, queue depth, reject rate; write Alertmanager rules driven by p95/p99 error-budget burn
Reliability testing & incident response - Schedule chaos/load drills; take part in 24×7 on-call; use perf/eBPF/FlameGraphs/tcpdump for µs-level RCA; and publish post-mortems with remediation actions
Capacity planning around macro events - Pre-warm spot pools and leverage Savings Plans to balance headroom and cost
Automation & tooling - Write Go/Python scripts for bootstrap, health probes, latency regression tests, and one-click remediation
Cross-team collaboration - Pair with Java/Rust engineers and quants to profile hot-path code and eliminate bottlenecks without causing trading downtime

Preferred Qualifications

Rust/Go code familiarity, CNCF/AWS certifications, XDP/DPDK experience for kernel-bypass networking

Benefits

Competitive salary and company benefits
Work-from-home arrangement (the arrangement may vary depending on the work nature of the business team)

Remote

DevOps

Senior
