Senior Data Infrastructure Engineer

Cybereason

πŸ“Remote - Japan

Summary

Join Cybereason as a Senior Data Infrastructure Engineer to design and scale the data infrastructure behind our cybersecurity analytics platform. You will build distributed systems that process billions of security events daily, drawing on expertise in big data, cloud-native engineering, and cybersecurity. Responsibilities include designing petabyte-scale data infrastructure, building high-throughput data pipelines, architecting distributed systems with cloud-native technologies, and implementing robust data governance frameworks. You will collaborate with data science and security teams and ensure the infrastructure meets strict security, availability, and compliance requirements. The role requires a Bachelor's degree, 7+ years of relevant experience, and expertise in stream processing, analytical databases, and distributed storage, along with strong cloud and Kubernetes experience.

Requirements

  • Bachelor’s degree in Computer Science, Engineering, or related field
  • 7+ years of experience building and maintaining large-scale data infrastructure
  • Proven experience operating petabyte-scale systems processing billions of records per day
  • Expert-level proficiency with stream processing: Apache Flink, Kafka, Pulsar, Redpanda, Kinesis
  • Deep experience with analytical and time-series databases: ClickHouse, Druid, InfluxDB, TimescaleDB
  • Familiarity with distributed storage: Hadoop (HDFS), Amazon S3, GCS, Azure Data Lake
  • Strong programming skills in Rust, Go, Scala, Java, or Python for high-performance systems
  • Cloud expertise: AWS (EMR, Redshift, Kinesis), GCP (Dataflow, BigQuery, Pub/Sub), or Azure equivalents
  • Solid experience with Kubernetes, Docker, and Helm; familiarity with service meshes such as Istio or Linkerd
  • Strong grasp of data lake/lakehouse architectures and modern data stack tools

Responsibilities

  • Design and develop petabyte-scale data infrastructure and real-time streaming systems capable of processing billions of events daily
  • Build and optimize high-throughput, low-latency data pipelines for security telemetry
  • Architect distributed systems using cloud-native technologies and microservices patterns
  • Design and maintain data lakes, time-series databases, and analytical stores optimized for security use cases
  • Implement robust data governance, quality, and monitoring frameworks across all data flows
  • Continuously optimize for performance, scalability, and cost-efficiency in large-scale data workloads
  • Collaborate with data science and security teams to enable advanced analytics and ML capabilities
  • Ensure data infrastructure complies with strict security, availability, and compliance requirements

Preferred Qualifications

  • Experience with Apache Iceberg, Delta Lake, or Apache Hudi
  • Familiarity with Airflow, Prefect, or Dagster for orchestration
  • Knowledge of search platforms: Elasticsearch, OpenSearch, or Solr
  • Experience with NoSQL databases: Cassandra, ScyllaDB, or DynamoDB
  • Familiarity with columnar and serialization formats: Parquet, ORC, Avro, Arrow
  • Experience with observability stacks: Prometheus, Grafana, Jaeger, OpenTelemetry
  • Familiarity with Terraform, Pulumi, or CloudFormation for IaC
  • Experience with GitOps tools for automated deployments: ArgoCD or Flux
  • Exposure to data mesh, data governance, and metadata tooling (Apache Atlas, Ranger, DataHub)
  • Background in cybersecurity, SIEM, or security analytics platforms
  • Familiarity with ML infrastructure and MLOps best practices

Benefits

  • Competitive salary and benefits
  • Remote work options
