Databricks is hiring an
Incident Manager

Databricks

💵 $81k-$181k
📍 Remote - United States

Summary

Join the Databricks Global Support Engineering team as an Incident Manager, using your technical experience and resourcefulness to lead urgent customer situations to resolution.

Requirements

  • Minimum 3 years of experience in customer support, support escalation, and incident management
  • Minimum 3 years of experience in designing, testing, or maintaining Python/Java/Scala-based applications in project delivery and consulting environments
  • Prior incident management or escalation management experience
  • Hands-on experience with two or more of the following technologies at production scale: Big Data, Hadoop, Spark, Machine Learning, Artificial Intelligence, Streaming, Kafka, Data Science, ElasticSearch
  • Hands-on experience in performance tuning/troubleshooting of Spark-based applications at production scale
  • Working knowledge of Data Lakes, preferably with slowly changing dimension (SCD) use cases at production scale
  • Hands-on experience with SQL-based databases and Data Warehousing/ETL technologies such as Informatica, DataStage, Oracle, Teradata, SQL Server, and MySQL
  • Linux/Unix administration skills and hands-on experience with AWS, Azure, or GCP
  • Proven experience with JVM and memory management techniques such as garbage collection and heap/thread dump analysis
  • Strong analytical, troubleshooting, and technical problem-solving skills, demonstrating technical excellence
  • Work with integrity, accountability, attention to detail, and expertise in execution and planning
  • Excellent contextual interpretation, writing skills, and ability to communicate effectively to technical and business audiences
  • Demonstrates the ability to make timely decisions from both business and technical perspectives
  • Thrives in high-pressure, fast-paced environments, showing resilience and maintaining a constructive attitude
  • Ability to work holidays and weekends as part of an on-call rotation
  • Bachelor's degree in Computer Science or a related field

Responsibilities

  • Drive critical customer escalations or widespread outages to conclusion and resolution
  • Demonstrate cross-functional leadership while establishing ownership of escalations and outages
  • Compile and deliver frequent high-quality communication to internal and external stakeholders including executive staff
  • Commence and lead war rooms while establishing other temporary communication channels as warranted for the duration of an outage
  • Multi-task across several incidents and/or projects at once
  • Be the leader who derives product and process improvements from every incident and submits necessary feedback for improvements
  • Participate in on-call rotations
