Data Engineer III/IV

Rackspace Technology
📍Remote - India
Summary
Join our team as a Data Engineer III/IV and lead operational coverage, resolving pipeline issues and proactively monitoring sensitive batches. You will design, build, test, and deploy fixes to non-production and production environments. Responsibilities include cost/performance optimization, security audits, troubleshooting incidents, and knowledge management. You will also manage communication with customers, handle change and request management, and conduct root cause analysis. This role requires extensive problem-solving skills and the ability to research and resolve issues across various platforms.
Requirements
- Good hands-on experience with Databricks, dbt, SSRS, SSIS, AWS DWS, AWS AppFlow, and Power BI/Tableau
- Ability to read and write SQL and stored procedures
- Good hands-on experience configuring, managing, and troubleshooting these platforms, along with general analytical and problem-solving skills
- Excellent written and verbal communication skills
- Ability to communicate technical information and ideas clearly to others
- Ability to work successfully in small groups and promote inclusiveness
- Requires a Bachelor’s degree in computer science or a related field plus 10+ years of hands-on experience configuring and managing AWS/Tableau and Databricks solutions
- Extensive problem-solving skills and the ability to research an issue, determine the root cause, and implement the resolution; this may require consulting sources such as Databricks, AWS, and Tableau documentation
- Must have the ability to prioritize issues and multi-task
Responsibilities
- Lead Level 4 operational coverage: resolving pipeline issues, proactively monitoring sensitive batches, performing RCA and retrospectives on issues, and documenting defects
- Design, build, test, and deploy fixes to a non-production environment for Customer testing
- Work with the Customer to deploy fixes to production upon receiving Customer acceptance of the fix
- Cost / Performance optimization and Audit / Security including any associated infrastructure changes
- Troubleshoot incidents and problems, including collecting logs, cross-checking against known issues, and investigating common root causes (for example, failed batches and infrastructure-related items such as connectivity to source, network issues, etc.)
- Knowledge Management: Create/update runbooks as needed
- Governance: Monitor all configuration changes to batches and infrastructure (cloud platform), map them to proper documentation, and align resources
- Communication: Lead and act as the off-site point of contact (POC) for the customer, handling communication and escalation, isolating issues, and coordinating with off-site resources while level-setting expectations across stakeholders
- Change Management: Align resources for on-demand changes and coordinate with stakeholders as required
- Request Management: Handle user requests; if a request is not runbook-based, create a new KB article or update the runbook accordingly
- Incident Management and Problem Management: Perform root cause analysis and propose preventive measures and recommendations, such as enhanced monitoring or systematic changes, as needed
Preferred Qualifications
Experience with Databricks and Tableau environments is desired
Benefits
Remote work