Database Administrator-SRE

Axiom Software Solutions Limited
Summary
Join our Cloud Team as a DBA SRE and play a pivotal role in delivering cloud-related database services and products to internal customers. You will provide comprehensive database support for cloud services, encompassing monitoring, incident response, and resolution for both public and private cloud platforms. Your expertise will ensure database availability, reliability, security, and performance while adhering to SLAs. You will actively participate in incident management, change management, and root cause analysis, continuously striving to improve environmental stability. As an observability expert, you will oversee centralized database platform monitoring and alerting, driving improvements in this area. Automation of repetitive tasks through runbooks and playbooks will be a key focus, along with supporting development teams in adopting DevOps practices for efficient scaling of cloud workloads.
Responsibilities
- Provide end-to-end database support for cloud services through monitoring, incident response, and incident resolution for the Public and Private Cloud Platform environment
- Provide timely incident support for hosted database workloads to the owning teams, who are customers of the database platforms
- Monitor the databases to ensure availability, reliability, security, and performance, responding to incidents and escalating them to the appropriate teams as required to ensure SLAs are met
- Follow incident, change, release, and problem management processes
- Participate in and drive root cause analysis on incidents to ensure issues are found and acted upon in a timely manner, using those processes to continually improve the stability of the environment for all customers
- Be an expert in observability, responsible for centralised database platform monitoring and alerting tooling, setting the direction and enabling greater observability for all
- Use automation to remove repetitive tasks, developing runbooks and playbooks for use across the organisation
- Be the expert in cloud operations, using that expertise to support development teams in taking on more database operations responsibilities and driving DevOps practices to enable the company to efficiently scale cloud workloads
- Database Systems Administration: Responsible for database backup and recovery (verification, checks), examining logs and alerts, maintaining access rights and roles, and database instance version control
- Base Database Monitoring: Essential manual/script-based detection and notification of critical database technical items (e.g., service/process up-down)
- Database Structural Maintenance: Responsible for database space and storage management, database object management, physical database layout and temporary space management
- Patch Management: Identification, testing, packaging, and application of the necessary security updates, support packs, and other updates associated with supported databases
- Engineering Support: Management and support of the database environment by engineers with deep technical skills and problem-solving acumen, with inclusion in and visibility into current and historical changes to the environment
- Advise on best practices for production operation of Database PaaS environments
- Implement database maintenance plans and ensure optimal production configuration is in place
- Database Systems Administration: Responsible for database backup and recovery, examining logs and alerts, maintaining access rights and roles, security group management, and provisioning new databases
- Base Database Monitoring: Service availability, Health Check, Error monitoring, setup of CloudWatch alerts
- Participate in the on-call rota
Benefits
Fully Remote