Summary
Join the dynamic team at Kaseya as a Staff AI Platform Engineer. Build high-performance backend systems that power intelligent AI agents, delivering low-latency, high-throughput microservices for communication, orchestration, and analytics. You'll engineer backend infrastructure with carrier-grade reliability and scale, enabling real-time agent workflows and seamless integration across the platform. Your work will be critical to ensuring the speed, resilience, and intelligence of our next-generation AI systems. This is a 100% remote position.
Requirements
- 5-7+ years of experience in backend or platform engineering
- Mastery of Python, Java, or Go
- Experience building and scaling RESTful and gRPC APIs for high-volume traffic
- Advanced database design and optimization skills across SQL (PostgreSQL, MySQL) and NoSQL (Cassandra, MongoDB, DynamoDB, vector databases)
- Deep experience with cloud-native services (AWS, Azure, or GCP)
- Expertise in container orchestration (Docker/Kubernetes)
- Experience with real-time data streaming technologies (Kafka, Pulsar, Kinesis)
- Strong understanding of distributed systems principles (consistency, availability, partition tolerance)
- Experience implementing robust authentication/authorization (OAuth 2.0, OpenID Connect, SAML)
- Proven track record implementing domain-oriented data products and self-serve data infrastructure
- Familiarity with service mesh technologies (e.g., Istio, Linkerd)
- Bachelor's degree in Computer Science, Software Engineering, or a related field
Responsibilities
- Design, build, and maintain scalable backend microservices that support AI agent communication, orchestration, and data processing
- Develop high-throughput, low-latency APIs and services with a focus on reliability, observability, and fault tolerance
- Design and implement distributed, domain-oriented data architectures and a domain-aligned data mesh framework
- Implement real-time data ingestion pipelines, streaming systems, and metadata cataloging solutions to support intelligent agent workflows
- Ensure secure, scalable, and resilient backend infrastructure using modern cloud-native technologies
- Collaborate with AI, frontend, and infrastructure teams to deliver seamless end-to-end platform capabilities
- Optimize system performance and reliability through rigorous testing, monitoring, and chaos engineering practices
Preferred Qualifications
- Expertise in building fault-tolerant and self-healing systems
- Performance optimization for ultra-large scale, low-latency systems
- Knowledge of advanced distributed consensus algorithms
- Experience with WebAssembly (WASM) for performance-critical components
- Experience building real-time collaborative features (e.g., WebSockets, CRDTs)
- Background in chaos engineering and designing for resilience
- Experience developing for multi-cloud or hybrid cloud environments
- Familiarity with high-performance database technologies (e.g., ScyllaDB, TimescaleDB)
- Master's degree or certifications in Cloud Architecture, Distributed Systems, or Backend Engineering (e.g., AWS Certified DevOps Engineer, GCP Professional Cloud Developer)
Benefits
This position is 100% remote