Senior Cloud AI Engineer

G-P
Summary
Join G-P's AI Team and contribute to the development of GIA, our AI-powered product designed to simplify global employment challenges. This role offers the unique opportunity to be part of a small, startup-structured team within a larger organization, building a product from the ground up. You will collaborate with data scientists and engineers, leading the design and development of cloud-native applications and services using AWS. The ideal candidate possesses expert-level knowledge of Python and extensive experience in cloud-native development and deployment. This position requires strong problem-solving skills and the ability to work independently in a fast-paced environment. Competitive compensation and benefits are offered, along with the chance to expand your skills and see your innovations become reality.
Requirements
- 5+ years in software engineering, with a strong emphasis on cloud-native development and deployment practices
- Expert-level knowledge of Python, particularly for backend service development and data/ML tooling
- Hands-on experience with AWS services including ECS, Lambda, API Gateway, IAM, and CloudWatch
- Proficiency with infrastructure-as-code using Terraform, with a clear understanding of secure and cost-aware AWS architectures
- Strong experience in Docker and container orchestration patterns
- Familiarity with MLOps principles such as model versioning, inference APIs, logging, and data pipeline integration
- Competence in supporting full-stack applications and APIs, including backend services written in FastAPI (Python) and Node.js, and frontends built in React
- Ability to work independently in a fast-moving environment, with strong collaboration and problem-solving skills
Responsibilities
- Collaborate with data scientists and engineers to containerize, deploy, and maintain machine learning models and APIs within our cloud infrastructure
- Lead the design and development of cloud-native applications and services, using AWS offerings such as ECS, Lambda, and API Gateway
- Implement practical MLOps workflows to support model packaging, inference pipelines, observability, and versioning, with a focus on performance, auditability, and scalability
- Build and manage Terraform modules to provision secure, cost-effective, and maintainable infrastructure
- Create CI/CD pipelines (e.g., GitHub Actions) to automate the deployment and monitoring of models, services, and supporting tools
- Design and support production-grade systems with robust monitoring, alerting, and metrics using CloudWatch or third-party tools like New Relic
- Work across Python, Node.js, and React-based applications, ensuring model services integrate smoothly with internal APIs and UIs
Preferred Qualifications
- Experience deploying and maintaining machine learning models in production, including containerized inference APIs and event-driven serving with ECS or Lambda
- Familiarity with model packaging and serving workflows using tools like MLflow, BentoML, or by building custom inference APIs with FastAPI or Flask
- Hands-on experience working with Large Language Models (LLMs), including prompt design, API integration (e.g., OpenAI, Claude, or Cohere), and optimizing inference latency and token usage
- Strong understanding of logging, monitoring, and observability practices for ML-powered services (e.g., using CloudWatch, New Relic, or Prometheus)
- Comfortable working across languages and stacks (Python, Node.js, React), particularly where backend ML services support user-facing applications
- Familiarity with vector stores (e.g., Pinecone, FAISS) and retrieval-augmented generation (RAG) patterns is a plus; bonus if you've worked on more advanced AI capabilities like multi-agent orchestration, memory-enabled chains, or MCP-like architectures
- AWS certifications or prior experience in high-compliance, production-scale AWS environments
Benefits
Competitive compensation and benefits