Senior AI Infrastructure Engineer

TetraScience

πŸ“Remote - United States

Summary

Join TetraScience, a leader in the Scientific Data and AI Cloud, as a Senior AI Infrastructure Engineer. In this role you will design, build, and scale AI and data infrastructure, with a focus on cloud-based MLOps pipelines. You will collaborate with AI engineers, data engineers, and platform teams to maintain and improve the performance, reliability, and cost-efficiency of AI models in production; contribute to the design and evolution of the AI platform; and integrate AI models and LLMs into production systems. This role requires extensive experience with AI/ML infrastructure and strong coding skills.

Requirements

  • 7+ years of professional experience in software engineering and infrastructure engineering
  • Extensive experience building and maintaining AI/ML infrastructure in production, including model deployment and lifecycle management
  • Strong knowledge of AWS and infrastructure-as-code frameworks, ideally with CDK
  • Expert-level coding skills in TypeScript and Python, building robust APIs and backend services
  • Production-level experience with Databricks MLflow, including model registration, versioning, asset bundles, and model serving workflows
  • Proven ability to design reliable, secure, and scalable infrastructure for both real-time and batch ML workloads
  • Ability to articulate ideas clearly, present findings persuasively, and build rapport with clients and team members
  • Strong collaboration skills and the ability to partner effectively with cross-functional teams

Responsibilities

  • Design, implement, and maintain cloud-native infrastructure to support AI and data workloads, with a focus on AI and data platforms such as Databricks and AWS Bedrock
  • Build and manage scalable data pipelines to ingest, transform, and serve data for ML and analytics
  • Develop infrastructure-as-code using tools such as CloudFormation and AWS CDK to ensure repeatable and secure deployments
  • Collaborate with AI engineers, data engineers, and platform teams to improve the performance, reliability, and cost-efficiency of AI models in production
  • Drive best practices for observability, including monitoring, alerting, and logging for AI platforms
  • Contribute to the design and evolution of our AI platform to support new ML frameworks, workflows, and data types
  • Stay current with new tools and technologies to recommend improvements to architecture and operations
  • Integrate AI models and large language models (LLMs) into production systems, enabling use cases built on architectures such as retrieval-augmented generation (RAG)

Preferred Qualifications

  • Familiarity with emerging LLM frameworks such as DSPy for advanced prompt orchestration and programmatic LLM pipelines
  • Understanding of LLM cost monitoring, latency optimization, and usage analytics in production environments
  • Knowledge of vector databases / embeddings stores (e.g., OpenSearch) to support semantic search and RAG

Benefits

  • 100% employer-paid benefits for all eligible employees and immediate family members
  • Unlimited paid time off (PTO)
  • 401(k)
  • Flexible working arrangements - Remote work
  • Company-paid Life Insurance and LTD/STD coverage
  • A culture of continuous improvement where you can grow your career and get coaching
