Performance Engineer at Writer

Summary

Join Writer as a Principal Performance Engineer and lead the performance optimization of our cutting-edge Generative AI technology stack. This critical role ensures the scalability, efficiency, and reliability of our Large Language Models (LLMs) and Retrieval Augmented Generation (RAG) systems. You will identify and resolve performance bottlenecks, optimize resource utilization, and ensure a seamless user experience. Collaboration with AI research, software engineering, and infrastructure teams is key to delivering world-class AI solutions. The position requires extensive experience in performance engineering, particularly with large-scale distributed systems and AI/ML technologies. A Bachelor's degree in a related field is required, with a Master's preferred.

Requirements

Hold a Bachelor's degree in Computer Science, Engineering, or a related field
Have 10+ years of experience in performance engineering, with a focus on large-scale distributed systems
Have 2+ years of experience working with AI/ML technologies
Possess proven experience in performance testing, profiling, and analysis of complex software systems
Demonstrate a deep understanding of NLP architectures, training, and inference
Have experience with vector databases and search technologies
Have experience with cloud computing platforms (e.g., AWS, Azure, GCP) and containerization technologies (e.g., Docker, Kubernetes)
Possess strong programming skills in Python
Have experience with performance analysis tools (e.g., profilers, debuggers, monitoring tools)
Possess strong analytical and problem-solving skills
Possess excellent communication and collaboration skills
Have the ability to work in a fast-paced and dynamic environment
Demonstrate a passion for AI and a desire to push the boundaries of performance engineering

Responsibilities

Define and implement performance engineering strategies for our full Generative AI stack, including services, applications, LLMs, RAG pipelines, and related infrastructure
Lead performance testing, profiling, and analysis efforts to identify and resolve performance bottlenecks
Establish and maintain performance benchmarks and SLAs for critical AI services
Provide technical leadership and mentorship to performance engineering team members
Analyze and improve LLM inference performance, including latency, throughput, and resource utilization
Develop and implement strategies for LLM capacity planning and scaling
Collaborate with AI researchers to evaluate and improve LLM model architectures and training techniques for performance
Optimize LLM inference through techniques such as quantization, distillation, and optimized kernel implementation
Design and implement performance tests for RAG pipelines, including retrieval, ranking, and generation components
Identify and optimize performance bottlenecks in RAG systems, such as database queries, vector search, and document processing
Evaluate and optimize RAG system architectures for scalability and efficiency
Tune vector databases for optimal recall and latency
Collaborate with infrastructure teams to optimize hardware and software configurations for AI workloads
Evaluate and recommend new technologies and tools for performance monitoring and analysis
Develop and maintain performance dashboards and reports to track key metrics
Optimize GPU utilization and memory management for LLM inference
Work closely with AI researchers, software engineers, and product managers to ensure performance requirements are met
Communicate performance findings and recommendations to stakeholders at all levels
Stay up to date with the latest developments in Generative AI and performance engineering

Preferred Qualifications

Hold a Master's degree in Computer Science, Engineering, or a related field

Benefits

Generous PTO, plus company holidays
Medical, dental, and vision coverage for you and your family
Paid parental leave for all parents (12 weeks)
Fertility and family planning support
Early-detection cancer testing through Galleri
Flexible spending account and dependent FSA options
Health savings account for eligible plans with company contribution
Annual work-life stipends for home office setup, cell phone, and internet
Wellness stipend for gym, massage/chiropractor, personal training, etc.
Learning and development stipend
Company-wide off-sites and team off-sites
Competitive compensation, company stock options, and 401(k)

Remote

Software Development

Mid-level
