Summary

Join Tether's AI model team and drive innovation across the entire AI lifecycle. Develop and implement rigorous evaluation frameworks and benchmark methodologies for pre-training, post-training, and inference. Design metrics and assessment strategies to ensure models are highly responsive, efficient, and reliable. Work on various systems, from resource-efficient models to complex, multi-modal architectures. Collaborate with cross-functional teams to share evaluation findings and integrate stakeholder feedback. Engineer robust evaluation pipelines and performance dashboards. Set industry-leading standards for AI model quality and reliability, delivering scalable performance and tangible value.

Requirements

A degree in Computer Science or a related field
Ideally a PhD in NLP, Machine Learning, or a related field, complemented by a solid track record in AI R&D (with strong publications at A* conferences)
Demonstrated experience in designing and evaluating AI models at multiple stages, including pre-training, post-training, and inference
You should be proficient in developing evaluation frameworks that rigorously assess accuracy, convergence, loss improvements, and overall model robustness, ensuring each stage of the AI lifecycle delivers measurable real-world value
Strong programming skills and hands-on expertise in evaluation benchmarks and frameworks are essential
Familiarity with building, automating, and scaling complex evaluation and benchmarking pipelines, and experience with performance metrics such as latency, throughput, and memory footprint
Proven ability to conduct iterative experiments and empirical research that drive the continuous refinement of evaluation methodologies
You should be adept at staying abreast of emerging trends and techniques, leveraging insights to enhance benchmarking practices and model reliability
Demonstrated experience collaborating with diverse teams, such as product, engineering, and operations, to align evaluation strategies with organizational goals
You must be skilled at translating technical findings into actionable insights for stakeholders and driving process improvements across the model development lifecycle

Responsibilities

Develop, test, and deploy integrated frameworks that rigorously assess models during pre-training, post-training, and inference
Define and track key performance indicators such as accuracy, loss metrics, latency, throughput, and memory footprint across diverse deployment scenarios
Curate high-quality evaluation datasets and design standardized benchmarks to reliably measure model quality and robustness
Ensure that these benchmarks accurately reflect improvements achieved through both pre-training and post-training processes, and drive consistency in evaluation practices
Engage with product management, engineering, data science, and operations teams to align evaluation metrics with business objectives
Present evaluation findings, actionable insights, and recommendations through comprehensive dashboards and reports that support decision-making across functions
Systematically analyze evaluation data to identify and resolve bottlenecks across the model lifecycle
Propose and implement optimizations that enhance model performance, scalability, and resource utilization on resource-constrained platforms, ensuring efficient pre-training, post-training, and inference
Conduct iterative experiments and empirical research to refine evaluation methodologies, staying abreast of emerging techniques and trends
Leverage insights to continuously enhance benchmarking practices and improve overall model reliability, ensuring that all stages of the model lifecycle deliver measurable value in real-world applications

AI Research Engineer

Tether.to

Remote

Software Development

Mid-level
