Software Engineer

Tekmetric

πŸ“Remote - Worldwide

Summary

Join Tekmetric, a rapidly growing cloud-based auto-repair shop management system company, as a Software Engineer specializing in web scraping, data processing, and search technologies. You will build a large-scale data ingestion and classification system, extracting data from diverse sources, cleaning and normalizing it, and building search capabilities using ElasticSearch/OpenSearch. This role requires proficiency in Python, Scrapy, Airflow, Kubernetes, AWS, and Spark. You will collaborate with ML/NLP teams and work in a dynamic, startup-like environment. Tekmetric offers a comprehensive benefits package including healthcare, generous PTO, flexible work arrangements, and various financial benefits.

Requirements

  • 3+ years of Python experience building crawling/scraping solutions at scale
  • Experience working with REST APIs and PDF processing (OCR, Tesseract, PyMuPDF, etc.)
  • Proficiency in data processing & search technologies (ElasticSearch/OpenSearch, NoSQL/SQL databases)
  • Hands-on experience with Airflow and Spark (EMR) or similar distributed systems
  • Strong problem-solving skills in handling anti-scraping mechanisms and data scaling challenges
  • Hands-on experience with AWS or GCP

Responsibilities

  • Design and build large-scale, distributed crawling bots (possibly AI agents) and supporting infrastructure that operate in an adversarial environment with low operational overhead
  • Develop and maintain data pipelines to extract data from large volumes of web pages, documents, PDFs (OCR), and APIs
  • Help unify heterogeneous documents into a coherent data schema across varied source formats
  • Preprocess and normalize raw data for downstream classification, ML/NLP, and search indexing
  • Build APIs to expose structured, classified data via ElasticSearch/OpenSearch
  • Collaborate with ML/NLP teams to integrate classification models into the pipeline
  • Automate workflows using Apache Airflow and deploy solutions in Kubernetes on AWS
  • Optimize and scale data pipelines using Spark (EMR) for processing large datasets

Preferred Qualifications

  • Familiarity with NLP and Machine Learning (a plus but not required)
  • Experience with LLMs, NLP models, or ML frameworks (e.g., Hugging Face, spaCy, TensorFlow, PyTorch)
  • Prior experience in automated document classification
  • Experience working in high-scale, production environments with petabytes of data
  • Hands-on experience with Kubernetes

Benefits

  • Flexible and remote work opportunities
  • Generous PTO
  • Exceptional leave programs for all of life's moments: maternity, paternity, and parental bonding, as well as medical leave to care for yourself or loved ones
  • Excellent Medical, Dental, Vision and Prescription Drug Coverage
  • 401(k) Retirement Savings Plan with a 6% Match
  • Employer-covered STD, LTD, Life, and AD&D Insurance Programs
  • Up to $60 monthly for wellness expenses and activities
  • Education Assistance: includes undergraduate/graduate courses and continuing education
