Data Scientist

XA Group
Summary
Join XA Group, a global technology company, as a Data Scientist - Gen AI to contribute to the development of next-gen GenAI-powered copilots and real-time systems. You will be responsible for building advanced GenAI applications using LLMs and agent-based architectures, designing and scaling agent workflows, integrating Hugging Face models, and working with document processing pipelines. This role requires experience with Python, NLP, machine learning, and various AI tools and technologies. You will collaborate with cross-functional teams to translate ideas into production-ready applications and stay updated on cutting-edge research. The position offers the opportunity to work remotely and build innovative AI solutions.
Requirements
- 2–6 years of experience in Python, NLP, machine learning, and transformers
- Strong hands-on experience with LangChain, LangGraph, agent orchestration, multi-agent system design, and retrieval-augmented generation
- Proven experience working with agents and multi-agent collaboration patterns in real-time applications
- Experience with OCR, NER, document extraction, and automated document workflows
- Proficiency with Hugging Face Transformers and model testing/integration
- Hands-on experience deploying scalable applications using Docker, Azure, and CI/CD pipelines
- Experience working with FastAPI, asyncio, and websockets for building real-time, responsive interfaces
- Strong problem-solving mindset with excellent debugging and optimization skills
- Comfortable working with both structured and unstructured databases
- Knowledge of LLM fine-tuning, semantic caching, and memory-enhanced agents
- Excellent communication and planning skills; comfortable working across cross-functional teams
Responsibilities
- Build advanced GenAI applications using LLMs, Advanced RAG, and especially agent-based and multi-agent architectures (e.g., LangGraph, LangChain)
- Design, orchestrate, and scale agent workflows that coordinate tasks across copilots, document processing, and real-time systems
- Plug, play, test, and integrate Hugging Face models (NLP, OCR, NER, etc.) into modular, extensible agent pipelines
- Work with OCR, NER, and document extraction & processing pipelines for intelligent document understanding
- Build intelligent copilots, chatbots, and backend logic using Python, FastAPI, async programming, websockets, and parallel processing
- Deploy and manage agent-based applications using Docker, Azure, and modern CI/CD pipelines
- Implement and manage vector databases, semantic search, and retrieval workflows for high-quality contextual responses
- Conduct prompt engineering, LLM/RAG/Agent evaluation, and continuous system improvement
- Collaborate with cross-functional teams (product, engineering, QA) to translate ideas into production-ready, agent-enabled tools
- Build POCs, internal tools, and full-fledged production applications based on multi-agent designs
- Stay updated with cutting-edge research papers and trends in LLMs, agentic workflows, and GenAI
Preferred Qualifications
- Familiarity with rapid prototyping tools like Streamlit or frontend stacks like React