Staff Data Scientist

KnowBe4

πŸ“Remote - India

Summary

Join KnowBe4 as a Data Scientist and lead with innovation, shaping a distinctive career supported by our global reach and cutting-edge technology. You will craft impactful, data-driven solutions, collaborating with talented teams in a dynamic environment. As a Data Scientist, you'll design data modeling processes, create algorithms and predictive models, and work alongside engineers in an agile development environment. You will also communicate complex concepts to non-technical audiences and identify opportunities to optimize business impact. This role requires expertise in machine learning, deep learning, and generative AI, along with strong data manipulation and visualization skills. KnowBe4 offers a fun and engaging work environment with various benefits.

Requirements

  • BS or equivalent plus 10 years of experience
  • MS or equivalent plus 5 years of experience
  • Ph.D. or equivalent plus 4 years of experience
  • Expert-level working experience with programming languages such as Python, R, and SQL
  • Solid understanding of statistics, probability, and machine learning
  • 10+ years of relevant experience in designing ML/DL/GenAI systems
  • Expertise in rolling out Generative AI SaaS products and features
  • Expertise in the AWS ecosystem
  • Proficiency in machine learning algorithms and techniques, including supervised and unsupervised learning, classification, regression, clustering, and dimensionality reduction
  • Strong understanding and practical experience with deep learning frameworks such as TensorFlow or PyTorch. Ability to design, train, and optimize deep neural networks for various tasks like image recognition, natural language processing, and recommendation systems
  • Knowledge and experience in generative models like Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs). Ability to create and use generative models for tasks such as image generation, text generation, and data synthesis
  • Exposure to LLMs, Transformers, and related technologies such as LangChain, LlamaIndex, Pinecone, SageMaker JumpStart, ChatGPT, AWS Bedrock, and Vertex AI
  • Strong data manipulation skills, including data cleaning, preprocessing, and feature engineering. Experience with data manipulation libraries like Pandas
  • Ability to create compelling data visualizations using tools like Matplotlib or Seaborn to communicate insights effectively
  • Proficiency in NLP techniques for text analysis, sentiment analysis, entity recognition, and topic modeling
  • Strong understanding of data classification, sensitivity, PII, and personal data modeling techniques
  • Experience in model evaluation and validation techniques, including cross-validation, hyperparameter tuning, and performance metrics selection
  • Proficiency in version control systems like Git for tracking and managing code changes
  • Strong communication skills to convey complex findings and insights to both technical and non-technical stakeholders. Ability to work collaboratively in cross-functional teams
  • Excellent problem-solving skills to identify business challenges and devise data-driven solutions

Responsibilities

  • Research, design, and implement machine learning and deep learning algorithms to solve complex problems
  • Communicate complex concepts and statistical models to non-technical audiences through data visualizations
  • Perform statistical analysis and use the results to improve models
  • Identify opportunities and formulate data science / machine learning projects to optimize business impact
  • Serve as a subject matter expert in data science and analytics research, and adopt new tooling and methodologies at KnowBe4
  • Manage the release, maintenance, and enhancement of machine learning solutions in a production environment via multiple deployment options such as APIs, embedded software, or stand-alone applications
  • Advise various teams on machine learning practices and ensure the highest quality and compliance standards for ML deployments
  • Design and develop cyber security awareness products and features using Generative AI, machine learning, deep learning, and other data ecosystem technologies
  • Collaborate with cross-functional teams to identify data-related requirements, design appropriate NLP experiments, and conduct in-depth analyses to derive actionable insights from unstructured data sources
  • Stay up to date with the latest advancements in machine learning, deep learning, and generative AI through self-learning and professional development

Preferred Qualifications

  • Experience in designing data pipelines and products for real-world applications
  • Experience with modern and emerging scalable computing platforms and languages (e.g., Spark)
  • Familiarity with big data technologies like Hadoop, Spark, and distributed computing frameworks for handling large datasets

Benefits

  • Company-wide bonuses based on monthly sales targets
  • Employee referral bonuses
  • Adoption assistance
  • Tuition reimbursement
  • Certification reimbursement
  • Certification completion bonuses
  • A relaxed dress code
