Senior Staff Data Engineer, Data Security

Netskope
Summary
Join Netskope's Data Security team and contribute to building the next generation of data loss prevention solutions. You will design and deliver highly effective, scalable, and high-performance software solutions for content inspection and data protection. Responsibilities include developing software tools for processing data from various sources, collecting and analyzing data, developing linguistic data resources, creating regular expressions, researching orthographic variations, and identifying data resources for information normalization. A Bachelor's or Master's degree in Data or Computer Science with additional experience in Linguistics or a related field is required, along with fluency in English and proficiency in Python. Excellent attention to detail, strong analytical skills, and the ability to work independently are essential. The role involves working with confidential information and may include exposure to offensive content.
Requirements
- A Bachelor's or Master's Degree in Data or Computer Science with additional extensive formal study or work experience in Linguistics, Symbolic Systems, Taxonomy, or a related discipline
- Must be a native or near-native, fluent speaker of English
- Proficiency in Python or similar programming languages, with a strong background in creating advanced regular expressions for Python's re module, ICU, or PCRE
- Must be able to perform deep online research from credible sources and perform document searches in a wide variety of human languages
- Must have an extremely good eye for detail; this is absolutely crucial
- Possess a strong ability to work independently and effectively within a small remote team
- Must have excellent proofreading skills, obsessive to the point of wanting to fix restaurant menus with Wite-Out or Sharpies (but doesn't)
- Must be able to think creatively and possess strong analytical and problem-solving skills
Responsibilities
- Developing or assisting in the development of software tools for processing relevant information from a variety of structured and unstructured data sources
- Collecting, cleaning, analyzing, and organizing data of various types
- Developing or assisting in the development of language and linguistics data such as vertical dictionaries, grammars, phrase and idiom lists, and corpuses for analysis and testing
- Developing or assisting in the development of regular expressions which can match numbers, terms and other alphanumeric expressions
- Researching variations in prescriptive and organic orthographies for dozens of languages
- Identifying and leveraging data resources from various sources of knowledge to normalize context-relevant information
Preferred Qualifications
- Experience with developing rule-based NLP applications or manual POS annotation is a plus
- An interest in social or online language, or in foreign-language idioms and colloquialisms, is a plus