Senior Data Scientist (NLP) at Parenthetic (Washington, DC) (allows remote)

We are building human-driven NLP technologies and are seeking an experienced, proactive data scientist to join our newly formed engineering team. The ideal candidate will have a strong background in natural language processing methods and experience with a range of text classification problems. We are particularly interested in event extraction and detection, entity recognition, slot-filling tasks, document classification, and zero- and few-shot learning. Secondary experience in forecasting, time series analysis, and/or active learning methods is also beneficial. As this is a new team, we are looking for candidates who are willing to help grow the organization by taking on a range of responsibilities across the technical spectrum and who can collaborate effectively to deliver findings to our customers.

The position may require some on-site work in Northern Virginia for team and client meetings.

Your day-to-day will include:

  • Research and develop machine learning methods for a wide range of text extraction and classification tasks, often with limited labels and/or in multiple languages

  • Design and implement tools for monitoring and forecasting trends in signals derived from text data

  • Work effectively, in an often self-directed environment, to estimate timelines, communicate progress, and identify avenues for future research and development

  • Perform analyses and generate detailed data products for internal stakeholders and external clients

  • Apply MLOps principles to our software and platform development practices

  • Deliver version-controlled, documented, and reproducible analyses and experiments that can be readily transitioned into scalable inference services

Work Experience and Skills

  • Advanced degree in computer science, math/statistics, engineering, linguistics, social science, or a related field

  • 5+ years of experience in the data science field (flexible depending on academic work)

  • Proficiency with major Python data science libraries, including the SciPy stack and scikit-learn

  • Experience with at least one deep learning and/or NLP framework (TensorFlow, PyTorch, Transformers, etc.)

  • Knowledge and understanding of pre-trained language models like BERT and GPT

  • Familiarity with other commonly used technologies, such as Linux operating systems and SQL/NoSQL databases

  • Ability to use Git, as well as other version control, workflow, and project management tools and technologies

  • Strong communication skills, with the ability to explain complex ideas clearly and concisely to a range of audiences

  • Aptitude for learning quickly and a willingness to take on a wide range of responsibilities

Preferred Qualifications:

  • Experience with other technologies and platforms in our stack, including Elasticsearch, Kibana, Docker, DVC, Kubernetes, GCP, and GitLab

  • Prior work in the marketing/communications and/or defense sectors

  • Ability to obtain and/or maintain a US government security clearance
