Experience with data cleaning, handling, and labeling in Python.
Experience in labeling datasets both in Python and Excel.
Strong multitasking skills across Python and data-related work.
Day-to-Day Responsibilities:
Acquiring new data and labeling or tagging entries
Designing scripts to process, clean, tag, and label data
Some manual data labeling for select datasets via Excel
Performing text-based data analysis on documents
Testing and evaluating new techniques and scripts
The Digital Integration & Predictive Technologies (DIPT) team (within Process Development) is searching for an Associate Data Analyst (CW) to support its big data programs.
The DIPT team leads the development and deployment of statistical and mechanistic models, data integration tools, and advanced technologies throughout the Process Development and Manufacturing organizations.
We are seeking highly motivated early-career candidates with a passion for handling, processing, labeling and standardizing big data to join our team in Cambridge, MA (preferred) or Thousand Oaks, CA.
The successful candidate will be given opportunities to apply their expertise to enhance the way data is processed and managed within the organization.
The ideal candidate enjoys tackling challenges and excels at organizing information from numerous sources to provide well-constructed deliverables.
If this sounds like you, please read on…
Responsibilities will include, but are not limited to:
Working with large quantities of data in general workflows such as data processing, cleaning, and labeling.
Working with APIs to acquire and organize new data efficiently.
Using standard Python libraries to process, clean, and standardize large datasets.
General understanding of Natural Language Processing workflows and NLP datasets.
Creating striking visualizations to aid in data interpretation and usage.
Exploring and evaluating new techniques to improve the team's data processing capabilities.
Proficiency in Python and SQL.
Experience processing, filtering, cleaning, and standardizing large datasets.
Ability to restructure, pivot, and reshape data based on current needs.
Experience with Python data libraries such as NumPy and pandas.
Working knowledge of Microsoft Excel
Experience with data visualization tools such as Matplotlib and Spotfire.
Passion for exploring data and organizing large datasets.
Intellectual curiosity and the ability to learn new concepts, scripts, and methods.
Ability to manage multiple, competing priorities simultaneously.
Ability to deliver work in an organized and on-time fashion.
Ability to work in highly collaborative, cross-functional environments.
Experience in the biotechnology industry is NOT required
Bachelor's degree in applied mathematics, computer science, engineering, data science, computational chemistry, or another quantitative discipline.
Recent graduates and early-career candidates are preferred.
Travel: domestic and international.
"This posting is for a Contingent Worker, not an FTE."