Data Science - Intern
Department: Data Analytics COE
Location: New York, NY
Company: HRA

Working with cross-departmental teams, the Data Science Intern will be a member of the Data Strategy team and will be responsible for building a Proof of Concept for automating a clinical coding taxonomy using machine learning and natural language processing. Currently, clinical codes are added to medical malpractice claims manually, a process that is subject to errors and inconsistencies. Using unstructured data from claims files (already extracted through OCR), the intern will perform text preprocessing, fit machine learning models to automate the classification of clinical variables, and help bring the models to production if model development is successful. The purpose of this project is to bring efficiencies to the TDC group and allow current resources to perform more analysis, not just data entry.

Job Functions:
• Become familiar with the CRICO clinical coding taxonomy
• Study unstructured information contained in claims files
• Conduct meetings with the claims and risk management teams to understand the best source of data for specific clinical variables
• Perform text preprocessing (tokenization, normalization, noise removal, etc.)
• Using Databricks (R, Python, AutoML), build machine learning models to automate the reading and classification of variables for the medical malpractice claims
• Identify predictors and causes of business-related problems and implement novel approaches related to forecasting and prediction
• Identify, develop, manage, and execute analyses to uncover areas of opportunity and present written business recommendations
• Collaborate with multiple teams as a leader of quantitative analysis, developing solutions that meet the highest standards of analytical rigor and data integrity

Qualifications:
• Bachelor's degree in data science, statistics, or actuarial science required; pursuing a Master's or PhD degree in data science or a related field preferred

• Pursuing a Master's or equivalent advanced degree from a top-tier technology school

• Fluency in Microsoft Excel, SQL, R, or Python

• Familiarity with data processing in Python, R, and SQL

• Academic experience in manipulating/transforming data, model selection, model training, cross-validation, and deployment at scale

• Demonstrated quantitative, analytical, and problem-solving skills

• Attention to detail with a willingness and ability to solicit and incorporate feedback into work product

