Sr. Data Engineer - AWS
- Must be a US resident
- Must be a US citizen or Green Card holder
- Must have 5+ years of data engineering experience and be interested in working in a startup
- Must have experience working with Python and SQL
- Must have experience working on AWS, AWS analytics, and data storage services
- Design and build an ETL infrastructure
- Background in BioMedical/BioTech/Genome Industry preferred
You are a talented and motivated Senior Data Engineer, experienced in optimizing storage, queries and ingestion of data, and building scalable infrastructure to support data mining, visualization, automated insights, and machine learning efforts.
You thrive in a multidisciplinary team environment and are adept at identifying areas where data engineering can make a difference, developing data architectures and systems that enable our scientists to solve key problems in the drug discovery and development process.
- Work with project leads, users, and SMEs to design and build cloud-based data platforms to facilitate the identification of novel targets, ligases, and therapeutic compounds
- Design and build an ETL infrastructure, project-specific data pipelines, and validation tools to load public and proprietary data from multiple sources using Python, R, Java, SQL and AWS cloud technologies
- Collaborate with the Data Science team to provide users with data, visualizations, reports and automated insights needed to do their jobs more effectively
- Collaborate with the Data Science team to deliver integrated data to machine learning pipelines
- Use best practices for code development, optimization, and unit testing
Role, Responsibilities & Requirements
- Must be a US resident and a US citizen or Green Card holder
- BS in Computer Science, Information Architecture, Mathematics, or a similar field with 5+ years of data engineering experience
- Advanced programming skills, including object-oriented programming; proficiency with Python; willingness to learn other languages as needed
- Demonstrated experience building and operating scalable data pipelines
- Expert-level proficiency in data modeling, SQL query design and coding, query optimization, and performance tuning
- Proficiency in cloud computing on AWS
- Passionate about delivering high-quality data solutions to further scientific research
- Must be effective in a dynamic environment while adapting to changing priorities
- Excellent written and verbal communication skills
- Experience building with and/or using AWS analytics and data storage services
- Proficiency in processing and/or analyzing large public or proprietary biological patient and assay datasets, including transcriptomic, proteomic, and small molecule datasets, is preferred
- Proficiency with predictive modeling approaches and/or preparing data for predictive modeling is preferred
- Strong capacity for independent thinking and the ability to grasp underlying biological questions are a plus
- Annual Bonus
- Parental Leave
- Three weeks of annual leave
- Medical Insurance (HMO, PPO, and others), FSA & HSA
- 100% paid Dental and Vision Insurance
We are a venture-backed biotechnology company solving critical problems in human health through the discovery and development of innovative new medicines against "undruggable" targets. Our team comprises industry-leading experts in protein degradation and molecular glues with a track record of ground-breaking discoveries in the field.
We are committed to leadership in advancing the science and technology of molecular glue drug discovery while prosecuting a pipeline of projects through clinical development. Our patient-first, science-driven approach is complemented by our dedication to a supportive and collaborative work environment.
We are headquartered in San Diego, California, and have a key collaboration with the Center for Protein Degradation at the Dana-Farber Cancer Institute.