Bioinformatics Machine Learning Intern
RefinedScience
14d ago
$71k - $79k | Data | United States | Himalayas
Bioinformatics, Machine-Learning-Intern, Computational-Biology, Bioinformatics-Internship, Data-Science-Internships, Entry-level
Job Description
Bioinformatics Machine Learning Intern
RefinedScience | United States (hybrid or remote)

At RefinedScience, our mission is to advance care by bringing together the best science, data and minds – disease by disease, patient by patient, cell by cell – to discover pathways to life beyond disease.

What We Are Looking For

We are seeking a highly motivated Bioinformatics Machine Learning Intern to join our team. This internship is designed for Ph.D. candidates with experience applying machine learning, deep learning, or generative AI methods to single-cell omics data. You will contribute to active projects spanning single-cell biology, multiomics integration, and computational approaches to precision medicine and drug development.

Our Bioinformatics team plays a crucial role in integrating computational biology, large-scale data analysis, and machine learning to drive discoveries in precision medicine and drug development.

Key Activities

- Analyze single-cell and multiomics datasets to extract biological insights supporting precision medicine and drug development programs
- Apply and evaluate machine learning and deep learning approaches to single-cell data for tasks such as cell type classification, biomarker discovery, and patient stratification
- Explore and prototype generative AI and LLM-based approaches to accelerate biological data interpretation and scientific workflows
- Collaborate with scientists, clinicians, and data scientists to design and execute data-driven research projects
- Document and optimize computational workflows following reproducible research best practices
- Present findings through technical reports, visualizations, and presentations to cross-functional teams

Must Haves

- Current Ph.D. candidate in Bioinformatics, Computational Biology, Computer Science, Biostatistics, or a related quantitative field
- Single-cell omics experience: Demonstrated ability to process, analyze, and interpret single-cell data (scRNA-seq, scATAC-seq, CITE-seq, or spatial transcriptomics) using frameworks such as Scanpy/scverse, Seurat, or Bioconductor
- Machine learning expertise: Applied experience developing and evaluating ML/deep learning models on biological data, including neural network architectures (GNNs, transformers, autoencoders), model selection and benchmarking, and integration of ML approaches into analytical workflows
- Programming proficiency: Python and/or R for data analysis, statistical modeling, and visualization
- Statistical foundation: Understanding of statistical methods for biological data (hypothesis testing, differential expression, multiple testing correction, clustering)
- Strong problem-solving skills and ability to communicate complex insights effectively

Desired Qualifications

Machine Learning & AI

- Experience with deep learning frameworks (PyTorch, TensorFlow, JAX)
- Familiarity with graph neural networks, attention mechanisms, or transformer architectures applied to biological data
- Experience with ML experiment tracking and reproducibility (MLflow, Weights & Biases)
- Exposure to representation learning, variational autoencoders, or contrastive learning methods
- Familiarity with scikit-learn, XGBoost, or similar ML libraries
- Interest in or experience with LLMs, RAG systems, or agentic AI tooling

Bioinformatics

- Experience with multimodal single-cell integration (Seurat WNN, scvi-tools/MultiVI/totalVI, Muon)
- Familiarity with spatial transcriptomics analysis (Squidpy, cell2location, nf-core/spatialvi)
- Experience with cell-cell communication inference (CellChat, NicheNet, LIANA)
- Knowledge of drug-gene interaction resources (CMap/LINCS, OpenTargets, ChEMBL)

Engineering & Infrastructure

- Familiarity with Linux/Unix CLI and version control (Git/GitHub)
- Experience with containerization (Docker, Singularity) and environment management (conda, venv)
- Exposure to cloud computing platforms (GCP preferred)
- Familiarity with workflow managers (Nextflow, Snakemake)
- Adherence to best practices for conducting reproducible computational research

Duration

8–10 weeks

Why You'll Love RefinedScience

Team + Values

At RefinedScience, we seamlessly integrate top-tier clinical and biological data with expert knowledge to provide unparalleled insights. We maximize patient impact with these unique insights by optimizing clinical trial probability of success and time to actionable results. We work across biopharma, and we are a trusted partner in achieving better results, faster – working together to unlock strategic advantage.

Our Values

- Act with Purpose – We believe in rigor through deliberate and thoughtful actions
- Be Curious – Curiosity is the spark that ignites innovation and growth
- Take Ownership – True ownership leads to pride and commitment in the work we do
- Invest in Relationships – Building strong connections is the foundation for effective collaboration and trust for long-term success
- Embrace Agility – We celebrate agile thinking, resilience, and adaptability

Compensation

$34-$38 per hour

Originally posted on Himalayas
