Harvard University

Scientific Platform Engineer

10h ago

No Phone Required, Devops, United States, himalayas
Platform-Engineering, DevOps, Build-&-Release-Engineering, Scientific-Computing, Infrastructure-Engineering, Mid-level

Job Description

The SBGrid Consortium at Harvard Medical School supports a large international research community by curating and distributing a scientific software platform used across structural biology, cryo-EM, and related fields. The platform includes approximately 650 software titles and 6,000 versions across macOS and Linux and is deployed across laptops, workstations, HPC clusters, and cloud environments.

We are hiring a Scientific Platform Engineer to help lead the modernization, security, reliability, and engineering evolution of this platform. This is a platform engineering role with substantial independent responsibility for CI pipelines, reproducible packaging, deterministic installation, release engineering, runtime hardening, observability, and software supply-chain integrity. The role is designed to be primarily engineering and platform-development work, not routine support, and it directly impacts software delivery and platform reliability across a globally distributed scientific infrastructure.

What You Will Work On:

This is an engineering-heavy role – expect 90%+ project/building time vs break-fix.

Build & Test Automation
- Design and implement CI pipelines for scientific software across macOS and Linux.
- Develop regression and smoke test harnesses for packaged software.
- Catch failures before distribution rather than after client installation.
- Support fast-moving development branches (e.g., nightly builds) safely.

Reproducible Packaging
- Help define and enforce a canonical build contract.
- Improve dependency tracking and version control.
- Enable deterministic rebuilds across environments.
- Contribute to artifact integrity and metadata tracking (e.g., SBOM readiness).

Runtime Platform Hardening
- Add tests and versioning discipline to SBGrid’s runtime wrapper system (“capsules”).
- Introduce feature flags and safer rollout mechanisms.
- Improve logging, observability, and error classification.

Internal Tooling & Observability
- Develop dashboards and structured signals around build failures and common error states.
- Reduce reliance on tribal knowledge by encoding workflows into systems.

Technologies You’ll Use (and can help shape):

Core platforms
- Linux (expert-level): shells, process model, filesystems, toolchains, debugging, perf basics
- macOS (strong): building, testing, and release workflows across Intel + Apple Silicon

Build/release + automation
- CI/CD: GitLab CI (or equivalent CI systems and concepts)
- Scripting and automation: Bash + Python (primary)
- Performance-oriented implementation as needed: Go and/or Rust (selectively, for the hot paths)

Packaging and reproducibility
- Current + future packaging direction: evaluating/adopting Nix/Spack/Homebrew-style approaches
- Dependency management, artifact metadata, caching, provenance, reproducible builds

Execution environments
- Containers and virtualization: Docker/Podman, VMs, and orchestration concepts (framework-agnostic; Kubernetes/OpenShift not required)
- (Nice-to-have) Apptainer/Singularity in scientific/HPC contexts

Version control + engineering hygiene
- Git, code review workflows, testing discipline, documentation-first habits

Nice-to-have context
- Research/HPC exposure (Slurm, shared filesystems, scientific software stacks)
- AWS familiarity (useful, not required)

“You Don’t Need to Know Everything”

We expect this job description to match people coming from different directions, including Unix/HPC/research computing admins who like building durable automation, build/release engineers who want harder problems than a typical web app pipeline, and developers with systems instincts who are comfortable close to the OS and tooling.

If you’re strong in either the systems/ops side or the programming/build tooling side, and you want to grow into the other half, we want to hear from you.

Basic Qualifications:
- Minimum of five years’ post-secondary education or relevant work experience.

Additional Qualifications and Skills (Preferred):
- Bachelor’s degree in computer science, engineering, or a related technical field.
- Minimum of 5 years of relevant experience in platform engineering, systems engineering, DevOps, build/release engineering, research computing infrastructure, or a closely related area.
- Two or more years of professional software development experience.
- Experience with CI/CD systems (e.g., GitLab CI, GitHub Actions, similar).
- Experience with an Infrastructure-as-Code tool (e.g., Ansible, Puppet, Chef, Terraform).
- Comfortable with Linux internals and scripting in Bash.
- Experience debugging cross-platform build or runtime issues.
- Solid programming skills in at least one interpreted language (Python preferred; JavaScript, Ruby, etc.).
- Comfort working in a remote, documentation-driven environment.
- Experience with HPC environments or research computing.
- Familiarity with containerization (Docker, Singularity/Apptainer, similar).
- Experience with artifact signing or supply-chain tooling.
- Experience working in regulated or compliance-sensitive environments.
- Interest in scientific research software ecosystems.
- Strong engi