G2i

Machine Learning Evaluation Specialist

1d ago

No Phone Required | $416k - $832k | Data | Albania, Argentina, Austria +36 more | himalayas
AI-Model-Evaluation-Specialist | AI-Evaluation-Engineer | Machine-Learning-Specialist | Machine-Learning-Data-Specialist | AI-Response-Evaluation-Specialist | AI-Assessment-Specialist | Mid-level

Job Description

Machine Learning Evaluation Specialist (Remote)

List of accepted countries and locations

Important for US applicants: This is a 1099 independent contractor role and is not compatible with F-1 OPT, STEM OPT, or other visa statuses that require W-2 employment, guaranteed hours, or employer sponsorship. We are unable to provide offer letters or employment verification for this role.

Help design the hardest ML problems state-of-the-art AI hasn't solved yet.

We're hiring domain experts to build evaluation tasks that challenge the frontier of AI. This is not an ML engineering role; it's a research role. You'll use deep expertise in your field to create problems that general ML knowledge can't touch.

What you'll do

- Propose and frame original, research-grade ML problems rooted in your domain
- Design evaluation tasks that require specialized knowledge well beyond standard pipelines
- Assess AI-generated solutions for correctness, creativity, and methodological rigor, and explain exactly where and why they fall short
- Document problem difficulty, required domain knowledge, and expected failure modes

What you need

- Graduate-level expertise (MS or PhD preferred) in a scientific or technical domain that intersects with ML
- Strong working knowledge of ML methods: model selection, feature engineering, evaluation metrics
- Deep familiarity with active research problems in your field: you know where general ML knowledge runs out
- Excellent written communication: you can articulate complex problems clearly and precisely. This cannot be overstated.
- Self-motivated and comfortable working independently on intellectually demanding tasks

What you don't need

- No prior AI training or RLHF experience required
- No software engineering background needed: domain expertise and research instincts are what matter

Domains we're especially looking for

- Computational Biology / Bioinformatics
- Genomics / Molecular Biology
- Physics / Astrophysics / Signal Processing
- Climate / Environmental Modeling
- Healthcare / Medical Imaging
- Neuroscience / Brain-Computer Interfaces
- Materials Science / Chemistry
- Finance / Quantitative Modeling
- Robotics / Control Systems / Reinforcement Learning
- Advanced NLP (specialized domains)
- Mathematics / Statistics (applied)

Logistics

- Fully remote: work from anywhere
- $200–$400/hr depending on domain and seniority
- 10–40 hrs/week, hourly contract
- Assessment required (paid if approved)
- Independent contractor (1099): not compatible with F-1 OPT, STEM OPT, or visa statuses requiring W-2 employment or employer sponsorship

⚠️ This is a project-based, freelance opportunity with no guaranteed hours. We recommend keeping other work options open while waiting for project assignment.

Originally posted on Himalayas