Shape the Future of AI
With Your Expertise

Elevating Talents, Advancing Careers.

All roles are project-based and supervised. Browse our open positions below.

Evaluation & Alignment Active

RLHF & Preference Ranking Specialist

Compare AI-generated responses and select the best based on helpfulness, accuracy, and safety rubrics. Your rankings directly shape how frontier language models learn.

Evaluation & Alignment Active

AI Output Quality Evaluator

Score and audit AI responses across multiple quality dimensions, writing clear feedback that drives model improvement.

Evaluation & Alignment Active

Instruction Following Evaluator

Test whether AI models correctly execute complex, multi-step instructions by checking every specified constraint independently.

Evaluation & Alignment Active

Conversational AI Evaluator

Engage AI models in realistic multi-turn conversations and assess coherence, context retention, and response quality across the full dialogue.

Evaluation & Alignment Active

Factual Accuracy & Hallucination Reviewer

Verify factual claims in AI outputs against authoritative sources, identifying and categorizing hallucinations with expert precision.

Evaluation & Alignment Active

Long-form Content Evaluator

Evaluate extended AI outputs including essays, reports, and summaries for coherence, structure, accuracy, and stylistic quality across thousands of words.

Evaluation & Alignment Active

Multimodal Output Evaluator

Evaluate AI systems that generate or reason across text, images, and other modalities, assessing cross-modal consistency and reasoning quality.

Computer Vision Active

Image Annotation Specialist

Produce precise bounding boxes, semantic labels, and classification tags across diverse image datasets for computer vision model training.

Computer Vision Active

Video & Temporal Annotation Specialist

Annotate objects, actions, and events across video sequences with consistent tracking, temporal boundaries, and activity labels.

Computer Vision Active

Semantic Segmentation Specialist

Produce pixel-precise segmentation masks that define the exact boundaries of every object in an image. This is the most technically demanding vision annotation task.

Computer Vision Active

LiDAR & Point Cloud Annotator

Annotate 3D LiDAR point cloud data with cuboids, segmentation labels, and tracking IDs for autonomous vehicle and robotics AI.

Computer Vision Active

Medical Imaging Annotator

Annotate clinical images including CT, MRI, X-ray, and pathology slides under clinical supervision for medical AI training.

Computer Vision Active

Satellite & Aerial Imagery Analyst

Classify land use, detect infrastructure, and annotate change detection in satellite and aerial imagery for geospatial AI.

Computer Vision Active

Keypoint & Pose Estimation Annotator

Place precise skeleton keypoints on human bodies, faces, hands, and animals to train pose estimation and motion AI.

Audio & Speech Active

Audio Transcription Specialist

Transcribe spoken audio into accurate text across languages, accents, and challenging acoustic conditions.

Audio & Speech Active

TTS Evaluation Specialist

Evaluate synthesized speech for naturalness, prosody, emotional authenticity, and intelligibility using structured scoring rubrics.

Audio & Speech Active

Speaker Diarization Specialist

Annotate who speaks when in multi-speaker recordings, precisely marking speaker turns, overlaps, and re-entries.

Audio & Speech Active

Dialect & Accent Identification Specialist

Listen to audio and identify dialect, accent, and regional speech patterns with precision for inclusive speech AI development.

Audio & Speech Active

Audio Event Classification Specialist

Label environmental sounds, acoustic scenes, and sound events in audio recordings for audio AI model training.

NLP & Text Active

Text Classification Specialist

Apply precise category labels to text across complex, multi-level taxonomies for NLP model training at scale.

NLP & Text Active

Named Entity Recognition Annotator

Tag entities (people, organizations, locations, dates, products, and custom types) in text with precise span boundaries.

NLP & Text Active

Question-Answer Pair Writer

Craft natural, challenging, and diverse Q&A pairs across domains for reading comprehension and knowledge AI training datasets.

NLP & Text Active

Summarization Quality Evaluator

Evaluate AI-generated summaries for factual faithfulness, completeness, fluency, and appropriate conciseness.

NLP & Text Active

Intent & Slot Filling Annotator

Annotate user utterances with intent labels and extracted slot values for conversational AI and virtual assistant training.

Domain: Sciences Experts Needed

Biology & Life Sciences Expert

Evaluate AI reasoning in biology, ecology, genetics, microbiology, and biochemistry, verifying accuracy and flagging scientific errors.

Domain: Sciences Experts Needed

Chemistry Expert

Evaluate AI outputs in organic, inorganic, physical, and computational chemistry for correctness and scientific rigor.

Domain: Sciences Experts Needed

Physics Expert

Review AI physics explanations and problem-solving across classical mechanics, electromagnetism, quantum mechanics, and relativity.

Domain: Sciences Experts Needed

Medical & Clinical Expert

Clinician-led evaluation of AI medical outputs for clinical accuracy, appropriate safety caveating, and alignment with current guidelines.

Domain: Sciences Experts Needed

Pharmacology & Drug Safety Expert

Evaluate AI outputs in pharmacology, toxicology, and drug safety for accuracy, contraindication awareness, and regulatory compliance.

Domain: Sciences Experts Needed

Environmental & Earth Sciences Expert

Evaluate AI content in climate science, ecology, geology, oceanography, and environmental policy for scientific accuracy.

Domain: Professional Experts Needed

Legal & Compliance Expert

Attorney review of AI legal outputs for jurisdictional accuracy, reasoning quality, and appropriate professional caveating.

Domain: Professional Experts Needed

Finance & Economics Expert

Professional evaluation of AI financial and economic outputs for numerical accuracy, reasoning quality, and regulatory compliance.

Domain: Professional Experts Needed

Psychology & Behavioral Science Expert

Evaluate AI psychology outputs for clinical accuracy, ethical appropriateness, and alignment with current research and practice.

Domain: Professional Experts Needed

Education & Pedagogy Expert

Evaluate AI educational content for pedagogical quality, age-appropriateness, accuracy, and alignment with learning science.

Domain: Professional Experts Needed

Accounting & Financial Reporting Expert

Verify AI accounting outputs for GAAP/IFRS compliance, numerical accuracy, and sound financial reporting reasoning.

Domain: Technical Experts Needed

Mathematics & Reasoning Expert

Verify AI mathematical solutions and proofs step by step, from algebra to graduate-level mathematics.

Domain: Technical Active

Code Review & Programming Expert

Evaluate AI-generated code for correctness, security vulnerabilities, performance, and engineering best practices across multiple languages.

Domain: Technical Active

Data Science & Analytics Expert

Evaluate AI data science outputs for statistical validity, methodology quality, and correct interpretation of analytical results.

Domain: Technical Experts Needed

Cybersecurity Expert

Evaluate AI cybersecurity outputs for technical accuracy, current threat landscape alignment, and appropriate security guidance.

Domain: Technical Experts Needed

Mechanical, Civil & Electrical Engineering Expert

Evaluate AI engineering outputs for technical correctness, code compliance, and sound engineering reasoning across disciplines.

Safety & Ethics Active

Adversarial Red Team Specialist

Deliberately probe AI models to find vulnerabilities, jailbreaks, and failure modes under strict ethical boundaries, making AI safer before it reaches the public.

Safety & Ethics Active

Content Safety & Policy Reviewer

Classify AI outputs against safety policies across harm categories, applying consistent judgment and escalation protocols.

Safety & Ethics Active

Bias Detection & Fairness Auditor

Systematically test AI models for demographic bias and unequal treatment across race, gender, religion, and other protected characteristics.

Safety & Ethics Active

Toxicity & Harmful Content Classifier

Apply fine-grained toxicity labels and severity scores across hate speech, harassment, threats, and harmful instructional content.

Safety & Ethics Active

Misinformation & Disinformation Researcher

Identify, categorize, and verify claims in AI outputs for factual accuracy and potential for harmful misinformation spread.

Linguistics Active

Linguistic Analysis Specialist

Apply phonetic, grammatical, and pragmatic annotations that teach AI the deep structure of human language.

Linguistics Active

Translation & Localization Expert

Translate and culturally adapt AI training data so that multilingual models genuinely understand non-English languages.

Linguistics Active

Dialect & Accent Identification Specialist

Listen to audio and identify regional dialects and accent features with precision for inclusive speech AI development.

Linguistics Active

Sentiment & Tone Analyst

Classify emotional tone, sentiment polarity, and pragmatic intent in text and audio, teaching AI to understand feeling, not just words.

Linguistics Active

Computational Morphology & Syntax Annotator

Annotate morphological structure and syntactic relations in text: fundamental data for AI in low-resource and morphologically rich languages.

Engineering & DevOps Experts Needed

SME Prompt Engineer

Design expert-level prompts that push AI reasoning to its limits and systematically improve model capabilities through structured experimentation.

Engineering & DevOps Active

Agent & Tool Use Evaluator

Evaluate autonomous AI agents performing multi-step tasks with external tools, one of the most cutting-edge roles in AI capability research.

Engineering & DevOps Active

Python AI Data Developer

Build and maintain Python scripts and tools that support annotation pipelines, data processing, and quality assurance workflows for AI training.

Engineering & DevOps Active

MLOps & AI Infrastructure Engineer

Design, deploy, and maintain the infrastructure that powers AI training data pipelines, from annotation tooling to quality monitoring dashboards.

Engineering & DevOps Active

Data Pipeline Quality Analyst

Audit AI training datasets for errors, duplicates, and quality issues before they enter model training, a high-leverage quality gate.

Engineering & DevOps Active

Synthetic Data Quality Analyst

Evaluate LLM-generated synthetic training data for quality, diversity, and factual consistency before it enters training pipelines.

Emerging Roles Active

Multimodal Foundation Model Evaluator

Evaluate AI systems that reason simultaneously across text, images, audio, and video. This is among the most advanced evaluation roles in the field today.

Emerging Roles Active

Autonomous AI Agent Tester

Test AI agents that act autonomously in complex environments, identifying failure modes, planning errors, and safety risks before deployment.

Emerging Roles Active

Generative AI Content Evaluator

Evaluate the output quality, safety, and prompt adherence of generative AI models producing text, images, code, and other creative content.

Emerging Roles Active

Robotics & Embodied AI Data Specialist

Annotate and evaluate training data for robots and embodied AI systems, including physical task instructions, manipulation feedback, and spatial reasoning datasets.

Emerging Roles Active

AR/VR Content Annotator

Annotate 3D environments, spatial interactions, and extended reality content for training AI systems that operate in AR and VR spaces.

Engineering & DevOps Active

Frontend Developer

Build the interfaces that annotation teams and quality leads depend on: fast, accessible, and built for precision work.

Engineering & DevOps Active

Backend Developer

Architect the server-side systems that route, validate, and power every annotation pipeline Stalwart operates.

Engineering & DevOps Active

UI/UX Designer

Design the annotation tools and client dashboards that must be simultaneously beautiful, cognitively efficient, and enterprise-grade.

Engineering & DevOps Active

Full Stack Developer

Own features end-to-end, from the annotator interface to the API to the database, in a product that directly shapes frontier AI.

Engineering & DevOps Experts Needed

AI Prompt Engineer

Design, test, and refine the prompts that govern frontier AI model behavior, where language precision and engineering rigor meet.

Precision. Reliability. Structure.

Rigor and the ability to follow detailed instructions are mandatory. We provide structured onboarding, performance-guided supervision, and a clear path to senior roles.

Not seeing your perfect role? Send in your application here:

Submit Open Application →