[Remote] Senior Scientist, Synthetic Data Generation

Worldwide Salaried Open

Note: This is a remote position open to candidates in the USA. NVIDIA is at the forefront of the AI revolution and is seeking a Senior Scientist to advance capabilities in synthetic data generation for training frontier models. The role involves building synthetic data generation pipelines, advancing multimodal synthetic data generation, and collaborating with various teams to contribute to open-source libraries within the NVIDIA NeMo ecosystem.

Responsibilities

  • Build synthetic data generation pipelines using LLM-based methods and automated quality evaluation, producing datasets that improve the pre- and post-training of LLMs such as Nemotron across reasoning, coding, structured output, and multimodal understanding
  • Advance multimodal synthetic data generation — image, document, video, and audio — in partnership with NVIDIA's model teams
  • Design and maintain open-source libraries and SDKs with clean APIs and strong documentation
  • Drive software excellence with modern tooling, configuration-driven architecture, and professional Git and CI/CD practices
  • Publish original research at top machine learning and AI conferences to maintain NVIDIA's technical leadership
  • Mentor interns and junior researchers to foster technical growth within the team

Skills

  • PhD in Computer Science, Machine Learning, Statistics, or a related field, or equivalent experience
  • 3+ years of research experience in synthetic data generation, generative modeling, multimodal machine learning, or related areas; comparable experience is also considered
  • Deep technical understanding of LLMs, how data shapes their pre- and post-training, and inference frameworks such as vLLM or TGI
  • Proven track record of developing or maintaining software libraries used by a broad developer community
  • Strong publication record at premier venues such as NeurIPS, ICML, ICLR, ACL, or similar
  • Open-source contributions in ML or data tooling
  • Experience with multimodal generation or understanding (vision-language, document AI, video, or audio)
  • Experience building and optimizing scalable data pipelines for large-scale model training (throughput, distributed inference)
  • Experience generating data for agentic, tool-use, or reinforcement-learning post-training

Benefits

  • You will be eligible for equity and [benefits](https://www.nvidia.com/en-us/benefits/).

Company Overview

  • NVIDIA is a computing platform company operating at the intersection of graphics, HPC, and AI. It was founded in 1993 and is headquartered in Santa Clara, California, USA, with a workforce of more than 10,000 employees. Its website is https://www.nvidia.com.

Company H1B Sponsorship

  • NVIDIA has a track record of offering H1B sponsorships, with 448 in 2026, 1872 in 2025, 1354 in 2024, 976 in 2023, 835 in 2022, 601 in 2021, and 529 in 2020. Please note that this does not guarantee sponsorship for this specific role.