Technical Program Manager, Agent Quality and Evaluation, DeepMind
Company: Google
Location: Mountain View
Posted on: April 2, 2026
Job Description:
Minimum qualifications:
Bachelor's degree or equivalent practical experience.
10 years of experience in program or project management.
10 years of experience managing cross-functional or cross-team projects.

Preferred qualifications:
Experience building and scaling evaluation infrastructure for AI/ML systems, including benchmark design, metrics definition, and quality tracking.
Experience partnering with research and engineering teams in fast-paced environments to guide program delivery from concept to completion.
Understanding of the unique challenges in evaluating agentic behavior, with a passion for AI agents and self-sustaining systems.
Ability to prioritize, adapt to change, and provide flexible thought partnership in an evolving landscape.
Excellent communication skills with the ability to develop meaningful relationships with key partners and influence action and outcomes.

About the job
A problem isn’t truly solved until it’s solved for
all. That’s why Googlers build products that help create
opportunities for everyone, whether down the street or across the
globe. As a Program Manager at Google, you’ll lead
multi-disciplinary projects from start to finish — working with
stakeholders to plan requirements, manage project schedules,
identify risks, and communicate clearly with cross-functional
partners across the company. Your projects will often span offices,
time zones, and hemispheres. It's your job to coordinate the
players and keep them up to date on progress and deadlines. As the
Technical Program Manager for AI Agent Quality and Evaluation, you
will be the strategic owner of evaluation infrastructure that
ensures AI agents deliver reliable, high-quality outcomes at scale.
You will scale evaluation efforts across agent quality (e.g.,
capability-based evaluations, user feedback pipelines, quality
dashboards) and product evaluations (e.g., workflow validation,
real-world task completion metrics). You will establish the quality
bar for self-sustaining agent execution across software
development, operations, and enterprise workflows. In this role,
you will own the evaluation strategy for AI agent programs. You
will work at the intersection of research, engineering, and product
to ensure our AI agents meet the highest quality standards before
deployment.

Artificial intelligence will be one of humanity’s most
transformative inventions. At Google DeepMind, we are a pioneering
AI lab with exceptional interdisciplinary teams focused on
advancing AI development to solve complex global challenges and
accelerate high-quality product innovation for billions of users.
We use our technologies for widespread public benefit and
scientific discovery, ensuring safety and ethics are always our
highest priority. We are pushing the boundaries across multiple
domains. Our global teams offer various learning opportunities and
varied career pathways for those driven to achieve exceptional
results through collective effort.

The US base salary range for
this full-time position is $240,000-$334,000 + bonus + equity + benefits.
Our salary ranges are determined by role, level, and location.
Within the range, individual pay is determined by work location and
additional factors, including job-related skills, experience, and
relevant education or training. Your recruiter can share more about
the specific salary range for your preferred location during the
hiring process. Please note that the compensation details listed in
US role postings reflect the base salary only, and do not include
bonus, equity, or benefits. Learn more about benefits at Google.

Responsibilities
Develop and scale capability-based evaluation frameworks for AI agents.
Establish quality dashboards and leaderboards for tracking agent performance and latency.
Guide user feedback pipelines to collect and curate high-quality evaluation examples.
Coordinate benchmark evaluations comparing agent capabilities against baselines.
Collaborate with evaluation teams to validate agent capabilities across various use cases.
Keywords: Google, Santa Cruz, Technical Program Manager, Agent Quality and Evaluation, DeepMind, IT / Software / Systems, Mountain View, California