From the Field

Research Notes: Two Emerging Strategies for Using AI in Tutoring

Generative artificial intelligence (AI) has the potential to reshape the K-12 tutoring landscape with promises of serving more students at lower cost. But until recently, evidence on whether AI-enabled tutoring can actually improve student learning has been limited. Two new randomized controlled trials find that AI embedded in live, chat-based math tutoring can improve student academic outcomes, raising questions about the tradeoffs between cost and the value of personal connections provided by human tutors.

One study, conducted by researchers from Google and Eedi Labs, evaluated LearnLM, a generative AI tutoring system whose responses to students’ questions are reviewed by a supervising human tutor. Tutors could approve the AI’s responses as written or edit them before sending them to students. Between May and June 2025, 165 students ages 13 to 15 participated in the study. Within each tutoring session, students were randomly assigned to one of three conditions: chatting with a human tutor, chatting with LearnLM, or receiving static, pre-written hints from the Eedi Labs platform.

The results suggest that AI can function as a reliable instructional tool on its own. Supervising tutors approved 76.4 percent of LearnLM’s responses with few or no edits, and LearnLM was just as effective as human tutors in helping students correct their mistakes. More notably, students who interacted with LearnLM performed better on subsequent, more challenging topics: they had a 66 percent success rate, compared with 61 percent for students tutored by humans alone and 56 percent for those who received static hints.

A second study, conducted by researchers at Stanford University, examined a different model: Tutor CoPilot, an AI tool designed to provide guidance to tutors during chat-based tutoring sessions. Unlike LearnLM, which gives the supervising tutor only one suggested response, Tutor CoPilot offers three suggested responses that tutors can choose from, edit, or regenerate. In a study conducted between March and May 2024, 1,000 elementary school students were randomly assigned to chat-based sessions with either a human tutor alone or a human tutor using Tutor CoPilot.

Students in the Tutor CoPilot condition were 4 percentage points more likely to achieve topic mastery than students assigned to human tutors alone, with the largest gains (up to 9 percentage points) among students assigned to lower-rated and less-experienced tutors. The researchers suggest that these improvements were likely driven by the use of higher-quality instructional practices: tutors using CoPilot were 10 percentage points more likely to prompt students to explain their thinking, while tutors in the control condition were more likely to rely on generic encouragement.

Taken together, the two studies highlight different approaches to using AI in tutoring. The LearnLM model lends itself to substitution: supervising tutors play a limited role, and the results suggest that the design could ultimately support fully AI-driven tutoring. Tutor CoPilot, by contrast, positions AI as a supplement that improves tutor quality rather than replacing tutors, functioning as a form of embedded professional development.

The two approaches have different cost implications for school districts as well. Tutor CoPilot costs roughly $20 per tutor and could potentially reduce tutor training costs by strengthening the quality of tutors’ instruction. LearnLM, by contrast, is priced on a per-student basis, with supervising tutors included in the subscription, which eliminates the separate cost of human labor for districts.

LearnLM’s design also raises broader questions about the role of human tutors in AI tutoring. While AI tutoring’s cost savings can increase scalability, its potential to eliminate human tutors could threaten important relational aspects of learning. Research continues to demonstrate the positive impact of “sustained and strong” relationships between students and their tutors on student outcomes, prompting a fundamental question: can AI, on its own, replicate a human relationship? To address this concern, some school districts are experimenting with tutoring platforms that leverage existing student-teacher relationships by cloning a teacher’s appearance and voice into an AI-generated avatar, which then interacts with the student in real time.

The emerging evidence suggests that AI will certainly have a role in the future of tutoring, but exactly what that role will be remains unclear. The two programs reflect contrasting aims: one seeks to strengthen human instruction, while the other moves to replace it. Given these distinct objectives, school districts will confront a key tradeoff between scalability and the human, relational elements of instruction that research has shown to be central to learning.

AI tutoring can safely and effectively support students: An exploratory RCT in UK classrooms

LearnLM Team, Google & Eedi

November 2025

Tutor CoPilot: A Human-AI Approach for Scaling Real-Time Expertise

Rose E. Wang, Ana T. Ribeiro, Carly D. Robinson, Susanna Loeb & Dorottya Demszky

November 2025