Designing Human-Centered AI for Scalable Supervision to Support Invisible Work
AI Insight interaction rate
Reduction in avg. time / week / supervisor
AUG 2024 – DEC 2024
2 Product Designers,
1 Product Manager,
2 Developers,
1 User Researcher
Lead product designer from concept research to final product launch
End-to-End Product
OVERVIEW
PLUS (Personalized Learning Squared) is a tutoring platform that combines human and AI tutoring to boost learning gains for middle school students from historically underserved communities. The platform supports over 3,000 students and 500 tutors, delivering more than 90,000 hours of tutoring each month.
Instead of replacing human judgment, the tool amplifies supervisors' work without losing its human heart.
INITIAL PROBLEMS
Supervisors had to watch large volumes of Zoom session recordings to track tutor performance. Manually reviewing 100+ sessions weekly was exhausting and inefficient, leaving no room for deep insights or proactive coaching.
EARLY RESEARCH
Before jumping into design, I partnered with researchers and engineers to align on how AI could responsibly support this shift—without compromising human oversight.
AI USAGE RESEARCH
We used a confusion matrix, user flows, and a capability study to understand how AI could be applied.
Feasibility Research on Behavior Metrics
UX Flow for Each AI Capability
Impact & Feasibility Evaluation Workshop
USER RESEARCH
We ran co-design sessions with end users to align on where they expected AI to help.
Co-Design Sessions with various stakeholders
User Personas for Referencing User Needs
KEY INSIGHTS & GOALS
Supervisors need to reduce time spent on repetitive performance reviews while maintaining fairness & human oversight.
Scale tutor supervision efficiently without risking unjust churn or eroding trust.
PROBLEM STATEMENTS
DESIGN CONSIDERATIONS
From user interviews and technical feasibility reviews, we identified clear opportunities for AI to reduce manual workload—without overstepping human judgment.
Supervisors didn’t need AI to evaluate emotional nuance or replace their decisions; they needed help surfacing what matters most.
ITERATION #1
Supervisors were spending hours watching Zoom recordings to gauge whether tutors were engaged.
They wanted help detecting signals like “warmth” or “proactiveness,” which led us to initially use AI models to score sentiment, tone, and conversational pace.
However, this approach quickly failed: the AI models proved unreliable, biased, and technically infeasible.
NLP models misinterpreted accents, low-quality audio, and quiet speakers; emotional scoring felt culturally biased and unexplainable; and high technical complexity combined with low confidence rates made the system unusable.
SOLUTIONS #1
We pivoted toward measurable support behaviors, focusing on how tutors actually spent their time with students. We introduced a Time Allocation by Student Needs framework, which breaks down each session's time by the type of student need being addressed.
This approach gave supervisors clear, contextual insights into how tutors prioritized their efforts—highlighting support quality, not just presence.
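As a rough illustration of the framework's logic (the event format and need categories here are hypothetical, not the platform's actual schema), the breakdown amounts to aggregating session minutes by need category and reporting each category's share:

```python
# Hypothetical sketch: aggregate one session's time by student-need category.
# Category names and the (category, minutes) event shape are illustrative.
from collections import defaultdict

def time_allocation(events):
    """events: list of (need_category, minutes) tuples from one session."""
    totals = defaultdict(float)
    for category, minutes in events:
        totals[category] += minutes
    session_total = sum(totals.values())
    # Express each category as a percentage of the session.
    return {c: round(100 * m / session_total, 1) for c, m in totals.items()}

session = [("conceptual help", 20), ("motivation", 10), ("procedural help", 10)]
print(time_allocation(session))
# → {'conceptual help': 50.0, 'motivation': 25.0, 'procedural help': 25.0}
```

Because the output is a simple, explainable breakdown rather than an opaque score, supervisors can see at a glance where a tutor's attention went without rewatching the recording.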
KEY IMPACT
Helped supervisors skip full session reviews by surfacing clear, rule-based engagement signals.
Saved review hours and standardized evaluations without relying on costly, biased sentiment AI.
ITERATION #2
We found that supervisors were manually documenting performance issues like no-shows by checking Excel sign-up sheets and matching them against Zoom attendance logs.
It was time-consuming and unreliable to track patterns at the tutor level—supervisors could only see if a single session was missed, not how often someone missed sessions overall.
We designed AI to detect common performance issues (e.g., missed sign-ups, late logins, no-shows) and automatically issue warnings once a threshold was reached. The original goal was to summarize session data into tutor-level performance insights and scale up intervention. However, early AI models were too opaque and sometimes made decisions on their own.
SOLUTIONS #2
We redesigned the AI to surface tutor-level performance patterns while keeping humans in control. Supervisors receive trend summaries and alerts in the dashboard, and they can override any flag and add notes to ensure decisions remain human-led.
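A minimal sketch of this rule-based alerting, assuming hypothetical issue names and thresholds (the real values are not stated in this case study): issues are counted per tutor over a review window, and any category that crosses its threshold is flagged for supervisor review rather than acted on automatically.

```python
# Hypothetical sketch: count each tutor's logged issues and raise an alert
# once a per-issue threshold is crossed. Issue names and threshold values
# are assumptions; alerts are surfaced for human review, never auto-enforced.
from collections import Counter

THRESHOLDS = {"no_show": 2, "late_login": 3, "missed_signup": 2}

def tutor_alerts(issue_log):
    """issue_log: list of issue-type strings for one tutor over a review window."""
    counts = Counter(issue_log)
    return [
        {"issue": issue, "count": counts[issue], "needs_supervisor_review": True}
        for issue, limit in THRESHOLDS.items()
        if counts[issue] >= limit
    ]

log = ["late_login", "no_show", "late_login", "no_show", "late_login"]
for alert in tutor_alerts(log):
    print(alert)
```

Keeping the rules this transparent is what lets a supervisor trace every alert back to concrete events and override it with a note, which is the human-in-control behavior described above.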
KEY IMPACT
Gave supervisors control and context over tutor issues, improving fairness and reducing micromanagement.
Saved supervisor time and improved evaluation consistency by replacing vague cues with actionable data.
FINAL DESIGNS
The AI Tutor Coach system transforms how supervisors manage tutor performance by automating repetitive monitoring tasks and surfacing what truly matters.
By focusing on measurable actions rather than vague emotional signals, the tool helps supervisors coach more effectively, reduce tutor churn, and maintain trust across the platform.
REAL IMPACT & RECOGNITION
Tutors received fairer, more transparent feedback, while supervisors saved time by focusing only on sessions that needed attention.
AI insight interaction rate
Reduction in avg. time / week / supervisor
The tutor coaching tools are among the key AI-enhanced features of the PLUS platform.
REFLECTION
AI doesn’t need to feel human—it needs to make humans feel confident.