Last week, Founder and CEO Balazs Moldovanyi and I had the opportunity to present a Virtual Learning Lab for SSH titled “From Conversation Partner to Communication Coach: Training AI to Offer Meaningful Feedback.”
In this session, we shared how PCS.ai has evolved from early neural network-based classification models to a fully integrated feedback engine powered by large language models (LLMs). This generative AI system is designed to deliver scalable, timely, and individualized feedback to healthcare learners after a simulation-based training session—helping them reflect, refine, and improve with each encounter.
By combining validated qualitative and quantitative assessment frameworks with the capabilities of generative AI, PCS empowers educators to scale their impact—offering learners more frequent opportunities to practice while receiving feedback that is consistent, personalized, and aligned with clinical and communication standards.
In short, AI is now better equipped than ever to help bridge the feedback gap—delivering consistent, high-quality input to support both formative learning and preliminary assessment. By doing so, it eases the strain on traditional feedback systems like Standardized Patients, trained observers, and faculty, which are often stretched thin by time and resource constraints.
I always enjoy presenting these sessions and contributing what we hope is truly cutting-edge content. I still remember when the Virtual Learning Labs first launched—an innovative format born out of the pandemic that’s continued to evolve and thrive. Since that first wave of virtual engagement, Balazs has presented on behalf of PCS at least once a year—an impressive streak in its own right. It’s not every day you see a CEO leading educational sessions.
One of the things I enjoy most about co-presenting with Balazs is the collaboration that happens behind the scenes. The back-and-forth of shaping the deck, refining our message, and deciding how best to communicate it—those internal conversations are often where the real insights emerge. I walk away having learned something new every time.
Of course, not everything makes it into the final deck. As with any strong collaboration, plenty ends up on the cutting room floor—whether due to the constraints of a 50-minute format or because an idea, while compelling, doesn’t quite fit the central theme. Still, those discussions often lay the groundwork for future sessions, features, or blog posts—like this one.
Here are five insights that didn’t make it into the presentation—but are too good not to share.
1. Scoring Disputes in the Age of AI: Fewer, Smarter, Fairer
Michelle Castleberry: As AI becomes more accurate in assessing what the learner did and didn’t ask during a simulated clinical encounter, will learners still have the ability to challenge their score?
Balazs Moldovanyi: Absolutely—they'll still have that option. We’re not removing any functionality. That said, we expect challenges to happen far less frequently. The accuracy of this latest generation is a huge leap forward—honestly, it’s remarkable.
We’ll admit it: scoring isn’t easy, especially when you’re trying to evaluate something as nuanced as the intent behind a question asked during a simulation.
For example, is the learner asking about shortness of breath because it’s a pertinent positive—part of clinical reasoning and hypothesis generation—or are they asking it as a routine Review of Systems question? That distinction matters when assigning credit.
Historically, even we had to accept that scoring wasn't always 100% accurate—because doing it in a way that’s truly airtight is hard. In fact, most human-scored evaluations—whether by faculty or standardized patients—aren’t always perfectly consistent either. But with this generation of AI, the precision is significantly better, and that’s a big deal.
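To make that distinction concrete, here is a minimal sketch of how an LLM might be prompted to classify the intent behind a learner’s question. Everything in it is illustrative: the OpenAI-style API, the model name, the rubric, and the labels are assumptions made for the sake of the example, not PCS’s production scoring logic.

```python
# Illustrative sketch only: classifying the intent behind a learner's
# question with an LLM. The model, rubric, and labels are assumptions,
# not PCS's actual scoring pipeline.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

RUBRIC = """You are scoring a simulated clinical encounter.
Given the conversation so far and the learner's latest question,
classify the question's intent as exactly one of:
- PERTINENT_POSITIVE: a targeted follow-up driven by hypothesis generation
- ROUTINE_ROS: a standard Review of Systems screening question
Reply with the label only."""

def classify_question_intent(transcript: str, question: str) -> str:
    """Return the inferred intent label for a learner's question."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # stand-in; any capable chat model works
        temperature=0,        # deterministic output for scoring
        messages=[
            {"role": "system", "content": RUBRIC},
            {"role": "user", "content": (
                f"Conversation so far:\n{transcript}\n\n"
                f"Learner's question: {question}"
            )},
        ],
    )
    return response.choices[0].message.content.strip()

# The same words can earn different credit depending on context:
history = "Patient: I've had chest tightness climbing stairs for two weeks."
print(classify_question_intent(history, "Do you get short of breath with it?"))
```

The point of the sketch is simply that the conversation context travels with the question, which is what lets a model separate hypothesis-driven follow-ups from rote screening.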
2. Can AI Tell If You're Being Sarcastic? Not Yet—and Here's Why
Michelle Castleberry: You also mentioned something interesting to me—the idea that the technology is getting closer to evaluating things like tone of voice. Can AI eventually tell whether someone sounds genuine or sarcastic?
Balazs Moldovanyi: Yeah, it’s an exciting frontier. But it's always a balancing act. The real question becomes: how should we spend the limited CPU and GPU cycles available for each simulation session? We have to make strategic decisions—what gives us the most meaningful return in terms of improving the learner experience?
It’s one thing to write a wish list of features; it’s another to actually allocate resources in a way that optimizes learning. That forces trade-offs. For example, using more realistic voices could dramatically enhance the experience—but it comes with significant cost, at least for now. As costs come down, we absolutely see the value and plan to move in that direction to increase realism.
When it comes to assessment, though, I think we’re already very close. For 99% of learner encounters, we can now provide high-quality, human-like feedback. The remaining 1%—cases that might be unusually strange or even malicious—may fall short. But our focus has always been on supporting the vast majority of learners: those who are genuinely trying to improve their clinical interviewing skills. We prioritize optimizing their experience over engineering for the extreme edge cases.
3. An AI That Grades the AI? Yep.
Michelle Castleberry: Do we have an AI that evaluates how good our AI is?
Balazs Moldovanyi: Absolutely. That’s exactly the kind of evaluation we use larger AIs for; it’s one of the reasons we pay OpenAI. We do use commercial models, no question. Sometimes they’re extremely helpful because they’re much more powerful than the mid-sized LLMs we run for day-to-day use, which are optimized for cost and efficiency.
But those big models aren’t always consistent, which is why we use them primarily for internal tasks—like monitoring AI output, validating system performance, and generating training data. That’s where their scale and intelligence really shine. We spend a lot with OpenAI, but it’s targeted: we use our own custom models for anything that goes into production and will be used broadly by learners.
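For readers curious what “an AI that grades the AI” can look like in practice, this pattern is often called LLM-as-judge: a larger model scores the output of a smaller production model against a rubric. The sketch below is a guess at the shape of such a check, assuming an OpenAI-style API; the model names, criteria, and JSON format are illustrative, not PCS’s internal tooling.

```python
# Illustrative "LLM-as-judge" sketch: a larger model audits feedback
# produced by a smaller production model. Model names, criteria, and
# output format are assumptions, not PCS's internal tooling.
import json
from openai import OpenAI

client = OpenAI()

JUDGE_PROMPT = """You are auditing AI-generated feedback for a healthcare
simulation. Rate the feedback from 1 (poor) to 5 (excellent) on accuracy,
specificity, and actionability. Respond with JSON only, e.g.:
{"accuracy": 4, "specificity": 3, "actionability": 5, "notes": "..."}"""

def judge_feedback(encounter_summary: str, feedback: str) -> dict:
    """Ask a larger model to grade feedback from the production model."""
    response = client.chat.completions.create(
        model="gpt-4o",  # the bigger, pricier model reserved for internal QA
        temperature=0,
        messages=[
            {"role": "system", "content": JUDGE_PROMPT},
            {"role": "user", "content": (
                f"Encounter summary:\n{encounter_summary}\n\n"
                f"Feedback to grade:\n{feedback}"
            )},
        ],
    )
    return json.loads(response.choices[0].message.content)
```

In a setup like this, scores would typically be aggregated across many sessions to watch for drift and to flag individual outputs for human review, which matches the internal-monitoring role Balazs describes above.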
4. Open Secrets: It’s the Execution That Matters
Michelle Castleberry: Is any of this top secret? Like, if a competitor were watching the webinar—are we saying too much?
Balazs Moldovanyi: Not at all. I don’t really believe in that kind of secrecy. At the end of the day, the devil’s in the details—and in the hard work. We’ve had a head start with AI in healthcare simulation, and we’re focused on maintaining that lead.
Sure, our AI components are a big part of our value, but it’s not just that. There are countless simulation-specific features in our Simulation Cloud, and what really sets PCS apart is how we integrate AI across a wide range of functions. It’s the combination—how it all works together—that creates the real value and makes what we offer so unique.
5. That’s a Future Webinar Topic—And You Know It
Michelle Castleberry: I recently heard someone say, “Gen Z is digitally native, but tech illiterate.”
Balazs Moldovanyi: Yeah—but there’s another way to look at that. We're all tech illiterate when it comes to certain things. Take electricity: most of us don’t understand how it works, how to wire a building, or what’s happening behind the wall. And yet, we use it every day.
That’s the natural progression of technology. The goal isn’t for users to deeply understand the inner workings—it’s for the tech to just work. We can’t expect Gen Z—or any generation—to be as technically hands-on as early adopters were. Back then, tech was still the domain of hobbyists. We were configuring home networks, tweaking settings, writing custom scripts. That era created a kind of hands-on literacy that isn’t as necessary today.
Gen Z expects immediacy. They don’t want to read manuals or click through layers of tabs—and yes, in that sense, they may appear “tech illiterate.” But the real question is: do we want to change how they behave, or change how our products behave?
Michelle Castleberry: Fair enough. But we’ve seen Gen Z learners skip right past simple interface elements—like the vitals tab or the patient note—
Balazs Moldovanyi: Exactly. Their mental models are different. And part of the issue is that we’re still using outdated metaphors—like the floppy disk icon for “save.” That means nothing to them. But that’s not their failure—it’s on us, as designers and developers, to evolve.
We’re all guilty of clinging to old metaphors sometimes. But it’s okay. This is how technology moves forward.
Michelle Castleberry: Let’s put a pin in that—could be a whole webinar on its own 😅
We hope the session was valuable for everyone who attended live, as well as those who’ve since watched the recording. We’re grateful to the Society for Simulation in Healthcare (SSH) for supporting the simdustry and for providing a platform to share updates about our technology with its members.
Until next time...
Quick Links
Non-SSH Members: [webinar recording]
SSH Members: [webinar recording]
4-Page Executive Summary
Slides