Can AI be trusted?
Photo: Jay Townsend
A new episode of the Leidos MindSET podcast confronts AI's trust issues. The episode features Ron Keesing, the head of AI at Leidos.
Why you should know: If we don’t trust a technology, we simply won’t use it. But when it works, it’s amazing how fast we trust it. One classic example is the elevator, which many of us trust with our lives daily.
Trusting AI is no different. Sebastian Thrun, who led Google’s self-driving car project, said his team found that, as passengers, we trust driverless cars after only 10 minutes, at which point our attention turns from the road to our phones.
But trust runs deeper than functionality. AI has also earned a healthy dose of distrust over its ethics. Machine learning models can mimic the behavior of their human creators, who are prone to biased thinking. We’ve seen algorithms reflect biases in hospitals, courtrooms and the workplace.
In this episode:
- Can we trust AI to be ethical, fair and inclusive?
- Can AI become a trusted partner in combat?
- How trustworthy will AI become in the future?
From the source: “You don’t go from zero to a trusted AI system in one easy step,” Keesing said. “If you’ve built a really great system, humans can learn to trust it quickly. But building that trusted solution takes time, and requires going through stages of maturity until it’s ready to be introduced to people as a technology they can build that trust relationship with.”
Ron Keesing:
You don't go from zero, not having anything, to building a really trusted AI system in one easy step. If you've built a really great system, humans can learn to trust it quickly, but building that trusted solution actually takes time and it takes going through stages of AI maturity until it's really ready to be introduced to people as a technology that they can develop that trust relationship with.
Meghan Good:
Welcome to Mindset. I'm your host, Meghan Good.
Brandon Buckner:
And I'm your host, Brandon Buckner.
Meghan Good:
Today we've asked Ron Keesing back to talk a little bit more about trusted AI. So Brandon, let's get started.
Brandon Buckner:
Welcome to the Mindset podcast. Today, we're confronting AI's trust issues with Ron Keesing, who is the head of AI at Leidos. Welcome to the show, Ron.
Ron Keesing:
Hey, it's great to be here. Nice to talk to you guys.
Brandon Buckner:
You and your team recently hosted AI Palooza, which for our listeners is a gathering of AI professionals within the company. Something I found so interesting from one of our presenters was that when technology works, it's amazing how fast we trust it. One classic example is that many of us ride the elevator every day and trust that technology with our lives. From what it seems, trusting AI is no different. In fact, it's been proven that as passengers, we begin to trust self-driving cars after only 10 minutes, at which point we turn our attention from the road back to our phones.
Brandon Buckner:
So Ron, what about trusting AI in domains like combat and in matters of national security and in the many other serious domains Leidos is involved with? As our customers start to use AI more and more, just how important is it that we trust it?
Ron Keesing:
Well, first off, this is a topic about which I'm passionate. Trust in AI is absolutely critical. And you've mentioned these examples where we have tremendous trust in technologies if we have the right experiences with them and if we learn that we can use them effectively. So from my perspective, when we work on AI trust as a company, the first thing is being really clear about what we mean by AI trust. Whose trust are we talking about? Because if you think about, say, using AI in a DOD mission, it's not just the trust of end users, the people on the elevator, that matters. It's also the building owners, the people who own a mission or a system. They need to trust that AI isn't going to put a mission at risk. It's the public, who are afraid that if AI is misused or not effective, it could put innocent lives at risk. And so AI trust has to incorporate all those perspectives. It's not just about the end user.
Ron Keesing:
Now the other really interesting thing about AI trust is that you've given these great examples of systems that humans can learn to trust very easily because they work so well. One of the key points about both of those technologies is that what's experienced today as an easily trusted system is the result of extensive development and testing and expertise and interaction that's taken place over a very long time. We've been building trusted AI systems at Leidos for a very long time, and one of the things we've seen is that you don't go from zero, not having anything, to building a really trusted AI system in one easy step. If you've built a really great system, humans can learn to trust it quickly, but building that trusted solution actually takes time and it takes a very methodical approach to going through stages of AI maturity until it's really ready to be introduced to people as a technology that they can develop that trust relationship with.
Meghan Good:
And I imagine that gets even more important when the stakes get higher, when people's lives are in danger or when it's such a mission critical kind of application. Can you give us a couple of examples about the consequences in really high stakes domains when the AI cannot be trusted?
Ron Keesing:
Sure. Let me give you an example of one where we had to build a great deal of trust into an AI system that Leidos built and delivered for the US Navy: our Sea Hunter platform. Sea Hunter is an autonomous vessel that navigated autonomously from California to Hawaii. And in doing so, one of the really important things the system had to do to be safe and trusted was to avoid colliding with other vessels, like fishing vessels it might run across. So we had to do an incredible amount of testing and validation of how that system would perform so that we could be really confident, and we did a lot of experimentation in different kinds of environments to make sure that the AI would actually always behave in the manner intended. And because the system was so well tested, we were able to avoid what could have been life-and-death situations.
Ron Keesing:
Another great example of trusted AI at Leidos is the work we do on security detection and automation. We build the scanning systems that use computer vision to identify people who might be trying to smuggle a threat object onto a plane. And again, those are critical applications where the human has to trust both that the public is going to be treated fairly and that the AI isn't going to let any threat through. So those are great examples of the kind of work Leidos does where the trust stakes are very high, and we work closely with our customers to make sure we can really build trusted AI.
Brandon Buckner:
Ron, you mentioned trust over time. And in your presentation, you stated that trusted AI is nothing new to Leidos, something we've been doing, as you said, for nearly a decade. In fact, in our company's annual report from 2004, you'll find a picture of Sandstorm, a self-driving Humvee built by Carnegie Mellon University with the help of our technologists, which competed in the DARPA Grand Challenge by attempting to cross the Mojave Desert with no human driver. So it's sort of an early precursor to some of the autonomous platforms we see today. Looking back over that span of time, how far do you think we've come?
Ron Keesing:
Oh, it's incredible. I mean, just look at the advances in self-driving cars that have taken place. In those original self-driving car demonstrations as part of the DARPA Grand Challenge, the goal was to drive 60 kilometers across open desert with no people around and no moving obstacles. And even then, the vehicles really struggled to do that independently.
Ron Keesing:
Today we have autonomous vehicles that can navigate around in crowded cities with people, or like I described, the Sea Hunter platform that we've built that can navigate in very difficult, challenging open ocean and avoid collisions with vessels. So clearly just in the realm of autonomous systems, we've gone an incredible way.
Ron Keesing:
But so too in so many other domains. Now we use AI and machine learning to help clinicians make decisions on benefit determinations for our veterans, which again is a critical trust-based use case, right? These are veterans we deeply honor, whose medical care is at stake, and we're using AI, in cooperation with humans and with human trust, to help make benefit determinations that will be really critical to their course of care. So there are lots of examples of how we've built up this expertise over time, of how to actually deliver AI solutions that the public and the owners and the users of the technology can all really trust.
Brandon Buckner:
So whether it's Sandstorm, or Sea Hunter a decade later, or even medical care like you mentioned, generally speaking, what's the Leidos approach? What has it been over the past 10 years to develop trusted AI?
Ron Keesing:
Yeah, so I mentioned that the methodology piece is really important, and this is something we've been trying to distill down: what are the common elements of our successful trusted AI programs? One of the things we've really found is that it involves a structured methodology that hasn't necessarily been consciously used every time, but it's been a repeated pattern in every successful program. We start with a really detailed analysis upfront; we call this our four-A methodology. Step one is analysis: actually dive deep into the data and understand what the data tells us about the problem. Often at that stage we're building something like a data science capability, where humans are still using data to answer questions, but they're using the techniques of AI and machine learning.
Ron Keesing:
As we mature the technology, we then move into a stage where we're doing more of what might be called an assistive mode of AI, where the AI operates in cooperation with a human, taking tasks humans can already do and just making them faster and better with the AI as kind of a partner.
Ron Keesing:
Eventually, we move into a stage where AI augments what humans can do. Now the AI is actually helping humans do something they couldn't completely do themselves before, but with the AI's help they can do the task better or faster or in a whole new way, until finally, where it's appropriate, we can achieve true high-level automation or autonomy.
Ron Keesing:
Now I mention this because, Meghan, you made a great point about risk. One of the great things about this methodology is it also allows us to really control and understand the level of risk of the AI. If you try to build a fully autonomous system on day one, what you often find is there's a whole lot of risk associated with that. But by building up your understanding of the AI and its capabilities gradually, you're only introducing the higher levels of risk once you really understand how the AI works.
Ron Keesing:
The other really important point is you often need additional data that you get from how humans interact with the AI. When they're interacting during these lower levels of assistance and augmentation, you actually get a lot of human guidance on what the AI is doing right and what it's doing wrong. So you can develop human trust in the systems organically, and you can also improve their performance by learning from what humans tell them to do and not to do, until, when you reach full autonomy, you really have a robust and capable system that is effective across many different kinds of problems.
Meghan Good:
In the course of that interaction with humans, do you get pushback from the users of the system? Are they still worried that the AI will take their jobs or supplant the domain expertise they've built over the years?
Ron Keesing:
I find that's a really big advantage of the gradual approach to introducing the AI. Typically, if you come in and you say, "Hey, human, here's a machine that's going to replace your job. What do you think? Is it doing it right?", of course they're threatened and of course they don't really trust that the AI does it right. On the other hand, if you introduce the AI in stages and you actually show them how the AI can free up their time, allow them to do their jobs more quickly at first, and maybe do things that had been frustratingly out of reach, then you build up that trust organically.
Ron Keesing:
Now there are still users, of course; there are always going to be users who are threatened by AI and by automation. But what we do find is that users are far more accepting of it, particularly if they can provide feedback. So if a human says, "Look, this AI always makes these kinds of mistakes," or, "I need to correct this thing about it," and then they can see that the AI actually becomes better as their feedback is incorporated, they feel like they become kind of owners in the process, and they're much more receptive and much more trusting as their feedback into the system is captured. So again, one of the real benefits of this more gradual approach is developing organic human trust, based on people seeing the system become better over time with their ideas incorporated into it.
Meghan Good:
And still achieving some speed gains, some efficiencies along the way.
Ron Keesing:
Absolutely. And often one of the really exciting things we find is, first of all, humans are really excited when they see AI make it possible for them to do the most tedious and repetitive parts of their jobs more easily. And then the other thing they get really excited about is being able to do their jobs in new ways, where in that augmentation stage you go from tasks humans were struggling with, always wondering why something was so hard, to suddenly having the AI make that job easy.
Ron Keesing:
Let me give you an example from the world of intelligence analysis. We had a program where the only way users could find the data they were looking for was to manually search through tons of files in a data warehouse that was essentially impossible to navigate.
Ron Keesing:
And we turned this into an AI-enriched data processing platform where AI helped them find connections between all these different kinds of data. What we found was that humans who had been able to search just a tiny part of the holdings before, literally only looking at maybe one or two percent of the data, could now look at all the data at once and find all the connections they had been missing. And suddenly they didn't see the AI as a threat. They actually saw the AI as a way to do their jobs better and to let them focus on the more interesting problems. Those kinds of successes really make my job exciting and meaningful.
Brandon Buckner:
Let's shift gears to the ethics of AI, which remain a hot topic today. We know that machine learning models can often mimic the behavior of their human creators, who are, of course, prone to biased thinking. We've seen examples in the news of when AI has reflected biases in places like hospitals and courtrooms and the workplace. What are some of the biggest challenges in this area and how are we helping our customers navigate them?
Ron Keesing:
That's a great point. And over the last several years, there's been a great deal more awareness around these issues of AI ethics and fairness. I think it's great that we're having this conversation and considering the potential implications of building solutions with AI. It's a conversation we need to have, and we need to have it on two levels.
Ron Keesing:
First, we need to make our AI better and more resistant to the kinds of bias we've seen and heard about. But we also need to appreciate that AI learns biases by learning from humans, and that AI can help surface and recognize biases humans have been operating with for a long time. There's a great example from a large tech company that tried to train an AI system to identify the most promising resumes from applicants by predicting which people would be most likely to be hired. The widely reported story is that the AI learned that attending a traditionally all-women's college was a negative feature in the hiring model. In other words, women were less likely to be hired. And that was just replicating the fact that women were less likely than men to be hired at the company using its existing processes.
Ron Keesing:
Now is that AI's fault or did AI actually help us recognize that something was wrong with the humans? The thing with AI is if you're not careful, it can pick up on human biases and make them even worse. AI can amplify those biases.
Ron Keesing:
So Brandon, you asked what we do about that. Well, the answer is partially in technology. We actually use algorithms that can detect these biases in model performance, especially by group. We apply specific fairness tools that are able to identify these biases and mitigate them, either by adding additional training data or by changing the weights of models and so on, to make sure that protected classes are treated fairly.
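To make the kind of group-level check Keesing describes a bit more concrete, here is a minimal Python sketch; the metric (a demographic parity gap), the data, and the threshold are illustrative assumptions rather than any specific Leidos tool.

```python
# A minimal sketch of a group-level bias check.
# The metric, data, and threshold are illustrative assumptions only.
import numpy as np

def demographic_parity_gap(y_pred: np.ndarray, groups: np.ndarray):
    """Largest difference in positive-prediction rate across groups."""
    rates = {g: float(y_pred[groups == g].mean()) for g in np.unique(groups)}
    return max(rates.values()) - min(rates.values()), rates

# Hypothetical model predictions for applicants in two demographic groups.
y_pred = np.array([1, 1, 0, 1, 0, 0, 0, 0])
groups = np.array(["A", "A", "A", "A", "B", "B", "B", "B"])

gap, rates = demographic_parity_gap(y_pred, groups)
print(rates)                 # {'A': 0.75, 'B': 0.0}
if gap > 0.1:                # threshold chosen for illustration only
    print(f"Gap of {gap:.2f} between groups; consider reweighting or more data")
```

In practice, a check like this would be one signal among several, since different fairness definitions (discussed below) call for different metrics.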
Ron Keesing:
It's not all about the technology, though. It's also about governance. Often we need to make sure that when we design a project in the first place, we consider these possible biases and make sure that the very way we create and collect our data is not going to amplify them. So it's a technology piece, it's a governance piece, it's an awareness piece. And again, we help all of our customers with these issues.
Ron Keesing:
The other really interesting thing, as I think about this issue of fairness, is the complexity of it. We all have a human sense of what it means to be fair, but fairness actually has a lot of different definitions that are mutually incompatible. Do we mean by fair that something is completely blind to membership in a particular protected class? Or do we mean that it accurately predicts what the outcomes will be when humans actually make the decisions? A lot of famous fairness problems have arisen because people haven't defined up front what definition of fairness they really mean. So one of the other things we work with our customers on is defining what specific kind of fairness is important to their mission and how we achieve it.
Ron Keesing:
A really interesting example for me is that a lot of these fairness issues come down to not having enough data about certain classes. And that becomes a real problem not just if you're trying to, let's say, be fair to people applying for mortgages, but if you're solving an intelligence problem and looking for rare objects you've only ever seen a few of before. Trying to build a machine learning model that performs as well on the rare objects as on the common ones is actually really hard, and this is also something we help our customers with.
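One common way to keep a model from ignoring a rare class is to weight the loss in its favor. The sketch below uses synthetic data and scikit-learn as generic stand-ins; it illustrates the general technique, not Leidos's method.

```python
# Weighting the loss so scarce examples count more during training.
# Synthetic data and a generic classifier, for illustration only.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.utils.class_weight import compute_class_weight

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
y = (rng.random(1000) < 0.02).astype(int)      # ~2% rare "objects of interest"

weights = compute_class_weight("balanced", classes=np.array([0, 1]), y=y)
print(dict(zip([0, 1], weights)))              # rare class gets a much larger weight

# Train with balanced class weights so the model does not ignore the rare class.
clf = LogisticRegression(class_weight="balanced", max_iter=1000).fit(X, y)
```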
Brandon Buckner:
So we've talked about reliability and about ethics, but part of trust is also security and resilience. It was interesting to me that the company has already invested a lot of resources in these areas. Where have those investments gone specifically, and how have you seen returns on them?
Ron Keesing:
It's a great question, and it's something I'm personally fascinated by. One of the things that's come up in AI and machine learning is that we can use data to learn how systems should behave, but that actually creates new kinds of vulnerabilities: if adversaries can get at the data, they can make systems misbehave. One thing they can do is look at how a system behaves and try to infer the data it was trained on in the first place. If you think about that in the context of, say, a health application that might involve protected health information, or a classified application where secure information was used to train a model, then if that data can be reverse engineered, that becomes a vulnerability.
Ron Keesing:
Similarly, we have all of these examples where if someone could get ahold of the training data and feed a model bad data, they could try to cause the model to do the wrong thing under specific circumstances. Someone could add data that would cause a system to get confused between a school bus and a tank, so that you accidentally bomb a school bus instead of a tank. That would be terrible, right? So how do we protect the data, and the AI that functions over that data, so both can be trusted?
Ron Keesing:
So I will tell you, this is an area we've been working in for a long time. We've developed algorithms that can help us determine when data has been manipulated or spoofed. And we've also developed a whole chain of trusted-data techniques so that we can monitor the provenance of the data that goes into our machine learning models in the first place, and make sure we understand if it's been manipulated.
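As a rough, generic illustration of the provenance idea (not the Leidos implementation), a pipeline can fingerprint its inputs at ingestion and refuse to train on anything that has changed since; the file names and manifest format below are hypothetical.

```python
# Minimal "chain of trusted data" sketch: hash each artifact at ingestion,
# then verify before training that nothing has been altered.
import hashlib
import json
import pathlib

def fingerprint(path: str) -> str:
    """SHA-256 hash of a file's contents."""
    return hashlib.sha256(pathlib.Path(path).read_bytes()).hexdigest()

def record_provenance(paths, manifest: str = "provenance.json") -> None:
    entries = {p: fingerprint(p) for p in paths}
    pathlib.Path(manifest).write_text(json.dumps(entries, indent=2))

def verify_provenance(manifest: str = "provenance.json") -> list:
    recorded = json.loads(pathlib.Path(manifest).read_text())
    return [p for p, h in recorded.items() if fingerprint(p) != h]  # altered files

# Usage with hypothetical files:
# record_provenance(["train_images.npz", "labels.csv"])
# assert not verify_provenance(), "training data changed since ingestion"
```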
Ron Keesing:
And then the final piece is that we actually want to monitor our models while they're running, make sure they're performing as they should, and then stop them or take action if they're misbehaving. Those are all techniques we've used. They're being included now in some of our solutions, and we're starting to see our customers become aware of the risks. We're excited because we've been working on this for quite a while and can now start offering these as solutions to our customers.
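A toy example of what monitoring a running model can look like in practice: compare the model's recent score distribution to a baseline and alert on drift. The drift statistic (Population Stability Index) and the 0.25 threshold are common conventions chosen for illustration, not a description of the monitoring Leidos actually ships.

```python
# Runtime drift check: alert if the model's recent outputs look very
# different from the distribution seen at deployment time.
import numpy as np

def psi(baseline: np.ndarray, recent: np.ndarray, bins: int = 10) -> float:
    """Population Stability Index between two score distributions."""
    edges = np.histogram_bin_edges(baseline, bins=bins)
    b, _ = np.histogram(baseline, bins=edges)
    r, _ = np.histogram(recent, bins=edges)
    b = np.clip(b / b.sum(), 1e-6, None)
    r = np.clip(r / r.sum(), 1e-6, None)
    return float(np.sum((r - b) * np.log(r / b)))

baseline_scores = np.random.default_rng(1).beta(2, 5, size=10_000)  # at deployment
recent_scores = np.random.default_rng(2).beta(5, 2, size=1_000)     # this week
if psi(baseline_scores, recent_scores) > 0.25:                      # common rule of thumb
    print("ALERT: output distribution has drifted; pull the model for review")
```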
Meghan Good:
Now with all of that in mind, and with over a decade of experience doing this, what do you see coming on the horizon? What's next in the area of trusted AI?
Ron Keesing:
It's such an interesting time in the field of AI. Things are moving so quickly. One of the real changes is the degree to which autonomous systems are just part of our lives now. We talked about self-driving cars earlier. You can now buy a Tesla with some significant self-driving features, and people are talking about delivery drones and military autonomous systems. So I think the degree to which people interact with autonomous systems in their daily lives is going to really increase over the next three to five years, and it's going to be very impactful, very much an in-your-face part of AI. As for humans trusting those systems and understanding how they fail, there will probably be some high-profile incidents where AI fails and people will be upset about it, but it's also pretty much unavoidable that AI is going to get used this way.
Ron Keesing:
The other side of AI trust, though, that I think will be very positive is that there's so much awareness now around the ways AI amplifies human bias. I am really excited that this conversation has started to bring recognition of some human biases that AI just helps us surface. I love it when we train an AI model for a hiring practice or a mortgage lending practice and find that the data was really biased, because it can help us improve the processes we use as humans and make them fairer even without the AI's help, and understand some of the implicit biases we're all operating with that the machines then just pick up.
Ron Keesing:
So I think one of the really exciting areas of AI bias work is actually helping humans identify and overcome some of our own biases, too. We're already starting to see some of that get picked up, with people thinking about auditing their human-run practices to see whether machines would learn to be biased from them, and then identifying practices that are human-run right now that we want to reform.
Meghan Good:
It's like a matter of measuring unconscious bias, right?
Ron Keesing:
That's right. That's right. And machines are amazing at not only picking it up, but amplifying it, which means that you can use them as powerful tools to help make us as humans better.
Brandon Buckner:
Speaking of measurement, Ron, how do you measure trust in AI?
Ron Keesing:
It's a great question. And I think one of the things we've found is that there is no one measure of human trust in AI; typically we have to build very mission-specific ways of measuring trust. So for example, how do you measure that a human trusts Google search results, which are AI underneath? Well, you just look at what they click. It's really easy to see whether they trust the results. If they pick the highest recommendations from Google and they keep coming back, then you can get a sense that they trust Google search results.
Ron Keesing:
On the other hand, if you're doing, let's say, an analytic function where humans are looking at tags on objects, one of the things you want to look at is how often humans make corrections to the AI. We've found that if humans trust the AI too much, they don't question the labels. And if humans don't bother to correct the AI when it makes obvious mistakes, that means they just don't care and they don't trust the AI. So you actually have to look for an appropriate degree of interaction with a task like that to understand trust.
Ron Keesing:
Similarly, if you're looking at tasks where humans are interacting with augmented systems, where the AI is helping them do their job better, one of the things we find very powerful is to look at how much they're actually interacting with the AI and controlling it. And again, what we find is typically an inverted-U-shaped curve: if they trust the AI too much, they interact very little with it; if they don't trust it at all, they also don't interact with it; and at an appropriate level of trust, there's a really high degree of interaction with the AI. So I would say the best way to measure trust in AI is very mission-specific and very application-specific. We've done it in these kinds of different environments and there are some commonalities you find, but one of the other things we find is that we really want to tailor those metrics to very specific mission needs.
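As a simplified illustration of that inverted-U idea, a team could track analysts' correction rates and flag both extremes; the function and thresholds here are hypothetical, and real deployments would tune them per mission.

```python
# Toy interaction-based trust signal: both "never corrects the AI" and
# "constantly corrects the AI" are warning signs. Thresholds are invented
# for the sketch.
def interaction_trust_signal(n_ai_labels: int, n_corrections: int,
                             low: float = 0.02, high: float = 0.30) -> str:
    rate = n_corrections / max(n_ai_labels, 1)
    if rate < low:
        return f"correction rate {rate:.1%}: possible over-trust or disengagement"
    if rate > high:
        return f"correction rate {rate:.1%}: possible under-trust in the AI"
    return f"correction rate {rate:.1%}: healthy engagement with the AI"

print(interaction_trust_signal(500, 4))    # barely any corrections
print(interaction_trust_signal(500, 60))   # moderate corrections
print(interaction_trust_signal(500, 220))  # constant corrections
```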
Meghan Good:
So Ron, with all of those different examples of trusted AI, I can only imagine how they get more and more complicated over time and there's more and more of them. So how do we build trusted AI at scale?
Ron Keesing:
This is something I love to think about, and something we're working hard on at Leidos: how do we build a Leidos way of building trusted AI at scale and speed? Again, we've looked at what's been successful, and we've found that it boils down to a series of steps in how we execute our AI projects: not just the incremental introduction of technology, which I talked about, but also a repeated formula for building our AI solutions.
Ron Keesing:
The first step is that almost all of our great AI solutions involve some use of existing technology from outside of Leidos. Why? Because there are billions of dollars being invested by amazing companies and startups, by a handful of very large AI companies, and in academia. So one of the things we start almost every AI project with is considering the technology that's already out there and how we can make use of and harvest all that powerful AI that's already available.
Ron Keesing:
And then the big part where we start to come in is how we domain-adapt that technology to our customers' missions. The commercial or academic AI may be really powerful, but it may be built for a completely different domain than the government's problem. So we work on problems like how to generate the data that's necessary to retrain a commercial model, and so on.
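For readers who want a picture of what that retraining step can look like, here is a generic transfer-learning sketch; the PyTorch backbone, the mission class count, and the fake batch are assumptions for illustration, not Leidos's actual pipeline.

```python
# Generic domain adaptation: start from a pretrained backbone, swap the head,
# and fine-tune on mission data. All specifics are illustrative assumptions.
import torch
import torch.nn as nn
from torchvision import models

num_mission_classes = 4                              # hypothetical label set
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
for p in model.parameters():                         # freeze pretrained features
    p.requires_grad = False
model.fc = nn.Linear(model.fc.in_features, num_mission_classes)  # new task head

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# One illustrative training step on a fake batch standing in for mission imagery.
images = torch.randn(8, 3, 224, 224)
labels = torch.randint(0, num_mission_classes, (8,))
optimizer.zero_grad()
loss = loss_fn(model(images), labels)
loss.backward()
optimizer.step()
```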
Ron Keesing:
Then the final piece is how we deploy that AI into real production environments. When we talk to a lot of customers about their AI, we find they've gotten stuck. They can do data science in the lab. Maybe they can do small-scale projects, but actually deploying production AI involves solving all these issues of AI trust. So what we've developed is a pattern for doing AI deployments, both in terms of tools and methodology, to deploy AI that's safe and secure into production.
Ron Keesing:
And then you asked a great question about scalability. We've described this repeatable pattern that we follow, but the other thing we've found is that a lot of the solutions we've developed can be reused, and the patterns can be reused. So we're establishing what we call an AI factory construct within Leidos, where those same processes used to build models can be scaled and repeated again and again with common solution patterns in areas like AIOps and cyber, where we have customer after customer with very similar kinds of problems. We can build out a trusted AI solution drawing not just on what their specific needs are, but on the way we've solved the problem in the past. And by using that combined formula, the common methodology pieces, the common technology pieces and this AI factory construct, we're able to build customized solutions at scale and speed.
Brandon Buckner:
Well, I think we've covered a lot of ground. Ron, thank you so much for your insight. What final thoughts do you have for our listeners?
Ron Keesing:
Just that Leidos as a company is as passionate about this trusted AI mission as I am personally. We see this as our role within the AI ecosystem. There's tremendous need for trusted AI across the government, and there's tremendous AI available from the commercial world and academia, but that commercial AI typically is not operating at the level of trust that government applications require. There's a very big difference between streaming cat videos and a mission that puts human lives at risk. So bridging that gap, between the level of robustness and testing that goes into a commercial AI application and what's required for a trusted AI application that's going to be in the hands of a warfighter, or a person monitoring airport security, or a physician making a critical health decision, that's where we see Leidos' role: taking commercial AI, academic AI, the best AI in the world, and turning it into trusted AI solutions on behalf of the US government. And that's why we really aspire to be the trusted AI provider to the US government.
Meghan Good:
Well, thanks for your time and insights today, Ron.
Ron Keesing:
Great. Thank you so much. It's always fun to talk to you, Meghan. And great to talk to you, Brandon.
Meghan Good:
And thanks to our audience for listening to Mindset. If you enjoyed this episode, please share with your colleagues and visit leidos.com/mindset.