
The future of AI Chat: Foundation models and responsible innovation

An expert in AI describes how foundation models are fueling the recent advances in AI and how this is changing everything.
How are foundation models being built, and how should they be evaluated? | iStock/stellalevi

Guest Percy Liang is an authority on AI who says that we are undergoing a paradigm shift in AI powered by foundation models, which are general-purpose models trained at immense scale, such as ChatGPT.

In this episode of Stanford Engineering’s The Future of Everything podcast, Liang tells host Russ Altman about how foundation models are built, how to evaluate them, and the growing concerns with lack of openness and transparency.




[00:00:00] Percy Liang: The term, you know, large language models doesn't really underscore the importance or the role that this, uh, these models were going to have. So, we called it foundation models, uh, because we felt like there's a paradigm shift where we all of a sudden have these huge models trained on broad data, uh, which served as a foundation for developing other products and models downstream.

[00:00:33] Russ Altman: This is Stanford Engineering's The Future of Everything and I'm your host Russ Altman. If you enjoy The Future of Everything, please follow or subscribe to it wherever you listen to your podcasts, it will help us grow.

[00:00:44] Today, Percy Liang from Stanford University will tell us about models like ChatGPT, some call them language models, others call them foundation models: how they're built, what they do on the internals, and how they should be evaluated. It's the future of AI chat.

[00:01:04] Before we jump into this episode, I want to ask you to rate and review the podcast. It'll help us get better, and it'll help other users discover us.

[00:01:18] Everybody knows that in the last year, the world is being turned upside down by language models or foundation models like ChatGPT. These are amazing AI tools that can take questions and apparently generate pretty good answers. But how do they really work? What data was used to train them? What are the hidden biases? And most importantly, how do we evaluate them? And how do we evaluate the companies that produce them or organizations that produce them?

[00:01:48] Well, Percy Liang is a professor of computer science at Stanford University. He's also a senior fellow at the Stanford Institute for Human Centered AI. And he runs the Stanford Center for Research on Foundation Models. He's deeply looking at how to build and evaluate these models.

[00:02:07] So, Percy, let's start out. Uh, you kind of, uh, were instrumental in introducing this phrase, foundation model. And a lot of times you're using it the way other people use large language models like ChatGPT and others. So, let's start out, what's a foundation model to you and why was it important to have a different term for these kinds of things?

[00:02:26] Percy Liang: So, the foundation model is a term that we coined two years ago. Two years ago is eons in the era of AI, and we thought that there was something big emerging and people used the term language models. People were using language models for decades. And it always meant a model over language, natural language. So, you have a sequence of words or tokens and you put a probability distribution over, um, over, um, these, uh, words. But what was [00:03:00] happening was, uh, that GPT-3 came out and GPT-3 was a language model, of course. But what we saw was that it was not just language models. It was also vision.

[00:03:11] There was other modalities, um, and we've seen, for example, in models over protein sequences, and we felt like language was both not really capturing all the modalities, which we thought were important. But also, the term, you know, large [00:03:30] language models didn't really underscore the importance or the role that this, uh, these models were going to have. So, we called it foundation models, uh, because we felt like there was a paradigm shift where we all of a sudden have these huge models trained on broad data, uh, which served as a foundation for developing other products and models downstream. So that's why we call it a foundation model. It's something that you invest a huge amount of capital and build once.

[00:04:00] And then now this can be repurposed for, uh, many different tasks in contrast to the previous paradigm, where everybody would collect their data, train a model, and deploy it and there you have these silos. But now we have this, all of a sudden, this, uh, this centralization and that's why we figure that it deserved a new name.

[00:04:22] Russ Altman: Great. And actually, that makes a lot of sense. Your work and your group spends a lot of time building these, testing them and we're going to get to that. Because you've done even recently, you've done some really exciting things in terms of evaluating a whole bunch of language models, but let's go back to some basics. Can you explain to a nontechnical person, how are they built and what is the training data that's used?

[00:04:42] Percy Liang: Yeah. So anytime you do machine learning or AI, uh, these days, there's basically two things, there's the data and then there's the model architecture. So, let's start with the data. So, one of the insights, uh, that, um, which is not new, but I think has gotten a lot of traction lately, is the idea that you can just take as much data from the internet as you possibly can get your hands on. And you can actually make use out of that, right?

[00:05:15] So in the past, when you trained a question answering system, you would have to have data of question-answer pairs, for example. Or if you train a, you know, a cat versus dog detector, you would have images of cats and dogs. But the idea is that we have this rich resource called the internet, and there's all this data out there. And if you could figure out how to harness that, you would have basically a much more powerful model.

[00:05:41] So, uh, long story short, the data that is, is collected is generally based on these web scrapes, usually Common Crawl. It is a sort of a public domain scrape of the internet, and then there's...

[00:05:58] Russ Altman: And when you say scrape, you mean they literally go to the websites, and they collect all that text that's normally displayed in our browser. They take it and they put it in a file to give it to some model architecture that you're going to tell us about in a couple of seconds.

[00:06:11] Percy Liang: Yeah, that's right. So there's a process, or a Common Crawl organization, that actually produces these, does the scraping and that's one data source. Then there's Wikipedia, which I think everyone knows about. There's also some, uh, books, uh, which, you know, there's some copyright issues around this, but nonetheless, these [00:06:30] sometimes go into the training data. Sometimes you have code from GitHub, and you have, uh, papers from arXiv, maybe there's some other, uh, sources, PubMed articles. And so, it's basically this sort of hunt for, you go online, what's available, you grab it, and then you form a pretty large, uh, data set.

[00:06:52] So, generally the data sets right now are, uh, just to give you a scale of things, are, people talk about training on, you know, one or two trillion tokens for, and the token is maybe 2.5 tokens per word, let's say. So, trillions of words, let's say.

[00:07:11] Russ Altman: Trillions of words.

[00:07:12] Percy Liang: Tens of trillions of words.

[00:07:14] Russ Altman: Much more than the average human, or in fact, much more than any human has ever read.

[00:07:19] Percy Liang: Yeah, much, much more than that. Okay, so you have all this data, and then there is model architecture, and that just means it's a neural network, which is called a transformer. So, the transformer was this thing that was introduced by some Google researchers back in 2017, which again builds on attention mechanisms, which were developed by Yoshua Bengio's group in 2014 or '13.

[00:07:43] So, there's a long history of developing these types of architectures, but the one that seems to be the one that people have converged on, at least for now, is a transformer architecture. I'm not going to go through the details of this, but it's basically a giant neural network, which is, uh, takes in a sequence, and then, um, does a huge number of matrix multiplications, different operations, and produces, uh, vectors, which then could be used to produce, predict future sequences.
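To make "takes in a sequence, does matrix multiplications, and produces vectors" concrete, here is a minimal sketch of scaled dot-product attention, the core operation in the 2017 transformer paper. It is a toy under stated assumptions: the input vectors are made up, there are no learned weight matrices, no multiple heads, and no masking, all of which a real transformer would have. The helper names `matmul`, `softmax`, and `attention` are illustrative only.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    """Turn a list of scores into positive weights that sum to 1."""
    peak = max(row)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    d = len(keys[0])
    keys_t = [list(col) for col in zip(*keys)]
    scores = matmul(queries, keys_t)
    weights = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    return matmul(weights, values)

# Three tokens, each already mapped to a vector of two numbers (made-up values).
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out = attention(x, x, x)  # one output vector per input token
print(out)
```

Each output row is a weighted mix of the input vectors, with the weights computed from how similar the tokens are to one another.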

[00:08:16] Russ Altman: These are vectors of numbers.

[00:08:17] Percy Liang: Yeah, vectors of numbers. And so, you train a model, which means this model has, um, often billions of, if not tens or hundreds of billions of parameters. So GPT-3 was 175 billion parameters and so just as a sort of a goalpost. So, we have, let's say a hundred billion parameters just to.

[00:08:39] Russ Altman: And when you say parameter, these are part of the model that when you start off, you don't know what the value should be, but after exposing it to trillions of pieces of text, it then can estimate what those parameters are so that in the future, it'll be able to make predictions. I just want to check.

[00:08:57] Percy Liang: Yeah, exactly. So, a parameter, it just means a number.

[00:09:00] Russ Altman: Ok.

[00:09:00] Percy Liang: So, you basically have 100 billion numbers that you need to set somehow. You start, all of them are random in the beginning, and then you take your training data, which is, you know, remember tens of trillions of words, and then you stick it through the algorithm and it sets these numbers and it sets the numbers based on the objective and objective function is to predict the next word.

[00:09:26] So, you're given, let's say half of a Wikipedia article and you try to predict the next word. And so, this seems very mundane, right? Okay. So, predict the next word, but it turns out that predicting the next word actually encodes a lot. Because in order to predict the next word, you have to understand, I'll say, "understand" the, uh, what's going on in the sentence.

[00:09:51] For example, if I say, Stanford was founded in blank. You know, in order to predict that, you have to know something about Stanford. So, all of this is actually based on a very... simple and elegant principle, which is just predicting the next word. And what's sort of incredible about these foundation models is that once you do this at scale, you can actually have a model that you can now, not just predict the next word in the document, but you can give it instructions. You can say, generate a poem, in the style of Shakespeare about, you know, Russ Altman or The Future of Everything, and then it will do something. We can try that later, right? Um,
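As a rough illustration of the next-word objective described above, the sketch below builds a toy model that only looks at the single previous word and estimates the probability of each possible next word from counts. This is a deliberately simplified stand-in, not a transformer and not how any production model is trained; the two-sentence corpus and the function names are assumptions made for the example.

```python
import math
from collections import Counter, defaultdict

# Toy corpus; a real model would see trillions of tokens.
corpus = "stanford was founded in 1885 . stanford was founded in california .".split()

# Count how often each word follows each previous word.
follow_counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follow_counts[prev][nxt] += 1

def next_word_distribution(prev_word):
    """Return P(next word | previous word) estimated from counts."""
    counts = follow_counts[prev_word]
    total = sum(counts.values())
    return {word: c / total for word, c in counts.items()}

def next_word_loss(prev_word, actual_next):
    """Negative log-probability of the word that actually came next."""
    prob = next_word_distribution(prev_word).get(actual_next, 0.0)
    return math.inf if prob == 0.0 else -math.log(prob)

dist = next_word_distribution("in")   # what might follow "founded in"?
loss = next_word_loss("in", "1885")   # lower loss means a better prediction
print(dist, loss)
```

Training a large model amounts to adjusting its parameters so that this kind of loss, averaged over the whole data set, goes down.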

[00:10:36] Russ Altman: Well, one quick question. You said, uh, based on the previous words, it'll predict the next word. How far back does it look? Like, is it looking at the previous 100 words? 1000 words? Because it matters. It seems to me that that might be an important thing to know. But how far?

[00:10:50] Percy Liang: Yeah. So, this is called the context length, uh, and this number has gone up over time. So, I think, uh, it's gone from, let's say a thousand, to maybe ten thousand, and now people are looking at context lengths of even a hundred thousand...

[00:11:09] Russ Altman: okay.

[00:11:10] Percy Liang: ...tokens, which so, so. Quite a lot. You can fit entire books, uh, now.

[00:11:15] Russ Altman: And tell me if I'm wrong, but this seems important because if the key thing to know in order to get the next word right is 200,000 words ago or 500,000 words ago, and that's not in your context, then you might not get the answer right. I'm asking.

[00:11:31] Percy Liang: Yeah. So, the context length determines the sort of the upper bound on how much you can really milk out of the data set.
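The effect of context length can be sketched in a few lines: anything older than the window is simply invisible to the model when it predicts the next token. The function name `visible_context` and the word-level "tokens" are assumptions for illustration; real systems split text into sub-word tokens.

```python
def visible_context(tokens, context_length):
    """Keep only the most recent `context_length` tokens."""
    if context_length <= 0:
        return []
    return tokens[-context_length:]

story = ["the", "butler", "hid", "the", "key", "early", "on",
         "and", "much", "later", "the", "culprit", "was"]

short = visible_context(story, 4)    # small window: the early clue is gone
long = visible_context(story, 100)   # large window: the whole story fits

print("butler" in short, "butler" in long)
```

With the short window the model has no access to the clue it would need, which is the upper bound being described here.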

[00:11:39] Russ Altman: Right.

[00:11:40] Percy Liang: Of course, most of the predictions can just be made just looking very locally. If I say of, you know, in the, uh, well, okay, I'm trying to find a good example, but uh, um, you know, if I have a word where it's... Like

[00:11:58] Russ Altman: Percy and I are meeting at noon to go have blank.

[00:12:02] Percy Liang: Yeah.

[00:12:03] Russ Altman: Lunch is a pretty good guess and we've only, and I only said about 20 words.

[00:12:07] Percy Liang: Yeah. And then after that, it's like a period, which is even easier.

[00:12:11] Russ Altman: Right.

[00:12:11] Percy Liang: So, so, there are some cases where you don't need to look very far.

[00:12:14] Russ Altman: Okay.

[00:12:15] Percy Liang: But imagine like you're trying to predict what's going to happen in a murder mystery. And it's at the very end, and you're trying to predict who did it. That requires quite a bit of context, if it's a good murder mystery.

[00:12:28] Russ Altman: Gotcha. Okay, so this is really helpful. Okay, so you, you set it up. You told us about the data, you told us about what that elegant task is. I like that use of that word. Um, let me ask you a few detailed questions. Well, not that detailed, but we've heard that sometimes these things make things up. The word hallucination has been used. Some people don't like the word, I like the word. When it gets something wrong, what's happening?

[00:12:51] Percy Liang: So, when it gets something wrong, um, I guess just to frame things a little bit. So, these are models that are meant to generate text, which is like the training data. So, the key point is that it's trying to generalize from the training data, right? So, they shouldn't, um, only generate things which are exactly training data because then there would be no learning. And now, if you just give a model, just token sequences, it doesn't, there's no notion of truth, there's no grounding there. And in some cases, this is actually fine. If I wanted to generate a fictional story of someone, then actually, yeah, I mean,

[00:13:36] Russ Altman: There's no wrong in fiction...

[00:13:39] Percy Liang: There's no wrong thing. So, I don't want, I would push back on saying that hallucination in itself is fundamentally bad, and that we want these models to only generate things which are factual, because the notion of facts is also a little bit, uh, hazy. Now in a given application, um, there turns out to be things that you should say and things that, uh, by you, I mean, the language model should output and things that the language model should not output.

[00:14:08] Right? So, if I ask, uh, a biomedical question and I say, okay, answer truthfully from this data source, then there are things that it should just not, it shouldn't make up things, right? If there's uncertainty, maybe it can suggest options, but it shouldn't just confidently make up things. Whereas in the context of generating me a play, uh, with, uh, about, you know, uh, dogs and GPT-3, then maybe it's fine if it just makes things up.

[00:14:39] So, so, I think the way I see it is there is an element of creativity of creating things which are novel. And that's the underlying mechanism that, which is tied to generalization. This either produces hallucin- undesired hallucinations or, um, creative things and the what's [00:15:00] needed is on top of that, a way to control for what you want and what you don't want. And so, one thing I didn't mention, but is actually extremely important. So, I lied a little bit when I said, I hallucinated, when I said....

[00:15:15] Russ Altman: See, even we humans can do this every now and then.

[00:15:19] Percy Liang: I said that, oh, it's just predicting the next word. That's the first step of training. The second step is called alignment, and some people call it alignment, uh, some people use the term reinforcement learning from human feedback. And the point of this is, now we have just raw potential. The base language model is just, you know, like someone who's, uh, just to draw analogies, so creative, and now you need to sort of, uh, um, put a little bit of a lid on,

[00:15:51] Russ Altman: Ya.

[00:15:52] Percy Liang: Let's, let's put you in line. You're going to follow. I'm going to give you instructions. You're going to follow in this way. You're going to be polite. You're going to not lie. You're going to not generate racist or toxic content. And that alignment happens generally on top of the base model training.

[00:16:10] Russ Altman: That's fascinating.

[00:16:11] Percy Liang: That's where a lot of the hallucinations get kind of driven, uh, down.

[00:16:16] Russ Altman: So, it's a separate, so there's the pure or the elegant, like predict the next word, but then, uh, as an engineer, you add other constraints basically on the output. Is it that you tell it to change its output, or do you just throw away the output and ask it to try again?

[00:16:33] Percy Liang: So the typical algorithm, I just described reinforcement learning from human feedback, is you train a model, and then you give it some instructions and then you ask it to generate a few options and then you go to a human and say, Russ, which do you like better, A or B?

[00:16:52] Russ Altman: Ah.

[00:16:53] Percy Liang: And then you say A, and that says, Okay, well, let's make the model generate A more and B less. And I go to you again and say A or B. And you do that over and over again. It'll be a team of people who are providing feedback. And then through that refinement process, that's where a lot of the behavior of the model gets.

[00:17:17] Russ Altman: And I'm guessing there are a whole bunch of parameters for those models as well. In addition to the parameters that were set up for predicting the next word, there are...

[00:17:25] Percy Liang: Actually, this is important.

[00:17:26] Russ Altman: Yeah.

[00:17:26] Percy Liang: To clarify, it's the same model.

[00:17:29] Russ Altman: Same model.

[00:17:29] Percy Liang: Uh, same model. You just change the weights.

[00:17:32] Russ Altman: Gotcha.

[00:17:33] Percy Liang: Based on this new objective, which is to favor A over B, as opposed to predict the next word.
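The "favor A over B" objective can be sketched with a pairwise preference loss. The version below uses a Bradley-Terry style formula, which is one common choice and an assumption here; the conversation does not say which formula any particular system uses. Real pipelines typically train a separate reward model and then update the full language model, whereas this toy just nudges two scalar scores.

```python
import math

def preference_loss(score_a, score_b):
    """Small when the human-preferred output A scores above B."""
    # Assumed form: P(A preferred) = sigmoid(score_a - score_b)
    prob_a_preferred = 1.0 / (1.0 + math.exp(-(score_a - score_b)))
    return -math.log(prob_a_preferred)

# Start with the model having no preference between the two outputs.
score_a, score_b = 0.0, 0.0
learning_rate = 0.5
losses = []

for _ in range(5):  # five rounds of "the human picked A"
    losses.append(preference_loss(score_a, score_b))
    p = 1.0 / (1.0 + math.exp(-(score_a - score_b)))
    # Gradient step on the loss: raise A's score, lower B's score.
    score_a += learning_rate * (1.0 - p)
    score_b -= learning_rate * (1.0 - p)

print(losses)
```

The loss shrinks with each round of feedback, which mirrors the idea of making the model "generate A more and B less" by changing the same weights.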

[00:17:39] Russ Altman: This is The Future of Everything with Russ Altman. More with Percy Liang next.

[00:17:55] Welcome back to The Future of Everything. I'm Russ Altman and I'm speaking with Percy Liang from Stanford University.

[00:18:00] In the last segment, Percy gave us the basics of how foundation models work. In the next segment, he will tell us that it's absolutely critical that we evaluate these models. And that we evaluate the organizations that put them out.

[00:18:14] How do we evaluate these? How do we know which ones are good, which ones are bad? Or I mean, is that even the question? Um, so, and I know you've done a lot of work in this area. So how do you think about the evaluation of these foundation models?

[00:18:28] Percy Liang: Yeah, evaluation is really, really tricky now. Before, and before these foundation models, uh, often we would have data sets, uh, like question answering data set, machine translation data sets, image classification data sets, which would be, uh, well defined tasks that people, researchers would go and work on like examples include like ImageNet and SQuAD. Um, and, um, now the paradigm has changed, right? We don't work on necessarily tasks per se, but we have this general [00:19:00] model. And now it's sort of everything's flipped around. Now we have this model, it can do things, now we have to figure out what it can do as opposed to saying we want to do this, go build something that can do that.

[00:19:11] Russ Altman: Right, and you could imagine that any given model might be good at some things and less good at other things.

[00:19:15] Percy Liang: Yeah.

[00:19:15] Russ Altman: So good or bad is not really the question. So how do you wrap your head around this?

[00:19:21] Percy Liang: So, we've been grappling with this for the last two years or so. The thing that we have done at the Center for Research on Foundation Models is we put out a benchmark called HELM, which stands for Holistic Evaluation of Language Models, and it tries as much as possible to cover everything.

[00:19:42] Of course, everything is a lot, so we're not covering everything, but we try to have representative prototypes to cover all the different aspects. So, the way we think about it is there are a set of potential use cases or scenarios in which these models will be deployed, maybe for a text classification, maybe for summarization, uh, maybe for, um, question answering and then, so we have about 40 different scenarios, which we've, uh, sourced from the existing literature, looking at how these models are actually used in industry and so on.

[00:20:22] And then for each of these scenarios, we look at, what do we care about? And typically, people are interested in accuracy. That's the go-to metric, but we thought that this is not the end-all, be-all. Accuracy is important. But we also care about calibration, which is a measure of how well a model knows what it doesn't know. And it turns out models actually are confidently wrong a lot of the time. Uh, as they get more accurate, they also get more incorrectly confident. And we also...

[00:20:54] Russ Altman: I know people, I know people like that, but that's a different podcast.
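Calibration, as described just above, can be quantified in several ways. The sketch below computes expected calibration error, one widely used measure: group predictions by stated confidence and compare each group's average confidence with how often it was actually right. Treat the choice of metric, the bin count, and the sample numbers as assumptions for illustration; the conversation does not specify the formula used in HELM.

```python
def expected_calibration_error(confidences, correct, num_bins=5):
    """Average |accuracy - confidence| over confidence bins, weighted by bin size."""
    bins = [[] for _ in range(num_bins)]
    for conf, is_correct in zip(confidences, correct):
        index = min(int(conf * num_bins), num_bins - 1)
        bins[index].append((conf, is_correct))

    total = len(confidences)
    error = 0.0
    for items in bins:
        if not items:
            continue  # skip empty bins
        avg_conf = sum(c for c, _ in items) / len(items)
        accuracy = sum(1 for _, ok in items if ok) / len(items)
        error += (len(items) / total) * abs(accuracy - avg_conf)
    return error

# A model that says "90% sure" but is right only half the time.
overconfident = expected_calibration_error(
    [0.9, 0.9, 0.9, 0.9], [True, False, False, True])

# A model that says "50% sure" and is right half the time.
well_calibrated = expected_calibration_error(
    [0.5, 0.5, 0.5, 0.5], [True, False, False, True])

print(overconfident, well_calibrated)
```

Both toy models have the same accuracy, yet only the first is "confidently wrong", which is why calibration is tracked separately from accuracy.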

[00:20:57] Percy Liang: And then there's a robustness, uh, to different typos, um, there's a bias, there's toxicity and there's efficiency. So, we look at all of these different metrics in the context of these scenarios and then we go around and find all the models we can get a hold of above a certain level of capability, and we had about 30 different models. Uh, we evaluated the giant cross product of all these, um, and we made this all available on our website for future analysis.

[00:21:32] Um, so this was, at the time, so this is December 2022, when we released it, the most comprehensive evaluation of language models, and over time we've been trying to, you know, update this as new models come out, as we discover new use cases, and so on and so forth.
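The "giant cross product" of models, scenarios, and metrics can be pictured as a grid with one cell per combination. The sketch below only builds that grid; the model names are hypothetical, the lists are shortened, and `evaluate` is a placeholder rather than anything from the actual HELM code.

```python
from itertools import product

models = ["model_a", "model_b"]  # hypothetical names
scenarios = ["question_answering", "summarization"]
metrics = ["accuracy", "calibration", "robustness"]

def evaluate(model, scenario, metric):
    """Placeholder: a real harness would run the model and score its outputs."""
    return None

# One entry per (model, scenario, metric) combination.
results = {
    (model, scenario, metric): evaluate(model, scenario, metric)
    for model, scenario, metric in product(models, scenarios, metrics)
}

print(len(results))  # 2 models x 2 scenarios x 3 metrics
```

With roughly 30 models, 40 scenarios, and several metrics, the same grid grows into thousands of cells, which gives a sense of the evaluation cost.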

[00:21:52] Russ Altman: Are the companies happy to share their models? Like I can imagine it's somewhat risky because they don't know how they're going to perform, and they don't want to get like bad marks. So how open have the companies been about providing you with their models for testing?

[00:22:07] Percy Liang: So, most of the models we evaluated are public.

[00:22:12] Russ Altman: Ah.

[00:22:12] Percy Liang: Because the company, that's their product. They put the model out as an API. So, in some ways, you know, they don't have a choice.

[00:22:21] Russ Altman: Um, right.

[00:22:23] Percy Liang: There were a few cases where we did collaborate with companies. And overall, companies I would say have been pretty supportive of the project because I think in general, everyone wants to figure out how well the models are, are doing. But there is a sense in which we want to, um, obviously companies want their models to look good.

[00:22:43] So that's the sort of the evaluation it's really, I would want to stress, an ongoing challenge to keep up with the latest models and the capabilities and trying to, uh, evaluate everything. I just want to maybe point out one, you know, give people a sense of [00:23:00] how, um, difficult evaluation is. If you think about a single example in a data set, could be write me a, uh, well researched article about World War II and its implications for, um, the, you know, the coming century, you know, and models can now generate a whole essay, right? How do you evaluate this? It's basically what, you know, you would do in, in school, like an English teacher would evaluate.

[00:23:32] Russ Altman: Yeah, this is what you have history professors for.

[00:23:35] Percy Liang: Or even in other domains, like in law or medicine where you're, um, asking a model to do something complicated. Now you need a doctor or a lawyer to evaluate that. So, even evaluating a single data point, if you did it properly, could take like an hour for a person. And so, this is a, you know, it's, it's not machine learning classifies this image is it, it's a cat or a dog, right?

[00:24:06] Russ Altman: Ya.

[00:24:06] Percy Liang: We're far beyond those, uh, that level.

[00:24:11] Russ Altman: Are most of the HELM outputs, are they mostly numerical or is there anything that approximates a qualitative evaluation?

[00:24:19] Percy Liang: So, uh, most of the ones that we initially worked on are fairly cut and dry or try to be cut and dry in terms of multiple-choice questions. There's a few short answers. Uh, we did have summarization as well, which required, um, humans to look at the summaries, which was, you know, took a bit of

[00:24:37] Russ Altman: expensive I'm guessing.

[00:24:39] Percy Liang: That was expensive. There's been quite a bit of interest in trying to use the model to evaluate the model itself, which, this is a collaboration with, uh, you know, Tatsu Hashimoto and Carlos Guestrin on AlpacaEval, in which you have the model rate the model predictions, and this actually gives you some signal. Of course, you might be a little bit concerned, there's some bias there. Uh, so that's something to be figured out for sure.

[00:25:09] Russ Altman: Wow. Now you've also done recently, you've just announced kind of a scorecard. Uh, tell me what motivated that and how that works?

[00:25:17] Percy Liang: Yeah. So, this is, I think you're referring to the Foundation Model Transparency Index.

[00:25:21] Russ Altman: Yes, I am.

[00:25:22] Percy Liang: The story behind this was we were thinking about model evaluation as a way to ground the conversation of where we are as a field, what these models can and can't do, um, and how to evaluate the risks and also the opportunities. But then if you think about from a responsible AI perspective, right, we want to develop models and to benefit society or yada yada, what does that actually mean?

[00:25:52] So initially we thought about how do we measure responsibility? And this is not a property of just the model, but it's about how you produce the model and how you deploy the model. And it's important to, we focus a lot on, okay, ChatGPT is an artifact or Claude is an artifact. But it's important to remember the whole ecosystem where there are people producing data that get used to train, uh, this, this model, and then that model is deployed in various, uh, products, and this is how it has downstream applications. So, if you think about the social impact, you need to look at the whole ecosystem. [00:26:30] Um, so as we were doing that, we realized we have no idea what's going on in the ecosystem because companies aren't too forthcoming with the details.

[00:26:39] So then we decided, well, the first step for responsibility is you need some transparency. And transparency has a long history of being useful as just a general good, you know, uh, measure. For example, you think of nutrition labels and, um, and other, you know, other fields. So, we decided to evaluate, not models, but, you know, organizations based on their transparency practices. So, we have a hundred different indicators, um, covering quite a big span from, you know, did the organization disclose the labor practices for curating the data? Did they disclose the data sources? Did they disclose the compute?

[00:27:22] Russ Altman: Wow.

[00:27:22] Percy Liang: Did they do a model evaluation for capabilities, show limitations? Did they talk about mitigation? Do they disclose the usage policy? Do they have usage policies for how their models can be used? Did they, all the way to, if there's a problem that someone, a user has, is there a reporting mechanism to say that something went wrong? So, it's pretty expansive. And we evaluate 10 different organizations based on these a hundred different indicators, and so, everyone gets a score out of 100 and the top score was a 54.

[00:27:58] Russ Altman: Whoa, that's not very good, that's not very good. And I, yes. So, okay. That really, we have to pause a moment and think about this. Okay. So, this is the best score and I, and what, what, how bad or how low did the scores go?

[00:28:11] Percy Liang: They go down to the tens.
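The scoring described above, a hundred indicators and a score out of 100, can be sketched as a simple count of satisfied indicators. The assumption here is that each indicator is a yes/no check worth one point; the five indicator names and their values are invented for illustration and do not describe any real organization.

```python
def transparency_score(indicators):
    """Count how many indicators an organization satisfies."""
    return sum(1 for satisfied in indicators.values() if satisfied)

# Hypothetical organization, with only 5 of the 100 indicators shown.
example_org = {
    "discloses_data_sources": True,
    "discloses_data_labor_practices": False,
    "discloses_compute": False,
    "publishes_usage_policy": True,
    "has_user_reporting_mechanism": True,
}

score = transparency_score(example_org)
print(f"{score} out of {len(example_org)}")
```

Because the score is a plain tally, an organization can raise it one disclosure at a time, which is what makes year-over-year tracking straightforward.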

[00:28:14] Russ Altman: Okay.

[00:28:15] Percy Liang: And I should say, I don't want to be too negative here. I think this was really the first, it was like a pop quiz.

[00:28:23] Russ Altman: Right.

[00:28:23] Percy Liang: You know, people didn't see this coming. And, but if you look at what we're trying to measure, many of these things, I think companies can absolutely do and many of these things are actually aligned with, uh, business interests, like having better documentation for the APIs, having better evaluations. Some things are maybe not so much, like people aren't going to suddenly reveal all their proprietary secrets about how, what data they use to train on, but many of them, I think, uh, companies could easily improve on.

[00:28:55] So, the hope is that now we have anchored transparency, which is a very fuzzy abstract concept into broad numbers. We can track how this evolves over time. Maybe next year, the organizations will hopefully do better as a result of having these numbers, uh, stare them in the face.

[00:29:17] Russ Altman: How did the companies respond to this pop quiz? Were they irritated or like HELM, did they say, you know, in the long term, this is probably a good thing.

[00:29:26] Percy Liang: I would say the amount of irritation was higher for the Transparency Index than for HELM, for sure.

[00:29:31] Russ Altman: Okay, I think we can all imagine why, and, uh, I know that my students don't like pop quizzes, and I could imagine that the, uh, companies didn't like the pop quiz, and yet it might be good for them, and that's exactly what I tell my students.

[00:29:42] Let me just ask, let's close: there's this idea that, um, many of these models are closed. Like we don't have full, as you just described, we don't have full insight into lots of things about how they were built and how they're being used. And then there are open models, uh, which you have pioneered yourself where it's, uh, where you make perfectly and publicly available what the data was that you used to train, how the architecture works.

[00:30:06] Can we end up with just a little discussion about the costs and benefits of open versus closed systems and how you're thinking about those trade offs?

[00:30:14] Percy Liang: Yeah, this is actually very much an ongoing discussion about the benefits and risks of open versus closed. First I'll say that it's a spectrum, uh, between open and closed.

[00:30:25] Often when people refer to open, uh, it means that the weights of the model are widely available. And I'll talk a little bit more about the, what the implication of that is. As opposed to closed, which means that the models are available, but only through an API access.

[00:30:43] Russ Altman: Right.

[00:30:43] Percy Liang: Uh, you can query the model with inputs and get outputs, but you can't actually...

[00:30:47] Russ Altman: Pretty much the archetypal black box.

[00:30:50] Percy Liang: Yeah, yeah, exactly. Um, so what are the benefits of open, um, models? Um, having weights gives you downstream users and developers just a better notion of, okay, what are you dealing with here? This is the model, I can inspect it. There's also another spectrum here, which is that it would be better to have more description of how the model, how these weights came about. Also, the training data, so, some models, uh, like EleutherAI is an organization that produces models. All their data, all their [00:31:30] code and the models are fully open. That's one end of the spectrum. Meta produces the Llama models, which are probably one of the best open models out there. The weights are open and they're available, but the data is not, so that's somewhat open. And then there's maybe the, now you jump over the line to the API providers.

[00:31:49] So the benefits of open are that, uh, you can have more transparency over what you're actually running. You can extend it more easily, because you can fine-tune, you can customize in various ways that you can't when you just, uh, have a black box API. It's really, you're at the, uh, the whim of, you know, whoever's defining the API and what kind of access they give you. And there's more flexibility for you. You can run it on your local hardware, which is, could be beneficial for privacy as well.

[00:32:21] So those are the benefits of openness. Maybe the benefit of closed is that you have greater, basically you have greater control, in particular against potential misuse. Um, so you can say we don't want people to use this model to generate, uh, disinformation. And in principle, you can monitor that and try to prevent people from, uh, doing that.

[00:32:44] Russ Altman: Yeah, you described a little bit about that when you were talking about the reinforcement learning. And "I like these kinds of answers. I don't like this. Don't do toxic things," etc.

[00:32:52] Percy Liang: Yeah. So, you can have greater control, although it's more nuanced than we really have time to cover, because these closed models can also be jailbroken, which means that you can devise clever prompts, uh, inputs that you can feed them to bypass the safety, uh, filters...

[00:33:11] Russ Altman: Yes. I was going to ask you if closed models were more secure in some way, because if it's inspectable, if the open models are inspectable, they could be inspected by people with nefarious goals. Whereas a closed model might be a little bit harder to kind of use for bad purposes. But I think what you just told me is even in that case, people have figured out ways to kind of get around the security.

[00:33:35] Percy Liang: Yeah, you can circumvent it. And some of the APIs allow you to fine-tune, and people have shown that it's very easy to just fine-tune and break the safety. So, it's, this is very much an ongoing debate. I would say one thing in favor of open models, uh, and security is that if you think about how computer security works...

[00:33:59] Russ Altman: Yeah.

[00:34:00] Percy Liang: All the cryptographic protocols and all the source code is open source. OpenSSL is open, right? And this is crucial for security because it allows the whole world to inspect, in broad daylight, here's the security mechanism. And to find vulnerabilities and make sure that we're doing, you know, we're all looking each other in the eye and saying, okay, this is what security means, okay? And one of the lessons from computer security is that so-called security by obscurity doesn't work. Right?

[00:34:35] If you say, I'm going to be so secure, but I'm not going to tell you what I, what I'm doing. It turns out that a determined enough bad actor can reverse engineer, figure out, or there's a leak or something. And then it's actually less secure than if you're just, uh, more open about what your security practices are. Now, in modeling, in the model’s case, um, things are, it's not the same because they're models, not code, but some of these principles I think, uh, still carry over.

[00:35:09] Russ Altman: Thanks to Percy Liang. That was the Future of AI Chat. You've been listening to The Future of Everything with Russ Altman. If you enjoy the podcast, please rate and review it. Please follow it and subscribe and tell your friends. We have a back catalog of more than 200, almost 250 episodes that are all available for your listening pleasure, so you'll never be surprised by the future of anything.

[00:35:31] You can follow me on Twitter or X, @RBAltman, on Threads, @RussBAltman, and you can follow Stanford Engineering @StanfordENG.