人工智能是如何让我们变得更好的 – Stuart Russell



About the talk

我们应该如何在发挥人工智能最大作用的同时,防范机器可能带来的威胁?随着人工智能的日益完善,人工智能先驱斯图尔特·罗素正在研究一些不同的东西:具有不确定性的机器人。听听他对"人类兼容"的人工智能的设想:让机器运用常识、利他主义以及人类的价值观来解决问题。

00:00
This is Lee Sedol. Lee Sedol is one of the world's greatest Go players, and he's having what my friends in Silicon Valley call a "Holy Cow" moment --
这是李世石。 李世石是全世界 最顶尖的围棋高手之一, 在这一刻,他所经历的 足以让我硅谷的朋友们 喊一句"我的天啊"——

00:11
a moment where we realize that AI is actually progressing a lot faster than we expected. So humans have lost on the Go board. What about the real world?
在这一刻,我们意识到 原来人工智能发展的进程 比我们预想的要快得多。 人们在围棋棋盘上已经输了, 那在现实世界中又如何呢?

00:21
Well, the real world is much bigger, much more complicated than the Go board. It's a lot less visible, but it's still a decision problem. And if we think about some of the technologies that are coming down the pike ... Noriko [Arai] mentioned that reading is not yet happening in machines, at least with understanding. But that will happen, and when that happens, very soon afterwards, machines will have read everything that the human race has ever written. And that will enable machines, along with the ability to look further ahead than humans can, as we've already seen in Go, if they also have access to more information, they'll be able to make better decisions in the real world than we can. So is that a good thing? Well, I hope so.
当然了,现实世界要比围棋棋盘大得多、复杂得多,也远没有那么直观,但它仍然是一个决策问题。如果我们想想那些即将来临的新科技……Noriko [Arai] 提到机器还不能进行阅读,至少达不到理解的程度,但这迟早会发生。而当它发生时,不久之后,机器就将读遍人类写下的所有东西。这将使机器除了拥有比人类看得更远的能力(就像我们在围棋中看到的那样),还能接触到比人类更多的信息,从而在现实世界中做出比我们更好的决策。那这是一件好事吗?我当然希望如此。

01:14
Our entire civilization, everything that we value, is based on our intelligence. And if we had access to a lot more intelligence, then there's really no limit to what the human race can do. And I think this could be, as some people have described it, the biggest event in human history. So why are people saying things like this, that AI might spell the end of the human race? Is this a new thing? Is it just Elon Musk and Bill Gates and Stephen Hawking?
人类的全部文明,我们所珍视的一切,都建立在我们的智慧之上。如果我们能获得更强大的智能,那人类能做到的事就真的没有极限了。我认为这可能就像一些人描述的那样,会成为人类历史上最重大的事件。那为什么有人会说人工智能可能意味着人类的末日呢?这是一个新出现的观点吗?难道只有伊隆·马斯克、比尔·盖茨和斯蒂芬·霍金这么说吗?

01:49
Actually, no. This idea has been around for a while. Here's a quotation: "Even if we could keep the machines in a subservient position, for instance, by turning off the power at strategic moments" -- and I'll come back to that "turning off the power" idea later on -- "we should, as a species, feel greatly humbled." So who said this? This is Alan Turing in 1951. Alan Turing, as you know, is the father of computer science and in many ways, the father of AI as well. So if we think about this problem, the problem of creating something more intelligent than your own species, we might call this "the gorilla problem," because gorillas' ancestors did this a few million years ago, and now we can ask the gorillas: Was this a good idea?
其实不是的,这个想法已经存在很长时间了。请看这段话:"即便我们能够将机器维持在一个屈服于我们的地位,比如说,在战略性时刻将电源关闭"——我等会儿再来讨论"关闭电源"这一话题——"我们作为一个物种,仍然应该自感惭愧。"这段话是谁说的呢?是阿兰·图灵,他在1951年说的。阿兰·图灵,众所周知,是计算机科学之父;从很多意义上说,他也是人工智能之父。当我们考虑这个问题,即创造一个比自己的物种更智能的东西时,我们不妨将它称为"大猩猩问题",因为这正是大猩猩的祖先们几百万年前做过的事。我们今天可以去问大猩猩们:那么做是不是一个好主意?

02:37
So here they are having a meeting to discuss whether it was a good idea, and after a little while, they conclude, no, this was a terrible idea. Our species is in dire straits. In fact, you can see the existential sadness in their eyes.
在这幅图里,大猩猩们正在开会讨论那么做是不是一个好主意。片刻之后,它们得出结论:不,那是一个很糟糕的主意。我们这个物种已经陷入了绝境。实际上,你可以从它们的眼神中看到这种生存层面的忧伤。

02:54
So this queasy feeling that making something smarter than your own species is maybe not a good idea -- what can we do about that? Well, really nothing, except stop doing AI, and because of all the benefits that I mentioned and because I'm an AI researcher, I'm not having that. I actually want to be able to keep doing AI.
所以,对于"创造比自己更聪明的东西也许不是个好主意"这种不安的感觉——我们能做些什么呢?其实没什么能做的,除非停止研究人工智能。但因为人工智能能带来我之前所说的诸多益处,也因为我是人工智能的研究者之一,我可不同意就这么止步。实际上,我想继续做人工智能。

03:18
So we actually need to nail down the problem a bit more. What exactly is the problem? Why is better AI possibly a catastrophe?
所以我们需要把这个问题界定得更清楚一点:问题到底是什么?为什么更强大的人工智能可能会是一场灾难呢?

03:27
So here's another quotation: "We had better be quite sure that the purpose put into the machine is the purpose which we really desire." This was said by Norbert Wiener in 1960, shortly after he watched one of the very early learning systems learn to play checkers better than its creator. But this could equally have been said by King Midas. King Midas said, "I want everything I touch to turn to gold," and he got exactly what he asked for. That was the purpose that he put into the machine, so to speak, and then his food and his drink and his relatives turned to gold and he died in misery and starvation. So we'll call this "the King Midas problem" of stating an objective which is not, in fact, truly aligned with what we want. In modern terms, we call this "the value alignment problem."
再来看这段话:"我们最好确保我们输入机器的目的,正是我们真正想要的目的。"这是诺伯特·维纳在1960年说的,当时他刚看到一个很早期的学习系统学会把西洋跳棋下得比它的创造者更好。与此如出一辙的话,迈达斯国王也说过。迈达斯国王说:"我希望我触碰的所有东西都变成金子。"结果他真的如愿以偿。可以说,那就是他输入机器的目的。后来他的食物、他的饮品、他的亲人都变成了金子,他在痛苦与饥饿中死去。我们可以把这个问题叫做"迈达斯问题":我们阐述出来的目标,实际上与我们真正想要的并不一致。用现代的术语来说,我们把它称为"价值一致性问题"。

04:25
Putting in the wrong objective is not the only part of the problem. There's another part. If you put an objective into a machine, even something as simple as, "Fetch the coffee," the machine says to itself, "Well, how might I fail to fetch the coffee? Someone might switch me off. OK, I have to take steps to prevent that. I will disable my 'off' switch. I will do anything to defend myself against interference with this objective that I have been given." So this single-minded pursuit in a very defensive mode of an objective that is, in fact, not aligned with the true objectives of the human race -- that's the problem that we face. And in fact, that's the high-value takeaway from this talk. If you want to remember one thing, it's that you can't fetch the coffee if you're dead.
而输入错误的目标仅仅是问题的一部分,它还有另一部分。如果你为机器输入一个目标,即便是一个很简单的目标,比如说"去把咖啡端来",机器会对自己说:"好吧,我怎么可能会拿不到咖啡呢?说不定有人会把我的电源关掉。好吧,那我要想办法阻止这种事发生。我得让我的'关闭'开关失效。我得尽一切可能自我防御,不让别人干涉我被赋予的这个目标。"这种以十分防御性的模式一根筋地追求某一目标,而这一目标实际上与人类真正的目标并不一致——这就是我们面临的问题。实际上,这就是今天这个演讲最有价值的一点。如果你只想记住一件事,那就是:如果你死了,你就不能去端咖啡了。

05:17
It's very simple. Just remember that. Repeat it to yourself three times a day.
这很简单。记住它就行了。 每天对自己重复三遍。

05:23
And in fact, this is exactly the plot of "2001: [A Space Odyssey]" HAL has an objective, a mission, which is not aligned with the objectives of the humans, and that leads to this conflict. Now fortunately, HAL is not superintelligent. He's pretty smart, but eventually Dave outwits him and manages to switch him off. But we might not be so lucky. So what are we going to do?
实际上,这正是电影《2001太空漫游》的剧情:HAL有一个目标,一个任务,但这个目标和人类的目标不一致,这就导致了冲突的产生。幸运的是,HAL并不具备超级智能,他挺聪明的,但最终还是被人类主角戴夫智取,戴夫成功地把他关掉了。但我们可能就没有这么幸运了。那我们应该怎么办呢?

06:00
I'm trying to redefine AI to get away from this classical notion of machines that intelligently pursue objectives. There are three principles involved. The first one is a principle of altruism, if you like, that the robot's only objective is to maximize the realization of human objectives, of human values. And by values here I don't mean touchy-feely, goody-goody values. I just mean whatever it is that the human would prefer their life to be like. And so this actually violates Asimov's law that the robot has to protect its own existence. It has no interest in preserving its existence whatsoever.
我想要重新定义人工智能,摆脱"机器智能地追求目标"这一传统观念。新的定义涉及三个原则:第一个原则是利他主义原则,也就是说,机器的唯一目标就是去最大化地实现人类的目标、人类的价值。至于价值,我指的不是多愁善感或假装高尚的那种价值,而是指人类对自己生活的向往,无论那是什么。这实际上违背了阿西莫夫定律中"机器人必须保护自身生存"的那一条:我定义的机器对维护自身生存毫无兴趣。

06:45
The second law is a law of humility, if you like. And this turns out to be really important to make robots safe. It says that the robot does not know what those human values are, so it has to maximize them, but it doesn't know what they are. And that avoids this problem of single-minded pursuit of an objective. This uncertainty turns out to be crucial.
第二个原则不妨称之为谦逊原则。事实证明,这一条对于制造安全的机器十分重要。它说的是机器不知道人类的价值是什么:机器需要将人类的价值最大化,却不知道这价值究竟是什么。这就避免了一根筋地追求某一目标的问题。事实证明,这种不确定性是至关重要的。

07:09
Now, in order to be useful to us, it has to have some idea of what we want. It obtains that information primarily by observation of human choices, so our own choices reveal information about what it is that we prefer our lives to be like. So those are the three principles. Let's see how that applies to this question of: "Can you switch the machine off?" as Turing suggested.
那么,机器要想对我们有用,它就得对我们想要什么有一些了解。它主要通过观察人类做出的选择来获取这些信息:我们自己做出的选择,透露着我们希望自己的生活是什么样子的信息。以上就是三条原则。让我们来看看它们如何应用到图灵提出的那个问题上:"你能把机器关掉吗?"

07:37
So here's a PR2 robot. This is one that we have in our lab, and it has a big red "off" switch right on the back. The question is: Is it going to let you switch it off? If we do it the classical way, we give it the objective of, "Fetch the coffee, I must fetch the coffee, I can't fetch the coffee if I'm dead," so obviously the PR2 has been listening to my talk, and so it says, therefore, "I must disable my 'off' switch, and probably taser all the other people in Starbucks who might interfere with me."
这是一个PR2机器人,我们实验室里就有一个,它的背面有一个大大的红色"关闭"开关。那问题来了:它会让你把它关掉吗?如果我们按传统的方法,给它"去拿咖啡"这个目标,它会想:"我必须去拿咖啡,但我死了就不能拿咖啡了。"显然PR2听过我的演讲了,所以它说:"因此,我必须让我的'关闭'开关失效,可能还要把星巴克里其他可能干扰我的人都电击一下。"

08:09
So this seems to be inevitable, right? This kind of failure mode seems to be inevitable, and it follows from having a concrete, definite objective.
这看起来必然会发生,对吗? 这种失败看起来是必然的, 因为机器人在遵循 一个十分确定的目标。

08:18
So what happens if the machine is uncertain about the objective? Well, it reasons in a different way. It says, "OK, the human might switch me off, but only if I'm doing something wrong. Well, I don't really know what wrong is, but I know that I don't want to do it." So that's the first and second principles right there. "So I should let the human switch me off." And in fact you can calculate the incentive that the robot has to allow the human to switch it off, and it's directly tied to the degree of uncertainty about the underlying objective.
那如果机器对目标不那么确定,会发生什么呢?它的思路就不一样了。它会说:"好的,人类可能会把我关掉,但只会在我做错事的时候。我不太清楚什么是错事,但我知道我不想做错事。"这就用到了第一和第二原则。"所以我应该让人类把我关掉。"事实上,你可以计算出机器人允许人类把它关掉的动机有多大,而且这个动机与它对背后目标的不确定程度直接相关。
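
The claim that the shutdown incentive is "directly tied to the degree of uncertainty about the underlying objective" can be illustrated with a toy calculation. The sketch below is only an illustrative model in the spirit of the off-switch setting Russell describes, not his actual analysis: the robot holds a belief over the unknown human utility U of its intended action, and the human is assumed to switch it off exactly when U < 0. The Gaussian belief and all numbers are assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

def incentive_to_defer(mean, std, n=100_000):
    """Expected value of letting the human decide, minus the best the robot
    can do on its own (act on E[U], or switch itself off for value 0)."""
    u = rng.normal(mean, std, n)                 # robot's belief over the true utility U
    act_alone = max(float(np.mean(u)), 0.0)      # act unilaterally, or shut itself down
    defer = float(np.mean(np.maximum(u, 0.0)))   # human switches it off exactly when U < 0
    return defer - act_alone

for std in [0.0, 0.5, 1.0, 2.0]:
    print(f"belief std = {std:.1f} -> incentive to allow shutdown = "
          f"{incentive_to_defer(mean=0.3, std=std):+.3f}")
```

With zero uncertainty the incentive is zero (the robot gains nothing from deferring), and it grows as the robot's belief about U gets wider, which is the relationship described in the talk.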

08:53
And then when the machine is switched off, that third principle comes into play. It learns something about the objectives it should be pursuing, because it learns that what it did wasn't right. In fact, we can, with suitable use of Greek symbols, as mathematicians usually do, we can actually prove a theorem that says that such a robot is provably beneficial to the human. You are provably better off with a machine that's designed in this way than without it. So this is a very simple example, but this is the first step in what we're trying to do with human-compatible AI.
当机器被关闭后,第三条原则就开始起作用了:机器会学到一些关于它应该追求的目标的信息,因为它知道它刚做的事是不对的。实际上,我们可以像数学家经常做的那样,恰当地使用希腊字母,证明这样一条定理:这样的机器人可以被证明是对人类有益的。可以证明,有一台如此设计的机器,你的生活会比没有它更好。这是一个很简单的例子,但这只是我们尝试实现与人类兼容的人工智能的第一步。
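
Under the same toy assumptions as the sketch above, the "provably beneficial" claim reduces to a one-line inequality. This is an illustrative formalization, not the theorem from Russell's research:

```latex
% U: the uncertain human utility of the robot's intended action;
% the human is assumed to switch the robot off exactly when U < 0.
\mathbb{E}\bigl[\max(U,\,0)\bigr] \;\ge\; \max\bigl(\mathbb{E}[U],\,0\bigr)
```

The left side is the human's expected utility with a robot that defers to the off switch; the right side is the best the robot can achieve acting on its own. The inequality is strict whenever the robot's belief puts probability on both positive and negative values of U, which is the sense in which you are better off with a machine designed this way.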

09:30
Now, this third principle, I think is the one that you're probably scratching your head over. You're probably thinking, "Well, you know, I behave badly. I don't want my robot to behave like me. I sneak down in the middle of the night and take stuff from the fridge. I do this and that." There's all kinds of things you don't want the robot doing. But in fact, it doesn't quite work that way. Just because you behave badly doesn't mean the robot is going to copy your behavior. It's going to understand your motivations and maybe help you resist them, if appropriate. But it's still difficult. What we're trying to do, in fact, is to allow machines to predict for any person and for any possible life that they could live, and the lives of everybody else: Which would they prefer? And there are many, many difficulties involved in doing this; I don't expect that this is going to get solved very quickly. The real difficulties, in fact, are us.
现在来看第三个原则。我想这可能是最让你们摸不着头脑的一条。你可能会想:"你看,我有时表现得并不好,我可不希望我的机器人像我一样行事。我有时大半夜偷偷摸摸地从冰箱里找东西吃,诸如此类的事。"有各种各样的事你是不希望机器人去做的。但实际上并不是这样运作的:仅仅因为你表现不好,并不代表机器人就会复制你的行为。它会去理解你做事的动机,而且可能会在合适的情况下帮助你克制这些动机。但这仍然十分困难。实际上,我们在尝试做的,是让机器能为任何一个人、为他们可能过上的任何一种生活,以及其他所有人的生活做出预测:他们更喜欢哪一种?这涉及诸多困难,我不认为这个问题会很快得到解决。而真正的困难,其实是我们自己。

10:32
As I have already mentioned, we behave badly. In fact, some of us are downright nasty. Now the robot, as I said, doesn't have to copy the behavior. The robot does not have any objective of its own. It's purely altruistic. And it's not designed just to satisfy the desires of one person, the user, but in fact it has to respect the preferences of everybody. So it can deal with a certain amount of nastiness, and it can even understand that your nastiness, for example, you may take bribes as a passport official because you need to feed your family and send your kids to school. It can understand that; it doesn't mean it's going to steal. In fact, it'll just help you send your kids to school.
就像我刚说的那样,我们的行为并不完美,我们中有的人甚至相当恶劣。但正如我所说,机器人并不需要复制那些行为。机器人没有任何自己的目标,它是完全利他的。而且它的设计不只是去满足某一个人、某一个用户的欲望,而是要尊重所有人的意愿。所以它能应对一定程度的恶劣行为,它甚至能理解你的不端行为。比如说,假如你是一个护照官员,你收取贿赂是因为你得养家、得供你的孩子们上学。机器人能理解这一点,但这不意味着它也会去偷;实际上,它只会帮助你供孩子们上学。

11:16
We are also computationally limited. Lee Sedol is a brilliant Go player, but he still lost. So if we look at his actions, he took an action that lost the game. That doesn't mean he wanted to lose. So to understand his behavior, we actually have to invert through a model of human cognition that includes our computational limitations -- a very complicated model. But it's still something that we can work on understanding.
我们的计算能力也是有限的。李世石是一位杰出的围棋棋手,但他还是输了。如果我们看他的行动,他采取了导致输掉棋局的行动,但这并不意味着他想要输。所以要理解他的行为,我们必须通过一个包含了我们计算局限性的人类认知模型去反向推理——这是一个非常复杂的模型,但仍然是我们可以努力去理解的。
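
One concrete way to read "invert through a model of human cognition that includes our computational limitations" is Boltzmann-rational (softmax) inference, a standard modelling choice in this literature rather than anything specific from the talk. The hypothetical payoffs, prior, and rationality parameter beta below are illustrative assumptions; the point is only that a losing move is far more likely under "wants to win but plays imperfectly" than under "wants to lose".

```python
import numpy as np

# Hypothetical values of two available moves under two hypotheses about
# what the player actually wants (all numbers are illustrative assumptions).
value_if_wants_to_win  = np.array([1.0, 0.2])   # [strong move, losing move]
value_if_wants_to_lose = np.array([0.0, 0.8])

def boltzmann_likelihood(values, chosen, beta=2.0):
    """P(chosen | values) under a softmax choice model; a finite beta models
    the player's limited computation (perfect play would be beta -> infinity)."""
    p = np.exp(beta * values)
    return (p / p.sum())[chosen]

chosen = 1                          # we observe the move that ultimately lost the game
prior = np.array([0.95, 0.05])      # prior belief: the player almost surely wants to win

likelihoods = np.array([
    boltzmann_likelihood(value_if_wants_to_win,  chosen),
    boltzmann_likelihood(value_if_wants_to_lose, chosen),
])
posterior = prior * likelihoods
posterior /= posterior.sum()
print(f"P(wants to win | losing move) = {posterior[0]:.2f}")
```

Because the model allows for imperfect play, the inferred preference for winning stays high even after observing the losing action, which matches the point about Lee Sedol above.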

11:45
Probably the most difficult part, from my point of view as an AI researcher, is the fact that there are lots of us, and so the machine has to somehow trade off, weigh up the preferences of many different people, and there are different ways to do that. Economists, sociologists, moral philosophers have understood that, and we are actively looking for collaboration.
对于我这样一个人工智能研究者来说,可能最困难的部分是我们人数众多,所以机器必须想办法去权衡许多不同的人的不同偏好,而做这种权衡有许多不同的方法。经济学家、社会学家、道德哲学家早已理解这一点,我们正在积极地寻求合作。

12:08
Let's have a look and see what happens when you get that wrong. So you can have a conversation, for example, with your intelligent personal assistant that might be available in a few years' time. Think of a Siri on steroids. So Siri says, "Your wife called to remind you about dinner tonight." And of course, you've forgotten. "What? What dinner? What are you talking about?"
让我们来看看如果我们把这一步弄错了会怎么样。举例来说,你可能会与你的智能个人助理进行这样的对话——这样的助理可能几年内就会出现,可以把它想成加强版的Siri。Siri对你说:"你的妻子打电话提醒你别忘了今晚的晚餐。"而你呢,自然忘了这回事:"什么?什么晚餐?你在说什么?"

12:30
"Uh, your 20th anniversary at 7pm."
“啊,你们晚上7点, 庆祝结婚20周年纪念日。”

12:36
"I can't do that. I'm meeting with the secretary-general at 7:30. How could this have happened?"
"我可去不了。我晚上7点半要和秘书长见面。怎么会这样呢?"

12:42
"Well, I did warn you, but you overrode my recommendation."
“呃,我可是提醒过你的, 但你不听我的建议。”

12:48
"Well, what am I going to do? I can't just tell him I'm too busy."
"那我该怎么办呢?我总不能跟他说我太忙,没空见他。"

12:52
"Don't worry. I arranged for his plane to be delayed."
"别担心。我已经安排了,让他的航班延误。"

12:58
"Some kind of computer malfunction."
“像是因为某种计算机故障那样。”

13:01
"Really? You can do that?"
“真的吗?这个你也能做到?”

13:04
"He sends his profound apologies and looks forward to meeting you for lunch tomorrow."
"秘书长深表歉意,并期待明天与你共进午餐。"

13:10
So the values here -- there's a slight mistake going on. This is clearly following my wife's values which is "Happy wife, happy life."
这里的价值观出了一点小问题。这显然是在遵循我妻子的价值观,那就是"老婆开心,生活舒心"。

13:21
It could go the other way. You could come home after a hard day's work, and the computer says, "Long day?"
事情也可能朝另一个方向发展。你忙碌了一天,回到家里,电脑对你说:"今天很辛苦吧?"

13:28
"Yes, I didn't even have time for lunch."
“是啊,我连午饭都没来得及吃。”

13:30
"You must be very hungry."
“那你一定很饿了吧。”

13:31
"Starving, yeah. Could you make some dinner?"
“快饿晕了。你能做点晚饭吗?”

13:36
"There's something I need to tell you."
"有一件事我得告诉你。"

13:40
"There are humans in South Sudan who are in more urgent need than you."
"南苏丹有人比你更迫切地需要帮助。"

13:46
"So I'm leaving. Make your own dinner."
“所以我要离开了。 你自己做饭去吧。”

13:50
So we have to solve these problems, and I'm looking forward to working on them.
我们得解决这些问题, 我也很期待去解决。

13:55
There are reasons for optimism. One reason is, there is a massive amount of data. Because remember -- I said they're going to read everything the human race has ever written. Most of what we write about is human beings doing things and other people getting upset about it. So there's a massive amount of data to learn from.
我们有理由感到乐观。理由之一是我们有海量的数据。记住,我说过机器将会读遍人类写下的所有东西,而我们写下的内容大多是某些人做了某些事,以及其他人对此感到不满。所以机器可以从海量的数据中学习。

14:11
There's also a very strong economic incentive to get this right. So imagine your domestic robot's at home. You're late from work again and the robot has to feed the kids, and the kids are hungry and there's nothing in the fridge. And the robot sees the cat.
同时从经济的角度, 我们也有足够的动机 去把这件事做对。 想象一下,你家里有个居家机器人, 而你又得加班, 机器人得给孩子们做饭, 孩子们很饿, 但冰箱里什么都没有。 然后机器人看到了家里的猫。

14:28
And the robot hasn't quite learned the human value function properly, so it doesn't understand the sentimental value of the cat outweighs the nutritional value of the cat.
但机器人还没有完全学好人类的价值函数,所以它不明白猫的情感价值大于猫的营养价值。

14:40
So then what happens? Well, it happens like this: "Deranged robot cooks kitty for family dinner." That one incident would be the end of the domestic robot industry. So there's a huge incentive to get this right long before we reach superintelligent machines.
接下来会发生什么呢?差不多是这样的头条:"疯狂的机器人把猫煮了给全家当晚饭!"单单这一个事故,就足以终结整个居家机器人产业。所以在我们造出超级智能机器之前很久,我们就有巨大的动机把这个问题解决好。

15:00
So to summarize: I'm actually trying to change the definition of AI so that we have provably beneficial machines. And the principles are: machines that are altruistic, that want to achieve only our objectives, but that are uncertain about what those objectives are, and will watch all of us to learn more about what it is that we really want. And hopefully in the process, we will learn to be better people. Thank you very much.
总结来说: 我想要改变人工智能的定义, 让我们可以证明机器对我们是有利的。 这三个原则是: 机器是利他的, 只想着实现我们的目标, 但它不确定我们的目标是什么, 所以它会观察我们, 从中学习我们想要的究竟是什么。 希望在这个过程中, 我们也能学会成为更好的人。 谢谢大家。

15:30
Chris Anderson: So interesting, Stuart. We're going to stand here a bit because I think they're setting up for our next speaker.
克里斯·安德森:非常有意思,斯图尔特。我们趁着工作人员为下一位演讲者布置的时候,在台上多聊几句。

15:37
A couple of questions. So the idea of programming in ignorance seems intuitively really powerful. As you get to superintelligence, what's going to stop a robot reading literature and discovering this idea that knowledge is actually better than ignorance and still just shifting its own goals and rewriting that programming?
我有几个问题。将"无知"编入程序这个想法,从直觉上看似乎非常有力。但当机器达到超级智能时,有什么能阻止它在阅读文献的过程中发现"知识其实比无知更好"这个观点,进而改变自己的目标、重写自己的程序呢?

15:57
Stuart Russell: Yes, so we want it to learn more, as I said, about our objectives. It'll only become more certain as it becomes more correct, so the evidence is there and it's going to be designed to interpret it correctly. It will understand, for example, that books are very biased in the evidence they contain. They only talk about kings and princes and elite white male people doing stuff. So it's a complicated problem, but as it learns more about our objectives it will become more and more useful to us.
斯图尔特·拉塞尔:是的,就像我说的,我们想要它更多地去学习我们的目标。只有当它的理解越来越正确时,它才会变得越来越确定——证据就在那里,而它的设计会让它正确地解读这些证据。比如说,它能够理解书中所包含的证据带有很强的偏见:书里只讲国王、王子和精英白人男性做的事情。所以这是一个复杂的问题,但随着它更深入地学习我们的目标,它对我们就会越来越有用。

16:34
CA: And you couldn't just boil it down to one law, you know, hardwired in: "if any human ever tries to switch me off, I comply. I comply."
CA:那你就不能把这一切归结为一条准则,硬编码进它的程序里吗:"如果任何人类想把我关掉,我就服从。我服从。"

16:43
SR: Absolutely not. That would be a terrible idea. So imagine that you have a self-driving car and you want to send your five-year-old off to preschool. Do you want your five-year-old to be able to switch off the car while it's driving along? Probably not. So it needs to understand how rational and sensible the person is. The more rational the person, the more willing you are to be switched off. If the person is completely random or even malicious, then you're less willing to be switched off.
SR:绝对不行,那将是一个很糟糕的主意。试想一下,你有一辆无人驾驶汽车,你想让它送你五岁的孩子去幼儿园。你希望你五岁的孩子能在汽车行驶过程中把它关掉吗?应该不会吧。所以它需要理解下指令的人有多理性、有多明智。这个人越理性,它就越愿意被关掉;如果这个人完全不可理喻,甚至怀有恶意,那它就越不愿意被关掉。

17:12
CA: All right. Stuart, can I just say, I really, really hope you figure this out for us. Thank you so much for that talk. That was amazing.
CA:好吧。斯图尔特,我得说 我真的希望你为我们 能把这一切研究出来, 很感谢你的演讲,太精彩了。

