Decision theories, LW-style

There is a lot of confusion about Newcomb’s paradox, various decision theories discussed on LessWrong, free will, determinism, and so on. In this post, I try to make all of this less confusing.

I am not going to talk about the domain of decision theory in general. This post is purely about the parts LW is interested in.

tl;dr

Popular decision paradoxes are inherently contradictory – e.g. “what if you had free will but were still completely predictable” (Newcomb’s paradox), or “what if you really wanted agents to cooperate but also wanted them to be ruthless” (prisoner’s dilemma). Framed this way, useful questions become useless mind-benders.

Decision theories are expressly for agents that don’t have free will – choosing a decision theory for an agent literally means answering “what algorithm should this agent blindly follow?”. Furthermore, a lot of interesting questions about decision theories only arise in settings where more than one agent follows similar theories and you want something from them (like “cooperate without prior communication”).

Trying to design algorithms for those scenarios is much more productive than trying to spot contradictions in decision paradoxes where the roles of the algorithm designer and the algorithm executor coincide.

I. Newcomb’s paradox: what if you had free will but didn’t?

To mess with you, The Ultimate Predictor, also known as “Omega”, has maybe put an iPad into a box. Or maybe not.

On the top of the box, there is a note: if you punch yourself before opening the box, the box will have an iPad in it. You can take it and go home and brag to everyone and waste the next week watching funny British panel shows, especially Would I Lie To You?, alone, in the dark. (I would like to preemptively note that nothing of the sort has ever happened to me, except for the whole second part.)

Omega is not a god, but they are never wrong and cannot possibly ever be wrong. This you know for sure. Do you obey and punch yourself before opening the box?

It seems like you definitely should. However! Omega cannot change what’s inside the box, so let’s be Smart™. If there is an iPad, you should not punch yourself, because then you will have both the iPad and your dignity. If there isn’t an iPad, you should not punch yourself either, because why would you? So in both cases you can just skip that bit and open the box. Surely, it sounds like the proponents of this point of view – affectionately called “two-boxers” for complicated reasons – have a smart argument on their side.

However, those who obey Omega – “one-boxers” – have a good argument too, which is that they have iPads and the other side does not, despite being so very smart. So what should you do?

II. A rule of thumb: decision theories are for your kids, not for you

You are now living on the home planet of Omega. Not a day passes without being offered a box, or two boxes, or three boxes, and it gets old really fast. You resolve to never leave the house, and instead you have coaxed your kid to do your errands. Naturally, all iPads accumulated by the kid during those errands are your rightful property.

Now the question becomes: what should you teach the kid? And it’s a much easier question. You can teach the kid Evidential Decision Theory and off you go, while still believing Causal Decision Theory is smart and the previous one is dumb.
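To make the "what should you teach the kid?" framing concrete, here is a minimal sketch of the punch-yourself variant. The utilities (`IPAD_VALUE`, `PUNCH_COST`) and the policy names are made up for illustration; the only thing taken from the setup above is that Omega is never wrong, so the box is full exactly when the kid's policy is to punch.

```python
# Hypothetical utilities, chosen only for illustration (not from the post).
IPAD_VALUE = 500
PUNCH_COST = 10

def omega_fills_box(policy):
    """Omega is never wrong, so the box holds an iPad iff the policy punches."""
    return policy() == "punch"

def payoff(policy):
    """Net result for a kid who blindly follows `policy`."""
    box_has_ipad = omega_fills_box(policy)  # settled before the kid shows up
    action = policy()                       # what the kid actually does
    total = IPAD_VALUE if box_has_ipad else 0
    if action == "punch":
        total -= PUNCH_COST
    return total

def obedient():
    return "punch"

def smart():
    return "skip"

for policy in (obedient, smart):
    print(policy.__name__, payoff(policy))
# obedient 490
# smart 0
```

Seen from the parent's chair there is nothing paradoxical: you are picking a function, and one function simply returns more iPads.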

This is the deal with decision theories. Treat them as “what is the most useful behavior for a stupid agent in some stupidly convoluted world?”. They are not about you, the Mastermind Plotter, a free agent who is impossible to predict, yet somehow also possible to predict. They are about a kid, or maybe a self-driving car, or a religious community – i.e. an agent or set of agents that can be influenced.

III. Prisoner’s dilemma: what if you really wanted agents to cooperate but also wanted them to be ruthless?

Let’s apply this principle to another paradox, the prisoner’s dilemma. I don’t feel like inventing a silly framing for it, so you can read about it on Wikipedia.

Two players are playing a game:

If they both choose to cooperate, each gets a reward.
If one of them defects, the cheater gets a bigger reward and the nice player is, counterintuitively, punished.
If both of them defect, nobody gets anything.

The Smart™ reasoning goes like this: if the other player cooperates, you should defect and get a big reward. If the other player defects, you should also defect – to avoid punishment. Therefore, you should defect, period.

The dilemma lies in the fact that when the players are smart, the outcome is not. So being nice turns out to be better than being smart. Huh.
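The Smart™ argument is a dominance check, and it can be written down directly. The numbers below are hypothetical; they only need to respect the three rules above (reward for mutual cooperation, a bigger reward for the cheater, punishment for the nice player, nothing for mutual defection).

```python
# Hypothetical payoffs: (my move, their move) -> my payoff.
PAYOFF = {
    ("C", "C"): 3,   # both cooperate: each gets a reward
    ("D", "C"): 5,   # I defect on a cooperator: bigger reward
    ("C", "D"): -2,  # I cooperate with a defector: punished
    ("D", "D"): 0,   # both defect: nobody gets anything
}

def best_reply(their_move):
    """My payoff-maximising move if the other player's move is fixed."""
    return max("CD", key=lambda mine: PAYOFF[(mine, their_move)])

print({their: best_reply(their) for their in "CD"})
# {'C': 'D', 'D': 'D'}  -> defecting is the best reply either way

print(PAYOFF[("D", "D")], "<", PAYOFF[("C", "C")])
# 0 < 3  -> yet two "smart" players end up worse off than two nice ones
```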

What is the right decision algorithm? To figure this out, again, think about a kid.

If you have a kid, and you only care about their success in life, and nothing else in the world, you should teach them to be smart. Maybe even psychopathic, though it is debatable.

If you have two kids, however, you should teach them to be smart but be nice to each other, so that they will get rewards whenever they happen to play with each other. (Or, if the defector’s reward is much bigger than the cooperator’s punishment, they should take turns at defecting and exploit the system.)

Why? Because you care about both kids! If you only care about one of them, you can teach one of them to be ruthless and the other to be nice and turn the other cheek. If you care about both of them but want them to be ruthless, teach them to cooperate with each other and no one else. If you want them to be maximally ruthless and then you say “oh, but why don’t they cooperate?”, well, you want a contradiction.
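The two-kids case can be sketched the same way: the parent's objective is the sum of both kids' payoffs, not either kid's payoff alone. The function names and default numbers here are assumptions for illustration; `temptation` is the defector's reward.

```python
def payoffs(temptation, reward=3, punishment=-2):
    """Hypothetical payoff table: (move_a, move_b) -> (payoff_a, payoff_b)."""
    return {
        ("C", "C"): (reward, reward),
        ("D", "C"): (temptation, punishment),
        ("C", "D"): (punishment, temptation),
        ("D", "D"): (0, 0),
    }

def best_for_family(temptation):
    """Move pair that maximises the sum of both kids' payoffs."""
    table = payoffs(temptation)
    return max(table, key=lambda moves: sum(table[moves]))

print(best_for_family(temptation=5))   # ('C', 'C'): 3 + 3 beats 5 - 2
print(best_for_family(temptation=10))  # ('D', 'C'): 10 - 2 beats 3 + 3
```

With a modest temptation the family does best when both kids cooperate. Once the defector's reward is much bigger than the cooperator's punishment, one kid defecting against the other yields more in total, which is the "take turns at defecting" case from the previous paragraph.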

Again, this is the deal with decision theories. I’m not going to use a fancy decision theory to decide how to live my life, but I am very interested in a fancy decision theory that I can instill into the malleable minds of my kids, readers, self-driving cars, whatever. And being Smart™ doesn’t quite cut it here – this is how you get, for instance, a fleet of murderous cars. This is why we need something better, and this is why thinking about decision theories is worth spending time on.

IV. XOR blackmail problem

In the bottomless chest of decision theory edge cases, there is another wacky one that we have to deal with.

An agent hears a rumor that their house has been infested by termites, at a repair cost of $1,000,000. The next day, the agent receives a letter from the trustworthy predictor Omega saying:

“I know whether or not you have termites, and I have sent you this letter if and only if exactly one of the following is true: (i) the rumor is false, and you are going to pay me $1,000 upon receiving this letter; or (ii) the rumor is true, and you will not pay me upon receiving this letter.”

Thinking “if I do X, it must be Y” does not work here. Evidential Decision Theory wants you to pay up to somehow magically end up in the universe where the rumor is false, because according to the problem statement, deciding to pay must mean that the rumor is false.

“But didn’t the same thing happen in Newcomb’s paradox, and you said exactly the opposite thing?” Once more, think of a kid to make it easier.

Do you want to teach your kid to pay upon receiving a letter like this? Then Omega will send them letters when someone spreads false rumors about them, and they will be bleeding money. Furthermore, they will not get any letters when the rumors are true.

Do you want to teach your kid to not pay? Then they will get the letters only when the rumors are true, and won’t have to pay anything. Awesome! They don’t bleed money and they get to know when their house is infested with termites, straight from the infallible Omega.
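The comparison of the two kids can be checked by enumerating both worlds (termites or no termites) for each policy. This is a sketch: the dollar amounts come from the problem statement, while the function names and the prior probability `p_termites` are assumptions introduced here.

```python
REPAIR_COST = 1_000_000
BLACKMAIL = 1_000

def letter_sent(termites, pays_on_letter):
    """Omega's rule: exactly one of (i) no termites and pays, (ii) termites and does not pay."""
    case_i = (not termites) and pays_on_letter
    case_ii = termites and (not pays_on_letter)
    return case_i != case_ii

def expected_cost(pays_on_letter, p_termites):
    """Average loss for a kid following a fixed policy, given a prior on termites."""
    cost = 0.0
    for termites, prob in ((True, p_termites), (False, 1 - p_termites)):
        loss = REPAIR_COST if termites else 0   # termites do not care about the policy
        if pays_on_letter and letter_sent(termites, pays_on_letter):
            loss += BLACKMAIL                   # only the payer ever loses this
        cost += prob * loss
    return cost

p = 0.1  # hypothetical prior that the rumor is true
print("payer:    ", expected_cost(True, p))
print("non-payer:", expected_cost(False, p))
```

Both policies pay the same expected repair bill; the paying policy adds `(1 - p) * 1000` on top, because it is exactly the policy that attracts letters when the rumor is false.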

The difference between the two problems is that in Newcomb’s paradox, the contents of the box depend on your decision. In this problem, whether you have termites or not (i.e. what you care about) does not depend on your decision; the only thing that changes is whether you get a letter about it.

This way of thinking is called Functional Decision Theory. In particular, when you are confronted with entities that supposedly know exactly how you think, just go “What should I be thinking to get the best outcome? Okay, then I will think that.”

Note: I suspect that in real life it translates to “if you notice that people reward genuine kindness, try to figure out how to be genuinely kind and at the same time still screw people over, so that you get both the benefits of being kind and the benefits of screwing people over”. And it works!

V. Simulation as a causal mechanism

Going back to Newcomb’s paradox, it is still irritating that there is no causal link between “you decide to disobey Omega” and “the box is empty”. Maybe a kid can’t decide anything, but you definitely can, right?

The simulation argument could provide such a causal link.

If we know Omega is never ever wrong, they are probably simulating you to figure out what you will decide, kinda like in Black Mirror (e.g. Hang the DJ). So when you are deciding whether to obey, you don’t know if it’s actually you, or you-in-the-simulation. By making the right decision while in the simulation, you can help out your non-simulation version.

This even works with otherwise mind-boggling variants of Newcomb’s paradox, like a variant where everything is the same except that the box is made of glass. You literally look at the box, see the iPad in it, and yet somehow you still have to obey Omega to get the iPad. Why? Because you’re in the simulation, and by choosing to obey Omega, you will ensure that the real-world version of you will be presented with a full box, instead of an empty box.
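Here is one way to sketch the glass-box story in code, under the assumption (mine, not stated in the problem) that Omega predicts you by running your decision procedure on the sight of a full box and fills the real box only if that run obeys. The utilities and agent names are hypothetical.

```python
IPAD_VALUE = 500  # hypothetical utilities, for illustration only
PUNCH_COST = 10

def run_glass_box(agent):
    """Omega simulates the agent facing a full glass box, then stages the real thing."""
    simulated_action = agent(sees_ipad=True)    # the copy cannot tell it is simulated
    box_is_full = simulated_action == "punch"   # causal link: simulation -> box contents
    real_action = agent(sees_ipad=box_is_full)  # same code, now run for real
    payoff = IPAD_VALUE if box_is_full else 0
    if real_action == "punch":
        payoff -= PUNCH_COST
    return payoff

def always_obey(sees_ipad):
    return "punch"

def grab_if_visible(sees_ipad):
    # "The iPad is already there, why would I punch myself?"
    return "skip" if sees_ipad else "punch"

print(run_glass_box(always_obey))      # 490
print(run_glass_box(grab_if_visible))  # -10
```

The second agent never actually gets to see a full box: its simulated copy skips the punch, so the real box is empty, and the real agent then dutifully punches itself for nothing.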

A possible objection is: but what if Omega is just really good at psychology and statistics and so on, but doesn’t actually simulate anything? In this case...

VI. Determinism is a great answer to everything

There is no free will, it’s all an illusion, “what should you decide” is not a meaningful question. In fact, if Omega can look at your past life and predict what box you will choose, you personally don’t have much free will, sorry. “Omega probably just noticed that I always two-box when I’m having a grumpy day”. So why are you asking what you should choose, then? Are you having a grumpy day or not? It’s settled then.

Like, okay, you are staring at a glass box with an iPad in it. “Should” you obey Omega and punch yourself anyway? Or for people who have skipped my variant of Newcomb’s paradox entirely: should you one-box even when both boxes are transparent? The answer is: if you find yourself in this situation, you have learned something about yourself. Specifically, that you are a one-boxer. Or, to quote The Last Psychiatrist:

If some street hustler challenges you to a game of three card monte you don’t need to bother to play, just hand him the money, not because you’re going to lose but because you owe him for the insight: he selected you. Whatever he saw in you everyone sees in you, from the dumb blonde at the bar to your elderly father you’ve dismissed as out of touch, the only person who doesn’t see it is you, which is why you fell for it.

Note that this does not mean thinking about decision theories is meaningless – the question of “how should you indoctrinate your kid?” or “what should the self-driving car do?” is still relevant. The difference between you and the self-driving car is that the self-driving car does not have free will, but you supposedly do. Of course the question “what algorithm should I use?” becomes maddening then – you cannot, at the same time, (a) follow an algorithm and (b) have free will, aka the ability to overrule the algorithm whenever you feel like it.

VII. The psychopath button

Here is another illustration: the psychopath button problem.

Paul is debating whether to press the ‘kill all psychopaths’ button. It would, he thinks, be much better to live in a world with no psychopaths. Unfortunately, Paul is quite confident that only a psychopath would press such a button. Paul very strongly prefers living in a world with psychopaths to dying. Should Paul press the button?

If he does, he’s a psychopath and he shouldn’t have pressed it. If he doesn’t, he’s not a psychopath and he should have pressed it.

If you treat the button press as a choice between being a psychopath and not being one, the answer is clear: Paul should not press the button, i.e. should not be a psychopath.

If you assume that Paul does not have a choice, the question disappears completely – he will press the button if he’s a psychopath, and he won’t if he’s not. In both cases the consequences won’t be good, but that’s how life is sometimes.

The question is only a conundrum when you insist on it being a choice and not being a choice at the same time. Well, good luck with that.

VIII. Conclusion

This is how I recommend approaching decision problems.

If you want to figure out how your robots/kids/agents/cars should behave, mostly drop the philosophy. Look at the history of e.g. cooperation tournaments and what tends to work well there. Do your own experiments. Think about whether you care about the agents, or about the world that the agents are in, and in what proportion. Think about whether you can build a reliable way for agents to read each other’s intentions – e.g. humans can’t hide being angry because their faces get red, stuff like that. The technical analogues are trusted computing and remote attestation. Vitalik Buterin’s vision for Ethereum is ultimately a cooperation platform: inspectable agents, non-forgeable identities, zero-knowledge proofs.

If you want to figure out how you should behave, there are usually two separate questions: “what kind of behavior will win in this implausible scenario?” and “how do I justify this to my/someone’s intuition?”. The first one is often straightforward, and the second one is often resolvable with a combination of the determinism hypothesis and the simulation hypothesis.

Finally, if the problem happens to lie along the lines of “you will do X, but doing X is bad for you, so what should you do, huh?”, just reduce it to this form explicitly and banish it from your mind forever. There are more interesting things to think about.


Artyom Kazak
