Arguments For/against Scheming That Focus On The Path SGD Takes (Section 3 Of "Scheming AIs") Joe Carlsmith Audio podcast


Content provided by Joe Carlsmith. All podcast content, including episodes, graphics, and podcast descriptions, is uploaded and provided directly by Joe Carlsmith or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here: https://th.player.fm/legal

Joe Carlsmith Audio
Arguments for/against scheming that focus on the path SGD takes (Section 3 of "Scheming AIs")

2y ago 29:03


This is section 3 of my report “Scheming AIs: Will AIs fake alignment during training in order to get power?”

Text of the report here: https://arxiv.org/abs/2311.08379
Summary of the report here: https://joecarlsmith.com/2023/11/15/new-report-scheming-ais-will-ais-fake-alignment-during-training-in-order-to-get-power
Audio summary here: https://joecarlsmithaudio.buzzsprout.com/2034731/13969977-introduction-and-summary-of-scheming-ais-will-ais-fake-alignment-during-training-in-order-to-get-power


Chapters

1. Arguments for/against scheming that focus on the path SGD takes (Section 3 of "Scheming AIs") (00:00:00)

2. 3. Arguments for/against scheming that focus on the path that SGD takes (00:00:35)

3. 3.1 The training-game-independent proxy-goals story (00:02:38)

4. 3.2 The “nearest max-reward goal” story (00:07:14)

5. 3.2.1 Barriers to schemer-like modifications from SGD’s incrementalism (00:12:21)

6. 3.2.2 Which model is “nearest”? (00:13:53)

7. 3.2.2.1 The common-ness of schemer-like goals in goal space (00:14:28)

8. 3.2.2.2 The nearness of non-schemer goals (00:17:43)

9. 3.2.2.3 The relevance of messy goal-directedness to nearness (00:22:53)

10. 3.2.3 Overall take on the “nearest max-reward goal” argument (00:24:30)

11. 3.3 The possible relevance of properties like simplicity and speed to the path SGD takes (00:25:22)

12. 3.4 Overall assessment of arguments that focus on the path SGD takes (00:27:33)

66 episodes

#Society #Philosophy #Joe

