Ian Osband TalkRL: The Reinforcement Learning podcast

Artwork

Reinforcement Learning Machine Learning Robin Ranjit Singh Chauhan Artificial Intelligence Tech

เนื้อหาจัดทำโดย Robin Ranjit Singh Chauhan เนื้อหาพอดแคสต์ทั้งหมด รวมถึงตอน กราฟิก และคำอธิบายพอดแคสต์ได้รับการอัปโหลดและจัดหาให้โดยตรงจาก Robin Ranjit Singh Chauhan หรือพันธมิตรแพลตฟอร์มพอดแคสต์ของพวกเขา หากคุณเชื่อว่ามีบุคคลอื่นใช้งานที่มีลิขสิทธิ์ของคุณโดยไม่ได้รับอนุญาต คุณสามารถปฏิบัติตามขั้นตอนที่แสดงไว้ที่นี่ https://th.player.fm/legal

TalkRL: The Reinforcement Learning Podcast « »
Ian Osband

1+ y ago 1:08:26

แบ่งปัน

MP3•หน้าโฮมของตอน

เนื้อหาจัดทำโดย Robin Ranjit Singh Chauhan เนื้อหาพอดแคสต์ทั้งหมด รวมถึงตอน กราฟิก และคำอธิบายพอดแคสต์ได้รับการอัปโหลดและจัดหาให้โดยตรงจาก Robin Ranjit Singh Chauhan หรือพันธมิตรแพลตฟอร์มพอดแคสต์ของพวกเขา หากคุณเชื่อว่ามีบุคคลอื่นใช้งานที่มีลิขสิทธิ์ของคุณโดยไม่ได้รับอนุญาต คุณสามารถปฏิบัติตามขั้นตอนที่แสดงไว้ที่นี่ https://th.player.fm/legal

Ian Osband is a Research scientist at OpenAI (ex DeepMind, Stanford) working on decision making under uncertainty.

We spoke about:

- Information theory and RL

- Exploration, epistemic uncertainty and joint predictions

- Epistemic Neural Networks and scaling to LLMs

Featured References

Reinforcement Learning, Bit by Bit
Xiuyuan Lu, Benjamin Van Roy, Vikranth Dwaracherla, Morteza Ibrahimi, Ian Osband, Zheng Wen

From Predictions to Decisions: The Importance of Joint Predictive Distributions

Zheng Wen, Ian Osband, Chao Qin, Xiuyuan Lu, Morteza Ibrahimi, Vikranth Dwaracherla, Mohammad Asghari, Benjamin Van Roy

Epistemic Neural Networks

Ian Osband, Zheng Wen, Seyed Mohammad Asghari, Vikranth Dwaracherla, Morteza Ibrahimi, Xiuyuan Lu, Benjamin Van Roy

Approximate Thompson Sampling via Epistemic Neural Networks

Ian Osband, Zheng Wen, Seyed Mohammad Asghari, Vikranth Dwaracherla, Morteza Ibrahimi, Xiuyuan Lu, Benjamin Van Roy

Additional References

Thesis defence, Ian Osband
Homepage, Ian Osband
Epistemic Neural Networks at Stanford RL Forum
Behaviour Suite for Reinforcement Learning, Osband et al 2019
Efficient Exploration for LLMs, Dwaracherla et al 2024

… continue reading

73 ตอน

#Reinforcement Learning #Machine Learning #Robin Ranjit Singh Chauhan #Artificial Intelligence #Tech

Artwork

Ian Osband

TalkRL: The Reinforcement Learning Podcast

84 subscribers

published 1+ y ago

แบ่งปัน

MP3•หน้าโฮมของตอน

เนื้อหาจัดทำโดย Robin Ranjit Singh Chauhan เนื้อหาพอดแคสต์ทั้งหมด รวมถึงตอน กราฟิก และคำอธิบายพอดแคสต์ได้รับการอัปโหลดและจัดหาให้โดยตรงจาก Robin Ranjit Singh Chauhan หรือพันธมิตรแพลตฟอร์มพอดแคสต์ของพวกเขา หากคุณเชื่อว่ามีบุคคลอื่นใช้งานที่มีลิขสิทธิ์ของคุณโดยไม่ได้รับอนุญาต คุณสามารถปฏิบัติตามขั้นตอนที่แสดงไว้ที่นี่ https://th.player.fm/legal

Ian Osband is a Research scientist at OpenAI (ex DeepMind, Stanford) working on decision making under uncertainty.

We spoke about:

- Information theory and RL

- Exploration, epistemic uncertainty and joint predictions

- Epistemic Neural Networks and scaling to LLMs

Featured References

Reinforcement Learning, Bit by Bit
Xiuyuan Lu, Benjamin Van Roy, Vikranth Dwaracherla, Morteza Ibrahimi, Ian Osband, Zheng Wen

From Predictions to Decisions: The Importance of Joint Predictive Distributions

Zheng Wen, Ian Osband, Chao Qin, Xiuyuan Lu, Morteza Ibrahimi, Vikranth Dwaracherla, Mohammad Asghari, Benjamin Van Roy

Epistemic Neural Networks

Ian Osband, Zheng Wen, Seyed Mohammad Asghari, Vikranth Dwaracherla, Morteza Ibrahimi, Xiuyuan Lu, Benjamin Van Roy

Approximate Thompson Sampling via Epistemic Neural Networks

Ian Osband, Zheng Wen, Seyed Mohammad Asghari, Vikranth Dwaracherla, Morteza Ibrahimi, Xiuyuan Lu, Benjamin Van Roy

Additional References

Thesis defence, Ian Osband
Homepage, Ian Osband
Epistemic Neural Networks at Stanford RL Forum
Behaviour Suite for Reinforcement Learning, Osband et al 2019
Efficient Exploration for LLMs, Dwaracherla et al 2024

… continue reading

73 ตอน

#Reinforcement Learning #Machine Learning #Robin Ranjit Singh Chauhan #Artificial Intelligence #Tech

ทุกตอน

×

ขอต้อนรับสู่ Player FM!

Player FM กำลังหาเว็บ

เปิดฟังกว่า 500+ หัวข้อ

คู่มืออ้างอิงด่วน

พอดคาสต์ยอดนิยม

The Secret Sauce

สัพเพHEYไรว้าาา

Geek Forever’s Podcast

วอยซ์ ออฟ อเมริกา

ข่าวสดสายตรงจากวีโอเอ ภาคภาษาไทย 8:30–9:00 น. - วอยซ์ ออฟ อเมริกา

ปลดล็อกกับหมอเวช

WiTcast (ฟีดเก่า ไม่ใช้แล้ว)

ฟังรายการนี้ในขณะที่คุณสำรวจ