Introduction To Mechanistic Interpretability AI Safety Fundamentals: Alignment podcast

Added two years ago
Content provided by BlueDot Impact. All podcast content, including episodes, graphics, and podcast descriptions, is uploaded and provided directly by BlueDot Impact or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here: https://th.player.fm/legal

AI Safety Fundamentals: Alignment »
Introduction to Mechanistic Interpretability

About one year ago, 11:45

Archived series ("Inactive feed" status)

When? This feed was archived on February 21, 2025 21:08 (2M ago). Last successful fetch was on January 02, 2025 12:05 (4M ago)

Why? Inactive feed status. Our servers were unable to retrieve a valid podcast feed for a sustained period.

What now? You might be able to find a more up-to-date version using the search function. This series will no longer be checked for updates. If you believe this to be in error, please check if the publisher's feed link below is valid and contact support to request the feed be restored or if you have any other concerns about this.

This introduction covers common mechanistic interpretability (mech interp) concepts to prepare you for the rest of this session's resources.

Original text: https://aisafetyfundamentals.com/blog/introduction-to-mechanistic-interpretability/
Author(s): Sarah Hastings-Woodhouse

A podcast by BlueDot Impact.
Learn more on the AI Safety Fundamentals website.

Chapters

1. Introduction to Mechanistic Interpretability (00:00:00)

2. Why might mechanistic interpretability be useful? (00:01:16)

3. Looking inside neural networks (00:03:34)

4. What makes mechanistic interpretability hard? (00:06:33)

5. Addressing polysemanticity (00:08:34)

85 episodes

#Tech #Society #Philosophy #Blue Dot Impact

All episodes

Introduction to Mechanistic Interpretability (11:45)
15 weeks ago

This introduction covers common mechanistic interpretability (mech interp) concepts to prepare you for the rest of this session's resources.
Original text: https://aisafetyfundamentals.com/blog/introduction-to-mechanistic-interpretability/
Author(s): Sarah Hastings-Woodhouse
A podcast by BlueDot Impact. Learn more on the AI Safety Fundamentals website.

We Need a Science of Evals (20:12)
15 weeks ago

This lays out a number of open questions in what the author calls a 'Science of Evals'.
Original text: https://www.apolloresearch.ai/blog/we-need-a-science-of-evals
Author(s): Apollo Research blog
A podcast by BlueDot Impact. Learn more on the AI Safety Fundamentals website.

Illustrating Reinforcement Learning from Human Feedback (RLHF) (22:32)
39 weeks ago

This more technical article explains the motivations for a system like RLHF and adds concrete details on how the RLHF approach is applied to neural networks. While reading, consider which parts of the technical implementation correspond to the 'values coach' and 'coherence coach' from the previous video.
A podcast by BlueDot Impact. Learn more on the AI Safety Fundamentals website.

Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback (32:19)
39 weeks ago

This paper explains Anthropic's constitutional AI approach, which is largely an extension of RLHF but with AIs replacing human demonstrators and human evaluators. Everything in this paper is relevant to this week's learning objectives, and we recommend you read it in its entirety. It summarises the limitations of conventional RLHF, explains the constitutional AI approach, shows how it performs, and indicates where future research might be directed. If you are in a rush, focus on sections 1.2, 3.1, 3.4, 4.1, 6.1, 6.2.
A podcast by BlueDot Impact. Learn more on the AI Safety Fundamentals website.

2 years ago, 12:47

In 1972, the Nobel prize-winning physicist Philip Anderson wrote the essay "More Is Different". In it, he argues that quantitative changes can lead to qualitatively different and unexpected phenomena. While he focused on physics, one can find many examples of More is Different in other domains as well, including biology, economics, and computer science. Some examples of More is Different include:

Uranium. With a bit of uranium, nothing special happens; with a large amount of uranium packed densely enough, you get a nuclear reaction.
DNA. Given only small molecules such as calcium, you can't meaningfully encode useful information; given larger molecules such as DNA, you can encode a genome.
Water. Individual water molecules aren't wet. Wetness only occurs due to the interaction forces between many water molecules interspersed throughout a fabric (or other material).

Original text: https://bounded-regret.ghost.io/future-ml-systems-will-be-qualitatively-different/
Narrated for AI Safety Fundamentals by Perrin Walker of TYPE III AUDIO.
A podcast by BlueDot Impact. Learn more on the AI Safety Fundamentals website.
