Simple Discord Channel Fetcher

import nest_asyncio
from dotenv import load_dotenv

nest_asyncio.apply()
load_dotenv()

CHANNEL_ID = 1369370266899185746  # Replace with your channel ID
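The fetch helpers used below are not defined in this notebook. A minimal sketch of how a `fetch_channel_complete_history`-style coroutine could be built on discord.py (the function name, the `DISCORD_TOKEN` environment variable, and the returned dict shape are assumptions, not the notebook's actual implementation):

```python
import os


async def fetch_history_sketch(channel_id: int) -> dict:
    """Hypothetical sketch of a channel-history fetcher built on discord.py.

    Assumes discord.py is installed and a DISCORD_TOKEN env var is set;
    the real helper used in this notebook is not shown here.
    """
    import discord  # deferred so the sketch can be defined without discord.py installed

    intents = discord.Intents.default()
    intents.message_content = True  # required to read message text
    client = discord.Client(intents=intents)
    result: dict = {"messages": [], "threads": {}}

    @client.event
    async def on_ready():
        print(f"Connected as {client.user}")
        channel = client.get_channel(channel_id)
        # Walk the main channel history oldest-first
        async for msg in channel.history(limit=None, oldest_first=True):
            result["messages"].append(msg)
        # Then each active thread's history
        for thread in channel.threads:
            result["threads"][thread.name] = [
                m async for m in thread.history(limit=None)
            ]
        await client.close()

    await client.start(os.environ["DISCORD_TOKEN"])
    return result
```

With `nest_asyncio.apply()` in place, this kind of coroutine can be awaited directly from a notebook cell even though Jupyter already runs an event loop.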
You can use it like this:
# Fetch with default behavior (save and print)
channel_data = await fetch_channel_complete_history(CHANNEL_ID, print_summary=False)
len(channel_data['messages'])
Connected as hamml#3190
176
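The summary printed later in this notebook suggests `channel_data` is a dict carrying the message list, a per-thread message mapping, and channel/guild metadata. A small helper that reproduces that summary from such a dict (the key names are inferred from the printed output, not taken from the helper's source):

```python
def summarize_channel(channel_data: dict) -> str:
    """Render a summary like the one print_summary=True emits.

    The dict shape (channel_name, guild_name, messages, threads keys)
    is an assumption based on this notebook's printed output.
    """
    lines = [
        f"=== Channel: #{channel_data['channel_name']} ===",
        f"Guild: {channel_data['guild_name']}",
        f"Total messages: {len(channel_data['messages'])}",
        f"Total threads: {len(channel_data['threads'])}",
    ]
    # One line per thread, mirroring the "Thread '...': N messages" output
    for name, msgs in channel_data["threads"].items():
        lines.append(f"Thread '{name}': {len(msgs)} messages")
    return "\n".join(lines)
```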
# Fetch and simplify in one call
original, simplified = await fetch_discord_msgs(CHANNEL_ID, print_summary=True)
Fetching complete channel data...
Connected as hamml#3190
Fetching from #lesson-3-implementing-effective-evaluations in AI Evals For Engineers & Technical PMs
Fetching main channel messages...
Fetching thread: question -- when we fix the prompt, and
Found 2 thread messages
Fetching thread: For the second sql constraint error, why
Found 2 thread messages
Fetching thread: question - how do we assess that we
Found 7 thread messages
Fetching thread: @Hamel @sh_reya But, how do we even say
Found 3 thread messages
Fetching thread: @Hamel @sh_reya
Found 3 thread messages
Fetching thread: where do ML models like NLI, etc fall
Found 2 thread messages
Fetching thread: What are your thoughts on measuring
Found 2 thread messages
Fetching thread: Can you explain what you meant by using
Found 3 thread messages
Fetching thread: Would you recommend using LLM-as-judge
Found 3 thread messages
Fetching thread: Is it okay to have a related LLM (
Found 2 thread messages
Fetching thread: so assuming more expensive = better
Found 7 thread messages
Fetching thread: Has anyone had a good experience using
Found 2 thread messages
Fetching thread: so you have an LLM prompt for each
Found 8 thread messages
Fetching thread: Which LLM do we use as the judge? the
Found 2 thread messages
Fetching thread: It's possible that you'll address this
Found 2 thread messages
Fetching thread: oh no... 😞 my rule of thumb was to us
Found 3 thread messages
Fetching thread: Is it better to run one “judgement pass”
Found 2 thread messages
Fetching thread: any reason why you would "ask the LLM
Found 4 thread messages
Fetching thread: Silly question, how do folks structure
Found 2 thread messages
Fetching thread: How about assigning scores like
Found 2 thread messages
Fetching thread: When writing these llm as a judge
Found 2 thread messages
Fetching thread: How to approach time sensitive jobs?
Found 2 thread messages
Fetching thread: Once we have enough examples that fail/
Found 5 thread messages
Fetching thread: Do we go through the whole Analyze
Found 2 thread messages
Fetching thread: How much total data for this training an
Found 4 thread messages
Fetching thread: what are reasonable accuracy thresholds
Found 4 thread messages
Fetching thread: Ok, you've trained a judge, you've used
Found 2 thread messages
Fetching thread: how many examples in the prompt is too
Found 4 thread messages
Fetching thread: It’s fun that we’re OVERFITTING even if
Found 3 thread messages
Fetching thread: so basically we optimizing for recall
Found 2 thread messages
Fetching thread: @Hamel @sh_reya I have one more question
Found 4 thread messages
Fetching thread: Given that results are often on
Found 2 thread messages
Fetching thread: Is it better to make a confusion matrix?
Found 4 thread messages
Fetching thread: I would use LLMs from different families
Found 3 thread messages
Fetching thread: I have seen people using spearmen
Found 2 thread messages
Fetching thread: If LLMs struggle to analyze multiple
Found 2 thread messages
Fetching thread: 👏 for emphasizing that these metrics
Found 2 thread messages
Fetching thread: there was recently an article published
Found 2 thread messages
Fetching thread: Yeah, this is very true. You should
Found 12 thread messages
Fetching thread: i guess the question is - why is it
Found 3 thread messages
Fetching thread: why is it important to only run the
Found 2 thread messages
Fetching thread: What is the next step if you notice in
Found 2 thread messages
Fetching thread: What's the use of the training set? I
Found 11 thread messages
Fetching thread: Could you please share any research
Found 2 thread messages
Fetching thread: Is this adjustment (in this use case,
Found 2 thread messages
Fetching thread: What if the unlabelled traces do not
Found 2 thread messages
Fetching thread: Are we now testing the accuracy of the
Found 2 thread messages
Fetching thread: for the bootstrap CI estimates, if our
Found 2 thread messages
Fetching thread: I also found this super helpful to
Found 2 thread messages
Fetching thread: if TPR % and TNR % are both 100%, why is
Found 3 thread messages
Fetching thread: It's amazing how the corrections look
Found 2 thread messages
Fetching thread: is anyone feeling overwhelmed, i was
Found 2 thread messages
Fetching thread: fwiw, when you are reading back through
Found 2 thread messages
Fetching thread: I've spent the past 6 months at work
Found 4 thread messages
Fetching thread: for the precision/ recall balance, it's
Found 2 thread messages
Fetching thread: isnt TPR "true positive rate"? what am i
Found 2 thread messages
Fetching thread: Do you think we should have more than
Found 3 thread messages
Fetching thread: I know that we teased this idea, but I
Found 3 thread messages
Fetching thread: @Hamel @sh_reya Thanks a lot for the
Found 3 thread messages
Fetching thread: Thank you @Hamel and @sh_reya for
Found 8 thread messages
Fetching thread: Whaat
Found 4 thread messages
Fetching thread: Cohen’s Kappa
Found 3 thread messages
Fetched 176 main messages
Total threads found: 62
=== Channel: #lesson-3-implementing-effective-evaluations ===
Guild: AI Evals For Engineers & Technical PMs
Total messages: 176
Total threads: 62
Thread 'question -- when we fix the prompt, and': 2 messages
Thread 'For the second sql constraint error, why': 2 messages
Thread 'question - how do we assess that we': 7 messages
Thread '@Hamel @sh_reya But, how do we even say': 3 messages
Thread '@Hamel @sh_reya': 3 messages
Thread 'where do ML models like NLI, etc fall': 2 messages
Thread 'What are your thoughts on measuring': 2 messages
Thread 'Can you explain what you meant by using': 3 messages
Thread 'Would you recommend using LLM-as-judge': 3 messages
Thread 'Is it okay to have a related LLM (': 2 messages
Thread 'so assuming more expensive = better': 7 messages
Thread 'Has anyone had a good experience using': 2 messages
Thread 'so you have an LLM prompt for each': 8 messages
Thread 'Which LLM do we use as the judge? the': 2 messages
Thread 'It's possible that you'll address this': 2 messages
Thread 'oh no... 😞 my rule of thumb was to us': 3 messages
Thread 'Is it better to run one “judgement pass”': 2 messages
Thread 'any reason why you would "ask the LLM': 4 messages
Thread 'Silly question, how do folks structure': 2 messages
Thread 'How about assigning scores like': 2 messages
Thread 'When writing these llm as a judge': 2 messages
Thread 'How to approach time sensitive jobs?': 2 messages
Thread 'Once we have enough examples that fail/': 5 messages
Thread 'Do we go through the whole Analyze': 2 messages
Thread 'How much total data for this training an': 4 messages
Thread 'what are reasonable accuracy thresholds': 4 messages
Thread 'Ok, you've trained a judge, you've used': 2 messages
Thread 'how many examples in the prompt is too': 4 messages
Thread 'It’s fun that we’re OVERFITTING even if': 3 messages
Thread 'so basically we optimizing for recall': 2 messages
Thread '@Hamel @sh_reya I have one more question': 4 messages
Thread 'Given that results are often on': 2 messages
Thread 'Is it better to make a confusion matrix?': 4 messages
Thread 'I would use LLMs from different families': 3 messages
Thread 'I have seen people using spearmen': 2 messages
Thread 'If LLMs struggle to analyze multiple': 2 messages
Thread '👏 for emphasizing that these metrics': 2 messages
Thread 'there was recently an article published': 2 messages
Thread 'Yeah, this is very true. You should': 12 messages
Thread 'i guess the question is - why is it': 3 messages
Thread 'why is it important to only run the': 2 messages
Thread 'What is the next step if you notice in': 2 messages
Thread 'What's the use of the training set? I': 11 messages
Thread 'Could you please share any research': 2 messages
Thread 'Is this adjustment (in this use case,': 2 messages
Thread 'What if the unlabelled traces do not': 2 messages
Thread 'Are we now testing the accuracy of the': 2 messages
Thread 'for the bootstrap CI estimates, if our': 2 messages
Thread 'I also found this super helpful to': 2 messages
Thread 'if TPR % and TNR % are both 100%, why is': 3 messages
Thread 'It's amazing how the corrections look': 2 messages
Thread 'is anyone feeling overwhelmed, i was': 2 messages
Thread 'fwiw, when you are reading back through': 2 messages
Thread 'I've spent the past 6 months at work': 4 messages
Thread 'for the precision/ recall balance, it's': 2 messages
Thread 'isnt TPR "true positive rate"? what am i': 2 messages
Thread 'Do you think we should have more than': 3 messages
Thread 'I know that we teased this idea, but I': 3 messages
Thread '@Hamel @sh_reya Thanks a lot for the': 3 messages
Thread 'Thank you @Hamel and @sh_reya for': 8 messages
Thread 'Whaat': 4 messages
Thread 'Cohen’s Kappa': 3 messages
Complete channel data saved to: discord_channel_lesson-3-implementing-effective-evaluations_20250528_152711.json
File size: 296.1 KB
=== Simplified Channel: lesson-3-implementing-effective-evaluations ===
Total conversations: 152
--- Conversation 1 ---
Main: davidh5633
Seems relevant to this course topic:
Evaluation Driven Development for Agentic Systems
https://www....
--- Conversation 2 ---
Main: davidh5633
I gather that alignment for the LLM-Judge is a larger up-front process as discussed through chapter ...
--- Conversation 3 ---
Main: davidh5633
Overall, this chapter definitely pointed out a 'hidden cost' of adding features to an LLM applicatio...
Simplified data saved to: discord_simplified_lesson-3-implementing-effective-evaluations_20250528_152711.json
File size: 74.2 KB
✅ Successfully fetched channel data!
📄 Original data: 176 messages, 62 threads
💬 Simplified data: 152 conversations
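The simplified output above appears to collapse each top-level message into a "conversation" with an author and a truncated content preview. A sketch of that transformation (the `author`/`content` field names and the truncation length are assumptions based on the printed output, not the actual `fetch_discord_msgs` internals):

```python
def simplify_messages(messages: list[dict], preview_len: int = 100) -> list[dict]:
    """Collapse raw message dicts into conversation previews.

    Each message is assumed to carry 'author' and 'content' keys; the real
    simplification helper used in this notebook is not shown here.
    """
    conversations = []
    for msg in messages:
        content = msg["content"]
        # Truncate long messages, matching the "..." previews printed above
        preview = content[:preview_len] + ("..." if len(content) > preview_len else "")
        conversations.append({"author": msg["author"], "preview": preview})
    return conversations
```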
# from contextpack import ctx_fastcore
# fcdos = ctx_fastcore.fc_llms_ctx.get()
# from httpx import get
# script_docs = get('https://raw.githubusercontent.com/AnswerDotAI/fastcore/0fad28b8a20e437c11d70aa697659fc675656864/fastcore/script.py').text