The State of LLMs 2025: Progress, Problems, and Predictions (sebastianraschka.com)
1 point by nsainsbury 1 day ago | past | discuss
The State of LLMs 2025: Progress, Problems, and Predictions (sebastianraschka.com)
3 points by ModelForge 5 days ago | past | discuss
The State of LLMs 2025: Progress, Problems, and Predictions (sebastianraschka.com)
4 points by ibobev 6 days ago | past | discuss
The State of LLMs 2025: Progress, Problems, and Predictions (sebastianraschka.com)
9 points by vismit2000 6 days ago | past | discuss
New LLM Pre-Training and Post-Training Paradigms (sebastianraschka.com)
2 points by lr0 8 days ago | past | 1 comment
Understanding Encoder and Decoder LLMs (sebastianraschka.com)
1 point by jeffjeffbear 18 days ago | past
A Technical Tour of the DeepSeek Models from V3 to V3.2 (sebastianraschka.com)
23 points by ibobev 32 days ago | past | 1 comment
A Technical Tour of the DeepSeek Models from V3 to V3.2 (sebastianraschka.com)
5 points by mzl 32 days ago | past | 1 comment
Recommendations for Getting the Most Out of a Technical Book (sebastianraschka.com)
2 points by naves 33 days ago | past
A Technical Tour of the DeepSeek Models from V3 to V3.2 (sebastianraschka.com)
8 points by giuliomagnifico 33 days ago | past
Getting the Most Out of a Technical Book (sebastianraschka.com)
4 points by quietlearning 53 days ago | past
Beyond Standard LLMs (sebastianraschka.com)
1 point by vismit2000 58 days ago | past
Beyond Standard LLMs (sebastianraschka.com)
1 point by ibobev 61 days ago | past
A Researcher's Field Guide to Non-Standard LLM Architectures (sebastianraschka.com)
2 points by ModelForge 62 days ago | past
Understanding the 4 Main Approaches to LLM Evaluation (From Scratch) (sebastianraschka.com)
1 point by ibobev 82 days ago | past
Popular Attention Alternatives: GQA, MLA, SWA (sebastianraschka.com)
4 points by ModelForge 82 days ago | past
Multi-Head Latent Attention (sebastianraschka.com)
4 points by ModelForge 84 days ago | past
Understanding the 4 Main Approaches to LLM Evaluation (From Scratch) (sebastianraschka.com)
2 points by ibobev 87 days ago | past
LLM Evaluation from Scratch: Multiple Choice, Verifiers, Leaderboards, LLM Judge (sebastianraschka.com)
4 points by ModelForge 3 months ago | past
Understanding and Implementing Qwen3 from Scratch (sebastianraschka.com)
1 point by ibobev 3 months ago | past
GPT-OSS vs. Qwen3 and a detailed look at how things evolved since GPT-2 (sebastianraschka.com)
490 points by ModelForge 4 months ago | past | 97 comments
From GPT-2 to GPT-OSS: Analyzing the Architectural Advances (sebastianraschka.com)
3 points by mdp2021 4 months ago | past
PyTorch in One Hour: From Tensors to Training Neural Networks on Multiple GPUs (sebastianraschka.com)
1 point by Anon84 5 months ago | past
PyTorch in One Hour: From Tensors to Training Neural Networks on Multiple GPUs (sebastianraschka.com)
4 points by mariuz 5 months ago | past
LLM architecture comparison (sebastianraschka.com)
418 points by mdp2021 5 months ago | past | 24 comments
The Big LLM Architecture Comparison (sebastianraschka.com)
3 points by Quizzical4230 5 months ago | past
Comprehensive ML/AI questions and answers for interview prep (sebastianraschka.com)
2 points by yaiml 6 months ago | past
PyTorch in One Hour: From Tensors to Training Neural Networks on Multiple GPUs (sebastianraschka.com)
4 points by sbbq 6 months ago | past
Intermediate ML and AI questions and answers for interview prep (sebastianraschka.com)
3 points by sbbq 6 months ago | past
Understanding and Coding the KV Cache in LLMs from Scratch (sebastianraschka.com)
6 points by sbbq 6 months ago | past
