📰 Reddit r/MachineLearning
Articles from Reddit r/MachineLearning · 26 articles · Updated every 3 hours
Reddit r/MachineLearning
9h ago
TraceML update: structured bottleneck summaries + W&B / MLflow logging for PyTorch training [P]
Reddit r/MachineLearning
12h ago
We benchmarked TranslateGemma against 5 other LLMs on subtitle translation across 6 languages. At first glance the numbers told a clean story, but then human QA added a chapter. [D]
Reddit r/MachineLearning
14h ago
No agent maintained moral reasoning consistency across scenarios. Findings from a structured study with 11 agents on classic ethical dilemmas [R]
I've been working on agent behavior research for a product we're building, and one of the studies we ran recently produced results that I think are worth sharin
Reddit r/MachineLearning
16h ago
"I don't know!": Teaching neural networks to abstain with the HALO-Loss. [R]
Curren
Reddit r/MachineLearning
23h ago
I scaled a pure Spiking Neural Network (SNN) to 1.088B parameters from scratch. Ran out of budget, but here is what I found [R]
Hey everyone. I’m an 18yo indie dev, and I’ve been experimenting with Spiking Neural Networks (SNNs) for language modeling. A lot of papers (like SpikeBERT) men
Reddit r/MachineLearning
1d ago
Thinking Deeper, Not Longer: Depth-Recurrent Transformers for Compositional Generalization [R]
Paper: https://arxiv.org/abs/2603.21676 I found this interesting as another iteration of the TRM approach: Shows decent OOD generalization in 2/3 tasks (but why
Reddit r/MachineLearning
1d ago
[N] AMA Announcement: Max Welling (VAEs, GNNs, AI4Science & CuspAI)
We're thrilled to announce that Max Welling will be joining us for an AMA on Wednesday April 15th from 17:00 to 18:30 CEST (11am - 12:30pm EDT) Who is Max Welli
Reddit r/MachineLearning
1d ago
hands on workshop: context engineering for multi agent systems [D]
hey everyone, sharing this because it's directly relevant to what a lot of people here are building. packt publishing is running a hands on workshop on april 25
Reddit r/MachineLearning
1d ago
Mandatory In-Person Presentation in CVPR 2026 [D]
In the recent mail from CVPR PC about oral and poster decision
Reddit r/MachineLearning
1d ago
TurboOCR: 270–1200 img/s OCR with Paddle + TensorRT (C++/CUDA, FP16) [P]
I had about 940,000 PDFs to process. Running VLMs over a million pages is slow and expensive, and that gap is only getting worse as OCR moves toward transformer
Reddit r/MachineLearning
1d ago
Which conference/journal do you believe currently has the most fair and accurate review process? [D]
Major conference acceptance has become pretty much random and review quality is constantly dropping. There is always that one reviewer who understood nothing b
Reddit r/MachineLearning
1d ago
Trained a Qwen2.5-0.5B-Instruct bf16 model on Reddit post summarization task with GRPO [P]
Reddit r/MachineLearning
1d ago
[ECCV2026] Workshop notification of reject/accept [D]
Anyone else submitted a workshop proposal to ECCV this year? The deadline for getting a decision was yesterday, but we got no reply yet. submitted by /u/Locksmi
Reddit r/MachineLearning
1d ago
[ICML 2026] Scores for Position papers post discussion? [D]
I've been seeing mainly discussions about the main track. Any ACs or other reviewers here who know if the position paper track is following similar trends as th
Reddit r/MachineLearning
1d ago
Implementation details of Backpropagation in Siamese networks. [D]
Hey Folks, Could someone please share a correct implementation of backprop in siamese networks? The explanation in the original paper is not super detailed. I fou
Reddit r/MachineLearning
1d ago
[ICML 2026] Extending the deadline for reviewer final justifications while not extending for Author-AC comments was a huge mistake [D]
Just as the title says, I believe the decision to extend the deadline for reviewers to post their final justifications while not allowing authors to contact the
Reddit r/MachineLearning
2d ago
KIV: 1M token context window on a RTX 4070 (12GB VRAM), no retraining, drop-in HuggingFace cache replacement - Works with any model that uses DynamicCache [P]
Been working on this for a bit and figured it was ready to share. KIV (K-Indexed V Materialization) is a middleware layer that replaces the standard KV cache in
Reddit r/MachineLearning
2d ago
Educational PyTorch repo for distributed training from scratch: DP, FSDP, TP, FSDP+TP, and PP [P]
I put together a small educational repo that implements distributed training parallelism from scratch in PyTorch: https://github.com/shreyansh26/pytorch-distrib
Reddit r/MachineLearning
2d ago
Gary Marcus on the Claude Code leak [D]
Gary Marcus just tweeted: ... the way Anthropic built that kernel is straight out of classical symbolic AI. For example, it is in large part a big IF-THEN condi
Reddit r/MachineLearning
2d ago
LLMs learn backwards, and the scaling hypothesis is bounded. [D]
submitted by /u/preyneyv
Reddit r/MachineLearning
2d ago
Just did an analysis on ICLR 2025 vs 2026 scores and WOW [D]
Reddit r/MachineLearning
2d ago
"There's a new generation of empirical deep learning researchers, hacking away at whatever seems trendy, blowing with the wind" [D]
Reddit r/MachineLearning
3d ago
Post Rebuttal ICML Average Scores? [D]
I have an average of 3.5. One of the reviewers gave us a 2 by bringing up a new issue he hadn't mentioned in his initial review, taking that from another reviewe
Reddit r/MachineLearning
3d ago
Is "live AI video generation" a meaningful technical category or just a marketing term? [R]
Asking from a technical standpoint because I feel like the term is doing a lot of work in coverage of this space right now. Genuine real-time video inference, w