📰 Reddit r/deeplearning
Articles from Reddit r/deeplearning · 29 articles · Updated every 3 hours
Reddit r/deeplearning
9h ago
How do you guys handle writing your essays?
Writing essays is so hard for me. I’ve never been good at writing, and honestly, putting my thoughts together in a logical way is a nightmare. I’ve tried so many
Reddit r/deeplearning
9h ago
Introducing Code-Mixed Chain-of-Thought — Teaching Gemma 4 31B to reason bilingually cut thinking tokens by 40% [Mnemic Glorious 31B]
submitted by /u/superman_27
Reddit r/deeplearning
10h ago
Transformer regression model overfits on single sample but fails to further reduce loss on a 50-sample dataset
My task is to forecast the number of upvotes a Reddit post has at time t after posting (t being the number of hours since it was posted), based on text/title/time t, curr
Reddit r/deeplearning
10h ago
Built a Japanese ASR benchmark because existing ones can't measure quality differences properly
submitted by /u/holotherapper
Reddit r/deeplearning
12h ago
Fastest training / fine-tuning framework
submitted by /u/deepnet101
Reddit r/deeplearning
12h ago
Is this really AI generated MUSIC VIDEO 🫨
submitted by /u/BeltVarious9958
Reddit r/deeplearning
12h ago
OpenAI acquired Hiro Finance 🔥
submitted by /u/adzamai
Reddit r/deeplearning
13h ago
Help with a build: Training models on high-res images (2000x2500px)
Hi everyone, I’ve been tasked with putting together a PC build for my company to train neural networks. I’m not an expert in the field, so I could use some eyes
Reddit r/deeplearning
14h ago
“Found a very useful playlist for learning document classification with LayoutLMv3. Worth watching if you’re into OCR/document AI.”
submitted by /u/Imaginary_Step_6165
Reddit r/deeplearning
14h ago
https://www.youtube.com/watch?v=PW2wi1C-tM0
submitted by /u/Due_Pace_4325
Reddit r/deeplearning
15h ago
DinoDS isn’t “more scraped data.” It’s behavior engineering for LLMs.
I don’t think the interesting question anymore is “how much data did you scrape?” It’s: what exact model behavior did you engineer? That’s how we’ve been thinki
Reddit r/deeplearning
18h ago
Is this for REAL ?????
submitted by /u/BeltVarious9958
Reddit r/deeplearning
18h ago
Optimizers Explained Visually | SGD, Momentum, AdaGrad, RMSProp & Adam
Optimizers Explained Visually in under 4 minutes — SGD, Momentum, AdaGrad, RMSProp, and Adam all broken down with animated loss landscapes so you can see exactl
Reddit r/deeplearning
20h ago
kontext-brain: ontology-graph context retrieval that beats RAG on token efficiency (+54% reduction)
submitted by /u/FantasticSeaweed2342
Reddit r/deeplearning
20h ago
We built a pre-generation LLM guardrail that blocks prompt injection at the residual stream level, before the model outputs anything [Mistral 7B, 0% FP, 100% detection]
Most LLM monitors work like this: the model generates a response, you check if it’s bad, you log it. By the time you alert, the output already exists. We built
Reddit r/deeplearning
21h ago
ICML 2026 acceptance threshold vs. what we have seen in NeurIPS 2025 [D]
After the rebuttals our paper has a borderline average score of 3.75. I thought the odds weren't very bad (given what Copilot says) until I saw last year's neur
Reddit r/deeplearning
22h ago
MIRAS framework unifies Transformers, Mamba, RetNet, and Titans as four design choices over associative memory
submitted by /u/thisguy123123
Reddit r/deeplearning
1d ago
Trained a 125M LM from scratch instead of fine-tuning GPT-2 — releasing weights + SFT framework for others to build on
submitted by /u/Kill_Streak308
Reddit r/deeplearning
1d ago
👋 Welcome to r/artificial_intellig - First, introduce yourself and read the guidelines!
AI, ARTIFICIAL INTELLIGENCE, HARDWARE, AGENTS, INFERENCE, AUTOMATIONS, N8N, TESLA CARDS, AI ACCELERATOR CARDS, RAM, QUANTIZZ
Reddit r/deeplearning
1d ago
[ Removed by Reddit ]
[ Removed by Reddit on account of violating the content policy. ] submitted by /u/abe_abou
Reddit r/deeplearning
1d ago
I built a platform that turns anything you want to learn into a course!
Hey everyone, I've been a Coursera user for years and kept running into the same wall: every course is built for a generic learner. You get "Python 101" but it'
Reddit r/deeplearning
1d ago
Benchmarked Gemma 4 E2B: The 2B model beat every larger sibling on multi-turn (70%)
Tested Gemma 4 E2B across 10 enterprise task suites against Gemma 2 2B, Gemma 3 4B, Gemma 4 E4B, and Gemma 3 12B. Run locally on Apple Silicon. Overall ranking
Reddit r/deeplearning
1d ago
Created a dataset system for training real LLM behaviors (not just prompts)
Most LLM dataset discussions still revolve around size, coverage, or “high-quality text,” but in practice the real failure mode shows up later when you actually
Reddit r/deeplearning
1d ago
How VLAs Work - Mathematics for Engineers
submitted by /u/Nice-Dragonfly-4823