📰 Dev.to · Tech_Nuggets

11h ago
Sampling strategies compared: temperature, top-p, top-k, min-p, and what actually works in production
A production-oriented comparison of LLM sampling parameters -- how temperature, top-p, top-k, and min-p reshape the output distribution, what combos actually wo

1d ago
Quantization formats compared: GGUF vs GPTQ vs AWQ vs NF4
A practical comparison of the four major LLM weight quantization formats — which one to use for CPU, GPU serving, and fine-tuning, with current version numbers

2d ago
LoRA and QLoRA fine-tuning: what they actually do under the hood
A practical walkthrough of LoRA and QLoRA -- how low-rank adaptation works, what NF4 quantization brings, and when to use each.

6d ago
KV cache quantization: what FP8/INT8 K and V actually buy you, and where they break
FP8 and INT8 KV caches cut attention state ~50%, but they shift the target model's logit distribution — and that can quietly halve the gains from speculative de