Daily Research Feed
Tracking trending AI/ML papers and models across arXiv, HuggingFace, and AlphaXiv.
Saturday, May 30, 2026
AgentDoG 1.5 (81 upvotes) proposes lightweight safety alignment for open-world AI agents; Qwen-VLA unifies manipulation and navigation across robot embodiments; VoxCPM surges 1,815 stars/day for tokenizer-free multilingual TTS
agent-safety-alignment, embodied-foundation-models, unified-retrieval-systems, speech-synthesis-renaissance, efficient-lora-techniques, video-world-models
14 papers, 12 models, 15 repos
Friday, May 29, 2026
LaRA detects data contamination in RL post-training via layer-wise representation analysis; NAVA from Baidu achieves native audio-visual alignment for joint generation; agent skills ecosystem continues explosive growth with Understand-Anything gaining 3,776 stars/day
rl-training-integrity, native-multimodal-generation, agent-skills-ecosystem, cinematic-video-generation, compact-edge-models, ai-output-quality-alignment
5 papers, 15 models, 15 repos
Thursday, May 28, 2026
ResearchMath-14K introduces largest research-level math dataset with multi-agent curation from Seoul National University; NEO-ov pioneers native one-vision VLMs for multi-image and video understanding; Understand-Anything leads GitHub with 4,465 stars/day as agent skills ecosystem explodes
research-level-mathematical-reasoning, native-vision-language-models, agent-skills-ecosystem, efficient-video-generation, agent-governance-and-security, compact-model-deployment
4 papers, 12 models, 15 repos
Wednesday, May 27, 2026
Light paper day highlights KAIST's GARD for robust 3D reconstruction under degraded conditions and Tencent's EvalVerse for cinematic video evaluation; NousResearch hermes-agent (169K stars) and ECC agent harness (194K stars) dominate GitHub; DeepSeek-V4-Pro maintains 5M-download lead as Anima (1,556 likes) and Hy-MT2 translation models surge
robust-3d-reconstruction, video-generation-evaluation, diffusion-model-efficiency, spatial-foundation-models, ai-agent-infrastructure, specialized-model-ecosystem
4 papers, 12 models, 15 repos
Tuesday, May 26, 2026
SMART unlocks latent multi-vector retrieval from frozen single-vector models as a plug-and-play upgrade; AutoResearch AI surveys the full spectrum of AI-powered scientific workflow automation; Tencent Hy-MT2 translation models and ByteDance Lance multimodal generator dominate HuggingFace trending; AI coding agent tooling consolidation accelerates with ECC (192K stars), andrej-karpathy-skills (155K stars), and Understand-Anything (31K stars) leading GitHub
multi-vector-retrieval-efficiency, ai-research-automation, translation-model-specialization, multimodal-generation-convergence, ai-coding-agent-infrastructure, financial-ai-tooling
5 papers, 12 models, 14 repos
Monday, May 25, 2026
ETCHR decouples image editing from reasoning to unlock fine-grained visual chain-of-thought; Shannon Scaling Law reframes LLM training as noisy-channel transmission; AI coding agent infrastructure dominates GitHub with Understand-Anything (4,000 stars today) and andrej-karpathy-skills (2,551 stars today)
visual-chain-of-thought-reasoning, scaling-laws-information-theory, agent-skill-optimization, 3d-scene-reconstruction, ai-coding-agent-infrastructure, efficient-image-generation
6 papers, 12 models, 15 repos
Sunday, May 24, 2026
DelTA (189 upvotes) reframes RLVR as token-level discrimination; code knowledge graphs and AI coding agent plugins dominate GitHub; DeepSeek-V4-Pro and Tencent Hy-MT2 lead model trends
rlvr-token-credit-assignment, attention-sparsification-efficiency, ai-coding-agent-infrastructure, video-generation-consistency, multimodal-grounded-reasoning, agentic-evaluation-benchmarks
15 papers, 10 models, 13 repos
Saturday, May 23, 2026
RLVR token-credit assignment (DelTA) advances fine-grained LLM training signals; full-attention sparsification shows LLMs are intrinsically sparse; agent governance and tooling ecosystems explode on GitHub
rlvr-and-credit-assignment, attention-sparsification, agentic-evaluation-benchmarks, agent-infrastructure-and-governance, multimodal-robustness, kv-cache-and-inference-efficiency
14 papers, 11 models, 12 repos
Friday, May 22, 2026
Agent trajectory compilation (ACC) opens new long-context training paradigm; Gated DeltaNet-2 decouples linear attention memory editing; code knowledge graphs and agentic skills frameworks explode on GitHub
agent-training-from-trajectories, efficient-attention-mechanisms, agent-benchmarks-evaluation, agentic-coding-tools, curriculum-reinforcement-learning
10 papers, 10 models, 15 repos
Thursday, May 21, 2026
Audio-visual Clever Hans effect exposes MLLM hallucinations; RL-for-reasoning wave crests with five new methods; agent infrastructure matures as OpenComputer and EnvFactory tackle verifiable environments
rl-for-reasoning, agent-infrastructure, video-generation-editing, multimodal-hallucination, autonomous-research
14 papers, 10 models, 12 repos
Wednesday, May 20, 2026
Artifact-Bench exposes MLLM blindspots in AI video quality assessment; OmniGUI pioneers omni-modal GUI agent benchmarking; agent skills and code knowledge graphs dominate GitHub with Karpathy-inspired best practices surging
ai-video-quality-evaluation, omni-modal-agent-benchmarking, rl-credit-assignment-and-process-rewards, agent-knowledge-graphs-and-memory, agent-skills-ecosystem-consolidation, multilingual-document-understanding
6 papers, 12 models, 15 repos
Tuesday, May 19, 2026
AI auto-research integrity crisis mapped end-to-end; ODE-native video alignment via KVPO breaks new ground; open-source personal AI and agent-native CLI tooling dominate GitHub
ai-research-automation-integrity, video-generation-infrastructure, diffusion-language-model-hybrids, personal-ai-and-agent-tooling, llm-inference-optimization, agent-native-interfaces
15 papers, 12 models, 15 repos
Tuesday, April 14, 2026
Claude-powered agentic coding ecosystem explodes on GitHub with hermes-agent and skills frameworks; financial AI surges with Kronos foundation model; AI SRE tooling emerges as new frontier
agentic-coding-assistants, financial-ai-models, llm-context-and-memory-management, ai-infrastructure-and-sre, document-preprocessing-for-llms, claude-ecosystem-explosion
0 papers, 0 models, 15 repos
Monday, April 13, 2026
SFT generalization vindicated with conditional analysis reaching 294 upvotes; ClawBench tests AI agents on 153 real-world online tasks; MegaStyle scales style datasets to 170K prompts; agent tooling ecosystem explodes on GitHub
sft-vs-rl-generalization, real-world-agent-evaluation, style-and-visual-generation, video-understanding-and-generation, agent-tooling-ecosystem, open-weight-model-competition
15 papers, 12 models, 13 repos
Sunday, April 12, 2026
SFT generalization rethink surges to 190 upvotes, reshaping post-training orthodoxy; ClawBench, which tests agents on real-world tasks, leaps to 122 upvotes; GLM-5.1 MoE and Netflix void-model debut on HuggingFace; hermes-agent dominates GitHub at 6,438 stars/day
sft-generalization-momentum, agent-evaluation-maturation, moe-architecture-diversification, video-generation-control, agentic-infrastructure-buildout, vision-language-ocr-push
13 papers, 12 models, 14 repos
Saturday, April 11, 2026
Agentic AI frameworks dominate GitHub trending with hermes-agent, Archon, and multica surging; financial foundation models and tokenizer-free TTS signal new frontier applications; Claude Code tooling meta-layer emerges as a distinct engineering discipline
agentic-ai-frameworks, claude-code-meta-tooling, domain-specific-foundation-models, tokenizer-free-tts, document-parsing-infrastructure, watermark-adversarial-research
10 papers, 11 models, 15 repos
Friday, April 10, 2026
Agentic AI frameworks surge with NousResearch Hermes-Agent and Multica hitting thousands of GitHub stars; Financial AI gains traction via Kronos foundation model; Claude Code best-practices meta-repos signal maturing LLM developer tooling ecosystem
agentic-ai-frameworks, llm-developer-tooling, financial-foundation-models, speech-synthesis-multilingual, data-infrastructure-for-ai, personalized-learning-agents
14 papers, 11 models, 15 repos
Thursday, April 9, 2026
GBQA benchmark reveals frontier LLMs catch under half of game bugs autonomously; ThinkTwice unifies reasoning and self-refinement via GRPO; Gemma 4 family dominates HuggingFace trending with six model variants
agent-evaluation-benchmarks, reasoning-self-refinement, efficient-large-model-training, diffusion-language-models, gemma4-ecosystem, agent-framework-explosion
13 papers, 11 models, 13 repos
Wednesday, April 8, 2026
In-Place Test-Time Training enables LLMs to adapt during inference; Polynomial Mixer achieves linear-time attention replacement; Gym-Anything turns any software into an agent environment
test-time-adaptation, linear-attention-replacements, agent-environment-infrastructure, hallucination-detection, autonomous-agent-evaluation, agent-tooling-dominance
14 papers, 10 models, 12 repos
Tuesday, April 7, 2026
Video-MME-v2 raises the bar for video understanding evaluation; Adam's Law reveals textual frequency scaling in LLMs; Gemma 4 family dominates model releases with MoE and any-to-any variants
video-understanding-benchmarks, empirical-scaling-laws, agent-trajectory-optimization, tool-use-efficiency, gemma-4-ecosystem, virtual-try-on-and-video-synthesis
13 papers, 11 models, 11 repos
Monday, April 6, 2026
On-device AI inference surges with Google LiteRT-LM and AI Edge Gallery; CORAL and Steerable Visual Representations maintain strong momentum; Claude-distilled Qwen and Gemma-4 dominate model charts
on-device-edge-inference, agentic-frameworks, autonomous-multi-agent-evolution, representation-steering-and-reasoning-introspection, open-weight-distillation-scaling, developer-ai-augmentation
13 papers, 11 models, 15 repos
Sunday, April 5, 2026
CORAL's multi-agent evolution framework surges to 36 upvotes as autonomous AI-for-AI research gains momentum; Steerable Visual Representations hits 40 upvotes; Claude-distilled Qwen and Gemma-4 continue model chart domination
autonomous-multi-agent-evolution, representation-steering, llm-pre-decision-encoding, open-weight-distillation-scaling, ai-agent-developer-tooling, supply-chain-ai-forecasting
13 papers, 12 models, 11 repos
Saturday, April 4, 2026
Steerable visual representations and LLM pre-decision biases challenge core assumptions; multi-agent evolution frameworks and adversarial 3D textures push agent capabilities and risks; Gemma-4 and Claude-distilled Qwen dominate trending models
representation-steering-and-control, llm-reasoning-mechanisms, multi-agent-evolution, adversarial-robustness-3d, video-understanding-and-editing, open-weight-distillation
13 papers, 11 models, 13 repos
Friday, April 3, 2026
Agent safety and benchmark proliferation dominate the day; LLM reasoning robustness under context pressure emerges as a critical concern; distillation and efficient scaling techniques show surprising gains
agent-safety-and-security, benchmark-and-evaluation, efficient-depth-scaling, llm-reasoning-robustness, multimodal-visual-reasoning, self-improvement-and-distillation
15 papers, 12 models, 10 repos
Thursday, April 2, 2026
Medical AI gets its ImageNet moment with 1000+ dataset survey; Terminal-only agents challenge complex enterprise frameworks; Pretraining science matures with daVinci-LLM scaling laws
medical-ai-datasets, terminal-agents, pretraining-science, cot-monitorability, multimodal-generation, edge-models
12 papers, 12 models, 12 repos
Wednesday, April 1, 2026
FIPO advances RL reasoning with future-KL credit assignment; Agentic AI frameworks dominate GitHub and HuggingFace; Qwen 3.5 ecosystem explodes across model charts
rl-reasoning, agentic-ai, multimodal-unification, speculative-decoding, qwen-ecosystem
12 papers, 12 models, 10 repos
Tuesday, March 31, 2026
Trillion-parameter scientific foundation model arrives; Agent skill distillation from trajectories gains traction; Coding agents get specialized models and organicity benchmarks
trillion-scale-models, agent-skill-learning, coding-agents, medical-ai, diffusion-transformers, video-generation, github-trending
14 papers, 10 models, 10 repos
Monday, March 30, 2026
Attention Residuals rethink Transformers; LLM agents autonomously discover GPU kernels and RL algorithms; AI safety alarms as models fail without adversarial prompts
transformer-architecture, autonomous-agents, ai-safety, video-generation, reasoning-distillation, self-improvement
15 papers, 10 models