Light paper day highlights KAIST's GARD for robust 3D reconstruction under degraded conditions and Tencent's EvalVerse for cinematic video evaluation; NousResearch hermes-agent (169K stars) and ECC agent harness (194K stars) dominate GitHub; DeepSeek-V4-Pro maintains 5M-download lead as Anima (1,556 likes) and Hy-MT2 translation models surge

EvalVerse: Pipeline-Aware and Expert-Calibrated Benchmarking for Professional Cinematic Video Generation

High Relevance

Songlin Yang, Haobin Zhong, Ruilin Zhang, Xiaotong Zhao, Shuai Li et al. — Tencent

EvalVerse introduces a comprehensive evaluation framework for cinematic video generation that goes beyond basic prompt-following to assess professional filmmaking quality. It organizes evaluation around the filmmaking workflow (pre-production, production, post-production) and injects expert-calibrated judgments into VLMs through fine-tuning, enabling Chain-of-Thought reasoning about cinematic quality.

Key Findings

• Existing benchmarks evaluate 'rightness' (prompt-following) but fundamentally neglect 'goodness' (cinematic quality, acting, and aesthetics)
• Expert-calibrated VLM fine-tuning enables explicit Chain-of-Thought reasoning about professional cinematic quality
• Extends evaluation coverage to complex multi-shot sequencing and audio-visual integration beyond single-clip assessment

video-generation · benchmark · cinematic-quality · vlm-evaluation · expert-calibration

2 upvotes

RT-Lynx: Putting the GEMM Sparsity In a Right Way for Diffusion Models

High Relevance

Xing Cong, Hanlin Tang, Kan Liu, Lan Tao, Lin Qu, Chenhao Xie — Beihang University

RT-Lynx advocates a paradigm shift from weight to activation sparsification for Diffusion Transformers. The key insight is that DiT activations are intrinsically sparse and significantly more robust to N:M semi-structured sparsification than weights. With error-compensation techniques and optimized CUDA kernels, RT-Lynx achieves up to 1.55x inference speedup while preserving generation quality.

Key Findings

• DiT activations are intrinsically sparse and significantly more robust to N:M semi-structured sparsification than weights
• Activation sparsification with error-compensation techniques preserves generation quality across multiple diffusion models
• Optimized CUDA kernels achieve up to 1.55x average speedup in linear layers, translating theoretical FLOP savings to wall-clock gains
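To make the N:M pattern concrete, here is a minimal sketch of magnitude-based N:M sparsification applied to a flat list of activation values. It is an illustration under stated assumptions, not RT-Lynx's implementation: the function name `nm_sparsify`, the 2:4 default, and the magnitude-based selection rule are all assumptions, and the paper's error-compensation techniques and CUDA kernels are not modeled.

```python
from typing import List


def nm_sparsify(values: List[float], n: int = 2, m: int = 4) -> List[float]:
    """Keep the n largest-magnitude entries in each group of m values; zero the rest.

    Hypothetical helper for illustration only (not from the RT-Lynx code base).
    """
    if not 0 < n <= m:
        raise ValueError("need 0 < n <= m")
    if len(values) % m != 0:
        raise ValueError("length must be a multiple of m")

    out: List[float] = []
    for start in range(0, len(values), m):
        group = values[start:start + m]
        # Rank positions by absolute value, largest first.
        # sorted() is stable, so ties are resolved in favor of the earlier index.
        ranked = sorted(range(m), key=lambda i: -abs(group[i]))
        keep = set(ranked[:n])
        out.extend(v if i in keep else 0.0 for i, v in enumerate(group))
    return out


# Toy "activation" row: two groups of four values, pruned with a 2:4 pattern.
activations = [0.9, -0.1, 0.0, 2.0, 0.05, -3.0, 0.2, 0.0]
sparse = nm_sparsify(activations)
print(sparse)  # [0.9, 0.0, 0.0, 2.0, 0.0, -3.0, 0.2, 0.0]
```

Each group of four keeps exactly two values, which is the fixed structure that lets hardware skip the zeroed positions. Activations that are already mostly near zero lose little under this rule, which is consistent with the robustness claim above.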

diffusion-transformers · sparsity · inference-optimization · activation-pruning · cuda-kernels

1 upvote

SpatialBench: Is Your Spatial Foundation Model an All-Round Player?

High Relevance

Haosong Peng, Hao Li, Jiaqi Chen, Yuhao Pan, Runmao Yao et al. — Hong Kong University of Science and Technology, Nanyang Technological University, Northwestern Polytechnical University

SpatialBench presents a comprehensive cross-paradigm benchmark for spatial foundation models, evaluating 41 models across 6 paradigms on 19 datasets with 546 scenes spanning 5 diverse spatial domains. It reveals that full-context attention maximizes accuracy while bounded-memory strategies enable long-sequence scalability, and that egocentric and wrist-view domains remain dominant failure modes.

Key Findings

• Full-context attention maximizes accuracy while bounded-memory strategies unlock long-sequence scalability, a direct system design tradeoff
• Data quality outweighs data volume: carefully curated pseudo-GT supervision consistently outperforms larger noisy datasets
• Egocentric and wrist-view domains remain dominant out-of-distribution failure modes, pointing to a clear training data gap
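The memory side of the full-context versus bounded-memory tradeoff can be shown with a toy sketch. It assumes a simple sliding window as the bounded-memory strategy; the benchmarked models may use other mechanisms, and the function name `cached_frames` and its parameters are hypothetical.

```python
from collections import deque
from typing import List, Optional


def cached_frames(num_frames: int, window: Optional[int] = None) -> List[int]:
    """Return how many past frames are held in memory after each step.

    window=None models a full-context strategy that keeps every frame;
    an integer models a bounded sliding window (an assumed stand-in for
    the bounded-memory strategies mentioned above).
    """
    cache: deque = deque(maxlen=window)  # maxlen=None means unbounded
    sizes: List[int] = []
    for frame_id in range(num_frames):
        cache.append(frame_id)  # the oldest frame is evicted once the window is full
        sizes.append(len(cache))
    return sizes


print(cached_frames(6))            # [1, 2, 3, 4, 5, 6]
print(cached_frames(6, window=3))  # [1, 2, 3, 3, 3, 3]
```

Full context grows linearly with sequence length, while the window caps memory at the cost of discarding older frames that attention could otherwise use.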

spatial-ai · benchmark · 3d-vision · foundation-models · embodied-ai

0 upvotes

Trending Models (12)

DeepSeek-V4-Pro

DeepSeek AI · text-generation · unknown

An open-weight conversational large language model that remains the most-downloaded model on Hugging Face, with 5.0M downloads.

conversational · text-generation · deepseek

5.0M downloads · 4.3K likes

Anima

Circlestone Labs · image-generation · unknown

A leading open diffusion model compatible with ComfyUI, gaining strong traction as a community-favored image generation model with single-file distribution.

diffusion · comfyui · image-generation

676.4K downloads · 1.6K likes

Sulphur-2-base

SulphurAI · text-to-video · unknown

A leading open text-to-video generation model available in both diffusers and GGUF formats, maintaining high download volume for video generation workloads.

text-to-video · diffusers · video-generation

1.4M downloads · 1.4K likes

Hy-MT2-1.8B

Tencent · translation · 1.8B

A specialized 1.8B-parameter translation model from Tencent's Hunyuan family, demonstrating strong community interest in dedicated translation models over general-purpose LLMs.

translation · hunyuan · multilingual

7.5K downloads · 1.0K likes

MiniCPM-V-4.6

OpenBMB · image-text-to-text · unknown

An efficient multimodal vision-language model for image-text understanding, continuing the MiniCPM-V series with strong community adoption for on-device and edge deployment scenarios.

multimodal · vision-language · efficient

314.3K downloads · 978 likes

Qwen3.6-35B-A3B-Uncensored-HauhauCS-Aggressive

HauhauCS · text-generation · 35B-A3B (MoE)

A community-produced uncensored variant of Qwen3.6-35B using mixture-of-experts architecture (3B active parameters), distributed in GGUF format for local deployment with vision capabilities.

qwen3.6 · moe · gguf · uncensored · vision

1.6M downloads · 912 likes

Lance

ByteDance Research · image-generation · unknown

ByteDance's multimodal generation model targeting both image and video generation, representing the company's push into open multimodal foundation models.

multimodal · image-generation · video-generation

1.9K downloads · 866 likes

supertonic-3

Supertone · text-to-speech · unknown

A text-to-speech and speech synthesis model using ONNX format, reflecting growing interest in high-quality open TTS solutions.

tts · speech-synthesis · onnx

48.1K downloads · 698 likes

Qwen3.6-27B-MTP-GGUF

Unsloth · text-generation · 27B

Unsloth's GGUF quantization of Qwen3.6-27B with Multi-Token Prediction support, enabling efficient local inference of the popular Qwen model family.

gguf · quantized · qwen · mtp

735.3K downloads · 503 likes

HRM-Text-1B

Sapient Inc · text-generation · 1B

A compact 1B-parameter text generation model with high download volume, suggesting strong utility for lightweight text generation use cases.

text-generation · compact · hrm

103.0K downloads · 379 likes

Marlin-2B

NemoStation · video-captioning · 2B

A 2B-parameter multimodal video captioning model, supporting video understanding and description generation from video inputs.

video · multimodal · video-captioning

9.1K downloads · 380 likes

MiniCPM5-1B

OpenBMB · text-generation · 1B

The latest 1B-parameter entry in the MiniCPM series, offering a highly compact language model suitable for edge deployment and resource-constrained environments.

compact · minicpm · edge-deployment

2.4K downloads · 313 likes

Trending GitHub Repos (15)

Lum1104/Understand-Anything

High Relevance · GitHub

Turns any codebase into an interactive knowledge graph for exploration, search, and Q&A. Compatible with Claude Code, Codex, Cursor, Copilot, Gemini CLI, and more. Leading today's star velocity at 4,697 stars/day.

knowledge-graph · code-understanding · developer-tools

TypeScript · 36.1K · +4.7K today · 2.9K

rohitg00/ai-engineering-from-scratch

High Relevance · GitHub

A comprehensive learning resource for AI engineering covering the full lifecycle from learning to building to shipping, gaining 2,155 stars today.

education · ai-engineering · learning-resource

Python · 20.8K · +2.2K today · 3.5K

affaan-m/ECC

High Relevance · GitHub

A comprehensive agent harness performance optimization system with skills, instincts, memory, security, and research-first development for Claude Code, Codex, Cursor, and other AI coding tools.

agent-harness · coding-agents · developer-tools

JavaScript · 194.5K · +1.9K today · 30.0K

anthropics/knowledge-work-plugins

High Relevance · GitHub

Anthropic's open-source repository of plugins for knowledge workers to use with Claude Cowork, signaling institutional investment in agent-assisted workflows.

plugins · knowledge-work · anthropic

Python · 16.7K · +1.7K today · 2.0K

NousResearch/hermes-agent

High Relevance · GitHub

An extensible AI agent framework from NousResearch that grows with the user, representing one of the largest open-source agent platforms by star count.

ai-agent · framework · extensible

Python · 168.8K · +1.5K today · 28.1K

Leonxlnx/taste-skill

High Relevance · GitHub

A skill file that gives AI coding agents 'good taste' by preventing generation of boring, generic output. Part of the growing agent behavioral alignment ecosystem.

agent-skills · behavioral-alignment · quality-control

Shell · 21.9K · +1.4K today · 1.8K

mukul975/Anthropic-Cybersecurity-Skills

High Relevance · GitHub

754 structured cybersecurity skills mapped to 5 frameworks (MITRE ATT&CK, NIST CSF 2.0, MITRE ATLAS, D3FEND, NIST AI RMF) for use with AI coding agents across 26 security domains.

cybersecurity · agent-skills · security-frameworks

Python · 10.2K · +880 today · 1.2K

hardikpandya/stop-slop

A skill file for removing AI tells from prose, complementing taste-skill in the growing ecosystem of behavioral alignment tools for AI coding agents.

agent-skills · writing-quality · behavioral-alignment

5.1K · +539 today · 403

shiyu-coder/Kronos

High Relevance · GitHub

A foundation model for the language of financial markets, representing the most technically ambitious entry in the growing financial AI tooling ecosystem.

financial-ai · foundation-model · markets

Python · 26.5K · +425 today · 4.6K

dograh-hq/dograh

Open source voice AI platform and self-hosted alternative to Vapi and Retell, with on-prem deployment, visual workflow builder, MCP native support, and telephony integration.

voice-ai · self-hosted · telephony

Python · 3.3K · +399 today · 708

thedotmack/claude-mem

High Relevance · GitHub

Persistent context across AI agent sessions: captures session activity, compresses it with AI, and injects relevant context into future sessions. Works across multiple agent platforms.

agent-memory · context-persistence · developer-tools

TypeScript · 78.7K · +352 today · 6.8K

microsoft/agent-governance-toolkit

High Relevance · GitHub

Microsoft's policy enforcement, zero-trust identity, execution sandboxing, and reliability engineering toolkit for autonomous AI agents, covering all OWASP Agentic Top 10 risks.

agent-governance · security · microsoft · zero-trust

Python · 2.7K · +282 today · 436

shareAI-lab/learn-claude-code

A nano Claude Code-like agent harness built from scratch, serving as both an educational resource and lightweight implementation reference for agent development.

education · agent-harness · claude-code

Python · 62.8K · +246 today · 10.3K

666ghj/MiroFish

High Relevance · GitHub

A universal swarm intelligence engine for prediction tasks, applying collective intelligence algorithms to diverse forecasting domains.

swarm-intelligence · prediction · collective-ai

Python · 62.7K · +162 today · 9.8K

modelscope/FunASR