🚁 Aerial-VLN-Arxiv-Daily

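Automatically tracks the latest arXiv papers every day on aerial vision-language navigation (Aerial-VLN), 3DGS scene reconstruction and simulation, and embodied-AI foundation models.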
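
The daily tracking described above could be driven by the public arXiv API. The following is a minimal sketch under stated assumptions, not this page's actual pipeline: the keyword lists in `TOPICS` and the helper names `build_query_url` and `pdf_url` are hypothetical, chosen only for illustration. It builds a query URL sorted by submission date and maps an arXiv ID (as shown in the PDF column) to its PDF link.

```python
from urllib.parse import urlencode

# Public arXiv API endpoint.
ARXIV_API = "http://export.arxiv.org/api/query"

# Hypothetical keyword sets, one per section of this page (assumption).
TOPICS = {
    "Aerial Vision-Language Navigation": ["aerial vision-language navigation", "UAV navigation"],
    "3DGS Scene Reconstruction and Simulation": ["3D Gaussian Splatting", "sim-to-real"],
    "Embodied-AI Foundation Models and Data": ["vision-language-action", "world model"],
}


def build_query_url(keywords, max_results=10):
    """Build an arXiv API URL returning the newest papers matching any keyword."""
    # Quote each phrase and OR the phrases together across all fields.
    search = " OR ".join(f'all:"{kw}"' for kw in keywords)
    params = {
        "search_query": search,
        "sortBy": "submittedDate",
        "sortOrder": "descending",
        "start": 0,
        "max_results": max_results,
    }
    return f"{ARXIV_API}?{urlencode(params)}"


def pdf_url(arxiv_id):
    """Map an arXiv ID such as 2606.20742 to its PDF link."""
    return f"https://arxiv.org/pdf/{arxiv_id}"


if __name__ == "__main__":
    for topic, keywords in TOPICS.items():
        print(topic, build_query_url(keywords, max_results=5))
    print(pdf_url("2606.20742"))
```

Fetching and parsing the returned Atom feed is left out on purpose; a scheduled job would request each URL once a day and append new entries to the matching table.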

Updated on 2026.06.26

Aerial Vision-Language Navigation
3DGS Scene Reconstruction and Simulation
Embodied-AI Foundation Models and Data

📌 Aerial Vision-Language Navigation

Publish Date (YYYY-MM-DD)	Title	Authors	PDF	HJFY
2026-06-17	A Digital Twin Framework for Traffic-Aware UAV Pavement Monitoring without Lane Closure	Edwin Salcedo Team	2606.20742	HJFY
2026-06-18	See-and-Reach: Precise Vision-Language Navigation for UAVs within the Field of View	Jiande Sun Team	2606.20045	HJFY
2026-06-12	Automated Gaze-based Behavioral Segmentation and Temporal Representation for Bridge Inspection in Unconstrained 3D Environments	Mohamad Alipour Team	2606.14893	HJFY
2026-06-11	Guided Diffusion with Distilled Vision-Language Reliability for Aerial Navigation	Dzmitry Tsetserukou Team	2606.13883	HJFY
2026-06-02	AirDreamer: Generalist Drone Navigation with World Models	Guyue Zhou Team	2606.03252	HJFY
2026-06-08	ImagineUAV: Aerial Vision-Language Navigation via World-Action Modeling and Kinodynamic Planning	Jiankun Yang Team	2606.01205	HJFY
2026-05-29	Can Aerial VLA Models Cooperate? Evaluating Closed-Loop Air-Ground Coordination with CARLA-Air	Hong Zhang Team	2605.31066	HJFY
2026-05-26	Uni-LaViRA: Language-Vision-Robot Actions Translation for Unified Embodied Navigation	Jiebo Luo Team	2605.27582	HJFY
2026-05-19	FlyMirage: A Fully Automated Generation Pipeline for Diverse and Scalable UAV Flight Data via Generative World Model	Xin Zhou Team	2605.19600	HJFY
2026-05-20	CosFly-Track: A Large-Scale Multi-Modal Dataset for UAV Visual Tracking via Multi-Constraint Trajectory Optimization	Ji Pei Team	2605.17776	HJFY
2026-05-15	WorldVLN: Autoregressive World Action Model for Aerial Vision-Language Navigation	Yong Li Team	2605.15964	HJFY
2026-05-18	Weather-Robust Cross-View Geo-Localization via Prototype-Based Semantic Part Discovery	Long Tran-Thanh Team	2605.11654	HJFY
2026-04-30	Dynamic-TD3: A Novel Algorithm for UAV Path Planning with Dynamic Obstacle Trajectory Prediction	Yuanlong Yu Team	2605.00059	HJFY
2026-04-23	Instance-level Visual Active Tracking with Occlusion-Aware Planning	Mingkui Tan Team	2604.21453	HJFY
2026-04-19	LookasideVLN: Direction-Aware Aerial Vision-and-Language Navigation	Guanbin Li Team	2604.17190	HJFY
2026-04-17	FineCog-Nav: Integrating Fine-grained Cognitive Modules for Zero-shot Multimodal UAV Navigation	Jing Huo Team	2604.16298	HJFY
2026-04-16	RL-STPA: Adapting System-Theoretic Hazard Analysis for Safety-Critical Reinforcement Learning	Benjamin J. Schumeg Team	2604.15201	HJFY
2026-04-15	Vision-and-Language Navigation for UAVs: Progress, Challenges, and a Research Roadmap	Ji Pei Team	2604.13654	HJFY
2026-04-10	"Take Me Home, Wi-Fi Drone": A Drone-based Wireless System for Wilderness Search and Rescue	Chenshu Wu Team	2604.09115	HJFY
2026-04-10	HTNav: A Hybrid Navigation Framework with Tiered Structure for Urban Aerial Vision-and-Language Navigation	Jie Qin Team	2604.08883	HJFY
2026-04-09	Vision-Language Navigation for Aerial Robots: Towards the Era of Large Language Models	Wen Yao Team	2604.07705	HJFY
2026-04-18	AeroScene: Progressive Scene Synthesis for Aerial Robotics	Anh Nguyen Team	2603.23224	HJFY
2026-03-23	Evolutionary Biparty Multiobjective UAV Path Planning: Problems and Empirical Comparisons	Yatong Chang Team	2603.21544	HJFY
2026-03-22	SpatialFly: Geometry-Guided Representation Alignment for UAV Vision-and-Language Navigation in Urban Environments	Xiangyang Ji Team	2603.21046	HJFY
2026-03-18	CICDWOA: A Collective Cognitive Sharing Whale Optimization Algorithm with Cauchy Inverse Cumulative Distribution for 2D/3D Path Planning and Engineering Design Problems	Xu Yang Team	2603.20501	HJFY
2026-03-20	HUGE-Bench: A Benchmark for High-Level UAV Vision-Language-Action Tasks	Mingming Gong Team	2603.19822	HJFY
2026-03-19	Optimal Path Planning in Hostile Environments	Haifeng Xu Team	2603.18958	HJFY
2026-03-11	OnFly: Onboard Zero-Shot Aerial Vision-Language Navigation toward Safety and Efficiency	Boyu Zhou Team	2603.10682	HJFY
2026-03-10	WESPR: Wind-adaptive Energy-Efficient Safe Perception & Planning for Robust Flight with Quadrotors	Pratap Tokekar Team	2603.09194	HJFY
2026-03-09	ViSA-Enhanced Aerial VLN: A Visual-Spatial Reasoning Enhanced Framework for Aerial Vision-Language Navigation	Chenghao Lin Team	2603.08007	HJFY

Evaluation state is stored locally in the browser (localStorage) and does not sync across devices or browsers.

📌 3DGS Scene Reconstruction and Simulation

Publish Date (YYYY-MM-DD)	Title	Authors	PDF	HJFY
2026-06-25	Scalable Behavior Cloning with Open Data, Training, and Evaluation	Angjoo Kanazawa Team	2606.27375	HJFY
2026-06-25	VibeAct: Vibration to Actions for Contact-Rich Reactive Robot Dexterity	Jeffrey Ichnowski Team	2606.27344	HJFY
2026-06-25	The SPOTLIGHT Multibeam Real-Time Transient Detection System	Harshavardhan Reddy Team	2606.27262	HJFY
2026-06-25	Learning to Fold: prizewinning solution at LeHome Challenge 2026 (1st place online, 2nd offline)	Ilia Larchenko Team	2606.27163	HJFY
2026-06-25	Vis4GS: A Visual Analytic Tool for 3D Gaussian Splatting Reconstruction	Shih-Hsuan Hung Team	2606.26985	HJFY
2026-06-25	RobOralScan: Learning Active Intraoral Scanning for Robotic Dental Reconstruction	Sunghoon Im Team	2606.26955	HJFY
2026-06-25	UAV-MapFusion: RTK-Aligned Uncertainty-Aware Coarse-to-Fine Multi-Session UAV Mapping	Wei Wang Team	2606.26928	HJFY
2026-06-25	Probing inflationary particle production with the CMB power spectrum	Oliver H. E. Philcox Team	2606.26823	HJFY
2026-06-25	Capacity-Controlled Multi-View Stylization of 3D Gaussian Splatting	Hui Huang Team	2606.26754	HJFY
2026-06-25	IDEA: Insensitive to Dynamics Mismatch via Effect Alignment for Sim-to-Real Transfer in Multi-Agent Control	Bin He Team	2606.26575	HJFY
2026-06-18	Slow Brain, Fast Planner: Latency-Resilient VLM-Augmented Urban Navigation	Bolei Zhou Team	2606.20458	HJFY
2026-06-18	HEPTv2: End-to-End Efficient Point Transformer for Charged Particle Reconstruction	Pan Li Team	2606.20437	HJFY
2026-06-18	TaCauchy: An Extensible FEM Framework for Vision-Based Tactile Simulation	Wenbo Ding Team	2606.20426	HJFY
2026-06-18	Learning to Prompt: Improving Student Engagement with Adaptive LLM-based High-School Tutoring	Michiel T. van der Meer Team	2606.20138	HJFY
2026-06-18	Geometry-Preserving in 3D Gaussian Splatting for LiDAR-Camera Extrinsic Calibration	Hyoseok Hwang Team	2606.20103	HJFY
2026-06-18	Tri-Info: Generalizable, Interpretable Failure Prediction for VLA Models via Information Theory	Yanchao Yang Team	2606.19998	HJFY
2026-06-18	MMD-SLAM: Structure-Enhanced Multi-Meta Gaussian Distribution-Guided Visual SLAM	Chunmao Jiang Team	2606.19874	HJFY
2026-06-17	Scaling Self-Play for End-to-End Driving	Liam Paull Team	2606.19641	HJFY
2026-06-17	Building Drift: Documenting On-Site Construction Adaptations Across Material Lifecycles	Mette Ramsgaard Thomsen Team	2606.19609	HJFY
2026-06-17	ev-flow: A Reproducible, NHTS-Grounded Generator of Synthetic Plug-in Electric Vehicle Charging Behavior for Eight U.S. Regions	Bertrand Travacca Team	2606.19520	HJFY
2026-06-15	Di5Guise: 5G Privacy with vSIM	Tamara Lehman Team	2606.16943	HJFY
2026-06-15	Decay estimates for beam equations with potentials in dimension two	Xiaohua Yao Team	2606.16793	HJFY
2026-06-15	PhysGuard: Fisher-Guided Gradient Projection for Sim-to-Real Neural PDE Surrogates	Guillermo A Narsilio Team	2606.16602	HJFY
2026-06-15	Local-GS: Accelerating 3D Gaussian Splatting via Tile-Local Warp Coherence	Huaping Liu Team	2606.16566	HJFY
2026-06-15	Agile Fall Recovery for Quadrotors with Bidirectional Thrust via Reinforcement Learning	Fei Gao Team	2606.16513	HJFY
2026-06-15	RealityBridge: Bridging Editable 3D Gaussian Splatting Driving Simulations and Real-World Videos	Guanbin Li Team	2606.16278	HJFY
2026-06-15	PolyMerge: Compressing 3D Gaussian Splats with Polytope Coverings for Provably Safe Resource-Constrained Navigation	Glen Chou Team	2606.16232	HJFY
2026-06-15	EgoPhys: Learning Generalizable Physics Models of Deformable Objects from Egocentric Video	Xiaolong Wang Team	2606.16202	HJFY
2026-06-14	Artificial Intelligence for Power-Converter-Rich Electrical Systems: A Review	Peng Wang Team	2606.15948	HJFY
2026-06-14	TurboGS: Accelerating 3D Gaussian Splatting via Error-Guided Sparse Pixel Sampling and Optimization	Weiwei Xu Team	2606.15924	HJFY
2026-06-10	Ambient Diffusion Policy: Imitation Learning from Suboptimal Data in Robotics	Russ Tedrake Team	2606.12365	HJFY
2026-06-10	MLT-Dedup: Efficient Large-Scale Online Video Deduplication via Multi-Level Representations and Spatial-Temporal Matching	Kun Xu Team	2606.12215	HJFY
2026-06-10	Point Cloud Segmentation for Autonomous Clip Positioning in Laparoscopic Cholecystectomy on a Phantom	Franziska Mathis-Ullrich Team	2606.12048	HJFY
2026-06-10	KinematicRL: A Sim-to-Real Reinforcement Learning Framework For Social Navigation With Kinodynamic Feasibility	Chenpeng Yao Team	2606.12042	HJFY
2026-06-10	Unexpected large relative strong phase and search for isospin breaking and $CP$ asymmetries in $J/\psi \to K^*(892)\bar{K}$	J. Zu Team	2606.12002	HJFY
2026-06-10	Wild3R: Feed-Forward 3D Gaussian Splatting from Unconstrained Sparse Photo Collection	Toshihiko Yamasaki Team	2606.11894	HJFY
2026-06-10	Scene-Adaptive Nonlinear Tone Curves for Pseudo Ground-Truth Generation in Low-Light 3D Gaussian Splatting	Hong Zhang Team	2606.11841	HJFY
2026-06-10	Seeing What Matters: Perceptual Wrapper with Common Randomness for 3D Gaussian Splatting	Wen-Hsiao Peng Team	2606.11782	HJFY
2026-06-10	Blind Dexterous Grasping via Real2Sim2Real Tactile Policy Learning	Chenxi Xiao Team	2606.11767	HJFY
2026-06-10	TacCoRL: Integrating Tactile Feedback into VLA via Simulation	Chenfanfu Jiang Team	2606.11743	HJFY
2026-05-29	Learning Controlled Separation of Small Objects Between Two Fingers with a Tactile Skin	Berthold Bäuml Team	2605.31486	HJFY
2026-05-29	Dirac-Phase CP-Violation in the Low-Scale Type-I Seesaw with Three Right-Handed Neutrinos	S. T. Petcov Team	2605.31454	HJFY
2026-05-29	Triangle Splatting SLAM	Andrew J. Davison Team	2605.31419	HJFY
2026-05-29	Scaling Multi-Hop Training Data via Graph-Constrained Path Selection	Yike Guo Team	2605.31238	HJFY
2026-05-29	Robust class-gated single-pixel diffractive optical neural network with random-aberration-aware training	Jun-Jun Xiao Team	2605.31232	HJFY
2026-05-29	TALON: Token-Aligned Lightweight Adapters for 6-DoF Spacecraft Pose Estimation	Djamila Aouada Team	2605.31217	HJFY
2026-05-29	QVGGT: Post-Training Quantized Visual Geometry Grounded Transformer	Huan Wang Team	2605.31124	HJFY
2026-05-29	Benchmarking Single-Step Inpainting Methods for Multi-Object 3D Gaussian Splatting Scenes	Daniel Cremers Team	2605.30987	HJFY
2026-05-29	RDGen: Demonstration Generation for High-Quality Robot Learning via Reinforcement Learning	Xinhai Sun Team	2605.30957	HJFY
2026-05-28	Prior Availability in Industrial Visual Sim-to-Real: A Review of CAD-Guided and CAD-Unavailable Regimes	Seung-Kyum Choi Team	2605.30581	HJFY
2026-05-20	Mind the Sim-to-Real Gap & Think Like a Scientist	Alexander Volfovsky Team	2605.21458	HJFY
2026-05-20	Lost in Fog: Sensor Perturbations Expose Reasoning Fragility in Driving VLAs	Jelena Frtunikj Team	2605.21446	HJFY
2026-05-20	Detection of a dark matter subhalo in the strongly lensed system PJ011646	Leo W. H. Fung Team	2605.21212	HJFY
2026-05-20	Transcoding a 3D Gaussian Splatting Model from a Plenoptic Point Cloud or Mesh without the Original Multi-view Images	Neus Sabater Team	2605.21051	HJFY
2026-05-20	Point Cloud Sequence Encoding for Material-conditioned Graph Network Simulators	Gerhard Neumann Team	2605.20978	HJFY
2026-05-20	CAdam: Context-Adaptive Moment Estimation for 3D Gaussian Densification in Generative Distillation	HyeongYeop Kang Team	2605.20872	HJFY
2026-05-20	Resolving Long-Tail Ambiguity in Unsupervised 3D Point Cloud Segmentation with Language Priors	Qiuxia Wu Team	2605.20737	HJFY
2026-05-19	Conflict-Aware Active Perception and Control in 3D Gaussian Splatting Fields via Control Barrier Functions	Nader Motee Team	2605.20566	HJFY
2026-05-19	TideGS: Scalable Training of Over One Billion 3D Gaussian Splatting Primitives via Out-of-Core Optimization	Chaojian Li Team	2605.20150	HJFY
2026-05-19	OP2GS: Object-Aware 3D Gaussian Splatting with Dual-Opacity Primitives	Janne Heikkilä Team	2605.20044	HJFY
2026-05-11	Rapid Forest Fuel Load Estimation via Virtual Remote Sensing and Metric-Scale Feed-Forward 3D Reconstruction	Jonathan Li Team	2605.10789	HJFY
2026-05-11	MAGS-SLAM: Monocular Multi-Agent Gaussian Splatting SLAM for Geometrically and Photometrically Consistent Reconstruction	Baoru Huang Team	2605.10760	HJFY
2026-05-11	Network-Normative Belief Updating in High-Dimensional Ideological Space	Chico Q. Camargo Team	2605.10726	HJFY
2026-05-11	VEGA: Visual Encoder Grounding Alignment for Spatially-Aware Vision-Language-Action Models	Shanghang Zhang Team	2605.10485	HJFY
2026-05-11	DySurface: Consistent 4D Surface Reconstruction via Bridging Explicit Gaussians and Implicit Functions	Tae-Kyun Kim Team	2605.10360	HJFY
2026-05-11	AdaptSplat: Adapting Vision Foundation Models for Feed-Forward 3D Gaussian Splatting	Yifeng Shi Team	2605.10239	HJFY
2026-05-11	A cell-decomposition based path planner for 3D navigation in constrained workspaces	Guilherme V. Raffo Team	2605.10086	HJFY
2026-05-11	SDTalk: Structured Facial Priors and Dual-Branch Motion Fields for Generalizable Gaussian Talking Head Synthesis	Lingyun Yu Team	2605.09956	HJFY
2026-05-10	Zero-Shot Sim-to-Real Robot Learning: A Dexterous Manipulation Study on Reactive Catching	Kaiyu Hang Team	2605.09789	HJFY
2026-05-10	ConFixGS: Learning to Fix Feedforward 3D Gaussian Splatting with Confidence-Aware Diffusion Priors in Driving Scenes	Jiaqi Ma Team	2605.09688	HJFY
2026-05-04	CoRAL: Contact-Rich Adaptive LLM-based Control for Robotic Manipulation	Özgür S. Öğüz Team	2605.02600	HJFY
2026-05-04	Robotic Affection -- Opportunities of AI-based haptic interactions to improve social robotic touch through a multi-deep-learning approach	Jens Gerken Team	2605.02538	HJFY
2026-05-04	Sim-to-Real Transfer and Robustness Evaluation of Reinforcement Learning Control with Integrated Perception on an ASV for Floating Waste Capture	Cédric Pradalier Team	2605.02529	HJFY
2026-05-04	Beyond Specialization: Robust Reinforcement Learning Navigation via Procedural Map Generators	Peter Detzner Team	2605.02528	HJFY
2026-05-03	GETA-3DGS: Automatic Joint Structured Pruning and Quantization for 3D Gaussian Splatting	Wanxin Sui Team	2605.02086	HJFY
2026-05-03	From Concept to Capability: Evaluating 3D Gaussian Splatting for Synthetic Scene Editing in Autonomous Driving	Anders Heyden Team	2605.01995	HJFY
2026-05-02	The Banach-Butterfly Invariant: Influence-Adaptive Walsh Geometry for Ternary Polynomial Threshold Functions	Gorgi Pavlov Team	2605.01637	HJFY
2026-05-02	Action Agent: Agentic Video Generation Meets Flow-Constrained Diffusion	Dzmitry Tsetserukou Team	2605.01477	HJFY
2026-05-02	Evidence-Based Landing Site Selection and Vison-Based Landing for UAVs in Unstructured Environments	Iraj Mantegh Team	2605.01432	HJFY
2026-05-02	The Partial Testimony of Logs: Evaluation of Language Model Generation under Confounded Model Choice	Vasilis Syrgkanis Team	2605.01311	HJFY
2026-04-30	FlexiTac: A Low-Cost, Open-Source, Scalable Tactile Sensing Solution for Robotic Systems	Yunzhu Li Team	2604.28156	HJFY
2026-04-30	GSDrive: Reinforcing Driving Policies by Multi-mode Trajectory Probing with 3D Gaussian Splatting Environment	Dzmitry Tsetserukou Team	2604.28111	HJFY
2026-04-30	Faster 3D Gaussian Splatting Convergence via Structure-Aware Densification	Christian Theobalt Team	2604.28016	HJFY
2026-04-30	Fake3DGS: A Benchmark for 3D Manipulation Detection in Neural Rendering	Roberto Vezzani Team	2604.27590	HJFY
2026-04-30	Residual Gaussian Splatting for Ultra Sparse-View CBCT Reconstruction	Qiegen Liu Team	2604.27552	HJFY
2026-04-30	Softmax-GS: Generalized Gaussians Learning When to Blend or Bound	Li Fuxin Team	2604.27437	HJFY
2026-04-30	Sparse-View 3D Gaussian Splatting in the Wild	William J. Beksi Team	2604.27422	HJFY
2026-04-30	DOT-Sim: Differentiable Optical Tactile Simulation with Precise Real-to-Sim Physical Calibration	Leonidas Guibas Team	2604.27367	HJFY
2026-04-29	MesonGS++: Post-training Compression of 3D Gaussian Splatting with Hyperparameter Searching	Zhi Wang Team	2604.26799	HJFY
2026-04-29	3D Generation for Embodied AI and Robotic Simulation: A Survey	Song Guo Team	2604.26509	HJFY
2026-04-23	DualSplat: Robust 3D Gaussian Splatting via Pseudo-Mask Bootstrapping from Reconstruction Failures	Yisong Chen Team	2604.21631	HJFY
2026-04-23	Do MLLMs Understand Pointing? Benchmarking and Enhancing Referential Reasoning in Egocentric Vision	Jie Zhou Team	2604.21461	HJFY
2026-04-23	You Only Gaussian Once: Controllable 3D Gaussian Splatting for Ultra-Densely Sampled Scenes	Yifeng Shi Team	2604.21400	HJFY
2026-04-23	Listen and Chant Before You Read: The Ladder of Beauty in LM Pre-Training	Yoshinori Nomura Team	2604.21265	HJFY
2026-04-23	WildSplatter: Feed-forward 3D Gaussian Splatting with Appearance Control from Unconstrained Images	Yasuhiro Mukaigawa Team	2604.21182	HJFY
2026-04-22	ProMMSearchAgent: A Generalizable Multimodal Search Agent Trained with Process-Oriented Rewards	Zhizhong Zhang Team	2604.20486	HJFY
2026-04-22	GSCompleter: A Distillation-Free Plugin for Metric-Aware 3D Gaussian Splatting Completion in Seconds	Yuan Xie Team	2604.20155	HJFY
2026-04-21	Gaussians on a Diet: High-Quality Memory-Bounded 3D Gaussian Splatting Training	Miao Yin Team	2604.20046	HJFY
2026-04-21	FluSplat: Sparse-View 3D Editing without Test-Time Optimization	Yi Xu Team	2604.20038	HJFY
2026-04-21	Precision Kinematic Sunyaev--Zel'dovich Measurements Across Halo Mass and Redshift with DESI DR2 and ACT DR6: Part II. Bright Galaxy Survey and Emission-Line Galaxies	H. Zou Team	2604.19745	HJFY
2026-04-19	Fringe Projection Based Vision Pipeline for Autonomous Hard Drive Disassembly	Beiwen Li Team	2604.17231	HJFY
2026-04-18	Instant Colorization of Gaussian Splats	Nils Wandel Team	2604.17155	HJFY
2026-04-17	Incoherent Deformation, Not Capacity: Diagnosing and Mitigating Overfitting in Dynamic Gaussian Splatting	Ahmad Droby Team	2604.16747	HJFY
2026-04-17	Active World-Model with 4D-informed Retrieval for Exploration and Awareness	Tara Javidi Team	2604.16733	HJFY
2026-04-17	DENALI: A Dataset Enabling Non-Line-of-Sight Spatial Reasoning with Low-Cost LiDARs	Ramesh Raskar Team	2604.16201	HJFY
2026-04-17	Neural Gabor Splatting: Enhanced Gaussian Splatting with Neural Gabor for High-frequency Surface Reconstruction	Nobuyuki Umetani Team	2604.15941	HJFY
2026-04-17	Splats in Splats++: Robust and Generalizable 3D Gaussian Splatting Steganography	Lei Ma Team	2604.15862	HJFY
2026-04-17	From Seeing to Simulating: Generative High-Fidelity Simulation with Digital Cousins for Generalizable Robot Learning and Evaluation	Ruihai Wu Team	2604.15805	HJFY
2026-04-17	GlobalSplat: Efficient Feed-Forward 3D Gaussian Splatting via Global Scene Tokens	Sagie Benaim Team	2604.15284	HJFY
2026-04-16	TokenGS: Decoupling 3D Gaussian Prediction from Pixels with Learnable Tokens	Zan Gojcic Team	2604.15239	HJFY
2026-04-15	Jump-Start Reinforcement Learning with Vision-Language-Action Regularization	Loris Roveda Team	2604.13733	HJFY
2026-04-15	Granularity-Aware Transfer for Tree Instance Segmentation in Synthetic and Real Forests	Karsten Berns Team	2604.13722	HJFY
2026-04-15	A Mechanistic Analysis of Sim-and-Real Co-Training in Generative Robot Policies	Yuke Zhu Team	2604.13645	HJFY
2026-04-15	Dehaze-then-Splat: Generative Dehazing with Physics-Informed 3D Gaussian Splatting for Smoke-Free Novel View Synthesis	Hanqing Wang Team	2604.13589	HJFY
2026-04-15	DF3DV-1K: A Large-Scale Dataset and Benchmark for Distractor-Free Novel View Synthesis	Chin-Teng Lin Team	2604.13416	HJFY
2026-04-14	MSGS: Multispectral 3D Gaussian Splatting	Fang-Lue Zhang Team	2604.13340	HJFY
2026-04-14	SSD-GS: Scattering and Shadow Decomposition for Relightable 3D Gaussian Splatting	Fang-Lue Zhang Team	2604.13333	HJFY
2026-04-14	PatchPoison: Poisoning Multi-View Datasets to Degrade 3D Reconstruction	Charu Sharma Team	2604.13153	HJFY
2026-04-14	RMGS-SLAM: Real-time Multi-sensor Gaussian Splatting SLAM	Marcelo H. Ang Team	2604.12942	HJFY
2026-04-14	Revisiting the angular size-redshift cosmological test with milliarcsecond radio structures in active galactic nuclei	András Kovács Team	2604.12936	HJFY
2026-04-09	SIM1: Physics-Aligned Simulator as Zero-Shot Data Scaler in Deformable Worlds	Jiangmiao Pang Team	2604.08544	HJFY
2026-04-09	Sumo: Dynamic and Generalizable Whole-Body Loco-Manipulation	Simon Le Cléac'h Team	2604.08508	HJFY
2026-04-09	BLaDA: Bridging Language to Functional Dexterous Actions within 3DGS Fields	Yaonan Wang Team	2604.08410	HJFY
2026-04-09	SurfelSplat: Learning Efficient and Generalizable Gaussian Surfel Representations for Sparse-View Surface Reconstruction	Yueqi Duan Team	2604.08370	HJFY
2026-04-09	Scalable Neural Decoders for Practical Fault-Tolerant Quantum Computation	Susanne F. Yelin Team	2604.08358	HJFY
2026-04-09	Controlling the rain fall statistics using Mean-Reverting Jump Diffusion model	Pankaj Kumar Mishra Team	2604.08338	HJFY
2026-04-09	LLM-Based Data Generation and Clinical Skills Evaluation for Low-Resource French OSCEs	Irina Illina Team	2604.08126	HJFY
2026-04-09	Constraining Ultralight Scalar Dark Matter in the Galactic Center with the S2 Orbit	Lijing Shao Team	2604.08053	HJFY
2026-04-09	MotionScape: A Large-Scale Real-World Highly Dynamic UAV Video Dataset for World Models	Lei Wang Team	2604.07991	HJFY
2026-04-09	Generative 3D Gaussian Splatting for Arbitrary-Resolution Atmospheric Downscaling and Forecasting	Lei Bai Team	2604.07928	HJFY


📌 Embodied-AI Foundation Models and Data

Publish Date (YYYY-MM-DD)	Title	Authors	PDF	HJFY
2026-06-25	PhysiFormer: Learning to Simulate Mechanics in World Space	Andrea Vedaldi Team	2606.27364	HJFY
2026-06-25	Hallucination in World Models is Predictable and Preventable	Xiaolong Wang Team	2606.27326	HJFY
2026-06-25	Not All Actions Are Equal: Rethinking Conditioning for Dexterous World Model	Renjing Xu Team	2606.27325	HJFY
2026-06-25	EO-WM: A Physically Informed World Model for Probabilistic Earth Observation Forecasting	Hengshuang Zhao Team	2606.27277	HJFY
2026-06-25	E-TTS: A New Embodied Test-Time Scaling Framework for Robotic Manipulation	Liang Wang Team	2606.27268	HJFY
2026-06-25	Advancing Omnimodal Embodied Agents from Isolated Skills to Everyday Physical Autonomy	Yu-Gang Jiang Team	2606.27251	HJFY
2026-06-25	ForesightSafety-VLA: A Unified Diagnostic Safety Benchmark for Vision-Language-Action Models	Yi Zeng Team	2606.27079	HJFY
2026-06-25	A Generalization Theory for JEPA-Based World Models	Yisen Wang Team	2606.27014	HJFY
2026-06-25	Einstein World Models	Kentaro Inui Team	2606.26969	HJFY
2026-06-25	Look-Before-Move: Narrative-Grounded World Visual Attention in Dynamic 3D Story Worlds	Zhenhong Sun Team	2606.26964	HJFY
2026-06-18	Current World Models Lack a Persistent State Core	Xiaozhu Ju Team	2606.20545	HJFY
2026-06-18	Slow Brain, Fast Planner: Latency-Resilient VLM-Augmented Urban Navigation	Bolei Zhou Team	2606.20458	HJFY
2026-06-18	Finetuning Vision-Language-Action Models Requires Fewer Layers Than You Think	Ngo Anh Vien Team	2606.20246	HJFY
2026-06-18	Learning to Prompt: Improving Student Engagement with Adaptive LLM-based High-School Tutoring	Michiel T. van der Meer Team	2606.20138	HJFY
2026-06-18	Frequency-Aware Flow Matching for Continuous and Consistent Robotic Action Generation	Simin Li Team	2606.20135	HJFY
2026-06-18	Sensorimotor World Models: Perception for Action via Inverse Dynamics	Bernhard Schölkopf Team	2606.20104	HJFY
2026-06-18	Holo-World: Unified Camera, Object and Weather Control for Video World Model	Xiaoyan Sun Team	2606.20083	HJFY
2026-06-18	See-and-Reach: Precise Vision-Language Navigation for UAVs within the Field of View	Jiande Sun Team	2606.20045	HJFY
2026-06-18	Reward as An Agent for Embodied World Models	Shan You Team	2606.19990	HJFY
2026-06-18	Advancing DialNav through Automatic Embodied Dialog Augmentation	Paul Hongsuck Seo Team	2606.19948	HJFY
2026-06-15	Semantic Flip: Synthetic OOD Generation for Robust Refusal in Embodied Question Answering and Spatial Localization	Dooyoung Hong Team	2606.16898	HJFY
2026-06-15	Medical world models: representing medical states, modelling clinical dynamics and guiding intervention policies	Haishuai Wang Team	2606.16721	HJFY
2026-06-15	ARB4WM: An Adversarial Robustness Benchmark for World Models in Continuous Control	Zhaoquan Gu Team	2606.16605	HJFY
2026-06-15	Can LLM Agents Infer World Models? Evidence from Agentic Automata Learning	Gabriel Stanovsky Team	2606.16576	HJFY
2026-06-15	Kairos: A Native World Model Stack for Physical AI	Xiaogang Wang Team	2606.16533	HJFY
2026-06-15	BadWorld: Adversarial Attacks on World Models	Xingyi Yang Team	2606.16519	HJFY
2026-06-15	BRICKS-WM: Building Reusability via Interface Composition Kinetics for Structured World Models	De-Chuan Zhan Team	2606.16489	HJFY
2026-06-15	HOLO-MPPI: Multi-Scenario Motion Planning via Hierarchical Policy Optimization	Sangjae Bae Team	2606.16480	HJFY
2026-06-15	FlowMPC: Improving Flow Matching policies with World Models	Chandon Hamel Team	2606.16286	HJFY
2026-06-15	GraphWorld: Long-Horizon Planning with World Models for End-to-End Autonomous Driving	Yadan Luo Team	2606.16274	HJFY
2026-06-10	World Pilot: Steering Vision-Language-Action Models with World-Action Priors	Zhaoxiang Zhang Team	2606.12403	HJFY
2026-06-10	DIRECT: When and Where Should You Allocate Test-Time Compute in Embodied Planners?	Marco Pavone Team	2606.12402	HJFY
2026-06-10	VLGA: Vision-Language-Geometry-Action Models for Autonomous Driving	Burhan Yaman Team	2606.12396	HJFY
2026-06-10	Slots, Transitions, Loops: Learning Composable World Models for ARC	Andreas Geiger Team	2606.12316	HJFY
2026-06-10	Learning What to Say to Your VLA: Mostly Harmless Vision Language Action Model Steering	Andrea Bajcsy Team	2606.12299	HJFY
2026-06-10	Making Foresight Actionable: Repurposing Representation Alignment in World Action Models	Xihui Liu Team	2606.12217	HJFY
2026-06-10	DAM-VLA: Decoupled Asynchronous Multimodal Vision Language Action model	Rudolf Lioutikov Team	2606.12105	HJFY
2026-06-10	World Model Self-Distillation: Training World Models to Solve General Tasks	Paolo Favaro Team	2606.12072	HJFY
2026-06-10	When Does Language Matter? Multilingual Instructions Reveal Step-wise Language Sensitivity in Vision-Language-Action Models	Wanxiang Che Team	2606.11906	HJFY
2026-06-10	TouchThinker: Scaling Tactile Commonsense Reasoning to the Open World with Large-scale Data and Action-aware Representation	Shuicheng Yan Team	2606.11637	HJFY
2026-05-29	Positional versus Symbolic Attention Heads: Learning Dynamics, RoPE Geometry, and Length Generalization 位置注意力头与符号注意力头：学习动态、RoPE几何与长度泛化摘要	Cristobal Rojas Team	2605.31558	HJFY
2026-05-29	Learning Controlled Separation of Small Objects Between Two Fingers with a Tactile Skin 基于触觉皮肤的双手指间小物体受控分离学习摘要	Berthold Bäuml Team	2605.31486	HJFY
2026-05-29	IDOL: Inverse-Dynamics-Guided Future Prediction for End-to-End Autonomous Driving IDOL：基于逆动力学引导的未来预测用于端到端自动驾驶摘要	Dongmei Li Team	2605.31476	HJFY
2026-05-29	The Sword, Shield, and Achilles' Heel: Characterizing the Linguistic Inductive Bias of Large Language Models for Spatial Reasoning in Navigation Planning 剑、盾与致命弱点：大型语言模型在导航规划空间推理中的语言归纳偏差特征化摘要	Xiong You Team	2605.31404	HJFY
2026-05-29	LiftNav: Path Planning via Semantic Lifting in TSDF-Guided Gaussian Splatting 摘要	Daniel Roth Team	2605.31376	HJFY
2026-05-29	Dreaming Of Others: Latent Teammate Modeling In World Models For Multi-Agent Reinforcement Learning 想象他人：多智能体强化学习中基于世界模型的潜在队友建模摘要	Tomas Leroy-Stone Team	2605.31361	HJFY
2026-05-29	DecMem: Towards Minute-Long Consistent World Generation with Decoupled Memory DecMem：面向分钟级一致世界生成的解耦记忆机制摘要	Kwan-Yee K. Wong Team	2605.31336	HJFY
2026-05-29	AR Forcing: Towards Long-Horizon Robot Navigation World Model AR强迫：面向长时程机器人导航的世界模型摘要	Yan Wang Team	2605.31314	HJFY
2026-05-29	DriveMA: Driving Vision-Language-Action Models with verifiable Meta-Actions DriveMA：基于可验证元动作的驾驶视觉-语言-动作模型摘要	Hang Zhao Team	2605.31271	HJFY
2026-05-29	ERGeoBench: A Comprehensive Benchmark for Embodied Reasoning and Geo-localization in Multimodal Large Language Models ERGeoBench：面向多模态大语言模型的具身推理与地理定位综合基准摘要	Haoran Luo Team	2605.31251	HJFY
2026-05-20	PointACT: Vision-Language-Action Models with Multi-Scale Point-Action Interaction PointACT：基于多尺度点-动作交互的视觉-语言-动作模型摘要	Cordelia Schmid Team	2605.21414	HJFY
2026-05-20	DriveMA: Rethinking Language Interfaces in Driving VLAs with One-Step Meta-Actions DriveMA：用单步元动作重新思考驾驶视觉-语言-动作模型的语言接口摘要	Hang Zhao Team	2605.21273	HJFY
2026-05-20	Distill to Think, Foresee to Act: Cognitive-Physical Reinforcement Learning for Autonomous Driving 蒸馏以思考，预见以行动：面向自动驾驶的认知-物理强化学习摘要	Jin Xie Team	2605.21139	HJFY
2026-05-20	Anomaly-Informed Confidence Calibration for Vision-Based Safety Prediction 面向视觉安全预测的异常感知置信度校准方法摘要	Ivan Ruchkin Team	2605.21109	HJFY
2026-05-20	Q-ARVD: Quantizing Autoregressive Video Diffusion Models Q-ARVD：量化自回归视频扩散模型摘要	Xinchao Wang Team	2605.21072	HJFY
2026-05-20	Point Cloud Sequence Encoding for Material-conditioned Graph Network Simulators 基于点云序列编码的材料条件图网络模拟器摘要	Gerhard Neumann Team	2605.20978	HJFY
2026-05-20	Demo-JEPA: Joint-Embedding Predictive Architecture for One-shot Cross-Embodiment Imitation 摘要	Shanghang Zhang Team	2605.20811	HJFY
2026-05-20	VLA-REPLICA: A Low-Cost, Reproducible Benchmark for Real-World Evaluation of Vision-Language-Action Models VLA-REPLICA：面向视觉-语言-动作模型现实世界评估的低成本、可复现基准摘要	Yu Xiang Team	2605.20774	HJFY
2026-05-20	GaussianDream: A Feed-Forward 3D Gaussian World Model for Robotic Manipulation GaussianDream：面向机器人操作的前馈式三维高斯世界模型摘要	Haibao Yu Team	2605.20752	HJFY
2026-05-19	The Yes-Man Syndrome: Benchmarking Abstention in Embodied Robotic Agents “好好先生”综合征：具身机器人智能体中的节制行为基准测试摘要	Z Berkay Celik Team	2605.20544	HJFY
2026-05-11	HarmoWAM: Harmonizing Generalizable and Precise Manipulation via Adaptive World Action Models 摘要	Shanghang Zhang Team	2605.10942	HJFY
2026-05-11	PriorVLA: Prior-Preserving Adaptation for Vision-Language-Action Models PriorVLA：面向视觉-语言-动作模型的先验保持适应方法摘要	Xingyu Chen Team	2605.10925	HJFY
2026-05-11	CapVector: Learning Transferable Capability Vectors in Parametric Space for Vision-Language-Action Models CapVector：参数空间中面向视觉-语言-动作模型的可迁移能力向量学习摘要	Haoang Li Team	2605.10903	HJFY
2026-05-11	Is Your Driving World Model an All-Around Player? 你的驾驶世界模型是全能选手吗？摘要	Ziwei Liu Team	2605.10858	HJFY
2026-05-11	ALAM: Algebraically Consistent Latent Transitions for Vision-Language-Action Models ALAM：面向视觉-语言-动作模型的代数一致隐状态转移摘要	Gang Pan Team	2605.10819	HJFY
2026-05-11	PhyGround: Benchmarking Physical Reasoning in Generative World Models PhyGround：生成式世界模型中的物理推理基准测试摘要	Yanzhi Wang Team	2605.10806	HJFY
2026-05-11	DeepSight: Long-Horizon World Modeling via Latent States Prediction for End-to-End Autonomous Driving DeepSight：基于潜在状态预测的长时序世界建模用于端到端自动驾驶摘要	Hong Wang Team	2605.10564	HJFY
2026-05-11	VEGA: Visual Encoder Grounding Alignment for Spatially-Aware Vision-Language-Action Models VEGA：面向空间感知视觉-语言-动作模型的视觉编码器基础对齐摘要	Shanghang Zhang Team	2605.10485	HJFY
2026-05-11	CoWorld-VLA: Thinking in a Multi-Expert World Model for Autonomous Driving CoWorld-VLA：面向自动驾驶的多专家世界模型思考框架摘要	Gong Che Team	2605.10426	HJFY
2026-05-11	Position: Life-Logging Video Streams Make the Privacy-Utility Trade-off Inevitable 定位：生活日志视频流使隐私-效用权衡不可避免摘要	Sijie Cheng Team	2605.10404	HJFY
2026-05-04	Active Sampling for Ultra-Low-Bit-Rate Video Compression via Conditional Controlled Diffusion 基于条件控制扩散的超低码率视频压缩主动采样方法摘要	Tara Javidi Team	2605.02849	HJFY
2026-05-04	Existence, Asymptotic Behavior, and Numerical Analysis of a Generalized Abel Differential Equation with Applications in Financial Modeling 广义阿贝尔微分方程的存在性、渐近行为及数值分析及其在金融建模中的应用摘要	Dragos-Patru Covei Team	2605.02831	HJFY
2026-05-04	DynoSLAM: Dynamic SLAM with Generative Graph Neural Networks for Real-World Social Navigation DynoSLAM：面向真实世界社交导航的生成式图神经网络动态SLAM摘要	Gonzalo Ferrer Team	2605.02759	HJFY
2026-05-04	Latent Bridge: Feature Delta Prediction for Efficient Dual-System Vision-Language-Action Model Inference 潜在桥接：面向高效双系统视觉-语言-动作模型推理的特征增量预测摘要	Hai Li Team	2605.02739	HJFY
2026-05-04	Sim-to-Real Transfer and Robustness Evaluation of Reinforcement Learning Control with Integrated Perception on an ASV for Floating Waste Capture 面向漂浮垃圾捕获的自主水面艇集成感知强化学习控制的仿真到现实迁移与鲁棒性评估摘要	Cédric Pradalier Team	2605.02529	HJFY
2026-05-04	Beyond Specialization: Robust Reinforcement Learning Navigation via Procedural Map Generators 超越专门化：通过程序化地图生成器实现鲁棒的强化学习导航摘要	Peter Detzner Team	2605.02528	HJFY
2026-05-04	Shadow-Loom: Causal Reasoning over Graphical World Model of Narratives 影梭：叙事因果推理的图形化世界模型摘要	David Wilmot Team	2605.02475	HJFY
2026-05-04	Change-Robust Online Spatial-Semantic Topological Mapping 抗变化在线空间-语义拓扑地图构建摘要	Harold Soh Team	2605.02227	HJFY
2026-05-04	Video Generation with Predictive Latents 基于预测潜变量的视频生成摘要	Jie Chen Team	2605.02134	HJFY
2026-05-03	TRAP: Tail-aware Ranking Attack for World-Model Planning TRAP：针对世界模型规划中轨迹排序的尾部感知后门攻击摘要	Xizhao Luo Team	2605.01950	HJFY
2026-04-30	HERMES++: Toward a Unified Driving World Model for 3D Scene Understanding and Generation HERMES++：面向统一驾驶世界模型的3D场景理解与生成摘要	Xiang Bai Team	2604.28196	HJFY
2026-04-30	LaST-R1: Reinforcing Action via Adaptive Physical Latent Reasoning for VLA Models LaST-R1：通过自适应物理潜在推理增强VLA模型的动作能力摘要	Pheng-Ann Heng Team	2604.28192	HJFY
2026-04-30	Visual Generation in the New Era: An Evolution from Atomic Mapping to Agentic World Modeling 摘要	Bin Wang Team	2604.28185	HJFY
2026-04-30	Beyond Gaussian Bottlenecks: Topologically Aligned Encoding of Vision-Transformer Feature Spaces 超越高斯瓶颈：视觉变换器特征空间的拓扑对齐编码摘要	Aykut Erdem Team	2604.28122	HJFY
2026-04-30	Dreaming Across Towns: Semantic Rollout and Town-Adversarial Regularization for Zero-Shot Held-Out-Town Fixed-Route Driving in CARLA 跨城镇行驶：面向CARLA零样本固定路线驾驶的语义推演与城镇对抗正则化摘要	Jaerock Kwon Team	2604.27994	HJFY
2026-04-30	GUI Agents with Reinforcement Learning: Toward Digital Inhabitants 基于强化学习的图形界面智能体：迈向数字居民摘要	Song Guo Team	2604.27955	HJFY
2026-04-30	Flying by Inference: Active Inference World Models for Adaptive UAV Swarms 基于推理的飞行：面向自适应无人机集群的主动推理世界模型摘要	Carlo Regazzoni Team	2604.27935	HJFY
2026-04-30	Simulating clinical interventions with a generative multimodal model of human physiology 基于人体生理学生成式多模态模型的临床干预模拟摘要	Eran Segal Team	2604.27899	HJFY
2026-04-30	Graph World Models: Concepts, Taxonomy, and Future Directions 图世界模型：概念、分类与未来方向摘要	Bei Yu Team	2604.27895	HJFY
2026-04-30	MotuBrain: An Advanced World Action Model for Robot Control MotuBrain：面向机器人控制的先进世界动作模型摘要	Jun Zhu Team	2604.27792	HJFY
2026-04-23	Seeing Fast and Slow: Learning the Flow of Time in Videos 快慢之见：学习视频中的时间流动摘要	Wei-Chiu Ma Team	2604.21931	HJFY
2026-04-23	Machine Behavior in Relational Moral Dilemmas: Moral Rightness, Predicted Human Behavior, and Model Decisions 关系性道德困境中的机器行为：道德正确性、人类行为预测与模型决策摘要	Meeyoung Cha Team	2604.21871	HJFY
2026-04-23	Hi-WM: Human-in-the-World-Model for Scalable Robot Post-Training Hi-WM：面向可扩展机器人后训练的人机世界模型摘要	Yichen Zhu Team	2604.21741	HJFY
2026-04-23	WorldMark: A Unified Benchmark Suite for Interactive Video World Models WorldMark：交互式视频世界模型统一基准套件摘要	Yongtao Ge Team	2604.21686	HJFY
2026-04-23	LLM-Steered Power Allocation for Parallel QPSK-AWGN Channels 面向并行QPSK-AWGN信道的LLM引导功率分配摘要	Tadashi Wadayama Team	2604.21316	HJFY
2026-04-23	ReCAPA: Hierarchical Predictive Correction to Mitigate Cascading Failures ReCAPA：层级预测校正以缓解级联故障摘要	Hao Wang Team	2604.21232	HJFY
2026-04-23	How VLAs (Really) Work In Open-World Environments 视觉-语言-动作模型在开放世界环境中如何实际运作摘要	Sajjad Pakdamansavoji Team	2604.21192	HJFY
2026-04-23	Reinforcing 3D Understanding in Point-VLMs via Geometric Reward Credit Assignment 通过几何奖励信用分配强化点云-视觉-语言模型中的3D理解摘要	Jungong Han Team	2604.21160	HJFY
2026-04-22	Open-H-Embodiment: A Large-Scale Dataset for Enabling Foundation Models in Medical Robotics Open-H-Embodiment：面向医学机器人基础模型的大规模数据集摘要	Axel Krieger Team	2604.21017	HJFY
2026-04-22	PokeVLA: Empowering Pocket-Sized Vision-Language-Action Model with Comprehensive World Knowledge Guidance PokeVLA：以全面世界知识赋能口袋级视觉-语言-动作模型摘要	Wenchao Ding Team	2604.20834	HJFY
2026-04-19	Fringe Projection Based Vision Pipeline for Autonomous Hard Drive Disassembly 基于条纹投影的自主硬盘拆解视觉流程摘要	Beiwen Li Team	2604.17231	HJFY
2026-04-18	TensorHub: Rethinking AI Model Hub with Tensor-Centric Compression TensorHub：以张量为中心压缩重构AI模型仓库摘要	Yue Cheng Team	2604.17104	HJFY
2026-04-18	Mini-BEHAVIOR-Gran: Revealing U-Shaped Effects of Instruction Granularity on Language-Guided Embodied Agents Mini-BEHAVIOR-Gran：揭示指令粒度对语言引导具身智能体的U型影响摘要	Hamid Rezatofighi Team	2604.17019	HJFY
2026-04-18	Rule-VLN: Bridging Perception and Compliance via Semantic Reasoning and Geometric Rectification Rule-VLN：通过语义推理与几何校正桥接感知与合规性摘要	Xiaowen Chu Team	2604.16993	HJFY
2026-04-18	Chain Of Interaction Benchmark (COIN): When Reasoning meets Embodied Interaction 交互链基准（COIN）：当推理遇见具身交互摘要	Qing Li Team	2604.16886	HJFY
2026-04-18	SafeDream: Safety World Model for Proactive Early Jailbreak Detection SafeDream：用于主动早期越狱检测的安全世界模型摘要	Song Wang Team	2604.16824	HJFY
2026-04-17	Active World-Model with 4D-informed Retrieval for Exploration and Awareness 面向探索与感知的主动世界模型：基于四维信息检索的增强摘要	Tara Javidi Team	2604.16733	HJFY
2026-04-17	Human Cognition in Machines: A Unified Perspective of World Models 机器中的人类认知：世界模型的统一视角摘要	Yanzhi Wang Team	2604.16592	HJFY
2026-04-17	The Global Neural World Model: Spatially Grounded Discrete Topologies for Action-Conditioned Planning 全局神经世界模型：面向动作条件规划的空间离散拓扑结构摘要	Noureddine Kermiche Team	2604.16585	HJFY
2026-04-17	DENALI: A Dataset Enabling Non-Line-of-Sight Spatial Reasoning with Low-Cost LiDARs DENALI：一个支持低成本激光雷达进行非视距空间推理的数据集摘要	Ramesh Raskar Team	2604.16201	HJFY
2026-04-15	Feed-Forward 3D Scene Modeling: A Problem-Driven Perspective 前馈式三维场景建模：问题驱动视角摘要	Bohan Zhuang Team	2604.14025	HJFY
2026-04-15	Beyond State Consistency: Behavior Consistency in Text-Based World Models 超越状态一致性：文本世界模型中的行为一致性摘要	Dongmei Zhang Team	2604.13824	HJFY
2026-04-15	Jump-Start Reinforcement Learning with Vision-Language-Action Regularization 利用视觉-语言-动作正则化实现强化学习的快速启动摘要	Loris Roveda Team	2604.13733	HJFY
2026-04-15	Vision-and-Language Navigation for UAVs: Progress, Challenges, and a Research Roadmap 无人机视觉语言导航：进展、挑战与研究路线图摘要	Ji Pei Team	2604.13654	HJFY
2026-04-15	AgentComm: Semantic Communication for Embodied Agents AgentComm：具身智能体的语义通信框架摘要	Shi Jin Team	2604.13558	HJFY
2026-04-15	Evolvable Embodied Agent for Robotic Manipulation via Long Short-Term Reflection and Optimization 基于长短时反思与优化的可进化具身机器人操作代理摘要	Xulong Zhang Team	2604.13533	HJFY
2026-04-14	Robotic Manipulation is Vision-to-Geometry Mapping ($f(v) \rightarrow G$): Vision-Geometry Backbones over Language and Video Models 机器人操作即视觉到几何的映射（f(v) → G）：超越语言与视频模型的视觉-几何骨干网络摘要	Guangrun Wang Team	2604.12908	HJFY
2026-04-14	FastGrasp: Learning-based Whole-body Control method for Fast Dexterous Grasping with Mobile Manipulators FastGrasp：基于学习的全身控制方法，用于移动机械臂的快速灵巧抓取摘要	Yuexin Ma Team	2604.12879	HJFY
2026-04-14	HazardArena: Evaluating Semantic Safety in Vision-Language-Action Models 危险竞技场：评估视觉-语言-动作模型中的语义安全性摘要	Yu-Gang Jiang Team	2604.12447	HJFY
2026-04-15	Reading Between the Pixels: Linking Text-Image Embedding Alignment to Typographic Attack Success on Vision-Language Models 像素间的解读：将文本-图像嵌入对齐与视觉语言模型上的排版攻击成功率关联研究摘要	Ankit Garg Team	2604.12371	HJFY
2026-04-09	LAMP: Lift Image-Editing as General 3D Priors for Open-world Manipulation LAMP：将图像编辑提升为开放世界操作中的通用三维先验摘要	Guofeng Zhang Team	2604.08475	HJFY
2026-04-09	Grounding Clinical AI Competency in Human Cognition Through the Clinical World Model and Skill-Mix Framework 通过临床世界模型与技能组合框架将临床AI能力植根于人类认知摘要	Isaac Shiri Team	2604.08226	HJFY
2026-04-09	Beyond Static Forecasting: Unleashing the Power of World Models for Mobile Traffic Extrapolation 超越静态预测：释放世界模型在移动流量外推中的潜力摘要	Yong Li Team	2604.08199	HJFY
2026-04-09	Governed Capability Evolution for Embodied Agents: Safe Upgrade, Compatibility Checking, and Runtime Rollback for Embodied Capability Modules 具身智能体的受控能力演化：面向具身能力模块的安全升级、兼容性检查与运行时回滚摘要	Zhijun Li Team	2604.08059	HJFY
2026-04-09	MotionScape: A Large-Scale Real-World Highly Dynamic UAV Video Dataset for World Models MotionScape：面向世界模型的大规模真实世界高动态无人机视频数据集摘要	Lei Wang Team	2604.07991	HJFY
2026-04-09	How Far Are Large Multimodal Models from Human-Level Spatial Action? A Benchmark for Goal-Oriented Embodied Navigation in Urban Airspace 大型多模态模型距离人类空间行动能力还有多远？面向城市空域目标导向具身导航的基准测试摘要	Xinlei Chen Team	2604.07973	HJFY
2026-04-09	WorldMAP: Bootstrapping Vision-Language Navigation Trajectory Prediction with Generative World Models WorldMAP：利用生成式世界模型自举视觉语言导航轨迹预测摘要	Zhibo Chen Team	2604.07957	HJFY
2026-04-09	Object-Attribute-Relation Model Driven Adaptive Hierarchical Transmission for Multimodal Semantic Communication 基于对象-属性-关系模型驱动的自适应分层传输多模态语义通信框架摘要	Mingquan Lu Team	2604.07859	HJFY
2026-04-09	Harnessing Embodied Agents: Runtime Governance for Policy-Constrained Execution 驾驭具身智能体：面向策略约束执行的运行时治理摘要	Zhijun Li Team	2604.07833	HJFY
2026-04-09	Learning Without Losing Identity: Capability Evolution for Embodied Agents 学习而不失身份：具身智能体的能力进化摘要	Zhijun Li Team	2604.07799	HJFY