🚁 Aerial-VLN-Arxiv-Daily

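Automatically tracks the latest arXiv papers each day on UAV vision-language navigation (Aerial-VLN), 3DGS scene reconstruction and simulation, and embodied-AI foundation models.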

Updated on 2026.06.26

📌 UAV Vision-Language Navigation

Publish Date (YYYY-MM-DD) Title Authors PDF HJFY Evaluation
2026-06-17 A Digital Twin Framework for Traffic-Aware UAV Pavement Monitoring without Lane Closure
Edwin Salcedo Team 2606.20742 HJFY
2026-06-18 See-and-Reach: Precise Vision-Language Navigation for UAVs within the Field of View
Jiande Sun Team 2606.20045 HJFY
2026-06-12 Automated Gaze-based Behavioral Segmentation and Temporal Representation for Bridge Inspection in Unconstrained 3D Environments
Mohamad Alipour Team 2606.14893 HJFY
2026-06-11 Guided Diffusion with Distilled Vision-Language Reliability for Aerial Navigation
Dzmitry Tsetserukou Team 2606.13883 HJFY
2026-06-02 AirDreamer: Generalist Drone Navigation with World Models
Guyue Zhou Team 2606.03252 HJFY
2026-06-08 ImagineUAV: Aerial Vision-Language Navigation via World-Action Modeling and Kinodynamic Planning
Jiankun Yang Team 2606.01205 HJFY
2026-05-29 Can Aerial VLA Models Cooperate? Evaluating Closed-Loop Air-Ground Coordination with CARLA-Air
Hong Zhang Team 2605.31066 HJFY
2026-05-26 Uni-LaViRA: Language-Vision-Robot Actions Translation for Unified Embodied Navigation
Jiebo Luo Team 2605.27582 HJFY
2026-05-19 FlyMirage: A Fully Automated Generation Pipeline for Diverse and Scalable UAV Flight Data via Generative World Model
Xin Zhou Team 2605.19600 HJFY
2026-05-20 CosFly-Track: A Large-Scale Multi-Modal Dataset for UAV Visual Tracking via Multi-Constraint Trajectory Optimization
Ji Pei Team 2605.17776 HJFY
2026-05-15 WorldVLN: Autoregressive World Action Model for Aerial Vision-Language Navigation
Yong Li Team 2605.15964 HJFY
2026-05-18 Weather-Robust Cross-View Geo-Localization via Prototype-Based Semantic Part Discovery
Long Tran-Thanh Team 2605.11654 HJFY
2026-04-30 Dynamic-TD3: A Novel Algorithm for UAV Path Planning with Dynamic Obstacle Trajectory Prediction
Yuanlong Yu Team 2605.00059 HJFY
2026-04-23 Instance-level Visual Active Tracking with Occlusion-Aware Planning
Mingkui Tan Team 2604.21453 HJFY
2026-04-19 LookasideVLN: Direction-Aware Aerial Vision-and-Language Navigation
Guanbin Li Team 2604.17190 HJFY
2026-04-17 FineCog-Nav: Integrating Fine-grained Cognitive Modules for Zero-shot Multimodal UAV Navigation
Jing Huo Team 2604.16298 HJFY
2026-04-16 RL-STPA: Adapting System-Theoretic Hazard Analysis for Safety-Critical Reinforcement Learning
Benjamin J. Schumeg Team 2604.15201 HJFY
2026-04-15 Vision-and-Language Navigation for UAVs: Progress, Challenges, and a Research Roadmap
Ji Pei Team 2604.13654 HJFY
2026-04-10 "Take Me Home, Wi-Fi Drone": A Drone-based Wireless System for Wilderness Search and Rescue
Chenshu Wu Team 2604.09115 HJFY
2026-04-10 HTNav: A Hybrid Navigation Framework with Tiered Structure for Urban Aerial Vision-and-Language Navigation
Jie Qin Team 2604.08883 HJFY
2026-04-09 Vision-Language Navigation for Aerial Robots: Towards the Era of Large Language Models
Wen Yao Team 2604.07705 HJFY
2026-04-18 AeroScene: Progressive Scene Synthesis for Aerial Robotics
Anh Nguyen Team 2603.23224 HJFY
2026-03-23 Evolutionary Biparty Multiobjective UAV Path Planning: Problems and Empirical Comparisons
Yatong Chang Team 2603.21544 HJFY
2026-03-22 SpatialFly: Geometry-Guided Representation Alignment for UAV Vision-and-Language Navigation in Urban Environments
Xiangyang Ji Team 2603.21046 HJFY
2026-03-18 CICDWOA: A Collective Cognitive Sharing Whale Optimization Algorithm with Cauchy Inverse Cumulative Distribution for 2D/3D Path Planning and Engineering Design Problems
Xu Yang Team 2603.20501 HJFY
2026-03-20 HUGE-Bench: A Benchmark for High-Level UAV Vision-Language-Action Tasks
Mingming Gong Team 2603.19822 HJFY
2026-03-19 Optimal Path Planning in Hostile Environments
Haifeng Xu Team 2603.18958 HJFY
2026-03-11 OnFly: Onboard Zero-Shot Aerial Vision-Language Navigation toward Safety and Efficiency
Boyu Zhou Team 2603.10682 HJFY
2026-03-10 WESPR: Wind-adaptive Energy-Efficient Safe Perception & Planning for Robust Flight with Quadrotors
Pratap Tokekar Team 2603.09194 HJFY
2026-03-09 ViSA-Enhanced Aerial VLN: A Visual-Spatial Reasoning Enhanced Framework for Aerial Vision-Language Navigation
Chenghao Lin Team 2603.08007 HJFY

Evaluation status is saved locally in the browser (localStorage) and does not sync across devices or browsers.

📌 3DGS Scene Reconstruction and Simulation

Publish Date (YYYY-MM-DD) Title Authors PDF HJFY Evaluation
2026-06-25 Scalable Behavior Cloning with Open Data, Training, and Evaluation
Angjoo Kanazawa Team 2606.27375 HJFY
2026-06-25 VibeAct: Vibration to Actions for Contact-Rich Reactive Robot Dexterity
Jeffrey Ichnowski Team 2606.27344 HJFY
2026-06-25 The SPOTLIGHT Multibeam Real-Time Transient Detection System
Harshavardhan Reddy Team 2606.27262 HJFY
2026-06-25 Learning to Fold: prizewinning solution at LeHome Challenge 2026 (1st place online, 2nd offline)
Ilia Larchenko Team 2606.27163 HJFY
2026-06-25 Vis4GS: A Visual Analytic Tool for 3D Gaussian Splatting Reconstruction
Shih-Hsuan Hung Team 2606.26985 HJFY
2026-06-25 RobOralScan: Learning Active Intraoral Scanning for Robotic Dental Reconstruction
Sunghoon Im Team 2606.26955 HJFY
2026-06-25 UAV-MapFusion: RTK-Aligned Uncertainty-Aware Coarse-to-Fine Multi-Session UAV Mapping
Wei Wang Team 2606.26928 HJFY
2026-06-25 Probing inflationary particle production with the CMB power spectrum
Oliver H. E. Philcox Team 2606.26823 HJFY
2026-06-25 Capacity-Controlled Multi-View Stylization of 3D Gaussian Splatting
Hui Huang Team 2606.26754 HJFY
2026-06-25 IDEA: Insensitive to Dynamics Mismatch via Effect Alignment for Sim-to-Real Transfer in Multi-Agent Control
Bin He Team 2606.26575 HJFY
2026-06-18 Slow Brain, Fast Planner: Latency-Resilient VLM-Augmented Urban Navigation
Bolei Zhou Team 2606.20458 HJFY
2026-06-18 HEPTv2: End-to-End Efficient Point Transformer for Charged Particle Reconstruction
Pan Li Team 2606.20437 HJFY
2026-06-18 TaCauchy: An Extensible FEM Framework for Vision-Based Tactile Simulation
Wenbo Ding Team 2606.20426 HJFY
2026-06-18 Learning to Prompt: Improving Student Engagement with Adaptive LLM-based High-School Tutoring
Michiel T. van der Meer Team 2606.20138 HJFY
2026-06-18 Geometry-Preserving in 3D Gaussian Splatting for LiDAR-Camera Extrinsic Calibration
Hyoseok Hwang Team 2606.20103 HJFY
2026-06-18 Tri-Info: Generalizable, Interpretable Failure Prediction for VLA Models via Information Theory
Yanchao Yang Team 2606.19998 HJFY
2026-06-18 MMD-SLAM: Structure-Enhanced Multi-Meta Gaussian Distribution-Guided Visual SLAM
Chunmao Jiang Team 2606.19874 HJFY
2026-06-17 Scaling Self-Play for End-to-End Driving
Liam Paull Team 2606.19641 HJFY
2026-06-17 Building Drift: Documenting On-Site Construction Adaptations Across Material Lifecycles
Mette Ramsgaard Thomsen Team 2606.19609 HJFY
2026-06-17 ev-flow: A Reproducible, NHTS-Grounded Generator of Synthetic Plug-in Electric Vehicle Charging Behavior for Eight U.S. Regions
Bertrand Travacca Team 2606.19520 HJFY
2026-06-15 Di5Guise: 5G Privacy with vSIM
Tamara Lehman Team 2606.16943 HJFY
2026-06-15 Decay estimates for beam equations with potentials in dimension two
Xiaohua Yao Team 2606.16793 HJFY
2026-06-15 PhysGuard: Fisher-Guided Gradient Projection for Sim-to-Real Neural PDE Surrogates
Guillermo A Narsilio Team 2606.16602 HJFY
2026-06-15 Local-GS: Accelerating 3D Gaussian Splatting via Tile-Local Warp Coherence
Huaping Liu Team 2606.16566 HJFY
2026-06-15 Agile Fall Recovery for Quadrotors with Bidirectional Thrust via Reinforcement Learning
Fei Gao Team 2606.16513 HJFY
2026-06-15 RealityBridge: Bridging Editable 3D Gaussian Splatting Driving Simulations and Real-World Videos
Guanbin Li Team 2606.16278 HJFY
2026-06-15 PolyMerge: Compressing 3D Gaussian Splats with Polytope Coverings for Provably Safe Resource-Constrained Navigation
Glen Chou Team 2606.16232 HJFY
2026-06-15 EgoPhys: Learning Generalizable Physics Models of Deformable Objects from Egocentric Video
Xiaolong Wang Team 2606.16202 HJFY
2026-06-14 Artificial Intelligence for Power-Converter-Rich Electrical Systems: A Review
Peng Wang Team 2606.15948 HJFY
2026-06-14 TurboGS: Accelerating 3D Gaussian Splatting via Error-Guided Sparse Pixel Sampling and Optimization
Weiwei Xu Team 2606.15924 HJFY
2026-06-10 Ambient Diffusion Policy: Imitation Learning from Suboptimal Data in Robotics
Russ Tedrake Team 2606.12365 HJFY
2026-06-10 MLT-Dedup: Efficient Large-Scale Online Video Deduplication via Multi-Level Representations and Spatial-Temporal Matching
Kun Xu Team 2606.12215 HJFY
2026-06-10 Point Cloud Segmentation for Autonomous Clip Positioning in Laparoscopic Cholecystectomy on a Phantom
Franziska Mathis-Ullrich Team 2606.12048 HJFY
2026-06-10 KinematicRL: A Sim-to-Real Reinforcement Learning Framework For Social Navigation With Kinodynamic Feasibility
Chenpeng Yao Team 2606.12042 HJFY
2026-06-10 Unexpected large relative strong phase and search for isospin breaking and $CP$ asymmetries in $J/\psi \to K^*(892)\bar{K}$
J. Zu Team 2606.12002 HJFY
2026-06-10 Wild3R: Feed-Forward 3D Gaussian Splatting from Unconstrained Sparse Photo Collection
Toshihiko Yamasaki Team 2606.11894 HJFY
2026-06-10 Scene-Adaptive Nonlinear Tone Curves for Pseudo Ground-Truth Generation in Low-Light 3D Gaussian Splatting
Hong Zhang Team 2606.11841 HJFY
2026-06-10 Seeing What Matters: Perceptual Wrapper with Common Randomness for 3D Gaussian Splatting
Wen-Hsiao Peng Team 2606.11782 HJFY
2026-06-10 Blind Dexterous Grasping via Real2Sim2Real Tactile Policy Learning
Chenxi Xiao Team 2606.11767 HJFY
2026-06-10 TacCoRL: Integrating Tactile Feedback into VLA via Simulation
Chenfanfu Jiang Team 2606.11743 HJFY
2026-05-29 Learning Controlled Separation of Small Objects Between Two Fingers with a Tactile Skin
Berthold Bäuml Team 2605.31486 HJFY
2026-05-29 Dirac-Phase CP-Violation in the Low-Scale Type-I Seesaw with Three Right-Handed Neutrinos
S. T. Petcov Team 2605.31454 HJFY
2026-05-29 Triangle Splatting SLAM
Andrew J. Davison Team 2605.31419 HJFY
2026-05-29 Scaling Multi-Hop Training Data via Graph-Constrained Path Selection
Yike Guo Team 2605.31238 HJFY
2026-05-29 Robust class-gated single-pixel diffractive optical neural network with random-aberration-aware training
Jun-Jun Xiao Team 2605.31232 HJFY
2026-05-29 TALON: Token-Aligned Lightweight Adapters for 6-DoF Spacecraft Pose Estimation
Djamila Aouada Team 2605.31217 HJFY
2026-05-29 QVGGT: Post-Training Quantized Visual Geometry Grounded Transformer
Huan Wang Team 2605.31124 HJFY
2026-05-29 Benchmarking Single-Step Inpainting Methods for Multi-Object 3D Gaussian Splatting Scenes
Daniel Cremers Team 2605.30987 HJFY
2026-05-29 RDGen: Demonstration Generation for High-Quality Robot Learning via Reinforcement Learning
Xinhai Sun Team 2605.30957 HJFY
2026-05-28 Prior Availability in Industrial Visual Sim-to-Real: A Review of CAD-Guided and CAD-Unavailable Regimes
Seung-Kyum Choi Team 2605.30581 HJFY
2026-05-20 Mind the Sim-to-Real Gap & Think Like a Scientist
Alexander Volfovsky Team 2605.21458 HJFY
2026-05-20 Lost in Fog: Sensor Perturbations Expose Reasoning Fragility in Driving VLAs
Jelena Frtunikj Team 2605.21446 HJFY
2026-05-20 Detection of a dark matter subhalo in the strongly lensed system PJ011646
Leo W. H. Fung Team 2605.21212 HJFY
2026-05-20 Transcoding a 3D Gaussian Splatting Model from a Plenoptic Point Cloud or Mesh without the Original Multi-view Images
Neus Sabater Team 2605.21051 HJFY
2026-05-20 Point Cloud Sequence Encoding for Material-conditioned Graph Network Simulators
Gerhard Neumann Team 2605.20978 HJFY
2026-05-20 CAdam: Context-Adaptive Moment Estimation for 3D Gaussian Densification in Generative Distillation
HyeongYeop Kang Team 2605.20872 HJFY
2026-05-20 Resolving Long-Tail Ambiguity in Unsupervised 3D Point Cloud Segmentation with Language Priors
Qiuxia Wu Team 2605.20737 HJFY
2026-05-19 Conflict-Aware Active Perception and Control in 3D Gaussian Splatting Fields via Control Barrier Functions
Nader Motee Team 2605.20566 HJFY
2026-05-19 TideGS: Scalable Training of Over One Billion 3D Gaussian Splatting Primitives via Out-of-Core Optimization
Chaojian Li Team 2605.20150 HJFY
2026-05-19 OP2GS: Object-Aware 3D Gaussian Splatting with Dual-Opacity Primitives
Janne Heikkilä Team 2605.20044 HJFY
2026-05-11 Rapid Forest Fuel Load Estimation via Virtual Remote Sensing and Metric-Scale Feed-Forward 3D Reconstruction
Jonathan Li Team 2605.10789 HJFY
2026-05-11 MAGS-SLAM: Monocular Multi-Agent Gaussian Splatting SLAM for Geometrically and Photometrically Consistent Reconstruction
Baoru Huang Team 2605.10760 HJFY
2026-05-11 Network-Normative Belief Updating in High-Dimensional Ideological Space
Chico Q. Camargo Team 2605.10726 HJFY
2026-05-11 VEGA: Visual Encoder Grounding Alignment for Spatially-Aware Vision-Language-Action Models
Shanghang Zhang Team 2605.10485 HJFY
2026-05-11 DySurface: Consistent 4D Surface Reconstruction via Bridging Explicit Gaussians and Implicit Functions
Tae-Kyun Kim Team 2605.10360 HJFY
2026-05-11 AdaptSplat: Adapting Vision Foundation Models for Feed-Forward 3D Gaussian Splatting
Yifeng Shi Team 2605.10239 HJFY
2026-05-11 A cell-decomposition based path planner for 3D navigation in constrained workspaces
Guilherme V. Raffo Team 2605.10086 HJFY
2026-05-11 SDTalk: Structured Facial Priors and Dual-Branch Motion Fields for Generalizable Gaussian Talking Head Synthesis
Lingyun Yu Team 2605.09956 HJFY
2026-05-10 Zero-Shot Sim-to-Real Robot Learning: A Dexterous Manipulation Study on Reactive Catching
Kaiyu Hang Team 2605.09789 HJFY
2026-05-10 ConFixGS: Learning to Fix Feedforward 3D Gaussian Splatting with Confidence-Aware Diffusion Priors in Driving Scenes
Jiaqi Ma Team 2605.09688 HJFY
2026-05-04 CoRAL: Contact-Rich Adaptive LLM-based Control for Robotic Manipulation
Özgür S. Öğüz Team 2605.02600 HJFY
2026-05-04 Robotic Affection – Opportunities of AI-based haptic interactions to improve social robotic touch through a multi-deep-learning approach
Jens Gerken Team 2605.02538 HJFY
2026-05-04 Sim-to-Real Transfer and Robustness Evaluation of Reinforcement Learning Control with Integrated Perception on an ASV for Floating Waste Capture
Cédric Pradalier Team 2605.02529 HJFY
2026-05-04 Beyond Specialization: Robust Reinforcement Learning Navigation via Procedural Map Generators
Peter Detzner Team 2605.02528 HJFY
2026-05-03 GETA-3DGS: Automatic Joint Structured Pruning and Quantization for 3D Gaussian Splatting
Wanxin Sui Team 2605.02086 HJFY
2026-05-03 From Concept to Capability: Evaluating 3D Gaussian Splatting for Synthetic Scene Editing in Autonomous Driving
Anders Heyden Team 2605.01995 HJFY
2026-05-02 The Banach-Butterfly Invariant: Influence-Adaptive Walsh Geometry for Ternary Polynomial Threshold Functions
Gorgi Pavlov Team 2605.01637 HJFY
2026-05-02 Action Agent: Agentic Video Generation Meets Flow-Constrained Diffusion
Dzmitry Tsetserukou Team 2605.01477 HJFY
2026-05-02 Evidence-Based Landing Site Selection and Vision-Based Landing for UAVs in Unstructured Environments
Iraj Mantegh Team 2605.01432 HJFY
2026-05-02 The Partial Testimony of Logs: Evaluation of Language Model Generation under Confounded Model Choice
Vasilis Syrgkanis Team 2605.01311 HJFY
2026-04-30 FlexiTac: A Low-Cost, Open-Source, Scalable Tactile Sensing Solution for Robotic Systems
Yunzhu Li Team 2604.28156 HJFY
2026-04-30 GSDrive: Reinforcing Driving Policies by Multi-mode Trajectory Probing with 3D Gaussian Splatting Environment
Dzmitry Tsetserukou Team 2604.28111 HJFY
2026-04-30 Faster 3D Gaussian Splatting Convergence via Structure-Aware Densification
Christian Theobalt Team 2604.28016 HJFY
2026-04-30 Fake3DGS: A Benchmark for 3D Manipulation Detection in Neural Rendering
Roberto Vezzani Team 2604.27590 HJFY
2026-04-30 Residual Gaussian Splatting for Ultra Sparse-View CBCT Reconstruction
Qiegen Liu Team 2604.27552 HJFY
2026-04-30 Softmax-GS: Generalized Gaussians Learning When to Blend or Bound
Li Fuxin Team 2604.27437 HJFY
2026-04-30 Sparse-View 3D Gaussian Splatting in the Wild
William J. Beksi Team 2604.27422 HJFY
2026-04-30 DOT-Sim: Differentiable Optical Tactile Simulation with Precise Real-to-Sim Physical Calibration
Leonidas Guibas Team 2604.27367 HJFY
2026-04-29 MesonGS++: Post-training Compression of 3D Gaussian Splatting with Hyperparameter Searching
Zhi Wang Team 2604.26799 HJFY
2026-04-29 3D Generation for Embodied AI and Robotic Simulation: A Survey
Song Guo Team 2604.26509 HJFY
2026-04-23 DualSplat: Robust 3D Gaussian Splatting via Pseudo-Mask Bootstrapping from Reconstruction Failures
Yisong Chen Team 2604.21631 HJFY
2026-04-23 Do MLLMs Understand Pointing? Benchmarking and Enhancing Referential Reasoning in Egocentric Vision
Jie Zhou Team 2604.21461 HJFY
2026-04-23 You Only Gaussian Once: Controllable 3D Gaussian Splatting for Ultra-Densely Sampled Scenes
Yifeng Shi Team 2604.21400 HJFY
2026-04-23 Listen and Chant Before You Read: The Ladder of Beauty in LM Pre-Training
Yoshinori Nomura Team 2604.21265 HJFY
2026-04-23 WildSplatter: Feed-forward 3D Gaussian Splatting with Appearance Control from Unconstrained Images
Yasuhiro Mukaigawa Team 2604.21182 HJFY
2026-04-22 ProMMSearchAgent: A Generalizable Multimodal Search Agent Trained with Process-Oriented Rewards
Zhizhong Zhang Team 2604.20486 HJFY
2026-04-22 GSCompleter: A Distillation-Free Plugin for Metric-Aware 3D Gaussian Splatting Completion in Seconds
Yuan Xie Team 2604.20155 HJFY
2026-04-21 Gaussians on a Diet: High-Quality Memory-Bounded 3D Gaussian Splatting Training
Miao Yin Team 2604.20046 HJFY
2026-04-21 FluSplat: Sparse-View 3D Editing without Test-Time Optimization
Yi Xu Team 2604.20038 HJFY
2026-04-21 Precision Kinematic Sunyaev–Zel'dovich Measurements Across Halo Mass and Redshift with DESI DR2 and ACT DR6: Part II. Bright Galaxy Survey and Emission-Line Galaxies
H. Zou Team 2604.19745 HJFY
2026-04-19 Fringe Projection Based Vision Pipeline for Autonomous Hard Drive Disassembly
Beiwen Li Team 2604.17231 HJFY
2026-04-18 Instant Colorization of Gaussian Splats
Nils Wandel Team 2604.17155 HJFY
2026-04-17 Incoherent Deformation, Not Capacity: Diagnosing and Mitigating Overfitting in Dynamic Gaussian Splatting
Ahmad Droby Team 2604.16747 HJFY
2026-04-17 Active World-Model with 4D-informed Retrieval for Exploration and Awareness
Tara Javidi Team 2604.16733 HJFY
2026-04-17 DENALI: A Dataset Enabling Non-Line-of-Sight Spatial Reasoning with Low-Cost LiDARs
Ramesh Raskar Team 2604.16201 HJFY
2026-04-17 Neural Gabor Splatting: Enhanced Gaussian Splatting with Neural Gabor for High-frequency Surface Reconstruction
Nobuyuki Umetani Team 2604.15941 HJFY
2026-04-17 Splats in Splats++: Robust and Generalizable 3D Gaussian Splatting Steganography
Lei Ma Team 2604.15862 HJFY
2026-04-17 From Seeing to Simulating: Generative High-Fidelity Simulation with Digital Cousins for Generalizable Robot Learning and Evaluation
Ruihai Wu Team 2604.15805 HJFY
2026-04-17 GlobalSplat: Efficient Feed-Forward 3D Gaussian Splatting via Global Scene Tokens
Sagie Benaim Team 2604.15284 HJFY
2026-04-16 TokenGS: Decoupling 3D Gaussian Prediction from Pixels with Learnable Tokens
Zan Gojcic Team 2604.15239 HJFY
2026-04-15 Jump-Start Reinforcement Learning with Vision-Language-Action Regularization
Loris Roveda Team 2604.13733 HJFY
2026-04-15 Granularity-Aware Transfer for Tree Instance Segmentation in Synthetic and Real Forests
Karsten Berns Team 2604.13722 HJFY
2026-04-15 A Mechanistic Analysis of Sim-and-Real Co-Training in Generative Robot Policies
Yuke Zhu Team 2604.13645 HJFY
2026-04-15 Dehaze-then-Splat: Generative Dehazing with Physics-Informed 3D Gaussian Splatting for Smoke-Free Novel View Synthesis
Hanqing Wang Team 2604.13589 HJFY
2026-04-15 DF3DV-1K: A Large-Scale Dataset and Benchmark for Distractor-Free Novel View Synthesis
Chin-Teng Lin Team 2604.13416 HJFY
2026-04-14 MSGS: Multispectral 3D Gaussian Splatting
Fang-Lue Zhang Team 2604.13340 HJFY
2026-04-14 SSD-GS: Scattering and Shadow Decomposition for Relightable 3D Gaussian Splatting
Fang-Lue Zhang Team 2604.13333 HJFY
2026-04-14 PatchPoison: Poisoning Multi-View Datasets to Degrade 3D Reconstruction
Charu Sharma Team 2604.13153 HJFY
2026-04-14 RMGS-SLAM: Real-time Multi-sensor Gaussian Splatting SLAM
Marcelo H. Ang Team 2604.12942 HJFY
2026-04-14 Revisiting the angular size-redshift cosmological test with milliarcsecond radio structures in active galactic nuclei
András Kovács Team 2604.12936 HJFY
2026-04-09 SIM1: Physics-Aligned Simulator as Zero-Shot Data Scaler in Deformable Worlds
Jiangmiao Pang Team 2604.08544 HJFY
2026-04-09 Sumo: Dynamic and Generalizable Whole-Body Loco-Manipulation
Simon Le Cléac'h Team 2604.08508 HJFY
2026-04-09 BLaDA: Bridging Language to Functional Dexterous Actions within 3DGS Fields
Yaonan Wang Team 2604.08410 HJFY
2026-04-09 SurfelSplat: Learning Efficient and Generalizable Gaussian Surfel Representations for Sparse-View Surface Reconstruction
Yueqi Duan Team 2604.08370 HJFY
2026-04-09 Scalable Neural Decoders for Practical Fault-Tolerant Quantum Computation
Susanne F. Yelin Team 2604.08358 HJFY
2026-04-09 Controlling the rain fall statistics using Mean-Reverting Jump Diffusion model
Pankaj Kumar Mishra Team 2604.08338 HJFY
2026-04-09 LLM-Based Data Generation and Clinical Skills Evaluation for Low-Resource French OSCEs
Irina Illina Team 2604.08126 HJFY
2026-04-09 Constraining Ultralight Scalar Dark Matter in the Galactic Center with the S2 Orbit
Lijing Shao Team 2604.08053 HJFY
2026-04-09 MotionScape: A Large-Scale Real-World Highly Dynamic UAV Video Dataset for World Models
Lei Wang Team 2604.07991 HJFY
2026-04-09 Generative 3D Gaussian Splatting for Arbitrary-Resolution Atmospheric Downscaling and Forecasting
Lei Bai Team 2604.07928 HJFY


📌 Embodied-AI Foundation Models and Data

Publish Date (YYYY-MM-DD) Title Authors PDF HJFY Evaluation
2026-06-25 PhysiFormer: Learning to Simulate Mechanics in World Space
Andrea Vedaldi Team 2606.27364 HJFY
2026-06-25 Hallucination in World Models is Predictable and Preventable
Xiaolong Wang Team 2606.27326 HJFY
2026-06-25 Not All Actions Are Equal: Rethinking Conditioning for Dexterous World Model
Renjing Xu Team 2606.27325 HJFY
2026-06-25 EO-WM: A Physically Informed World Model for Probabilistic Earth Observation Forecasting
Hengshuang Zhao Team 2606.27277 HJFY
2026-06-25 E-TTS: A New Embodied Test-Time Scaling Framework for Robotic Manipulation
Liang Wang Team 2606.27268 HJFY
2026-06-25 Advancing Omnimodal Embodied Agents from Isolated Skills to Everyday Physical Autonomy
Yu-Gang Jiang Team 2606.27251 HJFY
2026-06-25 ForesightSafety-VLA: A Unified Diagnostic Safety Benchmark for Vision-Language-Action Models
Yi Zeng Team 2606.27079 HJFY
2026-06-25 A Generalization Theory for JEPA-Based World Models
Yisen Wang Team 2606.27014 HJFY
2026-06-25 Einstein World Models
Kentaro Inui Team 2606.26969 HJFY
2026-06-25 Look-Before-Move: Narrative-Grounded World Visual Attention in Dynamic 3D Story Worlds
Zhenhong Sun Team 2606.26964 HJFY
2026-06-18 Current World Models Lack a Persistent State Core
Xiaozhu Ju Team 2606.20545 HJFY
2026-06-18 Slow Brain, Fast Planner: Latency-Resilient VLM-Augmented Urban Navigation
Bolei Zhou Team 2606.20458 HJFY
2026-06-18 Finetuning Vision-Language-Action Models Requires Fewer Layers Than You Think
Ngo Anh Vien Team 2606.20246 HJFY
2026-06-18 Learning to Prompt: Improving Student Engagement with Adaptive LLM-based High-School Tutoring
Michiel T. van der Meer Team 2606.20138 HJFY
2026-06-18 Frequency-Aware Flow Matching for Continuous and Consistent Robotic Action Generation
Simin Li Team 2606.20135 HJFY
2026-06-18 Sensorimotor World Models: Perception for Action via Inverse Dynamics
Bernhard Schölkopf Team 2606.20104 HJFY
2026-06-18 Holo-World: Unified Camera, Object and Weather Control for Video World Model
Xiaoyan Sun Team 2606.20083 HJFY
2026-06-18 See-and-Reach: Precise Vision-Language Navigation for UAVs within the Field of View
Jiande Sun Team 2606.20045 HJFY
2026-06-18 Reward as An Agent for Embodied World Models
Shan You Team 2606.19990 HJFY
2026-06-18 Advancing DialNav through Automatic Embodied Dialog Augmentation
Paul Hongsuck Seo Team 2606.19948 HJFY
2026-06-15 Semantic Flip: Synthetic OOD Generation for Robust Refusal in Embodied Question Answering and Spatial Localization
Dooyoung Hong Team 2606.16898 HJFY
2026-06-15 Medical world models: representing medical states, modelling clinical dynamics and guiding intervention policies
Haishuai Wang Team 2606.16721 HJFY
2026-06-15 ARB4WM: An Adversarial Robustness Benchmark for World Models in Continuous Control
Zhaoquan Gu Team 2606.16605 HJFY
2026-06-15 Can LLM Agents Infer World Models? Evidence from Agentic Automata Learning
Gabriel Stanovsky Team 2606.16576 HJFY
2026-06-15 Kairos: A Native World Model Stack for Physical AI
Xiaogang Wang Team 2606.16533 HJFY
2026-06-15 BadWorld: Adversarial Attacks on World Models
Xingyi Yang Team 2606.16519 HJFY
2026-06-15 BRICKS-WM: Building Reusability via Interface Composition Kinetics for Structured World Models
De-Chuan Zhan Team 2606.16489 HJFY
2026-06-15 HOLO-MPPI: Multi-Scenario Motion Planning via Hierarchical Policy Optimization
Sangjae Bae Team 2606.16480 HJFY
2026-06-15 FlowMPC: Improving Flow Matching policies with World Models
Chandon Hamel Team 2606.16286 HJFY
2026-06-15 GraphWorld: Long-Horizon Planning with World Models for End-to-End Autonomous Driving
Yadan Luo Team 2606.16274 HJFY
2026-06-10 World Pilot: Steering Vision-Language-Action Models with World-Action Priors
Zhaoxiang Zhang Team 2606.12403 HJFY
2026-06-10 DIRECT: When and Where Should You Allocate Test-Time Compute in Embodied Planners?
Marco Pavone Team 2606.12402 HJFY
2026-06-10 VLGA: Vision-Language-Geometry-Action Models for Autonomous Driving
Burhan Yaman Team 2606.12396 HJFY
2026-06-10 Slots, Transitions, Loops: Learning Composable World Models for ARC
Andreas Geiger Team 2606.12316 HJFY
2026-06-10 Learning What to Say to Your VLA: Mostly Harmless Vision Language Action Model Steering
Andrea Bajcsy Team 2606.12299 HJFY
2026-06-10 Making Foresight Actionable: Repurposing Representation Alignment in World Action Models
Xihui Liu Team 2606.12217 HJFY
2026-06-10 DAM-VLA: Decoupled Asynchronous Multimodal Vision Language Action model
摘要
Rudolf Lioutikov Team 2606.12105 HJFY
2026-06-10 World Model Self-Distillation: Training World Models to Solve General Tasks
世界模型自蒸馏:训练世界模型解决通用任务
摘要
Paolo Favaro Team 2606.12072 HJFY
2026-06-10 When Does Language Matter? Multilingual Instructions Reveal Step-wise Language Sensitivity in Vision-Language-Action Models
语言何时重要?多语言指令揭示视觉-语言-行动模型中的步骤级语言敏感性
摘要
Wanxiang Che Team 2606.11906 HJFY
2026-06-10 TouchThinker: Scaling Tactile Commonsense Reasoning to the Open World with Large-scale Data and Action-aware Representation
TouchThinker:面向开放世界的大规模触觉常识推理与动作感知表征
摘要
Shuicheng Yan Team 2606.11637 HJFY
2026-05-29 Positional versus Symbolic Attention Heads: Learning Dynamics, RoPE Geometry, and Length Generalization
位置注意力头与符号注意力头:学习动态、RoPE几何与长度泛化
摘要
Cristobal Rojas Team 2605.31558 HJFY
2026-05-29 Learning Controlled Separation of Small Objects Between Two Fingers with a Tactile Skin
基于触觉皮肤的双手指间小物体受控分离学习
摘要
Berthold Bäuml Team 2605.31486 HJFY
2026-05-29 IDOL: Inverse-Dynamics-Guided Future Prediction for End-to-End Autonomous Driving
IDOL:基于逆动力学引导的未来预测用于端到端自动驾驶
摘要
Dongmei Li Team 2605.31476 HJFY
2026-05-29 The Sword, Shield, and Achilles' Heel: Characterizing the Linguistic Inductive Bias of Large Language Models for Spatial Reasoning in Navigation Planning
剑、盾与致命弱点:大型语言模型在导航规划空间推理中的语言归纳偏差特征化
摘要
Xiong You Team 2605.31404 HJFY
2026-05-29 LiftNav: Path Planning via Semantic Lifting in TSDF-Guided Gaussian Splatting
摘要
Daniel Roth Team 2605.31376 HJFY
2026-05-29 Dreaming Of Others: Latent Teammate Modeling In World Models For Multi-Agent Reinforcement Learning
想象他人:多智能体强化学习中基于世界模型的潜在队友建模
摘要
Tomas Leroy-Stone Team 2605.31361 HJFY
2026-05-29 DecMem: Towards Minute-Long Consistent World Generation with Decoupled Memory
DecMem:面向分钟级一致世界生成的解耦记忆机制
摘要
Kwan-Yee K. Wong Team 2605.31336 HJFY
2026-05-29 AR Forcing: Towards Long-Horizon Robot Navigation World Model
AR强迫:面向长时程机器人导航的世界模型
摘要
Yan Wang Team 2605.31314 HJFY
2026-05-29 DriveMA: Driving Vision-Language-Action Models with verifiable Meta-Actions
DriveMA:基于可验证元动作的驾驶视觉-语言-动作模型
摘要
Hang Zhao Team 2605.31271 HJFY
2026-05-29 ERGeoBench: A Comprehensive Benchmark for Embodied Reasoning and Geo-localization in Multimodal Large Language Models
ERGeoBench:面向多模态大语言模型的具身推理与地理定位综合基准
摘要
Haoran Luo Team 2605.31251 HJFY
2026-05-20 PointACT: Vision-Language-Action Models with Multi-Scale Point-Action Interaction
PointACT:基于多尺度点-动作交互的视觉-语言-动作模型
摘要
Cordelia Schmid Team 2605.21414 HJFY
2026-05-20 DriveMA: Rethinking Language Interfaces in Driving VLAs with One-Step Meta-Actions
DriveMA:用单步元动作重新思考驾驶视觉-语言-动作模型的语言接口
摘要
Hang Zhao Team 2605.21273 HJFY
2026-05-20 Distill to Think, Foresee to Act: Cognitive-Physical Reinforcement Learning for Autonomous Driving
蒸馏以思考,预见以行动:面向自动驾驶的认知-物理强化学习
摘要
Jin Xie Team 2605.21139 HJFY
2026-05-20 Anomaly-Informed Confidence Calibration for Vision-Based Safety Prediction
面向视觉安全预测的异常感知置信度校准方法
摘要
Ivan Ruchkin Team 2605.21109 HJFY
2026-05-20 Q-ARVD: Quantizing Autoregressive Video Diffusion Models
Q-ARVD:量化自回归视频扩散模型
摘要
Xinchao Wang Team 2605.21072 HJFY
2026-05-20 Point Cloud Sequence Encoding for Material-conditioned Graph Network Simulators
基于点云序列编码的材料条件图网络模拟器
摘要
Gerhard Neumann Team 2605.20978 HJFY
2026-05-20 Demo-JEPA: Joint-Embedding Predictive Architecture for One-shot Cross-Embodiment Imitation
摘要
Shanghang Zhang Team 2605.20811 HJFY
2026-05-20 VLA-REPLICA: A Low-Cost, Reproducible Benchmark for Real-World Evaluation of Vision-Language-Action Models
VLA-REPLICA:面向视觉-语言-动作模型现实世界评估的低成本、可复现基准
摘要
Yu Xiang Team 2605.20774 HJFY
2026-05-20 GaussianDream: A Feed-Forward 3D Gaussian World Model for Robotic Manipulation
GaussianDream:面向机器人操作的前馈式三维高斯世界模型
摘要
Haibao Yu Team 2605.20752 HJFY
2026-05-19 The Yes-Man Syndrome: Benchmarking Abstention in Embodied Robotic Agents
“好好先生”综合征:具身机器人智能体中的节制行为基准测试
摘要
Z Berkay Celik Team 2605.20544 HJFY
2026-05-11 HarmoWAM: Harmonizing Generalizable and Precise Manipulation via Adaptive World Action Models
摘要
Shanghang Zhang Team 2605.10942 HJFY
2026-05-11 PriorVLA: Prior-Preserving Adaptation for Vision-Language-Action Models
PriorVLA:面向视觉-语言-动作模型的先验保持适应方法
摘要
Xingyu Chen Team 2605.10925 HJFY
2026-05-11 CapVector: Learning Transferable Capability Vectors in Parametric Space for Vision-Language-Action Models
CapVector:参数空间中面向视觉-语言-动作模型的可迁移能力向量学习
摘要
Haoang Li Team 2605.10903 HJFY
2026-05-11 Is Your Driving World Model an All-Around Player?
你的驾驶世界模型是全能选手吗?
摘要
Ziwei Liu Team 2605.10858 HJFY
2026-05-11 ALAM: Algebraically Consistent Latent Transitions for Vision-Language-Action Models
ALAM:面向视觉-语言-动作模型的代数一致隐状态转移
摘要
Gang Pan Team 2605.10819 HJFY
2026-05-11 PhyGround: Benchmarking Physical Reasoning in Generative World Models
PhyGround:生成式世界模型中的物理推理基准测试
摘要
Yanzhi Wang Team 2605.10806 HJFY
2026-05-11 DeepSight: Long-Horizon World Modeling via Latent States Prediction for End-to-End Autonomous Driving
DeepSight:基于潜在状态预测的长时序世界建模用于端到端自动驾驶
摘要
Hong Wang Team 2605.10564 HJFY
2026-05-11 VEGA: Visual Encoder Grounding Alignment for Spatially-Aware Vision-Language-Action Models
VEGA:面向空间感知视觉-语言-动作模型的视觉编码器基础对齐
摘要
Shanghang Zhang Team 2605.10485 HJFY
2026-05-11 CoWorld-VLA: Thinking in a Multi-Expert World Model for Autonomous Driving
CoWorld-VLA:面向自动驾驶的多专家世界模型思考框架
摘要
Gong Che Team 2605.10426 HJFY
2026-05-11 Position: Life-Logging Video Streams Make the Privacy-Utility Trade-off Inevitable
定位:生活日志视频流使隐私-效用权衡不可避免
摘要
Sijie Cheng Team 2605.10404 HJFY
2026-05-04 Active Sampling for Ultra-Low-Bit-Rate Video Compression via Conditional Controlled Diffusion
基于条件控制扩散的超低码率视频压缩主动采样方法
摘要
Tara Javidi Team 2605.02849 HJFY
2026-05-04 Existence, Asymptotic Behavior, and Numerical Analysis of a Generalized Abel Differential Equation with Applications in Financial Modeling
广义阿贝尔微分方程的存在性、渐近行为及数值分析及其在金融建模中的应用
摘要
Dragos-Patru Covei Team 2605.02831 HJFY
2026-05-04 DynoSLAM: Dynamic SLAM with Generative Graph Neural Networks for Real-World Social Navigation
DynoSLAM:面向真实世界社交导航的生成式图神经网络动态SLAM
摘要
Gonzalo Ferrer Team 2605.02759 HJFY
2026-05-04 Latent Bridge: Feature Delta Prediction for Efficient Dual-System Vision-Language-Action Model Inference
潜在桥接:面向高效双系统视觉-语言-动作模型推理的特征增量预测
摘要
Hai Li Team 2605.02739 HJFY
2026-05-04 Sim-to-Real Transfer and Robustness Evaluation of Reinforcement Learning Control with Integrated Perception on an ASV for Floating Waste Capture
面向漂浮垃圾捕获的自主水面艇集成感知强化学习控制的仿真到现实迁移与鲁棒性评估
摘要
Cédric Pradalier Team 2605.02529 HJFY
2026-05-04 Beyond Specialization: Robust Reinforcement Learning Navigation via Procedural Map Generators
超越专门化:通过程序化地图生成器实现鲁棒的强化学习导航
摘要
Peter Detzner Team 2605.02528 HJFY
2026-05-04 Shadow-Loom: Causal Reasoning over Graphical World Model of Narratives
影梭:叙事因果推理的图形化世界模型
摘要
David Wilmot Team 2605.02475 HJFY
2026-05-04 Change-Robust Online Spatial-Semantic Topological Mapping
抗变化在线空间-语义拓扑地图构建
摘要
Harold Soh Team 2605.02227 HJFY
2026-05-04 Video Generation with Predictive Latents
基于预测潜变量的视频生成
摘要
Jie Chen Team 2605.02134 HJFY
2026-05-03 TRAP: Tail-aware Ranking Attack for World-Model Planning
TRAP:针对世界模型规划中轨迹排序的尾部感知后门攻击
摘要
Xizhao Luo Team 2605.01950 HJFY
2026-04-30 HERMES++: Toward a Unified Driving World Model for 3D Scene Understanding and Generation
HERMES++:面向统一驾驶世界模型的3D场景理解与生成
摘要
Xiang Bai Team 2604.28196 HJFY
2026-04-30 LaST-R1: Reinforcing Action via Adaptive Physical Latent Reasoning for VLA Models
LaST-R1:通过自适应物理潜在推理增强VLA模型的动作能力
摘要
Pheng-Ann Heng Team 2604.28192 HJFY
2026-04-30 Visual Generation in the New Era: An Evolution from Atomic Mapping to Agentic World Modeling
摘要
Bin Wang Team 2604.28185 HJFY
2026-04-30 Beyond Gaussian Bottlenecks: Topologically Aligned Encoding of Vision-Transformer Feature Spaces
超越高斯瓶颈:视觉变换器特征空间的拓扑对齐编码
摘要
Aykut Erdem Team 2604.28122 HJFY
2026-04-30 Dreaming Across Towns: Semantic Rollout and Town-Adversarial Regularization for Zero-Shot Held-Out-Town Fixed-Route Driving in CARLA
跨城镇行驶:面向CARLA零样本固定路线驾驶的语义推演与城镇对抗正则化
摘要
Jaerock Kwon Team 2604.27994 HJFY
2026-04-30 GUI Agents with Reinforcement Learning: Toward Digital Inhabitants
基于强化学习的图形界面智能体:迈向数字居民
摘要
Song Guo Team 2604.27955 HJFY
2026-04-30 Flying by Inference: Active Inference World Models for Adaptive UAV Swarms
基于推理的飞行:面向自适应无人机集群的主动推理世界模型
摘要
Carlo Regazzoni Team 2604.27935 HJFY
2026-04-30 Simulating clinical interventions with a generative multimodal model of human physiology
基于人体生理学生成式多模态模型的临床干预模拟
摘要
Eran Segal Team 2604.27899 HJFY
2026-04-30 Graph World Models: Concepts, Taxonomy, and Future Directions
图世界模型:概念、分类与未来方向
摘要
Bei Yu Team 2604.27895 HJFY
2026-04-30 MotuBrain: An Advanced World Action Model for Robot Control
MotuBrain:面向机器人控制的先进世界动作模型
摘要
Jun Zhu Team 2604.27792 HJFY
2026-04-23 Seeing Fast and Slow: Learning the Flow of Time in Videos
快慢之见:学习视频中的时间流动
摘要
Wei-Chiu Ma Team 2604.21931 HJFY
2026-04-23 Machine Behavior in Relational Moral Dilemmas: Moral Rightness, Predicted Human Behavior, and Model Decisions
关系性道德困境中的机器行为:道德正确性、人类行为预测与模型决策
摘要
Meeyoung Cha Team 2604.21871 HJFY
2026-04-23 Hi-WM: Human-in-the-World-Model for Scalable Robot Post-Training
Hi-WM:面向可扩展机器人后训练的人机世界模型
摘要
Yichen Zhu Team 2604.21741 HJFY
2026-04-23 WorldMark: A Unified Benchmark Suite for Interactive Video World Models
WorldMark:交互式视频世界模型统一基准套件
摘要
Yongtao Ge Team 2604.21686 HJFY
2026-04-23 LLM-Steered Power Allocation for Parallel QPSK-AWGN Channels
面向并行QPSK-AWGN信道的LLM引导功率分配
摘要
Tadashi Wadayama Team 2604.21316 HJFY
2026-04-23 ReCAPA: Hierarchical Predictive Correction to Mitigate Cascading Failures
ReCAPA:层级预测校正以缓解级联故障
摘要
Hao Wang Team 2604.21232 HJFY
2026-04-23 How VLAs (Really) Work In Open-World Environments
视觉-语言-动作模型在开放世界环境中如何实际运作
摘要
Sajjad Pakdamansavoji Team 2604.21192 HJFY
2026-04-23 Reinforcing 3D Understanding in Point-VLMs via Geometric Reward Credit Assignment
通过几何奖励信用分配强化点云-视觉-语言模型中的3D理解
摘要
Jungong Han Team 2604.21160 HJFY
2026-04-22 Open-H-Embodiment: A Large-Scale Dataset for Enabling Foundation Models in Medical Robotics
Open-H-Embodiment:面向医学机器人基础模型的大规模数据集
摘要
Axel Krieger Team 2604.21017 HJFY
2026-04-22 PokeVLA: Empowering Pocket-Sized Vision-Language-Action Model with Comprehensive World Knowledge Guidance
PokeVLA:以全面世界知识赋能口袋级视觉-语言-动作模型
摘要
Wenchao Ding Team 2604.20834 HJFY
2026-04-19 Fringe Projection Based Vision Pipeline for Autonomous Hard Drive Disassembly
基于条纹投影的自主硬盘拆解视觉流程
摘要
Beiwen Li Team 2604.17231 HJFY
2026-04-18 TensorHub: Rethinking AI Model Hub with Tensor-Centric Compression
TensorHub:以张量为中心压缩重构AI模型仓库
摘要
Yue Cheng Team 2604.17104 HJFY
2026-04-18 Mini-BEHAVIOR-Gran: Revealing U-Shaped Effects of Instruction Granularity on Language-Guided Embodied Agents
Mini-BEHAVIOR-Gran:揭示指令粒度对语言引导具身智能体的U型影响
摘要
Hamid Rezatofighi Team 2604.17019 HJFY
2026-04-18 Rule-VLN: Bridging Perception and Compliance via Semantic Reasoning and Geometric Rectification
Rule-VLN:通过语义推理与几何校正桥接感知与合规性
摘要
Xiaowen Chu Team 2604.16993 HJFY
2026-04-18 Chain Of Interaction Benchmark (COIN): When Reasoning meets Embodied Interaction
交互链基准(COIN):当推理遇见具身交互
摘要
Qing Li Team 2604.16886 HJFY
2026-04-18 SafeDream: Safety World Model for Proactive Early Jailbreak Detection
SafeDream:用于主动早期越狱检测的安全世界模型
摘要
Song Wang Team 2604.16824 HJFY
2026-04-17 Active World-Model with 4D-informed Retrieval for Exploration and Awareness
面向探索与感知的主动世界模型:基于四维信息检索的增强
摘要
Tara Javidi Team 2604.16733 HJFY
2026-04-17 Human Cognition in Machines: A Unified Perspective of World Models
机器中的人类认知:世界模型的统一视角
摘要
Yanzhi Wang Team 2604.16592 HJFY
2026-04-17 The Global Neural World Model: Spatially Grounded Discrete Topologies for Action-Conditioned Planning
全局神经世界模型:面向动作条件规划的空间离散拓扑结构
摘要
Noureddine Kermiche Team 2604.16585 HJFY
2026-04-17 DENALI: A Dataset Enabling Non-Line-of-Sight Spatial Reasoning with Low-Cost LiDARs
DENALI:一个支持低成本激光雷达进行非视距空间推理的数据集
摘要
Ramesh Raskar Team 2604.16201 HJFY
2026-04-15 Feed-Forward 3D Scene Modeling: A Problem-Driven Perspective
前馈式三维场景建模:问题驱动视角
摘要
Bohan Zhuang Team 2604.14025 HJFY
2026-04-15 Beyond State Consistency: Behavior Consistency in Text-Based World Models
超越状态一致性:文本世界模型中的行为一致性
摘要
Dongmei Zhang Team 2604.13824 HJFY
2026-04-15 Jump-Start Reinforcement Learning with Vision-Language-Action Regularization
利用视觉-语言-动作正则化实现强化学习的快速启动
摘要
Loris Roveda Team 2604.13733 HJFY
2026-04-15 Vision-and-Language Navigation for UAVs: Progress, Challenges, and a Research Roadmap
无人机视觉语言导航:进展、挑战与研究路线图
摘要
Ji Pei Team 2604.13654 HJFY
2026-04-15 AgentComm: Semantic Communication for Embodied Agents
AgentComm:具身智能体的语义通信框架
摘要
Shi Jin Team 2604.13558 HJFY
2026-04-15 Evolvable Embodied Agent for Robotic Manipulation via Long Short-Term Reflection and Optimization
基于长短时反思与优化的可进化具身机器人操作代理
摘要
Xulong Zhang Team 2604.13533 HJFY
2026-04-14 Robotic Manipulation is Vision-to-Geometry Mapping ($f(v) \rightarrow G$): Vision-Geometry Backbones over Language and Video Models
机器人操作即视觉到几何的映射(f(v) → G):超越语言与视频模型的视觉-几何骨干网络
摘要
Guangrun Wang Team 2604.12908 HJFY
2026-04-14 FastGrasp: Learning-based Whole-body Control method for Fast Dexterous Grasping with Mobile Manipulators
FastGrasp:基于学习的全身控制方法,用于移动机械臂的快速灵巧抓取
摘要
Yuexin Ma Team 2604.12879 HJFY
2026-04-14 HazardArena: Evaluating Semantic Safety in Vision-Language-Action Models
危险竞技场:评估视觉-语言-动作模型中的语义安全性
摘要
Yu-Gang Jiang Team 2604.12447 HJFY
2026-04-15 Reading Between the Pixels: Linking Text-Image Embedding Alignment to Typographic Attack Success on Vision-Language Models
像素间的解读:将文本-图像嵌入对齐与视觉语言模型上的排版攻击成功率关联研究
摘要
Ankit Garg Team 2604.12371 HJFY
2026-04-09 LAMP: Lift Image-Editing as General 3D Priors for Open-world Manipulation
LAMP:将图像编辑提升为开放世界操作中的通用三维先验
摘要
Guofeng Zhang Team 2604.08475 HJFY
2026-04-09 Grounding Clinical AI Competency in Human Cognition Through the Clinical World Model and Skill-Mix Framework
通过临床世界模型与技能组合框架将临床AI能力植根于人类认知
摘要
Isaac Shiri Team 2604.08226 HJFY
2026-04-09 Beyond Static Forecasting: Unleashing the Power of World Models for Mobile Traffic Extrapolation
超越静态预测:释放世界模型在移动流量外推中的潜力
摘要
Yong Li Team 2604.08199 HJFY
2026-04-09 Governed Capability Evolution for Embodied Agents: Safe Upgrade, Compatibility Checking, and Runtime Rollback for Embodied Capability Modules
具身智能体的受控能力演化:面向具身能力模块的安全升级、兼容性检查与运行时回滚
摘要
Zhijun Li Team 2604.08059 HJFY
2026-04-09 MotionScape: A Large-Scale Real-World Highly Dynamic UAV Video Dataset for World Models
MotionScape:面向世界模型的大规模真实世界高动态无人机视频数据集
摘要
Lei Wang Team 2604.07991 HJFY
2026-04-09 How Far Are Large Multimodal Models from Human-Level Spatial Action? A Benchmark for Goal-Oriented Embodied Navigation in Urban Airspace
大型多模态模型距离人类空间行动能力还有多远?面向城市空域目标导向具身导航的基准测试
摘要
Xinlei Chen Team 2604.07973 HJFY
2026-04-09 WorldMAP: Bootstrapping Vision-Language Navigation Trajectory Prediction with Generative World Models
WorldMAP:利用生成式世界模型自举视觉语言导航轨迹预测
摘要
Zhibo Chen Team 2604.07957 HJFY
2026-04-09 Object-Attribute-Relation Model Driven Adaptive Hierarchical Transmission for Multimodal Semantic Communication
基于对象-属性-关系模型驱动的自适应分层传输多模态语义通信框架
摘要
Mingquan Lu Team 2604.07859 HJFY
2026-04-09 Harnessing Embodied Agents: Runtime Governance for Policy-Constrained Execution
驾驭具身智能体:面向策略约束执行的运行时治理
摘要
Zhijun Li Team 2604.07833 HJFY
2026-04-09 Learning Without Losing Identity: Capability Evolution for Embodied Agents
学习而不失身份:具身智能体的能力进化
摘要
Zhijun Li Team 2604.07799 HJFY

Evaluation status is stored locally in the browser (localStorage) and does not sync across devices or browsers.