| # |
Paper Title |
TLDR |
Project Page |
Paper |
Code |
| 1 |
CLONE: Closed-Loop Whole-Body Humanoid Teleoperation for Long-Horizon Tasks |
Robust whole-body humanoid teleoperation using mixture-of-experts control and LiDAR correction. |
link |
link |
N/A |
| 2 |
Disentangled Multi-Context Meta-Learning: Unlocking Robust and Generalized Task Learning |
Disentangles task factors for robust meta-learning and sim-to-real transfer. |
N/A |
link |
N/A |
| 3 |
Meta-Optimization and Program Search using Language Models for Task and Motion Planning |
Uses LLMs for meta-optimization and program search in task and motion planning. |
N/A |
link |
N/A |
| 4 |
Text2Touch: Tactile In-Hand Manipulation with LLM-Designed Reward Functions |
Leverages LLM-designed reward functions for tactile in-hand robot manipulation. |
N/A |
link |
N/A |
| 5 |
Multi-Loco: Unifying Multi-Embodiment Legged Locomotion via Reinforcement Learning Augmented Diffusion |
Unifies locomotion control across embodiments with RL-augmented diffusion models. |
N/A |
link |
N/A |
| 6 |
SimShear: Sim-to-Real Shear-based Tactile Servoing |
Uses shear-based tactile feedback for sim-to-real transfer in tactile servoing. |
N/A |
link |
N/A |
| 7 |
Focusing on What Matters: Object-Agent-centric Tokenization for Vision Language Action models |
Introduces object-agent-centric tokenization to improve VLA model reasoning. |
N/A |
OpenReview |
N/A |
| 8 |
AT-Drone: Benchmarking Adaptive Teaming in Multi-Drone Pursuit |
Benchmark for evaluating adaptive teaming strategies in multi-drone pursuit tasks. |
N/A |
link |
N/A |
| 9 |
Uncertainty-Aware Scene Understanding via Efficient Sampling-Free Confidence Estimation |
Confidence estimation for scene understanding without expensive sampling. |
N/A |
link |
N/A |
| 10 |
ObjectReact: Learning Object-Relative Control for Visual Navigation |
Learns object-relative navigation control for robust visual navigation. |
Link |
link |
Link |
| 11 |
Decentralized Aerial Manipulation of a Cable-Suspended Load Using Multi-Agent Reinforcement Learning |
Decentralized multi-agent RL for aerial manipulation of suspended loads. |
N/A |
link |
N/A |
| 12 |
ReasonPlan: Unified Scene Prediction and Decision Reasoning for Closed-loop Autonomous Driving |
Integrates scene prediction and decision reasoning for end-to-end autonomous driving. |
N/A |
link |
N/A |
| 13 |
Embrace Contacts: humanoid shadowing with full body ground contacts |
Enables humanoid shadowing with consistent full-body ground contacts. |
N/A |
link |
N/A |
| 14 |
Distilling for Long-Horizon Prehensile and Non-Prehensile Manipulation |
Distills policies to handle long-horizon prehensile and non-prehensile tasks. |
N/A |
link |
N/A |
| 15 |
Constraint-Aware Diffusion Guidance for Robotics: Real-Time Obstacle Avoidance for Autonomous Racing |
Uses diffusion guidance for real-time obstacle avoidance in racing scenarios. |
N/A |
link |
N/A |
| 16 |
Learning Impact-Rich Rotational Maneuvers via Centroidal Velocity Rewards and Sim-to-Real Techniques: A One-Leg Hopper Flip Case Study |
Trains one-leg hopper to flip using impact-rich rewards and sim-to-real techniques. |
N/A |
link |
N/A |
| 17 |
LodeStar: Long-horizon Dexterity via Synthetic Data Augmentation from Human Demonstrations |
Enhances dexterity with synthetic data augmentation from human demonstrations. |
link |
link |
N/A |
| 18 |
Imitation Learning Based on Disentangled Representation Learning of Behavioral Characteristics |
Applies disentangled representation learning for imitation learning of behaviors. |
N/A |
link |
N/A |
| 19 |
Motion Priors Reimagined: Adapting Flat-Terrain Skills for Complex Quadruped Mobility |
Adapts flat-terrain locomotion skills to complex terrains using motion priors. |
link |
link |
link |
| 20 |
Sequence Modeling for Time-Optimal Quadrotor Trajectory Optimization with Sampling-based Robustness Analysis |
Applies sequence modeling for fast and robust quadrotor trajectory optimization. |
N/A |
link |
N/A |
| 21 |
HALO: Human Preference Aligned Offline Reward Learning for Robot Navigation |
Aligns offline reward learning with human preferences for navigation tasks. |
N/A |
link |
N/A |
| 22 |
RoboMonkey: Scaling Test-Time Sampling and Verification for Vision-Language-Action Models |
Scales up verification and sampling for VLA models during test-time. |
link |
link |
link |
| 23 |
Constraint-Preserving Data Generation for One-Shot Visuomotor Policy Generalization |
Generates constraint-preserving data to improve one-shot visuomotor generalization. |
N/A |
link |
N/A |
| 24 |
Hand-Eye Autonomous Delivery: Learning Humanoid Navigation, Locomotion and Reaching |
Trains humanoid robots for navigation and reaching in delivery scenarios. |
N/A |
link |
N/A |
| 25 |
FLARE: Robot Learning with Implicit World Modeling |
Implicit world modeling for more efficient and robust robot learning. |
N/A |
link |
N/A |
| 26 |
From Real World to Logic and Back: Learning Generalizable Relational Concepts For Long Horizon Robot Planning |
Learns relational concepts that bridge real-world experience with symbolic planning for long-horizon tasks. |
N/A |
link |
N/A |
| 27 |
NeuralSVCD for Efficient Swept Volume Collision Detection |
Neural approach to fast swept-volume–based collision detection for motion planning. |
N/A |
link |
N/A |
| 28 |
MEReQ: Max-Ent Residual-Q Inverse RL for Sample-Efficient Alignment from Intervention |
Inverse RL method that aligns policies from sparse interventions via max-entropy residual Q-learning. |
Link |
link |
N/A |
| 29 |
Joint Model-based Model-free Diffusion for Planning with Constraints |
Combines model-based and model-free diffusion to plan under explicit constraints. |
N/A |
link |
Link |
| 30 |
DEQ-MPC : Deep Equilibrium Model Predictive Control |
Uses deep equilibrium networks to solve MPC problems efficiently at inference time. |
N/A |
link |
N/A |
| 31 |
CUPID: Curating Data your Robot Loves with Influence Functions |
Applies influence functions to select training data that most benefits robot learning. |
N/A |
link |
N/A |
| 32 |
Eye, Robot: Learning to Look to Act with a BC-RL Perception-Action Loop |
Trains perception to support action using a behavior-cloning + RL loop for active visual policies. |
N/A |
link |
N/A |
| 33 |
Imagine, Verify, Execute: Memory-guided Agentic Exploration with Vision-Language Models |
Uses VLMs with memory to imagine plans, verify outcomes, and execute tasks agentically. |
link |
link |
N/A |
| 34 |
Dynamics-Compliant Trajectory Diffusion for Super-Nominal Payload Manipulation |
Generates manipulation trajectories via diffusion that respect dynamics under heavy payloads. |
N/A |
link |
N/A |
| 35 |
CoRI: Communication of Robot Intent for Physical Human-Robot Interaction |
Framework for communicating robot intent to improve safety and fluency in pHRI. |
N/A |
link |
N/A |
| 36 |
KoopMotion: Learning Almost Divergence Free Koopman Flow Fields for Motion Planning |
Learns Koopman operator–based flow fields that guide smooth, divergence-free motion plans. |
N/A |
link |
N/A |
| 37 |
Learning Smooth State-Dependent Traversability from Dense Point Clouds |
Predicts traversability as a smooth, state-dependent function directly from dense 3D point clouds. |
N/A |
link |
N/A |
| 38 |
Leveraging Correlation Across Test Platforms for Variance-Reduced Metric Estimation |
Reduces evaluation variance by exploiting correlations across heterogeneous testbeds. |
N/A |
link |
N/A |
| 39 |
Agreement Volatility: A Second-Order Metric for Uncertainty Quantification in Surgical Robot Learning |
Introduces a second-order metric capturing volatility in model agreement to quantify uncertainty. |
N/A |
link |
N/A |
| 40 |
Sample-Efficient Online Control Policy Learning with Real-Time Recursive Model Updates |
Online control learning with recursive updates to maintain real-time performance. |
N/A |
link |
N/A |
| 41 |
Lucid-XR: An Extended-Reality Data Engine for Robotic Manipulation |
Builds an XR-based engine to synthesize diverse manipulation data at scale. |
N/A |
OpenReview |
N/A |
| 42 |
CLAMP: Crowdsourcing a LArge-scale in-the-wild haptic dataset with an open-source device for Multimodal robot Perception |
Collects large-scale in-the-wild haptic data using an open-source crowdsourced device. |
N/A |
link |
N/A |
| 43 |
Distributed Upload and Active Labeling for Resource-Constrained Fleet Learning |
Active labeling and distributed upload strategies tailored for fleet learning under bandwidth limits. |
N/A |
OpenReview |
N/A |
| 44 |
DreamGen: Unlocking Generalization in Robot Learning through Video World Models |
Uses video world models to improve generalization across robot tasks and environments. |
N/A |
link |
N/A |
| 45 |
LocoTouch: Learning Dynamic Quadrupedal Transport with Tactile Sensing |
Incorporates tactile feedback for robust, dynamic quadruped transport behaviors. |
N/A |
link |
N/A |
| 46 |
TopoCut: Learning Multi-Step Cutting with Spectral Rewards and Discrete Diffusion Policies |
Solves multi-step cutting via discrete diffusion guided by topology-aware rewards. |
N/A |
link |
N/A |
| 47 |
WoMAP: World Models For Embodied Open-Vocabulary Object Localization |
Uses open-vocabulary world models to localize objects for embodied agents. |
N/A |
link |
N/A |
| 48 |
MirrorDuo: Reflection-Consistent Visuomotor Learning from Mirrored Demonstration Pairs |
Improves visuomotor learning by enforcing reflection consistency across mirrored demos. |
N/A |
OpenReview |
N/A |
| 49 |
First Order Model-Based RL through Decoupled Backpropagation |
Enables efficient model-based RL via a first-order method with decoupled gradients. |
N/A |
link |
N/A |
| 50 |
CLASS: Contrastive Learning via Action Sequence Supervision for Robot Manipulation |
Contrastive pretraining supervised by action sequences to enhance manipulation policies. |
N/A |
link |
N/A |
| 51 |
Articulated Object Estimation in the Wild |
Estimates articulation and pose of everyday objects from unconstrained, real-world data. |
N/A |
link |
N/A |
| 52 |
Multimodal Fused Learning for Solving the Generalized Traveling Salesman Problem in Robotic Task Planning |
Fuses multiple sensing/modalities to solve generalized TSP instances for robot task planning. |
N/A |
link |
N/A |
| 53 |
Dexplore: Scalable Neural Control for Dexterous Manipulation from Reference Scoped Exploration |
Unified optimization uses MoCap as soft guidance to scale dexterous manipulation control. |
N/A |
link |
N/A |
| 54 |
PrioriTouch: Adapting to User Contact Preferences for Whole-Arm Physical Human-Robot Interaction |
Learns and prioritizes user-specific contact preferences for safer whole-arm pHRI. |
link |
link |
N/A |
| 55 |
UnPose: Uncertainty-Guided Diffusion Priors for Zero-Shot Pose Estimation |
Uses diffusion priors with uncertainty to enable model-free, zero-shot 6D pose estimation. |
N/A |
link |
N/A |
| 56 |
Distilling On-device Language Models for Robot Planning with Minimal Human Intervention |
Distills compact LMs on-device to plan robot actions with minimal human supervision. |
N/A |
link |
N/A |
| 57 |
Mechanistic Interpretability for Steering Vision-Language-Action Models |
Probes and steers VLA internals using mech-int tools to improve safety and control. |
N/A |
link |
N/A |
| 58 |
FFHFlow: Diverse and Uncertainty-Aware Dexterous Grasp Generation via Flow Variational Inference |
Generates diverse, uncertainty-aware dexterous grasps using flow-based variational inference. |
N/A |
link |
N/A |
| 59 |
Tool-as-Interface: Learning Robot Policies from Observing Human Tool Use |
Transfers tool-use knowledge from human demonstrations to robot visuomotor policies. |
N/A |
link |
N/A |
| 60 |
DiWA: Diffusion Policy Adaptation with World Models |
Offline fine-tuning of diffusion policies using a learned world model instead of real interactions. |
link |
link |
N/A |
| 61 |
Self-supervised Learning Of Visual Pose Estimation Without Pose Labels By Classifying LED States |
Replaces pose labels with LED-state classification to self-supervise pose estimation. |
N/A |
link |
N/A |
| 62 |
Diffusion-Guided Multi-Arm Motion Planning |
Guides multi-arm planning with diffusion priors to handle high-dimensional coordination. |
N/A |
link |
N/A |
| 63 |
GraphEQA: Using 3D Semantic Scene Graphs for Real-time Embodied Question Answering |
Employs online 3D semantic scene graphs to plan, explore, and answer embodied questions. |
link |
link |
N/A |
| 64 |
Enter the Mind Palace: Reasoning and Planning for Long-term Active Embodied Question Answering |
Structured memory (“mind palace”) for long-term EQA with exploration–recall tradeoffs. |
N/A |
link |
N/A |
| 65 |
Steerable Scene Generation with Post Training and Inference-Time Search |
Steers generative scene models via post-training objectives and search during inference. |
N/A |
link |
N/A |
| 66 |
RICL: Adding In-Context Adaptability to Pre-Trained Vision-Language-Action Models |
Enables in-context task adaptation for pre-trained VLA models without full finetuning. |
Link |
link |
N/A |
| 67 |
ARCH: Hierarchical Hybrid Learning for Long-Horizon Contact-Rich Robotic Assembly |
Hierarchical composition of parametric skills plus high-level policy for contact-rich assembly. |
link |
link |
N/A |
| 68 |
Pointing3D: A Benchmark for 3D Object Referral via Pointing Gestures |
Benchmark for grounding 3D object references expressed via human pointing gestures. |
N/A |
link |
N/A |
| 69 |
Merging and Disentangling Views in Visual Reinforcement Learning for Robotic Manipulation |
Improves manipulation RL by merging/disentangling multi-view representations. |
N/A |
link |
N/A |
| 70 |
Learning Deployable Locomotion Control via Differentiable Simulation |
Trains locomotion controllers in differentiable simulation for real-world deployment. |
N/A |
link |
N/A |
| 71 |
BEHAVIOR Robot Suite: Streamlining Real-World Whole-Body Manipulation for Everyday Household Activities |
Open robot suite + algorithms for bimanual, mobile whole-body household manipulation. |
link |
link |
link |
| 72 |
GraspQP: Differentiable Optimization of Force Closure for Diverse and Robust Dexterous Grasping |
Differentiable force-closure QP yields diverse, stable dexterous grasps and dataset. |
link |
link |
link |
| 73 |
Adapting by Analogy: OOD Generalization of Visuomotor Policies via Functional Correspondence |
Improves OOD generalization by mapping task analogies via functional correspondence. |
N/A |
link |
N/A |
| 74 |
Long Range Navigator (LRN): Extending robot planning horizons beyond metric maps |
Learns affordance frontiers from vision to plan over horizons far beyond local maps. |
link |
link |
N/A |
| 75 |
Toward Real-World Cooperative and Competitive Soccer with Quadrupedal Robot Teams |
Hierarchical MARL for decentralized quadruped robot soccer in real-world settings. |
N/A |
link |
N/A |
| 76 |
Multi-critic Learning for Whole-body End-effector Twist Tracking |
Uses multiple critics to improve whole-body control for accurate end-effector twist tracking. |
N/A |
link |
N/A |
| 77 |
SIREN: Semantic, Initialization-Free Registration of Multi-Robot Gaussian Splatting Maps |
Registers multi-robot Gaussian splat maps using semantics without requiring initialization. |
Link |
link |
N/A |
| 78 |
Rapid Mismatch Estimation via Neural Network Informed Variational Inference |
Estimates model–reality mismatch quickly using VI guided by neural networks. |
N/A |
link |
N/A |
| 79 |
In-Context Iterative Policy Improvement for Dynamic Manipulation |
Iteratively improves manipulation policies via in-context updates without full retraining. |
N/A |
link |
N/A |
| 80 |
Cost-aware Discovery of Contextual Failures using Bayesian Active Learning |
Finds failure modes with a cost-aware Bayesian active learning strategy. |
N/A |
link |
N/A |
| 81 |
KDPE: A Kernel Density Estimation Strategy for Diffusion Policy Trajectory Selection |
Selects better diffusion trajectories via kernel density estimation over candidates. |
N/A |
link |
N/A |
| 82 |
SocialNav-SUB: Benchmarking VLMs for Scene Understanding in Social Robot Navigation |
Benchmarks vision-language models for socially aware navigation scene understanding. |
N/A |
link |
N/A |
| 83 |
Generating Robot Constitutions & Benchmarks for Semantic Safety |
Creates rule “constitutions” and benchmarks to evaluate semantic safety in robots. |
N/A |
link |
N/A |
| 84 |
Few-Shot Neuro-Symbolic Imitation Learning for Long-Horizon Planning and Acting |
Combines neuro-symbolic reasoning with few-shot imitation for long-horizon tasks. |
N/A |
link |
N/A |
| 85 |
Capability-Aware Shared Hypernetworks for Flexible Heterogeneous Multi-Robot Coordination |
Uses shared hypernetworks aware of robot capabilities for heterogeneous team coordination. |
N/A |
link |
N/A |
| 86 |
AimBot: A Simple Auxiliary Visual Cue to Enhance Spatial Awareness of Visuomotor Policies |
Adds an auxiliary visual cue to boost spatial awareness in visuomotor policies. |
N/A |
link |
N/A |
| 87 |
Beyond Constant Parameters: Hyper Prediction Models and HyperMPC |
Predicts time-varying parameters via hyper models and integrates them into MPC. |
N/A |
link |
N/A |
| 88 |
FLOWER: Democratizing Generalist Robot Policies with Efficient Vision-Language-Flow Models |
Builds efficient VLF models to make generalist robot policies more accessible. |
N/A |
link |
N/A |
| 89 |
Subteaming and Adaptive Formation Control for Coordinated Multi-Robot Navigation |
Dynamically forms subteams and adapts formations for coordinated navigation. |
N/A |
link |
N/A |
| 90 |
Force-Modulated Visual Policy for Robot-Assisted Dressing with Arm Motions |
Integrates force modulation into visual policies for safer robot-assisted dressing. |
N/A |
link |
N/A |
| 91 |
ComposableNav: Instruction-Following Navigation in Dynamic Environments via Composable Diffusion |
Composes diffusion skills to follow instructions for navigation in dynamic scenes. |
N/A |
link |
N/A |
| 92 |
Train-Once Plan-Anywhere Kinodynamic Motion Planning via Diffusion Trees |
Trains diffusion trees once to enable generalizable kinodynamic motion planning. |
N/A |
link |
N/A |
| 93 |
ZipMPC: Compressed Context-Dependent MPC Cost via Imitation Learning |
Learns compact, context-aware MPC cost functions from demonstrations. |
N/A |
link |
N/A |
| 94 |
EndoVLA: Dual-Phase Vision-Language-Action for Precise Autonomous Tracking in Endoscopy |
Two-phase VLA framework for accurate autonomous camera tracking in endoscopy. |
N/A |
link |
N/A |
| 95 |
Morphologically Symmetric Reinforcement Learning for Ambidextrous Bimanual Manipulation |
Exploits morphological symmetry to learn ambidextrous bimanual manipulation skills. |
N/A |
link |
N/A |
| 96 |
Robot Operating Home Appliances by Reading User Manuals |
Extracts procedural knowledge from manuals to operate home appliances autonomously. |
N/A |
link |
N/A |
| 97 |
CARE: Enhancing Safety of Visual Navigation through Collision Avoidance via Repulsive Estimation |
Improves navigation safety with a repulsive-estimation collision avoidance module. |
N/A |
link |
N/A |
| 98 |
MimicFunc: Imitating Tool Manipulation from a Single Human Video via Functional Correspondence |
Imitates tool-use from a single video by aligning functional correspondences. |
N/A |
link |
N/A |
| 99 |
CogniPlan: Uncertainty-Guided Path Planning with Conditional Generative Layout Prediction |
Predicts environment layouts conditionally and plans paths with uncertainty guidance. |
Link |
link |
N/A |
| 100 |
Fast Flow-based Visuomotor Policies via Conditional Optimal Transport Couplings |
Trains fast visuomotor policies using conditional optimal transport–based flow couplings. |
N/A |
link |
N/A |
| 101 |
Enabling Long(er) Horizon Imitation for Manipulation Tasks by Modeling Subgoal Transitions |
Improves long-horizon imitation by explicitly modeling transitions between subgoals. |
N/A |
link |
N/A |
| 102 |
Search-TTA: A Multi-Modal Test-Time Adaptation Framework for Visual Search in the Wild |
Performs multi-modal test-time adaptation to robustify visual search in-the-wild. |
N/A |
link |
N/A |
| 103 |
Mastering Multi-Drone Volleyball through Hierarchical Co-Self-Play Reinforcement Learning |
Hierarchical co-self-play enables coordinated multi-drone volleyball behaviors. |
N/A |
link |
N/A |
| 104 |
Wheeled Lab: Modern Sim2Real for Low-cost, Open-source Wheeled Robotics |
Low-cost open-source wheeled robotics platform with strong sim-to-real pipeline. |
N/A |
link |
N/A |
| 105 |
DexVLA: Vision-Language Model with Plug-In Diffusion Expert for General Robot Control |
VLM enhanced with a plug-in diffusion expert for general-purpose robot control. |
Link |
link |
Link |
| 106 |
From Space to Time: Enabling Adaptive Safety with Learned Value Functions via Disturbance Recasting |
Recasts disturbances and learns value functions to adapt safety constraints over time. |
N/A |
link |
N/A |
| 107 |
Mobi-: Mobilizing Your Robot Learning Policy |
Framework/tools to package and deploy learned robot policies across platforms (“mobilize”). |
Link |
link |
Link |
| 108 |
Generative Visual Foresight Meets Task-Agnostic Pose Estimation in Robotic Table-top Manipulation |
Combines visual foresight with task-agnostic pose estimation for table-top tasks. |
N/A |
link |
N/A |
| 109 |
Diffusion Dynamics Models with Generative State Estimation for Cloth Manipulation |
Diffusion-based dynamics and generative state estimation for deformable cloth manipulation. |
N/A |
link |
N/A |
| 110 |
Contrastive Forward Prediction Reinforcement Learning for Adaptive Fault-Tolerant Legged Robots |
Contrastive forward prediction enables adaptive, fault-tolerant legged locomotion. |
N/A |
link |
N/A |
| 111 |
Action-Free Reasoning for Policy Generalization |
Encourages high-level reasoning without action supervision to improve generalization. |
Link |
link |
N/A |
| 112 |
Learn from What We HAVE: History-Aware VErifier that Reasons about Past Interactions Online |
Online verifier leverages historical interactions to prevent policy failures. |
N/A |
link |
N/A |
| 113 |
KineDex: Learning Tactile-Informed Visuomotor Policies via Kinesthetic Teaching for Dexterous Manipulation |
Kinesthetic teaching with tactile cues to train dexterous visuomotor policies. |
N/A |
link |
N/A |
| 114 |
FlashBack: Consistency Model-Accelerated Shared Autonomy |
Uses consistency models to accelerate and stabilize shared autonomy. |
N/A |
link |
N/A |
| 115 |
Granular loco-manipulation: Repositioning rocks through strategic sand avalanche |
Manipulates granular media to reposition rocks via controlled sand avalanches. |
N/A |
link |
N/A |
| 116 |
D-CODA: Diffusion for Coordinated Dual-Arm Data Augmentation |
Generates coordinated bimanual data via diffusion for dual-arm manipulation learning. |
N/A |
link |
N/A |
| 117 |
JaxRobotarium: Training and Deploying Multi-Robot Policies in 10 Minutes |
JAX-based framework for rapid training/deployment of multi-robot policies. |
N/A |
link |
N/A |
| 118 |
Uncertainty-aware Latent Safety Filters for Avoiding Out-of-Distribution Failures |
Latent-space safety filters with uncertainty estimates to avoid OOD failures. |
N/A |
link |
N/A |
| 119 |
ATK: Automatic Task-driven Keypoint Selection for Robust Policy Learning |
Selects task-relevant keypoints automatically to improve policy robustness. |
N/A |
link |
N/A |
| 120 |
ManipBench: Benchmarking Vision-Language Models for Low-Level Robot Manipulation |
Benchmark to test VLMs on low-level manipulation perception and control tasks. |
N/A |
link |
N/A |
| 121 |
IRIS: An Immersive Robot Interaction System |
Immersive system for interacting with robots using mixed/extended reality interfaces. |
N/A |
link |
N/A |
| 122 |
ActLoc: Learning to Localize on the Move via Active Viewpoint Selection |
Actively selects viewpoints to maintain localization accuracy while moving. |
N/A |
link |
N/A |
| 123 |
AnyPlace: Learning Generalizable Object Placement for Robot Manipulation |
Learns placement policies that generalize across objects and contexts. |
Link |
link |
Link |
| 124 |
AutoEval: Autonomous Evaluation of Generalist Robot Manipulation Policies in the Real World |
Automates real-world evaluation of generalist manipulation policies with minimal setup. |
Link |
link |
Link |
| 125 |
Poke and Strike: Learning Task-Informed Exploration Policies |
Designs exploration policies that “poke & strike” based on task-informed signals. |
N/A |
link |
N/A |
| 126 |
Ensuring Force Safety in Vision-Guided Robotic Manipulation via Implicit Tactile Calibration |
Improves force safety in vision-guided manipulation by implicitly calibrating tactile responses. |
N/A |
link |
N/A |
| 127 |
Off Policy Lyapunov Stability in Reinforcement Learning |
Analyzes and enforces Lyapunov stability guarantees for off-policy RL controllers. |
N/A |
link |
N/A |
| 128 |
ReCoDe: Reinforcement Learning-based Dynamic Constraint Design for Multi-Agent Coordination |
Learns task-dependent constraints to coordinate multiple agents more effectively. |
N/A |
link |
N/A |
| 129 |
Human-like Navigation in a World Built for Humans |
Plans and navigates using priors that emulate human motion patterns and preferences. |
N/A |
link |
N/A |
| 130 |
Efficient Evaluation of Multi-Task Robot Policies With Active Experiment Selection |
Actively selects informative trials to evaluate multi-task policies with fewer experiments. |
N/A |
link |
N/A |
| 131 |
Fail2Progress: Learning from Real-World Robot Failures with Stein Variational Inference |
Uses SVGD to turn failure experiences into policy improvements from real robot runs. |
Link |
link |
N/A |
| 132 |
ControlVLA: Few-shot Object-centric Adaptation for Pre-trained Vision-Language-Action Models |
Adapts pretrained VLA models to new objects/tasks with few-shot object-centric tuning. |
N/A |
link |
N/A |
| 133 |
SafeBimanual: Diffusion-based trajectory optimization for safe bimanual manipulation |
Optimizes bimanual trajectories with diffusion priors while enforcing safety constraints. |
N/A |
link |
N/A |
| 134 |
Junction State Estimation for Efficient Exploration in Reinforcement Learning |
Detects “junction” states to guide exploration and reduce sample complexity in RL. |
N/A |
link |
N/A |
| 135 |
QuaDreamer: Controllable Panoramic Video Generation for Quadruped Robots |
Generates controllable panoramic videos to train and evaluate quadruped perception stacks. |
N/A |
link |
N/A |
| 136 |
COMBO-Grasp: Learning Constraint-Based Manipulation for Bimanual Occluded Grasping |
Learned constraint-based policies for robust bimanual grasping under occlusion. |
N/A |
link |
N/A |
| 137 |
D-Cubed: Latent Diffusion Trajectory Optimisation for Dexterous Deformable Manipulation |
Optimizes dexterous trajectories for deformables using latent diffusion models. |
Link |
link |
N/A |
| 138 |
Long-VLA: Unleashing Long-Horizon Capability of Vision Language Action Model for Robot Manipulation |
Extends VLA planning capabilities to long-horizon manipulation tasks. |
Link |
link |
N/A |
| 139 |
ImLPR: Image-based LiDAR Place Recognition using Vision Foundation Models |
Leverages VFM features for robust image-based LiDAR place recognition. |
N/A |
link |
N/A |
| 140 |
MoTo: A Zero-shot Plug-in Interaction-aware Navigation for General Mobile Manipulation |
Zero-shot, plugin-based navigation that accounts for interactions in mobile manipulation. |
N/A |
link |
N/A |
| 141 |
RobotxR1: Enabling Embodied Robotic Intelligence on Large Language Models through Closed-Loop Reinforcement Learning |
Combines LLM reasoning with closed-loop RL for embodied robotic tasks. |
N/A |
link |
N/A |
| 142 |
GraspVLA: a Grasping Foundation Model Pre-trained on Billion-scale Synthetic Action Data |
Pretrains a grasping foundation model on large-scale synthetic action sequences. |
Link |
link |
Link |
| 143 |
Learning from 10 Demos: Generalisable and Sample-Efficient Policy Learning with Oriented Affordance Frames |
Achieves generalization from ~10 demos using oriented affordance-frame supervision. |
N/A |
link |
N/A |
| 144 |
Learning Long-Context Diffusion Policies via Past-Token Prediction |
Trains diffusion policies to leverage long past-context using next/past-token prediction. |
Link |
link |
N/A |
| 145 |
Gen2Act: Human Video Generation in Novel Scenarios enables Generalizable Robot Manipulation |
Generates diverse human videos in novel scenes to bootstrap generalizable manipulation. |
Link |
link |
N/A |
| 146 |
Articulate AnyMesh: Open-vocabulary 3D Articulated Objects Modeling |
Models open-vocabulary articulated objects by inferring meshes and kinematic structure. |
N/A |
link |
N/A |
| 147 |
Phantom: Training Robots Without Robots Using Only Human Videos |
Trains robot policies purely from human videos without robot data collection. |
Link |
link |
Link |
| 148 |
Residual Neural Terminal Constraint for MPC-based Collision Avoidance in Dynamic Environments |
Adds a learned residual terminal constraint to MPC for safer collision avoidance. |
N/A |
link |
N/A |
| 149 |
Hold My Beer: Learning Gentle Humanoid Locomotion and End-Effector Stabilization Control |
Trains humanoids for soft-contact locomotion with stabilized end-effector control. |
N/A |
link |
N/A |
| 150 |
TReF-6: Inferring Task-Relevant Frames from a Single Demonstration for One-Shot Skill Generalization |
Infers task-relevant frames from a single demo to enable one-shot skill generalization. |
N/A |
link |
N/A |
| 151 |
UniTac2Pose: A Unified Approach Learned in Simulation for Category-level Visuotactile In-hand Pose Estimation |
Unifies visual+tactile cues in sim to estimate in-hand pose at the category level. |
N/A |
link |
N/A |
| 152 |
Bipedal Balance Control with Whole-body Musculoskeletal Standing and Falling Simulations |
Uses musculoskeletal simulations of standing/falling to learn robust biped balance control. |
N/A |
link |
N/A |
| 153 |
Do LLM Modules Generalize? A Study on Motion Generation for Autonomous Driving |
Evaluates whether LLM-based modules generalize to unseen driving motion generation tasks. |
N/A |
link |
N/A |
| 154 |
AgentWorld: An Interactive Simulation Platform for Scene Construction and Mobile Robotic Manipulation |
Interactive platform for constructing scenes and training mobile manipulation policies. |
N/A |
link |
N/A |
| 155 |
FastUMI: A Scalable and Hardware-Independent Universal Manipulation Interface with Dataset |
Hardware-agnostic manipulation interface with a large supporting dataset for scaling. |
link |
link |
link |
| 156 |
Towards Generalizable Safety in Crowd Navigation via Conformal Uncertainty Handling |
Applies conformal uncertainty methods to ensure safety during crowd navigation. |
N/A |
link |
link |
| 157 |
Reflective Planning: Vision-Language Models for Multi-Stage Long-Horizon Robotic Manipulation |
Uses VLMs with reflection to plan multi-stage long-horizon manipulation sequences. |
link |
link |
link |
| 158 |
Unsupervised Skill Discovery as Exploration for Learning Agile Locomotion |
Discovers diverse skills for locomotion, turning exploration into agile behaviors. |
N/A |
link |
N/A |
| 159 |
HyperTASR: Hypernetwork-Driven Task-Aware Scene Representations for Robust Manipulation |
Hypernetworks generate task-aware scene embeddings to improve manipulation robustness. |
N/A |
link |
N/A |
| 160 |
GENNAV: Polygon Mask Generation for Generalized Referring Navigable Regions |
Generates polygon masks for referred navigable regions to guide embodied navigation. |
link |
link |
N/A |
| 161 |
PicoPose: Progressive Pixel-to-Pixel Correspondence Learning for Novel Object Pose Estimation |
Learns dense correspondences progressively to estimate 6D poses of novel objects. |
N/A |
link |
N/A |
| 162 |
CHD: Coupled Hierarchical Diffusion for Long-Horizon Tasks |
Couples hierarchical policies with diffusion for scalable long-horizon task planning. |
N/A |
link |
N/A |
| 163 |
BEVCalib: LiDAR-Camera Calibration via Geometry-Guided Bird’s-Eye View Representation |
Performs LiDAR–camera calibration using geometry cues in BEV space without markers. |
link |
link |
link |
| 164 |
Latent Adaptive Planner for Dynamic Manipulation |
Plans dynamic manipulations by adapting in a learned latent space of strategies. |
N/A |
link |
N/A |
| 165 |
SLAC: Simulation-Pretrained Latent Action Space for Whole-Body Real-World RL |
Pretrains a latent action space in sim to accelerate whole-body RL on hardware. |
link |
link |
link |
| 166 |
Neural Robot Dynamics |
Learns differentiable robot dynamics models that capture complex contact effects. |
link |
link |
link |
| 167 |
GC-VLN: Instruction as Graph Constraints for Training-free Vision-and-Language Navigation |
Turns instructions into graph constraints to enable training-free VLN execution. |
N/A |
link |
N/A |
| 168 |
Constrained Style Learning from Imperfect Demonstrations under Task Optimality |
Extracts task-consistent style from noisy demos while enforcing optimality constraints. |
N/A |
link |
N/A |
| 169 |
TypeTele: Releasing Dexterity in Teleoperation by Dexterous Manipulation Types |
Teleoperation framework organized around dexterous manipulation types to increase dexterity. |
link |
link |
N/A |
| 170 |
VT-Refine: Learning Bimanual Assembly with Visuo-Tactile Feedback via Simulation Fine-Tuning |
Combines visual and tactile feedback with simulation fine-tuning for precise bimanual assembly. |
N/A |
link |
N/A |
| 171 |
Adapt3R: Adaptive 3D Scene Representation for Domain Transfer in Imitation Learning |
Adapts 3D scene representations to new domains to improve imitation policy transfer. |
N/A |
link |
N/A |
| 172 |
FOMO-3D: Using Vision Foundation Models for Long-Tailed 3D Object Detection |
Leverages vision foundation model (VFM) priors to handle long-tailed distributions in 3D detection tasks. |
N/A |
link |
N/A |
| 173 |
Extracting Visual Plans from Unlabeled Videos via Symbolic Guidance |
Induces symbolic plans from unlabeled videos and converts them to robot-executable steps. |
N/A |
link |
N/A |
| 174 |
Self-supervised perception for tactile skin covered dexterous hands |
Self-supervised learning for perception on dexterous hands covered with tactile skin. |
N/A |
link |
N/A |
| 175 |
Predictive Red Teaming: Breaking Policies Without Breaking Robots |
“Red teams” robot policies in simulation to expose failures before real-world deployment. |
link |
link |
N/A |
| 176 |
Learning Long-Horizon Robot Manipulation Skills via Privileged Action |
Uses privileged action signals during training to acquire long-horizon manipulation skills. |
N/A |
link |
N/A |
| 177 |
Towards Embodiment Scaling Laws in Robot Locomotion |
Empirical study of how embodiment and data scale affect locomotion policy performance. |
N/A |
link |
N/A |
| 178 |
O$^3$Afford: One-Shot 3D Object-to-Object Affordance Grounding for Generalizable Robotic Manipulation |
One-shot affordance grounding between novel 3D objects to generalize manipulation skills. |
N/A |
link |
N/A |
| 179 |
Estimating Value of Assistance for Online POMDP Robotic Agents |
Quantifies assistance value online to decide when human help benefits a POMDP agent. |
N/A |
link |
N/A |
| 180 |
Shortcut Learning in Generalist Robot Policies: The Role of Dataset Diversity and Fragmentation |
Analyzes shortcut biases and how dataset diversity/fragmentation impact generalist policies. |
N/A |
link |
N/A |
| 181 |
SDS – See it, Do it, Sorted: Quadruped Skill Synthesis from Single Video Demonstration |
Derives quadruped skills from a single video demo using perception-to-action synthesis. |
link |
link |
link |
| 182 |
TrackVLA: Embodied Visual Tracking in the Wild |
Embeds tracking into VLA frameworks for robust in-the-wild target following. |
link |
link |
link |
| 183 |
GLOVER++: Unleashing the Potential of Affordance Learning from Human Behaviors for Robotic Manipulation |
Leverages human behavior data to improve affordance learning for manipulation tasks. |
N/A |
link |
N/A |
| 184 |
LaDi-WM: A Latent Diffusion-Based World Model for Predictive Manipulation |
Uses latent diffusion world models to predict future states for manipulation planning. |
N/A |
link |
N/A |
| 185 |
ManiFlow: A General Robot Manipulation Policy via Consistency Flow Training |
Trains a general manipulation policy using consistency-based flow training objectives. |
link |
link |
N/A |
| 186 |
Robot Learning from Any Images |
Bootstraps visuomotor skills from broad internet-scale images. |
N/A |
link |
N/A |
| 187 |
Deep Reactive Policy: Learning Reactive Manipulator Motion Planning for Dynamic Environments |
End-to-end learned reactive planner for manipulators operating in dynamic scenes. |
link |
link |
N/A |
| 188 |
CASPER: Inferring Diverse Intents for Assistive Teleoperation with Vision Language Models |
Infers user intent distributions via VLMs to assist teleoperation effectively. |
N/A |
link |
N/A |
| 189 |
UniSkill: Imitating Human Videos via Cross-Embodiment Skill Representations |
Transfers skills from human videos to robots using cross-embodiment representations. |
link |
link |
link |
| 190 |
ParticleFormer: A 3D Point Cloud World Model for Multi-Object, Multi-Material Robotic Manipulation |
Particle-based world model on point clouds to handle multi-object, multi-material dynamics. |
N/A |
link |
N/A |
| 191 |
Crossing the Human-Robot Embodiment Gap with Sim-to-Real RL using One Human Demonstration |
Bridges the human-to-robot embodiment gap via sim-to-real RL seeded by one human demo. |
N/A |
link |
link |
| 192 |
One Demo is Worth a Thousand Trajectories: Action-View Augmentation for Visuomotor Policies |
Action-view augmentation from a single demo to train robust visuomotor policies. |
N/A |
link |
N/A |
| 193 |
VLM-AD: End-to-End Autonomous Driving through Vision-Language Model Supervision |
Supervises end-to-end driving policies using high-level VLM reasoning signals. |
N/A |
link |
N/A |
| 194 |
LLM-Guided Probabilistic Program Induction for POMDP Model Estimation |
Uses LLMs to induce probabilistic programs that estimate POMDP dynamics/observations. |
N/A |
link |
N/A |
| 195 |
Vision in Action: Learning Active Perception from Human Demonstrations |
Learns active perception strategies from human demos for better task performance. |
link |
link |
link |
| 196 |
Point Policy: Unifying Observations and Actions with Key Points for Robot Manipulation |
Represents both observations and actions as keypoints to simplify manipulation learning. |
link |
link |
link |
| 197 |
From Tabula Rasa to Emergent Abilities: Discovering Robot Skills via Real-World Unsupervised Quality-Diversity |
Discovers diverse real-world skills via unsupervised quality-diversity search. |
N/A |
link |
N/A |
| 198 |
Robust Dexterous Grasping of General Objects |
Designs policies for robust dexterous grasping across varied, previously unseen objects. |
link |
link |
link |
| 199 |
Sim-to-Real Reinforcement Learning for Vision-Based Dexterous Manipulation on Humanoids |
Transfers vision-based dexterous manipulation skills onto humanoid platforms via RL. |
link |
link |
N/A |
| 200 |
Humanoid Policy ~ Human Policy |
Explores alignment between humanoid robot policies and human-like control strategies. |
link |
link |
link |
| 201 |
ToddlerBot: Open-Source ML-Compatible Humanoid Platform for Loco-Manipulation |
Open-source humanoid platform designed for learning-based loco-manipulation research. |
link |
link |
link |
| 202 |
TWIST: Teleoperated Whole-Body Imitation System |
Teleoperation framework enabling full-body imitation to collect rich humanoid data. |
link |
link |
link |
| 203 |
exUMI: Extensible Robot Teaching System with Action-aware Task-agnostic Tactile Representation |
Extensible teaching system that learns task-agnostic, action-aware tactile representations. |
N/A |
link |
N/A |
| 204 |
OPAL: Visibility-aware LiDAR-to-OpenStreetMap Place Recognition via Adaptive Radial Fusion |
Performs LiDAR↔OSM place recognition using visibility-aware adaptive radial fusion in BEV. |
N/A |
link |
N/A |
| 205 |
CDP: Towards Robust Autoregressive Visuomotor Policy Learning via Causal Diffusion |
Introduces causal diffusion to stabilize autoregressive visuomotor policy learning. |
N/A |
link |
N/A |
| 206 |
Robot Trains Robot: Automatic Real-World Policy Adaptation and Learning for Humanoids |
Humanoids auto-adapt policies in the real world via self-training and evaluation loops. |
N/A |
link |
N/A |
| 207 |
See, Point, Fly: A Learning-Free VLM Framework for Universal Unmanned Aerial Navigation |
Learning-free VLM pipeline that parses language/vision prompts for UAV navigation targets. |
link |
link |
link |
| 208 |
LaVA-Man: Learning Visual Action Representations for Robot Manipulation |
Learns visual action embeddings to improve generalization in robot manipulation. |
N/A |
link |
N/A |
| 209 |
3DS-VLA: A 3D Spatial-Aware Vision Language Action Model for Robust Multi-Task Manipulation |
VLA model with explicit 3D spatial awareness for robust, multi-task manipulation. |
N/A |
link |
N/A |
| 210 |
Uncertainty-aware Accurate Elevation Modeling for Off-road Navigation via Neural Processes |
Neural processes produce uncertainty-aware elevation maps for safer off-road planning. |
N/A |
link |
N/A |
| 211 |
Generalist Robot Manipulation beyond Action Labeled Data |
Trains generalist manipulation policies without relying on explicit action labels. |
link |
link |
link |
| 212 |
BranchOut: Capturing Realistic Multimodality in Autonomous Driving Decisions |
Models branching, multimodal futures to better capture diverse driving decisions. |
N/A |
link |
N/A |
| 213 |
Co-Design of Soft Gripper with Neural Physics |
Jointly optimizes soft gripper design and control with differentiable neural physics. |
link |
link |
link |
| 214 |
Elucidating the Design Space of Torque-aware Vision-Language-Action Models |
Systematic study of torque-aware VLA design choices for manipulation performance. |
N/A |
link |
N/A |
| 215 |
RoboChemist: Long-Horizon and Safety-Compliant Robotic Chemical Experimentation |
Automates long-horizon chemical experiments with explicit safety constraints. |
N/A |
link |
N/A |
| 216 |
COLLAGE: Adaptive Fusion-based Retrieval for Augmented Policy Learning |
Retrieves and fuses relevant experience adaptively to augment policy learning. |
link |
link |
link |
| 217 |
GraspMolmo: Generalizable Task-Oriented Grasping via Large-Scale Synthetic Data Generation |
Generates large-scale synthetic data to train task-oriented grasping policies. |
link |
link |
link |
| 218 |
Motion Blender Gaussian Splatting for Dynamic Reconstruction |
Combines motion blending with Gaussian splats to reconstruct dynamic scenes. |
N/A |
link |
N/A |
| 219 |
Improving Efficiency of Sampling-based Motion Planning via Message-Passing Monte Carlo |
Message-passing Monte Carlo reduces variance and improves sampling-based planning efficiency. |
N/A |
link |
N/A |
| 220 |
CaRL: Learning Scalable Planning Policies with Simple Rewards |
Scales planning policies using simple reward structures with strong generalization. |
N/A |
link |
N/A |
| 221 |
Pseudo-Simulation for Autonomous Driving |
Uses pseudo-simulated data to train and evaluate driving policies more efficiently. |
N/A |
link |
link |