Awesome-Robotics-Manipulation
✨ About
This repo contains a curated list of papers on robot manipulation.
Please feel free to send pull requests or email me to add papers! This version of the repository may contain some typos, so don't hesitate to contact me with corrections!
Table of Contents
- Awesome Papers
- Survey
- Grasp
- Manipulation
- Representation Learning with Auxiliary Tasks
- Visual Representation Learning
- Multimodal Representation Learning
- Latent Action Learning
- World Model
- Asynchronous Action Learning
- Diffusion Policy Learning
- Other Policies
- Vision Language Action Models
- Reinforcement Learning
- Motion, Trajectory and Flow
- Data Collection, Selection and Augmentation
- Affordance Learning
- 3D Representation for Manipulation
- 3D Representation Policy Learning
- Reasoning, Planning and Code Generation
- Generalization
- Generalist
- Human-Robot Interaction and Collaboration
- Mobile Manipulation
- Tactile-based Manipulation
- Dexterous Manipulation
- Other Applications
- Awesome Benchmarks
- Awesome Techniques
Awesome Papers
Survey
Title | Venue | Date | Code | Notes |
---|---|---|---|---|
A Survey of Embodied Learning for Object-Centric Robotic Manipulation | arXiv | 2024-08-21 | | Manipulation |
Aligning Cyber Space with Physical World: A Comprehensive Survey on Embodied AI | arXiv | 2024-07-09 | | Embodied Agent |
A Survey on Vision-Language-Action Models for Embodied AI | arXiv | 2024-05-23 | - | VLA Models |
Survey of Learning-based Approaches for Robotic In-Hand Manipulation | arXiv | 2024-01-15 | - | In-hand Manipulation |
Language-conditioned Learning for Robotic Manipulation: A Survey | arXiv | 2023-12-17 | | Manipulation |
Deep Learning Approaches to Grasp Synthesis: A Review | T-RO 2023 | 2023-07-06 | Project | Grasp |
🦾 Grasp
Rectangle-based Grasp
6-DoF Grasp
Title | Venue | Date | Code |
---|---|---|---|
Real-to-Sim Grasp: Rethinking the Gap between Simulation and Real World in Grasp Detection | CoRL 2024 | 2024-10-09 | Project |
OrbitGrasp: SE(3)-Equivariant Grasp Learning | CoRL 2024 | 2024-07-03 | Project |
EquiGraspFlow: SE(3)-Equivariant 6-DoF Grasp Pose Generative Flows | CoRL 2024 | 2024-09-06 | |
EconomicGrasp: An Economic Framework for 6-DoF Grasp Detection | ECCV 2024 | 2024-07-11 | |
Generalizing 6-DoF Grasp Detection via Domain Prior Knowledge | CVPR 2024 | 2024-04-02 | |
FlexLoG: Rethinking 6-Dof Grasp Detection: A Flexible Framework for High-Quality Grasping | arXiv | 2024-03-22 | - |
HGGD: Efficient Heatmap-Guided 6-Dof Grasp Detection in Cluttered Scenes | RA-L 2023 | 2024-03-27 | |
AnyGrasp: Robust and Efficient Grasp Perception in Spatial and Temporal Domains | T-RO 2023 | 2022-12-16 | Github |
GraspNet-1Billion: A Large-Scale Benchmark for General Object Grasping | CVPR 2020 | 2020 | |
6-DOF GraspNet: Variational Grasp Generation for Object Manipulation | ICCV 2019 | 2019-05-25 |
Grasp with 3D Techniques
Title | Venue | Date | Code |
---|---|---|---|
SDF | |||
IGD: Implicit Grasp Diffusion: Bridging the Gap between Dense Prediction and Sampling-based Grasping | CoRL 2024 | | Gitlab |
NeuGraspNet: Learning Any-View 6DoF Robotic Grasping in Cluttered Scenes via Neural Surface Rendering | RSS 2024 | 2023-06-12 | - |
NeRF | |||
LERF-TOGO: Language Embedded Radiance Fields for Zero-Shot Task-Oriented Grasping | CoRL 2023 | 2023-09-14 | |
GraspNeRF: Multiview-based 6-DoF Grasp Detection for Transparent and Specular Objects Using Generalizable NeRF | ICRA 2023 | 2022-10-12 | |
3D Gaussian Splatting (3DGS) | |||
SparseGrasp: Robotic Grasping via 3D Semantic Gaussian Splatting from Sparse Multi-View RGB Images | arXiv | 2024-12-03 | - |
GraspSplats: Efficient Manipulation with 3D Feature Splatting | CoRL 2024 | 2024-09-03 | |
GaussianGrasper: 3D Language Gaussian Splatting for Open-vocabulary Robotic Grasping | RA-L 2024 | 2024-03-14 |
Language-Driven Grasp
Title | Venue | Date | Code |
---|---|---|---|
RoboReflect: Robotic Reflective Reasoning for Grasping Ambiguous-Condition Objects | arXiv | 2025-01-16 | - |
Attribute-Based Robotic Grasping with Data-Efficient Adaptation | T-RO 2024 | 2024-12-12 | Project |
RTAGrasp: Learning Task-Oriented Grasping from Human Videos via Retrieval, Transfer, and Alignment | ICRA 2025 | 2024-09-24 | Project |
LGrasp6D: Language-Driven 6-DoF Grasp Detection Using Negative Prompt Guidance | ECCV 2024 | 2024-07-18 | |
Reasoning Grasping: Reasoning Grasping via Multimodal Large Language Model | CoRL 2024 | 2024-02-09 | Project |
ThinkGrasp: A Vision-Language System for Strategic Part Grasping in Clutter | CoRL 2024 | 2024-07-16 | |
OWG: Towards Open-World Grasping with Large Vision-Language Models | CoRL 2024 | 2024-06-26 | Project |
RT-Grasp: Reasoning Tuning Robotic Grasping via Multi-modal Large Language Model | IROS 2024 | 2024-11-07 | Project |
Grasp for Transparent Objects
Title | Venue | Date | Code |
---|---|---|---|
T2SQNet: A Recognition Model for Manipulating Partially Observed Transparent Tableware Objects | CoRL 2024 | 2024-09-06 | |
ASGrasp: Generalizable Transparent Object Reconstruction and Grasping from RGB-D Active Stereo Camera | ICRA 2024 | 2024-05-09 | |
Dex-NeRF: Using a Neural Radiance Field to Grasp Transparent Objects | CoRL 2021 | 2021-10-27 |
Dexterous Grasp
Title | Venue | Date | Code |
---|---|---|---|
Grasp What You Want: Embodied Dexterous Grasping System Driven by Your Voice | arXiv | 2024-12-14 | Project |
UniGraspTransformer: Simplified Policy Distillation for Scalable Dexterous Robotic Grasping | arXiv | 2024-12-03 |
🤖 Manipulation
Representation Learning with Auxiliary Tasks
Visual Representation Learning
Multimodal Representation Learning
Title | Venue | Date | Code |
---|---|---|---|
VLAS: Vision-Language-Action Model with Speech Instructions for Customized Robot Manipulation | ICLR 2025 | 2025-01-25 | - |
MS-Bot: Play to the Score: Stage-Guided Dynamic Multi-Sensory Fusion for Robotic Manipulation | CoRL 2024 | 2024-08-02 | |
MUTEX: Learning Unified Policies from Multimodal Task Specifications | CoRL 2023 | 2023-09-25 |
Latent Action Learning
Title | Venue | Date | Code |
---|---|---|---|
Moto: Latent Motion Token as the Bridging Language for Robot Manipulation | arXiv | 2024-12-05 | |
Discrete Policy: Learning Disentangled Action Space for Multi-Task Robotic Manipulation | ICRA 2025 | 2024-09-27 | Project |
IGOR: Image-GOal Representations are the Atomic Control Units for Foundation Models in Embodied AI | - | 2024 | Project |
LAPA: Latent Action Pretraining from Videos | ICLR 2025 | 2024-10-15 | |
GRIF: Goal Representations for Instruction Following: A Semi-Supervised Language Interface to Control | CoRL 2023 | 2023-06-30 | |
MimicPlay: Long-Horizon Imitation Learning by Watching Human Play | CoRL 2023 | 2023-02-24 | |
KOAP: Imitation Learning with Limited Actions via Diffusion Planners and Deep Koopman Controllers | arXiv | 2024-10-24 | - |
LAPO: Learning to Act without Actions | ICLR 2024 | 2023-12-17 | |
ILPO: Imitating Latent Policies from Observation | ICML 2019 | 2018-05-21 |
World Model
Title | Venue | Date | Code |
---|---|---|---|
Sirius-Fleet: Multi-Task Interactive Robot Fleet Learning with Visual World Models | CoRL 2024 | 2024-10-30 | Project |
MOTO: Offline Pre-training to Online Fine-tuning for Model-based Robot Learning | CoRL 2023 | 2024-01-06 | Project |
FOWM: Finetuning Offline World Models in the Real World | CoRL 2023 | 2023-10-24 | |
SWIM: Structured World Models from Human Videos | RSS 2023 | 2023-08-23 | Project |
Surfer: Progressive Reasoning with World Models for Robotic Manipulation | arXiv | 2023-06-20 |
Asynchronous Action Learning
Title | Venue | Date | Code |
---|---|---|---|
PIVOT-R: Primitive-Driven Waypoint-Aware World Model for Robotic Manipulation | NeurIPS 2024 | 2024-10-14 | Project |
HiRT: Enhancing Robotic Control with Hierarchical Robot Transformers | CoRL 2024 | 2024-09-12 | - |
MResT: Multi-Resolution Sensing for Real-Time Control with Vision-Language Models | CoRL 2023 | 2024-01-25 |
Diffusion Policy Learning
Other Policies
Vision Language Action Models
Reinforcement Learning
Motion, Trajectory and Flow
Data Collection, Selection and Augmentation
Affordance Learning
3D Representation for Manipulation
3D Representation Policy Learning
Reasoning, Planning and Code Generation
Generalization
Generalist
Human-Robot Interaction and Collaboration
Mobile Manipulation
Title | Venue | Date | Code |
---|---|---|---|
Robi Butler: Remote Multimodal Interactions with Household Robot Assistant | arXiv | 2024-09-30 | Project |
TaMMa: Target-driven Multi-subscene Mobile Manipulation | CoRL 2024 | 2024-09-06 | - |
SayPlan: Grounding Large Language Models using 3D Scene Graphs for Scalable Robot Task Planning | CoRL 2023 | 2024-07-12 | Project |
Mobile ALOHA: Learning Bimanual Mobile Manipulation with Low-Cost Whole-Body Teleoperation | CoRL 2024 | 2024-01-04 | |
GAMMA: Graspability-Aware Mobile MAnipulation Policy Learning based on Online Grasping Pose Fusion | ICRA 2024 | 2023-09-27 |
Tactile-based Manipulation
Title | Venue | Date | Code |
---|---|---|---|
Digitizing Touch with an Artificial Multimodal Fingertip | arXiv | 2024-11-04 | |
Sparsh: Self-supervised touch representations for vision-based tactile sensing | CoRL 2024 | 2024 | |
MimicTouch: Leveraging Multi-modal Human Tactile Demonstrations for Contact-rich Manipulation | CoRL 2024 | 2023-10-25 | Project |
Octopi: Object Property Reasoning with Large Tactile-Language Models | RSS 2024 | 2024-05-05 | |
RoboPack: Learning Tactile-Informed Dynamics Models for Dense Packing | RSS 2024 | 2024-07-01 | Project |
RotateIt: General In-Hand Object Rotation with Vision and Touch | CoRL 2023 | 2023-09-18 | Project |
T-DEX: Dexterity from Touch: Self-Supervised Pre-Training of Tactile Representations with Robotic Play | CoRL 2023 | 2023-03-21 |
Dexterous Manipulation
Other Applications
Awesome Benchmarks
Grasp Datasets
Title | Venue | Date | Code |
---|---|---|---|
QDGset: A Large Scale Grasping Dataset Generated with Quality-Diversity | arXiv | 2024-10-03 | Project |
Real-to-Sim Grasp: Rethinking the Gap between Simulation and Real World in Grasp Detection | CoRL 2024 | 2024-10-09 | Project |
Grasp-Anything-6D: Language-Driven 6-DoF Grasp Detection Using Negative Prompt Guidance | ECCV 2024 | 2024-07-18 | |
Grasp-Anything++: Language-driven Grasp Detection | CVPR 2024 | 2024-06-13 | |
Grasp-Anything: Large-scale Grasp Dataset from Foundation Models | ICRA 2024 | 2023-09-18 | |
GraspNet-1Billion: A Large-Scale Benchmark for General Object Grasping | CVPR 2020 | 2020 |
Manipulation Benchmarks
Cross-Embodiment Benchmarks
Title | Venue | Date | Code |
---|---|---|---|
RoboMIND: Benchmark on Multi-embodiment Intelligence Normative Data for Robot Manipulation | arXiv | 2024-12-18 | Project |
GENESIS: A generative world for general-purpose robotics & embodied AI learning | - | - | |
ManiSkill3: GPU Parallelized Robotics Simulation and Rendering for Generalizable Embodied AI | arXiv | 2024-10-01 | |
All Robots in One: A New Standard and Unified Dataset for Versatile, General-Purpose Embodied Agents | arXiv | 2024-08-20 | Dataset |
Where are we in the search for an Artificial Visual Cortex for Embodied Intelligence? | NeurIPS 2023 | 2023-03-31 | |
Isaac Lab: Orbit: A Unified Simulation Framework for Interactive Robot Learning Environments | RA-L 2023 | 2023-01-10 |
🛠️ Awesome Techniques
Title | Venue | Date | Code |
---|---|---|---|
Awesome-Implicit-NeRF-Robotics: Neural Fields in Robotics: A Survey | - | 2024-10-26 | |
Awesome-Video-Robotic-Papers | - | 2024 | |
Awesome-Generalist-Robots-via-Foundation-Models: Toward General-Purpose Robots via Foundation Models: A Survey and Meta-Analysis | - | 2024 | |
Awesome-Robotics-3D | - | 2024 | |
Awesome-Robotics-Foundation-Models: Foundation Models in Robotics: Applications, Challenges, and the Future | - | 2023-12-13 | |
Awesome-LLM-Robotics | - | 2022 |
✨ Citation
If you find this repository useful, please consider citing this list:
@misc{bai2024roboticsmanipulation,
title = {Awesome-Robotics-Manipulation},
author = {Bai, Shuanghao and Ding, Pengxiang and Zhang, Haoran},
journal = {GitHub repository},
url = {https://github.com/BaiShuanghao/Awesome-Robotics-Manipulation},
year = {2024},
}