PlanIt

Beyond Geometric Path Planning: Learning Context-Driven Preferences

Most existing trajectory planners focus on finding feasible, obstacle-free trajectories. But for a robot to be successful in any human environment, it also needs to understand the preferences of the humans around it.

In our previous work, we presented an algorithm that learns such preferences by eliciting online feedback from the user; this feedback does not need to be an optimal demonstration. We demonstrated that the robot can generalize its learning and produce preferred trajectories in new environments and situations, such as household chores and grocery checkout tasks.
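
Concretely, the learning step in that work is a co-active update: the robot shows its best trajectory, the user slightly improves it (for example, by moving a waypoint), and the learner shifts its scoring weights toward the improvement. The Python sketch below illustrates that update; the feature map and all names are our own illustrative placeholders, not the paper's implementation.

    import numpy as np

    def features(trajectory):
        # Hypothetical stand-in for a trajectory feature map phi(x, y),
        # which summarizes a trajectory (object-environment distances,
        # orientations, etc.) as a fixed-length vector.
        return np.asarray(trajectory).mean(axis=0)

    def score(w, trajectory):
        # Higher score = more preferred under the current weights.
        return float(w @ features(trajectory))

    def coactive_update(w, planned, corrected):
        # Shift weights toward the user's (possibly sub-optimal)
        # correction and away from the trajectory the robot planned.
        return w + features(corrected) - features(planned)

    # Example round: plan, receive a corrected waypoint, update.
    w = np.zeros(3)
    planned = [[0.0, 0.0, 0.0], [0.5, 0.1, 0.9], [1.0, 0.0, 0.0]]
    corrected = [[0.0, 0.0, 0.0], [0.5, 0.4, 0.2], [1.0, 0.0, 0.0]]
    w = coactive_update(w, planned, corrected)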

Figure: Multiple trajectories for moving an egg container.
Figure: The robot plans a bad trajectory (waypoints 1-2-4) that brings the knife close to the flowers. As feedback, the user corrects waypoint 2 and moves it to waypoint 3.
Figure: A user providing zero-G feedback on Baxter.

With PlanIt, we extend this principle: we show users various predicted paths and obtain their preferences at a very large scale. The system makes providing a preference as easy as liking or disliking a portion of a trajectory, allowing us to harness the crowd's intelligence. Our extensive experiments on more than 120 environments show that this feedback, even though sub-optimal and noisy, yields large improvements in the predicted trajectories.
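
As a rough illustration of how such binary, crowd-scale feedback can drive learning, the sketch below fits a simple logistic model that scores trajectory segments from like/dislike labels. This is a minimal stand-in under our own assumptions (a feature vector per segment, a linear model), not the generative model described in the PlanIt technical report.

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def fit_segment_preference(X, liked, epochs=200, lr=0.1):
        # X: (n, d) array of features for n labeled trajectory segments.
        # liked: n binary crowd labels (1 = like, 0 = dislike).
        # Gradient ascent on the logistic log-likelihood; with many
        # noisy labels per segment, the averaged gradient smooths out
        # individual raters' disagreement.
        n, d = X.shape
        w = np.zeros(d)
        for _ in range(epochs):
            p = sigmoid(X @ w)
            w += lr * X.T @ (liked - p) / n
        return w

    # A planner could then prefer paths whose segments score high:
    # preference(segment) = sigmoid(w @ segment_features)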

You can see our algorithms at work in the following videos:


  • Safe planning via human touch!
  • Let me watch my football!

  • PlanIt: A Crowdsourcing Approach for Learning to Plan Paths from Large Scale Preference Feedback.
    Technical Report 2014
    Ashesh Jain, Debarghya Das, Jayesh K. Gupta, and Ashutosh Saxena. [arXiv || PlanIt Website]

  • Learning Trajectory Preferences for Manipulators via Iterative Improvement.
    In NIPS 2013
    Ashesh Jain, Brian Wojcik, Thorsten Joachims, and Ashutosh Saxena. [pdf || bibtex || ppt || project & video]

  • Beyond Geometric Path Planning: Learning Context-Driven Trajectory Preferences via Sub-optimal Feedback.
    In ISRR 2013
    Ashesh Jain, Shikhar Sharma, and Ashutosh Saxena. [pdf || bibtex || project & video]