|
Short Bio |
Hi, I am Anant! I am a Research Engineer at 1x Technologies. At 1x, I am working on making humanoids more capable general-purpose robots so that they can be part of our homes and do our chores for us! My main interest lies in Deep Learning, more specifically representation learning and robot learning.
I also like to work on building scalable software. Previously, I was a Graduate Research Assistant at CILVR Lab, NYU, where I worked on problems in Robot Learning, advised by Prof. Lerrel Pinto and Soumith Chintala.
|
|
Education |
|
Experience
|
Research Engineer
1x Technologies | California, USA
|
Sept 2023 - Present
|
Working with humanoids to make them capable of operating in real-world settings such as homes and warehouses.
|
|
Graduate Research Assistant
CILVR, New York University | New York, USA
|
Jan 2022 - Sept 2023
|
Working on problems in Robot Learning, in collaboration with Hyundai, specifically in the fields of Imitation Learning, Representation Learning, and Generalizable RL. Advised by Prof. Lerrel Pinto.
|
|
Research & Development Intern
Temasek Lab, Nanyang Technological University | Singapore
|
Jan - June 2021
|
Led the development of a Meeting Room Speech Recognition Android application capable of automatic transcription. Also built an image-captioning model using a transformer.
|
|
Remote Research Assistant
CITEC Lab, Bielefeld University | Germany
|
2020 - 2021
|
Using Conflict-Based Search (CBS) and Deep Q-learning, achieved decent performance on the Flatland environment (3rd position in round 1 and 6th position in round 2).
|
|
Research & Publications
|
|
On Bringing Robots Home
Nur Muhammad Mahi Shafiullah*, Anant Rai*, Haritheja Etukuru, Yiqian Liu, Ishan Misra, Soumith Chintala, Lerrel Pinto
submitted to Science Robotics, 2023
project page | code | arXiv
In this project, we addressed a key barrier in home robotics, particularly in imitation learning: the lack of affordable and user-friendly tools for collecting robot demonstrations.
To solve this, we developed an innovative, cost-effective tool named the Stick, which we used to compile the 'Homes of New York' (HoNY) dataset, encompassing 13 hours of interactions from 22 NYC homes, complete with comprehensive video and action data.
We further proposed the Home Pretrained Representation (HPR), a ResNet-34 model pre-trained on the HoNY dataset using self-supervised learning, to initialize robot policies in new environments. We show that this initialization can beat baseline models in unseen home environments.
|
|
Open X-Embodiment: Robotic Learning Datasets and RT-X Models
Abhishek Padalkar, Acorn Pooley, Ajay Mandlekar, ...Anant Rai, ... Zichen Jeff Cui
submitted to ICRA, 2023
project page | code | arXiv
This work explores the potential for consolidating robotic learning by developing "generalist" X-robot policies adaptable across various robots, tasks, and environments. Traditionally, robotic learning involves training separate models for each application, robot, and environment.
We show that training on a large-scale, diverse robotic dataset can teach general skills, such as better spatial understanding in both the absolute and relative sense and generalization to unseen objects, yielding an overall much more robust model that beats baselines trained only on robot-specific or task-specific data.
|
|
Teach a Robot to FISH: Versatile Imitation from One Minute of Demonstrations
Siddhant Haldar*, Jyothish Pari*, Anant Rai, Lerrel Pinto
won the Best Student Paper Award at RSS, 2023
project page | code | arXiv
In this work, we present Fast Imitation of Skills from Humans (FISH), a new imitation learning approach that can learn robust visual skills with less than a minute of human demonstrations.
Given a weak base-policy trained by offline imitation of demonstrations, FISH computes rewards that correspond to the “match” between the robot’s behavior and the demonstrations.
Along with beating the state of the art, FISH is versatile, which allows it to be used across robot morphologies (e.g. xArm, Allegro, Stretch) and camera configurations.
|
|
Risk Analysis using Trajectory Prediction in Indian Traffic
(collaboration with UMD)
Rahul Jha*, Anant Rai*, Rahul Kala
IEEE ITSC, 2022
paper
Built a module using SSD (45 mAP) and SORT to detect and track agents in a novel video dataset of Indian traffic. Developed a special LSTM-based network architecture to learn the agents' behaviours and interactions for trajectory prediction in dense heterogeneous traffic. Devised a Weighted-Elliptical-Model for risk modelling and combined it with trajectory prediction to obtain a novel predictive risk analysis (20% improvement over baseline).
|
|
RLMU: Reinforcement Learning with Self-supervised Human Motion Analysis
Anant Rai*, Umang Sharma*, Xu Cao*
code | pdf
Unlabelled videos of humans are a large and almost untapped source of data for robots to learn several routine tasks. The RLV paper proposes a framework that can leverage such videos to enhance learning of challenging vision-based tasks. The drawback of their model is that it can be extremely unstable and lead to many zero-reward rollouts. In our work, we make the framework more robust by adding a motion understanding module to stabilize cross-domain adaptation using a discriminator. In our experiments on the Visual Pushing task, our framework outperforms RLV (15% improvement in performance).
|
|
Food Recommendation System using Neural Collaborative Filtering and Sentiment Analysis
Tinku Singh, Ashwin Raut, Dhruv Agarwal, Rahul Jha, Anant Rai, Manish Kumar
ICAIR, 2020
code | pdf
Used data mining and an array of Deep Learning techniques, such as collaborative filtering, an NN classifier and sentiment analysis, to analyse and capture the food habits of Indian customers. Collected and organised data from food giants like Swiggy and Zomato on a Flask server, for analysis and predictions based on location, and presented results on a web app created using Node.js.
|
|