Anant Rai

Research Engineer
1x Technologies

Email  /  GitHub  /  LinkedIn  /  CV

Short Bio
Hi, I am Anant! I am a Research Engineer at 1x Technologies, where I work on making humanoids more capable general-purpose robots so that they can be part of our homes and do our chores for us! My main interest lies in Deep Learning, more specifically representation learning and robot learning. I also like to work on building scalable software. Previously, I was a Graduate Research Assistant at CILVR Lab, NYU, where I worked on problems in Robot Learning, advised by Prof. Lerrel Pinto and Soumith Chintala.

Master of Science, Computer Science Sept 2021 - May 2023
New York University, Courant | New York, USA
B.Tech - Information Technology July 2017-Aug 2021
Indian Institute of Information Technology, Allahabad | India
Research Engineer
1x Technologies | California, USA
Sept 2023 - Present
Working with Humanoids to make them capable of working in real-world settings like homes and warehouses
Graduate Research Assistant
CILVR, New York University | New York, USA
Jan 2022 - Sept 2023
Worked on problems in Robot Learning, in collaboration with Hyundai, specifically in the fields of Imitation Learning, Representation Learning, and Generalizable RL. Advised by Prof. Lerrel Pinto.
Research & Development Intern
Temasek Lab, Nanyang Technological University | Singapore
Jan - June 2021
Led the development of a Meeting Room Speech Recognition Android application capable of automatic transcription. Also built an image-captioning model using a transformer.
Remote Research Assistant
CITEC Lab, Bielefeld University | Germany
2020 - 2021
Used Conflict-Based Search (CBS) and Deep Q-learning on the Flatland environment, placing 3rd in round 1 and 6th in round 2.
Research & Publications
On Bringing Robots Home
Nur Muhammad Mahi Shafiullah*, Anant Rai*, Haritheja Etukuru, Yiqian Liu, Ishan Misra, Soumith Chintala, Lerrel Pinto
submitted to Science Robotics, 2023
project page | code | arXiv

In this project, we addressed a key barrier in home robotics, particularly in imitation learning: the lack of affordable and user-friendly tools for collecting robot demonstrations. To solve this, we developed a cost-effective tool named the Stick, which we used to compile the 'Homes of New York' (HoNY) dataset, encompassing 13 hours of interactions from 22 NYC homes, complete with video and action data. We further proposed the Home Pretrained Representation (HPR), a ResNet-34 model pre-trained on the HoNY dataset using self-supervised learning, to initialize robot policies in new environments. We show that this initialization can beat baseline models in unseen home environments.

Open X-Embodiment: Robotic Learning Datasets and RT-X Models
Abhishek Padalkar, Acorn Pooley, Ajay Mandlekar, ...Anant Rai, ... Zichen Jeff Cui
submitted to ICRA, 2023
project page | code | arXiv

This work explores the potential for consolidating robotic learning by developing "generalist" X-robot policies adaptable across various robots, tasks, and environments. Traditionally, robotic learning involves training separate models for each application, robot, and environment. We show that training on a large-scale, diverse robotic dataset yields general skills such as better spatial understanding (in both the absolute and relative sense) and generalization to unseen objects, and overall produces a much more robust model that can beat baselines trained only on robot-specific or task-specific data.

Teach a Robot to FISH: Versatile Imitation from One Minute of Demonstrations
Siddhant Haldar*, Jyothish Pari*, Anant Rai, Lerrel Pinto
won best student paper award at RSS, 2023
project page | code | arXiv

In this work, we present Fast Imitation of Skills from Humans (FISH), a new imitation learning approach that can learn robust visual skills with less than a minute of human demonstrations. Given a weak base policy trained by offline imitation of demonstrations, FISH computes rewards that correspond to the “match” between the robot’s behavior and the demonstrations. Along with beating the state of the art, FISH is versatile and can be used across robot morphologies (e.g. xArm, Allegro, Stretch) and camera configurations.

Risk Analysis using Trajectory Prediction in Indian Traffic (collaboration with UMD)
Rahul Jha*, Anant Rai*, Rahul Kala

Built a module using SSD (45 mAP) and SORT to detect and track agents in a novel video dataset of Indian traffic. Developed a specialized LSTM-based network architecture to learn agents' behaviours and interactions for trajectory prediction in dense heterogeneous traffic. Devised a Weighted-Elliptical-Model for risk modelling and combined it with trajectory prediction to obtain a novel predictive risk analysis (20% improvement over baseline).

RLMU: Reinforcement Learning with Self-supervised Human Motion Analysis
Anant Rai*, Umang Sharma*, Xu Cao*
code | pdf

Unlabelled videos of humans are a large and almost untapped source of data for robots to learn several routine tasks. The RLV paper proposes a framework that can leverage such videos to enhance learning of challenging vision-based tasks. The drawback of their model is that it can be extremely unstable and lead to many zero-reward rollouts. In our work, we make the framework more robust by adding a motion understanding module to stabilize cross-domain adaptation using a discriminator. In our experiments on the Visual Pushing task, our framework performs better than RLV (15% improvement in performance).

Solving Multiagent Planning problem using Deep Reinforcement Learning
Luca Hermes, Anant Rai, Andre Melnik

Devised an optimal environment representation using trees and graphs for intelligent cost calculation and re-routing decisions in the environments of the NeurIPS2021: Flatland challenge. Using Conflict-Based Search (CBS) and Deep Q-learning, placed 3rd in round 1 and 6th in round 2.

Food Recommendation System using Neural Collaborative Filtering and Sentiment Analysis
Tinku Singh, Ashwin Raut, Dhruv Agarwal, Rahul Jha, Anant Rai, Manish Kumar
ICAIR, 2020
code | pdf

Used data mining and an array of deep learning techniques, such as collaborative filtering, a neural network classifier, and sentiment analysis, to analyse and capture the food habits of Indian customers. Collected and organised data from food giants like Swiggy and Zomato on a Flask server for analysis and location-based predictions, and presented the results in a web app built with Node.js.

Template From Here