I am an MS by Research student at CVIT, IIIT
Hyderabad, advised by Dr C.V. Jawahar. I work on building an
OCR (Optical Character Recognition) system for Indian languages (Hindi, Tamil, and Telugu). My current
research focuses on improving word retrieval and recognition in large document corpora.
I am broadly interested in 2D and 3D Computer Vision,
Deep Learning, and related problems.
My ultimate goal is to contribute to the development of machines capable of reading instructions
and creating new machines! I'm a very inquisitive person and always willing to learn about fields
including, but not limited to, science, technology, astrophysics, and physics.
Check out my Blog!
Google Scholar
August 2019 - January 2020 (Hyderabad, Telangana)
Paper [DAS2020 (Oral Presentation)]
Worked on improving word recognition and retrieval in large document collections for
Indian scripts such as Hindi, Telugu, and Tamil. This work was supervised by
Dr Praveen Krishnan and Dr C.V. Jawahar.
- Improved word accuracy by 1.4% for Hindi and 1.8% for Telugu by fusing text recogniser
hypotheses and deep embeddings.
- Proposed techniques such as Naive Merge and Query Expansion for improving word
retrieval, with a gain of 11.12% for the Hindi language.
Created an OCR system for Hindi, Tamil, and Telugu. Worked on a novel semi-supervised training
scheme for a Convolutional Recurrent Neural Network (using CTC loss). Reported improvements of
2.5% in word accuracy and 5% in character accuracy.
March 2019 - August 2019 (Gandhinagar, Gujarat)
Worked on the project titled "Cultural Heritage Preservation and Restoration using
Digital 3D Models",
under Prof. Shanmuganathan Raman.
The project was supported by NVIDIA and
IMPRINT (Impacting Research Innovation and Technology), an
initiative of the Government of India.
Major work done:
- Data Collection in the form of Point Clouds using Faro Focus 3D Laser Scanner.
- Point Cloud Alignment using algorithms like ICP (using eigenvalues and
eigenvectors, and SVD). Also studied
various deep learning approaches such as Deep Closest Point, DeepICP, Discriminative
Auto-Encoder Approach, and PointNetLK.
- Developed an algorithm for Point Cloud Completion using a Fully-Connected
Auto-Encoder and obtained
reasonable results on the ShapeNet dataset.
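As an illustration of the SVD-based alignment mentioned above, here is a minimal sketch of point-to-point ICP in NumPy. It is not the code used in the project: the function names (`best_rigid_transform`, `icp`), the brute-force nearest-neighbour matching, and the fixed iteration count are all assumptions made for the example.

```python
import numpy as np


def best_rigid_transform(source, target):
    """Least-squares R, t such that R @ source[i] + t is close to target[i].

    Both inputs are (N, 3) arrays with rows already in correspondence.
    """
    src_centroid = source.mean(axis=0)
    tgt_centroid = target.mean(axis=0)
    # 3x3 cross-covariance of the centred point sets.
    H = (source - src_centroid).T @ (target - tgt_centroid)
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that R is a rotation, not a reflection.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = tgt_centroid - R @ src_centroid
    return R, t


def icp(source, target, iterations=20):
    """Hypothetical minimal ICP loop: match nearest points, then re-align."""
    current = source.copy()
    R_total, t_total = np.eye(3), np.zeros(3)
    for _ in range(iterations):
        # Brute-force nearest neighbour in target for every current point.
        sq_dist = ((current[:, None, :] - target[None, :, :]) ** 2).sum(axis=2)
        matched = target[sq_dist.argmin(axis=1)]
        R, t = best_rigid_transform(current, matched)
        current = current @ R.T + t
        # Accumulate the overall transform applied to the original source.
        R_total, t_total = R @ R_total, R @ t_total + t
    return R_total, t_total
```

Brute-force matching costs O(N·M) per iteration, so a real scan from a laser scanner would normally need a spatial index such as a k-d tree and some form of outlier rejection.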
Artificial Intelligence Intern
Meditab Software, Inc.
September 2018 - March 2019 (Ahmedabad, Gujarat)
Worked on the project titled "Facility Layout Optimization using Genetic Algorithm".
Major work done:
- Generated optimal facility layouts by implementing ELOPE (a layout optimizer and
evaluator) and using it with the Genetic Algorithm. This reduced
travelling time by 75% for the DosePacker robots, leading to a more efficient DosePacker system.
- Created an automatic log file analyzer capable of predicting a possible machine
breakdown, leading to a 70%
decrease in maintenance time of the DosePacker system and saving on maintenance costs.
Artificial Intelligence Research Intern
June 2018 - July 2018 (Greater Noida, Uttar Pradesh)
Worked on the project titled "Credibility Examination of Human Footprint Using
The project was supported by NVIDIA, which provided a DGX-1 with Tesla V100 GPUs.
Major work done:
- Collected a dataset of footprints from 180 volunteers, using a paper scanner at
- Developed a custom Convolutional Neural Network for classifying humans based on
the shape and size of
their footprints. The network was trained on the data collected earlier.
Data Analyst Intern
April 2018 - June 2018 (Ahmedabad, Gujarat)
Worked on applying Artificial Intelligence and Machine Learning to an onsite
detection tool for
instantaneous scanning of intracranial bleeding.
Major work done:
- Created software capable of tracking patients using Python and SQLite, leading
to a better workflow for the people collecting the brain scans.
- Detected the actual signal amidst noise (coming from a brain scan by a
near-infrared laser scanner) by implementing an automatic signal extractor using Python.
Automatic Garbage Detection and Collection
Task was to come up with a device capable of detecting garbage and automatically
picking it up.
- Led a team of three and developed a system capable of detecting waste bottles
using a CNN (MobileNets).
- Developed an algorithm for getting a rough estimate of the depth of the garbage
(with an error margin of 2 cm).
- Developed a path planning algorithm based on the concept of PID (by considering
the bottle as the centre).
- The code was efficient enough to run on a Raspberry Pi.
- One of the 4% of projects selected for demonstration at the SSIP annual conference.
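To make the PID idea above concrete, here is a minimal sketch of steering toward a detected bottle by treating its horizontal offset from the image centre as the error signal. It is an illustration only, not the project's code: the `PID` class, the `steering_command` helper, the gains, and the normalisation to [-1, 1] are assumptions.

```python
class PID:
    """Textbook PID controller with hypothetical gains kp, ki, kd."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None

    def update(self, error, dt):
        # Accumulate the integral term and estimate the derivative term.
        self.integral += error * dt
        if self.prev_error is None:
            derivative = 0.0  # no history on the first call
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def steering_command(bottle_x, frame_width, pid, dt):
    """Map the bottle's pixel column to a steering value.

    The error is the bottle's offset from the image centre, normalised
    to [-1, 1]; zero means the bottle is centred in the frame.
    """
    half_width = frame_width / 2
    error = (bottle_x - half_width) / half_width
    return pid.update(error, dt)
```

For example, with a 640-pixel-wide frame and a bottle detected at column 480, the normalised error is 0.5, so a proportional-only controller with `kp=2.0` would return 1.0. The sign convention (positive meaning steer right) is a choice made for this sketch.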
Created an end-to-end system for detecting smiling faces in a live video stream using
a Convolutional Neural Network.
Self Driving Car
Learned about Deep Q-Learning by implementing it for driving a car autonomously.
In this project, I worked on autoencoders to learn features from 140,000 images.
The trained autoencoder, with added convolution layers, was then used to classify anime images
and answer questions such as the following with 74.6% accuracy:
- Does the image contain any nudity or sexual content? (Yes, No)
- Is this an interesting image or not? (Yes, No)
- Siddhant Bansal, Seema Patel, Ishita Shah, Prof. Alpesh Patel, Prof.
Jagruti Makwana, and Dr. Rajesh Thakker. "AGDC: Automatic Garbage Detection and Collection."
- S. Bansal, P. Krishnan, and C. V. Jawahar, "Fused Text Recogniser and Deep Embeddings
Improve Word Recognition and Retrieval," in IAPR International Workshop on Document Analysis
Systems (DAS), 2020. arXiv: 2007.00166