Hi, I'm Preetham P
As an Applied Computer Vision Scientist at Rakuten, I use image data to solve problems across semantic segmentation, object detection, optical character recognition (OCR), and classification. I am currently focused on image generation, following recent advances in Generative AI, including Vision-Language Models (VLMs) and image generation techniques, and applying that research to expand creative possibilities.
Experience
- Rakuten, Inc. - Computer Vision Scientist
October 2019 – Present · Tokyo, Japan
At Rakuten, I led the development of Generative AI models, including Qwen/SDXL ControlNet, that generate product backgrounds and photorealistic marketing banners, improving ad creative quality and conversion rates. I also built interactive AI systems that generate illustrations in the style of children's drawings, and developed a high-accuracy Japanese OCR model. My work also covers implementing diffusion model techniques and building MLOps infrastructure for GenAI model deployment.
- Central Government of India - Machine Learning Intern
May 2018 – July 2018 · Document Recommendation and NER System
Built a document recommendation and named entity recognition (NER) system, processing large datasets with Apache Spark and using PySpark's Word2Vec for recommendations (via Elasticsearch).
Education
- Indian Institute of Technology (IIT) Delhi - Dual Degree in Mathematics and Computing | 2019
Publications and Patents
Publications
- Part Level Segmentation in 3D point clouds: ShapeNet Challenge at ICCV 2017
Patents
- Patented a method that extracts features from a Faster R-CNN detection model to train a similarity-check model for quality assurance automation, based on website layouts captured from different mobile screens. [Patent Number: US12067708B2]
- Patented a model architecture for Weakly Supervised Object Localization that uses Grad-CAM outputs as pseudo-labels. [Patent Number: US11922667B2]
- Patented a method for detecting Japanese characters that existing state-of-the-art (SOTA) models, such as CRAFT, failed to detect. [Patent Number: US12087067B2]
- Developed and patented a robust model for extracting key information (e.g., name, date of birth) from health insurance cards with varying formats and layouts. [Patent Number: US20240362938A1]