Hi, I'm Preetham P

As an Applied Computer Vision Scientist at Rakuten, I build image-based solutions across semantic segmentation, object detection, optical character recognition (OCR), and classification. My current focus is image generation: I follow the latest advances in Generative AI, including Vision-Language Models (VLMs) and new image generation techniques, and implement that research in systems that expand creative possibilities.

Location

Tokyo, Japan

Focus

Computer Vision

Experience

  • Rakuten, Inc. - Computer Vision Scientist

    October 2019 – Present · Tokyo, Japan

    At Rakuten, I led the development of Generative AI models, including Qwen/SDXL ControlNet, to create product backgrounds and photorealistic marketing banners, which improved ad creativity and conversion rates. I also built interactive AI systems that generate illustrations in the style of children's drawings, and developed a high-accuracy Japanese OCR model. My work included implementing recent diffusion model techniques and building MLOps infrastructure for GenAI model deployment.

  • Central Government of India - Machine Learning Intern

    May 2018 – July 2018 · Document Recommendation and NER System

    Built a document recommendation and named-entity recognition (NER) system, processing large datasets with Apache Spark and using PySpark's Word2Vec for recommendations (via Elasticsearch).

Education

  • Indian Institute of Technology (IIT) Delhi - Dual Degree in Mathematics and Computing | 2019

Publications and Patents

Publications

  • Part Level Segmentation in 3D point clouds: ShapeNet Challenge at ICCV 2017

Patents

  • Patented the idea of extracting features from a Faster R-CNN detection model to train a similarity-check model for Quality Assurance automation, based on website layouts captured from different mobile screens. [Patent Number: US12067708B2]
  • Patented a model architecture for Weakly Supervised Object Localization that uses Grad-CAM outputs as pseudo-labels. [Patent Number: US11922667B2]
  • Patented a method for detecting Japanese characters that existing state-of-the-art (SOTA) models, such as CRAFT, failed to recognize. [Patent Number: US12087067B2]
  • Developed and patented a robust model for extracting key information (e.g., Name, Date of Birth) from health insurance cards with varying formats and layouts. [Patent Number: US20240362938A1]

Contact

Email: preethamp0197@gmail.com

Links: LinkedIn · GitHub