Vinu Karthek R

Data Science (AI/ML/Gen-AI)🤖 | ML Engineering (MLOps/LLM-Ops)☁️ | Statistical Analysis📊

Summary

  • 8 years of expertise in developing & deploying scalable AI/ML models using Python/R in the Manufacturing/Semiconductor/IT domains, along with 10 years of expertise in developing automation frameworks using Python, PowerShell & NI-LabVIEW
    • Leverage Apache Spark/Airflow/Databricks to build ETL pipelines, Keras, PyTorch, scikit-learn & TensorFlow for model development, and DVC/MLflow/Optuna for model refinement
    • Deploy & scale solutions through Docker, Kubernetes (Seldon Core) & Git CI/CD pipelines
    • Proficient in statistical analysis, viz. hypothesis testing (A/B testing) & statistical tests (ANOVA, Chi-Squared etc.)
    • Create insightful data visualization dashboards using Tableau/TIBCO Spotfire/Grafana

About Me
I am a skilled, ambitious & motivated individual with a diverse range of experiences & capabilities. I excel at managing multiple tasks on a daily basis while maintaining composure under pressure. My key strengths lie in communication, innovation & cultivating strong relationships to achieve optimal outcomes. I approach problem-solving with a blend of creative and logical thinking. I'm highly organized & a quick learner, thanks to my abundant neuroplasticity.

SKILLS

CODING

Python
R Programming
Embedded C++
NI-LabVIEW

AI/ML Frameworks & Pipelines

Optuna
LangChain
SeldonCore Streamlit FastAPI
Docker K8S GitHub GitLab

Data Visualization

Spotfire
Tableau
Power BI
bokeh, seaborn, plotly, ggplot2
PDF Exensio, NI Optimal+, Synopsys SLM

Platforms & OS

Databricks Dataiku DataRobot
Windows
Linux (Debian, Kali)
Proxmox (VM OS)

Professional Experience

Lead Data Scientist - APAC

  • Develop AI agents using LangGraph integrated with Databricks MCP servers for real-time Unity Catalog data querying and automated visualization
  • Build end-to-end ML data platforms using Databricks medallion architecture (Bronze-Silver-Gold) with Delta Lake and Apache Spark
  • Develop ML models with MLflow experiment tracking, deploy via Databricks Model Registry as REST API endpoints, and implement monitoring using Databricks Lakehouse Monitoring for drift detection.

Apr 2025 - Present

    Staff Data Scientist

    • Developed & deployed scalable AI/ML models for Manufacturing/Engineering, including Wafermap Image Classification (ResNet50), Rule Optimization (XGBoost), and Failure Prediction (RUSBoost). These models led to substantial savings of ~€420K/year
      • Deployed using FastAPI/Streamlit/SeldonCore on OpenShift PaaS in K8S pods. Utilized GitHub Actions & webhooks for CI/CD. Model performance was monitored using Tableau dashboards
    • Developed data extraction & analysis scripts for automated lot-on-hold analysis (100+ lots/hour across 6 sites worldwide) using MS-SQL, R & UiPath RPA
    • Developed a datasheet Q&A chatbot using Llama2-7B-GGML. Utilized LangChain to vectorize docs & store them in a FAISS vector DB
    • Used Elasticsearch, Filebeat & Logstash to process large amounts of log data and generate insights on Kibana dashboards, resulting in improved debugging performance. Stored the processed data using the Datalake API
    • Developed a diverse set of models using TensorFlow, scikit-learn, pandas, PySpark, and Keras. Models included CNN, SVM, clustering, and classification/regression using Decision Trees/Random Forest/XGBoost, delivering accurate data-driven insights

    Nov 2021 - Apr 2025

    Senior Data Scientist

    • Developed an image classification model using YOLOv3 for Failure Analysis image segmentation & fail-chip classification. Achieved a segmentation accuracy of 99% & a fail-chip classification accuracy of 93%
    • Utilized a GAN to improve the resolution of low-quality images from an old camera, resulting in high-res images comparable to those from a hi-res camera
    • Expert in statistical data analysis & visualization of production test data using Python, NI O+, Spotfire (Exensio) & Tableau
    • Experienced in applying Machine Learning models like Regression, Clustering, XGBoost & Random Forest for data interpretation using Keras
    • Managed a team of 5 engineers handling characterization of multiple Mixed-Signal IPs in the SoC
    • Developed an end-to-end test automation framework for PVT characterization using LabVIEW & Python
    • Post-silicon validation of Mixed-Signal IPs on state-of-the-art Mobile, Modem (5FF) & RF (14nm) SoCs
    • Design & development of a High-Speed (>15Gbps) System Level Test Platform for validation of 5G (Sub-6/mmWave) RF transceivers

    Jun 2017 - Nov 2021

    Research Student (AI/Machine Learning)

    • Developed a Python UI for annotating features on airplane door images (viz. door, door handle, window etc.), facilitating precise labelling for subsequent analysis
    • Employed deep learning techniques (CNN using TensorFlow 1.0) for object detection, utilizing annotated data to identify and locate features on airplane doors with 95% accuracy
    • Using the detected features & the laser distance sensor, developed a complex scoring algorithm to validate that the robot base is facing the airplane door
    • Controlled the robotic base carrying the camera, laser sensor and aero-bridge based on the control signals obtained, thus achieving a perfect docking (using Robot Operating System, ROS)

    Aug 2016 - Apr 2017

    Engineering Intern

  • Developed a generic Python automation for post-processing & filtering huge test data (up to 5 GB) using Pandas/Vaex/Dask
  • Robotic handler automation for multi DUT testing using LabVIEW
  • Designed an anti-aliasing filter (AAF) with low pass-band attenuation & steep roll-off for a custom digitizer (using Keysight ADS Tool)

    Dec 2016 - Mar 2017

    Validation Engineer

  • Mixed-Signal characterization of Precision IPs, viz. SAR ADC, Bandgap References, Analog MUX & Op-Amps
  • Expertise in developing firmware solutions using MCUs, FPGA & NI PXI Hardware
  • Experience in PCB design, layout, silicon level debugging & data analysis
  • Experience in working with Teradyne Eagle ATE
  • Led the successful post-silicon validation of Precision Analog MUXes (MUX36S08 & MUX36D04), including firmware development, test automation, silicon bring-up, data collection & data analysis, while exceeding project commitments
  • Automated the entire measurement system for characterization of Bandgap References, SAR ADC & Analog MUXes, reducing test times by up to 10x

    Apr 2013 - Jun 2016

    ALMA MATER

    M.Sc (Computer Control & Automation)

    Thesis Title: Automated docking of passenger boarding bridge
    Supervisor: Associate Prof. Wang Jianliang (NTU)
    Grade: CGPA: 3.7/5
    2016 - 2017

    B.E (Electronics & Communication Engineering)

    Thesis Title: Real Time Flood Alert System Using GSM
    Grade: CGPA: 8.9 (1st Class with Distinction)
    2008 - 2012

    Certifications

    Databricks - Generative AI Fundamentals

    Databricks Cert
    May 2025

    Dataiku - Advanced Designer

    Dataiku Cert
    Mar 2025

    Dataiku - MLOPs Practitioner

    Dataiku Cert
    Mar 2025

    AWS - Cloud Practitioner

    AWS Cert
    Aug 2024
    Mar 2024

    Statistics Bootcamp II

    NUS-ISS, Singapore
    Apr 2023

    UiPath RPA Essentials

    Tertiary Courses, Singapore
    Jan 2023

    Predictive Analytics - Insights of Trends and Irregularities

    NUS-ISS, Singapore
    Apr 2022

    R for Data Science Certification

    upGrad KnowledgeHut, Singapore
    Mar 2022

    PASSIONS & CURIOSITIES

    3D Printing (FDM & SLA Printers)

  • I am mesmerized by the process of transforming imagination into a 3D model & bringing it to life using 3D printers
  • I have hands-on experience operating/maintaining/automating SLA & FDM 3D Printers

    Robotics (Robotic Arms & Humanoid Robots)

  • I have built several robotic arms/mini-rovers using 3D printed parts
  • Currently, I am working on humanoid robots (Poppy & InMoov), leveraging my 3D printing experience
  • I also intend to build a mini robot dog controlled by an NVIDIA Jetson Nano AI module

    AI/ML (Hackathons & Personal Projects)

  • I always find joy in trying out various machine learning projects. Recently, I participated in the Singapore Datathon organised by NUS & built an ML model to predict 30-day mortality of critically ill liver cirrhotic patients using the MIMIC-IV dataset, achieving an accuracy of 87% from the patients' dynamic lab results

    Homelab (Personal Servers/VMs)

  • I am a big fan of the self-hosted community. Currently, I have automated most of the electrical/electronics in my home using Home Assistant OS running on a Proxmox VM cluster
  • I also have a NAS setup running Plex media player & several other self-hosted containers like Heimdall, Portainer (Docker/K8s), Dockerized SQL Servers & many more

© Vinu Karthek