Arjun K

Data Scientist

Hello, I am Arjun K.

2 Python Projects

3 Machine Learning Projects

3 Deep Learning Projects

5 GenAI Projects

About Me

Hey there! If you're looking for someone who can turn messy business problems into practical AI solutions, I'm glad you stopped by.

I'm a Data Scientist targeting junior to mid-level roles where business thinking matters as much as technical execution. Over the past 4+ years, I’ve worked across finance, healthcare, retail, hospitality, real estate, and sports, using analytics and machine learning to improve decision-making, reduce manual effort, and drive measurable impact.

I enjoy turning complex ideas into systems people can actually use, from ML pipelines and deep learning applications to RAG workflows, multi-agent AI systems, and real-time APIs. I bring a methodical approach to problem-solving, along with communication, collaboration, and curiosity. Outside work, some of the things I enjoy include, but are not limited to, deadpan humour, satire, self-deprecating humour, fitness, and nutrition science.

Key Skills

Python

Machine Learning

Deep Learning

Generative AI

NLP

Computer Vision

SQL

MLOps

Multi-Agent AI Systems

Retrieval-Augmented Generation

My Projects

Myntra SoleSense – AI Sneaker Assistant

Domain/Function: Conversational AI, E-commerce

MoodLens AI – Detailed Emotion Recognition

Domain/Function: Computer Vision, Affective Computing, Emotion Recognition

A2A Research Agent for ArXiv

Domain/Function: Academic Research, Multi-Agent AI, Research Automation

MedTrace: Biomedical RAG with Contradiction Detection

Domain/Function: Biomedical Analysis, Medical Literature Review, Contradiction Detection

Healthline RAG Assistant – Medical Research Chatbot

Domain/Function: Healthcare, Medical Research, Retrieval-Augmented Generation

HRizzle: MCP-based HR Assistant

Domain/Function: HR Automation, HR Analytics, Conversational AI

CredVibe: Credit Risk Assessment System

Domain/Function: Financial Risk Modeling, Credit Scoring, Probability of Default Modeling

FreshCheck AI – Fresh vs Spoiled Fruit Classifier

Domain/Function: Computer Vision, Food Quality Assessment, Spoilage Detection, Produce Quality Assessment, AgroTech

DentDetect AI – Smart Vehicle Damage Assessment

Domain/Function: Computer Vision, Insurance Claims Automation, Vehicle Damage Assessment

BevIntel – ML-Powered Beverage Price Prediction

Domain/Function: FMCG, CPG, Pricing Analytics, Revenue Optimization, Beverage Price Prediction

InsureSight AI – Health Insurance Premium Predictor

Domain/Function: Insurance Analytics, Premium Prediction, Risk Assessment, FinTech, Actuary, Actuarial Science

SpendSensei – Expense Management System

Domain/Function: Personal Finance, Expense Tracking, Spending Analytics

FMCG Promotion Performance Analysis

Domain/Function: Retail Analytics, Promotion Effectiveness, Sales Performance Analysis, FMCG, CPG

My Experience

AtliQ Technologies

Data Science & AI Intern | Remote | Dec 2025 - Feb 2026

Intro:

I worked on a wide range of AI, machine learning, & analytics problems with a strong focus on building systems that were useful in real business settings. My work covered computer vision, pricing analytics, retrieval-augmented generation, credit risk modeling, multi-agent systems, & workflow automation, with equal emphasis on model quality, deployment, explainability, & usability.

Tech Stack Used:

Python, PyTorch, ResNet-50, LightGBM, XGBoost, Scikit-learn, Optuna, Streamlit, FastAPI, MLflow, DagsHub, Pandas, NumPy, Matplotlib, Seaborn, Plotly, Joblib, Pillow, LangChain, ChromaDB, FAISS, Hugging Face, Ollama, Groq API, Pydantic

Highlights:

  1. Built a 6-class image classification system on 3,000+ labeled images using ResNet-50 & Optuna, achieving 80.70% accuracy & helping reduce manual inspection effort in an insurance workflow.
  2. Developed a beverage price prediction engine on 40K+ consumer records using LightGBM, delivered real-time inference through Streamlit, tracked experiments with MLflow & DagsHub, & maintained pricing error within ±6%.
  3. Analyzed 50K+ sales & promotion records for FMCG campaign analysis, improved data accuracy by 20%, identified high-performing products, stores, & promotion types, & reduced manual reporting turnaround by 60%.
  4. Built an insurance premium prediction workflow using XGBoost & Streamlit, improving reliability through age-segmented modeling, feature engineering, preprocessing, encoding, scaling, & input validation.
  5. Developed a fruit freshness classification system across 8 fruit types using PyTorch, ResNet-50, Optuna, & Streamlit, while preventing data leakage through careful dataset splitting & supporting real-time inference.
  6. Built a medical RAG workflow with semantic search, source grounding, & strict fallback behavior, validated across 50+ queries with fully grounded responses & no observed hallucinations.
  7. Developed a credit risk assessment system with 95.31% recall on defaulters, Gini above 85, KS above 40, explainable scoring logic, & sub-millisecond real-time inference for lending workflows.
  8. Built a medical literature review pipeline with semantic retrieval & stance classification, reducing synthesis time by about 60% & achieving 80% agreement with manual validation across 50 research articles in 2 clinical domains.
  9. Created a 2-agent AI system using the A2A protocol to separate retrieval from summarization, enabling concurrent research processing with a Streamlit interface for structured outputs.
  10. Built an HR automation workflow for Claude Desktop with fuzzy search, modular HRMS components, & SMTP automation, reducing manual processing from hours to seconds.


TenTimes

Jr. Data Scientist | Bengaluru | Jan 2024 - Sep 2024

Intro:

I worked on event intelligence problems across hospitality, real estate, & travel, helping improve attendance prediction, speaker data quality, profile enrichment, & contextual demand forecasting. The role combined machine learning, web scraping, workflow orchestration, monitoring, NLP-driven enrichment, & business-facing product impact.

Tech Stack Used:

Python, SQL, BeautifulSoup, Selenium, Airflow, MLflow, Grafana, SERP API, Ollama, Llama, Fuzzy String Matching, Levenshtein Distance, Phonetic Matching

Highlights:

  1. Developed a machine learning pipeline to forecast B2B event attendance using venue metrics, exhibitor quality, speaker influence, market dynamics, social traction, & historical behavior.
  2. Supported data extraction with BeautifulSoup & Selenium, orchestrated workflows with Airflow, tracked experiments with MLflow, & built monitoring dashboards in Grafana.
  3. Improved attendance prediction accuracy by 35%, increased sponsor matching & engagement by 15%, & supported adoption by 5 enterprise clients.
  4. Helped build a speaker data enrichment workflow that used fuzzy matching, phonetic matching, & web data extraction to resolve identity variations across fragmented sources.
  5. Improved profile accuracy by 40%, reduced redundancy by 60%, & generated stronger speaker intelligence for organizers & attendees.

FirstSportz

Analyst | Bengaluru | Jun 2023 - Oct 2023

Intro:

I worked at the intersection of sports analytics, audience insights, SEO content strategy, & performance tracking. The role focused on using data to shape relevant content, improve reach, & understand what resonated most with readers.

Tech Stack Used:

Analytics, KPI Tracking, SEO Content Strategy, WordPress, Audience Analysis, Content Performance Analysis

Highlights:

  1. Produced SEO-optimized sports content, including player-focused & tournament-related coverage, reaching 20K+ unique readers per week.
  2. Used statistical analysis & KPI tracking to identify player trends, audience behavior, & content performance patterns.
  3. Contributed to content that generated 1.5M+ impressions & reads.

Heart Kinetics

ML Engineer | Remote | Mar 2023 - May 2023

Intro:

I worked in a global data science setting focused on cardiac signal analysis, preprocessing quality, feature extraction, & model-ready data preparation. The experience strengthened my work in biomedical data analysis, dashboarding, exploratory analysis, & data quality monitoring.

Tech Stack Used:

Python, Signal Analysis, Feature Extraction, Dashboarding, Plotly, Data Cleaning, Predictive Modeling Preparation

Highlights:

  1. Collaborated in a 20-member global data science cohort working on 729 cardiac signal records, covering signal analysis, preprocessing, feature extraction, & predictive modeling support.
  2. Designed dashboards to monitor 15+ signal quality & preprocessing indicators.
  3. Improved data exploration, cleaning workflows, & downstream dataset preparation for machine learning.

JMS India

Data Analyst | Bengaluru | Oct 2019 - May 2022

Intro:

I worked across real estate, finance, retail, hospitality, healthcare, & sports, solving business problems in revenue optimization, operational efficiency, business analysis, dashboarding, segmentation, & performance reporting. The role gave me strong exposure to cross-industry analytics, stakeholder-driven KPI design, predictive analytics, data visualization, & business storytelling.

Tech Stack Used:

Python, SQL, SQLAlchemy, Tableau, Domo, ArcGIS, MS Excel, Excel Macros, Selenium, NLTK, TextBlob, Machine Learning, Binary Classification, K-means Clustering

Highlights:

  1. Supported a US-based real estate private equity fund operating across 22 states by automating data processing, building KPI dashboards, & applying predictive analytics for tenant classification.
  2. Helped reduce payment delays by 25% & increase investor interest by 20% through real-time property analytics.
  3. Worked on sales, marketing, customer service, & support analytics for an outsourced solutions provider using Python, SQL, SQLAlchemy, & Tableau.
  4. Increased billable receipts by 15% over 4 months, reduced churn from 12% to 8%, & improved First Call Resolution from 60% to 70%.
  5. Supported a hotel chain with analytics for occupancy, revenue, service, & operational efficiency, helping reduce room vacancy by 15%, increase ancillary revenue by 10%, improve RevPAR by 12%, & reduce CPOR by 8%.
  6. Evaluated 30K+ SaaS companies for investment prioritization using analytics, web scraping, text processing, classification, clustering, & dashboarding.
  7. Improved investment potential by 30%, increased investments by 20%, reduced evaluation time by 40%, & lowered costs by 18%.

Awards & Certificates

Python: Beginner to Advanced For Data Professionals

SQL for Data Science

Math and Statistics For AI, Data Science

Natural Language Processing

Deep Learning: Beginner to Advanced

Mastering Communication & Stakeholder Management

Master Machine Learning for Data Science & AI: Beginner to Advanced

Gen AI to Agentic AI with Business Projects

AtliQ Technologies Internship 1 Experience Letter

AtliQ Technologies Internship 2 Experience Letter

Databricks 14 Days AI Challenge

Let's Connect

Feel free to get in touch with me. I am always open to discussing new projects, creative ideas, or opportunities to be part of your vision.