Portfolio - Manoj | Codebasics

Manoj

GenAI & ML Aspirant

[email protected]

PHONE

+91 8310654643

About Me

👋 Hi, I’m Manoj

I’m a GenAI and Applied Machine Learning practitioner focused on building LLM-powered systems and data-driven pipelines—from ingestion and preprocessing to retrieval, reasoning, and deployment.

Through hands-on projects such as RAG-based research tools, conversational AI systems, and predictive ML models, I’ve delivered solutions that improve answer reliability, reduce processing time, and support real-world decision-making.

Skilled in Python, SQL, RAG architectures, applied machine learning, and lightweight deployment frameworks, I enjoy translating complex data and AI capabilities into reliable, user-centric applications.

🛠️ Tools & Technologies

Python · SQL · RAG · Machine Learning · scikit-learn · Pandas · NumPy
Flask · Streamlit · Vector Search · Git & GitHub
HTML · CSS · JavaScript

🔍 Functional Focus Areas

👉 GenAI Systems & Retrieval-Augmented Generation (RAG)
👉 Applied Machine Learning & Model Evaluation
👉 Preprocessing & LLM Application Deployment

2 Python Projects

2 Machine Learning Projects

1 Deep Learning Project

2 GenAI Projects

Key Skills

Python

NumPy

Pandas

Retrieval-Augmented Generation (RAG)

Applied Machine Learning

Model Evaluation & Performance Analysis

LLM-Powered Application Design

My Projects

My Experience

Data Science Intern — Edu Tantr

Feb 2025 – Apr 2025 | Bengaluru, Karnataka (On-site)
Tech Stack: Python, Pandas, NumPy, Scikit-learn, Matplotlib

Developed an end-to-end Machine Learning solution that predicts health insurance premiums for clients based on demographic, financial, and medical attributes.

Key Contributions:

->Built and deployed an interactive Streamlit web app to predict premium amounts.

->Preprocessed the data: handled outliers and missing values, and scaled features.

->Applied Variance Inflation Factor (VIF) to detect and eliminate multicollinear features.

->Identified young customers (<25 years) as a major source of prediction error and segmented the data accordingly.

->Trained two separate ML models (young vs rest) for higher accuracy and interpretability.

->Introduced a domain-specific feature “Genetical Risk” to capture inherited health factors.

->Improved accuracy from 73% to 98% after segmentation.
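The VIF step listed above can be illustrated with a minimal sketch. It is not the project's actual code: it computes VIF with plain NumPy least squares rather than a library helper, the feature names are hypothetical, and the cutoff of 10 is a common rule of thumb assumed here, not a value taken from the project.

```python
import numpy as np


def vif_scores(X):
    """Return the variance inflation factor of each column of X.

    X has one row per sample and one column per feature.
    VIF_j = 1 / (1 - R^2_j), where R^2_j comes from regressing
    column j on all the other columns (with an intercept).
    """
    X = np.asarray(X, dtype=float)
    scores = []
    for j in range(X.shape[1]):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        design = np.column_stack([np.ones(len(y)), others])
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        residual = y - design @ coef
        ss_res = float(residual @ residual)
        ss_tot = float(((y - y.mean()) ** 2).sum())
        # 1 / (1 - R^2) simplifies to ss_tot / ss_res.
        if ss_res <= 1e-12 * ss_tot:
            scores.append(float("inf"))  # perfectly collinear (or constant) column
        else:
            scores.append(ss_tot / ss_res)
    return scores


def drop_high_vif(X, names, threshold=10.0):
    """Drop the feature with the largest VIF until every VIF is <= threshold.

    The threshold of 10 is an assumed rule of thumb, not the project's value.
    """
    X = np.asarray(X, dtype=float)
    names = list(names)
    while X.shape[1] > 1:
        scores = vif_scores(X)
        worst = int(np.argmax(scores))
        if scores[worst] <= threshold:
            break
        X = np.delete(X, worst, axis=1)
        del names[worst]
    return X, names


# Synthetic demo data with hypothetical feature names.
rng = np.random.default_rng(0)
age = rng.normal(40, 10, 200)
income = rng.normal(50, 15, 200)
income_copy = 2 * income + rng.normal(0, 0.5, 200)  # nearly duplicates income
features = np.column_stack([age, income, income_copy])

reduced, kept = drop_high_vif(features, ["age", "income", "income_copy"])
print("kept features:", kept)
```

Dropping one feature at a time and recomputing matters because removing a single collinear column usually lowers the VIF of its partners, so they may no longer need to be removed.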

Results:

✅ Reduced the extreme error rate from 27% to 2%

✅ Improved model reliability and interpretability

✅ Demonstrated end-to-end ML lifecycle from data exploration to deployment
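The age-based segmentation and the extreme error rate mentioned above can be sketched as follows. This is an illustration under stated assumptions, not the project's code: the page does not define "extreme error", so the sketch assumes it means a prediction whose absolute percentage error exceeds 10%, and the function names, the `age` key, and the callable model interface are all hypothetical.

```python
def extreme_error_rate(actual, predicted, threshold_pct=10.0):
    """Percentage of predictions whose absolute percentage error exceeds threshold_pct.

    Assumes every actual value is non-zero (premiums are positive amounts).
    The 10% default is an assumption, not a figure from the project.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        return 0.0
    extreme = sum(
        1
        for a, p in zip(actual, predicted)
        if abs(p - a) / abs(a) * 100.0 > threshold_pct
    )
    return 100.0 * extreme / len(actual)


def predict_premium(row, young_model, rest_model, age_cutoff=25):
    """Route a customer record to the model for its age segment.

    row is a dict with an "age" key; each model is any callable that
    takes the row and returns a predicted premium (hypothetical interface).
    """
    model = young_model if row["age"] < age_cutoff else rest_model
    return model(row)


# Toy usage with stand-in models (constants, purely for illustration).
young_model = lambda row: 5000.0
rest_model = lambda row: 12000.0

customers = [{"age": 22}, {"age": 41}, {"age": 25}]
predictions = [predict_premium(c, young_model, rest_model) for c in customers]
actual_premiums = [5200.0, 12500.0, 9000.0]

print("predictions:", predictions)
print("extreme error rate (%):", extreme_error_rate(actual_premiums, predictions))
```

Measuring the rate separately for each segment is what makes a problem group such as the under-25 customers visible in the first place.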