Projects
• Excel Project
• Power BI Project
• Python Project
Data Engineer / Analyst with hands-on experience turning messy data into production ML models, stable ELT pipelines, and clear business insights. Skilled in Python, SQL, and PySpark; Snowflake and Databricks for scalable compute; AWS (S3, Lambda, SageMaker) for deployment; and Airflow + dbt for orchestration. Experienced in feature engineering, causal analysis, model tracking with MLflow/Optuna, drift monitoring (PSI), automated retraining, and data-quality gating. Proficient in Tableau and Power BI for storytelling and in using SQL for deep root-cause analysis. Completed a Master’s in Data Analytics, combining statistical rigor with engineering discipline to ship reliable, high-impact data products.
Power BI
Tableau
SQL
Python
AWS
Fast Data Masker
MongoDB
Scala
Excel
Power Query
Azure DevOps
Tricentis Tosca
Domain/Function: Sales & Marketing Analytics
Domain/Function: Business Intelligence & Analytics
Domain/Function: Sales, Customer & Product Performance Insights
Domain/Function: Heart-Disease Risk Prediction & Clinical Decision Support
Data Engineer/Analyst – JPMorgan Chase | MA May 2024 – Present
• Developed and automated 10+ interactive dashboards using Microsoft Power BI and Tableau, increasing executive reporting efficiency by 40% and accelerating data-driven decision-making.
• Migrated 380 TB from a 250-node Hadoop cluster to an AWS S3 lakehouse on Apache Iceberg; added S3 Standard-IA/Glacier tiering and FinOps tags, cutting storage and compute costs 35% and shrinking daily batch windows from 8 hours to 90 minutes.
• Built a real-time ingestion pipeline with Kafka 3.6, Apache Flink 1.18, and Kinesis Firehose that streams 1.2 billion events a day (≈80K msg/s), reducing data latency from next-day to under 5 minutes and enabling intraday risk updates.
• Created data contracts in JSON-Schema and Protobuf and enforced row- and column-level security with AWS Lake Formation and Apache Ranger, achieving 100% compliance with SEC 17a-4, SOX, and GDPR across 12 business units.
• Added Monte Carlo Data Observability and OpenLineage monitoring to Airflow pipelines, cutting data-downtime incidents by 70% (52 → 16 per quarter) and reducing mean time to recover from 4 hours to 45 minutes.
• Deployed Snowflake Secure Data Sharing with Terraform and GitHub Actions, onboarding 320 analysts and eliminating more than 250 manual CSV requests each quarter.
• Tuned Spark 3.5 jobs on EMR (broadcast joins, AQE, Z-ORDER, Bloom filters, Iceberg MERGE) to cut portfolio risk run times 8× (95 s → 12 s) and lower cluster costs 28% with autoscaling and spot instances.
Data Engineer – Accenture | Hyderabad, India July 2021 – August 2023
• Built automated pipelines that moved CRM, ERP, and third-party API data into a SQL Server staging area, transformed it in Azure Data Factory, and stored clean Parquet files in the lakehouse, eliminating 90% of manual file drops and unlocking near-real-time analytics.
• Refactored T-SQL jobs handling 50M rows per week by rewriting joins, adding window functions, and indexing key columns; cut dashboard load times from 3+ minutes to under 20 seconds.
• Implemented nightly PII masking for 10 production clones with CA Test Data Manager and Fast Data Masker, shortening data-provisioning SLAs from 48 hours to same day and accelerating regression testing.
• Embedded Great Expectations checks in every SQL script, raising first-pass QA approval to 99% and reducing post-release defects by 30%.
• Engineered Azure Databricks PySpark frameworks and reusable Python utilities that ingest 60M events/day, auto-handle schema drift, and materialize Delta Lake tables, cutting ML feature-set prep time 65% and adding Python/PySpark scalability to the team's stack.
• Launched a CI/CD pipeline in Azure DevOps that unit-tests SQL objects and dbt models before one-click promotion to QA, UAT, and Prod, decreasing deployment errors by 30%.
• Created Excel dashboards with Power Query and VBA automations that save 8 analyst hours each week and improve forecast accuracy by 12%.
• Collaborated with cross-functional stakeholders and mentored junior analysts to accelerate data pipeline adoption across Finance, Marketing, and DevOps teams.
• Partnered with product owners in bi-weekly agile ceremonies to prioritize backlog items, cutting rework cycles by 20%.
Associate Data Analyst – Accenture | Bangalore, India July 2020 – June 2021
• Converted ambiguous business asks into ML/statistical problem specs; defined targets, leakage checks, and evaluation metrics (AUC, RMSE, silhouette) in versioned design docs.
• Built model-ready feature tables with SQL CTE chains and window functions; cut prep time from ~3 hours to 25 minutes and eliminated duplicate transformations via shared Git repos.
• Embedded data-quality gates (schema drift, null thresholds) using Great Expectations + pytest inside Airflow DAGs, blocking bad loads before training.
• Produced Tableau/Matplotlib packets (calibration plots, SHAP ranks, cohort funnels), accelerating stakeholder decisions on campaign tweaks from days to hours.
Feel free to get in touch. I am always open to discussing new projects, creative ideas, or opportunities to contribute to your vision.