Learning 04 · 2 min read

Machine Learning

Data Analysis & Model Development

Category

AI & Machine Learning

Reading Time

2 min read

Published

March 2026

Introduction

Machine learning enables computers to learn patterns from data and make predictions on new inputs. I've built end-to-end ML pipelines covering data preprocessing, feature engineering, model training, evaluation, and deployment. My work spans classification, regression, and deep learning tasks.

Key Learnings

Data Preprocessing

Learned: cleaning, normalizing, and transforming raw data, including handling missing values and outliers and scaling features.
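The three steps named above (missing values, outliers, scaling) can be sketched roughly as follows. This is a minimal illustration, not the pipeline used in the projects: the toy DataFrame, the column names, the median fill, and the 1.5 × IQR clipping rule are all assumptions chosen for the example.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data: one missing age and one obvious income outlier.
df = pd.DataFrame({
    "age": [25.0, 32.0, np.nan, 41.0, 38.0, 29.0],
    "income": [40_000.0, 52_000.0, 48_000.0, 1_000_000.0, 45_000.0, 50_000.0],
})


def preprocess(frame: pd.DataFrame) -> pd.DataFrame:
    """Impute, clip outliers, and standardize every numeric column."""
    out = frame.copy()

    # 1. Missing values: fill each column with its median (an assumed strategy).
    out = out.fillna(out.median(numeric_only=True))

    # 2. Outliers: clip each column to [Q1 - 1.5*IQR, Q3 + 1.5*IQR].
    q1, q3 = out.quantile(0.25), out.quantile(0.75)
    iqr = q3 - q1
    out = out.clip(lower=q1 - 1.5 * iqr, upper=q3 + 1.5 * iqr, axis=1)

    # 3. Scaling: zero mean and unit variance per column.
    scaled = StandardScaler().fit_transform(out)
    return pd.DataFrame(scaled, columns=out.columns, index=out.index)


clean = preprocess(df)
print(clean.round(2))
```

In a real project the imputer and scaler would be fitted on the training split only and then applied to the test split, so that no information leaks from held-out data.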

Feature Engineering

Learned: creating meaningful features from raw data using domain knowledge and statistical techniques.
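As a small illustration of what this can look like in Pandas, the sketch below derives a ratio feature, a binary flag, a binned category, and an interaction term. The records, column names, and bin edges are hypothetical and chosen only for the example.

```python
import pandas as pd

# Hypothetical records; the column names are illustrative only.
df = pd.DataFrame({
    "age": [23, 47, 65, 35],
    "weight_kg": [60.0, 85.0, 70.0, 95.0],
    "height_m": [1.70, 1.80, 1.65, 1.75],
    "smoker": ["no", "yes", "no", "yes"],
})


def add_features(frame: pd.DataFrame) -> pd.DataFrame:
    out = frame.copy()

    # Domain-knowledge ratio: body mass index from weight and height.
    out["bmi"] = out["weight_kg"] / out["height_m"] ** 2

    # Binary encoding of a yes/no column.
    out["is_smoker"] = (out["smoker"] == "yes").astype(int)

    # Binning: turn a continuous age into coarse groups (assumed bin edges).
    out["age_group"] = pd.cut(
        out["age"], bins=[0, 30, 50, 120], labels=["young", "middle", "senior"]
    )

    # Interaction term: lets a model treat BMI differently for smokers.
    out["smoker_bmi"] = out["is_smoker"] * out["bmi"]

    return out.drop(columns=["smoker"])


features = add_features(df)
print(features)
```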

Model Training & Tuning

Learned: training machine learning models with hyperparameter optimization and cross-validation for robust performance.
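A minimal sketch of hyperparameter search with cross-validation in Scikit-Learn is shown below. The synthetic dataset, the logistic regression model, and the grid of `C` values are assumptions made for the example, not the models or settings used in the projects.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic binary classification data (illustrative only).
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=4, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Scaling lives inside the pipeline so it is re-fitted on each CV training fold.
pipe = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# Assumed search space: regularization strength of the logistic regression.
param_grid = {"logisticregression__C": [0.01, 0.1, 1.0, 10.0]}

# 5-fold cross-validation over the grid, selecting by F1 score.
search = GridSearchCV(pipe, param_grid, cv=5, scoring="f1")
search.fit(X_train, y_train)

best_params = search.best_params_
cv_score = search.best_score_
test_score = search.score(X_test, y_test)  # F1 on the held-out test split

print(best_params, round(cv_score, 3), round(test_score, 3))
```

Keeping the test split out of the search means the final score is measured on data that played no part in choosing the hyperparameters.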

Model Evaluation

Learned: evaluating models with metrics suited to the task, such as precision, recall, F1 score, and confusion matrices.
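To make these metrics concrete, here is a small worked example using Scikit-Learn on hand-written labels. The label vectors are made up for illustration; with 3 true positives, 1 false positive, and 1 false negative, precision, recall, and F1 all come out to 0.75.

```python
from sklearn.metrics import (
    confusion_matrix,
    f1_score,
    precision_score,
    recall_score,
)

# Hypothetical ground truth and predictions for a binary classifier.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# For binary labels, rows are true classes and columns are predicted classes:
# [[TN, FP],
#  [FN, TP]]
cm = confusion_matrix(y_true, y_pred)
tn, fp, fn, tp = cm.ravel()

precision = precision_score(y_true, y_pred)  # TP / (TP + FP)
recall = recall_score(y_true, y_pred)        # TP / (TP + FN)
f1 = f1_score(y_true, y_pred)                # harmonic mean of the two

print(cm)
print(precision, recall, f1)  # 0.75 0.75 0.75
```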

Tools & Technologies

Python

Primary language for ML development, with an extensive library ecosystem.

Pandas

Data manipulation library for loading, cleaning, and transforming datasets.

NumPy

Numerical computing library for array operations and mathematical functions.

Scikit-Learn

Machine learning library with algorithms for classification, regression, and clustering.

PyCaret

Low-code machine learning library that simplifies model training, comparison, and deployment workflows.

Matplotlib

Data visualization library used to create static, animated, and interactive charts and graphs.

How I Used This in Projects

DementiaInsight - Non-Medical Dementia Risk Classifier (Dec 2025)

Developed an automated ML pipeline to predict dementia risk from non-medical features. Implemented data processing, model training and evaluation, and a CLI prediction tool. The LightGBM model performed strongly, and the project was a finalist at the ModelX Inter University Hackathon.

Python, Pandas, LightGBM, Scikit-Learn, Jupyter, Classification

MedPredict - Medical Cost Prediction Model (Nov 2025)

Developed a Random Forest regressor to predict medical insurance costs using lifestyle and demographic indicators. Deployed via Streamlit to provide a lightweight interactive web interface.
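The general shape of such a regressor can be sketched as follows. Everything here is an assumption for illustration: the data is synthetic, the feature names (`age`, `bmi`, `smoker`) and the cost formula are invented, and the hyperparameters are not those of the actual MedPredict model. The Streamlit front end is not shown.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n = 500

# Synthetic lifestyle and demographic indicators (hypothetical columns).
data = pd.DataFrame({
    "age": rng.integers(18, 65, size=n),
    "bmi": rng.normal(28, 5, size=n),
    "smoker": rng.integers(0, 2, size=n),
})

# Invented cost formula plus noise, used only to have a target to fit.
cost = (
    200 * data["age"]
    + 300 * data["bmi"]
    + 15_000 * data["smoker"]
    + rng.normal(0, 1_000, size=n)
)

X_train, X_test, y_train, y_test = train_test_split(
    data, cost, test_size=0.2, random_state=42
)

model = RandomForestRegressor(n_estimators=200, random_state=42)
model.fit(X_train, y_train)

pred = model.predict(X_test)
mae = mean_absolute_error(y_test, pred)
r2 = r2_score(y_test, pred)

# Impurity-based importances show which inputs the forest relied on most.
importances = pd.Series(model.feature_importances_, index=data.columns)

print(f"MAE: {mae:.0f}, R^2: {r2:.3f}")
print(importances.sort_values(ascending=False))
```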

Python, Scikit-Learn, Random Forest, Streamlit, Regression

Skills & Tags

Scikit-Learn, Python, Data Science, AI, Data Visualization
