Website | GitHub

A machine learning web application that predicts three conditions: Diabetes, Heart Disease, and Parkinson’s Disease. It is built with Streamlit for the user interface and uses Support Vector Machine (SVM) and Logistic Regression models for prediction.

🌟 Features

  • Multi-Disease Prediction: Predict Diabetes, Heart Disease, and Parkinson’s Disease
  • Evaluated Models: SVM and Logistic Regression models trained on public medical datasets, with test accuracy between 77% and 87%
  • User-Friendly Interface: Clean, intuitive Streamlit web interface
  • Real-time Results: Instant predictions with confidence scores
  • Responsive Design: Works seamlessly across devices
  • Privacy-First: No data storage, all processing done locally
  • 🔄 Automated CI/CD Pipeline: Complete GitHub Actions workflow with testing, Docker builds, and AWS EC2 deployment

📊 Model Performance

1. 💉 Diabetes Prediction

  • Algorithm: Support Vector Machine (Linear Kernel)
  • Features: 8 (Glucose, Blood Pressure, BMI, Age, etc.)
  • Training Accuracy: 78%
  • Test Accuracy: 77%
  • Dataset Size: 768 samples

2. ❤️ Heart Disease Prediction

  • Algorithm: Logistic Regression
  • Features: 13 (Chest Pain, Cholesterol, Heart Rate, etc.)
  • Training Accuracy: 84%
  • Test Accuracy: 78%
  • Dataset Size: 303 samples

3. 🧠 Parkinson’s Disease Prediction

  • Algorithm: Support Vector Machine (Linear Kernel)
  • Features: 22 (voice measurements such as jitter, shimmer, and harmonics-to-noise ratio)
  • Training Accuracy: 88%
  • Test Accuracy: 87%
  • Dataset Size: 195 samples
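
The training and test accuracies listed above come from fitting a model on one part of the data and scoring it on both parts. The following is a minimal sketch of that procedure, not the notebooks' actual code: the data is synthetic (generated with `make_classification` to match the shape of the diabetes dataset), and the helper name `train_and_score`, the random seeds, and the stratified split are assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC


def train_and_score(model, X, y, test_size=0.2, seed=2):
    """Fit `model` on an 80-20 split and return (train_acc, test_acc)."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, stratify=y, random_state=seed
    )
    model.fit(X_train, y_train)
    train_acc = accuracy_score(y_train, model.predict(X_train))
    test_acc = accuracy_score(y_test, model.predict(X_test))
    return train_acc, test_acc


# Synthetic stand-in shaped like the diabetes data: 768 rows, 8 features.
# The real project would read data/diabetes.csv instead.
X, y = make_classification(n_samples=768, n_features=8, random_state=0)

svm_scores = train_and_score(SVC(kernel="linear"), X, y)
logreg_scores = train_and_score(LogisticRegression(max_iter=1000), X, y)

print("Linear SVM (train, test):", svm_scores)
print("Logistic Regression (train, test):", logreg_scores)
```

Comparing the two numbers for each model is a quick overfitting check: a training accuracy well above the test accuracy suggests the model fits the training rows more closely than unseen data.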

📊 Dataset Information

Data Sources

  • Diabetes: Pima Indians Diabetes Database
  • Heart Disease: UCI Heart Disease Dataset
  • Parkinson’s: Oxford Parkinson’s Disease Detection Dataset

Data Preprocessing

  • Missing value handling
  • Feature scaling and normalization
  • Train-test split (80-20)
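
The three preprocessing steps can be sketched with scikit-learn as follows. This is an illustration under stated assumptions rather than the project's exact code: the data is random, the median imputation strategy is a guess (the README does not say how missing values are handled), and the imputer and scaler are fitted on the training rows only so that no test information leaks into them.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Random stand-in data: 100 rows, 4 features, about 5% of values missing.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))
y = rng.integers(0, 2, size=100)
X[rng.random(X.shape) < 0.05] = np.nan

# 80-20 train-test split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Fill missing values with the column median (assumed strategy),
# then scale each feature to zero mean and unit variance.
preprocess = make_pipeline(SimpleImputer(strategy="median"), StandardScaler())

X_train_prepared = preprocess.fit_transform(X_train)  # fit on training rows only
X_test_prepared = preprocess.transform(X_test)        # reuse training statistics

print(X_train_prepared.shape, X_test_prepared.shape)
```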

🛠️ Technology Stack

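| Technology   | Purpose             | Version  |
|--------------|---------------------|----------|
| Python       | Core Programming    | 3.8+     |
| Streamlit    | Web Interface       | 1.28+    |
| Pandas       | Data Processing     | 2.0+     |
| NumPy        | Numerical Computing | 1.24+    |
| Scikit-Learn | Machine Learning    | 1.3+     |
| Pickle       | Model Serialization | Built-in |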
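
Model serialization with Pickle amounts to writing a fitted estimator to disk and reading it back later. The sketch below shows a round trip on a toy model; the dataset and the file name `example_model.sav` are hypothetical and chosen only for illustration.

```python
import pickle
import tempfile
from pathlib import Path

from sklearn.linear_model import LogisticRegression

# Tiny toy dataset, for illustration only.
X = [[0.0], [1.0], [2.0], [3.0]]
y = [0, 0, 1, 1]
model = LogisticRegression().fit(X, y)

with tempfile.TemporaryDirectory() as tmp:
    model_path = Path(tmp) / "example_model.sav"  # hypothetical file name

    # Save the fitted model.
    with open(model_path, "wb") as f:
        pickle.dump(model, f)

    # Load it back, as an application would at startup.
    with open(model_path, "rb") as f:
        restored = pickle.load(f)

same_predictions = list(restored.predict(X)) == list(model.predict(X))
print("Restored model matches original:", same_predictions)
```

Pickle files should only be loaded from trusted sources, and a model is best loaded with the same scikit-learn version that saved it.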

DevOps & Deployment Stack

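| Technology     | Purpose              | Version |
|----------------|----------------------|---------|
| GitHub Actions | CI/CD Pipeline       | Latest  |
| Docker         | Containerization     | 24.0+   |
| Docker Hub     | Container Registry   | Latest  |
| AWS EC2        | Cloud Infrastructure | Latest  |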

🔄 CI/CD Pipeline Overview

The pipeline consists of three main stages:

  1. 🧪 Testing Stage: Validates code quality and executes Jupyter notebooks
  2. 🏗️ Build & Push Stage: Creates Docker images and pushes them to Docker Hub
  3. Deployment Stage: Automatically deploys to AWS EC2 with zero downtime

Pipeline features:

  • 📦 Docker Containerization: Automatically builds optimized Docker images
  • 🐳 Docker Hub Integration: Pushes images to the Docker Hub registry
  • ☁️ Cloud Deployment: Deploys directly to AWS EC2 instances
  • 🔄 Zero-Downtime Deployment: Rolling updates without service interruption
  • 🧹 Resource Optimization: Automatic cleanup of unused Docker resources
  • 📊 Build Caching: Faster builds using the GitHub Actions cache
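
As an illustration of the Testing Stage, the sketch below builds one `jupyter nbconvert --execute` command per notebook and runs them in order, failing on the first notebook that raises an error. How the actual workflow executes the notebooks is not stated here, so the use of `nbconvert`, the `executed` output directory, and the helper names `notebook_commands` and `run_notebooks` are all assumptions.

```python
import subprocess
from pathlib import Path


def notebook_commands(notebook_dir):
    """Build one `jupyter nbconvert --execute` command per notebook."""
    notebooks = sorted(Path(notebook_dir).glob("*.ipynb"))
    return [
        [
            "jupyter", "nbconvert",
            "--to", "notebook",
            "--execute",
            "--output-dir", "executed",  # hypothetical output directory
            str(notebook),
        ]
        for notebook in notebooks
    ]


def run_notebooks(notebook_dir):
    """Execute every notebook; raises CalledProcessError on the first failure."""
    for command in notebook_commands(notebook_dir):
        subprocess.run(command, check=True)
```

A CI job could then call `run_notebooks("notebooks")` from the repository root, provided Jupyter and nbconvert are installed in the job's environment.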

Quick Start

Prerequisites

  • Python 3.8+
  • pip package manager

Local Setup

Installation

  1. Clone the repository
     git clone https://github.com/SamanwoySaha/Multiple-Disease-Prediction.git
     cd Multiple-Disease-Prediction
  2. Create and activate a virtual environment
     python -m venv venv
     source venv/bin/activate # On Windows: venv\Scripts\activate
  3. Install dependencies
     pip install -r requirements.txt
  4. Run the application
     streamlit run app.py
  5. Open your browser
     Navigate to: http://localhost:8501

Run the app from Docker Image

  1. Pull the image from Docker Hub
     docker pull samanwoysaha/multiple-disease-prediction:latest
  2. Run the image
     docker run -p 8501:8501 samanwoysaha/multiple-disease-prediction:latest
  3. Open your browser
     Navigate to: http://localhost:8501
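
The container can take a few seconds to start serving. As an optional check before opening the browser, the following standard-library sketch polls the app URL until it responds with HTTP 200. The helper name `wait_for_app` and its timeout values are hypothetical; only the URL comes from the steps above.

```python
import time
import urllib.error
import urllib.request


def wait_for_app(url="http://localhost:8501", timeout=30.0, interval=1.0):
    """Poll `url` until it answers with HTTP 200 or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=interval) as response:
                if response.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # Not reachable yet; retry after a short pause.
        time.sleep(interval)
    return False
```

For example, `wait_for_app()` returns `True` once the app is reachable and `False` if it is still unreachable after 30 seconds.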

📁 Project Structure

Multiple-Disease-Prediction/
├── src/
│   └── 📄 app.py                          # Main Streamlit application
├── 📊 models/
│   ├── diabetes_model.sav                 # Trained diabetes prediction model
│   ├── heart_disease_model.sav            # Trained heart disease model
│   └── parkinsons_model.sav               # Trained Parkinson's model
├── 📓 notebooks/
│   ├── diabetes_prediction.ipynb          # Diabetes model training notebook
│   ├── heart_disease_prediction.ipynb     # Heart disease training notebook
│   └── parkinsons_prediction.ipynb        # Parkinson's training notebook
├── 📂 data/
│   ├── diabetes.csv                       # Diabetes dataset
│   ├── heart_disease.csv                  # Heart disease dataset
│   └── parkinsons.csv                     # Parkinson's dataset
├── 🔧 requirements.txt                    # Python dependencies
└── 📚 README.md                           # Project documentation
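
To show how the pieces in this layout could fit together, here is a minimal sketch of a Streamlit page that loads a model from `models/` and turns form input into a prediction. It is not the contents of the real `app.py`. The helper names `load_model` and `predict_label`, the result messages, and the eight field labels (the column names commonly used for the Pima Indians dataset) are assumptions.

```python
import pickle

import numpy as np

# Assumed feature order; the real model may expect a different one.
DIABETES_FEATURES = [
    "Pregnancies", "Glucose", "BloodPressure", "SkinThickness",
    "Insulin", "BMI", "DiabetesPedigreeFunction", "Age",
]


def load_model(path):
    """Load a pickled scikit-learn model, e.g. models/diabetes_model.sav."""
    with open(path, "rb") as f:
        return pickle.load(f)


def predict_label(model, values, positive="Positive", negative="Negative"):
    """Turn one row of feature values into a human-readable result."""
    row = np.asarray(values, dtype=float).reshape(1, -1)  # one sample, n features
    return positive if model.predict(row)[0] == 1 else negative


def main():
    # Imported lazily so the helpers above can be used without Streamlit.
    import streamlit as st

    st.title("Diabetes Prediction")
    values = [st.number_input(name, min_value=0.0) for name in DIABETES_FEATURES]
    if st.button("Predict"):
        model = load_model("models/diabetes_model.sav")
        st.success(predict_label(model, values))


# Call main() at the end of the script when launching with `streamlit run`.
```

If the model was trained on scaled features, the same scaler would need to be saved and applied to the form values before calling `predict`.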

📈 Future Enhancements

  • Additional Diseases: Liver disease, kidney disease, cancer prediction
  • Deep Learning Models: Implement neural networks for better accuracy
  • Mobile App: React Native or Flutter mobile application
  • Advanced Visualizations: Interactive charts and graphs
  • User Authentication: Login system with prediction history
  • Telemedicine Integration: Connect with healthcare providers
  • Multi-language Support: Support for regional languages