Predictive Maintenance MLOps Project
An end-to-end MLOps project for predicting Remaining Useful Life (RUL) of industrial equipment using the NASA C-MAPSS Turbofan Engine Degradation Dataset. This project demonstrates production-grade ML engineering practices including CI/CD, experiment tracking, model serving, containerization, and monitoring.
Predictive maintenance uses machine learning to predict when equipment will fail, enabling proactive maintenance scheduling. This project:
Predicts RUL (Remaining Useful Life) of turbofan engines
Trains multiple models (Random Forest, Gradient Boosting, LSTM, etc.)
Tracks experiments with MLflow
Serves predictions via a REST API
Monitors performance through a Streamlit dashboard
Automates CI/CD with GitHub Actions
Intended benefits:
Reduce unplanned downtime by 30-50%
Lower maintenance costs through optimized scheduling
Extend equipment lifespan with timely interventions
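To make the prediction target concrete: in the C-MAPSS training files each engine is run until failure, so a common way to derive RUL labels is to subtract the current cycle from the engine's last recorded cycle. The sketch below illustrates that idea only; the function name and the `(unit, cycle)` tuple layout are assumptions for illustration, not this project's actual code.

```python
from collections import defaultdict


def add_rul_labels(rows):
    """Return (unit, cycle, rul) tuples for run-to-failure records.

    `rows` is an iterable of (unit_id, cycle) pairs; this layout is an
    assumption made for illustration only.
    """
    rows = list(rows)
    last_cycle = defaultdict(int)
    for unit, cycle in rows:
        # The last recorded cycle of each engine is treated as its failure point.
        last_cycle[unit] = max(last_cycle[unit], cycle)
    return [(unit, cycle, last_cycle[unit] - cycle) for unit, cycle in rows]


if __name__ == "__main__":
    records = [(1, 1), (1, 2), (1, 3), (2, 1), (2, 2)]
    for row in add_rul_labels(records):
        print(row)
```

With this labeling, the final cycle of every engine has an RUL of 0 and earlier cycles count down toward it.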
    PREDICTIVE MAINTENANCE SYSTEM

    +-----------+    +------------+    +----------------+    +------------+
    |   Data    |--->|    Data    |--->|      Data      |--->|   Model    |
    | Ingestion |    | Validation |    | Transformation |    |  Training  |
    +-----+-----+    +-----+------+    +-------+--------+    +-----+------+
          |                |                   |                   |
          |                |                   |                   v
          |                |                   |             +------------+
          |                |                   |             |   Model    |
          |                |                   |             | Evaluation |
          |                |                   |             +-----+------+
          |                |                   |                   |
          v                v                   v                   v
    +---------------------------------------------------------------------+
    |                       MLflow Tracking Server                        |
    |            (Experiments, Parameters, Metrics, Artifacts)            |
    +----------------------------------+----------------------------------+
                                       |
                                       v
    +---------------------------------------------------------------------+
    |                           Model Registry                            |
    |                  (Versioning, Staging, Production)                  |
    +----------------------------------+----------------------------------+
                                       |
              +------------------------+------------------------+
              v                        v                        v
       +-------------+          +-------------+          +-------------+
       |   FastAPI   |          |  Streamlit  |          |    Batch    |
       |  REST API   |          |  Dashboard  |          | Prediction  |
       +------+------+          +------+------+          +------+------+
              |                        |                        |
              v                        v                        v
    +---------------------------------------------------------------------+
    |                        Prometheus + Grafana                         |
    |                       (Monitoring & Alerting)                       |
    +---------------------------------------------------------------------+
    INFRASTRUCTURE

    +-------------+   +-------------+   +-------------+   +-------------+
    |   Docker    |   |   GitHub    |   |     DVC     |   |   MongoDB   |
    |   Compose   |   |   Actions   |   |   (Data)    |   |  (Storage)  |
    +-------------+   +-------------+   +-------------+   +-------------+
| Category | Technologies |
| --- | --- |
| ML/DL | scikit-learn, TensorFlow/Keras, LSTM |
| MLOps | MLflow, DVC, Docker, GitHub Actions |
| API | FastAPI, Uvicorn, Pydantic |
| Data | Pandas, NumPy, MongoDB |
| Visualization | Streamlit, Plotly, Matplotlib |
| Testing | pytest, pytest-cov, hypothesis |
| Code Quality | Black, isort, flake8, mypy, pre-commit |
| Monitoring | Prometheus, Grafana |
Automated data ingestion from multiple sources
Data validation with quality checks and anomaly detection
Feature engineering (lag features, rolling statistics)
Multiple model training (RF, GB, Linear, Ridge, Lasso, SVR, LSTM)
Hyperparameter tuning with GridSearchCV
Model evaluation with comprehensive metrics (RMSE, MAE, R², MAPE)
Experiment tracking with MLflow
Model registry for versioning and staging
Data versioning with DVC
CI/CD pipeline with GitHub Actions
Containerization with Docker & Docker Compose
Pre-commit hooks for code quality
REST API with FastAPI for real-time predictions
Batch prediction pipeline for large datasets
Monitoring dashboard with Streamlit
Health checks and API documentation (Swagger/OpenAPI)
Risk level classification (Critical, High, Medium, Low)
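The risk level classification listed above can be pictured as a simple thresholding of the predicted RUL. The following is a minimal sketch under stated assumptions: the function name and the threshold values (30, 60 and 120 cycles) are hypothetical placeholders, since the README does not state the cut-offs the project uses.

```python
def classify_risk(rul_cycles, thresholds=(30, 60, 120)):
    """Map a predicted RUL (in cycles) to a risk label.

    The threshold values are hypothetical placeholders; the real cut-offs
    would come from the project's configuration.
    """
    critical, high, medium = thresholds
    if rul_cycles < critical:
        return "Critical"
    if rul_cycles < high:
        return "High"
    if rul_cycles < medium:
        return "Medium"
    return "Low"


if __name__ == "__main__":
    for rul in (10, 45, 90, 200):
        print(rul, classify_risk(rul))
```

Keeping the thresholds as a parameter makes it easy to tune them per equipment type without touching the classification logic.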
    predictive-maintenance/
    ├── .github/
    │   └── workflows/
    │       ├── main.yml                # CI/CD pipeline
    │       └── model-training.yml      # Scheduled training
    ├── api/
    │   ├── __init__.py
    │   ├── main.py                     # FastAPI application
    │   └── schemas.py                  # Pydantic models
    ├── config/
    │   ├── config.yaml                 # Main configuration
    │   └── schema.yaml                 # Data schema
    ├── dashboard/
    │   └── app.py                      # Streamlit dashboard
    ├── data/
    │   ├── raw/                        # Raw data
    │   ├── validated/                  # Validated data
    │   ├── transformed/                # Processed features
    │   └── predictions/                # Batch predictions
    ├── monitoring/
    │   ├── prometheus.yml              # Prometheus config
    │   └── grafana/                    # Grafana dashboards
    ├── notebooks/
    │   └── eda.ipynb                   # Exploratory analysis
    ├── src/
    │   ├── components/
    │   │   ├── data_ingestion.py
    │   │   ├── data_validation.py
    │   │   ├── data_transformation.py
    │   │   ├── model_trainer.py
    │   │   ├── model_evaluation.py
    │   │   └── batch_prediction.py
    │   ├── pipelines/
    │   │   └── training_pipeline.py
    │   ├── utils/
    │   │   ├── logger.py
    │   │   └── model_utils.py
    │   ├── constants/
    │   │   └── __init__.py
    │   └── mlflow_tracking.py
    ├── tests/
    │   ├── unit/
    │   │   ├── test_data_validation.py
    │   │   ├── test_model_evaluation.py
    │   │   └── test_api.py
    │   ├── integration/
    │   │   └── test_pipeline.py
    │   └── conftest.py                 # Pytest fixtures
    ├── artifacts/
    │   ├── models/                     # Trained models
    │   ├── logs/                       # Application logs
    │   └── reports/                    # Evaluation reports
    ├── .dvc/                           # DVC configuration
    ├── .pre-commit-config.yaml         # Pre-commit hooks
    ├── docker-compose.yml              # Docker services
    ├── Dockerfile                      # Multi-stage Dockerfile
    ├── requirements.txt                # Dependencies
    ├── setup.py                        # Package setup
    ├── pyproject.toml                  # Build configuration
    ├── pytest.ini                      # Pytest configuration
    └── README.md                       # This file
1. Data Ingestion → Load raw sensor data from source
2. Data Validation → Validate schema, types, and ranges
3. Transformation → Feature engineering & scaling
4. Model Training → Train multiple models
5. Model Evaluation → Compare and select best model
6. Model Registry → Version and stage models
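The transformation step (step 3) builds lag features and rolling statistics. As a rough illustration of what that can look like with pandas, the sketch below computes both per engine so that values never leak across engines. The column names (`unit`, `cycle`, `s1`), the helper name, and the default lag and window sizes are assumptions, not taken from the project's code.

```python
import pandas as pd


def add_lag_and_rolling(df, sensor_cols, lag=1, window=3):
    """Add per-engine lag and rolling-mean features.

    Column names ("unit", "cycle") and the default lag/window sizes are
    illustrative assumptions.
    """
    out = df.sort_values(["unit", "cycle"]).copy()
    for col in sensor_cols:
        # Previous reading of the same engine; NaN for an engine's first cycles.
        out[f"{col}_lag{lag}"] = out.groupby("unit")[col].shift(lag)
        # Trailing mean over up to `window` cycles, never crossing engines.
        out[f"{col}_roll{window}_mean"] = out.groupby("unit")[col].transform(
            lambda s: s.rolling(window, min_periods=1).mean()
        )
    return out


if __name__ == "__main__":
    frame = pd.DataFrame(
        {
            "unit": [1, 1, 1, 2, 2],
            "cycle": [1, 2, 3, 1, 2],
            "s1": [10.0, 12.0, 14.0, 20.0, 22.0],  # made-up sensor readings
        }
    )
    print(add_lag_and_rolling(frame, ["s1"]))
```

Grouping by engine before shifting or rolling is the important design choice here: without it, the first cycles of one engine would borrow readings from the last cycles of the previous one.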
| Model | Type | Use Case |
| --- | --- | --- |
| Random Forest | Ensemble | Baseline, robust |
| Gradient Boosting | Ensemble | High accuracy |
| Linear Regression | Linear | Interpretable |
| Ridge/Lasso | Linear | Regularized |
| SVR | Kernel | Non-linear |
| LSTM | Deep Learning | Sequence modeling |
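Training several candidates and tuning each with GridSearchCV could be sketched roughly as follows. This is an illustration only: it uses two of the listed scikit-learn models, the parameter grids are arbitrary assumptions, and the data is synthetic rather than C-MAPSS sensor features.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV


def tune_candidates(X, y, cv=3):
    """Grid-search each candidate and return {name: fitted GridSearchCV}."""
    # Candidate models and grids are illustrative assumptions.
    candidates = {
        "ridge": (Ridge(), {"alpha": [0.1, 1.0, 10.0]}),
        "random_forest": (
            RandomForestRegressor(random_state=0),
            {"n_estimators": [20, 50], "max_depth": [3, None]},
        ),
    }
    results = {}
    for name, (estimator, grid) in candidates.items():
        search = GridSearchCV(
            estimator, grid, scoring="neg_root_mean_squared_error", cv=cv
        )
        results[name] = search.fit(X, y)
    return results


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(120, 4))  # synthetic stand-in for sensor features
    y = X @ np.array([3.0, -2.0, 0.5, 0.0]) + rng.normal(scale=0.1, size=120)
    results = tune_candidates(X, y)
    # Scores are negated RMSE, so the largest score is the best model.
    best = max(results, key=lambda name: results[name].best_score_)
    print(best, results[best].best_params_)
```

Because scikit-learn maximizes scores, RMSE is exposed as `neg_root_mean_squared_error`; comparing `best_score_` across candidates therefore picks the model with the lowest cross-validated RMSE.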
| Method | Endpoint | Description |
| --- | --- | --- |
| GET | /health | Health check |
| GET | /models | List available models |
| POST | /predict | Single/batch prediction |
| POST | /predict/batch | File-based batch prediction |
| POST | /models/reload | Reload models |
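A client call to the `/predict` endpoint might be assembled as in the sketch below, using only the standard library. The request body layout (`{"data": [...]}`), the sensor field names, and the local base URL are hypothetical; the actual schema is defined by the service's Pydantic models and shown in its Swagger/OpenAPI documentation.

```python
import json
import urllib.request


def build_predict_request(base_url, sensor_readings):
    """Build (but do not send) a POST request for the /predict endpoint.

    The JSON body layout ({"data": [...]}) is a hypothetical example; check
    the service's Swagger/OpenAPI page for the real schema.
    """
    body = json.dumps({"data": sensor_readings}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_predict_request(
        "http://localhost:8000",  # assumed local address
        [{"sensor_2": 641.8, "sensor_3": 1589.7}],  # made-up readings
    )
    print(req.get_method(), req.full_url)
    # To send it against a running service:
    # with urllib.request.urlopen(req) as resp:
    #     print(json.load(resp))
```

Separating request construction from sending keeps the example runnable without a live server and makes the payload easy to inspect.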
Monitoring Dashboard
The Streamlit dashboard provides:
Overview: Key metrics, model comparison
Model Performance: Detailed metrics, visualizations
Predictions: Interactive prediction interface
Data Explorer: Feature distributions, correlations
System Health: API status, resource usage
Model Performance (Test Set)
| Model | RMSE | MAE | R² |
| --- | --- | --- | --- |
| Random Forest | 18.5 | 12.3 | 0.87 |
| Gradient Boosting | 17.2 | 11.8 | 0.89 |
| LSTM | 15.8 | 10.5 | 0.91 |
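For readers unfamiliar with the metrics reported above, the sketch below shows how RMSE, MAE and R² are defined, in plain Python. It is illustrative only: the function name is hypothetical, the sample values are made up, and the project itself may compute these metrics with library routines instead.

```python
import math


def regression_metrics(y_true, y_pred):
    """Compute RMSE, MAE and R² for two equal-length sequences."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)          # residual sum of squares
    mae = sum(abs(e) for e in errors) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot  # undefined when y_true is constant
    return {"rmse": math.sqrt(ss_res / n), "mae": mae, "r2": r2}


if __name__ == "__main__":
    actual = [100, 80, 60, 40]      # made-up RUL values
    predicted = [90, 85, 60, 50]    # made-up predictions
    print(regression_metrics(actual, predicted))
```

RMSE penalizes large errors more heavily than MAE, which is why the two are usually reported together; R² expresses how much of the variance in the true RUL the model explains.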