📊 BDA - Big Data Analytics Project & Labs
Submission by Rémi Locquette - ESIEE 2025-2026
🎯 Overview
This site contains the complete documentation, laboratories, and final project for the Big Data Analytics course using Apache Spark.
📚 Content
Laboratory Sessions (Laboratory_BDA/)
- Lab 0: Introduction to Spark and PySpark basics
- Lab 1: Data ingestion and transformation
- Lab 2: Advanced transformations and joins
- Lab 3: Optimization and performance tuning
- Lab 4: Advanced analytics and ML pipeline
Each lab includes:
- Practice assignments (.ipynb)
- Graded assignments with reports
- Environment setup guides (ENV.md)
- Metrics and results
Final Project (Projet_BDA/)
Bitcoin Price Direction Prediction with Blockchain Data
- End-to-end PySpark pipeline
- Blockchain data acquisition and parsing
- Feature engineering from on-chain metrics
- Machine learning model comparison
- Comprehensive analysis and reporting
📁 Structure
rendu_remi_locquette/
├── Laboratory_BDA/
│   ├── Lab_0/ - Introduction
│   ├── Lab_1/ - Data Ingestion
│   ├── Lab_2/ - Transformations
│   ├── Lab_3/ - Optimization
│   └── Lab_4/ - Advanced Analytics
│
└── Projet_BDA/
    ├── Project.ipynb - Full pipeline
    ├── Report.md - Final report
    ├── Utils/ - Data acquisition scripts
    └── outputs/ - Results & artifacts
🔧 Quick Links
- Resources & Downloads - All notebooks and datasets
- Support Docs - Environment setup and helpers
✅ Requirements
- Python 3.10+
- Apache Spark 3.5+
- PySpark
- Jupyter Notebook
See individual ENV.md files for detailed setup instructions.
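Before opening a notebook, the version requirements above can be checked with a short script. This is a minimal sketch, not part of the repository: the helper names `parse_version` and `check_environment` are hypothetical, and it assumes PySpark was installed with pip, so that the `pyspark` package version matches the Spark release it bundles.

```python
import sys
from importlib import metadata
from itertools import takewhile

# Minimum versions taken from the Requirements list above.
MIN_PYTHON = (3, 10)
MIN_PYSPARK = (3, 5)


def parse_version(text):
    """Return the leading (major, minor) integers of a version string."""
    parts = []
    for piece in text.split(".")[:2]:
        # Keep only the leading digits, so "5rc1" is read as 5.
        digits = "".join(takewhile(str.isdigit, piece))
        parts.append(int(digits) if digits else 0)
    while len(parts) < 2:
        parts.append(0)
    return tuple(parts)


def check_environment():
    """Return a list of problems found; an empty list means all checks passed."""
    problems = []
    if sys.version_info[:2] < MIN_PYTHON:
        found = ".".join(str(n) for n in sys.version_info[:2])
        problems.append(f"Python 3.10+ required, found {found}")
    try:
        # Assumption: the pip package version mirrors the Spark version.
        pyspark_version = metadata.version("pyspark")
    except metadata.PackageNotFoundError:
        problems.append("PySpark is not installed")
    else:
        if parse_version(pyspark_version) < MIN_PYSPARK:
            problems.append(f"PySpark 3.5+ required, found {pyspark_version}")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("WARNING:", problem)
```

The script prints nothing when every check passes. It does not verify Jupyter or a standalone Spark installation; the ENV.md files remain the reference for those.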
Last Updated: December 2025