📊 BDA - Big Data Analytics Project & Labs

Submission by Rémi Locquette - ESIEE 2025-2026


🎯 Overview

This site contains the documentation, lab sessions, and final project for the Big Data Analytics course, all built on Apache Spark.

📚 Content

Laboratory Sessions (Laboratory_BDA/)

  • Lab 0: Introduction to Spark and PySpark basics
  • Lab 1: Data ingestion and transformation
  • Lab 2: Advanced transformations and joins
  • Lab 3: Optimization and performance tuning
  • Lab 4: Advanced analytics and ML pipeline

Each lab includes:

  • Practice assignments (.ipynb)
  • Graded assignments with reports
  • Environment setup guides (ENV.md)
  • Metrics and results

Final Project (Projet_BDA/)

Bitcoin Price Direction Prediction with Blockchain Data

  • End-to-end PySpark pipeline
  • Blockchain data acquisition and parsing
  • Feature engineering from on-chain metrics
  • Machine learning model comparison
  • Comprehensive analysis and reporting
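To make the prediction target concrete, here is a minimal sketch of how a binary "price direction" label could be derived from a series of closing prices. The labeling rule (1 if the next close is strictly higher, 0 otherwise) and the function name `direction_labels` are assumptions for illustration; the definition actually used is the one in Project.ipynb and Report.md. The sketch is plain Python so it runs without a Spark session; in a PySpark pipeline the same idea would typically be expressed with a window function such as `lead`.

```python
def direction_labels(closes):
    """Label each observation by the direction of the next move.

    Assumed rule (illustrative only): 1 if the next close is strictly
    higher than the current one, 0 otherwise. The last observation has
    no successor, so the result is one element shorter than the input.
    """
    return [1 if nxt > cur else 0 for cur, nxt in zip(closes, closes[1:])]


# Hypothetical closing prices, not taken from the project data.
sample_closes = [100.0, 101.5, 101.5, 99.0]
labels = direction_labels(sample_closes)
```

Dropping the last observation avoids training on a row whose label would require a future price that is not in the data.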

📁 Structure

rendu_remi_locquette/
├── Laboratory_BDA/
│   ├── Lab_0/ - Introduction
│   ├── Lab_1/ - Data Ingestion
│   ├── Lab_2/ - Transformations
│   ├── Lab_3/ - Optimization
│   └── Lab_4/ - Advanced Analytics
│
└── Projet_BDA/
    ├── Project.ipynb - Full pipeline
    ├── Report.md - Final report
    ├── Utils/ - Data acquisition scripts
    └── outputs/ - Results & artifacts


✅ Requirements

  • Python 3.10+
  • Apache Spark 3.5+
  • PySpark
  • Jupyter Notebook

See individual ENV.md files for detailed setup instructions.
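Before opening the notebooks, the minimum versions listed above can be checked with a short script. This is a sketch, not part of the repository: the helper names `major_minor` and `check_environment` are hypothetical, and it assumes the installed PySpark version (`pyspark.__version__`) matches the Spark version in use, which holds for a pip-installed PySpark but not necessarily for a separately installed cluster.

```python
import re
import sys

# Minimum versions taken from the Requirements list above.
MIN_PYTHON = (3, 10)
MIN_SPARK = (3, 5)


def major_minor(version):
    """Extract (major, minor) from a version string such as '3.5.1'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def check_environment(python_version=None, spark_version=None):
    """Return a list of human-readable problems; an empty list means OK.

    Versions can be passed explicitly (useful for testing); otherwise they
    are read from the running interpreter and the installed pyspark package.
    """
    problems = []

    if python_version is None:
        python_version = "{}.{}".format(*sys.version_info[:2])
    if major_minor(python_version) < MIN_PYTHON:
        problems.append(f"Python {python_version} is older than 3.10")

    if spark_version is None:
        try:
            import pyspark  # imported lazily so the check runs without it
            spark_version = pyspark.__version__
        except ImportError:
            problems.append("PySpark is not installed")

    if spark_version is not None and major_minor(spark_version) < MIN_SPARK:
        problems.append(f"Spark {spark_version} is older than 3.5")

    return problems
```

Calling `check_environment()` with no arguments inspects the current environment; printing the returned list shows which requirement, if any, is not met.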


Last Updated: December 2025