Analyzing U.S. Air Flights with Apache Spark and Hadoop
For my university course in Big Data and Cloud Computing, I developed a hands-on project to process and analyze U.S. domestic flight data using a big data pipeline built on Apache Spark, Hadoop and Hive.

📂 Project Repository on GitHub

✈️ Project Overview

The dataset comes from the U.S. Department of Transportation (DOT) and includes over 20 years of flight records across domestic U.S. routes. It contains rich details such as: ...