CSSE 434 - Introduction to the Hadoop Ecosystem
- Credit Hours: 4R-0L-4C
- Term Available: -
- Graduate Studies Eligible: Yes
- Prerequisites: CSSE 230 (some experience with SQL recommended)
- Corequisites: None
This advanced course examines emerging Big Data techniques through hands-on introductions to the technologies and tools that make up the Hadoop ecosystem. Topics include the internals of MapReduce and the Hadoop Distributed File System (HDFS); the internals of the YARN distributed operating system; MapReduce for data processing; transformation and analysis tools for data at scale (processing terabytes and petabytes of information quickly); scheduling jobs with workflow engines; data transfer tools; and real-time engines for data processing.