Apache Spark is an open-source, distributed computing framework for processing and analyzing big data.
Big Data and Big Data Technologies
Understand what big data is and take a look at some of the most popular big data technologies used today.
Apache Kafka is a distributed event-streaming platform. It works like a large commit log: records are appended in sequence as they arrive, in real time. A commit log keeps track of what has happened, much like a running record of transactions.
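The commit-log idea can be illustrated with a short Python sketch. This is a toy in-memory model, not Kafka's API: the `CommitLog` class, its `append` and `read_from` methods, and the sample event names are all hypothetical and chosen only for illustration. It shows the two properties described above: records are only ever added at the end, and each record gets a sequential position (an offset) that a reader can resume from.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class CommitLog:
    """Toy append-only log (hypothetical, for illustration only)."""
    _records: List[str] = field(default_factory=list)

    def append(self, record: str) -> int:
        """Add a record at the end and return its offset (position in the log)."""
        self._records.append(record)
        return len(self._records) - 1

    def read_from(self, offset: int) -> List[Tuple[int, str]]:
        """Return (offset, record) pairs from the given offset to the end."""
        return list(enumerate(self._records[offset:], start=offset))


# Events are stored in the order they arrive.
log = CommitLog()
log.append("order-created")
log.append("payment-received")
last = log.append("order-shipped")

# A reader that has already seen offset 0 can resume from offset 1.
for offset, record in log.read_from(1):
    print(offset, record)
```

Because existing records are never modified or reordered, any number of readers can replay the same history independently, each keeping track of its own offset.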
Apache Flume is an open-source, distributed service for collecting and moving logs.
Apache Cassandra is an open-source, distributed NoSQL database management system (DBMS) that can quickly process large volumes of data spread across several servers.
Big data refers to massive and complex volumes of structured, semi-structured, or unstructured data. Examples include social media data, transactional data (stock prices, purchase histories), sensor data (location data, weather data), and satellite data.