Apache Spark
Apache Spark is an open-source, distributed computing framework for processing and analyzing big data.
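Apache Kafka is a distributed event-streaming platform. It works like a large commit log: an append-only record in which events are stored in the order they arrive, in real time. Because the log keeps every event in sequence, it serves as a running record of what has happened, much like a ledger of transactions.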
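The commit-log idea can be sketched in a few lines of Python. This is only a toy, in-memory illustration of an append-only log. It is not Kafka's API, and the names `CommitLog`, `append`, and `read_from` are hypothetical, chosen for this example.

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class CommitLog:
    """Toy in-memory append-only log (illustrative only, not Kafka's API)."""

    _records: List[Any] = field(default_factory=list)

    def append(self, record: Any) -> int:
        """Add a record at the end and return its offset (its position in the log)."""
        self._records.append(record)
        return len(self._records) - 1

    def read_from(self, offset: int) -> List[Any]:
        """Return every record at or after the given offset, in arrival order."""
        if offset < 0:
            raise ValueError("offset must be non-negative")
        return self._records[offset:]


# Hypothetical events, appended in the order they occur.
log = CommitLog()
first = log.append({"event": "order_placed", "order_id": 1})
second = log.append({"event": "order_paid", "order_id": 1})

# A reader can replay the log from any offset and sees events in sequence.
replayed = log.read_from(0)
```

Records are only ever added at the end, so an offset always refers to the same record. That is what lets a reader resume from a known position.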
Apache Flume is an open-source, distributed service for collecting and moving logs.
Apache Cassandra is an open-source, distributed NoSQL database management system (DBMS) that can quickly process large volumes of data spread across multiple servers.
Big data refers to massive and complex volumes of structured, semi-structured, or unstructured data. Examples include social media data, transactional data (stock prices, purchase histories), sensor data (location data, weather data), and satellite data.