+ 1

Does anyone know about Hadoop and Spark?

Is there any source where I can learn them?

8th Aug 2019, 8:03 PM

Babashehu Shettima Musti

1 Answer

+ 11

Hadoop is an open-source framework for solving big data problems. It stores and processes very large datasets in a distributed environment, spreading both the data and the computation across a cluster of machines. Its MapReduce model filters and aggregates huge raw inputs, which often contain a lot of unnecessary data, down to the smaller results you actually need.

You can start learning Hadoop from the official documentation or from video lectures on Udemy and edX:
https://www.edureka.co/blog/hadoop-tutorial/
https://hadoop.apache.org/docs/stable/

Spark is another tool for big data processing. Apache Spark is an open-source, general-purpose distributed computing engine for processing and analyzing large amounts of data. Like Hadoop, it supports map and reduce style operations along with other techniques, and it uses micro-batching for real-time streaming.
https://spark.apache.org/documentation.html
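To make the map and reduce idea a bit more concrete, here is a minimal sketch of a word count in plain Python. It is not Hadoop or Spark API code and runs on a single machine; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) and the sample sentences are made up for illustration. A real cluster would run the same three steps spread over many machines.

```python
from collections import defaultdict


def map_phase(lines):
    """Map: emit a (word, 1) pair for every word in every line."""
    for line in lines:
        for word in line.lower().split():
            yield (word, 1)


def shuffle_phase(pairs):
    """Shuffle: group all values that share the same key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: combine each key's values into one result (here, a sum)."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(lines):
    """Chain the three phases, as a MapReduce-style job would."""
    return reduce_phase(shuffle_phase(map_phase(lines)))


if __name__ == "__main__":
    # Hypothetical sample input, standing in for a large distributed file.
    sample = [
        "big data needs big tools",
        "spark and hadoop process big data",
    ]
    print(word_count(sample))
```

The point of splitting the work this way is that the map and reduce steps only ever look at one record or one key at a time, so a framework can hand different pieces to different machines.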

8th Aug 2019, 8:23 PM

GAWEN STEASY