+ 1
Does anyone know about Hadoop and Spark?
Is there any source where I can learn them?
1 Answer
+ 11
Hadoop is an open-source framework for solving big data problems. It stores and processes very large datasets in a distributed environment: the data is split across many machines (HDFS), and jobs such as MapReduce run in parallel on those machines, boiling down huge raw datasets, much of which is unnecessary, into smaller, useful results.
You can start learning Hadoop from the official documentation, or from video courses on Udemy or edX.
https://www.edureka.co/blog/hadoop-tutorial/
https://hadoop.apache.org/docs/stable/
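To get a feel for the MapReduce idea mentioned above before installing anything, here is a minimal sketch in plain Python. It does not use Hadoop at all; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are made up for illustration, and everything runs on one machine, whereas Hadoop would spread these steps across a cluster.

```python
from collections import defaultdict


def map_phase(line):
    # Map step: emit a (word, 1) pair for every word in one line of input.
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    # Shuffle step: group all values that share the same key.
    # In a real cluster the framework does this between map and reduce.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    # Reduce step: combine all values for one key into a single result.
    return key, sum(values)


def word_count(lines):
    # Run map over every line, group by word, then reduce each group.
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data is big", "spark and hadoop process big data"]))
```

The point is the shape of the job: many raw lines go in, and a small summary (one count per word) comes out.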
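Spark is another engine used for big data processing. Like Hadoop, it uses a map/reduce style of processing along with other techniques, but it keeps intermediate data in memory where it can instead of writing it to disk between steps.
Apache Spark is an open-source, general-purpose distributed computing engine for processing and analyzing large amounts of data and reducing it to the parts that matter. For real-time streaming, Spark uses micro-batching: incoming data is collected into small batches that are processed one after another.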
https://spark.apache.org/documentation.html
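The micro-batching idea can also be sketched in plain Python. This is only an illustration and not Spark's API: `micro_batches` and `running_totals` are hypothetical helper names, and the batches here are cut by item count to keep the example simple, whereas Spark cuts its batches by time interval.

```python
from itertools import islice


def micro_batches(stream, batch_size):
    # Cut a (possibly endless) stream into small lists of at most batch_size items.
    iterator = iter(stream)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return  # the stream is exhausted
        yield batch


def running_totals(stream, batch_size):
    # Treat each micro-batch as a tiny batch job and carry state between batches.
    total = 0
    for batch in micro_batches(stream, batch_size):
        total += sum(batch)
        yield total


if __name__ == "__main__":
    for total in running_totals([1, 2, 3, 4, 5], batch_size=2):
        print(total)
```

Each batch is handled like an ordinary small batch job, which is why the same style of code can serve both batch and streaming work.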