+ 8
What is the difference between data mining, data warehousing, data science, data analysis and big data?
Explain the field of work, features, differences, etc.
3 Answers
+ 12
Data science is the art of data analysis, utilizing statistics and mathematics, probabilistic theories and some programming (usually scripting languages) to discover hidden patterns inside the data. That enables the algorithms to "learn by themselves" based on the patterns they unravel.
Data mining is part of data science, concentrated on digging through your data and deciding which parts are relevant and carry some value and which do not.
Data warehousing is about storing big amounts of data, usually in a clustered architecture (as one machine may be far too small to do that), and making sure it is still accessible and operational.
Big data is a term with no precise definition, but in general, it is used when large, very large and extremely large amounts of data, often real-time, are concerned.
+ 9
Tysm☺️
+ 7
Adding a big data explanation to @kuba's answer:
Big data refers to the effective classification, storage, retrieval and analysis of large amounts of data.
Let us say I have a company; I have a hotel.
Now, in my game park, I have an app that customers use to pay for different services.
All customer purchases are piped into the database.
Now consider the fact that customers also pay for hotel-provided taxis with the app.
A casual glance might tell you that you have only financial data, but you have much more:
the time and place of each purchase, and customer details like age, etc.
Now the steps would be to:
1. Store the data securely
2. Classify the data
3. Run a large number of calculations on it
4. Allow easy retrieval
A special note on step 1: it normally depends on the big data plan / strategy / framework you are using.
Let us say you use Hadoop. Hadoop stores copies of your data across several servers, so that even if some of your servers are down (25% of them, say), you can still access your data.
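To make the "servers are down" point concrete, here is a toy sketch of replication. It assumes three copies of every block (which is HDFS's default replication factor) and a simple round-robin placement; the real Hadoop placement policy is rack-aware and more involved. The server names and the functions `place_blocks` and `readable` are made up for this illustration.

```python
def place_blocks(num_blocks, servers, replication=3):
    """Assign each block to `replication` distinct servers (toy round-robin)."""
    placement = {}
    for block in range(num_blocks):
        placement[block] = {
            servers[(block + i) % len(servers)] for i in range(replication)
        }
    return placement


def readable(placement, down):
    """True if every block still has at least one copy on a live server."""
    return all(replicas - down for replicas in placement.values())


# Hypothetical cluster of 8 servers holding 16 blocks.
servers = [f"server{i}" for i in range(8)]
placement = place_blocks(num_blocks=16, servers=servers)

# 2 of 8 servers (25%) offline: every block has 3 copies, so one survives.
print(readable(placement, {"server0", "server1"}))  # True

# 3 servers offline that happen to hold all copies of one block: data is lost.
print(readable(placement, {"server0", "server1", "server2"}))  # False
```

So whether the data survives depends on how many copies exist and where they sit, not only on the percentage of servers that fail.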
Now for analysis: that data might be used to discover trends, like what older people prefer, or what the relation is between traffic jams and income, etc.
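The classify / calculate / retrieve steps and the "what do older people prefer" question can be sketched in a few lines of plain Python. Everything here is invented for illustration: the `Purchase` fields, the age brackets and the sample records. Secure storage (step 1) is left out, and a real system would do this over far more data than fits in a list.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Purchase:
    customer_age: int
    service: str   # e.g. "taxi", "restaurant"
    amount: float
    hour: int      # hour of day of the purchase
    place: str


def age_group(age):
    """Step 2: classify a customer into a coarse (made-up) age bracket."""
    if age < 30:
        return "under 30"
    if age < 60:
        return "30-59"
    return "60+"


def spend_by_group(purchases):
    """Step 3: total amount spent per age group and service."""
    totals = defaultdict(lambda: defaultdict(float))
    for p in purchases:
        totals[age_group(p.customer_age)][p.service] += p.amount
    return totals


def favourite_service(totals, group):
    """Step 4: retrieve the service an age group spends the most on."""
    services = totals[group]
    return max(services, key=services.get)


# Invented sample data standing in for what the app pipes in.
purchases = [
    Purchase(67, "taxi", 20.0, 9, "hotel"),
    Purchase(71, "taxi", 15.0, 18, "game park"),
    Purchase(65, "restaurant", 30.0, 13, "hotel"),
    Purchase(24, "restaurant", 25.0, 20, "hotel"),
    Purchase(28, "safari", 80.0, 7, "game park"),
]

totals = spend_by_group(purchases)
print(favourite_service(totals, "60+"))       # taxi
print(favourite_service(totals, "under 30"))  # safari
```

The same grouping idea extends to time and place: swap `age_group(p.customer_age)` for `p.hour` or `p.place` to look at when and where money is spent.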
Why so much fuss about data?
Simply because so much data flows in.
The hotel does not only have app data; it has lots and lots of data coming in, so it needs an appropriate system to deal with the influx.