+ 8

Why does data cleaning play a vital role in analysis?

11th Mar 2018, 9:28 PM
molly
2 Answers
+ 4
There is an old adage: "it's only as good as the data". If your data contains duplicates or invalid entries, you will get incorrect results. A simple example: let's say you want to know how many customers you have. You could count the number of email addresses in your customer database. But if you have duplicate emails, or emails that are invalid, then the answer you get will be wrong.
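To make the customer-count example concrete, here is a minimal sketch in Python. The sample addresses, the `count_customers` helper, and the simple validity pattern are all made up for illustration; a real database would likely need stricter validation and its own rules for what counts as a duplicate (here, addresses that differ only by case or surrounding spaces are assumed to be the same customer).

```python
import re

# Hypothetical sample data; these addresses are invented for illustration.
raw_emails = [
    "ann@example.com",
    "Ann@Example.com ",   # same address, different case and a trailing space
    "bob@example.com",
    "bob@example.com",    # exact duplicate
    "not-an-email",       # invalid: no @ sign
    "",                   # missing value
    "carol@example.org",
]

# Deliberately simple check (an assumption, not a full email validator):
# some text, an @, some text, a dot, some text, and no whitespace.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def count_customers(emails):
    """Count distinct, plausibly valid email addresses."""
    cleaned = set()
    for email in emails:
        # Normalize so that trivial variations collapse to one entry.
        normalized = email.strip().lower()
        if EMAIL_PATTERN.match(normalized):
            cleaned.add(normalized)
    return len(cleaned)


naive_count = len(raw_emails)               # counts every row as a customer
clean_count = count_customers(raw_emails)   # counts after cleaning
print(naive_count, clean_count)             # prints: 7 3
```

Counting the raw rows gives 7 customers, while counting after removing duplicates and invalid entries gives 3, so the uncleaned answer is off by more than double.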
11th Mar 2018, 9:44 PM
Mike Choy
+ 7
Thank you so much, I get it now. The example helped 😃 I found this answer too: Cleaning data from multiple sources to transform it into a format that data analysts or data scientists can work with is a cumbersome process, because as the number of data sources increases, the time taken to clean the data grows rapidly with both the number of sources and the volume of data they generate. Cleaning alone might take up to 80% of the time, which makes it a critical part of the analysis task.
11th Mar 2018, 9:51 PM
molly