Data science: joining data from 3 different sources
Hello guys! I have a problem: I need to join 3 CSV files with data about the same companies, but from different sources. The data is in a different form in each file, so when I join them, how do I know which value to select as the right company name? Example: Google, goagle, guugle. Output: google. Thank you!
1 answer
Seems like GIGO (garbage in, garbage out).
Are the names just misspelled? I would probably store each data set separately, then think about combining records once you know whether the differences are typos or something else.
Perhaps you could add a function to show possible duplicates to a user and let them decide?
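A minimal sketch of such a function, assuming Python and only the standard library's difflib. The name possible_duplicates and the 0.6 similarity cutoff are illustrative assumptions, not something from the question, and the cutoff would likely need tuning on real company names.

```python
import difflib


def possible_duplicates(names, cutoff=0.6):
    """Return pairs of similar-looking names for a human to review."""
    # Normalise case and whitespace, and drop exact repeats.
    cleaned = sorted({n.strip().lower() for n in names})
    pairs = []
    for i, name in enumerate(cleaned):
        # Compare only against later names so each pair is reported once.
        matches = difflib.get_close_matches(
            name, cleaned[i + 1:], n=5, cutoff=cutoff
        )
        pairs.extend((name, m) for m in matches)
    return pairs


names = ["Google", "goagle", "guugle", "Microsoft"]
for a, b in possible_duplicates(names):
    print(f"Possible duplicate: {a!r} / {b!r}")
```

This only flags candidates; deciding which spelling is the right one is still left to the user, as suggested above.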