+ 2

Finding values from one column in another column.

In "twitter.csv" I have a column "Names" containing user names (like @Sandman4, @Holly, @Taylor, etc.) and a column "Posts" containing the posts of each user (posts look like: best night ever @Taylor). How can I count how many times the values from "Names" were mentioned in the "Posts" of other users? For example: how many times was the user named @Taylor mentioned in the posts of other users? Note that each name appears in the column "Names" only once, so there's no need to groupby(). I thought there should be a way to loop over the rows, check whether a value from "Names" occurs in the column "Posts", and do count += 1 each time, but I have no idea how.

2nd Feb 2020, 2:41 PM
LeiaR
4 Answers
+ 3
This may not be the complete solution, but it's a start. I have trouble with pandas' weird slicing syntax, so what I found helpful was to set the names as the index. Since you say each name appears in only one row, that should be OK. https://code.sololearn.com/c72xiD88NGpI/?ref=app
2nd Feb 2020, 7:10 PM
Tibor Santa
+ 1
How many records do you have in that dataset? It also wasn't clear whether each Posts value is a single string, and whether a name that appears several times in one record should be counted every time or only once. Another idea is to loop through the Posts, pull out the names, and store them in a Counter object (from the collections library). I am not sure how you run this code or what is causing the inefficiency, but you seem to be running out of memory. Using generator expressions might solve that.
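Here is a rough sketch of the Counter idea, not a tested solution for your file. The small DataFrame is made-up sample data standing in for twitter.csv, `count_mentions` and the `@\w+` pattern are my own names and assumptions, and I assume each row holds one user's posts as a single string where a name is counted at most once per row.

```python
import re
from collections import Counter

import pandas as pd

# Made-up sample data standing in for twitter.csv (assumption).
df = pd.DataFrame({
    "Names": ["@Sandman4", "@Holly", "@Taylor"],
    "Posts": [
        "best night ever @Taylor",
        "thanks @Taylor and @Sandman4",
        "note to self @Taylor",
    ],
})

# Assumed mention format: "@" followed by letters, digits or underscores.
MENTION = re.compile(r"@\w+")


def count_mentions(names, posts):
    """Count, per known name, the rows of other users that mention it."""
    known = set(names)
    counts = Counter()
    for author, post in zip(names, posts):
        # A set means a name repeated within one row is counted once.
        mentioned = set(MENTION.findall(str(post)))
        # Ignore users mentioning themselves.
        mentioned.discard(author)
        # Only keep mentions of names that exist in the Names column.
        counts.update(mentioned & known)
    return counts


counts = count_mentions(df["Names"], df["Posts"])
print(counts["@Taylor"])
```

This makes one pass over the posts instead of comparing every name against every post, which may matter if the dataset is large. A Counter returns 0 for names that were never mentioned.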
3rd Feb 2020, 5:15 PM
Tibor Santa
0
The code seems OK, but it takes an eternity to run, so I can't say for sure whether it works for my data. Any idea what I should change? Update: it led to a fatal error on my PC...
3rd Feb 2020, 6:44 AM
LeiaR
0
I've found the solution: converting the data from the columns to lists with to_list().
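For anyone reading later, a minimal sketch of what a list-based approach could look like. This is an illustration on made-up sample data, not necessarily the exact code used here, and the variable names are hypothetical.

```python
import pandas as pd

# Made-up sample data standing in for twitter.csv (assumption).
df = pd.DataFrame({
    "Names": ["@Sandman4", "@Holly", "@Taylor"],
    "Posts": [
        "best night ever @Taylor",
        "thanks @Taylor and @Sandman4",
        "note to self @Taylor",
    ],
})

# Plain Python lists avoid repeated pandas indexing inside the loops.
names = df["Names"].to_list()
posts = df["Posts"].astype(str).to_list()

mention_counts = {}
for i, name in enumerate(names):
    count = 0
    for j, post in enumerate(posts):
        # Skip the user's own row; count rows of other users containing the name.
        if i != j and name in post:
            count += 1
    mention_counts[name] = count

print(mention_counts)
```

Note that a plain substring check like `name in post` would also match @Taylor inside a longer name such as @Taylor2, so it is only safe if no name is a prefix of another.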
4th Feb 2020, 5:34 PM
LeiaR