+ 4
[Pandas]
I now have two DataFrames. The first contains my primary data, which includes a cluster, a currency and a quantity (and other columns). The second holds data extracted from the first: for each cluster, it has one row per currency, along with the maximum quantity that was found for that cluster+currency combination. Now I need to filter the first df to derive a df that only includes those rows that had the maximum quantity for each cluster and currency. How would I do that?
1 Answer
+ 3
Merge the two DataFrames on cluster and currency so that the max quantity sits next to your main data. Filtering is then easy: keep the rows where the two columns are equal.
Merge is similar to an SQL join.
https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.merge.html
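A minimal sketch of the merge-then-filter approach, in case it helps. The column names (`cluster`, `currency`, `quantity`, `max_quantity`, `other`) and the sample values are assumptions made up for illustration, since the question doesn't show the real frames. The second frame is rebuilt here with `groupby` only so the example is self-contained; if you already have it, skip that step.

```python
import pandas as pd

# Hypothetical primary data; names and values are illustrative only.
df = pd.DataFrame({
    "cluster":  ["A", "A", "A", "B", "B"],
    "currency": ["USD", "USD", "EUR", "USD", "USD"],
    "quantity": [10, 25, 7, 3, 3],
    "other":    ["r0", "r1", "r2", "r3", "r4"],
})

# Stand-in for the second frame: one row per cluster + currency
# holding the maximum quantity (column name is an assumption).
max_df = (
    df.groupby(["cluster", "currency"], as_index=False)["quantity"]
      .max()
      .rename(columns={"quantity": "max_quantity"})
)

# Left merge, like an SQL LEFT JOIN on (cluster, currency),
# so every primary row gets its group's maximum attached.
merged = df.merge(max_df, on=["cluster", "currency"], how="left")

# Keep only rows whose quantity equals the group maximum,
# then drop the helper column again.
result = merged[merged["quantity"] == merged["max_quantity"]].drop(columns="max_quantity")
print(result)
```

Note that if several rows tie for the maximum within a cluster+currency group, this keeps all of them. If you need exactly one row per group, you would have to add a tie-breaking step of your own.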