0

Python - how to split data into multiple data sets using the unique values of two columns

I would appreciate help with the following question. My data looks like this:

SRC   Code   Var1   Var2
------------------------
AA    P01
BBB   B02
AAA   P01
CCC   C01
BBB   B01
DDD   D01
AAA   P02
BBB   B02
CCC   C01
...

I need to split the above data into multiple data frames based on SRC and Code. My problem is that the actual data has around 9,500 unique combinations of SRC and Code. How can I create multiple data frames named SRC_Code (split by those variables)? The code should dynamically split the data into multiple data frames and thereby create a NumPy list with all the data frame names.

25th Nov 2017, 4:17 PM
bhavani shankar
3 Answers
+ 6
If I understand correctly, you want separate DataFrames, each holding only one SRC + Code combination? If so, for such a task I'd advise using pandas groupby(['SRC', 'Code']) to get a GroupBy object. Setting the index (a MultiIndex, actually) to be based on those two columns should get you started, as you should obtain all possible (or rather, all existing in the object) combinations of SRC + Code. You can then iterate through the MultiIndex to assign each sub-DataFrame (selected by that particular MultiIndex element) to a separate DataFrame.
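A minimal sketch of the groupby idea, assuming the CSV has columns named SRC, Code, Var1 and Var2. The inline sample and its Var1/Var2 values are made up for illustration, and the names `frames` and `names` are hypothetical. Instead of creating thousands of separately named variables, this keeps each sub-DataFrame in a dict keyed by "SRC_Code":

```python
import io

import pandas as pd

# Hypothetical sample standing in for the real CSV file;
# the Var1/Var2 values are invented for illustration only.
csv_text = """SRC,Code,Var1,Var2
AA,P01,1,10
BBB,B02,2,20
AAA,P01,3,30
CCC,C01,4,40
BBB,B01,5,50
DDD,D01,6,60
AAA,P02,7,70
BBB,B02,8,80
CCC,C01,9,90
"""
df = pd.read_csv(io.StringIO(csv_text))

# Iterating over a GroupBy yields (key, sub-DataFrame) pairs;
# with two grouping columns the key is a (SRC, Code) tuple.
frames = {
    f"{src}_{code}": group.reset_index(drop=True)
    for (src, code), group in df.groupby(["SRC", "Code"])
}

# All generated data set names, collected in a plain Python list.
names = list(frames)
```

Each data set is then available as, for example, frames["BBB_B02"]. With a real file, pd.read_csv would take the file path instead of the StringIO object, and the same comprehension should work for any number of combinations.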
27th Nov 2017, 8:05 AM
Kuba Siekierzyński
+ 5
You can employ both numpy and pandas to do the job. What format is the input data in?
25th Nov 2017, 8:34 PM
Kuba Siekierzyński
0
The data is in CSV format. NumPy and pandas will work, but how do I create data frames for 9,500 unique combinations? I will need 9,500 data sets, each named SRC_Code, and then all of the data set names in a Python list. I would appreciate it if anyone could help with my question.
27th Nov 2017, 7:23 AM
bhavani shankar