0

How can I modify this code?

I have a large data set with several columns, separated by semicolons (;) and sorted by ID from low to high. I want each row to be saved as a separate CSV file named after its ID. I wrote part of the code in Jupyter Notebook, but when the ID changes, the program keeps saving files under the previous ID. Currently, all rows are stored as 900440_1, 900440_2 and so on. I want them saved like this: 900440_1, 900440_2, ..., 900660_1, ...

The data is as follows:

ID      Name  Date        Time      Head
900440  A     01.04.2013  00:00:00  250.98
900440  B     01.04.2013  01:00:00  250.98
900440  C     01.04.2013  02:00:00  250.99
.
.
.
900660  D     01.04.2013  03:00:00  250.99
900660  E     01.04.2013  04:00:00  240.99
900660  F     01.04.2013  05:00:00  260.99
.
.
.
900770  I     01.04.2013  04:00:00  250.99
900770  J     01.04.2013  05:00:00  251.00
900770  K     01.04.2013  06:00:00  251.00
.
.
.

My code:

    i = 0
    for row in range(len(data)):
        tmp = data.iloc[[row]]
        tmp.to_csv(r'C:\Users\IMAN\data\900440_%s.csv' % str(row))

15th Oct 2020, 9:11 AM
Iman Sharifpour
6 Answers
+ 10
I don't have a final solution for you, but I have something similar. The code builds the file name and the running number. In it I used a list with a few lines from your data. Try to keep using pandas as you started, and work my code into your program. Hope this helps. When you run the code, it creates the output files in the same directory where the Python file is stored. https://code.sololearn.com/caZ5ARGC5EZW/?ref=app
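A minimal sketch of that idea, written only from the description in this post (it is not the code behind the link, which is not reproduced here). It assumes the lines are semicolon-separated and already sorted by ID, as stated in the question. The names `HEADER`, `LINES`, `numbered_names` and `write_files` are made up for illustration.

```python
import os

# Sample lines in the ';'-separated layout described in the question (assumed).
HEADER = "ID;Name;Date;Time;Head"
LINES = [
    "900440;A;01.04.2013;00:00:00;250.98",
    "900440;B;01.04.2013;01:00:00;250.98",
    "900660;D;01.04.2013;03:00:00;250.99",
    "900660;E;01.04.2013;04:00:00;240.99",
    "900770;I;01.04.2013;04:00:00;250.99",
]


def numbered_names(lines):
    """Yield (filename, line) pairs; the counter restarts whenever the ID changes."""
    previous_id = None
    count = 0
    for line in lines:
        current_id = line.split(";", 1)[0]  # the first field is the ID
        if current_id != previous_id:
            # New ID: remember it and start numbering again.
            previous_id = current_id
            count = 0
        count += 1
        yield f"{current_id}_{count}.csv", line


def write_files(lines, out_dir="."):
    """Write one small CSV (header plus one row) per input line."""
    written = []
    for name, line in numbered_names(lines):
        with open(os.path.join(out_dir, name), "w", encoding="utf-8") as f:
            f.write(HEADER + "\n" + line + "\n")
        written.append(name)
    return written
```

Calling `write_files(LINES)` would create 900440_1.csv, 900440_2.csv, 900660_1.csv, 900660_2.csv and 900770_1.csv in the current directory. Because the counter only resets when the ID differs from the previous line, this relies on the data being sorted by ID.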
18th Oct 2020, 5:20 PM
Lothar
+ 9
Iman Sharif Pour, are the IDs already in sorted order, or do you have to sort / separate them before you can save them? How many lines of data does your CSV file have?
15th Oct 2020, 10:58 AM
Lothar
+ 8
Iman Sharif Pour, I am not quite sure what you want to do. So if the same ID occurs e.g. 2000 times, you want to create 2000 separate files? That would mean that with 2 million rows in total you create the same number of files? That sounds a bit strange. But OK, let's see. I assume that you are using a pandas DataFrame to read and write the CSV files. As I don't know the complete code, would I have to write it myself?
15th Oct 2020, 1:40 PM
Lothar
+ 2
Maybe this can help you: https://www.kaggle.com/
15th Oct 2020, 9:56 AM
JaScript
+ 1
Dear Lothar, the IDs are different, but there are about two thousand rows for each ID before the ID changes. The number of rows is close to two million, and the number of columns is five. The data are sorted by ID from low to high and the columns are separated by semicolons (;). I just want to save each row, with its 5 columns, in a CSV file named after its ID. For example, the file 900440_1.csv contains:

ID      Name  Date        Time      Head
900440  A     01.04.2013  00:00:00  250.98

and the file 900440_2.csv contains:

ID      Name  Date        Time      Head
900440  B     01.04.2013  01:00:00  250.98
15th Oct 2020, 11:06 AM
Iman Sharifpour
+ 1
I need one file for each row (approximately two million files). For example, the first row with ID 900440, which has five columns of data, should be saved in one file. The second row with the same ID goes into another file, with an underscore and a number added, because two files cannot be saved under one name (the number runs from one up to the last row of that ID, i.e. 900440_2000). The next ID then starts the numbering again: 900660_1, 900660_2, up to 900660_2000. At the moment, the code I wrote does save each row to a file, but I want the first part of the file name, which should match the ID, to change once the first 2000 rows are used up. I would be so grateful if you could edit my code to solve this problem.

    import pandas as pd
    data = pd.read_csv('data.csv')
    i = 0
    for row in range(len(data)):
        tmp = data.iloc[[row]]
        tmp.to_csv(r'C:\Users\IMAN\data\900440_%s.csv' % str(row))
15th Oct 2020, 2:33 PM
Iman Sharifpour
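One possible way to adapt the pandas loop from the question, offered as a sketch rather than a tested solution for the real data: take the file name prefix from the row's own ID instead of the hard-coded 900440, and number the rows within each ID using `groupby(...).cumcount()`. The function name `save_rows_by_id`, the column name `ID` and the `sep=";"` setting are assumptions based on the description in this thread. Note that the posted `pd.read_csv('data.csv')` call has no `sep` argument, so it would split on commas.

```python
import os

import pandas as pd


def save_rows_by_id(data, out_dir, id_col="ID"):
    """Write each row to <out_dir>/<ID>_<n>.csv, with n restarting at 1 for every ID."""
    # cumcount() numbers the rows inside each ID group 0, 1, 2, ...; add 1 to start at 1.
    counters = data.groupby(id_col).cumcount() + 1
    paths = []
    for pos, (id_value, n) in enumerate(zip(data[id_col], counters)):
        path = os.path.join(out_dir, f"{id_value}_{n}.csv")
        # iloc[[pos]] keeps the row as a one-row DataFrame, so the header is written too.
        data.iloc[[pos]].to_csv(path, sep=";", index=False)
        paths.append(path)
    return paths


# Possible usage (paths taken from the question, separator assumed):
# data = pd.read_csv("data.csv", sep=";")
# save_rows_by_id(data, r"C:\Users\IMAN\data")
```

Unlike a counter that resets when the ID changes, `cumcount()` does not depend on the rows being sorted by ID. With about two million rows this still writes about two million files one at a time, so it may take a while to run.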