0

How to iterate from where I left off?

I have created a Jupyter notebook that contains a list. I want to iterate over the list, perform some operations on each element, and store the output in a file. If I close the notebook and reopen it later to add new elements to the list and process them, how do I continue from where I left off? I don't want to recompute the operations on the earlier elements.

12th Jun 2020, 1:15 PM
Aditya Rana
5 Answers
+ 2
I'd maintain a second file that stores the visited hyperlinks. Before processing a page, check whether its link is already in that file, and skip the page if it is. Append to the file only the websites that are not in it yet. By the way, Jupyter lets you use input() normally, so you can even add a user prompt asking what to do when you encounter an already scraped website: [I]gnore, [R]edo or [Q]uit :)
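A minimal sketch of the "second file" idea, assuming one link per line in the tracking file. The names load_visited, process_new, visited_path and output_path are made up for illustration, and process stands in for whatever scraping function you already have:

```python
from pathlib import Path


def load_visited(path):
    """Return the set of links already recorded in the tracking file."""
    p = Path(path)
    if not p.exists():
        return set()  # first run: nothing has been processed yet
    lines = p.read_text(encoding="utf-8").splitlines()
    return {line.strip() for line in lines if line.strip()}


def process_new(urls, visited_path, output_path, process):
    """Process only links not yet in the tracking file; return the ones handled."""
    visited = load_visited(visited_path)
    done = []
    # Append mode keeps the results and the log from earlier sessions.
    with open(output_path, "a", encoding="utf-8") as out, \
         open(visited_path, "a", encoding="utf-8") as log:
        for url in urls:
            if url in visited:
                continue  # handled in an earlier session, skip it
            out.write(process(url) + "\n")
            out.flush()
            log.write(url + "\n")  # record the link only after its output is written
            log.flush()
            visited.add(url)
            done.append(url)
    return done
```

Calling something like process_new(links, "visited.txt", "output.txt", scrape) after each reopening of the notebook would then touch only the newly added links. The file names and scrape are placeholders, not part of any library.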
12th Jun 2020, 7:29 PM
Kuba Siekierzyński
+ 1
If you are the only user of this notebook and don't need to protect it from accidental use, you can simply structure the notebook so that the initial computation is done in one cell. After you run it and get the result, wrap the cell's whole content in triple quotes so that it is ignored next time. Place the following pieces of code in the cells after it. Whenever you need to "reset", just uncomment this first cell.
12th Jun 2020, 2:28 PM
Kuba Siekierzyński
+ 1
Aditya Rana There might be, but please describe what specifically you want to achieve. If you just want to skip the operation, you can use an if statement that checks whether it has already been done. For example, create an empty file when the operation finishes, then check whether that file exists and skip the operation if it does.
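A rough sketch of that empty-file check, assuming the operation can be wrapped in a function. run_once, marker_path and task are hypothetical names used only for this example:

```python
from pathlib import Path


def run_once(marker_path, task):
    """Run task() only if the marker file does not exist yet."""
    marker = Path(marker_path)
    if marker.exists():
        return False  # the operation finished in an earlier session, skip it
    task()
    marker.touch()  # create the empty "done" file only after task() succeeded
    return True
```

Deleting the marker file by hand forces the operation to run again the next time the cell is executed.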
12th Jun 2020, 6:07 PM
Kuba Siekierzyński
0
Kuba Siekierzyński Is there no other way?
12th Jun 2020, 5:54 PM
Aditya Rana
0
Kuba Siekierzyński I want to fetch data by scraping multiple websites. I put the links to those websites in the list and then, after extracting, say, all the <p> tags, I write them to a file. If I restart from the beginning of the list, there will be repetitions, and that is what I'm trying to avoid.
12th Jun 2020, 7:08 PM
Aditya Rana