Saving the State of a Program to Allow It to Be Resumed


Put all of your "state" data in one place and use a pickle.

The pickle module implements a fundamental but powerful algorithm for serializing and de-serializing a Python object structure. “Pickling” is the process whereby a Python object hierarchy is converted into a byte stream, and “unpickling” is the inverse operation, whereby a byte stream is converted back into an object hierarchy. Pickling (and unpickling) is alternatively known as “serialization”, “marshalling”, or “flattening”; however, to avoid confusion, the terms used here are “pickling” and “unpickling”.
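As a minimal sketch of that round trip, the snippet below pickles a small state dictionary to bytes and unpickles it again. The dictionary contents are made up for illustration only.

```python
import pickle

# Hypothetical "state" data, all kept in one place.
state = {"step": 42, "pending": ["b.jpg", "c.jpg"], "scores": (0.5, 0.75)}

blob = pickle.dumps(state)      # pickling: object hierarchy -> byte stream
restored = pickle.loads(blob)   # unpickling: byte stream -> object hierarchy

print(type(blob).__name__)      # bytes
print(restored == state)        # True
```

To keep the state between runs, the same bytes can be written to a file with pickle.dump and read back with pickle.load. Only unpickle data you trust, since loading a pickle can execute arbitrary code.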

Save the current state of program and resume again from last saved point

There may be several solutions to this problem, but this is the first one that comes to mind, and it should help you solve it.

Approach :

The problem is clear: the script starts downloading from the beginning every time because it cannot remember the index it had reached on the previous run.

To solve this, we create a text file containing an integer, initially 0, that records the index up to which files have been downloaded. When the script runs, it reads the integer from the text file (like recalling its position). The value in the text file is incremented by 1 each time a file is downloaded successfully.

Code

An example for understanding:

Please note: I manually created a text file with '0' in it beforehand.

# Open the text file and read the position to start from.
# Initially it's 0; later it will be updated.
with open('counter.txt', "r") as counter:
    start = int(counter.read().strip())
print("-->  ", start)

for x in range(start, 1000):
    print("Processing Done upto : ", x)

    # After every iteration, write the next position to the file,
    # so that a restart does not repeat item x.
    with open('counter.txt', "w") as writer:
        writer.write(str(x + 1))
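The counter example depends on a manually created file. One possible variation, sketched below, treats a missing file as position 0 and writes each update to a temporary file before renaming it, so an interruption during the write is less likely to leave a half-written counter. The helper names read_counter, write_counter and run are hypothetical, not part of the original answer.

```python
import os

def read_counter(path):
    """Return the saved position, or 0 if the file does not exist yet."""
    try:
        with open(path) as f:
            return int(f.read().strip() or 0)
    except FileNotFoundError:
        return 0

def write_counter(path, value):
    """Write to a temporary file, then rename it over the real one."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        f.write(str(value))
    os.replace(tmp, path)

def run(path, total, work):
    """Call work(i) for each unfinished i, saving the next index after each."""
    for i in range(read_counter(path), total):
        work(i)
        write_counter(path, i + 1)
```

Calling run('counter.txt', 1000, print) twice in a row does the work only once: the second call reads 1000 from the file and its loop body never executes.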

Fixing your code :

Note: Create a text file manually with the name 'counter.txt' and write '0' in it.

import pandas as pd
import requests as rq
import os,time,random,pickle
import csv
data=pd.read_csv("consensus_data.csv",usecols=["CaptureEventID","Species"])

z=data.loc[ data.Species.isin(['buffalo']), :]

df1=pd.DataFrame(z)

data_2=pd.read_csv("all_images.csv")

df2=pd.DataFrame(data_2)

df3=pd.merge(df1,df2,on='CaptureEventID')

p=df3.to_csv('animal_img_list.csv',index=False)

# you need to change the location below
data_final = pd.read_csv("animal_img_list.csv")
output=("/home/avnika/data_serengeti/url_op")

mylist = []

for i in range(0,100):
    x = random.randint(1,10)
    mylist.append(x)

print(mylist)

for y in range(len(mylist)):
    d=mylist[y]
    print(d)

# Opening the file you manually created with '0' in it.
with open('counter.txt', "r") as counter:
    # read() returns a string; convert it so it can be used for slicing and counting
    start = int(counter.read().strip())
count = start

file_name = data_final.URL_Info
print(len(file_name))

# The starting position from the file is used to slice the file_name from 'start' value.
for file in file_name[start:]:
    image_url='https://snapshotserengeti.s3.msi.umn.edu/'+file
    f_name=os.path.split(image_url)[-1]
    print(f_name)
    r=rq.get(image_url)

    with open(output+"/"+f_name, 'wb') as f:
        f.write(r.content)

    # File is downloaded and now, it's time to update the counter in the text file with new position.
    count+=1
    writer = open('counter.txt',"w")
    writer.write(str(count))
    writer.close()

    time.sleep(d)

Hope this helps :)

How can I save the state of running python programs to resume later?

If everything regarding the algorithm state is saved in a class, you can serialize an instance of the class and save it to disk: http://docs.python.org/2/library/pickle.html
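A rough sketch of that idea follows. The SearchState class and the save/load helpers are hypothetical; the only real API used is pickle.dump and pickle.load.

```python
import pickle

class SearchState:
    """Hypothetical container for everything the algorithm needs to resume."""

    def __init__(self):
        self.iteration = 0
        self.best = None

    def step(self, value):
        # One unit of work: count it and remember the largest value seen.
        self.iteration += 1
        if self.best is None or value > self.best:
            self.best = value

def save(state, path):
    with open(path, "wb") as f:
        pickle.dump(state, f)

def load(path):
    with open(path, "rb") as f:
        return pickle.load(f)
```

A run would call save(state, path) at convenient points and start with load(path) when the file exists. Note that the class definition has to be importable under the same name when the file is loaded, and that pickle files from untrusted sources should not be loaded.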

How to EXACTLY save the state of a program in python?

OK, so I found a solution for my specific situation. As I noted in an edit to my question, I was able to make a saving mechanism that works. For the loading mechanism I just had to load that saved file (as suggested by @acw1668) and assign it to the variable dictionary; this replaces the original list built from the .txt file with the altered list from the pickled file.

The entire code:

from random import randint
from tkinter import *
import pickle

dictionary=[]
with open('Dictionary.txt', 'r') as words:
    dictionary = [word.split('\n')[0] for word in words]

root= Tk()
root.title('')
root.geometry('300x500')

def on_close():
    with open(r'C:\Users\Tos\Documents\t.pickle', 'wb') as t_pickle:
        pickle.dump(dictionary, t_pickle)
    root.destroy()

def yes_():
    n = randint(0, len(dictionary) - 1)  # randint includes both endpoints
    global text_label
    text_label.grid_forget()
    text_label = Label(root, text = dictionary[n])
    text_label.config(font=('Arial', 25))
    text_label.grid(row=0, padx=100, pady=100)
    dictionary.remove(dictionary[n])

def no_():
    global text_label
    text_label.grid_forget()
    text_label = Label(root, text='test click')
    text_label.config(font=('Arial', 25))
    text_label.grid(row=0, padx=100, pady=100)

try:
    with open(r'C:\Users\Tos\Documents\t.pickle', 'rb') as t_pickle:
        dictionary = pickle.load(t_pickle)
except FileNotFoundError:
    pass

text_label = Label(root, text='Hello')
text_label.config(font=('Arial', 30))
text_label.grid(row=0, padx=100, pady=100)

question = Label(root, text='Do you know this word?', anchor='center')
question.config(font=('Arial', 12))
question.grid(row=1, pady=20)

by = Button(root, text='Yes', command=yes_)
by.config(font=('Arial', 15))
by.grid(row=2, ipadx=20)

bn = Button(root, text='No', command=no_)
bn.config(font=('Arial', 15))
bn.grid(row=3, pady=10, ipadx=26)

root.protocol('WM_DELETE_WINDOW', on_close)
root.mainloop()

If you are wondering why I used a try..except block: the first time the user opens the program there won't be any save file, so the FileNotFoundError is simply passed over. Once the program has been closed, the file has been created on exit, so progress can be continued from it on later runs.

Finally, I would like to say that although I got what I wanted, I feel that this is a workaround. I was expecting Python to have some function that implements save/load functionality. Maybe it does and I don't know about it; I'm no expert at Python. It's possible I worded the question incorrectly. I'm still interested in knowing how to pause/save the state of a running program; I found many Stack Overflow users asking the same thing and not getting the answer they expected. If anyone knows of save/load/pause functionality for running programs in Python, please link it.

How to save a program's progress, and resume later?

It's going to be different for every program. For something as simple as, say, a brute force password cracker, all that would really need to be saved is the last password tried. For other apps you may need to store several data points, but that's really all there is to it: saving and loading the minimum amount of information needed to reconstruct where you were.
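As an illustration of saving only the minimum, here is a sketch of a toy brute-force search that checkpoints a single number: the index of the next candidate to try. The function names, the JSON file layout and the "budget" parameter are all assumptions made for this example.

```python
import itertools
import json
import string

def candidates(length, alphabet=string.ascii_lowercase):
    """Yield every string of the given length in a fixed, repeatable order."""
    for combo in itertools.product(alphabet, repeat=length):
        yield "".join(combo)

def load_position(path):
    """Return the saved index, or 0 when no checkpoint exists yet."""
    try:
        with open(path) as f:
            return json.load(f)["next_index"]
    except FileNotFoundError:
        return 0

def save_position(path, next_index):
    with open(path, "w") as f:
        json.dump({"next_index": next_index}, f)

def search(target, length, path, budget):
    """Try up to `budget` candidates from the saved position, then checkpoint."""
    start = load_position(path)
    for guess in itertools.islice(candidates(length), start, start + budget):
        if guess == target:
            return guess
    save_position(path, start + budget)
    return None
```

Because the candidates are always generated in the same order, the index alone is enough to reconstruct where the search was; each call to search continues from the previous call's checkpoint.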

Another common technique is to save an image of the entire program state. If you've ever played with a game console emulator with the ability to save state, this is how they do it. Pickling in Python is related, although it serializes the objects you hand it rather than the whole process memory. If the environment is stable enough (i.e. no varying pointers), you simply copy the entire app's memory state into a binary file. When you want to resume, you copy it back into memory and begin running again. This gives you near-perfect state recovery, but whether or not it's possible at all is highly environment/language dependent. (For example, most C++ apps couldn't do this without help from the OS unless they were built VERY carefully with this in mind.)

hibernate-like saving state of a program

What you most likely want is what we call serialization or object marshalling. There are a whole butt load of academic problems with data/object serialization that you can easily google.

That being said, given the right library (probably very native), you could take a true snapshot of your running program, similar to what an OS-specific "hibernate" does. Here is an SO answer for doing that on Linux: https://stackoverflow.com/a/12190830/318174

To do the above snapshotting, though, you will most likely need a process external to the one you want to save. I highly recommend you don't do that. Instead, look up how to do serialization or object marshalling in your language of choice (btw welcome to SO, don't tag every language... that pisses people off). Hint: most people these days pick JSON.
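For Python, a small sketch of the JSON route might look like this. The Progress class and its fields are invented for the example; json.dumps, json.loads and dataclasses.asdict are standard library calls.

```python
import json
from dataclasses import asdict, dataclass, field

@dataclass
class Progress:
    """Hypothetical state of a long-running job."""
    next_index: int = 0
    failed: list = field(default_factory=list)
    label: str = ""

def to_json(progress):
    # asdict turns the dataclass into plain dicts and lists that JSON accepts.
    return json.dumps(asdict(progress))

def from_json(text):
    # Rebuild the object from the decoded dictionary.
    return Progress(**json.loads(text))
```

JSON only covers basic types (numbers, strings, lists, dictionaries, booleans and null), so anything else in the state has to be converted by hand. In exchange, the saved file is readable text that other languages can load too.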

Resuming a recursive function from a saved state

Storing a recursion is not so easy, because to resume the operation you will need to restore the stack, which is not a trivial task. I would go with an iterative algorithm, which is not as elegant as a recursion. But it pays off if interrupting/resuming calculation is needed.

An idea could be:

a subset is represented as a vector of 0s and 1s. 0 means the element is not taken, 1 - the element is taken so [1, 0, 1] for the set {1,2,3} means the subset {1,3}. Clearly only vectors of length N are real subsets.
see this vector as a kind of stack, it represents the state of your "recursion"
the value -1 in this vector is used to trigger the right behavior in the iteration -> similar to returning/backtracking from a recursion.

As algorithm (first, for iterating through all subsets):

def calc_subsets(state, N):  # N - number of elements in the original set
    while True:  # just iterate
        # storeFlag and store() are yours to define: set the flag to store and interrupt
        if storeFlag:
            store(state)
            return
        if len(state) == N and state[-1] != -1:  # a full subset is reached
            evaluate(state)
            state.append(-1)  # mark for unwind

        if state[-1] == -1:  # means unwind state
            state.pop()
            if not state:  # state is empty
                return  # unwound the last element, we are done
            if state[-1] == 1:  # there is nothing more to be explored
                state[-1] = -1  # mark for unwind in the next iteration
            else:  # 0 is explored, so 1 is the next to explore
                state[-1] = 1
        else:  # means explore
            state.append(0)  # 0 is the first to explore

evaluate is up to you, I just print out the vector:

def evaluate(state):
    print(state)

To print all subsets of 3 elements one should call:

calc_subsets([0], 3)
>>>
[0, 0, 0]
[0, 0, 1]
[0, 1, 0]
[0, 1, 1]
[1, 0, 0]
[1, 0, 1]
[1, 1, 0]
[1, 1, 1]

and to print only the second part:

calc_subsets([0,1,1,-1], 3) 
>>>
[1, 0, 0]
[1, 0, 1]
[1, 1, 0]
[1, 1, 1]

Now, the algorithm can be adapted to iterate only through all subsets with a given cardinality. For that, one must keep track of the number of elements in the current subset and trigger unwinding (by pushing -1 onto the state vector) once the requested size of the subset is reached.
