Python Progress Bar

There are dedicated libraries for this, but maybe something very simple would do:

import time
import sys

toolbar_width = 40

# set up the toolbar
sys.stdout.write("[%s]" % (" " * toolbar_width))
sys.stdout.flush()
sys.stdout.write("\b" * (toolbar_width + 1))  # return to start of line, after '['

for i in range(toolbar_width):  # use xrange on Python 2
    time.sleep(0.1)  # do real work here
    # update the bar
    sys.stdout.write("-")
    sys.stdout.flush()

sys.stdout.write("]\n")  # this ends the progress bar
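If the number of work items differs from the bar width, the same idea can be wrapped in a small generator. This is only a sketch: the name toolbar and its width and stream parameters are made up for illustration and are not part of the answer above.

```python
import io
import sys


def toolbar(items, width=40, stream=sys.stdout):
    """Yield each item while drawing a fixed-width '-' toolbar on stream."""
    items = list(items)  # a length is needed to scale the bar
    total = len(items)
    stream.write("[%s]" % (" " * width))
    stream.write("\b" * (width + 1))  # move back to just after '['
    stream.flush()
    drawn = 0
    for index, item in enumerate(items, start=1):
        yield item
        target = width * index // total  # dashes that should be visible now
        stream.write("-" * (target - drawn))
        stream.flush()
        drawn = target
    stream.write("-" * (width - drawn))  # fills the bar when items is empty
    stream.write("]\n")


if __name__ == "__main__":
    # Draw into a buffer so the raw characters can be inspected.
    buffer = io.StringIO()
    for _ in toolbar(range(7), width=10, stream=buffer):
        pass  # do real work here
    print(repr(buffer.getvalue()))
```

Passing a stream makes the function easy to check without a terminal; in normal use you would leave it at the default and loop over toolbar(my_items).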

Note: progressbar2 is a fork of progressbar, which hasn't been maintained in years.

Text progress bar in terminal with block characters

Python 3

A Simple, Customizable Progress Bar

Here's an aggregate of many of the answers below that I use regularly (no imports required).

Note: All code in this answer was created for Python 3; see end of answer to use this code with Python 2.

# Print iterations progress
def printProgressBar (iteration, total, prefix = '', suffix = '', decimals = 1, length = 100, fill = '█', printEnd = "\r"):
    """
    Call in a loop to create terminal progress bar
    @params:
        iteration   - Required  : current iteration (Int)
        total       - Required  : total iterations (Int)
        prefix      - Optional  : prefix string (Str)
        suffix      - Optional  : suffix string (Str)
        decimals    - Optional  : positive number of decimals in percent complete (Int)
        length      - Optional  : character length of bar (Int)
        fill        - Optional  : bar fill character (Str)
        printEnd    - Optional  : end character (e.g. "\r", "\r\n") (Str)
    """
    percent = ("{0:." + str(decimals) + "f}").format(100 * (iteration / float(total)))
    filledLength = int(length * iteration // total)
    bar = fill * filledLength + '-' * (length - filledLength)
    print(f'\r{prefix} |{bar}| {percent}% {suffix}', end = printEnd)
    # Print New Line on Complete
    if iteration == total:
        print()

Sample Usage

import time

# A List of Items
items = list(range(0, 57))
l = len(items)

# Initial call to print 0% progress
printProgressBar(0, l, prefix = 'Progress:', suffix = 'Complete', length = 50)
for i, item in enumerate(items):
    # Do stuff...
    time.sleep(0.1)
    # Update Progress Bar
    printProgressBar(i + 1, l, prefix = 'Progress:', suffix = 'Complete', length = 50)

Sample Output

Progress: |█████████████████████████████████████████████-----| 90.0% Complete

Update

There was discussion in the comments regarding an option that allows the progress bar to adjust dynamically to the terminal window width. While I don't recommend this, here's a gist that implements this feature (and notes the caveats).
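For reference, here is a rough sketch of how the bar length could be derived from the terminal width using only the standard library. It is not the gist mentioned above; the helper name bar_length_for_terminal and its columns and minimum parameters are assumptions made for this example. The overhead calculation mirrors the line format used by printProgressBar.

```python
import shutil


def bar_length_for_terminal(prefix="", suffix="", decimals=1, columns=None, minimum=10):
    """Return a bar length that keeps the whole progress line within the terminal."""
    if columns is None:
        # Falls back to 80 columns when the size cannot be determined.
        columns = shutil.get_terminal_size(fallback=(80, 24)).columns
    # Widest percent text that can appear, e.g. "100.0" for decimals=1.
    percent_width = len("{0:.{1}f}".format(100, decimals))
    # Fixed pieces of the line: " |" before the bar, "| " after it, "% " after the percent.
    separators = len(" |") + len("| ") + len("% ")
    # One spare column so a full line does not wrap on terminals that wrap at the edge.
    overhead = len(prefix) + len(suffix) + separators + percent_width + 1
    return max(minimum, columns - overhead)


if __name__ == "__main__":
    print(bar_length_for_terminal(prefix="Progress:", suffix="Complete"))
```

The result would be passed as length to printProgressBar. One caveat: the width is sampled once per call, so resizing the window mid-run can still leave leftover characters on the line.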

Single-Call Version of The Above

A comment below referenced a nice answer posted in response to a similar question. I liked the ease of use it demonstrated and wrote a similar one, but opted to leave out the import of the sys module while adding in some of the features of the original printProgressBar function above.

Some benefits of this approach over the original function above include the elimination of an initial call to the function to print the progress bar at 0% and the use of enumerate becoming optional (i.e. it is no longer explicitly required to make the function work).

def progressBar(iterable, prefix = '', suffix = '', decimals = 1, length = 100, fill = '█', printEnd = "\r"):
    """
    Call in a loop to create terminal progress bar
    @params:
        iterable    - Required  : iterable object (Iterable)
        prefix      - Optional  : prefix string (Str)
        suffix      - Optional  : suffix string (Str)
        decimals    - Optional  : positive number of decimals in percent complete (Int)
        length      - Optional  : character length of bar (Int)
        fill        - Optional  : bar fill character (Str)
        printEnd    - Optional  : end character (e.g. "\r", "\r\n") (Str)
    """
    total = len(iterable)
    # Progress Bar Printing Function
    def printProgressBar (iteration):
        percent = ("{0:." + str(decimals) + "f}").format(100 * (iteration / float(total)))
        filledLength = int(length * iteration // total)
        bar = fill * filledLength + '-' * (length - filledLength)
        print(f'\r{prefix} |{bar}| {percent}% {suffix}', end = printEnd)
    # Initial Call
    printProgressBar(0)
    # Update Progress Bar
    for i, item in enumerate(iterable):
        yield item
        printProgressBar(i + 1)
    # Print New Line on Complete
    print()

Sample Usage

import time

# A List of Items
items = list(range(0, 57))

# A Nicer, Single-Call Usage
for item in progressBar(items, prefix = 'Progress:', suffix = 'Complete', length = 50):
    # Do stuff...
    time.sleep(0.1)

Sample Output

Progress: |█████████████████████████████████████████████-----| 90.0% Complete

Python 2

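To use the above functions in Python 2, set the encoding to UTF-8 and enable the print function at the top of your script (without the __future__ import, print(..., end = ...) is a syntax error in Python 2):

# -*- coding: utf-8 -*-
from __future__ import print_function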

And replace the Python 3 string formatting in this line:

print(f'\r{prefix} |{bar}| {percent}% {suffix}', end = printEnd)

With Python 2 string formatting:

print('\r%s |%s| %s%% %s' % (prefix, bar, percent, suffix), end = printEnd)

Python: Progress bar in parse function?

You pretty much just need to break up your list comprehension. I'll use Enlighten here but you can accomplish the same thing with tqdm.

import enlighten
import pandas as pd

records: list = ...

manager = enlighten.get_manager()
pbar = manager.counter(total=len(records), desc='Parsing records', unit='records')

result = []
for item in records:
    result.append(parse_record(item))  # parse_record is the function from the question
    pbar.update()

df = pd.DataFrame(result)

If records is a generator rather than a sequence, it has no length, so you'll need to wrap it with list() or tuple() first to get one.
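A small helper can make that conversion only when it is actually needed. This is a sketch under the assumption that holding all records in memory is acceptable; materialize_if_unsized is a hypothetical name, not part of Enlighten or tqdm.

```python
def materialize_if_unsized(records):
    """Return (records, total), copying into a list only when len() is unavailable."""
    try:
        return records, len(records)
    except TypeError:
        # Generators and other one-shot iterators land here.
        materialized = list(records)
        return materialized, len(materialized)


if __name__ == "__main__":
    lazy = (n * n for n in range(5))  # a generator has no len()
    records, total = materialize_if_unsized(lazy)
    print(total, records)
```

The returned total is what you would pass as total= when creating the counter.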

Can't get progress bar to work in python rich

The problem was that with for i in track(range(1), description='Scraping'): the bar only jumped to 100% once the single loop iteration had finished. Raising the range() value would update the bar, but only by running the loop body that many times. To fix this, I used another part of rich, the Progress class.

Importing Progress and adapting the example from the Rich documentation gave me:

from rich.progress import Progress
import time

with Progress() as progress:

    task1 = progress.add_task("[red]Scraping", total=100)

    while not progress.finished:
        progress.update(task1, advance=0.5)
        time.sleep(0.5)

Essentially:

  • At task1 = progress.add_task("[red]Scraping", total=100) a bar is created with a maximum value of 100
  • The code indented under while not progress.finished: will loop until the bar is at 100%
  • At progress.update(task1, advance=0.5) the bar's completed amount is increased by 0.5 (the total stays at 100).

Therefore, for my specific example, my end result code was:

import json
import webbrowser

import requests
from rich.console import Console
from rich.progress import Progress
from rich.theme import Theme
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

# Note: pfp, target_id and userInput() are not defined in this excerpt.
theme = Theme({'success': 'bold green',
               'error': 'bold red', 'enter': 'bold blue'})
console = Console(theme=theme)
bartotal = 100

with Progress() as progress:
    task1 = progress.add_task("[magenta bold]Scraping...", total=bartotal)
    # 25 updates of 4 reach the total of 100, so this loop body runs once.
    while not progress.finished:
        console.print("\nDeclaring global variables", style='success')
        global pfp
        progress.update(task1, advance=4)
        global target_id
        progress.update(task1, advance=4)
        console.print("\nSetting up Chrome driver", style='success')
        chrome_options = Options()
        progress.update(task1, advance=4)
        chrome_options.add_argument("--headless")
        progress.update(task1, advance=4)
        driver = webdriver.Chrome(options=chrome_options)
        progress.update(task1, advance=4)
        console.print("\nCreating url for lookup.guru",
                      style='success')
        begining_of_url = "https://lookup.guru/"
        progress.update(task1, advance=4)
        whole_url = begining_of_url + str(target_id)
        progress.update(task1, advance=4)
        driver.get(whole_url)
        progress.update(task1, advance=4)
        console.print(
            "\nWaiting up to 10 seconds for lookup.guru to load", style='success')
        wait = WebDriverWait(driver, 10)
        progress.update(task1, advance=4)
        wait.until(EC.visibility_of_element_located(
            (By.XPATH, "//img")))
        progress.update(task1, advance=4)
        console.print("\nScraping images", style='success')
        # Selenium 4 style; find_elements_by_tag_name is gone from newer releases.
        images = driver.find_elements(By.TAG_NAME, 'img')
        progress.update(task1, advance=4)
        for image in images:
            pfp = (image.get_attribute('src'))
            break
        progress.update(task1, advance=4)
        if pfp == "a":
            console.print("User not found \n", style='error')
            userInput()
        progress.update(task1, advance=4)
        console.print(
            "\nDownloading image to current directory", style='success')
        img_data = requests.get(pfp).content
        progress.update(task1, advance=4)
        with open('pfpimage.png', 'wb') as handler:
            handler.write(img_data)
        progress.update(task1, advance=4)
        filePath = "pfpimage.png"
        progress.update(task1, advance=4)
        console.print("\nUploading to yandex.com", style='success')
        searchUrl = 'https://yandex.com/images/search'
        progress.update(task1, advance=4)
        files = {'upfile': ('blob', open(
            filePath, 'rb'), 'image/jpeg')}
        progress.update(task1, advance=4)
        params = {'rpt': 'imageview', 'format': 'json',
                  'request': '{"blocks":[{"block":"b-page_type_search-by-image__link"}]}'}
        progress.update(task1, advance=4)
        response = requests.post(searchUrl, params=params, files=files)
        progress.update(task1, advance=4)
        query_string = json.loads(response.content)[
            'blocks'][0]['params']['url']
        progress.update(task1, advance=4)
        img_search_url = searchUrl + '?' + query_string
        progress.update(task1, advance=4)
        console.print("\nOpening lookup.guru", style='success')
        webbrowser.open(whole_url)
        progress.update(task1, advance=4)
        console.print("\nOpening yandex images", style='success')
        webbrowser.open(img_search_url)
        progress.update(task1, advance=4)
        console.print("\nDone!", style='success')
        progress.update(task1, advance=4)
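Calling progress.update after every statement works, but it is repetitive. One possible alternative, sketched here without any dependency on rich, is to group the work into named steps and advance once per step. Both run_steps and the on_advance callback are hypothetical names invented for this example; the lambdas stand in for the real scraping steps.

```python
def run_steps(steps, on_advance, total=100):
    """Run (label, callable) steps in order, advancing by an equal share after each."""
    if not steps:
        return []
    share = total / len(steps)  # each step moves the bar by the same amount
    results = []
    for label, action in steps:
        results.append(action())
        on_advance(label, share)  # report progress only after the step succeeded
    return results


if __name__ == "__main__":
    log = []
    steps = [
        ("Setting up", lambda: "driver"),
        ("Scraping", lambda: ["img1", "img2"]),
        ("Uploading", lambda: 200),
        ("Done", lambda: None),
    ]
    run_steps(steps, lambda label, share: log.append((label, share)))
    print(log)
```

With rich, on_advance could be something like lambda label, share: progress.update(task1, advance=share, description=label), called inside the with Progress() block.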

Progress bar with multiprocessing

First, a few general comments concerning your code. In your main process you use a path to a file to open a zip archive just to retrieve the original file name, which does not make much sense. Then in count_files_7z you iterate the return value from zf.namelist() to build a list of the files within the archive, when zf.namelist() is already a list of those files. That does not make much sense either. You also use the context manager function closing to ensure that the archive is closed at the end of the block, but the ZipFile object is itself a context manager that serves the same purpose.

I tried installing alive-progress and the progress bars were a mess. This is a task better suited to multithreading than to multiprocessing. Actually, it is probably better suited to serial processing, since doing concurrent I/O operations on your disk, unless it is a solid state drive, is probably going to hurt performance. You will gain performance only if there is heavy CPU-intensive processing of the files you read. If that is the case, I have made a multiprocessing pool available to each thread, on which you can call apply with functions that contain the CPU-intensive code. But the progress bars should work better under multithreading than under multiprocessing. Even then I could not get any sort of decent display with alive-progress, which admittedly I did not spend too much time on. So I have switched to using the more common tqdm module, available from the PyPI repository.

Even with tqdm there is a problem: when a progress bar reaches 100%, tqdm must be writing something (a newline?) that relocates the other progress bars. Therefore, I have specified leave=False, which causes the bar to disappear when it reaches 100%. But at least you can see all the progress bars without distortion as they are progressing.

from multiprocessing.pool import Pool, ThreadPool
from threading import Lock
import tqdm
from zipfile import ZipFile
import os
import heapq
import time

def get_filepaths(directory):
    file_paths = []  # List which will store all of the full filepaths.
    # Walk the tree.
    for root, directories, files in os.walk(directory):
        for filename in files:
            # Join the two strings in order to form the full filepath.
            filepath = os.path.join(root, filename)
            file_paths.append(filepath)  # Add it to the list.
    return file_paths  # Self-explanatory.


def get_free_position():
    """ Return the minimum possible position """
    with lock:
        free_position = heapq.heappop(free_positions)
    return free_position

def return_free_position(position):
    with lock:
        heapq.heappush(free_positions, position)

def run_performance(zip_file):
    position = get_free_position()
    with ZipFile(zip_file) as zf:
        file_list = zf.namelist()
        with tqdm.tqdm(total=len(file_list), position=position, leave=False) as bar:
            for f in file_list:
                with zf.open(f) as myfile:
                    ...  # do things with myfile (perhaps myfile.read())
                    # for CPU-intensive tasks: result = pool.apply(some_function, args=(arg1, arg2, ... argn))
                    time.sleep(.005)  # simulate doing something
                bar.update()
    return_free_position(position)

def generate_zip_files():
    list_dir = ['path1', 'path2']
    for folder in list_dir:
        get_all_zips = get_filepaths(folder)
        for zip_file in get_all_zips:
            yield zip_file

# Required for Windows:
if __name__ == '__main__':
    N_THREADS = 5
    free_positions = list(range(N_THREADS))  # already a heap
    lock = Lock()
    pool = Pool()
    thread_pool = ThreadPool(N_THREADS)
    for result in thread_pool.imap_unordered(run_performance, generate_zip_files()):
        pass
    pool.close()
    pool.join()
    thread_pool.close()
    thread_pool.join()

The code above uses a multiprocessing thread pool arbitrarily limited in size to 5 just as a demo. You can increase or decrease N_THREADS to whatever value you want, but as I said, it may or may not help performance. If you want one thread per zip file then:

if __name__ == '__main__':
    zip_files = list(generate_zip_files())
    N_THREADS = len(zip_files)
    free_positions = list(range(N_THREADS))  # already a heap
    lock = Lock()
    pool = Pool()
    thread_pool = ThreadPool(N_THREADS)
    for result in thread_pool.imap_unordered(run_performance, zip_files):
        pass
    pool.close()
    pool.join()
    thread_pool.close()
    thread_pool.join()
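The position bookkeeping (get_free_position and return_free_position) can also be looked at in isolation. The sketch below wraps the same heap-plus-lock idea in a small class; PositionPool is a hypothetical name used only here, and the example deliberately leaves tqdm out so it runs with the standard library alone.

```python
import heapq
from threading import Lock


class PositionPool:
    """Hand out the lowest free bar row and take it back when a bar finishes."""

    def __init__(self, size):
        self._free = list(range(size))  # a sorted list already satisfies the heap invariant
        self._lock = Lock()

    def acquire(self):
        # Raises IndexError if more rows are requested than exist at once.
        with self._lock:
            return heapq.heappop(self._free)

    def release(self, position):
        with self._lock:
            heapq.heappush(self._free, position)


if __name__ == "__main__":
    rows = PositionPool(3)
    first, second = rows.acquire(), rows.acquire()
    rows.release(first)
    print(first, second, rows.acquire())
```

Because the heap always yields the smallest free row, a finished bar's row is reused before any higher row, which keeps the display compact. The pool size should be at least the number of worker threads, as with N_THREADS above.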

