When to Use and When Not to Use Python 3.5 'Await'

When to use and when not to use Python 3.5 `await` ?

By default, all your code is synchronous. You can make it asynchronous by defining functions with async def and "calling" these functions with await. A more accurate question would be "When should I write asynchronous code instead of synchronous code?" The answer is "when you can benefit from it". When you work with I/O operations, as you noted, you will usually benefit:

# Synchronous way:
download(url1) # takes 5 sec.
download(url2) # takes 5 sec.
# Total time: 10 sec.

# Asynchronous way:
await asyncio.gather(
    async_download(url1),  # takes 5 sec.
    async_download(url2)   # takes 5 sec.
)
# Total time: only 5 sec. (+ little overhead for using asyncio)
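
The comparison above can be tried with a small runnable sketch. Here async_download is a hypothetical stand-in that only waits with asyncio.sleep instead of doing real network I/O, and the 0.3 second delay is an arbitrary choice for illustration:

```python
import asyncio
import time

async def async_download(url, delay=0.3):
    # Hypothetical stand-in for a real download: waits without blocking the loop.
    await asyncio.sleep(delay)
    return f"content of {url}"

async def main():
    start = time.perf_counter()
    # Both "downloads" wait at the same time, so the total is about one delay.
    pages = await asyncio.gather(
        async_download("url1"),
        async_download("url2"),
    )
    elapsed = time.perf_counter() - start
    return pages, elapsed

pages, elapsed = asyncio.run(main())
print(pages, f"{elapsed:.2f} sec")
```

Running the two calls one after another would take roughly twice the delay; with gather the total stays close to a single delay.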

Of course, if you created a function that uses asynchronous code, this function should be asynchronous too (should be defined as async def). But any asynchronous function can freely use synchronous code. It makes no sense to cast synchronous code to asynchronous without some reason:

# extract_links(url) should be async because it uses async func async_download() inside
async def extract_links(url):

    # async_download() was created async to get benefit of I/O
    html = await async_download(url)

    # parse() doesn't work with I/O, there's no sense to make it async
    links = parse(html)

    return links

One very important thing is that any long synchronous operation (> 50 ms, for example, it's hard to say exactly) will freeze all your asynchronous operations for that time:

async def extract_links(url):
    data = await download(url)
    links = parse(data)
    # if search_in_very_big_file() takes much time to process,
    # all your running async funcs (somewhere else in code) will be frozen
    # you need to avoid this situation
    links_found = search_in_very_big_file(links)

You can avoid this by calling long-running synchronous functions in a separate process (and awaiting the result):

import asyncio
from concurrent.futures import ProcessPoolExecutor

executor = ProcessPoolExecutor(2)

async def extract_links(url):
    loop = asyncio.get_event_loop()
    data = await download(url)
    links = parse(data)
    # Now the main process can handle other async functions while the separate process is running
    links_found = await loop.run_in_executor(executor, search_in_very_big_file, links)

One more example: using requests inside asyncio. requests.get is just a synchronous, long-running function, which you shouldn't call directly inside async code (again, to avoid freezing). But it runs long because of I/O, not because of heavy calculations. In that case, you can use ThreadPoolExecutor instead of ProcessPoolExecutor to avoid some multiprocessing overhead:

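import asyncio
from concurrent.futures import ThreadPoolExecutor

import requests

executor = ThreadPoolExecutor(2)

async def download(url):
    loop = asyncio.get_event_loop()
    response = await loop.run_in_executor(executor, requests.get, url)
    return response.text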
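
To see that the event loop stays responsive while a blocking call runs in a thread, here is a minimal sketch. blocking_fetch is a hypothetical stand-in for a call like requests.get (it only sleeps), and heartbeat is an illustrative helper that can advance only if the loop is not frozen:

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor

def blocking_fetch(url):
    # Hypothetical stand-in for a blocking call such as requests.get(url).
    time.sleep(0.3)
    return f"response for {url}"

async def heartbeat(ticks):
    # Appends a tick every 0.1 sec; this only progresses if the loop is free.
    for _ in range(3):
        await asyncio.sleep(0.1)
        ticks.append(time.perf_counter())

async def main():
    loop = asyncio.get_running_loop()
    ticks = []
    start = time.perf_counter()
    with ThreadPoolExecutor(2) as executor:
        # The blocking call runs in a worker thread while heartbeat runs on the loop.
        result, _ = await asyncio.gather(
            loop.run_in_executor(executor, blocking_fetch, "url1"),
            heartbeat(ticks),
        )
    elapsed = time.perf_counter() - start
    return result, len(ticks), elapsed

result, tick_count, elapsed = asyncio.run(main())
print(result, tick_count, f"{elapsed:.2f} sec")
```

If blocking_fetch were called directly inside a coroutine instead, the heartbeat could not tick until it returned, and the total time would be closer to the sum of both durations.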

How to use async/await in Python 3.5?

Running coroutines requires an event loop. Use the asyncio library to create one:

import asyncio

# Python 3.7+
asyncio.run(foo())

or

# Python 3.6 and older
loop = asyncio.get_event_loop()
loop.run_until_complete(foo())

Also see the Tasks and Coroutines chapter of the asyncio documentation. If you already have a loop running, you'd want to run additional coroutines concurrently by creating a task (asyncio.create_task(...) in Python 3.7+, asyncio.ensure_future(...) in older versions).
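
As a small illustration of running an additional coroutine on an already running loop, the sketch below uses asyncio.create_task (Python 3.7+). The worker coroutine and its delays are hypothetical and exist only for this example:

```python
import asyncio

async def worker(name, delay, log):
    # Waits cooperatively, then records that it finished.
    await asyncio.sleep(delay)
    log.append(name)

async def main():
    log = []
    # create_task schedules the coroutine on the running loop right away.
    background = asyncio.create_task(worker("background", 0.1, log))
    # The current coroutine keeps going while the task runs concurrently.
    await worker("foreground", 0.05, log)
    await background  # wait for the task to finish
    return log

log = asyncio.run(main())
print(log)  # ['foreground', 'background']
```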

Note, however, that time.sleep() does not return an awaitable object. It returns None, so awaiting its result raises an exception after 1 second:

>>> asyncio.run(foo())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/.../lib/python3.7/asyncio/runners.py", line 43, in run
    return loop.run_until_complete(main)
  File "/.../lib/python3.7/asyncio/base_events.py", line 573, in run_until_complete
    return future.result()
  File "<stdin>", line 2, in foo
TypeError: object NoneType can't be used in 'await' expression

In this case you should use the asyncio.sleep() coroutine instead:

async def foo():
    await asyncio.sleep(1)

which cooperates with the loop to enable other tasks to run. For blocking code from third-party libraries that do not have asyncio equivalents, you could run that code in an executor pool. See Running Blocking Code in the asyncio development guide.

How to use async/await in python 3.5+

run_until_complete will block until async_foo is done. Instead, you need both coroutines to be executed in the loop. Use asyncio.gather to start more than one coroutine with a single run_until_complete call.

Here is an example:

import asyncio

async def async_foo():
    print("asyncFoo1")
    await asyncio.sleep(3)
    print("asyncFoo2")

async def async_bar():
    print("asyncBar1")
    await asyncio.sleep(1)
    print("asyncBar2")

loop = asyncio.get_event_loop()
loop.run_until_complete(asyncio.gather(async_foo(), async_bar()))
loop.close()

Simplest async/await example possible in Python

To answer your questions, I will provide 3 different solutions to the same problem.

Case 1: just normal Python

import time

def sleep():
    print(f'Time: {time.time() - start:.2f}')
    time.sleep(1)

def sum(name, numbers):
    total = 0
    for number in numbers:
        print(f'Task {name}: Computing {total}+{number}')
        sleep()
        total += number
    print(f'Task {name}: Sum = {total}\n')

start = time.time()
tasks = [
    sum("A", [1, 2]),
    sum("B", [1, 2, 3]),
]
end = time.time()
print(f'Time: {end-start:.2f} sec')

output:

Task A: Computing 0+1
Time: 0.00
Task A: Computing 1+2
Time: 1.00
Task A: Sum = 3

Task B: Computing 0+1
Time: 2.01
Task B: Computing 1+2
Time: 3.01
Task B: Computing 3+3
Time: 4.01
Task B: Sum = 6

Time: 5.02 sec

Case 2: async/await done wrong

import asyncio
import time

async def sleep():
    print(f'Time: {time.time() - start:.2f}')
    time.sleep(1)

async def sum(name, numbers):
    total = 0
    for number in numbers:
        print(f'Task {name}: Computing {total}+{number}')
        await sleep()
        total += number
    print(f'Task {name}: Sum = {total}\n')

start = time.time()

loop = asyncio.get_event_loop()
tasks = [
    loop.create_task(sum("A", [1, 2])),
    loop.create_task(sum("B", [1, 2, 3])),
]
loop.run_until_complete(asyncio.wait(tasks))
loop.close()

end = time.time()
print(f'Time: {end-start:.2f} sec')

output:

Task A: Computing 0+1
Time: 0.00
Task A: Computing 1+2
Time: 1.00
Task A: Sum = 3

Task B: Computing 0+1
Time: 2.01
Task B: Computing 1+2
Time: 3.01
Task B: Computing 3+3
Time: 4.01
Task B: Sum = 6

Time: 5.01 sec

Case 3: async/await done right

Same as case 2 except the sleep function:

async def sleep():
    print(f'Time: {time.time() - start:.2f}')
    await asyncio.sleep(1)

output:

Task A: Computing 0+1
Time: 0.00
Task B: Computing 0+1
Time: 0.00
Task A: Computing 1+2
Time: 1.00
Task B: Computing 1+2
Time: 1.00
Task A: Sum = 3

Task B: Computing 3+3
Time: 2.00
Task B: Sum = 6

Time: 3.01 sec

Case 1 and case 2 both take 5 seconds, whereas case 3 takes just 3 seconds. So async/await done right is faster.

The reason for the difference lies in the implementation of the sleep function.

# case 1
def sleep():
    ...
    time.sleep(1)

# case 2
async def sleep():
    ...
    time.sleep(1)

# case 3
async def sleep():
    ...
    await asyncio.sleep(1)

In case 1 and case 2, they are the "same":
they "sleep" without allowing others to use the resources.
Whereas in case 3, it allows access to the resources when it is asleep.

In case 2, we added async to the normal function. However the event loop will run it without interruption.
Why? Because we didn't say where the loop is allowed to interrupt your function to run another task.

In case 3, we told the event loop exactly where to interrupt the function to run another task. Where exactly? Right here!

await asyncio.sleep(1)

Read more on this here

Update 02/May/2020

Consider reading

  • A Hitchhikers Guide to Asynchronous Programming
  • Asyncio Futures and Coroutines

Is there ever a reason to `return await ...` in python asyncio?

Given:

async def foo() -> str:
    return 'bar'

What you get when calling foo is an Awaitable, which obviously you'd want to await. What you need to think about is the return value of your function. You can for example do this:

from typing import Awaitable

def bar() -> Awaitable[str]:
    return foo()  # foo as defined above

There, bar is a synchronous function but returns an Awaitable which results in a str.

async def bar() -> str:
    return await foo()

Above, bar itself is async and results in an Awaitable when called which results in a str, same as above. There's no real difference between these two usages. Differences appear here:

async def bar() -> Awaitable[str]:
    return foo()

In that example, calling bar results in an Awaitable which results in an Awaitable which results in a str; quite different. If you naïvely use the above, you'll get this kind of result:

>>> asyncio.run(bar())
<coroutine object foo at 0x108706290>
RuntimeWarning: coroutine 'foo' was never awaited

As a rule of thumb, every call to an async function must be awaited somewhere exactly once. If you have two async functions (async def foo and async def bar) but no await in bar, then the caller of bar must await twice, which would be odd.
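
The difference can be checked with a short sketch. The names bar_awaits and bar_forwards are hypothetical, chosen here only to tell the two variants of bar apart:

```python
import asyncio
import inspect

async def foo() -> str:
    return "bar"

async def bar_awaits() -> str:
    # Awaits foo() itself, so the caller awaits only once.
    return await foo()

async def bar_forwards():
    # Returns the un-awaited coroutine object created by foo().
    return foo()

async def main():
    direct = await bar_awaits()
    inner = await bar_forwards()
    was_coroutine = inspect.iscoroutine(inner)
    nested = await inner  # the second await the caller is forced to do
    return direct, was_coroutine, nested

outcome = asyncio.run(main())
print(outcome)  # ('bar', True, 'bar')
```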

Python 3.5 async/await with real code example

If a third-party library is not compatible with async/await then obviously you can't use it easily. There are two cases:

  1. Let's say that the function in the library is asynchronous and it gives you a callback, e.g.

    def fn(..., clb):
        ...

    So you can do:

    def on_result(...):
        ...

    fn(..., on_result)

    In that case you can wrap such functions into the asyncio protocol like this:

    from asyncio import Future

    def wrapper(...):
        future = Future()
        def my_clb(...):
            future.set_result(xyz)
        fn(..., my_clb)
        return future

    (use future.set_exception(exc) on exception)

    Then you can simply call that wrapper in some async function with await:

    value = await wrapper(...)

    Note that await works with any Future object. You don't have to declare wrapper as async.

  2. If the function in the library is synchronous then you can run it in a separate thread (probably you would use some thread pool for that). The whole code may look like this:

    import asyncio
    import time
    from concurrent.futures import ThreadPoolExecutor

    # Initialize 10 threads
    THREAD_POOL = ThreadPoolExecutor(10)

    def synchronous_handler(param1, ...):
        # Do something synchronous
        time.sleep(2)
        return "foo"

    # Somewhere else
    async def main():
        loop = asyncio.get_event_loop()
        futures = [
            loop.run_in_executor(THREAD_POOL, synchronous_handler, param1, ...),
            loop.run_in_executor(THREAD_POOL, synchronous_handler, param1, ...),
            loop.run_in_executor(THREAD_POOL, synchronous_handler, param1, ...),
        ]
        await asyncio.wait(futures)
        for future in futures:
            print(future.result())

    with THREAD_POOL:
        loop = asyncio.get_event_loop()
        loop.run_until_complete(main())

If you can't use threads for whatever reason, then using such a library makes the entire asynchronous code pointless.

Note, however, that using a synchronous library with async is probably a bad idea. You won't gain much, and you complicate the code a lot.
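
The callback-wrapping approach from case 1 can be sketched in runnable form. The function fn below is a hypothetical callback-style API invented for this example (it doubles its argument and reports the result later via the event loop); it does not correspond to any real library:

```python
import asyncio

def fn(value, clb):
    # Hypothetical callback-style function: reports its result by calling
    # clb a little later, scheduled on the running event loop.
    loop = asyncio.get_running_loop()
    loop.call_later(0.05, clb, value * 2)

def wrapper(value):
    # Not declared async: it just returns a Future that can be awaited.
    future = asyncio.get_running_loop().create_future()

    def my_clb(result):
        future.set_result(result)

    fn(value, my_clb)
    return future

async def main():
    return await wrapper(21)

value = asyncio.run(main())
print(value)  # 42
```

This sketch assumes the callback fires on the event loop's own thread. If a library invokes the callback from another thread, the result would have to be handed over with loop.call_soon_threadsafe instead of calling future.set_result directly.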

In Python 3.5, is the keyword await equivalent to yield from ?

No, they are not equivalent. await in an async function and yield from in a generator are very similar and share most of their implementation, but depending on your Python version, trying to use yield or yield from inside an async function will either cause an outright SyntaxError or make your function an asynchronous generator function.

When the asyncio docs say "await or yield from", they mean that async functions should use await and generator-based coroutines should use yield from.
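
A minimal sketch of the "asynchronous generator" case, assuming Python 3.6 or newer (asyncio.run in the last line needs 3.7+). The names plain and counter are hypothetical:

```python
import asyncio
import inspect

async def plain():
    # An async function with await only: a regular coroutine function.
    await asyncio.sleep(0)
    return 1

async def counter(n):
    # A yield inside async def turns this into an asynchronous generator function.
    for i in range(n):
        await asyncio.sleep(0)
        yield i

async def main():
    # Asynchronous generators are consumed with "async for", not with await.
    return [i async for i in counter(3)]

is_coroutine_function = inspect.iscoroutinefunction(plain)
is_async_generator = inspect.isasyncgenfunction(counter)
collected = asyncio.run(main())
print(is_coroutine_function, is_async_generator, collected)  # True True [0, 1, 2]
```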

Should I use coroutines for synchronous Code from asynchronous Code?

The main point of using async def / await is to explicitly mark the places in the code where a context switch can happen (where the execution flow can switch to another coroutine). The fewer such places there are, the easier concurrent code is to reason about. So don't make a function a coroutine unless you have a reason to.

What reasons are there to make a function a coroutine?

  • If the function needs to await some coroutine, it should be a coroutine too.
  • If the function needs to await some Future or other awaitable object, it should be defined as a coroutine.
  • If the function doesn't need to await anything, but it will be called inside another coroutine and takes a long time to run. In this case it makes sense to run the function in a thread asynchronously and await until the thread has finished.

You can read a bit more about it here (and in discussion under the answer).


