How many concurrent requests does a single Flask process receive?
When you run the development server, which is what you get by calling app.run(), you get a single synchronous process, which means at most 1 request is being processed at a time. (This describes Flask before 1.0; since Flask 1.0 the development server handles requests in threads by default.)
If you stick Gunicorn in front of it in its default configuration and simply increase the number of --workers, what you get is essentially a number of processes (managed by Gunicorn) that each behave like the app.run() development server. 4 workers == 4 concurrent requests. This is because Gunicorn uses its included sync worker type by default.
It is important to note that Gunicorn also includes asynchronous workers, namely eventlet and gevent (and also tornado, but that seems best used with the Tornado framework). By specifying one of these async workers with the --worker-class flag, what you get is Gunicorn managing a number of async processes, each of which manages its own concurrency. These processes use coroutines instead of threads. Basically, within each process, still only 1 thing can be happening at a time (1 thread), but a request handler can be 'paused' while it waits on an external operation to finish (think database queries or network I/O).
This means, if you're using one of Gunicorn's async workers, each worker can handle many more than a single request at a time. Just how many workers is best depends on the nature of your app, its environment, the hardware it runs on, etc. More details can be found on Gunicorn's design page and notes on how gevent works on its intro page.
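To make the "paused while waiting" idea concrete, here is a minimal sketch of cooperative coroutines in a single thread. It uses the standard library's asyncio as a stand-in for gevent (a different library with the same cooperative model), and handle_request and serve are hypothetical names, not part of Flask or Gunicorn.

```python
import asyncio
import threading
import time


async def handle_request(request_id, thread_ids):
    """Pretend request handler: records its thread, then waits on fake I/O."""
    thread_ids.add(threading.get_ident())
    # While this coroutine is paused here, the event loop runs the others.
    await asyncio.sleep(0.1)  # stand-in for a database query or network call
    return request_id


async def serve(n):
    """Run n handlers concurrently inside one thread and time the batch."""
    thread_ids = set()
    start = time.perf_counter()
    results = await asyncio.gather(
        *(handle_request(i, thread_ids) for i in range(n))
    )
    elapsed = time.perf_counter() - start
    return results, elapsed, thread_ids


if __name__ == "__main__":
    results, elapsed, thread_ids = asyncio.run(serve(50))
    print(f"{len(results)} requests, {len(thread_ids)} thread, {elapsed:.2f}s")
```

Handled one after another, 50 handlers that each wait 0.1 s would need about 5 s; because every handler yields while it waits, the whole batch should finish in roughly the time of a single wait.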
How to make flask handle 25k request per second like express.js
You can use multiple threads or gevent to increase Gunicorn's concurrency.
Option 1: multiple threads
For example:
gunicorn -w 4 --threads 100 -b 0.0.0.0:5000 your_project:app
--threads 100 means 100 threads per process. -w 4 means 4 processes, so -w 4 --threads 100 means up to 400 requests at a time.
Option 2: gevent worker
For example:
pip install gevent
gunicorn -w 4 -k gevent --worker-connections 1000 -b 0.0.0.0:5000 your_project:app
-k gevent --worker-connections 1000 means up to 1000 concurrent connections (coroutines) per gevent worker process. -w 4 means 4 processes, so -w 4 -k gevent --worker-connections 1000 means up to 4000 requests at a time.
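The same gevent setup can also live in a Gunicorn configuration file, which is itself Python. The sketch below assumes a file named gunicorn.conf.py next to the project and the same your_project:app module as above; the values simply mirror the command-line flags.

```python
# gunicorn.conf.py (assumed filename)
# Equivalent of: gunicorn -w 4 -k gevent --worker-connections 1000 -b 0.0.0.0:5000 your_project:app

bind = "0.0.0.0:5000"      # -b: address and port to listen on
workers = 4                # -w: number of worker processes
worker_class = "gevent"    # -k: requires `pip install gevent`
worker_connections = 1000  # --worker-connections: max simultaneous clients per worker

# Upper bound on requests in flight: workers * worker_connections = 4 * 1000 = 4000
```

It would then be started with gunicorn -c gunicorn.conf.py your_project:app.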
For more information, you can refer to my blog post: https://easydevguide.com/posts/gunicorn_concurrency
how does flask handle simultaneous requests?
Flask doesn't. Parallel request handling is the task of the underlying WSGI web server, which sends the requests to Flask for handling.
Flask's built-in development server, which is invoked with Flask.run(), runs with threads by default.
In production, you'd use one of the WSGI containers or other deployment options, and you control parallelism there. Gunicorn, for example, has the -w command line argument, which controls the number of worker processes, and -k, which controls how these workers work (processes, threads, or a Tornado event machine, among others).
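The point that concurrency belongs to the server rather than the application can be illustrated with the standard library alone. In this sketch a plain WSGI callable stands in for the Flask app (an assumption made so that nothing needs installing), and the same callable is served first by wsgiref's single-threaded server and then by a threaded variant. ThreadedWSGIServer, QuietHandler, fetch and time_requests are hypothetical names.

```python
import http.client
import threading
import time
from concurrent.futures import ThreadPoolExecutor
from socketserver import ThreadingMixIn
from wsgiref.simple_server import WSGIRequestHandler, WSGIServer, make_server


def app(environ, start_response):
    """Plain WSGI callable standing in for a Flask app; it knows nothing about concurrency."""
    time.sleep(0.2)  # simulate slow work such as a database query
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"ok"]


class ThreadedWSGIServer(ThreadingMixIn, WSGIServer):
    """Same server, but each request is handled in its own thread."""
    daemon_threads = True


class QuietHandler(WSGIRequestHandler):
    def log_message(self, format, *args):
        pass  # silence per-request logging


def fetch(port):
    """Issue one GET request and return the response body."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=10)
    conn.request("GET", "/")
    body = conn.getresponse().read()
    conn.close()
    return body


def time_requests(server_class, n=4):
    """Serve `app` with the given server class and time n simultaneous requests."""
    server = make_server("127.0.0.1", 0, app,
                         server_class=server_class, handler_class=QuietHandler)
    port = server.server_address[1]  # port 0 lets the OS pick a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n) as pool:
        bodies = list(pool.map(lambda _: fetch(port), range(n)))
    elapsed = time.perf_counter() - start
    server.shutdown()
    server.server_close()
    return elapsed, bodies


if __name__ == "__main__":
    print("single-threaded server:", round(time_requests(WSGIServer)[0], 2), "s")
    print("threaded server:", round(time_requests(ThreadedWSGIServer)[0], 2), "s")
```

The application code is identical in both runs; only the server class changes, and with it how many requests are handled at once.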
Flask App can't handle more than 6 requests with a single browser
The browser limits itself to 6 concurrent TCP connections per server.
How to solve this
Firefox can be configured from within about:config; filter on network.http for the various settings. network.http.max-persistent-connections-per-server is the one to change.
But since this is a browser limitation, I may use a different approach, such as running the same server on different ports.
How to process several HTTP requests with Flask
Let me clear up the confusion for you.
When you are using Flask while developing locally, you use the built-in server, which is single-threaded, meaning it will only process one request at a time. This is one of the reasons why you shouldn't simply set FLASK_ENV=production and run it in a production environment. The built-in server is not designed to run in those environments. Once you change FLASK_ENV to production and run it, you'll find a warning in the terminal.
Now, on to how to run Flask in a production environment: CPUs, cores, threads and other stuff
To run Flask in a production environment, you need a proper application server that can run your Flask application. This is where Gunicorn comes in: it is compatible with Flask and one of the most popular ways of running it.
Gunicorn gives you several settings for tuning it to the specs of your server:
- Worker class: the type of worker to use
- Number of workers
- Number of threads
The way you calculate the maximum number of concurrent requests is as follows:
Take a 4-core server as an example.
As per the Gunicorn documentation, the suggested number of workers is (2 * num_of_cores) + 1, which in this case becomes (2*4)+1 = 9.
The suggested number of threads is 2 to 4 x $(num_of_cores), which in this case comes out to at most 4*4 = 16.
So now you have 9 workers with 16 threads each. Each thread can handle one request at a time, so you can have 9*16 = 144 concurrent connections.
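The arithmetic can be wrapped in a few helper functions. This is only a sketch: the function names are hypothetical, and it applies the thread rule of thumb literally to the core count, taking the upper end (4 x cores) by default.

```python
def suggested_workers(cores):
    """Gunicorn's rule of thumb for worker processes: (2 x cores) + 1."""
    return 2 * cores + 1


def suggested_threads(cores, factor=4):
    """Threads per worker, with `factor` taken from the 2-4 x cores range."""
    if not 2 <= factor <= 4:
        raise ValueError("factor is expected to be between 2 and 4")
    return factor * cores


def max_concurrent_requests(workers, threads_per_worker):
    """Each thread handles one request at a time, so capacity is the product."""
    return workers * threads_per_worker


if __name__ == "__main__":
    cores = 4
    workers = suggested_workers(cores)
    threads = suggested_threads(cores)
    print(workers, threads, max_concurrent_requests(workers, threads))
```

For a 4-core machine this gives 9 workers and 16 threads per worker. These are starting points to tune under real load, not guarantees.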
You can do a similar calculation for Waitress. I prefer Gunicorn, so you'll need to check the Waitress docs for its configuration.
Now coming to Web Servers
Until now, what you have configured is an application server to run Flask. This works, but you shouldn't expose an application server directly to the internet. Instead, it's always suggested to deploy Flask behind a reverse proxy like Nginx. Nginx acts as a full-fledged web server capable of handling real-world workloads.
In short, you can pick a combination from the list below as per your requirements:
Flask + Application Server + Web Server, where
the Application Server is one of Gunicorn, uWSGI, Gevent, Twisted Web, Waitress, etc. and the Web Server is one of Nginx, Apache, Traefik, Caddy, etc.