Do Rack-Based Web Servers Implement the FastCGI Protocol?

Why should I avoid using CGI?

CGI, by which I mean both the interface and the common programming libraries and practices around it, was written in a different time. It has a view of request handlers as distinct processes connected to the webserver via environment variables and standard I/O streams.

This was state-of-the-art in its day, when there were not really "web frameworks" and "embedded server modules" as we think of them today. Thus...

CGI tends to be slow

Again, the CGI model spawns one new process per connection. While spawning processes per se is cheap these days, heavy web app initialization — reading and parsing scores of modules, making database connections, etc. — makes this quite expensive.

CGI tends toward too-low-level (IMHO) design

Again, the CGI model explicitly mentions environment variables and standard input as the interface between request and handler. But ... who cares? That's much lower level than the app designer should generally be thinking about. If you look at libraries and code based on CGI, you'll see that the bulk of it encourages "business logic" right alongside form parsing and HTML generation, which is now widely seen as a dangerous mixing of concerns.
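To make that concrete, here is a minimal raw-CGI handler in Ruby (the echo output is purely illustrative): request metadata arrives in environment variables, the request body on standard input, and the response leaves on standard output.

#!/usr/bin/env ruby
# Raw CGI: one process per request, wired up via ENV, STDIN, and STDOUT.
require "uri"

method = ENV["REQUEST_METHOD"]
query  = URI.decode_www_form(ENV["QUERY_STRING"].to_s).to_h
body   = $stdin.read(ENV["CONTENT_LENGTH"].to_i) if method == "POST"

puts "Content-Type: text/plain"
puts  # blank line separates CGI headers from the body
puts "path=#{ENV['PATH_INFO']} query=#{query.inspect} body=#{body.inspect}"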

Contrast with something like Rack::Builder, where right away the coder is thinking of mapping a namespace to an action, and what that means for the broader web application. (Suddenly we are free to argue about the semantic web and the virtues of REST and this and that, because we're not thinking about generating radio buttons based off user-supplied input.)

Yes, something like Rack::Builder could be implemented on top of CGI, but, that's the point. It'd have to be a layer of abstraction built on top of CGI.
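As a sketch of that layer (the routes and responses here are made up), a config.ru evaluated by rackup inside a Rack::Builder context looks like this:

# config.ru -- Rack::Builder's routing style: map a namespace to a handler
map "/articles" do
  run ->(env) { [200, { "content-type" => "text/plain" }, ["article list\n"]] }
end
map "/health" do
  run ->(env) { [200, { "content-type" => "text/plain" }, ["OK\n"]] }
end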

CGI tends to be sneeringly dismissed

Despite CGI working perfectly well within its limitations, despite it being simple and widely understood, CGI is often dismissed out of hand. You, too, might be dismissed out of hand if CGI is all you know.

Understanding CGI / FastCGI in Rails

I asked a similar question (though it's closer to implementation than concepts) here:
HTTP request dispatch from Web Server to a CGI/FastCGI process

However, here's what I've been able to learn along the way:
CGI is a set of "standards" that define how an HTTP/web server should communicate with external programs. Note the word standards! Although CGI is not an out-and-out protocol (like HTTP or TCP), it comes pretty close to being one, since most external programs that generate HTML (written in Ruby, PHP, Python, etc.) comply with that set of standards.

You can read more about CGI here:
http://hoohoo.ncsa.illinois.edu/cgi/intro.html

and here:
http://www.w3.org/CGI/

FastCGI is an improvement on the way CGI processes are handled. Put super simply, a FastCGI process stays loaded in memory for a longer time, so it can serve multiple requests while it's loaded. Obviously this is more efficient, since the time and resources spent loading the basic CGI environment are no longer paid on every single request, as they are with plain CGI.
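As a sketch of that model, here is roughly what the long-lived loop looked like with Ruby's classic fcgi gem (assuming the gem is installed and a FastCGI-speaking web server sits in front):

require "fcgi"  # classic, now largely historical, FastCGI binding

# Expensive initialization happens once, at process start
# (imagine loading a framework, opening DB connections, etc.)

FCGI.each do |request|
  # ...then every incoming request reuses the warm process.
  request.out.print "Content-Type: text/plain\r\n\r\n"
  request.out.print "Served by PID #{Process.pid}\n"
  request.finish
end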

A little off track and Rails-specific, but this is an interesting article:
http://izumi.plan99.net/blog/index.php/2007/04/05/saving-memory-in-ruby-on-rails/

Differences between Ruby web servers and others like nginx

Rack is an interface, a specification, for application servers in Ruby. These application servers typically take in HTTP requests and return HTTP responses over either TCP ports or unix sockets. Ruby web apps leverage one of the app servers that implement the Rack specification (thin, puma, unicorn, etc.).

You wouldn't usually expose the application server directly to the internet, for a variety of reasons; one of the most important is that these application servers are meant to be a bridge between application code and HTTP, not to stand up to the wide-open internet. So the app server typically has a web server, like nginx or apache, sitting in front of it. It's very common to have HTTP requests coming into nginx on port 80 and then have nginx distribute those requests to one or more Rack application servers running on a different port.
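To make "implement the Rack specification" concrete, here is a minimal sketch of the contract (the class name and response are made up): the app server calls #call(env) on your app for every request and expects a status/headers/body triple back.

# app.rb -- any object with #call(env) returning [status, headers, body]
# can be served by thin, puma, unicorn, and other Rack app servers.
class HelloApp
  def call(env)
    [200, { "content-type" => "text/plain" }, ["Hello from #{env['PATH_INFO']}\n"]]
  end
end

In the setup described above, an app server such as puma would listen on, say, port 3000 and invoke HelloApp#call for each request, while nginx listens on port 80 and proxies to it.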

Thin server underperforming / How do evented web servers work?

Sergio is correct. Your app, at this point, is probably better off on the traditional Apache/Passenger model. If you take the evented route, especially on a single-threaded platform like Ruby, you can NEVER block on anything: not the DB, not cache servers, not other HTTP requests you might make. Nothing.

This is what makes asynchronous (evented) programming harder: it is easy to block on something, usually in the form of synchronous disk I/O or DNS resolution. Non-blocking (evented) frameworks such as nodejs are careful to (almost) never give you a blocking framework call; instead, everything is handled using callbacks (including DB queries).

This might be easier to visualize if you look at the heart of a single-threaded non-blocking server:

while (wait_on_sockets(/* list<socket> */ $sockets, /* out: fd => event */ $socketsThatHaveActivity, $timeout)) {
  foreach ($socketsThatHaveActivity as $fd => $what) {
    if ($what == READ) { // There is data available to read from this socket
      $data = readFromSocket($fd);
      processDataQuicklyWithoutBlocking($data);
    }
    elseif ($what == WRITE && ($data = dataToWrite($fd))) { // This socket is ready to be written to (if we have any data)
      writeToSocket($fd, $data);
    }
  }
}

What you see above is called the event loop. wait_on_sockets is usually provided by the OS in the form of a system call, such as select, poll, epoll, or kqueue. If processDataQuicklyWithoutBlocking takes too long, the network buffers the OS maintains for your application (new requests, incoming data, etc.) will eventually fill up, causing it to reject new connections and time out existing ones, because $socketsThatHaveActivity isn't being drained fast enough. This is different from a threaded server (e.g. a typical Apache install), where each connection is served by a separate thread/process, so incoming data is read into the app as soon as it arrives and outgoing data is sent without delay.
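For instance, a minimal, runnable sketch of the same loop in Ruby, using IO.select (which wraps the select system call; the echo behavior is purely illustrative):

require "socket"

server  = TCPServer.new(9090)
sockets = [server]

loop do
  readable, = IO.select(sockets)    # wait_on_sockets: block until activity
  readable.each do |sock|
    if sock == server
      sockets << server.accept      # new connection: start monitoring it
    else
      begin
        data = sock.read_nonblock(4096)
        sock.write("echo: " + data) # must stay quick and non-blocking
      rescue EOFError
        sockets.delete(sock)        # peer closed the connection
        sock.close
      end
    end
  end
end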

What non-blocking frameworks like nodejs do when you make (for example) a DB query is to add the socket connection of the DB server to the list of sockets being monitored ($sockets), so even if your query takes a while, your (only) thread isn't blocked on that one socket. Rather, they provide a callback:

$db.query( "...sql...", function( $result ) { /* handle result */ } );

As you can see above, db.query returns immediately, with absolutely no blocking on the DB server whatsoever. This also means you frequently have to write code like this, unless the programming language itself supports async functions (like C#'s async/await):

$db.query( "...sql...", function( $result ) { $httpResponse.write( $result ); $connection.close(); } );

The never-ever-block rule can be somewhat relaxed if you have many processes each running an event loop (typically how a node cluster is run), or if you use a thread pool to maintain the event loop (Java's Jetty, Netty, etc.; you can write your own in C/C++). While one thread is blocked on something, other threads can still run the event loop. But under heavy enough load, even these will fail to perform. So NEVER EVER BLOCK in an evented server.

So as you can see, evented servers generally try to solve a different problem: they can handle a great many open connections. Where they excel is in just pushing bytes around with light calculations (e.g. comet servers, caches like memcached and varnish, proxies like nginx and squid). It is worth noting that even though they scale better, response times generally tend to increase (nothing is better than reserving an entire thread for a connection). Of course, it might not be economically or computationally feasible to run as many threads as there are concurrent connections.

Now back to your issue: I would recommend you still keep Nginx around, as it is excellent at connection management (which is event-based); generally that means handling HTTP keep-alives, SSL, etc. You should then connect it to your Rails app using FastCGI, where you still need to run workers but don't have to rewrite your app to be fully evented. You should also let Nginx serve static content; there's no point tying up your Rails workers with something Nginx usually does better. This approach generally scales much better than Apache/Passenger, especially if you run a high-traffic website.

If you can write your entire app to be evented, then great, but I have no idea on how easy or difficult that is in Ruby.

Rails: What happens when two web servers are specified in a Gemfile

Long answer, with an example:

  1. If you create an app with Rails 5.0.0, you get the Puma web server by default.
  2. If you then install the unicorn gem and start the server with rails server, your server is still Puma. Why? Because of the config/puma.rb file generated by rails new application-name.
  3. To start a Unicorn server instead, you have to create a config file for Unicorn in the config folder.
  4. Then start Unicorn by executing unicorn -c config/unicorn.rb in your terminal (a minimal example config is sketched below).
    I think this answers your question.
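For reference, a minimal config/unicorn.rb might look something like this (the values are illustrative, not recommendations):

# config/unicorn.rb -- minimal illustrative settings
worker_processes 2    # forked worker processes serving requests
listen 3000           # TCP port; a unix socket path also works
timeout 30            # kill workers stuck longer than 30 seconds
preload_app true      # load the app in the master before forking

Running unicorn -c config/unicorn.rb then boots the master process, which forks the workers.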

Short answer:

You have to tell Rails explicitly which web server to run; Rails will not decide for you (in the case of multiple web servers).

I hope this clears your doubts.
Cheers


