Short-Polling VS Long-Polling for Real Time Web Applications

Short-polling vs Long-polling for real time web applications?

  • Short polling (a.k.a. AJAX based timer):

    Pros: simpler, and light on server resources (if the time between requests is long).

    Cons: bad if you need to be notified WHEN the server event happens with no delay.
    Example (ItsNat based)

  • Long polling (a.k.a. Comet based on XHR)

    Pros: you are notified WHEN the server event happens with no delay.
    Cons: more complex and more server resources used.
    Example (ItsNat based)
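
To put a number on the "no delay" point above, here is a minimal sketch of the notification delay with short polling. It assumes polls fire at fixed multiples of the interval and ignores network latency; the function name is made up for illustration.

```python
import math


def short_poll_delay(event_time: float, interval: float) -> float:
    """Seconds between a server event and the next scheduled poll that picks it up.

    Assumption: polls fire at t = 0, interval, 2 * interval, ... and network
    latency is ignored.
    """
    next_poll = math.ceil(event_time / interval) * interval
    return next_poll - event_time


# Event at t = 12 s with a 5 s polling interval: the next poll is at t = 15 s.
print(short_poll_delay(event_time=12.0, interval=5.0))  # 3.0
```

With long polling a request is already waiting on the server when the event happens, so under the same assumptions the delay is roughly zero.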

In what situations would AJAX long/short polling be preferred over HTML5 WebSockets?

WebSockets is definitely the future now.

Long polling is a dirty workaround that avoids creating a new connection for every request, as plain AJAX does, but it was created when WebSockets didn't exist. Now that WebSockets are available, long polling is going away.

WebRTC allows for peer-to-peer communication.

I recommend learning WebSockets.

Comparison of different communication techniques on the web:

  • AJAX - request → response. Creates a connection to the server, sends request headers with optional data, gets a response from the server, and closes the connection.
    Supported in all major browsers.

  • Long poll - request → wait → response. Creates a connection to the server like AJAX does, but keeps the connection open for some time (not long though). While the connection is open, the client can receive data from the server. The client has to reconnect periodically after the connection is closed, due to timeouts or end of data. On the server side it is still treated like an HTTP request, same as AJAX, except that the response is sent either now or at some time in the future, as defined by the application logic.
    support chart (full) | wikipedia

  • WebSockets - client ↔ server. Creates a TCP connection to the server and keeps it open as long as needed. The server or client can easily close the connection. The client goes through an HTTP compatible handshake process. If it succeeds, then the server and client can exchange data in both directions at any time. It is efficient if the application requires frequent data exchange in both directions. WebSockets use data framing, which includes masking of every message sent from client to server. Masking is not encryption; for confidentiality, use a TLS-secured connection (wss://).
    support chart (very good) | wikipedia

  • WebRTC - peer ↔ peer. Establishes communication between clients and is transport-agnostic, so it can use UDP, TCP or even more abstract layers. It is generally used for high-volume data transfer, such as video/audio streaming, where reliability is secondary and a few frames or some quality can be sacrificed in favour of response time and keeping at least some data flowing. Both sides (peers) can push data to each other independently. While it can be used entirely independently of any centralised server, it still requires some way of exchanging endpoint data, and in most cases developers still use centralised servers to "link" peers. This is required only to exchange the essential data for establishing a connection, after which a centralised server is not required.
    support chart (medium) | wikipedia

  • Server-Sent Events - client ← server. The client establishes a persistent, long-lived connection to the server. Only the server can send data to the client. If the client wants to send data to the server, it has to use another technology/protocol to do so. This protocol is HTTP compatible and simple to implement on most server-side platforms. It is preferable to Long Polling.
    support chart (good, except IE) | wikipedia

Advantages:

The main advantage of WebSockets on the server side is that, after the handshake, it is not an HTTP request but a proper message-based communication protocol. This enables significant performance and architecture advantages. For example, in node.js, you can share the same memory between different socket connections, so they can each access shared variables. Therefore, you don't need to use a database as an exchange point in the middle (as you would with AJAX or Long Polling in a language like PHP).
You can store data in RAM, or even republish between sockets straight away.
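
The idea of republishing between sockets straight from memory can be sketched as a small in-process hub. This is an illustration only: `Hub`, `join` and `publish` are hypothetical names, and each "socket" is stood in for by a plain callable rather than a real connection.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

Send = Callable[[str], None]  # stand-in for "write this message to a socket"


class Hub:
    """Hypothetical in-memory exchange point shared by all connections."""

    def __init__(self) -> None:
        self._rooms: DefaultDict[str, List[Send]] = defaultdict(list)

    def join(self, room: str, send: Send) -> None:
        self._rooms[room].append(send)

    def publish(self, room: str, message: str) -> int:
        """Deliver the message to every member of the room; return the count."""
        members = self._rooms.get(room, [])
        for send in members:
            send(message)
        return len(members)


hub = Hub()
alice_inbox: List[str] = []
bob_inbox: List[str] = []
hub.join("lobby", alice_inbox.append)
hub.join("lobby", bob_inbox.append)
hub.publish("lobby", "hello")
print(alice_inbox, bob_inbox)  # ['hello'] ['hello']
```

No database round trip is involved: the message goes from one connection handler to the others through shared process memory. This only works while all connections live in the same process.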

Security considerations

People are often concerned about the security of WebSockets. In practice it makes little difference which transport you choose. With AJAX, each request may travel over a new TCP connection, whereas a WebSocket keeps one established connection open, but an attacker positioned on the network path can read or tamper with unencrypted traffic in either case. The frame masking applied to client-to-server data and the optional compression make a captured stream slightly less convenient to read, but neither is encryption: the masking key is sent in the clear with every frame. Real protection comes from TLS, and both families offer it: HTTPS for HTTP requests and wss:// for WebSockets.
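
To make the frame masking concrete, here is a small sketch of the masking step as RFC 6455 defines it: each payload byte is XORed with one byte of a 4-byte key, and that key is transmitted in the frame header. The function name is an illustrative choice; the "Hello" bytes are the masked-frame example from the RFC.

```python
def mask_payload(payload: bytes, key: bytes) -> bytes:
    """XOR every payload byte with the 4-byte masking key (RFC 6455, section 5.3)."""
    if len(key) != 4:
        raise ValueError("masking key must be exactly 4 bytes")
    return bytes(b ^ key[i % 4] for i, b in enumerate(payload))


key = bytes([0x37, 0xFA, 0x21, 0x3D])
masked = mask_payload(b"Hello", key)
print(masked.hex())               # 7f9f4d5158
print(mask_payload(masked, key))  # b'Hello' (applying the same key again unmasks)
```

Because the key sits next to the masked bytes in every frame, anyone who can capture the frame can undo the masking. It protects intermediaries from crafted payloads; it does not hide the data.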

P.S.

Remember that WebSockets generally call for a very different approach to networking logic, closer to what real-time games have used all along than to HTTP.

Scaling a chat app - short polling vs. long polling (AJAX, PHP)

A few notes:

  • Polling every second is overkill. The app will still feel very responsive with a few seconds of delay between checks.
  • To reduce database traffic and speed up responses, consider using an in-memory cache to store undelivered messages. You could still persist messages to the db; the in-memory cache would simply serve the queries for new messages, so that each user's poll every x seconds doesn't hit the db.
  • Time out the user's chat after x seconds of inactivity so it stops polling your server. This ensures someone leaving a window open won't continue to generate traffic. Warn the user before the timeout so they can extend it, and offer a simple "Still there? Continue chatting." link for sessions that have timed out.
  • I'd suggest starting out with polling rather than comet/long polling/sockets. Polling is simple to build and support and will likely scale just fine in the short term. If you get a lot of traffic you can throw hardware and a load balancer at the problem to scale. The entire web is based on polling - polling most certainly scales. There's a point where the complexity of alternatives like comet/long polling/etc. makes sense, but you need a lot of traffic before the extra development time and complexity are justified.
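
The cache and inactivity-timeout notes above could look roughly like this. It is a sketch under assumptions: `ChatCache` and its methods are hypothetical, time is passed in explicitly to keep the example deterministic, and persistence to the db is left out.

```python
from collections import defaultdict, deque
from typing import DefaultDict, Deque, Dict, List


class ChatCache:
    """Hypothetical in-memory store for undelivered messages and user activity."""

    def __init__(self, idle_timeout: float) -> None:
        self.idle_timeout = idle_timeout
        self._pending: DefaultDict[str, Deque[str]] = defaultdict(deque)
        self._last_active: Dict[str, float] = {}

    def touch(self, user: str, now: float) -> None:
        """Record activity, e.g. sending a message or clicking "Still there?"."""
        self._last_active[user] = now

    def should_keep_polling(self, user: str, now: float) -> bool:
        last = self._last_active.get(user)
        return last is not None and now - last <= self.idle_timeout

    def push(self, user: str, message: str) -> None:
        self._pending[user].append(message)

    def poll(self, user: str) -> List[str]:
        """Return and clear the user's undelivered messages without touching the db."""
        messages = list(self._pending[user])
        self._pending[user].clear()
        return messages


cache = ChatCache(idle_timeout=60.0)
cache.touch("bob", now=0.0)
cache.push("bob", "hi")
cache.push("bob", "are you there?")
print(cache.poll("bob"))                           # ['hi', 'are you there?']
print(cache.poll("bob"))                           # []
print(cache.should_keep_polling("bob", now=61.0))  # False
```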

Polling vs long-polling

polling: hitting a url from a client at some interval

long polling: hitting the url from a client and having the server hold the connection open for a period of time. This way the server can respond at the moment it has information for the client.

Not all ajax is long polling. Long polling is best achieved using a framework like cometd. (http://www.cometd.org)

What are Long-Polling, Websockets, Server-Sent Events (SSE) and Comet?

In the examples below the client is the browser and the server is the webserver hosting the website.

Before you can understand these technologies, you have to understand classic HTTP web traffic first.

Regular HTTP:

  1. A client requests a webpage from a server.
  2. The server calculates the response.
  3. The server sends the response to the client.


Ajax Polling:

  1. A client requests a webpage from a server using regular HTTP (see HTTP above).
  2. The client receives the requested webpage and executes the JavaScript on the page which requests a file from the server at regular intervals (e.g. 0.5 seconds).
  3. The server calculates each response and sends it back, just like normal HTTP traffic.


Ajax Long-Polling:

  1. A client requests a webpage from a server using regular HTTP (see HTTP above).
  2. The client receives the requested webpage and executes the JavaScript on the page which requests a file from the server.
  3. The server does not immediately respond with the requested information but waits until there's new information available.
  4. When there's new information available, the server responds with the new information.
  5. The client receives the new information and immediately sends another request to the server, re-starting the process.
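
The "hold the request until there is news" step (steps 3 and 4) can be sketched on the server side with a blocking queue. This is a simplified stand-in for a real HTTP handler: `long_poll` and `mailbox` are hypothetical names, and a timer thread plays the role of the event that produces new information.

```python
import queue
import threading
from typing import Optional


def long_poll(mailbox: queue.Queue, timeout: float) -> Optional[str]:
    """Block until a message arrives or the timeout elapses.

    Returns the message, or None so the client knows to send a fresh request.
    """
    try:
        return mailbox.get(timeout=timeout)
    except queue.Empty:
        return None


mailbox: queue.Queue = queue.Queue()

# Simulate new information becoming available 50 ms after the request arrives.
threading.Timer(0.05, mailbox.put, args=["new data"]).start()

first = long_poll(mailbox, timeout=2.0)    # returns as soon as the data is put
second = long_poll(mailbox, timeout=0.05)  # nothing arrives, so this times out
print(first, second)  # new data None
```

In both cases the client would immediately issue the next request, which is step 5 above.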


HTML5 Server Sent Events (SSE) / EventSource:

  1. A client requests a webpage from a server using regular HTTP (see HTTP above).
  2. The client receives the requested webpage and executes the JavaScript on the page which opens a connection to the server.
  3. The server sends an event to the client when there's new information available.

    • Real-time traffic from server to client, mostly that's what you'll need
    • You'll want to use a server that has an event loop
    • Connections with servers from other domains are only possible with correct CORS settings
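
On the wire, each server-sent event is a few text lines followed by a blank line, sent in a response with the content type text/event-stream. The helper below is a minimal sketch of that framing; `format_sse` is a made-up name, not part of any library.

```python
from typing import Optional


def format_sse(data: str, event: Optional[str] = None,
               event_id: Optional[str] = None) -> str:
    """Serialize one event in the text/event-stream format."""
    lines = []
    if event_id is not None:
        lines.append(f"id: {event_id}")
    if event is not None:
        lines.append(f"event: {event}")
    # Multi-line payloads are sent as several "data:" lines.
    for part in data.split("\n"):
        lines.append(f"data: {part}")
    # A blank line terminates the event.
    return "\n".join(lines) + "\n\n"


print(repr(format_sse("hello")))
# 'data: hello\n\n'
print(repr(format_sse("line 1\nline 2", event="chat", event_id="7")))
# 'id: 7\nevent: chat\ndata: line 1\ndata: line 2\n\n'
```

In the browser, an EventSource object parses this stream and dispatches each event to the matching listener.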


HTML5 Websockets:

  1. A client requests a webpage from a server using regular HTTP (see HTTP above).
  2. The client receives the requested webpage and executes the JavaScript on the page which opens a connection with the server.
  3. The server and the client can now send each other messages when new data (on either side) is available.

    • Real-time traffic from the server to the client and from the client to the server
    • You'll want to use a server that has an event loop
    • With WebSockets it is possible to connect with a server from another domain.
    • It is also possible to use a third party hosted websocket server, for example Pusher or others. This way you'll only have to implement the client side, which is very easy!
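
"Opens a connection with the server" in step 2 starts with an HTTP upgrade handshake: the client sends a Sec-WebSocket-Key header and the server must answer with a matching Sec-WebSocket-Accept value. A short sketch of that computation follows, using the fixed GUID and the sample key from RFC 6455; the function name is illustrative.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def websocket_accept(sec_websocket_key: str) -> str:
    """Compute the Sec-WebSocket-Accept header value for a client's key."""
    digest = hashlib.sha1((sec_websocket_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


# Sample key from RFC 6455, section 1.3.
print(websocket_accept("dGhlIHNhbXBsZSBub25jZQ=="))  # s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

Once the client has verified this value, both sides stop speaking HTTP on that connection and exchange WebSocket frames instead.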


Comet:

Comet is a collection of techniques prior to HTML5 which use streaming and long-polling to achieve real time applications. Read more on wikipedia or this article.


Now, which one of them should I use for a realtime app (that I need to
code)? I have been hearing a lot about WebSockets (with socket.io [a
node.js library]), but why not PHP?

You can use PHP with WebSockets, check out Ratchet.

My Understanding of HTTP Polling, Long Polling, HTTP Streaming and WebSockets

There are more differences than the ones you have identified.

Duplex/directional:

  • Uni-directional: HTTP poll, long poll, streaming.
  • Bi-directional: WebSockets, plugin networking

In order of increasing latency (approximate):

  • WebSockets
  • Plugin networking
  • HTTP streaming
  • HTTP long-poll
  • HTTP polling

CORS (cross-origin support):

  • WebSockets: yes
  • Plugin networking: Flash via policy request (not sure about others)
  • HTTP *: some recent support

Native binary data (typed arrays, blobs):

  • WebSockets: yes
  • Plugin networking: not with Flash (requires URL encoding across ExternalInterface)
  • HTTP *: recent proposal to enable binary type support

Bandwidth in decreasing efficiency:

  • Plugin networking: Flash sockets are raw except for initial policy request
  • WebSockets: connection setup handshake and a few bytes per frame
  • HTTP streaming (re-use of server connection)
  • HTTP long-poll: connection for every message
  • HTTP poll: connection for every message + no data messages
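
The "few bytes per frame" for WebSockets can be worked out from the frame layout in RFC 6455. The function below is a small sketch of that arithmetic; its name is an illustrative choice.

```python
def ws_header_size(payload_len: int, masked: bool) -> int:
    """Size in bytes of a WebSocket frame header for a given payload length."""
    size = 2  # FIN/opcode byte plus mask bit and 7-bit length
    if payload_len > 65535:
        size += 8  # 64-bit extended payload length
    elif payload_len > 125:
        size += 2  # 16-bit extended payload length
    if masked:
        size += 4  # masking key, present on client-to-server frames
    return size


print(ws_header_size(5, masked=False))      # 2  (server to client, short message)
print(ws_header_size(5, masked=True))       # 6  (client to server, short message)
print(ws_header_size(70000, masked=True))   # 14 (large client-to-server message)
```

By contrast, each HTTP poll or long-poll request repeats its full set of request and response headers for every message.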

Mobile device support:

  • WebSocket: iOS 4.2 and up. Some Android via Flash emulation or using Firefox for Android or Google Chrome for Android which both provide native WebSocket support.
  • Plugin networking: some Android. Not on iOS
  • HTTP *: mostly yes

JavaScript usage complexity (from simplest to most complicated). Admittedly complexity measures are somewhat subjective.

  • WebSockets
  • HTTP poll
  • Plugin networking
  • HTTP long poll, streaming

Also note that there is a W3C proposal for standardizing HTTP streaming called Server-Sent Events. It is currently fairly early in its evolution and is designed to provide a standard JavaScript API with comparable simplicity to WebSockets.

Technology behind real-time polling

There are several technologies to achieve this:

  • polling: the app makes a request every x milliseconds to check for updates
  • long polling: the app makes a request to the server, but the server only responds when it has new data available (usually if no new data is available in X seconds, an empty response is sent or the connection is killed)
  • forever frame: a hidden iframe is opened in the page and the request is made for a doc that relies on HTTP 1.1 chunked encoding
  • XHR streaming: allows successive messages to be sent from the server without requiring a new HTTP request after each response
  • WebSockets: this is the best option, it keeps the connection alive at all times
  • Flash WebSockets: if WS are not natively supported by the browser, then you can include a Flash script to enhance that functionality

Usually people use Flash WebSockets or long-polling when WebSockets (the most efficient transport) is not available in the browser.

A perfect example on how to combine many transport techniques and abstract them away is Socket.IO.
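
The fallback idea can be sketched as picking the first supported transport from a preference list. This is not Socket.IO's actual algorithm: the transport labels, their order and the function name are all assumptions made for illustration.

```python
from typing import Iterable, Optional

# Assumed preference order, most efficient first (illustrative labels only).
PREFERENCE = [
    "websocket",
    "flash-websocket",
    "xhr-streaming",
    "forever-frame",
    "long-polling",
    "polling",
]


def pick_transport(supported: Iterable[str]) -> Optional[str]:
    """Return the most preferred transport the client supports, or None."""
    available = set(supported)
    for name in PREFERENCE:
        if name in available:
            return name
    return None


print(pick_transport(["polling", "long-polling"]))            # long-polling
print(pick_transport(["polling", "websocket", "forever-frame"]))  # websocket
```

The application code then talks to one abstract connection object and does not need to know which transport was chosen.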

Additional resources:

http://en.wikipedia.org/wiki/Push_technology

http://en.wikipedia.org/wiki/Comet_(programming)

http://www.leggetter.co.uk/2011/08/25/what-came-before-websockets.html

Server polling with JavaScript

Is there a difference between long-polling and using Comet

http://techoctave.com/c7/posts/60-simple-long-polling-example-with-javascript-and-jquery

Video discussing different techniques: http://vimeo.com/27771528

The book Even Faster Websites has a full chapter (ch. 8) dedicated to 'Scaling with Comet'.

Long-polling vs websocket when expecting one-time response from server-side

The question is, can we say that for the case of one-time responses,
long-polling is better choice than websockets?

Not really. Long polling is inefficient (multiple incoming requests, multiple times your server has to check on the state of the long running job), particularly if the usual time period is long enough that you're going to have to poll many times.


If a given client page is only likely to do this operation once, then you can really go either way. There are some advantages and disadvantages to each mechanism.

At a response time of 5-10 minutes you cannot assume that a single http request will stay alive that long awaiting a response, even if you make sure the server side will stay open that long. Clients or intermediate network equipment (proxies, etc...) may simply not keep the initial http connection open that long. That would have been the most efficient mechanism if you could have done that. But I don't think you can count on that for a random network configuration and client configuration that you do not control.

So, that leaves you with several options which I think you already know, but I will describe here for completeness for others.

Option 1:

  • Establish websocket connection to the server by which you can receive push response.
  • Make http request to initiate the long running operation. Return response that the operation has been successfully initiated.
  • Receive websocket push response some time later.
  • Close webSocket (assuming this page won't be doing this again).

Option 2:

  • Make http request to initiate the long running operation. Return response that the operation has been successfully initiated and probably some sort of taskID that can be used for future querying.
  • Using http "long polling" to "wait" for the answer. Since these requests will likely "time out" before the response is received, you will have to regularly long poll until the response is received.
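
Option 2 can be sketched as follows. Everything here is hypothetical scaffolding: `TaskStore` stands in for the server, `wait_for_result` for the client loop, and a timer thread for the long-running operation, with timeouts shortened to milliseconds so the example finishes quickly.

```python
import threading
import uuid
from typing import Dict, Optional, Tuple


class TaskStore:
    """Hypothetical server side: tracks tasks by id and lets requests wait on them."""

    def __init__(self) -> None:
        self._done: Dict[str, threading.Event] = {}
        self._results: Dict[str, str] = {}

    def start(self) -> str:
        """Initiate an operation and return a taskID for later querying."""
        task_id = uuid.uuid4().hex
        self._done[task_id] = threading.Event()
        return task_id

    def finish(self, task_id: str, result: str) -> None:
        self._results[task_id] = result
        self._done[task_id].set()

    def long_poll(self, task_id: str, timeout: float) -> Optional[str]:
        """Hold the request until the task finishes or the timeout elapses."""
        if self._done[task_id].wait(timeout):
            return self._results[task_id]
        return None


def wait_for_result(store: TaskStore, task_id: str, poll_timeout: float,
                    max_polls: int) -> Tuple[Optional[str], int]:
    """Client side: re-issue the long poll each time it times out."""
    for attempt in range(1, max_polls + 1):
        result = store.long_poll(task_id, poll_timeout)
        if result is not None:
            return result, attempt
    return None, max_polls


store = TaskStore()
task_id = store.start()
threading.Timer(0.12, store.finish, args=[task_id, "report ready"]).start()
result, polls = wait_for_result(store, task_id, poll_timeout=0.05, max_polls=20)
print(result, "after", polls, "long polls")
```

The number of polls needed depends on how long the operation takes relative to the poll timeout, which is the inefficiency this option carries.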

Option 3:

  • Establish webSocket connection.
  • Send message over webSocket connection to initiate the operation.
  • Receive response some time later that the operation is complete.
  • Close webSocket connection (assuming this page won't be using it any more).

Option 4:

  • Same as option 3, but using socket.io instead of plain webSocket to give you heartbeat and auto-reconnect logic to make sure the webSocket connection stays alive.

If you're looking at things purely from the networking and server efficiency point of view, then options 3 or 4 are likely to be the most efficient. You only have the overhead of one TCP connection between client and server, and that one connection is used for all traffic. The traffic on that connection is pretty efficient and supports actual push, so the client gets notified as soon as possible.

From an architecture point of view, I'm not a fan of option 1 because it just seems a bit convoluted when you initiate the request using one technology and then send the response via another and it requires you to create a correlation between the client that initiated an incoming http request and a connected webSocket. That can be done, but it's extra bookkeeping on the server. Option 2 is simple architecturally, but inefficient (regularly polling the server) so it's not my favorite either.


