What Is the Most Efficient Way to Exchange High-Volume Data Between Two Processes?

What is the most efficient way to exchange high-volume data between two processes?

Use IPC:

  • One shared memory region: a low-overhead shared buffer between the two processes (see shmat()).
  • One semaphore: its counter would be the number of frames available (see semop()). The process that pumps video data from the camera would put a frame in the shared memory region and put() on the semaphore. The process that records the frames to disk would get() on the semaphore, and a frame would be waiting in shared memory.

It's a bit like implementing a queue with a semaphore as a counter; a minimal sketch of the idea follows.
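
Here is a rough sketch of that pattern using the SysV calls the answer names (shmat(), semop()), written in C++ on a POSIX system. The frame size, the key, and the single-slot buffer are illustrative assumptions; a real ring buffer would also need to track free slots (e.g. with a second semaphore).

```cpp
// Producer side of the shared-memory + semaphore queue described above.
// Assumptions: one Frame-sized slot, SysV shm/sem, no error handling beyond shmat.
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/sem.h>
#include <cstring>
#include <cstdio>

struct Frame { unsigned char pixels[640 * 480]; };   // assumed frame layout

int main() {
    key_t key = ftok("/tmp", 'V');                            // shared key for both processes
    int shmid = shmget(key, sizeof(Frame), IPC_CREAT | 0600); // shared buffer
    int semid = semget(key, 1, IPC_CREAT | 0600);             // counter: frames available

    void* mem = shmat(shmid, nullptr, 0);                     // attach the shared segment
    if (mem == reinterpret_cast<void*>(-1)) { perror("shmat"); return 1; }
    Frame* frame = static_cast<Frame*>(mem);

    // Producer: copy a frame into shared memory, then "put" on the semaphore
    // (increment by 1) to signal that a frame is available.
    std::memset(frame->pixels, 0, sizeof(frame->pixels));     // stand-in for camera data
    sembuf put{0, +1, 0};
    semop(semid, &put, 1);

    // Consumer (in the other process): "get" on the semaphore (decrement by 1,
    // blocking until a frame is available), then read the frame and write it to disk.
    // sembuf get{0, -1, 0};
    // semop(semid, &get, 1);

    shmdt(frame);
    return 0;
}
```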

Fastest way to share a connection and data from it with multiple processes?

I believe a dedicated service that exposes the data via shared memory is your best bet. Second to that would be a service that multicasts the data via named pipes, except that you're targeting a Unix variant and not Windows.

Another option would be UDP multicast, so that the data replication occurs at the hardware or driver level. The only problem is that UDP delivery is not guaranteed to be in order, nor is it guaranteed to happen at all.
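
For reference, this is roughly what subscribing to a multicast group looks like from a receiving process, as a minimal C++ sketch over POSIX sockets; the group address 239.0.0.1 and port 5000 are arbitrary examples, and error handling is omitted.

```cpp
// Join a UDP multicast group and receive one datagram.
// Each subscribing process does the same join; the kernel/NIC replicates the traffic.
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cstdio>

int main() {
    int sock = socket(AF_INET, SOCK_DGRAM, 0);

    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(5000);                               // example port
    bind(sock, reinterpret_cast<sockaddr*>(&addr), sizeof(addr));

    // Ask the kernel to deliver datagrams sent to this multicast group.
    ip_mreq mreq{};
    inet_pton(AF_INET, "239.0.0.1", &mreq.imr_multiaddr);      // example group address
    mreq.imr_interface.s_addr = htonl(INADDR_ANY);
    setsockopt(sock, IPPROTO_IP, IP_ADD_MEMBERSHIP, &mreq, sizeof(mreq));

    char buf[65536];
    ssize_t n = recv(sock, buf, sizeof(buf), 0);               // may arrive out of order, or not at all
    std::printf("received %zd bytes\n", n);

    close(sock);
    return 0;
}
```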

I think sharing the physical socket is a hack and should be avoided. You would be better off implementing a driver that did what you wanted the daemon to do transparently (i.e. processes would see what looks like a normal socket, except that internally it maps onto a single real socket, with logic to re-broadcast the data among the virtual sockets). Unfortunately, the level of effort to get that right would be significant, and if time to complete is a concern, sharing the socket isn't really a good route to take (whether done at the driver level or via some other hacky means such as sharing the socket descriptor across processes).

Sharing the socket also assumes a push-only connection, i.e. no traffic negotiation is occurring at the application level (requests for data, for example, or acknowledgements of data receipt).

A quick path to completion may be to look at projects such as BNC and convert the code, or hijack the general idea, to do what you need. Replicating traffic to local sockets shouldn't incur huge latency, though you would be exercising the NIC (and its associated buffers) for all of the data replication, and if you are nearing the limits of the hardware (or have a poor driver and/or TCP stack implementation) you may wind up with a dead server. Where I work we've seen data replication tank a gigabit Ethernet card at the driver level, so it's not unheard of.

Shared memory is the best bet if you want to remain platform independent and performant, while not introducing anything that may become unsupportable in five years' time due to kernel or hardware/driver changes.

Advice on handling large data volumes

So then what if the processing requires jumping around in the data for multiple files and multiple buffers? Is constant opening and closing of binary files going to become expensive?

I'm a big fan of "memory-mapped I/O", aka "direct byte buffers". In Java they are called mapped byte buffers (MappedByteBuffer) and are part of java.nio. (Basically, this mechanism uses the OS's virtual memory paging system to "map" your files and present them programmatically as byte buffers. The OS manages moving the bytes between disk and memory automagically and very quickly.)

I suggest this approach because a) it works for me, and b) it lets you focus on your algorithm and lets the JVM, OS and hardware deal with the performance optimization. All too frequently, they know what is best more so than us lowly programmers. ;)

How would you use MBBs in your context? Just create an MBB for each of your files and read them as you see fit. You will only need to store your results.
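
In Java the entry point is FileChannel.map(), which returns a MappedByteBuffer; under the hood it uses the operating system's memory-mapping facility. To keep the code examples in this write-up in one language, here is a minimal C++ sketch of that same underlying mechanism via mmap on a POSIX system; the file name is a placeholder and the file is assumed to be non-empty.

```cpp
// Map a whole file into memory and access it by offset: the OS pages bytes
// in on demand, so "jumping around" is pointer arithmetic, not read() calls.
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>
#include <cstdio>

int main() {
    int fd = open("data.bin", O_RDONLY);        // placeholder file name
    if (fd < 0) { perror("open"); return 1; }

    struct stat st{};
    fstat(fd, &st);

    void* p = mmap(nullptr, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (p == MAP_FAILED) { perror("mmap"); return 1; }
    const unsigned char* data = static_cast<const unsigned char*>(p);

    // Random access anywhere in the mapping, e.g. the first and last bytes.
    std::printf("mapped %lld bytes; first byte = %u, last byte = %u\n",
                static_cast<long long>(st.st_size),
                data[0], data[st.st_size - 1]);

    munmap(p, st.st_size);
    close(fd);
    return 0;
}
```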

BTW: how much data are you dealing with, in GB? If it is more than 3-4 GB, then this won't work for you on a 32-bit machine, as the MBB implementation is dependent on the addressable memory space of the platform architecture. A 64-bit machine and OS will take you to somewhere between 1 TB and 128 TB of mappable data.

If you are thinking about performance, then get to know Kirk Pepperdine (a somewhat famous Java performance guru). He is involved with a website, www.JavaPerformanceTuning.com, that has some more MBB details (NIO performance tips) and other Java performance-related material.

AWS best way to handle high volume transactions

So if you want to create a system that is highly resilient, whilst also being redundant, I would advise you to read the AWS Well-Architected Framework. It goes into more detail than a person can provide on Stack Overflow.

Regarding individual technologies:

  • If you're transactional like you said, then you should look at using a relational data store for the data. I'd recommend taking a look at Amazon Aurora; it has built-in features like auto-scaling of read replicas and multi-master support. Whilst you might be expecting large numbers, by using autoscaling you only pay for what you use.
  • Try to decouple your APIs: put a dumb validation layer in front before handing off to your backend if you can help it. Technologies like SQS (as you mentioned) help with decoupling when combined with Lambda.
  • SQS guarantees at-least-once delivery, so if your system must not write duplicates, you'll want to account for idempotency in your application (see the sketch after this list).
  • Also use a dead letter queue (DLQ) to handle any failed actions.
  • Ensure any resources residing in your VPC are spread across availability zones.
  • Use S3, EC2 Backup Manager and RDS snapshots to ensure data is backed up. Most other services have some sort of backup functionality you can enable.
  • Use autoscaling wherever possible to ensure you're reducing costs.
  • Build any infrastructure using an IaC tool (CloudFormation or Terraform), and do any provisioning of resources via a tool like Ansible, Puppet, or Chef. Try to follow a pre-baked AMI workflow so that it is quick to return to the base server state.
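
On the idempotency point above, here is a minimal sketch of consumer-side deduplication. The in-memory set stands in for a durable store (in practice you would key a conditional write on the message ID, e.g. in a database); the Message type and handleMessage() are hypothetical names, not part of any AWS SDK.

```cpp
// Idempotent handling of messages from an at-least-once queue such as SQS:
// the same message ID may be delivered more than once, so record what has
// been processed and skip replays before performing the write.
#include <string>
#include <unordered_set>
#include <iostream>

struct Message {
    std::string id;     // SQS message ID or a business-level dedup key
    std::string body;
};

static std::unordered_set<std::string> processed;   // stand-in for a durable dedup store

void handleMessage(const Message& msg) {
    // insert().second is false if the ID was already present: a redelivery.
    if (!processed.insert(msg.id).second) {
        std::cout << "duplicate " << msg.id << " ignored\n";
        return;
    }
    std::cout << "processing " << msg.id << ": " << msg.body << "\n";
    // ... perform the actual (single) write here ...
}

int main() {
    handleMessage({"m-1", "order created"});
    handleMessage({"m-1", "order created"});   // redelivery is silently ignored
    return 0;
}
```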

Process Many Files Concurrently — Copy Files Over or Read Through NFS?

I would definitely do #2 - and I would do it as follows:

Run Apache on your main server with all the files. (Or some other HTTP server, if you really want.) There are several reasons I'd do it this way:

  1. HTTP is basically pure TCP (with some headers on it). Once the request is sent, it's a very "one-way" protocol: low overhead, not chatty, high performance and efficiency.

  2. If you (for whatever reason) decided you needed to move or scale it out (using a cloud service, for example), HTTP would be a much better way to move the data around over the open Internet than NFS. You could use SSL (if needed). You could get through firewalls (if needed). Etc., etc.

  3. Depending on the access pattern of your files, and assuming the whole file needs to be read, it's easier/faster just to do one network operation and pull the whole file in one whack, rather than constantly requesting I/Os over the network every time you read a smaller piece of the file (a short libcurl sketch of this follows the list).

  4. It could be easy to distribute and run an application that does all this, and it doesn't rely on the existence of network mounts or specific file paths. If you have the URL to the files, the client can do its job. It doesn't need established mounts or hard-coded directories, or to become root to set up such mounts.

  5. If you have NFS connectivity problems, the whole system can get whacky when you try to access the mounts and they hang. With HTTP running in a user-space context, you just get a timeout error, and your application can take whatever action it chooses (like paging you, logging errors, etc.).
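
As a rough illustration of point 3, here is a minimal client that pulls a whole file in a single HTTP request using libcurl (link with -lcurl). The URL and output path are placeholders.

```cpp
// Fetch one file over HTTP in a single request and stream it to disk.
#include <curl/curl.h>
#include <cstdio>

// libcurl write callback: hand each chunk of the response body to fwrite.
static size_t writeToFile(char* ptr, size_t size, size_t nmemb, void* userdata) {
    return std::fwrite(ptr, size, nmemb, static_cast<std::FILE*>(userdata));
}

int main() {
    curl_global_init(CURL_GLOBAL_DEFAULT);
    CURL* curl = curl_easy_init();
    std::FILE* out = std::fopen("local_copy.dat", "wb");        // placeholder output path
    if (!curl || !out) return 1;

    // One request, one response: the whole file streams over a single TCP
    // connection instead of many small NFS round trips.
    curl_easy_setopt(curl, CURLOPT_URL, "http://fileserver/files/input.dat");  // placeholder URL
    curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, writeToFile);
    curl_easy_setopt(curl, CURLOPT_WRITEDATA, out);
    curl_easy_setopt(curl, CURLOPT_FAILONERROR, 1L);             // treat HTTP errors as failures

    CURLcode rc = curl_easy_perform(curl);
    if (rc != CURLE_OK)
        std::fprintf(stderr, "download failed: %s\n", curl_easy_strerror(rc));

    std::fclose(out);
    curl_easy_cleanup(curl);
    curl_global_cleanup();
    return rc == CURLE_OK ? 0 : 1;
}
```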

Efficient Way to Process Simple but Large Files in C++

You might want to preallocate the vector using reserve() if you have an idea of how large the "average" file is.

Efficiency is a tricky game. Don't play tricks early on; design a good basic algorithm first. If it's not fast enough, start looking at the I/O routines and at whether you're creating any "extra" objects (explicitly or implicitly, especially if you're passing parameters around).

In your example, you might want to make the second call to clock() before printing the summary output, to get a slightly more accurate timing! :)
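
Putting both suggestions together, here is a minimal sketch; the record layout, file name, and reserve() estimate are assumptions for illustration, not taken from the question.

```cpp
// Read fixed-size records from a binary file into a pre-reserved vector,
// taking the second clock() reading before any output so printing doesn't
// pollute the measurement.
#include <cstdio>
#include <ctime>
#include <fstream>
#include <vector>

struct Record { double a, b, c; };   // assumed fixed-size record

int main() {
    std::clock_t start = std::clock();

    std::vector<Record> records;
    records.reserve(1'000'000);                       // rough guess at the "average" file

    std::ifstream in("input.bin", std::ios::binary);  // placeholder file name
    Record r;
    while (in.read(reinterpret_cast<char*>(&r), sizeof(r)))
        records.push_back(r);

    std::clock_t end = std::clock();                  // measure before printing anything

    std::printf("read %zu records in %.3f s\n",
                records.size(),
                double(end - start) / CLOCKS_PER_SEC);
    return 0;
}
```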


