How to Get EC2 Load Balancing Properly Set Up to Allow for Real Time File Syncing

How do I get EC2 load balancing properly set up to allow for real time file syncing?


Do I need to have the database on RDS, where every instance simply points to it?

That is one option, or you can boot up another instance to sit behind the app servers, put MySQL on it and have them all connect to that instance. One thing to note: make sure to connect over the internal network using the private IP, and make sure all your security is tight.
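As a rough illustration of the private IP advice, an app server could refuse to start when its configured database host is a public address. This is only a sketch using Python's standard `ipaddress` module; the function name and the sample addresses are hypothetical.

```python
import ipaddress


def is_private_db_host(host_ip):
    """Return True when host_ip is not a publicly routable address.

    Note: is_private is also True for loopback and link-local addresses.
    """
    return ipaddress.ip_address(host_ip).is_private


# Hypothetical addresses, for illustration only.
print(is_private_db_host("10.0.1.25"))    # True
print(is_private_db_host("54.12.34.56"))  # False
```

A check like this catches the common mistake of pasting the instance's public IP into the app config, which would send database traffic outside the internal network.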

How about user files? If a user uploads a file to the site, then this file should be available
immediately on all instances. How is this possible? I don't think having 3 copies on 3
instances is very practical.

No, that is not practical. You could upload it to that backend DB instance that they all have access to, but really you should probably upload it to S3, in a bucket that all your instances can use, with s3tools or something similar.
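A minimal sketch of the S3 approach, assuming the upload handler is written in Python with boto3 (the answer itself only mentions s3tools). The bucket name, key layout and helper names are hypothetical. `RecordingClient` is a stand-in so the example runs without AWS; in real use you would pass `boto3.client("s3")` instead.

```python
import io
import uuid


def build_upload_key(user_id, filename):
    """Build a unique object key so uploads with the same name never collide."""
    safe_name = filename.replace("/", "_").replace("\\", "_")
    return f"uploads/{user_id}/{uuid.uuid4().hex}-{safe_name}"


def store_upload(s3_client, bucket, user_id, filename, fileobj):
    """Send fileobj to the shared bucket and return the key to record in the database."""
    key = build_upload_key(user_id, filename)
    # boto3's S3 client exposes upload_fileobj(Fileobj, Bucket, Key) for file-like objects.
    s3_client.upload_fileobj(fileobj, bucket, key)
    return key


class RecordingClient:
    """Stand-in for boto3.client("s3") so the sketch runs without AWS credentials."""

    def __init__(self):
        self.objects = {}

    def upload_fileobj(self, fileobj, bucket, key):
        self.objects[(bucket, key)] = fileobj.read()


client = RecordingClient()
key = store_upload(client, "example-shared-uploads", "42", "avatar.png", io.BytesIO(b"png-bytes"))
print(key.startswith("uploads/42/"))  # True
```

Because every instance reads and writes the same bucket, no instance keeps its own copy and a file is visible to all of them as soon as the upload finishes.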

If I modify the site, let's say change something in the CSS file, how do I sync the changes to all instances?

Git (or svn). But you could also use CloudFront for your JS and CSS files, with an S3 bucket as the source. Not a bad idea.

How do EBS or S3 play a role in all of this?

Your database should always be on EBS volumes so you don't lose it. S3 can be used to share and store files cheaply and easily across your entire environment.

How to sync compiled code to multiple EC2 instances

Elastic Beanstalk seems to be the best route to go now. You simply push your web deploy project to an Elastic Beanstalk environment and it deploys the code to all of your instances. (It manages auto scaling for you.) It also makes sure that newly launched instances have your latest code, and it keeps previous versions which you can easily roll back to.

If your ASP.NET website needs to be auto scaled on AWS, Elastic Beanstalk is really the best end-to-end solution.

Deploy Rails Application on EC2

Sure.

  1. Create an AWS account.

  2. Decide what region you want to be in. Lots of things go into this decision, but worry about it later and just pick a cheap one like Oregon or US East.

  3. Make sure you are in the correct region at the top left.

  4. Then click Launch Instance.

  5. At this point you have to pick an AMI. An AMI is basically the template you want to use when you boot your server. Amazon gives you some, but there are a ton in the community section. I am a CentOS guy so I usually search for a CentOS AMI. RightScale makes some good ones so you can search for one of those. Make sure you pick i386 or x86_64 depending on the size of server you want. There are two distinct types of AMIs: EBS-backed and S3-backed (instance store). Really you should stick with EBS because you have more freedom, but there are reasons to use both that are beyond the scope of this answer. Look for EBS and you will probably be good. EBS is the block storage: basically attachable hard drives for your instances. Since everything in the cloud is "virtual" and nothing is thought of in a physical sense, you have to think that way too. So if you want more storage, you can attach more EBS volumes later. One thing though: S3-backed instances lose their data when you shut them down. The EBS ones will too if you have the delete-on-termination flag set, but EBS-backed instances can be "Stopped" (keeping the volume) as well as "Terminated".

  6. Select the size and availability zone. The zone is important if you are going to be setting up some kind of redundancy. Like if I have a master slave setup with MySQL I would put the master in one zone and the slave in another in case Amazon was having troubles that were isolated to one zone. But for this general purpose, don't worry about it.

  7. Advanced Instance Options. Just leave all this alone; most likely it is fine. Some of the small things here, like termination protection, can be set later.

  8. Name it. Whatever.

  9. Make an SSH key. Straightforward. By default, the only way to log in to an Amazon server is with the SSH key you assign it. Password logins are not set up, and the login user name is fixed by the AMI.

  10. Security Groups. This is where you could get tripped up, well, here and #5. You should start off by creating a general security group called foo or whatever, then adding the ports you want open on it. So if you want to ssh into it, which I assume you do, then open 22. If you want to use it for web then open 80 and 8080 or whatever. But be careful. I usually change my SSH port later to something random. And instead of putting 0.0.0.0/0 on it, I put my personal IP. But if you don't care that much, just put 0.0.0.0/0 and open that bad boy to the world.
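To make the 0.0.0.0/0 warning in step 10 concrete, here is a small sketch that flags which ports in a rule list are open to the whole internet. The rule format is a simplified, hypothetical shape (not the EC2 API's), and 203.0.113.7 is a documentation address standing in for "my personal IP".

```python
import ipaddress


def world_open_ports(rules):
    """Return the sorted ports whose rule allows every IPv4 address (0.0.0.0/0)."""
    open_ports = set()
    for rule in rules:
        network = ipaddress.ip_network(rule["cidr"])
        if network.prefixlen == 0:
            open_ports.add(rule["port"])
    return sorted(open_ports)


# Hypothetical rule set for the "foo" security group.
rules = [
    {"port": 22, "cidr": "203.0.113.7/32"},  # SSH limited to one address
    {"port": 80, "cidr": "0.0.0.0/0"},       # web traffic from anywhere
    {"port": 8080, "cidr": "0.0.0.0/0"},
]
print(world_open_ports(rules))  # [80, 8080]
```

If port 22 ever shows up in that output, SSH is reachable from anywhere, which is the situation the step above suggests avoiding.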

Then it will boot, as long as it all went as it was supposed to.

Now you can log in. Just ssh -i thekey.pem user@thewholehostname, where the user name depends on the AMI (for example root or ec2-user).

Hope that helps.

There is this whole free tier you can use. http://aws.amazon.com/free/

Check that out. I would use that while you play with it.

I did all that from memory so I could have been off. ;)

Realtime syncing of large numbers of log files to S3

You could have a Lambda function triggered automatically whenever a new object is saved to your S3 bucket. See "Using AWS Lambda with Amazon S3" in the AWS documentation for details. The event passed to the Lambda function contains the object key (the file name), allowing you to target only the new files in the syncing process.
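As a sketch of that first step, a Python handler can pull the bucket and key out of each record in the S3 notification event. The function names and the sample event are hypothetical, and the print call is only a placeholder for whatever the real sync step is.

```python
from urllib.parse import unquote_plus


def new_object_keys(event):
    """Return (bucket, key) pairs for every record in an S3 notification event."""
    pairs = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Keys arrive URL-encoded in the event (a space becomes '+').
        key = unquote_plus(record["s3"]["object"]["key"])
        pairs.append((bucket, key))
    return pairs


def handler(event, context):
    """Lambda entry point: touch only the objects named in the event."""
    pairs = new_object_keys(event)
    for bucket, key in pairs:
        print(f"would sync s3://{bucket}/{key}")  # placeholder for the real sync
    return {"synced": len(pairs)}


# Hypothetical, trimmed-down event for local testing.
sample_event = {
    "Records": [
        {"s3": {"bucket": {"name": "example-logs"}, "object": {"key": "app/2020+01.log"}}}
    ]
}
result = handler(sample_event, None)
```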

If you'd like to wait until you have, say, 1,000 files in order to sync in batch, you could use Amazon SQS and the following workflow (using 2 Lambda functions, 1 CloudWatch rule and 1 SQS queue):

  1. S3 invokes Lambda whenever there's a new file to sync.
  2. Lambda stores the filename in SQS.
  3. CloudWatch triggers another Lambda function every X minutes/hours to check how many files are waiting in SQS. Once there are 1,000 or more, it retrieves those filenames and runs the syncing process.
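Step 3 of the workflow could look roughly like the sketch below. It assumes the boto3 SQS client methods `get_queue_attributes`, `receive_message` and `delete_message`; the helper names, the queue URL and the `InMemoryQueue` class are hypothetical. `InMemoryQueue` is only a stand-in so the example runs without AWS.

```python
BATCH_THRESHOLD = 1000  # the "say 1,000 files" figure from the workflow


def queue_depth(sqs_client, queue_url):
    """Return the approximate number of messages waiting in the queue."""
    response = sqs_client.get_queue_attributes(
        QueueUrl=queue_url, AttributeNames=["ApproximateNumberOfMessages"]
    )
    return int(response["Attributes"]["ApproximateNumberOfMessages"])


def run_batch_if_ready(sqs_client, queue_url, sync_batch, threshold=BATCH_THRESHOLD):
    """Sync one batch once enough filenames are queued; return how many were handled."""
    if queue_depth(sqs_client, queue_url) < threshold:
        return 0
    received = []
    while len(received) < threshold:
        response = sqs_client.receive_message(
            QueueUrl=queue_url,
            MaxNumberOfMessages=min(10, threshold - len(received)),  # SQS caps this at 10
        )
        messages = response.get("Messages", [])
        if not messages:
            break
        received.extend(messages)
    sync_batch([message["Body"] for message in received])
    # Delete only after the sync succeeded, so a failed run leaves the messages queued.
    for message in received:
        sqs_client.delete_message(QueueUrl=queue_url, ReceiptHandle=message["ReceiptHandle"])
    return len(received)


class InMemoryQueue:
    """Stand-in for the SQS client (not a boto3 class) so the sketch runs locally."""

    def __init__(self, bodies):
        self.messages = [{"Body": b, "ReceiptHandle": f"rh-{i}"} for i, b in enumerate(bodies)]
        self.in_flight = set()

    def get_queue_attributes(self, QueueUrl, AttributeNames):
        return {"Attributes": {"ApproximateNumberOfMessages": str(len(self.messages))}}

    def receive_message(self, QueueUrl, MaxNumberOfMessages):
        visible = [m for m in self.messages if m["ReceiptHandle"] not in self.in_flight]
        batch = visible[:MaxNumberOfMessages]
        self.in_flight.update(m["ReceiptHandle"] for m in batch)
        return {"Messages": batch} if batch else {}

    def delete_message(self, QueueUrl, ReceiptHandle):
        self.messages = [m for m in self.messages if m["ReceiptHandle"] != ReceiptHandle]


queue = InMemoryQueue([f"logs/file-{i}.log" for i in range(25)])
synced = []
processed = run_batch_if_ready(queue, "hypothetical-queue-url", synced.extend, threshold=20)
print(processed, len(queue.messages))  # 20 5
```

The queue depth reported by SQS is approximate, so the threshold is best treated as "roughly 1,000" rather than an exact trigger.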

Keep in mind that Lambda has a hard execution timeout (5 minutes when this was written, since raised to 15 minutes). If your sync job takes too long, you'll need to break it into smaller chunks.
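One way to stay inside the timeout, sketched below, is to stop before the deadline and hand the leftover files to a later invocation. In a real Lambda you could pass `context.get_remaining_time_in_millis` as `remaining_ms`; the function name, the safety margin and the fake clock are assumptions for illustration.

```python
def sync_until_deadline(filenames, sync_one, remaining_ms, safety_margin_ms=10_000):
    """Sync files in order and return the ones left over when time runs short."""
    for index, name in enumerate(filenames):
        if remaining_ms() < safety_margin_ms:
            return filenames[index:]
        sync_one(name)
    return []


# Fake clock for local testing: 60 s left, then 30 s, then only 5 s.
clock = iter([60_000, 30_000, 5_000])
done = []
leftover = sync_until_deadline(["a.log", "b.log", "c.log"], done.append, lambda: next(clock))
print(done, leftover)  # ['a.log', 'b.log'] ['c.log']
```

The leftover list can be pushed back onto the queue or passed to a fresh invocation, so no single run has to finish the whole job.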

MySQL Load Balancing

The "table is full" error means your slave doesn't have enough space to perform the ALTER TABLE. You need to get larger disks to resolve that error.

But the subtext is that no one is monitoring your database servers, and that's a bigger problem. You need to get a database administrator, or else get a professional service to do it.

What I'm wondering is how the big guys proceed when dealing with high load servers where you always need the data to be accurate and cannot take any risk?

First, get it out of your head that any system has no risk. That's impossible, if you plan to use the system at all. You can't eliminate the possibility of errors, but you can be prepared to recover from them seamlessly.

The big guys do the following:

  1. Hire operations staff including system administrators, network administrators, database administrators to take care of the servers.

  2. Monitor everything. Use software to track system load, disk space, errors, and many other things continuously. New Relic is a good option. For MySQL slave integrity, use a tool like pt-table-checksum.

  3. Redundancy. Create standby systems and data to take over when (not if) the primary system fails.
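As a tiny illustration of the "monitor everything" point, and of how the "table is full" error could have been caught early, here is a disk space check using only the Python standard library. The 20% threshold and the function names are arbitrary assumptions; a real setup would feed this into whatever alerting tool you use.

```python
import shutil


def disk_free_fraction(path="/"):
    """Return the fraction of the filesystem holding `path` that is still free."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0
    return usage.free / usage.total


def check_disk(path="/", min_free=0.20):
    """Return a one-line status message for the given path."""
    free = disk_free_fraction(path)
    if free < min_free:
        return f"WARNING: only {free:.0%} free on {path}"
    return f"OK: {free:.0%} free on {path}"


print(check_disk("/"))
```

Run from cron on each database server, a check like this gives a warning well before an ALTER TABLE runs out of room.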

You probably want to learn about the field of high availability architecture. Check out this talk: Scalable Internet Architectures

AWS Elastic Load Balancing: Seeing extremely long initial connection time

I think it is possibly an ELB misconfiguration. I had the same problem when I attached private subnets to the ELB, and fixed it by switching them to public subnets. See https://docs.aws.amazon.com/ElasticLoadBalancing/latest/DeveloperGuide/elb-manage-subnets.html

How to view AWS logs in real time (like tail -f)

Have a look at awslogs.

If you happen to be working with Lambda/API Gateway specifically, have a look at apilogs.


