Does anyone know anything about load balancing? I have a heavily overloaded server, and I'm already using the fastest chipset my web host offers. Current situation: a single server running the web server plus a single filesystem, no MySQL database, but lots of dynamic pages served via Perl scripts. Desired configuration: multiple web servers, all connected to the same single filesystem. Assuming that my dynamic pages only read from the filesystem and never write to it, what would I install to balance the load between two or more web servers? When someone goes to www.mydomain.com, the load balancer should choose which web server (server1, server2, or server3) to send the request to. That is the part I don't know how to do. It should all look the same at the front end. I use 64-bit CentOS and Apache.
Can't you ask your server provider to implement this for you? Many are very competent, and some will do it for free (managed server config).
I think a true load balancer would be a separate machine or an actual piece of hardware. An easier way to do it would be to use round-robin DNS, which rotates the server that visitors are sent to and so evens out the load. It's not as effective as a dedicated balancer, but it's easier to implement and can have a large impact if you can mirror all the content.
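To make the rotation idea concrete, here is a rough Python simulation of how round-robin DNS spreads visitors. It is only a sketch: the addresses are documentation placeholders, the function name is made up, and it assumes every client takes the first address in each answer. Real resolvers cache answers, so the split is usually less even than this.

```python
from collections import Counter, deque


def simulate_round_robin_dns(addresses, lookups):
    """Count how many lookups land on each address under simple rotation."""
    records = deque(addresses)
    hits = Counter()
    for _ in range(lookups):
        # Assumption: the client connects to the first address in the answer.
        hits[records[0]] += 1
        # The DNS server rotates the record order before the next answer.
        records.rotate(-1)
    return hits


# Placeholder addresses standing in for server1, server2 and server3.
SERVERS = ["192.0.2.11", "192.0.2.12", "192.0.2.13"]
```

Calling simulate_round_robin_dns(SERVERS, 9) gives each server three hits. Note that nothing here checks whether a server is actually up, which is one reason round-robin DNS is weaker than a real balancer.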
I guess, but I'd rather have control over how it is done, so I'd rather not pass this one along to my web host. Thanks for the advice. Yes, I guess it should be a separate piece of hardware playing a gatekeeper role. But if I involve DNS, wouldn't that change the URL? I want the whole backend to remain invisible to the user, so that URLs stay the same irrespective of which machine is serving. Also, I want to use just the one filesystem, as opposed to mirroring content. This is very important. I think I need to grab the incoming HTTP requests before they reach Apache and then script something. Any idea how to go about this?
OK... I just got some advice: I need to install haproxy on my gatekeeper machine and then get the server machines to share the filesystem using GlusterFS. This is pretty much what I need, I think.
To do it without redundancy (high availability) you'd need three machines: two mirrored web servers and a redirector machine. Or ask your web host if they have an option. It's usually not cheap, and it's billed monthly as a service.
I think before considering a load balancer you might want to rework the website (the application) itself. The bottleneck is most likely high IO or the Perl scripts. Have you checked what's actually bogging down the machine: CPU, RAM, or IO?
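One rough way to check on Linux is to sample the aggregate "cpu" line of /proc/stat twice and compare. The Python sketch below does that; the function names are my own, and it only looks at the first seven counters, so treat it as a starting point rather than a proper monitoring tool.

```python
CPU_FIELDS = ["user", "nice", "system", "idle", "iowait", "irq", "softirq"]


def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into tick counters."""
    values = line.split()[1:1 + len(CPU_FIELDS)]
    return dict(zip(CPU_FIELDS, (int(v) for v in values)))


def cpu_breakdown(before, after):
    """Percentage of time spent in each state between two samples."""
    deltas = {name: after[name] - before[name] for name in before}
    total = sum(deltas.values()) or 1  # avoid dividing by zero
    return {name: 100.0 * d / total for name, d in deltas.items()}


def read_cpu_sample(path="/proc/stat"):
    """Read the current counters; the first line is the all-CPU total."""
    with open(path) as f:
        return parse_cpu_line(f.readline())
```

Take one read_cpu_sample(), wait a few seconds under normal traffic, take another, and pass both to cpu_breakdown(). A large iowait share points at the disk, while a large user share points at the Perl scripts burning CPU. RAM pressure needs a separate look (for example with free or top).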
I have had success using Squid and Apache (mod_proxy) as HTTP load balancers. As a nice side effect they can both cache static content (mod_cache for Apache) to reduce the number of requests hitting your web servers. For the DocumentRoot, I used rsync at one place I worked to keep all file systems the same and NFS at another place. Keeping your website in Subversion and doing updates on each webserver in a post-commit hook might be another way of keeping them all in sync. As you said, however, you don't want to sync content but rather have one canonical copy of it. NFS is the only system here that meets that requirement. I haven't worked with haproxy or GlusterFS but they both look like interesting technologies. GlusterFS doesn't look like a replacement for NFS either (as I suspected it might be by its name). It looks more like a distributed filesystem that can be spread over several boxes but will look like one single filesystem to any box in the cluster. If you decide to use either of them, let us know how you got on with configuring them and how they handle your workload. I'm always interested in seeing what alternatives there are to the choices we have made in our setup.
Unless it's a REALLY HUGE site, a simple round-robin DNS and a single NFS-exported backend should be enough. No, your URL will not change, but the IP addresses the user sees (if they ever check them) will be different. That really shouldn't matter since a lot of the big guys are using this. Another option is to use Lighttpd or Nginx as a round-robin reverse proxy, with multiple Apache servers serving the dynamic content.
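If you want to see for yourself that the hostname stays put while the addresses vary, a small Python sketch like the one below lists every address a name resolves to. The resolve_all name is made up, and www.mydomain.com is just the placeholder from the original question.

```python
import socket


def resolve_all(hostname, port=80):
    """Return the sorted, de-duplicated addresses a hostname resolves to."""
    infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address as a string.
    return sorted({info[4][0] for info in infos})
```

With round-robin DNS in place, resolve_all("www.mydomain.com") should return one address per web server, while the URL in the browser is the same for every visitor.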
I'd look into a faster web server before committing to a complex load-balanced setup. Try http://www.litespeedtech.com. We host several forums and other sites ranked in the Alexa top 500 using LiteSpeed as the web server on a single machine, with a separate DB server. It's an incredibly powerful piece of software. LiteSpeed also develops a software load balancer, which might be worth looking into if switching web servers alone won't do it.
That's also an idea, though I would recommend a free alternative to LiteSpeed's paid solution. There have been several benchmarks comparing the two, and they performed roughly the same. Lighttpd: I love it.
Sweet, thanks for the info! I will need load balancing either way, but this will help too if I can get it to work. Nothing beats free. Thanks!
We're getting there. But actually, the solution you mention would probably work for the time being. I'll look into it. Thank you!