I have spent a lot of time googling the issue of server load, and I realize it's pretty difficult to predict, but maybe some of the gurus here have some ideas and can help me. I have a dedicated machine with a decent hosting service. Here are the specs:

- P4 2.4 GHz
- 1 GB DDR RAM
- 2x 80 GB SATA 7200 RPM in RAID
- Red Hat Enterprise Linux 4

We don't do database work on this server, and there aren't any memory-intensive processes running on the machine. We have put out some widgets that we hope will catch on in a viral way: bloggers will post them on their blogs, more users will interact with them, and they'll continue to spread. Obviously, we're concerned about scaling as we get more popular. All we're really doing is making calls to some short PHP scripts that load data from Amazon S3 and serve it out. I'm not concerned with the call to Amazon S3, and our provider has assured me many times that bandwidth will not be an issue (of course, we'll get charged if we go over, but our users won't get shut out or anything). So all I'm talking about here is serving short scripts, but (hopefully) a LOT of them during the day. Any clue how this server will hold up? Or can you at least point me in the right direction as to what else I need to know to calculate this?
If the scripts are all done, then use a benchmarking utility such as "ab" to simulate a load and see what happens. Since you're on RHEL4, it's probably installed already. Do something like:

    ab -n 1000 -c 50 http://www.sitename.com/

That will run 1000 requests at a concurrency level of 50 and output some decent statistics for you. You should run 'top' in another window at the same time to see what the CPU, memory, and disk usage on the server looks like while this is happening.
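If you want a record of what the box is doing while ab hammers it, instead of eyeballing top, something along these lines works (the widget.php URL is just a placeholder for whatever script you actually serve, and you may want to tune the counts):

    # log resource usage in the background for about a minute
    top -b -d 2 -n 30 > top.log &
    # 1000 requests, 50 concurrent, against the real script
    ab -n 1000 -c 50 http://www.sitename.com/widget.php > ab.log
    # the headline numbers
    grep -E 'Requests per second|Time per request|Failed requests' ab.log

Keep bumping -c up until requests start failing or response times blow out; wherever that knee is, that's your practical ceiling on this hardware.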
Thanks for the tip, I'll try that tool and see what comes out! The scripts are relatively simple: go fetch an object from S3, and render a page based on it. It's not processor- or memory-intensive; it's more a question of how many hits/day we can withstand. For instance, 1 million/day? 2 million? Does anybody know of any published benchmarks where I could see what kind of results people were getting for different hardware configurations?
You're not going to be able to get that sort of benchmark, because your script is different from anyone else's, so it's not possible to benchmark it generically. You may be able to find benchmarks for specific applications such as Apache or MySQL on specific hardware, but even then I doubt you're going to find anything solid. The fact is you have to test your script in your environment and see how the server performs; nobody else can do this for you. I would guess that your bottleneck will be network bandwidth between S3 and your server rather than CPU, memory, or disk access.
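One sanity check you can do once you have ab numbers: take the "Requests per second" figure and multiply it out to a day. Back of the envelope (the 50 req/s is just an example figure, plug in whatever ab actually reports):

    # e.g. if ab reports 50 requests/sec sustained:
    awk 'BEGIN { print 50 * 86400 }'
    # = 4320000 requests/day as a theoretical ceiling;
    # real traffic is bursty, so divide by 3-5 for peak-hour headroom

Even with a big safety factor, a P4 running a short PHP script shouldn't break a sweat at 1 million hits/day if S3 responds quickly; it's the waiting-on-S3 part that ties up Apache processes.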
Okay, I realize that everyone's system is different, but I'm looking for some general estimates of what a server can do. What is considered a lot of traffic? What is considered a lot of traffic for a machine like mine? We can talk in general terms, like 1 million hits/day: is that too much for my P4 to handle? I'm not looking for exact numbers, just interested in getting an idea of what kind of traffic people are seeing on what kind of hardware, to at least get a rough estimate of when we'll need to upgrade. Obviously, I'll be watching the server load and making sure things aren't too slow, but I don't want to wait until things are awfully slow before I upgrade; I want a general idea of when I'll need to do it. How does everybody else tackle this problem of keeping the hardware slightly ahead of growth in traffic?
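For what it's worth, my current plan is just a dumb daily cron job so I can see the trend line before the box starts hurting (the paths are guesses for a stock RHEL4 Apache setup, and this assumes the access log rotates daily):

    # /etc/cron.daily/traffic-log (hypothetical): record the daily
    # request count and current load average to one trend file
    echo "$(date +%F) $(wc -l < /var/log/httpd/access_log) requests, load $(cut -d' ' -f1 /proc/loadavg)" >> /var/log/traffic-trend.log

Is that roughly what others do, or is there a smarter way?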
It depends on the scripts or pages you have on your server. If you're serving plain HTML pages, then yes, 1 million hits can be handled; not easily, but it's possible. If you have some resource-intensive script, no, it won't handle it.
Don't worry at first, and use the top command a lot. If S3 works well, you should be able to serve A LOT with a P4; doing what you describe, you're only parsing a small PHP script, which isn't costly in resources. The question is how you have configured Apache (guessing that's what you're running); I'd say that's the only bottleneck here. I don't know about 1 or 2 million, but you'll get a clear picture as things pick up slowly. Then (if you're not good at it) get someone to tweak Apache for you. It's interesting how many resources are needed per widget script/user interaction, so give us some ab and top stats. Perhaps there are more static ways of hosting this; serving static XML files might be a good option. Configured well, and with static files, even a mucky Celeron could handle 2 million hits a day.
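To see what you're actually working with, check which MPM your Apache was built with and what your concurrency limits are set to (stock RHEL4 paths assumed; adjust if your config lives elsewhere):

    # which MPM Apache was built with (prefork vs worker)
    httpd -V | grep -i mpm
    # the directives that cap how many requests you serve at once
    grep -Ei 'maxclients|startservers|keepalive' /etc/httpd/conf/httpd.conf

With prefork, MaxClients times your script's per-process memory footprint is what must fit in your 1 GB without swapping, so that's the first number to get right.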
As others have said, you can use the Apache benchmark tool (httpd.apache.org/docs/2.0/programs/ab.html) to get a rough idea of how many pages you can serve per second. You can also use that tool to test your site using different Apache configurations. For example, if you're using PHP or Perl through CGI, try switching to mod_php or mod_perl. You should see at least a 10-fold increase in the number of simultaneous connections =)
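An easy way to see the difference is to benchmark the same URL under both setups; roughly like this (placeholder URL again, and restart Apache between runs):

    # baseline with PHP running through CGI
    ab -n 1000 -c 50 http://www.sitename.com/widget.php > cgi.log
    # switch to mod_php in httpd.conf, restart Apache, then rerun
    ab -n 1000 -c 50 http://www.sitename.com/widget.php > modphp.log
    # compare throughput side by side
    grep 'Requests per second' cgi.log modphp.log

The CGI setup forks a new PHP interpreter for every request, while mod_php keeps it resident in the Apache process, which is where the speedup comes from.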