
Any useful tips for massively scaling PHP applications?

Discussion in 'PHP' started by SlingshotSEO, Oct 15, 2009.

  1. SlingshotSEO (Peon)

    #1
    I am working on a REALLY large project using PHP/MySQL. Does anyone have tips on how to ensure that a database-intensive PHP application scales effectively? I notice that Facebook.com uses PHP on a massive scale.

    Cloud computing?

    Memcached?

    Apache? LiteSpeed?

    Thoughts?
     
    SlingshotSEO, Oct 15, 2009
  2. organicCyborg (Peon)

    #2
    There's no single way to scale every application, since each one presents its own challenges.

    Facebook and Twitter are both sites that have to scale to a huge user base, but the database structure for each is entirely different. I know Facebook's setup has always been a huge cluster of servers, each running memcached. Not sure what Twitter does, though.

    The easiest thing to do is scale vertically before you scale horizontally. That is, buy a more powerful server before you buy two reasonably powerful servers and have to rewrite your application to scale to more than one machine.

    If you think about horizontal scaling early on, it will hinder your development and prevent you from getting a working project out the door.
     
    organicCyborg, Oct 15, 2009
  3. mgutt (Active Member)

    #3
    Don't use the database if it isn't really necessary. That is the main part of optimization.

    That means writing data into a cache file with serialize() and reading it back with unserialize(file_get_contents()) as long as the data stays static for a while, like user profiles, categories, etc.
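
    Something like this is the basic idea (a rough, untested sketch; the cache directory, key names, and TTL are placeholders I made up):

    <?php
    // Minimal file cache: serialize data to disk and reuse it until the TTL expires.
    // The /tmp path and the 3600-second TTL are only example values.
    function cache_get($key, $ttl = 3600) {
        $file = '/tmp/cache_' . md5($key);
        if (is_file($file) && (time() - filemtime($file)) < $ttl) {
            return unserialize(file_get_contents($file));
        }
        return false;
    }

    function cache_set($key, $data) {
        file_put_contents('/tmp/cache_' . md5($key), serialize($data), LOCK_EX);
    }

    // Only hit MySQL when the cached copy is missing or stale.
    $profile = cache_get('profile_42');
    if ($profile === false) {
        $profile = array('name' => 'Jane'); // ...expensive DB query goes here...
        cache_set('profile_42', $profile);
    }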

    In most big projects, full-text search is the main problem, so there it's necessary to cache the data. Look at Google: they cache all their results.

    On the server itself you can think about XCache/APC, or use a Squid proxy to speed up PHP's output.
     
    mgutt, Oct 16, 2009
  4. jestep (Prominent Member)

    #4
    There's a lot more to it than simply using a proxy or caching mechanism.

    Can you explain what type of usage you're anticipating, specifically web traffic and database usage (read- or write-intensive)? Are you running any huge reports or doing any major compilations, etc.? Are pages anticipated to be static enough that caching would even work? What sort of budget do you have? What bandwidth to the internet do you have available? Are you looking for shared hosting, a single-server setup, a cluster, etc.?
     
    jestep, Oct 16, 2009
  5. SlingshotSEO (Peon)

    #5
    Thanks to all that have chimed in. I appreciate the feedback!

    This program is not graphics-intensive. We are on the second version and it still runs very slowly. We have maxed out our dedicated server in terms of what the hosting provider offers, yet the provider says we use less than 1% of bandwidth, storage, memory, and CPU. The system makes extensive use of the database (both read- and write-intensive).

    Our main issue is how slowly the system currently runs, and we need to scale to at least 5 times the current capacity within 12 months. I have 5 PHP/MySQL programmers on staff full time, CS grads with 3 to 5 years of experience after college.

    Any help is much appreciated!!!
     
    SlingshotSEO, Oct 19, 2009
  6. organicCyborg (Peon)

    #6
    It seems like there'd be a better way to get this question answered than asking this forum, when no one here knows the details of your program.
     
    organicCyborg, Oct 19, 2009
  7. SlingshotSEO (Peon)

    #7
    I'm looking for suggestions to pass to my team. So far, from poking around, I've learned about memcached and database sharding, neither of which my programmers knew about. Anything would help to save time and money. My team doesn't need to reinvent the wheel.

    For example, what would be good options aside from using Apache for a very large, database-intensive application with well over 250 simultaneous database accesses?

    + Reputation for good tips.
     
    Last edited: Oct 19, 2009
    SlingshotSEO, Oct 19, 2009
  8. organicCyborg (Peon)

    #8
    If your bottleneck is the database, replacing Apache with a lightweight solution won't solve any of your scalability issues.

    De-normalizing your database can make a huge impact if you have to do a lot of joins for your queries, but it's an ugly solution.
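
    For example (table and column names made up), instead of doing a COUNT(*) join on every listing page, you keep a counter column up to date at write time:

    <?php
    // De-normalized counter: bump comment_count on the parent row on every insert,
    // so the listing query never joins. Connection details are placeholders.
    $db = new mysqli('localhost', 'app', 'secret', 'mydb');
    $db->query("INSERT INTO comments (thread_id, body) VALUES (42, 'hello')");
    $db->query("UPDATE threads SET comment_count = comment_count + 1 WHERE id = 42");

    // The listing page becomes a single-table read:
    $rows = $db->query("SELECT id, title, comment_count FROM threads ORDER BY id DESC LIMIT 20");
    // The ugly part: counters can drift, so you may need a periodic job to re-sync them.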

    Memcached is going to be your best bet.
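
    The usual pattern is a read-through cache, roughly like this (pecl/memcached extension; the server address, key, and TTL are placeholders):

    <?php
    // Read-through cache: check memcached first, fall back to MySQL on a miss.
    $mc = new Memcached();
    $mc->addServer('127.0.0.1', 11211); // placeholder server

    $profile = $mc->get('user_profile_42');
    if ($profile === false) {
        $db = new mysqli('localhost', 'app', 'secret', 'mydb'); // placeholder credentials
        $result = $db->query('SELECT id, name, email FROM users WHERE id = 42');
        $profile = $result->fetch_assoc();
        $mc->set('user_profile_42', $profile, 300); // cache for 5 minutes
    }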
     
    organicCyborg, Oct 20, 2009
  9. jestep (Prominent Member)

    #9
    Based on what you have posted thus far, I would definitely be leaning towards a well-optimized InnoDB MySQL database. The last thing you want is table locking on a busy database, so I would stay away from MyISAM. I would also recommend InnoDB simply for data integrity reasons.
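
    If you're on MyISAM now, converting is a one-off script, something like this (untested sketch with placeholder credentials; each ALTER rewrites the table, so run it in a maintenance window):

    <?php
    // Find every MyISAM table in the schema and convert it to InnoDB.
    $db = new mysqli('localhost', 'app', 'secret', 'mydb'); // placeholder credentials
    $result = $db->query("SHOW TABLE STATUS WHERE Engine = 'MyISAM'");
    while ($row = $result->fetch_assoc()) {
        $db->query('ALTER TABLE `' . $row['Name'] . '` ENGINE=InnoDB');
    }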

    When you say "250 simultaneous database accesses", do you mean 250 unique users or 250 transactions?

    Your solution to this may simply be hardware. 250 users with a few transactions each could easily be 2,000 transactions per second, which is a formidable volume if it's consistent. If that's the case, there's going to be no choice but to use a very solid, well-designed dedicated server with a lot of RAM. I don't think clustering would be necessary at this point, but table partitioning may help if you have some very large tables. The next step would be multi-master replication, or a single-master/multi-slave replication setup. Clustering would be a last resort, as you are forced to use the NDB storage engine, which lacks some necessary features like foreign key support.
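
    With replication, the application has to split reads from writes itself; a rough sketch (host names and credentials are made up):

    <?php
    // Send writes to the master and reads to a randomly chosen slave.
    function db_master() {
        static $conn = null;
        if ($conn === null) {
            $conn = new mysqli('master.db.internal', 'app', 'secret', 'mydb'); // placeholder host
        }
        return $conn;
    }

    function db_slave() {
        static $conn = null;
        if ($conn === null) {
            $slaves = array('slave1.db.internal', 'slave2.db.internal'); // placeholder hosts
            $conn = new mysqli($slaves[array_rand($slaves)], 'app', 'secret', 'mydb');
        }
        return $conn;
    }

    // Reads can tolerate a little replication lag; writes must hit the master.
    $rows = db_slave()->query('SELECT id, name FROM users LIMIT 10');
    db_master()->query('UPDATE users SET last_seen = NOW() WHERE id = 42');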

    As far as memcached goes, that or another front-end caching system will greatly reduce load. However, if every page/query is unique, caching won't solve anything; it will just create more load. I personally like APC better than memcached just because it is so much easier to install and configure. This may end up being a developer's-choice pick. A Squid reverse proxy would be another good front-end caching mechanism.
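
    With APC you get a shared-memory user cache on the same box as PHP; a quick sketch (the key name and 60-second TTL are arbitrary examples):

    <?php
    // Cache an expensive aggregate in APC's user cache for 60 seconds.
    function get_active_user_count(mysqli $db) {
        $count = apc_fetch('active_user_count', $success);
        if (!$success) {
            $row = $db->query('SELECT COUNT(*) AS c FROM users WHERE active = 1')->fetch_assoc();
            $count = (int) $row['c'];
            apc_store('active_user_count', $count, 60);
        }
        return $count;
    }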
     
    jestep, Oct 20, 2009