Very large MySQL DB and Performance

Discussion in 'MySQL' started by jestep, Jul 28, 2008.

  1. #1
    We're looking at migrating our document archiving to MySQL instead of file-system archiving. The documents would be stored in a single table with id, name, type, and description fields, along with a BLOB column containing the actual document. The only index would be the primary key (id).

    The DB size will be around 500 GB and will grow at a rate of about 1 GB to 2 GB per week. However, depending on future usage, we 'could' reach several TB within a year or so.

    Has anyone successfully scaled a MySQL DB to this size? Without the document data, we're sitting at about 10 GB with no performance issues. Hardware is a dedicated DB server with two quad-core 2.4 GHz Xeons, 16 GB RAM, and a large hardware RAID 10 SAS array.
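    A rough sketch of that layout (Python's sqlite3 standing in for MySQL; in MySQL the blob column would likely be LONGBLOB on an InnoDB table, and the sample values here are illustrative):

```python
import sqlite3

# Sketch of the proposed archive table. sqlite3 stands in for MySQL here;
# in MySQL the body column would be LONGBLOB and the table likely InnoDB.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE documents (
        id          INTEGER PRIMARY KEY,   -- the only index
        name        TEXT NOT NULL,
        type        TEXT,
        description TEXT,
        body        BLOB                   -- the document itself
    )
""")

pdf_bytes = b"%PDF-1.4 ..."  # stand-in for a scanned document
conn.execute(
    "INSERT INTO documents (name, type, description, body) VALUES (?, ?, ?, ?)",
    ("invoice-001.pdf", "application/pdf", "July invoice", pdf_bytes),
)

# Fetching a document pulls the whole blob through the DB connection,
# which is exactly the cost being debated in this thread.
row = conn.execute("SELECT name, body FROM documents WHERE id = 1").fetchone()
print(row[0], len(row[1]))
```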
     
    jestep, Jul 28, 2008 IP
  2. itisme1760 (Greenhorn)
    #2
    I've done a 300 GB one, but I split the load between two quad Xeons with 8 GB RAM on RAID 0.

    It took a couple of hours. I don't know how 500 GB will fare on your server, but seeing the 16 GB of RAM, I'd stop everything else running on it and dedicate the machine to the migration.
     
    itisme1760, Jul 28, 2008 IP
  3. klimu (Peon)
    #3
    I suppose the first thing you'll hit is the operating system's file-size limit; MySQL itself should work beyond that. There are probably also some configuration changes needed to handle tables that big. Google uses MySQL, so there is at least one nice reference for huge DBs :D
     
    klimu, Jul 29, 2008 IP
  4. david_t (Member)
    #4
    Google uses their own database software, BigTable.
     
    david_t, Aug 3, 2008 IP
  5. Avatar Viper (Well-Known Member)
    #5
    You can try re-indexing often...
     
    Avatar Viper, Aug 3, 2008 IP
  6. shenron (Notable Member)
    #6
    I don't see a valid reason to put your documents inside the database.
    My company has been developing document management software for the last 7 years, and we keep the documents outside the database, linking them with a "logical index stream".
    This way you keep your database small and fast, and very easy to manage, back up, etc.

    Just think about how Gmail handles attachments ;)
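    A minimal sketch of that pattern (sqlite3 as a stand-in for MySQL; the helper names and hashing scheme are illustrative, not how any particular product does it): only metadata and a storage path go into the database, while the bytes live on the filesystem.

```python
import hashlib
import os
import sqlite3
import tempfile

STORE = tempfile.mkdtemp()  # stand-in for the real document store
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE documents (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        path TEXT NOT NULL            -- where the file actually lives
    )
""")

def archive(name: str, data: bytes) -> int:
    # Fan files out into subdirectories by content hash so no single
    # directory grows unbounded.
    digest = hashlib.sha256(data).hexdigest()
    subdir = os.path.join(STORE, digest[:2])
    os.makedirs(subdir, exist_ok=True)
    path = os.path.join(subdir, digest)
    with open(path, "wb") as f:
        f.write(data)
    cur = conn.execute(
        "INSERT INTO documents (name, path) VALUES (?, ?)", (name, path)
    )
    return cur.lastrowid

def fetch(doc_id: int) -> bytes:
    # The DB round-trip only touches a tiny row; the heavy read is a
    # plain filesystem operation.
    path = conn.execute(
        "SELECT path FROM documents WHERE id = ?", (doc_id,)
    ).fetchone()[0]
    with open(path, "rb") as f:
        return f.read()

doc_id = archive("invoice-001.pdf", b"%PDF-1.4 ...")
print(doc_id, len(fetch(doc_id)))
```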
     
    shenron, Aug 3, 2008 IP
  7. Trusted Writer (Banned)
    #7
    That's true. I have seen several document managers on SourceForge, in JSP, PHP, Python, and other languages, that seem to manage such a large collection more efficiently.

    With a database of that size you need a powerful dedicated server, adjustments to your MySQL and PHP ini files, and BigDump for the import task, but how fast queries can be performed is ultimately up to the script.
     
    Trusted Writer, Aug 3, 2008 IP
  8. jestep (Prominent Member)
    #8
    We've looked at doing it this way as well.

    Basically, our customer management database is accessed through an intranet website. Storing scanned documents in the database makes it extremely easy to access them from the intranet application. We use it for some smaller document management purposes and it works extremely well, much better than traditional file storage.

    But as stated, the sheer size of this project definitely makes me hesitate. An index stored in the database, as you suggested, would stay fast regardless of size, so that may be the best way of doing it at this scale.
     
    jestep, Aug 4, 2008 IP
  9. klimu (Peon)
    #9
    I sure would like to see you try them both and benchmark the results. There are pros and cons to both methods. What size are the stored documents?

    And yes, Google's search engine uses BigTable, but Google does use MySQL for business-critical stuff like AdWords(?). This is actually commonly misunderstood.

    "MySQL is the de-facto standard database for web sites that support huge volumes of both data and end users. Examples include Friendster (over 1.5 billion queries a day and 21TB of data storage), Yahoo, Google, and many others. MySQL also powers large data warehouses with examples being Cox Communications and Los Alamos Lab (7TB and growing…) And MySQL is a popular choice for high-traffic OLTP systems in retail, government, and elsewhere."

    http://dev.mysql.com/tech-resources/articles/dispelling-the-myths.html
     
    klimu, Aug 5, 2008 IP
  10. falcondriver (Well-Known Member)
    #10
    Storing big files inside a DB is pretty much always a bad idea, simply because the server stays busy much longer reading the file and sending it back to the client.
    It's much better for the DB if you just store a link or location where the client can find the file. That way the DB processes much smaller records, and the client can GET the files itself instead of having the DB SEND them.
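    A minimal sketch of that idea (the host name, column names, and helper are assumptions for illustration): the DB hands back a small metadata record containing the file's location, and the client fetches the bytes directly from a plain web server.

```python
# The metadata query is cheap, e.g.:
#   SELECT id, name, path FROM documents WHERE id = ?
# The DB never streams the blob; it only returns where to find it.
BASE_URL = "http://files.example.com/archive/"  # assumed static file host

def document_url(row: dict) -> str:
    # Build the URL the client will GET directly from the file server.
    return BASE_URL + row["path"]

row = {"id": 1, "name": "invoice-001.pdf", "path": "ab/123.pdf"}
print(document_url(row))
```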
     
    falcondriver, Aug 5, 2008 IP