Table structure - which is "best"?

Discussion in 'MySQL' started by James WP, Nov 9, 2007.

  1. #1
    I'm looking for some opinions on the best way to store my data.

    Let's say, for the sake of a simple example in this thread, that I want to store in a MySQL database information on various pets, which are either cats or dogs. Let's also say that both can and always will be described with the same attributes (shown below). Finally, imagine this database will (for some reason) be accessed online by a large number of users via PHP webpages, who would only need information on cats or dogs at any one time.

    My database could therefore look like this:

    [​IMG]

    Or it could look like this:

    [​IMG]

    My question, therefore, is which do you think is the most suitable structure and why?
    Would any circumstances change your mind? (a busy server, more species, more records...)
    What are the pros and cons of each method?

    Any and all input is appreciated! :)
     
    James WP, Nov 9, 2007 IP
  2. Synch

    Synch Peon

    Messages:
    76
    Likes Received:
    7
    Best Answers:
    0
    Trophy Points:
    0
    #2
    I can't see the first image.

    Based on the 2nd image, I would make a table with all the species, a table with all the colors, and a table to store the favorite food seperately.

    You can reference the colors and species by a relational key, and I would put the food seperately so that you could eventually have more than one favorite food per animal (it happens!). This would save space if the site got really huge. If you plan on running reports on these tables than every night I would dump the contents into a rollup table. (that is if you end up with millions of rows though)
     
    Synch, Nov 10, 2007 IP
  3. James WP

    James WP Active Member

    Messages:
    42
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    91
    #3
    Thanks Synch. I'm sorry that you can't see the first image - it shows two tables: one for cats and one for dogs.

    I understand your point about favorite foods, though for this example please imagine that such a scenario cannot arise. I have really just tried to create a simple example representing a situation that would take a lot more explaining.

    Perhaps another way of looking at the same problem is to imagine two completely separate messageboards stored on one database - would you store the posts in separate tables despite having the same fields, or in one table with an extra field denoting which site they belong to?
     
    James WP, Nov 10, 2007 IP
  4. N_F_S

    N_F_S Active Member

    Messages:
    2,475
    Likes Received:
    56
    Best Answers:
    0
    Trophy Points:
    90
    #4
    I'm using the first method, in the long run it will be easy for you to manage, although 2nd is more time savy and compact.
     
    N_F_S, Nov 11, 2007 IP
  5. James WP

    James WP Active Member

    Messages:
    42
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    91
    #5
    Thanks N_F_S. I'm thinking it probably won't be that much easier to manage, as all I'd have to do is include a "WHERE Species = 'Cat'" clause to my queries, but you do have a point there.

    Are there any other factors I should consider?

    I suppose my main concerns are:

    - whether searching for Bingo the dog would take significantly longer in a combined PETS table than a smaller DOGS table (ie, would website users notice a greater lag if the table was big enough?);

    - whether processes accessing cat data in the combined table would hinder/delay other processes accessing dog data, to a greater extent than if they were in separate tables.

    Apologies if I'm not using the most efficient terminology here!
     
    James WP, Nov 12, 2007 IP
  6. N_F_S

    N_F_S Active Member

    Messages:
    2,475
    Likes Received:
    56
    Best Answers:
    0
    Trophy Points:
    90
    #6
    James,

    You really could notice some greater lag if the table was big enough, we are talking about tens-hundreds of thousands records, not less.

    By the way, I'd suggest you to use in WHERE clause numbers, you can assign Cat, dog, etc. 1,2,3 id's then instead of Species = 'Cat' search SpeciesID = '1', SpeciesID = '2', etc.
     
    N_F_S, Nov 13, 2007 IP
  7. flippers.be

    flippers.be Peon

    Messages:
    432
    Likes Received:
    4
    Best Answers:
    0
    Trophy Points:
    0
    #7
    IMO:
    with restrictions that you only have cats and dogs and won't have other species later, and you always select on only 1 species and not 2: make 2 tables.

    It will be much faster.. especially when there are thousands of rows, indexes will be smaller requiring less blocks to read from disk, ..
     
    flippers.be, Nov 14, 2007 IP
  8. Petey

    Petey Peon

    Messages:
    68
    Likes Received:
    4
    Best Answers:
    0
    Trophy Points:
    0
    #8
    Hi James,

    N_F_S is right, it is all dependent on the size of data.

    If we are only talking about less than 50 thousand records then I would probably follow your PETS design. Put all the data in a single table, index the search columns and optimise the database engine so it caches query results.

    Write some example queries and use 'select explain' to see the query plan and tune your indexes.

    If the dataset is bigger then I would follow the CATS and DOGS design. If you Google 'star schemas' you will find some good design info to help you.
     
    Petey, Nov 14, 2007 IP
  9. GroceryPriceBooks

    GroceryPriceBooks Peon

    Messages:
    501
    Likes Received:
    14
    Best Answers:
    0
    Trophy Points:
    0
    #9
    I think the 3rd one. It provides the most information on your animals.
     
    GroceryPriceBooks, Nov 14, 2007 IP
  10. James WP

    James WP Active Member

    Messages:
    42
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    91
    #10
    Thanks for your input, everyone! :)

    It could eventually reach that size, so this is a strong argument for the separate tables.

    You're right, in the practical case this is what I'll do.

    Indexing is something else I need to look into. :D Yep, that's the level I'm at... :eek:

    This is interesting. I've had a quick look and a star schema seems like a good idea, although it may get messy if I have to add any fields/tables at a later date. I'll obviously read about it in more detail to weigh up the pros and cons.

    So would your opinion change if I said there would also be hamsters, rabbits and canaries involved? (Probably not a sentence you expect to read on Digital Point!)
     
    James WP, Nov 14, 2007 IP
  11. flippers.be

    flippers.be Peon

    Messages:
    432
    Likes Received:
    4
    Best Answers:
    0
    Trophy Points:
    0
    #11
    Yes and no.

    If you want to optimize only for speed of inserts/retrieval then you still make a separate table for each species.

    However in the end you may end up with 100 tables that are very similar.. you will need to create select/insert statements or stored procs for them all which are very similar..

    If you want to reduce time for development/coding then the 1st layout with 1 common table is best.
    That way if you want to add a new species you can do it almost immediately (one insert in a reference table). Else you will have to create a new table and make the associated insert/select code, put a new version of the database and associated program on your server, and so on..
     
    flippers.be, Nov 15, 2007 IP
  12. msaqibansari

    msaqibansari Peon

    Messages:
    84
    Likes Received:
    2
    Best Answers:
    0
    Trophy Points:
    0
    #12
    Now what if you also have an other PET? Will you create another table? Its not a great idea just organize all data in one table and filter them with PET Type. It is easy to manage.
     
    msaqibansari, Nov 15, 2007 IP