Hello, I read the following in a freelance job description: I have worked with PHP+MySQL mostly on dummy data, in order to learn queries, the DB structure, etc. What I mean is: what is specific about huge databases in terms of structure, implementation, and usage? I guess large databases also require data storage management, backup solutions, etc. Thanks, iago
I think the most important thing is to design your database well for the information it will contain. After that, it mostly comes down to the amount of storage space you have. One thing you can do is build a test application, find a way to get lots and lots of people to use it, and then see what happens. You could also generate fake data using https://mockaroo.com/. There must also be some books out there, or large public databases you could check out.
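If you'd rather script the fake data yourself instead of using Mockaroo, a small generator is enough to fill a test table. This is just a sketch: the `(id, name, age)` columns and the `random_name` helper are made-up examples, not anything from a real schema, and the CSV output is meant to be loaded with something like MySQL's `LOAD DATA INFILE`.

```python
import csv
import io
import random
import string


def random_name(rng, length=8):
    # Hypothetical helper: a random lowercase string as a stand-in name.
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))


def generate_fake_rows(n, seed=0):
    # Build n (id, name, age) tuples of dummy data for load testing.
    # A fixed seed makes the data reproducible between runs.
    rng = random.Random(seed)
    return [(i, random_name(rng), rng.randint(18, 90)) for i in range(1, n + 1)]


def write_csv(rows):
    # Write the rows to an in-memory CSV with a header row,
    # ready to bulk-load into a test table.
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["id", "name", "age"])
    writer.writerows(rows)
    return buf.getvalue()


rows = generate_fake_rows(1000)
print(len(rows))  # 1000
```

Bump `n` into the millions and you get a table big enough to see how your indexes and queries actually behave.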
I just looked at a table with live data that has 237,715,456 rows, takes up ~20 GB, and is still working just fine on a small VPS. It depends on how you structure your data. I have a VPS-wide backup for now, but yes, sharding the database would be a good idea in the future. Everyone I have read about who runs MySQL at a truly HUGE (global) scale also uses other caching mechanisms, like Redis for example. Hundreds of millions of rows is not that much for a modern server running an optimized MySQL instance. If EXPLAIN no longer shows anything bad for your queries, my next step would be to add more RAM and then run mysqltuner on the server to tweak the database itself. That last step alone will make your database a lot faster, depending on your starting point...
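The Redis-in-front-of-MySQL idea above is usually the cache-aside pattern: check the cache first, and only fall through to the database on a miss. A minimal sketch of that pattern, where a plain dict stands in for Redis and a stub function stands in for the MySQL query (both are just illustrative, not a real client):

```python
import time


class CacheAside:
    # Minimal cache-aside sketch: a dict stands in for Redis,
    # and slow_db_query stands in for a real MySQL lookup.
    def __init__(self, ttl_seconds=60):
        self.store = {}        # key -> (expires_at, value)
        self.ttl = ttl_seconds
        self.db_hits = 0       # counts fall-throughs to the "database"

    def slow_db_query(self, user_id):
        # Stand-in for e.g. SELECT id, name FROM users WHERE id = %s
        self.db_hits += 1
        return {"id": user_id, "name": f"user{user_id}"}

    def get_user(self, user_id):
        now = time.monotonic()
        entry = self.store.get(user_id)
        if entry and entry[0] > now:
            return entry[1]                      # cache hit: no DB work
        value = self.slow_db_query(user_id)      # cache miss: query the DB
        self.store[user_id] = (now + self.ttl, value)  # populate the cache
        return value


cache = CacheAside()
cache.get_user(42)
cache.get_user(42)
print(cache.db_hits)  # 1 -- the second call was served from the cache
```

The TTL matters: it bounds how stale the cached rows can get, which is the trade-off you accept for taking read load off MySQL.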