Hello everyone! I am creating a brand new website where I will be relying heavily on the mySQL databases and a custom search engine (I will be making a post in the search engine and server forums for specific ideas on that). I am not asking for code, just some ideas. I have two projects at the moment. One is a USER video site, and one is a personal video site. The user video site will pretty much be similar to y0utube or any of the likes. The other will be more of a privately owned video site where a few hundred people upload thousands of videos of random sizes to the server. My main worry is SQL crashes. I don't know how to prepare myself for this. You see, I'm a web programmer, not a server guru (I wish I was). I have a decent solution for my private site, since I am the one who will be designing the site and know exactly what we'll have, I can make separate databases and separate tables for each different section (i.e. one for gaming videos, one for tutorials [just random examples, not necessarily this]). The y0utube type of site, I don't know what to do with that. How can I create a table that will be well designed and handle all those users and videos (assume we get the same amount of users as y0utube, just for worst case scenario). I have done some research and found out I have to SHARD the databases, but I have no clue on how or why or anything else for that matter about SHARDING over multiple servers. OPTIMIZATION HELP: My database needs to be search-able, from the keywords to categories to the video titles to even the username. The categories can even have sub categories, but that's at a later topic. How can I optimize my database to handle all this? My keywords column looks similar to: keyword1, keyword2, keyword3. Right now I am using the SQL search query, but I think I can make it faster and better. Also, I DO NOT want to use other people's search engine. Even google's. So many people have told me to do that, but I don't think it'd be a good website with original content if I use someone else's work. Any ideas on crawlers?
I'd do something like this: Table users: id, username, password, other info you want. Table category: id categoryname etc Table video: id, userID, categoryID, video title etc For the search function, you can use the following in search drop box: <option value="categoryID"> Category name </option> Your query: "SELECT video_title FROM videotable WHERE categoryID='$categoryID'"; Same applies to keywords, video titles and similar. For users (This probably will be in a link) <a href="search.php?user=userid"> All videos by username </a> query will be similar to above, just a different WHERE clause. Regards
Hey, thanks for the quick response. However, while I do have the set up pretty much like you just gave me, the search comes out slow when I have more than 1000 videos. I've tested searching simultaneously over 20 PCs on LAN, and it comes out REAL slow.
suggest indexing on your main tables, also optimising main tables using normalisation would be handy. Perhaps move your MySQL server to a dedicated machine rather than shared with apache if the option allows itself? With regard to normalisation of tables, if you can post your schema, I'd be happy to take a quick eye through them and put them into 3NF/BCF where-ever I can.
Here are my SQL prototypes: comments: vid(video id), uid(poster's id), comment(the comment), time(time posted UNIX timestamp), flag(if marked as incorrect or indecent) users: id, username, display(this is how they initially registered), password, email, ip(ip they registered with), joined (date joined), birthday, country,gender, status (banned, user, admin) videos: id, name, location, screenshot, poster, date, category, keywords, status, parts(id's of parts if this has more than one), description, views(views specific for the video), rating (rating specific for the video) views (helps prevent multiposting from users OR ip addresses): id, videoid, ip, username rating (helps prevent multiposting from users OR ip addresses): id, videoid, value, ip, username that's me schema. this is, of course, just a prototype, but it's pretty much in the final stages... the coding works nicely with the site im testing it all (with dummy entries)
I'm looking at videos (a video can belong to more than one category, therefore you could have a category subtable: catid, category followed by a link table: linkid, catid, videoid) as that looks like it could be a transisitive dependency, likewise for column parts because you're saying that if it's a series of videos, then store all the id's of those videos in one column when you should pad it out to another table linking them - you'd already be giving yourself more work to do... and possibly comments that can be normalised further. I'll reply later with a bit more fleshed out answer for you.