Hi guys, My query grabs the total movies and pictures which are currently out for a certain gender. I only display stars that have at least 1 picture or 1 movie. The below works, just wondering if anybody else has a better query or way to speed up my query. select id, name, photo_link, views, star_ratings.rating, IFNULL(movies_per_star.total_movies, 0) as total_movies, IFNULL(picturesets_per_star.total_picturesets, 0) as total_picturesets, date_added from stars LEFT JOIN (select star_id, count(*) as total_movies from movies, video_stars where movies.date_added < '2010-01-21 18:38:32' and movies.id = video_stars.video_id group by star_id) as movies_per_star ON stars.id = movies_per_star.star_id LEFT JOIN (select star_id, count(*) as total_picturesets from picturesets, pictureset_stars where picturesets.date_added < '2010-01-21 18:38:32' and picturesets.id = pictureset_stars.pictureset_id group by star_id) as picturesets_per_star ON stars.id = picturesets_per_star.star_id LEFT JOIN star_ratings ON stars.id = star_ratings.star_id where gender = 'f' and (total_movies!=0 or total_picturesets!=0) order by rating desc, date_added desc limit 16 offset 0 PHP: Here's the database structure CREATE TABLE `stars` ( `id` int(11) NOT NULL auto_increment, `name` varchar(200) NOT NULL, `gender` char(1) NOT NULL, `description` varchar(500) NOT NULL, `photo_link` varchar(500) NOT NULL, `views` int(11) NOT NULL, `date_added` datetime NOT NULL, PRIMARY KEY (`id`) ) CREATE TABLE `pictureset_stars` ( `id` int(11) NOT NULL auto_increment, `pictureset_id` int(11) NOT NULL, `star_id` int(11) NOT NULL, PRIMARY KEY (`id`) ) CREATE TABLE `video_stars` ( `id` int(11) NOT NULL auto_increment, `video_id` int(11) NOT NULL, `star_id` int(11) NOT NULL, PRIMARY KEY (`id`) ) Thanks guys, any help is appreciated.
Hi sc_king, I'm just wondering: -why you use these IFNULL in the final result set with combination with outer joins? It seems inner join for all tables will do the trick... -let's check this subquery: ............................................. (select star_id, count(*) as total_picturesets from picturesets, pictureset_stars where picturesets.date_added < '2010-01-21 18:38:32' and picturesets.id = pictureset_stars.pictureset_id group by star_id .............................................. Code (markup): Rejecting these, which count=0, is the best we can do here before joining : ............................................. (select star_id, count(*) as total_picturesets from picturesets, pictureset_stars where picturesets.date_added < '2010-01-21 18:38:32' and picturesets.id = pictureset_stars.pictureset_id group by star_id having count(*)>0 .............................................. Code (markup): Finally, your query is not 100% correct-it returns limit 16 rows but if we have few records only, it will return and these, which have: the above conflicts with your condition: Regards, Nick
Hi koko, thanks for the reply. let's say if i inner join stars and movies, if the star has no movies the star will be lost before i can inner join the number of pictures. As for the ifnull, i want it to output 0 if it's null. If they're both 0, (total_movies!=0 or total_picturesets!=0) will take care of it. Do you have an example of how you would create the query? Is there anything else I can do? Subqueries? Indexes? I'm not too familiar with using indexes Thanks again.
Hi sc_king, I think all clear now. Maybe this query is better: please try it. I've used having count>0 twice before joining to reduce the final result set before where clause. You've to add Index for stars.date_added too. Please let me know is it better after indexing. select id, name, photo_link, views, star_ratings.rating, IFNULL(movies_per_star.total_movies, 0) as total_movies, IFNULL(picturesets_per_star.total_picturesets, 0) as total_picturesets, date_added from stars LEFT JOIN (select star_id, count(*) as total_movies from movies, video_stars where movies.date_added < '2010-01-21 18:38:32' and movies.id = video_stars.video_id group by star_id having count(*)>0) as movies_per_star ON stars.id = movies_per_star.star_id LEFT JOIN (select star_id, count(*) as total_picturesets from picturesets, pictureset_stars where picturesets.date_added < '2010-01-21 18:38:32' and picturesets.id = pictureset_stars.pictureset_id group by star_id having count(*)>0) as picturesets_per_star ON stars.id = picturesets_per_star.star_id LEFT JOIN star_ratings ON stars.id = star_ratings.star_id where (gender = 'f') and (total_movies or total_picturesets) order by rating desc, date_added desc limit 16 offset 0; Code (markup): Regards, Nick p.p.: I see 3 create table scripts and we use a total of 5 tables in query.
CREATE TABLE `picturesets` ( `id` int(11) NOT NULL auto_increment, `title` varchar(500) NOT NULL, `description` varchar(1000) NOT NULL, `date_added` datetime NOT NULL, PRIMARY KEY (`id`) ) CREATE TABLE `movies` ( `id` int(11) NOT NULL auto_increment, `description` text NOT NULL, `date_added` datetime NOT NULL, PRIMARY KEY (`id`) ) having count(*) > 0 didn't help, since it only counts star_ids which exist within the table. if there are no star_ids, result will not include anything. So having count(*) > 0 and leaving it out is the same in my case. added an index to the date, star_ids, video_ids none of those helped either Thanks for your help.
Is there a way I can use limit $a offset $b on the stars so i won't have to use the whole table? Problem that comes along is that if the star within that range has 0 movies and 0 pictures.
Maybe something like this : SELECT * FROM ( select id, name, photo_link, views, date_added from stars WHERE id IN( select distinct star_id from movies, video_stars where movies.date_added < '2010-01-21 18:38:32' and movies.id = video_stars.video_id UNION select distinct star_id from picturesets, pictureset_stars where picturesets.date_added < '2010-01-21 18:38:32' and picturesets.id = pictureset_stars.pictureset_id ) LIMIT 16) A LEFT JOIN star_ratings ON A.id = star_ratings.star_id order by rating desc, date_added desc limit 16 offset 0; Code (markup): Edit: The above is the same as checking count>0 in inner join in sub-queries. SELECT * FROM ( select id, name, photo_link, views, date_added from stars WHERE id IN( select star_id from movies, video_stars where movies.date_added < '2010-01-21 18:38:32' and movies.id = video_stars.video_id GROUP BY star_id HAVING COUNT(*)>0 UNION select star_id from picturesets, pictureset_stars where picturesets.date_added < '2010-01-21 18:38:32' and picturesets.id = pictureset_stars.pictureset_id GROUP BY star_id HAVING COUNT(*)>0 ) LIMIT 16) A LEFT JOIN star_ratings ON A.id = star_ratings.star_id order by rating desc, date_added desc limit 16 offset 0; Code (markup):
Hey koko, Thanks for the reply again. The IN with 2000 ids seems slow compared to the left joins. my .3 second query turned into a .8 second query. I've also lost the total_movies and total_pictures with the union.
So far I know, you are using too many tables to show information from. IN & UNIONS are slower than JOIN, but when many tables are joined, and as data increases, JOIN can also slower down. Will it be possible to split the queries somehow? OR try and keep data preprocessed? I know, it sounds funny, but unless you are working on a smaller chunk of data, idea of having one query retrieve data from more than 5 - 6 tables should not be the primary option. Did you try making an SP (Stored Procedure)? I am not so good with SPs but if one can give it a try, this can be improved and also maintenance will be easy.
why do you need to optimize it. Nowadays DBMS optimizes the query automatically impicitly. no matter what you write they do it them selves.
Hey guys, As far as i know mysql doesn't really optimize your queries, maybe the more powerful ones but I could be wrong. I will look into stored procedures, but my page will be a list of stars. Each page is a different list of 20 stars with their videos and pictures. I don't know if that'll help me anymore. I'm going to continue looking for ways to break my query down. Thanks for the replies.
Mysql does not anytime optimize query. It however tries to compile the query optimally so that results are retrieved faster. So far I believe, Stored Procedure should do the trick and should be given a consideration as breaking down these queries into multiple one will make multiple to and fro through database which can eventually result in reduced performance. It should ideally be one call to database and all data is in php