Help optimize a query

Discussion in 'Databases' started by sc_king, Jan 21, 2010.

  1. #1
    Hi guys,

    My query grabs the total movies and pictures which are currently out for a certain gender.
    I only display stars that have at least 1 picture or 1 movie.

    The below works, just wondering if anybody else has a better query or way to speed up my query.

    select id, name, photo_link, views, star_ratings.rating, IFNULL(movies_per_star.total_movies, 0) as total_movies, IFNULL(picturesets_per_star.total_picturesets, 0) as total_picturesets, date_added from stars 
    
    LEFT JOIN (select star_id, count(*) as total_movies from movies, video_stars where movies.date_added < '2010-01-21 18:38:32' and movies.id = video_stars.video_id group by star_id) as movies_per_star ON stars.id = movies_per_star.star_id 
    
    LEFT JOIN (select star_id, count(*) as total_picturesets from picturesets, pictureset_stars where picturesets.date_added < '2010-01-21 18:38:32' and picturesets.id = pictureset_stars.pictureset_id group by star_id) as picturesets_per_star ON stars.id = picturesets_per_star.star_id 
    
    LEFT JOIN star_ratings ON stars.id = star_ratings.star_id where gender = 'f' and (total_movies!=0 or total_picturesets!=0) order by rating desc, date_added desc limit 16 offset 0
    PHP:
    Here's the database structure

    CREATE TABLE `stars` (
    `id` int(11) NOT NULL auto_increment,
    `name` varchar(200) NOT NULL,
    `gender` char(1) NOT NULL,
    `description` varchar(500) NOT NULL,
    `photo_link` varchar(500) NOT NULL,
    `views` int(11) NOT NULL,
    `date_added` datetime NOT NULL,
    PRIMARY KEY (`id`)
    )

    CREATE TABLE `pictureset_stars` (
    `id` int(11) NOT NULL auto_increment,
    `pictureset_id` int(11) NOT NULL,
    `star_id` int(11) NOT NULL,
    PRIMARY KEY (`id`)
    )

    CREATE TABLE `video_stars` (
    `id` int(11) NOT NULL auto_increment,
    `video_id` int(11) NOT NULL,
    `star_id` int(11) NOT NULL,
    PRIMARY KEY (`id`)
    )

    Thanks guys, any help is appreciated.
     
    sc_king, Jan 21, 2010 IP
  2. koko5

    koko5 Active Member

    Messages:
    394
    Likes Received:
    14
    Best Answers:
    1
    Trophy Points:
    70
    #2
    Hi sc_king,

    I'm just wondering:
    -why you use these IFNULL in the final result set with combination with outer joins? It seems inner join for all tables will do the trick...
    -let's check this subquery:
    
    .............................................
    (select star_id, count(*) as total_picturesets from picturesets, pictureset_stars 
    where picturesets.date_added < '2010-01-21 18:38:32' 
    and picturesets.id = pictureset_stars.pictureset_id 
    group by star_id
    ..............................................
    
    Code (markup):
    Rejecting these, which count=0, is the best we can do here before joining :
    
    .............................................
    (select star_id, count(*) as total_picturesets from picturesets, pictureset_stars 
    where picturesets.date_added < '2010-01-21 18:38:32' 
    and picturesets.id = pictureset_stars.pictureset_id 
    group by star_id
    having count(*)>0
    ..............................................
    
    Code (markup):
    Finally, your query is not 100% correct-it returns limit 16 rows but if we have few records only, it will return and these, which have:
    the above conflicts with your condition:
    Regards,
    Nick
     
    koko5, Jan 22, 2010 IP
  3. sc_king

    sc_king Peon

    Messages:
    29
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #3
    Hi koko,

    thanks for the reply.

    let's say if i inner join stars and movies, if the star has no movies the star will be lost before i can inner join the number of pictures.

    As for the ifnull, i want it to output 0 if it's null. If they're both 0, (total_movies!=0 or total_picturesets!=0) will take care of it.

    Do you have an example of how you would create the query?

    Is there anything else I can do? Subqueries? Indexes? I'm not too familiar with using indexes

    Thanks again.
     
    Last edited: Jan 22, 2010
    sc_king, Jan 22, 2010 IP
  4. koko5

    koko5 Active Member

    Messages:
    394
    Likes Received:
    14
    Best Answers:
    1
    Trophy Points:
    70
    #4
    Hi sc_king,

    I think all clear now.

    Maybe this query is better: please try it.
    I've used having count>0 twice before joining to reduce the final result set before where clause.

    You've to add Index for stars.date_added too. Please let me know is it better after indexing.
    
    select id, 
    name, 
    photo_link, 
    views, 
    star_ratings.rating, 
    IFNULL(movies_per_star.total_movies, 0) as total_movies, 
    IFNULL(picturesets_per_star.total_picturesets, 0) as total_picturesets, 
    date_added from stars 
    LEFT JOIN 
    (select star_id, 
    count(*) as total_movies from movies, video_stars 
    where movies.date_added < '2010-01-21 18:38:32' 
    and movies.id = video_stars.video_id 
    group by star_id
    having count(*)>0) as movies_per_star 
    ON stars.id = movies_per_star.star_id 
    LEFT JOIN 
    (select star_id, 
    count(*) as total_picturesets from picturesets, pictureset_stars 
    where picturesets.date_added < '2010-01-21 18:38:32' 
    and picturesets.id = pictureset_stars.pictureset_id 
    group by star_id
    having count(*)>0) as picturesets_per_star 
    ON stars.id = picturesets_per_star.star_id 
    LEFT JOIN 
    star_ratings 
    ON stars.id = star_ratings.star_id 
    where (gender = 'f') and (total_movies or total_picturesets) 
    order by rating desc, 
    date_added desc 
    limit 16 offset 0;
    
    Code (markup):
    Regards,
    Nick
    p.p.: I see 3 create table scripts and we use a total of 5 tables in query.
     
    koko5, Jan 22, 2010 IP
  5. sc_king

    sc_king Peon

    Messages:
    29
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #5
    CREATE TABLE `picturesets` (
    `id` int(11) NOT NULL auto_increment,
    `title` varchar(500) NOT NULL,
    `description` varchar(1000) NOT NULL,
    `date_added` datetime NOT NULL,
    PRIMARY KEY (`id`)
    )

    CREATE TABLE `movies` (
    `id` int(11) NOT NULL auto_increment,
    `description` text NOT NULL,
    `date_added` datetime NOT NULL,
    PRIMARY KEY (`id`)
    )

    having count(*) > 0 didn't help, since it only counts star_ids which exist within the table. if there are no star_ids, result will not include anything.
    So having count(*) > 0 and leaving it out is the same in my case.

    added an index to the date, star_ids, video_ids none of those helped either

    Thanks for your help.
     
    sc_king, Jan 22, 2010 IP
  6. sc_king

    sc_king Peon

    Messages:
    29
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #6
    Is there a way I can use limit $a offset $b on the stars so i won't have to use the whole table? Problem that comes along is that if the star within that range has 0 movies and 0 pictures.
     
    sc_king, Jan 22, 2010 IP
  7. koko5

    koko5 Active Member

    Messages:
    394
    Likes Received:
    14
    Best Answers:
    1
    Trophy Points:
    70
    #7
    Maybe something like this :
    
    SELECT * FROM (
    select id, 
    name, 
    photo_link, 
    views, 
    date_added from stars 
    WHERE id IN(
    select distinct star_id from movies, video_stars 
    where movies.date_added < '2010-01-21 18:38:32' 
    and movies.id = video_stars.video_id 
    UNION
    select distinct star_id from picturesets, pictureset_stars 
    where picturesets.date_added < '2010-01-21 18:38:32' 
    and picturesets.id = pictureset_stars.pictureset_id 
    ) LIMIT 16) A
    LEFT JOIN 
    star_ratings 
    ON A.id = star_ratings.star_id 
    order by 
    rating desc, 
    date_added desc 
    limit 16 offset 0;
    
    Code (markup):
    Edit: The above is the same as checking count>0 in inner join in sub-queries.
    
    SELECT * FROM (
    select id, 
    name, 
    photo_link, 
    views, 
    date_added from stars 
    WHERE id IN(
    select star_id from movies, video_stars 
    where movies.date_added < '2010-01-21 18:38:32' 
    and movies.id = video_stars.video_id 
    GROUP BY star_id
    HAVING COUNT(*)>0
    UNION
    select star_id from picturesets, pictureset_stars 
    where picturesets.date_added < '2010-01-21 18:38:32' 
    and picturesets.id = pictureset_stars.pictureset_id 
    GROUP BY star_id
    HAVING COUNT(*)>0
    ) LIMIT 16) A
    LEFT JOIN 
    star_ratings 
    ON A.id = star_ratings.star_id 
    order by 
    rating desc, 
    date_added desc 
    limit 16 offset 0;
    
    Code (markup):
     
    Last edited: Jan 22, 2010
    koko5, Jan 22, 2010 IP
  8. sc_king

    sc_king Peon

    Messages:
    29
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #8
    Hey koko,

    Thanks for the reply again.

    The IN with 2000 ids seems slow compared to the left joins.
    my .3 second query turned into a .8 second query.

    I've also lost the total_movies and total_pictures with the union.
     
    sc_king, Jan 23, 2010 IP
  9. mastermunj

    mastermunj Well-Known Member

    Messages:
    687
    Likes Received:
    13
    Best Answers:
    0
    Trophy Points:
    110
    #9
    So far I know, you are using too many tables to show information from. IN & UNIONS are slower than JOIN, but when many tables are joined, and as data increases, JOIN can also slower down.

    Will it be possible to split the queries somehow? OR try and keep data preprocessed?

    I know, it sounds funny, but unless you are working on a smaller chunk of data, idea of having one query retrieve data from more than 5 - 6 tables should not be the primary option. Did you try making an SP (Stored Procedure)? I am not so good with SPs but if one can give it a try, this can be improved and also maintenance will be easy.
     
    mastermunj, Jan 23, 2010 IP
  10. rahuldas14

    rahuldas14 Well-Known Member

    Messages:
    679
    Likes Received:
    5
    Best Answers:
    0
    Trophy Points:
    130
    #10
    why do you need to optimize it. Nowadays DBMS optimizes the query automatically impicitly. no matter what you write they do it them selves.
     
    rahuldas14, Jan 23, 2010 IP
  11. sc_king

    sc_king Peon

    Messages:
    29
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #11
    Hey guys,

    As far as i know mysql doesn't really optimize your queries, maybe the more powerful ones but I could be wrong.

    I will look into stored procedures, but my page will be a list of stars. Each page is a different list of 20 stars with their videos and pictures. I don't know if that'll help me anymore.

    I'm going to continue looking for ways to break my query down.

    Thanks for the replies.
     
    Last edited: Jan 23, 2010
    sc_king, Jan 23, 2010 IP
  12. mastermunj

    mastermunj Well-Known Member

    Messages:
    687
    Likes Received:
    13
    Best Answers:
    0
    Trophy Points:
    110
    #12
    Mysql does not anytime optimize query. It however tries to compile the query optimally so that results are retrieved faster.
    So far I believe, Stored Procedure should do the trick and should be given a consideration as breaking down these queries into multiple one will make multiple to and fro through database which can eventually result in reduced performance. It should ideally be one call to database and all data is in php :)
     
    mastermunj, Jan 23, 2010 IP