complicated query

Discussion in 'MySQL' started by Garth Hagerman, Apr 7, 2019.

  1. #1
    It seems like I should be able to do this in one query, but I can't figure out the syntax.
    My site is selling books. In this section, I'm setting up pages which list books sorted by keywords. For each of the keywords, I need to make sure that there are at least two active titles tagged with that keyword before that keyword is included in a linked list. So, each of the keywords with at least two active titles are listed.

    I've got three relevant tables:

    `books` is the table which has the bulk of the info about each title. Primary key is `isbn`.

    `remove` keeps track of which titles are no longer available. Fields are `id`(key), `isbn`, and `day2go`. If an entry in `books` does not have an entry in `remove`, that title is available.

    `topicLU` is a lookup table connecting the `topics` table to `books`. Fields are `id` (key), `isbn`, and `topicid`.

    For this query, we have a value for topicLU.topicid. Call it $topicid. I want a list of books which share the isbn field with `topicLU` but are not in `remove`. I could do this with two queries and a loop, but it seems like I should be able to do it with a left join.

    thanks in advance...
     
    Garth Hagerman, Apr 7, 2019 IP
  2. sarahk

    sarahk iTamer Staff

    Messages:
    28,807
    Likes Received:
    4,534
    Best Answers:
    123
    Trophy Points:
    665
    #2
    select books.title, books.isbn, topicLU.topicid, topic.topic
    from `books`
    inner join topicLU on books.isbn = topicLU.isbn
    inner join topic on topicLU.topicid = topicLU.topicid
    where not exists (select remove.id from `remove` where remove.isbn = books.isbn)
    and exists (select tlu2.topicid, count(b2.isbn) as counter from topicLU as tlu2 inner join books as b2 on tlu2.isbn = b2.isbn group by tlu2.topicid where topicLU.topicid = tlu2.topicid order by tlu2.topicid having counter > 2)
    Code (markup):
    untested, obviously, and I'm racing out the door. If you want to give us a sql fiddle we'll be able to help you write the query properly and test it. http://sqlfiddle.com/ It takes a bit to set it up but well worth going through the process.
     
    sarahk, Apr 7, 2019 IP
  3. Garth Hagerman

    Garth Hagerman Peon

    Messages:
    12
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    1
    #3
    Wow. That's impressive. And sqlfiddle looks like a powerful tool. I'll see if I can get this working. Thanks.
     
    Garth Hagerman, Apr 8, 2019 IP
  4. jestep

    jestep Prominent Member

    Messages:
    3,659
    Likes Received:
    215
    Best Answers:
    19
    Trophy Points:
    330
    #4
    Is the last exists statement necessary? It seems like you could adjust the join statement and then use a where clause in the main query to avoid the second nested query all together. The fewer nested queries, the better. They create a ton of overhead and should be avoided when possible.
     
    jestep, Apr 9, 2019 IP
  5. sarahk

    sarahk iTamer Staff

    Messages:
    28,807
    Likes Received:
    4,534
    Best Answers:
    123
    Trophy Points:
    665
    #5
    I was rushing a bit...

    I created a fiddle and then the site crashed but this is what I had for the schema

    CREATE TABLE `books`( `isbn` INT(11) NOT NULL AUTO_INCREMENT, `title` VARCHAR(150), PRIMARY KEY (`isbn`) ); 
    CREATE TABLE `remove`( `id` INT(11) NOT NULL AUTO_INCREMENT, `isbn` int(11), day2go DATETIME, PRIMARY KEY (`id`) ); 
    CREATE TABLE `topicLU`( `id` INT(11) NOT NULL AUTO_INCREMENT, `isbn` int(11), topicid int(11), PRIMARY KEY (`id`) ); 
    CREATE TABLE `topic`( `id` INT(11) NOT NULL AUTO_INCREMENT, `topic` VARCHAR(150), PRIMARY KEY (`id`) ); 
    
    insert into `books` values (1000, 'Clean & Lean')
    , (1001, 'Where the crawdads sing')
    , (1002, 'Life will be the death of me')
    , (1003,'Becoming Michelle Obama')
    , (1004, 'Diary of an Awesome Friendly Kid')
    , (1005, 'Wolfpack')
    , (1006, 'Girl Stop Apologising');
    insert into `remove` values (1, 1004, now());
    insert into `topic` values (1, 'Youth'), (2,'Biography'), (3,'Fiction'), (4, 'Self Help'), (5,'Health');
    insert into `topicLU` values 
    (1, 1000, 5)
    , (2, 1000, 4)
    , (3, 1006, 4)
    , (4, 1001, 3)
    , (5, 1003, 2)
    , (6, 1004, 1)
    , (7, 1005, 4);
    Code (markup):
    and I created a query that went something like this - but this query doesn't check that the 2 titles that have a specific topic aren't also in the remove list.

    select books.title, books.isbn, topicLU.topicid, topic.topic, count(tlu2.id) as `check`
    from `books`
    inner join topicLU on books.isbn = topicLU.isbn
    inner join topic on topicLU.topicid = topic.id
    inner join topicLU as tlu2 on topic.id = tlu2.topicid
    where not exists (select remove.id from `remove` where remove.isbn = books.isbn)
    group by books.isbn, topic.topic
    order by books.isbn
    having `check` >= 2
    Code (markup):
     
    sarahk, Apr 9, 2019 IP
  6. Garth Hagerman

    Garth Hagerman Peon

    Messages:
    12
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    1
    #6
    Groovy. One thing I've opted to do to improve performance is to add an `active` field to the `topics` table. That way, the complicated query can be in a hidden updating script which sets the value of `active` and which I can run when there have been changes to the available books or tags. As I was conceiving of it originally, the complicated query had to check each topic to see if it's active on the public page which lists all of the [active] topics. Since books are added or removed infrequently, it seems far more efficient to add a simple WHERE `active` = 'yes' to the query on the public page.
     
    Garth Hagerman, Apr 9, 2019 IP