1. Advertising
    y u no do it?

    Advertising (learn more)

    Advertise virtually anything here, with CPM banner ads, CPM email ads and CPC contextual links. You can target relevant areas of the site and show ads based on geographical location of the user if you wish.

    Starts at just $1 per CPM or $0.10 per CPC.

complicated query

Discussion in 'MySQL' started by Garth Hagerman, Apr 7, 2019.

  1. #1
    It seems like I should be able to do this in one query, but I can't figure out the syntax.
    My site is selling books. In this section, I'm setting up pages which list books sorted by keywords. For each of the keywords, I need to make sure that there are at least two active titles tagged with that keyword before that keyword is included in a linked list. So, each of the keywords with at least two active titles are listed.
    SEMrush
    I've got three relevant tables:

    `books` is the table which has the bulk of the info about each title. Primary key is `isbn`.

    `remove` keeps track of which titles are no longer available. Fields are `id`(key), `isbn`, and `day2go`. If an entry in `books` does not have an entry in `remove`, that title is available.

    `topicLU` is a lookup table connecting the `topics` table to `books`. Fields are `id` (key), `isbn`, and `topicid`.

    For this query, we have a value for topicLU.topicid. Call it $topicid. I want a list of books which share the isbn field with `topicLU` but are not in `remove`. I could do this with two queries and a loop, but it seems like I should be able to do it with a left join.

    thanks in advance...
     
    Garth Hagerman, Apr 7, 2019 IP
    SEMrush
  2. sarahk

    sarahk iTamer Staff

    Messages:
    24,831
    Likes Received:
    3,365
    Best Answers:
    98
    Trophy Points:
    615
    #2
    select books.title, books.isbn, topicLU.topicid, topic.topic
    from `books`
    inner join topicLU on books.isbn = topicLU.isbn
    inner join topic on topicLU.topicid = topicLU.topicid
    where not exists (select remove.id from `remove` where remove.isbn = books.isbn)
    and exists (select tlu2.topicid, count(b2.isbn) as counter from topicLU as tlu2 inner join books as b2 on tlu2.isbn = b2.isbn group by tlu2.topicid where topicLU.topicid = tlu2.topicid order by tlu2.topicid having counter > 2)
    Code (markup):
    untested, obviously, and I'm racing out the door. If you want to give us a sql fiddle we'll be able to help you write the query properly and test it. http://sqlfiddle.com/ It takes a bit to set it up but well worth going through the process.
     
    sarahk, Apr 7, 2019 IP
  3. Garth Hagerman

    Garth Hagerman Peon

    Messages:
    12
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    1
    #3
    Wow. That's impressive. And sqlfiddle looks like a powerful tool. I'll see if I can get this working. Thanks.
     
    Garth Hagerman, Apr 8, 2019 IP
  4. jestep

    jestep Prominent Member Premium Member

    Messages:
    3,656
    Likes Received:
    214
    Best Answers:
    19
    Trophy Points:
    330
    #4
    Is the last exists statement necessary? It seems like you could adjust the join statement and then use a where clause in the main query to avoid the second nested query all together. The fewer nested queries, the better. They create a ton of overhead and should be avoided when possible.
     
    jestep, Apr 9, 2019 IP
  5. sarahk

    sarahk iTamer Staff

    Messages:
    24,831
    Likes Received:
    3,365
    Best Answers:
    98
    Trophy Points:
    615
    #5
    I was rushing a bit...

    I created a fiddle and then the site crashed but this is what I had for the schema

    CREATE TABLE `books`( `isbn` INT(11) NOT NULL AUTO_INCREMENT, `title` VARCHAR(150), PRIMARY KEY (`isbn`) ); 
    CREATE TABLE `remove`( `id` INT(11) NOT NULL AUTO_INCREMENT, `isbn` int(11), day2go DATETIME, PRIMARY KEY (`id`) ); 
    CREATE TABLE `topicLU`( `id` INT(11) NOT NULL AUTO_INCREMENT, `isbn` int(11), topicid int(11), PRIMARY KEY (`id`) ); 
    CREATE TABLE `topic`( `id` INT(11) NOT NULL AUTO_INCREMENT, `topic` VARCHAR(150), PRIMARY KEY (`id`) ); 
    
    insert into `books` values (1000, 'Clean & Lean')
    , (1001, 'Where the crawdads sing')
    , (1002, 'Life will be the death of me')
    , (1003,'Becoming Michelle Obama')
    , (1004, 'Diary of an Awesome Friendly Kid')
    , (1005, 'Wolfpack')
    , (1006, 'Girl Stop Apologising');
    insert into `remove` values (1, 1004, now());
    insert into `topic` values (1, 'Youth'), (2,'Biography'), (3,'Fiction'), (4, 'Self Help'), (5,'Health');
    insert into `topicLU` values 
    (1, 1000, 5)
    , (2, 1000, 4)
    , (3, 1006, 4)
    , (4, 1001, 3)
    , (5, 1003, 2)
    , (6, 1004, 1)
    , (7, 1005, 4);
    Code (markup):
    and I created a query that went something like this - but this query doesn't check that the 2 titles that have a specific topic aren't also in the remove list.

    select books.title, books.isbn, topicLU.topicid, topic.topic, count(tlu2.id) as `check`
    from `books`
    inner join topicLU on books.isbn = topicLU.isbn
    inner join topic on topicLU.topicid = topic.id
    inner join topicLU as tlu2 on topic.id = tlu2.topicid
    where not exists (select remove.id from `remove` where remove.isbn = books.isbn)
    group by books.isbn, topic.topic
    order by books.isbn
    having `check` >= 2
    Code (markup):
     
    sarahk, Apr 9, 2019 IP
  6. Garth Hagerman

    Garth Hagerman Peon

    Messages:
    12
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    1
    #6
    Groovy. One thing I've opted to do to improve performance is to add an `active` field to the `topics` table. That way, the complicated query can be in a hidden updating script which sets the value of `active` and which I can run when there have been changes to the available books or tags. As I was conceiving of it originally, the complicated query had to check each topic to see if it's active on the public page which lists all of the [active] topics. Since books are added or removed infrequently, it seems far more efficient to add a simple WHERE `active` = 'yes' to the query on the public page.
     
    Garth Hagerman, Apr 9, 2019 IP