Similarity according to tags

Discussion in 'Databases' started by ars_oguz, Sep 14, 2007.

  1. #1
    Hi,
    I have a video table and a tag table. How can i select similar videos according to tags order by most number of shared tags.

    Like video 1 has a b c d tags
    video 2 has b
    video 3 has a b
    video 4 has a b c
    video 5 has a d e f

    the order should be 4 > 3 > 2 > 5 > 1

    thanks for your help...
     
    ars_oguz, Sep 14, 2007 IP
  2. JenniP

    JenniP Peon

    Messages:
    250
    Likes Received:
    15
    Best Answers:
    0
    Trophy Points:
    0
    #2
    Its not exactly the same as yours but it does do what you ask.

    SELECT
    Video_1.Name AS LinkName,
    COUNT(Video_1.Name) AS LinkCount
    FROM
    dbo.Tags
    INNER JOIN
    dbo.Video
    ON
    dbo.Tags.VideoId = dbo.Video.Id
    INNER JOIN
    dbo.Tags AS Tags_1
    ON
    dbo.Tags.tag = Tags_1.tag
    INNER JOIN
    dbo.Video AS Video_1
    ON
    Tags_1.VideoId = Video_1.Id
    WHERE
    (dbo.Video.Id = 1)
    AND
    dbo.Video.Id <> Video_1.Id
    GROUP BY
    Video_1.Name
    ORDER BY
    COUNT(Video_1.Name) DESC

    Which gives the result

    Video Name Shared Tag Count
    My Test 4 3
    My Test 5 2
    My Test 3 2
    My Test 2 1

    How it works:

    Firstly it joins Video and Tags to get the tags for the searched for Video. Then it joins Tags back onto itself (Tags_1) so you will get all the items that have the same tags as the original Video. From there it joins back on to Video (Video_1) to show the videos with the same tag. Finally it filters the original video selected from the list in Video_1 otherwise you will always see the video you are looking at top of the tag list.

    If your going to do a query like this though, it will be worth indexing the tags table on both the Video foreign key and the tags with a lot of videos and tags it will end up being fairly slow without the indexes.

    Jen
     
    JenniP, Sep 15, 2007 IP