Php script to help find one specific sentence or "string" in a massive table

Discussion in 'PHP' started by Misanthr, Dec 10, 2011.

  1. #1
    I have a huge problem, well not really huge, just annoying. I have a site that shows quotes from interviews. I host both the quotes, and the interviews on the same site. Most quotes (2400) are linked properly, but 500+ are not linked either because the quotes were added before we decided to link them to the interviews, or the interviews are not on the site itself. a Quick query:

    select source from quotes_table where interview_id='0' and type_id = '1' order by source
    Interview id = 0 means its not linked, and type_id = 1 means interviews, as there are lyrical quotes as well.
    AFFECTED ROWS: 530

    There are a total of 2251 quotes on the site
    1747 of which are from interviews. so just over 1/3rd are not linked

    Im not worried about how much ram, time it would take to find what quote matches what interview if the interview exists on the site itself.
    we have a total of 236 interviews on the site.


    so for example the quote "I'm not against God. I'm against the Misuse of God" is on my site but that quote is not attached to an interview. I am looking for a script where I can find the quotes with no interviews attached, add them to a var (no post box even needed) and scan the ENTIRE interview table until that exact quote matches (say 90%) or more of string match.

    The output would be the quote from the quotes table, and the id, source, year, month, day from interviews (table) the id from interviews table would be the interview_id for the quotes table. They might show up in multiple tables, which i would like to know but if not one is fine.

    Could someone please take a few mins of their time to make a script for me? Sorry for the winded post, but I just smoked my last bowl, I have been at this for hours trying to match something from the quotes table to the interviews table (the sources are duplicated at the moment) that will be fixed and only the interview source will be used. but the sources do not even match up.

    Please help if you can. My brain hurts. Thank you
     
    Misanthr, Dec 10, 2011 IP
  2. Rukbat

    Rukbat Well-Known Member

    Messages:
    2,908
    Likes Received:
    37
    Best Answers:
    51
    Trophy Points:
    125
    #2
    You don't scan the table, you do a LIKE select. (Or, depending on the database, what indexes you have and where the quotes are, a BEGINS WITH or ENDS WITH search.

    
    SELECT INTERVIEW_ID FROM INTERVIEWS WHERE INTERVIEW LIKE '%I''m not against God. I'm against the Misuse of God%';
    
    Code (markup):
    (Note that single quotes have to be quoted.) That gives you a single row (unless you have 2 interviews with that string). LIKE is a linear search, so it takes a while. BEGINS and ENDS use indexes so they're a lot faster.
     
    Rukbat, Dec 10, 2011 IP
  3. Misanthr

    Misanthr Peon

    Messages:
    35
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #3
    I ran the query

    SELECT id FROM interviews WHERE interview LIKE '%I''m not against God. I''m against the Misuse of God%';
    Return:AFFECTED ROWS: 0

    So then I ran this query

    SELECT id FROM interviews WHERE interview LIKE '%I like myself fine half the time. When I don''t like myself, it''s pretty evident from my behavior.%';

    Which I know is linked to the interview table correctly, yet returns the same thing. Affected rows 0. This shits a bit advanced for me, and my partner is out on a weekend bender. I was hoping to get it fixed by before he got home

    the site in question is mansonquotes dot com
    when a quote is viewed, the Source will be a link if it links to our interview section from where the quote came from. all new quotes we add do that. If it is grey, then it is not linked, or the interview does not exist. I tried twice to get it to work but it was way too tiring.

    Ill add TWO retarded, and i mean retarded attempts at this. and both failed. derp1 was more complete, then with derp2 i got annoyed and said fuck it. I never even expected a response actually. Warning, my formatting is real off as i was rushing this View attachment derp.php View attachment derp2.php

    they are not completed at all, i just got annoyed. Im out of smoke now. but my code might give you an example of what i am trying to accomplish

    thank fuck with the 200+ interviews we do not have to worry about this bullshit. linking source between the tables wont work as its not an exact match, interview table has many fields. source,day,month,year etc. where the quotes_table has all that info mushed into just the source table which is fine for lyrical, but a mess for quotes
     
    Last edited: Dec 10, 2011
    Misanthr, Dec 10, 2011 IP
  4. ro2biz

    ro2biz Peon

    Messages:
    4
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    1
    #4
    in my opinion even if it's a bit late you should do the following.

    create a php script which explodes the sentence into separate words:

    $yoursentence = "I'm not against God. I'm against the Misuse of God";
    $avar = explode(" ",$yoursentence);
    $num = count($avar); //get the amount of results

    now let's build us a query.

    $whereclause ='';
    for($i=0;$i<$num;$i++){
    $whereclause .= " && `field` LIKE '%".$avar[$i]."%' ";
    }

    now that we have the where clause defined we should put the query together and we'll have the following:

    $query = "SELECT `id` FROM `table` WHERE `id`>'0' ".$whereclause."";

    I put the `id`>'0' because the for function does not know that the 1st item in a where clause shouldn't have &&.

    When ran the script will return a query like the following:
    SELECT `id` FROM `table` WHER `id`>'0' && `field` LIKE '%I'm%' && `field` LIKE '%not%' && `field` LIKE etc etc etc

    This is somehow what you need if it ain't an exact match.
     
    ro2biz, May 10, 2013 IP
  5. deathshadow

    deathshadow Acclaimed Member

    Messages:
    9,732
    Likes Received:
    1,999
    Best Answers:
    253
    Trophy Points:
    515
    #5
    Not if like a good little dooby you are using prepared queries. Of course then you can axe the string addition nonsense and reuse the same search prepare.

    Guessing wildly on table structure, but let's say you've got a $db var that's an initialized PDO object...

    $quotes = $db->query('
    	SELECT id,text FROM quotes
    	WHERE interview_id = 0
    ');
    
    $interviews = $db->prepare('
    	SELECT id FROM interviews
    	WHERE interview LIKE :quote
    ');
    
    while ($quote = $quotes->fetch()) {
    
    	$interviews->execute(array(
    		':quote' = > $quote['text']
    	));
    	
    	while ($interview = $interviews->fetch()) {
    		/*
    			do whatever it is you are doing for processing/plugging in ID's
    			here. NOT sure how or even if you are set up to handle that the
    			same quote could appear in multiple different interviews -- I'd
    			probably have in quotes a a list of interviews stored using
    			serialize
    		*/
    	}
    }[/quote]
    
    That could probably all be made one query, but that takes me a bit out of my comfort zone.
    Code (markup):
    Oops, didn't even notice this was a bump.
     
    deathshadow, May 10, 2013 IP