parsing keyword

Discussion in 'PHP' started by kohashi, Apr 22, 2007.

  1. #1
    I was wondering if anyone has found an *easy* way to guess what an article's keywords might be.

    What I am trying to do is take an article, have a PHP script run over it, and guess what the keyword(s) are. I don't want it calling Google or another source and brute-forcing its way to a keyword. Has anyone done something like this, and what sort of algorithm did you use?
     
    kohashi, Apr 22, 2007 IP
  2. commandos

    commandos Notable Member

    Messages:
    3,648
    Likes Received:
    329
    Best Answers:
    0
    Trophy Points:
    280
    #2
    What do you mean by:

    what keywords might be from an article

    and then guess what keyword(s) are

    Are what?
     
    commandos, Apr 22, 2007 IP
  3. kohashi

    kohashi Well-Known Member

    Messages:
    1,198
    Likes Received:
    41
    Best Answers:
    0
    Trophy Points:
    140
    #3
    Given an article (length irrelevant) which has a title and a body, I am looking for an easy way to figure out the topic (keyword(s)) of the article.
     
    kohashi, Apr 22, 2007 IP
  4. commandos

    #4
    I don't know if this can help you:

    DEMO

    You give it the URL and it gives you the keywords...
     
    commandos, Apr 22, 2007 IP
  5. ErectADirectory

    ErectADirectory Guest

    Messages:
    656
    Likes Received:
    65
    Best Answers:
    0
    Trophy Points:
    0
    #5
    Nice hack, commandos. This would work, but there is probably a more elegant way to do it.

    Keywords are not as useful for this stuff as key phrases are. I would probably search for 2-5 word strings that are popular on the page. From there I would do the same for the title alone and let it count for 3x as much as key phrases in the body. Your title is very important for SEO, as it's at the top of the page and in the browser (the <h1> and <title> tags).
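
    The phrase idea above could be sketched roughly like this. It is only a minimal illustration, not a tested tool: the function names (tokenize, count_phrases, guess_key_phrases) and the sample title and body are made up for the example, and it assumes the title and body are already available as plain strings.

```php
<?php
// Hypothetical helper: strip markup and split text into lowercase word tokens.
function tokenize(string $text): array
{
    $text = strtolower(strip_tags($text));
    preg_match_all('/[a-z0-9]+/', $text, $matches);
    return $matches[0];
}

// Hypothetical helper: count every run of $minWords..$maxWords consecutive
// words, adding $weight to the phrase's score for each occurrence.
function count_phrases(array $words, int $weight, array $counts = [], int $minWords = 2, int $maxWords = 5): array
{
    $total = count($words);
    for ($n = $minWords; $n <= $maxWords; $n++) {
        for ($i = 0; $i + $n <= $total; $i++) {
            $phrase = implode(' ', array_slice($words, $i, $n));
            $counts[$phrase] = ($counts[$phrase] ?? 0) + $weight;
        }
    }
    return $counts;
}

// Score phrases from the title (weight 3) and the body (weight 1),
// then return the highest-scoring ones.
function guess_key_phrases(string $title, string $body, int $top = 5): array
{
    $counts = count_phrases(tokenize($title), 3);
    $counts = count_phrases(tokenize($body), 1, $counts);
    arsort($counts);
    return array_slice($counts, 0, $top, true);
}

// Made-up sample input, purely for illustration.
$phrases = guess_key_phrases(
    'Parsing keywords in PHP',
    'Parsing keywords is easy. Parsing keywords in PHP needs no external service.'
);
print_r($phrases);
```

    Counting every 2-5 word window is simple but noisy: filler phrases will score too unless you filter them out, and a long article produces a lot of candidate phrases.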

    This tool would be useful for many reasons; I'll watch the thread to keep tabs on it.
     
    ErectADirectory, Apr 22, 2007 IP
  6. commandos

    #6
    I was just messing with the curl functions :) a looooong time ago.
     
    commandos, Apr 22, 2007 IP
  7. ErectADirectory

    #7
    Well, break it back out and add a little of this:

    PHP:
    $submitted_page = file_get_contents($_GET['url']);
    // get title: capture the text between the <title> tags
    $title = '';
    if (preg_match('/<title[^>]*>(.*?)<\/title>/is', $submitted_page, $temp)) {
        // keep only letters, digits, spaces and basic punctuation (ereg_replace is gone in PHP 7+)
        $title = preg_replace('/[^A-Za-z0-9 ?!.,-]/', '', $temp[1]);
    }
    // repeat the title so its words count three times
    $title = $title . " " . $title . " " . $title;

    Then feed the script the $title first, ahead of the body. This way it gives 3x more value to the title than the body.
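
    To show what feeding the title first might look like end to end, here is a rough sketch. It assumes "the script" is nothing more than a plain word counter; count_keywords, the short stop-word list, and the sample title and body are all hypothetical stand-ins, not the code behind the DEMO.

```php
<?php
// Hypothetical word counter standing in for "the script".
function count_keywords(string $text, array $stopWords, int $top = 5): array
{
    // Strip markup, lowercase, and keep words of three or more letters.
    preg_match_all('/[a-z]{3,}/', strtolower(strip_tags($text)), $matches);
    // Drop common filler words.
    $words = array_diff($matches[0], $stopWords);
    // Tally how often each remaining word appears, most frequent first.
    $counts = array_count_values($words);
    arsort($counts);
    return array_slice($counts, 0, $top, true);
}

// Assumed, deliberately short stop-word list; a real one would be much longer.
$stopWords = ['the', 'and', 'for', 'with', 'that', 'this', 'are', 'you'];

// Made-up sample input, purely for illustration.
$title = 'Keyword parsing with PHP';
$body  = '<p>This article covers parsing for the web. Parsing is fun and PHP helps.</p>';

// Title first, repeated three times, so each title word counts 3x.
$text = $title . ' ' . $title . ' ' . $title . ' ' . $body;
$keywords = count_keywords($text, $stopWords);
print_r($keywords);
```

    Repeating the title is a cheap trick: it needs no change to the counter itself, but the weight is fixed at whatever number of copies you paste in.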

    If you want to take this further, I know I have a script lying around somewhere that does this for phrases and not just words. Now where did I put that...
     
    ErectADirectory, Apr 22, 2007 IP