How did spammers bypass this script's security?

Discussion in 'PHP' started by nikomaster, Jan 2, 2011.

  1. #1
    Hi there,

    Yesterday I found some spam on my website. By default, all user-generated links are detected using this regex:
    <a(.*?)href=["|'](.*?)["|'](.*?)</a>
    In this way the script detects the URL and outputs this code:
    <a href="http://someurl.com" rel="nofollow">SomE Anchor Text</a>
    The only thing I do is add the rel="nofollow" attribute. This gives my site some protection against being devalued by the search engines, and it discourages spammers from posting links on my site, since their links won't count toward their search engine rankings.

    This is done by the WordPress software.

    However, some clever spammer managed to bypass the filter by submitting markup like this:
    <a href="http://spammyurl.com <a href="http://anotherspammyurl.com"">Some Anchor</a>>...</a>
    Apparently this code bypasses the regex, and the following ends up directly in the output along with the rest of the markup:
    &lt;a href=&quot;http://spammyurl.com <a href="http://anotherspammyurl.com"">Some anchor</a> &gt;...&lt; a &gt;
    I have tried testing the spammer's code to find out what he really did to fool the regex, but it does not work when I try it.

    Any thoughts?
     
    nikomaster, Jan 2, 2011
  2. EricBruggema

    #2
    Why not run the filter in a loop and exit only when a pass makes no further changes?
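A minimal sketch of that idea, under stated assumptions: `filter_until_stable()` and the `$stripScript` filter are hypothetical names made up for illustration, and the filter is a deliberately naive one, not the regex from post #1. It only shows why a single pass can be fooled by nested input while repeated passes converge.

```php
<?php
// Hypothetical helper: re-apply a filter until the text stops changing.
// $maxPasses is a safety cap so a misbehaving filter cannot loop forever.
function filter_until_stable($text, callable $filter, $maxPasses = 10)
{
    for ($i = 0; $i < $maxPasses; $i++) {
        $next = $filter($text);
        if ($next === $text) {
            break; // this pass changed nothing, so we are done
        }
        $text = $next;
    }
    return $text;
}

// Naive illustrative filter: delete every literal "<script>".
$stripScript = function ($s) {
    return str_ireplace('<script>', '', $s);
};

// Removing the inner "<script>" glues the outer pieces into a new one.
$nested = '<scr<script>ipt>alert(1)';

echo $stripScript($nested), "\n";                      // <script>alert(1)
echo filter_until_stable($nested, $stripScript), "\n"; // alert(1)
```

Looping only helps when each pass strictly removes something. A filter that adds text on every pass would never stabilise, which is why the sketch caps the number of passes.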
     
    EricBruggema, Jan 2, 2011
  3. nikomaster

    #3
    I just figured out what they did. I only need to remove any HTML from the URL using strip_tags.
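A rough sketch of what that fix could look like. The helper name `clean_link_url()` is made up for illustration, and the extra `htmlspecialchars()` step is my own assumption rather than something described above; it is there so that leftover quotes cannot break out of the href attribute.

```php
<?php
// Hypothetical helper: clean a user-supplied URL before it is written
// into href="...". Not taken from the original script.
function clean_link_url($url)
{
    // Drop any HTML tags embedded in the URL, then trim stray whitespace.
    $url = trim(strip_tags($url));

    // Assumption: also encode quotes, ampersands and angle brackets so the
    // value stays inside the attribute when it is printed.
    return htmlspecialchars($url, ENT_QUOTES, 'UTF-8');
}

echo clean_link_url('http://example.com/<b>page</b>?a=1&b=2'), "\n";
// http://example.com/page?a=1&amp;b=2
```

Note that `strip_tags()` only removes markup. It does not check that what remains is a sensible URL, so a scheme check may still be worth adding.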
     
    nikomaster, Jan 2, 2011
  4. danx10

    #4
    Your code is vulnerable to XSS (look it up).

    Furthermore, the vulnerability looks like it's caused by:

    (.*?)

    within the regex, as that allows any character. Anyone can bypass it by closing the tag and inserting malicious HTML and/or JavaScript.
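To make this concrete, here is a small sketch that runs the regex from post #1 against the spammer's markup and prints what the URL group actually captures. The `first_capture()` helper and the `$strict` pattern are assumptions added for illustration. The stricter pattern is only one possible tightening, not a complete XSS defence.

```php
<?php
// Hypothetical helper: return one capture group of the first match, or null.
function first_capture($pattern, $group, $html)
{
    return preg_match($pattern, $html, $m) ? $m[$group] : null;
}

// The spammer's markup from post #1.
$input = '<a href="http://spammyurl.com <a href="http://anotherspammyurl.com"">Some Anchor</a>>...</a>';

// The original regex, wrapped in ~ delimiters. Group 2 is the "URL".
$loose = '~<a(.*?)href=["|\'](.*?)["|\'](.*?)</a>~';

// Assumed stricter variant: the URL group may not contain quotes,
// angle brackets or whitespace, and the tag itself may not contain < or >.
$strict = '~<a\s[^<>]*?href=(["\'])([^"\'<>\s]+)\1[^<>]*>~i';

echo first_capture($loose, 2, $input), "\n";
// http://spammyurl.com <a href=

echo first_capture($strict, 2, $input), "\n";
// http://anotherspammyurl.com
```

With the loose pattern the "URL" swallows the start of a second tag. The stricter one can only ever capture a value free of quotes, angle brackets and whitespace, although here it latches onto the inner anchor instead of rejecting the input outright.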
     
    danx10, Jan 2, 2011
  5. jazzcho

    #5
    Welcome to the world of filtering. It is NOT easy. :)
     
    jazzcho, Jan 3, 2011
  6. nikomaster

    #6
    I followed the spammer's trail, and he has also spammed some WordPress blogs, successfully posting links without the rel=nofollow attribute. I just do not know what this guy is doing.
     
    nikomaster, Jan 3, 2011
  7. Moustafa.Elkady

    #7
    See this related problem.
    Does this code not work either?
    function nofollow($text){
        return preg_replace('/(<a[\s\r\n]+[^>]+)>/i', '\\1 rel="nofollow">',$text);	
    }
    Please try it.
    source:
    http://www.jooria.com/snippets?snippet=12
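One thing to watch with a snippet like this: if the submitted link already carries its own rel attribute, simply appending another one leaves two rel attributes in the tag. The sketch below is an assumed variant, not code from the linked source. The name `force_nofollow()` is hypothetical. It strips any existing rel attribute first and then adds rel="nofollow".

```php
<?php
// Hypothetical variant: make sure every opening <a ...> tag ends up with
// exactly one rel attribute, and that it is rel="nofollow".
function force_nofollow($text)
{
    return preg_replace_callback('/<a\s+[^>]+>/i', function ($m) {
        // Remove a user-supplied rel attribute (double-quoted,
        // single-quoted or unquoted value).
        $tag = preg_replace(
            '/\srel\s*=\s*("[^"]*"|\'[^\']*\'|[^\s>]+)/i',
            '',
            $m[0]
        );
        // Cut the closing ">" and append our own rel attribute.
        return substr($tag, 0, -1) . ' rel="nofollow">';
    }, $text);
}

echo force_nofollow('<a href="http://example.com">Link</a>'), "\n";
// <a href="http://example.com" rel="nofollow">Link</a>

echo force_nofollow('<a href="http://example.com" rel="external">Link</a>'), "\n";
// <a href="http://example.com" rel="nofollow">Link</a>
```

Like any regex over HTML, this still assumes reasonably well-formed tags, so it is best combined with cleaning the URL itself.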
     
    Moustafa.Elkady, Jan 3, 2011