How to prevent faulty characters at the end of the URLs

Discussion in 'Site & Server Administration' started by tankard, Jun 20, 2016.

  1. #1
    Hi,
    I have a very bad template that messes up my server responses. Unfortunately, I can't change the template right now. Is there an .htaccess fix for this problem?

    The problem is that you can make up any url you wish.
    For example, this is the correct page url:
    example.com/category/page?id=123&gid=456
    If you were to type in something like this:
    example.com/category/page?id=123&gid=456RandomCharacters
    it should return a 404 response, right? In my case it returns a 200 response.

    So, for every page, you can have billions of different urls and they will all be valid.
    How do I prevent people and bots from adding characters to my urls?
    thanks.
     
    tankard, Jun 20, 2016
  2. PoPSiCLe

    #2
    Actually, no. The ?id and &gid are parsed by the page in question - and those are all valid. Why? Because the code on the page doesn't check the $_GET-variables.

    It will be VERY hard to do anything about this in .htaccess if you have many valid links, since you'd have to whitelist every valid pattern and reject everything else. If the values are only numbers, and you always know the length, you can probably check for that, but my suggestion would be to rewrite whatever code reads the values so that it validates them and returns an error if they don't pan out.

    I am at least hoping that if you put in an invalid ?id, then you get a 404? So if I put in ?id=4005343433443 it won't return anything, or give you a 404? If so, all you need to worry about is the &gid-part.
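
For the numbers-only case mentioned above, a rough .htaccess sketch might look like the following. It assumes mod_rewrite is enabled, that the .htaccess file sits in the document root, that the page lives at category/page (taken from the example URL in the first post), and that both id and gid are digits only and always appear in this order. None of that is confirmed in the thread, so adjust it to the real site.

```apache
# Sketch only: path and parameter format are assumptions based on the example URL.
RewriteEngine On

# Anything other than exactly id=<digits>&gid=<digits> counts as invalid,
# including trailing junk such as gid=456RandomCharacters.
RewriteCond %{QUERY_STRING} !^id=[0-9]+&gid=[0-9]+$

# In .htaccess context the pattern is matched without the leading slash.
# A non-3xx status in the R flag drops the substitution and returns that status.
RewriteRule ^category/page$ - [R=404,L]
```

This only checks the shape of the query string. It cannot tell whether a given id actually exists, so that part would still have to be handled by the page code. If the site already has other rewrite rules, this one would probably need to come before them.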
     
    PoPSiCLe, Jun 24, 2016
  3. satoved

    #3
    I see two ways of solving this problem:
    1) Use canonical URLs and robots.txt to keep search engines from indexing these unwanted URLs
    2) Fix it in the PHP code, which should check these GET parameters and return a 404 page if something is wrong
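
A minimal sketch of option 2, assuming the page expects exactly two parameters, id and gid, and that both are plain digit strings (as in the example URL from the first post). The function name is_valid_page_query is made up for illustration and is not part of any template or framework.

```php
<?php
// Hypothetical helper: true only when the query holds exactly "id" and "gid",
// each a non-empty string of digits.
function is_valid_page_query(array $query): bool
{
    // Reject missing or extra parameters.
    $keys = array_keys($query);
    sort($keys);
    if ($keys !== ['gid', 'id']) {
        return false;
    }

    foreach (['id', 'gid'] as $name) {
        $value = $query[$name];
        // Arrays (e.g. ?id[]=1) and strings with non-digit characters are rejected.
        if (!is_string($value) || preg_match('/\A[0-9]+\z/', $value) !== 1) {
            return false;
        }
    }

    return true;
}

// Possible use at the top of the page script, before any output is sent:
//
//     if (!is_valid_page_query($_GET)) {
//         http_response_code(404);
//         exit;
//     }
```

With this check, ?id=123&gid=456RandomCharacters would get a 404 instead of a 200. It does not confirm that the id and gid exist in the database, and it accepts the parameters in either order, so a canonical URL (option 1) may still be worth adding.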
     
    satoved, Jun 26, 2016