Storing pageviews in MySQL

Discussion in 'MySQL' started by dadougalee, May 20, 2008.

  1. #1
    Hey, I was thinking about writing a pageview php script that stores each pageview in a MySQL database that includes the timestamp, users IP, site page, and such.

    I can obviously predict that the amount of items that are going to be written to this database will be incredibly large. I was curious if this was an OK way to use MySQL or should I think of another method?

    Also, what kind of maintenance would be involved with something like this? Would their be an index overflow quickly? Would I have to export the items frequently to make room for new items?

    MySQL is not really my specialty, I'm hoping some of the gurus in here can help me. Thanks.
     
    dadougalee, May 20, 2008 IP
  2. jestep

    jestep Prominent Member

    Messages:
    3,659
    Likes Received:
    215
    Best Answers:
    19
    Trophy Points:
    330
    #2
    I've done it several times and mysql works fine for this purpose, but this could easily be subject to the hardward your website is on, and the amount of traffic that you receive. In my experience we basically stored the session id (Char 32), Ip address, and each page the user visited with a timestamp. Since you're only dealing with a few rows, it takes a long time to really fill the database up. Also use the built in functions: INET_ATON() and INET_NTOA() for dealing with IP address storage.

    Once you get the script made, completely suppress any mysql errors for this script. This way your site doesn't break if you do run out of room or have some other problem. Make sure you have ample storage space on the hard drive the database is on. I would write about 1000 rows to the database with the same amount of information you are going to store. Then check the table size. Apply this to your anticipated daily page views, and you have a good idea of how fast the database is going to grow.

    Also, if it doesn't matter to your purpose, I would use a myisam table instead of innodb because it is a little quicker, and data integrity is probably not of the highest importance.
     
    jestep, May 20, 2008 IP
  3. jestep

    jestep Prominent Member

    Messages:
    3,659
    Likes Received:
    215
    Best Answers:
    19
    Trophy Points:
    330
    #3
    As to answer your index overflow question, this probably wouldn't be a problem unless you were making an index on multiple fields in the table, and again the hardware and traffic of the site. Without seeing exactly how you want to access the data, It's hard for me to suggest a good index structure, but the url/page field is the only one that seems obvious off the top of my head. Since this would be a text or varchar field, an index could get really ugly on it though, so be careful.
     
    jestep, May 20, 2008 IP
  4. 2beers

    2beers Well-Known Member

    Messages:
    195
    Likes Received:
    3
    Best Answers:
    0
    Trophy Points:
    110
    #4
    I've done for my site and it works fine. however you should use cookies in your script ,cause you'llbe saving a lot of google and yahoo bot's ips :) good luck
     
    2beers, May 20, 2008 IP
  5. dadougalee

    dadougalee Peon

    Messages:
    589
    Likes Received:
    11
    Best Answers:
    0
    Trophy Points:
    0
    #5
    @jestep - Thanks for your input. I'm going to go ahead with this then. I am not saving the actual URL to the database, but the index of the page, and then I do a mod_rewrite depending on the index.

    @2beers - Is there anyway to filter out the bots?
     
    dadougalee, May 21, 2008 IP
  6. 2beers

    2beers Well-Known Member

    Messages:
    195
    Likes Received:
    3
    Best Answers:
    0
    Trophy Points:
    110
    #6
    yes. you can use cookies. when an user visits your page you create a cookie with some value. and when you insert the ip in your db you check if the cookie has your value. bots don't have cookies enabled
     
    2beers, May 22, 2008 IP
  7. 2beers

    2beers Well-Known Member

    Messages:
    195
    Likes Received:
    3
    Best Answers:
    0
    Trophy Points:
    110
    #7
    my mistake. I'm stupid. some bots doees recognize cookies. I've also use javascript
    <script src="real_ip.php" type="text/javascript" ></script>

    and real_ip.php is the file where I insert the ips into db.
     
    2beers, May 22, 2008 IP