I'm fairly new to MySQL and SQL in general. I'm building an affiliate program that will record the number of clicks into my site from different affiliate's websites. My thinking is that I will have a relation called clicks with 3 attributes: clickID - auto incremented affiliateID date The problem is that the cardinality of the relation would be very high because there would be a tuple for every click into the site (which could be thousands a day). Is this a problem???
We run a collection of 170 "toplist" type sites from one common database which not only record every click into the site, but details of every visitor on ALL the listing sites - in other words, we receive close on a million server requests a day, all of which generate at least one database hit. It's not a problem as long as you're sensible. For example, we hold the traffic details on a rolling 24 hour window, and summarise and archive them each day as we don't need individual ip level data after 24 hours - but we do like daily summaries for the last 30 days. Again, once a month, we summarise and archive off the monthly data as we were ending up with tables with several million rows and that did impact performance. You probably could summarise up by affiliate id at the end of the day? or at some point anyway? Also, make sure you allocate enough memory to MySql so that the tables remain in memory and make sure your indexes are good. And of course, you can trial run this - it's not hard to insert a few hundred thousand rows of data for test purposes and check performance is ok. If you have console access, run "top" so you can watch the performace of the mysqld service and make sure it isn't maxing out for long periods.