I'm trying to write a script that reports the number of users in each country. I have a users table that contains an IP address field, and another table that contains the IP ranges of each country. It has three significant columns: (1) name of the country, (2) IP range start, (3) IP range end.

Is it possible to write one SQL statement that will have this result:

1. The first column is the name of the country.
2. The second column is the number of users within that country's IP range.
3. All countries with no user records are disregarded.

I'm thinking about using GROUP BY, but I'm not sure if it's possible to use it with ranges instead of just a one-column match. Thanks!
Hi, I would like to know how you are storing the IP addresses. What is the datatype of the column? How do you plan to do the comparisons to find out whether an IP address is within a range? Those questions are important for finding a solution. Cheers, jay
I store the IP ranges as integers in the ip2country table. Actually, I already came up with this query:

```sql
SELECT i.country, COUNT(u.userID)
FROM `users` u
INNER JOIN ip2country i
    ON INET_ATON(u.ip) BETWEEN i.`from` AND i.`to`
GROUP BY i.country;
```

(Note that `from` and `to` are reserved words in MySQL, so they have to be quoted with backticks.) The problem is that it takes too long to run, because there are so many records in the two tables concerned. So, any other suggestions would be appreciated.
If you store the IP addresses in the ip2country table in numeric form, then you might consider doing the same in the users table. That way you won't have to call the INET_ATON function in your query, which should speed things up a bit (I don't know how much, though). You might also put indexes on the fields u.ip, i.from and i.to. I also seem to remember (but I'm not sure, and right now I can't find anything to back this up, so you might try it and see if it makes a difference) that `field1 BETWEEN value1 AND value2` is slower than `field1 >= value1 AND field1 <= value2`.
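To make the advice above concrete, here is a small runnable sketch using SQLite and Python (the MySQL query would be analogous, with INET_ATON replaced by a one-time conversion at insert time). All table, column, and country names here are hypothetical, chosen just for the demo; the point is that both tables store IPs as plain integers, the range columns are indexed, and the join uses explicit `>=` / `<=` comparisons:

```python
# Sketch of the numeric-IP approach in SQLite. Names (users, ip2country,
# ip_num, ip_from, ip_to) are hypothetical placeholders for this demo.
import sqlite3
import ipaddress

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Store IPs as integers in BOTH tables, so no per-row conversion
# function runs inside the query.
cur.execute("CREATE TABLE users (userID INTEGER, ip_num INTEGER)")
cur.execute("CREATE TABLE ip2country (country TEXT, ip_from INTEGER, ip_to INTEGER)")

# Indexes on the compared columns let the optimizer narrow the scan.
cur.execute("CREATE INDEX idx_users_ip ON users (ip_num)")
cur.execute("CREATE INDEX idx_ranges ON ip2country (ip_from, ip_to)")

def ip_to_int(ip):
    # Done once at insert time; equivalent in spirit to MySQL's INET_ATON.
    return int(ipaddress.IPv4Address(ip))

cur.executemany("INSERT INTO users VALUES (?, ?)",
                [(1, ip_to_int("10.0.0.5")),
                 (2, ip_to_int("10.0.0.9")),
                 (3, ip_to_int("192.168.1.1"))])
cur.executemany("INSERT INTO ip2country VALUES (?, ?, ?)",
                [("CountryA", ip_to_int("10.0.0.0"), ip_to_int("10.0.0.255")),
                 ("CountryB", ip_to_int("192.168.0.0"), ip_to_int("192.168.255.255")),
                 ("CountryC", ip_to_int("172.16.0.0"), ip_to_int("172.16.255.255"))])

# Same shape as the original query, but with plain integer comparisons;
# countries with no users drop out of the inner join automatically.
rows = cur.execute("""
    SELECT i.country, COUNT(u.userID)
    FROM users u
    INNER JOIN ip2country i
      ON u.ip_num >= i.ip_from AND u.ip_num <= i.ip_to
    GROUP BY i.country
""").fetchall()
```

Here `rows` contains only CountryA (2 users) and CountryB (1 user); CountryC has no matching users and is disregarded, which is exactly the behavior the question asks for.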
I would add more indexes to your tables. Also, I would suggest doing this a bit differently: create a program that updates another table with (country, timestamp) rows. You could then read through this table a lot faster. You can probably queue the job to run every 10 minutes; after the initial load, set it to process only what was added since the last update, based on the last-checked ID or something similar. For visitor-tracking purposes, I think keeping this data in a separate table is best anyway, since you could also store other info such as the referrer or entry point. I know this creates extra work, but in the long run it will let you pull everything you want, and extremely fast.
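The incremental-job idea above can be sketched as follows. This is a minimal illustration in Python with SQLite, assuming a watermark on the last processed userID; every table and column name here (country_counts, job_state, last_id, etc.) is hypothetical:

```python
# Sketch of a periodic pre-aggregation job: fold users added since the
# last run into a summary table, so reads never scan the big tables.
# All names are hypothetical placeholders for this demo.
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
    CREATE TABLE users (userID INTEGER PRIMARY KEY, ip_num INTEGER);
    CREATE TABLE ip2country (country TEXT, ip_from INTEGER, ip_to INTEGER);
    CREATE TABLE country_counts (country TEXT PRIMARY KEY, n INTEGER);
    CREATE TABLE job_state (last_id INTEGER);
    INSERT INTO job_state VALUES (0);
    INSERT INTO ip2country VALUES ('CountryA', 100, 199), ('CountryB', 200, 299);
""")

def run_job(cur):
    """One scheduled run: aggregate only users newer than the watermark."""
    (last_id,) = cur.execute("SELECT last_id FROM job_state").fetchone()
    new_counts = cur.execute("""
        SELECT i.country, COUNT(*)
        FROM users u
        INNER JOIN ip2country i
          ON u.ip_num >= i.ip_from AND u.ip_num <= i.ip_to
        WHERE u.userID > ?
        GROUP BY i.country
    """, (last_id,)).fetchall()
    for country, n in new_counts:
        # Upsert: add the new partial count onto any existing total.
        cur.execute("""
            INSERT INTO country_counts VALUES (?, ?)
            ON CONFLICT(country) DO UPDATE SET n = n + excluded.n
        """, (country, n))
    # Advance the watermark to the newest user we have seen.
    cur.execute("""
        UPDATE job_state
        SET last_id = (SELECT COALESCE(MAX(userID), last_id) FROM users)
    """)

cur.executemany("INSERT INTO users VALUES (?, ?)", [(1, 150), (2, 250)])
run_job(cur)                       # first run processes users 1 and 2
cur.execute("INSERT INTO users VALUES (3, 160)")
run_job(cur)                       # second run re-scans only user 3
counts = dict(cur.execute("SELECT country, n FROM country_counts"))
```

After the two runs, `counts` holds CountryA with 2 users and CountryB with 1, and the report query now reads the small `country_counts` table instead of joining the two large tables every time.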