I want to build a script that will compare all the links from last month to all the links fromt his month, i have each list set out list this in csv linkto,linkfrom,linkhost,alexarank,pagerank mysite.com/1.html,theresite.com/boo.html,theresite.com,44234,8 what I want to make is a script that will load up lalst mon ths csv, this months csv and tell me which links have been removed which links have been added which links have had a pagerank change which link have had an alexa rank change I'm not asking for someone to make this script, just give me a basic idea of the best practice to go about this as I will be working with about 22,000 rows x 2
If you load each csv into a mySql table (which will import if you do it right), you can achieve all of the above with 4 sql queries - and not particularly complex ones at that.
what you need is to convert csv into some readable format (array or import it to Mysql), then you should take 2 different arrays(from db or what you like) and compare it in double cycle foreach ($array1 as ...) { foreach ($array2 as ...) { comparing what you need} } here's function for converting melbournechapter.net/wordpress/programming-languages/php/cman/2006/05/18/php-csv-to-array-functions/
That's an awful lot of work when.... SELECT url FROM tableA WHERE url NOT IN (SELECT url FROM tableB) ....would do exactly thbe same thing
Assuming TableA is last month, and TableB is this month... which links have been removed Select * from TableA where url not in (Select url from TableB) which links have been added Select * from TableB where url not in (Select url from TableA) which links have had a pagerank change Select * from TableA,TableB where TableA.url=TableB.url and TableA.pageRank<>TableB.pageRank which link have had an alexa rank change Select * from TableA,TableB where TableA.url=TableB.url and TableA.alexaRank<>TableB.alexaRank Databases are designed to do this sort of thing. That's what they're for!