Product catalog full of duplicates - how to strip duplicates with MySQL

markowe Well-Known Member

#1

I'm experimenting with setting up regular delivery of a product catalog in CSV, which I plan to dump into my database, say on a daily basis, and then use in my store. (For the techie-minded: I'd use LOAD DATA INFILE in MySQL, which should do it quickly.)

The advertiser I am looking at has a very big catalog (which is partly why I would use the LOAD DATA INFILE method), but I notice that lots of rows have identical SKUs, along with identical data in the rest of the row. Any idea what gives? Is it common for an advertiser to have a lot of dupes in their catalog?

Also, for the tech-minded: is there a way to get LOAD DATA INFILE to skip duplicates when importing? They are really going to increase the size of my database unnecessarily. As I understand it, the IGNORE and REPLACE options refer only to records already in the database, not the new ones you are importing. Am I right?

markowe, Oct 8, 2009
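
On the IGNORE question in the post above: per the MySQL manual, IGNORE discards any incoming row that duplicates an existing row on a PRIMARY KEY or UNIQUE index. Rows are inserted as the file is read, so a row whose SKU already appeared earlier in the same file counts as existing by the time the duplicate arrives. Duplicates inside the CSV should therefore be skipped too, provided the table has a unique key on the SKU. A minimal sketch, assuming a hypothetical table `catalog_products`, hypothetical column names, a placeholder file path, and a comma-separated file with one header row (none of these details are given in the thread):

```sql
-- Hypothetical table; the real column list depends on the advertiser's feed.
CREATE TABLE catalog_products (
  sku   VARCHAR(64)   NOT NULL,
  name  VARCHAR(255),
  price DECIMAL(10,2),
  PRIMARY KEY (sku)              -- the unique key is what makes IGNORE drop duplicates
);

-- Placeholder path and CSV layout; adjust to the actual file.
LOAD DATA INFILE '/path/to/catalog.csv'
IGNORE INTO TABLE catalog_products   -- keep the first row seen for each sku
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES                       -- skip the header row (assumed to exist)
(sku, name, price);
```

With IGNORE, the first row for each SKU wins and later ones are counted as "Skipped" in the statement's result line. Swapping IGNORE for REPLACE would keep the last row instead, which also overwrites rows left over from a previous day's import.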

boy2men Active Member

#2

Why are you asking about all the techie methods when you can just remove the dupes before importing?

boy2men, Oct 8, 2009

markowe Well-Known Member

#3

boy2men said: ↑

Why are you asking about all the techie methods when you can just remove the dupes before importing?

Well, I can't really. There are maybe 150,000 records, and the whole reason I am importing the CSV file they provide with LOAD DATA INFILE is to put the minimum load on my (shared) hosting. If I start sorting through 150,000 records to eliminate duplicates, that defeats the point...

markowe, Oct 8, 2009

melly8 Peon

#4

is this the correct forum to ask programming stuff????

melly8, Oct 13, 2009

markowe Well-Known Member

#5

melly8 said: ↑

is this the correct forum to ask programming stuff????

Well, you know, I really need to keep my post count up so I can get all that shared AdSense revenue, I'm sure you know how it is. Keeps me off the streets too.

markowe, Oct 14, 2009
