Product catalog full of duplicates - how to strip them with MySQL?

Discussion in 'Commission Junction' started by markowe, Oct 8, 2009.

  1. #1
    I'm experimenting with setting up regular delivery of a product catalog in CSV, which I plan to load into my database on, say, a daily basis and then use in my store. (For the techie-minded: I'd use LOAD DATA INFILE in MySQL, which should do it quickly.)

    The advertiser I am looking at has a very big catalog (which is partly why I would use the LOAD DATA INFILE method), but I notice that lots of SKUs are identical, along with the rest of the data on those rows. Any idea what gives? Is it common for an advertiser to have a lot of dupes in their catalog?

    Also, for the tech-minded: is there a way to make LOAD DATA INFILE skip duplicates during the import? They are going to increase the size of my database unnecessarily. As I understand it, the IGNORE and REPLACE options apply only to records already in the database, not to duplicates within the file being imported. Am I right?
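
    A note on that last point, as a sketch rather than a definitive answer: if the table has a PRIMARY KEY or UNIQUE index on the SKU column, LOAD DATA INFILE with IGNORE should also drop duplicates inside the file, because rows go in one after another and a later copy collides with the earlier one that was just inserted. The snippet below is not MySQL. It uses Python's bundled SQLite as a stand-in to show the same key-collision behavior, and the table name, column names and sample rows are made up.

```python
import sqlite3

# Stand-in demo using SQLite (bundled with Python), not MySQL.
# The table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (sku TEXT PRIMARY KEY, name TEXT, price REAL)")

# Rows as they might arrive in the feed: the second A100 row is an exact duplicate.
feed_rows = [
    ("A100", "Widget", 9.99),
    ("A100", "Widget", 9.99),
    ("B200", "Gadget", 19.99),
]

# INSERT OR IGNORE is SQLite's counterpart to MySQL's IGNORE keyword:
# a row whose key collides with one already in the table is skipped,
# including a row inserted a moment earlier in this same batch.
conn.executemany("INSERT OR IGNORE INTO products VALUES (?, ?, ?)", feed_rows)
conn.commit()

stored_skus = [row[0] for row in conn.execute("SELECT sku FROM products ORDER BY sku")]
print(stored_skus)
```

    Without a unique key on the SKU there is nothing for a duplicate to collide with, so every row would be kept.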
     
    markowe, Oct 8, 2009
  2. boy2men

    boy2men Active Member

    #2
    Why ask about all the techie methods when you can simply remove the dupes before importing?
     
    boy2men, Oct 8, 2009
  3. markowe

    markowe Well-Known Member

    #3
    Well, I can't really - there are maybe 150,000 records, so the whole reason I am importing the CSV file they provide using LOAD DATA INFILE is to put the minimum load on my (shared) hosting. If I start sorting through 150,000 records eliminating duplicates, then that defeats the point...
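
    That said, a pre-import pass might be lighter than it sounds. Here is a minimal sketch of a single streaming read that remembers only the SKUs seen so far, assuming the SKU is the first column and the file has no header row. The function name `dedupe_rows` and the sample data are hypothetical.

```python
import csv
import io


def dedupe_rows(rows, key_index=0):
    """Yield each row whose key column has not been seen before."""
    seen = set()
    for row in rows:
        if not row:
            continue  # skip blank lines
        key = row[key_index]
        if key not in seen:
            seen.add(key)
            yield row


# Tiny in-memory sample standing in for the advertiser's CSV (made-up data).
sample = io.StringIO("A100,Widget,9.99\nA100,Widget,9.99\nB200,Gadget,19.99\n")
unique_rows = list(dedupe_rows(csv.reader(sample)))
print(unique_rows)
```

    Only the set of SKUs is held in memory, not the rows themselves, so the cost grows with the number of distinct SKUs rather than with the size of the file.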
     
    markowe, Oct 8, 2009
  4. melly8

    melly8 Peon

    #4
    Is this the correct forum for asking programming questions?
     
    melly8, Oct 13, 2009
  5. markowe

    markowe Well-Known Member

    #5
    Well, you know, I really need to keep my post count up so I can get all that shared AdSense revenue, I'm sure you know how it is. Keeps me off the streets too.
     
    markowe, Oct 14, 2009