Here's the situation: I have a text file with 14 million records, one per line. The problem: some of the records are duplicates. Sample records:

rmocu6zza0
gbl239serr
iq090qctnp
dfimp7oopw
y2gcsstbiw
x9owfugbn8
jg9hqx6d0c

The file size is around 200 MB. I need to remove the duplicates from the text file and import the records into a database table. I tried reading the whole file into an array and using array_unique(), but an array that big crashed the server. Is there an easier way to do this? Thanks
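For reference, a minimal sketch of what I tried (the file name records.txt is just a placeholder). Loading all 14 million lines into a single array before calling array_unique() is what exhausted the memory:

<?php
// Attempted approach (simplified): load every line into memory, then dedupe.
// With ~14 million lines the array alone blows past the memory limit.
$lines  = file('records.txt', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
$unique = array_unique($lines);
// ... then import $unique into the database ...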
Split the file into pieces smaller than 2 MB. Import each piece into a table with a unique index on the relevant column(s), telling the import to ignore duplicate records. Export the results. Voilà, deduped. A rough sketch of the same idea is below.
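Here is a minimal PHP/MySQL sketch of the dedupe-on-import idea. It assumes a table named records with a UNIQUE index on a value column; connection details, table and column names, and the 1000-row batch size are all placeholders. Instead of physically splitting the file, this variation streams it line by line and inserts in small batches with INSERT IGNORE, so the unique index silently drops duplicates and memory use stays flat:

<?php
// Assumed table (adjust name/column/length to taste):
//   CREATE TABLE records (value VARCHAR(32) NOT NULL, UNIQUE KEY uq_value (value));
$pdo = new PDO('mysql:host=localhost;dbname=test;charset=utf8mb4', 'user', 'pass',
               [PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION]);

function insertBatch(PDO $pdo, array $values): void
{
    // Build "(?),(?),..." for one multi-row INSERT IGNORE; duplicates hit the
    // unique index and are skipped instead of raising an error.
    $placeholders = implode(',', array_fill(0, count($values), '(?)'));
    $stmt = $pdo->prepare("INSERT IGNORE INTO records (value) VALUES $placeholders");
    $stmt->execute($values);
}

$fh    = fopen('records.txt', 'r');
$batch = [];

while (($line = fgets($fh)) !== false) {
    $value = trim($line);
    if ($value === '') {
        continue;
    }
    $batch[] = $value;

    // Flush in small chunks so only a handful of rows are ever held in memory.
    if (count($batch) === 1000) {
        insertBatch($pdo, $batch);
        $batch = [];
    }
}
if ($batch) {
    insertBatch($pdo, $batch);
}
fclose($fh);

Once the import finishes, the table already contains only unique values, so the "export" step is just a SELECT of that column.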