Credit Cards - Online Advertising - Mortgages - Articles - Credit Cards

PDA

View Full Version : Avoiding the use of DISTINCT


NoamBarz
May 7th 2007, 9:42 am
I was wondering whether there was a way to avoid using the DISTINCT keyword, since it seems to be causing really bad performance issues. Is there a general solution to problems such as these?

SoKickIt
May 7th 2007, 11:35 am
Is there a general solution to problems such as these?

No, but you can explain your situation (post queries, table structure) and I'm sure someone will help you.

Carlito
May 7th 2007, 4:12 pm
If you're using MSSQL, I haven't really seen any performance issues from using DISTINCT even in high volume environments. Maybe avoid using DISTINCT on a long text column, select the distinct IDs and join on that.

alamster
May 13th 2007, 12:20 pm
you can use group by as distinct alternative with faster execution time and more condition statement.

teachai
May 23rd 2007, 10:00 am
Look at your query. Are you missing a join.
In general with a properly designed database and a properly written query you will have very few occasions where you need to use distinct.

I suspect strongly that you have a missing join condition. If you fix the missing join condition you probably won't need a distinct.

Also a distinct means that you are doing a sort operation which if you have a lot of rows can be very memory intensive.

rennick
May 23rd 2007, 10:01 am
Depending on the database you're using, it might also indicate that you need to add an index.

bochgoch
May 23rd 2007, 10:02 am
Store unique data in the queried table. That may involve creating consolidated tables with data at the level of granularity you require.

Post more detail and I'm sure we can help further.

teachai
May 24th 2007, 5:39 am
you can use group by as distinct alternative with faster execution time and more condition statement.

:confused: :eek:

Fix the query. All the performance problems will disappear.
Why would you use a totally unnecessary group by. If group by was the same as distinct, then there wouldn't be two separate commands.

Group by is for aggregated data.

Distinct is for removing duplicates but is not a replacement for a properly joined query. Use only when absolutely necessary