HELP! Slow mysql query....

jsmith123 Guest

Messages:: 5

Likes Received:: 0

Best Answers:: 0

Trophy Points:: 0

#1

Ok the situation is like this:
I have a table where I need to retrieve the latest records between a certain time, and also given the name(s) that need to match another column...

Table structure for position update table:

CREATE TABLE `tblpositionupdates` (
`idPositionUpdate` int(11) NOT NULL auto_increment,
`sgTargetName` varchar(50) NOT NULL default '',
`inUTCTime` bigint(20) default NULL,
`dbHeading` double default NULL,
`dbCOG` double default NULL,
`dbSOG` double default NULL,
`dbWGS84Lat` double default NULL,
`dbWGS84Lon` double default NULL,
`dbWGS84Elh` double default NULL,
PRIMARY KEY (`idPositionUpdate`),
KEY `Index_TargetName` (`sgTargetName`),
KEY `Index_UTCTime` (`inUTCTime`),
) ENGINE=InnoDB DEFAULT CHARSET=latin1 COMMENT='InnoDB free: 60416 kB';
Click to expand...

The query I perform is below. Note that when I run this against a table with 1 + million records, I haven't yet seen it complete

select * from tblPositionUpdates
inner join (select max(idPositionUpdate) as id from tblPositionUpdates
where sgTargetName in (select fkidTargetName from tblProjectTargets where fkidProject = 9)
and inUTCTime <= 853229662
group by sgTargetName) as ids
on idPositionUpdate = ids.id;
Click to expand...

Strangely when I hard code the inner select statement it is quite a bit faster.
Although it can still take very large (unnatural) amount of time to complete:

select * from tblPositionUpdates
inner join (select max(idPositionUpdate) as id from tblPositionUpdates
where sgTargetName in ('Test1', 'Test2')
and inUTCTime <= 853229662
group by sgTargetName) as ids
on idPositionUpdate = ids.id;
Click to expand...

The explain statement of the slow SQL query is below (note: I ran this against a small table):
"id","select_type"       ,"table"             ,"type"          ,"possible_keys"             ,"key"             ,"key_len","ref"   ,"rows","Extra"
1,   "PRIMARY"           ,"<derived2>"        ,"system"        ,""                          ,""                ,""       ,""      ,1,""
1,   "PRIMARY"           ,"tblPositionUpdates","const"         ,"PRIMARY"                   ,"PRIMARY"         ,"4"      ,"const" ,1,""
2,   "DERIVED"           ,"tblPositionUpdates","index"         ,"Index_UTCTime"             ,"Index_TargetName","52"     ,""      ,11087,"Using where"
3,   "DEPENDENT SUBQUERY","tblProjectTargets" ,"index_subquery","fkidTargetName,fkidProject","fkidTargetName"  ,"52"     ,"func"  ,1    ,"Using index; Using where"
Code (markup):
I suspect it is the fact that I am using large indexes.
Any help/suggestions would be greatly appreciated.

jsmith123, Jan 22, 2007 IP

edr Guest

Messages:: 784

Likes Received:: 15

Best Answers:: 0

Trophy Points:: 0

#2

Have you looked at a Query Execution plan? This is often the best way to find bottlenecks - a WHERE IN will scan the left side top to bottom for each item in the right side - if both sides have numerous records this can be a nightmare.

edr, Jan 23, 2007 IP

jsmith123 Guest

Messages:: 5

Likes Received:: 0

Best Answers:: 0

Trophy Points:: 0

#3

Thanks for the suggestion. Do you know of any free software that will analyze my queries for performance issues such as the one I am having?

Unfortunately both sides of the WHERE IN can have large set of values.
In fact the left side is so large that it will take over 5 minutes to complete.
I always imagined that the WHERE IN would make the query much faster as it would be able to use the indexes, and not have to search the whole table.

However I don't think this is always the problem.
For example if the inner select statement :
(select fkidTargetName from tblProjectTargets where fkidProject = 9)
is empty (lets say 9 is not a valid project).
It still takes almost as long to execute as if it returned values.

jsmith123, Jan 24, 2007 IP

smallbuzz Peon

Messages:: 125

Likes Received:: 1

Best Answers:: 0

Trophy Points:: 0

#4

That's one hell of a sql you got there. This is probably what you want:

select * from tblPositionUpdates
where idPositionUpdate in 
   (select max(idPositionUpdate) as id from tblPositionUpdates
    where sgTargetName in (select fkidTargetName from tblProjectTargets where fkidProject = 9)
   and inUTCTime <= 853229662
   group by sgTargetName) )

PHP:

smallbuzz, Jan 24, 2007 IP

jsmith123 Guest

Messages:: 5

Likes Received:: 0

Best Answers:: 0

Trophy Points:: 0

#5

Yeah that is just one of the slower queries I am using, in fact there are about 10 more. But if I can solve this one I'll be able to sort out most of the others too.

I tried that query, doesn't appear to make a difference in terms of speed.
(I killed the thread after it continued running for over 8 minutes). It seems that when MySQL begins copying into the tmp table it takes the largest proportion of time.
Remember the tblPositionUpdate presently contains about 3 million records ...

Is WHERE IN generally a better option than the INNER JOIN?

jsmith123, Jan 24, 2007 IP

smallbuzz Peon

Messages:: 125

Likes Received:: 1

Best Answers:: 0

Trophy Points:: 0

#6

WHERE IN is more correct than using INNER JOIN, in this case. Even though the result might be the same, its confusing to figure out why you are using join.

How many rows does this query return?
(select fkidTargetName from tblProjectTargets where fkidProject = 9)

Out of the 3 million rows, if you do a select distinct sgTargetName, how many different target names are there?

smallbuzz, Jan 25, 2007 IP

jsmith123 Guest

Messages:: 5

Likes Received:: 0

Best Answers:: 0

Trophy Points:: 0

#7

The reason I am using a join here is that I wanted it to force using the identifier as the "connector" between the two queries.

This query (select fkidTargetName from tblProjectTargets where fkidProject = 9)
can return at most around 160 rows depending on the fkidProject.

But on average most project(s) will only return around 4 TargetNames.
As I mentioned before the query will seem just as slow even with less (say 4 fkidTargetNames), unless I "hardcode" this inner select for example:
('TargetName1', 'TargetName2' ...).

There are around 160 distinct (unique) target names out of the 3 million rows.
So nothing too large really.

I just want to return the most recent row(s) for each target name within a project that are within a certain date/time span.

There must be something i'm missing

jsmith123, Jan 25, 2007 IP

RaginBajin Peon

Messages:: 87

Likes Received:: 3

Best Answers:: 0

Trophy Points:: 0

#8

What version of Mysql are you using?

Anything below 5 is going to have a hard time with subselect statements. Usually any query is going to have a hard time, but with versions below 5.0 mysql's query engine doesn't do a good job figuring out the least cost and just makes the query go crazy..

RaginBajin, Feb 2, 2007 IP

jsmith123 Guest

Messages:: 5

Likes Received:: 0

Best Answers:: 0

Trophy Points:: 0

#9

I'm using version 5.0. So basically I'm thinking MySQL is just not up to the task.
Unfortunately I don't have much choice with what db I get to work with.

jsmith123, Feb 7, 2007 IP

Log in or Sign up

HELP! Slow mysql query....

jsmith123 Guest

edr Guest

jsmith123 Guest

smallbuzz Peon

jsmith123 Guest

smallbuzz Peon

jsmith123 Guest

RaginBajin Peon

jsmith123 Guest

Useful Searches