[debian-mysql] Bug#591271: Bug#591271: Explanation and fix

Nicholas Bamber nicholas at periapt.co.uk
Fri Mar 9 22:37:40 UTC 2012


There is a 5.5 in experimental that might be worth a try (in a
non-production environment).


On 09/03/12 22:19, Trey Raymond wrote:
> This info is applicable to all 5.0.x and 5.1.x versions, I'm not sure
> yet about improvements in 5.5 that might have made it irrelevant.
> 
> On Fri, Mar 9, 2012 at 5:08 PM, Nicholas Bamber <nicholas at periapt.co.uk
> <mailto:nicholas at periapt.co.uk>> wrote:
> 
>     reassign 591271 mysql-server
>     tag 591271 +moreinfo
>     thanks
> 
>     Trey,
>            Thanks for the new information. Could you confim what version
>     it was
>     obtained on?
> 
> 
>     On 09/03/12 21:43, Trey Raymond wrote:
>     > So this is what's happening here.  I'll explain what it's doing
>     for each
>     > query, then show you the optimal way.
>     >
>     > SELECT MAX(sample.id <http://sample.id> <http://sample.id>) FROM
>     sample JOIN interface ON
>     > (sample.interface=interface.id <http://interface.id>
>     <http://interface.id>) WHERE
>     > interface.name <http://interface.name>
>     <http://interface.name>='accounting-total'
>     > -- This is just a symptom of the optimizer being stupid.  It starts
>     > looking at the keys for the table, sees the first one, and notices
>     that
>     > it is a covering index - and therefore it choses to use it without
>     > considering that there might be a better one.  It is a covering index
>     > because in innodb, the primary key is appended to every secondary key,
>     > so KEY (`interface`,`timestamp`) actually stores
>     > (`interface`,`timestamp`,`id`)...also note that where you have KEY
>     > (`interface`,`id`), you really only need to say KEY (`interface`), for
>     > this same reason.  The problem is that it's caring so much about
>     using a
>     > key to filter that it doesn't notice there's one that could be used to
>     > filter AND get the max(id) easily.  That's just the mysql
>     optimizer for
>     > you.  Newer versions tend to be a bit smarter, but you'll always see
>     > this kind of thing.  I bet if you switched what order those keys
>     were in
>     > in the file, it'd find the correct one first and use it.
>     >
>     > SELECT MAX(sample.id <http://sample.id> <http://sample.id>) FROM
>     sample WHERE
>     > interface=(SELECT interface.id <http://interface.id>
>     <http://interface.id> FROM interface
>     > WHERE interface.name <http://interface.name>
>     <http://interface.name>='accounting-total')
>     > -- This one is using a one-time (independent) subquery, or "derived
>     > table."  As it only has to execute that statement once, it's very
>     fast.
>     >  This is basically what mysql would do if the optimizer handled it
>     > right, but without join syntax optimization, so it's still not
>     perfect.
>     >  It's the closest you can get without index hints or changing the
>     indexing.
>     >
>     > SELECT MAX(sample.id <http://sample.id> <http://sample.id>) FROM
>     sample WHERE interface IN
>     > (SELECT interface.id <http://interface.id> <http://interface.id>
>     FROM interface WHERE
>     > interface.name <http://interface.name>
>     <http://interface.name>='accounting-total')
>     > -- This is a BIG no-no.  It's using a dependent subquery - note how
>     > saying IN implies multiple values could be returned and a range
>     check is
>     > needed.  It also implies that the subquery cannot be solved before the
>     > main table is accessed, so no "derived table" use.  In this case,
>     it has
>     > to scan the ENTIRE key on the sample table, and for EVERY record
>     it must
>     > run that subquery - you're basically executing over 6 million separate
>     > queries in this one statement.  This is the slowest method.
>      Beware when
>     > you see "Dependent subquery" in an explain plan - there is usually a
>     > better way to go about it.
>     >
>     > So in conclusion, the best way to fix it is to just tell the optimizer
>     > to use the correct index.  Either of the following two should work:
>     > SELECT MAX(sample.id <http://sample.id> <http://sample.id>) FROM
>     sample FORCE INDEX
>     > (sample_interface_id) WHERE interface=(SELECT interface.id
>     <http://interface.id>
>     > <http://interface.id> FROM interface WHERE interface.name
>     <http://interface.name>
>     > <http://interface.name>='accounting-total')
>     > -- explicitly specifying the key
>     > SELECT MAX(sample.id <http://sample.id> <http://sample.id>) FROM
>     sample IGNORE INDEX
>     > (sample_timestamp) WHERE interface=(SELECT interface.id
>     <http://interface.id>
>     > <http://interface.id> FROM interface WHERE interface.name
>     <http://interface.name>
>     > <http://interface.name>='accounting-total')
>     > -- telling it to skip that first key and find the next usable one
>     >
>     >
>     > _______________________________________________
>     > pkg-mysql-maint mailing list
>     > pkg-mysql-maint at lists.alioth.debian.org
>     <mailto:pkg-mysql-maint at lists.alioth.debian.org>
>     >
>     http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/pkg-mysql-maint
> 
> 
>     --
>     Nicholas Bamber | http://www.periapt.co.uk/
>     PGP key 3BFFE73C from pgp.mit.edu <http://pgp.mit.edu>
> 
> 
	

-- 
Nicholas Bamber | http://www.periapt.co.uk/
PGP key 3BFFE73C from pgp.mit.edu





More information about the pkg-mysql-maint mailing list