How to speed up the Group By Clause for a large 3GB database.

Discussion in 'MySQL' started by rahulephp, Dec 30, 2010.

  1. #1
    How to speed up the Group By Clause for a large 3GB database.

    I am using Group by clause for a large database having 148 columns and 5 million rows with approx 3GB of size.
    We need to apply the Group by clause on approx 1,00,000 rows at a time without using LIMIT.
    We can’t use LIMIT as we need all of the entries from a category to be show in the filters section.

    We have a Dedicated Linux server with 4GB RAM and latest Configuration with 2 processors.

    I tried all different my.cnf configuration settings to optimize the mysql speed but nothing works.

    Here is Query that I am using to fetch the data:


    SELECT e.product_id,
     e.name,
     e.description,
     e.manufacturer,
     e.imageurl,
     e.warranty,
     e.colour,
     e.collection,
     e.saleprice,
     e.price,
     e.ages,
     e.size,
     e.size_h,
     e.size_w,
     e.size_d,
     e.size_unit,
     e.wifi_ready,			 
     e.bundled_deals_packages,
     e.service_provider,
     e.how_many_seats,
     e.characters, 
     e.publishercategory,
     e.clean_modelno
     
     MAX(price) as max_price,
     MIN(price) as min_price,
     count(distinct(advertiserid)) as total
     FROM elec_products as e
     
     WHERE status = 1
     AND (subcategory2 = 3115) 
     GROUP BY clean_modelno, publishercategory
     ORDER BY total DESC
    Code (markup):

    I have index on following columns
    • product_id PRIMARY KEY
    • Group_by(clean_modelno, publishercategory) BTREE[/*]
    • subcategory1 BTREE
    • subcategory2 BTREE
    • subcategory3 BTREE
    • subcategory4 BTREE
    • subcategory5 BTREE
    • status BTREE


    Table Type is "MyISAM".


    All major My.cnf configurations:
    • skip-locking
    • key_buffer_size = 512M
    • max_allowed_packet = 128M
    • table_open_cache = 512
    • sort_buffer_size = 128M
    • read_buffer_size = 128M
    • read_rnd_buffer_size = 128M
    • myisam_sort_buffer_size = 128M
    • thread_cache_size = 8
    • query_cache_size = 128M
    • max_heap_table_size=256M
    • tmp_table_size=256M
    • join_buffer_size = 2M

    I can see lots of other similar Price Comparison website which has excellent pageload speed.
    Please help me out from this and let me know if I am missing anything.
     
    rahulephp, Dec 30, 2010 IP
  2. mastermunj

    mastermunj Well-Known Member

    Messages:
    687
    Likes Received:
    13
    Best Answers:
    0
    Trophy Points:
    110
    #2
    Firstly, I still do not understand need to NOT use limit clause. Do you mean you have to display 1,00,000 on one page only?

    Practically its insane, however, if that is need, several parameters can cause this delay apart from query only.

    1. Even if mysql scans and prepares data in less than 1 second, transferring data between database server & front-end server can take few seconds.
    2. Looping through 1,00,000 data again takes some time.
    3. Network between your IDC and client system also accounts into it as data being sent would be too large.

    Above analysis is based on the statement you made that you have to display everything that matches criteria.

    Share more specific details if you feel I am not getting your point so that I can do further analysis and give you better suggestions.
     
    mastermunj, Dec 31, 2010 IP
  3. mwasif

    mwasif Active Member

    Messages:
    816
    Likes Received:
    23
    Best Answers:
    1
    Trophy Points:
    70
    #3
    3GB is the size of table or the complete database?

    Do you really need index on the following columns if not then drop them.
    subcategory1 BTREE
    subcategory2 BTREE
    subcategory3 BTREE
    subcategory4 BTREE
    subcategory5 BTREE
    status BTREE

    Use EXPLAIN to identify what is hurting the performance. Index on price may help you. But don't create many indexes.
     
    mwasif, Dec 31, 2010 IP
  4. jkl6

    jkl6 Peon

    Messages:
    70
    Likes Received:
    1
    Best Answers:
    0
    Trophy Points:
    0
    #4
    If you run the query without the max, min, count distinct, and order by. How long does it take?
     
    jkl6, Dec 31, 2010 IP
  5. rahulephp

    rahulephp Peon

    Messages:
    45
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #5
    It still really takes 16seconds to Group 90000 rows if i remove MAX, MIN, COUNT and ORDER BY.


    May be i need to change something with the INDEX and Database.

    Here is Result for "SHOW CREATE TABLE elec_products;"

    
    CREATE TABLE `elec_products` (
     `product_id` int(11) NOT NULL AUTO_INCREMENT,
     `programname` varchar(100) CHARACTER SET latin1 NOT NULL,
     `advertiserid` smallint(6) NOT NULL,
     `programurl` text CHARACTER SET latin1 NOT NULL,
     `catalogname` text CHARACTER SET latin1 NOT NULL,
     `advertiserstatus` varchar(300) CHARACTER SET latin1 NOT NULL,
     `productstatus` int(2) NOT NULL,
     `lastupdated` varchar(20) CHARACTER SET latin1 NOT NULL,
     `name` varchar(300) CHARACTER SET latin1 NOT NULL,
     `productid` varchar(200) CHARACTER SET latin1 NOT NULL,
     `advertiserproductid` int(50) NOT NULL DEFAULT '0',
     `keywords` varchar(300) CHARACTER SET latin1 NOT NULL,
     `description` varchar(2000) CHARACTER SET latin1 NOT NULL,
     `specifications` varchar(5000) CHARACTER SET latin1 NOT NULL,
     `ourproductdescription` varchar(5000) CHARACTER SET latin1 NOT NULL,
     `bestproductdescription` text CHARACTER SET latin1 NOT NULL,
     `language` varchar(5) CHARACTER SET latin1 NOT NULL DEFAULT 'EN',
     `manufacturer` varchar(160) CHARACTER SET latin1 NOT NULL,
     `manufacturerid` varchar(64) CHARACTER SET latin1 NOT NULL,
     `modelno` varchar(20) CHARACTER SET latin1 NOT NULL,
     `buyurl` varchar(500) CHARACTER SET latin1 NOT NULL,
     `impressionurl` varchar(500) CHARACTER SET latin1 NOT NULL,
     `imageurl` varchar(200) NOT NULL,
     `thumburl` varchar(500) CHARACTER SET latin1 NOT NULL,
     `mpn` varchar(20) CHARACTER SET latin1 NOT NULL,
     `upc` varchar(100) CHARACTER SET latin1 NOT NULL,
     `isbn` varchar(64) CHARACTER SET latin1 NOT NULL,
     `ean` varchar(100) CHARACTER SET latin1 NOT NULL,
     `author` varchar(200) CHARACTER SET latin1 NOT NULL,
     `artist` varchar(200) CHARACTER SET latin1 NOT NULL,
     `title` varchar(200) CHARACTER SET latin1 NOT NULL,
     `publisher` varchar(200) CHARACTER SET latin1 NOT NULL,
     `label` varchar(200) CHARACTER SET latin1 NOT NULL,
     `format` varchar(200) CHARACTER SET latin1 NOT NULL,
     `warranty` varchar(200) CHARACTER SET latin1 NOT NULL,
     `moneybackexclusion` varchar(200) CHARACTER SET latin1 NOT NULL,
     `condition1` varchar(200) CHARACTER SET latin1 NOT NULL,
     `producttype` varchar(200) CHARACTER SET latin1 NOT NULL DEFAULT '1',
     `adultcontent` varchar(200) CHARACTER SET latin1 NOT NULL,
     `buyingguidename` varchar(200) CHARACTER SET latin1 NOT NULL,
     `buyingguideurl` varchar(200) CHARACTER SET latin1 NOT NULL,
     `colour` varchar(200) CHARACTER SET latin1 NOT NULL,
     `dupcolour` varchar(100) CHARACTER SET latin1 NOT NULL,
     `style` varchar(200) CHARACTER SET latin1 NOT NULL,
     `collection` varchar(200) CHARACTER SET latin1 NOT NULL,
     `parentproductid` varchar(10) CHARACTER SET latin1 NOT NULL,
     `categoryname` varchar(100) CHARACTER SET latin1 NOT NULL,
     `categoryid` varchar(11) CHARACTER SET latin1 NOT NULL,
     `advertisercategory` varchar(300) CHARACTER SET latin1 NOT NULL,
     `categorylist` varchar(200) CHARACTER SET latin1 NOT NULL,
     `thirdpartyid` varchar(64) CHARACTER SET latin1 NOT NULL,
     `thirdpartycategory` varchar(300) CHARACTER SET latin1 NOT NULL,
     `publisherid` int(11) NOT NULL,
     `publishercategory` smallint(6) NOT NULL,
     `publishercategory_name` varchar(200) CHARACTER SET latin1 NOT NULL,
     `subcategory1` smallint(6) NOT NULL,
     `subcategory1_name` varchar(200) CHARACTER SET latin1 NOT NULL,
     `subcategory2` smallint(6) NOT NULL,
     `subcategory2_name` varchar(200) CHARACTER SET latin1 NOT NULL,
     `subcategory3` smallint(6) NOT NULL,
     `subcategory3_name` varchar(200) CHARACTER SET latin1 NOT NULL,
     `subcategory4` smallint(6) NOT NULL,
     `subcategory4_name` varchar(200) CHARACTER SET latin1 NOT NULL,
     `subcategory5` smallint(6) NOT NULL,
     `subcategory5_name` varchar(200) CHARACTER SET latin1 NOT NULL,
     `commissiongroup` varchar(200) CHARACTER SET latin1 NOT NULL,
     `commission_amount` varchar(200) CHARACTER SET latin1 NOT NULL,
     `promotionaltext` varchar(300) CHARACTER SET latin1 NOT NULL,
     `preorder` varchar(200) CHARACTER SET latin1 NOT NULL,
     `hotpick` varchar(200) CHARACTER SET latin1 NOT NULL,
     `new` varchar(200) CHARACTER SET latin1 NOT NULL,
     `special` varchar(200) CHARACTER SET latin1 NOT NULL,
     `gift` varchar(200) CHARACTER SET latin1 NOT NULL,
     `startdate` varchar(30) CHARACTER SET latin1 NOT NULL,
     `enddate` varchar(30) CHARACTER SET latin1 NOT NULL,
     `offline` varchar(3) CHARACTER SET latin1 NOT NULL,
     `online` varchar(3) CHARACTER SET latin1 NOT NULL,
     `instock` varchar(3) CHARACTER SET latin1 NOT NULL,
     `stockquantity` int(11) NOT NULL,
     `deliveryavailble` varchar(10) CHARACTER SET latin1 NOT NULL,
     `deliveryDetails` varchar(500) CHARACTER SET latin1 NOT NULL,
     `deliverytime` varchar(50) CHARACTER SET latin1 NOT NULL,
     `storepickupavailable` varchar(10) CHARACTER SET latin1 NOT NULL,
     `storepickupdetails` varchar(50) CHARACTER SET latin1 NOT NULL,
     `internetonly` varchar(10) CHARACTER SET latin1 NOT NULL,
     `creditoffer` varchar(10) CHARACTER SET latin1 NOT NULL,
     `isforsale` varchar(5) CHARACTER SET latin1 NOT NULL,
     `currency` varchar(50) CHARACTER SET latin1 NOT NULL,
     `saleprice` double(10,2) NOT NULL,
     `price` float(8,2) NOT NULL,
     `retailprice` double(10,2) NOT NULL,
     `storeprice` double(10,2) NOT NULL,
     `fromprice` varchar(3) CHARACTER SET latin1 NOT NULL,
     `vat` varchar(10) CHARACTER SET latin1 NOT NULL,
     `priceexvat` varchar(10) CHARACTER SET latin1 NOT NULL,
     `priceincvat` varchar(10) CHARACTER SET latin1 NOT NULL,
     `pricevaliduntil` varchar(100) CHARACTER SET latin1 NOT NULL,
     `standardshippingcost` double(10,2) NOT NULL,
     `ship2mecost` varchar(10) CHARACTER SET latin1 NOT NULL,
     `totaldeliverycost` varchar(10) CHARACTER SET latin1 NOT NULL,
     `totalproductcost` varchar(10) CHARACTER SET latin1 NOT NULL,
     `retailerdescription` varchar(500) CHARACTER SET latin1 NOT NULL,
     `localoronline` varchar(10) CHARACTER SET latin1 NOT NULL,
     `vatinfo` int(3) NOT NULL,
     `vatrate` varchar(10) CHARACTER SET latin1 NOT NULL,
     `shippingwhere` varchar(100) CHARACTER SET latin1 NOT NULL,
     `shippinginfo` varchar(100) CHARACTER SET latin1 NOT NULL,
     `freeshipping` varchar(200) CHARACTER SET latin1 NOT NULL,
     `retailerreviews` varchar(250) CHARACTER SET latin1 NOT NULL,
     `retailerrating` int(11) NOT NULL,
     `expertreviews` varchar(200) CHARACTER SET latin1 NOT NULL,
     `expertratings` int(6) NOT NULL,
     `videoreview` varchar(200) CHARACTER SET latin1 NOT NULL,
     `videoreviewtext` varchar(500) CHARACTER SET latin1 NOT NULL,
     `productreviews` varchar(250) CHARACTER SET latin1 NOT NULL,
     `productrating` int(11) NOT NULL,
     `thirdpartyuserreviews` varchar(20) CHARACTER SET latin1 NOT NULL,
     `thirdpartyuserratings` varchar(20) CHARACTER SET latin1 NOT NULL,
     `numberofreviews` varchar(20) CHARACTER SET latin1 NOT NULL,
     `overallcustomerrating` varchar(20) CHARACTER SET latin1 NOT NULL,
     `filters` smallint(1) NOT NULL,
     `ages` varchar(20) CHARACTER SET latin1 NOT NULL,
     `size` varchar(500) CHARACTER SET latin1 NOT NULL,
     `size_h` varchar(50) CHARACTER SET latin1 NOT NULL,
     `size_w` varchar(50) CHARACTER SET latin1 NOT NULL,
     `size_d` varchar(50) CHARACTER SET latin1 NOT NULL,
     `size_unit` varchar(50) CHARACTER SET latin1 NOT NULL,
     `wifi_ready` varchar(200) CHARACTER SET latin1 NOT NULL,
     `bundled_deals_packages` varchar(10) CHARACTER SET latin1 NOT NULL,
     `service_provider` varchar(500) CHARACTER SET latin1 NOT NULL,
     `how_many_seats` varchar(100) CHARACTER SET latin1 NOT NULL,
     `characters` varchar(500) CHARACTER SET latin1 NOT NULL,
     `popular` int(2) NOT NULL DEFAULT '0',
     `match_modelno` int(11) NOT NULL,
     `aff_name` varchar(100) CHARACTER SET latin1 NOT NULL,
     `clean_modelno` varchar(200) CHARACTER SET latin1 NOT NULL,
     `pa_time_stamp` date NOT NULL,
     `status` smallint(1) NOT NULL,
     `14_july_modelno` varchar(100) CHARACTER SET latin1 NOT NULL,
     `datentry_modelno` varchar(200) CHARACTER SET latin1 NOT NULL,
     `new_old` varchar(100) CHARACTER SET latin1 NOT NULL,
     `last_cat_update` varchar(100) CHARACTER SET latin1 NOT NULL,
     `clean_modelno_dataentry` varchar(100) CHARACTER SET latin1 NOT NULL,
     `pa_categorization` smallint(1) NOT NULL,
     `seo_title` varchar(500) NOT NULL,
     `seo_desc` varchar(500) NOT NULL,
     PRIMARY KEY (`product_id`),
     KEY `subcategory1` (`subcategory1`),
     KEY `subcategory2` (`subcategory2`),
     KEY `subcategory3` (`subcategory3`),
     KEY `subcategory4` (`subcategory4`),
     KEY `subcategory5` (`subcategory5`),
     KEY `price` (`price`),
     KEY `status` (`status`),
     KEY `advertiserid` (`advertiserid`),
     KEY `pa_categorization` (`pa_categorization`),
     KEY `pa_time_stamp` (`pa_time_stamp`),
     KEY `Group_by` (`clean_modelno`,`publishercategory`),
     FULLTEXT KEY `name` (`name`,`description`,`keywords`)
    ) ENGINE=MyISAM AUTO_INCREMENT=3689200 DEFAULT CHARSET=utf8
    
    Code (markup):


    Result for EXPLAIN SELECT...
    
    id:	1
    select_type:	SIMPLE
    table:	e
    type:	ref
    possible_keys:	subcategory2,status
    key:	subcategory2
    key_len:	2
    ref:	const
    rows:	92713
    Extra:Using where; Using filesort
    
    Code (markup):


    Result for "SHOW TABLE STATUS LIKE 'elec_products'":
    
    Name: 	elec_products
    Engine: 	MyISAM
    Version: 	10
    Row_format: 	Dynamic
    Rows: 	3689199
    Avg_row_length: 	838
    Data_length: 	3126241800
    Max_data_length: 	2.81475E+14
    Index_length: 	1209323520
    Data_free: 	0
    Auto_increment: 	3689200
    Create_time :	12/30/2010 17:54
    Update_time :	12/31/2010 17:32
    Check_time: 	12/30/2010 18:01
    Collation: 	utf8_general_ci
    Checksum :	NULL
    Create_options: 	
    Comment	:
    
    Code (markup):

    Please help me to findout the bug.
     
    rahulephp, Jan 3, 2011 IP
  6. jkl6

    jkl6 Peon

    Messages:
    70
    Likes Received:
    1
    Best Answers:
    0
    Trophy Points:
    0
    #6
    If you run the following, how long does it take? If that comes back fast, then we might be on to something.

    
    SELECT 
     e.publishercategory,
     e.clean_modelno
     FROM elec_products as e
     
     WHERE status = 1
     AND (subcategory2 = 3115) 
     GROUP BY clean_modelno, publishercategory
     ORDER BY total DESC
    
    Code (markup):
     
    jkl6, Jan 5, 2011 IP
  7. duben

    duben Active Member

    Messages:
    50
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    61
    #7
    Do you need calculate this on live? Sometimes help pre-calculated values saved to some table. Refresh data inside pre-calculated table in some relevant period.
     
    duben, Jan 6, 2011 IP