• 欢迎访问搞代码网站,推荐使用最新版火狐浏览器和Chrome浏览器访问本网站!
  • 如果您觉得本站非常有看点,那么赶紧使用Ctrl+D 收藏搞代码吧

MySQL万万级别表数据中提高RAND随机查询的实验_mysql

mysql 搞代码 7年前 (2018-06-07) 142次浏览 已收录 0个评论

mysql千万级别表数据中提高RAND随机查询的实验

1,准备测试数据

1.1 建库建表

mysql> create database hwdb;

Query OK, 1 row affected (0.34 sec)

mysql>

mysql> show create table t_huawei;

+———-+—————————————————————————————————————————————————————————————————————————————————–+

| Table   | Create Table                                                                                                                                                                                                                                       |

+———-+—————————————————————————————————————————————————————————————————————————————————–+

| t_huawei | CREATE TABLE `t_huawei` (

 `hwid` int(11) DEFAULT NULL,

 `content` varchar(4000) DEFAULT NULL,

 `TYPE` char(2) DEFAULT NULL,

 `creator_id` varchar(30) DEFAULT NULL,

 `create_time` datetime DEFAULT NULL

) ENGINE=InnoDB DEFAULT CHARSET=utf8 |

+———-+—————————————————————————————————————————————————————————————————————————————————–+

1 row in set (0.00 sec)

mysql>

1.2 准备存储过程

DELIMITER $$

USE `hwdb`$$

DROP PROCEDURE IF EXISTS `pro_insert_hw`$$

CREATE DEFINER=`root`@`localhost` PROCEDURE`pro_insert_hw`(num INT)

BEGIN

    DECLARE var1 INT DEFAULT 0;   

   WHILE var1<num DO   

   SET var1=var1+1;   

  INSERT INTO hwdb.t_huawei(hwid,content,TYPE,creator_id,create_time)  

   VALUES (var1,CONCAT(var1,’ – ‘,’Players supposedly said they”d avoidteam facility this summer if Thibodeau remained the ‘),

0,10013,NOW());   

   END WHILE;    

   END$$

DELIMITER ;

mysql> show create table t_huawei;

+———-+—————————————————————————————————————————————————————————————————————————————————–+

| Table   | Create Table                                                                                                                                                                                                                                       |

+———-+—————————————————————————————————————————————————————————————————————————————————–+

| t_huawei | CREATE TABLE `t_huawei` (

 `hwid` int(11) DEFAULT NULL,

 `content` varchar(4000) DEFAULT NULL,

 `TYPE` char(2) DEFAULT NULL,

  `creator_id`varchar(30) DEFAULT NULL,

 `create_time` datetime DEFAULT NULL

) ENGINE=InnoDB DEFAULT CHARSET=utf8 |

+———-+—————————————————————————————————————————————————————————————————————————————————–+

1 row in set (0.00 sec)

mysql>

1.3 开始录入数据

因为要录入千万级别数据,所以用后台mysql进程来录入,如下所示:

nohup /usr/local/mysql/bin/mysql -uroot–password="" -S /usr/local/mysql/mysql.sock -e "select now();truncate table hwdb.t_huawei; call hwdb.pro_insert_hw(10000000); select now();" > /data/call.log &

这里因为测试环境机器性能有限,在录入到7023850条的时候,我终止了进程,开始准备测试工作:

mysql> SELECT COUNT(1) FROM `t_huawei` ;

+———-+

| COUNT(1) |

+———-+

| 7023850 |

+———-+

1 row in set (5.52 sec)

mysql>

2,开始测试

2.1 普通的ORDER BY RAND()测试很慢很慢,如下所示,需要块1分钟的时间

mysql> SELECT  SQL_NO_CACHE * FROM `t_huawei` AS t1  ORDER BY RAND() LIMIT 5;

+———+—————————————————————————————————–+——+————+———————+

| hwid   | content                                                                                            | TYPE |creator_id | create_time         |

+———+—————————————————————————————————–+——+————+———————+

| 3743323 | 3743323 – Players supposedlysaid they’d avoid team facility this summer if Thibodeau remained the  | 0   | 10013      | 2015-06-08 20:43:01|

| 2418491 | 2418491 – Players supposedlysaid they’d avoid team facility this summer if Thibodeau remained the  | 0   | 10013      | 2015-06-08 20:19:31|

| 1224667 | 1224667 – Players supposedlysaid they’d avoid team facility this summer if Thibodeau remained the  | 0   | 10013      | 2015-06-08 19:58:35|

| 1639270 | 1639270 – Players supposedlysaid they’d avoid team facility this summer if Thibodeau remained the  | 0   | 10013      | 2015-06-08 20:05:50|

| 2756470 | 2756470 – Players supposedlysaid they’d avoid team facility this summer if Thibodeau remained the  | 0   | 10013      | 2015-06-08 20:25:28|

+———+—————————————————————————————————–+——+————+———————+

5 rows in set (54.92 sec)

mysql>

原因是:MYSQL的官方手册,里面针对RAND()的提示大概意思就是,在ORDER BY从句里面不能使用RAND()函数,因为这样会导致数据列被多次扫描。但是在MYSQL 3.23版本中,仍然可以通过ORDER BY RAND()来实现随机。

2.2 JOIN中使用RAND()* (SELECTMAX(hwid) 的方式来实现

mysql> SELECT SQL_NO_CACHE t1.* FROM`t_huawei` AS t1 LEFT JOIN(SELECT ROUND(RAND()* (SELECT MAX(hwid) FROM  `t_huawei` )) AS id ) AS t2 ONt1.`hwid`>=t2.id ORDER BY t1.`hwid` ASC LIMIT 5;

+——+———————————————————————————————–+——+————+———————+

| hwid | content                                                                                      | TYPE | creator_id | create_time        |

+——+———————————————————————————————–+——+————+———————+

|   1 | 1 – Players supposedly said they’d avoid team facility this summerif Thibodeau remained the  | 0    | 10013     | 2015-06-08 19:37:02 |

|   2 | 2 – Players supposedly said they’d avoid team facility this summerif Thibodeau remained the  | 0    | 10013     | 2015-06-08 19:37:02 |

|   3 | 3 – Players supposedly said they’d avoid team facility this summerif Thibodeau remained the  | 0    | 10013     | 2015-06-08 19:37:02 |

|   4 | 4 – Players supposedly said they’d avoid team facility this summerif Thibodeau remained the  | 0    | 10013     | 2015-06-08 19:37:02 |

|   5 | 5 – Players supposedly said they’d avoid team facility this summerif Thibodeau remained the  | 0    | 10013     | 2015-06-08 19:37:02 |

+——+———————————————————————————————–+——+————+———————+

5 rows in set (50.04 sec)

mysql>

需要50面,比起直接的order by来说少了4秒,但是还是比较耗时的。

2.3 WHERE子句的方式来优化

mysql> SELECT SQL_NO_CACHE t1.* FROMhwdb.`t_huawei` t1 WHERE t1.`hwid` >=(SELECT FLOOR(RAND() * (SELECTMAX(t2.hwid) FROM hwdb.`t_huawei` t2 ) )) ORDER BY t1.hwid LIMIT 5;

+——+————————————————————————————————–+——+————+———————+

| hwid | content                                                                                         | TYPE | creator_id | create_time        |

+——+————————————————————————————————–+——+————+———————+

| 1624 | 1624 – Players supposedly saidthey’d avoid team facility this summer if Thibodeau remained the  | 0   | 10013      | 2015-06-08 19:37:05|

| 6076 | 6076 – Players supposedly saidthey’d avoid team facility this summer if Thibodeau remained the  | 0   | 10013      | 2015-06-08 19:37:11|

| 7988 | 7988 – Players supposedly saidthey’d avoid team facility this summer if Thibodeau remained the  | 0   | 10013      | 2015-06-08 19:37:14|

| 8016 | 8016 – Players supposedly saidthey’d avoid team facility this summer if Thibodeau remained the  | 0   | 10013      | 2015-06-08 19:37:14|

| 8106 | 8106 – Players supposedly saidthey’d avoid team facility this summer if Thibodeau remained the  | 0   | 10013      | 2015-06-08 19:37:14|

+——+————————————————————————————————–+——+————+———————+

5 rows in set (7.25 sec)

mysql>

花了7.25秒,比上一种要快的多很多了。

2.4 MAX(hwid)- MIN(hwid)的方式来RAND来操作

mysql> SELECT SQL_NO_CACHE * FROMhwdb.`t_huawei` WHERE hwid >= (SELECT FLOOR(RAND() * ((SELECT MAX(hwid) FROMhwdb.`t_huawei`) – (SELECT MIN(hwid) FROM hwdb.`t_huawei`)) + (SELECT MIN(hwid)FROM hwdb.`t_huawei`))) ORDER BY hwid LIMIT 5;

+——+————————————————————————————————–+——+————+———————+

| hwid | content                                                                                          |TYPE | creator_id | create_time         |

+——+————————————————————————————————–+——+————+———————+

| 5438 | 5438 – Players supposedly saidthey’d avoid team facility this summer if Thibodeau remained the  | 0   | 10013      | 2015-06-08 19:37:10|

| 5883 | 5883 – Players supposedly saidthey’d avoid team facility this summer if Thibodeau remained the  | 0   | 10013      | 2015-06-08 19:37:10|

| 6782 | 6782 – Players supposedly saidthey’d avoid team facility this summer if Thibodeau remained the  | 0   | 10013      | 2015-06-08 19:37:12|

| 7797 | 7797 – Players supposedly saidthey’d avoid team facility this summer if Thibodeau remained the  | 0   | 10013      | 2015-06-08 19:37:13|

| 8625 | 8625 – Players supposedly saidthey’d avoid team facility this summer if Thibodeau remained the  | 0   | 10013      | 2015-06-08 19:37:15|

+——+————————————————————————————————–+——+————+———————+

5 rows in set (13.50 sec)

mysql>

MAX(hwid)- MIN(hwid)的方式来RAND来操作需要13.5秒,比where子句花时间要多一些。


3,总结

WHERE子句的方式效率最高,样例sql:SELECTSQL_NO_CACHE t1.* FROM hwdb.`t_huawei` t1 WHERE t1.`hwid` >=(SELECTFLOOR(RAND() * (SELECT MAX(t2.hwid) FROM hwdb.`t_huawei` t2 ) )) ORDER BYt1.hwid LIMIT 5;

版权声明:本文为博主原创文章,未经博主允许不得转载。

欢迎大家阅读《MySQL万万级别表数据中提高RAND随机查询的实验_mysql》,跪求各位点评,by 搞代码


搞代码网(gaodaima.com)提供的所有资源部分来自互联网,如果有侵犯您的版权或其他权益,请说明详细缘由并提供版权或权益证明然后发送到邮箱[email protected],我们会在看到邮件的第一时间内为您处理,或直接联系QQ:872152909。本网站采用BY-NC-SA协议进行授权
转载请注明原文链接:MySQL万万级别表数据中提高RAND随机查询的实验_mysql

喜欢 (0)
[搞代码]
分享 (0)
发表我的评论
取消评论

表情 贴图 加粗 删除线 居中 斜体 签到

Hi,您需要填写昵称和邮箱!

  • 昵称 (必填)
  • 邮箱 (必填)
  • 网址