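The blog editor keeps getting harder to use, so please bear with the formatting. If you need a better-formatted copy of this document, join QQ group 246054962.
625: Resolving a Major Database Outage at an E-commerce Site (Part 1)
This is a blow-by-blow account of resolving a live database outage on an enterprise e-commerce site. The recovery ran into many problems: clashing ideas, decisions between competing solutions, and difficulties in the hands-on work. Oldboy (老男孩) tries to describe the whole recovery, and the thinking behind it, as faithfully as possible. Copyright 老男孩教育 (Oldboy Education); commercial use of this content is prohibited.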
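Contents:
625: Resolving a Major Database Outage at an E-commerce Site
1 Receiving the alert from the e-commerce client
1.1 Initial communication with the client
1.2 In-depth communication to decide on a recovery plan
1.3 Preparing for the recovery
1.4 Carrying out the recovery *****
1.5 Cleanup after the database recovery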
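1 Receiving the alert from the e-commerce client
1.1 Initial communication with the client
Yesterday a client who runs an e-commerce site called. They had been running a flash-sale giveaway promotion, the database had run into trouble, and now it would not start at all.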
[root@etiantian etc]# /etc/init.d/mysqld start
Starting MySQL. ERROR! The server quit without updating PID file (/var/run/mysqld/mysqld.pid).
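Note: the client originally sent this as a screenshot; 老男孩 reconstructed the text afterwards from the SSH logs.
Because time was short, my first instinct was to have the client check whether /var/run/mysqld/mysqld.pid existed and, if it did, delete it and start again. The client said there was no such PID file. I then suggested starting the server with mysqld_safe --user=mysql & instead. The command returned as if the start had succeeded, but the service port still did not come up. I asked the client to check the MySQL startup log, which showed the following errors: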
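The PID file check described above can be sketched as a small shell function. This is only an illustration, not the commands actually run during the incident: the function name pid_file_state is made up here, and it assumes it is run as root (as in the session above), since kill -0 cannot probe another user's process without permission.

```shell
# Hypothetical helper: classify a PID file as
#   "missing" (no file), "stale" (file exists but no such process),
#   or "running" (file points at a live process).
pid_file_state() {
    local pid_file="$1" pid
    if [ ! -f "$pid_file" ]; then
        echo "missing"
        return 0
    fi
    pid="$(cat "$pid_file")"
    # Treat empty or non-numeric content as a stale file.
    case "$pid" in
        ''|*[!0-9]*) echo "stale"; return 0 ;;
    esac
    # kill -0 sends no signal; it only tests whether the process exists.
    if kill -0 "$pid" 2>/dev/null; then
        echo "running"
    else
        echo "stale"
    fi
}
```

Calling pid_file_state /var/run/mysqld/mysqld.pid on the client's machine would have printed "missing" in this case. Only a "stale" result would justify deleting the file before a restart. Whether the port came up can be checked separately with something like netstat -lnt, assuming the server listens on the default port 3306, which the source does not state.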
[root@etiantian etc]# cat /var/log/mysqld.log
140624 18:51:58 mysqld_safe Starting mysqld daemon with databases from /data/mysql/
140624 18:51:58 InnoDB: The InnoDB memory heap is disabled
140624 18:51:58 InnoDB: Mutexes and rw_locks use GCC atomic builtins
140624 18:51:58 InnoDB: Compressed tables use zlib 1.2.3
140624 18:51:58 InnoDB: Initializing buffer pool, size = 768.0M
140624 18:51:58 InnoDB: Completed initialization of buffer pool
InnoDB: Error: auto-extending data file ./ibdata1 is of a different size
InnoDB: 2176 pages (rounded down to MB) than specified in the .cnf file:
InnoDB: initial 65536 pages, max 0 (relevant if non-zero) pages!
140624 18:51:58 InnoDB: Could not open or create data files.
140624 18:51:58 InnoDB: If you tried to add new data files, and it failed here,
140624 18:51:58 InnoDB: you should now edit innodb_data_file_path in my.cnf back
140624 18:51:58 InnoDB: to what it was, and remove the new ibdata files InnoDB created
140624 18:51:58 InnoDB: in this failed attempt. InnoDB only wrote those files full of
140624 18:51:58 InnoDB: zeros, but did not yet use them in any way. But be careful: do not
140624 18:51:58 InnoDB: remove old data files which contain your precious data!
140624 18:51:58 [ERROR] Plugin 'InnoDB' init function returned error.
140624 18:51:58 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.
140624 18:51:58 [ERROR] Unknown/unsupported storage engine: InnoDB
140624 18:51:58 [ERROR] Aborting
140624 18:51:58 [Note] /install/mysql/bin/mysqld: Shutdown complete
140624 18:51:58 mysqld_safe mysqld from pid file /var/run/mysqld/mysqld.pid ended
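Note: the client originally sent this as a screenshot; 老男孩 reconstructed the text afterwards from the SSH logs.
The error lines (marked in red in the original post) are: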
InnoDB: Error: auto-extending data file ./ibdata1 is of a different size
140624 18:51:58 [ERROR] Plugin 'InnoDB' init function returned error.
140624 18:51:58 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.
140624 18:51:58 [ERROR] Unknown/unsupported storage engine: InnoDB
140624 18:51:58 [ERROR] Aborting
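Lines like these can be pulled out of a long log with grep rather than picked out by eye. The following is a minimal sketch. The function name mysql_log_errors is hypothetical, and the pattern covers only the two error markers that appear in this particular log.

```shell
# Hypothetical helper: print only the lines of a MySQL error log that
# carry an "InnoDB: Error" or "[ERROR]" marker.
# Note: grep exits with status 1 when nothing matches.
mysql_log_errors() {
    local log_file="$1"
    grep -E 'InnoDB: Error|\[ERROR\]' "$log_file"
}
```

For the session above, the call would be mysql_log_errors /var/log/mysqld.log.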
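Based on the client's information and my own experience, my working diagnosis was that the client had probably either forcibly killed the process or altered the data files.
So I asked the client what they had done before and after the failure. The reply: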
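The size mismatch in the log can be put into more familiar units. InnoDB reports sizes in pages. Assuming the default page size of 16 KiB (the log does not show it), the 65536 pages from the configuration come to 1024 MiB, while the 2176 pages found on disk come to only 34 MiB. The sketch below does that arithmetic; the function names are made up for illustration.

```shell
# Hypothetical helper: convert an InnoDB page count to MiB.
# The second argument is the page size in KiB; 16 is the InnoDB default
# and is an assumption here, since the log does not show the page size.
pages_to_mib() {
    local pages="$1" page_kib="${2:-16}"
    echo $(( pages * page_kib / 1024 ))
}

# Hypothetical helper: number of whole pages in a data file on disk.
# InnoDB's own message rounds down to whole MB; this counts raw pages.
file_pages() {
    local file="$1" page_kib="${2:-16}" bytes
    bytes="$(wc -c < "$file")"
    echo $(( bytes / (page_kib * 1024) ))
}
```

With these, pages_to_mib 65536 prints 1024 and pages_to_mib 2176 prints 34. Since the server was started with databases from /data/mysql/, running file_pages /data/mysql/ibdata1 and comparing the result with the size given in innodb_data_file_path in my.cnf would show the same discrepancy InnoDB complained about.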
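XXXX 18:53:41 The database had already stopped responding. Before the killall there was no longer any way to do a restart.
XXXX 18:53:32 I thought something was wrong, so I ran killall, and after that it would not start. I did nothing else.
From the log and the client's description, it is fairly certain that forcibly shutting down the service left the InnoDB tablespace or data files in an abnormal state. With that, the cause and the symptoms of the failure were established.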