什么是数据库同步
指在存储类型、格式和计算机系统之间的数据转换,这个名词没有严格的定义,反正就那个意思
mysql配置
看是否有my.ini文件,有就直接跳过到第4步,如果没有就跟着我的操作来
1.删除mysql服务:
终端下,进入mysql的bin目录:
<span style="font-family:">sc delete MySql</span>
www#gaodaima.com来源gaodai#ma#com搞*代#码网搞代码
2.在mysql的根目录下创建一个my.ini文件,
把下面的内容放进去保存:
<span style="font-family:"><span style="color: #008000">#</span><span style="color: #008000"> For advice on how to change settings please see</span><span style="color: #008000"> #</span><span style="color: #008000"> http://dev.mysql.com/doc/refman/5.6/en/server-configuration-defaults.html</span><span style="color: #008000"> #</span><span style="color: #008000"> *** DO NOT EDIT THIS FILE. It"s a template which will be copied to the</span><span style="color: #008000"> #</span><span style="color: #008000"> *** default location during install, and will be replaced if you</span><span style="color: #008000"> #</span><span style="color: #008000"> *** upgrade to a newer version of MySQL.</span> <span style="color: #000000">[client] default</span>-character-set =<span style="color: #000000"> utf8mb4 [mysql] default</span>-character-set =<span style="color: #000000"> utf8mb4 [mysqld] character</span>-set-client-handshake =<span style="color: #000000"> FALSE character</span>-set-server =<span style="color: #000000"> utf8mb4 collation</span>-server =<span style="color: #000000"> utf8mb4_bin init_connect</span>=<span style="color: #800000">"</span><span style="color: #800000">SET NAMES utf8mb4</span><span style="color: #800000">"</span> <span style="color: #008000">#</span><span style="color: #008000"> Remove leading # and set to the amount of RAM for the most important data</span><span style="color: #008000"> #</span><span style="color: #008000"> cache in MySQL. Start at 70% of total RAM for dedicated server, else 10%.</span> innodb_buffer_pool_size =<span style="color: #000000"> 128M </span><span style="color: #008000">#</span><span style="color: #008000"> Remove leading # to turn on a very important data integrity option: logging</span><span style="color: #008000"> #</span><span style="color: #008000"> changes to the binary log between backups.</span><span style="color: #008000"> #</span><span style="color: #008000"> log_bin</span><span style="color: #008000"> #</span><span style="color: #008000"> These are commonly set, remove the # and set as required.</span> basedir =<span style="color: #000000"> D:MySQL datadir </span>=<span style="color: #000000"> D:MySQLdata port </span>= 3306 <span style="color: #008000">#</span><span style="color: #008000"> server_id = .....</span><span style="color: #008000"> #</span><span style="color: #008000"> Remove leading # to set options mainly useful for reporting servers.</span><span style="color: #008000"> #</span><span style="color: #008000"> The server defaults are faster for transactions and fast SELECTs.</span><span style="color: #008000"> #</span><span style="color: #008000"> Adjust sizes as needed, experiment to find the optimal values.</span> join_buffer_size =<span style="color: #000000"> 128M sort_buffer_size </span>=<span style="color: #000000"> 16M read_rnd_buffer_size </span>=<span style="color: #000000"> 16M sql_mode</span>=NO_ENGINE_SUBSTITUTION,STRICT_TRANS_TABLES</span>
3.重新生成data文件夹
此时此刻会删除之前的已有的数据,所以,这里我建议你先把之前的data文件夹备份一下,但我个人是直接把data文件夹重命名为data123
同样的,终端下,bin目录下,
<span style="font-family:">mysqld --initialize-insecure --user=mysql</span>
生成data之后,再进入data123文件夹,把里面的文件全部复制到data文件夹下即可,我试了下,数据没有丢失,但是不保证绝对没有问题,如果担心的朋友,可以提前把数据导出成sql文件,之后再把sql文件导入即可
4.重新添加服务并关联上my.ini文件
终端,bin目录下
<span style="font-family:">mysqld --install <span style="color: #800000">"</span><span style="color: #800000">MySql</span><span style="color: #800000">"</span> --defaults-file=<span style="color: #800000">"</span><span style="color: #800000">你刚才创建的my.ini目录路径</span><span style="color: #800000">"</span></span>
此时就能在服务里看到它了
启动服务,如果能正常启动,那就没有问题了。
如果已有my.ini的话,关闭mysql服务,重启mysql即可
5.设置binlog
my.ini作如下配置:
<span style="font-family:"><span style="color: #008000">#</span><span style="color: #008000"> For advice on how to change settings please see</span><span style="color: #008000"> #</span><span style="color: #008000"> http://dev.mysql.com/doc/refman/5.6/en/server-configuration-defaults.html</span><span style="color: #008000"> #</span><span style="color: #008000"> *** DO NOT EDIT THIS FILE. It"s a template which will be copied to the</span><span style="color: #008000"> #</span><span style="color: #008000"> *** default location during install, and will be replaced if you</span><span style="color: #008000"> #</span><span style="color: #008000"> *** upgrade to a newer version of MySQL.</span> <span style="color: #000000">[client] default</span>-character-set =<span style="color: #000000"> utf8mb4 [mysql] default</span>-character-set =<span style="color: #000000"> utf8mb4 [mysqld] character</span>-set-client-handshake =<span style="color: #000000"> FALSE character</span>-set-server =<span style="color: #000000"> utf8mb4 collation</span>-server =<span style="color: #000000"> utf8mb4_bin init_connect</span>=<span style="color: #800000">"</span><span style="color: #800000">SET NAMES utf8mb4</span><span style="color: #800000">"</span> <span style="color: #008000">#</span><span style="color: #008000"> Remove leading # and set to the amount of RAM for the most important data</span><span style="color: #008000"> #</span><span style="color: #008000"> cache in MySQL. Start at 70% of total RAM for dedicated server, else 10%.</span> innodb_buffer_pool_size =<span style="color: #000000"> 128M </span><span style="color: #008000">#</span><span style="color: #008000"> Remove leading # to turn on a very important data integrity option: logging</span><span style="color: #008000"> #</span><span style="color: #008000"> changes to the binary log between backups.</span><span style="color: #008000"> #</span><span style="color: #008000"> log_bin</span><span style="color: #008000"> #</span><span style="color: #008000"> These are commonly set, remove the # and set as required.</span> basedir = D:programemysql-5.7.28-<span style="color: #000000">winx64 datadir </span>= D:programemysql-5.7.28-<span style="color: #000000">winx64data port </span>= 3306 <span style="color: #008000">#</span><span style="color: #008000"> server_id = .....</span><span style="color: #008000"> #</span><span style="color: #008000"> Remove leading # to set options mainly useful for reporting servers.</span><span style="color: #008000"> #</span><span style="color: #008000"> The server defaults are faster for transactions and fast SELECTs.</span><span style="color: #008000"> #</span><span style="color: #008000"> Adjust sizes as needed, experiment to find the optimal values.</span> join_buffer_size =<span style="color: #000000"> 128M sort_buffer_size </span>=<span style="color: #000000"> 16M read_rnd_buffer_size </span>=<span style="color: #000000"> 16M sql_mode</span>=<span style="color: #000000">NO_ENGINE_SUBSTITUTION,STRICT_TRANS_TABLES </span><span style="color: #008000">#</span><span style="color: #008000">binlog日志名称前缀</span> log-bin=mysql-<span style="color: #000000">bin </span><span style="color: #008000">#</span><span style="color: #008000">默认值未0,如果使用默认值则不能和从节点通信,这个值的区间是:1到(2^32)-1</span> server-id=1<span style="color: #000000"> binlog</span>-format=<span style="color: #000000">Row binlog_row_image </span>=<span style="color: #000000"> full max_binlog_size </span>= 1G</span>
相关的binlog参数:
<span style="font-family:">log-bin [=file_name] 此参数表示启用binlog日志功能,并可以定制路径名称,默认为mysql-<span style="color: #000000">bin。 binlog_format 此参数配置binlog的日志格式,默认为mixed。 max_binlog_size此参数配置binlog的日志最大值,最大和默认值是1GB。 max_binlog_cache_size此参数表示binlog使用最大内存的数。 binlog</span>-do-db=<span style="color: #000000">db_name 此参数表示只记录指定数据库的二进制日志。 binlog</span>-ignore-db=<span style="color: #000000">db_name此参数表示不记录指定的数据库的二进制日志。 expire_logs_days 此参数表示binlog日志保留的时间,默认单位是天。</span></span>
6.重启mysql服务即可
咋重启就不说了
7.进入mysql终端验证是否开启
<span style="font-family:">show variables like <span style="color: #800000">"</span><span style="color: #800000">%log_bin%</span><span style="color: #800000">"</span><span style="color: #000000">; show variables like </span><span style="color: #800000">"</span><span style="color: #800000">%log_bin%</span><span style="color: #800000">"</span>;</span>
on意思就是开启binlog成功
什么时候会产生binLog文件
- 当mysql停止或重启时,mysql会在重启时生成一个新的binlog文件,文件序号依次递增。
- binlog文件超过max_binlog_size 系统变量配置的上限时
- mysql命令中执行flush logs命令
查看日志文件
<span style="font-family:">show master logs;</span>
删除日志
按名字删除,只删除给出该名字之前的,并不包括该文件
<span style="font-family:">purge {master|binary} logs to <span style="color: #800000">"</span><span style="color: #800000">log_name</span><span style="color: #800000">"</span>;</span>
按日期删除,只删除给出该日志之前的,并不包括该文件
<span style="font-family:">purge {master|binary} logs before <span style="color: #800000">"</span><span style="color: #800000">date’;</span></span>
删除全部binlog:
<span style="font-family:">reset master;</span>
使用binlog恢复数据
<span style="font-family:">mysqlbinlog xxx-bin.000001|mysql -u root –p password dataname</span>
python 相关接入
其实使用mysql自带的mysqlbinlog也可以恢复,但是诸多的使用场景下,并不太适合,所以就有了第三方的接入控制。
python的话,我找了一共四个库可以处理binlog
- mysql-replication:国外一个大佬开发的,实现数据的复制
- binlog2sql:基于mysql-replication,大众点评的dba开发的,将binlog日志的sql转出
- py-mysql-binlogserver:实现数据的半同步
- binlog2cache:将binlog日志转到redis内
本文主要介绍mysql-replication和binlog2sql
mysql-replication
安装mysql-replication
<span style="font-family:">pip install mysql-replication</span>
然后使用如下代码运行起来,实时监听mysql数据的变动:
<span style="font-family:"><span style="color: #0000ff">from</span> pymysqlreplication <span style="color: #0000ff">import</span><span style="color: #000000"> BinLogStreamReader </span><span style="color: #0000ff">from</span> pymysqlreplication.row_event <span style="color: #0000ff">import</span><span style="color: #000000"> ( DeleteRowsEvent, UpdateRowsEvent, WriteRowsEvent, ) </span><span style="color: #0000ff">import</span><span style="color: #000000"> sys </span><span style="color: #0000ff">import</span><span style="color: #000000"> json mysql_settings </span>= {<span style="color: #800000">"</span><span style="color: #800000">host</span><span style="color: #800000">"</span>: <span style="color: #800000">"</span><span style="color: #800000">127.0.0.1</span><span style="color: #800000">"</span>,<span style="color: #800000">"</span><span style="color: #800000">port</span><span style="color: #800000">"</span>: 3306<span style="color: #000000">, </span><span style="color: #800000">"</span><span style="color: #800000">user</span><span style="color: #800000">"</span>: <span style="color: #800000">"</span><span style="color: #800000">root</span><span style="color: #800000">"</span>, <span style="color: #800000">"</span><span style="color: #800000">passwd</span><span style="color: #800000">"</span>: <span style="color: #800000">""</span><span style="color: #000000">} </span><span style="color: #0000ff">def</span><span style="color: #000000"> main(): stream </span>=<span style="color: #000000"> BinLogStreamReader( connection_settings</span>=<span style="color: #000000">mysql_settings, server_id</span>=1<span style="color: #000000">, blocking</span>=<span style="color: #000000">True, only_events</span>=<span style="color: #000000">[DeleteRowsEvent, WriteRowsEvent, UpdateRowsEvent]) </span><span style="color: #0000ff">for</span> binlogevent <span style="color: #0000ff">in</span><span style="color: #000000"> stream: </span><span style="color: #0000ff">for</span> row <span style="color: #0000ff">in</span><span style="color: #000000"> binlogevent.rows: event </span>= {<span style="color: #800000">"</span><span style="color: #800000">schema</span><span style="color: #800000">"</span>: binlogevent.schema, <span style="color: #800000">"</span><span style="color: #800000">table</span><span style="color: #800000">"</span>: binlogevent.table, <span style="color: #800000">"</span><span style="color: #800000">log_pos</span><span style="color: #800000">"</span><span style="color: #000000">: binlogevent.packet.log_pos} </span><span style="color: #0000ff">if</span><span style="color: #000000"> isinstance(binlogevent, DeleteRowsEvent): event[</span><span style="color: #800000">"</span><span style="color: #800000">action</span><span style="color: #800000">"</span>] = <span style="color: #800000">"</span><span style="color: #800000">delete</span><span style="color: #800000">"</span><span style="color: #000000"> event[</span><span style="color: #800000">"</span><span style="color: #800000">values</span><span style="color: #800000">"</span>] = dict(row[<span style="color: #800000">"</span><span style="color: #800000">values</span><span style="color: #800000">"</span><span style="color: #000000">].items()) event </span>=<span style="color: #000000"> dict(event.items()) </span><span style="color: #0000ff">elif</span><span style="color: #000000"> isinstance(binlogevent, UpdateRowsEvent): event[</span><span style="color: #800000">"</span><span style="color: #800000">action</span><span style="color: #800000">"</span>] = <span style="color: #800000">"</span><span style="color: #800000">update</span><span style="color: #800000">"</span><span style="color: #000000"> event[</span><span style="color: #800000">"</span><span style="color: #800000">before_values</span><span style="color: #800000">"</span>] = dict(row[<span style="color: #800000">"</span><span style="color: #800000">before_values</span><span style="color: #800000">"</span><span style="color: #000000">].items()) event[</span><span style="color: #800000">"</span><span style="color: #800000">after_values</span><span style="color: #800000">"</span>] = dict(row[<span style="color: #800000">"</span><span style="color: #800000">after_values</span><span style="color: #800000">"</span><span style="color: #000000">].items()) event </span>=<span style="color: #000000"> dict(event.items()) </span><span style="color: #0000ff">elif</span><span style="color: #000000"> isinstance(binlogevent, WriteRowsEvent): event[</span><span style="color: #800000">"</span><span style="color: #800000">action</span><span style="color: #800000">"</span>] = <span style="color: #800000">"</span><span style="color: #800000">insert</span><span style="color: #800000">"</span><span style="color: #000000"> event[</span><span style="color: #800000">"</span><span style="color: #800000">values</span><span style="color: #800000">"</span>] = dict(row[<span style="color: #800000">"</span><span style="color: #800000">values</span><span style="color: #800000">"</span><span style="color: #000000">].items()) event </span>=<span style="color: #000000"> dict(event.items()) </span><span style="color: #0000ff">print</span>(json.dumps(event,ensure_ascii=<span style="color: #000000">False)) sys.stdout.flush() </span><span style="color: #0000ff">if</span> <span style="color: #800080">__name__</span> == <span style="color: #800000">"</span><span style="color: #800000">__main__</span><span style="color: #800000">"</span><span style="color: #000000">: main()</span></span>
我这里用nvicat增,删,改了三条数据
代码自动打印如下:
虽然能很清晰的看出具体做了那些操作,但是,此时此刻我想将这些操作同步到另一个数据库的话,我每一个命令都需要转换一下,实话说,有点不顺手,太繁琐了
binlog2sql
我现在的场景是需要对数据库进行远程同步,一个主数据库,一个从数据库,主数据库做了任何数据的更新操作,从数据库都要同步到。所以我觉得binlog2sql更实用一点
1.首先,安装binlog2sql,终端下执行:
<span style="font-family:">git clone https://github.com/danfengcao/binlog2sql.git &&<span style="color: #000000"> cd binlog2sql pip install </span>-r requirements.txt</span>
2.my.ini配置添加如下:(其实我上面给的my.ini配置已经配置了)
<span style="font-family:"><span style="color: #000000">[mysqld] server_id </span>= 1<span style="color: #000000"> log_bin </span>= /var/log/mysql/mysql-<span style="color: #000000">bin.log max_binlog_size </span>=<span style="color: #000000"> 1G binlog_format </span>=<span style="color: #000000"> row binlog_row_image </span>= full</span>
3.给定mysql指定账号的select, super/replication client, replication slave权限,
<span style="font-family:">GRANT SELECT, REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 用户名@<span style="color: #800000">"</span><span style="color: #800000">主机</span><span style="color: #800000">"</span></span>
4.查看binlog日志:
5.执行binlog2sql
打开终端,进入刚才下载的binglog2sql目录的binlog2sql.py文件同文件目录下:我这里是 C:UsersAdministratorDesktopinlog2sqlinlog2sql
<span style="font-family:">python binlog2sql.py -h127.0.0.1 -P3306 -uadmin -p <span style="color: #800000">"</span><span style="color: #800000">用户名</span><span style="color: #800000">"</span> -d 数据库名 -t 表名 --start-file=<span style="color: #800000">"</span><span style="color: #800000">日志文件名</span><span style="color: #800000">"</span></span>
打印结果:
由于我中间重启过一次,所以,刚才的增删改命令分散在000002和000003文件里了:
好的,sql语句被我们拿到了,上面有些打了马赛克的,是一些无关的数据
可以通过给定的开始和结束的位置,可以得出当前操作与上一次操作的变更,这个变更,在上面就给出来了
<span style="font-family:">python binlog2sql.py --flashback -h127.0.0.1 -P3306 -u root -p <span style="color: #800000">""</span> -d 数据库名 -t 表名 --start-file=<span style="color: #800000">"</span><span style="color: #800000">mysql-bin.000002</span><span style="color: #800000">"</span> --start-position=4 --stop-position=458</span>
也可以把这些sql语句输出到一个sql语句里:
<span style="font-family:">python binlog2sql.py -h127.0.0.1 -P3306 -uroot -p -d 数据库名 --start-file=日志名 --start-position=变更开始位置 --stop-position=变更结束位置 -B > 输出的文件名.sql</span>
windows下的报错:
但是这里却报错了,找到这个源文件的该行:
修改如下:
再次运行命令,不再报错,
同时binlog2sql目录下也有了这个文件:
打开这个文件:没有问题,即可直接导入sql语句
完整实用的sql:
现在我的需求是,做数据库同步,不管主数据库做什么操作,我的从数据库都要同步。所以,以下的命令即可完成我的需求
<span style="font-family:">python3 binlog2sql.py -h%s -P%s -u %s -p "%s" -d %s -t %s --start-file="%s" --start-position=%s --stop-position=%s > %s --only-dml --sql-type INSERT DELETE UPDATE </span>
将得到的sql语句同步到从数据库
现在sql语句有这么多
我这有两个方案:
第一个:读取每一个sql文件,然后一行一行的sql语句遍历操作,使用pymysql执行
第二个:直接终端下做备份操作
我最开始使用的是第二个方案,借用Python直接调用终端命令,选了os.popen和subProcess.popen执行,mysql导入sql数据有两种,
一种是:
mysql -u root -p
mysql>use 数据库
mysql>source xxx.sql
第二种:
mysql -u 账号 -p 数据库 < xzx.sql (也有这么用的,mysqljump -u 账号 -p 数据库 < xxx.sql)
我终端操作的话,那只能用第二种,结果,发现就是不行,导入不进去,具体原因我也不清楚,使用第一种的话,需要交互,我找了个能实现终端交互的库inspect,发现还是不够实用,那么没法,我就只能用第一个方案,读取sql一句一句操作了。
操作自然是可行的,而当我把同等的配置迁移到mac上操作时,报了如下错:
mac下操作的问题
网上查说是mysql8支持的字符串长度不止255,我看了我mac上装的mysql确实是8,而我之前在windows上测试没问题,其是mysql5.7
针对这个问题,网上一片的方案说是需要升级pymysql库,我用pip升级之后,导入成功
但是这里有个大坑,他妈的,当我再次用binlog2sql导出sql时,报了一堆的错,然后我突然反应过来,马德,binlog2sql不是指定了pymysql版本7吗,而我升级了pymyql版本是10以后,肯定binlog2sql用不了了,卧槽,顿时觉得好无语啊,解决方法就是要嘛对mysql降级,要嘛等binlog2sql更新支持最新版的mysql,要嘛自己去修改pymysql和binlog2sql和pymyreplication这三个库里的配置
pymysql开发者在issue里也没给出一个完美的解决方案:
https://github.com/PyMySQL/PyMySQL/pull/591
我也就不折腾了,在docker里拉了一个5.7版本的mysql操作了,结果自然是可行的
总结
之后要想做数据库同步,就可以将所有的变更存到一个sql文件里,然后再在另一个数据库(此时这个数据库可以是另一台服务器上的)执行sql语句即可,具体怎么实现就根据场景变通即可。
binlog2sql更多具体的操作,看官方文档:https://github.com/danfengcao/binlog2sql
根据你自己的使用场景对可选参数进行选择即可