第1页
MySQL at NetEase
蒋鸿翔@网易杭研院
jhx1008@gmail.com
第2页
MySQL at NetEase
MySQL版本
MySQL 5.1.47、5.5.20、5.5.30
逐步转向MySQL 5.5.30
MySQL 5.6.19 (开发中)
数据库架构
master-slave replication
DDB(distribute database)
应用
云音乐、云阅读、微博、博客、相册、游戏…
几乎95%的应用都使用MySQL
第3页
NetEase MySQL
为什么?
更快速的bug修复
高可用性
高可靠性
创新
第4页
NetEase MySQL
NetEase MySQL
InnoSQL
Based on MySQL 5.5/MySQL 5.6
Merge MariaDB’s new feature
open source
Patch
Binary package
第5页
NetEase MySQL
开源
https://github.com/NetEase/innosql
文档
http://mysql.netease.com/doc/
生产环境应用
网易云音乐
网易云阅读
网易公开课
网易博客
几乎所有MySQL 5.5
第6页
Main Changes in InnoSQL
InnoDB L2 Cache
InnoDB buffer pool fast warm up
Virtual sync replication
Slave batch commit
Resource Governor
TopSQL
Parallel replication
Others(IO statistics, Role Table, XA enhance)
第7页
InnoDB L2 Cache
SSD flash cache
Support write back or write through
Storage engine level
Write I/O on SSD is sequential
zlib & quicklz
Workload
Read-intensive
Write-intensive
第8页
InnoDB L2 Cache
第9页
InnoDB L2 Cache
一个页在L2 cache中可能存在多个版本
回刷磁盘仅需最新版本的页
L2 cache替代doublewrite
干净的页有选择性的写入cache
cache中的页会move,确保其下次还能被命中
第10页
InnoDB L2 Cache
生产环境应用案例
网易云阅读
read-intensive work load
600G SAS => 120G SSD + 2T SATA
第11页
InnoDB L2 Cache
TPC-C benchmark
http://www.mysqlperformanceblog.com/2012/10/25/l2-cache-for-mysql/
第12页
InnoDB L2 Cache
第13页
InnoDB BP fast warm up
预热
目的
快速恢复到应用状态
数据库重启
故障转移
方法
SELECT index
加载太多无用数据
浪费BP空间
MySQL 5.6
dump & load BP LRU list
第14页
InnoDB BP fast warm up
InnoSQL
Normal:share memory
fastest
corrupt check
Abnormal:dump & load BP LRU
dump also LRU old info
better warm up than MySQL 5.6
第15页
Virtual Sync Replication
Based on
semi-replication
slave crash safe
第16页
Virtual Sync Replication
Replication problem #1
slave not crash safe
too many 1062
statement-based binary log
update xxx set k=k+1 where ooo=jQuery110208963872229214758_1429758282002?
lose data
even semi-replication
第17页
Virtual Sync Replication
relay binlog not atomic(MySQL 5.5)
relay binlog (database)
update relay-log.info (file)
write to os cache
sync_relay_log_info=1
第18页
Virtual Sync Replication
MySQL 5.6
store relay-info in table (InnoDB)
atomic
BEGIN;
APPLY binlog;
UPDATE slave_relay_log_info SET exec_master_log_pos=xxx, ...
COMMIT;
第19页
Virtual Sync Replication
semi-replication
commit transaction
transfer binlog to slave
wait for slave ACK
第20页
Virtual Sync Replication
semi-replication
data inconsistency !!!
第21页
Virtual Sync Replication
virtual sync replication
InnoDB generate prepare redo log
write binlog at master
transfer binlog to slave
wait slave ACK
InnoDB commit
master commit after slave receive binary log
change to async mode when timeout
no data lose when master crashed
using transaction-safe table
第22页
Virtual Sync Replication
commit in InnoDB when receive the ACK from slave
第23页
Virtual Sync Replication
if master alive and change as slave
data can ben inconsistent when crash at step 3
data consistent
InnoSQL should not be handled by scripts
truncate redundant binlog at master that slave does not receive
also need handle partial binlog event at slave
第24页
Virtual Sync Replication
第25页
Virtual Sync Replication
Original VSR performance decrease
40% ~ 50% for sysbench full update benchmark
15% ~ 20% for sysbench OLTP benchmark
第26页
Virtual Sync Replication
VSR with group commit
Merge MariaDB’s work
Recover need recover group binlog
new binlog event
第27页
Virtual Sync Replication
almost equal to vanilla MySQL (async) replication’s performance !!!
However, with no data lost !!!
第28页
Virtual Sync Replication
生产环境应用
网易博客
印像派
网易云数据库
默认配置
超时3s
第29页
Slave Batch Commit
Replication problem #2
single slave SQL thread
result in lag between master and slave
can be hours, even days
第30页
Slave Batch Commit
pre-fetch
convert DML to SELECT
warm up by multi thread
tools
mk-slave-prefetch
replication-booster-for-mysql
all for statement-based binlog
not workable for insert SQLs
第31页
Slave Batch Commit
parallel replication based on schema
MySQL 5.6
Tungsten Replicator
still single thread for one-schema database
第32页
Slave Batch Commit
InnoSQL’s batch commit
apply binlog within one transaction
advantage
reduce redo log fsync a lot
work for all kinds of binlog format
a more general solution
disadvantage
still single thread
第33页
Slave Batch Commit
BEGIN;
APPLY binlog;
UPDATE slave_relay_log_info
SET exec_master_log_pos=xxx, ...;
APPLY binlog;
UPDATE slave_relay_log_info
SET exec_master_log_pos=xxx, ...;
……
COMMIT;
第34页
Slave Batch Commit
第35页
Parallel Replication
MariaDB’s Replication
Based on group commit
transactions in one group can be paralleled
Mutil-SQL threads on slave
第36页
Parallel Replication
slave cpu idle
第37页
Parallel Replication
slave io util
第38页
Parallel Replication
seconds behind Master
第39页
Resource Governor
Control resource:
Row Reads
Byte Reads
QPS
Vanilla MySQL
QPH
mainly for cloud service
multi tenancy
第40页
Resource Governor
Oracle Profiler
Rollback
Microsoft SQL Server Resource Governor
InnoSQL use this design
Handler level
Support all kinds of engine
Slow down
第41页
Resource Governor
InnoSQL RG Test
Sysbench oltp
5 users & 5 databases
(admin_0, mydb_0) … (admin_4, mydb_4)
limit user’s qps one by one
release user’s qps one by one
第42页
Resource Governor
第43页
Resource Governor
第44页
TopSQL
SQL info
mysql.sql_stats (data in tables)
new sql: show sql stats( data in memory)
SQL filter
schema
SQL type(insert,select, delete,update,show,set….)
Table info
mysql.table_stats
new sql: show table stats
第45页
TopSQL
SQL info
show sql stats
+------------------------------------------------------+--------------+--------------------+------------------+-----------+------------+----------------+----------------+------------+------------+
| SQL_TEXT | INDEX | MEMORY_TEMP_TABLES | DISK_TEMP_TABLES | ROW_READS | BYTE_READS | MAX_EXEC_TIMES | MIN_EXEC_TIMES | EXEC_TIMES | EXEC_COUNT |
+------------------------------------------------------+--------------+--------------------+------------------+-----------+------------+----------------+----------------+------------+------------+
| SELECT SCHEMA ( ) | NULL | 0 | 0 | 0 | 0 | 3792 | 50 | 3988 | 4 |
| SELECT * FROM `t` WHERE `id` = ? | t.PRIMARY(1) | 0 | 0 | 0 | 0 | 257 | 257 | 257 | 1 |
| DELETE FROM `t` WHERE `id` = ? | t.PRIMARY(1) | 0 | 0 | 1 | 42 | 254 | 254 | 254 | 1 |
| SELECT * FROM `innodb_table_stats` | NULL | 0 | 0 | 2 | 852 | 217 | 217 | 217 | 1 |
| UPDATE `t` SET NAME = ? WHERE `id` = ? | t.PRIMARY(1) | 0 | 0 | 1 | 42 | 198 | 198 | 198 | 1 |
| SELECT * FROM `t` | NULL | 0 | 0 | 1 | 42 | 187 | 187 | 187 | 1 |
| SHOW TABLE STATS | NULL | 0 | 0 | 0 | 0 | 45 | 45 | 45 | 1 |
| SELECT @ @ version_comment LIMIT ? | NULL | 0 | 0 | 0 | 0 | 42 | 42 | 42 | 1 |
+------------------------------------------------------+--------------+--------------------+------------------+-----------+------------+----------------+----------------+------------+------------+
第46页
TopSQL
Table info
show table stats
+--------------------+-----------------------+-----------------------+-----------------------+------------------------+----------------------+
| DBNAME | TABLE_NAME | SELECT_COUNT | INSERT_COUNT | UPDATE_COUNT | DELETE_COUNT |
+--------------------+-----------------------+-----------------------+-----------------------+------------------------+----------------------+
| test | t | 2 | 0 | 1 | 1 |
+--------------------+------------------------+----------------------+-----------------------+------------------------+-----------------------+
| test | t1 | 50 | 5 | 10 | 0 |
+--------------------+------------------------+----------------------+-----------------------+------------------------+-----------------------+
| test | t2 | 10 | 5 | 0 | 0 |
+--------------------+------------------------+----------------------+-----------------------+------------------------+-----------------------+
第47页
Others
IO Statistics
add logical_reads, physical_reads into slow_log & profile
Role Table
users in a Role have same privileges
XA enhance
do not rollback prepared transcations when client exit
save binlog into a temp file when transaction prepared
第48页
Q & A