Redis Troubleshooting
– Single Threaded
• Redis Troubleshooting • Redis Security Issue
Single Threaded #1
[Diagram: Clients #1 … #N send packets to the Redis event loop, which uses I/O multiplexing and processes one command at a time.]
Single Threaded #2
• Redis processes only one command at a time. • If you run a long-running command, all other commands are blocked until it finishes.
– e.g. KEYS, FLUSHALL, FLUSHDB, Lua scripts, MULTI/EXEC
• Redis does use a few extra threads, but only to avoid blocking on fsync calls.
Item Count: 1,000,000
Time: 1,000ms (1 second)
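To avoid blocking the event loop on a scan like the one timed above, KEYS can be replaced with incremental iteration. A minimal redis-cli sketch (the key pattern and a local default-port instance are assumptions):

```shell
# KEYS walks every key in one shot and blocks the event loop -- avoid in production.
redis-cli KEYS 'user:*'

# SCAN iterates in small batches, letting other commands run in between.
# Repeat with the returned cursor until it comes back as 0.
redis-cli SCAN 0 MATCH 'user:*' COUNT 100
```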
Recommended Redis Version #1
• Latest stable version
– 3.0.x is also good. – Use 2.8.13 or later.
• Behavior differs depending on the Redis version
– config set client-output-buffer-limit is accepted via redis-cli in 2.6.x
– config set client-output-buffer-limit can't use size expressions like 1GB or 1MB in 2.8.20
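On versions that reject size suffixes, the same limit can be set with plain byte counts. A sketch (the 1GB/256MB values are illustrative):

```shell
# 1073741824 bytes = 1GB hard limit; 268435456 bytes = 256MB soft limit for 60s
redis-cli config set client-output-buffer-limit "slave 1073741824 268435456 60"
```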
Memory Fragmentation #1
• Redis versions built with Jemalloc older than 3.6.0
– In 2.8.6, Redis uses just 2.4GB but RSS is 12GB
Memory Fragmentation #2
• Redis versions built with Jemalloc 3.6.0 or later
– 2.8.20 shows a much smaller gap between memory usage and RSS – But you should still watch RSS.
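The gap between logical usage and RSS can be read from INFO memory; mem_fragmentation_ratio is roughly used_memory_rss / used_memory, so a ratio far above 1 (like 12GB RSS against 2.4GB used) signals heavy fragmentation:

```shell
redis-cli info memory | grep -E 'used_memory:|used_memory_rss:|mem_fragmentation_ratio:'
```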
Recommended Client (For Management)
• Use redis-cli
– It is the best option.
• You can also use telnet
– Redis supports inline commands – Twemproxy doesn't support inline commands – Some commands are unavailable in old versions.
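Inline commands let you talk to Redis over a raw TCP connection without a client library. A sample telnet session (assuming Redis on localhost:6379; the replies are in the RESP wire format):

```shell
$ telnet localhost 6379
PING
+PONG
SET greeting hello
+OK
GET greeting
$5
hello
```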
If You Support a Service Team Using Redis
• Check whether they want a cache or a store
– If it is a cache, turn off the SAVE option – Even if it is a store, set proper values for SAVE
• Use multiple Redis instances on one physical server. • Use the maxmemory option.
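For a cache-style instance, these points translate into a few settings. A sketch via redis-cli (the 8gb bound and the eviction policy are illustrative choices, not from the slides):

```shell
redis-cli config set save ""                      # cache: no RDB snapshots
redis-cli config set maxmemory 8gb                # bound memory per instance
redis-cli config set maxmemory-policy allkeys-lru # evict old keys instead of failing writes
```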
Using Multiple Redis Instances
• CPU 4 cores, 32GB memory
– Three 8GB Redis instances are better than one 26GB Redis
• Redis is single-threaded
– It forks for replication
• Chained replication is supported
– Multi-master is not supported – Check RSS before starting replication
• Chained replication is supported
Master → 1st Slave → 2nd Slave
The 1st slave is the master of the 2nd slave.
When the master restarts, the slaves resync with it.
If the master comes back with no data,
the slaves will also have no data after resyncing.
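The chain is built with slaveof, pointing each node at the one before it. A sketch (hosts and ports are assumptions):

```shell
# Master on 6379; build Master -> 1st slave (6380) -> 2nd slave (6381).
redis-cli -p 6380 slaveof 127.0.0.1 6379   # 1st slave follows the master
redis-cli -p 6381 slaveof 127.0.0.1 6380   # 2nd slave follows the 1st slave
```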
• RDB and AOF are independent of each other. • RDB
– A snapshot of the current memory state – Forks and dumps memory to disk – In a write-heavy system, it can use double the memory
• AOF
– Saves update (create, update, delete) commands to disk after the event loop, in the Redis protocol format.
– The disk sync option affects performance (default: everysec) – Less disk IO compared to RDB (except during AOF rewriting)
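Both mechanisms are driven by a few settings. A sketch of typical values (illustrative, not prescriptions from the slides):

```shell
redis-cli config set save "60 10000"        # RDB: snapshot if 10000 keys changed in 60s
redis-cli config set appendonly yes         # enable AOF
redis-cli config set appendfsync everysec   # the default disk sync policy
```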
Even if you turn off persistence, you can't avoid fork (e.g. during migration)
1. Prepare a new instance (B) to replace the old instance (A)
2. Send "slaveof A-ip A-port" to B
3. Wait for replication to finish
– A forks here and uses more memory
4. Turn on the writable option for B
– config set slave-read-only no
5. Connect clients to B
6. Send "slaveof no one" to B
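The steps above can be sketched as a script; the hosts, ports, and the master_sync_in_progress polling loop are assumptions about one way to automate it:

```shell
# Migration sketch: move traffic from old instance A to new instance B.
A_HOST=10.0.0.1; A_PORT=6379
B_HOST=10.0.0.2; B_PORT=6379

redis-cli -h "$B_HOST" -p "$B_PORT" slaveof "$A_HOST" "$A_PORT"

# Wait until B reports the initial sync has finished.
until redis-cli -h "$B_HOST" -p "$B_PORT" info replication \
      | grep -q 'master_sync_in_progress:0'; do
  sleep 1
done

redis-cli -h "$B_HOST" -p "$B_PORT" config set slave-read-only no
# ...repoint clients to B, then detach it from A:
redis-cli -h "$B_HOST" -p "$B_PORT" slaveof no one
```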
Redis Replication Mechanism
1. The slave sends a SYNC command to the master
2. The master forks and creates an RDB
3. After RDB creation, the master sends the RDB data to the slave
4. While sending the RDB data, the master saves new commands into a memory buffer
5. After receiving the RDB, the slave starts loading it
6. After the slave has loaded the RDB, the master sends the memory-buffered data to the slave
Problems with Redis Replication
• When the connection between master and slave breaks
– The slave tries to start a FULL sync. – But a full sync is very expensive.
• If the difference is small enough to be recovered from the memory buffer
– The slave can request "PSYNC" – The master sends just the small memory buffer to the slave – and syncing finishes.
• But if the master is replaced by another server
– Only a FULL sync is possible.
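That memory buffer is the replication backlog; making it larger raises the chance a reconnecting slave can PSYNC instead of falling back to a full sync. A sketch (64mb is an illustrative value, sized to your write rate):

```shell
redis-cli config set repl-backlog-size 64mb
```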
– Only for cache
• Redis Configuration
– stop-writes-on-bgsave-error yes
– Writes are forbidden after an RDB creation failure – Reads are still OK.
– Fix: config set stop-writes-on-bgsave-error no
– Only for cache – Using default options
• Redis Conf
– save 900 1 – save 300 10 – save 60 10000
– Performance degraded because of heavy disk IO in a short time
– Fix: config set save "" – This removes the SAVE option
– Some data cached, some stored – One instance holding 28GB of data on a 32GB machine – It also had a disk failure.
– Latency was high because of swap usage – Creating the RDB took a very long time
• Normally it takes 5~6 minutes for 10GB of memory • Dumping 28GB took over 8 hours
– Fix: drop the server – Sometimes it is better to drop the data than to drag out the failure.
P Service #1
– Using AOF as a store – 8 instances on a 256GB machine, each using 26GB
– All 8 Redis instances tried to start AOF rewrite – Heavy disk IO and heavy memory use – They started serving at the same time, so their AOF rewrite timing coincided
– Fix: stop automatic AOF rewrite and manage it with a batch job
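One way to sketch that fix: disable the automatic trigger on every instance, then rewrite one instance at a time from a scheduled job (ports are assumptions):

```shell
# Disable size-triggered automatic AOF rewrite on each instance.
for port in 6379 6380 6381; do
  redis-cli -p "$port" config set auto-aof-rewrite-percentage 0
done

# Later, from a batch job, rewrite one instance at a time:
redis-cli -p 6379 bgrewriteaof
```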
P Service #2 – Not Actually a Failure
– All Redis master/slave connections were broken because of a network issue.
• Failure possibility
– Once the network recovers, all Redis slaves will try to sync with their masters – All Redis masters will fork and use much more memory – This can trigger a big service failure
– Fix: break all master/slave connections with "slaveof no one" – Then recover the network – Then re-establish replication one physical server at a time.
– 20GB of data – Some write operations
– Master/slave replication kept failing.
– Check client-output-buffer-limit – Default: "client-output-buffer-limit slave 256mb 64mb 60"
• Hard limit: 256mb • Soft limit: 64mb sustained for 60 seconds
– The default is OK for 10GB of data, but if you use 20GB of data
• Increase it to 512mb or 1024mb
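Raising the limit so a 20GB full sync survives can be sketched as follows (pairing the 1024mb hard limit with a 512mb soft limit is an assumption):

```shell
redis-cli config set client-output-buffer-limit "slave 1024mb 512mb 60"
```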
Name                                      Host or Redis (info)
CPU usage, load                           Host
Network inbound, outbound                 Host
Current client count, maxclients setting  Redis
Key count, commands processed             Redis
Memory usage, RSS                         Redis
Disk usage, IO                            Host
Expired keys, evicted keys                Redis
Redis Security Issue
Redis security is very weak.
– It is essentially not supported – Never open the Redis port to the public
• Only use Redis in a private network
– Don't run Redis as root.
• If the Redis port is open to the public, an attacker can:
– Store their SSH public key as a value, then: – config set dir "/root/.ssh" – config set dbfilename "authorized_keys" – SAVE
• The attacker can then log in to this server as root.
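A few settings follow from these points. This is a minimal redis.conf sketch; requirepass and rename-command are extra hardening options beyond what the slides mention, and the password is a placeholder:

```shell
# redis.conf hardening sketch (values are illustrative)
bind 127.0.0.1                 # listen only on a private/loopback interface
requirepass change-this-long-random-password   # basic auth: weak, but better than nothing
rename-command CONFIG ""       # disable CONFIG so dir/dbfilename can't be changed remotely
# ...and run the redis-server process as an unprivileged user, never as root.
```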