Page 1
Building Resilient Services in Go
GopherCon Korea 2015 15 August 2015
Developer, Purpleworks
Page 2
Who am I
purpleworks Ruby, C#, Go, etc.
Page 3
Resiliency
Page 4
References
https://sourcegraph.com/blog/live/gophercon2015/123664481750
Page 5
1. Careful Coding
Page 6
1. Careful Coding: #Error Handling & Clean-up
Page 7
First Example
resourceA, err := OpenResourceA()
if err != nil {
	return nil, err
}
defer resourceA.Close()
Page 8
What About nil?
Page 9
Be Careful!
// can return nil, and that's not an error!
resourceA, err := OpenResourceA()
if err != nil {
	return nil, err
}
defer resourceA.Close()
// panic on nil??
Page 10
One Solution:
// can return nil, and that's not an error!
resourceA, err := OpenResourceA()
if err != nil {
	return nil, err
}
defer func() {
	if resourceA != nil {
		resourceA.Close()
	}
}()
Page 11
Make Deferred Methods nil-Safe
// Don't forget: resource might be nil!
func (resource *Resource) Close() {
	if resource != nil {
		// ... clean up
	}
}
Page 12
Much Better!
// can return nil, and that's not an error!
resourceA, err := OpenResourceA()
if err != nil {
return nil, err
}
defer resourceA.Close()
// will never panic!
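A minimal runnable sketch of the nil-safe pattern from the slides above. The Resource fields, the `available` parameter of OpenResourceA, and the use function are hypothetical names added for illustration; the talk does not show the real types.

```go
package main

import "fmt"

// Resource is a hypothetical stand-in for whatever OpenResourceA returns.
type Resource struct {
	name   string
	closed bool
}

// Close is safe to call on a nil *Resource, so callers can always defer it.
func (r *Resource) Close() {
	if r == nil {
		return
	}
	r.closed = true
	fmt.Println("closed", r.name)
}

// OpenResourceA is hypothetical: it may return (nil, nil) when there is
// nothing to open, which is not an error.
func OpenResourceA(available bool) (*Resource, error) {
	if !available {
		return nil, nil
	}
	return &Resource{name: "A"}, nil
}

// use opens the resource and reports whether one was actually opened.
func use(available bool) (bool, error) {
	resourceA, err := OpenResourceA(available)
	if err != nil {
		return false, err
	}
	defer resourceA.Close() // no panic even when resourceA is nil
	return resourceA != nil, nil
}

func main() {
	fmt.Println(use(true))
	fmt.Println(use(false))
}
```

Calling a method on a nil pointer is legal in Go as long as the method does not dereference the receiver, which is what makes the guard inside Close sufficient.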
Page 13
1. Careful Coding: #Channels
Page 14
Channel Axioms
1. A send to a nil channel blocks forever
2. A receive from a nil channel blocks forever
3. A send to a closed channel panics
4. A receive from a closed channel returns the zero value immediately
http://dave.cheney.net/2014/03/19/channel-axioms
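The four axioms can be checked with a small program. This is only a sketch: the helper names are invented here, and "blocks forever" is approximated by checking that the operation is still blocked after a short wait.

```go
package main

import (
	"fmt"
	"time"
)

// blocksOnNilSend reports whether a send to a nil channel is still blocked
// after wait has elapsed.
func blocksOnNilSend(wait time.Duration) bool {
	var ch chan int // nil channel
	select {
	case ch <- 1:
		return false
	case <-time.After(wait):
		return true
	}
}

// blocksOnNilReceive reports whether a receive from a nil channel is still
// blocked after wait has elapsed.
func blocksOnNilReceive(wait time.Duration) bool {
	var ch chan int // nil channel
	select {
	case <-ch:
		return false
	case <-time.After(wait):
		return true
	}
}

// sendOnClosedPanics reports whether sending on a closed channel panics.
func sendOnClosedPanics() (panicked bool) {
	ch := make(chan int, 1)
	close(ch)
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	ch <- 1
	return false
}

// receiveFromClosed returns what a receive from a closed channel yields.
func receiveFromClosed() (int, bool) {
	ch := make(chan int)
	close(ch)
	v, ok := <-ch
	return v, ok
}

func main() {
	fmt.Println("nil send blocks:", blocksOnNilSend(50*time.Millisecond))
	fmt.Println("nil receive blocks:", blocksOnNilReceive(50*time.Millisecond))
	fmt.Println("closed send panics:", sendOnClosedPanics())
	fmt.Println(receiveFromClosed())
}
```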
Page 15
1. Careful Coding: #Panics
Page 16
Panicking Code
package main

import "fmt"

func main() {
	fmt.Println(divide(1, 0)) // panics: integer divide by zero
}

func divide(a, b int) int {
	return a / b
}
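Page 17
You Can Recover From Panics
package main

import "fmt"

func main() {
	fmt.Println("result: ", divide(1, 0))
}

func divide(a, b int) int {
	// The deferred function recovers from the panic, so divide returns
	// normally with the zero value 0.
	defer func() {
		if err := recover(); err != nil {
			fmt.Println(err)
		}
	}()
	return a / b
}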
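Page 18
Recovering Block
package main

import (
	"fmt"
	"log"
)

// protect runs g and logs any panic instead of crashing the program.
func protect(g func()) {
	defer func() {
		log.Println("done")
		if err := recover(); err != nil {
			log.Printf("run time panic: %v", err)
		}
	}()
	log.Println("start")
	g()
}

func main() {
	protect(func() {
		fmt.Println(divide(4, 0))
	})
}

// divide is the panicking version from the "Panicking Code" slide.
func divide(a, b int) int {
	return a / b
}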
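The recovering examples only print the panic. A common variation, sketched here under the assumption that the caller wants an ordinary error back, is to assign the recovered value to a named result. safeDivide is a hypothetical name, not something from the talk.

```go
package main

import "fmt"

// safeDivide is a hypothetical variant of divide: the deferred function
// recovers from the panic and reports it through the named result err.
func safeDivide(a, b int) (result int, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered: %v", r)
		}
	}()
	return a / b, nil
}

func main() {
	fmt.Println(safeDivide(6, 3))
	fmt.Println(safeDivide(1, 0))
}
```

The named results matter: a deferred function can only change what the caller sees if it writes to a named return value.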
Page 19
1. Careful Coding: #Avoid Race Conditions!
Page 20
Go's Race Detector:
Reports when variable access is not synchronized
Prints a full stack trace for each conflicting access, including the read and write goroutines
Should be used in unit tests, development, and testing environments
Page 21
Race Condition Code
package main

import "fmt"

func main() {
	c := make(chan bool)
	m := make(map[string]string)
	go func() {
		m["1"] = "a" // First conflicting access.
		c <- true
	}()
	m["2"] = "b" // Second conflicting access.
	<-c
	for k, v := range m {
		fmt.Println(k, v)
	}
}
Page 22
Race Detector Output:
$ go run -race race.go
==================
WARNING: DATA RACE
Write by goroutine 5:
  runtime.mapassign1()
      /usr/local/go/src/runtime/hashmap.go:383 +0x0
  main.func·001()
      /go/src/github.com/jaehue/go-talk/gophercon-korea-2015/race.go:10 +0xa3

Previous write by main goroutine:
  runtime.mapassign1()
      /usr/local/go/src/runtime/hashmap.go:383 +0x0
  main.main()
      /go/src/github.com/jaehue/go-talk/gophercon-korea-2015/race.go:13 +0x22e

Goroutine 5 (running) created at:
  main.main()
      /go/src/github.com/jaehue/go-talk/gophercon-korea-2015/race.go:12 +0x1ac
==================
2 b
1 a
Found 1 data race(s)
exit status 66
Page 23
Enable Race Detection:
$ go test -race mypkg    // to test the package
$ go run -race mysrc.go  // to run the source file
$ go build -race mycmd   // to build the command
$ go install -race mypkg // to install the package
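One possible fix for the map race on the "Race Condition Code" slide is to guard every access with a mutex. This is a sketch rather than the speaker's solution; the fill function and the use of sync.WaitGroup in place of the channel are choices made here.

```go
package main

import (
	"fmt"
	"sync"
)

// fill writes to a shared map from two goroutines, guarding every access
// with a mutex so that the writes no longer conflict.
func fill() map[string]string {
	var (
		mu sync.Mutex
		wg sync.WaitGroup
	)
	m := make(map[string]string)

	wg.Add(1)
	go func() {
		defer wg.Done()
		mu.Lock()
		m["1"] = "a"
		mu.Unlock()
	}()

	mu.Lock()
	m["2"] = "b"
	mu.Unlock()

	wg.Wait() // after this point no other goroutine touches m
	return m
}

func main() {
	for k, v := range fill() {
		fmt.Println(k, v)
	}
}
```

Running this version with the -race flag is a quick way to confirm that the warning is gone.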
Page 24
1. Careful Coding: #Implement Timeouts
Page 25
Network Timeouts:
network dial timeout
network connection inactivity timeout
total connection timeout
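A rough sketch of how the three timeouts might be applied with the standard net package. The function names, the durations, and the silent local listener used for the demonstration are all assumptions made for this example.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"time"
)

// readWithTimeouts dials addr and reads until an error occurs, applying the
// three limits from the slide. All names here are illustrative.
func readWithTimeouts(addr string, dial, idle, total time.Duration) error {
	d := net.Dialer{Timeout: dial} // network dial timeout
	conn, err := d.Dial("tcp", addr)
	if err != nil {
		return err
	}
	defer conn.Close()

	hardStop := time.Now().Add(total) // total connection timeout
	buf := make([]byte, 512)
	for {
		// Inactivity timeout: renew the deadline before each read,
		// but never extend it past the total limit.
		deadline := time.Now().Add(idle)
		if deadline.After(hardStop) {
			deadline = hardStop
		}
		if err := conn.SetReadDeadline(deadline); err != nil {
			return err
		}
		if _, err := conn.Read(buf); err != nil {
			return err
		}
	}
}

// isTimeout reports whether err is a network timeout.
func isTimeout(err error) bool {
	var ne net.Error
	return errors.As(err, &ne) && ne.Timeout()
}

// demo connects to a local listener that accepts but never writes, so the
// read has to end with a timeout.
func demo() error {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return err
	}
	defer ln.Close()
	go func() {
		conn, err := ln.Accept()
		if err != nil {
			return
		}
		defer conn.Close()
		time.Sleep(time.Second) // stay silent longer than the client's idle limit
	}()
	return readWithTimeouts(ln.Addr().String(), time.Second, 50*time.Millisecond, 500*time.Millisecond)
}

func main() {
	fmt.Println("timed out:", isTimeout(demo()))
}
```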
Page 26
TEST ALL THE THINGS!
Page 27
2. Know Your Service
Page 28
2. Know Your Service: #How Does It Use Memory?
Page 29
Profile It! (https://github.com/wblakecaldwell/profiler)
http://simplechat.jang.io/profiler/info.html
Page 30
What to Watch
How much memory does the service use when idle?
How much memory per connection?
Does the system reclaim memory that's no longer used?
What's the garbage collector doing? GODEBUG=gctrace=1
Where is memory allocated? (PPROF)
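Some of these questions can be answered from inside the process with runtime.ReadMemStats. The sketch below is not the profiler package linked earlier; memSnapshot, measure, and the 32 MiB allocation are invented for the example.

```go
package main

import (
	"fmt"
	"runtime"
)

// memSnapshot is a hypothetical summary of a few runtime.MemStats fields.
type memSnapshot struct {
	HeapAllocKB uint64 // allocated heap objects, in KB
	HeapObjects uint64 // number of allocated heap objects
	NumGC       uint32 // completed garbage collection cycles
}

func snapshot() memSnapshot {
	var ms runtime.MemStats
	runtime.ReadMemStats(&ms)
	return memSnapshot{
		HeapAllocKB: ms.HeapAlloc / 1024,
		HeapObjects: ms.HeapObjects,
		NumGC:       ms.NumGC,
	}
}

// sink keeps the test allocation reachable so that it lives on the heap.
var sink []byte

// measure takes snapshots when idle, while holding size bytes, and after
// releasing them and forcing a collection.
func measure(size int) (idle, busy, after memSnapshot) {
	idle = snapshot()
	sink = make([]byte, size)
	busy = snapshot()
	sink = nil
	runtime.GC()
	after = snapshot()
	return idle, busy, after
}

func main() {
	idle, busy, after := measure(32 << 20)
	fmt.Printf("idle:  %+v\nbusy:  %+v\nafter: %+v\n", idle, busy, after)
}
```

In a real service the same snapshot could be taken with no connections and again with a known number of them, to estimate the cost per connection.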
Page 31
2. Know Your Service: #PPROF
Page 32
"PPROF" Inspects Your Running Process
Blocking profile
Goroutine count and full stacktraces
Heap profile
Stacktraces that lead to thread creations
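Page 33
Enabling PPROF:
import (
	"log"
	"net/http"
	_ "net/http/pprof" // registers the /debug/pprof handlers on the default mux
)

func main() {
	// ... start the rest of the service first: ListenAndServe blocks until it fails
	log.Fatal(http.ListenAndServe(":6060", nil))
}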
Page 34
PPROF Main Page
http://simplechat.jang.io/debug/pprof
Page 35
Use PPROF To Tell You:
How many goroutines when nobody is connected?
How many goroutines per connection?
Are all goroutines cleaned up after all connections close?
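The same three questions can be asked programmatically through runtime/pprof, which exposes the goroutine profile that the pprof page serves. This is a sketch: leakCheck and the channel-blocked goroutines standing in for connections are assumptions made for the example.

```go
package main

import (
	"fmt"
	"runtime/pprof"
	"time"
)

// goroutines returns the count reported by the "goroutine" profile.
func goroutines() int {
	return pprof.Lookup("goroutine").Count()
}

// leakCheck simulates n connections (n must be > 0) with one goroutine
// each, closes them, and reports the goroutines used per connection and
// how many are left over.
func leakCheck(n int) (perConn, leftover int) {
	base := goroutines() // count when nobody is connected
	done := make(chan struct{})
	for i := 0; i < n; i++ {
		go func() { <-done }() // stand-in for one goroutine per connection
	}
	perConn = (goroutines() - base) / n

	close(done) // "all connections close"
	// Goroutines exit asynchronously, so poll briefly before judging.
	for i := 0; i < 200 && goroutines() > base; i++ {
		time.Sleep(5 * time.Millisecond)
	}
	return perConn, goroutines() - base
}

func main() {
	perConn, leftover := leakCheck(10)
	fmt.Println("goroutines per connection:", perConn)
	fmt.Println("goroutines left after close:", leftover)
}
```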
Page 36
PPROF: Goroutine Page
Page 37
PPROF: From the Command Line
What Are Your Goroutines Doing?
$ go tool pprof ./simple-chat http://simplechat.jang.io/debug/pprof/goroutine
Fetching profile from http://simplechat.jang.io/debug/pprof/goroutine
Saved profile in /Users/jaehuejang/pprof/pprof.simple-chat.simplechat.jang.io.goroutine.002.pb.gz
Entering interactive mode (type "help" for commands)
(pprof) top 10
9 of 9 total ( 100%)
Showing top 10 nodes out of 34 (cum >= 1)
      flat  flat%   sum%        cum   cum%
         7 77.78% 77.78%          7 77.78%  runtime.gopark
         1 11.11% 88.89%          1 11.11%  runtime.notetsleepg
         1 11.11%   100%          1 11.11%  runtime/pprof.writeRuntimeProfile
         0     0%   100%          1 11.11%  github.com/wblakecaldwell/profiler.func·001
         0     0%   100%          1 11.11%  github.com/wblakecaldwell/profiler.func·002
         0     0%   100%          1 11.11%  github.com/wblakecaldwell/profiler.func·003
         0     0%   100%          1 11.11%  main.main
         0     0%   100%          1 11.11%  net.(*TCPListener).AcceptTCP
         0     0%   100%          1 11.11%  net.(*netFD).accept
         0     0%   100%          1 11.11%  net.(*pollDesc).Wait
(pprof) web
Page 38
PPROF: From the Command Line - Visualizer
Page 39
Who's Allocating Heap Memory?
$ go tool pprof ./simple-chat http://simplechat.jang.io/debug/pprof/heap
Fetching profile from http://simplechat.jang.io/debug/pprof/heap
Saved profile in /Users/jaehuejang/pprof/pprof.simple-chat.simplechat.jang.io.inuse_objects.inuse_space.002.
Entering interactive mode (type "help" for commands)
(pprof) top 5
1537.45kB of 1537.45kB total ( 100%)
Dropped 3 nodes (cum <= 7.69kB)
      flat  flat%   sum%        cum   cum%
  512.75kB 33.35% 33.35%   512.75kB 33.35%  encoding/gob.encComplex64Array
  512.56kB 33.34% 66.69%  1537.45kB   100%  [simple-chat]
  512.14kB 33.31%   100%   512.14kB 33.31%  encoding/gob.(*Decoder).typeString
         0     0%   100%  1025.31kB 66.69%  encoding/gob.buildTypeInfo
         0     0%   100%   512.14kB 33.31%  encoding/gob.newTypeObject
(pprof) web
Page 40
Who's Allocating Heap Memory? - Visualizer
Page 41
2. Know Your Service: #Watch It Run
Page 42
/info Endpoint
{
    Version: "1.0.275-b244a2b9b8-20150202.163449",
    StartTimeEpochSecs: 1430515329,
    CurrentTimeEpocSecs: 1431117131,
    Uptime: "167h10m2s"
}
http://simplechat.jang.io/info
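A handler producing a payload like the one above could look roughly as follows. This is a sketch and not the service's actual code: the struct, the "dev" default, and the in-process fetchInfo helper are assumptions, and the field names simply copy the slide's spelling.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// ServiceVersion would normally be set by the build script; "dev" is a
// hypothetical default for local runs.
var ServiceVersion = "dev"

// startTime is recorded once, when the process starts.
var startTime = time.Now()

// info mirrors the fields shown on the slide (spelling kept as shown there).
type info struct {
	Version             string
	StartTimeEpochSecs  int64
	CurrentTimeEpocSecs int64
	Uptime              string
}

// infoHandler serves the current version and uptime as JSON.
func infoHandler(w http.ResponseWriter, r *http.Request) {
	now := time.Now()
	w.Header().Set("Content-Type", "application/json")
	json.NewEncoder(w).Encode(info{
		Version:             ServiceVersion,
		StartTimeEpochSecs:  startTime.Unix(),
		CurrentTimeEpocSecs: now.Unix(),
		Uptime:              now.Sub(startTime).Round(time.Second).String(),
	})
}

// fetchInfo calls the handler in-process and decodes its response.
func fetchInfo() (info, error) {
	rec := httptest.NewRecorder()
	infoHandler(rec, httptest.NewRequest("GET", "/info", nil))
	var out info
	err := json.NewDecoder(rec.Body).Decode(&out)
	return out, err
}

func main() {
	out, err := fetchInfo()
	fmt.Printf("%+v %v\n", out, err)
}
```

In a service the handler would be registered with something like http.HandleFunc("/info", infoHandler).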
Page 43
Managing Service Version
Version: "1.0.275-b244a2b9b8-20150202.163449"
Which is:
<major>.<minor>.<commit#>-<Git SHA>-<date>.<time>
Page 44
Managing Service Version
Version is stored in a global variable, set by your build script
In code:
var ServiceVersion string
Build script:
$ go build -ldflags \
> "-X main.ServiceVersion=1.0.231.fd80bea-20150814.160801"
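The version layout described on the previous slide (major, minor, commit count, Git SHA, then date and time) can be assembled in a few lines. This is only a sketch of what a build helper might compute; formatVersion and slideVersion are hypothetical names, and treating the timestamp as UTC is an assumption.

```go
package main

import (
	"fmt"
	"time"
)

// formatVersion builds a version string in the layout
// major.minor.commits-sha-date.time.
func formatVersion(major, minor, commits int, sha string, built time.Time) string {
	return fmt.Sprintf("%d.%d.%d-%s-%s",
		major, minor, commits, sha, built.Format("20060102.150405"))
}

// slideVersion rebuilds the example version from the slide.
func slideVersion() string {
	built := time.Date(2015, 2, 2, 16, 34, 49, 0, time.UTC)
	return formatVersion(1, 0, 275, "b244a2b9b8", built)
}

func main() {
	fmt.Println(slideVersion())
}
```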
Page 45
2. Know Your Service: #Keep Good Logs!
Page 46
Keep Good Logs!
Create a semi-unique string per request
Use this request string as a prefix in all log entries
Always log at least the start and end of a request
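One way to follow these three rules with the standard library is a small wrapper that gives each request its own prefixed logger. The sketch below is illustrative only: newRequestID, withRequestLog, and the 8-character hex ID are choices made here, not details from the talk.

```go
package main

import (
	"bytes"
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"io"
	"log"
	"net/http"
	"net/http/httptest"
	"strings"
)

// newRequestID returns a short random hex string; semi-unique is enough to
// group the log lines of one request.
func newRequestID() string {
	b := make([]byte, 4)
	if _, err := rand.Read(b); err != nil {
		return "unknown"
	}
	return hex.EncodeToString(b)
}

// withRequestLog gives each request its own logger whose prefix is the
// request ID, and always logs the start and end of the request.
func withRequestLog(out io.Writer, next func(*log.Logger, http.ResponseWriter, *http.Request)) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		l := log.New(out, "["+newRequestID()+"] ", 0)
		l.Printf("start %s %s", r.Method, r.URL.Path)
		defer l.Printf("end %s %s", r.Method, r.URL.Path)
		next(l, w, r)
	})
}

// demo serves one in-process request and returns the log lines it produced.
func demo() []string {
	var buf bytes.Buffer
	h := withRequestLog(&buf, func(l *log.Logger, w http.ResponseWriter, r *http.Request) {
		l.Print("handling")
		fmt.Fprintln(w, "ok")
	})
	h.ServeHTTP(httptest.NewRecorder(), httptest.NewRequest("GET", "/chat", nil))
	return strings.Split(strings.TrimSpace(buf.String()), "\n")
}

func main() {
	for _, line := range demo() {
		fmt.Println(line)
	}
}
```

Because every line carries the same prefix, grepping the logs for one request ID shows that request from start to end.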
Page 47
2. Know Your Service: #Connection Status
Page 48
Who's Currently Connected?
/connections Endpoint
{
    CurrentUserCount: 1,
    CurrentlyAuthenticatedUsers: [
        {
            Addr: "10.157.118.9:46493",
            Name: " ",
            SessionDuration: "3m4s"
        }
    ]
}
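The data behind such an endpoint could come from a small mutex-protected registry of open sessions. This is a sketch under assumptions: the registry type, its method names, and the sample user names are hypothetical, and only the output field names are taken from the slide.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sync"
	"time"
)

type userStatus struct {
	Addr            string
	Name            string
	SessionDuration string
}

type connectionStatus struct {
	CurrentUserCount            int
	CurrentlyAuthenticatedUsers []userStatus
}

type session struct {
	name    string
	started time.Time
}

// registry is a hypothetical in-memory record of open sessions, keyed by
// remote address.
type registry struct {
	mu       sync.Mutex
	sessions map[string]session
}

func newRegistry() *registry {
	return &registry{sessions: make(map[string]session)}
}

func (r *registry) connect(addr, name string, now time.Time) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.sessions[addr] = session{name: name, started: now}
}

func (r *registry) disconnect(addr string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	delete(r.sessions, addr)
}

// status builds the payload for the endpoint as of the given time.
func (r *registry) status(now time.Time) connectionStatus {
	r.mu.Lock()
	defer r.mu.Unlock()
	st := connectionStatus{CurrentUserCount: len(r.sessions)}
	for addr, s := range r.sessions {
		st.CurrentlyAuthenticatedUsers = append(st.CurrentlyAuthenticatedUsers, userStatus{
			Addr:            addr,
			Name:            s.name,
			SessionDuration: now.Sub(s.started).String(),
		})
	}
	return st
}

// demo connects two users, disconnects one, and reports 3m4s later.
func demo() connectionStatus {
	start := time.Date(2015, 8, 15, 10, 0, 0, 0, time.UTC)
	reg := newRegistry()
	reg.connect("10.157.118.9:46493", "gopher", start)
	reg.connect("10.157.118.10:51000", "visitor", start)
	reg.disconnect("10.157.118.10:51000")
	return reg.status(start.Add(3*time.Minute + 4*time.Second))
}

func main() {
	out, _ := json.MarshalIndent(demo(), "", "  ")
	fmt.Println(string(out))
}
```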
Page 49
Conclusion
Page 50
Thank you
Developer, Purpleworks
jaehue@jang.io
http://blog.remotty.com/