Notes on the etcd Startup Flow: A Source Code Analysis

Updated 06-11

1. Initializing the etcdServer:

Code path: github.com/coreos/etcd/embed/etcd.go

StartEtcd(inCfg *Config) (e *Etcd, err error)

The flow is as follows:

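1.1: Parameter validation: inCfg.Validate()

Validation point 1:

checkBindURLs(cfg.LPUrls): validates the scheme of the peer URLs. Since version 3.1, a domain name can no longer be used in a URL for binding.

checkBindURLs(cfg.LCUrls): validates the scheme of the client URLs. Since version 3.1, a domain name can no longer be used in a URL for binding.

Using domain names has some impact on performance, but real production environments do have scenarios that rely on them. To support those scenarios, the following code needs to be adapted: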

    func checkBindURLs(urls []url.URL) error {
        //...
        if net.ParseIP(host) == nil {
            // Drop this error return and print a warning instead, as versions before 3.1 did.
            return fmt.Errorf("expected IP in URL for binding (%s)", url.String())
        }
        //...
    }
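The adaptation described above can be sketched as follows. This is a minimal illustration, not the etcd source: the name checkBindURLsLenient, the warn callback, and the helper countBindWarnings are hypothetical, and the handling of unix sockets and "localhost" around the IP check is an assumption about the elided parts of the original function.

```go
package main

import (
	"fmt"
	"net"
	"net/url"
)

// checkBindURLsLenient is a hypothetical variant of checkBindURLs: a host that
// is not an IP literal is reported through warn instead of failing validation.
func checkBindURLsLenient(urls []url.URL, warn func(msg string)) error {
	for _, u := range urls {
		// Assumption: unix sockets have no host to check.
		if u.Scheme == "unix" || u.Scheme == "unixs" {
			continue
		}
		host, _, err := net.SplitHostPort(u.Host)
		if err != nil {
			return err
		}
		// Assumption: "localhost" is accepted as a special case.
		if host == "localhost" {
			continue
		}
		if net.ParseIP(host) == nil {
			// Warn rather than return an error, so domain names stay usable.
			warn(fmt.Sprintf("expected IP in URL for binding (%s)", u.String()))
		}
	}
	return nil
}

// countBindWarnings parses the raw URLs and returns how many warnings the
// lenient check produced. Hypothetical helper for demonstration only.
func countBindWarnings(rawURLs ...string) (int, error) {
	var urls []url.URL
	for _, raw := range rawURLs {
		u, err := url.Parse(raw)
		if err != nil {
			return 0, err
		}
		urls = append(urls, *u)
	}
	warnings := 0
	err := checkBindURLsLenient(urls, func(string) { warnings++ })
	return warnings, err
}

func main() {
	n, err := countBindWarnings("http://etcd-0.example.internal:2380", "http://10.0.0.1:2380")
	fmt.Println("warnings:", n, "error:", err)
}
```

Passing the warning sink as a callback keeps the sketch independent of any particular logging package; in a patched etcd build the project's own logger would take its place.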

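Validation point 2: network latency differs from one production environment to another, so the election and heartbeat timeouts are worth considering as tuning parameters.

5*cfg.TickMs > cfg.ElectionMs: validation fails when this holds, so the election timeout must be at least five times the heartbeat interval.

cfg.ElectionMs > maxElectionMs: validation fails when this holds, so the election timeout must not exceed maxElectionMs (50000 ms).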
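The two checks can be tried out in isolation with a small sketch. The function validateTimeouts is a hypothetical helper written for these notes, and the error messages are illustrative wording, not the ones etcd prints.

```go
package main

import "fmt"

// maxElectionMs is the upper bound quoted in these notes.
const maxElectionMs = 50000

// validateTimeouts is a hypothetical helper that applies the two checks:
// the election timeout must be at least 5x the heartbeat interval and must
// not exceed maxElectionMs. Both arguments are in milliseconds.
func validateTimeouts(tickMs, electionMs uint) error {
	if 5*tickMs > electionMs {
		return fmt.Errorf("election timeout %dms should be at least 5 times the heartbeat interval %dms", electionMs, tickMs)
	}
	if electionMs > maxElectionMs {
		return fmt.Errorf("election timeout %dms is too long, it should not exceed %dms", electionMs, maxElectionMs)
	}
	return nil
}

func main() {
	fmt.Println(validateTimeouts(100, 1000))  // heartbeat 100ms, election 1000ms
	fmt.Println(validateTimeouts(300, 1000))  // election is less than 5x the heartbeat
	fmt.Println(validateTimeouts(100, 60000)) // election above the upper bound
}
```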

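1.2: Initialize the PeerListeners and ClientListeners, which listen for HTTP requests sent between peers and from clients.

PeerListeners: the listeners used for communication between etcd members. For performance reasons, the http scheme is recommended for internal traffic. Set by the flag "listen-peer-urls".

ClientListeners: the listeners that accept external requests. For security reasons, the https scheme is generally used. Set by the flag "listen-client-urls".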

The concrete implementation is:

transport.NewTimeoutListener(u.Host, u.Scheme, tlsinfo, ConnReadTimeout, ConnWriteTimeout)

The read and write timeouts both default to 5s:

    ConnReadTimeout = 5 * time.Second
    ConnWriteTimeout = 5 * time.Second
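To make the idea of a listener with read and write timeouts concrete, here is a minimal sketch built only on the standard library. It is not the code behind transport.NewTimeoutListener; the types timeoutListener and timeoutConn and the helper idleReadTimesOut are hypothetical names, and the approach (refreshing the deadline before every read or write) is one common way to get this behaviour.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"time"
)

// timeoutConn refreshes the read or write deadline before every I/O call.
type timeoutConn struct {
	net.Conn
	readTimeout  time.Duration
	writeTimeout time.Duration
}

func (c timeoutConn) Read(b []byte) (int, error) {
	if c.readTimeout > 0 {
		if err := c.SetReadDeadline(time.Now().Add(c.readTimeout)); err != nil {
			return 0, err
		}
	}
	return c.Conn.Read(b)
}

func (c timeoutConn) Write(b []byte) (int, error) {
	if c.writeTimeout > 0 {
		if err := c.SetWriteDeadline(time.Now().Add(c.writeTimeout)); err != nil {
			return 0, err
		}
	}
	return c.Conn.Write(b)
}

// timeoutListener wraps every accepted connection in a timeoutConn.
type timeoutListener struct {
	net.Listener
	readTimeout  time.Duration
	writeTimeout time.Duration
}

func (l timeoutListener) Accept() (net.Conn, error) {
	c, err := l.Listener.Accept()
	if err != nil {
		return nil, err
	}
	return timeoutConn{Conn: c, readTimeout: l.readTimeout, writeTimeout: l.writeTimeout}, nil
}

// idleReadTimesOut accepts one loopback connection that never sends data and
// reports whether reading from it fails with a timeout error.
func idleReadTimesOut(timeoutMs int) (bool, error) {
	d := time.Duration(timeoutMs) * time.Millisecond
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return false, err
	}
	tl := timeoutListener{Listener: ln, readTimeout: d, writeTimeout: d}
	defer tl.Close()

	client, err := net.Dial("tcp", ln.Addr().String())
	if err != nil {
		return false, err
	}
	defer client.Close()

	conn, err := tl.Accept()
	if err != nil {
		return false, err
	}
	defer conn.Close()

	_, err = conn.Read(make([]byte, 1))
	var netErr net.Error
	return errors.As(err, &netErr) && netErr.Timeout(), nil
}

func main() {
	timedOut, err := idleReadTimesOut(50)
	fmt.Println("read timed out:", timedOut, "setup error:", err)
}
```

With the 5s defaults quoted above, a peer or client that stalls in the middle of a request would be cut off after five seconds instead of holding the connection open indefinitely.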

1.3: Obtain the PeerURLsMap and the cluster token

1.4: Build the ServerConfig struct needed to create the new etcdServer:

    // ServerConfig holds the configuration of etcd as taken from the command line or discovery.
    type ServerConfig struct {
        // Name of the etcd server; flag "name".
        Name string
        // Used for etcd service discovery, so the etcd service can be reached
        // without knowing the IPs of the individual etcd nodes; flag "discovery".
        DiscoveryURL string
        // Proxy address used to reach the discovery URL; flag "discovery-proxy".
        DiscoveryProxy string
        // Made up of ip+port; default DefaultListenClientURLs = "http://localhost:2379".
        // In practice the https scheme is used. Serves external etcd clients;
        // flag "listen-client-urls".
        ClientURLs types.URLs
        // Made up of ip+port; default DefaultListenPeerURLs = "http://localhost:2380".
        // In real production the http scheme is used. Serves communication
        // between etcd members; flag "listen-peer-urls".
        PeerURLs types.URLs
        // Data directory, as a full path; flag "data-dir".
        DataDir string
        // DedicatedWALDir config will make the etcd to write the WAL to the WALDir
        // rather than the dataDir/member/wal.
        DedicatedWALDir string
        // A snapshot is taken every 100000 events by default
        // (DefaultSnapCount = 100000); a candidate tuning parameter;
        // flag "snapshot-count".
        SnapCount uint64
        // Default 5 (DefaultMaxSnapshots = 5). This is a v2 parameter; v3 has
        // only a single db file; flag "max-snapshots".
        MaxSnapFiles uint
        // Default 5 (DefaultMaxWALs = 5): the maximum number of WAL files kept;
        // flag "max-wals". The retained files can be used with the
        // etcd-dump-logs tool for debugging.
        MaxWALFiles uint
        // Map from etcd name to peer URLs, produced by
        // cfg.PeerURLsMapAndToken("etcd").
        InitialPeerURLsMap types.URLsMap
        // etcd cluster token; flag "initial-cluster-token".
        InitialClusterToken string
        // Whether this is a newly created cluster; flag "initial-cluster-state".
        // Determined by:
        //   func (cfg Config) IsNewCluster() bool { return cfg.ClusterState == ClusterStateFlagNew }
        NewCluster bool
        // Flag "force-new-cluster", default false. When true, it is typically
        // used in production to recover a cluster that contains v2 data: a
        // single-node etcd cluster is created from the existing data or from
        // empty data. If data exists, the membership metadata inside it is
        // cleared and rebuilt so that it contains only this etcd member.
        ForceNewCluster bool
        // Certificate information for communication between members, used when
        // the peer URL is https; flags "peer-ca-file", "peer-cert-file",
        // "peer-key-file".
        PeerTLSInfo transport.TLSInfo
        // Interval at which a raft node sends heartbeats; flag "heartbeat-interval".
        TickMs uint
        // Timeout after which a raft node starts an election; at most 50000 ms
        // (maxElectionMs = 50000); flag "election-timeout". Best practice
        // suggests an election timeout of 10 times the heartbeat interval.
        ElectionTicks int
        // Timeout for starting the etcd server, 1s by default. Determined by:
        //   func (c *ServerConfig) bootstrapTimeout() time.Duration
        BootstrapTimeout time.Duration
        // Default 0, in hours. Keys are compacted periodically, mainly so that
        // user queries stay fast; flag "auto-compaction-retention". Set up by:
        //   func NewPeriodic(h int, rg RevGetter, c Compactable) *Periodic
        // The compaction itself is implemented by:
        //   func (s *kvServer) Compact(ctx context.Context, r *pb.CompactionRequest) (*pb.CompactionResponse, error)
        AutoCompactionRetention int
        // Size of the etcd backend data file: 2GB by default, 8GB at most.
        // A v3 parameter; flag "quota-backend-bytes". Defined in
        // etcd/etcdserver/quota.go.
        QuotaBackendBytes int64
        StrictReconfigCheck bool
        // ClientCertAuthEnabled is true when cert has been signed by the client CA.
        ClientCertAuthEnabled bool
        AuthToken string
    }
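TickMs is expressed in milliseconds while ElectionTicks is a count of ticks, so the "election-timeout" flag has to be converted before it is stored. A minimal sketch of that conversion follows; electionTicks is a hypothetical helper, and the use of plain integer division is an assumption about how the conversion is done.

```go
package main

import "fmt"

// electionTicks converts the election timeout and the heartbeat interval
// (both in milliseconds) into a number of ticks. Hypothetical helper;
// integer division is assumed, so any remainder is dropped.
func electionTicks(electionMs, tickMs uint) int {
	if tickMs == 0 {
		return 0 // guard against division by zero in this sketch
	}
	return int(electionMs / tickMs)
}

func main() {
	// heartbeat-interval=100 and election-timeout=1000 give the 10x ratio
	// recommended as best practice.
	fmt.Println(electionTicks(1000, 100))
}
```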

1.5: Call the following function to initialize the etcdServer:

    // NewServer creates a new EtcdServer from the supplied configuration. The
    // configuration is considered static for the lifetime of the EtcdServer.
    func NewServer(cfg *ServerConfig) (srv *EtcdServer, err error) {

1.5.1: Allocate memory for the store

st := store.New(StoreClusterPrefix, StoreKeysPrefix)

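1.5.2: Check for and create the data directory, then create the Transport used to send requests to the peer listeners of remote raft nodes

The timeout it uses is calculated as follows:

1.5.3: Create the raft node entity according to whether the WAL (log) directory exists.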

1.5.3.1: If the WAL directory does not exist and the flag "initial-cluster-state" is "existing":

case !haveWAL && !cfg.NewCluster:

The raft node entity is created with:

id, n, s, w = startNode(cfg, cl, nil)

1.5.3.2: If the WAL directory does not exist and the flag "initial-cluster-state" is "new":

case !haveWAL && cfg.NewCluster:

id, n, s, w = startNode(cfg, cl, cl.MemberIDs())

1.5.3.3: If the WAL directory exists:

1.5.3.3.1: If the flag "force-new-cluster" is "false":

The raft node entity is created with:

id, cl, n, s, w = restartNode(cfg, snapshot)

1.5.3.3.2: If the flag "force-new-cluster" is "true":

id, cl, n, s, w = restartAsStandaloneNode(cfg, snapshot)
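The four branches of section 1.5.3 can be summarized in one decision function. This is only a sketch: bootstrapPath is a hypothetical helper that returns the call quoted in these notes as a string, so it shows which branch is taken without touching any etcd types.

```go
package main

import "fmt"

// bootstrapPath names the branch taken in section 1.5.3, given whether a WAL
// exists, whether "initial-cluster-state" is "new", and whether
// "force-new-cluster" is set. Hypothetical helper for illustration.
func bootstrapPath(haveWAL, newCluster, forceNewCluster bool) string {
	switch {
	case !haveWAL && !newCluster:
		// Joining an existing cluster with no local log.
		return "startNode(cfg, cl, nil)"
	case !haveWAL && newCluster:
		// Bootstrapping a brand-new cluster.
		return "startNode(cfg, cl, cl.MemberIDs())"
	case haveWAL && !forceNewCluster:
		// Restarting from the existing log.
		return "restartNode(cfg, snapshot)"
	default:
		// Existing log plus force-new-cluster: single-node recovery.
		return "restartAsStandaloneNode(cfg, snapshot)"
	}
}

func main() {
	fmt.Println(bootstrapPath(false, true, false))
	fmt.Println(bootstrapPath(true, false, true))
}
```

Note that once a WAL exists, the value of "initial-cluster-state" no longer influences the branch in this summary; only "force-new-cluster" does.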

1.5.4: Initialize the EtcdServer:

    srv = &EtcdServer{
        readych:     make(chan struct{}),
        Cfg:         cfg,
        snapCount:   cfg.SnapCount,
        errorc:      make(chan error, 1),
        store:       st,
        snapshotter: ss,
        r: *newRaftNode(
            raftNodeConfig{
                isIDRemoved: func(id uint64) bool { return cl.IsIDRemoved(types.ID(id)) },
                Node:        n,
                heartbeat:   heartbeat,
                raftStorage: s,
                storage:     NewStorage(w, ss),
            },
        ),
        id:            id,
        attributes:    membership.Attributes,
        cluster:       cl,
        stats:         sstats,
        lstats:        lstats,
        SyncTicker:    time.NewTicker(500 * time.Millisecond),
        peerRt:        prt,
        reqIDGen:      idutil.NewGenerator(uint16(id), time.Now()),
        forceVersionC: make(chan struct{}),
        // ...
    }

While the EtcdServer is being initialized, the rafthttp transport that sends and receives raft messages between peers is started. The method is:

    func (t *Transport) Start() error {
        var err error
        t.streamRt, err = newStreamRoundTripper(t.TLSInfo, t.DialTimeout)
        if err != nil {
            return err
        }
        t.pipelineRt, err = NewRoundTripper(t.TLSInfo, t.DialTimeout)
        if err != nil {
            return err
        }
        t.remotes = make(map[types.ID]*remote)
        t.peers = make(map[types.ID]Peer)
        t.prober = probing.NewProber(t.pipelineRt)
        return nil
    }

2.1. Start the etcdServer

3.1. Start a server goroutine for each client URL and each peer URL to serve incoming connections. This happens after the raft HTTP transport has started:

peer server goroutine:

    go func(l *peerListener) {
        e.errHandler(l.serve())
    }(pl)

client server goroutine:

    go func(s *serveCtx) {
        e.errHandler(s.serve(e.Server, ctlscfg, v2h, e.errHandler))
    }(sctx)

If startup fails, the grpcServer is stopped:

    defer func() {
        ...
        if !serving {
            // errored before starting gRPC server for serveCtx.grpcServerC
            for _, sctx := range e.sctxs {
                close(sctx.grpcServerC)
            }
        }
        ...
    }()
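The cleanup pattern in that deferred function, a serving flag that is checked on the way out, can be shown in a self-contained sketch. Everything here is a stand-in: the serveCtx type keeps only the channel, and start and isClosed are hypothetical helpers, not part of the embed package.

```go
package main

import (
	"errors"
	"fmt"
)

// serveCtx is a stand-in for the per-listener context; only the channel
// matters for this sketch.
type serveCtx struct {
	grpcServerC chan struct{}
}

// start is a hypothetical startup routine. If it returns before serving is
// set to true, the deferred function closes every grpcServerC so that
// anything waiting on those channels is released.
func start(sctxs []*serveCtx, fail bool) (err error) {
	serving := false
	defer func() {
		if !serving {
			for _, sctx := range sctxs {
				close(sctx.grpcServerC)
			}
		}
	}()
	if fail {
		return errors.New("startup failed before serving")
	}
	serving = true
	return nil
}

// isClosed reports whether c has been closed, without blocking.
func isClosed(c chan struct{}) bool {
	select {
	case <-c:
		return true
	default:
		return false
	}
}

func main() {
	failed := &serveCtx{grpcServerC: make(chan struct{})}
	err := start([]*serveCtx{failed}, true)
	fmt.Println("error:", err, "channel closed:", isClosed(failed.grpcServerC))
}
```

Putting the check in a defer means every early return in the startup path gets the same cleanup without repeating it at each error site.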

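This is only a rough walk-through of the startup flow. Follow-up notes will analyze the concrete startup mechanism of the etcdServer, look in detail at how NewServer creates the raft node, and discuss backup and recovery schemes for etcd clusters deployed on the k8s platform.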
