[root@localhost ~]# tiup cluster start t11
Starting component `cluster`: /root/.tiup/components/cluster/v1.0.0/cluster start t11
Starting cluster t11...
+ [ Serial ] - SSHKeySet: privateKey=/root/.tiup/storage/cluster/clusters/t11/ssh/id_rsa, publicKey=/root/.tiup/storage/cluster/clusters/t11/ssh/id_rsa.pub
+ [Parallel] - UserSSH: user=tidb, host=192.168.73.128
+ [Parallel] - UserSSH: user=tidb, host=192.168.73.151
+ [Parallel] - UserSSH: user=tidb, host=192.168.73.152
+ [Parallel] - UserSSH: user=tidb, host=192.168.73.153
+ [Parallel] - UserSSH: user=tidb, host=192.168.73.151
+ [Parallel] - UserSSH: user=tidb, host=192.168.73.152
+ [Parallel] - UserSSH: user=tidb, host=192.168.73.155
+ [Parallel] - UserSSH: user=tidb, host=192.168.73.128
+ [Parallel] - UserSSH: user=tidb, host=192.168.73.128
+ [Parallel] - UserSSH: user=tidb, host=192.168.73.154
+ [Parallel] - UserSSH: user=tidb, host=192.168.73.153
+ [Parallel] - UserSSH: user=tidb, host=192.168.73.128
+ [ Serial ] - ClusterOperate: operation=StartOperation, options={Roles:[] Nodes:[] Force:false SSHTimeout:5 OptTimeout:60 APITimeout:300}
Starting component pd
        Starting instance pd 192.168.73.153:2379
        Starting instance pd 192.168.73.151:2379
        Starting instance pd 192.168.73.152:2379
        Start pd 192.168.73.151:2379 success
retry error: operation timed out after 1m0s
        pd 192.168.73.153:2379 failed to start: timed out waiting for port 2379 to be started after 1m0s, please check the log of the instance
retry error: operation timed out after 1m0s
        pd 192.168.73.152:2379 failed to start: timed out waiting for port 2379 to be started after 1m0s, please check the log of the instance

Error: failed to start: failed to start pd: pd 192.168.73.153:2379 failed to start: timed out waiting for port 2379 to be started after 1m0s, please check the log of the instance: timed out waiting for port 2379 to be started after 1m0s

Verbose debug logs has been written to /root/logs/tiup-cluster-debug-2020-06-16-16-28-39.log.
Error: run `/root/.tiup/components/cluster/v1.0.0/cluster` (wd:/root/.tiup/data/S23ruRB) failed: exit status 1
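Since the message says to check the log of the failed PD instances, a minimal follow-up sketch; the log path below assumes the default deploy_dir (/tidb-deploy), not necessarily what this cluster's topology file actually uses:

# Paths are an assumption; adjust to the deploy_dir in your topology file.
ssh tidb@192.168.73.153 'tail -n 50 /tidb-deploy/pd-2379/log/pd.log'
# Also confirm nothing else already listens on 2379 and the port is not blocked by a firewall.
ssh tidb@192.168.73.153 'ss -lntp | grep 2379'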
func main() {
    // Set up a connection to the server.
    conn, err := grpc.Dial(address, grpc.WithInsecure(), grpc.WithBlock())
    if err != nil {
        log.Fatalf("did not connect: %v", err)
    }
    defer conn.Close()
    c := pb.NewGreeterClient(conn)

    // Contact the server and print out its response.
    name := defaultName
    clientFunc := "Default"
    if len(os.Args) == 3 {
        name = os.Args[1]
        clientFunc = os.Args[2]
    } else {
        log.Fatal("Please input 2 arguments")
    }

    ctx, cancel := context.WithTimeout(context.Background(), time.Second)
    defer cancel()

    switch clientFunc {
    // ......
    case "ServerStream":
        // Server-side streaming: send one request, then read replies until io.EOF.
        stream, err := c.TellMeSomething(ctx, &pb.HelloRequest{Name: name})
        if err != nil {
            log.Fatalf("TellMeSomething error: %v", err)
        }
        for {
            something, err := stream.Recv()
            // The server has finished sending, exit the loop.
            if err == io.EOF {
                break
            }
            if err != nil {
                log.Fatalf("TellMeSomething stream error: %v", err)
            }
            log.Printf("Receive from server: {LineCode:%v Line:%s}\n", something.GetLineCode(), something.GetLine())
        }
    // ......
    }
func main() {
    // Set up a connection to the server.
    conn, err := grpc.Dial(address, grpc.WithInsecure(), grpc.WithBlock())
    if err != nil {
        log.Fatalf("did not connect: %v", err)
    }
    defer conn.Close()
    c := pb.NewGreeterClient(conn)

    // Contact the server and print out its response.
    name := defaultName
    clientFunc := "Default"
    if len(os.Args) == 3 {
        name = os.Args[1]
        clientFunc = os.Args[2]
    } else {
        log.Fatal("Please input 2 arguments")
    }
func main() {
    // Set up a connection to the server.
    conn, err := grpc.Dial(address, grpc.WithInsecure(), grpc.WithBlock())
    if err != nil {
        log.Fatalf("did not connect: %v", err)
    }
    defer conn.Close()
    c := pb.NewGreeterClient(conn)

    // Contact the server and print out its response.
    name := defaultName
    clientFunc := "Default"
    if len(os.Args) == 3 {
        name = os.Args[1]
        clientFunc = os.Args[2]
    } else {
        log.Fatal("Please input 2 arguments")
    }

    ctx, cancel := context.WithTimeout(context.Background(), time.Second)
    defer cancel()

    switch clientFunc {
    // ......
    case "Stream":
        // Bidirectional streaming: receive in a goroutine while sending from the main goroutine.
        stream, err := c.TalkWithMe(ctx)
        if err != nil {
            log.Fatalf("TalkWithMe err: %v", err)
        }
        waitc := make(chan struct{})
        go func() {
            for {
                something, err := stream.Recv()
                // The server has finished sending; close waitc so the main goroutine can return.
                if err == io.EOF {
                    close(waitc)
                    return
                }
                if err != nil {
                    log.Fatalf("TalkWithMe stream error: %v", err)
                }
                log.Printf("Got %v:%s\n", something.GetLineCode(), something.GetLine())
            }
        }()
        clientStr := []string{"one", "two", "three"}
        for i, v := range clientStr {
            if err := stream.Send(&pb.Something{LineCode: int64(i), Line: v}); err != nil {
                log.Fatalf("TalkWithMe Send error: %v", err)
            }
        }
        stream.CloseSend()
        <-waitc
    default:
        log.Fatal("Please input the second argument as one of Default/ServerStream/ClientStream/Stream")
    }
}
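For reference, a hypothetical way to run the client above; the source path client/main.go is an assumption. The program expects exactly two arguments (a name and the branch to exercise), since os.Args must have length 3:

# Hypothetical invocation; client/main.go is an assumed path.
go run client/main.go Alice ServerStream
go run client/main.go Alice Stream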
[20-02-03 19:46:34] shengang@abcs-MacBook-Pro ~/Documents/002-workspace/docker-workspace/mysql-5.7.22
$ docker container ls
CONTAINER ID   IMAGE          COMMAND                  CREATED         STATUS         PORTS     NAMES
4865e2c56f67   mysql:5.7.22   "docker-entrypoint.s…"   6 seconds ago   Up 5 seconds             mysql-5722_mysql_1
[20-02-03 19:46:39] shengang@abcs-MacBook-Pro ~/Documents/002-workspace/docker-workspace/mysql-5.7.22
$ mysql -uroot -P3306 -h127.0.0.1 -p
Enter password:
ERROR 2003 (HY000): Can't connect to MySQL server on '127.0.0.1' (61)
However, connecting to MySQL from inside the container works fine:
[20-02-03 19:46:56] shengang@abcs-MacBook-Pro ~/Documents/002-workspace/docker-workspace/mysql-5.7.22
$ docker exec -it 4865e2c56f67 bash
root@linuxkit-025000000001:/# mysql -uroot -P3306 -h127.0.0.1 -p
Enter password:
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 2
Server version: 5.7.22 MySQL Community Server (GPL)
Copyright (c) 2000, 2018, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql>
Troubleshooting
The troubleshooting was actually fairly simple: the connection works from inside the container but fails from the host, so it is almost certainly a network, port, or firewall issue. First, the PORTS column in the output of docker container ls is empty, which means no port is published to the host, even though ports is clearly configured in docker-compose.yml. Reading docker-compose.yml more carefully, there is a network_mode: "host" entry. I had never looked into this option before and had simply copied the file from the internet, so it was very likely the cause. I deleted the network_mode: "host" line and redeployed the container with docker-compose -f docker-compose.yml down followed by docker-compose -f docker-compose.yml up -d, and the connection from the host worked again.
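For illustration, a minimal docker-compose.yml sketch along the lines of the working setup; the service name, compose version, and root password below are placeholders, and the relevant point is simply that network_mode: "host" is gone and ports publishes 3306 to the host:

# Hypothetical docker-compose.yml; only the ports mapping is the point here.
version: "3"
services:
  mysql:
    image: mysql:5.7.22
    environment:
      MYSQL_ROOT_PASSWORD: "changeme"   # placeholder
    ports:
      - "3306:3306"   # publish the container port so 127.0.0.1:3306 is reachable from the host

After docker-compose -f docker-compose.yml up -d, the PORTS column of docker container ls should show the 3306 mapping.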
---
global:
  scrape_interval: 15s     # By default, scrape targets every 15 seconds.
  evaluation_interval: 15s # By default, scrape targets every 15 seconds.
  # scrape_timeout is set to the global default (10s).
  external_labels:
    cluster: 'gangshen-cluster'
    monitor: "prometheus"

# Load and evaluate rules in this file every 'evaluation_interval' seconds.
rule_files:
  - 'node.rules.yml'
  - 'blacker.rules.yml'
  - 'bypass.rules.yml'
  - 'pd.rules.yml'
  - 'tidb.rules.yml'
  - 'tikv.rules.yml'
DEPLOY_DIR={{ deploy_dir }}
cd "${DEPLOY_DIR}" || exit 1
# WARNING: This file was auto-generated. Do not edit!
#          All your edit might be overwritten!
exec > >(tee -i -a "{{ alertmanager_log_dir }}/{{ alertmanager_log_filename }}")
exec 2>&1
global:
  # The smarthost and SMTP sender used for mail notifications.
  smtp_smarthost: 'smtp.qq.com:465'
  smtp_from: 'xxxxx@qq.com'
  smtp_auth_username: 'xxxxx@qq.com'
  smtp_auth_password: '<third-party authorization code>'
  smtp_require_tls: false

  # The Slack webhook URL.
  # slack_api_url: ''

route:
  # A default receiver
  receiver: "db-alert-email"

  # The labels by which incoming alerts are grouped together. For example,
  # multiple alerts coming in for cluster=A and alertname=LatencyHigh would
  # be batched into a single group.
  group_by: ['env', 'instance', 'alertname', 'type', 'group', 'job']

  # When a new group of alerts is created by an incoming alert, wait at
  # least 'group_wait' to send the initial notification.
  # This way ensures that you get multiple alerts for the same group that start
  # firing shortly after another are batched together on the first
  # notification.
  group_wait: 30s

  # When the first notification was sent, wait 'group_interval' to send a batch
  # of new alerts that started firing for that group.
  group_interval: 3m

  # If an alert has successfully been sent, wait 'repeat_interval' to
  # resend them.
  repeat_interval: 3m
mysql> show binlog events in 'mysql-bin.000001';
+------------------+-----+-------------+-----------+-------------+----------------------------------------------------------------------------+
| Log_name         | Pos | Event_type  | Server_id | End_log_pos | Info                                                                         |
+------------------+-----+-------------+-----------+-------------+----------------------------------------------------------------------------+
| mysql-bin.000001 |   4 | Format_desc |      9999 |         120 | Server ver: 5.6.34-debug-log, Binlog ver: 4                                  |
| mysql-bin.000001 | 120 | Query       |      9999 |         207 | BEGIN                                                                        |
| mysql-bin.000001 | 207 | Query       |      9999 |         347 | use `gangshen`; update test1 set name='woqutech_new' where name='woqutech'  |
| mysql-bin.000001 | 347 | Xid         |      9999 |         378 | COMMIT /* xid=46 */                                                          |
+------------------+-----+-------------+-----------+-------------+----------------------------------------------------------------------------+
4 rows in set (0.00 sec)
From this output we can see that the statement update test1 set name='woqutech_new' where name='woqutech'; was written to the binlog as 3 events: Query, Query, and Xid. The first Query event marks the start of the statement (BEGIN), the second Query event carries the statement that was actually executed, and the final Xid event marks the commit. Viewing the binlog contents from within the database like this gives a very intuitive view of the event types and their main content; next, let's see what the concrete contents of the binlog file look like through the mysqlbinlog tool.
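As a rough sketch, the same file can be dumped directly with mysqlbinlog; the path below is an assumption, so point it at the binlog file in your actual datadir:

# Path is an assumption; adjust to the binlog file location in your datadir.
mysqlbinlog /var/lib/mysql/mysql-bin.000001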
From the database output we can see that the statement delete from test1 where id = 1; was stored in the binlog file as 5 events: Query, Rows_query, Table_map, Delete_rows, and Xid. Among these, the Query event marks the start of the statement, the Rows_query event records the text of the statement, the Table_map event records information about the table the delete statement operates on, the Delete_rows event records the actual rows that were deleted, and the final Xid event represents the COMMIT.
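Row-format events store the changed rows in binary form, so when dumping them a couple of extra mysqlbinlog options are usually added to make them readable; a sketch, again with an assumed file path:

# -vv reconstructs pseudo-SQL from the row events; decode-rows hides the raw base64 payload.
mysqlbinlog --base64-output=decode-rows -vv /var/lib/mysql/mysql-bin.000001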
mysql> show binary logs;
+------------------+-----------+
| Log_name         | File_size |
+------------------+-----------+
| mysql-bin.000001 |       120 |
+------------------+-----------+
1 row in set (0.00 sec)
mysql> show binlog events in 'mysql-bin.000001';
+------------------+-----+-------------+-----------+-------------+---------------------------------------------+
| Log_name         | Pos | Event_type  | Server_id | End_log_pos | Info                                        |
+------------------+-----+-------------+-----------+-------------+---------------------------------------------+
| mysql-bin.000001 |   4 | Format_desc |      9999 |         120 | Server ver: 5.6.34-debug-log, Binlog ver: 4 |
+------------------+-----+-------------+-----------+-------------+---------------------------------------------+
1 row in set (0.00 sec)
mysql> delete from test1 where name = 'woqutech_new';
Query OK, 1 row affected (0.01 sec)
mysql> show binlog events in 'mysql-bin.000001';
+------------------+-----+-------------+-----------+-------------+---------------------------------------------------------------+
| Log_name         | Pos | Event_type  | Server_id | End_log_pos | Info                                                          |
+------------------+-----+-------------+-----------+-------------+---------------------------------------------------------------+
| mysql-bin.000001 |   4 | Format_desc |      9999 |         120 | Server ver: 5.6.34-debug-log, Binlog ver: 4                   |
| mysql-bin.000001 | 120 | Query       |      9999 |         207 | BEGIN                                                         |
| mysql-bin.000001 | 207 | Query       |      9999 |         334 | use `gangshen`; delete from test1 where name = 'woqutech_new' |
| mysql-bin.000001 | 334 | Xid         |      9999 |         365 | COMMIT /* xid=58 */                                           |
+------------------+-----+-------------+-----------+-------------+---------------------------------------------------------------+
4 rows in set (0.00 sec)
From this output we can see that the statement delete from test1 where name = 'woqutech_new'; was likewise written to the binlog as 3 events: Query, Query, and Xid. The first Query event marks the start of the statement, the second Query event carries the statement that was actually executed, and the final Xid event marks the commit. Viewing the binlog contents from within the database gives an intuitive view of the event types and their main content; next, let's see what the concrete contents of the binlog file look like through the mysqlbinlog tool.