Orchestrator graceful master takeover: source code walkthrough (unfinished)

tags: Useful  mysql  orchestrator  golang  

Orchestrator source code walkthrough

Part 1

I recently did two rounds of custom development on Orchestrator, so I studied the code behind the graceful master takeover.

Entry point

We start from orchestrator-client, the client that issues the request.
Its code is located at:

orchestrator/resources/bin/orchestrator-client

main:

function main {
  check_requirements
  detect_leader_api

  instance_hostport=$(to_hostport $instance)
  destination_hostport=$(to_hostport $destination)

  run_command
}
The actual work happens inside run_command.
There, find the graceful takeover command: graceful-master-takeover-auto

function graceful_master_takeover_auto {
  assert_nonempty "instance|alias" "${alias:-$instance}"
  if [ -z "$destination_hostport" ] ; then
    # No destination given.
    api "graceful-master-takeover-auto/${alias:-$instance}"
  else
    # Explicit destination (designated master) given.
    # This is where the server-side API is requested.
    api "graceful-master-takeover-auto/${alias:-$instance}/${destination_hostport}"
  fi
  print_details | jq '.SuccessorKey' | print_key
}
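
The branch in the shell function above comes down to building one of two URL paths. Here is a minimal Go sketch of that decision. The helper name takeoverPath is made up for illustration and is not part of orchestrator, and the sketch assumes the destination has already been converted to host/port form, as destination_hostport is in main.

```go
package main

import "fmt"

// takeoverPath is a hypothetical helper (not orchestrator code). It mirrors
// the shell branch: the destination segment is appended only when a
// destination was given, and an empty instance/alias is rejected.
func takeoverPath(instanceOrAlias, destinationHostport string) (string, error) {
	if instanceOrAlias == "" {
		return "", fmt.Errorf("instance|alias must be provided")
	}
	path := "graceful-master-takeover-auto/" + instanceOrAlias
	if destinationHostport != "" {
		// Explicit destination (designated master) given.
		path += "/" + destinationHostport
	}
	return path, nil
}

func main() {
	p, _ := takeoverPath("mycluster", "")
	fmt.Println(p)
	p, _ = takeoverPath("mycluster", "db2.example.com/3306")
	fmt.Println(p)
}
```

The two resulting shapes line up with the :clusterHint and :clusterHint/:designatedHost/:designatedPort routes registered on the server side.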

The server-side API code is located at:

go/http/api.go

API entry function:
func (this *HttpAPI) registerAPIRequestInternal(m *martini.ClassicMartini, path string, handler martini.Handler, allowProxy bool) 
...
	this.registerAPIRequest(m, "graceful-master-takeover/:host/:port", this.GracefulMasterTakeover)
	this.registerAPIRequest(m, "graceful-master-takeover/:host/:port/:designatedHost/:designatedPort", this.GracefulMasterTakeover)
	this.registerAPIRequest(m, "graceful-master-takeover/:clusterHint", this.GracefulMasterTakeover)
	this.registerAPIRequest(m, "graceful-master-takeover/:clusterHint/:designatedHost/:designatedPort", this.GracefulMasterTakeover)
	this.registerAPIRequest(m, "graceful-master-takeover-auto/:host/:port", this.GracefulMasterTakeoverAuto)
	this.registerAPIRequest(m, "graceful-master-takeover-auto/:host/:port/:designatedHost/:designatedPort", this.GracefulMasterTakeoverAuto)
	this.registerAPIRequest(m, "graceful-master-takeover-auto/:clusterHint", this.GracefulMasterTakeoverAuto)
	this.registerAPIRequest(m, "graceful-master-takeover-auto/:clusterHint/:designatedHost/:designatedPort", this.GracefulMasterTakeoverAuto)

func (this *HttpAPI) GracefulMasterTakeoverAuto(params martini.Params, r render.Render, req *http.Request, user auth.User) {
	this.gracefulMasterTakeover(params, r, req, user, true)
}
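
The routes above come in two shapes: one identifies the cluster by :host/:port, the other by a free-form :clusterHint. Below is a simplified sketch of how a handler could reduce both shapes to a single hint string. It assumes the parameters arrive as a plain map (martini.Params is a map[string]string). The function clusterHint is written for illustration only and may differ from orchestrator's own getClusterHint.

```go
package main

import "fmt"

// clusterHint is a hypothetical helper (not orchestrator code). It prefers an
// explicit clusterHint parameter and otherwise falls back to host:port.
func clusterHint(params map[string]string) string {
	if hint := params["clusterHint"]; hint != "" {
		return hint
	}
	if params["host"] != "" && params["port"] != "" {
		return params["host"] + ":" + params["port"]
	}
	// Nothing usable was supplied.
	return ""
}

func main() {
	fmt.Println(clusterHint(map[string]string{"clusterHint": "mycluster"}))
	fmt.Println(clusterHint(map[string]string{"host": "db1.example.com", "port": "3306"}))
}
```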

It delegates to gracefulMasterTakeover, the shared API handler for the graceful takeover:

func (this *HttpAPI) gracefulMasterTakeover(params martini.Params, r render.Render, req *http.Request, user auth.User, auto bool) {
	if !isAuthorizedForAction(req, user) {
		Respond(r, &APIResponse{Code: ERROR, Message: "Unauthorized"})
		return
	}
	// Resolve the name of the cluster; by default the name has the form hostname:port
	clusterName, err := figureClusterName(getClusterHint(params))
	if err != nil {
		Respond(r, &APIResponse{Code: ERROR, Message: err.Error()})
		return
	}
	// A specific instance may have been passed in; build its key here
	designatedKey, _ := this.getInstanceKey(params["designatedHost"], params["designatedPort"])
	// designatedKey may be empty/invalid
	// Call GracefulMasterTakeover in the logic package, which holds the actual takeover logic.
	// File: go/logic/topology_recovery.go
	topologyRecovery, _, err := logic.GracefulMasterTakeover(clusterName, &designatedKey, auto)
	if err != nil {
		Respond(r, &APIResponse{Code: ERROR, Message: err.Error(), Details: topologyRecovery})
		return
	}
	if topologyRecovery == nil || topologyRecovery.SuccessorKey == nil {
		Respond(r, &APIResponse{Code: ERROR, Message: "graceful-master-takeover: no successor promoted", Details: topologyRecovery})
		return
	}
	Respond(r, &APIResponse{Code: OK, Message: "graceful-master-takeover: successor promoted", Details: topologyRecovery})
}

topology_recovery.go:

// Entry point of the server-side graceful takeover logic.
func GracefulMasterTakeover(clusterName string, designatedKey *inst.InstanceKey, auto bool) (topologyRecovery *TopologyRecovery, promotedMasterCoordinates *inst.BinlogCoordinates, err error) {
	// inst.ReadClusterMaster reads the master(s) of the named cluster from
	// orchestrator's backend database by executing the following SQL statement:
	/*
		select
			*,
			unix_timestamp() - unix_timestamp(last_checked) as seconds_since_last_checked,
			ifnull(last_checked <= last_seen, 0) as is_last_check_valid,
			unix_timestamp() - unix_timestamp(last_seen) as seconds_since_last_seen,
			candidate_database_instance.last_suggested is not null
				and candidate_database_instance.promotion_rule in ('must', 'prefer') as is_candidate,
			ifnull(nullif(candidate_database_instance.promotion_rule, ''), 'neutral') as promotion_rule,
			ifnull(unresolved_hostname, '') as unresolved_hostname,
			(database_instance_downtime.downtime_active is not null and ifnull(database_instance_downtime.end_timestamp, now()) > now()) as is_downtimed,
			ifnull(database_instance_downtime.reason, '') as downtime_reason,
			ifnull(database_instance_downtime.owner, '') as downtime_owner,
			ifnull(unix_timestamp() - unix_timestamp(begin_timestamp), 0) as elapsed_downtime_seconds,
			ifnull(database_instance_downtime.end_timestamp, '') as downtime_end_timestamp
		from
			database_instance
			left join candidate_database_instance using (hostname, port)
			left join hostname_unresolve using (hostname)
			left join database_instance_downtime using (hostname, port)
		where
			cluster_name = 'xxxxx'
			and (replication_depth = 0 or is_co_master)
	*/
	clusterMasters, err := inst.ReadClusterMaster(clusterName)
	if err != nil {
		return nil, nil, fmt.Errorf("Cannot deduce cluster master for %+v; error: %+v", clusterName, err)
	}
	// If the cluster has no master, or more than one, fail immediately
	if len(clusterMasters) != 1 {
		return nil, nil, fmt.Errorf("Cannot deduce cluster master for %+v. Found %+v potential masters", clusterName, len(clusterMasters))
	}
	// The result is a slice; take its single element, which is the master instance
	clusterMaster := clusterMasters[0]

	clusterMasterDirectReplicas, err := inst.ReadReplicaInstances(&clusterMaster.Key)
	if err != nil {
		return nil, nil, log.Errore(err)
	}

	if len(clusterMasterDirectReplicas) == 0 {
		return nil, nil, fmt.Errorf("Master %+v doesn't seem to have replicas", clusterMaster.Key)
	}

	if designatedKey != nil && !designatedKey.IsValid() {
		// An empty or invalid key is as good as no key
		designatedKey = nil
	}
	designatedInstance, err := getGracefulMasterTakeoverDesignatedInstance(&clusterMaster.Key, designatedKey, clusterMasterDirectReplicas, auto)
	if err != nil {
		return nil, nil, log.Errore(err)
	}

	if inst.IsBannedFromBeingCandidateReplica(designatedInstance) {
		return nil, nil, fmt.Errorf("GracefulMasterTakeover: designated instance %+v cannot be promoted due to promotion rule or it is explicitly ignored in PromotionIgnoreHostnameFilters configuration", designatedInstance.Key)
	}

	masterOfDesignatedInstance, err := inst.GetInstanceMaster(designatedInstance)
	if err != nil {
		return nil, nil, err
	}
	if !masterOfDesignatedInstance.Key.Equals(&clusterMaster.Key) {
		return nil, nil, fmt.Errorf("Sanity check failure. It seems like the designated instance %+v does not replicate from the master %+v (designated instance's master key is %+v). This error is strange. Panicking", designatedInstance.Key, clusterMaster.Key, designatedInstance.MasterKey)
	}
	if !designatedInstance.HasReasonableMaintenanceReplicationLag() {
		return nil, nil, fmt.Errorf("Desginated instance %+v seems to be lagging to much for thie operation. Aborting.", designatedInstance.Key)
	}

	if len(clusterMasterDirectReplicas) > 1 {
		log.Infof("GracefulMasterTakeover: Will let %+v take over its siblings", designatedInstance.Key)
		relocatedReplicas, _, err, _ := inst.RelocateReplicas(&clusterMaster.Key, &designatedInstance.Key, "")
		if len(relocatedReplicas) != len(clusterMasterDirectReplicas)-1 {
			// We are unable to make designated instance master of all its siblings
			relocatedReplicasKeyMap := inst.NewInstanceKeyMap()
			relocatedReplicasKeyMap.AddInstances(relocatedReplicas)
			// Let's see which replicas have not been relocated
			for _, directReplica := range clusterMasterDirectReplicas {
				if relocatedReplicasKeyMap.HasKey(directReplica.Key) {
					// relocated, good
					continue
				}
				if directReplica.Key.Equals(&designatedInstance.Key) {
					// obviously we skip this one
					continue
				}
				if directReplica.IsDowntimed {
					// obviously we skip this one
					log.Warningf("GracefulMasterTakeover: unable to relocate %+v below designated %+v, but since it is downtimed (downtime reason: %s) I will proceed", directReplica.Key, designatedInstance.Key, directReplica.DowntimeReason)
					continue
				}
				return nil, nil, fmt.Errorf("Desginated instance %+v cannot take over all of its siblings. Error: %+v", designatedInstance.Key, err)
			}
		}
	}
	log.Infof("GracefulMasterTakeover: Will demote %+v and promote %+v instead", clusterMaster.Key, designatedInstance.Key)

	replicationCreds, replicationCredentialsError := inst.ReadReplicationCredentials(&designatedInstance.Key)

	analysisEntry, err := forceAnalysisEntry(clusterName, inst.DeadMaster, inst.GracefulMasterTakeoverCommandHint, &clusterMaster.Key)
	if err != nil {
		return nil, nil, err
	}
	preGracefulTakeoverTopologyRecovery := &TopologyRecovery{
		SuccessorKey:  &designatedInstance.Key,
		AnalysisEntry: analysisEntry,
	}
	if err := executeProcesses(config.Config.PreGracefulTakeoverProcesses, "PreGracefulTakeoverProcesses", preGracefulTakeoverTopologyRecovery, true); err != nil {
		return nil, nil, fmt.Errorf("Failed running PreGracefulTakeoverProcesses: %+v", err)
	}

	log.Infof("GracefulMasterTakeover: Will set %+v as read_only", clusterMaster.Key)
	if clusterMaster, err = inst.SetReadOnly(&clusterMaster.Key, true); err != nil {
		return nil, nil, err
	}
	demotedMasterSelfBinlogCoordinates := &clusterMaster.SelfBinlogCoordinates
	log.Infof("GracefulMasterTakeover: Will wait for %+v to reach master coordinates %+v", designatedInstance.Key, *demotedMasterSelfBinlogCoordinates)
	if designatedInstance, _, err = inst.WaitForExecBinlogCoordinatesToReach(&designatedInstance.Key, demotedMasterSelfBinlogCoordinates, time.Duration(config.Config.ReasonableMaintenanceReplicationLagSeconds)*time.Second); err != nil {
		return nil, nil, err
	}
	promotedMasterCoordinates = &designatedInstance.SelfBinlogCoordinates

	log.Infof("GracefulMasterTakeover: attempting recovery")
	recoveryAttempted, topologyRecovery, err := ForceExecuteRecovery(analysisEntry, &designatedInstance.Key, false)
	if err != nil {
		log.Errorf("GracefulMasterTakeover: noting an error, and for now proceeding: %+v", err)
	}
	if !recoveryAttempted {
		return nil, nil, fmt.Errorf("GracefulMasterTakeover: unexpected error: recovery not attempted. This should not happen")
	}
	if topologyRecovery == nil {
		return nil, nil, fmt.Errorf("GracefulMasterTakeover: recovery attempted but with no results. This should not happen")
	}
	if topologyRecovery.SuccessorKey == nil {
		// Promotion fails.
		// Undo setting read-only on original master.
		inst.SetReadOnly(&clusterMaster.Key, false)
		return nil, nil, fmt.Errorf("GracefulMasterTakeover: Recovery attempted yet no replica promoted; err=%+v", err)
	}
	var gtidHint inst.OperationGTIDHint = inst.GTIDHintNeutral
	if topologyRecovery.RecoveryType == MasterRecoveryGTID {
		gtidHint = inst.GTIDHintForce
	}
	clusterMaster, err = inst.ChangeMasterTo(&clusterMaster.Key, &designatedInstance.Key, promotedMasterCoordinates, false, gtidHint)
	if !clusterMaster.SelfBinlogCoordinates.Equals(demotedMasterSelfBinlogCoordinates) {
		log.Errorf("GracefulMasterTakeover: sanity problem. Demoted master's coordinates changed from %+v to %+v while supposed to have been frozen", *demotedMasterSelfBinlogCoordinates, clusterMaster.SelfBinlogCoordinates)
	}
	if !clusterMaster.HasReplicationCredentials && replicationCredentialsError == nil {
		_, credentialsErr := inst.ChangeMasterCredentials(&clusterMaster.Key, replicationCreds)
		if err == nil {
			err = credentialsErr
		}
	}
	if designatedInstance.AllowTLS {
		_, enableSSLErr := inst.EnableMasterSSL(&clusterMaster.Key)
		if err == nil {
			err = enableSSLErr
		}
	}
	if auto {
		_, startReplicationErr := inst.StartReplication(&clusterMaster.Key)
		if err == nil {
			err = startReplicationErr
		}
	}
	executeProcesses(config.Config.PostGracefulTakeoverProcesses, "PostGracefulTakeoverProcesses", topologyRecovery, false)

	return topologyRecovery, promotedMasterCoordinates, err
}
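
One of the less obvious parts of the function above is the sibling check inside the len(clusterMasterDirectReplicas) > 1 branch. After RelocateReplicas, a direct replica of the old master only blocks the takeover if it was not relocated, is not the designated instance itself, and is not downtimed. Here is a minimal sketch of that rule. The replica type is a hypothetical, stripped-down stand-in for inst.Instance with keys as plain strings, and unrelocatedBlocker is an illustrative name, not orchestrator code. Unlike the original, the sketch always walks the list instead of first comparing the counts; the outcome is the same.

```go
package main

import "fmt"

// replica is a hypothetical stand-in for inst.Instance.
type replica struct {
	Key         string
	IsDowntimed bool
}

// unrelocatedBlocker returns the key of the first direct replica that
// prevents the takeover, or "" when the takeover may proceed.
func unrelocatedBlocker(direct []replica, relocated []string, designated string) string {
	moved := make(map[string]bool, len(relocated))
	for _, key := range relocated {
		moved[key] = true
	}
	for _, r := range direct {
		if moved[r.Key] {
			continue // relocated below the designated instance, good
		}
		if r.Key == designated {
			continue // the designated instance is never relocated under itself
		}
		if r.IsDowntimed {
			continue // downtimed replicas are tolerated (the original logs a warning)
		}
		return r.Key
	}
	return ""
}

func main() {
	direct := []replica{
		{Key: "db2:3306"},
		{Key: "db3:3306"},
		{Key: "db4:3306", IsDowntimed: true},
	}
	// db3 was relocated and db4 is downtimed, so nothing blocks the takeover.
	fmt.Printf("%q\n", unrelocatedBlocker(direct, []string{"db3:3306"}, "db2:3306"))
	// Nothing was relocated: db3 is healthy and still under the old master.
	fmt.Printf("%q\n", unrelocatedBlocker(direct, nil, "db2:3306"))
}
```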
