MapReduce剖析笔记之四:TaskTracker通过心跳机制获取任务的流程

时间:2023-03-08 22:23:38

上一节分析到了JobTracker把作业从队列里取出来并进行了初始化,所谓的初始化,主要是获取了Map、Reduce任务的数量,并统计了哪些DataNode所在的服务器可以处理哪些Split等等,将这些信息缓存起来,但还没有进行实质的分配。等待TaskTracker跟自己通信。

TaskTracker一般运行于DataNode之上,下面是它的声明,可见,是一个线程类:

/*******************************************************
* TaskTracker is a process that starts and tracks MR Tasks
* in a networked environment. It contacts the JobTracker
* for Task assignments and reporting results.
*
*******************************************************/
public class TaskTracker implements MRConstants, TaskUmbilicalProtocol,
Runnable, TaskTrackerMXBean
。。。。。

TaskTracker里面存在一个main入口,当启动TaskTracker时进入该方法:

  /**
* Start the TaskTracker, point toward the indicated JobTracker
*/
public static void main(String argv[]) throws Exception {
StringUtils.startupShutdownMessage(TaskTracker.class, argv, LOG);
if (argv.length != 0) {
System.out.println("usage: TaskTracker");
System.exit(-1);
}
try {
JobConf conf=new JobConf();
// enable the server to track time spent waiting on locks
ReflectionUtils.setContentionTracing
(conf.getBoolean("tasktracker.contention.tracking", false));
DefaultMetricsSystem.initialize("TaskTracker");
TaskTracker tt = new TaskTracker(conf);
MBeans.register("TaskTracker", "TaskTrackerInfo", tt);
tt.run();
} catch (Throwable e) {
LOG.error("Can not start task tracker because "+
StringUtils.stringifyException(e));
System.exit(-1);
}
}

可见,在里面会执行TaskTracker tt = new TaskTracker(conf);创建一个TaskTracker对象,并利用tt.run()进入其run方法:

  /**
* The server retry loop.
* This while-loop attempts to connect to the JobTracker. It only
* loops when the old TaskTracker has gone bad (its state is
* stale somehow) and we need to reinitialize everything.
*/
public void run() {
try {
getUserLogManager().start();
startCleanupThreads();
boolean denied = false;
while (running && !shuttingDown) {
boolean staleState = false;
try {
// This while-loop attempts reconnects if we get network errors
while (running && !staleState && !shuttingDown && !denied) {
try {
State osState = offerService();
............

前面是获得写日志管理器,开启清理线程等操作。之后则进入offerService这个方法,这个方法比较长,offerService基本就是一个永远的循环,这从开始的代码可以看出:

  /**
* Main service loop. Will stay in this loop forever.
*/
State offerService() throws Exception {
long lastHeartbeat = System.currentTimeMillis(); while (running && !shuttingDown) {
。。。。
// Send the heartbeat and process the jobtracker's directives
HeartbeatResponse heartbeatResponse = transmitHeartBeat(now);
。。。。

在这个方法里,只要TaskTracker还没有停止,就会一直循环,在循环里面,如果超过了心跳间隔,就会执行transmitHeartBeat这个心跳方法。我们逐步来分析。

首先TaskTracker会往JobTracker请求,查看版本信息和系统目录等。这里采用的也是RPC机制,并且与前面各节分析的客户端的机制相同,TaskTracker里面也包含一个jobClient,其类型是InterTrackerProtocol,这个协议其实和客户端那个JobSubmissionProtocol类似,都是一个继承于VersionedProtocol的接口,其定义为:

interface InterTrackerProtocol extends VersionedProtocol {
HeartbeatResponse heartbeat(TaskTrackerStatus status,
boolean restarted,
boolean initialContact,
boolean acceptNewTasks,
short responseId)
throws IOException; public String getFilesystemName() throws IOException; public void reportTaskTrackerError(String taskTracker,
String errorClass,
String errorMessage) throws IOException; TaskCompletionEvent[] getTaskCompletionEvents(JobID jobid, int fromEventId , int maxEvents) throws IOException; public String getSystemDir(); public String getVIVersion() throws IOException;
}

这里面的方法实际上都由JobTracker实现,这可以从JobTracker的定义看出来,它实现了InterTrackerProtocol、JobSubmissionProtocol等接口,同时为客户端、TaskTracker等提供RPC服务:

public class JobTracker implements MRConstants, InterTrackerProtocol,
JobSubmissionProtocol, TaskTrackerManager, RefreshUserMappingsProtocol,
RefreshAuthorizationPolicyProtocol, AdminOperationsProtocol,
JobTrackerMXBean
。。。。。

利用这些RPC服务,TaskTracker如果是第一次启动,会查看版本信息和系统目录信息(后面再执行offerService时则跳过这些步骤):

        // If the TaskTracker is just starting up:
// 1. Verify the versions matches with the JobTracker
// 2. Get the system directory & filesystem
if(justInited) {
String jtBuildVersion = jobClient.getBuildVersion();
String jtVersion = jobClient.getVIVersion();
。。。。。。。。。
String dir = jobClient.getSystemDir();
。。。。。。。。。
systemDirectory = new Path(dir);
systemFS = systemDirectory.getFileSystem(fConf);
}

之后,会进行心跳方法transmitHeartBeat,这个方法也比较长,也分为几步,首先会判断是否存在任务状态的对象,如果不存在就创建一个:

    if (status == null) {
synchronized (this) {
status = new TaskTrackerStatus(taskTrackerName, localHostname,
httpPort,
cloneAndResetRunningTaskStatuses(
sendCounters),
taskFailures,
localStorage.numFailures(),
maxMapSlots,
maxReduceSlots);
}
} else {
LOG.info("Resending 'status' to '" + jobTrackAddr.getHostName() +
"' with reponseId '" + heartbeatResponseId);
}

TaskTrackerStatus记录了TaskTracker当前的任务状态,可以看出,里面包含两个参数:maxMapSlots、maxReduceSlots,分别代表了本机可以最多执行的Map任务和Reduce任务数量,这两个数量是配置的,从代码里看出,默认都是2:

    maxMapSlots = conf.getInt( "mapred.tasktracker.map.tasks.maximum", 2);
maxReduceSlots = conf.getInt( "mapred.tasktracker.reduce.tasks.maximum", 2);

那么,在初始化TaskTrackerStatus的时候,就把这两个参数记录下来。另外,从TaskTracker与JobTracker之间的心跳RPC服务来看,TaskTrackerStatus对象会被传递至JobTracker,供其分配任务作为参考:

  HeartbeatResponse heartbeat(TaskTrackerStatus status,
boolean restarted,
boolean initialContact,
boolean acceptNewTasks,
short responseId)
throws IOException;

接下来,TaskTracker准备给TaskTrackerStatus对象赋值,然后传递至JobTracker。在赋值时,首先判断下是否能接收新任务,通过判断当前的任务数量是否超过了配置的最大数量得到,如果可以接收新任务,则把本机的内存信息等写入到TaskTrackerStatus相关的变量中:

    //
// Check if we should ask for a new Task
//
boolean askForNewTask;
long localMinSpaceStart;
synchronized (this) {
askForNewTask =
((status.countOccupiedMapSlots() < maxMapSlots ||
status.countOccupiedReduceSlots() < maxReduceSlots) &&
acceptNewTasks);
localMinSpaceStart = minSpaceStart;
}
if (askForNewTask) {
askForNewTask = enoughFreeSpace(localMinSpaceStart);
long freeDiskSpace = getFreeSpace();
long totVmem = getTotalVirtualMemoryOnTT();
long totPmem = getTotalPhysicalMemoryOnTT();
long availableVmem = getAvailableVirtualMemoryOnTT();
long availablePmem = getAvailablePhysicalMemoryOnTT();
long cumuCpuTime = getCumulativeCpuTimeOnTT();
long cpuFreq = getCpuFrequencyOnTT();
int numCpu = getNumProcessorsOnTT();
float cpuUsage = getCpuUsageOnTT(); status.getResourceStatus().setAvailableSpace(freeDiskSpace);
status.getResourceStatus().setTotalVirtualMemory(totVmem);
status.getResourceStatus().setTotalPhysicalMemory(totPmem);
status.getResourceStatus().setMapSlotMemorySizeOnTT(
mapSlotMemorySizeOnTT);
status.getResourceStatus().setReduceSlotMemorySizeOnTT(
reduceSlotSizeMemoryOnTT);
status.getResourceStatus().setAvailableVirtualMemory(availableVmem);
status.getResourceStatus().setAvailablePhysicalMemory(availablePmem);
status.getResourceStatus().setCumulativeCpuTime(cumuCpuTime);
status.getResourceStatus().setCpuFrequency(cpuFreq);
status.getResourceStatus().setNumProcessors(numCpu);
status.getResourceStatus().setCpuUsage(cpuUsage);
}

这些资源信息包括物理内存、虚拟内存、Map/Reduce任务槽的内存占用、累积的CPU时间、CPU频率、CPU数量、使用率等各种资源状态。

另外,还进行了健康检查,把信息也记录在里面,这里用一个TaskTrackerHealthStatus记录健康状态。这里进行健康检查实际上让系统运行一个脚本:

    TaskTrackerHealthStatus healthStatus = status.getHealthStatus();
synchronized (this) {
if (healthChecker != null) {
healthChecker.setHealthStatus(healthStatus);
} else {
healthStatus.setNodeHealthy(true);
healthStatus.setLastReported(0L);
healthStatus.setHealthReport("");
}
}

记录了这些信息后,就执行心跳方法:

    HeartbeatResponse heartbeatResponse = jobClient.heartbeat(status,
justStarted,
justInited,
askForNewTask,
heartbeatResponseId);

前面看到,jobClient实际上是一个接口,当调用该方法的时候,利用动态代理,会将该方法交由JobTracker端的同名方法执行。

这个方法较为复杂,我们分几个步骤分别分析。

1、检查TaskTracker的合法性

JobTracker会首先检查该TaskTracker的合法性:

    // Make sure heartbeat is from a tasktracker allowed by the jobtracker.
if (!acceptTaskTracker(status)) {
throw new DisallowedTaskTrackerException(status);
} /**
* Returns true if the tasktracker is in the hosts list and
* not in the exclude list.
*/
private boolean acceptTaskTracker(TaskTrackerStatus status) {
return (inHostsList(status) && !inExcludedHostsList(status));
}

inHostsList和inExcludedHostsList的代码为:

  /**
* Return if the specified tasktracker is in the hosts list,
* if one was configured. If none was configured, then this
* returns true.
*/
private boolean inHostsList(TaskTrackerStatus status) {
Set<String> hostsList = hostsReader.getHosts();
return (hostsList.isEmpty() || hostsList.contains(status.getHost()));
} /**
* Return if the specified tasktracker is in the exclude list.
*/
private boolean inExcludedHostsList(TaskTrackerStatus status) {
Set<String> excludeList = hostsReader.getExcludedHosts();
return excludeList.contains(status.getHost());
}

JobTracker里面存在一个HostsFileReader对象hostsReader,用于记录是否合法的TaskTracker,该对象在JobTracker创建时会被创建出来,该对象存在两个队列,其定义为:

// Keeps track of which datanodes/tasktrackers are allowed to connect to the
// namenode/jobtracker.
public class HostsFileReader {
private Set<String> includes;
private Set<String> excludes;
private String includesFile;
private String excludesFile;
。。。。。。

在这个对象里,includes表示可以访问的host列表,excludes则表示不可访问的host列表,最初,这两个列表的内容是根据两个配置文件生成的:

    // Read the hosts/exclude files to restrict access to the jobtracker.
this.hostsReader = new HostsFileReader(conf.get("mapred.hosts", ""),
conf.get("mapred.hosts.exclude", ""));

2.更新TaskTracker的健康状态

如果是一个合法的TaskTracker,接下来还会检查是否在灰名单里,对于JobTracker而言,不是每个TaskTracker都适合分配任务,比如有的TaskTracker心跳间隔超时了, 出现过故障等等,JobTracker使用一个FaultyTrackersInfo对象记录这些TaskTracker:

  // statistics about TaskTrackers with faults; may lead to graylisting
private FaultyTrackersInfo faultyTrackers = new FaultyTrackersInfo();

其检查代码为:

    // First check if the last heartbeat response got through
String trackerName = status.getTrackerName();
long now = clock.getTime();
if (restarted) {
faultyTrackers.markTrackerHealthy(status.getHost());
} else {
faultyTrackers.checkTrackerFaultTimeout(status.getHost(), now);
}

restarted是TaskTracker通过heartbeat方法传递过来的,这说明如果TaskTracker是重启了的,则认为该TaskTracker处于健康状态,进行标记:

    /**
* Removes the tracker from the blacklist, graylist, and
* potentially-faulty list, when it is restarted.
*
* Assumes JobTracker is locked on the entry.
*
* @param hostName
*/
void markTrackerHealthy(String hostName) {
synchronized (potentiallyFaultyTrackers) {
FaultInfo fi = potentiallyFaultyTrackers.remove(hostName);
if (fi != null) {
// a tracker can be both blacklisted and graylisted, so check both
if (fi.isGraylisted()) {
LOG.info("Marking " + hostName + " healthy from graylist");
decrGraylistedTrackers(getNumTaskTrackersOnHost(hostName));
}
if (fi.isBlacklisted()) {
LOG.info("Marking " + hostName + " healthy from blacklist");
addHostCapacity(hostName);
}
// no need for fi.unBlacklist() for either one: fi is already gone
}
}
}

从上面的代码看出,对于刚刚重启的主机,把它从出错的TaskTracker集合potentiallyFaultyTrackers里面删除。并更新numGraylistedTrackers、numBlacklistedTrackers等数量以及totalMapTaskCapacity、totalReduceTaskCapacity的数量。

对于不是重启的主机,检查是否应该放入黑名单:

    void checkTrackerFaultTimeout(String hostName, long now) {
synchronized (potentiallyFaultyTrackers) {
FaultInfo fi = potentiallyFaultyTrackers.get(hostName);
// getFaultCount() auto-rotates the buckets, clearing out the oldest
// as needed, before summing the faults:
if (fi != null && fi.getFaultCount(now) < TRACKER_FAULT_THRESHOLD) {
unBlacklistTracker(hostName, ReasonForBlackListing.EXCEEDING_FAILURES,
true, now);
}
}
}

所谓黑名单,其判断标准是当前TaskTracker所在的主机是否发生了超过了TRACKER_FAULT_THRESHOLD=4次失败的情况。按照Hadoop代码,应该是以滑窗方式检测一个主机是否应该放入黑名单,从下面的说明可以看出,如果在3小时范围内,出现了大于4次失败,则应该放入黑名单。

  // Fault threshold (number occurring within TRACKER_FAULT_TIMEOUT_WINDOW)
// to consider a task tracker bad enough to blacklist heuristically. This
// is functionally the same as the older "MAX_BLACKLISTS_PER_TRACKER" value.
private int TRACKER_FAULT_THRESHOLD; // = 4; // Width of overall fault-tracking sliding window (in minutes). (Default
// of 24 hours matches previous "UPDATE_FAULTY_TRACKER_INTERVAL" value that
// was used to forgive a single fault if no others occurred in the interval.)
private int TRACKER_FAULT_TIMEOUT_WINDOW; // = 180 (3 hours)

3.判断JobTracker自身的故障、心跳是否重复等

JobTracker会保存一个映射表,记录了各个TaskTracker上一次的心跳响应:

  // (trackerID --> last sent HeartBeatResponse)
Map<String, HeartbeatResponse> trackerToHeartbeatResponseMap =
new TreeMap<String, HeartbeatResponse>();

从该表中取出对应于该TaskTracker的上一次心跳响应:

    HeartbeatResponse prevHeartbeatResponse =
trackerToHeartbeatResponseMap.get(trackerName);

如果TaskTracker不是第一次联系JobTracker(这个信息TaskTracker通过心跳传递过来),且JobTracker没有保存着对应于该TaskTracker的上一次响应,一种可能是JobTracker刚刚重启了,重启与否可以判断出来;如果也不是重启,那表明出现了严重问题。出现这种情况时,JobTracker也不进行任务分配了,直接给TaskTracker返回一个响应。如下面代码所示:

      // If this isn't the 'initial contact' from the tasktracker,
// there is something seriously wrong if the JobTracker has
// no record of the 'previous heartbeat'; if so, ask the
// tasktracker to re-initialize itself.
if (prevHeartbeatResponse == null) {
// This is the first heartbeat from the old tracker to the newly
// started JobTracker
if (hasRestarted()) {
addRestartInfo = true;
// inform the recovery manager about this tracker joining back
recoveryManager.unMarkTracker(trackerName);
} else {
// Jobtracker might have restarted but no recovery is needed
// otherwise this code should not be reached
LOG.warn("Serious problem, cannot find record of 'previous' " +
"heartbeat for '" + trackerName +
"'; reinitializing the tasktracker");
return new HeartbeatResponse(responseId,
new TaskTrackerAction[] {new ReinitTrackerAction()});
}

如果JobTracker确实也保存了对应于该TaskTracker的上一次响应,那么检查一下两者是否是重复,JobTracker理论上会为每次心跳响应赋一个递增的值:

        // It is completely safe to not process a 'duplicate' heartbeat from a
// {@link TaskTracker} since it resends the heartbeat when rpcs are
// lost see {@link TaskTracker.transmitHeartbeat()};
// acknowledge it by re-sending the previous response to let the
// {@link TaskTracker} go forward.
if (prevHeartbeatResponse.getResponseId() != responseId) {
LOG.info("Ignoring 'duplicate' heartbeat from '" +
trackerName + "'; resending the previous 'lost' response");
return prevHeartbeatResponse;
}

4、处理心跳,更新TaskTracker状态等:

当合法性检查通过后,JobTracker会更新自己记录的关于TaskTracker的状态:

    // Process this heartbeat
short newResponseId = (short)(responseId + 1);
status.setLastSeen(now);
if (!processHeartbeat(status, initialContact, now)) {
if (prevHeartbeatResponse != null) {
trackerToHeartbeatResponseMap.remove(trackerName);
}
return new HeartbeatResponse(newResponseId,
new TaskTrackerAction[] {new ReinitTrackerAction()});
}

首先递增响应ID,调用processHeartbeat更新TaskTracker的状态,如果不成功,则将其从trackerToHeartbeatResponseMap中移除;但仍然返回一个响应给TaskTracker。

在processHeartbeat方法里,主要是更新JobTracker自己的数据结构:

  /**
* Process incoming heartbeat messages from the task trackers.
*/
private synchronized boolean processHeartbeat(
TaskTrackerStatus trackerStatus,
boolean initialContact,
long timeStamp) throws UnknownHostException {
。。。。。。。。。。。。
updateTaskStatuses(trackerStatus);
updateNodeHealthStatus(trackerStatus, timeStamp); return true;
}

JobTracker里存在要执行的Map和Reduce Task,每个Task的实例不止一个,一方面是考虑到容错,当一个机器挂掉的时候,让其它机器来执行这个任务,同时,在多个机器上分配,让他们执行一样的任务,这样谁先执行完,这个任务也就执行完,避免某些机器出了点问题,速度太慢,影响了整体的进度,所以,一个任务Task实际上有多个实例,称为TaskAttemptID,TaskAttemptID的格式类似这样的格式:

attempt_200707121733_0003_m_000005_0

这代表任务2007年7月12日17时33分启动的JobTracker中第0003号作业的第000005号map task的第0号task attempt。

JobTracker会去查看TaskTracker传递过来的TaskAttemptID,如果该任务对应的Job都不存在了,表名执行完了等等,就会把它放入一个清空队列里面:

  void updateTaskStatuses(TaskTrackerStatus status) {
String trackerName = status.getTrackerName();
for (TaskStatus report : status.getTaskReports()) {
report.setTaskTracker(trackerName);
TaskAttemptID taskId = report.getTaskID(); JobInProgress job = getJob(taskId.getJobID());
if (job == null) {
// if job is not there in the cleanup list ... add it
synchronized (trackerToJobsToCleanup) {
Set<JobID> jobs = trackerToJobsToCleanup.get(trackerName);
if (jobs == null) {
jobs = new HashSet<JobID>();
trackerToJobsToCleanup.put(trackerName, jobs);
}
jobs.add(taskId.getJobID());
}
continue;
}
。。。。。。

如果Job都没有完成初始化,那么也要将其清理掉:

      if (!job.inited()) {
// if job is not yet initialized ... kill the attempt
synchronized (trackerToTasksToCleanup) {
Set<TaskAttemptID> tasks = trackerToTasksToCleanup.get(trackerName);
if (tasks == null) {
tasks = new HashSet<TaskAttemptID>();
trackerToTasksToCleanup.put(trackerName, tasks);
}
tasks.add(taskId);
}
continue;
}

另外,如果JobTracker是重启过的,那么还会把TaskTracker报告的任务信息加进来:

      TaskInProgress tip = taskidToTIPMap.get(taskId);
// Check if the tip is known to the jobtracker. In case of a restarted
// jt, some tasks might join in later
if (tip != null || hasRestarted()) {
if (tip == null) {
tip = job.getTaskInProgress(taskId.getTaskID());
job.addRunningTaskToTIP(tip, taskId, status, false);
} // Update the job and inform the listeners if necessary
JobStatus prevStatus = (JobStatus)job.getStatus().clone();
// Clone TaskStatus object here, because JobInProgress
// or TaskInProgress can modify this object and
// the changes should not get reflected in TaskTrackerStatus.
// An old TaskTrackerStatus is used later in countMapTasks, etc.
job.updateTaskStatus(tip, (TaskStatus)report.clone());
JobStatus newStatus = (JobStatus)job.getStatus().clone(); // Update the listeners if an incomplete job completes
if (prevStatus.getRunState() != newStatus.getRunState()) {
JobStatusChangeEvent event =
new JobStatusChangeEvent(job, EventType.RUN_STATE_CHANGED,
prevStatus, newStatus);
updateJobInProgressListeners(event);
}
} else {
LOG.info("Serious problem. While updating status, cannot find taskid "
+ report.getTaskID());
}

5、分配任务:

在上面的操作都做完了以后,JobTracker开始分配任务,首先创建一个心跳响应,以及关于这个心跳响应的任务分配队列List<TaskTrackerAction>:

    // Initialize the response to be sent for the heartbeat
HeartbeatResponse response = new HeartbeatResponse(newResponseId, null);
List<TaskTrackerAction> actions = new ArrayList<TaskTrackerAction>();
boolean isBlacklisted = faultyTrackers.isBlacklisted(status.getHost());
// Check for new tasks to be executed on the tasktracker
if (recoveryManager.shouldSchedule() && acceptNewTasks && !isBlacklisted) {
TaskTrackerStatus taskTrackerStatus = getTaskTrackerStatus(trackerName);
if (taskTrackerStatus == null) {
LOG.warn("Unknown task tracker polling; ignoring: " + trackerName);
} else {
List<Task> tasks = getSetupAndCleanupTasks(taskTrackerStatus);
if (tasks == null ) {
tasks = taskScheduler.assignTasks(taskTrackers.get(trackerName));
}
if (tasks != null) {
for (Task task : tasks) {
expireLaunchingTasks.addNewTask(task.getTaskID());
if(LOG.isDebugEnabled()) {
LOG.debug(trackerName + " -> LaunchTask: " + task.getTaskID());
}
actions.add(new LaunchTaskAction(task));
}
}
}
}

如果该TaskTracker支持任务调度,且不在黑名单里面,则可以进行分配,核心的代码是:

          tasks = taskScheduler.assignTasks(taskTrackers.get(trackerName));

这个方法用于任务分配。taskScheduler是一个TaskScheduler对象,而TaskScheduler是一个抽象类,有几种实现,默认的是JobQueueTaskScheduler。关于任务分配过程,也比较复杂,留作后续分析,这里先假定已经分配了任务,并被加入到队列List<TaskTrackerAction>中。

6、将其他任务加入到队列中:

除了分配的任务,还需要把以下一些任务加入队列actions:

需要被kill的Task:

    // Check for tasks to be killed
List<TaskTrackerAction> killTasksList = getTasksToKill(trackerName);
if (killTasksList != null) {
actions.addAll(killTasksList);
}

需要被kill或清理的任务:

    // Check for jobs to be killed/cleanedup
List<TaskTrackerAction> killJobsList = getJobsForCleanup(trackerName);
if (killJobsList != null) {
actions.addAll(killJobsList);
}

需要保存输出的任务:

    // Check for tasks whose outputs can be saved
List<TaskTrackerAction> commitTasksList = getTasksToSave(status);
if (commitTasksList != null) {
actions.addAll(commitTasksList);
}

7、获得心跳响应,返回:

将任务队列actions加入到心跳响应:

    // calculate next heartbeat interval and put in heartbeat response
int nextInterval = getNextHeartbeatInterval();
response.setHeartbeatInterval(nextInterval);
response.setActions(actions.toArray(new TaskTrackerAction[actions.size()])); // check if the restart info is req
if (addRestartInfo) {
response.setRecoveredJobs(recoveryManager.getJobsToRecover());
} // Update the trackerToHeartbeatResponseMap
trackerToHeartbeatResponseMap.put(trackerName, response);

至此,TaskTracker这个通过RPC访问的heartbeat服务方法执行结束了,JobTracker向TaskTracker返回了心跳响应,将要执行的任务信息返回给TaskTracker。

回到TaskTracker,上一部分分析到发出心跳,接下来就对响应进行解析了,继续offerService方法的后续分析:

首先要看一下,是否有任务需要清理:

        // Check if the map-event list needs purging
Set<JobID> jobs = heartbeatResponse.getRecoveredJobs();
if (jobs.size() > 0) {
synchronized (this) {
// purge the local map events list
for (JobID job : jobs) {
RunningJob rjob;
synchronized (runningJobs) {
rjob = runningJobs.get(job);
if (rjob != null) {
synchronized (rjob) {
FetchStatus f = rjob.getFetchStatus();
if (f != null) {
f.reset();
}
}
}
}
} // Mark the reducers in shuffle for rollback
synchronized (shouldReset) {
for (Map.Entry<TaskAttemptID, TaskInProgress> entry
: runningTasks.entrySet()) {
if (entry.getValue().getStatus().getPhase() == Phase.SHUFFLE) {
this.shouldReset.add(entry.getKey());
}
}
}
}
}

接下来获得所有任务信息:

        TaskTrackerAction[] actions = heartbeatResponse.getActions();

对于每个任务,如果需要启动,就加入到队列addToTaskQueue中:

        if (actions != null){
for(TaskTrackerAction action: actions) {
if (action instanceof LaunchTaskAction) {
addToTaskQueue((LaunchTaskAction)action);
} else if (action instanceof CommitTaskAction) {
CommitTaskAction commitAction = (CommitTaskAction)action;
if (!commitResponses.contains(commitAction.getTaskID())) {
LOG.info("Received commit task action for " +
commitAction.getTaskID());
commitResponses.add(commitAction.getTaskID());
}
} else {
addActionToCleanup(action);
}
}
}

如果需要提交的,就进行要提交的队列中,否则加入到要清理的队列中。

之后,杀死那些很久没有反馈进度的任务:

        markUnresponsiveTasks();

当磁盘空间不够时,杀死某些任务以腾出空间:

        killOverflowingTasks();

此时,TaskTracker利用心跳从JobTracker获得了任务,并加入了自己的各个队列,有的是待启动的队列,有的是要提交的队列,有的是要清理的队列,这些队列里面的任务,会有其他线程来取了以后去执行。这一部分暂时告一段落。

上面我们遗留了一个问题,就是JobTracker在调用JobQueueTaskScheduler的任务分配方法时,是如何按照什么策略分配的还没有涉及。这个在下一博文里面进行分析。

另外,关于TaskTracker怎么从队列里取出任务继续启动JAVA虚拟机执行等过程,我们也留作后续博文分析研究。