- Resource management: The Node Manager reports the resources available on its node to the Resource Manager. It sends regular heartbeats to the Resource Manager so that the cluster resource status stays up to date. The Resource Manager uses this information to schedule resources across the cluster and assigns work to Node Managers accordingly.
- Container startup and monitoring: The Node Manager starts and monitors the containers that run computing tasks. When the Resource Manager schedules a job onto a node, the Node Manager creates a dedicated container for it, starts the container, and monitors its running status and health. If the container fails or exceeds its configured resource limit, the Node Manager terminates it and reports the event to the Resource Manager.
- Local resource management: The Node Manager manages and maintains the node's local resources, such as local disks and local caches. It helps applications access and use these resources, which improves job performance and efficiency.
- Security and authentication: In secure mode, the Node Manager authenticates containers and ensures that only authorized users or applications can run tasks on the node. It integrates with security mechanisms such as Kerberos to keep jobs secure and trustworthy.
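
The first two responsibilities above, heartbeat-based resource reporting and container monitoring, can be illustrated with a small toy model. This is only a sketch under simplifying assumptions, not the Hadoop API: the names `ToyNodeManager`, `Container`, `launch_container`, `monitor`, and `heartbeat` are hypothetical, memory is the only resource tracked, and the heartbeat is returned as a plain dictionary rather than sent over the network.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Container:
    """Hypothetical container record; tracks memory only."""
    container_id: str
    memory_limit_mb: int
    memory_used_mb: int = 0
    state: str = "RUNNING"  # "RUNNING" or "KILLED"


@dataclass
class ToyNodeManager:
    """Toy stand-in for a node manager (not the real YARN class)."""
    node_id: str
    total_memory_mb: int
    containers: Dict[str, Container] = field(default_factory=dict)

    def available_memory_mb(self) -> int:
        # Memory not reserved by running containers.
        reserved = sum(
            c.memory_limit_mb
            for c in self.containers.values()
            if c.state == "RUNNING"
        )
        return self.total_memory_mb - reserved

    def launch_container(self, container_id: str, memory_limit_mb: int) -> Container:
        # Refuse to start a container the node cannot accommodate.
        if memory_limit_mb > self.available_memory_mb():
            raise ValueError("not enough free memory on node")
        container = Container(container_id, memory_limit_mb)
        self.containers[container_id] = container
        return container

    def monitor(self) -> List[str]:
        # Kill running containers that exceed their memory limit.
        killed = []
        for c in self.containers.values():
            if c.state == "RUNNING" and c.memory_used_mb > c.memory_limit_mb:
                c.state = "KILLED"
                killed.append(c.container_id)
        return killed

    def heartbeat(self) -> dict:
        # Run a monitoring pass, then build the status report that
        # would be sent to the resource manager.
        killed = self.monitor()
        running = sorted(
            cid for cid, c in self.containers.items() if c.state == "RUNNING"
        )
        return {
            "node_id": self.node_id,
            "available_memory_mb": self.available_memory_mb(),
            "running": running,
            "killed": killed,
        }


if __name__ == "__main__":
    nm = ToyNodeManager("node-1", total_memory_mb=4096)
    nm.launch_container("c1", 1024)
    nm.launch_container("c2", 2048)
    nm.containers["c2"].memory_used_mb = 3000  # simulate a limit violation
    print(nm.heartbeat())
```

In this sketch the heartbeat carries both the free capacity and the list of containers killed since the last report, which mirrors how a scheduler could learn about freed resources and failed work in a single message. A real node manager tracks more resource types and sends heartbeats on a timer.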