Keywords: Hadoop | start commands | stop commands | cluster management | SSH configuration
Abstract: This article explores various methods for starting and stopping the Hadoop ecosystem, detailing the differences between commands like start-all.sh, start-dfs.sh, and start-yarn.sh. Through use cases and best practices, it explains how to efficiently manage Hadoop services in different cluster configurations. The discussion includes the importance of SSH setup and provides a comprehensive guide from single-node to multi-node operations, helping readers master core skills in Hadoop cluster administration.
In the daily maintenance of the Hadoop ecosystem, selecting the appropriate commands for starting and stopping services is crucial for cluster stability. This article elaborates on three aspects: functional differences among commands, use cases, and best practices.
Command Functionality Analysis
Hadoop offers multiple levels of start and stop commands, primarily categorized into three types:
- start-all.sh & stop-all.sh: These commands are used to start or stop all Hadoop daemons across the cluster at once. Executing them on the master node controls services on all slave nodes simultaneously. It is important to note that these commands are marked as deprecated in official documentation, with finer-grained alternatives recommended.
- start-dfs.sh/stop-dfs.sh and start-yarn.sh/stop-yarn.sh: These command pairs manage the daemons for HDFS (the distributed file system) and YARN (the resource management layer) separately. Compared to start-all.sh, they provide more granular control, allowing administrators to start or stop storage-layer or compute-layer services independently. This is currently the recommended standard approach.
- hadoop-daemon.sh and yarn-daemon.sh: These commands manually start or stop a specific daemon on an individual node. For example, hadoop-daemon.sh start datanode starts the DataNode service on the current machine. This method is suited to node-level maintenance or fault recovery.
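The node-level commands above can be sketched as a short script. The install path and the dry-run guard are illustrative assumptions, not part of the Hadoop distribution (note that newer Hadoop releases ship these scripts under sbin rather than bin):

```shell
# Node-level daemon control, run on the target machine itself.
# HADOOP_HOME and the bin/ layout are assumptions; adjust for your install.
HADOOP_HOME=${HADOOP_HOME:-/usr/local/hadoop}
DRY_RUN=${DRY_RUN:-1}   # set DRY_RUN=0 to actually execute

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "would run: $*"   # preview the command instead of executing it
  else
    "$@"
  fi
}

# Start the DataNode on this machine only (storage layer).
run "$HADOOP_HOME/bin/hadoop-daemon.sh" start datanode
# Restart the NodeManager on this machine only (compute layer).
run "$HADOOP_HOME/bin/yarn-daemon.sh" stop nodemanager
run "$HADOOP_HOME/bin/yarn-daemon.sh" start nodemanager
```

With DRY_RUN=1 (the default here) the script only prints the commands, which makes the sequence safe to review before running it on a live node.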
Use Cases and Best Practices
Different commands apply to various operational scenarios:
- Cluster-wide Management: In scenarios requiring a simultaneous start or stop of all services, although start-all.sh is deprecated, the same effect can be achieved by combining start-dfs.sh and start-yarn.sh. For instance, during cluster initialization, execute start-dfs.sh first to bring up HDFS, followed by start-yarn.sh to bring up YARN, ensuring dependencies are handled in the correct order.
- Layered Service Maintenance: When only one layer (HDFS or YARN) needs updating or debugging, using the corresponding start/stop commands avoids unnecessary service interruptions. For example, after adjusting HDFS configurations, only stop-dfs.sh and start-dfs.sh need to be executed, while YARN services keep running.
- Node-level Operations: hadoop-daemon.sh and yarn-daemon.sh are particularly useful for cluster expansion or fault recovery. Suppose a new DataNode is added; an administrator can log into that node and execute hadoop-daemon.sh start datanode without restarting the entire cluster. Similarly, if the ResourceManager on a node malfunctions, yarn-daemon.sh stop resourcemanager followed by yarn-daemon.sh start resourcemanager restarts it.
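The cluster-wide sequence described above can be captured in a small wrapper script. The paths and the dry-run guard are illustrative assumptions rather than part of Hadoop itself:

```shell
# Full-cluster start/stop in the recommended order, replacing the
# deprecated start-all.sh/stop-all.sh. Paths are assumptions.
HADOOP_HOME=${HADOOP_HOME:-/usr/local/hadoop}
DRY_RUN=${DRY_RUN:-1}   # set DRY_RUN=0 to actually execute

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "would run: $*"
  else
    "$@"
  fi
}

cluster_start() {
  # Storage layer first, so the compute layer finds HDFS available.
  run "$HADOOP_HOME/bin/start-dfs.sh"
  run "$HADOOP_HOME/bin/start-yarn.sh"
}

cluster_stop() {
  # Stop in reverse order: compute layer first, then storage.
  run "$HADOOP_HOME/bin/stop-yarn.sh"
  run "$HADOOP_HOME/bin/stop-dfs.sh"
}

cluster_start
```

Stopping in the reverse of the start order keeps HDFS available for as long as YARN is still shutting down.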
Configuration Requirements and Considerations
Several key points should be noted when using these commands:
- SSH Configuration: For commands that execute across multiple nodes (e.g., start-dfs.sh), passwordless SSH login from the master node to all slave nodes must be configured in advance. Otherwise, the commands will fail to start services on the slave nodes remotely. This can be set up using tools such as ssh-keygen and ssh-copy-id.
- Environment Variables: Ensure the directory containing the Hadoop scripts is on the system's PATH, or specify the full path when invoking commands, such as /usr/local/hadoop/bin/start-dfs.sh.
- Permission Management: Executing these commands typically requires appropriate system permissions. It is advisable to operate as a dedicated Hadoop user (e.g., a hadoop user) to avoid permission issues.
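As a concrete example of the SSH point above, the following sketch generates a key pair on the master node and shows where ssh-copy-id fits in; the slave host name is a placeholder assumption:

```shell
# One-time passwordless-SSH setup, run on the master node.
mkdir -p "$HOME/.ssh"
KEY="$HOME/.ssh/id_rsa"

# Generate an RSA key pair without a passphrase if none exists yet.
[ -f "$KEY" ] || ssh-keygen -t rsa -N "" -f "$KEY" -q

# Push the public key to each slave node (hostname is a placeholder);
# repeat for every node listed in the workers/slaves file:
#   ssh-copy-id hadoop@slave1
echo "public key ready: $KEY.pub"
```

After the key has been copied to every slave, ssh hadoop@slave1 should log in without a password prompt, which is exactly what start-dfs.sh relies on when it launches daemons remotely.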
Code Examples and Practice
Below is a complete example demonstrating how to start a DataNode service on a new node:
# Log into the newly added DataNode
ssh hadoop@new-datanode
# Change to the Hadoop installation directory
cd /usr/local/hadoop
# Start the DataNode daemon
bin/hadoop-daemon.sh start datanode
# Verify that the DataNode process is running
jps | grep DataNode
If the output shows a DataNode process, it indicates successful startup. Similarly, to stop the service, use bin/hadoop-daemon.sh stop datanode.
In summary, understanding the differences and applicable scenarios of Hadoop start and stop commands enhances efficiency and reliability in cluster management. In practice, prioritize using start-dfs.sh and start-yarn.sh for layered management, and flexibly apply hadoop-daemon.sh and yarn-daemon.sh for node-level maintenance. Additionally, ensure correct SSH and permission configurations to prevent common issues.