Hadoop Installation#
Template Machine Configuration#
First, install vbox and download the CentOS image.
Install the minimal system and cancel the EFI partition, boot EFI, and root (/) partition.
Configure the network.
This step differs slightly from the video, probably because a different image is used, so the generated network interface names are different.
Network Configuration#
First, modify the network configuration of the CentOS virtual machine as follows:
vi /etc/sysconfig/network-scripts/ifcfg-enp0s3
Configure the resolution type as a bridged network card.
GATEWAY is the address of the network's gateway.
IPADDR is the IP address of the virtual machine.
DNS1 is the DNS server; it is optional and can be set to the gateway address.
Since the gateway of the workshop is 192.168.8.1, your physical machine's IP address is also in the 192.168.8.x range.
The virtual machine must also use a 192.168.8.x address, that is, an address on the same subnet as the physical machine but not one that is already in use.
If you switch from this network to the dormitory network, the virtual machine may not be able to access the Internet. It is recommended to configure multiple network cards according to the dormitory network.
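As a rough sketch of the keys described above, the following prints a static-IP configuration for one interface. The address 192.168.8.101, the 255.255.255.0 netmask, and the choice of using the gateway as DNS1 are assumptions for illustration; the real file often contains further lines (such as a UUID) that should be kept. The function name render_ifcfg is hypothetical.

```shell
#!/bin/sh
# Print a static-IP ifcfg file for one interface.
# Arguments: interface name, VM address, gateway address.
render_ifcfg() {
    cat <<EOF
TYPE=Ethernet
BOOTPROTO=static
NAME=$1
DEVICE=$1
ONBOOT=yes
IPADDR=$2
NETMASK=255.255.255.0
GATEWAY=$3
DNS1=$3
EOF
}

# Preview only: nothing is written under /etc here.
render_ifcfg enp0s3 192.168.8.101 192.168.8.1
```

To apply it, back up the existing file first and then redirect the output to /etc/sysconfig/network-scripts/ifcfg-enp0s3 as root.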
After the configuration is complete, restart the network service.
systemctl restart network
Use ip addr to view the current configuration.
If the restart hangs and finally fails with an error, your network configuration is most likely wrong.
The vbox network configuration is as follows: in the virtual machine's network settings, adapter 1 is a bridged adapter, and its name is the network card your computer uses to connect to the Internet.
My network card is ax201.
Once this is set up correctly, the physical machine can ping the virtual machine.
Connect SSH#
Please make sure you have configured the network of the virtual machine.
After the configuration is complete, enter the virtual machine's IP address, the root user name, and the root password in your SSH terminal to connect directly.
At this point, the network configuration is complete, and you can follow the tutorial. The subsequent cluster network configuration should be similar to this.
Install screenfetch (optional)#
# Get the file
wget -O screenfetch https://git.io/vaHfR
# Add execute permission
chmod +x screenfetch
screenfetch prints a summary of the system configuration.
System Update (optional)#
sudo yum check-update
sudo yum update
Install necessary software packages#
yum install -y epel-release
yum install -y psmisc nc net-tools rsync vim lrzsz ntp libzstd openssl-static tree iotop git
Disable Firewall#
systemctl stop firewalld
systemctl disable firewalld
Modify the hosts file#
vim /etc/hosts
192.168.8.101 neko01
192.168.8.102 neko02
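If you would rather script this than edit the file by hand, a minimal sketch might look like the following. The helper add_host_entry is a hypothetical name, and the demo runs against a scratch file rather than the real /etc/hosts.

```shell
#!/bin/sh
# Append "IP NAME" to a hosts-style file unless NAME is already mapped.
# Arguments: file path, IP address, host name.
add_host_entry() {
    if grep -qE "[[:space:]]$3([[:space:]]|\$)" "$1" 2>/dev/null; then
        return 0
    fi
    printf '%s %s\n' "$2" "$3" >> "$1"
}

# Demonstrate on a scratch file instead of the real /etc/hosts.
hosts_file=$(mktemp)
add_host_entry "$hosts_file" 192.168.8.101 neko01
add_host_entry "$hosts_file" 192.168.8.102 neko02
add_host_entry "$hosts_file" 192.168.8.101 neko01   # second call is a no-op
cat "$hosts_file"
```

Pointing the first argument at /etc/hosts (as root) applies the same entries to the real file, and running it again does not add duplicates.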
Create a regular user and set privileges#
- Add a user
useradd maomao
passwd maomao
- Add root privileges
vim /etc/sudoers
Press Shift + G to jump to the end of the file.
Add an entry for the new user below the root entry, then save and exit with :wq! (the file is read-only, so the ! is needed).
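The exact entry is not shown here. A common form, assumed for this sketch, mirrors the root entry and gives the user full sudo rights with a password prompt. The helper name sudoers_rule is hypothetical.

```shell
#!/bin/sh
# Build the sudoers rule for one user.
# Assumed form: full sudo rights, password still required.
sudoers_rule() {
    printf '%s ALL=(ALL) ALL\n' "$1"
}

# Write the rule to a scratch file so it can be reviewed before installing.
rule_file=$(mktemp)
sudoers_rule maomao > "$rule_file"
cat "$rule_file"
```

Before copying the line into /etc/sudoers, it is worth checking it with visudo -c -f followed by the scratch file's path, since a syntax error in sudoers can stop sudo from working.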
Create directories in /opt#
cd /opt
mkdir module
mkdir software
Enter ls to view the result.
Install JDK and configure environment variables#
It is recommended to install finalshell, which comes with built-in file transfer for uploading the archives.
1. Import the archives#
jdk-8u212-linux-x64.tar.gz
hadoop-3.1.3.tar.gz
2. Install JDK#
Extract the tar file. Use -zcvf for packaging and -zxvf for extraction.
tar -zxvf jdk-8u212-linux-x64.tar.gz -C ../module
If you forget to add -C, use mv to move it.
mv jdk1.8.0_212/ ../module/
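To see how the two flag sets and -C behave without touching the real archives, here is a small round trip in a scratch directory. The demo-1.0 directory and its contents are made up for illustration.

```shell
#!/bin/sh
# Round trip in a scratch directory: pack with -zcvf, unpack with -zxvf and -C.
work=$(mktemp -d)
mkdir -p "$work/software/demo-1.0/bin" "$work/module"
echo 'hello' > "$work/software/demo-1.0/bin/tool"

cd "$work/software" || exit 1

# c = create, z = gzip, v = verbose, f = archive file name follows
tar -zcvf demo-1.0.tar.gz demo-1.0
rm -r demo-1.0

# x = extract; -C names the directory to extract into
tar -zxvf demo-1.0.tar.gz -C ../module

ls ../module/demo-1.0/bin
```

The layout mirrors /opt/software and /opt/module, so the relative path ../module works the same way as in the JDK command above.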
3. Add environment variables#
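Modifying /etc/profile directly is not recommended; open it only to read it.
vim /etc/profile
How it works: /etc/profile sources every readable *.sh file in /etc/profile.d (and sh.local), so a script placed in that directory is loaded automatically.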
for i in /etc/profile.d/*.sh /etc/profile.d/sh.local ; do
    if [ -r "$i" ]; then
        if [ "${-#*i}" != "$-" ]; then
            . "$i"
        else
            . "$i" >/dev/null
        fi
    fi
done
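The loop above can be tried safely against a scratch directory. In this sketch the file names demo.sh and notes.txt and the variable DEMO_HOME are made up; the point is that only readable *.sh files get sourced.

```shell
#!/bin/sh
# Mimic the /etc/profile loop against a scratch directory.
profile_d=$(mktemp -d)
printf 'DEMO_HOME=/opt/module/demo\nexport DEMO_HOME\n' > "$profile_d/demo.sh"
printf 'IGNORED=1\n' > "$profile_d/notes.txt"   # not *.sh, so never sourced

for i in "$profile_d"/*.sh; do
    if [ -r "$i" ]; then
        # "$-" holds the shell's option flags and contains "i" only in an
        # interactive shell. Stripping everything up to the first "i"
        # changes the string exactly when that flag is present.
        if [ "${-#*i}" != "$-" ]; then
            . "$i"
        else
            . "$i" >/dev/null
        fi
    fi
done

echo "DEMO_HOME=$DEMO_HOME"
echo "IGNORED=${IGNORED:-unset}"
```

This is why a java.sh dropped into /etc/profile.d takes effect at login without any change to /etc/profile itself.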
Create environment variables
cd /etc/profile.d
Create a file
sudo touch java.sh
sudo vim java.sh
# Configure JDK environment
# Declare JAVA_HOME variable
JAVA_HOME=/opt/module/jdk1.8.0_212
# Declare PATH variable
PATH=$PATH:$JAVA_HOME/bin
# Promote PATH and JAVA_HOME to system global variables
export JAVA_HOME PATH
Reload the profile file
source /etc/profile
Test
[maomao@nekopara profile.d]$ java -version
java version "1.8.0_212"
Java(TM) SE Runtime Environment (build 1.8.0_212-b10)
Java HotSpot(TM) 64-Bit Server VM (build 25.212-b10, mixed mode)
Configure Hadoop#
First, extract the Hadoop archive into /opt/module.
If you are a regular user, you need to elevate privileges with sudo.
Add Hadoop's bin and sbin directories to the environment variables#
# Configure JDK environment
# Declare JAVA_HOME variable
JAVA_HOME=/opt/module/jdk1.8.0_212
# Configure Hadoop environment
# Declare HADOOP_HOME variable
HADOOP_HOME=/opt/module/hadoop-3.1.3
# Declare PATH variable
# Append the JDK and Hadoop directories to PATH
PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
# Promote PATH, JAVA_HOME, and HADOOP_HOME to system global variables
export JAVA_HOME PATH HADOOP_HOME
Reload the profile file
source /etc/profile
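Each time the profile is sourced again in the same shell, a plain PATH=$PATH:... line appends the same directories once more. That is harmless, but if you prefer a clean PATH, one possible variant is sketched below. The helper path_append is a hypothetical name, not part of the original setup.

```shell
#!/bin/sh
# Append a directory to PATH only if it is not already listed.
path_append() {
    case ":$PATH:" in
        *":$1:"*) ;;                 # already present, do nothing
        *) PATH="$PATH:$1" ;;
    esac
}

JAVA_HOME=/opt/module/jdk1.8.0_212
HADOOP_HOME=/opt/module/hadoop-3.1.3

path_append "$JAVA_HOME/bin"
path_append "$HADOOP_HOME/bin"
path_append "$HADOOP_HOME/sbin"
path_append "$HADOOP_HOME/sbin"   # repeated call leaves PATH unchanged

export JAVA_HOME HADOOP_HOME PATH
echo "$PATH"
```

Wrapping each candidate in colons lets the case pattern match whole entries only, so /opt/module/hadoop-3.1.3/bin is not mistaken for a prefix of another directory.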
Verify
[root@nekopara profile.d]# hadoop version
Hadoop 3.1.3
Source code repository https://gitbox.apache.org/repos/asf/hadoop.git -r ba631c436b806728f8ec2f54ab1e289526c90579
Compiled by ztang on 2019-09-12T02:47Z
Compiled with protoc 2.5.0
From source with checksum ec785077c385118ac91aadde5ec9799
This command was run using /opt/module/hadoop-3.1.3/share/hadoop/common/hadoop-common-3.1.3.jar
If the output is similar, the environment variables are configured correctly.
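If either command is reported as not found instead, a quick check like the following can show which one is missing from PATH. The helper missing_commands is a hypothetical name used only for this sketch.

```shell
#!/bin/sh
# Print the name of every given command that cannot be found on PATH.
missing_commands() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
}

missing=$(missing_commands java hadoop)
if [ -z "$missing" ]; then
    echo "java and hadoop are both on PATH"
else
    # Unquoted on purpose so several names are joined onto one line.
    echo "not found on PATH:" $missing
fi
```

A missing name usually means the script in /etc/profile.d was not reloaded in the current shell or that one of the directory paths in it is misspelled.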