1. Goal

Install HDP 2.4 via Ambari on CentOS 6.x and deploy it across multiple hosts.

2. Reference

http://docs.hortonworks.com/HDPDocuments/Ambari-2.2.2.0/bk_Installing_HDP_AMB/content/ch_Getting_Ready.html

3. Getting Ready

3.1 Package list

-OS: CentOS-6.4-x86_64-bin-DVD1.iso, CentOS-6.4-x86_64-bin-DVD2.iso
-JDK: jdk-8u91-linux-x64.tar.gz
-cmake: cmake-3.5.2.tar.gz
-Mysql: mysql-connector-java-5.1.39-bin.jar, mysql-5.6.10.tar.gz
-httpd: pcre-8.38.tar.gz, apr-1.5.2.tar.gz, apr-util-1.5.4.tar.gz, httpd-2.4.23.tar.gz
-ambari: ambari-2.2.2.0-centos6.tar.gz
-HDP: HDP-2.4.2.0-centos6-rpm.tar.gz, HDP-UTILS-1.1.0.20-centos6.tar.gz

3.2 JDK Installation (install on each host)

3.2.1, Untar the JDK into /usr/java so that it matches the JAVA_HOME set below

mkdir -p /usr/java
tar zxvf jdk-8u91-linux-x64.tar.gz -C /usr/java

3.2.2, Open /etc/profile to add the environment variables:

vi /etc/profile

3.2.3, Add the following content to the open file:

export JAVA_HOME=/usr/java/jdk1.8.0_91  
export JAVA_BIN=$JAVA_HOME/bin  
export JAVA_LIB=$JAVA_HOME/lib  
export CLASSPATH=.:$JAVA_LIB/tools.jar:$JAVA_LIB/dt.jar  
export PATH=$JAVA_BIN:$PATH     

3.2.4, Source the profile so the changes take effect:

source /etc/profile  
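
Steps 3.2.2 to 3.2.4 can also be scripted so that they are safe to repeat on every host. The sketch below is only one possible approach; the function name add_java_env is made up for this example, and it assumes the JDK was unpacked to /usr/java/jdk1.8.0_91.

```shell
# Hypothetical helper: append the JDK variables to a profile file,
# but only if JAVA_HOME is not already exported there.
# Usage: add_java_env <profile_file> <java_home>
add_java_env() {
    profile="$1"
    java_home="$2"
    # Skip when an earlier run already added the block
    if grep -q '^export JAVA_HOME=' "$profile" 2>/dev/null; then
        return 0
    fi
    cat >> "$profile" <<EOF
export JAVA_HOME=$java_home
export JAVA_BIN=\$JAVA_HOME/bin
export JAVA_LIB=\$JAVA_HOME/lib
export CLASSPATH=.:\$JAVA_LIB/tools.jar:\$JAVA_LIB/dt.jar
export PATH=\$JAVA_BIN:\$PATH
EOF
}
```

For example, run add_java_env /etc/profile /usr/java/jdk1.8.0_91, then source /etc/profile and check the result with java -version.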

3.3 Mysql Installation (install on one host)

3.3.1, Install compile tools

yum install gcc gcc-c++ ncurses-devel perl autoconf automake zlib* flex* libxml* libmcrypt* libtool-ltdl-devel*

3.3.2, cmake installation

tar zxvf cmake-3.5.2.tar.gz
cd cmake-3.5.2
./bootstrap && make && make install
cd ..

3.3.3, Create mysql group and mysql user

groupadd mysql  
useradd -r -g mysql mysql

3.3.4, mysql installation

tar zxvf mysql-5.6.10.tar.gz  
cd mysql-5.6.10  
cmake -DCMAKE_INSTALL_PREFIX=/usr/local/mysql -DMYSQL_UNIX_ADDR=/usr/local/mysql/mysql.sock -DDEFAULT_CHARSET=utf8 -DDEFAULT_COLLATION=utf8_general_ci -DWITH_INNOBASE_STORAGE_ENGINE=1 -DWITH_ARCHIVE_STORAGE_ENGINE=1 -DWITH_BLACKHOLE_STORAGE_ENGINE=1 -DMYSQL_DATADIR=/data/mysqldb -DMYSQL_TCP_PORT=3306 -DENABLE_DOWNLOADS=1

If you need to rerun cmake with different options, delete CMakeCache.txt first (rm CMakeCache.txt). Then build and install:

make  
make install

3.3.5, Change owner:

cd /usr/local/mysql
chown -R mysql:mysql .

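3.3.6, Initialize the mysql database (the data directory must match the -DMYSQL_DATADIR value used in 3.3.4)

cd /usr/local/mysql
scripts/mysql_install_db --user=mysql --basedir=/usr/local/mysql --datadir=/data/mysqldb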

3.3.7, Copy the mysql configuration file and init script, then add mysql to the PATH in /etc/profile

cp /usr/local/mysql/support-files/my-default.cnf /etc/my.cnf  
cp support-files/mysql.server /etc/init.d/mysqld  

vim /etc/profile

Add the following lines:

PATH=/usr/local/mysql/bin:/usr/local/mysql/lib:$PATH  
export PATH

Save the file and source it:

source /etc/profile  

3.3.8, Start the mysql service

service mysqld start

If you come across the error "Starting MySQL.. ERROR! The server quit without updating PID file (/data/mysqldb/hdp-35.pid)", edit /etc/my.cnf so that datadir points to the right data directory (/data/mysqldb in this setup).

3.3.9, Enable auto-start on reboot

chkconfig --level 35 mysqld on

3.3.10, Try to log in to make sure the installation succeeded (the root password is still empty at this point, so just press Enter at the prompt)

service mysqld status
mysql -u root -p

3.3.11, change root password

mysqladmin -u root password 'hadoop'

3.3.12, Log in to the mysql server and run the commands below to create the mysql users for ambari, hive, oozie and ranger, and to grant them remote login privileges.

GRANT ALL PRIVILEGES ON *.* TO 'root'@'%' IDENTIFIED BY 'hadoop';  
GRANT ALL PRIVILEGES ON *.* TO 'root'@'hdp-35.slave.com' IDENTIFIED BY 'hadoop';  
GRANT ALL PRIVILEGES ON *.* TO 'root'@'localhost' IDENTIFIED BY 'hadoop';  


CREATE USER 'ambari'@'%' IDENTIFIED BY 'hadoop';    
GRANT ALL PRIVILEGES ON *.* TO 'ambari'@'%' IDENTIFIED BY 'hadoop';   
CREATE USER 'ambari'@'localhost' IDENTIFIED BY 'hadoop';  
GRANT ALL PRIVILEGES ON *.* TO 'ambari'@'localhost' IDENTIFIED BY 'hadoop';  
CREATE USER 'ambari'@'hdp-35.slave.com' IDENTIFIED BY 'hadoop';  
GRANT ALL PRIVILEGES ON *.* TO 'ambari'@'hdp-35.slave.com' IDENTIFIED BY 'hadoop';   

CREATE USER 'hive'@'%' IDENTIFIED BY 'hadoop';   
GRANT ALL PRIVILEGES ON *.* TO 'hive'@'%' IDENTIFIED BY 'hadoop';  
CREATE USER 'hive'@'localhost' IDENTIFIED BY 'hadoop';  
GRANT ALL PRIVILEGES ON *.* TO 'hive'@'localhost' IDENTIFIED BY 'hadoop';  
CREATE USER 'hive'@'hdp-35.slave.com' IDENTIFIED BY 'hadoop';  
GRANT ALL PRIVILEGES ON *.* TO 'hive'@'hdp-35.slave.com' IDENTIFIED BY 'hadoop';  

CREATE USER 'oozie'@'%' IDENTIFIED BY 'hadoop';  
GRANT ALL PRIVILEGES ON *.* TO 'oozie'@'%' IDENTIFIED BY 'hadoop';  
CREATE USER 'oozie'@'localhost' IDENTIFIED BY 'hadoop';  
GRANT ALL PRIVILEGES ON *.* TO 'oozie'@'localhost' IDENTIFIED BY 'hadoop';  
CREATE USER 'oozie'@'hdp-35.slave.com' IDENTIFIED BY 'hadoop';  
GRANT ALL PRIVILEGES ON *.* TO 'oozie'@'hdp-35.slave.com' IDENTIFIED BY 'hadoop';  

CREATE USER 'ranger'@'%' IDENTIFIED BY 'hadoop';  
GRANT ALL PRIVILEGES ON *.* TO 'ranger'@'%' IDENTIFIED BY 'hadoop';   
CREATE USER 'ranger'@'localhost' IDENTIFIED BY 'hadoop';  
GRANT ALL PRIVILEGES ON *.* TO 'ranger'@'localhost' IDENTIFIED BY 'hadoop';  
CREATE USER 'ranger'@'hdp-35.slave.com' IDENTIFIED BY 'hadoop';  
GRANT ALL PRIVILEGES ON *.* TO 'ranger'@'hdp-35.slave.com' IDENTIFIED BY 'hadoop';  

Create database ambari;  
Create database hive;  
Create database oozie;  
Create database ranger;  

FLUSH PRIVILEGES;  
Commit;  
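
The per-service statements above follow one pattern (four users, three host entries each, one database per user), so they can be generated instead of typed. The sketch below is an illustration only; gen_grants is a made-up helper name, it leaves out the root grants, and it takes the password and the database host name as arguments rather than hard-coding them.

```shell
# Hypothetical helper: print CREATE USER / GRANT / CREATE DATABASE
# statements for the ambari, hive, oozie and ranger accounts.
# Usage: gen_grants <password> <db_host_fqdn>
gen_grants() {
    password="$1"
    db_host="$2"
    for user in ambari hive oozie ranger; do
        # One account per host entry: any host, localhost, and the DB host itself
        for host in '%' localhost "$db_host"; do
            printf "CREATE USER '%s'@'%s' IDENTIFIED BY '%s';\n" "$user" "$host" "$password"
            printf "GRANT ALL PRIVILEGES ON *.* TO '%s'@'%s' IDENTIFIED BY '%s';\n" "$user" "$host" "$password"
        done
        printf 'CREATE DATABASE %s;\n' "$user"
    done
    printf 'FLUSH PRIVILEGES;\n'
}
```

To apply the statements, pipe the output into the client, for example gen_grants hadoop hdp-35.slave.com | mysql -u root -p. Reviewing the generated SQL before piping it is a good idea.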

3.3.13, Run the command below to check that the statements executed successfully:

SELECT Host, User, Password FROM mysql.user;

3.4 httpd installation (install on one host)

http://blog.csdn.net/symgdwyh/article/details/8235262

3.4.1, untar

tar zxf httpd-2.4.23.tar.gz  
cd httpd-2.4.23  

3.4.2, Install the dependencies apr, apr-util and pcre. I put their tarballs in the httpd-2.4.23/srclib directory.

Note: if the compile tools are not installed yet, install them first.

yum install gcc gcc-c++ ncurses-devel perl  

Install pcre

cd srclib
tar -zxvf pcre-8.38.tar.gz
cd pcre-8.38
./configure --prefix=/usr/local/pcre   
make    
make install  

Install apr

cd ..
tar -zxvf apr-1.5.2.tar.gz
cd apr-1.5.2
./configure --prefix=/usr/local/apr --with-pcre=/usr/local/pcre
make  
make install  

Install apr-util

cd ..
tar -zxvf apr-util-1.5.4.tar.gz
cd apr-util-1.5.4
./configure --prefix=/usr/local/apr-util --with-apr=/usr/local/apr
make  
make install  

Install httpd

cd ../../
./configure --prefix=/usr/local/httpd --with-pcre=/usr/local/pcre --with-apr=/usr/local/apr --with-apr-util=/usr/local/apr-util
make  
make install  

3.4.3, Configuration

Copy apachectl so that httpd can run as a system service:

cp /usr/local/httpd/bin/apachectl /etc/init.d/httpd

Then open the script:

vi /etc/rc.d/init.d/httpd

Add the two lines below right after the first line (#!/bin/sh):

# chkconfig: 2345 50 90  
# description: Activates/Deactivates Apache Web Server  

Note: the leading "#" is required.

Run the chkconfig commands to register Apache as a system service:

chkconfig --add httpd  
chkconfig httpd on  
service httpd start  

Edit the configuration file:

vi /usr/local/httpd/conf/httpd.conf

The default port is 80; change the Listen directive if you need a different port. Also make these two changes:

Remove the # before ServerName www.example.com:80
Change the DocumentRoot to "/var/www/html"

3.4.4, start the service

cd /usr/local/httpd/bin  
./apachectl start  

If you come across an error about the server name at startup, edit the configuration file:

vi /usr/local/httpd/conf/httpd.conf

and remove the # before ServerName www.example.com:80

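3.4.5, Visit http://hostname(or ip):port/

If you come across the forbidden error:
You don't have permission to access / on this server.

vi /usr/local/httpd/conf/httpd.conf

Edit the content of the <Directory> block and add the lines below ("Allow from all" is the httpd 2.2 syntax; the native httpd 2.4 equivalent is "Require all granted"):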

Allow from all  
Options Indexes  

3.5 Check the Maximum Open File Descriptors(on each host)
The recommended maximum number of open file descriptors is 10000, or more. To check the current value set for the maximum number of open file descriptors, execute the following shell commands on each host:

ulimit -Sn  
ulimit -Hn

If the output is not greater than 10000, run the following command on each host to set it to a suitable default:

ulimit -n 10000  

ulimit -n only affects the current shell. To make the setting permanent, edit:

vi /etc/security/limits.conf

and add the two lines below:

* soft nofile 10000  
* hard nofile 10000   
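
The check in this section can be sketched as a small script. This is an illustration only; nofile_ok is a made-up helper name, and the threshold of 10000 is the recommended value quoted above.

```shell
# Hypothetical helper: succeed when a ulimit value meets the minimum
# (default 10000). The value "unlimited" always passes.
nofile_ok() {
    value="$1"
    minimum="${2:-10000}"
    [ "$value" = "unlimited" ] && return 0
    [ "$value" -ge "$minimum" ]
}

# Report the soft (S) and hard (H) open-file limits of the current shell
for kind in S H; do
    current=$(ulimit -${kind}n)
    if nofile_ok "$current"; then
        echo "ulimit -${kind}n = $current (ok)"
    else
        echo "ulimit -${kind}n = $current (below 10000)"
    fi
done
```

Run it once before and once after editing /etc/security/limits.conf (in a new login session) to confirm that the new limits are picked up.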

3.5 Enable NTP on the Cluster and on the Browser Host (on each host)

chkconfig --list ntpd

If you come across the error "error reading information on service ntpd: No such file or directory", install ntp:

yum install ntp

Then run:

chkconfig ntpd on  
service ntpd start  

3.6 Configuring iptables(on each host)

chkconfig iptables off  
/etc/init.d/iptables stop  

3.7 Disable SELinux and PackageKit and Check the umask Value (on each host)

3.7.1, On each host in your cluster, permanently disable SELinux by setting SELINUX=disabled in /etc/selinux/config. This ensures that SELinux does not turn itself on after you reboot the machine.

3.7.2, On an installation host running RHEL/CentOS with PackageKit installed, open /etc/yum/pluginconf.d/refresh-packagekit.conf using a text editor. Make the following change: enabled=0

3.7.3, Permanently changing the umask on all hosts

echo umask 0022 >> /etc/profile  
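
The two file edits in 3.7.1 and 3.7.2 both set a KEY=VALUE line, so they can be scripted with one helper. The sketch below is an assumption-laden example, not part of the official procedure; set_kv is a made-up name, and it only handles plain KEY=VALUE lines that start at the beginning of a line.

```shell
# Hypothetical helper: set KEY=VALUE in a config file.
# Replaces an existing KEY=... line, or appends one if the key is missing.
# Usage: set_kv <file> <key> <value>
set_kv() {
    file="$1"
    key="$2"
    value="$3"
    if grep -q "^${key}=" "$file" 2>/dev/null; then
        # Write through a temporary file, then copy back so the
        # original file keeps its owner and permissions
        sed "s|^${key}=.*|${key}=${value}|" "$file" > "$file.tmp" &&
            cat "$file.tmp" > "$file" &&
            rm -f "$file.tmp"
    else
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}
```

For example: set_kv /etc/selinux/config SELINUX disabled, and set_kv /etc/yum/pluginconf.d/refresh-packagekit.conf enabled 0. The SELinux change takes effect after the next reboot.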

3.8 Check Transparent Huge Pages (on each host)

cat /sys/kernel/mm/redhat_transparent_hugepage/enabled  

[always] means THP is enabled; [never] means THP is disabled. If THP is enabled, execute the following shell commands on each host:

vi /etc/rc.local

add:

if test -f /sys/kernel/mm/redhat_transparent_hugepage/enabled; then
    echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled
fi
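
The active THP mode is the word in square brackets, which is easy to misread by eye. A minimal sketch for extracting it is shown below; thp_mode is a made-up helper name, and the path is the RHEL/CentOS 6 location used above.

```shell
# Hypothetical helper: print the active mode (the bracketed word)
# from the content of the THP "enabled" file.
thp_mode() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([a-z]*\)\].*/\1/p'
}

# Report the mode on this host if the file exists
thp_file=/sys/kernel/mm/redhat_transparent_hugepage/enabled
if [ -r "$thp_file" ]; then
    echo "THP mode: $(thp_mode "$(cat "$thp_file")")"
fi
```

After the rc.local change and a reboot, the reported mode should be never.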

3.9 Check DNS and NSCD (on each host)

3.9.1, open the hosts file on every host in your cluster

vi /etc/hosts

3.9.2, Add a line for each host in your cluster
Example:
192.168.197.131 hdp01.local
192.168.197.132 hdp02.local
192.168.197.133 hdp03.local
192.168.197.134 hdp04.local
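
Because every host needs an entry for every other host, it helps to verify the hosts file after editing it. The following is a rough sketch; missing_hosts is a made-up helper name, and the host names in the usage note are the example names from the list above.

```shell
# Hypothetical helper: print every hostname that has no entry
# in the given hosts file (comment lines are ignored).
# Usage: missing_hosts <hosts_file> <hostname>...
missing_hosts() {
    hosts_file="$1"
    shift
    for name in "$@"; do
        # A name counts as present when it appears as a whole field
        # after the IP address on a non-comment line
        if ! awk -v n="$name" '
            /^[ \t]*#/ { next }
            { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
            END { exit found ? 0 : 1 }' "$hosts_file"; then
            echo "$name"
        fi
    done
}
```

For example, missing_hosts /etc/hosts hdp01.local hdp02.local hdp03.local hdp04.local prints nothing when all four entries are in place.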

3.9.3, Confirm that the hostname is set by running the following command:

hostname -f  

This should return the <fully.qualified.domain.name> you just set.

3.9.4, Use the "hostname" command to set the hostname on each host in your cluster. For example:
hostname <fully.qualified.domain.name>

3.9.5, Using a text editor, open the network configuration file on every host and set the desired network configuration for each host. For example:

vi /etc/sysconfig/network  

3.9.6, Modify the HOSTNAME property to set the fully qualified domain name.

NETWORKING=yes  
HOSTNAME=<fully.qualified.domain.name>  

3.10 Set Up Password-less SSH (on each host)

3.10.1, Generate public and private SSH keys on each host
ssh-keygen

3.10.2, Collect the SSH public key (id_rsa.pub) of every host into a single authorized_keys file

3.10.3, Copy that authorized_keys file to ~/.ssh/ on each host

3.10.4, Depending on your version of SSH, you may need to set permissions on the .ssh directory (to 700) and the authorized_keys file in that directory (to 600) on the target hosts.

 chmod 700 ~/.ssh  
 chmod 600 ~/.ssh/authorized_keys  
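
Steps 3.10.2 and 3.10.4 can be sketched as a small merge function. This is only an illustration; merge_keys is a made-up helper name, and it assumes the public key files of all hosts have already been copied to one machine.

```shell
# Hypothetical helper: merge public key files into one authorized_keys
# file, skipping keys that are already present.
# Usage: merge_keys <authorized_keys> <pubkey_file>...
merge_keys() {
    target="$1"
    shift
    touch "$target"
    for key_file in "$@"; do
        # The "|| [ -n ... ]" part also handles a last line without a newline
        while IFS= read -r line || [ -n "$line" ]; do
            [ -n "$line" ] || continue
            grep -qxF -- "$line" "$target" || printf '%s\n' "$line" >> "$target"
        done < "$key_file"
    done
    chmod 600 "$target"
}
```

For example, after collecting the keys as hdp01.pub to hdp04.pub, run merge_keys ~/.ssh/authorized_keys hdp01.pub hdp02.pub hdp03.pub hdp04.pub and then copy the resulting file to each host as described in 3.10.3.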

3.10.5, Use ssh root@<hostname> to make sure you can connect to each host

If you come across the error "-bash: ssh: command not found", run:

yum -y install openssh-clients

3.11 Configuring a Local Repository

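3.11.1, Check that the httpd service is started

3.11.2, Create the web root directories:

mkdir -p /var/www/html/hdp

3.11.3, Extract the Ambari tarball into /var/www/html/

tar -zxvf ambari-2.2.2.0-centos6.tar.gz -C /var/www/html/

3.11.4, Extract the HDP and HDP-UTILS tarballs into /var/www/html/hdp

tar -zxvf HDP-2.4.2.0-centos6-rpm.tar.gz -C /var/www/html/hdp
tar -zxvf HDP-UTILS-1.1.0.20-centos6.tar.gz -C /var/www/html/hdp

If you come across "File size limit exceeded", raise the limit with the command below and extract again:

ulimit -f 6553500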

3.11.5, Make sure the repositories are reachable over HTTP
For example, with httpd installed on 192.168.197.133, Ambari and HDP can be reached via these URLs:
http://192.168.197.133/hdp/HDP-UTILS-1.1.0.20/repos/centos6/
http://192.168.197.133/AMBARI-2.2.2.0/centos6/2.2.2.0-460/
http://192.168.197.133/hdp/HDP/centos6/2.x/updates/2.4.2.0/
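
The three URLs can also be checked from the command line on any cluster host. The sketch below is one way to do it; repo_urls and check_repo_urls are made-up helper names, the paths are the ones listed above, and curl is assumed to be installed.

```shell
# Hypothetical helper: print the three repository URLs for a given web host.
repo_urls() {
    host="$1"
    printf 'http://%s/%s\n' "$host" 'hdp/HDP-UTILS-1.1.0.20/repos/centos6/'
    printf 'http://%s/%s\n' "$host" 'AMBARI-2.2.2.0/centos6/2.2.2.0-460/'
    printf 'http://%s/%s\n' "$host" 'hdp/HDP/centos6/2.x/updates/2.4.2.0/'
}

# Hypothetical helper: request each URL and report whether it is reachable.
check_repo_urls() {
    repo_urls "$1" | while IFS= read -r url; do
        # -f makes curl fail on HTTP errors such as 403 or 404
        if curl -s -f -o /dev/null "$url"; then
            echo "OK   $url"
        else
            echo "FAIL $url"
        fi
    done
}
```

For example, check_repo_urls 192.168.197.133 should print three OK lines once httpd serves the extracted tarballs.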

3.11.6, Configure 3 repo files
ambari.repo file content as below:

[Updates-ambari-2.2.2.0]  
name=ambari-2.2.2.0 - Updates  
baseurl=http://192.168.197.133/AMBARI-2.2.2.0/centos6/2.2.2.0-460/  
gpgcheck=0  
gpgkey=http://public-repo-1.hortonworks.com/ambari/centos6/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins  
enabled=1  
priority=1  

hdp.repo file content as below:

[HDP-2.4.2.0]  
name=HDP Version - HDP-2.4.2.0  
baseurl=http://192.168.197.133/hdp/HDP/centos6/2.x/updates/2.4.2.0/  
gpgcheck=0  
gpgkey=http://public-repo-1.hortonworks.com/HDP/centos6/2.x/updates/2.4.2.0/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins  
enabled=1  
priority=1  

hdp-util.repo file content as below:

[HDP-UTILS-1.1.0.20]  
name=HDP Utils Version - HDP-UTILS-1.1.0.20  
baseurl=http://192.168.197.133/hdp/HDP-UTILS-1.1.0.20/repos/centos6/  
gpgcheck=0  
gpgkey=http://public-repo-1.hortonworks.com/HDP/centos6/2.x/updates/2.4.2.0/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins  
enabled=1  
priority=1  
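
The three repo files differ only in their id, name and baseurl, so they can be generated from one template. The sketch below is illustrative; repo_stanza is a made-up helper name, and it omits the gpgkey line on the assumption that gpgcheck=0 makes it unnecessary.

```shell
# Hypothetical helper: print one yum repository stanza.
# Usage: repo_stanza <id> <name> <baseurl>
repo_stanza() {
    printf '[%s]\nname=%s\nbaseurl=%s\ngpgcheck=0\nenabled=1\npriority=1\n' "$1" "$2" "$3"
}
```

For example, repo_stanza HDP-2.4.2.0 "HDP Version - HDP-2.4.2.0" http://192.168.197.133/hdp/HDP/centos6/2.x/updates/2.4.2.0/ > /etc/yum.repos.d/hdp.repo writes the second file shown above, minus the gpgkey line.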

3.11.7, Put the three repo files into /etc/yum.repos.d/ on every host that should use these repos

3.11.8, Optional. If you have multiple repositories configured in your environment, deploy the following plug-in on all the nodes in your cluster.
Install the plug-in.

yum install yum-plugin-priorities  

Edit the /etc/yum/pluginconf.d/priorities.conf file to add the following:

[main]  
enabled=1  
gpgcheck=0  

3.11.9, Optional: if you changed the yum sources, run the commands below on each host

yum clean all
yum makecache
yum upgrade

3.11.10, If yum upgrade fails with an error, run:

dhclient

then run yum upgrade again; it should now succeed.

3.11.11, If you see "Another app is currently holding the yum lock; waiting for it to exit...", remove the stale lock file:

cd /var/run  
rm -f yum.pid  

4. Install Ambari server(on one host)
4.1, Install the Ambari bits. This also installs the default PostgreSQL Ambari database.

yum install ambari-server  

4.2, Set up the Ambari server (I ran into a JDBC connection error with the mysql database, so I recommend using PostgreSQL)

ambari-server setup   

4.3, Run the following command on the Ambari Server host:

ambari-server start

4.4, To check the Ambari Server processes:

ambari-server status    

4.5, To stop the Ambari Server:
ambari-server stop

4.6, If you use mysql, be sure you have run:
ambari-server setup --jdbc-db=mysql --jdbc-driver=/path/to/mysql/mysql-connector-java.jar
on the Ambari Server host to make the JDBC driver available and to enable testing the database connection. For example, my JDBC driver is /usr/share/java/mysql-connector-java-5.1.39-bin.jar, so I need to run:

ambari-server setup --jdbc-db=mysql --jdbc-driver=/usr/share/java/mysql-connector-java-5.1.39-bin.jar  

5. Installing, Configuring, and Deploying a HDP Cluster

5.1, Log in to the Ambari UI

URL: http://<your.ambari.server>:8080/
User: admin
Password: admin

5.2, Launching the Ambari Install Wizard

5.3, Get Started

Just Name Your Cluster

5.4, Select Stack

Select the correct OS and enter the base URLs of the local repositories set up in 3.11.5.

5.5, Install Options

5.5.1, Enter all the host names, one per line.
5.5.2, Provide the SSH private key of the Ambari server host.

5.6, Confirm Hosts (you may come across some errors and warnings)
5.6.1, THP issues

Check Transparent Huge Pages (THP).

cat /sys/kernel/mm/redhat_transparent_hugepage/enabled

[always] means THP is enabled; [never] means THP is disabled.
If THP is enabled, execute the following shell commands on each host:

vi /etc/rc.local

add:

if test -f /sys/kernel/mm/redhat_transparent_hugepage/enabled; then
    echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled
fi

5.6.2, ambari-agent error

If host registration fails, or you find an error under '/var/lib/ambari-agent/data', install the ambari-agent on each host in the cluster except the ambari-server host, using the command below:

yum install ambari-agent -y

Start it and check it is started using below command:

ambari-agent start  
ambari-agent status  

Then continue, until all are successful.

5.7, Choose Services
5.8, Assign Masters
5.9, Assign Slaves and Clients
5.10, Customize Services (some need to be careful)
Before configuring the DB connection, be sure that mysql accepts logins from remote hosts, that the corresponding databases have been created, and that you have run: ambari-server setup --jdbc-db=mysql --jdbc-driver=/path/to/mysql/mysql-connector-java.jar on the Ambari Server host to make the JDBC driver available and to enable testing the database connection.

5.11, Review
5.12, Install, Start and Test
Just wait for the installation to finish.

If Ambari freezes at "Install complete (Waiting to start)", restart the Ambari server, log in to the Ambari UI again and retry (up to three times).

5.13, Summary