# Prerequisites for Installing CDH
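## 1. Basic environment

### 1.1. Cluster size

For a test environment, the minimum is 4 servers. One server acts as the management node running Cloudera Manager, the NameNode, and so on; the other three are worker (DataNode) nodes. This minimal layout is generally only used for development and testing.

For a production environment, the minimum is 6 servers: 3 management nodes (1 Cloudera Manager and 2 NameNodes for high availability) plus 3 worker nodes.

A typical small production system has 10 to 20 servers.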
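### 1.2. Operating system

CDH supports most mainstream 64-bit operating systems. This article uses CentOS 6.9 with CDH 5.14 as the example. For other CDH versions and the operating systems they support, see "CDH versions and their supported operating system versions".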
### 1.3. Installation user

Use root, or a user with passwordless sudo privileges.
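### 1.4. Hardware requirements

To size the hardware and allocate resources for a cluster, you need to analyze the workload the cluster will run and the CDH components you plan to deploy.

You should also consider the volume of data to be stored and processed, how often workloads run, how many jobs must run concurrently, and the resources each application needs.

The right hardware configuration therefore depends on your specific situation.

Minimum requirements for a test cluster:

- CPU: at least 4 cores
- Memory: at least 16 GB
- Network: gigabit or faster
- Disk: depends on the workload

The servers I use here are configured as follows:

- CPU: 56 cores
- Memory: 14 × 16 GB
- Network: dual 10 GbE NICs, bonded
- Disk: 24 × 1.2 TB SAS (2.5-inch, 10K RPM)

#### 1.4.1. Disk requirements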
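On every node the system disk can use RAID 1 or RAID 10. Data disks should not use RAID; run them in JBOD mode instead. HDFS is already a distributed, highly available storage system, so putting RAID underneath it defeats the purpose of using HDFS and costs performance.

If the cluster is small and several services share the same hosts, the data directories of management services such as NameNode (NN), ZooKeeper (ZK), and JournalNode (JN) can also live on RAID-backed disks.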
For DataNode data disks, ext4 or xfs is recommended, mounted with noatime:

    UUID=4df04bc1-c94b-45d6-a80c-4b2269211fa0 /data1 ext4 defaults,noatime 1 2
    UUID=0ec154be-9923-4f05-ae0f-72fa98067d23 /data2 ext4 defaults,noatime 1 2
    UUID=a87a9192-3e75-40c6-a58a-f851e5f888e3 /data3 ext4 defaults,noatime 1 2
    UUID=283926d8-dc64-4a99-aa17-23e4f325897c /data4 ext4 defaults,noatime 1 2
    UUID=b547c6d3-5898-4053-8a15-e38c7be3f9ba /data5 ext4 defaults,noatime 1 2
    UUID=8a332303-6bcb-47cb-9def-546b70b75bcf /data6 ext4 defaults,noatime 1 2
    UUID=2574f003-b84a-458b-8063-f503066b1101 /data7 ext4 defaults,noatime 1 2
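To confirm that the noatime option actually took effect after mounting, a small check along these lines may help. This is only a sketch: the function name `mounts_without_noatime` is made up for illustration, and it assumes the data disks are mounted as /data1, /data2, and so on, as in the fstab entries above.

```shell
# List data mount points that are mounted WITHOUT the noatime option.
# Reads lines in /proc/mounts format from stdin:
#   device mountpoint fstype options dump pass
# Assumption: data disks are mounted as /data<N>, as in the fstab above.
mounts_without_noatime() {
    awk '$2 ~ /^\/data[0-9]+$/ {
        n = split($4, opts, ",")      # split the option list on commas
        found = 0
        for (i = 1; i <= n; i++) if (opts[i] == "noatime") found = 1
        if (!found) print $2          # report mount points lacking noatime
    }'
}

# Usage on a live node (prints nothing when every data disk has noatime):
#   mounts_without_noatime < /proc/mounts
```

The function only reads its input, so it is safe to run on a production node.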
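Common SATA disks currently read and write at roughly 150 to 200 MB/s, and SAS disks or SSDs are faster. If a disk delivers less than 80 MB/s, check it or replace it with a better one; otherwise it will become a serious I/O risk later.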
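#### 1.4.2. Network requirements

Big data applications generally push a lot of traffic across the cluster's internal network, so a robust, high-performance network is essential. Plan it up front: upgrading the underlying network after business throughput has outgrown it is very painful.

Gigabit NICs are the bare minimum. Depending on your situation, consider 10 GbE NICs with matching fiber switches, and leave room for NIC bonding and switch stacking.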
If you run on cloud virtual machines, check the NIC's multi-queue support. I was once bitten by a cloud provider (XX Cloud) whose NICs supported too few queues: cluster utilization could not be pushed up, CPU load was skewed, and packets were dropped.

    [root@prd-bigdata06 ~]# ethtool -l eth0
    Channel parameters for eth0:
    Pre-set maximums:
    RX: 0
    TX: 0
    Other: 0
    Combined: 14
    Current hardware settings:
    RX: 0
    TX: 0
    Other: 0
    Combined: 14

    [root@bigdata17 ~]# ethtool -l p6p1
    Channel parameters for p6p1:
    Pre-set maximums:
    RX: 0
    TX: 0
    Other: 1
    Combined: 63
    Current hardware settings:
    RX: 0
    TX: 0
    Other: 1
    Combined: 56
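When checking many nodes, it can be handy to reduce the `ethtool -l` output to just the two numbers that matter. The sketch below does that; the function name `combined_queues` is hypothetical, and it assumes the output layout shown above (a "Pre-set maximums" section followed by a "Current hardware settings" section).

```shell
# Extract the maximum and current "Combined" queue counts from `ethtool -l` output.
# Reads the ethtool output from stdin and prints "<max> <current>".
combined_queues() {
    awk '
        /^Pre-set maximums:/          { section = "max" }
        /^Current hardware settings:/ { section = "cur" }
        $1 == "Combined:"             { value[section] = $2 }
        END { print value["max"], value["cur"] }
    '
}

# Usage:
#   ethtool -l p6p1 | combined_queues
# If the current value is below the maximum, the NIC may allow raising it with
#   ethtool -L p6p1 combined <N>
# (whether that helps depends on the driver and the workload).
```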
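## 2. System and application environment

### 2.1. JDK

The CDH distribution bundles JDK 1.7.0_67, and JDK 1.8 is supported from CDH 5.3 onward. You can install the JDK yourself beforehand, or tick the option to install the bundled JDK later during the CDH installation.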
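### 2.2. External database

When Cloudera Manager is deployed automatically, it brings its own embedded database and manages the system configuration, schemas, and so on in it.

You can also deploy your own database instead of the embedded one. The supported databases are:

- MySQL: 5.1, 5.5, 5.6, 5.7
- PostgreSQL: 8.1, 8.3, 8.4, 9.1, 9.2, 9.3, 9.4
- Oracle: 11gR2, 12c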
Here I deploy my own MySQL, which is easier to manage. Make sure of the following:

- Increase the database's maximum number of connections.
- Make sure the database supports UTF-8 encoding.
- Configure primary/standby replication; see "How to set up primary/standby replication for the CDH metadata MySQL database".

If you deploy the database yourself, you also have to create the metadata databases for each CDH service in advance:

    create database metastore default character set utf8;
    CREATE USER 'hive'@'%' IDENTIFIED BY 'password';
    GRANT ALL PRIVILEGES ON metastore.* TO 'hive'@'%';
    FLUSH PRIVILEGES;

    create database cm default character set utf8;
    CREATE USER 'cm'@'%' IDENTIFIED BY 'password';
    GRANT ALL PRIVILEGES ON cm.* TO 'cm'@'%';
    FLUSH PRIVILEGES;

    create database am default character set utf8;
    CREATE USER 'am'@'%' IDENTIFIED BY 'password';
    GRANT ALL PRIVILEGES ON am.* TO 'am'@'%';
    FLUSH PRIVILEGES;

    create database rm default character set utf8;
    CREATE USER 'rm'@'%' IDENTIFIED BY 'password';
    GRANT ALL PRIVILEGES ON rm.* TO 'rm'@'%';
    FLUSH PRIVILEGES;

    create database hue default character set utf8;
    CREATE USER 'hue'@'%' IDENTIFIED BY 'password';
    GRANT ALL PRIVILEGES ON hue.* TO 'hue'@'%';
    FLUSH PRIVILEGES;

    create database oozie default character set utf8;
    CREATE USER 'oozie'@'%' IDENTIFIED BY 'password';
    GRANT ALL PRIVILEGES ON oozie.* TO 'oozie'@'%';
    FLUSH PRIVILEGES;

    create database sentry default character set utf8;
    CREATE USER 'sentry'@'%' IDENTIFIED BY 'password';
    GRANT ALL PRIVILEGES ON sentry.* TO 'sentry'@'%';
    FLUSH PRIVILEGES;
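Since the seven blocks above differ only in the database and user names, the statements can also be generated rather than typed by hand. The following is a rough sketch, not the author's procedure: the function names `cdh_db_sql` and `cdh_all_sql` are invented for illustration, and the password is passed on the command line only to keep the example short.

```shell
# Print the CREATE DATABASE / CREATE USER / GRANT statements for one service.
# Arguments: database name, user name, password.
cdh_db_sql() {
    printf "CREATE DATABASE %s DEFAULT CHARACTER SET utf8;\n" "$1"
    printf "CREATE USER '%s'@'%%' IDENTIFIED BY '%s';\n" "$2" "$3"
    printf "GRANT ALL PRIVILEGES ON %s.* TO '%s'@'%%';\n" "$1" "$2"
}

# Print the statements for every database:user pair listed above.
# Argument: the password to assign to all users (use distinct ones in production).
cdh_all_sql() {
    for pair in metastore:hive cm:cm am:am rm:rm hue:hue oozie:oozie sentry:sentry; do
        cdh_db_sql "${pair%%:*}" "${pair##*:}" "$1"
    done
    printf "FLUSH PRIVILEGES;\n"
}

# Usage (mysql prompts for the root password):
#   cdh_all_sql 'password' | mysql -u root -p
```

Reviewing the generated SQL before piping it into `mysql` is a cheap way to catch typos in the name list.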
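### 2.3. Open ports

The following commonly used service ports need to be allowed through the firewall, depending on your situation.

| Service | Port | Hosts |
| --- | --- | --- |
| Cloudera Manager | 7180 | Cloudera Manager host |
| Cloudera Navigator Metadata | 7187 | Navigator host |
| HDFS | 50070, 8020 | NameNode host |
| ResourceManager | 8088, 19888 | ResourceManager and JobHistory hosts |
| HBase | 60010, 60030 | HMaster and RegionServer hosts |
| Hive | 10002 | HiveServer2 host |
| Hue | 8888 | Hue host |
| Impala | 25010, 25020, 25000 | |
| Spark | 18088 | Spark HistoryServer host |
| SSH | 22 | |
| HTTP | 80 | Host running httpd, usually the Cloudera Manager host |

### 2.4. HTTP service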
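The httpd service is installed mainly to serve local repositories for CDH and Cloudera Manager, so that the installation can be done offline. For well-known network reasons, online installation rarely goes smoothly, so an offline installation is the better choice.

    [root@bigdata02 ~]# yum -y install httpd
    [root@bigdata02 ~]# chkconfig --add httpd
    [root@bigdata02 ~]# chkconfig httpd on
    [root@bigdata02 ~]# service httpd start
    Starting httpd: [OK]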
#### 2.4.1. Configure a local yum repository

Create a cm5.14 directory under /var/www/html:

    [root@bigdata02 ~]# mkdir -p /var/www/html/cm5.14

Download the Cloudera Manager 5.14 RPM packages into the cm5.14 directory and run createrepo there:

    [root@bigdata02]# createrepo .
    Spawning worker 0 with 7 pkgs
    Workers Finished
    Gathering worker results
    Saving Primary metadata
    Saving file lists metadata
    Saving other metadata
    Generating sqlite DBs
    Sqlite DBs complete

On the Cloudera Manager server, create a cm.repo file in the /etc/yum.repos.d directory with the following content (note that the yum option is spelled `enabled`, not `enable`):

    [root@bigdata04 yum.repos.d]# vim cm.repo
    [cmrepo]
    name=Cloudera Manager 5.14
    baseurl=http://10.50.10.12/cm5.14
    gpgcheck=0
    enabled=1

CDH Parcels are deployed the same way.
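If the same repository definition has to be written on several hosts, generating the file is less error-prone than editing it by hand each time. Below is a minimal sketch; `render_repo` is a hypothetical helper name, and the repo id, name, and URL are simply the values used in this article.

```shell
# Print a yum .repo definition for a local mirror.
# Arguments: repo id, display name, base URL.
render_repo() {
    printf '[%s]\n' "$1"
    printf 'name=%s\n' "$2"
    printf 'baseurl=%s\n' "$3"
    printf 'gpgcheck=0\n'   # local mirror, packages are not signature-checked
    printf 'enabled=1\n'
}

# Usage on the Cloudera Manager host (needs root):
#   render_repo cmrepo "Cloudera Manager 5.14" http://10.50.10.12/cm5.14 > /etc/yum.repos.d/cm.repo
#   yum clean all && yum makecache
# createrepo writes its index to repodata/repomd.xml, so the mirror can be probed with:
#   curl -sfI http://10.50.10.12/cm5.14/repodata/repomd.xml
```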
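### 2.5. hosts configuration

Add the IP address and hostname of every server in the cluster to the hosts file, and sync that file to all servers in the cluster.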
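One way to build the hosts file without creating duplicate entries is sketched below. The helper name `add_host_entry` and the example addresses and hostnames are placeholders, not values from this cluster.

```shell
# Append "IP HOSTNAME" to a hosts file unless the hostname is already listed.
# Arguments: hosts file path, IP address, hostname.
add_host_entry() {
    # awk exits 0 if any field after the IP column equals the hostname.
    if ! awk -v h="$3" '{ for (i = 2; i <= NF; i++) if ($i == h) found = 1 }
                        END { exit !found }' "$1" 2>/dev/null; then
        printf '%s %s\n' "$2" "$3" >> "$1"
    fi
}

# Usage (placeholder values; run as root on one node):
#   add_host_entry /etc/hosts 192.0.2.11 node01
#   add_host_entry /etc/hosts 192.0.2.12 node02
# Then copy the file to the other nodes, for example:
#   for h in node02 node03; do scp /etc/hosts "root@$h:/etc/hosts"; done
```

Running the function twice with the same hostname leaves the file unchanged, so it is safe to re-run when nodes are added later.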
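### 2.6. System settings

- Disable SELinux
- Turn off the iptables firewall
- Adjust the swap settings

swappiness controls how the swap partition is used.

With swappiness=0 the kernel uses physical memory as much as possible before touching swap space. With swappiness=100 it uses swap aggressively and moves data from memory to swap promptly. The usual Linux default is 60; here I set it to 1 in /etc/sysctl.conf:

    vm.swappiness=1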
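The setting can be applied to the file in a repeatable way with something like the sketch below. The function name `set_sysctl_conf` is hypothetical; it replaces an existing line for the key if there is one, so running it twice does not leave duplicate entries.

```shell
# Set key=value in a sysctl-style config file, replacing any existing line for the key.
# Arguments: config file path, key, value.
set_sysctl_conf() {
    tmp=$(mktemp)
    # Copy every line except existing assignments to the key ...
    awk -v k="$2" '{
        line = $0
        sub(/^[ \t]+/, "", line)          # ignore leading whitespace
        split(line, kv, /[ \t]*=/)        # kv[1] is the key part of "key = value"
        if (kv[1] != k) print $0
    }' "$1" 2>/dev/null > "$tmp"
    # ... then append the new assignment.
    printf '%s=%s\n' "$2" "$3" >> "$tmp"
    cat "$tmp" > "$1"
    rm -f "$tmp"
}

# Usage (needs root); `sysctl -p` reloads /etc/sysctl.conf into the running kernel:
#   set_sysctl_conf /etc/sysctl.conf vm.swappiness 1 && sysctl -p
#   cat /proc/sys/vm/swappiness
```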
Disable transparent huge pages:

    [root@bigdata02 ~]# echo never > /sys/kernel/mm/redhat_transparent_hugepage/defrag
    [root@bigdata02 ~]# echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled
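The kernel reports the active transparent huge page mode by wrapping it in square brackets, for example `always madvise [never]`. A small sketch for reading that value follows; `thp_active` is a made-up name. Note that the `redhat_transparent_hugepage` path is specific to RHEL/CentOS 6 kernels (mainline kernels use `/sys/kernel/mm/transparent_hugepage`), and that values written with `echo` do not survive a reboot.

```shell
# Print the active transparent huge page mode, i.e. the word inside [brackets].
# Reads the content of the sysfs file from stdin.
thp_active() {
    sed -n 's/.*\[\(.*\)\].*/\1/p'
}

# Usage on CentOS 6:
#   thp_active < /sys/kernel/mm/redhat_transparent_hugepage/enabled
#   thp_active < /sys/kernel/mm/redhat_transparent_hugepage/defrag
# Both should print "never" after the commands above.
# To keep the setting across reboots, a common approach is to append the two
# echo commands to /etc/rc.local.
```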
### 2.7. NTP clock synchronization

If your company has its own NTP server, point every node in the cluster at it. If not, pick one server in the cluster to act as the NTP server and have the other servers sync with it. The configuration is as follows.

Install NTP on all nodes:

    [root@bigdata02 ~]# yum -y install ntp

Pick one node as the NTP server:

    [root@bigdata02 ~]# vim /etc/ntp.conf
    #server 0.centos.pool.ntp.org iburst
    #server 1.centos.pool.ntp.org iburst
    #server 2.centos.pool.ntp.org iburst
    #server 3.centos.pool.ntp.org iburst
    server 127.127.1.0 #local clock
    fudge 127.127.1.0 stratum 10
The other nodes in the cluster sync with it, configured as follows:

    [root@bigdata04 ~]# vim /etc/ntp.conf
    # Use public servers from the pool.ntp.org project.
    #server 0.centos.pool.ntp.org iburst
    #server 1.centos.pool.ntp.org iburst
    #server 2.centos.pool.ntp.org iburst
    #server 3.centos.pool.ntp.org iburst
    server 172.16.1.22
Start ntpd on all nodes:

    [root@bigdata04 ~]# chkconfig --add ntpd
    [root@bigdata04 ~]# chkconfig ntpd on
    [root@bigdata04 ~]# service ntpd restart
    Shutting down ntpd: [ OK ]
    Starting ntpd: [ OK ]
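After ntpd has been running for a while, `ntpq -p` lists the configured peers and marks the one currently selected for synchronization with a leading `*`. The sketch below extracts that peer; `ntp_sync_peer` is a hypothetical helper, and an empty result shortly after a restart is normal because ntpd needs several polls before it selects a source.

```shell
# Print the peer that ntpd has selected as its sync source
# (the line that `ntpq -p` marks with a leading "*").
# Reads `ntpq -p` output from stdin; prints nothing if no peer is selected yet.
ntp_sync_peer() {
    awk '/^\*/ { print substr($1, 2) }'
}

# Usage on a client node; with the configuration above this should
# eventually print the address of the NTP server (172.16.1.22):
#   ntpq -p | ntp_sync_peer
```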
## 3. External database

Install the MySQL service on the Cloudera Manager node of the cluster:

    [root@bigdata02 ~]# yum -y install mysql mysql-server
    [root@bigdata02 ~]# chkconfig --add mysqld
    [root@bigdata02 ~]# chkconfig mysqld on
    [root@bigdata02 ~]# service mysqld start
    Starting mysqld: [ OK ]
Run the initialization script:

    [root@bigdata02 ~]# mysql_secure_installation

    NOTE: RUNNING ALL PARTS OF THIS SCRIPT IS RECOMMENDED FOR ALL MySQL
          SERVERS IN PRODUCTION USE!  PLEASE READ EACH STEP CAREFULLY!

    In order to log into MySQL to secure it, we'll need the current
    password for the root user.  If you've just installed MySQL, and
    you haven't set the root password yet, the password will be blank,
    so you should just press enter here.

    Enter current password for root (enter for none):
    OK, successfully used password, moving on...

    Setting the root password ensures that nobody can log into the MySQL
    root user without the proper authorisation.

    Set root password? [Y/n] y
    New password:
    Re-enter new password:
    Password updated successfully!
    Reloading privilege tables..
     ... Success!

    By default, a MySQL installation has an anonymous user, allowing anyone
    to log into MySQL without having to have a user account created for
    them.  This is intended only for testing, and to make the installation
    go a bit smoother.  You should remove them before moving into a
    production environment.

    Remove anonymous users? [Y/n] y
     ... Success!

    Normally, root should only be allowed to connect from 'localhost'.  This
    ensures that someone cannot guess at the root password from the network.

    Disallow root login remotely? [Y/n] n
     ... skipping.

    By default, MySQL comes with a database named 'test' that anyone can
    access.  This is also intended only for testing, and should be removed
    before moving into a production environment.

    Remove test database and access to it? [Y/n] y
     - Dropping test database...
     ... Success!
     - Removing privileges on test database...
     ... Success!

    Reloading the privilege tables will ensure that all changes made so far
    will take effect immediately.

    Reload privilege tables now? [Y/n] y
     ... Success!

    Cleaning up...

    All done!  If you've completed all of the above steps, your MySQL
    installation should now be secure.

    Thanks for using MySQL!
Create the databases that CDH needs, using the same statements listed in section 2.2.
4. Install the MySQL driver: copy mysql-connector-java-5.1.34.jar to the /usr/share/java directory and create a symbolic link to it.
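The copy-and-link step might look like the sketch below. The article does not name the symlink; `mysql-connector-java.jar` is an assumption here, chosen because a version-less name lets services find the driver regardless of the connector version. The function name `install_mysql_driver` is also hypothetical.

```shell
# Copy the connector jar into a target directory and create a version-less symlink.
# Arguments: path to the downloaded jar, target directory (/usr/share/java in this article).
# Assumption: the link is named mysql-connector-java.jar.
install_mysql_driver() {
    mkdir -p "$2" &&
    cp "$1" "$2/" &&
    # Relative link target, so the link stays valid if the directory is moved.
    ln -sf "$(basename "$1")" "$2/mysql-connector-java.jar"
}

# Usage (needs root):
#   install_mysql_driver ./mysql-connector-java-5.1.34.jar /usr/share/java
#   ls -l /usr/share/java/mysql-connector-java.jar
```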
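That covers most of the prerequisites for installing CDH. Later posts will walk through the actual CDH deployment steps and then introduce some of the basic CDH components, starting with the simpler ones.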
Source:
Author:hyperxu
link:http://www.hyperxu.com/2018/07/21/prepare-cdh/