1. Download links for Hive and Hadoop
Hive download (official archive): https://archive.apache.org/dist/hive/
Hadoop download (official archive): https://archive.apache.org/dist/hadoop/common/
2. Installing Hadoop
Download the Hadoop version that matches your Hive version.
For example:
Download apache-hive-3.1.2-src.tar.gz
Open the pom.xml inside the archive to determine the matching Hadoop version (the ==hadoop.version== property)
Once the version is confirmed, download the corresponding Hadoop release
- Extract hadoop-3.2.2.tar.gz
- Configure the environment variables (the same way as for Java): name the variable HADOOP_HOME and add its bin directory to Path (a command sketch follows this list)
- Run ==hadoop version== in cmd to verify that the configuration works
If an error reports that the Java environment variable is wrong or unset, check your Java environment variables, or set ==JAVA_HOME== in the ==etc\hadoop\hadoop-env.cmd== file under the Hadoop directory to the JDK path you are using, and change ==%USERNAME%== in that file to =="%USERNAME%"==
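For reference, after the edit the relevant lines of ==etc\hadoop\hadoop-env.cmd== might look like the sketch below. The JDK path is only an example (prefer a path without spaces; C:\Program Files can be written in its short 8.3 form C:\PROGRA~1), and in our copy of the file the ==%USERNAME%== reference sits on the HADOOP_IDENT_STRING line:
@rem JDK that Hadoop should run on (example path, adjust to your own install)
set JAVA_HOME=C:\PROGRA~1\Java\jdk1.8.0_271
@rem Quote the username so accounts with a space in the name still work
set HADOOP_IDENT_STRING="%USERNAME%"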
- Create two folders under the Hadoop directory: ==data/dfs/namenode== and ==data/dfs/datanode== (see the commands below)
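A minimal command sketch for the environment variable and the two folders above, assuming Hadoop is installed at D:\Softwares\hadoop (the path the configs below also use):
rem Persist HADOOP_HOME for future sessions (takes effect in new cmd windows;
rem add %HADOOP_HOME%\bin to Path via System Properties, since setx on Path
rem can truncate long values)
setx HADOOP_HOME "D:\Softwares\hadoop"
rem Storage directories referenced by hdfs-site.xml below
mkdir D:\Softwares\hadoop\data\dfs\namenode
mkdir D:\Softwares\hadoop\data\dfs\datanode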
Edit the ==core-site.xml== file under ==etc\hadoop== and add the following:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
Edit ==hdfs-site.xml==, changing the namenode and datanode directories to your own:
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/D:/Softwares/hadoop/data/dfs/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/D:/Softwares/hadoop/data/dfs/datanode</value>
</property>
</configuration>
Rename mapred-site.xml.template to mapred-site.xml (recent Hadoop 3.x releases may already ship it as mapred-site.xml) and add:
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
Edit the yarn-site.xml file and add:
<configuration>
<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
</configuration>
- Download the winutils build matching your Hadoop version
Hadoop cannot start directly on Windows; it depends on the winutils native binaries (winutils.exe, hadoop.dll, and related files). A commonly used source is the cdarlint/winutils repository on GitHub.
Copy the files for the matching version over Hadoop's bin directory, overwriting the existing files.
Format HDFS: open cmd and run
hadoop namenode -format
If the output contains INFO common.Storage: Storage directory D:\Softwares\hadoop\data\dfs\namenode has been successfully formatted.
the format succeeded.
In cmd, change to the sbin directory under the Hadoop directory and run:
start-all.cmd
Four console windows will open in turn: namenode, datanode, resourcemanager, and nodemanager. Visit http://localhost:9870 to check the NameNode web UI.
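You can also confirm from another cmd window that all four daemons are up, using the JDK's jps tool:
rem Lists running JVM processes; expect to see NameNode, DataNode,
rem ResourceManager and NodeManager among them
jps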
3. Installing Hive
Extract the downloaded Hive archive (for running Hive, use the binary release, e.g. apache-hive-3.1.2-bin.tar.gz, not the src package)
- Configure the environment variable (HIVE_HOME) and add its bin directory to Path
- Create five folders inside the Hive directory: ==data_hive== plus the four subfolders below (see the commands after this list)
data_hive\operation_logs
data_hive\querylog
data_hive\resources
data_hive\scratch
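Assuming Hive is installed at D:\Softwares\hive (the path used in the configs below), the folders can be created from cmd as follows; mkdir creates the data_hive parent automatically:
mkdir D:\Softwares\hive\data_hive\operation_logs
mkdir D:\Softwares\hive\data_hive\querylog
mkdir D:\Softwares\hive\data_hive\resources
mkdir D:\Softwares\hive\data_hive\scratch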
- Copy mysql-connector-java-5.1.47-bin.jar (the MySQL JDBC driver) into the lib directory under the Hive directory
- Go into the conf directory under the Hive directory
Rename hive-log4j2.properties.template to hive-log4j2.properties
Rename hive-exec-log4j2.properties.template to hive-exec-log4j2.properties
Rename hive-env.sh.template to hive-env.sh
Rename hive-default.xml.template to hive-site.xml
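From cmd, the renames look like this (again assuming the D:\Softwares\hive install path):
cd /d D:\Softwares\hive\conf
ren hive-log4j2.properties.template hive-log4j2.properties
ren hive-exec-log4j2.properties.template hive-exec-log4j2.properties
ren hive-env.sh.template hive-env.sh
ren hive-default.xml.template hive-site.xml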
Edit hive-env.sh, replacing the values with your own paths:
# Set HADOOP_HOME to point to a specific hadoop install directory
HADOOP_HOME=D:\Softwares\hadoop
# Hive Configuration Directory can be controlled by:
export HIVE_CONF_DIR=D:\Softwares\hive\conf
# Folder containing extra libraries required for hive compilation/execution can be controlled by:
export HIVE_AUX_JARS_PATH=D:\Softwares\hive\lib
Edit hive-site.xml, replacing the values with your own (note that a literal & inside an XML value must be escaped as &amp;):
<property>
<name>hive.exec.local.scratchdir</name>
<value>D:/Softwares/hive/data_hive/scratch</value>
<description>Local scratch space for Hive jobs</description>
</property>
<property>
<name>hive.downloaded.resources.dir</name>
<value>D:/Softwares/hive/data_hive/resources/${hive.session.id}_resources</value>
<description>Temporary local directory for added resources in the remote file system.</description>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>password</value>
<description>password to use against metastore database</description>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>root</value>
<description>Username to use against metastore database</description>
</property>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true&amp;useSSL=false&amp;useUnicode=true&amp;characterEncoding=UTF-8</value>
<description>
JDBC connect string for a JDBC metastore.
To use SSL to encrypt/authenticate the connection, provide database-specific SSL flag in the connection URL.
For example, jdbc:postgresql://myhost/db?ssl=true for postgres database.
</description>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
<description>Driver class name for a JDBC metastore</description>
</property>
- Open Hadoop's sbin directory and run start-dfs.cmd; when its two console windows (namenode and datanode) come up, Hadoop is running
Initialize the Hive metastore database by running in cmd:
hive --service schematool -dbType mysql -initSchema
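If initialization succeeds, you can double-check the result by querying the schema version recorded in the metastore with schematool's -info option:
hive --service schematool -dbType mysql -info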
- Type hive in cmd to start the Hive CLI
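As a quick smoke test, you can also run a statement non-interactively through the CLI's -e option:
rem Should list the built-in "default" database if the metastore is reachable
hive -e "show databases;"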