Building a Spark on YARN Cluster Yourself

Overall approach follows: https://blog.csdn.net/wy250229163/article/details/52729608
CONFIGURATION
Install the virtual machines, install ssh, and configure a static network.
sudo apt-get install ssh
To configure the static network, see:
https://blog.csdn.net/JingLisen/article/details/82430767
https://blog.csdn.net/bzlj2912009596/article/details/55187564
The IPs are currently set to:
master: 192.168.86.129
slave1: 192.168.86.130
slave2: 192.168.86.131
Remote login:
ssh zhentao@192.168.86.129
ssh zhentao@192.168.86.130
ssh zhentao@192.168.86.131
Download Spark:
cd ~/Desktop; mkdir softwares; wget http://www.trieuvan.com/apache/spark/spark-2.4.3/spark-2.4.3-bin-hadoop2.7.tgz
tar -xf spark-2.4.3-bin-hadoop2.7.tgz
Download Java and configure the Java environment variables:
wget https://download.oracle.com/otn/java/jdk/8u202-b08/1961070e4c9b4e26a04e7f5a083f551e/jdk-8u202-linux-x64.tar.gz?AuthParam=1563033313_467425c8aa77a5007fd6f772d1aed13b
mv 'jdk-8u202-linux-x64.tar.gz?AuthParam=1563033313_467425c8aa77a5007fd6f772d1aed13b' jdk-8u202-linux-x64.tar.gz
tar -zxvf jdk-8u202-linux-x64.tar.gz
sudo vi /etc/profile
Add:
export JAVA_HOME=/home/zhentao/Desktop/softwares/jdk1.8.0_202
export JRE_HOME=/home/zhentao/Desktop/softwares/jdk1.8.0_202/jre
export PATH=$JAVA_HOME/bin:$JAVA_HOME/jre/bin:$PATH
export CLASSPATH=$CLASSPATH:.:$JAVA_HOME/lib:$JAVA_HOME/jre/lib
Check:
source /etc/profile
java…

Scala Study Notes (Coursera course: Functional Programming Principles in Scala)

==Week1==
Environment setup (macOS):
Check for Java 1.8 with "java -version".
Install sbt (version > 0.13.x): "brew install sbt@0.13"
Install IntelliJ: https://www.jetbrains.com/idea/download/#section=mac
Configure IntelliJ: install the Scala plugin, then initialize the project with Scala and sbt, using sbt == 0.13.8 and scala == 2.11.8.
Under src > main > scala, create the object example.Example.scala; if it is the main object, its definition needs "extends App".
Add the unit test module: in build.sbt, add libraryDependencies += "org.scalatest" %% "scalatest"…

Learning Elasticsearch

Introduction to Elasticsearch and basic concepts
Elasticsearch (ES) is an open-source, distributed full-text search engine built on Lucene, with a RESTful interface.
Part I: Basic Elasticsearch configuration
Download & install: https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.6.1.tar.gz
Run: Users/frank.xu/Documents/[Softwares]/elasticsearch-6.6.1/bin/elasticsearch
Open localhost:9200 or localhost:9201 in a browser to see the result.
Configure Kibana. Install: brew install kibana. Run: ./kibana
Part II: Basic Elasticsearch operations
Create a new index: in the Sense Dev Tool Console…
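The excerpt stops just as it begins creating an index from the console. As a rough sketch of the same step done over the REST interface, the Python below builds the `PUT /<index>` request that Elasticsearch's create-index API expects. The index name `notes`, the shard and replica settings, and the helper names are all assumptions made for illustration; they are not taken from the original post.

```python
import json
import urllib.request


def build_create_index_request(index, host="http://localhost:9200", shards=1, replicas=0):
    """Build (but do not send) the HTTP request that creates an index."""
    # The settings body is optional for Elasticsearch; these values are illustrative.
    body = {"settings": {"number_of_shards": shards, "number_of_replicas": replicas}}
    return urllib.request.Request(
        url=f"{host}/{index}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def create_index(index, host="http://localhost:9200"):
    """Send the request; this needs a running Elasticsearch node at `host`."""
    with urllib.request.urlopen(build_create_index_request(index, host)) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `create_index("notes")` against a local node started as described above should return the JSON acknowledgement from Elasticsearch. Building the request separately from sending it makes it easy to inspect the method, URL, and body without a server running.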

Learning Redis

I am interested in this key-value data management engine and want to try some hands-on examples and share some notes on using Redis. I took a 3-hour Redis course here, and below is…

Learning Scala

Scala is the language Spark is written in, and Spark is a unified analytics engine for big data processing. Programming Spark in Scala will achieve optimal computation performance. I will update the learning notes below after…

Learn Spark (Python API)

I started learning Apache Spark recently and will post the cool techniques here.
RDD: Resilient Distributed Dataset. Essentially a dataset, but distributed.
Ways to create one:
rdd = sc.textFile("file:///…*.txt"); this approach works very well!
rdd = sc.parallelize([a list of value]); this one is less interesting.
Common functions:
ACTION category:
collect: runs the RDD and outputs the result
flat…
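To make the create-then-collect flow above concrete, here is a minimal sketch using the PySpark API. It assumes PySpark is installed and runs in local mode; the sample words, the app name `rdd-demo`, and the helper names are illustrative and not from the original notes. `word_lengths` is a plain-Python reference for what the RDD pipeline computes.

```python
def word_lengths(words):
    """Plain-Python reference: the length of each word, in order."""
    return [len(w) for w in words]


def run_demo(words=("spark", "on", "yarn")):
    """Compute the same thing with an RDD; requires PySpark."""
    # Imported here so the file still loads on machines without Spark.
    from pyspark import SparkContext

    sc = SparkContext("local[*]", "rdd-demo")
    try:
        rdd = sc.parallelize(list(words))  # create an RDD from a local list
        lengths = rdd.map(len)             # transformation: lazy, nothing runs yet
        return lengths.collect()           # action: runs the job, returns a list
    finally:
        sc.stop()
```

With Spark available, `run_demo()` should return the same list as `word_lengths(["spark", "on", "yarn"])`. The point of the sketch is that `map` only describes the computation, while `collect` is the action that triggers it and brings the results back to the driver.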

Start learning new cool tools (SQL)

Recently I started a Udemy course on MySQL and Hadoop. Although I taught myself some MySQL and PHP two years ago, a systematic course will definitely be much more helpful. I was surprised…

Blog’s first day.

Hi, this is Zhentao, a master's student who just graduated from UMich and is currently doing an internship at Isuzu Technical Center of America (ITCA). Today I have published the WordPress module on my website, I’d…