Spark Project Practice

回眸只為那壹抹淺笑, submitted on 2021-02-04 08:29:34

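Purpose

Work through an open-source example to learn big-data architecture, basic usage, and the core concepts. No original coding or new design is involved.

Environment Setup

A big-data environment with Hadoop, HBase, and Spark is required.

On 10.30.2.5, create six Docker containers, s141 through s146, to host the big-data environment. For the detailed steps, see my own posts below.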

hadoop-spark

https://blog.csdn.net/dualvencsdn/article/details/112007643?spm=1001.2014.3001.5501

HBase

https://blog.csdn.net/dualvencsdn/article/details/112905925?spm=1001.2014.3001.5501

Learning to operate HBase

https://blog.csdn.net/dualvencsdn/article/details/113309385?spm=1001.2014.3001.5501
 

Getting started with Flume

https://blog.csdn.net/qq_1018944104/article/details/85462011

/usr/local/flume/do.sh

Using and programming with Kafka and ZooKeeper

https://blog.csdn.net/dualvencsdn/article/details/105557575?ops_request_misc=%257B%2522request%255Fid%2522%253A%2522161234227816780269887393%2522%252C%2522scm%2522%253A%252220140713.130102334.pc%255Fblog.%2522%257D&request_id=161234227816780269887393&biz_id=0&utm_medium=distribute.pc_search_result.none-task-blog-2~blog~first_rank_v1~rank_blog_v1-1-105557575.pc_v1_rank_blog_v1&utm_term=kafka&spm=1018.2226.3001.4450

Results

 

 

Operation Log

/home/dualven/docker/*.jar  

start.sh -> start the Docker containers

appendHost.sh -> add host IP entries for the six hosts

seeMessage.sh -> view the messages consumed from Kafka

docker exec -it centos1122 bash

cd /usr/local/

see readme.txt
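
The appendHost.sh step above adds host entries for the six containers, but the script itself is not shown in this post. As a rough illustration only, the entries it needs might be generated as below. The function name host_entries, the subnet prefix 172.18.0, and the idea that each container's last IP octet matches its host number are all assumptions, not taken from the actual script.

```python
def host_entries(subnet_prefix="172.18.0", first=141, last=146):
    """Build /etc/hosts-style lines for hosts s141..s146.

    Assumption: each container's IP is <subnet_prefix>.<host number>;
    the real addresses are not given in the post.
    """
    return [f"{subnet_prefix}.{n} s{n}" for n in range(first, last + 1)]


if __name__ == "__main__":
    # Print the lines; appending them to /etc/hosts in each container
    # is left to the operator.
    print("\n".join(host_entries()))
```

Every container needs the same six lines so that the Hadoop, HBase, and Spark nodes can resolve each other by hostname.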

Code

https://codechina.csdn.net/dualvenorg/sparkstreaming.git

References

https://blog.csdn.net/qq_41955099/article/details/88959996?utm_medium=distribute.pc_relevant.none-task-blog-BlogCommendFromMachineLearnPai2-2.control&depth_1-utm_source=distribute.pc_relevant.none-task-blog-BlogCommendFromMachineLearnPai2-2.control

https://github.com/ljcan/SparkStreaming
