Summary of Common Linux Commands

I. Common SQL

1) List a table's partition directories on HDFS

hadoop fs -ls /hive/warehouse/managed/dwd_data.db/dwd_gen_track_oneapp_log_df

2) Add a partition

alter table dwd_gen_track_oneapp_log_df add partition(partition_date = '2019-10-24');

3) Drop a partition

alter table dwd_gen_track_oneapp_log_df drop partition(partition_date = '2019-10-22');
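When partitions are added or dropped from a scheduled script, it can help to generate the statement as text and make it safe to rerun. The sketch below is only an illustration: the helper names add_partition_sql and drop_partition_sql are hypothetical, and it assumes a single string-valued partition column. IF NOT EXISTS and IF EXISTS are standard Hive clauses that turn a repeated add or drop into a no-op instead of an error.

```shell
# Sketch: build idempotent partition DDL as text (helper names are hypothetical).
# $1 = table name, $2 = partition column, $3 = partition value

add_partition_sql() {
  printf "ALTER TABLE %s ADD IF NOT EXISTS PARTITION (%s='%s');\n" "$1" "$2" "$3"
}

drop_partition_sql() {
  printf "ALTER TABLE %s DROP IF EXISTS PARTITION (%s='%s');\n" "$1" "$2" "$3"
}

# Example, assuming a working hive client on the PATH:
# hive -e "$(add_partition_sql dwd_gen_track_oneapp_log_df partition_date 2019-10-24)"
```

Printing the statement first also makes it easy to review what will run before passing it to hive -e.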

4) Kill a YARN application

yarn application -kill application_1571219160975_3186
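The kill command needs the application ID, which usually has to be looked up first. A minimal sketch of finding IDs by application name follows. It assumes that yarn application -list prints tab-separated columns with the ID first and the name second, padded with spaces; that layout may differ between Hadoop versions, so check the output before relying on it. The function name app_ids_by_name is hypothetical.

```shell
# Sketch: read `yarn application -list` output on stdin and print the IDs of
# applications whose name equals $1.
# Assumption: tab-separated columns, ID in column 1, name in column 2.
app_ids_by_name() {
  awk -F'\t' -v name="$1" '
    {
      id = $1; app = $2
      gsub(/^[ ]+|[ ]+$/, "", id)     # strip column padding
      gsub(/^[ ]+|[ ]+$/, "", app)
      if (id ~ /^application_/ && app == name) print id
    }'
}

# Example, assuming a yarn client on the PATH:
# yarn application -list -appStates RUNNING | app_ids_by_name empty_test \
#   | while read -r id; do yarn application -kill "$id"; done
```

Header and summary lines are skipped because their first column does not start with application_.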

5) List a table's partitions in Hive

show partitions table_name;

6) Delete all data in a table

truncate table table_name;

7) Convert an external table to a managed (internal) table

alter table tableA set TBLPROPERTIES('EXTERNAL'='FALSE');

8) Show detailed table structure

desc formatted table_name;

9) Drop a table

DROP TABLE IF EXISTS table_name;

10) Load a local file into Hive

load data local inpath '/222.csv' into table dw_test1;

11) Insert statement (overwrite one partition with rows from another table)

INSERT OVERWRITE table new_table partition(partition_date = '2019-10-24') select * from default.old_table where w_upd_dt < '2019-10-25 00:00:00.0';

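12) Check for misaligned rows (shifted columns)

This returns a row when last_mdf_dt is NULL or empty, which can indicate that columns shifted during loading.

select * from default.table_name where length(nvl(cast(last_mdf_dt as string),''))=0 limit 1;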
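Shifted columns can also be caught before the file is loaded. The sketch below counts fields per line in a delimited text file and reports the line numbers that do not match the expected count. It assumes a comma-delimited file with no quoted commas or embedded newlines; the function name bad_rows is hypothetical.

```shell
# Sketch: print the line numbers of rows whose field count differs from $1.
# $1 = expected number of fields, $2 = path to a comma-delimited file
# Assumption: no quoted commas or embedded newlines in the data.
bad_rows() {
  awk -F',' -v n="$1" 'NF != n { print NR }' "$2"
}

# Example: bad_rows 12 /222.csv
```

Empty output means every line has the expected number of fields.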

13) Show the current database

select current_database();

II. Spark commands

1) spark-shell

cat so_log1.scala | spark-shell \
--name empty_test \
--conf spark.driver.memory=2g \
--conf spark.executor.memory=2g \
--conf spark.executor.cores=4 \
--conf spark.executor.instances=2 \
--conf spark.kryoserializer.buffer.max=512m \
--conf spark.dynamicAllocation.enabled=false \
--conf spark.sql.shuffle.partitions=100 \
--conf spark.driver.maxResultSize=4g \
--conf spark.broadcast.blockSize=100m \
>> so_log1.txt

2) spark-submit

bin/spark2-submit --class WoobooOrderStreamingKafka --master yarn --executor-memory 2G --num-executors 2 --executor-cores 1 /taskJar/spark-wooboo-1.0.jar 2>&1 | tee /submit.log

III. Using tmux

Install tmux: yum install tmux

Create a session: tmux new -s aaa

List sessions: tmux ls

Attach to a session: tmux a -t aaa

Detach from the session: tmux detach

Kill a session: tmux kill-session -t aaa
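Creating and attaching are often combined into one step: attach if the session already exists, otherwise create it. A minimal sketch using the standard tmux has-session command is shown below; the function name tmux_attach_or_new is hypothetical.

```shell
# Sketch: attach to session $1 if it exists, otherwise create it.
tmux_attach_or_new() {
  if tmux has-session -t "$1" 2>/dev/null; then
    tmux attach -t "$1"
  else
    tmux new -s "$1"
  fi
}

# Example: tmux_attach_or_new aaa
```

This avoids the "duplicate session" error from running tmux new -s with a name that is already in use.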

Source: CSDN

Author: Small_temper

Link: https://blog.csdn.net/Small_temper/article/details/103925983
