I want to use Apache Spark and connect to Vertica over JDBC.
The Vertica database holds 100 million records, and the Spark code runs on another server.
When I run
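For reference, a minimal Spark-to-Vertica JDBC read (e.g. in spark-shell) looks roughly like the sketch below; the host, database, table name, and credentials are placeholders:

import org.apache.spark.sql.SparkSession

// Illustrative only: host, database, table, and credentials are placeholders.
val spark = SparkSession.builder().appName("vertica-jdbc-example").getOrCreate()

val df = spark.read
  .format("jdbc")
  .option("url", "jdbc:vertica://vertica-host:5433/mydb")   // Vertica JDBC URL
  .option("driver", "com.vertica.jdbc.Driver")              // Vertica JDBC driver class
  .option("dbtable", "public.big_table")
  .option("user", "dbuser")
  .option("password", "dbpassword")
  .load()

// Whether an action such as df.count() is pushed down to Vertica or instead
// pulls rows over the network depends on the Spark version and data source.
println(df.count())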
After your Spark job finishes, log on to the Vertica database using the same credentials that the Spark job used and run:
SELECT * FROM v_monitor.query_requests ORDER BY start_timestamp DESC LIMIT 10000;
This will show you the queries the Spark job sent to the database, so you can see whether it pushed the count(*) down to Vertica or instead tried to pull the entire table across the network.
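If the log shows full-table reads rather than a pushed-down aggregate, one common workaround is to wrap the count in the JDBC subquery so Vertica computes it. A minimal sketch, reusing the same placeholder connection details as above:

import org.apache.spark.sql.SparkSession

// Sketch only: push the COUNT(*) into Vertica via a JDBC subquery, so the
// database computes the aggregate and only a single row crosses the network.
// Host, database, table, and credentials are placeholders.
val spark = SparkSession.builder().appName("vertica-count-pushdown").getOrCreate()

val countDf = spark.read
  .format("jdbc")
  .option("url", "jdbc:vertica://vertica-host:5433/mydb")
  .option("driver", "com.vertica.jdbc.Driver")
  .option("dbtable", "(SELECT COUNT(*) AS cnt FROM public.big_table) AS t")
  .option("user", "dbuser")
  .option("password", "dbpassword")
  .load()

countDf.show()  // one row: the count computed inside Vertica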