Does Hive preserve file order when selecting data?

Submitted by 不打扰是莪最后的温柔 on 2021-02-19 04:05:44

Question


If I run select * from table1; in which order will the data be retrieved: file order or random order?


Answer 1:


Without ORDER BY the order is not guaranteed.

Data is read in parallel by many processes (mappers). After the splits are calculated, each process starts reading a piece of a file or several files, depending on how the splits were calculated.

The parallel processes may handle different volumes of data and run on different nodes, and the load is not the same each time. As a result they start returning rows and finish at different times, depending on many factors such as node load, network load, and the volume of data per process.

By removing all of these factors you can make the order more predictable. For example, a single-threaded sequential read of a file may return rows in the same order as they appear in the file. But this is not how the database works.
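The effect described above can be illustrated with a small simulation. This is only a sketch in Python, not Hive code: the split contents, the finish times, and the helper name collect_rows are all made up for illustration, and real mappers are of course far more complex. It only shows that when rows arrive in mapper-completion order, the same data can come back in different orders between runs, while an explicit sort (the analogue of ORDER BY) gives a stable result.

```python
from typing import List

# Hypothetical file contents: each "split" is a chunk of rows in file order.
SPLITS: List[List[int]] = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]


def collect_rows(splits: List[List[int]], finish_times: List[float]) -> List[int]:
    """Return rows in the order their simulated mappers finish, not in file order."""
    if len(splits) != len(finish_times):
        raise ValueError("need one finish time per split")
    # Sort split indices by the time each simulated mapper finishes.
    order = sorted(range(len(splits)), key=lambda i: finish_times[i])
    rows: List[int] = []
    for i in order:
        rows.extend(splits[i])
    return rows


# Two runs over the same data with different (made-up) mapper timings.
run_a = collect_rows(SPLITS, finish_times=[0.3, 0.1, 0.2])
run_b = collect_rows(SPLITS, finish_times=[0.1, 0.2, 0.3])

print(run_a)  # [4, 5, 6, 7, 8, 9, 1, 2, 3]
print(run_b)  # [1, 2, 3, 4, 5, 6, 7, 8, 9]

# An explicit sort, like ORDER BY, makes the result independent of timing.
print(sorted(run_a) == sorted(run_b))  # True
```

Here run_b happens to match file order only because the simulated mappers finished in that order, which is roughly the single-reader case; nothing guarantees it on a real cluster.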

Also, according to Codd's relational theory, the order of columns and rows is immaterial.



Source: https://stackoverflow.com/questions/56678834/does-hive-preserve-file-order-when-selecting-data
