Join Query in Apache Ignite

Submitted by 萝らか妹 on 2019-12-11 01:08:28

Question


How is a spatial join query executed across the nodes in PARTITIONED mode? Ignite splits the data into partitions (1024 by default) and distributes them among the nodes using the Rendezvous affinity function, so how is the join executed across those partitions? Suppose I have two spatial datasets in caches (pCache and qCache), each containing 10 partitions (1, ..., 10). How does Ignite perform the join on these two datasets? Is partition 1 of pCache joined with partition 1 of qCache?

My second question: how does Ignite perform the same operation in the case of a distributed join?


Answer 1:


There is no correspondence between the partitions of different caches. If you run a join, then by default each node performs only a local lookup against the data it holds. If the data is not collocated, this approach may return a partial result.

When all-to-all mapping is performed, every node has to communicate with every other node, so on the order of N^2 messages are sent in the cluster in total, where N is the number of nodes. This is called a distributed join, and it affects performance significantly. It can be enabled either in the connection string when using the JDBC driver, or by calling the SqlFieldsQuery#setDistributedJoins(...) method when using the cache query API.
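
To make the difference between a local-only join and an all-to-all join concrete, here is a toy model in plain Java. It does not use the Ignite API at all: the class name JoinSimulation, the sample data, and the "key mod node count" placement rule are assumptions made up for illustration (Ignite's real affinity function works differently). It only shows why a local lookup can miss rows when the two caches are placed independently, and how the number of node-to-node requests grows with the cluster size.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model only: plain Java maps stand in for caches; no Ignite API is used.
class JoinSimulation {
    // Assumed placement rule (hypothetical): a key lives on node (key mod nodeCount).
    static int nodeFor(int key, int nodeCount) {
        return Math.floorMod(key, nodeCount);
    }

    // Sample pCache: pKey -> the qKey this entry joins to.
    static Map<Integer, Integer> sampleP() {
        Map<Integer, Integer> p = new LinkedHashMap<>();
        p.put(1, 1);
        p.put(2, 5);
        p.put(3, 2);
        p.put(4, 8);
        return p;
    }

    // Sample qCache: qKey -> payload.
    static Map<Integer, String> sampleQ() {
        Map<Integer, String> q = new LinkedHashMap<>();
        q.put(1, "a");
        q.put(2, "b");
        q.put(5, "c");
        q.put(8, "d");
        return q;
    }

    // Local-only join: a pair is found only if both entries sit on the same node.
    static int localJoinRows(Map<Integer, Integer> p, Map<Integer, String> q, int nodeCount) {
        int rows = 0;
        for (Map.Entry<Integer, Integer> e : p.entrySet()) {
            int pNode = nodeFor(e.getKey(), nodeCount);
            int qKey = e.getValue();
            if (q.containsKey(qKey) && nodeFor(qKey, nodeCount) == pNode) {
                rows++;
            }
        }
        return rows;
    }

    // All-to-all join: every node can see q entries on every other node.
    static int distributedJoinRows(Map<Integer, Integer> p, Map<Integer, String> q) {
        int rows = 0;
        for (int qKey : p.values()) {
            if (q.containsKey(qKey)) {
                rows++;
            }
        }
        return rows;
    }

    // One request from each node to each other node: N * (N - 1), on the order of N^2.
    static int requestMessages(int nodeCount) {
        return nodeCount * (nodeCount - 1);
    }

    public static void main(String[] args) {
        int nodes = 4;
        System.out.println("local rows: " + localJoinRows(sampleP(), sampleQ(), nodes));
        System.out.println("distributed rows: " + distributedJoinRows(sampleP(), sampleQ()));
        System.out.println("request messages: " + requestMessages(nodes));
    }
}
```

With this sample data and 4 nodes, the local-only join finds 2 of the 4 matching rows, while the all-to-all variant finds all 4 at the cost of 12 node-to-node requests.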

The recommended way to do joins is to collocate the data so that no distributed joins are needed. Ignite has a feature called affinity collocation, designed specifically for this purpose. You can specify a field of an object that is passed to the affinity function to determine where the entry is stored. The value of this field doesn't have to be unique, but the field must be part of the key. So, if you want to join two tables, you can collocate them by affinity, and no distributed joins will be needed.
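
The idea of an affinity field can be sketched in plain Java as well. Again this is a toy model, not Ignite code: the class CollocationSketch, the composite key PKey, and the "mod node count" placement rule are hypothetical names and assumptions. In Ignite itself, the corresponding field of the key class is typically marked with the @AffinityKeyMapped annotation. The sketch places every pCache entry by the qId field of its key rather than by its own id, so each entry lands on the same node as the qCache entry it joins to.

```java
// Toy model only: no Ignite API is used; placement is simulated.
class CollocationSketch {
    // Hypothetical composite key: pId identifies the row, qId is the affinity field.
    static final class PKey {
        final int pId;
        final int qId; // part of the key; need not be unique across rows

        PKey(int pId, int qId) {
            this.pId = pId;
            this.qId = qId;
        }
    }

    // Assumed placement rule (hypothetical): node = affinity value mod nodeCount.
    static int nodeFor(int affinityValue, int nodeCount) {
        return Math.floorMod(affinityValue, nodeCount);
    }

    // pCache entries are placed by the affinity field, not by pId.
    static int nodeOfP(PKey key, int nodeCount) {
        return nodeFor(key.qId, nodeCount);
    }

    // qCache entries are placed by their own key.
    static int nodeOfQ(int qId, int nodeCount) {
        return nodeFor(qId, nodeCount);
    }

    // Two rows share qId 5, showing that the affinity value may repeat.
    static PKey[] samplePKeys() {
        return new PKey[] {
            new PKey(1, 1), new PKey(2, 5), new PKey(3, 2), new PKey(4, 8), new PKey(5, 5)
        };
    }

    static int[] sampleQKeys() {
        return new int[] {1, 2, 5, 8};
    }

    // Local-only join: count pairs that match AND sit on the same node.
    static int localJoinRows(PKey[] pKeys, int[] qKeys, int nodeCount) {
        int rows = 0;
        for (PKey p : pKeys) {
            for (int q : qKeys) {
                if (p.qId == q && nodeOfP(p, nodeCount) == nodeOfQ(q, nodeCount)) {
                    rows++;
                }
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        System.out.println("local rows: " + localJoinRows(samplePKeys(), sampleQKeys(), 4));
    }
}
```

Because matching entries always share a node in this model, the local-only join returns all 5 matching rows for the sample data, and no cross-node lookup is needed.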



Source: https://stackoverflow.com/questions/52602842/join-query-in-apache-ignite
