Impala, Hive, Presto, Spark のJOINに関するメモ

SQL on Hadoop でのJOIN時のテーブルの指定順序と挙動

結合の種類にもよりますが、複数のテーブルを結合する際、一つのテーブルを全てのワーカーノードに転送して行う場合があります。(Broadcast JOIN)
もちろん、統計情報を取得しているかにより挙動は異なります

いくつかのクエリエンジンの挙動を調査しました。

Impala

Impalaでは、最初に最も大きなテーブル、次に最も小さなテーブル、の順序で記述する必要があります
https://docs.cloudera.com/runtime/7.2.0/impala-reference/topics/impala-perf-joins.html

Specify the largest table first. This table is read from disk by each Impala node and so its size is not significant in terms of memory usage during the query.

2つ目以降に記述したテーブルはネットワークを介して各Impalaノードに転送される

Hive

LanguageManual Joins - Apache Hive - Apache Software Foundation

In every map/reduce stage of the join, the last table in the sequence is streamed through the reducers where as the others are buffered. Therefore, it helps to reduce the memory needed in the reducer for buffering the rows for a particular value of the join key by organizing the tables such that the largest tables appear last in the sequence. e.g. in

一番最後のテーブルがストリームでreducerに転送される

Presto

4.3. Properties Reference — Presto 0.244 Documentation

The type of distributed join to use. When set to PARTITIONED, presto will use hash distributed joins. When set to BROADCAST, it will broadcast the right table to all nodes in the cluster that have data from the left table. Partitioned joins require redistributing both tables using a hash of the join key. This can be slower (sometimes substantially) than broadcast joins, but allows much larger joins.

BROADCASTの場合、左側のテーブルをクラスターの全てのノードに転送する

SparkSQL

Performance Tuning - Spark 3.0.1 Documentation

The join strategy hints, namely BROADCAST, MERGE, SHUFFLE_HASH and SHUFFLE_REPLICATE_NL, instruct Spark to use the hinted strategy on each specified relation when joining them with another relation. For example, when the BROADCAST hint is used on table ‘t1’, broadcast join (either broadcast hash join or broadcast nested loop join depending on whether there is any equi-join key) with ‘t1’ as the build side will be prioritized by Spark even if the size of table ‘t1’ suggested by the statistics is above the configuration spark.sql.autoBroadcastJoinThreshold.

結合時にヒントを渡せることは書かれているが、指定の順序に関する明確な記述は発見できず。

見つけたら追記しますが、ご存知の方がいらっしゃれば教えてください。

コメント