:: @DeveloperApi ::
Stores the partition index and references to the left and right partitions to be joined.
The partition index
The left Dependency to be used in the join
The right Dependency to be used in the join
:: @DeveloperApi ::
RDD implementation for merge-join that uses a shuffle to partition and sort by keys using an implicit Ordering for K, and then delegates to an instance of MergeJoin to perform the actual merge logic.
An optimization avoids the shuffle in some cases where left or right is already guaranteed to be partition-sorted (e.g., via repartitionAndSortWithinPartitions).
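The merge step that runs after both sides are sorted by key can be illustrated with a minimal sketch in plain Scala. This is not the library's MergeJoin class: the object name MergeJoinSketch and the helper innerMergeJoin are hypothetical, and the sketch assumes both iterators are already sorted under the same Ordering. It also buffers each group of equal keys in memory, which a production implementation may avoid.

```scala
object MergeJoinSketch {
  // Inner merge-join of two iterators that are each already sorted by key.
  // Hypothetical helper for illustration only; not the library's MergeJoin.
  def innerMergeJoin[K, A, B](left: Iterator[(K, A)], right: Iterator[(K, B)])
                             (implicit ord: Ordering[K]): Iterator[(K, (A, B))] = {
    val l = left.buffered
    val r = right.buffered

    // Consume and return all consecutive values that share the given key.
    def takeGroup[V](it: BufferedIterator[(K, V)], key: K): Vector[V] = {
      val b = Vector.newBuilder[V]
      while (it.hasNext && ord.equiv(it.head._1, key)) b += it.next()._2
      b.result()
    }

    new Iterator[(K, (A, B))] {
      // Rows produced for the most recently matched key, not yet emitted.
      private var pending: Iterator[(K, (A, B))] = Iterator.empty

      // Skip the side with the smaller key until the keys match
      // or one side is exhausted.
      private def advance(): Unit = {
        while (!pending.hasNext && l.hasNext && r.hasNext) {
          val c = ord.compare(l.head._1, r.head._1)
          if (c < 0) l.next()
          else if (c > 0) r.next()
          else {
            val key = l.head._1
            val ls = takeGroup(l, key)
            val rs = takeGroup(r, key)
            // Equal keys yield the cross product of both groups.
            pending = for (a <- ls.iterator; b <- rs.iterator) yield (key, (a, b))
          }
        }
      }

      def hasNext: Boolean = { advance(); pending.hasNext }
      def next(): (K, (A, B)) = { advance(); pending.next() }
    }
  }
}
```

Because each side is read strictly in key order, the join needs only a single pass over both inputs, which is why sorting within partitions up front makes the merge itself cheap.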
Merge-join operators that provide scalable equivalents to the existing Spark RDD join, leftOuterJoin, rightOuterJoin, and fullOuterJoin operators.
Refer to the documentation for MergeJoin for implementation details.
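To make the relationship between the four join flavours concrete, the following plain-Scala sketch performs a full outer merge over two key-sorted sequences and derives a left outer result from it. The result shapes mirror those of Spark's fullOuterJoin and leftOuterJoin; the names OuterMergeSketch, fullOuter, and leftOuter are hypothetical and are not part of this library's API.

```scala
object OuterMergeSketch {
  // Full outer merge of two key-sorted sequences (illustrative helper).
  def fullOuter[K, A, B](left: Seq[(K, A)], right: Seq[(K, B)])
                        (implicit ord: Ordering[K]): List[(K, (Option[A], Option[B]))] = {
    type Row = (K, (Option[A], Option[B]))

    @annotation.tailrec
    def loop(l: List[(K, A)], r: List[(K, B)], acc: List[Row]): List[Row] =
      (l, r) match {
        case (Nil, Nil) => acc.reverse
        // One side exhausted: emit the remainder of the other side unmatched.
        case ((k, a) :: lt, Nil) => loop(lt, Nil, (k, (Some(a), None)) :: acc)
        case (Nil, (k, b) :: rt) => loop(Nil, rt, (k, (None, Some(b))) :: acc)
        // Smaller key has no partner on the other side.
        case ((lk, a) :: lt, (rk, _) :: _) if ord.lt(lk, rk) =>
          loop(lt, r, (lk, (Some(a), None)) :: acc)
        case ((lk, _) :: _, (rk, b) :: rt) if ord.gt(lk, rk) =>
          loop(l, rt, (rk, (None, Some(b))) :: acc)
        // Keys are equal: emit the cross product of both key groups.
        case ((k, _) :: _, _) =>
          val (ls, lt) = l.span(e => ord.equiv(e._1, k))
          val (rs, rt) = r.span(e => ord.equiv(e._1, k))
          val rows: List[Row] =
            for (a <- ls.map(_._2); b <- rs.map(_._2)) yield (k, (Some(a), Some(b)))
          loop(lt, rt, rows.reverse ::: acc)
      }

    loop(left.toList, right.toList, Nil)
  }

  // Left outer join expressed as a filter over the full outer result.
  def leftOuter[K: Ordering, A, B](left: Seq[(K, A)], right: Seq[(K, B)]): List[(K, (A, Option[B]))] =
    fullOuter(left, right).collect { case (k, (Some(a), b)) => (k, (a, b)) }
}
```

The right outer and inner variants follow the same pattern, keeping only rows whose right side (or both sides) is defined.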
Spark merge-join capability for RDDs. See com.hindog.spark.rdd.PairRDDFunctions for further documentation.