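ap operation first converts each Tuple from the (index, key, value) form to the (key, value) form, then wraps the resulting collection of RDDs in a new CoGroupRDD, which involves one (Seq) JavaConversions.asScalaBuffer(rddPairs) conversion. Finally, it calls the map method of the CoGroupRDD, turning the Tuple2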
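The re-shaping from (index, key, value) to (key, value) can be illustrated with a minimal sketch in plain Java. It uses neither Spark nor Pig: a tuple is modelled as a `List<Object>`, and the names `RearrangeSketch`, `dropIndex` and `dropIndexAll` are hypothetical, introduced only for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class RearrangeSketch {
    // Assumption: a tuple is a plain List<Object>, not Pig's Tuple class.
    // Input layout: (index, key, value). Output layout: (key, value).
    static List<Object> dropIndex(List<Object> indexed) {
        return Arrays.asList(indexed.get(1), indexed.get(2));
    }

    // Apply the same re-shaping to every tuple of one input.
    static List<List<Object>> dropIndexAll(List<List<Object>> tuples) {
        List<List<Object>> pairs = new ArrayList<>();
        for (List<Object> t : tuples) {
            pairs.add(dropIndex(t));
        }
        return pairs;
    }
}
```

In the real code this step would run as a map over each RDD before the RDDs are handed to the CoGroupRDD; the sketch only shows the per-tuple transformation.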
Package
Package has to group the key, Seq produced by global rearrange. The Tuple to be processed has this structure: (key, Seq:{(index, key, value without key)})
tuple.get(0) is the keyTuple and tuple.get(1) is an Iterator; the final result is (key, {values}), that is, a Tuple
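The Package step can be sketched roughly as follows, again in plain Java without Pig or Spark. The sketch assumes tuples are modelled as `List<Object>`, that field 2 of each inner tuple holds the value, and that packaging simply collects those values into one list. The names `PackageSketch` and `packageTuple` are hypothetical, and the real operator may do more, for example keeping the values of different inputs apart by index.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

final class PackageSketch {
    // Assumption: tuples are plain List<Object> values, not Pig's Tuple class.
    // Input:  (key, iterator over (index, key, value without key))
    // Output: (key, {values})
    static List<Object> packageTuple(List<Object> tuple) {
        Object key = tuple.get(0);
        @SuppressWarnings("unchecked")
        Iterator<List<Object>> rows = (Iterator<List<Object>>) tuple.get(1);

        List<Object> values = new ArrayList<>();
        while (rows.hasNext()) {
            // Keep only the value field; index and key are dropped.
            values.add(rows.next().get(2));
        }
        return Arrays.asList(key, values);
    }
}
```

Passing in a tuple whose second field iterates over (0, k, a) and (1, k, b) would give back (k, {a, b}) under these assumptions.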
End of article :)