Translating a nested FOREACH block from Pig to Spark

I have the following Pig script, which I want to translate to Spark Scala:

FOREACH (GROUP callMetrics BY (datacenter, instance, tag, host_name, db_name, cluster_name, method)) {
                 groupedCounts = FOREACH callMetrics GENERATE
                                    timestamp AS timestamp,
                                    sensor_value AS sensor_value,
                                    last_reset_time AS last_reset_time;

                 GENERATE
                    group.datacenter AS datacenter,
                    group.instance AS instance,
                    group.tag AS tag,
                    group.host_name AS host_name,
                    group.db_name AS db_name,
                    group.cluster_name AS cluster_name,
                    group.method AS method,
                    FLATTEN(udf.compute_qps(groupedCounts)) AS (timestamp, qps);
              };

I tried using groupBy in Spark, but I can't seem to use it without some kind of aggregation.
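One possible approach, sketched under assumptions: on a typed Dataset, groupByKey followed by flatMapGroups hands each group's rows to an ordinary Scala function, so no aggregate expression is needed. The case classes, the field types, and the body of computeQps below are all hypothetical. The logic of udf.compute_qps is not shown in the script, so a simple rate-of-change placeholder stands in for it.

```scala
import org.apache.spark.sql.{Dataset, SparkSession}

// Assumed shape of one callMetrics row; the field types are guesses.
case class CallMetric(
  datacenter: String, instance: String, tag: String, host_name: String,
  db_name: String, cluster_name: String, method: String,
  timestamp: Long, sensor_value: Double, last_reset_time: Long)

// Grouping key, mirroring the tuple in GROUP callMetrics BY (...).
case class MetricKey(
  datacenter: String, instance: String, tag: String, host_name: String,
  db_name: String, cluster_name: String, method: String)

// Output row: the group fields plus the flattened (timestamp, qps) pair.
case class QpsRow(
  datacenter: String, instance: String, tag: String, host_name: String,
  db_name: String, cluster_name: String, method: String,
  timestamp: Long, qps: Double)

object NestedForeach {
  // Hypothetical stand-in for udf.compute_qps: change in sensor_value per
  // unit of timestamp between consecutive samples. last_reset_time is
  // carried along but unused here, because the real UDF's logic is unknown.
  def computeQps(samples: Seq[(Long, Double, Long)]): Seq[(Long, Double)] = {
    val sorted = samples.sortBy(_._1)
    sorted.zip(sorted.drop(1)).collect {
      case ((t0, v0, _), (t1, v1, _)) if t1 > t0 =>
        (t1, (v1 - v0) / (t1 - t0).toDouble)
    }
  }

  def qpsPerGroup(spark: SparkSession,
                  callMetrics: Dataset[CallMetric]): Dataset[QpsRow] = {
    import spark.implicits._ // encoders for the case classes above

    callMetrics
      // Equivalent of GROUP callMetrics BY (...): no aggregation required.
      .groupByKey(m => MetricKey(m.datacenter, m.instance, m.tag,
        m.host_name, m.db_name, m.cluster_name, m.method))
      // Equivalent of the nested FOREACH plus FLATTEN: each group may
      // emit zero or more output rows.
      .flatMapGroups { (k, rows) =>
        // Inner FOREACH ... GENERATE: project the three fields the UDF needs.
        val samples = rows
          .map(m => (m.timestamp, m.sensor_value, m.last_reset_time))
          .toVector
        computeQps(samples).map { case (ts, qps) =>
          QpsRow(k.datacenter, k.instance, k.tag, k.host_name,
            k.db_name, k.cluster_name, k.method, ts, qps)
        }
      }
  }
}
```

Note that flatMapGroups materializes each group's rows on a single executor, much as the Pig bag does, so very large groups can be a memory concern. If staying with untyped DataFrames is preferable, another route is to aggregate with collect_list(struct(...)), apply a UDF to the resulting array, and then explode its result.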

scala apache-spark apache-pig