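I have a list of flights with the following attributes:
I want to find the number of flights operated by each airline. To do that, I need to find, for each day, the distinct combinations of flight_number, origin_airport, dest_airport and carrier_code, because one of the conditions is: "An aircraft may be scheduled to fly from A to B and then from B to C under the same flight number, but we consider these two trips to be two separate flights."
This is what I have so far, which does not run: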
desiredattributes = FOREACH jnd GENERATE day, flight_number, origin_airport_id, dest_airport_id, carrier_code;
distinctflights = FOREACH (GROUP desiredattributes BY day)
{
a = carriers.(carrier_code, flight_number, origin_airport_id, dest_airport_id);
b = DISTINCT a;
};
DUMP distinctflights;
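Any help or guidance is appreciated! I am new to Pig.
You should project the fields from desiredattributes rather than carriers, since the bag produced by GROUP is named after the relation that was grouped.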
-- Keep only the fields that identify a flight leg on a given day.
desiredattributes = FOREACH jnd GENERATE day, flight_number, origin_airport_id, dest_airport_id, carrier_code;
dayflights = GROUP desiredattributes BY day;
-- Project the fields together and FLATTEN the bag so each flight becomes one row;
-- projecting them one by one would yield four separate bags per day.
alldayflights = FOREACH dayflights GENERATE group AS day, FLATTEN(desiredattributes.(flight_number, origin_airport_id, dest_airport_id, carrier_code));
distinctflights = DISTINCT alldayflights;
DUMP distinctflights;
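Since every tuple in desiredattributes already carries day, grouping by day may not be needed at all: DISTINCT over the whole relation removes duplicates within each day, and the per-airline count asked for in the question is then one more GROUP. The following is only a minimal sketch, assuming jnd exposes the fields used in the question; the aliases bycarrier and flightcounts and the field name num_flights are illustrative and not from the original post.

```pig
-- Assumes jnd is already defined with these fields, as in the question.
desiredattributes = FOREACH jnd GENERATE day, flight_number, origin_airport_id, dest_airport_id, carrier_code;

-- day is part of each tuple, so DISTINCT deduplicates per day.
-- Legs A->B and B->C differ in origin/destination, so both are kept.
distinctflights = DISTINCT desiredattributes;

-- Count the distinct flights for each carrier (hypothetical aliases).
bycarrier = GROUP distinctflights BY carrier_code;
flightcounts = FOREACH bycarrier GENERATE group AS carrier_code, COUNT(distinctflights) AS num_flights;

DUMP flightcounts;
```

If a per-day breakdown is wanted instead, grouping by (carrier_code, day) in the second step should give one count per airline per day.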