I have some data:
ID PRICE
1 100
2 200
3 120
4 130
5 320
6 300
7 200
8 100
9 120
10 250
I need to find the top 20% of prices.
Expected output:
ID PRICE
5 320
6 300
You can do this without a join. Use an analytic function to compute max(price), take 80% of it, and then filter on price > that 80% value:
with your_data as ( --this is your data
select stack(10,
1 , 100,
2 , 200,
3 , 120,
4 , 130,
5 , 320,
6 , 300,
7 , 200,
8 , 100,
9 , 120,
10, 250) as (ID, PRICE)
)
select id, price
from
(
select d.*, max(price) over()*0.8 as pct_80 from your_data d
)s where price>pct_80
Result:
OK
id price
6 300
5 320
Use your own table instead of the WITH subquery, and add an ORDER BY id if needed.
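To see why only ids 5 and 6 pass the filter, the arithmetic of the query above can be reproduced outside Hive. This is just a sanity check in plain Python on the sample rows from the question, not a replacement for the Hive query; the variable names are made up for illustration.

```python
# Sample rows from the question: (id, price)
rows = [(1, 100), (2, 200), (3, 120), (4, 130), (5, 320),
        (6, 300), (7, 200), (8, 100), (9, 120), (10, 250)]

# Mirrors max(price) over() * 0.8 in the Hive query
threshold = max(price for _, price in rows) * 0.8

# Mirrors the outer filter: where price > pct_80
top = [(row_id, price) for row_id, price in rows if price > threshold]

print(threshold)
print(top)
```

The maximum price is 320, so the threshold is 256, and 250 (id 10) falls just below it.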
Here is the query:
with top_20 as (
select
max(price)*0.8 as price1
from
<tableName>
)
select * from <tableName> t1 , top_20 t2 where t1.price > t2.price1;
select
id,
price
from
(select
id,
price,
max(price) over () * 0.8 as top_20
from <tableName>
) t1
where
t1.price > t1.top_20;
The following queries will not work in Hive:
select * from <tableName> where price > (select max(price)*0.8 from <tableName>)
select * from <tableName> t1 where exists (select price from <tableName> t2 where t1.price > t2.price*0.8)
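Reason: Hive does not support a subquery in the WHERE clause used with a comparison condition (such as = or >); it supports subqueries there only with IN, NOT IN, EXISTS, and NOT EXISTS.
Even with EXISTS and NOT EXISTS, it supports only equijoin conditions. See https://cwiki.apache.org/confluence/display/Hive/LanguageManual+SubQueries#LanguageManualSubQueries-SubqueriesintheWHEREClause for more details.
Hope this helps.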
Here is a way to do this without using a join.
select id, price
from (
select id, price,
row_number() over (order by price desc) r1,
count(*) over () * (20/100) ct
from table_name
) final
where r1 <= ct;
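Note that this query defines "top 20%" by row count (the highest-priced 20% of rows) rather than by a price threshold. The following plain-Python sketch mirrors that logic on the sample rows from the question, only to make the arithmetic visible; the variable names are illustrative.

```python
# Sample rows from the question: (id, price)
rows = [(1, 100), (2, 200), (3, 120), (4, 130), (5, 320),
        (6, 300), (7, 200), (8, 100), (9, 120), (10, 250)]

# Mirrors count(*) over() * (20/100): 20% of the number of rows
cutoff = len(rows) * (20 / 100)

# Mirrors row_number() over (order by price desc): rank 1 is the highest price
ranked = sorted(rows, key=lambda r: r[1], reverse=True)

# Mirrors the outer filter: where r1 <= ct
top = [(row_id, price)
       for rank, (row_id, price) in enumerate(ranked, start=1)
       if rank <= cutoff]

print(cutoff)
print(top)
```

With 10 rows the cutoff is 2, so the two highest-priced rows (ids 5 and 6) are kept. On this sample that matches the 80%-of-max approach, but in general the two definitions can return different rows.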