Removing leading zeros with HiveQL

Question (votes: 0, answers: 3)

I have a string value that may contain leading zeros, and I want to remove all of them.

For example:

accNumber = "000340" ---> "340"

Is there a UDF available in Hive for this? Can we use regexp_extract to do it?

hive hiveql
3 Answers
7
votes

Yes, just use REGEXP_REPLACE():

SELECT some_string,
   REGEXP_REPLACE(some_string, "^0+", '') stripped_string
FROM db.tbl

(Edited to fix a simple typo involving a comma.)
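The question also asked about regexp_extract. As a minimal sketch, the query below applies both functions to the literal from the question, so it runs without any table. The regexp_extract pattern `^0*(.*)$` is my own illustration and is not taken from the answer above: it skips any leading zeros and returns capture group 1, the rest of the string.

```sql
-- Sketch only: both expressions strip the leading zeros from '000340'.
SELECT
    regexp_replace('000340', '^0+', '')      AS via_regexp_replace,  -- '340'
    regexp_extract('000340', '^0*(.*)$', 1)  AS via_regexp_extract;  -- '340'
```

Note that both variants return an empty string for an input made up entirely of zeros, such as '0000'.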


1
vote

You can also use a double cast:

SELECT cast(cast("000340" as INT) as STRING) col_without_leading_zeroes 
  FROM db.table;

Output: 340 (the data type will be STRING)

Hope this helps.
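One caveat with the cast approach: INT tops out at 2147483647, so a longer account number will not fit, and as far as I know Hive returns NULL for such a cast rather than raising an error. A hedged sketch, assuming the values are purely numeric, is to cast through a wider type instead. The choice of BIGINT and DECIMAL(38,0) here is my assumption, not part of the original answer.

```sql
-- Sketch only: same double-cast idea, with wider numeric types than INT.
SELECT
    cast(cast('000340' AS BIGINT) AS STRING)         AS via_bigint,
    cast(cast('000340' AS DECIMAL(38,0)) AS STRING)  AS via_decimal;
```

If the values can be longer than 38 digits or contain non-digits, a regex-based approach avoids numeric limits entirely.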


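0
votes

If you need to preserve the following, you can use a better regular expression:

  • a single zero (so that '0' or '0000' becomes '0' rather than an empty string)
  • invalid numbers (non-numeric values pass through unchanged)

^0+(?!$)
=> the negative lookahead (?!$) stops the match from consuming the whole string, so at least one character always remains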

with
cte_data_test as (
    select '0123'               as txt
    union all
    select '00123'              as txt
    union all
    select '0'                  as txt
    union all
    select '0000'               as txt
    union all
    select cast(null as string) as txt
    union all
    select 'bad_number'         as txt
)
select
    txt,
    regexp_replace(txt,'^0+(?!$)','')  as using_regexp_replace_a,
    regexp_replace(txt, "^0+", '')     as using_regexp_replace_b,
    cast(cast(txt as INT) as STRING)   as using_cast
from 
    cte_data_test
;

This produces:

+-------------+-------------------------+-------------------------+-------------+
|     txt     | using_regexp_replace_a  | using_regexp_replace_b  | using_cast  |
+-------------+-------------------------+-------------------------+-------------+
| 0123        | 123                     | 123                     | 123         |
| 00123       | 123                     | 123                     | 123         |
| 0           | 0                       |                         | 0           |
| 0000        | 0                       |                         | 0           |
| NULL        | NULL                    | NULL                    | NULL        |
| bad_number  | bad_number              | bad_number              | NULL        |
+-------------+-------------------------+-------------------------+-------------+
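One case the table does not cover is a value that starts with zeros but is not a number. Pattern `^0+(?!$)` would still strip those zeros, for example turning '00abc' into 'abc'. If such values should be left alone, a possible sketch is to guard the replacement with RLIKE. The test value '00abc' and the column name stripped are hypothetical, and whether to return the original text or NULL for invalid input is a design choice.

```sql
-- Sketch only: strip leading zeros from all-digit strings, leave everything else as-is.
WITH cte_data_test AS (
    SELECT '00123'      AS txt
    UNION ALL
    SELECT '0000'       AS txt
    UNION ALL
    SELECT '00abc'      AS txt   -- hypothetical non-numeric value with leading zeros
    UNION ALL
    SELECT 'bad_number' AS txt
)
SELECT
    txt,
    CASE
        -- RLIKE matches on any substring, so the pattern is anchored at both ends
        WHEN txt RLIKE '^[0-9]+$' THEN regexp_replace(txt, '^0+(?!$)', '')
        ELSE txt                 -- replace with NULL to reject invalid input instead
    END AS stripped
FROM
    cte_data_test
;
```

A NULL input falls through to the ELSE branch and stays NULL, which matches the behavior of the other approaches.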