BigQuery: updating a table using COALESCE

Question (votes: 0, answers: 1)

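I'm trying to follow a data-cleaning tutorial on YouTube. The instructor uses SQL (SQL Server syntax) and I'm using BigQuery, because installing SQL on a Mac is too much of a hassle and I want to learn how to adapt the SQL syntax to BigQuery.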

When the tutorial shows how to populate the property address data, the SQL is:

Select *
From PortfolioProject.dbo.NashvilleHousing
--Where PropertyAddress is null
order by ParcelID



Select a.ParcelID, a.PropertyAddress, b.ParcelID, b.PropertyAddress, ISNULL(a.PropertyAddress,b.PropertyAddress)
From PortfolioProject.dbo.NashvilleHousing a
JOIN PortfolioProject.dbo.NashvilleHousing b
    on a.ParcelID = b.ParcelID
    AND a.[UniqueID ] <> b.[UniqueID ]
Where a.PropertyAddress is null


Update a
SET PropertyAddress = ISNULL(a.PropertyAddress,b.PropertyAddress)
From PortfolioProject.dbo.NashvilleHousing a
JOIN PortfolioProject.dbo.NashvilleHousing b
    on a.ParcelID = b.ParcelID
    AND a.[UniqueID ] <> b.[UniqueID ]
Where a.PropertyAddress is null

I have managed to adapt part of it to BigQuery syntax:

Select *
From `sturdy-filament-415311.NashvilleHousing.NHData`
order by ParcelID
;

Select a.ParcelID, a.PropertyAddress, b.ParcelID, b.PropertyAddress, COALESCE(a.PropertyAddress,b.PropertyAddress)
From `sturdy-filament-415311.NashvilleHousing.NHData` a
JOIN `sturdy-filament-415311.NashvilleHousing.NHData` b
  on a.ParcelID = b.ParcelID
  AND a.UniqueID_ <> b.UniqueID_
Where a.PropertyAddress is null
;

But for some reason I get an error when I run the following query:

Update a
SET PropertyAddress = COALESCE(a.PropertyAddress,b.PropertyAddress)
From sturdy-filament-415311.NashvilleHousing.NHData a
JOIN sturdy-filament-415311.NashvilleHousing.NHData b
    on a.ParcelID = b.ParcelID
    AND a.UniqueID_ <> b.UniqueID_ 
Where a.PropertyAddress is null

I was told that table "a" must be qualified with a dataset (e.g. dataset.table).

I then tried to work around it with this:

Update sturdy-filament-415311.NashvilleHousing.NHData
SET PropertyAddress = COALESCE(a.PropertyAddress,b.PropertyAddress)
From sturdy-filament-415311.NashvilleHousing.NHData a
JOIN sturdy-filament-415311.NashvilleHousing.NHData b
    on a.ParcelID = b.ParcelID
    AND a.UniqueID_ <> b.UniqueID_ 
Where a.PropertyAddress is null

and now the error is: UPDATE/MERGE must match at most one source row for each target row

How can I solve this?

sql google-bigquery merge syntax sql-update
1 answer
0 votes

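To avoid listing the table twice in the FROM clause, consider aliasing the UPDATE target directly and dropping the JOIN, and use backticks consistently, because the hyphen (-) in the table name can raise a syntax error. Also, heed the "bad habits to kick" advice about table aliases like (a, b, c) or (t1, t2, t3) and use more informative aliases: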

UPDATE `sturdy-filament-415311.NashvilleHousing.NHData` nh1
SET nh1.PropertyAddress = COALESCE(nh1.PropertyAddress, nh2.PropertyAddress)
FROM `sturdy-filament-415311.NashvilleHousing.NHData` nh2
WHERE nh1.ParcelID = nh2.ParcelID
  AND nh1.UniqueID_ <> nh2.UniqueID_ 
  AND nh1.PropertyAddress IS NULL
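
If a ParcelID has more than one other row, an UPDATE ... FROM self-join can still match several source rows for a single target row, which is what the "UPDATE/MERGE must match at most one source row for each target row" error reports. One possible way around it, as a minimal sketch, is to collapse the source to a single address per ParcelID before joining. The aliases target and source, and the use of MAX as a tie-breaker, are assumptions made for illustration and are not part of the original question; MAX simply picks one address deterministically when a parcel has several different ones.

```sql
-- Sketch: fill NULL addresses from another row with the same ParcelID,
-- guaranteeing at most one source row per target row.
UPDATE `sturdy-filament-415311.NashvilleHousing.NHData` AS target
SET PropertyAddress = source.PropertyAddress
FROM (
  -- One row per ParcelID; MAX is an assumed tie-breaker, not a requirement.
  SELECT
    ParcelID,
    MAX(PropertyAddress) AS PropertyAddress
  FROM `sturdy-filament-415311.NashvilleHousing.NHData`
  WHERE PropertyAddress IS NOT NULL
  GROUP BY ParcelID
) AS source
WHERE target.ParcelID = source.ParcelID
  AND target.PropertyAddress IS NULL;
```

Because the subquery keeps only non-null addresses and the outer WHERE keeps only rows whose address is NULL, COALESCE is not needed in the SET clause of this variant.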