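How to count the amount of data in each column of a table

Question (votes: 2, answers: 2)

I have a table in SQL Server with 490 columns, and today I need to add some more. An API populates this table from an external system, and a sync currently takes about 16 hours because the table has roughly 550,000 rows. I need to count how many rows actually use each column, to see whether there are any columns I can remove.

I have researched this for a while and am posting here as a last resort. I have tried a few different approaches, but nothing quite does what I need. I know I could go through and run COUNT(column) for each one, but with 490 columns that is not really feasible.

So I am currently using the sys.columns table to get the list of columns in the table, and then an OUTER APPLY against the table with COUNT(*). This sort of works, but obviously it just returns the table's total row count again on every row.

I thought I needed to replace COUNT(*) with COUNT(sys.columns.name), but that does not work either; it returns the error "Aggregates on the right side of an APPLY cannot reference columns from the left side."

The code I think comes closest is below, but I may be a million miles off.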

 SELECT

  name as 'Column',
  Counter.total   

 FROM sys.columns WITH (NOLOCK)

 OUTER APPLY
 (
    SELECT TOP 1
        COUNT(*) as total
    FROM lead WITH (nolock) 
 ) as Counter

 WHERE sys.columns.object_id = 544720993

This returns the following:

Column    |     total
______________________
Column1   |       512345
Column2   |       512345
Column3   |       512345
Column4   |       512345
Column5   |       512345

However, in an ideal world I would like the following:

Column    |     total
______________________
Column1   |      512345 --(meaning no nulls in this column)
Column2   |      435765 --(meaning some nulls in this column)
Column3   |      123423
Column4   |      76 --(meaning only 76 non-nulls in this column)
Column5   |      0 --(meaning every row is null in this column)

Thanks for your time!

sql sql-server
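2 Answers
1
vote

You can use a cursor with dynamic SQL that runs a COUNT check for each column and inserts the result into a temporary table.

You control which schema, table, and columns are checked through the cursor's SELECT.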

IF OBJECT_ID('tempdb..#ColumnResults') IS NOT NULL
    DROP TABLE #ColumnResults

CREATE TABLE #ColumnResults (
    SchemaName NVARCHAR(258),
    TableName NVARCHAR(258),
    ColumnName NVARCHAR(258),
    TotalRows INT,
    NotNullAmount INT)


-- QUOTENAME can return up to 258 characters, so size the variables to match
DECLARE @SchemaName NVARCHAR(258)
DECLARE @TableName NVARCHAR(258)
DECLARE @ColumnName NVARCHAR(258)

DECLARE ColumnCursor CURSOR FOR
    SELECT
        QUOTENAME(T.TABLE_SCHEMA),
        QUOTENAME(T.TABLE_NAME),
        QUOTENAME(T.COLUMN_NAME)
    FROM
        INFORMATION_SCHEMA.COLUMNS AS T
    WHERE
        T.TABLE_NAME = 'YourTableName' AND      -- Filter here the table you want to check
        T.TABLE_SCHEMA = 'YourTableSchema'      -- Filter here the schema you want to check
    ORDER BY
        T.TABLE_SCHEMA,
        T.TABLE_NAME,
        T.COLUMN_NAME

OPEN ColumnCursor
FETCH NEXT FROM ColumnCursor INTO 
    @SchemaName, 
    @TableName,
    @ColumnName

WHILE @@FETCH_STATUS = 0
BEGIN

    DECLARE @DynamicSQL VARCHAR(MAX) = '
        INSERT INTO #ColumnResults (
            SchemaName,
            TableName,
            ColumnName,
            TotalRows,
            NotNullAmount)
        SELECT
            SchemaName = ''' + @SchemaName + ''',
            TableName = ''' + @TableName + ''',
            ColumnName = ''' + @ColumnName + ''',
            TotalRows = COUNT(1),
            NotNullAmount = COUNT(' + @ColumnName + ')
        FROM
            ' + @SchemaName + '.' + @TableName + ' AS T'

    -- PRINT (@DynamicSQL)
    EXEC (@DynamicSQL)

    FETCH NEXT FROM ColumnCursor INTO 
        @SchemaName, 
        @TableName,
        @ColumnName

END

CLOSE ColumnCursor
DEALLOCATE ColumnCursor


SELECT
    C.*
FROM
    #ColumnResults AS C
ORDER BY
    C.SchemaName,
    C.TableName,
    C.ColumnName

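You can comment out the EXEC and uncomment the PRINT to inspect the generated dynamic SQL before executing it.

Note that this actually executes one SELECT per column rather than a single SELECT covering all columns of the table. You could tweak the dynamic SQL so that it checks all columns of a table in one pass, but I find this approach cleaner, and it works the same way across schemas and tables.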
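
For comparison, here is a minimal sketch of the single-pass variant mentioned above. It is not part of the original answer. It assumes the table is [dbo].[lead] (a placeholder taken from the question), builds one COUNT per column into a single SELECT so the table is scanned once, and then uses CROSS APPLY (VALUES ...) to turn the one wide row back into one row per column. The variable names and the c<column_id> aliases are made up for illustration.

```sql
-- Sketch only: [dbo].[lead] is a placeholder for the table being checked.
DECLARE @ObjectId int = OBJECT_ID(N'[dbo].[lead]');
DECLARE @Counts nvarchar(max), @Rows nvarchar(max), @Sql nvarchar(max);

IF @ObjectId IS NULL
BEGIN
    RAISERROR(N'Table not found.', 16, 1);
    RETURN;
END;

-- One COUNT(<column>) per column, aliased c<column_id> so that unusual
-- column names cannot collide. COUNT does not accept text, ntext or image
-- columns, so those are skipped.
SELECT @Counts = (
    SELECT N', COUNT(' + QUOTENAME(c.name) + N') AS c'
           + CAST(c.column_id AS nvarchar(10))
    FROM sys.columns AS c
    WHERE c.object_id = @ObjectId
      AND TYPE_NAME(c.system_type_id) NOT IN (N'text', N'ntext', N'image')
    ORDER BY c.column_id
    FOR XML PATH(''), TYPE).value('.', 'nvarchar(max)');

-- One (name, count) pair per column for the VALUES list.
SELECT @Rows = (
    SELECT N', (N' + QUOTENAME(c.name, N'''') + N', s.c'
           + CAST(c.column_id AS nvarchar(10)) + N')'
    FROM sys.columns AS c
    WHERE c.object_id = @ObjectId
      AND TYPE_NAME(c.system_type_id) NOT IN (N'text', N'ntext', N'image')
    ORDER BY c.column_id
    FOR XML PATH(''), TYPE).value('.', 'nvarchar(max)');

-- Both lists start with ', ': @Counts is appended after COUNT(*),
-- while STUFF strips the leading separator from @Rows.
SET @Sql = N'SELECT v.ColumnName, s.TotalRows, v.NotNullAmount
FROM (SELECT COUNT(*) AS TotalRows' + @Counts + N'
      FROM ' + QUOTENAME(OBJECT_SCHEMA_NAME(@ObjectId)) + N'.'
             + QUOTENAME(OBJECT_NAME(@ObjectId)) + N') AS s
CROSS APPLY (VALUES ' + STUFF(@Rows, 1, 2, N'') + N') AS v (ColumnName, NotNullAmount)
ORDER BY v.NotNullAmount;';

EXEC sys.sp_executesql @Sql;
```

The trade-off is that the generated statement is much longer and harder to read than the per-column version, but on a wide table it avoids reading the data once per column.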


3
votes

Sample data

 CREATE TABLE [dbo].[Tp](
    [a] [char](2) NULL,
    [b] [char](2) NULL,
    [c] [char](2) NULL
    ) ON [PRIMARY]

GO    
INSERT INTO [Tp] ([a],[b],[c])VALUES('a','a','a')
INSERT INTO [Tp] ([a],[b],[c])VALUES('1','1','1')
INSERT INTO [Tp] ([a],[b],[c])VALUES('2','2','2')
INSERT INTO [Tp] ([a],[b],[c])VALUES(NULL,'9',NULL)
INSERT INTO [Tp] ([a],[b],[c])VALUES('3','3','3')
INSERT INTO [Tp] ([a],[b],[c])VALUES('4','4','4')
INSERT INTO [Tp] ([a],[b],[c])VALUES(NULL,NULL,NULL)
INSERT INTO [Tp] ([a],[b],[c])VALUES(NULL,'7',NULL)
INSERT INTO [Tp] ([a],[b],[c])VALUES(NULL,NULL,NULL)
INSERT INTO [Tp] ([a],[b],[c])VALUES('8','8','8')
INSERT INTO [Tp] ([a],[b],[c])VALUES('9','9','9')
INSERT INTO [Tp] ([a],[b],[c])VALUES(NULL,NULL,NULL)
INSERT INTO [Tp] ([a],[b],[c])VALUES('','','')
INSERT INTO [Tp] ([a],[b],[c])VALUES('','','')
INSERT INTO [Tp] ([a],[b],[c])VALUES('','5','')
INSERT INTO [Tp] ([a],[b],[c])VALUES('2','','')
SELECT * FROM [Tp]

Dynamic SQL script to get the expected result

 DECLARE @Sql nvarchar(max)

-- Build one SELECT per column and join them with UNION ALL.
-- QUOTENAME protects against column or table names that need brackets.
SELECT @Sql = STUFF((SELECT ' UNION ALL '+ ' '+'SELECT '''+TABLE_NAME+''' AS TABLE_NAME,'+''''+COLUMN_NAME+''''+' AS ColumName'+',SUM(CASE WHEN '+QUOTENAME(COLUMN_NAME)+' IS NULL THEN 1 ELSE 0 END) As Countof_nulls
      ,SUM(CASE WHEN ISNULL(NULLIF('+QUOTENAME(COLUMN_NAME)+',''''),''1'')=''1'' THEN 1 ELSE 0 END) As CountOf_EmptySpace
      ,COUNT('+QUOTENAME(COLUMN_NAME)+') As Count_not_nulls 
     FROM '+QUOTENAME(TABLE_SCHEMA)+'.'+QUOTENAME(TABLE_NAME)  
FROM 
INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME ='Tp' --Enter your table in the query
FOR XML PATH (''), TYPE).value('.', 'VARCHAR(MAX)'),1,10,'')  -- STUFF removes the leading ' UNION ALL'

EXEC (@Sql)

Result

TABLE_NAME  ColumName   Countof_nulls   CountOf_EmptySpace  Count_not_nulls
***************************************************************************
    Tp          a           5                   9               11
    Tp          b           3                   7               13
    Tp          c           5                   10              11
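
One thing worth noting about the result above: the CountOf_EmptySpace expression maps both NULL and '' to the placeholder '1', so it counts NULLs, empty strings, and any row whose value is literally '1'. For column a that is 5 NULLs + 3 empty strings + 1 row containing '1' = 9. If only the empty strings are of interest, a variation along these lines might be closer to the intent. This is a sketch against the sample table [dbo].[Tp] only; the aliases NullCount, EmptyCount and NotNullCount are made up for illustration.

```sql
-- Sketch: count NULLs, empty strings and non-NULL values separately
-- for the three columns of the sample table [dbo].[Tp].
SELECT
    v.ColumnName,
    SUM(CASE WHEN v.Val IS NULL THEN 1 ELSE 0 END) AS NullCount,
    SUM(CASE WHEN v.Val = ''   THEN 1 ELSE 0 END)  AS EmptyCount,   -- NULL = '' is not true, so NULLs are not counted here
    COUNT(v.Val)                                   AS NotNullCount  -- COUNT(expression) ignores NULLs
FROM [dbo].[Tp] AS t
CROSS APPLY (VALUES ('a', t.[a]), ('b', t.[b]), ('c', t.[c])) AS v (ColumnName, Val)
GROUP BY v.ColumnName
ORDER BY v.ColumnName;

-- With the sample data above this should return:
--   a | 5 | 3 | 11
--   b | 3 | 3 | 13
--   c | 5 | 4 | 11
```

The same CASE expressions could be dropped into the dynamic script in place of the ISNULL(NULLIF(...)) test.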