Azure SQL server metrics - cpu_percentage vs. sql_instance_cpu_percent

Question (votes: 0, answers: 1)

I am working on an autoscaling feature for an Azure Hyperscale Gen5 SQL database.

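To do that, I need to monitor CPU usage so I know when to scale its vCores up or down. There are two SQL CPU metrics: "cpu_percentage" and "sql_instance_cpu_percent". From what I have read, sql_instance_cpu_percent shows CPU usage for the whole server, including system services and so on, while cpu_percentage is specific to each process. That would mean sql_instance_cpu_percent is always higher than cpu_percentage.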

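However, when I run a load test against the database (heavy reads, writes, and deletes), I see cpu_percentage far above sql_instance_cpu_percent. Why is that? Is cpu_percentage the better metric to consider when planning scaling operations?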
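For reference, the two values can also be read side by side from inside the database. This is a minimal sketch, assuming the `avg_cpu_percent` and `avg_instance_cpu_percent` columns of `sys.dm_db_resource_stats` correspond to the two Azure Monitor metrics named above (that mapping is an assumption here, not something established in this thread). The `instance_minus_user` alias is made up for illustration.

```sql
-- Run in the user database itself, not in master.
-- sys.dm_db_resource_stats holds one row per roughly 15 seconds, for about the last hour.
SELECT
    end_time
    ,avg_cpu_percent            -- compute utilization as a percentage of the service tier limit
    ,avg_instance_cpu_percent   -- CPU usage of the SQL instance hosting the database (user and internal workloads)
    ,avg_instance_cpu_percent - avg_cpu_percent AS instance_minus_user  -- hypothetical helper column
FROM sys.dm_db_resource_stats
ORDER BY end_time DESC;
```

If the two percentages are measured against different limits, a negative `instance_minus_user` during a load test would not by itself be contradictory. Check the column descriptions in the documentation before relying on either value.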

azure-sql-database devops
1 Answer
0 votes

All of my scale-up and scale-down decisions are based on this query:

DECLARE @StartDate date = DATEADD(day, -30, GETDATE()) -- last 30 days

SELECT
    @@SERVERNAME AS ServerName
    ,database_name AS DatabaseName
    ,sysso.edition
    ,sysso.service_objective
    ,(SELECT TOP 1 dtu_limit FROM sys.resource_stats AS rs3 WHERE rs3.database_name = rs1.database_name ORDER BY rs3.start_time DESC)  AS DTU
    /*,(SELECT TOP 1 storage_in_megabytes FROM sys.resource_stats AS rs2 WHERE rs2.database_name = rs1.database_name ORDER BY rs2.start_time DESC)  AS StorageMB */
    /*,(SELECT TOP 1 allocated_storage_in_megabytes FROM sys.resource_stats AS rs4 WHERE rs4.database_name = rs1.database_name ORDER BY rs4.start_time DESC)  AS Allocated_StorageMB*/ 
    ,avcon.AVG_Connections_per_Hour
    ,CAST(MAX(storage_in_megabytes) / 1024 AS DECIMAL(10, 2)) StorageGB
    ,CAST(MAX(allocated_storage_in_megabytes) / 1024 AS DECIMAL(10, 2)) Allocated_StorageGB
    ,MIN(end_time) AS StartTime
    ,MAX(end_time) AS EndTime
    -- decimal(5,2) so that an average of exactly 100.00 does not overflow
    ,CAST(AVG(avg_cpu_percent) AS decimal(5,2)) AS Avg_CPU
    ,MAX(avg_cpu_percent) AS Max_CPU
    ,(COUNT(database_name) - SUM(CASE WHEN avg_cpu_percent >= 40 THEN 1 ELSE 0 END) * 1.0) / COUNT(database_name) * 100 AS [CPU Fit %]
    ,CAST(AVG(avg_data_io_percent) AS decimal(5,2)) AS Avg_IO
    ,MAX(avg_data_io_percent) AS Max_IO
    ,(COUNT(database_name) - SUM(CASE WHEN avg_data_io_percent >= 40 THEN 1 ELSE 0 END) * 1.0) / COUNT(database_name) * 100 AS [Data IO Fit %]
    ,CAST(AVG(avg_log_write_percent) AS decimal(5,2)) AS Avg_LogWrite
    ,MAX(avg_log_write_percent) AS Max_LogWrite
    ,(COUNT(database_name) - SUM(CASE WHEN avg_log_write_percent >= 40 THEN 1 ELSE 0 END) * 1.0) / COUNT(database_name) * 100 AS [Log Write Fit %]
    ,CAST(AVG(max_session_percent) AS decimal(5,2)) AS 'Average % of sessions'
    ,MAX(max_session_percent) AS 'Maximum % of sessions'
    ,CAST(AVG(max_worker_percent) AS decimal(5,2)) AS 'Average % of workers'
    ,MAX(max_worker_percent) AS 'Maximum % of workers'
  
  
FROM sys.resource_stats AS rs1
inner join sys.databases dbs on rs1.database_name = dbs.name
INNER JOIN sys.database_service_objectives sysso on sysso.database_id = dbs.database_id
inner join 

(SELECT t.name
    ,round(avg(CAST(t.Count_Connections AS FLOAT)), 2) AS AVG_Connections_per_Hour
FROM (
    SELECT name
        --,database_name
        --,success_count
        --,start_time
        ,CONVERT(DATE, start_time) AS Dating
        ,DATEPART(HOUR, start_time) AS Houring
        ,sum(CASE 
                WHEN name = database_name
                    THEN success_count
                ELSE 0
                END) AS Count_Connections
    FROM sys.database_connection_stats
    CROSS JOIN sys.databases
    WHERE start_time > @StartDate
        AND database_id != 1
    GROUP BY name
        ,CONVERT(DATE, start_time)
        ,DATEPART(HOUR, start_time)
    ) AS t
GROUP BY t.name) avcon on avcon.name = rs1.database_name


WHERE rs1.start_time > @StartDate
GROUP BY database_name, sysso.edition, sysso.service_objective,avcon.AVG_Connections_per_Hour
ORDER BY database_name , sysso.edition, sysso.service_objective
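The "Fit %" columns in the query are the share of samples that stayed below the 40 percent threshold. The arithmetic can be checked with a small self-contained sketch. The table variable `@samples` and its four values are made up for illustration and are not real measurements.

```sql
-- Hypothetical sample values, not real measurements.
DECLARE @samples TABLE (avg_cpu_percent decimal(5,2));
INSERT INTO @samples (avg_cpu_percent) VALUES (10.00), (35.50), (40.00), (72.25);

SELECT
    COUNT(*) AS SampleCount
    ,SUM(CASE WHEN avg_cpu_percent >= 40 THEN 1 ELSE 0 END) AS SamplesAtOrAbove40
    -- same expression as [CPU Fit %] in the query; * 1.0 forces decimal division
    ,(COUNT(*) - SUM(CASE WHEN avg_cpu_percent >= 40 THEN 1 ELSE 0 END) * 1.0) / COUNT(*) * 100 AS [CPU Fit %]
FROM @samples;
-- Two of the four samples are below 40, so [CPU Fit %] evaluates to 50.
```

A value near 100 means the database rarely reached the threshold. A low value means it spent much of the period at or above it. The threshold of 40 is the one used in the query, not a general recommendation.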

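Now: why are you trying to scale Hyperscale?

It is not called Hyperscale for nothing: it can scale. So why would you scale it yourself? I mainly use this query for DTU and vCore databases.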
