使用 APIM 的 Azure OpenAI 和负载均衡器配置

问题描述 投票:0回答:1

我正在尝试使用 API 管理的 Azure OpenAI 和负载均衡器配置。我有一个基本的网络应用程序,其前端可以接受输入并总结文本。

我已经使用 openai 端点配置了负载均衡器策略,并相应地配置了 APIM。

我遇到困难的地方是带有 APIM 和负载均衡器端点的 python 配置以及如何使用它们。早些时候我直接使用 OpenAI 端点和密钥,但现在我觉得事情会改变,只是不知道如何改变它们。

如果有人能在这里帮助我,我将非常感激。谢谢

政策如下:

'''

<!--
    IMPORTANT:
    - Policy elements can appear only within the <inbound>, <outbound>, <backend> section elements.
    - To apply a policy to the incoming request (before it is forwarded to the backend service), place a corresponding policy element within the <inbound> section element.
    - To apply a policy to the outgoing response (before it is sent back to the caller), place a corresponding policy element within the <outbound> section element.
    - To add a policy, place the cursor at the desired insertion point and select a policy from the sidebar.
    - To remove a policy, delete the corresponding policy statement from the policy document.
    - Position the <base> element within a section element to inherit all policies from the corresponding section element in the enclosing scope.
    - Remove the <base> element to prevent inheriting policies from the corresponding section element in the enclosing scope.
    - Policies are applied in the order of their appearance, from the top down.
    - Comments within policy elements are not supported and may disappear. Place your comments between policy elements or at a higher level scope.
-->
<policies>
    <inbound>
        <base />
        <check-header name="X-Azure-FDID" failed-check-httpcode="401" failed-check-error-message="Not authorized" ignore-case="false">
            <value>aaabbbbccc</value>
        </check-header>
        <ip-filter action="allow">
            <address-range from="add1" to="add2" />
        </ip-filter>
        <set-variable name="backendUrlA" value="url1" />
        <!-- [A] Subscription 1: location East -->
        <set-variable name="backendUrlB" value="url2" />
        <!-- [B] Subscription 1: location East -->
        <set-variable name="backendA-apiKey" value="{{backendA-apiKey}}" />
        <set-variable name="backendB-apiKey" value="{{backendB-apiKey}}" />
        <!-- Load balancing logic -->
        <choose>
            <when condition="@((int)context.Request.Url.Path.IndexOf("/gpt-35-turbo") != -1)">
                <!-- Pool 1: GPT3.5 Turbo and GPT3.5 Turbo 16K -->
                <!-- Load balancing logic for gpt-35-turbo models -->
                <cache-lookup-value key="pool1Index" default-value="@((int)0)" variable-name="pool1Index" />
                <choose>
                    <when condition="@( (int)context.Variables["pool1Index"] == 0 )">
                        <set-variable name="selectedBackend" value="@((string)context.Variables["backendUrlA"])" />
                        <set-variable name="selectedBackendKey" value="@((string)context.Variables["backendA-apiKey"])" />
                    </when>
                    <otherwise>
                        <set-variable name="selectedBackend" value="@((string)context.Variables["backendUrlB"])" />
                        <set-variable name="selectedBackendKey" value="@((string)context.Variables["backendB-apiKey"])" />
                    </otherwise>
                </choose>
                <!-- Increment the pool1Index and reset to 0 when it reaches the end -->
                <set-variable name="pool1Index" value="@((int)context.Variables["pool1Index"] == 1 ? 0 : (int)context.Variables["pool1Index"] + 1)" />
                <cache-store-value key="pool1Index" value="@((int)context.Variables["pool1Index"])" duration="1440" />
            </when>
            <when condition="@((int)context.Request.Url.Path.IndexOf("/text-embedding-ada-002") != -1)">
                <!-- Pool 2: Embedding ADA-002 -->
                <!-- Load balancing logic for text-embedding-ada-002 model -->
                <cache-lookup-value key="pool2Index" default-value="@((int)0)" variable-name="pool2Index" />
                <choose>
                    <when condition="@( (int)context.Variables["pool2Index"] == 0 )">
                        <set-variable name="selectedBackend" value="@((string)context.Variables["backendUrlA"])" />
                        <set-variable name="selectedBackendKey" value="@((string)context.Variables["backendA-apiKey"])" />
                    </when>
                    <otherwise>
                        <set-variable name="selectedBackend" value="@((string)context.Variables["backendUrlB"])" />
                        <set-variable name="selectedBackendKey" value="@((string)context.Variables["backendB-apiKey"])" />
                    </otherwise>
                </choose>
                <!-- Increment the pool2Index and reset to 0 when it reaches the end -->
                <set-variable name="pool2Index" value="@((int)context.Variables["pool2Index"] == 1 ? 0 : (int)context.Variables["pool2Index"] + 1)" />
                <cache-store-value key="pool2Index" value="@((int)context.Variables["pool2Index"])" duration="1440" />
            </when>
            <otherwise>
                <!-- Direct to Pool 3: General -->
                <set-variable name="selectedBackend" value="@((string)context.Variables["backendUrlA"])" />
                <set-variable name="selectedBackendKey" value="@((string)context.Variables["backendA-apiKey"])" />
            </otherwise>
        </choose>
        <set-backend-service base-url="@((string)context.Variables["selectedBackend"])" />
        <set-header name="api-key" exists-action="override">
            <value>@((string)context.Variables["selectedBackendKey"])</value>
        </set-header>
    </inbound>
    <backend>
        <base />
    </backend>
    <outbound>
        <base />
        <!-- Add the selected backend URL to the response headers -->
        <set-header name="X-Selected-Backend" exists-action="override">
            <value>@((string)context.Variables["selectedBackend"])</value>
        </set-header>
    </outbound>
    <on-error>
        <base />
    </on-error>
</policies>

'''

python azure-api-management azure-load-balancer azure-openai
1个回答
0
投票

我假设您在 Python Web 应用程序中使用 Azure OpenAI SDK。如果是这样,则需要在 APIM 中配置一些内容,因为目前 SDK 不能很好地适应 APIM。

  1. 更改您的 API 配置以接受 api_key 标头中的订阅密钥。您可以在“订阅”部分的 API 设置中执行此操作
  2. 在同一页面上,确保“API URL 后缀”字段值等于或以“/openai”结尾

完成这些更改后,您应该能够在 SDK 配置中指定 APIM 订阅密钥而不是 OpenAI 订阅密钥,并使用 APIM 的 API 基本 URL(末尾不带“openai”)作为 OpenAI 端点。

© www.soinside.com 2019 - 2024. All rights reserved.