我正在尝试使用 API 管理的 Azure OpenAI 和负载均衡器配置。我有一个基本的网络应用程序,其前端可以接受输入并总结文本。
我已经使用 openai 端点配置了负载均衡器策略,并相应地配置了 APIM。
我遇到困难的地方是带有 APIM 和负载均衡器端点的 python 配置以及如何使用它们。早些时候我直接使用 OpenAI 端点和密钥,但现在我觉得事情会改变,只是不知道如何改变它们。
如果有人能在这里帮助我,我将非常感激。谢谢
政策如下:
'''
<!--
IMPORTANT:
- Policy elements can appear only within the <inbound>, <outbound>, <backend> section elements.
- To apply a policy to the incoming request (before it is forwarded to the backend service), place a corresponding policy element within the <inbound> section element.
- To apply a policy to the outgoing response (before it is sent back to the caller), place a corresponding policy element within the <outbound> section element.
- To add a policy, place the cursor at the desired insertion point and select a policy from the sidebar.
- To remove a policy, delete the corresponding policy statement from the policy document.
- Position the <base> element within a section element to inherit all policies from the corresponding section element in the enclosing scope.
- Remove the <base> element to prevent inheriting policies from the corresponding section element in the enclosing scope.
- Policies are applied in the order of their appearance, from the top down.
- Comments within policy elements are not supported and may disappear. Place your comments between policy elements or at a higher level scope.
-->
<policies>
<inbound>
<base />
<check-header name="X-Azure-FDID" failed-check-httpcode="401" failed-check-error-message="Not authorized" ignore-case="false">
<value>aaabbbbccc</value>
</check-header>
<ip-filter action="allow">
<address-range from="add1" to="add2" />
</ip-filter>
<set-variable name="backendUrlA" value="url1" />
<!-- [A] Subscription 1: location East -->
<set-variable name="backendUrlB" value="url2" />
<!-- [B] Subscription 1: location East -->
<set-variable name="backendA-apiKey" value="{{backendA-apiKey}}" />
<set-variable name="backendB-apiKey" value="{{backendB-apiKey}}" />
<!-- Load balancing logic -->
<choose>
<when condition="@((int)context.Request.Url.Path.IndexOf("/gpt-35-turbo") != -1)">
<!-- Pool 1: GPT3.5 Turbo and GPT3.5 Turbo 16K -->
<!-- Load balancing logic for gpt-35-turbo models -->
<cache-lookup-value key="pool1Index" default-value="@((int)0)" variable-name="pool1Index" />
<choose>
<when condition="@( (int)context.Variables["pool1Index"] == 0 )">
<set-variable name="selectedBackend" value="@((string)context.Variables["backendUrlA"])" />
<set-variable name="selectedBackendKey" value="@((string)context.Variables["backendA-apiKey"])" />
</when>
<otherwise>
<set-variable name="selectedBackend" value="@((string)context.Variables["backendUrlB"])" />
<set-variable name="selectedBackendKey" value="@((string)context.Variables["backendB-apiKey"])" />
</otherwise>
</choose>
<!-- Increment the pool1Index and reset to 0 when it reaches the end -->
<set-variable name="pool1Index" value="@((int)context.Variables["pool1Index"] == 1 ? 0 : (int)context.Variables["pool1Index"] + 1)" />
<cache-store-value key="pool1Index" value="@((int)context.Variables["pool1Index"])" duration="1440" />
</when>
<when condition="@((int)context.Request.Url.Path.IndexOf("/text-embedding-ada-002") != -1)">
<!-- Pool 2: Embedding ADA-002 -->
<!-- Load balancing logic for text-embedding-ada-002 model -->
<cache-lookup-value key="pool2Index" default-value="@((int)0)" variable-name="pool2Index" />
<choose>
<when condition="@( (int)context.Variables["pool2Index"] == 0 )">
<set-variable name="selectedBackend" value="@((string)context.Variables["backendUrlA"])" />
<set-variable name="selectedBackendKey" value="@((string)context.Variables["backendA-apiKey"])" />
</when>
<otherwise>
<set-variable name="selectedBackend" value="@((string)context.Variables["backendUrlB"])" />
<set-variable name="selectedBackendKey" value="@((string)context.Variables["backendB-apiKey"])" />
</otherwise>
</choose>
<!-- Increment the pool2Index and reset to 0 when it reaches the end -->
<set-variable name="pool2Index" value="@((int)context.Variables["pool2Index"] == 1 ? 0 : (int)context.Variables["pool2Index"] + 1)" />
<cache-store-value key="pool2Index" value="@((int)context.Variables["pool2Index"])" duration="1440" />
</when>
<otherwise>
<!-- Direct to Pool 3: General -->
<set-variable name="selectedBackend" value="@((string)context.Variables["backendUrlA"])" />
<set-variable name="selectedBackendKey" value="@((string)context.Variables["backendA-apiKey"])" />
</otherwise>
</choose>
<set-backend-service base-url="@((string)context.Variables["selectedBackend"])" />
<set-header name="api-key" exists-action="override">
<value>@((string)context.Variables["selectedBackendKey"])</value>
</set-header>
</inbound>
<backend>
<base />
</backend>
<outbound>
<base />
<!-- Add the selected backend URL to the response headers -->
<set-header name="X-Selected-Backend" exists-action="override">
<value>@((string)context.Variables["selectedBackend"])</value>
</set-header>
</outbound>
<on-error>
<base />
</on-error>
</policies>
'''
我假设您在 Python Web 应用程序中使用 Azure OpenAI SDK。如果是这样,则需要在 APIM 中配置一些内容,因为目前 SDK 不能很好地适应 APIM。
完成这些更改后,您应该能够在 SDK 配置中指定 APIM 订阅密钥而不是 OpenAI 订阅密钥,并使用 APIM 的 API 基本 URL(末尾不带“openai”)作为 OpenAI 端点。