How do I correctly deploy ChromaDB in a Kubernetes cluster?


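I don't want authentication on the vector database, and I want it persisted to disk (in k8s) so that I can load it and query it. I am doing this with the following steps: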

  1. First, I create this persistent-volume.yaml:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: chromadb-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: /mnt/data/chromadb

---

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: chromadb-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi

and apply it:

kubectl apply -f persistent-volume.yaml
  2. Then I create this values.yaml:
auth:
  enabled: false

persistence:
  enabled: true
  existingClaim: chromadb-pvc
  mountPath: /data

initContainers:
  - name: init-tenant
    image: curlimages/curl:7.73.0
    command: ["sh", "-c"]
    args:
      - >
        until curl -X POST http://localhost:8000/api/v1/tenants/default_tenant; do
          echo "Waiting for ChromaDB to be ready...";
          sleep 5;
        done
    env:
      - name: CHROMA_DEFAULT_TENANT
        value: "default_tenant"
    volumeMounts:
      - name: chromadb-storage
        mountPath: /data

volumeMounts:
  - name: chromadb-storage
    mountPath: /data

volumes:
  - name: chromadb-storage
    persistentVolumeClaim:
      claimName: chromadb-pvc
  3. Then I install the chromadb Helm chart with the values from values.yaml:
helm repo add chroma https://amikos-tech.github.io/chromadb-chart/
helm repo update
helm install chromadb-release amikos-tech/chromadb -f values.yaml

The installation reports success:

NAME: chromadb-release
LAST DEPLOYED: Tue May 21 11:16:05 2024
NAMESPACE: default
STATUS: deployed
REVISION: 1
NOTES:
1. Get the application URL by running these commands:
  export NODE_PORT=$(kubectl get --namespace default -o jsonpath="{.spec.ports[0].nodePort}" services chromadb-release)
  export NODE_IP=$(kubectl get nodes --namespace default -o jsonpath="{.items[0].status.addresses[0].address}")
  echo http://$NODE_IP:$NODE_PORT
2. To get auth credentials run:
kubectl --namespace default get secret chromadb-auth -o jsonpath="{.data.token}" | base64 --decode
  4. Then I set up port forwarding to access chromadb locally:
kubectl port-forward svc/chromadb-release 7000:8000
Forwarding from 127.0.0.1:7000 -> 8000
Forwarding from [::1]:7000 -> 8000
Handling connection for 7000
Handling connection for 7000
Handling connection for 7000
Handling connection for 7000
  5. Then I try to test the connection with Python code:
import chromadb
client = chromadb.HttpClient("http://localhost:7000")

I get the following error:

---------------------------------------------------------------------------
HTTPError                                 Traceback (most recent call last)
File ~/miniforge3/lib/python3.10/site-packages/chromadb/api/fastapi.py:588, in raise_chroma_error(resp)
    587 try:
--> 588     resp.raise_for_status()
    589 except requests.HTTPError:

File ~/miniforge3/lib/python3.10/site-packages/requests/models.py:1021, in Response.raise_for_status(self)
   1020 if http_error_msg:
-> 1021     raise HTTPError(http_error_msg, response=self)

HTTPError: 401 Client Error: Unauthorized for url: http://localhost:7000/api/v1/tenants/default_tenant

During handling of the above exception, another exception occurred:

Exception                                 Traceback (most recent call last)
File ~/miniforge3/lib/python3.10/site-packages/chromadb/api/client.py:402, in Client._validate_tenant_database(self, tenant, database)
    401 try:
--> 402     self._admin_client.get_tenant(name=tenant)
    403 except Exception:

File ~/miniforge3/lib/python3.10/site-packages/chromadb/api/client.py:439, in AdminClient.get_tenant(self, name)
    437 @override
    438 def get_tenant(self, name: str) -> Tenant:
--> 439     return self._server.get_tenant(name=name)

File ~/miniforge3/lib/python3.10/site-packages/chromadb/telemetry/opentelemetry/__init__.py:127, in trace_method.<locals>.decorator.<locals>.wrapper(*args, **kwargs)
    126 if trace_granularity < granularity:
--> 127     return f(*args, **kwargs)
    128 if not tracer:

File ~/miniforge3/lib/python3.10/site-packages/chromadb/api/fastapi.py:194, in FastAPI.get_tenant(self, name)
    191 resp = self._session.get(
    192     self._api_url + "/tenants/" + name,
    193 )
--> 194 raise_chroma_error(resp)
    195 resp_json = resp.json()

File ~/miniforge3/lib/python3.10/site-packages/chromadb/api/fastapi.py:590, in raise_chroma_error(resp)
    589 except requests.HTTPError:
--> 590     raise (Exception(resp.text))

Exception: {"error":"Unauthorized"}

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
Input In [3], in <cell line: 2>()
      1 import chromadb
----> 2 client = chromadb.HttpClient("http://localhost:7000")

File ~/miniforge3/lib/python3.10/site-packages/chromadb/__init__.py:174, in HttpClient(host, port, ssl, headers, settings, tenant, database)
    171 settings.chroma_server_ssl_enabled = ssl
    172 settings.chroma_server_headers = headers
--> 174 return ClientCreator(tenant=tenant, database=database, settings=settings)

File ~/miniforge3/lib/python3.10/site-packages/chromadb/api/client.py:138, in Client.__init__(self, tenant, database, settings)
    136 # Create an admin client for verifying that databases and tenants exist
    137 self._admin_client = AdminClient.from_system(self._system)
--> 138 self._validate_tenant_database(tenant=tenant, database=database)
    140 # Get the root system component we want to interact with
    141 self._server = self._system.instance(ServerAPI)

File ~/miniforge3/lib/python3.10/site-packages/chromadb/api/client.py:404, in Client._validate_tenant_database(self, tenant, database)
    402     self._admin_client.get_tenant(name=tenant)
    403 except Exception:
--> 404     raise ValueError(
    405         f"Could not connect to tenant {tenant}. Are you sure it exists?"
    406     )
    408 try:
    409     self._admin_client.get_database(name=database, tenant=tenant)

ValueError: Could not connect to tenant default_tenant. Are you sure it exists?

What went wrong? Why does it say "HTTPError: 401 Client Error: Unauthorized for url" even though I declared the following in values.yaml:

auth:
  enabled: false

How can I make this work?

python kubernetes langchain chromadb vector-database
1 Answer

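If you are getting an authentication error, then authentication may still be enabled even though you set auth.enabled: false in values.yaml. I checked the documentation, and it mentions no way of disabling authentication other than the one you used.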
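
One way to test the "still enabled" hypothesis is to check whether the keys in values.yaml are ones the chart actually defines. Helm generally ignores unknown keys without an error unless the chart ships a values schema, so a misplaced auth block would leave the chart default in force. The sketch below is a hypothetical helper (unknown_keys is not part of Helm or Chroma) that compares two nested dictionaries. The chart_defaults layout, with auth nested under a top-level chromadb key, is an assumption for illustration only; take the real layout from the output of `helm show values` for the chart.

```python
from typing import Any, Dict, List


def unknown_keys(user: Dict[str, Any], defaults: Dict[str, Any], prefix: str = "") -> List[str]:
    """Return dotted paths that appear in `user` but not in `defaults`."""
    missing: List[str] = []
    for key, value in user.items():
        path = f"{prefix}{key}"
        if key not in defaults:
            # The chart does not define this key at this level.
            missing.append(path)
        elif isinstance(value, dict) and isinstance(defaults[key], dict):
            # Both sides are mappings: descend one level.
            missing.extend(unknown_keys(value, defaults[key], prefix=f"{path}."))
    return missing


# ASSUMPTION: illustrative layout, not copied from the chart.
# Replace with the parsed output of `helm show values`.
chart_defaults = {"chromadb": {"auth": {"enabled": True, "type": "token"}}}

# Top-level keys as written in the question's values.yaml (abridged).
user_values = {"auth": {"enabled": False}, "persistence": {"enabled": True}}

print(unknown_keys(user_values, chart_defaults))  # ['auth', 'persistence']
```

If a key such as auth shows up in the result, moving it to the path the chart expects (or passing it with `--set`) would be the next thing to try.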

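Another possibility is that ChromaDB's server-side settings have not been configured to require no authentication. I am attaching the documentation as a reference for ChromaDB authentication. You could try the following steps: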

  1. Set the CHROMA_SERVER_AUTH_PROVIDER environment variable to none:

    CHROMA_SERVER_AUTH_PROVIDER="none" \
    IS_PERSISTENT=1 \
    uvicorn chromadb.app:app --workers 1 --host 127.0.0.1 --port 8000 --proxy-headers --log-config chromadb/log_config.yml
    
  2. In the Python client code, instantiate HttpClient without any authentication settings:

    import chromadb

    # No auth provider, credentials, or headers are passed.
    # With no arguments, HttpClient targets localhost:8000.
    client = chromadb.HttpClient()
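
Before constructing the client, it can help to see what the server actually returns for the request that fails in the traceback. The following is a minimal standard-library sketch, assuming the port-forward from the question (local port 7000) is still running. probe and interpret_status are hypothetical helper names, and the status interpretations are generic HTTP readings rather than documented Chroma behavior.

```python
import urllib.error
import urllib.request


def interpret_status(status: int) -> str:
    """Map an HTTP status code to a rough diagnosis (generic HTTP semantics)."""
    if status == 200:
        return "reachable without credentials"
    if status in (401, 403):
        return "server is enforcing authentication"
    if status == 404:
        return "endpoint or resource not found"
    return f"unexpected status {status}"


def probe(url: str, timeout: float = 5.0) -> int:
    """Send a GET request to `url` and return the HTTP status code."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; the status code is still what we want.
        return err.code


if __name__ == "__main__":
    # URL taken from the traceback in the question.
    url = "http://localhost:7000/api/v1/tenants/default_tenant"
    try:
        print(url, "->", interpret_status(probe(url)))
    except OSError as err:
        # Connection refused, timeout, DNS failure, and similar.
        print("could not connect:", err)
```

A 401 here, with no client library involved, would indicate that the server side still has authentication switched on, which points back at the chart values rather than at the Python code.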
    