Deploying a RayCluster in AWS EKS

Question (votes: 0, answers: 1)

I want to deploy a Ray cluster in a Kubernetes environment, using my custom Docker image and Docker Compose file. Previously I deployed it with the KubeRay operator and Helm:

helm install kuberay-operator
helm install raycluster -f values.yaml

Now I want to add extra libraries that my code depends on. The libraries I want installed in the RayCluster are:

awscli
boltons
pandas
protobuf==3.20.*

Dockerfile

FROM python:3.9.7-slim
ARG ACCESS_KEY_ID
ARG SECRET_ACCESS_KEY
RUN apt update -y && apt install -y git
COPY . /app
WORKDIR /app
RUN pip install --no-cache-dir -U 'ray[default]'
COPY requirement.txt ./
RUN pip3 install -r requirement.txt
RUN aws configure set aws_access_key_id $ACCESS_KEY_ID \
    && aws configure set aws_secret_access_key $SECRET_ACCESS_KEY \
    && aws configure set default.region ap-south-1 \
    && aws configure set default.output json

RUN pip install --no-cache-dir git+https://user:{token}@github.com/1234/abcd.git
RUN abcd configure --path 'default.json'
CMD ["bash", "-c", "ray start --head --num-cpus 1 --dashboard-host 0.0.0.0 --include-dashboard true --block"]

docker-compose file

version: "3"

services:
  ray-head:
    image: ${RAY_IMAGE}
    ports:
      - "${REDISPORT}:${REDISPORT}"
      - "${DASHBOARDPORT}:${DASHBOARDPORT}"
      - "${HEADNODEPORT}:${HEADNODEPORT}"
    env_file:
      - .env
    command: bash -c "ray start --head --dashboard-port=${DASHBOARDPORT} --port=${REDISPORT} --dashboard-host=0.0.0.0 --redis-password=${REDISPASSWORD} --block"
    shm_size: 2g
    deploy:
      resources:
        limits:
          cpus: '1'
          memory: '2g'
    networks:
      - airflow_default

  ray-worker:
    image: ${RAY_IMAGE}
    depends_on: 
      - ray-head
    env_file:
      - .env
    command: bash -c "ray start --address=ray-head:${REDISPORT} --redis-password=${REDISPASSWORD} --num-cpus=${NUM_CPU_WORKER} --block" 
    shm_size: 2g
    deploy:
      mode: replicated
      replicas: ${NUM_WORKERS} 
      resources:
        limits:
          cpus: ${NUM_CPU_WORKER}
          memory: '2g'
    networks:
      - airflow_default

networks:
  airflow_default:

How should I proceed? I have not been able to work out, from other people's instructions, how to use Docker Compose directly on ECS or on an EC2 instance.

amazon-web-services kubernetes docker-compose dockerfile ray
1 answer
0 votes

To deploy a custom Ray cluster with additional libraries in AWS EKS:

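1.  Update your Dockerfile so the image installs the required libraries (awscli, boltons, pandas, protobuf), for example by listing them in requirement.txt. Avoid writing AWS credentials into the image with build arguments and aws configure, because they remain readable in the image layers.
2.  Convert your Docker Compose file to Kubernetes manifests, either with a tool like Kompose or by hand. If you keep using the KubeRay operator, the ray-head and ray-worker services map more naturally onto a RayCluster resource than onto the plain Deployments that Kompose generates.
3.  Build your Docker image and push it to a container registry such as Amazon ECR, so that the EKS nodes can pull it.
4.  Deploy to EKS by applying the Kubernetes manifests with kubectl apply -f <manifest.yaml>, after checking that kubectl points at your EKS cluster and that the cluster is reachable.
5.  Adjust for Kubernetes: supply AWS credentials at runtime (for example through a Kubernetes Secret or an IAM role bound to the pod's service account), expose the dashboard and client ports through a Service, and consider autoscaling for the Ray worker group.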
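Steps 2, 4 and 5 above can be sketched as a single manifest. This is a minimal sketch, not a definitive configuration. It assumes the KubeRay operator (1.x, which serves the ray.io/v1 API) is already installed, that the custom image has been pushed to ECR, and that an IAM role for the service account exists. The resource names, the image URI, the role ARN, the Ray version and the replica counts are all placeholders that do not come from the question. The CPU and memory limits mirror the values in the docker-compose file.

```yaml
# Sketch only: names, image URI, role ARN and versions are placeholders.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: ray-sa                       # hypothetical name
  annotations:
    # IAM Roles for Service Accounts: pods receive AWS credentials at
    # runtime, so nothing has to be baked into the image.
    eks.amazonaws.com/role-arn: "arn:aws:iam::<account-id>:role/<ray-role>"
---
apiVersion: ray.io/v1                # assumes KubeRay 1.x; older operators use ray.io/v1alpha1
kind: RayCluster
metadata:
  name: raycluster-custom            # hypothetical name
spec:
  rayVersion: "<ray-version>"        # should match the Ray version installed in the image
  headGroupSpec:
    rayStartParams:                  # passed to ray start as flags
      dashboard-host: "0.0.0.0"
      num-cpus: "1"
    template:
      spec:
        serviceAccountName: ray-sa
        containers:
          - name: ray-head
            image: "<account-id>.dkr.ecr.ap-south-1.amazonaws.com/<repo>:<tag>"
            ports:
              - containerPort: 6379  # Ray default head (GCS) port
                name: gcs
              - containerPort: 8265  # Ray default dashboard port
                name: dashboard
              - containerPort: 10001 # Ray default client port
                name: client
            resources:
              limits:
                cpu: "1"
                memory: 2Gi
              requests:
                cpu: "1"
                memory: 2Gi
  workerGroupSpecs:
    - groupName: workers             # hypothetical name
      replicas: 2                    # plays the role of NUM_WORKERS
      minReplicas: 1
      maxReplicas: 4
      rayStartParams: {}
      template:
        spec:
          serviceAccountName: ray-sa
          containers:
            - name: ray-worker
              image: "<account-id>.dkr.ecr.ap-south-1.amazonaws.com/<repo>:<tag>"
              resources:
                limits:
                  cpu: "1"           # plays the role of NUM_CPU_WORKER
                  memory: 2Gi
                requests:
                  cpu: "1"
                  memory: 2Gi
```

As far as I understand, KubeRay builds the ray start command for each pod from rayStartParams, so the CMD in the Dockerfile and the command entries in the docker-compose file are not used in this setup. The image only needs ray and the extra libraries installed. Saved under a name such as raycluster.yaml (hypothetical), the file is applied with the kubectl apply -f command from step 4.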

This approach keeps your deployment within the Kubernetes ecosystem, using its orchestration features to manage the Ray cluster in AWS EKS.
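Since the question already deploys with helm install raycluster -f values.yaml, another option is to keep that workflow and only point the chart at the custom image. The following is a minimal sketch of such a values.yaml, assuming the ray-cluster chart from the KubeRay Helm repository. The key names reflect my understanding of that chart and should be checked against the values.yaml shipped with the installed chart version. The image repository, tag and replica count are placeholders.

```yaml
# values.yaml override for the KubeRay ray-cluster chart (sketch only).
image:
  repository: "<account-id>.dkr.ecr.ap-south-1.amazonaws.com/<repo>"  # placeholder
  tag: "<tag>"                                                        # placeholder
  pullPolicy: IfNotPresent

head:
  resources:
    limits:
      cpu: "1"
      memory: 2Gi
    requests:
      cpu: "1"
      memory: 2Gi

worker:
  replicas: 2          # plays the role of NUM_WORKERS from the compose file
  resources:
    limits:
      cpu: "1"         # plays the role of NUM_CPU_WORKER
      memory: 2Gi
    requests:
      cpu: "1"
      memory: 2Gi
```

With this route the Docker Compose file is not converted at all. Its settings are carried over by hand into the chart values, and the operator creates the head and worker pods from them.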
