Cannot access ScrapyRT ports inside a Kubernetes pod (Docker container)

Problem description

I am having trouble reaching a ScrapyRT service running on specific ports inside a Kubernetes pod. My setup is a Kubernetes cluster with a pod running a Scrapy application; ScrapyRT listens for incoming requests on several ports, and a request to a given port is meant to trigger the corresponding spider.

Although the Kubernetes Service is set up correctly and selects the Scrapy pod, the pod never receives any incoming requests. My understanding of Kubernetes networking is that you should create a Service first and then the pod, and that the Service provides pod-to-pod communication and external access. Is that correct?

The relevant configuration is below:

scrapy-pod Dockerfile:

# Use Ubuntu as the base image
FROM ubuntu:latest

# Avoid prompts from apt
ENV DEBIAN_FRONTEND=noninteractive

# Update package repository and install Python, pip, and other utilities
RUN apt-get update && \
    apt-get install -y curl software-properties-common iputils-ping net-tools dnsutils vim build-essential python3 python3-pip && \
    rm -rf /var/lib/apt/lists/*


# Install nvm (Node Version Manager) - EXPRESS
ENV NVM_DIR /usr/local/nvm
ENV NODE_VERSION 16.20.1

RUN mkdir -p $NVM_DIR
RUN curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.1/install.sh | bash

# Install Node.js and npm - EXPRESS
RUN . "$NVM_DIR/nvm.sh" && nvm install $NODE_VERSION && nvm alias default $NODE_VERSION && nvm use default

# Add Node and npm to path so the commands are available - EXPRESS
ENV NODE_PATH $NVM_DIR/versions/node/v$NODE_VERSION/lib/node_modules
ENV PATH $NVM_DIR/versions/node/v$NODE_VERSION/bin:$PATH

# Install Yarn - EXPRESS
RUN npm install --global yarn

# Set the working directory in the container to /usr/src/app
WORKDIR /usr/src/app

# Copy the current directory contents into the container at /usr/src/app
COPY . .

# Install any needed packages specified in requirements.txt
RUN pip3 install --no-cache-dir -r requirements.txt

# Copy the start_services.sh script into the container
COPY start_services.sh /start_services.sh

# Make the script executable
RUN chmod +x /start_services.sh


# Install any needed packages specified in package.json using Yarn - EXPRESS
RUN yarn install


# Expose all the necessary ports
EXPOSE 14805 14807 12085 14806 13905 12080 14808 8000


# Define environment variable - EXPRESS
ENV NODE_ENV production

# Run the script when the container starts
CMD ["/start_services.sh"]

start_services.sh:

#!/bin/bash

# Start ScrapyRT instances on different ports
scrapyrt -p 14805 &
scrapyrt -p 14807 &
scrapyrt -p 12085 &
scrapyrt -p 14806 &
scrapyrt -p 13905 &
scrapyrt -p 12080 &
scrapyrt -p 14808 &

# Keep the container running since the ScrapyRT processes are in the background
tail -f /dev/null

Service YAML file:

apiVersion: v1
kind: Service
metadata:
  name: scrapy-service
spec:
  selector:
    app: scrapy-pod
  ports:
    - name: port-14805
      protocol: TCP
      port: 14805
      targetPort: 14805
    - name: port-14807
      protocol: TCP
      port: 14807
      targetPort: 14807
    - name: port-12085
      protocol: TCP
      port: 12085
      targetPort: 12085
    - name: port-14806
      protocol: TCP
      port: 14806
      targetPort: 14806
    - name: port-13905
      protocol: TCP
      port: 13905
      targetPort: 13905
    - name: port-12080
      protocol: TCP
      port: 12080
      targetPort: 12080
    - name: port-14808
      protocol: TCP
      port: 14808
      targetPort: 14808
    - name: port-8000
      protocol: TCP
      port: 8000
      targetPort: 8000
  type: ClusterIP

Deployment YAML file:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: scrapy-deployment
  labels:
    app: scrapy-pod
spec:
  replicas: 1
  selector:
    matchLabels:
      app: scrapy-pod
  template:
    metadata:
      labels:
        app: scrapy-pod
    spec:
      containers:
      - name: scrapy-pod
        image: mydockerhub/privaterepository-scrapy:latest
        imagePullPolicy: Always  
        ports:
        - containerPort: 14805
        - containerPort: 14806
        - containerPort: 14807
        - containerPort: 12085
        - containerPort: 13905
        - containerPort: 12080
        - containerPort: 8000
        envFrom:
        - secretRef:
            name: scrapy-env-secret
        - secretRef:
            name: express-env-secret
      imagePullSecrets:
      - name: my-docker-credentials 

scrapy-pod logs from a PowerShell terminal (`k` is an alias for `kubectl`):

> k logs scrapy-deployment-56b9d66858-p59gs -f
2024-01-09 21:53:27+0000 [-] Log opened.
2024-01-09 21:53:27+0000 [-] Log opened.
2024-01-09 21:53:27+0000 [-] Log opened.
2024-01-09 21:53:27+0000 [-] Log opened.
2024-01-09 21:53:27+0000 [-] Log opened.
2024-01-09 21:53:27+0000 [-] Log opened.
2024-01-09 21:53:27+0000 [-] Log opened.
2024-01-09 21:53:27+0000 [-] Site starting on 12080
2024-01-09 21:53:27+0000 [-] Site starting on 14808
2024-01-09 21:53:27+0000 [-] Site starting on 14805
2024-01-09 21:53:27+0000 [-] Starting factory <twisted.web.server.Site object at 0x7f4cbdf44d60>
2024-01-09 21:53:27+0000 [-] Starting factory <twisted.web.server.Site object at 0x7fef9b620a00>
2024-01-09 21:53:27+0000 [-] Site starting on 13905
2024-01-09 21:53:27+0000 [-] Running with reactor: AsyncioSelectorReactor.
2024-01-09 21:53:27+0000 [-] Site starting on 14807
2024-01-09 21:53:27+0000 [-] Starting factory <twisted.web.server.Site object at 0x7f0892ff4df0>
2024-01-09 21:53:27+0000 [-] Site starting on 14806
2024-01-09 21:53:27+0000 [-] Starting factory <twisted.web.server.Site object at 0x7f00d3b99000>
2024-01-09 21:53:27+0000 [-] Starting factory <twisted.web.server.Site object at 0x7fba9e321180>
2024-01-09 21:53:27+0000 [-] Running with reactor: AsyncioSelectorReactor.
2024-01-09 21:53:27+0000 [-] Starting factory <twisted.web.server.Site object at 0x7f1782514f10>
2024-01-09 21:53:27+0000 [-] Running with reactor: AsyncioSelectorReactor.
2024-01-09 21:53:27+0000 [-] Running with reactor: AsyncioSelectorReactor.
2024-01-09 21:53:27+0000 [-] Site starting on 12085
2024-01-09 21:53:27+0000 [-] Starting factory <twisted.web.server.Site object at 0x7fb2054cd060>
2024-01-09 21:53:27+0000 [-] Running with reactor: AsyncioSelectorReactor.
2024-01-09 21:53:27+0000 [-] Running with reactor: AsyncioSelectorReactor.
2024-01-09 21:53:27+0000 [-] Running with reactor: AsyncioSelectorReactor.

Problem: Despite this configuration, no requests appear to reach the Scrapy pod. The kubectl logs show the ScrapyRT instances starting successfully on the specified ports. However, when I send requests from a separate debug pod running a Python Jupyter Notebook, they succeed against other pods but fail against the Scrapy pod.
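Since the requests from the Jupyter debug pod fail only against the Scrapy pod, a useful first step is to separate TCP-level reachability from HTTP-level failures. Below is a minimal sketch that could be run in the debug pod; the service name and port are taken from the manifests above, while the spider name is a hypothetical placeholder. The `/crawl.json` endpoint with `spider_name` and `url` parameters is ScrapyRT's documented GET API.

```python
import socket
import urllib.parse


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a plain TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def crawl_url(host: str, port: int, spider_name: str, start_url: str) -> str:
    """Build a ScrapyRT /crawl.json request URL."""
    query = urllib.parse.urlencode({"spider_name": spider_name, "url": start_url})
    return f"http://{host}:{port}/crawl.json?{query}"


# From the debug pod, for example (spider name is a placeholder):
# can_connect("scrapy-service", 14805)
# crawl_url("scrapy-service", 14805, "myspider", "https://example.com")
```

If `can_connect("scrapy-service", 14805)` returns False, the problem is routing (Service selector, NetworkPolicy, or the interface ScrapyRT binds to); if it returns True but the HTTP request still fails, the problem is inside ScrapyRT or the spider itself.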

Question: How can I connect to the Scrapy pod successfully? What could be preventing requests from reaching it?

Any insights or suggestions would be appreciated.

python node.js docker kubernetes scrapy
1 Answer

A few things to try -

  • Verify that the
    selector
    field in the Service YAML (scrapy-service) matches the pod labels in the Deployment YAML (scrapy-deployment). The labels must be identical for the Service to select the pod; `kubectl get endpoints scrapy-service` should list the pod's IP, and an empty list means the selector does not match. Note also that creation order does not matter: a Service matches pods by label continuously, so it can be created before or after the Deployment.
  • Check the logs for error messages or any indication that requests are actually arriving.
  • Verify that DNS resolution works inside the cluster and that the name (scrapy-service) resolves, e.g. with `nslookup scrapy-service` from the debug pod.
  • Check whether any NetworkPolicy or firewall rules might be blocking traffic between pods in the cluster.
  • Try `ping` or `telnet` (or `nc -zv scrapy-service 14805`) from the debug pod to check connectivity to the pod.
  • I can see that scrapy-service is of type
    ClusterIP
    , which makes it an internal-only service. That is fine for traffic between pods, but it will not work if you need access from outside the cluster. Double-check this, and switch to
    NodePort
    or
    LoadBalancer
    for external access.
  • Finally, verify that the pod is actually Running (`kubectl get pods`).
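One more cause worth ruling out that the Service cannot fix: the interface ScrapyRT binds to. If the installed ScrapyRT version defaults to listening on localhost only (check `scrapyrt --help` for an interface option such as `-i`/`--ip`), the process is reachable from inside the container but not through the Service, even with a correct selector. The startup log line ("Site starting on 12080") does not show the bind address, but it can be inspected from inside the pod. A minimal sketch, assuming a Linux container with the standard `/proc` filesystem (the normal case for Kubernetes pods):

```python
import socket
import struct


def listening_tcp_sockets(path: str = "/proc/net/tcp"):
    """Return (ip, port) pairs for IPv4 TCP sockets in LISTEN state.

    Run inside the scrapy pod to see whether ScrapyRT bound to
    0.0.0.0 (reachable through the Service) or 127.0.0.1 (loopback only).
    """
    result = []
    with open(path) as f:
        next(f)  # skip the header line
        for line in f:
            fields = line.split()
            local, state = fields[1], fields[3]
            if state != "0A":  # 0A == TCP LISTEN
                continue
            ip_hex, port_hex = local.split(":")
            # /proc/net/tcp stores the IPv4 address as little-endian hex
            ip = socket.inet_ntoa(struct.pack("<I", int(ip_hex, 16)))
            result.append((ip, int(port_hex, 16)))
    return result
```

If the ScrapyRT ports show up bound to `127.0.0.1` rather than `0.0.0.0`, starting each instance with an explicit interface in start_services.sh (e.g. `scrapyrt -i 0.0.0.0 -p 14805 &`, assuming that flag exists in your version) should make them reachable through the Service.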

Please let me know whether the troubleshooting above helps.
