I have three nodes with Docker installed, one master and two slaves, and I want to run Mesos, Marathon, and Hadoop on Docker. I have the following docker compose files. This one runs Mesos and Marathon on the master node:
version: '3.7'
services:
  zookeeper:
    image: hadoop_marathon_mesos_flink_2
    command: >
      sh -c "
      echo zookeeper && /home/zookeeper-3.4.14/bin/zkServer.sh
      restart &&
      sleep 30 && /home/mesos-1.7.2/build/bin/mesos-master.sh
      --ip=10.32.0.1 --hostname=10.32.0.1 --roles=marathon,flink |
      /home/marathon-1.7.189-48bfd6000/bin/marathon --master
      10.32.0.1:5050 --zk zk://10.32.0.1:2181/marathon
      --hostname 10.32.0.1 --webui_url 10.32.0.1:8080
      --logging_level debug"
    privileged: true
    network_mode: "bridge"
    environment:
      WEAVE_CIDR: 10.32.0.1/12
      ZOOKEEPER_SERVER_ID: 1
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000
      ZOOKEEPER_INIT_LIMIT: 10
      ZOOKEEPER_SYNC_LIMIT: 5
      ZOOKEEPER_SERVERS: 10.32.0.1:2888:3888
      MESOS_CLUSTER: Yekta
      LIBPROCESS_IP: 10.32.0.1
      MESOS_QUORUM: 1
      MESOS_LOG_DIR: /var/log/mesos
      MESOS_WORK_DIR: /var/run/mesos
      MESOS_EXECUTOR_REGISTRATION_TIMEOUT: 5mins
      HOSTNAME: 10.32.0.1
      MESOS_NATIVE_JAVA_LIBRARY: /usr/local/lib/libmesos.so
      MESOS_DOCKER_SOCKET: /var/run/weave/weave.sock
    volumes:
      - /home/cfms11/.ssh:/root/.ssh
    expose:
      - 2181
      - 2888
      - 3888
      - 5050
      - 4040
      - 7077
      - 8080
      - 9000
      - 50070
      - 50090
    ports:
      - 2181:2181
      - 2888:2888
      - 3888:3888
      - 5050:5050
      - 4040:4040
      - 7077:7077
      - 8080:8080
      - 9000:9000
      - 50070:50070
      - 50090:50090
networks:
  default:
    external:
      name: weave
The docker compose file on the slave nodes:
version: '3.7'
services:
  slave:
    image: hadoop_marathon_mesos_flink_2
    command: sh -c "/home/mesos-1.7.2/build/bin/mesos-slave.sh
      --master=10.32.0.1:5050 --work_dir=/var/run/mesos
      --systemd_enable_support=false"
    privileged: true
    network_mode: "weave"
    environment:
      WEAVE_CIDR: 10.32.0.1/12
      MESOS_RESOURCES: ports(*):[11000-11999]
      LIBPROCESS_IP: 10.32.0.2
      MESOS_HOSTNAME: 10.32.0.2
      MESOS_EXECUTOR_REGISTRATION_TIMEOUT: 5mins #also in Dockerfile
      MESOS_LOG_DIR: /var/log/mesos
      MESOS_WORK_DIR: /var/run/mesos
      MESOS_LOGGING_LEVEL: INFO
    volumes:
      - /home/spark/.ssh:/root/.ssh
    expose:
      - 5051
    ports:
      - 5051:5051
networks:
  default:
    external:
      name: weave
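After I bring up the docker compose files, Marathon and Mesos run without any problem. Next I have to enter the containers created by docker compose and start Hadoop, so I go through these steps. On each node: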
sudo docker-compose ps
I copy the container name from the output of that command.
sudo docker exec -it "the_name" /bin/bash
After entering the container on the master node, I run the following commands:
cat /root/.ssh/id_rsa.pub >> /root/.ssh/authorized_keys
chmod -R 750 /root/.ssh/authorized_keys
chmod 700 ~/.ssh/
chmod 600 ~/.ssh/*
chown -R root ~/.ssh/
chgrp -R root ~/.ssh/
service ssh restart
I also run these commands in the slave containers:
chmod 700 ~/.ssh/
chmod 600 ~/.ssh/*
chown -R root ~/.ssh/
chgrp -R root ~/.ssh/
service ssh restart
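The permission steps used in both containers could be collected into one small function. This is only a sketch: `fix_ssh_perms` is a hypothetical helper name, it relies on `find -maxdepth` (GNU and BSD find), and it leaves out the ownership changes and the `service ssh restart` step.

```shell
#!/bin/sh
# fix_ssh_perms is a hypothetical helper name, not an existing tool.
# It applies the same modes as the manual steps: 700 on the directory
# and 600 on every regular file directly inside it.
fix_ssh_perms() {
    ssh_dir=$1
    chmod 700 "$ssh_dir" || return 1
    find "$ssh_dir" -maxdepth 1 -type f -exec chmod 600 {} +
}

# Example (as root inside the container):
#   fix_ssh_perms /root/.ssh && service ssh restart
```

Because /root/.ssh is bind-mounted from the host in the compose files, these mode changes also apply to the host directory.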
After that, I should be able to start Hadoop with this command:
/opt/hadoop/sbin/start-dfs.sh
However, it does not start. I get this error:
Starting namenodes on [compose-weave-ok-for-master-node_zookeeper_1.weave.local]
compose-weave-ok-for-master-node_zookeeper_1.weave.local: ERROR: Cannot set priority of namenode process 1985
Starting datanodes
Starting secondary namenodes [compose-weave-ok-for-master-node_zookeeper_1.weave.local]
I think this happens because the container ID is not in /etc/hosts. In fact, /etc/hosts looks like this:
# created by Weave - BEGIN
# container hostname
10.32.0.1 compose-weave-ok-for-master-node_zookeeper_1.weave.local compose-weave-ok-for-master-node_zookeeper_1
# static names added with --add-host
# default localhost entries
127.0.0.1 localhost
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
# created by Weave - END
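One way to test the name-resolution theory before editing anything is to ask the system resolver directly. The sketch below assumes `getent` is available in the image (it ships with glibc-based distributions); `name_resolves` is a hypothetical helper name.

```shell
#!/bin/sh
# name_resolves is a hypothetical helper name.
# getent looks the name up through the sources configured in
# /etc/nsswitch.conf (typically /etc/hosts first, then DNS).
name_resolves() {
    getent hosts "$1" > /dev/null
}

# Example: check whether the container's own hostname resolves.
#   if name_resolves "$(hostname)"; then echo "resolves"; else echo "does not resolve"; fi
```

If the name already resolves, the hosts file is probably not the cause and the Hadoop logs are the next place to look.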
Could someone please tell me how I can run Hadoop alongside Marathon and Mesos?
Thanks in advance.
Make sure Hadoop is installed at the same path on every node. Also, add the container ID to the hosts file and run Hadoop again. I think that solves the problem.
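The hosts-file change suggested above might look like the following sketch. `add_host_entry` is a hypothetical helper, and the IP address and name passed to it are assumptions taken from the question's setup; it appends a line only when the name is not already listed, so running it twice is harmless.

```shell
#!/bin/sh
# add_host_entry is a hypothetical helper, not part of Docker, Weave, or Hadoop.
# Usage: add_host_entry HOSTS_FILE IP NAME
# Appends "IP NAME" unless NAME already appears as a hostname field
# on a non-comment line of HOSTS_FILE.
add_host_entry() {
    hosts_file=$1
    ip=$2
    name=$3
    if ! awk -v n="$name" '
        $1 !~ /^#/ {
            # Compare as strings so hex-like container IDs are not treated as numbers.
            for (i = 2; i <= NF; i++) if (($i "") == (n "")) found = 1
        }
        END { exit !found }
    ' "$hosts_file"; then
        printf '%s %s\n' "$ip" "$name" >> "$hosts_file"
    fi
}

# Example (as root inside the master container; assumes the hostname
# is the name Hadoop needs to resolve, and 10.32.0.1 is the master IP):
#   add_host_entry /etc/hosts 10.32.0.1 "$(hostname)"
```

Changes made this way are lost when the container is recreated, so the same call would have to run again after each `docker-compose up`.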