Verify MPIJob
Create an MPIJob that computes Pi:
kubectl apply -f - << EOF
apiVersion: kubeflow.org/v2beta1
kind: MPIJob
metadata:
  name: pi
spec:
  slotsPerWorker: 1
  runPolicy:
    cleanPodPolicy: Running
    ttlSecondsAfterFinished: 3600
  sshAuthMountPath: /home/mpiuser/.ssh
  mpiReplicaSpecs:
    Launcher:
      replicas: 1
      template:
        spec:
          containers:
          - image: mpioperator/mpi-pi:openmpi
            name: mpi-launcher
            securityContext:
              runAsUser: 1000
            command:
            - mpirun
            args:
            - -n
            - "2"
            - /home/mpiuser/pi
            resources:
              limits:
                cpu: 1
                memory: 1Gi
    Worker:
      replicas: 2
      template:
        spec:
          containers:
          - image: mpioperator/mpi-pi:openmpi
            name: mpi-worker
            securityContext:
              runAsUser: 1000
            command:
            - /usr/sbin/sshd
            args:
            - -De
            - -f
            - /home/mpiuser/.sshd_config
            resources:
              limits:
                cpu: 1
                memory: 1Gi
EOF
Check the MPIJob status:
kubectl get mpijob
kubectl describe mpijob pi
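The two commands above give a point-in-time view of the job. To block until the job finishes, one option is `kubectl wait`. The following is a minimal sketch, assuming the operator records a `Succeeded` condition in the MPIJob's `status.conditions`; the helper name `wait_for_mpijob` and the default timeout are hypothetical and only for illustration.

```shell
# Block until the named MPIJob reports the Succeeded condition, or time out.
# Assumption: the operator sets a "Succeeded" condition in status.conditions.
wait_for_mpijob() {
  job_name="$1"
  timeout="${2:-300s}"   # illustrative default; adjust to the workload
  kubectl wait --for=condition=Succeeded "mpijob/${job_name}" --timeout="${timeout}"
}
```

For the job created above, this could be called as `wait_for_mpijob pi 600s`. A non-zero exit status means the condition was not reached within the timeout.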
Check the Pod status:
kubectl get pods
kubectl describe pod pi-launcher-xxxxx
kubectl logs pi-launcher-xxxxx
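The `xxxxx` suffix in the Pod name is generated, so it has to be looked up for each run. A label selector can do that lookup. This is a sketch under the assumption that the operator labels its Pods with `training.kubeflow.org/job-name` and `training.kubeflow.org/job-role`; verify the actual labels with `kubectl get pods --show-labels` first. The helper name `launcher_pod` is hypothetical.

```shell
# Print the name of the launcher Pod that belongs to the given MPIJob.
# Assumption: Pods carry the training.kubeflow.org/job-name and
# training.kubeflow.org/job-role labels; check with --show-labels.
launcher_pod() {
  job_name="$1"
  kubectl get pods \
    -l "training.kubeflow.org/job-name=${job_name},training.kubeflow.org/job-role=launcher" \
    -o jsonpath='{.items[0].metadata.name}'
}
```

With the helper defined, the launcher logs can be read with `kubectl logs "$(launcher_pod pi)"`.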
Delete the MPIJob:
kubectl delete mpijob pi
More usage guides: