flink-kubernetes/README.md
> ## Task Manager
> Task manager is a temporary essence and is created (and deleted) by a job manager for a particular slot.
> No deployments/jobs/services are created for a task manager only pods.
"for a task manager, only pods" comma missing?
> Example:
>
>     kubectl create -f jobmanager-deployment.yaml
>     kubectl create -f jobmanager-service.yaml
jobmanager-exposer-deployment.yaml?
Also, a question immediately comes up: how exactly does it expose it?
That creates a deployment with one job manager, and a service around it that exposes the job manager (ClusterIP/NodePort/LoadBalancer/ExternalName):
https://kubernetes.io/docs/concepts/services-networking/service/#publishing-services-service-types
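For illustration, a minimal sketch of what such a service manifest could look like; the file name, labels, and `NodePort` choice are assumptions, not the PR's actual content (8081 and 6123 are Flink's standard web UI/REST and JobManager RPC ports):

```yaml
# Hypothetical jobmanager-service.yaml; names, labels, and service type
# are assumptions for illustration only.
apiVersion: v1
kind: Service
metadata:
  name: flink-jobmanager
spec:
  # One of: ClusterIP, NodePort, LoadBalancer, ExternalName
  type: NodePort
  selector:
    app: flink
    component: jobmanager
  ports:
    - name: ui
      port: 8081   # Flink web UI / REST
    - name: rpc
      port: 6123   # JobManager RPC
```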
flink-kubernetes/README.md
> TBD
>
> ## Kubernetes Resource Management
> Resource management uses the default service account every pod contains. It should have admin privileges to be able to allocate (or share) resources.
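A hedged sketch of how the default service account could be granted such privileges via RBAC; the binding name and namespace are assumptions, and in practice a narrower `Role` is preferable to broad cluster-wide rights:

```yaml
# Hypothetical ClusterRoleBinding; name and namespace are assumptions.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: flink-default-sa-edit
subjects:
  - kind: ServiceAccount
    name: default
    namespace: default
roleRef:
  kind: ClusterRole
  name: edit   # built-in role allowing management of pods and services
  apiGroup: rbac.authorization.k8s.io
```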
> package org.apache.flink.kubernetes.client;
>
> /**
>  * represent a endpoint.
> void terminateClusterPod(ResourceID resourceID) throws KubernetesClientException;
>
> /**
>  * stop cluster and clean up all resources, include services, auxiliary services and all running pods.
Some comments begin with a capital letter and some don't
> public Collection<ResourceProfile> startNewWorker(ResourceProfile resourceProfile) {
>     LOG.info("Starting a new worker.");
>     try {
>         nodeManagerClient.createClusterPod(resourceProfile);
So at a higher level we provide a worker with only one slot; does that strategy have a downside?
For now, this is our baseline; we consciously do the same on Samza.
It's a reasonable choice because different slot threads will not compete for CPU and memory (since the task manager doesn't isolate these resources), and recovery is easier. However, we will use the slot sharing feature and share slots between different Flink operators according to the pipeline logic, to avoid high network usage between task managers.
As for the downside you asked about, I'd mention the absence of resource sharing: when job utilization is low, a task manager will simply stand idle without much load.
Also, in this case there is no slot grouping, a feature that tends to reduce network traffic by allocating slots on a single task manager. However, we will use slot sharing instead.
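The one-slot-per-task-manager strategy discussed above corresponds to a configuration like the following (a sketch, assuming the standard `flink-conf.yaml` key):

```yaml
# flink-conf.yaml (sketch): one slot per task manager, so slot threads
# never compete for the task manager's CPU and memory.
taskmanager.numberOfTaskSlots: 1
```

With slot sharing enabled, operators of the same pipeline can still be co-located in one slot, and the DataStream API's `slotSharingGroup(...)` can be used to split them apart when needed.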
The current implementation of Kubernetes support is made for a session cluster only. For additional information, please see the README file.