Fawry cloud devops internship

Application production deployment architecture

Recommended Application HW Sizing

Deployment Architecture

| Role | VM specs | Software | No. of VMs |
| --- | --- | --- | --- |
| Application Server (Production) | vCPU: 16 cores, RAM: 64 GB, Volume: 200 GB SSD, OS: Ubuntu 22.04 LTS | Nginx, Kubernetes 1.27, containerd | 3 |
| Database Server | vCPU: 16 cores, RAM: 64 GB, Volume: 400 GB SSD, OS: Ubuntu 22.04 LTS | MySQL 8.0 | 1 |
| Staging Server | vCPU: 4 cores, RAM: 16 GB, Volume: 200 GB, OS: Ubuntu 20.04 LTS | Nginx, Kubernetes 1.27, containerd, MySQL 8.0 | 1 |

Note: Take daily snapshots of the media store server and the database server.

Kubernetes PV & PVC

Previously in this series, we looked at Volumes in Kubernetes. Now we'll take a step further and introduce two other Kubernetes objects related to data persistence and preservation: PersistentVolumes (PVs) and PersistentVolumeClaims (PVCs).

We'll cover the following topics, including hands-on practice on the functionalities of these concepts:

  • What are PersistentVolumes and PersistentVolumeClaims?

  • How and by whom are PersistentVolumes defined?

  • How do PersistentVolumes differ from Volumes?

  • How are PersistentVolumes and PersistentVolumeClaims created?

  • How and by whom is persistent storage requested using PersistentVolumeClaims?

  • Relationship between PV, PVC, and storage volume

  • Static provisioning

  • Dynamic provisioning

Kubernetes PersistentVolumes (PVs) and PersistentVolumeClaims (PVCs)

      Storage management is essential in Kubernetes, especially in large environments where many users deploy multiple Pods. The users in this environment often need to configure storage for each Pod, and when making a change to existing applications, it must be made on all Pods, one after the other. To mitigate this time-consuming scenario and separate the details of how storage is provisioned from how it is consumed, we use PersistentVolumes (PVs) and PersistentVolumeClaims (PVCs).

      A PersistentVolume (PV) in Kubernetes is a pool of pre-provisioned storage resources in a Kubernetes cluster that can be used across different user environments. Its lifecycle is independent of any Pod that uses the PersistentVolume.

      A PersistentVolumeClaim (PVC) is a user's request for storage from a PV. Kubernetes binds PVs to PVCs based on the request and the properties set on those PVs: it searches for PVs that match the PVC's requested capacity and specified properties, so that each PVC can bind to a single PV.

      When there are multiple matches, you can use labels and selectors to bind a PVC to a particular PV. This helps guard against a situation where a small PVC binds to a much larger PV, since PVs and PVCs have a one-to-one relationship; when that happens, the remaining storage in the bound PV is inaccessible to other users.
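For illustration, a PV can carry a label and a PVC can use a selector to target exactly that PV. The names and label values below are hypothetical; this is a minimal sketch of the pattern rather than a manifest from this deployment:

```yaml
# PV labelled so that a specific claim can target it
apiVersion: v1
kind: PersistentVolume
metadata:
  name: reports-pv        # hypothetical name
  labels:
    usage: reports        # hypothetical label
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/data/reports"
---
# PVC that only binds to PVs carrying the matching label
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: reports-pvc       # hypothetical name
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
  selector:
    matchLabels:
      usage: reports
```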

      NOTE: Both the PVCs and the Pod using them must be in the same namespace.

      The Difference Between PVs and PVCs in Kubernetes

      PVs and PVCs differ in provisioning, functionality, and who is responsible for creating them. Specifically:

      • PVs are created by the cluster administrator or dynamically by Kubernetes, whereas users/developers create PVCs.

      • PVs are cluster resources provisioned by an administrator, whereas PVCs are a user's request for storage and for those resources.

      • PVCs consume PV resources, but not vice versa.

      • A PV is similar to a node in terms of cluster resources, while a PVC is like a Pod in the context of cluster resource consumption.

      Difference between Volumes and PersistentVolumes

      Volumes and PersistentVolumes differ in the following ways:

      • A Volume separates storage from a container but binds it to a Pod, while PVs separate storage from a Pod.

      • The lifecycle of a Volume is dependent on the Pod using it, while the lifecycle of a PV is not.

      • A Volume enables safe container restart and allows sharing of data across containers, whereas a PV enables safe Pod termination or restart.

      • A separate manifest YAML file is needed to create a PV, but not for a Volume.

      PersistentVolumes and PersistentVolumeClaims Lifecycle

      The communication between PVs and PVCs consists of the following stages:

      • Provisioning: This is the process of creating a storage volume, which can be done statically or dynamically.

        • Static: The static way of provisioning a storage volume is when PVs are created by an Administrator before any PVCs exist; they sit in the Kubernetes API waiting to be claimed by a user's storage request (a PVC).

        • Dynamic: The dynamic way of provisioning creates PVs automatically using a StorageClass instead of provisioning PersistentVolumes manually. Kubernetes dynamically provisions a volume specifically for the storage request, with the help of a StorageClass (a minimal sketch follows this list; more about StorageClass and how it is used to provision PVs dynamically comes in the next part of this series).

    • Binding: When an Administrator has provisioned PVs and users make a specific amount of storage requests (PVCs), a control loop in the master finds a matching PV and binds them together. If a matching volume does not exist, the claim will remain unbound until a volume match is created.

    • Using: Pods use claims as volumes. The Kubernetes API checks the claim to find a bound PV and mounts it in the Pod for the users. When a claim is already bound to a PV, the bind remains unchanged as long as the user wants it. So user A’s bound PV can not be taken over by user B, if it is still in use by user A.

    • Reclaiming: When a user is done with the volume usage, the resources must be freed up by deleting the PVC objects from Kubernetes. This practice will free the PV from the PVC, allowing reclamation of the resources, making it available for future use. What happens to the volumes afterwards is determined by the PersistentVolumeReclaimPolicy (PVRP) value specified in the configuration file. The PVRP provides three options of what you can do to a PersistentVolume after the claim has been deleted: retain, delete, or recycle.

      • Retain: This is the default reclaim policy. It is a process that allows manual deletion of the PV by the Administrator. When a PVC is deleted, the PV remains intact, including its data, thus making it unavailable for use by another claim. However, the Administrator can still manually reclaim the PersistentVolume.

      • Delete: When the reclaim policy is set to Delete, the PV is deleted automatically when the PVC is deleted, making the space available again. Note that dynamically provisioned PVs inherit the reclaim policy of their StorageClass, which defaults to Delete.

      • Recycle: This type of reclaim policy will scrub the data in the PV and make it available for another claim.
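As a rough sketch of dynamic provisioning, a StorageClass names a provisioner and a PVC simply references the class; the class name and the provisioner below are placeholders that depend on your cluster or cloud provider:

```yaml
# StorageClass describing how volumes are provisioned on demand
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd                      # hypothetical class name
provisioner: example.com/csi-driver   # placeholder; use your cluster's CSI provisioner
reclaimPolicy: Delete
---
# PVC that triggers dynamic provisioning by naming the StorageClass
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: dynamic-claim
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: fast-ssd
  resources:
    requests:
      storage: 10Gi
```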

    How to Create a PersistentVolume

    The following steps will guide you through how to create a PersistentVolume and how to use it in a Pod. Before continuing, it helps to have a basic knowledge of volumes, volumeMounts, and volume types such as hostPath and emptyDir in order to follow this hands-on practice. You also need a running Kubernetes cluster with the kubectl command-line tool configured to talk to it. If you do not have one, you can create a Kubernetes cluster in any environment with KubeOne; refer to Getting Started for instructions. Alternatively, you can use the Kubernetes playground to practice. In that case, you might also need cloud provider (GKE, AWS, etc.) access or credentials to provision storage.

    Follow the below steps to create a PersistentVolume:

    Step 1: Create the YAML file.

    Step 2: Copy and paste the below configuration file into the YAML manifest file created above.

    The configuration above shows properties with different functionalities. In addition to the properties from the previous Kubernetes objects exercises, let’s look at accessModes, capacity, and storage properties:

    • accessModes: This property defines how a PV should be mounted on the host. The value can be ReadWriteOnce, ReadOnlyMany, or ReadWriteMany. Below is a basic description of these values:

      • ReadWriteOnce: You can mount the PV as read-write by a single node.

      • ReadOnlyMany: You can mount the PV as read-only by multiple nodes.

      • ReadWriteMany: You can mount the PV as read-write by multiple nodes.

    • capacity.storage: This property is used to set the amount of storage needed for the volume.

    Step 3: Create the Persistent Volume using kubectl create command.

    Step 4: Check the created PV to see if it is available.

    Step 5: Check the description of the PersistentVolume by running kubectl describe command.

    As seen above, the PersistentVolume is available and ready to serve a PVC storage request. The status will change from available to bound when it has been claimed.

    NOTE: It is not advisable to use the hostPath volume type in a production environment.

    How to Create a PersistentVolumeClaim in Kubernetes

    Now that the PersistentVolume has been successfully created, the next step is to create a PVC that will claim the volume. Creating a PersistentVolumeClaim is similar to the method used to create the PersistentVolume above, with a few differences in its properties and values. The value of the kind property will be PersistentVolumeClaim. A resources.requests.storage field will also be added, and its value will be the provisioned PV capacity (3Gi in our case).

    The configuration will look like the manifest file below:

    Step 1: Create a YAML file.

    Step 2: Copy and paste the above configuration file into the YAML file created above.

    Step 3: Create the PVC by running kubectl create command.

    Step 4: Check the status of the PVC by running kubectl get command.

    Step 5: Check the description of the PVC.

    As shown above, the status indicates a “Bound” state, which means the PVC and PV are now bound together. Now, check the status of the PV once again with kubectl get command.

    The output shows that the status has changed from “available” when it is not yet bound to “Bound” because the PV has been claimed by the created PVC.

    Step 6: Delete the PVC using kubectl delete command.

    Step 7: Check both the PVC and PV status with kubectl get command.

    The output shows that the PV status has now changed from a “Bound” to a “Released” state, after the PVC is deleted.

    PersistentVolumes States

    A PV can be in any of the following states, each with its own meaning:

    • Available: The PV is free and not yet bound to a claim.

    • Bound: The PV has been claimed and is bound to a PVC.

    • Released: The bound PVC has been deleted, and the PV has been released from its previous binding.

    • Failed: The PV has failed its automatic reclamation.

    PersistentVolumeClaims States

    Each PVC, like the PV, has its own states that represent its current status.

    • Bound: The PVC is bound to a PV.

    • Pending: The PVC cannot bind to a PV. This can happen when the storage request is larger than the capacity of any PV, when the PVC's accessModes value differs from that of the PV, and so on.

    How to Use PersistentVolumeClaim in a Pod

    A Pod can access storage with the help of a PVC, which is used as a volume. To use a PVC in a Pod, declare a volumes property in the Pod manifest and specify the claim name under the persistentVolumeClaim volume type. Both the PVC and the Pod using it must exist in the same namespace; this allows the cluster to find the claim in the Pod's namespace and use it to access the PersistentVolume bound to the PVC. Once the Pod is created, the applications in its containers can read from and write to the storage. The complete configuration file will look like this:

    The below steps will guide you through how to use a claim in a Pod. The PV and PVC have to be provisioned before creating the Pod. The claim name inside the Pod must also match the claim name in the running PVC.

    Step 1: Create a YAML file.

    Step 2: Copy and paste the above Pod manifest file into the YAML file created above and create the Pod with kubectl create command.

    Step 3: Check the status and the description of the Pod.

    Step 4: Exec into the Pod to test the PVC use case.

    Step 5: Use df -h together with the path to confirm the mount point.

    Step 6: Change into the mount directory and create a file using echo command.

    Step 7: Check the created file and data using ls and cat commands, then exit the Pod once this has been confirmed.

    Step 8: Delete and recreate the Pod.

    Recreate the Pod and check the status:

    Step 9: Exec into the Pod and check the previous file and data created if they exist in the new Pod.

    You can see that the file and data created in the deleted Pod above are still there and were taken up by the new Pod.

    Summary: PersistentVolumes and PersistentVolumeClaims are Kubernetes objects that work in tandem to give your Kubernetes applications a higher level of persistence, either statically or dynamically, with the help of StorageClass.

    $ vim pv-config.yaml
    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: my-volume
    spec:
      capacity:
        storage: 3Gi
      accessModes:
        - ReadWriteOnce
      hostPath:
        path: "/app/data"
    $ kubectl create -f pv-config.yaml
    persistentvolume/my-volume created
    $ kubectl get pv
    
    NAME        CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM   STORAGECLASS   REASON   AGE
    my-volume   3Gi        RWO            Retain           Available                                   60s
    $ kubectl describe pv my-volume
    
    Labels:          <none>
    Annotations:     <none>
    Finalizers:      [kubernetes.io/pv-protection]
    StorageClass:
    Status:          Available
    Claim:
    Reclaim Policy:  Retain
    Access Modes:    RWO
    VolumeMode:      Filesystem
    Capacity:        3Gi
    Node Affinity:   <none>
    Message:
    Source:
        Type:          HostPath (bare host directory volume)
        Path:          /app/data
        HostPathType:
    Events:            <none>
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: my-claim
    spec:
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 3Gi
    $ vim pvc-config.yaml
    $ kubectl create -f pvc-config.yaml
    persistentvolumeclaim/my-claim created
    $ kubectl get pvc my-claim
    
    NAME       STATUS     VOLUME      CAPACITY   ACCESS MODES   STORAGECLASS   AGE
    my-claim   Bound    my-volume       3Gi           RWO                      7m
    $ kubectl describe pvc my-claim
    
    Name:          my-claim
    Namespace:     default
    StorageClass: 
    Status:        Bound
    Volume:        my-volume
    Labels:        <none>
    Annotations:   pv.kubernetes.io/bind-completed: yes
                   pv.kubernetes.io/bound-by-controller: yes
    Finalizers:    [kubernetes.io/pvc-protection]
    Capacity:      3Gi
    Access Modes:  RWO
    VolumeMode:    Filesystem
    Mounted By:    <none>
    Events:        <none>
    $ kubectl get pv my-volume
    
    NAME        CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM              STORAGECLASS   REASON   AGE
    my-volume   3Gi        RWO            Retain           Bound    default/my-claim                           11m
    $ kubectl delete pvc my-claim
    
    persistentvolumeclaim "my-claim" deleted
    $ kubectl get pvc my-claim
    
    No resources found in default namespace.
    $ kubectl get pv my-volume
    
    NAME        CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS     CLAIM              STORAGECLASS   REASON   AGE
    my-volume   3Gi        RWO            Retain           Released   default/my-claim                           21m
    apiVersion: v1
    kind: Pod
    metadata:
      name: my-pod
    spec:
      containers:
      - name: stclass-test
        image: nginx
        volumeMounts:
        - mountPath: "/app/data"
          name: my-volume
      volumes:
      - name: my-volume
        persistentVolumeClaim:
          claimName: my-claim 
    $ vim pvc-pod.yaml
    $ kubectl create -f pvc-pod.yaml
    
    pod/my-pod created
    $ kubectl get pod my-pod
    
    NAME      READY   STATUS    RESTARTS    AGE
    my-pod     1/1    Running      0       6m50s
    $ kubectl describe pod my-pod
    
    Name:         my-pod
    Namespace:    default
    Priority:     0
    Node:         node01/172.17.0.57
    Start Time:   Tue, 01 Dec 2020 11:44:08 +0000
    Labels:       <none>
    Annotations:  <none>
    Status:       Running
    IP:           10.244.1.2
    IPs:
      IP:  10.244.1.2
    Volumes:
      my-volume:
        Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
        ClaimName:  my-claim
        ReadOnly:   false
      default-token-nlmxj:
        Type:       Secret (a volume populated by a Secret)
        SecretName: default-token-nlmxj
        Optional:   false
    QoS Class:      BestEffort
    $ kubectl exec -it my-pod -- bin/bash
    
    root@my-pod:/# 
    root@my-pod:/# df -h /app/data
    
    Filesystem                    Size   Used   Avail   Use%   Mounted on
    /dev/mapper/host01--vg-root   191G    22G    159G    13%   /app/data
    root@my-pod:/# cd /app/data
    root@my-pod:/app/data# echo "I love Kubermatic" > file.txt
    root@my-pod:/app/data# ls
    file.txt
    root@my-pod:/app/data# cat file.txt
    "I love Kubermatic"
    root@my-pod:/app/data# exit
    $ kubectl delete pod my-pod
    pod "my-pod" deleted
    $ kubectl get pods
    
    No resources found in default namespace.
    $ kubectl create -f pvc-pod.yaml
    pod/my-pod created
    
    $ kubectl get pod my-pod
    
    NAME     READY     STATUS    RESTARTS   AGE
    my-pod    1/1      Running     0         5s
    $ kubectl exec -it my-pod -- bin/bash
    root@my-pod:/# df -h /app/data
    
    Filesystem                    Size   Used   Avail   Use%   Mounted on
    /dev/mapper/host01--vg-root   191G    22G    159G    13%   /app/data
    
    root@my-pod:/# cd /app/data
    root@my-pod:/app/data# ls
    file.txt
    root@my-pod:/app/data# cat file.txt
    "I love Kubermatic"


    Amazon RDS for PostgreSQL

    PostgreSQL has become the preferred open source relational database for many enterprise developers and start-ups, powering leading business and mobile applications. Amazon RDS makes it easy to set up, operate, and scale PostgreSQL deployments in the cloud. With Amazon RDS, you can deploy scalable PostgreSQL deployments in minutes with cost-efficient and resizable hardware capacity. Amazon RDS manages complex and time-consuming administrative tasks such as PostgreSQL software installation and upgrades; storage management; replication for high availability and read throughput; and backups for disaster recovery.

    Amazon RDS for PostgreSQL gives you access to the capabilities of the familiar PostgreSQL database engine. This means that the code, applications, and tools you already use today with your existing databases can be used with Amazon RDS. Amazon RDS for PostgreSQL currently supports PostgreSQL 9.6, 10, 11, 12, 13, 14 and 15.

    Benefits

    Easy, managed deployments

    It takes only a few clicks in the Amazon Web Services Management Console to launch and connect to a production-ready PostgreSQL database in minutes. Amazon RDS for PostgreSQL database instances are pre-configured with parameters and settings for the server type you have selected. Database parameter groups provide granular control and fine-tuning of your PostgreSQL database.
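The same launch can also be scripted with the AWS CLI. The identifier, instance class, storage size, and credentials below are placeholders, and your region, networking, and parameter-group settings will differ; treat this as a sketch rather than a production-ready command:

```bash
# Create a PostgreSQL instance with illustrative values only
aws rds create-db-instance \
  --db-instance-identifier example-postgres \
  --engine postgres \
  --db-instance-class db.t3.medium \
  --allocated-storage 100 \
  --master-username exampleadmin \
  --master-user-password 'change-me' \
  --backup-retention-period 7 \
  --multi-az
```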

    Fast, predictable storage

    Amazon RDS provides two SSD-backed storage options for your PostgreSQL database. General Purpose storage provides cost-effective storage for small or medium-sized workloads. For high-performance OLTP applications, Provisioned IOPS delivers consistent performance of up to 80,000 IOs per second. As your storage requirements grow you can provision additional storage on-the-fly with zero downtime.

    Backup and recovery

    The automated backup feature of Amazon RDS enables recovery of your PostgreSQL database instance to any point in time within your specified retention period of up to thirty five days. In addition, you can perform user-initiated backups of your DB Instance. These full database backups will be stored by Amazon RDS until you explicitly delete them.

    High availability and read replicas

    Amazon RDS Multi-AZ deployments provide enhanced availability and durability for your PostgreSQL databases, making them a natural fit for production database workloads. Amazon RDS Read Replicas make it easy to elastically scale out beyond the capacity constraints of a single database instance for read-heavy database workloads.

    Monitoring and metrics

    Amazon RDS for PostgreSQL provides Amazon CloudWatch metrics for your database instances at no additional charge, and Amazon RDS Enhanced Monitoring provides access to over 50 CPU, memory, file system, and disk I/O metrics. View key operational metrics in the Amazon Web Services Management Console, including compute/memory/storage capacity utilization, I/O activity, and instance connections.

    Isolation and security

    As a managed service, Amazon RDS provides a high level of security for your PostgreSQL databases. This includes network isolation using Amazon Virtual Private Cloud (VPC), encryption at rest using keys you create and control through Amazon Key Management Service (KMS), and encryption of data in transit using SSL.

    AWS

    This AWS tutorial covers basic and advanced concepts and is designed for beginners and professionals alike.

    AWS stands for Amazon Web Services, which uses distributed IT infrastructure to provide different IT resources on demand.

    The tutorial covers topics such as an introduction, the history of AWS, global infrastructure, AWS features, IAM, storage services, database services, and more.

    What is AWS?

    • AWS stands for Amazon Web Services.

    • AWS is provided by Amazon and uses distributed IT infrastructure to make different IT resources available on demand. It offers different service models such as infrastructure as a service (IaaS), platform as a service (PaaS), and packaged software as a service (SaaS).

    • Amazon launched AWS, a cloud computing platform, to allow different organizations to take advantage of reliable IT infrastructure.

    Uses of AWS

    • A small manufacturing organization can use AWS to expand its business while leaving IT management to AWS.

    • A large enterprise spread across the globe can use AWS to deliver training to its distributed workforce.

    • An architecture consulting company can use AWS for high-compute rendering of construction prototypes.

    • A media company can use AWS to deliver different types of content, such as e-books or audio files, to customers worldwide.

    Pay-As-You-Go

    AWS provides its services to customers on a pay-as-you-go basis.


    Advantages of AWS

    1) Flexibility

    • The instant availability of new features and services in AWS frees up more time for core business tasks.

    • It provides effortless hosting of legacy applications. AWS does not require learning new technologies, and migrating applications to AWS gives you advanced computing and efficient storage.

    • AWS also offers the choice of whether to run applications and services together or not. You can run part of the IT infrastructure in AWS and keep the remaining part in your own data centres.

    2) Cost-effectiveness

    AWS requires no upfront investment or long-term commitment and involves minimal expense compared with traditional IT infrastructure, which requires a large investment.

    3) Scalability/Elasticity

    With AWS auto scaling and elastic load balancing, capacity is automatically scaled up or down as demand increases or decreases. These techniques are ideal for handling unpredictable or very high loads, so organizations benefit from reduced cost and increased user satisfaction.

    4) Security

    • AWS provides end-to-end security and privacy to customers.

    • AWS has a virtual infrastructure that offers optimum availability while maintaining full privacy and isolation of customer operations.

    • Customers can expect a high level of physical security because of Amazon's many years of experience in designing, developing, and maintaining large-scale IT operation centers.

    • AWS ensures the three aspects of security: confidentiality, integrity, and availability of users' data.

    History of AWS

    • 2003: Chris Pinkham and Benjamin Black presented a paper on what Amazon's own internal infrastructure should look like and suggested selling it as a service. They prepared a six-page business case, and it was decided to proceed with it.

    • 2004: SQS (Simple Queue Service) was officially launched. The team that built it was based in Cape Town, South Africa.

    • 2006: AWS (Amazon Web Services) was officially launched.

    • 2007: Over 180,000 developers had signed up for AWS.

    • 2010: The amazon.com retail web services were moved to AWS, i.e., amazon.com now runs on AWS.

    • 2011: AWS suffered a major outage: some EBS (Elastic Block Store) volumes became stuck and were unable to serve read and write requests. It took two days for the problem to be resolved.

    • 2012: AWS hosted its first customer event, the re:Invent conference, at which new products were launched. The same year, another major outage affected popular sites such as Pinterest, Reddit, and Foursquare.

    • 2013: Certifications were launched: AWS started a certification program for engineers with expertise in cloud computing.

    • 2014: AWS committed to achieving 100% renewable energy usage for its global footprint.

    • 2015: AWS revenue reached $6 billion per annum and was growing 90% every year.

    • 2016: Revenue doubled and reached $13 billion per annum.

    • 2017: AWS re:Invent introduced a host of artificial intelligence services, and AWS revenue doubled again, reaching $27 billion per annum.

    • 2018: AWS launched a Machine Learning Specialty certification, with a heavy focus on automating artificial intelligence and machine learning.

  • SeMA Deployment Architecture

    SeMA Logical deployment architecture

    Odoo follows a multi-tier architecture, and we can identify three main tiers: Data, Logic, and Presentation:

    Odoo logical Architecture

    The Data tier is the lowest-level layer, and is responsible for data storage and persistence. Odoo relies on a PostgreSQL server for this. PostgreSQL is the only supported RDBMS, and this is a design choice. So, other databases such as MySQL are not supported. Binary files, such as attachments of documents or images, are usually stored in a filesystem.

    The Logic tier is responsible for all the interactions with the data layer, and is handled by the Odoo server. As a general rule, the low-level database should only be accessed by this layer, since it is the only way to ensure security access control and data consistency. At the core of the Odoo server, we have the Object-Relational Mapping (ORM) engine for this interface. The ORM provides the application programming interface (API) used by the addon modules to interact with the data.

    The Presentation tier is responsible for presenting data and interacting with the user. It is implemented by a client responsible for all the user experience. The client interacts with the ORM API to read, write, verify, or perform any other action, calling ORM API methods through remote procedure calls (RPCs). These are sent to the Odoo server for processing, and then the results are sent back to the client for further handling.

    Ingress Controller: NGINX provides web serving, reverse proxying, caching, load balancing, media streaming, and more. It started out as a web server designed for maximum performance and stability. In addition to its HTTP server capabilities, NGINX can also function as a proxy server for email (IMAP, POP3, and SMTP) and as a reverse proxy and load balancer for HTTP, TCP, and UDP servers.

    Redis Server: Redis is a fast, open source, in-memory, key-value data store. All Redis data resides in memory, which enables low-latency, high-throughput data access. Unlike traditional databases, in-memory data stores don't require a trip to disk, reducing engine latency to microseconds. Because of this, they can support an order of magnitude more operations with faster response times: blazing-fast performance with average read and write operations taking less than a millisecond, and support for millions of operations per second.

    BI Tool [Metabase, Superset]: Business intelligence (BI) tools are application software that collect and process large amounts of unstructured data from internal and external systems, including books, journals, documents, health records, images, files, email, video, and other business sources.
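To illustrate how the NGINX Ingress Controller exposes the application in a Kubernetes deployment of this shape, here is a minimal Ingress sketch; the host, Service name, and annotation value are hypothetical and would be adapted to the actual cluster:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: odoo-ingress                                     # hypothetical name
  annotations:
    nginx.ingress.kubernetes.io/proxy-body-size: "64m"   # allow larger attachment uploads
spec:
  ingressClassName: nginx
  rules:
  - host: erp.example.com                                # placeholder host
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: odoo                                   # hypothetical Service name
            port:
              number: 8069                               # Odoo's default HTTP port
```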

    SeMA physical Deployment Architecture

    Sizing for performance and load requirements is an iterative process that estimates the number of CPUs and corresponding memory required to support the services in the deployed system. When estimating the number of CPUs required to support a service, consider the following:

    • Use cases and corresponding usage analysis that apply to the service

    • System requirements determined during analysis for technical requirements

    • Past experience with the Odoo System components providing the service

    The process of sizing for performance typically consists of the following steps. The ordering of these steps is not critical—it simply provides a way to consider the factors that affect the final result.

    1. Determine a baseline CPU estimate for components identified as user entry points to the system.

    2. Make adjustments to the CPU estimates to account for dependencies between components.

    3. Make adjustments to the CPU estimates to reflect security, availability, scalability, and latent capacity requirements.

    SeMA Deployment Logical Architecture
    SeMA Deployment Architecture

    Offline Production Deployment

    Linux for DevOps

    Linux History

    Evolution of computers: In the early days, computers were as big as houses or parks, so you can imagine how difficult it was to operate them. Moreover, every computer had a different operating system, which made them even harder to operate, and every piece of software was designed for a specific purpose and could not run on any other computer. Computers were extremely costly, and ordinary people could neither afford nor understand them.

    Evolution of Unix: In 1969, a team of developers at Bell Labs started a project to build a common software platform for all computers and named it 'Unix'. It was simple and elegant, was written in the 'C' language instead of assembly language, and its code was reusable. Because it was reusable, a part of its code, now commonly called the 'kernel', was used to develop the operating system and other functions, and it could be used on different systems. Its source code was also open source.

    Initially, Unix was only found in large organizations such as governments, universities, or large financial corporations with mainframes and minicomputers (a PC is a microcomputer).

    Linux Licensing

    Linus Torvalds released the Linux kernel under the GNU General Public License (GPL) version 2. The GPL ensures that any software licensed under it must keep its originating source code open and freely available to all its users. Here, "freely" does not refer to cost; it means users are free to distribute and modify the code.

    Linux Features

    • Portable: Linux can run on different types of hardware, and the Linux kernel supports installation in any kind of hardware environment.

    • Open source: The Linux source code is freely available, and many teams collaborate to enhance the capabilities of the operating system.

    • Multiprogramming: Linux is a multiprogramming system, meaning more than one application can be executed at the same time.

    • Multi-user: Linux is also a multi-user system, meaning more than one user can use system resources such as application programs, memory, or RAM at the same time.

    • Hierarchical file system: Linux provides a standard file structure in which user files and system files are arranged.

    • Security: Linux provides user security through authentication features such as controlled access to specific files, password protection, and data encryption.

    • Shell: Linux provides an interpreter program (the shell) that can be used to execute operating system commands and to perform various types of tasks, such as calling application programs.

    Why Use Linux?

    Linux is completely different from other operating systems in many ways. It is an open source OS, which gives programmers the advantage of being able to design their own custom operating systems. It offers a wide choice of programs with different features, so you can choose according to your needs. A global development community looks at different ways to enhance its security, so it is highly secure and robust and you do not need an antivirus to scan it regularly. Companies like Google, Amazon, and Facebook use Linux to protect their servers, as it is highly reliable and stable. Above all, you don't have to pay for software or server licensing to install Linux; it is absolutely free, and you can install it on as many computers as you want. It is a largely trouble-free operating system and does not have the same problems with viruses and malware slowing the computer down.

    Linux Distributions

  • DevOps Part 1: Interview

    DevOps Terminology

    Linux

    GIT

    Docker

    Live code screening


    • Check user information

    • Git branching model

    • Docker network

    References

    1- https://www.edureka.co/blog/interview-questions/top-devops-interview-questions-2016/

    2- https://www.simplilearn.com/tutorials/devops-tutorial/devops-interview-questions

    3- https://www.guru99.com/devops-interview-questions.html
     Lab 1 (a bash sketch follows the lab list below):
            - script to detect the OS
            - install git
            - clone a public project from GitHub
            - back up the project directory with a date stamp
            - cron job for the backup directory
     Lab 2:
            - build a Dockerfile for Spring, Angular, or Laravel
            - build a Docker Compose file for Spring, Angular, or Laravel
     Advanced Lab:
            - build a docker-compose project with microservices using git submodules
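A possible starting point for Lab 1 is sketched below. The repository URL, backup location, and cron schedule are placeholders to adapt; treat it as a sketch rather than a finished solution:

```bash
#!/usr/bin/env bash
# Lab 1 sketch: detect the OS, install git, clone a public repo, back it up, schedule the backup
set -euo pipefail

# Detect the OS family from /etc/os-release
. /etc/os-release
echo "Detected OS: ${NAME} ${VERSION_ID}"

# Install git with the matching package manager
case "${ID}" in
  ubuntu|debian)       sudo apt-get update && sudo apt-get install -y git ;;
  centos|rhel|fedora)  sudo yum install -y git ;;
  *) echo "Unsupported distribution: ${ID}" >&2; exit 1 ;;
esac

# Clone a public project (placeholder URL); skip if it is already present
PROJECT_DIR="$HOME/spring-petclinic"
[ -d "$PROJECT_DIR" ] || git clone https://github.com/spring-projects/spring-petclinic.git "$PROJECT_DIR"

# Back up the project directory with a date stamp
BACKUP_DIR="$HOME/backups"
mkdir -p "$BACKUP_DIR"
tar -czf "$BACKUP_DIR/project-$(date +%F).tar.gz" -C "$(dirname "$PROJECT_DIR")" "$(basename "$PROJECT_DIR")"

# Register a daily cron job for the backup (01:00; note the escaped % required inside crontab)
( crontab -l 2>/dev/null; \
  echo "0 1 * * * tar -czf $BACKUP_DIR/project-\$(date +\%F).tar.gz -C $(dirname "$PROJECT_DIR") $(basename "$PROJECT_DIR")" ) | crontab -
```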

    Kubernetes Session 3

    Deployments

    ReplicaSet

    StatefulSets

    DaemonSet

    Jobs

    Automatic Cleanup for Finished Jobs

    A time-to-live mechanism to clean up old Jobs that have finished execution.

    ReplicaSet

    A ReplicaSet's purpose is to maintain a stable set of replica Pods running at any given time. As such, it is often used to guarantee the availability of a specified number of identical Pods.

    How a ReplicaSet works

    A ReplicaSet is defined with fields, including a selector that specifies how to identify Pods it can acquire, a number of replicas indicating how many Pods it should be maintaining, and a pod template specifying the data of new Pods it should create to meet the number of replicas criteria. A ReplicaSet then fulfills its purpose by creating and deleting Pods as needed to reach the desired number. When a ReplicaSet needs to create new Pods, it uses its Pod template.

    A ReplicaSet is linked to its Pods via the Pods' metadata.ownerReferences field, which specifies what resource the current object is owned by. All Pods acquired by a ReplicaSet have their owning ReplicaSet's identifying information within their ownerReferences field. It's through this link that the ReplicaSet knows of the state of the Pods it is maintaining and plans accordingly.

    A ReplicaSet identifies new Pods to acquire by using its selector. If there is a Pod that has no OwnerReference, or whose OwnerReference is not a Controller, and it matches a ReplicaSet's selector, it will be immediately acquired by said ReplicaSet.

    When to use a ReplicaSet

    A ReplicaSet ensures that a specified number of pod replicas are running at any given time. However, a Deployment is a higher-level concept that manages ReplicaSets and provides declarative updates to Pods along with a lot of other useful features. Therefore, we recommend using Deployments instead of directly using ReplicaSets, unless you require custom update orchestration or don't require updates at all.

    This actually means that you may never need to manipulate ReplicaSet objects: use a Deployment instead, and define your application in the spec section.

    Example

    Saving this manifest into frontend.yaml and submitting it to a Kubernetes cluster will create the defined ReplicaSet and the Pods that it manages.
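A minimal frontend ReplicaSet consistent with this description is sketched below; the labels and container image follow the upstream Kubernetes example and are illustrative rather than specific to this deployment:

```yaml
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: frontend
  labels:
    app: guestbook
    tier: frontend
spec:
  replicas: 3
  selector:
    matchLabels:
      tier: frontend
  template:
    metadata:
      labels:
        tier: frontend
    spec:
      containers:
      - name: php-redis
        image: us-docker.pkg.dev/google-samples/containers/gke/gb-frontend:v5   # illustrative image
```

You would then submit it with kubectl apply -f frontend.yaml.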

    You can then get the current ReplicaSets deployed:

    And see the frontend one you created:

    You can also check on the state of the ReplicaSet:

    And you will see output similar to:

    And lastly you can check for the Pods brought up:

    You should see Pod information similar to:
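A sketch of those inspection commands, with illustrative output based on the frontend example above:

```bash
kubectl get rs
# NAME       DESIRED   CURRENT   READY   AGE
# frontend   3         3         3       6s

kubectl describe rs/frontend

kubectl get pods
# NAME             READY   STATUS    RESTARTS   AGE
# frontend-gbgfx   1/1     Running   0          10s
# frontend-rdukp   1/1     Running   0          10s
# frontend-wzqmz   1/1     Running   0          10s
```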

    You can also verify that the owner reference of these pods is set to the frontend ReplicaSet. To do this, get the yaml of one of the Pods running:

    The output will look similar to this, with the frontend ReplicaSet's info set in the metadata's ownerReferences field:
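A sketch of that check; the Pod name is illustrative:

```bash
kubectl get pods frontend-gbgfx -o yaml
# ...
# metadata:
#   ownerReferences:
#   - apiVersion: apps/v1
#     blockOwnerDeletion: true
#     controller: true
#     kind: ReplicaSet
#     name: frontend
# ...
```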

    Non-Template Pod acquisitions

    While you can create bare Pods with no problems, it is strongly recommended to make sure that the bare Pods do not have labels which match the selector of one of your ReplicaSets. The reason for this is that a ReplicaSet is not limited to owning Pods specified by its template; it can acquire other Pods in the manner specified in the previous sections.

    Take the previous frontend ReplicaSet example, and the Pods specified in the following manifest:
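A sketch of such bare Pods, carrying labels that match the frontend selector (the names and image are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod1
  labels:
    tier: frontend
spec:
  containers:
  - name: hello1
    image: nginx   # illustrative image
---
apiVersion: v1
kind: Pod
metadata:
  name: pod2
  labels:
    tier: frontend
spec:
  containers:
  - name: hello2
    image: nginx   # illustrative image
```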

    As those Pods do not have a Controller (or any object) as their owner reference and match the selector of the frontend ReplicaSet, they will immediately be acquired by it.

    Suppose you create the Pods after the frontend ReplicaSet has been deployed and has set up its initial Pod replicas to fulfill its replica count requirement:

    The new Pods will be acquired by the ReplicaSet, and then immediately terminated as the ReplicaSet would be over its desired count.

    Fetching the Pods:

    The output shows that the new Pods are either already terminated, or in the process of being terminated:

    If you create the Pods first:

    And then create the ReplicaSet however:

    You shall see that the ReplicaSet has acquired the Pods and has only created new ones according to its spec until the number of its new Pods and the original matches its desired count. As fetching the Pods:

    Will reveal in its output:

    In this manner, a ReplicaSet can own a non-homogenous set of Pods

    Writing a ReplicaSet manifest

    As with all other Kubernetes API objects, a ReplicaSet needs the apiVersion, kind, and metadata fields. For ReplicaSets, the kind is always a ReplicaSet.

    When the control plane creates new Pods for a ReplicaSet, the .metadata.name of the ReplicaSet is part of the basis for naming those Pods. The name of a ReplicaSet must be a valid DNS subdomain name, but this can produce unexpected results for the Pod hostnames. For best compatibility, the name should follow the more restrictive rules for a DNS label.

    A ReplicaSet also needs a .spec section.

    Pod Template

    The .spec.template is a pod template which is also required to have labels in place. In our frontend.yaml example we had one label: tier: frontend. Be careful not to overlap with the selectors of other controllers, lest they try to adopt this Pod.

    For the template's restart policy field, .spec.template.spec.restartPolicy, the only allowed value is Always, which is the default.

    Pod Selector

    The .spec.selector field is a label selector. As discussed earlier, these are the labels used to identify potential Pods to acquire. In our frontend.yaml example, the selector was:
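In sketch form, matching the frontend example:

```yaml
selector:
  matchLabels:
    tier: frontend
```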

    In the ReplicaSet, .spec.template.metadata.labels must match .spec.selector, or it will be rejected by the API.

    Note: For 2 ReplicaSets specifying the same .spec.selector but different .spec.template.metadata.labels and .spec.template.spec fields, each ReplicaSet ignores the Pods created by the other ReplicaSet.

    Replicas

    You can specify how many Pods should run concurrently by setting .spec.replicas. The ReplicaSet will create/delete its Pods to match this number.

    If you do not specify .spec.replicas, then it defaults to 1.

    Working with ReplicaSets

    Deleting a ReplicaSet and its Pods

    To delete a ReplicaSet and all of its Pods, use kubectl delete. The garbage collector automatically deletes all of the dependent Pods by default.

    When using the REST API or the client-go library, you must set propagationPolicy to Background or Foreground in the -d option. For example:
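A sketch of such a request, issued through kubectl proxy; the namespace and name follow the frontend example:

```bash
kubectl proxy --port=8080   # run in a separate terminal
curl -X DELETE 'localhost:8080/apis/apps/v1/namespaces/default/replicasets/frontend' \
  -d '{"kind":"DeleteOptions","apiVersion":"v1","propagationPolicy":"Foreground"}' \
  -H "Content-Type: application/json"
```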

    Deleting just a ReplicaSet

    You can delete a ReplicaSet without affecting any of its Pods using kubectl delete with the --cascade=orphan option. When using the REST API or the client-go library, you must set propagationPolicy to Orphan. For example:
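A corresponding sketch with the Orphan policy:

```bash
kubectl proxy --port=8080   # run in a separate terminal
curl -X DELETE 'localhost:8080/apis/apps/v1/namespaces/default/replicasets/frontend' \
  -d '{"kind":"DeleteOptions","apiVersion":"v1","propagationPolicy":"Orphan"}' \
  -H "Content-Type: application/json"
```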

    Once the original is deleted, you can create a new ReplicaSet to replace it. As long as the old and new .spec.selector are the same, then the new one will adopt the old Pods. However, it will not make any effort to make existing Pods match a new, different pod template. To update Pods to a new spec in a controlled way, use a Deployment, as ReplicaSets do not support a rolling update directly.

    Isolating Pods from a ReplicaSet

    You can remove Pods from a ReplicaSet by changing their labels. This technique may be used to remove Pods from service for debugging, data recovery, etc. Pods that are removed in this way will be replaced automatically ( assuming that the number of replicas is not also changed).
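For example, overwriting the matched label detaches a Pod from the ReplicaSet, which then creates a replacement; the Pod name and the new label value are illustrative:

```bash
# The ReplicaSet stops managing this Pod and spins up a new replica in its place
kubectl label pod frontend-gbgfx tier=debug --overwrite
```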

    Scaling a ReplicaSet

    A ReplicaSet can be easily scaled up or down by simply updating the .spec.replicas field. The ReplicaSet controller ensures that a desired number of Pods with a matching label selector are available and operational.

    When scaling down, the ReplicaSet controller chooses which pods to delete by sorting the available pods to prioritize scaling down pods based on the following general algorithm:

    1. Pending (and unschedulable) pods are scaled down first

    2. If controller.kubernetes.io/pod-deletion-cost annotation is set, then the pod with the lower value will come first.

    3. Pods on nodes with more replicas come before pods on nodes with fewer replicas.

    If all of the above match, then selection is random.

    Pod deletion cost

    FEATURE STATE: Kubernetes v1.22 [beta]

    Using the controller.kubernetes.io/pod-deletion-cost annotation, users can set a preference regarding which pods to remove first when downscaling a ReplicaSet.

    The annotation should be set on the pod, the range is [-2147483647, 2147483647]. It represents the cost of deleting a pod compared to other pods belonging to the same ReplicaSet. Pods with lower deletion cost are preferred to be deleted before pods with higher deletion cost.

    The implicit value for this annotation for pods that don't set it is 0; negative values are permitted. Invalid values will be rejected by the API server.

    This feature is beta and enabled by default. You can disable it using the PodDeletionCost feature gate in both kube-apiserver and kube-controller-manager.

    Note:

    • This is honored on a best-effort basis, so it does not offer any guarantees on pod deletion order.

    • Users should avoid updating the annotation frequently, such as updating it based on a metric value, because doing so will generate a significant number of pod updates on the apiserver.

    Example Use Case

    The different pods of an application could have different utilization levels. On scale down, the application may prefer to remove the pods with lower utilization. To avoid frequently updating the pods, the application should update controller.kubernetes.io/pod-deletion-cost once before issuing a scale down (setting the annotation to a value proportional to pod utilization level). This works if the application itself controls the down scaling; for example, the driver pod of a Spark deployment.

    ReplicaSet as a Horizontal Pod Autoscaler Target

    A ReplicaSet can also be a target for a Horizontal Pod Autoscaler (HPA). That is, a ReplicaSet can be auto-scaled by an HPA. Here is an example HPA targeting the ReplicaSet we created in the previous example.
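A minimal manifest of that kind; the CPU target and replica bounds are illustrative:

```yaml
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: frontend-scaler
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: ReplicaSet
    name: frontend
  minReplicas: 3
  maxReplicas: 10
  targetCPUUtilizationPercentage: 50
```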

    Saving this manifest into hpa-rs.yaml and submitting it to a Kubernetes cluster should create the defined HPA that autoscales the target ReplicaSet depending on the CPU usage of the replicated Pods.

    Alternatively, you can use the kubectl autoscale command to accomplish the same (and it's easier!)
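A sketch of the equivalent command:

```bash
kubectl autoscale rs frontend --max=10 --min=3 --cpu-percent=50
```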

    ReplicationController

    Note: A Deployment that configures a ReplicaSet is now the recommended way to set up replication.

    A ReplicationController ensures that a specified number of pod replicas are running at any one time. In other words, a ReplicationController makes sure that a pod or a homogeneous set of pods is always up and available.

    How a ReplicationController Works

    If there are too many pods, the ReplicationController terminates the extra pods. If there are too few, the ReplicationController starts more pods. Unlike manually created pods, the pods maintained by a ReplicationController are automatically replaced if they fail, are deleted, or are terminated. For example, your pods are re-created on a node after disruptive maintenance such as a kernel upgrade. For this reason, you should use a ReplicationController even if your application requires only a single pod. A ReplicationController is similar to a process supervisor, but instead of supervising individual processes on a single node, the ReplicationController supervises multiple pods across multiple nodes.

    ReplicationController is often abbreviated to "rc" in discussion, and as a shortcut in kubectl commands.

    A simple case is to create one ReplicationController object to reliably run one instance of a Pod indefinitely. A more complex use case is to run several identical replicas of a replicated service, such as web servers.

    Running an example ReplicationController

    This example ReplicationController config runs three copies of the nginx web server.
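A minimal ReplicationController of that shape, saved as replication.yaml (the file name is referred to again below; the image tag is illustrative):

```yaml
apiVersion: v1
kind: ReplicationController
metadata:
  name: nginx
spec:
  replicas: 3
  selector:
    app: nginx
  template:
    metadata:
      name: nginx
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx   # illustrative image tag
        ports:
        - containerPort: 80
```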

    Run the example job by downloading the example file and then running this command:

    The output is similar to this:

    Check on the status of the ReplicationController using this command:

    The output is similar to this:

    Here, three pods are created, but none is running yet, perhaps because the image is being pulled. A little later, the same command may show:
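A sketch of those commands, with illustrative output:

```bash
kubectl apply -f replication.yaml
# replicationcontroller/nginx created

kubectl describe replicationcontrollers/nginx
# Name:         nginx
# Namespace:    default
# Selector:     app=nginx
# Replicas:     3 current / 3 desired
# Pods Status:  3 Running / 0 Waiting / 0 Succeeded / 0 Failed
# ...
```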

    To list all the pods that belong to the ReplicationController in a machine readable form, you can use a command like this:
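A sketch of such a command; the selector matches the labels used in the manifest sketch above:

```bash
pods=$(kubectl get pods --selector=app=nginx --output=jsonpath={.items..metadata.name})
echo $pods
# nginx-3ntk0 nginx-4ok8v nginx-qrm3m   (illustrative pod names)
```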

    The output is similar to this:

    Here, the selector is the same as the selector for the ReplicationController (seen in the kubectl describe output), and in a different form in replication.yaml. The --output=jsonpath option specifies an expression with the name from each pod in the returned list.

    Writing a ReplicationController Manifest

    As with all other Kubernetes config, a ReplicationController needs apiVersion, kind, and metadata fields.

    When the control plane creates new Pods for a ReplicationController, the .metadata.name of the ReplicationController is part of the basis for naming those Pods. The name of a ReplicationController must be a valid DNS subdomain name, but this can produce unexpected results for the Pod hostnames. For best compatibility, the name should follow the more restrictive rules for a DNS label.

    For general information about working with configuration files, see the Kubernetes documentation on managing objects with configuration files.

    A ReplicationController also needs a .spec section.

    Pod Template

    The .spec.template is the only required field of the .spec.

    The .spec.template is a pod template. It has exactly the same schema as a Pod, except it is nested and does not have an apiVersion or kind.

    In addition to required fields for a Pod, a pod template in a ReplicationController must specify appropriate labels and an appropriate restart policy. For labels, make sure not to overlap with other controllers. See pod selector below.

    Only a .spec.template.spec.restartPolicy equal to Always is allowed, which is the default if not specified.

    For local container restarts, ReplicationControllers delegate to an agent on the node, for example the kubelet.

    Labels on the ReplicationController

    The ReplicationController can itself have labels (.metadata.labels). Typically, you would set these the same as the .spec.template.metadata.labels; if .metadata.labels is not specified then it defaults to .spec.template.metadata.labels. However, they are allowed to be different, and the .metadata.labels do not affect the behavior of the ReplicationController.

    Pod Selector

    The .spec.selector field is a label selector. A ReplicationController manages all the pods with labels that match the selector. It does not distinguish between pods that it created or deleted and pods that another person or process created or deleted. This allows the ReplicationController to be replaced without affecting the running pods.

    If specified, the .spec.template.metadata.labels must be equal to the .spec.selector, or it will be rejected by the API. If .spec.selector is unspecified, it will be defaulted to .spec.template.metadata.labels.

    Also you should not normally create any pods whose labels match this selector, either directly, with another ReplicationController, or with another controller such as Job. If you do so, the ReplicationController thinks that it created the other pods. Kubernetes does not stop you from doing this.

    If you do end up with multiple controllers that have overlapping selectors, you will have to manage the deletion yourself.

    Multiple Replicas

    You can specify how many pods should run concurrently by setting .spec.replicas to the number of pods you would like to have running concurrently. The number running at any time may be higher or lower, such as if the replicas were just increased or decreased, or if a pod is gracefully shutdown, and a replacement starts early.

    If you do not specify .spec.replicas, then it defaults to 1.

    Working with ReplicationControllers

    Deleting a ReplicationController and its Pods

    To delete a ReplicationController and all its pods, use kubectl delete. Kubectl will scale the ReplicationController to zero and wait for it to delete each pod before deleting the ReplicationController itself. If this kubectl command is interrupted, it can be restarted.

    When using the REST API or the client-go library, you need to do the steps explicitly (scale replicas to 0, wait for pod deletions, then delete the ReplicationController).

    Deleting only a ReplicationController

    You can delete a ReplicationController without affecting any of its pods.

    Using kubectl, specify the --cascade=orphan option to kubectl delete.

    When using the REST API or the client-go library, you can delete the ReplicationController object.

    Once the original is deleted, you can create a new ReplicationController to replace it. As long as the old and new .spec.selector are the same, then the new one will adopt the old pods. However, it will not make any effort to make existing pods match a new, different pod template. To update pods to a new spec in a controlled way, use a rolling update.

    Isolating pods from a ReplicationController

    Pods may be removed from a ReplicationController's target set by changing their labels. This technique may be used to remove pods from service for debugging and data recovery. Pods that are removed in this way will be replaced automatically (assuming that the number of replicas is not also changed).

    Common usage patterns

    Rescheduling

    As mentioned above, whether you have 1 pod you want to keep running, or 1000, a ReplicationController will ensure that the specified number of pods exists, even in the event of node failure or pod termination (for example, due to an action by another control agent).

    Scaling

    The ReplicationController enables scaling the number of replicas up or down, either manually or by an auto-scaling control agent, by updating the replicas field.

    Rolling updates

    The ReplicationController is designed to facilitate rolling updates to a service by replacing pods one-by-one.

    The recommended approach is to create a new ReplicationController with 1 replica, scale the new (+1) and old (-1) controllers one by one, and then delete the old controller after it reaches 0 replicas. This predictably updates the set of pods regardless of unexpected failures.

    Ideally, the rolling update controller would take application readiness into account, and would ensure that a sufficient number of pods were productively serving at any given time.

    The two ReplicationControllers would need to create pods with at least one differentiating label, such as the image tag of the primary container of the pod, since it is typically image updates that motivate rolling updates.

    Multiple release tracks

    In addition to running multiple releases of an application while a rolling update is in progress, it's common to run multiple releases for an extended period of time, or even continuously, using multiple release tracks. The tracks would be differentiated by labels.

    For instance, a service might target all pods with tier in (frontend), environment in (prod). Now say you have 10 replicated pods that make up this tier. But you want to be able to 'canary' a new version of this component. You could set up a ReplicationController with replicas set to 9 for the bulk of the replicas, with labels tier=frontend, environment=prod, track=stable, and another ReplicationController with replicas set to 1 for the canary, with labels tier=frontend, environment=prod, track=canary. Now the service is covering both the canary and non-canary pods. But you can mess with the ReplicationControllers separately to test things out, monitor the results, etc.

    hashtag
    Using ReplicationControllers with Services

    Multiple ReplicationControllers can sit behind a single service, so that, for example, some traffic goes to the old version, and some goes to the new version.

    A ReplicationController will never terminate on its own, but it isn't expected to be as long-lived as services. Services may be composed of pods controlled by multiple ReplicationControllers, and it is expected that many ReplicationControllers may be created and destroyed over the lifetime of a service (for instance, to perform an update of pods that run the service). Both services themselves and their clients should remain oblivious to the ReplicationControllers that maintain the pods of the services.

    hashtag
    Writing programs for Replication

    Pods created by a ReplicationController are intended to be fungible and semantically identical, though their configurations may become heterogeneous over time. This is an obvious fit for replicated stateless servers, but ReplicationControllers can also be used to maintain availability of master-elected, sharded, and worker-pool applications. Such applications should use dynamic work assignment mechanisms, such as a work queue (for example, RabbitMQ), as opposed to static/one-time customization of the configuration of each pod, which is considered an anti-pattern. Any pod customization performed, such as vertical auto-sizing of resources (for example, cpu or memory), should be performed by another online controller process, not unlike the ReplicationController itself.

    hashtag
    Responsibilities of the ReplicationController

    The ReplicationController ensures that the desired number of pods matching its label selector are operational. Currently, only terminated pods are excluded from its count. In the future, readiness and other information available from the system may be taken into account, we may add more controls over the replacement policy, and we plan to emit events that could be used by external clients to implement arbitrarily sophisticated replacement and/or scale-down policies.

    The ReplicationController is forever constrained to this narrow responsibility. It itself will not perform readiness nor liveness probes. Rather than performing auto-scaling, it is intended to be controlled by an external auto-scaler (as discussed in the Scaling section above), which would change its replicas field. We will not add scheduling policies to the ReplicationController. Nor should it verify that the pods controlled match the currently specified template, as that would obstruct auto-sizing and other automated processes. Similarly, completion deadlines, ordering dependencies, configuration expansion, and other features belong elsewhere. We even plan to factor out the mechanism for bulk pod creation.

    The ReplicationController is intended to be a composable building-block primitive. We expect higher-level APIs and/or tools to be built on top of it and other complementary primitives for user convenience in the future. The "macro" operations currently supported by kubectl (run, scale) are proof-of-concept examples of this. For instance, we could imagine a higher-level tool managing ReplicationControllers, auto-scalers, services, scheduling policies, canaries, and so on.

    hashtag
    Deployments

    A Deployment provides declarative updates for Pods and ReplicaSets.

    You describe a desired state in a Deployment, and the Deployment changes the actual state to the desired state at a controlled rate. You can define Deployments to create new ReplicaSets, or to remove existing Deployments and adopt all their resources with new Deployments.

    Note: Do not manage ReplicaSets owned by a Deployment. Consider opening an issue in the main Kubernetes repository if your use case is not covered below.

    hashtag
    Use Case

    The following are typical use cases for Deployments:

    • Create a Deployment to roll out a ReplicaSet. The ReplicaSet creates Pods in the background. Check the status of the rollout to see if it succeeds or not.

    • Declare the new state of the Pods by updating the PodTemplateSpec of the Deployment. A new ReplicaSet is created and the Deployment manages moving the Pods from the old ReplicaSet to the new one at a controlled rate. Each new ReplicaSet updates the revision of the Deployment.

    • Roll back to an earlier Deployment revision if the current state of the Deployment is not stable. Each rollback updates the revision of the Deployment.

    hashtag
    Creating a Deployment

    Before creating a Deployment, define an image for the container it will run.

    The following is an example of a Deployment. It creates a ReplicaSet to bring up three nginx Pods:
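
    The manifest below is a sketch reconstructed from the fields referenced throughout this walkthrough (the name nginx-deployment, three replicas, the app: nginx labels, and the nginx:1.14.2 image); save it locally as, for example, nginx-deployment.yaml:

    ```yaml
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: nginx-deployment
      labels:
        app: nginx
    spec:
      replicas: 3
      selector:
        matchLabels:
          app: nginx
      template:
        metadata:
          labels:
            app: nginx
        spec:
          containers:
          - name: nginx
            image: nginx:1.14.2
            ports:
            - containerPort: 80
    ```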

    In this example:

    • A Deployment named nginx-deployment is created, indicated by the .metadata.name field. This name will become the basis for the ReplicaSets and Pods which are created later. See Writing a Deployment Spec below for more details.

    • The Deployment creates a ReplicaSet that creates three replicated Pods, indicated by the .spec.replicas field.

    Before you begin, make sure your Kubernetes cluster is up and running. Follow the steps given below to create the above Deployment:

    1. Create the Deployment by running the following command:

    2. Run kubectl get deployments to check if the Deployment was created.

      If the Deployment is still being created, the output is similar to the following:

      When you inspect the Deployments in your cluster, the following fields are displayed:
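
    As a sketch, steps 1 and 2 above can be run as follows, assuming the manifest shown earlier was saved as nginx-deployment.yaml (a hypothetical local file name):

    ```shell
    # Step 1: create the Deployment from the manifest
    kubectl apply -f nginx-deployment.yaml

    # Step 2: check that the Deployment was created and inspect its fields
    kubectl get deployments
    ```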

    Note:

    You must specify an appropriate selector and Pod template labels in a Deployment (in this case, app: nginx).

    Do not overlap labels or selectors with other controllers (including other Deployments and StatefulSets). Kubernetes doesn't stop you from overlapping, and if multiple controllers have overlapping selectors those controllers might conflict and behave unexpectedly.

    hashtag
    Pod-template-hash label

    Caution: Do not change this label.

    The pod-template-hash label is added by the Deployment controller to every ReplicaSet that a Deployment creates or adopts.

    This label ensures that child ReplicaSets of a Deployment do not overlap. It is generated by hashing the PodTemplate of the ReplicaSet and using the resulting hash as the label value that is added to the ReplicaSet selector, Pod template labels, and in any existing Pods that the ReplicaSet might have.

    hashtag
    Updating a Deployment

    Note: A Deployment's rollout is triggered if and only if the Deployment's Pod template (that is, .spec.template) is changed, for example if the labels or container images of the template are updated. Other updates, such as scaling the Deployment, do not trigger a rollout.

    Follow the steps given below to update your Deployment:

    1. Let's update the nginx Pods to use the nginx:1.16.1 image instead of the nginx:1.14.2 image.

      or use the following command:

      where deployment/nginx-deployment indicates the Deployment, nginx indicates the container on which the update will take place, and nginx:1.16.1 indicates the new image and its tag.
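
    A sketch of step 1, using either kubectl set image or kubectl edit:

    ```shell
    # Update the container image declared in the Deployment's Pod template.
    kubectl set image deployment/nginx-deployment nginx=nginx:1.16.1

    # Alternatively, edit the manifest in place and change
    # .spec.template.spec.containers[0].image to nginx:1.16.1.
    kubectl edit deployment/nginx-deployment
    ```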

    Get more details on your updated Deployment:

    • After the rollout succeeds, you can view the Deployment by running kubectl get deployments. The output is similar to this:

    • Run kubectl get rs to see that the Deployment updated the Pods by creating a new ReplicaSet and scaling it up to 3 replicas, as well as scaling down the old ReplicaSet to 0 replicas.

      The output is similar to this:

    Note: Kubernetes doesn't count terminating Pods when calculating the number of availableReplicas, which must be between replicas - maxUnavailable and replicas + maxSurge. As a result, you might notice that there are more Pods than expected during a rollout, and that the total resources consumed by the Deployment is more than replicas + maxSurge until the terminationGracePeriodSeconds of the terminating Pods expires.

    hashtag
    Rollover (aka multiple updates in-flight)

    Each time a new Deployment is observed by the Deployment controller, a ReplicaSet is created to bring up the desired Pods. If the Deployment is updated, the existing ReplicaSet that controls Pods whose labels match .spec.selector but whose template does not match .spec.template is scaled down. Eventually, the new ReplicaSet is scaled to .spec.replicas and all old ReplicaSets are scaled to 0.

    If you update a Deployment while an existing rollout is in progress, the Deployment creates a new ReplicaSet as per the update and starts scaling that up, and rolls over the ReplicaSet that it was scaling up previously -- it will add it to its list of old ReplicaSets and start scaling it down.

    For example, suppose you create a Deployment to create 5 replicas of nginx:1.14.2, but then update the Deployment to create 5 replicas of nginx:1.16.1, when only 3 replicas of nginx:1.14.2 had been created. In that case, the Deployment immediately starts killing the 3 nginx:1.14.2 Pods that it had created, and starts creating nginx:1.16.1 Pods. It does not wait for the 5 replicas of nginx:1.14.2 to be created before changing course.

    hashtag
    Label selector updates

    It is generally discouraged to make label selector updates and it is suggested to plan your selectors up front. In any case, if you need to perform a label selector update, exercise great caution and make sure you have grasped all of the implications.

    Note: In API version apps/v1, a Deployment's label selector is immutable after it gets created.

    • Selector additions require the Pod template labels in the Deployment spec to be updated with the new label too, otherwise a validation error is returned. This change is a non-overlapping one, meaning that the new selector does not select ReplicaSets and Pods created with the old selector, resulting in orphaning all old ReplicaSets and creating a new ReplicaSet.

    • Selector updates, which change the existing value in a selector key, result in the same behavior as additions.

    • Selector removals, which remove an existing key from the Deployment selector, do not require any changes in the Pod template labels. Existing ReplicaSets are not orphaned, and a new ReplicaSet is not created, but note that the removed label still exists in any existing Pods and ReplicaSets.

    hashtag
    Rolling Back a Deployment

    Sometimes, you may want to roll back a Deployment; for example, when the Deployment is not stable, such as crash looping. By default, all of the Deployment's rollout history is kept in the system so that you can roll back anytime you want (you can change that by modifying the revision history limit).

    Note: A Deployment's revision is created when a Deployment's rollout is triggered. This means that the new revision is created if and only if the Deployment's Pod template (.spec.template) is changed, for example if you update the labels or container images of the template. Other updates, such as scaling the Deployment, do not create a Deployment revision, so that you can facilitate simultaneous manual- or auto-scaling. This means that when you roll back to an earlier revision, only the Deployment's Pod template part is rolled back.

    • Suppose that you made a typo while updating the Deployment, by putting the image name as nginx:1.161 instead of nginx:1.16.1:

      The output is similar to this:

    • The rollout gets stuck. You can verify it by checking the rollout status:

      The output is similar to this:
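
    A sketch of the commands behind the two bullets above (the broken image tag and the rollout status check):

    ```shell
    # Introduce the typo'd image tag (nginx:1.161 does not exist).
    kubectl set image deployment/nginx-deployment nginx=nginx:1.161

    # Watch the rollout; it waits on the new replicas and never completes.
    kubectl rollout status deployment/nginx-deployment
    ```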

    hashtag
    Checking Rollout History of a Deployment

    Follow the steps given below to check the rollout history:

    1. First, check the revisions of this Deployment:

      The output is similar to this:

      CHANGE-CAUSE is copied from the Deployment annotation kubernetes.io/change-cause to its revisions upon creation. You can specify the CHANGE-CAUSE message by annotating the Deployment with kubernetes.io/change-cause or by manually editing the resource manifest.
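
    A sketch of both, using the nginx-deployment example (the change-cause message is illustrative):

    ```shell
    # List the revisions recorded for the Deployment.
    kubectl rollout history deployment/nginx-deployment

    # Optionally record a human-readable change cause for the current revision.
    kubectl annotate deployment/nginx-deployment kubernetes.io/change-cause="image updated to 1.16.1"
    ```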

    hashtag
    Rolling Back to a Previous Revision

    Follow the steps given below to rollback the Deployment from the current version to the previous version, which is version 2.

    1. Now you've decided to undo the current rollout and rollback to the previous revision:

      The output is similar to this:

      Alternatively, you can rollback to a specific revision by specifying it with --to-revision:

      The output is similar to this:

      For more details about rollout related commands, read the kubectl rollout reference documentation.
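
    A sketch of the rollback commands described in the steps above:

    ```shell
    # Roll back to the immediately previous revision.
    kubectl rollout undo deployment/nginx-deployment

    # Or roll back to a specific revision number.
    kubectl rollout undo deployment/nginx-deployment --to-revision=2
    ```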

    hashtag
    Scaling a Deployment

    You can scale a Deployment by using the following command:
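
    For example, to scale the nginx-deployment used above to 10 replicas:

    ```shell
    kubectl scale deployment/nginx-deployment --replicas=10
    ```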

    The output is similar to this:

    Assuming horizontal Pod autoscaling is enabled in your cluster, you can set up an autoscaler for your Deployment and choose the minimum and maximum number of Pods you want to run based on the CPU utilization of your existing Pods.
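
    For example, a sketch that targets 80% CPU utilization and keeps between 10 and 15 Pods:

    ```shell
    # Create a HorizontalPodAutoscaler for the Deployment.
    kubectl autoscale deployment/nginx-deployment --min=10 --max=15 --cpu-percent=80
    ```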

    The output is similar to this:

    hashtag
    Proportional scaling

    RollingUpdate Deployments support running multiple versions of an application at the same time. When you or an autoscaler scales a RollingUpdate Deployment that is in the middle of a rollout (either in progress or paused), the Deployment controller balances the additional replicas in the existing active ReplicaSets (ReplicaSets with Pods) in order to mitigate risk. This is called proportional scaling.

    For example, you are running a Deployment with 10 replicas, maxSurge=3, and maxUnavailable=2.

    • Ensure that the 10 replicas in your Deployment are running.

      The output is similar to this:

    • You update to a new image which happens to be unresolvable from inside the cluster.

      The output is similar to this:

    In our example above, 3 replicas are added to the old ReplicaSet and 2 replicas are added to the new ReplicaSet. The rollout process should eventually move all replicas to the new ReplicaSet, assuming the new replicas become healthy. To confirm this, run:

    The output is similar to this:

    The rollout status confirms how the replicas were added to each ReplicaSet.

    The output is similar to this:

    hashtag
    Pausing and Resuming a rollout of a Deployment

    When you update a Deployment, or plan to, you can pause rollouts for that Deployment before you trigger one or more updates. When you're ready to apply those changes, you resume rollouts for the Deployment. This approach allows you to apply multiple fixes in between pausing and resuming without triggering unnecessary rollouts.
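
    A minimal sketch of this workflow for the nginx-deployment used above (the resource values are illustrative):

    ```shell
    # Pause rollouts, make as many template changes as needed, then resume.
    kubectl rollout pause deployment/nginx-deployment
    kubectl set image deployment/nginx-deployment nginx=nginx:1.16.1
    kubectl set resources deployment/nginx-deployment -c nginx --limits=cpu=200m,memory=512Mi
    kubectl rollout resume deployment/nginx-deployment
    ```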

    • For example, with a Deployment that was created:

      Get the Deployment details:

      The output is similar to this:

      Get the rollout status:

      The output is similar to this:

    Note: You cannot rollback a paused Deployment until you resume it.

    hashtag
    Deployment status

    A Deployment enters various states during its lifecycle. It can be progressing while rolling out a new ReplicaSet, it can be complete, or it can fail to progress.

    hashtag
    Progressing Deployment

    Kubernetes marks a Deployment as progressing when one of the following tasks is performed:

    • The Deployment creates a new ReplicaSet.

    • The Deployment is scaling up its newest ReplicaSet.

    • The Deployment is scaling down its older ReplicaSet(s).

    When the rollout becomes “progressing”, the Deployment controller adds a condition with the following attributes to the Deployment's .status.conditions:

    • type: Progressing

    • status: "True"

    • reason: NewReplicaSetCreated | reason: FoundNewReplicaSet

    You can monitor the progress for a Deployment by using kubectl rollout status.

    hashtag
    Complete Deployment

    Kubernetes marks a Deployment as complete when it has the following characteristics:

    • All of the replicas associated with the Deployment have been updated to the latest version you've specified, meaning any updates you've requested have been completed.

    • All of the replicas associated with the Deployment are available.

    • No old replicas for the Deployment are running.

    When the rollout becomes “complete”, the Deployment controller sets a condition with the following attributes to the Deployment's .status.conditions:

    • type: Progressing

    • status: "True"

    • reason: NewReplicaSetAvailable

    This Progressing condition will retain a status value of "True" until a new rollout is initiated. The condition holds even when availability of replicas changes (which does instead affect the Available condition).

    You can check if a Deployment has completed by using kubectl rollout status. If the rollout completed successfully, kubectl rollout status returns a zero exit code.

    The output is similar to this:

    and the exit status from kubectl rollout is 0 (success):

    hashtag
    Failed Deployment

    Your Deployment may get stuck trying to deploy its newest ReplicaSet without ever completing. This can occur due to some of the following factors:

    • Insufficient quota

    • Readiness probe failures

    • Image pull errors

    One way you can detect this condition is to specify a deadline parameter in your Deployment spec (.spec.progressDeadlineSeconds). .spec.progressDeadlineSeconds denotes the number of seconds the Deployment controller waits before indicating (in the Deployment status) that the Deployment progress has stalled.

    The following kubectl command sets the spec with progressDeadlineSeconds to make the controller report lack of progress of a rollout for a Deployment after 10 minutes:
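
    A sketch of that command for the nginx-deployment example (600 seconds = 10 minutes):

    ```shell
    # Report lack of rollout progress after 10 minutes.
    kubectl patch deployment/nginx-deployment -p '{"spec":{"progressDeadlineSeconds":600}}'
    ```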

    The output is similar to this:

    Once the deadline has been exceeded, the Deployment controller adds a DeploymentCondition with the following attributes to the Deployment's .status.conditions:

    • type: Progressing

    • status: "False"

    • reason: ProgressDeadlineExceeded

    This condition can also fail early and is then set to a status value of "False" due to reasons such as ReplicaSetCreateError. Also, the deadline is not taken into account anymore once the Deployment rollout completes.

    See the Kubernetes API conventions for more information on status conditions.

    Note: Kubernetes takes no action on a stalled Deployment other than to report a status condition with reason: ProgressDeadlineExceeded. Higher level orchestrators can take advantage of it and act accordingly, for example, roll back the Deployment to its previous version.

    Note: If you pause a Deployment rollout, Kubernetes does not check progress against your specified deadline. You can safely pause a Deployment rollout in the middle of a rollout and resume without triggering the condition for exceeding the deadline.

    You may experience transient errors with your Deployments, either due to a low timeout that you have set or due to any other kind of error that can be treated as transient. For example, let's suppose you have insufficient quota. If you describe the Deployment you will notice the following section:

    The output is similar to this:

    If you run kubectl get deployment nginx-deployment -o yaml, the Deployment status is similar to this:

    Eventually, once the Deployment progress deadline is exceeded, Kubernetes updates the status and the reason for the Progressing condition:

    You can address an issue of insufficient quota by scaling down your Deployment, by scaling down other controllers you may be running, or by increasing quota in your namespace. If you satisfy the quota conditions and the Deployment controller then completes the Deployment rollout, you'll see the Deployment's status update with a successful condition (status: "True" and reason: NewReplicaSetAvailable).

    type: Available with status: "True" means that your Deployment has minimum availability. Minimum availability is dictated by the parameters specified in the deployment strategy. type: Progressing with status: "True" means that your Deployment is either in the middle of a rollout and it is progressing or that it has successfully completed its progress and the minimum required new replicas are available (see the Reason of the condition for the particulars - in our case reason: NewReplicaSetAvailable means that the Deployment is complete).

    You can check if a Deployment has failed to progress by using kubectl rollout status. kubectl rollout status returns a non-zero exit code if the Deployment has exceeded the progression deadline.

    The output is similar to this:

    and the exit status from kubectl rollout is 1 (indicating an error):

    hashtag
    Operating on a failed deployment

    All actions that apply to a complete Deployment also apply to a failed Deployment. You can scale it up/down, roll back to a previous revision, or even pause it if you need to apply multiple tweaks in the Deployment Pod template.

    hashtag
    Clean up Policy

    You can set .spec.revisionHistoryLimit field in a Deployment to specify how many old ReplicaSets for this Deployment you want to retain. The rest will be garbage-collected in the background. By default, it is 10.

    Note: Explicitly setting this field to 0 will result in cleaning up all the history of your Deployment, so that Deployment will not be able to roll back.

    hashtag
    Canary Deployment

    If you want to roll out releases to a subset of users or servers using the Deployment, you can create multiple Deployments, one for each release, following the canary pattern described in the Multiple release tracks section above.

    hashtag
    Writing a Deployment Spec

    As with all other Kubernetes configs, a Deployment needs .apiVersion, .kind, and .metadata fields. For general information about working with config files, see the documents on deploying applications, configuring containers, and managing resources with kubectl.

    When the control plane creates new Pods for a Deployment, the .metadata.name of the Deployment is part of the basis for naming those Pods. The name of a Deployment must be a valid DNS subdomain value, but this can produce unexpected results for the Pod hostnames. For best compatibility, the name should follow the more restrictive rules for a DNS label.

    A Deployment also needs a .spec section.

    hashtag
    Pod Template

    The .spec.template and .spec.selector are the only required fields of the .spec.

    The .spec.template is a Pod template. It has exactly the same schema as a Pod, except it is nested and does not have an apiVersion or kind.

    In addition to required fields for a Pod, a Pod template in a Deployment must specify appropriate labels and an appropriate restart policy. For labels, make sure not to overlap with other controllers. See the Selector section below.

    Only a .spec.template.spec.restartPolicy equal to Always is allowed, which is the default if not specified.

    hashtag
    Replicas

    .spec.replicas is an optional field that specifies the number of desired Pods. It defaults to 1.

    Should you manually scale a Deployment, for example via kubectl scale deployment deployment --replicas=X, and then you update that Deployment based on a manifest (for example: by running kubectl apply -f deployment.yaml), then applying that manifest overwrites the manual scaling that you previously did.

    If a HorizontalPodAutoscaler (or any similar API for horizontal scaling) is managing scaling for a Deployment, don't set .spec.replicas.

    Instead, allow the Kubernetes control plane to manage the .spec.replicas field automatically.

    hashtag
    Selector

    .spec.selector is a required field that specifies a label selector for the Pods targeted by this Deployment.

    .spec.selector must match .spec.template.metadata.labels, or it will be rejected by the API.

    In API version apps/v1, .spec.selector and .metadata.labels do not default to .spec.template.metadata.labels if not set. So they must be set explicitly. Also note that .spec.selector is immutable after creation of the Deployment in apps/v1.

    A Deployment may terminate Pods whose labels match the selector if their template is different from .spec.template or if the total number of such Pods exceeds .spec.replicas. It brings up new Pods with .spec.template if the number of Pods is less than the desired number.

    Note: You should not create other Pods whose labels match this selector, either directly, by creating another Deployment, or by creating another controller such as a ReplicaSet or a ReplicationController. If you do so, the first Deployment thinks that it created these other Pods. Kubernetes does not stop you from doing this.

    If you have multiple controllers that have overlapping selectors, the controllers will fight with each other and won't behave correctly.

    hashtag
    Strategy

    .spec.strategy specifies the strategy used to replace old Pods by new ones. .spec.strategy.type can be "Recreate" or "RollingUpdate". "RollingUpdate" is the default value.

    Recreate Deployment

    All existing Pods are killed before new ones are created when .spec.strategy.type==Recreate.

    Note: This will only guarantee Pod termination previous to creation for upgrades. If you upgrade a Deployment, all Pods of the old revision will be terminated immediately. Successful removal is awaited before any Pod of the new revision is created. If you manually delete a Pod, the lifecycle is controlled by the ReplicaSet and the replacement will be created immediately (even if the old Pod is still in a Terminating state). If you need an "at most" guarantee for your Pods, you should consider using a StatefulSet.

    Rolling Update Deployment

    The Deployment updates Pods in a rolling update fashion when .spec.strategy.type==RollingUpdate. You can specify maxUnavailable and maxSurge to control the rolling update process.

    Max Unavailable

    .spec.strategy.rollingUpdate.maxUnavailable is an optional field that specifies the maximum number of Pods that can be unavailable during the update process. The value can be an absolute number (for example, 5) or a percentage of desired Pods (for example, 10%). The absolute number is calculated from percentage by rounding down. The value cannot be 0 if .spec.strategy.rollingUpdate.maxSurge is 0. The default value is 25%.

    For example, when this value is set to 30%, the old ReplicaSet can be scaled down to 70% of desired Pods immediately when the rolling update starts. Once new Pods are ready, old ReplicaSet can be scaled down further, followed by scaling up the new ReplicaSet, ensuring that the total number of Pods available at all times during the update is at least 70% of the desired Pods.

    Max Surge

    .spec.strategy.rollingUpdate.maxSurge is an optional field that specifies the maximum number of Pods that can be created over the desired number of Pods. The value can be an absolute number (for example, 5) or a percentage of desired Pods (for example, 10%). The value cannot be 0 if MaxUnavailable is 0. The absolute number is calculated from the percentage by rounding up. The default value is 25%.

    For example, when this value is set to 30%, the new ReplicaSet can be scaled up immediately when the rolling update starts, such that the total number of old and new Pods does not exceed 130% of desired Pods. Once old Pods have been killed, the new ReplicaSet can be scaled up further, ensuring that the total number of Pods running at any time during the update is at most 130% of desired Pods.
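
    Putting the two fields together, a sketch of a RollingUpdate strategy stanza in a Deployment spec, using the 30% values from the examples above:

    ```yaml
    spec:
      strategy:
        type: RollingUpdate
        rollingUpdate:
          maxUnavailable: 30%   # at most 30% of desired Pods may be unavailable
          maxSurge: 30%         # at most 130% of desired Pods may exist during the update
    ```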

    hashtag
    Progress Deadline Seconds

    .spec.progressDeadlineSeconds is an optional field that specifies the number of seconds you want to wait for your Deployment to progress before the system reports back that the Deployment has failed to progress, surfaced as a condition with type: Progressing, status: "False", and reason: ProgressDeadlineExceeded in the status of the resource. The Deployment controller will keep retrying the Deployment. This defaults to 600. In the future, once automatic rollback is implemented, the Deployment controller will roll back a Deployment as soon as it observes such a condition.

    If specified, this field needs to be greater than .spec.minReadySeconds.

    hashtag
    Min Ready Seconds

    .spec.minReadySeconds is an optional field that specifies the minimum number of seconds for which a newly created Pod should be ready without any of its containers crashing, for it to be considered available. This defaults to 0 (the Pod will be considered available as soon as it is ready). To learn more about when a Pod is considered ready, see Container Probes.

    hashtag
    Revision History Limit

    A Deployment's revision history is stored in the ReplicaSets it controls.

    .spec.revisionHistoryLimit is an optional field that specifies the number of old ReplicaSets to retain to allow rollback. These old ReplicaSets consume resources in etcd and crowd the output of kubectl get rs. The configuration of each Deployment revision is stored in its ReplicaSets; therefore, once an old ReplicaSet is deleted, you lose the ability to rollback to that revision of Deployment. By default, 10 old ReplicaSets will be kept, however its ideal value depends on the frequency and stability of new Deployments.

    More specifically, setting this field to zero means that all old ReplicaSets with 0 replicas will be cleaned up. In this case, a new Deployment rollout cannot be undone, since its revision history is cleaned up.

    hashtag
    Paused

    .spec.paused is an optional boolean field for pausing and resuming a Deployment. The only difference between a paused Deployment and one that is not paused, is that any changes into the PodTemplateSpec of the paused Deployment will not trigger new rollouts as long as it is paused. A Deployment is not paused by default when it is created.

    StatefulSets

    StatefulSet is the workload API object used to manage stateful applications.

    Manages the deployment and scaling of a set of Pods, and provides guarantees about the ordering and uniqueness of these Pods.

    Like a Deployment, a StatefulSet manages Pods that are based on an identical container spec. Unlike a Deployment, a StatefulSet maintains a sticky identity for each of its Pods. These pods are created from the same spec, but are not interchangeable: each has a persistent identifier that it maintains across any rescheduling.

    If you want to use storage volumes to provide persistence for your workload, you can use a StatefulSet as part of the solution. Although individual Pods in a StatefulSet are susceptible to failure, the persistent Pod identifiers make it easier to match existing volumes to the new Pods that replace any that have failed.

    hashtag
    Using StatefulSets

    StatefulSets are valuable for applications that require one or more of the following.

    • Stable, unique network identifiers.

    • Stable, persistent storage.

    • Ordered, graceful deployment and scaling.

    In the above, stable is synonymous with persistence across Pod (re)scheduling. If an application doesn't require any stable identifiers or ordered deployment, deletion, or scaling, you should deploy your application using a workload object that provides a set of stateless replicas. A Deployment or ReplicaSet may be better suited to your stateless needs.

    hashtag
    Limitations

    • The storage for a given Pod must either be provisioned by a PersistentVolume provisioner based on the requested storage class, or pre-provisioned by an admin.

    • Deleting and/or scaling a StatefulSet down will not delete the volumes associated with the StatefulSet. This is done to ensure data safety, which is generally more valuable than an automatic purge of all related StatefulSet resources.

    hashtag
    Components

    The example below demonstrates the components of a StatefulSet.
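
    The manifest below is a sketch reconstructed from the names referenced in this section (a headless Service named nginx, a StatefulSet named web with 3 replicas, and a volumeClaimTemplate using my-storage-class with 1Gi of storage); the container image is an assumption for illustration:

    ```yaml
    apiVersion: v1
    kind: Service
    metadata:
      name: nginx
      labels:
        app: nginx
    spec:
      ports:
      - port: 80
        name: web
      clusterIP: None   # headless Service controlling the network domain
      selector:
        app: nginx
    ---
    apiVersion: apps/v1
    kind: StatefulSet
    metadata:
      name: web
    spec:
      serviceName: "nginx"
      replicas: 3
      selector:
        matchLabels:
          app: nginx
      template:
        metadata:
          labels:
            app: nginx
        spec:
          containers:
          - name: nginx
            image: nginx:1.14.2   # assumed image for illustration
            ports:
            - containerPort: 80
              name: web
            volumeMounts:
            - name: www
              mountPath: /usr/share/nginx/html
      volumeClaimTemplates:
      - metadata:
          name: www
        spec:
          accessModes: [ "ReadWriteOnce" ]
          storageClassName: "my-storage-class"
          resources:
            requests:
              storage: 1Gi
    ```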

    In the above example:

    • A Headless Service, named nginx, is used to control the network domain.

    • The StatefulSet, named web, has a Spec that indicates that 3 replicas of the nginx container will be launched in unique Pods.

    • The volumeClaimTemplates will provide stable storage using PersistentVolumes provisioned by a PersistentVolume provisioner.

    The name of a StatefulSet object must be a valid DNS subdomain name.

    hashtag
    Pod Selector

    You must set the .spec.selector field of a StatefulSet to match the labels of its .spec.template.metadata.labels. Failing to specify a matching Pod Selector will result in a validation error during StatefulSet creation.

    hashtag
    Volume Claim Templates

    You can set the .spec.volumeClaimTemplates field, which can provide stable storage using PersistentVolumes provisioned by a PersistentVolume provisioner.

    hashtag
    Minimum ready seconds

    FEATURE STATE: Kubernetes v1.25 [stable]

    .spec.minReadySeconds is an optional field that specifies the minimum number of seconds for which a newly created Pod should be running and ready without any of its containers crashing, for it to be considered available. This is used to check progression of a rollout when using a RollingUpdate strategy. This field defaults to 0 (the Pod will be considered available as soon as it is ready). To learn more about when a Pod is considered ready, see Container Probes.

    hashtag
    Pod Identity

    StatefulSet Pods have a unique identity that consists of an ordinal, a stable network identity, and stable storage. The identity sticks to the Pod, regardless of which node it's (re)scheduled on.

    hashtag
    Ordinal Index

    For a StatefulSet with N replicas, each Pod in the StatefulSet will be assigned an integer ordinal that is unique over the Set. By default, pods will be assigned ordinals from 0 up through N-1.

    hashtag
    Start ordinal

    FEATURE STATE: Kubernetes v1.27 [beta]

    .spec.ordinals is an optional field that allows you to configure the integer ordinals assigned to each Pod. It defaults to nil. You must enable the StatefulSetStartOrdinal feature gate to use this field. Once enabled, you can configure the following options:

    • .spec.ordinals.start: If the .spec.ordinals.start field is set, Pods will be assigned ordinals from .spec.ordinals.start up through .spec.ordinals.start + .spec.replicas - 1.

    hashtag
    Stable Network ID

    Each Pod in a StatefulSet derives its hostname from the name of the StatefulSet and the ordinal of the Pod. The pattern for the constructed hostname is $(statefulset name)-$(ordinal). The example above will create three Pods named web-0,web-1,web-2. A StatefulSet can use a to control the domain of its Pods. The domain managed by this Service takes the form: $(service name).$(namespace).svc.cluster.local, where "cluster.local" is the cluster domain. As each Pod is created, it gets a matching DNS subdomain, taking the form: $(podname).$(governing service domain), where the governing service is defined by the serviceName field on the StatefulSet.

    Depending on how DNS is configured in your cluster, you may not be able to look up the DNS name for a newly-run Pod immediately. This behavior can occur when other clients in the cluster have already sent queries for the hostname of the Pod before it was created. Negative caching (normal in DNS) means that the results of previous failed lookups are remembered and reused, even after the Pod is running, for at least a few seconds.

    If you need to discover Pods promptly after they are created, you have a few options:

    • Query the Kubernetes API directly (for example, using a watch) rather than relying on DNS lookups.

    • Decrease the time of caching in your Kubernetes DNS provider (typically this means editing the config map for CoreDNS, which currently caches for 30 seconds).

    As mentioned above, you are responsible for creating the Headless Service responsible for the network identity of the pods.

    Here are some examples of choices for Cluster Domain, Service name, StatefulSet name, and how that affects the DNS names for the StatefulSet's Pods.

    Cluster Domain
    Service (ns/name)
    StatefulSet (ns/name)
    StatefulSet Domain
    Pod DNS
    Pod Hostname

    Note: Cluster Domain will be set to cluster.local unless otherwise configured.

    hashtag
    Stable Storage

    For each VolumeClaimTemplate entry defined in a StatefulSet, each Pod receives one PersistentVolumeClaim. In the nginx example above, each Pod receives a single PersistentVolume with a StorageClass of my-storage-class and 1 GiB of provisioned storage. If no StorageClass is specified, then the default StorageClass will be used. When a Pod is (re)scheduled onto a node, its volumeMounts mount the PersistentVolumes associated with its PersistentVolumeClaims. Note that the PersistentVolumes associated with the Pods' PersistentVolumeClaims are not deleted when the Pods, or the StatefulSet, are deleted. This must be done manually.

    hashtag
    Pod Name Label

    When the StatefulSet creates a Pod, it adds a label, statefulset.kubernetes.io/pod-name, that is set to the name of the Pod. This label allows you to attach a Service to a specific Pod in the StatefulSet.

    hashtag
    Deployment and Scaling Guarantees

    • For a StatefulSet with N replicas, when Pods are being deployed, they are created sequentially, in order from {0..N-1}.

    • When Pods are being deleted, they are terminated in reverse order, from {N-1..0}.

    • Before a scaling operation is applied to a Pod, all of its predecessors must be Running and Ready.

    The StatefulSet should not specify a pod.Spec.TerminationGracePeriodSeconds of 0. This practice is unsafe and strongly discouraged. For further explanation, please refer to the documentation on force deleting StatefulSet Pods.

    When the nginx example above is created, three Pods will be deployed in the order web-0, web-1, web-2. web-1 will not be deployed before web-0 is Running and Ready, and web-2 will not be deployed until web-1 is Running and Ready. If web-0 should fail, after web-1 is Running and Ready, but before web-2 is launched, web-2 will not be launched until web-0 is successfully relaunched and becomes Running and Ready.

    If a user were to scale the deployed example by patching the StatefulSet such that replicas=1, web-2 would be terminated first. web-1 would not be terminated until web-2 is fully shut down and deleted. If web-0 were to fail after web-2 has been terminated and is completely shut down, but prior to web-1's termination, web-1 would not be terminated until web-0 is Running and Ready.

    hashtag
    Pod Management Policies

    StatefulSet allows you to relax its ordering guarantees while preserving its uniqueness and identity guarantees via its .spec.podManagementPolicy field.

    OrderedReady Pod Management

    OrderedReady pod management is the default for StatefulSets. It implements the behavior described above in Deployment and Scaling Guarantees.

    Parallel Pod Management

    Parallel pod management tells the StatefulSet controller to launch or terminate all Pods in parallel, and to not wait for Pods to become Running and Ready or completely terminated prior to launching or terminating another Pod. This option only affects the behavior for scaling operations. Updates are not affected.

    hashtag
    Update strategies

    A StatefulSet's .spec.updateStrategy field allows you to configure and disable automated rolling updates for containers, labels, resource request/limits, and annotations for the Pods in a StatefulSet. There are two possible values:

    OnDelete: When a StatefulSet's .spec.updateStrategy.type is set to OnDelete, the StatefulSet controller will not automatically update the Pods in a StatefulSet. Users must manually delete Pods to cause the controller to create new Pods that reflect modifications made to a StatefulSet's .spec.template.

    RollingUpdate: The RollingUpdate update strategy implements automated, rolling updates for the Pods in a StatefulSet. This is the default update strategy.

    hashtag
    Rolling Updates

    When a StatefulSet's .spec.updateStrategy.type is set to RollingUpdate, the StatefulSet controller will delete and recreate each Pod in the StatefulSet. It will proceed in the same order as Pod termination (from the largest ordinal to the smallest), updating each Pod one at a time.

    The Kubernetes control plane waits until an updated Pod is Running and Ready prior to updating its predecessor. If you have set .spec.minReadySeconds (see Minimum ready seconds above), the control plane additionally waits that amount of time after the Pod turns ready, before moving on.

    hashtag
    Partitioned rolling updates

    The RollingUpdate update strategy can be partitioned, by specifying a .spec.updateStrategy.rollingUpdate.partition. If a partition is specified, all Pods with an ordinal that is greater than or equal to the partition will be updated when the StatefulSet's .spec.template is updated. All Pods with an ordinal that is less than the partition will not be updated, and, even if they are deleted, they will be recreated at the previous version. If a StatefulSet's .spec.updateStrategy.rollingUpdate.partition is greater than its .spec.replicas, updates to its .spec.template will not be propagated to its Pods. In most cases you will not need to use a partition, but they are useful if you want to stage an update, roll out a canary, or perform a phased roll out.
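
    For example, a sketch of a partitioned RollingUpdate in a StatefulSet spec (the partition value of 3 is illustrative):

    ```yaml
    spec:
      updateStrategy:
        type: RollingUpdate
        rollingUpdate:
          partition: 3   # only Pods with an ordinal >= 3 receive the new template
    ```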

    hashtag
    Maximum unavailable Pods

    FEATURE STATE: Kubernetes v1.24 [alpha]

    You can control the maximum number of Pods that can be unavailable during an update by specifying the .spec.updateStrategy.rollingUpdate.maxUnavailable field. The value can be an absolute number (for example, 5) or a percentage of desired Pods (for example, 10%). Absolute number is calculated from the percentage value by rounding it up. This field cannot be 0. The default setting is 1.

    This field applies to all Pods in the range 0 to replicas - 1. If there is any unavailable Pod in the range 0 to replicas - 1, it will be counted towards maxUnavailable.

    Note: The maxUnavailable field is in Alpha stage and it is honored only by API servers that are running with the MaxUnavailableStatefulSet feature gate enabled.

    hashtag
    Forced rollback

    When using Rolling Updates with the default Pod Management Policy (OrderedReady), it's possible to get into a broken state that requires manual intervention to repair.

    If you update the Pod template to a configuration that never becomes Running and Ready (for example, due to a bad binary or application-level configuration error), StatefulSet will stop the rollout and wait.

    In this state, it's not enough to revert the Pod template to a good configuration. Due to a known issue, StatefulSet will continue to wait for the broken Pod to become Ready (which never happens) before it will attempt to revert it back to the working configuration.

    After reverting the template, you must also delete any Pods that StatefulSet had already attempted to run with the bad configuration. StatefulSet will then begin to recreate the Pods using the reverted template.

    hashtag
    PersistentVolumeClaim retention

    FEATURE STATE: Kubernetes v1.27 [beta]

    The optional .spec.persistentVolumeClaimRetentionPolicy field controls if and how PVCs are deleted during the lifecycle of a StatefulSet. You must enable the StatefulSetAutoDeletePVC feature gate on the API server and the controller manager to use this field. Once enabled, there are two policies you can configure for each StatefulSet:

    whenDeleted: configures the volume retention behavior that applies when the StatefulSet is deleted.

    whenScaled: configures the volume retention behavior that applies when the replica count of the StatefulSet is reduced; for example, when scaling down the set.

    For each policy that you can configure, you can set the value to either Delete or Retain.

    Delete: The PVCs created from the StatefulSet volumeClaimTemplate are deleted for each Pod affected by the policy. With the whenDeleted policy, all PVCs from the volumeClaimTemplate are deleted after their Pods have been deleted. With the whenScaled policy, only PVCs corresponding to Pod replicas being scaled down are deleted, after their Pods have been deleted.

    Retain (default): PVCs from the volumeClaimTemplate are not affected when their Pod is deleted. This is the behavior before this new feature.

    Bear in mind that these policies only apply when Pods are being removed due to the StatefulSet being deleted or scaled down. For example, if a Pod associated with a StatefulSet fails due to node failure, and the control plane creates a replacement Pod, the StatefulSet retains the existing PVC. The existing volume is unaffected, and the cluster will attach it to the node where the new Pod is about to launch.

    The default for policies is Retain, matching the StatefulSet behavior before this new feature.

    Here is an example policy.
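
    A sketch of such a policy, shown as the relevant fragment of a StatefulSet spec (other fields omitted):

    ```yaml
    apiVersion: apps/v1
    kind: StatefulSet
    metadata:
      name: web
    spec:
      persistentVolumeClaimRetentionPolicy:
        whenDeleted: Retain   # keep PVCs when the StatefulSet itself is deleted
        whenScaled: Delete    # delete PVCs for replicas removed by a scale-down
      # ... serviceName, replicas, selector, template, volumeClaimTemplates ...
    ```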

    The StatefulSet controller adds owner references to its PVCs, which are then deleted by the garbage collector after the Pod is terminated. This enables the Pod to cleanly unmount all volumes before the PVCs are deleted (and before the backing PV and volume are deleted, depending on the retain policy). When you set the whenDeleted policy to Delete, an owner reference to the StatefulSet instance is placed on all PVCs associated with that StatefulSet.

    The whenScaled policy must delete PVCs only when a Pod is scaled down, and not when a Pod is deleted for another reason. When reconciling, the StatefulSet controller compares its desired replica count to the actual Pods present on the cluster. Any StatefulSet Pod whose ordinal is greater than the replica count is condemned and marked for deletion. If the whenScaled policy is Delete, the condemned Pods are first set as owners to the associated StatefulSet template PVCs, before the Pod is deleted. This causes the PVCs to be garbage collected only after the condemned Pods have terminated.

    This means that if the controller crashes and restarts, no Pod will be deleted before its owner reference has been updated appropriately for the policy. If a condemned Pod is force-deleted while the controller is down, the owner reference may or may not have been set up, depending on when the controller crashed. It may take several reconcile loops to update the owner references, so some condemned Pods may have set up owner references and others may not. For this reason we recommend waiting for the controller to come back up, which will verify owner references before terminating Pods. If that is not possible, the operator should verify the owner references on PVCs to ensure the expected objects are deleted when Pods are force-deleted.

    hashtag
    Replicas

    .spec.replicas is an optional field that specifies the number of desired Pods. It defaults to 1.

    Should you manually scale a StatefulSet, for example via kubectl scale statefulset statefulset --replicas=X, and then you update that StatefulSet based on a manifest (for example: by running kubectl apply -f statefulset.yaml), then applying that manifest overwrites the manual scaling that you previously did.

    If a HorizontalPodAutoscaler (or any similar API for horizontal scaling) is managing scaling for a StatefulSet, don't set .spec.replicas. Instead, allow the Kubernetes control plane to manage the .spec.replicas field automatically.

    DaemonSet

    A DaemonSet ensures that all (or some) Nodes run a copy of a Pod. As nodes are added to the cluster, Pods are added to them. As nodes are removed from the cluster, those Pods are garbage collected. Deleting a DaemonSet will clean up the Pods it created.

    Some typical uses of a DaemonSet are:

    • running a cluster storage daemon on every node

    • running a logs collection daemon on every node

    • running a node monitoring daemon on every node

    In a simple case, one DaemonSet, covering all nodes, would be used for each type of daemon. A more complex setup might use multiple DaemonSets for a single type of daemon, but with different flags and/or different memory and cpu requests for different hardware types.

    hashtag
    Writing a DaemonSet Spec

    hashtag
    Create a DaemonSet

    You can describe a DaemonSet in a YAML file. For example, the daemonset.yaml file below describes a DaemonSet that runs the fluentd-elasticsearch Docker image:
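
    A sketch of such a daemonset.yaml, trimmed to the essentials (the image tag and the hostPath mount are illustrative):

    ```yaml
    apiVersion: apps/v1
    kind: DaemonSet
    metadata:
      name: fluentd-elasticsearch
      namespace: kube-system
      labels:
        k8s-app: fluentd-logging
    spec:
      selector:
        matchLabels:
          name: fluentd-elasticsearch
      template:
        metadata:
          labels:
            name: fluentd-elasticsearch
        spec:
          containers:
          - name: fluentd-elasticsearch
            image: quay.io/fluentd_elasticsearch/fluentd:v2.5.2   # assumed tag
            volumeMounts:
            - name: varlog
              mountPath: /var/log
          volumes:
          - name: varlog
            hostPath:
              path: /var/log
    ```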

    Create a DaemonSet based on the YAML file:
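
    Assuming the manifest above was saved as daemonset.yaml:

    ```shell
    kubectl apply -f daemonset.yaml
    ```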

    hashtag
    Required Fields

    As with all other Kubernetes config, a DaemonSet needs apiVersion, kind, and metadata fields. For general information about working with config files, see the documentation on running applications and managing objects with kubectl.

    The name of a DaemonSet object must be a valid DNS subdomain name.

    A DaemonSet also needs a .spec section.

    hashtag
    Pod Template

    The .spec.template is one of the required fields in .spec.

    The .spec.template is a pod template. It has exactly the same schema as a Pod, except it is nested and does not have an apiVersion or kind.

    In addition to required fields for a Pod, a Pod template in a DaemonSet has to specify appropriate labels (see Pod Selector below).

    A Pod Template in a DaemonSet must have a RestartPolicy equal to Always, or be unspecified, which defaults to Always.

    hashtag
    Pod Selector

    The .spec.selector field is a pod selector. It works the same as the .spec.selector of a Job.

    You must specify a pod selector that matches the labels of the .spec.template. Also, once a DaemonSet is created, its .spec.selector can not be mutated. Mutating the pod selector can lead to the unintentional orphaning of Pods, and it was found to be confusing to users.

    The .spec.selector is an object consisting of two fields:

    • matchLabels - works the same as the .spec.selector of a ReplicationController.

    • matchExpressions - allows you to build more sophisticated selectors by specifying a key, a list of values, and an operator that relates the key and values.

    When the two are specified the result is ANDed.

    The .spec.selector must match the .spec.template.metadata.labels. Config with these two not matching will be rejected by the API.

    hashtag
    Running Pods on select Nodes

    If you specify a .spec.template.spec.nodeSelector, then the DaemonSet controller will create Pods on nodes which match that node selector. Likewise, if you specify a .spec.template.spec.affinity, then the DaemonSet controller will create Pods on nodes which match that node affinity. If you do not specify either, then the DaemonSet controller will create Pods on all nodes.

    hashtag
    How Daemon Pods are scheduled

    A DaemonSet ensures that all eligible nodes run a copy of a Pod. The DaemonSet controller creates a Pod for each eligible node and adds the spec.affinity.nodeAffinity field of the Pod to match the target host. After the Pod is created, the default scheduler typically takes over and then binds the Pod to the target host by setting the .spec.nodeName field. If the new Pod cannot fit on the node, the default scheduler may preempt (evict) some of the existing Pods based on the priority of the new Pod.

    The user can specify a different scheduler for the Pods of the DaemonSet, by setting the .spec.template.spec.schedulerName field of the DaemonSet.

    The original node affinity specified at the .spec.template.spec.affinity.nodeAffinity field (if specified) is taken into consideration by the DaemonSet controller when evaluating the eligible nodes, but is replaced on the created Pod with the node affinity that matches the name of the eligible node.

    hashtag
    Taints and tolerations

    The DaemonSet controller automatically adds a set of tolerations to DaemonSet Pods:

    Toleration key
    Effect
    Details

    You can add your own tolerations to the Pods of a DaemonSet as well, by defining these in the Pod template of the DaemonSet.

    Because the DaemonSet controller sets the node.kubernetes.io/unschedulable:NoSchedule toleration automatically, Kubernetes can run DaemonSet Pods on nodes that are marked as unschedulable.

    If you use a DaemonSet to provide an important node-level function, such as cluster networking, it is helpful that Kubernetes places DaemonSet Pods on nodes before they are ready. For example, without that special toleration, you could end up in a deadlock situation where the node is not marked as ready because the network plugin is not running there, and at the same time the network plugin is not running on that node because the node is not yet ready.

    hashtag
    Communicating with Daemon Pods

    Some possible patterns for communicating with Pods in a DaemonSet are:

    • Push: Pods in the DaemonSet are configured to send updates to another service, such as a stats database. They do not have clients.

    • NodeIP and Known Port: Pods in the DaemonSet can use a hostPort, so that the pods are reachable via the node IPs. Clients know the list of node IPs somehow, and know the port by convention.

    hashtag
    Updating a DaemonSet

    If node labels are changed, the DaemonSet will promptly add Pods to newly matching nodes and delete Pods from newly not-matching nodes.

    You can modify the Pods that a DaemonSet creates. However, Pods do not allow all fields to be updated. Also, the DaemonSet controller will use the original template the next time a node (even with the same name) is created.

    You can delete a DaemonSet. If you specify --cascade=orphan with kubectl, then the Pods will be left on the nodes. If you subsequently create a new DaemonSet with the same selector, the new DaemonSet adopts the existing Pods. If any Pods need replacing the DaemonSet replaces them according to its updateStrategy.

    You can perform a rolling update on a DaemonSet.

    Jobs

    A Job creates one or more Pods and will continue to retry execution of the Pods until a specified number of them successfully terminate. As pods successfully complete, the Job tracks the successful completions. When a specified number of successful completions is reached, the task (i.e., the Job) is complete. Deleting a Job will clean up the Pods it created. Suspending a Job will delete its active Pods until the Job is resumed again.

    A simple case is to create one Job object in order to reliably run one Pod to completion. The Job object will start a new Pod if the first Pod fails or is deleted (for example due to a node hardware failure or a node reboot).

    You can also use a Job to run multiple Pods in parallel.

    If you want to run a Job (either a single task, or several in parallel) on a schedule, see CronJob.

    hashtag
    Running an example Job

    Here is an example Job config. It computes π to 2000 places and prints it out. It takes around 10s to complete.
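
    A sketch of such a Job manifest (the perl image tag is an assumption), saved for example as job.yaml:

    ```yaml
    apiVersion: batch/v1
    kind: Job
    metadata:
      name: pi
    spec:
      template:
        spec:
          containers:
          - name: pi
            image: perl:5.34.0   # assumed image tag
            command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
          restartPolicy: Never
      backoffLimit: 4
    ```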

    You can run the example with this command:
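
    Assuming the manifest above was saved as job.yaml:

    ```shell
    kubectl apply -f job.yaml
    ```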

    The output is similar to this:

    Check on the status of the Job with kubectl:

    To view completed Pods of a Job, use kubectl get pods.

    To list all the Pods that belong to a Job in a machine readable form, you can use a command like this:
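
    A sketch of such a command, assuming the pi Job from the example above and the batch.kubernetes.io/job-name label described later in this page:

    ```shell
    # Collect the names of all Pods belonging to the Job in a machine-readable form.
    pods=$(kubectl get pods --selector=batch.kubernetes.io/job-name=pi --output=jsonpath='{.items[*].metadata.name}')
    echo $pods
    ```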

    The output is similar to this:

    Here, the selector is the same as the selector for the Job. The --output=jsonpath option specifies an expression with the name from each Pod in the returned list.

    View the standard output of one of the pods:

    Another way to view the logs of a Job:

    The output is similar to this:

    hashtag
    Writing a Job spec

    As with all other Kubernetes config, a Job needs apiVersion, kind, and metadata fields.

    When the control plane creates new Pods for a Job, the .metadata.name of the Job is part of the basis for naming those Pods. The name of a Job must be a valid DNS subdomain value, but this can produce unexpected results for the Pod hostnames. For best compatibility, the name should follow the more restrictive rules for a DNS label. Even when the name is a DNS subdomain, the name must be no longer than 63 characters.

    A Job also needs a .spec section.

    hashtag
    Job Labels

    Job labels will have batch.kubernetes.io/ prefix for job-name and controller-uid.

    hashtag
    Pod Template

    The .spec.template is the only required field of the .spec.

    The .spec.template is a pod template. It has exactly the same schema as a Pod, except it is nested and does not have an apiVersion or kind.

    In addition to required fields for a Pod, a pod template in a Job must specify appropriate labels (see Pod selector) and an appropriate restart policy.

    Only a RestartPolicy equal to Never or OnFailure is allowed.

    hashtag
    Pod selector

    The .spec.selector field is optional. In almost all cases you should not specify it. See the section on specifying your own Pod selector.

    hashtag
    Parallel execution for Jobs

    There are three main types of task suitable to run as a Job:

    1. Non-parallel Jobs

      • normally, only one Pod is started, unless the Pod fails.

      • the Job is complete as soon as its Pod terminates successfully.

    2. Parallel Jobs with a fixed completion count

      • specify a non-zero positive value for .spec.completions.

      • the Job is complete when there is one successful Pod for each value in the range 1 to .spec.completions.

    3. Parallel Jobs with a work queue

      • leave .spec.completions unset; the Pods coordinate amongst themselves or with an external service to determine what each should work on.

      • the Job is complete when at least one Pod has terminated with success and all Pods are terminated.

    For a non-parallel Job, you can leave both .spec.completions and .spec.parallelism unset. When both are unset, both are defaulted to 1.

    For a fixed completion count Job, you should set .spec.completions to the number of completions needed. You can set .spec.parallelism, or leave it unset and it will default to 1.

    For a work queue Job, you must leave .spec.completions unset, and set .spec.parallelism to a non-negative integer.

    For more information about how to make use of the different types of Job, see the job patterns section of the Kubernetes Jobs documentation.

    Controlling parallelism

    The requested parallelism (.spec.parallelism) can be set to any non-negative value. If it is unspecified, it defaults to 1. If it is specified as 0, then the Job is effectively paused until it is increased.

    Actual parallelism (number of pods running at any instant) may be more or less than requested parallelism, for a variety of reasons:

• For fixed completion count Jobs, the actual number of pods running in parallel will not exceed the number of remaining completions. Higher values of .spec.parallelism are effectively ignored.

• For work queue Jobs, no new Pods are started after any Pod has succeeded -- remaining Pods are allowed to complete, however.

• If the Job controller has not had time to react.

• If the Job controller failed to create Pods for any reason (lack of ResourceQuota, lack of permission, etc.), then there may be fewer pods than requested.

• The Job controller may throttle new Pod creation due to excessive previous pod failures in the same Job.

• When a Pod is gracefully shut down, it takes time to stop.

    hashtag
    Completion mode

    FEATURE STATE: Kubernetes v1.24 [stable]

    Jobs with fixed completion count - that is, jobs that have non null .spec.completions - can have a completion mode that is specified in .spec.completionMode:

    • NonIndexed (default): the Job is considered complete when there have been .spec.completions successfully completed Pods. In other words, each Pod completion is homologous to each other. Note that Jobs that have null .spec.completions are implicitly NonIndexed.

• Indexed: the Pods of a Job get an associated completion index from 0 to .spec.completions-1. The index is available through three mechanisms:

  • The Pod annotation batch.kubernetes.io/job-completion-index.

  • As part of the Pod hostname, following the pattern $(job-name)-$(index). When you use an Indexed Job in combination with a Service, Pods within the Job can use the deterministic hostnames to address each other via DNS.

  • From the containerized task, in the environment variable JOB_COMPLETION_INDEX.

  The Job is considered complete when there is one successfully completed Pod for each index.

    Note: Although rare, more than one Pod could be started for the same index (due to various reasons such as node failures, kubelet restarts, or Pod evictions). In this case, only the first Pod that completes successfully will count towards the completion count and update the status of the Job. The other Pods that are running or completed for the same index will be deleted by the Job controller once they are detected.
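A minimal sketch of an Indexed Job (the name and container are illustrative); each Pod reads its index from the JOB_COMPLETION_INDEX environment variable:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: indexed-example      # hypothetical name
spec:
  completions: 5
  parallelism: 3
  completionMode: Indexed    # each Pod receives an index from 0 to 4
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: worker
        image: busybox:1.36  # illustrative image
        command: ["sh", "-c", "echo processing item $JOB_COMPLETION_INDEX"]
```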

    hashtag
    Handling Pod and container failures

A container in a Pod may fail for a number of reasons, such as because the process in it exited with a non-zero exit code, or the container was killed for exceeding a memory limit, etc. If this happens, and the .spec.template.spec.restartPolicy = "OnFailure", then the Pod stays on the node, but the container is re-run. Therefore, your program needs to handle the case when it is restarted locally, or else specify .spec.template.spec.restartPolicy = "Never". See pod lifecycle for more information on restartPolicy.

    An entire Pod can also fail, for a number of reasons, such as when the pod is kicked off the node (node is upgraded, rebooted, deleted, etc.), or if a container of the Pod fails and the .spec.template.spec.restartPolicy = "Never". When a Pod fails, then the Job controller starts a new Pod. This means that your application needs to handle the case when it is restarted in a new pod. In particular, it needs to handle temporary files, locks, incomplete output and the like caused by previous runs.

By default, each pod failure is counted towards the .spec.backoffLimit limit, see pod backoff failure policy. However, you can customize handling of pod failures by setting the Job's pod failure policy.

    Note that even if you specify .spec.parallelism = 1 and .spec.completions = 1 and .spec.template.spec.restartPolicy = "Never", the same program may sometimes be started twice.

    If you do specify .spec.parallelism and .spec.completions both greater than 1, then there may be multiple pods running at once. Therefore, your pods must also be tolerant of concurrency.

    When the PodDisruptionConditions and JobPodFailurePolicy are both enabled, and the .spec.podFailurePolicy field is set, the Job controller does not consider a terminating Pod (a pod that has a .metadata.deletionTimestamp field set) as a failure until that Pod is terminal (its .status.phase is Failed or Succeeded). However, the Job controller creates a replacement Pod as soon as the termination becomes apparent. Once the pod terminates, the Job controller evaluates .backoffLimit and .podFailurePolicy for the relevant Job, taking this now-terminated Pod into consideration.

    If either of these requirements is not satisfied, the Job controller counts a terminating Pod as an immediate failure, even if that Pod later terminates with phase: "Succeeded".

    hashtag
    Pod backoff failure policy

    There are situations where you want to fail a Job after some amount of retries due to a logical error in configuration etc. To do so, set .spec.backoffLimit to specify the number of retries before considering a Job as failed. The back-off limit is set by default to 6. Failed Pods associated with the Job are recreated by the Job controller with an exponential back-off delay (10s, 20s, 40s ...) capped at six minutes.

    The number of retries is calculated in two ways:

    • The number of Pods with .status.phase = "Failed".

    • When using restartPolicy = "OnFailure", the number of retries in all the containers of Pods with .status.phase equal to Pending or Running.

    If either of the calculations reaches the .spec.backoffLimit, the Job is considered failed.

    Note: If your job has restartPolicy = "OnFailure", keep in mind that your Pod running the Job will be terminated once the job backoff limit has been reached. This can make debugging the Job's executable more difficult. We suggest setting restartPolicy = "Never" when debugging the Job or using a logging system to ensure output from failed Jobs is not lost inadvertently.

    hashtag
    Pod failure policy

FEATURE STATE: Kubernetes v1.26 [beta]

Note: You can only configure a Pod failure policy for a Job if you have the JobPodFailurePolicy feature gate enabled in your cluster. Additionally, it is recommended to enable the PodDisruptionConditions feature gate in order to be able to detect and handle Pod disruption conditions in the Pod failure policy (see also: Pod disruption conditions). Both feature gates are available in Kubernetes 1.27.

    A Pod failure policy, defined with the .spec.podFailurePolicy field, enables your cluster to handle Pod failures based on the container exit codes and the Pod conditions.

In some situations, you may want to have a better control when handling Pod failures than the control provided by the Pod backoff failure policy, which is based on the Job's .spec.backoffLimit. These are some examples of use cases:

    • To optimize costs of running workloads by avoiding unnecessary Pod restarts, you can terminate a Job as soon as one of its Pods fails with an exit code indicating a software bug.

• To guarantee that your Job finishes even if there are disruptions, you can ignore Pod failures caused by disruptions (such as preemption, API-initiated eviction, or taint-based eviction) so that they don't count towards the .spec.backoffLimit limit of retries.

    You can configure a Pod failure policy, in the .spec.podFailurePolicy field, to meet the above use cases. This policy can handle Pod failures based on the container exit codes and the Pod conditions.

    Here is a manifest for a Job that defines a podFailurePolicy:
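A sketch of such a manifest, closely following the upstream job-pod-failure-policy example (the image and command are illustrative):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: job-pod-failure-policy-example
spec:
  completions: 12
  parallelism: 3
  backoffLimit: 6
  template:
    spec:
      restartPolicy: Never        # required when .spec.podFailurePolicy is used
      containers:
      - name: main
        image: docker.io/library/bash:5
        command: ["bash"]          # example command simulating a software bug
        args:
        - -c
        - echo "Hello world!" && sleep 5 && exit 42
  podFailurePolicy:
    rules:
    - action: FailJob              # exit code 42 of the main container fails the whole Job
      onExitCodes:
        containerName: main
        operator: In
        values: [42]
    - action: Ignore               # Pod disruptions do not count towards backoffLimit
      onPodConditions:
      - type: DisruptionTarget
```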

    In the example above, the first rule of the Pod failure policy specifies that the Job should be marked failed if the main container fails with the 42 exit code. The following are the rules for the main container specifically:

    • an exit code of 0 means that the container succeeded

    • an exit code of 42 means that the entire Job failed

• any other exit code represents that the container failed, and hence the entire Pod. The Pod will be re-created if the total number of restarts is below backoffLimit. If the backoffLimit is reached, the entire Job fails.

    Note: Because the Pod template specifies a restartPolicy: Never, the kubelet does not restart the main container in that particular Pod.

    The second rule of the Pod failure policy, specifying the Ignore action for failed Pods with condition DisruptionTarget excludes Pod disruptions from being counted towards the .spec.backoffLimit limit of retries.

    Note: If the Job failed, either by the Pod failure policy or Pod backoff failure policy, and the Job is running multiple Pods, Kubernetes terminates all the Pods in that Job that are still Pending or Running.

    These are some requirements and semantics of the API:

    • if you want to use a .spec.podFailurePolicy field for a Job, you must also define that Job's pod template with .spec.restartPolicy set to Never.

• the Pod failure policy rules you specify under spec.podFailurePolicy.rules are evaluated in order. Once a rule matches a Pod failure, the remaining rules are ignored. When no rule matches the Pod failure, the default handling applies.

• you may want to restrict a rule to a specific container by specifying its name in spec.podFailurePolicy.rules[*].containerName. When not specified, the rule applies to all containers. When specified, it should match one of the container or initContainer names in the Pod template.

• you may specify the action taken when a Pod failure policy is matched by spec.podFailurePolicy.rules[*].action. Possible values are:

  • FailJob: use to indicate that the Pod's job should be marked as Failed and all running Pods should be terminated.

  • Ignore: use to indicate that the counter towards the .spec.backoffLimit should not be incremented and a replacement Pod should be created.

  • Count: use to indicate that the Pod should be handled in the default way. The counter towards the .spec.backoffLimit should be incremented.

Note: When you use a podFailurePolicy, the Job controller only matches Pods in the Failed phase. Pods with a deletion timestamp that are not in a terminal phase (Failed or Succeeded) are considered still terminating. This implies that terminating pods retain a tracking finalizer until they reach a terminal phase. Since Kubernetes 1.27, the kubelet transitions deleted pods to a terminal phase (see: Pod Phase). This ensures that deleted pods have their finalizers removed by the Job controller.

    hashtag
    Job termination and cleanup

    When a Job completes, no more Pods are created, but the Pods are not deleted either. Keeping them around allows you to still view the logs of completed pods to check for errors, warnings, or other diagnostic output. The job object also remains after it is completed so that you can view its status. It is up to the user to delete old jobs after noting their status. Delete the job with kubectl (e.g. kubectl delete jobs/pi or kubectl delete -f ./job.yaml). When you delete the job using kubectl, all the pods it created are deleted too.

    By default, a Job will run uninterrupted unless a Pod fails (restartPolicy=Never) or a Container exits in error (restartPolicy=OnFailure), at which point the Job defers to the .spec.backoffLimit described above. Once .spec.backoffLimit has been reached the Job will be marked as failed and any running Pods will be terminated.

    Another way to terminate a Job is by setting an active deadline. Do this by setting the .spec.activeDeadlineSeconds field of the Job to a number of seconds. The activeDeadlineSeconds applies to the duration of the job, no matter how many Pods are created. Once a Job reaches activeDeadlineSeconds, all of its running Pods are terminated and the Job status will become type: Failed with reason: DeadlineExceeded.

    Note that a Job's .spec.activeDeadlineSeconds takes precedence over its .spec.backoffLimit. Therefore, a Job that is retrying one or more failed Pods will not deploy additional Pods once it reaches the time limit specified by activeDeadlineSeconds, even if the backoffLimit is not yet reached.

    Example:
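A sketch reusing the pi example with both fields set (the values are illustrative):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: pi-with-timeout
spec:
  backoffLimit: 5
  activeDeadlineSeconds: 100   # terminate the Job if it runs longer than 100 seconds
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: pi
        image: perl:5.34.0
        command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
```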

Note that both the Job spec and the Pod template spec within the Job have an activeDeadlineSeconds field. Ensure that you set this field at the proper level.

    Keep in mind that the restartPolicy applies to the Pod, and not to the Job itself: there is no automatic Job restart once the Job status is type: Failed. That is, the Job termination mechanisms activated with .spec.activeDeadlineSeconds and .spec.backoffLimit result in a permanent Job failure that requires manual intervention to resolve.

    hashtag
    Clean up finished jobs automatically

Finished Jobs are usually no longer needed in the system. Keeping them around in the system will put pressure on the API server. If the Jobs are managed directly by a higher level controller, such as CronJobs, the Jobs can be cleaned up by CronJobs based on the specified capacity-based cleanup policy.

    hashtag
    TTL mechanism for finished Jobs

    FEATURE STATE: Kubernetes v1.23 [stable]

Another way to clean up finished Jobs (either Complete or Failed) automatically is to use a TTL mechanism provided by a TTL controller for finished resources, by specifying the .spec.ttlSecondsAfterFinished field of the Job.

    When the TTL controller cleans up the Job, it will delete the Job cascadingly, i.e. delete its dependent objects, such as Pods, together with the Job. Note that when the Job is deleted, its lifecycle guarantees, such as finalizers, will be honored.

    For example:
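A sketch of the pi-with-ttl Job described below (the image is an assumption):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: pi-with-ttl
spec:
  ttlSecondsAfterFinished: 100   # delete the Job (and its Pods) 100 seconds after it finishes
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: pi
        image: perl:5.34.0
        command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
```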

    The Job pi-with-ttl will be eligible to be automatically deleted, 100 seconds after it finishes.

    If the field is set to 0, the Job will be eligible to be automatically deleted immediately after it finishes. If the field is unset, this Job won't be cleaned up by the TTL controller after it finishes.

    Note:

It is recommended to set the ttlSecondsAfterFinished field because unmanaged Jobs (Jobs that you created directly, and not indirectly through other workload APIs such as CronJob) have a default deletion policy of orphanDependents, causing Pods created by an unmanaged Job to be left around after that Job is fully deleted. Even though the control plane eventually garbage collects the Pods from a deleted Job after they either fail or complete, sometimes those lingering Pods may cause cluster performance degradation or, in the worst case, cause the cluster to go offline due to this degradation.

You can use LimitRanges and ResourceQuotas to place a cap on the amount of resources that a particular namespace can consume.

    hashtag
    Job patterns

    The Job object can be used to support reliable parallel execution of Pods. The Job object is not designed to support closely-communicating parallel processes, as commonly found in scientific computing. It does support parallel processing of a set of independent but related work items. These might be emails to be sent, frames to be rendered, files to be transcoded, ranges of keys in a NoSQL database to scan, and so on.

    In a complex system, there may be multiple different sets of work items. Here we are just considering one set of work items that the user wants to manage together — a batch job.

    There are several different patterns for parallel computation, each with strengths and weaknesses. The tradeoffs are:

    • One Job object for each work item, vs. a single Job object for all work items. The latter is better for large numbers of work items. The former creates some overhead for the user and for the system to manage large numbers of Job objects.

    • Number of pods created equals number of work items, vs. each Pod can process multiple work items. The former typically requires less modification to existing code and containers. The latter is better for large numbers of work items, for similar reasons to the previous bullet.

    • Several approaches use a work queue. This requires running a queue service, and modifications to the existing program or container to make it use the work queue. Other approaches are easier to adapt to an existing containerised application.

    The tradeoffs are summarized here, with columns 2 to 4 corresponding to the above tradeoffs. The pattern names are also links to examples and more detailed description.

| Pattern | Single Job object | Fewer pods than work items? | Use app unmodified? |
| --- | --- | --- | --- |
| Queue with Pod Per Work Item | ✓ | | sometimes |
| Queue with Variable Pod Count | ✓ | ✓ | |
| Indexed Job with Static Work Assignment | ✓ | | ✓ |
| Job with Pod-to-Pod Communication | ✓ | sometimes | sometimes |
| Job Template Expansion | | | ✓ |

When you specify completions with .spec.completions, each Pod created by the Job controller has an identical spec. This means that all pods for a task will have the same command line and the same image, the same volumes, and (almost) the same environment variables. These patterns are different ways to arrange for pods to work on different things.

This table shows the required settings for .spec.parallelism and .spec.completions for each of the patterns. Here, W is the number of work items.

| Pattern | .spec.completions | .spec.parallelism |
| --- | --- | --- |
| Queue with Pod Per Work Item | W | any |
| Queue with Variable Pod Count | null | any |
| Indexed Job with Static Work Assignment | W | any |
| Job with Pod-to-Pod Communication | W | W |
| Job Template Expansion | 1 | should be 1 |

    hashtag
    Advanced usage

    hashtag
    Suspending a Job

    FEATURE STATE: Kubernetes v1.24 [stable]

    When a Job is created, the Job controller will immediately begin creating Pods to satisfy the Job's requirements and will continue to do so until the Job is complete. However, you may want to temporarily suspend a Job's execution and resume it later, or start Jobs in suspended state and have a custom controller decide later when to start them.

    To suspend a Job, you can update the .spec.suspend field of the Job to true; later, when you want to resume it again, update it to false. Creating a Job with .spec.suspend set to true will create it in the suspended state.

    When a Job is resumed from suspension, its .status.startTime field will be reset to the current time. This means that the .spec.activeDeadlineSeconds timer will be stopped and reset when a Job is suspended and resumed.

When you suspend a Job, any running Pods that don't have a status of Completed will be terminated with a SIGTERM signal. The Pod's graceful termination period will be honored and your Pod must handle this signal in this period. This may involve saving progress for later or undoing changes. Pods terminated this way will not count towards the Job's completions count.

    An example Job definition in the suspended state can be like so:
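A minimal sketch; the Job name myjob and the busybox container are placeholders:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: myjob                # placeholder name
spec:
  suspend: true              # the Job is created but no Pods are started until this is set to false
  parallelism: 1
  completions: 5
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: worker
        image: busybox:1.36  # illustrative workload
        command: ["sh", "-c", "echo working && sleep 5"]
```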

You can also toggle Job suspension by patching the Job using the command line.

Suspend an active Job with kubectl patch job/myjob --type=strategic --patch '{"spec":{"suspend":true}}'.

Resume a suspended Job by patching .spec.suspend back to false: kubectl patch job/myjob --type=strategic --patch '{"spec":{"suspend":false}}'.

The Job's status can be used to determine if a Job is suspended or has been suspended in the past, for example by inspecting it with kubectl get jobs/myjob -o yaml.

    The Job condition of type "Suspended" with status "True" means the Job is suspended; the lastTransitionTime field can be used to determine how long the Job has been suspended for. If the status of that condition is "False", then the Job was previously suspended and is now running. If such a condition does not exist in the Job's status, the Job has never been stopped.

    Events are also created when the Job is suspended and resumed:

    The last four events, particularly the "Suspended" and "Resumed" events, are directly a result of toggling the .spec.suspend field. In the time between these two events, we see that no Pods were created, but Pod creation restarted as soon as the Job was resumed.

    hashtag
    Mutable Scheduling Directives

    FEATURE STATE: Kubernetes v1.27 [stable]

    In most cases a parallel job will want the pods to run with constraints, like all in the same zone, or all either on GPU model x or y but not a mix of both.

The suspend field is the first step towards achieving those semantics. Suspend allows a custom queue controller to decide when a Job should start; however, once a Job is unsuspended, a custom queue controller has no influence on where the Pods of a Job will actually land.

    This feature allows updating a Job's scheduling directives before it starts, which gives custom queue controllers the ability to influence pod placement while at the same time offloading actual pod-to-node assignment to kube-scheduler. This is allowed only for suspended Jobs that have never been unsuspended before.

The fields in a Job's pod template that can be updated are node affinity, node selector, tolerations, labels, annotations and scheduling gates.

    hashtag
    Specifying your own Pod selector

    Normally, when you create a Job object, you do not specify .spec.selector. The system defaulting logic adds this field when the Job is created. It picks a selector value that will not overlap with any other jobs.

    However, in some cases, you might need to override this automatically set selector. To do this, you can specify the .spec.selector of the Job.

    Be very careful when doing this. If you specify a label selector which is not unique to the pods of that Job, and which matches unrelated Pods, then pods of the unrelated job may be deleted, or this Job may count other Pods as completing it, or one or both Jobs may refuse to create Pods or run to completion. If a non-unique selector is chosen, then other controllers (e.g. ReplicationController) and their Pods may behave in unpredictable ways too. Kubernetes will not stop you from making a mistake when specifying .spec.selector.

    Here is an example of a case when you might want to use this feature.

Say Job old is already running. You want existing Pods to keep running, but you want the rest of the Pods it creates to use a different pod template and for the Job to have a new name. You cannot update the Job because these fields are not updatable. Therefore, you delete Job old but leave its pods running, using kubectl delete jobs/old --cascade=orphan. Before deleting it, you make a note of what selector it uses, for example with kubectl get job old -o yaml.

    Then you create a new Job with name new and you explicitly specify the same selector. Since the existing Pods have label batch.kubernetes.io/controller-uid=a8f3d00d-c6d2-11e5-9f87-42010af00002, they are controlled by Job new as well.

    You need to specify manualSelector: true in the new Job since you are not using the selector that the system normally generates for you automatically.

    The new Job itself will have a different uid from a8f3d00d-c6d2-11e5-9f87-42010af00002. Setting manualSelector: true tells the system that you know what you are doing and to allow this mismatch.
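A sketch of what the new Job could look like, reusing the controller-uid label noted above; the container is illustrative and stands in for your new pod template:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: new
spec:
  manualSelector: true       # tells Kubernetes you intentionally reuse an existing selector
  selector:
    matchLabels:
      batch.kubernetes.io/controller-uid: a8f3d00d-c6d2-11e5-9f87-42010af00002
  template:
    metadata:
      labels:
        batch.kubernetes.io/controller-uid: a8f3d00d-c6d2-11e5-9f87-42010af00002
    spec:
      restartPolicy: Never
      containers:
      - name: worker          # illustrative container; put your new pod template here
        image: busybox:1.36
        command: ["sh", "-c", "echo processing && sleep 5"]
```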

    hashtag
    Job tracking with finalizers

FEATURE STATE: Kubernetes v1.26 [stable]

Note: The control plane doesn't track Jobs using finalizers, if the Jobs were created when the feature gate JobTrackingWithFinalizers was disabled, even after you upgrade the control plane to 1.26.

    The control plane keeps track of the Pods that belong to any Job and notices if any such Pod is removed from the API server. To do that, the Job controller creates Pods with the finalizer batch.kubernetes.io/job-tracking. The controller removes the finalizer only after the Pod has been accounted for in the Job status, allowing the Pod to be removed by other controllers or users.

Jobs created before upgrading to Kubernetes 1.26 or before the feature gate JobTrackingWithFinalizers is enabled are tracked without the use of Pod finalizers. The Job updates the status counters for succeeded and failed Pods based only on the Pods that exist in the cluster. The control plane can lose track of the progress of the Job if Pods are deleted from the cluster.

    You can determine if the control plane is tracking a Job using Pod finalizers by checking if the Job has the annotation batch.kubernetes.io/job-tracking. You should not manually add or remove this annotation from Jobs. Instead, you can recreate the Jobs to ensure they are tracked using Pod finalizers.

    hashtag
    Elastic Indexed Jobs

    FEATURE STATE: Kubernetes v1.27 [beta]

You can scale Indexed Jobs up or down by mutating both .spec.parallelism and .spec.completions together such that .spec.parallelism == .spec.completions. When the ElasticIndexedJob feature gate on the API server is disabled, .spec.completions is immutable.

Use cases for elastic Indexed Jobs include batch workloads which require scaling an indexed Job, such as MPI, Horovod, Ray, and PyTorch training jobs.

    Kubernetes Network

    Moving from physical networks using switches, routers, and ethernet cables to virtual networks using software-defined networks (SDN) and virtual interfaces involves a slight learning curve. Of course, the principles remain the same, but there are different specifications and best practices. Kubernetes has its own set of rules, and if you're dealing with containers and the cloud, it helps to understand how Kubernetes networking works.

    The Kubernetes Network Model has a few general rules to keep in mind:

1. Every Pod gets its own IP address: There should be no need to create links between Pods and no need to map container ports to host ports.

2. NAT is not required: Pods on a node should be able to communicate with all Pods on all nodes without NAT.

3. Agents get all-access passes: Agents on a node (system daemons, Kubelet) can communicate with all the Pods in that node.

4. Shared namespaces: Containers within a Pod share a network namespace (IP and MAC address), so they can communicate with each other using the loopback address.

    Application Deployment Process

    circle-info

Application Deployment Process: Start Date: 15 Sep 2023

End Date: 20 Sep 2023

    Day One :

To install Proxmox on an OVH server with a public IP and then install and configure Ubuntu as a virtual machine, follow these steps:

    Kubernetes ConfigMap and Secret

    hashtag
    Kubernetes Configmaps

If we have many pod definition files, it becomes hard to manage the environment data stored within them. We can take this information out of the pod definition files and manage it centrally using ConfigMaps. ConfigMaps are used to provide configuration data in key-value pairs in Kubernetes. There are two phases involved in using ConfigMaps: the first is to create the ConfigMap, and the second is to inject it into the Pod.

    ConfigMaps
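A minimal sketch of both phases, assuming a hypothetical app-config ConfigMap that a Pod consumes via envFrom:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config           # hypothetical name
data:
  APP_ENV: production
  APP_DEBUG: "false"
---
apiVersion: v1
kind: Pod
metadata:
  name: app-pod              # hypothetical name
spec:
  containers:
  - name: app
    image: nginx:1.25        # illustrative image
    envFrom:
    - configMapRef:
        name: app-config     # injects every key of the ConfigMap as an environment variable
```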

    1. Provision OVH Server: Access your OVH account and provision a server with the desired specifications, including a public IP address.

    2. Connect to the Server: Once the server is provisioned, you can connect to it via SSH using a terminal application like Terminal (Mac/Linux). Use the server's public IP address and the SSH credentials provided by OVH.

    3. Install Proxmox VE: Proxmox VE is a server virtualization platform that will allow you to create and manage virtual machines. Follow the official Proxmox VE installation guide specific to your server's operating system. Typically, it involves running a script provided by Proxmox to install the necessary packages.

    4. Access Proxmox Web Interface: Once Proxmox VE is installed, access its web interface by opening a web browser and entering the server's IP address followed by the default port 8006. For example, https://<server_ip>:8006. Accept any security warnings and log in using the default username root and the password set during the Proxmox installation.

    5. Configure Network: In the Proxmox web interface, navigate to "Datacenter" -> "Network" -> "VM Network" and configure the network settings for your virtual machines. Ensure the public IP address assigned to your OVH server is properly configured.

    6. Create Virtual Machine (VM): In the Proxmox web interface, click on "Create VM" to create a new virtual machine. Follow the wizard to specify the VM settings, such as the amount of RAM, CPU cores, disk space, and networking. Choose Ubuntu as the guest OS template.

    7. Install Ubuntu on the VM: After creating the VM, select it in the Proxmox interface and click on "Console" to access the virtual machine's console. Start the VM and mount the Ubuntu ISO image to perform the installation. Follow the Ubuntu installation process, configuring options like language, partitioning, and user credentials.

    8. Configure Ubuntu: Once Ubuntu is installed, log in to the virtual machine and perform any necessary configurations. Update the system packages using sudo apt update and sudo apt upgrade. Install additional software, set up firewall rules, configure network interfaces, etc., as per your requirements.

    9. Access Ubuntu VM: To access the Ubuntu VM from your local machine, you can use SSH or remote desktop tools like VNC. Ensure you have network connectivity between your local machine and the OVH server.

    Day TWO :

    1. Containerize application: Package your application and its dependencies into a container image. Kubernetes uses containers to run and manage applications. You can use tools like Docker to create container images.

2. Create Kubernetes manifests: Kubernetes uses YAML or JSON files called manifests to define the desired state of your application. Manifests describe the resources that Kubernetes should create and manage, such as pods, services, deployments, and ingress rules. You need to create these manifests to define your application's structure, dependencies, and configurations (see the example manifest after this list).

    3. Define a Deployment: A Deployment is a Kubernetes resource that manages the lifecycle of your application. It ensures that the desired number of instances (replicas) of your application are running and handles updates and rollbacks. You define a Deployment in your manifest file, specifying details like the container image, replicas, labels, and environment variables.

    4. Apply the manifests: Use the kubectl apply command to apply your Kubernetes manifests to the cluster. This command creates or updates the specified resources based on the desired state defined in the manifests.

    5. Verify the deployment: After applying the manifests, you can use kubectl commands to verify the status of your deployment. For example, kubectl get pods shows the running pods, kubectl get deployments shows the status of the deployments, and kubectl logs <pod-name> retrieves the logs of a specific pod.

    6. Expose the application: To make your application accessible from outside the cluster, you'll need to expose it using a Service. A Service is a Kubernetes resource that provides a stable network endpoint to access your application. There are different types of Services, such as ClusterIP, NodePort, and LoadBalancer. You define and create a Service in your manifest file.

    7. Scale and update the deployment: Kubernetes allows you to scale your application horizontally by adjusting the number of replicas in your Deployment. You can use the kubectl scale command to scale your application. To update your application, you modify the Deployment's manifest file with the new version or configuration and apply the changes using kubectl apply again.

    8. Monitor and manage: Kubernetes provides various tools and approaches for monitoring and managing your application. You can use kubectl commands, Kubernetes Dashboard, or third-party monitoring tools to monitor the health, resource usage, and logs of your application.
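To make steps 3 and 6 concrete, here is a minimal sketch of a Deployment plus a ClusterIP Service; the name web-app, the image reference, and the ports are placeholders:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app                  # hypothetical application name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web-app
  template:
    metadata:
      labels:
        app: web-app
    spec:
      containers:
      - name: web-app
        image: registry.example.com/web-app:1.0.0   # placeholder image reference
        ports:
        - containerPort: 8080
        env:
        - name: APP_ENV
          value: production
---
apiVersion: v1
kind: Service
metadata:
  name: web-app
spec:
  type: ClusterIP              # use NodePort/LoadBalancer or an Ingress for external access
  selector:
    app: web-app
  ports:
  - port: 80
    targetPort: 8080
```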

    Day Three :

    build a setup that includes Amazon S3 for Laravel media storage, Amazon CloudFront for content delivery, and Amazon SES for email integration in a Laravel application, you can follow these steps:

    1. Set up an AWS Account: If you don't already have one, create an AWS account at https://aws.amazon.comarrow-up-right and ensure you have the necessary permissions to create and manage services like S3, CloudFront, and SES.

    2. Configure Laravel: Install and configure Laravel for your application. This involves setting up the database connection, configuring the mail driver (to use SES later), and any other necessary Laravel configurations.

    3. Set up Amazon S3: Create an S3 bucket in the AWS Management Console. This bucket will store your media files. Ensure you have the appropriate permissions to access the bucket. You may need to create an IAM user and attach the necessary policies.

    4. Install and Configure Laravel S3 Driver: Install the league/flysystem-aws-s3-v3 package through Composer to enable Laravel's S3 driver. Configure the S3 driver in Laravel's filesystems.php configuration file, specifying the S3 bucket and credentials.

    5. Upload and Retrieve Media: In your Laravel application, use the storage facade to upload and retrieve media files. For example, you can use Storage::put() to upload files to S3 and Storage::url() to generate URLs for accessing the files.

    6. Set up Amazon CloudFront: Create a CloudFront distribution in the AWS Management Console. Configure CloudFront to use your S3 bucket as the origin and specify the desired settings, such as caching, SSL, and custom domain names. CloudFront acts as a content delivery network (CDN) to cache and deliver your media files globally.

    7. Integrate CloudFront URLs in Laravel: Replace the direct S3 URLs in your Laravel application with the CloudFront URLs generated for your media files. This ensures that files are served through CloudFront for faster and more efficient delivery.

    8. Set up Amazon SES: In the AWS Management Console, create an SES (Simple Email Service) configuration, verify your domain or email addresses, and configure the necessary email settings, such as DKIM and SPF records.

    9. Configure Laravel Mail: Update Laravel's mail driver configuration to use SES. Modify the mail.php configuration file and specify the SES SMTP credentials and region.

    10. Send Emails: Use Laravel's built-in mail functionality (Mail facade) to send emails from your application. Laravel will use SES as the underlying mail transport.

    11. Test and Verify: Test the media upload/download functionality and email sending from your Laravel application to ensure everything is working as expected. Monitor logs and error messages to identify and resolve any issues.

    Day Four :

1. Connect a GitHub Repository: Create a GitHub repository for your application's source code. Initialize the repository with your project files and push them to the remote repository.

2. Choose a CI/CD Platform: Select a CI/CD platform that integrates well with GitHub. Some popular options include GitHub Actions and Devtron.

3. Configure CI/CD with GitHub Actions: Add a Dockerfile, an nginx configuration, and a supervisor.sh script to your GitHub repository, and create a workflow directory (.github/workflows). Inside this directory, create a YAML file (e.g., ci-cd.yaml) to define your CI/CD workflow. Configure the workflow to trigger on specific events, such as a push to the master branch or a pull request. Define the necessary steps, such as building the application, running tests, and creating a build artifact (see the workflow sketch after this list).

4. Define Environment-Specific Configurations: Determine the necessary configurations for your staging and production environments. This may include environment variables, database connections, and other settings specific to each environment.

    5. Set up Staging Environment: Create a staging environment using Devtron or any other Kubernetes deployment tool of your choice. Devtron simplifies deployment on Kubernetes by providing a user-friendly interface and automation. Configure the necessary Kubernetes resources, such as namespaces, deployments, services, and ingress rules, to create your staging environment.

    6. Configure Deployment Workflow: Modify your CI/CD workflow to include the deployment of your application to the staging environment. This involves building a Docker image, pushing it to a container registry like Docker Hub or Amazon ECR, and deploying the image to your staging environment using Devtron or Kubernetes manifest files.

    7. Test and Verify Staging Environment: Run tests and perform manual verification to ensure that the staging environment is functioning correctly. This includes testing application functionality, performance, and integration with any dependent services.

    8. Set up Production Environment: Create a production environment using Devtron or Kubernetes in a similar manner to the staging environment. Configure the necessary Kubernetes resources according to your production requirements, such as scaling, high availability, and security.

    9. Configure Production Deployment Workflow: Modify your CI/CD workflow to include the deployment of your application to the production environment. This may involve additional steps for promoting the application from the staging environment to production, such as manual approval gates or specific branching strategies.

    10. Test and Verify Production Environment: Run tests and perform thorough verification to ensure that the production environment is functioning correctly. This includes testing application functionality, security, performance, and any other critical aspects.

    11. Monitor and Manage: Implement monitoring and logging solutions to gain insights into your application's performance and health in both staging and production environments. Utilize Devtron's monitoring capabilities or integrate with third-party tools like Prometheus or Grafana.
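A minimal sketch of the workflow referenced in step 3; the registry path ghcr.io/example-org/app is a placeholder, and the deployment step to Devtron/Kubernetes is intentionally omitted:

```yaml
# .github/workflows/ci-cd.yaml -- sketch only; registry path is a placeholder
name: ci-cd
on:
  push:
    branches: [master]
  pull_request:
jobs:
  build-and-push:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Build image
        run: docker build -t ghcr.io/example-org/app:${{ github.sha }} .
      - name: Log in to registry
        run: echo "${{ secrets.GITHUB_TOKEN }}" | docker login ghcr.io -u ${{ github.actor }} --password-stdin
      - name: Push image
        run: docker push ghcr.io/example-org/app:${{ github.sha }}
      # Deployment to the staging/production cluster (e.g. via Devtron or kubectl apply)
      # would follow here and depends on your environment.
```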

    Day Five :

    Build a data protection process for a MySQL database and Laravel media files stored in an S3 bucket with a lifecycle policy without encryption, you can follow these steps:

1. Backup MySQL Database: Implement a regular backup strategy for your MySQL database to protect against data loss. Use tools like mysqldump or database backup services to create automated backups at scheduled intervals (see the CronJob sketch after this list). Store the backups in a secure location, such as a separate server or cloud storage.

    2. Store Laravel Media Files in S3: Configure your Laravel application to store media files in an S3 bucket. Set up the necessary credentials and access policies to ensure secure access to the bucket. This allows you to offload the storage of media files to a scalable and durable object storage service.

    3. Enable Server-Side Logging: Enable server-side logging for the S3 bucket to capture access logs. This will help you monitor and track any unauthorized access attempts or suspicious activities related to your media files.

    4. Configure Lifecycle Policy: Define a lifecycle policy for your S3 bucket to manage the lifecycle of the media files. The lifecycle policy can specify rules to transition or expire objects based on criteria such as age, object size, or specific prefixes. For example, you can set rules to automatically move files to Glacier storage after a certain period or delete files that have reached their retention period.

    5. Enable Versioning: Enable versioning for the S3 bucket to preserve multiple versions of the media files. This provides an additional layer of protection by allowing you to restore previous versions of files in case of accidental deletion or corruption.

6. Implement Access Controls: Apply appropriate access controls to your MySQL database and the S3 bucket to restrict unauthorized access.

    7. Monitor and Test: Regularly monitor the backup process, database integrity, and S3 storage to ensure everything is functioning correctly. Perform periodic tests to restore data from backups and validate the integrity of the restored data.

    8. Disaster Recovery Plan: Develop a comprehensive disaster recovery plan that includes steps to restore the MySQL database and media files from backups in case of data loss or system failure. Document the procedures and regularly review and test the plan to ensure its effectiveness.

    9. Regularly Update and Patch: Keep your MySQL database, Laravel application, and server infrastructure up to date with the latest security patches and updates. Regularly review and apply Laravel framework updates and security best practices to protect against vulnerabilities.
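As referenced in step 1, a Kubernetes CronJob is one way to schedule mysqldump backups. This is a sketch only; the database host, credentials Secret, and PVC names are hypothetical:

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: mysql-backup                 # hypothetical name
spec:
  schedule: "0 2 * * *"              # daily at 02:00
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: Never
          containers:
          - name: mysqldump
            image: mysql:8.0
            command:
            - /bin/sh
            - -c
            - mysqldump -h "$DB_HOST" -u "$DB_USER" -p"$DB_PASSWORD" --all-databases > /backup/backup-$(date +%F).sql
            env:
            - name: DB_HOST
              value: db.example.internal          # placeholder database host
            - name: DB_USER
              value: backup
            - name: DB_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: mysql-backup-credentials  # hypothetical Secret
                  key: password
            volumeMounts:
            - name: backup
              mountPath: /backup
          volumes:
          - name: backup
            persistentVolumeClaim:
              claimName: mysql-backup-pvc         # hypothetical PVC
```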

If the pods' creation times differ, the pod that was created more recently comes before the older pod (the creation times are bucketed on an integer log scale when the LogarithmicScaleDown feature gate is enabled).

  • Scale up the Deployment to facilitate more load.

  • Pause the rollout of a Deployment to apply multiple fixes to its PodTemplateSpec and then resume it to start a new rollout.

  • Use the status of the Deployment as an indicator that a rollout has stuck.

  • Clean up older ReplicaSets that you don't need anymore.

  • The .spec.selector field defines how the created ReplicaSet finds which Pods to manage. In this case, you select a label that is defined in the Pod template (app: nginx). However, more sophisticated selection rules are possible, as long as the Pod template itself satisfies the rule.

    Note: The .spec.selector.matchLabels field is a map of {key,value} pairs. A single {key,value} in the matchLabels map is equivalent to an element of matchExpressions, whose key field is "key", the operator is "In", and the values array contains only "value". All of the requirements, from both matchLabels and matchExpressions, must be satisfied in order to match.

  • The template field contains the following sub-fields:

• The Pods are labeled app: nginx using the .metadata.labels field.

    • The Pod template's specification, or .template.spec field, indicates that the Pods run one container, nginx, which runs the nginx image at version 1.14.2.

    • Create one container and name it nginx using the .spec.template.spec.containers[0].name field.

  • NAME lists the names of the Deployments in the namespace.

  • READY displays how many replicas of the application are available to your users. It follows the pattern ready/desired.

  • UP-TO-DATE displays the number of replicas that have been updated to achieve the desired state.

  • AVAILABLE displays how many replicas of the application are available to your users.

  • AGE displays the amount of time that the application has been running.

  • Notice how the number of desired replicas is 3 according to .spec.replicas field.

  • To see the Deployment rollout status, run kubectl rollout status deployment/nginx-deployment.

    The output is similar to:

  • Run the kubectl get deployments again a few seconds later. The output is similar to this:

    Notice that the Deployment has created all three replicas, and all replicas are up-to-date (they contain the latest Pod template) and available.

  • To see the ReplicaSet (rs) created by the Deployment, run kubectl get rs. The output is similar to this:

    ReplicaSet output shows the following fields:

    • NAME lists the names of the ReplicaSets in the namespace.

    • DESIRED displays the desired number of replicas of the application, which you define when you create the Deployment. This is the desired state.

    • CURRENT displays how many replicas are currently running.

    • READY displays how many replicas of the application are available to your users.

    • AGE displays the amount of time that the application has been running.

    Notice that the name of the ReplicaSet is always formatted as [DEPLOYMENT-NAME]-[HASH]. This name will become the basis for the Pods which are created.

    The HASH string is the same as the pod-template-hash label on the ReplicaSet.

  • To see the labels automatically generated for each Pod, run kubectl get pods --show-labels. The output is similar to:

    The created ReplicaSet ensures that there are three nginx Pods.

  • The output is similar to:

    Alternatively, you can edit the Deployment and change .spec.template.spec.containers[0].image from nginx:1.14.2 to nginx:1.16.1:

    The output is similar to:

  • To see the rollout status, run:

    The output is similar to this:

    or

  • Running get pods should now show only the new Pods:

    The output is similar to this:

    Next time you want to update these Pods, you only need to update the Deployment's Pod template again.

    Deployment ensures that only a certain number of Pods are down while they are being updated. By default, it ensures that at least 75% of the desired number of Pods are up (25% max unavailable).

    Deployment also ensures that only a certain number of Pods are created above the desired number of Pods. By default, it ensures that at most 125% of the desired number of Pods are up (25% max surge).

    For example, if you look at the above Deployment closely, you will see that it first creates a new Pod, then deletes an old Pod, and creates another new one. It does not kill old Pods until a sufficient number of new Pods have come up, and does not create new Pods until a sufficient number of old Pods have been killed. It makes sure that at least 3 Pods are available and that at max 4 Pods in total are available. In case of a Deployment with 4 replicas, the number of Pods would be between 3 and 5.

  • Get details of your Deployment:

    The output is similar to this:

    Here you see that when you first created the Deployment, it created a ReplicaSet (nginx-deployment-2035384211) and scaled it up to 3 replicas directly. When you updated the Deployment, it created a new ReplicaSet (nginx-deployment-1564180365) and scaled it up to 1 and waited for it to come up. Then it scaled down the old ReplicaSet to 2 and scaled up the new ReplicaSet to 2 so that at least 3 Pods were available and at most 4 Pods were created at all times. It then continued scaling up and down the new and the old ReplicaSet, with the same rolling update strategy. Finally, you'll have 3 available replicas in the new ReplicaSet, and the old ReplicaSet is scaled down to 0.

• Press Ctrl-C to stop the above rollout status watch. For more information on stuck rollouts, read more here.

  • You see that the number of old replicas (nginx-deployment-1564180365 and nginx-deployment-2035384211) is 2, and new replicas (nginx-deployment-3066724191) is 1.

    The output is similar to this:

  • Looking at the Pods created, you see that 1 Pod created by new ReplicaSet is stuck in an image pull loop.

    The output is similar to this:

    Note: The Deployment controller stops the bad rollout automatically, and stops scaling up the new ReplicaSet. This depends on the rollingUpdate parameters (maxUnavailable specifically) that you have specified. Kubernetes by default sets the value to 25%.

  • Get the description of the Deployment:

    The output is similar to this:

    To fix this, you need to rollback to a previous revision of Deployment that is stable.

  • Annotating the Deployment with kubectl annotate deployment/nginx-deployment kubernetes.io/change-cause="image updated to 1.16.1"
  • Manually editing the manifest of the resource.

  • To see the details of each revision, run:

    The output is similar to this:

  • The Deployment is now rolled back to a previous stable revision. As you can see, a DeploymentRollback event for rolling back to revision 2 is generated from Deployment controller.
  • Check if the rollback was successful and the Deployment is running as expected, run:

    The output is similar to this:

  • Get the description of the Deployment:

    The output is similar to this:

  • The image update starts a new rollout with ReplicaSet nginx-deployment-1989198191, but it's blocked due to the maxUnavailable requirement that you mentioned above. Check out the rollout status:

    The output is similar to this:

  • Then a new scaling request for the Deployment comes along. The autoscaler increments the Deployment replicas to 15. The Deployment controller needs to decide where to add these new 5 replicas. If you weren't using proportional scaling, all 5 of them would be added in the new ReplicaSet. With proportional scaling, you spread the additional replicas across all ReplicaSets. Bigger proportions go to the ReplicaSets with the most replicas and lower proportions go to ReplicaSets with less replicas. Any leftovers are added to the ReplicaSet with the most replicas. ReplicaSets with zero replicas are not scaled up.

  • Pause by running the following command:

    The output is similar to this:

  • Then update the image of the Deployment:

    The output is similar to this:

  • Notice that no new rollout started:

    The output is similar to this:

  • Get the rollout status to verify that the existing ReplicaSet has not changed:

    The output is similar to this:

  • You can make as many updates as you wish, for example, update the resources that will be used:

    The output is similar to this:

    The initial state of the Deployment prior to pausing its rollout will continue its function, but new updates to the Deployment will not have any effect as long as the Deployment rollout is paused.

  • Eventually, resume the Deployment rollout and observe a new ReplicaSet coming up with all the new updates:

    The output is similar to this:

  • Watch the status of the rollout until it's done.

    The output is similar to this:

  • Get the status of the latest rollout:

    The output is similar to this:

• New Pods become ready or available (ready for at least MinReadySeconds).

A Deployment can also fail to progress, for example due to insufficient permissions, limit ranges, or application runtime misconfiguration.

• Ordered, automated rolling updates.

  StatefulSets currently require a Headless Service to be responsible for the network identity of the Pods. You are responsible for creating this Service.

• StatefulSets do not provide any guarantees on the termination of pods when a StatefulSet is deleted. To achieve ordered and graceful termination of the pods in the StatefulSet, it is possible to scale the StatefulSet down to 0 prior to deletion.

• When using Rolling Updates with the default Pod Management Policy (OrderedReady), it's possible to get into a broken state that requires manual intervention to repair.

• volumeClaimTemplates will provide stable storage using PersistentVolumes provisioned by a PersistentVolume Provisioner.

Here are some examples of choices for Cluster Domain, Service name, StatefulSet name, and how that affects the DNS names for the StatefulSet's Pods:

| Cluster Domain | Service (ns/name) | StatefulSet (ns/name) | StatefulSet Domain | Pod DNS | Pod Hostname |
| --- | --- | --- | --- | --- | --- |
| cluster.local | default/nginx | default/web | nginx.default.svc.cluster.local | web-{0..N-1}.nginx.default.svc.cluster.local | web-{0..N-1} |
| cluster.local | foo/nginx | foo/web | nginx.foo.svc.cluster.local | web-{0..N-1}.nginx.foo.svc.cluster.local | web-{0..N-1} |
| kube.local | foo/nginx | foo/web | nginx.foo.svc.kube.local | web-{0..N-1}.nginx.foo.svc.kube.local | web-{0..N-1} |

    Before a Pod is terminated, all of its successors must be completely shutdown.

DaemonSet Pods are created with the following tolerations:

| Toleration key | Effect | Details |
| --- | --- | --- |
| node.kubernetes.io/not-ready | NoExecute | DaemonSet Pods can be scheduled onto nodes that are not healthy or ready to accept Pods. Any DaemonSet Pods running on such nodes will not be evicted. |
| node.kubernetes.io/unreachable | NoExecute | DaemonSet Pods can be scheduled onto nodes that are unreachable from the node controller. Any DaemonSet Pods running on such nodes will not be evicted. |
| node.kubernetes.io/disk-pressure | NoSchedule | DaemonSet Pods can be scheduled onto nodes with disk pressure issues. |
| node.kubernetes.io/memory-pressure | NoSchedule | DaemonSet Pods can be scheduled onto nodes with memory pressure issues. |
| node.kubernetes.io/pid-pressure | NoSchedule | DaemonSet Pods can be scheduled onto nodes with process pressure issues. |
| node.kubernetes.io/unschedulable | NoSchedule | DaemonSet Pods can be scheduled onto nodes that are unschedulable. |
| node.kubernetes.io/network-unavailable | NoSchedule | Only added for DaemonSet Pods that request host networking, i.e., Pods having spec.hostNetwork: true. Such DaemonSet Pods can be scheduled onto nodes with unavailable network. |

• DNS: Create a headless service with the same pod selector, and then discover DaemonSets using the endpoints resource or retrieve multiple A records from DNS.

• Service: Create a service with the same Pod selector, and use the service to reach a daemon on a random node. (No way to reach a specific node.)


The Kubernetes networking model is built on a few fundamental rules:

  • NAT is not required: Pods on a node should be able to communicate with all Pods on all nodes without NAT.
  • Agents get all-access passes: Agents on a node (system daemons, Kubelet) can communicate with all the Pods in that node.

  • Shared namespaces: Containers within a Pod share a network namespace (IP and MAC address), so they can communicate with each other using the loopback address.

hashtag
    What Kubernetes networking solves

    Kubernetes networking is designed to ensure that the different entity types within Kubernetes can communicate. The layout of a Kubernetes infrastructure has, by design, a lot of separation. Namespaces, containers, and Pods are meant to keep components distinct from one another, so a highly structured plan for communication is important.

    (Nived Velayudhan, CC BY-SA 4.0)

    hashtag
    Container-to-container networking

    Container-to-container networking happens through the Pod network namespace. Network namespaces allow you to have separate network interfaces and routing tables that are isolated from the rest of the system and operate independently. Every Pod has its own network namespace, and containers inside that Pod share the same IP address and ports. All communication between these containers happens through localhost, as they are all part of the same namespace. (Represented by the green line in the diagram.)

    hashtag
    Pod-to-Pod networking

With Kubernetes, every node has a designated CIDR range of IPs for Pods. This ensures that every Pod receives a unique IP address that other Pods in the cluster can see, and because each node draws from its own range, Pod IP addresses never overlap as new Pods are created. Unlike container-to-container networking, Pod-to-Pod communication happens using real IPs, whether the Pods run on the same node or on different nodes in the cluster.

    The diagram shows that for Pods to communicate with each other, the traffic must flow between the Pod network namespace and the Root network namespace. This is achieved by connecting both the Pod namespace and the Root namespace by a virtual ethernet device or a veth pair (veth0 to Pod namespace 1 and veth1 to Pod namespace 2 in the diagram). A virtual network bridge connects these virtual interfaces, allowing traffic to flow between them using the Address Resolution Protocol (ARP).

    When data is sent from Pod 1 to Pod 2, the flow of events is:

    1. Pod 1 traffic flows through eth0 to the Root network namespace's virtual interface veth0.

    2. Traffic then goes through veth0 to the virtual bridge, which is connected to veth1.

    3. Traffic goes through the virtual bridge to veth1.

    4. Finally, traffic reaches the eth0 interface of Pod 2 through veth1.
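
To see this in practice, you can list the IPs the cluster has assigned and exercise Pod-to-Pod traffic directly. A minimal sketch, assuming two running Pods named pod1 and pod2 (hypothetical names) and a container image that ships with ping:

# Show the IP address each Pod received from its node's CIDR range
kubectl get pods -o wide

# Capture pod2's IP and reach it from pod1 using its real Pod IP
POD2_IP=$(kubectl get pod pod2 -o jsonpath='{.status.podIP}')
kubectl exec pod1 -- ping -c 3 "$POD2_IP"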

    hashtag
    Pod-to-Service networking

    Pods are very dynamic. They may need to scale up or down based on demand. They may be created again in case of an application crash or a node failure. These events cause a Pod's IP address to change, which would make networking a challenge.

(Nived Velayudhan, CC BY-SA 4.0)

    Kubernetes solves this problem by using the Service function, which does the following:

    1. Assigns a static virtual IP address in the frontend to connect any backend Pods associated with the Service.

    2. Load-balances any traffic addressed to this virtual IP to the set of backend Pods.

    3. Keeps track of the IP address of a Pod, such that even if the Pod IP address changes, the clients don't have any trouble connecting to the Pod because they only directly connect with the static virtual IP address of the Service itself.
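
You can observe this decoupling directly: the Service keeps its virtual IP while the Endpoints object tracks whichever Pod IPs currently back it. A quick check, assuming a Service named my-service (hypothetical name) already exists:

# The Service's virtual IP (ClusterIP) stays stable
kubectl get service my-service

# The backend Pod IPs behind that virtual IP change as Pods come and go
kubectl get endpoints my-service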

    The in-cluster load balancing occurs in two ways:

    1. IPTABLES: In this mode, kube-proxy watches for changes in the API Server. For each new Service, it installs iptables rules, which capture traffic to the Service's clusterIP and port, then redirects traffic to the backend Pod for the Service. The Pod is selected randomly. This mode is reliable and has a lower system overhead because Linux Netfilter handles traffic without the need to switch between userspace and kernel space.

    2. IPVS: IPVS is built on top of Netfilter and implements transport-layer load balancing. IPVS uses the Netfilter hook function, using the hash table as the underlying data structure, and works in the kernel space. This means that kube-proxy in IPVS mode redirects traffic with lower latency, higher throughput, and better performance than kube-proxy in iptables mode.

The diagram above shows the packet flow from Pod 1 to Pod 3 through a Service on a different node (marked in red). A packet heading for the virtual bridge has to use the default route (eth0), because ARP running on the bridge does not understand the Service's virtual IP. The packet is then filtered by iptables, using the rules that kube-proxy has defined on the node, which is why the packet takes the path shown in the diagram.
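
To confirm which proxy mode your cluster uses, you can inspect the kube-proxy configuration. On kubeadm-based clusters it typically lives in a ConfigMap (an assumption about how the cluster was set up):

# Look for the configured proxy mode ("iptables" or "ipvs"; empty means the default)
kubectl -n kube-system get configmap kube-proxy -o yaml | grep -i mode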

    hashtag
    Internet-to-Service networking

    So far, I have discussed how traffic is routed within a cluster. There's another side to Kubernetes networking, though, and that's exposing an application to the external network.

    (Nived Velayudhan, CC BY-SA 4.0)

    You can expose an application to an external network in two different ways.

    1. Egress: Use this when you want to route traffic from your Kubernetes Service out to the Internet. In this case, iptables performs the source NAT, so the traffic appears to be coming from the node and not the Pod.

    2. Ingress: This is the incoming traffic from the external world to Services. Ingress also allows and blocks particular communications with Services using rules for connections. Typically, there are two ingress solutions that function on different network stack regions: the service load balancer and the ingress controller.

    hashtag
    Discovering Services

    There are two ways Kubernetes discovers a Service:

    1. Environment Variables: The kubelet service running on the node where your Pod runs is responsible for setting up environment variables for each active service in the format {SVCNAME}_SERVICE_HOST and {SVCNAME}_SERVICE_PORT. You must create the Service before the client Pods come into existence. Otherwise, those client Pods won't have their environment variables populated.

    2. DNS: The DNS service is implemented as a Kubernetes service that maps to one or more DNS server Pods, which are scheduled just like any other Pod. Pods in the cluster are configured to use the DNS service, with a DNS search list that includes the Pod's own namespace and the cluster's default domain. A cluster-aware DNS server, such as CoreDNS, watches the Kubernetes API for new Services and creates a set of DNS records for each one. If DNS is enabled throughout your cluster, all Pods can automatically resolve Services by their DNS name. The Kubernetes DNS server is the only way to access ExternalName Services.
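
As a quick check of DNS-based discovery, you can resolve a Service name from inside any Pod. A minimal sketch, assuming a Service called my-service in the default namespace and a container image that includes nslookup (both assumptions):

# Resolve the Service's cluster DNS name from inside a running Pod
kubectl exec -it pod1 -- nslookup my-service.default.svc.cluster.local

# Thanks to the DNS search list, the shorter names my-service and
# my-service.default also resolve from Pods in the default namespace.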

    hashtag
    ServiceTypes for publishing Services:

    Kubernetes Services provide you with a way of accessing a group of Pods, usually defined by using a label selector. This could be applications trying to access other applications within the cluster, or it could allow you to expose an application running in the cluster to the external world. Kubernetes ServiceTypes enable you to specify what kind of Service you want.

    (Ahmet Alp Balkan, CC BY-SA 4.0)

    The different ServiceTypes are:

    1. ClusterIP: This is the default ServiceType. It makes the Service only reachable from within the cluster and allows applications within the cluster to communicate with each other. There is no external access.

    2. LoadBalancer: This ServiceType exposes the Services externally using the cloud provider's load balancer. Traffic from the external load balancer is directed to the backend Pods. The cloud provider decides how it is load-balanced.

    3. NodePort: This allows the external traffic to access the Service by opening a specific port on all the nodes. Any traffic sent to this Port is then forwarded to the Service.

4. ExternalName: This type of Service maps the Service to the DNS name specified in its externalName field by returning a CNAME record with that value. No proxying of any kind is set up.

    hashtag
Cluster Networking

    In a standard Kubernetes deployment, there are several networking variations you should be aware of. Below are the most common networking situations to know.


    hashtag
    Container-to-container Networking

The smallest object we can deploy in Kubernetes is the pod; however, within each pod, you may want to run multiple containers. A common use case for this is a helper pattern, where a secondary container helps a primary container with tasks such as pushing and pulling data. Container-to-container communication within a K8s pod uses either the shared file system or the localhost network interface.

    hashtag
    Pod-to-Pod Networking

    Pod-to-pod networking can occur for pods within the same node or across nodes. Each of your nodes has a classless inter-domain routing (CIDR) block. This block is a defined set of unique IP addresses that are assigned to pods within that node. This ensures that each pod is provided with a unique IP regardless of which node it is in.

    There are 2 types of communication.

    • Inter-node communication

    • Intra-node communication


    hashtag
    Pod-to-Service Networking

    Kubernetes is designed to allow pods to be replaced dynamically, as needed. This means that pod IP addresses are not durable unless special precautions are taken, such as for stateful applications. To address this issue and ensure that communication with and between pods is maintained, Kubernetes uses services.

    Kubernetes services manage pod states and enable you to track pod IP addresses over time. These services abstract pod addresses by assigning a single virtual IP (a cluster IP) to a group of pod IPs. Then, any traffic sent to the virtual IP is distributed to the associated pods.

    This service IP enables pods to be created and destroyed as needed without affecting overall communications. It also enables Kubernetes services to act as in-cluster load balancers, distributing traffic as needed among associated pods.

    hashtag
    Internet-to-Service Networking

    The final networking situation that is needed for most deployments is between the Internet and services. Whether you are using Kubernetes for internal or external applications, you generally need Internet connectivity. This connectivity enables users to access your services and distributed teams to collaborate.

    When setting up external access, there are two techniques you need to use — egress and ingress. These are policies that you can set up with either whitelisting or blacklisting to control traffic into and out of your network.


    hashtag
What are Services in Kubernetes?

A Kubernetes Service provides a stable IP address, a single DNS name, and load balancing for a set of Pods. A Service identifies its member Pods with a selector: for a Pod to be a member of the Service, the Pod must have all of the labels specified in the selector. A label is an arbitrary key/value pair attached to an object. A K8s Service is both a REST object and an abstraction that defines a logical set of Pods and a policy for accessing that Pod set.

    Services select Pods based on their labels. When a network request is made to the service, it selects all Pods in the cluster matching the service’s selector, chooses one of them, and forwards the network request to it. Let us look at the core attributes of any kind of service in Kubernetes:

    • Label selector that locates pods

    • ClusterIP IP address & assigned port number

    • Port definitions

    • Optional mapping of incoming ports to a targetPort


    hashtag
    ClusterIP

    ClusterIP is the default Service type in Kubernetes. In this Service, Kubernetes creates a stable IP Address that is accessible from all the nodes in the cluster. The scope of this service is confined within the cluster only. The main use case of this service is to connect our Frontend Pods to our Backend Pods as we don’t expose backend Pods to the outside world because of security reasons.
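
A minimal ClusterIP Service manifest might look like the following sketch; the name, selector labels, and ports are illustrative assumptions:

apiVersion: v1
kind: Service
metadata:
  name: backend            # hypothetical name
spec:
  type: ClusterIP          # the default type, shown here for clarity
  selector:
    app: backend           # must match the labels on the backend Pods
  ports:
  - port: 8080             # port exposed on the ClusterIP
    targetPort: 8080       # container port on the selected Pods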

    hashtag
    NodePort

NodePort exposes the Service on each node's IP at a static port (the NodePort). NodePort builds on top of ClusterIP by mapping a static port on every worker node to the Service. We can reach a NodePort Service from outside the cluster by requesting <NodeIP>:<NodePort>. The main use case of the NodePort Service is to expose Pods to the outside world. Note: by default, only ports in the range 30000-32767 can be used.
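
A NodePort Service sketch, with illustrative names and ports (nodePort may also be omitted, in which case Kubernetes picks one from the allowed range):

apiVersion: v1
kind: Service
metadata:
  name: web-nodeport       # hypothetical name
spec:
  type: NodePort
  selector:
    app: web               # must match the labels on the Pods
  ports:
  - port: 80               # ClusterIP port
    targetPort: 80         # container port
    nodePort: 30080        # must fall inside the 30000-32767 range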

    hashtag
    LoadBalancer

The LoadBalancer Service is the standard way of exposing our application to the outside world or the internet. With multiple Pods deployed across multiple nodes, we could reach the application through any node's public IP and node port, but that raises two problems: which node IP do we hand to clients, and how is traffic balanced across the nodes in the cluster? A simple solution to both is the LoadBalancer Service.
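
A LoadBalancer Service sketch, assuming the cluster runs on a cloud provider that can provision external load balancers; names and ports are illustrative:

apiVersion: v1
kind: Service
metadata:
  name: web-lb             # hypothetical name
spec:
  type: LoadBalancer       # the cloud provider provisions an external load balancer
  selector:
    app: web
  ports:
  - port: 80
    targetPort: 80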


    hashtag
    Ingress Controller

An Ingress Controller is an intelligent load balancer. Ingress is a high-level abstraction responsible for simple host- or URL-based HTTP routing; it is always implemented by a third-party proxy, and those implementations are the Ingress Controllers. An Ingress Controller acts as a Layer-7 load balancer.
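
A minimal Ingress resource for host-based HTTP routing might look like the sketch below; the host, backing Service, and ingress class are assumptions, and an ingress controller must already be installed in the cluster:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-ingress                  # hypothetical name
spec:
  ingressClassName: nginx            # assumes an NGINX ingress controller is installed
  rules:
  - host: app.example.com            # hypothetical host
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: web                # hypothetical backing Service
            port:
              number: 80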

    Ingress Architecture


    hashtag
DNS in Kubernetes

    The Domain Name System (DNS) is the networking system in place that allows us to resolve human-friendly names to unique IP addresses. By default, most Kubernetes clusters automatically configure an internal DNS service to provide a lightweight mechanism for service discovery. Kube-DNS and CoreDNS are two established DNS solutions for defining DNS naming rules and resolving pod and service DNS to their corresponding cluster IPs. With DNS, Kubernetes services can be referenced by name that will correspond to any number of backend pods managed by the service.

    A DNS Pod consists of three separate containers:

    • Kubedns: watches the Kubernetes master for changes in Services and Endpoints, and maintains in-memory lookup structures to serve DNS requests.

    • Dnsmasq: adds DNS caching to improve performance.

    • Sidecar: provides a single health check endpoint to perform health checks for dnsmasq and kubedns.
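
You can verify the in-cluster DNS components with a couple of standard kubectl queries; the kube-dns Service name and the k8s-app label shown below are the usual defaults, but they can differ between distributions:

# The cluster DNS Service (CoreDNS is typically exposed under the name kube-dns)
kubectl -n kube-system get service kube-dns

# The DNS server Pods themselves
kubectl -n kube-system get pods -l k8s-app=kube-dns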

    hashtag
Lab 1 : Create ingress route to Nginx deployment

    hashtag
    Lab 2 : Create ingress route to Wordpress deployment

    There are two ways of creating a config map:

• The imperative way, without using a ConfigMap definition file

• The declarative way, by using a ConfigMap definition file

    hashtag
    Create a ConfigMap

A ConfigMap can be written manually as a YAML manifest and loaded into Kubernetes, or you can use the kubectl create configmap command to create it from the CLI.

    Create a ConfigMap Using kubectl

Run the kubectl create configmap command to create ConfigMaps from directories, files, or literal values:
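
The general form, plus two illustrative variants (the ConfigMap name, key, and file path are assumptions; use either variant, not both, for the same name):

kubectl create configmap <map-name> <data-source>

# From a literal key/value pair
kubectl create configmap app-config --from-literal=LOG_LEVEL=debug

# From a file on disk
kubectl create configmap app-config --from-file=./config/app.properties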

Where <map-name> is the name you want for the ConfigMap and <data-source> is the directory, file, or literal value to draw the data from.


    hashtag
    ConfigMaps and Pods

You can write a Pod spec that refers to a ConfigMap and configures the containers in that Pod based on the data in the ConfigMap. Note that the Pod and the ConfigMap must exist in the same namespace.

Consider, for example, a ConfigMap with some keys that hold single values and other keys whose values look like fragments of a configuration file.

There are four ways you can use a ConfigMap to configure a container in a Pod:

1. As the entrypoint/command-line arguments of a container.

2. As environment variables for a container.

3. As files in a read-only volume, for the application to read.

4. As code running inside the Pod that uses the Kubernetes API to read the ConfigMap.
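
As a sketch of the second and third approaches, the Pod below consumes a hypothetical ConfigMap named app-config both as environment variables and as a read-only volume:

apiVersion: v1
kind: Pod
metadata:
  name: configmap-demo            # hypothetical name
spec:
  containers:
  - name: app
    image: nginx:1.25             # illustrative image
    envFrom:
    - configMapRef:
        name: app-config          # every key becomes an environment variable
    volumeMounts:
    - name: config-volume
      mountPath: /etc/app         # keys appear as files under this path
      readOnly: true
  volumes:
  - name: config-volume
    configMap:
      name: app-config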


    hashtag
    Using ConfigMaps

ConfigMaps can be mounted as data volumes. Other parts of the system can also use ConfigMaps without being directly exposed to a Pod; for example, ConfigMaps can hold data that other system components use for configuration. The usual way of using ConfigMaps is to configure the environment for containers running in a Pod in the same namespace, but a ConfigMap can also be used on its own.


    hashtag
    Immutable ConfigMaps

The K8s beta feature Immutable Secrets and ConfigMaps provides an option to mark specific Secrets and ConfigMaps as immutable. For clusters that use ConfigMaps extensively, restricting modifications to their data has the following advantages:

• Protects you from accidental updates that could cause application outages

• Improves the performance of your cluster by significantly reducing the load on the kube-apiserver, because watches for ConfigMaps marked as immutable are closed.

This feature is controlled by the ImmutableEphemeralVolumes feature gate. You can build an immutable ConfigMap by setting the immutable field to true. For example:
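
A minimal sketch; the ConfigMap name and data key are illustrative:

apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config-immutable      # hypothetical name
data:
  LOG_LEVEL: info                 # illustrative key/value
immutable: true                   # once set, the data can no longer be changed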


    hashtag
    Kubernetes Secrets


Secrets 🔐 are used to hold sensitive data such as passwords or keys. They are similar to ConfigMaps, except that Secret data is stored in a base64-encoded form, whereas ConfigMaps hold plain text.

There are two steps involved in working with Secrets: first, create the Secret; second, inject it into the Pod. Putting confidential data in a Secret is safer and more flexible than putting it in a container image or a Pod definition.

    To use a secret, a pod has to reference the secret. There are three ways to use a secret with a Pod:

    • As a file in a volume mounted on one or more of its containers.

    • As a container environment variable.

• By the kubelet, when pulling images for the Pod (imagePullSecrets).

    hashtag
    Built-in Secrets

Service accounts automatically create and attach Secrets that hold API credentials. Kubernetes creates these Secrets automatically and modifies your Pods to use them. This automatic creation and use of API credentials can be overridden or disabled if desired.

    hashtag
    Creating a Secret

    There are several options to create a Secret:

• Create a Secret using the kubectl command

• Create a Secret from a configuration file

• Create a Secret using Kustomize
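
For the first option, a minimal kubectl sketch (the Secret name, keys, and file paths are illustrative):

# Create a generic Secret from literal key/value pairs
kubectl create secret generic db-credentials \
  --from-literal=username=admin \
  --from-literal=password='S3cure!'

# Or create it from files on disk
kubectl create secret generic db-credentials \
  --from-file=./username.txt --from-file=./password.txt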

    hashtag
    Editing a Secret

    Run the following command to edit a Secret:
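
kubectl edit secret db-credentials   # opens the Secret (base64-encoded values) in your editor; the name is illustrative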

    hashtag
    Using Secrets

    Secrets can be attached as data volumes or exhibited as environment variables used by a container in a Pod. Other parts of the system can also use secrets without being directly exposed to the Pod.
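
A sketch of both consumption styles, reusing the hypothetical db-credentials Secret from above:

apiVersion: v1
kind: Pod
metadata:
  name: secret-demo                  # hypothetical name
spec:
  containers:
  - name: app
    image: nginx:1.25                # illustrative image
    env:
    - name: DB_PASSWORD
      valueFrom:
        secretKeyRef:
          name: db-credentials
          key: password
    volumeMounts:
    - name: secret-volume
      mountPath: /etc/secrets        # each key becomes a file under this path
      readOnly: true
  volumes:
  - name: secret-volume
    secret:
      secretName: db-credentials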


    hashtag
    Updating Secrets and ConfigMaps

    Secrets and ConfigMaps

Kubernetes manages our environment variables, which means we don’t have to change code or rebuild containers when changing a variable’s value. The update is a two-step process, because Pods read the values of environment variables only at startup.

    First, update the values:
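
One common pattern, sketched with an illustrative ConfigMap name and key; piping through kubectl apply records the change on the live resource:

kubectl create configmap app-config \
  --from-literal=LOG_LEVEL=debug \
  --dry-run=client -o yaml | kubectl apply -f -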

    Then, restart the pods. This can be done in various ways, such as forcing a new deployment. The quick and easy way is to delete the pods manually and have the deployment automatically spin up new ones.
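
Both approaches, with an illustrative Deployment name and label:

# Force a new rollout so the Deployment recreates its Pods
kubectl rollout restart deployment/my-app

# Or simply delete the Pods and let the Deployment spin up replacements
kubectl delete pods -l app=my-app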

It is best to create your Secrets and ConfigMaps using the method above, so kubectl can record the annotation it uses to track changes to the resource spec. Another way of doing this is to pass the --save-config command-line option when running kubectl create secret|configmap.

    configmap

    AWS SNS

    hashtag
    What is SNS?

    • SNS stands for Simple Notification Service.

    • It is a web service which makes it easy to set up, operate, and send a notification from the cloud.

• It provides developers with a highly scalable, cost-effective, and flexible capability to publish messages from an application and send them to other applications.

• It is a way of sending messages. For example, when you are using Auto Scaling, a scaling event can trigger an SNS notification that emails you that your EC2 capacity is growing.

    • SNS can also send the messages to devices by sending push notifications to Apple, Google, Fire OS, and Windows devices, as well as Android devices in China with Baidu Cloud Push.

    • Besides sending the push notifications to the mobile devices, Amazon SNS sends the notifications through SMS or email to an Amazon Simple Queue Service (SQS), or to an HTTP endpoint.

• SNS notifications can also trigger a Lambda function. When a message is published to an SNS topic that has a Lambda function subscribed to it, the function is invoked with the message payload as an input parameter. The function can then process the information in the message and forward it to other SNS topics or other AWS services.

• Amazon SNS allows you to group multiple recipients using topics, where a topic is a logical access point that sends identical copies of the same message to its subscribed recipients.

• Amazon SNS supports multiple endpoint types. For example, you can group together iOS, Android, and SMS recipients. Once you publish a message to the topic, SNS delivers appropriately formatted copies of your message to each subscriber.

    • To prevent the loss of data, all messages published to SNS are stored redundantly across multiple availability zones.

    hashtag
    SNS Publishers and Subscribers

    Amazon SNS is a web service that manages sending messages to the subscribing endpoint. There are two clients of SNS:

    • Subscribers

    • Publishers

    Publishers

Publishers (also known as producers) produce and send messages to an SNS topic, which acts as a logical access point.

    Subscribers

Subscribers (such as web servers, email addresses, Amazon SQS queues, and AWS Lambda functions) receive the message or notification from SNS over one of the supported protocols (Amazon SQS, email, Lambda, HTTP, SMS).

Note: A publisher sends messages to an SNS topic that it has created. There is no need to specify a destination address when publishing, because the topic itself matches the message to the subscribers associated with it and delivers the message to them.

    hashtag
    How to use SNS

    • Move to the SNS service available under the application services.

    • Click on the Topics appearing on the left side of the Console.

    • Click on the Create Topic to create a new topic.

    • Enter the topic name in a text box.

    • The below screen shows that the topic has been created successfully.

    • To create a subscription, click on the Create subscription.

    • Now, choose the endpoint type and enter the Endpoint address, i.e., where you want to send your notification.

    • The below screen shows that the status of subscription is pending.

    • The below screen shows that mail has been sent to the subscriber. A Subscriber has to click on the Confirm Subscription.

    • Click on the topic name, i.e., hello and then click on the Publish message.

    • Enter the subject, Time to Live and Message body to send to the endpoint.

• The message is then delivered to all subscribers of the topic.
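
The same flow can be scripted with the AWS CLI; a minimal sketch with an illustrative topic name, e-mail address, and message (the subscriber still has to confirm the e-mail subscription):

# Create a topic and capture its ARN
TOPIC_ARN=$(aws sns create-topic --name hello --output text --query 'TopicArn')

# Subscribe an e-mail endpoint to the topic
aws sns subscribe --topic-arn "$TOPIC_ARN" --protocol email --notification-endpoint user@example.com

# Publish a message to every confirmed subscriber
aws sns publish --topic-arn "$TOPIC_ARN" --subject "Test" --message "Hello from SNS"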

    hashtag
    Benefits of SNS

• Instantaneous delivery: SNS is push-based, which is the key difference between SNS and SQS. As soon as you publish a message to a topic, it is delivered to all of the topic's subscribers.

• Flexible: SNS supports multiple endpoint types, which can receive the message over multiple transport protocols such as email, SMS, Lambda, Amazon SQS, and HTTP.

• Inexpensive: SNS follows a pay-as-you-go model, so you pay only for the resources you use, with no up-front costs.

    hashtag
    Differences b/w SNS and SQS

    • SNS stands for Simple Notification Service while SQS stands for Simple Queue Service.

    • SQS is a pull-based delivery, i.e., messages are not pushed to the receivers. Users have to pull the messages from the Queue. SNS is a push-based delivery, i.e., messages are pushed to multiple subscribers.

    • In SNS service, messages are pushed to the multiple receivers at the same time while in SQS service, messages are not received by the multiple receivers at the same time.

    Kubernetes Session 2

    hashtag
    What is a Kubernetes Namespace?

    A namespace helps separate a cluster into logical units. It helps granularly organize, allocate, manage, and secure cluster resources. Here are two notable use cases for Kubernetes namespaces:

    • Apply policies to cluster segments
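
Creating a namespace and scoping resources to it is straightforward; a minimal sketch with an illustrative namespace name and manifest file:

# Create a namespace and deploy into it
kubectl create namespace team-a
kubectl -n team-a apply -f deployment.yaml   # deployment.yaml is a hypothetical manifest

# List only the resources that belong to that namespace
kubectl -n team-a get all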

    Application Deployment pricing

    circle-info

Monthly cost: OVH is billed at a fixed monthly rate, while AWS is pay-as-you-go.

    ID
    Cloud Provider

Kubernetes Labs

    hashtag
Lab 1 : ReplicaSet lab

    hashtag
Lab 2 : Create a StatefulSet for a MySQL DB with PVC and Secret

    Waiting for rollout to finish: 2 out of 3 new replicas have been updated...
    deployment "nginx-deployment" successfully rolled out
    NAME               READY   UP-TO-DATE   AVAILABLE   AGE
    nginx-deployment   3/3     3            3           18s
    NAME                          DESIRED   CURRENT   READY   AGE
    nginx-deployment-75675f5897   3         3         3       18s
    NAME                                READY     STATUS    RESTARTS   AGE       LABELS
    nginx-deployment-75675f5897-7ci7o   1/1       Running   0          18s       app=nginx,pod-template-hash=75675f5897
    nginx-deployment-75675f5897-kzszj   1/1       Running   0          18s       app=nginx,pod-template-hash=75675f5897
    nginx-deployment-75675f5897-qqcnn   1/1       Running   0          18s       app=nginx,pod-template-hash=75675f5897
    kubectl rollout status deployment/nginx-deployment
    Waiting for rollout to finish: 2 out of 3 new replicas have been updated...
    deployment "nginx-deployment" successfully rolled out
    kubectl get pods
    NAME                                READY     STATUS    RESTARTS   AGE
    nginx-deployment-1564180365-khku8   1/1       Running   0          14s
    nginx-deployment-1564180365-nacti   1/1       Running   0          14s
    nginx-deployment-1564180365-z9gth   1/1       Running   0          14s
    kubectl describe deployments
    Name:                   nginx-deployment
    Namespace:              default
    CreationTimestamp:      Thu, 30 Nov 2017 10:56:25 +0000
    Labels:                 app=nginx
    Annotations:            deployment.kubernetes.io/revision=2
    Selector:               app=nginx
    Replicas:               3 desired | 3 updated | 3 total | 3 available | 0 unavailable
    StrategyType:           RollingUpdate
    MinReadySeconds:        0
    RollingUpdateStrategy:  25% max unavailable, 25% max surge
    Pod Template:
      Labels:  app=nginx
       Containers:
        nginx:
          Image:        nginx:1.16.1
          Port:         80/TCP
          Environment:  <none>
          Mounts:       <none>
        Volumes:        <none>
      Conditions:
        Type           Status  Reason
        ----           ------  ------
        Available      True    MinimumReplicasAvailable
        Progressing    True    NewReplicaSetAvailable
      OldReplicaSets:  <none>
      NewReplicaSet:   nginx-deployment-1564180365 (3/3 replicas created)
      Events:
        Type    Reason             Age   From                   Message
        ----    ------             ----  ----                   -------
        Normal  ScalingReplicaSet  2m    deployment-controller  Scaled up replica set nginx-deployment-2035384211 to 3
        Normal  ScalingReplicaSet  24s   deployment-controller  Scaled up replica set nginx-deployment-1564180365 to 1
        Normal  ScalingReplicaSet  22s   deployment-controller  Scaled down replica set nginx-deployment-2035384211 to 2
        Normal  ScalingReplicaSet  22s   deployment-controller  Scaled up replica set nginx-deployment-1564180365 to 2
        Normal  ScalingReplicaSet  19s   deployment-controller  Scaled down replica set nginx-deployment-2035384211 to 1
        Normal  ScalingReplicaSet  19s   deployment-controller  Scaled up replica set nginx-deployment-1564180365 to 3
        Normal  ScalingReplicaSet  14s   deployment-controller  Scaled down replica set nginx-deployment-2035384211 to 0
    kubectl get rs
    NAME                          DESIRED   CURRENT   READY   AGE
    nginx-deployment-1564180365   3         3         3       25s
    nginx-deployment-2035384211   0         0         0       36s
    nginx-deployment-3066724191   1         1         0       6s
    kubectl get pods
    NAME                                READY     STATUS             RESTARTS   AGE
    nginx-deployment-1564180365-70iae   1/1       Running            0          25s
    nginx-deployment-1564180365-jbqqo   1/1       Running            0          25s
    nginx-deployment-1564180365-hysrc   1/1       Running            0          25s
    nginx-deployment-3066724191-08mng   0/1       ImagePullBackOff   0          6s
    kubectl describe deployment
    Name:           nginx-deployment
    Namespace:      default
    CreationTimestamp:  Tue, 15 Mar 2016 14:48:04 -0700
    Labels:         app=nginx
    Selector:       app=nginx
    Replicas:       3 desired | 1 updated | 4 total | 3 available | 1 unavailable
    StrategyType:       RollingUpdate
    MinReadySeconds:    0
    RollingUpdateStrategy:  25% max unavailable, 25% max surge
    Pod Template:
      Labels:  app=nginx
      Containers:
       nginx:
        Image:        nginx:1.161
        Port:         80/TCP
        Host Port:    0/TCP
        Environment:  <none>
        Mounts:       <none>
      Volumes:        <none>
    Conditions:
      Type           Status  Reason
      ----           ------  ------
      Available      True    MinimumReplicasAvailable
      Progressing    True    ReplicaSetUpdated
    OldReplicaSets:     nginx-deployment-1564180365 (3/3 replicas created)
    NewReplicaSet:      nginx-deployment-3066724191 (1/1 replicas created)
    Events:
      FirstSeen LastSeen    Count   From                    SubObjectPath   Type        Reason              Message
      --------- --------    -----   ----                    -------------   --------    ------              -------
      1m        1m          1       {deployment-controller }                Normal      ScalingReplicaSet   Scaled up replica set nginx-deployment-2035384211 to 3
      22s       22s         1       {deployment-controller }                Normal      ScalingReplicaSet   Scaled up replica set nginx-deployment-1564180365 to 1
      22s       22s         1       {deployment-controller }                Normal      ScalingReplicaSet   Scaled down replica set nginx-deployment-2035384211 to 2
      22s       22s         1       {deployment-controller }                Normal      ScalingReplicaSet   Scaled up replica set nginx-deployment-1564180365 to 2
      21s       21s         1       {deployment-controller }                Normal      ScalingReplicaSet   Scaled down replica set nginx-deployment-2035384211 to 1
      21s       21s         1       {deployment-controller }                Normal      ScalingReplicaSet   Scaled up replica set nginx-deployment-1564180365 to 3
      13s       13s         1       {deployment-controller }                Normal      ScalingReplicaSet   Scaled down replica set nginx-deployment-2035384211 to 0
      13s       13s         1       {deployment-controller }                Normal      ScalingReplicaSet   Scaled up replica set nginx-deployment-3066724191 to 1
    kubectl rollout history deployment/nginx-deployment --revision=2
    deployments "nginx-deployment" revision 2
      Labels:       app=nginx
              pod-template-hash=1159050644
      Annotations:  kubernetes.io/change-cause=kubectl set image deployment/nginx-deployment nginx=nginx:1.16.1
      Containers:
       nginx:
        Image:      nginx:1.16.1
        Port:       80/TCP
         QoS Tier:
            cpu:      BestEffort
            memory:   BestEffort
        Environment Variables:      <none>
      No volumes.
    kubectl get deployment nginx-deployment
    NAME               READY   UP-TO-DATE   AVAILABLE   AGE
    nginx-deployment   3/3     3            3           30m
    kubectl describe deployment nginx-deployment
    Name:                   nginx-deployment
    Namespace:              default
    CreationTimestamp:      Sun, 02 Sep 2018 18:17:55 -0500
    Labels:                 app=nginx
    Annotations:            deployment.kubernetes.io/revision=4
                            kubernetes.io/change-cause=kubectl set image deployment/nginx-deployment nginx=nginx:1.16.1
    Selector:               app=nginx
    Replicas:               3 desired | 3 updated | 3 total | 3 available | 0 unavailable
    StrategyType:           RollingUpdate
    MinReadySeconds:        0
    RollingUpdateStrategy:  25% max unavailable, 25% max surge
    Pod Template:
      Labels:  app=nginx
      Containers:
       nginx:
        Image:        nginx:1.16.1
        Port:         80/TCP
        Host Port:    0/TCP
        Environment:  <none>
        Mounts:       <none>
      Volumes:        <none>
    Conditions:
      Type           Status  Reason
      ----           ------  ------
      Available      True    MinimumReplicasAvailable
      Progressing    True    NewReplicaSetAvailable
    OldReplicaSets:  <none>
    NewReplicaSet:   nginx-deployment-c4747d96c (3/3 replicas created)
    Events:
      Type    Reason              Age   From                   Message
      ----    ------              ----  ----                   -------
      Normal  ScalingReplicaSet   12m   deployment-controller  Scaled up replica set nginx-deployment-75675f5897 to 3
      Normal  ScalingReplicaSet   11m   deployment-controller  Scaled up replica set nginx-deployment-c4747d96c to 1
      Normal  ScalingReplicaSet   11m   deployment-controller  Scaled down replica set nginx-deployment-75675f5897 to 2
      Normal  ScalingReplicaSet   11m   deployment-controller  Scaled up replica set nginx-deployment-c4747d96c to 2
      Normal  ScalingReplicaSet   11m   deployment-controller  Scaled down replica set nginx-deployment-75675f5897 to 1
      Normal  ScalingReplicaSet   11m   deployment-controller  Scaled up replica set nginx-deployment-c4747d96c to 3
      Normal  ScalingReplicaSet   11m   deployment-controller  Scaled down replica set nginx-deployment-75675f5897 to 0
      Normal  ScalingReplicaSet   11m   deployment-controller  Scaled up replica set nginx-deployment-595696685f to 1
      Normal  DeploymentRollback  15s   deployment-controller  Rolled back deployment "nginx-deployment" to revision 2
      Normal  ScalingReplicaSet   15s   deployment-controller  Scaled down replica set nginx-deployment-595696685f to 0
    kubectl set image deployment/nginx-deployment nginx=nginx:1.16.1
    deployment.apps/nginx-deployment image updated
    kubectl rollout history deployment/nginx-deployment
    deployments "nginx"
    REVISION  CHANGE-CAUSE
    1   <none>
    kubectl get rs
    NAME               DESIRED   CURRENT   READY     AGE
    nginx-2142116321   3         3         3         2m
    kubectl set resources deployment/nginx-deployment -c=nginx --limits=cpu=200m,memory=512Mi
    deployment.apps/nginx-deployment resource requirements updated
    kubectl rollout resume deployment/nginx-deployment
    deployment.apps/nginx-deployment resumed
    kubectl get rs -w
    NAME               DESIRED   CURRENT   READY     AGE
    nginx-2142116321   2         2         2         2m
    nginx-3926361531   2         2         0         6s
    nginx-3926361531   2         2         1         18s
    nginx-2142116321   1         2         2         2m
    nginx-2142116321   1         2         2         2m
    nginx-3926361531   3         2         1         18s
    nginx-3926361531   3         2         1         18s
    nginx-2142116321   1         1         1         2m
    nginx-3926361531   3         3         1         18s
    nginx-3926361531   3         3         2         19s
    nginx-2142116321   0         1         1         2m
    nginx-2142116321   0         1         1         2m
    nginx-2142116321   0         0         0         2m
    nginx-3926361531   3         3         3         20s
    kubectl get rs
    NAME               DESIRED   CURRENT   READY     AGE
    nginx-2142116321   0         0         0         2m
    nginx-3926361531   3         3         3         28s
    apiVersion: apps/v1
    kind: ReplicaSet
    metadata:
      name: frontend
      labels:
        app: guestbook
        tier: frontend
    spec:
      # modify replicas according to your case
      replicas: 3
      selector:
        matchLabels:
          tier: frontend
      template:
        metadata:
          labels:
            tier: frontend
        spec:
          containers:
          - name: php-redis
            image: gcr.io/google_samples/gb-frontend:v3
    kubectl apply -f https://kubernetes.io/examples/controllers/frontend.yaml
    kubectl get rs
    NAME       DESIRED   CURRENT   READY   AGE
    frontend   3         3         3       6s
    kubectl describe rs/frontend
    Name:         frontend
    Namespace:    default
    Selector:     tier=frontend
    Labels:       app=guestbook
                  tier=frontend
    Annotations:  kubectl.kubernetes.io/last-applied-configuration:
                    {"apiVersion":"apps/v1","kind":"ReplicaSet","metadata":{"annotations":{},"labels":{"app":"guestbook","tier":"frontend"},"name":"frontend",...
    Replicas:     3 current / 3 desired
    Pods Status:  3 Running / 0 Waiting / 0 Succeeded / 0 Failed
    Pod Template:
      Labels:  tier=frontend
      Containers:
       php-redis:
        Image:        gcr.io/google_samples/gb-frontend:v3
        Port:         <none>
        Host Port:    <none>
        Environment:  <none>
        Mounts:       <none>
      Volumes:        <none>
    Events:
      Type    Reason            Age   From                   Message
      ----    ------            ----  ----                   -------
      Normal  SuccessfulCreate  117s  replicaset-controller  Created pod: frontend-wtsmm
      Normal  SuccessfulCreate  116s  replicaset-controller  Created pod: frontend-b2zdv
      Normal  SuccessfulCreate  116s  replicaset-controller  Created pod: frontend-vcmts
    kubectl get pods
    NAME             READY   STATUS    RESTARTS   AGE
    frontend-b2zdv   1/1     Running   0          6m36s
    frontend-vcmts   1/1     Running   0          6m36s
    frontend-wtsmm   1/1     Running   0          6m36s
    kubectl get pods frontend-b2zdv -o yaml
    apiVersion: v1
    kind: Pod
    metadata:
      creationTimestamp: "2020-02-12T07:06:16Z"
      generateName: frontend-
      labels:
        tier: frontend
      name: frontend-b2zdv
      namespace: default
      ownerReferences:
      - apiVersion: apps/v1
        blockOwnerDeletion: true
        controller: true
        kind: ReplicaSet
        name: frontend
        uid: f391f6db-bb9b-4c09-ae74-6a1f77f3d5cf
    ...
    apiVersion: v1
    kind: Pod
    metadata:
      name: pod1
      labels:
        tier: frontend
    spec:
      containers:
      - name: hello1
        image: gcr.io/google-samples/hello-app:2.0
    
    ---
    
    apiVersion: v1
    kind: Pod
    metadata:
      name: pod2
      labels:
        tier: frontend
    spec:
      containers:
      - name: hello2
        image: gcr.io/google-samples/hello-app:1.0
    kubectl apply -f https://kubernetes.io/examples/pods/pod-rs.yaml
    kubectl get pods
    NAME             READY   STATUS        RESTARTS   AGE
    frontend-b2zdv   1/1     Running       0          10m
    frontend-vcmts   1/1     Running       0          10m
    frontend-wtsmm   1/1     Running       0          10m
    pod1             0/1     Terminating   0          1s
    pod2             0/1     Terminating   0          1s
    kubectl apply -f https://kubernetes.io/examples/pods/pod-rs.yaml
    kubectl apply -f https://kubernetes.io/examples/controllers/frontend.yaml
    kubectl get pods
    NAME             READY   STATUS    RESTARTS   AGE
    frontend-hmmj2   1/1     Running   0          9s
    pod1             1/1     Running   0          36s
    pod2             1/1     Running   0          36s
    matchLabels:
      tier: frontend
    kubectl proxy --port=8080
    curl -X DELETE  'localhost:8080/apis/apps/v1/namespaces/default/replicasets/frontend' \
      -d '{"kind":"DeleteOptions","apiVersion":"v1","propagationPolicy":"Foreground"}' \
      -H "Content-Type: application/json"
    kubectl proxy --port=8080
    curl -X DELETE  'localhost:8080/apis/apps/v1/namespaces/default/replicasets/frontend' \
      -d '{"kind":"DeleteOptions","apiVersion":"v1","propagationPolicy":"Orphan"}' \
      -H "Content-Type: application/json"
    apiVersion: autoscaling/v1
    kind: HorizontalPodAutoscaler
    metadata:
      name: frontend-scaler
    spec:
      scaleTargetRef:
        kind: ReplicaSet
        name: frontend
      minReplicas: 3
      maxReplicas: 10
      targetCPUUtilizationPercentage: 50
    kubectl apply -f https://k8s.io/examples/controllers/hpa-rs.yaml
    kubectl autoscale rs frontend --max=10 --min=3 --cpu-percent=50
    apiVersion: v1
    kind: ReplicationController
    metadata:
      name: nginx
    spec:
      replicas: 3
      selector:
        app: nginx
      template:
        metadata:
          name: nginx
          labels:
            app: nginx
        spec:
          containers:
          - name: nginx
            image: nginx
            ports:
            - containerPort: 80
    kubectl apply -f https://k8s.io/examples/controllers/replication.yaml
    replicationcontroller/nginx created
    kubectl describe replicationcontrollers/nginx
    Name:        nginx
    Namespace:   default
    Selector:    app=nginx
    Labels:      app=nginx
    Annotations:    <none>
    Replicas:    3 current / 3 desired
    Pods Status: 0 Running / 3 Waiting / 0 Succeeded / 0 Failed
    Pod Template:
      Labels:       app=nginx
      Containers:
       nginx:
        Image:              nginx
        Port:               80/TCP
        Environment:        <none>
        Mounts:             <none>
      Volumes:              <none>
    Events:
      FirstSeen       LastSeen     Count    From                        SubobjectPath    Type      Reason              Message
      ---------       --------     -----    ----                        -------------    ----      ------              -------
      20s             20s          1        {replication-controller }                    Normal    SuccessfulCreate    Created pod: nginx-qrm3m
      20s             20s          1        {replication-controller }                    Normal    SuccessfulCreate    Created pod: nginx-3ntk0
      20s             20s          1        {replication-controller }                    Normal    SuccessfulCreate    Created pod: nginx-4ok8v
    Pods Status:    3 Running / 0 Waiting / 0 Succeeded / 0 Failed
    pods=$(kubectl get pods --selector=app=nginx --output=jsonpath={.items..metadata.name})
    echo $pods
    nginx-3ntk0 nginx-4ok8v nginx-qrm3m
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: nginx-deployment
      labels:
        app: nginx
    spec:
      replicas: 3
      selector:
        matchLabels:
          app: nginx
      template:
        metadata:
          labels:
            app: nginx
        spec:
          containers:
          - name: nginx
            image: nginx:1.14.2
            ports:
            - containerPort: 80
    kubectl apply -f https://k8s.io/examples/controllers/nginx-deployment.yaml
    NAME               READY   UP-TO-DATE   AVAILABLE   AGE
    nginx-deployment   0/3     0            0           1s
    kubectl set image deployment.v1.apps/nginx-deployment nginx=nginx:1.16.1
    kubectl set image deployment/nginx-deployment nginx=nginx:1.16.1
    NAME               READY   UP-TO-DATE   AVAILABLE   AGE
    nginx-deployment   3/3     3            3           36s
    kubectl get rs
    NAME                          DESIRED   CURRENT   READY   AGE
    nginx-deployment-1564180365   3         3         3       6s
    nginx-deployment-2035384211   0         0         0       36s
    kubectl set image deployment/nginx-deployment nginx=nginx:1.161
    deployment.apps/nginx-deployment image updated
    kubectl rollout status deployment/nginx-deployment
    kubectl rollout history deployment/nginx-deployment
    deployments "nginx-deployment"
    REVISION    CHANGE-CAUSE
    1           kubectl apply --filename=https://k8s.io/examples/controllers/nginx-deployment.yaml
    2           kubectl set image deployment/nginx-deployment nginx=nginx:1.16.1
    3           kubectl set image deployment/nginx-deployment nginx=nginx:1.161
    kubectl rollout undo deployment/nginx-deployment
    deployment.apps/nginx-deployment rolled back
    kubectl rollout undo deployment/nginx-deployment --to-revision=2
    deployment.apps/nginx-deployment rolled back
    kubectl scale deployment/nginx-deployment --replicas=10
    deployment.apps/nginx-deployment scaled
    kubectl autoscale deployment/nginx-deployment --min=10 --max=15 --cpu-percent=80
    deployment.apps/nginx-deployment scaled
    kubectl get deploy
    NAME                 DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
    nginx-deployment     10        10        10           10          50s
    kubectl set image deployment/nginx-deployment nginx=nginx:sometag
    deployment.apps/nginx-deployment image updated
    kubectl get deploy
    NAME                 DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
    nginx-deployment     15        18        7            8           7m
    kubectl get rs
    NAME                          DESIRED   CURRENT   READY     AGE
    nginx-deployment-1989198191   7         7         0         7m
    nginx-deployment-618515232    11        11        11        7m
    kubectl get deploy
    NAME      DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
    nginx     3         3         3            3           1m
    kubectl get rs
    NAME               DESIRED   CURRENT   READY     AGE
    nginx-2142116321   3         3         3         1m
    kubectl rollout status deployment/nginx-deployment
    Waiting for rollout to finish: 2 of 3 updated replicas are available...
    deployment "nginx-deployment" successfully rolled out
    echo $?
    0
    kubectl patch deployment/nginx-deployment -p '{"spec":{"progressDeadlineSeconds":600}}'
    deployment.apps/nginx-deployment patched
    kubectl describe deployment nginx-deployment
    <...>
    Conditions:
      Type            Status  Reason
      ----            ------  ------
      Available       True    MinimumReplicasAvailable
      Progressing     True    ReplicaSetUpdated
      ReplicaFailure  True    FailedCreate
    <...>
    status:
      availableReplicas: 2
      conditions:
      - lastTransitionTime: 2016-10-04T12:25:39Z
        lastUpdateTime: 2016-10-04T12:25:39Z
        message: Replica set "nginx-deployment-4262182780" is progressing.
        reason: ReplicaSetUpdated
        status: "True"
        type: Progressing
      - lastTransitionTime: 2016-10-04T12:25:42Z
        lastUpdateTime: 2016-10-04T12:25:42Z
        message: Deployment has minimum availability.
        reason: MinimumReplicasAvailable
        status: "True"
        type: Available
      - lastTransitionTime: 2016-10-04T12:25:39Z
        lastUpdateTime: 2016-10-04T12:25:39Z
        message: 'Error creating: pods "nginx-deployment-4262182780-" is forbidden: exceeded quota:
          object-counts, requested: pods=1, used: pods=3, limited: pods=2'
        reason: FailedCreate
        status: "True"
        type: ReplicaFailure
      observedGeneration: 3
      replicas: 2
      unavailableReplicas: 2
    Conditions:
      Type            Status  Reason
      ----            ------  ------
      Available       True    MinimumReplicasAvailable
      Progressing     False   ProgressDeadlineExceeded
      ReplicaFailure  True    FailedCreate
    Conditions:
      Type          Status  Reason
      ----          ------  ------
      Available     True    MinimumReplicasAvailable
      Progressing   True    NewReplicaSetAvailable
    kubectl rollout status deployment/nginx-deployment
    Waiting for rollout to finish: 2 out of 3 new replicas have been updated...
    error: deployment "nginx" exceeded its progress deadline
    echo $?
    1
    apiVersion: v1
    kind: Service
    metadata:
      name: nginx
      labels:
        app: nginx
    spec:
      ports:
      - port: 80
        name: web
      clusterIP: None
      selector:
        app: nginx
    ---
    apiVersion: apps/v1
    kind: StatefulSet
    metadata:
      name: web
    spec:
      selector:
        matchLabels:
          app: nginx # has to match .spec.template.metadata.labels
      serviceName: "nginx"
      replicas: 3 # by default is 1
      minReadySeconds: 10 # by default is 0
      template:
        metadata:
          labels:
            app: nginx # has to match .spec.selector.matchLabels
        spec:
          terminationGracePeriodSeconds: 10
          containers:
          - name: nginx
            image: registry.k8s.io/nginx-slim:0.8
            ports:
            - containerPort: 80
              name: web
            volumeMounts:
            - name: www
              mountPath: /usr/share/nginx/html
      volumeClaimTemplates:
      - metadata:
          name: www
        spec:
          accessModes: [ "ReadWriteOnce" ]
          storageClassName: "my-storage-class"
          resources:
            requests:
              storage: 1Gi
    apiVersion: apps/v1
    kind: StatefulSet
    ...
    spec:
      persistentVolumeClaimRetentionPolicy:
        whenDeleted: Retain
        whenScaled: Delete
    ...
    apiVersion: apps/v1
    kind: DaemonSet
    metadata:
      name: fluentd-elasticsearch
      namespace: kube-system
      labels:
        k8s-app: fluentd-logging
    spec:
      selector:
        matchLabels:
          name: fluentd-elasticsearch
      template:
        metadata:
          labels:
            name: fluentd-elasticsearch
        spec:
          tolerations:
          # these tolerations are to have the daemonset runnable on control plane nodes
          # remove them if your control plane nodes should not run pods
          - key: node-role.kubernetes.io/control-plane
            operator: Exists
            effect: NoSchedule
          - key: node-role.kubernetes.io/master
            operator: Exists
            effect: NoSchedule
          containers:
          - name: fluentd-elasticsearch
            image: quay.io/fluentd_elasticsearch/fluentd:v2.5.2
            resources:
              limits:
                memory: 200Mi
              requests:
                cpu: 100m
                memory: 200Mi
            volumeMounts:
            - name: varlog
              mountPath: /var/log
          terminationGracePeriodSeconds: 30
          volumes:
          - name: varlog
            hostPath:
              path: /var/log
    kubectl apply -f https://k8s.io/examples/controllers/daemonset.yaml
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchFields:
          - key: metadata.name
            operator: In
            values:
            - target-host-name
    apiVersion: batch/v1
    kind: Job
    metadata:
      name: pi
    spec:
      template:
        spec:
          containers:
          - name: pi
            image: perl:5.34.0
            command: ["perl",  "-Mbignum=bpi", "-wle", "print bpi(2000)"]
          restartPolicy: Never
      backoffLimit: 4
    
    kubectl apply -f https://kubernetes.io/examples/controllers/job.yaml
    job.batch/pi created
    
    Name:           pi
    Namespace:      default
    Selector:       batch.kubernetes.io/controller-uid=c9948307-e56d-4b5d-8302-ae2d7b7da67c
    Labels:         batch.kubernetes.io/controller-uid=c9948307-e56d-4b5d-8302-ae2d7b7da67c
                    batch.kubernetes.io/job-name=pi
                    ...
    Annotations:    batch.kubernetes.io/job-tracking: ""
    Parallelism:    1
    Completions:    1
    Start Time:     Mon, 02 Dec 2019 15:20:11 +0200
    Completed At:   Mon, 02 Dec 2019 15:21:16 +0200
    Duration:       65s
    Pods Statuses:  0 Running / 1 Succeeded / 0 Failed
    Pod Template:
      Labels:  batch.kubernetes.io/controller-uid=c9948307-e56d-4b5d-8302-ae2d7b7da67c
               batch.kubernetes.io/job-name=pi
      Containers:
       pi:
        Image:      perl:5.34.0
        Port:       <none>
        Host Port:  <none>
        Command:
          perl
          -Mbignum=bpi
          -wle
          print bpi(2000)
        Environment:  <none>
        Mounts:       <none>
      Volumes:        <none>
    Events:
      Type    Reason            Age   From            Message
      ----    ------            ----  ----            -------
      Normal  SuccessfulCreate  21s   job-controller  Created pod: pi-xf9p4
      Normal  Completed         18s   job-controller  Job completed
    pods=$(kubectl get pods --selector=batch.kubernetes.io/job-name=pi --output=jsonpath='{.items[*].metadata.name}')
    echo $pods
    pi-5rwd7
    kubectl logs $pods
    kubectl logs jobs/pi
    3.1415926535897932384626433832795028841971693993751058209749445923078164062862089986280348253421170679821480865132823066470938446095505822317253594081284811174502841027019385211055596446229489549303819644288109756659334461284756482337867831652712019091456485669234603486104543266482133936072602491412737245870066063155881748815209209628292540917153643678925903600113305305488204665213841469519415116094330572703657595919530921861173819326117931051185480744623799627495673518857527248912279381830119491298336733624406566430860213949463952247371907021798609437027705392171762931767523846748184676694051320005681271452635608277857713427577896091736371787214684409012249534301465495853710507922796892589235420199561121290219608640344181598136297747713099605187072113499999983729780499510597317328160963185950244594553469083026425223082533446850352619311881710100031378387528865875332083814206171776691473035982534904287554687311595628638823537875937519577818577805321712268066130019278766111959092164201989380952572010654858632788659361533818279682303019520353018529689957736225994138912497217752834791315155748572424541506959508295331168617278558890750983817546374649393192550604009277016711390098488240128583616035637076601047101819429555961989467678374494482553797747268471040475346462080466842590694912933136770289891521047521620569660240580381501935112533824300355876402474964732639141992726042699227967823547816360093417216412199245863150302861829745557067498385054945885869269956909272107975093029553211653449872027559602364806654991198818347977535663698074265425278625518184175746728909777727938000816470600161452491921732172147723501414419735685481613611573525521334757418494684385233239073941433345477624168625189835694855620992192221842725502542568876717904946016534668049886272327917860857843838279679766814541009538837863609506800642251252051173929848960841284886269456042419652850222106611863067442786220391949450471237137869609563643719172874677646575739624138908658326459958133904780275901
    apiVersion: batch/v1
    kind: Job
    metadata:
      name: job-pod-failure-policy-example
    spec:
      completions: 12
      parallelism: 3
      template:
        spec:
          restartPolicy: Never
          containers:
          - name: main
            image: docker.io/library/bash:5
            command: ["bash"]        # example command simulating a bug which triggers the FailJob action
            args:
            - -c
            - echo "Hello world!" && sleep 5 && exit 42
      backoffLimit: 6
      podFailurePolicy:
        rules:
        - action: FailJob
          onExitCodes:
            containerName: main      # optional
            operator: In             # one of: In, NotIn
            values: [42]
        - action: Ignore             # one of: Ignore, FailJob, Count
          onPodConditions:
          - type: DisruptionTarget   # indicates Pod disruption
    apiVersion: batch/v1
    kind: Job
    metadata:
      name: pi-with-timeout
    spec:
      backoffLimit: 5
      activeDeadlineSeconds: 100
      template:
        spec:
          containers:
          - name: pi
            image: perl:5.34.0
            command: ["perl",  "-Mbignum=bpi", "-wle", "print bpi(2000)"]
          restartPolicy: Never
    apiVersion: batch/v1
    kind: Job
    metadata:
      name: pi-with-ttl
    spec:
      ttlSecondsAfterFinished: 100
      template:
        spec:
          containers:
          - name: pi
            image: perl:5.34.0
            command: ["perl",  "-Mbignum=bpi", "-wle", "print bpi(2000)"]
          restartPolicy: Never
    kubectl get job myjob -o yaml
    apiVersion: batch/v1
    kind: Job
    metadata:
      name: myjob
    spec:
      suspend: true
      parallelism: 1
      completions: 5
      template:
        spec:
          ...
    kubectl patch job/myjob --type=strategic --patch '{"spec":{"suspend":true}}'
    kubectl patch job/myjob --type=strategic --patch '{"spec":{"suspend":false}}'
    kubectl get jobs/myjob -o yaml
    apiVersion: batch/v1
    kind: Job
    # .metadata and .spec omitted
    status:
      conditions:
      - lastProbeTime: "2021-02-05T13:14:33Z"
        lastTransitionTime: "2021-02-05T13:14:33Z"
        status: "True"
        type: Suspended
      startTime: "2021-02-05T13:13:48Z"
    kubectl describe jobs/myjob
    Name:           myjob
    ...
    Events:
      Type    Reason            Age   From            Message
      ----    ------            ----  ----            -------
      Normal  SuccessfulCreate  12m   job-controller  Created pod: myjob-hlrpl
      Normal  SuccessfulDelete  11m   job-controller  Deleted pod: myjob-hlrpl
      Normal  Suspended         11m   job-controller  Job suspended
      Normal  SuccessfulCreate  3s    job-controller  Created pod: myjob-jvb44
      Normal  Resumed           3s    job-controller  Job resumed
    kubectl get job old -o yaml
    kind: Job
    metadata:
      name: old
      ...
    spec:
      selector:
        matchLabels:
          batch.kubernetes.io/controller-uid: a8f3d00d-c6d2-11e5-9f87-42010af00002
      ...
    kind: Job
    metadata:
      name: new
      ...
    spec:
      manualSelector: true
      selector:
        matchLabels:
          batch.kubernetes.io/controller-uid: a8f3d00d-c6d2-11e5-9f87-42010af00002
      ...
    deployment.apps/nginx-deployment image updated
    kubectl edit deployment/nginx-deployment
    deployment.apps/nginx-deployment edited
    Waiting for rollout to finish: 1 out of 3 new replicas have been updated...
    kubectl get rs
    NAME                          DESIRED   CURRENT   READY     AGE
    nginx-deployment-1989198191   5         5         0         9s
    nginx-deployment-618515232    8         8         8         1m
    kubectl rollout pause deployment/nginx-deployment
    deployment.apps/nginx-deployment paused
    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: ingress
    spec:
      rules:
      - host: "nginx-ingress.kubernetes.saasx.io"
        http:
          paths:
          - pathType: Prefix
            path: "/"
            backend:
              service:
                name: my-nginx-svc
                port:
                  number: 80
    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: ingress
    spec:
      rules:
      - host: "wp-ingress.kubernetes.saasx.io"
        http:
          paths:
          - pathType: Prefix
            path: "/"
            backend:
              service:
                name: wordpress
                port:
                  number: 80
    kubectl create configmap <map-name> <data-source>
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: game-demo
    data:
      # property-like keys; each key maps to a simple value
      player_initial_lives: "3"
      ui_properties_file_name: "user-interface.properties"
    
      # file-like keys
      game.properties: |
        enemy.types=aliens,monsters
        player.maximum-lives=5
      user-interface.properties: |
        color.good=purple
        color.bad=yellow
        allow.textmode=true
    apiVersion: v1
    kind: ConfigMap
    metadata:
      ...
    data:
      ...
    immutable: true
     kubectl edit secrets mysecret
    kubectl create configmap language --from-literal=LANGUAGE=Spanish \
            -o yaml --dry-run | kubectl replace -f -
    kubectl create secret generic apikey --from-literal=API_KEY=098765 \
            -o yaml --dry-run | kubectl replace -f -
    kubectl delete pod -l name=envtest

    Amazon RDS for SQL Server

    SQL Server is a relational database management system developed by Microsoft. Amazon RDS for SQL Server makes it easy to set up, operate, and scale SQL Server deployments in the cloud. With Amazon RDS, you can deploy multiple editions of SQL Server (2012, 2014, 2016, 2017 and 2019) including Express, Web, Standard and Enterprise, in minutes with cost-efficient and re-sizable compute capacity. Amazon RDS frees you up to focus on application development by managing time-consuming database administration tasks including provisioning, backups, software patching, monitoring, and hardware scaling.

    Amazon RDS for SQL Server supports the “License Included” licensing model. You do not need separately purchased Microsoft SQL Server licenses. "License Included" pricing is inclusive of software, underlying hardware resources, and Amazon RDS management capabilities.

    You can take advantage of hourly pricing with no upfront fees or long-term commitments. In addition, you also have the option to purchase Reserved DB Instances under one- or three-year reservation terms. With Reserved DB Instances, you can make a low, one-time, upfront payment for each DB Instance and then pay a significantly discounted hourly usage rate, achieving up to 65% net cost savings.

    Amazon RDS for SQL Server DB Instances can be provisioned with either standard storage or Provisioned IOPS storage. Amazon RDS Provisioned IOPS is a storage option designed to deliver fast, predictable, and consistent I/O performance, and is optimized for I/O-intensive, transactional (OLTP) database workloads.

    hashtag
    Benefits

    hashtag
    Fully managed

    Amazon RDS for SQL Server is fully managed by Amazon Relational Database Service (RDS). You no longer need to worry about database management tasks such as hardware provisioning, software patching, setup, configuration, or backups.

    hashtag
    Single click high availability

    With a single click, you can enable the Multi-AZ option, replicating data synchronously across different availability zones. In case the primary node crashes, your database will automatically fail over to the secondary and we will automatically re-build the secondary.

    hashtag
    Auto-scaled storage

    When you opt in to Storage Auto Scaling, your instances automatically increase their storage size with zero downtime. With RDS Storage Auto Scaling, you simply set your desired maximum storage limit, and Auto Scaling takes care of the rest.

    hashtag
    Automated backups

    Amazon RDS creates and saves automated backups of your SQL Server instance. Amazon RDS creates a storage volume snapshot of your instance, backing up the entire instance and not just individual databases. Amazon RDS for SQL Server creates automated backups of your DB instance during the backup window of your DB instance.

    hashtag
    Upgrade and modernize

    Amazon RDS SQL Server offers Enterprise, Standard, Web, and Express Editions for SQL Server versions 2012, 2014, 2016, 2017, and 2019. Customers on SQL Server 2008 R2 (now deprecated) can keep their database compatibility level at 2008 and migrate to a newer version with minimal changes.

    hashtag
    Ease of migration

    We support a number of ways to migrate to Amazon RDS SQL Server, including Single and Multi-file native restores, Microsoft SQL Server Database Publishing Wizard, Import/Export, Amazon Database Migration Service, and SQL Server Replication.

    Amazon RDS for Oracle

    Amazon RDS for Oracle is a fully managed commercial database that makes it easy to set up, operate, and scale Oracle deployments in the cloud. Amazon RDS frees you up to focus on innovation and application development by managing time-consuming database administration tasks including provisioning, backups, software patching, monitoring, and hardware scaling. You can run Amazon RDS for Oracle under two different licensing models – “License Included” and “Bring-Your-Own-License (BYOL)”. In the "License Included" service model, you do not need separately purchased Oracle licenses; the Oracle Database software has been licensed by Amazon Web Services.

    You can take advantage of hourly pricing with no upfront fees or long-term commitments. In addition, you also have the option to purchase Reserved DB Instances under one- or three-year reservation terms. With Reserved DB Instances, you can make a low, one-time, upfront payment for each DB Instance and then pay a significantly discounted hourly usage rate, achieving up to 48% net cost savings.

    hashtag
    Benefits

    hashtag
    Fully managed

    Amazon RDS for Oracle is fully managed by Amazon Relational Database Service (RDS). You no longer need to worry about database management tasks such as hardware provisioning, software patching, setup, configuration, or backups.

    hashtag
    Availability and reliability

    Amazon RDS for Oracle makes it easy to use replication to enhance availability and reliability for production workloads. Using the Multi-AZ deployment option you can run mission critical workloads with high availability and built-in automated fail-over from your primary database to a synchronously replicated secondary database in case of a failure.

    hashtag
    Ease of migration

    Amazon RDS for Oracle gives you the full benefits of a managed service solution. You can use the lift-and-shift approach to migrate your legacy Oracle database to Amazon RDS for Oracle and, as a result, reduce the need to refactor and change existing application components.

    hashtag
    Fast, predictable performance

    Amazon RDS instances support up to 64 TiB of storage built on high-performance Amazon Elastic Block Store (EBS) SSD volumes. Amazon RDS General Purpose (SSD) storage delivers a consistent baseline of 3 IOPS per provisioned GB and provides the ability to burst up to 12,000 IOPS. With Provisioned IOPS, you can provision up to 256,000 IOPS per database instance.

    hashtag
    License flexibility

    You can run Amazon RDS for Oracle under two different licensing models - "License Included" and "Bring-Your-Own-License (BYOL)". In the "License Included" service model, you do not need separately purchased Oracle licenses.

    hashtag
    Multiple availability zones

    Using the Multi-AZ deployment option you can run mission critical workloads with high availability and built-in automated fail-over from your primary database to a synchronously replicated secondary database in case of a failure. As with all Amazon Web Services, there are no up-front investments required, and you pay only for the resources you use.

    Step 1: Create secret with password
    apiVersion: apps/v1
    kind: ReplicaSet
    metadata:
      namespace: yourname
      name: frontend
      labels:
        app: guestbook
        tier: frontend
    spec:
      # modify replicas according to your case
      replicas: 3
      selector:
        matchLabels:
          tier: frontend
      template:
        metadata:
          labels:
            tier: frontend
        spec:
          containers:
          - name: php-redis
            image: gcr.io/google_samples/gb-frontend:v3
    echo -n 'devops' | base64
    apiVersion: v1
    kind: Secret
    metadata:
      namespace: yourname
      name: mysql-pass
    type: Opaque
    data:
      password: ZGV2b3Bz
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      namespace: yourname
      name: mysql-pv-claim
      labels:
        app: wordpress
    spec:
      accessModes:
       - ReadWriteOnce
      resources:
       requests:
        storage: 40Gi
    apiVersion: apps/v1
    kind: StatefulSet
    metadata:
      namespace: yourname
      name: wordpress-mysql
      labels:
        app: wordpress
    spec:
      serviceName: wordpress-mysql  # required for StatefulSets; matches the Service defined below
      selector:
        matchLabels:
          app: wordpress
          tier: mysql
      template:
        metadata:
          labels:
            app: wordpress
            tier: mysql
        spec:
          containers:
            - image: mysql:5.6
              name: mysql
              env:
                - name: MYSQL_ROOT_PASSWORD
                  valueFrom:
                    secretKeyRef:
                      name: mysql-pass
                      key: password
              ports:
                - containerPort: 3306
                  name: mysql
              volumeMounts:
                - name: mysql-persistent-storage
                  mountPath: /var/lib/mysql
          volumes:
            - name: mysql-persistent-storage
              persistentVolumeClaim:
                claimName: mysql-pv-claim
    apiVersion: v1
    kind: Service
    metadata:
      namespace: yourname
      name: wordpress-mysql
      labels:
        app: wordpress
    spec:
      ports:
        - port: 3306
      selector:
        app: wordpress
        tier: mysql
  • Ease of use: the SNS service is very simple to use, as the web-based AWS Management Console offers the simplicity of a point-and-click interface.

  • Simple Architecture: SNS simplifies the messaging architecture by offloading the message filtering logic from the subscribers and the message routing logic from the publishers. Instead of receiving all the messages published to a topic, each subscriber receives only the messages it is interested in.

  • SQS polling introduces some latency in message delivery, while SNS pushes messages to the subscribers immediately.

  • Apply policies—Kubernetes namespaces let you apply policies to different parts of a cluster. For example, you can define resource policies to limit resource consumption. You can also use container network interfaces (CNIs) to apply network policies that define how communication is achieved between pods in each namespace.

  • Apply access controls—namespaces let you define role-based access control (RBAC). You can define a role object type and assign it using role binding. The role you define is applied to a namespace, and RoleBinding is applied to specific objects within this namespace. Using this technique can help you improve the security of your cluster.

  • In a new cluster, Kubernetes automatically creates the following namespaces: default (for user workloads) and three namespaces for the Kubernetes control plane: kube-node-lease, kube-public, and kube-system. Kubernetes also allows admins to manually create custom namespaces.

    Related content: Read our guide to Kubernetes architecturearrow-up-right

    In this article:

    • Kubernetes Namespaces Conceptsarrow-up-right

    • The Hierarchical Namespace Controller (HNC)arrow-up-right

    • When Should You Use Multiple Kubernetes Namespaces?arrow-up-right

    hashtag
    Kubernetes Namespaces Concepts

    There are two types of Kubernetes namespaces: Kubernetes system namespaces and custom namespaces.

    Default Kubernetes namespaces

    Here are the four default namespaces Kubernetes creates automatically:

    • default—a default space for objects that do not have a specified namespace.

    • kube-system—a default space for Kubernetes system objects, such as kube-dns and kube-proxy, and add-ons providing cluster-level features, such as web UI dashboards, ingresses, and cluster-level logging.

    • kube-public—a default space for resources available to all users without authentication.

    • kube-node-lease—a default space for objects related to cluster scaling.

    Custom Kubernetes namespaces

    Admins can create as many Kubernetes namespaces as necessary to isolate workloads or resources and limit access to specific users. Here is how to create a namespace using kubectl:

    kubectl create ns mynamespace

    Related content: Read our guide to Kubernetes clustersarrow-up-right

    The Hierarchical Namespace Controller (HNC)

    Hierarchical namespaces are an extension to the Kubernetes namespaces mechanism, which allows you to organize groups of namespaces that have a common owner. For example, when a cluster is shared by multiple teams, each team can have a group of namespaces that belong to them.

    With hierarchical namespaces, you can create a team namespace, and under it namespaces for specific workloads. You don’t need cluster-level permission to create a namespace within your team namespace, and you also have the flexibility to apply different RBAC rules and network security groups at each level of the namespace hierarchy.

    hashtag
    When Should You Use Multiple Kubernetes Namespaces?

    In small projects or teams, where there is no need to isolate workloads or users from each other, it can be reasonable to use the default Kubernetes namespace. Consider using multiple namespaces for the following reasons:

    • Isolation—if you have a large team, or several teams working on the same cluster, you can use namespaces to create separation between projects and microservices. Activity in one namespace never affects the other namespaces.

    • Development stages—if you use the same Kubernetes cluster for multiple stages of the development lifecycle, it is a good idea to separate development, testing, and production environments. You do not want errors or instability in testing environments to affect production users. Ideally, you should use a separate cluster for each environment, but if this is not possible, namespaces can create this separation.

    • Permissions—it might be necessary to define separate permissions for different resources in your cluster. You can define separate RBAC rules for each namespace, ensuring that only authorized roles can access the resources in the namespace. This is especially important for mission critical applications, and to protect sensitive data in production deployments.

    • Resource control—you can define resource limits at the namespace level, ensuring each namespace has access to a certain amount of CPU and memory resources. This enables separating cluster resources among several projects and ensuring each project has the resources it needs, leaving sufficient resources for other projects.

    • Performance—the Kubernetes API provides better performance when you define namespaces in the cluster. This is because the API has fewer items to search through when you perform specific operations.

    hashtag
    Quick Tutorial: Working with Kubernetes Namespaces

    Let’s see how to perform basic namespace operations—creating a namespace, viewing existing namespaces, and creating a pod in a namespace.

    Creating Namespaces

    You can create a namespace with a simple kubectl command, like this:

    kubectl create namespace mynamespace

    This will create a namespace called mynamespace, with default configuration. If you want more control over the namespace, you can create a YAML file and apply it. Here is an example of a namespace YAML file:
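
    A minimal sketch of such a file (the team label is an optional illustration):

    apiVersion: v1
    kind: Namespace
    metadata:
      name: mynamespace
      labels:
        team: devops

    Apply it with kubectl apply -f namespace.yaml (the filename is a placeholder).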

    Viewing Namespaces

    To list all the namespaces currently active in your cluster, run this command:

    kubectl get namespace

    The output will look something like this:
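
    On a new cluster, the list typically contains only the default namespaces (names and ages will vary):

    NAME              STATUS   AGE
    default           Active   2d
    kube-node-lease   Active   2d
    kube-public       Active   2d
    kube-system       Active   2d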

    Creating Resources in the Namespace

    When you create a resource in Kubernetes without specifying a namespace, it is automatically created in the current namespace.

    For example, the following pod specification does not specify a namespace:
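
    A minimal sketch (the pod name and image are placeholders):

    apiVersion: v1
    kind: Pod
    metadata:
      name: mypod
    spec:
      containers:
      - name: mypod
        image: nginx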

    When you apply this pod specification, the following will happen:

    • If you did not create any namespaces in your cluster, the pod will be created in the default namespace

    • If you created a namespace and are currently running in it, the pod will be created in that namespace.

    How can you explicitly create a resource in a specific namespace?

    There are two ways to do this:

    1. Use the --namespace flag when creating the resource, like this:
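
    For example (the manifest file pod.yaml and the namespace mynamespace are placeholders):

    kubectl apply -f pod.yaml --namespace=mynamespace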

    2. Specify a namespace in the YAML specification of the resource. Here is what it looks like in a pod specification:
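
    A minimal sketch (the pod name and namespace are placeholders):

    apiVersion: v1
    kind: Pod
    metadata:
      name: mypod
      namespace: mynamespace
    spec:
      containers:
      - name: mypod
        image: nginx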

    Important notes:

    • Note that if your YAML specification specifies one namespace, but the apply command specifies another namespace, the command will fail.

    • If you try to work with a Kubernetes resource in a different namespace, kubectl will not find the resource. Use the --namespace flag to work with resources in other namespaces, like this:
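
    For example (assuming a pod named mypod in the namespace mynamespace):

    kubectl get pod mypod --namespace=mynamespace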

    hashtag
    What Are Kubernetes Nodes?

    A Kubernetesarrow-up-right node is a worker machine that runs Kubernetes workloadsarrow-up-right. It can be a physical (bare metal) machine or a virtual machine (VM). Each node can host one or more pods. Kubernetes nodes are managed by a control plane, which automatically handles the deployment and scheduling of pods across nodes in a Kubernetes cluster. When scheduling pods, the control plane assesses the resources available on each node.

    Each node runs two main components—a kubelet and a container runtime. The kubelet is in charge of facilitating communication between the control plane and the node. The container runtime is in charge of pulling the relevant container image from a registry, unpacking containers, running them on the node, and communicating with the operating system kernel.

    In this article:

    • Kubernetes Node Componentsarrow-up-right

    • Working with Kubernetes Nodes: 4 Basic Operationsarrow-up-right

    • Kubernetes Node Securityarrow-up-right

    hashtag
    Kubernetes Node Components

    Here are three main Kubernetes node components:

    kubelet

    The Kubelet is responsible for managing the deployment of pods to Kubernetes nodes. It receives commands from the API server and instructs the container runtime to start or stop containers as needed.

    kube-proxy

    A network proxy running on each Kubernetes node. It is responsible for maintaining network rules on each node. Network rules enable network communication between nodes and pods. Kube-proxy can directly forward traffic or use the operating system packet filter layer.

    Container runtime

    The software layer responsible for running containers. There are several container runtimes supported by Kubernetes, including Containerd, CRI-O, Docker, and other Kubernetes Container Runtime Interface (CRI) implementations.

    hashtag
    Working with Kubernetes Nodes: 4 Basic Operations

    Here is how to perform common operations on a Kubernetes node.

    1. Adding Node to a Cluster

    You can manually add nodes to a Kubernetes cluster, or let the kubelet on that node self-register to the control plane. Once a node object is created manually or by the kubelet, the control plane validates the new node object.

    Adding nodes automatically

    The example below is a JSON manifest that creates a node object. Kubernetes then checks whether a kubelet has registered with the API server that matches the node’s metadata.name field. Only healthy nodes running all necessary services are eligible to run pods. If the check fails, the node is ignored until it becomes healthy.
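
    A minimal sketch of such a manifest (the node name and label value are placeholders):

    {
      "kind": "Node",
      "apiVersion": "v1",
      "metadata": {
        "name": "10.240.79.157",
        "labels": {
          "name": "my-first-k8s-node"
        }
      }
    }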

    Defining node capacity

    Nodes that self-register with the API Server report their CPU and memory capacity after the node object is created. However, when manually creating the node, administrators need to set up capacity demands. Once this information is defined, the Kubernetes scheduler assigns resources to all pods running on a node. The scheduler is responsible for ensuring that requests do not exceed node capacity.

    2. Modifying Node Objects

    You can use kubectl to manually create or modify node objects, overriding the settings defined in --register-node. You can, for example:

    • Use labels on nodes and node selectors to control scheduling. You can, for example, limit a pod to be eligible only for running on a subset of available nodes.

    • Mark a node as unschedulable to prevent the scheduler from adding new pods to the node. This action does not affect pods running on the node. You can use this option in preparation for maintenance tasks like node reboot. To mark the node as unschedulable, you can run: kubectl cordon $NODENAME.

    3. Checking Node Status

    There are three primary commands you can use to determine the status of a node and the resources running on it.

    kubectl describe nodes

    Run the command kubectl describe nodes my-node to get node information including:

    • HostName—reported by the node operating system kernel. You can report a different value for HostName using the kubelet flag --hostname-override.

    • InternalIP—enables traffic to be routed to the node within the cluster.

    • ExternalIP—an IP address that can be used to access the node from outside the cluster.

    • Conditions—system resource issues including CPU and memory utilization. This section shows error conditions like OutOfDisk, MemoryPressure, and DiskPressure.

    • Events—this section shows issues occurring in the environment, such as eviction of pods.

    kubectl describe pods

    You can use this command to get information about pods running on a node:

    • Pod information—labels, resource requirements, and containers running in the pod

    • Pod ready state—if a pod appears as READY, it means it passed the last readiness check.

    • Container state—can be Waiting, Running, or Terminated.

    • Restart count—how often a container has been restarted.

    • Log events—showing activity on the pod, indicating which component logged the event, for which object, a Reason and a Message explaining what happened.

    4. Understanding the Node Controller

    The node controller is the control plane component responsible for managing several aspects of the node’s lifecycle. Here are the three main roles of the node controller:

    Assigning CIDR addresses

    When the node is registered, the node controller assigns a Classless Inter-Domain Routing (CIDR) block to it (if CIDR assignment is enabled).

    Updating internal node lists

    The node controller maintains an internal list of nodes. It needs to be updated constantly with the list of machines available by the cloud provider. This list enables the node controller to ensure capacity is met.

    When a node is unhealthy, the node controller checks if the host machine for that node is available. If the VM is not available, the node controller deletes the node from the internal list. If Kubernetes is running on a public or private cloud, the node controller can send a request to create a new node, to maintain cluster capacity.

    Monitoring the health of nodes

    Here are several tasks the node controller is responsible for:

    • Checking the state of all nodes periodically, with the period determined by the --node-monitor-period flag.

    • Updating the NodeReady condition to ConditionUnknown if the node becomes unreachable and the node controller no longer receives heartbeats.

    • Evicting all pods from the node. If the node remains unreachable, the node controller uses graceful termination to evict the pods. By default, the node controller waits 40 seconds before reporting ConditionUnknown, and it starts evicting pods five minutes after that.

    hashtag
    Kubernetes Node Security

    kubelet

    This component is the main node agent for managing individual containers that run in a pod. Vulnerabilities associated with the kubelet are constantly discovered, meaning that you need to regularly upgrade the kubelet versions and apply the latest patches. Access to the kubelet is not authenticated by default, so you should implement strong authentication measures to restrict access.

    kube-proxy

    This component handles request forwarding according to network rules. It is a network proxy that supports various protocols (e.g., TCP, UDP) and allows Kubernetes services to be exposed. There are two ways to secure kube-proxy:

    • If proxy configuration is maintained via the kubeconfig file, restrict file permissions to ensure unauthorized parties cannot tamper with proxy settings.

    • Ensure that communication with the API server is only done over a secured port, and always require authentication and authorization.

    Hardened Node Security

    You can harden your node security by following these steps:

    • Ensure the host is properly configured and secure—check your configuration to ensure it meets the CIS Benchmarks standards.

    • Control access to sensitive ports—ensure the network blocks access to ports that kubelet uses. Limit Kubernetes API server access to trusted networks.

    • Limit administrative access to nodes—ensure your Kubernetes nodes have restricted access. You can handle tasks like debugging without having direct access to a node.

    Isolation of Sensitive Workloads

    You should run any sensitive workload on dedicated machines to minimize the impact of a breach. Isolating workloads prevents an attacker from accessing sensitive applications through lower-priority applications on the same host or with the same container runtime. Attackers can only exploit the kubelet credentials of compromised nodes to access secrets that are mounted on those nodes. You can use controls such as node pools, namespaces, tolerations and taints to isolate your workloads.

    hashtag
    Taints and Tolerations

    Node affinityarrow-up-right is a property of Podsarrow-up-right that attracts them to a set of nodesarrow-up-right (either as a preference or a hard requirement). Taints are the opposite -- they allow a node to repel a set of pods.

    Tolerations are applied to pods. Tolerations allow the scheduler to schedule pods with matching taints. Tolerations allow scheduling but don't guarantee scheduling: the scheduler also evaluates other parametersarrow-up-right as part of its function.

    Taints and tolerations work together to ensure that pods are not scheduled onto inappropriate nodes. One or more taints are applied to a node; this marks that the node should not accept any pods that do not tolerate the taints.

    hashtag
    Concepts

    You add a taint to a node using kubectl taintarrow-up-right. For example,
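
    kubectl taint nodes node1 key1=value1:NoSchedule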

    places a taint on node node1. The taint has key key1, value value1, and taint effect NoSchedule. This means that no pod will be able to schedule onto node1 unless it has a matching toleration.

    To remove the taint added by the command above, you can run:
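
    kubectl taint nodes node1 key1=value1:NoSchedule-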

    You specify a toleration for a pod in the PodSpec. Both of the following tolerations "match" the taint created by the kubectl taint line above, and thus a pod with either toleration would be able to schedule onto node1:
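
    tolerations:
    - key: "key1"
      operator: "Equal"
      value: "value1"
      effect: "NoSchedule"

    tolerations:
    - key: "key1"
      operator: "Exists"
      effect: "NoSchedule"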

    Here's an example of a pod that uses tolerations:

    pods/pod-with-toleration.yamlarrow-up-right
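
    A sketch of such a pod (the example-key taint key is a placeholder):

    apiVersion: v1
    kind: Pod
    metadata:
      name: nginx
      labels:
        env: test
    spec:
      containers:
      - name: nginx
        image: nginx
        imagePullPolicy: IfNotPresent
      tolerations:
      - key: "example-key"
        operator: "Exists"
        effect: "NoSchedule"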

    The default value for operator is Equal.

    A toleration "matches" a taint if the keys are the same and the effects are the same, and:

    • the operator is Exists (in which case no value should be specified), or

    • the operator is Equal and the values are equal.

    Note:

    There are two special cases:

    An empty key with operator Exists matches all keys, values and effects which means this will tolerate everything.

    An empty effect matches all effects with key key1.

    The above example used effect of NoSchedule. Alternatively, you can use effect of PreferNoSchedule. This is a "preference" or "soft" version of NoSchedule -- the system will try to avoid placing a pod that does not tolerate the taint on the node, but it is not required. The third kind of effect is NoExecute, described later.

    You can put multiple taints on the same node and multiple tolerations on the same pod. The way Kubernetes processes multiple taints and tolerations is like a filter: start with all of a node's taints, then ignore the ones for which the pod has a matching toleration; the remaining un-ignored taints have the indicated effects on the pod. In particular,

    • if there is at least one un-ignored taint with effect NoSchedule then Kubernetes will not schedule the pod onto that node

    • if there is no un-ignored taint with effect NoSchedule but there is at least one un-ignored taint with effect PreferNoSchedule then Kubernetes will try to not schedule the pod onto the node

    • if there is at least one un-ignored taint with effect NoExecute then the pod will be evicted from the node (if it is already running on the node), and will not be scheduled onto the node (if it is not yet running on the node).

    For example, imagine you taint a node like this
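
    kubectl taint nodes node1 key1=value1:NoSchedule
    kubectl taint nodes node1 key1=value1:NoExecute
    kubectl taint nodes node1 key2=value2:NoSchedule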

    And a pod has two tolerations:
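
    tolerations:
    - key: "key1"
      operator: "Equal"
      value: "value1"
      effect: "NoSchedule"
    - key: "key1"
      operator: "Equal"
      value: "value1"
      effect: "NoExecute"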

    In this case, the pod will not be able to schedule onto the node, because there is no toleration matching the third taint. But it will be able to continue running if it is already running on the node when the taint is added, because the third taint is the only one of the three that is not tolerated by the pod.

    Normally, if a taint with effect NoExecute is added to a node, then any pods that do not tolerate the taint will be evicted immediately, and pods that do tolerate the taint will never be evicted. However, a toleration with NoExecute effect can specify an optional tolerationSeconds field that dictates how long the pod will stay bound to the node after the taint is added. For example,
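
    tolerations:
    - key: "key1"
      operator: "Equal"
      value: "value1"
      effect: "NoExecute"
      tolerationSeconds: 3600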

    means that if this pod is running and a matching taint is added to the node, then the pod will stay bound to the node for 3600 seconds, and then be evicted. If the taint is removed before that time, the pod will not be evicted.

    hashtag
    Example Use Cases

    Taints and tolerations are a flexible way to steer pods away from nodes or evict pods that shouldn't be running. A few of the use cases are

    • Dedicated Nodes: If you want to dedicate a set of nodes for exclusive use by a particular set of users, you can add a taint to those nodes (say, kubectl taint nodes nodename dedicated=groupName:NoSchedule) and then add a corresponding toleration to their pods (this would be done most easily by writing a custom admission controllerarrow-up-right). The pods with the tolerations will then be allowed to use the tainted (dedicated) nodes as well as any other nodes in the cluster. If you want to dedicate the nodes to them and ensure they only use the dedicated nodes, then you should additionally add a label similar to the taint to the same set of nodes (e.g. dedicated=groupName), and the admission controller should additionally add a node affinity to require that the pods can only schedule onto nodes labeled with dedicated=groupName.

    • Nodes with Special Hardware: In a cluster where a small subset of nodes have specialized hardware (for example GPUs), it is desirable to keep pods that don't need the specialized hardware off of those nodes, thus leaving room for later-arriving pods that do need the specialized hardware. This can be done by tainting the nodes that have the specialized hardware (e.g. kubectl taint nodes nodename special=true:NoSchedule or kubectl taint nodes nodename special=true:PreferNoSchedule) and adding a corresponding toleration to pods that use the special hardware. As in the dedicated nodes use case, it is probably easiest to apply the tolerations using a custom admission controller. For example, it is recommended to use extended resources to represent the special hardware, taint your special hardware nodes with the extended resource name and run the admission controller. Now, because the nodes are tainted, no pods without the toleration will schedule on them. But when you submit a pod that requests the extended resource, the ExtendedResourceToleration admission controller will automatically add the correct toleration to the pod and that pod will schedule on the special hardware nodes. This will make sure that these special hardware nodes are dedicated for pods requesting such hardware and you don't have to manually add tolerations to your pods.

    • Taint based Evictions: A per-pod-configurable eviction behavior when there are node problems, which is described in the next section.

    hashtag
    Taint based Evictions

    FEATURE STATE: Kubernetes v1.18 [stable]

    The NoExecute taint effect, mentioned above, affects pods that are already running on the node as follows

    • pods that do not tolerate the taint are evicted immediately

    • pods that tolerate the taint without specifying tolerationSeconds in their toleration specification remain bound forever

    • pods that tolerate the taint with a specified tolerationSeconds remain bound for the specified amount of time

    The node controller automatically taints a Node when certain conditions are true. The following taints are built in:

    • node.kubernetes.io/not-ready: Node is not ready. This corresponds to the NodeCondition Ready being "False".

    • node.kubernetes.io/unreachable: Node is unreachable from the node controller. This corresponds to the NodeCondition Ready being "Unknown".

    • node.kubernetes.io/memory-pressure: Node has memory pressure.

    • node.kubernetes.io/disk-pressure: Node has disk pressure.

    • node.kubernetes.io/pid-pressure: Node has PID pressure.

    • node.kubernetes.io/network-unavailable: Node's network is unavailable.

    • node.kubernetes.io/unschedulable: Node is unschedulable.

    • node.cloudprovider.kubernetes.io/uninitialized: When the kubelet is started with "external" cloud provider, this taint is set on a node to mark it as unusable. After a controller from the cloud-controller-manager initializes this node, the kubelet removes this taint.

    In case a node is to be evicted, the node controller or the kubelet adds relevant taints with NoExecute effect. If the fault condition returns to normal the kubelet or node controller can remove the relevant taint(s).

    In some cases when the node is unreachable, the API server is unable to communicate with the kubelet on the node. The decision to delete the pods cannot be communicated to the kubelet until communication with the API server is re-established. In the meantime, the pods that are scheduled for deletion may continue to run on the partitioned node.

    Note: The control plane limits the rate of adding new taints to nodes. This rate limiting manages the number of evictions that are triggered when many nodes become unreachable at once (for example: if there is a network disruption).

    You can specify tolerationSeconds for a Pod to define how long that Pod stays bound to a failing or unresponsive Node.

    For example, you might want to keep an application with a lot of local state bound to node for a long time in the event of network partition, hoping that the partition will recover and thus the pod eviction can be avoided. The toleration you set for that Pod might look like:
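
    A sketch of such a toleration (the 6000-second value is illustrative):

    tolerations:
    - key: "node.kubernetes.io/unreachable"
      operator: "Exists"
      effect: "NoExecute"
      tolerationSeconds: 6000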

    Note:

    Kubernetes automatically adds a toleration for node.kubernetes.io/not-ready and node.kubernetes.io/unreachable with tolerationSeconds=300, unless you, or a controller, set those tolerations explicitly.

    These automatically-added tolerations mean that Pods remain bound to Nodes for 5 minutes after one of these problems is detected.

    DaemonSetarrow-up-right pods are created with NoExecute tolerations for the following taints with no tolerationSeconds:

    • node.kubernetes.io/unreachable

    • node.kubernetes.io/not-ready

    This ensures that DaemonSet pods are never evicted due to these problems.

    hashtag
    Taint Nodes by Condition

    The control plane, using the node controllerarrow-up-right, automatically creates taints with a NoSchedule effect for node conditionsarrow-up-right.

    The scheduler checks taints, not node conditions, when it makes scheduling decisions. This ensures that node conditions don't directly affect scheduling. For example, if the DiskPressure node condition is active, the control plane adds the node.kubernetes.io/disk-pressure taint and does not schedule new pods onto the affected node. If the MemoryPressure node condition is active, the control plane adds the node.kubernetes.io/memory-pressure taint.

    You can ignore node conditions for newly created pods by adding the corresponding Pod tolerations. The control plane also adds the node.kubernetes.io/memory-pressure toleration on pods that have a QoS classarrow-up-right other than BestEffort. This is because Kubernetes treats pods in the Guaranteed or Burstable QoS classes (even pods with no memory request set) as if they are able to cope with memory pressure, while new BestEffort pods are not scheduled onto the affected node.

    The DaemonSet controller automatically adds the following NoSchedule tolerations to all daemons, to prevent DaemonSets from breaking.

    • node.kubernetes.io/memory-pressure

    • node.kubernetes.io/disk-pressure

    • node.kubernetes.io/pid-pressure (1.14 or later)

    • node.kubernetes.io/unschedulable (1.10 or later)

    • node.kubernetes.io/network-unavailable (host network only)

    Adding these tolerations ensures backward compatibility. You can also add arbitrary tolerations to DaemonSets.

    Kubernetes Nodes need occasional maintenance. You could be updating the Node’s kernel, resizing its compute resource in your cloud account, or replacing physical hardware components in a self-hosted installation.

    Kubernetes cordons and drains are two mechanisms you can use to safely prepare for Node downtime. They allow workloads running on a target Node to be rescheduled onto other ones. You can then shutdown the Node or remove it from your cluster without impacting service availability.

    hashtag
    Applying a Node Cordon

    Cordoning a Node marks it as unavailable to the Kubernetes scheduler. The Node will be ineligible to host any new Pods subsequently added to your cluster.

    Use the kubectl cordon command to place a cordon around a named Node:
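
    For example (node-1 is a placeholder for your Node's name):

    kubectl cordon node-1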

    Existing Pods already running on the Node won’t be affected by the cordon. They’ll remain accessible and will still be hosted by the cordoned Node.

    You can check which of your Nodes are currently cordoned with the get nodes command:
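
    kubectl get nodes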

    Cordoned nodes appear with the SchedulingDisabled status.

    hashtag
    Draining a Node

    The next step is to drain remaining Pods out of the Node. This procedure will evict the Pods so they’re rescheduled onto other Nodes in your cluster. Pods are allowed to gracefully terminatearrow-up-right before they’re forcefully removed from the target Node.

    Run kubectl drain to initiate a drain procedure. Specify the name of the Node you’re taking out for maintenance:
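
    For example (node-1 is a placeholder; the --ignore-daemonsets flag is usually needed because DaemonSet-managed Pods cannot be evicted by a drain):

    kubectl drain node-1 --ignore-daemonsets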

    The drain procedure first cordons the Node if you’ve not already placed one manually. It will then evict running Kubernetes workloads by safely rescheduling them to other Nodes in your cluster.

    You can shutdown or destroy the Node once the drain’s completed. You’ve freed the Node from its responsibilities to your cluster. The cordon provides an assurance that no new workloads have been scheduled since the drain completed.

    hashtag
    Ignoring Pod Grace Periods

    Drains can sometimes take a while to complete if your Pods have long grace periods. This might not be ideal when you need to urgently take a Node offline. Use the --grace-period flag to override Pod termination grace periods and force an immediate eviction:
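
    For example (node-1 is a placeholder):

    kubectl drain node-1 --grace-period=0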

    This should be used with care – some workloads might not respond well if they’re stopped without being offered a chance to clean up.

    hashtag
    Understanding the pod phase

    | Pod Phase | Description |
    | --- | --- |
    | Pending | After you create the Pod object, this is its initial phase. Until the pod is scheduled to a node and the images of its containers are pulled and started, it remains in this phase. |
    | Running | At least one of the pod’s containers is running. |
    | Succeeded | Pods that aren’t intended to run indefinitely are marked as Succeeded when all their containers complete successfully. |
    | Failed | When a pod is not configured to run indefinitely and at least one of its containers terminates unsuccessfully, the pod is marked as Failed. |
    | Unknown | The state of the pod is unknown because the Kubelet has stopped communicating with the API server. Possibly the worker node has failed or has disconnected from the network. |

    hashtag
    pod’s conditions during its lifecycle

    | Pod Condition | Description |
    | --- | --- |
    | PodScheduled | Indicates whether or not the pod has been scheduled to a node. |
    | Initialized | The pod’s init containers have all completed successfully. |
    | ContainersReady | All containers in the pod indicate that they are ready. This is a necessary but not sufficient condition for the entire pod to be ready. |
    | Ready | The pod is ready to provide services to its clients. The containers in the pod and the pod’s readiness gates are all reporting that they are ready. |

    Understanding the container state

    | Container State | Description |
    | --- | --- |
    | Waiting | The container is waiting to be started. The reason and message fields indicate why the container is in this state. |
    | Running | The container has been created and processes are running in it. The startedAt field indicates the time at which this container was started. |
    | Terminated | The processes that had been running in the container have terminated. The startedAt and finishedAt fields indicate when the container was started and when it terminated. The exit code with which the main process terminated is in the exitCode field. |
    | Unknown | The state of the container couldn’t be determined. |

    Configuring the pod’s restart policy

    | Restart Policy | Description |
    | --- | --- |
    | Always | Container is restarted regardless of the exit code the process in the container terminates with. This is the default restart policy. |
    | OnFailure | The container is restarted only if the process terminates with a non-zero exit code, which by convention indicates failure. |
    | Never | The container is never restarted - not even when it fails. |

    hashtag
    What are the Three Types of Kubernetes Probes?

    Kubernetes provides the following types of probes. For all these types, if the container does not implement the probe handler, their result is always Success.

    • Liveness Probe—indicates if the container is operating. If so, no action is taken. If not, the kubelet kills and restarts the container. Learn more in our guide to Kubernetes liveness probesarrow-up-right.

    • Readiness Probe—indicates whether the application running in the container is ready to accept requests. If so, Services matching the pod are allowed to send traffic to it. If not, the endpoints controller removes the pod from all matching Kubernetes Services.

    • Startup Probe—indicates whether the application running in the container has started. If so, other probes start functioning. If not, the kubelet kills and restarts the container.

    hashtag
    When to Use Readiness Probes

    Readiness probes are most useful when an application is temporarily malfunctioning and unable to serve traffic. If the application is running but not fully available, Kubernetes may not be able to scale it up and new deployments could fail. A readiness probe allows Kubernetes to wait until the service is active before sending it traffic.

    When you use a readiness probe, keep in mind that Kubernetes will only send traffic to the pod if the probe succeeds.

    There is no need to use a readiness probe on deletion of a pod. When a pod is deleted, it automatically puts itself into an unready state, regardless of whether readiness probes are used. It remains in this status until all containers in the pod have stopped.

    hashtag
    How Readiness Probes Work in Kubernetes

    A readiness probe can be deployed as part of several Kubernetes objects. For example, here is how to define a readiness probe in a Deployment:
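
    A minimal sketch (the deployment name, image, port, and /ready endpoint are placeholders):

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: my-app
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: my-app
      template:
        metadata:
          labels:
            app: my-app
        spec:
          containers:
          - name: my-app
            image: my-app:latest
            ports:
            - containerPort: 8080
            readinessProbe:
              httpGet:
                path: /ready
                port: 8080
              initialDelaySeconds: 5
              periodSeconds: 10
              timeoutSeconds: 1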

    Once the above Deployment object is applied to the cluster, the readiness probe runs continuously throughout the lifecycle of the application.

    A readiness probe has the following configuration options:

    | PARAMETER | DESCRIPTION | DEFAULT VALUE |
    | --- | --- | --- |
    | initialDelaySeconds | Number of seconds between container start and probe start to allow for services to initialize. | 0 |
    | periodSeconds | Frequency of readiness test. | 10 |
    | timeoutSeconds | Timeout for probe responses. | 1 |

    hashtag
    Why Do Readiness Probes Fail? Common Error Scenarios

    Readiness probes are used to verify tasks during a container lifecycle. This means that if the probe’s response is interrupted or delayed, service may be interrupted. Keep in mind that if a readiness probe returns Failure status, Kubernetes will remove the pod from all matching service endpoints. Here are two examples of conditions that can cause an application to incorrectly fail the readiness probe.

    hashtag
    Delayed Response

    In some circumstances, readiness probes may be late to respond—for example, if the application needs to read large amounts of data with low latency or perform heavy computations. Consider this behavior when configuring readiness probes, and always test your application thoroughly before running it in production with a readiness probe.

    hashtag
    Cascading Failures

    A readiness probe response can be conditional on components that are outside the direct control of the application. For example, you could configure a readiness probe using HTTPGet, in such a way that the application first checks the availability of a cache service or database before responding to the probe. This means that if the database is down or late to respond, the entire application will become unavailable.

    This may or may not make sense, depending on your application setup. If the application cannot function at all without the third-party component, maybe this behavior is warranted. If it can continue functioning, for example, by falling back to a local cache, the database or external cache should not be connected to probe responses.

    In general, if the pod is technically ready, even if it cannot function perfectly, it should not fail the readiness probe. A good compromise is to implement a “degraded mode,” for example, if there is no access to the database, answer read requests that can be addressed by local cache and return 503 (service unavailable) on write requests. Ensure that downstream services are resilient to a failure in the upstream service.

    hashtag
    Methods of Checking Application Health

    Startup, readiness, and liveness probes can check the health of applications in three ways: HTTP checks, container execution checks, and TCP socket checks.

    hashtag
    HTTP Checks

    An HTTP check is ideal for applications that return HTTP status codes, such as REST APIs.

    HTTP probe uses GET requests to check the health of an application. The check is successful if the HTTP response code is in the range 200-399.

    The following example demonstrates how to implement a readiness probe with the HTTP check method:
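
    A sketch of the relevant container section (the /healthz path and port are placeholders); the numbered notes below refer to the marked lines:

    readinessProbe:
      httpGet:
        path: /healthz         # 1
        port: 8080
      initialDelaySeconds: 15  # 2
      timeoutSeconds: 1        # 3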

    1. The readiness probe endpoint.

    2. How long to wait after the container starts before checking its health.

    3. How long to wait for the probe to finish.

    hashtag
    Container Execution Checks

    Container execution checks are ideal in scenarios where you must determine the status of the container based on the exit code of a process or shell script running in the container.

    When using container execution checks, Kubernetes executes a command inside the container. Exiting the check with a status of 0 is considered a success. All other status codes are considered a failure.

    The following example demonstrates how to implement a container execution check:
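
    A sketch (the cat /tmp/healthy command is a placeholder); the numbered note below refers to the marked lines:

    livenessProbe:
      exec:
        command:               # 1
        - cat
        - /tmp/healthy
      initialDelaySeconds: 5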

    1. The command to run and its arguments, as a YAML array.

    hashtag
    TCP Socket Checks

    A TCP socket check is ideal for applications that run as daemons, and open TCP ports, such as database servers, file servers, web servers, and application servers.

    When using TCP socket checks, Kubernetes attempts to open a socket to the container. The container is considered healthy if the check can establish a successful connection.

    The following example demonstrates how to implement a liveness probe by using the TCP socket check method:
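
    A sketch (port 8080 is a placeholder); the numbered note below refers to the marked line:

    livenessProbe:
      tcpSocket:
        port: 8080             # 1
      initialDelaySeconds: 15
      timeoutSeconds: 1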

    1. The TCP port to check.

    hashtag
    Creating Probes

    To configure probes on a deployment, edit the deployment’s resource definition. To do this, you can use the kubectl edit or kubectl patch commands. Alternatively, if you already have a deployment YAML definition, you can modify it to include the probes and then apply it with kubectl apply.

    The following example demonstrates using the kubectl edit command to add a readiness probe to a deployment:
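
    For example (my-deployment is a placeholder for the deployment's name):

    kubectl edit deployment/my-deployment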

    Note

    This will open your system’s default editor with the deployment definition. Once you make the necessary changes, save and quit the editor to apply them.

    The following examples demonstrate using the set probe command with a variety of options (note that set probe is provided by the OpenShift oc CLI; upstream kubectl does not include a set probe subcommand, so with plain kubectl use edit, patch, or your manifest instead):

    hashtag
    Assign Pods to Nodes using Node Affinity

    This page shows how to assign a Kubernetes Pod to a particular node using Node Affinity in a Kubernetes cluster.

    hashtag
    Before you begin

    You need to have a Kubernetes cluster, and the kubectl command-line tool must be configured to communicate with your cluster. It is recommended to run this tutorial on a cluster with at least two nodes that are not acting as control plane hosts. If you do not already have a cluster, you can create one by using minikubearrow-up-right or you can use one of these Kubernetes playgrounds:

    • Killercodaarrow-up-right

    • Play with Kubernetesarrow-up-right

    Your Kubernetes server must be at or later than version v1.10. To check the version, enter kubectl version.

    hashtag
    Add a label to a node

    1. List the nodes in your cluster, along with their labels:

      The output is similar to this:

    2. Choose one of your nodes, and add a label to it:

      where <your-node-name> is the name of your chosen node.

    3. Verify that your chosen node has a disktype=ssd label:

      The output is similar to this:

      In the preceding output, you can see that the worker0 node has a disktype=ssd label.

    hashtag
    Schedule a Pod using required node affinity

    This manifest describes a Pod that has a requiredDuringSchedulingIgnoredDuringExecution node affinity, disktype: ssd. This means that the pod will get scheduled only on a node that has a disktype=ssd label.

    pods/pod-nginx-required-affinity.yamlarrow-up-right

    1. Apply the manifest to create a Pod that is scheduled onto your chosen node:

    2. Verify that the pod is running on your chosen node:

      The output is similar to this:

    hashtag
    Schedule a Pod using preferred node affinity

    This manifest describes a Pod that has a preferredDuringSchedulingIgnoredDuringExecution node affinity, disktype: ssd. This means that the pod will prefer a node that has a disktype=ssd label.

    pods/pod-nginx-preferred-affinity.yamlarrow-up-right

    1. Apply the manifest to create a Pod that is scheduled onto your chosen node:

    2. Verify that the pod is running on your chosen node:

      The output is similar to this:

    Kubernetesarrow-up-right
    Kubernetes networkingarrow-up-right
    | ID | Provider | Service | Cost |
    | --- | --- | --- | --- |
    | 1 | OVH Cloud | Advance-6 Gen 2 Server | 306.67 $ / month |
    | 2 | OVH Cloud | 6 Public IP | 12 $ / month |
    | 3 | AWS | S3 / SES / Route 53 | 20 ~ 30 USD |

    https://us.ovh.com/us/order/arrow-up-right

    circle-check

    One-time payment: implementation of the full DevOps CI/CD pipeline

    | ID | Provider | Cost |
    | --- | --- | --- |
    | 1 | Omar Abdalhamid | 750 $ |

    AWS RDS

    Amazon Relational Database Service (Amazon RDS) makes it easy to set up, operate and scale relational databases in the cloud. It provides cost-efficient and resizable capabilities while automating time-consuming administration tasks such as hardware provisioning, database setup, patching, and backup. It frees you up to focus on your applications so that you can provide them with the fast performance, high availability, security and compatibility they need.

    Amazon RDS is available on multiple database instance types - optimized for memory, performance, or I/O - and gives you six familiar database engines to choose from, including Amazon Aurora, PostgreSQL, MySQL, MariaDB, Oracle Database, and SQL Server. You can use the AWS Database Migration Service to easily migrate or replicate your existing databases to Amazon RDS.

    hashtag
    How it works

    Amazon Relational Database Service (Amazon RDS) is a collection of managed services that makes it simple to set up, operate, and scale databases in the cloud. Choose from seven popular engines — Amazon Aurora with MySQL compatibilityarrow-up-right, Amazon Aurora with PostgreSQL compatibilityarrow-up-right, MySQLarrow-up-right, MariaDBarrow-up-right, PostgreSQLarrow-up-right, Oraclearrow-up-right, and SQL Serverarrow-up-right — and deploy on-premises with Amazon RDS on AWS Outpostsarrow-up-right.

    Databases store large amounts of data that applications can draw upon to help them perform various tasks. A relational database uses tables to store data and is called relational because it organizes data points with defined relationships.

    Administrators control Amazon RDS with the AWS Management Console, Amazon RDS API calls, or the AWS command-line interface. They use these interfaces to deploy database instances to which users can apply specific settings.

    Amazon provides several instance types with different resources, such as CPU, memory, storage options, and networking capability. Each type comes in a variety of sizes to suit the needs of different workloads.

    RDS users can use AWS Identity and Access Management to define and set permissions to access RDS databases.
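    For example, a minimal AWS CLI sketch for launching a MySQL DB instance (the identifier, credentials, and sizes below are placeholders):

    aws rds create-db-instance \
        --db-instance-identifier mydb \
        --db-instance-class db.t3.micro \
        --engine mysql \
        --master-username admin \
        --master-user-password 'ChangeMe123!' \
        --allocated-storage 20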

    hashtag
    Amazon RDS Features

    Amazon RDS features include the following:

    Replication. RDS uses the replication feature to create read replicas: read-only copies of a database instance that applications can use without changing the original production database. Administrators can also enable automatic failover across multiple Availability Zones through RDS Multi-AZ deployments and synchronous data replication.

    hashtag
    Amazon RDS - DB Instances

    A DB instance is an isolated database environment running in the cloud which can contain multiple user-created databases. It can be accessed using the same client tools and applications used to access a standalone database instance. However, there are limits on how many DB instances of each type you can have in a single customer account. The diagram below illustrates the different combinations based on the type of license you opt for.

    Each DB instance is identified by a customer supplied name called DB instance identifier. It is unique for the customer for a given AWS region.

    hashtag
    DB Instance Classes

    Depending on the need of the processing power and memory requirement, there is a variety of instance classes offered by AWS for the RDS service.

    | Instance Class | Number of vCPU | Memory Range in GB | Bandwidth Range in Mbps |
    | --- | --- | --- | --- |
    | Standard | 1 to 64 | 1.7 to 256 | 450 to 10000 |
    | Memory Optimized | 2 to 128 | 17.1 to 3904 | 500 to 14000 |
    | Burstable Performance | 1 to 8 | 1 to 32 | Low to Moderate |

    When you need more processing power than memory, choose the Standard instance class with a higher number of virtual CPUs. In the case of a very high memory requirement, choose the Memory Optimized class with an appropriate number of vCPUs. Choosing the correct class impacts not only processing speed but also the cost of using the service. The Burstable Performance class is suitable when you have minimal processing requirements and the data size is not in petabytes.

    hashtag
    DB Instance Status

    The DB instance status indicates the health of the DB. Its value can be seen from the AWS console or by using the AWS CLI command describe-db-instances. The important status values of DB instances and their meanings are described below.

    | DB Instance Status | Meaning | Is the Instance Billed? |
    | --- | --- | --- |
    | Available | The instance is healthy and available. | Yes |
    | Backing-up | The instance is currently being backed up. | Yes |
    | Creating | The instance is being created. The instance is inaccessible while it is being created. | No |
    | Deleting | The instance is being deleted. | No |
    | Failed | The instance has failed and Amazon RDS can't recover it. | No |
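    The same status can be read from the CLI; a sketch, assuming an instance named mydb:

    aws rds describe-db-instances \
        --db-instance-identifier mydb \
        --query 'DBInstances[*].[DBInstanceIdentifier,DBInstanceStatus]' \
        --output table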

    hashtag
    Amazon RDS - DB Storages

    RDS instances use Amazon Elastic Block Store (EBS) volumes for storing data and logs. These storage types can dynamically increase their size as and when needed, but based on the database workload and the price associated with each storage type, we can customize the storage used. The following factors should be analyzed when deciding on a storage type.

    • IOPS – The number of input/output operations performed per second. Read and write operations are summed to obtain the IOPS value. AWS reports the IOPS value for every 1-minute interval. It can range from zero to tens of thousands per second.

    • Latency – The number of milliseconds elapsed between the initiation of an I/O request and its completion. Higher latency indicates slower performance.

    • Throughput – The number of bytes transferred to and from the disk every second. AWS reports read and write throughput separately for every 1-minute interval.

    • Queue Depth – The number of I/O requests waiting in the queue before they can reach the disk. AWS reports the queue depth for every 1-minute interval. A higher queue depth indicates slower storage performance.

    Based on the above considerations, the AWS storage types are described below.

    hashtag
    General Purpose SSD

    This is a cost-effective storage type that is useful for most common database tasks. It can provide 3000 IOPS for a 1 TiB volume; at 3.34 TiB, performance can go up to 10000 IOPS.

    hashtag
    I/O Credits

    Each GB of storage provides 3 IOPS as baseline performance, which means a 100 GB volume can provide 300 IOPS. There may be scenarios where you need more IOPS. In such cases you draw on the I/O credit balance of 5.4 million I/O credits that is granted when the storage is initialized, which can be consumed when a burst of performance is needed. Conversely, when you use fewer IOPS than the baseline, you accumulate credits that can be used for future bursts.

    Below is an equation which shows the relation between burst duration and credit balance.
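    For gp2 volumes the relation is usually written as follows, where 3,000 IOPS is the burst rate and 3 IOPS per GiB is the baseline:

    Burst duration (seconds) = (I/O credit balance) / (3,000 - 3 x storage size in GiB)

    For example, a 100 GB volume bursting at 3,000 IOPS from a full balance of 5.4 million credits can sustain the burst for about 5,400,000 / (3,000 - 300) = 2,000 seconds, roughly 33 minutes.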

    If your DB needs frequent and long duration burstable performance, then the next storage type will be a better choice.

    hashtag
    Provisioned IOPS Storage

    This is a type of storage system that gives sustained higher performance and consistently low latency which is most suitable for OLTP workloads.

    When creating the DB instance, you specify the required IOPS rate and volume size for such storage. Below is a chart which is used for reference for deciding about the IOPS and storage needed under provisioned storage.

    | DB Engine | Provisioned IOPS Range | Storage Range |
    | --- | --- | --- |
    | MariaDB | 1000 to 40000 | 100 GB to 16 TB |
    | SQL Server | 1000 to 32000 | 20 GB to 16 TB |
    | MySQL / Oracle / PostgreSQL | 1000 to 40000 | 100 GB to 16 TB |
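    A sketch of creating an instance with Provisioned IOPS storage from the CLI (all values are placeholders; the IOPS and storage must fall within the ranges above):

    aws rds create-db-instance \
        --db-instance-identifier mydb-io \
        --db-instance-class db.m5.large \
        --engine mysql \
        --master-username admin \
        --master-user-password 'ChangeMe123!' \
        --storage-type io1 \
        --allocated-storage 100 \
        --iops 1000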

    hashtag
    Magnetic Storage

    Magnetic storage is a very old storage technology that AWS maintains only for backward compatibility. Its features are very limited:

    • Does not support Elastic Volumes

    • Limited to maximum size of 4 TB

    • Limited to maximum of 1000 IOPS

    Amazon RDS for MySQL

    MySQL is the world's most popular open source relational database and Amazon RDS makes it easy to set up, operate, and scale MySQL deployments in the cloud. With Amazon RDS, you can deploy scalable MySQL servers in minutes with cost-efficient and resizable hardware capacity.

    Amazon RDS for MySQL frees you up to focus on application development by managing time-consuming database administration tasks, including backups, upgrades, software patching, performance improvements, monitoring, scaling, and replication.

    Amazon RDS supports MySQL Community Edition versions 5.7 and 8.0 which means that the code, applications, and tools you already use today can be used with Amazon RDS.

    hashtag
    Easy, managed deployments

    It takes only a few clicks in the Amazon Web Services Management Console to launch and connect to a production-ready MySQL database in minutes. Amazon RDS for MySQL database instances are pre-configured with parameters and settings for the server type you have selected. Database parameter groups provide granular control and fine-tuning of your MySQL database. When you need to make a database update, Amazon RDS Blue/Green Deployments are designed to make them safer, simpler, and faster.

    hashtag
    Fast, predictable storage

    Amazon RDS provides two SSD-backed storage options for your MySQL database. General Purpose storage provides cost-effective storage for small or medium-sized workloads. For high-performance OLTP applications, Provisioned IOPS delivers consistent performance of up to 80,000 IOs per second. As your storage requirements grow, you can provision additional storage on the fly with zero downtime. With Amazon RDS Optimized Writesarrow-up-right you can achieve up to 2x improved write transaction throughput.

    hashtag
    Backup and recovery

    The automated backup feature of Amazon RDS enables recovery of your MySQL database instance to any point in time within your specified retention period of up to thirty-five days. In addition, you can perform user-initiated backups of your DB instance. These full database backups will be stored by Amazon RDS until you explicitly delete them.
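    A point-in-time restore can also be started from the CLI; a sketch with placeholder identifiers:

    aws rds restore-db-instance-to-point-in-time \
        --source-db-instance-identifier mydb \
        --target-db-instance-identifier mydb-restored \
        --use-latest-restorable-time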

    hashtag
    High availability and read replicas

    Amazon RDS Multi-AZ deployments provide enhanced availability and durability for your MySQL databases, making them a natural fit for production database workloads. Amazon RDS Read Replicas make it easy to elastically scale out beyond the capacity constraints of a single database instance for read-heavy database workloads.

    hashtag
    Monitoring and metrics

    Amazon RDS provides Amazon CloudWatcharrow-up-right metrics for your database instances at no additional charge, and Amazon RDS Enhanced Monitoring provides access to over 50 CPU, memory, file system, and disk I/O metrics. View key operational metrics in the Amazon Web Services Management Console, including compute/memory/storage capacity utilization, I/O activity, and instance connections.
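    The same metrics can be pulled from Amazon CloudWatch with the CLI; a sketch for the CPU utilization of a hypothetical instance named mydb:

    aws cloudwatch get-metric-statistics \
        --namespace AWS/RDS \
        --metric-name CPUUtilization \
        --dimensions Name=DBInstanceIdentifier,Value=mydb \
        --statistics Average \
        --period 300 \
        --start-time 2024-01-01T00:00:00Z \
        --end-time 2024-01-01T01:00:00Z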

    hashtag
    Isolation and security

    As a managed service, Amazon RDS provides a high level of security for your MySQL databases. This includes network isolation using Amazon Virtual Private Cloud (VPC)arrow-up-right, encryption at rest using keys you create and control through Amazon Key Management Service (KMS)arrow-up-right, and encryption of data in transit using SSL.

    AWS Elastic Transcode

    hashtag
    What is Elastic Transcoder?

    • Elastic Transcoder is an AWS service used to convert media files stored in an S3 bucket into media files in different formats supported by different devices.

    • Elastic Transcoder is a media transcoder in the cloud.

    • It is used to convert media files from their original source format into different formats that will play on smartphones, tablets, PCs, and so on.

    • It provides transcoding presets for popular output formats, which means that you don't need to guess which settings work best on particular devices.

    • If you use Elastic Transcoder, you pay based on the minutes that you transcode and the resolution at which you transcode.

    hashtag
    Components of Elastic Transcoder

    Elastic Transcoder consists of four components:

    • Jobs

    • Pipelines

    • Presets

    • Notifications

    • Jobs – The main task of a job is to do the transcoding work. Each job can convert a file into up to 30 formats. For example, if you want to convert a media file into eight different formats, a single job creates files in all eight formats. When you create a job, you specify the name of the file that you want to transcode.

    • Pipelines – Pipelines are the queues that hold your transcoding jobs. When you create a job, you specify the pipeline to which you want to add it. If you want a job to create more than one format, Elastic Transcoder creates the files for each format in the order you specify the formats in the job. You can create two kinds of pipelines, standard-priority and high-priority; most jobs go into a standard-priority pipeline, and the high-priority pipeline is used when you want a file transcoded immediately.

    • Presets – Presets are the templates that contain the settings for transcoding a media file from one format to another. Elastic Transcoder includes default presets for common formats, and you can also create your own presets for formats that are not covered by the defaults. You specify which preset to use when you create a job.

    • Notifications – Notifications are an optional feature that keeps you updated on the status of your job: when Elastic Transcoder starts processing the job, when it finishes, and whether it encounters any error condition. You configure notifications when you create a pipeline.
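    A sketch of submitting a transcoding job from the CLI (the pipeline ID, object keys, and preset ID below are placeholders):

    aws elastictranscoder create-job \
        --pipeline-id 1111111111111-abcde1 \
        --input '{"Key":"inputs/video.mp4"}' \
        --outputs '[{"Key":"outputs/video-720p.mp4","PresetId":"1351620000001-000010"}]'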

    hashtag
    How A Cloud Uses Elastic Transcoder

    Suppose you upload an mp4 file to an S3 bucket. As soon as the upload completes, it triggers a Lambda function, which then invokes Elastic Transcoder. Elastic Transcoder converts the mp4 file into different formats so that the file can be played on an iPhone, a laptop, and so on. Once transcoding completes, it stores the transcoded files in an S3 bucket.

    Amazon Aurora RDS

    MySQL and PostgreSQL-compatible relational database built for the cloud.

    Amazon Aurora is a MySQL and PostgreSQL-compatible relational database built for the cloud, that combines the performance and availability of traditional enterprise databases with the simplicity and cost-effectiveness of open source databases.

    Amazon Aurora is up to five times faster than standard MySQL databases and three times faster than standard PostgreSQL databases. Amazon Aurora is fully managed by Amazon Relational Database Service (RDS), which automates time-consuming administration tasks like hardware provisioning, database setup, patching, and backups.

    Amazon Aurora features a distributed, fault-tolerant, self-healing storage system that auto-scales up to 128TB per database instance. It delivers high performance and availability with up to 15 low-latency read replicas, point-in-time recovery, continuous backup to Amazon S3, and replication across three Availability Zones (AZs).

    Visit the Amazon RDS Management Consolearrow-up-right to create your first Aurora database instance and start migrating your MySQL and PostgreSQL databases.

    hashtag
    Benefits

    hashtag
    High Performance and Scalability

    Get 5X the throughput of standard MySQL and 3X the throughput of standard PostgreSQL. You can easily scale your database deployment up and down from smaller to larger instance types as your needs change, or let Aurora Serverless handle scaling automatically for you. To scale read capacity and performance, you can add up to 15 low latency read replicas across three Availability Zones. Amazon Aurora automatically grows storage as needed, up to 128TB per database instance.
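    A sketch of provisioning an Aurora MySQL cluster and its first instance from the CLI (identifiers and credentials are placeholders):

    aws rds create-db-cluster \
        --db-cluster-identifier my-aurora-cluster \
        --engine aurora-mysql \
        --master-username admin \
        --master-user-password 'ChangeMe123!'

    aws rds create-db-instance \
        --db-instance-identifier my-aurora-instance-1 \
        --db-cluster-identifier my-aurora-cluster \
        --db-instance-class db.r5.large \
        --engine aurora-mysql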

    hashtag
    High Availability and Durability

    Amazon Aurora is designed to offer greater than 99.99% availability, replicating 6 copies of your data across 3 Availability Zones and backing up your data continuously to Amazon S3. It transparently recovers from physical storage failures; instance failover typically takes less than 30 seconds.

    hashtag
    Highly Secure

    Amazon Aurora provides multiple levels of security for your database. These include network isolation using Amazon VPCarrow-up-right, encryption at rest using keys you create and control through Amazon Key Management Service (KMS)arrow-up-right, and encryption of data in transit using SSL. On an encrypted Amazon Aurora instance, data in the underlying storage is encrypted, as are the automated backups, snapshots, and replicas in the same cluster.

    hashtag
    MySQL and PostgreSQL Compatible

    The Amazon Aurora database engine is fully compatible with existing MySQL and PostgreSQL open source databases, and adds compatibility for new releases regularly. This means you can easily migrate MySQL or PostgreSQL databases to Aurora using standard MySQL or PostgreSQL import/export tools or snapshots. It also means the code, applications, drivers, and tools you already use with your existing databases can be used with Amazon Aurora with little or no change.

    hashtag
    Fully Managed

    Amazon Aurora is fully managed by Amazon Relational Database Service (RDS). You no longer need to worry about database management tasks such as hardware provisioning, software patching, setup, configuration, or backups. Aurora automatically and continuously monitors and backs up your database to Amazon S3, enabling granular point-in-time recovery. You can monitor database performance using Amazon CloudWatch, Enhanced Monitoring, or Performance Insights, an easy-to-use tool that helps you quickly detect performance problems.

    hashtag
    Migration Support

    MySQL and PostgreSQL compatibility make Amazon Aurora a compelling target for database migrations to the cloud. If you're migrating from MySQL or PostgreSQL, see our migration documentationarrow-up-right for a list of tools and options. To migrate from commercial database engines, you can use the Amazon Database Migration Servicearrow-up-right for a secure migration with minimal downtime.

    Amazon RDS - Event Notifications

    Throughout the life cycle of Amazon RDS DB instances, many DB events occur that you will want to know about as they happen. For example, a backup of the DB instance has started, or an error has occurred while restarting MySQL or MariaDB.

    hashtag
    Notification Categories

    Based on the nature of the event, notifications can be classified into the following categories.

    | Category | Example |
    | --- | --- |
    | Availability | The DB instance is restarting or undergoing a controlled shutdown. |
    | Backup | A backup of the DB instance has started, or it is complete. |
    | Configuration change | The DB instance class for this DB instance is being changed, or it is being converted to a Single-AZ DB instance. |
    | Failover | The instance has recovered from a partial failover. |
    | Failure | The DB instance has failed due to an incompatible configuration. |
    | Notification | Patching of the DB instance has completed. |
    | Recovery | Recovery of the DB instance is complete. |
    | Restoration | The DB instance has been restored from a DB snapshot. |

    hashtag
    Creating Event Notifications

    Below are the steps to create event subscriptions through which the notifications are sent to the subscriber.

    hashtag
    Step-1

    Choose the Event subscription tab from the RDS dashboard.

    hashtag
    Step-2

    We give a name to the event and choose the subscription source.

    Then choose the source.

    hashtag
    Step-3

    In the next step, we review the details of the source type chosen for the subscription.
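    The same subscription can be created from the CLI; a sketch with a placeholder SNS topic ARN and instance identifier:

    aws rds create-event-subscription \
        --subscription-name my-rds-events \
        --sns-topic-arn arn:aws:sns:us-east-1:123456789012:rds-events \
        --source-type db-instance \
        --source-ids mydb \
        --event-categories backup failure recovery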

    Amazon SQS

    hashtag
    Distributed queues

    There are three main parts in a distributed messaging system: the components of your distributed system, your queue (distributed on Amazon SQS servers), and the messages in the queue.

    In the following scenario, your system has several producers (components that send messages to the queue) and consumers (components that receive messages from the queue). The queue (which holds messages A through E) redundantly stores the messages across multiple Amazon SQS servers.

    hashtag
    Message lifecycle

    The following scenario describes the lifecycle of an Amazon SQS message in a queue, from creation to deletion.

    A producer (component 1) sends message A to a queue, and the message is distributed across the Amazon SQS servers redundantly.

    When a consumer (component 2) is ready to process messages, it consumes messages from the queue, and message A is returned. While message A is being processed, it remains in the queue and isn't returned to subsequent receive requests for the duration of the visibility timeout.

    The consumer (component 2) deletes message A from the queue to prevent the message from being received and processed again when the visibility timeout expires.
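    The lifecycle above maps onto a handful of CLI calls; a sketch with placeholder names (the queue URL and receipt handle come from the output of the previous commands):

    aws sqs create-queue --queue-name my-queue
    aws sqs send-message --queue-url https://sqs.us-east-1.amazonaws.com/123456789012/my-queue --message-body "message A"
    aws sqs receive-message --queue-url https://sqs.us-east-1.amazonaws.com/123456789012/my-queue
    aws sqs delete-message --queue-url https://sqs.us-east-1.amazonaws.com/123456789012/my-queue --receipt-handle <receipt-handle-from-receive>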

    Amazon SQS supports two types of queues – standard queues and FIFO queues. Use the information from the following table to choose the right queue for your situation. To learn more about Amazon SQS queues, see the Amazon SQS documentation.

    Standard queues
    FIFO queues

    Amazon RDS - DB Monitoring

    In order to maintain the reliability, availability, and performance of Amazon RDS, we need to collect monitoring data so that we can easily debug a multi-point failure. With Amazon RDS, you can monitor network throughput, I/O for read, write, and/or metadata operations, client connections, and burst credit balances for your DB instances. We should also consider storing historical monitoring data. This stored data will give you a baseline to compare against with current performance data.

    Below are examples of some monitoring data and how they help in maintaining healthy RDS instances.

    • High CPU or RAM consumption – High values for CPU or RAM consumption might be appropriate, provided that they are in keeping with your goals for your application (like throughput or concurrency) and are expected.

    • Disk space consumption – Investigate disk space consumption if space used is consistently at or above 85 percent of the total disk space. See if it is possible to delete data from the instance or archive data to a different system to free up space.

    • Network traffic – For network traffic, talk with your system administrator to understand what expected throughput is for your domain network and Internet connection. Investigate network traffic if throughput is consistently lower than expected.

    • Database connections – Consider constraining database connections if you see high numbers of user connections in conjunction with decreases in instance performance and response time.

    • IOPS metrics – The expected values for IOPS metrics depend on disk specification and server configuration, so use your baseline to know what is typical. Investigate if values are consistently different than your baseline. For best IOPS performance, make sure your typical working set will fit into memory to minimize read and write operations.

    hashtag
    Monitoring with Amazon CloudWatch

    Amazon RDS sends metrics and dimensions to Amazon CloudWatch every minute. We can monitor these metrics from the AWS console as shown in the diagrams below.

    Amazon RDS - Data Import / Export

    Amazon RDS MariaDB provides easy ways of importing data into the DB and exporting data from the DB. After we are able to successfully connect to the MariaDB database we can use CLI tools to run the import and export commands to get the data from other sources in and out of the RDS database.

    Below are the scenarios to consider when deciding on the approach to import data into the Amazon RDS MariaDB database.

    hashtag
    From an Existing MariaDB database

    An existing MariaDB can be present on premise or in another EC2 instance. Diagrammatically what we do is shown below.

    Amazon RDS - DB Access Control

    To access the Amazon RDS DB instance the user needs specific permissions. This is configured using AWS IAM (Identity and Access management). In this tutorial we will see how this configuration is done.

    The configuration involves two parts.

    • Authentication

    • Access Control

    Amazon RDS Multi-AZ with one standby

    hashtag
    Amazon RDS Multi-AZ with one standby

    • Automatic fail over – Support high availability for your application with automatic database failover that completes as quickly as 60 seconds, with zero data loss and no manual intervention.

    • Protect database performance – Avoid suspending I/O activity on your primary during backup by backing up from your standby instance.

    • Enhance durability – Use Amazon RDS Multi-AZ synchronous replication technologies to keep data on your standby database instance up to date with the primary.

    • Increase availability – Enhance availability by deploying a standby instance in a second AZ, and achieve fault tolerance in the event of an AZ or database instance failure.

    Amazon Aurora Serverless

    Amazon Aurora Serverless is an on-demand, autoscaling configuration for Amazon Auroraarrow-up-right. It automatically starts up, shuts down, and scales capacity up or down based on your application's needs. You can run your database in the cloud without managing any database instances. You can also use Aurora Serverless v2 instances along with provisioned instances in your existing or new database clusters.

    Manually managing database capacity can take up valuable time and can lead to inefficient use of database resources. With Aurora Serverless, you create a database, specify the desired database capacity range, and connect your applications. You pay on a per-second basis for the database capacity that you use when the database is active, and migrate between standard and serverless configurations with a few steps in the Amazon Relational Database Service (Amazon RDS) console.

    Amazon Aurora Serverless v2 scales instantly to hundreds of thousands of transactions in a fraction of a second. As it scales, it adjusts capacity in fine-grained increments to provide the right amount of database resources that the application needs. There is no database capacity for you to manage. You pay only for the capacity your application consumes, and you can save up to 90% of your database cost compared to the cost of provisioning capacity for peak load.
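    A sketch of creating an Aurora Serverless v2 cluster with a capacity range of 0.5 to 8 ACUs (all identifiers and credentials are placeholders):

    aws rds create-db-cluster \
        --db-cluster-identifier my-serverless-cluster \
        --engine aurora-mysql \
        --master-username admin \
        --master-user-password 'ChangeMe123!' \
        --serverless-v2-scaling-configuration MinCapacity=0.5,MaxCapacity=8

    aws rds create-db-instance \
        --db-instance-identifier my-serverless-instance-1 \
        --db-cluster-identifier my-serverless-cluster \
        --db-instance-class db.serverless \
        --engine aurora-mysql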

    AWS RDS Automated Backup

    For automated backups to work properly, the following criteria apply.

    • Your DB instance must be in the ACTIVE state for automated backups to occur. Automated backups don't occur while your DB instance is in a state other than ACTIVE, for example STORAGE_FULL.

    • Automated backups and automated snapshots don't occur while a copy is executing in the same region for the same DB instance.

    The first snapshot of a DB instance contains the data for the full DB instance. Subsequent snapshots of the same DB instance are incremental, which means that only the data that has changed after your most recent snapshot is saved.
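    A sketch of enabling a seven-day automated backup window on an existing instance (identifier and window are placeholders):

    aws rds modify-db-instance \
        --db-instance-identifier mydb \
        --backup-retention-period 7 \
        --preferred-backup-window 03:00-04:00 \
        --apply-immediately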

    Kubernetes Architecture

    This comprehensive guide on Kubernetes architecture aims to explain each kubernetes component in detail with illustrations.

    So, If you want to,

    1. Understand kubernetes architecture

    2. Understand the workflow of Kubernetes core components

    kubectl get nodes --show-labels
    NAME      STATUS    ROLES    AGE     VERSION        LABELS
    worker0   Ready     <none>   1d      v1.13.0        ...,kubernetes.io/hostname=worker0
    worker1   Ready     <none>   1d      v1.13.0        ...,kubernetes.io/hostname=worker1
    worker2   Ready     <none>   1d      v1.13.0        ...,kubernetes.io/hostname=worker2
    kubectl label nodes <your-node-name> disktype=ssd
    kubectl apply -f https://k8s.io/examples/pods/pod-nginx-required-affinity.yaml
    kubectl get pods --output=wide
    NAME     READY     STATUS    RESTARTS   AGE    IP           NODE
    nginx    1/1       Running   0          13s    10.200.0.4   worker0
    kubectl apply -f https://k8s.io/examples/pods/pod-nginx-preferred-affinity.yaml
    kubectl get pods --output=wide
    NAME     READY     STATUS    RESTARTS   AGE    IP           NODE
    nginx    1/1       Running   0          13s    10.200.0.4   worker0
    kind: Namespace
    apiVersion: v1
    metadata:
      name: mynamespace
      labels:
        name: mynamespace
    kubectl apply -f test.yaml
    NAME             STATUS   AGE
    default          Active   3d
    kube-node-lease  Active   3d
    kube-public      Active   3d
    kube-system      Active   3d
    mynamespace      Active   1d
    apiVersion: v1
    kind: Pod
    metadata:
      name: mypod
      labels:
        name: mypod
    spec:
      containers:
  - name: mypod
        image: nginx
    kubectl apply -f pod.yaml --namespace=mynamespace
    apiVersion: v1
    kind: Pod
    metadata:
      name: mypod
      namespace: mynamespace
      labels:
        name: mypod
    ...
    kubectl get pods --namespace=mynamespace
    
    {
      "kind": "Node",
      "apiVersion": "v1",
      "metadata": {
        "name": "<node-ip-address>",
        "labels": {
          "name": "<node-logical-name>"
        }
      }
    }
    kubectl taint nodes node1 key1=value1:NoSchedule
    kubectl taint nodes node1 key1=value1:NoSchedule-
    tolerations:
    - key: "key1"
      operator: "Equal"
      value: "value1"
      effect: "NoSchedule"
    tolerations:
    - key: "key1"
      operator: "Exists"
      effect: "NoSchedule"
    apiVersion: v1
    kind: Pod
    metadata:
      name: nginx
      labels:
        env: test
    spec:
      containers:
      - name: nginx
        image: nginx
        imagePullPolicy: IfNotPresent
      tolerations:
      - key: "example-key"
        operator: "Exists"
        effect: "NoSchedule"
    kubectl taint nodes node1 key1=value1:NoSchedule
    kubectl taint nodes node1 key1=value1:NoExecute
    kubectl taint nodes node1 key2=value2:NoSchedule
    tolerations:
    - key: "key1"
      operator: "Equal"
      value: "value1"
      effect: "NoSchedule"
    - key: "key1"
      operator: "Equal"
      value: "value1"
      effect: "NoExecute"
    tolerations:
    - key: "key1"
      operator: "Equal"
      value: "value1"
      effect: "NoExecute"
      tolerationSeconds: 3600
    tolerations:
    - key: "node.kubernetes.io/unreachable"
      operator: "Exists"
      effect: "NoExecute"
      tolerationSeconds: 6000
    $ kubectl cordon node-1
    node/node-1 cordoned
    $ kubectl get nodes
    NAME       STATUS                     ROLES                  AGE   VERSION
    node-1     Ready,SchedulingDisabled   control-plane,master   26m   v1.23.3
    $ kubectl drain node-1
    node/node-1 already cordoned
    evicting pod kube-system/storage-provisioner
    evicting pod default/nginx-7c658794b9-zszdd
    evicting pod kube-system/coredns-64897985d-dp6lx
    pod/storage-provisioner evicted
    pod/nginx-7c658794b9-zszdd evicted
    pod/coredns-64897985d-dp6lx evicted
    node/node-1 evicted
    $ kubectl drain node-1 --grace-period 0
    ...contents omitted...
    readinessProbe:
      httpGet:
        path: /health (1)
        port: 8080
      initialDelaySeconds: 15 (2)
      timeoutSeconds: 1 (3)
    ...contents omitted...
    ...contents omitted...
    livenessProbe:
      exec:
        command:(1)
        - cat
        - /tmp/health
      initialDelaySeconds: 15
      timeoutSeconds: 1
    ...contents omitted...
    ...contents omitted...
    livenessProbe:
      tcpSocket:
        port: 8080 (1)
      initialDelaySeconds: 15
      timeoutSeconds: 1
    ...contents omitted...
    [user@host ~]$ kubectl set probe deployment myapp --readiness \
    --get-url=http://:8080/healthz --period=20
[user@host ~]$ kubectl patch deployment myapp \
-p '{"spec":{"template":{"spec":{"containers":[{"name":"myapp","readinessProbe":null}]}}}}'
    [user@host ~]$ kubectl set probe deployment myapp --readiness \
    --get-url=http://:8080/healthz --period=20
    [user@host ~]$ kubectl set probe deployment myapp --liveness \
    --open-tcp=3306 --period=20 \
    --timeout-seconds=1
    [user@host ~]$ kubectl set probe deployment myapp --liveness \
    --get-url=http://:8080/healthz --initial-delay-seconds=30 \
    --success-threshold=1 --failure-threshold=3
    apiVersion: v1
    kind: Pod
    metadata:
      name: nginx
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: disktype
                operator: In
                values:
                - ssd            
      containers:
      - name: nginx
        image: nginx
        imagePullPolicy: IfNotPresent
    apiVersion: v1
    kind: Pod
    metadata:
      name: nginx
    spec:
      affinity:
        nodeAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 1
            preference:
              matchExpressions:
              - key: disktype
                operator: In
                values:
                - ssd          
      containers:
      - name: nginx
        image: nginx
        imagePullPolicy: IfNotPresent
    Aurora Serverless v2 supports the full range of database workloads, from development and test environments, websites, and applications that have infrequent, intermittent, or unpredictable workloads, to the most demanding, business-critical applications that require high scale and high availability. It supports the full breadth of Aurora features, including global database, Multi-AZ deployments, and read replicas. Aurora Serverless v2 is available for the Amazon Aurora MySQL-Compatible Edition and PostgreSQL-Compatible Edition.

    hashtag
    Benefits

    hashtag
    Highly scalable

    Scale instantly to hundreds of thousands of transactions in a fraction of a second.

    hashtag
    Highly available

    Power your business-critical workloads with the full breadth of Aurora features, including cloning, global database, Multi-AZ, and read replicas.

    hashtag
    Cost effective

    Scales in fine-grained increments to provide just the right amount of database resources, so you pay only for the capacity consumed.

    hashtag
    Simple

    Removes the complexity of provisioning and managing database capacity. The database will scale to match your application’s needs.

    hashtag
    Transparent

    Scale database capacity instantly, without disrupting incoming application requests.

    hashtag
    Durable

    Protects against data loss using the distributed, fault-tolerant, self-healing Aurora storage with six-way replication.

    hashtag
    Use cases

    hashtag
    Variable workloads

    You're running an infrequently-used application, with peaks of 30 minutes to several hours a few times each day or several times per year, such as a human resources, budgeting, or operational reporting application. You no longer have to provision to peak capacity, which would require you to pay for resources you don't continuously use, or to average capacity, which would risk performance problems and a poor user experience.

    hashtag
    Unpredictable workloads

    You're running workloads with database usage throughout the day, along with peaks of activity that are hard to predict. For example, a traffic reporting site might see a surge of activity when it starts raining. Your database automatically scales capacity to meet the needs of the application's peak load and scales back down when the surge of activity is over.

    hashtag
    Enterprise database fleet management NEW

    Enterprises with hundreds or thousands of applications, each backed by one or more databases, must manage resources for their entire database fleet. As application requirements fluctuate, continuously monitoring and adjusting capacity for each and every database to ensure high performance, high availability, and remaining under budget is a daunting task. With Aurora Serverless v2, database capacity is automatically adjusted based on application demand. You no longer need to manually manage thousands of databases in your database fleet. Features such as global database and Multi-AZ deployments ensure high availability and fast recovery.

    hashtag
    Software as a service applications NEW

    Software as a service (SaaS) vendors typically operate hundreds or thousands of Aurora databases, each supporting a different customer, in a single cluster to improve utilization and cost efficiency. But they still need to manage each database individually, including monitoring for and responding to colocated databases in the same cluster that take up more shared resources than originally planned. With Aurora Serverless v2, SaaS vendors can provision Aurora database clusters for each individual customer without worrying about the cost of provisioned capacity. It automatically shuts down databases when they are not in use to save costs and instantly adjusts database capacity to meet changing application requirements.

    hashtag
    Scaled-out databases split across multiple servers NEW

    Customers with high write or read requirements often split databases across several instances to achieve higher throughput. However, customers often provision too many or too few instances, increasing cost or limiting scale. With Aurora Serverless v2, customers split databases across several Aurora instances and let the service adjust capacity instantly and automatically based on need. It seamlessly adjusts capacity for each node with no downtime or disruption, and uses only the amount of capacity needed to support applications.

    hashtag
    Authentication

    It involves creating the username and password and generating access keys for the user. With the help of an access key, it is possible to make programmatic access to the AWS RDS service. The SDK and CLI tools use the access keys to cryptographically sign requests.

    We can also use an IAM role to authenticate a user. The role is not attached to any specific user; rather, any user can assume the role temporarily to complete the required task. After the task is over, the role can be revoked and the user loses the authentication ability.

    hashtag
    Access Control

    After a user is authenticated, a policy attached to that user determines the types of tasks the user can carry out. Below is an example of a policy which allows the creation of an RDS DB instance on a db.t2.micro instance class for the MySQL DB engine.

    hashtag
    Action on Any RDS Resource

    In the below example we see a policy that allows any describe action on any RDS resource. The * symbol is used to represent any resource.

    hashtag
    Disallow deleting a DB Instance

    The below policy disallows a user from deleting a specific DB instance.

    {
        "Version": "2018-09-11",
        "Statement": [
            {
                "Sid": "AllowCreateDBInstanceOnly",
                "Effect": "Allow",
                "Action": [
                    "rds:CreateDBInstance"
                ],
                "Resource": [
                    "arn:aws:rds:*:123456789012:db:test*",
                    "arn:aws:rds:*:123456789012:og:default*",
                    "arn:aws:rds:*:123456789012:pg:default*",
                    "arn:aws:rds:*:123456789012:subgrp:default"
                ],
                "Condition": {
                    "StringEquals": {
                        "rds:DatabaseEngine": "mysql",
                        "rds:DatabaseClass": "db.t2.micro"
                    }
                }
            }
        ]
    }
    {
       "Version":"2012-10-17",
       "Statement":[
          {
             "Sid":"AllowRDSDescribe",
             "Effect":"Allow",
             "Action":"rds:Describe*",
             "Resource":"*"
          }
       ]
    }
    {
       "Version":"2012-10-17",
       "Statement":[
          {
             "Sid":"DenyDelete1",
             "Effect":"Deny",
             "Action":"rds:DeleteDBInstance",
             "Resource":"arn:aws:rds:us-west-2:123456789012:db:my-mysql-instance"
          }
       ]
    }



    hashtag
    Creating a backup from On-Premise DB

    As a first step, we create a backup of the on-premise database using the command below. MariaDB, being a fork of MySQL, can use nearly all the same commands as MySQL.
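    A minimal mysqldump sketch, assuming a database named mydb and a local root user:

    mysqldump -u root -p --databases mydb > backupfile.sql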

    A file named backupfile.sql is created, which contains the table structure along with the data to be used.


    hashtag
    Storing the backup file in S3.

    Upload the backup file created above to a pre-decided Amazon S3 bucket in the same region where the target RDS MariaDB database is present. You can follow this linkarrow-up-right to learn about how to upload.

    hashtag
    Import data from Amazon S3 to RDS- MariaDB database

    You can use the AWS CLI together with the MySQL client to import the data from S3 into the RDS MariaDB database.
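    One way to do this is to download the dump from S3 and replay it with the MySQL client; a sketch, assuming a bucket named my-bucket and a placeholder RDS endpoint:

    aws s3 cp s3://my-bucket/backupfile.sql .
    mysql -h mydb.xxxxxxxxxxxx.us-east-1.rds.amazonaws.com -P 3306 -u admin -p mydb < backupfile.sql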

    hashtag
    From Another RDS- MariaDB Instance

    There may be scenarios when you want data from an existing RDS MariaDB DB to be copied into another RDS MariaDB DB, for example, to create a disaster recovery DB or a DB used only for business reporting. In such scenarios, we create read replicas, which are a copy of their source DB, and then promote that read replica to a new DB instance. Read replicas are used to prevent heavy direct reads from the original source DB while the data is being copied.

    hashtag
    Create a Read Replica
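    A sketch of creating the read replica from the CLI (both identifiers are placeholders):

    aws rds create-db-instance-read-replica \
        --db-instance-identifier mymariadb-replica \
        --source-db-instance-identifier mymariadb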

    hashtag
    Promote a Read replica to DB Instance

    Now that we have the replica, we can promote it to a standalone DB instance. This serves our end goal of importing data from one RDS MariaDB DB into a new one. The following command completes the promotion of a read replica to a DB instance.
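    A sketch, assuming the replica created in the previous step:

    aws rds promote-read-replica \
        --db-instance-identifier mymariadb-replica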

    hashtag
    From Any Database

    In order to import data from any other database into Amazon RDS MariaDB, we have to use the Amazon Data Migration Service, also called Amazon DMS. It uses the Schema Conversion Tool to translate the existing database to the MySQL platform. The diagram below explains the overall process, which works on a replication principle similar to the one described in the previous section.

    hashtag
    Exporting Data from MariaDB

    Exporting data from an Amazon RDS MariaDB DB is a straightforward process that works on the same replication principle we have seen above. Below are the steps to carry out the export process.

    • Start the instance of MariaDB running external to Amazon RDS.

    • Designate the MariaDB DB instance to be the replication source.

    • Use mysqldump to transfer the database from the Amazon RDS instance to the instance external to Amazon RDS.

    Below is the mysqldump command to transfer the data.
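    A sketch that dumps the RDS database to a file and then loads it into the external instance (the endpoints, users, and database name are placeholders):

    mysqldump -h mymariadb.xxxxxxxxxxxx.us-east-1.rds.amazonaws.com -u admin -p mydb > mydb-export.sql
    mysql -h external-host.example.com -u root -p mydb < mydb-export.sql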


    hashtag
    How it works

    In an Amazon RDS Multi-AZ deployment, Amazon RDS automatically creates a primary database (DB) instance and synchronously replicates the data to an instance in a different AZ. When it detects a failure, Amazon RDS automatically fails over to a standby instance without manual intervention.
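    Multi-AZ can be enabled when an instance is created or added to an existing instance; a sketch for a hypothetical instance named mydb:

    aws rds modify-db-instance \
        --db-instance-identifier mydb \
        --multi-az \
        --apply-immediately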

    hashtag
    Amazon RDS Multi-AZ with two readable standbys

    • Automatically fail over in typically under 35 seconds – Automatic failover typically completes in under 35 seconds, with zero data loss and no manual intervention.

    • Use separate endpoints for reads and writes – Route queries to write servers and appropriate read replica standby instances to maximize performance and scalability.

    • Gain up to 2x faster transaction commit latency – Achieve up to 2x improved write latency compared to Multi-AZ with one standby.

    • Increase read capacity – Gain read scalability by distributing traffic across two readable standby instances.

    hashtag
    How it works

    Deploy highly available, durable MySQL or PostgreSQL databases in three AZs using Amazon RDS Multi-AZ with two readable standbys. Gain automatic failovers in typically under 35 seconds, up to 2x faster transaction commit latency compared to Amazon RDS Multi-AZ with one standby, additional read capacity, and a choice of AWS Graviton2– or Intel–based instances for compute.

    hashtag
    Introduction to Amazon RDS Multi-AZ

    Amazon RDS Multi-AZ deployments provide enhanced availability and durability for your Amazon RDS database (DB) instances, making them a natural fit for production database workloads. With two different deployment options, you can customize your workloads for the availability they need.

    hashtag
    Comparison Table

    Amazon RDS Single-AZ or Amazon RDS Multi-AZ with one standby or Amazon RDS Multi-AZ with two readable standbys

    | Feature | Single-AZ | Multi-AZ with one standby | Multi-AZ with two readable standbys |
    | --- | --- | --- | --- |
    | Available engines | Amazon RDS for MariaDB, Amazon RDS for MySQL, Amazon RDS for PostgreSQL | Amazon RDS for MariaDB, Amazon RDS for MySQL, Amazon RDS for PostgreSQL | Amazon RDS for MySQL, Amazon RDS for PostgreSQL |
    | Additional read capacity | None: the read capacity is limited to your primary | None: your standby DB instance is only a passive failover target for high availability | Two standby DB instances act as failover targets and serve read traffic; read capacity is determined by the overhead of write transactions from the primary |
    | Lower latency (higher throughput) for transaction commits | No | No | Yes: up to 2x faster transaction commits compared to Multi-AZ with one standby |


    The following diagram shows how we can configure the automated backup window for a DB instance using the AWS console. To disable the automated backup, set the number of days to zero.

    hashtag
    Manual Backup

    We can also take backups manually by using snapshots. To take a snapshot manually we use the instance action option after selecting the instance as shown below.

    We can also take a manual backup using the CLI with the command below.

    aws rds create-db-snapshot \
        --db-instance-identifier sourcedbinstance \
        --db-snapshot-identifier dbsnapshotinstance
    Then you’ll love this guide.

    TABLE OF CONTENTS

    1. Kubernetes Architecturearrow-up-right

      1. Control Planearrow-up-right

      2. Worker Nodearrow-up-right

    Note: To understand Kubernetes architecture better, there are some prerequisites. Please check the prerequisites in the kubernetes learning guidearrow-up-right to know more.

    hashtag
    Kubernetes Architecture

    The following Kubernetes architecture diagram shows all the components of the Kubernetes cluster and how external systems connect to the Kubernetes cluster.

    The first and foremost thing you should understand about Kubernetes is, it is a distributed system. Meaning, it has multiple components spread across different servers over a network. These servers could be Virtual machines or bare metal servers. We call it a Kubernetes cluster.

    A Kubernetes cluster consists of control plane nodes and worker nodes.

    hashtag
    Control Plane

    The control plane is responsible for container orchestration and maintaining the desired state of the cluster. It has the following components.

    1. kube-apiserver

    2. etcd

    3. kube-scheduler

    4. kube-controller-manager

    5. cloud-controller-manager

    hashtag
    Worker Node

    The Worker nodes are responsible for running containerized applications. The worker Node has the following components.

    1. kubelet

    2. kube-proxy

    3. Container runtime

    hashtag
    Kubernetes Control Plane Components

    First, let’s take a look at each control plane component and the important concepts behind each component.

    hashtag
    1. kube-apiserver

    The kube-api server is the central hub of the Kubernetes cluster that exposes the Kubernetes API.

    End users, and other cluster components, talk to the cluster via the API server. Very rarely monitoring systems and third-party services may talk to API servers to interact with the cluster.

    So when you use kubectl to manage the cluster, at the backend you are actually communicating with the API server through HTTP REST APIs. However, the internal cluster components like the scheduler, controller, etc talk to the API server using gRPCarrow-up-right.

    The communication between the API server and other components in the cluster happens over TLS to prevent unauthorized access to the cluster.
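    For example, you can call the REST API directly through kubectl; a minimal sketch against a standard core-API path:

    kubectl get --raw /api/v1/namespaces/default/pods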

    Kubernetes api-server is responsible for the following

    1. API management: Exposes the cluster API endpoint and handles all API requests.

    2. Authentication (Using client certificates, bearer tokens, and HTTP Basic Authentication) and Authorization (ABAC and RBAC evaluation)

    3. Processing API requests and validating data for the API objects like pods, services, etc. (Validation and Mutation Admission controllers)

    4. It is the only component that communicates with etcd.

    5. api-server coordinates all the processes between the control plane and worker node components.

    6. api-server has a built-in bastion apiserver proxy. It is part of the API server process. It is primarily used to enable access to ClusterIP services from outside the cluster, even though these services are typically only reachable within the cluster itself.

    Note: To reduce the cluster attack surface, it is crucial to secure the API server. The Shadowserver Foundation has conducted an experiment that discovered 380 000 publicly accessible Kubernetes API serversarrow-up-right.

    hashtag
    2. etcd

    Kubernetes is a distributed system and it needs an efficient distributed database like etcd that supports its distributed nature. It acts as both a backend service discovery and a database. You can call it the brain of the Kubernetes cluster.

    etcdarrow-up-right is an open-source strongly consistent, distributed key-value store. So what does it mean?

    1. Strongly consistent: If an update is made to a node, strong consistency ensures it gets updated on all the other nodes in the cluster immediately. Also, if you look at the CAP theoremarrow-up-right, achieving 100% availability together with strong consistency and partition tolerance is impossible.

    2. Distributed: etcd is designed to run on multiple nodes as a cluster without sacrificing consistency.

    3. Key Value Store: A nonrelational database that stores data as keys and values. It also exposes a key-value API. The datastore is built on top of bbolt, which is a fork of BoltDB.

    etcd uses raft consensus algorithm arrow-up-rightfor strong consistency and availability. It works in a leader-member fashion for high availability and to withstand node failures.

    So how does etcd work with Kubernetes?

    To put it simply, when you use kubectl to get kubernetes object details, you are getting it from etcd. Also, when you deploy an object like a pod, an entry gets created in etcd.

    In a nutshell, here is what you need to know about etcd.

    1. etcd stores all configurations, states, and metadata of Kubernetes objects (pods, secrets, daemonsets, deploymentsarrow-up-right, configmaps, statefulsets, etc).

    2. etcd allows a client to subscribe to events using Watch() API . Kubernetes api-server uses the etcd’s watch functionality to track the change in the state of an object.

    3. etcd exposes a key-value API over gRPC. Also, the gRPC gateway is a RESTful proxy that translates all the HTTP API calls into gRPC messages. This makes it an ideal database for Kubernetes.

    4. etcd stores all objects under the /registry directory key in key-value format. For example, information on a pod named Nginx in the default namespace can be found under /registry/pods/default/nginx
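    A sketch of listing those keys with etcdctl from a control plane node (the certificate paths below are the kubeadm defaults and may differ in your cluster); the stored values are protobuf-encoded, so listing keys is usually the most readable option:

    ETCDCTL_API=3 etcdctl \
      --endpoints=https://127.0.0.1:2379 \
      --cacert=/etc/kubernetes/pki/etcd/ca.crt \
      --cert=/etc/kubernetes/pki/etcd/server.crt \
      --key=/etc/kubernetes/pki/etcd/server.key \
      get /registry/pods/default/ --prefix --keys-only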

    Also, etcd is the only stateful component in the control plane.

    hashtag
    3. kube-scheduler

    The kube-scheduler is responsible for scheduling pods on worker nodes.

    When you deploy a pod, you specify the pod requirements such as CPU, memory, affinity, taints or tolerations, priority, persistent volumes (PV), etc. The scheduler’s primary task is to identify the create request and choose the best node for a pod that satisfies the requirements.

    The following image shows a high-level overview of how the scheduler works.

    In a Kubernetes cluster, there will be more than one worker node. So how does the scheduler select the node out of all worker nodes?

    Here is how the scheduler works.

    1. To choose the best node, the Kube-scheduler uses filtering and scoring operations.

    2. In filtering, the scheduler finds the nodes where the pod can feasibly be scheduled. For example, if five worker nodes have the resources available to run the pod, it selects all five. If no node fits, the pod is unschedulable and is moved back to the scheduling queue. In a large cluster, say 100 worker nodes, the scheduler does not iterate over every node: a configuration parameter called percentageOfNodesToScore (typically 50% by default) limits how many nodes are considered, and the scheduler iterates over them in a round-robin fashion. If the worker nodes are spread across multiple zones, the scheduler iterates over nodes in different zones. For very large clusters the default percentageOfNodesToScore drops to 5% (a configuration sketch for this parameter is shown after this list).

3. In the scoring phase, the scheduler ranks the nodes by assigning a score to the filtered worker nodes. It does the scoring by calling multiple scheduling plugins. Finally, the worker node with the highest rank is selected for scheduling the pod. If all the nodes have the same rank, a node is selected at random.

    4. Once the node is selected, the scheduler creates a binding event in the API server. Meaning an event to bind a pod and node.

Things you should know about the scheduler.

    1. It is a controller that listens to pod creation events in the API server.

    2. The scheduler has two phases. Scheduling cycle and the Binding cycle. Together it is called the scheduling context. The scheduling cycle selects a worker node and the binding cycle applies that change to the cluster.

3. The scheduler always places high-priority pods ahead of low-priority pods for scheduling. Also, in some cases, after a pod has started running on the selected node, it might get evicted or moved to other nodes. If you want to understand more, read the Kubernetes documentation on pod priority and preemption.

4. You can create custom schedulers and run multiple schedulers in a cluster alongside the native scheduler. When you deploy a pod, you can specify the custom scheduler in the pod manifest (see the example after this list), so the scheduling decisions are taken based on the custom scheduler logic.

5. The scheduler has a pluggable scheduling framework, meaning you can add your own custom plugins to the scheduling workflow.
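Here is a minimal sketch of the custom-scheduler point above. The schedulerName field is standard Kubernetes; the name my-custom-scheduler is hypothetical and only takes effect if such a scheduler is actually deployed in the cluster:

# Assign a pod to a hypothetical custom scheduler instead of the default one
cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: custom-scheduled-pod
spec:
  schedulerName: my-custom-scheduler   # defaults to "default-scheduler" when omitted
  containers:
    - name: app
      image: nginx
EOF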

    hashtag
    4. Kube Controller Manager

What is a controller? Controllersarrow-up-right are programs that run infinite control loops, meaning they run continuously, watching the actual and desired state of objects. If there is a difference between the actual and desired state, they ensure that the Kubernetes resource/object reaches the desired state.

    As per the official documentation,

    In Kubernetes, controllers are control loops that watch the state of your cluster, then make or request changes where needed. Each controller tries to move the current cluster state closer to the desired state.

    Let’s say you want to create a deployment, you specify the desired state in the manifest YAML file (declarative approach). For example, 2 replicas, one volume mount, configmap, etc. The in-built deployment controller ensures that the deployment is in the desired state all the time. If a user updates the deployment with 5 replicas, the deployment controller recognizes it and ensures the desired state is 5 replicas.
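A minimal sketch of that reconciliation loop using kubectl (the deployment name and image are illustrative):

# Declare a desired state of 2 replicas, then change it to 5
kubectl create deployment web --image=nginx --replicas=2
kubectl scale deployment web --replicas=5
# The deployment and replicaset controllers converge the actual state; READY eventually shows 5/5
kubectl get deployment web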

The Kube controller manager is a component that manages all the Kubernetes controllers. Kubernetes resources/objects like pods, namespaces, jobs, and replicasets are managed by their respective controllers. (The kube-scheduler is also a controller, but it runs as its own control plane component rather than under the Kube controller manager.)

    Following is the list of important built-in Kubernetes controllers.

    1. Deployment controller

    2. Replicaset controller

    3. DaemonSet controller

4. Job controller

    5. CronJob Controller

    6. endpoints controller

    7. namespace controller

8. Service accounts controller

    9. Node controller

    Here is what you should know about the Kube controller manager.

    1. It manages all the controllers and the controllers try to keep the cluster in the desired state.

    2. You can extend kubernetes with custom controllers associated with a custom resource definition.

    hashtag
    5. Cloud Controller Manager (CCM)

    When kubernetes is deployed in cloud environments, the cloud controller manager acts as a bridge between Cloud Platform APIs and the Kubernetes cluster.

This way, the core Kubernetes components can work independently and allow the cloud providers to integrate with Kubernetes using plugins (for example, an interface between the Kubernetes cluster and the AWS cloud API).

Cloud controller integration allows a Kubernetes cluster to provision cloud resources like instances (for nodes), load balancers (for services), and storage volumes (for persistent volumes).

    Cloud Controller Manager contains a set of cloud platform-specific controllers that ensure the desired state of cloud-specific components (nodes, Loadbalancers, storage, etc). Following are the three main controllers that are part of the cloud controller manager.

1. Node controller: This controller updates node-related information by talking to the cloud provider API, for example node labeling and annotation, getting the hostname, CPU and memory availability, node health, etc.

    2. Route controller: It is responsible for configuring networking routes on a cloud platform. So that pods in different nodes can talk to each other.

    3. Service controller: It takes care of deploying load balancers for kubernetes services, assigning IP addresses, etc.

    Following are some of the classic examples of cloud controller manager.

    1. Deploying Kubernetes Service of type Load balancer. Here Kubernetes provisions a Cloud-specific Loadbalancer and integrates with Kubernetes Service.

    2. Provisioning storage volumes (PV) for pods backed by cloud storage solutions.

    Overall Cloud Controller Manager manages the lifecycle of cloud-specific resources used by kubernetes.
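For example, here is a minimal sketch of the Service-of-type-LoadBalancer case, assuming the cluster runs on a cloud provider with a cloud controller manager configured (the deployment name is illustrative):

# Expose a deployment through a cloud load balancer
kubectl expose deployment web --type=LoadBalancer --port=80 --target-port=80
# EXTERNAL-IP stays <pending> until the service controller provisions the cloud load balancer
kubectl get service web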

    hashtag
    Kubernetes Worker Node Components

    Now let’s look at each of the worker node components.

    hashtag
    1. Kubelet

Kubelet is an agent component that runs on every node in the cluster. It does not run as a container; instead, it runs as a daemon managed by systemd.

    It is responsible for registering worker nodes with the API server and working with the podSpec (Pod specification – YAML or JSON) primarily from the API server. podSpec defines the containers that should run inside the pod, their resources (e.g. CPU and memory limits), and other settings such as environment variables, volumes, and labels.

    It then brings the podSpec to the desired state by creating containers.

    To put it simply, kubelet is responsible for the following.

    1. Creating, modifying, and deleting containers for the pod.

2. Responsible for handling liveness, readiness, and startup probes.

    3. Responsible for Mounting volumes by reading pod configuration and creating respective directories on the host for the volume mount.

    4. Collecting and reporting Node and pod status via calls to the API server.

    Kubelet is also a controller where it watches for pod changes and utilizes the node’s container runtime to pull images, run containers, etc.

    Other than PodSpecs from the API server, kubelet can accept podSpec from a file, HTTP endpoint, and HTTP server. A good example of “podSpec from a file” is Kubernetes static pods.

Static pods are controlled by the kubelet, not the API server.

This means you can create pods by providing a pod YAML location to the kubelet component. However, static pods created by the kubelet are not managed by the API server.

    Here is a real-world example use case of the static pod.

    While bootstrapping the control plane, kubelet starts the api-server, scheduler, and controller manager as static pods from podSpecs located at /etc/kubernetes/manifests
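You can see this on a kubeadm-provisioned control plane node; the path below is the kubeadm default and may differ in other setups:

# Static pod manifests picked up directly by kubelet
ls /etc/kubernetes/manifests
# typically: etcd.yaml  kube-apiserver.yaml  kube-controller-manager.yaml  kube-scheduler.yaml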

    Following are some of the key things about kubelet.

    1. Kubelet uses the CRI (container runtime interface) gRPC interface to talk to the container runtime.

    2. It also exposes an HTTP endpoint to stream logs and provides exec sessions for clients.

    3. Uses the CSI (container storage interface) gRPC to configure block volumes.

    4. It uses the CNI plugin configured in the cluster to allocate the pod IP address and set up any necessary network routes and firewall rules for the pod.

    hashtag
    2. Kube proxy

    To understand kube proxy, you need to have a basic knowledge of Kubernetes Service & endpoint objects.

    Service in Kubernetes is a way to expose a set of pods internally or to external traffic. When you create the service object, it gets a virtual IP assigned to it. It is called clusterIP. It is only accessible within the Kubernetes cluster.

    The Endpoint object contains all the IP addresses and ports of pod groups under a Service object. The endpoints controller is responsible for maintaining a list of pod IP addresses (endpoints). The service controller is responsible for configuring endpoints to a service.

    You cannot ping the ClusterIP because it is only used for service discovery, unlike pod IPs which are pingable.

    Now let’s understand Kube Proxy.

    Kube-proxy is a daemon that runs on every node as a daemonsetarrow-up-right. It is a proxy component that implements the Kubernetes Services concept for pods. (single DNS for a set of pods with load balancing). It primarily proxies UDP, TCP, and SCTP and does not understand HTTP.

    When you expose pods using a Service (ClusterIP), Kube-proxy creates network rules to send traffic to the backend pods (endpoints) grouped under the Service object. Meaning, all the load balancing, and service discovery are handled by the Kube proxy.

    So how does Kube-proxy work?

    Kube proxy talks to the API server to get the details about the Service (ClusterIP) and respective pod IPs & ports (endpoints). It also monitors for changes in service and endpoints.

Kube-proxy then uses one of the following modes to create/update rules for routing traffic to pods behind a Service (a quick way to check the configured mode is shown after this list).

1. IPTablesarrow-up-right: It is the default mode. In IPTables mode, the traffic is handled by IPtable rules. In this mode, kube-proxy chooses the backend pod at random for load balancing. Once the connection is established, the requests go to the same pod until the connection is terminated.

    2. IPVSarrow-up-right: For clusters with services exceeding 1000, IPVS offers performance improvement. It supports the following load-balancing algorithms for the backend.

1. rr: round-robin (the default mode)

      2. lc: least connection (smallest number of open connections)

      3. dh: destination hashing

      4. sh: source hashing

      5. sed: shortest expected delay

      6. nq: never queue

    3. Userspace (legacy & not recommended)

4. Kernelspace: This mode is only for Windows systems.
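As the quick check referenced above, here is a minimal sketch assuming a kubeadm cluster where kube-proxy reads its configuration from the kube-proxy ConfigMap in kube-system (an empty mode value means the default, IPTables):

# Show the configured proxy mode
kubectl -n kube-system get configmap kube-proxy -o yaml | grep 'mode:'
# In IPTables mode, inspect the NAT rules kube-proxy programs on a node
sudo iptables -t nat -L KUBE-SERVICES | head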

    If you would like to understand the performance difference between kube-proxy IPtables and IPVS mode, read this articlearrow-up-right.

    Also, you can run a Kubernetes cluster without kube-proxy by replacing it with Ciliumarrow-up-right.

    hashtag
    3. Container Runtime

    You probably know about Java Runtime (JRE)arrow-up-right. It is the software required to run Java programs on a host. In the same way, container runtime is a software component that is required to run containers.

    Container runtime runs on all the nodes in the Kubernetes cluster. It is responsible for pulling images from container registries, running containers, allocating and isolating resources for containers, and managing the entire lifecycle of a container on a host.

    To understand this better, let’s take a look at two key concepts:

    1. Container Runtime Interface (CRI): It is a set of APIs that allows Kubernetes to interact with different container runtimes. It allows different container runtimes to be used interchangeably with Kubernetes. The CRI defines the API for creating, starting, stopping, and deleting containers, as well as for managing images and container networks.

2. Open Container Initiative (OCI): It is a set of standards for container formats and runtimes.

    Kubernetes supports multiple container runtimesarrow-up-right (CRI-O, Docker Engine, containerd, etc) that are compliant with Container Runtime Interface (CRI)arrow-up-right. This means, all these container runtimes implement the CRI interface and expose gRPC CRI APIsarrow-up-right (runtime and image service endpoints).

    So how does Kubernetes make use of the container runtime?

    As we learned in the Kubelet section, the kubelet agent is responsible for interacting with the container runtime using CRI APIs to manage the lifecycle of a container. It also gets all the container information from the container runtime and provides it to the control plane.

    Let’s take an example of CRI-Oarrow-up-right container runtime interface. Here is a high-level overview of how container runtime works with kubernetes.

    1. When there is a new request for a pod from the API server, the kubelet talks to CRI-O daemon to launch the required containers via Kubernetes Container Runtime Interface.

    2. CRI-O checks and pulls the required container image from the configured container registry using containers/image library.

    3. CRI-O then generates OCI runtime specification (JSON) for a container.

    4. CRI-O then launches an OCI-compatible runtime (runc) to start the container process as per the runtime specification.
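To see what the runtime is managing on a node, you can query it directly with crictl. A minimal sketch, assuming crictl is installed and using CRI-O's default socket path (the socket path differs for containerd and other runtimes):

# List containers and images known to the container runtime
sudo crictl --runtime-endpoint unix:///var/run/crio/crio.sock ps
sudo crictl --runtime-endpoint unix:///var/run/crio/crio.sock images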

    hashtag
    Kubernetes Cluster Addon Components

    Apart from the core components, the kubernetes cluster needs addon components to be fully operational. Choosing an addon depends on the project requirements and use cases.

    Following are some of the popular addon components that you might need on a cluster.

    1. CNI Plugin (Container Network Interface)

    2. CoreDNS (For DNS server): CoreDNS acts as a DNS server within the Kubernetes cluster. By enabling this addon, you can enable DNS-based service discovery.

    3. Metrics Server (For Resource Metrics): This addon helps you collect performance data and resource usage of Nodes and pods in the cluster.

4. Web UI (Kubernetes Dashboard): This addon enables the Kubernetes Dashboard for managing cluster objects via a web UI.

    hashtag
    1. CNI Plugin

First, you need to understand the Container Networking Interface (CNI)arrow-up-right.

    It is a plugin-based architecture with vendor-neutral specificationsarrow-up-right and libraries for creating network interfaces for Containers.

It is not specific to Kubernetes. With CNI, container networking can be standardized across container orchestration tools like Kubernetes, Mesos, CloudFoundry, Podman, Docker, etc.

When it comes to container networking, companies might have different requirements such as network isolation, security, encryption, etc. As container technology advanced, many network providers created CNI-based solutions for containers with a wide range of networking capabilities. These are called CNI plugins.

    This allows users to choose a networking solution that best fits their needs from different providers.

    How does CNI Plugin work with Kubernetes?

    1. The Kube-controller-manager is responsible for assigning pod CIDR to each node. Each pod gets a unique IP address from the pod CIDR.

    2. Kubelet interacts with container runtime to launch the scheduled pod. The CRI plugin which is part of the Container runtime interacts with the CNI plugin to configure the pod network.

    3. CNI Plugin enables networking between pods spread across the same or different nodes using an overlay network.
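On a worker node you can usually see the installed CNI plugin in two conventional locations. A minimal sketch (these are the common default paths and may vary by distribution):

# CNI network configuration files written by the installed plugin
ls /etc/cni/net.d/
# CNI plugin binaries invoked by the container runtime
ls /opt/cni/bin/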

    Following are high-level functionalities provided by CNI plugins.

    1. Pod Networking

    2. Pod network security & isolation using Network Policies to control the traffic flow between pods and between namespaces.

    Some popular CNI plugins include:

    1. Calico

    2. Flannel

    3. Weave Net

    4. Cilium (Uses eBPF)

    5. Amazon VPC CNI (For AWS VPC)

6. Azure CNI (For Azure Virtual Network)

Kubernetes networking is a big topic and it differs based on the hosting platform.

successThreshold – the number of consecutive successful results needed to switch the probe status to “Success” (default: 1)

failureThreshold – the number of consecutive failed results needed to switch the probe status to “Failure” (default: 3)


Amazon SQS Standard queues provide:

Unlimited Throughput – Standard queues support a nearly unlimited number of API calls per second, per API action (SendMessage, ReceiveMessage, or DeleteMessage).

    At-Least-Once Delivery – A message is delivered at least once, but occasionally more than one copy of a message is delivered.

    Best-Effort Ordering – Occasionally, messages are delivered in an order different from which they were sent.

Amazon SQS FIFO (First-In-First-Out) queues provide:

High Throughput – If you use batchingarrow-up-right, FIFO queues support up to 3,000 calls per second, per API method (SendMessageBatch, ReceiveMessage, or DeleteMessageBatch). The 3,000 calls per second represent 300 API calls, each with a batch of 10 messages. To request a quota increase, submit a support requestarrow-up-right. Without batching, FIFO queues support up to 300 API calls per second, per API method (SendMessage, ReceiveMessage, or DeleteMessage).

    Exactly-Once Processing – A message is delivered once and remains available until a consumer processes and deletes it. Duplicates aren't introduced into the queue.

    First-In-First-Out Delivery – The order in which messages are sent and received is strictly preserved.

Use Standard queues to send data between applications when the throughput is important, for example:

    • Decouple live user requests from intensive background work: let users upload media while resizing or encoding it.

    • Allocate tasks to multiple worker nodes: process a high number of credit card validation requests.

    • Batch messages for future processing: schedule multiple entries to be added to a database.

Use FIFO queues to send data between applications when the order of events is important, for example:

    • Make sure that user-entered commands are run in the right order.

    • Display the correct product price by sending price modifications in the right order.

    • Prevent a student from enrolling in a course before registering for an account.
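Here is a minimal sketch of working with a FIFO queue using the AWS CLI; the queue name, account ID, and region are illustrative:

# Create a FIFO queue (the name must end in .fifo)
aws sqs create-queue --queue-name orders.fifo \
  --attributes FifoQueue=true,ContentBasedDeduplication=true
# Send a message; messages with the same group ID are delivered in order
aws sqs send-message \
  --queue-url https://sqs.us-east-1.amazonaws.com/123456789012/orders.fifo \
  --message-body '{"orderId": 42}' \
  --message-group-id customer-1
# Receive (and later delete) the message
aws sqs receive-message \
  --queue-url https://sqs.us-east-1.amazonaws.com/123456789012/orders.fifo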


    AWS S3

    Amazon S3 has various features you can use to organize and manage your data in ways that support specific use cases, enable cost efficiencies, enforce security, and meet compliance requirements. Data is stored as objects within resources called “buckets”, and a single object can be up to 5 terabytes in size. S3 features include capabilities to append metadata tags to objects, move and store data across the S3 Storage Classes, configure and enforce data access controls, secure data against unauthorized users, run big data analytics, monitor data at the object and bucket levels, and view storage usage and activity trends across your organization. Objects can be accessed through S3 Access Points or directly through the bucket hostname.
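A minimal sketch of the basic bucket/object workflow with the AWS CLI (the bucket name is illustrative and must be globally unique):

# Create a bucket, upload an object under a prefix, and list it
aws s3 mb s3://example-docs-bucket-12345
aws s3 cp report.csv s3://example-docs-bucket-12345/reports/2024/report.csv
aws s3 ls s3://example-docs-bucket-12345/reports/ --recursive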

    hashtag
    Storage management and monitoring

    Amazon S3’s flat, non-hierarchical structure and various management features are helping customers of all sizes and industries organize their data in ways that are valuable to their businesses and teams. All objects are stored in S3 buckets and can be organized with shared names called prefixes. You can also append up to 10 key-value pairs called S3 object tags to each object, which can be created, updated, and deleted throughout an object’s lifecycle. To keep track of objects and their respective tags, buckets, and prefixes, you can use an S3 Inventory report that lists your stored objects within an S3 bucket or with a specific prefix, and their respective metadata and encryption status. S3 Inventory can be configured to generate reports on a daily or a weekly basis.

    hashtag
    Storage management

With S3 bucket names, prefixes, object tags, and S3 Inventory, you have a range of ways to categorize and report on your data, and subsequently can configure other S3 features to take action. Whether you store thousands of objects or a billion, S3 Batch Operations makes it simple to manage your data in Amazon S3 at any scale. With S3 Batch Operations, you can copy objects between buckets, replace object tag sets, modify access controls, and restore archived objects from the S3 Glacier Flexible Retrieval and S3 Glacier Deep Archive storage classes, with a single S3 API request or a few steps in the S3 console. You can also use S3 Batch Operations to run AWS Lambda functions across your objects to run custom business logic, such as processing data or transcoding image files. To get started, specify a list of target objects by using an S3 Inventory report or by providing a custom list, and then select the desired operation from a pre-populated menu. When an S3 Batch Operations request is done, you'll receive a notification and a completion report of all changes made. Learn more about S3 Batch Operations by watching the video tutorials.

    Amazon S3 also supports features that help maintain data version control, prevent accidental deletions, and replicate data to the same or a different AWS Region. With S3 Versioning, you can preserve, retrieve, and restore every version of an object stored in Amazon S3, which allows you to recover from unintended user actions and application failures. To prevent accidental deletions, enable Multi-Factor Authentication (MFA) Delete on an S3 bucket. If you try to delete an object stored in an MFA Delete-enabled bucket, it will require two forms of authentication: your AWS account credentials and the concatenation of a valid serial number, a space, and the six-digit code displayed on an approved authentication device, like a hardware key fob or a Universal 2nd Factor (U2F) security key.
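For example, a minimal sketch of turning on S3 Versioning with the AWS CLI (the bucket name is illustrative):

# Enable versioning, then list the versions kept for a prefix
aws s3api put-bucket-versioning \
  --bucket example-docs-bucket-12345 \
  --versioning-configuration Status=Enabled
aws s3api list-object-versions --bucket example-docs-bucket-12345 --prefix reports/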

With S3 Replication, you can replicate objects (and their respective metadata and object tags) to one or more destination buckets in the same or different AWS Regions for reduced latency, compliance, security, disaster recovery, and other use cases. You can configure S3 Cross-Region Replication (CRR) to replicate objects from a source S3 bucket to one or more destination buckets in different AWS Regions. S3 Same-Region Replication (SRR) replicates objects between buckets in the same AWS Region. While live replication like CRR and SRR automatically replicates newly uploaded objects as they are written to your bucket, S3 Batch Replication allows you to replicate existing objects. You can use S3 Batch Replication to backfill a newly created bucket with existing objects, retry objects that were previously unable to replicate, migrate data across accounts, or add new buckets to your data lake. Amazon S3 Replication Time Control (S3 RTC) helps you meet compliance requirements for data replication by providing an SLA and visibility into replication times.

To access replicated data sets in S3 buckets across AWS Regions and accounts, use Amazon S3 Multi-Region Access Points to create a single global endpoint for your applications and clients to use regardless of their location. This global endpoint allows you to build multi-Region applications with the same simple architecture you would use in a single Region, and then to run those applications anywhere in the world. Amazon S3 Multi-Region Access Points can accelerate performance by up to 60% when accessing data sets that are replicated across multiple AWS Regions and accounts. Based on AWS Global Accelerator, S3 Multi-Region Access Points consider factors like network congestion and the location of the requesting application to dynamically route your requests over the AWS network to the lowest latency copy of your data. Using S3 Multi-Region Access Points failover controls, you can fail over between your replicated data sets across AWS Regions, allowing you to shift your S3 data request traffic to an alternate AWS Region within minutes.

You can also enforce write-once-read-many (WORM) policies with S3 Object Lock. This S3 management feature blocks object version deletion during a customer-defined retention period so that you can enforce retention policies as an added layer of data protection or to meet compliance obligations. You can migrate workloads from existing WORM systems into Amazon S3, and configure S3 Object Lock at the object- and bucket-levels to prevent object version deletions prior to a pre-defined Retain Until Date or Legal Hold Date. Objects with S3 Object Lock retain WORM protection, even if they are moved to different storage classes with an S3 Lifecycle policy. To track which objects have S3 Object Lock, you can refer to an S3 Inventory report that includes the WORM status of objects. S3 Object Lock can be configured in one of two modes. When deployed in Governance mode, AWS accounts with specific IAM permissions are able to remove S3 Object Lock from objects. If you require stronger immutability in order to comply with regulations, you can use Compliance mode. In Compliance mode, the protection cannot be removed by any user, including the root account.

    hashtag
    Storage monitoring

In addition to these management capabilities, use Amazon S3 features and other AWS services to monitor and control your S3 resources. Apply tags to S3 buckets to allocate costs across multiple business dimensions (such as cost centers, application names, or owners), then use AWS Cost Allocation Reports to view the usage and costs aggregated by the bucket tags. You can also use Amazon CloudWatch to track the operational health of your AWS resources and configure billing alerts for estimated charges that reach a user-defined threshold. Use AWS CloudTrail to track and report on bucket- and object-level activities, and configure S3 Event Notifications to trigger workflows and alerts or invoke AWS Lambda when a specific change is made to your S3 resources. For example, you can use S3 Event Notifications to automatically transcode media files as they’re uploaded to S3, process data files as they become available, and synchronize objects with other data stores. Additionally, you can verify the integrity of data transferred to and from Amazon S3, and can access the checksum information at any time using the GetObjectAttributes S3 API or an S3 Inventory report. You can choose from four supported checksum algorithms (SHA-1, SHA-256, CRC32, or CRC32C) for data integrity checking on your upload and download requests, depending on your application needs.

Learn more about S3 storage management and monitoring.

    hashtag
    Storage analytics and insights

    hashtag
    S3 Storage Lens

S3 Storage Lens delivers organization-wide visibility into object storage usage and activity trends, and makes actionable recommendations to improve cost-efficiency and apply data protection best practices. S3 Storage Lens is the first cloud storage analytics solution to provide a single view of object storage usage and activity across hundreds, or even thousands, of accounts in an organization, with drill-downs to generate insights at the account, bucket, or even prefix level. Drawing from more than 14 years of experience helping customers optimize their storage, S3 Storage Lens analyzes organization-wide metrics to deliver contextual recommendations to find ways to reduce storage costs and apply best practices on data protection. To learn more, visit the storage analytics and insights page.

    hashtag
    S3 Storage Class Analysis

Amazon S3 Storage Class Analysis analyzes storage access patterns to help you decide when to transition the right data to the right storage class. This Amazon S3 feature observes data access patterns to help you determine when to transition less frequently accessed storage to a lower-cost storage class. You can use the results to help improve your S3 Lifecycle policies. You can configure storage class analysis to analyze all the objects in a bucket. Or, you can configure filters to group objects together for analysis by common prefix, by object tags, or by both prefix and tags. To learn more, visit the storage analytics and insights page.

    hashtag
    Storage classes

    With Amazon S3, you can store data across a range of different S3 storage classes purpose-built for specific use cases and access patterns: S3 Intelligent-Tiering, S3 Standard, S3 Standard-Infrequent Access (S3 Standard-IA), S3 One Zone-Infrequent Access (S3 One Zone-IA), S3 Glacier Instant Retrieval, S3 Glacier Flexible Retrieval, S3 Glacier Deep Archive, and S3 Outposts.

Every S3 storage class supports a specific level of data access at corresponding costs or geographic locations.

    For data with changing, unknown, or unpredictable access patterns, such as data lakes, analytics, or new applications, use S3 Intelligent-Tiering, which automatically optimizes your storage costs. S3 Intelligent-Tiering automatically moves your data between three low latency access tiers optimized for frequent, infrequent, and rare access. When subsets of objects become archived over time, you can activate the archive access tier designed for asynchronous access.

    For more predictable access patterns, you can store mission-critical production data in S3 Standard for frequent access, save costs by storing infrequently accessed data in S3 Standard-IA or S3 One Zone-IA, and archive data at the lowest costs in the archival storage classes — S3 Glacier Instant Retrieval, S3 Glacier Flexible Retrieval, and S3 Glacier Deep Archive. You can use S3 Storage Class Analysis to monitor access patterns across objects to discover data that should be moved to lower-cost storage classes. Then you can use this information to configure an S3 Lifecycle policy that makes the data transfer. S3 Lifecycle policies can also be used to expire objects at the end of their lifecycles.
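A minimal sketch of such a lifecycle policy with the AWS CLI; the bucket name, prefix, and day counts are illustrative:

# Transition objects under logs/ to Glacier Flexible Retrieval after 90 days, expire them after 365 days
aws s3api put-bucket-lifecycle-configuration \
  --bucket example-docs-bucket-12345 \
  --lifecycle-configuration '{
    "Rules": [{
      "ID": "archive-then-expire",
      "Status": "Enabled",
      "Filter": {"Prefix": "logs/"},
      "Transitions": [{"Days": 90, "StorageClass": "GLACIER"}],
      "Expiration": {"Days": 365}
    }]
  }'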

    If you have data residency requirements that can’t be met by an existing AWS Region, you can use the S3 Outposts storage class to store your S3 data on premises using S3 on Outposts.

Learn more by visiting S3 Storage Classes, S3 Storage Class Analysis, and S3 Lifecycle management.

    hashtag
    Access management and security

    hashtag
    Access management

To protect your data in Amazon S3, by default, users only have access to the S3 resources they create. You can grant access to other users by using one or a combination of the following access management features: AWS Identity and Access Management (IAM) to create users and manage their respective access; Access Control Lists (ACLs) to make individual objects accessible to authorized users; bucket policies to configure permissions for all objects within a single S3 bucket; S3 Access Points to simplify managing data access to shared data sets by creating access points with names and permissions specific to each application or sets of applications; and Query String Authentication to grant time-limited access to others with temporary URLs. Amazon S3 also supports audit logs that list the requests made against your S3 resources for complete visibility into who is accessing what data.

    hashtag
    Security

Amazon S3 offers flexible security features to block unauthorized users from accessing your data. Use VPC endpoints to connect to S3 resources from your Amazon Virtual Private Cloud (Amazon VPC) and from on-premises. Amazon S3 encrypts all new data uploads to any bucket (as of January 5, 2023). Amazon S3 supports both server-side encryption (with three key management options) and client-side encryption for data uploads. Use S3 Inventory to check the encryption status of your S3 objects (see storage management for more information on S3 Inventory). S3 Block Public Access is a set of security controls that ensures S3 buckets and objects do not have public access. With a few clicks in the Amazon S3 Management Console, you can apply the S3 Block Public Access settings to all buckets within your AWS account or to specific S3 buckets. Once the settings are applied to an AWS account, any existing or new buckets and objects associated with that account inherit the settings that prevent public access. S3 Block Public Access settings override other S3 access permissions, making it easy for the account administrator to enforce a “no public access” policy regardless of how an object is added, how a bucket is created, or if there are existing access permissions. S3 Block Public Access controls are auditable, provide a further layer of control, and use AWS Trusted Advisor bucket permission checks, AWS CloudTrail logs, and Amazon CloudWatch alarms. You should enable Block Public Access for all accounts and buckets that you do not want publicly accessible.
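For example, a minimal sketch of applying all four Block Public Access settings to a bucket with the AWS CLI (the bucket name is illustrative):

# Block public ACLs and public bucket policies for this bucket
aws s3api put-public-access-block \
  --bucket example-docs-bucket-12345 \
  --public-access-block-configuration \
  BlockPublicAcls=true,IgnorePublicAcls=true,BlockPublicPolicy=true,RestrictPublicBuckets=true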

S3 Object Ownership is a feature that disables Access Control Lists (ACLs), changing ownership for all objects to the bucket owner and simplifying access management for data stored in S3. When you configure the S3 Object Ownership Bucket owner enforced setting, ACLs will no longer affect permissions for your bucket and the objects in it. All access control will be defined using resource-based policies, user policies, or some combination of these. Before you disable ACLs, review your bucket and object ACLs. To identify Amazon S3 requests that required ACLs for authorization, you can use the aclRequired field in Amazon S3 server access logs or AWS CloudTrail.

    Using S3 Access Points that are restricted to a Virtual Private Cloud (VPC), you can easily firewall your S3 data within your private network. Additionally, you can use AWS Service Control Policies to require that any new S3 Access Point in your organization is restricted to VPC-only access.

Access Analyzer for S3 is a feature that helps you simplify permissions management as you set, verify, and refine policies for your S3 buckets and access points. Access Analyzer for S3 monitors your existing bucket access policies to verify that they provide only the required access to your S3 resources. Access Analyzer for S3 evaluates your bucket access policies so that you can swiftly remediate any buckets with access that isn't required. When reviewing results that show potentially shared access to a bucket, you can Block Public Access to the bucket with a single click in the S3 console. For auditing purposes, you can download Access Analyzer for S3 findings as a CSV report. Additionally, the S3 console reports security warnings, errors, and suggestions from IAM Access Analyzer as you author your S3 policies. The console automatically runs more than 100 policy checks to validate your policies. These checks save you time, guide you to resolve errors, and help you apply security best practices.

IAM makes it easier for you to analyze access and reduce permissions to achieve least privilege by providing the timestamp when a user or role last used S3 and the associated actions. Use this “last accessed” information to analyze S3 access, identify unused permissions, and remove them confidently. To learn more, see the AWS IAM documentation.

You can use Amazon Macie to discover and protect sensitive data stored in Amazon S3. Macie automatically gathers a complete S3 inventory and continually evaluates every bucket to alert on any publicly accessible buckets, unencrypted buckets, or buckets shared or replicated with AWS accounts outside of your organization. Then, Macie applies machine learning and pattern matching techniques to the buckets you select to identify and alert you to sensitive data, such as personally identifiable information (PII). As security findings are generated, they are pushed out to Amazon CloudWatch Events, making it easy to integrate with existing workflow systems and to trigger automated remediation with services like AWS Step Functions to take action such as closing a public bucket or adding resource tags.

AWS PrivateLink for Amazon S3 provides private connectivity between Amazon S3 and on-premises environments. You can provision interface VPC endpoints for S3 in your VPC to connect your on-premises applications directly with S3 over AWS Direct Connect or AWS VPN. Requests to interface VPC endpoints for S3 are automatically routed to S3 over the Amazon network. You can set security groups and configure VPC endpoint policies for your interface VPC endpoints for additional access controls.

Learn more by visiting the Amazon S3 access management and security pages.

    hashtag
    Data processing

    hashtag
    S3 Object Lambda

With S3 Object Lambda, you can add your own code to S3 GET, HEAD, and LIST requests to modify and process data as it is returned to an application. You can use custom code to modify the data returned by standard S3 GET requests to filter rows, dynamically resize images, redact confidential data, and much more. You can also use S3 Object Lambda to modify the output of S3 LIST requests to create a custom view of objects in a bucket and S3 HEAD requests to modify object metadata like object name and size. Powered by AWS Lambda functions, your code runs on infrastructure that is fully managed by AWS, eliminating the need to create and store derivative copies of your data or to run expensive proxies, all with no changes required to applications.

    S3 Object Lambda uses AWS Lambda functions to automatically process the output of a standard S3 GET, HEAD, or LIST request. AWS Lambda is a serverless compute service that runs customer-defined code without requiring management of underlying compute resources. With just a few clicks in the AWS Management Console, you can configure a Lambda function and attach it to a S3 Object Lambda Access Point. From that point forward, S3 will automatically call your Lambda function to process any data retrieved through the S3 Object Lambda Access Point, returning a transformed result back to the application. You can author and execute your own custom Lambda functions, tailoring S3 Object Lambda’s data transformation to your specific use case.

    hashtag
    Query in place

    Amazon S3 has a built-in feature and complementary services that query data without needing to copy and load it into a separate analytics platform or data warehouse. This means you can run big data analytics directly on your data stored in Amazon S3. S3 Select is an S3 feature designed to increase query performance by up to 400%, and reduce querying costs as much as 80%. It works by retrieving a subset of an object’s data (using simple SQL expressions) instead of the entire object, which can be up to 5 terabytes in size.
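A minimal sketch of an S3 Select query with the AWS CLI; the bucket, key, and column names are illustrative:

# Retrieve only the matching rows and columns of a CSV object
aws s3api select-object-content \
  --bucket example-docs-bucket-12345 \
  --key reports/2024/report.csv \
  --expression "SELECT s.product, s.price FROM S3Object s WHERE CAST(s.price AS FLOAT) > 100" \
  --expression-type SQL \
  --input-serialization '{"CSV": {"FileHeaderInfo": "USE"}}' \
  --output-serialization '{"CSV": {}}' \
  output.csv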

    Amazon S3 is also compatible with AWS analytics services Amazon Athena and Amazon Redshift Spectrum. Amazon Athena queries your data in Amazon S3 without needing to extract and load it into a separate service or platform. It uses standard SQL expressions to analyze your data, delivers results within seconds, and is commonly used for ad hoc data discovery. Amazon Redshift Spectrum also runs SQL queries directly against data at rest in Amazon S3, and is more appropriate for complex queries and large data sets (up to exabytes). Because Amazon Athena and Amazon Redshift share a common data catalog and data formats, you can use them both against the same data sets in Amazon S3.

Learn more by visiting the S3 Select and query in place pages.

    hashtag
    Data transfer

    AWS provides a portfolio of data transfer services to provide the right solution for any data migration project. The level of connectivity is a major factor in data migration, and AWS has offerings that can address your hybrid cloud storage, online data transfer, and offline data transfer needs.

Hybrid cloud storage: AWS Storage Gateway is a hybrid cloud storage service that lets you seamlessly connect and extend your on-premises applications to AWS Storage. Customers use Storage Gateway to seamlessly replace tape libraries with cloud storage, provide cloud storage-backed file shares, or create a low-latency cache to access data in AWS for on-premises applications.

Online data transfer: AWS DataSync makes it easy and efficient to transfer hundreds of terabytes and millions of files into Amazon S3, up to 10x faster than open-source tools. DataSync automatically handles or eliminates many manual tasks, including scripting copy jobs, scheduling and monitoring transfers, validating data, and optimizing network utilization. Additionally, you can use AWS DataSync to copy objects between a bucket on S3 on Outposts and a bucket stored in an AWS Region. The AWS Transfer Family provides fully managed, simple, and seamless file transfer to Amazon S3 using SFTP, FTPS, and FTP. Amazon S3 Transfer Acceleration enables fast transfers of files over long distances between your client and your Amazon S3 bucket.

Offline data transfer / little or no connectivity: The AWS Snow Family is purpose-built for use in edge locations where network capacity is constrained or nonexistent and provides storage and computing capabilities in harsh environments. AWS Snowcone is the smallest, ultra-portable, rugged, and secure device in the AWS Snow Family, offering edge computing, data storage, and data transfer on the go in austere environments with little or no connectivity. The AWS Snowball service uses ruggedized, portable storage and edge computing devices for data collection, processing, and migration; customers can ship the physical Snowball device for offline data migration to AWS. AWS Snowmobile is an exabyte-scale data transfer service used to move massive volumes of data to the cloud, including video libraries, image repositories, or even a complete data center migration.

Customers can also work with third-party providers from the AWS Partner Network (APN) to deploy hybrid storage architectures, integrate Amazon S3 into existing applications and workflows, and transfer data to and from AWS.

Learn more by visiting the AWS Storage Gateway, AWS DataSync, AWS Transfer Family, Amazon S3 Transfer Acceleration, and AWS Snow Family pages.

    hashtag
    Data exchange

AWS Data Exchange for Amazon S3 accelerates time to insight with direct access to data providers' Amazon S3 data. AWS Data Exchange for Amazon S3 helps you easily find, subscribe to, and use third-party data files for storage cost optimization, simplified data licensing management, and more. It is intended for subscribers who want to easily use third-party data files for data analysis with AWS services without needing to create or manage data copies. It is also helpful for data providers who want to offer in-place access to data hosted in their Amazon S3 buckets.

    Once data subscribers are entitled to an AWS Data Exchange for Amazon S3 dataset, they can start data analysis without having to set up their own S3 buckets, copy data files into those S3 buckets, or pay associated storage fees. Data analysis can be done with AWS services such as Amazon Athena, Amazon SageMaker Feature Store, or Amazon EMR. Subscribers access the same S3 objects that the data provider maintains and are therefore always using the most up-to-date data available, without additional engineering or operational work. Data providers can easily set up AWS Data Exchange for Amazon S3 on top of their existing S3 buckets to share direct access to an entire S3 bucket or specific prefixes and S3 objects. After setup, AWS Data Exchange automatically manages subscriptions, entitlements, billing, and payment.

    hashtag
    Performance

    Amazon S3 provides industry leading performance for cloud object storage. Amazon S3 supports parallel requests, which means you can scale your S3 performance by the factor of your compute cluster, without making any customizations to your application. Performance scales per prefix, so you can use as many prefixes as you need in parallel to achieve the required throughput. There are no limits to the number of prefixes. Amazon S3 performance supports at least 3,500 requests per second to add data and 5,500 requests per second to retrieve data. Each S3 prefix can support these request rates, making it simple to increase performance significantly.

To achieve this S3 request rate performance, you do not need to randomize object prefixes. That means you can use logical or sequential naming patterns in S3 object naming without any performance implications. Refer to the Amazon S3 performance guidelines and performance design patterns for the most current information about performance optimization for Amazon S3.

    hashtag
    Consistency

Amazon S3 delivers strong read-after-write consistency automatically for all applications, without changes to performance or availability, without sacrificing regional isolation for applications, and at no additional cost. With strong consistency, S3 simplifies the migration of on-premises analytics workloads by removing the need to make changes to applications, and reduces costs by removing the need for extra infrastructure to provide strong consistency.

    Any request for S3 storage is strongly consistent. After a successful write of a new object or an overwrite of an existing object, any subsequent read request immediately receives the latest version of the object. S3 also provides strong consistency for list operations, so after a write, you can immediately perform a listing of the objects in a bucket with any changes reflected.

    AWS Global Infrastructure

    • AWS is a cloud computing platform which is globally available.

• The AWS Global Infrastructure consists of the geographic regions around the world in which AWS operates and the high-level IT services delivered from them, which are shown below:

    The following are the components that make up the AWS infrastructure:

    • Availability Zones

    • Region

    • Edge locations

    hashtag
    Availability zone as a Data Center

• An availability zone is a facility, i.e., a data center, located somewhere in a country or a city. Inside this facility there are multiple servers, switches, load balancers, and firewalls; the things that interact with the cloud sit inside these data centers.

• An availability zone can consist of several data centers, but if they are close together, they are counted as one availability zone.

    hashtag
    Region

• A region is a geographical area. Each region consists of two or more availability zones.

    • A region is a collection of data centers which are completely isolated from other regions.

• The availability zones within a region are connected to each other through links.

    • Availability zones are connected through redundant and isolated metro fibers.

    hashtag
    Edge Locations

    • Edge locations are the endpoints for AWS used for caching content.

    • Edge locations consist of CloudFront, Amazon's Content Delivery Network (CDN).

• There are many more edge locations than regions; currently, there are over 150 edge locations.

    hashtag
    Regional Edge Cache

    • AWS announced a new type of edge location in November 2016, known as a Regional Edge Cache.

    • Regional Edge cache lies between CloudFront Origin servers and the edge locations.

• A regional edge cache has a larger cache than an individual edge location.

    Cloud Computing

    hashtag
    What is cloud computing?

    Cloud computing is the on-demand delivery of IT resources over the Internet with pay-as-you-go pricing. Instead of buying, owning, and maintaining physical data centers and servers, you can access technology services, such as computing power, storage, and databases, on an as-needed basis from a cloud provider like Amazon Web Services (AWS).

    hashtag
    Who is using cloud computing?

Organizations of every type, size, and industry are using the cloud for a wide variety of use cases, such as data backup, disaster recovery, email, virtual desktops, software development and testing, big data analytics, and customer-facing web applications. For example, healthcare companies are using the cloud to develop more personalized treatments for patients. Financial services companies are using the cloud to power real-time fraud detection and prevention. And video game makers are using the cloud to deliver online games to millions of players around the world.

    hashtag
Benefits of cloud computing?

Agility: The cloud gives you easy access to a broad range of technologies so that you can innovate faster and build nearly anything that you can imagine. You can quickly spin up resources as you need them–from infrastructure services, such as compute, storage, and databases, to Internet of Things, machine learning, data lakes and analytics, and much more.

    You can deploy technology services in a matter of minutes, and get from idea to implementation several orders of magnitude faster than before. This gives you the freedom to experiment, test new ideas to differentiate customer experiences, and transform your business.

Elasticity: With cloud computing, you don’t have to over-provision resources up front to handle peak levels of business activity in the future. Instead, you provision the amount of resources that you actually need. You can scale these resources up or down to instantly grow and shrink capacity as your business needs change.

    Deploy globally in minutes

    With the cloud, you can expand to new geographic regions and deploy globally in minutes. For example, AWS has infrastructure all over the world, so you can deploy your application in multiple physical locations with just a few clicks. Putting applications in closer proximity to end users reduces latency and improves their experience.

Types of cloud computing?

Infrastructure as a Service (IaaS): IaaS contains the basic building blocks for cloud IT. It typically provides access to networking features, computers (virtual or on dedicated hardware), and data storage space. IaaS gives you the highest level of flexibility and management control over your IT resources. It is most similar to the existing IT resources with which many IT departments and developers are familiar.

Platform as a Service (PaaS): PaaS removes the need for you to manage underlying infrastructure (usually hardware and operating systems), and allows you to focus on the deployment and management of your applications. This helps you be more efficient as you don’t need to worry about resource procurement, capacity planning, software maintenance, patching, or any of the other undifferentiated heavy lifting involved in running your application.

Software as a Service (SaaS): SaaS provides you with a complete product that is run and managed by the service provider. In most cases, people referring to SaaS are referring to end-user applications (such as web-based email). With a SaaS offering, you don’t have to think about how the service is maintained or how the underlying infrastructure is managed. You only need to think about how you will use that particular software.

    AWS Features

    The following are the features of AWS:

    • Flexibility

    • Cost-effective

    • Scalable and elastic

    • Secure

    • Experienced

    hashtag
    1) Flexibility

    • The difference between AWS and traditional IT models is flexibility.

• Traditional models deliver IT solutions that require large investments in new architectures, programming languages, and operating systems. Although these investments are valuable, it takes time to adopt new technologies, which can also slow down your business.

• The flexibility of AWS allows you to choose which programming models, languages, and operating systems are better suited for your project, so you do not have to learn new skills to adopt new technologies.

    hashtag
    2) Cost-effective

    • Cost is one of the most important factors that need to be considered in delivering IT solutions.

• For example, developing and deploying an application can incur a low cost, but after successful deployment, there is a need for hardware and bandwidth. Owning our own infrastructure can incur considerable costs, such as power, cooling, real estate, and staff.

• The cloud provides on-demand IT infrastructure that lets you consume the resources you actually need. In AWS, you are not limited to a set amount of resources such as storage, bandwidth, or computing resources, as it is very difficult to predict the requirements of every resource. Therefore, we can say that the cloud provides flexibility by maintaining the right balance of resources.

    hashtag
    3) Scalable and elastic

• In a traditional IT organization, scalability and elasticity were achieved through up-front investment in infrastructure, while in the cloud, scalability and elasticity provide savings and an improved ROI (Return On Investment).

• Scalability in AWS is the ability to scale computing resources up or down as demand increases or decreases.

• Elasticity in AWS is the ability to acquire resources as you need them and release them when you no longer need them; for example, Elastic Load Balancing distributes incoming application traffic across multiple targets such as Amazon EC2 instances, containers, IP addresses, and Lambda functions.

    hashtag
    4) Secure

• AWS provides a scalable cloud-computing platform that offers customers end-to-end security and end-to-end privacy.

• AWS incorporates security into its services and provides documentation that describes how to use the security features.

• AWS maintains the confidentiality, integrity, and availability of your data, which is of the utmost importance to AWS.

Physical security: Amazon has many years of experience in designing, constructing, and operating large-scale data centers. AWS infrastructure is housed in AWS-controlled data centers throughout the world. The data centers are physically secured to prevent unauthorized access.

    Secure services: Each service provided by the AWS cloud is secure.

Data privacy: Personal and business data can be encrypted to maintain data privacy.

    AWS IAM

    hashtag
    What is IAM?

    • IAM stands for Identity Access Management.

    Instance types

    When you launch an instance, the instance type that you specify determines the hardware of the host computer used for your instance. Each instance type offers different compute, memory, and storage capabilities, and is grouped in an instance family based on these capabilities. Select an instance type based on the requirements of the application or software that you plan to run on your instance.

    Amazon EC2 dedicates some resources of the host computer, such as CPU, memory, and instance storage, to a particular instance. Amazon EC2 shares other resources of the host computer, such as the network and the disk subsystem, among instances. If each instance on a host computer tries to use as much of one of these shared resources as possible, each receives an equal share of that resource. However, when a resource is underused, an instance can consume a higher share of that resource while it's available.

    Each instance type provides higher or lower minimum performance from a shared resource. For example, instance types with high I/O performance have a larger allocation of shared resources. Allocating a larger share of shared resources also reduces the variance of I/O performance. For most applications, moderate I/O performance is more than enough. However, for applications that require greater or more consistent I/O performance, consider an instance type with higher I/O performance.
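A minimal sketch of comparing instance types with the AWS CLI (the instance types listed are just examples):

# Show vCPU count and memory for a few instance types
aws ec2 describe-instance-types \
  --instance-types t3.medium m5.large c5.xlarge \
  --query 'InstanceTypes[].[InstanceType, VCpuInfo.DefaultVCpus, MemoryInfo.SizeInMiB]' \
  --output table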

    Amazon RDS on VMware

    Amazon RDS on VMware provides Amazon's relational database service for on-premises VMware infrastructure and points to the potential future of hybrid cloud services.

Many people used to see VMware and AWS as bitter enemies, but that changed when they announced VMware Cloud on AWS in 2017. The announcement of Amazon RDS on VMware at VMworld 2018 indicated an evolving relationship between the two companies that could point to greater collaboration between cloud vendors and on-premises infrastructure.

    The investment required to move everything to the cloud has proved too expensive for many organizations, so as realistic expectations are set, more and more cloud vendors will try to integrate the features and benefits of the cloud with their on-premises infrastructure.

Amazon RDS aims to make it easy to set up, run, and scale relational databases. RDS provides a flexible and easy way to handle previously tedious tasks like patching, capacity management, and database tuning. Because this is an Amazon service, you either pay on-demand pricing for what you use or pay for reserved, dedicated capacity. Until recently, Amazon limited this service to the AWS cloud.

    AWS Services

    hashtag
    AWS Services overview

    hashtag
    1. Compute

Burst Duration (seconds) = (Credit balance) / [(Burst IOPS) – 3 × (Storage size in GiB)]
     
    # mysqldump -u user -p[user_password] [database_name] > backupfile.sql
     
# Restore a MySQL backup stored in S3 into a new RDS DB instance
aws rds restore-db-instance-from-s3 \
--allocated-storage 125 \
--db-instance-identifier tddbidentifier \
--db-instance-class db.m4.large \
--engine mysql \
--master-username masterawsuser \
--master-user-password masteruserpassword \
--s3-bucket-name tpbucket \
--s3-ingestion-role-arn arn:aws:iam::account-number:role/rolename \
--s3-prefix bucketprefix \
--source-engine mysql \
--source-engine-version 5.6.27
# Create a read replica in the same region as the source DB instance
aws rds create-db-instance-read-replica \
    --db-instance-identifier myreadreplica \
    --source-db-instance-identifier mydbinstance

# Create a cross-region read replica
aws rds create-db-instance-read-replica \
    --db-instance-identifier readreplica_name \
    --region target_region_name \
    --db-subnet-group-name subnet_name \
    --source-db-instance-identifier arn:aws:rds:region_name:11323467889012:db:mysql_instance1
     
# Dump one or more databases and pipe the stream straight into the target MariaDB/MySQL host
mysqldump -h <RDS instance endpoint> \
    -u <user> \
    -p<password> \
    --port=3306 \
    --single-transaction \
    --routines \
    --triggers \
    --databases database1 database2 \
    --compress \
    --compact | mysql \
        -h <MariaDB host> \
        -u <master user> \
        -p<password> \
        --port 3306
    kubectl get nodes --show-labels
    NAME      STATUS    ROLES    AGE     VERSION        LABELS
    worker0   Ready     <none>   1d      v1.13.0        ...,disktype=ssd,kubernetes.io/hostname=worker0
    worker1   Ready     <none>   1d      v1.13.0        ...,kubernetes.io/hostname=worker1
    worker2   Ready     <none>   1d      v1.13.0        ...,kubernetes.io/hostname=worker2
    Amazon RDS on VMware is in Technical Preview, so all the details about how the platform works are currently unavailable. If it's anything like the native Amazon RDS, you'll be able to create and manage databases from half a dozen popular database types, including Oracle and Microsoft SQL Server.

Amazon RDS on VMware will enable affordable, highly available hybrid deployments, simple database disaster recovery to AWS, and read-only clones of on-premises data in AWS. This partnership could help Amazon customers easily migrate traditional database deployments between their sites and AWS, even sites with difficult licensing requirements. It can also help VMware customers see the benefits of the AWS management stack for databases in traditional infrastructure.

    hashtag
    Amazon RDS Vs. VMware

Amazon has focused on moving workloads to the public cloud; even VMware Cloud on AWS has focused on moving older workloads off-premises. Amazon RDS on VMware is different. With this release, Amazon is following Microsoft in providing public cloud services within the confines of the data centre.

Suppose Amazon continues to add its services to the data centre and Microsoft continues to do the same with Azure Stack. In that case, customers can see several major cloud benefits without the enormous effort of moving workloads to the public cloud. As much as the industry has talked about hybrid cloud, most customers haven't implemented it.

    It takes a major investment for customers to achieve the flexibility, ease of consumption, agility and scale of the public cloud. Instead of making that investment, organizations often just put the time and effort into moving particular workloads to the cloud. As cloud providers look to expand the public cloud into a private data centre, customers can get the best of both worlds and have a true hybrid cloud model.

    The ability to run Amazon services inside your data centre is all but Amazon admitting that the public cloud isn't the only way forward. Combine that with the fact that you can now run a traditional VMware on-premises infrastructure inside Amazon's public cloud, and you can see that the two companies have decided to have public, private, and hybrid clouds that all coexist.

    Regional Edge Caches

An edge location is not a Region but a small site that AWS uses for caching content.

  • Edge locations are mainly located in most of the major cities to distribute the content to end users with reduced latency.

• For example, if a user accesses your website from Singapore, the request is redirected to the edge location closest to Singapore, where cached data can be read.

  • Data is removed from the cache at the edge location while the data is retained at the Regional Edge Caches.

• When the user requests data that is no longer available at the edge location, the edge location retrieves the cached data from the Regional Edge Cache instead of the origin servers, which have higher latency.


Flexibility means that migrating legacy applications to the cloud is easy and cost-effective. Instead of re-writing applications to adopt new technologies, you just need to move them to the cloud and tap into advanced computing capabilities.

• Building applications in AWS is like building applications using existing hardware resources.

• Larger organizations often run in a hybrid mode, i.e., some pieces of the application run in their data center and other portions run in the cloud.

• The flexibility of AWS is a great asset for organizations, letting them deliver products with up-to-date technology on time and enhancing overall productivity.

• AWS requires no upfront investment, long-term commitment, or minimum spend.

• You can scale up or scale down as the demand for resources increases or decreases.

• AWS allows you to access resources almost instantly. The ability to respond to changes quickly, whether they are large or small, means you can take on new opportunities to meet business challenges that increase revenue and reduce cost.

• Elastic Load Balancing and Auto Scaling automatically scale your AWS computing resources to meet unexpected demand and scale down automatically when demand decreases.

• The AWS cloud is also useful for short-term jobs, mission-critical jobs, and jobs repeated at regular intervals.

  • Amazon RDS for Oracle
  • Amazon RDS for SQL Server

The following compares Amazon RDS deployment options: Single-AZ, Multi-AZ with one standby, and Multi-AZ with two readable standbys.

Transaction commit performance

• Multi-AZ with two readable standbys: up to 2x faster transaction commits compared to Amazon RDS Multi-AZ with one standby.

Automatic failover duration

• Single-AZ: not available; a user-initiated point-in-time-restore operation is required. This operation can take several hours to complete, and any data updates that occurred after the latest restorable time (typically within the last 5 minutes) will not be available.

• Multi-AZ with one standby: a new primary is available to serve your workload in as quickly as 60 seconds, and failover time is independent of write throughput.

• Multi-AZ with two readable standbys: a new primary is available to serve your workload typically in under 35 seconds, and failover time depends on the length of replica lag.

Higher resiliency to AZ outage

• Single-AZ: none; in the event of an AZ failure, you risk data loss and hours of failover time.

• Multi-AZ with one standby: in the event of an AZ failure, your workload will automatically fail over to the up-to-date standby.

• Multi-AZ with two readable standbys: in the event of a failure, one of the two remaining standbys takes over and serves the write workload from the primary.

Lower jitter for transaction commits

• Single-AZ: no optimization for jitter.

• Multi-AZ with one standby: sensitive to impairments on the write path.

• Multi-AZ with two readable standbys: uses a 2-of-3 write quorum and is insensitive to up to one impaired write path.

IAM allows you to manage users and their level of access to the AWS console.
• It is used to set up users, permissions, and roles. It allows you to grant access to the different parts of the AWS platform.

  • AWS Identity and Access Management is a web service that enables Amazon Web Services (AWS) customers to manage users and user permissions in AWS.

  • With IAM, Organizations can centrally manage users, security credentials such as access keys, and permissions that control which AWS resources users can access.

• Without IAM, organizations with multiple users must either create multiple AWS accounts, each with its own billing and subscriptions to AWS products, or share an account with a single security credential. Without IAM, you also have no control over the tasks that users can do.

• IAM enables an organization to create multiple users, each with their own security credentials, all controlled under and billed to a single AWS account. IAM allows users to do only what they need to do as part of their job.

  • hashtag
    Features of IAM

• Centralised control of your AWS account: You can control the creation, rotation, and cancellation of each user's security credentials. You can also control what data in the AWS system users can access and how they can access it.

    • Shared Access to your AWS account: Users can share the resources for the collaborative projects.

• Granular permissions: You can grant a user permission to use a particular service but not other services.

• Identity Federation: Identity federation means that we can use Facebook, Active Directory, LinkedIn, etc. with IAM. Users can log in to the AWS Console with the same username and password they use with Active Directory, Facebook, etc.

    • Multifactor Authentication: An AWS provides multifactor authentication as we need to enter the username, password, and security check code to log in to the AWS Management Console.

    • Permissions based on Organizational groups: Users can be restricted to the AWS access based on their job duties, for example, admin, developer, etc.

• Networking controls: IAM can ensure that users access AWS resources only from within the organization's corporate network.

• Provide temporary access for users/devices and services where necessary: For example, a mobile app that stores data in your AWS account should use temporary credentials rather than long-term keys.

• Integrates with many different AWS services: IAM is integrated with many different AWS services.

• Supports PCI DSS Compliance: PCI DSS (Payment Card Industry Data Security Standard) is a compliance framework. If you handle credit card information, you need to comply with this framework.

• Eventually Consistent: The IAM service is eventually consistent; it achieves high availability by replicating data across multiple servers within Amazon's data centers around the world.

• Free to use: AWS IAM is a feature of your AWS account offered at no additional charge. You are charged only when you access other AWS services using your IAM users.
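As a minimal sketch of how these pieces fit together from the AWS CLI (the group, user, and policy names below are only examples, not values from this document):

# Create a group and attach an AWS managed policy to it
aws iam create-group --group-name Developers
aws iam attach-group-policy --group-name Developers \
    --policy-arn arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess

# Create a user, add it to the group, and give it console access
aws iam create-user --user-name alice
aws iam add-user-to-group --user-name alice --group-name Developers
aws iam create-login-profile --user-name alice \
    --password 'TempPassw0rd!' --password-reset-required

The user then inherits only the permissions granted to the group, which is the granular, centrally controlled access described above.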

    hashtag
    What is SAML?

    • SAML stands for Security Assertion Markup language.

    • Generally, users need to enter a username and password to login in any application.

    • SAML is a technique of achieving Single Sign-On (SSO).

• Security Assertion Markup Language (SAML) is an XML-based framework that allows identity providers to pass authorization credentials to service providers.

• With SAML, you need to provide only one set of credentials to log in to your applications.

    • SAML is a link between the authentication of the user's identity and authorization to use a service.

• SAML provides a service known as Single Sign-On, which means users log in once and can use the same credentials to log in to other service providers.

    hashtag
    Why SAML?

• With SAML, the service provider and identity provider exist separately, but user management is centralized and access to SaaS solutions is provided through the identity provider.

    • SAML authentication is mainly used for verifying the user's credentials from the identity provider.

    hashtag
    Advantages of SAML:

• SAML SSO (Single Sign-On): SAML provides security by eliminating per-application passwords and replacing them with security tokens. Since no passwords are required for SAML logins, there is no risk of credentials being stolen by unauthorized users. It provides faster, easier, and trusted access to cloud applications.

• Open Standard Single Sign-On: SAML implementations conform to an open standard. Therefore, it is not restricted to a single identity provider; this open standard lets you choose your SAML provider.

• Strong Security: SAML uses federated identities and secure tokens, making it one of the most secure forms of web-based authentication.

    • Improved online experience for end users: SAML provides SINGLE SIGN-ON (SSO) to authenticate at an identity provider, and the identity provider sends the authentication to the service provider without additional credentials.

    • Reduced administrative costs for service providers: Using single authentication multiple times for multiple services can reduce the administrative costs for maintaining the account information.

    • Risk transference: SAML has put the responsibility of handling the identities to the identity provider.

    hashtag
    Types of SAML providers

A SAML provider is an entity within a system that helps users access the services they want.

    There are two types of SAML providers:

    • Service provider

    • Identity provider

    hashtag
    Service provider

• It is an entity within a system that provides the services for which users are authenticated.

• The service provider requires authentication from the identity provider before granting access to the user.

• Salesforce and other CRM systems are common service providers.

    hashtag
    Identity provider

• An identity provider is an entity within a system that tells the service provider who the user is, along with the user's access rights.

    • It maintains a directory of the user and provides an authentication mechanism.

    • Microsoft Active Directory and Azure are the common identity providers.

    hashtag
    What is a SAML Assertion?

    A SAML Assertion is an XML document that the identity provider sends to the service provider containing user authorization.

    SAML Assertion is of three types:

    • Authentication

• It proves the identity of the user.

      • It provides the time at which the user logged in.

      • It also determines which method of authentication has been used.

    • Attribute

      • An attribute assertion is used to pass the SAML attributes to the service provider where attribute contains a piece of data about the user authentication.

    • Authorization decision

• An authorization decision determines whether the user is authorized to use the service or whether the identity provider denied the request, for example due to a password failure.

    hashtag
    Working of SAML

• When a user tries to access a resource on the server, the service provider checks whether the user is already authenticated within the system. If so, skip to step 7; if not, the service provider starts the authentication process.

• The service provider determines the appropriate identity provider for the user and redirects the request to that identity provider.

• An authentication request is sent to the SSO (Single Sign-On) service, and the SSO service identifies the user.

• The SSO service returns an XHTML document, which contains the authentication information required by the service provider in a SAMLResponse parameter.

• The SAMLResponse parameter is passed to the Assertion Consumer Service (ACS) at the service provider.

• The service provider processes the request and creates a security context; the user is automatically logged in.

• After login, the user can request the resource they want.

• Finally, the resource is returned to the user.

    hashtag
    IAM Identities

IAM identities are created to provide authentication for people and processes in your AWS account.

    IAM identities are categorized as given below:

    hashtag
    AWS Account Root User

• When you first create an AWS account, you begin with a single root user identity that is used to sign in to AWS.

• You sign in to the AWS Management Console by entering your email address and password. The combination of email address and password is known as the root user credentials.

• When you sign in to the AWS account as the root user, you have unrestricted access to all the resources in the account.

• The root user can also access billing information and change the account password.

    hashtag
    What is a Role?

    • A role is a set of permissions that grant access to actions and resources in AWS. These permissions are attached to the role, not to an IAM User or a group.

    • An IAM User can use a role in the same AWS account or a different account.

• A role is similar to an IAM User: it is also an AWS identity with permission policies that determine what the identity can and cannot do in AWS.

    • A role is not uniquely associated with a single person; it can be used by anyone who needs it.

• A role does not have long-term security credentials, i.e., a password or access keys. Instead, when a user assumes a role, temporary security credentials are created and provided to the user (see the example after this list).

    • You can use the roles to delegate access to users, applications or services that generally do not have access to your AWS resources.
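As a minimal sketch of how temporary credentials are obtained when assuming a role from the CLI (the account ID, role name, and session name are placeholders, not values from this document):

# Request temporary credentials for a role in another account
aws sts assume-role \
    --role-arn arn:aws:iam::111122223333:role/AuditorRole \
    --role-session-name audit-session
# The response contains AccessKeyId, SecretAccessKey, and SessionToken,
# which expire when the role session ends.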

    hashtag
    Situations in which "IAM Roles" can be used:

• Sometimes you want to grant users access to the AWS resources in your own AWS account.

• Sometimes you want to grant users access to the AWS resources in another AWS account.

• Roles also allow a mobile app to access AWS resources without storing long-term keys in the app.

• Roles can be used to grant access to AWS resources for identities that exist outside of AWS.

• Roles can also be used to grant access to AWS resources to a third party so that they can perform an audit on your AWS resources.

    hashtag
    Following are the important terms associated with the "IAM Roles":

• Delegation: Delegation is the process of granting permissions to users to allow access to AWS resources that you control. Delegation sets up trust between a trusting account (the account that owns the resource) and a trusted account (the account that contains the users that need to access the resource). The trusting and trusted accounts can be of three types:

      • Same account

      • Two different accounts under the same organization control

      • Two different accounts owned by different organizations.

To delegate permission to access a resource, an IAM role is created in the trusting account with two policies attached.

Permissions Policy: It grants the user of the role the permissions needed to carry out the intended tasks.

Trust Policy: It specifies which trusted account members are allowed to use the role.

• Federation: Federation is the process of creating a trust relationship between an external identity provider and AWS. For example, Facebook allows users to log in to different websites using their Facebook accounts.

• Trust policy: A document, written in JSON format following the rules of the IAM Policy Language, that defines who is allowed to use the role (see the sketch after this list).

    • Permissions policy: A document written in JSON format to define the actions and resources that the role can use. This document is based on the rules of the IAM Policy Language.

    • Permissions boundary: It is an advanced feature of AWS in which you can limit the maximum permissions that the role can have. The permission boundaries can be applied to IAM User or IAM role but cannot be applied to the service-linked role.

• Principal: A principal can be the AWS account root user, an IAM User, or a role. Permissions can be granted in one of two ways:

  • Attach a permissions policy to a role.

  • For services that support resource-based policies, identify the principal in the Principal element of the policy attached to the resource.

• Cross-account access: Roles vs Resource-Based Policies: Granting a trusted principal in one account access to resources in another account is known as cross-account access. Some services allow you to attach a policy directly to the resource, known as a resource-based policy. Services that support resource-based policies include Amazon S3 buckets, Amazon SNS topics, and Amazon SQS queues.
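As a hedged sketch of what the trust policy and permissions policy for cross-account delegation might look like from the CLI (the account ID, role name, policy name, and bucket name are placeholders, not values from this document):

# Trust policy: lets principals in account 111122223333 assume the role
cat > trust-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": { "AWS": "arn:aws:iam::111122223333:root" },
    "Action": "sts:AssumeRole"
  }]
}
EOF
aws iam create-role --role-name CrossAccountReader \
    --assume-role-policy-document file://trust-policy.json

# Permissions policy: defines what the role itself is allowed to do
cat > permissions-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Action": ["s3:GetObject", "s3:ListBucket"],
    "Resource": ["arn:aws:s3:::example-bucket", "arn:aws:s3:::example-bucket/*"]
  }]
}
EOF
aws iam put-role-policy --role-name CrossAccountReader \
    --policy-name CrossAccountReaderPermissions \
    --policy-document file://permissions-policy.json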

    hashtag
    IAM Roles Use Cases

    There are two ways to use the roles:

• IAM Console: When IAM Users working in the IAM Console want to use a role, they take on the permissions of the role temporarily. The IAM Users give up their original permissions and take on the permissions of the role; when they exit the role, their original permissions are restored.

    • Programmatic Access: An AWS service such as Amazon EC2 instance can use role by requesting temporary security credentials using the programmatic requests to AWS.

    An IAM Role can be used in the following ways:

• IAM User: IAM Roles are used to grant your IAM Users permission to access AWS resources within your own or a different account. An IAM User can use the permissions attached to the role through the IAM Console. A role also prevents accidental access to sensitive AWS resources.

• Applications and Services: You can grant the permissions attached to a role to applications and services by calling the AssumeRole API. AssumeRole returns temporary security credentials associated with the role. An application or service can only take the actions permitted by the role. An application cannot exit the role the way an IAM User in the Console does; rather, it stops using the temporary credentials and resumes using its original credentials.

    • Federated Users: Federated Users can sign in using the temporary credentials provided by an identity provider. AWS provides an IDP (identity provider) and temporary credentials associated with the role to the user. The credentials grant the access of permissions to the user.

    Following are the cases of Roles:

    • Switch to a role as an IAM User in one AWS account to access resources in another account that you own.

      • You can grant the permission to your IAM Users to switch roles within your AWS account or different account. For example, you have Amazon EC2 instances which are very critical to your organization. Instead of directly granting permission to users to terminate the instances, you can create a role with the privileges that allows the administrators to switch to the role when they need to terminate the instance.

      • You have to grant users permission to assume the role explicitly.

• A multi-factor authentication (MFA) requirement can be added to the role so that only users who sign in with MFA can use the role.

• Roles prevent accidental changes to sensitive resources, especially if you combine them with auditing so that the roles can only be used when needed.

• An IAM User in one account can switch to a role in the same or a different account. With roles, a user can access the resources permitted by the role. When the user switches to the role, their original permissions are taken away; when the user exits the role, their original permissions are restored.

    • Providing access to an AWS service

• AWS services use roles to access AWS resources.

      • Each service is different in how it uses roles and how the roles are assigned to the service.

• Providing access to externally authenticated users. Sometimes users have identities outside of AWS, such as in your corporate directory. If such users want to work with AWS resources, they need AWS security credentials. In such situations, we can use a role to specify the permissions for users authenticated by a third-party identity provider (IdP).

  • SAML-based federation: SAML 2.0 (Security Assertion Markup Language 2.0) is an open framework that many identity providers use. SAML provides federated single sign-on to the AWS Management Console, so users can log in to the console without IAM credentials.

• Providing access to third parties. When third parties need to access your AWS resources, you can use roles to delegate access to them. IAM roles let these third parties access the AWS resources without you sharing any security credentials. The third party provides the following information to create the role:

  • The AWS account ID that contains the IAM Users that will use your role. You specify this account ID as the principal when you define the trust policy for the role.

  • An external ID to associate with the role. You specify the external ID when you define the trust policy of the role.

    hashtag
    Creating IAM Roles

    hashtag
    Creating IAM Roles for a service

    Creating a Role for a service using the AWS Management Console.

• In the navigation pane of the console, click Roles and then click "Create role". The Create role page appears.

    • Choose the service that you want to use with the role.

    • Select the managed policy that attaches the permissions to the service.

    • In a role name box, enter the role name that describes the role of the service, and then click on "Create role".

    Creating a Role for a service using the CLI (Command Line Interface)

• When you create a role using the console, many of the steps are done for you, but with the CLI you must explicitly perform each step yourself: create the role, then attach a permissions policy to it (a combined sketch follows this list). To create a role for an AWS service using the AWS CLI, use the following commands:

      • Create a role: aws iam create-role

      • Attach a permission policy to the role: aws iam put-role-policy

• If you are using the role with an instance, such as an Amazon EC2 instance, you need to create an instance profile to hold the role. An instance profile is a container for a role and can contain only one role. If you create the role using the AWS Management Console, the instance profile is created for you; if you create it using the CLI, you must perform each step explicitly. To create an instance profile using the CLI, use the following commands:

      • Create an instance profile: aws iam create-instance-profile

      • Add a role to instance profile: aws iam add-role-to-instance-profile
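Putting those commands together, a minimal sketch for an EC2 service role might look like the following (the role, profile, and AMI identifiers are placeholders, not values from this document; the permissions policy file is assumed to have been written as in the earlier example):

# Trust policy that lets the EC2 service assume the role
cat > ec2-trust-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": { "Service": "ec2.amazonaws.com" },
    "Action": "sts:AssumeRole"
  }]
}
EOF
aws iam create-role --role-name Ec2AppRole \
    --assume-role-policy-document file://ec2-trust-policy.json
aws iam put-role-policy --role-name Ec2AppRole \
    --policy-name Ec2AppPermissions \
    --policy-document file://permissions-policy.json

# Wrap the role in an instance profile and attach it at launch time
aws iam create-instance-profile --instance-profile-name Ec2AppProfile
aws iam add-role-to-instance-profile \
    --instance-profile-name Ec2AppProfile --role-name Ec2AppRole
aws ec2 run-instances --image-id ami-0123456789abcdef0 \
    --instance-type t3.micro \
    --iam-instance-profile Name=Ec2AppProfile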

    hashtag
    Creating IAM Roles for an IAM User

    Creating a Role for an IAM User using AWS Management Console

• In the navigation pane of the console, click Roles and then click "Create role". The Create role page appears.

• Specify the account ID to which you want to grant access to the resources, and then click the Next: Permissions button.

• Selecting the option "Require external ID" allows users from a third party to access the resources. You need to enter the external ID provided by the administrator of the third party. This condition is automatically added to the trust policy, allowing those users to assume the role.

• The option "Require MFA" restricts the role to users who sign in with multi-factor authentication.

    • Select a policy that you want to attach with the role. A policy contains the permissions that specify the actions that they can take and resources that they can access.

    • In a role name box, enter the role name and the role description.

    • Click on Create role to complete the creation of the role.

    Creating a Role for an IAM User using CLI (Command Line Interface)


    When you use the console to create a role, many of the steps are already done for you. In the case of CLI, you must specify each step explicitly.

    To create a role for cross-account access using CLI, use the following commands:

    • Create a role: aws iam create-role

    • Attach a permission policy to the role: aws iam put-role-policy

    hashtag
    Creating IAM Roles for a Third Party Identity Provider (Federation)

    Identity Federation allows you to access AWS resources for users who can sign in using third-party identity provider. To configure Identity Federation, you must configure the identity provider and then create an IAM Role that determines the permissions which federated users can have.

• Web Identity Federation: Web Identity Federation provides access to AWS resources for users who have signed in with Login with Facebook, Google, Login with Amazon, or another OpenID Connect (OIDC)-compatible identity provider. To configure Web Identity Federation, you must first create and configure the identity provider and then create the IAM Role that determines the permissions the federated users will have.

    • Security Assertion Markup Language (SAML) 2.0 Federation: SAML-Based Federation provides access to the AWS resources in an organization that uses SAML. To configure SAML 2.0 Based Federation, you must first create and configure the identity provider and then create the IAM Role that determines the permissions the federated users from the organization will have.

    Creating a Role for a web identity using AWS Management Console

    • Open the IAM Console at https://console.aws.amazon.com/iam/

    • In the navigation pane, click Roles and then click on Create role.

    • After clicking on the create role, select the type of trusted entity, i.e., web identity

    • Specify the client ID that identifies your application.

• If you are creating a role for Amazon Cognito, specify the ID of the identity pool you created for your Amazon Cognito applications in the Identity Pool ID box.

  • If you are creating a role for a single web identity provider, specify the client ID that the provider assigned when you registered your application with the identity provider.

    • (Optional) Click Add Conditions to add the additional conditions that must be met before users of your application can use the permissions granted by the role.

    • Now, attach the permission policies to the role and then click Next: Tags.

    • In a role name box, specify the role name and role description

    • Click Create role to complete the process of creation of role.

    Creating a Role for SAML Based 2.0 Federation using AWS Management Console

    • Open the IAM Console at https://console.aws.amazon.com/iam/

    • In the navigation pane of the console, Click Roles and then click on Create role

    • Click on Role for Identity Provider Access.

    • Select the type of the role that you want to create for Grant Web Single Sign-On (SSO) or Grant API access.

    • Select the SAML Provider for which you want to create the role.

    • If you are creating a role for API access, select the attribute from the attribute list. Then in the value box, enter the value that you want to include in the role. It restricts the access to the role to the users from the identity providers whose SAML authentication response includes the attributes you select.

    • If you want to add more attribute related conditions, click on Add Conditions.

    • Attach the permission policies to the role.

    • Click Create role to complete the process of creation of role.

    Creating a role for Federated Users using AWS CLI

    To create a role for federated users using AWS CLI, use the following commands:

    Create a role: aws iam create-role

    hashtag
    Instance type names

    Amazon EC2 provides a variety of instance types so you can choose the type that best meets your requirements. Instance types are named based on their family, generation, additional capabilities, and size. The first position of the instance type name indicates the instance family, for example c. The second position indicates the instance generation, for example 5. The remaining letters before the period indicate additional capabilities, such as instance store volumes. After the period (.) is the instance size, which is either a number followed by a size, such as 9xlarge, or metal for bare metal instances.

    The following are the additional capabilities indicated by the instance type names:

    • a – AMD processors

    • g – AWS Graviton processors

    • i – Intel processors

    • d – Instance store volumes

    • n – Network optimization

    • b – Block storage optimization

    • e – Extra storage or memory

    • z – High frequency

    Contents

    • Available instance typesarrow-up-right

    • Hardware specificationsarrow-up-right

    • AMI virtualization typesarrow-up-right

    hashtag
    Available instance types

    Amazon EC2 provides a wide selection of instance types optimized to fit different use cases. Instance types comprise varying combinations of CPU, memory, storage, and networking capacity and give you the flexibility to choose the appropriate mix of resources for your applications. Each instance type includes one or more instance sizes, allowing you to scale your resources to the requirements of your target workload.

    Note

    Previous generation instances are still fully supported and retain the same features and functionality. We encourage you to use the latest generation of instances to get the best performance.

    To determine which instance types meet your requirements, such as supported Regions, compute resources, or storage resources, see Find an Amazon EC2 instance typearrow-up-right.

    Topics

    • Current generation instancesarrow-up-right

    • Previous generation instancesarrow-up-right

    hashtag
    Current generation instances

    For the best performance, we recommend that you use the following instance types when you launch new instances. For more information, see Amazon EC2 Instance Typesarrow-up-right.

    Sixth and seventh generation of Amazon EC2 instances

    Sixth and seventh generation instances include:

    • General purpose: M6a, M6g, M6gd, M6i, M6id, M6idn, M6in, M7g, T4g

• Compute optimized: C6a, C6g, C6gd, C6gn, C6i, C6id, C6in, C7g, Hpc6a

    • Memory optimized: Hpc6id, R6a, R6g, R6gd, R6i, R6id, R6idn, R6in, R7g, X2gd, X2idn, X2iedn

    • Storage optimized: I4i, Im4gn, Is4gen

    • Accelerated computing: G5g, Trn1

    Instances

    • General purposearrow-up-right

    • Compute optimizedarrow-up-right

    • Memory optimizedarrow-up-right

    General purpose

    Type
    Sizes

    M4

    m4.large | m4.xlarge | m4.2xlarge | m4.4xlarge | m4.10xlarge | m4.16xlarge

    M5

    m5.large | m5.xlarge | m5.2xlarge | m5.4xlarge | m5.8xlarge | m5.12xlarge | m5.16xlarge | m5.24xlarge | m5.metal

    M5a

    m5a.large | m5a.xlarge | m5a.2xlarge | m5a.4xlarge | m5a.8xlarge | m5a.12xlarge | m5a.16xlarge | m5a.24xlarge

    M5ad

    m5ad.large | m5ad.xlarge | m5ad.2xlarge | m5ad.4xlarge | m5ad.8xlarge | m5ad.12xlarge | m5ad.16xlarge | m5ad.24xlarge

    M5d

    m5d.large | m5d.xlarge | m5d.2xlarge | m5d.4xlarge | m5d.8xlarge | m5d.12xlarge | m5d.16xlarge | m5d.24xlarge | m5d.metal

    Compute optimized

    Type
    Sizes

    C4

    c4.large | c4.xlarge | c4.2xlarge | c4.4xlarge | c4.8xlarge

    C5

    c5.large | c5.xlarge | c5.2xlarge | c5.4xlarge | c5.9xlarge | c5.12xlarge | c5.18xlarge | c5.24xlarge | c5.metal

    C5a

    c5a.large | c5a.xlarge | c5a.2xlarge | c5a.4xlarge | c5a.8xlarge | c5a.12xlarge | c5a.16xlarge | c5a.24xlarge

    C5ad

    c5ad.large | c5ad.xlarge | c5ad.2xlarge | c5ad.4xlarge | c5ad.8xlarge | c5ad.12xlarge | c5ad.16xlarge | c5ad.24xlarge

    C5d

    c5d.large | c5d.xlarge | c5d.2xlarge | c5d.4xlarge | c5d.9xlarge | c5d.12xlarge | c5d.18xlarge | c5d.24xlarge | c5d.metal

    Memory optimized

    Type
    Sizes

    CR1

    cr1.8xlarge

    Hpc6id

    hpc6id.32xlarge

    R4

    r4.large | r4.xlarge | r4.2xlarge | r4.4xlarge | r4.8xlarge | r4.16xlarge

    R5

    r5.large | r5.xlarge | r5.2xlarge | r5.4xlarge | r5.8xlarge | r5.12xlarge | r5.16xlarge | r5.24xlarge | r5.metal

    R5a

    r5a.large | r5a.xlarge | r5a.2xlarge | r5a.4xlarge | r5a.8xlarge | r5a.12xlarge | r5a.16xlarge | r5a.24xlarge

    Storage optimized

    Type
    Sizes

    D2

    d2.xlarge | d2.2xlarge | d2.4xlarge | d2.8xlarge

    D3

    d3.xlarge | d3.2xlarge | d3.4xlarge | d3.8xlarge

    D3en

    d3en.xlarge | d3en.2xlarge | d3en.4xlarge | d3en.6xlarge | d3en.8xlarge | d3en.12xlarge

    H1

    h1.2xlarge | h1.4xlarge | h1.8xlarge | h1.16xlarge

    HS1

    hs1.8xlarge

    Accelerated computing

    Type
    Sizes

    DL1

    dl1.24xlarge

    F1

    f1.2xlarge | f1.4xlarge | f1.16xlarge

    G3

    g3.4xlarge | g3.8xlarge | g3.16xlarge

    G4ad

    g4ad.xlarge | g4ad.2xlarge | g4ad.4xlarge | g4ad.8xlarge | g4ad.16xlarge

    G4dn

    g4dn.xlarge | g4dn.2xlarge | g4dn.4xlarge | g4dn.8xlarge | g4dn.12xlarge | g4dn.16xlarge | g4dn.metal

    hashtag
    Previous generation instances

    Amazon Web Services offers previous generation instance types for users who have optimized their applications around them and have yet to upgrade. We encourage you to use current generation instance types to get the best performance, but we continue to support the following previous generation instance types. For more information about which current generation instance type would be a suitable upgrade, see Previous Generation Instancesarrow-up-right.

    Type
    Sizes

    A1

    a1.medium | a1.large | a1.xlarge | a1.2xlarge | a1.4xlarge | a1.metal

    C1

    c1.medium | c1.xlarge

    C3

    c3.large | c3.xlarge | c3.2xlarge | c3.4xlarge | c3.8xlarge

    G2

    g2.2xlarge | g2.8xlarge

    I2

    i2.xlarge | i2.2xlarge | i2.4xlarge | i2.8xlarge

    hashtag
    Hardware specifications

    For more information, see Amazon EC2 Instance Typesarrow-up-right.

    To determine which instance type best meets your needs, we recommend that you launch an instance and use your own benchmark application. Because you pay by the instance second, it's convenient and inexpensive to test multiple instance types before making a decision. If your needs change, even after you make a decision, you can change the instance type later. For more information, see Change the instance typearrow-up-right.
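As a small hedged example of changing the instance type from the CLI (the instance ID and target type below are placeholders, and the instance must be EBS-backed and stopped first):

# Stop the instance, change its type, then start it again
aws ec2 stop-instances --instance-ids i-0123456789abcdef0
aws ec2 modify-instance-attribute --instance-id i-0123456789abcdef0 \
    --instance-type "{\"Value\": \"m5.xlarge\"}"
aws ec2 start-instances --instance-ids i-0123456789abcdef0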

    hashtag
    Processor features

    Intel processor features

    Amazon EC2 instances that run on Intel processors may include the following features. Not all of the following processor features are supported by all instance types. For detailed information about which features are available for each instance type, see Amazon EC2 Instance Types.arrow-up-right

    • Intel AES New Instructions (AES-NI) — Intel AES-NI encryption instruction set improves upon the original Advanced Encryption Standard (AES) algorithm to provide faster data protection and greater security. All current generation EC2 instances support this processor feature.

    • Intel Advanced Vector Extensions (Intel AVX, Intel AVX2, and Intel AVX-512) — Intel AVX and Intel AVX2 are 256-bit, and Intel AVX-512 is a 512-bit instruction set extension designed for applications that are Floating Point (FP) intensive. Intel AVX instructions improve performance for applications like image and audio/video processing, scientific simulations, financial analytics, and 3D modeling and analysis. These features are only available on instances launched with HVM AMIs.

    • Intel Turbo Boost Technology — Intel Turbo Boost Technology processors automatically run cores faster than the base operating frequency.

    • Intel Deep Learning Boost (Intel DL Boost) — Accelerates AI deep learning use cases. The 2nd Gen Intel Xeon Scalable processors extend Intel AVX-512 with a new Vector Neural Network Instruction (VNNI/INT8) that significantly increases deep learning inference performance over previous generation Intel Xeon Scalable processors (with FP32) for image recognition/segmentation, object detection, speech recognition, language translation, recommendation systems, reinforcement learning, and more. VNNI may not be compatible with all Linux distributions.

      The following instances support VNNI: M5n, R5n, M5dn, M5zn, R5b, R5dn, D3

    Confusion may result from industry naming conventions for 64-bit CPUs. Chip manufacturer Advanced Micro Devices (AMD) introduced the first commercially successful 64-bit architecture based on the Intel x86 instruction set. Consequently, the architecture is widely referred to as AMD64 regardless of the chip manufacturer. Windows and several Linux distributions follow this practice. This explains why the internal system information on an instance running Ubuntu or Windows displays the CPU architecture as AMD64 even though the instances are running on Intel hardware.

    hashtag
    AMI virtualization types

    The virtualization type of your instance is determined by the AMI that you use to launch it. Current generation instance types support hardware virtual machine (HVM) only. Some previous generation instance types support paravirtual (PV) and some AWS Regions support PV instances. For more information, see Linux AMI virtualization typesarrow-up-right.

    For best performance, we recommend that you use an HVM AMI. In addition, HVM AMIs are required to take advantage of enhanced networking. HVM virtualization uses hardware-assist technology provided by the AWS platform. With HVM virtualization, the guest VM runs as if it were on a native hardware platform, except that it still uses PV network and storage drivers for improved performance.

    hashtag
    Instances built on the Nitro System

    The Nitro System is a collection of hardware and software components built by AWS that enable high performance, high availability, and high security. For more information, see AWS Nitro Systemarrow-up-right.

    The Nitro System provides bare metal capabilities that eliminate virtualization overhead and support workloads that require full access to host hardware. Bare metal instances are well suited for the following:

    • Workloads that require access to low-level hardware features (for example, Intel VT) that are not available or fully supported in virtualized environments

    • Applications that require a non-virtualized environment for licensing or support

    Nitro components

    The following components are part of the Nitro System:

    • Nitro card

      • Local NVMe storage volumes

      • Networking hardware support

      • Management

      • Monitoring

      • Security

    • Nitro security chip, integrated into the motherboard

    • Nitro hypervisor - A lightweight hypervisor that manages memory and CPU allocation and delivers performance that is indistinguishable from bare metal for most workloads.

    hashtag
    Virtualized instances

    The following virtualized instances are built on the Nitro System:

    • General purpose: A1, M5, M5a, M5ad, M5d, M5dn, M5n, M5zn, M6a, M6g, M6gd, M6i, M6id, M6idn, M6in, M7g, T3, T3a, and T4g

    • Compute optimized: C5, C5a, C5ad, C5d, C5n, C6a, C6g, C6gd, C6gn, C6i, C6id, C6in, C7g, and Hpc6a

    • Memory optimized: Hpc6id, R5, R5a, R5ad, R5b, R5d, R5dn, R5n, R6a, R6g, R6gd, R6i, R6idn, R6in, R6id, R7g, U-3tb1, U-6tb1, U-9tb1, U-12tb1, X2gd, X2idn, X2iedn, X2iezn, and z1d

    • Storage optimized: D3, D3en, I3en, I4i, Im4gn, and Is4gen

    • Accelerated computing: DL1, G4ad, G4dn, G5, G5g, Inf1, Inf2, P3dn, P4d, P4de, Trn1, Trn1n, and VT1

    hashtag
    Bare metal instances

    The following bare metal instances are built on the Nitro System:

    • General purpose: a1.metal | m5.metal | m5d.metal | m5dn.metal | m5n.metal | m5zn.metal | m6a.metal | m6g.metal | m6gd.metal | m6i.metal | m6id.metal | m6idn.metal | m6in.metal | m7g.metal | mac1.metal | mac2.metal

    • Compute optimized: c5.metal | c5d.metal | c5n.metal | c6a.metal | c6g.metal | c6gd.metal | c6i.metal | c6id.metal | c6in.metal | c7g.metal

    • Memory optimized: r5.metal | r5b.metal | r5d.metal | r5dn.metal | r5n.metal | r6a.metal | r6g.metal | r6gd.metal | r6i.metal | r6idn.metal | r6in.metal

    • Storage optimized: i3.metal | i3en.metal | i4i.metal

    • Accelerated computing: g4dn.metal | g5g.metal

    Learn more

    For more information, see the following videos:

    • AWS re:Invent 2017: The Amazon EC2 Nitro System Architecturearrow-up-right

    • AWS re:Invent 2017: Amazon EC2 Bare Metal Instancesarrow-up-right

    • AWS re:Invent 2019: Powering next-gen Amazon EC2: Deep dive into the Nitro systemarrow-up-right

    hashtag
    Networking and storage features

    When you select an instance type, this determines the networking and storage features that are available. To describe an instance type, use the describe-instance-typesarrow-up-right command.
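For example, a query along these lines (the instance type and the exact fields shown are illustrative) surfaces a few of the networking and storage attributes discussed below:

aws ec2 describe-instance-types --instance-types c5d.large \
    --query "InstanceTypes[].[InstanceType, EbsInfo.NvmeSupport, InstanceStorageSupported, NetworkInfo.EnaSupport]" \
    --output table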

    Networking features

    • IPv6 is supported on all current generation instance types and the C3, R3, and I2 previous generation instance types.

    • To maximize the networking and bandwidth performance of your instance type, you can do the following:

      • Launch supported instance types into a cluster placement group to optimize your instances for high performance computing (HPC) applications. Instances in a common cluster placement group can benefit from high-bandwidth, low-latency networking. For more information, see .

      • Enable enhanced networking for supported current generation instance types to get significantly higher packet per second (PPS) performance, lower network jitter, and lower latencies. For more information, see .

    • Current generation instance types that are enabled for enhanced networking have the following networking performance attributes:

      • Traffic within the same Region over private IPv4 or IPv6 can support 5 Gbps for single-flow traffic and up to 25 Gbps for multi-flow traffic (depending on the instance type).

      • Traffic to and from Amazon S3 buckets within the same Region over the public IP address space or through a VPC endpoint can use all available instance aggregate bandwidth.

    • The maximum transmission unit (MTU) supported varies across instance types. All Amazon EC2 instance types support standard Ethernet V2 1500 MTU frames. All current generation instances support 9001 MTU, or jumbo frames, and some previous generation instances support them as well. For more information, see .

    Storage features

    • Some instance types support EBS volumes and instance store volumes, while other instance types support only EBS volumes. Some instance types that support instance store volumes use solid state drives (SSD) to deliver very high random I/O performance. Some instance types support NVMe instance store volumes. Some instance types support NVMe EBS volumes. For more information, see Amazon EBS and NVMe on Linux instancesarrow-up-right and NVMe SSD volumesarrow-up-right.

    • To obtain additional, dedicated capacity for Amazon EBS I/O, you can launch some instance types as EBS–optimized instances. Some instance types are EBS–optimized by default. For more information, see Amazon EBS–optimized instancesarrow-up-right.

    hashtag
    Summary of networking and storage features

    The following table summarizes the networking and storage features supported by current generation instance types.

    Instances

    • General purposearrow-up-right

    • Compute optimizedarrow-up-right

    • Memory optimizedarrow-up-right

    General purpose

    Instance type
    EBS only
    NVME EBS
    Instance store
    Placement group
    Enhanced networking

    M4

    Yes

    No

    No

    Yes

    ENA

    M5

    Compute optimized

    Instance type
    EBS only
    NVME EBS
    Instance store
    Placement group
    Enhanced networking

    C4

    Yes

    No

    No

    Yes

    Not supported

    C5

    Memory optimized

    Instance type
    EBS only
    NVME EBS
    Instance store
    Placement group
    Enhanced networking

    CR1

    No

    No

    HDD

    Yes

    Not supported

    Hpc6id

    Storage optimized

    Instance type
    EBS only
    NVME EBS
    Instance store
    Placement group
    Enhanced networking

    D2

    No

    No

    HDD

    Yes

    Not supported

    D3

    Accelerated computing

    Instance type
    EBS only
    NVME EBS
    Instance store
    Placement group
    Enhanced networking

    DL1

    No

    Yes

    NVMe

    Yes

    ENA | EFA

    F1

    Previous generation instance types

    The following table summarizes the networking and storage features supported by previous generation instance types.

    Instance type
    EBS only
    NVME EBS
    Instance store
    Placement group
    Enhanced networking

    A1

    Yes

    Yes

    No

    Yes

    ENA

    C1

    hashtag
    Instance limits

    There is a limit on the total number of instances that you can launch in a Region, and there are additional limits on some instance types.

    For more information about the default limits, see How many instances can I run in Amazon EC2?arrow-up-right

    For more information about viewing your current limits or requesting an increase in your current limits, see Amazon EC2 service quotasarrow-up-right.
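As a hedged example, you can list your current EC2 instance quotas from the CLI with the Service Quotas API (the JMESPath filter shown is just one way to narrow the output):

aws service-quotas list-service-quotas --service-code ec2 \
    --query "Quotas[?contains(QuotaName, 'On-Demand')].[QuotaName, Value]" \
    --output table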

    Category
    Service
Description

    Instances(Virtual machines)

    EC2

Amazon EC2 provides resizable compute capacity in the cloud, simplifying web-scale cloud computing.

    EC2 Spot

EC2 Spot lets you run fault-tolerant workloads at up to 90% off the On-Demand price.

EC2 Auto Scaling

    To meet changing demand, automatically add or remove compute capacity.

Lightsail

Lightsail offers an easy way to launch and manage virtual private servers with bundled compute, storage, and networking.

    hashtag
    2. Storage

    Service
    Description

    AWS S3

Amazon S3 is object storage built to store and retrieve any amount of data from anywhere on the Internet. Data is stored as objects in buckets, replicated across multiple facilities for durability, and accessed over simple HTTP APIs, making it a reliable foundation for backups, data lakes, static websites, and application assets.

    AWS Backup

AWS Backup centrally automates and manages backups across AWS services, removing the need to manually script and track backup jobs. It supports encryption of your backup data to help keep it secure, and it is an efficient, cost-effective way to protect your business data.

    Amazon EBS

Amazon Elastic Block Store provides block-level storage volumes for use with EC2 instances. Volumes are created and managed from the console or API, persist independently of the instance, and are commonly used for application data, databases, and logs. Point-in-time snapshots provide a controlled, low-cost way to back up volume data and protect against hardware or software failure.

    Amazon EFS Storage

Amazon EFS provides a simple, elastic NFS file system that many Amazon EC2 instances can mount at the same time. Storage grows and shrinks automatically as you add and remove files, you pay only for the storage you use, and EFS provides an option to encrypt your files.

    Amazon FSx

Amazon FSx provides fully managed, high-performance file systems: FSx for Windows Server (a native Windows file system built on Windows Server) and FSx for Lustre (a high-performance file system integrated with S3), each offering native compatibility and feature sets for their workloads.

    hashtag
    3. Database

    Database

    type

    Use cases

    Service

    Description

    Relational

    Ecommerce websites, Traditional sites etc.

    Aurora,

    Redshift,

    RDS

    RDS enables you to easily set up, control, and scale a relational database in the cloud.

    Key-value

    Ecommerce Websites, gaming websites etc.

    DynamoDB

DynamoDB is a highly scalable, fully managed, non-relational key-value and document database engineered for low latency and high availability, delivering single-digit-millisecond performance at virtually any scale. It combines the scalability and performance of a managed database with a flexible, schemaless data model.

    In-memory



    hashtag
    4. Developer Tools

    Service
    Description

    Cloud9

    Cloud9 is a cloud-based IDE that allows developers to write, run, and debug code.

CodeArtifact

CodeArtifact is a secure artifact repository service for storing, publishing, and sharing the software packages used in an organisation's development process. It makes it easy for organisations of any size to store, publish, and share packages.

    CodeBuild

CodeBuild is a fully managed build service that compiles source code, runs tests, and produces deployable artefacts on demand.

    CodeGuru

CodeGuru is a machine-learning-based tool that reviews code to recommend improvements in code quality and security, and profiles running applications to identify their most expensive lines of code.

Cloud Development Kit

    AWS CDK is an open source software development framework that defines cloud application resources using familiar programming languages.

    hashtag
    5. Network and Content Delivery

    Use Case
    Service
    Description

    Build a cloud network

    VPC

    VPC lets you provision a logically isolated section of the AWS Cloud where you can launch AWS resources in a virtual network that you define.

    Transit Gateway

Transit Gateway simplifies network and peering relationships by connecting VPCs & on-premises networks through a central hub.

PrivateLink

PrivateLink provides private connectivity between your VPCs, AWS services, and on-premises networks without exposing traffic to the public internet, while you keep full control over who can access your data and applications. For example, you can use PrivateLink to connect your on-premises data center to an AWS-hosted data lake while meeting tenancy-control and regulatory requirements.

    hashtag
    6. Security, Identity, & Compliance

    Category
    Service
    Description

    Identity & access management

    Identity & Access Management (IAM)

    IAM provides secure and controlled access to AWS services.

    Single Sign-On

    SSO simplifies, manages, and provides access to AWS accounts & business applications.

    Cognito

Cognito adds user sign-up, sign-in, and access control to your web and mobile apps, and lets you manage roles, permissions, and federated identities for those users.

    hashtag
    7. Migration & Transfer services

    Service
    Description

    Migration Evaluator

To start using AWS at scale, you first need a business case for why the move is useful to your organisation. Migration Evaluator helps you build that case: it analyses your current on-premises infrastructure and produces a detailed assessment with recommendations on how best to move forward.

    Migration Hub

Migration Hub gives you a single place to track application migrations across multiple AWS and partner tools. For each application it records the migration status and the actions taken to move it to its new home.

Migration Hub includes an easy-to-use dashboard for monitoring the progress of each application's migration, the actions taken so far, and the partner tools each application is connected to.

Application Discovery Service

Application Discovery Service collects configuration, usage, and behaviour data from your on-premises servers and databases to help you plan migrations. The collected data is presented in dashboards so you can make data-driven decisions and avoid unplanned outages and rework caused by changes to applications or servers.

    Server Migration Service (SMS)

SMS automates the migration of on-premises virtual machines (for example VMware or Hyper-V servers) to AWS by incrementally replicating server volumes as Amazon Machine Images. This lets you move workloads to the cloud, and scale them there, without buying new hardware or learning new technologies.

    Database Migration Service (DMS)

DMS helps you migrate databases to AWS quickly and securely. The source database remains fully operational during the migration, minimising downtime for the applications that depend on it. DMS supports homogeneous migrations (for example Oracle to Oracle) as well as heterogeneous migrations between different database engines (for example Oracle to Amazon Aurora), and it can also continuously replicate data to keep source and target in sync.

    hashtag
    8. Cost Management

    Use Cases
    Capabilities
    Description

    Organize

    Construct cost allocation & governance foundation with your own tagging strategy

Cost Categories helps you group and segment your AWS cost and usage data using your own tagging strategy, so you can better understand costs and run cost-effective infrastructure and operations.

    Report

    Provide users with information about their cloud costs by providing detailed allocable cost data

The AWS Cost & Usage Report gives you the most detailed view of your cost and usage data. You can use it to understand how each service contributes to your bill, build customised reports, and make informed decisions about which services to keep, resize, or retire.

    Access

Track billing information across the organisation in a single, unified view.

Consolidated Billing in AWS Organizations combines the usage of all accounts into one bill, so volume discounts, Reserved Instances, and Savings Plans can be shared across the organisation.

    hashtag
9. SDKs and Toolkits

    Service
    Description

    CDK

The CDK lets you define cloud infrastructure in familiar programming languages (such as TypeScript, Python, Java, and C#) instead of hand-writing low-level templates. The higher level of abstraction keeps the focus on application logic, and working in a language your team already knows makes the toolkit easier to adopt.

    Corretto

Corretto is Amazon's no-cost, open source, production-ready distribution of OpenJDK. It is multiplatform, suitable for desktop and server applications alike, comes with long-term support, and is kept compatible with the upstream OpenJDK project, to which Amazon contributes.

    Crypto Tools

The AWS Crypto Tools libraries (such as the AWS Encryption SDK) provide vetted, easy-to-use implementations of client-side encryption and other common cryptographic operations in several programming languages, so you do not have to implement those primitives yourself.

    Serverless Applic­ation Model (SAM)

SAM is an open-source framework for building serverless applications. It provides shorthand syntax for defining functions, APIs, databases, and event source mappings, and the SAM CLI lets you build, test, and deploy those applications both locally and to AWS.

    hashtag
    10. Data Lakes & Analytics

    Category
    Service
    Description

    Analytics

    Athena

Athena is a serverless, interactive query service that lets you analyse data directly in Amazon S3 using standard SQL. There is no infrastructure to manage, and you pay only for the data scanned by the queries you run, whether ad-hoc exploration, scheduled reports, or drill-downs.

    EMR

EMR is a managed big-data platform for running open-source frameworks such as Apache Spark, Hive, Presto, and Hadoop. Enterprises use it to collect and analyse data from data lakes, data warehouses, and other sources, producing both real-time and historical reports.

    Redshift

Redshift is a fully managed, petabyte-scale data warehouse that helps you store, process, and analyse your data. It stores data in a columnar, relational format and provides SQL-based tools for transforming data and creating reports.

    hashtag
    11. Containers

    Use Cases
    Service
    Description

    Store, encrypt, and manage container images

    ECR

    Refer to the compute section. It has already been explained there.

Run containerized applications or build microservices

    ECS

    Refer to the compute section. It has already been explained there.

    Manage containers with Kubernetes

    EKS

    Refer to the compute section. It has already been explained there.

    hashtag
    12. Serverless

    Category
    Service
    Description

    Compute

    Lambda

Lambda lets you run code without provisioning or managing servers. You upload a function, and Lambda runs it in response to events such as API requests, S3 uploads, or queue messages, scaling automatically and billing only for the compute time actually consumed. This function code is the "backend" people usually mean when they talk about serverless: it processes the requests it receives and returns responses, while AWS manages the underlying infrastructure for you.

Lambda@Edge

    Amazon CloudFront provides Lambda­@Edge, which allows you to run code closer to users of your application, which improves performance and reduces latency.

    Fargate

    Refer to the containers section. It has already been explained there.

    hashtag
    13. Application Integration

    Category
    Service
Description

    Messaging

    SNS

Reliable, high-throughput pub/sub, SMS, email, and mobile push notifications

    SQS

SQS is a fully managed message queue that lets application components send, store, and receive messages between one another at any volume, decoupling the parts of an application.

    MQ

Amazon MQ is a managed message broker for Apache ActiveMQ (and RabbitMQ) that makes it easy to migrate existing messaging workloads and to build hybrid architectures.

    hashtag
    14. Management and Governance

    Category
    Service
    Description

    Enable

    Control Tower

The simplest method to set up and govern a new, secure, multi-account AWS environment

Organizations

As your AWS workloads grow and scale, Organizations helps you centrally govern your environment across multiple accounts, with consolidated billing and policy-based controls.

Well-Architected Tool

The Well-Architected Tool helps you review your workloads against AWS architectural best practices, highlighting risks in areas such as reliability, security, performance efficiency, and cost, so you can focus on building great experiences rather than firefighting infrastructure.

    EC2

    hashtag
    EC2

    • EC2 stands for Amazon Elastic Compute Cloud.

    • Amazon EC2 is a web service that provides resizable compute capacity in the cloud.

• Amazon EC2 reduces the time required to obtain and boot new instances to minutes. In the past, getting a new server meant raising a purchase order and waiting for it to be racked and cabled, a very time-consuming process; EC2 replaces that with a virtual machine in the cloud that is available almost immediately.

    • You can scale the compute capacity up and down as per the computing requirement changes.

• Amazon EC2 changes the economics of computing by allowing you to pay only for the resources you actually use. Previously, you would buy a physical server with extra CPU and RAM capacity and commit to it for perhaps five years, planning (and spending capital) far in advance; with EC2 you pay only for the capacity you consume.

• Amazon EC2 provides developers with the tools to build resilient applications that isolate themselves from common failure scenarios.
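As a rough illustration of the pay-as-you-go model, the following is a minimal sketch using the AWS SDK for Python (boto3). The AMI ID, key pair name, and region are placeholders, not values from this document.

```python
# Minimal sketch: launching and later terminating an EC2 instance with boto3.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Launch a single t3.micro instance from a placeholder AMI.
response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",   # placeholder AMI ID
    InstanceType="t3.micro",
    KeyName="my-key-pair",             # placeholder key pair name
    MinCount=1,
    MaxCount=1,
)
instance_id = response["Instances"][0]["InstanceId"]
print("Launched:", instance_id)

# Pay-as-you-go: terminate the instance when it is no longer needed.
ec2.terminate_instances(InstanceIds=[instance_id])
```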

    hashtag
    EC2 Pricing Options

    hashtag
    On Demand

    • It allows you to pay a fixed rate by the hour or even by the second with no commitment.

• Linux instances are billed by the second, while Windows instances are billed by the hour.

    • On Demand is perfect for the users who want low cost and flexibility of Amazon EC2 without any up-front investment or long-term commitment.


    hashtag
    Reserved

• A Reserved Instance is a reservation, or contract, that you make with Amazon for a term of 1 or 3 years.

• Because you commit for the term (and typically pay some amount upfront), you receive a significant discount on the hourly charge for the instance.

    • It is useful for applications with steady state or predictable usage.

    Types of Reserved Instances:

    • Standard Reserved Instances

    • Convertible Reserved Instances

    • Scheduled Reserved Instances

    Standard Reserved Instances

• It provides a discount of up to 75% compared with On-Demand pricing, for example when you pay all upfront on a 3-year contract.

    • It is useful when your Application is at the steady-state.

    Convertible Reserved Instances

• It provides a discount of up to 54% compared with On-Demand pricing.

• It allows you to change the attributes of the Reserved Instance, as long as the exchange results in Reserved Instances of equal or greater value.

    • Like Standard Reserved Instances, it is also useful for the steady state applications.

    Scheduled Reserved Instances

    • Scheduled Reserved Instances are available to launch within the specified time window you reserve.

    • It allows you to match your capacity reservation to a predictable recurring schedule that only requires a fraction of a day, a week, or a month.


    hashtag
    Spot Instances

• Spot Instances let you bid for spare EC2 instance capacity at the price you choose, providing better savings than On-Demand pricing.

• They are useful for applications that have flexible start and end times.

• They are also useful for applications that are only feasible at very low compute prices.


    hashtag
    Dedicated Hosts

• A Dedicated Host is a physical server with EC2 instance capacity fully dedicated to your use.

• Dedicated Hosts can help you reduce costs by allowing you to bring your existing server-bound software licenses (for example VMware, Oracle, or SQL Server, depending on the license terms) to AWS.

• They are also used to address compliance requirements that call for dedicated physical hardware.

    AWS S3 Lifecycle Management

Lifecycle Management is used so that objects are stored cost-effectively throughout their lifecycle. A lifecycle configuration is a set of rules that define the actions Amazon S3 applies to a group of objects.

    The lifecycle defines two types of actions:

• Transition actions: Define when objects transition to another storage class. For example, you can transition objects to the Standard-IA storage class 30 days after creating them, or archive them to the Glacier storage class 60 days after creating them.

• Expiration actions: Define when objects expire; Amazon S3 then deletes the expired objects on your behalf.

Suppose a business generates a lot of data in the form of text files, images, audio, or video, and the data is only actively used for 30 days. After that, you might want to transition it from Standard to Standard-IA, where the storage cost is lower. After 60 days, you might want to transition it to the Glacier storage class for long-term archival, or perhaps expire the objects completely. Amazon provides a service known as Lifecycle Management for exactly this, and it is configured within the S3 bucket.

    Lifecycle policies:

    • Use Lifecycle rules to manage your object: You can manage the Lifecycle of an object by using a Lifecycle rule that defines how Amazon S3 manages objects during their lifetime.

    • Automate transition to tiered storage: Lifecycle allows you to transition objects to the Standard IA storage class automatically and then to the Glacier storage class.

    • Expire your objects: Using the Lifecycle rule, you can automatically expire your objects.

    Creation of Lifecycle rule

    • Sign in to the AWS Management console.

    • Click on the S3 service

    • Create a new bucket in S3.

    • Now, you can configure the options, i.e., you can set the versioning, server access logging, etc. I leave all the settings as default and then click on the Next button.

    • Set the permissions. I leave all the permissions as default and then click on the Next button.

    • Click on the Create bucket button.

    • Finally, a new bucket is created whose name is “XYZ”.

    • Click on the XYZ bucket.

    From the above screen, we observe that the bucket is empty. Before uploading the objects in a bucket, we first create the policy.

    • Move to the Management tab; we use the lifecycle.

    • Add the Lifecycle rule and then enter the rule name. Click on the Next.

    • You can create the storage class transition in both the current version and the previous version. Initially, I create the transition in the current version. Check the current version and then click on the Add transition.

First transition: 30 days after the creation of an object, the object's storage class is converted to the Standard-Infrequent Access (Standard-IA) storage class.

Second transition: 60 days after the creation of an object, the object's storage class is converted to the Glacier storage class.

    • Similarly, we can do with the previous version objects. Check the “previous version” and then “Add transitions”. Click on the Next.

    • Now, we expire the object after its creation. Suppose we expire the current and previous version objects after 425 days of their creation. Click on the Next.

• The Lifecycle rule summary is then shown for review.

• Click on the Save button.

The rule named "Lifecyclerule" has now been created.
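The same rule can also be created programmatically. Below is a minimal boto3 sketch of the transitions and expiration described above, assuming a placeholder bucket name; the console steps above remain the reference procedure.

```python
# Minimal sketch: the lifecycle rule above, expressed with boto3.
import boto3

s3 = boto3.client("s3")

s3.put_bucket_lifecycle_configuration(
    Bucket="xyz-example-bucket",  # placeholder bucket name
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "Lifecyclerule",
                "Status": "Enabled",
                "Filter": {"Prefix": ""},  # apply to all objects in the bucket
                # Current-version transitions: Standard-IA after 30 days, Glacier after 60.
                "Transitions": [
                    {"Days": 30, "StorageClass": "STANDARD_IA"},
                    {"Days": 60, "StorageClass": "GLACIER"},
                ],
                # Expire current versions 425 days after creation.
                "Expiration": {"Days": 425},
                # Same schedule for previous (noncurrent) versions.
                "NoncurrentVersionTransitions": [
                    {"NoncurrentDays": 30, "StorageClass": "STANDARD_IA"},
                    {"NoncurrentDays": 60, "StorageClass": "GLACIER"},
                ],
                "NoncurrentVersionExpiration": {"NoncurrentDays": 425},
            }
        ]
    },
)
```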

    hashtag
    Important points to be remembered:

    • It can be used either in conjunction with the versioning or without versioning.

    • Lifecycle Management can be applied to both current and previous versions.

• The following actions can be applied automatically: transitioning objects to another storage class (Standard-IA, Glacier) and expiring objects, as described above.

    AWS VPC

    hashtag
    What is VPC

    • VPC stands for Virtual Private Cloud.

    • Amazon Virtual Private Cloud (Amazon VPC) provides a logically isolated area of the AWS cloud where you can launch AWS resources in a virtual network that you define.

    • You have complete control over your virtual networking environment, including a selection of your IP address range, the creation of subnets, and configuration of route tables and network gateways.

• You can easily customize the network configuration for your Amazon Virtual Private Cloud. For example, you can create a public-facing subnet for web servers that can access the internet, and place your backend systems such as databases or application servers in a private-facing subnet.

    • You can provide multiple layers of security, including security groups and network access control lists, to help control access to Amazon EC2 instances in each subnet.

    hashtag
    Architecture of VPC

The outer line represents the region, in this case us-east-1. Inside the region we have the VPC, and attached to the VPC we have an internet gateway and a virtual private gateway; these are the two ways of connecting into the VPC. Both connections go to the router in the VPC, the router directs traffic to the route table, and the route table directs traffic to the Network ACL. The Network ACL is a firewall-like layer in front of the subnets: unlike security groups, Network ACLs are stateless and contain both allow and deny rules, so you can, for example, block specific IP addresses at the ACL. Behind the ACL sit the security groups, which sit directly in front of the EC2 instances.

The VPC has two subnets, a public subnet and a private subnet. In the public subnet, EC2 instances can reach the internet directly; in the private subnet, instances cannot reach the internet on their own. To connect to an instance in the private subnet, you first SSH into an instance in the public subnet and from there SSH into the private instance. This intermediate instance is known as a jump box (bastion host).

    Some ranges are reserved for private subnet:

    • 10.0.0.0 - 10.255.255.255 (10/8 prefix)

    • 172.16.0.0 - 172.31.255.255 (172.16/12 prefix)

• 192.168.0.0 - 192.168.255.255 (192.168/16 prefix)


    hashtag
    What can we do with a VPC?

    • Launch instances in a subnet of your choosing. We can choose our own subnet addressing.

    • We can assign custom IP address ranges in each subnet.

    • We can configure route tables between subnets.

    hashtag
    VPC Peering

    • VPC Peering is a networking connection that allows you to connect one VPC with another VPC through a direct network route using private IP addresses.

    • Instances behave as if they were on the same private network.

    • You can peer VPC's with other AWS accounts as well as other VPCs in the same account.

Note: Peering is non-transitive, which means two VPCs can only communicate if they are directly peered with each other; traffic cannot flow through an intermediate VPC.

    • You can peer between regions. Suppose you have one VPC in one region and other VPC in another region, then you can peer the VPCs between different regions.

Let's understand non-transitive peering through an example.

The figure shows that VPC B is peered with VPC A, so instances in VPC B can talk to instances in VPC A. However, VPC B cannot talk to VPC C through VPC A. This is non-transitive peering: VPC B and VPC C are not directly linked, so they cannot talk to each other.

    So, to communicate between VPC B and VPC C, we need to peer them as shown in the below figure.
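For illustration, here is a minimal boto3 sketch of peering two VPCs and adding the routes that make the peering usable; all VPC, route table, and CIDR values are placeholders.

```python
# Minimal sketch: peering VPC B with VPC C so they can talk directly.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Request the peering connection from VPC B (requester) to VPC C (accepter).
peering = ec2.create_vpc_peering_connection(
    VpcId="vpc-bbbbbbbb",       # placeholder: VPC B
    PeerVpcId="vpc-cccccccc",   # placeholder: VPC C
)
pcx_id = peering["VpcPeeringConnection"]["VpcPeeringConnectionId"]

# The owner of VPC C must accept the request.
ec2.accept_vpc_peering_connection(VpcPeeringConnectionId=pcx_id)

# Add routes in both VPCs' route tables pointing at the peering connection.
ec2.create_route(RouteTableId="rtb-bbbbbbbb",
                 DestinationCidrBlock="10.2.0.0/16",   # VPC C's CIDR (placeholder)
                 VpcPeeringConnectionId=pcx_id)
ec2.create_route(RouteTableId="rtb-cccccccc",
                 DestinationCidrBlock="10.1.0.0/16",   # VPC B's CIDR (placeholder)
                 VpcPeeringConnectionId=pcx_id)
```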

    hashtag
    Amazon VPC and Subnets

    Amazon VPC enables you to connect your on-premises resources to AWS infrastructure through a virtual private network. This virtual network closely resembles a traditional network that you'd operate in your data center but enables you to leverage the scalable infrastructure in AWS.

Each VPC that you create is logically isolated from other virtual networks in the AWS cloud and is fully customizable. You can select the IP address range, create subnets, configure route tables, set up network gateways, and define security settings using security groups and network access control lists (ACLs).

    hashtag
    Default Amazon VPC

    Each Amazon account comes with a default VPC that is pre-configured for you to start using immediately. A VPC can span multiple availability zones in a region. This is the diagram of a default VPC:

In the first section, there is a default Amazon VPC. The CIDR block for the default VPC always uses a /16 netmask; in this example it is 172.31.0.0/16, which means the VPC can provide up to 65,536 IP addresses.

    hashtag
    Custom Amazon VPC

    The default VPC is suitable for launching new instances when you're testing AWS, but creating a custom VPC allows you to:

    • Make things more secure

• Customize your virtual network, as you can define your own IP address range

    • Create your subnets that are both private and public
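As an illustration of building a custom VPC, the following boto3 sketch creates a VPC with one public and one private subnet; the CIDR blocks and Availability Zone are example values, not prescriptions.

```python
# Minimal sketch: a custom VPC with one public and one private subnet.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Create the VPC with a /16 CIDR block.
vpc = ec2.create_vpc(CidrBlock="10.0.0.0/16")
vpc_id = vpc["Vpc"]["VpcId"]

# Carve out two /24 subnets in the same Availability Zone.
public_subnet = ec2.create_subnet(
    VpcId=vpc_id, CidrBlock="10.0.1.0/24", AvailabilityZone="us-east-1a"
)
private_subnet = ec2.create_subnet(
    VpcId=vpc_id, CidrBlock="10.0.2.0/24", AvailabilityZone="us-east-1a"
)
print(public_subnet["Subnet"]["SubnetId"], private_subnet["Subnet"]["SubnetId"])
```

The public subnet only becomes "public" once its route table points at an internet gateway, as covered in the Internet Gateway section below.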

    hashtag
    Hardware VPN Access

    By default, instances that you launch into an Amazon VPC can't communicate with your network. You can connect your VPCs to your existing data center using hardware VPN access. By doing so, you can effectively extend your data center into the cloud and create a hybrid environment. To do this, you will need to set up a virtual private gateway.

    There is a VPN concentrator on the Amazon side of the VPN connection. For your data center, you need a customer gateway, which is either a physical device or a software application that sits on the customer’s side of the VPN connection. When you create a VPN connection, a VPN tunnel comes up when traffic is generated from the customer's side of the connection.

    hashtag
    VPC Peering

A peering connection can be made between your own VPCs or with a VPC in another AWS account, and (as noted earlier) it can even span regions.

If you have instances in VPC A, they wouldn't be able to communicate with instances in VPC B or C unless you set up a peering connection. Peering is a one-to-one relationship; a VPC can have multiple peering connections to other VPCs, but transitive peering is not supported. In other words, VPC A can connect to B and C in the above diagram, but C cannot communicate with B unless they are directly peered.

Additionally, VPCs with overlapping CIDR blocks cannot be peered. In the diagram, all VPCs have different IP ranges; if they had the same IP ranges, they wouldn't be able to peer.

    Default VPC Deletion

In the event that the default VPC gets deleted, it is advised to reach out to AWS support for restoration. Therefore, you should delete the default VPC only if you have a good reason.

    After going through what AWS VPC is, let us next learn the IP addresses.

    hashtag
    Private, Public, and Elastic IP Addresses

    hashtag
    Private IP addresses

IP addresses that are not reachable over the internet are defined as private. Private IPs enable communication between instances in the same network. When you launch a new instance, a private IP address is assigned, along with an internal DNS hostname that resolves to that private IP address. If you try to connect to this address from the internet, it will not work; you need a public IP address for that.

    hashtag
    Public IP addresses

    Public IP addresses are used for communication between other instances on the internet and yours. Each instance with a public IP address is assigned an external DNS hostname too. Public IP addresses linked to your instances are from Amazon's list of public IPs. On stopping or terminating your instance, the public IP address gets released, and a new one is linked to the instance when it restarts. For retention of this public IP address even after stoppage or termination, an elastic IP address needs to be used.

    hashtag
    Elastic IP Addresses

    Elastic IP addresses are static or persistent public IPs that come with your account. If any of your software or instances fail, they can be remapped to another instance quickly with the elastic IP address. An elastic IP address remains in your account until you choose to release it. A charge is associated with an Elastic IP address if it is in your account, but not allocated to an instance.
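For illustration, here is a minimal boto3 sketch of allocating an Elastic IP and associating it with an instance; the instance ID is a placeholder.

```python
# Minimal sketch: allocate an Elastic IP and attach it to an instance.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Allocate a new Elastic IP address for use in a VPC.
eip = ec2.allocate_address(Domain="vpc")

# Associate it with an existing instance (placeholder instance ID).
ec2.associate_address(
    InstanceId="i-0123456789abcdef0",
    AllocationId=eip["AllocationId"],
)

# Release the address when it is no longer needed, to avoid the idle-EIP charge:
# ec2.release_address(AllocationId=eip["AllocationId"])
```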

    As a part of our AWS VPC tutorial, let us learn about subnets.

    hashtag
    Subnet

    AWS defines a subnet as a range of IP addresses in your VPC. You can launch AWS resources into a selected subnet. A public subnet can be used for resources connected to the internet and a private subnet for resources not connected to the internet.

The netmask for a default subnet in your VPC is always /20, which provides up to 4,096 addresses per subnet, a few of which are reserved for AWS use. A VPC can span multiple Availability Zones, but a subnet is always mapped to a single Availability Zone.

    The following is a basic diagram of a subnet:

    There is a virtual private cloud consisting of availability zones. A subnet is created inside each availability zone, and you cannot launch any instances unless there are subnets in your VPC.

    hashtag
    Public and Private Subnets

There are two types of subnets: public and private. A public subnet is used for resources that must be connected to the internet, such as web servers. A subnet is public because its route table sends traffic destined for the internet to an internet gateway.

    Private subnets are for resources that don't need an internet connection, or that you want to protect from the internet; database instances are an example.

    Let us extend our AWS VPC tutorial by looking into networking.

    hashtag
    Networking

    hashtag
    Internet Gateway

An internet gateway is a horizontally scaled, redundant, and highly available VPC component. It enables communication between instances in your VPC and the internet, and it imposes no availability risks or bandwidth constraints on your network traffic.

    To give your VPC the ability to connect to the internet, you need to attach an internet gateway. Only one internet gateway can be attached per VPC. Attaching an internet gateway is the first stage in permitting internet access to instances in your VPC.

    In this diagram, we've added the internet gateway that is providing the connection to the internet to your VPC. For an instance to be internet-connected, you have to adhere to the following rules:

    1. Attach an Internet gateway to your VPC

    2. Ensure that your instances have either a public IP address or an elastic IP address

    3. Point your subnet’s route table to the internet gateway
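The three rules above can also be applied programmatically. The following boto3 sketch assumes placeholder VPC and subnet IDs.

```python
# Minimal sketch: give a subnet internet access via an internet gateway.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

VPC_ID = "vpc-0123456789abcdef0"        # placeholder
PUBLIC_SUBNET_ID = "subnet-0aaaaaaaaaaaaaaaa"  # placeholder

# 1. Create an internet gateway and attach it to the VPC.
igw = ec2.create_internet_gateway()
igw_id = igw["InternetGateway"]["InternetGatewayId"]
ec2.attach_internet_gateway(InternetGatewayId=igw_id, VpcId=VPC_ID)

# 2. Have the public subnet assign public IPs to newly launched instances.
ec2.modify_subnet_attribute(
    SubnetId=PUBLIC_SUBNET_ID, MapPublicIpOnLaunch={"Value": True}
)

# 3. Point the subnet's route table at the internet gateway.
rtb = ec2.create_route_table(VpcId=VPC_ID)
rtb_id = rtb["RouteTable"]["RouteTableId"]
ec2.create_route(RouteTableId=rtb_id, DestinationCidrBlock="0.0.0.0/0", GatewayId=igw_id)
ec2.associate_route_table(RouteTableId=rtb_id, SubnetId=PUBLIC_SUBNET_ID)
```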

    hashtag
    Route Table

    Amazon defines a route table as a set of rules, called routes, which are used to determine where network traffic is directed.

    Each subnet has to be linked to a route table, and a subnet can only be linked to one route table. On the other hand, one route table can have associations with multiple subnets. Every VPC has a default route table, and it is a good practice to leave it in its original state and create a new route table to customize the network traffic routes associated with your VPC. The route table diagram is as shown:

In this example, we've added two route tables: the main route table and a custom route table. The custom route table sends the public subnet's internet-bound traffic to the internet gateway. The private subnet, however, is still associated with the default (main) route table, which does not allow internet traffic, so all traffic inside the private subnet remains local.

    hashtag
    NAT Devices

    A Network Address Translation (NAT) device can be used to enable instances in a private subnet to connect to the internet or the AWS services, but this prevents the internet from initiating connections with the instances in a private subnet.

    As mentioned earlier, public and private subnets protect your assets from being directly connected to the internet. For example, your web server would sit in the public subnet and database in the private subnet, which has no internet connectivity. However, your private subnet database instance might still need internet access or the ability to connect to other AWS resources. You can use a NAT device to do so.

    The NAT device directs traffic from your private subnet to either the internet or other AWS services. It then sends the response back to your instances. When traffic is directed to the internet, the source IP address of your instance is replaced with the NAT device address, and when the internet traffic returns, the NAT device translates the address to your instance’s private IP address.

    NAT device diagram:

    In the diagram, you can see a NAT device is added to the public subnet to get internet connectivity.

    hashtag
    NAT Gateway vs. NAT Device

    AWS provides two kinds of NAT devices:

    • NAT gateway

    • NAT instance

    AWS recommends the NAT gateway because it is a managed service that provides better bandwidth and availability compared to NAT instances. Every NAT gateway is created in a specific availability zone and with redundancy in that zone. A NAT Amazon Machine Image (AMI) is used to launch a NAT instance, and it subsequently runs as an instance in your VPC.

    hashtag
    NAT Gateway

    A NAT gateway must be launched in a public subnet because it needs internet connectivity. It also requires an elastic IP address, which you can select at the time of launch.

    Once created, you need to update the route table associated with your private subnet to point internet-bound traffic to the NAT gateway. This way, the instances in your private subnet can communicate with the internet.
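For illustration, the following boto3 sketch creates a NAT gateway in a public subnet and points the private subnet's route table at it; the subnet and route table IDs are placeholders.

```python
# Minimal sketch: NAT gateway for a private subnet's outbound internet access.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# A NAT gateway needs an Elastic IP and must live in a public subnet.
eip = ec2.allocate_address(Domain="vpc")
natgw = ec2.create_nat_gateway(
    SubnetId="subnet-0aaaaaaaaaaaaaaaa",       # placeholder: public subnet
    AllocationId=eip["AllocationId"],
)
natgw_id = natgw["NatGateway"]["NatGatewayId"]

# Wait until the gateway is available before adding the route.
ec2.get_waiter("nat_gateway_available").wait(NatGatewayIds=[natgw_id])

# Send the private subnet's internet-bound traffic to the NAT gateway.
ec2.create_route(
    RouteTableId="rtb-0bbbbbbbbbbbbbbbb",      # placeholder: private route table
    DestinationCidrBlock="0.0.0.0/0",
    NatGatewayId=natgw_id,
)
```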

    As a part of our learning about the AWS VPS, security groups and network ACLs takes an important role. So, let's dive in.

    hashtag
    Security Groups and Network ACLs

Amazon defines a security group as a virtual firewall that controls the traffic for one or more instances. Rules are added to each security group to allow traffic to or from its associated instances. In short, a security group controls inbound and outbound traffic for one or more EC2 instances. It can be found on both the EC2 and VPC dashboards in the AWS web management console.

    Security group diagram:

    hashtag
    Security Groups for Web Servers

    A web server needs HTTP and HTTPS traffic at the least to access it. The following is an example of the security group table:

Here, we allow HTTP and HTTPS on their associated ports (80 and 443) from any source on the internet. Only traffic to those ports is allowed, so traffic arriving on other ports cannot reach the instances inside the security group.
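For illustration, here is a minimal boto3 sketch of such a web-server security group; the VPC ID is a placeholder.

```python
# Minimal sketch: security group allowing HTTP and HTTPS from anywhere.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

sg = ec2.create_security_group(
    GroupName="web-server-sg",
    Description="Allow HTTP and HTTPS from anywhere",
    VpcId="vpc-0123456789abcdef0",   # placeholder VPC ID
)
sg_id = sg["GroupId"]

ec2.authorize_security_group_ingress(
    GroupId=sg_id,
    IpPermissions=[
        {"IpProtocol": "tcp", "FromPort": 80, "ToPort": 80,
         "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},
        {"IpProtocol": "tcp", "FromPort": 443, "ToPort": 443,
         "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},
    ],
)
```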

    hashtag
    Security Groups for Database Servers

    If we consider a server database, then you need to open the SQL server port to access it.

We've allowed the SQL Server port from the internet as the source. Because it's a Windows machine, you may also need RDP access to log on and do some administration, so we've added RDP to the security group. You could leave RDP open to the internet, but that would mean anyone could try to hack their way into your box. In this example, we've restricted the RDP source to 10.0.0.0, so only addresses from that range can RDP to the instance.

    hashtag
    Security Groups Rules

    There are a few rules associated with security groups, such as:

    • Security groups allow all outbound traffic by default. If you want to tighten your security, this can be done in a similar way as you define the inbound traffic

    • Security group rules are always permissive. You can't create rules that deny access

    • Security groups are stateful. If a request is sent from your instance, the response traffic for that request is allowed to flow in, regardless of the inbound security group rules

    hashtag
    Network ACL

The Network Access Control List (ACL) is an optional security layer for your VPC. It acts as a firewall for controlling traffic flowing to and from one or more subnets. Network ACLs can be set up with rules similar to your security groups.

    Here is the network diagram:

You can find the Network ACLs sitting between the route tables and the subnets. Here is a simplified diagram:

    You can see an example of the default network ACL, which is configured to allow all traffic to flow in and out of the subnet to which it is associated.

Each network ACL includes a rule with an * (asterisk) as the rule number. This rule ensures that if a packet does not match any of the numbered rules, it is denied. You can't modify or remove this rule.

    For traffic coming on the inbound:

    • Rule 100 would allow traffic from all sources

    • Rule * would deny traffic from all sources

    hashtag
    Network ACL Rules

    • Every subnet in your VPC must be associated with an ACL, failing which the subnet gets automatically associated with your default ACL.

    • One subnet can only be linked with one ACL. On the other hand, an ACL can be linked to multiple subnets.

• An ACL has a list of numbered rules that are evaluated in order, starting with the lowest. As soon as a rule matches, it is applied regardless of any higher-numbered rule that may contradict it. AWS recommends incrementing your rule numbers by 100, which leaves plenty of room to insert new rules later, as in the sketch below.
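For illustration, the following boto3 sketch adds numbered rules to a network ACL in increments of 100; the ACL ID and the CIDR range being denied are placeholders.

```python
# Minimal sketch: numbered allow/deny rules on a network ACL.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

ACL_ID = "acl-0123456789abcdef0"   # placeholder network ACL ID

# Rule 100: allow inbound HTTP from anywhere.
ec2.create_network_acl_entry(
    NetworkAclId=ACL_ID,
    RuleNumber=100,
    Protocol="6",                  # TCP
    RuleAction="allow",
    Egress=False,                  # inbound rule
    CidrBlock="0.0.0.0/0",
    PortRange={"From": 80, "To": 80},
)

# Rule 200: explicitly deny all inbound traffic from a specific range.
ec2.create_network_acl_entry(
    NetworkAclId=ACL_ID,
    RuleNumber=200,
    Protocol="-1",                 # all protocols
    RuleAction="deny",
    Egress=False,
    CidrBlock="198.51.100.0/24",   # placeholder range to block
)
```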

    Let us next look into the AWS VPC best practices.

    hashtag
    Amazon VPC Best Practices and Costs

    hashtag
    Public and private subnets

You should use private subnets to secure resources that don't need to be available to the internet, such as database servers, while keeping the flexibility to launch internet-facing services in the public subnets.

    hashtag
    Provide NAT to private subnets

    To provide secure internet access to instances that reside in your private subnets, you should leverage a NAT device.

    hashtag
    Use NAT gateways

    When using NAT devices, you should use the NAT gateway over NAT instances because they are managed services and require less administration. It also provides secure internet access to your private subnets.

    hashtag
    CIDR blocks

You should choose CIDR blocks carefully; an Amazon VPC can contain anywhere from 16 to 65,536 IP addresses (a /28 to a /16 block). Select your CIDR block according to the number of instances you expect to need.

    hashtag
    Create different VPCs for different environments

You should create a separate Amazon VPC for each environment: development, staging, testing, and production. Another option is to create a single Amazon VPC with a separate subnet for each of those environments.

    hashtag
    Understand Amazon VPC limits

    There are various limitations to the VPC components. For example, you're allowed:

    • Five VPCs per region

    • 200 subnets per VPC

    • 200 route tables per VPC

    However, some of these limits can be increased by submitting a ticket to AWS support.

    hashtag
    Use security groups and network ACLs

    You should use security groups and network ACLs to secure the traffic coming in and out of your VPC. Amazon advises using security groups for whitelisting traffic and network ACLs for blacklisting traffic.

    hashtag
    Tier security groups

Amazon recommends tiering your security groups: create a different security group for each tier of your infrastructure architecture inside the VPC. If you have web server and database tiers, create a separate security group for each. Creating tier-wise security groups increases infrastructure security inside the Amazon VPC: launching all your web servers into the web server security group means they automatically have HTTP and HTTPS open, while the database security group has the SQL Server port already open.

    hashtag
    Standardize security group naming conventions

Following a security group naming convention makes Amazon VPC operation and management much easier for large-scale deployments.

    hashtag
    Span Amazon VPC

Make sure your Amazon VPC has subnets in multiple Availability Zones within the region. This helps in architecting high availability inside your VPC.

    hashtag
    Amazon VPC costs

    If you opt to create a hardware VPN connection associated with your VPC using Virtual Private Gateway, you will have to pay for each VPN connection hour that your VPN connection is provisioned and available. Each partial VPN connection hour consumed is billed as a full hour. You'll also incur standard AWS data transfer charges for all data transferred via the VPN connection.

If you create a NAT gateway in your VPC, charges are levied for each hour that the NAT gateway is provisioned and available, and data processing charges apply for each gigabyte processed through it. Each partial NAT gateway hour consumed is billed as a full hour.

    AMI

    hashtag
    AMI

• AMI stands for Amazon Machine Image.

• An AMI is a template (a virtual image) used to launch EC2 instances (virtual machines).

    • You can also create multiple instances using single AMI when you need instances with the same configuration.

    • You can also create multiple instances using different AMI when you need instances with a different configuration.

    • It also provides a template for the root volume of an instance.

    hashtag
    AMI Lifecycle

    • First, you need to create and register an AMI.

    • You can use an AMI to launch EC2 instances.

    • You can also copy an AMI to some different region.

    hashtag
    AMI Types

    AMI is divided into two categories:

    • EBS - backed Instances

    • Instance Store - backed Instances

    hashtag
    EBS - backed Instances

    • EBS is nothing but a volume that provides you persistent storage.

• An EC2 instance's local (instance store) storage is temporary: if the instance is terminated, the data stored on it is lost. To make data persistent, Amazon provides EBS volumes. If you launch an EC2 instance and want some data to survive the instance, you attach an EBS volume so that the data remains available even after the instance is deleted.

• When you launch an EBS-backed EC2 instance, its root device is an EBS volume, which makes the root data persistent independently of the instance (for example across stops, or after termination if the volume is kept).

    hashtag
    Instance Store - backed Instances

• With an instance store-backed instance, the instance uses temporary storage (roughly 1 TB to 2 TB, depending on the instance type). As soon as the instance is terminated, all of that data is lost. For example, if you launch an instance and deploy a database on it, deleting the instance loses all the data, which is a real challenge. In such a scenario, you can attach an additional EBS volume to hold the data, so that even if you delete the instance your data is not lost.

    • In this case, EBS Volume is not a root volume. It's an additional volume that you attach to your EC2 instance manually.

    hashtag
    Why EBS - backed instance is more popular than Instance Store - backed instance?

    hashtag
    Instance Store - backed instances

With an instance store-backed instance, the instance moves from a pending state to a running state and then to a shutting-down state; there is no stopped state. Amazon charges you only while it is running, and nothing once it is terminated. Suppose an instance costs 10 cents per hour and you only need it 4 hours a day: because there is no stopped state, the instance effectively runs 24 hours a day, costing about 72 dollars a month.

    • EBS - backed Instances

With an EBS-backed instance, the instance can be either running or stopped, and Amazon charges you only for the running state. Using the same example of 10 cents per hour for 4 hours a day, the instance runs 120 hours a month, costing about 12 dollars, plus roughly 5 dollars for a 100 GB EBS volume, for a total of about 17 dollars a month.

The EBS-backed instance therefore saves about 55 dollars a month in this example, which is one reason EBS-backed instances are more popular than instance store-backed instances (they also boot faster and can be stopped and restarted).

    hashtag
    Difference b/w Instance store & EBS - backed instance

Characteristics
EBS-backed instance
Instance Store-backed instance

Boot time

Boots quickly, since the root device is an EBS volume

Takes longer to boot, since the image must be copied to the instance store first

Stop/Start

Can be stopped and started again; root volume data persists while stopped

Cannot be stopped; the instance is only ever running or terminated

Data persistence

The root EBS volume can be kept when the instance is terminated

All instance store data is lost when the instance terminates

    EBS

    hashtag
    What is EBS?

    • EBS stands for Elastic Block Store.

    • EC2 is a virtual server in a cloud while EBS is a virtual disk in a cloud.

    • Amazon EBS allows you to create storage volumes and attach them to the EC2 instances.

    • Once the storage volume is created, you can create a file system on the top of these volumes, and then you can run a database, store the files, applications or you can even use them as a block device in some other way.

    • Amazon EBS volumes are placed in a specific availability zone, and they are automatically replicated to protect you from the failure of a single component.

• An EBS volume does not live on a single physical disk; it is replicated within its Availability Zone. An EBS volume is a disk that is attached to an EC2 instance.

• The EBS volume on which Windows or Linux is installed for an EC2 instance is known as the root device volume.
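For illustration, here is a minimal boto3 sketch of creating a volume and attaching it to an instance in the same Availability Zone; the size, Availability Zone, device name, and instance ID are example values.

```python
# Minimal sketch: create a 100 GiB gp2 volume and attach it to an instance.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

volume = ec2.create_volume(
    AvailabilityZone="us-east-1a",   # must match the instance's AZ
    Size=100,                        # GiB
    VolumeType="gp2",
)
volume_id = volume["VolumeId"]

# Wait until the volume is available before attaching it.
ec2.get_waiter("volume_available").wait(VolumeIds=[volume_id])

ec2.attach_volume(
    VolumeId=volume_id,
    InstanceId="i-0123456789abcdef0",   # placeholder instance ID
    Device="/dev/xvdf",                 # example device name
)
```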

    hashtag
    EBS Volume Types

    Amazon EBS provides two types of volume that differ in performance characteristics and price. EBS Volume types fall into two parts:

    • SSD-backed volumes

    • HDD-backed volumes

    hashtag
    SSD

• SSD stands for Solid-State Drive.

    • In June 2014, SSD storage was introduced.

    • It is a general purpose storage.

    SSD is further classified into two parts:

    • General Purpose SSD

    • Provisioned IOPS SSD

    General Purpose SSD

    • General Purpose SSD is also sometimes referred to as a GP2.

    • It is a General purpose SSD volume that balances both price and performance.

• General Purpose SSD delivers a baseline of 3 IOPS per GiB, and smaller volumes can burst up to 3,000 IOPS for extended periods; volumes of 3,334 GiB and above reach the maximum baseline of 10,000 IOPS. If your workload needs fewer than 10,000 IOPS, GP2 usually gives the best balance of price and performance.

    Provisioned IOPS SSD

    • It is also referred to as IO1.

• It is mainly used for high-performance, I/O-intensive applications such as large relational or NoSQL databases that need sustained IOPS beyond what General Purpose SSD provides.


    hashtag
    HDD

    • It stands for Hard Disk Drive.

    • HDD based storage was introduced in 2008.

• The size of HDD-based storage can be between 1 GB and 1 TB.

    hashtag
    Throughput Optimized HDD (st1)

    • It is also referred to as ST1.

    • Throughput Optimized HDD is a low-cost HDD designed for those applications that require higher throughput up to 500 MB/s.

    • It is useful for those applications that require the data to be frequently accessed.

    hashtag
    Cold HDD (sc1)

    • It is also known as SC1.

    • It is the lowest cost storage designed for the applications where the workloads are infrequently accessed.

    • It is useful when data is rarely accessed.

    hashtag
    Magnetic Volume

    • It is the lowest cost storage per gigabyte of all EBS volume types.

    • It is ideal for the applications where the data is accessed infrequently

    • It is useful for applications where the lowest storage cost is important.

    AWS Route53

    hashtag
    What is DNS?

    • DNS stands for Domain Name System.

• DNS is used whenever you use the internet. It converts human-friendly domain names into Internet Protocol (IP) addresses.

    • IP addresses are used by computers to identify each other on the network.

• IP addresses are of two types, i.e., IPv4 and IPv6.

    hashtag
    Top Level Domains

• A domain name is a string of characters separated by dots, for example google.com or gmail.com.

    • The last word in a domain name is known as a Top Level Domain.

    • The second word in a domain name is known as a second level domain name.

    hashtag
    For example:

    .com: .com is a top-level domain.

    .edu: .edu is a top-level domain.

    .gov: .gov is a top-level domain.

    .co.uk: .uk is a top-level domain name while .co is a second level domain name.

    .gov.uk: .uk is a top-level domain name while .gov is a second level domain name.

    • The Top level domain names are controlled by IANA (Internet Assigned Numbers Authority).

• IANA maintains the root zone database of all available top-level domains.

    • You can view the database by visiting the site: http://www.iana.org/domains/root/db

    hashtag
    Domain Registrars

• A domain registrar is an authority that assigns domain names directly under one or more top-level domains.

• Registrars exist because every domain name must be unique, so there needs to be a way to organize and allocate domain names without duplication.

• Domain names are registered with InterNIC, a service of ICANN, which enforces the uniqueness of domain names across the internet.

    hashtag
Start of Authority Record (SOA)

• The SOA record stores information about the DNS zone and its other DNS records, where a DNS zone is the portion of the domain namespace managed by a particular set of name servers.

    • Each DNS zone consists of a single SOA record.

    hashtag
The Start of Authority Record stores information about:

    • The name of the server that supplies the data for the zone.

    • The administrator of the zone, i.e., who is administering the zone.

    • The current version of the data file that contains the zone.

    hashtag
    NS Records

    • NS stands for Name Server records.

    • NS Records are used by Top Level Domain Servers to direct traffic to the Content DNS server which contains the authoritative DNS records.

    hashtag
    Let's understand through a simple example.

Suppose a user wants the IP address of hindi100.com. If the ISP's resolver does not know the IP address, it asks the .com top-level domain servers for the NS record. It gets back the NS record, for example ns.awsdns.com, with a time-to-live such as 172800 seconds. The resolver then queries that name server, which points it to Route 53, and the Route 53 hosted zone (starting from its SOA record) contains all the DNS record types, including the 'A' records.

    hashtag
    A Records

    • An 'A' record is a fundamental type of DNS record.

    • 'A' stands for Address.

    • An 'A' record is used by the computer to convert the domain name into an IP address. For example, https://www.javatpoint.com might point to http://123.10.10.80.

    hashtag
    TTL

• The length of time that a DNS record is cached, whether on the resolving server or on the user's own local PC, is equal to the value of the TTL (time to live) in seconds.

    • The lower the time-to-live, the faster changes to DNS records take to propagate throughout the internet.

    hashtag
    CNAMES

• A CNAME can be used to resolve one domain name to another. For example, you may have a mobile website with the domain name http://m.devices.com which is used when users browse to your domain on their mobile devices. You may also want the name http://mobile.devices.com to resolve to the same address.

    hashtag
    Alias Records

    • Alias Records are used to map resource record sets in your hosted zone to Elastic load balancers, CloudFront distributions, or S3 buckets that are configured as websites.

    • Alias records work like a CNAME record in that you can map one DNS name (http://www.example.com) to another target DNS name (elb1234.elb.amazonaws.com).

• The key difference between a CNAME and an Alias record is that a CNAME cannot be used for a naked domain name (zone apex), i.e., a name with nothing written in front of the domain, such as example.com; it only works for names like http://www.example.com. Alias records can be used at the zone apex, as in the sketch below.
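For illustration, the following boto3 sketch creates an Alias A record at the zone apex pointing at a load balancer; the hosted zone ID, domain name, and the load balancer's DNS name and hosted zone ID are all placeholders.

```python
# Minimal sketch: an Alias A record at the zone apex targeting an ELB.
import boto3

route53 = boto3.client("route53")

route53.change_resource_record_sets(
    HostedZoneId="Z0000000000000000000",   # placeholder hosted zone ID
    ChangeBatch={
        "Changes": [
            {
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": "example.com.",          # naked domain (zone apex)
                    "Type": "A",
                    "AliasTarget": {
                        "HostedZoneId": "Z00000000000000",            # the ELB's own hosted zone ID (placeholder)
                        "DNSName": "elb1234.elb.amazonaws.com.",       # placeholder ELB DNS name
                        "EvaluateTargetHealth": False,
                    },
                },
            }
        ]
    },
)
```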

    hashtag
    What Is Amazon Route 53?

Route 53 is a web service that is a highly available and scalable Domain Name System (DNS).

    Let’s understand what is Amazon Route 53 in technical terms. AWS Route 53 lets developers and organizations route end users to their web applications in a very reliable and cost-effective manner. It is a Domain Name System (DNS) that translates domain names into IP addresses to direct traffic to your website. In simple terms, it converts World Wide Web addresses like www.example.com to IP addresses like 10.20.30.40.

    Basically, domain queries are automatically routed to the nearest DNS server to provide the quickest response possible. If you use a web hosting company like GoDaddy, it takes 30 minutes to 24 hours to remap a domain to a different IP, but by using Route 53 in AWS it takes only a few minutes.


    hashtag
    How Amazon Route 53 works?

AWS Route 53 connects requests to infrastructure running in AWS, such as Elastic Load Balancing load balancers, Amazon EC2 instances, or Amazon S3 buckets. In addition, AWS Route 53 can route users to infrastructure outside of AWS.

    AWS Route 53 can be easily used to configure DNS health checks, continuously monitor your applications’ ability to recover from failures, and control application recovery with Route 53 Application Recovery Controller. Further, AWS Route 53 traffic flow helps to manage traffic globally via a wide variety of routing types including latency-based routing, geo DNS, weighted round-robin, and geo proximity. All these routing types can be easily combined with DNS Failover in order to enable a variety of low-latency, fault-tolerant architectures.

    Let us understand, step by step, how does AWS Route 53 work:

    • A user accesses www.example.com, an address managed by Route 53, which leads to a machine on AWS.

    • The request for www.example.com is routed to the user’s DNS resolver, typically managed by the ISP or local network, and is forwarded to a DNS root server.

    • The DNS resolver forwards the request to the TLD name servers for “.com” domains.

    Now, take a look at the benefits provided by Route 53.

    hashtag
    Amazon Route 53 Benefits

    Route 53 provides the user with several benefits.

    They are:

    • Highly Available and Reliable

    • Flexible

    • Simple

    Highly Available and Reliable

    • AWS Route 53 is built using AWS’s highly available and reliable infrastructure. DNS servers are distributed across many availability zones, which helps in routing end users to your website consistently.

    • Amazon Route 53 Traffic Flow service helps improve reliability with easy re-route configuration when the system fails.

    Flexible

    • Route 53 Traffic Flow provides users flexibility in choosing traffic policies based on multiple criteria, such as endpoint health, geographic location, and latency.

    Simple

    • Your DNS queries are answered by Route 53 in AWS within minutes of your setup, and it is a self-service sign-up.

    • Also, you can use the simple AWS Route 53 API and embed it in your web application too.

• Route 53 DNS servers are distributed around the world, which keeps latency low because users are routed to the nearest available DNS server.

    Cost-effective

    • You only pay for what you use, for example, the hosted zones managing your domains, the number of queries that are answered per domain, etc.

    • Also, optional features like traffic policies and health checks are available at a very low cost.

    Designed to Integrate with Other AWS Services

• Route 53 works very well with other AWS services such as Amazon EC2 and Amazon S3.

    • For example, you can use Route 53 to map your domain names or IP addresses to your EC2 instances and Amazon S3 buckets.

    Secure

• You can create and grant unique credentials and permissions to every user of your AWS account, specifying who has access to which parts of the service.

    Scalable

    • Amazon Route 53 is designed to automatically scale up or down when the query volume size varies.

    These are the benefits that Amazon Route 53 provides, moving on with this what is Amazon Route 53 tutorial, let’s discuss the AWS routing policies.

    hashtag
    Amazon Route 53 Limitations

    Amazon Route 53 is a robust DNS service with advanced features, but it has several limitations as well. Some of them are discussed below:

• No DNSSEC support: DNSSEC stands for Domain Name System Security Extensions, a suite of extension specifications from the Internet Engineering Task Force used to secure the data exchanged in DNS over Internet Protocol networks. It is not supported by AWS Route 53.

    • Forwarding options: Route 53 does not provide forwarding or conditional forwarding options for domains used on an on-premise network.

    • Single point of failure: Used in conjunction with other AWS services, Route 53 may become a single point of failure. This becomes a major problem for AWS route 53 disaster recovery and other relevant issues.

    hashtag
    AWS Route 53 Alternatives

    When buying a solution, buyers often compare and evaluate similar products by different market players based on certain parameters such as specific product capabilities, integration, contracting, ease of deployment, and offered support and services. Based on the mentioned parameters and a few more, we have listed some potential AWS Route 53 alternatives below:

    • Azure DNS: It allows you to host your DNS domain in Azure. This helps to manage DNS records by using the same credentials, billing, and support contract just as other Azure services.

    • Cloudflare DNS: As a potential alternative to AWS Route 53, Cloudflare DNS is described as the fastest, privacy-first consumer DNS service. It is a free-of-charge service for ordinary people; however, professionals and enterprises have to take up a monthly subscription.

    • Google Cloud DNS: Google Cloud DNS is a scalable, reliable, and managed authoritative DNS service that runs on the same infrastructure as Google.

    hashtag
    Does Avi Offer Route 53 Monitoring Capabilities?

    Avi Vantage is a next-generation, full-featured elastic application services platform that offers a range of application services such as security, monitoring and analytics, load balancing, and multi-cloud traffic management for workloads. These workloads can be deployed in bare metal, virtualized, or container environments, in a data center or in a public cloud such as AWS. Avi Vantage delivers full-featured load balancing capabilities in an as-a-service experience, along with easily integrated Web Application Firewall (WAF) capabilities.

    Enterprises often leverage the power of AWS in order to maximize and modernize infrastructure utilization. The next phase of this modernization is represented by extending app-centricity to the networking stack.

    Avi Networks integrates with AWS Route 53 and delivers elastic application services that extend beyond load balancing to deliver real-time app and security insights, simplify troubleshooting, enable developer self-service, and support automation.

    hashtag
    Amazon Route 53 Resolver for Hybrid Cloud

    In a typical hybrid cloud environment, the user connects a private data center to one of their Amazon VPCs using a managed VPN or AWS Direct Connect. Even though this connection to AWS is already established, DNS lookups performed across it often fail. As a result, some users reroute requests from on-premises DNS servers to a DNS server running inside the Amazon VPC. Amazon Route 53 Resolver addresses this: it can perform outbound resolution from the VPC to the data center and inbound resolution from an on-premises source to the VPC.

    Some of the advantages of AWS Route 53 resolver are as follows:

    Security: AWS Route 53 Resolver benefits from the added security of AWS Identity and Access Management (IAM), which allows secure, controlled user access to all web resources and services. IAM can also assign specific permissions to allow or deny access to AWS resources and manage the creation of AWS users and groups.

    Cost: AWS Route 53 proves to be really cost-effective as it redirects website requests without extra hardware and does not charge for queries to distributions, ELBs, S3 buckets, VPC endpoints, and other AWS resources.

    Reliability: All features of Route 53, such as geographically-based and latency-based policies, are designed to be highly reliable and cost-effective. In addition to this, Amazon Route 53 is designed to help the system stay running in a coordinated way with all the other AWS services.

    hashtag
    AWS Routing Policies

    There are several types of routing policies. The below list provides the routing policies which are used by AWS Route 53.

    • Simple Routing

    • Latency-based Routing

    • Geolocation Routing

    Simple Routing

    Simple routing responds to DNS queries based only on the values in the resource record set, for example a single IP address. Use the simple routing policy when you have a single resource that performs a given function for your domain.

    Latency-based Routing

    If an application is hosted on EC2 instances in multiple regions, user latency can be reduced by serving requests from the region where network latency is the lowest. Create a latency resource record set for the Amazon EC2 resource in each region that hosts the application. Note that measured latency can change over time as network routes change.


    Geolocation Routing

    Geolocation routing can be used to send traffic to resources based on the geographical location of users, e.g., all queries from Europe can be routed to the IP address 10.20.30.40. Geolocation works by mapping IP addresses, irrespective of regions, to locations.
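    To make these policies concrete, below is a minimal sketch using Python and boto3 that upserts one latency-based record and one geolocation record into a hosted zone. The hosted zone ID, record names, and IP addresses are placeholders for illustration only.

    ```python
    import boto3

    route53 = boto3.client("route53")

    # Placeholder hosted zone ID -- replace with your own.
    HOSTED_ZONE_ID = "Z0123456789EXAMPLE"

    route53.change_resource_record_sets(
        HostedZoneId=HOSTED_ZONE_ID,
        ChangeBatch={
            "Comment": "Latency-based and geolocation routing examples",
            "Changes": [
                {   # Latency-based: answer with this record when us-east-1 is closest.
                    "Action": "UPSERT",
                    "ResourceRecordSet": {
                        "Name": "latency.example.com",
                        "Type": "A",
                        "SetIdentifier": "us-east-1-endpoint",
                        "Region": "us-east-1",
                        "TTL": 60,
                        "ResourceRecords": [{"Value": "203.0.113.10"}],
                    },
                },
                {   # Geolocation: route all queries originating in Europe to this IP.
                    "Action": "UPSERT",
                    "ResourceRecordSet": {
                        "Name": "geo.example.com",
                        "Type": "A",
                        "SetIdentifier": "europe-users",
                        "GeoLocation": {"ContinentCode": "EU"},
                        "TTL": 60,
                        "ResourceRecords": [{"Value": "10.20.30.40"}],
                    },
                },
            ],
        },
    )
    ```

    Note that all records with the same name and type must share one routing policy, which is why the sketch uses two different record names.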

    Now, you understood that Route 53 in AWS maps the end user to an IP address or a domain name. But, where are the routes stored?

    hashtag
    AWS Route Tables

    An AWS route table contains a set of rules or routes, which is used to determine where the network traffic is directed to.

    All subnets in your VPC have to be associated with an AWS route table, and the table controls routing for those particular subnets. A subnet cannot be associated with multiple route tables at the same time, but multiple subnets can be associated with a single AWS route table. An AWS route table consists of the destination IP address range and the target, as illustrated in the sketch below.
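    As a rough illustration of these rules, the boto3 sketch below creates a route table in a VPC, adds a default route pointing at an internet gateway, and associates the table with a subnet. The VPC, gateway, and subnet IDs are placeholders.

    ```python
    import boto3

    ec2 = boto3.client("ec2")

    # Placeholder IDs -- substitute resources from your own VPC.
    vpc_id, igw_id, subnet_id = "vpc-0abc1234", "igw-0abc1234", "subnet-0abc1234"

    # Create a route table inside the VPC.
    route_table = ec2.create_route_table(VpcId=vpc_id)["RouteTable"]
    rtb_id = route_table["RouteTableId"]

    # Destination + target: send all non-local traffic to the internet gateway.
    ec2.create_route(
        RouteTableId=rtb_id,
        DestinationCidrBlock="0.0.0.0/0",
        GatewayId=igw_id,
    )

    # A subnet is associated with exactly one route table at a time.
    ec2.associate_route_table(RouteTableId=rtb_id, SubnetId=subnet_id)
    ```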

    That is how the routes are stored. Now, what key features make Route 53 special?

    hashtag
    AWS Route 53 Key Features

    • Traffic Flow

    You can route end users to the best endpoint possible according to your application’s geo proximity, latency, health, and other considerations.

    • Latency-based Routing

    You can route end users to the AWS region with the lowest possible latency.

    • Geo DNS

    You can route your end users to the endpoint which is present in their specific region or the nearest geographic location.

    • DNS Failover

    You can route your end users to an alternate location to avoid website crashes or outages.

    • Health Checks and Monitoring

    The health and performance of your website or application are monitored by Amazon Route 53. Your servers can be monitored as well.

    • Domain Registration

    You can search for and register available domain names using Amazon Route 53. A full list of currently available top-level domains (TLDs) is provided with the current pricing.

    hashtag
    Hands-on: Creating a Hosted Zone

    Step 1: Log in to the AWS Management Console

    Step 2: Click on Route 53 in the Services drop-down

    Now, go to Freenom or any other site where you can get a domain name. Freenom is completely free; for a demo, just use a domain from Freenom.

    Step 3: Go to Route 53 dashboard and click on Create Hosted Zone

    Step 4: Provide the domain you have created in the domain field and keep the website as a public hosted site

    Step 5: Now, you will have a nameserver (NS) and Start of Authority (SOA) type recordsets. Copy the content of the nameserver value textbox and paste it in the Custom nameservers of your domain name

    After pasting the nameservers, click on Change Nameservers. Remove the dots at the end of your nameserver values in both places.

    Step 6: Create two record sets of type ‘A’. Leave the name of one empty (the bare domain) and prefix the other with ‘www’ so that both domain names resolve to the EC2 instance IP address you have provided. If you want to know how to create an EC2 instance, check out the AWS EC2 blog and follow the hands-on steps mentioned there.

    Step 7: After completing all these steps, type the domain name in your browser’s URL bar. As you can see, the website is now online and available publicly on the Internet.

    You have successfully hosted your first website!
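    The same hosted-zone setup can also be scripted. The minimal boto3 sketch below creates a public hosted zone and prints the nameservers you would copy into your registrar; the domain name is a placeholder.

    ```python
    import uuid
    import boto3

    route53 = boto3.client("route53")

    # CallerReference must be unique per request; a random UUID works well.
    response = route53.create_hosted_zone(
        Name="example.com",                 # placeholder domain
        CallerReference=str(uuid.uuid4()),
        HostedZoneConfig={"Comment": "Demo public hosted zone", "PrivateZone": False},
    )

    zone_id = response["HostedZone"]["Id"]
    nameservers = response["DelegationSet"]["NameServers"]

    print("Hosted zone:", zone_id)
    print("Copy these nameservers to your registrar:")
    for ns in nameservers:
        print(" -", ns)
    ```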

    In this Amazon Route 53 tutorial, we have discussed the concepts of Route 53, how it works, AWS route tables, and the key features provided by Amazon Route 53. Keep visiting for more tutorials on the services offered by AWS.

    SeMA Deployment Architecture

    A deployment architecture depicts the mapping of a logical architecture to a physical environment. The physical environment includes the computing nodes in an intranet or Internet environment, CPUs, memory, storage devices, and other hardware and network devices.

    Designing the deployment architecture involves sizing the deployment to determine the physical resources necessary to meet the system requirements specified during the technical requirements phase. You also optimize resources by analyzing the results of sizing the deployment to create a design that provides the best use of resources within business constraints.

    1. SeMA application sizing-estimation process .

    2. Deployment Logical Architecture .

    3. HW Sizing Specs .

    4. Deployment Process .

    hashtag
    Phishing simulations

    Besides introducing real-world customizable phishing simulations, Entrench offers an anti-phishing behavior management feature which allows you to measure the awareness level of your employees.

    hashtag
    Training Modules

    Entrench training modules are offered in a variety of styles with full animation and narratives in many languages.

    hashtag
    Filmed Videos

    Security awareness sessions are no longer boring or dull. The sessions we offer are based on simplicity, humor, ease of use, and storytelling.

    hashtag
    Quizzes and Exams

    Entrench quizzes and exams measure the level of awareness that has been conveyed to the users.

    hashtag
    Customized Campaigns

    Awareness campaigns can be customized and targeted at either a specific group of users or an entire department.

    ZiSoft is composed of several independent modules, so you can pick and choose which modules you need to deploy. All modules are accessible from the same familiar user interface. The ZiSoft modules are:

    1. Learning Management System

    2. Security Awareness Content

    3. Email Phishing Simulator

    hashtag
    Learning Management System

    Think of ZiSoft LMS as your own private 'Coursera' or 'Udemy'. You use it to create courses complete with videos, slides, quizzes, exams, and certificates. Then you assign the courses to your employees and monitor their training in real time. You can assign as many lessons as you need to as many employees as you need, and you can generate comprehensive reports in a variety of formats and charts.

    hashtag
    Security Awareness Content

    On top of the LMS, ZiSoft offers its security awareness content. These are predefined courses and exams that target security-related topics such as 'Email Security', 'Data Leakage', 'Device Security', and 'Browser Security', among many others.

    hashtag
    Email Phishing Simulator

    You are under a constant stream of phishing attacks whether you believe it or not. If you want to make sure, try our phishing simulator. With this simulator, you can send a 'controlled' and 'harmless' phishing email to a group of your organization's users and monitor in real time how many of them fall victim to the attack. The simulator can generate a report that tells you the percentage of your organization that is prone to phishing attacks, rank departments according to their vulnerability, and run competitions that encourage employees to get better at spotting phishing attacks.

    EC2 Auto Scaling

    Amazon EC2 Auto Scaling helps you ensure that you have the correct number of Amazon EC2 instances available to handle the load for your application. You create collections of EC2 instances, called Auto Scaling groups. You can specify the minimum number of instances in each Auto Scaling group, and Amazon EC2 Auto Scaling ensures that your group never goes below this size. You can specify the maximum number of instances in each Auto Scaling group, and Amazon EC2 Auto Scaling ensures that your group never goes above this size. If you specify the desired capacity, either when you create the group or at any time thereafter, Amazon EC2 Auto Scaling ensures that your group has this many instances. If you specify scaling policies, then Amazon EC2 Auto Scaling can launch or terminate instances as demand on your application increases or decreases.

    For example, the following Auto Scaling group has a minimum size of one instance, a desired capacity of two instances, and a maximum size of four instances. The scaling policies that you define adjust the number of instances, within your minimum and maximum number of instances, based on the criteria that you specify.
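    A minimal boto3 sketch of the group described above (min 1, desired 2, max 4) is shown below, together with a target-tracking scaling policy on average CPU. The launch template name and subnet IDs are placeholders.

    ```python
    import boto3

    autoscaling = boto3.client("autoscaling")

    # Create the group: never fewer than 1 instance, never more than 4, start with 2.
    autoscaling.create_auto_scaling_group(
        AutoScalingGroupName="web-asg",
        LaunchTemplate={"LaunchTemplateName": "web-template", "Version": "$Latest"},
        MinSize=1,
        DesiredCapacity=2,
        MaxSize=4,
        VPCZoneIdentifier="subnet-0abc1234,subnet-0def5678",  # placeholder subnets
    )

    # Scaling policy: add or remove instances to keep average CPU around 50%.
    autoscaling.put_scaling_policy(
        AutoScalingGroupName="web-asg",
        PolicyName="target-50-percent-cpu",
        PolicyType="TargetTrackingScaling",
        TargetTrackingConfiguration={
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ASGAverageCPUUtilization"
            },
            "TargetValue": 50.0,
        },
    )
    ```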

    For more information about the benefits of Amazon EC2 Auto Scaling, see Amazon EC2 Auto Scaling benefits in the AWS documentation.

    hashtag
    Auto Scaling components

    The key components of Amazon EC2 Auto Scaling are Auto Scaling groups, configuration templates (launch templates or launch configurations), and scaling options.

    hashtag
    Pricing for Amazon EC2 Auto Scaling

    There are no additional fees with Amazon EC2 Auto Scaling, so it's easy to try it out and see how it can benefit your AWS architecture. You only pay for the AWS resources (for example, EC2 instances, EBS volumes, and CloudWatch alarms) that you use.

    hashtag
    Get started

    To begin, complete the tutorial to create an Auto Scaling group and see how it responds when an instance in that group terminates.

    hashtag
    Related services

    To automatically distribute incoming application traffic across multiple instances in your Auto Scaling group, use Elastic Load Balancing. For more information, see the Elastic Load Balancing documentation.

    To monitor your Auto Scaling groups and instance utilization data, use Amazon CloudWatch. For more information, see the Amazon CloudWatch documentation.

    To configure auto scaling for scalable AWS resources beyond Amazon EC2, see the Application Auto Scaling documentation.

    hashtag
    Work with Auto Scaling groups

    You can create, access, and manage your Auto Scaling groups using any of the following interfaces:

    • AWS Management Console – Provides a web interface that you can use to access your Auto Scaling groups. If you've signed up for an AWS account, you can access your Auto Scaling groups by signing into the AWS Management Console, using the search box on the navigation bar to search for Auto Scaling groups, and then choosing Auto Scaling groups.

    • AWS Command Line Interface (AWS CLI) – Provides commands for a broad set of AWS services, and is supported on Windows, macOS, and Linux. To get started, see the AWS Command Line Interface User Guide. For more information, see the autoscaling commands in the AWS CLI Command Reference.

    Elastic File System

    hashtag
    What is Amazon Elastic File System?

    Amazon Elastic File System (Amazon EFS) provides serverless, fully elastic file storage so that you can share file data without provisioning or managing storage capacity and performance. Amazon EFS is built to scale on demand to petabytes without disrupting applications, growing and shrinking automatically as you add and remove files. Because Amazon EFS has a simple web services interface, you can create and configure file systems quickly and easily. The service manages all the file storage infrastructure for you, meaning that you can avoid the complexity of deploying, patching, and maintaining complex file system configurations.

    Amazon EFS supports the Network File System version 4 (NFSv4.1 and NFSv4.0) protocol, so the applications and tools that you use today work seamlessly with Amazon EFS. Multiple compute instances, including Amazon EC2, Amazon ECS, and AWS Lambda, can access an Amazon EFS file system at the same time. Therefore, an EFS file system can provide a common data source for workloads and applications that are running on more than one compute instance or server.

    With Amazon EFS, you pay only for the storage used by your file system and there is no minimum fee or setup cost. Amazon EFS offers a range of storage classes designed for different use cases. These include:

    • Standard storage classes – EFS Standard and EFS Standard–Infrequent Access (Standard–IA), which offer Multi-AZ resilience and the highest levels of durability and availability.

    • One Zone storage classes – EFS One Zone and EFS One Zone–Infrequent Access (EFS One Zone–IA), which offer you the choice of additional savings by choosing to save your data in a single Availability Zone.

    For more information, see EFS storage classes. Costs related to Provisioned Throughput are determined by the throughput values that you specify. For more information, see Amazon EFS Pricing.

    Amazon EFS is designed to provide the throughput, IOPS, and low latency needed for a broad range of workloads. With Amazon EFS, you can choose from two performance modes and three throughput modes:

    • The default General Purpose performance mode is the recommended mode. General Purpose is ideal for latency-sensitive use cases, like web-serving environments, content-management systems, home directories, and general file serving.

    • File systems in the Max I/O performance mode can scale to higher levels of aggregate throughput and operations per second. However, these file systems have higher latencies for file system operations. For more information, see Performance modes.

    • With the default Bursting Throughput mode, throughput scales with the amount of storage in your file system and supports bursting to higher levels for up to 12 hours per day.

    • With Elastic Throughput mode, Amazon EFS automatically scales throughput performance up or down to meet the needs of your workload activity.

    • With Provisioned Throughput mode, you specify a level of throughput that the file system can drive independent of the file system's size or burst credit balance. For more information, see Throughput modes.

    The service is designed to be highly scalable, highly available, and highly durable. Amazon EFS file systems using Standard storage classes store data and metadata across multiple Availability Zones in an AWS Region. EFS file systems can grow to petabyte scale, drive high levels of throughput, and allow massively parallel access from compute instances to your data.

    Amazon EFS provides file-system-access semantics, such as strong data consistency and file locking. For more information, see Data consistency in Amazon EFS. Amazon EFS also supports controlling access to your file systems through Portable Operating System Interface (POSIX) permissions. For more information, see Security in Amazon EFS.

    Amazon EFS supports authentication, authorization, and encryption capabilities to help you meet your security and compliance requirements. Amazon EFS supports two forms of encryption for file systems: encryption in transit and encryption at rest. You can enable encryption at rest when creating an Amazon EFS file system. If you do, all your data and metadata is encrypted. You can enable encryption in transit when you mount the file system. NFS client access to EFS is controlled by both AWS Identity and Access Management (IAM) policies and network security policies, such as security groups. For more information, see Data encryption in Amazon EFS, Identity and access management for Amazon Elastic File System, and Controlling network access to Amazon EFS file systems for NFS clients.
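    As a small illustration of provisioning, the boto3 sketch below creates an encrypted General Purpose file system in Elastic Throughput mode and adds one mount target. The subnet and security group IDs are placeholders, and EC2 instances in that subnet would then mount the file system over NFSv4.1.

    ```python
    import boto3

    efs = boto3.client("efs")

    # Create an encrypted file system using the default General Purpose mode
    # and Elastic Throughput, which scales throughput with workload activity.
    fs = efs.create_file_system(
        CreationToken="demo-efs-001",          # idempotency token (placeholder)
        PerformanceMode="generalPurpose",
        ThroughputMode="elastic",
        Encrypted=True,
        Tags=[{"Key": "Name", "Value": "shared-efs"}],
    )
    fs_id = fs["FileSystemId"]

    # In practice, wait until the file system's LifeCycleState is 'available'
    # before adding mount targets. One mount target per Availability Zone lets
    # instances in that AZ mount the file system over NFS (TCP 2049).
    efs.create_mount_target(
        FileSystemId=fs_id,
        SubnetId="subnet-0abc1234",            # placeholder subnet
        SecurityGroups=["sg-0abc1234"],        # must allow inbound NFS traffic
    )
    ```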

    EC2 placement group

    When you launch a new EC2 instance, the EC2 service attempts to place the instance in such a way that all of your instances are spread out across underlying hardware to minimize correlated failures. You can use placement groups to influence the placement of a group of interdependent instances to meet the needs of your workload. Depending on the type of workload, you can create a placement group using one of the following placement strategies:

    • Cluster – packs instances close together inside an Availability Zone. This strategy enables workloads to achieve the low-latency network performance necessary for tightly-coupled node-to-node communication that is typical of high-performance computing (HPC) applications.

    • Partition – spreads your instances across logical partitions such that groups of instances in one partition do not share the underlying hardware with groups of instances in different partitions. This strategy is typically used by large distributed and replicated workloads, such as Hadoop, Cassandra, and Kafka.

    • Spread – strictly places a small group of instances across distinct underlying hardware to reduce correlated failures.

    There is no charge for creating a placement group.

    hashtag
    Placement group strategies

    You can create a placement group using one of the following placement strategies:

    hashtag
    Cluster placement groups

    A cluster placement group is a logical grouping of instances within a single Availability Zone. A cluster placement group can span peered virtual private clouds (VPCs) in the same Region. Instances in the same cluster placement group enjoy a higher per-flow throughput limit for TCP/IP traffic and are placed in the same high-bisection bandwidth segment of the network.

    The following image shows instances that are placed into a cluster placement group.

    Cluster placement groups are recommended for applications that benefit from low network latency, high network throughput, or both. They are also recommended when the majority of the network traffic is between the instances in the group. To provide the lowest latency and the highest packet-per-second network performance for your placement group, choose an instance type that supports enhanced networking. For more information, see the enhanced networking documentation for Amazon EC2.

    We recommend that you launch your instances in the following way:

    • Use a single launch request to launch the number of instances that you need in the placement group.

    • Use the same instance type for all instances in the placement group.

    If you try to add more instances to the placement group later, or if you try to launch more than one instance type in the placement group, you increase your chances of getting an insufficient capacity error.

    If you stop an instance in a placement group and then start it again, it still runs in the placement group. However, the start fails if there isn't enough capacity for the instance.

    If you receive a capacity error when launching an instance in a placement group that already has running instances, stop and start all of the instances in the placement group, and try the launch again. Starting the instances may migrate them to hardware that has capacity for all of the requested instances.

    hashtag
    Partition placement groups

    Partition placement groups help reduce the likelihood of correlated hardware failures for your application. When using partition placement groups, Amazon EC2 divides each group into logical segments called partitions. Amazon EC2 ensures that each partition within a placement group has its own set of racks. Each rack has its own network and power source. No two partitions within a placement group share the same racks, allowing you to isolate the impact of hardware failure within your application.

    The following image is a simple visual representation of a partition placement group in a single Availability Zone. It shows instances that are placed into a partition placement group with three partitions—Partition 1, Partition 2, and Partition 3. Each partition comprises multiple instances. The instances in a partition do not share racks with the instances in the other partitions, allowing you to contain the impact of a single hardware failure to only the associated partition.

    Partition placement groups can be used to deploy large distributed and replicated workloads, such as HDFS, HBase, and Cassandra, across distinct racks. When you launch instances into a partition placement group, Amazon EC2 tries to distribute the instances evenly across the number of partitions that you specify. You can also launch instances into a specific partition to have more control over where the instances are placed.

    A partition placement group can have partitions in multiple Availability Zones in the same Region. A partition placement group can have a maximum of seven partitions per Availability Zone. The number of instances that can be launched into a partition placement group is limited only by the limits of your account.

    In addition, partition placement groups offer visibility into the partitions — you can see which instances are in which partitions. You can share this information with topology-aware applications, such as HDFS, HBase, and Cassandra. These applications use this information to make intelligent data replication decisions for increasing data availability and durability.

    If you start or launch an instance in a partition placement group and there is insufficient unique hardware to fulfill the request, the request fails. Amazon EC2 makes more distinct hardware available over time, so you can try your request again later.

    hashtag
    Spread placement groups

    A spread placement group is a group of instances that are each placed on distinct hardware.

    Spread placement groups are recommended for applications that have a small number of critical instances that should be kept separate from each other. Launching instances in a spread level placement group reduces the risk of simultaneous failures that might occur when instances share the same equipment. Spread level placement groups provide access to distinct hardware, and are therefore suitable for mixing instance types or launching instances over time.

    If you start or launch an instance in a spread placement group and there is insufficient unique hardware to fulfill the request, the request fails. Amazon EC2 makes more distinct hardware available over time, so you can try your request again later. Placement groups can spread instances across racks or hosts. You can use host level spread placement groups only with AWS Outposts.

    Rack spread level placement groups

    The following image shows seven instances in a single Availability Zone that are placed into a spread placement group. The seven instances are placed on seven different racks, and each rack has its own network and power source.

    A rack spread placement group can span multiple Availability Zones in the same Region. For rack spread level placement groups, you can have a maximum of seven running instances per Availability Zone per group.
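    The strategies above map directly onto the EC2 API. The boto3 sketch below creates one placement group of each kind and launches an instance into the cluster group; the group names, AMI ID, and instance type are placeholders.

    ```python
    import boto3

    ec2 = boto3.client("ec2")

    # Cluster: pack instances close together for low-latency, high-throughput HPC traffic.
    ec2.create_placement_group(GroupName="hpc-cluster", Strategy="cluster")

    # Partition: isolate groups of instances on separate racks (max 7 partitions per AZ).
    ec2.create_placement_group(
        GroupName="kafka-partitions", Strategy="partition", PartitionCount=3
    )

    # Spread: place a small group of critical instances on distinct racks.
    ec2.create_placement_group(
        GroupName="critical-spread", Strategy="spread", SpreadLevel="rack"
    )

    # Launch an instance into the cluster placement group (placeholder AMI and type).
    ec2.run_instances(
        ImageId="ami-0123456789abcdef0",
        InstanceType="c5n.18xlarge",
        MinCount=1,
        MaxCount=1,
        Placement={"GroupName": "hpc-cluster"},
    )
    ```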

    EC2 Lab with EFS shared

    Cloud Watch

    hashtag
    What is CloudWatch?

    • CloudWatch is a service used to monitor your AWS resources and applications that you run on AWS in real time. CloudWatch is used to collect and track metrics that measure your resources and applications.

    • CloudWatch displays metrics automatically for every AWS service that you choose.

    • You can create dashboards to display the metrics of your custom applications, as well as the metrics of any custom collections that you choose.

    • You can also create an alarm to watch metrics. For example, you can monitor the CPU usage, disk reads, and disk writes of an Amazon EC2 instance to determine whether additional EC2 instances are required to handle the load. An alarm can also stop the instance to save money, as sketched below.
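    A minimal boto3 sketch of such an alarm follows: it watches the average CPU utilization of one EC2 instance and stops the instance if utilization stays very low, which suggests it is idle. The instance ID, thresholds, and alarm name are placeholders.

    ```python
    import boto3

    cloudwatch = boto3.client("cloudwatch")

    cloudwatch.put_metric_alarm(
        AlarmName="stop-idle-instance",              # placeholder name
        Namespace="AWS/EC2",
        MetricName="CPUUtilization",
        Dimensions=[{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}],
        Statistic="Average",
        Period=300,                                  # evaluate 5-minute averages
        EvaluationPeriods=6,                         # ...over 30 minutes
        Threshold=5.0,
        ComparisonOperator="LessThanOrEqualToThreshold",
        # Built-in EC2 action: stop the instance when the alarm fires.
        AlarmActions=["arn:aws:automate:us-east-1:ec2:stop"],
    )
    ```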

    hashtag
    Following are the terms associated with CloudWatch:

    • Dashboards: CloudWatch dashboards show what is happening in your AWS environment at a glance.

    • Alarms: Alarms notify you whenever a particular threshold is hit.

    • Logs: CloudWatch Logs helps you aggregate, monitor, and store logs.

    • Events: CloudWatch Events helps you respond to state changes in your AWS resources.

    AWS LB

    hashtag
    What is Load Balancer?

    A load balancer is a virtual machine or appliance that balances the load of your web application, whether that is HTTP or HTTPS traffic. It spreads the load across multiple web servers so that no single web server gets overwhelmed.


    Lightsail

    Lightsail is the simplest way to create and operate a virtual private server with AWS. It is a cloud platform that includes everything you need to build an application or website.

    Batch

    Allows developers, scientists, and engineers to create and run hundreds of thousands of batch processing jobs on Amazon Web Services (AWS)

    Containers

    Elastic Container Service (ECS)

    A scalable, secure, and highly efficient way to run containers.

    Elastic Container Registry (ECR)

    You can store, manage, and deploy container images easily.

    Elastic Kubernetes Service (EKS)

    A fully managed Kubernetes service that runs the Kubernetes control plane for you.

    Fargate

    It is serverless compute for containers.

    Serverless

    Lambda

    Run code without thinking about servers, and pay only for the compute time you consume.

    Edge and hybrid

    Outposts

    You can have a truly consistent hybrid experience with AWS infrastructure and services on your own premises.

    Snow Family

    Collect, process, and store data in rugged or disconnected edge environments.

    Wavelength

    It is used to deliver ultra-low-latency applications to 5G devices.

    VMware Cloud on AWS

    Innovate faster, rapidly shift to the cloud, and work securely from anywhere.

    Local Zones

    It runs latency-sensitive applications closer to end users.

    AWS Storage Gateway

    AWS Storage Gateway enables an on-premises software appliance to communicate with cloud-based storage, providing a seamless, secure, high-speed connection between your on-premises applications and AWS storage. The gateway can be managed from a web browser, caches frequently used data locally for low-latency access, and reduces the need for customers to maintain large, expensive on-premises storage infrastructure.

    AWS DataSync

    DataSync provides simple and efficient data transfer between on-premises storage and Amazon S3, EFS, or FSx for Windows File Server, and can also be used to migrate your on-premises data into AWS. DataSync uses an agent deployed alongside your on-premises storage: the agent reads from your storage systems and the service moves the data to the target AWS storage over the network.

    AWS Transfer Family

    Transfer Family is designed to provide seamless file transfers into & out of S3.

    AWS Snow Family

    Snow Family devices are highly secured, portable computing devices that collect and process data at the edge and migrate data into and out of AWS.

    Coding Leaderboards

    ElastiCache for Memcached & Redis

    ElastiCache accelerates web applications by making it easy to set up and populate an in-memory cache with data. You can use it to speed up page loads and make your application more responsive.

    Document

    Content Management

    DocumentDB

    DocumentDB is a fully managed, MongoDB-compatible document database that provides a turnkey solution for building data-driven apps at scale, with the ability to scale up or down as needed to meet the needs of your business. It stores, queries, and indexes JSON documents, making it a good fit for content management and catalog workloads.

    Wide column

    Fleet management system

    Keyspaces (for Apache Cassandra)

    Amazon Keyspaces (for Apache Cassandra) is a scalable, highly available, managed database service compatible with Apache Cassandra and designed for high-throughput workloads. Data for different applications is isolated in separate keyspaces and tables, and because the service is highly available, a failure in the underlying infrastructure has minimal impact on the application.

    Graph

    Recommender Engines

    Neptune

    Neptune is a fully managed graph database that stores data in the form of graphs and allows users to query highly connected data in a variety of ways, using graph query languages such as Gremlin and SPARQL. This makes it a good fit for recommendation engines, fraud detection, and knowledge graphs.

    Time series

    IoT devices and applications

    Timestream

    With Timestream, you can record, manage, and analyse billions of events per day in a fast, simple, and serverless manner.

    Ledger

    Transaction Management

    Quantum Ledger Database (QLDB)

    A QLDB is a transparent, immutable, and cryptographically verifiable transaction log owned by a central trusted authority. It provides a complete and verifiable history of all changes made to your application data.

    CodeCommit

    CodeCommit is a managed Git repository service that supports storing and managing Git repositories on the Amazon Web Services cloud.

    CodeDeploy

    CodeDeploy, a professionally managed deployment service, automates software installations on a variety of EC2, Fargate, Lambda, and on-premises servers.

    CodePipeline

    CodePipeline is a fully managed continuous delivery service that automates your application and infrastructure release pipelines.

    CodeStar

    With AWS CodeStar, you can quickly develop, build, and deploy applications on AWS from a single, unified interface. It provides project templates, integrates issue tracking, and includes a built-in dashboard to monitor the performance of your release pipeline.

    CLI

    AWS CLI is a command tool that helps you manage multiple AWS services and automate them using scripts. It offers a simple yet powerful interface for managing multiple AWS services and a set of built-in commands that enables you to easily create and delete EC2 instances, cancel auto-scaling, and more.

    X-Ray

    X-Ray helps software engineers analyse and debug distributed applications by tracing requests as they travel through the application. It lets you view the state of a system at a glance, identify potential bottlenecks, and make informed operational decisions to improve performance and reliability.

    Route 53

    Route 53 is a dedicated, real-time DNS service that allows you to focus on building an incredible Internet experience for your customers, partners, and vendors. It is a highly available, enterprise grade cloud DNS solution that provides load balancing, failover, high availability and performance monitoring to ensure optimal service for your customers and partners.

    Scale your network design

    Elastic Load Balancing

    Elastic Load Balancing automatically distributes incoming application traffic across multiple targets, such as EC2 instances, in one or more Availability Zones, so that no single target is overwhelmed.

    Global Accelerator

    Global Accelerator routes your users' traffic onto Amazon's global network infrastructure, improving internet performance by up to 60%.

    Secure your network traffic

    Shield

    Shield is a managed distributed denial of service (DDoS) protection service that safeguards applications running on AWS. It provides always-on detection and automatic inline mitigations that minimise application downtime and latency, helping prevent DoS attacks from reaching your resources.

    WAF

    WAF is a web application firewall that protects your web applications and API endpoints by blocking malicious requests and allowing trusted ones, based on rules that you configure. It can be attached to resources such as API Gateway APIs, CloudFront distributions, and Application Load Balancers.

    Firewall Manager

    Firewall Manager provides centralised control, auditing, and visibility over your AWS security policies and rules. It can be used to monitor, limit the usage of specific services for security purposes, or to enforce specific policies on specific traffic.

    Build a hybrid IT network

    (VPN) - Client

    VPNs protect your privacy online and provide a secure way to access network resources when needed. For example, a corporate VPN provides a secure connection to the company intranet without exposing your online activity to the public internet. AWS provides a rich set of tools for building and managing VPNs, and it is easy to get started with AWS Client VPN.

    (VPN) - Site to Site

    Site-to-Site VPN makes a secure connection between a data centre or branch office and AWS cloud resources.

    Direct Connect

    Direct Connect allows you to set up and manage a secure, private, and fault-tolerant network connection between your AWS and on-premises devices.

    Content delivery networks

    CloudFront

    CloudFront is a distributed content delivery network (CDN) that enables easy delivery of web content to end users from a pool of web servers around the globe

    Build a network for microservices architectures

    App Mesh

    App Mesh provides application-level networking that makes it easy to monitor and control microservices running on AWS.

    API Gateway

    API Gateway provides the opportunity to create and expand your own REST and WebSocket APIs at any size.

    Cloud Map

    Cloud Map is a cloud resource discovery service that keeps track of the names and addresses of your application resources, so your services can always discover their most up-to-date locations.

    Directory Service

    AWS Managed Microsoft Active Directory enables you to use a managed Active Directory across your entire enterprise, with an emphasis on security and regulatory compliance.

    Resource Access Manager

    Resource Access Manager (RAM) allows you to securely share AWS resources across accounts and to assign access control rules so that only authorised users can access them. You can also set rules that grant specific users access to resources based on topic, role, or condition.

    Organisations

    As your environment grows and scales, organisations help you centrally manage your environment.

    Detection

    Security Hub

    AWS Security Hub can help you improve your security posture by aggregating and prioritising security alerts and findings from across your AWS accounts and services, and alerting you to any potential issues. You can view the overall state of your environment, as well as receive notifications about potential issues, via a dashboard.

    GuardDuty

    GuardDuty reduces the risk of malicious activity and data breaches by proactively monitoring the AWS accounts, workloads, and storage in the cloud. AWS GuardDuty continuously watches for malicious activity, like unusual activity in the source IP address list, or abnormal activity in the number of notification emails received. It can also be used to proactively detect unauthorised behaviour, like a large influx of traffic to a particular AWS account or a significant increase in the number of notifications about an unauthorised change in access controls.

    Inspector

    It scans your AWS environment, such as EC2 instances and container images, for potential security vulnerabilities and offers remediation suggestions. Findings include detailed analysis covering vulnerability severity, impact, and recommended action, so once a finding is raised you can see the affected packages and the steps needed to fix them.

    Config

    It is a service that allows you to assess, audit, and adjust the configuration of AWS resources. The service works by detecting and recording configuration changes for every action that a resource takes, like creation, modification, or removal, and analysing how each change affects related resources. You can use this data to assess the health and compliance of your AWS resources and identify areas for improvement.

    CloudTrail

    It records all the actions taken in a given AWS account over a given period of time. It provides a historical record of API calls and console activity, allowing you to see which identities have been accessing your resources, when, and from where. You can view a detailed report of all actions taken within a time period or drill down into the data to inspect actions taken by a specific account.

    IoT Device Defender

    It monitors and secures connected devices from a security standpoint. It proactively blocks malicious or unsafe apps from being installed on connected devices, and controls the data that is being transmitted between the device and the cloud. It tracks the usage of connected devices, and ensures that data privacy is protected.

    Infrastructure protection

    Shield

    Apps protected with Shield are continuously monitored for unusual traffic patterns, such as sudden spikes in request volume or latency. When an abnormal pattern is detected, Shield automatically identifies the attack and the origin of the traffic, then applies a variety of mitigations to prevent or reduce the impact of the attack.

    Web Application Firewall (WAF)

    WAF provides a set of rules that can be configured to block or allow requests. Rules can act as allow lists or block lists: allow lists give greater control and transparency, while block lists must be kept up to date as threat patterns change. An API can be placed behind WAF rules through Amazon API Gateway, or WAF can be attached to CloudFront distributions and Application Load Balancers within your infrastructure.

    Firewall Manager

    It provides a single point of access to view, manage, and control the whole AWS WAF lifecycle. It gives an overview of the current state of the WAF across your accounts, along with the maintenance steps you can take to adjust the configuration or launch a fresh audit, and it automatically brings newly created resources under your common security policies.

    Data protection

    Macie

    Macie is a data security service that monitors data across your organisation's Amazon S3 storage. It uses machine learning and pattern matching to continuously scan for sensitive data, such as personally identifiable information, and notifies you when it detects sensitive content, unencrypted buckets, or publicly accessible data.

    Key Management Service (KMS)

    Key management on AWS covers a broad range of activities, from creating and storing keys to managing and authorising access to AWS services with those keys. AWS Key Management Service (KMS) is the managed solution for this: it makes it easy to create and control cryptographic keys and to define how they are used across AWS services and your own applications.

    CloudHSM

    CloudHSM is a managed hardware security module (HSM) service that lets you generate and use your own encryption keys on dedicated, tamper-resistant hardware in the AWS cloud. It is particularly useful when compliance requirements call for keys to be protected in single-tenant HSMs, and it integrates with your applications through standard cryptographic APIs.

    Certificate Manager

    Certificate Manager provides a single, easy-to-use interface for provisioning, managing, and deploying TLS/SSL certificates for use with AWS services. Certificates can be managed from the console, the command line, or the API, and renewals of certificates issued by the service are handled automatically. It can be used with AWS-integrated services or with a private certificate authority for internal resources.

    Secrets Manager

    Secrets Manager allows you to securely store, access, and share secrets with a few clicks. It is a flexible tool that allows you to set permissions for storing and accessing secrets, and it can be used to share secrets between your services, between your apps, or between your backend and frontend code.
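    As a rough sketch of that workflow, the boto3 example below stores a database credential and reads it back at runtime; the secret name and value are placeholders.

    ```python
    import json
    import boto3

    secrets = boto3.client("secretsmanager")

    # Store a secret (placeholder name and value).
    secrets.create_secret(
        Name="prod/app/db-credentials",
        SecretString=json.dumps({"username": "app_user", "password": "change-me"}),
    )

    # Later, an application retrieves the secret instead of hard-coding it.
    value = secrets.get_secret_value(SecretId="prod/app/db-credentials")
    creds = json.loads(value["SecretString"])
    print("DB user:", creds["username"])
    ```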

    Incident response

    Detective

    Detective makes it easy to analyse, investigate, and quickly identify the root cause of potential security issues or suspicious activities.

    CloudEndure Disaster Recovery

    It can be used to protect your data from power outages, network outages, or any other disaster. It can also be used for disaster recovery for your business data centers. Disaster recovery can also be used to restore a server to its previous state to avoid the loss of data in the event of a server crash or other external causes. In order to be used for disaster recovery, a server must be provisioned with the appropriate hardware and software, and must be properly configured for disaster recovery.

    Compliance

    Artifact

    You can use the Artifact service to view and download AWS security and compliance reports, as well as select online agreements, on demand.

    CloudEndure Migration

    CloudEndure Migration simplifies the task of moving applications to the cloud by continuously replicating source machines into your AWS account and automating the conversion and launch of the target machines, reducing both the effort and the cost of relocation.

    VMware Cloud on AWS

    Refer to the compute section. It has already been explained there.

    DataSync

    Refer to the storage section. It has already been explained there.

    Transfer Family

    Refer to the storage section. It has already been explained there.

    Snow Family

    Refer to the storage section. It has already been explained there.

    Control

    Set up effective governance mechanisms with the right guardrails in place

    A central authority is established and managed to govern an AWS environment as it grows and scales workloads on the platform.

    Forecast

    Create estimated resource usage and forecast dashboards

    You can create a forecast for the next month, the next quarter, or a longer period. The forecast estimates how much you are likely to spend on AWS during that period based on your historical usage, and you can break down costs and usage by service, region, or linked account. You can view a history of your forecasts or create a new one; a shorter forecast gives you the most up-to-date picture.

    Budget

    Set custom budget thresholds, receive automatic alert notifications when spend exceeds a threshold, and keep spend in check.

    Budgets can be set to track cost and usage in any manner from the simplest to the most challenging applications.

    Purchase

    Use free trials and programmatic discounts, such as Reserved Instances and Savings Plans, based on your workload patterns and needs.

    A reserved instance provides up to 75% off on-demand pricing

    Elasticity

    Devise plans to meet or exceed consumer demand by understanding and responding to its patterns and needs.

    Use Auto Scaling and on-demand capacity to match supply with demand: scale out when demand increases and scale in when it drops, so that you pay only for the capacity your workloads actually need.

    Rightsize

    Prioritize workload allocation size to meet demand.

    AWS offers a variety of options for optimizing your compute resources and getting the best possible performance from your infrastructure. For example, you can use CloudWatch metrics to identify under-utilized or over-provisioned instances and then adjust instance types and sizes to match the workload.

    Inspect

    To keep up-to-date on resource deployment and cost offsetting opportunities

    Cost Explorer helps you understand your current and future cost structure by automatically detecting and outlining your current and future spend on AWS services. You can also create a detailed report that breaks down your spending by month, by service, or by linked account, making it easy to understand and visualize your cloud costs.

    Kinesis

    Kinesis makes it easy to collect streaming data from sources such as website clickstreams, application logs, sensors, and in-app events. One can then process the collected data with tools such as SQL or custom applications, integrate it with machine learning, and display it in a variety of ways. By providing the ability to process and analyze data in real time, Kinesis allows businesses to react to changing situations and market trends faster than before.
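    A minimal boto3 sketch of the ingestion side is shown below: a producer writes a click event into a Kinesis data stream. The stream name and payload are placeholders, and a separate consumer (for example, a Lambda function or Kinesis Data Analytics) would read and process the records.

    ```python
    import json
    import boto3

    kinesis = boto3.client("kinesis")

    # One click event from a website (placeholder payload).
    event = {"user_id": "u-123", "page": "/checkout", "action": "click"}

    # PartitionKey controls which shard the record lands on; records with the
    # same key preserve their ordering relative to each other.
    kinesis.put_record(
        StreamName="clickstream",                  # placeholder stream name
        Data=json.dumps(event).encode("utf-8"),
        PartitionKey=event["user_id"],
    )
    ```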

    Elasticsearch Service

    Elasticsearch Service is a managed service that makes it simple to set up, deploy, secure, and operate Elasticsearch at large scale.

    Quicksight

    QuickSight is a cloud-based business intelligence service that makes it simple to deliver insights to everyone in your company.

    Data movement

    1) Amazon Managed Streaming for Apache Kafka (MSK)

    2) Kinesis Data Streams

    3) Kinesis Data Firehose

    4) Kinesis Data Analytics

    5) Kinesis Video Streams

    6) Glue

    MSK is a fully managed service that makes it simple to build and run applications that use Apache Kafka.

    Data lake

    1) S3

    2) Lake Formation

    Setting up a data lake is simple with Lake Formation: it makes it straightforward to create a secure data lake in minutes. A data lake is a centralized, curated, and secured repository for all data, both in its original form and prepped for analysis.

    1) S3 Glacier

    2) Backup

    These S3 cloud storage classes provide cost-effective, high-durability storage designed for data archives and long-term backup.

    1) Glue

    2) Lake Formation

    Refer to the descriptions above.

    Data Exchange

    Data Exchange is a cloud-based service that provides a simple, easy-to-use interface for handling third-party data, helping you increase your data’s scalability and optimize your business. It allows you to find, list, search, and subscribe to data; store data in the cloud; query and process it; and export it.

    Predictive analytics & machine learning

    Deep Learning AMIs

    Deep learning is a machine learning field that applies artificial neural networks and iterative learning algorithms to large data sets to generate new knowledge, and it is used in industries such as finance, retail, and manufacturing. AWS Deep Learning AMIs are machine images preconfigured with popular frameworks, so researchers can launch EC2 instances and train their models on data held in cloud storage without building the environment themselves.

    SageMaker

    SageMaker automates all the necessary steps to build, test, deploy, and scale your models including: model selection, preprocessing, meta-learning, visualization, and inference. SageMaker provides a full end-to-end solution for data analysis, data preparation, model training, and real-time visualizations of your data.

    Run containers without managing servers

    Fargate

    The Fargate stack consists of a number of components that work together to create a highly available, low-cost, and secure application platform. It is designed to work with both ECS and EKS, removing the need to provision and manage servers for your containers.

    Run containers with server-level control

    EC2

    Refer to the compute section. It has already been explained there.

    Containerize and migrate existing applications

    App2Container

    A2C is a command-line tool that helps you containerize existing Java and .NET applications: it analyzes applications running on premises or on EC2, packages them into container images, and generates the deployment artifacts needed to run them on ECS or EKS. This reduces the cost and effort of modernizing applications, since upgrades and deployments are then handled through a standard container pipeline.

    Quickly launch and manage containerized applications

    Copilot

    It helps you manage your application’s life cycle from development to deployment, and enables you to make smarter and faster decisions along the way. Its interface is built around a set of common operations such as creating environments, deploying and managing application containers, and listing clusters, and it supports common use cases such as batch processing and secure data storage. Its dashboard provides information about your clusters and applications, such as memory usage, CPU usage, and how long each deployment took, which helps you identify bottlenecks, optimize application performance, and create high-performing clusters.

    Storage

    S3

    Refer to the storage section. It has already been explained there.

    EFS

    Refer to the storage section. It has already been explained there.

    Data stores

    DynamoDB

    DynamoDB is a fully managed NoSQL key-value and document database. It is highly scalable, can store huge amounts of data, and still delivers consistently fast performance as more data is added. To use it, you simply create a table in your AWS account and start reading and writing items, as sketched below.
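    A minimal boto3 sketch of that workflow follows: it creates a table keyed by user ID, writes one item, and reads it back. The table name and attributes are placeholders.

    ```python
    import boto3

    dynamodb = boto3.resource("dynamodb")

    # Create a table with a simple partition key (placeholder name and schema).
    table = dynamodb.create_table(
        TableName="Users",
        KeySchema=[{"AttributeName": "user_id", "KeyType": "HASH"}],
        AttributeDefinitions=[{"AttributeName": "user_id", "AttributeType": "S"}],
        BillingMode="PAY_PER_REQUEST",       # on-demand capacity, no provisioning
    )
    table.wait_until_exists()

    # Write and read an item.
    table.put_item(Item={"user_id": "u-123", "name": "Alice", "plan": "pro"})
    item = table.get_item(Key={"user_id": "u-123"})["Item"]
    print(item["name"], item["plan"])
    ```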

    Aurora Serverless

    Aurora Serverless is an on-demand, auto-scaling configuration of Amazon Aurora that eliminates the need to manually manage database infrastructure and automates capacity management. It helps you scale your database without scaling the infrastructure yourself: you can use it for development and test environments or for production applications, running the same codebase and data while the database scales with demand.

    RDS Proxy

The RDS Proxy can be used to:

Pool and share database connections so applications can scale without exhausting database connections

Reduce failover times and improve the availability of your RDS and Aurora databases

Manage database credentials centrally through AWS Secrets Manager and IAM instead of embedding them in application code

Speed up your application deployments

    API Proxy

    API Gateway

API Gateway works with any language or platform that can make HTTPS calls to AWS. It can be used to create REST, HTTP, and WebSocket APIs for both internal and external clients, and to absorb and route traffic to backend services such as Lambda functions, other AWS services, or HTTP endpoints. It is a good fit for growing businesses or teams that need to build, publish, secure, and manage APIs programmatically.

    Applic­ation integr­ation

    SNS

SNS is a fully managed publish/subscribe messaging service that facilitates the exchange of data between applications and devices using standardized APIs. You create SNS topics, publishers send messages to them, and subscribers (SQS queues, Lambda functions, HTTP endpoints, email, or SMS) receive them. Topic access policies control which principals may publish or subscribe. With SNS, you can: send and receive messages through a simple interface; fan a single message out to many subscribers at once; and control which applications can publish messages and who can read them.
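A minimal sketch with the AWS CLI (the topic name and email address are placeholders):

```bash
# Create a topic, subscribe an email address, and publish a message to all subscribers.
TOPIC_ARN=$(aws sns create-topic --name ops-alerts --query TopicArn --output text)
aws sns subscribe --topic-arn "$TOPIC_ARN" \
  --protocol email --notification-endpoint ops@example.com
aws sns publish --topic-arn "$TOPIC_ARN" \
  --subject "Disk alert" --message "Disk usage above 90% on app-server-1"
```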

    SQS

SQS is a fully managed message queuing service that enables you to decouple microservices, distributed systems, and serverless applications. Producers send messages to a queue, and consumers poll the queue, process the messages, and delete them; the queue buffers the work so that the two sides can scale and fail independently.
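A minimal sketch of that decoupling with the AWS CLI (the queue name and message body are placeholders):

```bash
# One component enqueues work; another polls for it with long polling.
QUEUE_URL=$(aws sqs create-queue --queue-name orders --query QueueUrl --output text)
aws sqs send-message --queue-url "$QUEUE_URL" --message-body '{"orderId": 42}'
aws sqs receive-message --queue-url "$QUEUE_URL" --wait-time-seconds 10   # long polling
```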

    AppSync

    It's simple to create GraphQL APIs with AppSync, which handles the hard work of securely connecting to data sources such as AWS DynamoDB, Lambda.

    EventB­ridge

EventBridge is a serverless event bus and a low-cost alternative to building new backend integration infrastructure for every new app. You publish events to the bus and route them with rules to the targets that need them: your own applications, SaaS applications, and AWS services can all act as event sources, and you can connect new consumers with a few lines of configuration instead of a new backend for every app.

    Orches­tration

    Step Functions

Step Functions is an easy-to-use serverless orchestrator that makes it possible to string Lambda functions and multiple AWS services together into business-critical workflows (state machines).
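A minimal sketch: a two-state Amazon States Language workflow that invokes one Lambda function and then succeeds. The account ID, function name, and IAM role ARN are placeholders.

```bash
# Define the state machine, then create it with the AWS CLI.
cat > state-machine.json <<'EOF'
{
  "StartAt": "ProcessOrder",
  "States": {
    "ProcessOrder": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:eu-west-1:123456789012:function:process-order",
      "Next": "Done"
    },
    "Done": { "Type": "Succeed" }
  }
}
EOF

aws stepfunctions create-state-machine \
  --name order-workflow \
  --definition file://state-machine.json \
  --role-arn arn:aws:iam::123456789012:role/StepFunctionsExecutionRole
```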

    Analytics

    Kinesis

    Kinesis enables one to get timely insights by collecting, processing, and analyzing real-time, streaming data.

    Athena

Athena lets users quickly and easily set up and run analysis directly on data stored in Amazon S3 using standard SQL queries, with no infrastructure to manage. This saves time because you do not have to load the data into a separate analytics system or learn new analytics software.
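A minimal sketch with the AWS CLI, assuming a hypothetical weblogs database, an access_logs table, and an S3 bucket for query results:

```bash
# Run a standard SQL query over data in S3; results land in the given S3 location.
aws athena start-query-execution \
  --query-string "SELECT status, COUNT(*) AS hits FROM access_logs GROUP BY status" \
  --query-execution-context Database=weblogs \
  --result-configuration OutputLocation=s3://my-athena-results/
```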

    Workflows

    Step Functions

    Serverless workflows let you create and update apps from code without handling requests from clients. When you’re working with serverless, you can create one serverless process that handles requests from your clients and another that updates the app logic.

    Serverless workflows are a great way to keep your code simple while still letting you respond to requests from clients. You can use serverless to build your apps without worrying about scalability, performance, or security.

    API management

    API Gateway

Build a secure API that allows users to access, manipulate, & combine data from one or more data sources.

    AppSync

    Create a flexible API to securely access, manipu­late, & combine data from one or more data sources

    Event bus

    EventB­ridge

    Connect application data from your own apps, SaaS, & AWS services through an event-driven architecture.

    AppFlow

    Easy to implement, seamless data flow between SaaS applications and AWS services at any scale, without code.

    Budgets

Budgets let you set custom cost and usage thresholds for specific applications or accounts and alert you when actual or forecasted spend exceeds them, allowing precise cost control.

    License Manager

    License Manager makes it easier to manage software licenses from software vendors such as Microsoft, SAP, Oracle, & IBM across AWS & on-pre­mises enviro­nments.

    Provision

    CloudF­orm­ation

    CloudF­orm­ation enables the user to design & provision AWS infras­tru­cture deploy­ments predic­tably & repeat­edly.
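A minimal sketch: a template that declares a single versioned S3 bucket, deployed repeatably with the AWS CLI (the stack name and resource name are arbitrary examples):

```bash
# Declare the infrastructure as code...
cat > template.yaml <<'EOF'
AWSTemplateFormatVersion: "2010-09-09"
Resources:
  ArtifactBucket:
    Type: AWS::S3::Bucket
    Properties:
      VersioningConfiguration:
        Status: Enabled
EOF

# ...and deploy it. Re-running the same command is idempotent.
aws cloudformation deploy \
  --template-file template.yaml \
  --stack-name demo-artifacts
```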

    Service Catalog

Service Catalog provides a common interface for managing the lifecycle of approved AWS products: securely provisioning them, upgrading or migrating them to newer versions, and deleting them. Service catalogs let organizations centrally manage commonly deployed resources and maintain compliance with regulatory specifications.

    OpsWorks

    Creating and maintaining stacks and applications with OpsWorks is simple and flexible.

    Market­place

Marketplace is a digital catalog with thousands of software listings from independent software vendors that make it easy to find, test, buy, and deploy software that runs on AWS.

    Operate

    CloudWatch

    CloudWatch can provide a dependable, scalable, & flexible monitoring solution that is simple to set up.

    CloudTrail

    It enables govern­ance, compli­ance, operat­ional auditing, & risk auditing of AWS accounts.

    Systems Manager

It helps you view and manage your applications and infrastructure running in AWS from a single place.

    Cost & usage report

    Refer to the cost management section. It has already been explained there.

    Cost explorer

    Refer to the cost management section. It has already been explained there.

    Managed Services

It operates the AWS infrastructure on your behalf, handling routine operational tasks for you.

    1- Run 1st Instance with aws console 
    2- Run 2nd instance with aws cli
    3- Create EFS filesystem 
    4- attach EFS to 1st and 2nd instance 
    5- mount in fstab and reboot instances
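A hedged sketch of steps 3 to 5 above using the AWS CLI; the subnet, security group, and file system IDs are placeholders, and the mount helper package depends on the distribution:

```bash
# Step 3: create the EFS file system and a mount target in the instances' subnet.
FS_ID=$(aws efs create-file-system --creation-token demo-efs \
        --query FileSystemId --output text)
aws efs create-mount-target --file-system-id "$FS_ID" \
        --subnet-id subnet-0123456789abcdef0 \
        --security-groups sg-0123456789abcdef0

# Steps 4-5: on each of the two instances, install the mount helper,
# add an fstab entry, mount it, and verify it survives a reboot.
sudo yum install -y amazon-efs-utils        # or: sudo apt-get install -y nfs-common
sudo mkdir -p /mnt/efs
# Replace fs-xxxxxxxx with the FileSystemId returned in step 3.
echo "fs-xxxxxxxx:/ /mnt/efs efs _netdev,tls 0 0" | sudo tee -a /etc/fstab
sudo mount -a && df -h /mnt/efs
sudo reboot
```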

Suppose an AWS service such as an Amazon EC2 instance that runs your application wants to make requests to AWS resources such as an Amazon S3 bucket. The service must have security credentials to access those resources, but embedding long-term credentials directly into the instance, and distributing them to multiple instances, creates a security risk. To avoid this, you create an IAM role, assign it to the Amazon EC2 instance, and the role grants the permissions needed to access the resources.
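A hedged sketch of that pattern with the AWS CLI; the role name, instance profile name, and instance ID are placeholders:

```bash
# Trust policy: allow EC2 to assume the role.
cat > trust.json <<'EOF'
{ "Version": "2012-10-17",
  "Statement": [{ "Effect": "Allow",
                  "Principal": { "Service": "ec2.amazonaws.com" },
                  "Action": "sts:AssumeRole" }] }
EOF
aws iam create-role --role-name app-s3-reader \
    --assume-role-policy-document file://trust.json
aws iam attach-role-policy --role-name app-s3-reader \
    --policy-arn arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess

# Wrap the role in an instance profile and attach it to the running instance.
aws iam create-instance-profile --instance-profile-name app-s3-reader
aws iam add-role-to-instance-profile \
    --instance-profile-name app-s3-reader --role-name app-s3-reader
aws ec2 associate-iam-instance-profile \
    --instance-id i-0123456789abcdef0 \
    --iam-instance-profile Name=app-s3-reader   # no long-term keys on the instance
```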

Web-identity federation: Suppose you are creating a mobile app, such as a game that runs on a mobile device, that accesses AWS resources, with its data stored in Amazon S3 and DynamoDB. Requests to AWS services must be signed with an AWS access key, but it is not recommended to embed long-term AWS credentials in the app, not even in encrypted form. Instead, the application requests temporary security credentials, created dynamically when needed, using web-identity federation. These temporary credentials map to a role that has only the permissions the app needs to perform its task.

With web-identity federation, you do not need custom sign-in code or your own user identities. A user logs in through an external identity provider such as Login with Amazon, Facebook, Google, or another OpenID provider. After login, the user receives an authentication token and exchanges it for temporary AWS security credentials.

  • The permissions used by the third party to access AWS resources are associated with the role created when you define the trust policy. The policy defines the actions they can take and the resources they can use.
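A hedged sketch of the token exchange using the AWS CLI; the role ARN and token file are placeholders, and mobile apps usually go through Amazon Cognito or an AWS SDK rather than calling STS directly:

```bash
# Exchange an identity provider token for temporary AWS credentials.
aws sts assume-role-with-web-identity \
  --role-arn arn:aws:iam::123456789012:role/mobile-game-role \
  --role-session-name player-42 \
  --web-identity-token file://idp-token.jwt \
  --duration-seconds 3600
# The response contains AccessKeyId, SecretAccessKey, and SessionToken,
# which the app uses to call S3 and DynamoDB until they expire.
```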

… D3en, and C6i. C5 and C5d instances support VNNI for only 12xlarge, 24xlarge, and metal instances.

    Bare metal sizes include: r6id.metal | r7g.metal | u-6tb1.metal | u-9tb1.metal | u-12tb1.metal | u-18tb1.metal | u-24tb1.metal | x2gd.metal | x2idn.metal | x2iedn.metal | x2iezn.metal | z1d.metal

    M5dn

    m5dn.large | m5dn.xlarge | m5dn.2xlarge | m5dn.4xlarge | m5dn.8xlarge | m5dn.12xlarge | m5dn.16xlarge | m5dn.24xlarge | m5dn.metal

    M5n

    m5n.large | m5n.xlarge | m5n.2xlarge | m5n.4xlarge | m5n.8xlarge | m5n.12xlarge | m5n.16xlarge | m5n.24xlarge | m5n.metal

    M5zn

    m5zn.large | m5zn.xlarge | m5zn.2xlarge | m5zn.3xlarge | m5zn.6xlarge | m5zn.12xlarge | m5zn.metal

    M6a

    m6a.large | m6a.xlarge | m6a.2xlarge | m6a.4xlarge | m6a.8xlarge | m6a.12xlarge | m6a.16xlarge | m6a.24xlarge | m6a.32xlarge | m6a.48xlarge | m6a.metal

    M6g

    m6g.medium | m6g.large | m6g.xlarge | m6g.2xlarge | m6g.4xlarge | m6g.8xlarge | m6g.12xlarge | m6g.16xlarge | m6g.metal

    M6gd

    m6gd.medium | m6gd.large | m6gd.xlarge | m6gd.2xlarge | m6gd.4xlarge | m6gd.8xlarge | m6gd.12xlarge | m6gd.16xlarge | m6gd.metal

    M6i

    m6i.large | m6i.xlarge | m6i.2xlarge | m6i.4xlarge | m6i.8xlarge | m6i.12xlarge | m6i.16xlarge | m6i.24xlarge | m6i.32xlarge | m6i.metal

    M6id

    m6id.large | m6id.xlarge | m6id.2xlarge | m6id.4xlarge | m6id.8xlarge | m6id.12xlarge | m6id.16xlarge | m6id.24xlarge | m6id.32xlarge | m6id.metal

    M6idn

    m6idn.large | m6idn.xlarge | m6idn.2xlarge | m6idn.4xlarge | m6idn.8xlarge | m6idn.12xlarge | m6idn.16xlarge | m6idn.24xlarge | m6idn.32xlarge | m6idn.metal

    M6in

    m6in.large | m6in.xlarge | m6in.2xlarge | m6in.4xlarge | m6in.8xlarge | m6in.12xlarge | m6in.16xlarge | m6in.24xlarge | m6in.32xlarge | m6in.metal

    M7g

    m7g.medium | m7g.large | m7g.xlarge | m7g.2xlarge | m7g.4xlarge | m7g.8xlarge | m7g.12xlarge | m7g.16xlarge | m7g.metal

    Mac1

    mac1.metal

    Mac2

    mac2.metal

    T2

    t2.nano | t2.micro | t2.small | t2.medium | t2.large | t2.xlarge | t2.2xlarge

    T3

    t3.nano | t3.micro | t3.small | t3.medium | t3.large | t3.xlarge | t3.2xlarge

    T3a

    t3a.nano | t3a.micro | t3a.small | t3a.medium | t3a.large | t3a.xlarge | t3a.2xlarge

    T4g

    t4g.nano | t4g.micro | t4g.small | t4g.medium | t4g.large | t4g.xlarge | t4g.2xlarge

    C5n

    c5n.large | c5n.xlarge | c5n.2xlarge | c5n.4xlarge | c5n.9xlarge | c5n.18xlarge | c5n.metal

    C6a

    c6a.large | c6a.xlarge | c6a.2xlarge | c6a.4xlarge | c6a.8xlarge | c6a.12xlarge | c6a.16xlarge | c6a.24xlarge | c6a.32xlarge | c6a.48xlarge | c6a.metal

    C6g

    c6g.medium | c6g.large | c6g.xlarge | c6g.2xlarge | c6g.4xlarge | c6g.8xlarge | c6g.12xlarge | c6g.16xlarge | c6g.metal

    C6gd

    c6gd.medium | c6gd.large | c6gd.xlarge | c6gd.2xlarge | c6gd.4xlarge | c6gd.8xlarge | c6gd.12xlarge | c6gd.16xlarge | c6gd.metal

    C6gn

    c6gn.medium | c6gn.large | c6gn.xlarge | c6gn.2xlarge | c6gn.4xlarge | c6gn.8xlarge | c6gn.12xlarge | c6gn.16xlarge

    C6i

    c6i.large | c6i.xlarge | c6i.2xlarge | c6i.4xlarge | c6i.8xlarge | c6i.12xlarge | c6i.16xlarge | c6i.24xlarge | c6i.32xlarge | c6i.metal

    C6id

    c6id.large | c6id.xlarge | c6id.2xlarge | c6id.4xlarge | c6id.8xlarge | c6id.12xlarge | c6id.16xlarge | c6id.24xlarge | c6id.32xlarge | c6id.metal

    C6in

    c6in.large | c6in.xlarge | c6in.2xlarge | c6in.4xlarge | c6in.8xlarge | c6in.12xlarge | c6in.16xlarge | c6in.24xlarge | c6in.32xlarge | c6in.metal

    C7g

    c7g.medium | c7g.large | c7g.xlarge | c7g.2xlarge | c7g.4xlarge | c7g.8xlarge | c7g.12xlarge | c7g.16xlarge | c7g.metal

    CC2

    cc2.8xlarge

    Hpc6a

    hpc6a.48xlarge

    R5ad

    r5ad.large | r5ad.xlarge | r5ad.2xlarge | r5ad.4xlarge | r5ad.8xlarge | r5ad.12xlarge | r5ad.16xlarge | r5ad.24xlarge

    R5b

    r5b.large | r5b.xlarge | r5b.2xlarge | r5b.4xlarge | r5b.8xlarge | r5b.12xlarge | r5b.16xlarge | r5b.24xlarge | r5b.metal

    R5d

    r5d.large | r5d.xlarge | r5d.2xlarge | r5d.4xlarge | r5d.8xlarge | r5d.12xlarge | r5d.16xlarge | r5d.24xlarge | r5d.metal

    R5dn

    r5dn.large | r5dn.xlarge | r5dn.2xlarge | r5dn.4xlarge | r5dn.8xlarge | r5dn.12xlarge | r5dn.16xlarge | r5dn.24xlarge | r5dn.metal

    R5n

    r5n.large | r5n.xlarge | r5n.2xlarge | r5n.4xlarge | r5n.8xlarge | r5n.12xlarge | r5n.16xlarge | r5n.24xlarge | r5n.metal

    R6a

    r6a.large | r6a.xlarge | r6a.2xlarge | r6a.4xlarge | r6a.8xlarge | r6a.12xlarge | r6a.16xlarge | r6a.24xlarge | r6a.32xlarge | r6a.48xlarge | r6a.metal

    R6g

    r6g.medium | r6g.large | r6g.xlarge | r6g.2xlarge | r6g.4xlarge | r6g.8xlarge | r6g.12xlarge | r6g.16xlarge | r6g.metal

    R6gd

    r6gd.medium | r6gd.large | r6gd.xlarge | r6gd.2xlarge | r6gd.4xlarge | r6gd.8xlarge | r6gd.12xlarge | r6gd.16xlarge | r6gd.metal

    R6i

    r6i.large | r6i.xlarge | r6i.2xlarge | r6i.4xlarge | r6i.8xlarge | r6i.12xlarge | r6i.16xlarge | r6i.24xlarge | r6i.32xlarge | r6i.metal

    R6idn

    r6idn.large | r6idn.xlarge | r6idn.2xlarge | r6idn.4xlarge | r6idn.8xlarge | r6idn.12xlarge | r6idn.16xlarge | r6idn.24xlarge | r6idn.32xlarge | r6idn.metal

    R6in

    r6in.large | r6in.xlarge | r6in.2xlarge | r6in.4xlarge | r6in.8xlarge | r6in.12xlarge | r6in.16xlarge | r6in.24xlarge | r6in.32xlarge | r6in.metal

    R6id

    r6id.large | r6id.xlarge | r6id.2xlarge | r6id.4xlarge | r6id.8xlarge | r6id.12xlarge | r6id.16xlarge | r6id.24xlarge | r6id.32xlarge | r6id.metal

    R7g

    r7g.medium | r7g.large | r7g.xlarge | r7g.2xlarge | r7g.4xlarge | r7g.8xlarge | r7g.12xlarge | r7g.16xlarge | r7g.metal

    U-3tb1

    u-3tb1.56xlarge

    U-6tb1

    u-6tb1.56xlarge | u-6tb1.112xlarge | u-6tb1.metal

    U-9tb1

    u-9tb1.112xlarge | u-9tb1.metal

    U-12tb1

    u-12tb1.112xlarge | u-12tb1.metal

    U-18tb1

    u-18tb1.metal

    U-24tb1

    u-24tb1.metal

    X1

    x1.16xlarge | x1.32xlarge

    X2gd

    x2gd.medium | x2gd.large | x2gd.xlarge | x2gd.2xlarge | x2gd.4xlarge | x2gd.8xlarge | x2gd.12xlarge | x2gd.16xlarge | x2gd.metal

    X2idn

    x2idn.16xlarge | x2idn.24xlarge | x2idn.32xlarge | x2idn.metal

    X2iedn

    x2iedn.xlarge | x2iedn.2xlarge | x2iedn.4xlarge | x2iedn.8xlarge | x2iedn.16xlarge | x2iedn.24xlarge | x2iedn.32xlarge | x2iedn.metal

    X2iezn

    x2iezn.2xlarge | x2iezn.4xlarge | x2iezn.6xlarge | x2iezn.8xlarge | x2iezn.12xlarge | x2iezn.metal

    X1e

    x1e.xlarge | x1e.2xlarge | x1e.4xlarge | x1e.8xlarge | x1e.16xlarge | x1e.32xlarge

    z1d

    z1d.large | z1d.xlarge | z1d.2xlarge | z1d.3xlarge | z1d.6xlarge | z1d.12xlarge | z1d.metal

    I3

    i3.large | i3.xlarge | i3.2xlarge | i3.4xlarge | i3.8xlarge | i3.16xlarge | i3.metal

    I3en

    i3en.large | i3en.xlarge | i3en.2xlarge | i3en.3xlarge | i3en.6xlarge | i3en.12xlarge | i3en.24xlarge | i3en.metal

    I4i

    i4i.large | i4i.xlarge | i4i.2xlarge | i4i.4xlarge | i4i.8xlarge | i4i.16xlarge | i4i.32xlarge | i4i.metal

    Im4gn

    im4gn.large | im4gn.xlarge | im4gn.2xlarge | im4gn.4xlarge | im4gn.8xlarge | im4gn.16xlarge

    Is4gen

    is4gen.medium | is4gen.large | is4gen.xlarge | is4gen.2xlarge | is4gen.4xlarge | is4gen.8xlarge

    G5

    g5.xlarge | g5.2xlarge | g5.4xlarge | g5.8xlarge | g5.12xlarge | g5.16xlarge | g5.24xlarge | g5.48xlarge

    G5g

    g5g.xlarge | g5g.2xlarge | g5g.4xlarge | g5g.8xlarge | g5g.16xlarge | g5g.metal

    Inf1

    inf1.xlarge | inf1.2xlarge | inf1.6xlarge | inf1.24xlarge

    Inf2

    inf2.xlarge | inf2.8xlarge | inf2.24xlarge | inf2.48xlarge

    P2

    p2.xlarge | p2.8xlarge | p2.16xlarge

    P3

    p3.2xlarge | p3.8xlarge | p3.16xlarge

    P3dn

    p3dn.24xlarge

    P4d

    p4d.24xlarge

    P4de

    p4de.24xlarge

    Trn1

    trn1.2xlarge | trn1.32xlarge

    Trn1n

    trn1n.32xlarge

    VT1

    vt1.3xlarge | vt1.6xlarge | vt1.24xlarge

    M1

    m1.small | m1.medium | m1.large | m1.xlarge

    M2

    m2.xlarge | m2.2xlarge | m2.4xlarge

    M3

    m3.medium | m3.large | m3.xlarge | m3.2xlarge

    R3

    r3.large | r3.xlarge | r3.2xlarge | r3.4xlarge | r3.8xlarge

    T1

    t1.micro

Networking and storage features by instance family:

| Instance family | EBS only | NVMe EBS | Instance store | Placement group | Enhanced networking |
| --- | --- | --- | --- | --- | --- |
| | Yes | Yes | No | Yes | ENA |
| M5a | Yes | Yes | No | Yes | ENA |
| M5ad | No | Yes | NVMe | Yes | ENA |
| M5d | No | Yes | NVMe | Yes | ENA |
| M5dn | No | Yes | NVMe | Yes | ENA \| EFA |
| M5n | Yes | Yes | No | Yes | ENA \| EFA |
| M5zn | Yes | Yes | No | Yes | ENA \| EFA |
| M6a | Yes | Yes | No | Yes | ENA \| EFA |
| M6g | Yes | Yes | No | Yes | ENA |
| M6gd | No | Yes | NVMe | Yes | ENA |
| M6i | Yes | Yes | No | Yes | ENA \| EFA |
| M6id | No | Yes | NVMe | Yes | ENA \| EFA |
| M6idn | No | Yes | NVMe | Yes | ENA \| EFA |
| M6in | Yes | Yes | No | Yes | ENA \| EFA |
| M7g | Yes | Yes | No | Yes | ENA \| EFA |
| Mac1 | Yes | Yes | No | Yes | ENA |
| Mac2 | Yes | Yes | No | Yes | ENA |
| T2 | Yes | No | No | Yes | Not supported |
| T3 | Yes | Yes | No | Yes | ENA |
| T3a | Yes | Yes | No | Yes | ENA |
| T4g | Yes | Yes | No | Yes | ENA |
| | Yes | Yes | No | Yes | ENA |
| C5a | Yes | Yes | No | Yes | ENA |
| C5ad | No | Yes | NVMe | Yes | ENA |
| C5d | No | Yes | NVMe | Yes | ENA |
| C5n | Yes | Yes | No | Yes | ENA \| EFA |
| C6a | Yes | Yes | No | Yes | ENA \| EFA |
| C6g | Yes | Yes | No | Yes | ENA |
| C6gd | No | Yes | NVMe | Yes | ENA |
| C6gn | Yes | Yes | No | Yes | ENA \| EFA |
| C6i | Yes | Yes | No | Yes | ENA \| EFA |
| C6id | No | Yes | NVMe | Yes | ENA \| EFA |
| C6in | Yes | Yes | No | Yes | ENA \| EFA |
| C7g | Yes | Yes | No | Yes | ENA \| EFA |
| CC2 | No | No | HDD | Yes | Not supported |
| Hpc6a | Yes | Yes | No | Yes | ENA \| EFA |
| | No | Yes | NVMe | Yes | ENA \| EFA |
| R4 | Yes | No | No | Yes | ENA |
| R5 | Yes | Yes | No | Yes | ENA |
| R5a | Yes | Yes | No | Yes | ENA |
| R5ad | No | Yes | NVMe | Yes | ENA |
| R5b | Yes | Yes | No | Yes | ENA |
| R5d | No | Yes | NVMe | Yes | ENA |
| R5dn | No | Yes | NVMe | Yes | ENA \| EFA |
| R5n | Yes | Yes | No | Yes | ENA \| EFA |
| R6a | Yes | Yes | No | Yes | ENA \| EFA |
| R6g | Yes | Yes | No | Yes | ENA |
| R6gd | No | Yes | NVMe | Yes | ENA |
| R6i | Yes | Yes | No | Yes | ENA \| EFA |
| R6idn | No | Yes | NVMe | Yes | ENA \| EFA |
| R6in | Yes | Yes | No | Yes | ENA \| EFA |
| R6id | No | Yes | NVMe | Yes | ENA \| EFA |
| R7g | Yes | Yes | No | Yes | ENA \| EFA |
| U-3tb1 | Yes | Yes | No | Yes | ENA |
| U-6tb1 | Yes | Yes | No | Yes | ENA |
| U-9tb1 | Yes | Yes | No | Yes | ENA |
| U-12tb1 | Yes | Yes | No | Yes | ENA |
| U-18tb1 | Yes | Yes | No | Yes | ENA |
| U-24tb1 | Yes | Yes | No | Yes | ENA |
| X1 | No | No | SSD | Yes | ENA |
| X2gd | No | Yes | NVMe | Yes | ENA |
| X2idn | No | Yes | NVMe | Yes | ENA \| EFA |
| X2iedn | No | Yes | NVMe | Yes | ENA \| EFA |
| X2iezn | Yes | Yes | No | Yes | ENA \| EFA |
| X1e | No | No | SSD | Yes | ENA |
| z1d | No | Yes | NVMe | Yes | ENA |
| | No | Yes | NVMe | Yes | ENA |
| D3en | No | Yes | NVMe | Yes | ENA |
| H1 | No | No | HDD | Yes | ENA |
| HS1 | No | Yes | HDD | Yes | Not supported |
| I3 | No | Yes | NVMe | Yes | ENA |
| I3en | No | Yes | NVMe | Yes | ENA \| EFA |
| I4i | No | Yes | NVMe | Yes | ENA \| EFA |
| Im4gn | No | Yes | NVMe | Yes | ENA \| EFA |
| Is4gen | No | Yes | NVMe | Yes | ENA |
| | No | Yes | NVMe | Yes | Not supported |
| G3 | Yes | No | No | Yes | ENA |
| G4ad | No | Yes | NVMe | Yes | ENA |
| G4dn | No | Yes | NVMe | Yes | ENA \| EFA |
| G5 | No | Yes | NVMe | Yes | ENA \| EFA |
| G5g | Yes | Yes | No | Yes | ENA |
| Inf1 | Yes | Yes | No | Yes | ENA \| EFA |
| Inf2 | Yes | Yes | No | Yes | ENA |
| P2 | Yes | No | No | Yes | ENA |
| P3 | Yes | No | No | Yes | ENA |
| P3dn | No | Yes | NVMe | Yes | ENA \| EFA |
| P4d | No | Yes | NVMe | Yes | ENA \| EFA |
| P4de | No | Yes | NVMe | Yes | ENA \| EFA |
| Trn1 | No | Yes | NVMe | Yes | ENA \| EFA |
| Trn1n | No | Yes | NVMe | Yes | ENA \| EFA |
| VT1 | Yes | Yes | No | Yes | ENA \| EFA |
| | No | No | HDD | Yes | Not supported |
| C3 | No | No | SSD | Yes | Not supported |
| G2 | No | No | SSD | Yes | Not supported |
| I2 | No | No | SSD | Yes | Not supported |
| M1 | No | No | HDD | Yes | Not supported |
| M2 | No | No | HDD | Yes | Not supported |
| M3 | No | No | SSD | Yes | Not supported |
| R3 | No | No | SSD | Yes | Not supported |
| T1 | Yes | No | No | Yes | Not supported |

Related AWS documentation:

    Instances built on the Nitro System
    Networking and storage features
    Instance limits
    General purpose instances
    Compute optimized instances
    Memory optimized instances
    Storage optimized instances
    Linux accelerated computing instances
    Find an Amazon EC2 instance type
    Get recommendations for an instance type
    Change the instance type
    Storage optimized
    Accelerated computing
    AWS re:Inforce 2019: Security Benefits of the Nitro Architecture
    Placement groups
    Enhanced networking on Linux
    Network maximum transmission unit (MTU) for your EC2 instance
    Previous generation instance types

It is suitable for applications with short-term, spiky, or unpredictable workloads that cannot be interrupted.

  • It is useful for the applications that have been developed or tested on Amazon EC2 for the first time.

  • On Demand instance is recommended when you are not sure which instance type is required for your performance needs.

  • It is used for those applications that require reserved capacity.

• Users can make up-front payments to reduce their total computing costs. For example, a 3-year contract paid all upfront gives the maximum discount, while a 1-year contract with no upfront payment gives a much smaller discount.

  • It is useful for those users who have an urgent need for large amounts of additional computing capacity.

• EC2 Spot Instances provide steep discounts (up to around 90%) compared to On-Demand prices.

  • Spot Instances are used to optimize your costs on the AWS cloud and scale your application's throughput up to 10X.

• EC2 Spot Instances run until you terminate them or until EC2 reclaims the capacity, in which case you receive a two-minute interruption notice.

  • It can be purchased as a Reservation for up to 70% off On-Demand price.
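As a hedged sketch, a Spot Instance can be requested directly from run-instances; the AMI ID, instance type, and maximum price below are placeholders:

```bash
# Launch one instance as Spot capacity with a price cap of $0.03/hour.
aws ec2 run-instances \
  --image-id ami-0123456789abcdef0 \
  --instance-type t3.large \
  --count 1 \
  --instance-market-options 'MarketType=spot,SpotOptions={MaxPrice=0.03,SpotInstanceType=one-time}'
```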

Enter the bucket name and then click on the Next button.

  • Transition to Standard Infrequent Access storage class (after 30 days of creation date).

  • Transition to Glacier storage class (after 60 days of creation date).

  • It can also delete the objects permanently.
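A hedged sketch of such a lifecycle rule with the AWS CLI; the bucket name is a placeholder and the 365-day expiration is an arbitrary example value:

```bash
# Transition objects to Standard-IA after 30 days, Glacier after 60, delete after 365.
cat > lifecycle.json <<'EOF'
{
  "Rules": [{
    "ID": "archive-then-expire",
    "Status": "Enabled",
    "Filter": { "Prefix": "" },
    "Transitions": [
      { "Days": 30, "StorageClass": "STANDARD_IA" },
      { "Days": 60, "StorageClass": "GLACIER" }
    ],
    "Expiration": { "Days": 365 }
  }]
}
EOF
aws s3api put-bucket-lifecycle-configuration \
  --bucket my-demo-bucket \
  --lifecycle-configuration file://lifecycle.json
```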

  • We can create an internet gateway and attach it to our VPC.
  • It provides much better security control over your AWS resources.

  • We can assign security groups to individual instances.

  • We also have subnet network access control lists (ACLS).

• Peering is in a star configuration, i.e., one central VPC peers with four other VPCs.

  • Transitive peering is not supported.

  • Tighten security settings
    Make sure that your security group and network access control rules allow relevant traffic to flow in and out of your instance

    The rules of a security group can be modified at any time, and apply immediately

Unlike security groups, network ACLs are stateless: responses to allowed inbound traffic are subject to the rules for outbound traffic (and vice versa).

500 security groups per VPC
  • 50 inbound and 50 outbound rules per security group
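A hedged sketch of adding rules to a security group with the AWS CLI (the group ID and admin CIDR are placeholders):

```bash
# Open HTTP to the world and SSH to a single admin IP.
aws ec2 authorize-security-group-ingress \
  --group-id sg-0123456789abcdef0 \
  --protocol tcp --port 80 --cidr 0.0.0.0/0
aws ec2 authorize-security-group-ingress \
  --group-id sg-0123456789abcdef0 \
  --protocol tcp --port 22 --cidr 203.0.113.10/32
# Security groups are stateful, so return traffic for these rules is allowed automatically.
```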

    When AMI is no longer required, then you can also deregister it.
  • In EBS - backed instances, you will be charged or billed for the storage of static data such as operating systems files, etc.

  • The cost of adding the EBS Volume to an EC2 instance is minimal.

| | EBS-backed instance | Instance store-backed instance |
| --- | --- | --- |
| Size limit | 1 TB | 10 - 16 TB |
| AMI creation | AMI is very easily created by using a single command. | To create an AMI, it requires installation of the AMI tools. |
| Cost | It is more expensive as compared to an instance store-backed instance. | It is less expensive. |
| Lifecycle | It supports stopping as well as restarting of an instance by saving the state to the EBS volume. | An instance cannot be stopped; it can be either in a running or terminated state. |
| Data persistence | Data persists in the EBS volume. If an instance is terminated, no data is lost. | Data does not persist, so when the instance is terminated, the data is lost. |
| Boot time | It takes less than 1 min. | It usually takes less than 5 min. |

    It supports up to 4,000 IOPS, which is quite high.
  • SSD storage is very high performing, but it is quite expensive as compared to HDD (Hard Disk Drive) storage.

  • SSD volume types are optimized for transactional workloads such as frequent read/write operations with small I/O size, where the performance attribute is IOPS.

  • It is used when you require more than 10,000 IOPS.

    It can support up to 100 IOPS which is very low.

    It is used for Big data, Data warehouses, Log processing, etc.

  • It cannot be a boot volume, so it contains some additional volume. For example, if we have Windows server installed in a C: drive, then C drive cannot be a Throughput Optimized Hard disk, D: drive or some other drive could be a Throughput Optimized Hard disk.

  • The size of the Throughput Hard disk can be 500 GiB to 16 TiB.

  • It supports up to 500 IOPS.

  • It is mainly used for a File server.
  • It cannot be a boot volume.

  • The size of the Cold Hard disk can be 500 GiB to 16 TiB.

  • It supports up to 250 IOPS.

• Magnetic is the only HDD-backed volume type that can be used as a boot volume.

    Each domain name is registered in a central database known as the WhoIS database.

  • The popular domain registrars include GoDaddy.com, 123-reg.co.uk, etc.

• The default TTL (time to live) for resource records. Every DNS record has a TTL, and it should be kept as low as practical: the lower the TTL, the faster a change propagates, because cached copies expire sooner. For example, if the record for Hindi100.com has a TTL of 60 seconds and you change its IP address, it can take up to 60 seconds for the change to take effect.
  • The refresh interval: the number of seconds a secondary name server waits before checking the primary for updates.

  • The expire time: the maximum number of seconds a secondary name server can use the data before it must be either refreshed or considered expired.

• A CNAME record cannot be used for the naked domain name (the zone apex), which does not contain a www in front of the domain name; CNAMEs can only be used for subdomains such as www.example.com.

    The resolver obtains the authoritative name server for the domain—these will be four Amazon Route 53 name servers that host the domain’s DNS zone.

• The DNS resolver chooses one of the four Route 53 servers and requests details for the hostname www.example.com.

  • The Route 53 name server looks in the DNS zone for www.example.com, gets the IP address and other relevant information, and returns it to the DNS resolver.

  • The DNS resolver returns the IP address to the user’s web browser. The DNS resolver also caches the IP address locally as specified by the Time to Live (TTL) parameter.

  • The browser contacts the webserver or other Amazon-hosted services by using the IP address provided by the resolver.

  • The website is displayed on the user’s web browser.
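A hedged sketch of creating the record that those steps resolve, using the AWS CLI; the hosted zone ID, domain, and IP address are placeholders:

```bash
# Upsert an A record for www.example.com in a Route 53 hosted zone.
cat > record.json <<'EOF'
{
  "Changes": [{
    "Action": "UPSERT",
    "ResourceRecordSet": {
      "Name": "www.example.com",
      "Type": "A",
      "TTL": 300,
      "ResourceRecords": [{ "Value": "203.0.113.25" }]
    }
  }]
}
EOF
aws route53 change-resource-record-sets \
  --hosted-zone-id Z0123456789ABCDEFGHIJ \
  --change-batch file://record.json
```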

  • Fast
  • Cost-effective

  • Designed to Integrate with Other AWS Services

  • Secure

  • Scalable

  • Limited Route 53 DNS load balancing: The features of AWS Route 53 load balancer lack advanced policy support and enterprise-class features and provide only basic load balancing capabilities.

  • Route 53 Cost: For businesses using Route 53 with non-AWS endpoints or services, the service is expensive. In particular, the visual editor is costly including the cost of each query.

• No support for private zone transfers: AWS Route 53 cannot be made the authoritative source for a zone through a standard DNS zone transfer, even after the root-level domain has been registered; zone data has to be created or imported within Route 53 itself.

  • Latency: All AWS Route 53 queries must be forwarded to external servers after contacting Amazon infrastructure.

  • DNSMadeEasy: It offers affordable DNS management services that are easy to manage. It also has the highest uptime and amazing ROI.

  • DNSimple: With DNSimple, you can register a domain quickly with no upselling and hassles.

AWS Tools for Windows PowerShell – Provides commands for a broad set of AWS products for those who script in the PowerShell environment. To get started, see the AWS Tools for Windows PowerShell User Guide. For more information, see the AWS Tools for PowerShell Cmdlet Reference.
  • AWS SDKs – Provides language-specific API operations and takes care of many of the connection details, such as calculating signatures, handling request retries, and handling errors. For more information, see AWS SDKs.

  • Query API – Provides low-level API actions that you call using HTTPS requests. Using the Query API is the most direct way to access AWS services. However, it requires your application to handle low-level details such as generating the hash to sign the request, and handling errors. For more information, see the Amazon EC2 Auto Scaling API Reference.

  • AWS CloudFormation – Supports creating Auto Scaling groups using CloudFormation templates. For more information, see Create Auto Scaling groups with AWS CloudFormation.

  • Groups

Your EC2 instances are organized into groups so that they can be treated as a logical unit for the purposes of scaling and management. When you create a group, you can specify its minimum, maximum, and desired number of EC2 instances. For more information, see Auto Scaling groups.

    Configuration templates

Your group uses a launch template, or a launch configuration (not recommended, offers fewer features), as a configuration template for its EC2 instances. You can specify information such as the AMI ID, instance type, key pair, security groups, and block device mapping for your instances. For more information, see Launch templates and Launch configurations.

    Scaling options

Amazon EC2 Auto Scaling provides several ways for you to scale your Auto Scaling groups. For example, you can configure a group to scale based on the occurrence of specified conditions (dynamic scaling) or on a schedule. For more information, see Scaling options.
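A hedged sketch of these concepts with the AWS CLI, assuming a launch template named web-template already exists; the subnet IDs are placeholders:

```bash
# Create the group from the launch template with min/max/desired capacity.
aws autoscaling create-auto-scaling-group \
  --auto-scaling-group-name web-asg \
  --launch-template LaunchTemplateName=web-template,Version='$Latest' \
  --min-size 2 --max-size 6 --desired-capacity 2 \
  --vpc-zone-identifier "subnet-0123456789abcdef0,subnet-0fedcba9876543210"

# Dynamic scaling: keep average CPU utilization around 50%.
aws autoscaling put-scaling-policy \
  --auto-scaling-group-name web-asg \
  --policy-name cpu-50 \
  --policy-type TargetTrackingScaling \
  --target-tracking-configuration \
      '{"PredefinedMetricSpecification":{"PredefinedMetricType":"ASGAverageCPUUtilization"},"TargetValue":50.0}'
```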

Related resources:

    Get started with Amazon EC2 Auto Scaling
    Use Elastic Load Balancing to distribute traffic across the instances in your Auto Scaling group
    Monitor CloudWatch metrics for your Auto Scaling groups and instances
    Application Auto Scaling User Guide
    Prepare to use the AWS CLI
    hashtag
    Application Load Balancer
• Amazon Web Services (AWS) launched a new load balancer known as the Application Load Balancer (ALB) on August 11, 2016.

    • It is used to direct user traffic to the public AWS cloud.

    • It identifies the incoming traffic and forwards it to the right resources. For example, if a URL has /API extensions, then it is routed to the appropriate application resources.

    • It is operated at Layer 7 of the OSI Model.

    • It is best suited for load balancing of HTTP and HTTPs traffic.

    • Application load balancers are intelligent, sending specific requests to specific web servers.

• Take TESLA as an example. There are three models, TESLA Model X, TESLA Model S, and TESLA Model 3, and each has an onboard computing facility. You would have one group of web servers that serves Model X, another group that serves Model S, and another for Model 3. A single Application Load Balancer inspects the incoming traffic, determines whether it comes from a Model X, Model S, or Model 3, and then sends it to the intended group of servers.
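A hedged sketch of that kind of path-based routing with the AWS CLI; the listener and target group ARNs are placeholders:

```bash
# Requests whose path starts with /api/* are forwarded to a dedicated target group.
aws elbv2 create-rule \
  --listener-arn arn:aws:elasticloadbalancing:eu-west-1:123456789012:listener/app/demo-alb/abc/def \
  --priority 10 \
  --conditions Field=path-pattern,Values='/api/*' \
  --actions Type=forward,TargetGroupArn=arn:aws:elasticloadbalancing:eu-west-1:123456789012:targetgroup/api-tg/0123456789abcdef
```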

    hashtag
    Network Load Balancer

    • It is operated at the Layer 4 of the OSI model.

    • It makes routing decisions at the transport layer (TCP/SSL), and it can handle millions of requests per second.

    • When a load balancer receives a connection, it then selects a target from the target group by using a flow hash routing algorithm. It opens the TCP connection to the selected target of the port and forwards the request without modifying the headers.

    • It is best suited for load balancing the TCP traffic when high performance is required.

    AD

    hashtag
    Classic Load Balancer

    • It is operated at Layer 4 of the OSI model.

    • It routes the traffic between clients and backend servers based on IP address.

• For example, when an Elastic Load Balancer receives a request from a client on TCP port 80, it routes the request to a specified port on a backend server (port 80 in this example). The backend server sends the requested data back to the ELB, which then forwards the reply to the client. From the client's perspective, the request has been fulfilled by the ELB, not by the backend server.

    • Classic Load balancers are legacy Elastic load balancers.

    • It can also be used for load balancing the HTTP or HTTPs traffic and use layer 7-specific features, such as X-forwarded and sticky sessions.

    • You can also use the Layer 4 load balancing for applications that rely purely on the TCP protocol.


    AD

    hashtag
    Load Balancer Errors

    • Classic Load Balancer

If you get an error 504, this is a gateway timeout error. The load balancer is still available, but it has a problem communicating with the EC2 instance. If your application stops responding, the ELB (Classic Load Balancer) responds with a 504 error. This means the application is having issues at either the web server layer or the database layer.

    To fix it, troubleshoot where the application is failing and scale it up or out where possible.

    X-Forwarded-For-Header

    The X-Forwarded-For-Header is used to determine the IP address of a client when you use a classic load balancer.

    Working of X-Forwarded-For-Header

• A user is on the public IPv4 address 124.12.3.23.

    • The user sends a request to the Classic Load Balancer, which in turn forwards the request to an EC2 instance. The EC2 instance sees only the private address, i.e., 10.0.0.23, as the source of the forwarded connection.

    • Because the instance captures only that private address (the Classic Load Balancer hides the client's public IP), the public address would normally be lost. The public address is needed because it provides valuable information, such as who is using your website.

    • The EC2 instance therefore receives the client's original IPv4 address in the X-Forwarded-For request header from the Classic Load Balancer.


    Fawry cloud devops internship

    circle-check

    Fawry cloud devops internship:

    Start Date : 15 Oct 2023

    End Date : 8 Nov 2023

    Location : Working Days from office : Monday , Wednesday

    hashtag
    Day 1: Introduction to AWS

    1. Introduction to AWS (1 hour)

      • Overview of AWS services and benefits

      • AWS Global Infrastructure

    hashtag
    Day 2: Storage and RDS

    1. AWS Storage Services (2 hours)

      • Amazon S3: Object storage

      • Amazon EBS: Block storage

    hashtag
    Day 3: VPC and Network

    1. AWS Networking and Content Delivery (2 hours)

      • Amazon VPC: Virtual private cloud

      • Amazon Route 53: DNS service

    hashtag
    Day 4: CI/CD and Monitoring

    1. AWS Management and Monitoring (2 hours)

      • AWS CloudFormation: Infrastructure as code

      • AWS CloudWatch: Monitoring and logging

    hashtag
    Day 5: AWS Serverless and AI

    1. AWS Serverless Architecture (2 hours)

      • AWS Lambda: Event-driven serverless functions

      • Amazon API Gateway: Building RESTful APIs

    hashtag
    Day 6 : AWS EKS

    1. Introduction to AWS EKS and Fargate (1 hour)

      • Overview of AWS Elastic Kubernetes Service (EKS)

      • Introduction to AWS Fargate

    hashtag
    Day 7 : AWS Fargate

    1. Introduction to AWS Fargate (1 hour)

      • Overview of AWS Fargate and its role in serverless container management

      • Use cases and benefits of using Fargate

    hashtag
    Day 8: Managing EKS and Fargate

    1. Managing and Scaling EKS and Fargate (2 hours)

      • Scaling EKS worker nodes

      • Scaling Fargate tasks and clusters

    User Guide

    hashtag
    Login

    There is no public section in the ZiSoft Security Awareness Application. Any user wishing to perform any kind of operation on the system will have to login first.

    Zinad Logo

    The system comes with two predefined users

    1. Administrator whose username is “admin” and default password is “Admin123@”

    2. Normal User whose username is “user” and password is “User123@”

    It is extremely important to change the password of your admin user once you have installed the application. To do so...

    1. login with the default admin username and password,

    2. then go to the “Administrator” drop down on the top bar

    3. navigate to “settings” => users

    hashtag
    Initial Configuration

    The first thing you need to do after installing the system is to head to the “settings” area and add your organization information. This is a system-wide information that will be used throughout the application.

    To change your settings...

    1. navigate to "Administration" => "Settings"

    2. click on “Edit” then set your application URL and Company name. Make sure to include the protocol and the fully qualified domain name in the Host field. This has to be the same URL from which you are serving your awareness application. For example: Host Name should be . Example of company name is ”Big Company”.

    3. click save and wait for the confirmation message

    hashtag
    Manually Adding Users

    Zisoft allows you to add users either manually or by importing them from a CSV file or by integration with LDAP. To add users manually,

    1. navigate to the "Administrator" => "Settings" => "Users" page

    2. Click on "+ User"

    3. Enter the user details and click "Save"

Note that you need to choose the type of the newly added user. Users can be one of the following types:

    • Administrator: This is a user with full access to the system. This user can create and manage campaigns and other users, send emails, view reports and so on.

    • Moderator: This is a user with the same authorities as Administrator, However, this user can't see the campaigns created by any Administrator or any other Moderator. Moderators also don't have access to some system settings like LDAP servers or Email server.

• User: This is the normal user who can only view lessons and complete exams.

    hashtag
    Resetting a Password for Another User

    As an Administrator, you can reset the password of any other non-admin user. To do so...

    1. navigate to the users page

    2. click on "password" next to the respective user

    3. set the password twice, then click save

    hashtag
    Email Server Settings

An email server is used whenever you are sending emails from the Awareness system. You can configure one server for all kinds of email or a separate server for each kind.

    The "Awareness" system sends the following kinds of Emails

    • Training: This is an email you send in the context of a specific training campaign. For example, an email saying "Dear user, you have been invited to a campaign 'Q1'", and so on. More on this later in this guide

    • Phishing: This is an email you send to your users to verify their vulnerability to phishing emails, something like an attack simulator. More on this later in this guide.

    • Tips: This is an email you send to your users to advise them on best security practices and behaviors. For example "Make sure to remember to lock your PC before you leave the office". You can send tips everyday, every week, every month, or so on.

    To configure a server...

    1. navigate to "Administrator" => "Emails" => "Email Servers"

    2. click on "+ Email Server"

    3. add all server information like host, port, and authentication methods

    Note that, like any other password in "Awareness", there is a separate action for setting this password. To set email password ...

1. click on "Password" next to the respective email server

    2. enter the password twice

    3. click "Save"

    hashtag
    Importing Users from LDAP

    Awareness can import your LDAP users automatically, but you will need to configure an LDAP server first. To do so...

    1. Navigate to "Administrator" => "Settings" => "LDAP Servers"

    2. Click on "+ LDAP Server"

    3. Enter all the server information

    To set the password of the LDAP server, there is a separate action for it

    1. Click on "Password" of the respective server

    2. Enter the password

    3. Click save

    Once a server is configured, you will need to click on "Import" to pull all the users from the LDAP server and create respective users for them in the system

    Note: Clicking on "Import" does not import users immediately. This action spawns a background process called "Job" that will do the actual importing. You will see that the number of users on the system starts to increase gradually until all users are imported. You can check background job status in the "Administrator" => "Settings" => "Background Jobs" page

Also note that any user imported from LDAP will automatically be assigned the role "User", not "Administrator". An existing administrator will need to change a specific user's role to "Administrator" to grant that privilege.

    hashtag
    Training Campaign

    A campaign is like a course or a semester. You start a campaign when you have some lessons that you need to distribute to your users. The campaign holds information about the lessons, the users, start date, end date, exams, and so on.

    To create a campaign...

    1. Navigate to "Administrator" => "Training" => "Training Campaigns".

    2. Click "+ Campaign"

    3. Enter the campaign information

Note that the choice of having an exam in the campaign is final and cannot be changed once the campaign has been created. Changing the exam would produce wrong values in the reports, because some of your users might have already submitted the old exam. More on exams later.

    Now, you need to add lessons to the campaign

    1. Click on "Lessons" next to the desired campaign

    2. From the menu, select all applicable Lessons

    3. Click Save

    Once the campaign has lessons, it is time to add users to it.

    1. Click on "Users" next to the desired Campaign

    2. Select all applicable users

    3. Click "Save"

    Your campaign is now ready. Any user who signs into the "Awareness" system will be able to see that they are required to complete those lessons (and possibly an exam if you configured one) in order to pass the campaign.

    hashtag
    Exams

An exam is a collection of questions from different lessons. You can have an exam with one question from every lesson, an exam with two questions from each lesson related to passwords, or any combination of questions you like. The questions in the exam have no direct relation to the lessons of the training campaign, so it is up to you to make sure your exam does not include questions that have not been covered in the campaign.

    Exams can also be used without any training lessons. For example, you may wish to start a training campaign with only an exam and no lessons at the beginning of your awareness initiative to measure your employees' security awareness and tailor the upcoming training campaigns accordingly.

    To create a new exam and use it in a campaign...

    1. Navigate to "Administrator" => "Training" => "Exams"

    2. Click on "+ Exam"

    3. Enter the name of your Exam

    Now that you created the empty exam, you should be able to add lessons to it.

    1. Next to the exam that you just created, click on "Lessons"

    2. Click on "+ Lesson" in the sub area

    3. Choose the lessons that you want to add to the Exam

    With the Exam ready, you can now navigate to "Training Campaigns" and chose this exam when creating a new campaign.

    Note: Exams are final; you can add an exam to a new campaign, but you can not change it or remove it from a campaign once it is added.

    hashtag
    Custom Lessons, Quizzes and Exams

ZiSoft Awareness comes with a rich set of predefined questions and answers, but sometimes you might want to create your own questions. You can even define a new lesson altogether, for example for your compliance training. To create your own questions in an existing lesson...

    1. Navigate to "Administrator" => "Training" => "Lesson"

    2. Open the desired lesson, then click on "Questions"

3. Click on "+ Add Question"

To create your own lesson, click on "+ Lesson" first, name the lesson, then follow the same steps above.

    hashtag
    Sending Emails to Campaign Users

    Normally, you would need to send a notification email to all users in a given campaign to tell them that they have been enrolled in the campaign and need to start completing the lessons. To do so, we will use the "Email Engine".

    "Awareness" system sends a variety of emails. Sending emails is done also in campaigns. An email campaign is different from a training campaign, but serves the same purpose. It gathers the information needed to send an email to a target set of users.

    1. Navigate to "Administrator" => "Email" => "Email Campaign"

    2. Click on "+ Email Campaign", enter a name for the email campaign, then click "Save"

    3. Click on "Send" next to the campaign that you just created

    Note: like importing users from LDAP, sending emails is a lengthy operation. Imagine sending to ten thousand users. For this reason, clicking on "Send" does not actually send the emails. Instead, it creates a background process (called Job) that does the actual sending. You can check background job status in the "Administrator" => "Settings" => "Job" page

    "Awareness" comes bundled with a set of email templates that you can use, or you can build your own template adding your own logo and signatures and so on. See "Adding Email Templates"

    hashtag
    Including Files in Emails

Admins can add links in the email templates. Links can point to files in an external storage on the internet, an internal storage in your organization, or inside the awareness system server itself. See "Adding Email Templates" below.

    NOTE: ZiSoft does not currently support attaching files inside the emails themselves. All files, images, logos, etc., must be included via links.

    hashtag
    Attending Lessons

    Once you received the campaign notification email, you should be able to view the lessons in this campaign by visiting the home page of the Awareness System e.g (or if you have a cloud deployment). The home page will list all campaigns that you are added to in chronological order.

    Each campaign listing will show how many lessons you have completed and how many lessons you have remaining to be completed. If the admin has added an exam, you will see the status of the exam and your score if you have passed. Click on each lesson to go to the classroom page to attend the lesson and complete its quiz.

    If you are a non admin user, this is the only thing you will need to do on the awareness system.

    hashtag
    Adding Email Templates

    1. Navigate to "Administrator" => "Email" => "Templates"

    2. Click on "+ Email Template"

    3. Enter the template name, content, and so on

    Emails are sent in contexts. For example, "Training", "Phishing", "News", etc. Choosing the right context allows the email engine to process template place holders correctly.

    hashtag
    Template Placeholders

    Embedded inside template content, you can find texts similar to {first_name}, {last_name}, {campaign_name}, etc. These are called "place holders". Placeholders are special texts that get replaced by the email engine at the time of sending the email.

For example, if you have a template that reads "Dear {first_name} {last_name}, you have been enrolled in campaign {campaign_name}", then a user named John Smith enrolled in campaign "Q1" will receive an email that reads "Dear John Smith, you have been enrolled in campaign Q1".

    The list of available email template place holders is

    • {first_name}: First name of the receiver

    • {last_name}: Last name of the receiver

    • {username}: Username of the receiver

    More tags will be added in upcoming releases

    hashtag
    Phishing

    Conducting a phishing campaign requires coordination between several parts of the application

    First, you need to create a container that serves as a record of this campaign. This is called the "Phishpot" or the "Phishing Campaign". Each phishing campaign designates one "Phishing Page". The phishing page is a web page that will be the target of the emails sent to the users. In other words, when your users click on the phishing email, this is the page they will be taken to. The Awareness system comes pre packaged with a set of phishing pages to use, and you can define and use your own if you want.

    Second, you need to send the email. A phishing email is just a regular email sent like any other email, but with one small difference: it contains a special tracking code similar to "". This link is specific to each user and helps the system track which of the users have fallen victim for this phishing campaign and which haven't. The Awareness system also comes packed with phishing emails that you can use out of the box, and you can define your own if you want.

The following sections describe how to configure campaigns, emails, and pages.

    hashtag
    Phishing Campaign

    To create a new phishing campaign ...

    1. Navigate to "Administrator" => Phishing => Phishing campaigns

    2. Click on "+ Phishing Campaign"

    3. Enter the name of the campaign

    Now that the campaign is created, you need to add users to it

    1. Click on "Users" next to the campaign you want

    2. A sub area will open. Click on "+ Add User"

    3. Select the users you want to add to the campaign

    hashtag
    Phishing Email Templates

The Awareness system comes pre-packaged with phishing email templates that you can use out of the box. If you need to define your own, you create a new phishing email template: this is similar to adding any email template, except that you choose the email type "Phishing". Within the email body, you need to insert a link that points to "{host}/execute/page/{link}". Before sending the email, the engine replaces the two special tags {host} and {link} with the host you entered in your settings and the link that is specific to each user, as described in the "Phishing" section above.

    hashtag
    Phishing Pages

    The system comes pre-packaged with phishing pages. To add a phishing page of your own ...

    1. Navigate to "Administration" => "Phishing" => "Phishing Pages"

    2. Click on "+ Phishing Page"

    3. Enter the name of the page and the content

    Example "Out of The Box" Phishing Pages

    LinkedIn Phishing Page

    Office 365 Phishing Page

    hashtag
    Sending Phishing Emails

    To send phishing emails, you need to setup some pre-requisites, namely "Email Server" and DNS Name.

1. Email Server: You need to set up an email server with an email account that will be used to send the phishing emails. The name of this account should be related to the phishing campaign. For example, if your company is called "Company-Inc" and you are using the "LinkedIn" phishing campaign, then you should create a user called "".

    2. DNS Name: When your users click on the phishing emails in the links, they will be redirected to a phishing page. If this page's name is called "awareness.company-inc.com", the users might be aware they are being tested. Instead, the page should be named something like "linkedin.company-inc.com"

    In the Cloud deployment, ZiSoft can provide an account like "" and can provide a DNS like "". If you need more control, you will need to setup your own domain name and email server.

    When you have the above pre-requisites, you can use the Email Campaign tool as described above.

    hashtag
    Reporting

ZiSoft Awareness uses a very powerful reporting engine called MetaBase. MetaBase allows you to run very specific and customized queries against your ZiSoft Awareness instance. For example, you can create a question asking "What is the percentage of people who passed campaign 'Q1'?" or "What is the change in the percentage of success between the last campaign and this campaign?". After asking your questions, you can organize them in "Dashboards". A dashboard is "live", meaning it finds the most recent information in the database and presents it to you whenever you refresh the page. Dashboards can be saved and loaded at any time.

ZiSoft Awareness comes with example dashboards that contain some example questions about training campaigns and phishing campaigns. You can customize them, or you can copy and paste from them to build your own. If you would like help with that, talk to .

    hashtag
    Viewing Dashboards

    To view the dashboards...

    1. click on your user-name on the top-right corner of the screen

    2. click on Admin Dashboards

    3. Choose the dashboard from the drop down

    hashtag
    Editing or Adding Dashboards

    Dashboards are edited in the MetaBase interface. To login in to MetaBase, click on "Analytics" in the Administrator drop down menu

    The username/password of the MetaBase administrator is different from the ZiSoft Awareness administrator. Use admin@example.com and Admin123@ to login. You can change them once you are inside MetaBase UI.

    Laravel Deployment Architecture

    A deployment architecture depicts the mapping of a logical architecture to a physical environment. The physical environment includes the computing nodes in an intranet or Internet environment, CPUs, memory, storage devices, and other hardware and network devices.

    Designing the deployment architecture involves sizing the deployment to determine the physical resources necessary to meet the system requirements specified during the technical requirements phase. You also optimize resources by analyzing the results of sizing the deployment to create a design that provides the best use of resources within business constraints.

    hashtag

    1. SeMA application sizing-estimation process .

    2. Deployment Logical Architecture .

    3. HW Sizing Specs .

    4. Deployment Process .


    hashtag
    Phishing simulations

Besides introducing real-world, customizable phishing simulations, Entrench offers an anti-phishing behavior management feature that lets you measure the awareness level of your employees.

    hashtag
    Training Modules

    Entrench training modules are offered in a variety of styles with full animation and narratives in many languages.

    hashtag
    Filmed Videos

Security awareness sessions are no longer boring or dull. The sessions we offer are based on simplicity, humor, ease of use, and storytelling.

    hashtag
    Quizzes and Exams

    Entrench quizzes and exams can measure level of awareness that has been conveyed to the users.

    hashtag
    Customized Campaigns

Awareness campaigns can be customized and targeted at either a specific group of users or an entire department.

    Zisoft is composed of several independent modules so you can pick and choose which modules you need to deploy. All modules are accessible from the same familiar user interface. The ZiSoft modules are

    1. Learning Management System

    2. Security Awareness Content

    3. Email Phishing Simulator

    hashtag
    Learning Management System

Think of Zisoft LMS as your own private 'Coursera' or 'Udemy'. You use it to create courses complete with videos, slides, quizzes, exams, and certificates. Then you assign the courses to your employees and monitor their training in real time. You can assign as many lessons as you need to as many employees as you need, and you can generate comprehensive reports in a variety of formats and charts.

    hashtag
    Security Awareness Content

    On top of the LMS, Zisoft offers 's security awareness content. These are predefined courses and exams that target security related topics. It offers 'Email Security', 'Data Leakage', 'Device Security', 'Browser Security', among many others.

    hashtag
    Email Phishing Simulator

You are under a constant stream of phishing attacks, whether you believe it or not. If you want to make sure, try our phishing simulator. With this simulator, you can send a 'controlled' and 'harmless' phishing email to a group of your organization's users, and you can monitor in real time how many of them fall victim to those attacks. With the simulator, you can generate a report that tells you the percentage of your organization that is prone to phishing attacks, rank departments according to their vulnerability, and run competitions to encourage employees to get better at spotting phishing attacks.

    SeMA Survey Application Deployment Architecture

    A deployment architecture depicts the mapping of a logical architecture to a physical environment. The physical environment includes the computing nodes in an intranet or Internet environment, CPUs, memory, storage devices, and other hardware and network devices.

    Designing the deployment architecture involves sizing the deployment to determine the physical resources necessary to meet the system requirements specified during the technical requirements phase. You also optimize resources by analyzing the results of sizing the deployment to create a design that provides the best use of resources within business constraints.

    Section A: SeMA application sizing-estimation process .

    Web : PHP Laravel

    Laravel is a PHP based web-framework for building high-end web applications.

    hashtag
    Laravel Overview

Laravel is an open-source PHP framework, which is robust and easy to understand. It follows a model-view-controller design pattern. Laravel reuses existing components from different frameworks, which helps in creating a web application; the resulting application is more structured and pragmatic.

Laravel offers a rich set of functionality that incorporates the basic features of PHP frameworks like CodeIgniter and Yii, as well as frameworks in other languages like Ruby on Rails. This rich feature set boosts the speed of web development.

    Fawry DevOps internship Agenda

    circle-check

    Fawry DevOps internship Agenda:

    Start Date : 1 Nov 2022

    UI : Angular 8

    open-source, client-side TypeScript based JavaScript framework

    hashtag
    Angular Overview

Angular is a framework for building client applications in HTML and JavaScript, or in a language like TypeScript that compiles to JavaScript. The platform is used to develop applications across web, desktop, and mobile. The framework is developed at Google and is used by developers across the globe.

Angular is used to build single-page applications, unlike typical web applications where, on a click, you need to wait for the page to reload. This gives you the experience of a desktop or mobile app, where part of the page refreshes asynchronously without reloading the entire application.

    SeMA application sizing-estimation process .

Shown below is a diagram of the sizing-estimation process. It is not the only method for sizing a deployment, but it should provide useful insight and guidance.

    1. Estimate user concurrency and throughput

In most use-case scenarios, the number of expected concurrent users is far less than the number of named users of the system. A scenario in which all of the named users simultaneously access the system is extremely rare. Normally, a globally distributed organization with users spread out all over the world experiences about ⅓ of its named users online at any time during their individual 8-hour work days.

    Concurrency typically ranges from 10% to 50+% for App deployments.

    Application Security Course

    Prepared by Omar-Abdalhamid (Oct .2022)

    Application security refers to security precautions used at the application level to prevent the theft or hijacking of data or code within the application. It includes security concerns made during application development and design, as well as methods and procedures for protecting applications once they've been deployed.

    circle-info

Course Overview: Application security is the process of developing, adding, and testing security features within applications to prevent security vulnerabilities against threats such as unauthorized access and modification.

Application security, often shortened to AppSec, includes all tasks that introduce a secure software development life cycle to development teams. Its ultimate purpose is to improve security practices and, as a result, detect, repair, and, ideally, prevent security flaws in applications. It covers the entire application life cycle, including requirements analysis, design, implementation, testing, and maintenance.

    Architecture

    ZiSoft Awareness Application Architecture

    hashtag
    ZiSoft SOA Architecture

    hashtag

    Non-Production Deployment

    Development and Test Deployment

    hashtag
    Quick start (ZiSoft Awareness Application )

    hashtag

    DB : MariaDB

    MariaDB. One of the most popular database servers. Made by the original developers of MySQL

    MariaDB Server is one of the most popular database servers in the world. It’s made by the original developers of MySQL and guaranteed to stay open source. Notable users include Wikipedia, WordPress.com and Google.

MariaDB turns data into structured information in a wide array of applications, ranging from banking to websites. It is an enhanced, drop-in replacement for MySQL. MariaDB is used because it is fast, scalable and robust, with a rich ecosystem of storage engines, plugins and many other tools that make it very versatile for a wide variety of use cases.

    MariaDB is developed as open source software and as a relational database it provides an SQL interface for accessing data. The latest versions of MariaDB also include GIS and JSON features.

    FAQ

    hashtag
    Does ZiSoft Awareness offer a desktop client or a mobile app

    No, all access to the system is done through the web application.

    hashtag

    Analytics : Metabase

    Metabase is an open source business intelligence tool

    Metabase is an open source business intelligence tool. It lets you ask questions about your data, and displays answers in formats that make sense, whether that’s a bar graph or a detailed table.

    Your questions can be saved for later, making it easy to come back to them, or you can group questions into great looking dashboards. Metabase also makes it easy to share questions and dashboards with the rest of your team.

    hashtag
    Install Prerequisites

    Section B : Deployment logical and physical Architecture .

    Section C : SeMA application HW sizing Specs .

    End Date : 15 Dec 2022

Location : Fawry Quantum Building - Smart Village. Working days from office : Monday, Wednesday

DevOps Methodology :

Git and GitLab :

Containerization [ Docker - Kubernetes ] :

Continuous integration (CI) : Build CI on GitLab CI, including stages:

Continuous delivery (CD) : Fast-forward deployment to Staging and Production :

    AWS Management Console

  • Setting up AWS Account and IAM (1 hour)

    • Creating an AWS account

    • Configuring Identity and Access Management (IAM)

    • Creating IAM users, groups, and roles

  • AWS Compute Services (2 hours)

    • Amazon EC2: Launching virtual servers

    • Amazon ECS: Container orchestration

    • AWS Lambda: Serverless computing

  • Lab 1: Launching an EC2 Instance (1 hour)

    • Step-by-step guidance on creating and launching an EC2 instance

    • Configuring security groups and key pairs

    • Connecting to the EC2 instance using SSH

  • Amazon Glacier: Long-term archival storage

  • AWS Database Services (2 hours)

    • Amazon RDS: Managed relational databases

    • Amazon DynamoDB: NoSQL database

    • Amazon Redshift: Data warehousing

  • Lab 2: Setting up an S3 Bucket and Uploading Files (1 hour)

    • Creating an S3 bucket

    • Configuring bucket policies and permissions

    • Uploading files to the S3 bucket

  • Amazon CloudFront: Content delivery network

  • AWS Security and Compliance (2 hours)

    • AWS Identity and Access Management (IAM)

    • AWS Security Groups and NACLs

    • AWS Web Application Firewall (WAF)

  • Lab 3: Creating a VPC and Launching an EC2 Instance (1 hour)

    • Creating a custom VPC

    • Configuring subnets, route tables, and internet gateways

    • Launching an EC2 instance within the VPC

  • AWS Trusted Advisor: Best practice recommendations

  • AWS DevOps and Deployment (2 hours)

    • AWS CodeCommit: Version control service

    • AWS CodeBuild: Continuous integration service

    • AWS CodeDeploy: Automated application deployment

  • Lab 4: Setting up Continuous Integration and Deployment (1 hour)

    • Setting up a CodeCommit repository

    • Configuring CodeBuild for continuous integration

    • Deploying an application using CodeDeploy

  • AWS Step Functions: Workflow orchestration

  • AWS AI and Machine Learning (2 hours)

    • Amazon Rekognition: Image and video analysis

    • Amazon Polly: Text-to-speech service

    • Amazon SageMaker: Machine learning platform

  • Lab 5: Building a Serverless API (1 hour)

    • Creating a Lambda function

    • Configuring API Gateway for RESTful API endpoints

    • Testing the serverless API

  • Benefits of using EKS and Fargate for containerized applications

  • Setting up AWS EKS Cluster (2 hours)

    • Creating an Amazon EKS cluster

    • Configuring cluster networking and security

    • Integrating with AWS Identity and Access Management (IAM)

  • Deploying Applications on AWS EKS (2 hours)

    • Working with Kubernetes pods, deployments, and services

    • Configuring container registries (e.g., Amazon ECR)

    • Deploying applications using YAML manifests

  • Lab 1: Creating an EKS Cluster and Deploying an Application (1 hour)

    • Step-by-step guidance on creating an EKS cluster

    • Deploying a sample application using Kubernetes manifests

    • Verifying the application deployment

  • Setting up AWS Fargate Cluster (2 hours)

    • Creating a Fargate cluster

    • Defining task definitions and containers

    • Configuring networking and security for Fargate tasks

  • Deploying Applications on AWS Fargate (2 hours)

    • Configuring Fargate task definitions

    • Task scheduling and auto-scaling

    • Managing container logs and monitoring

  • Lab 2: Creating a Fargate Cluster and Deploying an Application (1 hour)

    • Step-by-step guidance on setting up a Fargate cluster

    • Configuring task definitions and containers

    • Deploying a sample application on Fargate

  • Managing updates and rolling deployments

  • Monitoring and Logging with AWS EKS and Fargate (2 hours)

    • Configuring Amazon CloudWatch for monitoring

    • Collecting logs from EKS and Fargate

    • Analyzing metrics and logs for troubleshooting

  • Securing EKS and Fargate Workloads (2 hours)

    • Network security with VPC and security groups

    • Access control and permissions with IAM roles

    • Container security best practices

  • Lab 3: Scaling and Monitoring EKS and Fargate (1 hour)

    • Scaling EKS worker nodes and Fargate tasks

    • Configuring CloudWatch for monitoring

    • Analyzing logs and metrics for troubleshooting

• Fawry Quantum Building - Smart Village

Git vs SVN

| Git | SVN |
| --- | --- |
| Git is a distributed version control system. | SVN is a centralized version control system. |
| Git is an SCM (source code management) tool. | |
| Git has a cloned repository. | SVN does not have a cloned repository. |
| Git branches are easy to work with; Git merges files quickly and also helps find the unmerged ones. | SVN branches are folders that exist in the repository, and special commands are required for merging branches. |
| Git does not have a global revision number. | SVN has a global revision number. |
| Git cryptographically hashes its contents, which protects them from repository corruption caused by network issues or disk failures. | SVN does not cryptographically hash its contents. |
| Git stores content as metadata. | SVN stores content as files. |
| Git has more content protection than SVN. | SVN's content is less secure than Git's. |
| Linus Torvalds developed Git for the Linux kernel. | CollabNet, Inc. developed SVN. |
| Git is distributed under the GNU General Public License. | SVN is distributed under an open-source license. |

| Concurrency (% of named users) | Named Users | Concurrent Users |
| --- | --- | --- |
| 10 % [ Minimum ] | 5000 | 500 |
| 20 % [ Average ] | 5000 | 1000 |
| 30 % [ Recommended ] | 5000 | 1500 |
| 40 % [ Highly Scaled ] | 5000 | 2000 |

    2. Estimate SeMA Application HW sizing

    circle-check

A - Every CPU core handles 2 workers. B - A worker can handle 5 system users or 20 portal users. C - A concurrent user is a user using the app simultaneously within a 5-second session window. D - Best practice is 17 to 33 workers per machine (due to processor queuing).

3. Estimate SeMA DB HW sizing. Database storage requirements may be estimated roughly using the following calculation:

    circle-check

    4. Estimate SeMA File Storage HW sizing

    circle-info

    hashtag
    SeMA App file storage can start with 4 TB and extend based on monitoring actual usage of storage

Number of Users [ 5000 ]
× Number of events per user per day [ 15 ]
× 5 KB per event

Number of Users × Number of events per user per day
× 5 KB per event

For example, an organization of 5,000 users, with each user generating
an average of 15 events per day,
requiring a 30-day retention, would require a database capacity of:

Number of portal users [ 5000 ]
Number of concurrent users = 40 × 5000 / 100 = 2000
× Number of Workers = 2000 / 20 = 100 Workers
× Number of Workers / VM = 33
× Number of Cores / VM = 16 cores
× RAM capacity estimation [ 1 core × 4 ] = 16 × 4 = 64 GB / VM

Minimum 4 VM with 16 cores / 64 GB RAM
Recommended 5 VM with 16 cores / 64 GB RAM

## DB storage sizing / Month
      5,000 × 15 × 5 × 30 = 11,250,000 KB, or ~11 GB

## DB storage sizing / Year
      11 GB per Month × 12 = 132 GB / Year
    
DB HW sizing for 5000 concurrent client connections
    
    # DB Version: 14
    # OS Type: linux
    # DB Type: web
    # Total Memory (RAM): 96 GB
    # CPUs num: 24
    # Connections num: 5000
    # Data Storage: san
    
    max_connections = 5000
    shared_buffers = 24GB
    effective_cache_size = 72GB
    maintenance_work_mem = 2GB
    checkpoint_completion_target = 0.9
    wal_buffers = 16MB
    default_statistics_target = 100
    random_page_cost = 1.1
    effective_io_concurrency = 300
    work_mem = 1258kB
    min_wal_size = 1GB
    max_wal_size = 4GB
    max_worker_processes = 24
    max_parallel_workers_per_gather = 4
    max_parallel_workers = 24
    max_parallel_maintenance_workers = 4
× Number of users [ 5000 ]
× Number of surveys per user per day [ 2 ]
× 25 MB per survey attachment

Number of Users × Number of surveys per user per day
× 25 MB per survey attachment

## File storage sizing / Month

      5,000 × 2 × 25 MB × 30 = 750 GB

## File storage sizing / Year

      750 GB per Month × 12 = 9 TB / Year
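The sizing rules above can be consolidated into a small script. The sketch below is illustrative only; the helper names and defaults are not part of the SeMA tooling, they simply encode the rules of thumb listed in this section.

```python
import math

def app_tier_sizing(named_users, concurrency_pct, portal_users_per_worker=20,
                    cores_per_vm=16, workers_per_core=2, ram_gb_per_core=4):
    """Rough SeMA app-tier sizing based on the rules of thumb above."""
    concurrent_users = named_users * concurrency_pct / 100
    workers = math.ceil(concurrent_users / portal_users_per_worker)
    workers_per_vm = cores_per_vm * workers_per_core        # ~32-33 workers per VM
    vms = math.ceil(workers / workers_per_vm)
    return {
        "concurrent_users": concurrent_users,
        "workers": workers,
        "vms": vms,
        "ram_gb_per_vm": cores_per_vm * ram_gb_per_core,
    }

def db_storage_gb_per_month(users, events_per_user_per_day,
                            kb_per_event=5, retention_days=30):
    """DB storage estimate: users x events/day x event size x retention."""
    return users * events_per_user_per_day * kb_per_event * retention_days / 1024 / 1024

print(app_tier_sizing(5000, 40))                 # ~2000 concurrent users, 100 workers, 4 VMs
print(round(db_storage_gb_per_month(5000, 15)))  # ~11 GB per month
```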
    
    find the admin user, then click on “Password”
  • enter the password twice and click "Save".

  • if successful, you will see a confirmation message

  • General: Any other email you send that is not of the above types is called "general"

    click "Save"
    Click "Save"
    Click Save
    Click "Save"
    Click "Save"
    Add your question
  • Click on save

  • Then click on "Answers" on the newly created question

• Add the desired list of answers, specifying for each answer whether it is correct or not.

  • From the "context" menu, select "Training"
  • Find the training campaign that you need to send emails for and click next

  • Choose the email server and email template that you need to use

  • You will see a list of "place holders" that appear below the template name. You can use them to override any placeholder manually. Usually, you will need to leave them alone.

  • Click "Send"

  • Choose the context in which this template will be used
  • Click Save

  • {link}: The special tracking number used in phishing Emails
  • {company_name}: The name of the company configured in the settings section

  • {email}: Email address of the receiver

  • {host}: The host address configured in the settings section

  • {phishpot_title}: The name of a phishpot campaign

  • {campaign_title}: The name of a training campaign

  • Choose the phishing page
  • Click save

  • Click save
    Click save
    Add the appropriate parameters to the selected dashboard (e.g. the campaign id or the phishpot id)
https://big-company.com/awareness
https://awareness.my-company.com/
https://yourcompany.zisoftonline.com
http://coolwebsite.com?link=hdhfey37ehfbf
linkedin@company-inc.com
linkedin@zisoftonline.com
linkedin.zisoftonline.com
MetaBase
support@zisoftonline.com
    Zisoft Home
    Zisoft Lesson
    Zisoft Linkedin
    Zisoft Office

If you are familiar with Core PHP and Advanced PHP, Laravel will make your task easier. It saves a lot of time if you are planning to develop a website from scratch. Moreover, a website built in Laravel is secure and helps prevent several common web attacks.

    hashtag
    Advantages of Laravel

    Laravel offers you the following advantages, when you are designing a web application based on it −

• The web application becomes more scalable, owing to the Laravel framework.

• Considerable time is saved in designing the web application, since Laravel reuses components from other frameworks.

• It includes namespaces and interfaces, which help to organize and manage resources.

    hashtag
    Composer

Composer is a tool that manages all the dependencies and libraries. It allows a user to create a project based on a given framework (for example, the packages used in a Laravel installation). Third-party libraries can be installed easily with the help of Composer.

All the dependencies are noted in the composer.json file, which is placed in the project's root folder.
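As an illustration, a minimal composer.json might look like the sketch below; the package name and versions are only examples.

```json
{
    "name": "acme/awareness-portal",
    "description": "Example Laravel project",
    "require": {
        "php": ">=7.0",
        "laravel/framework": "5.1.*"
    },
    "require-dev": {
        "phpunit/phpunit": "~4.0"
    }
}
```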

    hashtag
    Artisan

The command-line interface used in Laravel is called Artisan. It includes a set of commands that assist in building a web application. These commands are incorporated from the Symfony framework, resulting in add-on features in Laravel 5.1.
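A few commonly used Artisan commands, for illustration (the class names after make: are placeholders):

```bash
php artisan list                            # show all available Artisan commands
php artisan make:controller UserController  # generate a controller class
php artisan make:model Lesson               # generate an Eloquent model
php artisan migrate                         # run outstanding database migrations
```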

    hashtag
    Features of Laravel

    Laravel offers the following key features which makes it an ideal choice for designing web applications −

    hashtag
    Modularity

Laravel provides 20 built-in libraries and modules that help enhance the application. Every module is integrated with the Composer dependency manager, which eases updates.

    hashtag
    Testability

    Laravel includes features and helpers which helps in testing through various test cases. This feature helps in maintaining the code as per the requirements.

    hashtag
    Routing

    Laravel provides a flexible approach to the user to define routes in the web application. Routing helps to scale the application in a better way and increases its performance.

    hashtag
    Configuration Management

    A web application designed in Laravel will be running on different environments, which means that there will be a constant change in its configuration. Laravel provides a consistent approach to handle the configuration in an efficient way.

    hashtag
    Query Builder and ORM

Laravel incorporates a query builder that helps in querying databases using simple chained methods. It also provides an ORM (Object-Relational Mapper) and ActiveRecord implementation called Eloquent.
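A short sketch of both styles, assuming a hypothetical users table and a corresponding User Eloquent model:

```php
<?php

use Illuminate\Support\Facades\DB;

// Query builder: simple chained methods against the "users" table
$activeUsers = DB::table('users')
    ->where('active', 1)
    ->orderBy('name')
    ->get();

// Eloquent ORM: the User model maps to the same table in ActiveRecord style
$user = App\User::where('email', 'jane@example.com')->first();
$user->name = 'Jane Doe';
$user->save();
```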

    hashtag
    Schema Builder

    Schema Builder maintains the database definitions and schema in PHP code. It also maintains a track of changes with respect to database migrations.

    hashtag
    Template Engine

    Laravel uses the Blade Template engine, a lightweight template language used to design hierarchical blocks and layouts with predefined blocks that include dynamic content.
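A minimal Blade sketch: a layout that defines a block and a child view that fills it (the file names and content are illustrative):

```blade
{{-- resources/views/layouts/app.blade.php --}}
<html>
  <body>
    @yield('content')   {{-- child views inject their content here --}}
  </body>
</html>

{{-- resources/views/home.blade.php --}}
@extends('layouts.app')

@section('content')
  <p>Hello, {{ $name }}!</p>
@endsection
```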

    hashtag
    E-mail

    Laravel includes a mail class which helps in sending mail with rich content and attachments from the web application.

    hashtag
    Authentication

    User authentication is a common feature in web applications. Laravel eases designing authentication as it includes features such as register, forgot password and send password reminders.

    hashtag
    Redis

    Laravel uses Redis to connect to an existing session and general-purpose cache. Redis interacts with session directly.

    hashtag
    Queues

Laravel includes queue services for tasks such as emailing a large number of users or running a specified cron job. Queues make it easier to complete such tasks without waiting for the previous task to finish.

    hashtag
    Event and Command Bus

    Laravel 5.1 includes Command Bus which helps in executing commands and dispatch events in a simple way. The commands in Laravel act as per the application’s lifecycle.

    hashtag
    Get started

• PHP version 7.0 or higher

    • An OpenSSL Extension for PHP

    • A PDO Extension for PHP

    • Mbstring Extension for PHP

    • Tokenizer Extension for PHP

    • XML Extension for PHP

    hashtag
    Composer

Laravel uses Composer to manage its dependencies. Hence, before using Laravel, check whether you have Composer set up on your system.

    If you don't have composer installed on your computer, first visit this URL to download composer:

https://getcomposer.org/download/

When you are done installing Composer, cross-check the installation by typing the composer command in the command prompt. You should see the Composer information screen in the terminal.

Keep in mind that Composer's system-wide vendor bin directory must be in your $PATH, so that your system can locate the laravel executable. Where this directory lives depends on your operating system; on macOS and Linux it is typically:
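The exact path can differ per setup, but it is usually one of the following:

```bash
$HOME/.composer/vendor/bin
# or, on newer Composer versions on Linux:
$HOME/.config/composer/vendor/bin
```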

    hashtag
    Setup Laravel using Installer

First of all, download the Laravel installer with the help of Composer, like this:
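```bash
composer global require laravel/installer   # installs the global "laravel" command
```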

When the installation is done, the laravel new command will create a fresh Laravel installation in the directory you provide.

    hashtag
    Create Project

The next thing to do is make a new folder at some specific path on your system for keeping your Laravel projects, then move to that location. To install Laravel there, type the following command:
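For example (the project name is just a placeholder):

```bash
laravel new awareness-app
```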

The command above installs Laravel into that specific directory. Then type the next command:
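For example, from inside the new project directory (--port is used here only so the URL matches the message below; by default Artisan serves on port 8000):

```bash
cd awareness-app
php artisan serve --port=8080
```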

The command above starts the Laravel development server. The terminal will show a message such as: Laravel development server started on http://localhost:8080

Open http://localhost:8080 in your browser, and you will see the Laravel home screen.

    hashtag
    Laravel Project Structure

The top folder, named app, is where most of our back-end code will go. We won't go into detail on every single subdirectory or file; we'll concentrate on the ones we'll mostly use throughout development of this app.

    Laravel Folder Structure

    hashtag
    App

It is the application folder and includes the entire source code of the project. It contains events, exceptions, and middleware declarations. The app folder comprises various sub-folders, as explained below.

    This is another Laravel directory which holds other subdirectories for additional purposes. These are:

| Directory | Description |
| --- | --- |
| Console | The Console directory contains all of your project's Artisan commands. |
| Events | The Events directory holds the event classes that your Laravel application may fire. Events are used to signal to other parts of the project that an action has taken place. |
| Exceptions | The Exceptions directory holds your project's exception-handling files, which handle all the exceptions thrown by your Laravel project. |
| Http | The Http directory holds the different filters, requests, and controllers. |
| Jobs | The Jobs directory holds all queueable jobs. It does not get created initially; instead, you need to run the Artisan command shown below. |
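In recent Laravel versions the command is the following (the job class name is a placeholder):

```bash
php artisan make:job SendReminderEmail
```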

    hashtag
    Bootstrap

This folder encloses all the application bootstrap scripts. It contains a sub-folder named cache, which includes all the files associated with caching the web application. You can also find the file app.php, which initializes the scripts necessary for bootstrapping.

    hashtag
    Config

    The config folder includes various configurations and associated parameters required for the smooth functioning of a Laravel application. Various files included within the config folder are as shown in the image here. The filenames work as per the functionality associated with them.

    hashtag
    Database

    As the name suggests, this directory includes various parameters for database functionalities. It includes three sub-directories as given below −

    • Seeds − This contains the classes used for unit testing database.

    • Migrations − This folder helps in queries for migrating the database used in the web application.

    • Factories − This folder is used to generate large number of data records.

    hashtag
    Public

    It is the root folder which helps in initializing the Laravel application. It includes the following files and folders −

    • .htaccess − This file gives the server configuration.

    • javascript and css − These files are considered as assets.

    • index.php − This file is required for the initialization of a web application.

    hashtag
    Resources

    Resources directory contains the files which enhances your web application. The sub-folders included in this directory and their purpose is explained below −

    • assets − The assets folder include files such as LESS and SCSS, that are required for styling the web application.

• lang − This folder includes configuration for localization or internationalization.

    • views − Views are the HTML files or templates which interact with end users and play a primary role in MVC architecture.

    Observe that the resources directory will be flattened instead of having an assets folder. The pictorial representation of same is shown below −

    Resource Directory Changes

    hashtag
    Storage

    This is the folder that stores all the logs and necessary files which are needed frequently when a Laravel project is running. The sub-folders included in this directory and their purpose is given below −

    • app − This folder contains the files that are called in succession.

    • framework − It contains sessions, cache and views which are called frequently.

    • Logs − All exceptions and error logs are tracked in this sub folder.

    hashtag
    Tests

    All the unit test cases are included in this directory. The naming convention for naming test case classes is camel_case and follows the convention as per the functionality of the class.

    hashtag
    Vendor

    Laravel is completely based on Composer dependencies, for example to install Laravel setup or to include third party libraries, etc. The Vendor folder includes all the composer dependencies.

    hashtag
    Request Life Cycle of Laravel

    Auto Loader

As a first step, the request is triggered from the user's browser and reaches the web server. The web server (Apache or Nginx) redirects the request to Laravel's public/index.php file, which is simply the starting point for loading the rest of the framework. It loads the autoloader files generated by Composer.

Then it retrieves an instance of the Laravel application from the bootstrap/app.php script. Creating this application instance is the initial step.

    Kernel

    Next step will occur on the Kernel part of the application.

The incoming request is sent to either the HTTP kernel or the console kernel, depending on the type of request entering the application. These two kernels serve as the central location that all requests flow through.

The HTTP kernel is placed in app/Http/Kernel.php. It simply receives a Request and returns a Response. The Kernel class defines bootstrappers that configure error handling, configure logging, detect the environment, and perform other tasks before the request is handled.

The HTTP kernel also defines the list of middleware that requests pass through before being handled by the application.

    Service Providers

The next step for the kernel is to load the service providers as part of the bootstrapping action. The providers needed for the application are listed in the config/app.php configuration file.

First, the register method is called on all providers; once all providers are registered, the boot method is called.

    Dispatch Request

Once the application has been bootstrapped and all service providers are registered and booted, the request is handed over to the router for dispatching. The router will dispatch the request to a route or controller, as well as run any route-specific middleware.

    Router:

    Now request will be dispatched by the Router and it will end up with the views as shown below:

    Router will direct the HTTP Request to a Controller or return a view or responses directly by omitting the controller. These routes will be placed in app/routes.php.

    Controller app/controllers/ performs specific actions and sends data to a View.

    View app/views/ formats the data appropriately, providing the HTTP Response.

    The above steps are clearly explained in the diagrammatical view below.

    Laravel - Request Life Cycle

Let’s see an example of a request.

Step 1

The user inputs http://xyz.com in the browser and hits ‘Enter’.

Step 2

Once the user hits this URL, the browser sends the page request over the Internet to the web server.

Step 3

The web server receives the request and analyzes the request information. The site’s root path is specified in the web server’s configuration file. Based on that, the web server looks for the index.php file in that root directory, since the URL doesn’t contain any subdirectory or other route.

Step 4

The web server directs the request to the public/index.php file of the Laravel application.

Step 5

The PHP interpreter executes the code contained in index.php. During this step, the autoloader files generated by Composer are loaded.

Step 6

Then it creates the Laravel application instance and bootstraps the components of Laravel.

Step 7

The kernel receives the request, loads the service providers, and hands the request over to the router.

Step 8

The router dispatches the request, the view content is rendered, and the result is returned to the web server.

Step 9

The web server receives the output from PHP and sends it back over the Internet to the user’s web browser.

Step 10

The user’s web browser receives the response from the server and renders the web page on the user’s computer.

    hashtag
    Configuring in Laravel Project

    hashtag
    Configuring the Basic in Laravel Project

If you are new to Laravel, you should know that you can create configuration files for your Laravel application. After installing Laravel, you need to make the storage directory and bootstrap/cache writable.

Next, generate the application key used to secure sessions and encrypt data. If the root directory doesn't contain a .env file, rename the file .env.example to .env and run the command below from the directory where you've installed Laravel:
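```bash
php artisan key:generate   # writes a new APP_KEY value into your .env file
```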

You can see the newly generated key in the .env file. It is also possible to configure the time zone and locale in your project's config/app.php file.

    hashtag
    Configuring the Environment

By default, the .env file includes the following parameters −
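A typical default .env looks roughly like the sketch below; the values are illustrative and should be adjusted to your environment:

```
APP_ENV=local
APP_DEBUG=true
APP_KEY=base64:your-generated-key
APP_URL=http://localhost

DB_CONNECTION=mysql
DB_HOST=127.0.0.1
DB_PORT=3306
DB_DATABASE=homestead
DB_USERNAME=homestead
DB_PASSWORD=secret

CACHE_DRIVER=file
SESSION_DRIVER=file
QUEUE_DRIVER=sync
```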

    hashtag
    Important Points

    While working with basic configuration files of Laravel, the following points are to be noted −

    • The .env file should not be committed to the application source control, since each developer or user has some predefined environment configuration for the web application.

    • For backup options, the development team should include the .env.example file, which should contain the default configuration.

    hashtag
    Configuring the Database

You can configure the database for your application using the config/database.php file of your project. You can also set the configuration options used by the various supported databases, and Laravel lets you fall back on the default connection as well.

    hashtag
    Maintenance Modes

Websites are regularly modified. While doing so, as a developer you will want to put your site into maintenance mode. This advanced framework makes that easier with two artisan commands. Let's see how to use them:

To put your project into maintenance mode, run the following command:
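```bash
php artisan down   # puts the application into maintenance mode
```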

After making the required changes, when it is time to bring your project back up, run the following command:
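```bash
php artisan up     # brings the application back out of maintenance mode
```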

    Angular Component Request

During page navigation, many sections of the page, such as the header and footer, and some functionality remain unchanged. Instead of reloading everything, Angular loads only the changed content dynamically, which increases the speed and performance of the application.

    hashtag
    Why Angular ?

Performance: faster initial load, change detection, and great rendering times.

Component-based development: In Angular 8 everything is a component, and this is the term you will hear and work with the most. Components are the building blocks of an Angular application, so you get greater code reuse and the code is unit testable.

    Client Side Templating: Using templates in the browser is becoming more and more widespread. Moving application logic from the server to the client, and the increasing usage of MVC-like patterns (model–view–controller) inspired templates to embrace the browser. This used to be a server-side only affair, but templates are actually very powerful and expressive in client-side development as well.

Mobile support (cross-platform support): With Angular you can create SPA apps that work on the web and feel great on mobile.

    hashtag
    Getting Started with Angular

    From the terminal, install the Angular CLI globally with:
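```bash
npm install -g @angular/cli   # installs the ng command globally
```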

    This will install the command ng into your system, which is the command you use to create new workspaces, new projects, serve your application during development, or produce builds that can be shared or distributed.

Create a new Angular CLI workspace using the ng new command:
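For example (the workspace name is a placeholder):

```bash
ng new my-workspace
```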

    From there you replace the /src folder with the one from your StackBlitz download, and then perform a build.

    hashtag
    Angular 8 Architecture

    The basic building blocks of an Angular application are NgModules, which provide a compilation context for components. NgModules collect related code into functional sets; an Angular app is defined by a set of NgModules. An app always has at least a root module that enables bootstrapping, and typically has many more feature modules.

    • Components define views, which are sets of screen elements that Angular can choose among and modify according to your program logic and data.

    • Components use services, which provide specific functionality not directly related to views. Service providers can be injected into components as dependencies, making your code modular, reusable, and efficient.

    Key parts of Angular 8 Architecture:

    hashtag
    Angular 8 Components

In Angular 8, components and services are both simply classes with decorators that mark their type and provide metadata that tells Angular how to use them.

    Every Angular application always has at least one component known as root component that connects a page hierarchy with page DOM. Each component defines a class that contains application data and logic, and is associated with an HTML template that defines a view to be displayed in a target environment.

    Metadata of Component class

    • The metadata for a component class associates it with a template that defines a view. A template combines ordinary HTML with Angular directives and binding markup that allow Angular to modify the HTML before rendering it for display.

    • The metadata for a service class provides the information Angular needs to make it available to components through dependency injection (DI).
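A minimal component sketch, tying together the class, decorator metadata, and inline template described above (the selector, template, and property names are illustrative):

```typescript
import { Component } from '@angular/core';

@Component({
  selector: 'app-hello',                  // the tag used in a parent template
  template: `<h1>Hello {{ name }}!</h1>`  // inline view template with interpolation
})
export class HelloComponent {
  name = 'Angular 8';                     // application data used by the view
}
```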

    hashtag
    MODULE

Angular brings the concept of modularity, where you can build a single application by separating it into many modules.

• A module is a mechanism to group related components, directives, and services in such a way that they can be combined with other modules to create an application.

• You can think of modules as boxes where you separate and organize application functionality into features and reusable chunks. You can think of them much like the packages and namespaces of Java and C#.

    hashtag
    TEMPLATES

Each component has an HTML template, which is the view that displays the information in the browser. A template looks like regular HTML, but we sprinkle Angular syntax and custom components onto the page. You use interpolation to weave calculated strings into the text between HTML element tags and within attribute assignments. Structural directives are responsible for HTML layout.

    There are two types of data binding:

    1. Event Binding: Event binding is used to bind events to your app and respond to user input in the target environment by updating your application data.

    2. Property Binding: Property binding is used to pass data from component class and facilitates you to interpolate values that are computed from your application data into the HTML.

    hashtag
    DIRECTIVE

    Angular templates are dynamic. When Angular renders them, it transforms the DOM according to the instructions given by directives. Angular ships with built-in directives and you can write your own directives. A directive is a class with a @Directive decorator. A component is a directive-with-a-template; a @Component decorator is actually a @Directive decorator extended with template-oriented features. There are other built-in directives, classified as either attribute directives or structural directives and they are used to modify the behavior and structure of other HTML elements.

    hashtag
    DATA BINDING

If you have worked with jQuery and JavaScript, you know the pain of pushing data values into HTML controls and then tracking user responses and actions; binding was error-prone and hard to reason about. Angular supports data binding, which is a connection between the view and the application data values (business logic). Data binding makes the application simpler to read, write, and maintain, since these operations are handled by the binding framework. Based on the flow of data, there are three kinds of data binding, and sometimes more than one way to achieve the same result; a sketch combining them follows the list below.

    • View to Source (HTML Template to Component) : Event Binding ()

    • Source to View (Component to HTML Template) : Property Binding / Interpolation []

    • Two way (View to Source & Source to view) : Banana in a Box [()]
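A small sketch showing the three forms in one component (the component is illustrative; [(ngModel)] additionally requires FormsModule to be imported in your module):

```typescript
import { Component } from '@angular/core';

@Component({
  selector: 'app-binding-demo',
  template: `
    <button (click)="increment()">Clicked {{ count }} times</button> <!-- event binding + interpolation -->
    <img [src]="logoUrl">                                            <!-- property binding -->
    <input [(ngModel)]="userName">                                   <!-- two-way binding ("banana in a box") -->
  `
})
export class BindingDemoComponent {
  count = 0;
  logoUrl = 'assets/logo.png';
  userName = '';

  increment() {
    this.count++;
  }
}
```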

    hashtag
    Services

Consider a scenario where we developed a component that became popular, so many people ask for it; or your project may require a piece of code to be used in many places. For these kinds of scenarios we can create a service, which brings reusability and reduces the overhead of developing, debugging, and testing duplicated code in multiple places instead of keeping it in a single place.

    hashtag
    Angular 8 Project Structure

    Angular Folder Structure

    hashtag
    src Folder

This is where the actual action happens: we keep our source code here, such as components, modules, templates, services, directives, images, etc. Things we will look at here:

    • app Folder

    • environments Folder

    • Index.html

    • Main.ts & Polyfills.ts

    hashtag
    app Folder

The sub-folder /app/ under /src/ contains our modules and components. Every Angular application comes with at least one module and one component.

• A module groups the components dedicated to a piece of application functionality. The module file is named app.module.ts, and there is at least one module inside an Angular application.

• A component defines the view (what you see in the browser) and the application logic. A module can have many components.

To see how files are called inside the Angular folder structure, refer to the diagram above. It all starts with 1) main.ts, the entry point, from which 2) app.module.ts gets called, which in turn calls a particular 3) app.component.ts file. Under the /src/ folder we have other files and folders, as shown in the image above:

a) assets : Stores all the static files, like images and icons, used within your application
b) favicon.ico : The icon for your app
c) style.css : Global CSS stylesheet that can be used by all your pages and components
d) test.ts : Configuration used for setting up the testing environment

    hashtag
    environments Folder

The sub-folder /environments/ under /src/ contains two files, one for each environment. They are used to store configuration information like database credentials, server addresses, or any property that differs across those environments.

1) environment.ts: This environment file is used by default during development. If you open this file you will notice that the production property is set to false.
2) environment.prod.ts: This file is used during the production build, i.e. when you build your application with ng build --prod.

    hashtag
    Index.html

This is the file loaded in the browser. It is a plain HTML file that contains your Angular root element (e.g. <app-root></app-root>), which loads the Angular root component. When you open the file you don't see any reference to CSS or JS bundles, as those are injected during the build process.

    hashtag
    Main.ts & Polyfills.ts

Main.ts :- This is the starting point of the app. It is here that your application gets compiled with the Just-In-Time compiler, which then bootstraps (loads) the module called AppModule, the root module of the application. Every Angular application has at least one Angular module, called the root module, which by convention is the AppModule.

Polyfills.ts :- Angular is built on the latest standards of the web platform, which target a wide range of browsers, and it becomes challenging to ensure things do not break in old browsers. Polyfills are workaround scripts that ensure that your recent code (which uses new browser features) does not break in old browsers (which do not support those features).

    hashtag
    Angular.json

Angular.json, at the root level of an Angular workspace, provides workspace-wide and project-specific configuration defaults for the build and development tools provided by the Angular CLI. This file also defines settings and configuration for various features and files, helps set naming conventions, points to the correct files in the solution, and changes settings during the project build. With Angular 6 and later we have the workspace concept, where a collection of Angular projects can co-locate and a few properties configure the workspace. Then there is a projects section with individual per-project configuration.

    Main properties that we find in the angular.json file that configure workspace and projects properties are –

1. $schema: Used by the Angular CLI to enforce the specification of the Angular workspace schema; this property points to the Angular CLI schema implementation file.

2. version: The configuration-file version.

3. newProjectRoot: Path where new projects are created.

4. projects: Contains a subsection for each project (library or application) in the workspace.

5. defaultProject: When you use ng new to create a new app in a new workspace, that app is created as the default project in the workspace.

    hashtag
    Package.json

This contains the list of packages required to build and run our Angular application. There are two lists of such packages, called dependencies and devDependencies. dependencies: normal project dependencies, which are required both at runtime and during development; you can think of them as references to mandatory modules. devDependencies: these modules are only required during development and are not included in the actual project release or runtime. This file also contains some basic information about the project (name, description, license, etc.). The interesting part here is the scripts section, which takes key/value pairs as input: each key is the name of a command that can be run, and the corresponding value is the actual command that is run.

    hashtag
    tsconfig.json

TypeScript is the primary language for Angular applications, but browsers don't understand it, so the TypeScript must be transpiled into JavaScript. Hence this file defines the compiler options required to compile the project.

    hashtag
    Angular 8 Lazy Loading

Angular announced version 8.0, which improved a few APIs and the compiler to reduce bundle sizes by up to 40%. Now it is time to update the previous article, Angular Routing with a Lazy Loading Design Pattern.

    Angular 8 Lazy Load Routing.
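A typical Angular 8 lazy-loaded route looks like the sketch below; the module path and class name are placeholders:

```typescript
import { NgModule } from '@angular/core';
import { RouterModule, Routes } from '@angular/router';

const routes: Routes = [
  {
    path: 'admin',
    // the AdminModule bundle is downloaded only when the user navigates to /admin
    loadChildren: () => import('./admin/admin.module').then(m => m.AdminModule)
  }
];

@NgModule({
  imports: [RouterModule.forRoot(routes)],
  exports: [RouterModule]
})
export class AppRoutingModule {}
```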

    hashtag
    Angular Test

    Unit Testing

This is sometimes also called isolated testing. It's the practice of testing small, isolated pieces of code. If your test uses some external resource, like the network or a database, it's not a unit test.

Functional Testing

This is defined as testing the complete functionality of an application. In practice with web apps, this means interacting with your application as it's running in a browser, just like a user would interact with it in real life, i.e. via clicks on a page.

    This is also called End To End or E2E testing.

    hashtag
    Jasmine

Jasmine is a JavaScript testing framework that supports a software development practice called Behaviour Driven Development, or BDD for short. It's a specific flavour of Test Driven Development (TDD).

    Jasmine, and BDD in general, attempts to describe tests in a human readable format so that non-technical people can understand what is being tested. However even if you are technical reading tests in BDD format makes it a lot easier to understand what’s going on.

For example, suppose we wanted to test this function:
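The original snippet is not preserved here, so assume a trivial helloWorld helper:

```typescript
function helloWorld(): string {
  return 'Hello world!';
}
```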

We would write a Jasmine test spec like so:
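A matching spec, annotated with the four concepts explained below:

```typescript
describe('Hello world', () => {        // (1) Test Suite
  it('says hello', () => {             // (2) Test Spec
    expect(helloWorld())               // (3) Expectation
      .toEqual('Hello world!');        // (4) Matcher
  });
});
```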

    One

    The describe(string, function) function defines what we call a Test Suite, a collection of individual Test Specs.

    Two

    The it(string, function) function defines an individual Test Spec, this contains one or more Test Expectations.

    Three

    The expect(actual) expression is what we call an Expectation. In conjunction with a Matcher it describes an expected piece of behaviour in the application.

    Four

    The matcher(expected) expression is what we call a Matcher. It does a boolean comparison with the expected value passed in vs. the actual value passed to the expect function, if they are false the spec fails.

    hashtag
    Built-in matchers

Jasmine comes with a few pre-built matchers, like so:
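A few of the included matchers, for illustration:

```typescript
expect(true).toBe(true);                 // strict equality (===)
expect({ a: 1 }).toEqual({ a: 1 });      // deep equality
expect('Hello world').toContain('world');
expect('Hello').toMatch(/^Hell/);        // regular expression match
expect(12).toBeGreaterThan(10);
expect(null).toBeNull();
expect(undefined).toBeUndefined();
expect(() => { throw new Error('boom'); }).toThrow();
```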

You can see concrete examples of how these matchers are used by looking at the Jasmine docs here: http://jasmine.github.io/edge/introduction.html#section-Included_Matchers

    hashtag
    Setup and teardown

    Sometimes in order to test a feature we need to perform some setup, perhaps it’s creating some test objects. Also we may need to perform some cleanup activities after we have finished testing, perhaps we need to delete some files from the hard drive.

These activities are called setup and teardown (for cleaning up), and Jasmine has a few functions we can use to make this easier:

beforeAll

This function is called once, before all the specs in a describe test suite are run.

afterAll

This function is called once after all the specs in a test suite are finished.

beforeEach

This function is called before each test specification (it function) is run.

afterEach

This function is called after each test specification has been run.

We might use these functions like so:
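For example, with a hypothetical counter that is reset before each spec:

```typescript
describe('counter', () => {
  let count: number;

  beforeEach(() => {
    count = 0;               // fresh state before every spec
  });

  afterEach(() => {
    // cleanup would go here (e.g. resetting mocks, deleting temp data)
  });

  it('increments', () => {
    count++;
    expect(count).toBe(1);
  });

  it('starts from zero again', () => {
    expect(count).toBe(0);   // beforeEach reset the value
  });
});
```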

    hashtag
    Running Jasmine Test

To manually run Jasmine tests, we would create an HTML file and include the required Jasmine JavaScript and CSS files, like so:

We then load in the parts of our application code that we want to test; for example, if our hello world function was in main.js:

    circle-exclamation

    Important : The order of script tags is important.

We would then load each individual test suite file; for example, if we placed our test suite code above in a file called test.js, we would load it in like so:

    To run the tests we simply open up the HTML file in a browser.

Once all the files requested via script and link are loaded by the browser, the function window.onload is called; this is when Jasmine actually runs the tests.

    The results are displayed in the browser window, a failing run looks like:

    jasmine fail

    A passing run looks like:

    jasmine pass

    If we wanted to test our code in different browsers we simply load up the HTML file in the browser we want to test in.

    If we wanted to debug the code we would use the developer tools available to us in that browser.

    hashtag
    Karma

    Manually running Jasmine tests by refreshing a browser tab repeatedly in different browsers every-time we edit some code can become tiresome.

    Karma is a tool which lets us spawn browsers and run jasmine tests inside of them all from the command line. The results of the tests are also displayed on the command line.

    Karma can also watch your development files for changes and re-run the tests automatically.

    Karma lets us run jasmine tests as part of a development tool chain which requires tests to be runnable and results inspectable via the command line.

    It’s not necessary to know the internals of how Karma works. When using the Angular CLI it handles the configuration for us and for the rest of this section we are going to run the tests using only Jasmine.

    hashtag
    Dashboard ngx-admin

ngx-admin is a front-end admin dashboard template based on Angular 8+, Bootstrap 4+ and Nebular. All the data you see in graphs, charts, and tables is mocked in JavaScript, so you can use the backend of your choice with no limitations.

    admin dashboard based on Angular 8+

    hashtag
    List of features

    • Angular 8+ & Typescript

    • Bootstrap 4+ & SCSS

    • Responsive layout

    • RTL support

    • High resolution

    • Flexibly configurable themes with hot-reload (3 themes included)

    • Authentication module with multiple providers

    • Lots of awesome features:

      • Buttons

      • Modals

    hashtag
    Theme System

The Nebular Theme System is a set of rules for how SCSS files and variables are organized, to achieve the following goals:

    • ability to flexibly change looks & feel of the application by managing variables, without changing SCSS itself;

    • ability to switch between visual themes in a runtime without reloading the page;

    • support of CSS-variables (implemented partially).

    hashtag
    Theme Map

    Each theme is represented as an SCSS map with a list of key-value pairs:

where key is a variable name and value is a raw SCSS value (color, string, etc.) or a parent variable name, so that you can inherit values from other variables:
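A sketch of such a map; the variable names and values are illustrative, not the actual Nebular defaults:

```scss
$nb-theme: (
  font-main: Roboto,
  font-secondary: font-main,   // inherits its value from font-main
  color-bg: #ffffff,
  color-fg: #222222,
  shadow: 0 2px 12px 0 #dfe3eb,
);
```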

    Here font-secondary inherits its value from font-main.

    hashtag
    Component Variables

    Then, for each component of the Nebular Components, there is a list of variables you can change. For example - header component variables:

As you can see, you have 8 variables for a pretty simple component, and on the other hand 6 of them are inherited from the default values. It means that if you want to create a new theme with a unified look & feel of the components, in most cases you would only need to change around 10 generic variables, such as color-bg, shadow, etc., to change the UI completely.

The list of component style variables is specified in the component documentation, for example the styles for the header component.

    hashtag
    Variables Usage

Now, if you want to use the variables in your custom style files, all you need to do (after the successful setup of the Theme System) is call the nb-theme(var-name) function:
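For example, in a component's SCSS file (card-bg and shadow are theme variables):

```scss
.custom-card {
  background-color: nb-theme(card-bg);
  box-shadow: nb-theme(shadow);
}
```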

    Depending on the currently enabled theme and the way card-bg inherited in your theme, you will get the right color.

    hashtag
    Built-in themes

    Currently, there are 3 built-in themes:

    • default - clean white theme

    • cosmic - dark theme

    • corporate - firm business theme

Themes can also be inherited from each other; cosmic, for instance, is inherited from the default theme.

    hashtag
    Magic of multiple themes with hot-reload

As you can see from the ngx-admin demoarrow-up-right, you can switch themes at runtime without reloading the page. It is useful when you have multiple visual themes per user role or want to provide your users with such a configuration so that they can decide which theme works best for them. The only requirement for the feature to work is to wrap all of your component styles into the special mixin nb-install-component and use nb-theme to get the right value:
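
A minimal sketch of a component style file wrapped in the mixin (the import path and property values are assumptions):

```scss
@import './themes';

@include nb-install-component() {
  background-color: nb-theme(card-bg);
  padding: 1.25rem;
}
```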

    hashtag
    Backend Integration

This section describes approaches for integrating the ngx-admin application with a backend API. Although every backend is different, we think we can cover several of the most commonly used setups.

    hashtag
    Integration with JSON REST server

Although there is an option to make CORS requests to the API server directly, we don't advise doing so. This approach has disadvantages in terms of security and performance. In terms of security, when you make CORS requests you basically expose your API server URL to everybody, so your API server has to take additional measures to make sure some URLs are not accessible, because it is exposed to the web. As for performance, CORS requests require a preflight OPTIONS request before each HTTP request, which adds extra HTTP overhead.

    The solution we suggest is to use proxy for your API server. In this case you can make your app accessible through some sub-url. For example, if your application's hosted under url website.com and your index file is located at website.com/index.html, you can make your API root accessible on website.com/api. This is well supported by angular-cli/webpack-dev-server for development setup and by web servers for production setup. Let's review these setups:

    hashtag
    angular-cli/webpack-dev-server setup

Not much needs to be done to proxy your API using angular-cli. You can read the detailed documentation in their docsarrow-up-right, but the most important points are:

You should create a proxy.conf.json file in your application root. The file should contain something like the example below:
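
For example, something along these lines (the /api context and the target URL below are placeholders chosen to match the rest of this section):

```json
{
  "/api": {
    "target": "http://localhost:3000",
    "secure": false
  }
}
```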

In this case you should put the URL of your API server in place of http://localhost:3000.

After that you need to run your angular-cli application using the following command:
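
With the file above in place, angular-cli is typically started with its proxy flag:

```bash
ng serve --proxy-config proxy.conf.json
```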

    That's it. Now you can access /api URL from your ngx-admin application and your requests will be forwarded to your API server.

    hashtag
    Production setup

Production setup is not much different from the development setup. The only difference is that in production you usually don't use angular-cli or webpack-dev-server to host your HTML/CSS/JS; some web server is used instead. At Akveo we mostly use nginxarrow-up-right for this use case. Below is a sample configuration for this particular web server; for other web servers it is not that much different.

Usually you create a new virtual host with a configuration similar to the following:
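
A bare-bones sketch of such a virtual host; the server name, root path and port are placeholders:

```nginx
server {
  listen 80;
  server_name website.com;

  root /var/www/ngx-admin;
  index index.html;

  location / {
    try_files $uri $uri/ /index.html;
  }
}
```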

The only thing you need to add is a proxy_pass for the /api URL, like below:
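
For example, inside the server block above (the upstream address is a placeholder for your API server):

```nginx
location /api {
  proxy_pass http://localhost:3000;
  proxy_set_header Host $host;
  proxy_set_header X-Real-IP $remote_addr;
}
```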

    That's it. Now your API server works on production as well.

    hashtag
    Course Outline

| Day | Description |
| --- | --- |
| Day 1 | Application Security Introduction |
| Day 2 | Methodologies & VAPT |
| Day 3 | Secure Coding [ input validation & Session Management ] |
| Day 4 | Risk Rating, Threat Modeling, Encryption and Hashing |
| Day 5 | DevSecOps |


    hashtag
    Course content

    circle-check

    Day 1 : Application Security Introduction

Overview: The first section will set the stage for the course with the fundamentals of web applications, such as the HTTP protocol and the various mechanisms that make web applications work. We then transition over to the architecture of web applications, which plays a big role in securing the application.

    Topics:

    • Application Security Introduction

    • Application Security Terms and Definitions

    • Application Security Goals

    circle-check

    Day 2: Methodologies & VAPT.

    Methodologies for developing secure code:

    • Risk analysis

    • Threat modeling

    • Lab : Threat modeling - exercise

    • OWASP community Guidelines for secure coding

    • Verification testing .

    VAPT (Vulnerability Assessment and Penetration Test)

    • Introduction to HTTP Protocol

    • Overview of Web Authentication Technologies

    • Web Application Architecture

    circle-check

    Day 3 : Secure Coding [ input validation & Session Management ]

    Secure Coding input validation:

    • SQL Injection vulnerability

    • Lab : SQL Injection vulnerability Lab

    • LDAP and XPath Injection vulnerabilities

    • Cross-Site Scripting (XSS) vulnerability

    • Lab : Cross-Site Scripting (XSS) vulnerability Lab

    • OS Command Injection vulnerability

    • Lab : OS Command Injection vulnerability Lab

    • LFI (Local File Inclusion) and RFI (Remote File Inclusion) vulnerabilities

    • Lab : LFI / RFI vulnerabilities Lab

• Unvalidated File Upload vulnerability

    • Lab : Unvalidated File Upload vulnerability Lab

    • Buffer Overflow vulnerabilities

    • XXE (XML External Entities) Vulnerabilities

    • Lab : XXE (XML External Entities)

    • Insecure Deserialization

    circle-check

    Day 4 : Risk Rating , Threat Modeling , Encryption and Hashing

    Risk Rating and Threat Modeling:

    • Risk Rating Introduction

    • Lab : Risk Rating Demo

    • Introduction to Threat Modeling

    • Type of Threat Modeling

    • Introduction to Manual Threat Modeling

    • Lab : Manual Threat Model demo

    Encryption and Hashing:

    • Encryption Overview

    • Encryption Use Cases Hashing

    • Lab : Overview Hashing Demo PKI (Public Key Infrastructure)

    circle-check

    Day 5 : DevSecOps

    • SAST (Static Application Security Testing)

    • Lab : Spot Bugs Demo .

    • SCA (Software Composition Analysis)

    • Lab : Snyk Open-Source (SCA)

    • DAST (Dynamic Application Security Testing)

    • Lab : OWASP ZAP Scanning .

    • IAST (Interactive Application Security Testing)

    • RASP (Runtime Application Self-Protection)

    • WAF (Web Application Firewall) Penetration Testing


hashtag
1- Ingress Server ( Nginx ) :

Nginx focuses on high-performance, high-concurrency, low-memory-usage web server functionality, such as load balancing, caching, access and bandwidth control, and the ability to integrate efficiently with a variety of applications.

    • HTTP Handler

    • Serving Static Content

    • NGINX Reverse Proxy

    • NGINX Content Caching

    • Compression and Decompression

    • Secure Connect with SSL CA

    hashtag
    2- UI Server ( Angular ) :

The frontend part of the application that users can see and interact with directly in order to consume the backend capabilities of the system. It involves everything that the user can see, touch and experience.

    hashtag
    3- Web Application Server ( Laravel) :

    Laravel is a web application framework with expressive, elegant syntax. ... Laravel attempts to take the pain out of development by easing common tasks used in the majority of web projects, such as authentication, routing, sessions, and caching

    - AUTHORIZATION TECHNIQUE :

    Laravel provides a simple way to organize authorization logic and control access to resources.

- OBJECT-ORIENTED LIBRARIES :

Pre-installed libraries implement many advanced features, such as checking active users, Bcrypt hashing, password reset, CSRF (Cross-Site Request Forgery) protection, and encryption.

- ARTISAN :

Developers usually interact with the Laravel framework through Artisan, a command-line interface that creates and manages the Laravel project environment.

- MVC SUPPORT :

Supports the MVC architecture, ensuring clear separation between logic and presentation. MVC helps improve performance, allows better documentation, and provides multiple built-in functionalities.

- SECURITY :

Uses the Bcrypt hashing algorithm to generate an encrypted representation of a password. Laravel uses prepared SQL statements, which make SQL injection attacks impractical. Along with this, Laravel provides a simple way to escape user input to avoid injection of HTML tags.

- SSO Integration :

• One-click sign-in: Allows collaborators to sign in from your SSO provider without maintaining a separate username and password.

• Automatic provisioning: Allows authorized users to create an account when they require access.

- LDAP Integration :

LDAP integration allows your instance to use your existing LDAP server as the master source of user data. Administrators integrate with a Lightweight Directory Access Protocol (LDAP) directory to streamline the user login process and to automate administrative tasks such as creating users and assigning them roles.

    hashtag
    4- Analytics server ( Metabase ) :

    Analyzing data and presenting actionable information which helps executives, managers and other corporate users make informed training/phishing decisions.

- ETL : Extract, Transform and Load. An ETL tool extracts the data from different RDBMS source systems and transforms it by applying calculations, concatenations, etc., before loading it into the target store.

    - BI : an information management tool used to track KPIs, metrics, and other key data points relevant to a business, department, or specific process. Through the use of data visualizations, dashboards simplify complex data sets to provide users with at-a-glance awareness of current performance.

    hashtag
    5- Cron server :

The Laravel command scheduler allows you to fluently and expressively define your command schedule within Laravel itself; only a single Cron entry is needed on the server.

    hashtag
    6- Database Server ( MariaDB) :

MariaDB is an open source relational database management system (DBMS) that is a compatible drop-in replacement for the widely used MySQL database technology. MariaDB sports faster and safer replication, with updates being up to 2x faster than with traditional MySQL replication setups. ... MariaDB replication is backward compatible with MySQL servers.

ZiSoft Awareness Application Architecture

hashtag
Master : ZiSoft Awareness Installation Script

    hashtag
    Master + Demo : ZiSoft Awareness with { Demo Data } Installation Script

    hashtag
    Branch : ZiSoft Awareness Installation Script From Branch

    hashtag
    Branch + Demo : ZiSoft Awareness with { Demo Data } Installation Script From Branch

    hashtag
    Install MariaDB Server on Ubuntu 18.04.

    hashtag
    Install MariaDB

Install the MariaDB server (a drop-in replacement for MySQL) by using the Ubuntu package manager:
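
On Ubuntu 18.04 this is typically:

```bash
sudo apt-get update
sudo apt-get install mariadb-server
```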

The installer installs MariaDB and all dependencies.

    If the secure installation utility does not launch automatically after the installation completes, enter the following command:
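
The utility ships with the server package:

```bash
sudo mysql_secure_installation
```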

This utility prompts you to define the mysql root password and other security-related options, including removing remote access for the root user.

    hashtag
    Allow remote access

    If you have iptables enabled and want to connect to the MySQL database from another machine, you must open a port in your server’s firewall (the default port is 3306). You don’t need to do this if the application that uses MySQL is running on the same server.

    Run the following command to allow remote access to the mysql server:
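
For example, with iptables as mentioned above (adjust to whichever firewall front end you actually use):

```bash
# iptables: open TCP port 3306 for incoming connections
sudo iptables -A INPUT -p tcp --dport 3306 -j ACCEPT

# or, if the host manages its firewall with ufw:
sudo ufw allow mysql
```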

    hashtag
    Start the MySQL service

    After the installation is complete, you can start the database service by running the following command. If the service is already started, a message informs you that the service is already running:
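
On a systemd-based Ubuntu install the unit is usually named mariadb (mysql is often available as an alias):

```bash
sudo systemctl start mariadb
```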

    hashtag
    Launch at reboot

    To ensure that the database server launches after a reboot, run the following command:
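
Again assuming the systemd unit name above:

```bash
sudo systemctl enable mariadb
```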

    hashtag
    Configure interfaces

By default, MySQL is no longer bound to (listening on) any remotely accessible interfaces. Edit the bind-address directive in /etc/mysql/mysql.conf.d/mysqld.cnf:
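
For example, to listen on all interfaces (binding to a specific backend IP is safer if you have one):

```ini
[mysqld]
bind-address = 0.0.0.0
```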

    Restart the mysql service.

    hashtag
    Start the mysql shell

    There is more than one way to work with a MySQL server, but this article focuses on the most basic and compatible approach, the mysql shell.

1. At the command prompt, run the following command to launch the mysql shell and enter it as the root user (see the sketch after this list):

    2. When you’re prompted for a password, enter the one that you set at installation time, or if you haven’t set one, press Enter to submit no password.

      The following mysql shell prompt should appear:
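
A sketch of the whole exchange; the prompt banner varies between MySQL and MariaDB builds:

```
$ sudo mysql -u root -p
Enter password:

mysql>    (on MariaDB builds this may read "MariaDB [(none)]>")
```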

    hashtag
    Set the root password

    If you logged in by entering a blank password, or if you want to change the root password that you set, you can create or change the password.

1. For versions earlier than MySQL 5.7, enter the following command in the mysql shell, replacing password with your new password (see the sketch after this list):

  For MySQL 5.7 and later, enter the following command in the mysql shell, replacing password with your new password:

    2. To make the change take effect, reload the stored user information with the following command:

      Note: We’re using all-caps for SQL commands. If you type those commands in lowercase, they’ll work. By convention, the commands are written in all-caps to make them stand out from field names and other data that’s being manipulated.
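
A sketch of the commands referenced in this list; replace password with your own value:

```sql
-- MySQL versions earlier than 5.7 (and older MariaDB syntax):
SET PASSWORD FOR 'root'@'localhost' = PASSWORD('password');

-- MySQL 5.7 and later:
ALTER USER 'root'@'localhost' IDENTIFIED BY 'password';

-- Reload the stored user information:
FLUSH PRIVILEGES;
```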

    If you need to reset the root password later, see Reset a MySQL root passwordarrow-up-right.

    hashtag
    View users

    MySQL stores the user information in its own database. The name of the database is mysql. Inside that database the user information is in a table, a dataset, named user. If you want to see what users are set up in the MySQL user table, run the following command:
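
Based on the field names described below, the query is:

```sql
SELECT User, Host, authentication_string FROM mysql.user;
```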

    The following list describes the parts of that command:

    • SELECT tells MySQL that you are asking for data.

    • User, Host, authentication_string tells MySQL what fields you want it to look in. Fields are categories for the data in a table. In this case, you are looking for the username, the host associated with the username, and the encrypted password entry.

• FROM mysql.user tells MySQL to get the data from the mysql database and the user table.

    • A semicolon (;) ends the command.

    Note: All SQL queries end in a semicolon. MySQL does not process a query until you type a semicolon.

    User hosts

    The following example is the output for the preceding query:

    Users are associated with a host, specifically, the host from which they connect. The root user in this example is defined for localhost, for the IP address of localhost, and the hostname of the server. You usually need to set a user for only one host, the one from which you typically connect.

    If you’re running your application on the same computer as the MySQL server, the host that it connects to by default is localhost. Any new users that you create must have localhost in their host field.

    If your application connects remotely, the host entry that MySQL looks for is the IP address or DNS hostname of the remote computer (the one from which the client is coming).

    Anonymous users

    In the example output, one entry has a host value but no username or password. That’s an anonymous user. When a client connects with no username specified, it’s trying to connect as an anonymous user.

You usually don't want any anonymous users, but some MySQL installations include one by default. If you see one, you should either delete the user (refer to the username with empty quotes, like '') or set a password for it.

    hashtag
    Create a database

    There is a difference between a database server and a database, even though those terms are often used interchangeably. MySQL is a database server, meaning it tracks databases and controls access to them. The database stores the data, and it is the database that applications are trying to access when they interact with MySQL.

    Some applications create a database as part of their setup process, but others require you to create a database yourself and tell the application about it.

    To create a database, log in to the mysql shell and run the following command, replacing demodb with the name of the database that you want to create:
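
For example:

```sql
CREATE DATABASE demodb;
```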

    After the database is created, you can verify its creation by running a query to list all databases. The following example shows the query and example output:
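
A sketch of the query and the kind of output to expect (your database list will differ):

```sql
SHOW DATABASES;

-- +--------------------+
-- | Database           |
-- +--------------------+
-- | demodb             |
-- | information_schema |
-- | mysql              |
-- | performance_schema |
-- +--------------------+
```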

    hashtag
    Add a database user

    When applications connect to the database using the root user, they usually have more privileges than they need. You can add users that applications can use to connect to the new database. In the following example, a user named demouser is created.

1. To create a new user, run the following command in the mysql shell (see the sketch after this list):

    2. When you make changes to the user table in the mysql database, tell MySQL to read the changes by flushing the privileges, as follows:

    3. Verify that the user was created by running a SELECT query again:
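
A sketch of the three steps above; demouser is the name used in this section and the password is a placeholder:

```sql
-- 1. Create the user:
CREATE USER 'demouser'@'localhost' IDENTIFIED BY 'demopassword';

-- 2. Flush the privileges so the change is picked up:
FLUSH PRIVILEGES;

-- 3. Verify that the user was created:
SELECT User, Host, authentication_string FROM mysql.user;
```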

    hashtag
    Grant database user permissions

    Right after you create a new user, it has no privileges. The user can log in, but can’t be used to make any database changes.

1. Give the user full permissions for your new database by running the following command (see the sketch after this list):

    2. Flush the privileges to make the change official by running the following command:

    3. To verify that those privileges are set, run the following command:

  MySQL returns the commands needed to reproduce that user's permissions if you were to rebuild the server. USAGE on \*.\* means the user gets no privileges on anything by default. That default is overridden by the second command, which is the grant you ran for the new database.
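
A sketch of the three commands referenced in this list:

```sql
-- 1. Give the user full permissions on the new database:
GRANT ALL PRIVILEGES ON demodb.* TO 'demouser'@'localhost';

-- 2. Flush the privileges to make the change official:
FLUSH PRIVILEGES;

-- 3. Verify that those privileges are set:
SHOW GRANTS FOR 'demouser'@'localhost';
```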

    hashtag
    Configure MySQL server on Ubuntu

    hashtag
    Config files

    By default you’ll find MySQL’s configuration files in:
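
On Ubuntu that is normally:

```
/etc/mysql/
```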

    If they’re not there, however, you can ask mysqld where it looks for its config. Run the command:
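
For example (the binary may live elsewhere on your system):

```bash
/usr/sbin/mysqld --verbose --help
```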

    You’ll get a flood of text back. The first part describes the options you can send to the server when you launch it. The second part is all the configuration stuff that was set when the server was compiled.

    What we’re looking for shows up near the start of the output. Find a couple lines that look like:
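
Something along these lines; the exact paths vary by distribution and version:

```
Default options are read from the following files in the given order:
/etc/my.cnf /etc/mysql/my.cnf ~/.my.cnf
```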

    And there we are. The server works down that list until it finds a configuration file.

    hashtag
    my.cnf

    With the location in hand, open the my.cnf file and have a look inside.

    Any lines starting with “#” are comments, and they mostly document what the different settings are for. They’re good to read through. You’ll find details like the location of log files and where the database files are kept.

    Config groups

    There are lines in the config file that just contain a word in square brackets, like “[client]” or “[mysqld]”. Those are “config groups” and they tell the programs that read the configuration file which parts they should pay attention to.

    See, while we’ve been focusing on the server part of MySQL, it’s technically a collection of tools. That includes the server (mysqld), the client (mysql), and some other tools we’ll talk about in a bit. Those programs look in my.cnf to see how they should behave.

    There’s a bit more to it, but basically: The “client” config section controls the mysql client, and the “mysqld” section controls the server config.

    hashtag
    Log files

    If something does go wrong the best place to start troubleshooting any program is its logs. By default MySQL stores its log files in the directory:

    You may need to use sudo to get a listing of the files in that directory.

    If you don’t find the MySQL logs in the default directory you’ll need to check MySQL’s config. Look in the my.cnf file and look for a “log_error” line, as in:

    If you don’t see a line like that, create one in the “mysqld” section so MySQL will use its own error log. We recommend using the location in the example, creating the “/var/log/mysql” directory if it doesn’t already exist. Then restart MySQL to make the change.

    Make sure the log directory you choose can be written to by the user controlling the MySQL process (usually “mysql”). The user running the process will be defined in the “user” config value for mysqld in my.cnf.

    hashtag
    Network settings

    There might be a “port” setting under both the client and server config sections. The port under the server section controls what port the server will listen to. By default that’s 3306 but you can change it to anything you’d like.

    The port in the client section tells the client what port to connect to by default. You’ll generally want the two port settings to match up.

    If you don’t see the port entries in the config file that just means they’re using the default. If you want to change the port you would add the lines in the appropriate categories:

    The other network setting to look for is the “bind-address” value. That usually gets set to the address for localhost, 127.0.0.1. By binding to localhost the server makes sure no one can connect to it from outside the local machine.

    If you’re running your MySQL server on a different machine from your application you’ll want to bind to a remotely-accessible address instead of localhost. Change the bind-address setting to match your public IP address (or, better, a backend IP address on a network that fewer machines can access).

    If you don’t see a “bind-address” entry you should put one into the “mysqld” category to help control access to the server:

    Remember to account for the client’s hostname when you set up your database users and to poke a hole in your firewall if you’re running iptables.

    hashtag
    mysqld and mysqld_safe

    Behind the scenes there are actually two versions of the MySQL server, “mysqld” and “mysqld_safe”. Both read the same config sections. The main difference is that mysqld_safe launches with a few more safety features enabled to make it easier to recover from a crash or other problem.

    Both mysqld and mysqld_safe will read config entries in the “mysqld” section. If you include a “mysqld_safe” section, then only mysqld_safe will read those values in.

    By default the mysql service launches “mysqld_safe”. That’s a good thing, and you should only look to change that if you really know what you’re doing.

    hashtag
    mysqladmin

    The mysqladmin tool lets you perform some administrative functions from the command line. We won’t talk much about it here because we’re just trying to get you up and running with enough basics to get by. It’s worth looking at the tool in more depth later to see what it can do, particularly if you need to build scripts that perform functions like checking the status of the server or creating and dropping databases.

    hashtag
    Backups

    You have a few options when it comes to making backups of your databases apart from the usual “back up the whole machine” approach. The main two are copying the database files and using mysqldump.

    File copy

    By default MySQL creates a directory for each database in its data directory:

    Once you’ve found the data directory, hold off a moment before making a copy of it. When the database server is active it could be writing new values to tables at any time. That means if it writes to a table halfway through your copy some files will change and lead to a corrupt backup. Not a good thing if you’re trying to plan for disaster recovery.

    To make sure the database files are copied cleanly you can shut the MySQL server down entirely before the copy. That’s safe but isn’t always ideal.

    Another approach you can take is to lock the database as read-only for the duration of the copy. Then when you’re done, release the lock. That way your applications can still read data while you’re backing up files.

    Lock the databases to read-only by running, from the command line:
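
A sketch of the lock command, using the -e option described below:

```bash
mysql -u root -p -e "FLUSH TABLES WITH READ LOCK;"
```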

    To unlock the database when you’re done, run:
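
And correspondingly:

```bash
mysql -u root -p -e "UNLOCK TABLES;"
```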

    We’re using a new option with the mysql client, “-e”. That tells the client to run the query in quotes as if we’d entered it in the mysql shell proper.

    Note that if you’re setting these commands up in a script you can put the password in quotes right after “-p” with no space between the two, as in:
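
For example (the password shown is a placeholder):

```bash
mysql -u root -p'password' -e "FLUSH TABLES WITH READ LOCK;"
```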

    Just make sure you set the permissions on that file to restrict read access. We don’t want just anyone to be able to see that password.

    mysqldump

    Another approach to backing up your database is to use the “mysqldump” tool. Rather than copying the database files directly, mysqldump generates a text file that represents the database. By default the text file contains a list of SQL statements you would use to recreate the database, but you can also export the database in another format like CSV or XML. You can read the man page for mysqldump to see all its options.

    The statements generated by mysqldump go straight to standard output. You’ll want to specify a file to redirect the output to when you run it. For example:
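
A sketch using the names from the next paragraph:

```bash
mysqldump -u root -p demodb > dbbackup.sql
```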

    That command will tell mysqldump to recreate the “demodb” database in SQL statements and to write them to the file “dbbackup.sql”. Note that the username and password options function the same as the mysql client, so you can include the password directly after “-p” in a script.

    Restore from mysqldump

    Restoring a mysqldumped database looks similar to what was used to create it, but we use plain old “mysql” instead of “mysqldump”:
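
For example:

```bash
mysql -u root -p demodb < dbbackup.sql
```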

    We also change from a greater-than to a less-than sign. That switches the command from redirecting its output to telling it to read its input from the existing file. That input is sent to the “mysql” command, causing the mysqldumped instructions to recreate the database.

    Note that by default the SQL statements generated would just add to existing database tables, not overwrite them. If you’re restoring a backup over an existing database you should drop the database’s tables first, or drop and recreate the database itself. You can change that behavior by using the “–add-drop-table” option with the command that creates the mysqldump. That causes mysqldump to add a command to the backup files it writes that will drop tables before recreating them.

    hashtag
    Reset a MySQL root password

    The MySQL root password allows the root user to have full access to the MySQL databasearrow-up-right. You must have (Linux) root or (Windows) Administrator access to the Cloud Serverarrow-up-right to reset the MySQL root password.

    circle-exclamation

    Note: The Cloud Server (Linux) root or (Windows) Administrator account password is not the same as the MySQL password. The Cloud Server password allows access to the server. The MySQL root password allows access only to the MySQL database.

    Use the following steps to reset a MySQL root password by using the command line interface.

    hashtag
    Stop the MySQL service

    (Ubuntu and Debian) Run the following command:

    (CentOS, Fedora, and Red Hat Enterprise Linux) Run the following command:

    hashtag
    Start MySQL without a password

    Run the following command. The ampersand (&) at the end of the command is required.
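
A typical invocation (the daemon wrapper name may differ on your install):

```bash
sudo mysqld_safe --skip-grant-tables &
```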

    hashtag
    Connect to MySQL

    Run the following command:

    hashtag
    Set a new MySQL root password

    Run the following command:
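
A sketch of the statements for a server started with --skip-grant-tables as above; replace new_password, and note that on very old versions the column is named password rather than authentication_string:

```sql
UPDATE mysql.user SET authentication_string = PASSWORD('new_password') WHERE User = 'root';
FLUSH PRIVILEGES;
EXIT;
```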

    hashtag
    Stop and start the MySQL service

    (Ubuntu and Debian) Run the following commands:

    (CentOS, Fedora, and Red Hat Enterprise Linux) Run the following commands:

    hashtag
    Log in to the database

    Test the new password by logging in to the database.

    Enter your new password when prompted.

    hashtag

    What is the recommended browser to use ZiSoft Awareness

    ZiSoft awareness supports all major browsers (Chrome, IE 11, Firefox, and Safari), but the recommended browser is Chrome.

    hashtag
    Does ZiSoft require any specific operating system or packages to install

Depends on your deployment method. If you are using "Docker", then the only requirements are Docker and Docker-Compose. If you are deploying on the OS without Docker, then you have to install more dependencies yourself. Read the installation guide for more info.

    hashtag
    Can ZiSoft be used on mobile web browsers

    Yes, ZiSoft Web interface is "responsive". It will resize and adjust to the size of your smartphone or tablet.

    hashtag
    Does ZiSoft offer a cloud/hosted deployment

    Absolutely, talk to sales@zisoftonline.comenvelope or visit Zisoftarrow-up-right for more info.

    hashtag
    Does ZiSoft Support HA (High Availability) Installation

    Similar to the single deployment, the HA environment distributes the end user load among several web and analytics servers. There is still one single Database and one single background server. The distribution of the front end load can be designed to separate users by region or department, or it can be load balanced among all users in the organization.

    ZiSoft Architecture HA

    hashtag
    Does ZiSoft support Single Sign On or LDAP

    ZiSoft supports both LDAP and single sign on. This means that your users can use their LDAP (e.g. Active Directory) credentials to login, either by typing them in the ZiSoft portal (LDAP) or by being redirected to your Identity provider portal (SSO). Checkout the user guide for more details on how to setup and integrate with LDAP and SSO.

    hashtag
    Does ZiSoft encrypt data

ZiSoft provides application-level encryption only for sensitive data such as passwords. If you need to encrypt everything, consider database-level encryption with your DBA.

    hashtag
    How much bandwidth does ZiSoft need

ZiSoft Awareness is customizable according to your requirements/capabilities. Out of the box, we have 5 different resolutions for each of the included lessons (320, 480, 720, 1080, and 1440). You can choose which resolution to serve based on your campaign size and network capabilities. For example, if your campaign has 1000 employees and you expect 100 of them to be streaming videos concurrently, then resolution 720 will need a server bandwidth of approximately 70 Mbps.

    hashtag
    What are all the requirements needed to run ZiSoft with all its modules

Infrastructure: You will need a server with at least 8 GB of memory and a 2 GHz CPU that is capable of virtualization.

    Operating System: Any of the following

    • Windows 10 Pro or later

    • Linux Centos 7 or later

    • Linux Ubuntu 18 or later

    Global System Settings:

    1. Host Name: The url where the system will be deployed. Example https://zisoft.mycompany.com/apparrow-up-right

    2. Company Name: Example, My Company Inc.

LDAP Settings: This is needed if you will be importing users from LDAP (Active Directory) instead of creating users manually

    1. Title

    2. Host

    3. Port

    4. Bind DN

    5. Base DN

    6. Query Filter

    7. User Name Field

    8. First Name Field

    9. Last Name Field

    10. Email Field

    11. Department Field

    12. Group Base DN

    13. Group Search Filter

    14. Group Name Field

    15. Group Member Field

    16. Bind Password

    Email Server Settings: If you will be sending training invitation Emails or Phishing Emails

    1. Title

    2. Host

    3. Port

    4. Security (TLS/SSL)

    5. Username

    6. Auth (Anonymous/Authentication)

    7. Password

    8. From (Email Format)

    9. Reply (Email Format)

    Email Templates: This describes the Emails that you want to send to your employees in various occasions like invitation to campaigns and end-of-campaign certificates. ZiSoft comes packed with sample templates you can use, but if you need to define your own, you will need the following for each template

    1. Title

    2. Content: HTML (You can use the embedded editor in ZiSoft)

    3. Subject

    4. Language: EN/AR

    5. Type: Phishing/General/Training

This is the set of tools required to complete any build of the Metabase code. Follow the links to download and install them on your own before continuing.

1. Oracle JDK 8 (http://www.oracle.com/technetwork/java/javase/downloads/index.html)arrow-up-right

    2. Node.js (http://nodejs.org/)arrow-up-right

    3. Yarn package manager for Node.jsarrow-up-right

    If you are developing on Windows, make sure to use Ubuntu on Windows and follow instructions for Ubuntu/Linux instead of installing ordinary Windows versions.

    hashtag
    Build Metabase

    The entire Metabase application is compiled and assembled into a single .jar file which can run on any modern JVM. There is a script which will execute all steps in the process and output the final artifact for you.
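
In recent Metabase source trees that script is ./bin/build; check the repository if yours differs:

```bash
./bin/build
```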

    After running the build script simply look in target/uberjar for the output .jar file and you are ready to go.

    hashtag
    Building Metabase.app

    See this guidearrow-up-right.

    hashtag
    Build Dashboard

    So, you’ve gotten Metabase up and runningarrow-up-right and connected it to your dataarrow-up-right. It’s time to give you the lay of the land.

    hashtag
    The home page

    The home page

    Fresh out of the box, Metabase will show you a few things on the home page:

    • Some automatic explorationsarrow-up-right of your tables that you can look at and save as a dashboard if you like any of them.

    • An area where things you or your teammates create will show up, along with a link to see all the dashboards, questions, and pulses you have.

    • A list of the databases you’ve connected to Metabase.

    Our data

    Once you’ve created some dashboardsarrow-up-right, any of them that you pin in the main “Our analytics” collection will show up on the homepage for all of your teammates, so that when they log in to Metabase they’ll know right where to go.

    hashtag
    Browse your data

    Browse data

    If you connected your database to Metabase during setup, you’ll see it listed at the bottom of the homepage along with the sample dataset that Metabase comes with. Click on a database to see its contents. You can click on a table to see its rows, or you can also click on the bolt icon to x-ray a table and see an automatic exploration of it, or click on the book icon to go to the data reference view for that table to learn more about it.

    hashtag
    Explore your analytics

    As you and your team create dashboards and collections, they’ll start to show up on the homepage. Click on a collection in the “Our analytics” section to see its contents, or click “browse all items” to see everything you and your team have made. More about exploringarrow-up-right

    hashtag
    Ask a question or write a query

    Click the Ask a question button in the top-right of Metabase to start a new simple exploration of one of your tables, ask a more detailed custom question using the notebook editor, or write a new SQL query if you want to really dig in.

    hashtag
    Make a new dashboard or pulse

    In Metabase, dashboards are made up of saved questions that you can arrange and resize as you please. They’re a great way to track important metrics and stats that you care about. Pulses are what regularly scheduled reports are called in Metabase. They can be sent out either via email, Slack, or both.

    To make a dashboard or pulse, click the plus (+) icon in the top-right of the main navigation bar.

    Create menu

    hashtag
    Use search to quickly find things

    Search results

    The search bar that’s always present at the top of the screen lets you search through your tables, dashboards, collections, saved questions, metrics, segments, and pulses in an instant. Just type part of the title of the thing you’re looking for and hit enter. You can activate the search bar from anywhere by pressing the / key.

    hashtag
    Development Environment

    If you plan to work on the Metabase code and make changes then you’ll need to understand a few more things.

    hashtag
    Overview

The Metabase application has two basic components:

    1. a backend written in Clojure which contains a REST API as well as all the relevant code for talking to databases and processing queries.

    2. a frontend written as a Javascript single-page application which provides the web UI.

    Both components are built and assembled together into a single jar file which runs the entire application.

    hashtag
    3rd party dependencies

    Metabase depends on lots of other 3rd party libraries to run, so as you are developing you’ll need to keep those up to date. Leiningen will automatically fetch Clojure dependencies when needed, but for JavaScript dependencies you’ll need to kick off the installation process manually when needed.

    hashtag
    Development server (quick start)

    Run your backend development server with
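
For example (see the note about lein ring server vs. lein run later in this section):

```bash
lein ring server
```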

    Start the frontend build process with
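
Typically the hot-reloading build described under Frontend development below:

```bash
yarn build-hot
```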

    hashtag
    Frontend development

    We use these technologies for our FE build process to allow us to use modules, es6 syntax, and css variables.

    • webpack

    • babel

    • cssnext

    Frontend tasks are executed using yarn. All available tasks can be found in package.json under scripts.

    To build the frontend client without watching for changes, you can use:

    If you’re working on the frontend directly, you’ll most likely want to reload changes on save, and in the case of React components, do so while maintaining state. To start a build with hot reloading, use:

    Note that at this time if you change CSS variables, those changes will only be picked up when a build is restarted.

    There is also an option to reload changes on save without hot reloading if you prefer that.

    Some systems may have trouble detecting changes to frontend files. You can enable filesystem polling by uncommenting the watchOptions clause in webpack.config.js. If you do this it may be worth making git ignore changes to webpack config, using git update-index --assume-unchanged webpack.config.js

    hashtag
    Frontend testing

    All frontend tests are located in frontend/test directory. Run all frontend tests with

    which will run unit, end-to-end, and legacy Karma browser tests in sequence.

    hashtag
    End-to-end tests

    End-to-end tests simulate realistic sequences of user interactions. They render a complete DOM tree using Enzymearrow-up-right and use temporary backend instances for executing API calls.

    End-to-end tests use an enforced file naming convention <test-suite-name>.e2e.spec.js to separate them from unit tests.

    Useful commands:

    The way integration tests are written is a little unconventional so here is an example that hopefully helps in getting up to speed:

    You can also skim through __support__/e2e.jsarrow-up-right and __support__/enzyme.jsarrow-up-right to see all available methods.

    hashtag
    Jest unit tests

    Unit tests are focused around isolated parts of business logic.

    Unit tests use an enforced file naming convention <test-suite-name>.unit.spec.js to separate them from end-to-end and integration tests.

    hashtag
    Karma browser tests

    If you need to test code which uses browser APIs that are only available in real browsers, you can add a Karma test to frontend/test/legacy-karma directory.

    hashtag
    Backend development

    Leiningen and your REPL are the main development tools for the backend. There are some directions below on how to setup your REPL for easier development.

    And of course your Jetty development server is available via

    To automatically load backend namespaces when files are changed, you can instead run with

    lein ring server takes significantly longer to launch than lein run, so if you aren’t working on backend code we’d recommend sticking to launching with lein run.

    hashtag
    Building drivers

    Most of the drivers Metabase uses to connect to external data warehouse databases are separate Leiningen projects under the modules/ subdirectory. When running Metabase via lein, you’ll need to build these drivers in order to have access to them. You can build drivers as follows:

    (or)

    The first time you build a driver, it will be a bit slow, because Metabase needs to build the core project a couple of times so the driver can use it as a dependency; you can take comfort in the fact that you won’t need to build the driver again after that. Alternatively, running Metabase 1.0+ from the uberjar will unpack all of the pre-built drivers into your plugins directory; you can do this instead if you already have a Metabase uberjar (just make sure plugins is in the root directory of the Metabase source, i.e. the same directory as project.clj).

    hashtag
    Including driver source paths for development or other Leiningen tasks

    For development when running various Leiningen tasks you can add the include-all-drivers profile to merge the drivers’ dependencies and source paths into the Metabase project:

    This profile is added by default when running lein repl, tests, and linters.

    Unit Tests / Linting

    Run unit tests with

    or a specific test with

    By default, the tests only run against the h2 driver. You can specify which drivers to run tests against with the env var DRIVERS:

    Some drivers require additional environment variables when testing since they are impossible to run locally (such as Redshift and Bigquery). The tests will fail on launch and let you know what parameters to supply if needed.

    Run the linters:

    Developing with Emacs

    .dir-locals.el contains some Emacs Lisp that tells clojure-mode how to indent Metabase macros and which arguments are docstrings. Whenever this file is updated, Emacs will ask you if the code is safe to load. You can answer ! to save it as safe.

    By default, Emacs will insert this code as a customization at the bottom of your init.el. You’ll probably want to tell Emacs to store customizations in a different file. Add the following to your init.el:

    hashtag
    Documentation

    Instant Cheatsheet

    Start up an instant cheatsheet for the project + dependencies by running

    hashtag
    Internationalization

    We are an application with lots of users all over the world. To help them use Metabase in their own language, we mark all of our strings as i18n.

    hashtag
    Adding new strings:

    If you need to add new strings (try to be judicious about adding copy) do the following:

    1. Tag strings in the frontend using t and jt ES6 template literals (see more details in https://ttag.js.org/):

    and in the backend using trs and related macros (see more details in https://github.com/puppetlabs/clj-i18n):

    hashtag
    Translation errors or missing strings

If you see incorrect or missing strings for your language, please visit our POEditor projectarrow-up-right and submit your fixes there.

    Zinadarrow-up-right

    SaaS :Kubeapps

    Software-as-a-Service (SaaS) offers a compelling opportunity for developers who create software originally intended to run at web scale. Having a single code base that runs a variety of enterprise-level business applications reduces the labor that goes into creating and maintaining the software that supports it all. The promise of "one code base to rule them all" makes developing SaaS platforms a value proposition that's hard to ignore—as long as the development team has the expertise to make their ideas real. You need to know a thing or two about creating SaaS platforms. They are a bit of a different beast than single-purpose web applications.

    There are two ways to create a SaaS platform. One is the "greenfield" approach: Build a SaaS platform from scratch. The other way is to transform an existing web application into a SaaS platform.

    This article takes a look at the second way. I describe how to transform an existing web application into a SaaS platform with the help of a concrete, yet fictitious, business named Clyde's Clarinets. The goal of the article is to describe how to transform Clyde's Clarinets into a SaaS platform named Instrument Resellers.

    hashtag
    Understanding the example business scenario

    Clyde's Clarinets is a successful business revolving around an application that Clyde developed to streamline his business's operation. The business uses the software to support the workflow of buying used clarinets online, refurbishing them, and finally reselling them at a profit.

    The application has accelerated business growth significantly. As a result, Clyde thinks the software can benefit others too. His plan is to refactor the application into a full-blown SaaS platform.

    Clyde's Clarinets stands in here for many real-world examples where a company builds some software for its own use, realizes the application will be valuable to others, and then transforms the custom application into a SaaS platform they can sell on a subscription basis to other businesses.

    Clyde's Clarinets works as follows:

    1. The application scours the internet looking for used clarinets for sale. The application keeps a list of clarinets identified as candidates for purchase.

    2. The list is presented to an expert human being, who selects clarinets to buy from the list. The application makes the actual purchase and keeps track of purchasing activity. The application also keeps track of refurbishing and resale activities.

    3. When a purchased clarinet is received from the seller, a human declares the instrument as received. After a clarinet is refurbished, a human also marks that task as complete.

    Figure 1 illustrates the process just described.

    Clyde's Clarinets is an attractive business model. There are a lot of clarinet players in the world, and Clyde's Clarinets has made a good business by providing these musicians with high-quality refurbished instruments. In fact, the business model is so good that the developers of Clyde's Clarinets want to expand their enterprise to other instruments and make their system available to other musical instrument resellers as a brand-specific application.

    But presently, the software supports only clarinets under the name Clyde's Clarinets. The question is how to transform the domain-specific clarinet business into one that will support a variety of instruments and a variety of resellers.

    The key is to discover the abstract patterns that are the foundation of the application. If the abstract patterns are generic enough, the application can be expanded to SaaS. If the patterns are too specific, the application will be hard to transform. A SaaS platform can't be everything to everyone, but it also can't be so narrow in scope that it can't scale up to support a lot of customers. Again, the key is understanding the abstract patterns in play to determine whether a generic model exists.

    hashtag
    Looking for the generic patterns

    Clyde's Clarinets is a good candidate for transformation into a SaaS platform because its underlying behavior is generic. Figure 2 reproduces Clyde's Clarinets business model with more generic text.

    Figure 2 is identical to Figure 1 except for one key difference. The subject of the workflow in Figure 1 is a clarinet, while in Figure 2 it can be any instrument. Because substituting the term "instrument" for "clarinet" doesn't clash with the overall intention of the workflow supported by the application, we can deduce that there is indeed a generic pattern in play.

    The existence of the generic pattern means that Clyde's Clarinets is a good candidate for conversion to SaaS. However, two more conditions must be satisfied to enable the conversion. First, Clyde's Clarinets's software needs to separate logic from configuration. Second, the application needs to be hosted in an infrastructure that can support a large number of SaaS customers and their users.

    In the next sections, we'll take a look at the details.

    hashtag
    Separate logic from configuration

    As explained in the previous section, to convert Clyde's Clarinets to a SaaS platform, the application's logic must be generic. Nothing specifically about a clarinet can be hard coded into the software or into the application's underlying data model. Specifics should be controlled by the configuration.

    Moving specificity into the configuration might seem like a straightforward undertaking, but sometimes specificity can be baked into software in the most insidious of ways. Two of the usual culprits are non-generic data models and hard-coded strings that are specific to the business domain. Let's examine each condition.
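
For illustration only, here is a hypothetical sketch (in TypeScript) of the kind of data model discussed next; apart from instrumentType, InstrumentType, and its three values, every name here is an assumption rather than code from the actual application:

```typescript
// Hypothetical pre-refactor data model for Clyde's Clarinets.
enum InstrumentType {
  A = "A",
  BFlat = "B-flat",
  Bass = "bass",
}

interface Instrument {
  id: string;
  manufacturer: string;
  price: number;
  // Clarinet-specific enum: there is no such thing as a B-flat flute or oboe,
  // so this type can never describe those instruments.
  instrumentType: InstrumentType;
}
```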

    At first glance, the data model might seem like a generic description of a musical instrument, but it's not. Note that the instrumentType property is an enum of type InstrumentType that lists only three possible values: A, B-flat, and bass. Limiting the InstrumentType values violates the generic intention. Why? Only a few instruments have those particular types. There is no such thing as a B-flat flute or a B-flat oboe. Thus, the data model could never be used to sell flutes or oboes.

    Granted, this constraint is not knowledge a lay programmer might have. But the first time a user tries to define a flute on the Clyde's Clarinets SaaS platform (now renamed Instrument Resellers) the shortcoming will become glaringly apparent. It's a significant issue and one that needs to be addressed before the SaaS can be exposed to the public.

    Non-generic data models can be a real showstopper when converting a web application to a SaaS platform. Many times they're not obvious. Thus, when considering a conversion, make sure to set aside enough time to conduct a detailed analysis of the web application's data structures and refactor non-generic ones into generic versions.

    hashtag
    Beware of strings hard coded to the business domain

    Inappropriate specificity goes beyond data models. It can also appear within the intent of the application itself. Keep in mind that presently the web application is specifically about Clyde's Clarinets. There might be any number of references to the name Clyde's Clarinets in the application's web pages. Transforming the web app into a suitable SaaS platform usable by a number of resellers probably requires some changes to the code.

    There are two approaches that a developer can take to make the change. One is to dedicate a unique copy of the source code to each reseller. Before that code is released, a developer executes a search and replace on the instance of the code base dedicated to the reseller to substitute the reseller's name throughout the application—for example replacing each occurrence of "Clyde's Clarinets" with "Gloria's Guitars."

    This is by no means an optimal approach. Duplicating instances and then making changes in those instances is risky. The copy and paste approach violates the Don't Repeat Yourself (DRY) rule, an essential principle of good software design. Eventually, so many special copies of the application will be released into production that maintenance will become an ongoing nightmare.

    The better alternative is to maintain a generic code base for application logic and declare application specifics in configuration files or a database. Separating specifics from generic behavior helps support a variety of business domains, but is also useful for larger concerns such as serving an international user base. For example, hard coding an application to display text only in English excludes it from general-purpose use in regions such as Asia and South America. It's more versatile to put such information in a database and bind the localized terms to the UI according to a locale defined in the software.

    The takeaway is this: A well designed SaaS platform, whether built from scratch or refactored from an existing web application, separates logic from configuration to the finest grain of detail.

    hashtag
    Identify a computing environment that can scale as the service grows

    SaaS platforms require a higher degree of versatility and scalability than a single-purpose web application usually requires. To watch this evolution, suppose that Clyde's Clarinets starts out running on a single application server that uses a dedicated database server (Figure 3). This is quite conceivable for a small company serving a musical clientele.

    As Clyde's Clarinets grows, the team needs to add a load balancer along with duplicate application servers and database instances, all of which run on either bare metal or virtual machines (VMs) to handle increased traffic (Figure 4).

    hashtag
    The role of containers

    Once the Clyde's Clarinets web application is converted to the Instrument Resellers SaaS, manually adding machines and servers pushes the limits of human capability. Many companies try to address the scaling problem by converting all machines to VMs and then automating the VM provisioning process.

But this approach has its limits too. VMs are slow to spin up. Slowly provisioning a VM can be tolerable when running a single web application, but when you have a SaaS platform that might support hundreds of companies, each supporting tens of thousands of users, scaling up using only VMs is unacceptable. The scaling is just too slow. Therefore, the modern solution uses containers (Figure 5) on an orchestration platform such as Kubernetes.

    A container relies on the host operating system to implement isolation at the process, memory, and file system levels. The encapsulation provided by containers makes it possible to run dozens, maybe hundreds of containers on a single physical or virtual machine in a controlled manner, depending on how the host is configured.

    A significant benefit of using containers is that they load very fast—an order of magnitude faster than VMs. Thus, container technology is well suited to applications that need to scale up and down according to current traffic demands.

    hashtag
    The role of Kubernetes

    For container technology to be useful in an automated manner at web scale, some sort of orchestration platform is needed. Just as with virtual machines, it doesn't make any sense to base container deployment on manual provisioning.

    This is where the Kubernetes container orchestration technology comes into play. Kubernetes solves many of the problems that go with running applications and services on containers at web scale. The tradeoff is that Kubernetes is complex in terms of both environment configuration and physical hosting.

    To refactor a web application such as Clyde's Clarinets into a set of Kubernetes containers and services could require some work. And then, once the work is done, the resulting application needs to be provisioned into a Kubernetes cluster. That Kubernetes cluster needs to be deployed to a set of virtual machines or, depending on the needs of the company running the SaaS application, a set of bare metal computers.

    hashtag
    Scalable hosting

A single-instance application such as Clyde's Clarinets can be hosted on a Kubernetes cluster made up of a small number of virtual machines hosted in a private data center, or spun up and running on a cloud provider.

However, spinning up a SaaS platform such as Instrument Resellers manually on a Kubernetes cluster, and hosting it on a few VMs, doesn't suffice. It's just too much work for a human to do. Something more is needed: a cloud-native Kubernetes hosting solution such as Red Hat OpenShift. Cloud-native platforms are intended to run at massive web scale in an automated manner.

    The benefit of using a cloud-native provider to host SaaS is that you can focus on doing the work that goes into designing, maintaining, and improving the SaaS. Developers in this scenario don't need to do most of the work that goes with maintaining the SaaS at scale. They don't have to pay a lot of attention to the number of running VMs, scaling containers up or down to support the current load on the cluster, or addressing environmental issues. SaaS developers focus on the logic and configuration of the software, while the cloud native providers focus on the Kubernetes infrastructure (Figure 6).

    Whether you decide to use OpenShift or another container platform, the important thing to understand about computing environments is that just converting code from a single instance web application to a SaaS platform is not the only work that goes into the SaaS transformation process. A computing environment that can support the SaaS at web scale matters a lot.

    Putting SaaS on a cloud provider and container platform that can scale dynamically and seamlessly according to the service's needs provides significantly more value than just ensuring that the environment can meet expected nominal demand. Also, while scaling up to meet record high levels of demand is important, so is scaling down to minimize cost at times that customers are not using the application.

    Remember, a SaaS product is intended to be used by a large number of companies and their users. Where it runs is just as important as how it runs. Determining the computing environment is an important decision.

    hashtag
    Architecture, configurability, and hosting are all important

    Software-as-a-Service continues to grow in popularity. Applications such as word processors and spreadsheets that used to be installed and run directly on the computer are now SaaS applications that run in a web browser. The same is true of more complex services such as video conferencing and even video editing.

    SaaS platforms that are intended for tens of thousands or even millions of users are the future of software architecture. This model offers efficiency and opportunity. But along with such benefits comes complexity. Luckily, a good deal of the operational complexity created by SaaS can be delegated to a cloud native provider that specializes in hosting SaaS platforms.

    Still, consideration needs to be given to the SaaS architecture. It's essential in SaaS design to separate logic from configuration down to the deepest levels, from data structures to localization terms.

    The key to making SaaS platforms work is versatility. An application that separates logic and configuration in a fine-grained manner is more versatile than those in which hardcoding is inextricably embedded in the software.

    Not every web application can be transformed into a SaaS platform. Some are just too specific to the business domain. But web applications whose underlying patterns are generic to a variety of businesses are good candidates for conversion.

    Then it's a question of doing the refactoring work to make the generic patterns useful to a variety of customers. Once the code has been transformed, the last piece of work is to identify a reliable, resilient hosting infrastructure that can scale the SaaS platform up or down depending on the traffic demands.

    Linux Commands

Full Linux commands cheat sheet: https://www.golinuxcloud.com/linux-commands-cheat-sheet/

The commands in this section are basic commands that every system administrator should know. This is definitely not the complete list of Linux commands for file management, but it can give you a kickstart and covers most basic to complex scenarios. A few representative commands for each category are sketched after the list below.

    hashtag
    1. System Based Commands

    hashtag
    2. Hardware Based Commands

    hashtag
    3. Users Management Commands

    hashtag
    4. File Commands

    hashtag
    5. Process Related Commands

    hashtag
    6. File Permission Commands

    hashtag
    7. Network Commands

    hashtag
    8. Compression/Archives Commands

    hashtag
    9. Install Packages Commands

    hashtag
    10. Install Source (Compilation)

    hashtag
    11. Search Commands

    hashtag
    12. Login Commands

    hashtag
    13. File Transfer Commands

    hashtag
    14. Disk Usage Commands

    hashtag
    15. Directory Traverse Commands
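
A few representative commands for these categories (a non-exhaustive sketch; placeholders in angle brackets must be replaced, and the full list is in the cheat sheet linked above):

uname -a; uptime; hostnamectl                 # 1. System
lscpu; lsblk; free -h                         # 2. Hardware
sudo useradd bob; sudo passwd bob; id bob     # 3. Users
ls -l; cp src dst; mv old new; cat file       # 4. Files
ps aux; top; kill -9 <PID>                    # 5. Processes
chmod 644 file; chown user:group file         # 6. Permissions
ip addr show; ping -c 3 host; ss -tulpn       # 7. Network
tar -czvf archive.tar.gz dir/                 # 8. Compression/archives
sudo apt install <package>                    # 9. Install packages
./configure && make && sudo make install      # 10. Install from source
find / -name "*.conf"; grep -ri "text" /etc   # 11. Search
ssh user@host; su - user                      # 12. Login
scp file user@host:/path; rsync -av src/ dst/ # 13. File transfer
df -h; du -sh /var                            # 14. Disk usage
cd /path; pwd; ls                             # 15. Directory traversal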

    Linux Directory Structure

In a Linux/Unix operating system everything is a file: directories are files, regular files are files, and devices such as the mouse, keyboard, and printer are also represented as files. Here we are going to look at the directory structure of the Linux system.

    hashtag
    Types of files in the Linux system.

1. General Files – Also called ordinary files. They may be images, videos, programs, or simple text files, stored in ASCII or binary format. They are the most commonly used type of file in the Linux system.

2. Directory Files – These files act as containers (a warehouse) for other file types. A directory file may itself sit inside another directory (a subdirectory).

3. Device Files – In a Windows-like operating system, devices such as CD-ROM drives and hard drives are represented by drive letters like F:, G:, and H:, whereas in the Linux system devices are represented as files, for example /dev/sda1, /dev/sda2, and so on.

    hashtag
    These are the common top-level directories associated with the root directory:

    • /bin – binary or executable programs.

    • /etc – system configuration files.

• /home – contains the users' home directories; a user's home directory is the default working directory after login.

    hashtag
    Some other directories in the Linux system:

• /boot – It contains the boot-related files and folders, such as the kernel image and the GRUB configuration.

    • /dev – It is the location of the device files such as dev/sda1, dev/sda2, etc.

• /lib – It contains kernel modules and shared libraries.

    hashtag
    Exploring directories and their usability:

We know that Linux is a complex system that requires an efficient way to start, stop, maintain, and reboot, unlike the Windows operating system. In the Linux system, well-defined configuration files, binaries, and man-page information files are available for every process.

    Linux Kernel File:

• /boot/vmlinuz – The (compressed) Linux kernel image.

    Device Files:

• /dev/hda – Device file for the first IDE hard disk.

• /dev/hdc – Device file for the IDE CD-ROM drive.

• /dev/null – A pseudo-device; any output redirected to it is discarded.

    System Configuration Files:

• /etc/bashrc – Used by the bash shell; it contains system-wide defaults and aliases.

• /etc/crontab – A configuration file that schedules commands to run at predefined time intervals.

• /etc/exports – It lists the file systems exported over the network (NFS).

    User Related Files:

    • /usr/bin – It contains most of the executable files.

    • /usr/bin/X11 – Symbolic link of /usr/bin.

• /usr/include – It contains the standard include files used by C programs.

    Virtual and Pseudo Process Related Files:

    • /proc/cpuinfo – CPU Information

• /proc/filesystems – It lists the file systems currently supported by the kernel.

• /proc/interrupts – It keeps information about the number of interrupts per IRQ.

    Version Information File:

• /proc/version – It contains the Linux version information.

    Log Files:

    • /var/log/lastlog – It stores user last login info.

    • /var/log/messages – It has all the global system messages.

    • /var/log/wtmp – It keeps a history of login and logout information.
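
To inspect a few of the files listed above from a shell (a minimal sketch; on Debian/Ubuntu the global log is /var/log/syslog rather than /var/log/messages):

cat /proc/cpuinfo        # CPU information
cat /proc/version        # kernel version string
less /var/log/messages   # global system messages
last                     # login/logout history, read from /var/log/wtmp
lastlog                  # last login per user, read from /var/log/lastlog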

    Kubernetes Production Deployment

    hashtag
    A- Building Kubernetes cluster

    • Install Kubeadm Cluster

    • Install Network DaemonSet

    • Install Dashboard

    • Install Rook.io ( ceph )Storage

    hashtag
    Install Kubeadm Cluster

    hashtag
    Install Master :

    circle-info

After the installation finishes, copy the join token printed by kubeadm.
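
The exact install commands are environment specific; a minimal sketch of a typical control-plane bootstrap looks like the following (the pod CIDR is an assumption and must match the network DaemonSet you install later):

# on the master node, after installing containerd, kubeadm, kubelet and kubectl
sudo kubeadm init --pod-network-cidr=10.244.0.0/16
# configure kubectl for the current user
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
# the "kubeadm join ..." line printed at the end is the join token to copy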

    hashtag
    Install Nodes :

    circle-info

Paste the cluster join token that was copied from the master installation.
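
On each worker node the copied join command is pasted back in; the address, token, and hash below are placeholders:

sudo kubeadm join <master-ip>:6443 --token <token> \
    --discovery-token-ca-cert-hash sha256:<hash>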

    hashtag
    Install Network Daemon-Set
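
As an example, Flannel can be installed as the network DaemonSet with a single manifest (Calico or Weave Net are equally valid choices; the manifest URL below tracks the Flannel project and may change between releases):

kubectl apply -f https://raw.githubusercontent.com/flannel-io/flannel/master/Documentation/kube-flannel.yml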

    hashtag

    hashtag
    Check Kubernetes Cluster [ CoreDNS / Network /Nodes ]

    Check cluster-info

    Check Nodes Status

Check that the cluster components (DNS / network / controller-manager / scheduler / kube-proxy / API server / etcd) are Running and Ready with the commands below.
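
A minimal set of checks, assuming kubectl is configured as above:

kubectl cluster-info
kubectl get nodes -o wide
kubectl get pods -n kube-system     # CoreDNS, network plugin, controller-manager, scheduler, kube-proxy, apiserver, etcd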

    hashtag
    Install Rook.io ( Ceph ) Storage

    Ceph Storage

    Ceph is a highly scalable distributed storage solution for block storage, object storage, and shared file systems with years of production deployments.

    Design

    Rook enables Ceph storage systems to run on Kubernetes using Kubernetes primitives. The following image illustrates how Ceph Rook integrates with Kubernetes.

    With Ceph running in the Kubernetes cluster, Kubernetes applications can mount block devices and filesystems managed by Rook, or can use the S3/Swift API for object storage. The Rook operator automates configuration of storage components and monitors the cluster to ensure the storage remains available and healthy.

The Rook operator is a simple container that has all that is needed to bootstrap and monitor the storage cluster. The operator will start and monitor the Ceph monitor pods and the Ceph OSD daemons to provide RADOS storage, as well as start and manage other Ceph daemons. The operator manages CRDs for pools, object stores (S3/Swift), and file systems by initializing the pods and other artifacts necessary to run the services.

    The operator will monitor the storage daemons to ensure the cluster is healthy. Ceph mons will be started or failed over when necessary, and other adjustments are made as the cluster grows or shrinks. The operator will also watch for desired state changes requested by the api service and apply the changes.

The Rook operator also initializes the agents that are needed for consuming the storage. Rook automatically configures the Ceph-CSI driver to mount the storage to your pods. Rook's flex driver is still configured automatically as well, though it will soon be deprecated in favor of the CSI driver.

    The rook/ceph image includes all necessary tools to manage the cluster – there are no changes to the data path. Rook does not attempt to maintain full fidelity with Ceph. Many of the Ceph concepts like placement groups and crush maps are hidden so you don’t have to worry about them. Instead Rook creates a much simplified UX for admins that is in terms of physical resources, pools, volumes, filesystems, and buckets. At the same time, advanced configuration can be applied when needed with the Ceph tools.

    Rook is implemented in golang. Ceph is implemented in C++ where the data path is highly optimized. We believe this combination offers the best of both worlds.

    Clone Rook Repository

    Check Rook-ceph Running and ready

Check Ceph HEALTH
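
A sketch of these steps, assuming a recent Rook release (the branch name is only an example, and the manifest paths differ between Rook versions):

git clone --single-branch --branch release-1.12 https://github.com/rook/rook.git
cd rook/deploy/examples                      # older releases use cluster/examples/kubernetes/ceph
kubectl create -f crds.yaml -f common.yaml -f operator.yaml
kubectl create -f cluster.yaml
kubectl -n rook-ceph get pods                # wait for the operator, mons and OSDs to be Running/Ready
kubectl create -f toolbox.yaml
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph status   # expect HEALTH_OK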

    hashtag
    Install Dashboard

    A Kubernetes dashboard is a web-based Kubernetes user interface which is used to deploy containerized applications to a Kubernetes cluster, troubleshoot the applications, and manage the cluster itself along with its attendant resources.

    Uses of Kubernetes Dashboard

    • To get an overview of applications running on your cluster.

• To create or modify individual Kubernetes resources, for example Deployments, Jobs, etc.

• It provides information on the state of the Kubernetes resources in your cluster and on any errors that may have occurred.

    Accessing Dashboard

    Get Access Token

Home page: you'll see the home/welcome page, where you can view which system applications are running.
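
A sketch of deploying and reaching the dashboard (the release tag in the URL is an example, and kubectl create token requires Kubernetes 1.24 or newer):

kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v2.7.0/aio/deploy/recommended.yaml
kubectl -n kubernetes-dashboard create token kubernetes-dashboard   # or a dedicated admin service account with broader RBAC
kubectl proxy
# then open:
# http://localhost:8001/api/v1/namespaces/kubernetes-dashboard/services/https:kubernetes-dashboard:/proxy/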

    hashtag
Licenses: the ZiSoft Awareness application generates licenses.

License arguments: "client_name, date, users, phishing_end_date, phishing_users"

    Git Flow / Git Branching Model

Git flow is a set of guidelines that developers can follow when using Git. These guidelines are not strict rules; they are a convention for an ideal project, so that any developer can easily understand how the work is organized.

It is referred to as a branching model by developers and is built around a central repository for the project. Developers push their work to different branches of this main repository.

There are different types of branches in a project. According to the standard branching strategy and release management, there can be the following types of branches:

Every branch has its own meaning and conventions. Let's understand each branch and its usage.

Two of the branching model's branches are considered the main branches of the project. These branches are as follows:

The master branch is the main branch of the project and contains the entire history of final changes. Every developer should be familiar with it. The HEAD of the master branch always reflects a production-ready version of the project.

Your local repository has its own master branch, which should always be kept up to date with the master of the remote repository.

It is suggested not to mess with master directly. If you edit the master branch of a group project, your changes affect everyone else, and merge conflicts appear very quickly.

The develop branch runs parallel to the master branch and is also considered a main branch of the project. It contains the latest delivered development changes for the next release and holds the final source code for that release. It is also called the "integration branch."

    When the develop branch reaches a stable point and is ready to release, it should be merged with master and tagged with a release version.

The development model needs a variety of supporting branches for parallel development, tracking of features, quick fixes, and releases. These branches have a limited lifetime and are removed when they are no longer needed.

The different types of supporting branches we may use are as follows:

Each of these branches is created for a specific purpose and has specific merge targets. These branches are significant from a technical perspective.

Feature branches can be considered topic branches. A feature branch is used to develop a new feature for the next version of the project. Its existence is limited: it is deleted after its feature has been merged into the develop branch.

To learn how to create a feature branch, see the linked guide.

The release branch is created to support a new version release. Senior developers typically create the release branch. It contains the predetermined set of features for the release and should be deployed to a staging server for testing.

Developers are allowed to make minor bug fixes and prepare metadata for the release on this branch. After these tasks are done, it can be merged into the develop branch.

When all the targeted features are complete, it can be merged into the develop branch. Some usual standards for the release branch are as follows:

• Generally, senior developers create the release branch.

• The release branch contains the predetermined set of features for the release.

• The release branch should be deployed to a staging server for testing.

To tag the branch after merging the release branch, see the linked guide.

    Hotfix branches are similar to Release branches; both are created for a new production release.

Hotfix branches arise when immediate action is needed on the project. If a critical bug appears in a production version, a hotfix branch is branched off from master. After the bug is fixed, this branch is merged back into the master branch with a tag.
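
A minimal sketch of this branching model using plain Git commands (the branch and tag names are illustrative):

# feature branch: branched from develop, merged back into develop
git checkout -b feature/login develop
git checkout develop && git merge --no-ff feature/login && git branch -d feature/login

# release branch: branched from develop, merged into master (tagged) and back into develop
git checkout -b release/1.2.0 develop
git checkout master && git merge --no-ff release/1.2.0 && git tag -a v1.2.0 -m "Release 1.2.0"
git checkout develop && git merge --no-ff release/1.2.0

# hotfix branch: branched from master, merged into master (tagged) and back into develop
git checkout -b hotfix/1.2.1 master
git checkout master && git merge --no-ff hotfix/1.2.1 && git tag -a v1.2.1 -m "Hotfix 1.2.1"
git checkout develop && git merge --no-ff hotfix/1.2.1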

    Architecture of Linux system

The Linux operating system's architecture mainly contains the following components: the kernel, system libraries, system utilities, the hardware layer, and the shell.

    Linux Architecture

1. Kernel:- The kernel is the core component of an operating system and is responsible for all the major actions of the Linux OS. It consists of various types of modules and interacts directly with the underlying hardware. The kernel provides the abstraction required to hide low-level hardware details from application programs. Some of the important kernel types are mentioned below:

    • Monolithic Kernel

    • Micro kernels

    • Exo kernels

    • Hybrid kernels

2. System Libraries:- These are special functions used to implement the operating system's functionality. They do not require the code-access rights of kernel modules.

3. System Utility Programs:- These are responsible for performing specialized, individual tasks.

4. Hardware layer:- The Linux operating system's hardware layer consists of the physical devices, such as the CPU, HDD, and RAM.

5. Shell:- The shell is an interface between the user and the kernel: it exposes the kernel's services, takes commands from the user, and runs the kernel's functions. Shells exist in many operating systems and fall into two categories: graphical shells and command-line shells.

Graphical shells provide a graphical user interface, while command-line shells provide a command-line interface. Both implement the same operations, but graphical shells are generally slower than command-line shells.

    There are a few types of these shells which are categorized as follows:

    • Korn shell

    • Bourne shell

    • C shell

    SeMA Deployment Architecture

    hashtag
    SeMA Logical deployment architecture

    Odoo follows a multi-tier architecture, and we can identify three main tiers: Data, Logic, and Presentation:

    Odoo logical Architecture

    The Data tier is the lowest-level layer, and is responsible for data storage and persistence. Odoo relies on a PostgreSQL server for this. PostgreSQL is the only supported RDBMS, and this is a design choice. So, other databases such as MySQL are not supported. Binary files, such as attachments of documents or images, are usually stored in a filesystem.

    The Logic tier is responsible for all the interactions with the data layer, and is handled by the Odoo server. As a general rule, the low-level database should only be accessed by this layer, since it is the only way to ensure security access control and data consistency. At the core of the Odoo server, we have the Object-Relational Mapping (ORM) engine for this interface. The ORM provides the application programming interface (API) used by the addon modules to interact with the data.

    The Presentation tier is responsible for presenting data and interacting with the user. It is implemented by a client responsible for all the user experience. The client interacts with the ORM API to read, write, verify, or perform any other action, calling ORM API methods through remote procedure calls (RPCs). These are sent to the Odoo server for processing, and then the results are sent back to the client for further handling.

• Ingress Controller: NGINX provides web serving, reverse proxying, caching, load balancing, media streaming, and more. It started out as a web server designed for maximum performance and stability. In addition to its HTTP server capabilities, NGINX can also function as a proxy server for email (IMAP, POP3, and SMTP) and as a reverse proxy and load balancer for HTTP, TCP, and UDP servers.

• Redis Server: Redis is a fast, open-source, in-memory key-value data store. All Redis data resides in memory, which enables low-latency, high-throughput data access. Unlike traditional databases, in-memory data stores don't require a trip to disk, reducing engine latency to microseconds. Because of this, in-memory data stores can support an order of magnitude more operations and faster response times, with average read and write operations taking less than a millisecond and support for millions of operations per second.

• BI Tool [Metabase / Superset]: Business intelligence (BI) tools are application software that collect and process large amounts of unstructured data from internal and external systems, including books, journals, documents, health records, images, files, email, video, and other business sources.

    hashtag
    SeMA physical Deployment Architecture

    Sizing for performance and load requirements is an iterative process that estimates the number of CPUs and corresponding memory required to support the services in the deployed system. When estimating the number of CPUs required to support a service, consider the following:

    • Use cases and corresponding usage analysis that apply to the service

    • System requirements determined during analysis for technical requirements

    • Past experience with the Odoo System components providing the service

    The process of sizing for performance typically consists of the following steps. The ordering of these steps is not critical—it simply provides a way to consider the factors that affect the final result.

    1. Determine a baseline CPU estimate for components identified as user entry points to the system.

    2. Make adjustments to the CPU estimates to account for dependencies between components.

    3. Make adjustments to the CPU estimates to reflect security, availability, scalability, and latent capacity requirements.

Laravel application sizing-estimation process.

Shown below is a diagram of the sizing-estimation process. It is not the only method for sizing a deployment, but it should provide useful insight and guidance.

    1. Estimate user concurrency and throughput

In most use-case scenarios, the number of expected concurrent users is far less than the number of named users of the system. A scenario in which all of the named users are simultaneously accessing the system is extremely rare. Normally, a globally distributed organization with users spread out all over the world experiences about ⅓ of its named users online at any time during their individual 8-hour work days.

    Concurrency typically ranges from 10% to 50+% for App deployments.

    2. Estimate SeMA Application HW sizing

    circle-check

A. Every CPU core handles 2 workers.
B. Each worker can handle 120 portal users.
C. Concurrent users = users using the app simultaneously, with a time session of 5 seconds.
D. Best practice is 17 to 33 workers per machine (due to processor queuing).
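
A rough worked example under the assumptions above (the named-user count and concurrency percentage are purely illustrative):

named_users=12000
concurrency_pct=30                                            # assume ~30% of named users online at once
concurrent_users=$(( named_users * concurrency_pct / 100 ))   # 3600 concurrent users
workers=$(( (concurrent_users + 119) / 120 ))                 # 120 portal users per worker -> 30 workers
cores=$(( (workers + 1) / 2 ))                                # 2 workers per CPU core -> 15 cores
echo "concurrent=$concurrent_users workers=$workers cores=$cores"

Thirty workers fits the 17 to 33 workers-per-machine guideline, so this illustrative load could be served by a single application VM of roughly 15 to 16 cores.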

3. Estimate SeMA DB HW sizing

Database storage requirements may be estimated roughly using the following calculation:

    circle-check

    4. Estimate SeMA File Storage HW sizing

    circle-info

    hashtag
SeMA app file storage can start at 4 TB and be extended based on monitoring of actual storage usage.

    GIT

Git is an open-source distributed version control system. It is designed to handle projects of any size with speed and efficiency, and it was developed to coordinate work among developers. Version control allows us to track changes and work together with our team members in the same workspace.

Git is the foundation of services such as GitHub and GitLab, but Git can also be used on its own, both privately and publicly.

Git was created by Linus Torvalds in 2005 to develop the Linux kernel. It has since become an important distributed version-control tool for DevOps.

    Git is easy to learn, and has fast performance. It is superior to other SCM tools like Subversion, CVS, Perforce, and ClearCase.

    Some remarkable features of Git are as follows:

• Open source: Git is an open-source tool released under the GPL (General Public License).

• Scalable: Git is scalable; as the number of users increases, Git can easily handle the additional load.

• Distributed: One of Git's great features is that it is distributed; every developer's clone is a full copy of the repository, including its complete history.

• Security: Git is secure. It uses SHA-1 (Secure Hash Algorithm 1) to name and identify objects within its repository. Files and commits are verified by their checksums at checkout. Git stores its history in such a way that the ID of a particular commit depends on the complete development history leading up to that commit; once published, old versions cannot be changed unnoticed.

• Speed: Git is very fast, so it can complete tasks quickly. Most Git operations are performed on the local repository, which makes them extremely quick, whereas a centralized version control system continually communicates with a remote server. Performance tests conducted by Mozilla showed Git to be extremely fast compared with other VCSs. Fetching version history from a locally stored repository is much faster than fetching it from a remote server. The core part of Git is written in C, which avoids the runtime overhead associated with higher-level languages.

• Another feature that sets Git apart from other SCM tools is that it is possible to quickly stage some of our files and commit them without committing the other modified files in our working directory.

• Maintains a clean history: Git provides git rebase, one of its most helpful features. It fetches the latest commits from the master branch and puts our code on top of them, thus maintaining a clean project history.

    A version control application allows us to keep track of all the changes that we make in the files of our project. Every time we make changes in files of an existing project, we can push those changes to a repository. Other developers are allowed to pull your changes from the repository and continue to work with the updates that you added to the project files.

Some significant benefits of using Git are as follows:

• Saves time: Git is lightning fast. Each command takes only a few seconds to execute, so we can save a lot of time compared with logging in to a GitHub account and working through its web interface.

• Offline working: One of the most important benefits of Git is that it supports offline working. If we face internet connectivity issues, our work is not affected; in Git, almost everything can be done locally. By comparison, other VCSs such as SVN are more limited and depend on a connection to the central repository.

We have discussed many features and benefits of Git that demonstrate it is undoubtedly the leading version control system. Now we will discuss some other reasons why we should choose Git.

• Git integrity: Git is designed to ensure the security and integrity of content under version control. It uses checksums to confirm that information is not lost in transit or tampered with on the file system. Internally, it creates a checksum value from the contents of a file and then verifies it when transmitting or storing data.

• Trendy version control system: Git is the most widely used version control system, hosting more projects than any other. Due to its workflow and feature set, it is the preferred choice of many developers.

    Linux labs

    hashtag
Lab 1: Create an instance with a Linux OS

    hashtag
Lab 2: Install the Nginx service

    hashtag
Lab 3: Install an ownCloud server

    hashtag
Lab 4: Take a cron backup of MariaDB and the storage directory (see the sketch below)
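
A minimal sketch for Lab 4, assuming a local MariaDB instance and an ownCloud data directory under /var/www/owncloud/data (paths and credentials are placeholders to adapt):

#!/bin/bash
# /usr/local/bin/nightly-backup.sh
STAMP=$(date +%F)
BACKUP_DIR=/backups
mkdir -p "$BACKUP_DIR"
mysqldump -u root -p'YourRootPassword' --all-databases > "$BACKUP_DIR/db-$STAMP.sql"
tar -czf "$BACKUP_DIR/storage-$STAMP.tar.gz" /var/www/owncloud/data
# crontab entry (edit with: crontab -e) to run every night at 02:00:
# 0 2 * * * /usr/local/bin/nightly-backup.sh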

    Git Version Control System

A version control system is software that tracks changes to a file or set of files over time so that you can recall specific versions later. It also allows you to work together with other programmers.

A version control system is a collection of software tools that help a team manage changes to source code. It uses a special kind of database to keep track of every modification to the code.

Developers can compare the current version of the code with earlier versions to fix mistakes.

    hashtag
    Benefits of the Version Control System

    Git Commands

    Here is a list of most essential Git commands that are used daily.

    Let's understand each command in detail.

This command configures the user. The git config command is the first and necessary command used on the Git command line. It sets the author name and email address to be used with your commits. git config is also used in other scenarios.

    $ git config --global user.name "ImDwivedi1"
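
The email address is set the same way (the address below is only a placeholder):

$ git config --global user.email "you@example.com"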

    This command is used to create a local repository.

The init command will initialize an empty repository, as sketched below.
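
For reference, creating and initializing a new local repository looks like this (the directory name is illustrative):

$ mkdir demo-repo && cd demo-repo
$ git init
$ git status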

    Dear {last_name}, You have been invited to campaign {campaign_name}
    Dear john, You have been invited to campaign "First Quarter Password Awareness"
    macOS: $HOME/.composer/vendor/bin
    Linux OS: $HOME/.config/composer/vendor/bin
    composer global require "laravel/installer"
laravel new directory_name
composer create-project --prefer-dist laravel/laravel
php artisan serve
php artisan key:generate
    APP_ENV = local
    APP_DEBUG = true
    APP_KEY = base64:ZPt2wmKE/X4eEhrzJU6XX4R93rCwYG8E2f8QUA7kGK8 =
    APP_URL = http://localhost
    DB_CONNECTION = mysql
    DB_HOST = 127.0.0.1
    DB_PORT = 3306
    DB_DATABASE = homestead
    DB_USERNAME = homestead
    DB_PASSWORD = secret
    CACHE_DRIVER = file
    SESSION_DRIVER = file
    QUEUE_DRIVER = sync
    REDIS_HOST = 127.0.0.1
    REDIS_PASSWORD = null
    REDIS_PORT = 6379
    MAIL_DRIVER = smtp
MAIL_HOST = mailtrap.io
MAIL_PORT = 2525
    MAIL_USERNAME = null
    MAIL_PASSWORD = null
    MAIL_ENCRYPTION = null
    php artisan down
    php artisan up
    apt update 
    apt install npm -y
    npm install -g @angular/cli
    ng new my-project-name
    ng build --prod
function helloWorld() {
      return 'Hello world!';
    }
describe('Hello world', () => { (1)
      it('says hello', () => { (2)
        expect(helloWorld()) (3)
            .toEqual('Hello world!'); (4)
      });
    });
expect(array).toContain(member);
    expect(fn).toThrow(string);
    expect(fn).toThrowError(string);
    expect(instance).toBe(instance);
    expect(mixed).toBeDefined();
    expect(mixed).toBeFalsy();
    expect(mixed).toBeNull();
    expect(mixed).toBeTruthy();
    expect(mixed).toBeUndefined();
    expect(mixed).toEqual(mixed);
    expect(mixed).toMatch(pattern);
    expect(number).toBeCloseTo(number, decimalPlaces);
    expect(number).toBeGreaterThan(number);
    expect(number).toBeLessThan(number);
    expect(number).toBeNaN();
    expect(spy).toHaveBeenCalled();
    expect(spy).toHaveBeenCalledTimes(number);
    expect(spy).toHaveBeenCalledWith(...arguments);
describe('Hello world', () => {
    
      let expected = "";
    
      beforeEach(() => {
        expected = "Hello World";
      });
    
      afterEach(() => {
        expected = "";
      });
    
      it('says hello', () => {
        expect(helloWorld())
            .toEqual(expected);
      });
    });
<link rel="stylesheet" href="jasmine.css">
    <script src="jasmine.js"></script>
    <script src="jasmine-html.js"></script>
    <script src="boot.js"></script>
<link rel="stylesheet" href="jasmine.css">
    <script src="jasmine.js"></script>
    <script src="jasmine-html.js"></script>
    <script src="boot.js"></script>
    
    <script src="main.js"></script>
<link rel="stylesheet" href="jasmine.css">
    <script src="jasmine.js"></script>
    <script src="jasmine-html.js"></script>
    <script src="boot.js"></script>
    
    <script src="main.js"></script>
    
    <script src="test.js"></script>
    $theme: (
      font-main: unquote('"Segoe UI", Roboto, "Helvetica Neue", Arial, sans-serif'),
      font-secondary: font-main,
    
      font-weight-thin: 200,
      font-weight-light: 300,
      font-weight-normal: 400,
      font-weight-bolder: 500,
      font-weight-bold: 600,
      font-weight-ultra-bold: 800,
    
      base-font-size: 16px,
    
      font-size-xlg: 1.25rem,
      font-size-lg: 1.125rem,
      font-size: 1rem,
      font-size-sm: 0.875rem,
      font-size-xs: 0.75rem,
    
      radius: 0.375rem,
      padding: 1.25rem,
      margin: 1.5rem,
      line-height: 1.25,
    
      ...
    $theme: (
      font-main: unquote('"Segoe UI", Roboto, "Helvetica Neue", Arial, sans-serif'),
      font-secondary: font-main,
      ...
    
      header-font-family: font-secondary,
      header-font-size: font-size,
      header-line-height: line-height,
      header-fg: color-fg-heading,
      header-bg: color-bg,
      header-height: 4.75rem,
      header-padding: 1.25rem,
      header-shadow: shadow,
    
      ...
    @import '../../../@theme/styles/themes';
    
    :host {
    
      background: nb-theme(card-bg); // and use it
    }
    @import '../../../@theme/styles/themes';
    
    @include nb-install-component() {
      background: nb-theme(card-bg); // now, for each theme registered the corresponding value will be inserted
    
      .container {
        background: nb-theme(color-bg);
        font-weight: nb-theme(font-weight-bold);
      }
    }
    {
      "/api": {
        "target": "http://localhost:3000",
        "secure": false
      }
    }
    ng serve --proxy-config proxy.conf.json
    server {
      listen 80;
      server_name website.com;
    
      root /yourAngularAppDistPath;
      index index.html index.htm;
      etag on;
    
      location / {
        index index.html;
        try_files $uri /index.html;
      }
    }
    server {
      listen 80;
      server_name website.com;
    
      root /yourAngularAppDistPath;
      index index.html index.htm;
      etag on;
    
      location / {
        index index.html;
        try_files $uri /index.html;
      }
    
      location /api {
        proxy_pass http://localhost:3000/;
        proxy_set_header Host $host;
      }
    }
    wget  https://raw.githubusercontent.com/omarabdalhamid/zisoft-scripts/master/zisoft-master.sh  &&  sh zisoft-master.sh
    #!/bin/bash
    ################################################################################
# Script for installing ZiSoft on Ubuntu 14.04, 15.04, 16.04 and 18.04 (could be used for other versions too)
# Author: OmarAbdalhamid Omar
#-------------------------------------------------------------------------------
# This script will install ZiSoft Awareness 3 on your Ubuntu 18.04 server.
    #-------------------------------------------------------------------------------
    # Make a new file:
    # sudo nano zisoft-install.sh
    # Place this content in it and then make the file executable:
    # sudo chmod +x zisoft-install.sh
    # Execute the script to install zisoft:
    # ./zisoft-install.sh
    ################################################################################
    
    
    #--------------------------------------------------
    # Clone ZiSoft Awareness Repo
    #--------------------------------------------------
    
    echo "\n#############################################"
    
    echo "\n--- Clone ZiSoft branch --"
    
    echo "\n#############################################"
    
    
    sudo mkdir zisoft-test
    cd  zisoft-test
    sudo git clone https://gitlab.com/zisoft/awareness.git
    
    
    
    #--------------------------------------------------
    # Update Server
    #--------------------------------------------------
    
    
    echo "\n#############################################"
    
echo  "\n--- Download Docker Repository --"
    
    
    echo "\n#############################################"
    
    sudo apt-get update -y
    sudo apt install npm -y
    sudo apt-get install \
        apt-transport-https \
        ca-certificates \
        curl \
        software-properties-common -y
    
    sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
    
    sudo add-apt-repository \
       "deb [arch=amd64] https://download.docker.com/linux/ubuntu \
       $(lsb_release -cs) \
       stable"
    
    sudo apt install gnupg2 pass -y
    sudo add-apt-repository universe -y
    sudo apt-get update -y
    
    #--------------------------------------------------
    # Install Docker & Docker Swarm
    #--------------------------------------------------
    
    sudo apt-get install docker-ce -y
    
    sudo apt-get install docker-compose -y
    
    sudo usermod -aG docker ${USER}
    
    sudo docker login registry.gitlab.com
    sudo docker swarm init
    
    #--------------------------------------------------
    # Run npm Package of ZiSoft CLI
    #--------------------------------------------------
    
    echo "\n#############################################"
    
    echo  "\n--- Download NPM Packages --"
    
    echo "\n#############################################"
    
    
    cd awareness/cli
    sudo npm update
    sudo npm link
    cd ..
    
    #--------------------------------------------------
    # Build && Package  ZiSoft Awareness Project
    #--------------------------------------------------
    
    echo "\n#############################################"
    
    echo "\n--- Build ZiSoft APP--"
    
    
    echo "\n#############################################"
    
    
    sudo zisoft build --docker --sass --app --ui --composer
    
    echo -e "\n--- Package ZiSoft APP--"
    
    
    sudo zisoft package
    
    
    #--------------------------------------------------
    # Deploy  ZiSoft Awareness Project
    #--------------------------------------------------
    
    echo "\n#############################################"
    
    
    echo  "\n--- Deploy ZiSoft APP--"
    
    
    echo "\n#############################################"
    
    sudo zisoft deploy --prod
    
    sleep 3m
    
    container_web_id="$(sudo docker ps | grep zisoft/awareness/web | awk '{print $1}')"
    
    container_ui_id="$(sudo docker ps | grep zisoft/awareness/ui | awk '{print $1}')"
    
    
    
    sudo docker exec -it $container_web_id bash -c 'sed -i "/zinad:lessons/a '\''campaign1'\'' => 1"  app/Console/Commands/Demo.php'
    
    sudo docker exec -it $container_web_id bash -c 'sed -i "/zinad:lessons/a '\''mode'\'' => '\''none'\'',"  app/Console/Commands/Demo.php'
    
    sudo docker exec -it $container_web_id bash -c 'sed -i "/zinad:lessons/a '\''resolution'\'' => '\''720'\'',"  app/Console/Commands/Demo.php'
    
    sudo docker exec -it $container_web_id bash -c 'sed -i "/zinad:lessons/a '\''version'\'' => 1,"  app/Console/Commands/Demo.php'
    
    
    sudo docker exec -it $container_web_id bash -c "php artisan db:seed --class=init"
    
    sudo docker exec -it $container_web_id bash -c "php artisan zinad:lesson browser 1 720 prod"
    
    sudo docker exec -it $container_web_id bash -c "php artisan zinad:lesson email 1 720 prod"
    
    sudo docker exec -it $container_web_id bash -c "php artisan zinad:lesson password 1 720 prod"
    
    sudo docker exec -it $container_web_id bash -c "php artisan zinad:lesson social 1 720 prod"
    
    sudo docker exec -it $container_web_id bash -c "php artisan zinad:lesson wifi 1 720 prod"
    
    sudo docker exec -it $container_web_id bash -c "php artisan zinad:lesson aml 1 720 prod"
    
    
    
    sudo docker exec -it $container_web_id bash -c "php artisan db:seed --class=DropRecreateDB"
    
    
    sudo docker exec -it $container_web_id bash -c "php artisan migrate"
    
    
    sudo docker exec -it $container_web_id bash -c "php artisan db:seed --class=init"
    
    
    sudo docker exec -it $container_web_id bash -c "php artisan zisoft:demo 100 5 30"
    
    sudo docker restart $container_ui_id
    
    
    #--------------------------------------------------
    #  ZiSoft Awareness Project  Installed Successfully 
    #--------------------------------------------------
    
    echo "\n#############################################"
    
    
    echo "\n-----ZiSoft Awareness Project  Installed Successfully ----"
    
    
    echo "\n#############################################"
    wget  https://raw.githubusercontent.com/omarabdalhamid/zisoft-scripts/master/zisoft-master--demo.sh   &&  sh zisoft-master--demo.sh
    #!/bin/bash
    ################################################################################
# Script for installing ZiSoft on Ubuntu 14.04, 15.04, 16.04 and 18.04 (could be used for other versions too)
# Author: OmarAbdalhamid Omar
#-------------------------------------------------------------------------------
# This script will install ZiSoft Awareness 3 on your Ubuntu 18.04 server.
    #-------------------------------------------------------------------------------
    # Make a new file:
    # sudo nano zisoft-install.sh
    # Place this content in it and then make the file executable:
    # sudo chmod +x zisoft-install.sh
    # Execute the script to install zisoft:
    # ./zisoft-install.sh
    ################################################################################
    
    
    
    #--------------------------------------------------
    # Clone ZiSoft Awareness Repo
    #--------------------------------------------------
    
    echo "\n#############################################"
    
    echo "\n--- Clone ZiSoft branch --"
    
    echo "\n#############################################"
    
    
    sudo mkdir zisoft-test
    cd  zisoft-test
    sudo git clone https://gitlab.com/zisoft/awareness.git
    
    
    
    #--------------------------------------------------
    # Update Server
    #--------------------------------------------------
    
    
    echo "\n#############################################"
    
echo  "\n--- Download Docker Repository --"
    
    
    echo "\n#############################################"
    
    sudo apt-get update -y
    sudo apt install npm -y
    sudo apt-get install \
        apt-transport-https \
        ca-certificates \
        curl \
        software-properties-common -y
    
    sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
    
    sudo add-apt-repository \
       "deb [arch=amd64] https://download.docker.com/linux/ubuntu \
       $(lsb_release -cs) \
       stable"
    
    sudo apt install gnupg2 pass -y
    sudo add-apt-repository universe -y
    sudo apt-get update -y
    
    #--------------------------------------------------
    # Install Docker & Docker Swarm
    #--------------------------------------------------
    
    sudo apt-get install docker-ce -y
    
    sudo apt-get install docker-compose -y
    
    sudo usermod -aG docker ${USER}
    
    sudo docker login registry.gitlab.com
    sudo docker swarm init
    
    #--------------------------------------------------
    # Run npm Package of ZiSoft CLI
    #--------------------------------------------------
    
    echo "\n#############################################"
    
    echo  "\n--- Download NPM Packages --"
    
    echo "\n#############################################"
    
    
    cd awareness/cli
    sudo npm update
    sudo npm link
    cd ..
    
    #--------------------------------------------------
    # Build && Package  ZiSoft Awareness Project
    #--------------------------------------------------
    
    echo "\n#############################################"
    
    echo "\n--- Build ZiSoft APP--"
    
    
    echo "\n#############################################"
    
    
    sudo zisoft build --docker --sass --app --ui --composer
    
    echo -e "\n--- Package ZiSoft APP--"
    
    
    sudo zisoft package
    
    
    #--------------------------------------------------
    # Deploy  ZiSoft Awareness Project
    #--------------------------------------------------
    
    echo "\n#############################################"
    
    
    echo  "\n--- Deploy ZiSoft APP--"
    
    
    echo "\n#############################################"
    
    sudo zisoft deploy --prod
    
    sleep 3m
    
    container_web_id="$(sudo docker ps | grep zisoft/awareness/web | awk '{print $1}')"
    
    container_ui_id="$(sudo docker ps | grep zisoft/awareness/ui | awk '{print $1}')"
    
    
    
    sudo docker exec -it $container_web_id bash -c 'sed -i "/zinad:lessons/a '\''campaign1'\'' => 1"  app/Console/Commands/Demo.php'
    
    sudo docker exec -it $container_web_id bash -c 'sed -i "/zinad:lessons/a '\''mode'\'' => '\''none'\'',"  app/Console/Commands/Demo.php'
    
    sudo docker exec -it $container_web_id bash -c 'sed -i "/zinad:lessons/a '\''resolution'\'' => '\''720'\'',"  app/Console/Commands/Demo.php'
    
    sudo docker exec -it $container_web_id bash -c 'sed -i "/zinad:lessons/a '\''version'\'' => 1,"  app/Console/Commands/Demo.php'
    
    
    sudo docker exec -it $container_web_id bash -c "php artisan db:seed --class=init"
    
    sudo docker exec -it $container_web_id bash -c "php artisan zinad:lesson browser 1 720 prod"
    
    sudo docker exec -it $container_web_id bash -c "php artisan zinad:lesson email 1 720 prod"
    
    sudo docker exec -it $container_web_id bash -c "php artisan zinad:lesson password 1 720 prod"
    
    sudo docker exec -it $container_web_id bash -c "php artisan zinad:lesson social 1 720 prod"
    
    sudo docker exec -it $container_web_id bash -c "php artisan zinad:lesson wifi 1 720 prod"
    
    sudo docker exec -it $container_web_id bash -c "php artisan zinad:lesson aml 1 720 prod"
    
    sudo docker exec -it $container_web_id bash -c "php artisan zinad:lessons 720 1 720 prod"
    
    
    sudo docker exec -it $container_web_id bash -c "php artisan db:seed --class=DropRecreateDB"
    
    
    sudo docker exec -it $container_web_id bash -c "php artisan migrate"
    
    
    sudo docker exec -it $container_web_id bash -c "php artisan db:seed --class=init"
    
    
    sudo docker exec -it $container_web_id bash -c "php artisan zisoft:demo 100 5 30"
    
    sudo docker restart $container_ui_id
    
    curl -L https://downloads.portainer.io/portainer-agent-stack.yml -o portainer-agent-stack.yml
    
    sudo docker stack deploy --compose-file=portainer-agent-stack.yml portainer
    
    #--------------------------------------------------
    #  ZiSoft Awareness Project  Installed Successfully 
    #--------------------------------------------------
    
    echo "\n#############################################"
    
    
    echo "\n-----ZiSoft Awareness Project  Installed Successfully ----"
    
    
    echo "\n#############################################"
    
    wget  https://raw.githubusercontent.com/omarabdalhamid/zisoft-scripts/master/zisoft-branch.sh  && sh  zisoft-branch.sh
    #!/bin/bash
    ################################################################################
# Script for installing ZiSoft on Ubuntu 14.04, 15.04, 16.04 and 18.04 (could be used for other versions too)
# Author: OmarAbdalhamid Omar
#-------------------------------------------------------------------------------
# This script will install ZiSoft Awareness 3 on your Ubuntu 18.04 server.
#-------------------------------------------------------------------------------
# Make a new file:
# sudo nano zisoft-branch.sh
# Place this content in it and then make the file executable:
# sudo chmod +x zisoft-branch.sh
# Execute the script to install zisoft:
# ./zisoft-branch.sh
    ################################################################################
    
    echo "\n#############################################"
    
    echo  "\n--- Installing ZiSoft From Branch --"
    
    echo "\n#############################################"
    
    read -p "Enter ZiSoft Awareness  Branch Name :   "  release_date
    
    
    #--------------------------------------------------
    # Clone ZiSoft Awareness Repo
    #--------------------------------------------------
    
    echo "\n#############################################"
    
    echo "\n--- Clone ZiSoft branch --"
    
    echo "\n#############################################"
    
    
    sudo mkdir zisoft-test
    cd  zisoft-test
    sudo git clone https://gitlab.com/zisoft/awareness.git --branch $release_date
    
    
    
    #--------------------------------------------------
    # Update Server
    #--------------------------------------------------
    
    
    echo "\n#############################################"
    
echo  "\n--- Download Docker Repository --"
    
    
    echo "\n#############################################"
    
    sudo apt-get update -y
    sudo apt install npm -y
    sudo apt-get install \
        apt-transport-https \
        ca-certificates \
        curl \
        software-properties-common -y
    
    sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
    
    sudo add-apt-repository \
       "deb [arch=amd64] https://download.docker.com/linux/ubuntu \
       $(lsb_release -cs) \
       stable"
    
    sudo apt install gnupg2 pass -y
    sudo add-apt-repository universe -y
    sudo apt-get update -y
    
    #--------------------------------------------------
    # Install Docker & Docker Swarm
    #--------------------------------------------------
    
    sudo apt-get install docker-ce -y
    
    sudo apt-get install docker-compose -y
    
    sudo usermod -aG docker ${USER}
    
    sudo docker login registry.gitlab.com
    sudo docker swarm init
    
    #--------------------------------------------------
    # Run npm Package of ZiSoft CLI
    #--------------------------------------------------
    
    echo "\n#############################################"
    
    echo  "\n--- Download NPM Packages --"
    
    echo "\n#############################################"
    
    
    cd awareness/cli
    sudo npm update
    sudo npm link
    cd ..
    
    #--------------------------------------------------
    # Build && Package  ZiSoft Awareness Project
    #--------------------------------------------------
    
    echo "\n#############################################"
    
    echo "\n--- Build ZiSoft APP--"
    
    
    echo "\n#############################################"
    
    
    sudo zisoft build --docker --sass --app --ui --composer
    
    echo -e "\n--- Package ZiSoft APP--"
    
    
    sudo zisoft package
    
    
    #--------------------------------------------------
    # Deploy  ZiSoft Awareness Project
    #--------------------------------------------------
    
    echo "\n#############################################"
    
    
    echo  "\n--- Deploy ZiSoft APP--"
    
    
    echo "\n#############################################"
    
    sudo zisoft deploy --prod
    
    sleep 3m
    
    container_web_id="$(sudo docker ps | grep zisoft/awareness/web | awk '{print $1}')"
    
    container_ui_id="$(sudo docker ps | grep zisoft/awareness/ui | awk '{print $1}')"
    
    sudo docker exec -it $container_web_id bash -c "php artisan db:seed --class=init"
    
    sudo docker restart $container_ui_id
    
    
    #--------------------------------------------------
    #  ZiSoft Awareness Project  Installed Successfully 
    #--------------------------------------------------
    
    echo "\n#############################################"
    
    
    echo "\n-----ZiSoft Awareness Project  Installed Successfully ----"
    
    
    echo "\n#############################################"
    wget  https://raw.githubusercontent.com/omarabdalhamid/zisoft-scripts/master/zisoft-branch-demo.sh  &&  sh zisoft-branch-demo.sh 
    #!/bin/bash
    ################################################################################
# Script for installing ZiSoft on Ubuntu 14.04, 15.04, 16.04 and 18.04 (could be used for other versions too)
# Author: OmarAbdalhamid Omar
#-------------------------------------------------------------------------------
# This script will install ZiSoft Awareness 3 on your Ubuntu 18.04 server.
    #-------------------------------------------------------------------------------
    # Make a new file:
    # sudo nano zisoft-install.sh
    # Place this content in it and then make the file executable:
    # sudo chmod +x zisoft-install.sh
    # Execute the script to install zisoft:
    # ./zisoft-install.sh
    ################################################################################
    
    echo "\n#############################################"
    
    echo  "\n--- Installing ZiSoft From Branch --"
    
    echo "\n#############################################"
    
    read -p "Enter ZiSoft Awareness  Branch Name :   "  release_date
    
    
    #--------------------------------------------------
    # Clone ZiSoft Awareness Repo
    #--------------------------------------------------
    
    echo "\n#############################################"
    
    echo "\n--- Clone ZiSoft branch --"
    
    echo "\n#############################################"
    
    
    sudo mkdir zisoft-test
    cd  zisoft-test
    sudo git clone https://gitlab.com/zisoft/awareness.git --branch $release_date
    
    
    
    #--------------------------------------------------
    # Update Server
    #--------------------------------------------------
    
    
    echo "\n#############################################"
    
echo  "\n--- Download Docker Repository --"
    
    
    echo "\n#############################################"
    
    sudo apt-get update -y
    sudo apt install npm -y
    sudo apt-get install \
        apt-transport-https \
        ca-certificates \
        curl \
        software-properties-common -y
    
    sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
    
    sudo add-apt-repository \
       "deb [arch=amd64] https://download.docker.com/linux/ubuntu \
       $(lsb_release -cs) \
       stable"
    
    sudo apt install gnupg2 pass -y
    sudo add-apt-repository universe -y
    sudo apt-get update -y
    
    #--------------------------------------------------
    # Install Docker & Docker Swarm
    #--------------------------------------------------
    
    sudo apt-get install docker-ce -y
    
    sudo apt-get install docker-compose -y
    
    sudo usermod -aG docker ${USER}
    
    sudo docker login registry.gitlab.com
    sudo docker swarm init
    
    #--------------------------------------------------
    # Run npm Package of ZiSoft CLI
    #--------------------------------------------------
    
    echo "\n#############################################"
    
    echo  "\n--- Download NPM Packages --"
    
    echo "\n#############################################"
    
    
    cd awareness/cli
    sudo npm update
    sudo npm link
    cd ..
    
    #--------------------------------------------------
    # Build && Package  ZiSoft Awareness Project
    #--------------------------------------------------
    
    echo "\n#############################################"
    
    echo "\n--- Build ZiSoft APP--"
    
    
    echo "\n#############################################"
    
    
    sudo zisoft build --docker --sass --app --ui --composer
    
    echo -e "\n--- Package ZiSoft APP--"
    
    
    sudo zisoft package
    
    
    #--------------------------------------------------
    # Deploy  ZiSoft Awareness Project
    #--------------------------------------------------
    
    echo "\n#############################################"
    
    
    echo  "\n--- Deploy ZiSoft APP--"
    
    
    echo "\n#############################################"
    
    sudo zisoft deploy --prod
    
    sleep 3m
    
    container_web_id="$(sudo docker ps | grep zisoft/awareness/web | awk '{print $1}')"
    
    container_ui_id="$(sudo docker ps | grep zisoft/awareness/ui | awk '{print $1}')"
    
    
    
    sudo docker exec -it $container_web_id bash -c 'sed -i "/zinad:lessons/a '\''campaign1'\'' => 1"  app/Console/Commands/Demo.php'
    
    sudo docker exec -it $container_web_id bash -c 'sed -i "/zinad:lessons/a '\''mode'\'' => '\''none'\'',"  app/Console/Commands/Demo.php'
    
    sudo docker exec -it $container_web_id bash -c 'sed -i "/zinad:lessons/a '\''resolution'\'' => '\''720'\'',"  app/Console/Commands/Demo.php'
    
    sudo docker exec -it $container_web_id bash -c 'sed -i "/zinad:lessons/a '\''version'\'' => 1,"  app/Console/Commands/Demo.php'
    
    
    sudo docker exec -it $container_web_id bash -c "php artisan db:seed --class=init"
    
    sudo docker exec -it $container_web_id bash -c "php artisan zinad:lesson browser 1 720 prod"
    
    sudo docker exec -it $container_web_id bash -c "php artisan zinad:lesson email 1 720 prod"
    
    sudo docker exec -it $container_web_id bash -c "php artisan zinad:lesson password 1 720 prod"
    
    sudo docker exec -it $container_web_id bash -c "php artisan zinad:lesson social 1 720 prod"
    
    sudo docker exec -it $container_web_id bash -c "php artisan zinad:lesson wifi 1 720 prod"
    
    sudo docker exec -it $container_web_id bash -c "php artisan zinad:lesson aml 1 720 prod"
    
    sudo docker exec -it $container_web_id bash -c "php artisan zinad:lessons 720 1 720 prod"
    
    
    sudo docker exec -it $container_web_id bash -c "php artisan db:seed --class=DropRecreateDB"
    
    
    sudo docker exec -it $container_web_id bash -c "php artisan migrate"
    
    
    sudo docker exec -it $container_web_id bash -c "php artisan db:seed --class=init"
    
    
    sudo docker exec -it $container_web_id bash -c "php artisan zisoft:demo 100 5 30"
    
    sudo docker restart $container_ui_id
    
    curl -L https://downloads.portainer.io/portainer-agent-stack.yml -o portainer-agent-stack.yml
    
    sudo docker stack deploy --compose-file=portainer-agent-stack.yml portainer
    
    #--------------------------------------------------
    #  ZiSoft Awareness Project  Installed Successfully 
    #--------------------------------------------------
    
    echo "\n#############################################"
    
    
    echo "\n-----ZiSoft Awareness Project  Installed Successfully ----"
    
    
    echo "\n#############################################"
    /usr/bin/mysql -u root -p
    mysql>
    UPDATE mysql.user SET Password = PASSWORD('password') WHERE User = 'root';
    UPDATE mysql.user SET authentication_string = PASSWORD('password') WHERE User = 'root';
    INSERT INTO mysql.user (User,Host,authentication_string,ssl_cipher,x509_issuer,x509_subject)
    VALUES('demouser','localhost',PASSWORD('demopassword'),'','','');
    FLUSH PRIVILEGES;
    SELECT User, Host, authentication_string FROM mysql.user;
    
+------------------+-----------+-------------------------------------------+
| User             | Host      | authentication_string                     |
+------------------+-----------+-------------------------------------------+
| root             | localhost | *756FEC25AC0E1823C9838EE1A9A6730A20ACDA21 |
| mysql.session    | localhost | *THISISNOTAVALIDPASSWORDTHATCANBEUSEDHERE |
| mysql.sys        | localhost | *THISISNOTAVALIDPASSWORDTHATCANBEUSEDHERE |
| debian-sys-maint | localhost | *27E7CA2445405AB10C656AFD0F86AF76CCC57692 |
| demouser         | localhost | *0756A562377EDF6ED3AC45A00B356AAE6D3C6BB6 |
+------------------+-----------+-------------------------------------------+
    GRANT ALL PRIVILEGES ON demodb.* to demouser@localhost;
    FLUSH PRIVILEGES;
    SHOW GRANTS FOR 'demouser'@'localhost';
    2 rows in set (0.00 sec)
    sudo apt-get update
    sudo apt-get install mariadb-server
sudo mysql_secure_installation
    sudo ufw enable
    sudo ufw allow mysql
    sudo systemctl start mysql
    sudo systemctl enable mysql
    bind-address		= 127.0.0.1 ( The default. )
    bind-address		= XXX.XXX.XXX.XXX ( The ip address of your Public Net interface. )
    bind-address		= ZZZ.ZZZ.ZZZ.ZZZ ( The ip address of your Service Net interface. )
    bind-address		= 0.0.0.0 ( All ip addresses. )
    sudo systemctl restart mysql
    SELECT User, Host, authentication_string FROM mysql.user;
    SELECT User, Host, authentication_string FROM mysql.user;
    +------------------+-----------+-------------------------------------------+
    | User             | Host      | authentication_string                     |
    +------------------+-----------+-------------------------------------------+
    | root             | localhost | *756FEC25AC0E1823C9838EE1A9A6730A20ACDA21 |
    | mysql.session    | localhost | *THISISNOTAVALIDPASSWORDTHATCANBEUSEDHERE |
    | mysql.sys        | localhost | *THISISNOTAVALIDPASSWORDTHATCANBEUSEDHERE |
    | debian-sys-maint | localhost | *27E7CA2445405AB10C656AFD0F86AF76CCC57692 |
    +------------------+-----------+-------------------------------------------+
    CREATE DATABASE demodb;
    SHOW DATABASES;
    +--------------------+
    | Database           |
    +--------------------+
    | information_schema |
    | demodb             |
    | mysql              |
    +--------------------+
    3 rows in set (0.00 sec)
    /etc/mysql
    /usr/sbin/mysqld --help --verbose
    Default options are read from the following files in the given order:
    /etc/my.cnf /etc/mysql/my.cnf /usr/etc/my.cnf ~/.my.cnf
    /etc/mysql/my.cnf
    /var/log/mysql
    log_error = /var/log/mysql/error.log
    [client]
    port = 3306
    
    [mysqld]
    port = 3306
    [mysqld]
    bind-address = 127.0.0.1
    /var/lib/mysql
    mysql -u root -p -e "FLUSH TABLES WITH READ LOCK;"
    mysql -u root -p -e "UNLOCK TABLES;"
    mysql -u root -p"password" -e "FLUSH TABLES WITH READ LOCK;"
    mysql -u root -p"password" -e "UNLOCK TABLES;"
    mysqldump -u root -p demodb > dbbackup.sql
    mysql -u root -p demodb < dbbackup.sql
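For routine use, the dump and restore commands above can be wrapped into a small, date-stamped backup sketch; the /backup path, the password and the 7-day retention below are assumptions, not part of the original procedure:

# Dump demodb into a date-stamped file (path and credentials are illustrative)
mysqldump -u root -p"password" demodb > /backup/demodb-$(date +%F).sql

# Keep only the last 7 daily dumps (retention policy is an assumption)
find /backup -name "demodb-*.sql" -mtime +7 -delete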
    sudo /etc/init.d/mysql stop
    sudo /etc/init.d/mysqld stop
    sudo mysqld_safe --skip-grant-tables &
    mysql -uroot
    use mysql;
    
    update user set authentication_string=PASSWORD("mynewpassword") where User='root';
    
    flush privileges;
    
    quit
    sudo /etc/init.d/mysql stop
    ...
    sudo /etc/init.d/mysql start
    sudo /etc/init.d/mysqld stop
    ...
    sudo /etc/init.d/mysqld start
    mysql -u root -p
    ./bin/build
    # javascript dependencies
    $ yarn
    lein ring server
    yarn build-hot
    $ yarn build
    $ yarn build-hot
    $ yarn build-watch
    yarn test
    lein run refresh-integration-test-db-metadata # Scan the sample dataset and re-run sync/classification/field values caching
    yarn test-e2e-watch # Watches for file changes and runs the tests that have changed
    yarn test-e2e-watch TestFileName # Watches the files in paths that match the given (regex) string
    import {
        useSharedAdminLogin,
        createTestStore,
    } from "__support__/e2e";
    import {
        click
    } from "__support__/enzyme"
    
    import { mount } from "enzyme"
    
    import { FETCH_DATABASES } from "metabase/redux/metadata";
    import { INITIALIZE_QB } from "metabase/query_builder/actions";
    import RunButton from "metabase/query_builder/components/RunButton";
    
    describe("Query builder", () => {
        beforeAll(async () => {
            // Usually you want to test stuff where user is already logged in
            // so it is convenient to login before any test case.
            useSharedAdminLogin()
        })
    
        it("should let you run a new query", async () => {
            // Create a superpowered Redux store.
            // Remember `await` here!
            const store = await createTestStore()
    
            // Go to a desired path in the app. This is safest to do before mounting the app.
            store.pushPath('/question')
    
            // Get React container for the whole app and mount it using Enzyme
            const app = mount(store.getAppContainer())
    
            // Usually you want to wait until the page has completely loaded, and our way to do that is to
            // wait until the completion of specified Redux actions. `waitForActions` is also useful for verifying that
            // specific operations are properly executed after user interactions.
            // Remember `await` here!
            await store.waitForActions([FETCH_DATABASES, INITIALIZE_QB])
    
            // You can use `enzymeWrapper.debug()` to see what is the state of DOM tree at the moment
            console.log(app.debug())
    
            // You can use `testStore.debug()` method to see which Redux actions have been dispatched so far.
            // Note that as opposed to Enzyme's debugging method, you don't need to wrap the call to `console.log()`.
            store.debug();
    
            // For simulating user interactions like clicks and input events you should use methods defined
            // in `enzyme.js` as they abstract away some React/Redux complexities.
            click(app.find(RunButton))
    
            // Note: In pretty rare cases where rendering the whole app is problematic or slow, you can just render a single
            // React container instead with `testStore.connectContainer(container)`. In that case you are not able
            // to click links that lead to other router paths.
        });
    })
    yarn test-unit # Run all tests at once
    yarn test-unit-watch # Watch for file changes
    yarn test-karma # Run all tests once
    yarn test-karma-watch # Watch for file changes
    lein run
    lein ring server
    # Build the 'mongo' driver
    ./bin/build-driver.sh mongo
    # Build all drivers
    ./bin/build-drivers.sh
    # Install dependencies
    lein with-profiles +include-all-drivers deps
    lein test
    lein test metabase.api.session-test
    DRIVERS=h2,postgres,mysql,mongo lein test
    lein eastwood && lein bikeshed && lein docstring-checker && lein check-namespace-decls && ./bin/reflection-linter
    (setq custom-file (concat user-emacs-directory ".custom.el")) ; tell Customize to save customizations to ~/.emacs.d/.custom.el
    (ignore-errors                                                ; load customizations from ~/.emacs.d/.custom.el
      (load-file custom-file))
    lein instant-cheatsheet
    const someString = t`Hello ${name}!`;
    const someJSX = <div>{jt`Hello ${name}`}</div>;
    (trs "Hello {0}!" name)

    OWASP WebGoat Demo

  • Introduction to OWASP Top 10

  • SANS Top 25 Threat

  • Recent Attack Trends

  • Web Infrastructure Security/Web Application Firewalls

  • Managing Configurations for Web Apps

  • Lab : using Burp proxy to test web applications

  • Password Management

  • Lab : Password Demo

  • Check Nginx dir structure

  • Step 4: Install MariaDB Database Server

  • After refurbishing is completed, a human makes a photo and a video of the clarinet being played and uploads the media to the application.

  • The application then posts the clarinet information, sale price, photo, and video on a variety of resale websites. The application interacts with the resale sites to discover a buyer and verify payment.

  • Once a clarinet is sold, the Clyde's Clarinets application notifies the company's shipping department to package and send the sold item to the buyer. The shipping department enters the shipping information into the Clyde's Clarinets application.

• containers
Kubernetes
IBM Cloud
Red Hat OpenShift

    /opt – optional or third-party software.

  • /tmp – temporary space, typically cleared on reboot.

  • /usr – User related programs.

  • /var – log files.

• /lost+found – It holds recovered bits of corrupted files.

• /media – It contains subdirectories where removable media devices are mounted.

  • /mnt – It contains temporary mount directories for mounting the file system.

• /proc – It is a virtual pseudo-filesystem that contains info about running processes, each with a specific process ID or PID.

  • /run – It stores volatile runtime data.

  • /sbin – binary executable programs for an administrator.

  • /srv – It contains server-specific and server-related files.

• /sys – It is a virtual filesystem on modern Linux distributions that stores, and allows modification of, information about the devices connected to the system.

  • /etc/fstab – Information of the Disk Drive and their mount point.

  • /etc/group – It is a text file to define Information of Security Group.

  • /etc/grub.conf – It is the grub bootloader configuration file.

  • /etc/init.d – Service startup Script.

  • /etc/lilo.conf – It contains lilo bootloader configuration file.

  • /etc/hosts – Information of IP and corresponding hostnames.

  • /etc/hosts.allow – It contains a list of hosts allowed accessing services on the local machine.

  • /etc/host.deny – List of hosts denied to access services on the local machine.

  • /etc/inittab – INIT process and their interaction at the various run level.

  • /etc/issue – Allows editing the pre-login message.

  • /etc/modules.conf – It contains the configuration files for the system modules.

  • /etc/motd – It contains the message of the day.

  • /etc/mtab – Currently mounted blocks information.

• /etc/passwd – It contains user account information; the password hashes themselves are stored in the shadow file.

  • /etc/printcap – It contains printer Information.

  • /etc/profile – Bash shell defaults.

  • /etc/profile.d – It contains other scripts like application scripts, executed after login.

  • /etc/rc.d – It avoids script duplication.

  • /etc/rc.d/init.d – Run Level Initialisation Script.

  • /etc/resolv.conf – DNS being used by System.

  • /etc/security – It contains the name of terminals where root login is possible.

• /etc/skel – Skeleton files that are copied into a new user's home directory.

  • /etc/termcap – An ASCII file that defines the behavior of different types of the terminal.

  • /etc/X11 – Directory tree contains all the conf files for the X-window System.

  • /usr/share – It contains architecture independent shareable text files.

  • /usr/lib – It contains object files and libraries.

  • /usr/sbin – It contains commands for Super User, for System Administration.

  • /proc/ioports – Contains all the Input and Output addresses used by devices on the server.

  • /proc/meminfo – It reports the memory usage information.

  • /proc/modules – Currently using kernel module.

  • /proc/mount – Mounted File-system Information.

  • /proc/stat – It displays the detailed statistics of the current system.

  • /proc/swaps – It contains swap file information.

  • Linux Directory Structure
    POSIX shell
    SeMA Deployment Logical Architecture
    SeMA Deployment Architecture

lspci -tv

Displays PCI devices in a tree-like diagram

    lsusb -tv

    Displays USB devices in a tree-like diagram

    dmidecode

    Displays hardware information from the BIOS

    hdparm -i /dev/xda

    Displays information about disk data

hdparm -tT /dev/xda

    Conducts a read speed test on device xda

    badblocks -s /dev/xda

    Tests for unreadable blocks on disk

usermod

Used for changing / modifying user information

rm -rf directory_name

Removes a directory forcefully and recursively

    cp file1 file2

    Copies the contents of file1 to file2

    cp -r dir1 dir2

    Recursively Copies dir1 to dir2. dir2 is created if it does not exist

    mv file1 file2

    Renames file1 to file2

    ln -s /path/to/file_name link_name

    Creates a symbolic link to file_name

    touch file_name

    Creates a new file

    cat > file_name

    Places standard input into a file

    more file_name

    Outputs the contents of a file

    head file_name

    Displays the first 10 lines of a file

    tail file_name

    Displays the last 10 lines of a file

    gpg -c file_name

    Encrypts a file

    gpg file_name.gpg

    Decrypts a file

    wc

    Prints the number of bytes, words and lines in a file

    xargs

    Executes commands from standard input

pkill process-name

Sends a signal to a process with its name

    bg

    Resumes suspended jobs in the background

    fg

    Brings suspended jobs to the foreground

    fg n

Brings job n to the foreground

    lsof

    Lists files that are open by processes

    renice 19 PID

    makes a process run with very low priority

    pgrep firefox

    find Firefox process ID

    pstree

    visualizing processes in tree model

chown owner-user:owner-group directory

Change owner and group owner of the directory

dig -x host

Performs reverse lookup on a domain

    host google.com

    Performs an IP lookup for the domain name

    hostname -i

    Displays local IP address

    wget file_name

    Downloads a file from an online source

    netstat -pnltu

    Displays all active listening ports

    uname

    Displays Linux system information

    uname -r

    Displays kernel release information

    uptime

    Displays how long the system has been running including load average

    hostname

    Shows the system hostname

    hostname -i

    Displays the IP address of the system

    last reboot

    Shows system reboot history

    date

    Displays current system date and time

    timedatectl

    Query and change the System clock

    cal

    Displays the current calendar month and day

    w

    Displays currently logged in users in the system

    whoami

    Displays who you are logged in as

    finger username

    Displays information about the user

    dmesg

    Displays bootup messages

    cat /proc/cpuinfo

    Displays more information about CPU e.g model, model name, cores, vendor id

    cat /proc/meminfo

    Displays more information about hardware memory e.g. Total and Free memory

    lshw

    Displays information about system's hardware configuration

    lsblk

    Displays block devices related information

    free -m

    Displays free and used memory in the system (-m flag indicates memory in MB)

    id

    Displays the details of the active user e.g. uid, gid, and groups

    last

    Shows the last logins in the system

    who

    Shows who is logged in to the system

    groupadd "admin"

    Adds the group 'admin'

    adduser "Sam"

    Adds user Sam

    userdel "Sam"

    Deletes user Sam

    ls -al

    Lists files - both regular & hidden files and their permissions as well.

    pwd

    Displays the current directory file path

    mkdir 'directory_name'

    Creates a new directory

    rm file_name

    Removes a file

    rm -f filename

    Forcefully removes a file

    rm -r directory_name

    Removes a directory recursively

    ps

    Display currently active processes

    ps aux | grep 'telnet'

    Searches for the id of the process 'telnet'

    pmap

    Displays memory map of processes

    top

    Displays all running processes

    kill pid

    Terminates process with a given pid

    killall proc

    Kills / Terminates all processes named proc

    chmod octal filename

    Change file permissions of the file to octal

    chmod 777 /data/test.c

    Set rwx permissions to owner, group and everyone (everyone else who has access to the server)

    chmod 755 /data/test.c

    Set rwx to the owner and r_x to group and everyone

    chmod 766 /data/test.c

    Sets rwx for owner, rw for group and everyone

    chown owner user-file

    Change ownership of the file

    chown owner-user:owner-group file_name

    Change owner and group owner of the file

    ip addr show

    Displays IP addresses and all the network interfaces

    ip address add 192.168.0.1/24 dev eth0

    Assigns IP address 192.168.0.1 to interface eth0

    ifconfig

    Displays IP addresses of all network interfaces

    ping host

    ping command sends an ICMP echo request to establish a connection to server / PC

    whois domain

    Retrieves more information about a domain name

    dig domain

    Retrieves DNS information about the domain

tar -cf home.tar home

    Creates archive file called 'home.tar' from file 'home'

    tar -xf files.tar

    Extract archive file 'files.tar'

    tar -zcvf home.tar.gz source-folder

    Creates gzipped tar archive file from the source folder

    gzip file

Compresses a file with the .gz extension

    rpm -i pkg_name.rpm

    Install an rpm package

    rpm -e pkg_name

    Removes an rpm package

    dnf install pkg_name

    Install package using dnf utility

    ./configure

    Checks your system for the required software needed to build the program. It will build the Makefile containing the instructions required to effectively build the project

    make

    It reads the Makefile to compile the program with the required operations. The process may take some time, depending on your system and the size of the program

    make install

    The command installs the binaries in the default/modified paths after the compilation

    grep 'pattern' files

    Search for a given pattern in files

    grep -r pattern dir

    Search recursively for a pattern in a given directory

    locate file

    Find all instances of the file

    find /home/ -name "index"

    Find file names that begin with 'index' in /home folder

    find /home -size +10000k

    Find files greater than 10000k in the home folder

    ssh user@host

    Securely connect to host as user

    ssh -p port_number user@host

    Securely connect to host using a specified port

    ssh host

    Securely connect to the system via SSH default port 22

    telnet host

    Connect to host via telnet default port 23

scp file1.txt server2:/tmp

    Securely copy file1.txt to server2 in /tmp directory

    rsync -a /home/apps /backup/

    Synchronize contents in /home/apps directory with /backup directory

    df -h

    Displays free space on mounted systems

    df -i

    Displays free inodes on filesystems

    fdisk -l

    Shows disk partitions, sizes, and types

    du -sh

    Displays disk usage in the current directory in a human-readable format

    findmnt

    Displays target mount point for all filesystems

    mount device-path mount-point

    Mount a device

    cd ..

    Move up one level in the directory tree structure

    cd

    Change directory to $HOME directory

    cd /test

    Change directory to /test directory


    Dockerfile

    Dockerfiles provide business applications with more flexibility and mobility. Dockerfiles are used by IT companies to bundle programs and their dependencies in a virtual container that may operate on bare metal, in public or private clouds, or on-premises. Numerous apps, worker tasks, and other activities can operate independently on a single physical computer or across multiple virtual machines using containers. Kubernetes is an open-source solution for automating the management and orchestration of Dockerfile-based containerized applications.
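As a rough sketch of the idea, the snippet below writes a minimal, hypothetical Dockerfile for a PHP web application and builds an image from it; the base image, paths and tag are illustrative assumptions, not part of the ZiSoft packaging:

# Write a minimal, illustrative Dockerfile
cat > Dockerfile <<'EOF'
FROM php:8.1-apache
# Copy the application code into the web root
COPY . /var/www/html
EXPOSE 80
EOF

# Build and tag an image from the directory containing the Dockerfile
sudo docker build -t demo/web:latest .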

| Concurrency % | Number of portal users | Number of concurrent users |
| --- | --- | --- |
| 10 % [ Minimum ] | 500000 | 50000 |
| 20 % [ Average ] | 500000 | 100000 |
| 30 % [ Recommended ] | 500000 | 150000 |
| 40 % [ Highly Scaled ] | 500000 | 250000 |

Number of portal Users [500000]
Number of concurrent users = 40 * 500000 / 100 = 200000
× Number of Workers = 200000 / 2000 = 100 Workers
× Number of Workers / VM = 33
× Number of Cores / VM = 16 cores
× RAM capacity estimation [ 1 core * 4 GB ] = 16 * 4 = 64 GB / VM

Minimum 4 VMs with 16 cores / 64 GB RAM
Recommended 5 VMs with 16 cores / 64 GB RAM
     
    Number of Users [ 5000 ]
    × Number of events per host per day [ 15 ]
    × 5Kb per event
    
    Number of Users × Number of events per host per day
    × 5Kb per event
    
    For example, an organization of 5,000 Users, with each user generating 
    an average of 15 events per day, 
    requiring a 30 day retention would require a database capacity of:
    
## DB storage sizing / Month
      5,000 × 15 × 5 × 30 = 11,250,000 KB, or ~11 GB

## DB storage sizing / Year
      11 GB per month × 12 = 132 GB / Year
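The same arithmetic can be scripted so the figures are easy to recompute for other user counts; a minimal shell sketch using the inputs from this example and decimal units, to match the numbers above:

# Inputs taken from the example above
USERS=5000
EVENTS_PER_DAY=15
EVENT_KB=5
RETENTION_DAYS=30

# users x events/day x KB per event x retention days, converted KB -> GB (decimal)
MONTH_GB=$(( USERS * EVENTS_PER_DAY * EVENT_KB * RETENTION_DAYS / 1000 / 1000 ))
echo "DB storage per month: ~${MONTH_GB} GB"
echo "DB storage per year : ~$(( MONTH_GB * 12 )) GB"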
    
    DB HW sizing for 5000 client connections / sec
    
    # DB Version: 14
    # OS Type: linux
    # DB Type: web
    # Total Memory (RAM): 96 GB
    # CPUs num: 24
    # Connections num: 5000
    # Data Storage: san
    
    max_connections = 5000
    shared_buffers = 24GB
    effective_cache_size = 72GB
    maintenance_work_mem = 2GB
    checkpoint_completion_target = 0.9
    wal_buffers = 16MB
    default_statistics_target = 100
    random_page_cost = 1.1
    effective_io_concurrency = 300
    work_mem = 1258kB
    min_wal_size = 1GB
    max_wal_size = 4GB
    max_worker_processes = 24
    max_parallel_workers_per_gather = 4
    max_parallel_workers = 24
    max_parallel_maintenance_workers = 4
    x Number of users  [ 5000 ]
    × Number of survey per user  per day  [ 2 ]
    × 25 Mb per survey attachment  [ 25 mb ]
    Number of Users × Number of survey per user  per day
    × 25 Mb per survey attachment
    
## File storage sizing / Month

      5,000 × 2 × 25 MB × 30 = 7,500,000 MB, or ~7.5 TB

## File storage sizing / Year

      7.5 TB per month × 12 = 90 TB / Year
    
    
    Popovers
  • Icons

  • Typography

  • Animated searches

  • Forms

  • Tabs

  • Notifications

  • Tables

  • Maps

  • Charts

  • Editors

• Any bugs that need to be fixed must be addressed on the release branch.
• The release branch must be merged back into the develop branch as well as the master branch.

• After merging, the release branch must be tagged with a version number.

• Visit Here
Git tag
• Distributed: Git is distributed. Distributed means that instead of switching the project to another machine, we can create a "clone" of the entire repository. Also, instead of just having one central repository that you send changes to, every user has their own repository that contains the entire commit history of the project. We do not need to connect to the remote repository; the change is just stored on our local repository. If necessary, we can push these changes to a remote repository.

• Speed: Git is written in C, which ignores the runtime overheads associated with other high-level languages. Git was developed to work on the Linux kernel; therefore, it is capable enough to handle large repositories effectively. From the beginning, speed and performance have been Git's primary goals.
• Supports non-linear development: Git supports seamless branching and merging, which helps in visualizing and navigating a non-linear development. A branch in Git represents a single commit. We can construct the full branch structure with the help of its parental commit.

• Branching and Merging: Branching and merging are great features of Git, which make it different from other SCM tools. Git allows the creation of multiple branches without affecting each other. We can perform tasks like creation, deletion, and merging on branches, and these tasks take only a few seconds. Below are some features that can be achieved by branching:

    • We can create a separate branch for a new module of the project, commit and delete it whenever we want.

    • We can have a production branch, which always has what goes into production and can be merged for testing in the test branch.

    • We can create a demo branch for the experiment and check if it is working. We can also remove it if needed.

    • The core benefit of branching is if we want to push something to a remote repository, we do not have to push all of our branches. We can select a few of our branches, or all of them together.

• Data Assurance: The Git data model ensures the cryptographic integrity of every unit of our project. It provides a unique commit ID to every commit through a SHA algorithm. We can retrieve and update the commit by commit ID. Most of the centralized version control systems do not provide such integrity by default.

• Staging Area: The staging area is also a unique functionality of Git. It can be considered a preview of our next commit, an intermediate area where commits can be formatted and reviewed before completion. When you make a commit, Git takes the changes that are in the staging area and makes them a new commit. We are allowed to add and remove changes from the staging area. The staging area can be considered as a place where Git stores the changes. However, Git doesn't have a dedicated staging directory where it stores objects representing file changes (blobs); instead, it uses a file called index.

• Undo Mistakes: One additional benefit of Git is that we can undo mistakes. Sometimes the undo can be a savior option for us. Git provides the undo option for almost everything.
• Track the Changes: Git provides features such as Diff, Log, and Status, which allow us to track changes, check the status, and compare our files or branches.

• Everything is Local: Almost all operations of Git can be performed locally; this is a significant reason for the use of Git. We do not have to ensure internet connectivity.

• Collaborate on Public Projects: There are many public projects available on GitHub. We can collaborate on those projects and show our creativity to the world. Many developers collaborate on public projects. The collaboration allows us to stand with experienced developers and learn a lot from them; thus, it takes our programming skills to the next level.

• Impress Recruiters: We can impress recruiters by mentioning Git and GitHub on our resume. Send your GitHub profile link to the HR of the organization you want to join. Show your skills and influence them through your work. It increases the chances of getting hired.

  • Benefits of Git

    The Version Control System is very helpful and beneficial in software development; developing software without using version control is unsafe. It provides backups for uncertainty. Version control systems offer a speedy interface to developers. It also allows software teams to preserve efficiency and agility according to the team scales to include more developers.

    Some key benefits of having a version control system are as follows.

    • Complete change history of the file

    hashtag
    Types of Version Control System

    • Localized version Control System

    • Centralized version control systems

    • Distributed version control systems

    Localized Version Control Systems

    The localized version control method is a common approach because of its simplicity. But this approach leads to a higher chance of error. In this approach, you may forget which directory you're in and accidentally write to the wrong file or copy over files you don't want to.

    To deal with this issue, programmers developed local VCSs that had a simple database. Such databases kept all the changes to files under revision control. A local version control system keeps local copies of the files.

    The major drawback of Local VCS is that it has a single point of failure.

    Centralized Version Control System

    The developers needed to collaborate with other developers on other systems. The localized version control system failed in this case. To deal with this problem, Centralized Version Control Systems were developed.

    These systems have a single server that contains the versioned files, and some clients to check out files from a central place.

    Centralized version control systems have many benefits, especially over local VCSs.

• Everyone on the system has information about what others are doing on the project.

    • Administrators have control over other developers.

    • It is easier to deal with a centralized version control system than a localized version control system.

• A centralized version control system provides a server software component which stores and manages the different versions of the files.

It also has the same drawback as the local version control system: it has a single point of failure.

    Distributed Version Control System

A Centralized Version Control System uses a central server to store all the data and enable team collaboration. But due to the single point of failure (the failure of the central server), developers do not prefer it. Hence, the Distributed Version Control System was developed.

In a Distributed Version Control System (such as Git, Mercurial, Bazaar or Darcs), the user has a local copy of a repository. So, the clients don't just check out the latest snapshot of the files; they can fully mirror the repository. The local repository contains all the files and metadata present in the main repository.

DVCS allows automatic management of branching and merging. It speeds up most operations except pushing and pulling. DVCS enhances the ability to work offline and does not rely on a single location for backups. If the central server stops and other systems were collaborating via it, any of the client repositories can be copied back to the server to restore it. Every checkout is a full backup of all the data.

    These systems do not necessarily depend on a central server to store all the versions of a project file.

    hashtag
    Difference between Centralized Version Control System and Distributed Version Control System

    Centralized Version Control Systems are systems that use client/server architecture. In a centralized Version Control System, one or more client systems are directly connected to a central server. Contrarily the Distributed Version Control Systems are systems that use peer-to-peer architecture.

    There are many benefits and drawbacks of using both the version control systems. Let's have a look at some significant differences between Centralized and Distributed version control system.

| Centralized Version Control System | Distributed Version Control System |
| --- | --- |
| In CVCS, the repository is placed at one place and delivers information to many clients. | In DVCS, every user has a local copy of the repository in place of the central repository on the server side. |
| It is based on the client-server approach. | It is based on the peer-to-peer approach. |
| It is the most straightforward system based on the concept of the central repository. | It is flexible and has emerged with the concept that everyone has their own repository. |
| In CVCS, the server provides the latest code to all the clients across the globe. | In DVCS, every user can check out the snapshot of the code, and they can fully mirror the central repository. |
| CVCS is easy to administrate and has additional control over users and access by its server from one place. | DVCS is fast compared to CVCS as you don't have to interact with the central server for every command. |

Git Commands

The git clone command is used to make a copy of a repository from an existing URL. If you want a local copy of a repository from GitHub, this command creates that copy in a local directory from the repository URL.

The git add command is used to add one or more files to the staging (index) area.

To add more than one file, list the additional file names after git add.

    Commit command is used in two scenarios. They are as follows.

    This command changes the head. It records or snapshots the file permanently in the version history with a message.

$ git commit -m "Commit Message"

The second form, git commit -a, commits any files you've added with git add and also commits any files you've changed since then.

The git status command is used to display the state of the working directory and the staging area. It allows you to see which changes have been staged, which haven't, and which files aren't being tracked by Git. It does not show you any information about the committed project history; for that, you need to use git log. It also lists the files that you've changed and those you still need to add or commit.

The git push command is used to upload local repository content to a remote repository. Pushing is the act of transferring commits from your local repository to a remote repo. It is the complement to git fetch: whereas fetching imports commits to local branches, pushing exports commits to remote branches. Remote branches are configured by using the git remote command. Pushing is capable of overwriting changes, and caution should be taken when pushing.

    Git push command can be used as follows.

    This command sends the changes made on the master branch, to your remote repository.

$ git push [variable name] master

This command pushes all the branches to the server repository.

The git pull command is used to receive data from GitHub. It fetches and merges changes from the remote server into your working directory.

The git branch command lists all the branches available in the repository.

The git merge command is used to merge the specified branch's history into the current branch.

The git log command is used to check the commit history.

    By default, if no argument passed, Git log shows the most recent commits first. We can limit the number of log entries displayed by passing a number as an option, such as -3 to show only the last three entries.

    Git Remote command is used to connect your local repository to the remote server. This command allows you to create, view, and delete connections to other repositories. These connections are more like bookmarks rather than direct links into other repositories. This command doesn't provide real-time access to repositories.


    Listeners

The Listeners directory holds all your project's handler classes, which are used for receiving and handling events.

    Mail

The Mail directory holds all the emails sent through your Laravel project, and this directory needs to be created using the command:

    Notifications

    The Notifications directory contains all your transactional notifications sent through your Laravel project, and this directory needs to be created using the command:

    Policies

    The policies directory holds different policies for your laravel project.

    Providers

    The Providers directory is used for containing different service providers.

    Rules

The Rules directory holds all the different objects related to custom validation rules, and this directory needs to be created using the command:

Ceph monitor pods
https://master-ip:31000
Kubernetes Architecture
Weave Network installation
Rook Architecture on Kubernetes
Rook Components on Kubernetes
Leiningen (http://leiningen.org/)

    Containerization

    The launch of Docker in 2013 jump started a revolution in application development – by democratizing software containers. Docker developed a Linux container technology – one that is portable, flexible and easy to deploy. Docker open sourced libcontainer and partnered with a worldwide community of contributors to further its development. In June 2015, Docker donated the container image specification and runtime code now known as runc, to the Open Container Initiative (OCI) to help establish standardization as the container ecosystem grows and matures.

    Following this evolution, Docker continues to give back with the containerd project, which Docker donated to the Cloud Native Computing Foundation (CNCF) in 2017. containerd is an industry-standard container runtime that leverages runc and was created with an emphasis on simplicity, robustness and portability. containerd is the core container runtime of the Docker Engine.

    hashtag
    Comparing Containers and Virtual Machines

    Containers and virtual machines have similar resource isolation and allocation benefits, but function differently because containers virtualize the operating system instead of hardware. Containers are more portable and efficient.

    Containers are an abstraction at the app layer that packages code and dependencies together. Multiple containers can run on the same machine and share the OS kernel with other containers, each running as isolated processes in user space. Containers take up less space than VMs (container images are typically tens of MBs in size), can handle more applications and require fewer VMs and Operating systems.

    Virtual machines (VMs) are an abstraction of physical hardware turning one server into many servers. The hypervisor allows multiple VMs to run on a single machine. Each VM includes a full copy of an operating system, the application, necessary binaries and libraries – taking up tens of GBs. VMs can also be slow to boot.

    hashtag
    Docker Engine Sparked the Containerization Movement

Docker Engine is the industry's de facto container runtime that runs on various Linux distributions and on Windows operating systems. Docker creates simple tooling and a universal packaging approach that bundles up all application dependencies inside a container which is then run on Docker Engine. Docker Engine enables containerized applications to run anywhere consistently on any infrastructure, solving "dependency hell" for developers and operations teams, and eliminating the "it works on my laptop!" problem.

    hashtag
    Package Software into Standardized Units for Development, Shipment and Deployment

    A container is a standard unit of software that packages up code and all its dependencies so the application runs quickly and reliably from one computing environment to another. A Docker container image is a lightweight, standalone, executable package of software that includes everything needed to run an application: code, runtime, system tools, system libraries and settings.

Container images become containers at runtime and, in the case of Docker containers, images become containers when they run on Docker Engine. Available for both Linux and Windows-based applications, containerized software will always run the same, regardless of the infrastructure. Containers isolate software from its environment and ensure that it works uniformly despite differences, for instance, between development and staging.

    Docker containers that run on Docker Engine:

    • Standard: Docker created the industry standard for containers, so they could be portable anywhere

    • Lightweight: Containers share the machine’s OS system kernel and therefore do not require an OS per application, driving higher server efficiencies and reducing server and licensing costs

    • Secure: Applications are safer in containers and Docker provides the strongest default isolation capabilities in the industry

    Git Remote

    In Git, the term remote is concerned with the remote repository. It is a shared repository that all team members use to exchange their changes. A remote repository is stored on a code hosting service like an internal server, GitHub, Subversion, and more. In the case of a local repository, a remote typically does not provide a file tree of the project's current state; as an alternative, it only consists of the .git versioning data.

    The developers can perform many operations with the remote server. These operations can be a clone, fetch, push, pull, and more. Consider the below image:

    To check the configuration of the remote server, run the git remote command. The git remote command allows accessing the connection between remote and local. If you want to see the original existence of your cloned repository, use the git remote command. It can be used as:


The command above reports the remote name as origin, which is the default name Git gives to the remote server.

    Git remote supports a specific option -v to show the URLs that Git has stored as a short name. These short names are used during the reading and write operation. Here, -v stands for verbose. We can use --verbose in place of -v. It is used as:

    The above output is providing available remote connections. If a repository contains more than one remote connection, this command will list them all.

When we fetch a repository, Git implicitly adds a remote for the repository. We can also explicitly add a remote for a repository, using a short nickname or short name. To add a remote as a short name, follow the below command:

$ git remote add <short name> <remote URL>

In the above output, I have added a remote for an existing repository with the short name "hd". Now, you can use "hd" on the command line in place of the whole URL. For example, if you want to pull the repository, consider the output below:

    I have pulled a repository using its short name instead of its remote URL. Now, the repository master branch can be accessed through a short name.
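For reference, the flow described above looks roughly like this on the command line; the short name "hd" follows the text, while the URL is purely illustrative:

# Register the remote under a short name
git remote add hd https://github.com/example/project.git

# The short name can now stand in for the full URL
git pull hd master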

    Fetching and Pulling Remote Branch

You can fetch and pull data from the remote repository. The fetch and pull commands go out to the remote server and fetch all the data from that remote project that you don't have yet. These commands let us fetch the references to all the branches from that remote.

    To fetch the data from your remote projects, run the below command:

    To clone the remote repository from your remote projects, run the below command:

    When we clone a repository, the remote repository is added by a default name "origin." So, mostly, the command is used as git fetch origin.

The git fetch origin command fetches the updates that have been made to the remote server since you cloned it. The git fetch command only downloads the data to the local repository; it doesn't merge or modify the data until you do so yourself. You have to merge it manually into your repository when you want.

    To pull the repository, run the below command:

The git pull command automatically fetches and then merges the remote data into your current branch. Pulling is an easier and more comfortable workflow than fetching, because the git clone command sets up your local master branch to track the remote master branch on the server you cloned.
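A quick side-by-side of the two workflows just described, assuming the default remote name origin and a master branch:

# Fetch only downloads the remote changes; merging is a separate, manual step
git fetch origin
git merge origin/master

# Pull does both steps in one command
git pull origin master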

    If you want to share your project, you have to push it upstream. The git push command is used to share a project or send updates to the remote server. It is used as:

$ git push <remote> <branch>

    To update the main branch of the project, use the below command:

    It is a special command-line utility that specifies the remote branch and directory. When you have multiple branches on a remote server, then this command assists you to specify your main branch and repository.

    Generally, the term origin stands for the remote repository, and master is considered as the main branch. So, the entire statement "git push origin master" pushed the local content on the master branch of the remote location.

    You can remove a remote connection from a repository. To remove a connection, perform the git remote command with remove or rm option. It can be done as:

$ git remote rm <destination>
$ git remote remove <destination>

    Consider the below example:

    Suppose you are connected with a default remote server "origin." To check the remote verbosely, perform the below command:

    The above output will list the available remote server. Now, perform the remove operation as mentioned above. Consider the below output:

    In the above output, I have removed remote server "origin" from my repository.

    Git allows renaming the remote server name so that you can use a short name in place of the remote server name. Below command is used to rename the remote server:

$ git remote rename <old name> <new name>

    In the above output, I have renamed my default server name origin to hd. Now, I can operate using this name in place of origin. Consider the below output:

    In the above output, I have pulled the remote repository using the server name hd. But, when I am using the old server name, it is throwing an error with the message "'origin' does not appear to be a git repository." It means Git is not identifying the old name, so all the operations will be performed by a new name.

    To see additional information about a particular remote, use the git remote command along with show sub-command. It is used as:

$ git remote show <remote>

    It will result in information about the remote server. It contains a list of branches related to the remote and also the endpoints attached for fetching and pushing.

    The above output is listing the URLs for the remote repository as well as the tracking branch information. This information will be helpful in various cases.

    hashtag
    Git Change Remote (Changing a Remote's URL)

We can change the URL of a remote repository. The git remote set-url command is used to change the URL of the repository; it replaces an existing remote repository URL.

We can change the remote URL simply by using the git remote set-url command. Suppose we want to make a unique name for our project to specify it. Git allows us to do so. It is a simple process. To change the remote URL, use the below command:

$ git remote set-url <remote name> <new URL>

The remote set-url command takes two arguments. The first one, <remote name>, is your current server name for the repository. The second argument, <new URL>, is the new URL for the repository. The <new URL> should be in the following format: https://github.com/URLChanged

    Consider the below image:

In the above output, I have changed my existing repository URL to https://github.com/URLChanged from https://github.com/ImDwivedi1/GitExample2, as the URL name itself suggests. To check the latest URL, perform the git remote -v command.

    Docker Compose

    hashtag
    Key features and use cases

    Using Compose is essentially a three-step process:

1. Define your app's environment with a Dockerfile so it can be reproduced anywhere.

2. Define the services that make up your app in docker-compose.yml so they can be run together in an isolated environment.

3. Run docker compose up and the Docker compose command starts and runs your entire app. You can alternatively run docker-compose up using Compose standalone (docker-compose binary).

    A docker-compose.yml looks like this:

    version: "3.9" # optional since v1.27.0

    hashtag
    Key features of Docker Compose

    Have multiple isolated environments on a single host

    Compose uses a project name to isolate environments from each other. You can make use of this project name in several different contexts:

    • on a dev host, to create multiple copies of a single environment, such as when you want to run a stable copy for each feature branch of a project

    • on a CI server, to keep builds from interfering with each other, you can set the project name to a unique build number

    • on a shared host or dev host, to prevent different projects, which may use the same service names, from interfering with each other

    The default project directory is the base directory of the Compose file. A custom value for it can be defined with the --project-directory command line option.
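For example, two copies of the same Compose file can run side by side on one host by giving each run its own project name; the names below are illustrative:

docker compose -p feature_a up -d   # first isolated copy of the environment
docker compose -p feature_b up -d   # second copy, with its own containers and networks
docker compose -p feature_a down    # tear down only the first copy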

    Preserves volume data when containers are created

    Compose preserves all volumes used by your services. When docker compose up runs, if it finds any containers from previous runs, it copies the volumes from the old container to the new container. This process ensures that any data you’ve created in volumes isn’t lost.

    If you use docker compose on a Windows machine, see

    and adjust the necessary environment variables for your specific needs.

    Only recreate containers that have changed

    Compose caches the configuration used to create a container. When you restart a service that has not changed, Compose re-uses the existing containers. Re-using containers means that you can make changes to your environment very quickly.

    Supports variables and moving a composition between environments

    Compose supports variables in the Compose file. You can use these variables to customize your composition for different environments, or different users. See

    for more details.

    You can extend a Compose file using the extends field or by creating multiple Compose files. See

    for more details.

    hashtag
    Common use cases of Docker Compose

    Compose can be used in many different ways. Some common use cases are outlined below.

    When you’re developing software, the ability to run an application in an isolated environment and interact with it is crucial. The Compose command line tool can be used to create the environment and interact with it.

    The

    provides a way to document and configure all of the application’s service dependencies (databases, queues, caches, web service APIs, etc). Using the Compose command line tool you can create and start one or more containers for each dependency with a single command (docker compose up).

    Together, these features provide a convenient way for developers to get started on a project. Compose can reduce a multi-page “developer getting started guide” to a single machine readable Compose file and a few commands.

    Automated testing environments

    An important part of any Continuous Deployment or Continuous Integration process is the automated test suite. Automated end-to-end testing requires an environment in which to run tests. Compose provides a convenient way to create and destroy isolated testing environments for your test suite. By defining the full environment in a

    , you can create and destroy these environments in just a few commands:
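A typical sequence looks like the following; the test runner script name is just a placeholder:

docker compose up -d    # create the environment in the background
./run_tests             # run your test suite against it (placeholder script)
docker compose down     # destroy the environment again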

    Compose has traditionally been focused on development and testing workflows, but with each release we’re making progress on more production-oriented features.

    Git Tags

Tags mark a specific point in Git history. Tags are used to mark a commit stage as relevant, and we can tag a commit for future reference. Primarily, tags are used to mark a project's release points, like v1.1.

    Tags are much like branches, and they do not change once initiated. We can have any number of tags on a branch or different branches. The below figure demonstrates the tags on various branches.

    In the above image, there are many versions of a branch. All these versions are tags in the repository.

There are two types of tags: lightweight tags and annotated tags.

Both of these tags are similar, but they differ in the amount of metadata they store.
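As a quick illustration of the two forms (the tag names and message are arbitrary):

# Lightweight tag: just a named pointer to the current commit
git tag v1.1

# Annotated tag: additionally stores the tagger, date and a message
git tag -a v1.2 -m "Release 1.2"

# List existing tags
git tag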

    Merge vs Rebase

    The git rebase command has a reputation for being magical Git hocus pocus that beginners should stay away from, but it can actually make life much easier for a development team when used with care. In this article, we’ll compare git rebase with the related git merge command and identify all of the potential opportunities to incorporate rebasing into the typical Git workflow.

    The first thing to understand about git rebase is that it solves the same problem as git merge. Both of these commands are designed to integrate changes from one branch into another branch—they just do it in very different ways.

Consider what happens when you start working on a new feature in a dedicated branch, and then another team member updates the main branch with new commits.
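In command form, the two ways of integrating the updated main branch into a feature branch look roughly like this; branch names are illustrative:

# Merge: ties the two histories together with a new merge commit
git checkout feature
git merge main

# Rebase: replays the feature commits on top of main, giving a linear history
git checkout feature
git rebase main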

    Docker Architecture

    Before learning the Docker architecture, first, you should know about the Docker Daemon.

Docker daemon runs on the host operating system. It is responsible for running containers and managing Docker services. Docker daemon communicates with other daemons. It offers various Docker objects such as images, containers, networking, and storage.

    Docker follows Client-Server architecture, which includes the three main components that are Docker Client, Docker Host, and Docker Registry.

    Docker client uses commands and REST APIs to communicate with the Docker Daemon (Server). When a client runs any docker command on the docker client terminal, the client terminal sends these docker commands to the Docker daemon. Docker daemon receives these commands from the docker client in the form of command and REST API's request.

    Git Merge and Merge Conflict

In Git, merging is a procedure to connect a forked history. It joins two or more development histories together. The git merge command lets you take the data created by git branch and integrate it into a single branch. Git merge will associate a series of commits into one unified history. Generally, git merge is used to combine two branches.

    It is used to maintain distinct lines of development; at some stage, you want to merge the changes in one branch. It is essential to understand how merging works in Git.

In the above figure, there are two branches, master and feature. We can see that we made some commits in both the feature and master branches, and merged them. Merging works with pointers: Git will find a common base commit between the branches. Once Git finds a shared base commit, it will create a new "merge commit" that combines the changes of each queued merge commit sequence.

    The git merge command is used to merge the branches.
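The usual flow for the scenario in the figure, merging a finished feature branch back into master, is sketched below with illustrative branch names:

git checkout master      # switch to the branch that should receive the changes
git merge feature        # create a merge commit combining both histories
git branch -d feature    # optionally delete the merged branch afterwards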

    Git Terminology

Git is a tool that covers a vast amount of terminology and jargon, which can often be difficult for new users, or for those who know Git basics but want to become Git masters. So, we need a little explanation of the terminology behind the tools. Let's have a look at the commonly used terms.

    Some commonly used terms are:

    A branch is a version of the repository that diverges from the main working project. It is an essential feature available in most modern version control systems. A Git project can have more than one branch. We can perform many operations on Git branch-like rename, list, delete, etc.

    In Git, the term checkout is used for the act of switching between different versions of a target entity. The git checkout command is used to switch between branches in a repository.

Cherry-picking in Git means applying some commit from one branch onto another branch. If you made a mistake and committed a change into the wrong branch, but do not want to merge the whole branch, you can revert the commit and cherry-pick it onto the correct branch.
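A short sketch of the checkout and cherry-pick operations described above; the branch names and commit hash are placeholders:

git checkout correct-branch      # switch to the branch the change should live on
git cherry-pick <commit-hash>    # apply that single commit here

git checkout wrong-branch        # go back to the branch that got the stray commit
git revert <commit-hash>         # undo it there with a new reverting commit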

    Git Stash

Sometimes you want to switch branches, but you are working on an incomplete part of your current project and don't want to commit half-done work. Git stashing allows you to do so: the git stash command enables you to switch branches without committing your current changes.

    The below figure demonstrates the properties and role of stashing concerning repository and working directory.

    Generally, the stash's meaning is "store something safely in a hidden place." The sense in Git is also the same for stash; Git temporarily saves your data safely without committing.

Stashing takes the messy state of your working directory and temporarily saves it for further use. Many options are available with git stash. Some useful options are given below:
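A few of the commonly used forms, shown here as a quick reference:

$ git stash           # save the uncommitted changes and clean the working directory
$ git stash list      # list all stashes
$ git stash apply     # re-apply the most recent stash, keeping it in the stash list
$ git stash pop       # re-apply the most recent stash and remove it from the list
$ git stash drop      # delete the most recent stash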

    FLUSH PRIVILEGES;
    +-----------------------------------------------------------------------------------------------------------------+
    | Grants for demouser@localhost                                                                                   |
    +-----------------------------------------------------------------------------------------------------------------+
    | GRANT USAGE ON *.* TO 'demouser'@'localhost' IDENTIFIED BY PASSWORD '*0756A562377EDF6ED3AC45A00B356AAE6D3C6BB6' |
    | GRANT ALL PRIVILEGES ON `demodb`.* TO 'demouser'@'localhost'                                                    |
    +-----------------------------------------------------------------------------------------------------------------+
    2 rows in set (0.00 sec)
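For reference, grants like the ones shown above would typically be created with statements along these lines (a sketch only; demodb and demouser come from the output above, and the password is a placeholder):

mysql> CREATE DATABASE demodb;
mysql> CREATE USER 'demouser'@'localhost' IDENTIFIED BY 'your_password_here';
mysql> GRANT ALL PRIVILEGES ON demodb.* TO 'demouser'@'localhost';
mysql> FLUSH PRIVILEGES;
mysql> SHOW GRANTS FOR 'demouser'@'localhost';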
    wget https://raw.githubusercontent.com/omarabdalhamid/Kubernetes-install/master/kmaster.sh && sh kmaster.sh
    #!/bin/bash
    apt-get update -y
    
    apt-get install \
        apt-transport-https \
        ca-certificates \
        curl \
        software-properties-common -y
    
    apt install ntp -y
    
    apt install libltdl7 -y
    
    service ntp start
    systemctl enable ntp
    
    #curl -fsSL https://download.docker.com/linux/debian/gpg | sudo apt-key add - \
    #    && sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu xenial stable" \
    #    && sudo apt-get update \
    #    && sudo apt-get install docker-ce=18.03.1~ce-0~ubuntu -yq
    sudo  wget https://download.docker.com/linux/ubuntu/dists/bionic/pool/stable/amd64/docker-ce-cli_18.09.0~3-0~ubuntu-bionic_amd64.deb
    sudo  dpkg -i  docker-ce-cli_18.09.0~3-0~ubuntu-bionic_amd64.deb
    sudo  add-apt-repository universe -y
    
    apt-get install docker-compose -y
    
    
    
    
    curl -L https://github.com/docker/machine/releases/download/v0.13.0/docker-machine-`uname -s`-`uname -m` >/tmp/docker-machine && \
    chmod +x /tmp/docker-machine && \
    sudo cp /tmp/docker-machine /usr/local/bin/docker-machine
    
    service docker start
    
    systemctl enable docker 
    
    cat > /etc/docker/daemon.json <<EOF
    {
      "exec-opts": ["native.cgroupdriver=systemd"],
      "log-driver": "json-file",
      "log-opts": {
        "max-size": "100m"
      },
      "storage-driver": "overlay2"
    }
    EOF
    
    mkdir -p /etc/systemd/system/docker.service.d
    
    # Restart docker.
    systemctl daemon-reload
    systemctl restart docker
    
    
    apt-get update && apt-get install -y apt-transport-https curl
    curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -
    cat <<EOF >/etc/apt/sources.list.d/kubernetes.list
    deb https://apt.kubernetes.io/ kubernetes-xenial main
    EOF
    apt-get update
    apt-get install -y kubelet kubeadm kubectl
    apt-mark hold kubelet kubeadm kubectl
    
    kubeadm init --token=102952.1a7dd4cc8d1f4cc5 --kubernetes-version $(kubeadm version -o short)
    
    sudo cp /etc/kubernetes/admin.conf $HOME/
    sudo chown $(id -u):$(id -g) $HOME/admin.conf
    export KUBECONFIG=$HOME/admin.conf
    
    
    kubectl taint nodes --all node-role.kubernetes.io/master-
    wget https://raw.githubusercontent.com/omarabdalhamid/Kubernetes-install/master/knode2.sh && sh knode2.sh 
    #!/bin/bash
    apt-get update -y
    
    
    apt-get install \
        apt-transport-https \
        ca-certificates \
        curl \
        software-properties-common -y
    
    apt install ntp -y
    
    service ntp start
    systemctl enable ntp
    
    # Install Docker CE
    ## Set up the repository:
    ### Install packages to allow apt to use a repository over HTTPS
    
    ### Add Docker’s official GPG key
    curl -fsSL https://download.docker.com/linux/ubuntu/gpg | apt-key add -
    
    ### Add Docker apt repository.
    add-apt-repository \
      "deb [arch=amd64] https://download.docker.com/linux/ubuntu \
      $(lsb_release -cs) \
      stable"
    
    ## Install Docker CE.
apt-get update && apt-get install -y docker-ce=18.06.2~ce~3-0~ubuntu
    
    # Setup daemon.
    cat > /etc/docker/daemon.json <<EOF
    {
      "exec-opts": ["native.cgroupdriver=systemd"],
      "log-driver": "json-file",
      "log-opts": {
        "max-size": "100m"
      },
      "storage-driver": "overlay2"
    }
    EOF
    
    mkdir -p /etc/systemd/system/docker.service.d
    
    # Restart docker.
    systemctl daemon-reload
    systemctl restart docker
    
    sudo add-apt-repository universe -y
    
    apt-get install docker-compose -y
    
    
    curl -L https://github.com/docker/machine/releases/download/v0.13.0/docker-machine-`uname -s`-`uname -m` >/tmp/docker-machine && \
    chmod +x /tmp/docker-machine && \
    sudo cp /tmp/docker-machine /usr/local/bin/docker-machine
    
    service docker start
    
    systemctl enable docker 
    
    apt-get update && apt-get install -y apt-transport-https curl
    curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -
    cat <<EOF >/etc/apt/sources.list.d/kubernetes.list
    deb https://apt.kubernetes.io/ kubernetes-xenial main
    EOF
    apt-get update
    apt-get install -y kubelet kubeadm kubectl
    apt-mark hold kubelet kubeadm kubectl
    
    wget https://raw.githubusercontent.com/omarabdalhamid/Kubernetes-install/master/kube-network.yaml && kubectl apply -f kube-network.yaml 
    apiVersion: v1
    kind: List
    items:
      - apiVersion: v1
        kind: ServiceAccount
        metadata:
          name: weave-net
          labels:
            name: weave-net
          namespace: kube-system
      - apiVersion: rbac.authorization.k8s.io/v1beta1
        kind: ClusterRole
        metadata:
          name: weave-net
          labels:
            name: weave-net
        rules:
          - apiGroups:
              - ''
            resources:
              - pods
              - namespaces
              - nodes
            verbs:
              - get
              - list
              - watch
          - apiGroups:
              - extensions
            resources:
              - networkpolicies
            verbs:
              - get
              - list
              - watch
          - apiGroups:
              - 'networking.k8s.io'
            resources:
              - networkpolicies
            verbs:
              - get
              - list
              - watch
          - apiGroups:
            - ''
            resources:
            - nodes/status
            verbs:
            - patch
            - update
      - apiVersion: rbac.authorization.k8s.io/v1beta1
        kind: ClusterRoleBinding
        metadata:
          name: weave-net
          labels:
            name: weave-net
        roleRef:
          kind: ClusterRole
          name: weave-net
          apiGroup: rbac.authorization.k8s.io
        subjects:
          - kind: ServiceAccount
            name: weave-net
            namespace: kube-system
      - apiVersion: rbac.authorization.k8s.io/v1beta1
        kind: Role
        metadata:
          name: weave-net
          namespace: kube-system
          labels:
            name: weave-net
        rules:
          - apiGroups:
              - ''
            resources:
              - configmaps
            resourceNames:
              - weave-net
            verbs:
              - get
              - update
          - apiGroups:
              - ''
            resources:
              - configmaps
            verbs:
              - create
      - apiVersion: rbac.authorization.k8s.io/v1beta1
        kind: RoleBinding
        metadata:
          name: weave-net
          namespace: kube-system
          labels:
            name: weave-net
        roleRef:
          kind: Role
          name: weave-net
          apiGroup: rbac.authorization.k8s.io
        subjects:
          - kind: ServiceAccount
            name: weave-net
            namespace: kube-system
      - apiVersion: apps/v1
        kind: DaemonSet
        metadata:
          name: weave-net
          labels:
            name: weave-net
          namespace: kube-system
        spec:
          # Wait 5 seconds to let pod connect before rolling next pod
          minReadySeconds: 5
          selector:
            matchLabels:
              name: weave-net      
          template:
            metadata:
              labels:
                name: weave-net
            spec:
              containers:
                - name: weave
                  command:
                    - /home/weave/launch.sh
                  env:
                    - name: HOSTNAME
                      valueFrom:
                        fieldRef:
                          apiVersion: v1
                          fieldPath: spec.nodeName
                  image: 'weaveworks/weave-kube:2.5.1'
                  imagePullPolicy: IfNotPresent
                  readinessProbe:
                    httpGet:
                      host: 127.0.0.1
                      path: /status
                      port: 6784
                  resources:
                    requests:
                      cpu: 10m
                  securityContext:
                    privileged: true
                  volumeMounts:
                    - name: weavedb
                      mountPath: /weavedb
                    - name: cni-bin
                      mountPath: /host/opt
                    - name: cni-bin2
                      mountPath: /host/home
                    - name: cni-conf
                      mountPath: /host/etc
                    - name: dbus
                      mountPath: /host/var/lib/dbus
                    - name: lib-modules
                      mountPath: /lib/modules
                    - name: xtables-lock
                      mountPath: /run/xtables.lock
                      readOnly: false
                - name: weave-npc
                  env:
                    - name: HOSTNAME
                      valueFrom:
                        fieldRef:
                          apiVersion: v1
                          fieldPath: spec.nodeName
                  image: 'weaveworks/weave-npc:2.5.1'
                  imagePullPolicy: IfNotPresent
    #npc-args
                  resources:
                    requests:
                      cpu: 10m
                  securityContext:
                    privileged: true
                  volumeMounts:
                    - name: xtables-lock
                      mountPath: /run/xtables.lock
                      readOnly: false
              hostNetwork: true
              hostPID: true
              restartPolicy: Always
              securityContext:
                seLinuxOptions: {}
              serviceAccountName: weave-net
              tolerations:
                - effect: NoSchedule
                  operator: Exists
              volumes:
                - name: weavedb
                  hostPath:
                    path: /var/lib/weave
                - name: cni-bin
                  hostPath:
                    path: /opt
                - name: cni-bin2
                  hostPath:
                    path: /home
                - name: cni-conf
                  hostPath:
                    path: /etc
                - name: dbus
                  hostPath:
                    path: /var/lib/dbus
                - name: lib-modules
                  hostPath:
                    path: /lib/modules
                - name: xtables-lock
                  hostPath:
                    path: /run/xtables.lock
                    type: FileOrCreate
          updateStrategy:
            type: RollingUpdate
    kubectl cluster-info 
    kubectl get node -o wide 
     kubectl get pods -n kube-system -o wide 
    git clone  https://github.com/rook/rook.git
    
    cd rook/cluster/examples/kubernetes/ceph/ 
     
    kubectl create -f operator.yaml 
    
    kubectl create -f cluster.yaml 
     
    kubectl -n rook-ceph-system get pod 
    
    kubectl apply -f toolbox.yaml 
    kubectl exec   -n rook-ceph rook-ceph-tools-856c5bc6b4-7bvf4 ceph status 
    wget https://raw.githubusercontent.com/omarabdalhamid/Kubernetes-install/master/dashboard.yaml && kubectl apply -f dashboard.yaml 
    # Copyright 2017 The Kubernetes Authors.
    #
    # Licensed under the Apache License, Version 2.0 (the "License");
    # you may not use this file except in compliance with the License.
    # You may obtain a copy of the License at
    #
    #     http://www.apache.org/licenses/LICENSE-2.0
    #
    # Unless required by applicable law or agreed to in writing, software
    # distributed under the License is distributed on an "AS IS" BASIS,
    # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    # See the License for the specific language governing permissions and
    # limitations under the License.
    
    apiVersion: v1
    kind: Namespace
    metadata:
      name: kubernetes-dashboard
    
    ---
    
    apiVersion: v1
    kind: ServiceAccount
    metadata:
      labels:
        k8s-app: kubernetes-dashboard
      name: kubernetes-dashboard
      namespace: kubernetes-dashboard
    
    ---
    
    kind: Service
    apiVersion: v1
    metadata:
      labels:
        k8s-app: kubernetes-dashboard
      name: kubernetes-dashboard
      namespace: kubernetes-dashboard
    spec:
      type: LoadBalancer
      ports:
        - port: 443
          targetPort: 8443
          nodePort: 31000
      selector:
        k8s-app: kubernetes-dashboard
    
    ---
    
    apiVersion: v1
    kind: Secret
    metadata:
      labels:
        k8s-app: kubernetes-dashboard
      name: kubernetes-dashboard-certs
      namespace: kubernetes-dashboard
    type: Opaque
    
    ---
    
    apiVersion: v1
    kind: Secret
    metadata:
      labels:
        k8s-app: kubernetes-dashboard
      name: kubernetes-dashboard-csrf
      namespace: kubernetes-dashboard
    type: Opaque
    data:
      csrf: ""
    
    ---
    
    apiVersion: v1
    kind: Secret
    metadata:
      labels:
        k8s-app: kubernetes-dashboard
      name: kubernetes-dashboard-key-holder
      namespace: kubernetes-dashboard
    type: Opaque
    
    ---
    
    kind: ConfigMap
    apiVersion: v1
    metadata:
      labels:
        k8s-app: kubernetes-dashboard
      name: kubernetes-dashboard-settings
      namespace: kubernetes-dashboard
    
    ---
    
    kind: Role
    apiVersion: rbac.authorization.k8s.io/v1
    metadata:
      labels:
        k8s-app: kubernetes-dashboard
      name: kubernetes-dashboard
      namespace: kubernetes-dashboard
    rules:
      # Allow Dashboard to get, update and delete Dashboard exclusive secrets.
      - apiGroups: [""]
        resources: ["secrets"]
        resourceNames: ["kubernetes-dashboard-key-holder", "kubernetes-dashboard-certs", "kubernetes-dashboard-csrf"]
        verbs: ["get", "update", "delete"]
        # Allow Dashboard to get and update 'kubernetes-dashboard-settings' config map.
      - apiGroups: [""]
        resources: ["configmaps"]
        resourceNames: ["kubernetes-dashboard-settings"]
        verbs: ["get", "update"]
        # Allow Dashboard to get metrics.
      - apiGroups: [""]
        resources: ["services"]
        resourceNames: ["heapster", "dashboard-metrics-scraper"]
        verbs: ["proxy"]
      - apiGroups: [""]
        resources: ["services/proxy"]
        resourceNames: ["heapster", "http:heapster:", "https:heapster:", "dashboard-metrics-scraper", "http:dashboard-metrics-scraper"]
        verbs: ["get"]
    
    ---
    
    kind: ClusterRole
    apiVersion: rbac.authorization.k8s.io/v1
    metadata:
      labels:
        k8s-app: kubernetes-dashboard
      name: kubernetes-dashboard
    rules:
      # Allow Metrics Scraper to get metrics from the Metrics server
      - apiGroups: ["metrics.k8s.io"]
        resources: ["pods", "nodes"]
        verbs: ["get", "list", "watch"]
    
    ---
    
    apiVersion: rbac.authorization.k8s.io/v1
    kind: RoleBinding
    metadata:
      labels:
        k8s-app: kubernetes-dashboard
      name: kubernetes-dashboard
      namespace: kubernetes-dashboard
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: Role
      name: kubernetes-dashboard
    subjects:
      - kind: ServiceAccount
        name: kubernetes-dashboard
        namespace: kubernetes-dashboard
    
    ---
    
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRoleBinding
    metadata:
      name: kubernetes-dashboard
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: ClusterRole
      name: kubernetes-dashboard
    subjects:
      - kind: ServiceAccount
        name: kubernetes-dashboard
        namespace: kubernetes-dashboard
    
    ---
    
    kind: Deployment
    apiVersion: apps/v1
    metadata:
      labels:
        k8s-app: kubernetes-dashboard
      name: kubernetes-dashboard
      namespace: kubernetes-dashboard
    spec:
      replicas: 1
      revisionHistoryLimit: 10
      selector:
        matchLabels:
          k8s-app: kubernetes-dashboard
      template:
        metadata:
          labels:
            k8s-app: kubernetes-dashboard
        spec:
          containers:
            - name: kubernetes-dashboard
              image: kubernetesui/dashboard:v2.0.0-beta5
              imagePullPolicy: Always
              ports:
                - containerPort: 8443
                  protocol: TCP
              args:
                - --auto-generate-certificates
                - --namespace=kubernetes-dashboard
                # Uncomment the following line to manually specify Kubernetes API server Host
                # If not specified, Dashboard will attempt to auto discover the API server and connect
                # to it. Uncomment only if the default does not work.
                # - --apiserver-host=http://my-address:port
              volumeMounts:
                - name: kubernetes-dashboard-certs
                  mountPath: /certs
                  # Create on-disk volume to store exec logs
                - mountPath: /tmp
                  name: tmp-volume
              livenessProbe:
                httpGet:
                  scheme: HTTPS
                  path: /
                  port: 8443
                initialDelaySeconds: 30
                timeoutSeconds: 30
              securityContext:
                allowPrivilegeEscalation: false
                readOnlyRootFilesystem: true
                runAsUser: 1001
                runAsGroup: 2001
          volumes:
            - name: kubernetes-dashboard-certs
              secret:
                secretName: kubernetes-dashboard-certs
            - name: tmp-volume
              emptyDir: {}
          serviceAccountName: kubernetes-dashboard
          nodeSelector:
            "beta.kubernetes.io/os": linux
          # Comment the following tolerations if Dashboard must not be deployed on master
          tolerations:
            - key: node-role.kubernetes.io/master
              effect: NoSchedule
    
    ---
    
    kind: Service
    apiVersion: v1
    metadata:
      labels:
        k8s-app: dashboard-metrics-scraper
      name: dashboard-metrics-scraper
      namespace: kubernetes-dashboard
    spec:
      type: LoadBalancer
      ports:
        - port: 8000
          targetPort: 8000
          nodePort: 31001
      selector:
        k8s-app: dashboard-metrics-scraper
    
    ---
    
    kind: Deployment
    apiVersion: apps/v1
    metadata:
      labels:
        k8s-app: dashboard-metrics-scraper
      name: dashboard-metrics-scraper
      namespace: kubernetes-dashboard
    spec:
      replicas: 1
      revisionHistoryLimit: 10
      selector:
        matchLabels:
          k8s-app: dashboard-metrics-scraper
      template:
        metadata:
          labels:
            k8s-app: dashboard-metrics-scraper
          annotations:
            seccomp.security.alpha.kubernetes.io/pod: 'runtime/default'
        spec:
          containers:
            - name: dashboard-metrics-scraper
              image: kubernetesui/metrics-scraper:v1.0.1
              ports:
                - containerPort: 8000
                  protocol: TCP
              livenessProbe:
                httpGet:
                  scheme: HTTP
                  path: /
                  port: 8000
                initialDelaySeconds: 30
                timeoutSeconds: 30
              volumeMounts:
              - mountPath: /tmp
                name: tmp-volume
              securityContext:
                allowPrivilegeEscalation: false
                readOnlyRootFilesystem: true
                runAsUser: 1001
                runAsGroup: 2001
          serviceAccountName: kubernetes-dashboard
          nodeSelector:
            "beta.kubernetes.io/os": linux
          # Comment the following tolerations if Dashboard must not be deployed on master
          tolerations:
            - key: node-role.kubernetes.io/master
              effect: NoSchedule
          volumes:
            - name: tmp-volume
              emptyDir: {}
    kubectl describe secret admin-user -n kube-system 
    wget https://raw.githubusercontent.com/omarabdalhamid/zisoft-scripts/master/zisoft-licenses-date.sh && sh zisoft-licenses-date.sh
    #!/bin/bash
    ################################################################################
# Script for installing ZiSoft on Ubuntu 14.04, 15.04, 16.04 and 18.04 (could be used for other versions too)
    # Author: OmarAbdalhamid Omar
    #-------------------------------------------------------------------------------
# This script will install ZiSoft Awareness 3 on your Ubuntu 18.04 server.
    #-------------------------------------------------------------------------------
# Make a new file:
# sudo nano zisoft-licenses-date.sh
# Place this content in it and then make the file executable:
# sudo chmod +x zisoft-licenses-date.sh
# Execute the script to generate the licenses:
# ./zisoft-licenses-date.sh
    ################################################################################
    
    echo "\n#############################################"
    
    echo  "\n--- Generate ZiSoft Licenses --"
    
    echo "\n#############################################"
    
    read -p "\nEnter ZiSoft Awareness  Client Name :   "  client_name
    
    read -p "\nEnter  Number of Users :   "  client_users
    
    read -p "\nEnter Number of Phishing_Users :   "  phishing_users
    
    read -p "\nEnter End date (YYYY-MM-DD) :   "  $end_date
    
    read -p "\nEnter Phishing End date (YYYY-MM-DD) :   "  $phishing_date
    
    
    container_web_id="$(sudo docker ps | grep zisoft/awareness/web | awk '{print $1}')"
    
    sudo docker exec -it $container_web_id bash -c "php artisan zisoft:license_create $client_name $end_date $client_users $phishing_date $phishing_users"
    
    
    echo "\n#############################################"
    
    echo  "\n---  ZiSoft Licenses Created Successfully --"
     
    echo "\n#############################################"
    
    echo "\n Licenses Import instructions"
    
    echo "\n 1 - Copy Licenses Activation Key"
    echo "\n 2 - Login   with an admin account"
    echo "\n 3 - Go to Administrator -> Settings -> Licenses"
    echo "\n 4 - Click + Import License"
    echo '\n 5 - paste the activation key which looks like {"users": X, "client": XXXX, "date": XXXX}XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX'
    echo "\n 6 - Click Save"
    

git clone is a Git command-line utility used to make a copy of (clone) a target repository. If you want a local copy of a repository hosted on GitHub, this command creates that copy in a local directory from the repository URL.
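For example (the URL below is a placeholder for your own repository):

$ git clone https://github.com/<user>/<repository>.git
$ cd <repository>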

The git fetch command is used to fetch branches and tags from one or more other repositories, along with the objects necessary to complete their histories. It updates the remote-tracking branches.

HEAD is the representation of the last commit in the currently checked-out branch. We can think of HEAD as the current branch. When you switch branches with git checkout, the HEAD revision changes and points to the new branch.

    The Git index is a staging area between the working directory and repository. It is used as the index to build up a set of changes that you want to commit together.

Master is a naming convention for a Git branch; it is the default branch in Git. After cloning a project from a remote server, the resulting local repository contains a single local branch, called the "master" branch. In other words, "master" is the repository's "default" branch.

    Merging is a process to put a forked history back together. The git merge command facilitates you to take the data created by git branch and integrate them into a single branch.

    In Git, "origin" is a reference to the remote repository from a project was initially cloned. More precisely, it is used instead of that original repository URL to make referencing much easier.

The term pull is used to receive data from a remote repository such as GitHub. It fetches and merges changes from the remote server into your working directory. The git pull command is used to perform a pull.
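For example, to fetch and merge the master branch from the origin remote:

$ git pull origin master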

Pull requests are a way for a developer to notify team members that they have completed a feature. Once their feature branch is ready, the developer files a pull request via their remote server account. The pull request tells the team members that they need to review the code and merge it into the master branch.

The term push refers to uploading local repository content to a remote repository. Pushing is the act of transferring commits from your local repository to a remote repository. Pushing can overwrite changes, so caution should be taken when pushing.
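For example, to upload the local master branch, or a new feature branch, to the origin remote:

$ git push origin master
$ git push -u origin feature   # push a new branch and set its upstream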

In Git, rebasing is the process of moving or combining a sequence of commits onto a new base commit. Rebasing is most useful, and most easily visualized, in the context of a feature-branching workflow.

From a content perspective, rebasing is a technique of changing the base of your branch from one commit to another.

    In Git, the term remote is concerned with the remote repository. It is a shared repository that all team members use to exchange their changes. A remote repository is stored on a code hosting service like an internal server, GitHub, Subversion and more.

Unlike a local repository, a remote typically does not provide a file tree of the project's current state; instead, it only consists of the .git versioning data.

In Git, a repository is a data structure used by the VCS to store metadata for a set of files and directories. It contains the collection of files as well as the history of changes made to those files. A repository in Git can be thought of as your project folder: it holds all the project-related data, and distinct projects have distinct repositories.

Sometimes you want to switch branches, but you are working on an incomplete part of your current project and don't want to commit half-done work. Git stashing allows you to do so: the git stash command enables you to switch branches without committing your current changes.

A tag marks a specific point in Git history. It is used to mark a commit stage as important, so we can tag a commit for future reference. Primarily, tags are used to mark release points such as v1.0 or v1.1. There are two types of tags.

The terms upstream and downstream are relative references between repositories. Generally, upstream is where you cloned the repository from (the origin), and downstream is any project that integrates your work with other work. However, these terms are not restricted to Git repositories.

In Git, the term revert is used to undo the changes introduced by a commit, using the git revert command. It is an undo-style command, but not a traditional undo: instead of removing the commit from history, it creates a new commit that reverses its changes.

In Git, the term reset stands for undoing changes. The git reset command is used to reset changes, and it has three core forms of invocation: --soft, --mixed (the default), and --hard.
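A brief sketch of the three forms, each undoing the last commit in a different way:

$ git reset --soft HEAD~1    # move HEAD back one commit; keep the changes staged
$ git reset --mixed HEAD~1   # the default; keep the changes in the working directory, unstaged
$ git reset --hard HEAD~1    # discard the commit and all of its changes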

In Git, the term ignore is used to specify intentionally untracked files that Git should ignore, usually via a .gitignore file. It does not affect files that are already tracked by Git.
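A typical .gitignore file might look like this (the entries are only examples):

# .gitignore
*.log
node_modules/
build/
.env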

Git diff is a command-line utility and a multi-use Git command. When it is executed, it runs a diff function on Git data sources. These data sources can be files, branches, commits, and more. It is used to show changes between commits, between a commit and the working tree, and so on.
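A few common forms, as a quick reference (branch names are illustrative):

$ git diff                       # changes in the working tree not yet staged
$ git diff --staged              # changes staged for the next commit
$ git diff master feature        # changes between two branches
$ git diff <commit1> <commit2>   # changes between two commits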

A Git cheat sheet is a summary of quick Git references. It contains basic Git commands with quick installation notes. A cheat sheet or crib sheet is a brief set of notes used for quick reference; cheat sheets are so named because people can use them with little prior knowledge.

    GitFlow is a branching model for Git, developed by Vincent Driessen. It is very well organized to collaborate and scale the development team. Git flow is a collection of Git commands. It accomplishes many repository operations with just single commands.

In Git, the term squash is used to combine previous commits into one. Git squash is an excellent technique for grouping specific changes before forwarding them to others. You can merge several commits into a single commit with the powerful interactive rebase command.

    In Git, the term rm stands for remove. It is used to remove individual files or a collection of files. The key function of git rm is to remove tracked files from the Git index. Additionally, it can be used to remove files from both the working directory and staging index.
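For example (the filenames are placeholders):

$ git rm old-file.txt            # remove the file from the index and the working directory
$ git rm --cached secrets.txt    # stop tracking the file but keep it on disk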

A fork is a personal copy of a repository. Forking a repository allows you to freely test and debug changes without affecting the original project.

A great use of forks is proposing changes for bug fixes. To resolve an issue for a bug that you found, you can fork the repository, fix the bug, and then:

• Submit a pull request to the project owner.

• When you want to create a release point for a stable version of your code.

• When you want to create a historical point that you can refer to or reuse in the future.

To create a tag, first check out the branch where you want to create the tag. To check out the branch, run the below command:

$ git checkout <Branch name>

    Now, you are on your desired branch, say, master. Consider the below output:

You can create a tag by using the git tag command. Create a tag with some name, say v1.0, v1.1, or any other name you want. To create a tag, run the command as follows:

$ git tag projectv1.0

    The above command will mark the current status of the project. Consider the below example:

    The above command will create a mark point on the master branch as projectv1.0.

    We can list the available tags in our repository. There are three options that are available to list the tags in the repository. They are as follows:

The plain git tag command is the most commonly used option to list all the available tags in the repository. It is used as:

$ git tag

    As we can see from the above output, the git tag command is listing the available tags from the repository.

The git show <tagname>:

    It's a specific command used to display the details of a particular tag. It is used as:

    The above command will display the tag description, consider the below command:

$ git show projectv1.0

In the above output, git show displays the description of the tag projectv1.0, such as the author name and date.

The git tag -l "<pattern>" option displays the available tags that match a wildcard pattern. Suppose we have ten tags, v1.0, v1.1, v1.2 up to v1.10; then we can list all of them using the pattern "v*". It is used as:

$ git tag -l "<pattern>*"

The above command will display all the tags that match the given pattern. Consider the below command:

$ git tag -l "pro*"

The above command displays a list of the tags that start with the word pro.

    There are two types of tags in git. They are as:

    Let's understand both of these tags in detail.

Annotated tags are tags that store extra metadata such as the developer name, email, and date. They are stored as full objects in the Git database.

If you are marking and saving a final version of a project, it is recommended to create an annotated tag. But if you want to make a temporary mark point or don't want to share that information, then you can create a lightweight tag. The data provided in annotated tags is important for a public release of the project. There are more options available for annotated tags; for example, you can add an annotation message.

    To create an annotated tag, run the below command:

$ git tag -a <tag name> -m "<tag message>"

    The above command will create a tag with a message. Annotated tags contain some additional information like author name and other project related information. Consider the below image:

    The above command will create an annotated tag projectv1.1 in the master branch of my project's repository.

    When we display an annotated tag, it will show more information about tag. Consider the below output:

Git supports one more type of tag, called a lightweight tag. The motive of both tag types is the same: marking a point in the repository. A lightweight tag is essentially just a commit reference stored in a file. It does not store extra information, which keeps it lightweight. No command-line options such as -a, -s, or -m are supplied for a lightweight tag; you only pass a tag name:

$ git tag <tag name>

The above command will create a lightweight tag. Consider the below example:

The given command will create a lightweight tag named projectv1.0.

When displayed, it shows reduced output compared to an annotated tag. Consider the below output:

We can push tags to a remote server project. This helps other team members know where to pick up an update, and it shows up as a release point on the remote server account. The git push command provides specific options for pushing tags. They are as follows:

• git push origin <tagname>

• git push origin --tags (or git push --tags)

    We can push any particular tag by using the git push command. It is used as follows:

$ git push origin <tagname>

    The above command will push the specified tag name as a release point. Consider the below example:

    I have created some tags in my local repository, and I want to push it on my GitHub account. Then, I have to operate the above command. Consider the below image; it is my remote repository current status.

    The above image is showing the release point as 0 releases. Now, perform the above command. Consider the below output:

    I have pushed my projectv1.0 tag to the remote repository. It will change the repository's current status. Consider the below image:

    By refreshing the repository, it is showing release point as 1 release. We can see this release by clicking on it. It will show as:

    We can download it as a zip and tar file.

The git push origin --tags / git push --tags:

The given command will push all the available tags at once. It will create as many release points as there are tags in the repository. It is used as follows:

$ git push origin --tags

    The above command will push all the available tags from the local repository to the remote repository. Consider the below output:

    Tags have been pushed to remote server origin; thus, the release point is also updated. Consider the below snapshot of the repository:

The release points are updated in the above output according to the tags. You can see that the releases are now shown as 2 releases.

    Git allows deleting a tag from the repository at any moment. To delete a tag, run the below command:

$ git tag --delete <tagname>

    The above command will delete a particular tag from the local repository. Suppose I want to delete my tag projectv1.0 then the process will be as:

$ git tag -d projectv1.0

    The tag projectv1.0 has been deleted from the repository.

    We can also delete a tag from the remote server. To delete a tag from the remote server, run the below command:

$ git push origin -d <tagname>

$ git push origin --delete <tagname>

    The above command will delete the specified tag from the remote server. Consider the below output:

    The projectv1.0 tag has been deleted from the remote server origin.

We can delete more than one tag with a single command. To delete multiple tags simultaneously, run the below command:

$ git tag -d <tag1> <tag2>

    The above command will delete both the tags from the local repository.

    We can also delete multiple tags from the remote server. To delete tags from the server origin, run the below command:

$ git push origin -d <tag1> <tag2>

    The above command will delete both tags from the server.

There is no real concept of checking out a tag in Git; however, we can achieve it by creating a new branch from a tag. To check out a tag, run the below command:

$ git checkout -b <new branch name> <tag name>

    The above command will create a new branch with the same state of the repository as it is in the tag. Consider the below output:

    The above command will create a new branch and transfer the status of the repository to new_branchv1.1 as it is on tag projectv1.1.

    Create a tag from an older commit:

If you want to go back in your history and create a tag at that point, Git allows you to do so. To create a tag from an older commit, run the below command:

$ git tag <tagname> <commit reference>

In the above command, it is not required to give the full 40-character commit hash; you can give just a leading part of it.

    Suppose I want to create a tag for my older commit, then the process will be as follows:

To find the older commit, run the git log command. It will operate as follows:

    Consider the below output:

The above output shows the older commits. Suppose I want to create a tag for the commit starting with 828b9628. Copy that commit reference and pass it as an argument in the above command. Consider the below output:

    In the above output, an earlier version of the repository is tagged as an olderversion.

    branch with new commits. This results in a forked history, which should be familiar to anyone who has used Git as a collaboration tool.

    Now, let’s say that the new commits in main are relevant to the feature that you’re working on. To incorporate the new commits into your feature branch, you have two options: merging or rebasing.

The easiest option is to merge the main branch into the feature branch using something like the following:

git checkout feature
git merge main

Or, you can condense this to a one-liner:

git merge main feature

    This creates a new “merge commit” in the feature branch that ties together the histories of both branches, giving you a branch structure that looks like this:

    Merging is nice because it’s a non-destructive operation. The existing branches are not changed in any way. This avoids all of the potential pitfalls of rebasing (discussed below).

    On the other hand, this also means that the feature branch will have an extraneous merge commit every time you need to incorporate upstream changes. If main is very active, this can pollute your feature branch’s history quite a bit. While it’s possible to mitigate this issue with advanced git log options, it can make it hard for other developers to understand the history of the project.

As an alternative to merging, you can rebase the feature branch onto the main branch using the following commands:

git checkout feature
git rebase main

    This moves the entire feature branch to begin on the tip of the main branch, effectively incorporating all of the new commits in main. But, instead of using a merge commit, rebasing re-writes the project history by creating brand new commits for each commit in the original branch.

    The major benefit of rebasing is that you get a much cleaner project history. First, it eliminates the unnecessary merge commits required by git merge. Second, as you can see in the above diagram, rebasing also results in a perfectly linear project history—you can follow the tip of feature all the way to the beginning of the project without any forks. This makes it easier to navigate your project with commands like git log, git bisect, and gitk.

    But, there are two trade-offs for this pristine commit history: safety and traceability. If you don’t follow the

    Golden Rule of Rebasingarrow-up-right

    , re-writing project history can be potentially catastrophic for your collaboration workflow. And, less importantly, rebasing loses the context provided by a merge commit—you can’t see when upstream changes were incorporated into the feature.

    Interactive rebasing gives you the opportunity to alter commits as they are moved to the new branch. This is even more powerful than an automated rebase, since it offers complete control over the branch’s commit history. Typically, this is used to clean up a messy history before merging a feature branch into main.

To begin an interactive rebasing session, pass the -i option to the git rebase command:

git checkout feature
git rebase -i main

    This will open a text editor listing all of the commits that are about to be moved:

    pick 33d5b7a Message for commit #1

    pick 9480b3d Message for commit #2

    pick 5c67e61 Message for commit #3

    This listing defines exactly what the branch will look like after the rebase is performed. By changing the pick command and/or re-ordering the entries, you can make the branch’s history look like whatever you want. For example, if the 2nd commit fixes a small problem in the 1st commit, you can condense them into a single commit with the fixup command:

    pick 33d5b7a Message for commit #1

    fixup 9480b3d Message for commit #2

    pick 5c67e61 Message for commit #3

    When you save and close the file, Git will perform the rebase according to your instructions, resulting in project history that looks like the following:

    Eliminating insignificant commits like this makes your feature’s history much easier to understand. This is something that git merge simply cannot do.

    hashtag
    The Golden Rule of Rebasing

    Once you understand what rebasing is, the most important thing to learn is when not to do it. The golden rule of git rebase is to never use it on public branches.

For example, think about what would happen if you rebased main onto your feature branch:

git checkout main
git rebase feature

    The rebase moves all of the commits in main onto the tip of feature. The problem is that this only happened in your repository. All of the other developers are still working with the original main. Since rebasing results in brand new commits, Git will think that your main branch’s history has diverged from everybody else’s.

    The only way to synchronize the two main branches is to merge them back together, resulting in an extra merge commit and two sets of commits that contain the same changes (the original ones, and the ones from your rebased branch). Needless to say, this is a very confusing situation.

    So, before you run git rebase, always ask yourself, “Is anyone else looking at this branch?” If the answer is yes, take your hands off the keyboard and start thinking about a non-destructive way to make your changes (e.g., the git revert command). Otherwise, you’re safe to re-write history as much as you like.

    If you try to push the rebased main branch back to a remote repository, Git will prevent you from doing so because it conflicts with the remote main branch. But, you can force the push to go through by passing the --force flag, like so:

# Be very careful with this command!
git push --force

    This overwrites the remote main branch to match the rebased one from your repository and makes things very confusing for the rest of your team. So, be very careful to use this command only when you know exactly what you’re doing.

    One of the only times you should be force-pushing is when you’ve performed a local cleanup after you’ve pushed a private feature branch to a remote repository (e.g., for backup purposes). This is like saying, “Oops, I didn’t really want to push that original version of the feature branch. Take the current one instead.” Again, it’s important that nobody is working off of the commits from the original version of the feature branch.

    Rebasing can be incorporated into your existing Git workflow as much or as little as your team is comfortable with. In this section, we’ll take a look at the benefits that rebasing can offer at the various stages of a feature’s development.

    The first step in any workflow that leverages git rebase is to create a dedicated branch for each feature. This gives you the necessary branch structure to safely utilize rebasing:

    One of the best ways to incorporate rebasing into your workflow is to clean up local, in-progress features. By periodically performing an interactive rebase, you can make sure each commit in your feature is focused and meaningful. This lets you write your code without worrying about breaking it up into isolated commits—you can fix it up after the fact.

    When calling git rebase, you have two options for the new base: The feature’s parent branch (e.g., main), or an earlier commit in your feature. We saw an example of the first option in the Interactive Rebasing section. The latter option is nice when you only need to fix up the last few commits. For example, the following command begins an interactive rebase of only the last 3 commits.

git checkout feature
git rebase -i HEAD~3

    By specifying HEAD~3 as the new base, you’re not actually moving the branch—you’re just interactively re-writing the 3 commits that follow it. Note that this will not incorporate upstream changes into the feature branch.

    If you want to re-write the entire feature using this method, the git merge-base command can be useful to find the original base of the feature branch. The following returns the commit ID of the original base, which you can then pass to git rebase:

    git merge-base feature main

    This use of interactive rebasing is a great way to introduce git rebase into your workflow, as it only affects local branches. The only thing other developers will see is your finished product, which should be a clean, easy-to-follow feature branch history.

    But again, this only works for private feature branches. If you’re collaborating with other developers via the same feature branch, that branch is public, and you’re not allowed to re-write its history.

    There is no git merge alternative for cleaning up local commits with an interactive rebase.

    Incorporating Upstream Changes Into a Feature

    In the Conceptual Overview section, we saw how a feature branch can incorporate upstream changes from main using either git merge or git rebase. Merging is a safe option that preserves the entire history of your repository, while rebasing creates a linear history by moving your feature branch onto the tip of main.

    This use of git rebase is similar to a local cleanup (and can be performed simultaneously), but in the process it incorporates those upstream commits from main.

    Keep in mind that it’s perfectly legal to rebase onto a remote branch instead of main. This can happen when collaborating on the same feature with another developer and you need to incorporate their changes into your repository.

    For example, if you and another developer named John added commits to the feature branch, your repository might look like the following after fetching the remote feature branch from John’s repository:


    Collaborating on the same feature branch


    You can resolve this fork the exact same way as you integrate upstream changes from main: either merge your local feature with john/feature, or rebase your local feature onto the tip of john/feature.


    Merging vs. rebasing onto a remote branch


    Note that this rebase doesn’t violate the Golden Rule of Rebasing because only your local feature commits are being moved—everything before that is untouched. This is like saying, “add my changes to what John has already done.” In most circumstances, this is more intuitive than synchronizing with the remote branch via a merge commit.

    By default, the git pull command performs a merge, but you can force it to integrate the remote branch with a rebase by passing it the --rebase option.
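For example, assuming the remote is named origin and the shared branch is main, the following pulls with a rebase instead of a merge:

git pull --rebase origin main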

    Reviewing a Feature With a Pull Request

    If you use pull requests as part of your code review process, you need to avoid using git rebase after creating the pull request. As soon as you make the pull request, other developers will be looking at your commits, which means that it’s a public branch. Re-writing its history will make it impossible for Git and your teammates to track any follow-up commits added to the feature.

    Any changes from other developers need to be incorporated with git merge instead of git rebase.

    For this reason, it’s usually a good idea to clean up your code with an interactive rebase before submitting your pull request.

    Integrating an Approved Feature

    After a feature has been approved by your team, you have the option of rebasing the feature onto the tip of the main branch before using git merge to integrate the feature into the main code base.

    This is a similar situation to incorporating upstream changes into a feature branch, but since you’re not allowed to re-write commits in the main branch, you have to eventually use git merge to integrate the feature. However, by performing a rebase before the merge, you’re assured that the merge will be fast-forwarded, resulting in a perfectly linear history. This also gives you the chance to squash any follow-up commits added during a pull request.

    If you’re not entirely comfortable with git rebase, you can always perform the rebase in a temporary branch. That way, if you accidentally mess up your feature’s history, you can check out the original branch and try again. For example:

git checkout -b temporary-branch
git rebase -i main
# [Clean up the history]
git checkout main
git merge temporary-branch

    hashtag
    The Advantages and Disadvantages of Git Rebase and Git Merge

Each approach brings pluses and minuses to the table.

    • Merge is simple, straightforward, and familiar

    • Merge preserves the entire history in chronological order

    • Merge maintains the branch’s context

    • Merge creates a complete log that helps developers understand the big picture, showing when and how each merge occurred

    • Merge makes it easy for users to find mistakes and correct them

    • Merge isn’t very user-friendly

    • Merge’s exhaustive nature results in a clumsy history and log

• Merge’s git log must be constantly and properly maintained, or it devolves into a mess

    • Rebase streamlines a possibly complex history

• Rebase cleans up intermediate commits by turning them into a single commit, making life easier for the rest of the team

    • Rebase avoids the merge commit noise typically found in busy repos and busy branches

    • Rebase creates a streamlined, linear log

    • Rebase provides a lack of clutter, which facilitates project movement

    • Users can’t track how and when commits were merged on the target branch

    • Rebase won’t work with pull requests since you can’t see minor changes someone else made

    • Rebase makes you resolve conflicts in the order they were created to continue the rebase. Unfortunately, this means more work for you, as you may need to fix the same conflicts repeatedly

• Rebase squashes the feature down to a small number of commits, which can make it harder to see the full context

Note: The Docker client can communicate with more than one Docker daemon.

    Docker Client uses Command Line Interface (CLI) to run the following commands -

    Docker Host is used to provide an environment to execute and run applications. It contains the docker daemon, images, containers, networks, and storage.

    Docker Registry manages and stores the Docker images.

    There are two types of registries in the Docker -

Public Registry - The public registry is also called Docker Hub.

    Private Registry - It is used to share images within the enterprise.

    There are the following Docker Objects -

Docker images are read-only binary templates used to create Docker containers. A private container registry can be used to share container images within the enterprise, and a public container registry can be used to share them with the whole world. Docker images also use metadata to describe the container's abilities.

Containers are the structural units of Docker, used to hold the entire package needed to run an application. The advantage of containers is that they require very few resources.

    In other words, we can say that the image is a template, and the container is a copy of that template.

Docker networking allows isolated containers to communicate with each other. Docker contains the following network drivers (a short example follows this list) -

• Bridge - Bridge is the default network driver for containers. It is used when multiple containers communicate on the same Docker host.

• Host - It is used when we don't need network isolation between the container and the host.

    • None - It disables all the networking.

• Overlay - Overlay allows Swarm services to communicate with each other. It enables containers to run on different Docker hosts.

    • Macvlan - Macvlan is used when we want to assign MAC addresses to the containers.
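As a quick illustration of the default bridge driver (a minimal sketch; the network and image names below are placeholders), a user-defined bridge network can be created and used like this:

$ docker network create -d bridge my-bridge-net             # create a user-defined bridge network
$ docker run -d --name web --network my-bridge-net nginx    # attach a container to it
$ docker network ls                                         # list all networks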

    Docker Storage is used to store data on the container. Docker offers the following options for the Storage -

• Data Volume - Data volumes provide the ability to create persistent storage. They also allow us to name volumes, list volumes, and see which containers are associated with the volumes.

    • Directory Mounts - It is one of the best options for docker storage. It mounts a host's directory into a container.

    • Storage Plugins - It provides an ability to connect to external storage platforms.

Docker Engine is a client-server application that contains the following major components:

    • A server which is a type of long-running program called a daemon process.

    • The REST API is used to specify interfaces that programs can use to talk to the daemon and instruct it what to do.

    • A command line interface client.

The syntax of the git merge command is:

$ git merge <branch name>

It can be used in various contexts. Some are as follows:

Scenario 1: To merge a specified commit into the currently active branch:

Use the below command to merge the specified commit into the currently active branch:

$ git merge <commit>

The above command will merge the specified commit into the currently active branch. You can also merge the specified commit into another branch by first switching to that branch. Let's see how to merge a commit into the currently active branch.

    See the below example. I have made some changes in my project's file newfile1.txt and committed it in my test branch.

    Copy the particular commit you want to merge on an active branch and perform the merge operation. See the below output:

    In the above output, we have merged the previous commit in the active branch test2.

Scenario 2: To merge commits into the master branch:

    To merge a specified commit into master, first discover its commit id. Use the log command to find the particular commit id.
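For example, a compact way to list commit ids is the one-line-per-commit view:

$ git log --oneline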

    To merge the commits into the master branch, switch over to the master branch.

    Now, Switch to branch 'master' to perform merging operation on a commit. Use the git merge command along with master branch name. The syntax for this is as follows:

    As shown in the above output, the commit for the commit id 2852e020909dfe705707695fd6d715cd723f9540 has merged into the master branch. Two files have changed in master branch. However, we have made this commit in the test branch. So, it is possible to merge any commit in any of the branches.

Open the files, and you will notice that the new line we committed on the test branch is now present on the master branch.

    Scenario 3: Git merge branch.

Git allows merging a whole branch into another branch. Suppose you have made many changes on a branch and want to merge all of them at once. See the below example:

    In the given output, I have made changes in newfile1 on the test branch. Now, I have committed this change in the test branch.

Now, switch to the branch you want to merge into. In the given example, I have switched to the master branch. Perform the below command to merge the whole branch into the active branch.
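A sketch of this scenario (test2 is the branch being merged, as named in the output below):

$ git checkout master    # switch to the branch you are merging into
$ git merge test2        # merge every commit from test2 into master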

As you can see from the given output, all the commits of branch test2 have been merged into branch master.

When the two branches being merged have both edited the same part of the same file, Git cannot decide which version to take. Such a situation is called a merge conflict. If it occurs, Git stops just before creating the merge commit so that you can resolve the conflicts manually.

    Let's understand it by an example.

Suppose my remote repository has been cloned by two of my team members, user1 and user2. User1 made the changes shown below in my project's index file.

Stage it in the local repository with the help of the git add command.

Now commit the changes and push them to the remote repository. See the below output:
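The commands behind that output would look roughly like this (the file name index.html and the remote/branch names origin and master are assumptions for illustration):

$ git add index.html              # stage the edited file
$ git commit -m "update index"    # commit the change locally
$ git push origin master          # update the remote repository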

    Now, my remote repository will look like this:

It will show the status of the file, such as who edited it and when.

Now, at the same time, user2 also updates the index file as follows.

User2 has added and committed the changes in his local repository. But when he tries to push them to the remote server, it throws an error. See the below output:

In the above output, the server knows that the file has already been updated and is not merged with the other changes, so the push request was rejected by the remote server with an error message like [rejected] failed to push some refs to <remote URL>. It suggests that you pull the repository before pushing. See the below command:
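One common shape of that pull (the remote and branch names are assumptions; the example's actual remote URL is not shown here):

$ git pull --rebase origin master    # fetch the remote changes and re-apply local commits on top of them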

In the given output, a rebase-style pull is used to bring the remote changes into the local repository. Here, it shows an error message like merge conflict in <filename>.

To resolve the conflict, it is necessary to know where the conflict occurs and why. The git mergetool command is used to resolve the conflict. It is used as follows:
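$ git mergetool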

    In my repository, it will result in:

The above output shows the status of the conflicted file. To resolve the conflict, enter insert mode by pressing the I key and make the changes you want. Press the Esc key to leave insert mode, then type :wq! at the bottom of the editor to save and exit. To accept the changes and continue, use the rebase command. It is used as follows:
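$ git rebase --continue    # continue the rebase once the conflicted file has been fixed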

Hence, the conflict has been resolved. See the below output:

In the above output, the conflict has been resolved, and the local repository is synchronized with the remote repository.

To see where the merge conflict begins in your file, search the file for the conflict marker <<<<<<<. The changes from the HEAD or base branch appear after the line <<<<<<< HEAD in your text editor. Next comes the divider =======, which separates your changes from the changes in the other branch, followed by >>>>>>> BRANCH-NAME. In the above example, user1 wrote "<h1> Git is a version control</h1>" in the base or HEAD branch and user2 wrote "<h2> Git is a version control</h2>".
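Based on that description, the conflicted section of the file looks roughly like this:

<<<<<<< HEAD
<h1> Git is a version control</h1>
=======
<h2> Git is a version control</h2>
>>>>>>> BRANCH-NAME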

Decide whether you want to keep only your branch's changes, only the other branch's changes, or create a new change. Delete the conflict markers <<<<<<<, =======, >>>>>>> and make the final changes you want to merge.

Let's understand it with a real-time scenario. I have made changes to my project GitExample2 in two files from two distinct branches. I am in a messy state, and I have not finished editing either file yet, so I want to save the work temporarily for future use. We can stash it to save its current state. Before stashing, let's have a look at the repository's current status. To check the current status of the repository, run the git status command. The git status command is used as:
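$ git status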

From the above output, you can see that there are two untracked files, design.css and newfile.txt, in the repository. To save this work temporarily, we can use the git stash command. The git stash command is used as:
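$ git stash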

In the given output, the work is saved with the git stash command. We can check the status of the repository again.

As you can see, my work has been stashed. Now the working directory is clean. At this point, you can switch between branches and work on them.

    hashtag
    Git Stash Save (Saving Stashes with the message):

    In Git, the changes can be stashed with a message. To stash a change with a message, run the below command:

$ git stash save "<Stashing Message>"

The above stash will be saved with the given message.

    hashtag
    Git Stash List (Check the Stored Stashes)

    To check the stored stashes, run the below command:
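$ git stash list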

In the above case, I have made one stash, which is displayed as "stash@{0}: WIP on test: 0a1a475 CSS file".

If we have more than one stash, it will display all of them, each with a different stash id. Consider the below output:

You can re-apply the changes that you just stashed by using the git stash command followed by the apply option. It is used as:
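$ git stash apply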

The above command restores the most recent stash. Now, if you check the status of the repository, it will show the changes made to the files. Consider the below output:

From the above output, you can see that the repository has been restored to its state before the stash. It shows "Changes not staged for commit."

If there is more than one stash, you can use the "git stash apply" command followed by the stash id to apply a particular stash. It is used as:

$ git stash apply <stash id>

    Consider the below output:

    If we don't specify a stash, Git takes the most recent stash and tries to apply it.

We can track the stashes and their changes. To see a summary of the changes saved in a stash, run the below command:
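$ git stash show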

The above command will show the files that are stashed and the changes made to them. Consider the below output:

The above output illustrates that two files are stashed, with two insertions made in them.

We can also see exactly what changes were made to the files. To display the changed content, perform the below command:
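$ git stash show -p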

Here, -p stands for patch; it shows the full diff of the stashed changes. The given command will show the edited files and their content. Consider the below output:

The above output shows the file names along with the changed content. It behaves much like the git diff command, which would show similar output.

    hashtag
    Git Stash Pop (Reapplying Stashed Changes)

Git allows the user to re-apply stashed changes by using the git stash pop command. The pop option removes the changes from the stash and applies them to your working files.

The git stash pop command is quite similar to git stash apply. The main difference between the two is that git stash pop deletes the stash from the stack after it is applied.
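$ git stash pop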

The above command will re-apply the stashed changes to the repository. Consider the below output.

The git stash drop command is used to delete a stash from the queue. Generally, it deletes the most recent stash. Caution should be taken before using the stash drop command, as it is difficult to undo once applied.

The only practical way to recover a dropped stash is if you have not closed the terminal after deleting it. The stash drop command is used as:
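$ git stash drop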

In the above output, the most recent stash (stash@{0}) has been dropped from the given three stashes. The stash list command lists all the stashes remaining in the queue.

We can also delete a particular stash from the queue. To delete a particular stash from the available stashes, pass its stash id to the stash drop command. It is used as:

$ git stash drop <stash id>

Assume that I have two stashes in my queue, and I want to keep the most recent one but delete the older one. Then, it is done as follows:
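$ git stash drop stash@{1}    # delete the older stash, keeping stash@{0}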

    Consider the below output:

In the above output, the stash stash@{1} has been deleted from the queue.

The git stash clear command allows deleting all the available stashes at once. To delete all the available stashes, run the below command:
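$ git stash clear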

It will delete all the stashes that exist in the repository.

In the above output, all the stashes have been deleted. The git stash list output is empty because there are no stashes left in the repository.

If you stash some work on a particular branch and then continue working on that branch, applying the stash later may create a conflict during merging. So, it is good to apply stashed work on a separate branch.

The git stash branch command allows the user to apply stashed work on a separate branch to avoid conflicts. Its syntax is as follows:

$ git stash branch <Branch Name>

The above command will create a new branch and transfer the stashed work to it. Consider the below output:

In the above output, the stashed work has been transferred to the newly created branch testing. This avoids a merge conflict on the master branch.

| CVCS | DVCS |
| --- | --- |
| The popular tools of CVCS are SVN (Subversion) and CVS. | The popular tools of DVCS are Git and Mercurial. |
| CVCS is easy for beginners to understand. | DVCS has a somewhat more complex workflow for beginners. |
| If the central server fails, no system can access data from another system. | If the server fails, any of the client repositories can be used to restore it, since every client holds a full copy of the history. |
