2025 Interview Questions

1. What are all the types of applications you have deployed?

  • Web applications (e.g., Java Spring Boot, Node.js)
  • Microservices architectures
  • Databases (e.g., MySQL, MongoDB)
  • Message brokers (e.g., Apache Kafka)
  • Monitoring tools (e.g., Prometheus, Grafana)
  • CI/CD pipelines (e.g., Jenkins)
  • Containerized applications using Docker and Kubernetes

2. How have you injected secrets into ConfigMaps?

  • Secrets should not be injected in ConfigMaps as ConfigMaps are not designed for sensitive data.
  • Instead, Kubernetes Secrets should be used. Secrets can be injected into pods via environment variables or mounted as files.

3. How do you find which pod is consuming the most resources across nodes using kubectl?

Use:

kubectl top pod --all-namespaces

to list resource usage by pod (this requires the metrics-server add-on to be installed in the cluster). Recent kubectl versions can also sort the output, e.g. kubectl top pod --all-namespaces --sort-by=cpu.

Combine it with kubectl describe pod <pod-name> to see the pod's resource requests, limits, and recent events.

4. How do you know which worker node is consuming the most resources in a cluster using kubectl?

Use kubectl top nodes to see resource consumption across nodes.

This will show CPU and memory usage on each node.

5. What are the steps for configuring Prometheus and Grafana for monitoring Kubernetes clusters?

  1. Deploy Prometheus using Helm or a custom YAML configuration.
  2. Set up Kubernetes service discovery for Prometheus.
  3. Deploy Grafana and configure it to use Prometheus as a data source.
  4. Import Kubernetes monitoring dashboards in Grafana.
  5. Set up alerting rules in Prometheus as needed.

6. If 20 pods are running, how do you visualize the metrics of these pods in Grafana?

Configure Grafana to use Prometheus as the data source. Use existing Kubernetes dashboards or create custom dashboards to visualize metrics for all pods.

7. What is Apache Kafka?

Apache Kafka is a distributed event streaming platform used for building real-time data pipelines and streaming applications. It’s designed to handle large volumes of data in a distributed, fault-tolerant manner.

8. How do you set up a Docker Hub private registry and integrate it with a CI/CD pipeline? What is the procedure?

  1. Create a private repository on Docker Hub.
  2. Configure CI/CD pipeline to authenticate with Docker Hub using credentials.
  3. Build Docker images in CI pipeline and push them to the private repository.
  4. Pull images from the private registry during the CD process.

9. What is the difference between a hard link and a soft link?

  • Hard Link: A direct reference to the file’s data (inode) on the disk.

    • Deleting the original file does not affect the hard link.
  • Soft Link (Symbolic Link): A pointer to the original file’s path.

    • If the original file is deleted, the soft link becomes broken.
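A quick way to see the difference in any Linux shell (file names are illustrative):

echo "data" > original.txt
ln original.txt hardlink.txt     # hard link: points at the same inode
ln -s original.txt softlink.txt  # soft link: points at the path
rm original.txt
cat hardlink.txt                 # still prints "data"
cat softlink.txt                 # fails: the symlink is now broken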

10. What is the use of the break command in shell scripting? In what scenarios have you used it?

The break command is used to exit a loop prematurely. It is commonly used when a specific condition is met and you no longer need to continue the loop.
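
For example, a small sketch (hostnames are hypothetical) that stops scanning a list of servers as soon as one responds:

for host in web1 web2 web3; do
  if ping -c 1 "$host" > /dev/null 2>&1; then
    echo "First reachable host: $host"
    break   # exit the loop early; remaining hosts are skipped
  fi
done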

11. How do you count the number of "devops" words in 15 HTML files?

To count the number of times the word "devops" appears across 15 HTML files, you can use the following shell command:

grep -o -i "devops" *.html | wc -l
  • grep -o -i "devops" *.html: Searches for the word "devops" (case-insensitive due to -i) in all HTML files and outputs each occurrence on a new line (-o).
  • wc -l: Counts the number of lines output by grep, which corresponds to the number of occurrences of "devops".

12. What is the terraform taint command?

The terraform taint command is used to manually mark a specific resource for recreation during the next terraform apply.

When a resource is tainted, Terraform will destroy it and create a new instance of it on the next apply.

This is useful when you know a resource needs to be replaced but Terraform hasn't automatically determined that it should be. Note that in recent Terraform versions, taint is deprecated in favor of terraform apply -replace=<resource_address>.

Syntax: terraform taint <resource_address>
Example: terraform taint aws_instance.example

13. What are the possible ways to secure a state file in Terraform?

  1. Encrypt the State File: Store the state file in an encrypted format using tools like AWS S3 bucket encryption, Azure Blob encryption, or Google Cloud Storage encryption.
  2. Use Remote Backends with Security Controls: Use a remote backend like AWS S3 with IAM roles and policies, HashiCorp Consul with ACLs, or Terraform Cloud, which manages access control and encryption.
  3. Enable Versioning: Use versioning on the state file storage to recover from any unintended changes or deletions.
  4. Access Control: Restrict access to the state file using IAM policies, RBAC (Role-Based Access Control), or similar mechanisms.
  5. Use Backend Locking: Use state locking mechanisms provided by backends like S3 with DynamoDB or Consul to prevent concurrent operations from corrupting the state file.

14. If you provision 100 servers and someone deletes 50 VMs manually, what happens if you apply the terraform apply command?

When you run terraform apply, Terraform will compare the state file with the actual infrastructure.

It will notice that 50 VMs are missing and will attempt to recreate those missing VMs to match the state defined in your configuration.

The end result will be 100 VMs again.

15. What is the syntax for for_each in Terraform?

The for_each meta-argument in Terraform allows you to create multiple instances of a resource or module based on the items in a map or set.

The syntax is as follows:

resource "aws_instance" "example" {
    for_each = var.instances
    ami = each.value["ami"]
    instance_type = each.value["instance_type"]
    tags = {
        Name = each.key
    }
 }

In this example,

var.instances is a map, and each.key refers to the current key in the map, while each.value refers to the associated value.

16. What are the advantages and disadvantages of multi-stage builds in Docker?

Advantages

  • Smaller Image Size: By separating the build environment from the runtime environment, only the necessary components are included in the final image, leading to smaller image sizes.
  • Improved Security: Reduces the attack surface by excluding build tools and other unnecessary components from the final image.
  • Better Caching: Allows for more efficient use of Docker’s caching mechanism, potentially speeding up the build process.
  • Separation of Concerns: Different stages can focus on specific tasks, making the Dockerfile more organized and easier to maintain.

Disadvantages

  • Complexity: Multi-stage Dockerfiles can be more complex and harder to understand, especially for those new to Docker.
  • Longer Build Times: In some cases, the use of multiple stages may lead to longer build times due to additional steps and transitions between stages.
  • Troubleshooting: Debugging and troubleshooting can be more challenging because intermediate stages are not preserved by default.
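
A minimal multi-stage sketch for a Java application (base image tags and paths are illustrative):

FROM maven:3.9-eclipse-temurin-17 AS build   # build stage: full Maven/JDK toolchain
WORKDIR /src
COPY . .
RUN mvn -q package

FROM eclipse-temurin:17-jre                  # runtime stage: JRE only
WORKDIR /app
COPY --from=build /src/target/app.jar .
CMD ["java", "-jar", "app.jar"]

Only the final stage ends up in the shipped image; the Maven toolchain and source tree are discarded.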

17. How do you deploy containers on different hosts, not the same host, within a Docker cluster?

To deploy containers on different hosts within a Docker cluster, you can use Docker Swarm or Kubernetes.

Docker Swarm: Use Docker Swarm’s built-in orchestration capabilities.

When deploying a service, Swarm will automatically distribute the containers across different nodes in the cluster. You can influence this behavior using placement constraints.

Example

docker service create --name myservice --replicas 2 --constraint 'node.role == worker' myimage

Kubernetes: Use Kubernetes, which will schedule pods on different nodes based on the available resources and any specified node selectors or affinities.

18. If you have a Docker Compose setup, how do you deploy the web container on one host and the DB container on another host?

To deploy the web container on one host and the DB container on another using Docker Compose, you can:

  1. Use Docker Swarm: Convert the Compose file into a stack file, and deploy it with docker stack deploy. Docker Swarm will distribute services across nodes.
  2. Manual Placement: Use placement constraints in your docker-compose.yml file to specify which containers should run on which nodes.
version: '3.7'
services:
  web:
    image: my-web-image
    deploy:
      placement:
        constraints: [node.hostname == web-node]

  db:
    image: my-db-image
    deploy:
      placement:
        constraints: [node.hostname == db-node]

19. What is the difference between bridge networking and host networking in Docker?

  • Bridge Networking: The default Docker network driver.
    • Containers connected to the same bridge network can communicate with each other.
    • Each container gets its own IP address and is isolated from the host network. You can expose ports to the host using the -p option.
  • Host Networking: The container shares the host's network stack, meaning it doesn't get its own IP address; the container's network is the same as the host's.
    • This can lead to performance improvements but reduces isolation between the host and the container.

20. How do you resolve merge conflicts?

To resolve merge conflicts, follow these steps:

  1. Identify the Conflict: Git will mark the conflicting areas in the files with conflict markers (<<<<<<<, =======, and >>>>>>>).
  2. Manually Resolve: Open the conflicting file and decide how to combine the changes. Remove the conflict markers and modify the content to resolve the conflict.
  3. Stage the Resolved Files: Once resolved, stage the files using git add.
  4. Commit the Changes: Commit the resolved conflicts with git commit. If you were in the middle of a merge, this will complete the merge process.
  5. Test: Run tests to ensure that the merge didn’t introduce any issues.

Example:

git add <resolved-files>
git commit -m "Resolved merge conflicts in X, Y, Z files"

21. What command do you use to change the existing commit message?

To change the most recent commit message, you can use the following Git command:

git commit --amend -m "New commit message"

If you have already pushed the commit to a remote repository, you will need to force push the changes:

git push --force

The safer git push --force-with-lease variant refuses to overwrite commits someone else has pushed in the meantime.

22. What is session affinity?

Session affinity, also known as sticky sessions, is a concept in load balancing where requests from a particular user are consistently directed to the same server (or pod) in a multi-server environment.

This ensures that the user's session data, which might be stored locally on the server, remains accessible throughout the session.

23. What is pod affinity and its use case?

Pod affinity is a feature in Kubernetes that allows you to specify rules for scheduling pods to run on nodes that have other specified pods running on them.

This can be useful when you want certain pods to be located together due to factors like data locality, network latency, or shared resources.

Use Case: An application where the frontend and backend services communicate frequently might use pod affinity to ensure that both are scheduled on the same node to reduce network latency.
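
A minimal sketch (labels are illustrative) of a pod-affinity rule in a pod spec, co-locating a backend pod with frontend pods:

affinity:
  podAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app: frontend
        topologyKey: kubernetes.io/hostname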

24. What is the difference between pod affinity and pod anti-affinity?

  • Pod Affinity: Ensures that pods are scheduled on the same node or in proximity to each other.
  • Pod Anti-Affinity: Ensures that pods are not scheduled on the same node, or are placed far apart from each other.

Example Use Case:

  • Pod Affinity: Running frontend and backend on the same node for low-latency communication.
  • Pod Anti-Affinity: Ensuring replicas of the same application are spread across different nodes to increase availability and fault tolerance.

25. What are readiness and liveness probes?

  • Readiness Probe: Used to determine when a pod is ready to start accepting traffic. If the readiness probe fails, the pod will be removed from the service endpoints, ensuring it does not receive traffic until it's ready.
  • Liveness Probe: Used to determine if a pod is still running. If the liveness probe fails, Kubernetes will restart the pod, assuming it's in a failed state.
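
An illustrative container-spec fragment (paths, port, and timings are assumptions):

livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 15
  periodSeconds: 20
readinessProbe:
  httpGet:
    path: /ready
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 10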

26. Write a simple Groovy pipeline for a Java Spring Boot application that waits for user input for approval to move to the next stage, with stages for checkout, build, push, and deploy.

pipeline {
    agent any

    stages {
        stage('Checkout') {
            steps {
                git 'https://github.com/your-repo/java-spring-boot-app.git'
            }
        }

        stage('Build') {
            steps {
                sh './mvnw clean package'
            }
        }

        stage('Push') {
            steps {
                sh 'docker build -t your-docker-repo/java-spring-boot-app .'
                sh 'docker push your-docker-repo/java-spring-boot-app'
            }
        }

        stage('Approval') {
            steps {
                input 'Do you want to deploy to production?'
            }
        }

        stage('Deploy') {
            steps {
                sh 'kubectl apply -f deployment.yaml'
            }
        }
    }
}

27. How do you export test reports in Jenkins?

To export test reports in Jenkins:

  1. Ensure that your tests generate reports in a standard format like JUnit XML or HTML.
  2. Use the Publish JUnit test result report post-build action to archive test reports.
  3. You can also use the Archive the artifacts post-build action to store other types of reports.
  4. The test reports can then be accessed and downloaded from the Jenkins build page.
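
In a declarative pipeline, a sketch of step 2 might look like this (the report paths are assumptions):

post {
    always {
        junit 'target/surefire-reports/*.xml'   // publish JUnit XML results
        archiveArtifacts artifacts: 'target/reports/**', allowEmptyArchive: true   // keep other reports
    }
}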

28. If 5 pods are running, how do you scale the number of pods to 10 using the command line in Kubernetes?

To scale the number of pods from 5 to 10, use the following command:

kubectl scale --replicas=10 deployment/<your-deployment-name>

29. Can you explain the usage of the terraform import command?

The terraform import command is used to import existing infrastructure resources into Terraform's state file, allowing Terraform to manage them.

This is particularly useful when you want to bring existing infrastructure under Terraform management without having to recreate resources.

Syntax: terraform import <resource_type>.<resource_name> <resource_id>
Example: terraform import aws_instance.my_instance i-1234567890abcdef0

30. In AWS, where do you store the state file, and how do you manage it?

In AWS, Terraform state files are typically stored in an S3 bucket, and the state can be managed using a combination of S3 and DynamoDB for locking.

Example Configuration:

terraform {
  backend "s3" {
    bucket         = "my-terraform-state-bucket"
    key            = "path/to/my/terraform.tfstate"
    region         = "us-west-2"
    dynamodb_table = "terraform-lock-table"
    encrypt        = true
  }
}

Management

  • S3: Stores the state file securely, supports versioning, and can be encrypted.
  • DynamoDB: Provides locking to prevent multiple simultaneous operations on the same state file, avoiding conflicts.

This setup ensures that your Terraform state is stored securely and is protected from simultaneous access issues.

31. What is the biggest issue you have faced with Terraform, and how did you resolve it?

One significant issue I faced was managing Terraform state when multiple teams were working on the same infrastructure.

This was resolved by organizing the infrastructure into separate modules and using remote state management with proper locking mechanisms.

32. What are the types of storage classes in AWS S3?

In AWS S3, the different storage classes include:

  • S3 Standard
  • S3 Intelligent-Tiering
  • S3 Standard-IA (Infrequent Access)
  • S3 One Zone-IA
  • S3 Glacier
  • S3 Glacier Deep Archive

33. Are you familiar with lifecycle management in S3 buckets? How do you set up lifecycle policies?

Yes, lifecycle management in S3 allows you to define rules to transition objects between different storage classes or delete them after a certain period.

Lifecycle policies can be set up using the S3 Management Console, the AWS CLI (as a JSON rule document), or Terraform, by specifying the transition and expiration actions; a Terraform sketch follows.
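
An illustrative Terraform rule (bucket reference, timings, and storage class are assumptions) that archives objects after 90 days and deletes them after a year, using the AWS provider v4+ resource:

resource "aws_s3_bucket_lifecycle_configuration" "logs" {
  bucket = aws_s3_bucket.logs.id   # hypothetical bucket managed elsewhere

  rule {
    id     = "archive-then-expire"
    status = "Enabled"

    filter {}   # empty filter: apply to all objects

    transition {
      days          = 90
      storage_class = "GLACIER"
    }

    expiration {
      days = 365
    }
  }
}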

34. What are the differences between load balancers, and why do we need them?

Load balancers distribute incoming network traffic across multiple servers. The main types are:

  • Application Load Balancer (ALB): Operates at the application layer (Layer 7), and is used for HTTP/HTTPS traffic with advanced routing capabilities.
  • Network Load Balancer (NLB): Operates at the transport layer (Layer 4) and is used for ultra-low latency TCP/UDP traffic.
  • Classic Load Balancer (CLB): Supports both Layer 4 and Layer 7, but is now mostly deprecated in favor of ALB and NLB.

Load balancers improve fault tolerance and scalability, and they ensure high availability.

35. Have you worked with Auto Scaling Groups (ASG)?

Yes, I have worked with ASGs to automatically scale the number of instances in response to demand.

ASGs are configured with policies that adjust the desired capacity based on metrics such as CPU utilization, helping to maintain application performance and optimize costs.

36. Can you write a simple Dockerfile?

Yes, here is a simple example of a Dockerfile for a Node.js application:

Dockerfile

FROM node:14
WORKDIR /app
COPY package.json ./
RUN npm install
COPY . .
EXPOSE 3000
CMD ["npm", "start"]

37. If you want to expose your application to the public internet or access your application within a cluster, how would you do that in Kubernetes?

To expose your application to the public internet, you can use a Kubernetes Service of type LoadBalancer or NodePort.

For internal access within the cluster, you can use a ClusterIP service. Additionally, you might use an Ingress controller for more advanced routing.
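
An illustrative Service manifest (name, labels, and ports are assumptions); switching type to ClusterIP would make it internal-only:

apiVersion: v1
kind: Service
metadata:
  name: my-app
spec:
  type: LoadBalancer   # provisions a cloud load balancer for public access
  selector:
    app: my-app
  ports:
    - port: 80
      targetPort: 8080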

38. Why do we need a ConfigMap in Kubernetes?

A ConfigMap is used to store non-confidential configuration data in key-value pairs.

It allows you to decouple configuration artifacts from image content, enabling you to modify application settings without rebuilding your container images.
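
A minimal sketch (keys and values are illustrative):

apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  LOG_LEVEL: "info"
  FEATURE_FLAG: "true"

In the pod spec, envFrom with a configMapRef pointing at app-config loads every key as an environment variable; alternatively, the ConfigMap can be mounted as files in a volume.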

39. Which AWS services do you consider when setting up a CI/CD pipeline for a microservices application?

For a CI/CD pipeline, I would consider using:

  • AWS CodeCommit for source control.
  • AWS CodeBuild for building and testing.
  • AWS CodeDeploy or Amazon EKS for deployment.
  • AWS CodePipeline to orchestrate the CI/CD process.
  • Amazon S3 for artifact storage.
  • AWS Lambda for any custom automation tasks.

40. On a day with unusually high traffic for an e-commerce application, how would you, as a cloud engineer, manage the current setup to handle the load smoothly?

I would ensure that Auto Scaling Groups are properly configured to handle the increased demand by automatically adding more instances.

I'd also verify that the load balancers are evenly distributing traffic and consider enabling caching (e.g., using Amazon CloudFront) to reduce the load on backend servers. Monitoring tools like CloudWatch would be used to track performance metrics and adjust resources in real time.

Strategies:

  • Auto Scaling: Scale up instances automatically to handle the increased load.
  • Caching: Use a caching layer to reduce the load on your application servers.
  • Load Balancing: Distribute traffic evenly across available instances.
  • Database Optimization: Ensure your database is properly configured and optimized for performance.
  • Monitoring: Closely monitor system metrics to identify bottlenecks and adjust resources accordingly.

41. If traffic is currently handled on a single instance, how would you upgrade for high availability in AWS?

To upgrade for high availability, I would:

  • Deploy multiple instances across different Availability Zones (AZs) using an Auto Scaling Group.
  • Set up a Load Balancer (ALB or NLB) to distribute traffic across these instances.
  • Configure health checks to ensure traffic is only routed to healthy instances.
  • Use Multi-AZ deployments for databases like RDS to ensure data availability.

42. When auto-scaling instances, how do you manage the backend RDS database?

To manage the backend RDS database during auto-scaling:

  • Enable Multi-AZ for high availability and automatic failover.
  • Use RDS Read Replicas to handle read-heavy traffic, reducing the load on the primary database.
  • Scale RDS vertically (instance size) or horizontally (read replicas) based on the database workload.
  • Monitor performance using Amazon CloudWatch and adjust as necessary.

43. Have you ever set up cross-account access for S3? For example, if the QA team needs access to a production bucket.

Yes, I've set up cross-account access by:

  • Creating an IAM role in the production account with the necessary S3 permissions.
  • Establishing a trust relationship to allow the QA account to assume that role.
  • Using S3 bucket policies to grant access from the QA account.
  • QA team members can then assume the role using AWS STS (Security Token Service) to access the production S3 bucket.

44. How can an application in Account A access an S3 bucket in Account B?

Account A can access Account B’s S3 bucket by:

  • Setting up a bucket policy in Account B that grants Account A the necessary permissions.
  • Creating an IAM role in Account B with permissions for S3 and allowing Account A to assume that role via a trust policy.
  • Using AWS STS to assume the role from Account A and access the S3 bucket in Account B.
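
An illustrative bucket policy for Account B's bucket (account ID and bucket name are placeholders) granting read access to Account A:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowAccountARead",
      "Effect": "Allow",
      "Principal": { "AWS": "arn:aws:iam::111111111111:root" },
      "Action": ["s3:GetObject", "s3:ListBucket"],
      "Resource": [
        "arn:aws:s3:::account-b-bucket",
        "arn:aws:s3:::account-b-bucket/*"
      ]
    }
  ]
}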

45. Can you differentiate between IAM policies and IAM roles?

IAM Policies: These are sets of permissions attached to users, groups, or roles, defining what actions are allowed or denied.

IAM Roles: These are identities with specific permissions that can be assumed by entities like users, applications, or AWS services. Roles are often used for temporary access or cross-account access.

46. Can you explain the STS assume role policy?

The STS (Security Token Service) AssumeRole policy allows a user or service to assume a role in a different account or within the same account.

This provides temporary security credentials with the permissions associated with the assumed role, enabling cross-account access or delegation of permissions.

47. Have you experienced any challenging issues or incidents in your project? How did you and your team identify and resolve them?

Yes, one challenge was a sudden traffic spike causing performance degradation.

We identified the issue using CloudWatch metrics and logs, pinpointing the bottleneck in the database.

The resolution involved scaling the database vertically and adding read replicas to distribute the load, along with optimizing slow-running queries.

48. What is the difference between CMD and ENTRYPOINT in Docker?

  • CMD: Provides default arguments for the entrypoint or the command to run if no other command is provided.
  • ENTRYPOINT: Defines the executable that will always run, with CMD as its default arguments. ENTRYPOINT is useful when you want your container to behave like a specific executable.
  • Example: ENTRYPOINT ["python", "app.py"] ensures app.py is always executed, while CMD allows passing different arguments.
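
A small sketch of the interplay (image and script are hypothetical):

FROM python:3.12-slim
WORKDIR /app
COPY app.py .
ENTRYPOINT ["python", "app.py"]   # always runs
CMD ["--port", "8000"]            # default args, overridable at docker run

Running docker run image executes python app.py --port 8000, while docker run image --port 9000 replaces only the CMD part.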

49. Have you ever managed an application single-handedly?

Yes, I have managed applications single-handedly, handling tasks such as deployment, monitoring, troubleshooting, and scaling.

This involved setting up the infrastructure, CI/CD pipelines, and ensuring high availability and security.

50. What are the benefits of Infrastructure as Code (IaC)?

  • Consistency: Ensures consistent environments by automating the provisioning process.
  • Version Control: Infrastructure can be versioned and tracked, enabling rollbacks and audits.
  • Automation: Reduces manual intervention, minimizing errors and speeding up deployments.
  • Scalability: Allows easy scaling and management of resources through scripts.
  • Collaboration: Teams can collaborate more effectively using code reviews and version control systems.

51. What are the different ways to create infrastructure as code?

  • Terraform: A popular open-source tool that allows you to define infrastructure as code using a declarative configuration language. It works with various cloud providers.
  • AWS CloudFormation: A service provided by AWS that enables you to define AWS resources using JSON or YAML templates.
  • Azure Resource Manager (ARM) Templates: Azure's solution for infrastructure as code, allowing you to define resources in JSON.
  • Google Cloud Deployment Manager: Google's infrastructure as code tool that uses YAML to define resources.
  • Ansible: Although primarily a configuration management tool, it can also be used to provision infrastructure using playbooks written in YAML.
  • Pulumi: A modern infrastructure as code tool that supports multiple programming languages like Python, JavaScript, and Go.

52. What is the difference between public and private networking?

  • Public Networking: Refers to networks that are accessible from the internet. Public IP addresses are used, and resources are exposed to external access.
  • Private Networking: Refers to networks that are isolated from the public internet. Resources within a private network communicate with each other securely using private IP addresses, and access from outside is typically restricted.

53. What is a Docker registry and why do we need it?

  • Docker Registry: A storage and distribution system for Docker images. It allows you to store, share, and manage Docker container images. Docker Hub is a popular public registry, but you can also set up private registries.
  • Why We Need It: Docker registries allow teams to version, share, and deploy Docker images easily. They support CI/CD pipelines by enabling automated builds and deployments.

54. What is a secrets manager?

  • Secrets Manager: A tool or service that securely stores and manages sensitive information such as API keys, passwords, certificates, and tokens. Examples include AWS Secrets Manager, HashiCorp Vault, and Azure Key Vault.
  • Purpose: To securely store and access secrets without hardcoding them into application code or configuration files.

55. What is the secure way to manage sensitive information?

  • Use a Secrets Manager: Store secrets in a dedicated service like AWS Secrets Manager, Azure Key Vault, or HashiCorp Vault.
  • Environment Variables: Use environment variables to inject secrets at runtime rather than storing them in code.
  • Encrypted Storage: Store sensitive data in encrypted databases or files, ensuring that encryption keys are managed securely.
  • Access Control: Implement strict access controls and auditing to ensure that only authorized personnel and applications can access sensitive information.

56. Have you worked with Kubernetes (K8s)?

If you have experience with Kubernetes, you might discuss:

  • Deploying and managing containerized applications using Kubernetes.
  • Configuring Kubernetes clusters and using tools like kubectl.
  • Managing services, ingress, and networking in Kubernetes.
  • Using Helm charts for packaging and deploying applications.

57. What is the difference between Docker and Kubernetes?

  • Docker: A platform for developing, shipping, and running applications inside containers. It simplifies the process of managing application dependencies and environment consistency.
  • Kubernetes: An open-source orchestration system for automating the deployment, scaling, and management of containerized applications. Kubernetes manages multiple containers across a cluster, providing features like load balancing, scaling, and self-healing.

58. Can you explain an end-to-end deployment for an application?

An end-to-end deployment process might involve the following steps:

  1. Code Commit: Developers push code changes to a version control system like Git.
  2. CI/CD Pipeline: A continuous integration pipeline builds the code, runs tests, and creates a Docker image.
  3. Image Storage: The Docker image is pushed to a Docker registry.
  4. Deployment: The image is pulled from the registry by Kubernetes, Docker Swarm, or another orchestration tool, and deployed to a staging or production environment.
  5. Monitoring & Logging: The application is monitored for performance and errors, with logs collected and analyzed.
  6. Scaling & Updates: The application is scaled based on demand, and updates are rolled out using a strategy like blue-green or canary deployment.

59. If you want to use Kubernetes instead of EC2 instances, how would you do it? Have you used Helm charts or other CD tools? How would you handle a project with multiple microservices on Kubernetes?

  • Using Kubernetes Instead of EC2: You would deploy your applications on a Kubernetes cluster rather than directly on EC2 instances. This involves setting up an EKS (Elastic Kubernetes Service) cluster in AWS or using another managed Kubernetes service.
  • Helm Charts: Helm is a package manager for Kubernetes that helps you manage Kubernetes applications. You can use Helm charts to deploy and manage multiple microservices in a consistent and repeatable manner.
  • Handling Multiple Microservices: Use Kubernetes namespaces to isolate microservices, and manage their deployment using Helm charts or a CI/CD tool like Jenkins, ArgoCD, or GitLab CI/CD. Implement service discovery, networking, and security policies to ensure seamless communication between microservices.

60. How do you connect a bastion host to a private network? Can you explain VPC and VPC peering?

Connecting a Bastion Host: A bastion host is typically set up in a public subnet of a Virtual Private Cloud (VPC) with access to the private network.

Users connect to the bastion host via SSH, and from there, they can access resources in the private subnet.

VPC (Virtual Private Cloud): A logically isolated section of a cloud provider's network where you can launch and manage resources. It allows you to define IP ranges, subnets, route tables, and network gateways.

VPC Peering: A network connection between two VPCs that allows traffic to be routed between them using private IP addresses. This is useful for connecting resources across different VPCs without going over the public internet.

61. Have you configured a system where code is automatically merged and published upon a developer completing a ticket in Jira? What exactly have you managed?

Yes, I have set up a CI/CD pipeline where code is automatically merged and published once a developer completes a ticket in Jira. This process typically involves:

  • Integration with Jira: Configuring Jira to trigger CI/CD pipelines when a ticket is marked as "Done" or moved to a specific workflow stage.
  • Code Merging: Using tools like GitLab CI, GitHub Actions, or Jenkins to automatically merge feature branches into the main branch after passing tests.
  • Automated Testing: Running unit, integration, and end-to-end tests to ensure the quality of the code before merging.
  • Deployment: Using deployment tools like Kubernetes, Helm, or Docker Swarm to automatically deploy the merged code to staging or production environments.

62. How do you set up Nginx on a server?

Setting up Nginx on a server involves:

1. Installation:

  • For Ubuntu/Debian: sudo apt-get update && sudo apt-get install nginx
  • For CentOS/RHEL: sudo yum install nginx

2. Configuration:

  • Edit the configuration file located at /etc/nginx/nginx.conf or individual site configurations in /etc/nginx/sites-available/.
  • Define server blocks (virtual hosts) to specify different sites and their root directories. Configure reverse proxy, load balancing, SSL/TLS certificates, and caching as needed.

3. Start and Enable Nginx:

  • Start the service: sudo systemctl start nginx
  • Enable it to start on boot: sudo systemctl enable nginx

4. Testing:

  • Test the configuration: sudo nginx -t
  • Ensure Nginx is running and properly serving content.
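
A minimal reverse-proxy server block sketch (domain and upstream port are assumptions), e.g. in /etc/nginx/sites-available/myapp:

server {
    listen 80;
    server_name example.com;

    location / {
        proxy_pass http://127.0.0.1:3000;   # hypothetical backend app
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}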

63. What is a load balancer and its benefits? What is Cloud NAT?

Load Balancer: A load balancer distributes incoming network traffic across multiple servers or services to ensure reliability, scalability, and high availability. Benefits include:

  • Increased Fault Tolerance: Distributes traffic to prevent overload on a single server.
  • Scalability: Easily manage increased traffic by adding more servers.
  • Improved Performance: Balances load based on performance metrics, reducing latency.

Cloud NAT: Network Address Translation (NAT) service in cloud environments like Google Cloud.

It allows instances in private subnets to connect to the internet without exposing them to inbound internet traffic, maintaining security while enabling outbound connectivity.

64. What is the difference between a load balancer and a Cloud NAT gateway?

Load Balancer:

  • Distributes incoming traffic across multiple servers or services.
  • Primarily used for load distribution, redundancy, and high availability.
  • Works at various layers (L4 for TCP, L7 for HTTP/HTTPS).

Cloud NAT Gateway:

  • Provides outbound internet access for instances in a private network without exposing them to inbound traffic.
  • Used for secure, private instances that need internet access without being directly accessible from the internet.

65. How do you see yourself fitting into this particular role?

I see myself fitting into this role by leveraging my technical expertise in infrastructure management, automation, and cloud technologies.

My experience in setting up CI/CD pipelines, managing deployments, and optimizing resource allocation aligns with the responsibilities of this role. I also bring problem-solving skills and a proactive approach to ensuring system reliability, which will contribute to the success of the team and organization.

66. Have you worked on optimizing cloud costs? What did you do?

In a previous project, I noticed that our cloud infrastructure was over-provisioned, leading to unnecessary costs.

I implemented auto-scaling based on actual usage metrics and utilized spot instances for non-critical workloads.

Additionally, I restructured the storage solution by moving infrequently accessed data to lower-cost storage classes.

These changes resulted in a significant reduction in our monthly cloud expenses without compromising performance.

67. Describe a situation where the entire production instance crashed, and you had to fix it quickly. Have you experienced such a scenario?

Yes, I have experienced such a scenario. In one instance, our production server crashed due to a memory leak in the application.

I quickly identified the issue using monitoring tools like Prometheus and logs from ELK Stack.

To resolve it, I restarted the affected services and temporarily scaled up the infrastructure to handle the load. I then worked with the development team to identify and fix the memory leak, ensuring it didn't happen again.

68. What is blue-green deployment and why is it needed?

Blue-Green Deployment:

  • A deployment strategy where two identical environments (Blue and Green) are maintained.
  • The Blue environment is the active production environment, while the Green is the idle one. During deployment, the new version is deployed to the Green environment.
  • After testing, traffic is switched to Green, making it the new production environment.

Why Needed:

  • Minimal Downtime: Reduces downtime as the switch between environments is instantaneous.
  • Easy Rollback: If issues arise, switching back to the Blue environment is straightforward.
  • Improved Reliability: Reduces the risk of deployment failures affecting users.

69. What other deployment strategies do you know?

  • Canary Deployment: Gradually rolling out the new version to a small subset of users before a full deployment.
  • Rolling Deployment: Incrementally updating instances or servers with the new version, ensuring at least some instances are always running the old version.
  • A/B Testing: Similar to blue-green, but used for comparing different versions/features with live user traffic to determine which performs better.
  • Feature Toggles: Allows features to be turned on/off dynamically, enabling deployment of incomplete features without impacting the user.

70. What advanced AWS resource types have you worked with and utilized?

  • AWS Lambda: Serverless computing for running code in response to events without managing servers.
  • AWS Fargate: Serverless compute engine for containers, allowing the deployment of containerized applications without managing the underlying infrastructure.
  • Amazon RDS: Managed database service for relational databases, including features like automated backups, scaling, and multi-AZ deployments.
  • AWS CloudFormation: Infrastructure as Code (IaC) tool for automating resource provisioning and management.
  • Amazon VPC Peering and Transit Gateway: For creating complex network architectures across multiple VPCs and accounts.
  • AWS Step Functions: Orchestration service for combining AWS Lambda functions and other services into serverless workflows.

71. How are hosted modules (like AI/ML) deployed, customized, and scaled as per different frontend/backend requirements in AWS with the help of a DevOps engineer?

Deploying, Customizing, and Scaling Hosted Modules in AWS

Deployment: DevOps engineers use tools like AWS CodePipeline to automate the deployment of hosted AI/ML modules. This involves:

  • Containerization: Packaging the modules into Docker containers for portability.
  • Infrastructure Provisioning: Using IaC (e.g., CloudFormation) to set up the necessary AWS resources (EC2 instances, S3 buckets, etc.).
  • Deployment Automation: Using tools like AWS CodeDeploy to deploy the containers to the provisioned infrastructure.

Customization: Customization often involves:

  • Configuration Management: Using tools like Ansible or Puppet to configure the modules based on specific requirements.
  • Environment Variables: Using environment variables to inject different configurations for different environments (development, testing, production).

Scaling: DevOps engineers leverage AWS services like:

  • Auto Scaling Groups: Automatically adjust the number of instances based on load.
  • Elastic Load Balancing: Distribute traffic across multiple instances for high availability.
  • Serverless Computing: Using services like AWS Lambda to scale AI/ML workloads dynamically.

72. Can you describe a technology you had not heard of before but managed to learn and use on your own?

Learning a New Technology

Example: I recently learned about Apache Kafka, a distributed streaming platform. I was initially unfamiliar with its concepts but was intrigued by its potential for real-time data processing.

Learning Process: I started by reading documentation, watching tutorials, and experimenting with Kafka on my own. I also joined online communities and forums to connect with other users and learn from their experiences.

Application: I successfully implemented Kafka in a project to handle high-volume event streams, improving data processing efficiency and real-time insights.

73. What challenges have you faced as a DevOps engineer?

DevOps Engineer Challenges

  • Complexity: Managing complex cloud environments with multiple services and dependencies.
  • Automation: Finding the right balance between automation and manual intervention.
  • Security: Ensuring the security of cloud infrastructure and applications.
  • Collaboration: Working effectively with developers, operations teams, and other stakeholders.
  • Continuous Learning: Keeping up with the rapid pace of change in the cloud computing landscape.

76. What strategy do you use to ensure no data loss, especially in critical applications like banking?

Data Loss Prevention Strategy

Strategy: For critical applications like banking, a multi-layered approach is essential:

  • Redundancy: Use multiple data centers or cloud regions for replication and failover.
  • Backups: Implement frequent and automated backups to multiple locations.
  • Version Control: Track changes to data and maintain historical versions.
  • Monitoring: Monitor database health and performance to detect potential issues early.
  • Disaster Recovery Plan: Develop a comprehensive disaster recovery plan to restore data and services in case of an outage.

77. Have you faced any cyberattacks on systems you built and implemented? What precautions do you take?

Cyberattacks and Precautions

Experience: I've encountered security incidents like brute-force attacks and attempts to exploit vulnerabilities.

Precautions:

  • Security Best Practices: Implement strong passwords, multi-factor authentication, and least privilege access.
  • Vulnerability Scanning: Regularly scan systems for vulnerabilities and patch them promptly.
  • Security Monitoring: Use security information and event management (SIEM) tools to monitor for suspicious activity.
  • Incident Response Plan: Develop a plan to respond to security incidents effectively.

78. What are the networking setup rules you follow?

Networking Setup Rules

Rules:

  • Security Groups: Use security groups to control inbound and outbound traffic to instances.
  • Network Segmentation: Divide the network into smaller segments to isolate resources and limit the impact of security breaches.
  • Firewall Rules: Implement firewall rules to block unauthorized access.
  • VPN and Tunneling: Use VPNs and tunnels to secure communication between networks.
  • Network Monitoring: Monitor network traffic and performance to identify potential issues.

79. What are your daily responsibilities as a DevOps engineer?

Daily Responsibilities

  • Monitoring: Monitor system health, performance, and security.
  • Automation: Automate tasks like deployments, infrastructure provisioning, and backups.
  • Troubleshooting: Resolve issues and incidents.
  • Collaboration: Work with developers, operations teams, and other stakeholders.
  • Continuous Improvement: Identify areas for improvement and implement changes to enhance efficiency and reliability.

80. Which DevOps tools are you proficient with?

DevOps Tools Proficiency

  • Infrastructure as Code: Terraform, CloudFormation, Ansible, Puppet.
  • Containerization: Docker, Kubernetes.
  • CI/CD: Jenkins, GitLab CI/CD, AWS CodePipeline.
  • Monitoring and Logging: Prometheus, Grafana, ELK Stack.
  • Configuration Management: Ansible, Puppet, Chef.
  • Version Control: Git.

81. Can you describe the CI/CD workflow in your project?

CI/CD Workflow Description

Example: In a recent project, our CI/CD workflow involved:

  1. Code Commit: Developers commit code changes to a Git repository.
  2. Build and Test: A CI server (e.g., Jenkins, GitLab CI) automatically builds the application, runs unit tests, and performs code quality checks.
  3. Artifact Storage: Successful builds are stored as artifacts in a repository (e.g., S3).
  4. Deployment: The CD server (e.g., AWS CodeDeploy) deploys the artifact to the target environment (development, testing, production).
  5. Monitoring: Continuous monitoring tools (e.g., Prometheus, Grafana) track application health and performance.

82. How do you handle the continuous delivery (CD) aspect in your projects?

Continuous Delivery (CD) Handling

Methods:

  • Automated Deployments: Use tools like AWS CodeDeploy to automate deployments to different environments.
  • Blue-Green Deployments: Deploy new versions of the application alongside the existing version, allowing for seamless switchover.
  • Canary Deployments: Gradually roll out new versions to a small subset of users, monitoring for issues before full deployment.
  • Feature Flags: Use feature flags to enable or disable features in the application without code changes, allowing for controlled releases.

83. What methods do you use to check for code vulnerabilities?

Code Vulnerability Checks

Methods:

  • Static Code Analysis: Use tools like SonarQube or Snyk to analyze code for vulnerabilities and security issues.
  • Dynamic Code Analysis: Use tools like Burp Suite or ZAP to test the application in runtime for vulnerabilities.
  • Security Scanning: Use tools like Amazon Inspector or Qualys to scan infrastructure and applications for vulnerabilities.

84. What AWS services are you proficient in?

AWS Service Proficiency

Services:

  • Compute: EC2, Lambda, ECS, EKS.
  • Storage: S3, EBS, EFS.
  • Networking: VPC, Route 53, Load Balancers.
  • Database: RDS, DynamoDB, Redshift.
  • CI/CD: CodePipeline, CodeBuild, CodeDeploy.
  • Security: IAM, Security Groups, KMS.
  • Monitoring: CloudWatch.

85. How would you access data in an S3 bucket from Account A when your application is running on an EC2 instance in Account B?

Accessing S3 from Account B

Method: Use IAM roles and cross-account permissions:

  1. Create Role: In Account A, create an IAM role with permissions to access the S3 bucket.
  2. Assume Role: In Account B, configure the EC2 instance to assume the role created in Account A.
  3. Access S3: The EC2 instance can now access the S3 bucket using the assumed role's credentials.
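
A sketch of steps 2-3 from the instance in Account B (role name and bucket are placeholders):

# Assume the role defined in Account A
aws sts assume-role \
  --role-arn arn:aws:iam::ACCOUNT_A_ID:role/s3-read-role \
  --role-session-name cross-account-s3

# After exporting the returned temporary credentials:
aws s3 ls s3://account-a-bucket/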

86. How do you provide access to an S3 bucket, and what permissions need to be set on the bucket side?

S3 Bucket Access and Permissions

Access: You can provide access to an S3 bucket using:

  • IAM Policies: Attach policies to users, groups, or roles to grant specific permissions.
  • Bucket Policies: Define access control rules directly on the bucket.

Permissions: Common permissions include:

  • Read: Allows users to read objects from the bucket.
  • Write: Allows users to write objects to the bucket.
  • Delete: Allows users to delete objects from the bucket.
  • List: Allows users to list objects in the bucket.

87. How can Instance 2, with a static IP, communicate with Instance 1, which is in a private subnet and mapped to a multi-AZ load balancer?

Instance Communication with Private Subnet

Method: Use a NAT gateway or a bastion host:

  • NAT Gateway: A managed service that provides outbound internet access for instances in a private subnet.
  • Bastion Host: A secure server in a public subnet that allows access to private instances.

88. For an EC2 instance in a private subnet, how can it verify and download required packages from the internet without using a NAT gateway or bastion host? Are there any other AWS services that can facilitate this?

EC2 Instance in Private Subnet Accessing Internet

Method: Use VPC endpoints (AWS PrivateLink):

  1. Gateway Endpoint for S3: Amazon Linux package repositories are hosted in S3, so a gateway endpoint lets the instance download packages without any internet path.
  2. Interface Endpoints: Create interface endpoints for other AWS services the instance needs (e.g., SSM, ECR).
  3. Route Table Association: Associate the gateway endpoint with the private subnet's route table so traffic to the service stays on the AWS network.

89. What is the typical latency for a load balancer, and if you encounter high latency, what monitoring steps would you take?

Load Balancer Latency and Monitoring

Typical Latency: Load balancer latency typically ranges from a few milliseconds to a few hundred milliseconds, depending on factors like network conditions and load.

Monitoring Steps:

  • CloudWatch Metrics: Monitor load balancer latency using CloudWatch metrics.
  • Network Tracing: Use tools like AWS X-Ray to trace requests through the load balancer and identify bottlenecks.
  • Performance Testing: Run load tests to simulate real-world traffic and identify performance issues.

90. If your application is hosted in S3 and users are in different geographic locations, how can you reduce latency?

Reducing Latency for S3 Hosted Application

Methods:

  • Edge Locations: Use AWS CloudFront to cache content at edge locations closer to users, reducing latency.
  • Regional Buckets: Store data in S3 buckets in the same region as the users, minimizing network hops.
  • Content Delivery Networks (CDNs): Use a CDN to distribute content across multiple locations, reducing latency for users worldwide.

91. Which services can be integrated with a CDN (Content Delivery Network)?

CDN Integration

Services: CDNs can integrate with various services, including:

  • Web Servers: Apache, Nginx, IIS.
  • Content Management Systems (CMS): WordPress, Drupal, Joomla.
  • Cloud Storage: AWS S3, Google Cloud Storage, Azure Blob Storage.
  • Streaming Platforms: video-on-demand and live-streaming origins (the approach used at scale by services like Netflix, YouTube, and Twitch).
  • API Gateways: AWS API Gateway, Google Cloud Endpoints.

92. How do you dynamically retrieve VPC details from AWS to create an EC2 instance using IaC?

Dynamically Retrieving VPC Details

Method: Use Terraform's data sources:

  1. aws_vpc Data Source: Retrieve details of a specific VPC by its ID or name.
  2. aws_subnet Data Source: Retrieve details of subnets within a VPC.
  3. aws_security_group Data Source: Retrieve details of security groups associated with a VPC.
  4. aws_instance Data Source: Retrieve details of existing EC2 instances within a VPC.
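
A sketch combining the first two (tag values and AMI ID are placeholders):

data "aws_vpc" "main" {
  tags = { Name = "main-vpc" }
}

data "aws_subnet" "app" {
  vpc_id = data.aws_vpc.main.id
  tags   = { Tier = "private" }
}

resource "aws_instance" "app" {
  ami           = "ami-1234567890abcdef0"   # placeholder AMI ID
  instance_type = "t3.micro"
  subnet_id     = data.aws_subnet.app.id
}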

93. How do you manage unmanaged resources in Terraform?

Approach: Use the terraform import command to bring existing unmanaged resources under Terraform's control. This allows you to manage them alongside your IaC code.

94. How do you pass arguments to the VPC during import?

Method: Use the -var flag with the terraform import command to pass variable values:

terraform import -var="cidr_block=10.0.0.0/16" aws_vpc.example vpc-1234567890abcdef0

95. What is the master-slave architecture in Jenkins?

A master-slave architecture in Jenkins allows you to distribute build tasks across multiple nodes (slaves). This provides:

  • Parallel Execution: Run builds concurrently on multiple nodes, reducing build times.
  • Resource Optimization: Utilize different hardware configurations for different build tasks.
  • Scalability: Scale your Jenkins infrastructure by adding more slave nodes.

96. How do you integrate LDAP with AWS and Jenkins?

A: You can integrate LDAP with AWS and Jenkins by:

  1. Configuring LDAP: Set up an LDAP server and configure it to authenticate users.
  2. AWS IAM: Create an IAM role with permissions to access the LDAP server.
  3. Jenkins Configuration: Configure Jenkins to use the LDAP server for authentication.

97. What are some key features of GitHub?

  • Version Control: Track changes to code over time.
  • Collaboration: Facilitate collaboration among developers.
  • Pull Requests: Enable code reviews and approvals.
  • Issues: Track bugs, feature requests, and other tasks.
  • Projects: Organize and manage work items.

98. What are some key features of Jenkins?

  • Continuous Integration (CI): Automate build, test, and deployment processes.
  • Continuous Delivery (CD): Deploy applications to different environments.
  • Pipeline as Code: Define pipelines using code (Jenkinsfile).
  • Plugins: Extend Jenkins functionality with a wide range of plugins.
  • Master-Slave Architecture: Distribute build tasks across multiple nodes.

99. What are the benefits and uses of CI/CD?

  • Faster Delivery: Reduce the time it takes to deliver new features and updates.
  • Improved Quality: Catch bugs and errors early in the development process.
  • Increased Efficiency: Automate repetitive tasks, freeing up developers to focus on innovation.
  • Reduced Risk: Deploy changes more frequently and with less risk.

Uses of CI/CD:

  • Software Development: Automate build, test, and deployment processes.
  • Infrastructure Management: Provision and configure infrastructure automatically.
  • Data Pipelines: Automate data processing and analysis tasks.

100. What is a GitHub workflow, and how is it used?

A: A GitHub workflow is a set of automated tasks that are triggered by events in a GitHub repository. You use workflows to:

  • Build and Test Code: Automate build and test processes.
  • Deploy Applications: Deploy applications to different environments.
  • Run Code Analysis: Perform code quality checks and security scans.
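
A minimal workflow sketch stored at .github/workflows/ci.yml (the test command is an assumption):

name: CI
on: [push, pull_request]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run tests
        run: ./mvnw test   # hypothetical build/test command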

101. How do you handle merge conflicts in Git?

  1. Identifying Conflicts: Git will identify conflicts when merging branches.
  2. Resolving Conflicts: Manually resolve conflicts by editing the affected files.
  3. Staging Changes: Stage the resolved files using git add.
  4. Committing Changes: Commit the changes with a descriptive message using git commit.

102. What steps do you take when a build fails in Jenkins?

A: When a build fails in Jenkins, you should:

  1. Analyze Logs: Review the build logs to identify the cause of the failure.
  2. Troubleshoot: Investigate the issue and try to resolve it.
  3. Fix Code: If the failure is due to a code error, fix the code and rebuild.
  4. Update Configuration: If the failure is due to a configuration issue, update the Jenkins configuration.
  5. Rollback: If necessary, roll back to a previous working version of the application.

103. How do you execute jobs in AWS?

A: You can execute jobs in AWS using:

  • AWS Batch: A fully managed batch computing service for running large-scale, compute-intensive jobs.
  • AWS Lambda: A serverless computing service for running code in response to events.
  • AWS ECS: A container orchestration service for deploying and managing containerized applications.

104. What are Ansible roles, and how do you use them?

A: Ansible roles are a way to organize and reuse Ansible playbooks.

They encapsulate tasks, variables, and dependencies related to a specific component or service. You use roles to:

  • Modularize Playbooks: Break down complex playbooks into smaller, reusable units.
  • Simplify Deployment: Deploy multiple components or services with a single role.
  • Improve Maintainability: Make it easier to manage and update Ansible configurations.

105. How do you ensure data persistence with Docker volumes?

A: Use Docker volumes to persist data outside the container:

  • Named Volumes: Docker-managed volumes with an explicit name that can be shared between containers.
  • Anonymous Volumes: Docker-managed volumes created implicitly for a single container.
  • Bind Mounts: Mount directories from the host machine into the container.
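
For example (volume, image, and path names are illustrative):

# Named volume: survives container removal
docker volume create app-data
docker run -d --name db -v app-data:/var/lib/mysql mysql:8

# Bind mount: map a host directory into the container
docker run -d --name web -v /srv/site:/usr/share/nginx/html nginx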

106. What are the key differences between Docker and Kubernetes?

  • Scope: Docker focuses on containerization, while Kubernetes focuses on container orchestration.
  • Management: Docker manages individual containers, while Kubernetes manages clusters of containers.
  • Scalability: Kubernetes provides advanced features for scaling and managing large-scale deployments.
  • Networking: Kubernetes offers more sophisticated networking capabilities for container communication.

107. How do you securely store credentials in GitHub?

A: You can securely store credentials in GitHub using:

  • GitHub Secrets: Use GitHub Secrets to store sensitive information securely.
  • Environment Variables: Set environment variables in your GitHub workflow to access credentials.
  • GitHub Actions: Use GitHub Actions to manage credentials and access them within your workflow.

108. Where is the Jenkinsfile typically stored?

A: The Jenkinsfile is typically stored in the root directory of your Git repository.

110. How do you execute a shell script within a Python script?

Use the subprocess module:

import subprocess

# Run the script and raise an exception if it exits with a non-zero status
subprocess.run(["/path/to/script.sh"], check=True)

111. How do you communicate with a Jenkins server and a Kubernetes cluster?

A: You can communicate with a Jenkins server and a Kubernetes cluster using:

  • Jenkins Plugins: Use Jenkins plugins like the Kubernetes plugin to interact with a Kubernetes cluster.
  • API Calls: Use the Kubernetes API to interact with the cluster programmatically.
  • kubectl: Use kubectl commands from within Jenkins to manage Kubernetes resources.

112. Do you only update Docker images in Kubernetes, or do you also update replicas, storage levels, and CPU allocation?

A: You can update various Kubernetes resources, including:

  • Docker Images: Update the container image used by a deployment.
  • Replicas: Scale up or down the number of replicas for a deployment.
  • Storage Levels: Adjust storage capacity for persistent volumes.
  • CPU Allocation: Modify CPU and memory resource requests and limits for containers.

113. Can you define environment variables inside your Jenkins pipeline?

A: Yes, you can define environment variables inside your Jenkins pipeline using:

  • Pipeline Script: Use the environment directive within your Jenkinsfile to define environment variables.
  • Global Variables: Configure global environment variables in Jenkins settings.
  • Credentials: Store sensitive information as credentials and access them within the pipeline.
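
A sketch of the environment directive (variable names and credential ID are assumptions):

pipeline {
    agent any
    environment {
        APP_ENV   = 'staging'                     // plain value
        API_TOKEN = credentials('api-token-id')   // pulled from the Jenkins credential store
    }
    stages {
        stage('Deploy') {
            steps {
                sh 'echo "Deploying to $APP_ENV"'
            }
        }
    }
}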

114. What is the role of artifacts in Jenkins, and why do we need to push them to Nexus instead of building and storing them locally?

A: Artifacts are the output of a build process, such as compiled code, container images, or test reports. Pushing artifacts to Nexus (a repository manager) provides:

  • Version Control: Track different versions of artifacts.
  • Dependency Management: Manage dependencies between projects.
  • Security: Control access to artifacts and ensure their integrity.

115. If you’re developing a Python-based application, how do you separate the packages needed for your local deployment to avoid interfering with globally installed packages?

A: Use virtual environments:

  1. Create Virtual Environment: Use python -m venv <env_name> to create a virtual environment.
  2. Activate Environment: Activate the environment using source <env_name>/bin/activate.
  3. Install Packages: Install packages specific to your project using pip install <package_name>.

116. What are the prerequisites before importing a VPC in Terraform?

A: Before importing a VPC, you need:

  • Terraform Configuration: A Terraform configuration file defining the VPC resource you want to import.
  • Resource ID: The unique identifier (ID) of the VPC in AWS.
  • Resource Type: The correct Terraform resource type (e.g., aws_vpc).

117. If an S3 bucket was created through Terraform but someone manually added a policy to it, how do you handle this situation using IaC?

A: You can use Terraform's terraform plan command to detect differences between the current state of the S3 bucket and your IaC configuration. This will highlight the manually added policy. You can then:

  • Update IaC: Modify your Terraform configuration to include the manually added policy, ensuring consistency.
  • Ignore: If the policy is acceptable, use the ignore_changes option in Terraform to exclude it from future plans.
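
A sketch of the second option (bucket name is a placeholder; this assumes a pre-v4 AWS provider, where policy was an in-line argument of aws_s3_bucket):

resource "aws_s3_bucket" "example" {
  bucket = "my-bucket"

  lifecycle {
    ignore_changes = [policy]   # leave the manually added policy alone
  }
}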