CloudPro #20: AWS SSL/TLS certificates expiring-update them asap, Okta hacked again, New AWS Coursera Certificate, Open source LLMs on GCloud

Get smarter about cloud and security

Packt

Oct 30, 2023

Welcome to a brand new edition of the CloudPro! Today, we’ll talk about:

Masterclass:

Secret Knowledge:

Techwave:

From the Cloud World:

HackHub:

Cheers,

Shreyans Singh

Editor-in-Chief

PS: I hope you will enjoy today's newsletter! I’m all ears for your thoughts – the good, the great, and the "meh." Share your feedback and snag a free Packt eBook. It's a win-win. Can't wait to hear what you think!

Share your feedback and get a free Packt eBook!

⭐ MasterClass: Tutorials & Guides

⭐How to deploy Mistral 7B Instruct on your Kubernetes cluster using TGI:

Add Helm Repository: Use `helm repo add substratusai https://substratusai.github.io/helm` to add the Helm repository.
Create Configuration File: Make a `values.yaml` file with deployment settings.
Deployment: Deploy Mistral 7B Instruct with `helm install`.
Check Pod Status: Verify the pod status with `kubectl describe pod`.
Check Logs: Ensure the model initializes correctly with `kubectl logs`.
Port Forwarding: Expose the model to your local machine with `kubectl port-forward`.
Access the Model: Explore the API endpoints in the TGI API documentation.
Run Inference: Use tools like `curl` to send JSON requests for content generation based on specific instructions.

⭐How to run 15,000+ tasks in a single Amazon ECS cluster:

Prepare your environment with sufficient private IPv4 addresses.
Configure AWS Fargate vCPU-based quotas to meet your scaling needs.
Amazon ECS services using AWS Cloud Map-based service discovery have a hard limit of 1,000 tasks.
An Amazon ECS cluster can run up to 5,000 services, with each service handling up to 5,000 tasks.
Consider AWS Application Load Balancer (ALB) quotas when scaling services associated with ALBs.
If your workload exceeds hard limits, consider a cell-based architecture.
These concepts mainly apply to AWS Fargate but have similar considerations for Amazon ECS with Amazon EC2.

⭐How to build a workflow orchestrator for Kubernetes:

Workflow Orchestration in Kubernetes:

The author created Flowmium, a workflow orchestrator for Kubernetes.
It's designed to manage and chain multiple tasks or jobs in a sequential manner.
These tasks can be interdependent, where one task relies on the output of another.
This is particularly valuable for creating data processing pipelines and multi-step computations.

Components:

Planner: Responsible for analyzing the flow, detecting dependencies, and planning task deployments.
Task: Each task gets converted into a Kubernetes Job manifest and is responsible for handling input, running a specific command, and producing output.
Scheduler: Manages flow status, tracks task progress, and schedules task deployments.
Executor: Ties everything together, handling requests, deploying tasks, and monitoring their status.

Framework for Python:

The author created a Python framework for defining tasks as simple functions with dependencies
Users define tasks as functions and their dependencies using decorators.

⭐How to manage multiple Kubernetes clusters:

SIG-Multicluster, a part of Kubernetes, deals with managing multiple Kubernetes clusters and the applications they host.
It aims to enable seamless communication between workloads in different clusters, share cluster metadata, and break down cluster boundaries.
Running multiple clusters is essential for reasons like optimizing location, achieving isolation for security and cost management, and enhancing reliability.
SIG-Multicluster focuses on defining three main APIs: About, Multicluster Services, and Work.
These APIs help identify clusters, share services across clusters, and deploy workloads across clusters.

⭐How to make the EKS cluster safe from attacks without affecting its regular use:

Use a dedicated VPC for each cluster to isolate workloads, reduce risks, and ensure address execution safety.
Implement VPC endpoints for secure and fast access to AWS services.
Secure the EKS API by choosing public, private, or both access options.
Use OIDC for workloads to access AWS services without access and secret keys.
Encrypt ETCD data, especially Kubernetes secrets, with a KMS key.
Collect and monitor logs, especially API server and audit logs, but be mindful of the cost.
Consider using Bottlerocket for efficient container workload hosting.

🔍Secret Knowledge: Learning Resources

🔍How to manage Kubernetes resources in Terraform: Kubernetes provider

Advantages of Terraform for Kubernetes:

Terraform simplifies managing Kubernetes resources.
It allows changes to infrastructure and application deployments in one commit.
Offers a quicker disaster recovery option using Terraform CLI.

Terraform Kubernetes Providers:

Kubernetes
Helm

Managing Kubernetes Deployment:

The blog provides an example of managing a Kubernetes Deployment using Terraform's Kubernetes provider.
It shows how to define a deployment using HCL, similar to Kubernetes YAML.

Installing Istio:

For complex installations like Istio, the blog suggests using tools like `tfk8s` to automate the YAML-to-HCL conversion.

⭐Observing Java Applications Running Via Docker Compose Using OpenTelemetry:

Get the OpenTelemetry agent and any extensions you need.
Make a file (e.g., `docker-compose.override.otel.yml`) to extend your original Compose file. Specify the service you want to instrument and set environment variables.
Start your application using both the original and override files to include observability.
You can then visualize data using tools like Jaeger, Prometheus, and Grafana.

⭐Container image scanning with Trivy in AWS CDK:

Trivy is a security tool for vulnerability testing.
Advantages: This AWS CDK construct lets you scan container images in your CDK deployment, avoiding unnecessary builds, and can stop the deployment if vulnerabilities are found.
How to Use: Install the construct and use the `ImageScannerWithTrivy` construct in your CDK code.
Options: You can specify various Trivy options for vulnerability scanning.
Stopping Image Push: The construct can prevent image pushes to Amazon Elastic Container Registry (ECR) if vulnerabilities are detected.

⭐What are Kubernetes Volume Snapshots:

A Kubernetes Volume Snapshot is like taking a picture of the stuff in your storage.
Setting Up a Kubernetes Cluster: You need a Kubernetes cluster with a CSI driver supporting snapshots. Digital Ocean clusters are a cost-effective choice.
Deploying a Stateful Workload: Demonstrated using a StatefulSet for deploying a MariaDB database with storage and root password.
Creating Test Data: Shows how to create data within the MariaDB.
Taking a Snapshot: Demonstrates creating a VolumeSnapshot to capture the database state.
Simulating Data Loss: Data loss is simulated by dropping the table in the database.
Restoring from a Snapshot: Explains how to restore data using a PersistentVolumeClaim referencing the snapshot.
Inspecting the Snapshot: Describes how to inspect the VolumeSnapshotContents to understand the relationship between snapshots and underlying storage.

⭐How to use Vector Databases(Pinecone, Chrome, Weviate) in Python:

Vector Databases are databases optimized for storing and searching high-dimensional vectors, which are often used in applications like search engines and recommendation systems. The article focuses on how to work with vector databases in Python in 2023. They list several libraries and databases, including Chroma, ClickHouse, MyScale, OpenSearch, pgVector, Pinecone, Qdrant, Redis, Vespa, Weaviate, and more.

⚡ TechWave: Cloud News & Analysis

⚡Amazon RDS and Aurora SSL/TLS Certificates are expiring. Update them now.

Amazon RDS and Aurora databases are updating their SSL/TLS certificates, and you need to rotate them before the old ones expire in 2024.
New Certificates: New certificates are valid for 40 or 100 years, so you won't have to rotate them again for a long time.
Identify Impacted Resources: Find the DB instances that need the update.
Update Applications: Before applying new certificates to DB instances, update your application's trust store and SSL/TLS certificates.
Test in Non-Production: Test certificate rotation in a non-production environment with the same setup as production.
Apply in Production: Schedule or apply certificate rotation in production. Modern engines may not require a restart.

⚡Okta got hacked. Again.

On October 2nd, 2023, an Okta support agent asked a BeyondTrust administrator for help with a technical problem. The agent requested the admin to create a HAR file, which contains sensitive information like an API request and a session cookie.
This file was necessary to understand and fix the issue. The admin followed the request, made the HAR file, and uploaded it to the Okta support portal.
Just 30 minutes later, a hacker used the hidden session cookie in the HAR file to launch an attack on BeyondTrust. Luckily, BeyondTrust detected the attack and thwarted it.
After investigating, BeyondTrust discovered that the hacker had used Okta to target them. They promptly informed Okta on October 2nd. However, Okta didn't respond for two weeks.
Then, on October 18th, hackers used the same method to attack another Okta customer, Cloudflare. Cloudflare quickly informed Okta about the breach.
Finally, on October 20th, Okta admitted they had been hacked, and 170 customers were affected.

⚡Introducing GKE Stateful High Availability (HA) Controller:

Stateful applications, like databases and message queues, need to balance cost and availability.
You can either run a single replica in one location to save money but risk downtime, or you can run multiple replicas for high availability but at a higher cost.
With Stateful HA Operator, you get a middle ground.
It uses a new technology called regional persistent disk (RePD) that synchronously replicates data between two locations in a region.
This means you can have the availability of multiple replicas without the high cost of running additional compute.
If there's a failure, your application can quickly failover to another location.

🌐From the Cloud World:

🌐New AWS Cloud Technology Consultant Professional Certificate now available on Coursera:

Coursera now offers the AWS Cloud Technology Consultant Professional Certificate, a program to train individuals for cloud consulting careers. Cloud Consultants advise on cloud app design, development, and infrastructure, and this field is in high demand. The first four courses provide an introduction to cloud technology and AWS services, enabling learners to build a strong foundation. Upon completing all nine courses, you receive a Professional Certificate.

🌐Google Cloud has introduced intellectual property indemnity for generative AI products:

They offer two levels of protection. The first indemnity covers the use of training data to create generative AI models and assures customers that Google will stand behind their services. The second indemnity covers the generated output, providing protection against third-party intellectual property claims related to the generated content. These indemnities aim to give customers confidence and assurance in using generative AI products without worrying about legal risks. This protection applies as long as customers follow responsible AI practices.

🌐Kubernetes introduces ingress2gateway:

Ingress, a Kubernetes API, is commonly used for managing external access to services. However, it has limitations such as inadequate support for modern proxy features, a flawed permission model, and a focus mainly on HTTP routing.
Gateway API, designed to address these limitations, offers flexibility, expressiveness, and extensibility. It follows a role-oriented approach, enhancing portability and supporting various features like header-based matching, traffic splitting, and request mirroring.
To simplify migration from Ingress to Gateway API, Kubernetes has released "ingress2gateway." This tool converts existing Ingress resources into Gateway API resources, making the transition process smoother.

🌐Netlify launches Composable Jamstack Framework for Web Apps:

Netlify, a web development platform, has launched a new tool that makes it easier to create web applications.
The tool allows developers to build web apps in a flexible way, connecting the front-end to different backend services.
It's designed to simplify the process of combining content, data sources, code, and infrastructure, giving developers more flexibility.
This tool helps developers move away from old, rigid web app structures and create applications more quickly by automating some of the complex processes that slow down development.

🌐Serve open-source LLMs on Google Cloud:

Google Cloud has improved its serving solution for large language models (LLMs) in Vertex AI Model Garden. This solution is based on open-source vLLM libraries and offers optimized transformer implementation, continuous batching, and tensor parallelism. Benchmark results show that this solution provides up to 19 times higher throughput compared to other libraries, enabling users to serve more requests at a lower cost. Google Cloud provides a Colab notebook example for deploying open-source foundation models on Vertex AI, making it easier to take advantage of this efficient LLM-serving solution.

🛠️HackHub: Best Tools for Cloud

🛠️cezarguimaraes/pkgsite: A Helm chart to deploy a fully-featured private pkg.go.dev instance.

🛠️nuclio/nuclio: High-Performance Serverless event and data processing platform

🛠️solo-io/squash: The debugger for microservices

🛠️datreeio/datree: E2E policy enforcement solution to run automatic checks for rule violations

🛠️project-zot/zot: A production-ready vendor-neutral OCI-native container image registry