Unlocking the Power of Many Clouds: Lessons from the Trenches
Imagine having a superpower that lets you use the best features from different places. That’s kind of what using multiple clouds (called “multicloud”) is like for businesses. But, like any superpower, it comes with its own set of challenges. This article explores how to make multicloud work for you, based on advice from experts who’ve been there.
Why Use More Than One Cloud?
The idea behind multicloud is simple: don’t put all your eggs in one basket. Different cloud providers (like Amazon Web Services, Google Cloud, and Microsoft Azure) offer different strengths. Multicloud lets you pick and choose the best services from each, giving you more flexibility and potentially saving you money. But it’s not as easy as just signing up for a bunch of different services.
It’s important to have a plan.
Planning Your Multicloud Strategy
Before you even start writing code, you need to figure out why you’re using multiple clouds. This isn’t just a technical decision; it’s a business strategy.
John: Lila, let me give you an analogy. Imagine you are trying to bake a cake, but some ingredients are only available in certain stores. One store has the best flour (AWS), another has the best chocolate (Azure), and a third has the best sprinkles (GCP). Multicloud is like going to all three stores to get the best ingredients for your cake, instead of settling for whatever one store has. Does that make sense?
Lila: Yes, it does! But what if the stores are really far apart, and it takes a lot of effort to go to each one?
John: That’s a great question, Lila! And that’s exactly the challenge with multicloud. You need a good strategy to make sure the benefits outweigh the extra effort.
Drew Firment, a cloud strategist, says that multicloud is a “strategy problem.” You need a clear plan that defines when, where, and why your development teams use specific cloud features. Without a plan, you could end up spending more money, having security problems, and even failing projects.
Heather Davis Lam, a CEO, emphasizes the importance of communication between different teams. Developers, operations, security, and even legal teams need to talk to each other to avoid problems caused by miscommunication, not just bad code.
Generic vs. Specific: What Kind of Code Should You Write?
One of the biggest questions you’ll face is when to write code that’s specific to a certain cloud provider and when to write code that can run on any cloud.
Lila: John, what does “code” mean in this context?
John: Good question! In simple terms, code is a set of instructions that tells the computer what to do. It’s like a recipe for your computer. So, writing code for multiple clouds means writing versions of the same recipe for different kitchens, because each cloud speaks a slightly different language!
Some teams try to make their code work on any cloud, but that can lead to over-engineering and complexity. Heather Davis Lam warns against making your code too abstract, which can slow down development.
Patrik Dudits, a software engineer, agrees. He says that trying to limit your code to the “lowest common denominator” of cloud features is a mistake. Instead, you should embrace the strengths of each cloud.
Matt Dimich, a VP of platform engineering enablement, suggests aiming for agility rather than total uniformity. He wants to take advantage of faster and cheaper computing options as they become available.
Drew Firment recommends abstracting the core shared services that are common across clouds while isolating cloud-specific services. For example, you can use standard authentication and computing layers across all clouds while using Amazon S3 (a storage service) and Athena (a query service) to optimize data queries.
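To make Firment's advice concrete, here is a minimal Python sketch of the pattern: application code depends only on a shared interface, while the cloud-specific piece is isolated in one place. The names `QueryService`, `AthenaQueries`, and `report_row_count` are hypothetical, and the Athena call is stubbed out so the sketch stays runnable without AWS credentials.

```python
from typing import Protocol


class QueryService(Protocol):
    """Shared interface every cloud-specific query backend must satisfy."""

    def run(self, sql: str) -> list[dict]:
        ...


class AthenaQueries:
    """AWS-specific backend: the only module that would touch the AWS SDK.

    The real Athena call is stubbed out here so the sketch stays runnable.
    """

    def run(self, sql: str) -> list[dict]:
        # In a real system this would submit the query to Athena and
        # poll for the execution to finish before returning rows.
        return [{"source": "athena", "sql": sql}]


def report_row_count(queries: QueryService, sql: str) -> int:
    """Application code depends only on the shared interface, never on AWS."""
    return len(queries.run(sql))
```

Because `report_row_count` only knows about `QueryService`, swapping Athena for another cloud's query engine means writing one new backend class, not rewriting the application.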
Making Code Portable: Kubernetes to the Rescue
So, how do you make your code as portable as possible? The answer, according to almost everyone interviewed, is Kubernetes.
Lila: John, what on earth is Kubernetes?
John: Okay, Lila, imagine you have a bunch of containers (like shipping containers) that hold your application. Kubernetes is like a super-efficient port manager that automatically arranges these containers, makes sure they’re running smoothly, and scales them up or down as needed. It hides all the complexity of managing these containers so you don’t have to worry about the nitty-gritty details.
Radhakrishnan Krishna Kripa, a lead DevOps engineer, uses Kubernetes and Docker containers (those are like the shipping containers, Lila!) to standardize deployments. This allows him to write code once and run it on different cloud platforms with minimal changes.
Sidd Seethepalli, a CTO, relies on Kubernetes instead of provider-specific services to deploy code consistently on any Kubernetes cluster. They use Helm charts (templates for Kubernetes) to hide cloud-specific configurations and tools like KOTS to simplify deployment customization.
Neil Qylie, a principal solutions architect, uses Kubernetes as a foundation and builds on it with tools like Helm and ArgoCD to automate deployments and ensure consistent, validated deployments through CI/CD pipelines.
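One common way to keep a Helm-based deployment identical across clouds is to push every per-cloud difference into a values file and generate the same `helm upgrade` invocation for each target. This is a hedged sketch of that idea, not any of the interviewees' actual pipelines; the release name, chart path, and `deploy/values-<cloud>.yaml` layout are all assumptions for illustration.

```python
def helm_upgrade_command(cloud: str, release: str, chart: str) -> list[str]:
    """Build one cloud's `helm upgrade --install` command.

    Per-cloud differences (ingress class, storage class, node pools...)
    live in a values file such as deploy/values-aws.yaml, so the chart
    templates themselves stay cloud-neutral.
    """
    return [
        "helm", "upgrade", "--install", release, chart,
        "--values", f"deploy/values-{cloud}.yaml",
        "--atomic",  # roll back automatically if the release fails
        "--wait",    # block until the rollout reports healthy
    ]
```

A CI/CD pipeline would loop over its target clusters and run this same command with a different `cloud` argument for each, which is how one pipeline can serve AWS, Azure, and GCP at once.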
Tools of the Trade: Expert Recommendations
Here are some tools that cloud experts recommend:
- Anant Agarwal (Aidora): Maintain a living system diagram to visualize data flows and service responsibilities.
- Heather Davis Lam (Revenue Ops): Use Splunk or Datadog for broader logging across clouds and even older systems.
- Patrik Dudits (Payara Services): Use Pulumi Infrastructure as Code (IaC) tools to model infrastructure with programming languages.
- Radhakrishnan Krishna Kripa: Standardize pipelines using cloud-neutral tools like GitHub Actions and Terraform Cloud.
- Anant Agarwal: Use adapter layers to wrap cloud APIs and SDKs in internal libraries, creating a clean, generic interface.
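The adapter-layer tip above can be sketched in a few lines of Python. The idea is that callers see only a clean internal interface and a single internal error type, never a provider SDK or its exceptions. `ObjectStore`, `MemoryStore`, and `StorageError` are hypothetical names; the in-memory class stands in for the real S3 or GCS adapters, which would wrap the provider SDK behind the same two methods.

```python
from typing import Protocol


class StorageError(Exception):
    """Internal error type; callers never catch provider-specific errors."""


class ObjectStore(Protocol):
    """Clean internal interface; callers never import a cloud SDK directly."""

    def put(self, key: str, data: bytes) -> None: ...
    def get(self, key: str) -> bytes: ...


class MemoryStore:
    """In-memory stand-in for a real S3Store or GCSStore adapter."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._blobs[key] = data

    def get(self, key: str) -> bytes:
        try:
            return self._blobs[key]
        except KeyError as exc:
            # Translate the backend's failure into the internal error type,
            # just as a real adapter would translate SDK exceptions.
            raise StorageError(f"missing object: {key}") from exc
```

Because every adapter raises `StorageError`, application code handles failures the same way regardless of which cloud is behind the interface.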
Conquering Multicloud Complexity
Multicloud environments are complex, so you need to centralize your logs and alerts. Anant Agarwal recommends routing all logs to a unified observability platform like Datadog to quickly identify and resolve incidents.
Patrik Dudits suggests investing in a central, provider-neutral dashboard for high-level metrics across your multicloud setup.
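A small step toward that kind of centralization is emitting every log line in one schema, tagged with which cloud produced it, so a platform like Datadog or Splunk can correlate events across providers. This is a minimal sketch using only Python's standard `logging` module; the field names and the `JsonFormatter` class are assumptions, not a specific vendor's format.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Emit every record in one JSON schema, whatever cloud produced it."""

    def __init__(self, cloud: str) -> None:
        super().__init__()
        self.cloud = cloud

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "cloud": self.cloud,       # which provider this service runs on
            "service": record.name,    # logger name doubles as service name
            "level": record.levelname,
            "message": record.getMessage(),
        })
```

Each service attaches this formatter to a handler that ships to the central platform; because the schema is identical everywhere, a single query can follow a request as it hops between clouds.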
Heather Davis Lam emphasizes the importance of good logging and monitoring to save time when debugging issues across multiple clouds. But she cautions against excessive logging and alerting, suggesting that you should only retry processes that are likely to succeed.
Automation is another key to managing multicloud complexity. Anant Agarwal automates everything using GitHub Actions to ensure synchronized schema changes, code deployments, and service updates.
He also uses internal AI tools to streamline workflows, such as a custom GPT that answers questions about deployment locations and provider capabilities.
Finally, plan for failure. Heather Davis Lam says that the more clouds and services you connect, the more chances there are for something to break. Expect API timeouts, expiring authentication tokens, and latency spikes. Think about what should be retried and what should trigger an alert.
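That distinction between "retry it" and "wake someone up" can be encoded directly. Here is a hedged sketch: transient failures (timeouts, throttling) are retried with exponential backoff, while fatal ones (expired credentials, bad config) trigger an alert immediately. The error classes and the `alert` callback are hypothetical placeholders for whatever your stack actually raises and pages with.

```python
import time


class TransientError(Exception):
    """Timeouts, throttling: likely to succeed on retry."""


class FatalError(Exception):
    """Expired tokens, bad config: retrying won't help."""


def call_with_retries(operation, alert, attempts=3, base_delay=0.1):
    """Retry transient failures with exponential backoff; alert on fatal ones."""
    for attempt in range(attempts):
        try:
            return operation()
        except TransientError:
            if attempt == attempts - 1:
                alert("still failing after retries")
                raise
            time.sleep(base_delay * 2 ** attempt)  # 0.1s, 0.2s, 0.4s, ...
        except FatalError as exc:
            alert(f"fatal: {exc}")  # e.g. expired token: page a human now
            raise
```

The key design choice is classifying errors up front, which is exactly the "what should be retried and what should trigger an alert" question Davis Lam raises.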
In short, multicloud development is messy, but if you plan for it, you can write code that is both more flexible and more resilient.
John’s Final Thoughts
Having worked in this field for years, I find the shift towards multicloud both exciting and challenging. It’s about leveraging the best of each world, but doing so smartly and strategically.
Lila’s perspective: I’m still learning, but it sounds like multicloud is like having a really awesome toolbox with lots of specialized tools. You just need to know which tool to use for which job!
This article is based on the following original source, summarized from the author’s perspective:
Multicloud developer lessons from the trenches