What is the cloud? The joke I always used to hear is that it’s “just someone else’s computer”. Well, if a public cloud is treated like any other data center, with VMs and servers running statically 24x7x365, I could see how someone would be inclined to say that. But the value in using a public cloud is not about running a VM in a different data center. Workloads should not be run in a cloud the same way they are run in a typical data center.
To get the most value out of using a public cloud, more thought needs to be put into the ability to scale workloads and use only what is actually needed. As businesses modernize their workloads, they open themselves up to more technologies that lend themselves better to taking advantage of the value that a public cloud can bring.
Enter Knative serverless.
For those unfamiliar with serverless computing, often referred to as Functions as a Service (FaaS), picture all the underlying servers, clusters, and networking being abstracted away. Applications run when they are needed and are otherwise “off”. Using a serverless approach in a public cloud means that businesses are not spending time dealing with infrastructure or paying for the number of hours the worker nodes are provisioned.
Instead, businesses focus on the applications and pay for the milliseconds (yes, milliseconds) of execution time that the application needs to respond to a request. A workload that doesn’t need to run all the time can scale down to 0 running instances when it is not needed. Knative brings these capabilities to Kubernetes and allows us to run container-based applications in a serverless fashion.
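To make the billing difference concrete, here is a minimal sketch that compares paying for provisioned hours with paying only for execution time. The function names and every rate in it are hypothetical placeholders chosen for illustration; they are not IBM Cloud prices.

```python
def always_on_cost(hours: float, hourly_rate: float) -> float:
    """Cost of a worker node that is billed for every hour it is provisioned."""
    return hours * hourly_rate


def per_request_cost(requests: int, ms_per_request: float, rate_per_second: float) -> float:
    """Cost when only the execution time of each request is billed."""
    billed_seconds = requests * ms_per_request / 1000.0
    return billed_seconds * rate_per_second


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    print("always on   :", always_on_cost(hours=730, hourly_rate=0.10))
    print("per request :", per_request_cost(100_000, ms_per_request=200, rate_per_second=0.00005))
```

With no requests at all, the second function returns 0, which is the scale-to-zero case described above.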
IBM is one of the community maintainers of the Knative project, which it uses as a foundation for the Code Engine service in IBM Cloud. Let’s dive deeper into Code Engine and how it can be used to run workloads.
To deploy Code Engine, find it in the Catalog under the Compute category. This will bring us to the Code Engine page where we can select “Projects” in the left side menu.
A project will be a collection of serverless resources that we want to group. Give the project a name, put it in a resource group and apply any necessary tags and then click “Create”. In a few seconds, the project will be created and we are ready to deploy our code.
Once in the project, we can see that there are two options for running our code:
Applications – serve HTTP requests
Jobs – run the code once and exit
These two options come from the underlying Knative components: serving and eventing. Let’s go into how both of these can be used.
Creating an Application
Creating an application is simple. Click “Create Application”. From here you can specify the container image from a registry or a git repository to have the application be built from source code.
Specify details such as whether the application should be exposed on the internet or a private network in a VPC. Runtime settings allow for the control over the number of instances, triggers for scaling as well as the number of resources an instance gets. Click “Create Application”.
We can see that since I set the minimum number of instances to 0, Code Engine will scale down my application pretty quickly to have nothing running, meaning I am not paying anything. I can test the application by clicking the “Test Application” button and then clicking “Send request”.
We can see the Hello World output but, more importantly, the latency of the response. It was a bit higher than what we would normally want. Since the application was scaling from 0, Code Engine automatically created an instance when it detected the incoming request, so the response took a few seconds to complete. Latency on subsequent requests would be much lower. We can avoid this first request delay by setting the minimum scale to 1 so there is always at least one instance running.
If we want to test the scaling further, a load testing tool such as Siege can be used to generate requests against our Code Engine application forcing it to scale up (and down) with the load.
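The scaling behaviour described above can be approximated with a small model. This is only a sketch, assuming that the number of instances follows the number of in-flight requests divided by a per-instance concurrency target, bounded by the minimum and maximum scale settings. The real Knative autoscaler averages its metrics over time windows, and the function and parameter names here are my own.

```python
import math


def desired_instances(in_flight: int, target_concurrency: int,
                      min_scale: int = 0, max_scale: int = 10) -> int:
    """Simplified model: instances needed to serve the current in-flight requests."""
    if target_concurrency <= 0:
        raise ValueError("target_concurrency must be positive")
    wanted = math.ceil(in_flight / target_concurrency)
    # Clamp the result to the configured scale bounds.
    return max(min_scale, min(max_scale, wanted))


if __name__ == "__main__":
    for load in (0, 50, 250, 5000):
        print(load, "in-flight requests ->", desired_instances(load, target_concurrency=100))
```

With a minimum scale of 0 the model drops to no instances when idle; with a minimum scale of 1 it never goes below one instance, which is how the first request delay is avoided.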
Creating a Job
Creating a job is almost identical to creating an application. There is no need to specify the endpoint since there is no need to access the job. Jobs can be run manually, but are more useful when paired with another Code Engine feature called “Event Subscriptions”. These are triggers that will run the job when the event happens and pass the job information about the event. The two types of events that can be used are timers and changes in Object Storage buckets. As an example, I have used the Object Storage event subscription to trigger my job and pass it the name of an uploaded object as part of an ETL process. Pretty neat.
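As a rough sketch of what the job side of that ETL trigger could look like, the snippet below reads the event from the job's environment. It assumes the event payload arrives as JSON in an environment variable named CE_DATA with "bucket" and "key" fields; treat both the variable name and the field names as assumptions and check the Code Engine documentation for the exact shape.

```python
import json
import os
from typing import Mapping, Optional, Tuple


def uploaded_object(env: Mapping[str, str]) -> Optional[Tuple[str, str]]:
    """Return (bucket, key) from the event, or None when the job was run manually."""
    raw = env.get("CE_DATA")  # assumed variable name
    if not raw:
        return None
    event = json.loads(raw)
    return event["bucket"], event["key"]  # assumed field names


if __name__ == "__main__":
    target = uploaded_object(os.environ)
    if target is None:
        print("no event data, nothing to process")
    else:
        bucket, key = target
        print(f"starting ETL for {key} in bucket {bucket}")
```

Passing the environment in as an argument keeps the parsing logic easy to test without deploying the job.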
Serverless computing is the best showcase of the value a public cloud can offer. Truly pay for what you use while scaling with ease. It will be interesting to see how serverless technologies transform the way we think about running applications in the future.
For many organizations, public cloud is the main place applications are deployed to today and for good reason. Utilizing public cloud managed services properly brings flexibility and ease of use that allows companies to focus on what’s important to them. But a public cloud isn’t the hammer for every nail. Established organizations may have applications that need to run in an on-premises data center. The reasons for this are plentiful but common justifications are sensitivity of the workload being in a data center outside of the organization’s control, latency requirements, and general complexities of moving highly integrated workloads.
In response to this, Cloud Service Providers (CSPs) have developed ways of deploying their cloud services outside of their cloud data centers: AWS Outposts, Azure Arc, and Google Anthos. In 2021 IBM launched Cloud Satellite as its answer. These services all have their differences, but generally have the same goal of bringing cloud-managed services to where they are needed. From the CSP’s perspective, it keeps the customer using their services even if they cannot bring the workload into the CSP data center. From the customer’s perspective, they get some of the benefits of using cloud services and can have consistency in how those services are used, whether in a CSP data center or their own on-premises data center.
Let’s dive deeper into IBM Cloud Satellite concepts and its architecture. The main concept is the “Satellite location”, which represents the location where the IBM Cloud services will be extended to. It could be the on-premise data center, another CSP data center, or even your home. Essentially anywhere that VM or Bare Metal servers can be placed could be a Satellite location. The Satellite location consists of x86 RHEL hosts assigned to either be part of the control plane for the Satellite location or used to run the IBM Cloud services in the location.
A Satellite location would be managed from an IBM Cloud region. These could be any of the IBM Cloud MZR regions, but the important piece is that there is < 200 ms of latency between the Cloud region and the Satellite location control plane. This connectivity between the location and the cloud region is called a Satellite Link. From a network perspective, the Satellite Link communication is initiated from the Satellite location to the cloud region so there is no need to open any incoming ports to the Satellite location. All data flowing over the Satellite Link is encrypted to TLS 1.3 standards.
The process of creating a Satellite location is as simple as going into the IBM Cloud portal and defining the location with a name and the cloud region it is managed from. From there, a bash script is generated which can be run on the hosts in the Satellite location. The script attaches the hosts to the Satellite location, and they will then appear in the cloud portal as unassigned.
The Satellite location control plane requires at least 3 hosts with a minimum of 4 cores and 16 GB of RAM. Other services will have their own host requirements. Hosts that run the cloud service do not need to necessarily be in the same physical location as the control plane but should have < 100 ms of latency to the control plane.
Assign 3 hosts in the cloud portal to be part of the control plane. Satellite has the concept of zones, independent data centers that can be used to spread workloads across. If the infrastructure provider has zones, each host can be assigned to a respective zone in the infrastructure provider. Commonly if deploying in an on-premise data center there will not be multiple zones. In that case just specify that each control plane host is in a different zone, even if they are in the same DC. Once the 3 control plane hosts are assigned in different zones the Satellite location status should show as healthy, meaning it is ready to start deploying cloud services. Before that happens though, additional hosts will need to be attached to the Satellite location. These will be left as “unassigned” until they are used by a service in the Satellite.
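The control plane requirements above are easy to check before assigning hosts. The sketch below is a hypothetical pre-flight check written for this post, not part of any IBM tooling; the Host type and its fields are my own naming.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Host:
    name: str
    cores: int
    ram_gb: int
    zone: str


def control_plane_problems(hosts: List[Host]) -> List[str]:
    """List the reasons a set of hosts would not make a valid control plane."""
    problems = []
    if len(hosts) < 3:
        problems.append("need at least 3 control plane hosts")
    for host in hosts:
        if host.cores < 4 or host.ram_gb < 16:
            problems.append(f"{host.name}: below 4 cores / 16 GB RAM")
    # Zones are just labels, so hosts in one data center can still use three names.
    if len({host.zone for host in hosts}) < 3:
        problems.append("hosts must be spread across 3 zones")
    return problems


if __name__ == "__main__":
    hosts = [Host("cp1", 4, 16, "zone-1"), Host("cp2", 4, 16, "zone-2"), Host("cp3", 4, 16, "zone-3")]
    print(control_plane_problems(hosts) or "control plane hosts look fine")
```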
For a service to be deployed in a Satellite location, it needs to be “Satellite Enabled”, meaning it can support running on Satellite. Not every service supports it and, as of today, the list is still growing. The first service that launched, and the one where I see a lot of the interest, is the Red Hat OpenShift service, also known as ROKS. This deploys an OpenShift cluster in the Satellite location that is managed by the IBM Cloud SRE team. For customers who may not have the skills to set up and maintain an OpenShift cluster, ROKS on Satellite makes it easy to get started and have a consistently managed OpenShift cluster in any environment.
Deploying a ROKS cluster into a Satellite location isn’t much different from deploying ROKS in IBM Cloud. Satellite shows up as an infrastructure deployment option right next to Classic and VPC. Once you specify the worker node sizes, the appropriate unassigned Satellite hosts will be assigned to the ROKS service for the cluster. When it’s provisioned the cluster shows up in the same list of clusters as those that are running in IBM Cloud.
This is the value of Satellite, extending the cloud capabilities wherever they are needed while maintaining consistent service management, logging & monitoring, and IAM access controls to not increase operational complexity.
See the video below for how to set up ROKS in a Satellite location.
Over the past few months, I’ve had a recurring question come up from customers: “How do I access my IBM PowerVS LPARs through my IBM Cloud Direct Link?”
While there is documentation available that shows how to conceptually do it, a practical end-to-end guide does not exist.
Let’s start by reviewing the target architecture we want to create and the cloud components necessary. Our goal is to have the source network of 10.240.128.0/18 in the on-premises data center reach the 172.16.0.0/18 network in PowerVS.
The complexity comes from the fact that PowerVS today does not connect directly to the Transit Gateway. So, we need to build the connectivity through a Juniper vSRX to reach the Direct Link connected to the Transit Gateway.
As you can see there are 4 BGP sessions (there could be more if there are redundant Direct Links) and 3 GRE tunnels that we need to configure.
BGP #1 – Between the Customer Edge Router and the Direct Link Gateway
BGP #2 & #3 – Between the Juniper vSRX and the PowerVS instances. These connections require GRE tunnels to be built.
BGP #4 – Between the Transit Gateway and the Juniper vSRX. This requires a GRE tunnel to be built between the Transit Gateway and the Juniper vSRX
Now that we understand what the architecture will look like and the pieces we need let’s start configuring.
The first piece we want to deploy is the Juniper vSRX itself. It can be found under the Gateway Appliances tile in the catalog. There will be options for a Virtual Router Appliance/ Vyatta as the vendor. Ensure Juniper is listed under the vendor dropdown.
We want to deploy the Juniper vSRX into the same region and ideally the data center that the PowerVS instance will be. In my case, Dallas is the region and I have selected the DAL10 data center.
Once the Juniper is provisioned it can be found under the Classic Infrastructure > Gateway Appliances section. The page lists a private IP address. Make a note of it, as this address will be used to terminate the GRE tunnels in the subsequent sections.
Direct Link Provisioning
Next, we will order the Direct Link. This has the potential to take the longest if there is a need to get a new circuit brought to the data center/PoP location. The ordering part on the cloud portal is relatively quick. When ordering make sure the tile used is a Direct Link Connect (often referred to as a Direct Link Connect 2.0) or Direct Link Dedicated 2.0 as shown below. Do not use any Direct Link with the word classic in the name otherwise it will not be able to connect to a Transit Gateway and this architecture will not work.
The Direct Link order page will ask for which location to establish the connection, the provider being used, IP and BGP information for peering. This would be BGP session #1 that we referred to earlier. In the example screenshots, I have selected to provision the Direct Link in Washington with Global Routing to reach the Transit Gateway in Dallas. The Connection section at the end of the order form will ask if the Direct Link should be connected to ‘Direct resources’ or ‘Transit Gateway’. Select Transit Gateway then create the Direct Link. Depending on the type of Direct Link (Dedicated or Connect) it may require a physical cross-connect, or a virtual connection to be established through an exchange provider. This can happen in parallel while the rest of the environment is being set up.
Power Systems Virtual Server Cloud Connections
Next, we will want to provision the PowerVS service. Look for the Power Systems Virtual Server tile in the catalog.
We want to provision into the same region (ideally the same data center) where we provisioned the Juniper vSRX. Once the PowerVS resource is provisioned, it can be found in the resource list. We will then need to create a subnet in the PowerVS instance. This subnet is where our test LPAR will be provisioned.
After we create the subnet, we will create our first GRE tunnel to connect PowerVS to Classic. From the Cloud Connections on the left-hand menu click Create Connection. Give it a name and select the speed of 5Gbps. If the Juniper vSRX is in the same region as the PowerVS instance the global routing toggle can be left off. Under Endpoint destinations, select the check box for Classic Infrastructure and enable the toggle for GRE. This will enable two form boxes:
GRE destination IP
This should be the private IP address of the Juniper vSRX. Our example is ‘10.94.176.70’. It can be found in the Classic Infrastructure > Gateway Appliances section.
GRE subnet
This will be used by PowerVS to create the local GRE IP address as well as the local/remote tunnel addresses. We will use 192.168.10.0/29 as our subnet.
PowerVS will automatically calculate the addresses as follows:
192.168.10.1 – Local GRE IP
192.168.10.5 – Local GRE tunnel address
192.168.10.6 – Remote GRE tunnel address
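The pattern in these three addresses can be reproduced with Python’s ipaddress module. This is a sketch inferred from the example values above (first host of the /29 as the local GRE IP, and the two hosts of the second /30 as the tunnel addresses); it is not a documented PowerVS algorithm, and the function name is my own.

```python
import ipaddress


def gre_addresses(gre_subnet: str) -> dict:
    """Derive the GRE addresses from a /29, following the pattern seen in the example."""
    net = ipaddress.ip_network(gre_subnet)
    if net.prefixlen != 29:
        raise ValueError("expected a /29 GRE subnet")
    local_gre_ip = net.network_address + 1
    # The second /30 inside the /29 holds the two tunnel endpoint addresses.
    tunnel_net = list(net.subnets(new_prefix=30))[1]
    local_tunnel, remote_tunnel = tunnel_net.hosts()
    return {
        "local_gre_ip": str(local_gre_ip),
        "local_tunnel": str(local_tunnel),
        "remote_tunnel": str(remote_tunnel),
    }


if __name__ == "__main__":
    print(gre_addresses("192.168.10.0/29"))
```

The same split explains why the tunnel interfaces on the Juniper side are configured with a /30 mask rather than the /29 entered in the form.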
Next, in the subnets section click the attach existing button and select the subnet created earlier.
Click Create Connection. It will take a few minutes to create the connection. While it is creating repeat the process again to create the second Cloud Connection/GRE tunnel but this time use 192.168.20.0/29 as the GRE subnet.
It’s important to note that for these BGP sessions, PowerVS provides the ASNs and they cannot be changed. So, the ASN information for BGP #2 & #3 would be:
PowerVS: 64999 (except for WDC04 which uses 64995)
Juniper vSRX: 64880
Once both Cloud Connections are completed, we can proceed to provision the Transit Gateway and BGP session #4.
Transit Gateway Provisioning
From the cloud catalog, provision the Transit Gateway into the same region as the Juniper vSRX and PowerVS instance.
Since the Transit Gateway is connecting to Classic Infrastructure, we can use Local Routing and avoid bandwidth charges. Enabling global Routing would allow the Transit Gateway to connect to VPCs that are in different regions.
On the order form under the Connections section, select Classic Infrastructure to provide the connection to classic infrastructure. Also, add a connection for the Direct Link created earlier. Then click create.
Once the Transit Gateway is provisioned click on the provisioned gateway and click Add Connection. In the Network connection dropdown, select GRE tunnel. In the Zone drop-down, select the zone that corresponds to the datacenter that the Juniper vSRX and the PowerVS instance are in. My Juniper was provisioned in DAL10 so I selected Zone 1 for the GRE connection. Look at the following doc page for the zone to datacenter mapping: https://cloud.ibm.com/docs/overview?topic=overview-locations#mzr-table.
Under base connection, select the Classic Infrastructure connection that was previously created. Then we must fill out the GRE tunnel information. It is a bit different from the PowerVS GRE page where all the IPs were calculated from the GRE subnet put in. With Transit Gateway we must explicitly state the IPs. I will use the 192.168.30.0/29 subnet and keep the format consistent with how PowerVS provisioned the GRE IPs.
Remote Gateway IP – Juniper vSRX private IP
Local Gateway IP – 192.168.30.1
Remote tunnel IP – 192.168.30.6
Local tunnel IP – 192.168.30.5
(optional) Remote BGP ASN – We will keep this the same as the ASN that PowerVS assigned to the vSRX side of BGP #2 & #3, which is 64880. Transit Gateway will have an ASN assigned automatically.
Once the GRE connection is provisioned make note of the assigned Transit Gateway ASN.
The last step is to configure the Juniper vSRX provisioned earlier and set up the routing between the GRE tunnels.
The configuration below will set up everything needed with the example IPs discussed. Change the IPs, ASNs, and prefixes in the import/export policies as required. SSH into the Juniper vSRX private IP using the admin credentials, type ‘configure’ to enter configuration mode and paste the commands. Type ‘commit’ to save and apply the configuration.
#Create GRE tunnels: source is the vSRX private (transit) IP, destination is the TGW / PowerVS local gateway IP, inet address is the vSRX side of the tunnel
set interfaces gr-0/0/0 unit 0 tunnel source 10.94.176.70
set interfaces gr-0/0/0 unit 0 tunnel destination 192.168.30.1
set interfaces gr-0/0/0 unit 0 family inet address 192.168.30.6/30
set interfaces gr-0/0/0 unit 1 tunnel source 10.94.176.70
set interfaces gr-0/0/0 unit 1 tunnel destination 192.168.10.1
set interfaces gr-0/0/0 unit 1 family inet address 192.168.10.6/30
set interfaces gr-0/0/0 unit 2 tunnel source 10.94.176.70
set interfaces gr-0/0/0 unit 2 tunnel destination 192.168.20.1
set interfaces gr-0/0/0 unit 2 family inet address 192.168.20.6/30
#Create BGP groups: local-address matches the tunnel inet address above, neighbor is the tunnel IP on the other side
set protocols bgp group TGW1 local-address 192.168.30.6
set protocols bgp group TGW1 family inet unicast
set protocols bgp group TGW1 peer-as 4201065540
set protocols bgp group TGW1 neighbor 192.168.30.5 local-as 64880
set protocols bgp group TGW1 import tgw-import-policy
set protocols bgp group TGW1 export tgw-export-policy
set protocols bgp group PVSCC1 local-address 192.168.10.6
set protocols bgp group PVSCC1 family inet unicast
set protocols bgp group PVSCC1 peer-as 64999
set protocols bgp group PVSCC1 neighbor 192.168.10.5 local-as 64880
set protocols bgp group PVSCC1 import powervs-import-policy
set protocols bgp group PVSCC1 export powervs-export-policy
set protocols bgp group PVSCC2 local-address 192.168.20.6
set protocols bgp group PVSCC2 family inet unicast
set protocols bgp group PVSCC2 peer-as 64999
set protocols bgp group PVSCC2 neighbor 192.168.20.5 local-as 64880
set protocols bgp group PVSCC2 import powervs-import-policy
set protocols bgp group PVSCC2 export powervs-export-policy
#Allow only exact match to be imported/exported to BGP
set policy-options policy-statement tgw-import-policy term allowed_from_tgw from route-filter 10.240.128.0/18 exact
set policy-options policy-statement tgw-import-policy term allowed_from_tgw then accept
set policy-options policy-statement tgw-import-policy term others then reject
set policy-options policy-statement powervs-import-policy term allowed_from_power from route-filter 172.16.0.0/18 exact
set policy-options policy-statement powervs-import-policy term allowed_from_power then accept
set policy-options policy-statement powervs-import-policy term others then reject
set policy-options policy-statement powervs-export-policy term advertised_to_power from route-filter 10.240.128.0/18 exact
set policy-options policy-statement powervs-export-policy term advertised_to_power then accept
set policy-options policy-statement powervs-export-policy term others then reject
set policy-options policy-statement tgw-export-policy term advertised_to_tgw from route-filter 172.16.0.0/18 exact
set policy-options policy-statement tgw-export-policy term advertised_to_tgw then accept
set policy-options policy-statement tgw-export-policy term others then reject
#Create security zones for the tunnel interfaces and permit traffic between them
set security zones security-zone TGW host-inbound-traffic system-services all
set security zones security-zone TGW host-inbound-traffic protocols all
set security zones security-zone TGW interfaces gr-0/0/0.0
set security zones security-zone POWERVS host-inbound-traffic system-services all
set security zones security-zone POWERVS host-inbound-traffic protocols all
set security zones security-zone POWERVS interfaces gr-0/0/0.1
set security zones security-zone POWERVS interfaces gr-0/0/0.2
set security policies from-zone TGW to-zone POWERVS policy allow match source-address any destination-address any application any
set security policies from-zone TGW to-zone POWERVS policy allow then permit
set security policies from-zone POWERVS to-zone TGW policy allow match source-address any destination-address any application any
set security policies from-zone POWERVS to-zone TGW policy allow then permit
set security policies from-zone POWERVS to-zone POWERVS policy allow match source-address any destination-address any application any
set security policies from-zone POWERVS to-zone POWERVS policy allow then permit
set security policies from-zone TGW to-zone TGW policy allow match source-address any destination-address any application any
set security policies from-zone TGW to-zone TGW policy allow then permit
#Set static routes so the gateway IPs terminating the remote side of each tunnel are reachable via the transit IP in the underlay. The next hop is the gateway address of the subnet that holds the vSRX private IP
set routing-options static route 192.168.20.1/32 next-hop 10.94.176.65
set routing-options static route 192.168.10.1/32 next-hop 10.94.176.65
set routing-options static route 192.168.30.1/32 next-hop 10.94.176.65
#local firewall policies to allow ping on tunnel IPs
set firewall filter PROTECT-IN term PING from destination-address 192.168.10.6/32
set firewall filter PROTECT-IN term PING from destination-address 192.168.20.6/32
set firewall filter PROTECT-IN term PING from destination-address 192.168.30.6/32
#local firewall policies to allow BGP on tunnel IPs
set firewall filter PROTECT-IN term BGP from destination-address 192.168.10.6/32
set firewall filter PROTECT-IN term BGP from destination-address 192.168.20.6/32
set firewall filter PROTECT-IN term BGP from destination-address 192.168.30.6/32
set firewall filter PROTECT-IN term BGP from source-address 192.168.10.5/32
set firewall filter PROTECT-IN term BGP from source-address 192.168.20.5/32
set firewall filter PROTECT-IN term BGP from source-address 192.168.30.5/32
set firewall filter PROTECT-IN term BGP from protocol tcp
set firewall filter PROTECT-IN term BGP from port 179
set firewall filter PROTECT-IN term BGP then accept
After applying these commands, we should be able to run ‘show bgp summary’ on the Juniper and see 3 BGP sessions over the GRE tunnels in the Established state.
A ‘show route protocol bgp’ should also display the expected routes. It is important that the Direct Link is properly set up at this point, otherwise we will not see the routes learned from the Direct Link.
Provision an LPAR and test a ping. Pings between our networks from the Direct Link to PowerVS should now work.
If connectivity isn’t working as expected here are a few things to check to help narrow down problems:
Check that the expected routes are in the routing table
Use the ‘Routes’ feature on the Transit Gateway to validate what subnets are being learned from which connections
If the Transit Gateway does not have the routes learned from the Direct Link look at the Direct Link setup
If the Transit gateway does not have the PowerVS routes learned from the vSRX GRE connection look at the vSRX/TGW connection
If the vSRX does not have routes for PowerVS look at the PowerVS/vSRX connection
If routes are not in the routing table, check the import/export policies for the BGP sessions and ensure the mask lengths match the prefixes specified
Check the status of the BGP sessions on the Juniper and ensure they are established
Check that the GRE remote tunnel interfaces are pingable
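The mask length check matters because the policies in the example use ‘exact’ route filters, so a prefix is only accepted when both the network and the mask length are identical. A small sketch of that comparison, using Python’s ipaddress module with a helper name of my own, shows how a more specific route falls through to the reject term:

```python
import ipaddress


def matches_exact(route: str, route_filter: str) -> bool:
    """True only when the route has the same network and mask length as the filter."""
    return ipaddress.ip_network(route) == ipaddress.ip_network(route_filter)


if __name__ == "__main__":
    allowed = "172.16.0.0/18"
    for route in ("172.16.0.0/18", "172.16.0.0/24", "172.16.64.0/18"):
        verdict = "accept" if matches_exact(route, allowed) else "reject"
        print(f"{route:<18} {verdict}")
```

If PowerVS or the Direct Link advertises a different mask length than the one in the filter, the session can be established while no routes are exchanged.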
Moving workloads to a cloud provider presents a fundamental shift in the way security is handled for most organizations. The transition from being responsible for security for the entire stack in an on-premise DC to the shared responsibility model in a cloud environment is an area where security and operations teams need to pay close attention. A cloud service should have the lines of responsibility documented and client responsibilities clearly articulated so that there is no misconception. A lot of security breaches occur due to misconfigurations of a cloud service by an organization and assuming the cloud provider is responsible for all of the security.
As an organization, how do you know if your cloud services are properly configured and where your risks are? With how accessible cloud services can be, not all cloud assets may be properly secured or tracked. This is where Cloud Security Posture Management (CSPM) tools come in. These tools provide security teams visibility by monitoring cloud environments to ensure that the deployed services or infrastructure do not have misconfigurations. This allows security teams to quickly act and remediate security issues in cloud service configurations instead of the misconfiguration going undetected until an attacker finds it.
All the major cloud providers offer some form of security and compliance detection for their cloud. Security vendors have CSPM products that work across cloud environments. In IBM Cloud, the Security and Compliance Center provides visibility into IBM Cloud services as well as some visibility into other cloud providers’ services. The service focuses on Posture Management, Configuration Governance as well as Security Insights from other tools in the cloud. Let’s take a look at the posture management features.
The basic posture management functionality of the Security and Compliance Center comes from the IBM acquisition of Spanugo in 2020. A key part of the service is defining a profile and a scope and attaching them to a scheduled scan. A profile is made up of a collection of security controls called goals, and a scope is a collection of resources such as a resource group. There are numerous predefined security profiles such as ‘IBM Cloud Best Practises’ and CIS Benchmarks. Custom profiles can also be created.
Scans can be scheduled to occur with the profile of controls on a defined scope as needed. These tools are meant to enable continuous security monitoring, so I would recommend at least a daily scan of the environment to ensure that any misconfigured services are quickly detected. Results of the scan populate the dashboard with a posture score and which resources are in violation of the specified controls.
From the scan results above, we can see that the VPC Security Groups and ACLs are configured to allow connections to ports 22 and 3389 from any source. Additionally, one of the Virtual Server Instances has a floating (public) IP address. This is against best practise, and the combination of these misconfigurations would potentially allow remote attackers to access my virtual server.
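To illustrate the kind of check a posture management goal performs, here is a minimal sketch that flags inbound rules exposing SSH or RDP to any source. The InboundRule type is a made-up stand-in for a security group rule; it does not reflect the Security and Compliance Center’s data model or API.

```python
from dataclasses import dataclass
from typing import List

ADMIN_PORTS = (22, 3389)  # SSH and RDP


@dataclass
class InboundRule:
    source: str      # CIDR the rule allows traffic from
    port_min: int
    port_max: int


def exposed_admin_ports(rules: List[InboundRule]) -> List[int]:
    """Return the admin ports that are reachable from any source address."""
    exposed = set()
    for rule in rules:
        if rule.source != "0.0.0.0/0":
            continue
        for port in ADMIN_PORTS:
            if rule.port_min <= port <= rule.port_max:
                exposed.add(port)
    return sorted(exposed)


if __name__ == "__main__":
    rules = [InboundRule("0.0.0.0/0", 22, 22), InboundRule("10.0.0.0/8", 3389, 3389)]
    print("ports open to the internet:", exposed_admin_ports(rules))
```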
Leveraging the Security and Compliance Center will help security teams ensure that they have a strong security posture and that their deployments in IBM Cloud are configured to best practises, helping them avoid costly security breaches.
I frequently get a lot of questions on how to go about setting up a VPC in IBM Cloud so I decided to make a video out of it.
In the video below I step through how to create a VPC, how to deploy 3 Virtual Server Instances across multiple Availability Zones to run Apache and lastly how to expose those web servers to the internet using a VPC load balancer.
I’ll be following this up with videos on how to expand this VPC with other services and connect to a Classic VMware environment.
In an earlier post, I wrote about how I needed to create a squid proxy server to get access to the internet from a server in my IBM Cloud Classic private network. What I want to do in this post is dive a bit deeper into the Classic network architecture and how that compares to Virtual Private Cloud (VPC).
To recap, IBM Cloud Classic has separate private and public networks within a customer account. Servers that get deployed are placed into VLANs on the private network. They can optionally have VLANs for the public network connected to the public side. Having a public VLAN on the server means that the server also gets an internet routable IP address. From a security standpoint, security groups or a network perimeter firewall should be used as a layer of network protection from connections from the internet.
Typically, what I recommend to most clients is to not use the public VLAN on the servers. While it could be secured as mentioned above, I (and most security teams that I deal with) do not feel comfortable with an internet routable IP on the servers directly. If an admin makes a mistake such as removing the security group or removing a public VLAN from a firewall, that server becomes exposed.
My recommendation is to have the servers connected to the private VLANs only and have all traffic to or from the internet be NATted by the firewall which is connected to the public VLAN. More complex environments may have multiple firewalls.
Having separate networks allows for servers to be physically disconnected from the internet. This has its benefits from a security point of view but does take some time to set up if internet connectivity is needed since the cloud administrator would need to configure the firewall device for NAT. The firewall can also become a bottleneck depending on throughput requirements because it is deployed on dedicated hardware. More firewalls may be needed as the environment scales out.
There are a few use cases for why you would put the public VLAN directly on the server:
The server is in a network DMZ
An application does not work well with NAT
Extra bandwidth for bandwidth pooling
The first two bullets are straightforward while the last one is more from a billing perspective. IBM Cloud Classic networking gives a specific amount of free internet egress allotment when servers are deployed with public interfaces. This can range from 250GB to 20TB per server. These allotments can be pooled to be shared by all the servers in the region. Many customers never get internet egress charges since their usage falls within the free allotment.
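The effect of pooling is simple arithmetic, sketched below. The function names and the example figures are illustrative only; actual allotments and overage pricing depend on the servers ordered.

```python
from typing import List


def unpooled_overage_gb(allotments_gb: List[float], usage_gb: List[float]) -> float:
    """Billable egress when each server is measured against its own allotment."""
    return sum(max(0, used - free) for free, used in zip(allotments_gb, usage_gb))


def pooled_overage_gb(allotments_gb: List[float], usage_gb: List[float]) -> float:
    """Billable egress when all allotments are shared across the servers."""
    return max(0, sum(usage_gb) - sum(allotments_gb))


if __name__ == "__main__":
    allotments = [250, 250]   # GB of free egress per server
    usage = [400, 50]         # GB actually sent by each server
    print("without pooling:", unpooled_overage_gb(allotments, usage), "GB billable")
    print("with pooling   :", pooled_overage_gb(allotments, usage), "GB billable")
```

In this example one busy server would exceed its own allotment, but the quiet server’s unused allotment covers the difference once the two are pooled.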
One of the main challenges with Classic networking as mentioned is getting it set up in the first place. For most customers with steady-state workloads, it is a one-time setup. For customers that are looking to build and tear down environments, some further configuration may often be needed. For example, if there is a new VLAN that is created in the cloud to isolate new servers for a specific project, the configuration needs to be added to the firewall to protect that VLAN.
Another challenge in Classic networking is that it automatically assigns IP subnets to the customer account from the 10.0.0.0/8 address space. This does not work for most enterprise customers, who typically need to use their own address ranges. Custom addresses require configuration on the firewall to create an overlay network.
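One way to see why the assigned range can be a problem is to check it against the networks an enterprise already uses. This is a rough sketch using Python's standard `ipaddress` module; the on-premises networks listed are hypothetical, and an overlap with 10.0.0.0/8 only signals a potential collision, not a guaranteed one.

```python
import ipaddress

# Range that Classic assigns private subnets from, per the text above.
CLASSIC_RANGE = ipaddress.ip_network("10.0.0.0/8")

# Hypothetical on-premises networks, for illustration only.
on_prem_networks = ["10.20.0.0/16", "172.16.8.0/24", "192.168.1.0/24"]


def conflicting_networks(networks, cloud_range=CLASSIC_RANGE):
    """Return the networks that overlap the cloud-assigned range."""
    return [n for n in networks if ipaddress.ip_network(n).overlaps(cloud_range)]


print(conflicting_networks(on_prem_networks))
```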
This is where VPC networking comes in. VPC allows customers to create their cloud environment on top of the IBM Cloud network. Where Classic networking is built using physical appliances, VPC uses logical components.
For example, if I were to deploy a Virtual Server Instance (VSI) in a VPC and needed to have outbound internet access, I would not have to deploy a physical firewall device to perform NAT like in Classic. I could activate the public gateway in the VPC to perform NAT. It can be activated in seconds at the click of a button or using automation tools such as Terraform. This is important because it allows customers to be more agile and set up environments quicker than they could with Classic.
There is also no restriction on private addresses that can be used; customers define the subnets that they want the servers to be provisioned on without workarounds such as using an overlay network like in Classic.
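As a small planning illustration, the sketch below carves a customer-chosen prefix into one subnet per zone. The prefix and zone names are hypothetical, and this is only address arithmetic with Python's `ipaddress` module, not a call to any IBM Cloud API.

```python
import ipaddress

# Hypothetical address prefix chosen by the customer for the VPC.
vpc_prefix = ipaddress.ip_network("172.20.0.0/22")

# Hypothetical zone names; one /24 subnet is carved out per zone.
zones = ["zone-1", "zone-2", "zone-3"]

# Pair each zone with the next available /24 inside the prefix.
subnet_plan = dict(zip(zones, vpc_prefix.subnets(new_prefix=24)))

for zone, subnet in subnet_plan.items():
    print(zone, subnet)
```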
Overall, VPC is a significant improvement over Classic for cloud networking, but how would you decide which to use? For a new deployment, I would recommend VPC. But that may not be possible. VPC in its current form does not have complete feature parity with Classic in the services it supports. As of this writing, VSIs work in both Classic and VPC. Bare Metal servers and VMware solutions sit in Classic only. The Kubernetes Service clusters can sit in both, but only by using VSI worker nodes in VPC. Eventually, I expect all these services to be available in VPC.
VPC is also still being deployed to regions worldwide. Today it is targeted at Multi-Zone Regions (MZRs). These are the main cloud regions that have multiple Availability Zones (AZs) in a geographic region, which means these regions get new services first. Single-Zone Regions (SZRs, with one AZ in a region) are Classic only today. So, depending on geographic or data residency requirements, deploying in an SZR with Classic may be a requirement.
For enterprises that start with a Classic environment and want to have new deployments in VPC, it is possible to connect Classic and VPC networks together using the Transit Gateway service. This would be a common pattern as new modernized workloads run in VPC on the Kubernetes Service, while still needing to access data and legacy applications running in Classic.
In a future post, I will show how to create a VPC and set up connectivity between it and a VMware cluster running in a Classic environment.
Squid is an open-source proxy server that can support a wide variety of protocols such as HTTP and HTTPS. One of its uses is as a cache in front of large websites to accelerate the delivery of content. In my use case I used it as a forward proxy for outgoing HTTP requests to the public internet.
IBM Cloud Classic has a physically separated network for private traffic and public internet traffic. This physical separation allows clients to securely deploy workloads solely onto the private network with no ability for access to come from the internet. Typically in these scenarios, all network traffic would come through client Direct Links into the private network.
I recently needed to set up an HTTP proxy for a server that was on a private VLAN on the IBM Cloud Classic network. The server I had deployed had no access to the internet but needed to make an HTTP REST call to an endpoint on the internet to activate a software license. Since this was going to be a temporary requirement, I decided to set up a Squid proxy server on a CentOS Linux virtual server instance (VSI) in order to provide internet access.
This VSI would have interfaces on both the private and public networks, allowing it to receive traffic from my server on the local cloud network and make requests to the internet. A few moments later, when the VSI finished deploying, I could see the public and private IPs of my Squid VSI.
I then set up Security Groups to block incoming traffic from the public interface to the VSI as a security precaution.
Setting up a basic Squid proxy on CentOS 8 for my use case is straightforward. Once connected to the VSI with SSH, run the following command as root or with sudo:
dnf install squid
Once Squid is installed, edit the /etc/squid/squid.conf configuration file. This file has several default networks already set under the ACL for localnet. These can be commented out. Since I wanted the proxy to be usable only by my server, I added its IP address specifically.
acl localnet src 10.141.20.100/32
Once done, save and quit the file. Then restart the Squid service with the command:
systemctl restart squid
Squid uses the default port of 3128. This port can be changed in the configuration file but it is not required to do so.
Using the private address of the Squid VSI and port 3128, I configured the proxy settings of the application on my server. It was able to make outgoing requests to the internet and activate the application license. Once I was done with the proxy I deleted the VSI, cutting off any public network access for my server. And because this is public cloud, I only paid a penny for that VSI for the hour I used it.
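For an application that does not have its own proxy settings, the same idea can be expressed in code. Below is a minimal sketch using Python's standard `urllib.request`; the proxy address is a hypothetical private IP for the Squid VSI, and routing HTTPS through the same proxy is an assumption rather than something my use case required.

```python
import urllib.request

# Hypothetical private IP of the Squid VSI; 3128 is Squid's default port.
SQUID_PROXY = "http://10.141.20.50:3128"


def build_proxy_opener(proxy_url=SQUID_PROXY):
    """Return an opener that sends HTTP and HTTPS requests through the proxy."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)


opener = build_proxy_opener()
# Example call (requires a reachable proxy, so it is left commented out):
# with opener.open("http://example.com/", timeout=10) as response:
#     print(response.status)
```

Building the opener does not touch the network; a request is only sent to the proxy when `opener.open` is called.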
In a follow-up post, I will show you a more in-depth walkthrough on deploying a three-tier application with VSIs.
Now I know what some of you may be thinking: if you are going to move workloads to a public cloud, why would you need vSphere anymore? Today I will give you a few reasons to consider VMware vSphere as part of your cloud strategy.
Server virtualization may not be the hot technology that everyone is talking about these days. VM hypervisors are a commodity today. Containers, serverless, and cloud-native services are the future for greenfield or modernized workloads. And yet the 800-pound gorilla in this space, VMware’s vSphere, is a robust, time-tested platform that forms the cornerstone on which most organizations run their x86 workloads today. When I talk to clients, it is not uncommon to discover that they are running hundreds or even thousands of VMware virtual machines in their data centers.
With more and more enterprises looking to either shift or deploy new workloads on a public cloud, VMware’s place in the data center may not be as key in the future as it once was. Even if your workloads are not modernized and still running as virtual machines (and today that is still most enterprises), these can be run as virtual machines in a public cloud without VMware. All the major public clouds have a cloud-native VM capability that allows customers to run their Windows and Linux virtual machines without the management or licensing of a hypervisor. So if you can deploy your virtual machine workloads on an IBM Cloud VSI or AWS EC2 instance, where is the place for VMware in your cloud strategy for running workloads? For me, there are three key reasons.
Firstly, migrating workloads onto a public cloud, or getting them back out, is not always straightforward. Workloads should be portable. For enterprise clients with complex legacy workloads, having to convert those workloads to whichever hypervisor format the cloud provider uses can complicate the migration and increase the risk of something not working or performing correctly. Using VMware vSphere on a public cloud provider reduces that risk since the virtual machines stay in the same format as they were. VMware also has tools like HCX to do things like stretch the on-premises network to the cloud, simplifying the migration for workloads that have complex dependencies. Keeping the virtual machines in VMware format also has a secondary benefit of making it easy to get them back out of the cloud provider without conversion or export.
Secondly, running VMware in the cloud reduces the amount of operational process and tooling change required for Day 2 operations. Running cloud-native services requires a change in organizational processes and a certain level of cloud maturity. While the level of control that clients get over VMware-based solutions in a public cloud will differ depending on the cloud provider, I can say that IBM Cloud for VMware solutions allows clients to have root access and control of vCenter. This level of access enables clients to bring and continue to use the tools and processes that they are already using.
Lastly, running VMware on a public cloud allows organizations to take advantage of some of the benefits of the public cloud, such as an OpEx model and on-demand resource scalability with a minimal commitment, for workloads that may not scale out as well as they scale up. When running VMware on a public cloud, you are generally paying per host in the cluster. This model allows you to scale down the number of hosts in the vSphere cluster and run denser during off-peak times of the year, and then scale up quickly during peak months by adding hosts. In an on-premises environment, I would have to size my cluster for peak usage. Being able to scale out a cluster and scale up the legacy workload within it during those peak times brings some of those cloud benefits to otherwise traditional workloads that don’t work well with an application scale-out model. This type of cluster scale-out approach also works well for disaster recovery use cases.
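To illustrate the per-host model, here is a back-of-the-envelope sketch comparing a cluster that follows demand month by month with one sized for peak all year. Every figure in it (vCPUs per host, minimum cluster size, monthly demand) is a hypothetical assumption chosen only to show the arithmetic.

```python
import math

# All figures are hypothetical and only illustrate the per-host billing model.
VCPUS_PER_HOST = 64   # assumed usable vCPUs per host
MIN_HOSTS = 3         # assumed minimum cluster size

# Assumed vCPU demand for each month of the year, with a year-end peak.
monthly_demand_vcpus = [150] * 9 + [250, 350, 350]


def hosts_needed(vcpus):
    """Return the number of hosts required to serve the given vCPU demand."""
    return max(MIN_HOSTS, math.ceil(vcpus / VCPUS_PER_HOST))


# Cluster resized every month to follow demand.
elastic_host_months = sum(hosts_needed(v) for v in monthly_demand_vcpus)

# Cluster sized for the peak month and left that way all year.
peak_sized_host_months = hosts_needed(max(monthly_demand_vcpus)) * len(monthly_demand_vcpus)

print(elastic_host_months, peak_sized_host_months)
```

With these assumed numbers, the demand-following cluster consumes noticeably fewer host-months than the peak-sized one, which is the saving the per-host model makes possible.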
Now I have provided several reasons why you should consider VMware vSphere as part of your cloud strategy, but this does not mean it should be the end goal. Just because you may move your workloads into a public cloud with VMware does not mean that is where the journey ends. Running VMware on a public cloud should be a stepping stone to getting onto that public cloud while reducing risk and simplifying operations. Once in the public cloud, start looking at parts of these workloads and begin modernizing where it makes sense with containers or cloud-native services. This becomes much simpler to do when the workloads and dependencies are all in the cloud together.
Now that I have given you some reasons to consider vSphere as part of your public cloud strategy, let’s see how this could actually work. In a follow-up post, I will show how some of the concepts I talked about can be put into practice. Stay tuned.