VMware Cloud on AWS

Almost a year after the announcement, VMware Cloud on AWS is available to customers. At VMworld 2017, both companies highlighted the benefits of the partnership. Businesses already using the VMware stack can easily extend their virtualized data center to Amazon’s public cloud. When it comes to infrastructure, VMware can ride on top of Amazon’s global footprint. Customers across the globe can choose a region close to their data center for public cloud migration. When Amazon announces a new region, VMware can piggyback on it without the CapEx or the management expertise. This is a huge win for VMware and its ecosystem. But this write-up is about the nuts and bolts of the solution and how it affects our day-to-day operations. VMware Cloud on AWS has three components:

  1. Compute (Virtualized) – ESXi
  2. Storage (Virtualized) – vSAN
  3. Network (Virtualized) – NSX


All of these are managed by vSphere. This is an on-demand service which delivers Software-Defined Data Centers (SDDC) as a cloud service. Click a button in the console or make an API call and you can deploy a complete, running VMware cloud in AWS with all of the software-defined components mentioned above installed, configured and ready to use. VMware maintains and manages these components for you, so it patches and upgrades all of them. If you add a new host to a cluster, ESXi is already configured on the host; the same goes for vSAN and NSX. Since this runs on AWS infrastructure, it has dynamic capacity in terms of compute, storage and network.


VMware Cloud on AWS is deployed directly on bare metal inside an AWS EC2 environment. So it is not nested virtualization; it is all ESXi sitting directly on bare-metal servers. The hardware servers in use have the following specifications:

  • I3.16XL Equivalent
  • 36 cores / 72 vCPUs
  • 512 GiB  RAM
  • 15 TiB NVMe All Flash Memory Storage
  • 25 Gb ENA (network)

This is almost the same ESXi software that you would run on-premises. You can start with as few as 4 hosts in a cluster and go up to 32 hosts per cluster, and a single customer can have multiple clusters. The hosts are maintained by VMware, so there is no direct SSH/root access to an ESXi host, and you cannot install VIBs or third-party plugins on it.


From a storage perspective, VMware uses vSAN, which aggregates the local storage of each host and, once a suitable RF (replication factor) setting is applied, provides the usable capacity for VMs. We cannot attach EBS or EFS to the hosts as datastores; the existing NVMe drives make up the aggregate vSAN storage pool. We can, however, add EFS volumes to the VMs as NAS shares if need be. The usual VMware storage policies still apply, so you can create individual storage policies to choose the level of redundancy (for example mirroring or parity-based protection) that is set for each VM.
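To make the link between raw and usable capacity concrete, here is a minimal Python sketch. It assumes the commonly quoted overhead multipliers for mirroring and erasure coding and uses the 15 TiB per host figure from the hardware list above; the function and policy names are hypothetical, and the estimate ignores cache devices, slack space, deduplication and compression.

```python
# Raw capacity consumed per unit of usable capacity for each protection scheme.
# These multipliers are assumptions for illustration, not values read from
# any VMware API.
OVERHEAD = {
    "RAID-1, FTT=1": 2.0,    # two full copies of the data
    "RAID-1, FTT=2": 3.0,    # three full copies of the data
    "RAID-5, FTT=1": 4 / 3,  # 3 data + 1 parity
    "RAID-6, FTT=2": 1.5,    # 4 data + 2 parity
}


def usable_tib(hosts: int, raw_tib_per_host: float, policy: str) -> float:
    """Estimate usable capacity in TiB for a cluster under one policy."""
    return hosts * raw_tib_per_host / OVERHEAD[policy]


if __name__ == "__main__":
    for policy in OVERHEAD:
        print(f"{policy}: {usable_tib(6, 15.0, policy):.1f} TiB usable")
```

The point of the sketch is only that the same pool of NVMe drives yields very different usable capacity depending on the policy chosen per VM.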


NSX is used for network virtualization, which basically creates logical networks. These do not run directly inside an AWS subnet, so VMs are attached not to an AWS subnet but to an overlay network. You can create Layer 2 networks, which are connected to the Compute and Management Gateways. The Compute Gateway is basically a VM that provides gateway services for all your compute nodes, while the Management Gateway manages and controls the NSX control center and vCenter traffic. The gateways act like an IGW (Internet Gateway) in AWS, except that they do a few additional things: they also act as termination points for IPsec VPN tunnels, perform NAT and handle the north-south firewalling.


This is the best part about VMware Cloud on AWS: an IT administrator does not need to learn a new tool, since it is vSphere, which he or she has been managing for ages. It is managed by VMware, it is its own single sign-on domain, and you are delegated rights to an account that allows you to manage your workloads. VMware introduced a new feature called Hybrid Linked Mode, which allows you to connect the single sign-on domain running inside VMware Cloud on AWS to your on-premises environment.

So if you look at the big picture, the whole setup looks like a much-awaited #HybridCloud. It has three pillars, namely the customer DC (on-premises), VMware Cloud on AWS, and the AWS Cloud.

Let’s talk a little about accounts, since there are two different accounts in play when you manage VMware Cloud on AWS. When you sign up for the service, VMware creates a brand new AWS account. It is owned and operated by VMware, VMware pays for it, and you as a customer have no visibility into it. VMware uses this account to create and run all the SDDC resources needed for the VMware Cloud on AWS environment. This account is called the VMware Cloud SDDC Account. The second account is your own AWS account. It is owned, operated and paid for by you as a customer, and it can have private connectivity to VMware Cloud on AWS. It runs all native AWS services and you pay its bill to AWS, whereas for the VMware Cloud SDDC Account you pay the bill to VMware.

Getting Started:

  1. Go to https://vmc.vmware.com/, the VMware Cloud on AWS console.
  2. Log in using your my.vmware.com credentials; you can then create organizations.
  3. VMware also has Identity and Access Management (not the same as AWS IAM, but similar to it). Here you can create your users and groups, assign permissions to users, and so on.
  4. Create a new SDDC, by giving a new SDDC name.
  5. Choose number of hosts (4 – 32).
  6. Choose the AWS region in which the SDDC will run (AWS EU (London) region, AWS US East (N. Virginia) region or AWS US West (Oregon) region).
  7. Connect VMware Cloud on AWS to your existing AWS account.
  8. Connect VMware Cloud on AWS to your existing on-premise VMware account.
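The choices made in steps 4 to 6 can be sanity-checked before anything is submitted. The sketch below is purely illustrative: `SddcRequest` and `validate` are hypothetical names and do not correspond to the VMware Cloud console or its API. The only facts it encodes are the host range and the three regions listed above, mapped to their AWS region codes.

```python
from dataclasses import dataclass
from typing import List

# Regions listed in the steps above, keyed by their AWS region codes.
SUPPORTED_REGIONS = {
    "eu-west-2": "EU (London)",
    "us-east-1": "US East (N. Virginia)",
    "us-west-2": "US West (Oregon)",
}
MIN_HOSTS, MAX_HOSTS = 4, 32


@dataclass(frozen=True)
class SddcRequest:
    """Hypothetical container for the SDDC name, host count and region."""
    name: str
    num_hosts: int
    region: str


def validate(req: SddcRequest) -> List[str]:
    """Return a list of problems; an empty list means the request looks sane."""
    problems = []
    if not req.name.strip():
        problems.append("SDDC name must not be empty")
    if not MIN_HOSTS <= req.num_hosts <= MAX_HOSTS:
        problems.append(f"host count must be between {MIN_HOSTS} and {MAX_HOSTS}")
    if req.region not in SUPPORTED_REGIONS:
        problems.append(f"unsupported region: {req.region}")
    return problems
```

For example, `validate(SddcRequest("demo", 4, "us-west-2"))` returns an empty list, while a two-host request is rejected.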

Once this is all done, we can manage the resources in our SDDC in VMware Cloud on AWS via vmc.vmware.com or via the vSphere HTML5 Web Client. Remember, the whole SDDC is delivered as a service, so:

  1. AWS manages the physical resources (servers, DC, hardware, cooling, power etc.).
  2. VMware manages the hypervisor and management components.
  3. You manage the VMs and applications running on them.

Access via vCenter is through a delegated permission model, so you do not have root access; instead you get a cloud admin account with delegated rights.

Use Cases

  1. Expansion of current DCs without buying new hardware – Disaster recovery, backup and continuity of operations.
  2. Consolidation and Migration – data center consolidation and migration, application migration, getting out of the on-premises DC completely.
  3. Workload Flexibility – Prod, Dev, Test, Lab and Training, Burst Capacity for new application and workloads.




Object Storage Demystified

I’ve seen a few definitions and watched a few presentations, and I’ve never really been able to easily and clearly articulate what object storage actually is! We all know it is an architecture that manages data as objects (rather than in blocks/sectors or a hierarchy), but I never really understood what an object was…! It might just be me being stupid, but after a bit of reading I understood it a lot better once I understood the characteristics of an object, e.g.:

  • An object is independent of the application, i.e. it doesn’t need an OS or an application to make sense of the data. This means that users can access the content (e.g. JPEG, video, PDF etc.) directly from a browser (over HTTP/HTTPS) rather than needing a specific application. No app servers are required, which dramatically improves simplicity and performance (of course you can still access object storage via an application if needed).
  • Object storage is globally accessible i.e. no requirement to move or copy data (locations, firewalls etc)… instead data is accessible from anywhere
  • Object storage is highly parallelized. There are no locks on write operations, so hundreds of thousands of users distributed around the world can all write simultaneously; none of the users need to know about one another and their behavior will not impact others. This is very different from traditional NAS storage where, if you want the data available in a secondary site, it needs to be replicated to another NAS platform which sits passive and cannot be written to directly.
  • Object storage is linearly scalable i.e. there is no point at which we would expect performance to be impacted, it can continue to grow and there is no need to manage around limitations or constraints such as capacity or structure.
  • Finally, it’s worth noting that object platforms are extensible. All this really means is that their capabilities can be extended easily without large implementation efforts; examples in this context are the ability to enrich data with metadata and to add policies such as retention, protection and where data cannot live (compliance).

Object storage is a way to organize data by addressing and manipulating discrete units of data called objects. Each object, like a file, is a stream of binary data. However, unlike files, objects are not organised in a hierarchy of folders and are not identified by their path in the hierarchy. Each object is associated with a key, a string assigned when the object is created, and you retrieve an object by using the key to query the object storage. As a result, all of the objects are organized in a flat namespace (one object cannot be placed inside another object). Such organisation eliminates the dependency between objects but retains the fundamental functionality of a storage system: storing and retrieving data. The main benefit of such organisation is a very high level of scalability.
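The flat, key-addressed model described above can be sketched in a few lines of Python. This is a toy in-memory illustration, not any vendor’s implementation; the class and method names are made up for the example.

```python
import uuid
from typing import Optional


class FlatObjectStore:
    """Toy in-memory object store: one flat namespace, no folders."""

    def __init__(self) -> None:
        self._objects = {}  # key -> (data bytes, metadata dict)

    def put(self, data: bytes, key: Optional[str] = None, **metadata) -> str:
        """Store data under the given key, or under a generated one."""
        key = key or uuid.uuid4().hex
        self._objects[key] = (data, dict(metadata))
        return key

    def get(self, key: str) -> bytes:
        """Retrieve the object's data by key alone."""
        return self._objects[key][0]

    def head(self, key: str) -> dict:
        """Return only the metadata stored with the object."""
        return dict(self._objects[key][1])


store = FlatObjectStore()
# The slashes are just characters in the key; no directory is created.
key = store.put(b"...jpeg bytes...", key="photos/2017/cat.jpg",
                content_type="image/jpeg")
```

Because every lookup is a single key query with no parent folder to traverse or lock, objects do not depend on each other, which is where the scalability comes from.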

Both files and objects have metadata associated with the data they contain, but objects are characterized by their extended metadata. Each object is assigned a unique identifier which allows a server or end user to retrieve the object without needing to know the physical location of the data. This approach is useful for automating and streamlining data storage in cloud computing environments. S3 and Swift are the most commonly used cloud object protocols. Amazon S3 (Simple Storage Service) is an online file storage web service offered by Amazon Web Services, while Swift comes from OpenStack, a free and open-source software platform for cloud computing. The S3 protocol is the most commonly used object storage protocol, so if you’re using third-party applications that use object storage, it is the most compatible choice. Swift is a little less widely used than S3, but it is still a very popular cloud object protocol. S3 was developed by AWS and its API is open to third-party developers. The Swift protocol is managed by the OpenStack Foundation, a non-profit corporate entity established in September 2012 to promote OpenStack software and its community. More than 500 companies have joined the project. Below are some major differences between S3 and Swift.

Unique features of S3:

  • Bucket-level controls for versioning and expiration that apply to all objects in the bucket
  • Copy Object – This allows you to do server-side copies of objects
  • Anonymous Access – The ability to set PUBLIC access on an object and serve it via HTTP/HTTPS without authentication.
  • S3 stores its objects in a bucket.

Unique features of SWIFT

The Swift API supports creating objects of unknown size: of the two, Swift is the only protocol where you can use “chunked” encoding to upload an object whose size is not known beforehand. S3 requires multiple requests to achieve this. Swift stores its objects in “containers”.
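“Chunked” here refers to HTTP/1.1 chunked transfer encoding, where the body is sent as a series of length-prefixed chunks so that no total size has to be declared up front. The sketch below shows only the wire framing; `chunked_encode` is a hypothetical helper name, and a real client would normally let its HTTP library do this.

```python
from typing import Iterable, Iterator


def chunked_encode(parts: Iterable[bytes]) -> Iterator[bytes]:
    """Frame each part as an HTTP/1.1 chunk: hex length, CRLF, data, CRLF."""
    for part in parts:
        if part:  # a zero-length chunk would end the body early, so skip it
            yield b"%x\r\n%s\r\n" % (len(part), part)
    yield b"0\r\n\r\n"  # the zero-length chunk terminates the body


# The producer can be any generator, so the total size is never needed.
body = b"".join(chunked_encode([b"hello", b"world!"]))
```

The uploader only learns the final object size when the terminating chunk is sent, which is exactly the “unsized object” case.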

Authentication (S3 vs SWIFT)

S3 – Amazon S3 uses an authorization header that must be present in all requests to identify the user (Access Key Id) and provide a signature for the request. An Amazon access key ID has 20 characters. Both HTTP and HTTPS protocols are supported.
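As a rough illustration of what “an access key ID plus a signature” means, here is a simplified Python sketch in the style of the legacy S3 Signature Version 2 header. It is an assumption-laden sketch: it omits the canonicalized `x-amz-` headers, current AWS services use the more involved Signature Version 4, and the function name and sample secret are made up. The access key ID is the placeholder value used in AWS documentation.

```python
import base64
import hashlib
import hmac


def sign_v2(access_key_id: str, secret_key: str, verb: str, date: str,
            resource: str, content_md5: str = "", content_type: str = "") -> str:
    """Build a legacy-style 'AWS <key id>:<signature>' Authorization value."""
    # Simplified string to sign: no x-amz-* headers are included here.
    string_to_sign = "\n".join([verb, content_md5, content_type, date, resource])
    digest = hmac.new(secret_key.encode("utf-8"),
                      string_to_sign.encode("utf-8"),
                      hashlib.sha1).digest()
    signature = base64.b64encode(digest).decode("ascii")
    return f"AWS {access_key_id}:{signature}"


header = sign_v2("AKIAIOSFODNN7EXAMPLE", "not-a-real-secret", "GET",
                 "Tue, 27 Mar 2007 19:36:42 +0000", "/bucket/photo.jpg")
```

The server, which also knows the secret, recomputes the same HMAC over the request and compares it, so the secret itself never travels with the request.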

SWIFT – Authentication in Swift is quite flexible. It is done through a separate mechanism creating a “token” that can be passed around to authenticate requests. Both HTTP and HTTPS protocols are supported.

Retention and AUDIT (S3 vs SWIFT)

Retention periods are supported on all object interfaces including S3 and Swift. The controller API provides the ability to audit the use of the S3 and Swift object interfaces.

Large Objects (S3 vs SWIFT)

S3 Multipart Upload allows you to upload a single object as a set of parts. After all of these parts are uploaded, the data is presented as a single object. An OpenStack Swift large object is made up of two types of objects: segment objects that store the object content, and a manifest object that links the segment objects into one logical large object. When you download a manifest object, the contents of the segment objects are concatenated and returned in the response body of the request.
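The segment-plus-manifest idea can be sketched with a plain dictionary standing in for a container. This is a toy model of the concept, not the Swift API; the key layout and function names are assumptions made for the example.

```python
def upload_large(store: dict, name: str, data: bytes, segment_size: int) -> None:
    """Store data as segment objects plus a manifest listing them in order."""
    segment_keys = []
    for index, start in enumerate(range(0, len(data), segment_size)):
        key = f"{name}/segments/{index:08d}"
        store[key] = data[start:start + segment_size]
        segment_keys.append(key)
    # The manifest holds no content itself, only the ordered segment names.
    store[name] = {"manifest": segment_keys}


def download_large(store: dict, name: str) -> bytes:
    """Concatenate the segments named by the manifest, as a GET would."""
    return b"".join(store[key] for key in store[name]["manifest"])


container = {}
upload_large(container, "video.bin", b"abcdefghij", segment_size=4)
```

Downloading `video.bin` reads the manifest and stitches the three segments back together, so the client sees one logical object.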

So which object storage API should you use? Well, both have their benefits in specific use cases. DellEMC ECS is an on-premises object storage solution which lets users work with multiple object protocols such as S3, Swift, CAS, HTTPS, HDFS, NFSv3 etc., all in a single machine. It is built on servers with their DAS storage running the ECS software, and is also available in a software-only format which can be deployed on your own servers.


There are many benefits of using ECS as your own object storage.

Software Defined Storage for Block Workloads

It has been almost two months since I last wrote. No, I was not terribly busy, just procrastinating on my blog topics. Last week, I was lucky to be part of a meeting which had nothing to do with data protection (I didn’t know this!). The customer I met has several thousand virtual machines (OK, around 9,000 VMs) and his concern is not data protection (at least initially) but their performance. These VMs are used for running web servers, databases and Hadoop clusters, some even hold cold archives, and so on. Obviously the storage and corresponding IOPS performance required here is mammoth. Also, just to make things clear, the infrastructure runs VMware, Hyper-V, KVM, RHEV etc. They have a bunch of storage equipment as well, from almost all of the major data storage vendors. In the conversation with the customer, I learned their main concerns were cost, scalability, performance, data services, DR capabilities and integration with the ecosystem (applications, hypervisors, OS, network etc.). They had already tried almost every vendor and they were “satisfied” but not particularly happy. They were looking for something software defined which could perform like enterprise storage, or even better, at their scale.

As Wikipedia describes it, “Software-defined storage (SDS) is a term for computer data storage software for policy-based provisioning and management of data storage independent of the underlying hardware. Software-defined storage typically includes a form of storage virtualization to separate the storage hardware from the software that manages it. The software enabling a software-defined storage environment may also provide policy management for features such as replication, thin provisioning, snapshots and backup.” Software-defined storage is a key driver of data center transformation. The enterprise features, availability, performance and flexibility of a data center grade SDS make it perfect for traditional array consolidation, private cloud/IaaS, and newer approaches like DevOps and container microservices. Since SDS is hardware agnostic (it does not depend on the type of drive, disk or network), it can take advantage of new hardware releases immediately. Therefore, with SDS you can leverage newer hardware in the market (such as NVMe drives) and the performance and acceleration advancements it brings.

Well, then what are the options in the market for SDS? Before I take the plunge into this topic, I want to clarify that in this blog I will only be referring to software-defined block storage (I will leave file and object for some other day). If you perform a quick Google search, you will find almost everyone claiming the throne of the SDS block kingdom. Before we choose a winner, I want to reiterate the requirements so that we can judge wisely. We need the following attributes:

  • Scalability: not terabytes (come on, that was the requirement in the late 2000s) but petabytes and zettabytes.
  • Performance: on almost all block sizes, not just 8K, 16K etc. This is important because different applications deliver their best results at different block sizes, and the storage should adapt to them.
  • COTS enabled: can I deploy the storage on ordinary servers?
  • Data services: snapshots, compression, replication, encryption etc.
  • Integration with the ecosystem: support for all OSes, hypervisors, container systems, microservices etc.

That seems a lot to ask from a single product, but this is how the dice roll when it comes to block storage requirements. But hadn’t we already solved all these issues with traditional SAN systems? Well, only for a while: as the scale of IT infrastructure grows, stateless systems managed by microservices are needed more and more for running “newer” applications and optimizing existing ones.

We all want what we can’t have; that is normal human nature. Here it is a single globally distributed, unified storage system that is infinitely scalable, easy to manage, replicated between several data centers and serves block devices, file systems and objects, all without any issues and while delivering data services such as compression and snapshots. However, this is not really possible, not at scale. The point is that some storage systems are built for IOPS, some for scale, some just for sprawl. With these different requirements it becomes extremely difficult to code storage for all the use cases. Adding different data services just increases the data hops between the different daemons involved, reducing performance. As far as I believe, with present technology and trends it is difficult to build storage which does it all. It is not that unified storage does not work, but when you need performance, purpose-built is the way to go. I was a storage admin for some time in my earlier life and I acknowledge that managing unified storage is much easier and simpler than managing multiple purpose-built appliances. But then again, if I need block performance, I would bet my life on a purpose-built software-defined storage for block.

ScaleIO and Ceph are the two most valid candidates to hold the baton for software-defined block storage, but who is the real winner in terms of the attributes mentioned above? I will try to demystify this at the level of architecture and usability. So here is what Ceph delivers in a single piece of software, in a single go…

  • Scalable distributed block storage
  • Scalable distributed object storage
  • Scalable distributed file system storage
  • Scalable control plane that manages all of the above

To sweeten the deal, all of this is free, for almost any capacity (well, this depends; if you are a storage admin, you know what I mean). This is the Holy Grail of storage (almost!), and it is all OPEN SOURCE. But as a technologist, if you look underneath the skin, remove the flesh and understand the skeleton of the software, there are a lot of things happening here. Let’s check what Ceph has in its kitty. As I mentioned earlier, the fundamental problem with any multi-purpose tool is that it makes compromises in each “purpose” it serves. The reason is simple: a multi-purpose storage system like Ceph is designed to do many things, and different things interfere with each other. It’s like asking a toaster to toast (which is fine) and also to fry your steak (all the best with that!). With present technology and coding it is possible, but then there are some “TRADE-OFFS”. Ceph’s trade-off, as a multi-purpose tool, is the use of a single “object storage” layer. You have a block interface (RBD), an object interface (RADOSGW) and a filesystem interface (CephFS), all of which talk to an underlying object storage system (RADOS). Here is the Ceph architecture from their documentation:


RADOS itself is reliant on an underlying file system to store its objects. So the diagram should actually look like this:

So in a given data path, for example a block written to disk, there is a high level of overhead:

In contrast, a purpose-built block storage system that does not compromise and is focused solely on block storage, like DellEMC ScaleIO, can be significantly more efficient:


(Here, SDC is ScaleIO Data Client which hosts the application which requires the IOPs and SDS is ScaleIO Data Server which pools the storage from multiple other SDS machines. A single server can act as both SDC and SDS.) This allows skipping two steps, but more importantly, it avoids complications and additional layers of indirection/abstraction as there is a 1:1 mapping of the ScaleIO client’s block and the block(s) on disk in the ScaleIO cluster. By comparison, multi-purpose systems need to have a single unified way of laying out storage data, which can add significant overhead, even at smaller scales.  Ceph, for example, takes any of its “client data formats” (object, file, block), slices them up into “stripes”, and distributes those stripes across many “objects”, each of which is distributed within replicated sets, which are ultimately stored on a Linux file system in the Ceph cluster.  Here’s the diagram from the Ceph documentation describing this:
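The striping step described above can be sketched in Python to show how much address arithmetic sits between a client offset and an object. This is a simplified sketch: the parameter names follow the terms used in the Ceph documentation (stripe unit, stripe count, object size), but the function itself is hypothetical, and it stops before the further mapping of objects to placement groups and disks.

```python
from typing import NamedTuple


class Placement(NamedTuple):
    object_no: int      # which object holds the byte
    object_offset: int  # byte offset inside that object


def locate(offset: int, stripe_unit: int, stripe_count: int,
           object_size: int) -> Placement:
    """Map a byte offset in the client's data to (object, offset in object)."""
    units_per_object = object_size // stripe_unit
    unit_no = offset // stripe_unit        # which stripe unit overall
    stripe_no = unit_no // stripe_count    # which row of stripe units
    stripe_pos = unit_no % stripe_count    # column, i.e. object within the set
    object_set = stripe_no // units_per_object
    object_no = object_set * stripe_count + stripe_pos
    object_offset = ((stripe_no % units_per_object) * stripe_unit
                     + offset % stripe_unit)
    return Placement(object_no, object_offset)
```

With a stripe unit of 4 bytes, a stripe count of 4 and 16-byte objects, `locate(17, 4, 4, 16)` lands in object 0 at offset 5, and offset 64 is the first byte of object 4 in the next object set. Every block I/O pays for this translation, and then for the object-to-disk mapping below it.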


This is a great architecture if you are going to normalize multiple protocols, but it’s a terrible architecture if you are designing for high-performance block storage only. The reason is simple enough: there will be just too many calculations and “INSIDE IOPS” for a heavy transactional workload. In terms of latency, Ceph’s situation gets much grimmer: its latency is incredibly poor, almost certainly due to its architectural compromises.

DellEMC ScaleIO is software that creates a server-based SAN from application servers’ storage (local or network storage devices). ScaleIO delivers flexible, scalable performance and capacity on demand and integrates storage and compute resources, scaling to hundreds of servers (also called nodes). As an alternative to traditional SAN infrastructures, ScaleIO combines hard disk drives (HDD), solid state disks (SSD), Peripheral Component Interconnect Express (PCIe) flash cards and NVMe drives to create a virtual pool of block storage with varying performance tiers. As opposed to traditional Fibre Channel SANs, ScaleIO has no requirement for a Fibre Channel fabric between the servers and the storage. This further reduces the cost and complexity of the solution. In addition, ScaleIO is hardware-agnostic and supports both physical and virtual application servers.


It creates a Software-Defined Storage (SDS) environment that allows users to exploit the unused local storage capacity in any server. ScaleIO provides a scalable, high-performance, fault-tolerant distributed shared storage system. Once again, it can be installed on VMware, Xen, Hyper-V, bare-metal servers and so on; you get the vibe.

There are other problems besides performance with a multi-purpose system. The overhead I outlined above also means the system has to be hefty just to do its internal jobs; every new task or purpose it takes on adds overhead in terms of business logic, processing time and resources consumed. In most common configurations, ScaleIO, being purpose-built, takes less of the host system’s resources such as memory and CPU. Ceph takes significantly more resources than ScaleIO, making it a very poor choice for hyper-converged, semi-hyper-converged and scale-out deployments. This means that if you built two separate configurations of Ceph and ScaleIO designed to deliver the same performance levels, ScaleIO would have significantly better TCO, just factoring in the cost of the more expensive hardware required to support Ceph’s heavyweight footprint. So purpose-built software not only promises and delivers performance but is also cost effective. I stumbled upon an old YouTube video (https://www.youtube.com/watch?v=S9wjn4WN4tE) showcasing how, on block storage, ScaleIO performs better than Ceph on similar compute resources. If you watch the video in its entirety, it clearly shows that ScaleIO exploits the underlying resources much more efficiently, making it more scalable over time.

If you want to build a relatively low-cost, high-performance, distributed block storage system that supports bare metal, virtual machines and containers, then you need something purpose-built for block storage (Performance Matters!). You need a system optimized for block: ScaleIO. If you haven’t already, check out ScaleIO, which is free to download and use at whatever size you want. Installing ScaleIO is very easy and it can be up and running in less than 10 minutes. Run these tests yourself. Report the results if you like. Here is some documentation for ScaleIO which I found extremely useful for understanding the way ScaleIO works: ScaleIO Architecture Guide. I will be writing more about SDS, specifically on its native data services like snapshots (as they pertain to data protection) and ways to protect it via enterprise backup software and data protection appliances.