Data Protection in VMware Cloud on AWS

Since VMware is now available on AWS, one of my customers running VMware on-premises has started playing with VMC on AWS. They like having the same management console and a similar GUI, but they have certain inhibitions, mostly around data protection in VMC on AWS. Truth be told, AWS still does not offer enterprise-level data protection services that let IT function without having a heart attack. Fortunately, Dell EMC is one of the first providers to bring cloud-enabled, self-service data protection for VMware's enterprise-class Software-Defined Data Center to the AWS Cloud. Whether expanding services on-premises or in the public cloud, Dell EMC provides the same world-class data protection with superior compression and deduplication. Dell EMC Data Protection for VMware Cloud on AWS is available as a single bundle that includes the data protection software and protection storage needed to protect your data and applications running on VMware Cloud on AWS. For those who are new to the idea of VMware Cloud on AWS, the following write-up should bring you up to speed. The schematic below shows a high-level overview of VMware Cloud on AWS:

[Figure: High-level overview of VMware Cloud on AWS]

VMware Cloud on AWS also allows you to migrate and move data to the native AWS cloud, as shown in the figure below.

[Figure: Moving data from VMware Cloud on AWS to native AWS services]

Dell EMC offers a bundle for customers wanting to protect their VMware Cloud on AWS environments that includes Dell EMC Data Protection Software and Data Domain Virtual Edition (DD Virtual Edition). Below are some salient features of the licensing bundle:

  • Pricing aligned with VMware Cloud on AWS pricing: per host, on a 1- or 3-year subscription model.
  • Best-in-class deduplication lowers cloud consumption costs.
  • vSphere integration and attractive pricing make it painless to protect VMware workloads on VMware Cloud on AWS.
  • DD Virtual Edition now scales to 96 TB, leveraging object storage for even more cost efficiency.
  • File-based backups and recoveries, VM image-based backups and restores, and more are supported.

The Dell EMC solution for data protection in VMC on AWS includes NetWorker, DDVE, CB and AVE, which customers can use as required. The solution allows backups to be taken on S3 or on EBS devices, depending on performance and cost requirements. The DellEMC DPS solution currently offers the functionality shown below.

[Figure: DellEMC DPS functionality for VMC on AWS]

Customers can also leverage DellEMC's Cloud DR functionality (spinning up VMware VMs in AWS or in VMC on AWS) in case of a disaster, which makes DR to VMC from on-premises another possible use case for customers. With this feature added, DellEMC Data Protection software allows customers:

  • to keep long-term retention data on object storage such as S3, Azure LRS, etc.
  • to leverage public clouds like Azure, AWS and VMC on AWS as a DR site
  • to run production workloads on public clouds, by providing data protection in the public cloud as well.

[Figure: DellEMC Cloud DR to VMC on AWS]

More details about DellEMC Data Protection Solution in VMC on AWS can be read here.


Why DellEMC DPS for Azure Data Protection?

Azure Backup is Microsoft's cloud-based service that you can use to back up and restore your data in Microsoft Azure. Azure Backup offers multiple ways to deploy the solution based on what you want to back up. All the options, whether for on-premises or cloud resources, back up data to a Recovery Services vault in Azure. Within the Recovery Services vault in the Azure portal, Microsoft provides a simple wizard to help determine which solution to deploy based on your needs: you simply select either On-Premises or Azure and what you want to back up, and you are given instructions for the appropriate solution. There are four primary ways to use Azure Backup, each described below:

  • Azure Backup Agent
  • System Center Data Protection Manager
  • Azure Backup Server
  • Azure IaaS VM Backup

Azure Backup Agent

Azure Backup Agent is a server-less agent that installs directly on a physical or virtual Windows Server. The servers can be on-premises or in Azure. This agent can back up files, folders and system state directly to an Azure Recovery Services vault up to 3 times per day. The agent is not application aware and can only restore at the volume level. Also, there is no support for Linux.

System Center Data Protection Manager (DPM)

DPM provides a robust enterprise backup and recovery solution with the ability to back up on-premises and in Azure. A DPM server can be deployed on-premises or in Azure. DPM can back up application-aware workloads such as SQL Server, SharePoint, and Exchange. DPM can also back up files and folders, system state, Bare Metal Recovery (BMR), as well as entire Hyper-V or VMware VMs. DPM can store data on disk, tape, or in an Azure Recovery Services vault. DPM supports backups of Windows 7 or later client machines and Windows Server 2008 R2 SP1 or later servers. DPM cannot back up Oracle, DB2 or similar workloads. Support for Linux-based machines is based on Microsoft's endorsed list found here.

Azure Backup Server

Microsoft Azure Backup Server (MABS) is essentially a slightly scaled-down version of System Center DPM, intended for customers that do not already have System Center DPM. MABS does not require any System Center licenses, but it does require an always-active Azure subscription. The primary differences between MABS and System Center DPM are as follows:

    • Does not support tape backups
    • No centralized System Center administration
    • Unable to back up another MABS instance
    • Does not integrate with Azure Site Recovery Services

Azure IaaS VM Backup

All Azure VMs can be backed up directly to a Recovery Services vault with no agent installation or additional infrastructure required. You can also back up all disks attached to a VM. This works for both Windows and Linux VMs. However, you can back up only once per day and only to Azure; on-premises backup is not supported, and VMs are only restored at the disk level.

DellEMC's approach is far ahead of what is offered by the native Azure options. We have multiple options that can be used to protect workloads in Azure:

1. Cloud Snapshot Manager for Azure:

This is a SaaS-based offering that protects Azure managed VMs on a snapshot basis in Azure object storage. Licensing is per instance on a subscription basis. No backup server or infrastructure is required for this configuration. More importantly, a customer can have multiple Azure accounts and subscriptions, all protected and managed from a single CSM console. It literally takes 5-7 minutes to start your first backup.

2. NetWorker and Data Domain / Avamar and Data Domain:

NetWorker and Avamar are DellEMC's flagship backup software; either can be deployed in an Azure VM and made to write to Data Domain Virtual Edition in Azure. NetWorker and DD allow customers to do without media servers by virtue of client direct, and since DD has a single deduplication pool the storage savings are substantial. With continued investment in DD engineering, we now have the ability to deploy DD in Azure on object storage, which allows for more storage savings than ever. And since we are leveraging NetWorker and Avamar, we get the ability to integrate with any application or database that is hosted in Azure. The main benefits of such a solution are:

    • No media servers required in Azure
    • Ability to protect data in de-duplicated format in Azure object storage
    • Wide application and database support
    • Enterprise-level backup performance and de-duplication
    • Both Data Domain and Avamar are available in the Azure Marketplace


3. Data Domain and DDBEA:

Data Domain can protect workloads without integrating with any backup software, and can protect both SQL and NoSQL databases by leveraging the BOOST and BOOSTFS integrations respectively. With the BOOSTFS tool, and with help from DellEMC, a customer can make any application write its backups to DD with source-based deduplication as well. DDVE is available up to 96 TB in a single instance in Azure, and because that 96 TB of capacity comes from Azure BLOB storage it is extremely cost efficient.
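To make the BOOSTFS workflow a little more concrete, here is a minimal Python sketch of what an application-driven backup to a BoostFS mount could look like. The mount point /mnt/ddboostfs and the pg_dump example are purely illustrative assumptions; the point is simply that once a Data Domain storage unit is mounted via BoostFS, any application that can write a file can write its backups to DD.

```python
import subprocess
from datetime import datetime
from pathlib import Path

# Hypothetical BoostFS mount point backed by a Data Domain storage unit.
# (Assumed to have been mounted beforehand with the BoostFS client.)
BOOSTFS_MOUNT = Path("/mnt/ddboostfs/backups")

def dump_database(db_name: str) -> Path:
    """Write a pg_dump backup straight onto the BoostFS mount.

    Because the mount behaves like a regular filesystem, any tool that
    can write a file can send its backups to Data Domain.
    """
    BOOSTFS_MOUNT.mkdir(parents=True, exist_ok=True)
    target = BOOSTFS_MOUNT / f"{db_name}-{datetime.now():%Y%m%d-%H%M%S}.dump"
    with target.open("wb") as out:
        subprocess.run(
            ["pg_dump", "--format=custom", db_name],
            stdout=out,
            check=True,
        )
    return target

if __name__ == "__main__":
    print(f"Backup written to {dump_database('salesdb')}")
```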

DellEMC DPS is a one-stop solution for enterprise data protection in and to Azure. Below are some Marketplace links for DellEMC DPS solutions in Azure.

https://azuremarketplace.microsoft.com/en-us/marketplace/apps/dellemc.dell-emc-avamar-virtual-edition  — Avamar in Azure Marketplace

https://azuremarketplace.microsoft.com/en-us/marketplace/apps/dellemc.dell-emc-datadomain-virtual-edition-v4 — Data Domain in Azure Marketplace

https://azuremarketplace.microsoft.com/en-us/marketplace/apps/dellemc.dell-emc-networker-virtual-edition — NetWorker in Azure Marketplace

Object Storage Demystified

I've seen a few definitions and watched a few presentations, and I've never really been able to easily and clearly articulate what object storage actually is! We all know it is an architecture that manages data as objects (rather than in blocks/sectors or a hierarchy), but I never really understood what an object was…! Might just be me being stupid, but after a bit of reading I understood it a lot better once I understood the characteristics of an object, e.g.:

  • An object is independent of the application, i.e. it doesn't need an OS or an application to be able to make sense of the data. This means a user can access the content (e.g. JPEG, video, PDF, etc.) directly from a browser (over HTTP/HTTPS) rather than needing a specific application. No app servers are required, which dramatically improves simplicity and performance (of course you can still access object storage via an application if needed).
  • Object storage is globally accessible, i.e. there is no requirement to move or copy data (locations, firewalls, etc.)… instead data is accessible from anywhere.
  • Object storage is highly parallelized. What this means is that there are no locks on write operations, so hundreds of thousands of users distributed around the world can all write simultaneously; none of the users need to know about one another, and their behaviour will not impact others. This is very different from traditional NAS storage, where if you want the data available in a secondary site it has to be replicated to another NAS platform that sits passive and cannot be written to directly.
  • Object storage is linearly scalable, i.e. there is no point at which we would expect performance to be impacted; it can continue to grow, and there is no need to manage around limitations or constraints such as capacity or structure.
  • Finally, it's worth noting that object platforms are extensible. Really all this means is that their capabilities can be extended easily without large implementation efforts; examples in this context are things like the ability to enrich data with metadata and to add policies such as retention, protection and where data cannot live (compliance).

Object storage organizes data by addressing and manipulating discrete units of data called objects. Each object, like a file, is a stream of binary data. However, unlike files, objects are not organised in a hierarchy of folders and are not identified by a path in that hierarchy. Each object is associated with a key (a string) when created, and you retrieve an object by using that key to query the object store. As a result, all of the objects are organized in a flat namespace (one object cannot be placed inside another object). Such an organisation eliminates the dependency between objects but retains the fundamental functionality of a storage system: storing and retrieving data. The main benefit of this organisation is a very high level of scalability.
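To make the key/value model concrete, here is a minimal sketch using the boto3 S3 client against a generic S3-compatible endpoint; the endpoint, credentials, bucket and key are placeholders. The object is stored and retrieved purely by its key, with no directory hierarchy involved.

```python
import boto3

# Endpoint, credentials and bucket name are placeholders; any
# S3-compatible object store (AWS S3, ECS, etc.) works the same way.
s3 = boto3.client(
    "s3",
    endpoint_url="https://object.example.com",
    aws_access_key_id="ACCESS_KEY",
    aws_secret_access_key="SECRET_KEY",
)

bucket = "demo-bucket"
key = "reports/2018/summary.pdf"  # just a string key, not a real folder path

# Store an object: the flat namespace means the "folders" in the key
# are only a naming convention.
s3.put_object(Bucket=bucket, Key=key, Body=b"binary payload goes here")

# Retrieve it by the same key; no path traversal, no filesystem semantics.
response = s3.get_object(Bucket=bucket, Key=key)
print(response["Body"].read())
```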

Both files and objects have metadata associated with the data they contain, but objects are characterized by their extended metadata. Each object is assigned a unique identifier, which allows a server or end user to retrieve the object without needing to know the physical location of the data. This approach is useful for automating and streamlining data storage in cloud computing environments. S3 and Swift are the most commonly used cloud object protocols. Amazon S3 (Simple Storage Service) is an online file storage web service offered by Amazon Web Services, and OpenStack is a free and open-source software platform for cloud computing. The S3 protocol is the most commonly used object storage protocol, so if you're using third-party applications that use object storage, it will usually be the most compatible choice. Swift is a little less widely used than S3, but still a very popular cloud object protocol. S3 was developed by AWS and its API is open to third-party developers; the Swift protocol is managed by the OpenStack Foundation, a non-profit corporate entity established in September 2012 to promote OpenStack software and its community, which more than 500 companies have joined. Below are some major differences between S3 and Swift.

Unique features of S3:

  • Bucket-level controls for versioning and expiration that apply to all objects in the bucket
  • Copy Object – This allows you to do server-side copies of objects
  • Anonymous Access – The ability to set PUBLIC access on an object and serve it via HTTP/HTTPS without authentication.
  • S3 stores its objects in a bucket.
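For a rough idea of what these S3-specific features look like in code, the boto3 sketch below enables bucket-level versioning, performs a server-side copy and grants anonymous read access on an object. Bucket and key names are placeholders.

```python
import boto3

s3 = boto3.client("s3")  # assumes credentials are already configured
bucket = "demo-bucket"   # placeholder bucket name

# Bucket-level versioning: applies to every object in the bucket.
s3.put_bucket_versioning(
    Bucket=bucket,
    VersioningConfiguration={"Status": "Enabled"},
)

# Server-side copy: the data never leaves the object store.
s3.copy_object(
    Bucket=bucket,
    Key="copies/report.pdf",
    CopySource={"Bucket": bucket, "Key": "reports/2018/summary.pdf"},
)

# Anonymous (public) read access on a single object, served over HTTP/HTTPS.
# Note: the bucket must permit public ACLs for this to take effect.
s3.put_object_acl(Bucket=bucket, Key="copies/report.pdf", ACL="public-read")
```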

Unique features of SWIFT

The Swift API supports unsized object creation: Swift is the only protocol where you can use chunked encoding to upload an object whose size is not known beforehand, whereas S3 requires multiple requests to achieve this. Swift stores its objects in "containers".
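A rough sketch of such an unsized upload against the Swift object API is shown below; passing a generator to the requests library makes it send the body with chunked transfer encoding, so the total size never has to be declared up front. The storage URL, token and container name are placeholders.

```python
import requests

# Placeholders: obtain the real storage URL and token from your auth service.
STORAGE_URL = "https://swift.example.com/v1/AUTH_demo"
TOKEN = "placeholder-auth-token"

def stream_source():
    """Yield data chunks whose total size is not known up front."""
    for i in range(10):
        yield f"chunk {i}\n".encode()

# Passing a generator as the body makes requests use
# Transfer-Encoding: chunked, which Swift accepts for unsized uploads.
resp = requests.put(
    f"{STORAGE_URL}/demo-container/streamed-object",
    headers={"X-Auth-Token": TOKEN},
    data=stream_source(),
)
resp.raise_for_status()
print("Uploaded, ETag:", resp.headers.get("Etag"))
```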

Authentication (S3 vs SWIFT)

S3 – Amazon S3 uses an authorization header that must be present in all requests to identify the user (Access Key Id) and provide a signature for the request. An Amazon access key ID has 20 characters. Both HTTP and HTTPS protocols are supported.

SWIFT – Authentication in Swift is quite flexible. It is done through a separate mechanism creating a “token” that can be passed around to authenticate requests. Both HTTP and HTTPS protocols are supported.
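As a small illustration of the S3 signing model, the boto3 snippet below generates a presigned URL: the client computes the request signature from the Access Key Id and Secret Key and embeds it in a time-limited URL, so the object can be fetched without sending separate credentials. Bucket and key are placeholders.

```python
import boto3

s3 = boto3.client("s3")  # signing uses the configured Access Key Id / Secret Key

# boto3 signs every request automatically; generate_presigned_url exposes
# that same signature as a URL that is valid for a limited time.
url = s3.generate_presigned_url(
    "get_object",
    Params={"Bucket": "demo-bucket", "Key": "reports/2018/summary.pdf"},
    ExpiresIn=3600,  # seconds
)
print(url)
```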

Retention and AUDIT (S3 vs SWIFT)

Retention periods are supported on all object interfaces including S3 and Swift. The controller API provides the ability to audit the use of the S3 and Swift object interfaces.

Large Objects (S3 vs SWIFT)

S3 Multipart Upload allows you to upload a single object as a set of parts. After all of the parts are uploaded, the data is presented as a single object. An OpenStack Swift Large Object comprises two types of objects: segment objects that store the object content, and a manifest object that links the segment objects into one logical large object. When you download the manifest object, the contents of the segment objects are concatenated and returned in the response body of the request.
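A minimal boto3 sketch of the S3 side looks roughly like this (placeholder names, two toy parts):

```python
import boto3

s3 = boto3.client("s3")
bucket, key = "demo-bucket", "large/backup.img"  # placeholders

# 1. Start the multipart upload and remember its id.
upload = s3.create_multipart_upload(Bucket=bucket, Key=key)
upload_id = upload["UploadId"]

# 2. Upload the parts (on AWS S3, every part except the last must be >= 5 MiB).
parts = []
for number, chunk in enumerate([b"a" * 5 * 1024 * 1024, b"tail"], start=1):
    part = s3.upload_part(
        Bucket=bucket,
        Key=key,
        UploadId=upload_id,
        PartNumber=number,
        Body=chunk,
    )
    parts.append({"PartNumber": number, "ETag": part["ETag"]})

# 3. Complete: S3 stitches the parts into a single logical object.
s3.complete_multipart_upload(
    Bucket=bucket,
    Key=key,
    UploadId=upload_id,
    MultipartUpload={"Parts": parts},
)
```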

So which object storage API should you use? Both have their benefits for specific use cases. DellEMC ECS is an on-premises object storage solution that gives users multiple object protocols (S3, Swift, CAS, HTTPS, HDFS, NFSv3, etc.) on a single platform. It is built on servers with DAS storage running the ECS software, and it is also available in a software-only format that can be deployed on your own servers.


There are many benefits of using ECS as your own object storage.