Why DellEMC DPS for Azure Data Protection?

Azure Backup is Microsoft's cloud-based service for backing up and restoring your data in Microsoft Azure. Azure Backup offers multiple deployment options based on what you want to back up. All of the options, whether for on-premises or cloud resources, can be used to back up data to a Recovery Services vault in Azure. Within the Azure portal, the Recovery Services vault provides a simple wizard to help determine which solution to deploy: you select either On-Premises or Azure, specify what you want to back up, and are given instructions for the appropriate solution. There are four primary ways to use Azure Backup, each described below:

  • Azure Backup Agent
  • System Center Data Protection Manager
  • Azure Backup Server
  • Azure IaaS VM Backup

Azure Backup Agent

Azure Backup Agent installs directly on a physical or virtual Windows Server, with no separate backup server required. The servers can be on-premises or in Azure. The agent can back up files, folders, and system state directly to an Azure Recovery Services vault up to three times per day. It is not application-aware, can only restore at the volume level, and does not support Linux.

System Center Data Protection Manager (DPM)

DPM provides a robust enterprise backup and recovery solution that can back up workloads both on-premises and in Azure; a DPM server can be deployed in either location. DPM can protect application-aware workloads such as SQL Server, SharePoint, and Exchange, as well as files and folders, system state, Bare Metal Recovery (BMR), and entire Hyper-V or VMware VMs. DPM can store data on disk, on tape, or in an Azure Recovery Services vault. It supports backups of Windows 7 or later clients and Windows Server 2008 R2 SP1 or later servers. DPM cannot back up workloads such as Oracle or DB2. Support for Linux-based machines is based on Microsoft's endorsed list found here

Azure Backup Server

Microsoft Azure Backup Server (MABS) is a slightly scaled-down version of System Center DPM, intended for customers who do not already have System Center DPM; it does not require any System Center licenses. MABS requires an active Azure subscription at all times. The primary differences between MABS and System Center DPM are as follows:

    • Does not support tape backups
    • No centralized System Center administration
    • Unable to back up another MABS instance
    • Does not integrate with Azure Site Recovery Services

Azure IaaS VM Backup

All Azure VMs can be backed up directly to a Recovery Services vault with no agent installation or additional infrastructure required, and all disks attached to a VM can be backed up. This works for both Windows and Linux VMs. However, you can back up only once per day and only to Azure; on-premises backup is not supported, and VMs can only be restored at the disk level. DellEMC's approach goes well beyond what native Azure options offer. We have multiple options that can be used to protect workloads in Azure:

1. Cloud Snapshot Manager for Azure:

This is a SaaS offering that protects Azure managed VMs on a snapshot basis in Azure object storage. Licensing is per instance on a subscription basis. No backup server or infrastructure is required. More importantly, a customer can have multiple Azure accounts and subscriptions, all protected and managed from a single CSM console. It literally takes 5-7 minutes to start your first backup.

2. NetWorker and Data Domain / Avamar and Data Domain:

NetWorker and Avamar are DellEMC's flagship backup software; both can be deployed in an Azure VM and configured to write to Data Domain Virtual Edition (DDVE) in Azure. NetWorker with Data Domain lets customers eliminate media servers by virtue of Client Direct, and since Data Domain uses a single deduplication pool, the storage savings are substantial. With continued investment in Data Domain engineering, DDVE can now be deployed in Azure on object storage, which allows for greater storage savings than ever. And because we are leveraging NetWorker and Avamar, we can integrate with any application or database hosted in Azure. The main benefits of such a solution are:

    • No media servers required in Azure
    • Ability to protect data in deduplicated format in Azure object storage
    • Wide application and database support
    • Enterprise-level backup performance and deduplication
    • Both Data Domain and Avamar are available in the Azure Marketplace


3. Data Domain and DDBEA:

Data Domain can protect workloads without integrating with any backup software, covering both SQL and NoSQL databases by leveraging BOOST and BOOSTFS integration respectively. Customers can run the BOOSTFS tool and, with help from DellEMC, make any application write backups to Data Domain with source-based deduplication as well. DDVE is available at up to 96 TB in a single instance in Azure, and because that 96 TB of capacity comes from Azure Blob storage, it is extremely cost-efficient.

DellEMC DPS is a one-stop solution for enterprise data protection in and to Azure. Below are some Azure Marketplace links for DellEMC DPS solutions.

https://azuremarketplace.microsoft.com/en-us/marketplace/apps/dellemc.dell-emc-avamar-virtual-edition  — Avamar in Azure Marketplace

https://azuremarketplace.microsoft.com/en-us/marketplace/apps/dellemc.dell-emc-datadomain-virtual-edition-v4 — Data Domain in Azure Marketplace

https://azuremarketplace.microsoft.com/en-us/marketplace/apps/dellemc.dell-emc-networker-virtual-edition — NetWorker in Azure Marketplace


SaaS Data Protection for AWS – Cloud Snapshot Manager

A few weeks back DellEMC released a newer version of Cloud Snapshot Manager (CSM). In case you are not aware, it is a Software-as-a-Service solution, fully operated by DellEMC, that gives our customers control, automation, and visibility over the protection of their AWS workloads in the cloud. Cloud applications, specifically applications in AWS, are agile and can scale very fast due to the nature of AWS services, so they need a different kind of data protection. Yes, we have NetWorker, DDVE, AVE, and CloudBoost in AWS, and each has its own use case. But AWS workloads are a bit different: we do not see the hypervisor (which, by the way, is customized Xen) and we have limited abilities in AWS (courtesy of AWS), which makes data protection there different. Below are some reasons traditional data protection is not a complete solution for AWS workloads.

Issues with AWS DP

Taking snapshots of native EC2 (Elastic Compute Cloud, the VM service in AWS), EBS, RDS, etc. with CSM has many benefits, some of which are listed below:

  • Snapshots provide incremental-forever protection; CSM creates and retains the same snapshots AWS natively uses, only this time with added benefits, which we will see below.
  • CSM snapshots cover EC2 instances, EBS volumes, and RDS databases, whereas native snapshots only support protection of EBS and RDS workloads.
  • Snapshots are incremental forever and are compressed before they are written to S3 storage; since S3 is globally available, the data sits in secure and durable storage.
  • Native snapshots cannot be restored in another region (without massive scripting), but with CSM it is a simple restore. This is also beneficial in case a complete AWS region goes away.
  • Since CSM leverages AWS APIs for the snapshots and the CSM portal infrastructure is managed by DellEMC, customers do not have to manage a backup server, backup storage, etc.; as mentioned, this is a SaaS service. That is not the case with Veritas CloudPoint, which is a lot more difficult to manage, and the same is true of Commvault, Veeam, and Rubrik: they all have to deploy a backup server in AWS to get backups started, whereas with CSM you can start backups in 4-5 minutes. That is the whole promise of CLOUD: AGILITY.
  • Restores from snapshots are much faster, and snapshots can be taken even if the RDS or EC2 machines are down.
  • The only way to protect RDS (Relational Database Service in AWS, which hosts Oracle, MSSQL, PostgreSQL, Aurora, MariaDB, and MySQL) is via snapshots, which CSM handles promptly; as of now, Rubrik does not support data protection for RDS in AWS at all.


  • CSM allows any retention in AWS (more than 35 days) for EC2, EBS, RDS, etc., which is not possible with native AWS data protection.
  • CSM allows resources such as EC2, EBS, and RDS to be automatically protected via native AWS tags (tags are organization-specific metadata that can be added to cloud resources; they also help with reporting, compliance, show-back, charge-back, etc.). CSM can automatically assign resources to protection policies, achieving auto-scaling for data protection, so you can set it and forget it.
  • CSM supports multi-tenancy and backup of multiple AWS accounts, regions, and availability zones from ONE CONSOLE, which as of today no other vendor offers.
  • With the new release of CSM, we have support for file-level recovery (FLR) from snapshots! Native AWS snapshots do not support FLR.
  • CSM has also added copying snapshots to another region, enabling customers to have a proper DR plan: if region X is lost, they need not worry, since their backup console is with DellEMC (not in region X) and their snapshots are at the DR site (region Y).
  • CSM can also quiesce applications using the VSS framework for application-consistent snapshots of Microsoft applications.
  • Manual scripting and native AWS snapshots do not provide audit logs, reporting, etc., whereas the HTML5 console of CSM does it all for any number of AWS accounts, regions, etc.
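CSM drives all of this through the same AWS snapshot APIs. As a rough illustration of the underlying mechanics only (CSM is a managed SaaS service, so this is not its implementation), the sketch below uses boto3 to find EBS volumes carrying a hypothetical `backup-policy: gold` tag and snapshot them; the tag names, region, and policy logic are all assumptions for illustration.

```python
# Illustrative sketch: tag-driven EBS snapshot protection via the raw AWS
# APIs. The tag key/value ("backup-policy": "gold") are hypothetical.

def matches_policy(tags, key="backup-policy", value="gold"):
    """Return True if a resource's AWS tag list contains the policy tag."""
    return any(t.get("Key") == key and t.get("Value") == value
               for t in (tags or []))

def snapshot_tagged_volumes(region="us-east-1"):
    """Snapshot every EBS volume tagged for the hypothetical 'gold' policy."""
    import boto3  # imported here; calling this requires AWS credentials
    ec2 = boto3.client("ec2", region_name=region)
    snapshot_ids = []
    for page in ec2.get_paginator("describe_volumes").paginate():
        for vol in page["Volumes"]:
            if matches_policy(vol.get("Tags")):
                snap = ec2.create_snapshot(
                    VolumeId=vol["VolumeId"],
                    Description="tag-driven policy snapshot",
                )
                snapshot_ids.append(snap["SnapshotId"])
    return snapshot_ids
```

A scheduler (cron, Lambda, etc.) would call `snapshot_tagged_volumes()` per region; a service like CSM layers retention, cross-region copy, and reporting on top of these same calls.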

Just in case you want to try it yourself, take it for a spin or ask your customer to use the 30-day trial version at Cloud Snapshot Manager – Data Protection | Dell EMC US

Object Storage Demystified

I've seen a few definitions and watched a few presentations, and I've never really been able to easily and clearly articulate what object storage actually is! We all know it is an architecture that manages data as objects (rather than in blocks/sectors or a hierarchy), but I never really understood what an object was! It might just be me being stupid, but after a bit of reading I understood it a lot better once I understood the characteristics of an object, e.g.:

  • An object is independent of the application, i.e. it doesn't need an OS or an application to make sense of the data. A user can access the content (e.g. JPEG, video, PDF) directly from a browser (over HTTP/HTTPS) rather than needing a specific application. This means no app servers are required, dramatically improving simplicity and performance (of course you can still access object storage via an application if needed).
  • Object storage is globally accessible, i.e. there is no requirement to move or copy data (across locations, firewalls, etc.); instead, data is accessible from anywhere.
  • Object storage is highly parallelized: there are no locks on write operations, so hundreds of thousands of users distributed around the world can all write simultaneously; no user needs to know about any other, and one user's behavior will not impact the others. This is very different from traditional NAS, where making data available in a secondary site requires replicating to another NAS platform that sits passive and cannot be written to directly.
  • Object storage is linearly scalable, i.e. there is no point at which we would expect performance to degrade; it can continue to grow with no need to manage around limitations or constraints such as capacity or structure.
  • Finally, it's worth noting that object platforms are extensible: capabilities can easily be extended without large implementation efforts. Examples in this context include enriching data with metadata and adding policies such as retention, protection, and restrictions on where data may live (compliance).

Object storage organizes data by addressing and manipulating discrete units of data called objects. Each object, like a file, is a stream of binary data. However, unlike files, objects are not organised in a hierarchy of folders and are not identified by a path within that hierarchy. Each object is associated with a key (a string assigned at creation), and you retrieve an object by querying the object store with its key. As a result, all objects live in a flat namespace (one object cannot be placed inside another object). This organisation eliminates dependencies between objects while retaining the fundamental functionality of a storage system, storing and retrieving data, and its main benefit is a very high level of scalability.
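As a toy illustration of this flat, key-addressed model, the sketch below implements a minimal in-memory object store: every object is an opaque byte stream retrieved by a string key, and a "/" in a key is just another character, not a folder boundary. It is purely illustrative and not any real product's API.

```python
# Minimal in-memory sketch of a flat-namespace object store. Keys are
# plain strings; there is no hierarchy to traverse, so "photos/2018/cat.jpg"
# is one opaque key, not three nested directories.
class ObjectStore:
    def __init__(self):
        self._objects = {}  # key -> (data bytes, metadata dict)

    def put(self, key, data, metadata=None):
        """Store an object and its metadata under a flat string key."""
        self._objects[key] = (bytes(data), dict(metadata or {}))

    def get(self, key):
        """Retrieve the object's data by key."""
        data, _ = self._objects[key]
        return data

    def head(self, key):
        """Return only the metadata, like an HTTP HEAD request."""
        _, meta = self._objects[key]
        return meta
```

Usage is simply `store.put("photos/2018/cat.jpg", jpeg_bytes, {"content-type": "image/jpeg"})` followed by `store.get("photos/2018/cat.jpg")`; this mirrors the PUT/GET/HEAD verbs that both S3 and Swift expose over HTTP.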

Both files and objects have metadata associated with the data they contain, but objects are characterized by their extended metadata. Each object is assigned a unique identifier, which allows a server or end user to retrieve the object without needing to know the physical location of the data. This approach is useful for automating and streamlining data storage in cloud computing environments. S3 and Swift are the most commonly used cloud object protocols. Amazon S3 (Simple Storage Service) is an online storage web service offered by Amazon Web Services; S3 was developed by AWS, and its API is open to third-party developers. OpenStack is a free and open-source software platform for cloud computing; the Swift protocol is managed by the OpenStack Foundation, a non-profit corporate entity established in September 2012 to promote OpenStack software and its community, which more than 500 companies have joined. The S3 protocol is the most commonly used object storage protocol, so if you're using third-party applications that rely on object storage, it will be the most compatible choice. Swift is somewhat less widely used than S3 but is still a very popular cloud object protocol. Below are some major differences between S3 and Swift.

Unique features of S3:

  • Bucket-level controls for versioning and expiration that apply to all objects in the bucket
  • Copy Object: allows server-side copies of objects
  • Anonymous access: the ability to set PUBLIC access on an object and serve it via HTTP/HTTPS without authentication
  • S3 stores its objects in buckets
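As a hedged sketch of the server-side Copy Object feature, the snippet below uses boto3's `copy_object` call, in which the data never transits the client; the bucket and key names are hypothetical, and actually running the copy requires valid AWS credentials.

```python
# Sketch of S3 server-side copy: the object's bytes move inside AWS,
# not through the client. Bucket/key names are hypothetical examples.

def copy_source(bucket, key):
    """Build the CopySource argument expected by S3 CopyObject."""
    return {"Bucket": bucket, "Key": key}

def server_side_copy(src_bucket, dst_bucket, key):
    import boto3  # imported here; the call below needs AWS credentials
    s3 = boto3.client("s3")
    s3.copy_object(
        Bucket=dst_bucket,
        Key=key,
        CopySource=copy_source(src_bucket, key),
    )

# Example (requires credentials):
# server_side_copy("source-bucket", "target-bucket", "reports/q1.pdf")
```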

Unique features of SWIFT

The Swift API supports creating objects of unknown size: Swift is the only protocol where you can use "chunked" transfer encoding to upload an object whose size is not known beforehand. S3 requires multiple requests to achieve this. Swift stores objects in "containers".

Authentication (S3 vs SWIFT)

S3 – Amazon S3 uses an Authorization header that must be present in all requests to identify the user (Access Key ID) and provide a signature for the request. An Amazon access key ID has 20 characters. Both HTTP and HTTPS are supported.

SWIFT – Authentication in Swift is quite flexible. It is done through a separate mechanism creating a “token” that can be passed around to authenticate requests. Both HTTP and HTTPS protocols are supported.
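The contrast can be sketched in a few lines. The snippet below shows the core idea only: S3 derives a fresh HMAC signature for every request from a secret key, while Swift attaches a token previously issued by a separate auth endpoint. This is a simplified illustration, not the full AWS Signature Version 4 algorithm.

```python
# Simplified illustration of the two auth styles. S3: sign each request
# with an HMAC of a canonical request string. Swift: reuse an issued token.
# NOT real AWS SigV4 (which adds date-scoped key derivation and canonical
# request hashing), just the core HMAC idea.
import hashlib
import hmac

def s3_style_signature(secret_key, string_to_sign):
    """Per-request signature that would go in the Authorization header."""
    return hmac.new(
        secret_key.encode(), string_to_sign.encode(), hashlib.sha256
    ).hexdigest()

def swift_style_headers(token):
    """Swift reuses a token (from a prior auth call) on every request."""
    return {"X-Auth-Token": token}
```

The practical difference: every S3 request is independently verifiable from the key pair, while Swift's token can be revoked or expired centrally at the auth service.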

Retention and AUDIT (S3 vs SWIFT)

Retention periods are supported on all object interfaces including S3 and Swift. The controller API provides the ability to audit the use of the S3 and Swift object interfaces.

Large Objects (S3 vs SWIFT)

S3 Multipart Upload allows you to upload a single object as a set of parts; after all parts are uploaded, the data is presented as a single object. An OpenStack Swift large object is composed of two types of objects: segment objects that store the object content, and a manifest object that links the segment objects into one logical large object. When you download a manifest object, the contents of the segment objects are concatenated and returned in the response body of the request.
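The client-side part of a multipart upload is mostly bookkeeping: split the object into parts of at least 5 MiB each (the last part may be smaller), upload each part with its part number, then complete the upload. The sketch below computes only the part boundaries; the actual boto3 calls (`create_multipart_upload`, `upload_part`, `complete_multipart_upload`) would consume these tuples and require live credentials.

```python
# Sketch of the boundary arithmetic behind S3 Multipart Upload. Every part
# except the last must be at least 5 MiB, and part numbers start at 1.
MIN_PART_SIZE = 5 * 1024 * 1024  # S3's minimum for all parts but the last

def part_ranges(total_size, part_size=MIN_PART_SIZE):
    """Yield (part_number, start_offset, length) tuples covering the object."""
    if part_size < MIN_PART_SIZE:
        raise ValueError("part size below the 5 MiB S3 minimum")
    number, offset = 1, 0
    while offset < total_size:
        length = min(part_size, total_size - offset)
        yield number, offset, length
        number += 1
        offset += length
```

For a 12 MiB object with the default part size this yields three parts (5 MiB, 5 MiB, 2 MiB), matching how a client would slice a file before issuing the per-part uploads.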

So which object storage API should you use? Both have their benefits for specific use cases. DellEMC ECS is an on-premises object storage solution that exposes multiple object protocols, such as S3, Swift, CAS, HTTPS, HDFS, and NFSv3, all in a single machine. It is built on servers with DAS storage running the ECS software, and it is also available in a software-only format that can be deployed on your own servers.


There are many benefits of using ECS as your own object storage.

Understanding GDPR

For the past few months a lot has been spoken and written about GDPR compliance. The write-up below is an amalgamation of the major takeaways from those articles and from the actual GDPR document (OK, I did not read the whole document, but I did read parts of it). If you find something I have missed and should have mentioned, please do point it out, because this is something we have to get right. I'll start by highlighting some key aspects of GDPR:

  • What is GDPR
  • Key Regulatory Requirements
  • Role of IT Professionals
  • Actions for Compliance (12 Steps)

What is GDPR –

GDPR is not entirely new; before it we had the Data Protection Act, so if you had that implemented you will go through less pain, since a lot of the elements are partially covered by it. The whole concept is to know how data on EU citizens is collected, where it resides, and how it is stored, processed, deleted, accessed, and used. Organizations will be required to show the data flow or life-cycle to minimize any risk of personal data being leaked and to show that all the steps required under GDPR are in place. In short, GDPR codifies common-sense data security ideas: minimize the collection of personal data, delete personal data that is no longer necessary, restrict access, and secure data through its entire life-cycle. It also adds requirements for documenting IT procedures, performing risk assessments under certain conditions, notifying consumers and authorities when there is a breach, and strengthening rules for data minimization.

Key Regulatory Requirements –

  • Privacy by Design: PbD is referenced heavily in Article 25 of the GDPR and in many other places in the new regulation. Privacy by Design focuses on minimizing data collection and retention, and makes gaining consent from consumers when processing data more explicitly formalized. The idea is to minimize the collection of consumer data, minimize who you share the data with, and minimize how long you keep it. Less is more: less data for a hacker to take means a more secure environment. So the data you collected from a web campaign over three years ago, maybe 15,000 email addresses along with favorite pet names, that now lives in a spreadsheet no one ever looks at? You should find it and delete it. If a hacker gets hold of it and uses it for phishing, you've created a security risk for your customers; and if the local EU authority can trace the breach back to your company, you can face heavy fines.
  • Data Protection Impact Assessments: When certain data associated with subjects is to be processed, companies will first have to analyze the risks to privacy. This is another new requirement in the regulation. You may need to run a DPIA if the nature, scope, context, and purposes of your data processing place the rights and freedoms of individuals at high risk. If so, before processing can commence, the controller must produce an assessment of the impact on the protection of personal data. Who exactly determines whether your organization's processing presents a high risk to individuals' rights and freedoms? The text of the GDPR is not specific, so each organization will have to decide for itself. If you find more details about this, please mention them in the comments below.
  • Right to Erase and to be Forgotten: Discussed in Article 17 of the GDPR, which states that "The data subject shall have the right to obtain from the controller the erasure of personal data concerning him or her without undue delay and the controller shall have the obligation to erase personal data without undue delay where … the personal data are no longer necessary in relation to the purposes for which they were collected or otherwise processed; … the data subject withdraws consent on which the processing is based … the controller has made the personal data public and is obliged … to erase the personal data". There has been a long-standing requirement in the DPD allowing consumers to request that their data be deleted; the GDPR extends this right to include data published on the web. This is the still-controversial right to stay out of public view and "be forgotten". It means that a social media service that publishes a subscriber's personal data to the web would have to remove not only the initial information but also contact other web sites that may have copied it. The new principle of extraterritoriality in the GDPR says that even if a company doesn't have a physical presence in the EU but collects data about EU data subjects (for example, through a web site), then all the requirements of GDPR are in effect. In other words, the new law extends outside the EU, which will especially affect e-commerce companies and other cloud businesses.
  • Breach Notification: A new requirement not in the existing DPD is that companies will have to notify data authorities within 72 hours after a breach of personal data has been discovered. Data subjects will also have to be notified but only if the data poses a “high risk to their rights and freedoms”. Breaches can be categorized according to the following security levels:
      • Confidentiality Breach: where there is an unauthorized or accidental disclosure of, or access to, personal data.
      • Integrity Breach: where there is an unauthorized or accidental alteration of personal data.
      • Availability Breach: where there is an accidental or unauthorized loss of access to, or destruction of, personal data (include where data has been deleted either accidentally or by an unauthorized person).
  • Fines: The GDPR has a tiered penalty structure that can take a large bite out of an offender's funds. More serious infringements can merit a fine of up to 4% of a company's global revenue, including violations of basic principles related to data security, especially PbD principles. A lesser fine of up to 2% of global revenue (still enormous) can be issued if company records are not in order, or if a supervising authority and data subjects are not notified after a breach. This makes breach-notification oversights a serious and expensive offense.

Role of IT Professionals

Information security today is not limited to the IT department of any organization. As businesses have evolved over time, so has the need for everyone in the business to contribute to the security of the organisation's information and to protect the personal data the organisation uses. You will notice that most GDPR webinars are attended by business managers, compliance people, and the like; these people are responsible for operating and overseeing GDPR compliance, asking colleagues what data they hold, and getting the company lawyer to update standard contract terms and write privacy notices. But they can't really do all of this on their own: they need IT for most of the work, such as providing a dump of the database schema that gives a guaranteed-correct version, not to mention the unique access required to scan local hard disks and networked file shares for the millions of files we use in the form of documents, emails, spreadsheets, meeting notes, etc. It is extremely important to engage the IT team from the discovery phase; for example, most of us have hardly ever done one, because nobody has really been sufficiently bothered to spend the money and ask what data you hold about them. The other thing you need to understand is whether there is a gap between how you think you work and how you actually work. Take backups, for example: even though a customer's backup strategy is documented, do you really understand how it is implemented by the tech teams? How does your disk-to-disk-to-tape setup really work? Who transports the tapes to offsite storage? Do you destroy tapes when you say you will? If you've erased someone's data on request, does the tech team re-delete that data from the live system after restoring from backup?

Nearly every organization I have come across keeps some sort of backup, and not everyone is fully utilizing cloud infrastructure and backup tools. The data aspect is important: becoming compliant is one thing, but being able to quantify compliance is quite another. This especially concerns data protection admins (note: there is a reason I did not say backup administrators, since a data protection/management team should manage backups, archives, LTR copies, etc.) who handle the data for the company and its customers. A sound, tested data protection scheme that also reports well is what customers need, and that is something that can be delivered by DellEMC DPS solutions.

Actions for Compliance

Below is a list of actions an organization needs to take in order to comply with GDPR. Notice that I have not mentioned any timelines, since different organizations have different data-set sizes and may require more or less time to carry out the same set of actions.

  • Step 1 – Data Mapping: Identify and map your data processing activities, including data flows and use cases, to create a comprehensive record of activities, since GDPR requires you to keep detailed records of data processing. These records can be used to assess the compliance steps the business requires going forward, and to respond quickly to data breaches and to individuals who request their own data.
  • Step 2 – Privacy Governance / Data Protection Officer: Improve corporate governance policies and structures to ensure they are effective in achieving reasonable compliance throughout the business. Organizations that are in the EU or deal heavily with EU users' data have to assign a "Data Protection Officer" who meets the GDPR criteria.
  • Step 3 – Data Sharing: Customers have to identify any data sharing with third parties, determine the role of those parties, and put appropriate safeguards in place, since GDPR imposes mandatory content for certain agreements and requires the clear assignment of roles and responsibilities.
  • Step 4 – Justification of Processing: Review or establish the legal bases for processing for key use cases, and plan and implement remedial action to fill any compliance gaps. GDPR requires that all data processing have a legal basis, and it makes usage more difficult; it also contains restrictions and additional obligations relating to the use of automated processing, including profiling.
  • Step 5 – Privacy Notices & Consents
  • Step 6 – Data Protection Impact Assessment: Assess whether the business carries out any “high risk” processing under the GDPR. If so, carry out a Data Protection Impact Assessment (DPIA) and, if necessary, consult with your supervisory authority, vendors (this is where we come in with NetWorker, DD, Avamar, Storage assessments as we can inform customer of their backup data, retention policies etc.).
  • Step 7 – Policies: Review and supplement the company's existing suite of policies and processes dealing with data protection, including those dealing with data retention and integrity, such as data accuracy and relevance. The GDPR imposes stricter obligations to keep data accurate, proportionate, and held no longer than necessary.
  • Step 8 – Individuals' Rights: Organizations have to identify the new individual rights provided by the GDPR and establish procedures for dealing with them. Review the procedures in place to comply with existing rights, and set up any new internal procedures and processes where required.
  • Step 9 – Data Quality & Privacy by Design: Organizations have to make sure that GDPR compliance is embedded in all applications and processes that involve personal data from the start. Default settings must comply with the GDPR.
  • Step 10 – International Data Transfers: Organizations have to make sure they Identify and review the data transfer mechanisms in place in order to comply with the GDPR. Fill any gaps, including entering into Standard Contractual Clauses with service providers and group companies.
  • Step 11 – Data Security & Breach Management Process: Review the data security measures in place to ensure they are sufficient and to assess whether the specific measures referred to in the GDPR are (or should be) in place. Review or establish an effective Data Breach Response Plan (this is where we can talk a bit about IRS, encryption, WORM functionality of DPS products.). The GDPR implements stricter requirements regarding appropriate technical and organizational data security measures. It also requires data breaches involving risk to individuals to be reported to supervisory authorities without delay and within 72 hours (unless a longer period can be justified); affected individuals must also be notified if the breach is high risk.
  • Step 12 – Roll-out of Compliance Tools & Staff Training: Roll out amended and new privacy notices and consent forms, publish new and revised policies and procedures, and train key personnel on GDPR compliance.

Complete GDPR information can be found at: https://gdpr-info.eu/