The picture above is from one of the many CISSP videos that I have watched, and it came at a good time, as I was planning to write a post about data retention and why it matters.
So why does it matter?
In Europe we have the GDPR, which states that every individual can ask an organization for the data it holds on them, and when they ask for that data to be removed, the organization has to act on the request.
If it doesn’t, it can be fined for breaking the regulation.
Of course the GDPR regulates a lot more than just this, but this is the part that I’m focusing on.
If an organization doesn’t know what data it has, how can it destroy that data? This is where data classification comes into play, along with retention periods based on a label and a policy. The data could contain credit card numbers, social security numbers, business secrets, etc.
Retention examples that all (Exchange) mailbox owners have
In Exchange, retention is used when a user deletes an email from their Inbox.
Below you can see the default retention rules that a mailbox has. Sounds familiar, right?
| Name | Type | Retention age (days) | Retention action |
|---|---|---|---|
| Default 2 years move to archive | Default Policy Tag (DPT) | 730 | Move to Archive |
| Recoverable Items 14 days move to archive | Recoverable Items folder | 14 | Move to Archive |
| Personal 1 year move to archive | Personal tag | 365 | Move to Archive |
| Personal 5 year move to archive | Personal tag | 1,825 | Move to Archive |
| Personal never move to archive | Personal tag | Not applicable | Move to Archive |
| 1 Week Delete | Personal tag | 7 | Delete and Allow Recovery |
| 1 Month Delete | Personal tag | 30 | Delete and Allow Recovery |
| 6 Month Delete | Personal tag | 180 | Delete and Allow Recovery |
| 1 Year Delete | Personal tag | 365 | Delete and Allow Recovery |
| 5 Year Delete | Personal tag | 1,825 | Delete and Allow Recovery |
| Never Delete | Personal tag | Not applicable | Delete and Allow Recovery |
At a minimum, Move to Archive and 1 Month Delete are applied to the Inbox and Deleted Items, but you can also use these policies to apply different tags, and therefore different retention settings, to other content.
So this is what everyone has, and probably didn’t even realize that it’s done with retention policies.
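To make the table concrete, here is a minimal Python sketch of how a retention age turns into a due date for the retention action. The `RetentionTag` class and `action_date` function are hypothetical names used only for illustration, and the sketch assumes a single known start date for the item; Exchange itself decides the start date per item type, which is not modeled here.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass(frozen=True)
class RetentionTag:
    """Hypothetical model of one row in the table above."""
    name: str
    age_days: Optional[int]  # None stands for "Not applicable" (never)
    action: str


def action_date(tag: RetentionTag, start: date) -> Optional[date]:
    """Return the date the tag's action becomes due, or None if it never does."""
    if tag.age_days is None:
        return None
    return start + timedelta(days=tag.age_days)


default_tag = RetentionTag("Default 2 years move to archive", 730, "Move to Archive")
never_delete = RetentionTag("Never Delete", None, "Delete and Allow Recovery")

# An item whose retention clock (assumed) starts on 1 January 2021.
print(action_date(default_tag, date(2021, 1, 1)))   # 2023-01-01
print(action_date(never_delete, date(2021, 1, 1)))  # None
```

The "never" tags are modeled as `None` rather than a very large number, so that "no action is ever due" cannot be confused with a real date.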
A retention label defines its own settings: how the start of the retention period is counted and how long the period lasts before removal.
You have 4 predefined event types, but you can define your own from “Create new event type”.
Event-based retention is typically used as part of a records-management process. This means that:
- Retention labels based on events also usually mark items as a record, as a part of a records management solution. For more information, see Learn about records management.
- A document that’s been declared a record but whose event trigger has not yet happened is retained indefinitely (records can’t be permanently deleted), until an event triggers that document’s retention period.
- Retention labels based on events usually trigger a disposition review at the end of the retention period, so that a records manager can manually review and dispose of the content. For more information, see Disposition of content.
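The behavior described in the list above can be sketched in a few lines of Python: the retention period only starts counting once the event has happened, and until then there is no disposal date at all. The function name `disposal_date` and the example dates are assumptions for illustration, not part of any Microsoft API.

```python
from datetime import date, timedelta
from typing import Optional


def disposal_date(event_date: Optional[date], retention_days: int) -> Optional[date]:
    """Return when the retention period ends for an event-based label.

    If the event has not happened yet (event_date is None), the item is
    retained indefinitely, so there is no disposal date.
    """
    if event_date is None:
        return None
    return event_date + timedelta(days=retention_days)


# Event not triggered yet: the record stays, no end date exists.
print(disposal_date(None, 365))              # None

# Hypothetical event (for example an employee leaving) on 1 March 2022.
print(disposal_date(date(2022, 3, 1), 365))  # 2023-03-01
```

With a disposition review enabled, the returned date would be the point where a reviewer is asked to decide, not the point where the content is deleted automatically.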
If you want to have a person review the content before removal, you can choose “Trigger a disposition review”.
It’s like Access Reviews, but for content. It is useful, for example, when a data protection officer needs to see the content before it is removed.
After the label is created
When you finish creating the label, you can directly enable the following:
Publish means that you make the label available for users to see and apply. Auto-apply means auto-labeling, which applies the label to content based on location or the various rules available. I will cover auto-labeling a little bit later.
When you define the policies, you can choose the locations where you want them to apply.
You can also add exclusions. For SharePoint and OneDrive sites you have to define the exclusions with the full URL, but for Exchange and Groups you can choose them directly from the list.
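As a rough mental model of locations and exclusions, the following Python sketch treats a location as "everything, minus the excluded entries". The `LocationScope` class is hypothetical, the `contoso` URLs are made up, and the normalization (ignoring case and a trailing slash) is my own assumption, not a description of how the service matches URLs.

```python
from dataclasses import dataclass, field
from typing import Set


def _normalize(identifier: str) -> str:
    # Assumption for this sketch: compare without trailing slash, ignoring case.
    return identifier.rstrip("/").lower()


@dataclass
class LocationScope:
    """Hypothetical model of one policy location, e.g. SharePoint sites."""
    include_all: bool = True
    included: Set[str] = field(default_factory=set)
    excluded: Set[str] = field(default_factory=set)

    def applies_to(self, identifier: str) -> bool:
        key = _normalize(identifier)
        if key in {_normalize(e) for e in self.excluded}:
            return False  # exclusions always win
        if self.include_all:
            return True
        return key in {_normalize(i) for i in self.included}


# SharePoint exclusions are given as full site URLs.
sharepoint = LocationScope(
    excluded={"https://contoso.sharepoint.com/sites/Public"},
)

print(sharepoint.applies_to("https://contoso.sharepoint.com/sites/Finance"))  # True
print(sharepoint.applies_to("https://contoso.sharepoint.com/sites/public/"))  # False
```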
So now we have the really secret policy in place.
It can take up to 1 day for the labels to appear for users. In my experience it doesn’t take this long, but Microsoft reserves time for the process, just like with custom domain removal from a tenant: https://docs.microsoft.com/en-us/microsoft-365/admin/get-help-with-domains/remove-a-domain?view=o365-worldwide#how-long-does-it-take-for-a-domain-to-be-removed
There are two kinds of labeling options available.
Client-side labeling when users edit documents or compose (also reply or forward) emails: Use a label that’s configured for auto-labeling for files and emails (includes Word, Excel, PowerPoint, and Outlook). This method supports recommending a label to users, as well as automatically applying a label. But in both cases, the user decides whether to accept or reject the label, to help ensure the correct labeling of content. This client-side labeling has minimal delay for documents because the label can be applied even before the document is saved. However, not all client apps support auto-labeling. This capability is supported by the Azure Information Protection unified labeling client, and some versions of Office. For configuration instructions, see How to configure auto-labeling for Office apps on this page.
Service-side labeling when content is already saved (in SharePoint or OneDrive) or emailed (processed by Exchange Online): Use an auto-labeling policy. You might also hear this method referred to as auto-labeling for data at rest (documents in SharePoint and OneDrive) and data in transit (email that is sent or received by Exchange). For Exchange, it doesn’t include emails at rest (mailboxes). Because this labeling is applied by services rather than by applications, you don’t need to worry about what apps users have and what version. As a result, this capability is immediately available throughout your organization and suitable for labeling at scale. Auto-labeling policies don’t support recommended labeling because the user doesn’t interact with the labeling process. Instead, the administrator runs the policies in simulation mode to help ensure the correct labeling of content before actually applying the label.
Auto-labeling, which is covered next, is service-side labeling.
With auto-labeling you can search content for sensitive info, use keywords to find the content, or build trainable classifiers.
The other ones are probably self-explanatory, but trainable classifiers aren’t.
Microsoft has defined a set of classifiers. Basically, a classifier describes whatever kind of information you want to find in files; it could be (as in the Microsoft classifiers) offensive language or threats. Classifiers support multiple languages, so they can also be used in global organizations.
You pick seed samples for a classifier from real content, and then you use those seeds to train it.
GDPR data from content
I will be covering GDPR, as it is already in place, but there is an insane amount of other options that you can also use.
Next you choose how many instances of the current definition have to be found inside the content for it to be marked.
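The instance count works like a threshold: content is only marked when enough matches are found. The Python sketch below illustrates the idea with a stand-in pattern for 16-digit card numbers plus a Luhn checksum. The pattern, the function names and the sample text are all assumptions for illustration; they are not Microsoft’s actual sensitive info type definitions.

```python
import re

# Stand-in pattern: 16 digits, optionally separated by spaces or dashes.
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){15}\d\b")


def luhn_valid(candidate: str) -> bool:
    """Check the Luhn checksum so random 16-digit numbers are not counted."""
    digits = [int(c) for c in candidate if c.isdigit()]
    total = 0
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:  # double every second digit from the right
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def count_instances(text: str) -> int:
    """Count pattern matches that also pass the checksum."""
    return sum(1 for m in CARD_PATTERN.finditer(text) if luhn_valid(m.group()))


def should_label(text: str, min_instances: int) -> bool:
    """Mark the content only when the instance threshold is reached."""
    return count_instances(text) >= min_instances


sample = "Cards: 4111 1111 1111 1111 and 5500-0000-0000-0004, not 1234 5678 9012 3456."
print(count_instances(sample))   # 2
print(should_label(sample, 1))   # True
print(should_label(sample, 5))   # False
```

A low threshold catches more documents but also more false positives, which is one reason to try a rule in simulation before applying it.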
And again, the same thing as before with auto-labeling: define the locations and exclusions.
And we are done, just next and submit.
And we have a labeling requirement for a document.
When auto-labeling kicks in, it will not label this document, as it has already been labeled with the same label.
So when content has been manually labeled, that label will never be replaced by automatic labeling. However, automatic labeling can replace a lower priority label that was automatically applied.
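The replacement rule above can be written down as a small decision function. This is only a sketch of the rule as stated in this post, with hypothetical names (`AppliedLabel`, `can_auto_apply`) and the assumption that a higher number means a higher priority.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AppliedLabel:
    """Hypothetical record of the label currently on a document."""
    name: str
    priority: int   # assumption: higher number = higher priority
    manual: bool    # True if a user applied it by hand


def can_auto_apply(existing: Optional[AppliedLabel], new_priority: int) -> bool:
    """Decide whether auto-labeling may put a new label on the content."""
    if existing is None:
        return True   # nothing to replace
    if existing.manual:
        return False  # manual labels are never replaced
    # Only a lower priority, automatically applied label can be replaced.
    return new_priority > existing.priority


manual = AppliedLabel("Really Secret", priority=2, manual=True)
auto_low = AppliedLabel("General", priority=1, manual=False)

print(can_auto_apply(None, 2))      # True
print(can_auto_apply(manual, 3))    # False
print(can_auto_apply(auto_low, 2))  # True
print(can_auto_apply(auto_low, 1))  # False (same label, nothing to do)
```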
Final thoughts on retention
Marking and removing documents is necessary because of regulations and the restrictions based on them. Users can have regulated data, and finding that data can be tricky without proper tools.
Removing data, or just archiving it away from users’ OneDrive and organizational SharePoint sites, is something you can easily do as long as you understand how retention works.
Next posts will cover Azure Information Protection and Data Loss Prevention parts of Compliance.
Until next time,