
Lab 5: Detection Logic using DLP Engine

🔄 Switch to Lab Tenant (Tenant 2)
Module 2 starts here. Log out of the Enterprise Tenant and log in using your Lab Tenant credentials (ZIdentity URL from your email). Use your updated password, not the one-time password.

Lab 5 | ⏱ 15 min | ⚗ Lab Tenant · Read/Write | 👤 Alex
In the previous lab, you assessed Microsoft Copilot readiness by identifying sensitive data exposure across SharePoint, OneDrive, and Teams. Now Alex needs to build the detection logic that identifies sensitive content in documents, which is the foundation for every enforcement policy in Labs 6, 7, and 8.
🛡 Alex: Security Administrator
Lab Tenant (Read/Write), Configuration Mode
You are Alex. Before any policy can protect data, the system needs to know what sensitive data looks like. The detection logic you build here will be reused across Web, Endpoint, and Browser enforcement in Labs 6, 7, and 8.

🔗 Dependency: The DLP dictionary and engine created in this lab are referenced in Labs 6, 7, and 8. Complete all steps before moving to the next lab.

🎯 Build custom detection logic (dictionaries and engines) that will be reused across all protection policies in Labs 6, 7, and 8.

Step 1: Navigate to DLP Dictionaries and Review Predefined Dictionaries

Policies → Data Protection → Common Resources → Dictionaries & Engines

Figure: Dictionaries & Engines navigation path.

Review the list of predefined dictionaries. Locate and observe the following three dictionaries used in this lab:

  • Credit Cards
  • Social Security Numbers (US)
  • ABA Bank Routing Number
💡 Key Insight

These predefined dictionaries provide built-in detection for commonly regulated data types, so no configuration is required to get started with standard compliance frameworks.

Step 2: Modify Predefined Dictionaries to High Confidence

🎯 Update predefined dictionaries to use High Confidence scoring and extended proximity for more accurate detection.

Open each of the three predefined dictionaries in turn: Credit Cards, Social Security Numbers (US), and ABA Bank Routing Number. In each one, update the following settings:

| Setting | Value |
| --- | --- |
| Confidence Score Threshold | High |
| Proximity Length | 200 characters |

Save each dictionary after making the changes.

Figure: Edit DLP Dictionary dialog for Credit Cards, with Confidence Score Threshold set to High and Proximity Length set to 200. Apply the same settings to Social Security Numbers (US) and ABA Bank Routing Number.
💡 Facilitator Notes

Explain that increasing proximity from the default (50) to 200 means the system looks at a wider window of surrounding text to confirm context. A credit card number appearing within 200 characters of keywords like "card number", "payment", or "billing" has much higher confidence than one appearing in isolation. High Confidence combined with wider proximity reduces false positives in real-world payroll and financial documents.
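
The proximity idea in the note above can be sketched in a few lines of Python. This is only an illustration, not how the DLP engine is implemented: the 16-digit pattern, the keyword list, and the function name `find_candidates_with_context` are assumptions made for the example, and the window is measured in characters on either side of the match.

```python
import re

# Hypothetical stand-ins for illustration only; the product's real
# patterns and keyword lists are not shown in this lab.
CARD_PATTERN = re.compile(r"\b\d{4}[- ]?\d{4}[- ]?\d{4}[- ]?\d{4}\b")
KEYWORDS = ("card number", "payment", "billing")


def find_candidates_with_context(text: str, proximity: int = 200) -> list[tuple[str, bool]]:
    """Return each candidate number with a flag for keyword context.

    The flag is True when any keyword occurs within `proximity`
    characters before or after the matched number.
    """
    results = []
    lowered = text.lower()
    for match in CARD_PATTERN.finditer(text):
        start = max(0, match.start() - proximity)
        end = min(len(text), match.end() + proximity)
        window = lowered[start:end]
        results.append((match.group(), any(k in window for k in KEYWORDS)))
    return results


# The keyword "Billing" sits more than 100 characters before the number.
sample = "Billing contact on file. " + "x" * 100 + " 4111 1111 1111 1111"
print(find_candidates_with_context(sample, proximity=50))   # [('4111 1111 1111 1111', False)]
print(find_candidates_with_context(sample, proximity=200))  # [('4111 1111 1111 1111', True)]
```

With a 50-character window the keyword falls outside the range and the number has no supporting context; at 200 characters the same number picks up the "Billing" keyword.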

Step 3: Create a Custom DLP Dictionary

🎯 Build custom detection logic for organization-specific sensitive data.

Click Add DLP Dictionary and configure with the following settings.

| Field | Value |
| --- | --- |
| Name | DP Project Code |
| Dictionary Type | Patterns & Phrases |
| Enable Proximity | Enabled |
| Proximity Length | 200 |

Add the following detection patterns:

DP-PRJ-2025-\d{4}
DAC-\d{7}

Set the action to Count Unique.
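
To sanity-check the two patterns before saving, they can be tried against sample text. The sketch below uses Python's `re` module, whose regex dialect may differ in detail from the one the DLP engine uses. The sample codes are made up, and reading Count Unique as "count distinct matched values" is an assumption based on the option name.

```python
import re

# The two patterns from this step, written as raw strings.
PATTERNS = [r"DP-PRJ-2025-\d{4}", r"DAC-\d{7}"]


def count_matches(text: str) -> tuple[int, int]:
    """Return (total matches, distinct matched values) across both patterns."""
    found = []
    for pattern in PATTERNS:
        found.extend(re.findall(pattern, text))
    return len(found), len(set(found))


# Made-up sample text: one project code appears twice.
sample = (
    "Budget for DP-PRJ-2025-0042 approved. "
    "See DAC-1234567 for access. "
    "DP-PRJ-2025-0042 is Internal Only."
)
total, unique = count_matches(sample)
print(total, unique)  # 3 2
```

Neither pattern is anchored, so in Python a longer string such as `DAC-12345678` still matches on its first seven digits. Whether that matters depends on how the real identifiers are formatted.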

Then add the following contextual phrases:

  • Confidential
  • Internal Only
  • Salary
  • Payroll
  • Project Codes
  • Internal
Figure: DP Project Code dictionary with the Patterns & Phrases type, proximity enabled at 200, two regex patterns (DP-PRJ-2025 and DAC), and contextual phrases configured.
💡 Key Insight

Custom dictionaries allow organizations to detect proprietary identifiers that standard compliance templates do not cover, such as project codes, internal classifications, or domain-specific terminology unique to your business.

Step 4: Create Detection Logic Using a DLP Engine

🎯 Combine multiple detection signals into a single classification rule.

Switch to the DLP Engines tab, then click + Add DLP Engine.

Figure: Switch to the DLP Engines tab (1) and click + Add DLP Engine (2) to begin creating the detection engine.

| Field | Value |
| --- | --- |
| Name | DP Project Code |
| Operator | ALL |

Add the following detection components, each with condition > 0:

  • Credit Cards
  • Social Security Numbers (US)
  • ABA Bank Routing Number
  • DP Project Code
Figure: DLP Engine expression with the ALL operator combining Credit Cards, SSN, ABA Routing, and DP Project Code detection.

Review the expression preview. It should display:

((Credit Cards > 0) AND (Social Security Numbers (US) > 0) AND (ABA Bank Routing Number > 0) AND (DP Project Code > 0))
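
The preview string follows mechanically from the component list and the `> 0` condition. As a rough sketch, the same expression can be assembled and evaluated in Python. The `build_expression` and `evaluate_all` helpers and the match counts are hypothetical and not part of the product.

```python
COMPONENTS = [
    "Credit Cards",
    "Social Security Numbers (US)",
    "ABA Bank Routing Number",
    "DP Project Code",
]


def build_expression(components: list[str]) -> str:
    """Render the components as an ALL-operator preview string."""
    return "(" + " AND ".join(f"({name} > 0)" for name in components) + ")"


def evaluate_all(counts: dict[str, int], components: list[str]) -> bool:
    """ALL semantics: every component needs at least one match."""
    return all(counts.get(name, 0) > 0 for name in components)


print(build_expression(COMPONENTS))

# Hypothetical match counts for a document that contains no project code.
counts = {
    "Credit Cards": 3,
    "Social Security Numbers (US)": 12,
    "ABA Bank Routing Number": 1,
}
print(evaluate_all(counts, COMPONENTS))  # False
```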
💡 Facilitator Notes

This logic represents a high-confidence detection scenario where multiple sensitive data elements appear together, which is exactly the pattern you'd expect in a payroll file like Dataparity_Q2_2025_Payroll_Report.docx.

Emphasize the separation of detection and enforcement: this engine can be reused across Labs 6, 7, and 8 without reconfiguration. Tune detection once, apply everywhere.

Step 5: Understand How Detection Logic Supports Enforcement

🎯 Connect detection logic to future protection scenarios.

This detection logic will be reused in the following labs:

| Lab | Channel | Action |
| --- | --- | --- |
| Lab 6 | Web | Block sensitive data uploads |
| Lab 7 | Endpoint | Prevent exfiltration to removable media |
| Lab 8 | Browser | Control copy and paste |

💬 Discussion
  • Why is it important to combine multiple detection signals instead of relying on a single identifier?
  • How does proximity detection reduce false positives?
  • What types of organization-specific identifiers should be added to custom dictionaries?
  • How does this detection logic support consistent protection across Web, Endpoint, and Browser environments?
💡 Key Insight

Detection logic defines what is sensitive. Policies define what to do about it.

Once created, the same logic can be reused across multiple enforcement channels to provide consistent data protection across Web, Endpoint, and Browser environments, with no duplication and no drift.
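
One way to picture this separation is as a small data model in which each policy holds a reference to a shared engine rather than its own copy. The `Engine` and `Policy` classes below are hypothetical and only illustrate the relationship; they are not the product's actual data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Engine:
    """Detection: what counts as sensitive."""
    name: str
    components: tuple[str, ...]


@dataclass(frozen=True)
class Policy:
    """Enforcement: what to do when the engine matches."""
    lab: str
    channel: str
    action: str
    engine: Engine  # a reference to the shared engine, not a copy


engine = Engine(
    "DP Project Code",
    (
        "Credit Cards",
        "Social Security Numbers (US)",
        "ABA Bank Routing Number",
        "DP Project Code",
    ),
)

policies = [
    Policy("Lab 6", "Web", "Block sensitive data uploads", engine),
    Policy("Lab 7", "Endpoint", "Prevent exfiltration to removable media", engine),
    Policy("Lab 8", "Browser", "Control copy and paste", engine),
]

for p in policies:
    print(f"{p.lab}: {p.channel} -> {p.action} (engine: {p.engine.name})")

# Every policy points at the same engine object.
print(all(p.engine is engine for p in policies))  # True
```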

💡 Facilitator Notes

Transition line: "The system now knows what sensitive data looks like. In Lab 6, we'll use that to stop it from leaving the organization via the web."

If attendees ask why ALL rather than ANY: ALL is intentionally high-confidence to minimize false positives for a block action. ANY would cast a wider net and is better suited to an alert-only policy.
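
If the ALL versus ANY question comes up, a small comparison can make the trade-off concrete. The two documents and their match counts below are invented for illustration, and the `matches` helper is a hypothetical sketch of the operator semantics rather than the engine's code.

```python
# Hypothetical per-dictionary match counts for two documents.
payroll_report = {
    "Credit Cards": 4,
    "Social Security Numbers (US)": 20,
    "ABA Bank Routing Number": 20,
    "DP Project Code": 1,
}
receipt_email = {
    "Credit Cards": 1,
    "Social Security Numbers (US)": 0,
    "ABA Bank Routing Number": 0,
    "DP Project Code": 0,
}


def matches(counts: dict[str, int], operator: str) -> bool:
    """ALL requires every dictionary to hit; ANY requires at least one."""
    hits = [n > 0 for n in counts.values()]
    return all(hits) if operator == "ALL" else any(hits)


for name, doc in (("payroll_report", payroll_report), ("receipt_email", receipt_email)):
    print(name, "ALL:", matches(doc, "ALL"), "ANY:", matches(doc, "ANY"))
```

Under ALL only the payroll report matches, while under ANY the single-card receipt email matches as well, which is why ANY fits alerting better than blocking.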

Zenith Live 2026 · Dataparity