Egnyte Secure & Govern has recently released an improved and expanded ML detection model for the Unusual Access Analysis Rule. The new multivariate detection model is discussed in further detail below.
For Unusual Access detections to occur, our ML model requires 60 days of user history and 90 days of content source activity. Waiting 60 days ensures our ML model is properly trained on a user's usage patterns and prevents a high number of false positives.
General
What is the Unusual Access Multivariate Anomaly Detection model?
The new multivariate anomaly detection model detects Unusual Access anomalies. The improved model focuses on file access/download and file deletion anomalies. The model now considers the following variants for Unusual Access detections:
- The volume of files (accessed/downloaded and/or deleted)
- Sensitivity of the files
- Location of access
- Time-of-day of access
Can I control which model variants are used for Unusual Access detections?
Yes, each of the variants can be turned on/off independently.
For existing customers, only "File access/downloads" and "File deletes" are enabled by default. For new customers, all variants are enabled by default.
Syncing content for offline access does NOT contribute towards the Unusual Access detections.
Model Variants
Number of Files Accessed/Downloaded
Number of Files Deleted
Number of Sensitive Files Accessed/Downloaded
Number of Files Accessed/Downloaded by Location
Number of Files Accessed/Downloaded by Time of Day
Number of Files Accessed/Downloaded
Detects unusual file access/download patterns using user-based machine learning.
How does the model variant work?
This variant of the ML model works on a per-user basis and captures daily file accessed/downloaded patterns for each user.
Example: The current user-based model prediction for User A = 30 files accessed/downloaded
- If User A has 35 files accessed/downloaded on a given day, the user has exceeded their predicted rate of 30, and an Unusual Access anomaly will be detected.
- If User A has 25 files accessed/downloaded on a given day, the user has NOT exceeded their predicted rate of 30, and an Unusual Access anomaly will NOT be detected.
The above example does not take into account the "threshold" settings discussed below.
What settings can be modified to change the number of detections?
- Minimum file threshold
- Sensitivity threshold (low, medium, high)
The minimum file threshold and Sensitivity threshold work independently. Changing the Sensitivity threshold from low to medium does NOT change or impact the Minimum file threshold.
How does Minimum file threshold work?
The minimum file threshold sets the minimum number of files accessed/downloaded required to generate an Unusual Access detection. The minimum file threshold works independently from user-based model predictions and sensitivity thresholds.
By default, the Unusual Access minimum file threshold parameter is ignored if at least one file from a detection contains sensitive content. Customers can change this default, so that the minimum file threshold still applies when sensitive files are detected, by unchecking the "Ignore the minimum threshold when sensitive files are detected" checkbox.
Example #1: Minimum file threshold = 20 and user-based model prediction = 15 files accessed/downloaded
- If the user has 18 files accessed/downloaded, the user has exceeded their predicted rate of 15 but has NOT exceeded the minimum file threshold of 20. An Unusual Access anomaly will NOT be detected.
Example #2: Minimum file threshold = 20 and user-based model prediction = 30 files accessed/downloaded
- If a user has 25 files accessed/downloaded, the user has NOT exceeded their predicted rate of 30 but has exceeded the minimum file threshold of 20. An Unusual Access anomaly will NOT be detected.
- If a user has 35 files accessed/downloaded, the user has exceeded their predicted rate of 30 and has also exceeded the minimum file threshold of 20. An Unusual Access anomaly will be detected.
The above example does not take into account the "Sensitivity threshold" setting discussed below.
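The two examples above can be expressed as a small decision function. This is only an illustrative sketch, not Egnyte's implementation: the function name and parameters are hypothetical, and it assumes that a count equal to the minimum file threshold is enough to qualify, while the predicted rate must be strictly exceeded.

```python
def is_unusual_access(observed, predicted, min_file_threshold,
                      has_sensitive_file=False, ignore_min_when_sensitive=True):
    """Hypothetical sketch of the minimum file threshold logic.

    observed: files accessed/downloaded by the user that day
    predicted: the user-based model prediction for that user
    """
    # The user must first exceed their own predicted rate.
    exceeds_prediction = observed > predicted

    # Default behavior: the minimum file threshold is ignored when at least
    # one file in the detection contains sensitive content.
    if has_sensitive_file and ignore_min_when_sensitive:
        return exceeds_prediction

    # Assumption: reaching the minimum exactly is sufficient.
    return exceeds_prediction and observed >= min_file_threshold


# Example #1: threshold 20, prediction 15, 18 files
print(is_unusual_access(18, 15, 20))   # False
# Example #2: threshold 20, prediction 30
print(is_unusual_access(25, 30, 20))   # False
print(is_unusual_access(35, 30, 20))   # True
```

With the default checkbox setting, the first case would flip to a detection if one of the 18 files were sensitive, because the minimum file threshold would no longer apply.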
What happens when I change the Minimum file threshold?
Any minimum file threshold changes will automatically affect any future Unusual Access detections.
Example: Changing threshold from 10 to 50
- No future Unusual Access detections will occur for any user who has fewer than 50 files accessed/downloaded.
How does Sensitivity threshold work?
The sensitivity threshold works in conjunction with user-based model prediction. The Sensitivity threshold determines the multiplier that is applied to the user-based predictions to determine when an Unusual Access anomaly is detected.
- The sensitivity threshold does NOT change or impact the minimum file threshold
- A Sensitivity setting of "Low" will generate the most detections, while a Sensitivity setting of "High" will generate the fewest detections
- Sensitivity settings are tied to multipliers for each user:
- Low = Multiplier of 1
- Medium = Multiplier of 3
- High = Multiplier of 10
Example: The user-based model prediction = ten files accessed/downloaded per day for User "A"
- If the Sensitivity threshold is set to "Low," the user-based model prediction would still be 10 (10 x 1) files accessed/downloaded per day for User "A."
- If the Sensitivity threshold is set to "Medium," the user-based model prediction would now be 30 (10 x 3) files accessed/downloaded per day for User "A."
- If the Sensitivity threshold is set to "High," the user-based model prediction would now be 100 (10 x 10) files accessed/downloaded per day for User "A."
The above example does not take into account the "Minimum file threshold" setting discussed above.
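The multiplier arithmetic in this example can be sketched as follows. The multiplier values come from the list above; the dictionary and function names are hypothetical and used only for illustration.

```python
# Multipliers as listed above for this model variant.
SENSITIVITY_MULTIPLIERS = {"low": 1, "medium": 3, "high": 10}


def adjusted_prediction(predicted, sensitivity):
    """Scale the user-based prediction by the Sensitivity multiplier."""
    return predicted * SENSITIVITY_MULTIPLIERS[sensitivity.lower()]


# User "A" with a prediction of 10 files accessed/downloaded per day
for level in ("Low", "Medium", "High"):
    print(level, adjusted_prediction(10, level))
```

For a prediction of 10, this yields 10, 30, and 100, matching the three bullets above.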
What happens when I change the Sensitivity threshold?
Any changes will automatically affect any future Unusual Access detections.
Example: Changing threshold from medium to high
- The Sensitivity threshold multiplier changes from 3 to 10
Initially, customers may see an increase in Unusual Access detections involving Windows Explorer (explorer.exe). This is due to a current limitation of Windows Explorer, which prevents differentiating between files accessed during a Windows search and actual user file downloads. The number of detections will decrease over time as our ML model adjusts to a user's behavior.
Number of Files Deleted
Detects unusual file deletion patterns using user-based machine learning.
How does the model variant work?
This variant of the ML model works on a per-user basis and captures daily file deletion patterns for each user.
Example: The current user-based model prediction for User A = 30 deleted files
- If User A has 35 files deleted on a given day, the user has exceeded their predicted rate of 30, and an Unusual Access anomaly will be detected.
- If User A has 25 files deleted on a given day, the user has NOT exceeded their predicted rate of 30, and an Unusual Access anomaly will NOT be detected.
The above example does not take into account the "threshold" settings discussed below.
What settings can be modified to change the number of detections?
- Minimum file threshold
- Sensitivity threshold (low, medium, high)
The minimum file threshold and Sensitivity threshold work independently. Changing the Sensitivity threshold from low to medium does NOT change or impact the Minimum file threshold.
How does Minimum file threshold work?
The minimum file threshold sets the minimum number of deleted files required to generate an Unusual Access detection. The minimum file threshold works independently from user-based model predictions and sensitivity thresholds.
By default, the Unusual Access minimum file threshold parameter is ignored if at least one file from a detection contains sensitive content. Customers can change this default, so that the minimum file threshold still applies when sensitive files are detected, by unchecking the "Ignore the minimum threshold when sensitive files are detected" checkbox.
Example #1: Minimum file threshold = 20 and user-based model prediction = 15 deleted files
- If a user has 18 deleted files, the user has exceeded their predicted rate of 15 but has NOT exceeded the minimum file threshold of 20. An Unusual Access anomaly will NOT be detected.
Example #2: Minimum file threshold = 20 and user-based model prediction = 30 deleted files
- If a user has 25 deleted files, the user has NOT exceeded their predicted rate of 30 but has exceeded the minimum file threshold of 20. An Unusual Access anomaly will NOT be detected.
- If a user has 35 deleted files, the user has exceeded their predicted rate of 30 and has also exceeded the minimum file threshold of 20. An Unusual Access anomaly will be detected.
The above example does not take into account the "Sensitivity threshold" setting discussed below.
What happens when I change the Minimum file threshold?
Any minimum file threshold changes will automatically affect any future Unusual Access detections.
Example: Changing threshold from 10 to 50
- No future Unusual Access detections will occur for any user who has fewer than 50 deleted files.
How does Sensitivity threshold work?
The sensitivity threshold works in conjunction with user-based model prediction. The Sensitivity threshold determines the multiplier that is applied to the user-based predictions to determine when an Unusual Access anomaly is detected.
- The sensitivity threshold does NOT change or impact the minimum file threshold
- A Sensitivity setting of "Low" will generate the most detections, while a Sensitivity setting of "High" will generate the fewest detections
- Sensitivity settings are tied to activity percentiles for each user:
- Low = 90% or higher
- Medium = 95% or higher
- High = 99% or higher
Example: In the chart below, the user's activity ranges from 5 to 80 deleted files daily.
- The median for this user is ~41 deleted files.
- If the Sensitivity threshold is set to "Low," an Unusual Access detection would occur at ~64 deleted files or higher.
- If the Sensitivity threshold is set to "Medium," an Unusual Access detection would occur at ~71 deleted files or higher.
- If the Sensitivity threshold is set to "High," an Unusual Access detection would occur at ~78 deleted files or higher.
User Detection Frequency = The number of days a user deletes "x" number of files.
User Daily Files Deleted = The actual number of files a user deleted per day.
The above example does not take into account the "Minimum file threshold" setting discussed above.
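To illustrate how a percentile-based cutoff could be derived from a user's history, here is a minimal sketch. It is not the product's actual model: the nearest-rank percentile method, the function names, and the sample history are all assumptions made for illustration, so the cutoffs it prints will not match the chart values above.

```python
# Percentiles as listed above for each Sensitivity setting.
SENSITIVITY_PERCENTILES = {"low": 90, "medium": 95, "high": 99}


def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of data at or below it."""
    ordered = sorted(values)
    # Integer ceiling of pct * n / 100, clamped to at least rank 1.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def detection_cutoff(daily_counts, sensitivity):
    """Daily count at or above which a detection would occur (assumed rule)."""
    return percentile_nearest_rank(daily_counts, SENSITIVITY_PERCENTILES[sensitivity.lower()])


# Hypothetical history: 100 days with 1, 2, ..., 100 deleted files.
history = list(range(1, 101))
for level in ("Low", "Medium", "High"):
    print(level, detection_cutoff(history, level))
```

Moving from "Low" to "High" raises the cutoff toward the top of the user's observed range, which is why "High" produces the fewest detections.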
What happens when I change the Sensitivity threshold?
Any changes will automatically affect any future Unusual Access detections.
Example: Changing threshold from medium to high
- Unusual Access detections will only occur when a user crosses over the 99% file threshold (instead of 95%)
Number of Sensitive Files Accessed/Downloaded
Detects unusual sensitive file access/download patterns using user-based machine learning.
How does the model variant work?
This variant of the ML model works on a per-user basis and captures daily sensitive file accessed/downloaded patterns for each user.
Example: The current user-based model prediction for User A = 15 sensitive files accessed/downloaded
- If User A has 20 sensitive files accessed/downloaded on a given day, the user has exceeded their predicted rate of 15, and an Unusual Access anomaly will be detected.
- If User A has 12 sensitive files accessed/downloaded on a given day, the user has NOT exceeded their predicted rate of 15, and an Unusual Access anomaly will NOT be detected.
The above example does not take into account the "threshold" settings discussed below.
What settings can be modified to change the number of detections?
- Sensitivity threshold (low, medium, high)
Only the Sensitivity threshold is available for this model variant. The minimum file threshold only applies to the "files deleted" and "files accessed/downloaded" model variants.
How does Sensitivity threshold work?
Sensitivity threshold works in conjunction with user-based model prediction. The Sensitivity threshold determines the multiplier that is applied to the user-based predictions to determine when an Unusual Access anomaly is detected.
- A Sensitivity setting of "Low" will generate the most detections, while a Sensitivity setting of "High" will generate the fewest detections
- Sensitivity settings are tied to activity percentiles for each user:
- Low = 90% or higher
- Medium = 95% or higher
- High = 99% or higher
Example: In the chart below, the user's activity ranges from 2 to 33 sensitive files accessed/downloaded daily.
- The median for this user is ~18 sensitive files accessed/downloaded.
- If the Sensitivity threshold is set to "Low," an Unusual Access detection would occur at ~27 sensitive files accessed/downloaded or higher.
- If the Sensitivity threshold is set to "Medium," an Unusual Access detection would occur at ~30 sensitive files accessed/downloaded or higher.
- If the Sensitivity threshold is set to "High," an Unusual Access detection would occur at ~32 sensitive files accessed/downloaded or higher.
User Detection Frequency = The number of days a user accesses/downloads "x" number of sensitive files.
User Daily Sensitive Files Accessed/Downloaded = The actual number of sensitive files a user accessed/downloaded per day.
What happens when I change the Sensitivity threshold?
Any changes will automatically affect any future Unusual Access detections.
Example: Changing threshold from medium to high
- Unusual Access detections will only occur when a user crosses over the 99% file threshold (instead of 95%)
Number of Files Accessed/Downloaded by Location
Detects unusual file access/download patterns by location using user-based machine learning.
How does the model variant work?
This variant of the ML model works on a per-user basis and captures daily file accessed/downloaded patterns, based on distance from the primary location, for each user. The model variant identifies a user's most common single location (primary location) and calculates an acceptable median distance (x number of miles/kilometers) from that location. The median distance is NOT a defined or a set number of miles/kilometers. Acceptable median distance will be determined by a user's usage patterns.
- The model will analyze user history to define a median distance to prevent work-from-home detections.
- When the model identifies one or more locations outside the acceptable median distance, based on user history, an Unusual Access detection will occur.
Example: The user-based model prediction median distance = 20 miles from their primary location for User "A"
- If user "A" accesses/downloads files 25 miles from their primary location, an Unusual Access anomaly will be detected.
- If user "A" accesses/downloads files 20 miles or less from their primary location, an Unusual Access anomaly will NOT be detected.
The above example does not take into account the "threshold" settings discussed below.
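The distance check in this example can be sketched as below. How the product actually measures distance is not stated in this article; the haversine (great-circle) formula, the Earth radius constant, and the function names are assumptions chosen for illustration.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius, an approximation


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two latitude/longitude points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def is_unusual_location(primary, current, acceptable_miles):
    """True when the access location is farther than the acceptable distance."""
    return haversine_miles(*primary, *current) > acceptable_miles


# Hypothetical coordinates: primary location and two access locations.
primary = (40.0, -75.0)
print(is_unusual_location(primary, (40.1, -75.0), 20))  # about 7 miles away
print(is_unusual_location(primary, (41.0, -75.0), 20))  # about 69 miles away
```

In this sketch the acceptable distance is passed in as a plain number; in the product it is derived from each user's history rather than configured directly.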
What settings can be modified to change the number of detections?
- Sensitivity threshold (low, medium, high)
Only the Sensitivity threshold is available for this model variant. The minimum file threshold only applies to the "files deleted" and "files accessed/downloaded" model variants.
How does Sensitivity threshold work?
Sensitivity threshold works in conjunction with user-based model prediction. The Sensitivity threshold determines the multiplier that is applied to the user-based predictions to determine when an Unusual Access anomaly is detected.
- A Sensitivity setting of "Low" will generate the most detections, while a Sensitivity setting of "High" will generate the fewest detections
- Sensitivity settings are tied to activity percentiles for each user:
- Low = 90% or higher
- Medium = 95% or higher
- High = 99% or higher
Example: In the chart below, the user's activity ranges from 0 to 32 miles from their primary location.
- The median for this user is ~16 miles from the primary location.
- If the Sensitivity threshold is set to "Low," an Unusual Access detection would occur at ~26 miles or higher.
- If the Sensitivity threshold is set to "Medium," an Unusual Access detection would occur at ~29 miles or higher.
- If the Sensitivity threshold is set to "High," an Unusual Access detection would occur at ~31 miles or higher.
User Detection Frequency = The number of times a user accesses/downloads files
User Location Distance = The actual number of miles from their primary location
What happens when I change the Sensitivity threshold?
Any changes will automatically affect any future Unusual Access detections.
Example: Changing threshold from medium to high
- Unusual Access detections will only occur when a user crosses over the 99% distance threshold (instead of 95%)
Number of Files Accessed/Downloaded by Time of Day
Detects unusual file access/download patterns by time of day using user-based machine learning.
How does the model variant work?
This variant of the ML model works on a per-user basis and captures daily file accessed/downloaded patterns, based on time of day, for each user.
The model variant divides a day into 4-hour time buckets (six time buckets total). The time buckets are listed below; the UTC time zone is used for detections:
- 12 AM-4 AM
- 4 AM-8 AM
- 8 AM-12 PM
- 12 PM-4 PM
- 4 PM-8 PM
- 8 PM-12 AM
The model then captures file access/download statistics for each time bucket for each user. An Unusual Access detection will occur when:
- A spike in file access/download activity occurs in any time bucket
- File access/download activity occurs in an unusual time bucket for a user
Example #1 (new time bucket): The user-based model prediction identifies that User A typically has activity in two time buckets: 8 AM-12 PM and 12 PM-4 PM
- If file access/download activity occurs in the 4 AM-8 AM time bucket, an Unusual Access anomaly will be detected.
- If normal file access/download activity occurs in the 8 AM-12 PM time bucket, an Unusual Access anomaly will NOT be detected.
Example #2 (access/download spike): The user-based model prediction identifies User A typically has activity from 8 AM-12 PM and typical activity is 20 files accessed/downloaded
- If 25 files are accessed/downloaded between 8 AM-12 PM, an Unusual Access anomaly will be detected.
- If up to 20 files are accessed/downloaded between 8 AM-12 PM, an Unusual Access anomaly will NOT be detected.
The above example does not take into account the "threshold" settings discussed below.
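As an illustration of the bucketing described above, the sketch below maps a timestamp to one of the six 4-hour UTC buckets and checks whether that bucket is new for a user. The function names and the set of "usual" buckets are hypothetical; only the bucket boundaries and the use of UTC come from this article.

```python
from datetime import datetime, timedelta, timezone

# The six 4-hour buckets listed above, indexed by (UTC hour // 4).
BUCKET_LABELS = [
    "12 AM-4 AM", "4 AM-8 AM", "8 AM-12 PM",
    "12 PM-4 PM", "4 PM-8 PM", "8 PM-12 AM",
]


def time_bucket(ts):
    """Return the UTC time bucket label for a timezone-aware datetime."""
    if ts.tzinfo is None:
        raise ValueError("timestamp must be timezone-aware")
    return BUCKET_LABELS[ts.astimezone(timezone.utc).hour // 4]


def is_new_bucket(ts, usual_buckets):
    """True when activity falls in a bucket the user does not normally use."""
    return time_bucket(ts) not in usual_buckets


# Example #1: User A is normally active 8 AM-12 PM and 12 PM-4 PM (UTC).
usual = {"8 AM-12 PM", "12 PM-4 PM"}
early = datetime(2024, 1, 1, 5, 30, tzinfo=timezone.utc)
normal = datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc)
print(time_bucket(early), is_new_bucket(early, usual))
print(time_bucket(normal), is_new_bucket(normal, usual))
```

Because buckets are evaluated in UTC, a user's local working hours may straddle different buckets than the local clock suggests; for example, 9 AM at UTC-5 falls in the 12 PM-4 PM bucket.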
What settings can be modified to change the number of detections?
- Sensitivity threshold (low, medium, high)
Only the Sensitivity threshold is available for this model variant. The minimum file threshold only applies to the "files deleted" and "files accessed/downloaded" model variants.
How does Sensitivity threshold work?
Sensitivity threshold works in conjunction with user-based model prediction. The Sensitivity threshold determines the multiplier that is applied to the user-based predictions to determine when an Unusual Access anomaly is detected.
- A Sensitivity setting of "Low" will generate the most detections, while a Sensitivity setting of "High" will generate the fewest detections
- Sensitivity settings are tied to activity percentiles for each user:
- Low = 90% or higher
- Medium = 95% or higher
- High = 99% or higher
Example #1: In the chart below, user "A" ranges from 5 to 80 files accessed/downloaded a day from 8 AM-12 PM.
- The median for this user is ~41 files accessed/downloaded from 8 AM-12 PM.
- If the Sensitivity threshold is set to "Low," an Unusual Access detection would occur at ~64 files accessed/downloaded or higher.
- If the Sensitivity threshold is set to "Medium," an Unusual Access detection would occur at ~71 files accessed/downloaded or higher.
- If the Sensitivity threshold is set to "High," an Unusual Access detection would occur at ~78 files accessed/downloaded or higher.
Time Bucket = 8AM - 12PM
User Detection Frequency = The number of days a user accesses/downloads "x" number of files.
User Daily Files Accessed/Downloaded = The actual number of files a user accessed/downloaded per day.
Example #2: In the chart below, user "A" ranges from 0 to 5 files accessed/downloaded a day from 4 PM-8 PM.
- The median for this user is ~2.5 files accessed/downloaded from 4 PM-8 PM.
- If the Sensitivity threshold is set to "Low," an Unusual Access detection would occur at ~3.5 files accessed/downloaded or higher.
- If the Sensitivity threshold is set to "Medium," an Unusual Access detection would occur at ~4 files accessed/downloaded or higher.
- If the Sensitivity threshold is set to "High," an Unusual Access detection would occur at ~4.5 files accessed/downloaded or higher.
Time Bucket = 4PM - 8PM
User Detection Frequency = The number of days a user accesses/downloads "x" number of files.
User Daily Files Accessed/Downloaded = The actual number of files a user accessed/downloaded per day.
What happens when I change the Sensitivity threshold?
Any changes will automatically affect any future Unusual Access detections.
Example: Changing threshold from medium to high
- Unusual Access detections will only occur when a user crosses over the 99% file threshold (instead of 95%)