Every file and folder in the Sensitive Content view has a Risk value and a Confidence value. The risk score quantifies the risk posed by the sensitive content within the file, assuming all the discovered matches are correct. Confidence represents the likelihood that the file or folder actually contains the sensitive matches that give it that risk score.
Confidence can be high, medium, low, or undetermined. The number of items with undetermined confidence will decrease over time as Egnyte analyzes historical sensitive data and quantifies its confidence. High-confidence results tend to contain fewer false-positive matches.
How Confidence is calculated
Confidence is based on several factors, such as:
- The proximity of the sensitive term to surrounding qualifying terms.
- The type of term found.
- Probability scores for any patterns found using machine learning-based approaches.
- How specifically defined the underlying pattern is (for example, Social Security numbers and credit card numbers follow a specific format, which can increase the confidence score associated with those types of patterns).
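As a rough illustration of why a specifically defined pattern and nearby qualifying terms can raise confidence, the sketch below checks a credit card candidate against the Luhn checksum and looks for qualifying words around it. This is a toy example, not Egnyte's detection logic: the names `CARD_CANDIDATE`, `QUALIFYING_TERMS`, `luhn_valid`, and `toy_confidence`, the 40-character window, and the mapping of signals to low, medium, and high are all assumptions made for illustration.

```python
import re

# Hypothetical illustration only; this is NOT how Egnyte computes confidence.
# Loose candidate pattern: 13 to 19 digits, optionally separated by spaces or hyphens.
CARD_CANDIDATE = re.compile(r"\b(?:\d[ -]?){13,19}\b")

# Assumed qualifying terms; a real classifier would use its own vocabulary.
QUALIFYING_TERMS = ("card", "credit", "visa", "mastercard")


def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(c) for c in number if c.isdigit()]
    if len(digits) < 13:
        return False
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def toy_confidence(text: str, window: int = 40):
    """Count two signals for the first candidate: checksum and nearby terms."""
    match = CARD_CANDIDATE.search(text)
    if match is None:
        return None  # no candidate at all
    checksum_ok = luhn_valid(match.group())
    start = max(0, match.start() - window)
    context = text[start:match.end() + window].lower()
    term_nearby = any(term in context for term in QUALIFYING_TERMS)
    signals = int(checksum_ok) + int(term_nearby)
    return ("low", "medium", "high")[signals]


print(toy_confidence("Credit card: 4111 1111 1111 1111 on file"))
print(toy_confidence("Order number 1234 5678 9012 3456"))
```

In this sketch, a digit string that passes the checksum and sits next to a qualifying term collects both signals, while an arbitrary 16-digit order number collects neither. A loosely defined pattern, such as a bare keyword, has no comparable structural check to lean on.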
How to use Confidence
Apply a confidence filter when reviewing items in the Sensitive Content view to find the locations and files most likely to contain sensitive content. Confidence is available as a filter in the main Sensitive Content view and when reviewing details for a specific Sensitive Content location.
Use Confidence as a filter in the Sensitive Content view to find folders most likely to contain sensitive content.
Use Confidence as a filter in the location details view to find files most likely to contain sensitive content.
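The same triage can be sketched against a list of results, for example one assembled by hand for review. The `Finding` record, its field names, and the sample paths below are hypothetical and do not represent an Egnyte export format or API. The sketch keeps only items at the chosen confidence levels and lists the riskiest first.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    # Hypothetical record layout, assumed for illustration only.
    path: str
    risk: int          # risk score, assuming all discovered matches are correct
    confidence: str    # "high", "medium", "low", or "undetermined"


def filter_by_confidence(findings, levels=("high",)):
    """Keep findings whose confidence is in `levels`, highest risk first."""
    wanted = set(levels)
    kept = [f for f in findings if f.confidence in wanted]
    return sorted(kept, key=lambda f: f.risk, reverse=True)


findings = [
    Finding("/Shared/HR/payroll.xlsx", 9, "high"),
    Finding("/Shared/Marketing/leads.csv", 7, "low"),
    Finding("/Shared/Finance/cards.txt", 8, "high"),
    Finding("/Shared/Archive/old.zip", 6, "undetermined"),
]

for f in filter_by_confidence(findings):
    print(f.path, f.risk, f.confidence)
```

Filtering on confidence before sorting by risk puts the items least likely to be false positives at the top of the review queue.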