
MITRE Evaluation & its Evolution
Prateek Bhajanka, Tanaa Chauhan, Devesh Taneja
July 26, 2025 · 4 min read
Imagine you’re shopping for a laptop. Dozens of brands claim to have the best features. But how do you know which laptop works as advertised, and which one is best for you? Do you rely on marketing claims, or do you look for unbiased, independent tests and certifications to help you decide?
MITRE Evaluations serve exactly this purpose in cybersecurity. They bridge the gap between security solution providers and their customers’ expectations by showing how each platform fares in a realistic lab environment and how well it defends against adversarial behaviours.
MITRE runs the tests through a transparent process, and the results are publicly available to everyone, with no paywall or login page. MITRE acts as an independent body: it gives customers data to assess security solutions more effectively, and it gives solution providers feedback on the efficacy of their products.
Limitations and Evolution
The first MITRE Enterprise Evaluation was conducted in 2018. Since then, it has been considered the de facto standard for independent, third-party testing when assessing EDR/XDR products. Like any other test, the MITRE Enterprise Evaluation isn't perfect. It has had its fair share of limitations:
- Consoles were operated by the vendors, which left room for overinterpretation and could bias the data.
- The test environment was noise-free: the only activity taking place was the attacker's.
- No information about false positives was reported.
- Protection measures were not reported.
However, MITRE continuously evolves its testing methodology to bring it as close as possible to a real-world production environment and to real-life attacks.
Important Points to Note:
There are a few more points to keep in mind when reading MITRE Enterprise Evaluation results:
- The results (numbers) from one year shouldn’t be compared with those from another, because the factors behind the scores change almost every year.
- Detection types such as Analytic, Telemetry, Tactic, Technique, IOC and MSSP are not equivalent to one another.
- The EDR/XDR platforms under evaluation may be configured in aggressive modes that aren’t suitable for a production environment.
- The evaluation doesn’t yet assess investigation and response capabilities.
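To make the point about detection types concrete, here is a minimal Python sketch of how a reader might break one vendor's results down by detection type instead of quoting a single headline number. The record layout (`Detection`), the `summarize` helper, the grouping of General, Tactic and Technique as "analytic", and the sample data are all assumptions made for illustration; they do not reflect MITRE's published result format.

```python
from collections import Counter
from dataclasses import dataclass

# Assumption: these three types are treated as "analytic" detections,
# while Telemetry is raw data an analyst still has to interpret.
ANALYTIC_TYPES = {"General", "Tactic", "Technique"}


@dataclass(frozen=True)
class Detection:
    """One evaluated sub-step for one vendor (hypothetical layout)."""
    substep: str
    detection_type: str      # e.g. "None", "Telemetry", "Technique"
    modifiers: tuple = ()    # e.g. ("Configuration Change",)


def summarize(detections):
    """Return counts per detection type plus two coverage ratios."""
    total = len(detections)
    by_type = Counter(d.detection_type for d in detections)
    analytic = [d for d in detections if d.detection_type in ANALYTIC_TYPES]
    # Analytic detections that needed no modifier such as a config change.
    unmodified = [d for d in analytic if not d.modifiers]
    return {
        "by_type": dict(by_type),
        "analytic_coverage": len(analytic) / total if total else 0.0,
        "unmodified_analytic_coverage": len(unmodified) / total if total else 0.0,
    }


# Made-up sample data, for illustration only.
sample = [
    Detection("1.A.1", "Technique"),
    Detection("1.A.2", "Telemetry"),
    Detection("1.A.3", "Technique", ("Configuration Change",)),
    Detection("1.A.4", "None"),
]
report = summarize(sample)
print(report)
```

Two vendors with the same total number of detections can look very different once the counts are split this way, which is why a single score is a poor basis for comparison.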
The table below traces the evolution of the MITRE Enterprise Evaluation since its inception in 2018 and rates each round's relevance to field environments. A leading + marks an item added and a leading - an item removed relative to the previous round.
MITRE Enterprise Evaluation and its Evolution
| Test Name | APT-3 (2018) (Cobalt Strike, PowerShell Empire) | APT-29 (2020) | Carbanak + FIN7 (2021) | Wizard Spider + Sandworm (2022) | Turla (2023) (Carbon, Snake) | Enterprise 2024 (DPRK, CLOP, LockBit) | Enterprise 2025 |
|---|---|---|---|---|---|---|---|
| Main Detection Types | None, Telemetry, IOC, Enrichment, General Behaviour, Specific Behaviour | None, Telemetry, +General, +Tactic, +Technique, -IOC, -Enrichment, -General Behaviour, -Specific Behaviour | None, Telemetry, General, Tactic, Technique, +Not Applicable, -MSSP | None, Telemetry, General, Tactic, Technique, Not Applicable | Telemetry, General, Tactic, Technique, Not Applicable | None, Telemetry, General, Tactic, Technique, Not Applicable | None, General, Tactic, Technique, Not Applicable, -Telemetry |
| Modifier Detection Types | Delayed, Tainted, Configuration Change | Delayed, Configuration Change, +Alert, +Correlated, +Host Interrogation, +Residual Artifact, +Innovative, -Tainted | Delayed, Configuration Change, -Alert, -Correlated, -Host Interrogation, -Residual Artifact, -Innovative | Delayed, Configuration Change | Delayed, Configuration Change | Delayed, Configuration Change | Not revealed yet |
| Operating System Tested | Windows | Windows (RDP Enabled) | Windows (RDP Enabled), +CentOS (Linux with SSH Enabled) | Windows, CentOS (Linux) | Windows, +Ubuntu (Linux), -CentOS (Linux) | Windows, Ubuntu (Linux), +MacOS (Apple Silicon) | Windows, Ubuntu (Linux), +Cloud-hosted Systems, -MacOS (Apple Silicon) |
| Protections | Not Available | Not Available | Enabled (Categories-None/Blocked) (Modifiers-User Consent) | Enabled (Categories-None/Blocked) (Modifiers-User Consent) | Enabled (Categories-None/Blocked) (Modifiers-User Consent) | Enabled (Categories-None/Blocked) -(Modifiers-User Consent) | Enabled |
| Volume | Not Available | Not Available | Not Available | Not Available | Not Available | Not Available | Enabled |
| False Positives | Not Available | Not Available | Not Available | Not Available | Not Available | Enabled (named as Noise) (833 Reported) | Enabled |
| Console Operated | Vendor | Vendor | Vendor | Vendor | Vendor | MITRE OPERATOR | MITRE OPERATOR |
| No. of Techniques Tested | 51 | 54 | 46 | 48 | 55 | 42 | 54 |
| No. of Vendors | 12 | 21 | 17 | 30 | 29 | 19 | 19, -Microsoft, -SentinelOne, -Palo Alto |
| Field Relevance Rating | ★☆☆☆☆ | ★☆☆☆☆ | ★★☆☆☆ | ★★★☆☆ | ★★★☆☆ | ★★★★☆ | ★★★★★ |
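Reading the + and - markers in the table as additions and removals relative to the previous round, the same changes can be derived mechanically by comparing two rounds. The sketch below does this for the main detection types of the first two rounds, using the values from the table; the `ROUNDS` dictionary and the `diff_rounds` helper are illustrative names, not part of any MITRE tooling.

```python
# Main detection types per round, copied from the table above.
ROUNDS = {
    "APT-3 (2018)": ["None", "Telemetry", "IOC", "Enrichment",
                     "General Behaviour", "Specific Behaviour"],
    "APT-29 (2020)": ["None", "Telemetry", "General", "Tactic", "Technique"],
}


def diff_rounds(previous, current):
    """Return (added, removed) categories, preserving listing order."""
    added = [c for c in current if c not in previous]
    removed = [c for c in previous if c not in current]
    return added, removed


added, removed = diff_rounds(ROUNDS["APT-3 (2018)"], ROUNDS["APT-29 (2020)"])
print("added:  ", added)
print("removed:", removed)
```

Only two of the original six categories survive into the second round, which illustrates why scores from different years are measured on different scales.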
In subsequent years, MITRE refined its approach:
- Between the 2018 and 2023 rounds, the number of participating vendors grew from 12 to around 30, giving a broader view of the market, while the number of techniques tested stayed in the range of 46 to 55.
- Since 2021, the evaluation has included protection tests, although these remain limited. The same year also saw a reduction in the number of modifier detection types.
- Later rounds added noise and reporting on false positives, which allows deeper analysis and produces more meaningful results.
Conclusion
The MITRE Evaluation has become a major part of the cybersecurity industry because it provides detailed information on many security platforms, from the techniques they detect to the protection they provide.
Over the years, it has evolved to meet current needs, for example by reporting false positives, as discussed above. In the 2025 evaluation, just as the field relevance rating reaches five stars, major EDR/XDR vendors such as Microsoft, SentinelOne and Palo Alto Networks have stepped back from the MITRE Evaluation, citing internal reasons and a focus on product roadmap and execution. This is not good for the industry overall, as it reduces the transparency and confidence that the industry needs at this point. Moreover, as market leaders, Microsoft, Palo Alto Networks and SentinelOne have a responsibility towards the industry, and the decision to step away shouldn’t be taken in isolation.
