Introduction to Predictive Risk Scoring

This document is for:

Invicti Enterprise On-Demand, Invicti Enterprise On-Premises

This document provides an overview and answers frequently asked questions about how Predictive Risk Scoring in Invicti Enterprise works and why you should use it.

What is Predictive Risk Scoring?

Predictive Risk Scoring augments the application scanning process by helping you prioritize your web assets prior to scanning. It uses AI to calculate risk scores compiled from up to 220 data points that predict the highest severity vulnerability of each discovered website with a minimum 83% confidence level. The assigned risk scores then give you the means to rank your sites and gauge the overall potential risk of your web assets before you scan them. Using this information, you can focus on scanning and fixing your riskiest sites first to make your web assets and organization safer.

How does Predictive Risk Scoring work?

Predictive Risk Scoring uses AI that is developed, maintained, and trained in-house by the Invicti AI Engineering team. A machine learning model predicts risk scores based on the findings of the Discovery Service in Invicti Enterprise. Predictive Risk Scoring visits the web assets that the Discovery Service finds (without scanning them) and uses their publicly available attributes to calculate risk score predictions.

The risk score model was trained by scanning a large number (150,000+) of websites that are part of bug bounty programs and/or Vulnerability Disclosure Programs (VDPs). From each of these websites, we computed 220 data points that are correlated with the security posture of the website.

A few examples of the 220 data points that are used:

Support for deprecated TLS versions like TLS v1.0
The copyright year of the website (older websites tend to be more vulnerable)
Number of form inputs
Number of XHR requests
Number of cookies not marked as HttpOnly/Secure
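
To make these data points more concrete, the following minimal Python sketch shows how a few such attributes could be derived from publicly visible information. It is an illustration only and not Invicti's implementation: the function name extract_features, the feature names, the regular expression, and the choice to count a cookie when it lacks either flag are all assumptions made for this example. Counting XHR requests would require rendering the page in a browser, so it is left out.

```python
import re
from html.parser import HTMLParser

# TLS 1.0 and 1.1 are formally deprecated (RFC 8996); SSLv3 is older still.
DEPRECATED_TLS = {"SSLv3", "TLSv1.0", "TLSv1.1"}


class FormInputCounter(HTMLParser):
    """Counts form input elements (<input>, <textarea>, <select>) in a page."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("input", "textarea", "select"):
            self.count += 1


def cookie_missing_flags(set_cookie_header):
    """True if a Set-Cookie header lacks HttpOnly or Secure (assumed rule)."""
    attributes = {
        part.strip().lower().split("=", 1)[0]
        for part in set_cookie_header.split(";")[1:]
    }
    return not {"httponly", "secure"} <= attributes


def extract_features(html, set_cookie_headers, supported_tls_versions):
    """Hypothetical feature extraction from one homepage response."""
    counter = FormInputCounter()
    counter.feed(html)

    # Look for a four-digit year shortly after a copyright marker.
    match = re.search(r"(?:©|&copy;|copyright)\D{0,20}((?:19|20)\d{2})", html, re.I)

    return {
        "supports_deprecated_tls": bool(DEPRECATED_TLS & set(supported_tls_versions)),
        "copyright_year": int(match.group(1)) if match else None,
        "form_inputs": counter.count,
        "cookies_missing_flags": sum(
            cookie_missing_flags(header) for header in set_cookie_headers
        ),
    }
```

A model would receive many such values per site as input. The real set of 220 data points and how they are weighted are not described in this document.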

"With Predictive Risk Scoring, we don’t use an LLM and also don’t send any requests to an external AI service provider. Our machine-learning model is explainable and deterministic. It is also not trained on any customer data. Because it doesn’t process any natural language instructions like an LLM, there is no risk of prompt injections and similar attacks."
- Bogdan Calin, Invicti's Principal Security Researcher and the main creator of Predictive Risk Scoring.

For a deeper dive into the technical side of Predictive Risk Scoring, read our blog post Why Predictive Risk Scoring is the smart way to do AI application security.

What is a Risk Score?

The risk score indicates how likely a website is to have vulnerabilities that make it susceptible to attacks. Invicti Enterprise categorizes risk scores into critical, high, medium, and low risk. If a website has a critical risk score, this means the site is predicted to have at least one critical vulnerability and, therefore, should be treated with the highest priority for scanning to determine its vulnerabilities.

What is the difference between Predictive Risk Scoring and Scanning?

Predictive Risk Scoring is not a substitute for scanning your web assets for vulnerabilities. Even sites with a medium or low risk score are likely to have vulnerabilities and could have critical vulnerabilities that the model did not predict. The risk score gives you insight into the likely vulnerability of your web assets so that you can make an informed decision about which sites to scan immediately and which sites can be scanned next. Because a prediction is not as thorough as a scan, you still need to scan your sites to find their actual vulnerabilities.

Why should I use Predictive Risk Scoring?

By prioritizing your web assets based on their risk score, you can create scan targets from the riskiest sites first. This allows you to use your Invicti license effectively. For example, if you have 5,000 results on the Discovered Websites page and 500 targets available on your Invicti license, you can use the risk score to decide which sites to scan first and to determine how many more targets you need to order for your license.
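
The arithmetic behind this example can be sketched in a few lines of Python. This is a hypothetical illustration, not an Invicti API: the helper names plan_scans and extra_targets_needed, the record layout, and the distribution of risk levels across the 5,000 sites are invented for the example. Only the 5,000 discovered sites and the 500 available targets come from the scenario above.

```python
from collections import Counter

# Ranking order for risk levels. Sorting "Undetermined" last is an assumption.
RISK_ORDER = {"Critical": 0, "High": 1, "Medium": 2, "Low": 3, "Undetermined": 4}


def plan_scans(discovered, available_targets):
    """Rank sites by risk level and split them into (scan now, scan later)."""
    ranked = sorted(discovered, key=lambda site: RISK_ORDER[site["risk"]])
    return ranked[:available_targets], ranked[available_targets:]


def extra_targets_needed(discovered, available_targets, levels=("Critical", "High")):
    """Targets to add so that every site at the given levels can be scanned."""
    wanted = sum(1 for site in discovered if site["risk"] in levels)
    return max(0, wanted - available_targets)


# Invented distribution of risk levels across 5,000 discovered sites.
counts = {"Critical": 200, "High": 700, "Medium": 1600, "Low": 2500}
discovered = [
    {"url": f"https://site-{level.lower()}-{i}.example", "risk": level}
    for level, n in counts.items()
    for i in range(n)
]

scan_now, scan_later = plan_scans(discovered, available_targets=500)
print(Counter(site["risk"] for site in scan_now))  # 200 Critical and 300 High
print(extra_targets_needed(discovered, 500))       # 400
```

With these invented numbers, the 500 licensed targets cover every critical site and 300 of the 700 high-risk sites, so 400 more targets would be needed to scan all critical and high-risk sites.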

How do I use Predictive Risk Scoring?

Predictive Risk Scoring runs in the background as part of the Discovery Service. Risk scores are displayed on the Discovered Websites page for each of your discovered web assets. Once a risk score is calculated for a discovered web asset, the list of URLs can be ordered or filtered by the predicted level of risk, allowing you to easily determine which sites to scan immediately and which sites can be scanned next.

For more information, refer to Utilizing Predictive Risk Scoring.

Is a risk score provided for every discovered asset?

Every reachable discovered asset will receive a risk score. A discovered asset will not be reachable if the site is on an internal network, because the Discovery Service, by definition, looks for external-facing assets only. Additionally, some external-facing assets might not provide enough of the information that our model requires to calculate a predictive risk score confidently, and some assets might have tools that block our service from accessing that information. In each of these cases, an Undetermined label is displayed in lieu of a risk score.
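
The cases above amount to a simple decision rule, sketched below in Python. The ProbeResult type, its field names, and the displayed_label function are hypothetical and exist only to illustrate the rule. How the service actually detects each condition is not described in this document.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProbeResult:
    """Hypothetical summary of one visit to a discovered asset."""
    reachable: bool      # False for assets that could not be reached
    blocked: bool        # True if a tool on the asset blocked access to the data
    enough_data: bool    # True if enough information was collected to predict
    predicted_level: Optional[str] = None  # "Critical", "High", "Medium", or "Low"


def displayed_label(result):
    """Return the risk level to display, or "Undetermined" when none applies."""
    if not result.reachable or result.blocked or not result.enough_data:
        return "Undetermined"
    return result.predicted_level or "Undetermined"


print(displayed_label(ProbeResult(True, False, True, "High")))  # High
print(displayed_label(ProbeResult(True, True, False)))          # Undetermined
```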

Is a risk score provided for every target?

No. The Predictive Risk Scoring feature is only available for discovered assets.
If you create a target from a discovered web asset and then scan the target, the latest scan information will be displayed on the Discovery page in lieu of the predictive risk score.

Can I disable Predictive Risk Scoring?

Yes, you can disable the Predictive Risk Scoring feature within the Application and Service Discovery Settings page.

To disable Predictive Risk Scoring:

Go to Discovery > Settings.
Select the Risk Scoring tab.
Uncheck the box next to Enable Risk Scoring.
Click Save & Recrawl to confirm the change to your settings.

Do the Predictive Risk Scoring results refresh?

The Discovery Service automatically refreshes daily with newly discovered web assets. When a new web asset is discovered, a risk score is automatically provided (assuming the web asset is reachable).