Reducing Noise in Crowdsourced Security

More people → more coverage → more vulnerabilities. While the crowdsourced model provides a plethora of benefits, it’s tough to deny one of the core reasons many choose it. A larger pool of pay-per-finding security researchers is likely to find more high-impact vulnerabilities than any other testing method. But there’s a catch. More vulnerabilities → more false positives → more time wasted if not properly managed.

In this blog we’ll look at why reduction of false positives is crucial for a strong crowdsourced security program, how signal-to-noise ratio is calculated, and what factors make or break the number of valid vulnerabilities customers see.

Defining Signal-to-Noise in Crowdsourced Security

One of the most frequent objections to crowdsourced security is fear of being overwhelmed by too many submissions. While some organizations fear they lack the resources to appropriately manage a sudden influx of vulnerabilities, it’s the invalid ones that are of greater concern. Too many false positives waste company resources, while also dangerously obscuring critical insights that need immediate attention. That’s why crowdsourced security platforms like Bugcrowd provide triage services to reduce the noise that customers would otherwise have to process themselves. The effectiveness of these services is expressed as the “signal-to-noise ratio.”

At this point it’s important to clarify the difference between signal-to-noise and the ratio of valid to invalid vulnerabilities. The easiest way to think about each is to consider how they might be changed. The ratio of valid to invalid vulnerabilities is purely a reflection of the skill of researchers across all programs, public and private. The signal-to-noise ratio is a reflection of the skill of the triage team in weeding out the noise that a customer would otherwise see. It’s also important to note the difference between reproducible and in-scope vulnerabilities, versus those that customers may later deem not-applicable or something that they won’t fix. All measures are critical to overall program success. Some additional definitions below:

Triage-Dependent Metrics:

True Positives (Signal): Triage team and the customer are aligned that the submission is in scope, is not a duplicate, and can be replicated. 96% of submissions across Bugcrowd public and private programs are valid according to this definition, with 97% valid in private (invite-only) programs alone. If the customer finds the submission to be non-applicable to their program, or is something they won’t fix, this number may fall.

False Positives (Noise): Triage team believes a submission is valid as it is in scope and can be replicated, but customer disagrees. 3-6% of submissions fit this definition.

False Negatives (Noise): Triage team believes a submission to be invalid, but the customer disagrees. 0% of submissions triaged by Bugcrowd fit this category. Note: While we have not run into this occurrence to a level of statistical significance, there is a possibility that some submissions could be miscategorized. Customers can opt to review submissions that were marked invalid in order to prevent such a situation.
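
As a rough illustration of how the triage-dependent metrics above could be tallied, the following Python sketch counts each category and computes the share of signal among submissions that triage passed on to the customer. The article does not give an exact formula, so the function name triage_metrics, the pair-of-booleans input, and the choice of denominator are all assumptions made for this example, not Bugcrowd's actual calculation.

```python
def triage_metrics(outcomes):
    """Tally triage outcomes (hypothetical helper, not a Bugcrowd API).

    outcomes: iterable of (triage_valid, customer_valid) boolean pairs,
    one pair per submission.
    """
    counts = {
        "true_positive": 0,   # triage and customer both say valid
        "false_positive": 0,  # triage says valid, customer disagrees
        "false_negative": 0,  # triage says invalid, customer disagrees
        "true_negative": 0,   # both say invalid (noise filtered out)
    }
    for triage_valid, customer_valid in outcomes:
        if triage_valid and customer_valid:
            counts["true_positive"] += 1
        elif triage_valid:
            counts["false_positive"] += 1
        elif customer_valid:
            counts["false_negative"] += 1
        else:
            counts["true_negative"] += 1

    # Assumption: signal is measured only over what the customer sees,
    # i.e. submissions that triage marked as valid.
    forwarded = counts["true_positive"] + counts["false_positive"]
    signal_share = counts["true_positive"] / forwarded if forwarded else None

    return {**counts, "forwarded": forwarded, "signal_share": signal_share}


if __name__ == "__main__":
    # Made-up sample data, chosen only to exercise the function.
    sample = [(True, True)] * 96 + [(True, False)] * 4 + [(False, False)] * 10
    print(triage_metrics(sample))
```

Under this reading, improving triage raises signal_share without changing how many valid submissions the crowd produces, which is the distinction drawn between signal-to-noise and the valid-to-invalid ratio.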

Crowd-Dependent Metrics:

Total Submissions: 100% of submissions to the platform, including those to be later deemed valid or invalid.

Valid Submissions: Submissions that are in-scope and can be reproduced. 70% of submissions received by Bugcrowd in 2020 fit this category.

Invalid Submissions: Submissions that are out of scope, and/or cannot be reproduced. 30% of submissions received by Bugcrowd in 2020 fit this category.

Customer-Dependent Metrics:

Valid But Won’t Fix: These submissions are technically valid insofar as they are in scope, and can be replicated, but the customer has indicated that they will not need to fix the vulnerability, such as might be the case for an asset that should be deprecated and removed. 19% of submissions to public and private programs fall into this category, which falls to 12% when looking just at private (invite-only) programs.

Valid But Not Applicable: These submissions are technically valid insofar as they are in scope, and can be replicated, but the customer has indicated that the vulnerability is not applicable to their core focus. 26% of submissions to public and private programs fall into this category, which falls to 21% when looking just at private (invite-only) programs.

Improving Rate of Valid Vulnerabilities

Matching the Right Resources

Having a larger pool of resources is the crux of every crowdsourced model. But bigger isn’t always better. Consider the advice a well-meaning parent probably offered after your first breakup: “There’s plenty of fish in the sea!” And consider your likely response: “Right, but doesn’t that make it harder to find the right one?” Without the ability to match resources to the environment that best suits their skills and interests, more people means more noise. And the greater the size of the crowd, the harder proper pairing becomes.

That’s why Bugcrowd built CrowdMatch, with the ambition to preserve the magic of a good match regardless of the size of the search. No matter how many individuals join the Crowd (now around 200,000), CrowdMatch helps ensure every researcher’s unique skills, experience, interest, and performance are considered in all program recommendations. The result is not only a high rate of valid vulnerabilities, but also the fastest time to value, with an average of just 4 days to first critical vulnerability within on-demand bug bounty programs.

Improving Signal-to-Noise

The Triage Team:

Well-trained, highly skilled, and highly communicative triagers are paramount to reducing noise for customers who need to focus on what matters most. Bugcrowd’s triage team is the largest in-house team of its kind, comprising veteran pen-testers and security researchers selected for their outstanding performance and professionalism on the Bugcrowd platform. Prior to digging in, they complete 1:1 training, starting slowly by triaging only subsets of vulnerability classes, until they master their craft and can graduate to more complex categories. But success in this role is just as dependent on soft skills as it is on triage training. Bugcrowd’s triage team is the virtual face of the organization, working closely with researchers and customers to answer queries, provide context, communicate outcomes, and ultimately promote rapid remediation of critical vulnerabilities.

Platform-Assisted Workflows:

Even the world’s most advanced triage team needs technology to scale. This was never more apparent than during the early days of the coronavirus pandemic. While stay-at-home and social distancing orders impaired in-person testing services, the opposite was true of remote-friendly options like crowdsourced security. During just one week in April, Bugcrowd saw 6,000+ submissions, netting out to just under 5,000 valid vulnerabilities. Despite these volumes, we maintained our average response times (<12 hours to first touch for P1 vulnerabilities and <1 business day for all else), as well as our average signal-to-noise ratio. Our team didn’t grow during this time, so how did we do it?

The Bugcrowd platform was designed to support exponential growth in our community and customer base, and subsequently, the number of simultaneous submissions. With hundreds of submissions per day, programmatic workflows are imperative for continued success. Our platform provides automated deduplication, prioritized queues, and smart sequences for ensuring highly efficient triage processes. This enables us to elevate the most critical vulnerabilities without faltering on our commitment to reduce all other noise.
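
To make the idea of automated deduplication feeding a prioritized queue more concrete, here is a minimal Python sketch. It is not a description of the Bugcrowd platform's internals: the TriageQueue class, the fingerprint built from target and vulnerability class, and the use of a numeric priority (1 for P1, 2 for P2, and so on) are all hypothetical choices made for illustration.

```python
import heapq
from itertools import count


class TriageQueue:
    """Toy deduplicating priority queue (hypothetical, for illustration only)."""

    def __init__(self):
        self._heap = []          # entries: (priority, arrival_order, target, vuln_class)
        self._seen = set()       # fingerprints of submissions already accepted
        self._arrival = count()  # tie-breaker so equal priorities stay first-in, first-out

    def submit(self, target, vuln_class, priority):
        """Queue a submission; return False if it duplicates an earlier one."""
        # Assumption: two reports are duplicates when they name the same
        # target and vulnerability class, ignoring case and whitespace.
        fingerprint = (target.strip().lower(), vuln_class.strip().lower())
        if fingerprint in self._seen:
            return False
        self._seen.add(fingerprint)
        heapq.heappush(
            self._heap, (priority, next(self._arrival), target, vuln_class)
        )
        return True

    def next_submission(self):
        """Return the most urgent queued submission, or None if the queue is empty."""
        if not self._heap:
            return None
        priority, _, target, vuln_class = heapq.heappop(self._heap)
        return priority, target, vuln_class


if __name__ == "__main__":
    queue = TriageQueue()
    queue.submit("app.example.com/login", "XSS", 3)
    queue.submit("api.example.com/export", "SQL injection", 1)
    queue.submit("APP.example.com/login ", "xss", 3)  # rejected as a duplicate
    print(queue.next_submission())  # the priority-1 report is handled first
```

Real duplicate detection is considerably harder than an exact fingerprint match, but the sketch shows why the two steps pair well: duplicates never reach the queue, and the most critical findings are always reviewed first.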

What’s Next?

Bugcrowd is continually striving to improve our researcher matching, enhance security workflows, and grow our in-house expertise. These objectives have helped us to:

Grow our crowd: The Crowd swelled by 39% in the last year.
Reduce the time to value: Average time to first P1 or P2 is 50% faster than this time last year.
Increase the number of P1 and P2 findings: The number of critical and high priority findings increased by 75% YoY.

With a north star that’s driven by quality over quantity, Bugcrowd has been able to raise the bar for both. For more information on any of our crowdsourced programs, our platform technology, or the researchers that make us who we are, check out our website, or start building your program today!
