Before you start bug hunting on a new program, you need to feed the right assets to the right tools for automated recon. Sorting through the scope and getting your environment set up is a tedious (and delicate) process.
No one should want to do this manually, especially since manual sorting can lead to mistakes. And you don't want to make mistakes when it comes to staying in scope!
So in this post, I’ll show you how to script this process with qsv.
Note: HackerOne is the only bug bounty platform that provides scope as a CSV (that I know of). While these examples are HackerOne-specific, the parsing techniques are broadly useful anytime you’re working with structured data.
How I organize recon
When I start hacking on a web application, I separate their assets into 3 main text files:
domains.txt – In-scope domains
wildcards.txt – Domains that support wildcards (ex: *.example.com)
urls.txt – In-scope URLs (including those resolved from the first two files)
Pretty simple stuff. Sometimes I will create more files if I want more specificity (ex: apis.txt).
This setup makes it easy to pass information to tools and chain them together for automated recon. More on that in a future post!
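To make that end state concrete, here's a tiny hypothetical snapshot of those files. The entries are illustrative only (borrowed from HubSpot's public scope); in practice they're generated from the program's scope:

```shell
# Hypothetical end state: one asset per line in each file (entries are illustrative)
printf '%s\n' 'chatspot.ai' 'trust.hubspot.com' > domains.txt
printf '%s\n' 'hubspotemail.net' 'hubspotpagebuilder.eu' > wildcards.txt
printf '%s\n' 'https://trust.hubspot.com' > urls.txt

# Each file is plain text, ready to pipe straight into recon tools
cat wildcards.txt
```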
This is the end goal for my environment setup. I’ll show you how to make this happen with qsv.
Using qsv
1. Viewing headers
The first thing I do when parsing a CSV is to look at the headers. This is essentially the CSV's schema.
As an example, I’ll use Hubspot’s program scope.
$ qsv headers hubspot_scope.csv
1 identifier
2 asset_type
3 instruction
4 eligible_for_bounty
5 eligible_for_submission
6 availability_requirement
7 confidentiality_requirement
8 integrity_requirement
9 max_severity
10 system_tags
11 created_at
12 updated_at

Using headers, I can see the column number next to each header.
2. Selecting relevant columns
In this case, the most important columns for me are 1, 2, and 5.
I can grab them with the select command:
$ qsv select 1,2,5 hubspot_scope.csv | qsv table -w 1 -p 1 -c 20
identifier asset_type eligible_for_submiss...
api*.hubapi.com WILDCARD true
*.hubspotemail.net WILDCARD true
events.hubspot.com URL false
*.hubspotpagebuilder... WILDCARD true
chatspot.ai URL true
api*.hubspot.com WILDCARD true
HubSpot Sales Office... OTHER true
thespot.hubspot.com URL false
*.hubspotpagebuilder... WILDCARD true
connect.com URL false
HubSpot Android Mobi... GOOGLE_PLAY_APP_ID true
ir.hubspot.com URL false
*.hs-sites(-eu1)?.co... WILDCARD true
trust.hubspot.com URL false
Customer Portal OTHER true
Customer Connected D... OTHER true
app*.hubspot.com WILDCARD true
shop.hubspot.com URL false
HubSpot iOS Mobile A... APPLE_STORE_APP_ID true
Other HubSpot-owned ... OTHER true
I also used the table command to format the output (with some extra arguments to make the data fit on smaller screen sizes). For information on a specific command, you can use qsv <command> --help.
As you can see, qsv accepts input from stdin which makes it pipeable! Another reason to love it.
3. Filtering
Obviously, I only want to pass in-scope assets to automated tools. And not all assets are created equal.
This is where filtering comes into play.
For example, only wildcard domains go into wildcards.txt. To grab only in-scope wildcards, I can use the search command:
$ qsv search -s 5 true hubspot_scope.csv | qsv search -s 2 WILDCARD | qsv select 1
identifier
api*.hubapi.com
*.hubspotemail.net
*.hubspotpagebuilder.eu
api*.hubspot.com
*.hubspotpagebuilder.com
*.hs-sites(-eu1)?.com
app*.hubspot.com

Great, now I have what I want. But it's not ready to be passed to a tool, which is where the next step comes in.
4. Processing
For wildcards.txt, I'm only after wildcards that begin with *. (and I don't want the *. prefix to end up in the file).
Doing that is easy enough with built-in tools:
$ qsv search -s 5 true hubspot_scope.csv | qsv search -s 2 WILDCARD | qsv slice -n -s 1 | qsv select 1 | grep '^\*\.' | grep -v \( | sed 's/\*\.//'
hubspotemail.net
hubspotpagebuilder.eu
hubspotpagebuilder.com

I know it looks scary, but I'm just using slice to get rid of the header row, grep to keep wildcards with a leading *. (for gathering subdomains) while dropping regex-style entries, and sed to chop off the wildcard prefix (because subfind3r doesn't process it).
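If the one-liner is hard to follow, the grep/sed stage can be tried in isolation on a few sample identifiers (values borrowed from the earlier output):

```shell
# Keep entries starting with a literal "*.", drop regex-style ones, strip the prefix
printf '%s\n' 'api*.hubapi.com' '*.hubspotemail.net' '*.hs-sites(-eu1)?.com' \
  | grep '^\*\.' | grep -v '(' | sed 's/^\*\.//'
# → hubspotemail.net
```

api*.hubapi.com is dropped because the wildcard isn't a leading *., and the hs-sites entry is dropped because its parenthesis marks it as a regex-style identifier.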
You can see that this process can get complicated fast. Which is where the final step comes in!
5. Scripting
It can be a pain to memorize these commands and one-liners are messy. Luckily, qsv is highly scriptable.
For example, I can get wildcards and domains easily with this script:
#!/bin/bash
get_asset() {
qsv search -s 5 true hubspot_scope.csv | qsv search -s 2 "$1" | qsv slice -n -s 1 | qsv select 1
}
parse_sub() {
grep '^\*\.' | grep -v \( | sed 's/\*\.//'
}
parse_domain() {
grep -Ev '^https?://'
}
get_asset URL | parse_domain > domains.txt
get_asset WILDCARD | parse_sub > wildcards.txt

Now, any time the scope changes, I can just run this script and re-run my tools instead of copy-pasting.
It’s also best practice to make this more general so it can work with other programs. With a strong knowledge of qsv, you can tailor your script to your needs for more efficient and safer recon.
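As one sketch of that generalization, the script might take the scope file as an argument instead of hardcoding it. The column positions here are an assumption carried over from HubSpot's export; verify them with qsv headers before pointing this at another program:

```shell
#!/bin/bash
# Generalized sketch: pass the program's scope CSV as the first argument.
# Assumes columns 1 (identifier), 2 (asset_type), 5 (eligible_for_submission).
SCOPE_CSV="${1:-scope.csv}"

get_asset() {  # $1 = asset type to keep (URL, WILDCARD, ...)
  qsv search -s 5 true "$SCOPE_CSV" | qsv search -s 2 "$1" \
    | qsv slice -n -s 1 | qsv select 1
}

parse_sub()    { grep '^\*\.' | grep -v '(' | sed 's/^\*\.//'; }
parse_domain() { grep -Ev '^https?://'; }

get_asset URL      | parse_domain > domains.txt
get_asset WILDCARD | parse_sub    > wildcards.txt
```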
Conclusion
Setting up an environment for automated recon and staying in scope can be a painful task. But, qsv can take most of that pain away.
Hopefully you got enough of a taste of qsv to use it confidently in your bug bounty endeavors.
Thanks for reading and happy hunting!
