Before you start bug hunting on a new program, you need to feed the right assets to the right tools for automated recon. Sorting through the scope and getting your environment set up is a tedious (and delicate) process.
No one should want to do this manually, especially since manual sorting can lead to mistakes. And you don't want to make mistakes when it comes to staying in scope!
So in this post, I’ll show you how to script this process with qsv.
Note: HackerOne is the only bug bounty platform that provides scope as a CSV (that I know of). While these examples are HackerOne-specific, the parsing techniques are broadly useful anytime you’re working with structured data.
How I organize recon
When I start hacking on a web application, I separate their assets into 3 main text files:
domains.txt – In-scope domains
wildcards.txt – Domains that support wildcards (ex: *.example.com)
urls.txt – In-scope URLs (including those resolved from the first two files)
Pretty simple stuff. Sometimes I will create more files if I want more specificity (ex: apis.txt).
This setup makes it easy to pass information to tools and chain them together for automated recon. More on that in a future post!
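To make that end state concrete, here's a tiny hypothetical snapshot of those files. The entries are illustrative only (borrowed from HubSpot's public scope); in practice they're generated from the program's scope:

```shell
# Hypothetical end state: one asset per line in each file (entries are illustrative)
printf '%s\n' 'chatspot.ai' 'trust.hubspot.com' > domains.txt
printf '%s\n' 'hubspotemail.net' 'hubspotpagebuilder.eu' > wildcards.txt
printf '%s\n' 'https://trust.hubspot.com' > urls.txt

# Each file is plain text, ready to pipe straight into recon tools
cat wildcards.txt
```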
This is the end goal for my environment setup. I’ll show you how to make this happen with qsv.
Using qsv
1. Viewing headers
The first thing I do when parsing a CSV is to look at the headers. This is essentially the CSV's schema.
As an example, I’ll use Hubspot’s program scope.
$ qsv headers hubspot_scope.csv
1 identifier
2 asset_type
3 instruction
4 eligible_for_bounty
5 eligible_for_submission
6 availability_requirement
7 confidentiality_requirement
8 integrity_requirement
9 max_severity
10 system_tags
11 created_at
12 updated_at

Using headers, I can see the column number next to each header.
2. Selecting relevant columns
In this case, the most important columns for me are 1, 2, and 5.
I can grab them with the select command:
$ qsv select 1,2,5 hubspot_scope.csv | qsv table -w 1 -p 1 -c 20
identifier asset_type eligible_for_submiss...
api*.hubapi.com WILDCARD true
*.hubspotemail.net WILDCARD true
events.hubspot.com URL false
*.hubspotpagebuilder... WILDCARD true
chatspot.ai URL true
api*.hubspot.com WILDCARD true
HubSpot Sales Office... OTHER true
thespot.hubspot.com URL false
*.hubspotpagebuilder... WILDCARD true
connect.com URL false
HubSpot Android Mobi... GOOGLE_PLAY_APP_ID true
ir.hubspot.com URL false
*.hs-sites(-eu1)?.co... WILDCARD true
trust.hubspot.com URL false
Customer Portal OTHER true
Customer Connected D... OTHER true
app*.hubspot.com WILDCARD true
shop.hubspot.com URL false
HubSpot iOS Mobile A... APPLE_STORE_APP_ID true
Other HubSpot-owned ... OTHER true
I also used the table command to format the output (with some extra arguments to make the data fit on smaller screen sizes). For information on a specific command, you can use qsv <command> --help.
As you can see, qsv accepts input from stdin which makes it pipeable! Another reason to love it.
3. Filtering
Obviously, I only want to pass in-scope assets to automated tools. And not all assets are created equal.
This is where filtering comes into play.
For example, only wildcard domains go into wildcards.txt. To grab only in-scope wildcards, I can use the search command:
$ qsv search -s 5 true hubspot_scope.csv | qsv search -s 2 WILDCARD | qsv select 1
identifier
api*.hubapi.com
*.hubspotemail.net
*.hubspotpagebuilder.eu
api*.hubspot.com
*.hubspotpagebuilder.com
*.hs-sites(-eu1)?.com
app*.hubspot.com

Great, now I have what I want. But it's not ready to be passed to a tool, which is where the next step comes in.
4. Processing
For wildcards.txt, I'm only after wildcards that begin with *. (and I don't want the *. prefix to end up in the file).
Doing that is easy enough with built-in tools:
$ qsv search -s 5 true hubspot_scope.csv | qsv search -s 2 WILDCARD | qsv slice -n -s 1 | qsv select 1 | grep '^\*\.' | grep -v \( | sed 's/\*\.//'
hubspotemail.net
hubspotpagebuilder.eu
hubspotpagebuilder.com

I know it looks scary, but I'm just using slice to get rid of the header row, grep to keep wildcards with a leading *. (for gathering subdomains) while dropping regex-style entries, and sed to chop off the wildcard prefix (because subfind3r doesn't process it).
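If the one-liner is hard to follow, the grep/sed stage can be tried in isolation on a few sample identifiers (values borrowed from the earlier output):

```shell
# Keep entries starting with a literal "*.", drop regex-style ones, strip the prefix
printf '%s\n' 'api*.hubapi.com' '*.hubspotemail.net' '*.hs-sites(-eu1)?.com' \
  | grep '^\*\.' | grep -v '(' | sed 's/^\*\.//'
# → hubspotemail.net
```

api*.hubapi.com is dropped because the wildcard isn't a leading *., and the hs-sites entry is dropped because its parenthesis marks it as a regex-style identifier.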
You can see that this process can get complicated fast. Which is where the final step comes in!
5. Scripting
It can be a pain to memorize these commands and one-liners are messy. Luckily, qsv is highly scriptable.
For example, I can get wildcards and domains easily with this script:
#!/bin/bash
get_asset() {
qsv search -s 5 true hubspot_scope.csv | qsv search -s 2 "$1" | qsv slice -n -s 1 | qsv select 1
}
parse_sub() {
grep '^\*\.' | grep -v \( | sed 's/\*\.//'
}
parse_domain() {
grep -Ev '^https?://'
}
get_asset URL | parse_domain > domains.txt
get_asset WILDCARD | parse_sub > wildcards.txt

Now, any time the scope changes, I can just run this script and re-run my tools instead of copy-pasting.
It’s also best practice to make this more general so it can work with other programs. With a strong knowledge of qsv, you can tailor your script to your needs for more efficient and safer recon.
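As one sketch of that generalization, the script might take the scope file as an argument instead of hardcoding it. The column positions here are an assumption carried over from HubSpot's export; verify them with qsv headers before pointing this at another program:

```shell
#!/bin/bash
# Generalized sketch: pass the program's scope CSV as the first argument.
# Assumes columns 1 (identifier), 2 (asset_type), 5 (eligible_for_submission).
SCOPE_CSV="${1:-scope.csv}"

get_asset() {  # $1 = asset type to keep (URL, WILDCARD, ...)
  qsv search -s 5 true "$SCOPE_CSV" | qsv search -s 2 "$1" \
    | qsv slice -n -s 1 | qsv select 1
}

parse_sub()    { grep '^\*\.' | grep -v '(' | sed 's/^\*\.//'; }
parse_domain() { grep -Ev '^https?://'; }

get_asset URL      | parse_domain > domains.txt
get_asset WILDCARD | parse_sub    > wildcards.txt
```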
Conclusion
Setting up an environment for automated recon and staying in scope can be a painful task. But, qsv can take most of that pain away.
Hopefully you got enough of a taste of qsv to use it confidently in your bug bounty endeavors.
Thanks for reading and happy hunting!
