Joseliyo_Jstnk · 02-06-2025 09:42 AM

Overview

We recently introduced Backscatter, a tool developed by the Mandiant FLARE team that automatically extracts malware configurations. Unlike traditional dynamic analysis, which can be time-consuming and can be circumvented by anti-analysis techniques, Backscatter uses static signatures and emulation to examine malware without executing it.

Trying to find samples related to specific malware families

Historically, there have been several valid approaches to finding samples of a given malware family in VirusTotal Enterprise. The most common ones are covered below, after a short look at how malware families get linked to samples.

Understanding Backscatter process

When a file is uploaded to the platform, it can trigger multiple executions in the sandbox. For example, a ZIP file uploaded to Google Threat Intelligence may contain a VBS script that acts as a downloader for a PE EXE, and that PE EXE can in turn be associated with a malware family.

The logic implemented in the platform therefore links the malware family (Remcos, in this example) to the ZIP, the VBS, and the PE EXE, because during execution all three samples had direct activity involving the PE EXE (the Remcos payload).

Take this ZIP file as an example: the Remcos malware family is linked to it in the details tab. This is because, during execution in the sandbox, the ZIP file was unzipped and the BAT file inside it was executed. Analyzing the BAT file, we can see that it connects to hxxps://paste[.]rs/LH7ZH to retrieve content that is in charge of obtaining the Remcos payload from other resources and loading it into memory, as illustrated in the following figure.

Figure 1: Content used by the BAT file to load the Remcos payload

As a result, the family is linked to both the initial ZIP file and the BAT sample.

Using engines values

One of the most commonly used ways to identify samples of a malware family has been through the use of the engines modifier.

This modifier filters files by the malware family name found in the antivirus/EDR results, no matter which particular engine produced the output. Besides the antivirus results, it can also filter on dynamic analysis sandbox detections. For example, let's say that you want to get samples related to the xworm family. You can use the following query.

engines:"xworm"

Using YARA rules

YARA rules remain a solid way to detect patterns that are common across malware families. Google Threat Intelligence, and historically VirusTotal Enterprise, users can consult the Crowdsourced YARA Hub which stores all YARA rules that can detect a pattern in the samples uploaded to the platform.

For example, the following query uses the yara_rule modifier to find Cobalt Strike samples.

yara_rule:"cobalt"

This returns samples matched by YARA rules containing the word "cobalt".

Using comments

The VirusTotal community also makes use of sample comments that are uploaded to the platform through automated processes or simply added by hand after analyzing a sample. Some users use other sandboxes or private YARA rules to subsequently add a comment about the identified malware family.

For example, the following query returns samples with comments that contain the word "guloader".

comment:"guloader"

Using behaviors

In some cases, a malware family exhibits specific behavior that is easy to detect and attribute to the family in question. For example, the following pattern of file creation at runtime is common in RATs and has been verified in VenomRAT and AsyncRAT samples.

behaviour_files:"\\AppData\\Roaming\\DataLogs\\DataLogs.conf"

This approach can also be followed for other malware family commonalities that might be indexed under the metadata search modifier or some other static/dynamic properties.
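The four modifiers above share the same shape: a modifier name, a colon, and a quoted value. The following is a minimal Python sketch for building such query strings. `build_query` is a hypothetical helper, not part of any SDK, and the doubling of backslashes simply mirrors the behaviour_files example above rather than a documented escaping rule.

```python
def build_query(modifier: str, value: str) -> str:
    """Return a search string of the form modifier:"value"."""
    if '"' in value:
        # How embedded quotes should be escaped is not covered in this post,
        # so this sketch refuses them instead of guessing.
        raise ValueError("values containing double quotes are not supported")
    # Double each backslash, as in the behaviour_files example above (assumption).
    escaped = value.replace("\\", "\\\\")
    return f'{modifier}:"{escaped}"'


queries = [
    build_query("engines", "xworm"),
    build_query("yara_rule", "cobalt"),
    build_query("comment", "guloader"),
    build_query("behaviour_files", r"\AppData\Roaming\DataLogs\DataLogs.conf"),
]

for query in queries:
    print(query)
```

Each printed line reproduces one of the queries shown in the sections above and can be pasted into the search box as-is.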

Backscatter in action

As mentioned, Backscatter is capable of extracting configuration information from malware families through static techniques, including from memory dumps. This functionality, implemented in Google Threat Intelligence, complements the previously described methods for categorizing malware families.

The best part of Backscatter is that it can also extract actionable information that is useful for pivoting and threat hunting, such as C2 servers, campaign identifiers, SMTP configuration, and registry keys created for persistence, among others.

Consuming backscatter information from the API

Before going into the details of Backscatter and looking at different use cases, it is important to know that the information offered through the API sometimes includes more information than what can be consulted through the GUI.

To consume the information generated by Backscatter, query the /api/v3/files/[sha256] endpoint for a sample that has an extracted configuration. The JSON response contains a key called malware_config with the extracted details. The following JSON comes from the AsyncRAT sample with sha256 519f3ceedba4471f3d5178451c1007911145fb6eaf4e259a2c29b8e3483dabb1.

"malware_config": {
        "families": [
          {
            "family": "asyncrat",
            "configs": [
              {
                "implant_info": {
                  "version": "| CRACKED BY https://t.me/xworm_v2",
                  "campaign_ids": [
                    "SolaraFake"
                  ]
                },
                "tool": "MANDIANT_BACKSCATTER",
                "host_info": {
                  "folders": [
                    {
                      "folder_name": "%Temp%",
                      "description": "Install Folder"
                    }
                  ],
                  "mutexes": [
                    "AsyncMutex_6SI8OkPnk"
                  ],
                  "files": [
                    {
                      "file_name": "%Temp%\\Windows.exe"
                    },
                    {
                      "file_name": "%Temp%\\Windows.exe",
                      "description": "Install File"
                    }
                  ]
                },
                "txt_configs": [
                  "{\"Server\": \"anyone-blogging.gl.at.ply.gg\", \"Ports\": \"22284\"
[.....]
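As a rough illustration of consuming this response, the Python sketch below fetches a file report and summarizes the malware_config key. Several details are assumptions not stated in this post: the `www.virustotal.com` host, the `x-apikey` header, and the `data`/`attributes` wrapper in the sample response. To avoid depending on that wrapper, `find_key` searches the whole response for the key. The field names inside `summarize` are taken from the excerpt above.

```python
import json
import urllib.request

API_BASE = "https://www.virustotal.com"  # assumed host; the post only gives the path


def fetch_file_report(sha256: str, api_key: str) -> dict:
    """GET /api/v3/files/[sha256] and return the decoded JSON (needs network access)."""
    request = urllib.request.Request(
        f"{API_BASE}/api/v3/files/{sha256}",
        headers={"x-apikey": api_key},  # assumed authentication header
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def find_key(obj, key):
    """Depth-first search for the first occurrence of `key` in nested JSON data."""
    if isinstance(obj, dict):
        if key in obj:
            return obj[key]
        children = obj.values()
    elif isinstance(obj, list):
        children = obj
    else:
        return None
    for child in children:
        found = find_key(child, key)
        if found is not None:
            return found
    return None


def summarize(malware_config: dict) -> list:
    """Return one row per extracted config: family, tool, campaign IDs, mutexes."""
    rows = []
    for family in malware_config.get("families", []):
        for config in family.get("configs", []):
            rows.append({
                "family": family.get("family"),
                "tool": config.get("tool"),
                "campaign_ids": config.get("implant_info", {}).get("campaign_ids", []),
                "mutexes": config.get("host_info", {}).get("mutexes", []),
            })
    return rows


# Trimmed-down response modeled on the AsyncRAT excerpt above.
SAMPLE_RESPONSE = {
    "data": {"attributes": {  # assumed wrapper around malware_config
        "malware_config": {"families": [{
            "family": "asyncrat",
            "configs": [{
                "implant_info": {"campaign_ids": ["SolaraFake"]},
                "tool": "MANDIANT_BACKSCATTER",
                "host_info": {"mutexes": ["AsyncMutex_6SI8OkPnk"]},
            }],
        }]},
    }},
}

config = find_key(SAMPLE_RESPONSE, "malware_config")
for row in summarize(config):
    print(row)
```

With a real API key, `find_key(fetch_file_report(sha256, api_key), "malware_config")` would replace the hard-coded sample; samples without an extracted configuration would yield `None`.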

It is important to mention that many of the fields included in the JSON can be used to do threat hunting on the platform using the malware_config: modifier.

The information stored in the JSON can differ between malware families. For example, for the Remcos sample with sha256 5433726d3912a95552d16b72366eae777f5f34587e1bdaa0c518c5fcbc3d8506, we can see more details such as mutexes, folders used, files used, or even registry keys created for persistence.

[...]
                "host_info": {
                  "mutexes": [
                    "iwebfiewbfihbewlfkm-WH4782"
                  ],
                  "folders": [
                    {
                      "folder_name": "MicRecords",
                      "description": "Audio Folder"
                    }
                  ],
                  "files": [
                    {
                      "file_name": "unknown path\\Remcos\\remcos.exe",
                      "description": "Install Path"
                    },
                    {
                      "file_name": "remcos.exe",
                      "description": "Install File"
                    },
                    {
                      "file_name": "logs.dat",
                      "description": "Keylog File"
                    },
                    {
                      "file_name": "Screenshots",
                      "description": "Screenshot File"
                    }
                  ],
                  "registry": [
                    {
                      "hive": "HKEY_LOCAL_MACHINE",
                      "subkey": "Software\\Microsoft\\Windows\\CurrentVersion\\Winlogin",
                      "value_name": "Userinit"
                    },
                    {
                      "hive": "HKEY_CURRENT_USER",
                      "subkey": "Software\\iwebfiewbfihbewlfkm-WH4782"
                    }
                  ]
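The host_info block above lends itself to being flattened into a list of host indicators. The following Python sketch shows one way to do that. `host_indicators` and `mutex_queries` are hypothetical helpers, the input is a shortened copy of the Remcos excerpt, and whether every field is searchable through the malware_config: modifier is an assumption that should be checked per field.

```python
def host_indicators(host_info: dict) -> dict:
    """Flatten a host_info block into lists of simple indicator values."""
    registry = []
    for entry in host_info.get("registry", []):
        key_path = entry["hive"] + "\\" + entry["subkey"]
        # value_name is optional in the excerpt, so default to None.
        registry.append((key_path, entry.get("value_name")))
    return {
        "mutexes": list(host_info.get("mutexes", [])),
        "folders": [item["folder_name"] for item in host_info.get("folders", [])],
        "files": [item["file_name"] for item in host_info.get("files", [])],
        "registry": registry,
    }


def mutex_queries(indicators: dict) -> list:
    """Build one malware_config query per mutex (assumes mutexes are searchable)."""
    return [f'malware_config:"{mutex}"' for mutex in indicators["mutexes"]]


# Shortened copy of the Remcos host_info excerpt above.
REMCOS_HOST_INFO = {
    "mutexes": ["iwebfiewbfihbewlfkm-WH4782"],
    "folders": [{"folder_name": "MicRecords", "description": "Audio Folder"}],
    "files": [
        {"file_name": "remcos.exe", "description": "Install File"},
        {"file_name": "logs.dat", "description": "Keylog File"},
    ],
    "registry": [
        {
            "hive": "HKEY_LOCAL_MACHINE",
            "subkey": "Software\\Microsoft\\Windows\\CurrentVersion\\Winlogin",
            "value_name": "Userinit",
        },
        {"hive": "HKEY_CURRENT_USER", "subkey": "Software\\iwebfiewbfihbewlfkm-WH4782"},
    ],
}

indicators = host_indicators(REMCOS_HOST_INFO)
for name, values in indicators.items():
    print(name, values)
print(mutex_queries(indicators))
```

Keeping the registry value name separate from the key path makes it easier to reuse the output in detection rules, which usually treat the two differently.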

Backscatter for malware family identification

The main use case for Backscatter is identifying the malware family of a sample based on the configuration that was extracted from it.

To do this, you can use the malware_config: modifier followed by the name of the family whose samples you want to identify. Some of the families included are the following (note this list is not exhaustive):

redline	agenttesla	remcos	asyncrat
xworm	njrat	darkgate	warzone
beacon	lummac	donut	smokeloader
mirai	xmrig	basta	vshell

Let's say you want to identify samples where Backscatter was able to extract configuration information for the AgentTesla family. You can run the following query.

malware_config:agenttesla

This returns samples related to AgentTesla, either because the configuration could be extracted statically or dynamically, or because the sample was related to a final AgentTesla payload during the infection chain.
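The same query can also be run programmatically. The sketch below is a minimal Python example that assumes a search endpoint at `/api/v3/intelligence/search` taking `query` and `limit` parameters, the `www.virustotal.com` host, the `x-apikey` header, and a response whose `data` list holds objects with an `id` field. None of these details are stated in this post, so treat them as assumptions to verify against the API reference.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://www.virustotal.com/api/v3"  # assumed base URL
SEARCH_PATH = "/intelligence/search"            # assumed search endpoint


def search_url(query: str, limit: int = 10) -> str:
    """Build the search URL, URL-encoding the query string."""
    params = urllib.parse.urlencode({"query": query, "limit": limit})
    return f"{API_BASE}{SEARCH_PATH}?{params}"


def search(query: str, api_key: str, limit: int = 10) -> list:
    """Run the search and return the IDs of matching objects (needs network access)."""
    request = urllib.request.Request(
        search_url(query, limit),
        headers={"x-apikey": api_key},  # assumed authentication header
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        payload = json.load(response)
    # Assumed response shape: {"data": [{"id": "<sha256>", ...}, ...]}
    return [item.get("id") for item in payload.get("data", [])]


print(search_url("malware_config:agenttesla"))
```

Only `search_url` runs offline; `search` needs a valid API key and would return up to `limit` identifiers per call.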

Figure 2: Results related to AgentTesla samples using malware_config modifier

If we open one of the results, within the detection section we will see a message indicating that a malware configuration was detected.

Figure 3: Message within the detection tab indicating that there was a malware config detected

Finally, in the details tab we can see the information that was extracted and is displayed in the GUI (remember that sometimes, the API returns additional information).

Figure 4: Information extracted from the malware configuration

Identifying samples by campaign IDs

Beyond the name of the malware family, Backscatter is also valuable for pivoting on the campaign ID found in the configuration information.

This makes it possible to track campaigns launched by threat actors on a massive scale. These campaign IDs are found in many of the configuration files of certain families, or are even used as parameters in communications with the C2.

In 2024, Proofpoint researchers reported a campaign involving the malware family they called Latrodectus, which had certain similarities with IcedID and even shared infrastructure historically associated with IcedID. Proofpoint analysts brute-forced the hashes generated using the FNV-1a algorithm, which helped them identify campaign IDs to track the actors distributing the malware samples.

To illustrate the same idea, take the sample with sha256 f116c598a710715742bec6611ad8557e1947fc90207d3c603afafb3357f6282c, a PowerShell downloader in charge of downloading Remcos. In the details tab we can see the campaign ID zynova, which was extracted by Backscatter.

Figure 5: Campaign ID identified in a Remcos sample

Just by clicking on that campaign ID, or by manually running the query malware_config:zynova, we can pivot and identify other samples that have the same campaign ID in their configuration.

Figure 6: Pivoting using a campaign ID
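When malware_config data has already been collected for a set of samples, the same pivot can be done offline by grouping samples by campaign ID. The Python sketch below assumes the `families` / `configs` / `implant_info` / `campaign_ids` layout seen in the AsyncRAT JSON earlier in this post. The hash names (`hash_a`, `hash_b`, `hash_c`) are placeholders, not real samples.

```python
from collections import defaultdict


def iter_campaign_ids(malware_config: dict):
    """Yield (family, campaign_id) pairs found in one malware_config block."""
    for family in malware_config.get("families", []):
        for config in family.get("configs", []):
            for campaign_id in config.get("implant_info", {}).get("campaign_ids", []):
                yield family.get("family"), campaign_id


def group_by_campaign(configs_by_hash: dict) -> dict:
    """Map each campaign ID to the set of sample hashes that carry it."""
    groups = defaultdict(set)
    for sample_hash, malware_config in configs_by_hash.items():
        for _family, campaign_id in iter_campaign_ids(malware_config):
            groups[campaign_id].add(sample_hash)
    return dict(groups)


def make_config(family: str, campaign_id: str) -> dict:
    """Build a minimal malware_config block for the demonstration data."""
    return {"families": [{
        "family": family,
        "configs": [{"implant_info": {"campaign_ids": [campaign_id]}}],
    }]}


# Placeholder hashes; real data would come from the API response.
CONFIGS_BY_HASH = {
    "hash_a": make_config("remcos", "zynova"),
    "hash_b": make_config("remcos", "zynova"),
    "hash_c": make_config("asyncrat", "SolaraFake"),
}

groups = group_by_campaign(CONFIGS_BY_HASH)
for campaign_id, hashes in sorted(groups.items()):
    print(f"malware_config:{campaign_id}", sorted(hashes))
```

Campaign IDs shared by many samples are natural starting points for the malware_config pivot described above.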

Identifying exfiltration channels

Many malware samples include information in their configuration about the methods used to exfiltrate information from the infected system. Channels like SMTP, Telegram, or Discord are some of the most commonly used for this purpose.

Is it possible to use malware_config to detect this information? It is possible!

The first example is a use case in which we identify malware samples that could use Mailhostbox SMTP services to exfiltrate information. To achieve this, we use the query malware_config:"us2.smtp.mailhostbox.com". This returns samples whose configuration includes the SMTP server of Mailhostbox, a provider of email hosting services.

Figure 7: Malware samples having the Mailhostbox SMTP in the configuration.

If we click on one of the results through the GUI, we will not get information about the email account used, but if we consume this same information through the API, we will be able to get more information about the exfiltration methods related to Mailhostbox SMTP used in this case.

This information can help us identify intrusion sets, since threat actors sometimes use the same email account and password to exfiltrate information from their victims. This means that, even if you can't see the email used for exfiltration in the GUI, you can get it through the API and then create a query to identify other samples using the same email.

For example, with this query you can get other samples that are using the same email in their malware configuration.

malware_config:"moni@veguillla.com"

Figure 8: SMTP information extracted from configuration

Another use case is monitoring Telegram channels and bots that are configured to exfiltrate information from infected systems. This is common in families such as AgentTesla or Xworm among others. To achieve this, it is as simple as running the query malware_config:"api.telegram.org" to obtain samples that use this channel to exfiltrate information.

Figure 9: Telegram bot token used to exfiltrate information

From this point, we can identify the use of the same Telegram token in multiple samples with a query like the following.

malware_config:"bot8063115238:AAF5QkyJ-P7cyJqnksrZvJq6tGSBKoQeVtY"

Figure 10: Example of query filtering by Telegram bot token

In this example, we obtained 8 AgentTesla samples sharing the same bot token, which is responsible for exfiltrating information from compromised systems.
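Bot tokens can also be pulled out of extracted configuration text automatically and turned into pivot queries. The Python sketch below uses a regular expression that matches strings shaped like the token in the query above (`bot`, digits, a colon, then a long run of letters, digits, `_` or `-`). The pattern is our own approximation, and the `Telegram` key and URL layout in the example text are hypothetical, since the exact configuration format is not shown in this post.

```python
import re

# Approximate shape of a bot token as it appears in the query above.
BOT_TOKEN_RE = re.compile(r"bot\d+:[A-Za-z0-9_-]{30,}")


def telegram_queries(texts) -> list:
    """Return one malware_config query per distinct bot token found in the texts."""
    tokens = []
    for text in texts:
        for token in BOT_TOKEN_RE.findall(text):
            if token not in tokens:  # keep first-seen order, drop duplicates
                tokens.append(token)
    return [f'malware_config:"{token}"' for token in tokens]


# Hypothetical configuration strings; only the token itself comes from this post.
EXAMPLE_TEXTS = [
    '{"Telegram": "https://api.telegram.org/'
    'bot8063115238:AAF5QkyJ-P7cyJqnksrZvJq6tGSBKoQeVtY/sendDocument"}',
    '{"Server": "example.invalid", "Ports": "1234"}',
]

for query in telegram_queries(EXAMPLE_TEXTS):
    print(query)
```

The second example string contains no token and is ignored, so the only query produced is the one shown in the text above.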

Other use cases

So far we have presented some use cases, but remember that they are not the only ones, and that the imagination of each analyst can create advanced queries in combination with Backscatter's malware_config modifier.

If you want to get samples from which Backscatter was able to extract a configuration and that were attached to emails, you can use the following query.

have:malware_config have:email_parents

Or, for example, if you are curious and want to know which samples had configuration data extracted but were not detected by any AV engine, you can use the query below.

have:malware_config p:0
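Both queries above combine conditions by separating them with a space. A small Python sketch of that composition follows. `combine` and `undetected_with_config` are hypothetical helpers, and adding a family term to the undetected-samples query is our own illustration of how the modifiers from this post could be mixed.

```python
def combine(*terms: str) -> str:
    """Join search terms with single spaces, skipping empty ones."""
    return " ".join(term.strip() for term in terms if term.strip())


def undetected_with_config(family: str = "") -> str:
    """Query for samples with an extracted config and zero AV detections."""
    family_term = f"malware_config:{family}" if family else ""
    return combine("have:malware_config", "p:0", family_term)


print(combine("have:malware_config", "have:email_parents"))
print(undetected_with_config())
print(undetected_with_config("agenttesla"))
```

The first two lines printed are the two queries from this section; the third narrows the undetected set to a single family.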

Wrapping up

Backscatter is a tool developed by the Mandiant FLARE team that uses static signatures and emulation to extract malware configurations, without executing the malware. This tool is a complement to other methods for categorizing malware families, and the information it provides can be used for threat hunting.

In conclusion, Backscatter provides a powerful way to identify malware families, campaign IDs, and exfiltration channels by extracting configuration data, enhancing threat hunting capabilities. These methods can enable new ways to do threat hunting using Google Threat Intelligence and VirusTotal Enterprise.

We hope you found this blog interesting and useful, and as always we are happy to hear your feedback.

Consuming Backscatter Information to Perform Threat Hunting
