Getting Started with Data Tables in Google Security Operations

David-French
Staff

In Google Security Operations (SecOps), single-column reference lists have been a longstanding method for including or excluding events in detection rules based on a list of strings, regular expressions, or CIDR ranges. But what if you need to apply more complex logic in your rules that involves filtering events based on multiple criteria or enriching events/entities using custom data? We recently launched data tables in public preview to help security teams with their more sophisticated event filtering and enrichment use cases.

A data table like the example shown below is made up of named columns (in the first row of the table) and rows of data. Each column of a data table must be mapped to either a data type (string, regex, or CIDR) or to a UDM entity field (e.g. entity.user.termination_date.seconds). Specifying a data type for a column in a data table allows you to use values in those columns to filter events in your rules. Mapping a data table column to an entity field allows you to override, append, and exclude data in Google SecOps’ Entity Graph using custom data.

This post will demonstrate how to create data tables and use them to filter events in YARA-L rules. In future posts, we’ll do a deep dive on how to write values to a data table using the results from a rule and how to use a data table to modify data in the entity graph, but we’ll “table” that for now 😐

Please note, while the data tables feature is in public preview, it’s not possible to read & write data to/from data tables in search queries. This functionality is expected to be added in a future release.

Image: Example of a data table in Google SecOps

Now that we know what data tables are and how they can be used at a high level, I’ll go ahead and create a new data table and show how to use it in a rule.

From the Data Tables page in Google SecOps, I click the “Create” button and enter a name and description for the new table. A common detection use case for security operations teams is to maintain a list of users to monitor for suspicious activity such as data exfiltration. This is the example use case that we’re going to focus on today.

Image: Creating a new data table in Google SecOps

The new data table has been created. Now it’s time to populate it with some data. I can either manually type the data (column headers and rows) into the user interface or I can click the “Import File” button to upload a CSV/TSV file to populate the data. I’m going to upload the following CSV file.

user_email,expiry_date
mikeross@cymbal-investments.net,2025-05-01 00:00:00
roxysmith@cymbal-investments.net,2025-05-01 00:00:00
rachelmason@cymbal-investments.net,2025-06-01 00:00:00

This is a small data table for demonstration purposes. We will explore more complex use cases for data tables as this blog series progresses.
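
Before uploading an import file, it can help to check that every row is well-formed. The following is a minimal Python sketch of such a check, not part of Google SecOps. The `validate_rows` helper and its checks are hypothetical, and it assumes the `YYYY-MM-DD HH:MM:SS` layout used in the example file above.

```python
import csv
import io
from datetime import datetime

# Contents of the example import file from this post.
CSV_TEXT = """user_email,expiry_date
mikeross@cymbal-investments.net,2025-05-01 00:00:00
roxysmith@cymbal-investments.net,2025-05-01 00:00:00
rachelmason@cymbal-investments.net,2025-06-01 00:00:00
"""


def validate_rows(csv_text):
    """Parse the CSV text and raise ValueError on the first malformed row."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    for line_no, row in enumerate(rows, start=2):  # line 1 is the header
        if "@" not in (row.get("user_email") or ""):
            raise ValueError(f"line {line_no}: user_email does not look like an email address")
        # strptime raises ValueError if the value does not match the layout.
        datetime.strptime(row.get("expiry_date") or "", "%Y-%m-%d %H:%M:%S")
    return rows


rows = validate_rows(CSV_TEXT)
print(len(rows), "rows look well-formed")
```

A check like this only catches formatting mistakes locally. The import dialog in Google SecOps remains the source of truth for what the platform accepts.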

After reviewing the import file options, I click “Import Data” and my new data table is populated with the data from my CSV file.

Image: Reviewing available options for importing the contents of a file into a data table

Image: Populating a data table in Google SecOps

Our next step is to define the data type (STRING, REGEX, or CIDR) for each of the columns in the data table. Setting the data type to REGEX or CIDR allows for the comparison of values in events that you’ve ingested into SecOps and regular expressions or CIDR IP address ranges stored in the data table. Setting the data type to STRING allows you to do row- or column-based comparisons based on string values.
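
To make the difference between the three data types concrete, here is a conceptual Python model of what each kind of comparison does. This is a sketch for building intuition, not how Google SecOps evaluates rules. The function names and sample values are hypothetical, and the exact regex matching semantics (search versus full match) are an assumption.

```python
import ipaddress
import re


def matches_string(event_value, cell):
    """STRING: the event value must equal the table cell exactly."""
    return event_value == cell


def matches_regex(event_value, cell):
    """REGEX: the table cell holds a pattern that is applied to the event value."""
    return re.search(cell, event_value) is not None


def matches_cidr(event_ip, cell):
    """CIDR: the table cell holds a network range that must contain the event IP."""
    return ipaddress.ip_address(event_ip) in ipaddress.ip_network(cell)


print(matches_string("mikeross@cymbal-investments.net", "mikeross@cymbal-investments.net"))
print(matches_regex("svc-backup@cymbal-investments.net", r"^svc-.*"))
print(matches_cidr("10.10.10.10", "10.10.0.0/16"))
```

In every case the table cell is the thing being matched against. The data type only changes how the cell is interpreted.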

Selecting the STRING data type is fine for both of these columns. The “expiry_date” column contains a date and time that we will convert to a timestamp in the logic for the new rule that we’re about to create.

Image: Setting the data type for columns in a data table

The rule shown below is a customized version of one of our community rules. It detects when a user who is in the “monitored_users” data table shares a file via Google Drive with an email address that’s associated with a free service such as Gmail or Hotmail. I’ve removed some values from the meta and outcome sections of the rule for brevity.

Specifically, the rule does the following:

  1. Filters Google Workspace events to identify Google Drive file share events
  2. Performs a row-based comparison between the user’s email address and the values in the “user_email” column of the “monitored_users” data table
  3. Checks that the event timestamp (metadata.event_timestamp.seconds) is earlier than the “expiry_date” for the user in the data table

Let’s take a closer look at the YARA-L syntax for performing a row-based comparison between a value in a UDM event and the values stored in a data table.

rule monitored_user_google_workspace_file_shared_from_google_drive_to_free_email_domain {
    meta:
        author = "Google Cloud Security"
        description = "Identifies when a user account that is on our list of monitored users shares a file that's stored on Google Drive with a free email domain."
    
    events:
        $workspace.metadata.vendor_name = "Google Workspace"
        $workspace.metadata.product_name = "drive"

        (
            $workspace.metadata.product_event_type = "change_user_access" or
            $workspace.metadata.product_event_type = "change_document_visibility" or
            $workspace.metadata.product_event_type = "change_document_access_scope" or
            $workspace.metadata.product_event_type = "change_acl_editors"
        )

        // File shared with an email address that's associated with free email service
        // This list of domains can be customized and/or stored & maintained in its own data table
        $workspace.target.resource.attribute.labels["visibility"] = "shared_externally"
        $workspace.target.user.email_addresses = /.*@gmail\.com|.*@aol\.com|.*@ymail\.com|.*@hotmail\.com|.*@outlook\.com|.*@icloud\.com/

        $user_email = $workspace.principal.user.email_addresses[0]
        // Check if the user's email address is found in a row within the data table's 'user_email' column
        $user_email = %monitored_users.user_email
    
        // Ensure that the event timestamp is earlier than the 'expiry_date' that's stored for the user in the data table
        $workspace.metadata.event_timestamp.seconds < timestamp.as_unix_seconds(%monitored_users.expiry_date)

    match:
        $user_email over 15m

    outcome:
        $target_emails = array_distinct($workspace.target.user.email_addresses)
        $doc_name = array_distinct($workspace.target.resource.name)

    condition:
        $workspace
}

In the rule’s events section, we are looking for events where the user’s email address (stored in the $user_email placeholder variable) is found within a row in the data table for the “user_email” column. The syntax for referencing a data table column is %data_table_name.column_name.

We’re also comparing the timestamp (metadata.event_timestamp.seconds) for the UDM event with the timestamp that’s stored in the “expiry_date” column for the same user. Note that the timestamp.as_unix_seconds function is being used to convert the timestamp (e.g. 2025-05-01 00:00:00) to an epoch timestamp value ready for comparison with the timestamp in the UDM event.
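
For readers who want to sanity-check the epoch values involved, the following Python sketch performs an equivalent conversion outside of Google SecOps. It is an illustration only: the local `as_unix_seconds` helper is hypothetical, and it assumes the stored timestamps are in UTC and use the `YYYY-MM-DD HH:MM:SS` layout.

```python
from datetime import datetime, timezone


def as_unix_seconds(text):
    """Convert a 'YYYY-MM-DD HH:MM:SS' string to epoch seconds, assuming UTC."""
    parsed = datetime.strptime(text, "%Y-%m-%d %H:%M:%S")
    return int(parsed.replace(tzinfo=timezone.utc).timestamp())


expiry = as_unix_seconds("2025-05-01 00:00:00")
print(expiry)  # 1746057600

# An event on 2025-04-15 is earlier than the expiry date, so it would pass the check.
event_seconds = as_unix_seconds("2025-04-15 12:00:00")
print(event_seconds < expiry)  # True
```

If your expiry dates are meant to be interpreted in a different time zone, the resulting epoch values shift accordingly, so it is worth deciding on one convention for the whole table.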

Image: Utilizing a data table in a YARA-L rule in Google SecOps

I’ve used the term “row-based comparison” a couple of times. When working with data tables, a row-based comparison is used to evaluate conditions for a single row in a data table. In the example rule above, we want to ensure that the values for a user’s email address and “expiry_date” are evaluated against a single row in the data table. We don’t want to match a user’s email address in one row in the data table and match the event timestamp with another user’s “expiry_date” in a different row within the data table.

Row-based comparisons are performed by using comparison operators such as = and <. For example, $user_email = %monitored_users.user_email. Column-based comparisons are performed by using the “in” keyword. You can read more about row- and column-based comparisons in the documentation.
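
The distinction can be modeled in a few lines of Python. This is a conceptual sketch under stated assumptions, not a description of how Google SecOps evaluates rules: the table is a hypothetical in-memory copy of “monitored_users” with expiry dates already converted to epoch seconds (UTC assumed), and all function names are made up for illustration.

```python
# Hypothetical in-memory copy of the "monitored_users" data table.
MONITORED_USERS = [
    {"user_email": "mikeross@cymbal-investments.net", "expiry_date": 1746057600},     # 2025-05-01
    {"user_email": "roxysmith@cymbal-investments.net", "expiry_date": 1746057600},    # 2025-05-01
    {"user_email": "rachelmason@cymbal-investments.net", "expiry_date": 1748736000},  # 2025-06-01
]


def row_based_match(email, event_ts, table):
    """Both conditions must hold for the SAME row."""
    return any(
        row["user_email"] == email and event_ts < row["expiry_date"]
        for row in table
    )


def cross_row_match(email, event_ts, table):
    """The mistake to avoid: each condition may be satisfied by a different row."""
    email_found = any(row["user_email"] == email for row in table)
    date_ok = any(event_ts < row["expiry_date"] for row in table)
    return email_found and date_ok


def in_column(value, table, column):
    """Column-based check: is the value present anywhere in one column?"""
    return value in {row[column] for row in table}


event_ts = 1747267200  # 2025-05-15 00:00:00 UTC, after mikeross's expiry date

print(row_based_match("mikeross@cymbal-investments.net", event_ts, MONITORED_USERS))  # False
print(cross_row_match("mikeross@cymbal-investments.net", event_ts, MONITORED_USERS))  # True
print(in_column("mikeross@cymbal-investments.net", MONITORED_USERS, "user_email"))    # True
```

The event occurs after the expiry date in mikeross's own row, so the row-based check rejects it. The cross-row version wrongly accepts it because rachelmason's later expiry date satisfies the timestamp condition.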

Testing the rule by running it over the last two weeks of events reveals that user “mikeross@cymbal-investments.net” shared a file, “Customer Proposals - March 2025” via Google Drive with a Gmail email address.

Image: Testing the new rule in Google SecOps’ rules editor

Wrap up

In this post, we learned how to create data tables in Google SecOps and populate them with data. We walked through an example of how to utilize data tables in YARA-L rules to filter events based on a detection use case. I also explained the difference between row- and column-based comparisons when using data tables.

I hope you found this useful. Stay tuned for more posts on data tables in the near future. I’ll be providing examples on how to write results from a rule to a data table, how to enrich entities using custom data, and how to manage data tables using Google SecOps’ API.

10 Comments
rahul7514
Bronze 4

@David-French: Nice article. So the difference between a data table and a reference list is that with a reference list we can compare a single column of values with events, while with a data table we can compare two or more column values with events. Is my understanding correct?

David-French
Staff

@rahul7514 that's one of the differences, yes. Data tables can also be used in rules to override, append, and exclude data from the entity graph. Values can also be written from a rule to a data table. In a future release, we expect that users will be able to read & write data from/to data tables in search queries as well.

nahatx
Bronze 4

Super cool! 

@David-French, are there plans to add SOAR Actions to search/update Data Tables as well? Assuming so, do you have any roadmap estimate?

sufuz
Bronze 1

Hello @David-French 

Thanks for your article.

I would like to know if it is possible, for example, to exclude IP addresses on specific ports using data tables without excluding all ports for every IP in the table.

For example, given this data table:

10.10.10.10, 161
10.11.11.11, 135

Is it possible to exclude traffic from 10.10.10.10 to port 161 and from 10.11.11.11 to port 135?
Thank you

mikewilusz
Staff

@nahatx While we do not have an official integration today, I did write a community integration for Data Tables with the exact functionality you're looking for. You can review and download from here (make sure to download the zip from the Releases): https://github.com/pilot006/chronicle-soar-secops-data-tables

-mike

nahatx
Bronze 4

Awesome! Thank you @mikewilusz.

Which service account key does the integration require?

David-French
Staff

@sufuz - Here is an example rule that filters network connection events based on target.ip and target.port.

rule example_exclude_traffic_to_specific_target_ip_and_port_combinations {

  meta:
    author = "David French"

  events:
    $network.metadata.event_type = "NETWORK_CONNECTION"

    $log_type = $network.metadata.log_type
    // join (=) the metadata.log_type UDM field to the rows in the log_type column in the data table
    $log_type = %ip_port_combos_3.log_type

    // filter events where the target.ip is not equal to the target_ip in the data table row
    $network.target.ip != %ip_port_combos_3.target_ip
    // filter events where the target.port is not equal to the target_port in the same data table row
    $network.target.port != cast.as_int(%ip_port_combos_3.target_port)

  match:
    $log_type over 15m

  outcome:
    $risk_score = 0

  condition:
    $network
}

And here is the example data table that I created:

log_type,target_ip,target_port
GCP_FIREWALL,10.11.11.11,135
GCP_FIREWALL,10.130.10.7,22
GCP_FIREWALL,10.10.10.10,161

I have a couple of questions about your use case.

Do you need to filter events based on CIDR IP address ranges? If so, it's possible to use the CIDR data type for a column in a data table and utilize that in your rule.

Do you need to filter events for IP address(es) based on ranges or a collection of ports? If so, you can use the REGEX data type in a data table and use that as well.

mikewilusz
Staff

@nahatx Just updated the README on the GitHub with what's required for the service account. Thanks for asking!

https://github.com/pilot006/chronicle-soar-secops-data-tables/blob/main/README.md

-mike

sufuz
Bronze 1

Hello @David-French 

Thank you for your response, appreciate it.

So if I understand correctly, doing this will exclude only the IP address and target port combinations that are on the same row, and not every port in the data table for every IP address?

$network.target.ip != %ip_port_combos_3.target_ip
$network.target.port != cast.as_int(%ip_port_combos_3.target_port)

To answer your questions:

Do you need to filter events based on CIDR IP address ranges? If so, it's possible to use the CIDR data type for a column in a data table and utilize that in your rule.
No, for now I just need to exclude specific behaviors, for example the IP "10.11.11.11" requesting port 135.

Do you need to filter events for IP address(es) based on ranges or a collection of ports? If so, you can use the REGEX data type in a data table and use that as well.
I don't need this for now, but it's nice to know that it's possible.

Thank you very much!

David-French
Staff

@sufuz the answer to your question about comparing values in the same row of a data table is yes. Happy to help.