Keep data secure: Linting LookML for access filter...

andy4

Looker empowers analysts to build powerful data models that can serve diverse user groups from a single codebase. However, this flexibility brings the responsibility of ensuring that sensitive data is accessible only to authorized users. While Looker provides tools such as access_filter and access_grant to manage data visibility, how can you enforce their use consistently across your project, especially when you’re dealing with sensitive information?

This post explores how to implement a LookML linter, specifically LAMS (Look At Me Sideways), to automatically check your code during development and prevent changes that could inadvertently expose sensitive data. We'll use a real-world scenario faced by @brandenwest at Reddit involving Material Non-Public Information (MNPI) to illustrate the concept and provide steps for implementation.

The challenge: Balancing access and security (Reddit's MNPI example)

Reddit needed to provide analytics on business performance data. Some of this data, particularly related to the current, unannounced financial quarter, constituted MNPI under SEC regulations, strictly limiting who could view it.

Context:

A specific Looker project was dedicated to analyzing this potentially sensitive data.
All developers who contributed to this project were vetted and approved for MNPI access through a separate compliance process.
However, the consumers of the dashboards and Explores that were built on this project included both users who were authorized for MNPI and users who were not.
Reddit wanted to use the same dashboards and Explores to view both current (sensitive) and historical (publicly released) data. Users without MNPI access should see only the historical data.

This context meant that every Explore that was accessing potentially sensitive data needed a mechanism to perform one of the following functions:

Filter the data: Use an access_filter to show only historical (public) data to users without the necessary permissions.
Restrict access entirely: Use an access_grant to completely hide the Explore from users without the necessary permissions.

This approach allowed Reddit to maintain a relatively open data ecosystem while strictly controlling access to sensitive information on an Explore-by-Explore basis.

Technical background: Why linting is crucial

Manually checking every Explore in every Pull Request (PR) for the correct access controls is error-prone and doesn't scale. Looker itself doesn't natively enforce the presence of access_filter or access_grant on all Explores within a project.

This is where automated linting within your code promotion workflow becomes essential:

Project-wide enforcement: Looker projects are typically linked one-to-one with a Git repository (such as GitHub). CI/CD tools and linters operate at the repository level. Therefore, a linting rule configured for the repository will apply consistently across all code changes (PRs) within that LookML project.
Developer workflow integration: Linting integrates directly into the PR process. If a code change violates a rule (for example, an Explore is missing required access controls), the linter check fails, which blocks the PR from being merged until the issue is fixed.
Bridging the gap: The linter acts as an automated check, ensuring that every Explore definition adheres to the defined security policy (in this case, requiring an appropriate access_filter or access_grant).

Important note on project scope:

Remember that users primarily interact with models and Explores, not with projects directly. A project is a developer-centric concept that comprises a shared codebase, a developer community, and a code promotion workflow (including linting). Projects can be used to create boundaries where consistent development practices and enforcement (such as these access rules) are needed.

The solution: Enforcing access controls with a LAMS rule

To meet Reddit's requirements, a specific development pattern and a corresponding LAMS rule were implemented.

The pattern

User attribute: A Looker user attribute (can_see_mnpi in this example) is defined and assigned to users based on their authorization status. This can be done with a SAML (or similar) integration or with API automation.
Implement control:

If filtering by date is possible: A dedicated view that maps calendar dates to their public release status is created (and locked down using GitHub’s CODEOWNERS; more on that in another article). Developers join this view to their Explore's relevant date field and add an access_filter that references the user attribute. For users whose can_see_mnpi value does not authorize MNPI access, this filter restricts the data to rows that correspond to publicly released dates.
If filtering by date is not feasible: Developers add an access_grant that is based on the can_see_mnpi user attribute, so that only authorized users can view the Explore at all.
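
As a rough illustration of the pattern above, the following LookML sketch shows both options side by side. Only can_see_mnpi comes from the scenario; the view name mnpi_calendar, its table and columns, the Explore names, the grant name, and the allowed value in the access grant are all hypothetical and would need to match your own model and user attribute values.

```lookml
# Sketch only. All names other than can_see_mnpi are hypothetical.

# Grant based on the user attribute; an Explore opts in via required_access_grants.
access_grant: mnpi_access {
  user_attribute: can_see_mnpi
  # Assumed: the value held by authorized users. Adjust to your attribute's actual values.
  allowed_values: ["Yes,No"]
}

# Hypothetical calendar view that maps each date to its release status.
# In practice this would normally live in its own view file.
view: mnpi_calendar {
  sql_table_name: compliance.mnpi_calendar ;;

  dimension: calendar_date {
    type: date
    datatype: date
    sql: ${TABLE}.calendar_date ;;
  }

  # Assumed to hold "Yes" for unreleased (MNPI) dates and "No" for released dates.
  dimension: is_mnpi {
    type: string
    sql: ${TABLE}.is_mnpi ;;
  }
}

# Option 1: filter rows by release status.
# Assumes a revenue view with a report_date dimension exists.
explore: revenue {
  join: mnpi_calendar {
    type: left_outer
    relationship: many_to_one
    sql_on: ${revenue.report_date} = ${mnpi_calendar.calendar_date} ;;
  }

  access_filter: {
    field: mnpi_calendar.is_mnpi
    user_attribute: can_see_mnpi
  }
}

# Option 2: hide the Explore entirely from users without the grant.
explore: forecast {
  required_access_grants: [mnpi_access]
}
```

Under these assumptions, a user whose can_see_mnpi value is “No” would see only rows for released dates in the revenue Explore and would not see the forecast Explore at all.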

The LAMS rule

The core requirement is that every explore definition within the designated LookML project must either contain an access_filter block that references the can_see_mnpi user attribute OR be gated by an access_grant that is based on the same user attribute (an Explore applies a grant through its required_access_grants parameter).

The LAMS linter is configured with a custom rule written by @brandenwest to check precisely this condition for every PR that’s submitted against the project's repository.

Requirements for implementation

Before you start, ensure that you have the following:

Git integration: Your Looker project must be connected to a Git provider that supports Pull Requests (for example, GitHub, GitLab, Bitbucket).
Required PRs: Enable the "Require Pull Requests to deploy" setting in your Looker project's Git configuration. See the Looker documentation on setting up and testing a Git connection (navigate to the relevant section on PRs).
Looker user attribute: Define a user attribute in Looker (for example, can_see_mnpi) to identify the users who are authorized to see sensitive data. Assign appropriate values to your users or groups. In this case, the “yes” value was actually “Yes,No”, because that audience could see all data regardless of its MNPI status. Because this value is a comma-separated list, the user attribute type must be “String Filter (advanced).” The plain “String” type supports neither lists nor wildcards.
LAMS environment: A CI/CD environment capable of running LAMS checks on your PRs (for example, GitHub Actions, GitLab CI).
(Optional but recommended) MNPI calendar view: If you’re using the access_filter approach based on public release dates, a reliable, centrally managed date dimension or mapping view is needed.

Steps to set up the linter

Connect Looker to Git and require PRs: Ensure that the first two prerequisites (Git integration and required PRs) from the “Requirements for implementation” section are met.
Set up LAMS runtime: Configure your CI/CD pipeline (for example, GitHub Actions) to run LAMS. This typically involves checking out the PR code and executing the LAMS command. Refer to the official LAMS documentation for setup instructions: LAMS GitHub Repository

Set up the repository to run CI checks whenever a PR is created or updated. The LAMS project provides a few example implementation methods.
Reddit chose to check only the modified files.

Implement the LAMS rule:

Obtain the specific LAMS rule code that’s designed to check for the presence of access_filter or access_grant on every Explore. You can find @brandenwest's implementation of this rule here: Community Contribution: Access Filters
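Place this code, commented out, in the project manifest file. LAMS reads custom rule declarations from LookML comments in the manifest, so the commented-out rule has no effect on Looker itself.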
Configure your LAMS execution (for example, in your lams.config.yaml or via command-line arguments) to include this rule. Ensure the rule correctly references the name of your specific user attribute (e.g., can_see_mnpi).
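
To make the shape of such a rule concrete, here is a minimal sketch of a custom LAMS rule declared as comments in manifest.lkml. It is not @brandenwest's rule: it only checks that each Explore declares an access_filter or required_access_grants, and it does not verify that can_see_mnpi is referenced. The rule name, the match path, and the expression operators ($if, $boolean) are assumptions to verify against the LAMS custom-rule documentation before use.

```lookml
# manifest.lkml (sketch; not the rule used at Reddit)
# Assumption: the expression passes when either parameter is present
# on the matched Explore, and fails otherwise.

# LAMS
# rule: explore_has_access_control {
#   description: "Every Explore must declare an access_filter or required_access_grants"
#   match: "$.model.*.explore.*"
#   expr_rule: ($if ::match:access_filter true ($boolean ::match:required_access_grants)) ;;
# }
```

A production rule would also need to confirm that the access_filter or grant actually points at the intended user attribute, which is what the linked community contribution is designed to do.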

How the linter works in practice

Once set up, the workflow is seamless for developers. The linter workflow consists of six main steps:

Develop: A developer makes changes to LookML in the Looker IDE (for example, adds or modifies an Explore).
Commit and PR: The developer commits their changes and opens a PR in your Git provider (for example, GitHub).
Automated check: Your CI/CD pipeline automatically triggers. It runs LAMS, including the custom rule checking for access controls on all Explores that have been modified or added in the PR.
Feedback:

Pass: If all Explores have the required access_filter or access_grant, the LAMS check passes. The PR can proceed through any other required reviews and be merged.
Fail: If LAMS finds an Explore without the required access controls, the check fails. The PR status in GitHub (or equivalent) will indicate failure, blocking the merge. The LAMS output will specify which Explore(s) violated the rule.

Remediate: The developer reviews the LAMS feedback, adds the necessary access_filter or access_grant to the problematic Explore(s) in the Looker IDE, commits the fix, and pushes the update to the existing PR.
Re-check: The CI/CD pipeline runs again. If the fix is correct, the LAMS check now passes, unblocking the PR.

Conclusion

By integrating LAMS with a custom rule into their Looker development workflow, Reddit was able to automate the enforcement of critical data access policies. This approach strengthens data security and compliance: sensitive information that is exposed through Explores is consistently protected by an appropriate access_filter or access_grant, and accidental exposure is caught before code changes are merged and deployed.