Custom Cluster DNS Providers for Anthos Clusters on VMware and Bare Metal

Overview

Organizations generally maintain Public and/or Private DNS Servers for resolving their DNS queries. Consider an organization running Anthos Clusters on VMware or Bare Metal: how should it configure a Cluster DNS Provider, or ensure that in-cluster DNS Resolution for Anthos workloads goes through specific Public and/or Private DNS Server(s)? This article highlights the steps and leading practices for configuring Cluster DNS Provider(s) for Anthos Clusters on VMware and Bare Metal.

Background

In September 2021, CoreDNS was released as the default cluster DNS provider for Anthos Clusters on VMware (v1.9.x and above), thereby replacing KubeDNS. The support for CoreDNS was also announced for Anthos Clusters on Bare Metal around the same time. Along with the release of CoreDNS, a new ClusterDNS Custom Resource was launched as well to help configure cluster DNS options such as upstream Name Servers.

Introduction

Note: For brevity, this article uses the term Anthos Clusters to refer to both Anthos Clusters on VMware and Anthos Clusters on Bare Metal, unless otherwise stated.
 
In Anthos Clusters, the ClusterDNS Custom Resource can be configured to point to upstream Name Servers as needed, allowing users to configure their cluster DNS provider(s). ClusterDNS is based on CoreDNS (an open-source, general-purpose, authoritative DNS Server that can also serve as cluster DNS); treat it as a subset of the CoreDNS Corefile (the CoreDNS configuration file), offering a limited set of configuration options compared to the full Corefile.
 
The DNS configuration for a cluster is held in a ClusterDNS Custom Resource named default (the only valid name, as reconciliation, if needed, applies only to the resource named default). The ClusterDNS Custom Resource is cluster-wide, not namespaced.
 
To check the existing ClusterDNS resource specification, use the command below. (On a fresh installation, this applies only to Anthos Clusters on VMware; Anthos Clusters on Bare Metal don't come with a ClusterDNS Custom Resource by default.)
kubectl --kubeconfig <cluster_kubeconfig> get clusterdns default --output yaml
 
If the spec section is empty or missing (as mentioned above, this applies to fresh installations of Anthos Clusters on VMware), the cluster uses the default Kubernetes CoreDNS configuration.
 
To apply a custom DNS configuration, create a manifest file for the ClusterDNS Custom Resource, as shown below (the example uses a file named custom-clusterdns.yaml; the same manifest can also be used to create the ClusterDNS Custom Resource in Anthos Clusters on Bare Metal). Note that the name of the ClusterDNS Custom Resource must be default.
# Manifest File Name: custom-clusterdns.yaml
apiVersion: networking.gke.io/v1alpha1
kind: ClusterDNS
metadata:
  name: default
spec:
  upstreamNameservers:
  - serverIP: 8.8.8.8
  - serverIP: 8.8.4.4
  domains:
  - name: altostrat.com
    nameservers:
    - serverIP: 198.51.100.1
  - name: my-own-personal-domain.com
    nameservers:
    - serverIP: 203.0.113.1
    - serverIP: 203.0.113.2
      serverPort: 54
  googleAccess: private
 
To apply the ClusterDNS configuration, use the command below.
kubectl --kubeconfig <cluster_kubeconfig> apply -f custom-clusterdns.yaml
 
More information can be found in the official documentation.

ClusterDNS Specifications

ClusterDNS specifications define the desired state of the ClusterDNS resource. They are discussed in detail in this section.
spec.upstreamNameservers
The spec.upstreamNameservers specification defines an array of Name Server objects to which external (outside-cluster) DNS queries are forwarded.
 
A sample configuration for the same is stated below for reference.
spec:
  upstreamNameservers:
  - serverIP: 8.8.8.8
  - serverIP: 1.2.3.4
    serverPort: 54
 
Each object should have a serverIP along with an optional serverPort (default: 53). serverIP is the IP Address of the upstream Name Server and can be a valid IPv4 or IPv6 address.
 
If no spec.upstreamNameservers configuration is provided, the DNS provider falls back to the /etc/resolv.conf file on the respective Kubernetes Worker Node / Host to find the list of upstream Name Servers.
spec.domains
The spec.domains specification defines a list of domain configurations; essentially, it sets the Name Servers that should be used for queries to specific domains. Treat it as a block of stub domain configurations, where each stub domain maps a domain suffix to a list of Name Servers. Queries matching a given domain suffix are forwarded to the corresponding Name Servers rather than to the global upstream Name Servers (specified under spec.upstreamNameservers).
 
This allows query handling to be customized for specific domains.
 
A sample configuration for the same is stated below for reference.
spec:
  domains:
  - name: altostrat.com
    nameservers:
    - serverIP: 203.0.113.1
  - name: my-own-personal-domain.com
    nameservers:
    - serverIP: 198.51.100.1
    - serverIP: 198.51.100.2
      serverPort: 50000
  - name: cluster.local
    queryLogging: true
spec.domains.name

The spec.domains.name specification specifies the name of the domain to configure. It should be a valid domain suffix adhering to standard DNS naming conventions.

spec.domains.nameservers

The spec.domains.nameservers specification specifies the set of Name Servers to which requests for the corresponding domain suffix (as set in spec.domains.name) are forwarded.

spec.domains.queryLogging

The spec.domains.queryLogging specification is a boolean toggle (true/false) that enables query logging for the corresponding domain (as set in spec.domains.name). It can be used to log DNS queries for debugging.

spec.googleAccess

The spec.googleAccess specification controls how Google domains are resolved. Setting spec.googleAccess: private resolves Google domains to private-access IP Addresses, while setting spec.googleAccess: restricted resolves them to restricted-access IP Addresses. If there is no requirement to resolve Google domains to private-access or restricted-access IP Addresses, the spec.googleAccess configuration can either be removed or set to spec.googleAccess: default.

It should be noted that this configuration applies only to Pods in the cluster that use the Cluster DNS service to resolve domain names; it does not affect the DNS configuration of the Kubernetes Worker Nodes. If private access or restricted access is to be used, ensure that an appropriate hybrid network connectivity medium, such as VPN or Interconnect, is in place between GCP and the Anthos Clusters. For more information on Network Connectivity, check out the official documentation.
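As an illustration, the following spec fragment (a minimal sketch) would resolve Google domains to restricted-access IP Addresses, assuming the required hybrid connectivity is in place:

spec:
  googleAccess: restricted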

spec.orderPolicy

The spec.orderPolicy specification defines the ordering policy used to select among upstream servers. This configuration comes in handy when a Name Server configuration contains multiple serverIP entries (whether under spec.upstreamNameservers or spec.domains.nameservers) and a target upstream server IP Address must be chosen.

The default value for spec.orderPolicy is random, meaning the upstream server is selected at random. Two other values are supported: round_robin (round-robin selection of the upstream server) and sequential (upstream servers are queried sequentially until one responds, starting with the first server for each new query).
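For example, the following spec fragment (a sketch; the Name Server addresses are the public Google DNS addresses used in the earlier examples) would select between the two upstream Name Servers in a round-robin fashion:

spec:
  upstreamNameservers:
  - serverIP: 8.8.8.8
  - serverIP: 8.8.4.4
  orderPolicy: round_robin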

Recommendations and Noteworthy Points

  • Use the spec.domains configuration block effectively with appropriate domain suffixes and corresponding Name Servers.
  • When defining domain suffixes in scenarios where sub-domains and parent domains need to proxy requests to different Name Servers, place the sub-domains before the parent domain; otherwise, sub-domain requests will be forwarded to the parent domain’s upstream Name Server. This is because domains are evaluated in the order they are listed. For example, if kubernetes.example.net needs to forward requests to 10.100.0.2 and example.net needs to forward requests to 10.200.0.2, then since kubernetes.example.net is a sub-domain of example.net, its configuration should appear before that of example.net (the parent domain), as shown below.

 

spec:
  domains:
  - name: kubernetes.example.net
    nameservers:
    - serverIP: 10.100.0.2
  - name: example.net
    nameservers:
    - serverIP: 10.200.0.2

 

  • Having too many entries under the spec.domains specification could add latency, as domains are evaluated in the order they are listed. Additionally, domains sharing a common domain suffix should be grouped in order to keep the spec.domains specification tidy and performant, as shown below.

Below is a badly configured ClusterDNS manifest file that is unmanageable and not performant (it introduces additional latency).

 

# Badly configured ClusterDNS manifest file: custom-clusterdns.yaml
apiVersion: networking.gke.io/v1alpha1
kind: ClusterDNS
metadata:
  name: default
spec:
  upstreamNameservers:
  - serverIP: 8.8.8.8
  - serverIP: 8.8.4.4
  domains:
  - name: abc.kubernetes.example.net
    nameservers:
    - serverIP: 10.100.0.2
  - name: xyz.kubernetes.example.net
    nameservers:
    - serverIP: 10.100.0.2
  - name: efg.kubernetes.example.net
    nameservers:
    - serverIP: 10.100.0.2
  - name: internal.example.net
    nameservers:
    - serverIP: 10.100.0.3
  - name: mock.net
    nameservers:
    - serverIP: 10.100.0.4
  - name: hij.kubernetes.example.net
    nameservers:
    - serverIP: 10.100.0.2   
  googleAccess: private

 

The badly configured manifest file shown above can be corrected by grouping the domains that share a common suffix, as shown below.

 

# Rightly configured ClusterDNS manifest file: custom-clusterdns.yaml
apiVersion: networking.gke.io/v1alpha1
kind: ClusterDNS
metadata:
  name: default
spec:
  upstreamNameservers:
  - serverIP: 8.8.8.8
  - serverIP: 8.8.4.4
  domains:
  - name: kubernetes.example.net
    nameservers:
    - serverIP: 10.100.0.2
  - name: internal.example.net
    nameservers:
    - serverIP: 10.100.0.3
  - name: mock.net
    nameservers:
    - serverIP: 10.100.0.4
  googleAccess: private

 

  • Make sure that the Name Servers listed under spec.upstreamNameservers don’t return different answers for the same query. With that recommendation in mind, either Public or Private DNS Name Server entries can be provided.
  • Entries under spec.domains should only map domain suffixes to their corresponding Name Servers; serverIP is the address of a Name Server, not a DNS A Record answer for the domain, a confusion some users run into.
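To illustrate the last point (the domain and address below are placeholders), serverIP is the DNS server that answers queries for the domain suffix, not the address that the domain resolves to:

spec:
  domains:
  - name: corp.example.net
    nameservers:
    - serverIP: 10.0.0.53  # the DNS server answering for corp.example.net, not an A record for it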

Conclusion

The ClusterDNS Custom Resource is a feature natively supported by Anthos Clusters that enables users to configure cluster DNS provider(s). Because ClusterDNS is derived from CoreDNS but does not expose the full set of CoreDNS Corefile configurations, users needing advanced configurations may face challenges. In such cases, a user could configure CoreDNS directly (taking sole responsibility for its ownership, maintenance, and reliability); however, it is best to first check with the Google Cloud Support Team.

Hope you liked the article. Please feel free to post comments/questions, if any.

Version history
Last update: 07-26-2023 09:17 AM