DNS Gotchas With CockroachDB and GSS-API
In this post, learn how the cockroach binary connects through GSS-API, see how a recent release changes the behavior of the native client, and more.
We just pushed a new release of CockroachDB, and Postgres also had a recent release for a vulnerability impacting GSS. I figured it was as good a time as any to update my repos with the latest versions of Postgres and Cockroach, and thereby test that everything works. I discovered an issue that is easily fixed but changes the behavior of the cockroach and psql clients.
High-Level Steps
- Start a three-node CockroachDB cluster in Docker with GSSAPI.
- Demonstrate the problem scenario.
- Verify.
Step by Step Instructions
Start a Cluster
There's nothing more special about this tutorial than what was covered in my previous tutorials. Feel free to set up a stand-alone environment to follow along or use my docker-compose environment.
Demonstrate the Problem Scenario
Using the psql client, we should be able to connect to CockroachDB with no fuss.
psql "postgresql://lb:26257/defaultdb?sslmode=verify-full&sslrootcert=/certs/ca.crt" -U tester
psql (14.1, server 13.0.0)
SSL connection (protocol: TLSv1.3, cipher: TLS_AES_128_GCM_SHA256, bits: 128, compression: off)
Type "help" for help.

defaultdb=>
We can accomplish the same thing by passing keyword parameters instead of a URL.
psql "host=lb port=26257 dbname=defaultdb user=tester"
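The URL and keyword styles accept the same settings. Note that the keyword command above sets neither sslmode nor sslrootcert, so libpq falls back to its defaults there (sslmode=prefer unless overridden, for example by PGSSLMODE). The sketch below builds both styles from one set of values so the TLS settings stay identical; pg_keywords and pg_url are hypothetical helper names, not psql features.

```shell
# Build the two libpq connection-string styles from the same settings.
# Arguments: $1=host $2=port $3=dbname $4=user $5=sslmode $6=sslrootcert
# pg_keywords and pg_url are illustrative helpers (assumptions, not part of psql).
pg_keywords() {
  printf 'host=%s port=%s dbname=%s user=%s sslmode=%s sslrootcert=%s\n' \
    "$1" "$2" "$3" "$4" "$5" "$6"
}

pg_url() {
  printf 'postgresql://%s@%s:%s/%s?sslmode=%s&sslrootcert=%s\n' \
    "$4" "$1" "$2" "$3" "$5" "$6"
}

# Example use (needs a reachable cluster, so it is left commented out):
#   psql "$(pg_keywords lb 26257 defaultdb tester verify-full /certs/ca.crt)"
```

Either string can be handed to psql as its single connection argument.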
If we attempt to connect to CockroachDB with the native cockroach client, we now get an error:
cockroach sql \
  --certs-dir=/certs --url "postgresql://tester:nopassword@lb:26257/defaultdb?sslmode=verify-full&sslrootcert=/certs/ca.crt"
#
# Welcome to the CockroachDB SQL shell.
# All statements must be terminated by a semicolon.
# To exit, type: \q.
#
ERROR: pq: failed to get Kerberos ticket: "lookup lb: DNS response contained records which contain invalid names"
Failed running "sql"
After various searches, the suggestions pointed toward DNS. I experimented with the PGHOST and PGHOSTADDR environment variables to disable the DNS lookup; the latter breaks the cockroach client. After all my failed attempts, I decided to remove the entry for lb from /etc/hosts.
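Before and after editing the hosts file, it helps to confirm what it actually maps. The sketch below prints the address a hosts-format file assigns to a name, ignoring comments; hosts_entries is a hypothetical helper written for this illustration. On Linux, getent hosts lb shows what the system resolver returns (hosts file and DNS combined), whereas dig queries DNS only.

```shell
# hosts_entries NAME [FILE]
# Print "ADDRESS NAME" for every line of a hosts-format file that maps NAME.
# FILE defaults to /etc/hosts. Illustrative helper, not a standard tool.
hosts_entries() {
  awk -v name="$1" '
    { sub(/#.*/, "") }   # strip comments; awk re-splits the fields afterwards
    {
      for (i = 2; i <= NF; i++)
        if ($i == name) { print $1 " " name; break }
    }
  ' "${2:-/etc/hosts}"
}

# Example use on the client container (commented out; output depends on the host):
#   hosts_entries lb
#   getent hosts lb
```

An empty result from hosts_entries means the name is resolved, if at all, by DNS alone.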
Verify
With the hosts file updated:
[root@client cockroach]# cockroach sql --certs-dir=/certs --url "postgresql://tester:nopassword@lb:26257/defaultdb?sslmode=verify-full&sslrootcert=/certs/ca.crt"
#
# Welcome to the CockroachDB SQL shell.
# All statements must be terminated by a semicolon.
# To exit, type: \q.
#
# Server version: CockroachDB CCL v21.2.0 (x86_64-unknown-linux-gnu, built 2021/11/15 13:58:04, go1.16.6) (same version as client)
# Cluster ID: 52be92b5-8fdf-41d9-889c-004b87e82a4e
# Organization: Cockroach Labs - Production Testing
#
# Enter \? for a brief introduction.
#
tester@lb:26257/defaultdb>
We can also use a simplified version of that command.
cockroach sql --certs-dir=/certs --host=lb --user=tester
Now let's see the opposite effect with psql.
[root@client cockroach]# psql "host=lb port=26257 dbname=defaultdb user=tester"
psql: error: connection to server at "lb" (172.28.0.6), port 26257 failed: connection to server at "lb" (172.28.0.6), port 26257 failed: GSSAPI continuation error: Unspecified GSS failure. Minor code may provide more information: Server krbtgt/COCKROACH-GSSAPI-MULTINODE_ROACHNET@EXAMPLE.COM not found in Kerberos database
By the way, psql expects an entry in the hosts file for DNS lookup. CockroachDB also requires hostnames to be resolvable via DNS or via /etc/hosts.
When using hostnames, make sure they resolve properly (e.g., via DNS or /etc/hosts). In particular, be careful about the value advertised to other nodes, either via --advertise-addr or via --listen-addr when --advertise-addr is not specified.
Given the current situation, a user must decide whether to proceed with psql, in which case they have to keep an entry in the hosts file for the service principal, or to use the cockroach client without an associated hosts entry.
UPDATE: I looked into this further. Following a hint from one of our distinguished engineers that lb might not be a valid hostname, I started experimenting.
I installed bind-utils and reviewed the output of dig lb:
; <<>> DiG 9.11.26-RedHat-9.11.26-6.el8 <<>> lb
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 24522
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;lb.                            IN      A

;; ANSWER SECTION:
lb.                     600     IN      A       172.28.0.6

;; Query time: 0 msec
;; SERVER: 127.0.0.11#53(127.0.0.11)
;; WHEN: Wed Nov 24 16:41:04 UTC 2021
;; MSG SIZE  rcvd: 38
It looks like gibberish and is a bit over my head to understand, so I took a different approach.
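For what it's worth, the parts of that output that matter here are few: the status field (NOERROR means the query succeeded), the single A record in the answer section mapping lb. to 172.28.0.6, and the responding server 127.0.0.11, which is Docker's embedded DNS. A minimal sketch that boils dig output down to those fields follows; dig_summary is a hypothetical helper name, and it only handles the layout shown above.

```shell
# dig_summary: read dig output on stdin and print the response status
# plus one line per record in the ANSWER SECTION (name, type, data).
# Illustrative helper; it assumes the default dig layout shown in the article.
dig_summary() {
  awk '
    /status:/ {
      for (i = 1; i <= NF; i++)
        if ($i == "status:") { s = $(i + 1); sub(/,$/, "", s); print "status=" s }
    }
    /ANSWER SECTION/ { in_answer = 1; next }
    in_answer && (NF == 0 || /^;/) { in_answer = 0 }   # section ends at a blank or ";" line
    in_answer { print "answer=" $1 " " $4 " " $5 }
  '
}

# Example use (commented out because it needs dig and a resolver):
#   dig lb | dig_summary
```

For a quick look without any parsing, dig +short lb prints only the answer data.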
Looking at the original error message (ERROR: pq: failed to get Kerberos ticket: "lookup lb: DNS response contained records which contain invalid names"), I found the corresponding code in the Go repo.
// errMalformedDNSRecordsDetail is the DNSError detail which is returned when a Resolver.Lookup...
// method receives DNS records which contain invalid DNS names. This may be returned alongside
// results which have had the malformed records filtered out.
The intuition about a malformed hostname still holds. I decided to change the lb hostname to lb.local everywhere the name is mentioned, and I'm happy to report that, without having to remove the entry from /etc/hosts, it works with both clients.
psql "host=lb.local port=26257 dbname=defaultdb user=tester"
psql (14.1, server 13.0.0)
SSL connection (protocol: TLSv1.3, cipher: TLS_AES_128_GCM_SHA256, bits: 128, compression: off)
Type "help" for help.

defaultdb=>

psql "postgresql://lb.local:26257/defaultdb?sslmode=verify-full&sslrootcert=/certs/ca.crt" -U tester
psql (14.1, server 13.0.0)
SSL connection (protocol: TLSv1.3, cipher: TLS_AES_128_GCM_SHA256, bits: 128, compression: off)
Type "help" for help.

defaultdb=>

cockroach sql --host=lb.local --certs-dir=/certs --user tester
#
# Welcome to the CockroachDB SQL shell.
# All statements must be terminated by a semicolon.
# To exit, type: \q.
#
# Server version: CockroachDB CCL v21.2.0 (x86_64-unknown-linux-gnu, built 2021/11/15 13:58:04, go1.16.6) (same version as client)
# Cluster ID: f721b8dc-66a5-4909-82b4-a8c18738a21d
# Organization: Cockroach Labs - Production Testing
#
# Enter \? for a brief introduction.
#
tester@lb.local:26257/defaultdb>

cockroach sql \
  --certs-dir=/certs --url "postgresql://tester:nopassword@lb.local:26257/defaultdb?sslmode=verify-full&sslrootcert=/certs/ca.crt&krbsrvname=customspn"
#
# Welcome to the CockroachDB SQL shell.
# All statements must be terminated by a semicolon.
# To exit, type: \q.
#
# Server version: CockroachDB CCL v21.2.0 (x86_64-unknown-linux-gnu, built 2021/11/15 13:58:04, go1.16.6) (same version as client)
# Cluster ID: f721b8dc-66a5-4909-82b4-a8c18738a21d
# Organization: Cockroach Labs - Production Testing
#
# Enter \? for a brief introduction.
#
tester@lb.local:26257/defaultdb>
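Renaming a host everywhere it is mentioned is easy to get partly wrong, since the name can appear in compose files, load balancer configuration, certificates, and Kerberos principals. A minimal sketch for listing every whole-word occurrence of a name under a directory is shown below; find_host_refs is a hypothetical helper, and which directory to scan is an assumption that depends on your own environment.

```shell
# find_host_refs NAME DIR
# List whole-word, fixed-string occurrences of NAME in files under DIR,
# as path:line:content. Illustrative helper, not part of any tool.
# Note: "lb.local" also counts as a hit for "lb", because "." ends a word.
find_host_refs() {
  grep -rnwF -- "$1" "$2"
}

# Example use (commented out; the directory is an assumption):
#   find_host_refs lb .
```

Running it before and after the rename makes it easier to spot a stale reference to the old name.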
Finally, what still baffles me is that I can do the following:
psql "host=lb port=26257 dbname=defaultdb user=tester"
psql (14.1, server 13.0.0)
SSL connection (protocol: TLSv1.3, cipher: TLS_AES_128_GCM_SHA256, bits: 128, compression: off)
Type "help" for help.

defaultdb=>
Notice host=lb, but I cannot do this:
psql "postgresql://lb:26257/defaultdb?sslmode=verify-full&sslrootcert=/certs/ca.crt" -U tester
psql: error: connection to server at "lb" (172.28.0.6), port 26257 failed: connection to server at "lb" (172.28.0.6), port 26257 failed: server certificate for "roach-0" (and 1 other name) does not match host name "lb"
... but this is for another day.
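As a possible starting point for that investigation: the failing command is the one that requests sslmode=verify-full, the mode in which libpq compares the hostname with the names in the server certificate, and the error says the certificate lists roach-0 and one other name. The sketch below extracts the DNS names from the text dump of a certificate so they can be compared with the hostname in use. san_dns_names is a hypothetical helper, and /certs/node.crt in the usage comment is an assumed path that does not appear in this article.

```shell
# san_dns_names: read `openssl x509 -noout -text` output on stdin and print
# each DNS entry of the Subject Alternative Name extension, one per line.
# Illustrative helper; it assumes the SAN values sit on the line after the header.
san_dns_names() {
  awk '
    /Subject Alternative Name/ {
      getline                      # the values are on the following line
      gsub(/ /, "")
      n = split($0, parts, ",")
      for (i = 1; i <= n; i++)
        if (sub(/^DNS:/, "", parts[i])) print parts[i]
    }
  '
}

# Example use (commented out; the certificate path is an assumption):
#   openssl x509 -in /certs/node.crt -noout -text | san_dns_names
```

If the hostname passed to psql is missing from that list, a verify-full connection to it would be expected to fail in the way shown above.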
Articles Covering CockroachDB and Kerberos
I find the topic of Kerberos very interesting, and my colleagues commonly come to me for help with this complex topic. I am by no means an expert at Kerberos; I am, however, familiar enough with it to be dangerous. That said, I've written multiple articles on the topic, which you may find on my profile.
Published at DZone with permission of Artem Ervits. See the original article here.