In my bubble it has always been common knowledge that a database connection over a Unix socket is faster than the same connection over TCP.

But does this boat still float when using Google's recommended way to access Cloud SQL from GKE: the Cloud SQL Proxy running as a sidecar in the pod? Either way the proxy tunnels all traffic to Cloud SQL over TCP; the only difference is the local hop between the application and the proxy (Unix socket vs. loopback TCP), so the speed should be pretty much identical.

So to save everyone a little time, I did some very minimal benchmarking:

Setup

(If you want to replicate this, remember to fill in <connection_name>, <username> and <dbname>, and to provide the database password, e.g. via a PGPASSWORD environment variable.)
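
Both deployments mount the proxy's service-account key from a Kubernetes secret called service-account-credentials. A minimal way to create it, assuming you have downloaded the JSON key locally as key.json (the file name is just an example):

kubectl create secret generic service-account-credentials \
  --from-file=service-account-credentials.json=key.json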

Unix Socket

apiVersion: apps/v1
kind: Deployment
metadata:
  name: pgbench
spec:
  selector:
    matchLabels:
      app: pgbench
  template:
    metadata:
      labels:
        app: pgbench
    spec:
      containers:
        - name: cloudsql-proxy
          image: gcr.io/cloudsql-docker/gce-proxy
          command: ["/cloud_sql_proxy",
                    "--dir=/cloudsql",
                    "-instances=<connection_name>",
                    "-credential_file=/secrets/cloudsql/service-account-credentials.json"]
          volumeMounts:
            - name: cloudsql-instance-credentials
              mountPath: /secrets/cloudsql
              readOnly: true
            - name: cloudsql-sockets
              mountPath: /cloudsql
        - name: pgbench
          image: postgres
          command: ["pgbench", "-h", "/cloudsql", "-U", "<username>", "-d", "<dbname>", "-c", "5", "-T", "30"]
          volumeMounts:
            - name: cloudsql-sockets
              mountPath: /cloudsql
      volumes:
        - name: cloudsql-instance-credentials
          secret:
            secretName: service-account-credentials
        - name: cloudsql-sockets
          emptyDir: {}
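
To run a test, apply the manifest and follow the pgbench output in the benchmark container's logs (pgbench-socket.yaml is just an example file name):

kubectl apply -f pgbench-socket.yaml
kubectl logs -f deployment/pgbench -c pgbench

Since the pgbench container exits when its 30-second run finishes, Kubernetes restarts it and the run repeats; each repetition shows up in the logs as a new block of output.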

TCP Port

apiVersion: apps/v1
kind: Deployment
metadata:
  name: pgbench
spec:
  selector:
    matchLabels:
      app: pgbench
  template:
    metadata:
      labels:
        app: pgbench
    spec:
      containers:
        - name: cloudsql-proxy
          image: gcr.io/cloudsql-docker/gce-proxy
          command: ["/cloud_sql_proxy",
                    "-instances=<connection_name>=tcp:127.0.0.1:5432",
                    "-credential_file=/secrets/cloudsql/service-account-credentials.json"]
          volumeMounts:
            - name: cloudsql-instance-credentials
              mountPath: /secrets/cloudsql
              readOnly: true
        - name: pgbench
          image: postgres
          command: ["pgbench", "-h", "localhost:5432", "-U", "<username>", "-d", "<dbname>", "-c", "5", "-T", "30"]
      volumes:
        - name: cloudsql-instance-credentials
          secret:
            secretName: service-account-credentials
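
Note that pgbench needs its tables (pgbench_accounts, pgbench_branches, and so on) to exist before a timed run. One simple way to create them, if the database hasn't been initialized yet, is to run the benchmark container once with pgbench's initialization mode instead of the timed run, e.g. for the TCP variant:

command: ["pgbench", "-i", "-h", "127.0.0.1", "-p", "5432", "-U", "<username>", "-d", "<dbname>"]

and switch back to the timed command afterwards.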

Benchmark

Unix Socket

First test:

transaction type: TPC-B (sort of)
scaling factor: 1
query mode: simple
number of clients: 5
number of threads: 1
duration: 30 s
number of transactions actually processed: 10410
latency average: 14.415 ms
tps = 346.862388 (including connections establishing)
tps = 347.051570 (excluding connections establishing)

Second test:

transaction type: TPC-B (sort of)
scaling factor: 1
query mode: simple
number of clients: 5
number of threads: 1
duration: 30 s
number of transactions actually processed: 9852
latency average: 15.321 ms
tps = 326.351848 (including connections establishing)
tps = 326.511321 (excluding connections establishing)

TCP

First test:

transaction type: TPC-B (sort of)
scaling factor: 1
query mode: simple
number of clients: 5
number of threads: 1
duration: 30 s
number of transactions actually processed: 9496
latency average: 15.803 ms
tps = 316.396207 (including connections establishing)
tps = 316.564728 (excluding connections establishing)

Second test:

transaction type: TPC-B (sort of)
scaling factor: 1
query mode: simple
number of clients: 5
number of threads: 1
duration: 30 s
number of transactions actually processed: 9474
latency average: 15.840 ms
tps = 315.652811 (including connections establishing)
tps = 315.813361 (excluding connections establishing)

Conclusion

In my 2(!) tests per setup, sockets come out roughly 6–7% faster, which surprised me a bit. I expected the Cloud SQL Proxy tunnel to equalize the performance, but the local hop still makes a measurable difference.
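
For reference, averaging the TPS numbers above (including connection establishment):

Unix socket: (346.86 + 326.35) / 2 ≈ 336.6 tps
TCP:         (316.40 + 315.65) / 2 ≈ 316.0 tps
336.6 / 316.0 ≈ 1.065, i.e. roughly 6.5% in favor of the socket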

I leave it up to you to decide whether that’s enough to switch if you’re currently using TCP. You should probably benchmark it in your own cluster with your actual workload, but at least now you know what to expect.