HikariCP & RDS IAM Authentication in Clojure

Background

Most of the JDBC-using world uses some sort of connection pooling library. HikariCP is one of the more popular options. When backed by an RDS database, you may want to use IAM authentication. IAM authentication is great as an additional security measure since you’re able to rely on short-lived credentials all the way down. You’re also able to manage database access with a state-of-the-art permissions system — IAM. It’s excellent, and I highly recommend it. Getting it all configured in HikariCP does take some work, however.

The Solution (TL;DR)

Use the no-arg HikariDataSource constructor and set the underlying DataSource. This approach avoids all the nasty reflection and automatic initialization that occurs in the one-arg constructor and lets you configure all the JDBC connection parameters using a regular map instead of getters and setters.

The full solution code is available here.

How I got to The Solution

On the surface, RDS IAM authentication is simple. AWS provides a specification (SigV4) and optional SDKs to generate short-lived authentication tokens. A generated token is passed as the connection’s password and has a TTL of 15 minutes. Those familiar with the JDBC API have probably already worked out the implementation in their heads.

  1. Issue SDK call to generate authentication token.

  2. Get a connection from the datasource using the getConnection(String username, String password) method, passing the authentication token as the password.

That’s what I thought too. Turns out, Hikari does not implement that method overload. There’s most likely a good reason for that, but I do not know it.
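For reference, the two steps above can be sketched against a generic javax.sql.DataSource. This is only an illustration of the naive idea, not code from the original solution: connect-with-token, token-fn, and the recording stub are hypothetical names, and the stub exists purely so the sketch runs without a database.

```clojure
(import '(javax.sql DataSource))

(defn connect-with-token
  "Asks `ds` for a connection, passing a freshly generated token as the password.
  `token-fn` is a hypothetical zero-arg function that returns the token."
  [^DataSource ds user token-fn]
  (.getConnection ds user (token-fn)))

;; Stand-in DataSource that records the credentials it receives.
;; It returns nil instead of a real Connection.
(def seen-credentials (atom []))

(def recording-ds
  (reify DataSource
    (getConnection [_ user password]
      (swap! seen-credentials conj [user password])
      nil)))

(connect-with-token recording-ds "app_user" (constantly "token-1"))
```

With a datasource that supports the two-argument overload this would be the whole story; the rest of the post is about what to do when the pool does not.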

Since the naive approach wouldn’t work, I asked around and was pointed to this article: Configure Hikari Connection Pool when using AWS RDS IAM. It’s written for Java, but that’s okay. We’re Clojure programmers who fear no Java. The author confirms that the "Hikari DataSource doesn’t support authentication token as the datasource expects credentials for the lifetime of the datasource" and goes on to suggest a solution that extends the HikariDataSource class. Simple enough, and we can do it in Clojure. Below is a rough translation of the author’s Java solution.

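(import '(com.amazonaws.auth DefaultAWSCredentialsProviderChain)
        '(com.amazonaws.regions DefaultAwsRegionProviderChain)
        '(com.amazonaws.services.rds.auth GetIamAuthTokenRequest RdsIamAuthTokenGenerator))

(defn generate-auth-token
  "Returns a short-lived RDS IAM auth token for the :host and :user in db-spec."
  [db-spec]
  (let [generator (.build (doto (RdsIamAuthTokenGenerator/builder)
                            (.credentials (DefaultAWSCredentialsProviderChain.))
                            (.region (.getRegion (DefaultAwsRegionProviderChain.)))))
        request   (.build (doto (GetIamAuthTokenRequest/builder)
                            (.hostname (:host db-spec))
                            (.port 5432) ; PostgreSQL's default port, hard-coded here
                            (.userName (:user db-spec))))]
    (.getAuthToken generator request)))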

(defn new-datasource
  [db-url db-spec]
  (let [hconfig (doto (HikariConfig.)
                  (.setJdbcUrl db-url)
                  (.setDataSourceProperties
                    (doto (Properties.)
                      (.putAll {"user" (:user db-spec)}))))]
    (proxy [HikariDataSource] [hconfig]
      (getPassword [] (generate-auth-token db-spec)))))

And we give it a test…

(new-datasource db-url db-spec)
2022-03-27 06:18:28 nREPL-session-e26238de-915e-464d-a188-5549551ec395 [com.zaxxer.hikari.HikariDataSource] INFO - HikariPool-2 - Starting...
2022-03-27 06:18:29 nREPL-session-e26238de-915e-464d-a188-5549551ec395 [com.zaxxer.hikari.pool.HikariPool] ERROR - HikariPool-2 - Exception during pool initialization.
org.postgresql.util.PSQLException: The server requested password-based authentication, but no password was provided.
 at org.postgresql.core.v3.ConnectionFactoryImpl.doAuthentication (ConnectionFactoryImpl.java:670)
    org.postgresql.core.v3.ConnectionFactoryImpl.tryConnect (ConnectionFactoryImpl.java:163)
    org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl (ConnectionFactoryImpl.java:215)
    org.postgresql.core.ConnectionFactory.openConnection (ConnectionFactory.java:51)
    org.postgresql.jdbc.PgConnection.<init> (PgConnection.java:225)
    org.postgresql.Driver.makeConnection (Driver.java:466)
    org.postgresql.Driver.connect (Driver.java:265)
    com.zaxxer.hikari.util.DriverDataSource.getConnection (DriverDataSource.java:121)
    com.zaxxer.hikari.pool.PoolBase.newConnection (PoolBase.java:359)
    com.zaxxer.hikari.pool.PoolBase.newPoolEntry (PoolBase.java:201)
    com.zaxxer.hikari.pool.HikariPool.createPoolEntry (HikariPool.java:470)
    com.zaxxer.hikari.pool.HikariPool.checkFailFast (HikariPool.java:561)
    com.zaxxer.hikari.pool.HikariPool.<init> (HikariPool.java:100)
    com.zaxxer.hikari.HikariDataSource.<init> (HikariDataSource.java:81)
    cs.dev.analytics.job_runner.proxy$com.zaxxer.hikari.HikariDataSource$ff19274a.<init> (:-1)
    cs.dev.analytics.job_runner$new_datasource.invokeStatic (job_runner.clj:465)
    cs.dev.analytics.job_runner$new_datasource.invoke (job_runner.clj:461)
    cs.dev.analytics.job_runner$eval134534.invokeStatic (job_runner.clj:483)
    cs.dev.analytics.job_runner$eval134534.invoke (job_runner.clj:483)
    clojure.lang.Compiler.eval (Compiler.java:7181)
    clojure.lang.Compiler.eval (Compiler.java:7136)
    clojure.core$eval.invokeStatic (core.clj:3202)
    clojure.core$eval.invoke (core.clj:3198)
    nrepl.middleware.interruptible_eval$evaluate$fn__993$fn__994.invoke (interruptible_eval.clj:87)
    clojure.lang.AFn.applyToHelper (AFn.java:152)
    clojure.lang.AFn.applyTo (AFn.java:144)
    clojure.core$apply.invokeStatic (core.clj:667)
    clojure.core$with_bindings_STAR_.invokeStatic (core.clj:1977)
    clojure.core$with_bindings_STAR_.doInvoke (core.clj:1977)
    clojure.lang.RestFn.invoke (RestFn.java:425)
    nrepl.middleware.interruptible_eval$evaluate$fn__993.invoke (interruptible_eval.clj:87)
    clojure.main$repl$read_eval_print__9110$fn__9113.invoke (main.clj:437)
    clojure.main$repl$read_eval_print__9110.invoke (main.clj:437)
    clojure.main$repl$fn__9119.invoke (main.clj:458)
    clojure.main$repl.invokeStatic (main.clj:458)
    clojure.main$repl.doInvoke (main.clj:368)
    clojure.lang.RestFn.invoke (RestFn.java:1523)
    nrepl.middleware.interruptible_eval$evaluate.invokeStatic (interruptible_eval.clj:84)
    nrepl.middleware.interruptible_eval$evaluate.invoke (interruptible_eval.clj:56)
    nrepl.middleware.interruptible_eval$interruptible_eval$fn__1024$fn__1028.invoke (interruptible_eval.clj:152)
    clojure.lang.AFn.run (AFn.java:22)
    nrepl.middleware.session$session_exec$main_loop__1092$fn__1096.invoke (session.clj:202)
    nrepl.middleware.session$session_exec$main_loop__1092.invoke (session.clj:201)
    clojure.lang.AFn.run (AFn.java:22)
    java.lang.Thread.run (Thread.java:829)

Execution error (PSQLException) at org.postgresql.core.v3.ConnectionFactoryImpl/doAuthentication (ConnectionFactoryImpl.java:670).
The server requested password-based authentication, but no password was provided.

This is turning into quite the adventure, isn’t it? I added a println in the overridden getPassword method and discovered it was not getting called. Odd. I did some digging and found that I, perhaps, should have feared Java. While the HikariDataSource(HikariConfig configuration) constructor is the one documented in the Hikari documentation, it goes off the rails with mutations. Not only does the constructor create the underlying JDBC connections, but it also does some reflection to copy information from the passed HikariConfig to the current class 🤯 At this point, I am thinking all I must do is find a solution and hide this code in the darkest corner, praying to never gaze upon it again.
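A plausible contributor to the uncalled override (an assumption here, not something verified against Hikari itself) is how Clojure builds proxy instances. The superclass constructor runs first and the method overrides are attached afterwards, so any overridable method the constructor calls falls through to the superclass implementation. The sketch below shows that behavior with java.lang.Exception, whose constructor calls fillInStackTrace; all names are illustrative.

```clojure
(def overridden-calls (atom 0))

(def ex
  ;; Throwable's constructor calls fillInStackTrace, which we override here.
  (proxy [Exception] ["boom"]
    (fillInStackTrace []
      (swap! overridden-calls inc)
      this)))

;; The override was not yet attached while the constructor ran,
;; so this is expected to be 0.
(def calls-during-construction @overridden-calls)

;; Once construction has finished, the override is in effect.
(.fillInStackTrace ex)

(def calls-after-explicit-call @overridden-calls)
```

If the same thing happens with HikariDataSource, a getPassword override would be invisible to anything the one-arg constructor does, which would be consistent with the "no password was provided" error above.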

I see we have another constructor, a no-arg one, that doesn’t appear to do anything crazy. The docstring indicates there could be some performance implications, but it’s not clear to what degree and when those performance impacts are incurred. We can slightly alter our new-datasource implementation to use the no-arg constructor instead.

(defn new-datasource
  [db-url db-spec]
  (doto (proxy [HikariDataSource] []
          (getPassword [] (generate-auth-token db-spec)))
    (.setJdbcUrl db-url)
    (.setDataSourceProperties
      (doto (Properties.)
        (.putAll {"user" (:user db-spec)})))))

And now let’s test it…

(.getConnection (new-datasource db-url db-spec))
2022-03-27 06:43:15 nREPL-session-e26238de-915e-464d-a188-5549551ec395 [com.zaxxer.hikari.HikariDataSource] INFO - HikariPool-7 - Starting...
2022-03-27 06:43:16 nREPL-session-e26238de-915e-464d-a188-5549551ec395 [com.zaxxer.hikari.pool.HikariPool] INFO - HikariPool-7 - Added connection org.postgresql.jdbc.PgConnection@582c46b8
2022-03-27 06:43:16 nREPL-session-e26238de-915e-464d-a188-5549551ec395 [com.zaxxer.hikari.HikariDataSource] INFO - HikariPool-7 - Start completed.
=>
#object[com.zaxxer.hikari.pool.HikariProxyConnection
        0x352f2226
        "HikariProxyConnection@892281382 wrapping org.postgresql.jdbc.PgConnection@582c46b8"]

Woot, it works! Or so it appears… Yes, this connection does work, and yes, I did deploy it to staging, believing this problem to be dealt with, and signed off for the day.

The next day I observed authentication failures in our staging environment that looked like "FATAL: PAM authentication failed for user "<elided>"". If you recall, dear reader, our tokens have a TTL. My assumption was that once the TTL expired, the getPassword method would get invoked, we’d get a new authentication token, and everything would be hunky-dory. That, as it turned out, was not what happened. To this day, I do not have a theory as to why we were seeing this other than some (waves hands in air) vague notion that Hikari is doing some sketchy stuff with reflection, caches are probably involved, and/or those things may not mix well with Clojure’s proxy.

Fortunately, with some sleep, I happened upon a much simpler solution. We can get rid of all the HikariDataSource extension and just deal with a regular old JDBC datasource.

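(import '(com.zaxxer.hikari HikariDataSource)
        '(java.sql DriverManager)
        '(java.util Properties)
        '(javax.sql DataSource))

(defn new-data-source
  [db-url db-spec]
  (let [get-conn (fn []
                   ;; Build fresh Properties for every physical connection so
                   ;; each one authenticates with a newly generated token.
                   (DriverManager/getConnection
                     db-url
                     (doto (Properties.)
                       (.putAll (cond-> {"password" (generate-auth-token db-spec)}
                                  (:user db-spec) (assoc "user" (:user db-spec)))))))
        base-datasource (reify DataSource
                          (getConnection [_] (get-conn))
                          ;; Not used in this setup; returns nil.
                          (getConnection [_ user password])
                          ;; Must return an int; 0 means the default timeout.
                          (getLoginTimeout [_] 0)
                          (setLoginTimeout [_ seconds]))]
    ;; The no-arg constructor defers pool creation until first use.
    (doto (HikariDataSource.)
      (.setDataSource base-datasource))))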
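To check the key property of this approach (a new token for every physical connection) without a database or AWS credentials, the same idea can be sketched with the connection and token functions injected. This is a minimal sketch under stated assumptions: token-refreshing-datasource, connect-fn, token-fn, and the stubs are hypothetical names, not part of the code above.

```clojure
(import '(java.util Properties)
        '(javax.sql DataSource))

(defn token-refreshing-datasource
  "Returns a DataSource that builds fresh Properties, including a newly
  generated password, for every connection it hands out.
  `connect-fn` (hypothetical) takes a URL and Properties and opens a connection;
  `token-fn` (hypothetical) takes no args and returns a token."
  [db-url user connect-fn token-fn]
  (reify DataSource
    (getConnection [_]
      (connect-fn db-url
                  (doto (Properties.)
                    (.setProperty "user" user)
                    (.setProperty "password" (token-fn)))))))

;; Stubs so the sketch runs standalone.
(def issued (atom 0))
(def passwords-used (atom []))

(defn stub-token []
  (str "token-" (swap! issued inc)))

(defn stub-connect [_url ^Properties props]
  ;; Records the password it was given; returns nil instead of a Connection.
  (swap! passwords-used conj (.getProperty props "password"))
  nil)

(def ^DataSource ds
  (token-refreshing-datasource "jdbc:postgresql://example.invalid/db"
                               "app_user" stub-connect stub-token))

(.getConnection ds)
(.getConnection ds)
```

In a real setup connect-fn would delegate to java.sql.DriverManager/getConnection and token-fn to generate-auth-token, with the resulting DataSource handed to HikariDataSource via setDataSource as shown above.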

After marinating in staging for well past the TTL, everything was working as expected. Code very similar to this has been deployed to production for some months with no hiccup. If you’ve made it this far, I hope you have enjoyed the pain I went through, and that you might find some use in it.

Written on 2022-03-31