Username & Password
🚧 Feature has not been tested. Proceed at your own risk. 🚧
Schema
Users
```ts
import "server-only"; // <- if you are using react server components

import { pgTable, timestamp, uuid, varchar } from "drizzle-orm/pg-core";

export const users = pgTable("users", {
  id: uuid("id").primaryKey().defaultRandom(),
  createdAt: timestamp("created_at", { precision: 3, withTimezone: true })
    .notNull()
    .defaultNow(),
  updatedAt: timestamp("updated_at", { precision: 3, withTimezone: true })
    .notNull()
    .defaultNow(),
  username: varchar("username", { length: 16 }).notNull().unique(),
  passwordHash: varchar("password_hash", { length: 97 }).notNull(),
});
```
Why use a varchar(n) instead of text?
In PostgreSQL, varchar allows you to specify a maximum length for the column without blank-character padding.
Setting a character limit is mostly helpful for saving you from yourself. Imagine you have a web app and an iOS/Android one. If they rely on separate input validation processes, then you run the risk of them getting out of sync. Setting a restriction on the database will enforce a very basic level of data normalization to the input.
There is a security concern outlined below, but if a hacker can bypass your server validation, then character limits are probably not a robust enough solution to prevent damage.
With no character limit, the size of the input could be as large as the column allows. The maximum size for a single field in PostgreSQL is 1 GB. That could be exploited by a bad actor to use extra space and processing in your database.
Why use { length: 16 } for username?
Specifying the actual number to use for the limit comes down to personal preference and UI constraints. I chose 16 somewhat arbitrarily. For comparison, Twitter/X uses 15, TikTok uses 24, and Instagram uses 30.
If you are building a system where actual names are used for the username (ex: [firstname]_[lastname]), then a longer limit may be appropriate.
Why use { length: 97 } for password_hash?
For this recommendation, I assume that passwords will be run through argon2id with recommended settings before being sent to the database. The total output should be 97 characters long if the hash length is set to 32 bytes (the digest will be 43 characters).
Ideally, you should use the exact length of the output of whatever hashing algorithm you are using. But if you aren’t sure, text is an adequate choice.
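A quick way to sanity-check the 97-character figure: argon2id output follows the PHC string format, and with the settings used later in this guide (m=19456, t=2, p=1, a 16-byte salt, and a 32-byte hash) the pieces add up to exactly 97 characters. The salt and digest below are placeholders, not real hash output.

```typescript
// Unpadded base64 encodes every 3 bytes as 4 characters, rounding up.
const saltB64Length = Math.ceil((16 * 8) / 6); // 16-byte salt -> 22 chars
const digestB64Length = Math.ceil((32 * 8) / 6); // 32-byte hash -> 43 chars

// PHC format: $argon2id$v=19$m=19456,t=2,p=1$<salt>$<digest>
const phcExample = [
  "", // leading "$" comes from the join
  "argon2id",
  "v=19",
  "m=19456,t=2,p=1",
  "A".repeat(saltB64Length), // placeholder salt
  "B".repeat(digestB64Length), // placeholder digest
].join("$");

console.log(phcExample.length); // 97
```

If you change outputLen or the argon2 cost parameters, the total length changes too, which is why measuring your actual output is the safest approach.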
Why not use pgcrypto crypt()?
pgcrypto provides a built-in crypt() function that hashes passwords. There are four main reasons wunshot opts not to use it:
- Relying on a database function for hashing means that the password would need to remain in plaintext while in transit. (All transit between your app and the database should be encrypted, but that’s a different topic.)
- The recommended argon2id algorithm is not available in pgcrypto.
- pgcrypto integration with Drizzle ORM is not straightforward. It would involve a lot of magic sql.
- Hashing in the server/application layer continues to just work if a different RDBMS is used.
Sessions
In addition to simply storing session data, I had some specific objectives for the sessions implementation:
- Associate “unauthenticated” sessions with a user when they sign in (the sessionId for unauthenticated users is created on-the-fly on the server, not in the database)
- Track failed sign-in attempts for basic rate-limiting without relying on Redis or the server’s memory
- Look up “known” ip addresses associated with a user
- Look up deleted sessions associated with a current user
- Look up deleted sessions associated with a deleted user
- Provide reasonable defaults for handling multi-device sessions (this will be a separate section; the default is to allow only a single session per user)
The following schemas should facilitate all of those use cases.
```ts
import { pgEnum, pgTable, text, timestamp, uuid } from "drizzle-orm/pg-core";

import { users } from "@/db/users/schema";

export const sessions = pgTable("sessions", {
  id: uuid("id").primaryKey().defaultRandom(),
  userId: uuid("user_id")
    .notNull()
    .references(() => users.id, { onDelete: "cascade", onUpdate: "cascade" }),
  expiresAt: timestamp("expires_at", { precision: 3, withTimezone: true })
    .notNull()
    .defaultNow(),
});

export const sessionEventType = pgEnum("session_event_type", [
  "CREDENTIALS_FAILED",
  "SESSION_CREATED",
  "SESSION_RENEWED",
  "SESSION_DELETED",
  "UNAUTHENTICATED_SESSION_CONVERTED",
]);

export const sessionLogs = pgTable("session_logs", {
  id: uuid("id").primaryKey().defaultRandom(),
  sessionId: uuid("session_id").notNull(),
  userId: uuid("user_id").notNull(),
  eventType: sessionEventType("event_type").notNull(),
  eventMessage: text("event_message"),
  ipAddress: text("ip_address").notNull(), // or varchar("ip_address", { length: 45 }) if you prefer
  userAgent: text("user_agent").notNull(), // or varchar("user_agent", { length: 16384 }) if you prefer
  origin: text("origin").notNull(), // or varchar("origin", { length: 16384 }) if you prefer
  timestamp: timestamp("timestamp", { precision: 3, withTimezone: true })
    .notNull()
    .defaultNow(),
});
```
Why isn’t there an unauthenticatedSessions table?
The assumption wunshot makes is that you want as few trips to the database as possible. Therefore, storing data for unauthenticated users in the database is not a default.
If the needs of your application are such that you need to store data for unauthenticated sessions, you can add an unauthenticatedSessions table. Be aware that it will increase the number of trips to the database, and more advanced rate-limiting may be necessary to prevent possible attacks.
Why does the deletedUnauthenticatedSessions table use arrays?
Unauthenticated user storage should be flexible because if the session cookie is deleted, then there is no way to recover it.
Imagine a scenario where an unauthenticated user has a shopping cart attached to their session. They are browsing on a mobile device and switch between wifi and cellular data. If the session cookie is locked to a single ip address, then the shopping cart will be lost.
Why isn’t there a sessionValidationSuccessLogs table?
I originally intended to use only three tables: sessions, sessionValidationFailureLogs, and sessionValidationSuccessLogs. The benefit was that several of the queries were simple and didn’t need to rely on any joins. But the simplicity of the queries came at the cost of complexity in the models.
Particularly, handling session renewal and the deletion of unauthorized sessions added a number of permutations to what data would or would not be available at the time of logging. Those permutations would only increase over time.
For example, imagine an instance where you wanted a shopping cart in the session: do you include the cart items in the log? Even if the answer is probably “no”, you have to spend extra time thinking about that decision and its repercussions.
One option I explored was to use a jsonb column with metadata about whichever event was logged. But that requires the overhead of writing additional schemas/validations for the JSON of each event type.
The schemas as they are now are still subject to change, but I think they provide a good balance between performance, flexibility, and reasonability.
Why use a uuid as the primary key for sessions?
According to OWASP, a good session id must:
- Not contain any information about the user or framework
- Use at least 64 bits of entropy
Using Postgres’s built-in functionality has some advantages over third-party random id generators:
- Any application that connects to the database will automatically use the same random id generator
- The uuid format can be enforced by the database
- uuidv4 is ubiquitous, so any connected applications will likely be able to generate ids on-the-fly if they need to
Overall, it’s a good compromise between performance, security, and consistency.
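For illustration, Node’s built-in crypto.randomUUID() produces the same UUIDv4 format that Postgres’s gen_random_uuid() does, which is what makes on-the-fly generation in connected applications straightforward. A v4 UUID carries 122 random bits (6 of its 128 bits are fixed by the version and variant fields), comfortably above OWASP’s 64-bit minimum:

```typescript
import { randomUUID } from "node:crypto";

// Generate a UUIDv4 the same way any Node service connected to the
// database could, without a third-party library.
const id = randomUUID();

// The version nibble is always "4" and the variant nibble is 8, 9, a, or b.
const uuidV4Pattern =
  /^[0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$/;

console.log(uuidV4Pattern.test(id)); // true
```

The same regex can double as a cheap format check before handing an externally supplied id to the database.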
Why does userId use { onDelete: "cascade", onUpdate: "cascade" }?
Drizzle ORM supports PostgreSQL foreign key actions. With cascade set, deleting a user also deletes that user’s sessions.
The downside to enabling this behavior is that it raises the probability of a developer forgetting to update the deletedSessions table while deleting a user. But I view the impact of that mistake as lower than the impact of user deletes failing because of foreign key constraints. I also think a PostgreSQL trigger will be a better way to handle that when Drizzle ORM supports them.
Queries
```ts
import { eq, sql } from "drizzle-orm";

import { db } from "@/db"; // <- adjust to wherever your drizzle instance lives
import { users } from "@/db/users/schema";

const preparedInsertUser = db
  .insert(users)
  .values({
    username: sql.placeholder("username"),
    passwordHash: sql.placeholder("passwordHash"),
  })
  .prepare("insert_user");

export async function insertUser(input: {
  username: string;
  passwordHash: string;
}) {
  return preparedInsertUser.execute(input);
}

const preparedSelectUserByUsername = db
  .select()
  .from(users)
  .where(eq(users.username, sql.placeholder("username")))
  .limit(1)
  .prepare("select_user_by_username");

export async function selectUserByUsername(username: string) {
  const [user] = await preparedSelectUserByUsername.execute({ username });
  if (!user) throw new Error("User not found");
  return user;
}
```
Validations
```ts
import * as v from "valibot";

const usernameSchema = v.pipe(
  v.string(),
  v.minLength(4),
  v.maxLength(16),
  v.regex(
    /^[a-z0-9_]+$/,
    "can only contain lowercase letters, numbers, or underscores",
  ),
  v.regex(/^[a-z0-9].*$/, "must start with a letter or number"),
  v.regex(/^.+[a-z0-9]$/, "must end with a letter or number"),
  v.regex(/^((?!__).)*$/, "cannot contain consecutive underscores"),
  v.regex(/^((?!admin).)*$/i, "cannot contain the word 'admin'"),
);

const passwordSchema = v.pipe(
  v.string(),
  v.minLength(8),
  v.normalize("NFKC"),
  v.maxBytes(72),
);

export const insertUserSchema = v.object({
  username: usernameSchema,
  password: passwordSchema,
});
```
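To see how the username rules above combine, here are the same checks expressed as plain regular expressions (no valibot needed), with a hypothetical isValidUsername helper for demonstration:

```typescript
// The same username rules as plain regexes, so you can see which
// inputs pass or fail without pulling in a validation library.
const rules: RegExp[] = [
  /^[a-z0-9_]{4,16}$/, // allowed charset + length bounds
  /^[a-z0-9]/, // cannot start with an underscore
  /[a-z0-9]$/, // cannot end with an underscore
  /^((?!__).)*$/, // no consecutive underscores
  /^((?!admin).)*$/i, // no "admin" anywhere, case-insensitive
];

const isValidUsername = (name: string) => rules.every((r) => r.test(name));

console.log(isValidUsername("jane_doe")); // true
console.log(isValidUsername("_jane")); // false (leading underscore)
console.log(isValidUsername("AdminUser")); // false (uppercase, contains "admin")
```

Keeping the rules as a list makes it easy to surface a specific error message for whichever rule failed, which is what the valibot pipeline does.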
What’s going on with the (lack of) password requirements
Based on information from OWASP and NIST, the only hard restriction on password input should be a minimum length of 8 characters.
The normalize function is used to ensure unicode support. Read more about unicode normalization here and here.
The 72 maxBytes limit is not exactly in line with the OWASP and NIST recommendations, but I think it will work well for most applications.
- 72 bytes is the maximum input size the bcrypt hashing algorithm can handle. (argon2id is the recommended hashing algorithm, but I can’t control where this code is pasted to, so I may as well maximize compatibility if it doesn’t realistically harm security.)
- It is probably large enough for a 64-character passphrase (if it’s UTF-8 encoded)
- It prevents DoS attacks associated with long passwords
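The reason to measure in bytes rather than characters: in UTF-8 a single character can take up to 4 bytes, and it is the byte length that bcrypt (and DoS exposure) cares about. A quick check with the built-in TextEncoder:

```typescript
// One "character" is not one byte in UTF-8. maxBytes(72) bounds the actual
// encoded size of the input, regardless of which script it is written in.
const encoder = new TextEncoder();

console.log(encoder.encode("password").length); // 8 bytes for 8 ASCII chars
console.log(encoder.encode("пароль").length); // 12 bytes for 6 Cyrillic chars
console.log(encoder.encode("🔒".repeat(18)).length); // 72 bytes for 18 emoji
```

So a 64-character ASCII passphrase fits comfortably, while an all-emoji password hits the 72-byte ceiling at just 18 characters.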
If you are using a different hashing algorithm, you can use something like v.maxLength(64) or higher instead.
Operations
Hash Password
```ts
import "server-only"; // <- if you are using react server components

import {
  hash as hashPrimitive,
  verify as verifyPrimitive,
} from "@node-rs/argon2";

const hashOptions = {
  memoryCost: 19456,
  timeCost: 2,
  outputLen: 32,
  parallelism: 1,
};

export async function hash(password: Parameters<typeof hashPrimitive>[0]) {
  return hashPrimitive(password, hashOptions);
}

export async function verify(
  hashed: Parameters<typeof verifyPrimitive>[0],
  password: Parameters<typeof verifyPrimitive>[1],
) {
  return verifyPrimitive(hashed, password, hashOptions);
}
```
Create session
```ts
import "server-only"; // <- if you are using react server components

import * as v from "valibot";

import { insertSession } from "@/db/sessions/queries";
import { insertSessionSchema } from "@/db/sessions/validations";

export async function createSession(input: unknown) {
  try {
    const { userId, expiresAt } = v.parse(insertSessionSchema, input);
    await insertSession({ userId, expiresAt });
  } catch (error) {
    // TODO: use logging and error reporting
    console.error(error);
  }
}
```
Create user
```ts
import "server-only"; // <- if you are using react server components

import * as v from "valibot";

import { insertUser } from "@/db/users/queries";
import { insertUserSchema } from "@/db/users/validations";
import { hash } from "@/ops/hash-password";

export async function createUser(input: unknown) {
  try {
    const { username, password } = v.parse(insertUserSchema, input);
    const passwordHash = await hash(password);
    await insertUser({ username, passwordHash });
    // TODO: create session
  } catch (error) {
    // TODO: use logging and error reporting
    console.error(error);
  }
}
```
Sign in
```ts
import "server-only"; // <- if you are using react server components

import * as v from "valibot";

import { selectUserByUsername } from "@/db/users/queries";
import { insertUserSchema } from "@/db/users/validations";
import { verify } from "@/ops/hash-password";

export async function signIn(input: unknown) {
  try {
    const { username, password } = v.parse(insertUserSchema, input);

    const user = await selectUserByUsername(username); // <- Consider caching this result

    const isPasswordValid = await verify(user.passwordHash, password);
    if (!isPasswordValid) throw new Error("Invalid password");

    return user;
    // TODO: create session
    // TODO: add a brute force mitigation strategy
  } catch (error) {
    // TODO: use logging and error reporting
    console.error(error);

    // When sending an error response to the client, leave it vague and generic,
    // such as: "Invalid username or password". You do not want to accidentally
    // leak whether a user exists or not. For extra obfuscation, you can add a
    // timeout to the error response if the user doesn't exist.
  }
}
```
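One way to keep client-facing errors vague while still logging detail server-side is to funnel every auth failure through a single mapping helper. The helper name below is illustrative, not part of wunshot:

```typescript
// Illustrative helper: log the real error on the server, but always return
// the same generic message so the client cannot distinguish "no such user"
// from "wrong password".
function toClientAuthError(error: unknown): { message: string } {
  console.error(error); // full detail stays in server logs
  return { message: "Invalid username or password" };
}

console.log(toClientAuthError(new Error("User not found")).message);
// "Invalid username or password"
console.log(toClientAuthError(new Error("Invalid password")).message);
// "Invalid username or password"
```

Centralizing the mapping also gives you one place to add the optional timeout-on-missing-user obfuscation later.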
Using verify instead of sending passwordHash to search the database
A different approach to the sign in data flow would be to hash the password on the server, and then query the database for a row with the matching username and hashed password.
There are a few downsides to that approach:
- It probably means using a single salt for all passwords
- Every sign-in attempt will require a database query (whereas the server can cache the result of searching by username and use verify on the cached result)
- Since every request requires a relatively intensive hashing algorithm, a bad actor could potentially overload the server with a large number of sign-in attempts