
Username & Password

🚧 Feature has not been tested. Proceed at your own risk. 🚧

Schema

Users

@/db/users/schema.ts
import "server-only"; // <- if you are using react server components
import { pgTable, timestamp, uuid, varchar } from "drizzle-orm/pg-core";

export const users = pgTable("users", {
  id: uuid("id").primaryKey().defaultRandom(),
  createdAt: timestamp("created_at", { precision: 3, withTimezone: true })
    .notNull()
    .defaultNow(),
  updatedAt: timestamp("updated_at", { precision: 3, withTimezone: true })
    .notNull()
    .defaultNow(),
  username: varchar("username", { length: 16 }).notNull().unique(),
  passwordHash: varchar("password_hash", { length: 97 }).notNull(),
});
Why use a varchar(n) instead of text?

In PostgreSQL, varchar(n) lets you specify a maximum length for the column without padding shorter values with blanks (which is what char(n) does).

Setting a character limit is mostly helpful for saving you from yourself. Imagine you have a web app and an iOS/Android one. If they rely on separate input validation processes, then you run the risk of them getting out of sync. Setting a restriction on the database will enforce a very basic level of data normalization to the input.

There is a security concern outlined below, but if a hacker can bypass your server validation, then character limits are probably not a robust enough solution to prevent damage.

With no character limit, the size of the input could be as large as the column allows. The maximum size for a single field in PostgreSQL is 1 GB. That could be exploited by a bad actor to use extra space and processing in your database.
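One way to keep the database limit and each client's input validation in sync is to derive both from a single shared constant. This is only a sketch; USERNAME_MAX_LENGTH and fitsInColumn are hypothetical names, not part of wunshot:

```typescript
// Hypothetical shared value, e.g. exported from a constants module.
// Both the Drizzle schema and the valibot validation would read it,
// so the database limit and input validation cannot drift apart.
const USERNAME_MAX_LENGTH = 16;

// in the schema:      varchar("username", { length: USERNAME_MAX_LENGTH })
// in the validation:  v.maxLength(USERNAME_MAX_LENGTH)
const fitsInColumn = (username: string) => username.length <= USERNAME_MAX_LENGTH;

console.log(fitsInColumn("a_reasonable_nam")); // true  (16 characters)
console.log(fitsInColumn("a_very_long_username")); // false (20 characters)
```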

Why use { length: 16 } for username?

Specifying the actual number to use for the limit comes down to personal preference and UI constraints. I chose 16 somewhat arbitrarily. For comparison, Twitter/X uses 15, TikTok uses 24, and Instagram uses 30.

If you are building a system where actual names are used for the username (ex: [firstname]_[lastname]), then a longer limit may be appropriate.

Why use { length: 97 } for password_hash?

For this recommendation, I assume that passwords will be run through argon2id with recommended settings before being sent to the database. The total output is 97 characters when the hash length is set to 32 bytes (a 43-character digest) and the salt is the default 16 bytes (22 characters).

Ideally, you should use the exact length of the output of whatever hashing algorithm you are using. But if you aren’t sure, text is an adequate choice.
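As a sanity check on the 97-character figure, the PHC-format string `$argon2id$v=19$m=19456,t=2,p=1$<salt>$<digest>` can be measured piece by piece, assuming a 16-byte salt and a 32-byte digest:

```typescript
// base64 without padding encodes n bytes as ceil(n * 4 / 3) characters
const b64Len = (bytes: number) => Math.ceil((bytes * 4) / 3);

const prefix = "$argon2id$v=19$m=19456,t=2,p=1$"; // algorithm + params: 31 chars
const saltChars = b64Len(16); // 16-byte salt   -> 22 chars
const digestChars = b64Len(32); // 32-byte digest -> 43 chars

// one "$" separates the salt from the digest
const total = prefix.length + saltChars + 1 + digestChars;
console.log(saltChars, digestChars, total); // 22 43 97
```

If you change outputLen or the salt length, recompute the column width the same way (or fall back to text).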

Why not use pgcrypto crypt()?

pgcrypto provides a built-in crypt() function that hashes passwords. There are four main reasons wunshot opts not to use it:

  1. Relying on a database function for hashing means that the password would need to remain in plaintext while in transit. (All transit between your app and the database should be encrypted, but that’s a different topic.)
  2. The recommended argon2id algorithm is not available in pgcrypto.
  3. pgcrypto integration with Drizzle ORM is not straightforward. It would involve a lot of "magic" raw sql fragments.
  4. Hashing in the server/application layer continues to just work if a different RDBMS is used.

Sessions

In addition to simply storing session data, I had some specific objectives for the sessions implementation:

  1. Associate “unauthenticated” sessions with a user when they sign in (the sessionId for unauthenticated users is created on-the-fly on the server, not in the database)
  2. Track failed sign-in attempts for basic rate-limiting without relying on Redis or the server’s memory
  3. Look up “known” IP addresses associated with a user
  4. Look up deleted sessions associated with a current user
  5. Look up deleted sessions associated with a deleted user
  6. Reasonable defaults for handling multi-device sessions <- will be a separate section, the default is to only allow a single session per user

The following schemas should facilitate all of those use-cases.

@/db/sessions/schemas.ts
import { pgEnum, pgTable, text, timestamp, uuid } from "drizzle-orm/pg-core";
import { users } from "@/db/users/schema";

export const sessions = pgTable("sessions", {
  id: uuid("id").primaryKey().defaultRandom(),
  userId: uuid("user_id")
    .notNull()
    .references(() => users.id, { onDelete: "cascade", onUpdate: "cascade" }),
  expiresAt: timestamp("expires_at", { precision: 3, withTimezone: true })
    .notNull()
    .defaultNow(),
});

export const sessionEventType = pgEnum("session_event_type", [
  "CREDENTIALS_FAILED",
  "SESSION_CREATED",
  "SESSION_RENEWED",
  "SESSION_DELETED",
  "UNAUTHENTICATED_SESSION_CONVERTED",
]);

export const sessionLogs = pgTable("session_logs", {
  id: uuid("id").primaryKey().defaultRandom(),
  sessionId: uuid("session_id").notNull(),
  userId: uuid("user_id").notNull(),
  eventType: sessionEventType("event_type").notNull(),
  eventMessage: text("event_message"),
  ipAddress: text("ip_address").notNull(), // you can use varchar("ip_address", { length: 45 }) if you prefer
  userAgent: text("user_agent").notNull(), // you can use varchar("user_agent", { length: 16384 }) if you prefer
  origin: text("origin").notNull(), // you can use varchar("origin", { length: 16384 }) if you prefer
  timestamp: timestamp("timestamp", { precision: 3, withTimezone: true })
    .notNull()
    .defaultNow(),
});

Why isn’t there an unauthenticatedSessions table?

The assumption wunshot makes is that you want as few trips to the database as possible. Therefore, storing data for unauthenticated users in the database is not a default.

If the needs of your application are such that you need to store data for unauthenticated sessions, you can add an unauthenticatedSessions table. Be aware that it will increase the number of trips to the database, and more advanced rate-limiting may be necessary to prevent possible attacks.

Why does the deletedUnauthenticatedSessions table use arrays?

Unauthenticated session storage should be flexible, because once the session cookie is deleted there is no way to recover it.

Imagine a scenario where an unauthenticated user has a shopping cart attached to their session. They are browsing on a mobile device and switch between wifi and cellular data. If the session cookie is locked to a single IP address, then the shopping cart will be lost.

Why isn’t there a sessionValidationSuccessLogs table?

I originally intended to use only three tables: sessions, sessionValidationFailureLogs, and sessionValidationSuccessLogs. The benefit was that several of the queries were simple and didn’t need to rely on any joins. But, the simplicity of the queries came at the cost of complexity in the models.

Particularly, handling session renewal and the deletion of unauthorized sessions added a number of permutations to what data would or would not be available at the time of logging. Those permutations would only increase over time.

For example, imagine an instance where you wanted a shopping cart in the session - do you include the cart items in the log? Even if the answer is probably “no”, you have to spend extra time thinking about that decision and its repercussions.

One option I explored was to use a jsonb column with metadata about whichever event was logged. But, that requires the overhead of writing additional schemas/validations for the JSON of each event type.

The schemas as they are now are still subject to change, but I think they provide a good balance between performance, flexibility, and reasonability.

Why use a uuid as the primary key for sessions?

According to OWASP, a good session id must:

  1. Not contain any information about the user or framework
  2. Use at least 64 bits of entropy

Using Postgres’s built-in uuid generation has some advantages over third-party random id generators.

  • any application that connects to the database will automatically use the same random id generator
  • the uuid format can be enforced by the database
  • uuidv4 is ubiquitous, so any connected applications will likely be able to generate ids on-the-fly if they need to

Overall, it’s a good compromise between performance, security, and consistency.
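For reference, a UUIDv4 (which defaultRandom() produces via gen_random_uuid()) carries 122 random bits (128 bits minus the 6 fixed version/variant bits), comfortably clearing OWASP's 64-bit entropy floor. A quick sketch in Node:

```typescript
import { randomUUID } from "node:crypto";

// UUIDv4: 128 bits total, minus 4 version bits and 2 variant bits
const randomBits = 128 - 6;

const id = randomUUID();
console.log(randomBits >= 64); // true: meets the OWASP minimum
console.log(id.length); // 36
console.log(id[14]); // "4" (the version nibble is fixed)
console.log(/^[89ab]$/.test(id[19])); // true (the variant nibble is 8, 9, a, or b)
```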

Why does userId use { onDelete: "cascade", onUpdate: "cascade" }?

Drizzle ORM supports PostgreSQL foreign key actions. With cascade set, deleting a user automatically deletes that user’s sessions as well.

The downside to enabling this behavior is that it raises the probability of a developer forgetting to update the deletedSessions table while deleting a user. But, I view the impact of that mistake as lower than the impact of user deletes failing because of foreign key constraints. I also think a PostgreSQL Trigger will be a better way to handle that when Drizzle ORM supports them.

Queries

@/db/users/queries.ts
import { eq, sql } from "drizzle-orm";
import { db } from "@/db"; // <- adjust to wherever your drizzle client lives
import { users } from "@/db/users/schema";

const preparedInsertUser = db
  .insert(users)
  .values({
    username: sql.placeholder("username"),
    passwordHash: sql.placeholder("passwordHash"),
  })
  .prepare("insert_user");

export async function insertUser(values: {
  username: string;
  passwordHash: string;
}) {
  await preparedInsertUser.execute(values);
}

const preparedSelectUserByUsername = db
  .select()
  .from(users)
  .where(eq(users.username, sql.placeholder("username")))
  .limit(1)
  .prepare("select_user_by_username");

export async function selectUserByUsername(username: string) {
  const [user] = await preparedSelectUserByUsername.execute({ username });
  if (!user) throw new Error("User not found");
  return user;
}

Validations

@/db/users/validations.ts
import * as v from "valibot";

const usernameSchema = v.pipe(
  v.string(),
  v.minLength(4),
  v.maxLength(16),
  v.regex(
    /^[a-z0-9_]+$/,
    "can only contain lowercase letters, numbers, or underscores",
  ),
  v.regex(/^[a-z0-9].*$/, "must start with a letter or number"),
  v.regex(/^.+[a-z0-9]$/, "must end with a letter or number"),
  v.regex(/^((?!__).)*$/, "cannot contain consecutive underscores"),
  v.regex(/^((?!admin).)*$/i, "cannot contain the word 'admin'"),
);

const passwordSchema = v.pipe(
  v.string(),
  v.minLength(8),
  v.normalize("NFKC"),
  v.maxBytes(72),
);

export const insertUserSchema = v.object({
  username: usernameSchema,
  password: passwordSchema,
});
What’s going on with the (lack of) password requirements

Based on information from OWASP and NIST, the only hard restriction on password input should be a minimum length of 8 characters.

The normalize function is used to ensure unicode support. Read more about unicode normalization here and here.

The 72 maxBytes limit is not exactly in line with the OWASP and NIST recommendations: NIST says applications should accept passwords of at least 64 characters, and a 64-character password in a multi-byte script can exceed 72 bytes. Still, I think it will work well for most applications.

If you are using a different hashing algorithm, you can use something like v.maxLength(64) or higher instead.

Operations

Hash Password

@/ops/hash-password.ts
import "server-only"; // <- if you are using react server components
import {
  hash as hashPrimitive,
  verify as verifyPrimitive,
} from "@node-rs/argon2";

// OWASP-recommended argon2id parameters: 19 MiB of memory, 2 iterations,
// 1 degree of parallelism, 32-byte output
const hashOptions = {
  memoryCost: 19456,
  timeCost: 2,
  outputLen: 32,
  parallelism: 1,
};

export async function hash(password: Parameters<typeof hashPrimitive>[0]) {
  return hashPrimitive(password, hashOptions);
}

export async function verify(
  hashed: Parameters<typeof verifyPrimitive>[0],
  password: Parameters<typeof verifyPrimitive>[1],
) {
  return verifyPrimitive(hashed, password, hashOptions);
}

Create session

@/ops/create-session.ts
import "server-only"; // <- if you are using react server components
import * as v from "valibot";
import { insertSession } from "@/db/sessions/queries";
import { insertSessionSchema } from "@/db/sessions/validations";

export async function createSession(input: unknown) {
  try {
    const { userId, expiresAt } = v.parse(insertSessionSchema, input);
    await insertSession({ userId, expiresAt });
  } catch (error) {
    // TODO: use logging and error reporting
    console.error(error);
  }
}

Create user

@/ops/create-user.ts
import "server-only"; // <- if you are using react server components
import * as v from "valibot";
import { insertUser } from "@/db/users/queries";
import { insertUserSchema } from "@/db/users/validations";
import { hash } from "@/ops/hash-password";

export async function createUser(input: unknown) {
  try {
    const { username, password } = v.parse(insertUserSchema, input);
    const passwordHash = await hash(password);
    await insertUser({ username, passwordHash });
    // TODO: create session
  } catch (error) {
    // TODO: use logging and error reporting
    console.error(error);
  }
}

Sign in

@/ops/sign-in.ts
import "server-only"; // <- if you are using react server components
import * as v from "valibot";
import { selectUserByUsername } from "@/db/users/queries";
import { insertUserSchema } from "@/db/users/validations";
import { verify } from "@/ops/hash-password";

export async function signIn(input: unknown) {
  try {
    const { username, password } = v.parse(insertUserSchema, input);
    const user = await selectUserByUsername(username); // <- Consider caching this result
    const isPasswordValid = await verify(user.passwordHash, password);
    if (!isPasswordValid) throw new Error("Invalid password");
    // TODO: create session
    // TODO: add a brute force mitigation strat
    return user;
  } catch (error) {
    // TODO: use logging and error reporting
    console.error(error);
    // When sending an error response to the client, leave it vague and generic, such as: "Invalid username or password"
    // You do not want to accidentally leak whether a user exists.
    // For extra obfuscation, you can add a timeout to the error response if the user doesn't exist.
  }
}
Using verify instead of sending passwordHash to search the database

A different approach to the sign in data flow would be to hash the password on the server, and then query the database for a row with the matching username and hashed password.

There are a few downsides to that approach:

  • It probably means using a single salt for all passwords
  • Every sign-in attempt will require a database query (whereas the server can cache the result of searching by username and use verify on the cached result)
  • Since every request requires using a relatively intensive hashing algorithm, a bad actor could potentially overload the server with a large amount of sign-in attempts
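The cache mentioned above can be as small as a Map in front of the username query. This is only a sketch: cachedLookup and fetchUser are hypothetical names, and production code would also need expiry and invalidation when a password changes:

```typescript
type User = { username: string; passwordHash: string };

const userCache = new Map<string, User>();

// fetchUser stands in for the real database query (e.g. selectUserByUsername)
async function cachedLookup(
  username: string,
  fetchUser: (username: string) => Promise<User>,
): Promise<User> {
  const hit = userCache.get(username);
  if (hit) return hit; // repeat sign-in attempts skip the database
  const user = await fetchUser(username);
  userCache.set(username, user);
  return user;
}
```

verify then runs against the cached passwordHash, so a burst of attempts against one account costs one database round trip instead of many.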

Sign out

Update password

Update username