For the past few months, I've been working on an app where my only form of testing was manually exercising whatever feature I wanted to check, and almost every commit broke something without me realizing it. This wasn't without reason: time and motivation are scarce when developing an app solo, or at all for that matter. I work on this in my free time, after leaving my full-time job and hitting the gym, and building new features instead of writing boring tests is what keeps me excited about developing.

For a while it felt like the lack of testing was working, until it wasn't. The cool features that kept me excited broke more often than not, and I would spend countless hours manually debugging and console logging everything. All of this hassle, when tests would have let me repeatedly exercise a feature in a controlled environment with one simple command.

A few weeks ago, I realized that if I ever planned on launching this thing, breaking something with every commit wouldn't cut it, and I should stop being a lazy idiot and implement testing. So I did. A little background on the app: it's a monorepo with a Svelte frontend and an ElysiaJS backend. Drizzle is the object-relational mapper (ORM) of choice, and while it's great, it introduces some new challenges which we'll dive into later on.

Mocks: the false sense of security

Mocks imitate the behavior of real systems, usually by returning a hard-coded response for a function or API call instead of calling out to the real thing, or by faking a module you aren't currently testing. Mocking is an essential part of testing and absolutely necessary if you call any third-party APIs. Ideally, though, I would like to run one command and test my queries against a real database using my actual schema, not a hard-coded mock. A database mock cannot detect an incorrectly written query, because it returns the same canned response no matter what you send it. I make little mistakes all the time, like forgetting to put a column in a GROUP BY clause or misspelling a column name in a raw query, things which an ORM or database mock simply cannot catch.
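To make that false sense of security concrete, here's a tiny, self-contained sketch. The mockDb object and its rows are hypothetical, not from my codebase: the point is that the mock returns the same rows no matter what SQL it receives, so a broken query looks exactly as healthy as a correct one.

```typescript
// A hypothetical mock of a database query function: it ignores the SQL
// string entirely and always returns the same canned rows.
type Row = { shop: string; total: number };

const mockDb = {
  query: async (_sql: string): Promise<Row[]> => [{ shop: "demo-shop", total: 100 }],
};

async function main() {
  // A correct query "passes"...
  const good = await mockDb.query(
    "SELECT shop, sum(amount) AS total FROM points_history GROUP BY shop",
  );
  // ...but so does one with a misspelled column, which a real database
  // would reject with: column "amnount" does not exist.
  const bad = await mockDb.query(
    "SELECT shop, sum(amnount) AS total FROM points_history GROUP BY shop",
  );
  console.log(JSON.stringify(good) === JSON.stringify(bad)); // true
}

main();
```

Both calls return identical data, so a test built on this mock happily passes while the real query would blow up in production.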

My app has a user-facing analytics dashboard that displays the performance of the loyalty rewards program. Under the hood, each endpoint calls out to my PostgreSQL database with the TimescaleDB extension, which provides time-series aggregate functions that make analytical queries much simpler, as well as automatic compression of time-series data and continuous aggregates. No matter how good your ORM is, even the GOAT Drizzle cannot provide any type safety for raw SQL queries, and if you're using PostgreSQL extensions you will have to write a lot of them.

Drizzle does support postgis and pgvector, so the correct column types for those are included in migrations. TimescaleDB is not on the list of supported extensions; to work around this, you must first modify the generated SQL migration files (or provide your own custom migration with bunx drizzle-kit generate --custom) before you can query your data. Below is the auto-generated migration file for the table which stores the points customers have earned and spent over time.

CREATE TABLE "customers"."points_history" (
	"shop" text NOT NULL,
	"customerId" integer NOT NULL,
	"actionCompleted" integer,
	"rewardRedeemed" integer,
	"wayToEarnTitle" text NOT NULL,
	"amount" integer NOT NULL,
	"type" "customers"."points_history_enum",
	"createdAt" timestamp with time zone DEFAULT now(),
	CONSTRAINT "points_history_reference_check" CHECK (
        CASE
          WHEN "type" = 'Earned' THEN
            "actionCompleted" IS NOT NULL
          WHEN "type" = 'Redeemed' THEN
            "rewardRedeemed" IS NOT NULL
          ELSE
            true
        END
      )
);

In TimescaleDB, you can only run the handy time aggregate functions over hypertables: special tables partitioned by a timestamp column, and usually further segmented by some key such as customer_id for performance. One caveat is that the partitioning column must be included in any primary key or unique index on the table; this isn't a TimescaleDB-specific rule, it's due to the way Postgres partitioning works. Right now we can't run any Timescale functions on this regular old table. To fix this, we can simply append the following to the table declaration.

CREATE TABLE "customers"."points_history" (
	-- Removed for clarity
) WITH (
  tsdb.hypertable,
  tsdb.segmentby = 'shop',
  tsdb.orderby = '"createdAt" ASC'
);

The customer points history table is segmented by each customer's shop, because the analytics are only viewed by the store owner, so the data will only ever be queried by shop. Now that we've got a hypertable, we don't want to hit the raw table every time a store owner opens the dashboard; pre-computed data would be far more efficient. This is why TimescaleDB provides continuous aggregates: materialized views which continuously refresh only the most recent data, instead of recomputing the entire view on each update. Let's create a continuous aggregate to view the weekly points generated by customers. To do this, generate a custom SQL migration file by running bunx drizzle-kit generate --custom, then write your migration.

CREATE MATERIALIZED VIEW "customers"."weekly_points_generated"
  WITH (timescaledb.continuous) AS
  SELECT
    "shop",
    sum("amount") AS total_points,
    time_bucket ('1 week', "createdAt") AS "weekly_bucket"
  FROM
    "customers"."points_history"
  WHERE
    "customers"."points_history"."type" = 'Earned'
  GROUP BY
    weekly_bucket,
    "customers"."points_history"."shop"
  ORDER BY
    weekly_bucket DESC
WITH NO DATA;
SELECT add_continuous_aggregate_policy('customers.weekly_points_generated',
  start_offset => INTERVAL '1 week',
  end_offset => INTERVAL '1 hour',
  schedule_interval => INTERVAL '30 minutes');

For the most part this is a regular materialized view, apart from the WITH (timescaledb.continuous) clause, which makes it a continuous aggregate, and the add_continuous_aggregate_policy call. That function gives TimescaleDB all the information it needs to keep the aggregate up to date: start_offset basically means recompute all buckets within the last week, end_offset means don't recompute anything from the last hour, and schedule_interval is how often the refresh job runs.
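To make those offsets concrete, here's a small plain-TypeScript sketch of the refresh window the policy above describes. The refreshWindow helper is mine, not part of any library, and it assumes the semantics stated above: each scheduled run recomputes the buckets between now - start_offset and now - end_offset.

```typescript
// Assumed policy semantics: every schedule_interval, the job recomputes
// buckets falling between (now - start_offset) and (now - end_offset).
const HOUR = 60 * 60 * 1000;
const WEEK = 7 * 24 * HOUR;

function refreshWindow(now: Date): { from: Date; to: Date } {
  return {
    from: new Date(now.getTime() - WEEK), // start_offset => INTERVAL '1 week'
    to: new Date(now.getTime() - HOUR),   // end_offset   => INTERVAL '1 hour'
  };
}

const { from, to } = refreshWindow(new Date("2025-01-08T12:00:00Z"));
console.log(from.toISOString()); // 2025-01-01T12:00:00.000Z
console.log(to.toISOString());   // 2025-01-08T11:00:00.000Z
```

So every 30 minutes, Timescale re-materializes only that sliding one-week window (minus the most recent hour), leaving older buckets untouched.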

Now that we’ve created the migration file, we need to let Drizzle know it exists so we can have some type safety and actually query the view.

const weeklyPointsGeneratedView = customersSchema
  .materializedView('weekly_points_generated', {
    shop: text(),
    totalPoints: numeric('total_points'),
    weeklyBucket: timestamp('weekly_bucket'),
  })
  .existing();

The existing method lets Drizzle know it doesn't need to generate any migrations for this view. Now we can finally query it.

const monthlyCustomerPoints = await db
  .select({
   total: sum(weeklyPointsGeneratedView.totalPoints),
   time: sql`time_bucket_gapfill('1 month', ${weeklyPointsGeneratedView.weeklyBucket})`.as("bucket")
   })
   .from(weeklyPointsGeneratedView)
   .where(
     and(
       gte(
         weeklyPointsGeneratedView.weeklyBucket,
         sql<Date>`date_trunc('month', NOW()) - interval '1 month'`
       ),
       lt(
         weeklyPointsGeneratedView.weeklyBucket,
         sql<Date>`date_trunc('month', NOW()) + interval '1 month'`
       ),
       eq(weeklyPointsGeneratedView.shop, shop),
     ),
   )
   .groupBy(sql`bucket`)
   .orderBy(sql`bucket DESC`)

This query groups all customer points from this month and last month for the given shop into monthly buckets. Both months are queried to show month-over-month improvement in the dashboard. As you can see, there's a lot of raw SQL which cannot be type checked. Luckily, there is a solution for this exact problem.

Testcontainers

Testcontainers lets you programmatically spin up a Docker container without conflicting with any containers already running on your machine. It does this by binding each container to a random host port instead of, say, 5432, which you would normally use for Postgres. Most containers start fast and can be spun up and shut down for each individual test, like so:

await using container = await new ValkeyContainer(IMAGE).start();

const client = createClient({ url: container.getConnectionUrl() });
await client.connect();

await client.set("key", "val");
expect(await client.get("key")).toBe("val");

client.destroy();

The await using declaration automatically shuts down and cleans up the container when it goes out of scope, similar to Go's defer keyword. As I said, this works for most containers, but heavy ones like databases take multiple seconds to spin up instead of milliseconds; starting one per test would take forever and likely cause tests to time out. The fix is to spin up a single database container before any test runs, share it across all tests, and shut it down after every test completes. Luckily, Bun's test runner lets you preload a setup file for exactly this.

class PostgresTestContainer {
  private container: StartedTestContainer | null = null;
  getConnectionUri() {
    const host = this.container?.getHost();
    const port = this.container?.getFirstMappedPort();

    return `postgres://postgres@${host}:${port}/postgres`
  }

  async initialize(): Promise<void> {
    try {
      const container = await new GenericContainer("timescale/timescaledb:2.25.2-pg18").withEnvironment({
          POSTGRES_USER: "postgres",
          POSTGRES_PASSWORD: "password",
          PGUSER: "postgres",
          PGPASSWORD: "password",
          POSTGRES_HOST_AUTH_METHOD: "trust"
        })
        .withExposedPorts(5432)
        .withWaitStrategy(
          Wait.forLogMessage("database system is ready to accept connections", 2),
        )
        .start();
      this.container = container;
    } catch(e) {
      console.error(`Failed to start test postgres container: ${e}`)
    }
  }

  async teardown(): Promise<void> {
    try {
      await this.container?.stop();
    } catch(e) {
      console.error(`Failed to stop Postgres container: ${e}`)
    }
  }
}

const postgresContainer = new PostgresTestContainer();
beforeAll(async () => {
  console.log("Setting up test environment")
  await postgresContainer.initialize()

  process.env.DATABASE_URL = postgresContainer.getConnectionUri();

  await $`bunx drizzle-kit migrate`

  mock.module("@/db", () => {
    const db = drizzle(postgresContainer.getConnectionUri()!);
    return { db }
  })
})

afterAll(async () => {
  await postgresContainer.teardown()
})

This custom class lets us control when the container is created and destroyed, so we can spin it up before all tests run and tear it down after they all finish. Once the container is up, drizzle-kit applies the SQL migrations, and then we mock the Drizzle singleton with a new Drizzle singleton built from the updated database URL. This may seem redundant, but after about a week of fighting Drizzle not connecting to the database, I realized the singleton is initialized before the database URL updates, and this was the only way I could fix it. Now, in our bunfig.toml file, we can register the preload file.

[test]
preload = ["./setup.ts"]

Finally, we can write our tests, which is the easiest part since Elysia makes this a breeze.

describe('Elysia', () => {
  it('returns a response', async () => {
    await db.insert(pointsHistory).values([idk])
    const response = await app
      .handle(new Request('http://localhost/customer-points'))
      .then((res) => res.json())

    expect(response).toEqual({
      thisMonth: 200,
      lastMonth: 100
    })
  })
})

Testing web components

The front-end was a lot simpler. Initially I tried jsdom, but as far as I could tell it doesn't render web components, and they're required as part of the Shopify Polaris Web Components UI library, so that couldn't change. I opted for Vitest browser mode, which uses Playwright, a widely used tool for running tests in a real browser and simulating user input.

This solves the problem of rendering web components, but introduces another: mocking requests in the frontend, where one cannot simply mock fetch and return a response. This has to do with the way the browser handles module resolution, so Vitest recommends msw, which uses a service worker to intercept requests and return mock responses. The recommended way of wiring up msw, per its documentation, is to add a worker fixture to the test function.

import { test as testBase } from "vitest";
import { worker } from "./mocks/browser";

export const test = testBase.extend<{
  worker: typeof worker;
}>({
  worker: [
    async ({}, use) => {
      // Start the worker before the test.
      await worker.start({ onUnhandledRequest: "warn" });

      // Expose the worker object on the test's context.
      await use(worker);

      // Remove any request handlers added in individual test cases.
      // This prevents them from affecting unrelated tests.
      worker.resetHandlers();

      // Stop the worker after the test.
      worker.stop();
    },
    {
      auto: true,
    },
  ],
});

With this we can import the test function exported by this module and run our tests like so:

import { describe, expect, vi } from "vitest";
import { render } from "vitest-browser-svelte";
import { http, HttpResponse } from "msw";
import { test } from "./test-extend";

import Layout from "../src/routes/+layout.svelte";
import SetupPage from "../src/routes/setup/+page.svelte";
import * as nav from "$app/navigation";

vi.mock('$app/navigation', { spy: true });

describe("/setup", () => {
  test("Should redirect to home page", async ({ worker }) => {
    let capturedBody: unknown;
    worker.use(
      http.post("http://localhost:3000/api/shop", async ({ request }) => {
        capturedBody = await request.json();
        return new HttpResponse(null, { status: 201 });
      }),
    );
    const screen = render(Layout, {
      props: {
        children: SetupPage,
      },
    });

    const birthdayReward = screen.getByText("Birthday Reward");
    const signUpReward = screen.getByText("Sign up");
    const firstOrderReward = screen.getByText("First Order", { exact: true });
    const submitButton = screen.getByText("Save");

    await birthdayReward.click();
    await signUpReward.click();
    await firstOrderReward.click();

    await submitButton.click();
    await expect.poll(() => capturedBody).toEqual({
      birthdayReward: true,
      signUpReward: true,
      firstOrderReward: true,
    });
    await expect.poll(() => vi.mocked(nav.goto)).toHaveBeenCalled();
  });
});

This test simply clicks a few buttons in a form and ensures the correct request is sent to the backend; msw intercepts that request and returns a 201, which is expected to trigger a redirect.

My initial assessment of testing wasn't wrong: tests are boring, and I got bored writing them. But the upside is undeniable. Being able to run one command and verify that everything works correctly is worth all the boring hours of setting up and writing tests. Now, when shipping my application, I can be confident that everything works correctly and get a good night's sleep!
