Testing Strategy

Comprehensive testing approach for Noozer: unit tests, integration tests, E2E tests, and mocks

Testing Philosophy
Test Pyramid
Unit Tests
Integration Tests
E2E Tests
Mocking External Services
Test Fixtures
CI/CD Integration
Performance Testing
Security Testing

Testing Philosophy

Principles

Test behavior, not implementation - Tests should verify what the code does, not how it does it
Fast feedback loops - Unit tests run in milliseconds, integration tests in seconds
Realistic mocks - External service mocks should behave like the real thing
Deterministic - Tests must produce the same result every time
Independent - Tests should not depend on each other or shared state
Comprehensive - Critical paths must have high coverage
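
The "Deterministic" principle usually comes down to keeping wall-clock time and randomness out of the code under test. The sketch below shows one way to do that: a small seeded generator (mulberry32) and a fixed clock are injected instead of calling Math.random() or new Date() directly. The names createSeededRandom, TestDeps, and buildTestArticle are hypothetical and are not part of the Noozer codebase.

```typescript
// Hypothetical helpers for illustration only.

/** mulberry32: a tiny seeded PRNG that returns floats in [0, 1). */
function createSeededRandom(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

/** Sources of non-determinism are passed in rather than read globally. */
interface TestDeps {
  now: () => Date;
  random: () => number;
}

function buildTestArticle(deps: TestDeps) {
  return {
    id: `article-${Math.floor(deps.random() * 1e9)}`,
    captured_at: deps.now().toISOString(),
  };
}

// A fresh, identically seeded set of dependencies for every test.
const fixedDeps = (): TestDeps => ({
  now: () => new Date('2024-01-15T00:00:00.000Z'),
  random: createSeededRandom(42),
});

const first = buildTestArticle(fixedDeps());
const second = buildTestArticle(fixedDeps());
// first and second are identical because both the seed and the clock are fixed.
```

Re-creating the dependencies per test also supports the "Independent" principle, since no generator state leaks from one test into the next.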

Coverage Targets

Component	Target	Rationale
Core business logic	90%+	Critical for correctness
API handlers	80%+	User-facing, high impact
Queue consumers	80%+	Data integrity
Utility functions	70%+	Lower risk
Type definitions	N/A	Compile-time checked
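
One way to keep these targets from drifting is to check them in code. The following is a minimal sketch, assuming a simplified per-component coverage summary; the CoverageResult shape and the findCoverageShortfalls helper are hypothetical, and real coverage reporters emit richer data.

```typescript
// Hypothetical shapes for illustration; not the output format of any real reporter.
interface CoverageTarget {
  component: string;
  minPercent: number;
}

interface CoverageResult {
  component: string;
  percent: number;
}

// Targets copied from the table above (type definitions have no target).
const COVERAGE_TARGETS: CoverageTarget[] = [
  { component: 'Core business logic', minPercent: 90 },
  { component: 'API handlers', minPercent: 80 },
  { component: 'Queue consumers', minPercent: 80 },
  { component: 'Utility functions', minPercent: 70 },
];

/** Returns one human-readable line per component that misses its target. */
function findCoverageShortfalls(
  results: CoverageResult[],
  targets: CoverageTarget[] = COVERAGE_TARGETS,
): string[] {
  const shortfalls: string[] = [];
  for (const target of targets) {
    const result = results.find((r) => r.component === target.component);
    if (!result) {
      shortfalls.push(`${target.component}: no coverage data`);
    } else if (result.percent < target.minPercent) {
      shortfalls.push(`${target.component}: ${result.percent}% < ${target.minPercent}%`);
    }
  }
  return shortfalls;
}
```

A CI step could fail the build whenever the returned list is non-empty.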

Test Pyramid

                    /\
                   /  \
                  / E2E \           ~5% of tests
                 /  Tests \         Slow, expensive
                /----------\
               / Integration \      ~20% of tests
              /    Tests      \     Medium speed
             /----------------\
            /    Unit Tests    \    ~75% of tests
           /                    \   Fast, isolated
          /______________________\

Test Distribution

tests/
├── unit/                    # Fast, isolated tests
│   ├── services/
│   ├── utils/
│   ├── validators/
│   └── transformers/
├── integration/             # Tests with real bindings
│   ├── api/
│   ├── queues/
│   ├── database/
│   └── vectorize/
├── e2e/                     # Full system tests
│   ├── flows/
│   └── smoke/
├── fixtures/                # Shared test data
│   ├── articles.ts
│   ├── sources.ts
│   └── customers.ts
├── mocks/                   # Service mocks
│   ├── zenrows.ts
│   ├── data4seo.ts
│   ├── openai.ts
│   └── sharedcount.ts
└── helpers/                 # Test utilities
    ├── setup.ts
    ├── factories.ts
    └── assertions.ts

Unit Tests

Framework: Vitest

// vitest.config.ts
import { defineConfig } from 'vitest/config';

export default defineConfig({
  test: {
    globals: true,
    environment: 'miniflare',
    include: ['tests/unit/**/*.test.ts'],
    coverage: {
      provider: 'v8',
      reporter: ['text', 'html', 'lcov'],
      exclude: ['tests/**', 'node_modules/**'],
    },
  },
});

Example: Testing a Service

// tests/unit/services/classification.test.ts
import { describe, it, expect, vi, beforeEach } from 'vitest';
import { ClassificationService } from '@/services/classification';
import { mockArticle } from '../../fixtures/articles';

describe('ClassificationService', () => {
  let service: ClassificationService;
  let mockVectorize: any;
  let mockLLM: any;

  beforeEach(() => {
    mockVectorize = {
      query: vi.fn(),
    };
    mockLLM = {
      run: vi.fn(),
    };
    service = new ClassificationService(mockVectorize, mockLLM);
  });

  describe('classifyArticle', () => {
    it('should use rules engine for keyword matches', async () => {
      const article = mockArticle({
        headline: 'Apple announces new iPhone',
        body_text: 'Apple Inc. today unveiled...',
      });

      const result = await service.classifyArticle(article);

      expect(result.classified_by).toBe('rules');
      expect(result.topics).toContainEqual(
        expect.objectContaining({ label: 'Technology' })
      );
      // LLM should not be called for clear keyword matches
      expect(mockLLM.run).not.toHaveBeenCalled();
    });

    it('should fall back to vector matching for ambiguous content', async () => {
      const article = mockArticle({
        headline: 'Market movements today',
        body_text: 'Various factors contributed...',
      });

      mockVectorize.query.mockResolvedValue({
        matches: [
          { id: 'topic-finance', score: 0.85 },
          { id: 'topic-business', score: 0.72 },
        ],
      });

      const result = await service.classifyArticle(article);

      expect(result.classified_by).toBe('vector');
      expect(mockVectorize.query).toHaveBeenCalledWith(
        expect.any(Float32Array),
        { topK: 10, namespace: 'taxonomy' }
      );
    });

    it('should use LLM for low-confidence cases', async () => {
      const article = mockArticle({
        headline: 'Complex situation unfolds',
        body_text: 'In a nuanced development...',
      });

      mockVectorize.query.mockResolvedValue({
        matches: [
          { id: 'topic-politics', score: 0.45 },
          { id: 'topic-social', score: 0.43 },
        ],
      });

      mockLLM.run.mockResolvedValue({
        response: JSON.stringify({
          topics: [{ label: 'Politics', confidence: 0.8 }],
          sentiment: 0.1,
        }),
      });

      const result = await service.classifyArticle(article);

      expect(result.classified_by).toBe('llm');
      expect(mockLLM.run).toHaveBeenCalled();
    });
  });

  describe('extractEntities', () => {
    it('should extract person entities', async () => {
      const article = mockArticle({
        body_text: 'Elon Musk announced that Tesla will...',
      });

      const entities = await service.extractEntities(article);

      expect(entities).toContainEqual(
        expect.objectContaining({
          canonical_name: 'Elon Musk',
          type: 'person',
        })
      );
      expect(entities).toContainEqual(
        expect.objectContaining({
          canonical_name: 'Tesla',
          type: 'org',
        })
      );
    });
  });
});
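
The three classifyArticle tests above imply a cascade: rules first, then vector matching, then the LLM as a last resort. The sketch below illustrates that control flow only. The Stages interface, the classifyText function, and the 0.7 confidence threshold are assumptions made for illustration; the actual ClassificationService may be structured differently.

```typescript
// Hypothetical types and threshold, inferred from the test expectations above.
type ClassifiedBy = 'rules' | 'vector' | 'llm';

interface Topic {
  label: string;
  confidence: number;
}

interface Classification {
  classified_by: ClassifiedBy;
  topics: Topic[];
}

/** Each stage is injected so tests can substitute cheap fakes. */
interface Stages {
  matchRules: (text: string) => Topic[];
  queryVectors: (text: string) => Promise<Topic[]>;
  askLLM: (text: string) => Promise<Topic[]>;
}

// Assumed cut-off: vector matches below this are treated as low confidence.
const VECTOR_CONFIDENCE_THRESHOLD = 0.7;

async function classifyText(text: string, stages: Stages): Promise<Classification> {
  // Stage 1: cheap keyword rules. A hit short-circuits everything else.
  const ruleTopics = stages.matchRules(text);
  if (ruleTopics.length > 0) {
    return { classified_by: 'rules', topics: ruleTopics };
  }

  // Stage 2: vector similarity. Keep only confident matches.
  const vectorTopics = await stages.queryVectors(text);
  const confident = vectorTopics.filter(
    (topic) => topic.confidence >= VECTOR_CONFIDENCE_THRESHOLD,
  );
  if (confident.length > 0) {
    return { classified_by: 'vector', topics: confident };
  }

  // Stage 3: the most expensive option, used only when the others are unsure.
  return { classified_by: 'llm', topics: await stages.askLLM(text) };
}
```

Because each stage is a plain function, a test can count calls to askLLM and confirm it is skipped for clear keyword matches, which mirrors the first test case above.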

Example: Testing Utilities

// tests/unit/utils/url.test.ts
import { describe, it, expect } from 'vitest';
import { normalizeUrl, hashUrl, isValidNewsUrl } from '@/utils/url';

describe('URL utilities', () => {
  describe('normalizeUrl', () => {
    it('should remove tracking parameters', () => {
      const url = 'https://example.com/article?id=123&utm_source=twitter&fbclid=xxx';
      expect(normalizeUrl(url)).toBe('https://example.com/article?id=123');
    });

    it('should lowercase the hostname', () => {
      const url = 'https://EXAMPLE.COM/Article';
      expect(normalizeUrl(url)).toBe('https://example.com/Article');
    });

    it('should remove trailing slashes', () => {
      const url = 'https://example.com/article/';
      expect(normalizeUrl(url)).toBe('https://example.com/article');
    });

    it('should sort query parameters', () => {
      const url = 'https://example.com/article?b=2&a=1';
      expect(normalizeUrl(url)).toBe('https://example.com/article?a=1&b=2');
    });
  });

  describe('hashUrl', () => {
    it('should produce consistent SHA-256 hash', () => {
      const url = 'https://example.com/article';
      const hash1 = hashUrl(url);
      const hash2 = hashUrl(url);
      expect(hash1).toBe(hash2);
      expect(hash1).toMatch(/^[a-f0-9]{64}$/);
    });

    it('should produce different hashes for different URLs', () => {
      const hash1 = hashUrl('https://example.com/article1');
      const hash2 = hashUrl('https://example.com/article2');
      expect(hash1).not.toBe(hash2);
    });
  });

  describe('isValidNewsUrl', () => {
    it('should accept valid news URLs', () => {
      expect(isValidNewsUrl('https://nytimes.com/2024/01/15/article.html')).toBe(true);
      expect(isValidNewsUrl('https://bbc.co.uk/news/world-12345')).toBe(true);
    });

    it('should reject non-article URLs', () => {
      expect(isValidNewsUrl('https://nytimes.com/')).toBe(false);
      expect(isValidNewsUrl('https://nytimes.com/login')).toBe(false);
      expect(isValidNewsUrl('https://nytimes.com/subscribe')).toBe(false);
    });

    it('should reject non-HTTP URLs', () => {
      expect(isValidNewsUrl('ftp://example.com/article')).toBe(false);
      expect(isValidNewsUrl('javascript:void(0)')).toBe(false);
    });
  });
});
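
For reference, here is a minimal sketch of normalizeUrl and isValidNewsUrl that would satisfy the expectations above. It is not the actual @/utils/url implementation: the tracking-parameter list, the deny-list of non-article paths, and the decision to drop the URL fragment are all assumptions. hashUrl is omitted because its hashing backend is not shown in this document.

```typescript
// Assumed lists; extend them to match real traffic.
const TRACKING_PARAMS = ['fbclid', 'gclid'];
const TRACKING_PREFIXES = ['utm_'];
const NON_ARTICLE_PATHS = ['/', '/login', '/subscribe'];

function isTrackingParam(name: string): boolean {
  const lower = name.toLowerCase();
  return (
    TRACKING_PARAMS.indexOf(lower) !== -1 ||
    TRACKING_PREFIXES.some((prefix) => lower.indexOf(prefix) === 0)
  );
}

function normalizeUrl(raw: string): string {
  // The URL parser already lowercases the scheme and hostname.
  const url = new URL(raw);

  // Drop tracking parameters, then sort the rest by name.
  const kept: [string, string][] = [];
  url.searchParams.forEach((value, name) => {
    if (!isTrackingParam(name)) kept.push([name, value]);
  });
  kept.sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0));
  const query = new URLSearchParams(kept).toString();

  // Strip trailing slashes, but keep a bare "/" for the site root.
  const path = url.pathname.replace(/\/+$/, '') || '/';

  // The fragment is intentionally dropped (an assumption of this sketch).
  return `${url.protocol}//${url.host}${path}${query ? `?${query}` : ''}`;
}

function isValidNewsUrl(raw: string): boolean {
  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    return false; // Not parseable as a URL at all
  }
  if (url.protocol !== 'http:' && url.protocol !== 'https:') return false;

  const path = (url.pathname.replace(/\/+$/, '') || '/').toLowerCase();
  return NON_ARTICLE_PATHS.indexOf(path) === -1;
}
```

A deny-list is the simplest heuristic that passes the tests; a production version would likely need per-site rules.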

Example: Testing Validators

// tests/unit/validators/profile.test.ts
import { describe, it, expect } from 'vitest';
import { validateProfileCreate, validateProfileUpdate } from '@/validators/profile';

describe('Profile validators', () => {
  describe('validateProfileCreate', () => {
    it('should accept valid profile', () => {
      const input = {
        name: 'Tech Watch',
        keywords: ['AI', 'machine learning'],
        topics: ['technology'],
        notify_threshold: 0.7,
      };

      const result = validateProfileCreate(input);
      expect(result.success).toBe(true);
      expect(result.data).toEqual(input);
    });

    it('should require name', () => {
      const input = { keywords: ['AI'] };
      const result = validateProfileCreate(input);
      expect(result.success).toBe(false);
      expect(result.error.issues[0].path).toContain('name');
    });

    it('should reject too many keywords', () => {
      const input = {
        name: 'Test',
        keywords: Array(51).fill('keyword'),
      };
      const result = validateProfileCreate(input);
      expect(result.success).toBe(false);
      expect(result.error.issues[0].message).toContain('50');
    });

    it('should reject notify_threshold outside the valid range', () => {
      const input = {
        name: 'Test',
        notify_threshold: 1.5, // Out of range
      };
      const result = validateProfileCreate(input);
      expect(result.success).toBe(false);
      expect(result.error.issues[0].path).toContain('notify_threshold');
    });
  });
});
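
The result shape these tests rely on (success, data, error.issues with path and message) resembles what a schema library's safe-parse call returns. The dependency-free sketch below mimics that shape to make the expected behavior concrete. It is hypothetical: the function name validateProfileCreateSketch, the 0 to 1 range for notify_threshold, and the error messages are assumptions; only the 50-keyword limit comes from the tests above.

```typescript
// Hypothetical, dependency-free validator for illustration only.
interface Issue {
  path: string[];
  message: string;
}

type ValidationResult<T> =
  | { success: true; data: T }
  | { success: false; error: { issues: Issue[] } };

interface ProfileCreate {
  name: string;
  keywords?: string[];
  topics?: string[];
  notify_threshold?: number;
}

const MAX_KEYWORDS = 50; // Implied by the "too many keywords" test

function validateProfileCreateSketch(input: unknown): ValidationResult<ProfileCreate> {
  const issues: Issue[] = [];
  const value = (typeof input === 'object' && input !== null ? input : {}) as Record<string, unknown>;
  const name = value['name'];
  const keywords = value['keywords'];
  const threshold = value['notify_threshold'];

  if (typeof name !== 'string' || name.trim() === '') {
    issues.push({ path: ['name'], message: 'name is required' });
  }

  if (keywords !== undefined) {
    if (!Array.isArray(keywords) || !keywords.every((k) => typeof k === 'string')) {
      issues.push({ path: ['keywords'], message: 'keywords must be an array of strings' });
    } else if (keywords.length > MAX_KEYWORDS) {
      issues.push({ path: ['keywords'], message: `at most ${MAX_KEYWORDS} keywords are allowed` });
    }
  }

  // Assumed range: a threshold is a score between 0 and 1 inclusive.
  if (threshold !== undefined && (typeof threshold !== 'number' || threshold < 0 || threshold > 1)) {
    issues.push({ path: ['notify_threshold'], message: 'notify_threshold must be between 0 and 1' });
  }

  if (issues.length > 0) {
    return { success: false, error: { issues } };
  }
  return { success: true, data: value as unknown as ProfileCreate };
}
```

Note that out-of-range values are rejected rather than silently clamped, which is what the last test case asserts.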

Integration Tests

Framework: Vitest + Miniflare

// vitest.integration.config.ts
import { defineConfig } from 'vitest/config';

export default defineConfig({
  test: {
    globals: true,
    environment: 'miniflare',
    environmentOptions: {
      modules: true,
      d1Databases: ['DB'],
      r2Buckets: ['RAW_SNAPSHOTS'],
      kvNamespaces: ['HOT_CACHE', 'RATE_LIMITS'],
    },
    include: ['tests/integration/**/*.test.ts'],
    setupFiles: ['tests/helpers/setup-integration.ts'],
    testTimeout: 30000,
  },
});

Setup File

// tests/helpers/setup-integration.ts
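import { beforeAll, afterAll, afterEach } from 'vitest';
import type { Miniflare } from 'miniflare';
import { setupMiniflare } from './miniflare';
// Assumption: cleanupTestArticles is exported from the same module as seedTestData
import { seedTestData, cleanupTestArticles } from './seed';

let mf: Miniflare;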

beforeAll(async () => {
  mf = await setupMiniflare();
  await seedTestData(mf);
});

afterEach(async () => {
  // Clean up test data but keep seed data
  await cleanupTestArticles(mf);
});

afterAll(async () => {
  await mf.dispose();
});

Example: API Integration Tests

// tests/integration/api/feed.test.ts
import { describe, it, expect, beforeEach } from 'vitest';
import { createTestClient, seedArticles, seedProfile } from '../../helpers';

describe('GET /v1/feed', () => {
  let client: TestClient;
  let apiKey: string;
  let profileId: string;

  beforeEach(async () => {
    client = await createTestClient();
    apiKey = await client.createApiKey();
    profileId = await seedProfile(client.db, {
      keywords: ['artificial intelligence', 'AI'],
      topics: ['technology'],
    });
    await seedArticles(client.db, [
      { headline: 'AI breakthrough announced', topics: ['technology'] },
      { headline: 'Sports update', topics: ['sports'] },
      { headline: 'New machine learning model', topics: ['technology'] },
    ]);
    await client.runMatcher(); // Process matches
  });

  it('should return articles matching profile', async () => {
    const response = await client.get('/v1/feed', {
      headers: { 'X-API-Key': apiKey },
      query: { profile_id: profileId },
    });

    expect(response.status).toBe(200);
    expect(response.body.data).toHaveLength(2);
    expect(response.body.data[0].headline).toContain('AI');
    expect(response.body.data[1].headline).toContain('machine learning');
  });

  it('should respect min_relevance filter', async () => {
    const response = await client.get('/v1/feed', {
      headers: { 'X-API-Key': apiKey },
      query: { profile_id: profileId, min_relevance: 0.9 },
    });

    expect(response.status).toBe(200);
    // Every returned article must meet the requested threshold
    for (const article of response.body.data) {
      expect(article.relevance_score).toBeGreaterThanOrEqual(0.9);
    }
  });

  it('should paginate results', async () => {
    // Seed more articles
    await seedArticles(client.db, Array(50).fill(null).map((_, i) => ({
      headline: `AI article ${i}`,
      topics: ['technology'],
    })));
    await client.runMatcher();

    const page1 = await client.get('/v1/feed', {
      headers: { 'X-API-Key': apiKey },
      query: { limit: 20 },
    });

    expect(page1.body.data).toHaveLength(20);
    expect(page1.body.pagination.has_more).toBe(true);

    const page2 = await client.get('/v1/feed', {
      headers: { 'X-API-Key': apiKey },
      query: { limit: 20, cursor: page1.body.pagination.next_cursor },
    });

    expect(page2.body.data).toHaveLength(20);
    // No overlap between pages: check every ID, not only the first
    const page1Ids = new Set(page1.body.data.map(a => a.id));
    const page2Ids = page2.body.data.map(a => a.id);
    expect(page2Ids.filter(id => page1Ids.has(id))).toEqual([]);
  });

  it('should require authentication', async () => {
    const response = await client.get('/v1/feed');
    expect(response.status).toBe(401);
  });

  it('should rate limit excessive requests', async () => {
    const requests = Array(100).fill(null).map(() =>
      client.get('/v1/feed', { headers: { 'X-API-Key': apiKey } })
    );

    const responses = await Promise.all(requests);
    const rateLimited = responses.filter(r => r.status === 429);
    
    expect(rateLimited.length).toBeGreaterThan(0);
  });
});
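
The pagination test relies on a response shape with pagination.has_more and pagination.next_cursor. The sketch below shows one simple way such a cursor can work, using the ID of the last returned item as the cursor. This is an assumption for illustration; the real /v1/feed endpoint may encode its cursor differently (for example, as an opaque token over a sort key).

```typescript
// Hypothetical in-memory pagination; the real endpoint would page in the database.
interface FeedItem {
  id: string;
}

interface Page<T> {
  data: T[];
  pagination: { has_more: boolean; next_cursor: string | null };
}

function paginate<T extends FeedItem>(items: T[], limit: number, cursor?: string): Page<T> {
  // Resume just after the item named by the cursor.
  // An unknown cursor yields index -1, so paging restarts from the beginning.
  const start = cursor ? items.findIndex((item) => item.id === cursor) + 1 : 0;
  const data = items.slice(start, start + limit);
  const hasMore = start + data.length < items.length;

  return {
    data,
    pagination: {
      has_more: hasMore,
      next_cursor: hasMore && data.length > 0 ? data[data.length - 1].id : null,
    },
  };
}
```

Cursor-based paging stays stable when new articles arrive between requests, which is harder to guarantee with numeric offsets.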

Example: Queue Integration Tests

// tests/integration/queues/classifier.test.ts
import { describe, it, expect, beforeEach } from 'vitest';
import { createTestEnv, mockVectorize, mockLLM } from '../../helpers';
import { handleClassify } from '@/workers/classifier';
import { mockArticle } from '../../fixtures/articles';

describe('Classifier Queue Consumer', () => {
  let env: TestEnv;

  beforeEach(async () => {
    env = await createTestEnv();
    mockVectorize(env);
    mockLLM(env);
  });

  it('should classify article and store results', async () => {
    const article = await env.db.prepare(
      'INSERT INTO articles (id, headline, body_text, processing_status) VALUES (?, ?, ?, ?) RETURNING *'
    ).bind('test-1', 'Tech company releases product', 'Details about...', 'processing').first();

    const message = { article_id: article.id, stages: ['rules', 'vector'] };
    
    await handleClassify({ messages: [{ body: message }] }, env);

    const classification = await env.db.prepare(
      'SELECT * FROM article_classifications WHERE article_id = ?'
    ).bind(article.id).first();

    // D1's first() resolves to null when no row matches, so toBeDefined() would always pass
    expect(classification).not.toBeNull();
    expect(classification.topics).toBeDefined();
    expect(classification.classified_by).toMatch(/rules|vector/);
  });

  it('should batch process multiple messages', async () => {
    const articles = await Promise.all([1, 2, 3].map(i =>
      env.db.prepare(
        'INSERT INTO articles (id, headline, body_text, processing_status) VALUES (?, ?, ?, ?) RETURNING id'
      ).bind(`test-${i}`, `Headline ${i}`, `Body ${i}`, 'processing').first()
    ));

    const messages = articles.map(a => ({
      body: { article_id: a.id, stages: ['rules'] },
    }));

    await handleClassify({ messages }, env);

    const classifications = await env.db.prepare(
      'SELECT COUNT(*) as count FROM article_classifications'
    ).first();

    expect(classifications.count).toBe(3);
  });

  it('should handle classification errors gracefully', async () => {
    const message = { article_id: 'non-existent-id', stages: ['rules'] };

    // Should not throw
    await handleClassify({ messages: [{ body: message }] }, env);

    const error = await env.db.prepare(
      'SELECT * FROM pipeline_errors WHERE article_id = ?'
    ).bind('non-existent-id').first();

    // D1's first() resolves to null when no row matches, so toBeDefined() would always pass
    expect(error).not.toBeNull();
    expect(error.stage).toBe('classify');
    expect(error.retryable).toBe(false);
  });
});
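
The last test expects the consumer to record a failure in pipeline_errors instead of throwing. A minimal sketch of that pattern follows, with storage and classification injected as plain functions. The names consumeBatch and isRetryable, and the convention that an error may carry a retryable flag, are assumptions made for illustration; the real handleClassify is not shown in this document.

```typescript
// Hypothetical consumer loop for illustration only.
interface QueueMessage<T> {
  body: T;
}

interface PipelineError {
  article_id: string;
  stage: string;
  retryable: boolean;
  message: string;
}

/** Assumed convention: errors may carry an explicit `retryable` flag. */
function isRetryable(err: unknown): boolean {
  if (typeof err === 'object' && err !== null && 'retryable' in err) {
    return (err as { retryable: unknown }).retryable === true;
  }
  return true; // Unknown failures are treated as transient
}

async function consumeBatch(
  messages: QueueMessage<{ article_id: string }>[],
  classify: (articleId: string) => Promise<void>,
  recordError: (error: PipelineError) => Promise<void>,
): Promise<void> {
  for (const message of messages) {
    try {
      await classify(message.body.article_id);
    } catch (err) {
      // One bad message must not fail the rest of the batch.
      await recordError({
        article_id: message.body.article_id,
        stage: 'classify',
        retryable: isRetryable(err),
        message: err instanceof Error ? err.message : String(err),
      });
    }
  }
}
```

Catching per message keeps a single missing article from forcing the whole batch to be redelivered.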

Example: Database Integration Tests

// tests/integration/database/articles.test.ts
import { describe, it, expect, beforeEach } from 'vitest';
import { createTestDb } from '../../helpers';
import { ArticleRepository } from '@/db/repositories/article';

describe('ArticleRepository', () => {
  let db: D1Database;
  let repo: ArticleRepository;

  beforeEach(async () => {
    db = await createTestDb();
    repo = new ArticleRepository(db);
  });

  describe('create', () => {
    it('should insert article and return with ID', async () => {
      const article = await repo.create({
        source_id: 'source-1',
        canonical_url: 'https://example.com/article',
        url_hash: 'abc123',
        headline: 'Test Article',
        body_text: 'Content here...',
      });

      expect(article.id).toBeDefined();
      expect(article.headline).toBe('Test Article');
      expect(article.processing_status).toBe('pending');
    });

    it('should reject duplicate URL hash', async () => {
      await repo.create({
        source_id: 'source-1',
        canonical_url: 'https://example.com/article1',
        url_hash: 'same-hash',
        headline: 'First',
      });

      await expect(repo.create({
        source_id: 'source-1',
        canonical_url: 'https://example.com/article2',
        url_hash: 'same-hash',
        headline: 'Duplicate',
      })).rejects.toThrow(/UNIQUE constraint/);
    });
  });

  describe('findByUrlHash', () => {
    it('should find existing article', async () => {
      await repo.create({
        source_id: 'source-1',
        canonical_url: 'https://example.com/article',
        url_hash: 'unique-hash',
        headline: 'Test',
      });

      const found = await repo.findByUrlHash('unique-hash');
      // findByUrlHash returns null on a miss, so assert non-null rather than defined
      expect(found).not.toBeNull();
      expect(found.headline).toBe('Test');
    });

    it('should return null for non-existent', async () => {
      const found = await repo.findByUrlHash('does-not-exist');
      expect(found).toBeNull();
    });
  });

  describe('search', () => {
    beforeEach(async () => {
      await repo.create({ source_id: 's1', url_hash: 'h1', canonical_url: 'u1', headline: 'Apple announces iPhone' });
      await repo.create({ source_id: 's1', url_hash: 'h2', canonical_url: 'u2', headline: 'Google releases Android' });
      await repo.create({ source_id: 's2', url_hash: 'h3', canonical_url: 'u3', headline: 'Apple and Google partnership' });
    });

    it('should search by keyword', async () => {
      const results = await repo.search({ query: 'Apple' });
      expect(results).toHaveLength(2);
    });

    it('should filter by source', async () => {
      const results = await repo.search({ query: 'Apple', source_ids: ['s2'] });
      expect(results).toHaveLength(1);
      expect(results[0].headline).toContain('partnership');
    });
  });
});

E2E Tests

Framework: Playwright + Custom API Client

// tests/e2e/flows/article-lifecycle.test.ts
import { describe, it, expect, beforeAll } from 'vitest';
import { createE2EClient } from '../../helpers/e2e-client';

describe('Article Lifecycle (E2E)', () => {
  let client: E2EClient;

  beforeAll(async () => {
    client = await createE2EClient({
      baseUrl: process.env.API_URL || 'https://api.staging.noozer.io',
      apiKey: process.env.TEST_API_KEY,
    });
  });

  it('should process article from crawl to notification', async () => {
    // 1. Create a keyword set that will match test article
    const keywordSet = await client.createKeywordSet({
      name: 'E2E Test',
      keywords: ['e2e-test-' + Date.now()],
    });

    // 2. Create a profile to receive notifications
    const profile = await client.createProfile({
      name: 'E2E Profile',
      keywords: [keywordSet.keywords[0]],
      notify_threshold: 0.5,
      notify_channels: { webhook: true },
    });

    // 3. Create a webhook to receive notification
    const webhookReceiver = await client.createWebhookReceiver();
    await client.createWebhook({
      url: webhookReceiver.url,
      events: ['article.matched'],
    });

    // 4. Inject test article (admin endpoint)
    const article = await client.admin.injectTestArticle({
      headline: `Test article ${keywordSet.keywords[0]}`,
      body_text: `This is a test article containing ${keywordSet.keywords[0]}`,
      source_domain: 'test-source.example.com',
    });

    // 5. Wait for processing (with timeout)
    await client.waitForProcessing(article.id, { timeout: 60000 });

    // 6. Verify article appears in feed
    const feed = await client.getFeed({ profile_id: profile.id });
    const matchedArticle = feed.data.find(a => a.id === article.id);
    expect(matchedArticle).toBeDefined();
    expect(matchedArticle.relevance_score).toBeGreaterThan(0.5);

    // 7. Verify webhook was called
    const webhookPayload = await webhookReceiver.waitForPayload({ timeout: 30000 });
    expect(webhookPayload.event).toBe('article.matched');
    expect(webhookPayload.article.id).toBe(article.id);

    // Cleanup
    await client.deleteProfile(profile.id);
    await client.deleteKeywordSet(keywordSet.id);
  }, 120000); // 2 minute timeout
});
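
Step 5 depends on a helper that waits for asynchronous processing to finish. The E2E client's waitForProcessing is not shown here, so the following is only a sketch of the polling pattern such a helper could use. The name pollUntil, its options, and the 500 ms default interval are assumptions.

```typescript
// Hypothetical polling helper for illustration only.
interface PollOptions {
  timeout: number; // Total time budget in milliseconds
  interval?: number; // Delay between attempts in milliseconds
}

/**
 * Calls `check` repeatedly until it resolves to a value other than undefined,
 * or rejects once the next attempt would exceed the timeout.
 */
async function pollUntil<T>(
  check: () => Promise<T | undefined>,
  options: PollOptions,
): Promise<T> {
  const interval = options.interval ?? 500;
  const deadline = Date.now() + options.timeout;

  for (;;) {
    const value = await check();
    if (value !== undefined) return value;

    if (Date.now() + interval > deadline) {
      throw new Error(`Timed out after ${options.timeout}ms`);
    }
    await new Promise<void>((resolve) => setTimeout(resolve, interval));
  }
}
```

In an E2E test, check could fetch the article and return its status once it is no longer processing. An explicit timeout keeps a stuck pipeline from hanging the whole suite.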

Smoke Tests

// tests/e2e/smoke/health.test.ts
import { describe, it, expect } from 'vitest';

const API_URL = process.env.API_URL || 'https://api.staging.noozer.io';

describe('Smoke Tests', () => {
  it('should return healthy status', async () => {
    const response = await fetch(`${API_URL}/v1/health`);
    const body = await response.json();

    expect(response.status).toBe(200);
    expect(body.status).toBe('healthy');
  });

  it('should authenticate with valid API key', async () => {
    const response = await fetch(`${API_URL}/v1/feed`, {
      headers: { 'X-API-Key': process.env.TEST_API_KEY },
    });

    expect(response.status).toBe(200);
  });

  it('should reject invalid API key', async () => {
    const response = await fetch(`${API_URL}/v1/feed`, {
      headers: { 'X-API-Key': 'invalid-key' },
    });

    expect(response.status).toBe(401);
  });

  it('should return articles from feed', async () => {
    const response = await fetch(`${API_URL}/v1/feed?limit=1`, {
      headers: { 'X-API-Key': process.env.TEST_API_KEY },
    });
    const body = await response.json();

    expect(response.status).toBe(200);
    expect(Array.isArray(body.data)).toBe(true);
  });

  it('should search articles', async () => {
    const response = await fetch(`${API_URL}/v1/search?q=technology&limit=1`, {
      headers: { 'X-API-Key': process.env.TEST_API_KEY },
    });
    const body = await response.json();

    expect(response.status).toBe(200);
    expect(Array.isArray(body.data)).toBe(true);
  });
});

Mocking External Services

ZenRows Mock

// tests/mocks/zenrows.ts
import { rest } from 'msw';

export const zenrowsHandlers = [
  rest.get('https://api.zenrows.com/v1/', (req, res, ctx) => {
    const url = req.url.searchParams.get('url');
    
    // Return canned responses based on URL patterns
    if (url?.includes('nytimes.com')) {
      return res(ctx.status(200), ctx.text(MOCK_NYTIMES_HTML));
    }
    
    if (url?.includes('blocked.com')) {
      return res(ctx.status(403), ctx.json({ error: 'Blocked' }));
    }
    
    // Default success response
    return res(ctx.status(200), ctx.text(MOCK_DEFAULT_HTML));
  }),
];

const MOCK_NYTIMES_HTML = `
<!DOCTYPE html>
<html>
<head><title>Test Article - NYTimes</title></head>
<body>
  <article>
    <h1>Test Headline</h1>
    <p class="byline">By Test Author</p>
    <p class="timestamp">2024-01-15</p>
    <div class="article-body">
      <p>This is the article content...</p>
    </div>
  </article>
</body>
</html>
`;

const MOCK_DEFAULT_HTML = `
<!DOCTYPE html>
<html>
<head><title>Default Article</title></head>
<body>
  <h1>Default Headline</h1>
  <p>Default content...</p>
</body>
</html>
`;

DataForSEO Mock

// tests/mocks/data4seo.ts
import { rest } from 'msw';

export const data4seoHandlers = [
  // Google News results
  rest.post('https://api.dataforseo.com/v3/serp/google/news/live/advanced', async (req, res, ctx) => {
    const body = await req.json();
    const keyword = body[0]?.keyword || 'test';

    return res(ctx.json({
      tasks: [{
        result: [{
          items: [
            {
              title: `News about ${keyword}`,
              url: `https://example.com/news/${encodeURIComponent(keyword)}`,
              source: 'Example News',
              date: new Date().toISOString(),
              snippet: `Latest updates on ${keyword}...`,
            },
            // More mock results...
          ],
        }],
      }],
    }));
  }),

  // Backlink data
  rest.post('https://api.dataforseo.com/v3/backlinks/summary/live', async (req, res, ctx) => {
    return res(ctx.json({
      tasks: [{
        result: [{
          target: 'example.com',
          total_backlinks: 1500,
          referring_domains: 250,
          domain_rank: 65,
        }],
      }],
    }));
  }),
];

OpenAI Mock

// tests/mocks/openai.ts
import { rest } from 'msw';

export const openaiHandlers = [
  // Embeddings
  rest.post('https://api.openai.com/v1/embeddings', async (req, res, ctx) => {
    const body = await req.json();
    const inputs = Array.isArray(body.input) ? body.input : [body.input];

    return res(ctx.json({
      object: 'list',
      data: inputs.map((_, i) => ({
        object: 'embedding',
        embedding: generateMockEmbedding(1536),
        index: i,
      })),
      model: body.model,
      usage: { prompt_tokens: 100, total_tokens: 100 },
    }));
  }),

  // Chat completions (for LLM classification)
  rest.post('https://api.openai.com/v1/chat/completions', async (req, res, ctx) => {
    const body = await req.json();
    const lastMessage = body.messages[body.messages.length - 1]?.content || '';

    // Return appropriate mock response based on prompt
    let content: string;
    if (lastMessage.includes('classify')) {
      content = JSON.stringify({
        topics: [{ label: 'Technology', confidence: 0.85 }],
        sentiment: 0.2,
        entities: [{ name: 'Test Entity', type: 'org' }],
      });
    } else if (lastMessage.includes('summarize')) {
      content = 'This is a mock summary of the article content.';
    } else {
      content = 'Mock response';
    }

    return res(ctx.json({
      id: 'mock-completion-id',
      object: 'chat.completion',
      choices: [{
        index: 0,
        message: { role: 'assistant', content },
        finish_reason: 'stop',
      }],
      usage: { prompt_tokens: 100, completion_tokens: 50, total_tokens: 150 },
    }));
  }),
];

function generateMockEmbedding(dimensions: number): number[] {
  // Generate deterministic but realistic-looking embedding
  return Array(dimensions).fill(0).map((_, i) => Math.sin(i * 0.1) * 0.1);
}
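
Note that generateMockEmbedding returns the same vector for every input, which is fine for shape checks but means similarity-based tests cannot tell two texts apart. If that matters, one option is to seed the vector from the input text. The sketch below is an assumption-laden alternative, not part of the existing mocks: generateSeededEmbedding and hashText are hypothetical names, and the vectors are pseudo-random rather than semantically meaningful.

```typescript
// Hypothetical alternative to generateMockEmbedding; vectors carry no semantic meaning.

/** FNV-1a hash of the text, used as a 32-bit seed. */
function hashText(text: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0;
}

/** Same text always yields the same unit-length vector; different texts differ. */
function generateSeededEmbedding(text: string, dimensions: number): number[] {
  let state = hashText(text);
  const values: number[] = [];
  for (let i = 0; i < dimensions; i++) {
    // Linear congruential generator step (32-bit)
    state = (Math.imul(state, 1664525) + 1013904223) >>> 0;
    values.push(state / 4294967296 - 0.5); // Centered in [-0.5, 0.5)
  }
  // Normalize so cosine similarity behaves as it would for unit-length vectors.
  const norm = Math.sqrt(values.reduce((sum, v) => sum + v * v, 0)) || 1;
  return values.map((v) => v / norm);
}
```

Inside the embeddings handler, each input string could be passed to this function in place of the fixed vector.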

SharedCount Mock

// tests/mocks/sharedcount.ts
import { rest } from 'msw';

export const sharedcountHandlers = [
  rest.get('https://api.sharedcount.com/v1.0/', (req, res, ctx) => {
    const url = req.url.searchParams.get('url');

    // Return different engagement based on domain
    let engagement = {
      Facebook: { share_count: 100, comment_count: 20, reaction_count: 50 },
      Twitter: 75,
      Pinterest: 10,
      LinkedIn: 25,
    };

    if (url?.includes('viral')) {
      engagement = {
        Facebook: { share_count: 10000, comment_count: 2000, reaction_count: 5000 },
        Twitter: 7500,
        Pinterest: 1000,
        LinkedIn: 2500,
      };
    }

    return res(ctx.json(engagement));
  }),
];

MSW Setup

// tests/helpers/msw-setup.ts
import { beforeAll, afterEach, afterAll } from 'vitest';
import { setupServer } from 'msw/node';
import { zenrowsHandlers } from '../mocks/zenrows';
import { data4seoHandlers } from '../mocks/data4seo';
import { openaiHandlers } from '../mocks/openai';
import { sharedcountHandlers } from '../mocks/sharedcount';

export const server = setupServer(
  ...zenrowsHandlers,
  ...data4seoHandlers,
  ...openaiHandlers,
  ...sharedcountHandlers,
);

// Start server before all tests
beforeAll(() => server.listen({ onUnhandledRequest: 'warn' }));

// Reset handlers after each test
afterEach(() => server.resetHandlers());

// Clean up after all tests
afterAll(() => server.close());

Test Fixtures

Data Strategy

Important: In production, sources are NOT pre-seeded. They are discovered automatically when articles are crawled.

┌─────────────────────────────────────────────────────────────────────────────┐
│                        DATA DISCOVERY FLOW                                  │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│   Keyword ("bitcoin")                                                       │
│        │                                                                    │
│        ▼                                                                    │
│   Google News Crawl                                                         │
│        │                                                                    │
│        ▼                                                                    │
│   Article URL (https://techcrunch.com/2024/01/15/bitcoin-news)             │
│        │                                                                    │
│        ├──▶ Extract domain: "techcrunch.com"                               │
│        │                                                                    │
│        ▼                                                                    │
│   Source exists in DB?                                                      │
│        │                                                                    │
│        ├── YES: Use existing source_id                                     │
│        │                                                                    │
│        └── NO: Auto-create source from domain                              │
│                 • name: "TechCrunch" (from meta or inferred)               │
│                 • domain: "techcrunch.com"                                  │
│                 • type: "news_org" (default, can be updated)               │
│                 • authority_score: 0.5 (default, recalculated later)       │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘

For testing, we use fixtures that simulate this flow:

Create test keywords → mock crawl → articles with realistic URLs
Sources are auto-created from article domains (like production)
No manual source seeding required
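
The find-or-create step in the diagram can be sketched with an in-memory map standing in for the sources table. The defaults (type "news_org", authority_score 0.5) come from the diagram. Everything else is an assumption for illustration: the function names, the generated IDs, stripping a leading "www.", and using the domain as a placeholder name where the real pipeline would read page metadata.

```typescript
// Hypothetical in-memory stand-in for the sources table.
interface Source {
  id: string;
  name: string;
  domain: string;
  type: string;
  authority_score: number;
}

/** Assumption: a leading "www." is not part of a source's identity. */
function extractDomain(articleUrl: string): string {
  return new URL(articleUrl).hostname.replace(/^www\./, '');
}

function findOrCreateSource(sources: Map<string, Source>, articleUrl: string): Source {
  const domain = extractDomain(articleUrl);

  const existing = sources.get(domain);
  if (existing) return existing; // YES branch: reuse the existing source

  // NO branch: auto-create with the defaults from the diagram.
  const created: Source = {
    id: `source-${sources.size + 1}`,
    name: domain, // Placeholder; production infers a display name
    domain,
    type: 'news_org',
    authority_score: 0.5,
  };
  sources.set(domain, created);
  return created;
}
```

Fixtures that route every mock article through a helper like this exercise the same discovery path as production, without a separate source-seeding step.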

Article Fixtures

// tests/fixtures/articles.ts
import { faker } from '@faker-js/faker';

export interface MockArticleOptions {
  id?: string;
  source_id?: string;
  headline?: string;
  body_text?: string;
  topics?: string[];
  published_at?: Date;
  language?: string;
}

export function mockArticle(options: MockArticleOptions = {}) {
  return {
    id: options.id || faker.string.uuid(),
    source_id: options.source_id || faker.string.uuid(),
    canonical_url: faker.internet.url(),
    url_hash: faker.string.alphanumeric(64),
    headline: options.headline || faker.lorem.sentence(),
    subheadline: faker.lorem.sentence(),
    body_text: options.body_text || faker.lorem.paragraphs(5),
    summary: faker.lorem.paragraph(),
    published_at: (options.published_at || faker.date.recent()).toISOString(),
    captured_at: new Date().toISOString(),
    language: options.language || 'en',
    word_count: faker.number.int({ min: 100, max: 2000 }),
    processing_status: 'complete',
    topics: options.topics || [faker.word.noun()],
  };
}

// Pre-built fixtures for common scenarios
export const fixtures = {
  techArticle: () => mockArticle({
    headline: 'Apple announces new AI features for iPhone',
    body_text: 'Apple Inc. today revealed new artificial intelligence capabilities...',
    topics: ['technology', 'AI'],
  }),

  sportsArticle: () => mockArticle({
    headline: 'Team wins championship in overtime thriller',
    body_text: 'In a dramatic finish, the home team clinched victory...',
    topics: ['sports'],
  }),

  politicsArticle: () => mockArticle({
    headline: 'Senate passes bipartisan infrastructure bill',
    body_text: 'The United States Senate voted today to approve...',
    topics: ['politics', 'government'],
  }),

  breakingNews: () => mockArticle({
    headline: 'Breaking: Major event unfolds',
    body_text: 'Developing story...',
    topics: ['breaking'],
    published_at: new Date(), // Just now
  }),
};

Customer Fixtures

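// tests/fixtures/customers.ts
import { createHash } from 'node:crypto';
import { faker } from '@faker-js/faker';

// Customer and Profile are the application's own entity types; import them
// from wherever the project defines them.

// SHA-256 hex digest of an API key. A minimal stand-in, assuming the
// application hashes keys the same way; prefer importing the real helper.
function hashKey(key: string): string {
  return createHash('sha256').update(key).digest('hex');
}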

export function mockCustomer(options: Partial<Customer> = {}) {
  return {
    id: options.id || faker.string.uuid(),
    email: options.email || faker.internet.email(),
    name: options.name || faker.person.fullName(),
    tier: options.tier || 'pro',
    settings: options.settings || {},
    is_active: options.is_active ?? true,
    created_at: new Date().toISOString(),
  };
}

export function mockProfile(options: Partial<Profile> = {}) {
  return {
    id: options.id || faker.string.uuid(),
    customer_id: options.customer_id || faker.string.uuid(),
    name: options.name || faker.commerce.productName(),
    keywords: options.keywords || [faker.word.noun(), faker.word.noun()],
    topics: options.topics || ['technology'],
    regions: options.regions || ['US'],
    languages: options.languages || ['en'],
    min_authority_score: options.min_authority_score ?? 0,
    notify_threshold: options.notify_threshold ?? 0.7,
    notify_channels: options.notify_channels || { email: true },
    is_active: options.is_active ?? true,
  };
}

export function mockApiKey(customerId: string) {
  const prefix = 'nz_test_';
  const key = prefix + faker.string.alphanumeric(32);
  return {
    id: faker.string.uuid(),
    customer_id: customerId,
    key_hash: hashKey(key), // SHA-256
    key_prefix: prefix,
    name: 'Test Key',
    scopes: ['read:feed', 'write:profiles'],
    rate_limit_tier: 'standard',
    is_active: true,
    _plaintext: key, // For testing only
  };
}

Factory Helpers

// tests/helpers/factories.ts
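import { faker } from '@faker-js/faker'; // used by createSource below
import { mockArticle } from '../fixtures/articles';
import { mockCustomer, mockProfile, mockApiKey } from '../fixtures/customers';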

export class TestFactory {
  constructor(private db: D1Database) {}

  async createCustomer(options = {}) {
    const customer = mockCustomer(options);
    await this.db.prepare(
      'INSERT INTO customers (id, email, name, tier, settings, is_active) VALUES (?, ?, ?, ?, ?, ?)'
    ).bind(customer.id, customer.email, customer.name, customer.tier, JSON.stringify(customer.settings), customer.is_active ? 1 : 0).run();
    return customer;
  }

  async createProfile(customerId: string, options = {}) {
    const profile = mockProfile({ customer_id: customerId, ...options });
    await this.db.prepare(
      'INSERT INTO customer_profiles (id, customer_id, name, keywords, topics, notify_threshold) VALUES (?, ?, ?, ?, ?, ?)'
    ).bind(profile.id, profile.customer_id, profile.name, JSON.stringify(profile.keywords), JSON.stringify(profile.topics), profile.notify_threshold).run();
    return profile;
  }

  async createApiKey(customerId: string) {
    const apiKey = mockApiKey(customerId);
    await this.db.prepare(
      'INSERT INTO api_keys (id, customer_id, key_hash, key_prefix, name, scopes, is_active) VALUES (?, ?, ?, ?, ?, ?, ?)'
    ).bind(apiKey.id, apiKey.customer_id, apiKey.key_hash, apiKey.key_prefix, apiKey.name, JSON.stringify(apiKey.scopes), 1).run();
    return apiKey;
  }

  async createArticle(sourceId: string, options = {}) {
    const article = mockArticle({ source_id: sourceId, ...options });
    await this.db.prepare(
      'INSERT INTO articles (id, source_id, canonical_url, url_hash, headline, body_text, published_at, processing_status) VALUES (?, ?, ?, ?, ?, ?, ?, ?)'
    ).bind(article.id, article.source_id, article.canonical_url, article.url_hash, article.headline, article.body_text, article.published_at, article.processing_status).run();
    return article;
  }

  async createSource(options: { name?: string; domain?: string; type?: string; authority_score?: number } = {}) {
    const source = {
      id: faker.string.uuid(),
      name: options.name || faker.company.name(),
      domain: options.domain || faker.internet.domainName(),
      type: options.type || 'news_org',
      authority_score: options.authority_score ?? 0.7,
      is_active: 1,
    };
    await this.db.prepare(
      'INSERT INTO sources (id, name, domain, type, authority_score, is_active) VALUES (?, ?, ?, ?, ?, ?)'
    ).bind(source.id, source.name, source.domain, source.type, source.authority_score, source.is_active).run();
    return source;
  }
}

CI/CD Integration

GitHub Actions Configuration

# .github/workflows/test.yml
name: Tests

on:
  push:
    branches: [main, staging]
  pull_request:
    branches: [main]

jobs:
  unit-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      
      - uses: pnpm/action-setup@v2
        with:
          version: 8
          
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'pnpm'
          
      - name: Install dependencies
        run: pnpm install --frozen-lockfile
        
      - name: Run unit tests
        run: pnpm test:unit --coverage
        
      - name: Upload coverage
        uses: codecov/codecov-action@v3
        with:
          files: ./coverage/lcov.info
          fail_ci_if_error: true

  integration-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      
      - uses: pnpm/action-setup@v2
        with:
          version: 8
          
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'pnpm'
          
      - name: Install dependencies
        run: pnpm install --frozen-lockfile
        
      - name: Run integration tests
        run: pnpm test:integration
        env:
          TEST_API_KEY: ${{ secrets.TEST_API_KEY }}

  e2e-tests:
    runs-on: ubuntu-latest
    needs: [unit-tests, integration-tests]
    if: github.ref == 'refs/heads/main' || github.ref == 'refs/heads/staging'
    steps:
      - uses: actions/checkout@v4
      
      - uses: pnpm/action-setup@v2
        with:
          version: 8
          
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'pnpm'
          
      - name: Install dependencies
        run: pnpm install --frozen-lockfile
        
      - name: Run E2E tests
        run: pnpm test:e2e
        env:
          API_URL: ${{ github.ref == 'refs/heads/main' && 'https://api.noozer.io' || 'https://api.staging.noozer.io' }}
          TEST_API_KEY: ${{ secrets.E2E_TEST_API_KEY }}

Package.json Scripts

{
  "scripts": {
    "test": "vitest",
    "test:unit": "vitest run --config vitest.unit.config.ts",
    "test:integration": "vitest run --config vitest.integration.config.ts",
    "test:e2e": "vitest run --config vitest.e2e.config.ts",
    "test:smoke": "vitest run tests/e2e/smoke/",
    "test:coverage": "vitest run --coverage",
    "test:watch": "vitest --watch"
  }
}

Performance Testing

Load Testing with k6

// tests/performance/load-test.js
import http from 'k6/http';
import { check, sleep } from 'k6';

export const options = {
  stages: [
    { duration: '30s', target: 20 },  // Ramp up
    { duration: '1m', target: 20 },   // Steady state
    { duration: '30s', target: 50 },  // Peak
    { duration: '30s', target: 0 },   // Ramp down
  ],
  thresholds: {
    http_req_duration: ['p(95)<500'],  // 95% of requests under 500ms
    http_req_failed: ['rate<0.01'],    // Less than 1% failure
  },
};

const API_URL = __ENV.API_URL || 'https://api.staging.noozer.io';
const API_KEY = __ENV.API_KEY;

export default function () {
  // GET /v1/feed
  const feedRes = http.get(`${API_URL}/v1/feed?limit=20`, {
    headers: { 'X-API-Key': API_KEY },
  });
  check(feedRes, {
    'feed status 200': (r) => r.status === 200,
    // Guard on status first so a non-JSON error body does not throw
    'feed has data': (r) => r.status === 200 && r.json('data').length > 0,
  });

  sleep(1);

  // GET /v1/search
  const searchRes = http.get(`${API_URL}/v1/search?q=technology&limit=10`, {
    headers: { 'X-API-Key': API_KEY },
  });
  check(searchRes, {
    'search status 200': (r) => r.status === 200,
  });

  sleep(1);
}
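
To make the two thresholds concrete, the sketch below shows how a run's request durations and failure count could be checked against `p(95)<500` and `rate<0.01`. It uses a nearest-rank percentile, which is an assumption for illustration; k6 computes its own statistics and may interpolate differently. `percentile` and `passesThresholds` are hypothetical names.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p% of
// all samples are less than or equal to it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error('no samples');
  const sorted = samples.slice().sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(rank, 1) - 1];
}

// Mirrors the thresholds in the load test: p95 under 500 ms and
// fewer than 1% failed requests.
function passesThresholds(durationsMs: number[], failedCount: number): boolean {
  const p95 = percentile(durationsMs, 95);
  const failureRate = failedCount / durationsMs.length;
  return p95 < 500 && failureRate < 0.01;
}
```

With 20 requests, one slow outlier still passes because the 95th percentile lands on the 19th fastest request; a second outlier pushes p95 over the limit and the run fails.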

Security Testing

OWASP ZAP Integration

# .github/workflows/security.yml
name: Security Scan

on:
  schedule:
    - cron: '0 0 * * 0'  # Weekly
  workflow_dispatch:

jobs:
  zap-scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      
      - name: ZAP Scan
        uses: zaproxy/action-api-scan@v0.4.0
        with:
          target: 'https://api.staging.noozer.io/v1/'
          format: openapi
          openapi: 'api/docs/api-openapi.yaml'
          
      - name: Upload Report
        uses: actions/upload-artifact@v4
        with:
          name: zap-report
          path: report_html.html

Dependency Scanning

# In test.yml
- name: Audit dependencies
  run: pnpm audit --audit-level moderate

Last updated: 2024-01-15
