Monkey Testing Your Website with GitHub Copilot × Playwright: A Practical Guide

by Gui

Introduction

A one-time check before release is not enough to assure website quality. Unexpected issues can arise at any point: content additions, library updates, CDN configuration changes, and more.

This article documents a hands-on monkey testing session in which VS Code Agent Mode (GitHub Copilot) directly operated a browser to test an entire site. It organizes the methodology the AI followed throughout, from static source code analysis to dynamic browser verification, into a systematic process.


Test Environment

| Item | Details |
| --- | --- |
| Editor | VS Code + GitHub Copilot (Agent Mode) |
| AI Model | Claude Opus 4.6 |
| Browser Control | VS Code built-in Playwright tools |
| Test Target | Static site built with Astro + UnoCSS + Cloudflare Pages |
| Preview | npm run preview (local) + production URL |

In Agent Mode, Copilot autonomously executes terminal commands, reads/writes files, and operates the browser. The tester simply instructs “please test,” and the AI automatically executes the entire process below.


Phase 1: Test Target Inventory

Full Source Code Read

The AI first scans the project’s directory structure and reads all source code for components, pages, and utilities.

src/
├── components/    ← All 28 components read
├── content/blog/  ← Frontmatter of 16 articles parsed
├── pages/         ← All routing files identified
├── layouts/       ← BaseLayout structure understood
└── utils/         ← rehype plugins & OG image generation reviewed

At this stage, the AI automatically identifies:

  • Full route list: 7 static pages + blog-related routes (articles, tags, archive, authors, pagination)
  • Interactive elements: Search modal, FAQ toggles, copy buttons, YouTube facade, scroll-to-top, hero slider
  • External integrations: ssgform.com (forms), Cloudflare Turnstile (bot protection), Google AdSense, GA4
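
To make the route inventory step concrete, here is a minimal sketch of how page files could be mapped to URLs. It assumes Astro's file-based routing with trailing-slash URLs, as seen in the routes listed in this article. The function name `routeFromPageFile` and the sample file list are hypothetical, and dynamic routes such as `[slug].astro` are deliberately left out because they are expanded at build time.

```javascript
// Map a file under src/pages/ to the URL it is served at (trailing-slash style).
// Returns null for dynamic routes and non-page files, which need separate handling.
function routeFromPageFile(file) {
  const match = /^src\/pages\/(.+)\.(astro|md|mdx)$/.exec(file);
  if (match === null) return null;
  const name = match[1];
  if (name.includes('[')) return null; // e.g. [slug].astro is expanded at build time
  const path = name.replace(/(^|\/)index$/, '');
  return path === '' ? '/' : `/${path}/`;
}

// Hypothetical file list for illustration.
const files = [
  'src/pages/index.astro',
  'src/pages/about.astro',
  'src/pages/blog/[slug].astro',
];
console.log(files.map(routeFromPageFile));
```

Routes that come from content collections (blog posts, tags, authors) would still have to be read from the build output or the sitemap.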

Automatic Test Plan Generation

From the source code analysis results, the AI automatically generates a test plan as a Todo list, so no one has to write a checklist by hand.


Phase 2: Full Route Crawl Testing

HTTP Status Verification

The built site is launched with npm run preview, and Playwright accesses all routes.

Test targets: 31 routes
├── Static pages       7 (/, /about/, /services/, etc.)
├── Blog posts        16
├── Tag pages         24
├── Archive            4
├── Pagination         2 (/blog/page/2/, /blog/page/3/)
├── Author pages       2
├── RSS                1
└── 404 test           1

Result: All routes 200 OK (except intentional 404)
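
The status check above can be sketched outside the agent as well. The following is only an illustration, not the code the AI ran: `BASE_URL`, `fetchStatuses`, and `findStatusMismatches` are hypothetical names, the port is Astro's default preview port, and the global `fetch` assumes Node 18 or later.

```javascript
// Assumption: the preview server listens on Astro's default preview port.
const BASE_URL = 'http://localhost:4321';

// Fetch each route and record the status code it actually returned.
async function fetchStatuses(routes, baseUrl = BASE_URL) {
  const results = [];
  for (const route of routes) {
    const res = await fetch(new URL(route.path, baseUrl), { redirect: 'manual' });
    results.push({ ...route, actual: res.status });
  }
  return results;
}

// Pure check: every route is expected to return 200 unless it says otherwise.
function findStatusMismatches(results) {
  return results.filter((r) => r.actual !== (r.expected ?? 200));
}
```

A route list would mark the intentional 404 explicitly, for example `{ path: '/no-such-page/', expected: 404 }`, so that it counts as a pass rather than a failure.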

DOM Structure Check

The following are automatically verified on each page:

| Check Item | Verification Method | Result |
| --- | --- | --- |
| Broken images | img.complete && img.naturalWidth === 0 | 0 found |
| Empty links | href is empty, #, or unset | 0 found |
| Unsafe external links | target="_blank" without rel="noopener" | 0 found |
| H1 count | document.querySelectorAll('h1').length === 1 | All pages OK |
| Skip link | Presence of “Skip to content” | All pages OK |
| lang attribute | html[lang="ja"] | All pages OK |
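
As a rough sketch of how these checks fit together, the code below splits the work in two: `collectPageFacts` gathers plain data inside the page (for example through Playwright's `page.evaluate`), and `findDomIssues` applies the rules from the table. Both function names and the shape of the `facts` object are assumptions made for this example. The skip link check is omitted for brevity.

```javascript
// Runs in the browser and returns serializable data about the current page.
function collectPageFacts() {
  return {
    images: [...document.querySelectorAll('img')].map((img) => ({
      src: img.getAttribute('src'),
      complete: img.complete,
      naturalWidth: img.naturalWidth,
    })),
    links: [...document.querySelectorAll('a')].map((a) => ({
      href: a.getAttribute('href'),
      target: a.getAttribute('target'),
      rel: a.getAttribute('rel') ?? '',
    })),
    h1Count: document.querySelectorAll('h1').length,
    lang: document.documentElement.getAttribute('lang'),
  };
}

// Pure rule check over the collected data; returns a list of issue descriptions.
function findDomIssues(facts, expectedLang = 'ja') {
  const issues = [];
  for (const img of facts.images) {
    if (img.complete && img.naturalWidth === 0) issues.push(`broken image: ${img.src}`);
  }
  for (const a of facts.links) {
    if (!a.href || a.href === '#') issues.push('empty link');
    if (a.target === '_blank' && !/\bnoopener\b/.test(a.rel)) {
      issues.push(`unsafe external link: ${a.href}`);
    }
  }
  if (facts.h1Count !== 1) issues.push(`expected 1 h1, found ${facts.h1Count}`);
  if (facts.lang !== expectedLang) issues.push(`unexpected lang: ${facts.lang}`);
  return issues;
}
```

Keeping the rules in a pure function means they can be exercised with hand-written data, without launching a browser.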

Internal links were collected recursively starting from the entry page, confirming that all 55 unique URLs are reachable. No dead links were found.
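
The recursive collection can be pictured as a breadth-first crawl. This is a minimal sketch under assumptions: `crawlInternalLinks` is a hypothetical name, and `getLinks(url)` is a hypothetical callback that returns the hrefs found on a page, or `null` when the page cannot be loaded. In a real run it would be backed by the browser.

```javascript
// Breadth-first crawl of same-origin links, starting from the entry page.
function crawlInternalLinks(entryUrl, getLinks) {
  const start = new URL(entryUrl);
  const visited = new Set();
  const dead = [];
  const queue = [start.href];
  while (queue.length > 0) {
    const url = queue.shift();
    if (visited.has(url)) continue;
    visited.add(url);
    const hrefs = getLinks(url);
    if (hrefs === null) {
      dead.push(url); // page could not be loaded
      continue;
    }
    for (const href of hrefs) {
      const next = new URL(href, url); // resolve relative links
      next.hash = ''; // fragments point at the same document
      if (next.origin === start.origin && !visited.has(next.href)) queue.push(next.href);
    }
  }
  return { visited: [...visited], dead };
}
```

External links and `mailto:` links are skipped by the origin comparison, so only pages on the site itself are counted.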


Phase 3: Interaction Verification

The AI directly manipulates browser elements with Playwright to verify JavaScript-powered functionality.

FAQ (<details> elements)

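// Example test code executed by the AI (runs in the page context)
const details = document.querySelectorAll('details');
for (const el of details) {
  console.assert(!el.open, 'FAQ item should start closed'); // Initial state: all closed
  el.querySelector('summary').click();
  console.assert(el.open, 'FAQ item should open on click'); // Click to open
  el.querySelector('summary').click();
  console.assert(!el.open, 'FAQ item should close on second click'); // Click again to close
}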

Search Modal (Pagefind)

  1. Open search dialog with window.openSearch()
  2. Wait for Pagefind UI to finish loading
  3. Enter “Astro” and confirm search results appear
  4. Confirm closing with ESC key

YouTube Facade Pattern

  1. Click the .yt-facade element
  2. Confirm an iframe for youtube-nocookie.com/embed/ is dynamically generated
  3. Confirm the autoplay=1 parameter is included
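
Steps 2 and 3 come down to inspecting the `src` of the generated iframe. A small sketch of that check follows; the function name `isPrivacyEmbedWithAutoplay` is hypothetical, and the way the `src` is obtained (for instance reading the iframe's attribute after clicking `.yt-facade`) is left to the browser tooling.

```javascript
// Returns true when src is a youtube-nocookie.com embed URL with autoplay=1.
function isPrivacyEmbedWithAutoplay(src) {
  let url;
  try {
    url = new URL(src);
  } catch {
    return false; // not an absolute URL
  }
  const hostOk =
    url.hostname === 'www.youtube-nocookie.com' || url.hostname === 'youtube-nocookie.com';
  return (
    hostOk &&
    url.pathname.startsWith('/embed/') &&
    url.searchParams.get('autoplay') === '1'
  );
}
```

Parsing with `URL` rather than matching substrings avoids false positives such as a tracking parameter that merely contains the text `autoplay=1`.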

Copy Button (After View Transitions)

Confirmed that code block copy buttons are re-initialized and functional after page transitions via View Transitions. The re-registration via the astro:page-load event was working correctly.
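
For readers unfamiliar with the pattern, here is a hedged sketch of what re-registration on `astro:page-load` can look like. It is not the site's actual code: `initCopyButtons`, the `.copy-button` selector, and the `data-code` and `data-copy-bound` attributes are all assumptions made for this example.

```javascript
// Attach click handlers once per button; safe to call after every navigation.
function initCopyButtons(buttons, copyText) {
  let newlyBound = 0;
  for (const button of buttons) {
    if (button.dataset.copyBound === 'true') continue; // already wired up
    button.dataset.copyBound = 'true';
    button.addEventListener('click', () => copyText(button.dataset.code ?? ''));
    newlyBound += 1;
  }
  return newlyBound;
}

// Browser wiring (not executed here). astro:page-load fires on the initial
// load and again after each View Transitions navigation.
// document.addEventListener('astro:page-load', () => {
//   initCopyButtons(document.querySelectorAll('.copy-button'),
//     (text) => navigator.clipboard.writeText(text));
// });
```

The guard flag matters because some elements can persist across navigations, and binding the same handler twice would copy twice per click.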

ScrollToTop Button

Scroll to the bottom of the page → button appears → click → confirm window.scrollY returns to 0.


Phase 4: SEO & Structured Data Audit

OGP Meta Tags

The following were verified on all pages:

  • og:title / og:description / og:image / og:url / og:type are set
  • twitter:card is set to summary_large_image
  • canonical URL is correct
  • OG image URL exists and is the recommended size (1200×630)
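
The first three checks in this list can be expressed as a small rule function. The sketch below assumes the meta tags have already been read into a plain object keyed by property or name; `findOgpIssues` and the input shape are hypothetical. The image size check is not covered because it requires fetching and decoding the image.

```javascript
const REQUIRED_OG = ['og:title', 'og:description', 'og:image', 'og:url', 'og:type'];

// meta: { 'og:title': '...', 'twitter:card': '...' }, canonical: href of link[rel=canonical].
function findOgpIssues({ meta, canonical, pageUrl }) {
  const issues = REQUIRED_OG.filter((key) => !meta[key]).map((key) => `missing ${key}`);
  if (meta['twitter:card'] !== 'summary_large_image') {
    issues.push('twitter:card is not summary_large_image');
  }
  if (canonical !== pageUrl) issues.push(`canonical mismatch: ${canonical}`);
  return issues;
}
```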

Structured Data (JSON-LD)

JSON-LD on each page was parsed to verify schema types and content.

| Page Type | Structured Data |
| --- | --- |
| All pages | Organization, WebSite |
| Blog posts | BreadcrumbList, BlogPosting, FAQPage |
| Articles with FAQ | FAQPage (mainEntity contains questions and answers) |
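
Checking schema types amounts to parsing each `application/ld+json` script and collecting its `@type` values. The following is a sketch under assumptions: `collectSchemaTypes` is a hypothetical helper, it receives the raw script contents as strings, and it only looks at top-level nodes and `@graph` entries, not at nested objects such as an author.

```javascript
// Collect the @type values declared at the top level of each JSON-LD block.
function collectSchemaTypes(jsonLdBlocks) {
  const types = new Set();
  const visit = (node) => {
    if (Array.isArray(node)) {
      node.forEach(visit);
      return;
    }
    if (node === null || typeof node !== 'object') return;
    // @type may be a single string or an array of strings.
    [].concat(node['@type'] ?? []).forEach((t) => types.add(t));
    if (node['@graph']) visit(node['@graph']);
  };
  for (const text of jsonLdBlocks) visit(JSON.parse(text));
  return types;
}
```

`JSON.parse` throws on malformed input, which is useful here: a syntax error in structured data surfaces immediately instead of being silently ignored.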

Sitemap

Confirmed that sitemap-0.xml, referenced from sitemap-index.xml, contains all 57 URLs. The sitemap reference from robots.txt was also working correctly.
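
One way to verify sitemap coverage is to compare the URLs in the sitemap against the URLs found by crawling. The sketch below is illustrative only: `extractSitemapUrls` and `diffSitemap` are hypothetical helpers, and the regular expression is a shortcut that works for simple generated sitemaps but does not decode XML entities.

```javascript
// Pull every <loc> value out of a sitemap document.
function extractSitemapUrls(xml) {
  return [...xml.matchAll(/<loc>\s*([^<\s]+)\s*<\/loc>/g)].map((m) => m[1]);
}

// Report URLs that appear on only one side of the comparison.
function diffSitemap(sitemapUrls, crawledUrls) {
  const inSitemap = new Set(sitemapUrls);
  const crawled = new Set(crawledUrls);
  return {
    missingFromSitemap: [...crawled].filter((u) => !inSitemap.has(u)),
    notCrawled: [...inSitemap].filter((u) => !crawled.has(u)),
  };
}
```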


Phase 5: Accessibility Verification

AXE engine-equivalent checks were run via Playwright on multiple pages.

| Check Item | Pages Tested | Violations |
| --- | --- | --- |
| img alt attributes | 4 | 0 |
| button labels | 4 | 0 |
| Heading hierarchy (h1→h2→h3 order) | 4 | 0 |
| Form input labels | 1 (Contact) | 0 |
| Landmark elements | 4 | 0 |
| External link rel attributes | 4 | 0 |
| tabindex values | 4 | 0 |

Zero violations across all 4 pages and all check items.
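
Of the items above, the heading hierarchy check is the easiest to illustrate. The sketch below is a simplified stand-in, not the rule set the AI actually ran: `findHeadingSkips` is a hypothetical helper that takes heading levels in document order and flags any jump of more than one level downward in the outline.

```javascript
// levels: heading levels in document order, e.g. [1, 2, 3, 2] for h1, h2, h3, h2.
function findHeadingSkips(levels) {
  const skips = [];
  for (let i = 1; i < levels.length; i += 1) {
    if (levels[i] > levels[i - 1] + 1) {
      skips.push({ index: i, from: levels[i - 1], to: levels[i] });
    }
  }
  return skips;
}
```

In the browser, the input could be built with something like `[...document.querySelectorAll('h1,h2,h3,h4,h5,h6')].map((h) => Number(h.tagName[1]))`.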


Phase 6: View Transitions Navigation Testing

With Astro View Transitions, the DOM is differentially updated, making JavaScript re-initialization a challenge. The following transition patterns were verified:

Home → Blog List → Article → Tag → Author → Contact → Services → Home

Items confirmed after each transition:

  • URL, title, and H1 are correctly updated
  • Search button works
  • Copy buttons are re-initialized
  • Breadcrumb navigation is updated
  • Zero JS errors

Phase 7: Security Header Verification

Verification of response headers on the production site:

| Header | Value |
| --- | --- |
| Content-Security-Policy | Fully configured |
| X-Frame-Options | SAMEORIGIN |
| X-Content-Type-Options | nosniff |
| Strict-Transport-Security | max-age=15552000 |
| Referrer-Policy | strict-origin-when-cross-origin |
| Permissions-Policy | geolocation=(), camera=(), etc. |
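
A header audit like this can be reduced to a table of expectations. The sketch below mirrors the values listed above; `EXPECTED_HEADERS` and `auditSecurityHeaders` are hypothetical names, and treating the observed `max-age` as a minimum is our assumption rather than a rule stated by the test.

```javascript
// One predicate per header, keyed by lowercase header name.
const EXPECTED_HEADERS = {
  'content-security-policy': (v) => v.trim().length > 0,
  'x-frame-options': (v) => v.toUpperCase() === 'SAMEORIGIN',
  'x-content-type-options': (v) => v.toLowerCase() === 'nosniff',
  'strict-transport-security': (v) =>
    Number((/max-age=(\d+)/.exec(v) ?? [])[1]) >= 15552000,
  'referrer-policy': (v) => v === 'strict-origin-when-cross-origin',
  'permissions-policy': (v) => v.includes('geolocation=()'),
};

// headers: plain object of response headers; returns the names that fail.
function auditSecurityHeaders(headers) {
  const lower = Object.fromEntries(
    Object.entries(headers).map(([name, value]) => [name.toLowerCase(), value]),
  );
  return Object.entries(EXPECTED_HEADERS)
    .filter(([name, isOk]) => !(name in lower) || !isOk(lower[name]))
    .map(([name]) => name);
}
```

With `fetch`, the input object can be produced from a response via `Object.fromEntries(response.headers)`.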

Discovered Bugs and Fixes

This test session uncovered 2 bugs, both fixed within the same session.

Bug 1: Search Modal Lacks Resilience

Symptom: If the search button is pressed before the Pagefind script finishes loading, the UI becomes unresponsive.

Cause: loadPagefindScript() had no retry mechanism after an initial failure.

Fix: The Promise cache is now cleared on failure, and a “Reload” button is displayed as fallback UI.
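
The cache-clearing half of this fix follows a common pattern, sketched below. This is not the site's actual `loadPagefindScript()`: `createCachedLoader` is a hypothetical wrapper, and `loadScript` stands for any function that returns a Promise, such as a dynamic import. The fallback “Reload” button is UI work and is not shown.

```javascript
// Wrap a Promise-returning loader so that success is cached but failure is not.
function createCachedLoader(loadScript) {
  let cached = null;
  return function load() {
    if (cached === null) {
      cached = loadScript().catch((error) => {
        cached = null; // forget the failed attempt so the next call retries
        throw error; // still report the failure to the current caller
      });
    }
    return cached;
  };
}
```

Without the reset, a single failed load would leave a rejected Promise in the cache, and every later click on the search button would fail in the same way.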

Bug 2: Missing Google Origins in CSP Header

Symptom: Google AdSense-related resources are blocked by CSP, causing console errors.

Cause: connect-src and frame-src did not include https://www.google.com / https://www.google.co.jp.

Fix: Added Google-related origins to the CSP directives in public/_headers.
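
To show what such a change does to the policy string, here is a small sketch that merges origins into one directive. `addCspSources` is a hypothetical helper written for this article; the real fix was a manual edit to public/_headers, and the sample policy in the usage note is not the site's actual policy.

```javascript
// Add sources to one directive of a CSP string, creating the directive if absent.
function addCspSources(policy, directive, sources) {
  const parts = policy.split(';').map((p) => p.trim()).filter(Boolean);
  let found = false;
  const updated = parts.map((part) => {
    const [name, ...values] = part.split(/\s+/);
    if (name !== directive) return part;
    found = true;
    const merged = [...new Set([...values, ...sources])]; // avoid duplicates
    return [name, ...merged].join(' ');
  });
  // Caution: a newly created directive no longer falls back to default-src.
  if (!found) updated.push([directive, ...sources].join(' '));
  return updated.join('; ');
}
```

For example, calling it with `connect-src` and the two Google origins on a policy that already lists `'self'` appends the origins after `'self'` and leaves other directives untouched.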


Systematizing the Testing Methodology

This AI monkey testing approach can be organized into the following layers:

Layer 1: Static Analysis (Source Code Reading)

  • Directory structure scanning
  • Component dependency mapping
  • Frontmatter schema (Zod) analysis
  • CSP and redirect configuration review

Layer 2: HTTP Layer Testing (Full Route Crawl)

  • Status code verification (200/404/301)
  • Response header audit (security, cache)
  • Sitemap, robots.txt, ads.txt consistency

Layer 3: DOM Layer Testing (Structure Verification)

  • Broken images, empty links, unsafe external links
  • H1 uniqueness and heading hierarchy
  • Meta tags (OGP, canonical, description)
  • Structured data (JSON-LD)

Layer 4: Interaction Layer Testing (Behavior Verification)

  • Click, input, keyboard operations
  • Modal open/close, form validation
  • JS re-initialization after View Transitions
  • Scroll events, lazy loading

Layer 5: Accessibility Layer Testing

  • alt attributes, labels, ARIA
  • Heading hierarchy, landmarks
  • Focus management, tabindex
  • Skip links

Limitations and Constraints

AI monkey testing has several limitations:

| Constraint | Details |
| --- | --- |
| Viewport emulation | Mobile-width emulation doesn’t work in VS Code’s built-in browser. CSS validity was verified through static analysis of build output instead |
| Network conditions | Offline and slow connection simulation not possible. Service Worker testing also not covered |
| User “feel” | Design beauty, readability, and brand consistency require human judgment |
| Authentication flows | Pages requiring login need separate secure credential management |

For responsive CSS, we instead analyzed the CSS files in the build output directly and confirmed that the @media(min-width:768px) media queries were generated correctly.


Summary

GitHub Copilot Agent Mode can complete an entire QA cycle (source code analysis → test planning → automated browser operation → bug fixing → re-verification) starting from a single instruction: “please test.”

Here’s a summary of this session’s results:

  • Test targets: 31 routes + 24 tags + 4 archives + 2 pagination = 61 routes
  • Test items: HTTP status, DOM structure, interactions, SEO, accessibility, security, View Transitions
  • Bugs found: 2 (search modal, CSP header) → fixed on the spot
  • Accessibility violations: 0
  • Dead links: 0

Combining human visual inspection with AI automated verification achieves both test coverage and efficiency.

How to Conduct AI Monkey Testing

Inventory

Read all source code to identify routes, components, and interactions to test.

Crawl Testing

Send HTTP requests to all routes, check the status codes, and detect broken images and empty links.

Interaction Verification

Operate JS-driven elements like FAQ toggles, copy buttons, search modals, and YouTube embeds.

Structure & SEO Audit

Verify structured data, OGP, meta tags, heading hierarchy, and accessibility across all pages.

Comparison with Manual Testing

Traditional Manual Testing
  • Visually check each page one by one in a browser
  • Manually create and manage checklists
  • Prone to oversight and missed checks
  • Recording reproduction steps is time-consuming

AI Monkey Testing
  • Automatically crawl all routes to verify HTTP status and DOM structure
  • AI automatically extracts test targets from source code
  • Zero-miss detection of broken images, empty links, and JS errors
  • Discovery → root cause → fix → retest all completed within a single session

Frequently Asked Questions

Is GitHub Copilot Agent Mode free to use?

The GitHub Copilot Free plan has monthly usage limits for Agent Mode. Pro and Business plans have relaxed limits. The latest features are available early in VS Code Insiders.

Can the same approach be used with browser tools other than Playwright?

We use VS Code's built-in browser tools (Simple Browser + Playwright integration). Since Copilot directly operates the browser via the run_playwright_code tool, there's no need to install Playwright separately.

Can this be applied to non-static sites?

Yes. The same approach works for SPAs and SSR sites. However, pages requiring login authentication need a mechanism to securely manage test credentials.

Can the AI also fix bugs it discovers?

In Agent Mode, file read/write is possible, so the entire flow from bug detection to fixing and build verification can be completed within a single session. In this article, we discovered 2 bugs and fixed them on the spot.

Gui

CEO of Acecore. A versatile engineer covering system development, web production, infrastructure operations, and IT education. Enjoys solving organizational and human challenges through technology.
