1.165 Stroke Order & Writing (CJK)#

SVG stroke order data, animated dictionaries, and stroke count databases for Chinese, Japanese, and Korean character learning


Explainer

Stroke Order & Writing: Domain Explainer#

Research ID: research-k6iy Date: 2026-01-29 Audience: Technical decision makers, product managers, architects without CJK language expertise


What This Solves#

The Problem: Chinese, Japanese, and Korean characters are made up of multiple strokes (individual pen movements), and there’s a specific sequence that experienced writers follow. Without guidance, learners often develop inconsistent or inefficient habits that are difficult to correct later.

Who Encounters This:

  • Educational platforms teaching CJK languages
  • Language learning apps adding writing practice features
  • Digital textbooks and reference materials
  • Handwriting recognition systems that need stroke order data
  • Calligraphy training applications

Why It Matters: Learning correct stroke order isn’t just about aesthetics. It affects:

  • Writing speed - The standard sequence is optimized for flow
  • Character recognition - Handwritten characters become more legible
  • Memory retention - The kinesthetic pattern helps learners remember characters
  • Cultural authenticity - Shows respect for the writing tradition

Getting the sequence wrong doesn’t prevent others from reading your writing (like spelling mistakes might), but learning the standard sequence from the start is far easier than breaking bad habits later.


Accessible Analogies#

Dance Choreography#

Learning to write complex characters is like learning dance choreography. There’s a specific sequence that experienced practitioners follow - you could technically reach the same final position by moving randomly, but:

  • The standard sequence flows naturally
  • It’s easier to perform at speed once memorized
  • Everyone learning the same sequence can practice together
  • Teachers can spot when you’re doing it wrong and correct you early

Just as dancers learn “step-ball-change” as a unit, Chinese writers learn common stroke patterns like “horizontal-then-vertical” or “left-falling-then-right-falling.”

Following a Recipe#

Think of stroke order like following a recipe’s sequence: you could add ingredients in any order and still end up with something edible, but the standard order exists because it:

  • Makes the process more efficient
  • Produces more consistent results
  • Makes it easier to troubleshoot when something goes wrong
  • Allows you to learn transferable techniques

Just as you learn “mise en place” (prep everything first) in cooking, you learn “outside-then-inside” in character writing.

Assembly Instructions#

Like assembling furniture, there’s often a “right” order that makes the process easier. You could attach parts in any sequence, but the manual’s sequence:

  • Prevents you from getting stuck (painted into a corner)
  • Makes the structure stable during assembly
  • Follows a logical progression that experienced builders recognize

Stroke order follows similar principles - certain sequences make the character easier to balance while writing and create natural momentum for the next stroke.


When You Need This#

You NEED stroke order data when:#

Building Educational Features:

  • Adding writing practice to a language learning app
  • Creating interactive worksheets or exercises
  • Building a handwriting evaluation system
  • Designing a character lookup tool for learners

Interactive Content:

  • Animated character demonstrations in digital textbooks
  • Step-by-step writing tutorials
  • Gamified learning experiences with stroke-by-stroke feedback
  • Calligraphy practice applications

Assessment Tools:

  • Evaluating whether learners are writing correctly
  • Providing real-time feedback during practice
  • Generating progress reports on writing accuracy

You DON’T need stroke order data when:#

  • Display-only applications (dictionaries that just show finished characters)
  • Reading comprehension tools (no writing practice involved)
  • Typography/font rendering (fonts handle display automatically)
  • Speech-focused learning (listening and speaking only)
  • Input methods (typing Chinese on a keyboard)

Decision Criteria:#

Ask yourself: “Will users be learning to write or evaluating their writing?” If yes, you need stroke order data. If users only need to recognize or type characters, you probably don’t.


Trade-offs#

Data Source Complexity Spectrum#

The options below run from lowest integration complexity to highest flexibility:

Option 1: Pre-built Library (Hanzi Writer)

  • ✅ Fastest integration (< 1 day)
  • ✅ Battle-tested and actively maintained
  • ✅ Handles animation and rendering automatically
  • ❌ Less control over appearance and behavior
  • ❌ Web-focused (mobile requires additional work)
  • Use when: You want to ship quickly with standard features

Option 2: Raw SVG Data (Make Me a Hanzi, KanjiVG)

  • ✅ Complete control over rendering and animation
  • ✅ Use on any platform (web, mobile, desktop)
  • ✅ Customize appearance and interaction patterns
  • ❌ Requires building animation system yourself
  • ❌ More initial development time (1-2 weeks)
  • Use when: You need custom behavior or non-web platforms

Option 3: Stroke Count/Metadata Only (CCDB API, Unihan)

  • ✅ Lightweight data (just numbers and metadata)
  • ✅ Fast lookups for reference purposes
  • ✅ Minimal integration effort
  • ❌ No visual demonstration of sequence
  • ❌ Cannot show animated writing
  • Use when: You only need stroke counts for sorting/searching

Build vs. Integrate#

Integrate Existing Data (Recommended):

  • Pros: Data already verified by language experts, covers thousands of characters, maintained by community
  • Cons: Must accept existing coverage (some rare characters missing)
  • Timeline: MVP in < 1 week

Build Your Own Dataset:

  • Pros: Complete control over coverage and accuracy
  • Cons: Requires language expertise, time-intensive (months), error-prone without validation
  • Timeline: 3-6 months for basic coverage
  • Reality check: Only consider this if you have native-level expertise in the target language AND need characters not covered by existing datasets

Verdict: Unless you’re a language institute with specific research needs, integrate existing open-source data. The community datasets are production-ready and cover 99%+ of use cases.


Implementation Reality#

Realistic Timeline Expectations#

Week 1: Research and Setup

  • Evaluate data sources for your needs
  • Verify licensing compatibility with your project
  • Set up development environment
  • Download and test datasets locally

Weeks 2-3: Core Development

  • Integrate stroke order library or build rendering system
  • Create basic practice interface
  • Implement character lookup and display
  • Build minimal progress tracking

Weeks 4-6: Polish and Testing

  • Add animations and visual feedback
  • Test across devices and browsers
  • Create learning content and exercises
  • Beta test with real learners

Reality Check: Getting a basic demo working takes days. Building a polished, learner-friendly experience takes weeks. Creating comprehensive curriculum content (selecting which characters to teach, in what order, with what exercises) takes months.

Team Skill Requirements#

Minimum Team:

  • 1 frontend developer (React/Vue/Angular or mobile native)
  • 1 backend developer (if building API, otherwise optional)
  • 1 content creator/language expert (part-time) for curriculum

Skills Needed:

  • Frontend: Working with SVG, animations (CSS/Canvas/WebGL)
  • Backend: REST APIs, database queries (if building lookup service)
  • Language: Understanding of target language writing system (or access to expert reviewer)

Can One Person Do This?: Yes, for an MVP. A full-stack developer with basic knowledge of the target language can build a functional prototype. However, quality content creation and cultural accuracy require native-level expertise.

Common Pitfalls#

Underestimating Mobile Performance:

  • SVG animations can be sluggish on older devices
  • Solution: Test early on low-end hardware, optimize rendering, consider Canvas instead of SVG for complex animations

Assuming All Characters Are Available:

  • Even comprehensive datasets have gaps (rare variants, historical forms)
  • Solution: Check coverage for YOUR specific character set early, have a fallback display for missing data

Ignoring Regional Variations:

  • Simplified vs. Traditional Chinese have different forms
  • Japanese kanji may differ from Chinese equivalents
  • Solution: Clearly define your target writing system upfront

Overlooking Licensing:

  • Some datasets have share-alike requirements (CC BY-SA)
  • Solution: Review licenses in Phase 1, ensure compliance with attribution requirements

First 90 Days: What to Expect#

Days 1-30: Building

  • You’ll have a working prototype that can display and animate characters
  • Expect excitement as you see characters come to life
  • Also expect frustration with edge cases and rendering quirks

Days 31-60: Testing

  • Beta testers will find bugs you never imagined
  • You’ll realize content creation (writing exercises, learning paths) is more work than the tech
  • Performance issues on real-world devices will surface

Days 61-90: Refining

  • You’ll iterate based on user feedback
  • The tech will feel stable, but content creation will feel endless
  • Marketing and user acquisition will become the bottleneck

Key Insight: The technical challenge of stroke order visualization is solved (libraries exist, data is available, integration is straightforward). The real work is creating engaging educational content and building a user base. Budget your time accordingly - 20% tech, 80% content and marketing.


Summary for Decision Makers#

The Data Exists: Open-source stroke order datasets cover 9,000+ Chinese characters and offer full coverage of standard Japanese kanji, with open licenses suitable for commercial use (check attribution and share-alike terms per dataset).

The Tools Are Ready: Libraries like Hanzi Writer make integration straightforward for web applications. Raw SVG data is available for custom implementations.

The Challenge Is Execution: Technology is not the bottleneck. Success depends on:

  • Creating effective learning content
  • Designing an engaging user experience
  • Acquiring and retaining learners

Time to First Demo: < 1 week for basic web implementation using existing libraries.

Time to Production: 6-8 weeks for a polished MVP with core features and initial content.

Skills Required: Frontend developer + language expert (or consultant) for content validation.

Cost: Primarily developer time and content creation - all stroke order data is free and open-source. No API costs or licensing fees for the core data.


Document Status: Complete Related Documents: See 01-discovery/ for detailed technical resources and implementation guides

S1: Rapid Discovery

Stroke Order Resources: Quick Reference#

Research ID: research-k6iy Date: 2026-01-29 Pass: S1 (Rapid Discovery)

TL;DR#

Need stroke order data for CJK characters? Start here:

| Resource | Language | Coverage | Best For |
| --- | --- | --- | --- |
| Hanzi Writer | Chinese | 9,000+ | Web apps (easiest) |
| Make Me a Hanzi | Chinese | 9,000+ | Custom implementations |
| KanjiVG | Japanese | Kanji | Production-ready SVGs |
| animCJK | CJK (all) | 7,672+ | Multi-language apps |
| CCDB API | Chinese | 20,902 | Stroke count lookups |

Quick Start#

Web App (5 minutes):

import HanziWriter from 'hanzi-writer';
const writer = HanziWriter.create('div-id', '你', {
  width: 100, height: 100
});
writer.animateCharacter();

Stroke Count Lookup:

  • API: http://ccdb.hemiola.com/characters/unicode/{codepoint}
  • Python: pip install cjklib
  • Database: ChineseStrokes (81,000 characters)
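The CCDB endpoint above is keyed by the character's Unicode codepoint in hexadecimal. A minimal sketch of building the lookup URL (the URL pattern is taken from the endpoint above; whether the API expects upper- or lowercase hex is worth verifying against the live service):

```javascript
// Build a CCDB lookup URL from a single CJK character.
// The API is keyed by the character's Unicode codepoint in hex.
function ccdbUrl(character) {
  const codepoint = character.codePointAt(0).toString(16);
  return `http://ccdb.hemiola.com/characters/unicode/${codepoint}`;
}

// Example: '你' is U+4F60
console.log(ccdbUrl('你')); // http://ccdb.hemiola.com/characters/unicode/4f60
```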

Licensing Quick Check#

| Resource | License | Commercial OK? |
| --- | --- | --- |
| Hanzi Writer | MIT | ✅ Yes |
| KanjiVG | CC BY-SA 3.0 | ✅ Yes (attribution + share-alike) |
| Make Me a Hanzi | Mixed | ⚠️ Check repo |
| animCJK | Open-source | ⚠️ Verify license |
| cjklib | LGPL | ✅ Yes |

Next Steps#

  1. For web apps: Start with Hanzi Writer (easiest integration)
  2. For custom needs: Use Make Me a Hanzi or KanjiVG SVGs directly
  3. For stroke counts: CCDB API or cjklib
  4. For deep dive: See S2-comprehensive for full catalog

Key Files Location#

  • S2-comprehensive/: Full catalog of all data sources
  • S3-need-driven/: Implementation guides and use cases
  • S4-strategic/: Implementation roadmap and metrics

S2: Comprehensive

Stroke Order Data Sources: Comprehensive Catalog#

Research ID: research-k6iy Date: 2026-01-29 Pass: S2 (Comprehensive Coverage) Purpose: Complete catalog of SVG stroke order data, stroke count databases, and animated dictionary resources for CJK characters


1. SVG Stroke Order Data Sources#

1.1 Make Me a Hanzi (Chinese Characters)#

Repository: skishore/makemeahanzi Website: makemeahanzi Coverage: 9,000+ most common simplified and traditional Chinese characters License: Multiple (see repository for details)

Key Features:

  • Stroke-order vector graphics for all characters
  • Dictionary data (definitions, pinyin)
  • Graphical data (stroke decomposition)
  • Experimental animated SVGs (svgs.tar.gz)
  • SVGs named by Unicode codepoint

Data Format:

  • dictionary.txt - character definitions, pronunciations
  • graphics.txt - stroke order and decomposition data
  • svgs.tar.gz - pre-rendered animated SVG files

Use Cases:

  • Foundation for building stroke order animation systems
  • Reference for stroke decomposition algorithms
  • Educational apps requiring accurate stroke order
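The graphics.txt file is newline-delimited JSON: one entry per character with its SVG stroke paths and median point sequences. A parsing sketch (the field names "character", "strokes", and "medians" follow the repository's documented format, but verify against the version you download):

```javascript
// Parse one line of Make Me a Hanzi's graphics.txt (newline-delimited JSON).
// Field names follow the repo's README; check them against your download.
function parseGraphicsLine(line) {
  const entry = JSON.parse(line);
  return {
    character: entry.character,
    strokeCount: entry.strokes.length, // one SVG path string per stroke
    medians: entry.medians             // point sequences tracing each stroke
  };
}

// A shortened sample in the same shape as a real entry:
const sampleLine =
  '{"character":"二","strokes":["M ...","M ..."],"medians":[[[1,2],[3,4]],[[5,6],[7,8]]]}';
const parsed = parseGraphicsLine(sampleLine);
console.log(parsed.character, parsed.strokeCount); // 二 2
```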

1.2 KanjiVG (Japanese Kanji)#

Repository: KanjiVG/kanjivg Website: kanjivg.tagaini.net Coverage: Japanese kanji characters License: Creative Commons Attribution-Share Alike 3.0

Key Features:

  • SVG file for each character with stroke shape and direction
  • Stroke order information
  • Component metadata (radicals, stroke types)
  • Variant forms included
  • Widely adopted (used by Duolingo, many dictionary sites)

Distribution:

  • Zip file with all non-variant SVG files
  • Individual files in repository
  • Vector graphics suitable for scaling

Notable Users:

  • Duolingo (language learning platform)
  • Multiple Japanese dictionary websites
  • Educational apps for kanji learning

1.3 HanziVG (Chinese Hanzi)#

Repository: Connum/hanzivg Goal: Become for Chinese what KanjiVG is for Japanese Coverage: Traditional and Simplified Chinese characters

Key Features:

  • SVG stroke order files with metadata
  • Radical information
  • Character component decomposition
  • Modeled after KanjiVG structure

Status: Active development, growing coverage


1.4 animCJK (Multi-Language)#

Repository: parsimonhi/animCJK Coverage: Chinese, Japanese (Kanji + Kana), Korean (Hanja) Total Characters: 7,672+ in Chinese simplified folder

Key Features:

  • Animated stroke order using SVG
  • Free and open-source
  • Multi-language support (CJK)
  • Organized by language:
    • svgsZhHans/ - Simplified Chinese (7,000 common + uncommon)
    • Traditional Chinese variants
    • Japanese Kanji and Kana
    • Korean Hanja
    • Basic strokes and components

Use Cases:

  • Universal CJK character applications
  • Cross-language learning platforms
  • Comparative stroke order analysis

1.5 Hanzi Writer (JavaScript Library + Data)#

Repository: chanind/hanzi-writer Website: hanziwriter.org Data Explorer: chanind.github.io/hanzi-writer-data Type: JavaScript library with accompanying SVG data

Key Features:

  • Free and open-source library for stroke order animations
  • Based on Make Me a Hanzi data
  • HTML5 + SVG rendering
  • Stroke order practice quizzes
  • Embeddable in web applications
  • Character data in separate repository

Technical Stack:

  • JavaScript/TypeScript
  • SVG rendering
  • No backend required

Use Cases:

  • Web-based character writing practice
  • Interactive quizzes
  • Browser-based learning applications

2. Online Animated Dictionaries#

2.1 strokeorder.info#

URL: strokeorder.info Format: Animated GIFs Coverage: 4,000+ characters

Features:

  • Pre-rendered animated GIFs
  • Instant playback (no JavaScript required)
  • Easy to embed in static sites

2.2 strokeorder.com#

URL: strokeorder.com

Features:

  • Type-to-animate interface
  • Automatic playback on character entry
  • Interactive stroke order display

2.3 Chinese Character Web API#

URL: ccdb.hemiola.com Type: RESTful API Data Source: Unihan Database (MySQL + PHP)

Key Features:

  • 20,902 characters (CJK Unified Ideographs range)
  • Stroke count information
  • Radical lookup (kRSKangXi field)
  • Programmatic access

Use Cases:

  • Backend for dictionary apps
  • Automated stroke count lookup
  • Character metadata retrieval

3. Stroke Count Databases#

3.1 Chinese Character Stroke Count Resources#

GitHub Repository: caiguanhao/ChineseStrokes Coverage: 81,000+ Chinese characters Purpose: Sort characters by stroke count

Key Features:

  • Comprehensive stroke count data
  • Suitable for dictionary lookup systems
  • Enables stroke-based search

Use Cases:

  • Implement radical/stroke lookup in dictionaries
  • Sort characters by complexity
  • Character learning progression systems

3.2 Unihan Database (kTotalStrokes)#

Source: Unicode Consortium Coverage: 101,996 CJK unified ideographs (as of Unicode 17.0) Field: kTotalStrokes

Note: Some errors exist in the data. Cross-reference recommended.

Access Methods:

  • Direct download from Unicode.org
  • Via libraries (cjklib, Python)
  • Through APIs (CCDB)
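Unihan ships as tab-separated text, with kTotalStrokes entries shaped like `U+4E00<TAB>kTotalStrokes<TAB>1`. A parsing sketch (the field can carry more than one value for locale-dependent counts; this sketch keeps only the first, which is a simplification):

```javascript
// Parse kTotalStrokes entries from a Unihan data file (tab-separated).
// Lines look like "U+4E00\tkTotalStrokes\t1"; comment lines start with "#".
// Multi-value entries (locale-dependent counts) are reduced to the first value.
function parseTotalStrokes(text) {
  const counts = new Map();
  for (const line of text.split('\n')) {
    if (line.startsWith('#') || line.trim() === '') continue;
    const [codepoint, field, value] = line.split('\t');
    if (field !== 'kTotalStrokes') continue;
    const char = String.fromCodePoint(parseInt(codepoint.slice(2), 16));
    counts.set(char, parseInt(value.split(' ')[0], 10));
  }
  return counts;
}

const unihanSample = '# comment\nU+4E00\tkTotalStrokes\t1\nU+4E8C\tkTotalStrokes\t2\n';
const strokeMap = parseTotalStrokes(unihanSample);
console.log(strokeMap.get('一'), strokeMap.get('二')); // 1 2
```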

3.3 cjklib (Python Library)#

PyPI: cjklib Documentation: cjklib.readthedocs.io

Key Features:

  • Language routines for Han characters (Chinese, Japanese, Korean, Vietnamese)
  • Character pronunciations
  • Radical information
  • Glyph component analysis
  • Stroke decomposition
  • Variant information
  • Locale-aware stroke counts (simplified vs. traditional)

Important: Stroke counts can vary by locale (traditional vs. simplified Chinese)

Use Cases:

  • Building Python-based dictionary tools
  • Linguistic analysis
  • Character decomposition systems

3.4 KRADFILE/RADKFILE (Kanji Radical Decomposition)#

Maintainer: Electronic Dictionary Research and Development Group (EDRDG) Website: edrdg.org/krad/kradinf.html License: EDRDG License Coverage: 6,355+ kanji (JIS X 0208-1997) + 5,801 (JIS X 0212)

Key Features:

  • Kanji decomposition into visual elements/radicals
  • Enables radical-based lookup
  • KRADFILE: Kanji → Radicals mapping
  • RADKFILE: Radicals → Kanji mapping (inverted, used by lookup software)

Historical Context:

  • Initial work by Michael Raine (1994/1995)
  • Revised by Jim Breen (1995)
  • Extended by Jim Rose (2007)

Use Cases:

  • Implement radical-based kanji search
  • Component-based learning systems
  • Dictionary lookup by visual elements
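RADKFILE is essentially KRADFILE inverted, so lookup software can go from a selected radical to candidate kanji. A sketch of building that inverted index (the object literal is illustrative data, not the actual file syntax):

```javascript
// Build a RADKFILE-style inverted index (radical -> kanji) from
// KRADFILE-style data (kanji -> component radicals).
function invertKradfile(kanjiToRadicals) {
  const radicalToKanji = new Map();
  for (const [kanji, radicals] of Object.entries(kanjiToRadicals)) {
    for (const radical of radicals) {
      if (!radicalToKanji.has(radical)) radicalToKanji.set(radical, []);
      radicalToKanji.get(radical).push(kanji);
    }
  }
  return radicalToKanji;
}

// Illustrative decompositions (not the real file format):
const krad = { '汰': ['氵', '大'], '汲': ['氵', '及'], '休': ['亻', '木'] };
const radk = invertKradfile(krad);
console.log(radk.get('氵')); // [ '汰', '汲' ]
```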

4. Reference Data#

4.1 Frequency and Stroke Count Tables#

Resource: technology.chtsai.org/charfreq

Available Data:

  • Characters sorted by frequency
  • Stroke counts for common characters
  • Statistical analysis

4.2 Wiktionary Appendix#

Resource: Wiktionary - Chinese total strokes

Features:

  • Community-maintained stroke count data
  • Free to use
  • Multiple character variants



Document Status: Complete Last Updated: 2026-01-29

S3: Need-Driven

Stroke Order Implementation Guide#

Research ID: research-k6iy Date: 2026-01-29 Pass: S3 (Need-Driven Application) Purpose: Practical guidance for implementing stroke order features in educational platforms


1. Implementation Considerations#

1.1 Licensing#

Open Licenses:

  • Make Me a Hanzi: Mixed licenses (check repository)
  • KanjiVG: CC BY-SA 3.0 (attribution + share-alike)
  • animCJK: Open-source (verify specific license)
  • KRADFILE: EDRDG License (check restrictions)

Action Items:

  • Review license terms before commercial use
  • Provide proper attribution
  • Comply with share-alike requirements where applicable

1.2 Data Formats#

SVG (Recommended for stroke order):

  • Scalable without quality loss
  • Embeddable in web/mobile apps
  • Supports animation paths
  • Lightweight

JSON (Recommended for metadata):

  • Easy to parse
  • Works with all modern platforms
  • Suitable for APIs

GIF (Legacy, limited use):

  • Pre-rendered animations
  • No customization
  • Larger file sizes

1.3 Technical Integration#

For Web Applications:

// Example: Hanzi Writer
import HanziWriter from 'hanzi-writer';

const writer = HanziWriter.create('character-target-div', '你', {
  width: 100,
  height: 100,
  padding: 5
});

writer.animateCharacter();

For Mobile Applications:

  • Embed SVG files directly
  • Use native SVG rendering libraries
  • Pre-cache common characters for offline use

For Backend Systems:

  • cjklib (Python) for character analysis
  • Chinese Character Web API for stroke counts
  • PostgreSQL with Unihan data for lookups

1.4 Performance Optimization#

Strategies:

  1. Lazy Loading: Load stroke data only when character is displayed
  2. Caching: Pre-cache common characters (top 3,000)
  3. CDN: Serve SVG files from CDN for faster delivery
  4. Progressive Enhancement: Show static character first, animate on interaction
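Lazy loading and caching combine naturally into a memoized loader. A sketch, where `fetchStrokeData` is a placeholder for whatever actually retrieves the SVG/JSON (CDN fetch, bundled file), not a real API:

```javascript
// Lazy-load stroke data on first use and cache it in memory afterwards.
// `fetchStrokeData` is a stand-in for the real retrieval function.
function makeStrokeDataLoader(fetchStrokeData) {
  const cache = new Map();
  return async function load(character) {
    if (!cache.has(character)) {
      cache.set(character, await fetchStrokeData(character));
    }
    return cache.get(character);
  };
}

// Usage: repeated lookups hit the underlying fetch only once per character.
let fetches = 0;
const load = makeStrokeDataLoader(async (ch) => { fetches++; return `data:${ch}`; });
load('你').then(() => load('你')).then((data) => {
  console.log(data, fetches); // data:你 1
});
```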

Estimated Data Sizes:

  • Per-character SVG: 2-10 KB
  • 1,000 characters: 2-10 MB
  • Full dataset (9,000+): 18-90 MB

2. Use Cases for Learning Applications#

2.1 Stroke Order Practice#

Features:

  • Display stroke-by-stroke animation
  • User traces character with finger/stylus
  • Real-time validation of stroke direction and order
  • Feedback on accuracy

Data Required:

  • SVG stroke paths (from Make Me a Hanzi or KanjiVG)
  • Stroke sequence metadata
  • Direction vectors
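One simple way to validate a traced stroke is to compare the drawn points against the reference median point-by-point and accept if the average distance is under a threshold. Real systems resample both paths to equal length and also check direction and order; this is a deliberately simplified sketch:

```javascript
// Naive stroke validation: average point-to-point distance between a drawn
// stroke and the reference median (e.g. from Make Me a Hanzi) under a
// threshold. Simplified: no resampling, no direction/order checks.
function strokeMatches(drawn, reference, threshold = 20) {
  const n = Math.min(drawn.length, reference.length);
  if (n === 0) return false;
  let total = 0;
  for (let i = 0; i < n; i++) {
    const dx = drawn[i][0] - reference[i][0];
    const dy = drawn[i][1] - reference[i][1];
    total += Math.hypot(dx, dy);
  }
  return total / n <= threshold;
}

const reference = [[10, 50], [50, 50], [90, 50]]; // a horizontal stroke
console.log(strokeMatches([[12, 48], [51, 53], [88, 49]], reference)); // true
console.log(strokeMatches([[10, 50], [10, 90], [10, 130]], reference)); // false
```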

2.2 Dictionary Lookup by Stroke Count#

Features:

  • Filter characters by total stroke count
  • Combine with radical lookup
  • Progressive narrowing (radical + stroke count)

Data Required:

  • Stroke count database (Unihan or ChineseStrokes)
  • Radical decomposition (KRADFILE)

Example Lookup:

User: "Radical 水 (water) + 7 strokes"
Result: 汰, 汲, 汴, 汾 (candidates)
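Progressive narrowing is just the intersection of two filters over the character table. A sketch over a tiny inline sample (in practice the table would come from Unihan/ChineseStrokes plus radical data such as KRADFILE):

```javascript
// Progressive narrowing: filter a character table by radical, then by
// total stroke count. The inline table is a tiny illustrative sample.
function lookup(table, { radical, strokes }) {
  return table
    .filter((e) => radical === undefined || e.radical === radical)
    .filter((e) => strokes === undefined || e.strokes === strokes)
    .map((e) => e.char);
}

const charTable = [
  { char: '江', radical: '水', strokes: 6 },
  { char: '河', radical: '水', strokes: 8 },
  { char: '好', radical: '女', strokes: 6 }
];

console.log(lookup(charTable, { radical: '水' }));             // [ '江', '河' ]
console.log(lookup(charTable, { radical: '水', strokes: 6 })); // [ '江' ]
```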

2.3 Handwriting Recognition Training#

Features:

  • Collect user stroke data
  • Train ML models for character recognition
  • Validate correct stroke order

Data Required:

  • Labeled stroke order sequences
  • Variant forms (different handwriting styles)
  • Stroke direction and timing

2.4 Gamified Learning#

Features:

  • “Draw the character” challenges
  • Timed stroke order races
  • Achievement badges for stroke accuracy

Engagement Mechanics:

  • Progress tracking (characters mastered)
  • Leaderboards (speed + accuracy)
  • Unlock levels based on stroke complexity

2.5 Adaptive Learning Paths#

Features:

  • Start with simple characters (few strokes)
  • Progress to complex characters
  • Focus on commonly confused characters

Data-Driven Approach:

  • Sort characters by stroke count (ascending)
  • Track user errors (confusion matrix)
  • Recommend practice based on weak points
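The ordering logic above can be sketched as a practice-queue builder: sort by stroke count (simple first), then promote the learner's most-missed characters. The error counts would come from the app's own tracking; everything here is illustrative:

```javascript
// Build a practice queue: most-missed characters first, then fewest
// strokes. `errorCounts` is assumed to come from the app's own tracking.
function buildPracticeQueue(chars, strokeCounts, errorCounts) {
  return [...chars].sort((a, b) => {
    const errDiff = (errorCounts[b] || 0) - (errorCounts[a] || 0);
    if (errDiff !== 0) return errDiff;        // most-missed first
    return strokeCounts[a] - strokeCounts[b]; // then fewest strokes
  });
}

const strokeCounts = { '一': 1, '学': 8, '写': 5 };
const errorCounts = { '写': 3 };
console.log(buildPracticeQueue(['学', '一', '写'], strokeCounts, errorCounts));
// [ '写', '一', '学' ]
```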

3. Integration with Educational Platforms#

3.1 Docusaurus Integration#

Approach:

  • Create MDX components for stroke order display
  • Embed Hanzi Writer or animCJK SVGs
  • Add interactive quizzes

Example MDX:

import StrokeOrder from '@site/src/components/StrokeOrder';

<StrokeOrder character="学" />

3.2 QRCards Certificate Integration#

Certificate Fields:

{
  "certification_info": {
    "type": "competency_badge",
    "name": "Hanzi Writing Fundamentals",
    "issued_date": "2026-XX-XX",
    "level": 1
  },
  "skills": {
    "characters_mastered": 500,
    "stroke_accuracy": "95%",
    "writing_speed": "15 chars/min"
  },
  "portfolio_evidence": [
    {
      "name": "Stroke Order Video",
      "url": "example.com/demo"
    }
  ]
}

3.3 Learning Path Design#

Beginner Path (8 weeks):

  • Week 1-2: Basic strokes (8 types)
  • Week 3-4: Simple characters (1-4 strokes)
  • Week 5-6: Radicals (214 traditional)
  • Week 7-8: Common characters (200 most frequent)

Intermediate Path (12 weeks):

  • Compound characters (5-12 strokes)
  • Stroke order rules and exceptions
  • Handwriting speed optimization
  • Character variants (simplified vs. traditional)

Advanced Path (16 weeks):

  • Complex characters (13+ strokes)
  • Calligraphy styles (kaishu, xingshu)
  • Historical forms
  • Error correction (common mistakes)

4.1 For Web-Based Learning Apps#

Frontend:

  • React/Next.js for UI
  • Hanzi Writer for character animations
  • SVG.js for custom stroke rendering

Backend:

  • Node.js API for character data
  • PostgreSQL with Unihan data
  • Redis for caching common characters

Data Storage:

  • CDN for SVG files (Cloudflare)
  • JSON API for metadata
  • User progress in database

4.2 For Mobile Apps#

iOS:

  • SwiftUI for UI
  • Core Graphics for SVG rendering
  • Local SQLite database with stroke data

Android:

  • Jetpack Compose for UI
  • AndroidX SVG libraries
  • Room database for offline data

Cross-Platform:

  • React Native + react-native-svg
  • Flutter + flutter_svg

5. Example Implementations#

5.1 Web Component (React)#

import React, { useEffect, useRef } from 'react';
import HanziWriter from 'hanzi-writer';

const StrokeOrderDisplay = ({ character }) => {
  const targetRef = useRef(null);
  const writerRef = useRef(null);

  useEffect(() => {
    if (targetRef.current) {
      writerRef.current = HanziWriter.create(targetRef.current, character, {
        width: 200,
        height: 200,
        padding: 10,
        showOutline: true,
        strokeAnimationSpeed: 1,
        delayBetweenStrokes: 300
      });
    }

    return () => {
      // Clear the previous writer's SVG so switching characters
      // doesn't stack multiple renderings in the same container
      if (targetRef.current) {
        targetRef.current.innerHTML = '';
      }
      writerRef.current = null;
    };
  }, [character]);

  const handleAnimate = () => {
    writerRef.current?.animateCharacter();
  };

  const handleQuiz = () => {
    writerRef.current?.quiz();
  };

  return (
    <div>
      <div ref={targetRef} />
      <button onClick={handleAnimate}>Animate</button>
      <button onClick={handleQuiz}>Practice</button>
    </div>
  );
};

export default StrokeOrderDisplay;

5.2 Backend API (Node.js + Express)#

const express = require('express');
const { Pool } = require('pg');

const app = express();
const pool = new Pool({
  connectionString: process.env.DATABASE_URL
});

// Get stroke count for a character
app.get('/api/strokes/:character', async (req, res) => {
  const { character } = req.params;
  const codepoint = character.codePointAt(0).toString(16).toUpperCase();

  try {
    const result = await pool.query(
      'SELECT stroke_count, radical FROM unihan WHERE codepoint = $1',
      [codepoint]
    );

    if (result.rows.length === 0) {
      return res.status(404).json({ error: 'Character not found' });
    }

    res.json(result.rows[0]);
  } catch (err) {
    // Express 4 does not catch rejected promises in async handlers
    res.status(500).json({ error: 'Database error' });
  }
});

// Search characters by stroke count
app.get('/api/search/strokes/:count', async (req, res) => {
  const count = parseInt(req.params.count, 10);
  if (Number.isNaN(count)) {
    return res.status(400).json({ error: 'Invalid stroke count' });
  }

  try {
    const result = await pool.query(
      'SELECT codepoint, character FROM unihan WHERE stroke_count = $1 LIMIT 100',
      [count]
    );
    res.json(result.rows);
  } catch (err) {
    res.status(500).json({ error: 'Database error' });
  }
});

app.listen(3000, () => {
  console.log('API running on port 3000');
});

5.3 Python Stroke Analysis#

from cjklib import characterlookup

cjk = characterlookup.CharacterLookup('C')  # 'C' for Chinese

# Get stroke count
character = '学'
stroke_count = cjk.getStrokeCount(character)
print(f"Stroke count for {character}: {stroke_count}")

# Get radicals
radicals = cjk.getCharacterRadicalResidualStrokeCount(character)
print(f"Radicals: {radicals}")

# Find characters by stroke count
chars_with_5_strokes = cjk.getCharactersForStrokeCount(5)
print(f"Characters with 5 strokes: {chars_with_5_strokes[:10]}")

6. Testing and Validation#

6.1 Data Quality Checks#

Validation Steps:

  1. Verify stroke count matches across data sources
  2. Check SVG files render correctly
  3. Validate stroke order follows standard conventions
  4. Test on different screen sizes

Automated Testing:

describe('Stroke Order Data', () => {
  test('SVG files exist for common characters', async () => {
    const commonChars = ['的', '一', '是', '不', '了'];

    for (const char of commonChars) {
      const svg = await loadCharacterSVG(char);
      expect(svg).toBeDefined();
      expect(svg).toContain('<path');
    }
  });

  test('Stroke counts match database', async () => {
    const testCases = [
      { char: '一', expectedStrokes: 1 },
      { char: '二', expectedStrokes: 2 },
      { char: '三', expectedStrokes: 3 }
    ];

    for (const { char, expectedStrokes } of testCases) {
      const count = await getStrokeCount(char);
      expect(count).toBe(expectedStrokes);
    }
  });
});

6.2 User Experience Testing#

Test Scenarios:

  • Stroke animation speed (too fast/slow?)
  • Touch responsiveness on mobile
  • Accuracy threshold for practice mode
  • Feedback clarity (correct/incorrect strokes)

Metrics to Track:

  • Animation load time
  • Practice completion rate
  • User accuracy over time
  • Session engagement duration

7. Deployment Checklist#

7.1 Data Preparation#

  • Download required datasets (Make Me a Hanzi, KanjiVG, etc.)
  • Process SVG files for CDN delivery
  • Set up database with Unihan data
  • Create character metadata JSON files
  • Implement caching strategy

7.2 Infrastructure#

  • Set up CDN for SVG files
  • Configure API endpoints
  • Set up Redis for caching
  • Configure database backups
  • Set up monitoring and logging

7.3 Integration#

  • Test Hanzi Writer integration
  • Verify mobile responsiveness
  • Test offline functionality
  • Validate cross-browser compatibility
  • Test performance under load

7.4 Content#

  • Create learning path content
  • Write exercise instructions
  • Prepare quiz questions
  • Create tutorial videos (optional)
  • Design achievement badges

Document Status: Complete Last Updated: 2026-01-29 Related: See S2-comprehensive for data sources, S4-strategic for roadmap

S4: Strategic

Stroke Order Implementation: Strategic Roadmap#

Research ID: research-k6iy Date: 2026-01-29 Pass: S4 (Strategic Planning) Purpose: High-level implementation strategy, research gaps, success metrics, and recommendations


1. Research Gaps and Future Work#

1.1 Missing Coverage#

Gaps:

  • Korean Hangul stroke order (limited resources)
  • Vietnamese Chu Nom characters
  • Historical Chinese variants
  • Regional variations in stroke order

Opportunities:

  • Crowdsource additional data
  • Partner with language institutes
  • Expand animCJK coverage

1.2 Quality Improvements#

Needed:

  • Error correction in Unihan stroke counts
  • Standardization across datasets
  • Variant form mapping (simplified ↔ traditional)
  • Handwriting style variations

1.3 AI/ML Applications#

Potential:

  • Stroke prediction models (next stroke suggestion)
  • Handwriting style transfer
  • Automated stroke order generation for rare characters
  • Personalized difficulty adaptation

2. Implementation Roadmap#

Phase 1: Data Acquisition (Week 1)#

Objectives:

  • Secure all required datasets
  • Verify licensing compatibility
  • Set up local development environment

Tasks:

  • Download Make Me a Hanzi dataset
  • Clone KanjiVG repository
  • Set up local mirror of CCDB API
  • Download ChineseStrokes database
  • Review license terms for commercial use

Deliverables:

  • Local data repository
  • License compliance documentation
  • Data inventory spreadsheet

Phase 2: Infrastructure Setup (Week 2)#

Objectives:

  • Build backend infrastructure
  • Set up data pipelines
  • Configure hosting and CDN

Tasks:

  • Set up PostgreSQL with Unihan data
  • Create CDN bucket for SVG files
  • Build REST API for character lookup
  • Implement caching layer (Redis)
  • Configure monitoring and logging

Deliverables:

  • API endpoints (stroke count, character lookup)
  • CDN with SVG files
  • Database with metadata
  • Performance monitoring dashboard

Phase 3: Frontend Development (Week 3-4)#

Objectives:

  • Build user-facing components
  • Implement interactive features
  • Ensure mobile responsiveness

Tasks:

  • Create Hanzi Writer integration
  • Build stroke order visualization component
  • Implement practice mode with validation
  • Add progress tracking
  • Design responsive layouts
  • Test cross-browser compatibility

Deliverables:

  • React components for stroke order display
  • Practice mode with scoring
  • Mobile-optimized interface
  • User progress tracking system
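
The "practice mode with scoring" deliverable reduces to grading each attempt. One way to sketch it (the function name and shape are assumptions, not Hanzi Writer's API) is to score an attempt from the character's total stroke count and the number of mis-drawn strokes, with a pass threshold matching the 80% accuracy targets used in the success metrics:

```javascript
// Score a practice attempt: fraction of strokes drawn correctly,
// plus a pass/fail flag against a configurable threshold (sketch).
function scoreAttempt(totalStrokes, mistakes, passThreshold = 0.8) {
  const correct = Math.max(totalStrokes - mistakes, 0); // never below zero
  const accuracy = correct / totalStrokes;
  return { accuracy, passed: accuracy >= passThreshold };
}

scoreAttempt(8, 1); // accuracy 0.875 → passed
scoreAttempt(8, 3); // accuracy 0.625 → not passed
```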

Phase 4: Content Creation (Weeks 5-6)#

Objectives:

  • Develop learning curriculum
  • Create exercises and assessments
  • Prepare supporting materials

Tasks:

  • Design learning path curriculum
  • Write exercises and quizzes
  • Create video tutorials (optional)
  • Develop grading rubrics
  • Design achievement badges
  • Write instructional content

Deliverables:

  • Structured learning paths (Beginner, Intermediate, Advanced)
  • 50+ practice exercises
  • Quiz bank (100+ questions)
  • Achievement system
  • Tutorial videos (if included)

Phase 5: Testing & Launch (Weeks 7-8)#

Objectives:

  • Validate functionality
  • Optimize performance
  • Launch pilot program

Tasks:

  • Beta test with 10 learners
  • Collect feedback on UX
  • Optimize performance
  • Launch pilot learning path
  • Monitor initial usage metrics
  • Iterate based on feedback

Deliverables:

  • Beta test report
  • Performance optimization results
  • Launch-ready platform
  • Initial user feedback summary

3. Success Metrics#

3.1 Engagement Metrics#

Daily Active Users (DAU):

  • Target: 50+ users within first month
  • Growth rate: 20% month-over-month

Characters Practiced per Session:

  • Target: 10-20 characters
  • Indicator of engagement depth

Session Duration:

  • Target: 15+ minutes average
  • Indicates meaningful practice time

Return Rate:

  • Target: 40%+ weekly return rate
  • Measures habit formation

3.2 Learning Outcome Metrics#

Stroke Accuracy Improvement:

  • Baseline: Initial assessment score
  • Target: 20%+ improvement after 4 weeks
  • Measure: Automated scoring of practice exercises

Character Retention Rate:

  • 1 week retention: 70%+ (characters practiced still remembered)
  • 1 month retention: 50%+ (long-term memory formation)
  • Measure: Periodic review quizzes

Writing Speed Increase:

  • Baseline: Characters per minute at start
  • Target: 30%+ improvement after 8 weeks
  • Measure: Timed writing exercises

Mastery Progression:

  • Beginner (1-4 strokes): 80%+ accuracy within 2 weeks
  • Intermediate (5-12 strokes): 80%+ accuracy within 6 weeks
  • Advanced (13+ strokes): 70%+ accuracy within 12 weeks
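
The stroke-count tiers above can be encoded as a small lookup helper (a sketch; the boundaries come straight from the mastery targets):

```javascript
// Map a character's stroke count to its mastery tier.
function masteryTier(strokeCount) {
  if (strokeCount <= 4) return 'Beginner';      // 1-4 strokes
  if (strokeCount <= 12) return 'Intermediate'; // 5-12 strokes
  return 'Advanced';                            // 13+ strokes
}

masteryTier(3);  // 'Beginner'
masteryTier(8);  // 'Intermediate'
masteryTier(15); // 'Advanced'
```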

3.3 Business Metrics#

Learning Path Completion Rate:

  • Target: 50%+ completion for enrolled users
  • Industry benchmark: 30-40% for online courses
  • Indicates content quality and engagement

Certificate Issuance Volume:

  • Target: 20+ certificates in first quarter
  • Demonstrates skill achievement
  • Marketing value (user testimonials)

User Satisfaction (NPS Score):

  • Target: NPS > 40 (good)
  • Stretch goal: NPS > 70 (excellent)
  • Measure: Post-learning path survey

Cost per Acquisition (CPA):

  • Baseline: Track marketing spend
  • Target: CPA < $10 for free tier users
  • Measure: Marketing spend / new users

Lifetime Value (LTV):

  • For paid tiers (if applicable)
  • Target: LTV > 3x CPA
  • Measure: Average revenue per user over 12 months
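
The two formulas above (CPA = marketing spend / new users; healthy LTV > 3× CPA) as helpers, with illustrative numbers only:

```javascript
// Business-metric helpers matching the definitions above (sketch).
const cpa = (marketingSpend, newUsers) => marketingSpend / newUsers;
const ltvHealthy = (ltv, cpaValue) => ltv > 3 * cpaValue;

const c = cpa(500, 100); // $5 per acquired user — under the $10 target
ltvHealthy(20, c);       // true: $20 LTV exceeds 3 × $5 CPA
```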

3.4 Technical Performance Metrics#

Page Load Time:

  • Target: < 2 seconds
  • Critical for user experience

API Response Time:

  • Stroke count lookup: < 100ms
  • Character metadata: < 200ms

CDN Cache Hit Rate:

  • Target: > 95% for SVG files
  • Reduces bandwidth costs

Error Rate:

  • Target: < 0.1% of requests
  • Monitoring critical for reliability

4. Risk Assessment and Mitigation#

4.1 Technical Risks#

Risk: Data quality issues (incorrect stroke orders)

  • Impact: Medium (user confusion, learning incorrect forms)
  • Probability: Low (using established datasets)
  • Mitigation: Cross-reference multiple sources, community validation

Risk: Performance issues at scale

  • Impact: High (poor user experience)
  • Probability: Medium (depends on infrastructure)
  • Mitigation: Load testing, CDN optimization, caching strategy

Risk: Mobile compatibility issues

  • Impact: High (majority of language learners use mobile)
  • Probability: Low (tested during development)
  • Mitigation: Responsive design, device testing matrix

4.2 Business Risks#

Risk: Low user adoption

  • Impact: High (project viability)
  • Probability: Medium (depends on marketing)
  • Mitigation: Beta testing, user feedback loops, marketing strategy

Risk: Licensing issues with data sources

  • Impact: High (legal liability)
  • Probability: Low (verified during Phase 1)
  • Mitigation: Legal review, proper attribution, license compliance

Risk: Competition from established platforms

  • Impact: Medium (market share)
  • Probability: High (Duolingo, Pleco, etc. exist)
  • Mitigation: Differentiation strategy, unique features, niche targeting

4.3 Operational Risks#

Risk: Content creation bottleneck

  • Impact: Medium (delays launch)
  • Probability: Medium (resource-intensive)
  • Mitigation: Prioritize core content, phase additional content

Risk: Maintenance burden for data updates

  • Impact: Low (gradual degradation)
  • Probability: Medium (Unicode updates, new characters)
  • Mitigation: Automated data refresh scripts, community contributions

5. Strategic Recommendations#

5.1 Minimum Viable Product (MVP)#

  1. Web-first approach using Hanzi Writer

    • Fastest time to market
    • Lowest development cost
    • Proven technology stack
  2. Focus on Chinese characters initially

    • Largest user base
    • Best data availability (Make Me a Hanzi)
    • Expand to Japanese/Korean later
  3. Core features only:

    • Stroke order animation
    • Practice mode with basic validation
    • Progress tracking (characters completed)
    • Single learning path (Beginner)

Rationale: Validate product-market fit before investing in advanced features.
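
The MVP's progress tracking ("characters completed") needs little more than a set of completed characters. A minimal sketch (a real implementation would persist to the backend):

```javascript
// Minimal "characters completed" tracker for the MVP (sketch).
function makeProgressTracker() {
  const completed = new Set();
  return {
    complete(ch) { completed.add(ch); },          // idempotent: a Set ignores repeats
    isComplete(ch) { return completed.has(ch); },
    count() { return completed.size; },
  };
}

const progress = makeProgressTracker();
progress.complete('永');
progress.complete('永'); // practicing the same character again doesn't double-count
progress.count(); // 1
```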


5.2 Differentiation Strategy#

How to Stand Out:

  1. Integration with existing platforms

    • Docusaurus plugin for documentation sites
    • Embeddable widgets for blogs/tutorials
    • API for third-party apps
  2. Credential-focused

    • Issue verifiable certificates (QRCards)
    • Portfolio evidence (practice videos)
    • LinkedIn-compatible badges
  3. Adaptive learning

    • Personalized difficulty adjustment
    • Focus on user’s weak points
    • Spaced repetition for retention
  4. Community features

    • Leaderboards (opt-in)
    • Shared progress achievements
    • Study groups / cohorts

5.3 Technology Choices#

Recommended Stack:

  • Frontend: Next.js + React

    • Server-side rendering for SEO
    • Fast page loads
    • Large ecosystem
  • Stroke Animation: Hanzi Writer

    • Battle-tested library
    • Active development
    • Good documentation
  • Backend: Node.js + Express + PostgreSQL

    • JavaScript everywhere (full-stack)
    • PostgreSQL for complex queries (stroke count + radical lookup)
    • Redis for caching
  • Hosting: Vercel (frontend) + Railway (backend)

    • Easy deployment
    • Auto-scaling
    • Good free tiers for MVP
  • CDN: Cloudflare

    • Free tier sufficient for MVP
    • Global distribution
    • DDoS protection

5.4 Go-to-Market Strategy#

Phase 1: Beta Launch (Weeks 1-4)

  • Recruit 10-20 beta testers
  • Offer free lifetime access for feedback
  • Iterate based on user input

Phase 2: Soft Launch (Weeks 5-8)

  • Launch on Product Hunt, Hacker News
  • Target language learning communities (Reddit, forums)
  • Content marketing (blog posts, tutorials)

Phase 3: Growth (Weeks 9-16)

  • SEO optimization for “Chinese stroke order” keywords
  • Partnership with language schools/tutors
  • Paid ads (Google, Facebook) if budget allows

Phase 4: Scale (Weeks 17+)

  • Expand to Japanese and Korean
  • Add advanced features (calligraphy styles, handwriting recognition)
  • Enterprise sales to educational institutions

5.5 Monetization Options#

Freemium Model (Recommended):

  • Free: Basic stroke order practice (200 characters)
  • Paid ($5/month): Full character set, certificates, advanced features

One-Time Purchase:

  • $29 for lifetime access to full content
  • Appeals to serious learners

Enterprise Licensing:

  • API access for third-party apps
  • White-label for educational institutions
  • Custom content for corporate training

6. Alternative Approaches#

6.1 If Limited Resources#

Approach: Start even smaller

  • Use Hanzi Writer demo page as MVP
  • Embed pre-existing tools (strokeorder.info)
  • Focus on content curation, not tech development
  • Validate demand before building custom platform

6.2 If Large Budget Available#

Approach: Build comprehensive platform from day one

  • Mobile apps (iOS + Android) alongside web
  • AI-powered handwriting recognition
  • Live tutoring integration
  • Gamification with 3D animations
  • Multi-language from launch (Chinese + Japanese + Korean)

6.3 If Targeting Niche Audience#

Approach: Specialize deeply

  • Focus on calligraphy enthusiasts (not general learners)
  • Historical script variants (seal script, clerical script)
  • Professional certification for Chinese teachers
  • Premium pricing, boutique experience

7. Conclusion#

7.1 Key Takeaways#

  1. Ecosystem is Mature: Open-source data for CJK stroke order is production-ready (Make Me a Hanzi, KanjiVG)

  2. Low Barrier to Entry: Hanzi Writer library makes web integration straightforward (< 1 week MVP)

  3. Market Validation: Existing platforms (Duolingo, Pleco) prove demand for stroke order features

  4. Differentiation Possible: Credentials, integration, and adaptive learning offer competitive advantage

  5. Execution Matters: Success depends more on product design and marketing than data availability


7.2 Next Steps#

Immediate (This Week):

  1. Select target language (Chinese recommended)
  2. Choose data source (Hanzi Writer for easiest start)
  3. Prototype stroke order component (1 day)
  4. Show to 3-5 potential users for feedback

Short-term (Weeks 2-4):

  • Build MVP with core features only
  • Beta test with 10 users
  • Validate product-market fit

Medium-term (Months 2-3):

  • Launch publicly
  • Iterate based on usage data
  • Expand content and features

Long-term (Months 4-12):

  • Scale to additional languages
  • Add advanced features (AI recognition, calligraphy)
  • Explore monetization strategies

7.3 Critical Success Factors#

  1. User Experience: Stroke animation must be smooth and intuitive
  2. Content Quality: Learning paths must be well-structured and effective
  3. Performance: Fast load times critical for mobile learners
  4. Engagement: Gamification and progress tracking keep users coming back
  5. Differentiation: Clear value proposition vs. existing platforms

7.4 Final Recommendation#

Start with Hanzi Writer for web-based Chinese stroke order practice.

  • Fastest path to MVP
  • Proven technology
  • Best data availability
  • Largest potential user base
  • Expandable to Japanese/Korean later

Once product-market fit is validated, invest in:

  • Mobile apps
  • Advanced features (AI recognition)
  • Multi-language expansion
  • Enterprise features

The data is ready. The tools exist. The market is proven. Success depends on execution.


Document Status: Complete Last Updated: 2026-01-29 Related: See S1-rapid for quick start, S2-comprehensive for data sources, S3-need-driven for implementation details

Published: 2026-03-06 Updated: 2026-03-06