Takumi Tokunaga

Reproducible design for education institution search

Implementation notes for turning official public data into a search experience

Period: 2026

Background

Institution names are entered in many forms: kanji, hiragana, katakana, romanized text, abbreviations, and partial names. Exact-match search is often too brittle because users may remember a spelling that differs from the source data.

This note organizes how public education data can be transformed into a structure that is useful in a real search UI.

Design Points

  • Separate source-data acquisition from generated search data
  • Filter step by step by institution type, prefecture, institution, faculty or graduate school, and department
  • Generate search terms in hiragana, katakana, and romanized forms in addition to kanji
  • Split list search and autocomplete into separate endpoints to keep typing-time requests light
  • Publish notes on source data, processing scope, and disclaimers
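
As a rough illustration of the search-term generation point, the sketch below derives normalized terms for one record. It is not the sDB implementation: the Institution type, its field names, and the SearchTerms, toKatakana, and normalize helpers are hypothetical, and it assumes the kana reading (hiragana here) and the romanized name are already available from the source data or an earlier processing step.

```go
package main

import (
	"strings"
	"unicode"
)

// Institution is a hypothetical record shape used only for this sketch.
// Reading and Romaji are assumed to be supplied by an earlier step.
type Institution struct {
	Name    string // official name, typically kanji
	Reading string // reading in hiragana
	Romaji  string // romanized name
}

// toKatakana shifts hiragana (U+3041..U+3096) into the katakana block,
// which sits exactly 0x60 code points higher. Other runes pass through.
func toKatakana(s string) string {
	return strings.Map(func(r rune) rune {
		if r >= 0x3041 && r <= 0x3096 {
			return r + 0x60
		}
		return r
	}, s)
}

// normalize folds full-width ASCII to half-width, lowercases letters,
// and drops ASCII and ideographic spaces.
func normalize(s string) string {
	return strings.Map(func(r rune) rune {
		if r >= 0xFF01 && r <= 0xFF5E {
			r -= 0xFEE0 // full-width form to plain ASCII
		}
		if r == ' ' || r == 0x3000 {
			return -1 // a negative value removes the rune
		}
		return unicode.ToLower(r)
	}, s)
}

// SearchTerms returns the distinct, non-empty normalized forms of a record
// in a stable order: name, hiragana, katakana, romanized.
func SearchTerms(in Institution) []string {
	candidates := []string{
		normalize(in.Name),
		normalize(in.Reading),
		normalize(toKatakana(in.Reading)),
		normalize(in.Romaji),
	}
	seen := make(map[string]bool)
	terms := []string{}
	for _, c := range candidates {
		if c == "" || seen[c] {
			continue
		}
		seen[c] = true
		terms = append(terms, c)
	}
	return terms
}
```

Calling SearchTerms with a name of "東京大学", a reading of "とうきょうだいがく", and a romanized form of "Tokyo Daigaku" yields four terms: the kanji name, the hiragana reading, its katakana form, and "tokyodaigaku". Generating these ahead of time keeps the query path to a plain string comparison against stored terms.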

Implementation

The ideas are reflected in the sDB search page and API. The API is implemented in Go with PostgreSQL, the frontend is published on Cloudflare Pages, and the backend runs on Cloud Run.
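
The split between list search and autocomplete might look roughly like the following, using only the Go standard library. This is a sketch under assumptions, not the sDB API: the /autocomplete and /search paths, the q and prefecture parameters, the Entry type, and the limit of 10 are all hypothetical, and an in-memory slice stands in for PostgreSQL so the example stays self-contained.

```go
package main

import (
	"encoding/json"
	"net/http"
	"strings"
)

// Entry is a hypothetical in-memory stand-in for a database row.
type Entry struct {
	Name       string   `json:"name"`
	Prefecture string   `json:"prefecture"`
	Terms      []string `json:"-"` // pre-generated, normalized search terms
}

// suggest returns at most limit names with a term starting with prefix.
// It returns names only, so typing-time responses stay small.
func suggest(entries []Entry, prefix string, limit int) []string {
	out := []string{}
	if prefix == "" {
		return out
	}
	for _, e := range entries {
		if len(out) >= limit {
			break
		}
		for _, t := range e.Terms {
			if strings.HasPrefix(t, prefix) {
				out = append(out, e.Name)
				break
			}
		}
	}
	return out
}

// search returns full entries with a term containing q, optionally
// restricted to one prefecture. An empty q matches every entry.
func search(entries []Entry, q, prefecture string) []Entry {
	out := []Entry{}
	for _, e := range entries {
		if prefecture != "" && e.Prefecture != prefecture {
			continue
		}
		for _, t := range e.Terms {
			if strings.Contains(t, q) {
				out = append(out, e)
				break
			}
		}
	}
	return out
}

func writeJSON(w http.ResponseWriter, v any) {
	w.Header().Set("Content-Type", "application/json")
	_ = json.NewEncoder(w).Encode(v)
}

// newMux wires the two hypothetical endpoints onto separate paths.
func newMux(entries []Entry) *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/autocomplete", func(w http.ResponseWriter, r *http.Request) {
		q := strings.ToLower(r.URL.Query().Get("q"))
		writeJSON(w, suggest(entries, q, 10))
	})
	mux.HandleFunc("/search", func(w http.ResponseWriter, r *http.Request) {
		q := strings.ToLower(r.URL.Query().Get("q"))
		writeJSON(w, search(entries, q, r.URL.Query().Get("prefecture")))
	})
	return mux
}
```

Serving it would be a matter of passing newMux(entries) to http.ListenAndServe. Keeping the autocomplete handler to a capped, names-only prefix match is one way to keep per-keystroke requests cheap, while the list endpoint can afford substring matching and full records.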

The public repository does not include secrets or private data. It focuses on the reproducible processing workflow and the user-facing search surface.

Open Questions

Handling name variants, update cadence, source-diff detection, and explainable search results remain active areas for improvement. Education institution names change through reorganizations and policy changes, so the important part is not only building a database, but making the update pipeline maintainable.
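
One possible starting point for source-diff detection is to compare two snapshots of the source data keyed by a stable identifier. The sketch below assumes each institution has such an identifier and maps it to the official name; the Diff type and the diffSnapshots function are hypothetical and do not describe the actual update pipeline.

```go
package main

import "sort"

// Diff lists identifiers that changed between two snapshots.
type Diff struct {
	Added, Removed, Renamed []string
}

// diffSnapshots compares two snapshots that map a stable identifier
// (assumed to exist in the source data) to the institution name.
func diffSnapshots(prev, next map[string]string) Diff {
	var d Diff
	for code, name := range next {
		old, ok := prev[code]
		switch {
		case !ok:
			d.Added = append(d.Added, code)
		case old != name:
			d.Renamed = append(d.Renamed, code)
		}
	}
	for code := range prev {
		if _, ok := next[code]; !ok {
			d.Removed = append(d.Removed, code)
		}
	}
	// Map iteration order is random, so sort for reproducible output.
	sort.Strings(d.Added)
	sort.Strings(d.Removed)
	sort.Strings(d.Renamed)
	return d
}
```

A report of added, removed, and renamed identifiers gives a reviewer something concrete to check after each source update, and renamed entries are the natural place to regenerate search terms or keep the old name as a variant.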
