Institute for Social Vision Design
ISVD-LAB-006 Hypothesis

Machikarte Research Lab — Hypotheses and Overview: Reading Assembly Speeches as Observations

Naoya Yokota

A research lab built on a cross-searchable record of speeches from all 1,788 Japanese local assemblies, designed to read the data as observations rather than verdicts. The subject is structural — distribution, propagation, and silence across municipalities — not individual condemnation. Differentiation rests on data quality assurance, verification scripts, and transparent editorial judgment.

This note sets out the working hypotheses, observation policy, and editorial premises that all articles in the Machikarte Research Lab build upon. It serves as a map before any individual piece.

What problem we are working on

Japanese local assembly minutes have long been published — but in fragmented systems, one municipality at a time, with little capacity for cross-cutting search. Questions raised by councillors and responses from mayors and officials accumulate daily as the language that precedes policy, before it is fixed in ordinances or budget documents.

machikarte (machikarte.isvd.or.jp) is the infrastructure that arranges the minutes of Japan's 1,788 local assemblies in a form that allows cross-cutting citation; about 1,200 have been ingested so far (see Limits). On top of that base, this lab takes one stance: read the data as observations. Observations here are inputs for structural reading, not grounds for ranking or judgment.

Rather than "Municipality X is the worst at deferring decisions", the lab asks structural questions: "How is the use of deferral phrasing distributed across the country?", "Since when has the same wording appeared in different assemblies?", or "On which topics are assemblies silent, and how many of them?". This distinction is operationalised as a three-tier rule (see below).

Outline of the lab

Aspect | Content
Data base | machikarte (~126 million speech records)
Subject of publication | Structural analysis, national distribution, time-series patterns, topic-level silence counts
Not published | Evaluative descriptions of individual councillors, worst-of-the-worst naming of individual assemblies
Operator | Institute for Social Vision Design (ISVD)
Review | Editorial committee (Phase 1 in formation) + DA review

Editorial judgment — three tiers for councillor-level data

To balance research utility and neutrality, councillor-level data is published at three distinct granularities.

  • Municipality-level aggregates: "Across the country, the share of deferral phrasing per municipality ranges from 0% to 21%"
  • Party-group-level aggregates: "On a given topic, group A speaks roughly twice as often as group B"
  • Individual verbatim quotation: Always with a direct URL to the original assembly record, and without interpretive or evaluative framing
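
As a rough illustration of the first tier, a municipality-level share of deferral phrasing could be computed along the following lines. This is a minimal sketch, not the lab's published query: the phrase list, the function name `deferral_share_by_municipality`, and the toy records are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical deferral phrases; the lab's actual phrase list is not given in this note.
DEFERRAL_PHRASES = ("検討してまいります", "研究してまいります")


def deferral_share_by_municipality(speeches):
    """Return {municipality: share of speeches containing any deferral phrase}."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for municipality, text in speeches:
        totals[municipality] += 1
        if any(phrase in text for phrase in DEFERRAL_PHRASES):
            hits[municipality] += 1
    return {m: hits[m] / totals[m] for m in totals}


# Toy records (municipality, speech text), invented for illustration only.
sample = [
    ("A市", "ご指摘の件は検討してまいります。"),
    ("A市", "来年度予算に計上しました。"),
    ("B町", "実施時期は四月です。"),
    ("B町", "条例案を提出します。"),
]

shares = deferral_share_by_municipality(sample)
# Report only the range, not which municipality sits at either end.
print(f"min={min(shares.values()):.0%}, max={max(shares.values()):.0%}")
```

Printing only the minimum and maximum mirrors the editorial rule: the distribution is the published object, and no municipality is named at either end of it.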

Individual evaluative descriptions ("councillor A is unmotivated", and similar) are not produced in this lab. Individual scrutiny belongs to journalism as a separate layer; the lab's contribution stops at structural analysis.

Judgment criteria, the complaints-handling protocol, and the procedure for retractions are documented and published. Summaries of editorial committee discussions will also be made public.

Distinguishing what we push and what is pulled

The lab distinguishes between push and pull in handling findings.

Mode | Form | Expected flow
pull | A structural analysis published as a labs/machikarte article | Journalists, researchers, and citizens cite it voluntarily
push | Handing findings directly to a chosen newsroom ("This municipality is the problem") | Not done

Pushing findings to specific media would mean ISVD setting the agenda of journalism from outside, which sits in tension with journalistic ethics. The lab's role stops at placing verifiable structures in the public layer. Where a joint project shares methodology, primary data, and verification scripts with a partner, the wording of the findings themselves is not part of the deliverable.

The core of differentiation — data quality, verification scripts, transparent editorial judgment

If the National Diet Library, the Ministry of Internal Affairs and Communications, the Digital Agency, and others advance official open-data publication of assembly proceedings, the differentiator of "owning the data" may structurally erode. What the lab can keep as a long-term distinction lies in the process of preparing the data into a trustworthy form and in making that judgment externally verifiable.

Three investments anchor this position:

  • Data quality assurance: Public quality gates for deduplication, entity resolution, and municipality-name verification
  • Verification scripts: Aggregation queries and reproducible code published on GitHub so that third parties can reach the same conclusions
  • Transparent editorial judgment: Public documentation of the criteria for avoiding individual condemnation, the retraction protocol, and the editorial committee's operations
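
To make the idea of a public quality gate concrete, the sketch below shows one possible shape for two of the checks named above: municipality-name verification and deduplication. It is an assumption-laden illustration, not the lab's pipeline. The record fields, the `KNOWN_MUNICIPALITIES` set, and the function names are hypothetical, and entity resolution is not covered.

```python
import hashlib
import unicodedata

# Hypothetical reference list; a real gate would load an official municipality register.
KNOWN_MUNICIPALITIES = {"札幌市", "横浜市"}


def normalize(text):
    """NFKC-normalize and drop all whitespace so trivially different copies compare equal."""
    return "".join(unicodedata.normalize("NFKC", text).split())


def run_quality_gates(records):
    """Split records into (accepted, rejected); each rejection carries a reason."""
    seen = set()
    accepted, rejected = [], []
    for rec in records:
        if rec["municipality"] not in KNOWN_MUNICIPALITIES:
            rejected.append((rec, "unknown municipality"))
            continue
        fingerprint = "|".join((rec["municipality"], rec["date"], normalize(rec["text"])))
        key = hashlib.sha256(fingerprint.encode("utf-8")).hexdigest()
        if key in seen:
            rejected.append((rec, "duplicate"))
            continue
        seen.add(key)
        accepted.append(rec)
    return accepted, rejected


# Toy records, invented for illustration; the second differs only by a full-width space.
records = [
    {"municipality": "札幌市", "date": "2023-06-01", "text": "検討 します"},
    {"municipality": "札幌市", "date": "2023-06-01", "text": "検討　します"},
    {"municipality": "さっぽろ市", "date": "2023-06-01", "text": "別の発言"},
]

accepted, rejected = run_quality_gates(records)
reasons = [reason for _, reason in rejected]
```

Keeping the rejection reason next to each dropped record is what would let a third party audit the gate rather than take its output on trust.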

This direction connects with an earlier observation that "the real substance is in proof systems, quality, and grasp of reality".

Scope of themes

Published and forthcoming articles point in the following directions:

  • Time-series analysis: When a given word first appears in assemblies and how it spreads (first article: "When did 'AI' and 'Generative AI' first appear in Japanese local assembly speeches")
  • Structural analysis: Reading national distributions of response phrasing and the allocation of debate (flagship: distribution of response phrasing, gaps in childcare debate)
  • Structuring silence: Making visible the topics on which debate fails to take place
  • Policy propagation: How the same policy framing travels between municipalities
  • Cross-lab citation: Connection with other labs (agnotology, public-asset-ppp, and others)

Each article follows the MUST principle — "the subject is structural; do not condemn the individual" — and passes editorial committee review (Phase 2 onwards) and DA review before publication.

Limits — what is not yet covered

  • Incomplete coverage: About 1,200 of 1,788 assemblies have been ingested. Older years are less covered, so apparent time-series trends partly reflect ingestion progress rather than changes in what was said
  • Surface keyword stage: Many analyses sit at the surface aggregation stage; polarity, support/opposition, and topic clustering are being extended in stages
  • No individual attribution: Evaluation of individual councillors is not within scope. Individual scrutiny by journalism is respected as a separate layer

These limits are stated in the methodology section of each article. The lab's operating policy, the editorial committee's bylaws, and the corrections contact will be presented on a separate operations page.
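
One way to keep the incomplete-coverage caveat from distorting a time series is to report, per year, the share of ingested assemblies that mention a term rather than the raw count. The sketch below illustrates that adjustment under stated assumptions: the function name and the counts are invented, and the lab's articles may handle coverage differently.

```python
def coverage_adjusted_share(mentions_by_year, ingested_by_year):
    """Share of ingested assemblies mentioning a term, per year.

    Years with no ingested assemblies are omitted rather than reported as zero.
    """
    return {
        year: mentions_by_year.get(year, 0) / ingested
        for year, ingested in sorted(ingested_by_year.items())
        if ingested > 0
    }


# Invented counts: raw mentions triple, but so does the number of ingested assemblies.
mentions = {2015: 10, 2020: 30}
ingested = {2015: 200, 2020: 600}

adjusted = coverage_adjusted_share(mentions, ingested)
print(adjusted)
```

In this toy case the raw count rises from 10 to 30 while the adjusted share is identical in both years, so the apparent growth is entirely an ingestion effect.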

References

machikarte: Nationwide Local Assembly Speech Search Platform (Beta). Institute for Social Vision Design (ISVD). ISVD.

machikarte (GitHub): schema, aggregation queries, licenses (MIT + CC BY 4.0). Institute for Social Vision Design (ISVD). GitHub.

When Politicians Talk AI: Issue-Frames in Parliamentary Debates Before and After ChatGPT. Suter, V. et al. Policy & Internet.
