Machikarte Research Lab — Hypotheses and Overview: Reading Assembly Speeches as Observations

Jun 1, 2026

Naoya Yokota

A research lab built on a cross-searchable record of speeches from all 1,788 Japanese local assemblies, designed to read the data as observations rather than verdicts. The subject is structural — distribution, propagation, and silence across municipalities — not individual condemnation. Differentiation rests on data quality assurance, verification scripts, and transparent editorial judgment.

This note sets out the working hypotheses, observation policy, and editorial premises that all articles in the Machikarte Research Lab build upon. It serves as a map before any individual piece.

What problem we are working on

Japanese local assembly minutes have long been published — but in fragmented systems, one municipality at a time, with little capacity for cross-cutting search. Questions raised by councillors and responses from mayors and officials accumulate daily as the language that precedes policy, before it is fixed in ordinances or budget documents.

machikarte (machikarte.isvd.or.jp) is the infrastructure that arranges the minutes of all 1,788 local assemblies in a form that allows cross-cutting citation. On top of that base, this lab takes one stance: read the data as observations. Observations here are inputs for structural reading, not grounds for ranking or judgment.

Rather than "Municipality X is the worst at deferring decisions", the subject of analysis is "How is deferral phrasing distributed across the country?", "Since when has the same wording appeared in different assemblies?", or "On which topics do assemblies remain silent, and how many of them?". This distinction is operationalised as a three-tier rule (see below).

Outline of the lab

Aspect	Content
Data base	machikarte (~126 million speech records)
Subject of publication	Structural analysis, national distribution, time-series patterns, topic-level silence counts
Not published	Evaluative descriptions of individual councillors, worst-of-the-worst naming of individual assemblies
Operator	Institute for Social Vision Design (ISVD)
Review	Editorial committee (Phase 1 in formation) + DA review

Editorial judgment — three tiers for councillor-level data

To balance research utility and neutrality, councillor-level data is published at three distinct granularities.

Municipality-level aggregates: "Across the country, the share of deferral phrasing ranges from 0% at the minimum to 21% at the maximum"
Party-group-level aggregates: "On a given topic, group A speaks roughly twice as often as group B"
Individual verbatim quotation: Always with a direct URL to the original assembly record, and without interpretive or evaluative framing
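As an illustration of the first tier, a municipality-level aggregate could be computed roughly as follows. This is a minimal sketch, not the lab's published query: the record fields (`municipality`, `text`) and the marker phrases in `DEFERRAL_MARKERS` are assumptions made for the example, and plain substring matching stands in for whatever phrase detection the lab actually uses.

```python
from collections import defaultdict
from statistics import median

# Hypothetical marker phrases (roughly "we will consider it" / "we will study it").
# The lab's actual phrase list is not given in this note.
DEFERRAL_MARKERS = ("検討してまいります", "研究してまいります")


def deferral_share_by_municipality(records, markers=DEFERRAL_MARKERS):
    """Return {municipality: share of records containing a deferral marker}."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for rec in records:
        muni = rec["municipality"]
        totals[muni] += 1
        if any(marker in rec["text"] for marker in markers):
            hits[muni] += 1
    return {muni: hits[muni] / totals[muni] for muni in totals}


def distribution_summary(shares):
    """Summarise the distribution without naming any municipality."""
    values = sorted(shares.values())
    return {
        "n": len(values),
        "min": values[0],
        "median": median(values),
        "max": values[-1],
    }


if __name__ == "__main__":
    sample = [
        {"municipality": "A", "text": "検討してまいります。"},
        {"municipality": "A", "text": "来年度から実施します。"},
        {"municipality": "B", "text": "実施します。"},
    ]
    print(distribution_summary(deferral_share_by_municipality(sample)))
```

The summary function deliberately returns only the shape of the distribution (count, minimum, median, maximum), which matches the first-tier wording: the range is reported, and no municipality is ranked or named.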

Individual evaluative descriptions ("councillor A is unmotivated", and similar) are not produced in this lab. Individual scrutiny belongs to journalism as a separate layer; the lab's contribution stops at structural analysis.

Judgment criteria, the complaints-handling protocol, and the procedure for retractions are documented and published. Summaries of editorial committee discussions will also be made public.

Distinguishing what we push and what is pulled

The lab distinguishes between push and pull in handling findings.

Mode	Form	Expected flow
pull	A structural analysis published as a labs/machikarte article	Journalists, researchers, and citizens cite it voluntarily
push	Handing findings directly to a chosen newsroom ("This municipality is the problem")	Not done

Pushing findings to specific media would mean ISVD setting the agenda of journalism from outside, which sits in tension with journalistic ethics. The lab's role stops at placing verifiable structures in the public layer. Even where a joint project shares methodology, primary data, and verification scripts with a partner, the verbal articulation of findings is not part of the deliverable.

The core of differentiation — data quality, verification scripts, transparent editorial judgment

If the National Diet Library, the Ministry of Internal Affairs and Communications, the Digital Agency, and others advance official open-data publication of assembly proceedings, the differentiator of "owning the data" may structurally erode. What the lab can keep as a long-term distinction lies in the process of preparing the data into a trustworthy form and in making that judgment externally verifiable.

Three investments anchor this position:

Data quality assurance: Public quality gates for deduplication, entity resolution, and municipality-name verification
Verification scripts: Aggregation queries and reproducible code published on GitHub so that third parties can reach the same conclusions
Transparent editorial judgment: Public documentation of the criteria for avoiding individual condemnation, the retraction protocol, and the editorial committee's operations
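To make the idea of a public quality gate concrete, the sketch below shows one possible shape for the deduplication and municipality-name checks. It is an assumption-laden illustration, not the lab's pipeline: the field names (`municipality_code`, `date`, `speaker`, `text`), the `known_codes` set, and the choice of NFKC normalisation plus a SHA-256 key are all hypothetical. Entity resolution is not covered here.

```python
import hashlib
import unicodedata


def normalize(text):
    """Fold full-width/half-width variants (NFKC) and collapse whitespace."""
    return " ".join(unicodedata.normalize("NFKC", text).split())


def record_key(rec):
    """Stable key for exact-duplicate detection after normalisation."""
    payload = "\x1f".join(
        [rec["municipality_code"], rec["date"], rec["speaker"], normalize(rec["text"])]
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def quality_gate(records, known_codes):
    """Drop unknown municipalities and duplicates; return (kept, report)."""
    seen = set()
    kept = []
    report = {"duplicates": 0, "unknown_municipality": 0}
    for rec in records:
        if rec["municipality_code"] not in known_codes:
            report["unknown_municipality"] += 1
            continue
        key = record_key(rec)
        if key in seen:
            report["duplicates"] += 1
            continue
        seen.add(key)
        kept.append(rec)
    return kept, report


if __name__ == "__main__":
    sample = [
        {"municipality_code": "M001", "date": "2024-03-01", "speaker": "X", "text": "ＡＩ　活用"},
        {"municipality_code": "M001", "date": "2024-03-01", "speaker": "X", "text": "AI 活用"},
        {"municipality_code": "M999", "date": "2024-03-01", "speaker": "Y", "text": "test"},
    ]
    kept, report = quality_gate(sample, known_codes={"M001"})
    print(len(kept), report)
```

Returning a report alongside the cleaned records is the point of the design: the counts of what was dropped, and why, are what a third party would need in order to check the gate rather than trust it.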

This direction connects with an earlier observation that "the real substance is in proof systems, quality, and grasp of reality".

Scope of themes

Published and forthcoming articles point in the following directions:

Time-series analysis: When a given word first appears in assemblies and how it spreads (first article: "When did 'AI' and 'Generative AI' first appear in Japanese local assembly speeches")
Structural analysis: Reading national distributions of response phrasing and the allocation of debate (flagship: distribution of response phrasing, gaps in childcare debate)
Structuring silence: Making visible the topics on which debate fails to take place
Policy propagation: How the same policy framing travels between municipalities
Cross-lab citation: Connection with other labs (agnotology, public-asset-ppp, and others)

Each article follows the MUST principle — "the subject is structural; do not condemn the individual" — and passes editorial committee review (Phase 2 onwards) and DA review before publication.

Limits — what is not yet covered

Incomplete coverage: About 1,200 of 1,788 assemblies have been ingested. Older years are less covered, so apparent time-series trends partly reflect ingestion progress rather than real change in what assemblies discussed
Surface keyword stage: Many analyses are still surface-level keyword aggregations; polarity, support/opposition, and topic clustering are being added in stages
No individual attribution: Evaluation of individual councillors is not within scope. Individual scrutiny by journalism is respected as a separate layer

These limits are stated in the methodology section of each article. The lab's operating policy, the editorial committee's bylaws, and the corrections contact will be presented on a separate operations page.

Related Research Notes

This section organizes 10 research notes by type.

Time Series Trend Analysis (trends-)

Case Studies (case-)

National Distribution of Deferral Phrasing in Assembly Responses — A Structural Analysis of 18.97 Million Records from 870 Municipalities

Literature Maps (literature-map-)

Literature Map: A Lineage of Local Assembly Speech Data Research — Centered on the Work of Haruka Watanabe, Yasutomo Kimura, and Kenjiro Higashi

Regional Distribution Analysis

References

machikarte — Nationwide Local Assembly Speech Search Platform (Beta) — Institute for Social Vision Design (ISVD). ISVD

machikarte (GitHub) — schema, aggregation queries, licenses (MIT + CC BY 4.0) — Institute for Social Vision Design (ISVD). GitHub

When Politicians Talk AI: Issue-Frames in Parliamentary Debates Before and After ChatGPT — Suter, V. et al. Policy & Internet
