Institute for Social Vision Design

What Are the Seven Axes of SDI (Social Design Index)? — A Guide to Scoring Logic and Interpretation

Naoya Yokota

ISVD's Social Design Index (SDI) is a diagnostic tool that scores social design projects across seven axes. This article walks through what each axis means, how scoring works, and how to read benchmark cases. SDI is not a ranking tool — it is a frame for visualizing the structural strengths and weaknesses of a project.

TL;DR

  1. SDI scores social design projects across seven axes — structure, causality, impact, feasibility, reproducibility, cocreation, and learning.
  2. Scoring is performed by Claude Sonnet on a 1.0 to 5.0 scale (in 0.5 increments), with grades A through D assigned from an equal-weight average.
  3. The seven axes are not meant to be read in isolation. The intended use is to read them in 2D matrices such as impact × feasibility or structure × cocreation, alongside benchmark cases.

What Is Happening

The common misreading of SDI as a ranking tool, and the gap with its actual design intent.

The Social Design Index (SDI) published by ISVD is often met with questions like these.

"What total score qualifies as a top-tier project?" "How do we rank against other NPOs?" "What does it take to score full marks across all seven axes?"

All three of these questions read SDI as a ranking tool. But that is not the design intent of SDI. SDI is designed to make the structural strengths and weaknesses of one's own project visible from multiple angles, through deliberately distinct axes.

The design is not built to be maximized across all seven axes. On the contrary, by deliberately including axes that tend to be in tension with each other (impact and feasibility, structure and cocreation), SDI is intended to surface the trade-offs that any project actually faces.

Background & Context

SDI design background and its relation to existing frames (logic model, theory of change, SROI).

The seven axes

The seven axes of SDI are designed as follows.

Axis: What it evaluates
structure: Whether the causal chain from problem to activity to outcome is logically organized
causality: Whether the mechanism through which activity produces change is articulated for third parties to follow
impact: Whether the project envisions reach beyond the individual or the local, to institutions, social norms, or market structures
feasibility: Whether the structure, resources, and timeline required for execution are realistically in place
reproducibility: Whether context-dependent and context-independent factors are separated for use in other regions or organizations
cocreation: Whether the design actively engages stakeholders as agents rather than recipients
learning: Whether mechanisms exist to absorb feedback and counter-evidence into self-renewal

Scoring is performed by Claude Sonnet (a generative AI model) on a 1.0 to 5.0 scale (in 0.5 increments), based on the project description provided. The total score is computed as an equal-weight average, with grades of A (4.0 and above), B (3.0 and above), C (2.0 and above), and D (below 2.0).
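The arithmetic described above can be sketched in a few lines of Python. This is a minimal illustration, not ISVD's implementation: the function names and the example scores are hypothetical, and only the scale, the equal-weight average, and the grade thresholds come from this article.

```python
from statistics import mean

AXES = (
    "structure", "causality", "impact", "feasibility",
    "reproducibility", "cocreation", "learning",
)


def validate(scores: dict[str, float]) -> None:
    """Check that all seven axes are present and on the 1.0-5.0 half-point scale."""
    if set(scores) != set(AXES):
        raise ValueError("scores must cover exactly the seven axes")
    for axis, value in scores.items():
        # A valid score doubled is a whole number (1.0, 1.5, ..., 5.0).
        if not 1.0 <= value <= 5.0 or (value * 2) % 1 != 0:
            raise ValueError(f"{axis}: {value} is not on the 1.0-5.0 half-point scale")


def total_score(scores: dict[str, float]) -> float:
    """Equal-weight average across the seven axes."""
    validate(scores)
    return mean(scores[axis] for axis in AXES)


def grade(average: float) -> str:
    """Grade thresholds as stated in the article."""
    if average >= 4.0:
        return "A"
    if average >= 3.0:
        return "B"
    if average >= 2.0:
        return "C"
    return "D"


# Hypothetical scores for illustration only; not taken from any benchmark case.
example = {
    "structure": 4.0, "causality": 3.5, "impact": 4.5, "feasibility": 3.0,
    "reproducibility": 3.0, "cocreation": 2.5, "learning": 3.5,
}
average = total_score(example)
print(f"average {average:.2f}, grade {grade(average)}")
```

With these made-up scores the axis total is 24.0, so the average is about 3.43 and the grade is B.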

LLM-based scoring is an auxiliary means of quickly evaluating the logical coherence and completeness of a project description. It does not replace empirical measurement of social change or third-party evaluation. The intended use is in combination with field research, stakeholder interviews, and effect verification.

Position relative to existing evaluation frames

A number of frames for evaluating social impact have been proposed over the years. Representative examples include the logic model, theory of change, SROI (Social Return on Investment), IRIS+, and the OECD DAC Evaluation Criteria.

SDI does not stand in opposition to these. It is positioned as complementary. The logic model is a format for structuring inputs, activities, outputs, and outcomes — SDI's structure axis scores the quality of that design. Theory of change is a narrative explaining why something works — overlapping with SDI's causality axis. SROI and IRIS+ are indicator systems for measuring and reporting social outcomes — informing SDI's impact axis.

What SDI emphasizes that is distinctive is the pair of cocreation and learning. While many existing frames center on "design and measurement," SDI brings "the agency that sustains the design" and "the mechanism that updates the design" into the scoring. This reflects ISVD's methodological premise that social design is not a fixed blueprint but a process that emerges through collaboration with relevant actors and ongoing learning.

Collective Impact, proposed by Kania and Kramer in 2011, places five conditions at its core: a common agenda, shared measurement systems, mutually reinforcing activities, continuous communication, and a backbone support organization. SDI's cocreation axis takes Collective Impact's emphasis on stakeholders' active collaboration and embeds it explicitly as a scoring dimension — assessing the quality of design at the level of relationships.

Fig: Five Conditions of Collective Impact. Hub structure centered on the Backbone Organization (the intermediary organization coordinating the whole), which coordinates four conditions: Common Agenda (all stakeholders share the same problem definition), Shared Measurement (measuring outcomes with the same indicators), Mutually Reinforcing Activities (each organization plays a complementary role), and Continuous Communication (regular information sharing and trust building).

Why equal-weight averaging

The decision to use an equal-weight average across the seven axes often draws the comment that "weights should vary by industry or phase." For instance, feasibility may matter most in the early stages of a startup, while impact may matter most at maturity.

This comment has merit. At the same time, introducing use-case-specific weighting on the scoring side would make the result "a number that depends on the weighting" and erode the transparency of the frame. SDI currently leaves the weighting to the interpreter. The total score is treated as a reference value, and the primary objects of reading are the per-axis scores and the relative positions in 2D matrices.
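Leaving the weighting to the interpreter can be pictured as follows. This is a sketch under stated assumptions: SDI itself publishes only the equal-weight average, and the `weighted_score` helper, the weights, and the example scores here are all hypothetical.

```python
AXES = (
    "structure", "causality", "impact", "feasibility",
    "reproducibility", "cocreation", "learning",
)


def weighted_score(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Interpreter-side weighted average; axes without an explicit weight count as 1.0."""
    applied = {axis: weights.get(axis, 1.0) for axis in AXES}
    total_weight = sum(applied.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(scores[axis] * applied[axis] for axis in AXES) / total_weight


# Hypothetical scores for illustration only.
example = {
    "structure": 4.0, "causality": 3.5, "impact": 4.5, "feasibility": 3.0,
    "reproducibility": 3.0, "cocreation": 2.5, "learning": 3.5,
}

# With no overrides the result equals SDI's equal-weight average.
print(round(weighted_score(example, {}), 2))

# A reader who cares most about early-stage execution doubles feasibility.
print(round(weighted_score(example, {"feasibility": 2.0}), 2))
```

The same seven scores give a different total under each weighting, which is why the article treats the total as a reference value and the per-axis scores as the primary reading.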

Reading the Structure

Each of the seven axes explained, axis combinations, and typical patterns in the benchmark cases.

Detail of each axis and common misreadings

For each of the seven axes, the scoring perspective and a common misreading are summarized below.

structure. Looks at whether the causal chain from problem to activity to outcome is logically organized. A common misreading is "as long as there is a logic model, the score is full." The existence of a logic model is not itself the object of scoring. What matters is logical coherence, definitional clarity, and the absence of missing links.

causality. Looks at whether the mechanism through which activity produces change is articulated for third parties to follow. Confusing correlation with causation is a common reason for losing points. Stating "change was observed after the intervention" alone does not lift causality. The psychological, social, or economic mechanism by which the intervention produces the change must be explained.

impact. Looks at whether the project envisions reach beyond the individual or the local. Scale alone is not the direct scoring criterion. A project that changes the lives of 100 people can score highly on impact if that change carries downstream reach into institutional design or social norms.

feasibility. Looks at whether the structure, resources, and timeline required for execution are realistically in place. The statement "we can implement it if we have funding" is insufficient. What is in question is the realism of the execution structure (key people, organizational capacity, external collaborators) and the workflow.

reproducibility. Looks at whether barriers to deploying the project in other regions or organizations have been considered. Manualization is a positive factor but not sufficient for full marks. What matters is whether context-dependent factors (conditions that hold only in that region) and context-independent factors (elements applicable anywhere) have been separated, with the latter made explicit.

cocreation. Looks at whether stakeholders are actively engaged as agents. The frequency of meetings or briefings does not necessarily mean cocreation is high. What matters is whether those involved or affected participate in the design process from the early stages and influence decisions.

learning. Looks at whether mechanisms exist to absorb feedback and counter-evidence into self-renewal. The phrase "we run a PDCA cycle" alone is insufficient. The accumulation of failure logs and counter-evidence, and the path by which they feed into organizational decisions, must be explicit.

Reading in 2D matrices

Reading the seven axes in isolation is less useful than reading them in pairs. Combining two axes brings out the relative position of a project and the trade-offs it faces. Some representative combinations follow.

impact × feasibility. The matrix of social reach and feasibility. A region with high impact and low feasibility risks becoming a "pipe dream"; a region with low impact and high feasibility settles into "mere improvement." How to connect the two is the core question of the design.

structure × cocreation. The matrix of logical rigor and stakeholder engagement. A region with high structure and low cocreation becomes "design on paper"; a region with low structure and high cocreation drifts into "movement-ism." The balance between logic and relationships is in question.

reproducibility × causality. The matrix of reproducibility and causal clarity. A region with high reproducibility and low causality becomes "serendipity" (it works for unknown reasons); a region with low reproducibility and high causality becomes "context-dependent" (the reasons are known but cannot travel).
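The three matrices above can be read mechanically with a small lookup. This is only a sketch: the article does not define where "low" ends and "high" begins, so the 3.5 cut-off is an assumption, and the function name and example scores are hypothetical. Only the pattern labels come from the text.

```python
from typing import Optional

# Hypothetical cut-off between "low" and "high"; the article does not define one.
HIGH_THRESHOLD = 3.5

# For each axis pair: (label when the first axis is high and the second is low,
#                      label when the first axis is low and the second is high).
PATTERNS = {
    ("impact", "feasibility"): ("pipe dream", "mere improvement"),
    ("structure", "cocreation"): ("design on paper", "movement-ism"),
    ("reproducibility", "causality"): ("serendipity", "context-dependent"),
}


def read_matrix(
    scores: dict[str, float],
    first: str,
    second: str,
    threshold: float = HIGH_THRESHOLD,
) -> Optional[str]:
    """Return the named risk pattern for an axis pair, or None if none applies."""
    first_high_label, second_high_label = PATTERNS[(first, second)]
    first_high = scores[first] >= threshold
    second_high = scores[second] >= threshold
    if first_high and not second_high:
        return first_high_label
    if second_high and not first_high:
        return second_high_label
    return None  # both high or both low: the article names no pattern here


# Hypothetical scores for illustration only.
example = {
    "impact": 4.5, "feasibility": 3.0,
    "structure": 4.0, "cocreation": 2.5,
    "reproducibility": 3.0, "causality": 3.5,
}
for first, second in PATTERNS:
    print(f"{first} x {second}: {read_matrix(example, first, second)}")
```

Moving the threshold changes which quadrant a project lands in, so the output is best treated as a prompt for discussion rather than a classification.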

Typical patterns from the benchmark cases

On the SDI diagnostic page, five benchmark cases are displayed for comparison — four research labs within ISVD plus the institute's own self-evaluation. At the time of writing, all benchmark cases are internal ISVD cases, and their scores are initial values assigned by the ISVD editorial team. Comparisons with external projects will be updated as case accumulation progresses.

Quiet City Project (ISVD-LAB-001, average 3.79, grade B). Four research hypotheses and a Phase 0-3 roadmap lift structure. Because the primary subject is people with sensory sensitivities, impact sits in the middle range. A typical example of "deepening through narrowing the subject."

Agnotology Research Lab (ISVD-LAB-002, average 4.00, grade A). The theme of epistemic injustice has a reach that extends across society, lifting impact. As a research-driven project, however, the engagement of those affected as agents is limited, leaving room on cocreation. A typical example of where "depth of research" and "development into practice" must be balanced.

Social Design Foundations Research Lab (ISVD-LAB-003, average 3.64, grade B). The mapping of six academic fields and the design for citation network analysis lift structure. As meta-research (research on research), reproducibility is limited by nature. How "meta-level contribution" is measured remains an open question for SDI itself.

Public Asset Utilization Research Lab (ISVD-LAB-005, average 4.21, grade A). The design of making the gap between the existence of institutions and their actual functioning visible lifts causality, and the reach across Japan's PPP/PFI institutional landscape lifts impact. Cocreation remains as a future direction.

ISVD (the institute itself, average 3.86, grade B). A meta-case of self-evaluating the organization. Mechanisms for continuous improvement (the memory system and the hook system) lift learning, and the long-term goal of establishing the field of social design carries impact. Causality is still in the hypothesis stage; how to demonstrate the causal link between organizational activity and social change is a future task.
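The averages and grades listed above can be cross-checked against the scoring rules. Because each axis is scored in 0.5 increments, the sum of the seven axis scores must be a multiple of 0.5, so each published average implies one such sum. The sketch below recovers that sum and confirms the grade; the per-axis scores are not given in this article and are not reconstructed here, and the helper names are hypothetical.

```python
N_AXES = 7

# Averages and grades as listed in this article.
PUBLISHED = {
    "Quiet City Project": (3.79, "B"),
    "Agnotology Research Lab": (4.00, "A"),
    "Social Design Foundations Research Lab": (3.64, "B"),
    "Public Asset Utilization Research Lab": (4.21, "A"),
    "ISVD": (3.86, "B"),
}


def grade(average: float) -> str:
    """Grade thresholds as stated in the article."""
    if average >= 4.0:
        return "A"
    if average >= 3.0:
        return "B"
    if average >= 2.0:
        return "C"
    return "D"


def implied_axis_sum(average: float) -> float:
    """Nearest multiple of 0.5 to average * 7, i.e. the implied total of axis scores."""
    return round(average * N_AXES * 2) / 2


for name, (average, published_grade) in PUBLISHED.items():
    axis_sum = implied_axis_sum(average)
    # The implied sum must round back to the published two-decimal average.
    consistent = round(axis_sum / N_AXES, 2) == average
    print(f"{name}: sum {axis_sum}, grade {grade(average)}, consistent={consistent}")
```

For example, the Quiet City Project's 3.79 corresponds to an axis total of 26.5 (26.5 / 7 ≈ 3.786), which falls under grade B.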

How to relate to SDI

An SDI score is a starting point for dialogue, not a final evaluation.

When a low score appears on an axis, the recommended posture is not to immediately treat it as "a weakness to be fixed." It can first be received as a characteristic that the project structurally carries. For example, "academic research tends to be low on cocreation" or "meta-research tends to be low on reproducibility" — these derive from the nature of the project and are not necessarily targets for improvement.

Conversely, where a score is high on an axis, it is worth examining whether the result reflects the actual situation or only the persuasive quality of the description. Because SDI scores what is written, it carries a structural dependence on how well the project is described. Discrepancies between description and reality need to be supplemented by other forms of evaluation (field research, hearings with those affected, effect verification).

Using SDI well means not using the score to improve the design, but using the score to raise the quality of dialogue about the design.

Further Reading

For deeper understanding of social impact evaluation and the theoretical background of SDI, the following books are recommended.

『社会的インパクトとは何か』 (What is Social Impact? A Guide to Investment, Evaluation, and Business Strategy for Social Change) by Marc J. Epstein (Eiji Press, Japanese edition) comprehensively organizes the international standard frameworks for measuring, evaluating, and investing in social impact. It serves as a reference point for positioning the relationship between SDI and existing tools such as the logic model, theory of change, and SROI.

『インパクト評価と社会イノベーション』 — Impact Evaluation and Social Innovation: How to Visualize the Outcomes of Social Enterprises in the SDGs Era edited by Ichiro Tsukamoto and Masao Seki (Daiichi Hoki) is a collection focused on the evaluation practice of social enterprises in Japan. With on-the-ground case studies, it discusses the selection of evaluation indicators in the SDGs era, co-creation with stakeholders, and the practical use of evaluation results — informative when designing the operational aspects of SDI.

References

W.K. Kellogg Foundation. Logic Model Development Guide. W.K. Kellogg Foundation.

Kania, J. & Kramer, M. Collective Impact. Stanford Social Innovation Review, Winter 2011.

Sanders, E. B.-N. & Stappers, P. J. Co-creation and the New Landscapes of Design. CoDesign, 4(1), 5-18.

Argyris, C. Double Loop Learning in Organizations. Harvard Business Review, 55(5), 115-125.

OECD. Better Criteria for Better Evaluation: Revised Evaluation Criteria Definitions and Principles for Use. OECD DAC Network on Development Evaluation.

Questions to Reflect On

  1. Looking at your project's scores, would it be more valuable to strengthen the weakest axis or to further extend the strongest one?
  2. When does a "high social impact but low feasibility" project carry value, and when does a "high feasibility but limited impact" project carry value?
  3. If use-case-specific weighting were introduced instead of equal-weight averaging, what would be gained and what would be lost?

Key Terms in This Article

Theory of Change (ToC)
A planning method that works backward from long-term social change goals to specify necessary intermediate outcomes and causal pathways of intervention.
Logic Model
A framework that visually maps the causal relationships from inputs to activities, outputs, and outcomes of a program.
