Exploring Large Language Models to Improve Grantmaking Data Aggregation in Canada

Research summary

Justification of research project

This project explores how Large Language Models (LLMs) can enhance the aggregation and analysis of data from the T3010, the Registered Charity Information Return filed with the Canada Revenue Agency (CRA). The 2024 Landscape Report by Philanthropic Foundations Canada (PFC) and a subsequent scoping project with Carleton’s MPNL students highlighted significant limitations in the timeliness and accuracy of grantmaking data, which hinder the field’s ability to address pressing questions. For example, aggregating grantmaking flows, a process demonstrated by PFC, required over 80 hours of manual work to produce a single bar chart. T3010 filings, self-reported by foundations, remain Canada’s most comprehensive data source despite delays of 18 to 24 months, a broad and outdated classification system, and an inability to address critical questions about grantmaking behaviour and impact.

Recent regulatory reforms, such as changes to disbursement quotas (DQ) and non-qualifying disbursements (NQD), have aimed to address societal inequalities and disparities in resource allocation within the philanthropic sector. The successful adoption of these reforms, supported by diverse stakeholders and equity goals, has further illuminated gaps in foundational knowledge about grantmaking behaviour. These gaps present substantial obstacles to understanding the effects of current and future policy changes.

At the core of these challenges is the limited utility of the T3010’s broad classificatory framework, which fails to capture the nuance required for meaningful analysis. To confront this problem, this project will employ LLMs and cluster analysis to develop a more granular typology of charitable activities. This enhanced classification system will support the philanthropic sector in answering key regulatory and policy questions, improving data-driven decision-making, and fostering equitable outcomes across society.

Research questions

1. How do the CRA’s existing categorizations compare to those generated using LLM-based methods?

2. Has the increase in the disbursement quota to 5% led to a higher proportion of grants awarded to new grantees, or does it primarily benefit organizations with prior funding histories?

3. What variations exist in how foundations respond to disbursement quota changes across different programmatic categories?

Research purpose

This study has three main objectives:

1. Develop a Detailed Typology: Create a classification system for foundations’ grantmaking activities, leveraging LLMs.

2. Evaluate Policy Impacts: Assess the effects of recent regulatory changes, including disbursement quota (DQ) and non-qualifying disbursement (NQD) adjustments, on grantmaking behaviour and resource distribution patterns.

3. Facilitate Knowledge Mobilization: Disseminate findings and insights through targeted knowledge mobilization activities, such as reports, webinars, and academic or policy papers, to engage stakeholders and inform practice and policy.

Research approach

The project will attempt to estimate the responsiveness of grantmaking to the distribution requirement (the disbursement quota) using a bunching estimator. This estimator exploits mass points in an otherwise smooth distribution to estimate how strongly organizations respond to the threshold. Identifying bunchers (foundations clustered in the neighbourhood of the mass point) helps pinpoint the recipient organizations of the foundations most responsive to the threshold. These recipients can then be compared with samples of “always funded”, “seldom funded” and “never funded” organizations using a propensity score matching estimator.
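To make the bunching idea concrete, the following is a minimal sketch and not the project's actual implementation. It assumes each foundation is summarized by a disbursement ratio (disbursements divided by assets), takes the 5% quota mentioned in the research questions as the threshold, and fits a polynomial to binned counts outside a window around that threshold to approximate the smooth counterfactual distribution. The function name `bunching_excess_mass`, the bin width, the window, the polynomial degree, and the simulated data are all illustrative assumptions.

```python
import numpy as np


def bunching_excess_mass(ratios, threshold=0.05, bin_width=0.0025,
                         window=0.005, lo=0.0, hi=0.10, degree=5):
    """Estimate excess mass near `threshold` in a sample of disbursement ratios.

    All parameter defaults are illustrative assumptions, not project choices.
    Returns (excess, b): the excess count of observations in the window and
    that count relative to the average counterfactual bin height.
    """
    ratios = np.asarray(ratios, dtype=float)

    # Bin the ratios into equal-width bins on [lo, hi].
    n_bins = int(round((hi - lo) / bin_width))
    edges = np.linspace(lo, hi, n_bins + 1)
    counts, _ = np.histogram(ratios, bins=edges)
    centers = (edges[:-1] + edges[1:]) / 2

    # Bins whose centre lies within `window` of the threshold are treated
    # as the bunching region and excluded from the counterfactual fit.
    in_window = np.abs(centers - threshold) <= window

    # Fit a polynomial to the counts outside the window. Distances are
    # measured in bin widths to keep the fit numerically well behaved.
    x = (centers - threshold) / bin_width
    coeffs = np.polyfit(x[~in_window], counts[~in_window], deg=degree)
    counterfactual = np.polyval(coeffs, x)

    # Excess mass: observed minus predicted counts inside the window.
    excess = counts[in_window].sum() - counterfactual[in_window].sum()
    b = excess / counterfactual[in_window].mean()
    return float(excess), float(b)


# Simulated example: a smooth background plus foundations that sit just
# above a hypothetical 5% threshold.
rng = np.random.default_rng(0)
background = rng.uniform(0.0, 0.10, size=20_000)
bunchers = rng.uniform(0.05, 0.052, size=1_000)
excess, b = bunching_excess_mass(np.concatenate([background, bunchers]))
print(f"excess count near threshold: {excess:.0f}, normalized: {b:.2f}")
```

In this sketch, foundations whose ratios fall inside the window are the candidate bunchers. Their grant recipients would then feed the matching comparison described above, which is not shown here.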

Summary creation date: April 2025
Project start: Fall 2024
Project end: Summer 2025

Funding

$5,000 from PhiLab

Supervisor(s)

  • Ross Hickey
    UBC – Okanagan Campus

Researchers

  • Michele Fugiel Gartner
    Lead Researcher
    Philanthropic Foundations Canada

Students

  • Siya Gupta
    UBC