| TL;DR | The defining methodological question of the field: who holds interpretive authority, and how should the analytic labor be divided between human and AI? The frameworks that have emerged since 2022 converge on a principle the empirical studies rarely articulate explicitly — the human must remain interpretive ground truth. How they operationalize this principle varies enormously. |
|---|---|
The core question
“AI-assisted qualitative research” encompasses everything from a researcher using ChatGPT to brainstorm codes (AI as notebook) to a fully automated pipeline that generates grounded theory without human reading (AI as analyst). These are not variations on one practice; they are different practices with different epistemological implications.
The central question across the corpus is: who decides what the data means? The answer determines the validity of the research, the ethics of the method, and the epistemological tradition it belongs to. See epistemology for the full mapping.
The most important structural insight comes from carlsen-ralund-computational-grounded-theory-2022: the division of labor between human and computer should be organized around their respective competences. Computers handle scale — finding enough relevant cases, scaling a validated classifier to a large corpus. Humans provide interpretive ground truth — developing coding schemes, reading extensively, validating categories against direct human coding of a random sample. Any framework that puts the computer in the interpretive role has the division of labor backwards.
The main frameworks
CALM — Computer Assisted Learning and Measurement
Source: carlsen-ralund-computational-grounded-theory-2022
Position: Human is ground truth; computer handles scale and rarity.
The most carefully theorized framework in the corpus. CALM organizes analytic labor across five stages:
| Stage | Task | AI role | Human role |
|---|---|---|---|
| Discovery | Find candidate categories and search terms | HSBM generates candidates | Decides which to pursue |
| Interpretation | Read documents, build coding scheme | Retrieves relevant documents | Reads extensively, writes memos, develops scheme |
| Classification | Apply coding scheme | ML scales classification | Provides training examples |
| Validation | Verify classification quality | Applies to test set | Codes random sample; adjudicates |
| Measurement | Apply to full corpus | Runs classifier | Interprets results |
The key move: AI solves the rarity problem (finding enough cases of rare phenomena to understand them) and the scale problem (applying a validated scheme to thousands of documents). The human solves the interpretation problem (what do these patterns mean within this specific social context?).
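The classify–validate–measure loop can be sketched in miniature. The keyword "classifier," the toy sample, and the acceptance threshold below are illustrative assumptions standing in for a trained model and real data:

```python
# Hedged sketch of CALM's classification -> validation -> measurement
# stages (rows 3-5 of the table above). The marker-based classifier is a
# stand-in for a trained ML model; the data and threshold are toy values.

# Classification: the human-developed coding scheme, expressed as markers.
EXCLUSION_MARKERS = {"excluded", "ignored", "unheard", "sidelined"}

def classify(text):
    """Return 1 if the document shows the 'exclusion' code, else 0."""
    return int(any(m in text.lower() for m in EXCLUSION_MARKERS))

# Validation: compare machine codes with direct human coding of a
# random sample -- the human remains interpretive ground truth.
sample = [("our concerns were ignored again", 1),
          ("the minutes were circulated on time", 0),
          ("we felt excluded from planning", 1),
          ("the agenda was published early", 0)]
human = [h for _, h in sample]
machine = [classify(t) for t, _ in sample]

def cohen_kappa(a, b):
    """Agreement corrected for chance, for binary codes."""
    n = len(a)
    p_obs = sum(x == y for x, y in zip(a, b)) / n
    p_a1, p_b1 = sum(a) / n, sum(b) / n
    p_exp = p_a1 * p_b1 + (1 - p_a1) * (1 - p_b1)
    return (p_obs - p_exp) / (1 - p_exp) if p_exp < 1 else 1.0

kappa = cohen_kappa(human, machine)

# Measurement: only a validated classifier is scaled to the full corpus.
full_corpus = ["doc %d ..." % i for i in range(10_000)]
if kappa >= 0.8:  # acceptance threshold is an illustrative assumption
    corpus_codes = [classify(t) for t in full_corpus]
```

The essential property is the direction of authority: the machine is scaled only after its output agrees with direct human coding of the sample.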
What CALM rejects: CGT’s pattern discovery approach, where computer-led topic modeling is supposed to find the right categories. See computational-grounded-theory.
CAAI — Conversational Analysis to the Power of AI
Source: friese-caai-framework-2026
Position: Replace coding with structured dialogic interaction; AI as hermeneutic partner.
CAAI makes the most radical departure from conventional AI coding studies. The framework:
- Define research questions precisely
- Delineate core concepts and their expected textual markers
- Conduct systematic conceptually-driven dialogic interaction with AI
- Verify and cross-check interpretations with additional data queries
- Integrate interpretations into research output
AI never classifies segments. Instead, it engages in structured dialogue, guided by theoretically informed researcher prompts, aimed at developing understanding rather than producing counts. This is analogous to how a researcher uses literature: not to confirm a coding scheme but to think through what the data means.
The epistemological position: Hermeneutic. Understanding is iterative and dialogic. AI participates in the hermeneutic circle — offering interpretations that the researcher evaluates, challenges, refines. The researcher’s theoretical knowledge and cultural competence govern the interaction throughout.
What CAAI rejects: The coding paradigm entirely. Friese argues that treating qualitative analysis as a classification task misrepresents what qualitative understanding is. The discourse analysis pilot demonstrated that CAAI surfaces richer interpretive terrain than coding permits.
AbductivAI
Source: costa-abductivai-2025
Position: AI as co-researcher in a distributed analytic system; abductive reasoning as the organizing logic.
Built on Actor-Network Theory and distributed cognition, AbductivAI positions AI as one node in a network that includes the researcher, the data, prior theory, and methodological conventions. The AI is not a tool wielded by the researcher; it is an actor with its own “inscribed” norms that shape the analytic process.
Chain-of-Prompting: Structured prompting sequences that move from data-driven observation to theoretical hypothesis to confirmation of the unexpected. The chain replicates abductive reasoning: notice something anomalous → generate an explanation → check it against more data.
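The three-step chain can be sketched as a prompt sequence. The prompt wording and the `ask()` interface are assumptions, not the authors' implementation; in practice `ask()` would wrap a chat-completion call:

```python
# Hedged sketch of an abductive Chain-of-Prompting sequence (AbductivAI).
# The chain mirrors the logic described in the text: anomaly -> explanation
# -> check against more data. Wording and interface are assumptions.
CHAIN = [
    # 1. Data-driven observation: surface the anomalous, not the typical.
    "Here is an excerpt: {excerpt}\n"
    "What in it defies expectation, given {theory}?",
    # 2. Theoretical hypothesis: explain the anomaly.
    "Propose a candidate explanation for this anomaly: {anomaly}",
    # 3. Confirmation of the unexpected: check against more data.
    "Does this explanation hold for these further excerpts? {more_data}\n"
    "Explanation under test: {hypothesis}",
]

def run_chain(ask, excerpt, theory, more_data):
    """Run the chain, feeding each step's output into the next prompt."""
    anomaly = ask(CHAIN[0].format(excerpt=excerpt, theory=theory))
    hypothesis = ask(CHAIN[1].format(anomaly=anomaly))
    verdict = ask(CHAIN[2].format(more_data=more_data, hypothesis=hypothesis))
    return {"anomaly": anomaly, "hypothesis": hypothesis, "verdict": verdict}

# Usage with a stub; a real ask() would call an LLM API.
result = run_chain(lambda p: f"[model reply to: {p[:30]}...]",
                   excerpt="...", theory="role strain", more_data="...")
```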
What makes this distinctive: The focus on the unexpected rather than the consistent. AI is designed to surface what defies expectation, not to confirm what is statistically probable. This directly addresses epistemic-flattening.
GAITA — Guided AI Thematic Analysis
Source: nguyen-trung-gaita-2025
Position: Researcher as reflexive leader throughout; AI as structural support for Template Analysis adaptation.
Four stages adapted from Template Analysis (King, 2012):
- Familiarization — researcher reads data; AI assists with organization and initial summaries
- Initial coding — researcher develops provisional template; AI applies to data segments
- Template refinement — researcher refines codes iteratively based on AI-assisted pattern review
- Theme finalization — researcher finalizes themes; AI supports cross-data verification
ACTOR prompting framework: Anchoring (research question context), Chaining (sequential prompting), Tasking (specific tasks per stage), Organizing (structure requirements), Reflecting (prompting AI to explain reasoning).
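One way to picture ACTOR is as a prompt assembled from its five components. The field names mirror the framework; the template layout itself is an illustrative assumption:

```python
# Hedged sketch of assembling a GAITA-stage prompt from the five ACTOR
# components. The section wording is assumed, not taken from the source.
def actor_prompt(anchoring, chaining, tasking, organizing, reflecting=True):
    parts = [
        f"Research context: {anchoring}",        # Anchoring
        f"Previous step produced: {chaining}",   # Chaining
        f"Your task for this stage: {tasking}",  # Tasking
        f"Format your output as: {organizing}",  # Organizing
    ]
    if reflecting:                               # Reflecting
        parts.append("Explain the reasoning behind each decision you make.")
    return "\n".join(parts)

# Hypothetical usage for the initial-coding stage.
prompt = actor_prompt(
    anchoring="RQ: how do nurses describe moral distress?",
    chaining="provisional template v2 (12 codes)",
    tasking="apply the template to the attached segments",
    organizing="a table of segment ID, code, supporting quote",
)
```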
The position on interpretive authority: The researcher guides every stage. AI is explicitly positioned as implementing the researcher’s analytic vision, not developing its own. Reflexive memos at each stage document how AI shaped the process.
AI-in-the-Loop Analysis
Source: wise-et-al-2026-ai-not-the-enemy
Position: AI incorporated into human-led analytic processes to deepen core qualitative commitments — not for efficiency but for interpretive depth.
The most technically detailed framework in the corpus. Wise et al. explicitly reframe the debate: the problem with most AI-assisted qualitative research is that it uses AI for efficiency and automation, ceding interpretive judgment to the machine. AI-in-the-loop inverts this — computational capabilities are enlisted specifically to help researchers enact the commitments that interpretive qualitative analysis requires but structurally struggles to achieve.
Five core qualitative commitments mapped to LLM properties:
| Commitment | LLM property | Practical application |
|---|---|---|
| Close attention + iterative layering | Attention mechanisms; auto-regression | Multiple passes, deepening embeddings |
| Immersion in data with full context | Long-context (128K–1M tokens) | Entire corpus held in model context simultaneously |
| Context surrounding data interpreted | Large-scale pre-training | Culturally grounded interpretation of implicit meaning |
| Positionality as analytic resource | Auto-regression + temperature | Varied responses; persona prompting for multiple stances |
| Multiple perspectives in dialogue | All of above | Iterative exploration of interpretive alternatives |
Trustworthiness connections: Credibility (full-corpus contextualization), dependability (temporal audit of theme stability), confirmability (systematic surfacing of disconfirming instances), transferability (detailed prompt documentation), authenticity (targeted search for underrepresented voices). A new criterion is proposed: transparency in analytic decision-making — full documentation of model, parameters, and iterative prompting choices.
Technical caveats: RAG (Retrieval-Augmented Generation) is explicitly discouraged; the full corpus must be in the model’s active context, not selectively retrieved. WEIRD bias in training data must be interrogated with the same reflexivity researchers apply to their own positionality.
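The positionality and multiple-perspectives rows of the table above can be sketched as persona prompting over the full in-context corpus. The personas and the `chat()` interface are assumptions, not the authors' implementation:

```python
# Hedged sketch of persona prompting + temperature for multiple analytic
# stances. Per the RAG caveat, the whole corpus goes into the prompt;
# no retrieval step selects excerpts. Personas are illustrative.
PERSONAS = [
    "a critical theorist attentive to power relations",
    "a phenomenologist attentive to lived experience",
    "a grounded theorist attentive to emergent process",
]

def multi_stance_readings(chat, corpus, temperature=0.9):
    """One reading per persona; higher temperature widens variation."""
    readings = {}
    for persona in PERSONAS:
        prompt = (f"Read the following corpus as {persona}. "
                  f"Offer one candidate interpretation.\n\n{corpus}")
        readings[persona] = chat(prompt, temperature=temperature)
    return readings  # the researcher adjudicates between the stances

# Usage with a stub; a real chat() would call a long-context model.
readings = multi_stance_readings(
    lambda prompt, temperature: f"[stance at T={temperature}]",
    corpus="...")
```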
CGT (the cautionary case) — Computational Grounded Theory
Source: nelson-computational-grounded-theory-2020 (original); carlsen-ralund-computational-grounded-theory-2022 (critique)
CGT placed AI in the discovery role: unsupervised topic models find patterns; researchers interpret what the model found. This reversal of the human-AI relationship — computer leads discovery, human interprets computer output — is the paradigm case of what the critical literature argues against.
The structural problem: reading model-selected “paradigmatic” documents does not qualify a researcher to interpret meaning across a community. Paradigmatic cases are only intelligible in relation to the full corpus. The researcher who has not read the data extensively is not positioned to judge whether the model’s topics are real, meaningful, or artifacts of LDA’s mathematical assumptions. See computational-grounded-theory.
NITA — Narrative-Integrated Thematic Analysis
Source: nguyen-trung-nita-2026
Position: Non-coding, narrative-centered approach; PERFECT monitoring framework as the structured reflexive audit trail.
NITA is the second fully non-coding AI-assisted TA framework in the corpus (alongside CAAI). The division of labor is organized not just by task type (mechanical vs. interpretive) but by a temporal architecture — the PERFECT monitoring procedure governs when human and AI engage:
- PER (Purpose, Envision, Realize): Researcher-only space. Identity, positionality, and analytic vision established before any AI contact with data.
- FE (Formulate, Experiment): Human-AI collaborative space. Analyst approach developed with LLM assistance; prompts iterated on sample data.
- CT (Check & Reflect, Tune): Researcher returns as sole evaluator. AI outputs interrogated, distortions identified, approach adjusted.
This temporal structure is a contribution: rather than specifying which tasks AI can perform, PERFECT specifies at what stages AI enters the process. The researcher’s interpretive authority is established and documented before AI contact begins. From this secure foundation, the researcher can genuinely evaluate rather than be anchored by AI output.
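PERFECT's gating logic can be sketched as a simple phase guard. The phase grouping follows the description above; the guard mechanics are an illustrative assumption:

```python
# Hedged sketch of PERFECT's temporal gating (NITA): AI contact with the
# data is permitted only in designated phases. Phase names follow the
# text; the enforcement mechanism is assumed for illustration.
PHASES = ["Purpose", "Envision", "Realize",   # PER: researcher-only
          "Formulate", "Experiment",          # FE: human-AI collaborative
          "Check & Reflect", "Tune"]          # CT: researcher as evaluator
AI_ALLOWED = {"Formulate", "Experiment"}      # AI touches data only here

def ai_may_touch_data(phase):
    """True only in the collaborative FE phases."""
    if phase not in PHASES:
        raise ValueError(f"unknown phase: {phase}")
    return phase in AI_ALLOWED
```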
The dialogic thinking mode: NITA proposes “dialogic thinking” as a new mode in Freeman’s (2016) taxonomy of qualitative thinking modes, alongside categorizing, narrative, visual, and others. In dialogic thinking mode, AI outputs are provocations that the researcher responds to — not evidence to accept or coding to verify. The epistemic unit is the exchange, not the AI output.
What distinguishes NITA from CAAI: CAAI replaces coding with dialogic interaction while still involving the researcher in iterative questioning at every stage. NITA bypasses the interactive stage at theme generation (AI generates candidate themes from the corpus) and shifts researcher involvement to the narrative construction stages that follow. The researcher’s interpretive work happens in building individual and meta-narratives, not in the prompting dialogue.
Reflexive TA with AI (Xu, Wheeler)
Sources: xu-ai-thematic-analysis-2026, wheeler-technological-reflexivity-2026
Rather than proposing a new framework, these papers demonstrate how to practice existing reflexive methods with AI assistance while maintaining the epistemological commitments of those methods.
Xu’s approach: AI as phase-by-phase assistant through Braun & Clarke’s six-phase reflexive TA. Researcher reflexivity documented at every phase. ChatGPT used for data familiarization support, initial theme generation (to be interrogated, not accepted), and cross-data pattern checking. The posthumanist framing positions AI as a thinking tool that extends researcher cognition rather than replacing it.
Wheeler’s technological reflexivity: Developed through empirical comparison of MAXQDA, NVivo, and ChatGPT on 1,300+ climate survey responses. Reflexivity is not just about the researcher’s relationship to data; it includes the researcher’s critical examination of how their tools shape what they see. Prompts are methodological choices. Platform affordances constrain analytic options. Distributed reflexivity: self, tool, context.
The automation spectrum
All frameworks can be positioned on a spectrum from fully human-led to fully AI-led:
Fully human-led ←──────────────────────────────────→ Fully AI-led

From left (fully human-led) to right (fully AI-led):
- Traditional qualitative research
- CAAI; GAITA; Reflexive TA (Xu, Wheeler)
- CALM; AbductivAI
- CGT
- AcademiaOS
Consensus position: The interpretive tasks — developing coding schemes, deciding what themes mean, reflexively examining the analytic process — belong to the left side regardless of how much AI assists with mechanical tasks on the right.
The contested frontier: Whether AI can participate in developing interpretations (CAAI, AbductivAI) or only applying interpretations the human has developed (CALM, GAITA). This is the most active theoretical debate in the 2025–2026 literature.
What makes collaboration legitimate?
Synthesizing across the corpus, six conditions appear repeatedly as necessary (though not always sufficient) for legitimate human-AI collaboration:
- Human scheme development — coding categories and interpretive frameworks are developed by the researcher, not generated autonomously by AI
- Researcher immersion — the researcher has read sufficient data to evaluate AI output from a position of genuine interpretive competence (not just “paradigmatic cases”)
- Direct validation — AI output is validated against human coding of a representative sample, not just correlated with external variables
- Documented prompts — prompts are treated as methodological choices, documented in the audit trail, and reported in the paper
- Reflexive memos — researcher documents how AI shaped the analytic process and where human judgment overrode AI output
- Transparency about AI role — the specific task AI performed, the tool used, and the validation procedure are reported with enough detail for peer scrutiny
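The documentation conditions (documented prompts, reflexive memos, transparency about AI role) suggest a structured audit-trail record. The fields below are an illustrative assumption, not a published schema:

```python
# Hedged sketch of a prompt audit trail covering the documentation
# conditions listed above. Field names are assumptions for illustration.
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone

@dataclass
class PromptRecord:
    stage: str           # analytic stage (e.g. "initial coding")
    model: str           # tool and version used
    prompt: str          # the prompt verbatim -- a methodological choice
    output_summary: str  # what the AI produced
    memo: str            # how the researcher used or overrode the output
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat())

# Hypothetical entry; each AI interaction appends one record.
trail = [PromptRecord(
    stage="initial coding",
    model="gpt-4o-2024-08-06",
    prompt="Apply the attached template to segments 1-40 ...",
    output_summary="38/40 segments coded; 2 flagged ambiguous",
    memo="Overrode code 'resignation' on segment 12: irony missed.",
)]
report = [asdict(r) for r in trail]  # reportable alongside the paper
```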
Failure modes
The corpus documents several patterns of illegitimate collaboration:
- AI-led discovery without human immersion (CGT paradigm) — reading paradigmatic cases does not constitute qualification (carlsen-ralund-computational-grounded-theory-2022)
- Unvalidated AI coding — high κ with prior human coding does not guarantee validity (validity-trustworthiness)
- Quote acceptance without verification — AI fabricates quotes; 58% fabrication rate documented (jowsey-frankenstein-ai-ta-2025)
- Opaque prompting — treating prompts as private workflow rather than methodological choices requiring documentation (wheeler-technological-reflexivity-2026)
- Speed as success criterion — “28× faster” (prescott-ai-thematic-analysis-2024) ignores validation burden and quality trade-offs
- Undisclosed use — dellafiore-et-al-2025-expert-interviews documents a culture of concealment: 13 of 14 expert researchers use AI, but many initially presented themselves as non-users. Shame around disclosure produces a systematic under-reporting problem with governance implications. andrews-progress-or-perish-2026 names this “AI shaming” (Giray 2024) and documents it as cross-disciplinary; the qualitative methods community’s strong normative signaling around AI use may be contributing to the very dynamic it aims to prevent.
Empirical tests of framework typologies
ayik-et-al-2026-human-vs-ai-ta-tools provides the first empirical test of the framework taxonomy: four structurally distinct AI tools (ChatGPT-4o as an unstructured assistant; QInsights as a CAAI implementation; ATLAS.ti AI and MAXQDA AI Assist as QDAS-integrated) compared against validated human TA. Results:
- Tools designed around dialogic principles (QInsights/MAXQDA) produce interpretivist-oriented output; tools with minimal structure (ChatGPT, ATLAS.ti) produce post-positivist output
- Framework choice is partly made at tool selection
- No hallucinations with explicit constraints — hallucination risk is prompt-design-dependent, not tool-inherent
- MAXQDA (50% exact theme match) came closest; all tools diverged substantially from human TA in thematic structure
See also
- qualitative-ai-methods — taxonomy of AI roles
- epistemology — epistemological foundations of the frameworks; includes post-humanist/ANT stance
- validity-trustworthiness — how each framework handles rigor
- computational-grounded-theory — the cautionary case study
- epistemic-flattening — the structural risk of AI-led meaning-making
- prompt-engineering — the craft of collaborative prompting
- contested-claims — active disputes about AI’s legitimate role; Claim 9 covers the categorical rejection debate
- de-paoli-reject-rejection-2026 — ANT-based argument that withdrawal from AI collaboration cedes the domain to computer scientists
- ayik-et-al-2026-human-vs-ai-ta-tools — empirical test of how tool design encodes framework typology; dialogic tools produce interpretivist output, unstructured tools produce post-positivist output
- dellafiore-et-al-2025-expert-interviews — expert practitioner perspectives; concealment culture; technical/interpretive task split confirmed; “illusion of meaning” as risk
- jowsey-et-al-2025-we-reject — the position that no collaboration framework can satisfy Big-Q validity criteria; the strongest challenge to the legitimacy of all frameworks on this page
- greenhalgh-2026-beyond-the-binary — governance framing; argues that the question of legitimate collaboration requires governance structures, not just methodological frameworks
- friese-et-al-beyond-binary-2026 — 100+ signatory response to Jowsey; theoretical grounding for why collaboration can be legitimate; assemblage/distributed cognition/posthumanism/sociomateriality as the philosophical resources for defending the frameworks on this page
- nguyen-trung-nita-2026 — NITA/PERFECT; the temporal architecture of human-AI division of labor; “dialogic thinking mode” as a new contribution to the collaboration framework vocabulary
What links here
- Ayik et al. (2026) — Human vs. AI: Evaluating TA With ChatGPT, QInsights, ATLAS.ti AI, and MAXQDA AI Assist
- Brailas (2025) — AI in Qualitative Research: Beyond Outsourcing Data Analysis to the Machine
- Computational Grounded Theory
- Contested Claims
- Costa et al. (2025) — AI as a Co-researcher in the Qualitative Research Workflow: Transforming Human-AI Collaboration
- Dellafiore et al. (2025) — Artificial Intelligence in Qualitative Research: Insights From Experts via Reflexive Thematic Analysis
- Empirical Findings
- Epistemology — Stances Across the Literature
- Friese (2026) — From Coding to Conversation: A New Methodological Framework for AI-Assisted Qualitative Analysis
- Friese, Nguyen-Trung, Powell & Morgan (2026) — Beyond Binary Positions
- Greenhalgh (2026) — Reflexive Qualitative Research and Generative AI: A Call to Go Beyond the Binary
- AI in Qualitative Research
- Index
- LLMs for Qualitative Research
- Nguyen-Trung & Nguyen (2026) — Narrative-Integrated Thematic Analysis (NITA)
- Qualitative AI Methods — A Living Taxonomy
- Sinha et al. (2024) — The Role of Generative AI in Qualitative Research: GPT-4's Contributions to a Grounded Theory Analysis
- Übellacker (2024) — AcademiaOS: Automating Grounded Theory Development with Large Language Models
- Validity and Trustworthiness
- Wise et al. (2026) — Why AI is Not the Enemy: Trustworthy AI-in-the-Loop Analysis
- Zhang et al. (2025) — Harnessing the Power of AI in Qualitative Research: Exploring, Using and Redesigning ChatGPT