Source
url: https://doi.org/10.1177/16094069261425173
raw: raw/xu-2026-doing-thematic-analysis-in-the-age-of-generative-ai-practices-ethics-and-reflexivity.pdf

TL;DR: A rigorous worked example of Big Q reflexive thematic analysis conducted side-by-side with ChatGPT across Braun & Clarke’s six phases. Xu demonstrates that AI can support interpretively rich TA without collapsing it into positivist coding — but only through sustained researcher reflexivity at every step, not passive delegation.

Problem

The methodological literature on AI-assisted qualitative research has a bifurcation problem. On one side: comparative studies that benchmark AI against human coders on reliability metrics (κ, Jaccard), operating within a small q framework where reliability is the goal. On the other: epistemological critiques that warn against using AI in Big Q research without providing concrete alternatives. What is missing is a worked example — one that shows, phase by phase, how a researcher committed to reflexive TA actually does this with ChatGPT in practice.
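To make the small q benchmarking framing concrete, here is a minimal sketch of the two agreement metrics such comparative studies typically report, Cohen's κ and Jaccard similarity. The functions and example labels are illustrative, not drawn from any specific study in this literature.

```python
from collections import Counter

def cohens_kappa(a, b):
    """Cohen's kappa: chance-corrected agreement between two raters
    labeling the same items."""
    assert len(a) == len(b)
    n = len(a)
    p_obs = sum(x == y for x, y in zip(a, b)) / n
    ca, cb = Counter(a), Counter(b)
    p_exp = sum(ca[k] * cb[k] for k in set(a) | set(b)) / (n * n)
    return (p_obs - p_exp) / (1 - p_exp)

def jaccard(set_a, set_b):
    """Jaccard similarity: overlap between two sets of applied codes."""
    return len(set_a & set_b) / len(set_a | set_b)

# Hypothetical: a human coder and an AI coder label the same 8 segments
human = ["support", "support", "barrier", "barrier",
         "support", "barrier", "support", "support"]
ai    = ["support", "support", "barrier", "support",
         "support", "barrier", "support", "barrier"]

print(round(cohens_kappa(human, ai), 3))  # → 0.467
print(jaccard({"support", "barrier", "identity"},
              {"support", "barrier"}))     # → 0.6666666666666666
```

The point of the sketch is what these numbers cannot capture: κ and Jaccard quantify convergence on labels, which is the goal only under a small q framework where reliability defines quality. Big Q reflexive TA has no analogous scalar.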

Xu’s prior 2020 article (International Journal of Qualitative Methods) provided an end-to-end manual TA template. This paper is the GenAI sequel: same research context (learning experiences of disadvantaged students in an Australian primary classroom), same Braun & Clarke framework, new partner in the analysis. The question is not whether AI can match human coders, but how researcher and AI can work together without sacrificing what Big Q reflexive TA is actually for.

A secondary problem: much existing AI-TA literature misunderstands TA as a monolithic method. Small q (positivist, reliability-oriented) and Big Q (reflexive, interpretivist) TA require fundamentally different approaches to AI integration. Xu insists on this distinction and designs the worked example accordingly.

Approach

The paper follows Braun & Clarke’s six phases explicitly, showing AI contributions and researcher responses at each:

Phase 1: Familiarization. ChatGPT assists with audio transcription. Xu emphasizes that familiarization cannot be outsourced — the researcher must still read closely, even if AI generates the transcript. The AI handles labor; the researcher handles attunement.

Phase 2: Generating initial codes. AI produces first-pass codes from data segments. Xu uses these as candidate prompts for researcher reflection, not as finished outputs. The analytic record includes not just AI codes but researcher annotations interrogating each one.

Phase 3: Searching for themes. AI proposes candidate themes across code clusters. Xu’s critical contribution here: she documents specific moments where ChatGPT’s theme proposals reflected culturally generic framings that obscured the community-specific experiences of the students. These are not discarded as errors — they become reflexive data about algorithmic bias.

Phase 4: Reviewing themes. The researcher systematically challenges AI proposals against data extracts. Where AI has fused distinct experiences into a single theme, Xu splits them. Where AI has missed a latent pattern, she adds it.

Phases 5–6: Defining, naming, and writing up. AI assists with proofreading and structural suggestions. Final interpretive authority remains with the researcher.

Throughout, reflexive exercises are integrated into each phase — not as a separate “limitations” section, but as constitutive of the analytic process.

AI’s Role

AI is positioned as a reflexive partner under researcher authority — it contributes to speed and pattern-surfacing, but every contribution is subject to interrogation. The paper operationalizes what “critical engagement” with AI output actually means: not reading outputs skeptically in the abstract, but documenting specific moments where AI proposals are accepted, challenged, revised, or rejected, with reasons.
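The accept/challenge/revise/reject documentation described above can be pictured as a structured audit log. This is a hypothetical sketch of what one entry might record, not a data structure from Xu's paper; all names (`AuditEntry`, `Decision`, field names) are invented for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class Decision(Enum):
    ACCEPTED = "accepted"
    CHALLENGED = "challenged"
    REVISED = "revised"
    REJECTED = "rejected"

@dataclass
class AuditEntry:
    phase: int                      # Braun & Clarke phase (1-6)
    ai_proposal: str                # the AI output, recorded verbatim
    decision: Decision              # what the researcher did with it
    reason: str                     # the reflexive rationale, in full
    revision: Optional[str] = None  # replacement text, if revised

# Illustrative entry (invented content, echoing the kind of event
# the paper documents in Phase 3):
log = [
    AuditEntry(
        phase=3,
        ai_proposal="Theme: 'Universal challenges of learning'",
        decision=Decision.REVISED,
        reason=("Culturally generic framing obscures the "
                "community-specific experiences of the students"),
        revision="Theme: 'Learning from the margins of the classroom'",
    )
]

print(log[0].decision.value)  # → revised
```

The design point is that the `reason` field is mandatory while `revision` is optional: every AI contribution gets an interrogation on the record, whether or not it is changed.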

Crucially, Xu frames AI through a posthumanist lens: not as a tool external to the researcher, but as a non-human actor entangled in the analytic process. This framing has methodological implications — if AI shapes what gets noticed, that shaping is itself part of the research account and must be documented.

Epistemological Stance

Explicitly Big Q, non-positivist, reflexive. Xu is working in the tradition of Braun & Clarke’s reflexive TA, where the researcher’s positionality is not a source of bias to be controlled but a constitutive resource. She explicitly flags her insider positionality as a researcher working with minority populations, and shows how this positionality shapes her responses to AI’s flatter proposals.

The paper’s epistemological contribution is demonstrating that Big Q TA with AI is coherent — not despite the reflexive commitment but through it. The distinction from small q is not just philosophical: it changes every methodological decision in the paper.

Rigor and Trustworthiness

Trustworthiness here is established through methodological transparency: the paper shows prompts, AI outputs, and researcher responses at each phase. A reader can trace every interpretive move. This is not the same as intercoder reliability — it is closer to an audit trail, the qualitative standard appropriate to reflexive TA.

The algorithmic bias mitigation strategies are documented in enough detail to be replicated or adapted. The cases where ChatGPT produced culturally generic themes are treated as analytical events, not embarrassments — this is the kind of methodological honesty the corpus often lacks.

Limitations

The study is a practitioner inquiry with a relatively small corpus (one classroom), which limits the generalizability of the efficiency claims. For large corpora — the use case where AI assistance is most compelling — the degree of close reading required for genuine reflexive engagement may recreate the same time pressures the technology promises to relieve.

The paper does not address model differences (ChatGPT-4 vs. earlier versions, or other LLMs) or reproducibility. Because the prompts and AI outputs are ChatGPT-specific and session-specific, the worked example documents one path through the analysis, not a reproducible protocol.

The posthumanist framing, while epistemologically sophisticated, is briefly sketched rather than fully developed. Researchers unfamiliar with this tradition may find it underdeveloped relative to its implications.

Connections