Customizing the Mistral 7B large language model for qualitative research: A feasibility study
Keywords:
Automated Analysis, Mistral 7B, LLMs, Pragmatics, Gricean Maxims

Abstract
In qualitative linguistic research, particularly within the domain of discourse analysis, the manual identification of pragmatic features such as Grice's conversational maxims can be time-consuming and cognitively demanding. This feasibility study investigates the potential of the Mistral 7B large language model (LLM) to support such analysis by automating the classification of the four Gricean maxims (Quantity, Quality, Relevance, and Manner) and the identification of corresponding illocutionary acts in Instagram captions. A dataset comprising 88 bilingual captions (primarily English, with several in Indonesian) from Samsung Indonesia's official Instagram account was used. The model was prompted to analyze each caption, score the observance of the four maxims, assign an illocutionary act type, and justify its classifications. The outputs were compared to a previously published human-coded analysis. Results showed that Mistral produced accurate classifications for most captions, particularly in identifying directive and informative acts, and provided plausible justifications. However, the model displayed a bias toward higher maxim observance scores (3 and 4), showing reluctance to assign lower ratings such as "barely observed" or "not observed," which human coders used more readily. Mistral also failed to parse a syntactically complex caption, indicating limitations in handling mixed or informal structures. Overall, the findings highlight Mistral's potential as a fast, accessible tool for supporting qualitative linguistic inquiry, especially in large-scale or exploratory settings. While its accuracy and interpretive depth require refinement, Mistral offers a promising starting point for integrating AI into pragmatic analysis workflows. Further development in prompt design and model calibration is recommended.
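The per-caption workflow described above (prompt the model, score each maxim, assign an illocutionary act, and justify) can be sketched as a minimal Python pipeline. The prompt wording, the 1-4 score labels, and the JSON response format are assumptions for illustration, not the paper's exact protocol; the actual call to Mistral 7B is left out, and only prompt construction and response parsing are shown.

```python
import json
import re

MAXIMS = ["Quantity", "Quality", "Relevance", "Manner"]

def build_prompt(caption: str) -> str:
    """Compose a classification prompt for a single Instagram caption.

    The instructions and output schema here are illustrative assumptions,
    not the study's exact prompt.
    """
    return (
        "Analyze the following Instagram caption.\n"
        f"Caption: {caption}\n"
        "For each Gricean maxim (Quantity, Quality, Relevance, Manner), "
        "rate observance on a 1-4 scale (1 = not observed, 4 = fully "
        "observed), name the illocutionary act, and justify your ratings.\n"
        "Reply as JSON with keys: scores, act, justification."
    )

def parse_response(reply: str) -> dict:
    """Extract and validate the JSON object from a model reply.

    LLM replies often wrap the JSON in extra prose, so we grab the
    outermost brace-delimited span before parsing.
    """
    match = re.search(r"\{.*\}", reply, re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in model reply")
    data = json.loads(match.group(0))
    # Basic checks mirroring the coding scheme: all four maxims scored, 1-4.
    assert set(data["scores"]) == set(MAXIMS)
    assert all(1 <= s <= 4 for s in data["scores"].values())
    return data
```

In practice, `build_prompt` would feed each caption to a locally hosted Mistral 7B instance, and `parse_response` would turn its reply into a row for comparison against the human-coded analysis; the validation step also surfaces cases like the syntactically complex caption the model failed to parse.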
