Abstract: Objective To develop a scale for evaluating nursing students' large language model (LLM) usage behavior and to test its reliability and validity. Methods Guided by the Theory of Planned Behavior (TPB), an initial questionnaire was developed through literature review, semi-structured interviews, research team discussions, two rounds of expert consultation, and a pilot survey. Purposive sampling was used to survey eligible nursing students from three medical universities between September and November 2024. A total of 634 valid questionnaires were collected for reliability and validity testing. Results The finalized questionnaire for assessing nursing students' LLM usage behavior comprised five dimensions (attitude, subjective norms, perceived behavioral control, usage intention, and usage behavior) with 28 items in total and a cumulative variance contribution rate of 66.584%. The content validity index (CVI) of individual items ranged from 0.800 to 1.000, with an average CVI of 0.914. Confirmatory factor analysis demonstrated good model fit, with χ²/df = 1.647, RMSEA = 0.044, CFI = 0.951, IFI = 0.951, and TLI = 0.945, and with GFI and NFI within acceptable ranges. The overall Cronbach's α coefficient of the questionnaire was 0.917, with dimension-specific coefficients ranging from 0.814 to 0.898. The overall test-retest reliability was 0.823, with dimension-specific values ranging from 0.770 to 0.864. The overall split-half reliability was 0.775, with dimension-specific values ranging from 0.823 to 0.905. Conclusion The developed questionnaire demonstrates good reliability and validity, providing a reference for nursing educators and administrators to evaluate nursing students' intentions and behaviors regarding LLM usage.
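For readers unfamiliar with the internal-consistency statistics reported above, the following is a minimal sketch of how Cronbach's α and a Spearman-Brown corrected split-half coefficient are conventionally computed. It is not the authors' analysis code: the function names and the `demo` responses are hypothetical, and the abstract does not state which split-half formula was used, so the Spearman-Brown correction here is an assumption.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondent rows (one score per item).

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    k = len(responses[0])
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item (column) across respondents.
    item_vars = [variance([row[i] for row in responses]) for i in range(k)]
    # Sample variance of each respondent's summed score.
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def spearman_brown(r_half):
    """Step a half-test correlation up to an estimate of full-length reliability."""
    return 2 * r_half / (1 + r_half)


# Hypothetical 5-point Likert responses (5 respondents x 3 items),
# invented purely for illustration and unrelated to the study data.
demo = [
    [4, 5, 4],
    [3, 3, 2],
    [5, 5, 4],
    [2, 3, 3],
    [4, 4, 5],
]

print(round(cronbach_alpha(demo), 3))
```

In practice the same calculation would be run once over all 28 items for the overall coefficient and once per dimension for the dimension-specific coefficients.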