Surgery, Gastroenterology and Oncology
Vol. 30, No. 3, Sept 2025
The Assessment of Fallacies of TIRADS and Bethesda Scores in Decision-Making Management among Thyroid Nodules: A Multicentric Cohort Study
Ahmed Tarabay, Mohamad S Moussa, Mohamed Selim, Hossam Darwish, Doaa M Elkafrawy, Sara Selmy Awad, Azzah Alzahrani, Abdullah M Altalhi, Malak F Almogathali, Esraa J Kaheel, Nadiah G AlAmri, Zahra Abdullah Alsamiri, Mohammed Quaider, Saud M Altalhi, Khalaf Helal Alghumuy, Sohila Selmy Awad, Abdullah Ahmed Hassan Almalki, Mohammed Hani Mohammed Alzahrani, Mahmoud R Abdulshafi, Farah C Pantaran, Selmy S Awad, Ehab M Khedr
ORIGINAL PAPER, Sept 2025
Article DOI: 10.21614/sgo-792

ABSTRACT

Background and Aim: The Thyroid Imaging Reporting and Data System (TIRADS) and Bethesda cytopathology classification are widely used to stratify malignancy risk in thyroid nodules; however, their diagnostic fallacies remain understudied in multicenter cohorts. This study evaluated the accuracy and limitations of the TIRADS and Bethesda systems in differentiating benign from malignant thyroid nodules across multiple centers.

Materials and Methods: A multicenter cohort of 500 patients with thyroid nodules (Group I, 250; Group II, 250) was analyzed. All patients underwent ultrasonography (TIRADS scoring), fine-needle aspiration cytology (FNAC; Bethesda classification), and histopathological examination. The diagnostic performance, discordance rates, and ultrasound features were assessed.

Results: FNAC demonstrated superior sensitivity (87.5%) compared with that of ultrasound (57.1%), although both had 100% specificity. False negatives: Ultrasound misclassified 14.3% of malignancies as benign, and FNAC missed 5.6% of malignancies in Bethesda II nodules. TIRADS 3 nodules had a 24% malignancy rate, contradicting their "low risk" label, whereas TIRADS 5 included 14% benign cases. Bethesda III-IV nodules showed a 43.8% malignancy rate, highlighting the pitfalls of indeterminate categories. Ultrasound features (microcalcifications, taller-than-wide shape) were not statistically significant (p > 0.05). Multinodular goiters (MNGs) had similar malignancy rates to solitary nodules (33.5% vs. 34.3%, p = 0.95).

Conclusion: The TIRADS and Bethesda systems exhibit critical limitations, including an over-reliance on ultrasound features and indeterminate-category inaccuracy. FNAC remains the gold standard for diagnosis.

INTRODUCTION

The thyroid gland is a richly vascularized organ located at the front of the neck that secretes hormones, mainly thyroxine (T4) and triiodothyronine (T3), which regulate metabolism, heart rate, neural development, and the cardiovascular, renal, and brain systems. Multinodular goiter (MNG) is an enlarged thyroid gland characterized by the presence of multiple nodules (1-3).

Goiters can be categorized as toxic or nontoxic, diffuse or nodular, and solitary or multiple types. Globally, MNG is the most prevalent endocrine disorder, impacting 500 to 600 million individuals, with iodine deficiency frequently being the cause (4-5).

Currently, the primary diagnostic procedures for assessing nodular goiter before surgery include thyroid gland ultrasonography and ultrasound-guided fine-needle aspiration biopsy (FNAB). Increasingly, the decision to proceed with surgery for thyroid focal lesions is informed by molecular techniques that reveal potential mutations and epigenetic alterations, which are crucial in the process of malignant transformation (6).

Ultrasonography serves as an invaluable diagnostic modality for assessing focal thyroid lesions, particularly in the context of multinodular goiter, as it facilitates the identification of areas amenable to fine-needle aspiration biopsy. It is now acknowledged that the primary determinants of a thyroid nodule's malignant potential are not its size but rather its vascularization, presence of microcalcifications, height-to-width ratio, structural composition (solid or mixed solid-fluid), echogenicity, margins, and presence of a halo. Consequently, the following ultrasonographic characteristics indicate a higher likelihood of malignancy: increased central vascularization or absence of flow on power Doppler; microcalcifications; a height exceeding the width; a solid rather than mixed solid-fluid composition; hypoechogenicity rather than isoechogenicity; irregular margins; and an absent halo or an irregular, thick halo (7).

In contemporary medical practice, ultrasound-guided FNAB is considered the definitive method for diagnosing nodular goiters. This procedure is straightforward, safe, and cost-effective. Cytological assessment of samples collected using this biopsy method adheres to the international classification known as the Bethesda System for Reporting Thyroid Cytopathology. According to this system, the outcomes of fine-needle aspiration biopsy of a thyroid nodule are divided into six diagnostic cytopathology categories: I, nondiagnostic or unsatisfactory; II, benign; III, atypia of undetermined significance or follicular lesion of undetermined significance; IV, follicular neoplasm or suspicious for a follicular neoplasm; V, suspicious for malignancy; and VI, malignant. Diagnoses in categories IV, V, and VI generally necessitate surgical intervention. Recommendations are also made for diagnoses in categories III and I. It is crucial to recognize that even a benign diagnosis (category II) from a fine-needle aspiration biopsy carries a 3% risk of false-negative results (8).
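The six categories above can be written down as a small lookup structure. The following is a minimal sketch for illustration only: the names `Bethesda`, `usually_surgical`, and `is_indeterminate` are hypothetical and not part of the Bethesda system or of this study's methods, and the "indeterminate" grouping of categories III-IV follows the usage in this paper.

```python
from enum import IntEnum


class Bethesda(IntEnum):
    """Six Bethesda diagnostic categories as listed in the text."""
    NONDIAGNOSTIC = 1        # I: nondiagnostic or unsatisfactory
    BENIGN = 2               # II: benign
    AUS_FLUS = 3             # III: atypia / follicular lesion of undetermined significance
    FOLLICULAR_NEOPLASM = 4  # IV: follicular neoplasm or suspicious for one
    SUSPICIOUS = 5           # V: suspicious for malignancy
    MALIGNANT = 6            # VI: malignant


def usually_surgical(category: Bethesda) -> bool:
    """True for categories IV-VI, which the text says generally need surgery."""
    return category >= Bethesda.FOLLICULAR_NEOPLASM


def is_indeterminate(category: Bethesda) -> bool:
    """True for categories III-IV, the indeterminate group discussed in this paper."""
    return category in (Bethesda.AUS_FLUS, Bethesda.FOLLICULAR_NEOPLASM)
```

Such a mapping is only a bookkeeping aid; management of an individual nodule depends on the clinical context described in the cited guidelines.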

The Thyroid Imaging Reporting and Data System (TIRADS) was introduced as a uniform approach for assessing and documenting thyroid ultrasound results, as well as for evaluating risk, to facilitate interpretation and ensure consistency among professionals (9).

This study offers statistical evidence on the accuracy of sonographic findings in thyroid nodules for differentiating between malignant and benign cases, compared with the outcomes of fine-needle aspiration of thyroid nodules evaluated using the Bethesda classification. This approach was considered particularly beneficial in situations where FNAC is not readily available, allowing decisions to be based largely on the TIRADS classification and thereby facilitating early management of the condition.

MATERIALS AND METHODS

Study Design and Setting

This multicenter cohort study was conducted in the surgery departments of multiple centers to evaluate the accuracy and clinical implications of Thyroid Imaging Reporting and Data System (TIRADS) and Bethesda scoring in thyroid nodules. Patient data spanning February 2022 to January 2025 were reviewed, and cases were included based on predefined selection criteria. The study included patients diagnosed with thyroid nodules who underwent fine-needle aspiration cytology (FNAC) and had imaging categorized by TIRADS.

Patient recruitment

Patients of all age groups with confirmed thyroid nodules were included, provided that they had undergone both ultrasonographic evaluation and FNAC with Bethesda scoring, and histopathological confirmation when applicable.

A total of 500 patients with thyroid nodules were divided into two groups: Group I (n=250) underwent ultrasonographic evaluation and TIRADS classification, while Group II (n=250) underwent FNAC and Bethesda classification. The study compared clinical parameters, ultrasound characteristics, and histopathological findings between the groups.

Exclusion criteria

- Patients with incomplete records or missing relevant histopathological reports.

- Cases with a history of thyroid surgery or malignancy.

- Nodules that lacked definitive classification under TIRADS.

Data collection and sorting

Clinical records were extracted from the hospital database, and relevant imaging, cytological, and histopathological reports were reviewed. The following variables were recorded:

Demographics: Age, sex, and comorbidity

Diagnostic methods

- Imaging findings: TIRADS score and ultrasound characteristics (9).

- Cytology findings: Bethesda classification (8).

- Final histopathology: malignant vs. benign nodules.

Assessment of outcome measures

The primary outcomes included:

- Diagnostic performance of TIRADS and Bethesda scores in predicting thyroid malignancy.

- Rate of discordance between imaging-based risk stratification (TIRADS) and cytology (Bethesda classification).

- Sensitivity, specificity, and predictive values for malignancy detection.

Secondary outcomes assessed were as follows:

- Interobserver variability in TIRADS assignment.

- False-negative and false-positive rates in the Bethesda classification.
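The sensitivity, specificity, and predictive values listed among the primary outcomes all follow from a 2x2 table of test result against final histopathology. The sketch below shows the standard definitions; the class name `Confusion` and the example counts are hypothetical and are not study data, and the study itself reports using SPSS rather than this code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """2x2 counts of an index test against final histopathology."""
    tp: int  # test positive, histology malignant
    fp: int  # test positive, histology benign
    fn: int  # test negative, histology malignant
    tn: int  # test negative, histology benign

    def sensitivity(self) -> float:
        # Share of malignant nodules that the test flags as positive.
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Share of benign nodules that the test reports as negative.
        return self.tn / (self.tn + self.fp)

    def ppv(self) -> float:
        # Share of test-positive nodules that are malignant.
        return self.tp / (self.tp + self.fp)

    def npv(self) -> float:
        # Share of test-negative nodules that are benign.
        return self.tn / (self.tn + self.fn)


# Hypothetical counts for illustration only; not study data.
example = Confusion(tp=45, fp=5, fn=15, tn=135)
print(f"sensitivity={example.sensitivity():.3f}, specificity={example.specificity():.3f}")
print(f"PPV={example.ppv():.3f}, NPV={example.npv():.3f}")
```

Note that predictive values, unlike sensitivity and specificity, depend on the prevalence of malignancy in the cohort being tested.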

Statistical Analysis

Data analysis was performed using SPSS. Continuous variables are reported as mean ± SD when symmetrically distributed and as median (IQR) otherwise; categorical data are reported as frequencies and percentages. Statistical significance was set at p < 0.05.
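The text does not name the test used to compare categorical variables between groups. As an illustration only, the sketch below assumes a Pearson chi-square test on a 2x2 table without continuity correction, which is one common choice; the function name `chi2_2x2` and the example counts are hypothetical and are not study data.

```python
import math


def chi2_2x2(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, p_value) with 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be positive")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom the chi-square survival function is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical counts for illustration only; not study data.
stat, p = chi2_2x2(30, 20, 20, 30)
significant = p < 0.05  # significance threshold stated in the text
print(f"chi-square={stat:.2f}, p={p:.4f}, significant={significant}")
```

For small expected cell counts an exact test would usually be preferred over this approximation.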

RESULTS

The study involved 500 participants, divided equally into Groups I and II, each with 250 patients. The average age was similar between the groups, with Group I having a mean age of 48.8 ± 6.0 years and Group II at 49.3 ± 5.4 years (p = 0.718). There was a higher proportion of women in both groups, with 68% in Group I and 78% in Group II (p = 0.111). No notable differences were found in terms of BMI, blood pressure, or hematological indices (p > 0.05) (table 1).

Table 1 - Demographic dataset and characteristics of the study population


Solitary nodules predominated in both groups (Group I: 66%; Group II: 68%; p = 0.738). Left lobe nodules were more frequent in Group II than in Group I (68% vs. 50%, p = 0.024). Nodule size (length/width) and multiplicity showed no intergroup differences (p > 0.4) (table 2).

Table 2 - Distribution of all studied cases by number of nodules (n=500)

Calcifications were absent in 40% of Group I vs. 54% of Group II (p = 0.297), and microcalcifications were rare (6% vs. 4%). Hypoechogenicity (26% vs. 20%) and taller-than-wide shape (22% vs. 30%) were not predictive of malignancy (p > 0.4). On Doppler, central vascularity (36-38%) showed no diagnostic utility (p = 0.422) (table 3).

Table 3 - Ultrasound features between the two groups (n=500)


TIRADS 4-5 nodules comprised 32% (Group I) and 24% (Group II) of the nodules (p = 0.727). Bethesda III-IV accounted for 32% (Group I) and 36% (Group II) (p = 0.721), with a 43.8% malignancy risk. Bethesda V-VI nodules had a 14.3% false-positive rate (table 4).

Table 4 - Distribution according to TIRADS and Bethesda classification (n=500)

Ultrasound misclassified 14.3% of malignancies as benign (sensitivity: 57.1%; specificity: 100%). FNAC demonstrated higher sensitivity (87.5%) but missed 5.6% of malignancies in Bethesda II nodules. False positives: 16% of TIRADS 5 and 14.3% of Bethesda V-VI nodules were histopathologically benign (table 5).

Table 5 - Correlation between postoperative histopathology and both TIRADS and Bethesda scores

DISCUSSION

This multicenter cohort study of 500 patients critically evaluated the diagnostic accuracy and limitations of the TIRADS and Bethesda classification systems in differentiating benign thyroid nodules from malignant ones. Our findings reveal significant discrepancies between imaging, cytology, and final histopathology, underscoring the challenges of current thyroid nodule risk stratification. Below, we contextualize our results within the existing literature and highlight their clinical implications.

Diagnostic Performance: Ultrasound vs. FNAC

Our data confirm that FNAC (87.5% sensitivity, 100% specificity) outperforms ultrasound (57.1% sensitivity, 100% specificity) in malignancy detection, consistent with previous studies (10-12). However, the 14.3% false-negative rate of ultrasound (30/210 benign-classified nodules were malignant) mirrors reports by Ha et al. (13), who noted that isoechoic or hyperechoic malignancies were often misclassified. The 5.6% false-negative rate of FNAC (10/180 Bethesda II nodules were malignant) parallels the findings of Bongiovanni et al. (14), who attributed this to sampling errors or cystic degeneration.
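The two false-negative proportions quoted above can be reproduced from their counts. Because the denominators are test-negative nodules, these figures correspond to what is sometimes called the false omission rate (1 - NPV). The sketch below also attaches an approximate 95% Wilson score interval to each proportion; the interval is an illustrative addition and was not reported in this study, and the function name `wilson_interval` is hypothetical.

```python
import math


def wilson_interval(events: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 gives roughly 95%)."""
    if total <= 0 or not 0 <= events <= total:
        raise ValueError("need total > 0 and 0 <= events <= total")
    p = events / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return centre - half, centre + half


# Counts quoted in the paragraph above: (malignant on histology, test-negative nodules).
reported = {
    "ultrasound, benign-classified": (30, 210),
    "FNAC, Bethesda II": (10, 180),
}

for label, (missed, total) in reported.items():
    low, high = wilson_interval(missed, total)
    print(f"{label}: {missed / total:.1%} (approx. 95% CI {low:.1%} to {high:.1%})")
```

The point estimates print as 14.3% and 5.6%, matching the figures in the text.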

TIRADS and Bethesda: Clinical Fallacies

A. TIRADS over- and underestimation

In our cohort, TIRADS 3 nodules had a 24% malignancy rate, contradicting their "low suspicion" label. Similar findings were reported by Russ et al. (15), where 15-30% of TIRADS 3 nodules were malignant. TIRADS 5 nodules included 14% benign cases, echoing the findings of Park et al. (16), who found that macrocalcifications or Hashimoto’s thyroiditis can mimic malignancy.

B. Bethesda III–IV: the "gray zone" problem

The 43.8% malignancy rate in Bethesda III-IV nodules aligns with that of Cibas et al. (17), but highlights the system’s inability to stratify the risk within this category. Recent studies have advocated refined subcategories (e.g., IIIA/IIIB) or RAS mutation testing to reduce unnecessary surgeries (18).

Ultrasound Features: Poor Predictive Value

Contrary to the guidelines (19), microcalcifications (6% vs. 4%, p=0.297) and taller-than-wide shape (22–30%, p=0.679) were not statistically predictive of malignancy in our cohort. This conflicts with the findings of Moon et al. (20), but supports those of Grani et al. (21), who found that echogenicity and vascularity vary widely across populations.

Nodule Characteristics: Multinodularity and Location

Multinodular goiters (MNGs) had similar malignancy rates to solitary nodules (33.5% vs. 34.3%, p = 0.95), challenging the dogma that MNGs are at lower risk (22). Left lobe predominance (68% in Group II, p = 0.024) lacked a clear biological explanation but may reflect anatomic sampling bias (23). Revising the TIRADS 3-4 thresholds may reduce false reassurance. The adoption of adjunct molecular tools (e.g., Afirma GSC, ThyroSeq) for Bethesda III-IV nodules may reduce fallacies. Prospective validation of AI-based ultrasonography algorithms is required in future studies.

Our research highlights the diagnostic reliability of both USG and FNAC in distinguishing malignant thyroid nodules. Although ultrasonography is effective in identifying nodules, FNAC demonstrates greater accuracy, serving as a minimally invasive technique with impressive precision in differentiating malignant from benign tumors.

A significant observation is the risk of misdiagnosis and missed diagnoses when relying solely on qualitative assessments like the Bethesda classification. Therefore, we recommend a combined diagnostic approach that includes ultrasound TIRADS grading, providing a more comprehensive method for accurately diagnosing malignant thyroid nodules. This study offers valuable insights for clinicians, advocating for a refined diagnostic strategy to enhance patient care and intervention.

Limitations

The limitations of this study include possible institutional bias: despite multicenter recruitment, regional practice patterns may affect generalizability. Interobserver discordance among cytopathologists in Bethesda grading was not quantified. Molecular testing was not included, although emerging markers (e.g., TERT, BRAF) can refine the assessment of indeterminate nodules. While prospective studies are often seen as more reliable, retrospective designs offer their own advantages. This method enables researchers to efficiently examine extensive datasets and formulate hypotheses for future studies. By thoroughly analyzing multiple outcomes, we were able to comprehensively assess the disease burden and associated effects and gain deeper insights into patients with goiter. Consequently, a multidisciplinary strategy that emphasizes early diagnosis and personalized treatment is crucial for effective management of MNG. We anticipate that the findings of our study will be relevant to a broader population across various geographical areas and healthcare settings, thereby enhancing the external validity of our results. Large-scale prospective studies are required to precisely identify the onset period and provide more detailed information on the primary and secondary outcomes.

CONCLUSION

This observational retrospective study revealed critical gaps in the TIRADS and Bethesda systems, including an overreliance on ultrasound features and indeterminate-category pitfalls. Revision of the TIRADS thresholds, particularly for TIRADS 3-4 to reduce false reassurance, and standardization of ultrasound reporting are recommended to improve diagnostic precision. Adopting adjunct molecular tools (e.g., Afirma GSC, ThyroSeq) for Bethesda III-IV nodules may reduce fallacies. Prospective validation of AI-based ultrasonography algorithms is needed in future studies.

Clinical Significance

This study elucidates the characteristics, outcomes, and indications of the TIRADS and Bethesda systems, including their overreliance on ultrasound features and indeterminate-category pitfalls. This study will help revise the TIRADS thresholds and standardize ultrasound reporting to improve diagnostic precision for better surgical implications and practice. The results provide an overview of the diagnostic modalities, which would support further studies based on these findings.

Conflicts of interest statement: none.

Sources of funding for research: none.

Ethics approval and consent to participate: available.

Availability of data and materials: available.

REFERENCES

1. Braun EM, Windisch G, Wolf G, Hausleitner L, Anderhuber F. The pyramidal lobe: Clinical anatomy and its importance in thyroid surgery. Surg Radiol Anat. 2007;29(1):21-7.

2. Beynon ME, Pinneri K. An Overview of the Thyroid Gland and Thyroid-Related Deaths for Forensic Pathologists. Acad Forensic Pathol. 2016;6(2):217-236.

3. Mondal S, Raja K, Schweizer U, Mugesh G. Chemistry and Biology of Thyroid Hormone. Angew Chem Int Ed Engl. 2016;55(27):7606-30.

4. Mortensen JD, Woolner LB, Bennett WA. Gross and microscopic findings in clinically normal thyroid glands. J Clin Endocrinol Metab. 1955;15(10):1270-80.

5. Matovinovic J. Endemic goiter and cretinism at the dawn of the third millennium. Annu Rev Nutr. 1983;3:341-412.

6. Nikiforov YE. Molecular analysis of thyroid tumors. Mod Pathol. 2011;24 Suppl 2:S34-43.

7. Adamczewski Z, Lewinski A. Proposed algorithm for management of patients with thyroid nodules/focal lesions, based on ultrasound (US) and fine-needle aspiration biopsy (FNAB); our own experience. Thyroid Res. 2013;6:6.

8. Cibas ES, Ali SZ. The Bethesda System for Reporting Thyroid Cytopathology. Thyroid. 2009;19(11):1159-1165.

9. Horvath E, Majlis S, Rossi R, Franco C, Niedmann JP, Castro A, et al. An ultrasonogram reporting system for thyroid nodules stratifying cancer risk for clinical management. J Clin Endocrinol Metab. 2009;94(5):1748-51.

10. Gharib H, Papini E, Garber JR, Duick DS, Harrell RM, Hegedüs L, et al. American Association of Clinical Endocrinologists, American College of Endocrinology, and Associazione Medici Endocrinologi Medical Guidelines for Clinical Practice for the Diagnosis and Management of Thyroid Nodules - 2016 update. Endocr Pract. 2016;22(5):622-39.

11. Tessler FN, Middleton WD, Grant EG, Hoang JK, Berland LL, Teefey SA, et al. ACR Thyroid Imaging, Reporting, and Data System (TI-RADS): White Paper of the ACR TI-RADS Committee. J Am Coll Radiol. 2017;14(5):587-595.

12. Cibas ES, Ali SZ. The 2017 Bethesda System for Reporting Thyroid Cytopathology. Thyroid. 2017;27(11):1341-1346.

13. Russ G, Bonnema SJ, Erdogan MF, Durante C, Ngu R, Leenhardt L. European Thyroid Association Guidelines for Ultrasound Malignancy Risk Stratification of Thyroid Nodules in Adults: The EU-TIRADS. Eur Thyroid J. 2017;6(5):225-237.

14. Haugen BR, Alexander EK, Bible KC, Doherty GM, Mandel SJ, Nikiforov YE, et al. 2015 American Thyroid Association Management Guidelines for Adult Patients with Thyroid Nodules and Differentiated Thyroid Cancer. Thyroid. 2016;26(1):1-133.

15. Grant EG, Tessler FN, Hoang JK, Langer JE, Beland MD, Berland LL, et al. Thyroid Ultrasound Reporting Lexicon: White Paper of the ACR Thyroid Imaging, Reporting, and Data System (TIRADS) Committee. J Am Coll Radiol. 2015;12(12 Pt A):1272-1279.

16. Moon WJ, Jung SL, Lee JH, Na DG, Baek JH, Lee YH, et al. Benign and malignant thyroid nodules: US differentiation-multicenter retrospective study. Radiology. 2008;247(3):762-770.

17. Grani G, Lamartina L, Cantisani V, Maranghi M, Lucia P, Durante C. Interobserver agreement of various thyroid imaging reporting and data systems. Endocr Connect. 2018;7(1):1-7.

18. Nikiforov YE, Carty SE, Chiosea SI, Coyne C, Duvvuri U, Ferris RL, et al. Highly accurate diagnosis of cancer in thyroid nodules with follicular neoplasm/suspicious for a follicular neoplasm cytology by ThyroSeq v2 next-generation sequencing assay. Cancer. 2014;120(23):3627-3634.

19. Durante C, Grani G, Lamartina L, Filetti S, Mandel SJ, Cooper DS. The Diagnosis and Management of Thyroid Nodules: A Review. JAMA. 2018;319(9):914-924.

20. Park JY, Lee HJ, Jang HW, Kim HK, Yi JH, Lee W, et al. A proposal for a thyroid imaging reporting and data system for ultrasound features of thyroid carcinoma. Thyroid. 2009;19(11):1257-1264.

21. Ali SZ, Cibas ES. The Bethesda System for Reporting Thyroid Cytopathology: Definitions, Criteria, and Explanatory Notes. 2nd ed. Springer; 2018.

22. Valderrabano P, McIver B. Evaluation and Management of Indeterminate Thyroid Nodules: The Revolution of Risk Stratification Beyond Cytological Diagnosis. Cancer Control. 2017;24(5):1073274817729231.

23. Patel KN, Yip L, Lubitz CC, Grubbs EG, Miller BS, Shen W, et al. The American Association of Endocrine Surgeons Guidelines for the Definitive Surgical Management of Thyroid Disease in Adults. Ann Surg. 2020;271(3):e21-e93.



Creative Commons License
Surgery, Gastroenterology and Oncology applies the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits readers to copy and redistribute the material in any medium or format, remix, adapt, build upon the published works non-commercially, and license the derivative works on different terms, provided the original material is properly cited and the use is non-commercial. Please see: https://creativecommons.org/licenses/by-nc/4.0/
Publisher’s Note:
The opinions, statements, and data contained in article are solely those of the authors and not of Surgery, Gastroenterology and Oncology journal or the editors. Publisher and the editors disclaim responsibility for any damage resulting from any ideas, instructions, methods, or products referred to in the content.