Citation
Shaheed, I. M., Khudhair, K. T., & Hasan, N. F. (2026). Content Validity Testing of Items for Determining the Appropriateness of a Computer Science-Specific Learning Taxonomy Instrument. International Journal of Research, 13(4), 155–167. https://doi.org/10.26643/ijr/edupub/12
Iman Mousa Shaheed1, *, Kifah Taha Khudhair2, Noor Flayyih Hasan3
1General Directorate of Education in Najaf, Kufa department of education, Najaf, Iraq
2Technical College of Management – Kufa, Al-Furat Al-Awsat Technical University, Kufa, 54003, Iraq
3Southern Technical University, Thi-Qar Technical College, Department of Accounting Techniques, Iraq
*Corresponding author: eman_musa21@yahoo.com
Abstract
Keywords: Learning Taxonomy; Appropriateness; Instrument Development; Content Validity Index.
1.0 Introduction
Learning taxonomies are useful planning tools for instructors, helping them to assess curricula and related educational objectives. In computer science, educators have widely used Bloom’s taxonomy and its revised versions [1, 2]. However, numerous computer science-specific taxonomies have also been proposed [3-5] because the original taxonomy is not well suited to learning computer science subjects [6]. Teodorescu et al. [7] asserted that, to help educators plan and assess their teaching, taxonomies must suit their goals and include subject-specific requirements.
According to Kropp et al. [8], a major problem is providing evidence of a taxonomy’s appropriateness, including the development of a valid statistical methodology and models.
Unfortunately, there are few studies of the development of such models. Hauenstein [9] suggested five general rules of taxonomy evaluation: it should be applicable, inclusive, consist of categories that are independent from one another, reflect a consistent order, and use terms that are relevant to the subject area. Inclusivity prevents standards from being omitted, and mutual exclusivity prevents overlapping categories in a taxonomy.
The purpose of this study was to determine the content validity of an instrument to assess the appropriateness of a computer science-specific taxonomy. The results address the existing knowledge gap, and this instrument will provide computer science educators with a reliable, valid, and convenient tool for selecting the best taxonomy to use in their teaching practices.
To address this gap in the literature, the authors reviewed 40 studies of the application of Bloom’s taxonomy in computer programming courses. The aim was to answer the following key research question: What are the deficiencies affecting currently used learning taxonomies with regard to computer programming courses?
To answer this question, qualitative content analysis techniques were used to analyze statements about the computer programming-related shortcomings of Bloom’s taxonomy. These shortcomings were used to develop specifications for an appropriate computer science-specific learning taxonomy. Because the ACM and IEEE Computer Society [10] currently use Bloom’s taxonomy to categorize the learning outcomes of the basic programming course in the ACM/IEEE-CS curriculum, this search was limited to the weaknesses of the original Bloom’s taxonomy and its revised versions. However, this analysis may also indicate weaknesses in other existing Bloom-based taxonomies.
The next sub-section describes the study performed to identify the specifications of a computer science-specific taxonomy and the dimensions required to evaluate the appropriateness of this learning taxonomy.
2.1 Specifications Identification
A qualitative content analysis was conducted using NVivo version 10 qualitative data analysis software (QSR International Pty Ltd, Burlington, MA, USA), guided by the procedure of Edwards-Jones [11], to partially automate our analysis of the discussion sections of the reviewed articles.
In particular, one of the authors performed a constant comparison analysis [12] using both deductive and inductive coding approaches [13]. In the deductive phase, the aforementioned rules of Hauenstein [9] were applied. The author first read the entire data set and then chunked the data into smaller meaningful parts, labeling each chunk with a descriptive title, or “code”. NVivo was used to highlight text segments whose codes represented a specific weakness. Each new chunk of data was then compared with previously coded chunks so that similar chunks received the same code. After all the data were coded, the codes were grouped by similarity, and a theme was identified and documented for each grouping.
As a result, comprehensive computer science-specific taxonomy specifications are proposed, namely, consistency, inclusivity, hierarchical adequacy, representativeness, usability, coherence, mutual exclusivity, and dimensional adequacy. Table 1 presents these primary dimensions along with the approach used and their descriptions.
To ensure inter-rater reliability, the data were coded first. Themes and randomly selected sample statements related to these themes were then given to two reviewers who had taken a course in qualitative research methods. Both reviewers held Ph.D.s in education, and their research interests included computer science education. The reviewers were asked to code the documents based on the themes. The agreement between the two reviewers’ coding was 86%.
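As a minimal illustration of how such a percent-agreement figure is obtained (the statements and theme labels below are hypothetical, not the study’s actual coding data):

```python
# Theme codes assigned by two independent reviewers to the same sample
# statements (hypothetical data); percent agreement = matching codes / total.
reviewer_a = ["usability", "consistency", "inclusivity", "learnability", "usability"]
reviewer_b = ["usability", "consistency", "mutual exclusivity", "learnability", "usability"]

matches = sum(a == b for a, b in zip(reviewer_a, reviewer_b))
percent_agreement = 100 * matches / len(reviewer_a)
print(f"{percent_agreement:.0f}%")  # 80%
```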
Table 1 Computer Science-Specific Taxonomy Specifications
| No | Dimension | Approach | Description |
| 1 | Usability | Inductive | The taxonomy should categorize programming learning objectives in a simple way, breaking each objective into its components (i.e., tasks and knowledge). |
| 2 | Consistency | Deductive | The taxonomy should involve a dependable classification and interpretation of programming learning outcomes. These outcomes should always be expressed the same way. |
| 3 | Learnability | Inductive | Taxonomic categories and their interpretations should be comprehensible. |
| 4 | Hierarchical adequacy | Deductive | The hierarchy of categories should effectively describe programming learning objectives. |
| 5 | Dimensional adequacy | Inductive | The taxonomy should have two distinct dimensions (knowledge types and cognitive processes) to successfully describe the constructive learning objectives of programming. According to Airasian and Miranda [14], a two-dimensional approach allows educators to create stronger objectives that address increasingly complex instruction methods. |
| 6 | Mutual exclusivity | Inductive | Each learning objective should be assigned to only one category. |
| 7 | Inclusivity | Deductive | The taxonomy should include a sufficient list of all necessary programming knowledge types and skills for the user to classify all programming learning standards. |
| 8 | Representativeness | Deductive | The taxonomy should use common relevant terms to describe programming skills, knowledge types, and competencies required for each skill. The programming knowledge framework should be considered [15, 16]. |
The development process of Lynn [17] was used to guide the content development for this instrument. In this process, when content is being developed for an affective measure such as one of taxonomic appropriateness, two sub-processes occur: development and judgment. Development involves the identification of dimensions or sub-dimensions and extends to item generation and the subsequent integration of items into a suitable form, according to Lynn [17]. Judgment involves determining whether the given content and instrument are sufficiently valid [17]. According to Turner et al. [18], during initial instrument development, a conceptual framework should be identified. This framework should be representative so that the domain content is specific and relates to the subject area. This specificity is achieved by reviewing the related literature, during which potential items are identified. Once the preliminary scope of the taxonomy has been identified, the proposed content is analyzed to achieve a satisfactory final structure.
3.1 Conceptual Framework and Domain Content Identification
The identification of items involved writing items for the scales. Initially, items from a previously validated questionnaire, specifically the Measurement Scales for Perceived Usefulness and Perceived Ease of Use by Davis [19], were examined and adapted. Then, suitable items were written for each scale based on a review of the literature [3, 20-37], and these items were incorporated into the taxonomy framework and finally related to particular dimensions. Table 2 shows the items developed for each dimension.
Table 2 Taxonomy Appropriateness Items
| Dimension | Items |
| 1. Usability | 1.1 This taxonomy is easy to use. |
| | 1.2 This taxonomy is flexible in describing learning objectives. |
| | 1.3 Using this taxonomy is effortless. |
| | 1.4 This taxonomy gives me more control over the activities in my course. |
| 2. Consistency | 2.1 This taxonomy can be used to interpret programming learning tasks every time. |
| | 2.2 This taxonomy can be used to interpret programming learning knowledge every time. |
| | 2.3 This taxonomy can be used to classify programming learning outcomes every time. |
| 3. Learnability | 3.1 The categories in this taxonomy are comprehensible. |
| | 3.2 The categories in this taxonomy can be clearly interpreted. |
| | 3.3 This taxonomy is readable. |
| 4. Hierarchical adequacy | 4.1 The ordering of the taxonomy’s skill sets appropriately reflects the programming learning process. |
| | 4.2 The ordering of the taxonomy’s knowledge types appropriately reflects the programming learning process. |
| | 4.3 The ordering of the taxonomy’s categories appropriately reflects programming learning objectives. |
| 5. Dimensional adequacy | 5.1 This taxonomy includes enough distinctive dimensions of knowledge that can be used to successfully describe constructive programming learning objectives. |
| | 5.2 This taxonomy includes enough distinctive dimensions of cognitive processes that can be used to successfully describe constructive programming learning objectives. |
| | 5.3 This taxonomy includes enough distinctive categories that can be used to successfully describe constructive programming learning objectives. |
| 6. Mutual exclusivity | 6.1 When using this taxonomy, each knowledge type required in programming learning can be assigned to a single category. |
| | 6.2 When using this taxonomy, each programming learning skill can be assigned to a single category. |
| | 6.3 When using this taxonomy, each programming learning objective can be assigned to a single category. |
| 7. Inclusivity | 7.1 The set of knowledge types in this taxonomy includes all necessary knowledge types that students must know to perform a given programming learning task. |
| | 7.2 The skills in this taxonomy include all the necessary skills that students must acquire to perform a given programming learning task. |
| | 7.3 The knowledge types in this taxonomy include all appropriate types that students must know to perform a given programming learning task. |
| | 7.4 The skills in this taxonomy include all appropriate skills that students must acquire to perform a given programming learning task. |
| 8. Representativeness | 8.1 The categories in this taxonomy are relevant to learning computer programming. |
| | 8.2 The knowledge types in this taxonomy are relevant to the knowledge required to perform computer programming learning tasks. |
| | 8.3 The skill sets in this taxonomy are relevant to skills that must be acquired by students to perform computer programming learning tasks. |
4.0 Measuring Content Validity
Once the items have been generated, the validity of each item and of the overall instrument must be quantitatively determined [17]. To do this, researchers frequently calculate a content validity index (CVI). This index was first presented by Hambleton et al. [38], and its use in nursing research was advocated by Waltz and Bausell [39].
Many factors guided the selection of this index, including its ease of calculation and understanding. In contrast, the content validity ratio (CVR) developed by Lawshe [40], for example, is easy to calculate but not as easy to interpret [41]. Another desirable quality of a content validity measure is that it yields item-level information that can be used to refine or discard items and a summary of the content validity of the overall scale [41].
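For reference, Lawshe’s CVR [40] is computed for each item as

$$\mathrm{CVR} = \frac{n_e - N/2}{N/2},$$

where $n_e$ is the number of panelists rating the item as essential and $N$ is the panel size. This illustrates why the CVR is easy to calculate even though its values, which range from -1 to +1, are less straightforward to interpret than a proportion-based index such as the CVI.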
The CVI is the proportion of experts who assign an item a score of 3 or 4 on a 1–4 scale of relevance or representativeness. It has been recommended that an individual CVI (I-CVI) and a scale CVI (S-CVI) be calculated separately and that the S-CVI be reported [39, 41-43].
Polit and Beck [42] preferred the S-CVI when larger content-expert panels are involved, because one hundred percent agreement is then not feasible. The S-CVI is determined by averaging the I-CVI scores. When six or more experts are involved, Lynn [17] recommended a minimum I-CVI of 0.78, whereas Waltz and Bausell [39] recommended a minimum S-CVI of 0.90 for a scale to be considered valid and its items retained. In this study, we used both the I-CVI and S-CVI to determine the content validity of statements related to taxonomy appropriateness.
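To make these definitions concrete, the sketch below computes I-CVI values and the S-CVI/Ave from a small matrix of expert ratings; the ratings are hypothetical and are not the panel data reported later in this paper.

```python
# I-CVI = proportion of experts rating an item 3 or 4 on the 1-4 relevance
# scale; S-CVI/Ave = mean of the I-CVI values across all items.
# Rows = items, columns = experts (hypothetical ratings).
ratings = [
    [4, 3, 4, 4, 3],  # item 1
    [4, 4, 3, 4, 4],  # item 2
    [3, 4, 4, 2, 4],  # item 3 (one expert gives a rating of 2)
]

def i_cvi(item_ratings):
    """Proportion of experts who rate the item 3 or 4."""
    return sum(1 for r in item_ratings if r >= 3) / len(item_ratings)

i_cvis = [i_cvi(item) for item in ratings]
s_cvi_ave = sum(i_cvis) / len(i_cvis)

print([round(v, 2) for v in i_cvis])  # [1.0, 1.0, 0.8]
print(round(s_cvi_ave, 2))            # 0.93
```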
4.1 Expert Panel
Lynn [17] argued that at least three experts should be consulted when performing content validation. Our expert panel included five subject matter experts, each with more than 10 years of experience teaching programming. These experts were invited to evaluate the content validity based on the I-CVI and S-CVI. Each respondent received an informational email that included a hyperlink to a questionnaire. Survey security was maintained using Secure Sockets Layer technology to protect confidentiality, and no personal identifiers were collected. A four-point scale was used to evaluate the content validity, and the values were matched with verbal descriptions of taxonomic appropriateness as follows: 1 = the item is not representative; 2 = the item requires major revisions to be representative; 3 = the item requires minor revisions to be representative; 4 = the item is representative. The CVI was calculated as the proportion of experts who scored an item 3 or 4. As prescribed in the proposed methodology of our study, both the I-CVI and S-CVI were calculated, and the average scale CVI (S-CVI/Ave) was determined from all the I-CVI values. The target S-CVI/Ave value, according to Polit and Beck [42], is 0.9.
As an additional check, we then calculated the modified kappa statistic (κ*) described by Polit et al. [41]. According to Wynd et al. [44], the kappa statistic is an important supplement to the CVI because it indicates the degree of agreement beyond chance. The degree of agreement indicated by the κ* values was assessed using the guidelines of Landis and Koch [45].
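A companion sketch for the modified kappa, assuming the binomial chance-agreement formulation used by Polit et al. [41] (the expert counts below are illustrative):

```python
from math import comb

def modified_kappa(n_experts, n_agree):
    """Modified kappa for one item: k* = (I-CVI - p_c) / (1 - p_c), where
    p_c = C(N, A) * 0.5**N is the probability that A of N experts agree
    on the item's relevance purely by chance."""
    item_cvi = n_agree / n_experts
    p_c = comb(n_experts, n_agree) * 0.5 ** n_experts
    return (item_cvi - p_c) / (1 - p_c)

print(round(modified_kappa(5, 4), 2))  # 0.76
print(round(modified_kappa(5, 3), 2))  # 0.42
```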
5.0 Results
According to Polit and Beck [42], the I-CVI of a new instrument should range between 0.78 and 0.80. As Table 3 shows, the expert panel assigned all 26 items an I-CVI of 1.0, so every item was retained in the questionnaire, and the S-CVI/Ave was likewise 1.0. Because the CVI scores were consistently high, none of the experts’ suggestions regarding item content needed to be adopted. This high degree of concurrence among the respondents indicates that the instrument for assessing taxonomy appropriateness is adequate to progress to the next step of instrument development, namely the psychometric testing recommended by Lynn [17].
The modified kappa statistic (κ*) was calculated to determine the degree of agreement among the raters’ judgments of whether the 26 taxonomy appropriateness items were relevant. Agreement among the five raters was perfect for all items (κ* = 1.0).
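Under the same chance-agreement assumption sketched above, the reported value follows directly from five out of five experts agreeing on each item’s relevance:

$$p_c = \binom{5}{5}\left(\tfrac{1}{2}\right)^{5} = 0.03125, \qquad \kappa^{*} = \frac{\text{I-CVI} - p_c}{1 - p_c} = \frac{1.0 - 0.03125}{1 - 0.03125} = 1.0 .$$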
Table 3 Content validity indices (I-CVI and S-CVI)
| Items | I-CVI | S-CVI/Ave | κ* | No. of Respondents |
| 1. Usability | | | | |
| 1.1 This taxonomy is easy to use. | 1.0 | | 1.0 | 5 |
| 1.2 This taxonomy is flexible in describing learning objectives. | 1.0 | | 1.0 | 5 |
| 1.3 Using this taxonomy is effortless. | 1.0 | | 1.0 | 5 |
| 1.4 This taxonomy gives me more control over the activities in my course. | 1.0 | | 1.0 | 5 |
| 2. Consistency | | | | |
| 2.1 This taxonomy can be used to interpret programming learning tasks every time. | 1.0 | | 1.0 | 5 |
| 2.2 This taxonomy can be used to interpret programming learning knowledge every time. | 1.0 | | 1.0 | 5 |
| 2.3 This taxonomy can be used to classify programming learning outcomes every time. | 1.0 | | 1.0 | 5 |
| 3. Learnability | | | | |
| 3.1 The categories in this taxonomy are comprehensible. | 1.0 | | 1.0 | 5 |
| 3.2 The categories in this taxonomy can be clearly interpreted. | 1.0 | | 1.0 | 5 |
| 3.3 This taxonomy is readable. | 1.0 | | 1.0 | 5 |
| 4. Hierarchical adequacy | | | | |
| 4.1 The ordering of the taxonomy’s skill sets appropriately reflects the programming learning process. | 1.0 | | 1.0 | 5 |
| 4.2 The ordering of the taxonomy’s knowledge types appropriately reflects the programming learning process. | 1.0 | | 1.0 | 5 |
| 4.3 The ordering of the taxonomy’s categories appropriately reflects programming learning objectives. | 1.0 | | 1.0 | 5 |
| 5. Dimensional adequacy | | | | |
| 5.1 This taxonomy includes enough distinctive dimensions of knowledge that can be used to successfully describe constructive programming learning objectives. | 1.0 | | 1.0 | 5 |
| 5.2 This taxonomy includes enough distinctive dimensions of cognitive processes that can be used to successfully describe constructive programming learning objectives. | 1.0 | | 1.0 | 5 |
| 5.3 This taxonomy includes enough distinctive categories that can be used to successfully describe constructive programming learning objectives. | 1.0 | | 1.0 | 5 |
| 6. Mutual exclusivity | | | | |
| 6.1 When using this taxonomy, each knowledge type required in programming learning can be assigned to a single category. | 1.0 | | 1.0 | 5 |
| 6.2 When using this taxonomy, each programming learning skill can be assigned to a single category. | 1.0 | | 1.0 | 5 |
| 6.3 When using this taxonomy, each programming learning objective can be assigned to a single category. | 1.0 | | 1.0 | 5 |
| 7. Inclusivity | | | | |
| 7.1 The set of knowledge types in this taxonomy includes all necessary knowledge types that students must know to perform a given programming learning task. | 1.0 | | 1.0 | 5 |
| 7.2 The skills in this taxonomy include all the necessary skills that students must acquire to perform a given programming learning task. | 1.0 | | 1.0 | 5 |
| 7.3 The knowledge types in this taxonomy include all appropriate types that students must know to perform a given programming learning task. | 1.0 | | 1.0 | 5 |
| 7.4 The skills in this taxonomy include all appropriate skills that students must acquire to perform a given programming learning task. | 1.0 | | 1.0 | 5 |
| 8. Representativeness | | | | |
| 8.1 The categories in this taxonomy are relevant to learning computer programming. | 1.0 | | 1.0 | 5 |
| 8.2 The knowledge types in this taxonomy are relevant to the knowledge required to perform computer programming learning tasks. | 1.0 | | 1.0 | 5 |
| 8.3 The skill sets in this taxonomy are relevant to skills that must be acquired by students to perform computer programming learning tasks. | 1.0 | | 1.0 | 5 |
| Scale | | 1.0 | | |
I-CVI, individual content validity index; S-CVI/Ave, average scale content validity index; κ*, modified kappa statistic.
6.0 Conclusions
Acknowledgement
The authors are thankful to anonymous reviewers whose comments significantly improved this manuscript.
References
[1] Anderson, L., D. Krathwohl, and B. Bloom, A taxonomy for learning, teaching and assessing: a revision of Bloom’s taxonomy of educational objectives. 2001: Longman.
[2] Bloom, B.S., et al., Taxonomy of educational objectives: Handbook I: Cognitive domain. New York: David McKay, 1956. 19: p. 56.
[3] Fuller, U., et al., Developing a computer science-specific learning taxonomy. ACM SIGCSE Bulletin, 2007. 39(4): p. 152-170.
[4] Meerbaum-Salant, O., M. Armoni, and M. Ben-Ari, Learning computer science concepts with scratch. Computer Science Education, 2013. 23(3): p. 239-264.
[5] Santos, A., A. Gomes, and A. Mendes. A taxonomy of exercises to support individual learning paths in initial programming learning. in Frontiers in Education Conference, 2013 IEEE. 2013.
[6] Johnson, C.G. and U. Fuller. Is Bloom’s taxonomy appropriate for computer science? in Proceedings of the 6th Baltic Sea conference on Computing education research: Koli Calling 2006. 2006. ACM.
[7] Teodorescu, R.E., et al., New approach to analyzing physics problems: A Taxonomy of Introductory Physics Problems. Physical Review Special Topics-Physics Education Research, 2013. 9(1): p. 010103.
[8] Kropp, R.P., H.W. Stoker, and W.L. Bashaw, The Validation of the Taxonomy of Educational Objectives. The Journal of Experimental Education, 1966. 34(3): p. 69-76.
[9] Hauenstein, A.D., A conceptual framework for educational objectives: A holistic approach to traditional taxonomies. Vol. 100. 1998: University Press of America Lanham, MD.
[10] ACM and IEEE Computer Society. Computer Science Curriculum 2013. 2013; Available from: http://www.sigart.org/CS2013-EAAI2011panel-RequestForFeedback.pdf.
[11] Edwards-Jones, A., Qualitative data analysis with NVIVO. Journal of Education for Teaching, 2014. 40(2): p. 193-195.
[12] Leech, N.L. and A.J. Onwuegbuzie, An array of qualitative data analysis tools: A call for data analysis triangulation. School psychology quarterly, 2007. 22(4): p. 557.
[13] Hsieh, H.-F. and S.E. Shannon, Three approaches to qualitative content analysis. Qualitative health research, 2005. 15(9): p. 1277-1288.
[14] Airasian, P.W. and H. Miranda, The role of assessment in the revised taxonomy. Theory into practice, 2002. 41(4): p. 249-254.
[15] Mayer, R.E., From Novice to expert, in Handbook of Human-Computer Interaction (Second Edition), M.G. Helander, T.K. Landauer, and P.V. Prabhu, Editors. 1997, North-Holland: Amsterdam. p. 781-795.
[16] Shneiderman, B. and R.E. Mayer, Syntactic/semantic interactions in programmer behavior: A model and experimental results. International Journal of Computer & Information Sciences, 1979. 8(3): p. 219-238.
[17] Lynn, M.R., Determination and quantification of content validity. Nursing Research, 1986. 35(6): p. 382-385.
[18] Turner, R.R., et al., Patient-Reported Outcomes: Instrument Development and Selection Issues. Value in Health, 2007. 10: p. S86-S93.
[19] Davis, F.D., Perceived usefulness, perceived ease of use, and user acceptance of information technology. MIS quarterly, 1989: p. 319-340.
[20] Alaoutinen, S., Evaluating the effect of learning style and student background on self-assessment accuracy. Computer Science Education, 2012. 22(2): p. 175-198.
[21] Ming-Han, L. and G. Rößling, Integrating categories of algorithm learning objective into algorithm visualization design: a proposal, in Proceedings of the fifteenth annual conference on Innovation and technology in computer science education. 2010, ACM: Bilkent, Ankara, Turkey.
[22] Gluga, R., et al. Coming to terms with Bloom: an online tutorial for teachers of programming fundamentals. in Proceedings of the Fourteenth Australasian Computing Education Conference-Volume 123. 2012. Australian Computer Society, Inc.
[23] Athanassiou, N., J.M. McNett, and C. Harvey, Critical thinking in the management classroom: Bloom’s taxonomy as a learning tool. Journal of Management Education, 2003. 27(5): p. 533-555.
[24] Ari, A., Finding Acceptance of Bloom’s Revised Cognitive Taxonomy on the International Stage and in Turkey. Educational Sciences: Theory and Practice, 2011. 11(2): p. 767-772.
[25] Thompson, E., et al. Bloom’s taxonomy for CS assessment. in Proceedings of the tenth conference on Australasian computing education-Volume 78. 2008. Australian Computer Society, Inc.
[26] Amer, A., Reflections on Bloom’s revised taxonomy. Electronic Journal of Research in Educational Psychology, 2006. 4(1): p. 213-230.
[27] Shuhidan, S., M. Hamilton, and D. D’Souza. A taxonomic study of novice programming summative assessment. in Proceedings of the Eleventh Australasian Conference on Computing Education-Volume 95. 2009. Australian Computer Society, Inc.
[28] Johnson, G., et al., Applying the revised Bloom’s taxonomy of the cognitive domain to linux system administration assessments. J. Comput. Sci. Coll., 2012. 28(2): p. 238-247.
[29] Petersen, A., M. Craig, and D. Zingaro, Reviewing CS1 exam question content, in Proceedings of the 42nd ACM technical symposium on Computer science education. 2011, ACM: Dallas, TX, USA. p. 631-636.
[30] Kyllonen, P.C. and V.J. Shute, Taxonomy of learning skills. 1988, DTIC Document.
[31] Wong, G. and H. Cheung, Outcome-Based Teaching and Learning in Computer Science Education at Sub-degree Level. International Journal of Information and Education Technology, 2011. 1(1).
[32] Starr, C.W., B. Manaris, and R.H. Stalvey. Bloom’s taxonomy revisited: specifying assessable learning objectives in computer science. in ACM SIGCSE Bulletin. 2008. ACM.
[33] Gluga, R., et al., Mastering cognitive development theory in computer science education. Computer Science Education, 2013. 23(1): p. 24-57.
[34] Annett, J. and K.D. Duncan, Task analysis and training design. 1967, Department of Psychology, University of Hull: Hull, England.
[35] Johnson, G., et al. Multi-perspective survey of the relevance of the revised bloom’s taxonomy to an introduction to linux course. in Proceedings of the 13th annual conference on Information technology education. 2012. ACM.
[36] Bümen, N.T., Effects of the original versus revised Bloom’s taxonomy on lesson planning skills: a Turkish study among pre-service teachers. International Review of Education, 2007. 53(4): p. 439-455.
[37] Lahtinen, E. A categorization of novice programmers: a cluster analysis study. in Proceedings of the 19th annual Workshop of the Psychology of Programming Interest Group, Joensuu, Finland. 2007.
[38] Hambleton, R.K., et al., Criterion-referenced testing and measurement: A review of technical issues and developments. Review of Educational Research, 1978: p. 1-47.
[39] Waltz, C.F. and B.R. Bausell, Nursing research: design, statistics and computer analysis. 1981: F.A. Davis.
[40] Lawshe, C.H., A quantitative approach to content validity. Personnel psychology, 1975. 28(4): p. 563-575.
[41] Polit, D.F., C.T. Beck, and S.V. Owen, Is the CVI an acceptable indicator of content validity? Appraisal and recommendations. Research in nursing & health, 2007. 30(4): p. 459-467.
[42] Polit, D.F. and C.T. Beck, The content validity index: are you sure you know what’s being reported? Critique and recommendations. Research in nursing & health, 2006. 29(5): p. 489-497.
[43] Grant, J.S. and L.L. Davis, Selection and use of content experts for instrument development. Research in nursing & health, 1997. 20(3): p. 269-274.
[44] Wynd, C.A., B. Schmidt, and M.A. Schaefer, Two quantitative approaches for estimating content validity. Western Journal of Nursing Research, 2003. 25(5): p. 508-518.
[45] Landis, J.R. and G.G. Koch, The measurement of observer agreement for categorical data. Biometrics, 1977: p. 159-174.
