My questions:
1 - What is the best basis for analysis: by theme or by grouped epochs?
2 - Can I use Cohen's kappa to compare each new test's agreement with the gold standard?
3 - Is this formula correct? K = (Po - Pe) / (1 - Pe), where Po = (TP + TN) / total and Pe = probability of a chance agreement on positive + probability of a chance agreement on negative.
4 - Do I have to calculate the mean and the SD?
Thank you in advance.

Alisa,
If each of the two evaluators has to determine which of the 50 species each of a certain number of subjects belongs to, then yes, you would need Cohen's kappa with 50 categories. You don't need a large sample to use Cohen's kappa, but the confidence interval will be quite wide unless the sample size is large enough. You could calculate the percentage of agreement, but that wouldn't be Cohen's kappa, and how you would use that value isn't clear. If you are using ME3L or ME3M, see their agreement. Which kappa should I use to calculate the agreement on their decisions? I hope that makes sense.

Bassam,
You can use Cohen's kappa to determine the agreement between two evaluators A and B, with A being the gold standard. If you have another evaluator C, you can also use Cohen's kappa to compare A with C. I don't know how to use Cohen's kappa in your case with 100 subjects and 30,000 epochs. If the epochs are part of the themes, your data may consist of measurements for the 30,000 epochs. I'm not sure, because I don't know what these epochs represent or how they relate to the themes. Note that another approach for this type of scenario is Bland-Altman analysis, described on the site at www.real-statistics.com/reliability/bland-altman-analysis/
Charles
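Regarding the formula question above, here is a minimal sketch in Python of Cohen's kappa for a binary test compared against a gold standard, directly applying K = (Po - Pe) / (1 - Pe). The function name and the counts are illustrative examples, not data from this thread:

```python
# Minimal sketch of Cohen's kappa for a binary test vs. a gold standard,
# using kappa = (Po - Pe) / (1 - Pe). The counts passed in below are
# hypothetical example values, not data from this discussion.

def cohens_kappa(tp, fp, fn, tn):
    total = tp + fp + fn + tn
    # Observed agreement: both raters positive (TP) or both negative (TN)
    po = (tp + tn) / total
    # Chance agreement: P(both positive by chance) + P(both negative by chance)
    p_pos = ((tp + fn) / total) * ((tp + fp) / total)
    p_neg = ((tn + fp) / total) * ((tn + fn) / total)
    pe = p_pos + p_neg
    return (po - pe) / (1 - pe)

print(cohens_kappa(tp=40, fp=5, fn=10, tn=45))  # 0.70 for this made-up example
```

With these example counts, Po = 85/100 = 0.85 and Pe = 0.225 + 0.275 = 0.50, so kappa = (0.85 - 0.50) / (1 - 0.50) = 0.70.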
Q2 – Is there a way for me to aggregate the data in order to generate an overall measure of agreement between the 2 evaluators for the cohort of 8 subjects?

1. That it is acceptable is your interpretation. Some may agree with you, but others would say that it is not acceptable.
2. Cohen's kappa measures consistency, not importance.
Charles

Fleiss' kappa is a way to measure the degree of agreement between three or more evaluators when the evaluators assign categorical ratings to a number of items.

My questions:
Q1 – I understand that I could use Cohen's kappa to individually determine the agreement between the evaluators for each of the subjects (i.e. establish a statistic for each of the 8 participants). Am I on the right track? Is this the appropriate test?

Hi, I was wondering if you could help me. I have 4 reviewers in total for one project, but only 2 reviewers per article.
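Since Fleiss' kappa is mentioned above for three or more evaluators, here is a minimal sketch assuming the Python statsmodels package is available; the ratings matrix is made-up example data (6 items, 3 raters, 3 categories), not data from this thread:

```python
# Minimal sketch of Fleiss' kappa for three or more raters assigning
# categorical ratings to a set of items. The ratings matrix is made-up
# example data: rows are items, columns are raters, values are category labels.
import numpy as np
from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

ratings = np.array([
    [0, 0, 1],
    [1, 1, 1],
    [0, 1, 0],
    [2, 2, 2],
    [1, 1, 0],
    [0, 0, 0],
])

# Convert the (items x raters) matrix into an (items x categories) count table
table, _ = aggregate_raters(ratings)
print(fleiss_kappa(table, method='fleiss'))
```

Note that Fleiss' kappa assumes each item is rated by the same number of raters, although the raters need not be the same individuals for every item, which is relevant to the setup with 4 reviewers overall but only 2 per article.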