Comparative values, correlation and classification of basketball players based on the efficiency index and expert evaluation by coaches

Measuring the efficiency of athletes during competition has interested sports experts and scientists for more than a hundred years. Basketball recognized in the 1940s how important it is to analyze efficiency indicators, because these procedures allow coaches to increase their knowledge. There are two basic methods, objective and subjective, for evaluating the efficiency, or actual quality, of basketball players. The aim of this research is to establish the level of correlation between these two methods and to identify clusters, i.e. a player hierarchy, based on the results of both methods of efficiency evaluation. The sample of subjects consisted of 12 basketball players who participated in the 2010 FIBA World Championship in Turkey. The subjective evaluation, also called expert evaluation, was performed by coaches of seven national teams that participated in the Championship. The objective evaluation was performed using the EEF efficiency index. The data were processed using z-scores, the Pearson correlation coefficient, and hierarchical cluster analysis. The Pearson coefficient of linear correlation between the efficiency index and the expert evaluation is r = 0.859, statistically significant at p ≤ 0.01. The cluster analysis distinguished two groups of players, which were named quality and super quality. The analysis of variance showed that the probability of the clusters being equal is below 0.01 (reported as p = 0.00). The research has shown that the evaluation by coaches is relevant and almost fully consistent with the efficiency index formula. Also, the distinction of two groups of players by clustering is not uncommon in basketball practice and reflects efficiency at the given time.


Introduction
Measuring athlete efficiency is the subject of numerous academic papers [1]. In sports, various forms of notational analysis are used to that end. The analysis is done by evaluating the competitive efficiency of successful and unsuccessful teams and athletes as the result of examining various data collected during a game [2]. It includes the analysis of player movement during a game and the evaluation of their technique and tactics against the collected indicators of situational efficiency [3,4]. In the early days, events in the field were recorded manually with various notational abbreviations, while the mid-1980s introduced computerized notational systems [5]. This data recording tradition is more than a hundred years long. The first manual data recording system was used in baseball by Hugh Stuart Fullerton [6].
It is fair to say that basketball is one of the major sports, with a long history of notational analysis [7]. The first papers on this topic in basketball were published by Lloyd Lowell Messersmith. He published research papers on the notational analysis of the distance basketball players traverse during games in the period of 1931-1944 [8][9][10][11][12][13]. This inspired others to tackle the same or similar issues [14,15]. For instance, Elbel and Allen (1941) suggested a method of assessing individual and team performance based on recording events during a game (performance factor) with a positive or negative impact on the final outcome of the game [16].

Milenko Vojvodić is an associate professor at the Faculty of Physical Education and Sport, University of Banja Luka, giving courses in Methodology of Research in Sports, Sports Statistics, and Kinesiometrics. He was a visiting lecturer in other major universities in Bosnia and Herzegovina. For a period of time, he worked as a teaching assistant for the course unit Volleyball at the Faculty of Physical Education and Sport in Banja Luka, and he was also engaged as an expert associate by several volleyball premier league clubs.
Dražen Dizdar defined two elementary methods for assessing overall efficiency, or actual quality of a basketball player [19]. The first comprises procedures for objective assessment of the situational efficiency of basketball players based on the box score, the assessment of their technique and tactics against the collected statistical indicators of situational efficiency. The second method includes procedures for subjective assessment of situational efficiency of basketball players by experts on the sport. The same author then added a third method and named it synthesis (combination) of the two approaches [20].
Today, basketball is a "wonderful sport for statistics" because after each game a box score is made available, which "provides for each player and each team, quantitative information about 15 variables" [17]. In the words of Víctor Blanco, Román Salmerón and Samuel Gómez-Haro: "One of the main differences between basketball teams and other sports comes from the availability of information" [18]. Jose A. Martinez lists over 200 systems for objective assessment of situational efficiency of basketball players [21]. In his 1996 study called Total Basketball Proficiency Score (TBPS), H. Key, one of the pioneers in the assessment of player efficiency, mathematically determines the values of routinely monitored performance indicators [22]. J. Gomez and J. A. Moll compiled Individual Efficiency at Games (IEG), or Rendimiento Individual en los Partidos (RIP), as an attempt to devise an analysis method which is not based on the number of points scored by the player [23]. S. Garba modified the formula designed by Velkov in 1974 and presented it as the Individual Efficiency Coefficient, or Coeficiente de Eficacia Individual (CEI), which essentially adjusts a player's efficiency for minutes played [24]. D. Bradshaw provided a very similar formula [25]. Dave Heeren offered the Tendex coefficient for assessing the efficiency of basketball players, based on the work of Roberto Azar, who devised the eBA Staff system for assessing individual and team efficiency in basketball [26]. In the following years, a separate Tendex formula was developed for offensive rating [27], and then for defensive rating as well [28]. Dale Brown focused exclusively on defensive efficiency through the Defensive Intensity Chart (DIC), aiming to measure the value of a player's defensive plays in a match [29]. Kenneth Swalgin and Damir Knjaz provided a tool for a more objective analysis of the player efficiency measurements, known as the Basketball Evaluation System (BES) [30][31][32].
Jean-Francis Gréhaigne, Daniel Bouthier, and Paul Godbout proposed a player efficiency assessment procedure for collective sports (basketball, handball, football, soccer and volleyball), comprising two indices: efficiency and scope of play [33]. Slavko Trninić, Anto Perica, and Dražen Dizdar proposed nineteen criteria for the assessment of the overall situational efficiency in individual and team performance of a given player, after which a number of criteria were replaced with appropriate variables of situational efficiency [34]. The assistant coach of the East Wake Zebolun team is developing a concept he called Points Responsible (PR) [35]. Based on Heeren's Tendex formula, Mays Consulting Group developed a complex efficiency coefficient, which they called Magic Metric (MM). Among the more recent proposals is the IBM Watson Research Centre efficiency coefficient, developed in cooperation with the NBA technical commission and called MVPIBM [35]. Having performed regression analysis on 22 seasons of statistical data from the NBA, D. Berri (2008) concluded that a basketball player's efficiency can be expressed through a simple index called the Win Score [36]. To adequately assess a player's efficiency on the court, NBA teams of today use various forms of advanced notational analysis. Some examples: the NBA Efficiency Formula, which is used to assess a player's contribution to the team; the Player Efficiency Rating (PER), developed by John Hollinger, which rates players relative to minutes played; Win Shares, an analysis of a player's contribution to the team's victories, adapted and developed for basketball by Justin Kubatko, based on Bill James's baseball formula; and Adjusted Plus-Minus, derived from a statistic used in ice hockey, which tracks the team's point differential while a player is on the court [37,38].
Joško Sindik, Igor Jukić, and Maja Adžija determined that the distribution of standard and derived parameters of situational efficiency is in line with the distribution of events during a basketball match, with a statistically significant correlation [39]. The issue of basketball player efficiency has been gradually shifting from the field of sport to the field of economics, and to other sciences as well. As a result, there are volumes of research on efficiency in basketball using the popular Data Envelopment Analysis [40][41][42][43].
A closer look at these systems shows the prevalence of: the simple linear combination, the z-score simple linear combination, partially weighted linear combinations, the absolute and relative success rate of a basketball player, the MVP assessment of player usefulness, the Swalgin system for player assessment, and the PC system for assessment of player efficiency [19].
In other studies so far, the constraints of measurement instruments for direct measurement of basketball player quality resulted in the use of subjective assessment of player quality, based on evaluation by independent basketball experts. They were given a measurement scale (usually 1 to 5) to assess player performance, applying one or more criteria [44]. E. Sorak attempted to assess basketball player quality by awarding points for individual elements of situational efficiency according to their importance [45]. M. Brooks, L. Boleach, and J. Mayhew used expert evaluation to analyse the assessment of events at a basketball game and basketball player performance [46]. Brane Dežman researched potential basketball player efficiency using various models of expert systems [47]. Frane Erčulj compared coaches' and assistant coaches' subjective evaluation with efficiency calculated using the efficiency index on a sample of 12 female basketball players. Swalgin researched the validity of the two models of assessment of basketball player situational efficiency. Eighteen basketball coaches assessed the total efficiency of 45 NCAA players on the Likert scale [48]. Slavko Trninić, Dražen Dizdar, and Brane Dežman [49] carried out a similar study on a sample of 60 basketball players from 12 clubs of the Croatian basketball premier league in the 1998/99 season. The research was carried out using standardized situational efficiency data and subjective assessment by 10 basketball coaches who had led the teams during that season [48]. S. Jakovljević, M. Karalejić, and I. Radovanović examined the relation between the two methods of assessment of actual quality of basketball players: expert evaluation (EE) and quality index (INK) [50]. Jose A. Martinez (2012) points out that identifying the optimal method of assessment of basketball players is turning into the quest for the Holy Grail [51].
The reason can primarily be traced to the nonlinearity of relations between efficiency and multidimensionality, as well as the unpredictability of player behaviour in specific, constantly fluctuating circumstances during matches [52]. As a consequence, new criteria systems are constantly being devised to aid in the selection and development of players, in the selection of efficient and safe training technologies, as well as in the selection of strategic and tactical ideas which would yield the expected results [53]. This quest may have been best described by the great coach Pat Riley when he said that not all skills can be measured mechanically, but that he was sure that they were all measurable in one way or another and that the events observed and noted during matches can be expressed in numbers [54].
The aim of this research is to ascertain the degree of correlation between the efficiency index (EEF) as an objective indicator of basketball player efficiency and their success as derived from subjective expert (coach) evaluation (EE), and to ascertain the presence of clusters, i.e. a hierarchy of players based on both methods of efficiency assessment. The research was carried out on a sample of pre-eminent basketball players during a top-tier basketball competition.

The sample of subjects
The sample of subjects consisted of 12 basketball players who participated in the 2010 FIBA World Championship in Turkey: Luis Scola, Linas Kleiza, Marcelo Huertas, Kevin Durant, Miloš Teodosić, Boštjan Nachbar, Hidayet Türkoğlu, Tiago Splitter, Juan Carlos Navarro, Nenad Krstić, Chauncey Billups and Robertas Javtokas. These players were nominated as the best players at the World Championship by the coaches who took part in the research. The XVI FIBA World Championship was held between August 28 and September 12 in four cities: Istanbul, Ankara, Izmir, and Kayseri.

Variables and methods of data collection
The expert evaluation was performed by coaches of seven national teams that participated in the Championship. The coaches were instructed to pick the best five players in the competition and to rank them from first to fifth place. The highest-ranking player was, in their opinion, the best player at the World Championship. The ranking of the other players was also determined according to the quality of their performance. The assessment was carried out by the coaching staff of the following national teams: Argentina, Lithuania, Serbia, Slovenia, Spain, Turkey, and USA.
The objective assessment of situational efficiency was carried out using the EEF formula. The formula is used at the official NBA website (http://www.nba.com/statistics/efficiency.html) to calculate individual player efficiency [36].
FGA - Field goals attempted; FGM - Field goals made; FTA - Free throws attempted; FTM - Free throws made; TO - Turnovers.
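For orientation, the efficiency formula in the form commonly published for NBA statistics can be sketched as follows. This is an assumption, not a reproduction of the formula as printed in the study: the abbreviations PTS, REB, AST, STL and BLK do not appear in the surviving legend, and the text does not state whether totals or per-game averages were used, so the function simply works with whatever values it is given.

```python
def eef(pts, reb, ast, stl, blk, fga, fgm, fta, ftm, to):
    """Efficiency index in its commonly published NBA form (assumed here).

    Positive contributions are added; missed shots and turnovers are subtracted.
    """
    missed_field_goals = fga - fgm
    missed_free_throws = fta - ftm
    positives = pts + reb + ast + stl + blk
    negatives = missed_field_goals + missed_free_throws + to
    return positives - negatives


# Hypothetical stat line for illustration only; not data from the study.
example = eef(pts=20, reb=8, ast=3, stl=1, blk=1, fga=15, fgm=8, fta=6, ftm=4, to=2)
print(example)  # 33 positive events minus 11 negative events
```
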

Methods of data processing
The data were processed in several steps. The EEF and EE variables were first standardized into z-scores (z EEF and z EE). The two z-scores were then added together with the constant 10 to form the combined variable Sum z+10.
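The standardization and the combined variable can be sketched as below. The sketch assumes the sample standard deviation (n - 1 in the denominator), since the text does not say which variant was used; the function names and the input values are hypothetical.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardize values to mean 0 and (sample) standard deviation 1."""
    m = mean(values)
    s = stdev(values)  # assumption: sample SD; the paper does not specify
    return [(v - m) / s for v in values]


def sum_z_plus_10(eef_values, ee_values, constant=10):
    """Combined score: z(EEF) + z(EE) + constant, computed player by player."""
    z_eef = z_scores(eef_values)
    z_ee = z_scores(ee_values)
    return [a + b + constant for a, b in zip(z_eef, z_ee)]


# Hypothetical values for four players; not data from the study.
eef_values = [24.0, 20.5, 18.0, 15.5]
ee_values = [30, 22, 12, 8]
combined = sum_z_plus_10(eef_values, ee_values)
print([round(c, 2) for c in combined])
```

Adding a constant shifts every combined score by the same amount, so it leaves the ranking unchanged and only keeps the values positive.
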
For the purpose of ascertaining the correlation between variables EEF and EE, the Pearson coefficient of linear correlation r was calculated.
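A minimal sketch of the Pearson coefficient, written from its textbook definition (covariation divided by the product of the two variations). The function name and the sample values are illustrative and not taken from the study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson coefficient of linear correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of deviations from the means
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / sqrt(s_xx * s_yy)


# Hypothetical values; a perfectly linear relation gives r = 1.
print(pearson_r([1, 2, 3], [2, 4, 6]))
```
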
The question of whether there is a super quality group among the top 12 players was answered using hierarchical cluster analysis with the squared Euclidean distance as the distance measure.
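The clustering step can be illustrated with a small agglomerative procedure. The squared Euclidean distance comes from the text; the linkage rule is not stated there, so average (between-groups) linkage is assumed here purely for illustration, and the scores are hypothetical.

```python
def squared_euclidean(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def average_linkage(c1, c2, points):
    """Mean pairwise distance between two clusters (assumed linkage rule)."""
    total = sum(squared_euclidean(points[i], points[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def agglomerate(points, n_clusters=2):
    """Merge the two closest clusters repeatedly until n_clusters remain."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        _, a, b = min(
            (average_linkage(clusters[a], clusters[b], points), a, b)
            for a in range(len(clusters))
            for b in range(a + 1, len(clusters))
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


# Hypothetical combined scores for seven players; not data from the study.
scores = [(12.9,), (12.1,), (11.8,), (9.9,), (9.6,), (9.4,), (9.0,)]
print(agglomerate(scores))  # lists of player indices, one list per cluster
```

Stopping at two clusters mirrors the two-group solution described in the paper; a different linkage rule could produce a different split on real data.
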

Results
In Table 1, column 4 ranks the players based on the results in column 2, and column 5 ranks them based on the results in column 3. Column 6 contains the standardized values of the results in column 2, and column 7 contains the standardized values of the results in column 3. Standardization, i.e. transformation of the original data into z-scores, enabled further statistical procedures. The values in column 8 are obtained by adding the values from columns 6 and 7 and then adding the constant 10. Finally, column 9 presents the ranking based on the values in column 8. Table 2 presents the basic indicators of a taxonomical analysis based on the efficiency index (EEF), on expert evaluation (EE), and on the transformed summary values (Sum z).
Taxonomical analysis yielded two groups, which we provisionally called super quality and quality. Table 3 shows which players belong to which cluster. Figure 1 shows an approximation of the total clustering.

Discussion
One of the problems in sports that draws the attention of researchers is how to objectively measure the game efficiency of both individual players and teams as a whole. This problem has been present in sports literature for a long time [16,33,44,52,55,56]. Slavko Trninić, Vladan Papić, Viktorija Trninić, and Damir Vukičević point out that the processes of assessment of the total potential and actual quality of a player, selection of a team, and selection of tactics are on-going, aiming to ensure the maximum level of player skill and success in the sport. In top-tier professional sport teams, the leader of these processes is the coach, with their coaching staff and external associates [57]. Contemporary literature distinguishes two methods of assessment of the actual quality of basketball players [19]. Although a lot of research has been published in the past 70 years dealing with issues of assessment of athletes in team sports, not many studies have compared the two methods [20].
In our research, the Pearson coefficient of linear correlation (r) between the efficiency index EEF and the expert evaluation EE is .859, statistically significant at p ≤ .01. The probability of error is therefore below 1%, indicating a high level of agreement between efficiency evaluation using a formula (EEF) and the subjective basketball expert evaluation (EE). These values answer the basic aim of the research, which concerns the correlation between the efficiency index as a relevant efficiency value and the subjective efficiency evaluation by experts, in particular the coaches of the aforementioned national teams. For readers less familiar with statistics: the study sample is relatively small, and the smaller the sample, the higher the correlation coefficient must be to reach statistical significance, and vice versa. Given that the theoretical range of the Pearson coefficient of linear correlation is -1 ≤ r ≤ 1, a coefficient of .859 is very high, and it remains statistically significant despite the small sample size.
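The significance claim can be checked with the usual t-test for a correlation coefficient, t = r·sqrt(n − 2) / sqrt(1 − r²) with n − 2 degrees of freedom. The sketch below assumes this standard test and a two-tailed critical value of 3.169 for 10 degrees of freedom at the .01 level, taken from standard t tables; the paper does not say which procedure was used.

```python
from math import sqrt


def t_from_r(r, n):
    """t statistic for testing a Pearson r against zero, with n - 2 degrees of freedom."""
    return r * sqrt(n - 2) / sqrt(1 - r ** 2)


r, n = 0.859, 12                 # values reported in the study
t_value = t_from_r(r, n)
T_CRITICAL_DF10_P01 = 3.169      # two-tailed, alpha = .01, df = 10 (standard tables)

print(round(t_value, 2))
print(t_value > T_CRITICAL_DF10_P01)
```

Under these assumptions the statistic comes out at roughly 5.3, well above the critical value, which agrees with the reported significance at p ≤ .01 even for 12 players.
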

Figure 1 Plot of Means for each Cluster
Consequently, the evaluation by basketball experts is clearly relevant and corresponds almost fully to the EEF formula, and there should be no doubt about the objectivity of these experts or their ability to distinguish quality. These results were expected. Looking back, we see that it was basketball experts who first defined the EEF formula, based on their long, successful and prolific careers as both players and coaches.
Comparing the variables of player situational efficiency with expert evaluation, Swalgin ascertained a high level of compatibility in six to eight variables [30]. He reached similar results in his next study, where he also attempted to ascertain the validity of the two models of evaluating basketball players' situational efficiency. The study had a group of top-tier coaches (n = 18) assess the total efficiency of 45 NCAA basketball league players using the Likert scale. The results of the research showed that both proposed BES (Basketball Evaluation System) models of objective assessment correlate with the coaches' evaluations of situational efficiency [58]. Trninić et al. combined these two methods of basketball player assessment by expanding the existing set of variables with 7 additional variables for the evaluation of basketball player efficiency.
That was the basis for obtaining expert evaluation (EE), which was used to develop a Combined Criteria Model [34]. S. Jakovljević, M. Karalejić, and I. Radovanović also compared two methods of evaluating the individual quality of basketball players. The research covered 44 professional basketball players who played in the First YUBA League of Yugoslavia in the 2001/2002 season. The expert evaluation (EE) in that paper was derived from the assessments of 5 basketball experts, while the quality index (INK) was derived from official statistical data on basketball player situational efficiency. The authors ascertained that the correlation between these two evaluation methods was medium (r = .643; p < .01) [50].
The underlying problem of expert evaluation (EE) is the selection of experts. In this particular research, the experts who agreed to participate come from teams that finished in the top eight positions in this World Championship: USA (1st), Turkey (2nd), Lithuania (3rd), Serbia (4th), Argentina (5th), Spain (6th), and Slovenia (8th). The seventh-ranked national team, Russia, did not provide their opinion. The FIBA ranking after the 2014 FIBA Basketball World Cup shows that the national teams that provided the data are in fact top-tier teams: USA (1st), Spain (2nd), Argentina (3rd), Lithuania (4th), Serbia (7th), Turkey (8th) and Slovenia (13th). Aside from the already mentioned national team of Russia (6th place in the FIBA Ranking Men), the following teams are missing from the data: France (5th), Brazil (9th), Greece (10th), Australia (11th) and Croatia (12th). All these teams took part in the FIBA World Championship 2010, but ranked in the lower half of the table (Brazil 9th, Australia 10th, Greece 11th, France 13th, and Croatia 14th), so their opinion could not have been taken as relevant, since these teams were eliminated from further competition in the round of 16. This is important because many deficiencies have been identified when devising procedures for objective evaluation of actual player quality in team sport games [20].
A study by Dizdar ascertained that the upper limit of the prognostic ability of 13 indicators of situational efficiency in assessing actual quality of basketball players was at 77%, and that the methods used to evaluate the total situational efficiency explained between 38 and 67% of the total actual quality of basketball players. In addition, it can be said that there are no comparable studies for any other sport games [19]. This brings Dizdar to the conclusion that at this stage of development of the sport, subjective evaluation by sports experts is a much more suitable method of assessing actual quality of players in sport games [20]. The authors of this paper completely agree with this statement. It could be said that the "statistical tool" recognizes only events. At the same time, it is unable to register their timing (accuracy and timeliness). In other words, it is unable to register the spatial and temporal parameters of the events. The sheer complexity of the basketball game makes it difficult, or rather impossible, for "statistics" to recognize inadequate "reading" of the game by a player: for instance, an ill-timed pass to a "poorly" positioned player, or a pass to the best-positioned player, but with significantly poor timing. All of this confirms the stated opinions that statistics merely registers events, but without their important parameters. The same findings arose from other similar research [59][60][61].
Such an approach, which is the only correct one in the eyes of the authors of this paper, indicates that, at least for now, there is no viable alternative to expert evaluation. On the other hand, statistical monitoring of situational efficiency parameters can only partially support expert evaluation. For this reason, Dizdar [20] recommends the AHP (Analytical Hierarchy Process) method [62] for resolving this problem (evaluation of actual player quality in team sports), as very suitable in terms of simplicity and applicability.
The secondary aim of this research was to ascertain the presence of subgroups, i.e. whether the group of 12 basketball players derived and verified according to the EEF index and the expert evaluation (EE) comprises subgroups of higher and lower quality. This question was answered using cluster (taxonomical) analysis. All available variants of cluster methods were applied. The cluster methods were alternated against the criteria of categorical and continuous variables. The variables to which cluster analysis was applied were: the EEF efficiency index, the EE basketball expert evaluation, and the transformed standardized summaries of these two variables. In other words, clustering was applied to variables from Table 1, columns 2, 3 and 9.
The clustering algorithm was the same for all cases. Given the substantial number of matrices and quantitative indicators, they were filtered and consolidated, i.e. reduced to the most relevant taxonomical indicators.
Based on the numerical indicators derived in all variants, the 12 top players, who had been treated as a single group of uniform quality, were divided into two groups, which could be called super quality and quality. Table 3 shows which players belong to which cluster. Depending on the variable, the first cluster includes 3 or 4 players and the second one 8 or 9 players. Overall, the ratio is roughly 3:9 in absolute terms, or 25% to 75% in relative terms.
The statistical significance of the clustering is confirmed by the values in Table 4. The difference between the two clusters was tested through ANOVA, with the differences expressed through the F-statistic, which is statistically significant. In all cases, the probability of the clusters being equal is below .01 (reported as p = .00).
This ratio of super quality players to quality players is not rare in basketball practice. It should be noted that the value of such a classification must be limited exclusively to one event and one time period, i.e. the duration of the championship, because it is the product of player efficiency at the given point in time and is subject to change in a different interval of time. This assumption of the authors is not revolutionary, but, given the sensitivity of the game of basketball itself, it is certainly correct.
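The F-statistic used to compare the two clusters can be sketched as a one-way analysis of variance: the variance between cluster means divided by the variance within clusters. The function name and the scores are hypothetical and serve only to show the computation, not to reproduce Table 4.

```python
def one_way_anova_f(groups):
    """F statistic for a one-way ANOVA over a list of groups of numbers."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]

    # Variation of group means around the grand mean, weighted by group size
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Variation of individual values around their own group mean
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, group_means) for v in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical cluster scores; not data from the study.
super_quality = [12.9, 12.1, 11.8]
quality = [9.9, 9.6, 9.4, 9.0]
print(round(one_way_anova_f([super_quality, quality]), 2))
```

The larger the F value relative to its critical value for the given degrees of freedom, the smaller the probability that the cluster means are equal.
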

Conclusion
The basketball of today uses two basic methods for assessing situational efficiency: the objective assessment based on statistical records, and the subjective assessment provided by basketball experts. The results of our research indicate a high level of consistency between these two methods of assessment on a sample of pre-eminent basketball players during a top-tier basketball competition. In addition, the study, conducted for this competition, yielded two clusters (two groups) of players, which we called super quality and quality, which is not a rare occurrence in basketball practice.