This study is not a mathematical discussion of numerical analysis; rather, it highlights the problems that arise in preparing cast list data for analysis.
Apart from the need to exclude incomplete cast lists and the difficulty of dealing with hybrid characters, the main problems arise from variations in the naming of characters in the sources, since the variant names for each character have to be unified to permit analysis. These variations result mainly from informants and/or collectors using names in the line tags and commentary that do not tally with the names given in the dialogue. Where a character is not named in the dialogue at all, the variation is greater still.
Several techniques and aids that help to resolve these problems are described, with examples and statistical data drawn from Nottinghamshire plays.
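
The unification of variant names can be illustrated with a minimal sketch, assuming a hand-built lookup table. The table `VARIANTS` and the functions `unify` and `character_counts` are hypothetical names introduced here for illustration, and the entries are not taken from the study's data. Whether two names really denote the same character (treating "King George" as a variant of "Saint George", for instance) is an editorial judgement that the table merely records.

```python
from collections import Counter

# Hypothetical lookup table: variant name (lower-cased) -> unified name.
# Entries are illustrative only and are not drawn from the study's data.
VARIANTS = {
    "st george": "Saint George",
    "saint george": "Saint George",
    "king george": "Saint George",  # assumed equivalence, an editorial decision
    "belzebub": "Beelzebub",
    "beelzebub": "Beelzebub",
}


def unify(name):
    """Return the unified name for a variant, or the cleaned name if unmapped."""
    # Drop full stops and collapse runs of whitespace before the lookup.
    cleaned = " ".join(name.replace(".", "").split())
    return VARIANTS.get(cleaned.lower(), cleaned)


def character_counts(cast_lists):
    """Count the number of cast lists in which each unified character appears."""
    counts = Counter()
    for cast in cast_lists:
        # A set ensures a character is counted once per cast list,
        # even if it appears there under several variant names.
        counts.update({unify(name) for name in cast})
    return counts
```

Names missing from the table pass through unchanged, so they can be reviewed by hand before any frequencies are reported.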