Views of New Testament Textual Space (DRAFT)


Table of Contents

1. Abstract
2. Introduction
3. Sources
4. Data Sets and Analysis Results
5. Discussion
5.1. Gospels
5.1.1. Matthew
5.1.1.1. INTF
5.1.1.2. UBS
5.1.1.3. Brooks
5.1.1.4. Ehrman
5.1.1.5. Racine
5.1.1.6. Wasserman
5.1.1.7. Summary
5.1.2. Mark
5.1.2.1. INTF
5.1.2.2. UBS
5.1.2.3. Hurtado
5.1.2.4. Mullen
5.1.2.5. Wasserman
5.1.3. Luke
5.1.4. John
5.2. Acts and the General Letters
5.2.1. Acts
5.2.2. James
5.2.3. 1 Peter
5.2.4. 2 Peter
5.2.5. 1 John
5.2.6. 2 John
5.2.7. 3 John
5.2.8. Jude
5.3. Paul's Letters
5.3.1. Entire Collection
5.3.2. 2 Corinthians
5.3.3. Hebrews
5.4. Revelation
6. What Difference Does It Make?
7. Acknowledgments
Bibliography
Note

This is a draft.

Multivariate analysis of textual variation among witnesses of the New Testament reveals features of the textual space constituted by those witnesses. Originally written in Greek, the New Testament was copied by hand for almost fifteen centuries until the advent of mechanized printing provided an alternative means of propagation. Translations into other languages were produced as well. Some of these, such as the Latin, Coptic, Syriac, and Armenian versions, appeared early and so provide insights into ancient states of the text. Patristic citations form another class of evidence which allows varieties of the text to be associated with particular localities and epochs.

As with every widely read work from antiquity, the New Testament exhibits textual variation which has been introduced by scribes and correctors. Sites where textual variation occurs are identified by comparing extant witnesses of the text.[1] Alternative readings at a variation site may be classified as orthographic or substantive. Orthographic variations are often ignored as they only affect the surface forms of words (e.g. spelling) and not their meaning. Substantive variations do affect meaning: they are called variants. The list of witnesses which support a particular reading of a particular variation site is known as the reading's attestation. A list of all readings at a variation site along with the attestation of each reading is called a variation unit.

There is an ongoing effort to establish the initial text which stands behind the range of texts found among surviving witnesses of the New Testament.[2] The most important witnesses for establishing the initial text fall into these categories:

  • Greek manuscripts

  • ancient versions

  • patristic citations.

Greek manuscripts are the primary witnesses to the text of the New Testament. Ancient versions are early translations of the Greek text into languages such as Latin, Coptic, Syriac, and Armenian. It is often possible to establish which Greek variant a version supports by translating its text at a variation site back into Greek. Patristic citations are quotations of the scripture by Church Fathers. Which variant was in a Church Father's copy of the text at a particular variation site can often be discerned if that part of the text is covered by one of his quotations.

A large proportion of the textual evidence disappeared long ago. In general, the older the copy, the more likely it is to have been lost. Even the most comprehensive data sets are mere samples of what once existed. Results obtained by analysis of these data sets are therefore provisional because it is always possible that including further data would produce different results. Nevertheless, it is reasonable to expect that analysis results derived from a sufficiently large sample are approximately the same as results that would be obtained if a more comprehensive data set were analysed.

Even though much is lost, a stupendous amount of evidence remains. There are many thousands of manuscripts in Greek, Latin, Syriac, Armenian, and other languages. Patristic citations are also very numerous. Given such a great cloud of witnesses, it can be difficult to see where each one stands in relation to the others. Fortunately, various methods of statistical analysis can be applied to data sets which relate to textual variation in order to explore relationships among the witnesses.

Analysis might begin from a number of starting points. One is a critical apparatus which gives attestations (i.e. lists of witnesses) in support of variants found at variation sites. The information contained in the apparatus must first be encoded as illustrated by reference to this entry from the fourth edition of the United Bible Societies' Greek New Testament (UBS4):


The data sets presented in this article use a number of encoding conventions. Exotic characters and superscripts can cause problems when plotting analysis results so witness identifiers (i.e. sigla) are Romanized and hyphens indicate where superscripts occur. Apart from these changes, the source's method of identifying witnesses is usually retained. This has the potential to cause confusion if two sources use different identifiers for the same witness. For example, Codex Sinaiticus may be identified as Aleph or 01. Also, the critically established text used in the INTF's Editio Critica Maior may be referred to as A (for Ausgangstext), making it easy to confuse with the A often used to represent Codex Alexandrinus.

The textual states of witnesses included in an apparatus entry may be encoded with numerals, letters, or other symbols. In this example, witnesses in the attestation list for the first variant are assigned the code 1, those in the second variant's list are given the code 2, and so on. The state of a witness is classified as undefined and given a code of NA (for not available) when it is not clear which variant the witness supports. For manuscripts this may be due to physical damage or because the manuscript does not include the section of text being examined; for versions, it may not be clear which state of the Greek text is supported by a back-translation of the version; for patristic citations, the reading of a Church Father's text may be unclear if the quotations are not exact (e.g. adaptations, allusions, or quotations from memory) or if different witnesses of the Church Father's text have different readings. In the present example, a number of versions (Latin, Syriac, Coptic) and patristic citations (e.g. those of Irenaeus, Ambrose, Chromatius, Jerome, and Augustine) are treated as undefined because it is not clear which variant each supports at this variation site.[3]


Encoded variants are entered into a data matrix which has a row for every witness and a column for every variation site. Manuscript correctors are treated as separate witnesses, as are supplements. Witnesses that are not specifically mentioned in a UBS4 apparatus entry are treated as undefined (NA):
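The encoding and tabulation steps can be sketched in a few lines. This is an illustrative Python sketch rather than the R code used for the study. The sigla, the sample entry, and the function names encode_entry and build_data_matrix are hypothetical and do not transcribe any actual UBS4 apparatus entry. None stands in for NA.

```python
# Hypothetical apparatus entry: one attestation list per variant, in apparatus order.
entry = [
    ["Aleph", "B", "L"],   # first variant  -> code 1
    ["A", "W", "Byz"],     # second variant -> code 2
]
unclear = ["it-a", "Irenaeus"]  # cited, but the supported variant is not clear


def encode_entry(variants, undefined=()):
    """Map each witness to the 1-based number of the variant it attests (None = NA)."""
    codes = {}
    for code, attestation in enumerate(variants, start=1):
        for witness in attestation:
            codes[witness] = code
    for witness in undefined:
        codes[witness] = None
    return codes


def build_data_matrix(encoded_entries):
    """One row per witness, one column per variation site; unmentioned witnesses get None."""
    witnesses = sorted({w for e in encoded_entries for w in e})
    return {w: [e.get(w) for e in encoded_entries] for w in witnesses}


matrix = build_data_matrix([
    encode_entry(entry, unclear),
    encode_entry([["Aleph"], ["B"]]),  # a second hypothetical variation site
])
```

Here matrix["A"] holds code 2 for the first site and None for the second, because A is not mentioned in the second hypothetical entry.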


The next step is to construct a distance matrix which tabulates the simple matching distance between every possible pair for the set of witnesses under examination. The simple matching distance between two witnesses is the proportion of disagreements between them in those variation units where the textual states of both are defined. Being a ratio of two pure numbers, this quantity is dimensionless (i.e. has no unit). It varies from a value of zero for complete agreement to a value of one for no agreement between the two witnesses.[4] A pair of witnesses only qualifies for inclusion in a distance matrix if both members of the pair share a minimum number of variation units at which the states of both are defined. This minimum requirement is intended to reduce sampling errors to a tolerable level. In the analysis performed here, the minimum required number is usually set at fifteen.[5]
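As a minimal sketch of this calculation (in Python, not the R code used for the study), the function below returns the simple matching distance between two rows of a data matrix, or None when the pair shares too few defined variation units. The function name and the sample rows are hypothetical. The default minimum of fifteen follows the text.

```python
def simple_matching_distance(a, b, min_shared=15):
    """Proportion of disagreements among variation units where both states are defined."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if len(shared) < min_shared:
        return None  # too few shared units: the pair does not qualify
    return sum(x != y for x, y in shared) / len(shared)


# Hypothetical rows; None marks an undefined (NA) state.
row_a = [1, 1, 2, None, 1, 2]
row_b = [1, 2, 2, 1, None, 2]

# Four units are defined in both rows and one of them disagrees.
distance = simple_matching_distance(row_a, row_b, min_shared=3)
```

With the default minimum of fifteen, the same pair of rows would be excluded from the distance matrix.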


A wide variety of multivariate analysis techniques are available to explore relationships between the objects whose distances from each other are tabulated in a distance matrix. Other generic terms such as observation, case, or item may also be used when referring to the objects being compared, which in this article are New Testament textual witnesses. Only three analysis techniques are presented here, namely classical multidimensional scaling (CMDS), divisive clustering (DC), and partitioning around medoids (PAM). My article on “How To Discover Textual Groups” provides an introduction to these techniques. While each method has its own merits and potential weaknesses, the ones used here serve to introduce results obtained when exploratory multivariate analysis is applied to distance matrices based on textual variation among New Testament witnesses. All of the statistical analysis in this study is performed using the R Language and Environment for Statistical Computing. The relevant programs are available here.

Classical multidimensional scaling finds the set of object coordinates which best reproduces the actual distances between objects in the distance matrix. A plot of these coordinates shows how the objects are disposed with respect to one another when all distances are considered. This study refers to such a plot as a map and uses the term textual space for the space obtained when the objects are textual witnesses. Achieving a perfect spatial representation of a distance matrix may require any number of dimensions up to one less than the number of objects. This presents a problem when a large number of objects is being examined because our spatial perception is three-dimensional. Fortunately, three dimensions are often sufficient to achieve a reasonable approximation to the actual situation. The CMDS analysis produces a coefficient called the proportion of variance which is the proportion of distance matrix information explained (i.e. accounted for) by a map. This coefficient ranges from a value of zero to one, with a value of one indicating that the map is a perfect representation of the actual distances.
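The following Python sketch shows one way the coordinates and the coefficient can be obtained: the squared distances are double-centred and the leading eigenvectors are extracted by power iteration. It is not the R code used for the study. The function name and sample distances are hypothetical. Two assumptions are built in: the leading eigenvalues are taken to be positive, and the coefficient is computed as the retained eigenvalues divided by the trace, which matches the usual definitions only when no eigenvalue is negative.

```python
import math


def classical_mds(dist, dims=2, iters=200):
    """Return (coords, proportion) for a symmetric distance matrix (list of lists)."""
    n = len(dist)
    # Double-centre the squared distances to obtain the matrix B.
    sq = [[d * d for d in row] for row in dist]
    row_mean = [sum(row) / n for row in sq]
    grand_mean = sum(row_mean) / n
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
          for j in range(n)] for i in range(n)]
    trace = sum(b[i][i] for i in range(n))  # equals the sum of all eigenvalues of B
    coords = [[] for _ in range(n)]
    kept = 0.0
    for k in range(dims):
        # Power iteration for the dominant eigenvector of the (deflated) matrix.
        v = [math.sin(i + k + 1.0) for i in range(n)]  # arbitrary deterministic start
        for _ in range(iters):
            w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:  # nothing left to explain
                break
            v = [x / norm for x in w]
        length = math.sqrt(sum(x * x for x in v))
        v = [x / length for x in v]
        lam = max(0.0, sum(v[i] * b[i][j] * v[j] for i in range(n) for j in range(n)))
        for i in range(n):
            coords[i].append(math.sqrt(lam) * v[i])
        kept += lam
        # Deflate so that the next pass finds the next eigenvector.
        b = [[b[i][j] - lam * v[i] * v[j] for j in range(n)] for i in range(n)]
    return coords, (kept / trace if trace > 0 else 0.0)


# Three hypothetical objects lying on a line at positions 0, 1, and 3.
dist = [[0, 1, 3], [1, 0, 2], [3, 2, 0]]
coords, proportion = classical_mds(dist, dims=1)
```

Because the three sample objects lie on a line, a single dimension reproduces their distances and the coefficient is one to within rounding error. Real witness data needs more dimensions and gives a smaller coefficient.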

Divisive clustering begins with a single cluster and ends with individual objects. The program documentation describes the clustering algorithm as follows:[6]

At each stage, the cluster with the largest diameter is selected. (The diameter of a cluster is the largest dissimilarity between any two of its observations.) To divide the selected cluster, the algorithm first looks for its most disparate observation (i.e., which has the largest average dissimilarity to the other observations of the selected cluster). This observation initiates the "splinter group". In subsequent steps, the algorithm reassigns observations that are closer to the "splinter group" than to the "old party". The result is a division of the selected cluster into two new clusters.

This type of analysis produces a dendrogram which shows the heights at which clusters divide into sub-clusters. A divisive coefficient which measures the amount of clustering structure is presented as well. The value of this coefficient ranges from zero to one with larger values indicating a greater degree of clustering. It should be emphasized that a DC dendrogram is not a genealogical tree of the type produced by phylogenetic analysis. Instead, it merely shows a reasonable way to progressively subdivide an all-encompassing cluster until every sub-cluster is comprised of a single object.[7]
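The splitting step described in the program documentation can be sketched as follows. This Python sketch follows the quoted description rather than the R program itself, so details such as tie-breaking and the stopping rule are assumptions. The function name and the sample positions are hypothetical.

```python
def split_cluster(cluster, dist):
    """Split one cluster into (splinter group, old party)."""
    def avg(i, group):
        # Average dissimilarity between object i and the other members of group.
        others = [j for j in group if j != i]
        return sum(dist[i][j] for j in others) / len(others) if others else 0.0

    old = list(cluster)
    # The most disparate observation initiates the splinter group.
    seed = max(old, key=lambda i: avg(i, old))
    old.remove(seed)
    splinter = [seed]
    # Move observations that are closer to the splinter group than to the old party.
    while len(old) > 1:
        gain = {i: avg(i, old) - avg(i, splinter) for i in old}
        mover = max(gain, key=gain.get)
        if gain[mover] <= 0:
            break
        old.remove(mover)
        splinter.append(mover)
    return splinter, old


# Four hypothetical objects located at these positions on a line.
positions = [0, 1, 10, 12]
dist = [[abs(p - q) for q in positions] for p in positions]
splinter, old_party = split_cluster(range(4), dist)
```

For the sample data, the object at position 12 starts the splinter group and is joined by the object at position 10, leaving the two objects near zero as the old party. Repeating the step on whichever cluster has the largest diameter yields the successive divisions shown in a dendrogram.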

Partitioning around medoids (PAM) builds clusters around representative objects called medoids. The program documentation provides this description:[8]

The ‘pam’-algorithm is based on the search for ‘k’ representative objects or medoids among the observations of the dataset. These observations should represent the structure of the data. After finding a set of ‘k’ medoids, ‘k’ clusters are constructed by assigning each observation to the nearest medoid. The goal is to find ‘k’ representative objects which minimize the sum of the dissimilarities of the observations to their closest representative object.

PAM analysis produces a statistic called the mean silhouette width (MSW), which helps to indicate how many groups a data set divides into most naturally.
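The goal stated in the program documentation can be illustrated with a short Python sketch. It is not the PAM algorithm itself: for clarity it tries every possible set of k medoids, which is only practical for very small data sets, whereas the R program uses a more efficient search. The function name and sample positions are hypothetical.

```python
from itertools import combinations


def best_medoids(dist, k):
    """Exhaustively find k medoids minimizing the total distance to the nearest medoid."""
    n = len(dist)

    def cost(medoids):
        return sum(min(dist[i][m] for m in medoids) for i in range(n))

    medoids = min(combinations(range(n), k), key=cost)
    # Assign each object to its nearest medoid.
    clusters = {m: [] for m in medoids}
    for i in range(n):
        clusters[min(medoids, key=lambda m: dist[i][m])].append(i)
    return clusters


# Six hypothetical objects at these positions on a line.
positions = [0, 1, 2, 10, 11, 12]
dist = [[abs(p - q) for q in positions] for p in positions]
clusters = best_medoids(dist, k=2)
```

The sample data yields two clusters whose medoids are the middle object of each run of three.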

The analysis results presented in this article are based on data extracted from a number of sources. Each source is assigned an identifier based on the responsible author or party. Results are keyed to these identifiers so that the basis of the relevant data set can be determined. Information on each source is given below under the heading of its identifier.

The sources use various data formats, including tables of percentage agreement and lists of pair-wise proportional agreements. These are converted to distance matrices before analysis. A number of the sources have tables of percentage agreement with blank entries where there is insufficient data to complete a calculation. This can happen if two witnesses do not cover a sufficient quantity of overlapping text (e.g. Old Latins e and k) or if the two are mutually exclusive (e.g. a scribe and corrector of the same manuscript). Analysis cannot proceed if a distance matrix has missing data. However, it is possible to produce multiple distance matrices from a table of percentage agreement which has missing entries so that each distance matrix has no missing data. For example, Fee's tables of percentage agreement include blank entries for cells relating to the scribe and corrector of a manuscript. Two distance matrices are produced from each table of this kind, one including entries for the scribes and the other containing entries for the correctors.
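The conversion and the scribe and corrector split can be sketched as follows. This is an illustrative Python sketch, not the procedure actually used to prepare the data sets. The sigla, percentages, and function names are hypothetical. None marks a blank table entry.

```python
def to_distance(percentage):
    """Convert a percentage agreement to a distance rounded to three decimal places."""
    return None if percentage is None else round(1 - percentage / 100, 3)


def sub_matrix(table, keep):
    """Distance matrix restricted to the witnesses listed in keep."""
    return {a: {b: to_distance(table[a][b]) for b in keep} for a in keep}


# Hypothetical table with a blank entry between a scribe and the corrector.
table = {
    "Aleph":   {"Aleph": 100, "Aleph-c": None, "B": 80},
    "Aleph-c": {"Aleph": None, "Aleph-c": 100, "B": 75},
    "B":       {"Aleph": 80, "Aleph-c": 75, "B": 100},
}
scribes = sub_matrix(table, ["Aleph", "B"])
correctors = sub_matrix(table, ["Aleph-c", "B"])
```

Neither of the two resulting matrices contains a missing entry, so each can be analysed separately.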

Brooks

Tables of percentage agreement from James Brooks' New Testament Text of Gregory of Nyssa covering: Matthew (table 1, 58-9); Luke (table 7, 90-1); John (table 13, 138-9); and Paul's Letters (table 18, 254-5). The tables were transcribed by Richard Mallett.

CB

Data matrices for each Gospel compiled by Richard Mallett using Comfort's New Testament Text and Translation Commentary and Comfort and Barrett's Text of the Earliest New Testament Greek Manuscripts.

Cunningham

Tables of percentage agreement for the Gospel of John and Paul's Letters from Arthur Cunningham's New Testament Text of St. Cyril of Alexandria, 421-2 and 753.

Donker

Data matrices for Acts, the General Letters, and Paul's Letters from Gerald Donker's Text of the Apostolos in Athanasius of Alexandria. Gerald Donker and the SBL have made this data available through an archive located at sbl-site.org/assets/pdfs/pubs/Donker/Athanasius.zip. May their respective tribes increase!

EFH

Data used by Jared Anderson for his ThM thesis, Analysis of the Fourth Gospel in the Writings of Origen. The data was originally collected by Bart D. Ehrman, Gordon D. Fee, and Michael W. Holmes for their Text of the Fourth Gospel in the Writings of Origen. (Bruce Morrill did the statistical analysis presented in that volume.) A revised version of Anderson's thesis will be published in SBL's New Testament in the Greek Fathers series.

Ehrman

Table of percentage agreement for the Gospel of Matthew from Bart Ehrman's Didymus the Blind and the Text of the Gospels. The table was transcribed by Richard Mallett.

Fee

Tables of percentage agreement from three articles by Gordon Fee: (1) tables covering John 4, John 1-8, and John 9 from Codex Sinaiticus in the Gospel of John; (2) another table covering John 4 but including patristic data from The Text of John in Origen and Cyril of Alexandria; and (3) a table covering Luke 10 from The Myth of Early Textual Recension in Alexandria.

Hurtado

Tables of percentage agreement from Larry Hurtado's Text-Critical Methodology and the Pre-Caesarean Text. There is one table for each of the first fourteen chapters of the Gospel of Mark, one for Mark 15.1-16.8, and another for places where P45 is legible.[9]

INTF-General

Distance matrices made by an R script using an INTF database related to their Novum Testamentum Graecum: Editio Critica Maior: Catholic Letters. The INTF kindly provided access to this data.

INTF-Parallel

Distance matrices made by an R script using INTF tables located at http://intf.uni-muenster.de/PPApparatus/ which are related to Strutwolf and Wachtel (eds.), Novum Testamentum Graecum: Editio Critica Maior: Parallel Pericopes. The INTF generously provides open access to this data.

Mullen

Data extracted from Roderic Mullen's The New Testament Text of Cyril of Jerusalem. Two data sets have been prepared for the Gospel of Mark: one is a data matrix based on citations isolated by Mullen (112-7); the other is a distance matrix corresponding to a table of percentage agreement which relates to the parts of Mark's Gospel covered by P45 (41). Mullen based the latter on data compiled by Larry Hurtado but added other texts such as Family 1, 28, 157, and 700 (40, n. 81).

Osburn

Tables of percentage agreement for Acts and Paul's Letters from Carroll Osburn's The Text of the Apostolos in Epiphanius of Salamis. Richard Mallett transcribed these tables.

Racine

Table of percentage agreement for Matthew's Gospel from Jean-François Racine's The Text of Matthew in the Writings of Basil of Caesarea. The table was transcribed by Richard Mallett.

Richards

Table of percentage agreement from W. L. Richards' Classification of the Greek Manuscripts of the Johannine Epistles based on 159 variation units (72, 76-84).

UBS2

Tables of percentage agreement compiled from the apparatus of the second edition of the UBS Greek New Testament by Maurice A. Robinson. The tables were originally presented in Robinson's Determination of Textual Relationships and Textual Interrelationships. They were transcribed by Claire Hilliard and Kay Smith.

UBS4

Data matrices constructed from the apparatus of the fourth edition of the UBS Greek New Testament. Richard Mallett constructed the matrices for Mark, 2 Corinthians, and Revelation. A substantial part of the matrix for Matthew was encoded by Mark Spitsbergen. (Only the first fourteen chapters of Matthew are presently covered.) The UBS4 apparatus includes minuscule 2427 which is now regarded as a forgery. The data for this manuscript has been retained for the sake of interest; dropping it would have little effect on analysis results. In some cases, the evidence for a number of similar witnesses is consolidated to produce a group variant. For example, the majority reading of vg-cl, vg-st, and vg-ww is counted as the reading of the Vulgate (vg) in 1 John.

Wasserman

Tables of proportional agreement from Tommy Wasserman's Patmos Family of New Testament MSS covering Matt 19.13-26, Mark 11.15-26, Luke 13.34-14.11, John 6.60-7.1, and the Pericope Adulterae (usually John 7.53-8.11). The underlying collations used a reconstructed text to represent Family Π in Matt 19.13-26 and the Pericope Adulterae, which text is labelled f-Pi in the analysis results.

The following table presents data sets and analysis results based on the sources. The source column identifies the source of a data set, the matrices column presents data and dissimilarity matrices extracted from the data set, and the CMDS, DC, and PAM columns present analysis results. Data sets and some analysis results are accessed via links in the table. A distance matrix is always provided but a data matrix is only included if one has been constructed. If there is no data matrix then NA for not available is entered in the relevant column. Data and distance matrices are formatted as comma-separated values (CSV) files so that they can be downloaded and imported into a spreadsheet program for inspection.

Some data sets represent manuscripts by Gregory-Aland numbers (e.g. 01, 02, 03, 044) while others use letters or latinized forms (e.g. Aleph, A, B, Psi). These symbols carry through to the analysis results. In INTF data, ECM or A (for Ausgangstext or initial text) represents the text of the Editio Critica Maior. The A for Ausgangstext in INTF data sets should not be confused with the A for Codex Alexandrinus in other data sets. Abbreviations UBS, WH, and TR stand for the texts of the United Bible Societies' Greek New Testament, Westcott and Hort's New Testament in the Original Greek, and the Textus Receptus, respectively. Maj, Byz, and Lect stand for majority, Byzantine, and lectionary texts, respectively. The relevant printed editions should be consulted for explanations of what these group symbols represent.

The analysis techniques employed in this study operate on a distance matrix, not a data matrix. If there is a data matrix for the source data then the corresponding distance matrix is obtained by calculating the simple matching distance between each pair of witnesses with enough defined variation sites. In other cases, the distance matrix is constructed from a table of percentage agreement or from a table which records the number of agreements and number of variation units where both witnesses of a pair are defined. Analysis cannot proceed if entries are missing from a distance matrix. Missing entries in tables of percentage agreement are therefore eliminated by an iterative procedure which at every step drops those witnesses with the most missing entries.[10] Three decimal places are used for distances regardless of whether this level of precision is warranted.
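One reading of the iterative procedure is sketched below in Python. It is not the R code used for the study and the details are assumptions: at each step every witness that shares the largest count of missing entries is dropped, and the process stops once no missing entries remain. The function name and sample matrix are hypothetical.

```python
def drop_until_complete(matrix):
    """Return the witnesses left after repeatedly dropping those with most missing entries."""
    keep = list(matrix)
    while True:
        missing = {
            w: sum(matrix[w][v] is None for v in keep if v != w)
            for w in keep
        }
        worst = max(missing.values(), default=0)
        if worst == 0:
            return keep
        keep = [w for w in keep if missing[w] < worst]


# Hypothetical distance matrix in which witness D lacks entries for A and B.
names = ["A", "B", "C", "D"]
matrix = {x: {y: 0.0 if x == y else 0.5 for y in names} for x in names}
for x, y in [("A", "D"), ("B", "D")]:
    matrix[x][y] = matrix[y][x] = None

remaining = drop_until_complete(matrix)
```

Dropping D alone removes every missing entry in this example, so A, B, and C are retained.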

Results are presented for three modes of analysis: classical multidimensional scaling (CMDS), divisive clustering (DC), and partitioning around medoids (PAM). The proportion of variance coefficient associated with a CMDS map indicates the proportion of information contained in the distance matrix which is represented by the map. The divisive coefficient associated with a DC dendrogram indicates the degree of clustering of witnesses in the data set.[11]

The PAM column does not present partitions but instead gives indications of preferred numbers of groups for the data set. PAM analysis partitions cases (here, New Testament witnesses) into a number of groups which must be specified before the analysis begins. A statistic called the mean silhouette width (MSW) that is calculated during PAM analysis indicates preferable numbers of groups. Repeating the analysis for each possible number of groups shows which numbers are better suited to the data set because larger MSW values indicate more natural partitions. That is, local maxima (i.e. peaks) in a graph of MSW versus the number of groups correspond to preferable numbers of groups. The MSW column has a link to the MSW versus number of groups plot while the groups column gives a selection of preferred numbers of groups suggested by the graph.

A typical MSW plot for New Testament data contains numerous peaks. In this article, only those peaks which exceed the average of all MSW values for the plot are considered as eligible indicators of preferred numbers of groups.[12] Even with a reduced number of peaks, it can be difficult to choose how many groups to use for a partition. The highest peak often corresponds to only two or three groups but using such a small number produces some nebulous groups. By contrast, using a larger number tends to produce more coherent groups. Due to the tendency for MSW values to decrease as the number of groups increases, a peak to the right of another one has a claim to being preferable even if it has a smaller magnitude. Sometimes there is only one eligible peak, in which case only one number of groups can be given. While it would be reasonable to select the number of groups corresponding to any eligible peak of an MSW plot, the groups column usually gives only two suggestions:

  • one corresponding to the greatest MSW value

  • one corresponding to the last (i.e. rightmost) eligible MSW peak.
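The selection rule can be sketched in Python as follows. This is an illustration rather than the code used to prepare the table. The MSW values are hypothetical, as is the function name, and treating an end point as a peak when it exceeds its only neighbour is an assumption.

```python
def suggested_group_counts(msw):
    """msw maps a number of groups to its mean silhouette width."""
    ks = sorted(msw)
    mean = sum(msw.values()) / len(msw)
    peaks = []
    for pos, k in enumerate(ks):
        left = msw[ks[pos - 1]] if pos > 0 else float("-inf")
        right = msw[ks[pos + 1]] if pos + 1 < len(ks) else float("-inf")
        # An eligible peak is a local maximum that exceeds the average MSW value.
        if msw[k] > left and msw[k] > right and msw[k] > mean:
            peaks.append(k)
    if not peaks:
        return []
    highest = max(peaks, key=lambda k: msw[k])
    return sorted({highest, peaks[-1]})  # greatest peak and rightmost eligible peak


# Hypothetical MSW values for partitions into two to eight groups.
msw = {2: 0.30, 3: 0.22, 4: 0.25, 5: 0.20, 6: 0.12, 7: 0.14, 8: 0.10}
suggestions = suggested_group_counts(msw)
```

In this example the peak at seven groups is ignored because it falls below the average, leaving two and four as the suggested numbers of groups.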

Table 2. Data sets and analysis results

Book or division  Source                   Data matrix  CMDS coeff.  DC coeff.  PAM groups
Matthew           Brooks                   NA           0.54         0.71       2, 14
                  CB                       yes          0.73         0.69       2, 8
                  Ehrman                   NA           0.63         0.68       2, 12
                  INTF-Parallel            yes          0.28         0.80       2, 121
                  Racine                   NA           0.69         0.72       2, 13
                  UBS2                     NA           0.35         0.70       2, 32
                  UBS4                     yes          0.53         0.73       3, 34
                  Wasserman                NA           0.51         0.85       28, 33
Mark              CB                       yes          0.76         0.74       3, 5
                  Hurtado (Mk 1)           NA           0.76         0.51       6
                  Hurtado (Mk 2)           NA           0.82         0.54       3, 5
                  Hurtado (Mk 3)           NA           0.76         0.56       4, 6
                  Hurtado (Mk 4)           NA           0.82         0.64       5
                  Hurtado (Mk 5)           NA           0.76         0.52       5
                  Hurtado (Mk 6)           NA           0.81         0.59       3, 7
                  Hurtado (Mk 7)           NA           0.85         0.62       4
                  Hurtado (Mk 8)           NA           0.83         0.62       5
                  Hurtado (Mk 9)           NA           0.79         0.56       5
                  Hurtado (Mk 10)          NA           0.82         0.61       6
                  Hurtado (Mk 11)          NA           0.77         0.60       6
                  Hurtado (Mk 12)          NA           0.81         0.61       6
                  Hurtado (Mk 13)          NA           0.82         0.64       5
                  Hurtado (Mk 14)          NA           0.83         0.64       5
                  Hurtado (Mk 15+)         NA           0.86         0.61       2, 6
                  Hurtado (P45)            NA           0.84         0.67       6
                  Mullen                   yes          0.62         0.73       8, 13
                  Mullen (P45)             NA           0.76         0.68       7
                  INTF-Parallel            yes          0.33         0.80       2, 109
                  UBS2                     NA           0.42         0.69       2, 46
                  UBS4                     yes          0.51         0.74       3, 34
                  Wasserman                NA           0.59         0.87       3, 32
Luke              Brooks                   NA           0.45         0.71       3, 7
                  CB                       yes          0.75         0.77       3, 7
                  Fee (Lk 10)              NA           0.67         0.69       3, 11
                  INTF-Parallel            yes          0.27         0.81       3, 86
                  UBS2                     NA           0.38         0.71       7, 32
                  Wasserman                NA           0.61         0.82       2, 18
John              CB                       yes          0.72         0.67       2, 10
                  Cunningham               NA           0.54         0.67       3, 19
                  EFH                      yes          0.61         0.64       3, 18
                  Fee (Jn 1-8)             NA           0.80         0.57       2, 6
                  Fee (Jn 1-8, corr.)      NA           0.77         0.51       2, 6
                  Fee (Jn 4)               NA           0.83         0.64       2, 7
                  Fee (Jn 4, corr.)        NA           0.83         0.65       6
                  Fee (Jn 9)               NA           0.89         0.49       4
                  Fee (Jn 9, corr.)        NA           0.87         0.49       4, 6
                  Fee (Jn 4, pat.)         NA           0.60         0.71       3, 15
                  Fee (Jn 4, pat., corr.)  NA           0.60         0.71       3, 14
                  INTF-Parallel            yes          0.34         0.84       2, 98
                  UBS2                     NA           0.36         0.68       3, 51
                  Wasserman                NA           0.57         0.89       3, 44
                  Wasserman (PA)           NA           0.66         0.85       12, 22
Acts              Donker                   yes          0.64         0.71       2, 14
                  Donker (Acts 1-12)       yes          0.66         0.75       2, 9
                  Donker (Acts 13-28)      yes          0.67         0.77       2, 12
                  Osburn                   NA           0.76         0.83       4, 10
                  UBS2                     NA           0.41         0.71       11, 31
General Letters   Donker                   yes          0.83         0.76       4, 10
James             INTF-General             NA           0.35         0.83       3, 111
1 Peter           INTF-General             NA           0.35         0.78       2, 125
                  UBS4                     yes          0.45         0.72       21, 34
2 Peter           INTF-General             NA           0.36         0.79       39, 81
1 John            INTF-General             NA           0.33         0.74       51, 87
                  Richards                 NA           0.50         0.83       3, 56
                  UBS4                     yes          0.40         0.79       12, 31
2 John            INTF-General             NA           0.32         0.80       51, 84
3 John            INTF-General             NA           0.35         0.84       36, 74
Jude              INTF-General             NA           0.32         0.81       52, 87
Paul's Letters    Cunningham               NA           0.70         0.71       3, 15
                  Donker                   yes          0.70         0.67       3
                  Osburn                   NA           0.78         0.75       3, 13
Romans            Donker                   yes          0.71         0.76       4, 10
1 Corinthians     Donker                   yes          0.68         0.74       3, 9
2 Corinthians     UBS4                     yes          0.44         0.75       3, 41
2 Cor. - Titus    Donker                   yes          0.66         0.69       3, 14
Hebrews           Donker                   yes          0.72         0.63       2, 9
                  UBS4                     yes          0.39         0.76       27, 32
Revelation        UBS4                     yes          0.41         0.61       16, 23

A typical multivariate analysis result such as the CMDS map obtained from the UBS2 data set for Matthew's Gospel shows that groups exist among New Testament witnesses.


The analogy of a galaxy used by Eldon Epp to describe text-types seems apt for these groups:

A term such as group, cluster, or nucleus might be used to describe a local maximum in the density of objects within a CMDS map. A line which joins two items might be called a trajectory, and a region between groups where there is a higher than usual concentration of witnesses might be called a stream or corridor.[14]

The vocabulary of tree structures is useful when discussing DC dendrograms.


A branching point is called a node, each structure which descends from a node is called a branch, and terminals are called leaves. The dendrograms produced by analysing New Testament data have a self-similar character where, apart from scale, smaller parts have the same appearance as larger parts. (Self-similarity is evident in the CMDS maps too.) Each branch contains its own sub-branches, unless terminated by leaves (i.e. individual witnesses). A partition based on a DC dendrogram is obtained by means of a horizontal line which cuts across the dendrogram at some height to produce a set of separate branches.

PAM analysis partitions the constituents of a data set into a number of groups such that each constituent appears in only one group. Plotting a statistic called the mean silhouette width against each possible number of groups identifies preferable numbers of groups. That is, the MSW plot indicates which numbers of groups are more natural for the data set.


Peaks in the MSW value indicate which numbers of groups are preferable. Partitioning the example data set into 13 groups produces this result:


The PAM algorithm identifies a medoid for each group. The medoid is the member for which the sum of distances from all other members of the group is a minimum. In groups with three or more members, it is fair to describe the medoid as the most central member of its group.[15] This article uses the medoid identifier as a label for the associated group.

As a data set is progressively divided into larger numbers of groups, it often happens that a parent group gives rise to narrower child groups. Quite often there is a single group which acts as a catch-all for witnesses that do not fall into narrower, more coherent groups. Group K of the 13-way partition is an example: if the same data set is partitioned into more groups then this group is likely to produce subgroups while retaining a core membership.[16] Dividing a data set into a large number of parts reveals group cores. Whichever groups remain are highly coherent, entirely comprised of closely related members. Dividing a data set into numerous groups often produces singletons, each comprised of a single member. A singleton is isolated, not having any close relatives among the items in the data set.

Adding the partition's number of groups to the group label produces a more specific identifier. For example, in partitions of the INTF data set for Matthew, 033 (2) refers to the group with medoid 033 in a two-way partition while 033 (24) refers to the group with medoid 033 in a 24-way partition. Corresponding groups such as 033 (2) and 033 (24) are often produced when the same data set is divided into different numbers of parts. However, the medoids of such groups are not necessarily the same. Adding or subtracting even a single member can cause the medoid of a group to change. Consequently, correspondence must be established on the basis of common membership, not shared medoids. If groups from different partitions have the same core membership but differing medoids then descendant groups can be labelled by chaining the respective medoids together. To give an example, the relevant MSW plot suggests that a 93-way partition is another reasonable choice for the INTF data set for Matthew's Gospel. The members of group 031 (93) are also found in group 045 (24) yet the respective medoids differ. Labelling the subgroup as 045/031 signifies the connection with the parent group from which its members are drawn.

Even if an item is placed in a group with others, it may not be a good fit. Poorly classified items are identified using the silhouette width statistic which is calculated for every item during PAM analysis. Its value ranges from +1 to -1: the closer it is to +1, the better the associated object fits into its assigned group; by contrast, the closer the statistic is to -1, the worse the fit. Items with a negative silhouette width are listed at the end of each PAM partition presented in this article to show that they are poorly classified. They are listed in order of increasingly negative silhouette width so the worst classified appears last.
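The silhouette width of each item can be computed as in the following Python sketch, which uses the standard definition: a is the mean distance from an item to the other members of its own group, b is the smallest mean distance to the members of any other group, and the width is (b - a) divided by the larger of a and b. Assigning a width of zero to the sole member of a group is a common convention and is assumed here. The function name and sample data are hypothetical.

```python
def silhouette_widths(dist, labels):
    """Silhouette width of every item, given a distance matrix and group labels."""
    groups = {}
    for i, g in enumerate(labels):
        groups.setdefault(g, []).append(i)
    widths = []
    for i, g in enumerate(labels):
        own = [j for j in groups[g] if j != i]
        if not own or len(groups) < 2:
            widths.append(0.0)  # singleton group, or only one group overall
            continue
        a = sum(dist[i][j] for j in own) / len(own)
        b = min(
            sum(dist[i][j] for j in members) / len(members)
            for h, members in groups.items() if h != g
        )
        widths.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return widths


# Four hypothetical items at these positions on a line.
positions = [0, 1, 10, 11]
dist = [[abs(p - q) for q in positions] for p in positions]
good = silhouette_widths(dist, [0, 0, 1, 1])
poor = silhouette_widths(dist, [0, 1, 1, 1])
```

In the second partition the item at position 1 is grouped with distant items, so its silhouette width is negative and it would be listed as poorly classified.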

A poor fit may indicate that a witness has a mixed text. It may also indicate that the chosen number of groups is too small for a text to be grouped with like texts alone. Another indication of a questionable classification is when the results of different analysis modes point to a range of possible affiliations. When the classification of a witness is uncertain, further information may produce a more definite result. However, if a witness actually does have a mixed text then it will remain difficult to classify as anything but a mixture. In the case of a partition, a mixed text will not fit well into any group unless other witnesses happen to possess the same kind of mixture of texts.

The respective views produced by CMDS, DC, and PAM analysis are often but not always consistent. If results obtained by all modes point to the same conclusion with respect to implied clustering then that can be taken as a reasonably sure result; however, if they differ then each result needs to be handled with due caution. The distance matrix is the final arbiter when the affiliation of a witness is not clearly indicated by concurrence of analysis results. A statistical analysis reveals the range of distances which normally occur between two texts comprised of states (i.e. readings) which are randomly selected from those available at each variation site. Distances outside the normal range indicate an adjacent or opposite relationship between two texts: adjacent if the distance is less than normal and opposite if greater.[17]

To explore the affiliation of a reference witness, the relevant row of the distance matrix is extracted and its entries are ranked in order of increasing distance from the reference. Any distance in the normal range is marked by an asterisk to show that it is not statistically significant. Proximity is still a useful indicator of affiliation even when the distances are not statistically significant. Lack of significance does not imply lack of relationship: all New Testament witnesses are ultimately related even though not all have close relationships. The relative size of the normal range of distances contracts as the number of places compared increases so some of the closest witnesses might become significantly close if a larger data set is examined.
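The ranking procedure just described can be sketched as follows. The function name and arguments are hypothetical, and the bounds of the normal range are assumed to have been obtained beforehand by whatever statistical analysis is in use.

```python
def rank_by_distance(reference, witnesses, dist, lower, upper):
    """List the other witnesses in order of increasing distance from one.

    reference -- siglum of the reference witness
    witnesses -- sigla in the same order as the rows of `dist`
    dist      -- square matrix (list of lists) of pairwise distances
    lower, upper -- bounds of the normal range of distances
    """
    ref = witnesses.index(reference)
    # Extract the reference row, drop the self-distance, and sort.
    ranked = sorted(
        (dist[ref][j], name) for j, name in enumerate(witnesses) if j != ref
    )
    lines = []
    for d, name in ranked:
        # An asterisk marks a distance that is not statistically significant.
        marker = "*" if lower <= d <= upper else ""
        lines.append(f"{name}\t{d:.3f}{marker}")
    return lines
```

Witnesses listed before the first asterisk are significantly close to the reference, and those listed after the last asterisk are significantly distant.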

In DC and PAM analysis, a slight change in the distance matrix can cause a witness which lies near the midpoint between two groups to jump from one to the other. If a text is largely comprised of a mixture of variants characteristic of particular groups then a CMDS map will locate it between the relevant groups, proportionally closer to those whose variants it most often contains. Singular readings and variants shared with only a few other witnesses tend to isolate a text, increasing its distance from all others. The more eccentric a text relative to others in the data set, the more remote its location in a CMDS map.

The analysis techniques used in this study do not provide direct guidance concerning the relative priority of groups or the witnesses they contain. Another technique such as the Coherence-based Genealogical Method (CBGM) developed by the INTF can be used to investigate whether the witnesses in one group are closer to the initial text than those in another.[18] While the analysis results presented here are not genealogical, they do reveal textual affiliation and the main streams of New Testament textual development. To use another analogy, a CMDS map is like a time-exposure where texts of differing date occupy the same picture. Even though the techniques used here do not indicate temporal order among the texts, they are valuable for exploring relative dispositions, with similar texts being close to each other in a CMDS map, in the same branch of a DC dendrogram, and the same group of a PAM result. That is, similar texts tend to collocate in results produced by all three analysis modes.

Clustering may be identified by inspecting a CMDS map, cutting a DC dendrogram, or choosing a local maximum in a plot of mean silhouette widths and producing a corresponding partition using PAM analysis. We have a natural facility for recognising group structure in a point cloud such as constitutes a CMDS map. However, we are prone to misinterpret a purely random arrangement of points as constituting a cluster. One way to avoid this kind of error is to know what analysis results look like for randomly generated analogues of the data sets being examined, an approach demonstrated in my Groups article.
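Choosing candidate numbers of groups from a plot of mean silhouette widths (MSW) can be sketched in a few lines. The function name is hypothetical, and two details are assumptions: a peak is taken to be a value greater than both of its neighbours, and the first and last numbers of groups are treated as eligible when they exceed their single neighbour.

```python
def candidate_group_counts(msw):
    """Return numbers of groups at above-average local maxima of the MSW.

    msw -- dict mapping a number of groups to its mean silhouette width
    """
    ks = sorted(msw)
    mean = sum(msw.values()) / len(msw)
    peaks = []
    for idx, k in enumerate(ks):
        # A missing neighbour at either end never blocks a peak (assumption).
        left = msw[ks[idx - 1]] if idx > 0 else float("-inf")
        right = msw[ks[idx + 1]] if idx < len(ks) - 1 else float("-inf")
        if msw[k] > left and msw[k] > right and msw[k] > mean:
            peaks.append(k)
    return peaks
```

Each number returned can then be passed to PAM analysis to produce the corresponding partition.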

The following discussion of analysis results typically begins with partitions obtained by PAM analysis and may consider CMDS and DC results as well. More comprehensive data sets for a book or section take priority although others may be examined if they contain versional or patristic evidence. Many of the groups found by multivariate analysis have been discovered before. Unless otherwise indicated, associations between known groups and those revealed by PAM analysis are established by reference to the table of Profile Classification contained in Frederik Wisse's Profile Method, 52-90. Wisse's test passages are from three chapters of Luke's Gospel so his classifications are not necessarily valid in other parts of the New Testament.

As far as manuscripts are concerned, the most comprehensive data sets for the Gospels are those associated with the INTF's Parallel Pericopes volume. These data sets will be the first to be examined by PAM analysis in the gospel sections of this article. For Matthew's Gospel, the first few numbers of groups indicated by above average values in the relevant MSW plot are 2, 20, and 24. Using PAM analysis to split the data set into two groups produces this result:


The groups are named 033 and 826 after their medoids. Apart from 038, all of the witnesses in group 826 are members of Family 13. In this partition, 33, 05, 01, 892, A (i.e. the INTF's Ausgangstext), and 03 have increasingly negative silhouette widths and are thus identified as progressively worse fits within their assigned groups.
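For readers unfamiliar with the term, a medoid is the member of a group with the smallest total distance to the other members. A minimal sketch, using hypothetical names and a list-of-lists distance matrix, is:

```python
def medoid(members, witnesses, dist):
    """Return the member with the least total distance to its group.

    members   -- sigla of the witnesses in one group
    witnesses -- sigla in the same order as the rows of `dist`
    dist      -- square matrix (list of lists) of pairwise distances
    """
    index = {name: i for i, name in enumerate(witnesses)}
    return min(
        members,
        key=lambda m: sum(dist[index[m]][index[o]] for o in members),
    )
```

Unlike a centroid, a medoid is always an actual witness, which is why it can lend its siglum to the group.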

The next two peaks with above average MSW values occur for 20 and 24 groups. The value for 24 groups is greater than for 20, indicating that a 24-way partition of this data set is more natural than a 20-way partition.


Peripheral members of 033 (2) form new groups in the 24-way partition, leaving behind the manuscripts that constitute 033 (24). Many of these complexes are known: group 2193 is Family Π; group 045 has points of contact with von Soden's K1 and Wisse's cluster Ω; group 042 contains two of the purple manuscripts; group A is centred on the INTF's Ausgangstext and includes Alexandrian texts 03 and 892 as well; groups 1582 and 209 are parts of Family 1; group 968 is Wisse's cluster 1012; group 826 is Family 13; groups 184 and 61 are parts of von Soden's Iβ; and group 517 corresponds to von Soden's Iφa.[19] The other groups, namely 0233, 1230, and 372, may not have been noticed before. Manuscripts 01, 038, 05, 33, 579, 740, 79, 792, and 807 are singletons, implying that they have no close neighbours among the witnesses in the data set.

The corresponding CMDS and DC results are broadly consistent with the PAM results. The DC dendrogram also shows that all of the manuscripts apart from those comprising groups A, 1582, 826, and 05 (a singleton) belong to a single branch which corresponds to the Byzantine textual variety.

Another prominent peak in the MSW plot occurs for 93 groups. Partitioning the data set into this many groups produces this result:


A number of groups in the 24- and 93-way partitions are the same, others are narrower, some emerge for the first time, and one disintegrates. Groups 042, 1582, 968, 209, 372, and 61 are the same in 24- and 93-way partitions. Groups 0233, 031, 1230, 826, 517, and 184 of the 93-way partition are narrower versions of groups found in the 24-way partition. Those members which remain constitute the group cores. Group 2546 (93) is new, taking one member from 033 (24) and the other from 045 (24). Group 1528 (93) is a subgroup of 184 (24). Group A of the 24-way partition is not sufficiently coherent to survive a 93-way partition.

As for prior classifications, the members of group 033 (93) do not align with any particular group identified by von Soden or Wisse apart from often being classified in a K subgroup. Wisse classifies most of the members of group 031 (93) as Kx and four (07, 028, 031, 045) as cluster Ω. He classifies all of the members of group 1339 as Kr. He also recognises the two members of group 1528 (93) as a pair but does not note any special relationship between the two members of group 2546.

The next most comprehensive data set for Matthew is based on data compiled from the UBS2 apparatus by Maurice Robinson, and the corresponding MSW plot indicates that a 13-way partition is a reasonable choice:[20]


There are some interesting alignments between these groups and conventional categories used to describe New Testament texts. Wisse places all of the Greek manuscripts of groups cop and Cyril (i.e. Aleph, 33, B, 892) in his group B, which corresponds to the Alexandrian category. Group K includes: the Byzantine standard text; numerous Greek manuscripts and lectionaries; Family 13; Latin f; the Gothic, Ethiopic, and two Syriac (i.e. Peshitta and Harclean) versions; the texts of Chrysostom and Basil inferred from their quotations; and a set of corrections to Codex Sinaiticus. Three of the groups have a Western flavour: group it-d contains the Greek and Latin sides of Codex Bezae; group it-b is comprised of five Latin manuscripts and the quotations of Ambrose; group it-aur includes five more Latin manuscripts, Jerome's Vulgate, and the quotations of Jerome and Augustine. Finally, all members of groups geo, f-1, and syr-c except Cyprian are members of what B. H. Streeter called an Eastern type.[21]

While most of the groupings implied by the 13-way partition are not unexpected, it is surprising to see Cyprian placed in a group with Family 1 and Origen. Negative silhouette widths imply that neither Cyprian nor Origen fits well into this group. One might expect Cyprian and Latin k to be grouped together because they are supposed to have a close relationship.[22] However, a list of witnesses ranked by distance from Cyprian's text implies that the two are not alike:


Asterisks mark distances which are not statistically significant.[23] Latin k has an opposite relationship to Cyprian's text in the sense of being at a greater than normal distance. Perhaps Cyprian and Latin k actually are close to each other but the UBS2 apparatus is skewed, presenting an inordinate number of variation sites where the two differ. Even though this is possible, it is more probable that these two texts actually are significantly different in Matthew. Another surprise concerning Cyprian's text is that its nearest neighbour is minuscule 1546, a member of group K. Analysis of a more comprehensive data set may throw light on these puzzling aspects of Cyprian's text as represented by the UBS2 apparatus.

Studies devoted to particular Church Fathers help their texts to be located relative to other New Testament witnesses. Brooks has isolated the citations of Gregory of Nyssa in Matthew. The CMDS and DC analysis results based on the distance matrix derived from Brooks' data indicate that Gregory of Nyssa's text is somewhat isolated. Ranking witnesses by distance from Gregory's inferred text shows which ones are closest:


While none of the witnesses is significantly close, some are further away than expected. It is therefore safe to say that Gregory of Nyssa's text as preserved in his quotations is not like that of W, Pi, 565, 13, 1424, L, 700, B, Aleph, D, it-c, Theta, it-a, or it-b. In conventional terms, Gregory of Nyssa's text is non-Alexandrian, non-Western, and, perhaps, non-Eastern.

The closest witnesses to Gregory of Nyssa's text are Byzantine. Wisse places codices V (031), E (07), and Ω (045) in cluster Ω of the Kx group. He classifies Codex U (030) as Kx in one test passage and Kmix in the other two. While Wisse places 1241 in his (Alexandrian) group B, the 24- and 93-way partitions of INTF data, above, indicate that it is a peripheral member of Family Π in Matthew.[24]

Brooks concludes that Gregory of Nyssa's text is closest to Byzantine witnesses but rightly points out that this does not necessarily make Gregory a Byzantine witness:[25]

Virtually all of the evidence indicates that Gregory's quotations from the NT have their greatest affinity with the Byzantine type of text. There is still the nagging question, however, whether Gregory should actually be classified as a Byzantine witness. Obviously every witness is more closely related to one of the text-types than the others, but not every witness belongs to the text-type to which it is most closely related.

Tommy Wasserman's study on the “Patmos Family of New Testament MSS” presents collation data for 34 manuscripts which share an unusual reading at John 8.8-9. While the highest peak in the relevant MSW plot corresponds to 28 groups, dividing the data set into this many parts causes most of the textual complexes to fragment. Another peak corresponds to 18 groups, and division into this many parts preserves the less coherent complexes:


The table includes columns showing current locations and Wisse's textual classifications. Wisse's classifications are based on three test passages in Luke's Gospel and his classifications for a manuscript sometimes vary across those three places. This is why manuscripts 725, 191, 2608, 1033, 1699, and 1169 have alternative classifications in the fourth column. Instances of NA (not available) in the fourth column refer to manuscripts in Wasserman's study that were not examined by Wisse.

Some of these manuscripts belong to ancient collections and may therefore never have changed location.[26] There is perfect correlation between group membership and locality for the manuscripts in groups 1068 and 1385. Group 1385 corresponds to the Patmos Family of manuscripts identified by Silva New based on collations of the Gospel of Mark.[27] Groups besides 1068 and 1385 conform to the more usual pattern among New Testament manuscripts where textual nature bears little relationship to present location. While it is reasonable to expect the provenance of a manuscript to correlate with its textual nature, manuscript mobility tends to obscure the relationship.

Certain groups identified by PAM analysis correspond to ones isolated by Wisse's profile method: group 1690 matches Wisse's Πa; group 2146 contains members of a number of Wisse's M groups, particularly M651; group 1385 matches Cl 1173, which Wisse identifies with Silva New's Patmos Family; and group 1204 matches M1402. Groups Maj and 1089 share a few common members with Wisse's Kx; however, this does not signify much because of the catch-all nature of Kx.

The groups identified by PAM analysis correlate fairly well with ones proposed by Wasserman on the basis of 42 test passages in Matthew 19:13-26: group 1690 aligns with his group 5 (581, 992, 1571, 1690, 2463); four out of six members of group 2146 are in his group 1 (651, 1549, 2146, 191); group 1068 has the same members as his group 6; group 1385 has three members in common with his group 2 (1169, 1173, 1385, 1033); and group 1204 contains the constituents of his groups 3 and 4 (1402, 2295; 1204, 2315). One difference between the 18-way PAM result and Wasserman's classification relates to group 1089: it contains two members of his group 7 (1089, 1218), one member of his group 2 (1033) and two others (1272, 1699) which he does not classify for Matthew.[28]

The first few eligible peaks in the MSW plot for the INTF data set relating to Mark's Gospel correspond to two, four, seven, and 17 groups. The highest value besides those associated with two and four groups corresponds to 63 groups. Partitions into these numbers of groups are presented below:


Group A, mainly comprised of what many would call Alexandrian manuscripts, separates from the initial group to leave behind group 1339. Codex Bezae (05) is included in group A although it is not a good fit as indicated by a negative silhouette width. Interestingly, Wisse also places 05 in an Alexandrian group.[29]


Proceeding to a four-way partition, the composition of group A stays exactly the same. Three child groups emerge from parent 1339 (2): 1339 (4), 209 (4), and 826 (4). Not many would object to 1339 (4) being labelled as Byzantine although it is perhaps better described as a catch-all category of texts which gives rise to more coherent groups in more granular partitions. Apart from a few interlopers, groups 209 (4) and 826 (4) are recognisable as Families 1 and 13. The texts in group 209 which are not usually classified as members of Family 1 are 032 and 28; the texts in group 826 not usually placed in Family 13 are 038 and 565. Streeter regarded all of the members of groups 209 (4) and 826 (4) as primary or secondary authorities for the Caesarean branch of his Eastern type.[30]


When the data set is divided into seven parts, groups A (7) and 209 (7) remain precisely the same as their counterparts in the four-way partition. The continuing inclusion of 032 and 28 in group 209 suggests that these texts are affiliates of Family 1 in Mark. Comparing group 826 (7) and 826 (4) shows that 038 and 565 have migrated to other groups, making their connection to Family 13 somewhat tenuous.

The other groups which emerge in the seven-way partition are children of group 1339 (4): group 041 corresponds to Family Π; apart from 1241, group 517 corresponds to Wisse's cluster 1675 (i.e. Family 1424); group 1528 contains members of Wisse's groups 16 (1528, 16) and 1216 (1279, 1579, 184, 2726, 348, 555, 829) along with a few others (61, 565, 700); and group 1339 (7) is what remains of group 1339 (4) once these other groups separate.[31]


Some groups narrow in the 17-way partition: group A loses 05, 33, and 579; group 041 loses 034, 038, 1071, 732, and 863; group 209 loses 032 and 28; and group 1528 loses 565 and 700. Groups 826 and 517 have the same membership in both seven- and 17-way partitions. A number of new groups form as well, often drawing members from group 1339 (7): group 07 (17) takes almost half of the members of group 1339 (7) to form a complex which has points of contact with von Soden's Ki, also known as Family E; group 022 contains two purple codices (022 and 042) along with minuscules 1071, 1273, and 2766; group 565 is made up of 038 and 565, manuscripts which Streeter regarded as primary authorities for his Caesarean text; four of the six members of group 1457 are in Wisse's cluster 827; group 579 is comprised of two of the texts which dropped out of group A (7); and group 732 is comprised of three texts whose affiliation may not have been noticed before.

Only the most coherent groups remain in a 63-way partition:


The correspondence column associates groups found in the 63-way partition with known groups.[32] Members of group 031 are classified in various ways: many are in Wisse's Kx and four (07, 028, 031, 045) are in his cluster Ω; seven are in von Soden's Ki (07, 09, 011, 013) or K1 (028, 031, 045); six (07, 09, 011, 013, 028, 031) are in Family E. Ten members of group 031 (07, 011, 013, 028, 031, 045, 3, 1296, 1341, 1343) are in the same branch of the corresponding DC dendrogram.

Wisse's groups differ in some respects from those identified by the 63-way partition: his cluster 827 is split into groups 1593 and 1457 and members of his groups 16 and 1216 combine to form group 1528, which corresponds to von Soden's Iβ. While groups 1230, 1338, 4, 372, and 732 do not have counterparts in Wisse's taxonomy, groups 1230, 1338, and 372 do correspond to clusters found in the 93-way partition of the INTF data set for Matthew's Gospel. The Alexandrian group A (63) only survives at this level of partitioning due to the proximity of the synthetic Ausgangstext and Codex Vaticanus (B 03).

Shifting attention to the UBS4 data set for Mark, the MSW plot suggests that three-, six-, 11-, and 24-way partitions are among the most preferable.


The three-way partition produces Alexandrian (Psi), Byzantine (Byz), and Western (it-i) groups, although some texts (namely Codex C along with the Sinaitic Syriac, Armenian, and Vulgate versions) do not easily fit into this scheme.


The six-way partition also has groups which might be styled Alexandrian (B), Byzantine (Byz), and Western (it-ff-2). If minuscule 205 is counted as a member of Family 1 then all eight members of groups arm and 205 belong to what Streeter described as an Eastern type of the New Testament text.[33] The same eight texts form a single group in a five-way partition of the data set.

One of the six groups is centred on Jerome's Vulgate (vg) and includes: Codex Koridethi (Theta or 038); certain Latin manuscripts (it-aur, it-f, it-l, it-q); the Palestinian Syriac and Ethiopic versions; and Augustine's quotations. While Augustine's quotations might be expected to belong to this group, the presence of 038 is somewhat surprising. Streeter regarded 038 as a primary authority for the Caesarean branch of his Eastern type. Accordingly, this manuscript might be expected to fall into group arm or 205. However, the relevant CMDS map indicates that 038 has a Western component, as implied by its gravitation towards the region of the map occupied by Latin witnesses.[34] It might be more accurate to describe the Latin manuscripts of the vg group (i.e. it-aur, it-f, it-l, it-q) as Vulgate rather than Old Latin. Back-translations of Jerome's Vulgate could have infiltrated versions such as the Palestinian Syriac and Ethiopic, drawing them into this group as well.


In an 11-way partition, new groups form around Codex Sangallensis (Delta or 037), Codex Koridethi (Theta or 038), and the Bohairic Coptic version (cop-bo). The Delta group (C, L, 037) contains texts which are sometimes called secondary Alexandrian. The Theta group (038, 565, syr-pal) is comprised of three texts which could be regarded as mixtures in various proportions of ancient textual varieties. Under this interpretation of the CMDS result, all three share one pole, which is a text of the kind found in the Sinaitic Syriac, Armenian, and Georgian versions. The other pole for 038 and 565 would then be the Western variety, and that for the Palestinian Syriac would be the Byzantine variety.[35] Returning to the new groups which emerge from the 11-way partition, the Bohairic group is comprised of two Coptic varieties and Greek minuscule 892. Codex W and Old Latin k are the only members of their respective groups, not having any close relatives among the surveyed texts.


Going to a 24-way partition narrows group membership so that more central constituents remain. Many texts which migrate out of the 11-way groups become singletons. Group it-ff-2 (11) splits into groups it-d (24) and it-i (24). Group Psi (24) draws members from separate groups of the 11-way partition.

A few individual witnesses in the UBS4 data set will now be examined separately. Henry A. Sanders noticed that agreements between Codex W (032) and Old Latin readings, especially those of Codex Palatinus (e), have a higher relative frequency in the first few chapters than in the rest of Mark.[36] Splitting the UBS4 data set at the end of chapter four then ranking witnesses by distance from W shows the varying nature of this manuscript's text:



Constructing distance matrices which include P45 and Origen, then ranking witnesses by distance, gives these results:



These tables show that the nearest texts to P45, W (for Mark chapters 5-16), and Origen are predominantly members of the textual complexes represented by the arm and 205 groups found in the six- and 11-way partitions of the UBS4 data set for Mark.

Hurtado compiled tables of percentage agreement between a selection of witnesses for each of the first 14 chapters of Mark, Mark 15.1-16.8, and places where P45 is extant.[37] An expanded version of Hurtado's data on P45 will be considered below under the section on Mullen's data for Mark.

Mean silhouette width plots suggest that a five-way partition is a reasonable choice for a number of Hurtado's chapter-wise data sets. Dividing each data set into five groups produces the partitions shown in the following table, which includes links to associated CMDS and DC results for comparison. Groups are arranged in the order produced by the analysis program so corresponding groups do not necessarily occupy the same columns.


Certain witnesses tend to occur together: (1) the Textus Receptus (TR) and Alexandrinus (A); (2) Sinaiticus (Aleph) and B; (3) Koridethi (Theta) and 565. PAM analysis places Family 13 (F13) with the Textus Receptus and Alexandrinus in eight sections, with W (032) in four, on its own in two, and with Koridethi and 565 in one section.

Many of these affiliations are confirmed by corresponding CMDS and DC results. However, a difference between the results produced by the respective analysis modes occurs for codices D and W in the initial chapters of Mark. The changing textual complexion of W noticed by Sanders is confirmed by the DC dendrograms, which place D and W in the same dendrogram branches for the first four chapters of Mark but in different branches for the rest. The CMDS maps indicate that the texts of D and W are relatively close in chapters one, three, and four. By contrast, the five-way PAM partitions only place D and W in the same group for chapter four. Part of the disparity is due to the number of groups used for partitioning: in three-way partitions, D and W occupy the same groups for chapters two, three, and four.


The same data set can be sliced to obtain analysis results for P45.



In this 11-way partition, PAM analysis places P45 in a group with 04 (C), 019 (L), 040 (Ξ), 33, 892, and 1241, which some have called secondary Alexandrian texts.












A number of the nearest witnesses to P13 are of the Byzantine type.


The analysis results presented here highlight variations between witnesses of the New Testament. This naturally raises the question of what difference the variations make to the meaning of the text. Many variations are of little consequence, such as an added or dropped article, a change of word order, or substitution of a synonymous phrase. Some variations have a larger effect, the two most extreme examples being Mark 16.9-20 and John 7.53-8.11, which are absent from a number of witnesses.

One way to convey how much difference the variations make is to provide translations of a number of textual varieties for the same section of text. The following table gives a parallel translation of four varieties of the first chapter of Mark, highlighting the variation sites identified in the fourth edition of the United Bible Societies Greek New Testament. This edition only presents a selection of textual variations.

The variation units presented in the UBS apparatus constitute a small proportion of the total number of variation units that exist. However, the ones given below should provide a reasonably good impression of how much the respective varieties of text differ in meaning. This is because the great majority of variations which are not presented in the UBS apparatus have only a slight semantic effect.

The textual varieties shown in the table consist of four clusters identified by reference to the DC dendrogram of UBS4 combined data for Mark:[39]

  • A: The mainly Byzantine cluster comprised of A ... syr-pal

  • B: Aleph B C L Delta Psi 892 1342 cop-bo cop-sa it-k

  • C: W Theta f-1 28 205 565 arm geo syr-s

  • D: D it-a it-b it-c it-d it-ff-2 it-i it-q it-r-1

For each variation unit, the variant supported by a textual variety is taken to be the one that occurs most frequently among its members. To illustrate, suppose that a variation unit has three variants and that two witnesses in cluster C have the first, three have the second, and four have the third. The variant supported by cluster C would then be taken to be the third. For the purpose of this exercise, if a tie occurs then the supported variant is taken to be the one with the greatest tendency to isolate the variety.
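The selection rule can be sketched as follows. The function name is hypothetical, and because the tie-breaking criterion is described only in general terms, it is left as a function supplied by the caller rather than guessed at.

```python
from collections import Counter


def supported_variant(readings, break_tie=None):
    """Return the variant read most often by the members of one cluster.

    readings  -- the variant attested by each member at one variation unit
                 (None where a member is lacunose or not cited)
    break_tie -- optional function which picks one variant from a tied list
    """
    counts = Counter(r for r in readings if r is not None)
    if not counts:
        return None
    top = max(counts.values())
    tied = sorted(v for v, c in counts.items() if c == top)
    if len(tied) == 1:
        return tied[0]
    if break_tie is None:
        raise ValueError(f"tie between variants {tied}; supply break_tie")
    return break_tie(tied)
```

With the illustration given above, where two members have the first variant, three the second, and four the third, the function returns the third variant.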

Table 56. Four-way parallel translation of Mark chapter one

Reference | A | B | C | D
1.1 | The beginning of the good news about Jesus Christ, Son of God. | The beginning of the good news about Jesus Christ, Son of God. | The beginning of the good news about Jesus Christ. | The beginning of the good news about Jesus Christ, Son of God.
1.2 | As written in the prophets, "Look, I send my messenger before you, who will prepare your way;" | As written in the prophet Isaiah, "Look, I send my messenger before you, who will prepare your way;" | As written by Isaiah the prophet, "Look, I send my messenger before you, who will prepare your way;" | As written in the prophet Isaiah, "Look, I send my messenger before you, who will prepare your way;"
1.3 | "A voice shouting in the wilderness, 'Prepare the way of the Lord! Make his paths straight!'" | "A voice shouting in the wilderness, 'Prepare the way of the Lord! Make his paths straight!'" | "A voice shouting in the wilderness, 'Prepare the way of the Lord! Make his paths straight!'" | "A voice shouting in the wilderness, 'Prepare the way of the Lord! Make his paths straight!'"
1.4 | John appeared, baptizing in the wilderness and announcing a baptism of a changed attitude for forgiveness of wrong deeds. | John the Baptist appeared in the wilderness, and [was] announcing a baptism of a changed attitude for forgiveness of wrong deeds. | John the Baptist appeared in the wilderness, and [was] announcing a baptism of a changed attitude for forgiveness of wrong deeds. | John appeared in the wilderness, baptizing and announcing a baptism of a changed attitude for forgiveness of wrong deeds.
1.5 | They went out to him, all of the land of Judea and those of Jerusalem, and were baptized by him, confessing their wrong deeds. | They went out to him, all of the land of Judea and those of Jerusalem, and were baptized by him, confessing their wrong deeds. | They went out to him, all of the land of Judea and those of Jerusalem, and were baptized by him, confessing their wrong deeds. | They went out to him, all of the land of Judea and those of Jerusalem, and were baptized by him, confessing their wrong deeds.
1.6 | John was clothed [with] camel hair and a leather covering around his waist; he ate locusts and wild honey. | John was clothed [with] camel hair and a leather covering around his waist; he ate locusts and wild honey. | John was clothed [with] camel hair and a leather covering around his waist; he ate locusts and wild honey. | John was clothed [with] camel hair and a leather covering around his waist; he ate locusts and wild honey.
1.7 | He gave notice saying, "One more powerful than me comes after me, whose sandal straps I am not worthy to bend down and untie." | He gave notice saying, "One more powerful than me comes after me, whose sandal straps I am not worthy to bend down and untie." | He gave notice saying, "One more powerful than me comes after me, whose sandal straps I am not worthy to bend down and untie." | He gave notice saying, "I baptize you in water. One more powerful than me comes after me, whose sandal straps I am not worthy to bend down and untie."
1.8 | "I baptize you in water; he will baptize you in the Holy Spirit." | "I baptize you [in] water; he will baptize you in the Holy Spirit." | "I baptize you in water; he will baptize you in the Holy Spirit." | "He will baptize you in the Holy Spirit."
1.9 | In those days Jesus came from Nazareth, Galilee, and was baptized in the Jordan by John. | In those days Jesus came from Nazareth, Galilee, and was baptized in the Jordan by John. | In those days Jesus came from Nazareth, Galilee, and was baptized in the Jordan by John. | In those days Jesus came from Nazareth, Galilee, and was baptized in the Jordan by John.
1.10 | Then coming up from the water he saw the heavens being torn open and the Spirit coming down to him like a dove. | Then coming up from the water he saw the heavens being torn open and the Spirit coming down to him like a dove. | Then coming up from the water he saw the heavens being torn open and the Spirit coming down to him like a dove; | Then coming up from the water he saw the heavens being torn open and the Spirit coming down to him like a dove;
1.11 | There came from the heavens a voice: "You are my beloved Son; I am delighted with you." | There came from the heavens a voice: "You are my beloved Son; I am delighted with you." | from the heavens he heard a voice: "You are my beloved Son; I am delighted with you." | from the heavens a voice: "You are my beloved Son; I am delighted with you."
1.12 | Then the Spirit drives him into the wilderness. | Then the Spirit drives him into the wilderness. | Then the Spirit drives him into the wilderness. | Then the Spirit drives him into the wilderness.
1.13 | He was in the desert forty days being tested by Satan; he was with the wild animals and the angels waited on him. | He was in the desert forty days being tested by Satan; he was with the wild animals and the angels waited on him. | He was in the desert forty days being tested by Satan; he was with the wild animals and the angels waited on him. | He was in the desert forty days being tested by Satan; he was with the wild animals and the angels waited on him.
1.14 | After John had been arrested, Jesus went into Galilee announcing the good news of the kingdom of God | After John had been arrested, Jesus went into Galilee announcing the good news of God | After John had been arrested, Jesus went into Galilee announcing the good news of God | After John had been arrested, Jesus went into Galilee announcing the good news of the kingdom of God
1.15 | saying, "The time has come and God's kingdom is near. Change your attitude and believe the good news." | saying, "The time has come and God's kingdom is near. Change your attitude and believe the good news." | saying, "The time has come and God's kingdom is near. Change your attitude and believe the good news." | saying, "The time has come and God's kingdom is near. Change your attitude and believe the good news."
1.16 | Passing by the Sea of Galilee he saw Simon and Andrew, Simon's brother, throwing a net into the sea. (They were fishermen.) | Passing by the Sea of Galilee he saw Simon and Andrew, Simon's brother, throwing a net into the sea. (They were fishermen.) | Passing by the Sea of Galilee he saw Simon and Andrew, Simon's brother, throwing a net into the sea. (They were fishermen.) | Passing by the Sea of Galilee he saw Simon and Andrew, Simon's brother, throwing nets into the sea. (They were fishermen.)
1.17 | Jesus said to them, "Come with me and I will make you into fishers of men." | Jesus said to them, "Come with me and I will make you into fishers of men." | Jesus said to them, "Come with me and I will make you into fishers of men." | Jesus said to them, "Come with me and I will make you into fishers of men."
1.18 | Then they left the nets and followed him. | Then they left the nets and followed him. | Then they left the nets and followed him. | Then they left the nets and followed him.
1.19 | Going a bit further he saw Jacob Zebedee and his brother John who were in the boat fixing the nets. | Going a bit further he saw Jacob Zebedee and his brother John who were in the boat fixing the nets. | Going a bit further he saw Jacob Zebedee and his brother John who were in the boat fixing the nets. | Going a bit further he saw Jacob Zebedee and his brother John who were in the boat fixing the nets.
1.20 | Then he called them. Leaving their father Zebedee in the boat with the hired hands, they went after him. | Then he called them. Leaving their father Zebedee in the boat with the hired hands, they went after him. | Then he called them. Leaving their father Zebedee in the boat with the hired hands, they went after him. | Then he called them. Leaving their father Zebedee in the boat with the hired hands, they went after him.
1.21 They go into Capernaum. Then, on the Sabbath, having gone into the synagogue, he taught. They go into Capernaum. Then, on the Sabbath, having gone into the synagogue, he taught. They go into Capernaum. Then, on the Sabbath, having gone into the synagogue, he taught. They go into Capernaum. Then, on the Sabbath, having gone into the synagogue, he taught.
1.22 They were shocked by his teaching because he taught them like someone with authority, not like the scholars. They were shocked by his teaching because he taught them like someone with authority, not like the scholars. They were shocked by his teaching because he taught them like someone with authority, not like the scholars. They were shocked by his teaching because he taught them like someone with authority, not like the scholars.
1.23 Then there was a man with an unclean spirit in their synagogue. He screamed, Then there was a man with an unclean spirit in their synagogue. He screamed, Then there was a man with an unclean spirit in their synagogue. He screamed, Then there was a man with an unclean spirit in their synagogue. He screamed,
1.24 "What's with us and you, Jesus Nazarene? Have you come to destroy us? I know who you are — God's holy one!" "What's with us and you, Jesus Nazarene? Have you come to destroy us? I know who you are — God's holy one!" "What's with us and you, Jesus Nazarene? Have you come to destroy us? I know who you are — God's holy one!" "What's with us and you, Jesus Nazarene? Have you come to destroy us? I know who you are — God's holy one!"
1.25 Jesus told it off saying, "Be quiet! Get out of him!" Jesus told it off saying, "Be quiet! Get out of him!" Jesus told it off saying, "Be quiet! Get out of him!" Jesus told it off saying, "Be quiet! Get out of him!"
1.26 Throwing a fit and shouting with a loud voice, the unclean spirit got out of him. Throwing a fit and shouting with a loud voice, the unclean spirit got out of him. Throwing a fit and shouting with a loud voice, the unclean spirit got out of him. Throwing a fit and shouting with a loud voice, the unclean spirit got out of him.
1.27 All being shocked they asked each other, "What is this? What new teaching is this, that with authority he gives orders even to unclean spirits and they obey him?" All being shocked they asked each other, "What is this new teaching with authority? He gives orders even to unclean spirits and they obey him." All being shocked they asked each other, "What is this, this new teaching with authority? He gives orders even to unclean spirits and they obey him." All being shocked they asked each other, "What is that teaching, this new one with authority, that he gives orders even to unclean spirits and they obey him?"
1.28 The news about him then got out everywhere in the whole region of Galilee. The news about him then got out everywhere in the whole region of Galilee. The news about him then got out everywhere in the whole region of Galilee. The news about him then got out everywhere in the whole region of Galilee.
1.29 Then, leaving the synagogue, they went to Simon and Andrew's house with Jacob and John. Then, leaving the synagogue, they went to Simon and Andrew's house with Jacob and John. Then, leaving the synagogue, he went to Simon and Andrew's house with Jacob and John. Leaving the synagogue, he went to Simon and Andrew's house with Jacob and John.
1.30 Simon's mother-in-law lay sick with fever. Then they tell him about her. Simon's mother-in-law lay sick with fever. Then they tell him about her. Simon's mother-in-law lay sick with fever. Then they tell him about her. Simon's mother-in-law lay sick with fever. Then they tell him about her.
1.31 He went over, took hold of her hand, and helped her up. The fever left her and she began to wait on them. He went over, took hold of her hand, and helped her up. The fever left her and she began to wait on them. He went over, took hold of her hand, and helped her up. The fever left her and she began to wait on them. He went over, took hold of her hand, and helped her up. The fever left her and she began to wait on them.
1.32 In the evening after sunset they began to bring everyone who was suffering from sickness and the demonized. In the evening after sunset they began to bring everyone who was suffering from sickness and the demonized. In the evening after sunset they began to bring everyone who was suffering from sickness and the demonized. In the evening after sunset they began to bring everyone who was suffering from sickness and the demonized.
1.33 The whole town was gathered at the door. The whole town was gathered at the door. The whole town was gathered at the door. The whole town was gathered at the door.
1.34 He cured a lot who suffered a variety of sicknesses and got out a lot of demons. He did not allow the demons to speak because they had recognized him. He cured a lot who suffered a variety of sicknesses and got out a lot of demons. He did not allow the demons to speak because they had recognized him to be Christ. He cured a lot who suffered a variety of sicknesses and got out a lot of demons. He did not allow the demons to speak because they had recognized him to be Christ. He cured a lot who suffered a variety of sicknesses and got out a lot of demons. He did not allow the demons to speak because they had recognized him.
1.35 Getting up early while it was still dark, he left and went away to a deserted spot and prayed there. Getting up early while it was still dark, he left and went away to a deserted spot and prayed there. Getting up early while it was still dark, he left and went away to a deserted spot and prayed there. Getting up early while it was still dark, he left and went away to a deserted spot and prayed there.
1.36 Simon and those with him hunted him down. Simon and those with him hunted him down. Simon and those with him hunted him down. Simon and those with him hunted him down.
1.37 They find him and say to him, "Everyone is looking for you." They find him and say to him, "Everyone is looking for you." They find him and say to him, "Everyone is looking for you." They find him and say to him, "Everyone is looking for you."
1.38 He says to them, "Let's go somewhere else -- into the next towns -- so that I can campaign there too, because I came out for this." He says to them, "Let's go somewhere else -- into the next towns -- so that I can campaign there too, because I came out for this." He says to them, "Let's go somewhere else -- into the next towns -- so that I can campaign there too, because I came out for this." He says to them, "Let's go somewhere else -- into the next towns -- so that I can campaign there too, because I came out for this."
1.39 He was campaigning in their synagogues throughout Galilee, driving out demons too. He went campaigning in their synagogues throughout Galilee, driving out demons too. He was campaigning in their synagogues throughout Galilee, driving out demons too. He was campaigning in their synagogues throughout Galilee, driving out demons too.
1.40 A leper came towards him begging and kneeling to him, saying "If you want to you can make me clean." A leper came towards him begging and kneeling, saying "If you want to you can make me clean." A leper came towards him begging and kneeling, saying "If you want to you can make me clean." A leper came towards him begging, saying "If you want to you can make me clean."
1.41 Deeply moved, reaching out his hand he takes hold of him and says: "I want to. Be clean." Deeply moved, reaching out his hand he takes hold of him and says: "I want to. Be clean." Deeply moved, reaching out his hand he takes hold of him and says: "I want to. Be clean." Getting annoyed, reaching out his hand he takes hold of him and says: "I want to. Be clean."
1.42 Then the leprosy left him and he was cleansed. Then the leprosy left him and he was cleansed. Then the leprosy left him and he was cleansed. Then the leprosy left him and he was cleansed.
1.43 He told him off then sent him away. He told him off then sent him away. He told him off then sent him away. He told him off then sent him away.
1.44 He says to him, "Look, don't say anything to anyone. Instead, go off, show yourself to the priest, and offer what Moses commanded for your cleansing as proof to them." He says to him, "Look, don't say anything to anyone. Instead, go off, show yourself to the priest, and offer what Moses commanded for your cleansing as proof to them." He says to him, "Look, don't say anything to anyone. Instead, go off, show yourself to the priest, and offer what Moses commanded for your cleansing as proof to them." He says to him, "Look, don't say anything to anyone. Instead, go off, show yourself to the priest, and offer what Moses commanded for your cleansing as proof to them."
1.45 However, he went out and began much campaigning and spreading the word so that Jesus couldn't openly go into a city anymore but stayed outside in remote places. They came to him from everywhere. However, he went out and began much campaigning and spreading the word so that Jesus couldn't openly go into a city anymore but stayed outside in remote places. They came to him from everywhere. However, he went out and began much campaigning and spreading the word so that Jesus couldn't openly go into a city anymore but stayed outside in remote places. They came to him from everywhere. However, he went out and began much campaigning and spreading the word so that Jesus couldn't openly go into a city anymore but stayed outside in remote places. They came to him from everywhere.

Notes

  1. Sometimes the most frequently supported reading is the same in all four varieties even though variation exists, as at Mark 1.6, where two witnesses from cluster D have "leather" instead of "hair."

  2. A variation unit may affect more than one verse, as at Mark 1.7-8.

  3. The translation attempts to produce contemporary English while retaining the atmosphere of the Greek. Consequently, "change your attitude" is preferred to the archaic "repent," and "campaign" is preferred to the rarely used "proclaim" or less vivid "preach." The simple present is used to translate Mark's "historic present." (E.g. "He says to them...")

Isaac Newton said, "If I have seen further it is by standing on the shoulders of giants." This sentiment truly applies to the results presented here. Our field owes a great debt to those who have compiled the information, both printed and electronic, upon which the data and distance matrices are based.

Compiling the basic data from which analysis proceeds is an arduous and painstaking task. Richard Mallett deserves special thanks in this respect, having encoded data matrices and transcribed tables of percentage agreement from numerous sources. Mark Spitsbergen helped to encode the UBS4 apparatus data for the first fourteen chapters of Matthew.

Maurice A. Robinson kindly provided tables of percentage agreement for the Gospels and Acts. These are derived from the apparatus of the second edition of the United Bible Societies' Greek New Testament. The exacting task of transforming the data into electronic format was performed by Claire Hilliard and Kay Smith.

A number of the results are produced from comprehensive data generously provided by the Institut für neutestamentliche Textforschung in Münster, Germany. Researchers at the INTF have spent many years on the gargantuan task of compiling this data. Holger Strutwolf, Klaus Wachtel, and Volker Krüger were instrumental in providing access to the data.

The analysis would scarcely have been possible without the marvellous R Language and Environment for Statistical Computing. Finally, thanks go to Gerald Donker for suggesting that the RGL plotting library be used to produce three-dimensional CMDS maps. He also encouraged me to take a less procrustean approach to missing data. As a consequence, the analysis results presented here include many more witnesses than they otherwise would.



[1] Defining the limits of a variation site is a matter of editorial discretion. See Potential Computer Applications for a discussion of some approaches.

[2] Gerd Mink provides a definition of the term initial text in Problems of a Highly Contaminated Tradition, 25-26. Eldon J. Epp finds the term original text problematic, as discussed in his Multivalence of the Term 'Original Text.'

[3] See Analysis of Textual Variation for more details of the encoding conventions employed here.

[4] A distance matrix can be obtained from a table of percentage agreement by dividing each percentage by one hundred then subtracting the result from one. For example, a percentage agreement of 85% corresponds to a distance of 0.15.

[5] A sample size of fifteen corresponds to a confidence interval with a relative width of about half of the entire range of possible distances.

[6] Maechler and others, Cluster Analysis Basics and Extensions; diana method of the cluster package.

[7] Examples of phylogenetic analysis results are presented in Spencer, Wachtel, and Howe, The Greek Vorlage of the Syra Harclensis.

[8] See documentation relating to the pam method of the cluster package by Maechler and others, Cluster Analysis Basics and Extensions.

[9] Roderic Mullen has prepared an augmented version of Hurtado's table for P45; see the Mullen source entry for details.

[10] All witnesses that share the greatest number of missing entries are dropped in the same step.

[11] These coefficients are discussed in How To Discover Textual Groups.

[12] The R script named MVA-PAM-MSW.r produces a list of numbers of groups corresponding to peaks in the MSW plot with above-average values.

[13] Epp, Significance of the Papyri, 291.

[14] In this article, a trajectory refers to a line joining two endpoints in textual space. Eldon J. Epp introduced the term trajectory to describe a time sequence of witnesses with the same kind of text. See e.g. Epp's Twentieth-Century Interlude, 93.

[15] If a group has only two members then the PAM algorithm chooses one as the medoid, even though neither member of a two-member group is more central than the other.

[16] Group K of the 13-way partition is comparable to Wisse's Kx, described on page 94 of his Profile Method.

[17] See my Groups article for details of this statistical approach.

[18] See Gerd Mink's Problems of a Highly Contaminated Tradition and Introductory Presentation for an explanation of the CBGM. Phylogenetic analysis techniques such as described in Spencer, Wachtel and Howe's Greek Vorlage of the Syra Harclensis can also be used to investigate the priority of texts.

[19] See Silva Lake, Family Π, 15, for a list of manuscripts comprising this family. The purple manuscripts (i.e. 022, 023, 042, 043, and 080), dated to the sixth century, are deluxe copies written in letters of silver or gold on purple-dyed vellum. Wisse did not mention this group, probably because only one of its members (022) was included in his study. Streeter discusses the purple manuscripts group in his Four Gospels, 575-7.

[20] The same partition of the UBS2 data set for Matthew is given above and is repeated here for convenience. The UBS4 data set is not used as it only covers the first half of Matthew at present.

[21] Streeter (Four Gospels, 27, 32) thought that the Eastern and Western types each had two sub-varieties. His chart of MSS. and the Local Texts (108) lists supposed representatives of the two Eastern branches (associated with Antioch and Caesarea) and the two Western ones (associated with Italy/Gaul and Carthage).

[22] See e.g. Jacobus Petzer, Latin Version, 121, and Streeter, Four Gospels, 65.

[23] D and Latin h do not qualify as having significant distances even though other witnesses at the same distance (i.e. Theta; Latin aur, l, ff-1) do qualify. This happens because the textual states of D and h are not defined at as many places as the others.

[24] Minuscule 1241 is in the group corresponding to Family Π (i.e. group 2193) in the 24-way partition but is absent from the corresponding group of the 93-way partition. This implies that 1241 is not a core member of Family Π in Matthew.

[25] Brooks, Gregory of Nyssa, 264.

[26] According to Wasserman (Patmos Family, table 1, note 5), the present location of a MS in such a collection is likely to be its place of origin as well. This may be an overstatement.

[27] Silva New, Patmos Family of Gospel Manuscripts, 85.

[28] Wasserman's groups are presented in table 5.1 of his Patmos Family article. He arrived at these groups using the quantitative analysis method developed by Colwell and Tune.

[29] Wisse, Profile Method (52), classifies 05 as a member of his group B in all three test passages of Luke. He says (91), This group involving 15 MSS has traditionally been known as the Neutral or Alexandrian text-type; von Soden called it H for Hesychius.

[30] Four Gospels, 108.

[31] See the Profile Classification table in Wisse's Profile Method, 52-90. Wisse counts 017, 02, 034, 041, 1346, and 2411 as members of Family Π; he did not analyse 1421, 1500, 222, 732, or 863. Silva Lake, Family Π and the Codex Alexandrinus, 7-8, includes 1500 and 1780 in her list of Family Π members.

[32] Most of the links were established by reference to Wisse's Profile Method, 52-113. Even though Wisse's work relates to three sample chapters of Luke's Gospel, many of his manuscript groups persist in a wider context. The purple manuscripts group, not mentioned by Wisse, is described by Streeter, Four Gospels, 575-7.

[33] Streeter, Four Gospels, 27. He thought that the type was divided into branches associated with Antioch and Caesarea (108).

[34] Larry Hurtado (Text-Critical Methodology, 88) and Stephen Carlson (The Origin(s) of the 'Caesarean' Text, 20-21) have already noticed the Western leaning of Codex 038.

[35] It is hard to tell from this map whether the latter end point for 038 and 565 is the Vulgate or Old Latin cluster.

[36] Sanders, Freer Collection, 1:63-4. Sanders placed the textual change in the vicinity of 5.30 while Hurtado, Text-Critical Methodology, 19, places it around 5.6.

[37] Hurtado, Text-Critical Methodology and the Pre-Caesarean Text.

[38] Aland and others, Greek New Testament (4th ed.), 2*.

[39] Minuscule 2427, which is now considered to be spurious, has been dropped from the B cluster so that it does not affect decisions on cluster membership.
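
The arithmetic in notes [4] and [5] can be sketched as follows. This is a minimal illustration in Python, not the R code used for the analysis. The function names and the sample agreement table are hypothetical. The interval calculation also assumes a 95% level, a normal approximation, and the worst-case proportion of 0.5, none of which are stated in the notes.

```python
import math


def agreement_to_distance(percent_table):
    """Convert a square table of percentage agreement into distances.

    Each entry p (0 to 100) becomes 1 - p/100, as note [4] describes.
    """
    return [[1.0 - p / 100.0 for p in row] for row in percent_table]


def interval_width(n, p=0.5, z=1.96):
    """Width of a normal-approximation confidence interval for a proportion.

    Assumptions (not taken from the article): 95% level (z = 1.96) and
    the worst-case proportion p = 0.5.
    """
    return 2.0 * z * math.sqrt(p * (1.0 - p) / n)


# Hypothetical agreement percentages for three witnesses.
agreement = [
    [100.0, 85.0, 60.0],
    [85.0, 100.0, 70.0],
    [60.0, 70.0, 100.0],
]

distances = agreement_to_distance(agreement)
print(round(distances[0][1], 2))      # distance for 85% agreement
print(round(interval_width(15), 2))   # interval width for fifteen sites
```

Under these assumptions an agreement of 85% gives a distance of 0.15, and fifteen variation sites give an interval about 0.51 wide, which is roughly half of the range from 0 to 1.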