
General and Specific Effects of Training in Reading with Observation on the Experimental Technique

by Arthur I. Gates & Dorothy Van Alstyne - 1924

That "reading" is not a single or unitary power but merely a name for a large number of abilities, more or less specifically acquired, even if positively intercorrelated, is now rather commonly recognized. 1 Reading is reacting. The number of reactions that may be made to the printed word by a well-trained adult is large, with ever remaining possibilities of addition and combination.

Extreme cases will make the distinction clear. Contrast reading a manuscript for the purpose of getting the outstanding ideas with 'proof-reading.' Either of these types of reading may be further subdivided; each constituent ability must be, perhaps, mainly specially acquired. A wide reader may be a poor proof-reader. In his earliest efforts, he will experience difficulty in detecting misspelled words, errors of punctuation and grammar, the mistakes and defects in type, inconsistencies in mechanical composition and rhetoric. He may find it difficult enough at first to detect misspellings alone but this ability usually develops rapidly with specific practice. To react effectively to several types of errors in a manuscript is at first impossible, but such ability may be acquired like other "hierarchies of habits" familiar in typewriting, telegraphy, and other functions, the learning of which has been subjected to experimental analysis. Thus the skilled proofreader may be simultaneously sensitive to errors and inconsistencies of many types. Conceivably a year of service by the professional proofreader results in little gain in knowledge of, or capacity to comprehend, the type of material handled daily.

Comprehension is likewise little more than a convenient title for a list of mental reactions which may be made to printed material. We may read in innumerable ways to satisfy different purposes. To give a few examples: 1. We may read primarily to grasp the thought—to comprehend the bare facts given—but there are many forms of this type of reading. We may read unselectively, giving all words or ideas equal weight, or attempt to single out several main ideas, to perceive the logical outline, to detect the main idea, the topic sentence, the key phrase or word. In any of these cases, we may read with a passing interest merely, for temporary or more permanent retention. Teachers' requests to "read this so you can tell me all about it"; "read this so you can write an outline afterward"; or "read this to get the main idea," doubtless elicit quite different reading reactions among experienced pupils. Reading the directions for making a cake, or using a vacuum cleaner; reading the legends of pictures, graphs or maps; reading problems in arithmetic or chemistry may demand specific items of technique.

2. Re-reading affords opportunity for many types of skill—speed in skimming, flexibility in changing pace, skipping from place to place. Such abilities are doubtless variously combined when our purpose is to review evenly, now carefully, now sketchily; to pick up the thread of a story, to detect parts earlier overlooked, to verify an outline or a general impression.

3. Reading may be conducted to secure data bearing on some particular issue, to answer some definite question or assuage some doubt. Thus the teacher may ask the pupils to read a passage to answer some specific "Why," "When," "Where," "What" or "How"; or she may ask them to read for any information bearing on the question of immigration.  The adult reads not only to find facts to meet particular problems, but often to find anything that meets any one of many general needs. Thus the advertising writer reads widely, ever alert for "hunches" for a new display or selling talk; the student of education reads a book on ethics or biology, alive to points which may have an educational implication, or he reads a series of technical articles in the hope that a good problem for research may be suggested.

4. We may read mainly to recollect, to supplement the passage, as when we allow our mind to drift, while reading a poem, to scenes, persons, conversations, activities, which are merely suggested. These reactions may be more active, at least more forced, when our problem is to recall historical events or other poems which are relevant or similar and more restricted when we must attempt to "read between the lines," to divine the purpose of a letter, or discover the significance of an allegory or the moral of a fable. These are types of interpretative reading of the factual type that involves mental reactions quite distinct and probably more complex than plain comprehension. Interpretative reading may, and often does, involve anticipation—what is coming next, how will the story end, to what conclusion is the argument leading. All of these are types of reading purposes well worthy of attention in school.

5. One may read with reaction more emotional than intellectual, although both are usually combined. One may read mainly for a thrill of mirth, grief, anger, or excitement, to suppress or intensify melancholy. To ask "What is the emotional tone of this passage?" may give us a mental adjustment which may result in reactions to features unnoticed at other times. Similarly we may read to detect the rhythm, the literary style, and each of these types of reading is a fine art to which a lifetime of cultivation may be devoted.

These are not all of the kinds of reactions which may be evoked during "reading." Doubtless many who read these pages may be able to draw up a longer and better list. If they try they will be more impressed with the complexity and diversity of reading reactions that are, or may be, acquired.

The nature of these activities, especially their interrelations, how they facilitate and antagonise each other in practice, how the development of one influences the development of others, and what are the most economical and interesting teaching devices, are problems, practically important, on which we have but meagre direct evidence. Where guidance is needed, we must fall back upon general principles derived from studies of materials, such as perception of letters or digits, memorizing words or nonsense syllables, Latin, formal grammar or geometry, which are remote.

An experiment was designed, therefore, to ascertain some of the facts concerning the general and specific effects of training certain types of reading reactions. It is essentially an experiment in the transfer of training essaying to answer the question: How much and in what manner does the development of one type of reading ability influence other types? One of our main problems, however, was to evaluate certain forms of experimental technique by means of which further investigations, more extensive in scope, may be most satisfactorily pursued.

The pupils of the third, fourth, fifth, and sixth grades of the Horace Mann School served as subjects. There were three classes in each grade, two approximately equivalent and a third a half-year advanced.

The "control group" technique was adopted. At the beginning, all of the groups were given a battery of tests to which transfer was to be measured. One class in each grade was then given specific training in one form of reading; another in a second form of reading, and a third, the "control group," was given no special training during the practice period. At the end of the training period, all of the groups were given the same battery of tests as was used at the beginning. The specific improvement in the two types of practice and the transfer of each to other functions could then be ascertained, since the control group provides a measure of improvement due to regular school or home work, to general growth, and to improvement in scores on the initial and final tests attributable to those experiences alone.


1. The Paragraph-Question Method. One class in each of the four grades was given daily training in reading paragraphs for the purpose of answering questions concerning the important facts. The material consisted of 853 paragraphs, each containing in the vicinity of 10 lines, mimeographed one on a sheet, 8 ½ x 11 inches. The selections were chosen (most of them rewritten from originals selected from children's books) by the following criteria: They must be interesting and of suitable difficulty; they should be of factual rather than fictional content (most of them related interesting facts about animals, nature, science and travel); they must possess distinct unity; that is, each paragraph must contain only related and well-organized data and be a complete unit in itself. There were several series of such units falling under a single topic.

Each paragraph was followed by five questions concerning important facts in the passage which could be answered by a few words or by six statements to be answered by underlining the word "true" or "false."

The pages were assembled and bound into covered booklets, one for each child. At the end of each practice period, the booklets were collected, scored, and on the next period returned to the owner. For all grades the paragraphs were assembled in such a way that the easy and difficult passages were scattered at random throughout.

Practice consisted in reading the paragraphs and answering the questions as accurately and rapidly as possible during daily periods of ten minutes. When time was called each day, the pupil marked the place at which he was working with the date, placing in the booklet a marker to facilitate its discovery later. Next day the pupil began where he left off; often it was necessary to re-read the paragraph partly completed the preceding day.

The booklets, collected at the end of each period by the teachers, were scored according to a key booklet which allowed a wide margin for the teacher's judgment. Before each new practice period, the pupils were afforded an opportunity to study their errors and record the quantity of their achievement. They were encouraged to improve. The arrangements permitted each pupil to progress as well and as rapidly as he was able until the book was completed.

This type of training will be designated as the Paragraph-Question, or the P.Q. method. One score was the number of questions correctly answered; it represents the rate and accuracy with which material of uniform difficulty may be comprehended, in the sense of perceiving the important facts presented. A second score was the number of lines read per period; it represents speed without regard to comprehension.

2. Rapid Reading with Oral Recitations. For materials, books of selections appropriate to each grade and similar in content to the mimeographed paragraphs were used. The class in each case was told that they were to have some training in reading silently and in answering questions orally on what they read. They were told to read as fast as they could but only as fast as they could comprehend. The teacher chose a short selection which she thought three or four would finish in three minutes; or, if this were not possible, a selection was chosen which could be divided into approximately two three-minute periods. It was explained to the pupils that at the end of each minute they were to encircle the word that they were then reading, when the teacher said "Mark!" and go right on reading. If they reached the point indicated by the teacher as the end of the selection before she said "Stop!" they should stop reading and close their books. An oral recitation period was then conducted for approximately two minutes on the material just read.

This recitation period was conducted at different times in all of several ways as follows:—(1) Specific questions about the story were asked, modeled after those of the Thorndike-McCall Reading Scale; (2) several children were called on to tell different parts of the story in order; (3) a series of statements were made and the pupils were asked to tell whether they were true or false; and (4) the completion method was used; that is, a statement was begun and the pupils were asked to complete it according to the story. Any of these methods was combined with any other method or only one used for a practice period. Errors in the pupils' statements were, of course, corrected. The latter part of the selection was especially stressed in the recitation period, so that those who had not finished the reading would be able to understand the material when they started reading again. At the end of the two-minute period the class repeated the above procedure, so that the total time for each day's training, 10 minutes, would be divided into approximately six minutes of reading and four minutes of recitation.

The pupils kept records on which were marked the date and the average number of lines read per minute each day. They were told by the teacher that she was keeping a record of their recitation answers. They were encouraged to do better each day both in comprehension and speed.

This type of training will be called the Oral-Recitation, or the O.R. method. The scores were the number of lines read per minute; each day's scores being the average of six minutes' actual reading. No measures of relative degrees of comprehension were secured, but the teachers sought to keep comprehension at least as good as it was at the beginning.

The main differences between the two methods are: (1) the "paragraph-question groups" used shorter selections with more frequent and more objectively measured tests of comprehension; (2) the "paragraph-question groups" wrote their answers and later studied the corrections, whereas the other groups gave only oral responses; and (3) the paragraph groups could re-read the whole passage to find the answer to a question whereas the Oral-Recitation groups did not re-read after the beginning of the question period. The methods of scoring resulted in a measure of speed for both and a measure of comprehension for the P.Q. groups.

When approximately half of the members of a class had completed their books of paragraphs, the work for all the classes of that grade was stopped and the transfer tests, given at the beginning, were repeated.


To measure the amount of transfer, the following tests were given:—

1. Thorndike-McCall Scale for Ability to Understand Paragraphs, Form 8.  This scale consists of paragraphs followed by questions, and is similar to our "paragraph" material except that the materials range from easy to difficult passages.  It measures the degree of difficulty in material that may be comprehended.  Basal score is the number of questions correctly answered.

2. Monroe's Standardized Silent Reading Test, Revised, consists also of passages in order of difficulty with briefer demonstrations of comprehension. The author gives one method of scoring to yield a measure of comprehension; another, in which errors in comprehension are disregarded, to measure rate. Both were used.

3. Courtis Silent Reading Test, Form 3, Understanding of Paragraphs, is similar in form to our paragraphs and questions, except that the material is very simple. Score is number of "yes-no" questions answered correctly in five minutes.

4. Courtis Silent Reading Test, Rate, consists of a passage of words to be read continuously for three minutes.  This exercise was similar to the work of our "oral-recitation" groups.  Score is number of words read per minute.

5. Word-Perception Test, a test devised two years ago by one of the authors as one of a series to measure speed and accuracy of perception of words.  It consists of 24 blocks of words, of which the easiest and hardest follow:


The task is to select and underline the word in the group which is the same as the word in the margin. This test was constructed to measure the speed and accuracy of the mechanical operations of word perception. Score is number of exercises done correctly minus the number of errors.

6. Cancellation of Unlike Groups of Digits. This test consists of rows of pairs of numbers, mainly 5 digits each. The task is to cancel each number which is not identical with the number with which it was paired. Score is number of groups attempted minus three times the number of errors.

7. Picture-Naming Test—a page containing clear outline drawings of 70 familiar objects, such as hat, watch, kite. The task was to write the names of the objects under the pictures as rapidly as possible. Score is number correct.

Tests 5, 6 and 7 were introduced as experimental safe-guards and utilized in a study of the "control group" technique as will be explained later.
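The scoring rules stated for tests 5 and 6 above can be written out as a minimal sketch. The function names and the example counts are ours, introduced only for illustration; they are not data from the experiment.

```python
def word_perception_score(correct, errors):
    """Test 5: exercises done correctly minus the number of errors."""
    return correct - errors


def digit_cancellation_score(attempted, errors):
    """Test 6: groups attempted minus three times the number of errors."""
    return attempted - 3 * errors


# Hypothetical pupil: 20 exercises correct with 2 errors on test 5,
# 40 groups attempted with 3 errors on test 6.
wp = word_perception_score(20, 2)
dc = digit_cancellation_score(40, 3)
print(wp, dc)
```

The heavier penalty in test 6 means that a pupil who works quickly but carelessly loses more than is gained by the extra groups attempted.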


For our main problem, the determination of the relative amounts of improvement in the function practiced and transferred to other functions, it is necessary to secure measures of improvement which are comparable. We cannot compare gains in the gross scores of the several tests themselves since the units may have very different meanings; gains expressed as percentages of the initial score are also seldom comparable, since the gross size of the mean initial score gives an incomplete statement of the facts at that time and greatly influences the size of the percentages. For example, in one test the initial score (hypothetical cases) is 8, the final score 16, the gain 8; in another the initial score is 50, the final score 66. In gross scores, the gain for the second test is twice that of the first; stating the gain as a percentage of the initial score, the first gives 100, the second 32, or about one-third. A comparison of gains by either method is inadequate.
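The arithmetic of the hypothetical case just given can be checked with a short sketch. The function name `gross_and_percent_gain` is ours, used only for illustration; the scores are the two hypothetical cases from the paragraph above.

```python
def gross_and_percent_gain(initial, final):
    """Return (gross gain, gain as a percentage of the initial score)."""
    gross = final - initial
    percent = 100.0 * gross / initial
    return gross, percent


first = gross_and_percent_gain(8, 16)    # first hypothetical test
second = gross_and_percent_gain(50, 66)  # second hypothetical test
print(first, second)
```

By gross score the second test shows the larger gain, while by percentage the first does, which is why neither measure alone permits comparison across tests.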

The most satisfactory measure as yet devised for the purpose of comparing ability or gains in ability is based both upon the averages and some measure of the dispersion of ability in the groups. Of the several measures of dispersion, the standard deviation is most satisfactory and is here used. In detail, the procedure was as follows: (1) computation of the average (mean) performance of each group on each test; (2) computation of the standard deviation (S.D.) of abilities on the initial tests;4 (3) computation of gains in terms of gross scores by subtracting the average scores on the initial tests from the average scores on the final tests; (4) the conversion of these gains into multiples of the standard deviation on the initial test in question by dividing the average gross gains by the S.D. This yields the S.D. gains which are comparable. The main assumption here made is that these pupils differed from one another equally; i.e. showed the same degree of average variability in the several tests, hence a gain representing a change from the mean to + 1 S.D., a score above which are found in the initial test approximately 16.5 per cent of the children, is the same in one test as in another. This is, of course, the assumption upon which all test scores are objectively "scaled" in the construction of educational and other tests.5
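The four steps of this procedure may be sketched as follows. This is an illustration under stated assumptions, not the authors' own computation: the text does not say whether the S.D. was computed with N or N - 1 in the denominator, so the population form (`statistics.pstdev`) is assumed here, and the scores are invented.

```python
from statistics import mean, pstdev


def sd_gain(initial_scores, final_scores):
    """Mean gross gain expressed as a multiple of the initial-test S.D."""
    gross_gain = mean(final_scores) - mean(initial_scores)  # steps (1) and (3)
    initial_sd = pstdev(initial_scores)                     # step (2), assumed population S.D.
    return gross_gain / initial_sd                          # step (4)


# Hypothetical scores for five pupils; not data from the experiment.
initial = [4, 6, 8, 10, 12]
final = [8, 10, 12, 14, 16]
print(sd_gain(initial, final))
```

Because the result is a pure multiple of the initial dispersion, the same function may be applied to tests whose raw units differ.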


The total number of practice periods (10 minutes each) for the P.Q. and the O.R. groups were the same for a given grade, as follows: grade III, 11; grade IV, 17; grade V, 16; and grade VI, 14. Not all of the pupils were present every day, so without regard to date, or intervals between, the practice periods were pushed back to leave no gaps. Under period 3 or 4, then, are included all of the records for the third or fourth practice period, although they were not all done on the same day. The result of this procedure is that all absences appear on the last practice days, some of which are omitted entirely from the computations below.
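The "pushing back" of practice periods described above may be pictured with a small sketch. The representation of an absence as `None`, the function name, and the scores are our assumptions for illustration only.

```python
def compact_periods(daily_scores):
    """Drop absences (None) so the remaining scores fill periods 1, 2, 3, ..."""
    return [score for score in daily_scores if score is not None]


# Hypothetical record for one pupil over five school days;
# None marks a day on which the pupil was absent.
record = [12, None, 15, 17, None]
compacted = compact_periods(record)
# compacted[k] is now the score for practice period k + 1,
# whatever the calendar day on which it was earned.
print(compacted)
```

A pupil with absences thus has a shorter list than classmates, which is how all absences come to fall on the last practice periods.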

The specific gains from practice in the P.Q. training are given in Table I, in terms of gross units as well as in multiples of the standard deviations. In all grades the gains are as uniform as those usually found in such experiments and in all classes they are large. Grade 4 shows the largest gross gain, but in proportion to the number of practice periods, grade 3 excels. Grades 4, 5, and 6 improve with approximately equal rapidity. There is no evidence that any of these groups have approximated a limit of improvement. Averaging the results for all grades, we find the mean number of practice periods to be 12.25 and the mean S.D. gains to be 2.45. This figure represents the mean improvement in rate of absolutely accurate comprehension.

The gains in speed of reading, measured by the number of lines read per period, disregarding errors in comprehension, are given in Table II. The curves of improvement for rate thus measured are similar to those for number of questions correctly answered, but the total ascents are greater. There is an indication, not very pronounced, that the more advanced grades make, in equal time, a more rapid gain in rate. Combining the four classes whose average number of practice periods is 12.25 as before, the average S.D. gain is found to be 3.32, a figure somewhat greater than that for mean gain in number of questions correctly answered.




The fact that speed increased more rapidly than number of questions correctly answered is not in itself evidence that the former improved somewhat at the expense of the latter. The thoroughness of comprehension during the practice periods is indicated by the ratios of questions answered correctly to those attempted. We have not computed these figures for every period, but for the two initial periods, which are combined, and two final periods likewise combined, for each class. The raw scores represent the number of correct answers in twenty-five attempted answers. They are shown in Table III.

Comprehension becomes no more inaccurate; indeed, in all grades except grade 3, where it remains constant, it becomes somewhat more accurate during the practice which produced great advances in the rate.

Viewing the results of the three types of scores together, especially those for grades 4, 5, and 6, it is apparent that slightly differing emphases were given to rate and accuracy in comprehension. Grade 4, which made relatively small progress in speed, made the greatest progress in number of questions correctly answered. In Table III, it will be observed that this grade, first and last, was most nearly perfect in accuracy. Grade 6, relatively inaccurate at the beginning and end, although it improved in this respect, made the most rapid progress in uncorrected speed. Thus improvement depends, in a subtle way, on the direction in which practice is guided. These differences occurred, small though they are, in spite of our efforts to standardize all conditions.



The returns from specific practice by the Oral-Recitation method are given in Table IV. The results for grade 3 were omitted because of certain deficiencies in the records, which do not, however, indicate any deficiencies in the practice itself. The raw scores in this training portray speed. The gains are fairly uniform and large for each class; and more pronouncedly than the gains in speed by the P.Q. groups, the higher the grade the greater the gain. Thus for period 12, the S.D. gains are, for grade 4, 2.0; for grade 5, 2.5; and for grade 6, 7.7. There are reasons for believing that the results for grade 6, which are startling in magnitude, are as reliable as the others. This relation between improvement and grades was also found by O'Brien.6

To make the results of the two types of training as comparable as possible, the average gains are computed from such days, for each grade, as will make the number of practice periods approximately equal for both types. The average number of practice periods for the P.Q. groups was 12.25. Taking the 13th day for grade 4, the 12th for grade 5, and the 12th for grade 6, the average number of practice periods for the O.R. groups becomes 12.33. For the P.Q. groups, the average S.D. gain in number of questions correctly answered was 2.45; for rate of reading, errors in comprehension disregarded, 3.32; for the O.R. group the gain in rate of reading, errors in comprehension unmeasured, is 4.23. If we base the comparisons upon grades 4, 5, and 6 only (since results for grade 3, O.R. are unusable) the results appear in Table V. This is a fairer comparison but the results are relatively about the same. The most likely explanation, we believe, is that speed was given rather more, and accuracy in comprehension rather less, emphasis in the Oral-Recitation groups.



In the trained functions themselves, the improvement, stated in terms of comparable measures, multiples of the S.D.'s from the averages of the initial performances, differs with the methods used to measure raw gains, amount read, number of questions correctly answered or proportion of questions answered of those attempted. Gains in the transfer tests, those given before and after training, will likewise vary and as a consequence, the measurement of amount or per cent of transfer offers difficulties.

For a first presentation, the various transfer tests have been scored in what seems to us, or to the authors of the tests, to be the most reasonable manner to secure a measure of general ability. Since all groups took the transfer tests, the S.D.'s are based in each case upon the three classes of a particular grade combined. The mean gains for each class in raw scores are then computed; initial score subtracted from final, and the remainder divided by the S.D. to secure the S.D. gains.

In Table VI, the mean scores, the difference between them and the S.D. gains are given for all groups arranged under grades. Table VII gives the average S.D. gains for all grades before and after subtracting the gains for the control groups. The gains made by the control groups are due to improvement during the transfer tests themselves, plus that due to growth and the regular school instruction during the intervals between the initial and final tests. By subtracting these amounts, the improvement due to the paragraph-question or oral-recitation training is revealed.
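The subtraction of control-group gains can be illustrated with a brief sketch. All scores below are invented for illustration; the variable names are ours, and it is assumed, following the earlier description of the method, that the combined S.D. is taken on the initial test and in its population form.

```python
from statistics import mean, pstdev

# Hypothetical initial and final scores for the three classes of one grade.
initial = {"P.Q.": [10, 12, 14], "O.R.": [9, 12, 15], "control": [11, 12, 13]}
final = {"P.Q.": [13, 15, 17], "O.R.": [11, 14, 17], "control": [12, 13, 14]}

# S.D. based on the three classes of the grade combined (assumed: initial test).
pooled_sd = pstdev([s for scores in initial.values() for s in scores])

# S.D. gain for each class: mean raw gain divided by the combined S.D.
gains = {name: (mean(final[name]) - mean(initial[name])) / pooled_sd
         for name in initial}

# Improvement attributable to training: class gain minus control-group gain.
net = {name: gains[name] - gains["control"] for name in ("P.Q.", "O.R.")}
print(gains, net)
```

What remains after the subtraction is the part of the gain that cannot be credited to test practice, growth, or regular school instruction, since those influences acted on the control class as well.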

The training on paragraphs gave the greatest transfer to Thorndike-McCall and Monroe comprehension (0.45 S.D. in each case). Both of these tests were similar in form to the materials used in training. The specific gain on the training material for these groups, however, was 2.45 S.D. or 5.45 times the improvement on Thorndike-McCall and Monroe comprehension. These tests consist of passages constantly increasing in difficulty, and purport to measure the degree of depth or difficulty, as distinguished from rate of comprehension. The conclusion suggested is that the training marked by increased speed and efficiency in reading certain kinds of material of a moderate and uniform degree of difficulty increased but slightly the depth or power of comprehension.



S. D. Gains for all grades combined, together with the P. E. of the means.7

Next to these two tests, the greatest transfer from the P. Q. training was made to the Courtis Comprehension test, which is similar in form to the training material and of material uniform in difficulty. The material was, for our pupils, too easy; indeed, so trivial as to invite disinterest. Conceivably, habits of reading to get more important and complex facts from more significant material are quite different from those involved in reading childish material to answer trivial questions. The transfer is positive (0.33 S.D.) but less than one-seventh of the amount of gain in the practice material.

Although the P.Q. group increased markedly in rate of reading for the purpose of securing the answer to pertinent questions (the gain was 3.6 S.D. for number of lines read, comprehension disregarded), the transfer to two "rate" tests, the Courtis and Monroe, is small, 0.20 S.D. and 0.15 S.D. respectively. For the former, the transfer is but 5.6 per cent. In our opinion, it is very doubtful whether the Monroe Rate score represents rate or anything else of importance; we, therefore, attach little importance to that result. But in the slender spread of improvement from the training to speed on the Courtis test is an interesting suggestion that speeds in reading are diverse; children learn to read rapidly for one purpose without acquiring even approximately equal facility for other purposes. The constituents of reading are, perhaps, more numerous and specific than we had surmised.

From the Oral-Recitation groups, the transfer is distributed among the tests in strikingly different proportions. Whereas the P.Q. groups gained but slightly (0.20 S.D.) in Courtis rate, the O.R. groups gained markedly, 1.14 S.D. Of all the several amounts of transfer, the latter is the largest. Contrasted with specific gain in training that seemed so similar, the transferred improvement is less imposing. For grades 4, 5, 6, the only ones for which we have usable O.R. records, the specific gains average 4.6 S.D.; the average transfer for the same grades is 1.40 S.D.8 or 30.4 per cent. Between two interpretations, one, that the pupils were, because of the ease of the Courtis material, not stimulated to do their best; the other, that rapid reading in moderately difficult material is quite a different activity from rapid reading of very simple material, we cannot on present evidence decide. Perhaps both explanations are true in some proportion.
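The percentages of transfer quoted for the Courtis rate test follow from dividing the transferred S.D. gain by the specific S.D. gain in the trained function. A minimal sketch, with a function name of our own choosing and the figures taken from the text above:

```python
def transfer_percent(transfer_sd_gain, specific_sd_gain):
    """Transferred gain as a percentage of the gain in the trained function."""
    return 100.0 * transfer_sd_gain / specific_sd_gain


# P.Q. training: 0.20 S.D. on Courtis rate against 3.6 S.D. in lines read.
pq_to_courtis_rate = transfer_percent(0.20, 3.6)
# O.R. training, grades 4 to 6: 1.40 S.D. on Courtis rate against 4.6 S.D.
or_to_courtis_rate = transfer_percent(1.40, 4.6)
print(round(pq_to_courtis_rate, 1), round(or_to_courtis_rate, 1))
```

Rounded to one decimal place, these reproduce the 5.6 and 30.4 per cent given in the text.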

The transfer from O.R. to the comprehension tests, Courtis, Thorndike and Monroe, is meagre, measuring 0.33, 0.28 and 0.25 respectively. Comparing results for the comparable grades (4, 5 and 6) the average transfer to these three tests is about 6.1 per cent.

The O.R. training increased ability in the Monroe score for rate by only 0.08 S.D.; the lack of significance of the rate score in this test was mentioned above.

The different methods of measuring gain in the P.Q. training, and the omission of the scores for grade 3 in the O.R. training, make general comparisons of the magnitudes of transfer, as shown in Tables V and VI somewhat difficult. Table VIII has been constructed to display the facts more intelligibly.

What now appears to be the answer to the question: To what degree does training one reading reaction increase ability in others? It is necessary, as disclosed in the data above, to give the answer in terms of the particular reading reaction trained and the particular type of untrained reading one has in mind. The degree of transfer varies greatly. To answer the question generally (and admittedly crudely) we may consider the averages at the foot of Table VIII. The P.Q. training increases comprehension in other materials, when the method of scoring is the same in both cases, by 16.7 per cent; it increases rate in other materials, 5.25 per cent. The transfer of O.R. training to rate in the Courtis test is 22 per cent, and to comprehension, 5.2 per cent. In these consolidated results, however, many of the important facts are concealed.


The transfer of ability is small but genuine. It was suggested at the beginning of this article, on the basis of theory, and since verified by the experimental data, that reading is not a single unitary ability or power but a composite of many activities. It would, therefore, be inconsistent and improper to speak of the transfer of reading ability unless it is tacitly understood that we mean thereby not different amounts of some unitary power but various minute abilities in various amounts. If we accept, furthermore, Thorndike's theory of transfer, the theory that the training of one function influences another in so far as the two have identical elements, or involve identical abilities, we are driven, in order to make our findings more intelligible, to a search for the common elements or activities which account for the transfer manifest.

With this purpose in mind, three tests designed to measure abilities possibly constituent in one or both of the forms of training were included among the transfer tests when the experiment was planned. They were the word-recognition, the picture-name writing, and the digit-cancellation tests. The results of these tests (in Tables VI and VII) are illuminating.

The word-perception test was designed by one of the writers to measure, in as pure form as possible, sheer speed and accuracy of the perception or recognition of word forms. That it does measure this ability very well has been demonstrated elsewhere and in other ways.9 Both the paragraph-question and the oral-recitation forms of training result in an increase in this ability over and above that due to ordinary school training (control group); the O.R. more than the P.Q.: 0.27 S.D. and 0.18 S.D., respectively. Our interpretation of these facts is that reading in which rate is emphasized and improved results in a sharpening of the skill with which words are mechanically perceived. The oral-recitation method apparently gave more emphasis to, and probably but not certainly resulted in, greater improvement in speed, a result suggested by the measures of transfer in the word-perception test.10
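The comparison of the two gains may be illustrated by a minimal sketch. It uses only the figures quoted in the text and in footnote 10 (a difference of 0.09 S.D. with a P.E. of .07). The conversion of a probable error to a standard error by the factor 0.6745, the assumption of a normal distribution, and the function name are ours, not the authors'.

```python
from statistics import NormalDist

# For a normal curve, one probable error equals 0.6745 standard errors.
PE_TO_SE = 0.6745

def chance_difference_is_positive(difference, probable_error):
    """Chance that the true difference exceeds zero, assuming normality."""
    standard_error = probable_error / PE_TO_SE
    return NormalDist().cdf(difference / standard_error)

# Figures from the text: O.R. gain 0.27 S.D., P.Q. gain 0.18 S.D., P.E. .07.
difference = round(0.27 - 0.18, 2)
ratio_to_pe = difference / 0.07
chance = chance_difference_is_positive(difference, 0.07)
print(f"difference = {difference} S.D., {ratio_to_pe:.2f} times its P.E.")
print(f"chance the true difference is positive = {chance:.2f}")
```

On these assumptions the difference is somewhat larger than its probable error, which corresponds to roughly four chances in five that the true difference is positive. Since the P.E. as computed is held to be too large, the figure would be a conservative one.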

The cancellation-of-unlike-digits test involves neither the perception nor the writing of words. It was selected as a means of measuring more or less well the general technique of taking tests, which conceivably might be improved by the P.Q. and the O.R. groups. Just what devices constitute this technique, we do not know and can merely surmise: possibly such things as getting a quicker start, placing material in a better position, losing no time in curious surveys of the whole test, increasing control of attention or of excitement, and the like. Whatever they may be, these items are acquired in about equal, but slight, amounts by both practice groups. Part of the specific and transferred improvement, then, is not reading as such but improved technique in taking pencil-and-paper tests.

The picture-naming test was introduced to measure the improvement in ability to write a word or two as a response to a stimulus, under test conditions. The stimuli were purposely made easy and familiar—they were clear cut drawings of familiar objects—in order to measure the writing responses, rather than the activities of perception emphasized in the preceding test.

Table VII shows that the paragraph-question groups, which answered the questions by writing a word or two or by underlining a word, excelled the control group by 0.12 S.D., whereas the oral-recitation groups, which did no writing, gained in this ability somewhat less; the gain being 0.05 S.D. greater than that of the control group. In the case of the P.Q. group, at least, part of the specific improvement, and part of the transfer to other tests, may be not reading at all but improvement in writing words in response to stimuli under test conditions.

The gains for the O.R. groups were the same in the cancellation and Picture-Name Writing tests; both are due, presumably, to general technique in taking tests. For the P.Q. the gain in the Picture-Name Writing test is greater than in the cancellation test—0.12 S.D. and 0.03 respectively. The former test gives play not only to writing under test conditions, but also to the other features of technique which produce the smaller gain. The difference11 (0.09 S.D.) apparently measures roughly the influence of increased skill in writing word responses during tests. Since the O.R. group had no such practice, it reveals no gain in this test beyond that due to general technique.
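The subtraction just described may be set out as a short sketch, using only the gains quoted in the text (in S.D. units over the control group). The O.R. cancellation gain of 0.05 is inferred from the statement that the two O.R. gains were the same; the dictionary layout and function name are hypothetical.

```python
# Gains over the control group, in S.D. units, as quoted in the text.
GAINS = {
    "P.Q.": {"picture_name_writing": 0.12, "cancellation": 0.03},
    "O.R.": {"picture_name_writing": 0.05, "cancellation": 0.05},
}

def writing_component(group):
    """Gain attributable to writing word responses, beyond general technique."""
    gains = GAINS[group]
    return round(gains["picture_name_writing"] - gains["cancellation"], 2)

for group in GAINS:
    print(group, writing_component(group))
```

For the P.Q. groups the remainder is 0.09 S.D.; for the O.R. groups, which did no writing, it is zero.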

The transfer of abilities that are intrinsically reading abilities is, then, somewhat less than indicated by the figures in Table VII or VIII. It may be urged, with respect to general efficiency in life, that it makes no difference whether the improved ability in the work of reading-to-get-the-facts is mainly improvement in some intellectual process or activity, in facility of sheer word-perception, in speed of jotting down notes, or in other matters of adjustment to the task—time saved in one way is as good as time saved in another. For the science and practice of education, however, it makes all the difference in the world to know just what we are teaching.

To ascertain more closely the transfer of abilities intrinsic to reading, we may take the data of Table VII and treat them as follows: From the specific gain in P.Q. training, subtract 0.12 S.D., the approximate amount due to acquisition of general efficiency in taking tests and in writing word responses under test conditions;12 from the specific gains in O.R. subtract 0.05 S.D., the amount due to general test-technique training; from the transfer reading tests taken by the P.Q. groups subtract 0.12 S.D. if answers were made in writing, and 0.03 S.D., the transfer of general technique, if there were no written responses; from the transfer reading tests for the O.R. group subtract 0.05 S.D., the amount of transfer due to factors not intrinsically reading ability. The resulting values are given in Table IX.
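The rule of correction may be sketched as follows. The allowances are those stated in the text; the function name, the table layout, and the gross gain of 0.40 S.D. used in the example are hypothetical and are not taken from Table VII or Table IX.

```python
# Allowances (S.D. units) stated in the text, keyed by
# (training group, whether the test was answered in writing).
ALLOWANCE = {
    ("P.Q.", True): 0.12,   # general technique plus writing word responses
    ("P.Q.", False): 0.03,  # general technique only
    ("O.R.", True): 0.05,   # O.R. groups: general technique only
    ("O.R.", False): 0.05,
}

def corrected_gain(gross_gain, group, written_response):
    """Subtract the part of a gain not intrinsic to reading ability."""
    return round(gross_gain - ALLOWANCE[(group, written_response)], 2)

# Illustrative gross gain only; not a figure from the study.
example_gross = 0.40
print(corrected_gain(example_gross, "P.Q.", True))   # written answers
print(corrected_gain(example_gross, "P.Q.", False))  # no written answers
print(corrected_gain(example_gross, "O.R.", False))
```

The specific P.Q. gain is treated as a written-response case, since the P.Q. groups answered by writing or underlining a word.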

The corrected estimates are slightly smaller than those based upon the gross scores. They, or rather the considerations which lead to them, are of value mainly in disclosing the extreme complexity of the problem of appraising the transfer of training. By specialists in this field of inquiry, it will be freely admitted that much of the experimentation in the past has been crude, although not always misleading. The demands of education, at least, are for results that require methods more refined. This study provides a few suggestions of service to that end.

The ordinary control group technique, it is now apparent, is not wholly adequate for the measurement and analysis of the transfer of improvement. Failure to control the transfer of improvement in mere technique in taking tests, which has been quite universal in investigations of the past, has given a fictitious appearance of transfer in general ability to add, do geometry, grammar, etc., where it did not exist or where it existed in smaller quantities. It is probable that improvement in ability to take tests generally makes the total transfer appear larger than that due to gain in addition, geometry, etc., per se.

It is suggested, therefore, that in future work the investigator should scrutinize his tests with care, considering them not merely as measures of ability to read, add or whatnot, but as measures of various abilities: to perceive words or digits, to adjust to a certain kind of task under time pressure, to control attention, to write words or figures in such and such a way, and so forth. Where the interest is in measuring reading or some other abilities specifically, control for all of the other factors must be provided and these extrinsic abilities allowed for.


The results of the study point unequivocally, we think, to certain practices in the teaching of reading. First, it is apparent that instruction and practice in reading in a general way (mere reading) does not guarantee the development of all the important types of reading ability; indeed, it almost certainly will not do so. One of the writers, during the past four years, has studied the reading abilities of a number of graduate students at Teachers College who complained of slowness in rate. These cases frequently disclosed more than mere slowness; they showed impressive deficiencies in abilities to read in each of several important ways: to skim, and to pattern variously the rate and type of reading reactions to the kind of material or the purpose of the moment. Training in reading must be many-sided; many types of reactions must be acquired.

Second, while transfer from one type of reading to others is genuine and usually positive, it is so small that it cannot be depended upon to develop desired abilities. They must be developed specifically. We may accept with gratitude the increments from transfer; but never be willing to accept them as a substitute for direct training.

Third, it is apparent, in the results of this study, that a most economical and interesting way to develop reading abilities consists in utilizing the principles of 'practice experiment' in forms similar to but superior to those used here. Compared to the regular class-room technique, not only were the direct returns for practice in the two experimental groups relatively very great, but the indirect returns, through transfer, were also relatively great. The "control groups," it will be recalled, were not groups receiving no school instruction in reading but groups receiving regular classroom work.

The "practice experiment" procedure embraced five main features: (1) the selection of materials carefully adjusted in difficulty, utility and interest to the group; (2) provision of devices for testing and stimulating the type of comprehension to be trained; (3) a time control to provide a measure of and incentive to increase in efficiency; (4) the keeping of records of typical errors, defects, deficiencies, and progress by the pupils; and (5) adequate administration of practice and reviews.

Of these requirements, the first and last merit special attention; we take them up in reverse order.

The method of training with practice materials often utilized in schools consists in a series of short, intensive daily exercises, after the fashion of the practice experiment. This is desirable, and especially in remedial work intensity of application is a requirement of utmost importance. Having given one intensive continuous series of practices, the tendency has been to let it go at that. A type of reading, like any other acquired ability, dies out with disuse. Reviews, not always necessarily as long as the original practice, and often most economical when given at constantly increasing intervals, are quite as necessary as they are in the mastery of facts in history, multiplication tables or skill in typewriting.13

In the selection and mechanical arrangement of materials of proper difficulty, interest and utility, much ingenuity has already been displayed but more is needed.14 Many of the exercises are patterned after the forms of educational and psychological tests, which is desirable except when the difficulty, interest or utility of the material or the type of reading reaction is overlooked. For example, one may question, mainly on grounds of utility, the value of such exercises as the following for purposes of teaching reading.

Rearrange the following disarranged sentences:15

1. The countries circling / "the world" meant / conquered Hannibal / around the Mediterranean Sea / when the Romans / at the time, etc.

Mark out one word out of place in each line:16

But in these and other articles and books are contained excellent types of practice materials designed to cultivate not only direct comprehension of facts but also simultaneously abilities to interpret, evaluate, classify, organize, reflect, imagine, appreciate. These are types of hierarchies of habits for the development of which more materials are needed.

1From the Research Department of the Horace Mann School of Teachers College. In this paper are not included certain outcomes of the study which, because of their technical character, will be published elsewhere.

2 With the cooperation of the following teachers in the Horace Mann School: Miss Mary E. Baker, Miss Jean Betzner, Miss Elsa Beust, Miss Ida M. Robbins, Miss Margaret Condry, Miss Honora M. Frawley, Miss Marie Hennes, Miss Mary F. Kirchwey, Miss Mary R. Lewis, Miss Ethel M. Orr, Miss Mary G. Peabody, and Mrs. Siegried Upton.

3Grade 3 used 36 paragraphs, the others being judged too difficult by the teachers.

4For the measures of ability in Paragraph-Questions and Oral-Recitation, the S.D.'s were based on the scores for one class (20-25 pupils), each grade being treated separately. The S.D.'s of ability in the transfer tests were based on the score for three classes, combined, at each grade level, since the Paragraph-Questions, the Oral-Recitation, and the Control Groups took these at the same time.

5Readers who are not familiar with these statistical devices will not, on that account, have difficulty with the following sections. Those who wish a more detailed statement of the procedure may consult the texts of Kelly, McCall, Rugg or Thorndike.

6J. A. O'Brien, Silent Reading. New York: Macmillan, 1921, pp. 190 ff.

7The computation of the measures of reliability of the averages and differences between averages occasioned some difficulty. The means in the above table are averages of averages. A P.E. might be obtained by averaging the S.D. of each of the four groups around its own average and divide it by .67 [formula image not reproduced]. This P.E. would be much smaller than the ones shown in the table, which were computed as follows: (1) compute average of the four averages for each test given in Table V; (2) get S.D. of these from the average; (3) divide by .67 [formula image not reproduced]. Since the several grades practiced for different lengths of time, the variation from the common average is much higher than would be produced by the influence of chance errors alone. All P.E.'s are therefore too large. The effect will be to hold us to a conservative account of differences, which is not without merit.

8This figure is not the same as that given in Table V; it is the mean of the S.D.'s on the last practice days, as given in Table IV.

9See Gates, A. I. The Psychology of Reading and Spelling with Special Reference to Disability, New York: Teachers College, Bureau of Publications, 1922, and in articles to appear later.

10The difference is 0.27 - 0.18 = 0.09 ± P.E. .07. Since this P.E. as computed is doubtless too large, it is fairly probable that the difference is significant.

11The P.E. of this difference is less than 0.06.

12Perhaps the amount subtracted should be larger, since “technique” and “ability to write word responses” will not themselves transfer completely to other tests, but we cannot, from our data, tell what the specific gains in these respects were.

13For some of the principles involved, see Gates, A. I. Psychology for Students of Education, New York: Macmillan, 1923, pp. 257-262, 286-293.

14For example: Jean Betzner. "Materials for Practice in Silent Reading," Horace Mann Studies in Education, vol. 1, April, 1923. S. S. Brooks, Improving Schools by Standardized Tests. Boston: Houghton Mifflin, 1922, pp. 143-246. Educational Bulletin, Minneapolis Public Schools, No. 2, May, 1923. Mary R. Lewis. "How a Teacher Helped Poor Readers to Improve," Horace Mann Studies in Education, vol. 1, April, 1923. Twentieth Year Book of the National Society for the Study of Education. Part II, Section 2, 1921. J. A. Wiley. Practical Exercises in Supervised Study and Assimilative Reading, Cedar Falls, Iowa: Iowa State Teachers College, 1922.

15Recommended by S. S. Brooks, op. cit., p. 222.

16From the Minneapolis Bulletin, op. cit.

Cite This Article as: Teachers College Record Volume 25 Number 2, 1924, p. 98-123
https://www.tcrecord.org ID Number: 6026, Date Accessed: 1/22/2022 10:35:23 PM
