Chapter II
Review of the Literature
In order to provide a better perspective on the literature relevant
to this study, several areas were addressed. These areas relate directly
to the current trends in second language teaching and testing that form
the foundation of the present study. First, sociolinguistic research directly
related to this study is presented. Second, the historical background of
various second language testing approaches, in conjunction with teaching
methodologies, is discussed in order to examine the relationship between
second language teaching and testing. Finally, the current literature,
the functional approach to modern language teaching and testing, the notion
of functional competence, and its relationship to other language competencies
are presented.
Sociolinguistic Research
Language professionals have paid careful attention to the social aspects of language. The impact of sociolinguistic studies, however, has not yet fully reached the language classrooms.
Language and its context. The notion that contextual factors,
social and otherwise, must be taken into account in determining the acceptability
and interpretation of sentences is relatively new (Lakoff, 1972). It was
only in 1923 that Malinowski pointed out that language is far from being
self-contained; in fact, it is entirely dependent on the society in which
it is used (Ho, 1981). Each language has evolved in response to the demands
of a given society, so its nature and use in that society are entirely
context-bound or context-dependent. Hymes (1972) contended that there
are rules of use without which the rules of grammar would be useless, and
he suggested that the notion of Chomsky's competence be enlarged to include
contextual appropriateness. In order to account for language use, sociolinguists
have come up with a variety of models. In place of Chomsky's dichotomy
of competence and performance, Hymes (1967) offered a four-fold distinction
that he suggested be included in an adequate theory of language use:
1. whether (and to what extent) something is formally possible,
2. whether (and to what extent) something is feasible by virtue of the
means of implementation available,
3. whether (and to what extent) something is appropriate in relation to
a context in which it is used and evaluated, and
4. whether (and to what extent) something is in fact done, actually performed, and what its doing entails.
These four sectors of his communicative competence model reflect the speaker-hearer's grammatical (formally possible), psycholinguistic (implementationally feasible), sociocultural (contextually appropriate), and de facto (actually occurring) knowledge and ability for use (Pride & Holmes, 1972, p. 281).
Hymes (1967) used the speech event as the smallest unit of analysis and described its components as follows:
SETTING or SCENE: time and place; also psychological setting and cultural definition as a type of scene,
PARTICIPANTS or PERSONNEL: addressor-addressee, audience,
ENDS: ends in view (goals, purposes) and ends as outcomes,
ACT CHARACTERISTICS: the form and the content of what is said,
KEY: the tone, manner, or spirit in which an act is done,
INSTRUMENTALITIES: channel (the choice of oral, written, telegraphic, or other medium) and code (Spanish, English, etc.) or subcode (dialect, sociolect),
NORMS OF INTERACTION and of INTERPRETATION: specific behaviors and properties that may accompany acts of speech, as well as shared rules for understanding what occurs in speech acts,
GENRES: categories or types of speech acts and speech events, such as conversation, curse, prayer, lecture, etc.
Another strand of research in language use is what is generally referred to as "functionalism" (i.e., language as action), of which Austin (1962) was one of the forerunners. According to Austin, there are three types of speech acts: (a) locutionary acts, (b) illocutionary acts, and (c) perlocutionary acts. A locutionary act is an utterance with a certain sense and reference, that is, the utterance is meaningful. All meaningful utterances are locutionary acts. But a speech act may also be an illocutionary act, because it may do one of the following: announce, state, assert, describe, admit, warn, command, congratulate, comment, request, apologize, criticize, approve, thank, promise, regret, and so on. Or it may be a perlocutionary act, one that brings about or achieves some other condition or effect by its utterance, for example, an act that convinces, deceives, encourages, bores, inspires, irritates, persuades, deters, surprises, or misleads someone. Perlocutionary acts pertain to effects produced on the addressee. Perlocutions may be intended or unintended, whereas illocutions are always intended (J. Jain, personal communication, December 2, 1986).
All of the functionalists agree that most utterances are multifunctional, which means that what is grammatically the same sentence may be a statement, a command, or a request; what are grammatically two different sentences may, as acts, both be requests (Hymes, 1972). Scholars and researchers in the field of sociolinguistics have pursued yet other paths of investigation. There are studies on classroom language behaviors (Black & Butzkamn, 1978), politeness formulas (Brown & Levinson, 1978; Goody, 1978; Lakoff, 1973), interaction between topic, listener, and language (Ervin-Tripp, 1964), relation between grammatical structure and persuasiveness (Cantor, 1979), different forms of directives and their usage (Ervin-Tripp, 1976), adults' understanding of direct and indirect speech acts (Hosman, 1978), relation between language varieties and social situations (Gregory & Carroll, 1978), and others.
In her article on the logic of politeness, Lakoff (1975, p. 299) suggested two rules of pragmatic competence: Rule 1--be clear, and Rule 2--be polite. Later she further divided Rule 2 into three rules of politeness: (a) use passive and impersonal expressions, (b) use expressions such as sort of, I guess, or euphemisms and so on, and (c) use expressions such as like, y'know, I mean, and so on. The speaker can give options to the addressee or soften the effect of a statement in concession to a possibly different opinion of the addressee (Shimazu, 1978, p. 33). In request situations, her rules of politeness can be combined into one: use request questions, so that the addressee does not feel pressured and has the option to say "no." The rule of politeness could also be restated as follows: use ways of addressing and greeting to achieve contact and to show deference.
Brown and Levinson (1978) intuitively examined language usage obtained by recording informants and came up with a detailed description of politeness strategies. The strategies pertaining to request situations are the use of tactful indirection, hedging, minimizing the imposition, showing deference, exaggeration, reducing ego, and apologizing. It seems probable that, when making requests, a person usually positions himself or herself lower, using a high pitch of voice as a sign of entreaty (Ho, 1981).
Both Ervin-Tripp (1964) and Goody (1978) have investigated status and
rank in the communicative event. They have found that status and rank play
an important role in language variation. In a recent discourse-analysis
study of directives, including those that serve the speakers and those
that regulate the addressees, Ervin-Tripp (1976) categorized directives
into the following five categories:
1. Imperatives like "Bring me a sweater."
2. Embedded imperatives like "Could you bring me a sweater?"
3. Question directives like "Have you got a sweater here?"
4. Statements of need like "I'm cold."
5. Hints like "It's a cold night."
She found that the addressee's status or rank relative to, and familiarity
with, the speaker are salient features that influence the use of different
forms of directives. Different forms of directives tend to be used in different
situations as a function of the degree of familiarity between the interlocutors
and the size of the status discrepancy. For example, imperatives are typically used when the addressee is familiar with the speaker or of a lower rank than the speaker, whereas hints are used when the addressee is a familiar person or someone of a higher rank than the speaker (Ervin-Tripp, 1976).
Ervin-Tripp (1976) claimed that directives are especially rich in alternations, possibly because the speaker is asking some action of the listener that involves varying degrees of effort. In general, the higher the cost of goods or service, the greater the option offered to the addressee. Typically, as cost goes up, or the task difficulty increases, one moves from imperative to request question and then to statement. In terms of elaboration, the form of address and style used while talking to a person of higher rank are more elaborate. To put it plainly, there is more elaboration when speaking to someone of higher rank or someone less familiar. She found that the devices used to signal social distance or unfamiliarity tend to be those used to indicate higher rank. When "familiarity" and "rank" are weighed together, familiarity is a stronger element in language choices; familiarity neutralizes rank (Ervin-Tripp, 1976).
Hosman (1978) also found that familiarity often overrides rank differences in making language choices. In other words, when a person is very familiar with the addressee, rank differences become insignificant (Ho, 1981, p. 45). Similarly, the results of Farhady's (1980, p. 101) study of how university students interact with professors in academic situations showed that the status of the interlocutors made no difference in the response patterns. Familiarity and social relation, even when not explicitly stated, are expressed or implied in the context in such a way that native speakers of English can easily recognize them. Rank, status, age, and gender-related restrictions are therefore considered to be minor determinants (Ho, 1981) of the response patterns in English, even when the interlocutors are unfamiliar with one another.
Another area of interest in the current study involved laboratory experiments that tested the persuasive implications of specific grammatical variations with 52 male and female undergraduate students at the University of Pennsylvania (Zillmann & Cantor, 1974). The subjects were recruited by means of an announcement posted on the university campus. Rhetorical agreement questions (i.e., "Isn't that ridiculous?") were shown to be more persuasive than the same utterances in statement form (i.e., "This is ridiculous"). Rhetorical concession questions (i.e., "What could be better?") were found to enhance persuasiveness, compared to the statement form (i.e., "Nothing could be better"), when the hearer was favorably predisposed toward the position advocated, but to reduce persuasiveness when the hearer was negatively predisposed. One intervening variable that has been thought to mediate the effects of these and other grammatical variables is the degree to which the forms appear to put the listener under pressure to agree or to comply.
These findings match what Ervin-Tripp and Goody claimed about the effectiveness, in terms of politeness, of embedded imperatives over direct imperatives.
Cantor (1979) set up a study to test the effects of grammatical form variations in door-to-door solicitations for funds. Four forms of request were used by 56 solicitors (38 male and 18 female university students) in a field experiment that was run as part of the American Cancer Society's annual fund drive in Madison, Wisconsin. Persuasion was studied using an obviously valid behavioral measure of compliance (i.e., the money donated). The results showed that, in the context of conventional, polite solicitation approaches to Madison residents, the more pressure associated with the grammatical form, the greater its success. The polite imperative (i.e., "Please contribute to our fund") was the most successful, followed by the agreement question (i.e., "Won't you contribute to our fund?"), the information question (i.e., "Would you like to contribute to our fund?"), and the statement (i.e., "We are asking you to contribute to our fund"). The degree of pressure decreases in the same order.
The results reported in Cheek's dissertation (1974) revealed discrepancies between adult second language learners' responses in German and those of native German-speaking Gymnasium students in certain given situations. It was shown that different situations demanded different styles or registers and that most native speakers could come up with the appropriate responses whereas language learners could not. A follow-up study (Cheek et al., 1975) examined adult students of various language backgrounds and found that situations or social settings determine the choice of appropriate language. Thus, various social situations should be incorporated into ESL teaching and testing. These situations include expressing irritation, offering assistance, apologizing for forgetfulness, making polite refusals, expressing suspicion, and so on.
The competence and performance distinction. In the mid-1960s, Chomsky (1965) introduced the terms "competence" and "performance" to language professionals and practitioners. Since then these two terms have been frequently discussed in the linguistics literature. Chomsky (1965) stated, "We thus make a fundamental distinction between competence (the speaker-hearer's knowledge of his [or her] language) and performance (the actual use of language in concrete situations)" (p. 4). Competence can be equated with a native speaker's tacit or implicit grammar, which is concerned with the linguistic components of language and generates only grammatical sentences, whereas performance concerns how language is actually produced and used (Jakobovits, 1970).
It is useful to look at these two terms from a testing perspective. Although scholars in linguistics distinguish between the two, the distinction may not be directly relevant in language testing. The question is whether one is capable of measuring the examinee's language competence.
The purpose of measurement is to determine how people perform a task. No matter what their competence in that particular task may be, evaluating their competence will only be possible by observing their performance. Theoretically one could be quite competent in one task but unable to use that competence in order to perform the task appropriately and accurately. One example may be a foreign student who knows many grammatical rules and exceptions to the rules but has difficulty in producing a coherent and appropriate utterance using these rules. Under such circumstances, existing evaluation techniques will fail to assess the examinee's competence accurately because only his or her performance can be observed and measured. Therefore, testing is concerned with performance and not with competence. Such a limitation, however, does not rule out the distinction between the two terms. Language test developers should avoid theoretical arguments regarding the differences between these two concepts. One could claim that competence is not testable because it is not a directly observable behavior. The existence of competence in a task is a necessary but not a sufficient requirement for performing the task. Whenever the term "testing" appears with the term "competence," testing of the manifestation of that competence (i.e., performance) is implied rather than the competence itself. Some sociolinguists believe that the theory of linguistic competence ignores the appropriateness or sociocultural significance of an utterance (Halliday, 1978; Hymes, 1967; Munby, 1978). Hymes (1972) went one step further and stated that "there are rules of use without which the rules of grammar would be useless" (p. 278). The primary function of language is to enable human beings to communicate appropriately and adequately rather than to be an isolated object of inquiry. Applied linguists have coined the term "communicative competence," which is somewhat different from making a distinction between competence and performance.
Communicative competence is primarily concerned with the knowledge or capability of a person to appropriately coordinate the rules of language structure and the rules of language use. Munby (1978) assumed that linguistic competence is an essential part of communicative competence. Most linguists and language specialists (Jakobovits, 1970; Oller, 1973; Widdowson, 1978) agreed that communicative competence includes linguistic competence and some two or three other competencies relevant to language use. Recent studies in discourse analysis and second language acquisition (Canale & Swain, 1980; Chafe, 1980; Gumperz, 1982; Hatch, 1978; Tarone, 1979) have revealed important information with respect to how people carry out a meaningful communicative act. One major area is the communicative strategies used by the speakers to handle various features of communication. Canale and Swain stressed the significance and the relevance of strategic competence to communicative competence. They stated that
No communicative competence-theorists have devoted any detailed attention to communication strategies that speakers employ to handle breakdowns in communication; for example, how to deal with false starts, hesitations, and other performance factors, how to avoid grammatical forms that have not been mastered fully, how to address strategies when unsure of their social status. In short, how to cope in an authentic communicative situation and how to keep the communication channel open. (p. 50)
Canale and Swain (1980) introduced "strategic competence (or communication strategies)" as an additional component of communicative competence.
A three-component model of language competencies or communicative competence--(a)
linguistic, (b) sociolinguistic, and (c) strategic competencies--has been
established. Thus the task of applied linguists has become more complex.
Although most linguists and language experts agree that communicative competence
includes other competencies--linguistic, sociolinguistic, and strategic--there
is no empirical evidence to support such a hypothesis (Farhady, 1980, p.
26). Farhady (1980) used the term "functional competence." By this, he meant to limit the domain of communicative competence, and he proposed a more simplified and clearly specified form of language competence that could account for all the selected language competencies.
Overview of Language Testing and Teaching
Testing procedures and teaching materials have been influenced by teaching methodologies developed from different theoretical doctrines throughout the history of language education. Because each teaching method has given certain priorities to the relative importance of each language component, a clear-cut distinction between teaching methods and testing methods has not existed. Because there have been long periods of overlap and competition among different methods at different times, a chronological ordering of methodologies for teaching and testing does not seem to be applicable. The history of language education indicates that there have been various trends in language testing. There was not a well-established theory for language testing regarding the distinction between different types of competence and performance before the recent theoretical developments in linguistics and psycholinguistics. The tests developed during those periods included more subjective measures such as translation and essay-type questions. With advances in applied linguistics, the trend shifted toward the development of psychometrically sound tests. In recent years, the requirements of a good language test have been theoretically expanded to include psychometric and communicative factors (Farhady, 1980, p. 28). Spolsky (1978) stated:
It is useful, through an over-generalization, to divide language testing into three major trends, which I will call the pre-scientific, the psychometric-structuralist and the integrative-sociolinguistic. The trends follow in order but overlap in time and approach. The third picks up many elements of the first, and the second and the third coexist and compete. (p. v)
Spolsky's classification is illustrated in Table 1.
The prescientific period. The prescientific period refers to
the period prior to the application of principles of educational psychology
to language testing (Farhady, 1980, p. 29). Instruments developed in this
period could be characterized as lacking such properties as reliability
and objectivity (Spolsky, 1978). These tests derived from the old grammar-translation method of teaching foreign languages (Farhady, 1980).
Table 1
Classification of Teaching and Testing Approaches
_____________________________________________________________________
Teaching                 Testing                        Types of the Test
_____________________________________________________________________
Grammar-Translation      Pre-scientific                 Translation, essay exams
Audio-Lingual            Psychometric-Structuralist     Discrete-point
Cognitive                Integrative-Sociolinguistic    Integrative
Notional-Functional      Functional-Communicative       Functional-Communicative
_____________________________________________________________________
Note. From Justification, development, and validation
of functional language testing (p. 29) by H. Farhady, 1980, Ann Arbor,
MI: University Microfilms International. Copyright 1980 by H. Farhady.
Reprinted by permission.
The major goal of the grammar-translation method was to teach the grammar of the language. Such a teaching method, which ignored fundamental language skills such as speaking and listening, resulted in the development of tests that examined what was taught. Neither the method nor the tests dealt with language as communication. The tests developed and used at this time were "composition" and "dictation" in the target language. Because the dictation was administered through a word-by-word reading, it was a spelling test rather than a dictation. The tests at this time also included grammar-translation tasks based on literary passages, which examinees were required to translate into or from the target language (Briere, 1972). The lack of objectivity and consistency in the scoring methods was the most serious deficiency of these tests. Many irrelevant factors, such as stylistic preference in composition, accuracy of spelling in dictation, and the purpose of the task in translation, intruded on the accurate measurement of student language proficiency (Briere, 1972). The lack of objectivity in scoring methods and the unsystematic testing techniques made it difficult to determine empirically the statistical characteristics of these tests. Test developers were not concerned with the "scientific" properties that a reasonable test should possess (Farhady, 1980, p. 30). Gradual changes took place in the philosophy of teaching foreign languages after World War II. Farhady (1980) stated:
Methodologists started to question the merits of the grammar-translation method with respect to teaching and learning modern languages. Teachers realized that psychological principles of language behavior should be taken into account. They were also convinced that linguistic theories could be very insightful in designing and developing instructional materials. Therefore, both psychology and linguistics began to influence language teaching methods. (p. 31)
A change took place in the objectives, methods, and purposes of language teaching. Equating language with literature was no longer the basis for curriculum design in language courses. Language teaching entered a new era called the "structuralist era."
The psychometric-structuralist period. During this period (1930-1940), structural linguistic theory and behavioristic psychology merged and influenced language teaching, resulting in the creation of a teaching method called the "Audio-Lingual" method. The principles of the Audio-Lingual method are as follows:
1. Speech is primary.
2. Each language must be viewed within its own context as a unique system.
3. The speaker may know nothing "about the language" although
he or she is perfectly capable of using it.
4. Learning a new language should be viewed as a sequence of activities leading to "habit formation."
Behavioristic psychology developed a mechanistic approach to learning, which viewed learning as a series of "stimuli and responses," the connections between which were created by the reinforcement of correct responses. Audio-lingualism had its ideological roots in behaviorist psychology and in the work of descriptive linguists such as Bloomfield and Fries. This approach (the set of assumptions on which a teaching method is based) had a strong impact on language testing, as shown in Lado's (1961) statement: "The theory of language testing assumes that language is a system of habits of communication. These habits permit the communicant to give his [or her] conscious attention to the overall meaning he [or she] is conveying or perceiving" (p. 22). Through the cooperation of psychologists and linguists, the principles of educational measurement were introduced to language testing, and psychometric techniques came to be used by language test makers. Statistics and statistical analyses therefore received serious attention in the development and administration of tests, as well as in the interpretation of test scores. Concepts such as reliability, validity, and desirable item characteristics became fundamental requirements for a good test. Thus, classifying this period as psychometric-structuralist may well be justified if one considers the two influences on language testing in this period (Farhady, 1980, p. 33).
The discrete-point approach period. Structural linguists, reinforced by behavioral psychologists, shaped language teaching and testing methods during this period. Discrete-point tests, which are based on the discrete-point approach, "a set of theoretical assumptions on which any teaching method is based" (Anthony & Norris, 1972), and which usually take the form of multiple-choice items, swept the field of language testing. Discrete-point tests are still among the most popular tests in the field of applied linguistics. According to the discrete-point testing approach, one assumes that by assessing the language student's knowledge of isolated segments of language (phonemes, morphemes, words, etc.), the test can accurately evaluate the learner's ability in a given language. Considering that linguistic competence is only one of the components of language ability, however, the discrete-point testing method ignores other components of the learner's total language competence. Many scholars have pointed out the weaknesses and limitations of discrete-point tests (Briere, 1972; Farhady, 1980; Jakobovits, 1970; Oller, 1976, 1978). Because discrete-point test developers concentrated on the linguistic structures of language, they ignored various extralinguistic factors involved in the use of language (Jakobovits, 1970). If one critically examines discrete-point testing theory, it becomes obvious that the theory ignores the most important purpose of language, communication. The ultimate goal of learning a language is to function in that language in a given social setting. Farhady (1980) stated:
A DP (discrete-point) approach to teaching virtually ignores the communicative aspects of language, and DP testing overlooks assessing the language learners' ability to use language for communicative purposes. The exclusion of communicative competence has raised numerous objections and scholars have seriously questioned the validity of DP teaching and testing methods. (p. 35)
The validity of the two most fundamental theories behind discrete-point testing was questioned: (a) the principles of the Skinnerian habit formation theory, the psychological theory behind discrete-point testing, were questioned by cognitive psychologists, and (b) the structuralist theory, the linguistic theory behind discrete-point testing, was challenged by Chomskian transformational linguistic theory. Carroll's (1961, cited in Farhady, 1980, p. 36) criticism appeared, demanding a reform:
The work of Lado and other language specialists has correctly pointed to the desirability of testing for very specific items of language knowledge and skills judiciously sampled from the usually enormous pool of possible items. This makes for highly reliable and valid testing. It is the type of approach which is needed and recommended where knowledge of structure and lexicon, auditory discrimination and oral production of sounds, and reading and writing of individual symbols and words are to be tested. I do not think, however, that language testing (or the specification of language proficiency) is completed without the use of ... an approach requiring an integrated, facile performance on the part of the examinee. It is conceivable that knowledge could exist without facility. If we limit ourselves to testing only one point at a time, more time is ordinarily allowed for reflection that would occur in normal communication situation, no matter how rapidly the discrete items are presented. For this reason, I recommend tests in which there is less attention paid to specific structure points or lexicon than the total communicative effect of an utterance. (p. 318)
After this criticism, test developers began to search for tests that measure communicative abilities more realistically. Farhady (1980, p. 76) stated that all human communication, no matter in what area, directly or indirectly involves language interaction in ways that are not yet fully understood. There is a constant and inevitable interaction among the linguistic components of language; therefore, testing each component independently of the others is not desirable. An adequate test will test all components of language, and ideally all components should be tested simultaneously, but doing so is too complex to be feasible.
The integrative-test period. There was a change from discrete-point testing to integrative testing. A strongly recommended new testing method had been added to the old one; some language professionals hold that the new method replaced the old one by denouncing the validity of its testing method. The result was a tendency on the part of teachers and administrators to swing from one extreme to another in their testing strategies. For example, Prator (1981) referred to this tendency in teaching as the "pendulum syndrome."
Since the advent of the integrative test, there have been ongoing debates among scholars advocating either the discrete-point or the integrative test. The majority of language testing experts agree that discrete-point tests require the examinee to manipulate highly artificial tasks that have little or no relevance to the actual use of those tasks in real-life situations (Briere & Hinofotis, 1979; Clark, 1978; Jakobovits, 1970; Oller, 1972, 1973, 1978, 1979; Spolsky & Jones, 1975; Spolsky, 1978; Upshur & Fata, 1968). Oller (1973) and other language specialists pointed out that the difference between discrete-point and integrative tests is not one of type but of degree. Because the discrete-point test aims at testing one point of language at a time, there is generally no need for an examinee to understand a context longer than a sentence to answer a discrete-point test item (Oller, 1973). Spolsky (1978) believed that the linguistic components of language form an integrated whole; therefore, the contribution of each discrete-point test item to the total knowledge of language cannot be identified and is insignificant. Oller (1979, p. 172) strongly opposed the use of the discrete-point test because he believed that breaking language into its linguistic components and into language skills and modes creates an enormously large number of items.
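A rough calculation, using hypothetical figures that are not Oller's own, illustrates the scale of the problem: crossing even 500 discrete structural points with the four skills (listening, speaking, reading, and writing) and two modes (recognition and production) would already yield 500 x 4 x 2 = 4,000 distinct item types, before vocabulary or phonology had been sampled at all.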
These inadequacies shifted the attention of language developers from discrete-point tests to integrative tests. Integrative tests have been considered more defensible measures of language proficiency than discrete-point tests. This shift has been the starting point for a new era in language testing referred to as the integrative-sociolinguistic period (Farhady, 1980, p. 38).
Massive development of standardized language tests. Although the weaknesses of the discrete-point test were pointed out by some scholars (Spolsky, 1978), massive development and production of discrete-point tests like the Test of English as a Foreign Language (TOEFL) had already taken place. The tests developed during this period, however, are evidence of cooperation between language test makers and psychologists in using scientific techniques in language testing (Spolsky, 1978).
The integrative-sociolinguistic period. The criticisms of discrete-point testing evolved from its inadequacies in dealing with language behavior as integrative, meaningful, and communicative. Discrete-point tests tested linguistic components only; therefore, they ignored testing the most important "communicative" aspects of language behavior.
The advocates (Oller and others) of integrative tests believed that
integrative tests measure the actual aspects of language activities that
one must normally perform in using language. They maintained that performance
on integrative tests depends on how an examinee understands, processes,
and produces normal language in real-life situations. Spolsky and Jones
(1975) believed that the integrative theory of testing could handle the
full complexity of language by using socio-linguistic rules involved in
actual communication. Cooper (1968) and Jakobovits (1970) emphasized the
necessity of incorporating sociolinguistic and sociocultural rules in the
tests. Cooper (1968), Jakobovits (1970), and Oller (1973) wanted to demonstrate
that integrative tests could tap the learners' communicative competence.
Farhady (1980) pointed out the inadequacies of integrative tests. He stated,
"It seems that IN [integrative] tests have their own inadequacies
and most of them do not assess the communicative ability of the language
learner" (p.40).
Attempts to Develop ESL Oral Tests
In recent years, attempts have been made to develop oral tests that are reliable and valid. The most commonly used oral test is the Foreign Service Institute (FSI) oral interview test. Research conducted by Hinofotis (1977) using an FSI-type procedure provided promising results. There seems to be a great need for research in the development and validation of oral interview tests because they have a highly desirable advantage over other types of tests. That is, the oral interview is one of the very few techniques that allows for the assessment of language proficiency in a direct, face-to-face situation. Almost all paper-and-pencil tests, including the functional [pragmatic competence] test developed in this study, are indirect measures of language proficiency.
Problems of oral tests. Oral tests, however, have inherent problems.
In spite of the desirability of the oral test, the time involved in administering
and scoring the test is far beyond what practical considerations permit.
Problems of Other Integrative Language Tests
Dictation and cloze tests are considered to be the other two most important integrative language tests, in addition to composition and oral interview tests. There are three important properties of a test: (a) reliability, (b) validity, and (c) objectivity. Because dictation and cloze tests violate the assumption of local independence of items, the reliabilities of these tests have been called into question (Farhady, 1980, p. 49). If items are independent, performance on one item should not influence performance on the other items (Lord & Novick, 1968). Because the items of dictation and cloze tests are contextually dependent on one another, if an examinee misses an item of key importance in the context, he or she may also miss other items that depend on it.
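One standard index of internal-consistency reliability for dichotomously scored items, given here only as an illustration and not taken from Farhady's text, is the Kuder-Richardson formula 20 (KR-20):
KR-20 = [k / (k - 1)] x [1 - (sum of p(i)q(i)) / s^2],
where k is the number of items, p(i) is the proportion of examinees answering item i correctly, q(i) = 1 - p(i), and s^2 is the variance of the total scores. The derivation of this and similar coefficients treats the error contributed by each item as independent of the others; when cloze or dictation items are contextually dependent on one another, that assumption is violated and the resulting coefficient can overstate the test's true consistency.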
Farhady (1980) stated that dictation and cloze tests do not assess communicative ability:
Dictation and cloze tests do not effectively tap the sociocultural, sociolinguistic or communicative competence of the learner. These tests are as inadequate as DP [discrete point] tests in dealing with communication between two or more interlocutors. .... However, due to the diversity of language skills to be tested and the controversies on the nature of language tests, it is commonly accepted that a good, balanced language test should include both DP and IN [integrative] parts. (p. 50)
This is essentially correct. During test construction, the present study attempted to avoid the limitations of both discrete-point and integrative tests.
Pragmatic Language Proficiency Teaching and Testing
Pragmatic language testing, free from the problems of dictation and cloze tests, is a new direction in testing the examinee's language ability. Farhady (1980) realized that teaching and testing linguistic forms of language without paying attention to how these forms are actually used was not sufficient. Social appropriateness of an utterance, who is talking to whom, when, and under what circumstances, is just as important as linguistic accuracy. The inadequacies of structuralist and existing cognitive methodologies in preparing second language learners with the necessary functional and pragmatic language skills have led scholars to seek alternative methods for teaching and testing second languages. The movement toward the development of such a theory of language teaching started in Europe and has received increasing attention from methodologists in the United States (Campbell, 1978; Rivers, 1973). Language instruction shifted its focus from teaching linguistic forms to teaching categories of communicative functions, which were intended to teach the appropriate use of language (Wilkins, 1976). Because none of the existing tests was developed on the basis of the notional-functional approach, a new testing approach was needed. Although the necessity of functional proficiency or pragmatic competence tests has been recognized by various scholars such as Wilkins (1976), Morrow (1977), van Ek (1976a), Canale and Swain (1980), and Carroll (1980), there have been only a few attempts to construct such tests. The Farhady study (1980) was probably the first attempt to develop a functional or pragmatic language test. Because I have used his framework extensively, it is appropriate to review his study here.
Farhady's study. The objective of Farhady's study was to develop communicative tests that follow the principles of the notional-functional approach. Discrete-point and integrative tests had dominated ESL testing for many years prior to Farhady's study. At the time of his study (1980), the notional-functional approach to language teaching assumed that a limited number of functions were used in communication and that those functions could be identified, classified, taught, and tested. Two language functions and four subfunctions from van Ek's typology of language functions (1976a) were chosen, and the context of the test was limited to academic settings. Each test item incorporated two social variables: (a) social relations between interlocutors and (b) social status of interlocutors.
The Functional Test was developed in three phases. In phase 1, open-ended test items were administered to 200 native speakers of English to elicit socially appropriate and linguistically accurate (S+ L+) options. From those options, the most frequent response for each item was selected as the keyed response. Phase 2 involved the same procedures with 150 nonnative speakers. Their responses were compared to those of native speakers to identify deviant responses. For each item, options were developed that were both linguistically inaccurate and socially inappropriate (L- S-), linguistically inaccurate but socially acceptable (L- S+), linguistically accurate but socially inappropriate (L+ S-), and both linguistically accurate and socially acceptable (L+ S+). In the final phase, 56 multiple-choice items were pretested with 30 native and nonnative speakers to ensure the appropriateness of the options. Later the 56 items were divided into two 28-item forms and administered as part of the UCLA ESL placement exam (ESLPE). The ESLPE was used as a criterion measure to validate the Functional Test. The subjects used for the Functional Test validation were 826 incoming international students who took the Fall 1979 version of the ESLPE in five different sessions from September 14 to September 24, 1979. The subjects fell into two groups: (a) the first group consisted of students admitted to academic programs at UCLA, who took regular university courses and ESL courses concurrently, and (b) the second group consisted of nonacademic students studying English for a variety of professional, occupational, and personal reasons. With regard to the first research question (the statistical characteristics of the Functional Test), reliability coefficients of .78 (for FORM-A) and .77 (for FORM-B) suggested that the tests were as valid and reliable as any of the other subtests of the ESLPE. Intercorrelations of from .50 to .80 between the Functional Test and the other sections of the ESLPE were interpreted as indications of concurrent validity. Results suggested that shorter composites could be created (in response to the second research question) to decrease the number of items in the Functional Test and the ESLPE without losing significant information about the examinees' English language proficiency. In response to the third research question, learner background variables (gender, university status, major field of study, nationality, and native language) were identified that explained significant differences in performance on the subtests of the ESLPE.
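To illustrate the four option types, consider a hypothetical item constructed for this review (it is not an item from Farhady's test). For a prompt such as "You want to borrow a classmate's lecture notes," an L+ S+ option might read "Could I borrow your notes from yesterday's lecture?"; an L+ S- option, "Give me your notes" (grammatically accurate but too abrupt for the relationship); an L- S+ option, "Could I to borrow your notes, please?" (appropriately polite but linguistically inaccurate); and an L- S- option, "You give notes me now."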
The ultimate goal of a pragmatic competence test. The ultimate goal of the pragmatic competence test is to determine the degree to which an examinee commands the linguistic, sociolinguistic, strategic, and communicative skills that make up his or her total language competence. In this respect, pragmatic competence tests are superior to existing discrete-point and integrative tests. Farhady (1980) stated:
They [pragmatic competence or functional tests] are almost automatically valid because we would know exactly what we wanted to test before constructing a test. Functional proficiency [or pragmatic competence] could also be decomposed into linguistic, sociocultural, and minimum communicative proficiencies. This versatile property of functional tests would enable language testers to identify learner problems and/or the degree of contribution of each component of language competency to the totality of communicative competence. (p.73)
This study intended to produce a valid pragmatic competence test that would indicate a language learner's progress toward acquiring native-speaker competence in American English.
The developmental stage of pragmatic competence tests. Because the notional-functional theory of teaching, on which pragmatic competence testing is based, is relatively new, few practical advances in pragmatic competence testing have been made. Although the necessity for communicative competence tests has been suggested by many scholars such as Morrow (1977) and Carroll (1980), feasible procedures for constructing such tests that reflect the principles of communicative competence have not yet been identified. Farhady (1980) suggested some of the major principles of functional or communicative testing:
It would seem reasonable, and may be necessary, for a testing procedure to follow the principles of a teaching theory. Unfortunately, inadequate theories behind DP [discrete-point] and IN [integrative] tests resulted in the development of inadequate tests. Of course, the shortcomings of both DP and IN tests could be partially attributed to the intricate nature of human language. In other words, where the accuracy and/or appropriateness of language to be used is not of primary interest to the tester, DP or IN tests may be useful and efficient. However, as far as language teaching, learning, and testing are concerned, there is almost no doubt that none of these theories have been adequate enough to handle the most important purpose of language, communication. Therefore, to avoid some of the shortcomings of existing testing procedures, functional [pragmatic competence] testing should follow the principles of functional teaching, which seems to be more adequate than other instructional approaches in dealing with the learner's communicative ability. (p. 75)
The tenor of this author's statement is correct. The test developed for this study attempted to measure a student's proficiency in communicative competence skills, whether acquired through learning in a functional teaching context or through learning outside the classroom. In the ESL programs where the students are studying, some teachers follow the principles of functional teaching and some do not. The students, however, achieve communicative competence either from the teaching in the classroom or from learning situations outside the classroom, and they demonstrate that competence by their results on the Pragmatic Competence of American English (PCAE) test.
Parameters or criteria for developing pragmatic competence tests.
By administering pragmatic competence tests to educated native speakers
of English, reasonable criteria for native speaker norms could be established.
Farhady (1980) stated, "The ultimate goal of functional testing is
to compare the performance of nonnative speakers to that of native speakers.
.... It should not be assumed that all native speakers will perform invariably
under a given condition" (pp. 77-78). Of course, almost all native
speakers are communicatively competent, but in a certain social setting,
a native speaker would have more difficulty in carrying out a certain task
verbally than in another setting. Using educated native speakers' performance
as criteria is probably the best and so far the only available choice when
constructing a pragmatic competence test. Wilkins (1976) stated that it
is not known yet how to decompose the totality of a communicative act into
its discrete categories. Spolsky and Jones (1975) pointed out that it is
neither possible nor valuable to break the performance of a unit of communication
into the components of which it may consist. Functional or pragmatic competence
tests should be concerned with the degrees of linguistic, sociocultural,
and communicative ability [strategic competence] of the examinees in order
to diagnose the learners' difficulties in particular language areas (Farhady,
1980, p. 82). The next chapter will explain the procedures for developing a pragmatic competence test dealing with the above-mentioned principles.
Current Approaches
In spite of the existence of different views on the definition of language proficiency, a general issue on which many scholars in applied linguistics seem to agree today is that the focus of proficiency tests is no longer on classroom achievement but on the students' ability to use language (Farhady, 1980, p. 17). The so-called "pragmatic tests" (cloze and dictation), for example, are claimed to be suitable for assessing overall language proficiency as well as for diagnosing examinees' specific language problems (Oller & Perkins, 1980). It has been demonstrated (Farhady, 1980) that functional proficiency tests are superior to existing discrete-point and integrative tests. Farhady's study of 826 university-bound ESL students at UCLA showed that functional tests are almost automatically valid because the test-maker would know exactly what he or she wants to test before constructing a test. The Farhady study proposed a new direction for the development and use of ESL tests that is called "the functional approach." The long history of educational measurement has witnessed different perspectives on the role, purposes, and techniques of testing, which have probably evolved from the various psychological theories of educational processes (Ebel, 1972). When the principles of educational measurement are applied to language testing, the differences in methods and techniques of evaluation become more conspicuous. The complexities of educational measurement, combined with the intricacies of language behavior, have created a more perplexing and unique situation in language testing. Following the principles of educational measurement, many different purposes of language tests have been suggested. For example, achievement, aptitude, diagnostic, and proficiency tests are among the familiar categories of language testing. Clark (1972) distinguished two major categories of language tests: prognostic and evaluation of attainment. Each category, as illustrated in Figure 1, includes various subcategories.
Language Tests
    Prognostic Tests
        Selection Tests
        Placement Tests
        Aptitude Tests
    Evaluation of Attainment
        Measurement of Achievement
            General Achievement
            Diagnostic Achievement
        Measurement of Proficiency
        Measurement of Knowledge
Note. From Justification, development, and validation of functional language testing (p. 16) by H. Farhady, 1980, Ann Arbor, MI: University Microfilms International. Copyright 1980 by H. Farhady. Reprinted by permission.
Figure 1. Classification of Language Tests
The first major category (i.e., prognostic tests) involves making decisions about the most appropriate channel for students to pursue in their language learning process. If the purpose of the test is to decide on the acceptance or nonacceptance of students into a certain program, it is referred to as a "selection test." When the decision relates to choosing one program from among several possible alternatives, it is called a "placement test." If the decision is to determine the future degree of the student's success in language learning, it is called an "aptitude test" (Farhady, 1980, p. 15).
The second major category (i.e., evaluation of attainment) has more diverse functions or purposes than the first category. One of the goals of these tests is to obtain information about "the students' attainment of language skills taught in the course" (Clark, 1972, p. 3). If this information involves achievement in a broad skills area, the test may be called "general achievement." When the information sought is about highly detailed language structures, the test may be called "diagnostic achievement." The primary requirement of a diagnostic test is that "it indicates unambiguously the students' `mastery' or `nonmastery' of each of the language skills tested" (Clark, 1972, p. 3). Another category of tests under evaluation of attainment is called "proficiency tests." Because proficiency tests have direct relevance to the present study, this concept will be discussed in detail.
A general issue on which many scholars today seem to agree is that the focus of proficiency tests is no longer on classroom achievement but on the students' ability to use language. Proficiency tests are supposed to be independent of the way in which language is acquired. Briere (1972) defined the term "proficiency" as "the degree of competence or the capability in a given language demonstrated by an individual at a given point in time independent of specific textbook, chapter in the book, or pedagogical method" (p. 332).
Clark (1972) defined language proficiency as the language learner's ability "to use language for real-life purposes without regarding the manner in which that competence was acquired. Thus, in proficiency testing, the frame of reference .... shifts from the classroom to the actual situation in which the language is used" (p. 5). In this definition, another parameter is added to the function of language proficiency tests, that is, the use of language in real-life situations. It is this construct among others that the current study addresses.
Thus, the preceding review traces the historical development of various theoretical approaches to second language testing as they relate to various second language teaching methodologies. Current thinking stresses the importance of integrative tests as a way of evaluating the ability to function in a second language. Many of the current integrative tests are, however, inadequate and time-consuming. Therefore, a pragmatic competence or functional test in a format that is relatively easy for the examiner to administer and score is needed.