A critical review of age-related research on L2 ultimate attainment

This article addresses age-related attainment effects in second language acquisition, posing the question of whether such effects are to be explained in terms of a Critical Period with a predictable and abrupt offset point or in terms of the impact of a wider range of factors. It attempts to explore this question by focusing on four discussion points in the current debate: (i) the wide use of native-speaker behaviour as the key L2 attainment yardstick; (ii) the degree of compatibility of prevailing views regarding the notion of a critical period for L2 acquisition; (iii) the relative narrowness of much research in this area, where age of L2 onset is often regarded as the crucial if not the only critical variable; and (iv) insights relative to maturational constraints on language acquisition offered by recent brain research. The article concludes that a loosening of the association between ultimate L2 attainment research and Critical Period Hypothesis (CPH) issues would shed more light on L2 attainment in terms both of the comprehensiveness and of the acuity of the insights which would result.


Preamble
It is undeniable that there exists a relationship between age and success in additional language (L2) learning. However, the precise nature of that relationship has long been a matter of controversy. The main area of contention can be summed up in the following questions:
• is there a language-related maturationally constrained window of opportunity (or CRITICAL PERIOD) - ending at some point during or at the end of childhood - at the offset of which language acquisition becomes qualitatively different, more arduous and/or less successful in its outcome? or
• do age-related changes in language-acquiring capacity and/or outcome result from the impact of factors other than a specifically language-focused critical period?
In what follows we attempt a critical review of research addressing the age factor in L2 attainment. Obviously, however, an article-length treatment cannot hope to provide anything resembling an exhaustive survey. Instead, we focus on a relatively restricted number of issues that have figured prominently in the recent literature. These are: (i) the extent to which it is appropriate to regard native-like performance as the critical yardstick in assessing L2 attainment; (ii) the extent to which we can speak of a genuinely unitary Critical Period Hypothesis; (iii) the relatively narrow scope of much research in this area, with age of L2 onset typically regarded as the crucial variable and other linguistic and contextual variables often insufficiently taken into account; (iv) the suggestion, which has been the subject of recent research, that late language acquisition may make use of different areas of the brain as compared with early acquisition.
In our discussion we attempt to give due attention to the evidence and arguments deployed in favour of the maturational constraints view, accepting that this view informs many other approaches to this topic. It goes without saying that we fully respect colleagues who espouse such other approaches. Our own review leads us, however, to suggest that the maturational constraints perspective has relied too much on native speaker behaviour as a basis for comparison, ignoring a range of issues calling the reliance on such a comparison into question. Our interpretation of available research is also that divergences among researchers concerning the precise nature of the purported maturational constraints concerned have been in large measure glossed over, and that insufficient attention has been paid to a range of other potentially important factors, such as amount and quality of input, learners' orientations and attitudes, and the specific conditions under which the L2 is encountered. Finally, we conclude that the neurobiological research to which the maturational constraints hypothesis has referred is very far from unequivocal, especially in terms of its recent indications.

The native speaker as a point of comparison
The criterion for success in L2 has widely been set as the level of proficiency associated with native speakers, in the sense of persons who have been exposed to the language in question since infancy. In the words of Cenoz & Genesee (1998: 18), 'bilinguals, in and outside the school, are usually evaluated against "monolingual" competence in their nonnative languages'. As pointed out by Andreou & Galantyomos (2009), the notion of native speaker goes back to the widespread belief that one's particular language, one's lingua nativa, was biologically inherited (cf. Christophersen 1988). In this section we look more closely at the research and issues relating to use of the native-speaker yardstick in language attainment studies.
It is understandable that native speaker proficiency should have been widely used as a point of comparison. The popular view of success in L2 acquisition is closeness to native-speaker performance, and in relation to the age question the common observation is that, at least in non-instructional settings, early acquirers tend to end up indistinguishable from native speakers whereas later acquirers do not. The question, though, is whether reliance on this particular comparison is really the best way of exploring age effects and maturational issues. The discussion that follows suggests that there are better ways of proceeding - that, while the value of research using this comparison certainly needs to be acknowledged, the question mark attaching to it needs to be borne in mind.
It is interesting to note that in L1 acquisition studies, too, the reference-point for evaluating the attainment of late acquirers has been the language of individuals in contact with the relevant L1 since early infancy. The two kinds of late L1 acquirers that figure in the research literature are, first, children who, until their rescue, were deprived of normal human society and linguistic interaction, and, second, profoundly deaf children who are not given access to language input they can process - i.e. signed input - until a relatively late stage.
The most famous modern case of a language-deprived child is that of Genie, who was rescued from the isolation imposed on her by her parents in twentieth-century California (see, for example, Jones 1995). After being brought into care Genie made rapid and substantial progress in her language development, but because her English diverged - in particular in its syntactic anomalies - from that of users of the language who had had normal exposure to English since infancy, some researchers interpreted her case as evidencing 'constraints and limitations . . . outside of . . . the critical maturational period' (Curtiss 1977: 234). This reading has been seen by others as rather over-specific in the light of the recent finding that general neglect during the first five years of life leads to a permanently smaller head circumference, smaller brain size, and impaired ability to learn language and to develop normal social behaviour (e.g. Chugani et al. 2001; see Uylings 2006).
With regard to profoundly deaf subjects who are deprived of processable language input in their early years and then exposed to sign language at a later stage, when their language is compared to that of early signers, some 'deficits' are observable which differentiate them from those in contact with sign language from infancy (see, for example, Mayberry 1993; Morford & Mayberry 2000; Emmorey 2002; Singleton & Newport 2004). This finding has long been used as an argument in favour of the existence of maturational constraints in language acquisition (e.g. Long 1990), and at first sight it seems persuasive. On the other hand, late signers do not fail to develop a first (signed) language, but, on the contrary, typically become entirely functional in sign language. Following Chugani et al.'s above-cited (2001) finding on the effects of neglect, it can be speculated that deprivation of language input during the phase in a child's life when cognitive development is at its most intense may have serious general cognitive effects, and that it is these general effects that are reflected in later language development. Some research (e.g. Peterson & Siegal 1995; Schick & Gale 1997; Woolfe, Want & Siegal 2002) does indicate that deaf children whose access to sign language is delayed have problems in the area of THEORY OF MIND, that is, in the understanding that individuals other than themselves have mental states (beliefs, intentions, etc.). Lundy (1999) has argued that the impact of such problems on language development could be far-reaching.
To return to the L2 domain, the criterion of native-speaker proficiency, widely deployed though it may be, is not unquestioned. Hill (1970: 243f.) suggested that the notion of native speaker is culture-bound, and that, for example, in an area like South India, where many adults speak several languages exhibiting similar phonetic systems, it would not necessarily be easy for native speakers of a given local language to recognize L2 users of their own language who are speakers of other local languages. More recently, Cook (e.g. 1999, 2002), Davies (2003, 2004) and Piller (2002) - among others - have problematized the whole notion of the native speaker model both in L2 research and in the L2 teaching context. Cook consistently argues that the focus should be on L2 users in their own right rather than in comparison with native speakers. He remarks that, while 'ultimate attainment is a monolingual standard rather than an L2 standard' (2002: 6), there is no intrinsic reason why the L2 user's attainment should be the same as that of a monolingual native speaker. Davies discusses the difficulty of defining what a native speaker actually is. He expresses the view that 'the distinction native speaker-non-native speaker . . . is at bottom one of confidence and identity' (2003: 213). Piller (2002), for her part, reports her subjects' perceptions that 'passing [for a native speaker] is a temporary performance', 'typical of first encounters', and 'designed for a particular audience' (p. 200).
In specific regard to age-related studies of L2 acquisition, one of the suggestions made by those taking a critical period perspective is that after a certain maturational point the L2 learner is no longer capable of attaining native-like levels of proficiency. For example, Scovel (1988) makes this claim with respect to phonetics/phonology (cf. Scovel 2000, 2006), and Long (1990, 2007) makes it more generally. Such claims have been challenged by research focused on later beginners attaining native-like levels of L2 proficiency: see, for example, Birdsong (1992); Ioup et al. (1994); Bongaerts, Planken & Schils (1995); Ioup (1995); Bongaerts et al. (1997); Palmen, Bongaerts & Schils (1997); Bongaerts (1999); Moyer (1999); Bongaerts, Mennen & Van der Slik (2000); Bongaerts (2003); Muñoz & Singleton (2007); Kinsella (2009) (cf. Singleton & Leśniewska 2009). See Long (2005), however, for a critical analysis of the methodological and other problems that some of this research may present. Hyltenstam & Abrahamsson (2000: 155) assert that no post-pubertal L2 beginner has yet been shown to reach a level of proficiency that in every linguistic detail is identical to that of a native speaker. They also recognize, let it be said, that the more closely we study very early L2 beginners the more we realize that, at the level of subtle detail, they also tend to differ from monoglot native speakers. On this latter point research has shown (see, for example, Hyltenstam & Abrahamsson 2000: 161; Hyltenstam & Abrahamsson 2003a, b) that even very young L2 beginners may diverge from native speakers at the level of lexico-grammatical detail (cf. also , which goes beyond the lexico-grammatical dimension). The same is true at the phonological level: Flege (e.g. 1999, 2002) cites a number of studies which show that individuals exposed to an L2 in an L2 environment as young children are nevertheless quite likely to end up speaking the L2 with a non-native accent (e.g. Flege, Frieda & Nozawa 1997; Guion, Flege & Loftin 2000; Piske, Mackay & Flege 2001) and to be less good than monoglot native speakers at vowel and consonant perception in their target language (Flege, MacKay & Meador 1999; MacKay, Meador & Flege 2001).
It could well be, therefore, that the maturational issue is less important in this connection than the very fact of the possession of knowledge of another language (cf., for example, Grosjean 1982; Cook 1995). It seems also to be the case that the degree of distance between the L1 and the L2 plays a role (cf. Kellerman 1995). McDonald (2000) certainly found, for example, that learners of English from a Spanish-speaking background who had begun to be exposed to the language before the age of five were able to perform to native levels on an English grammaticality judgement test, whereas Vietnamese speakers with pre-age-five experience of English were not. In any case, what the foregoing would seem to imply from our perspective is that a more appropriate comparison than that between later L2 beginners and native users of the language in question might be one between later L2 beginners and those who begin to acquire an L2 in early childhood (cf. Singleton & Ryan 2004: 109; Ortega 2010).
Recent work by Abrahamsson & Hyltenstam (2008), however, again uses indistinguishability from native speakers as the criterion for success in late L2 acquisition. They found that high levels of language aptitude were possessed by late L2 learners of Swedish who were judged to BE native speakers BY native speakers. Their argument is that a high degree of language aptitude is the sine qua non of nativelike attainment in late second language acquisition, and they suggest that the possession of such aptitude by a few individuals 'does not justify a rejection of the CPH' (p. 503). They propose (ibid.) a research agenda which would investigate 'the way in which such learners attain their nativelikeness - for example, through the use of unique psychological processes and an unusual sensitivity to language structure or even through continued access to the innate, implicit language acquisition mechanism that, for some reason, has remained unaffected by maturation'. Finally, they make the prediction that 'no adult learners should be found who are entirely nativelike in the L2 without having a high level of language aptitude and - we may add - without having worked professionally and successfully with the target language for a significant period of their lives' (pp. 503f.).
Regarding the detail of the above claim, we should first comment on the idea of the innateness of language. This is an underlying axiom for one aspect of Abrahamsson & Hyltenstam's claim, but in fact, in the sense in which they deploy it, it is by no means universally accepted (cf. Sampson 2005). On the other hand, the notion that highly successful late L2 learners (indeed all highly successful L2 learners) require considerable amounts of input and experience in the target language is not at all controversial, although it would clearly be unwarranted to suggest that only cases of nativelike late learners WITHOUT such abundant input and experience could be considered candidates for undermining the Critical Period Hypothesis. In relation to the question of language aptitude, there is certainly evidence that this may play a role in successful L2 acquisition. Bylund (2009) and Bylund, Abrahamsson & Hyltenstam (2010) have also proposed that high levels of language aptitude may act as a prophylactic against language attrition. Clearly, one may be sympathetic to this proposition without necessarily accepting the authors' claim that there is a critical period, ending at puberty, for avoiding attrition. In order to test detailed predictions in these matters with appropriate rigour, however, one would ideally need a more satisfactory definition of the construct of language aptitude than the fact of doing well on a specific language aptitude test.
The widely shared intuition of language professionals is that language aptitude is likely to contribute to successful L2 attainment at any age. It is noteworthy that Abrahamsson & Hyltenstam themselves find 'small yet significant aptitude effects in child SLA' (2008: 481). In the present context, more to the point is the fact that the exclusion as counter-examples to the CPH of cases of high-attaining adult beginners who happen to be possessed of a good measure of language aptitude radically changes the whole critical period concept. In the biological sciences, a critical period is conceived of as species-wide, as TRANSCENDING INDIVIDUAL ATTRIBUTES. In any case, to return to the main theme of the present section, and to emphasize further a line of argument put earlier, given all that we now know about the interaction between languages in the mind (see, for example, Cook 1992, 2003, 2006), using native speaker behaviour as the yardstick of L2 attainment for L2 acquirers beginning at ANY age may require some re-evaluation, especially in relation to adult beginners, whose L1 is long established and deeply entrenched.
Let us return to the above-mentioned issue of the innateness of language. The Chomskyan version of the innateness hypothesis throws an interesting sidelight on the use of a native speaker yardstick in the discussion of age effects in L2. This relates to the question of whether the criterion to be adopted in assessing L2 attainment of late acquirers should be the degree of native-like performance or behaviour in the L2 or whether we should simply be looking at the extent to which late L2 acquirers, in common with native speakers AND early acquirers, are able to derive particular kinds of mental representations from purportedly bio-endowed mechanisms.
The claim made by some Chomskyans is that there is a fundamental qualitative difference between child L2 acquirers and post-pubertal L2 acquirers in terms of access to the innate bio-endowment - 'Universal Grammar' (UG) - which, according to Chomsky, informs and guides language acquisition (see, for example, Cook & Newson 2007: 237f.). Researchers such as Bley-Vroman (1989) have suggested that the child L2 acquirer is on a par with the child L1 acquirer in benefiting from access to innate mechanisms, but that the post-pubertal L2 learner has no access to UG. The empirical basis for this perspective was never solid (cf. Martohardjono & Flynn 1995; Hawkins 2001: 353-359), which is doubtless why other views on this issue abound within the Chomskyan paradigm - the notion that full access to UG remains throughout life, as well as various claims regarding the continuation of partial access beyond childhood (see, for example, Mitchell & Myles 2004: 78f.).
Some UG researchers, adopting a partial access standpoint, have attempted to isolate precisely which linguistic modules, submodules, features, or interface areas are deleteriously affected by maturation. One oft-cited example of such an approach is Hawkins & Chan's (1997) Failed Formal Features Hypothesis (FFFH), which posits a critical period for the selection of parameterized formal features, while suggesting that the principles of UG remain available beyond this critical period. The hypothesis suggests that formal features not selected during the course of L1 acquisition become inaccessible in respect of L2 acquisition in adulthood. Hawkins & Chan claim to find evidence within their own study that speakers of Chinese (a language without wh-operator movement in overt syntax) learning L2 English (a language with wh-operator movement in overt syntax) beyond the childhood years establish L2 mental representations which involve pronominal binding rather than operator movement. Montrul & Slabakova (2003), on the other hand, focusing on the post-pubertal acquisition of the Spanish aspectual system by English speakers, found that purportedly universal features of functional categories in this domain that were not selected in the L1 in early childhood remained accessible beyond childhood.
As is evident from the foregoing, the UG perspective on the critical period is less concerned with performance - the overt use of the language - than with underlying competence, in the Chomskyan sense, and the associated mental representations. As Rothman puts it: 'Any version of the CPH that claims that adults lose the ability to acquire syntax, for example, in the way children do, is forced to take the position that apparent L2 knowledge of target syntactic properties absent from or different in the L1 were learned explicitly and are therefore represented differently in the brain' (Rothman 2008: 1065). The nature of such mental representations is the nub of the question for UG-informed research - not the capacity or otherwise of L2 acquirers to PERFORM like native speakers. In his own work, Rothman reports that the late L2 acquirers he investigates, while certainly NOT native-like in terms of the detail of their behaviour in the L2, remain capable of overcoming poverty-of-the-stimulus conditions and of arriving at particular UG-derived mental representations at the syntax-semantics interface. Given the different concerns of UG-based research into maturational constraints, the issue of whether or not to use native speaker performance as a yardstick arises to a much lesser degree than in more 'applied' areas of the CPH debate. The mental representations under scrutiny are, of course, identified on the basis of conclusions reached concerning the nature of native-speaker competence, but the question here is not about late L2 acquirers' capacity to pass for native speakers, but rather about whether or not these underlying mental representations can continue to be induced from L2 data with advancing maturation.
A final dimension to the question of the use of the native speaker as a reference point for L2 acquirers is given by cases where in functional terms the L2 BECOMES the L1 (cf. Bialystok 1997). Where, for example, a child migrating to a particular community has an L1 which is different from that of the majority of members of the host community, where his/her home language receives little or no support from the community in question and where his/her parents make little or no effort to support it either, the language in question may significantly attrite. One such case, reported by Kouritzin (1999: 75-96), is that of Lara, who had migrated with her family from Finland to Canada at age two, and subsequently lived for four years in a small town within a tight-knit Finnish community, continuing to develop in Finnish. At age six, however, her parents decided that the time had come to integrate with English-speaking Canada, and the entire family moved to a large city. According to Lara's account, her development in Finnish came to a halt at this point and English progressively took over. She reported that the last time she tried (and failed) to converse in Finnish was when she was eighteen years old. Her perception was that she had lost her original L1 and that her language was now English.
Researchers have been aware of this kind of phenomenon for a long time. Lambert, writing about the negative image of bilingualism in the first half or so of the twentieth century pointed out in 1975 that relevant negative research findings emanating from early studies mostly focused on sequential bilinguality in immigrant or minority children whose L1 was effectively being replaced by the language of the majority population. This point remains valid for more recent studies showing negative effects of bilinguality (Hamers & Blanc 2000: 93). It has been observed (notably by Jia & Aaronson 1999) that whereas older arrivals in an L2 environment often make choices which bring them into frequent contact with fellow native speakers of their L1, which accordingly restrict their contact with the L2, fewer choices are available to younger arrivals, because of compulsory schooling. In any case, their linguistico-cultural identity is not as fully formed as that of their elders, so their motivation to maintain it is likely to be weaker. According to Jia & Aaronson the consequence of these contrasting circumstances is that, whereas immigrants arriving after age ten on the whole maintain their L1, immigrants arriving before age ten tend to switch their dominant language from the home language to the language of the host country.
Likewise, Flege writes of a frequent trade-off in migrant children between L2 and L1 proficiency specifically in the phonetic-phonological domain, and a switch of dominance from L1 to the L2, his line being that in bilingual children 'the phonic elements of the L1 subsystem necessarily influence phonic elements in the L2 system, and vice versa' (1999: 106). Thus, depending on their experience and input, young children may acquire a good L2 accent at the expense of their L1 accent. Alternatively, they may develop an authentic accent in their L1 at the cost of a non-native accent in their L2. Flege further contends that in migrants arriving as adults in the L2 environment, L1 phonology has continued to be refined, untrammeled by other language influences, and that its influence on L2 phonology acquisition is accordingly much stronger than in the case of child migrants.
If Jia & Aaronson and Flege are right, the favourable comparison made between the L2 proficiency of early L2 acquirers and that of native speakers of the language in question is explicable in many cases by the fact that these acquirers' erstwhile L2 has in fact switched its status to that of L1 - understood not in the classic sense of the language of infancy, obviously, but in terms of dominance, functionality and users' perceptions. There is a sense, therefore, in which making a comparison between the proficiency of such language users in their dominant language and that of native speakers is not a matter of comparing L2 attainment with an L1 baseline, because for such language users their dominant language for all practical purposes is ITSELF their L1. As for late acquirers, the continuing dominance (and influence) of their language of infancy, if viewed in the terms proposed by Jia & Aaronson and/or by Flege, emerges in most cases as an age-related phenomenon, but not strictly a maturationally determined phenomenon.

Variability in proposed offset points for the critical period
So far we have been talking about EARLY and LATE, YOUNGER and OLDER beginners, as if the underlying basis for such polarities were clear and agreed upon. This is far from the case, however. From the outset of the discussion of age-related factors in language acquisition there have been disagreements about when the optimum period for acquisition ends. Thus, the precursor of critical period thinking, Penfield, suggested that 'for the purposes of learning languages, the human brain becomes progressively stiff and rigid after the age of nine' (Penfield & Roberts 1959: 236) and that 'when languages are taken up for the first time in the second decade of life, it is difficult . . . to achieve a good result' (Penfield & Roberts 1959: 255). Lenneberg (1967), on the other hand, proposed puberty as the offset point for the critical period. With respect to L2 acquisition, he asserted (p. 176) that after puberty 'the incidence of "language-learning-blocks" rapidly increases', 'foreign languages have to be . . . learned through a conscious and labored effort', and '[f]oreign accents cannot be overcome easily'.
Other researchers have suggested that different areas of language (phonology, syntax, lexicon, etc.) have different critical ages associated with them (for discussion see, for example, Singleton & Ryan 2004: 84-94; Singleton 2005). Such multiplicity implies a multiplicity of mechanisms, which is entirely plausible, but which clearly runs counter to the notion of a unitary critical period. A further group of L2 researchers assert that the critical period ends progressively over a number of years, a process that begins around the age of six or seven (e.g. Johnson & Newport 1989; Long 1990). More radically, Hyltenstam & Abrahamsson (2003a: 543-544) cite, in this case in respect of L1 evidence, Ruben's (1997) review of studies of children who had experienced temporary hearing impairment in their first year of life and who subsequently showed deficits in verbal memory and phonetic perception in L1. Ruben concludes that the critical period for phonetics/phonology ends around the twelfth month of infancy. He further interprets the relevant research as indicating that the critical period for syntax ends in the fourth year of life, and for semantics in the fifteenth or sixteenth year. To return to L2 evidence, Hyltenstam & Abrahamsson's own review of the evidence leads them to doubt the critical period itself, and to state in one publication that it may be une chimère (2003b: 122). Elsewhere they speculate that the language learning mechanism may be 'designed in such a way that it . . . inevitably and quickly deteriorates from birth' (2003a: 575) and, for this reason, 'nativelike proficiency in a second language is unattainable' (p. 578) (cf. also ).
Referring to L1 evidence, Aram et al. note (1997: 85) that 'the end of the critical period for language in humans has proven . . . difficult to find, with estimates ranging from one year of age to adolescence'. The impact of such uncertainty regarding the critical age at which, purportedly, the language acquisition capacity abruptly deteriorates is twofold. First, it is taken by some to undermine the plausibility of the entire notion of a critical period for language acquisition, whether L1 or L2. Second, it deprives the concepts of 'early' and 'late' L2 learning of any kind of stable reference point and, therefore, of any real meaning.
With regard to the former observation, if there were clear evidence for the closing of a window of opportunity for language acquisition, surely it ought to be possible for researchers to agree where it is situated. The fact that there is so much DISagreement about this matter casts severe doubt on the whole idea of a critical period for language. After all, zoologists seem to have no hesitation in determining the critical period for the development of binocularity in various species - including humans (Almli & Finger 1987: 126). In fact, scepticism regarding critical periods affecting higher-order functioning in humans is not confined to the language domain. Thompson (2001: 87), for example, talking about child psychology in a general way, comes to the conclusion that 'the complexity of the behavioral systems to which these concepts are applied in young children makes it difficult, if not impossible, to identify the parameters of sensitive periods with appropriate specificity'. Bailey (2002) argues similarly that proof of the existence of developmental critical periods in human beings has been elusive, and that their invocation by child educators may in fact be a distraction from the real issues.
Concerning the point made in the previous paragraph about the stability of reference points, if twelve years is taken to be the critical age, L2 learning at age four should presumably be regarded as 'early' learning, but if twelve MONTHS is taken to be the critical age, L2 learning at age four would already be 'late' learning. At the very least, this implies that any comparison of EARLY and LATE L2 beginners must be discussed relative to a particular timing of the critical age, and that any such comparison risks falling foul of other views on the offset of the critical period.
Two further points need to be made in the context of the above discussion. The first is that variability in the views on the critical period concerns not only the timing of its offset but also the scope of its effects. Whereas, for example, Lenneberg (1967) saw such constraints as affecting language in general, for Scovel (1988, 2000, 2006), as we have seen, they are relevant only to the phonetic/phonological sphere. Other researchers limit the purported domain of the age factor quite differently. Cummins, for example, explains certain findings from critical period-oriented research in terms of age-related effects affecting basic interpersonal communicative skills (BICS) rather than cognitive academic language proficiency (CALP) (Cummins 1979: 199ff.). Other researchers have drawn a distinction between implicit and explicit learning. DeKeyser (2000, 2003a, 2003b, 2006) suggests that maturational constraints apply only to implicit language learning mechanisms. Clearly, the divergences evident in the approaches outlined above further fragment the maturational constraints hypothesis.
The second point has to do with the nature of the offset. In order to qualify as a critical period as this is understood in the biological sciences, the purported optimum period for language acquisition would have to end in a relatively abrupt manner. That is to say, the observed decline in capacity would have to be qualitatively different from the overall, gradual, linear decline in physical and cognitive capacities that is associated with the ageing of any organism. So we must ask: is this the case? Classic critical period-friendly studies such as that of Johnson & Newport (1989) have claimed to find a sharp discontinuity between the performance of L2 acquirers who began to be exposed to the L2 before or after the supposed offset of a critical period. The recent investigation by DeKeyser, Alfi-Shabtay & Ravid (2010), focussing on morphosyntax, also suggests a steep decline until the age of 18 followed by a levelling-off. Other researchers, however, have placed a question-mark over the notion that any age-related decline in language-acquiring capacity has an 'elbow-shaped' slope of the kind that one would expect if it were determined by a critical period (Bialystok & Hakuta 1999; Flege 1999; Birdsong 2004, 2006). Bialystok & Hakuta's re-analysis of Johnson & Newport's data (Bialystok & Hakuta 1994; Bialystok 1997) suggests 'that the tendency for proficiency to decline with age projects well into adulthood and does not mark some defined change in learning potential at around puberty' (Bialystok 1997: 122). Bialystok and her colleagues also analysed census data on reported age of arrival in an L2 English-speaking environment and reported proficiency in English; they detected a steady linear decline of reported proficiency as age of arrival increases, but no indication of a dramatically sharper rate of decline at any given point (Bialystok & Hakuta 1999; Hakuta, Bialystok & Wiley 2003; Wiley, Bialystok & Hakuta 2005).
We should note that Bialystok et al.'s interpretation of such census-derived data has been heavily criticized on methodological grounds (DeKeyser & Larson-Hall 2005), but, on the other hand, data on the relationship between L2 accent and age of arrival show a similarly continuous decline (cf. Flege 1999). Birdsong (2006: 12) comments that 'a recurrent finding is that a linear function captures the relationship between AoA [age of acquisition] and outcome over the span of AoA'. Reichle's publications (2010a, 2010b) provide further evidence of the continuity of a linear decline in L2 learners' capacities well into adulthood. This research focuses on information structure (IS), that is, the ways in which a speaker deploys syntactic cues to signal the important elements in a sentence, with particular reference to the French clefting construction involving c'est; for example (2010a: 24), C'est un marteau qui se trouve sur la table: 'A HAMMER is on the table'. Reichle administered an acceptability judgment interpretation of the c'est construction to L2 users of French all of whom were at least 18 years old, and who had resided in France for periods ranging from four to 32 years. In his results he found evidence for a decline in the accuracy of acceptability judgments as age of arrival increased, but he also found that '[t]he relationship between judgment task proficiency and age of arrival continues to decline past the years of physical and cognitive maturation and well into adulthood' (Reichle 2010a: 28). He interprets these findings as running counter to a strong version of the Critical Period Hypothesis. In a second study, Reichle again looked at acceptability judgments relating to the use of c'est. His experimental groups on this occasion consisted of low-proficiency and high-proficiency L2 users of French, all of whom were at least 18 years of age and whose average age of exposure was the mid-teens.
The results of this experiment were more difficult to interpret, suggesting that under controlled conditions truth value takes precedence over focus structure in acceptability judgments. Reichle concedes that in the light of such differences, further studies are needed, involving greater numbers of participants. Nevertheless, he affirms (2010b: 82) that the high level of native-like behaviour among late acquirers in both studies 'is contrary to one of the criteria for the critical period'.
A final comment on the shape of the decline of age effects: whereas an elbow shape would certainly strongly argue for the end of a privileged period for L2 acquisition, the smooth linear decline that is observed in a number of studies suggests an age-related decline more in keeping with what we know about the contour of general cognitive deterioration (see Birdsong 2006). Moreover, given the high degree of interrelation of the factors that influence L2 acquisition, it seems reasonable to hypothesize that the effects observed may result from the interplay of a range of variables (including age of acquisition) rather than the decline of any specific faculty. Of course, in order to verify such a hypothesis a great deal more research is required.
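The contrast drawn above between an elbow-shaped and a smoothly linear decline can be made concrete with a small numerical sketch. The Python fragment below is purely illustrative: the data are synthetic, the function names (`ols_slope`, `slopes_around`) and the candidate offset age of 15 are assumptions introduced here, and fitting separate least-squares lines on either side of a candidate offset is a simplification, not the procedure used in any of the studies cited.

```python
# Illustrative sketch only: synthetic scores, hypothetical function names.
# It compares least-squares slopes before and after a candidate offset age.

def ols_slope(xs, ys):
    """Return the ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

def slopes_around(ages, scores, offset_age):
    """Fit one line to arrivals up to offset_age and another to later arrivals."""
    early = [(a, s) for a, s in zip(ages, scores) if a <= offset_age]
    late = [(a, s) for a, s in zip(ages, scores) if a > offset_age]
    early_slope = ols_slope([a for a, _ in early], [s for _, s in early])
    late_slope = ols_slope([a for a, _ in late], [s for _, s in late])
    return early_slope, late_slope

# Synthetic ages of arrival (assumed range, 1 to 40 years).
ages = list(range(1, 41))

# Pattern A: one steady decline across the whole age range.
linear_scores = [100 - 1.5 * a for a in ages]

# Pattern B: steep decline up to age 15, then a near-flat continuation.
elbow_scores = [100 - 3.0 * a if a <= 15 else 55 - 0.2 * (a - 15) for a in ages]

linear_early, linear_late = slopes_around(ages, linear_scores, 15)
elbow_early, elbow_late = slopes_around(ages, elbow_scores, 15)

print("linear pattern slopes:", linear_early, linear_late)
print("elbow pattern slopes:", elbow_early, elbow_late)
```

In the linear pattern the two slopes coincide (approximately -1.5 on each side), whereas in the elbow pattern the early slope (approximately -3.0) is much steeper than the late one (approximately -0.2). With real, noisy data the difference between slopes would of course need to be tested statistically, and the choice of offset age would itself be open to the disagreements discussed earlier.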

Contextual and individual factors
Comparisons of the effects of age of onset or age of acquisition (AoA) and of length of residence (LoR) in the target language community have frequently shown that the correlation between age of initial learning and proficiency is higher than that between length of residence and proficiency. This has been used in support of the proposition that age effects are the most important, or even the only, determinant factor of variable proficiency in SLA. Recently, however, a number of researchers have highlighted the role of contextual and individual factors that may mediate or interrelate with age effects in SLA. In this section we begin by reviewing recent research that has focused on the impact of L2 input and contextual factors. We then review studies that have looked at socio-psychological and cognitive factors in relation to age.

Input and contextual factors
Research that has looked at the relationship between input and the age issue is diverse in focus and findings. First, a number of studies have proposed measures of input that are more revealing than the simple measurement of LoR in the target language country. Second, the work of researchers in the area of speech acquisition, and in particular the work of Flege and colleagues, has highlighted the importance of quality of L2 input. A third focus has been on a critical analysis of the temporal variables in age-related studies, mainly age of onset, length of residence or length of exposure, and biological age or age at testing. Fourth, in the area of instructed foreign language learning, there is incipient research on the amount of input needed for earlier starters to surpass later starters in the LONG TERM following expectations derived from findings in naturalistic learning settings.
LoR is easy to calculate in research with large numbers of participants, such as the typical cross-sectional studies that compare groups of learners with different AoA. In these studies, the significance of LoR appears small in comparison with the much greater effects of AoA (e.g. DeKeyser & Larson-Hall 2005). However, recent research has shown that the choice of LoR as an indicator of L2 input received and of L2 learning opportunities may be too crude to be genuinely revealing. This is most clearly seen in contrast to the very fine-grained type of research that gathers information about actual contact with the L2. A good illustration of such research is Jia & Aaronson's (2003) study mentioned above, from which it emerges that the same LoR may be associated with widely varying amounts and intensity of L2 exposure and use. In their longitudinal study of ten Chinese learners of English aged between five and sixteen, Jia & Aaronson were able to document how the five-year-olds benefited from more contexts of L2 interaction than the adolescents. For example, the former had more L2-speaking friends, while the latter chose more L1-speaking peers as their friends. In the same vein, Jia, Aaronson & Wu (2002) highlight the association of long-term L2 attainment with multiple variables, including long-term L1 attainment, in turn related to learners' language environment in the immigration setting. The participants in this study were 112 adults from different countries. They were administered an oral and written grammaticality judgment test (GJT) in English and in their L1, and a questionnaire that elicited biographical information including the kind of contact they had had with English. The authors found that an earlier AoA led to a better performance in both L2 GJTs, but that L1 and L2 performance were negatively correlated. Furthermore, participants with a greater mastery of L2 reported having used L2 at home more often than the other participants.
From results such as this it is clear that the measurement of language contact (i.e. use of L2 relative to use of L1, in addition to LoR) is more valid and appropriate than the mere counting of years after arrival in the L2 environment. The latter may sometimes be meaningless because of the important differences observed in the amount, intensity and diversity of L2 contact that learners, even those with the same LoR, may experience.
In the case of instructed language learning, in addition to factors such as length of instruction (in hours, semesters or years), measures that account for frequency of use of L2 outside the classroom may be even more revealing (Muñoz, in press). When foreign language learners experience stays abroad, length of immersion as well as a measure of use of L2 relative to use of L1 is also relevant (e.g. Freed, Dewey & Segalowitz 2004). In this line of work, further evidence of age-related differences in L2 contact is provided by investigation of the contexts of L2 interaction experienced by a group of 39 children (ages 10-11) and 46 young adults during a 2-3 month stay abroad. In this study, Llanes (2010) administered a questionnaire regarding the participants' experience while abroad, which revealed that the children spent more hours in interaction with native speakers of the L2 than the adults (on average 29.2 hours per week vs. 7 hours per week, respectively). The children also enjoyed a wider social network (all living with host families and with many classmates) than the adults, who tended to socialize most frequently with other non-native speakers and to live in apartments or halls of residence. Moyer (e.g. 2005) argues that over and above quantifiable measures of cumulative experience, it is essential to examine the quality of the target language experience, namely the range of contexts of target language use outside the classroom or during an immersion stay, as well as the relative interactivity required by those contexts (e.g. TV vs. face-to-face interaction). Indeed, a number of studies have shown that the domains in which the language is used may be particularly significant. For example, in a study by Moyer (2005), formal learners of German as a foreign language were tested on their spoken and written performance on certain selected syntactic constructions (their frequency and accuracy).
The analyses showed that experience of interactive realms of contact was more significant for spoken performance than instruction. Engagement in informal personal domains has also been observed to be a useful predictor of native-like pronunciation (Flege, Munro & MacKay 1995; Moyer 2004).
In the case of extended immersion experiences, living with native speakers is a strong predictor of long-term attainment over and above age of arrival. For instance, Marinova-Todd (2003) studied 30 post-pubertal learners of English from 25 countries who spoke 18 languages between them, and found that the six most proficient participants lived with native L2 speakers. Three attained native levels across all domains, including pronunciation, vocabulary size, grammatical knowledge, narrative skills, semantic comprehension and pragmatic skills. Kinsella (2009) studied 20 late-beginning near-native users of French who had English as their L1. Information about contexts of language use and interaction with native speakers was collected via a semi-structured interview. Those who scored within native speaker ranges on all tasks were married to native speakers of French, and were immersed both at home and at work in the target language. Similarly, Muñoz & Singleton (2007) found that the most successful late L2 learners, in a group of 11 Spanish-L1 near-native learners of English they studied, were living with native speakers of the target language.
To conclude this part of the argument, three methodological comments are in order. First, longitudinal studies are better at capturing crucial information about the quantity and quality of input to which the participants are exposed in age-related studies than traditional cross-sectional studies with large numbers of UNKNOWN subjects (this has been acknowledged recently by researchers in the latter tradition; see DeKeyser et al. 2010). Second, the studies reviewed above go beyond the scope of traditional enquiries into the age factor by using a mixed research methodology. This is a characteristic of a number of recent studies that use interview data and self-assessment in addition to proficiency measures (e.g. Bongaerts, Planken & Schils 1995; Bongaerts et al. 1997; Bongaerts 1999; Nikolov 2000; Marinova-Todd 2003; Moyer 2004; Urponen 2004), which allows for the triangulation of data. Third, it has been noted that a methodological improvement on the simple measure of LoR has been the employment of measures of learners' L1/L2 use. Typically, measures of L1/L2 use have been derived from background questionnaires (e.g. Flege et al. 1995; Piske et al. 2001; Flege, MacKay & Piske 2002; Flege & MacKay 2004) and, to a smaller extent, from semi-structured interviews with the participants (e.g. Marinova-Todd 2003; Muñoz & Singleton 2007; Kinsella 2009), or both (e.g. Moyer 2004; Muñoz in press). It has been suggested, however, that participants' self-estimates of language use may not be sufficiently accurate. In this connection, Flege (2009: 188f.) argues that more accurate estimates of L2 input may be obtained by means of techniques, such as the Experience Sampling Method (ESM), which record participants' current activity several times a day over a period of time.
Another approach to the re-examination of the role of input, from the area of speech acquisition research, highlights the role of high-quality input (equated traditionally to native-speaker input). In a longitudinal study, Winitz, Gillespie & Starcev (1995) followed a Polish boy over a seven-year period after his arrival in the US at the age of seven. The boy settled with his non-English speaking family in a small rural town, attending a school that had very few non-English speaking children and did not offer ESL classes. As a consequence, the boy received much more native-speaker input than most other children who immigrate in apparently similar circumstances, but are enrolled in ESL classes in schools located in large cities and have much more contact with other immigrants. The boy's oral productions were rated by native English-speaking listeners together with productions of other non-native boys of similar age and productions of native-speaker boys. It was found that the Polish boy's ratings increased rapidly over his first year in the US, becoming indistinguishable from ratings of the native speakers' productions. In contrast, the productions of the other non-native boys received lower ratings, indicating a foreign accent. The fact that these latter boys were exposed to abundant non-native input in school and outside school seems to reinforce the idea that the type of input received in the two cases had consequences for their speech development. In addition, it may be noted that the group of non-native boys were constructing an identity as part of a group of ESL children, which may also have had consequences on their targeted accent.
Flege (2009) contrasts these results with those from a study which also compared immigrant children in the US with English native-speaker children. In that study (Flege et al. 2006), two groups of Korean children who had arrived in the US at an average age of nine were tested on two occasions separated by 1.2 years. The two groups differed in LoR, one having been in the US for three years and the other group for five. Sentences produced by the Korean children and by age-matched native English children were recorded and later rated for degree of foreign accent by native English-speaking listeners. The sentences produced by the Korean children always received lower ratings, indicating a foreign accent. Differences in LoR between the two groups of Korean children did not produce significantly different ratings, nor did differences between the two testing times. Flege (2009) suggests that in contrast to the Polish boy in the study by Winitz et al., the Korean children in this study did not receive a substantial amount of native-speaker input. This inference is reinforced by findings from other studies with adult learners. Flege & Liu (2001) tested Chinese learners who had arrived in the US at an average age of 27 years. Half of the participants had a short LoR (mean 2.7 years) and half had a longer LoR (mean 6.6 years). In addition, participants were separated in terms of occupational status, i.e. according to whether or not they were students. When their performance on three tests was compared it was found that LoR did not significantly affect their results, whereas the interaction LoR × occupational status did. That is to say, the long-LoR students obtained higher scores in the three tests than the short-LoR students, but LoR differences were not associated with significant differences in results in the case of non-students.
Flege's (2009) interpretation is that only the students received a substantial amount of native-speaker input, because of their frequent interaction with their professors and fellow students. The non-students, for their part, had jobs that required little use of English and did not provide the same amount of high-quality input as the students received.
With regard to temporal variables in age factor studies, criticisms have been voiced about the operationalization and conceptualization of the most important variables in this type of research: mainly AoA and LoR; the re-examination of LoR has also considered participants' biological age at testing. To begin with, it has been argued that L2 acquisition may not really begin with the first encounter with the L2 but rather with the first SIGNIFICANT exposure (full immersion in the L2 and interaction with native speakers) (White & Genesee 1996: 242f.; Long 2005: 296; Birdsong 2006: 11; Muñoz 2008a: 584ff.). Consider, for example, the case of immigrants who reside in very tight-knit and relatively unintegrated communities and/or have a job that does not require use of the L2 (Flege & Liu 2001; Stevens 2006: 681). In such circumstances, immigrants are likely to hear different dialects of the L2, compatriots from the home country who speak the L2 with a similar foreign accent, and individuals from other L2 backgrounds with different L1-inspired foreign accents (Flege 2009). In addition, in cases in which the L2 community is strong, the target may be the accented speech that characterizes the community and becomes an identity marker (which suggests the inappropriacy of the native-speaker norm as the benchmark here). Such instances indicate clearly that AoA should not automatically be equated with age of arrival, which casts some doubt on the results obtained in studies that have been based on precisely this equation, since the attribution of late or early learning to participants may not have been accurate. It may seem paradoxical that certain studies that have not deemed it necessary to take into account individual factors in context do, nevertheless, disregard learners' school instructional experience prior to arrival in the country for similar reasons to those outlined above (e.g. Johnson & Newport 1989; Birdsong 1992).
The study by Hellman (2008) of 33 Hungarian-L1 adult-onset learners who had had more than 20 years of significant exposure to English is a good illustration of the new approach. In this study, AoA was defined as the time when significant daily interaction with native speakers of English began for each participant. This did not always coincide with age at arrival: it sometimes took participants many months to begin communicating regularly with native speakers of English, as in the case of a stay-at-home mother who did not begin to learn English until her children had started school. Instruction in the home country is another case in point: while the beginning of instruction in English in a classroom setting was not considered to be the beginning of significant exposure, the beginning of an English language-medium MBA programme taught mainly by native teachers was. The study looked at the size and depth of the learners' lexicon and the results showed that 76% of the participants were nativelike on all L2 vocabulary measures, and that the accomplishments of five of them on those measures were above the comparably educated native speaker mean. This large percentage of successful learners may be partly due to the fact that the study looked at the lexicon, a language dimension that may be less favoured by early exposure (but see Mayberry & Eichen 1991; Hyltenstam 1992; Long 2007: 50ff.). In addition, the fact that this study found a higher number of successful late-starters than did other ultimate attainment studies may be partly due to the operationalization of AoA as age of first significant exposure.
The identification of the age of first significant exposure warrants particular focus in studies with heritage speakers. In the study by Au et al. (2002), early passive exposure to a language spoken in the home allowed the participants who (re)learned the language in adulthood to perform well on phonological measures of the target language, but not on syntactic measures. However, recent research comparing heritage users and L2 learners of Spanish has shown the former also have an advantage over the latter in certain grammatical constructions (Montrul 2010). In this respect, Montrul's (2008) work provides evidence and argumentation for the significance of the input received in the home by heritage speakers. Crucially, simultaneous bilinguals are more prone to show incomplete acquisition in the minority language than sequential bilinguals, owing to reduced amounts of input in the family in early childhood, while their language was not yet fully developed. In contrast, sequential bilinguals are only exposed to the majority language in middle or late childhood, once the L1 is in a more advanced stage. Montrul's (2008) main claim is that L1 loss is related to age effects within the critical period (extending to age 8-10, which is when L1 is established and supported by school literacy) and to the availability of input. She supports this claim with evidence for the increasing degree of L1 loss in international adoptees, simultaneous bilinguals and sequential bilinguals, respectively. While Montrul argues in favour of the existence of a critical period, she nevertheless sees age of acquisition as a macrovariable that subsumes other interrelated factors, such as 'maturational state, biological age, cognitive development, degree of first and second language proficiency, amount of first and second language use, among others' (2008: 1).
In contrast, other authors have suggested that the significance of initial age of learning may be undermined precisely by the fact that it cannot be disentangled from other variables. Adopting this approach, Jia & Aaronson (2003) argue that arrival age is a confounded indicator of neurobiological maturation because it co-varies with environmental factors. For Flege (2009) arrival age is a proxy for several variables, including state of neurological development, state of cognitive development, state of L1 phonetic category development, levels of L1 proficiency, language dominance, frequency of L2/L1 use and kind of L2 input (native speaker versus foreign-accented). If age of arrival is seen as a macrovariable, its effects on ultimate attainment cannot, says Flege, be compared to those of one simple variable such as percentage of L1 use, or input. Alternatively, the effects of a single variable such as input (generally around 10% of the variance observed) may be relatively stronger than so far perceived.
In a similar vein, and on the basis of an examination of 20 late learners of German, Moyer (2004: 140) argues that while age may have some direct impact on ultimate attainment, socio-cultural context inherently brings to bear multiple influences on the learning process that coincide with age. Specifically, early exposure predisposes the learner to a greater variety of contact sources (formal and informal, personal and professional domains) as well as being associated with greater consistency and frequency of personal contact. All of this results in more opportunities to use the L2 and greater confidence and sense of self in the language, which ultimately lead to more practice opportunities and increased fluency.
LoR (length of residence) has typically served as the operationalization of naturalistic learners' amount of exposure between the initial point of contact with the L2 and testing time, as seen above. The effects of AoA and LoR appear paradoxical: children learn L2s more slowly than adults; yet the earlier one starts to learn an L2, the better one will typically speak it in the long run. That is to say, older learners have a rate advantage, whereas younger learners have an ultimate attainment advantage. Researchers have speculated about the length of time of residence or immersion in the L2-speaking environment that is required for a learner to have reached ultimate attainment in naturalistic learning settings. Given the difficulty of determining this, attention has been focused instead on determining the period after which no LoR effects can be found. This is important because it has been argued that the effects of the length of residence may be limited to an initial period (Long 2007), which could explain the fact that LoR effects have not been visible in a number of studies (e.g. Johnson & Newport 1989; DeKeyser & Larson-Hall 2005).
Different lengths of time have been suggested for this initial period: from five to ten years (Krashen, Long & Scarcella 1979); a minimum of ten years (DeKeyser 2000: 503); or an even longer period (Stevens 2006). The initial period has been used as a proxy for ultimate attainment, given the difficulty of determining when the end state has been reached (Birdsong 2004). In the case of instructed foreign language learning, it is even more difficult to speculate about the length of instruction needed to reach ultimate attainment. Estimating amounts of input equivalent to ten years of full immersion would yield unrealistic periods of time, suggesting that in such a learning situation the amount of exposure or input never ceases to be determinant, in contrast to the expectations derived from studies of naturalistic learners.
From a different angle, the end point of LoR has also warranted some critical attention. For research purposes the end point of the length of residence period is the time at which learners' testing takes place. It has been argued that age at testing may impact on ultimate attainment by becoming confounded with cognitive factors, education and other background variables (Bialystok & Hakuta 1999; Hakuta, Bialystok & Wiley 2003; Stevens 2006). With regard to cognitive factors, Birdsong (2006) has highlighted the difference between the effects of cognitive AGEING on L2 learning or age-related cognitive decline (i.e. effects in the temporal-associative areas of the brain, particularly) and the effects of MATURATION on L2 learning, that is, the probability of native-like ultimate attainment if learning begins before the end of the critical period. Stevens (2006) has focused attention on the linear dependence between the variables age at immigration, length of residence and age at testing. Studies that have used correlational analysis (e.g. Johnson & Newport 1989) have shown a high correlation between AoA and proficiency, but this entails a negative relationship between chronological age and L2 proficiency as well. This casts some doubt on the conclusion often reached that age at onset is the only predictive factor. Stevens argues, accordingly, that the problem cannot be unambiguously addressed by correlational analyses and hence requires a careful consideration of the concepts and conceptual processes that these three variables index. Stevens (2006: 684) also suggests that age at testing or chronological age is not just an indicator of biological processes associated with ageing, but also an excellent indicator of lifecycle stage, strongly associated with motivations and opportunities to speak and to maintain or improve proficiency in an L2.
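The linear dependence to which Stevens draws attention can be illustrated with a brief sketch. Assuming, for the purposes of illustration only, that age at testing is simply age at immigration plus length of residence, two quite different attributions of effects to the three variables yield identical predictions, so the data alone cannot decide between them. The participant values, the coefficients and the names `predict`, `model_a` and `model_b` are all hypothetical and are not drawn from any of the studies cited.

```python
# Illustrative sketch only: hypothetical participants and coefficients.
# Assumption: age at testing = age at immigration (AoA) + length of residence (LoR).

def predict(coefs, aoa, lor):
    """Predicted proficiency from a linear model in AoA, LoR and age at testing."""
    age_at_test = aoa + lor  # the linear dependence among the three variables
    b_aoa, b_lor, b_age = coefs
    return b_aoa * aoa + b_lor * lor + b_age * age_at_test

# Hypothetical (AoA, LoR) pairs in years.
participants = [(5, 10), (12, 8), (20, 15), (30, 4)]

# Model A: the negative effect is attributed entirely to age at immigration.
model_a = (-2.0, 1.0, 0.0)

# Model B: no AoA effect at all; the negative effect sits on age at testing.
model_b = (0.0, 3.0, -2.0)

predictions_a = [predict(model_a, aoa, lor) for aoa, lor in participants]
predictions_b = [predict(model_b, aoa, lor) for aoa, lor in participants]

print(predictions_a)
print(predictions_b)
```

The two lists of predictions are the same for every participant, although Model A locates the decline in age at immigration and Model B locates it in chronological age. This is one way of seeing why, on Stevens's argument, the question calls for conceptual analysis of what each variable indexes and cannot be settled by correlational analysis alone.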
In instructed foreign language learning, age at testing may also be seen as confounding with age of onset (Muñoz 2008a). In this case, older learners' greater cognitive maturity at testing may in part explain the positive relationship between L2 achievement and older age at learning in comparisons with younger learners, after the same amount of instruction. Specifically, age at testing may have an impact on test-taking skills, since older learners' superior cognitive development 'helps them achieve a better understanding of the task in comparison with younger learners' (Muñoz 2008a: 588). The work by Tragant & Victori (2006) (see also Victori & Tragant 2003) shows how older pupils make more frequent use than younger pupils of strategies that are adequate for accomplishing learning and testing tasks. In the same line, Larson-Hall (2008: 38) argues that 'if the effect for an earlier start is small, it may be hidden by the cognitive advantages that the older starters hold on the tests'.
In the case of instructed foreign language learning, where input is mostly limited to the classroom, earlier work showed late-starting pupils catching up on early-starting pupils (e.g. Oller & Nagato 1974; Burstall 1975). A criticism made of this research argued that both early and late starters were at some point in the same classes, which had a levelling-down effect on the early starters. Over the past decade, a large number of studies have been conducted that are free from this methodological flaw because the advancement of foreign language teaching in many European schools has allowed the comparison of cohorts of pupils with different starting ages who have not at any point been put together in the same classes (e.g. Cenoz 2002, 2003; García Lecumberri & Gallardo 2003; García Mayo 2003; Lasagabaster & Doiz 2003; Muñoz 2003; Navés, Torras & Celaya 2003; Perales et al. 2004; Álvarez 2006; Miralpeix 2006; Mora 2006; Muñoz 2006b; Torras et al. 2006; Kalberer 2007). These recent studies have yielded consistent results showing a rate advantage for the late starters over the early starters.
The rate advantage of older learners is one of two recurrent findings in ultimate attainment studies in naturalistic settings, the other being the long-term advantage of younger starters (Krashen, Long & Scarcella 1979). Consequently, the expectation that dominated the field until a few years ago was that early-starting formal learners would also surpass late-starting formal learners in the long term. However, these studies have not found the expected long-term advantage of younger children or early starters over older children or late starters, when comparisons are made after the same number of hours of instruction (see Muñoz 2008b). Because these studies were carried out within a typical school context, learners had received limited amounts of input (rarely exceeding 700 hours). In fact, it has been suggested that given the limited input in a school setting, young learners would need a much longer period of time to outperform older learners (Ellis 1994; Singleton 1995a, 1995b). Two recent studies have aimed at capturing long-term effects of instructed language learning (Larson-Hall 2008; Muñoz in press).
In order to bring into the comparative frame foreign language learners with longer periods of instruction, Larson-Hall (2008) tested 200 Japanese college students after an average amount of input of 1,923 hours (early starters) and 1,764 hours (late starters). Larson-Hall claims that though the participants may not have reached their point of ultimate attainment in English, 'this time period should be long enough for advantages, if there are any, to emerge' (p. 37). The participants were examined on a phonemic discrimination task and a grammaticality judgment task. The study found some modest but inconsistent advantages for an early starting age only when learners' input ranged between 1,600 and 2,000 hours (the approximate range being 500-4,000). Importantly, age was not entirely separate from amount of input but, in fact, interacted with it.
The study by Muñoz (in press) offers another comparison in respect of long-term achievement. This study tested 159 Spanish-Catalan college students with more than ten years of instruction in English (an average of 2,206 hours). The participants were examined on a general proficiency test, a lexical test and a phonetic perception test. While age of onset did not correlate significantly with performance on any of the three tests, length of instruction showed significant correlations with the results of the general proficiency test and the lexical test; in addition, a measure of language contact showed correlations with the phonetic perception test results. These findings confirm a previous prediction (see Muñoz 2006b) that in a typical limited-input foreign language setting, age does not yield the same type of long-term advantage as it does in a naturalistic language learning setting.

Socio-affective and cognitive factors
A number of researchers have recently investigated age effects in tandem with other variables such as social and psychological factors. Importantly, this multi-factor focus introduces qualitative instruments and analysis in an area that has traditionally used quantitative comparative analyses across large groups of learners (cf. the discussion in the previous section). The studies by Moyer (2004) and Kinsella (2009), which integrate quantitative and qualitative methods, are good illustrations of this new trend towards gathering L2 performance data alongside ethnographic data by means of individual interviews with the participants, as seen above. Moyer (2004) studied the L2 proficiency of 25 successful late learners in Germany. Three sets of instruments were used for the collection of data: a questionnaire gathering biological-experiential, social-psychological, instructional-cognitive and experiential-social factors; a series of controlled and semi-controlled production tasks; and a semi-structured interview. Psychological factors such as satisfaction with one's own phonological attainment and motivation accounted for 74% of the total variance in outcome, without either LoR or AoA. In their predictive power the two variables were as strong as (or stronger than) AoA and LoR combined (56%) (Moyer 2004: 81). Further, the qualitative analysis of the interview data underscored a set of psychological and social influences in the learner's individual learning experience: opportunities for contact; attitudes toward the target language culture; sense of self in L2 (motivation, behaviour and language function); perceptions of foreignness and belonging; and intention to stay. Moyer (2004) argues that the clusters of factors identified in the study are universally significant but the actual level of significance is a question of individual orientation in each case.
In a later discussion, Moyer (2009: 159) describes the learner's orientation to the target language as 'the main force behind how s/he utilizes L2 input'. In that respect, Moyer notes that cumulative research points to the need for a greater focus on sources or domains of input (see above) as well as on the learner's intention towards the L2, as illustrated by the messages to be drawn from cases where there is a shift toward L2 as the primary language.
Learners' intention towards the L2 is manifested in their willingness to use it. In their earlier-noted longitudinal study of children and adolescents, Jia & Aaronson (2003) show that language preference may be shaped by socio-psychological factors and that it may in turn powerfully influence language use (and language-dominance shift, as seen above) and proficiency. Jia & Aaronson argue that the greater readiness to interact in the L2 and the resultant more frequent use of the L2 on the part of younger learners may explain the switch observed in the literature from the short-term advantage that tends to characterize older learners to the long-term advantage that tends to characterize younger learners. In other words, it appears that in the medium to long term, environmental differences affecting the L1 and the L2 may accumulate and result in language proficiency differences.
In a similar fashion, affiliation to the L2 has been considered especially relevant for late learners of an L2. It has been suggested that linguistic and cultural affiliation to the L2, plus issues of identity, are closely connected with the degree to which there is active engagement in seeking opportunities for interaction with native speakers. The role played by such socio-psychological factors has recently been investigated in L2-focused age-related studies, principally with respect to phonology (Major 1993; Piller 2002; Moyer 2004). Moyer (2004, 2009) observes in this connection that once the shift in language affiliation occurs, the relationship between accent outcomes and LoR becomes predictable. A clear manifestation of affiliation to the L2 is the desire and motivation of L2 users to pass for native speakers (see Marx 2002, for a first-person account of identity phenomena in the appropriation of L2 accent; see also Kaplan 1994). Several recent studies have shown learners' motivation to sound nativelike to be a characteristic of successful L2 learners. Well-known studies are those conducted by Bongaerts and his associates in the area of L2 pronunciation. By way of example, Bongaerts (1999) reports that the participants in his study with outstanding pronunciation in English were highly motivated for professional reasons to sound like native speakers (see also Moyer 1999).
Likewise, a number of studies have reported on learners who did not want to pass for L2 native speakers (Nikolov 2000; Moyer 2004). Piller (2002) suggests that L2 users may assess, consciously or not, the advantages and disadvantages of passing for a native speaker, and such evaluations have consequences for L2 speech, resulting in variations in terms of perceived nativelikeness. In the same vein, Wray (2008: 265) notes that an L2 user may wish to avoid the possibility that the native-speaker interlocutor might take too much for granted regarding shared linguistic and cultural knowledge. Pavlenko & Lantolf (2000) offer the comment that, more than anything else, late bilingualism is founded on agency and intentionality, suggesting that people typically decide to develop their L2 only 'to a certain extent', which allows them to be proficient, even fluent, but also to avoid the consequences of losing their old, familiar ways of being in the world and having to adopt new ways (2000: 162). In this connection, Kinsella (2009) reports that some of the L1-English speakers in her study abandoned the desire to become nativelike in French after a period of residence in France and learned to appreciate the benefits of their foreignness.

Neurolinguistic dimensions
Recently, the use of sophisticated neuroimaging techniques in age-related SLA research has raised high expectations regarding their potential as a source of evidence relative to the CPH. However, the interpretation of neuroimaging-based research evidence is difficult at this stage, because we still know very little about, for example, the exact functions of brain areas involved in language production and perception, or about such issues as the relationship between the localization of neural substrates and language learning outcomes (cf. Fabbro 1999, 2002; De Bot 2000). In what follows, a brief account of neurolinguistic work is presented, identifying the foci of concern for researchers working in this area of inquiry.
A critical or a sensitive period for language acquisition accords well with the idea that a late-acquired L2 is represented in the brain differently from the L1. Specifically, under the strong version of the hypothesis, the classical language areas are not available for the learning of an L2 after the critical period. In fact, since the early days of the anatomoclinical approach the claim of differential localization for L2 has dominated neurolinguistics, following the discoveries of Broca (1861). For example, as noted by Abutalebi (2008), Scoresby-Jackson (1867) based his account of the selective loss of an L2 in an aphasic patient after brain damage on the idea that Broca's area is the language organ only for native languages, whereas the remaining part of the left inferior frontal gyrus might be responsible for L2s. In more recent times, a clear distinction has been made between the cognitive system mediating the meaning of words on the one hand, and the grammar of an L2 on the other. While the former tends to be seen as common across languages (Kroll & Stewart 1994; Francis 1999), the representation of L2 syntax has become a controversial issue. The line that argues for the existence of a critical period and a consequent different representation (i.e. at a cognitive level) of a language acquired after puberty is well represented by Paradis (1994, 2004). According to Paradis, there is a fundamental distinction between L1 and L2 learning in that the L1 grammar is acquired implicitly whereas an L2 grammar, if learned after the end of the critical period, is acquired explicitly. This implies that individuals may have declarative knowledge of the L2 grammar, even if the processes involved in production and comprehension may remain inaccessible to conscious awareness. A rationale for this position is provided by Ullman's Declarative/Procedural model (2001).
This model proposes that, in monolinguals, words are represented in a declarative memory system, whereas grammatical rules are represented in a cognitive system that mediates the use of procedures. Following Ullman (2001, 2004), procedural and declarative knowledge are mediated by distinct neural systems involving a fronto-striatal network (i.e. Broca's area and the basal ganglia) for the first type and left temporal areas for the second. An L1 is acquired implicitly and mediated by an innate language learning mechanism only triggered during a critical period, whereas an L2 is generally acquired explicitly via formal instruction and represented declaratively in a left temporal area along with L1 and L2 vocabulary. In consequence, grammatical knowledge for an L2 learned after the critical period may not be processed through the neural structures related to implicit processing such as Broca's area and the basal ganglia, as is the case for L1 grammar, but, rather, be subserved by the temporal memory system (Paradis 1994, 2004; Ullman 2001, 2004). This is the line taken by Clahsen and his colleagues (cf. Clahsen & Felser 2006; Felser & Clahsen 2009) in their investigation of grammatical processing by adult language learners. In their studies, adult L2 learners are observed to under-use syntactic information during the processing of morphologically complex words and sentences and to rely more on lexical-semantic cues to interpretation. This is taken as support for 'the shallow structure hypothesis for L2 processing according to which mature language learners interpret structurally complex sentences by assigning . . . representations to the input that lack grammatical detail . . .'.
As underlying reasons for this, these researchers suggest 'maturational changes in the brain during childhood and adolescence that make it harder for late language learners to acquire procedural processing routines, as suggested by Ullman (2005)' (Felser & Clahsen 2009: 315).
An alternative account that challenges the claim of differential localization for the L2 is Green's convergence hypothesis (2003). According to this hypothesis, 'the acquisition of the L2 arises in the context of an already specified, or partially specified, language system and L2 will receive convergent neural representation within the representations of the language learned as the L1' (Abutalebi 2008: 468). Following Abutalebi & Green (2007: 247), 'the neural representation of L2 (i.e. the brain structures implicated in its representation) converges with the neural representation of that language in native speakers - in other words, there is common neural network subserving L1 and L2'. Further, according to their dynamic view of bilingual speech production, the single network mediating the representation of a person's L1 and L2 is modulated by a control structure (integrated by separable neural systems including the anterior cingulate cortex, the basal ganglia, the inferior parietal lobule and, most prominently, the prefrontal cortex), and the manner in which this network operates depends on a person's proficiency in L2. More specifically, an increase in proficiency is accompanied by a shift from controlled to automatic processing and this will be accompanied by a reduction in prefrontal activity. In the end, neural differences between native and L2 speakers may disappear as proficiency increases. The convergence hypothesis has succeeded in bringing L2 learners' level of proficiency to the foreground so that both the current major strands of neurolinguistic investigation (those concerned with the WHERE question and those concerned with the WHEN question) have tended to focus on both AoA and level of proficiency as the potential predictors of degree of L1-like processing of the L2 (see Birdsong & Paik 2008: 427).
Research concerned with the WHERE question using functional neuroimaging techniques has found ample support for the notion that the same structures underlie the acquisition of L1 and L2. In his overview of neuroimaging studies focusing on grammatical and lexico-semantic processing in bilinguals, Abutalebi (2008) notes three types of studies: those that use an artificial grammar, those concerned with L1 processing (in particular with syntactic encoding for the characterization of the neural structures mediating grammar), and those focusing on grammatical processing in L1 and L2 in bilinguals. As an example, the activation exerted by different natural languages in bilinguals has been investigated by a large number of functional neuroimaging studies, and Abutalebi (2008: 469) lists 12 functional neuroimaging studies that seem to have contradicted the predictions of Ullman's Declarative/Procedural model, showing that overall both low and high proficiency bilinguals engage for L2 the same neural structures responsible for grammatical processing in L1.
However, when the L2 system is weak, neural differences between L1 and L2 may exist, it seems, for both grammatical processing and lexico-semantic processing. The lexico-semantic domain has been well studied by means of neuroimaging techniques, and the detailed findings can be found in comprehensive recent overviews (Indefrey 2006; Abutalebi & Green 2007). Abutalebi (2008) summarizes the main findings. In contrast to the results from studies that failed to take account of L2 proficiency level (e.g. Kim et al. 1997), it transpires that when L2 proficiency is comparable to L1, neuroimaging studies have reported common activations in similar left frontal and temporo-parietal brain areas which are engaged when monolinguals perform the same tasks, both in single word production tasks (e.g. Hernández et al. 2001; Ding et al. 2003) and in retrieval tasks (Chee, Tan & Thiel 1999). On the other hand, bilinguals with low proficiency in L2 engaged additional brain activity in the latter type of tasks, mostly in prefrontal areas (e.g. Briellmann et al. 2004), as well as in lexical decision tasks (Illes et al. 1999; Pillai et al. 2003) and semantic judgment tasks (Chee et al. 2001; Wartenburger et al. 2003; Rueschemeyer, Zysset & Friederici 2006).
Exposure to the L2 may also have a strong influence on language inter-dependency in the bilingual lexico-semantic system. Perani et al. (2003) investigated the effect of 'differential exposure' to L2 in a functional Magnetic Resonance Imaging (fMRI) study of two groups of early highly proficient bilinguals (Spanish-born or Catalan-born). Speakers in the group that had Spanish as L1 and lived immersed in Catalan, as assessed by an extensive questionnaire, activated less of the left prefrontal cortex for word generation in L2 than speakers in the group that had Catalan as L1 and were less exposed to Spanish (their L2). Following Abutalebi (2008), these exposure-related differences, observed in the left dorsolateral frontal cortex, are in line with evidence from previous studies in monolinguals, reporting that experience and practice on language task performance might result in decreased neural activity within the left prefrontal cortex (Thompson-Schill, D'Esposito & Kan 1999). More evidence for the role of exposure is provided by the study by Pallier et al. (2003), which indicates that when L1 input is no longer available, the L2 seems to be able to take over the role of the L1. In this study Pallier et al. tested eight Korean participants who were adopted by French-speaking families between the ages of three and eight. Behavioural and fMRI data revealed that the Korean participants did not differ significantly from the French controls. Pallier et al. (2003: 158) interpret their results as evidence 'in favour of the reversibility of plastic changes associated with language acquisition in the first few years of life'. However, in work also comparing Korean adoptees with highly proficient Swedish learners of Korean, Hyltenstam et al. (2009) challenge Pallier's results on methodological grounds.
Their principal argument is that the attrited language can be reactivated through intense relearning, which explains the advantage shown by some of the adoptees in their own work. In contrast, they say, the lack of a relearning programme in Pallier et al.'s study had not allowed L1 remnants to re-emerge.
It is suggested that L2 proficiency and exposure may be the main determinants for lexico-semantic processing, whereas the age of L2 acquisition seems to have no major role in this domain (Indefrey 2006; see also Perani & Abutalebi 2005). In contrast, in the grammatical domain, the neural substrate is argued to be more dependent on age of acquisition effects than proficiency effects. Following Abutalebi (2008), the available evidence points to the role of age of acquisition, since late L2 learners recruit more neural resources around the areas mediating L1 syntax (e.g. Wartenburger et al. 2003; Jeong et al. 2007). However, Abutalebi also notes that many of the studies showing that late learners engage the prefrontal cortex more extensively were carried out with relatively low L2 proficiency subjects. The more extensive activity along the left prefrontal cortex could then be due to the degree of L2 proficiency rather than to age of acquisition. An exception is the study by Wartenburger et al. (2003) that investigates late bilinguals with a nativelike proficiency. The study reports that these highly proficient bilinguals activate the prefrontal cortex more extensively, suggesting that grammatical processing may be neurologically wired in. Abutalebi (2008) concludes that more studies are needed that investigate very high proficiency bilinguals, in order to see if the neural basis for grammatical processing depends on age of acquisition. Another problem with studies such as that of Wartenburger et al. (2003) is that the analysis does not allow us to see if the age effect has an elbow shape, showing abrupt discontinuity after a certain age, or if, on the contrary, there is a continuous decline with age. In this study, late learners have an AoA greater than six and a mean AoA of 19 (with a standard deviation of 6.6); that is to say, there is a wide variation in age. In fact, the authors themselves conclude that the exact time-frame of any critical period remains unspecified.
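The contrast between an elbow-shaped age function and a continuous decline can be made concrete with a small sketch. The Python example below is purely illustrative: the data are synthetic, the function names (fit_line, fit_elbow) and the hinge model score = a + b * min(age, c) are assumptions introduced here as one possible way of operationalizing an 'elbow', and nothing in it reproduces the analysis of Wartenburger et al. (2003) or of any other study cited. It compares the residual error of a single straight line with that of a model in which scores decline up to a breakpoint c and are flat thereafter.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b, sse)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    # Sum of squared residuals: lower means a closer fit.
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return a, b, sse


def fit_elbow(ages, scores, candidates):
    """Try each candidate breakpoint c and fit score = a + b*min(age, c).

    Returns (c, a, b, sse) for the breakpoint with the smallest error.
    This hinge form (decline, then plateau) is an assumed, simplified
    stand-in for an 'elbow'; other shapes are equally conceivable.
    """
    best = None
    for c in candidates:
        hinge = [min(age, c) for age in ages]
        a, b, sse = fit_line(hinge, scores)
        if best is None or sse < best[3]:
            best = (c, a, b, sse)
    return best


# Synthetic, noise-free data (hypothetical values, for illustration only).
ages = list(range(1, 31))                              # age of acquisition
elbow_scores = [100 - 4 * min(a, 12) for a in ages]    # decline stops at 12
linear_scores = [100 - 2 * a for a in ages]            # steady decline

breakpoints = range(3, 29)  # candidate offsets well inside the age range

elbow_fit = fit_elbow(ages, elbow_scores, breakpoints)
line_on_elbow = fit_line(ages, elbow_scores)
line_on_linear = fit_line(ages, linear_scores)
elbow_on_linear = fit_elbow(ages, linear_scores, breakpoints)

print("elbow data:  best breakpoint", elbow_fit[0],
      "| hinge SSE", elbow_fit[3], "| line SSE", line_on_elbow[2])
print("linear data: hinge SSE", elbow_on_linear[3],
      "| line SSE", line_on_linear[2])
```

On data generated with a genuine breakpoint, the hinge model recovers that breakpoint and fits far better than a straight line; on steadily declining data the straight line fits at least as well. A comparison of this kind presupposes that AoA is sampled densely on both sides of any candidate offset, and a real analysis would additionally have to model noise and penalize the extra breakpoint parameter.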
In this context, Snow (2002) argues that the real question about age differences in brain localization is whether it implies ANYTHING about behaviour or about critical periods. Besides, even if differences clearer and greater than those shown in research so far were found, they could still be irrelevant to the issue of the critical period for language acquisition. As Marinova-Todd, Marshall & Snow (2000: 17) put it, 'it is entirely possible that adult and child learners localize their learning differently without showing different levels of learning or alternatively show similar localization but different learning outcomes'. One may also note that neuroimaging studies concerned with location have been criticized from the point of view of theory construction. Beretta (2009: 70) suggests that findings generated by PET (Positron Emission Tomography) or fMRI 'may be merely epiphenomenal, just as linearity is epiphenomenal in linguistic theory - every sentence has to be heard or produced in a linear string, but that observation has no theoretical status'.
Fewer studies using neuroimaging methods have been carried out for the WHEN question. Existing findings point out that, in general, the same neural resources are used in processing L2 and L1, even when the L2 is learned relatively late. However, Stowe & Sabourin (2005) conclude their review of studies that have examined ERP (event-related potential) responses pointing out that L2 processing may show more subtle differences, particularly in the extent to which the ERP waveform is modulated by the amount of early input or input over the lifetime, by proficiency, and by differences in the linguistic systems of the L1 and the L2. It is also claimed that for less proficient learners, there appears to be evidence of an overuse of some parts of the system used by the native speakers (Sabourin 2003).
Not many studies have both recorded ERPs and contrasted different stages of L2 exposure or proficiency. One was the study by Weber-Fox & Neville (1996) in which ERP responses to linguistic anomalies among Chinese-English bilinguals were examined. The participants had been exposed to English at various ages, ranging from one to sixteen years, and differences in terms of delays of response, amplitude and scalp distribution were reported. It was not clear, however, that the groups had been matched in proficiency, and the younger group showed a more nativelike profile but also seemed to be the most proficient group. Uncontrolled subject variability has been recognized as a serious problem with cross-sectional research in this paradigm. The more recent work by Osterhout and colleagues (2006) attempts to overcome this problem by using a longitudinal design. Specifically, these researchers used ERPs to assess changes with increasing L2 exposure or proficiency in online comprehension of an L2 in novice student learners. By means of longitudinal studies, the researchers found that remarkably little L2 instruction was needed before learners incorporated L2 knowledge into their online comprehension processes. The apparent contradiction between this fast rate and the conventional belief, based on production data, that adult L2 syntactic learning is slow is resolved by separating comprehension from production: as in L1 learning, a learner's ability to understand the language develops in advance of his or her ability to produce it. Importantly, the researchers argue in favour of their choice of longitudinal studies because it minimizes between-subject variability, suggesting that the lack of consistency across experiments reflects, in part, uncontrolled subject variability. For example, if the variability across subjects is large enough, the researchers argue, the grand average might reflect only the accidental overlap of effects that were present in many learners in the sample. 
In this case, comparisons among 'L2 learners who were exposed to the L2 at different ages might produce reliable differences among the groups but might not represent true effects of age on acquisition' (Osterhout et al. 2006: 207).
In sum, it may be concluded that findings from a number of neurolinguistic studies conducted in the 1990s (e.g. Weber-Fox & Neville 1996; Kim et al. 1997) cannot provide decisive evidence concerning the existence of a critical period, because they fail to relate differences in brain activation patterns to differences in target language proficiency. More to the point are the findings from some recent studies reviewed above that, in considering apparent age of onset effects, take due account of target language proficiency and use. However, studies which meet this criterion are very few in number, and the problem remains of how to interpret the findings and draw relevant inferences.

Concluding summary
In this review we have critically explored the widely-held view that ultimate attainment in an additional language is predictable overwhelmingly or even solely on the basis of the chronological age at which exposure (whether significant or other) commences. We have subjected to close scrutiny the notion that the maturational constraints of a critical period for language acquisition are the crucial operative factor.
We began by raising questions about how age-related L2 research has deployed the criterion of the proficiency exhibited by monolingual native speakers as THE yardstick for ultimate L2 attainment. We have gone on to detail the vast amount of variability associated with critical period advocacy. We have then subjected to close scrutiny the notion of age of onset, and have highlighted the widely neglected importance of quality and amount of input and learners' attitudes and orientations. Finally, we have addressed the claim that maturational constraints manifest themselves in terms of changes in the organization of neural substrates for additional languages, countering this claim with evidence that many of these differences seem to be WASHED OUT with increasing L2 proficiency, although more research is needed to understand differences in grammatical processing in particular.
We have taken the line throughout that the evidence to be found in a number of recent L2 attainment studies of factors other than maturational should be brought more to the fore and treated more seriously. This is not to deny the reality of age-related effects in L2 acquisition, but rather to suggest that some of those effects relate to variables other than maturation as such, and that to focus exclusively on maturation impoverishes and distorts discussion of the relevant issues.
We have referred in our treatment of the topics specified above to the lack of confirmation in the literature of the existence of an abrupt maturational cut-off point in L2 learning capacity of the kind that would normally be associated with the ending of a critical period as usually understood. This question will no doubt continue to be debated, but our view is that until and unless an 'elbow' CAN be seen as clearly associated with the purported offset of any postulated maturational window of opportunity for language acquisition, age-related factors in L2 acquisition will need to be interpreted in the same light as age-related factors in every other domain of learning.
Overall, our suggestion is that a loosening of the association between ultimate L2 attainment research and Critical Period Hypothesis issues would open the way to a richer perspective on L2 attainment and a fuller harvest of empirical findings and theoretical insights. This would, in our judgment, remain the case even if at some future date some version of the Critical Period Hypothesis were to be put in place with a less variable definitional profile and less questionable empirical foundations than the version(s) we currently have before us.

Further research
It is customary in proposing future lines of research in the L2 acquisition domain to call for more longitudinal research. We shall not diverge from this pattern, since such research is needed, especially in the probing of the role of age in respect of ultimate attainment. If discussion in this area is ever to get beyond the stage of disputes about the interpretation of statistics, we require research which examines in real time and in a thorough and detailed manner the relationships among (i) onset variables, (ii) the full array of L2 learners' experience, contexts, attitudes and orientations and (iii) L2 development across the life-span.
It would also be valuable if further studies were to replicate the classic critical-period research design, but comparing the attainment of later L2 beginners against that of early L2 beginners rather than native speakers of the language in question. It is clear from our earlier treatment of the matter, and for the reasons given, that we are highly doubtful about the value of using monolingual native-speaker proficiency as the yardstick for evaluating the attainment of early and late L2 beginners. A more useful comparison would be simply between the ultimate L2 attainment of high-proficiency late and early beginners, although any such comparison would have to ensure that the context and conditions of exposure, as well as attitudes and orientations, were similar in all cases. Another possibility would be to compare very high-achieving late beginners with high-proficiency, simultaneous, balanced bilinguals with competence in both the L1 and the L2 of the late beginners. Presumably, such bilinguals' language processing would be subject to similar kinds of interaction between the two languages in question as that of the late beginners.
With respect to input-limited instructed language acquisition, it would be interesting for future research comparing older and younger L2 starters to replicate the typical comparative design but with the important modification of treating the teacher's input (rather than some general conception of native speaker norms) as the model against which to measure learners' achievement. The area of pronunciation would be especially interesting in age-related studies, since it would allow us to elucidate in a precise manner whether children's purported advantage in the phonetic/phonological domain is confirmed, by means of a better imitation of the accentual particularities of the specific model presented by the teacher's speech.
A research area in need of further elucidation is whether a high level of language learning aptitude is a prerequisite for high levels of proficiency in late learners only. Such elucidation will require serious theoretical work on the construct of language aptitude and its components, as well as the development of a broader range of valid and reliable testing instruments. Work examining the potential predictive role of psycholinguistic factors such as working memory could also make a valuable contribution in this domain. Furthermore, a longitudinal design would be of particular interest here for tracing the predictive power of individual aptitude factors.
We have repeatedly emphasized the need in ultimate attainment comparisons to attend very closely to the amount of input received by the L2 learners under scrutiny and to the manner in which it has been delivered, as well as to the quality of the target language experience. Studies which have delivered on this are relatively thin on the ground. We strongly suggest that such attention should become the norm in future research in this area, and in any replications of existing studies. Although very close attention to input would certainly necessitate keeping numbers of participants low, this would be compensated for by a clearer understanding of the factors that impinge on individuals' language learning processes and outcomes.
Finally, with respect to instructed learners, it seems particularly timely at the moment to examine the age-related benefits that a stay abroad may bring them. Whereas studies in this area to date have usually examined a number of factors, such as learners' orientations or learners' proficiency level, and have typically focused on young adults, more and more adolescent and also child school learners are spending some time in the target language country. There is a need to investigate the role played by age in the outcomes of stay-abroad experiences, especially on the types and amount of L2 contact that learners at different ages have while in the new setting. In addition to stays abroad, formal learners now have sources of unlimited target language input at their disposal from media and internet use. The degree and manner in which this L2 exposure influences learners' proficiency over time may also give new insights into the mediating role of input and learning context in the effects of age on SLA.