Stefanie Wulff



About the UF Corpus Linguistics Lab


Steffi Wulff
Lab Director
webpage

The UF Corpus Linguistics Lab is located in the basement of Turlington Hall at the University of Florida and is one of several labs of the Linguistics Department. In the Corpus Linguistics Lab, we investigate language data using corpora. Corpora are large-scale digital collections of language. The lab offers access to various corpora of English, German, Spanish, and other languages; corpora of written and transcribed spoken language; and specialized corpora such as corpora of academic speech and writing, corpora of learners of English as a second language, and the like. Access to these corpora is provided using various software tools such as AntConc, MonoConc Pro, WordSmith Tools, and R. The lab also provides access to E-Prime for experiments.

In the UF Corpus Linguistics Lab, we see corpus linguistics as a method, not a theory. All faculty and students affiliated with the Corpus Linguistics Lab are united by their commitment to rigorous, empirical analyses of language data. Correspondingly, the researchers affiliated with our lab conduct research in various theoretical frameworks and on a wide range of topics, including language processing, second language acquisition, and the synchronic and diachronic description of languages such as Dutch, English, Spanish, and many others. A list of some of our ongoing research projects appears below.

If you are a student interested in studying with us, we want to speak with you. Please contact the lab director, Stefanie Wulff (swulff@ufl.edu).

Lab Members


Anna Bjorklund
B.A. Student/Lab Volunteer


Chad Hammock
Research Assistant


Edith Kaan
Faculty Member
webpage


Ethan D. Kutlu
Ph.D. Student
webpage


Alexandra Lavrentovich
Ph.D. Student


Marc Matthews
Ph.D. Student

Affiliate Members


Laurence Anthony
Waseda University
webpage


Ryan Boettger
University of North Texas
webpage


Jorge González Alonso
University of the Basque Country
webpage


Stefan Gries
UC Santa Barbara
webpage


Nick Lester
UC Santa Barbara
webpage


Magali Paquot
Université catholique de Louvain
webpage


Mike Putnam
Penn State University
webpage


Ute Römer
Georgia State University
webpage


Jason Rothman
University of Reading
webpage


Debra Titone
McGill University
webpage

Former Lab Members


Steven Critelli
B.A. Student/Lab Volunteer


Erica Drayer
B.A. Student/Lab Volunteer


Dylan Attal
B.A. Student


Corinne Futch
B.A. Student/Research Assistant


Isa Hendrikx
Visiting Scholar


Martha Hinrichs
B.A. Student


Hali Lindsay
B.A. Student/Lab Volunteer


José Molina
B.A. Student


Rebecca Morris
B.A. Student


Noah Rucker
B.A. Student/Lab Volunteer


Chen Si
Ph.D. Student/Lab Assistant


Alexander Webber
M.A. Student/Lab Volunteer

Current Projects

Stefanie Wulff and Ryan K. Boettger (with the help of research assistant Chad Hammock): Collaborative Research: Evaluating a data-driven approach to teaching technical writing to STEM majors
(research project; funded by NSF #1708360/#1708362)
Overview. This research project seeks to improve the quality of writing instruction for undergraduates majoring in science, technology, engineering, and mathematics (STEM). Understanding disciplinary differences in writing has become increasingly relevant as instruction moves from literature-based composition courses in English departments to include technical writing and content-based courses taught by scholars in different disciplines. One effect of these changes is that students need to write in a way that conforms to the practices of a discipline they may not (yet) be familiar with. However, STEM undergraduates have little access to customized, discipline-specific writing instruction. A solution to this problem is engaging students in a form of data-driven learning (or DDL) that teaches them how to write in their discipline rather than apply generalist writing principles that contradict how professionals actually communicate. An interdisciplinary team of researchers will develop a series of DDL instructional units for STEM undergraduates in both multi-major writing-intensive courses and STEM-focused content courses in physiology and ecology. Unit content and students' application of the instruction will be validated through peer review and revised via a control-group quasi-experimental design. Results and instructional materials will be disseminated through publications, workshops, and publicly available web tutorials.
Intellectual Merit. Introductory technical writing courses provide a great service to STEM departments, but it is not uncommon for instructors to have 20 different majors represented in their classroom. This project includes an innovative combination of characteristics designed to help writing and discipline-specific instructors customize their curriculum to meet the needs of all students: (i) It introduces modern corpus-linguistic methods that make large-scale studies possible, covering more text types and more language features, rather than case studies of a small number of individuals, classes, or texts. (ii) The DDL environment provides STEM students an accessible forum for applying these techniques and learning to overcome writing deficiencies that are prevalent in their disciplines. (iii) The project's personnel encompass content, language, and methodological expertise and represent three disciplines: biology, linguistics, and technical communication. (iv) The effects of DDL will be assessed in four diverse populations at a major public institution that reflects the global demographics and instructional challenges for teaching technical writing. The inclusion of multiple instructional settings will address how DDL transfers to diverse STEM settings and influences how students learn technical writing.
Broader Impacts. The proposed project advances discovery and understanding of how STEM students learn to write in their disciplines. Additionally, the project fosters new interdisciplinary collaborations focused on a fundamental component of STEM education—technical writing. STEM undergraduates need customized writing instruction and enhanced communication skills to prepare for the workforce. To help these students and their instructors, the team will disseminate the following for public use: (i) the Technical Writing Project (TWP), an online corpus of student technical writing previously compiled by the lead researchers; (ii) materials for the instructional units; and (iii) a series of web tutorials for audiences engaged in STEM writing practices on how to use the TWP and the instructional materials for individual and classroom learning purposes. The team will also disseminate the research findings through conference presentations, workshops, and peer-reviewed research within linguistics, technical communication, and STEM education. These venues attract academics and practitioners as well as national and international audiences.

Ethan Kutlu: Factors impacting native speakers’ FAS judgments
(Ph.D. dissertation project)
In this dissertation, we aim to understand and identify factors that can affect a rater's judgments while hearing foreign-accented speech. Many L2 learners face daily discrimination as their speech may be accented and thus considered incomprehensible. In comparison to a regional accent, which is generally found to be more acceptable, foreign-accented speech (FAS) is often regarded as problematic (Gluszek & Dovidio, 2010). Since the early 1970s, FAS has been examined in the (related) fields of linguistics, second language acquisition, and more recently, social psychology (Munro & Derwing, 1995; Ferguson et al. 2010; Van Engen & Peelle, 2014). Meanwhile, linguistic studies in accentedness and speech perception agree that speech perception is variable, and that humans can identify sounds even with minimal acoustic cues (Hillenbrand, Clark & Baer, 2011). This raises two questions: What makes FAS different from other kinds of speech variation? Why is FAS judged so negatively by so many native speakers?

Alexandra Lavrentovich: Using classifier features to determine cross-linguistic influence on the developmental trajectory of English morphemes
(Ph.D. dissertation project)
One prevailing position in second language acquisition (SLA) research is that learners of another language (L2) follow a predictable, fixed path in the acquisition of morphosyntactic structures (Goldschneider & DeKeyser, 2001; VanPatten & Williams, 2007), regardless of their dominant language (L1) background (Ellis, 1994; Ortega, 2009). For example, grammatical morpheme studies propose a so-called natural order for English learners (Krashen, 1987). However, recent literature reviews, experimental studies, and corpus approaches have cast doubt on the fixed nature of developmental sequences (Hulstijn et al., 2015; Weitze et al., 2011; Murakami & Alexopoulou, 2016). For example, Luk and Shirai (2009) find Japanese and Spanish learners of English show different hierarchies of accurate use of three morphemes, which may be explained by the presence or absence of the equivalent morpheme in the L1. In a longitudinal corpus study, Murakami (2016) shows individual variation and non-linearity in trends for accurate use across proficiency levels. Hence, L1 background and proficiency can reorganize the predicted order of morpheme acquisition. Aligning with the current research, this dissertation investigates cross-linguistic influence in the developmental trajectory of English grammatical morphemes. The research aims to model the dynamic and emergent nature of morpheme production by using a longitudinal learner corpus and computational methodology. The research has the following goals:
1. Quantitatively detect the under/overuse of grammatical morphemes between learners with different L1 backgrounds and qualitatively examine what underlies these patterns to determine cross-linguistic influence.
2. Model the absence and presence of grammatical morphemes for individual learners across different proficiency levels to determine the extent of individual variation in morpheme accuracy development.
The data will come from the EF-Cambridge Open Language Database (EFCamDat), a 33-million-word longitudinal corpus of English learner scripts written by students enrolled in a virtual learning environment (Geertzen et al., 2014). From the data, I include Chinese, Spanish, Portuguese, Arabic, Russian, and German learner groups as they are the most represented in the corpus, accounting for over 70% of the data (Alexopoulou et al., 2015; Nisioi, 2015). The learners pass through 16 proficiency levels in the online curriculum that correspond to the language proficiency guidelines A1 through C2 set forth by the Common European Framework of Reference.
The main goal is to determine how cross-linguistic influence (CLI) might reorganize the predicted morpheme order at different proficiency levels of a learner’s developmental trajectory. To demonstrate L1 influence, I will follow criteria from a detection-based approach (Jarvis & Crossley, 2012) which uses frequency differences between English writing patterns and the selected L1 backgrounds. The criteria for determining CLI are as follows: (1) intragroup homogeneity: where learners with the same L1 show similar morpheme developmental trajectories; (2) intergroup heterogeneity: where learners with different L1 backgrounds show different trajectories; (3) cross-language congruity: where learners use an English pattern that is similar to one they have in their L1; and (4) intralingual contrasts: where learners differentially use an English feature depending on how congruent that feature is in their L1.
One way to meet the criteria is to carry out a Native Language Identification (NLI) task where a machine classifier identifies a learner’s L1 based solely on the learner’s English writing. An NLI analysis identifies the specific English features most likely to be affected by the L1 which we may not detect from more subjective, manual, surface-level analyses (Crossley, 2012). A computational approach to CLI has the advantage of being able to deal with a large quantity of very similar data points (e.g., the distribution of function words across all learner essays) and estimating the probability of a learner's L1 given subtle patterns in the data (e.g., the overuse or underuse of function words). I will use a support vector machine classifier with features such as part-of-speech n-grams and function words. The findings from the classification task will be used to determine patterns of the presence or absence of specific linguistic features between L1 groups and how these patterns may change across proficiency levels. To further explore longitudinal factors and individual variation, I will use generalized additive mixed models on individuals in the corpus.
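As a rough illustration of the function-word features mentioned above, the following Python sketch turns a learner text into a vector of relative function-word frequencies, the kind of input that could then be passed to a classifier such as a support vector machine. It is not the pipeline used in the dissertation: the word list, the tokenizer, and the function name are hypothetical choices made only for this example.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny function-word list (an assumption for this
# sketch); a real study would use a much fuller inventory.
FUNCTION_WORDS = ("the", "a", "an", "of", "in", "to", "and")


def function_word_profile(text, words=FUNCTION_WORDS):
    """Return the relative frequency of each function word in `text`.

    The result is a list aligned with `words`; each value is the number of
    occurrences of that word divided by the total number of tokens.
    """
    # Very simple tokenizer: lowercase runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    total = len(tokens)
    if total == 0:
        return [0.0 for _ in words]
    counts = Counter(tokens)
    return [counts[w] / total for w in words]


if __name__ == "__main__":
    essay = "The cat sat in the hat"
    for word, freq in zip(FUNCTION_WORDS, function_word_profile(essay)):
        print(f"{word}\t{freq:.3f}")
```

Comparing such profiles across L1 groups is one simple way to look for overuse or underuse of individual function words before any classifier is trained.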
The intellectual merit of this research will be in its triangulation of learner corpora, computational methods, and qualitative analysis to show how differences between learners can be approached in a data-driven way. The study looks at the emergence and distributional frequencies of grammatical morphemes for English learners with different L1 backgrounds across increasing proficiency levels. The NLI approach improves on manual comparisons or learner case-studies because we can use larger data sets, make more objective decisions for where L1-specific language transfer effects may occur, and perform more semi-automatic analyses on other available corpora. There’s also evidence that classifiers outperform human experts in detecting L1 background (Malmasi et al., 2015).
The broader impact of this research is to exploit the findings on cross-linguistic transfer and individual variation in hypothesis-making in SLA and pedagogy. For example, the NLI task contributes to SLA research by adding quantitative data to known transfer effects that an otherwise manual inspection could miss and may help with hypothesis-making as to why these transfer effects exist. For direct applications in language teaching and learning, L1-specific transfer effects can be used informatively to tailor instruction, feedback, and methods in the classroom and curriculum as well as be applied to language teaching technology.

Stefanie Wulff and Stefan Th. Gries (with the help of lab volunteers Anna Bjorklund, Steven Critelli, Erica Drayer, Corinne Futch, Hali Lindsay, and Noah Rucker): Cognitive determinants of oral and written blend formation
(research project)
In this research project, we aim to take a first step towards addressing this gap by conducting an experimental study in which native speakers of English are asked to blend source words together. The source word stimuli will be systematically controlled for the different cognitive determinants mentioned above. In a crucial extension of our previous research with Dylan Attal (see below), we will elicit blends both orally and in writing from our participants. The results will be statistically evaluated both monofactorially (means, interquartile ranges, and exact tests) and multifactorially by means of a linear model that identifies which factors contribute to an increasing distance of the chosen cut-off points to the ideal ones as determined by the predictors (Gries 2006). The findings of this study stand to make a valuable contribution to our understanding of subtractive word formation processes by providing us with first clues regarding an online production and comprehension model of blending and by informing our understanding of the differences between creative and conscious word formation processes such as intentional blending compared to involuntary and unconscious word formation processes such as speech errors.

Stefanie Wulff and Stefan Th. Gries (with the help of research assistant Corinne Futch): Particle placement in L2 learner language
(research project; funded by a Language Learning Small Research Grant)
In this project, we are carrying out the first large-scale, corpus-based analysis of particle placement in learner language. Particle placement is a word-order alternation that involves the variable position of the particle in English transitive phrasal verbs (The squirrel picked up the nut vs. The squirrel picked the nut up). While the alternation has been researched intensively in native language, it has not yet been examined at this scale in learner language. The study includes data from three L1 backgrounds (Chinese, German, and Spanish) as well as native English speakers; data from the spoken and written modes; and a statistical model integrating a large number of variation parameters known to influence alternations in general, especially under-researched phonological constraints.

Past Projects

Marc Matthews: Need to, have to, and must: a collostructional analysis
(2015/2016 graduate advanced study project)
Modal verbs are a challenge even for intermediate-advanced learners of English. In this study, Marc examines three near-synonymous modal verbs in English, have to, need to, and must, in order to identify semantic nuances that distinguish these three verbs in authentic language use. The ultimate goal of the study is to present a number of teaching suggestions to improve learners’ understanding of how to use these modal verbs. To this end, Marc retrieved >5,000 tokens of the three modal verbs from the 2012 spoken sub-section of the Corpus of Contemporary American English. He is now in the process of annotating that data for several variables that we believe to impact native speakers’ choice of modal, including the subject (pronominal vs. lexical nouns), the animacy of the subject, the degree of association between the modal and the matrix clause verb, and the absence or presence of negation. We will subject the data to a series of collostructional analyses (Gries & Stefanowitsch 2004), and, ultimately, a multinomial regression analysis, to determine which factors play a role in the choice of modal, if and how these factors interact, and how important they are relative to each other.

Martha Hinrichs: The role of surprisal in L2 syntactic priming
(2014/2015 University Scholars Program Fellowship undergraduate study project)
In this project, Martha investigates the double object alternation in L1 Korean L2 English written production data. In Korean, all ditransitive verbs can be used in the prepositional object construction, while only some verbs such as cwu- (give) can also be used in the Korean double object construction (Jung & Miyagawa 2004). This raises the question whether advanced Korean learners of English would exhibit the same kind of constructional priming effects observed for other L2 English learners at advanced levels of proficiency, and if so, to what extent these priming effects are modulated by verb-specific knowledge that is aligned with the constructional verb preferences of native speakers (Gries & Wulff 2005, 2009). This study uses the Young English Learners Corpus (YELC), a compilation of essays written by students in South Korea. Distinctive Collexeme Analysis (DCA; Gries & Stefanowitsch 2004) will be employed to measure each verb’s associative bias towards either construction in the learner data. These statistics will then be compared to native English speakers’ preferences documented in previous research.
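The core idea of a distinctive collexeme analysis can be sketched in a few lines of Python: for each verb, a 2x2 table crosses "this verb vs. all other verbs" with "construction A vs. construction B", and an exact test on that table indicates how strongly the verb is biased towards one construction. The sketch below uses only the standard library and a two-sided Fisher exact test; the function names and the counts in the usage example are made up for illustration and are not taken from the YELC data or from the project's actual scripts.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell
        # when all row and column totals are held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables that are no more probable than the observed one
    # (small tolerance guards against floating-point ties).
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


def distinctive_collexeme(verb_a, verb_b, other_a, other_b):
    """Return (preferred construction, p-value) for one verb.

    verb_a / verb_b:   tokens of the verb in construction A / B
    other_a / other_b: tokens of all other verbs in construction A / B
    """
    total = verb_a + verb_b + other_a + other_b
    expected_a = (verb_a + verb_b) * (verb_a + other_a) / total
    if verb_a > expected_a:
        preferred = "A"
    elif verb_a < expected_a:
        preferred = "B"
    else:
        preferred = "none"
    return preferred, fisher_exact_two_sided(verb_a, verb_b, other_a, other_b)


if __name__ == "__main__":
    # Made-up counts: A = double object, B = prepositional object.
    print(distinctive_collexeme(verb_a=30, verb_b=5, other_a=100, other_b=200))
```

Running the same function for every verb and ranking the results by p-value gives a list of verbs ordered by how distinctive they are for either construction.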

Rebecca Morris: L1 vs. L2 idiom processing: investigating the role of morpho-syntactic variation
(2014 undergraduate individual study project; completed)
In this project, Rebecca will test native and non-native speakers’ sensitivity to different variants of V NP idioms. Adopting a usage-based perspective, the hypothesis under investigation is that native speakers' speed and accuracy in determining whether a given phrase has an idiomatic or literal meaning should depend on the surface form that the phrase is presented in. More specifically, native speakers are expected to identify idiomatic and literal senses faster when the phrase is presented in its most typical, i.e. frequent, surface form. A second hypothesis to be tested is that non-native speakers should exhibit the same qualitative behavior, yet less pronounced than native speakers. The underlying assumption is that since non-native speakers have had less input, and therefore have weaker mental representations of what constitutes the most typical variant forms, they will be less able to rely on their knowledge of these more or less fixed assemblies, which should show in their judgments and/or reaction times. In order to test these two hypotheses, Rebecca will first identify the most frequent as well as less frequent variant forms of a set of 60 V NP idioms (which are available as a data sample from previous and ongoing research of Dr. Wulff) and use these as stimuli in a combined judgment and RT task.

Rebecca graduated from the University of Florida in 2015 and is currently a Ph.D. student at Indiana University.

Dylan Attal: Cognitive determinants of blend formation: an experimental approach
(2013/2014 University Scholars Program Fellowship and honors thesis project; completed)
In a television commercial broadcast at the 2013 Super Bowl, the sandwich company Subway let its customers know that throughout the month of February, any sandwich would cost only 5 dollars. In order to make this promotion more memorable, they referred to it as Februany, a blend of February and any [sandwich]. Blending is an extremely popular word creation process, especially for advertisement campaigns and newspaper headlines – both genres in which space is limited and publishers compete for consumers’ attention. Blends fit the bill because they are compressed language, and they are catchy.
From a cognitive-linguistic research perspective, blends raise one major question: what factors impact the way in which a speaker blends two words together? For example, what makes brunch, a blend of breakfast and lunch, a better blend than breakfunch? Previous research on the basis of large collections of blends suggests that speakers take a variety of cognitive determinants into consideration in order to achieve the ideal balance between economy (the bigger the overlap of words, the better) and recognizability of the source words (the more material of both source words remains intact, the better). These cognitive determinants include various characteristics of the source words, such as their phonetic, phonemic, graphemic, segmental, and semantic similarity as well as their frequency in language. How exactly these characteristics interact in the online production and comprehension of blends, however, remains largely unclear to date. Gries (2012: 166) correspondingly points towards the dire need to leave behind purely descriptive linguistic accounts and turn to psycholinguistic concepts, notions and methods instead... With regard to experimental approaches, it would be interesting to have speakers coin blends of source words while controlling for many of the factors known to influence blending.
In this research project, we aim to take a first step towards addressing this gap by conducting an experimental study in which native speakers of English are asked to blend two source words together. The source word stimuli will be systematically controlled for the different cognitive determinants mentioned above. The results will be statistically evaluated both monofactorially (means, interquartile ranges, and exact tests) and multifactorially by means of a linear model that identifies which factors contribute to an increasing distance of the chosen cut-off points to the ideal ones as determined by the predictors (Gries 2006). The findings of this study stand to make a valuable contribution to our understanding of subtractive word formation processes by providing us with first clues regarding an online production and comprehension model of blending and by informing our understanding of the differences between creative and conscious word formation processes such as intentional blending compared to involuntary and unconscious word formation processes such as speech errors.

Dylan completed his USP Fellowship project and his honors thesis in April 2014. His honors thesis earned highest honors.

José Molina: Constructional priming as a function of L2 proficiency and L1 background
(2013 Honors thesis project; completed)
In this project, José elaborated on a previous study by Gries & Wulff (2005) that tested advanced German L2 English learners’ knowledge of verb argument structure constructions such as the ditransitive construction (José gave Steffi the paper) and the prepositional dative construction (José gave the paper to Steffi). José replicated two experiments, a syntactic priming experiment and a semantic sorting experiment. Rather than investigating only advanced learners of English from a single L1 background as in Gries & Wulff (2005), José elicited data from L2 learners at low-intermediate levels of proficiency, and from different L1 backgrounds. He found that the main contributor to priming was the verb provided in the sentence fragment to be completed (as opposed to the verb presented in the prime sentence or the construction provided in the prime). Furthermore, he observed pronounced verb-specific effects such that certain verbs primed either construction significantly more often than others.

José completed his honors thesis project in December 2013 and was awarded highest honors. He graduated from the University of Florida in 2014 and then earned an M.A. degree in computational linguistics at Brandeis University.

