Research

Computational work:

Natural language processing for cognitive models of learning and memory
A substantial body of literature explains episodic and semantic memory for words. More recently, I have been integrating machine learning models of word and phrase meaning to explain memory for words and phrases, implementing the verbal model from my dissertation. These models draw on context-based accounts of linguistic representation and of memory, in which context evolves over time and is used to anchor meaning.
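As a minimal sketch of what “context evolves over time” means here, the snippet below drifts a context vector across a word sequence by blending in each word’s embedding. The mixing rate, dimensionality, and random embeddings are illustrative stand-ins, not values from my models.

```python
import numpy as np

def drift_context(word_vectors, rho=0.8):
    """Evolve a context vector across a word sequence: at each step the
    current word's (normalized) embedding is blended into a slowly
    drifting context state that can later anchor that word's meaning."""
    context = np.zeros_like(word_vectors[0])
    trace = []
    for v in word_vectors:
        v = v / np.linalg.norm(v)
        context = rho * context + (1.0 - rho) * v
        trace.append(context.copy())
    return trace

# Hypothetical 50-dimensional word embeddings standing in for model output.
rng = np.random.default_rng(0)
sentence = [rng.standard_normal(50) for _ in range(5)]
states = drift_context(sentence)
print(len(states), states[0].shape)  # 5 context snapshots, each 50-dimensional
```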

Limitations of context-sensitive models of word meaning
Some neural network models (e.g. ELMo and BERT) produce context-dependent representations of individual word meanings, so a word like “up” can have a more literal meaning in “hang up” but a less literal one in “give up.” I have found that context-sensitive word representations from these models are unable to account for human judgments of phrase similarity (e.g. that “government leader” and “party official” are highly similar). Specifically, these models need a much broader context to make these judgments correctly; unlike people, they cannot rely on non-immediate memory or prior experience.
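To illustrate the kind of comparison involved, the sketch below pulls contextual vectors for “up” out of BERT in two carrier sentences and measures their similarity. It assumes the HuggingFace transformers and torch packages and the bert-base-uncased checkpoint; the sentences are my own examples.

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def contextual_vector(sentence, target):
    """Return the hidden state for the first occurrence of `target` in `sentence`."""
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state[0]  # (sequence_length, 768)
    token_ids = inputs["input_ids"][0].tolist()
    position = token_ids.index(tokenizer.convert_tokens_to_ids(target))
    return hidden[position]

# "up" gets a different vector in each sentence because the surrounding context differs.
v_hang = contextual_vector("she decided to hang up the phone", "up")
v_give = contextual_vector("she decided to give up the search", "up")
print(torch.cosine_similarity(v_hang, v_give, dim=0).item())
```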

Learning phrase compositionality
Most computational models of word meaning produce vectors associated with individual words. The result is a “geometric” interpretation of meaning, in which relationships can be thought of as axes: pairs of words that stand in the same type of relationship (e.g. run-runs and draw-draws) are separated by similar vectors. Compositionality can be thought of in a similar way. That is, vector representations of combinations that are not literal should deviate from the representations of their component words. I have been exploring methods that test whether this deviation can explain human compositionality judgments.
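A minimal sketch of the deviation idea: score a phrase by the cosine between its own vector and the sum of its words’ vectors. The embeddings below are random placeholders; in practice they would come from a model trained with phrase tokens (e.g. “give_up” treated as a single token), and compositionality_score is just a name used here for illustration.

```python
import numpy as np

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def compositionality_score(phrase_vector, word_vectors):
    """Cosine between a phrase's own vector and the sum of its words' vectors.
    Literal combinations should score high; idiomatic ones should drift away
    from their parts and score low."""
    return cosine(phrase_vector, np.sum(word_vectors, axis=0))

# Placeholder embeddings; real ones would come from a trained model.
rng = np.random.default_rng(0)
emb = {w: rng.standard_normal(300) for w in ["give", "up", "give_up"]}
print(compositionality_score(emb["give_up"], [emb["give"], emb["up"]]))
```

Scores like this can then be correlated with human compositionality ratings to test how well the geometric account holds up.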

Behavioral work:

Language production of sequences
Relevant publications: [1] [2]
Compound words and multiword expressions are complicated: we have to identify them, remember them, and then produce them fluently. When speakers produce compounds, do they plan and produce them like monomorphemic words? And when we retrieve phrases for production, is the process similar, or is each word retrieved more discretely? My work has shown that compounds are produced similarly to monomorphemic words (Jacobs & Dell, 2014), but that phrases are decomposable: retrieving one word from a phrase triggers retrieval of the other to the extent that the two co-occur (Jacobs, Dell, & Bannard, 2017).

Episodic memory for multiword sequences
Relevant publications: [1] [2]
A number of theories have proposed that multiword sequences are stored as unanalyzed wholes in long-term memory. As part of my dissertation work, I applied recognition memory and free recall paradigms in novel ways to understand phrase processing (Jacobs, Dell, Benjamin, & Bannard, 2016; Jacobs, Dell, & Bannard, 2017). I have found that phrase frequency effects can arise even under incremental production, without needing to posit phrase representations per se. More recently, I have been investigating whether phrase frequency effects arise spontaneously during the recall of visual information (Jacobs & Ferreira, in prep).

The role of auditory memories in speakers’ prosodic decisions
Relevant publications: [1]
Research on spoken word production tends to focus on the mechanics and goals that shape the phonetic forms of speakers’ utterances. Context and experience, however, also influence speakers’ decisions. But which experiences matter? Can simply saying a word in your head change your fluency? In a series of projects, I have found that hearing a word is critical for production decisions (Jacobs et al., 2015; Buxó-Lugo, Jacobs, & Watson, under revision; Tippenhauer, Jacobs, & Watson, in revision), though it is not the only factor, as speakers reduce even when auditory input is degraded (Jacobs, Loucks, Watson, & Dell, in prep).

Tools:

Building distributional semantic models of words, phrases, and entities
Building word, phrase, and entity embeddings can be difficult and often requires significant hacking of existing packages. When working with traditional matrix factorization methods, though, the input space can be transformed to your liking. The nontology package makes that easy.
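For a sense of the underlying pipeline (a generic sketch, not nontology’s own API): build an item-by-context count matrix, reweight it, and factorize it. The PPMI weighting and truncated SVD below are standard choices, and the toy counts are made up.

```python
import numpy as np
from sklearn.decomposition import TruncatedSVD

def ppmi(counts):
    """Positive pointwise mutual information reweighting of a count matrix."""
    total = counts.sum()
    row = counts.sum(axis=1, keepdims=True)
    col = counts.sum(axis=0, keepdims=True)
    with np.errstate(divide="ignore", invalid="ignore"):
        pmi = np.log(counts * total / (row * col))
    pmi[~np.isfinite(pmi)] = 0.0   # zero counts carry no association
    return np.maximum(pmi, 0.0)    # keep only positive associations

# Toy word-by-context counts; the rows could just as well be phrases or entities.
counts = np.array([[4.0, 1.0, 0.0],
                   [2.0, 3.0, 1.0],
                   [0.0, 1.0, 5.0]])
embeddings = TruncatedSVD(n_components=2).fit_transform(ppmi(counts))
print(embeddings.shape)  # one dense 2-dimensional vector per row item
```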

Easier forced alignment and annotation in Praat [Coming soon!]
Running the p2fa forced aligner lost me a lot of time. I’ve since discovered gentle, which I combine with praatio to make forced aligning easy in Praat. No more stress over a forced aligner pipeline that breaks! Keep your eyes peeled for updates to gently!
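Until gently is released, here is roughly the glue involved, as a sketch: it assumes gentle’s JSON output (a “words” list with “word”, “start”, “end”, and “case” fields) and praatio 5’s textgrid module (older praatio exposes tgio instead); the file names are placeholders.

```python
import json
from praatio import textgrid  # praatio >= 5; earlier versions: `from praatio import tgio`

# Placeholder paths; substitute your own gentle output and desired TextGrid name.
with open("utterance_gentle.json") as f:
    alignment = json.load(f)

# Keep only the words gentle managed to align, as (start, end, label) intervals.
entries = [
    (w["start"], w["end"], w["word"])
    for w in alignment["words"]
    if w.get("case") == "success"
]

tier = textgrid.IntervalTier("words", entries, minT=0, maxT=max(e[1] for e in entries))
tg = textgrid.Textgrid()
tg.addTier(tier)
tg.save("utterance.TextGrid", format="short_textgrid", includeBlankSpaces=True)
```

The resulting TextGrid opens directly in Praat with a word-level interval tier ready for annotation.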