Works in progress

Outside of my main field of research (memory for multiword expressions), I am working on several issues related to statistical regularities and language processing. Here are some that I’m currently investigating:

Comparing different computational measures of semantic similarity for cognitive and psycholinguistic tasks
[with Katrin Erk, Linguistics, University of Texas]
A number of different measures of semantic similarity exist for explaining human behavior, such as the number of overlapping features, distance in a formal ontology like WordNet, or distributional similarity as measured by LSA, word2vec, GloVe, or other models. The suitability of any one measure depends on the application, such as predicting naming latencies in the picture-word interference paradigm, the order in which speakers produce words in a list, or even the selection of stimuli for different experiments. I am documenting how different applications should take into account the way the different models are trained and the advantages and limitations of each, and demonstrating that for some tasks, different measures may tell very different stories about the semantic relatedness between two words.

Audience design, uncertainty, and emoji usage
Different platforms have well-documented differences in emoji appearance. Some platforms’ renderings are notoriously incongruent with those on other platforms. In this work, I am analyzing the linguistic contextual properties of a subset of emoji that humans have judged not to map onto the same meaning across platforms. I am especially interested in using linguistic markers to predict whether speakers actively avoid using emoji whose interpretations differ across platforms.

Linguistic compositionality and hapax legomena in Instagram hashtags
Language is compositional in the sense that complex expressions are built from smaller, meaningful parts. Instagram hashtags are an especially rich source for identifying novel language. In this study, I am looking at the linguistic properties of the component words used in hashtags to predict whether a hashtag will be used only once.

Spontaneous speech prosody provides cues to conventionalized referent names
[with Margaret Fleck, Computer Science at UIUC]
Speakers’ prosody, especially word and segment durations, predicts whether a phrase is “a thing” or not, either in the discourse, in the language community, or as a formally accepted concept. This work looks at the different prosodic cues in spontaneous speech that listeners might take into account when determining whether a collocation that they have never heard before (e.g. alcoholic beverages or heart attack) is a conventionalized name. Such cues could help listeners learn new phrases composed of familiar words.