This paper proposes that unsupervised phonetic and phonological learning of acoustic speech data can be modeled with Generative Adversarial Networks. Generative Adversarial Networks are uniquely appropriate for modeling phonetic and phonological learning because the network is trained on unannotated raw acoustic data, learning is unsupervised without any language-specific inputs, and the result is a network that learns to generate an acoustic speech signal from random input variables. A GAN model for acoustic data proposed by Donahue et al. (2019) was trained on an allophonic alternation in English, where voiceless stops surface as aspirated word-initially before stressed vowels except if preceded by a sibilant [s]. The corresponding sequences of word-initial voiceless stops with and without the preceding [s] from the TIMIT database were used in training. Measurements of VOT of stops produced by the Generator network were used as a test of learning. The model successfully learned the allophonic alternation without any language-specific input: the generated speech signal contains the conditional distribution of VOT duration. The results demonstrate that Generative Adversarial Networks have potential for modeling phonetic and phonological learning, as they can successfully learn to generate an allophonic distribution from only acoustic inputs without any language-specific features in the model. The paper also discusses how the model's architecture can resemble linguistic behavior in language acquisition.
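The VOT-based test of learning described above can be illustrated with a minimal sketch. It assumes VOT durations (in ms) have already been measured on generated stops in the two conditions, with and without a preceding [s]. The values below are invented for illustration only, and the permutation test is a stand-in technique, not the statistical analysis used in the paper. The function name `permutation_test` is hypothetical.

```python
import random
from statistics import mean

def permutation_test(a, b, n_perm=5000, seed=0):
    """Two-sided permutation test on the difference of means (a - b)."""
    rng = random.Random(seed)
    observed = mean(a) - mean(b)
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_perm):
        # Shuffle condition labels by shuffling the pooled measurements.
        rng.shuffle(pooled)
        diff = mean(pooled[:len(a)]) - mean(pooled[len(a):])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the p-value strictly above zero.
    return observed, (extreme + 1) / (n_perm + 1)

# Hypothetical VOT measurements in ms (NOT data from the paper).
vot_plain = [55, 62, 48, 70, 66, 58]   # word-initial stop, no preceding [s]
vot_after_s = [22, 18, 30, 25, 27, 20]  # stop preceded by [s]

diff, p_value = permutation_test(vot_plain, vot_after_s)
print(f"mean VOT difference: {diff:.1f} ms, permutation p = {p_value:.4f}")
```

If the Generator has learned the conditional distribution, VOT in the condition without [s] should be reliably longer than in the condition with [s], which such a comparison would make visible.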
This paper presents applications of a technique for estimating influences of the Channel Bias on phonological typology called Bootstrapping Sound Changes (BSC). Following Author (2017), we argue that for any synchronic alternation, the BSC technique enables estimation of the probability that the alternation arises based on the number of sound changes it requires and their respective probabilities. This paper develops and illustrates further applications of the proposed model. With the Bootstrapping Sound Changes technique, we can compare Historical Probabilities of attested and unattested alternations and perform inferential statistics on the comparison, predict (un)attestedness in a given sample for any alternation, and derive quantitative outputs for a typological framework that models both Channel Bias and Analytic Bias influences together. The BSC technique also identifies several mismatches in typological predictions of the Analytic Bias and Channel Bias approaches. By comparing these mismatches with the observed typology, the paper attempts to disambiguate Analytic Bias and Channel Bias influences on typology.
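The general idea of estimating the probability of an alternation from the sound changes it requires can be sketched as follows. This is a simplified illustration under stated assumptions, not the BSC procedure itself: each sound change's probability is taken to be the proportion of surveyed languages in which it is attested, the sound changes are treated as independent so that their probabilities multiply, and a percentile bootstrap supplies a confidence interval. The counts and the function names `historical_probability` and `bootstrap_ci` are hypothetical.

```python
import random

def historical_probability(counts, n_languages):
    """Product of per-change proportions (independence is an assumption)."""
    p = 1.0
    for c in counts:
        p *= c / n_languages
    return p

def bootstrap_ci(counts, n_languages, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the product of proportions."""
    rng = random.Random(seed)
    # One binary vector per sound change: 1 = attested in that language.
    samples = [[1] * c + [0] * (n_languages - c) for c in counts]
    estimates = []
    for _ in range(n_boot):
        p = 1.0
        for sample in samples:
            # Resample languages with replacement and recompute the proportion.
            resampled = rng.choices(sample, k=n_languages)
            p *= sum(resampled) / n_languages
        estimates.append(p)
    estimates.sort()
    lower = estimates[int((alpha / 2) * n_boot)]
    upper = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Hypothetical survey: 200 languages, three required sound changes
# attested in 10, 20, and 40 of them (NOT figures from the paper).
counts, n_languages = [10, 20, 40], 200
point = historical_probability(counts, n_languages)
lower, upper = bootstrap_ci(counts, n_languages)
print(f"estimate = {point:.5f}, 95% CI = [{lower:.5f}, {upper:.5f}]")
```

Because the bootstrap yields a distribution of estimates rather than a single number, two alternations can be compared by checking whether the interval for the difference of their estimates excludes zero, which is the kind of inferential comparison the abstract refers to.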
This paper presents two cases of lexically gradient phonotactic restrictions that operate against what would be phonetically natural: Tarma Quechua and Berawan. The paper shows that the unnatural trends in the lexicon are statistically significant, phonetically real, and show clear signs of productivity, with evidence from loanword phonology and from morphophonological alternations. Based on the two cases presented, we argue that gradient phonotactics can operate in the unnatural direction: in a context where one value of a feature (in our case, [±voice]) is phonetically dispreferred across languages, that marked feature value may actually be preferred by phonotactics (e.g., voiceless rather than voiced stops after a nasal or intervocalically). To our knowledge, this is the first report of a (truly) unnatural gradient phonotactic restriction on segmental structure. The unnatural gradient phonotactics in Tarma Quechua and Berawan bear theoretical implications: we demonstrate that Harmonic Grammar with Con restricted to natural constraints disallows phonotactic restrictions that favor the unnatural feature value in a given environment, contrary to what is attested in our data. Based on Author 1, the paper also proposes a new historical explanation for the development of these unnatural patterns. We argue that each of them is the result of a special sequence of three phonetically natural sound changes — a so-called “blurring chain”. This hypothesis explains several peculiar aspects of the data, derives the typology of natural/unnatural processes, and provides a new historical explanation for unnatural patterns in general.
The Philippine-style Austronesian voice system (AVS), which serves to identify a single privileged argument leading to the Subject-Only Restriction, is well-known for its highly articulated nature. While the synchronic status of the AVS has been explored extensively, its diachronic development is less clear. This paper fills the gap in our understanding of the development of the AVS while simultaneously exploring the effectiveness of internal reconstruction as a tool of historical syntax. I argue that both voice marking and the nominalizing function of the AVS affixes were already present at the Proto-Austronesian stage. The analysis presented here capitalizes on simple and independently motivated syntactic phenomena: case marking, the shift from prepositions to preverbs, and reanalysis. Based on these features, I show that the AVS developed out of the reanalysis of reflexive markers into markers of intransitivity and out of prepositions incorporated into the verb complex; these two different sources of voice marking explain why the morphological exponents of different voices are differently positioned in the verb form. The proposed reconstruction straightforwardly accounts for a number of AVS properties, including the prominence of arguments promoted to subject position, the Subject-Only Restriction, the existence of various peripheral functions of the voice affixes, the placement of the affixes, asymmetries in their functions, and tendencies in the later development. The historical analysis also has implications beyond Austronesian, in allowing us to explain the cross-linguistic distribution of adpositions and preverbs and to capture the descriptive facts of a similar morphosyntactic system outside Austronesian: the voice system in Dinka.
This paper addresses one of the most contested issues in phonology: the derivation of phonological typology. I present a new model for deriving phonological typology within the Channel Bias approach. First, a new subdivision is proposed: non-natural processes are divided into unmotivated and unnatural. The central topic of the paper is an unnatural alternation: post-nasal devoicing. I argue that in all reported cases, post-nasal devoicing does not derive from a single unnatural sound change (as claimed in some individual accounts of the data), but rather from a combination of three sound changes, each of which is natural and motivated. By showing that one of the rare reported cases of unnatural sound change actually arises through a combination of natural sound changes, we can maintain the long-held position that any single instance of sound change has to be natural. Based on several discussed cases, I propose a new diachronic model for explaining unnatural phenomena: the Blurring Process. Additionally, I provide a proof establishing the minimum number of sound changes required for an unmotivated/unnatural process to arise. The Blurring Process and Minimal Sound Change Requirement result in a model that probabilistically predicts typology within the Channel Bias approach. This paper also introduces the concept of Historical Probabilities of Alternations (Pχ) and lays the groundwork for their estimation, called Bootstrapping Sound Changes. The ultimate goal of the new model is to quantify the influences of Channel Bias on phonological typology.