28 Apr 2012

On the computational nature of syntax

I found an amazing article called On the nature of syntax (2008) by Alona Soschen who, in a nutshell, uses language as a means to examine possible underlying features common to other adaptive systems. Strangely enough, this intrigues me as a programmer too. To quote the abstract:
"There is a tendency in science to proceed from descriptive methods towards an adequate explanatory theory and then move beyond its conclusions. Our purpose is to discover the concepts of computational efficiency in natural language that exclude redundancy, and to investigate how these relate to more general principles. By developing the idea that linguistic structures possess the features of other biological systems this article focuses on the third factor that enters into the growth of language in the individual. It is suggested that the core principles of grammar can be observed in nature itself."
While this is a powerful subject in itself, there are also some interesting facts mentioned within about extreme language structures. It's stated that nouns technically rank higher than verbs universally speaking, and this helps explain why the Australian language Jingulu quite astonishingly has only three true verbs in its vocabulary: "do", "go" and "come". More extreme yet, the Nigerian language Igbo (aka Ibo) has, in place of verbs proper, inherent complement verbs made up of -gbá plus a noun (eg. -gbá egwú "to dance", literally "do dance"; -gbá igwè "ride a bicycle", literally "do bicycle"). This confirms an impression from my experience with computer programming: verbs are equivalent to "computer functions" that operate on input data (ie. nouns). Nouns then indeed are most primal, since one must have data before any function can operate on it. Nonetheless verbs are a close second in importance, since not much could be expressed without them, just as not much could be programmed without functions. The interrelationships between language, logic and the qualities that create an adaptive system keep me busy for hours.
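To put that programming analogy concretely, here's a minimal sketch of my own (the function name and strings are invented for illustration): the Jingulu/Igbo pattern of a single "light verb" plus a noun looks a lot like one generic function applied to different pieces of data.

```python
# A loose programming analogy, not from the article itself:
# a Jingulu-style "do + noun" resembles one generic function applied to nouns.

def do(noun):
    """A single 'light verb' that derives an action from a noun."""
    return f"performing {noun}"

# -gbá egwú "do dance", -gbá igwè "do bicycle":
print(do("dance"))    # performing dance
print(do("bicycle"))  # performing bicycle
```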


  1. Redundancy is not necessarily a bad thing. I have a colleague from Serbia who complains that English has entirely too many nouns. "Stove, range, oven, grill, in Serbia we have one word," he explains, "it is like that with everything here." This kind of redundancy allows us to be extremely specific. Other times, redundancy provides a form of error checking. If something exhibits redundancy, it is easier to find outliers or errors and correct them using context clues. When there is less redundancy, this is more difficult. At any rate, I really enjoy your blog.

  2. I agree with you on that, Connor. Redundancy is like "breathing room" and if a system is too tightly designed, it just doesn't have the room to adapt well if at all. However we know that language adapts and adapts quite well. I like to think of language as hovering between order and chaos, not too hot and not too cold, always rebalancing itself after every tiny change to its system.

  3. This is quite interesting and really flies in the face of 50 years of Chomskyian "verbocentrism" in generative syntax.

  4. Does it necessarily though, Ezr? All I know is that if we lay structural linguistics beside category theory, let's say, we see "verbs" being relabeled as "morphisms" and "nouns" as "objects". A morphism is also equivalent to a "function" in a computer program with input and output data as its "objects".

    Now let's consider a "language" completely void of verbs. It's tantamount to an assortment of objects without relations, input and output without linking morphisms. It's therefore a disconnected web (or "non-web", if you will) that can't possibly do its job as a language to convey anything. So it's not a language any more than it would be a valid computer program. A computer program with only functions without input or output is simply useless; so too is a computer program composed of only data without functions working upon it.

    So back to the issue of verbocentrism, my analogies inform me that BOTH nouns (= "objects" = "input/output data") and verbs (= "morphisms" = "functions") are required TOGETHER to form a valid productive system like language, whether a computational language or human one. Mathematics can be said, in a sense, to be another kind of "language", as can quantum physics or the economy because all these things use morphisms on input data to produce derivative data just as language derives new meaning out of prior concepts.

    (I also think that these comparisons show the challenges inherent in sciences that employ overspecialized terminology that manages to obfuscate similar concepts and relationships discovered in other seemingly very different fields. As we see, computation, linguistics and set theory are all one and the same. We need more cross-disciplinary communication and put the uni- back in the term university. But I digress...)
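    To sketch the category-theory analogy above in code (my own illustration; all names are invented): "nouns" are plain data (objects), "verbs" are functions (morphisms), and new meaning arises only when morphisms are composed and applied to objects.

```python
# Sketch of the objects/morphisms analogy; names are purely illustrative.

def compose(f, g):
    """Morphism composition: (f . g)(x) = f(g(x))."""
    return lambda x: f(g(x))

cook = lambda noun: f"cooked {noun}"   # a "verb" (morphism)
eat  = lambda noun: f"ate {noun}"      # another "verb"

eat_cooked = compose(eat, cook)        # a derived morphism
print(eat_cooked("food"))              # ate cooked food

# Data alone ("food") or functions alone (cook, eat) convey nothing;
# only their application and composition yields a new "meaning".
```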

  5. I agree with much of what you say, but the issue here is different, and in fact simpler, I think.
    Many if not most traditional syntactic models, including Chomskyian generativism, are intensely "verb-centric", by which I mean that verbs are considered paramount in some specific way.
    For example, in the Minimalist program, if you merge "eat" and "food" as "eat food", you get a VP where the verb will always be assumed to be the main element - on the basis of distributional info but not the "semantic naturalness" that you seem to suggest.
    I'm curious how an orthodox Chomskyian would deal with a language such as Jingulu; I'm guessing they would try to analyze, say, "do-food" as a full-fledged verb on some abstract level, with the "food" element subordinate to it, a la Hale & Keyser.

  6. "I agree with much of what you say, but the issue here is different, and in fact simpler, I think."

    While you oppose a verb-centric position, what is this "simpler" position of your own exactly? I suspect you may only believe my view is complicated due to a lack of familiarity with information theory and its relationship to structural linguistics.

    Reiterating my position, category theory suggests that language is a non-centralized web, not a Chomskyian hierarchy, where BOTH noun-like entities (= objects) AND verb-like entities (= morphisms) must equally hold a prominent place in this delicately balanced system. How simple do you want it?

    With nouns and verbs being mutually symbiotic features, we rid ourselves of the extra assumptions of an either-or or one-more-than-the-other kind of relationship.

    For that matter, I believe we can go further and recognize that each object (= noun) can be used as a morphism (= verb) and vice versa. Thus the noun-verb duality is an illusion as well, similar in a loose way to particle-wave duality in physics.

    We are then left with generalized "particles" or "atoms" with two possible roles: stative (as an object or noun-like morpheme) or active (as a morphism or verb-like morpheme).

    So in Jingulu, -gba simply fulfills the function of a morphism that, as suggested by my explanation, is necessary to all languages, as are the objects on which these morphisms operate. A verb-centric view is only seeing half of the picture.
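    The duality claim above has a direct parallel in programming with first-class functions; here is a small sketch of my own (the names are invented for illustration): a "verb" can serve as a "noun" when a function is stored as data, and a "noun" can serve as a "verb" when data is made callable.

```python
# Illustrating the noun-verb "duality" in programming terms (my own sketch).

def shout(word):            # a "verb" (morphism)
    return word.upper() + "!"

toolbox = [shout]           # the verb used statively, stored as data

class Dance:                # a "noun" (object)...
    def __call__(self):     # ...given an active, verb-like role
        return "dancing"

dance = Dance()
print(toolbox[0]("go"))     # GO!
print(dance())              # dancing
```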

  7. In contrast to languages like Jingulu, some of the highly polysynthetic languages of the Americas (such as the Algonquian family) are very "verb-y". In the Algonquian case, I have heard that most of the noun lexemes in those languages (such as Ojibwe and Cree) are derived from verbal roots. Note that I don't know whether that is actually true; it's something I read in several posts on a message board by a linguist of Native American background who studied the Algonquian languages.

    It makes sense that even in a verb-y language one can still contrast "object" lexemes and "morphism" lexemes, something the linguist I mentioned above rejects, to my annoyance.

  8. "It makes sense that even in a verb-y language one can still contrast 'object' lexemes and 'morphism' lexemes"

    What point are you responding to and to whom?

  9. We should remember that Mandarin zài can be interpreted both as a verb (eg. wǒ zài "I'm here") and a preposition (eg. zài túshūguǎn "at the library"). Dào is simultaneously a noun meaning "path", a verb meaning "to go towards", and a preposition meaning "towards". The distinction we make between these possible interpretations, based on strict word categories imposed upon the same word, often depends on the context of the message and the language we're translating into. There's nothing stopping us from using word classes from other languages like English to interpret Mandarin and other languages, but things start to unravel if we expect too much from that.

    Interestingly enough, it looks like others are having trouble with the universality of the noun/verb distinction, such as Baker, who in On Gerunds and the Theory of Categories openly asks "Can a category be both nominal and verbal?" And the answer to that would be "YES!!" Why? It can only be because the noun/verb dichotomy is false and not truly fundamental to the computational structure of language.

  10. "What point are you responding to and to whom?"

    I was criticizing the linguist I mentioned in my post. He, a poster at the ZompistBB site at the time, claimed in a thread about polysynthetic languages that Proto-Algonquian did not even have "noun-like things" as a concept and expressed everything as a kind of process (a "verb-like thing"). I was skeptical of his claim, but I didn't know how to refute it.

  11. Perhaps he's thinking of "nouny" verbs in Cree, as in noocih-acaskw-ii-w 'He hunts muskrats'. In Word formation in Plains Cree: Root incorporation and noun incorporation (2011), Johnson sketches out the grammatical structure of this formation as nicely quantized entities in an ordered tree. Further "movement" is then required by the author's rules to get the head to the beginning of the construction.