About Abhijit Mahabal
Abhijit Mahabal has studied Mathematics, Computer Science, and Cognitive Science, and earned his Ph.D. in the latter two. Abhijit spent a decade at Google as a researcher, working on different types of algorithms related to natural language. Natural language is anything like English or Swedish – what people use to communicate with each other, as contrasted with programming languages. He recently started a new position at Pinterest, looking at identifying linguistic structures to make it easier for users to find what they are looking for.
Can you explain a bit further what you actually did at Google?
– I started on Google search and worked on algorithms for identifying contextual synonyms, which means identifying when some word, such as “gm”, is similar in meaning to some other word or phrase. When we are talking about chess, gm means Grand Master; but when we’re talking about crops, “gm” is more likely to mean “genetically modified”. Telling the two apart is important to find the best pages that a user is looking for.
– For the last five years, I worked on automatic discovery of concepts by looking at regularities in the use of language. These so-called “distributional” approaches can identify concepts based on similarity in how they are used. For instance, the twelve star signs, with Aries, Taurus, and so forth, behave similarly in how people use them in sentences: people talk of the planet Venus entering Aries, but also about Venus entering Taurus; of someone being born under Libra, but also of someone being born under Gemini. These regularities allow the identification of the star signs as a category, although naming this cluster automatically is much harder.
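The distributional idea described above can be illustrated with a minimal sketch. The toy corpus, the function names, and the choice of cosine similarity over position-tagged neighbouring words are all assumptions made for illustration; this is not the system used at Google, which would work on vastly more text.

```python
from collections import Counter, defaultdict
from math import sqrt

# Toy corpus, invented for illustration only.
SENTENCES = [
    "venus is entering aries",
    "venus is entering taurus",
    "she was born under libra",
    "he was born under gemini",
    "she was born under aries",
    "he was born under taurus",
    "the police arrested him",
    "the fbi arrested him",
]

def context_vectors(sentences, window=2):
    """Map each word to counts of (neighbour, relative position) pairs."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.split()
        for i, word in enumerate(tokens):
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            for j in range(start, end):
                if j != i:
                    vectors[word][(tokens[j], j - i)] += 1
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[key] for key, count in a.items() if key in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

vectors = context_vectors(SENTENCES)
print(cosine(vectors["aries"], vectors["taurus"]))  # star sign vs. star sign
print(cosine(vectors["aries"], vectors["police"]))  # star sign vs. law enforcer
```

Words that appear in the same kinds of contexts, such as two star signs, score higher than unrelated pairs. Grouping words by such scores yields a cluster, but nothing in the sketch supplies a name for it, which matches the difficulty noted above.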
– These techniques can also be used to discover higher-level structures. For instance, if you look at the nouns present in sentences where the verb is “to arrest”, several of the categories from above show up: a category of law enforcers (police, FBI, sheriff, etc.), laws (Terrorism Prevention Act, etc.), crimes (burglary, arson, theft, and so forth). These associated categories are called slots, and together they form the structure associated with the notion of arresting. By looking at where these categories are present relative to the verb, we see that the subject, who is doing the arresting, is a law enforcement agency and that the crime is often attached to the verb via the preposition “for” – that is, we see “arrest for burglary”. All this can be discovered by statistical analysis of regularities in language use. Knowing such higher-level structures can set up expectations that help us understand. In the sentence “The fuzz arrested Mac”, even if we don’t know what “fuzz” means, we might guess that it refers to law enforcement; in this case, that would be a correct guess since “fuzz” is a slang term for the police.
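The slot idea can be sketched in a few lines. Everything here is a simplifying assumption for illustration: the hand-written category lexicon stands in for categories that would be discovered automatically, the sentences are invented, and "position" is reduced to before the verb, directly after "for", or elsewhere after the verb.

```python
from collections import Counter

# Hypothetical category lexicon; a real system would learn these clusters.
CATEGORY = {
    "police": "law_enforcer", "fbi": "law_enforcer", "sheriff": "law_enforcer",
    "burglary": "crime", "arson": "crime", "theft": "crime",
}

# Invented example sentences.
ARREST_SENTENCES = [
    "the police arrested mac for burglary",
    "the fbi arrested two men for arson",
    "the sheriff arrested her for theft",
]

def position(tokens, i, verb_index):
    """Coarse position of token i relative to the verb."""
    if i < verb_index:
        return "before_verb"
    if tokens[i - 1] == "for":
        return "after_for"
    return "after_verb"

def slot_counts(sentences, verb="arrested"):
    """Count how often each known category fills each position."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.split()
        if verb not in tokens:
            continue
        v = tokens.index(verb)
        for i, token in enumerate(tokens):
            if i != v and token in CATEGORY:
                counts[(position(tokens, i, v), CATEGORY[token])] += 1
    return counts

def guess_category(word, sentence, counts, verb="arrested"):
    """Guess an unknown word's category from the slot it occupies."""
    tokens = sentence.split()
    pos = position(tokens, tokens.index(word), tokens.index(verb))
    candidates = {cat: n for (p, cat), n in counts.items() if p == pos}
    return max(candidates, key=candidates.get) if candidates else None

counts = slot_counts(ARREST_SENTENCES)
print(guess_category("fuzz", "the fuzz arrested mac", counts))
```

Because law enforcers dominate the position before "arrested" in the toy data, the unknown word "fuzz" is guessed to be one too, mirroring the expectation-setting described above.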
Why is it important for AI to learn natural language? What is your vision with the work you do?
– Perhaps I should answer a broader question: why is natural language needed for intelligence, whether machine or human? The philosopher Andy Clark explains the ideas well in his paper “Magic Words”. He begins by saying that of course words are not magic, but neither are sextants, compasses, or microscopes, which nonetheless enable us to do things we could not do otherwise. Words and grammar enable us to create complex thoughts, build on prior thoughts, and store thoughts externally as notes or as entire books that survive centuries. Indeed, there is a lot of evidence that language plays a role in human intelligence and in controlling and directing our attention and behavior. If you watch a kid learning something new by trying things, you notice that they talk to themselves about what they are doing; that seems to help with gaining mastery. New domains of knowledge invariably bring new specialized vocabulary.
– Finally, most psychological therapies in the twentieth century – from Freud on to Cognitive Behavioral Therapy – have been “talk therapies”, using conversation to change thinking and behavior. In CBT, this also involves changing how you describe a situation to yourself. Human intelligence, at least, is deeply wedded to language. There is every reason, I think, to believe that language will play a similar role in machine intelligence. Additionally, on the importance of language to AI: if we are to communicate with such an intelligence, language is the most economical means of doing so; such communication, I suspect, is a far simpler problem than building the linguistic machinery needed to have thoughts.
– My goal, toward which I hope to make some tiny progress in the next five decades, optimistically speaking, is to gain a deeper understanding of how language can create meaning and to give computers some ability to do this. This must include understanding not just the language called English but understanding how “temporary words” and “linguistic microcosms” form. When a group of people meet regularly for a long time, the language they use among themselves changes a bit; insider jokes, slang, and other small changes can develop. When I was an undergrad at the Indian Institute of Technology, there were five campuses across the country, and I had friends in each. Each had developed a specialized vocabulary partly influenced by the city it was located in, but also quite different from its surroundings, and likely unintelligible to outsiders. There are currently no good ways to study such variations computationally. Most algorithms today are very data hungry, data from microcosms is very limited, and there are methodological challenges. I don’t think anyone disputes the existence of such small linguistic communities, but the subjectivity involved in studying them, as with anything that relies on introspection, makes such research a hard sell.
What are the problems you face with teaching AI natural languages?
– Human language is complex. What does “I am not myself today” mean? Or the description of the Wild West as “out where the men are men”? We use words flexibly, and a single word may have many overlapping aspects. “Men” refers to the gender, but can also refer to particular kinds of masculinity, and the two occurrences of “men” in that phrase pick out different aspects. The sentence does not tell us which is which; it is up to the listener to figure that out, based on what makes sense.
– The everyday word “mother” is a canonical example used to illustrate gradations of sense. Mothers play many roles in our lives: giving us half of their genes, giving us their mitochondrial genes, carrying us for nine months, breastfeeding, nurturing, putting band-aids on skinned knees, and so forth. The same person typically carries out these roles, and this person is the mother. But for various reasons, these roles may fall on different individuals. A subset of these roles is fulfilled by surrogate mothers, wet nurses, biological mothers, foster mothers, and so forth. That list was enlarged in this century because now the “regular” genes and mitochondrial DNA can come from separate mothers. Furthermore, even den mothers in Girl Scouts and the Mother Superior play some of these roles. All these senses are overlapping and typically appear as one unified concept.
– Words you hear don’t come with the information of which aspect is being talked about, and the process of identifying or even inventing the meaning can be seen clearly in action when a noun is used as a verb. Consider the noun skin. Now consider the verb to skin. Perhaps it just
– Relatedly, many of our metaphors make use of the bodies we reside in. Feeling hot under the collar (anger) or getting cold feet (fear) arise from the physical sensations that sometimes accompany these emotions. Getting computers to guess novel uses along these lines is going to be very hard, and may need embodied systems.
How is language taught to AI?
– Computers are not yet capable of understanding language in its richness, but even understanding little bits of it has practical value. Search engines are a great example where a little bit of language understanding helps a lot. Given a search query, it is not enough for Google to find web pages containing exactly those terms. Consider the three queries [Coca Cola gm], [chess gm], and [gm crops]. The best pages for these queries may not use the term “gm”; maybe they use “general manager”, “grand master”, or “genetically modified”, and understanding the word enough to look for the right thing in the right context is a big win. Note that Google does not really understand what “genetically modified” means. It just knows enough about which other words the phrase shows up with, and where, and statistical analysis helps make the right connection.
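A toy version of this context-dependent expansion is sketched below. The context word sets are hand-written assumptions standing in for co-occurrence statistics that a real system would learn from large amounts of text, and the overlap score is chosen only for simplicity; it is not a description of how Google search works.

```python
# Hypothetical context words for each expansion of "gm"; in practice these
# associations would come from statistical analysis of a large corpus.
EXPANSION_CONTEXTS = {
    "general manager": {"coca", "cola", "company", "executive", "team"},
    "grand master": {"chess", "tournament", "rating", "opening"},
    "genetically modified": {"crops", "food", "seeds", "corn"},
}

def expand(query, ambiguous="gm"):
    """Pick the expansion whose context words overlap most with the query."""
    context = set(query.lower().split()) - {ambiguous}
    scores = {
        expansion: len(context & words)
        for expansion, words in EXPANSION_CONTEXTS.items()
    }
    best = max(scores, key=scores.get)
    # With no supporting context, decline to guess.
    return best if scores[best] > 0 else None

for query in ["Coca Cola gm", "chess gm", "gm crops"]:
    print(query, "->", expand(query))
```

The sketch never represents what any expansion means; it only counts which words keep company with which, which is the point made above.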
– Another useful simpler subtask is sentiment analysis. It is much easier to classify a review as positive or negative than it is to understand the nuanced picture it paints. But this simpler task has practical utility. These subproblems are now tackled by deep learning methods. If shown many examples annotated as positive or negative reviews, in addition to a large corpus of unannotated documents, modern systems can do a fair job of guessing the overall sentiment of a new review. They can still get tripped up by irony, exaggeration, fake reviews, and such, and it is very difficult to extract finer-grained information about individual aspects, but a somewhat useful solution can be put together relatively easily.
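To show how little is needed for a rough sentiment classifier, here is a minimal sketch that swaps the deep learning methods mentioned above for a much simpler technique, a bag-of-words Naive Bayes classifier with add-one smoothing. The four training reviews and all names are invented for illustration, and no unannotated corpus is used.

```python
from collections import Counter
from math import log

# Invented, annotated training reviews.
TRAIN = [
    ("great food and friendly staff", "pos"),
    ("loved the great service", "pos"),
    ("terrible food and rude staff", "neg"),
    ("awful service never again", "neg"),
]

def train(examples):
    """Collect per-label word counts, per-label document counts, and the vocabulary."""
    word_counts = {}
    doc_counts = Counter()
    for text, label in examples:
        doc_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.split())
    vocab = {word for counts in word_counts.values() for word in counts}
    return word_counts, doc_counts, vocab

def classify(text, model):
    """Return the label with the highest smoothed log-probability."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label, counts in word_counts.items():
        total_words = sum(counts.values())
        score = log(doc_counts[label] / total_docs)
        for word in text.split():
            if word in vocab:  # words never seen in training are ignored
                score += log((counts[word] + 1) / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

model = train(TRAIN)
print(classify("great friendly service", model))
print(classify("rude and awful", model))
```

The weakness described above shows up immediately: a sarcastic "great just great" is labeled positive, because the classifier sees only word counts and has no notion of irony.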
– With more computing power, larger and larger models of statistical regularities are being created and made available by Google and other big companies, and so it is easier than ever before to experiment with these techniques. For more complex tasks such as the limited “conversation” that Siri and Google Home can engage in, under the hood one finds a large number of rules carefully curated by linguists and helped along by statistical methods such as those mentioned above.
How many languages do you speak, and has it helped you in your work to speak several languages?
– I speak Marathi, Hindi, and English, and studied Sanskrit. Marathi is my mother tongue, and I learned the others later in life. This has been immensely helpful since I made a ton of errors in speaking and understanding. When I was in kindergarten, I did not really speak Hindi but was of course exposed to Bollywood songs. I have since lived in Hindi-speaking areas and my wife is a Hindi speaker. It frequently happens to me that I am re-listening to a Hindi song after many years and I now suddenly understand what it means and how I had completely misunderstood it.
– In my experience, when you misunderstand a word, you have almost certainly misunderstood some surrounding word as well; otherwise the meaning would have been too jarring. There is an aptly titled paper called “To Err Is Human; To Study Error-Making Is Cognitive Science”. Speaking multiple languages helped me err a lot.