09 Jul 2017
What is: natural language processing (NLP)?
Will computers be able to really understand humans in the future?
Imagine a world in which machines are capable of translating text into another language without losing any of that text’s original meaning. A world where there is a way to take a seemingly endless research paper and have a machine summarise all of its key points without actually having to read it in full.
Spoiler alert: those are just a few examples of the many ways that natural language processing (or NLP) has already improved our lives.
What is: natural language processing?
NLP combines the power of computer science, artificial intelligence (AI) and computational linguistics in a way that allows computers to understand natural human language and, in some cases, even replicate it.
Computers communicate in different languages than humans do.
Just think of all of the programming languages out there: FORTRAN, Java, C++, Python… the list goes on. These languages may look like an incomprehensible series of symbols, letters and numbers to the untrained eye, but when combined into the appropriate lines of code (sometimes hundreds of thousands of lines at that!), a computer is able to read the code, process it and complete the designated function or operation.
NLP is a little different because the computer isn’t processing these traditional programming languages. Instead, the computer is designed to comprehend written or spoken human language such as English, Russian, Japanese or Italian.
The ultimate goal of NLP is to analyse human language, ‘understand’ it and then derive a coherent meaning in a way that is beneficial for the user. In ideal scenarios, NLP completes a task faster and far more efficiently and effectively than any human can.
But getting there has been a challenge – until very recently.
NLP is all around us
NLP applications are nearly everywhere these days, even if most of us aren’t aware of them yet. Some of the most common tasks include translations, summarisations, sentiment extractions and topic segmentation, among others.
Let us explain. Here are some prominent examples that you may not have known were NLP:
Language translation
One of the most obvious uses of NLP is in language translation. Have you ever used the ‘See Translation’ link on Facebook or the globe icon in Twitter? Have you ever popped a word or phrase into Google Translate? Years ago you may have received laughable translations at best (okay, the Facebook ones are still pretty bad), but today translations have improved thanks to advances in NLP.
Grammar correction
Word processing programmes such as Microsoft Word and apps like Grammarly can correct sentence structure, punctuation and wonky grammar. They also flag spelling mistakes. This is especially useful with homophones – words that sound the same but are spelt differently, like ‘there’, ‘their’ and ‘they’re’. These grammar tools help us produce polished written work by bringing errors or discrepancies to our attention for further human correction.
Handwriting and voice recognition
You’ve probably seen handwriting recognition in handheld devices, such as smartphones and tablets, and their ability to convert handwritten notes into text. The function is popular with college students, journalists and business executives, and even has the ability to transform your scribbles into standardised letters and numbers.
Voice recognition allows users to perform commands on an electronic device without having to physically push buttons or pull switches. This can include simple tasks, such as turning on and off your kitchen lights or music player, or more complex requests, like when speaking to Apple’s Siri, Amazon’s Alexa or Microsoft’s Cortana.
Conversational Agent (CA)
Also known as chatbots, some CAs have pre-programmed responses while others are designed to learn over time as customer interaction increases. The user’s questions are analysed for keywords, which are then compared to the keywords in question/answer series in the existing database. The chatbot’s response is then provided based on a pattern-matching score algorithm that retrieves responses with the most relevant keyword.
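The pattern-matching approach described above can be sketched in a few lines of Python. The question/answer database, keywords and replies below are all invented for illustration – real chatbots draw on far larger databases and richer scoring.

```python
import re

# Toy question/answer database; topics, keywords and replies are invented examples.
qa_keywords = {
    "hours": {"opening", "hours", "open", "close"},
    "password": {"reset", "password", "login", "forgot"},
    "order": {"order", "delivery", "shipping", "track"},
}
replies = {
    "hours": "We are open 9am to 5pm, Monday to Friday.",
    "password": "You can reset your password from the login page.",
    "order": "You can track your order from your account page.",
}

def respond(user_question):
    words = set(re.findall(r"[a-z']+", user_question.lower()))
    # Pattern-matching score: how many of each entry's keywords
    # appear in the user's question.
    scores = {topic: len(words & kws) for topic, kws in qa_keywords.items()}
    best = max(scores, key=scores.get)
    if scores[best] == 0:
        return "Sorry, I didn't catch that. Could you rephrase?"
    return replies[best]

print(respond("I forgot my password, how do I reset it?"))
```

Pre-programmed chatbots work much like this; the learning kind replace the fixed keyword sets with models trained on past conversations.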
Spam filters
By using keyword searching and text classification, an email account can be set up to separate the unwanted spam from email items that are actually important. Certain phrases or keywords are automatically flagged, prompting an immediate route to the spam or trash folder.
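A keyword-flagging filter of this kind is simple to sketch; the flagged phrases below are invented examples, not a real filter list.

```python
# Invented spam phrases for illustration only.
SPAM_PHRASES = ("free money", "act now", "you have won", "no obligation")

def route(email_body):
    text = email_body.lower()
    # Any flagged phrase sends the email straight to the spam folder.
    if any(phrase in text for phrase in SPAM_PHRASES):
        return "spam"
    return "inbox"

print(route("Act now! You have won free money!"))          # routed to spam
print(route("Minutes from Tuesday's meeting attached."))   # routed to inbox
```

Real spam filters go well beyond fixed phrase lists – most use statistical text classifiers trained on messages users have marked as junk – but the routing principle is the same.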
Automatic summarisation
This function takes lengthy articles and creates a reduced version containing only the most important information. The catch is that the new version is written in the computer’s own words through natural language generation, not simply copied and pasted from the original text.
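Generating a summary in the computer’s own words (abstractive summarisation) is beyond a short sketch, but its simpler cousin – extractive summarisation, which scores sentences by word frequency and keeps the top ones – gives a feel for the mechanics. The helper below is a hypothetical illustration, not the technique the paragraph describes.

```python
import re
from collections import Counter

def extractive_summary(text, n_sentences=1):
    # Split into sentences, then score each sentence by the corpus
    # frequency of the words it contains.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    freq = Counter(re.findall(r"[a-z]+", text.lower()))
    def score(sentence):
        return sum(freq[w] for w in re.findall(r"[a-z]+", sentence.lower()))
    # Keep the highest-scoring sentences, in their original order.
    top = sorted(sentences, key=score, reverse=True)[:n_sentences]
    return " ".join(s for s in sentences if s in top)

article = ("NLP systems read text. NLP systems can summarise text. "
           "Summaries help busy readers.")
print(extractive_summary(article))  # NLP systems can summarise text.
```

Frequency scoring favours sentences full of the document’s most repeated words – a crude but surprisingly useful proxy for importance.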
Sentiment analysis
Brands very often use this to gauge customer reaction on social media platforms such as Facebook and Twitter. The technique tags a post or tweet as positive, neutral or negative in order to capture a customer’s opinion, feeling or belief, as well as to track trending topics and hashtags. Sentiment analysis is also at the core of Phrasee Pheelings™, the world’s first deep sentiment analysis technology for optimising email marketing language.
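A simple lexicon-based tagger gives the flavour of the technique; the word lists below are illustrative stand-ins, not Phrasee’s actual technology.

```python
# Illustrative sentiment lexicons -- real systems use far richer models.
POSITIVE = {"love", "great", "thrilled", "wonderful", "happy", "best"}
NEGATIVE = {"hate", "awful", "terrible", "disappointed", "worst", "broken"}

def tag_sentiment(post):
    words = post.lower().replace(",", " ").replace(".", " ").split()
    # Net score: positive word count minus negative word count.
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(tag_sentiment("I love this brand, great customer service"))  # positive
print(tag_sentiment("Worst delivery ever, totally disappointed"))  # negative
```

Counting lexicon words is the crudest possible version of the idea; as the next section shows, it falls apart the moment language stops being literal.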
How does NLP work?
As mentioned before, typical computer processing operations read a string of text and then come to a literal, logical conclusion. That rational component is also present in natural human language.
However, in order for a machine to understand our language – which is something that up until recently has been an entirely human capability – it must be able to ‘think’ beyond the literal meanings of the symbols and ‘understand’ the information within the correct context and tone.
That’s because while human language is logical, it is also emotional and ambiguous. The latter two can really throw a wrench into the computer’s processing capabilities, so to speak.
In order to tackle context, feelings and semantics, a computer must understand words beyond their meanings, parts of speech, and relationship or function to one another in a sentence.
A computer must then go further and progress to comprehending phrases and longer sentences, eventually understanding the ideas or messages within an entire text or article.
For example, although there are certain words that can be labelled by a computer as emotional ‘feeling’ words, sarcasm inherently falls on deaf (computer) ears.
‘After the bride found out her flower girl felt feverish and the centrepieces had not yet arrived at the wedding venue, she was positively thrilled to learn that a thunderstorm was coming her way.’
In normal circumstances, ‘thrilled’ would be tagged as an extremely positive word, prompting an automatic computer response of, ‘That’s wonderful to hear!’ or ‘Thank goodness!’, but a human would know better. Even without subtle clues like tone of voice or body language, a human would understand that the bride was anything but thrilled.
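The failure is easy to demonstrate with a hypothetical word-level scan: the machine spots the ‘positive’ word and notices nothing amiss.

```python
# A naive word-level scan: it spots 'thrilled' and nothing else.
POSITIVE_WORDS = {"thrilled", "wonderful", "delighted"}

sentence = ("After the bride found out her flower girl felt feverish and the "
            "centrepieces had not yet arrived at the wedding venue, she was "
            "positively thrilled to learn that a thunderstorm was coming her way.")

hits = [w for w in sentence.lower().split() if w.strip(".,'") in POSITIVE_WORDS]
print(hits)  # the machine sees one 'positive' word and misses the sarcasm
```

Nothing in the word list tells the machine that feverish flower girls, missing centrepieces and thunderstorms turn ‘thrilled’ on its head – that takes context, which is exactly what sarcasm exploits.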
The ambiguity of language – its irregular syntax and multiple possible meanings – is another component that makes natural language processing a hard task for computers. Note the following:
‘The police shot the rioters with guns.’
Did the police use their guns to shoot the rioters? Or were the rioters with guns (as opposed to the rioters without guns) shot by the police? And if so, shot with what specifically – gigantic water hoses?
‘Man helps snake bite victim.’
As for this second example, did the man help the snake sink its teeth into victims? Or did the man help the people who were bitten by the snake? It is (hopefully) the latter, and only a human would come to that conclusion as quickly.
Although we are still far off from flawless interaction between humans and computers, we are well on our way to achieving that goal with NLP.