In discussions around Artificial Intelligence (AI), machine learning, and big data, you might have also heard the term Natural Language Processing (NLP) come up a few times. Like other aspects of data science, NLP has been around for many years but has become increasingly important for the everyday consumer.
One of the most famous examples of AI is Siri, which leverages NLP. In fact, you can see NLP in action in Siri’s ability to understand voice commands and to seemingly have a “personality.”
While Scraawl does not have a friendly AI voice, our data analytics platform does rely on NLP frameworks for many of its analytics. But first…
Natural Language Processing defined
NLP is a computer science discipline that focuses on the analysis and understanding of human language. It is a part of AI and overlaps with machine learning. If you think about regular conversations between people, you may think of all the non-verbal cues that go into interpreting what another person says. It could be the slight tilt of an eyebrow to convey doubt, or the lower tone and slowed pace used to signal sarcasm. Now imagine taking all these cues away and still trying to understand what the other person truly meant. It’s hard. And that is one of the things NLP seeks to do.
Here are a few good resources for digital marketers:
- Natural Language Processing 101
- What Marketers Must Know about Natural Language Processing
- How can NLP Technology be Used for Marketing
And here is a concise technical overview:
(Embedded tweet from Scraawl, @Scraawl, January 23, 2018)
Got it? Okay, onward!
Natural Language Processing applied
The most immediate example of NLP lives right in your email inbox. To separate spam from legitimate mail (often called “ham”), email platforms use text classification to label incoming messages.
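To make the spam-or-ham idea concrete, here is a minimal sketch of a text classifier in Python. It uses a naive Bayes model, which is one common approach and not necessarily what any particular email platform runs. The training messages and function names are made up for illustration.

```python
import math
from collections import Counter

# Tiny hand-made training set (hypothetical examples, not real data).
TRAIN = [
    ("win a free prize now", "spam"),
    ("free money click now", "spam"),
    ("claim your free prize", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday with the team", "ham"),
    ("please review the project agenda", "ham"),
]

def train(examples):
    """Count words per label and messages per label."""
    word_counts = {}
    doc_counts = Counter()
    for text, label in examples:
        doc_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.lower().split())
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, doc_counts, vocab

def classify(text, model):
    """Return the label with the highest naive Bayes log-probability."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label in doc_counts:
        # Start from how common the label is overall.
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            # Add-one smoothing so an unseen word does not zero out the score.
            score += math.log(
                (word_counts[label][word] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label

model = train(TRAIN)
print(classify("free prize inside", model))
print(classify("agenda for the team meeting", model))
```

With only six training messages this is a toy, but the shape is the same at scale: count which words show up under each label, then pick the label that best explains a new message.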
You may also be familiar with part-of-speech tagging, another process made scalable for big data through natural language processing. With part-of-speech tagging you can begin to build lexicon libraries that you can match new data against to, for example, predict the demographic of a user online based on their Google searches.
NLP can also be found in most sentiment analysis tools. While a spam or not-spam system is an example of binary classification, sentiment analysis is often a multi-class problem, with categories such as positive, negative, and neither/neutral.
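Here is a rough sketch of how tagging can feed a lexicon. The tag dictionary below is hand-written and hypothetical (a real tagger would be trained on annotated text), and the function names are ours, chosen just for this example.

```python
# Hypothetical, hand-written tag dictionary used only for illustration.
TAG_LEXICON = {
    "the": "DET", "a": "DET",
    "cat": "NOUN", "dog": "NOUN", "stock": "NOUN", "price": "NOUN",
    "chases": "VERB", "rises": "VERB", "buys": "VERB",
    "quickly": "ADV", "fluffy": "ADJ",
}

def tag(sentence):
    """Tag each token by dictionary lookup; unknown words get 'UNK'."""
    return [(tok, TAG_LEXICON.get(tok, "UNK")) for tok in sentence.lower().split()]

def build_lexicon(sentences, wanted_tag):
    """Collect every word that received the wanted tag across the sentences."""
    return {tok for s in sentences for tok, t in tag(s) if t == wanted_tag}

def match(text, lexicon):
    """Return the tokens of new text that appear in the lexicon, in order."""
    return [tok for tok in text.lower().split() if tok in lexicon]

nouns = build_lexicon(
    ["The fluffy cat chases the dog", "The stock price rises quickly"],
    "NOUN",
)
print(sorted(nouns))
print(match("cat stock news", nouns))
```

The lexicon here is just the set of nouns seen so far. Matching new text against it is a simple membership check, which is what makes the approach cheap to run over large volumes of data.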
Here is a great video on how AI and NLP can impact the future of content marketing:
Natural Language Processing in Scraawl
Social media posts are composed of many different elements, including images and videos, but the body of text is often where we look first. To truly analyze social media conversations spanning hundreds of thousands of posts across platforms like Twitter, Facebook, and Instagram, it was important to leverage the power of natural language processing.
Sentiment analysis, topic modeling, named entities, and text summarization (in news feed searches) are all examples of NLP within Scraawl.
One of our most popular analytics is Sentiment Analysis for Twitter searches. Elaborating on the earlier example of multi-class systems, you can see how NLP-based Sentiment Analysis is applied to social media data through Scraawl’s analytic:
Scraawl Sentiment Analysis uses a machine learning-based approach that scores each record based on textual and structural features and then classifies them into positive, negative and neutral tweets. Classification accuracy is in the 75-80% range on average and may vary on the context of the conversations. Misclassifications may occur. Top tweets representative of positive and negative sentiments are shown.
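To show the score-then-classify idea in code, here is a deliberately simple sketch. It is not Scraawl’s model: Scraawl uses a machine learning-based approach, while this example swaps in a hand-made word list and a sign check. The word lists and function names are assumptions for illustration only.

```python
# Hypothetical word lists; a trained model would learn its features from data.
POSITIVE = {"love", "great", "cute", "happy", "good"}
NEGATIVE = {"hate", "bad", "awful", "sad", "terrible"}

def sentiment_score(text):
    """Count positive words minus negative words in the text."""
    tokens = [t.strip(".,!?") for t in text.lower().split()]
    return sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)

def classify_sentiment(text):
    """Map the score onto three classes: positive, negative, or neutral."""
    score = sentiment_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

for tweet in [
    "I love this cute cat!",
    "$CAT shares closed at the same price",
    "What an awful, terrible day",
]:
    print(tweet, "->", classify_sentiment(tweet))
```

Notice that the financial tweet lands in neutral simply because none of its words carry sentiment in our list, which mirrors why a search full of stock chatter can skew neutral.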
We originally searched for (cat AND has:symbols). We were hoping for cat emojis, but instead got a lot of financial chatter about $CAT (Caterpillar, Inc.) on Twitter. If there were more tweets about the cute animals known as cats, we speculate that the sentiment would have been far more positive and perhaps less neutral.
Here are a few other analytics offered by Scraawl:
Entity extraction is
Scraawl Named Entities uses a lexicon of established entities to identify people, organizations, and locations mentioned within bodies of text.
In the example above, within the hundreds of posts I collected, Scraawl was able to neatly identify and pull out proper noun groups like Caterpillar, Inc and Apple Inc. This is a great way to add context to a collection and see what exactly a body of text is referring to, which in our case turned out to be not fuzzy cats but the stock ticker $CAT.
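A lexicon-based lookup like the one described above can be sketched in a few lines of Python. The lexicon entries, labels, and function name below are made up for the example, and Scraawl’s actual lexicon and matching logic are not shown here.

```python
import re

# Hypothetical lexicon mapping known entity names to entity types.
ENTITY_LEXICON = {
    "caterpillar, inc": "ORGANIZATION",
    "apple inc": "ORGANIZATION",
    "jane doe": "PERSON",
    "chicago": "LOCATION",
}

def extract_entities(text):
    """Find lexicon entries in the text, ignoring case, in order of appearance."""
    found = []
    for name, kind in ENTITY_LEXICON.items():
        # Word boundaries keep a name from matching inside a longer word.
        pattern = r"\b" + re.escape(name) + r"\b"
        for m in re.finditer(pattern, text, flags=re.IGNORECASE):
            found.append((m.start(), m.group(0), kind))
    return [(surface, kind) for _, surface, kind in sorted(found)]

print(extract_entities("Caterpillar, Inc and Apple Inc rallied in Chicago."))
```

The trade-off with a lexicon is that it only finds entities it already knows about, so its coverage depends on how complete the list is.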
Topic modeling leverages machine learning to visualize and cluster words together based on patterns of use. In the example below, we take a look at how it handled the search results.
Scraawl Topic Modeling uses a probabilistic model to discover patterns of word use within and across tweets. The model uses this information to identify up to 8 abstract “topics” that occur in the collection, the probabilistic scores that represent the likelihood of these topics, and up to 15 top words within that topic.
Unlike Named Entities, this analytic does not distinguish proper nouns from common ones (e.g. Caterpillar, Inc versus a caterpillar), but it is useful for seeing how words were commonly grouped together. If you’re interested in seeing what’s trending within a social discussion around a subject, like a new tech product, Topic Modeling is a great way to see that information discretely organized.
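For the curious, here is a small sketch of a probabilistic topic model. It uses an LDA-style model fitted with collapsed Gibbs sampling, which is our assumption for the example (the description above does not name the model Scraawl uses). The documents, parameter values, and function name are also made up, and the demo asks for 2 topics and 3 words instead of 8 and 15 to keep the output short.

```python
import random
from collections import Counter

def fit_topics(docs, num_topics=2, top_n=3, iterations=200,
               alpha=0.1, beta=0.01, seed=0):
    """Fit an LDA-style model by collapsed Gibbs sampling; return top words per topic."""
    rng = random.Random(seed)
    tokenized = [d.lower().split() for d in docs]
    vocab_size = len({w for doc in tokenized for w in doc})

    doc_topic = [[0] * num_topics for _ in tokenized]    # words in doc d given topic k
    topic_word = [Counter() for _ in range(num_topics)]  # times word w was given topic k
    topic_total = [0] * num_topics                       # all words given topic k
    assignments = []

    # Start by giving every word occurrence a random topic.
    for d, doc in enumerate(tokenized):
        row = []
        for word in doc:
            k = rng.randrange(num_topics)
            row.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word] += 1
            topic_total[k] += 1
        assignments.append(row)

    for _ in range(iterations):
        for d, doc in enumerate(tokenized):
            for i, word in enumerate(doc):
                # Remove the current assignment before resampling it.
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][word] -= 1
                topic_total[k] -= 1
                # Weight each topic by how much this document and this word favor it.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][word] + beta)
                    / (topic_total[t] + beta * vocab_size)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][word] += 1
                topic_total[k] += 1

    # Report the most frequent words currently assigned to each topic.
    return [
        [w for w, c in topic_word[t].most_common(top_n) if c > 0]
        for t in range(num_topics)
    ]

docs = [
    "cat stock price rises",
    "stock price falls caterpillar earnings",
    "cute cat sleeps sofa",
    "kitten cat purrs cute",
]
for number, words in enumerate(fit_topics(docs), start=1):
    print("Topic", number, words)
```

Because the sampler is random, the exact word groupings can change with the seed and the number of iterations, especially on a collection this tiny. Real collections with thousands of posts give the model far more evidence to work with.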
Why Should You Care?
You should care about Natural Language Processing because it impacts so much of your everyday life.
Personalized recommendations when you search online, your phone’s voice commands, automatic translation, the sheer joy of not having to look at spam mail anymore: they’re all connected by AI and natural language processing.
Undoubtedly the technology will continue to get better and better as automated machine learning (the machine learning of machine learning) matures.
And for digital marketers, NLP is the wave of the future that’s here now.