Scraawl Introduces Advanced Search and Text Analytics on News Feeds

Finding news is easy enough these days. Simply Google an event or topic and you’ll see hundreds of millions of results within milliseconds. But do you really need millions and millions of results? Instead, Scraawl’s improved news feed aggregation service lets you search and analyze a curated list of verified news outlets and blogs (with new sources added daily). As with other data sources in Scraawl, users can analyze news content using a wide range of analytics related to top sources, words, regions, geo-referenced locations, categories, and languages, as well as advanced analytics related to named entities, media galleries, topic modeling, and geo-spatial analytics.

Let’s get started.

News Aggregation

Thanks to its machine learning-based natural language processing pipeline, Scraawl’s news aggregator service processes and indexes news feeds from thousands of sources: large national news outlets like BBC and CNN, smaller sites like TechCrunch and The Verge, and a wide range of international sources. We aggregate and process news in over 30 languages. These processed and indexed articles are then searchable using either Premium or Premium Advanced Search, which means you can now apply complex search queries that combine search operators and search rules. Advanced Search supports keyword-, news source-, region-, location-, and language-based searches, as well as Boolean logic (AND, OR, and NOT operations) to create targeted searches. In fact, you’ll find the news search operators in Scraawl to be very similar in syntax to Scraawl’s Twitter advanced search operators.
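To make the Boolean logic concrete, here is a minimal Python sketch of how a query such as (“Social media marketing” OR “Digital Marketing”) (lang:en) could be evaluated against a handful of articles. This is not Scraawl’s implementation: the `Article` class, the `contains_phrase` and `matches` helpers, and the sample articles are all hypothetical, and treating the two parenthesized groups as an implicit AND is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Article:
    """Hypothetical article record; not Scraawl's data model."""
    title: str
    body: str
    lang: str  # ISO 639-1 code such as "en"


def contains_phrase(article: Article, phrase: str) -> bool:
    """Case-insensitive exact-phrase match, the role quotes play in a query."""
    text = f"{article.title} {article.body}".lower()
    return phrase.lower() in text


def matches(article: Article) -> bool:
    """Evaluate ("Social media marketing" OR "Digital Marketing") (lang:en).

    Assumption: the two parenthesized groups are combined with an implicit AND.
    """
    keyword_group = (
        contains_phrase(article, "social media marketing")
        or contains_phrase(article, "digital marketing")
    )
    language_rule = article.lang == "en"
    return keyword_group and language_rule


# Made-up articles for illustration only.
articles = [
    Article("Digital marketing trends", "What brands are doing this year.", "en"),
    Article("Tendances", "Le digital marketing en France.", "fr"),
    Article("Marketing on social media", "Tips for small teams.", "en"),
]

hits = [a.title for a in articles if matches(a)]
print(hits)
```

Only the first article is a hit. The second contains the phrase but fails the language rule, and the third has the right words in the wrong order, so the exact-phrase match rejects it.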

Search and Text Analytics

Here is one example of using the News Aggregator service to monitor news relevant to social media marketing:

  1. Running a Premium Advanced Search, I entered the keywords (“Social media marketing” OR “Digital Marketing”) (lang:en) to catch up on the latest English-language news in digital marketing and to get an overview of the industry.
  2. Notice that quotes were used to require an exact string match, and search rules related to region and language were used to limit results to articles from North America and in English.
  3. Once the results have been collected, the Basic Statistics dashboard and the Advanced Analytics tab will populate.
  4. In the Advanced Analytics tab, you’ll find analytics similar to those for the other data sources.
  5. In this example, we decided to take a closer look at Topic Modeling, which visualizes word patterns.

Within the generated analytic, we see five main topic groups. The first and largest bubble shows the terms “media,” “people,” and “social” as some of the biggest concepts currently in digital marketing. These groupings demonstrate that at the heart of digital marketing is the desire to connect to real people.

Interestingly, when you drill down into Topic 1 by clicking on the bubble, you will see the relative influence score of each term. Facebook is ranked higher than Twitter, reflecting a slightly higher relative importance score.

Facebook still ranked higher even though Twitter showed up as a named entity more often, with an entity count of 77 versus 64. The image below is from the Named Entity Recognition analytic:

  6. With News Feeds, as with other data sources, there is also access to Raw Data. This tab is actually a very helpful news aggregator that makes it easy to peruse headlines and lead paragraphs, or to run a further search on these articles to prune the dataset.

For example, I was curious to see how many of these articles also mentioned the term “entrepreneur,” as digital marketing often falls into the realm of entrepreneurs. There were 19 articles that contained the term or were from the news source Entrepreneur. The search even brought up one article that mused on the true nature of the term “entrepreneur,” which I found very helpful in my research.
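The same kind of pruning can be sketched in a few lines of Python for anyone working with their own exported records. Everything here is hypothetical: the `Record` fields, the `mentions_or_from` helper, and the sample rows are made up for illustration and do not reflect Scraawl’s export format or the 19 articles mentioned above.

```python
from typing import TypedDict


class Record(TypedDict):
    """Hypothetical raw-data row; field names are assumptions."""
    source: str
    headline: str
    lead: str


def mentions_or_from(record: Record, term: str) -> bool:
    """True if the term appears in the headline or lead paragraph,
    or if the record's source name equals the term (case-insensitive)."""
    needle = term.lower()
    in_text = (
        needle in record["headline"].lower()
        or needle in record["lead"].lower()
    )
    from_source = record["source"].lower() == needle
    return in_text or from_source


# Made-up rows for illustration only; not data from the search above.
raw_data: list[Record] = [
    {"source": "Entrepreneur", "headline": "Five email tactics", "lead": "A short guide."},
    {"source": "Example Daily", "headline": "What makes an entrepreneur?", "lead": "An essay."},
    {"source": "Example Daily", "headline": "Ad spend rises", "lead": "Budgets grew."},
]

subset = [r for r in raw_data if mentions_or_from(r, "entrepreneur")]
print(len(subset))
```

The first row is kept because of its source name and the second because of its headline, which mirrors the “contained the term or were from the news source” behavior described above.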

Try it Out for Yourself

Scraawl’s news aggregator makes it easy to do big data analysis on bodies of text you would normally have to read and synthesize on your own. If you’re interested in experimenting more with news feeds as a data source, check out the documentation on operators and RSS Advanced Search Rules, or contact us for a demo or a free trial.

Until next time, Scraawl team.

