Project highlights
- Analyze the semantics of natural language
- Get relevant search results
- Classify information efficiently
- Industry: Social Media, eLearning
- Market: Austria
- Cooperation: 2 years
An innovator in natural language processing and text mining solutions, our client develops semantic fingerprinting technology as the foundation for NLP text mining and artificial intelligence software. Based in Vienna and San Francisco, the company addresses the challenges of filtering large amounts of unstructured text data, detecting topics on social media in real time, and searching across millions of documents in multiple languages, along with natural language processing and text mining more broadly. Our client was named a 2016 IDC Innovator in the machine learning-based text analytics market, and CB Insights listed it among the 100 startups using artificial intelligence to transform industries.
Business challenge
Inspired by the latest findings on how the human brain processes language, this Austria-based startup worked out a fundamentally new approach to mining large volumes of text and created the first language-agnostic semantic engine. Powered by hierarchical temporal memory (HTM) algorithms, this text mining software generates semantic fingerprints from any unstructured textual information, promising virtually unlimited text mining use cases and a massive market opportunity.
Our client partnered with us to scale up their development team and bring to life their innovative semantic engine for text mining. Our expertise in REST, Spring, and Java was vital, as our client needed to develop a prototype that was capable of running complex meaning-based filtering, topic detection, and semantic search over huge volumes of unstructured text in real time.
Technology solution
The Intellias team of NLP experts had prior experience developing a text analytics platform in the eLearning industry. Our language-agnostic services for text analytics NLP development let the client process terabytes of text data by encoding the semantics of natural language elements into semantically grounded binary code. The resulting codes are then compared and analyzed using standardized metrics, offering great opportunities for NLP text analytics, text filtering, classification, labeling, and search. These text analytics and NLP services are advantageous for businesses that need to search text repositories in various languages, monitor incoming emails and large databases, or detect trending topics on social media. In this way, the client’s text analytics platform can recognize hidden patterns in user behavior, which can be applied across online education, entertainment, eCommerce, and other fields.
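To make the idea of comparing binary codes with standard metrics concrete, here is a minimal sketch. It is not the client's engine: it assumes each fingerprint is a sparse binary vector stored as the set of its active bit positions, and the function names and sample fingerprints are invented purely for illustration.

```python
from math import sqrt

# Assumption: a fingerprint is a sparse binary vector, represented
# here as a frozenset of the positions of its active (1) bits.

def overlap(a, b):
    """Number of active bits the two fingerprints share."""
    return len(a & b)

def jaccard(a, b):
    """Shared active bits divided by all active bits in either fingerprint."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def cosine(a, b):
    """Cosine similarity of two binary vectors given as sets of active bits."""
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))

# Hypothetical fingerprints; the bit positions are made up for this example.
fp_dog = frozenset({3, 17, 42, 88, 101})
fp_puppy = frozenset({3, 17, 42, 90, 140})
fp_invoice = frozenset({5, 61, 200, 233, 240})

print(overlap(fp_dog, fp_puppy), jaccard(fp_dog, fp_puppy))
print(overlap(fp_dog, fp_invoice), jaccard(fp_dog, fp_invoice))
```

With a set-based representation like this, related texts score higher simply because they share more active bits, and the same comparison works regardless of the language the text was written in.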
We also presented a prototype of text analytics NLP algorithms integrated into KNIME workflows using Java snippet nodes. The prototype is a configurable pipeline that takes unstructured scientific, academic, and educational texts as input and returns structured data as output. Users can specify preprocessing settings and the analyses to be run on an arbitrary number of topics. The output of NLP text analytics can then be visualized graphically on the resulting similarity index.
What did our client get?
- A text mining NLP engine that operates on the level of semantics rather than keywords for filtering, classifying, clustering, and searching text
- A horizontally scalable text analytics platform that’s ready to support exponential growth of the user base
- A semantic fingerprinting technology to offer text analytics and NLP software as a service on the global market
Business outcomes
We helped the client:
- Acquire new investors and raise $2 million in capital for their text mining NLP platform
- Implement RESTful APIs available on the Amazon Web Services marketplace to scale their text analytics NLP solution with customized databases and applications
- Monitor customers’ reactions in real time through an intelligent filter that converts the Twitter firehose into a stream of semantic fingerprints
- Develop text mining NLP algorithms in KNIME nodes to power a prototype of their text analytics platform
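The real-time filter mentioned in the list above could work along the following lines. This is only a sketch under stated assumptions: messages are assumed to arrive already encoded as fingerprints (sets of active bit positions), and `filter_stream`, `min_overlap`, and the sample data are hypothetical names and values, not part of the client's API.

```python
from typing import FrozenSet, Iterable, Iterator, Tuple

Fingerprint = FrozenSet[int]  # assumed representation: positions of active bits

def filter_stream(
    messages: Iterable[Tuple[str, Fingerprint]],
    topic: Fingerprint,
    min_overlap: float = 0.5,
) -> Iterator[Tuple[str, float]]:
    """Yield (message_id, score) for messages that match the topic.

    The score is the fraction of the topic's active bits that also
    appear in the message fingerprint.
    """
    if not topic:
        return
    for message_id, fingerprint in messages:
        score = len(fingerprint & topic) / len(topic)
        if score >= min_overlap:
            yield message_id, score

# Hypothetical topic and message fingerprints for illustration only.
topic = frozenset({1, 2, 3, 4})
stream = [
    ("m1", frozenset({1, 2, 3, 9})),
    ("m2", frozenset({7, 8})),
    ("m3", frozenset({2, 4, 10})),
]

for message_id, score in filter_stream(stream, topic):
    print(message_id, score)
```

Because the function is a generator, it processes one message at a time and never holds the whole stream in memory, which is the property a high-volume feed would need.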