Automated analysis of product features in online reviews – Part 1

Large providers such as Amazon and Google often rely on online reviews in text form, in which users describe their opinion of or experience with a product or service. There are now a great many such reviews, and how a review is structured is entirely up to the user. In my bachelor thesis, I took a closer look at one subarea of the research on analyzing such reviews automatically: Aspect Based Sentiment Analysis. Read more about this in my blog…

As an example, I carried out an automated analysis on a large scale. Aspect Based Sentiment Analysis is about automatically extracting product features (Aspects) mentioned in online reviews and the sentiment towards them.
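To make the task concrete, here is a minimal sketch of what the target output for a single review could look like. The review text, the `AspectOpinion` class, the helper `aspects_with` and the three sentiment labels are all made up for illustration; they are not taken from the thesis data or its annotation scheme.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AspectOpinion:
    aspect: str      # product feature mentioned in the review
    sentiment: str   # assumed label set: "positive", "negative", "neutral"


# Hypothetical review with a hand-written target output.
review = "The battery life is great, but the screen scratches easily."
expected = [
    AspectOpinion(aspect="battery life", sentiment="positive"),
    AspectOpinion(aspect="screen", sentiment="negative"),
]


def aspects_with(sentiment, opinions):
    """Return the aspects that carry the given sentiment."""
    return [o.aspect for o in opinions if o.sentiment == sentiment]


print(aspects_with("negative", expected))
```

A model for this task would have to produce a list like `expected` from the raw review text alone.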

Training of various models

I trained various models (artificial neural networks) for this task using machine learning. A particular focus was on so-called Transformer models, which share a specific deep learning architecture. This architecture was first introduced in 2017 by researchers at Google, including its deep learning division Google Brain, and has become increasingly relevant in the field of natural language processing (NLP) in recent years. In many areas, these models set the state of the art, meaning they are among the most powerful models available. In particular, there are various pre-trained models, i.e. models that have already been trained on a large amount of data and on one or more NLP tasks. To learn a new task, only fine-tuning is then necessary (“transfer learning”). This allows the abstract representations of natural language that the model has already learned to be reused, so the new task is learned more effectively than by training a new neural network from scratch.

Overview of the procedure:

For my work, I used online reviews from the online retailer amazon.com. First, I selected five product categories and manually annotated several hundred reviews from each of them according to the Aspect Based Sentiment Analysis scheme described above. I then trained and evaluated various models on this dataset. The best-performing models were used to automatically label aspects and sentiments in a large corpus (about 400,000 reviews). Finally, I clustered (grouped) the Aspects predicted by the models using different approaches and analyzed the resulting groups.
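The last step, grouping predicted Aspects before analyzing them, can be illustrated with a deliberately simple stand-in. The sketch below groups aspects by a rough string normalization and counts sentiments per group. This is an assumption for illustration only, not one of the clustering approaches used in the thesis, and the predictions are invented.

```python
from collections import Counter, defaultdict


def normalize(aspect):
    """Very rough normalization: lowercase, trim, drop a trailing plural 's'."""
    a = aspect.lower().strip()
    if a.endswith("s") and not a.endswith("ss") and len(a) > 3:
        a = a[:-1]
    return a


def group_aspects(predictions):
    """Group (aspect, sentiment) pairs by normalized aspect and count sentiments."""
    groups = defaultdict(Counter)
    for aspect, sentiment in predictions:
        groups[normalize(aspect)][sentiment] += 1
    return dict(groups)


# Hypothetical model predictions: (aspect, sentiment) pairs.
predictions = [
    ("Screen", "negative"),
    ("screens", "negative"),
    ("screen ", "positive"),
    ("battery", "positive"),
]

for aspect, counts in group_aspects(predictions).items():
    print(aspect, dict(counts))
```

Real clustering would also need to merge different words for the same feature (for example "display" and "screen"), which pure string normalization cannot do.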

The results, and especially the conclusions I was able to draw from them, will be published in two weeks. So stay tuned!

Image by Towfiqu barbhuiya on Unsplash

The Power of Regular Expressions for Clever Sloths

First of all, regular what? Regular expressions are based on a formal language that is used to specify patterns for matching strings or sub-strings in text. To find simple information in a text file, many people still use the search function available in any up-to-date editor. But what if your search scenario is not so simple? What if you are looking not for one specific digit, for example, but for all digits in your text file? Or what if you want to find and remove all tags from your document? Good news! You do not have to be a programmer to do all that stuff. You simply need a wee bit of knowledge about regular expressions and a clear picture of what you are looking for in your text file. Today’s blog will give you an idea about regular expressions, and after reading it you will be ready to write your first regular expression, promise!
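The two scenarios mentioned above, finding all digits and removing all tags, can be sketched in a few lines using Python's built-in `re` module. The sample text is made up, and the tag pattern `<[^>]+>` is a simple assumption that works for plain tags like these; it is not a full HTML parser. The same two patterns can usually be typed into the regex search field of an editor as well.

```python
import re

# Hypothetical sample text containing digits and simple tags.
text = "<p>Order 66 shipped on day 7.</p>"

# \d matches any single digit, so findall returns every digit in the text.
digits = re.findall(r"\d", text)

# <[^>]+> matches "<", then one or more characters that are not ">", then ">".
no_tags = re.sub(r"<[^>]+>", "", text)

print(digits)   # ['6', '6', '7']
print(no_tags)  # Order 66 shipped on day 7.
```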

The perfect translation process for our website

Connected, automated and easy to use - that was our dream, and it is now a reality. In today's blog, we share how we translated our website and what we learned from the project.