The Predictive Analytics World in Berlin is a small but brilliant event. It is probably the best meeting of the predictive analytics community in Germany, with a strong focus on sophisticated presentations and a specialized audience. That makes it a fitting occasion for the first entry in our Predictive Analytics Blog.
First of all, I have to say that the Predictive Analytics World exhausts me every time, in a positive way. So many inspirations and ideas whirl through the rooms at this event that I have to ask for your indulgence: this entry is written under the influence of powerful endorphins.
"From Smart Phones to Smart Places to Smart Profiles"
The highlight of the morning was the presentation by Hendrik Wagenseil and Nina Weigel of GfK on the topic "From Smart Phones to Smart Places to Smart Profiles". They reported on the development of a future GfK product that is intended to provide insight into the geographic movements of target groups that are of interest to marketing. It is, so to speak, a luxury version of passer-by frequency surveys. The basic idea is to combine geo-positioning data from cell phone network operators with marketing profiles. That this has to be done in compliance with data protection laws is only one of the exciting issues that were addressed. Matters were probably simplified by the fact that the presentation dealt primarily with US data, but comparable products would of course also be interesting in Germany.
The methodological ideas on the fusion and quality assurance of the two very different data sources were at least as rewarding. There are multiple potential sources of error, e.g. sampling errors because not everyone carries a cell phone, or the limited precision of localization based solely on cell phone towers. The strategies for minimizing these errors were not off-the-shelf but carefully tailored to the particularities of the data. Sampling errors, for example, were reduced by making skillful use of the demography of the cell phone users' town of residence (which is known from census data).
Time and again, the sensitivity of the data also came up almost in passing, e.g. when it was pointed out that localized entry events with time stamps make it possible to infer the place of residence with a high probability of being correct. The geographic location with frequent nightly entry events is obviously the residence.
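To make the residence heuristic concrete, here is a minimal sketch of the idea, not the method presented by GfK. The event records, the cell identifiers and the night window from 22:00 to 06:00 are all made-up assumptions for illustration. For each device, the sketch counts only night-time events and picks the most frequent cell.

```python
from collections import Counter
from datetime import datetime

# Hypothetical, made-up events: (device_id, cell_id, ISO timestamp).
EVENTS = [
    ("dev1", "cell_A", "2015-11-09T23:40:00"),
    ("dev1", "cell_A", "2015-11-10T02:15:00"),
    ("dev1", "cell_B", "2015-11-10T13:05:00"),
    ("dev1", "cell_B", "2015-11-10T15:30:00"),
    ("dev1", "cell_B", "2015-11-10T16:45:00"),
    ("dev1", "cell_A", "2015-11-11T04:50:00"),
    ("dev2", "cell_C", "2015-11-10T01:00:00"),
    ("dev2", "cell_B", "2015-11-10T10:00:00"),
]


def is_night(ts, start_hour=22, end_hour=6):
    """True if the timestamp falls into the assumed night window."""
    return ts.hour >= start_hour or ts.hour < end_hour


def infer_home_cells(events):
    """Map each device to the cell it was seen in most often at night."""
    night_counts = {}
    for device, cell, stamp in events:
        if is_night(datetime.fromisoformat(stamp)):
            night_counts.setdefault(device, Counter())[cell] += 1
    # Devices without any night-time event are left out of the result.
    return {
        device: counts.most_common(1)[0][0]
        for device, counts in night_counts.items()
    }


home_cells = infer_home_cells(EVENTS)
print(home_cells)
```

In this toy data, dev1 is seen just as often in cell_B as in cell_A overall, but only cell_A shows up at night. That so little code is enough is exactly why the data is so sensitive.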
Professor van der Aalst of TU Eindhoven set a very different emphasis in his presentation on process mining. In our consulting projects, we have always attached importance to understanding the processes that generate the data we work with. Even so, the maturity of the methods and software tools introduced by Professor van der Aalst surprised and inspired me. The direct consequence for me is that these tools are being added to our data science toolkit. I will certainly report on our initial experiences in this blog soon. My conclusion from this presentation was that there is a specialized community doing great work here that is largely unknown in both the data science and the business intelligence worlds.
The methods developed by this community allow the underlying processes to be reconstructed automatically from log data, and the level of detail can be chosen freely. These actual processes, including all unofficial shortcuts, errors and peculiarities, can then be used as a basis either for comparison with the "official" processes or for developing such processes in the first place. They are also a solid basis for forecasts: a valuable addition to predictive analytics that cannot be replaced by any of the standard statistical or machine learning methods, but that combines wonderfully with them. The field of application of these methods is huge and is not limited to the industrial processes that may come to mind first. Customer service processes in particular are a rewarding field of application, and the methods have also proven themselves for processes in hospitals, for example.
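As a rough illustration of what reconstructing a process from log data can mean, the following sketch builds a directly-follows graph, one of the basic building blocks of process discovery, and compares it with an "official" process. It is a minimal sketch and not one of the tools shown in the presentation. The event log, the activity names and the official transitions are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical event log: (case_id, activity, position in time).
LOG = [
    ("c1", "receive ticket", 1), ("c1", "classify", 2),
    ("c1", "resolve", 3), ("c1", "close", 4),
    ("c2", "receive ticket", 1), ("c2", "resolve", 2), ("c2", "close", 3),
    ("c3", "receive ticket", 1), ("c3", "classify", 2),
    ("c3", "resolve", 3), ("c3", "close", 4),
]

# Hypothetical "official" process, given as its allowed transitions.
OFFICIAL = {
    ("receive ticket", "classify"),
    ("classify", "resolve"),
    ("resolve", "close"),
}


def directly_follows(log):
    """Count how often one activity directly follows another within a case."""
    traces = defaultdict(list)
    for case, activity, _ in sorted(log, key=lambda row: (row[0], row[2])):
        traces[case].append(activity)
    counts = Counter()
    for activities in traces.values():
        # Pair each activity with its direct successor in the same case.
        counts.update(zip(activities, activities[1:]))
    return counts


dfg = directly_follows(LOG)

# Transitions that occur in the log but not in the official process.
deviations = {pair: n for pair, n in dfg.items() if pair not in OFFICIAL}
print(deviations)
```

Here the log reveals one unofficial shortcut: case c2 skips the classification step. Dropping transitions below a minimum count would be one simple way to adjust the level of detail.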
In addition to these highlights, there were of course other exciting presentations and intensive discussions during the breaks. The conference continues tomorrow with its second part, and so does this entry…