MarkLogic User Group in Berlin
How to skip ETL processes and rapidly implement smart data projects with MarkLogic

At the MarkLogic User Group in Berlin, NEWBOOKS Solutions uses the example of VLB-TIX to demonstrate how to put very large quantities of disparate data to work.

Many companies in the publishing and media industry struggle to use their entire data pool, because that data is often dispersed across separate business units and data silos. To gain access to their information, companies have to invest an enormous amount of time and money in ETL (extract, transform, load) tools and must force their data into relational databases. The problem with this approach is that customers, business partners and employees expect accurate, up-to-date information in real time and cannot wait months or years for laborious integration processes to be completed. According to Moritz Hodde, CEO of NEWBOOKS Solutions, "Publishers need to be able to use both their own and external data quickly, efficiently and flexibly, because results are required now and in real time, while the challenges posed tomorrow may be very different."

It was against this backdrop that Stefan Schwedt, CIO of NEWBOOKS Solutions and CEO of technology partner iucon, presented the VLB-TIX smart data project at the MarkLogic users meeting on October 13, 2016 in Berlin. He mapped out the best path for companies in the media and publishing sector to follow when it comes to integrating and efficiently using large and disparate quantities of data. In doing so, he drew structural comparisons between traditional, ETL-based relational databases and MarkLogic's non-relational approach, which was an essential component in the successful implementation of VLB-TIX and its adoption as the system of choice for bookstores in the German-speaking world.

Schwedt explained how, with the support of MarkLogic, the development team has been able to achieve and permanently maintain the immense level of processing power required to run VLB-TIX. The data pool combines approximately 2 million merged records from the VLB with the equally large NEWBOOKS database. According to Schwedt, around 200,000 incremental updates per day and 5,000 users, each of whom enters on average 5 different search profiles based on an average of 13 different filter types, amount to 65,000,000,000 comparisons on the database every day. Yes, that’s 65 billion!
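The 65 billion figure can be reproduced with simple arithmetic. The sketch below assumes that the number is the product of daily updates, users, search profiles per user and filter types per profile; the talk summary does not spell out how comparisons are counted, so this multiplicative model and the function name are illustrative assumptions. Note that the roughly 2 million merged records do not enter this product.

```python
# Approximate averages quoted in the talk.
UPDATES_PER_DAY = 200_000
USERS = 5_000
PROFILES_PER_USER = 5
FILTERS_PER_PROFILE = 13


def daily_comparisons(updates: int, users: int, profiles: int, filters: int) -> int:
    """Assumed model: every update is checked against every filter
    of every search profile of every user."""
    return updates * users * profiles * filters


total = daily_comparisons(
    UPDATES_PER_DAY, USERS, PROFILES_PER_USER, FILTERS_PER_PROFILE
)
print(f"{total:,} comparisons per day")  # 65,000,000,000 comparisons per day
```

Under this model the load grows linearly with each factor, so doubling the number of users or the number of daily updates would double the number of comparisons.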

Software architecture of VLB-TIX: comparison between SQL and the actual NoSQL architecture

Given today’s technology, and taking into account developments already in the pipeline, such a volume of data can only be handled using a NoSQL database. Furthermore, the diverse and often last-minute demands submitted by industry players could only be implemented because NEWBOOKS found in MarkLogic, the only company to have an operational and transactional Enterprise NoSQL database, a reliable and optimal alternative to decades-old relational database systems. MarkLogic offered the developers the necessary flexibility and scalability and, above all, what can genuinely be described as “agile application development” in terms of concepts and requirements. Supported by tried and tested features such as top-level security certification and high availability, MarkLogic is the state-of-the-art database system for smart data projects.