Knowledge sharing is key.

A gentle introduction to blockchain

2023-03-19 | Paolo Léonard

5 minutes read

A gentle introduction to blockchain, focusing on the proof-of-work algorithm.

Quantitatively measuring speech quality and training a text-to-speech model for Flemish Dutch

2023-03-05 | Silke Plessers

14 minutes read

Recently, Microsoft released VALL-E, a revolutionary new language model for text-to-speech (TTS) designed to significantly outperform other state-of-the-art zero-shot TTS models in terms of both speech naturalness and speaker similarity. VALL-E requires nothing more than a 3-second speech recording from a previously unseen speaker to synthesize high-quality speech. Unfortunately, VALL-E is not yet available to the public, and since the model is trained solely on English data, it will no

Is edge computing just a buzzword?

2023-02-26 | Stijn Dolphen

6 minutes read

Edge computing is one of the recent buzzwords in Artificial Intelligence and - according to Gartner - it even has the potential to reach mainstream adoption in two to five years, with transformational business benefits as a result. How are these so-called marginal calculations creating additional value at the edges of a network instead of the centralized server location - or even an infinite pool of cloud resources? Let’s find out. Setting the scene. The integration of cutting-edge technologie

Song of the Machines (4): Digital Music Production

2023-02-05 | Arthur Chionh

5 minutes read

4 Dataroots colleagues, no professional music production experience, a heap of Artificial Intelligence (AI)-generated samples of music and lyrics. How did all these end up in the Song of The Machines? In this final instalment of our blogpost series on Beatroots and the 2022 AI Song Contest, we dive into the world of digital music production with AI. Digital Audio Workstations (DAW) Digital Audio Workstations (DAW) are software used for music production. Maybe you’ve heard of DAWs like ‘A

Anomaly detection in images using PatchCore

2023-01-22 | Toon Van Craenendonck

8 minutes read

Anomaly detection typically refers to the task of finding unusual or rare items that deviate significantly from what is considered to be the "normal" majority. In this blogpost, we look at image anomalies using PatchCore. Next to indicating which images are anomalous, PatchCore also identifies the most anomalous pixel regions within each image. One big advantage of PatchCore is that it only requires normal images for training, making it attractive for many use

Rootsacademy project: Fixing a slow AWS Lambda function

2023-01-15 | Nicolas Jankelevitch

7 minutes read

The rootsacademy: some believe it's as unrealistic and imaginary as Hogwarts, but nothing is further from the truth. This mythical academy really exists and turns wild partying scholars into professional consultants who are experts in the magical world of data. After the academy, most employees start working for their first client. For those who haven't found a match with a client yet (or for those whose project will only start in a couple of weeks' time, like myself) there is the rootsacademy pro

Running Power

2023-01-10 | Thibauld Braet

9 minutes read

At dataroots there are quite a few sporty people around, and we are not averse to the occasional impulsive challenge. These are traits we absolutely share with Bobby and Seppe, the hosts of the Jogclub podcast. Impulsiveness certainly has its charm, but a growing number of wearables has made sport ever more data-driven. So we thought it would be fun to team up with the Jogclub guys and check our gut feeling against these sensors in the run-up to the Jogclub Ultratrail.

Create your own Christmas miracle with AI generated art

2022-12-22 | Sophie De Coppel

10 minutes read

Christmas is around the corner and you are still missing some cool Christmas cards? Well, I've got just the thing for you! Don’t let artist's block control you and start creating with the help of AI. This last year has been mind-blowing with the rise of recent AI art generators like DALL-E, Midjourney and their open-source nephew, Stable Diffusion

Setting up AWS Infrastructure Using Terraform for Beginners

2022-12-18 | Jinfu Chen, Baudouin Martelée

5 minutes read

After one month of training at dataroots, some starters work on the internal project. The project of the Rootsacademy 2022 Q3 class consists of making an end-to-end solution for inferring information from traffic images. It goes without saying that this end-to-end solution requires infrastructure. In this post, we go through the infrastructure along with some tips and tricks to deploy AWS infrastructure using Terraform. Our aim is to explain things at a high level, such that you, the reader, can

Song of the machines (3) : Generating lyrics with musical context.

2022-12-11 | Sander Van Grunderbeeck

5 minutes read

In this episode of “Can 4 Dataroots colleagues without music production experience write hit songs with AI?” you discover how the Beatroots team fine-tuned a transformer model to generate musical lyrics for their hit song. Lyric generation powered by Beatroots AI No hit song without meaningful lyrics to go with it, right? This statement motivated the Beatroots team to explore the use of AI for lyric generation.

Tokyo Drift : detecting drift in images with NannyML and Whylogs

2022-12-04 | Warre Dreesen, Martial Van den Broeck

9 minutes read

Detecting drift in your data is very important when deploying models in production. It ensures that the performance of your model does not decrease due to the nature of the input data changing. There are a lot of tools out there for monitoring your data and detecting drift, such as Great Expectations, NannyML, etc. However, most of these are made for tabular data. In this blogpost we will discuss different approaches for detecting drift in images using popular tools. Creating drift using data au

Is the role of Chief Data Officer still hot or not?

2022-11-23 | Kevin Missoorten

8 minutes read

The role of Chief Data Officer has evolved a lot since it was first introduced twenty years ago. Given the paradigm shift to the Data Mesh, some experts argue that the CDO’s responsibilities might once again change drastically, leading to the discussion of whether the role is still necessary or not. In 2002, Catherine Doss stepped up as Chief Data Officer of Capital One. Back then, the position of a CDO and the challenges it came with were very different from how it is viewed today. Since I have been a

Real-Time Voice Cloning - tutorial

2022-11-10 | Virginie Marelli

2 minutes read

This tutorial demonstrates how a simple voice transfer app can be created using Streamlit. The code for this demo is based on the repository for Real-Time-Voice-Cloning. This app allows you to: * Record your voice * Visualize the embedding of the speaker * Synthesize speech based on the recorded voice Setup 1. Install Requirements Python 3.6 or 3.7 is needed * Create your virtual environment (e.g. pipenv

Hyper parameter tuning with Optuna - tutorial

2022-10-27 | Hans Tierens

2 minutes read

You all know that datarootsians are excellent data athletes. Olympic athletes train with weights; we evolved past that mere display of physical strength and started training the weights. In this way, we Machine Learning Engineers train our models to achieve optimal performance on any task given to us. That’s how we shine! In this tutorial, we explain how state-of-the-art hyperparameter tuning techniques work and when to apply them, using the Optuna library. The code GitHub - datarootsio/tutorial-hyperpa

MLOps - tutorial

2022-10-13 | Vitale Sparacello, Murilo Cunha, Bram Vandendriessche

2 minutes read

"MLOps is a set of practices that aims to deploy and maintain machine learning models in production reliably and efficiently." - wiki At dataroots we've been pioneers of the MLOps methodology since the very beginning. For us MLOps means being able to identify all the business challenges, deliver the best solution quickly and efficiently, and monitor the project's evolution over time. To promote MLOps best practices we have run a workshop for KU Leuven uni

Songs of the machines (2) - Harmonisation

2022-10-10 | Zoë Van Noppen

4 minutes read

This blog post is part of a series of content in which we uncover how we wrote the song “Song of the Machines” with which we participated in the AI Song Contest 2022. AI songcontest As shared in the previous blog post, our vision for this year's song contest was to use AI tools as “creative partners in crime, music-making”. Rather than letting one model create the entire song, we would keep a human in the loop (for a limited part) of the songwriting model. However, in the spirit of the “AI Son

Snowflake + Snowpark Python = machine learning?

2022-10-03 | Murilo Cunha

9 minutes read

Snowflake announced in June 2022 that they are offering Python support with Snowpark! 🎉 What does that mean, you ask? Well, that means that now we can do all sorts of things with Python in the Snowflake ecosystem, even some machine learning 🦾. "How?!?!", you ask? The short answer is: UDFs and stored procedures. The long answer is, as you could've guessed, a bit longer. What is Snowflake? If you're new to Snowf

Face Mask Detection - tutorial

2022-09-29 | Toon Van Craenendonck

7 minutes read

Face masks are crucial in minimizing the propagation of Covid-19, and are highly recommended or even obligatory in many situations. In this project, we develop a pipeline to detect unmasked faces in images. This can, for example, be used to alert people that do not wear a mask when entering a building. We recorded a YouTube video to explain the general pipeline of this project. Our pipeline consists of three steps: 1. We detect all human faces in an image 2. W

AI a catalyst for innovation

2022-09-26 | Virginie Marelli

4 minutes read

With the energy crisis, it is even clearer that we need to work on reducing our carbon footprint and make better use of the planet's resources. Earlier this year, I wrote an article about the impact that AI has on energy and how much we need smart implementations of algorithms. I discussed the race for big models and the negative impact they can have on energy consumption and thus carbon emissions. I also listed a couple of solutions from the software and hardware perspective that are active area o

The imagery revolution and how to create a logo with DALL-E

2022-09-18 | Virginie Marelli

11 minutes read

Joining the bandwagon Recently, text-to-image models have been sprouting all around the internet. It started with the release of DALL-E 2, a model created by OpenAI. > DALL·E 2 is a new AI system that can create realistic images and art from a description in natural language Many other similar models were released quickly after DALL-E. It went so fast that I'm not even sure which models were released when. The most famous ones are probably DALL-E, obviously, Midjourney, a model developed by Mid

Great Expectations - tutorial

2022-09-15 | Paolo Léonard

3 minutes read

A brief tutorial for using Great Expectations, a Python tool providing batteries-included data validation. It includes tooling for testing, profiling and documenting your data and integrates with many backends such as pandas dataframes, Apache Spark, SQL databases, data warehousing solutions such as Snowflake, and cloud storage offerings (S3, Azure Blob Storage, GCS). This tutorial covers the main concepts you'll need to know to use Great Expectations, gently walk

The price of healthy eating

2022-09-12 | Thibauld Braet

5 minutes read

It’s March 2020 and suddenly we’re all forced to spend time at home and because restaurants and bars are closed, all together we rediscover our kitchen and our passion for banana bread. We’re forced to learn how to cook and are eager to make healthy lifestyle changes by cooking healthy and exercising more. Fast forward to February 2022 and our passion for banana bread is being threatened by a war causing a sudden increase in the price of flour. What do we have to do now? Are there other alterna

Tremplin IA by Digital Wallonia

2022-09-08 | Virginie Marelli

2 minutes read

Is AI for you? Not sure how to start? Then Tremplin IA is spot on. You have probably already heard a lot about AI and how it gives a competitive advantage to many companies. Yet, you don't really know where to start and how to test whether AI can bring value to your company. Tremplin IA is a program held by Digital Wallonia to help companies start a Proof of Concept (POC) in AI. Dataroots as a member of the AI experts of

Our most popular posts - looking back

2022-09-05 | Virginie Marelli

2 minutes read

Last September, we started to write blog posts regularly. Everyone within Dataroots has been participating in this endeavour. People write about their job, a conference, a new technology or simply their passion. Since September, we have been publishing one post a week without fail, sometimes even 2 posts per week! Creating the content is probably the hardest part: you need to find an interesting topic, have a creative angle, a pinch of opinion and mash this into a st

Statistics Saga 2: Dimensionality Reduction

2022-08-29 | Chiel Mues

10 minutes read

Welcome back! If you remember from last time I told you we would continue with matrix factorisations, more specifically dive into some dimensionality reduction techniques! Hope you're ready! This blogpost will give you a comparison of two specific factorisation techniques that are foundational to the idea of dimensionality reduction: principal component analysis and factor analysis. Some of the techniques a

Network analysis and community detection using Gephi

2022-08-22 | Silke Plessers

10 minutes read

Networks are everywhere around us. Just by reading this blog post, you traveled across the internet following links to land on different web pages. Your social life is defined by relationships with other people who are in turn connected to other people. Nowadays, these friendships are also defined online by social media platforms such as Facebook, Twitter, LinkedIn, etc. Protein interactions, gene interactions, supply chain optimisation, payments and transactions, even the spread of a

To be sentient or not to be sentient?

2022-08-08 | Romain Compagnie

7 minutes read

AI, sentience and Google LaMDA If you've followed the tech-related news recently, you probably heard about Google's latest conversational bot, called LaMDA (Language Model for Dialogue Applications). After a lengthy conversation with the bot, a Google engineer became convinced that the bot they designed was sentient, or conscious in more common language. His claim and the public release of the conversation sparked heated debate among AI experts and enthusiasts. This is the perfect opportunity

Keep posted on our events!

2022-08-08 | Bart Smeets

0 minutes read

Just a short FYI :) As of now we will be listing all our upcoming events over at dataroots.io/events. We already have a calendar leading up to the end of next year and we will be publishing these events soon on our events page. Keep posted 🙌

Federated Learning - a tour of the problem, challenges and opportunities

2022-08-01 | Raul Jimenez Maldonado, Omar Safwat

11 minutes read

The majority of machine learning algorithms are data-hungry: the more data we feed our models, the better they learn about the world’s dynamics. Luckily for us, data is everywhere in today’s world, dispersed over the different locations where it was collected. Examples include the user data collected on a daily basis by our cell phones, medical equipment and practitioners in medical facilities, etc. Conventionally, if we wanted to train a learning model, we would collect the d

Next best action recommendation - part 3: recommending actions using reinforcement learning

2022-07-25 | Silke Plessers, Sandy Moens, Virginie Marelli

9 minutes read

You have heard about reinforcement learning for next best action optimization but don't really know why you would use it over other techniques or how to use it best? In this article, we try to demystify and explain how we used offline reinforcement learning to have a good baseline model for optimizing marketing campaigns. A quick recap As mentioned in the initial post of this series

Leaning in for HuggingFace Spaces

2022-07-25 | Sophie De Coppel, Hans Tierens

6 minutes read

Deploying your Machine Learning model is often the cherry on the cake for a Machine Learning Engineer. After putting a lot of effort into building your model, it is immensely satisfying to be able to send it off on an adventure of its own, hoping it conquers the world. However, the world is vast, so we had better equip our model with a quick and able ride. The question is whether to build our own horse in the cloud or borrow someone else’s horse for the trip. I recently faced this q

Terraforming Snowflake ❄️

2022-07-18 | Lidia-Ana-Maria Baciu

9 minutes read

It should go without saying that data is a critical asset for any organization. As a result, it is important that the platform handling all this data is able to do so with scalability and speed in mind. Enter... 🥁🥁🥁 Snowflake! Snowflake is a cloud platform for data x, where x = . So everything that is data-related basically. Terraform, on the other hand, is an infrastru

Terraform with Azure became even more awesome: filling the gaps in your code with the azapi provider

2022-07-04 | Sam Debruyn

9 minutes read

The cloud is just someone else's computer and to manage that we prefer to use Infrastructure as Code (IaC). dataroots believes that IaC can benefit any team working with cloud resources and most often Terraform is our tool of choice there. As a data & cloud engineer focusing on Microsoft Azure, that is true for me as well. However, there have been a couple of hiccups along the road. We have to talk about the pro

Song of the Machines (1): Sampling musical sections

2022-06-30 | Dorian Van den Heede

6 minutes read

Can 4 Dataroots colleagues without professional music production experience write hit songs with AI? In this blogpost series the Beatroots team members uncover how they wrote their latest song, Song of the Machines, which they submitted for the 2022 AI Song Contest. AI Song Contest The AI Song Contest is an international music competition exploring the use of AI in the songwriting process. We have participated with Beatroots since its inception in 2020. The pre

Weather Nowcasting - Model compression

2022-06-27 | Margaux Gérard, Omar Safwat

5 minutes read

In our previous post, we explained our weather nowcasting project in general terms. Now we dive deep into one of the most important steps in machine learning: model optimization. The need to optimize model size and speed arises whenever the prediction model has to run on an edge device such as a smartphone, surveillance camera or robot. Therefore, the challenge is maintaining a s

🍔 Burgers & Drinks - Get to know Dataroots!

2022-06-21 | Silke Gerets

1 minute read

What: Get to know our dataroots team! Where: Tiensevest 132, Leuven Who: Everyone with a passion for data & AI About this event 🤖You have a passion for data and AI? You also like a good burger as a study break or after finishing your exams? 🎯 We have the perfect event for you! 🍔 On Tuesday June 28th we’ll be organising a walking dinner at dataroots! 👍 The aim of this dinner is to give you an idea of what we do at dataroots and to meet some of our team members. ❗️Spots are limited, so

The Great Industry - Heurisko - take aways

2022-06-20 | Richard Cosemans

4 minutes read

Industry 4.0 marks the fourth industrial revolution. What does that mean for us? Are we part of it, i.e. is the industry ready for big data and A.I.? Time to find out! Dataroots was invited to Heurisko 2022, the annual seminar hosted by Flanders Make, where the most innovative and industry-ready research results and applications are presented. Industry Zero to Industry Hero There have been multiple industrial revolutions in the last 200 years. The first industrial revolution happened around the

Home design: how AI helps you customize your furniture

2022-06-13 | Sophie De Coppel, Hans Tierens

12 minutes read

Do you want to restyle or redesign your interior, but don't want to leave the comfort of your own home? Don't fancy reading through hundreds of interior design albums or going to your local furniture store to try imagining those couches in your own living room? Have you always dreamed of a couch with an extravagant tiger print, but you don't know if it will fit your interior? Well I got a thing for you! During my internship at dataroots, I have built an AI-driven application, SofaStyler, which

Some interesting takeaways from this year's Techorama

2022-06-09 | Sam Debruyn

5 minutes read

Last week was a busy week for fans of the Microsoft technology stack like myself. Microsoft hosted its yearly developer conference, Microsoft Build, announcing lots of exciting updates to new and existing Azure services. In the meantime, the Belgian community of Microsoft technology users gathered in Kinepolis Antwerp for this year's edition of Techorama. First off, let me start by thanking the incredible crew and partners

Weather Nowcasting - deploying a model on edge

2022-06-06 | Margaux Gérard, Lidia-Ana-Maria Baciu, Adrian Gonzalez Carpintero, Omar Safwat

13 minutes read

The research department at Dataroots hosts its RootsAcademy twice a year, an initiation program that prepares its students for their career as consultants in data and AI. After the academy, the consultants take on their first internal project at Dataroots, with the aim of putting the concepts learned through the academy into practice. This March, we have been doing a proof of concept to automate the deployment of a weather nowcasting model on an NVIDIA Jetson Nano. Weather nowcasting is all abou

Arty Farty - AI Song Contest 2021

2022-06-02 | Virginie Marelli

1 minute read

Since 2020, dataroots has participated in the AI Song Contest. Since the contest is around the corner and the team is working hard, we thought we would tease you with our previous entries! For the 2021 submission, Beatroots upped their game and studied the Jukebox model released by OpenAI. This model opened up many opportunities to sample musical audio waves and complete musical ideas in the style of many genres and artists. Beatroots fine-tun

How to develop a business-driven data strategy

2022-05-29 | Ben Mellaerts

8 minutes read

for companies with different operating models If you prefer the video version (with slides), it is available here. Organizations have a business strategy in place to define how they can achieve and maintain a sustainable competitive advantage. However, most organizations don’t yet have a strategy in place on how to extract the right value from data. According to a survey

A gentle introduction to Geometric Deep Learning

2022-05-23 | Vitale Sparacello

9 minutes read

Intro AI has changed our world, intelligent systems are part of our everyday life, and they are disrupting industries in all sectors. Among all the AI disciplines, Deep Learning is the hottest right now. Machine Learning practitioners successfully implemented Deep Neural Networks (DNNs) to solve challenging problems in many scientific fields. Nowadays, cars can see how busy a crossroad is, it’s possible to have pleasant conversations with imaginary

Recipe for a Data Burger

2022-05-16 | Kevin Missoorten

5 minutes read

At dataroots, we like to present our service portfolio by means of a burger. Like a burger, the ‘pièce de résistance’ is the Artificial Intelligence value-chain, with data pipelines transporting & providing quality data from source to model, simple or complex models mashing the data into insights and finally integration of those insights into the day-to-day business processes to put these hard-earned insights to work. Also like a burger, the way to facilitate the efficient consumption of the

Arty Farty - AI Song Contest 2020

2022-05-12 | Virginie Marelli

1 minute read

Since 2020, dataroots has participated in the AI Song Contest. Since the contest is coming soon and the team is working hard, we thought we would tease you with our previous entries! Six dataroots colleagues teamed up with only one mission: generating fully automated songs by clicking a button! The final algorithm generates songs by traversing the shortest distance in MIDI harmonies sampled by Variational Autoencod

Next best action recommendation - part 2: causal inference techniques

2022-05-09 | Silke Plessers, Sandy Moens, Virginie Marelli

11 minutes read

Causal inference is used to determine whether an action on a selected population is effective and by how much. It is extremely useful for evaluating the average treatment effect of a campaign. For this, you need to compare the outcomes of a treatment group and a control group. In this post, we explain techniques that can be used to evaluate an action even when a proper control group does not exist. We will explain how causality can still be inferred and tested and how much we can deduce fr

Statistics Saga 1: Matrix Factorization

2022-05-02 | Chiel Mues

5 minutes read

This blogpost will give you a gentle (re)introduction to the idea of matrix factorization, an enormously useful technique in statistics and machine learning. Matrix Factorization Matrix factorization is a technique to decompose or factorize a matrix into a product of more fundamental matrices. If that sounds a bit confusing, it's analogous to factorizing a number: 48=4×12 or 48=6×8. Of course, a matrix is more complex than a number, so many kinds of factorization are possible. Perhaps the easi

Trends in statistical visualisation

2022-04-25 | Lode Nachtergaele

4 minutes read

Machine learning engineers are at the intersection of programming (computer science), math/statistics/machine learning and domain knowledge/communication. Although a lot of progress has been made in the first two, their advances are constrained by the ability to convey their results to the business owners of a problem. Graphical representation can be of enormous help in conveying complex results. In this blogpost, we discuss the latest trends in the visualisation of statistical results. State-of-the-ar

Next best action recommendation - part 1: measuring the effect of a campaign

2022-04-11 | Silke Plessers, Virginie Marelli, Sandy Moens

10 minutes read

Campaigns, you said? Great, but which one? Multiple ways exist to nudge customers: for instance calling, sending out emails, offering discounts, etc. The channels are varied and the content of the marketing messages is even more diverse. In this article we explain how to optimize a marketing campaign and what to do when you did not implement the ideal strategy but have data that can help you derive important insights. From churn prediction to business value Not so long ago, in a previous post

Non-existent quotes by GPT-3

2022-04-11 | Bart Smeets

3 minutes read

Over the weekend I had the pleasure of talking to Gertrude Poirot Torricelli III, long for GPT-3. Her insightful advice and hopeful views on the future of society and the world at large inspired me to share her musings with the rest of you. She was very open to this idea and I will be sharing her advice in quote form daily during the week of April 11th, 2022. All quotes will be collected down here. 👇 Monday Tuesday Wednesday Thursday Friday That wraps up this non-existent quotes series! ✅

What is architecture?

2022-04-04 | Wim Van Leuven

5 minutes read

As a growing data consultancy boutique, we get more and more questions to review and architect data platforms. While growing, we are also maturing the architecture practice at Dataroots. What is Architecture? We can obviously not discuss architecture without some reflection on the term itself in the context of ICT solutions in general, and data platforms specifically. A topic which immediately proves to be not that easy to grasp. When brainstorming the subject, we easily talked about the respon

Is AI an eco disaster?

2022-03-28 | Virginie Marelli

6 minutes read

You hear more and more that technology in general is not so eco-friendly. What about AI? Is it also not so eco-friendly? What is the impact of developing AI models and how good is AI for the planet? With this article, we try to demystify and understand the impact of AI on the planet and how this could be reduced. What are the resources needed to build an AI? Building AI models requires a lot of resources, especially if you are building models like BERT, GPT, or in general, deep neural net

Open source alert: Rootsstyle

2022-03-21 | Virginie Marelli

1 minute read

You love Matplotlib because it's easy to use and you can generate plots quickly? It's now possible to create those exact same Matplotlib plots in the dataroots theme with Rootsstyle! Rootsstyle works with any visualization tool that builds upon Matplotlib (seaborn, pandas). Check it out!

A light introduction to transformers for NLP

2022-03-21 | Murilo Cunha

5 minutes read

If you have looked into Natural Language Processing (NLP) in the past few years, you have probably heard of transformers. But what are these things? How did they come to be? Why are they so good? How do you use them? A good place to start answering these questions is to look back at what was there before transformers, when we started using neural networks for NLP tasks. Early days One of the first uses of neural networks for NLP came with Recurrent Neural Networks (RNNs). The idea there is to mimic huma

A light introduction to transformers for NLP

Marketing strategy - How to go beyond propensity models

2022-03-16 | Virginie Marelli

6 minutes read

When you start integrating data into your marketing strategy, the first questions that need to be answered are often: who’s going to churn in the next couple of months? To whom should we best sell what product? Does that person need this product? To answer these types of questions one can build a model based on historical data. We look for customers that demonstrated the desired behavior in the past (churn, buying a product, etc) and what they looked like (characteristics and behavior). The assump

Marketing strategy - How to go beyond propensity models

Internships

2022-03-09 | Virginie Marelli

2 minutes read

Want to discover if a career in AI is something for you? Apply for one of our cool internships or propose your own! We are already planning the internships of next year, here is a sneak peek into what it entails 🤖 Looking for an internship? Internships are the perfect way for you to see if you would like to pursue a career in AI and for us to see if there’s a match for a long-term collaboration! There is not enough time in a human life to develop all the cool ideas that we have in mind so it

Internships

How to make AI fair and influence data science projects.

2022-03-08 | Tim Leers

6 minutes read

The problem. Artificial intelligence (AI) is driving the rapid transformation of industries. However, the exponential rate of that transformation is difficult to manage for legislators. Moreover, there is no industry standard to ensure AI is safe and beneficial. New applications are introduced at breakneck speed, oftentimes without sufficient consideration of their potential societal impact. AI promises to enable the scalable automation of almost any decision-making system. In doing so, we amp

How to make AI fair and influence data science projects.

Deep learning model compression

2022-02-28 | Toon Van Craenendonck

4 minutes read

Deep neural networks offer unparalleled performance for many applications, but running inference can be resource-intensive. Model optimization comes in to help here, reducing disk storage, memory usage or compute requirements. This can be useful for deployment on the edge (to run models where it otherwise would not be possible), as well as for the cloud and on-premise (to run models faster, or allow more models to be stored in-memory simultaneously). Moreover, reduced energy requirements of opti

Deep learning model compression

Gender Equality at the Olympics

2022-02-25 | Thibauld Braet

6 minutes read

Last week, the Winter Olympics in Beijing came to an end. For Belgium, this meant a successful edition with one female (Hanne Desmet) and one male (Bart Swings) medal. Belgian medals at the Winter Olympics are pretty rare anyway but the medal of Hanne Desmet was the first Belgian female one since the games of 1948 in Sankt Moritz! At dataroots, we highly value diversity, putting the topic regularly on the agenda to see if everybody thinks we’re on the right track. Over the past decades, the topic ha

Gender Equality at the Olympics

What the Duck?!

2022-02-23 | Bruno Quinart

4 minutes read

Unboxing an embeddable analytical database. DuckDB is a recent addition in the analytical database world. And it takes an interesting approach: it wants to be the SQLite for analytics. DuckDB was developed by Mark Raasveldt and Hannes Mühleisen, two database researchers at the Centrum Wiskunde & Informatica (CWI) in Amsterdam, the Dutch National Research Institute for Mathematics and Computer Science. CWI is not just any research institute. For a few decades now, the team has been pushing the

What the Duck?!

The explainable AI boom: Why is XAI important? And why now?

2022-02-19 | Tim Leers

4 minutes read

As we alluded to in our trends post, the number of researchers, developers and companies that focus on eXplainable AI (XAI) is growing faster each year. 💡 XAI is an umbrella term for methods, algorithms and tools that increase insight into the inner workings of AI. This is in contrast wit

The explainable AI boom: Why is XAI important? And why now?

Marriage problem - a matching theory story

2022-02-14 | Virginie Marelli

4 minutes read

Matching theory (a branch of game theory) is a mathematical framework attempting to describe the formation of mutually beneficial relationships over time. What other topic could we possibly have chosen for Valentine's Day? Actually, this is a very serious and important field of research in economics. And, in 2012, Alvin Roth and Lloyd Shapley were awarded a Nobel Prize

Marriage problem - a matching theory story

Data Quality for Notion Databases 🚀

2022-02-06 | Ricardo Elizondo

5 minutes read

> Notion ➕ Great Expectations = 🚀 If you've ever heard of or used Notion (especially their databases) and Great Expectations, you can already imagine what this is about 😉. If not, find a quick ELI5 below: See our GitHub for more technical details and detailed instructions. 👶 ELI5: Great Expectations > "Great Expectations is a shared, open standard for data quality. It helps data teams eliminate pipeline debt, through data testing, docu

Data Quality for Notion Databases 🚀

Trends in XAI tools & research at NeurIPS 2021

2022-02-04 | Tim Leers

10 minutes read

eXplainable AI or XAI is crucial to ensure stakeholder and public trust, as well as reliability, particularly in high-stakes contexts where AI decisions can impact lives. Open-source contributors, researchers & companies are stepping up their game by providing ever-more ambitious and inventive methods to ensure transparent, interpretable and ultimately, explainable AI. As a consequence, XAI methods are sprouting up like mushrooms, meaning that the decision on which method to use is becoming inc

Trends in XAI tools & research at NeurIPS 2021

Data science and notebooks = databooks: a love story

2022-02-02 | Murilo Cunha

4 minutes read

If you're not new to Python and data science, you've probably heard of Jupyter notebooks. But if you haven't, here's the gist: it's an interactive environment, meaning you can run little bits of code and see the output, store variables in memory, etc. That makes notebooks a good tool for experimentation, reporting and visualizations. And because of that, it's a popular tool of choice for data science in general. And this is why you see a lot of notebooks in places like Kagg

Data science and notebooks = databooks: a love story

What we are excited about for 2022!

2022-01-30 | Virginie Marelli

11 minutes read

Foreword In this post, we have gathered our experts’ views on new developments in AI. However, AI is a broad field and we do not pretend to have a complete understanding of the whole landscape. Our view is necessarily biased by our activities as an AI service provider and our Belgian market presence. Bearing this in mind, we examine different trends that we have spotted in AI across industries, research, tooling and much more. The goal of this article is to get an overview of the landscape and

What we are excited about for 2022!

Publication Alert: Tim Leers

2022-01-28 | Bart Smeets

1 minute read

🙌 An article that our very own Tim Leers co-authored just got released. A snippet of the summary: > Engagement and training of community health workers (CHWs) have demonstrated their value in different conditions. Despite repeat epilepsy trainings of CHWs in Northern Rwanda, the treatment gap remained high. We hypothesized that effectiveness of CHWs on mobilization of patients living with epilepsy (PwE) could be improved using a va

Publication Alert: Tim Leers

DataTrends 2022

2022-01-27 | Virginie Marelli

0 minutes read

Watch our experts share our views on: * What type of data do we currently work with? * How are the AI use cases evolving? * How much time does it take to leverage value from AI/data? * What has been the biggest evolution in infrastructure to support the AI cases? * Where is the market in terms of AI adoption and maturity? * What is the role of the EU citizens, how are they included in AI projects/development?

DataTrends 2022

Open Source is at the heart of the way we work

2022-01-24 | Sam Debruyn

2 minutes read

> Why would the chef give away the recipes for the dishes he is famous for? What does the engineer achieve from sharing his schematics for that new technological marvel? Working open source is like sharing your secrets. These contemplations are often voiced by people outside of, or unfamiliar with, software development. Let’s have a look at a couple of examples of why and where Open Source proves its value in our day-to-day business. Essential ingredient Free Open Source Software (FOSS) is an

Open Source is at the heart of the way we work
