By Sam Debruyn
Dataroots is very active in the data community thanks to its strong culture of internal knowledge sharing, which we also like to extend to the outside world. We do this by organizing meetups, giving talks, writing blog posts, and more. In this short post, we list 6 times 6 ways you can engage with the community of people working with data.
Meetups
We noticed that a lot of meetups started to appear after the pandemic. People apparently like to go out and share knowledge. We are very happy to see this and we try to support this by giving talks here and there. Here are 6 Belgian meetup groups which we think you should follow*:
Data Science Leuven: an overview of data meetups in Belgium without the mother of all Belgian data meetups would be incomplete. By far the biggest community of data (science) enthusiasts in Belgium.
Python User Group Belgium: this group has recently been (re-)launched and is growing fast. It's a great place to learn about Python and its ecosystem. 🐍 Definitely stay for the drinks and discussions afterward.
Data Mineurs: a French-speaking group of data enthusiasts organizing meetups in Charleroi. The talks are very accessible, regardless of your level of expertise. Previous meetups had speakers from various sectors with lots of interesting use cases.
DataBeers Brussels: lightning data talks and beers 🍻, is there a better combo? The (free!) tickets for these meetups sell like hot cakes, 🥞 so be quick!
Belgium dbt Meetup: a meetup group focused on one of the most popular open source tools for data transformation today. The group is quite new, but the first few meetups drew a lot of attention.
Belgium NLP Meetup: Yves organizes a meetup every few months to discuss the developments in the field of NLP. Most meetups gain a lot of traction and we had the honor of hosting the last meetup at our offices!
*: we won't hide the fact that some of our colleagues are actively involved in these groups as organizers which we think is very cool ;)
Conferences
Sometimes you're hungry for more after a meetup. What better than spending a full day (or more) with like-minded people? Here are 6 conferences we think you should attend*:
DataMinds Connect: looking for a purely data-focused conference in Belgium? Look no further. This conference is organized by the DataMinds community and each year has a great line-up of speakers.
Big Data London: this is a free conference in London 🇬🇧 with a great line-up of speakers. Did we mention it's completely free?
Data & AI Summit: every year Databricks organizes this 4-day conference where you can hear the latest from every corner of the data world 🌎.
Data Platform Next Step: as someone who recently gave a talk at this great conference, I can only say you missed out on interesting insights into the Microsoft data stack. Join us next year?
Pyjama Conf: looking for a fully remote experience? Pyjama Conf is a 24-hour, Python-focused conference that was once touted as the “laziest conference EVER” for encouraging participants to attend in their pyjamas while enjoying talks from around the globe.
RootsConf: sorry, this one is only for our colleagues reading this blog. One of the perks of working at dataroots is that we organize our own conferences with lots of interesting talks and workshops about data and AI. There is only one way to attend.
Slack
What if you want to ask some questions or discuss data-related topics from the comfort of your desk (or couch? 💆)? These are 6 Slack communities centered around popular data tools where you'll find like-minded people to talk to:
Apache Airflow Slack: the Slack community of Apache Airflow, probably the most popular data pipeline orchestration tool. More than 30k members from all over the world. Want to talk DAGs? What are you waiting for?
Prefect Slack: the Slack community of Prefect, the new kid on the block in the data pipeline orchestration space. As we think this is one of your best options for data orchestration right now, we recently partnered with Prefect. Will we see you in #introductions?
dbt Slack: the Slack community of dbt, the open source data transformation tool. More than 50k members and over 200 channels about lots of data topics.
Soda Slack: our Belgian friends over at Soda have a Slack community where you can grill test your data. Do you finally want to get rid of those Feb 30th dates in your data? This is the place to be.
Raito Slack: we had data transformations, data pipeline orchestration², and data quality monitoring. Let's add a great tool for data access management to the mix. Another Belgian company as well. 🙌
MotherDuck Slack: our partner, MotherDuck, recently launched their product in beta. MotherDuck brings DuckDB to the cloud and we wrote a blog post about it. Are you intrigued? You can join their Slack channel to discuss with other duckers ;)
Blogs, podcasts, and YouTube
If you want to stay on top of all things data while commuting, you could read blogs, listen to podcasts, or watch YouTube videos. We've got you covered with 6 channels we can recommend:
this one: we cover our own experiences in data & AI and have been doing that for a few years now. Subscribe to our newsletter to stay up to date with our latest posts.
DataTopics: a monthly podcast hosted by our own Kevin Missoorten, featuring interviews with industry leaders in our local data space.
Modern Data Stack: a blog about the modern data stack. What else did you expect? 😄 The blog covers all aspects of a data stack with stories from unknown start-ups to big tech like Spotify.
mehdio: this Belgian (now based in Berlin) YouTuber 📹 covers data engineering topics like Spark, dbt, DuckDB, and more.
DataCamp: the leaders in online data training (with just a 10-minute walk between our HQ and theirs) have a great blog with lots of interesting posts about data science, data engineering, and more.
Towards Data Science: this blog is probably top of mind for every data scientist. This Medium publication has been around for a while and consistently publishes quality content about everything related to data science.
Social Media
Social media platforms are powerhouses when it comes to disseminating information, learning, and networking. They offer immediate access to a global community of data professionals and enthusiasts. Here are 6 social media platforms where the data community is thriving:
Reddit: There are numerous subreddits dedicated to machine learning, data engineering, and related fields. Some popular ones are r/datascience, r/dataengineering, and r/machinelearning. Reddit is known for its topical posts where users can discuss various topics, exchange points of view, and share resources.
Twitter: Twitter is (was?) an excellent platform to follow data experts, organizations, and enthusiasts. Many professionals share their knowledge, articles, and insights in the data domain. Engage with the community by following hashtags like #DataEngineering, #AI, #MachineLearning, or #OpenSource. Also check out Twitter Chats or Spaces.
Mastodon: Mastodon is similar to Twitter but is open-source and decentralized. It’s becoming popular among tech enthusiasts for its ad-free and non-algorithmic timeline. You can follow and engage with professionals in the data field, and you might find some unique perspectives here. The Fosstodon server is definitely worth checking out.
Lobste.rs: Lobste.rs is a technology-focused link aggregation website. It's somewhat similar to Reddit but tends to have a more concentrated audience of developers and tech enthusiasts. You can find articles, discussions, and insights on data-related topics here. It follows an interesting and opinionated approach to moderation and user invites.
Hacker News: Hacker News (news.ycombinator.com) is a popular website for sharing and discussing computer science and entrepreneurship articles. There is a strong community of data professionals and it's a great place to find discussions and news about data science, machine learning, and big data.
KBin: Recently, the data community has been abuzz with discussions surrounding Reddit's API pricing changes. Amidst this turmoil, KBin has emerged as a notable alternative. KBin is an open-source, fediverse-based platform that resembles Reddit in link-aggregation and discussion functionality. Curious about what KBin has to offer? It's worth checking out!
Each of these platforms boasts its own niche microcommunities that can cater to your specific interests within the data field. These microcommunities offer a more tailored and intimate environment for sharing insights, asking questions, and networking with like-minded individuals who share your passion. Additionally, keeping an eye on emerging platforms like KBin and BlueSky is worthwhile. Or go a bit more retro and subscribe to relevant mailing lists (check out the Python and Apache mailing lists) or explore interesting authors on Substack.
Contributing to open source
Learn by doing. By contributing to open source, you not only give back to the community in a very tangible way, but you also learn a lot. By diving into the code of your favorite tools and frameworks, you'll learn all the ins & outs. Bonus: maintainers will review your PRs and give you feedback. Here are 6 open source projects that welcome community contributors by making it very easy to make your first PR:
ruff: open source projects can be an excellent way to learn a new programming language. Are you new to Rust? This Python linter is taking the world by storm and is a great way to get started with Rust.
dbt-core: dbt and its ecosystem of adapters and packages are fully open-source. Most of these repositories have some issues labeled good first issue and well-documented contribution guidelines.
Apache Airflow: one of the biggest Python projects on GitHub. Airflow comes with a ton of integrations and creating your own providers is a great way to contribute. Having tens of existing providers to look at will help you get started.
Airbyte: Airbyte is an up-and-coming EL tool that is fully open-source. The project needs help with building connectors to data sources and destinations. At the moment they are even offering rewards 💰 for contributions.
pandas: the most used data library in Python also has several good first issues to get you started. This ranges from bug fixes and feature implementations to writing documentation.
tf-profile: dataroots publishes a lot of open-source content as well and this is our most recent tool. You can use tf-profile to get a better grasp of your Terraform projects. Feel free to take a look, raise any issues or create pull requests to extend it!
Conclusion
Of course, there are more ways, but this should already get you started. Follow us on our social channels to stay up to date with our activities and events. This is also the place where you can shout at us because we didn't include your favorite meetup, conference, Slack community, or blog in our list. 😅 We're always happy to learn about new initiatives!