Review of DataEngConf (NYC, 2016)

DataEngConf is put together by Hakka Labs, whose mission is to organize, foster and educate data communities. So far, their conferences have been held in San Francisco and New York in 2016, with both cities to be repeated in 2017 along with an expansion to Europe. According to Pete Soderling, founder of Hakka Labs, the leading candidate cities for the European expansion in the summer of 2017 are London and Berlin.

The conference was a two-day affair with dual tracks (Data Engineering and Data Science), held at the New Labs complex in the Brooklyn Navy Yard. I focused exclusively on the Data Engineering track. The conference was billed as a technically focused conference bridging data engineering and data science, and it lived up to its billing: the majority of the talks had significant technical detail. Both days featured opening and closing keynotes for the entire conference.

The highlight of the conference was the presentation by Hilary Mason, titled Data Science: Past, Present and Future. Ms. Mason is a nationally known figure in the data science community. Her message about how far we've come in the fields of data engineering and data science, and where we are headed, was fantastic. She is also a leading proponent of building up and connecting the data scene in NYC. At most conferences the opportunity to interact with presenters is very limited, but DataEngConf did something excellent to remedy this problem: each speaker held "office hours" immediately after their presentation. While this does create time conflicts with other talks, it also allows for deeper questioning and dialogue with the presenters, which was fantastic.

Ms. Mason gave me the following advice on fostering the data professionals community in my city of Madison, Wisconsin, where I have run the BigDataMadison meetup for the past few years. She stressed the need to be inclusive of different types of individuals and to cater to multiple needs. She also shared the idea of scheduling "data drinks", which are invite-only, unstructured meetings held in a social atmosphere to promote sharing and fun around data.

Within the data engineering track there were three types of talks:
1. Technology specific talks: Parquet/Arrow, Kudu, Kafka Streams
2. Use case/implementation talks: Spotify, Buzzfeed, Basho
3. Advice/prognostication: future of Python in data wrangling, career panel, computational social science, using data science for social good

My top three talks were:
Data Science: Past, Present and Future by Hilary Mason
The Evolution of Data Processing At Spotify by Erin Miller
The Future of Column-Oriented Data Processing with Arrow and Parquet by Julien Le Dem

All in all, a very enjoyable experience. Thanks to Hakka Labs for organizing the conference and my employer, Earthling Interactive, for subsidizing the costs of travel and attendance.

Image copyright Designboom

Pitt Fagan

Greetings! I'm passionate about data, specifically the big data and data science ecosystems! It's such an exciting time to be working in these spaces. I run the BigDataMadison meetup where I live.