Published on June 18, 2021

Data Engineering for All

How a shortage in one of its most important roles might be affecting the AI revolution.


·      AI needs to shift to a more data-centric model as a good number of AI projects still have a tough time yielding steady results.

·      It is necessary to build the foundation for AI to succeed at scale on a consistent basis.

·      There is a shortage of Data Engineers in the market.

·      Escola Livre de IA, with I2AI’s support, is launching a new initiative to elucidate the basic concepts of Data Engineering.

In March, Andrew Ng, the influential professor, researcher, and entrepreneur, urged the data community to shift from a model-centric to a data-centric approach to improve AI results. His proposition is that this would be a path toward being more systematic and efficient in delivering real results and actual value for companies investing in Artificial Intelligence. On a personal note, Andrew’s ideas will always have my attention: it was way back in 2016, through his Stanford courses, that I really got in touch with the concepts of Machine Learning, and all the math and statistics that run under the hood of AI models, ultimately becoming a Data Scientist!

Andrew Ng on MLOps: From Model-centric to Data-centric AI

But back to his talk. The main takeaways were:

·      Ever since the “boom” of AI in recent years, the community’s emphasis, along with corporate investment, has been on teaching, developing, and improving ever more sophisticated and complex AI algorithms, with an expanding zoo of neural network architectures, among other models, being developed.

·      While this approach has yielded very significant results, early evidence suggests that improving the quality of the data (having not only Big Data but also Quality Data) may be the next step that truly unlocks AI.

Interestingly, the recent “PWC – AI Predictions for 2021” report, published this year, also shows that companies still have quite a long way to go before they fully reap the benefits of investing in AI, that is, reaching scale in production on a consistent basis.


PWC – AI Predictions 2021


So, could it be that we are still lacking the foundation for the real success of AI projects? Just look at the picture below: Data Scientists (and I include myself in this group) usually work only on the higher layers of the pyramid. Should this really be our focus? Not surprisingly, a look around AI and Data Science sites turns up a growing number of recent articles on this subject:


The Data Science Hierarchy of Needs – Monica Rogati

We Don't Need Data Scientists, We Need Data Engineers

Why You Should Consider Being a Data Engineer Instead of a Data Scientist - KDnuggets


Should You Become a Data Engineer in 2021? - Towards Data Science