Deep Learning Frameworks Using Spark on YARN

Hadoop, Spark & Kafka

Traditional machine learning and feature engineering algorithms are not efficient enough to extract complex and nonlinear patterns hallmarks of big data. Deep learning, on the other hand, helps translate the scale and complexity of the data into solutions like molecular interaction in drug design, the search for subatomic particles and automatic parsing of microscopic images. Co-locating a data processing pipeline with a deep learning framework makes data exploration/algorithm and model evolution much simpler, while streamlining data governance and lineage tracking into a more focused effort. In this talk, we will discuss and compare the different deep learning frameworks on Spark in a distributed mode, ease of integration with the Hadoop ecosystem, and relative comparisons in terms of feature parity.