Bigdata Applications with Microservices & Airflow
Kafka cluster start: create a topic in Kafka that can be used to pass JSON messages. Spring Boot Kafka microservice. Spark jobs as a service: the job reads the Kafka topic, prints the message, and writes it to a parameter file in the Airflow execution directory. Airflow setup and DAG creation through the parameter file.
Today, Machine Learning and Predictive Analytics are embedded in the majority of business operations and have proved to be integral to them. However, it is Artificial Intelligence with the right deep learning framework that amplifies the scale of what can be achieved within those domains.

The machine learning paradigm is continuously evolving. The key is to shift towards developing machine learning models that run on mobile so as to make applications smarter. Delving further, deep learning is what makes solving complex problems possible. Because deep learning is the key to executing tasks of a higher level of sophistication, building and deploying such models successfully proves to be a herculean challenge for data scientists and data engineers across the globe. Today, we have a myriad of frameworks at our disposal that offer a better level of abstraction and simplify difficult programming challenges.

Each framework is built in a different manner for different purposes. Here, we look at some of the top 8 deep learning frameworks to give you a better idea of which framework will be the right fit for your business challenges.

TensorFlow is arguably one of the best deep learning frameworks and has been adopted at scale by several giants such as Airbus, Twitter, IBM and others, mainly due to its highly flexible system architecture. The best-known use case of TensorFlow is Google Translate, coupled with capabilities such as natural language processing, text classification/summarization, speech/image/handwriting recognition, forecasting and tagging.

TensorFlow is available on both desktop and mobile and supports languages such as Python, C++ and R, along with wrapper libraries, to create deep learning models. TensorFlow comes with 2 tools which are widely used –
Caffe is a deep learning framework with interfaces for C, C++, Python and MATLAB as well as the Command Line Interface. It is well known for its speed and transposability and its applicability in modelling Convolutional Neural Networks (CNNs). The biggest benefit of using Caffe's C++ library (which comes with a Python interface) is access to the pre-trained networks in the deep net repository 'Caffe Model Zoo', which can be used immediately. Whether it is modelling CNNs or solving image processing issues, this is the go-to library.

Caffe's biggest USP is speed. It can process over sixty million images a day with a single Nvidia K40 GPU. That's 1 ms/image for inference and 4 ms/image for learning, and more recent library versions are faster still.

Caffe is a popular deep learning framework for vision recognition. However, Caffe does not support fine-granularity network layers like those found in TensorFlow or CNTK. Given its architecture, the overall support for recurrent networks and language modeling is quite poor, and complex layer types have to be implemented in a low-level language.

Popularly known for easy training and for combining popular model types across servers, the Microsoft Cognitive Toolkit (earlier known as CNTK) is an open source framework for training deep learning models. It trains Convolutional Neural Networks efficiently on image, speech and text-based data. Similar to Caffe, it is supported by interfaces such as Python, C++ and the Command Line Interface. Given its efficient use of resources, Reinforcement Learning models or Generative Adversarial Networks (GANs) can be implemented easily using the toolkit. It is reported to provide higher performance and scalability than toolkits like Theano or TensorFlow when operating on multiple machines.
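As a quick sanity check on the Caffe throughput figures quoted earlier, the per-image latencies can be converted to images per day and back. This is only a back-of-the-envelope sketch: the function names are illustrative, and it assumes a single GPU kept busy around the clock with no batching or I/O overhead.

```python
SECONDS_PER_DAY = 24 * 60 * 60
MS_PER_DAY = SECONDS_PER_DAY * 1000


def images_per_day(ms_per_image):
    """Images processed in 24 hours at a fixed per-image latency."""
    return MS_PER_DAY / ms_per_image


def ms_per_image(images_per_day_count):
    """Per-image time budget implied by a daily image count."""
    return MS_PER_DAY / images_per_day_count


# 1 ms/image (inference) and 4 ms/image (learning), as quoted in the text.
inference_per_day = images_per_day(1.0)
learning_per_day = images_per_day(4.0)

# "Over sixty million images" a day implies at most this budget per image.
budget_ms = ms_per_image(60_000_000)

print(f"inference: {inference_per_day:,.0f} images/day")
print(f"learning:  {learning_per_day:,.0f} images/day")
print(f"60M images/day allows {budget_ms:.2f} ms/image")
```

At 1 ms/image, a day holds 86.4 million inferences, which is consistent with the "over sixty million" claim. At 4 ms/image, learning comes to 21.6 million images a day.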
Compared to Caffe, users inventing new complex layer types don't need to implement them in a low-level language, thanks to the fine granularity of the building blocks. The Microsoft Cognitive Toolkit supports both RNN and CNN types of neural models and is thus capable of handling image, handwriting and speech recognition problems. Currently, due to the lack of support for the ARM architecture, its capability on mobile is fairly limited.

PyTorch is a Python version of the Torch framework, released by Facebook in early 2017. It uses dynamic computational graphs, which contribute significantly to analyzing unstructured data. PyTorch has a customised GPU allocator that makes DL models more memory efficient. PyTorch is a simple framework that offers high speed and flexibility.
PyTorch is used by many tech companies such as Twitter, Facebook and Nvidia to train DL models. It is developer-friendly and very efficient. Its main disadvantages are that it is still a comparatively new framework in beta and that it has limited community support. However, with its growing momentum, it has the potential to challenge TensorFlow in the upcoming years.
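To make the idea of a dynamic computational graph concrete, here is a toy sketch in plain Python. It is not PyTorch's implementation or API: the `Value` class and `forward` function are hypothetical names invented for illustration. It only shows the define-by-run principle, where the graph is recorded while ordinary Python code executes, so its shape can change from one call to the next.

```python
class Value:
    """A scalar that records the operations applied to it as they run."""

    def __init__(self, data, parents=(), grad_fns=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents      # Values this one was computed from
        self._grad_fns = grad_fns    # maps upstream gradient to each parent's share

    def __add__(self, other):
        return Value(self.data + other.data, (self, other),
                     (lambda g: g, lambda g: g))

    def __mul__(self, other):
        return Value(self.data * other.data, (self, other),
                     (lambda g: g * other.data, lambda g: g * self.data))

    def backward(self):
        # Topologically order the graph recorded during the forward pass.
        order, seen = [], set()

        def visit(node):
            if id(node) not in seen:
                seen.add(id(node))
                for parent in node._parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0
        # Walk from the output back to the inputs, applying the chain rule.
        for node in reversed(order):
            for parent, grad_fn in zip(node._parents, node._grad_fns):
                parent.grad += grad_fn(node.grad)


def forward(x, w, steps):
    # A plain Python loop decides how many multiply nodes the graph contains.
    y = x
    for _ in range(steps):
        y = y * w
    return y + x


x, w = Value(2.0), Value(3.0)
out = forward(x, w, steps=2)   # computes x * w * w + x
out.backward()
print(out.data, x.grad, w.grad)
```

Calling `forward` with a different `steps` value builds a different graph without any separate compilation step, which is the flexibility the define-by-run approach is valued for.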