Thomas Henson

  • Data Engineering Courses
    • Installing and Configuring Splunk
    • Implementing Neural Networks with TFLearn
    • Hortonworks Getting Started
    • Analyzing Machine Data with Splunk
    • Pig Latin Getting Started Course
    • HDFS Getting Started Course
    • Enterprise Skills in Hortonworks Data Platform
  • Pig Eval Series
  • About
  • Big Data Big Questions

Ultimate List of Tensorflow Resources for Machine Learning Engineers

January 14, 2021 by Thomas Henson Leave a Comment

Post first appeared on the Big Data Beard as Ultimate List of Tensorflow Resources for Machine Learning Engineers
Tensorflow is the most popular deep learning/machine learning framework right now. One of the biggest reasons for Tensorflow's popularity (and my personal favorite) is its portability. A Machine Learning Engineer can create models using Tensorflow on their local machine, then deploy those same models to 100s or 1000s of machines. Another reason for the popularity is that Tensorflow is primarily used with Python. Developers both old and new have been shifting to Python for the last 10 years, which means there is a huge talent pool out there ready to develop in Tensorflow.
The Google Brain team was primarily responsible for the first iterations of Tensorflow (known as DistBelief prior to release). In 2015 Google released Tensorflow to the open source community, and development has only continued at scale. Considering the importance and popularity of Tensorflow, I thought it was a good idea to create a resource list for Tensorflow learning/training/research.

Tensorflow Resources

Course on Tensorflow

Run Tensorflow in 10 Minutes with TFLearn – TFLearn offers machine learning engineers the ability to build Tensorflow neural networks with minimal coding. In this course, Implementing Multi-layer Neural Networks with TFLearn, you’ll learn foundational knowledge and gain the ability to build Tensorflow neural networks. First, you’ll explore how deep learning is used to accelerate artificial intelligence. Next, you’ll discover how to build convolutional neural networks. Finally, you’ll learn how to deploy both deep and generative neural networks. When you’re finished with this course, you’ll have the skills and knowledge of deep learning needed to build the next generation of artificial intelligence.

Research Topics on Tensorflow

Tensorflow – Official site for all things Tensorflow, including downloading and installing. Read through the documentation and getting started guide. For a 15 hour deep dive into Tensorflow, go through the Machine Learning Crash Course. 15 hours sounds like a lot, but break it up into 30 minutes a day for 30 days. After 30 days you’ll have more of an understanding of ML/DL with Tensorflow than most of the competition.
Tensorflow Source Code – At some point in your Tensorflow journey you may want to jump directly into the source code. Tensorflow is an open source project and like most popular open source projects it’s on GitHub.

Hands On Tensorflow Resources

Tensorflow Playground – Interactive neural network inside the browser. It allows you to train on 4 different data sets. You can control features, neurons, learning rate, activation, regularization, etc. One of the easiest things to try is running the same data set through the different activations to see which trains faster.
JavaScript Tensorflow? – At first glance I didn’t realize the potential of having a JavaScript library for Tensorflow. What benefit would come from training models in the browser? After playing around with some of the demos (Pac-Man) on Tensorflow.js I started to understand how this can open doors to better game development, human-computer interaction, and more.
Hands-On Machine Learning with Scikit-Learn & Tensorflow – Shamelessly stole this recommendation from a colleague. Should this be on the list for the Big Data Beard Book Club? I think so!
Docker Tensorflow – Super simple way to get started using Tensorflow. Data Engineers can pull the tensorflow/tensorflow Docker image, then pick CPU or GPU to get started developing with Tensorflow. I’ll say it again: a super simple way to get up and coding with Tensorflow. Go download it right now!
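The Playground experiment above, running the same data through different activations, comes down to a handful of simple functions. Here's a minimal pure-Python sketch of the standard definitions (illustrative math only, not Playground or Tensorflow code):

```python
import math

# The activation functions you can toggle in the Tensorflow Playground:
# each maps a neuron's weighted sum to its output, and the choice changes
# how quickly the network trains.
def relu(z):
    return max(0.0, z)                  # fast, and the most common default

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))   # squashes to (0, 1)

def tanh(z):
    return math.tanh(z)                 # squashes to (-1, 1), zero-centered

z = 0.8  # an example weighted sum flowing into one neuron
for name, fn in [("relu", relu), ("sigmoid", sigmoid), ("tanh", tanh)]:
    print(name, round(fn(z), 4))
```

Running the same input through each function shows why the Playground behaves so differently when you switch activations: each one reshapes the signal the next layer receives.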

Tensorflow Resources Video

Why Tensorflow is Awesome for Machine Learning – Since I created this list I’m definitely going to put my video at the top of the Tensorflow video section. In this video I break down why Tensorflow is a monumental tool for Deep Learning and Machine Learning.
Siraj Raval YouTube – Siraj Raval has a huge following on his YouTube channel, which is all about Machine Learning, Artificial Intelligence, and Deep Learning concepts. Check out his first video on Tensorflow in 5 minutes for a quick high-level overview of Tensorflow. Then watch my favorite Tensorflow video on creating an image classifier, where he trains a model to detect whether a picture is of Darth Vader or not.
What is missing? Do you have a suggestion for a resource that should be added? Make sure to put those suggestions for Tensorflow resources in the comment section below.

Want More Data Engineering Tips?

Sign up for my newsletter so you never miss a post or YouTube episode of Big Data Big Questions, where I answer Data Engineering questions from the community.

Filed Under: Tensorflow Tagged With: Machine Learning, Machine Learning Engineer, Tensorflow

Ultimate Battle Tensorflow vs. Hadoop

October 4, 2019 by Thomas Henson 1 Comment

Tensorflow vs. Hadoop

The Battle for #BigData 

This post has been a long time coming!

Today I talk about the difference between Tensorflow and Hadoop. While Hadoop was built for processing data in a distributed fashion, there are some comparisons with Tensorflow. One is that both originated out of the Google development stack. Another is that both were created to bring insight to data, although they take different approaches to that mission.

Who now is the king of #BigData? To be fair, the comparison is not like for like, but the two are often bound together as if it has to be one or the other. Find my thoughts on Tensorflow vs. Hadoop in the latest episode of Big Data Big Questions.

Transcript – Ultimate Battle Tensorflow vs. Hadoop

Hi folks! Thomas Henson here with thomashenson.com. Today is another episode of Big Data Big Questions. Today’s question is really a conversation that I heard from, actually, my little brother when he was talking about something that he heard at a conference. He brought it to my attention. “Hey, Thomas, you’re involved in big data. I was talking to some folks at a GIS conference around Hadoop and TensorFlow.” He’s like, “One person came up to me and said, ‘Ah! Hadoop’s dead. It’s all TensorFlow now.” I really wanted to take today to really talk about the differences between Hadoop and TensorFlow, and just do a level set for all data engineers out there, all big data developers, or people that are just interested in finding out. “Okay, what’s happening in the marketplace?” Today’s question is going to come in around TensorFlow versus Hadoop and find out all the things that we need to know from a data engineering perspective. Even in the end, we’ll talk about which one’s going to be around in five years. Find out more right after this.

Welcome back. Today, as promised, what we’re going to do is, we’re going to tackle the question around which is better, what’s the differences of TensorFlow versus Hadoop, where does it fit in data analytics, the marketplace, and solving the world’s problems? If you’re watching this channel, and you’re invested in the data analytics community, you know how we feel about it, and we’re passionate about being able to solve problems using data. First thing we’re going to do is break them down, and then at the end, we’re going to talk about some of the differences, where we see the market going, and which one is going to make it in five years. Or, will both? Who knows. First, what is TensorFlow. We’ve talked about it a good bit on this channel, but TensorFlow is a framework to do deep learning. Deep learning is a subset, a branch of machine learning, and it’s all about processing data. The really cool thing about TensorFlow, and the reason TensorFlow and frameworks similar to TensorFlow in the deep learning realm are so awesome is because it gives you the portability to run and analyze your data on your local machine or even spread it out in a distributed environment. It comes with a lot of different algorithms and neural networks that you can use and incorporate into solving problems. One of the cool things about deep learning is just the ability to actually look and analyze more video data or voice recognition, right? Or, if you’re going on Instagram or you’re going on YouTube, and you’re looking for examples on deep learning, chances are somebody’s going to build some kind of video or some kind of photo identification that will help you identify a cat. That’s the classic example that you’ll see, is, “Hey, can we detect a cat by feeding in data, and looking, and analyzing this?” Tensorflow doesn’t use Hadoop, but TensorFlow uses big data. You use these large data sets to train your models that can be used on edge devices.
If you’re even used a drone, or if you’ve ever used a remote control to use natural language processing to change the channel, then you’ve used some portion of deep learning or natural language processing. Not saying it’s TensorFlow, but that’s what TensorFlow, it really does. It’s very popular, developed by Google, open sourced, and housed by Google. A lot of free resources out there, and for data scientists and machine learning engineers, it’s a very, very exciting product to be able to build out and be able to start analyzing your data quicker and in a very popular fashion. Couple together the excitement for deep learning, couple together the ease of use of TensorFlow, and that’s why the market has just been hot for TensorFlow and those other frameworks.

What is Hadoop? Hadoop, it’s all about elephants, right? Hadoop has really been around since, I don’t know, we’re probably in 12 to 13 years of it being open source, but if we think back to what we did from analyzing data that was coming in from the web, think about being able to index the entire web, it’s kind of what Google helped develop that, and Yahoo, and a lot of the other teams from Cloudera and HortonWorks, really helped to push Hadoop into the open source arena. Hadoop is synonymous with saying big data. You can’t say big data without thinking about Hadoop. Hadoop’s been around for a long time. There’s a lot of different components to Hadoop, and even on this channel, whenever we talk about Hadoop, we’re specifically really talking about the ecosystem. The ability to process data, but the ability to also store large amounts of data with HDFS, so the Hadoop distributed file system, there’s a lot of components in there. There are APIs, and there are other tools that help for you to do it, but one of the things that I really like to think about when we talk about Hadoop and why it was so record-breaking, and just really opened the market for big data was just the ability to set up distributed systems and be able to analyze large amounts of data. These large amounts of data would be more in the unstructured data, so think of it not being in a database, but a lot of it would still be text-based. You could go out there, very popular example is going out here, setting up an API to pull in Twitter data, and be able to do sentiment analysis over that. Not so much the deep learning. They’re trying to get into the deep learning area right now, but more of machine learning, using algorithms like singular value decomposition or nearest neighbor, but being able to do that over large sets of data. Large sets of data with multiple machines. Hadoop, been around for a while, more seen as replacing the enterprise data warehouse.
With TensorFlow now on the scene, where does Hadoop fit in, and what’s going on, and what are some of the differences?

Hadoop was written in Java. TensorFlow was written in C++. Both of them have APIs. They give you the ability to, whenever we’re talking about the processing of data, you can do it in Java, you can do it in Python, you can do it in Scala. There’s a lot of different options there from a Hadoop perspective. TensorFlow, too. You can see C++. You can also see it in Python. Python’s one of the more popular ones, I actually did a course using TF Learn and TensorFlow to show that. When we think about the tools, it’s a little bit different. When we think about Hadoop, we’re actually building out a distributed system. Then, we’re using things like maybe Spark. Think of using Spark to be able to analyze that data. We’re going to pull insight from that data back to our sentiment analysis that’s going to say, “Hey, these specific words in here, when we see them, this tweet is unhappy,” or, “This tweet is happy.” Versus TensorFlow, same thing. More of a processing engine, like framework to be able to pull in, analyze the data, and give you insights on whether that image contained a cat or not a cat. You’re starting to see some of the differences. We talked about Python versus Java. Both of them, there’s different APIs that you can start to use those. I’m probably talking right now about saying that I haven’t seen a lot of Java and TensorFlow, but I’m sure somebody has an API or some kind of framework out there that works on it. Another big difference, too, is the way that the processing is done. The Hadoop ecosystem’s really trying to get into it right now, but from a TensorFlow perspective, we’re really seeing it on GPUs, right? Think of being able to use GPUs to process data, 10-20 times faster than what we see on a CPU. Where Hadoop is more CPU-based, the way that we’re solving problems with Hadoop is we’re throwing a lot of CPUs in a distributed model to process the data and then pull it back in. TensorFlow, same thing, distributed networks.
As you start to scale out your data, you really need to distribute those systems, but we’re doing it with GPUs. That’s speeding up the process. Little bit of a difference there, just in the approach, but that’s one of the big key differences. If we’re a data engineer, and we’re evaluating these, where do they come in? Ease of use, Hadoop, you’re building out your distributed system. Really Java-based, so if you have a Java background, it’s really good, but you can get by without it in some areas. It’s really not so much of a comparison with ease of use, but if we’re talking about just being able to stand something up and start messing around with it, it’s going to be a little bit more complicated and harder to do it from a Hadoop perspective with TensorFlow. You can actually look at an NFS file system. You can feed in data from different file systems, where with Hadoop, you’re building that system out, and also building out a file system. You’re building out distributed systems, and you’re building out disaster recovery and some of the other components. It’s harder to do from a Hadoop perspective, but there’s more expertise in it, because you’re actually building out a whole solution set, versus TensorFlow is the processing system that you’re using. The comparison on that perspective is somebody tries to talk to you about that, kind of explain that it’s, these are two different systems, right? When we’re talking about which are we using, that really comes down to it. If you’re looking for a project, and somebody says, “Hey! Should we use TensorFlow here, or Hadoop?” It’s going to be pretty easy to spot those, I think, because when you’re starting to look at them, if you think of Hadoop, think of something that’s replacing or falling in line to the enterprise data warehouse. What are we doing? Do we have massive amounts of data. It could be structured, semi-structured, but you’re wanting to offload, and you’re wanting to run huge analytics over that processing. 
Then, that’s probably going to be a Hadoop perspective. We’re probably building out that system when we think of the traditional enterprise data warehouse. That’s the bucket that we’re going to fall in. If we’re talking about doing some sort of artificial intelligence or doing some things with deep learning, maybe not so much in the machine learning era, you’re going to want to look at TensorFlow. Especially, listen for keywords like, hey, what are we doing from the perspective of images, or video, or voice? Any of those media-rich types of data, then you’re probably going to use TensorFlow, too. If you have machine learning engineers, a data scientist, and you’re trying to do rich media, TensorFlow’s going to be your really popular one. If you have more data analysts, and even your data scientist, but from the perspective of, we’re looking at large amounts of data and wanting to marry it, but we have it in some kind of structure and some kind of standardized system, then Hadoop may be your bucket.

Which one of these is going to be around in five years? I think they’ll both be around, but I will say that the popularity for Hadoop will continue in some degrees, but it’s more continuing to replace that enterprise data warehouse. Think of what you do from a traditional perspective in holding all your company’s information, from that perspective, where we’re seeing more product development, more media-rich things that are being done from an artificial intelligence. We’ll see more TensorFlow there. Will TensorFlow still be the number one deep learning framework in five years? Will deep learning, I can’t answer that here. Would I learn it if I were just starting out as a data engineer? Yeah, definitely. Definitely from the perspective of, I want to learn how to implement it and how to use it. You don’t have to become an expert. We’re not trying to become a data scientist from that perspective, but start looking at some of the frameworks, and building out, going through some of the simple examples that they have, and then heavy use on docker, container, and that whole world of being able to build those out. That’ll help you if you’re really trying to look into, hey, what could be next for data engineers? Or, what’s going on now? What’s cutting edge from that perspective? I hope you enjoyed this video, please, if you have any comments on it, if I missed something, put it in the comments section here below. I’m always happy to carry on the discussion. Until next time, see you again on Big Data Big Questions.

Want More Data Engineering Tips?

Sign up for my newsletter so you never miss a post or YouTube episode of Big Data Big Questions, where I answer Data Engineering questions from the community.

Filed Under: Tensorflow Tagged With: Data Engineering, Hadoop, Tensorflow

What Is A Generative Adversarial Network?

July 18, 2019 by Thomas Henson Leave a Comment


Generative Adversarial Networks

What are deep fakes? How are they generated? On today’s episode of Big Data Big Questions we tackle how Generative Adversarial Networks work. Generative Adversarial Networks, or GANs, pit two neural networks against each other: a generator and a discriminator. Learn about my experience with GANs and how you can build one as well.
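As a rough intuition, here's a toy sketch in plain Python (an analogy only: a real GAN trains both networks with gradient descent, while here the "generator" is a single made-up parameter and the "discriminator" is a hand-written score function):

```python
import random

REAL_MEAN = 5.0  # pretend the "real" training data clusters around 5

def discriminator_score(sample):
    # Stand-in discriminator: higher score means the sample looks more
    # like real data. In a real GAN this is a trained neural network.
    return -abs(sample - REAL_MEAN)

def train_generator(steps=2000):
    # Stand-in generator: one parameter, nudged whenever a nudge makes
    # its fake samples score better with the discriminator.
    gen_mean = 0.0
    best = discriminator_score(random.gauss(gen_mean, 0.05))
    for _ in range(steps):
        candidate = gen_mean + random.uniform(-0.3, 0.3)
        fake = random.gauss(candidate, 0.05)
        score = discriminator_score(fake)
        if score > best:  # this nudge fooled the discriminator better
            gen_mean, best = candidate, score
    return gen_mean

random.seed(7)
print(train_generator())  # drifts toward REAL_MEAN as the loop repeats
```

That back-and-forth (the generator improves, the discriminator pushes back) is exactly the iteration described above, just compressed into one score function.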

Transcript – What Is A Generative Adversarial Network?

This is going to be a cool episode, Mr. Editor. We’re going to talk about a painting that was built by AI or designed by AI that went for over $400,000. Crazy.

Hi folks! Thomas Henson here with thomashenson.com. Today is another episode of Big Data Big Questions. Today, we’re going to talk about Generative Adversarial Neural Networks. We’re going to talk about a painting, so you’ve all probably heard about a painting that was sold for, like, $400,000. It was built, actually, by a Generative Adversarial Network. We’re going to talk about that, explain what that is, and maybe even look at a little bit of code, and tell you how you can learn more about it.

Before we jump in, I definitely want to say, if you have any questions about data engineering, data science, IT, anything, put them in the comment section here below. Reach out to me at thomashenson.com/big-questions. I’ll try my best to answer them and help you out. It’s all about the community and all about having fun. Today, we’re going to have a lot of fun. I’m excited. This is something that I’ve been researching and looking into since, maybe, at least since the first part of 2019, but for sure it’s been a theme for me for a while.

I want to talk about Generative Adversarial Networks, what that is. We think about that from a deep learning perspective. We’ve done some videos. We talk about deep learning, but this is a specific kind, so kind of like convolutional neural networks, this is a little bit different. It still uses the premise of, you have your input layer, you have your hidden layers, and you have your output layer, but there’s a little more complexity to it. It’s been around since 2014. Ian Goodfellow is credited as the creator of that. If you follow Andrew Ng on Twitter, I just saw where he took a role at Facebook. I think it was a competitive thing, and I think Andrew was saying, “Hey, great pickup for Facebook for picking him up,” but you might want to fact check that.

Like I said, that was breaking news here. Generative Adversarial Network. The way that I like to think about that and describe that is, think of it as having two different neural networks that are working. You have your discriminator and you have your generator. What’s going on is your generator is taking data. Think of, we’ve got, let’s say, a whole bunch of images of people. What’s going on is, our generator is going to take that data set and look at it, and it’s going to try to create fake data that looks like real data. Your discriminator is the one that’s sitting there saying, “Hey, wait a minute. That’s real data. This is fake data.” This is real data, that’s fake data. Just continuing on. You keep going through that iteration, until the generator gets so good, he’s able to pass fake data onto the discriminator. For our example, we’re looking at images of people. What you’re trying to do is, you’re trying to generate data of fake people and pass it through as real people. You’re probably like, “Man. How really good is that?”

Check out this website here. These are fake people. These are not real people. These are really good images, and a little bit creepy. I found this, actually, in the last week, and kind of looked at it. Been sharing it internally with some friends and some colleagues, but man. It’s really interesting when you think about it. These people do not exist. There’s no, these people don’t exist on the planet. These were all built by AI or deep learning. It’s pretty cool. Pretty creepy, too.

You’re probably wondering, “That’s pretty cool.” Been around since 2014. I’m researching it. Should I be researching it? I definitely think it’s something that’s going to be out there. There’s a lot of information around it, and a lot of use cases, kind of don’t know where it’s going to go. I can think of it being used for game development. Being able to create worlds. For somebody that’s creating a game that’s going to have multiple, multiple different levels, or even if GIS, you have to create all these landscapes and everything like that. If you can build AI to automate that, if you use a deep learning algorithm that’s going to automate, and build out those worlds, and make them lifelike, how much busy work is that going to save you? Same thing with GIS and in architecture, but also go back to the website we were just looking at, with the fake people. Oh, my gosh! You can use that in media and entertainment. Think about movies. Maybe we don’t even need actors anymore. That’s a little bit scary. For the actors, I don’t know. You still need Thomas Henson and thomashenson.com on YouTube, right?

Really cool. Something I just wanted to share with everybody, and back to what we were talking about in the first part of the show. The first art that was really sold for a big ticket item around AI, over $400,000, and it was a generated image, too. I talk a little bit about it in my implementing TF Learn course, but here’s a code sample, really just showing what’s going on. If you’re looking at it, all this is done in TensorFlow, here, using the abstraction layer of TF Learn. Look here, how we’re creating that generator, and how you’re creating a discriminator. It’s a good bit of code here, but really, this is an example from the TF Learn examples, where you’re actually starting to generate data in here. It’s pretty cool. Pretty awesome to be able to play with if you have Tensorflow installed in your environment. You can actually do an import tflearn and start running this code from the examples here, and start tweaking with it. Really cool.

If you want to learn more, I’d definitely love for you to check it out and tell me all about it. Go through my TF Learn course. Tell me all about it if you like it. You don’t have to, but I just thought sharing Generative Adversarial Networks, I thought that was pretty cool. I think it’s something that everybody should learn. At least know a little bit about it. Now, you know. Hey, important thing. I’ve got my generator. I’ve got my discriminator. My generator is making the data that’s trying to pass as real data to my discriminator.

Boom! You understand a lot. Thanks for tuning in. If you have any questions, put them in the comment section here below, and make sure you subscribe just so you never miss an episode, and get some great education around Big Data Big Questions.

Nobody can! Nobody can generate a fake image of me!

Challenge accepted?

Filed Under: Tensorflow Tagged With: Deep Learning, Neural Networks, Tensorflow

Learning Tensorflow with TFLearn

February 11, 2019 by Thomas Henson Leave a Comment

Recently we have been talking a lot about Deep Learning and Tensorflow. In the last post I walked through how to build neural networks with Tensorflow. Now I want to shift gears to talk about my newest venture into Tensorflow with TFLearn. The lines between deep learning and Hadoop are blurring, and data engineers need to understand the basics of deep learning. TFLearn offers an easy way to learn Tensorflow.

What is TFLearn?

TFLearn is an abstraction framework for Tensorflow. An abstraction framework is basically a higher level language for implementing lower level programming. A simple way to think of abstraction layers is that they reduce code complexity. In the past we used Pig Latin to abstract away Java MapReduce code; for Tensorflow we will use TFLearn.
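As an analogy in plain Python (not actual TFLearn or Tensorflow code), here is the kind of boilerplate an abstraction layer hides: the low-level function spells out the per-neuron math, while the high-level helper wraps it all in one call, much the way TFLearn's layer functions wrap Tensorflow's graph-building code:

```python
import math
import random

def dense_forward(x, weights, bias, activation):
    # Low-level "framework" code: explicit dot product for every neuron.
    out = []
    for w_row, b in zip(weights, bias):
        z = sum(xi * wi for xi, wi in zip(x, w_row)) + b
        out.append(activation(z))
    return out

def fully_connected(n_in, n_out, activation=math.tanh):
    # High-level abstraction: one call initializes the weights and returns
    # a ready-to-use layer, hiding dense_forward from the caller.
    weights = [[random.gauss(0, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    bias = [0.0] * n_out
    return lambda x: dense_forward(x, weights, bias, activation)

layer = fully_connected(3, 2)      # 3 inputs, 2 neurons, one line
print(layer([1.0, 2.0, 3.0]))      # two outputs, one per neuron
```

The abstraction does not remove the low-level math; it just means you write (and debug) it once instead of in every model.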

TFLearn offers a quick way for Data Engineers or Data Scientists to start building Tensorflow neural networks without having to go deep into Tensorflow. Neural networks with TFLearn are still written in Python, but the code is drastically reduced compared to Python Tensorflow. Using TFLearn provides Data Engineers new to Tensorflow an easy way to start learning and building their Deep Neural Networks (DNN).

Pluralsight Author

Since 2015 I’ve been creating Data Engineering courses through Pluralsight. My latest course, Implementing Multi-layer Neural Networks with TFLearn, is my sixth course on Pluralsight. While I’ve developed courses in the past, this course broke new ground in two major areas: first, Implementing Multi-layer Neural Networks is my first course in the deep learning area. Second, this course is based solely on coding in Python. Until now I had never done a coding course per se.

Implementing Multi-layer Neural Networks with TFLearn

Implementing Multi-layer Neural Networks with TFLearn is broken into 7 modules. I wanted to follow closely with the TFLearn documentation for how the functions and layers are broken down. Here are the 7 modules I cover in Implementing Multi-layer Neural Networks with TFLearn:

  1. TFLearn Course Overview – Breakdown of what is covered in this course around deep learning, Tensorflow, and TFLearn.
  2. Why Deep Learning – Why do Data Engineers need to learn about deep learning? Deep dive into the basic terminology in deep learning and comparison of machine learning and deep learning.
  3. What is TFLearn? – First we start off by defining TFLearn and abstraction layers in deep learning. Second, we break down the differences between Tensorflow and TFLearn. Next we run through both the TFLearn and Tensorflow documentation. Finally we close out the module by building your TFLearn development environment on your machine or in the cloud.
  4. Implementing Layers in TFLearn – In deep learning, layers are where the magic happens, so this is where we begin our Python TFLearn coding. In the first example we build out neural networks using the TFLearn core layers. The second neural network we build will be a Convolutional Neural Network (CNN) with our MNIST data set. After running our CNN it’s time to build our third neural network, a Recurrent Neural Network (RNN). Finally we close out the module by looking at the Estimators layers in TFLearn.
  5. Building Activations in TFLearn – The activations module gives us time to examine what mathematical functions are being implemented at each layer. During this module we explore the different activations available in Tensorflow and TFLearn.
  6. Managing Data with TFLearn – Deep learning is all about data sets and how we train our neural networks with those data sets. The Managing Data with TFLearn module is all about the tools available to handle our data sets. In the last topic area of the data module we cover the implications and tools for real-time processing with Tensorflow’s TFLearn.
  7. Running Models with TFLearn – The last module in the Implementing Multi-layer Neural Networks with TFLearn Pluralsight course is all about how to run models. During the course we focus mainly on how to implement Deep Neural Networks (DNN), but in this module we introduce Generative Neural Networks (GNN). Finally, after comparing DNNs and GNNs, we look to the future of deep learning.

Honest Feedback Time

I would love some honest feedback on this course:

  • How did you like it?
  • Would you like to see more deep learning courses?
  • What could be better?

Feel free to put these answers in the comment section below or send me an email.

Filed Under: Tensorflow Tagged With: Deep Learning, Pluralsight, Python, Tensorflow, TFlearn

Hello World Tensorflow – How This Data Engineer Got Started with Tensorflow

January 28, 2019 by Thomas Henson 2 Comments

My Tensorflow Journey

It all started last year when I accepted the challenge to take Andrew Ng’s Coursera Machine Learning Course with the Big Data Beard Team. Now here I am a year later with a new Pluralsight course diving into Tensorflow (Implementing Neural Networks with TFLearn) and writing a blog post about how to get started with Tensorflow. For years I have been involved on the Data Engineering side of Big Data projects, but I thought it was time to take a journey to see what happens on the Data Science side of these projects. However, I will admit I didn’t start my Tensorflow journey just for the education; I see an opportunity for those in the Hadoop ecosystem to start using Deep Learning frameworks like Tensorflow in the near future. With all that being said, let’s jump in and learn how to get started with Tensorflow using Python!

What is Tensorflow

Tensorflow is a Deep Learning framework, and the most popular one at this moment. Right now there are about 1432 contributors to Tensorflow, compared to 653 for Keras (which offers an abstraction layer for Tensorflow), its closest competitor. Deep learning is related to machine learning, but uses neural networks to analyze data. It is mostly used for analyzing unstructured data like audio, video, or images. My favorite example is trying to identify cats vs. dogs in a photo. The machine learning approach would be to identify the different features like ears, fur, color, nose width, etc., then write the model to analyze all those features. While this works, it puts a lot of pressure on the developer to identify the correct features. Is the nose width really a good indicator for cats? The deep learning approach is to take the images (in this example, labeled images) and allow the neural network to decide which features are important through simple trial and error. No guesswork for the developer; the neural network decides which features are the most important.
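To make the contrast concrete, here is a toy sketch of the manual-feature approach in Python. Every feature and weight below is invented for illustration, which is exactly the point: the developer has to guess them all.

```python
# Hand-engineered machine learning: the developer picks the features AND
# hand-tunes the weights. All of these numbers are guesses.
def cat_score(ear_pointiness, nose_width, whisker_length):
    # Is nose width really a good indicator? The developer has to decide.
    return 0.6 * ear_pointiness - 0.3 * nose_width + 0.5 * whisker_length

def classify(features):
    return "cat" if cat_score(*features) > 0.5 else "dog"

print(classify((0.9, 0.2, 0.8)))  # pointy ears, long whiskers -> "cat"
```

The deep learning approach replaces every guessed number above with values learned from labeled images, and in effect learns the features themselves as well.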


Source – KDNuggets Top 16 DL Frameworks
Tensorflow is open source now, but it has its roots at Google. The Google Brain team originally developed Tensorflow for its own deep learning work with neural networks, building on an earlier internal system called DistBelief. After publishing a paper on the project, Google released Tensorflow as open source in 2015. It seems eerily familiar to the Hadoop story, except Tensorflow is written in C++ rather than Java, though for our purposes it's all Python. Enough background on Tensorflow; let's start writing a Tensorflow Hello World model.

How To Get Started with Tensorflow

Now that we understand a little about deep learning and Tensorflow, we need to get the Tensorflow framework installed. In production environments GPUs are preferred, but CPUs will work for our lab. There are a couple of different options for getting Tensorflow installed. My biggest suggestion for Windows users is to use a Docker image or an AWS Deep Learning AMI. However, if you are a Linux or Mac user it's much easier to run a pip install. Below are the commands I used to install and verify Tensorflow on my Mac, inside a virtualenv so the install stays isolated (your Python and pip paths may differ):

$ python3 -m venv tensorflow-env
$ source tensorflow-env/bin/activate
$ pip install tensorflow
$ python -c "import tensorflow as tf; print(tf.__version__)"

Always check out the official installation documentation at Tensorflow.

Tensorflow Hello World

from __future__ import print_function
import tensorflow as tf

a = tf.constant('Hello Big Data Big Questions!')

# Tensorflow builds a graph first; you need a session to actually run it
sess = tf.Session()

# print results
print(sess.run(a))

Beyond Tensorflow Hello World with MNIST

After building out a Tensorflow Hello World, let's build a real model. Our Tensorflow journey will begin by using a neural network to recognize handwritten digits. In the deep learning and machine learning world, the famous Hello World is to use the MNIST data set to train models that identify handwritten digits from 0-9. There are thousands of examples on GitHub, in text books, and in the official Tensorflow documentation. Let's grab one of my favorite GitHub repos for Tensorflow, by aymericdamien.
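One detail worth knowing before reading the code: each MNIST image is a 28x28 grid of pixels, and a fully connected network consumes it as a flat vector of 28 * 28 = 784 values (that is where the num_input parameter in the code comes from). A plain-Python illustration of that flattening (the dummy image here is my own, not MNIST data):

```python
# Each MNIST image is a 28x28 grid of pixel intensities; a fully
# connected network consumes it as one flat vector of 784 values.
rows, cols = 28, 28
image = [[0.0] * cols for _ in range(rows)]  # a dummy all-black image
image[14][14] = 1.0                          # one bright pixel in the middle

# Flatten row by row, exactly how the loader hands images to the model
flat = [pixel for row in image for pixel in row]
print(len(flat))        # 784, matching num_input in the model
print(flat.index(1.0))  # row * 28 + col = 14 * 28 + 14 = 406
```

The trade-off of flattening is that the network loses the 2D neighborhood information; convolutional networks keep it, which is one reason they do better on images.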

Now, as Data Engineers, we need to focus on being able to run and execute this Hello World MNIST code. In a later post we can go behind the code. I'll also show you how to use a Tensorflow abstraction layer to reduce complexity.
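One other piece of the code worth understanding up front is one_hot=True in the MNIST loader: each digit label 0-9 becomes a 10-element vector with a single 1 in the position of the digit, which is what num_classes = 10 and the label placeholder shape refer to. A quick sketch in plain Python (the helper function is my own, not part of the example):

```python
# one_hot=True turns each digit label (0-9) into a 10-element vector
# with a 1.0 in the position of the digit and 0.0 everywhere else.
def one_hot(label, num_classes=10):
    vector = [0.0] * num_classes
    vector[label] = 1.0
    return vector

print(one_hot(3))  # [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
```

This encoding lines up each label with the network's 10 output neurons, so the softmax cross-entropy loss can compare them position by position.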

First let’s save this code as mnist-example.py

""" Neural Network.
A 2-Hidden Layers Fully Connected Neural Network (a.k.a Multilayer Perceptron)
implementation with TensorFlow. This example is using the MNIST database
of handwritten digits (http://yann.lecun.com/exdb/mnist/).
Links:
    [MNIST Dataset](http://yann.lecun.com/exdb/mnist/).
Author: Aymeric Damien
Project: https://github.com/aymericdamien/TensorFlow-Examples/
"""

from __future__ import print_function

# Import MNIST data
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("/tmp/data/", one_hot=True)

import tensorflow as tf

# Parameters
learning_rate = 0.1
num_steps = 500
batch_size = 128
display_step = 100

# Network Parameters
n_hidden_1 = 256 # 1st layer number of neurons
n_hidden_2 = 256 # 2nd layer number of neurons
num_input = 784 # MNIST data input (img shape: 28*28)
num_classes = 10 # MNIST total classes (0-9 digits)

# tf Graph input
X = tf.placeholder("float", [None, num_input])
Y = tf.placeholder("float", [None, num_classes])

# Store layers weight & bias
weights = {
    'h1': tf.Variable(tf.random_normal([num_input, n_hidden_1])),
    'h2': tf.Variable(tf.random_normal([n_hidden_1, n_hidden_2])),
    'out': tf.Variable(tf.random_normal([n_hidden_2, num_classes]))
}
biases = {
    'b1': tf.Variable(tf.random_normal([n_hidden_1])),
    'b2': tf.Variable(tf.random_normal([n_hidden_2])),
    'out': tf.Variable(tf.random_normal([num_classes]))
}

# Create model
def neural_net(x):
    # Hidden fully connected layer with 256 neurons
    layer_1 = tf.add(tf.matmul(x, weights['h1']), biases['b1'])
    # Hidden fully connected layer with 256 neurons
    layer_2 = tf.add(tf.matmul(layer_1, weights['h2']), biases['b2'])
    # Output fully connected layer with a neuron for each class
    out_layer = tf.matmul(layer_2, weights['out']) + biases['out']
    return out_layer

# Construct model
logits = neural_net(X)
prediction = tf.nn.softmax(logits)

# Define loss and optimizer
loss_op = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(
    logits=logits, labels=Y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate)
train_op = optimizer.minimize(loss_op)

# Evaluate model
correct_pred = tf.equal(tf.argmax(prediction, 1), tf.argmax(Y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))

# Initialize the variables (i.e. assign their default value)
init = tf.global_variables_initializer()

# Start training
with tf.Session() as sess:

    # Run the initializer
    sess.run(init)

    for step in range(1, num_steps+1):
        batch_x, batch_y = mnist.train.next_batch(batch_size)
        # Run optimization op (backprop)
        sess.run(train_op, feed_dict={X: batch_x, Y: batch_y})
        if step % display_step == 0 or step == 1:
            # Calculate batch loss and accuracy
            loss, acc = sess.run([loss_op, accuracy], feed_dict={X: batch_x,
                                                                 Y: batch_y})
            print("Step " + str(step) + ", Minibatch Loss= " + \
                  "{:.4f}".format(loss) + ", Training Accuracy= " + \
                  "{:.3f}".format(acc))

    print("Optimization Finished!")

    # Calculate accuracy for MNIST test images
    print("Testing Accuracy:", \
        sess.run(accuracy, feed_dict={X: mnist.test.images,
                                      Y: mnist.test.labels}))

Next let’s run our MNIST example

$ python mnist-example.py

Step 1, Minibatch Loss= ..., Training Accuracy= ...
Step 100, Minibatch Loss= ..., Training Accuracy= ...
...
Step 500, Minibatch Loss= ..., Training Accuracy= ...
Optimization Finished!
Testing Accuracy: 0.81...

Finally we have our results: an 81% accuracy using the sample MNIST code. We could do better and get closer to 99% with some tuning or by adding different layers, but for our first data model in Tensorflow this is great. In fact, in my Implementing Neural Networks with TFLearn course we walk through how to use fewer lines of code and get better accuracy.
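That Testing Accuracy number comes from the last few ops in the script: softmax turns each row of raw logits into probabilities, argmax picks the most likely digit, and accuracy is just the fraction of predictions that match the labels. The same logic in plain Python, with made-up logits for three pretend test images (the values are mine, only for illustration):

```python
import math

def softmax(logits):
    # Exponentiate and normalize so the outputs sum to 1
    exps = [math.exp(v) for v in logits]
    total = sum(exps)
    return [v / total for v in exps]

def argmax(values):
    # Index of the largest value, like tf.argmax along a row
    return max(range(len(values)), key=lambda i: values[i])

# Pretend logits for three test images and their true class labels
batch_logits = [
    [2.0, 0.1, -1.0],  # model favors class 0
    [0.0, 3.0, 0.5],   # model favors class 1
    [1.0, 1.5, 0.2],   # model favors class 1 (true label is 2)
]
labels = [0, 1, 2]

predictions = [argmax(softmax(row)) for row in batch_logits]
matches = [p == y for p, y in zip(predictions, labels)]
accuracy = sum(matches) / len(matches)
print(accuracy)  # 2 of 3 correct -> 0.666...
```

Since argmax is unchanged by softmax, the probabilities are not strictly needed to score accuracy; the script computes them anyway because softmax outputs are what you would report as per-digit confidences.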


Learn More Data Engineering Tips

Sign up for my newsletter so you never miss a post or a YouTube episode of Big Data Big Questions, where I answer Data Engineering questions from the community.

Filed Under: Tensorflow Tagged With: Deep Learning, Machine Learning, Python, Tensorflow
