Thomas Henson

What is an Industrial IoT Engineer with Derek Morgan

January 22, 2021 by Thomas Henson 4 Comments

Industrial IoT Engineer with Derek Morgan

Explore Career Paths as an Industrial IoT Engineer

IoT investments are projected to grow by 13.6% through 2022. There is a huge opportunity for developers to jump into a career in IoT and specifically as Industrial IoT Engineers. Data is at the forefront of skills needed for IoT workflows. In this interview I sat down with Derek Morgan to discuss the role of the IoT Engineer.

Derek has quite a bit of experience in IoT and has been focusing on the manufacturing space of IoT. During this episode of Big Data Big Questions we break down the skills needed to enter the IoT Engineering space and which certifications matter. Here are a few of the items we cover:

    • Tech Stack Rechner, Postgres, Python and Terraform
    • How C++ doesn’t apply here
    • Cloud vs. Private Cloud for IoT
    • Security Challenges with IoT
    • Opportunities for IoT Engineers in this space

Make sure to check out the full video below to understand the role of the Industrial IoT Engineer.

Industrial IoT Engineer Interview

IoT Engineer Show Notes

Derek Morgan LinkedIn
Terraform
More than Certified in Terraform Course

Filed Under: Career Tagged With: IoT, IoT Engineer, Python

Deep Learning Python vs. Java

October 8, 2019 by Thomas Henson Leave a Comment

What About Java in Deep Learning?

Years ago, when I left Java in the rear view of my career, I never imagined someone would ask me if they could use Java over Python. Just kidding, Java. You know it’s only a joke, and you will always have a special place in my heart. A place in my heart that probably won’t run because I have the wrong version of the JDK installed.

Python is king of the Machine Learning (ML) and Deep Learning (DL) workflow. Most of the popular ML libraries are in Python, but are there Java offerings? How about in Deep Learning: can you use Java? The answer is yes, you can! Find out the differences between Machine Learning and Deep Learning libraries in Java and Python in the video.

Transcript

Hi, folks. Thomas Henson here, with thomashenson.com, and today is another episode of Big Data Big Questions. Today’s question comes in around deep learning frameworks in Java, not Python. So, find out about how you can use Java instead of Python for deep learning frameworks. We’ve talked about it here on this channel, around using neural networks and being able to train models, but let’s find out what we can do with Java in deep learning.

Today’s episode comes in and we’re talking about deep learning frameworks that use Java, not Python. So, today the question is, “Are there specific deep learning frameworks that use Java, not Python?” First off, let’s talk a little bit about deep learning, do a recap. Deep learning, if you remember, is the use of neural networks whenever we’re trying to solve a problem. We see it a lot in multimedia, right, like, we see image detection. Does this image contain a cat or not contain a cat?

The deep learning approach is to take those images [Inaudible 00:01:10] you know, if we’re talking about supervised, so take those labeled images, so of a cat, not of a cat, feed those into your neural network, and let it decide what those features are. At the end you get a model that’s going to tell you, is this a cat or is this not a cat? Within some confidence. Hopefully not 50%, maybe closer to 99 or 97. But, that’s the deep learning approach versus the machine learning approach that we’ve seen a good bit.

We talk about Hadoop and traditional analytics from that perspective. In machine learning we’re probably going to use some kind of algorithm like singular value decomposition, or PCA, and we’re going to take these images and we’re going to look at each one and we’re going to define each feature, from the cat’s ears to the cat’s nose, and we’re going to feed that through the model and it’s going to give us some kind of confidence. While with the deep learning approach we get to use a neural network, it defines some of those features, helps us out a lot. It’s not magic, but it is a little bit, so really, really innovative approach.

So, the popular languages, and what we’ve talked most about on this channel and probably other channels and most of the examples you’ve seen are all around Python, right? I did do a video before where I was wrong on C++. There was more C++ in deep learning than I really originally thought. You can check that video out, where we kind of go through and talk about that and I come in and say, “Hey, sorry. I missed the boat on that one.” But, the most popular language, one… I mean, I did a Pluralsight video on it, Take CTRL of Your Career, around TensorFlow and using TFLearn. TensorFlow is probably far and away the most popular one. You’ve seen it with stats that are out there. Also PyTorch, Caffe2, MXNet, and then some other, higher-level languages where Keras is able to use some of TensorFlow and be a higher-level abstraction, but most of those are going to use Python and then some of them have C++. Most examples that you’re going to see out there, just from my experience and just working in the community, is Python. Most people are looking for those Python examples.

But, on this channel, we’ve talked a lot about options and Hadoop for non-Java developers, but this is an opportunity where all you Java developers out there, you’re looking for, “Hey, we want to get into the deep learning framework. We don’t want to have to code everything ourselves. Are there some things that we can attach onto?” And the answer is yes, there are. It’s not as popular as Python right now, or R and C++ in the deep learning frameworks, but there is a framework called Deeplearning4j that is a Java-based framework. The Java-based framework is going to allow for you to use Java. You could still use Python, though. Even with the framework, you can abstract away and do Python, but if you’re specifically a Java developer and looking to… I mean, maybe you want to get in and contribute to the Deeplearning4j community and be able to take it from that perspective, or you’re just wanting to be able to implement it in some projects. Maybe you’re like, “Hey, you know what? I’m a Java developer. I want to continue doing Java.” Java’s been around since ’95, right? So, you want to jump into that? Then Deeplearning4j is the one for you.

So, really, maybe think about why would you want to use a Java-based deep learning framework, for people that maybe aren’t familiar with Java or don’t have it. One of the things is it claims to be a little bit more efficient, so it’s going to be more efficient than using an abstraction layer from that perspective in Python. But also, there’s a ton of Java developers out there, you know, there’s a community. Talked about how it’s been around since ’95, so there’s an opportunity out there to tap into a lot of developers that have the skills to be able to use it and so, there’s a growing need, right? There’s communities all around the globe and different little subsets and little subareas. Java’s one of those.

I mean, if you look at what we did from a Hadoop perspective, so many people that were Java developers moved to that community, also a lot of people that didn’t really do Java. It’s a lot like, like I said, at the point I was at in my career, I was more of a .NET C# developer. Fast forward to getting into the Hadoop community, went back to my roots as a Java, so I’d done some Java in the past, and went through that phase. And so, for somebody like me, maybe I would want to go back out. I don’t know. I’ve kind of gone through more Python, but a lot of different options out there. Just being able to give Java developers a platform to be able to get involved in deep learning, like, deep learning is very popular.

So, those are some of the reasons that you might want to go, but the question is, when you think about it, so if I’m not a Java developer, or what would you recommend? Would you recommend maybe not learn TensorFlow and go into Deeplearning4j? You know, I think that one’s going to depend… I mean, we say it a lot in here. It’s going to depend on what you’re using in your organization and what your skill set is. If you’re mostly a Python person, my recommendation would be continue on or jump into the TensorFlow area. But if you’re working on a project that is using Deeplearning4j then by all means go down that path and learn more about it. If you’re a Java developer and you want to get into it, you don’t want to transition skills or you’re just looking to be able to test something out and play with it, and you don’t want to have to write it in Python, you want to be able to do it in Java, yeah, use that.

These are all just tools. We’re not going to get transfixed on any tool. We’re not going to go all in and say, “You know what? I’m only going to be a Java developer,” or, “I’m only going to be this.” We’re going to be able to transition our skills and there’s always going to be options out there to do it. And in these frameworks too, right? Deeplearning4j is awesome, but maybe there’s another one that’s coming up that people would want to jump into, so like I said, don’t get so transfixed with certain frameworks. Like, Hadoop was awesome. We broke it apart. A lot of people navigated to Spark and still use HDFS as a base. There’s always kind of skills that you can go to, but if you go in and say, “Hey, I’m only going to ever do MapReduce and it’s always going to be in Java,” then you’re going to have some challenges throughout your career. That’s not just in data engineering, that’s throughout all IT. Heck, probably throughout all careers. Just be able to be flexible for it.

So, if you’re a Java developer, if you’re looking to test some things out, definitely jump into it. If you don’t have any Java skills and it’s not something that you’re particularly wanting to do, then I don’t recommend you running in and trying to learn Java just for this. If you’re doing Python, steady on with TensorFlow, or PyTorch, or Caffe, whatever you’re using.

So, until next time. See you again on Big Data Big Questions. Make sure you subscribe and ring that bell so you never miss an episode. If you have any questions, put them in the comment section here below. Thanks again.

Want More Data Engineering Tips?

Sign up for my newsletter so you never miss a post or YouTube episode of Big Data Big Questions, where I answer Data Engineering questions from the community.

Filed Under: Deep Learning Tagged With: Deep Learning, Java, Python, Tensorflow

Data Engineers: Python VS. C#

June 18, 2019 by Thomas Henson 17 Comments

Python VS. C#

Which Is Better Python Or C#?

Getting into wars over different programming languages is a no-no in the world of programming. However, I recently had a question on Big Data Big Questions about which is better for Data Engineers: Python or C#. So in the spirit of examining the difference through the lens of Data Engineering, I decided to weigh in.

Python has long been used in Data Analytics for building Machine Learning models. C# is an object-oriented programming language developed by Microsoft and used widely in all ranges of applications. Both have a ton of community support and a large user base, but which one is better? In this episode of Big Data Big Questions I break down both Python and C# for Data Engineers. Make sure to watch the video to find out my thoughts on which is better in Data Engineering.

Transcripts – Data Engineer Python VS. C#

Hi folks! Thomas Henson here with thomashenson.com, and today is another episode of Big Data Big Questions. Today’s question, we’re going to do a comparison between Python and C#. It’s a question that I’ve had coming in, and it’s also something that’s a passion of mine, because I used to be a C# developer back in the day. Then, I’ve currently, I guess the last four or five, maybe six years, I’ve learned Python. I thought it would be good to go through some of that, especially if you’re just starting out. Maybe you’re in high school, or maybe you’re in college, or maybe you’re even looking to make a jump into data engineering or machine learning engineering, and you’re like, “Hey, man, there’s C# out here. There’s Python.” What are some of the differences? What should I learn? Find out right after this.

Today’s episode of Big Data Big Questions, I wanted to do some of the differences between Python and C#. First thing, we’ll start off with C#. C#, heavily developed by Microsoft. I think it was released in 2000. It’s an object-oriented programming language. You see it a lot. I used it, for instance, back when I was doing ASP.NET. There’s a lot of things that you can do, use it for. It relies on the .NET framework. You have to have the .NET framework to be able to go. They are in version 7.0. Primarily used, I used it a lot for web application development, but you can do a lot of different things with it, build out really complex and awesome applications, whether it be a desktop application, whether it be web, mobile, they’ve just got so much of a community that, there’s a lot of different things that you can do with it. Another thing, too, one of the comparisons to it is, it looks just like Java. Another reason I rotated to it, because one of my first languages I learned, I think I learned VB first, but I did a lot of stuff around Java, and actually when I graduated out of college, I thought I was going to be a Java developer for a long time. Really got ingrained in that community there. Fast forward to being a web developer, and transitioning to C#, it was a really natural process for me. Like I said, heavy community, heavy packages, and frameworks, and things to be able to use. See it a lot with Microsoft. If you’re doing C#, you’ve probably used Visual Studio or I think it’s VS Code. They’ve got a couple different IDEs for development and everything like that. See it a lot there.

Python. If you’ve been following this channel, you’ve probably seen a ton of videos that we’ve done around Python. Python was developed in 1991. It’s in version three. We talked about C# being in version 7. Python’s in version 3. I wouldn’t put a lot into that, because we talked about C# being in 2000 and Python’s been around since 1991. Heavily involved, both of them. It’s object-oriented just like that, just like C#. Also, you see it a lot used in, for sure, data analytics, but there’s a lot of different other frameworks that you can use to do web development. Pretty much, you can do anything you want with Python. You do have to install Python and have that running in your version. Sometimes that can be a little bit clunky, especially maybe in a Windows environment, but it’s something that you can download and start playing with, and have going on your machine. Man, probably in less than five minutes. Maybe I should do a video on that, but you can go ahead, download that, and be up and running, and start running your own code. Huge community support. There’s a ton of things out there for it. Like I said, talked about, I think even in our book review, we talked about some of the books for data engineers. I think there were two to three Python books that I had showed there, too. Heavy use there. Like I said, a lot of involvement from data analytics, whether it be data scientists or machine learning engineers, and just like with Tensorflow, or PyTorch, a lot of the deep learning frameworks that we’ve talked about on this channel have Python APIs.

The question is, you’re a data engineer just starting out, which one should you learn? I’m going to go through three different questions, where we’re going to talk about what you should learn, and which one is better? I hate doing which one is better, because each one is a different tool, and some tools are better at other things, or have more functionality to do certain tasks. Let’s jump right into it.

Which one is easier to learn? Err! I’m having to put myself in there, because I’m biased as far as C# and just having been a part of that community. Like I said, my first language being in VB, which was similar, and then a ton of work in Java. C# on the premise looked a little easier, but the way I’m going to do this criteria is which one do I think is easier to get up and get started from a data engineering perspective or data analytics perspective. I’m going to have to give it to Python. Like I said, can be a little clunky when you’re first installing it, but if you were just able to open up a Linux, build out a Linux machine, you can do, especially if you’re in Red Hat, you can do yum install python, and then you can start scripting away on some of the code there. Then, also, too, I’ll give it to Python just from the perspective of a lot of things from a data analytics perspective.

Number two, I’m a data engineer. I’m a machine learning engineer. Which one should I learn today? Which one would I start off with if I had to choose, only choose between Python and C#? I would probably go with Python, right? Go ahead and learn Python. I would encourage anybody watching this channel, jump into that community. There’s a ton of books out there. We’ve talked about on this channel where you can go, and learn how to do data analytics from that perspective. Python’s going to get the win there.

Which one do I enjoy coding in more? Personal preference, man, I think C# will always have that win for me. Like I said, this is a data engineering channel, but like I said, I started off as a web developer. I really like Visual Studio, and I know there’s some plugins you can do with VS Code. You can use that as your IDE for Python and everything like that. There’s something about C# and that language that I really found comfortable and probably will always have a special place in my heart. Like I said, just coming from a Java perspective and everything like that, I’ll give that the win.

The overall win, the overall win between the three categories, if you’re a data engineer, a machine learning engineer, you have to start somewhere, I’d say start with Python. Go through some of the tutorials. Got some on this channel. I’ve got some on my blog, but get started there. I hope you enjoyed this. Tell me what you think. Did I miss something on the differences? Would you have chosen C# as something to start off with? Do you like Python better than C#? Like I said, C# has a special place in my heart. Let me know in the comments section below, or if you have any questions you want me to answer on the show, put them in here, and then make sure you subscribe and ring that bell, so you never miss an episode of Big Data Big Questions.

Filed Under: Data Engineers Tagged With: Big Data Big Questions, Data Engineers, Python

Learning Tensorflow with TFLearn

February 11, 2019 by Thomas Henson Leave a Comment

Recently we have been talking a lot about Deep Learning and Tensorflow. In the last post I walked through how to build neural networks with Tensorflow. Now I want to shift gears to talk about my newest venture into Tensorflow with TFLearn. The lines between deep learning and Hadoop are blurring, and data engineers need to understand the basics of deep learning. TFLearn offers an easy way to learn Tensorflow.

What is TFLearn?

TFLearn is an abstraction framework for Tensorflow. An abstraction framework is basically a higher-level language for implementing lower-level programming. A simple way to think of an abstraction layer is that it reduces code complexity. In the past we used Pig Latin to abstract away Java code; for Tensorflow we will use TFLearn.

TFLearn offers a quick way for Data Engineers or Data Scientists to start building Tensorflow neural networks without having to go deep into Tensorflow. Neural Networks with TFLearn are still written in Python, but the code is drastically reduced from Python Tensorflow. Using TFLearn provides Data Engineers new to Tensorflow an easy way to start learning and building their Deep Neural Networks (DNN).
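To give a feel for how much code the abstraction layer saves, here is a minimal sketch of a two-hidden-layer MNIST network written with TFLearn. It is an illustration rather than code from the course: the function names build_mnist_model and train, the layer sizes, the relu activations, and the learning rate are my own assumptions, and it presumes TFLearn is installed on top of a Tensorflow 1.x environment.

```python
def build_mnist_model():
    """Build a small fully connected network for MNIST with TFLearn.

    Hypothetical helper for illustration; layer sizes and learning rate
    are assumptions, not values taken from the course.
    """
    # Imported inside the function so this file loads even without TFLearn.
    import tflearn

    net = tflearn.input_data(shape=[None, 784])              # flattened 28x28 images
    net = tflearn.fully_connected(net, 256, activation='relu')
    net = tflearn.fully_connected(net, 256, activation='relu')
    net = tflearn.fully_connected(net, 10, activation='softmax')  # digits 0-9
    net = tflearn.regression(net, optimizer='adam',
                             loss='categorical_crossentropy',
                             learning_rate=0.001)
    return tflearn.DNN(net)


def train(n_epoch=1):
    """Download MNIST through TFLearn's bundled loader and fit the model."""
    import tflearn.datasets.mnist as mnist

    X, Y, test_x, test_y = mnist.load_data(one_hot=True)
    model = build_mnist_model()
    model.fit(X, Y, n_epoch=n_epoch, batch_size=128,
              validation_set=(test_x, test_y), show_metric=True)
    return model
```

Calling train() kicks off the download and training. Notice there are no placeholders, weight dictionaries, or explicit sessions to manage; TFLearn handles those pieces behind the scenes.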

Pluralsight Author

Since 2015 I’ve been creating Data Engineering courses through Pluralsight. My latest course on TFLearn, titled Implementing Multi-layer Neural Networks with TFLearn, is my sixth course on Pluralsight. While I’ve developed courses in the past, this course was a first for me in two major areas. First, Implementing Multi-layer Neural Networks is my first course in the deep learning area. Second, this course is solely based on coding in Python. Until now I had never done a coding course per se.

Implementing Multi-layer Neural Networks with TFLearn

Implementing Multi-layer Neural Networks with TFLearn is broken into 7 modules. I wanted to closely follow how the TFLearn documentation breaks down the functions and layers. Here are the 7 modules I cover in Implementing Multi-layer Neural Networks with TFLearn:

  1. TFLearn Course Overview – Breakdown of what is covered in this course around deep learning, Tensorflow, and TFLearn.
  2. Why Deep Learning – Why do Data Engineers need to learn about deep learning? Deep dive into the basic terminology in deep learning and comparison of machine learning and deep learning.
  3. What is TFLearn? – First we start off by defining TFLearn and abstraction layers in deep learning. Second we break down the differences between Tensorflow and TFLearn. Next we run through both the TFLearn and Tensorflow documentation. Finally we close out the module by building your TFLearn development environment on your machine or in the cloud.
  4. Implementing Layers in TFLearn – In deep learning, layers are where the magic happens, so this is where we begin our Python TFLearn coding. In the first example we build out neural networks using the TFLearn core layers. The second neural network we build will be a Convolutional Neural Network (CNN) with our MNIST data source. After running our CNN it’s time to build our third neural network, a Recurrent Neural Network (RNN). Finally we close out the module by looking at the Estimators layers in TFLearn.
  5. Building Activations in TFLearn – The activations module gives us time to examine what mathematical functions are being implemented at each layer. During this module we explore the different activations available in Tensorflow and TFLearn.
  6. Managing Data with TFLearn – Deep learning is all about data sets and how we train our neural networks with those data sets. The Managing Data with TFLearn module is all about the tools available to handle our data sets. In the last topic area of the data module we cover the implications and tools for real-time processing with Tensorflow’s TFLearn.
  7. Running Models with TFLearn – The last module in the Implementing Multi-layer Neural Networks with TFLearn Pluralsight course is all about how to run models. During the course we have focused mainly on how to implement Deep Neural Networks (DNN) but in this module we introduce Generative Neural Networks (GNN). Finally after comparing DNNs and GNNs we look to the future of deep learning.
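For a quick taste of what the activations module is getting at, here is a rough pure-Python sketch of a few common activation functions and how one is applied at a single fully connected layer. This is not code from the course; the function names and the tiny weights are made up for illustration, and a real framework does the same math on whole tensors at once.

```python
import math


def relu(z):
    """Rectified linear unit: pass positives through, clamp negatives to 0."""
    return max(0.0, z)


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def tanh(z):
    """Squash any real number into the range (-1, 1)."""
    return math.tanh(z)


def dense_layer(inputs, weights, biases, activation):
    """Compute activation(w . x + b) for each neuron in one layer.

    `weights` holds one list of weights per output neuron (illustrative layout).
    """
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        z = sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        outputs.append(activation(z))
    return outputs


# Two inputs feeding a layer of two neurons, using made-up weights.
layer_out = dense_layer([1.0, 2.0],
                        [[0.5, -1.0], [1.0, 1.0]],
                        [0.0, 0.5],
                        relu)
print(layer_out)
```

Swapping relu for sigmoid or tanh in the last call changes only the final squashing step, which is the kind of comparison the activations module is about.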

Honest Feedback Time

I would love some honest feedback on this course:

  • How did you like it?
  • Would you like to see more deep learning courses?
  • What could be better?

Feel free to put these answers in the comment section below or send me an email.

Filed Under: Tensorflow Tagged With: Deep Learning, Pluralsight, Python, Tensorflow, TFlearn

Hello World Tensorflow – How This Data Engineer Got Started with Tensorflow

January 28, 2019 by Thomas Henson 2 Comments

My Tensorflow Journey

It all started last year when I accepted the challenge to take Andrew Ng’s Coursera Machine Learning Course with the Big Data Beard Team. Now here I am a year later with a new Pluralsight course diving into Tensorflow (Implementing Neural Networks with TFLearn) and writing a blog post about how to get started with Tensorflow. For years I have been involved on the Data Engineering side of Big Data Projects, but I thought it was time to take a journey to see what happens on the Data Science side of these projects. However, I will admit I didn’t start my Tensorflow journey just for the education; I see an opportunity for those in the Hadoop ecosystem to start using Deep Learning frameworks like Tensorflow in the near future. With all that being said, let’s jump in and learn how to get started with Tensorflow using Python!

What is Tensorflow

Tensorflow is a Deep Learning framework and the most popular one at this moment. Right now there are about 1432 contributors to Tensorflow compared to 653 for Keras (which offers an abstraction layer for Tensorflow), its closest competitor. Deep learning is related to machine learning, but uses neural networks to analyze data. It is mostly used for analyzing unstructured data like audio, video, or images. My favorite example is trying to identify cats vs. dogs in a photo. The machine learning approach would be to identify the different features like ears, fur, color, nose width, etc., then write the model to analyze all the features. While this works, it puts a lot of pressure on the developer to identify the correct features. Is the nose width really a good indicator for cats? The deep learning approach is to take the images (in this example labeled images) and allow the neural network to decide which features are important through simple trial and error. There is no guesswork for the developer, and the neural network decides which features are the most important.

Source – KDNuggets Top 16 DL Frameworks
Tensorflow is open source now, but has its roots at Google. The Google Brain team actually developed Tensorflow for its own deep learning work with neural networks. After releasing a paper on DistBelief (Tensorflow’s predecessor), Google released Tensorflow as open source in 2015. That seems eerily familiar to Hadoop, except Tensorflow is written in C++ not Java, but for our purposes it’s all Python. Enough background on Tensorflow; let’s start writing a Tensorflow Hello World model.

How To Get Started with Tensorflow

Now that we understand deep learning and Tensorflow, we need to get the Tensorflow framework installed. In production environments GPUs are preferred, but CPUs will work for our lab. There are a couple of different options for getting Tensorflow installed. My biggest suggestion for Windows users is to use a Docker image or an AWS Deep Learning AMI. However, if you are a Linux or Mac user, it’s much easier to run a pip install. Below are the commands I used to install and run Tensorflow on my Mac.
$ bash commands for install tensorflow
using env

Always check out the official documentation at Tensorflow.

Tensorflow Hello World MNIST

from __future__ import print_function
import tensorflow as tf

# Written for the TensorFlow 1.x API; tf.Session is not available
# at the top level in TensorFlow 2.x.
a = tf.constant('Hello Big Data Big Questions!')

# Always have to run a session to evaluate the graph, trust me :)
sess = tf.Session()

# Print results
print(sess.run(a))

Beyond Tensorflow Hello World with MNIST

After building out a Tensorflow Hello World, let’s build a model. Our Tensorflow journey will begin by using a neural network to recognize handwritten digits. In the deep learning and machine learning world the famous Hello World is to use the MNIST data set to test out training models to identify handwritten digits from 0 – 9. There are thousands of examples on GitHub, in textbooks, and in the official Tensorflow documentation. Let’s grab one from one of my favorite GitHub repos for Tensorflow, by Aymeric Damien.
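Before running the example, it may help to see how MNIST data is shaped. Each image is 28 x 28 pixels flattened into 784 values, and each label 0 – 9 is "one-hot" encoded as a list of 10 numbers, which is what the one_hot=True flag in the listing further down asks for. Here is a small pure-Python sketch of both ideas. The helper names one_hot and flatten are my own, not part of Tensorflow.

```python
def one_hot(label, num_classes=10):
    """Encode a digit label as a list with a single 1.0 at the label's index."""
    if not 0 <= label < num_classes:
        raise ValueError("label must be between 0 and num_classes - 1")
    encoded = [0.0] * num_classes
    encoded[label] = 1.0
    return encoded


def flatten(image):
    """Turn a 2-D grid of pixel rows into one flat list of pixel values."""
    return [pixel for row in image for pixel in row]


# A blank 28x28 "image" stands in for a real handwritten digit here.
blank_image = [[0.0] * 28 for _ in range(28)]

print(one_hot(3))
print(len(flatten(blank_image)))
```

The flat list of 784 pixels lines up with num_input in the example, and the 10-element label lines up with num_classes.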

Now as Data Engineers we need to focus on being able to run and execute this Hello World MNIST code. In a later post we can cover what is happening behind the code. Also I’ll show you how to use a Tensorflow abstraction layer to reduce complexity.

First let’s save this code as mnist-example.py

""" Neural Network.
A 2-Hidden Layers Fully Connected Neural Network (a.k.a Multilayer Perceptron)
implementation with TensorFlow. This example is using the MNIST database
of handwritten digits (http://yann.lecun.com/exdb/mnist/).
Links:
[MNIST Dataset](http://yann.lecun.com/exdb/mnist/).
Author: Aymeric Damien
Project: https://github.com/aymericdamien/TensorFlow-Examples/
"""

from __future__ import print_function

# Import MNIST data
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("/tmp/data/", one_hot=True)

import tensorflow as tf

# Parameters
learning_rate = 0.1
num_steps = 500
batch_size = 128
display_step = 100

# Network Parameters
n_hidden_1 = 256 # 1st layer number of neurons
n_hidden_2 = 256 # 2nd layer number of neurons
num_input = 784 # MNIST data input (img shape: 28*28)
num_classes = 10 # MNIST total classes (0-9 digits)

# tf Graph input
X = tf.placeholder("float", [None, num_input])
Y = tf.placeholder("float", [None, num_classes])

# Store layers weight & bias
weights = {
    'h1': tf.Variable(tf.random_normal([num_input, n_hidden_1])),
    'h2': tf.Variable(tf.random_normal([n_hidden_1, n_hidden_2])),
    'out': tf.Variable(tf.random_normal([n_hidden_2, num_classes]))
}
biases = {
    'b1': tf.Variable(tf.random_normal([n_hidden_1])),
    'b2': tf.Variable(tf.random_normal([n_hidden_2])),
    'out': tf.Variable(tf.random_normal([num_classes]))
}

# Create model
def neural_net(x):
    # Hidden fully connected layer with 256 neurons
    layer_1 = tf.add(tf.matmul(x, weights['h1']), biases['b1'])
    # Hidden fully connected layer with 256 neurons
    layer_2 = tf.add(tf.matmul(layer_1, weights['h2']), biases['b2'])
    # Output fully connected layer with a neuron for each class
    out_layer = tf.matmul(layer_2, weights['out']) + biases['out']
    return out_layer

# Construct model
logits = neural_net(X)
prediction = tf.nn.softmax(logits)

# Define loss and optimizer
loss_op = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(
    logits=logits, labels=Y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate)
train_op = optimizer.minimize(loss_op)

# Evaluate model
correct_pred = tf.equal(tf.argmax(prediction, 1), tf.argmax(Y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))

# Initialize the variables (i.e. assign their default value)
init = tf.global_variables_initializer()

# Start training
with tf.Session() as sess:

    # Run the initializer
    sess.run(init)

    for step in range(1, num_steps+1):
        batch_x, batch_y = mnist.train.next_batch(batch_size)
        # Run optimization op (backprop)
        sess.run(train_op, feed_dict={X: batch_x, Y: batch_y})
        if step % display_step == 0 or step == 1:
            # Calculate batch loss and accuracy
            loss, acc = sess.run([loss_op, accuracy], feed_dict={X: batch_x,
                                                                 Y: batch_y})
            print("Step " + str(step) + ", Minibatch Loss= " + \
                  "{:.4f}".format(loss) + ", Training Accuracy= " + \
                  "{:.3f}".format(acc))

    print("Optimization Finished!")

    # Calculate accuracy for MNIST test images
    print("Testing Accuracy:", \
        sess.run(accuracy, feed_dict={X: mnist.test.images,
                                      Y: mnist.test.labels}))

Next let’s run our MNIST example

$ python mnist-example.py

…results will begin to appear here…

Finally we have our results. We get an 81% accuracy using the sample MNIST code. Now we could do better and get closer to 99% with some tuning or adding different layers, but for our first data model in Tensorflow this is great. In fact, in my Implementing Neural Networks with TFLearn course we walk through how to use fewer lines of code and get better accuracy.
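If you are wondering where that accuracy number comes from, the "Evaluate model" step in the listing compares the class with the highest predicted probability against the class marked in the one-hot label, then averages the matches. Here is a rough pure-Python sketch of the same calculation. The logits and labels are made-up toy values over three classes, and the helper names are my own rather than Tensorflow functions.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    highest = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(value - highest) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


def argmax(values):
    """Return the index of the largest value."""
    return max(range(len(values)), key=lambda i: values[i])


def accuracy(batch_logits, batch_labels):
    """Fraction of examples where the predicted class matches the label."""
    correct = sum(
        1 for logits, label in zip(batch_logits, batch_labels)
        if argmax(softmax(logits)) == argmax(label)
    )
    return correct / len(batch_labels)


# Toy batch: four examples, three classes (made-up numbers).
toy_logits = [[2.0, 1.0, 0.1],
              [0.1, 0.2, 3.0],
              [1.0, 5.0, 0.0],
              [4.0, 0.0, 0.0]]
toy_labels = [[1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0],
              [1.0, 0.0, 0.0],
              [1.0, 0.0, 0.0]]

print(accuracy(toy_logits, toy_labels))
```

In this toy batch the third example is predicted as class 1 but labeled class 0, so three of the four predictions match. The MNIST script does the same comparison across all of the test images.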

Filed Under: Tensorflow Tagged With: Deep Learning, Machine Learning, Python, Tensorflow

Ultimate Hadoop Python Example

December 7, 2017 by Thomas Henson Leave a Comment