Thomas Henson


O’Reilly AI Conference London 2019

October 9, 2019 by Thomas Henson

The Big Data Big Questions show is heading to London for the O’Reilly AI Conference, October 15–17, 2019. I’m excited to be a part of the O’Reilly AI Conference series. In fact, this will be my third O’Reilly AI Conference in the past year. Let’s look back at those events and forward to London.

San Jose & New York

 


Late night packing my conference gear for my trip to O’Reilly AI Conference this week. Most important items: 1️⃣ Stickers 2️⃣ 🎧 3️⃣ 💻 4️⃣ Bandages? (I’ll explain later) 5️⃣ 📚 (this weeks its my Neural Networking) What’s your list of must have gear for tech conferences? #programming #coding #AI #conference #techconference

A post shared by Thomas Henson (@thomas_henson) on Sep 5, 2018 at 5:09am PDT


First, in 2018 I attended the San Jose conference, where I spent a good portion of my time in the Dell EMC booth talking with Data Engineers and Data Scientists. One of the major themes I heard from data professionals was that they were attending to learn how to incorporate TensorFlow into their workflows. In my opinion, TensorFlow came up in every aspect of the conference. We had a blast learning from attendees and discussing how to scale Deep Learning workloads. This was also my first time attending a conference with 14 stitches in my left hand (trouble on the pull-up bar)!

Oreilly AI Conference

Next was O’Reilly AI New York. In my head this conference will forever be known as the Sofia the Robot trip. I worked with Sofia the Robot not only at the conference but also at a Dell EMC event at Times Square Studios (part of the Dell Technologies Magic of AI Series). Before the Magic of AI event, Sofia and I spent a day recording with O’Reilly TV about the current state of AI and what’s driving its widespread adoption. After the day of recording, I gave a keynote on day two of the O’Reilly AI Conference, where I discussed how AI is already impacting future generations. Then came a whirlwind of activity as Sofia the Robot took questions at the Dell Technologies booth. The last event of the day was Magic of AI at Times Square Studios, where 100 people took part in a question-and-answer session with Sofia the Robot.

Keynote O’Reilly AI Conference New York

Coffee with Sofia the Robot

On To London

Next up is O’Reilly AI London. To say I’m excited is an understatement. This trip will bring several firsts for me.

To begin with, it’s my first international conference and my first time in London. So many things to see and so little time to see them. Feel free to suggest places to visit in the comment section below.

Second, at O’Reilly AI London I will give my first breakout session at an O’Reilly conference. While I’ve been on O’Reilly TV and given a keynote, I’ve yet to lead a breakout session. My session is titled AI Growing Pains: Platform Considerations for Moving from POC to Large-Scale Deployments. The world is changing as organizations innovate and incorporate Artificial Intelligence into many applications and services. However, with all this excitement, many Data Engineers are still struggling to get projects past the Proof-of-Concept (POC) phase and into production. Production environments present their own list of challenges. The three biggest challenges I see when moving from POC to production are the following:

  • The gravity of data is just as real as gravity in the physical world. As Deep Learning workloads continue to grow, so does the amount of data stored to train these models. That data has gravity that attracts services and applications to it. The trouble here is making sure you have the correct data pipeline strategy in place.
  • I once had dinner with one of the co-founders of Hortonworks, during which he said, “Everything at scale is exponentially harder.” Have you ever moved photos around on your desktop? For the most part this is an easy task, except when you accidentally move a large set of photos. Right after moving those large folders you are endlessly waiting for the hourglass to finish. Imagine doing this with 10 PB of data. I think you get the picture.
  • The talent pool today is much larger than in the early days of “Big Data”. However, the demand for skills in Deep Learning, Machine Learning, and Data Engineering is stressing the system, which still leaves a skills gap for experienced engineers with Deep Learning and Machine Learning skills. The skills gap is one huge reason many projects get stuck in the POC phase instead of moving into production.
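
To put rough numbers on the scale point above, here is a back-of-the-envelope sketch of how long it takes just to move 10 PB. The 10 Gbit/s link speed, and the idea that the link stays fully saturated with no protocol overhead, are my assumptions for illustration and not figures from any real deployment.

```python
# Back-of-the-envelope estimate; the link speed is an assumed, illustrative value.
PETABYTE = 10**15  # bytes (decimal petabyte)


def transfer_days(size_bytes: int, link_gbit_per_s: float) -> float:
    """Days needed to move size_bytes over a fully saturated link."""
    bits = size_bytes * 8
    seconds = bits / (link_gbit_per_s * 10**9)
    return seconds / 86_400  # 86,400 seconds per day


days = transfer_days(10 * PETABYTE, link_gbit_per_s=10)
print(f"10 PB over 10 Gbit/s: about {days:.1f} days")
```

Under those assumptions the copy alone takes roughly three months, which is why data gravity tends to pull applications toward the data rather than the other way around.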

If you would like to know more about moving projects from POC to production, make sure to check out my session if you are attending the O’Reilly AI Conference in London: AI Growing Pains: Platform Considerations for Moving from POC to Large-Scale Deployments, at 11:55 on October 16, 2019.

Want More Data Engineering Tips?

Sign up for my newsletter so you never miss a post or YouTube episode of Big Data Big Questions, where I answer Data Engineering questions from the community.

Filed Under: Data Engineers, Data Science Tagged With: AI, Conference, Data Engineers, Data Science, Deep Learning

What I’m Learning Report #1 (Docker Deep Dive, K8s, & More)

October 3, 2019 by Thomas Henson

One question I get a lot on Big Data Big Questions is “Thomas, what are you learning?” Honestly, not as much as I should. I truly believe the key to being successful in any part of life is to continually learn.

Looking to change careers from Web Developer to DevOps?

Do you want to be a better partner or spouse?

Trouble with public speaking?

The answers to all of the questions above start with learning and end with consistency. If you make learning a habit and are consistent with it, there isn’t anything you can’t accomplish. Alright, enough with selling you on learning! I want to share what I’m learning with you and to help motivate myself TO KEEP LEARNING. The way I plan to accomplish this is with monthly learning reports.

30 Minutes of Learning Everydayish…

For a long time I’ve advocated for the idea of taking 30 minutes every day to learn something new. I’ll go through periods where I hit that target every day, then others where I fall behind. While it’s only a target and nothing to beat myself up about, I find it a useful technique when learning any concept. The way I do my 30 minutes of learning is to set a timer on my phone and focus only on that topic until it goes off. Recently, in order to track this habit, I’ve been using the Super Habit App. Below you can see how I’ve done over the last month. Honestly not my best work, but let’s see how it improves over time.

What I'm Learning 30 Minutes

Pluralsight

My 30 Minutes of Learning mainly comes from Pluralsight courses. Not only am I an Author, but I’m also a student. Pluralsight was a part of my personal learning path long before I was an Author. Back when I was a fresh new Web Developer, I used Pluralsight to learn C# and the ASP.NET frameworks. Of course, I also dove into the world of JavaScript, jQuery, and other JS frameworks. Nowadays I still learn with Pluralsight, but the content is more Data Engineering and IT Ops focused.

Docker Big Picture Course

Docker seems to be taking over the world. In fact, contributions to and adoption of Docker and Kubernetes have far outpaced those of Hadoop. So many new applications and services offer a containerized version. In the Hadoop 3.0 release, multiple features were aimed at adding support for containers. All this container talk has pushed me to learn this amazing Platform-as-a-Service for OS-level virtualization. My guide on this journey is the great Nigel Poulton. Check out my notes from the Docker Big Picture Course:

  • Kubernetes originated out of Google (shocking I know)
  • Kubernetes is Greek for helmsman or Captain
  • Web Playground for K8s
  • Web Playground for Docker
  • Docker Engine – daemon –> containerd –> OCI
  • Docker has both community and enterprise versions
  • Kubernetes = K8s

Docker Deep Dive Course

After working my way through the Docker Big Picture Course I decided to stay in the Docker world by watching the Docker Deep Dive Course. I loved this course where I was able to get hands on with Docker and learn a good bit beyond the basics. Here are a few notes I jotted down throughout this course:

  • Docker Commands like
    • List Containers – docker ps
    • Pull Image – docker image pull
    • Build Image – docker image build
    • Run Image – docker container run (docker exec runs a command inside a container that is already running)
  • Building Docker images is as easy as writing a Dockerfile, and Compose and stack files are written in YAML
  • YAML = YAML Ain’t Markup Language (originally Yet Another Markup Language)
  • Docker networking with the bridge driver on Linux or the NAT driver on Windows
  • Stacks and Services – code –> container –> services
  • Docker Universal Control Plane (UCP) is installed on top of Docker Engine
  • Docker Trusted Registry can be set up for storing all Docker images.
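
To keep the command notes above straight, here is a minimal Python sketch that maps each task to a Docker CLI invocation. It only prints the commands rather than running them, so it is safe to try without Docker installed. The helper name docker_command and the my-app:dev tag are hypothetical, made up for this example. Note that the sketch uses docker container run to start a container from an image, since docker exec only runs a command inside a container that is already running.

```python
import shlex


def docker_command(task: str, target: str = "") -> list:
    """Return the Docker CLI arguments for a task from the notes (hypothetical helper)."""
    commands = {
        "list": ["docker", "container", "ls"],  # long form of `docker ps`
        "pull": ["docker", "image", "pull", target],
        "build": ["docker", "image", "build", "-t", target, "."],  # build from ./Dockerfile
        "run": ["docker", "container", "run", "--rm", target],  # start a new container
    }
    return commands[task]


# Print each command instead of executing it; my-app:dev is a made-up tag.
for task, target in [("list", ""), ("pull", "alpine:latest"),
                     ("build", "my-app:dev"), ("run", "my-app:dev")]:
    print(shlex.join(docker_command(task, target)))
```

To actually run one of these, the returned list can be passed to subprocess.run on a machine where the Docker daemon is available.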

What I'm Learning

Data Related Podcast & Blogs

Data Engineering learning doesn’t only take place in courses. I also wanted to track some of the podcasts and articles I consumed throughout the month. The great thing about podcasts is that you can listen to them while commuting, working out, doing house chores, or just about anywhere. Here are a few podcasts and articles I’ve consumed over the past month.

  • Conversational AI Best Practices with Cathy Pearl and Jessica Dene Earley-Cha – GCP podcast that digs into the aspects of conversational AI. I loved how this episode explored where conversational AI is going and where to get started with NLP in GCP. It actually gave me some ideas for my October learning goals.
  • Microservices.io – I can’t even begin to summarize how much content is on this site. If you are looking to learn more about microservices (which you should!!), then bookmark this site and read the content over time.
  • Doctor AI – Dell Tech (full disclosure: #iworkforDell) podcast diving into different topics around AI. In this episode, host Jessica explores the possibilities of AI augmentation in the medical field. This is one of the areas I’ve spent a good bit of time researching and speaking about. Earlier this year I spoke with a group of Medical Doctors and Researchers at NYU about advances in AI.
  • Exploring AWS Lake Formation – AWS podcast with guests from around the AWS world. There is a lot of great content on this podcast. I listened to this particular episode while walking my son, so my attention wasn’t what it should have been. Mostly I remember that AWS Lake Formation is an AWS service that helps with cataloging and labeling data to support multiple services (MySQL, Redshift, S3).

On To Next Month

Thanks for supporting this new series. I’m excited to see how it matures over time, and I would love to get more consistent with my learning as well. If you have ideas for things I should be learning, or would like to share what you are learning, put them in the comments below. Right now my plan is to wrap up the Kubernetes Deep Dive course and then move on to Natural Language Processing (NLP). I’ve got ideas for some really cool projects in NLP, so it should be fun.

Filed Under: Article Tagged With: AI, AWS, Docker, GCP, K8s, Kubernetes, Learning

Will AI Replace Data Scientist?

July 12, 2019 by Thomas Henson


Will AI Take My Job?

Artificial Intelligence is disrupting many different industries, from transportation to healthcare. With any disruption, fear begins to pop up around how it will impact me! One question posed on Big Data Big Questions was whether AI will replace Data Scientists. We are truly in the early days of AI and Deep Learning, but let’s look forward to see if AI will be able to replace Data Scientists. Find out my thoughts on AI replacing Data Scientists by watching the video below.

Transcript – Will AI Replace Data Scientist?

Hi folks! Thomas Henson here with thomashenson.com. Today is another episode of Big Data Big Questions. Today’s question comes in from a viewer. If you have a question, put it in the comment section here below, or you can reach out to me on thomashenson.com/big-questions. I’ll do my best to answer your question. This one came in around, “Hey, you know, is software going to replace data science?”

Whenever I think about software, specifically we’re probably talking about artificial intelligence. Artificial intelligence, or machine learning, or deep learning, or any of those models, are we going to be able to build models that can replace the data scientist?

This is a common theme, if you go out and Google anything right now, you can see, “Will AI replace lawyers?” Will AI replace doctors? All kinds of different things. Unequivocally, I think the short answer is no, but I’m going to talk about what I think are some of the reasons that I don’t think that AI is going to replace data scientists. Also, at the end, I’m going to give you some industry experts on what they think and what they’ve said about that whole concept.

Let’s jump in. Let’s talk a little bit about what a data scientist is, and then, talk about how we would even begin to look at how AI would replace that. Remember before, when we talked about data scientists in the past. These are the types of people that are trying to work on finding data that can build a model that might be able to predict an outcome. If we can predict the outcome, then maybe we can do something prescriptive. Hey, this is what’s going to happen, so let’s do this portion here after something happens. Think of if you’re creating, building a model to detect insider threats. You want to be able to decide, “Okay, does this user, maybe they’re potentially an insider threat.” Once you’ve identified that, maybe you can drop their access. Be prescriptive [Inaudible 00:02:04] it. Drop access that they have to certain directories, certain folders, and then also alert security.

We’re wanting to be able to build applications or models like that, that can be able to help. Can artificial intelligence do all that, kind of take the data scientist out? I don’t believe that’s the case. That’s very, very hard. If we really look at AI, and what’s going on right now, any time you hear the word AI, replace that with automation, and you’re like, “Okay, now I understand what’s going on.” Really, we’re not at the point where we’re actually building these super intelligent systems, kind of like what you see in Hollywood. I’m going to give you three different reasons around why I think that AI is not going to replace or software is not going to replace data scientists.

The first thing is, when we think about it, artificial intelligence has been around for quite some time. The term has, we’re getting better with our models. If you listen and read some of the books that I’ve read, we’re in that implementation phase where we’re putting these things out there. If you really look at it, even in the past, when we talk about the world’s best chess player versus artificial intelligence, we got to a point in the late ’90s where the world’s best chess player could win, or I’m sorry, the machine would beat the world’s best chess player. However, if you took a medium machine or artificial intelligence that was pretty good at chess, you paired it with a pretty good or an advanced human chess player, they could beat the world’s best machine learning model, or deep learning, or AI chess player. Same thing. What we’re doing, I think, the tools and the skills that you’re seeing being implemented for data scientists are about how we can help, right? What are the types of tools that can help us identify quickly maybe some complex algorithms that would work? Should I use a Generative Adversarial Network here? Should I use a convolutional neural network, or different types of things there?

Same thing that we’re seeing in the medical industry. Doctors aren’t going to be taken out of the loop, but doctors are going to be given maybe a voice assistant that you can prescribe and give the different, these are some of the symptoms that we’re seeing. What are some of the latest journal articles, and giving a summary to that, versus your data scientist or your medical, somebody in the medical field, they’re having to go out, and there’s always research, and research papers that they could be reading, and could be intaking, same thing here. You’re going to have assistants as a data scientist, to be able to say, “Look, what are…?” Run some stats on this, and let’s see what models might be good indicators here. I’m still in the loop. I’m still deciding what we’re going to do from that model, but it’s going to help me streamline and get faster, what we’re doing.

Number two, really simple, just go out there and look at the talent gap. We’re still looking for data scientists. That’s, go do a Google search, and you’re finding that there’s a ton of different open job applicants. If you go to any kind of symposium. There was a symposium over at Georgia Tech. One of the people from Google there was talking, and they were like, “Hey, man, I will take every PhD or even Master’s level candidate you have around data science and statistics,” and everything like that. There’s still a huge, huge talent gap there, and I don’t think it’s going to be cured by AI. Like I said, I think it’s going to be about automating, and then maybe AI can help us to train better humans that can fill those roles, but I think that’s another indication that, man, I don’t even know that we’re at our peak in data science. Just from a hype cycle perspective, either.

Number three, the industry experts. If you look at Andrew Ng, you look at Kai [Inaudible 00:05:41], you look at what their predictions are, data science is in one of those quadrants where it’s like, “Hey. It’s not a simple task that can be repetitive.” You’ve all seen the videos where it’s like, hey, robots, and AI can help on assembly lines. It’s a controlled environment. Data science is not controlled. It’s out there. It’s in the wild, and you’re having to, “This model,” or even ETL. We can’t even fix ETL. We’re still having to rely on human beings to help and automate, and make sure that we’re curating the right data sets, too. We’re still not at that point, and even if we do get to that point from an ETL perspective, still going to have to have data scientists. No, AI will not replace data scientists in the near future. All that’s subject to change. There could be advances in technology in 10 years that I don’t foresee. I’m not a futurist yet. Maybe, I don’t know. I don’t have enough education, I guess, or understanding to be that. If you have any questions, put them in the comment section below. Make sure you subscribe, so that you never miss an episode of Big Data Big Questions. Ring that bell. Until next time, see you again. Big Data Big Questions.

Filed Under: Career Tagged With: AI, Data Science, Deep Learning

Where Were You When Artificial Intelligence Transformed the Enterprise?

June 10, 2019 by Thomas Henson

This post first appeared on Dell EMC as “Where Were You When Artificial Intelligence Transformed the Enterprise”.

Where were you when artificial intelligence (AI) came online? Remember that science fiction movie where AI takes over in a near-future dystopia? The plot revolves around a crazy scientist who accidentally puts AI online, only to realize the mistake too late. Soon the machines become the humans’ overlords. While these science fiction scenarios are entertaining, they really just stoke fear and add to the confusion about AI. What enterprises should be worried about regarding AI is understanding how their competition is embracing it to get a leg up.

Where were you when your competition put Artificial Intelligence online?

Artificial Intelligence Transformed the Enterprise

Artificial Intelligence in the Enterprise

Implementations of artificial intelligence with Natural Language Processing are changing the way enterprises interact with customers and conduct customer calls. Organizations are also embracing another form of artificial intelligence called computer vision, which is changing the way doctors read MRIs and is reshaping the transportation industry. It’s clear that artificial intelligence and deep learning are making an impact in the enterprise. If you are feeling behind, no problem. Let’s walk through three strategies enterprises are embracing for implementing AI in their organizations.

Key Strategies for Enterprise AI

The first key to bringing AI into your organization is to define an AI strategy. Jack Welch said it best: “In reality strategy is actually very straightforward. You pick a general direction and implement like hell.” Designing a strategy starts with understanding the business value that AI will bring to the enterprise. For example, a hospital might have an AI initiative to reduce the time it takes to recognize, from CT scans, patients experiencing a stroke. Reducing that time by minutes or hours could help get critical care to patients and bring about better patient outcomes. By narrowing and defining a strategy, Data Scientists and Data Engineers have a goal to focus on achieving.

Once you have a strategy in mind, the most important factor in the success of artificial intelligence projects is the data. Successful AI models cannot be built without it. Data is an organization’s number one competitive advantage. In fact, AI and deep learning love big data. An artificial intelligence model that helps detect Parkinson’s disease must be trained with considerable amounts of data. If data is the most critical factor, then architecting proper data pipelines is paramount. Enterprises must embrace scale-out architectures that break down data silos and provide the flexibility to expand based on the performance needs of the workload. Only with scale-out architectures can Data Engineers help unlock the potential in data.

After ensuring data pipelines are architected with a scale-out solution, it is time to fail quickly. YES! Data Scientists and Data Engineers have permission to fail, but in a smart fashion. Successful Data Science teams embracing AI have learned how to fail quickly. Leveraging GPU processing allows Data Scientists to build AI models faster than at any time in human history. To speed up the development process through failures, solutions should incorporate GPUs or accelerated compute. Not every model ends in success, but each one leads Data Scientists closer to the solution. Ever watched a small child who is first learning how to walk? Learning to walk is a natural practice of trial and error. If the child waits until she has all the information and the perfect environment, she may never learn to walk. However, that child doesn’t learn to walk on a balance beam; she starts in a controlled environment where she can fail. A Data Science team’s start in AI should take the same approach, embracing trial and error while capturing data from failures and successes to iterate into the next cycle quickly.
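
As a rough illustration of failing quickly while capturing data from every attempt, here is a minimal Python sketch of a trial loop. Everything in it is an assumption for the example: run_trial is a toy stand-in for training a real model, the learning-rate range and 0.9 target are arbitrary, and simple random search stands in for whatever search strategy a team actually uses.

```python
import random


def run_trial(learning_rate: float) -> float:
    """Hypothetical stand-in for training a model; returns a validation score."""
    # Toy objective that peaks at learning_rate = 0.01; a real trial would train a model.
    return 1.0 - abs(learning_rate - 0.01) * 10


def iterate(n_trials: int, target: float, seed: int = 0) -> list:
    """Run random-search trials and record every one, failures included."""
    rng = random.Random(seed)
    history = []
    for trial in range(n_trials):
        learning_rate = rng.uniform(0.001, 0.1)
        score = run_trial(learning_rate)
        history.append({
            "trial": trial,
            "learning_rate": learning_rate,
            "score": score,
            "success": score >= target,
        })
    return history


history = iterate(n_trials=20, target=0.9)
best = max(history, key=lambda t: t["score"])
failures = [t for t in history if not t["success"]]
print(f"{len(failures)} of {len(history)} trials missed the target; "
      f"best score {best['score']:.3f}")
```

The point of keeping the failed trials in history is that the next cycle can narrow the search range based on what did not work, rather than starting from scratch.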


Filed Under: Business Tagged With: AI, Business, Enterprise
