Thomas Henson

Archives for May 2017

Big Data Big Questions: Do I need to know Java to become a Big Data Developer?

May 31, 2017 by Thomas Henson 1 Comment

Today there are so many applications and frameworks in the Hadoop ecosystem, most of which are written in Java. So does this mean anyone wanting to become a Hadoop developer or Big Data Developer must learn Java? Should you go through hours and weeks of training to learn Java to become an awesome Hadoop Ninja or Big Data Developer? Will not knowing Java hinder your Big Data career? Watch this video and find out.

Transcript Of The Video

Thomas Henson:

Hi, I’m Thomas Henson with thomashenson.com. Today, we’re starting a new series called “Big Data, Big Questions.” This is a series where I’m going to answer questions, all from the community, all about big data. So, feel free to submit your questions, and at the end of this episode, I’ll show you how. So, today, the first question I have is a very common question. A lot of people ask, “Do you need to know Java in order to be a big data developer?” Find out the answer, right after this.

So, do you need to know Java in order to be a big data developer? The simple answer is no. Maybe that was the case in early Hadoop 1.0, but even then, there were a lot of tools that were being created like Pig, and Hive, and HBase, that are all using different syntax so that you can extrapolate and kind of abstract away Java. Because the key is, if you’re a data analyst or a Hadoop administrator, most of those people aren’t going to have Java skills. So, for the community to really move forward with this big data and Hadoop, we needed to be able to say that it was a tool that not only Java developers were going to be able to use. So, that’s where Pig, and Hive, and a lot of those other tools came. Now, as we start to look into Hadoop 2.0 and Hadoop 3.0, it’s really not the case.

Now, Java is not going to hinder you, right? So, it’s going to be beneficial if you do know it, but I don’t think it’s something that you would want to run out and have to learn just to be able to become a big data developer. Then, the question is, too, when you say big data developer, what are we really talking about? So, are we talking about somebody that’s writing MapReduce jobs or writing Spark jobs? That’s where we look at it as a big data developer. Or, are we talking about maybe a data scientist, where a data scientist is probably using more like R, and Python, and some of those skills, to pull their insights back? Then, of course, your Hadoop administrators, they don’t need to know Java. It’s beneficial if they know Linux and some of the other pieces, but Java’s not really necessary.

Now, I will say, in a lot of this technology… So, if you look at getting out of the Hadoop world but start looking at Spark – Spark has Java, so you can write your Spark jobs in Java, but you can also do it in Python and Scala. So, it’s not a requirement for people to have Java. I would say that there’s a lot of developers out there that are big data developers that don’t have any Java skills, and that’s quite okay. So, don’t let that hinder you. Jump in, join an open-source community project, do something to expand your big data knowledge and become a big data developer.

Well, that’s all we have today. Make sure to submit your questions. So, I’ve got a space on my blog where you can submit the questions or just submit them here, in the comments section, and I’ll answer your big data big questions. See you again!

 

Filed Under: Big Data Tagged With: Big Data, Big Data Big Questions, Hadoop, Learn Hadoop

Complete Guide to Splunk Add-Ons

May 24, 2017 by Thomas Henson Leave a Comment

Splunk is a popular application for analyzing machine data in the data center. What happens when Splunk Administrators want to add new data sources to their Splunk environment outside the default list?

The Administrators have two options:

  • First, they can import the data source using the regular expression option, which is only fun if you like regular expressions.
  • Second, they can use a Splunk Add-On or Application.
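To give a feel for what the regular expression route involves, here is a minimal sketch in plain Python rather than in Splunk itself. The log line, the field names, and the pattern are all hypothetical. Note that Python writes named groups as (?P<name>...), so the exact syntax may differ from what your Splunk version expects.

```python
import re

# Hypothetical log line from a data source Splunk does not parse by default.
LINE = "2017-05-24 10:15:32 host=web01 status=503 latency_ms=842"

# One named group per field we want to pull out of the raw event.
PATTERN = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"host=(?P<host>\S+) "
    r"status=(?P<status>\d{3}) "
    r"latency_ms=(?P<latency_ms>\d+)$"
)


def extract_fields(line):
    """Return a dict of extracted fields, or None if the line does not match."""
    match = PATTERN.match(line)
    return match.groupdict() if match else None


if __name__ == "__main__":
    print(extract_fields(LINE))
```

Every new log format needs its own pattern like this one, which is why an Add-On that already knows the format is usually the more pleasant option.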

Let’s learn how Splunk Add-Ons are developed and how to install them.

How to Create Splunk Plugins

Developers have several options for creating Splunk Applications or Add-Ons. Let’s step through them from easiest to hardest.

The first option for creating a Splunk Add-On is the dashboard editor inside the Splunk app. Using the dashboard editor, you can create custom visualizations of your Splunk data. Simply click to add custom searches, tables, and fields. Next, save the dashboard and test out the Splunk Application.

The second option is to use XML or HTML markup inside the Splunk dashboard. Using either markup language gives developers more flexibility over the look and feel of their dashboards. Most developers with basic HTML, CSS, and XML skills will choose this option over the standard dashboard editor.

The last option inside the local Splunk environment is SplunkJS. Of all the options for creating applications in the local Splunk environment, SplunkJS allows the greatest control. Developers with intermediate JavaScript skills will find SplunkJS fairly easy to use, while those without JavaScript skills will have a more difficult time.

Finally, for developers who want the most control and flexibility for their Splunk Add-Ons, Splunk offers Application SDK options. These applications leverage the Splunk API and allow developers to write the application in their favorite language. Using the SDK is by far the most difficult option, but it also creates the ultimate Splunk Application.

Splunk Application SDK options:

  • JavaScript
  • C#
  • Python
  • Java
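Since the SDKs sit on top of the Splunk API, it can help to see roughly what one raw API call looks like. The sketch below uses only the Python standard library and builds a request to create a search job, without sending it. The host, port 8089, and session key are placeholders, and build_search_request is a hypothetical helper, not part of any Splunk SDK. Treat the endpoint and header format as assumptions to verify against the REST API reference for your Splunk version.

```python
import urllib.parse
import urllib.request


def build_search_request(base_url, query, session_key):
    """Build (but do not send) a POST request that would create a search job.

    base_url    -- assumed management URL, e.g. "https://localhost:8089"
    query       -- search string; "search " is prepended if it is missing
    session_key -- placeholder for a key obtained from a prior login
    """
    if not query.lstrip().startswith(("search", "|")):
        query = "search " + query
    body = urllib.parse.urlencode(
        {"search": query, "output_mode": "json"}
    ).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/services/search/jobs",
        data=body,
        headers={"Authorization": "Splunk " + session_key},
        method="POST",
    )


if __name__ == "__main__":
    request = build_search_request(
        "https://localhost:8089", "index=main error", "PLACEHOLDER_KEY"
    )
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```

An SDK hides this plumbing (authentication, URL building, response parsing) behind language-friendly objects, which is where the extra control and the extra effort both come from.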

What is Splunkbase

After developers create their applications, they can upload them to Splunkbase. Splunkbase is the de facto marketplace for Splunk Add-Ons and Applications. It’s a community-driven marketplace for both licensed (paid) and non-licensed (free) Splunk Add-Ons and Applications. Splunk certification is meant to ensure that an application is secure and mature.

Think of Splunkbase as Apple’s App Store. Users download applications that run on top of iOS to extend the functionality of the iPhone. Both community and corporate developers build iOS apps. Just like the App Store, Splunkbase offers both paid and free Applications and Add-Ons.

How to install from Splunkbase

The local Splunk environment integrates with Splunkbase, which makes Splunkbase installs seamless. Let’s walk through a scenario: installing Splunk Analytics for Hadoop in my local Splunk environment.

Steps for Installing App from Splunkbase:

  1. First, log into the local Splunk environment
  2. Second, click Splunk Apps
  3. Next, browse for “Splunk Analytics for Hadoop”
  4. Click Install and enter your login information
  5. Finally, view the App to begin using it

Another option is to install directly to the local Splunk environment. Simply download the application package and upload it to the local Splunk environment. Make sure to practice good Splunk hygiene by only downloading trusted Splunk Apps.
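One simple hygiene habit for manual downloads is comparing the package’s checksum with the one published by its source before uploading it. The sketch below assumes the download page lists a SHA-256 checksum. The function names are hypothetical and nothing here is Splunk-specific.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path, published_hex):
    """True if the file's digest equals the checksum copied from the download page."""
    return sha256_of_file(path) == published_hex.strip().lower()
```

A matching checksum only shows the file arrived intact from that source. It does not replace judging whether the source itself is trustworthy.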

Closing thoughts on Splunk Apps & Add-Ons

In addition to extending Splunk, Add-Ons increase the Splunk environment’s use cases. As users begin using Splunk, they want to add new data sources. Often the new data sources are supported by default; when they aren’t, Splunk’s community of App developers fills that gap. Splunk’s hockey-stick adoption comes from the ability to add new data sources. New insights are constantly pushing new data sources into Splunk.

Looking to learn more about Splunk? Check out my Pluralsight course, Analyzing Machine Data with Splunk.

 

Filed Under: Splunk Tagged With: Data Analytics, IT Operations, Splunk

Isilon Quick Tips: Deep Dive FTP

May 17, 2017 by Thomas Henson Leave a Comment