In this episode of Big Data Big Questions we explain the basics of the Splunk architecture. Splunk is a hot solution in the world of Big Data, and many Data Engineers are eager to learn how to use Splunk to analyze machine data. One of the first things you want to understand is the three basic building blocks of the Splunk architecture:
- Forwarder – Helps move data or log files from devices, the edge, IoT, or anything else into other Splunk instances.
- Indexer – Adds searchable order to data coming into Splunk instances.
- Search Head – Allows data to be searched in Splunk by Data Engineers, Splunk Users, and Splunk Architects.
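To make the hand-off between the three roles concrete, here is a toy, in-memory Python sketch. It is not Splunk code and does not use any Splunk API; the class and method names (`Forwarder`, `Indexer`, `SearchHead`, `send`, `receive`, `search`) are hypothetical stand-ins chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Indexer:
    """Toy indexer: stores the events it receives (hypothetical, not Splunk's API)."""
    events: list = field(default_factory=list)

    def receive(self, host: str, line: str) -> None:
        self.events.append({"host": host, "raw": line})

    def search(self, term: str) -> list:
        return [e for e in self.events if term in e["raw"]]


@dataclass
class Forwarder:
    """Toy forwarder: sits next to the data source and ships lines onward."""
    host: str
    indexer: Indexer

    def send(self, line: str) -> None:
        self.indexer.receive(self.host, line)


@dataclass
class SearchHead:
    """Toy search head: sends one query to every indexer and merges the results."""
    indexers: list

    def search(self, term: str) -> list:
        results = []
        for indexer in self.indexers:
            results.extend(indexer.search(term))
        return results


idx_a, idx_b = Indexer(), Indexer()
Forwarder("phone-01", idx_a).send("login failed for user bob")
Forwarder("web-01", idx_b).send("login ok for user alice")
Forwarder("web-01", idx_b).send("disk usage at 91%")

head = SearchHead([idx_a, idx_b])
hits = head.search("login")
print([h["host"] for h in hits])
```

The user only ever talks to the search head; the forwarders and indexers stay behind it, which is the separation the three bullets above describe.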
Learn more about Splunk Architecture by watching the video below.
Transcript – Explaining Splunk Architecture Basics
Hi folks! Thomas Henson here with thomashenson.com. Today is another episode of Big Data Big Questions. Today, we’ve got a good topic coming in. Something we’ve talked about a little bit before. We’re going to talk a little bit about Splunk. Today’s question, just remember, if you want your question answered here on Big Data Big Questions, put it in the comment section below. Find me on YouTube. Wait, we’re already on YouTube. Find me on Twitter. Find me on Instagram. Just put it in the comments section here below. Reach out, and I will do my best to answer those questions. Today’s question comes in, and we’re talking around Splunk.
What are the basics of Splunk architecture? Really, just wanted to key off of that, and talk a little bit. We’re going to break it down by three different pieces, but the first thing we need to know is, we need to know what Splunk is. Splunk, if you’ve been watching this, is one of those tools that’s out there, that allows for you to take machine generated data and be able to analyze it. My joke is, if you can create tables, and pivot tables in Excel, then you can easily start ingesting and start looking at and visualizing your data in Splunk. Think about it, it started off in IT operations. Being able to take in, whether it be log files, whether it be system files, whether it be people trying to break into your network. Anything that’s going on from your network traffic perspective or logins.
All those different log files from all these different machines, being able to put them in one place, be able to index them, and be able to view them. Splunk has been an amazing tool for that. Like I said, Easy Button. They coined the phrase Easy Button For Machine Data. Pretty cool. Anything machine generated, they’ve been into, but they’re also into IT security. Really, if you think about big data, you’re talking Splunk. IoT is one of their other big key features and focal points, too.
Let’s talk about those three basic architecture features. We’re going to break it down. The first thing you need to know, if you’re looking to be able to talk Splunk and know what the Splunk architecture is made up of, the first thing is forwarders. What forwarders are is, think of this as a way to, you’ve got a machine running on the edge. You’ve got a machine running in your data center. You’ve got one running in the cloud. Anywhere you have a machine or have any kind of device that you want to get data back from, there’s something called a Splunk forwarder. The forwarder is that first key. What that’s going to do is, that’s a very, very small file that’s running or very small application that’s running on that device, that machine, whatever it is, and it’s just forwarding whatever the information is. You’re looking to forward log files. You’re forwarding log files. Say that you have a phone. You’re forwarding log files from a game or from an application on your phone. You’re going to use a forwarder to send that data off. First thing is, learn what a forwarder is. We’re going to be able to run a small application and send data to our Splunk environment.
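The forwarder idea, a small program that watches a log file and ships whatever is new, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how the Splunk forwarder is implemented: `read_new_lines` and `forward` are hypothetical helper names, and `send` stands in for whatever network call would deliver the data to the Splunk environment.

```python
import os
import tempfile


def read_new_lines(path: str, offset: int) -> tuple[list[str], int]:
    """Return lines appended to `path` since byte `offset`, plus the new offset."""
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read()
    # Only ship complete lines; a trailing partial line waits for the next pass.
    end = data.rfind(b"\n") + 1
    lines = data[:end].decode("utf-8").splitlines()
    return lines, offset + end


def forward(lines: list[str], send) -> None:
    """Hand each line to `send`, a hypothetical stand-in for a network call."""
    for line in lines:
        send(line)


# Demo: a temporary log file, and a plain list acting as the receiving side.
received = []
with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "app.log")
    with open(log_path, "w") as f:
        f.write("game started\nlevel 1 complete\n")
    lines, offset = read_new_lines(log_path, 0)
    forward(lines, received.append)

    # The application keeps writing; the next pass picks up only the new line.
    with open(log_path, "a") as f:
        f.write("level 2 complete\n")
    lines, offset = read_new_lines(log_path, offset)
    forward(lines, received.append)

print(received)
```

Remembering the byte offset between passes is what keeps the sketch small: it never re-sends lines it has already forwarded.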
Number two, the next piece, building block for Splunk architecture, is going to be our indexer. What the indexer is, it’s going to take that data. We’re forwarding those files, it’s forwarding that data to the indexer. What the indexer’s going to do is, it’s going to put a timestamp on it, put some other information, but it’s basically the indexer’s going to say, hey, this is how we’re going to look for this file. We’re probably talking about millions and millions of files. Think about it as being able to index it, if you’re familiar with databases. You definitely understand. If you’re a data engineer in the big data world, on Hadoop, you understand how indexes work and how you can use indexes to be able to search your data a lot quicker. The second portion, just to recap, is our indexer.
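Why an index makes searching quicker can be shown with a toy Python example: each event gets a timestamp, and a token lookup table points straight at the matching events so a search does not have to scan everything. This is a sketch under assumptions, not a description of Splunk's internal index format; `TinyIndex` is a hypothetical name, and the line layout (`<ISO timestamp> <message>`) is assumed for the demo.

```python
from collections import defaultdict
from datetime import datetime


class TinyIndex:
    """Toy index: timestamped events plus a token -> event-id lookup."""

    def __init__(self):
        self.events = []                  # an event's id is its list position
        self.by_token = defaultdict(set)  # token -> ids of events containing it

    def add(self, raw: str) -> None:
        # Assumed line layout: "<ISO timestamp> <message>".
        stamp, _, message = raw.partition(" ")
        event_id = len(self.events)
        self.events.append({"time": datetime.fromisoformat(stamp), "raw": message})
        for token in message.lower().split():
            self.by_token[token].add(event_id)

    def find(self, token: str) -> list:
        # Jump straight to the matching ids instead of scanning every event.
        ids = sorted(self.by_token.get(token.lower(), set()))
        return [self.events[i] for i in ids]


index = TinyIndex()
index.add("2024-01-05T10:00:00 login failed user=bob")
index.add("2024-01-05T10:00:07 login ok user=alice")
index.add("2024-01-05T10:01:30 disk usage high")

failed = index.find("failed")
print(len(index.find("login")), failed[0]["time"].isoformat())
```

With millions of events, the lookup table is the difference between touching a handful of entries and reading every line again for each search.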
Now that we’ve got our data indexed, it’s time to move on to the next phase. In the next phase, we’re talking about number three. That’s going to be our search head. Our search head is how we can visualize and how we can start looking, and querying our data. Think about it. We’ve got our data that’s been forwarded from our phone. We’ve got our application file that’s coming off of a mobile device, being pushed into our indexer. Our indexer says, “Hey, you know, here’s a timestamp for it. Here’s some other information that we’re pulling into it.” Now, me, the user, comes in and says, “Hey, I want to index that data,” or, “I want to search that data,” and so, I’m interacting with a search head that’s going to go out, and going to find that data, and going to be able to help with our queries. But, also help whenever we’re using our queries to build out dashboards, or some amazing tables that are going to help us visualize our data. Those are the three basic building blocks when we’re talking about Splunk architecture. You have your forwarder, you have your indexer, and you have your search head, and there’s a lot of different ways that you can configure those, and there’s a lot of different ways that you can architect those. Those are the basic building blocks that you’re going to use if you’re talking about the Splunk architecture.
If you’d like to learn more, I’ve got a couple Pluralsight courses out there. One called Analyzing Machine Data With Splunk, and then also another one that’s building on the Splunk learning path for Pluralsight. That’s [Inaudible 00:05:07] configuring Splunk, with other courses coming and showing you how to visualize that data, how to search that data, how to set up alerts. A lot of different information, so if you’re curious about that, there are some resources for it, but there’s a ton out there as well. Splunk has great documentation.
There’s other courses and other things out on YouTube that you can find, that will help you learn more about Splunk. If you’re interested in Splunk, and interested in being able to use a tool like Splunk to visualize whether it be machine-generated data or IoT, especially if you’re trying to get into the more security path, then Splunk is a great tool for that. A lot of information out there. Hope you found this video very informative. If you have any questions or have any ideas for the show, put them in the comments section here below, but also make sure that you’re subscribed and you ring that bell, so that you never miss an episode of Big Data Big Questions.
Should get a sponsorship about water. Does anybody know who the agent is for water? Eh. Maybe get some kind of sponsorship. Hey man, you know? There’s those milk ads, right? Who knows?