Thomas Henson

GDPR Good or Bad?

June 7, 2018 by Thomas Henson

Is GDPR Good or Bad?

How many emails have you received about GDPR? At this point I almost have to set a rule in Outlook to send all emails with the word “GDPR” in them to a separate folder. I’ve explained what GDPR is and how it applies to Data Engineers, but is it good or bad? Generally, regulations are put in place to make society better, but does Big Data need regulation? Find out my thoughts on the policies put in place with GDPR in the video below.

Transcript – GDPR Good Or Bad?

Hi folks! Thomas Henson here, with thomashenson.com. Today is another episode of Big Data Big Questions. Today, I’m going to jump back in a little bit more around GDPR. I’ve had a lot of questions and seen a lot of things on Twitter, and I just thought it would be a great time to discuss, is GDPR good or bad?

This is not going to be about politics. It’s going to be about policy and what’s really driving GDPR. What does it really mean? Is it a good thing for those of us who are involved in big data, and for consumers? Find out more, right after this.

[Sound effects]

Welcome back. This is the second episode where we’re going to talk about GDPR. If you’re curious about what GDPR is and what it means to data engineers, make sure you check out the video that I did before just talking about, what does GDPR mean to data engineers, machine learning engineers, or data scientists?

I really wanted to focus this time on, we’ve talked about what it is, but what does it really mean? Is it a good thing? I’ve gotten a ton of emails just on my personal stuff, from people who’ve built websites for me, from Hortonworks and Cloudera, and everybody’s kind of talking about, what does GDPR mean to us? Every time you turn around right now, you’re going to have to update some kind of policy, whether it be from Apple on your iPhone, or from Android, or anybody that’s collecting or holding onto your data. All those privacy updates are going on, and you’re going to have to click yes on each one of them.

Yes, I understand that you’re going to protect my data, and it’s going to be more private. Is that a good thing or is it a bad thing? Is it okay for us to have regulation around it?

I look at it from this perspective. I was thinking about it, and it’s like, if you really think about where we’re going, there’s regulation for everything, for most products, as they get big. What this really means to me, and why I think it’s a good thing, is that it shows that your digital data is growing up. It’s maturing. When you think about it, in America, when cars first came out, we didn’t really have regulations around them. You didn’t have to get a license. It was just something fun that you could do, and if you could afford a car, you could get it. As that product started maturing, we started realizing, “Hey, this is something that needs to be regulated to some extent.”

We need to have some kind of standards around who’s going to drive on what side of the road, and how all that’s going to work. If you think about digital data, we’re getting to that point. There are a couple of reasons why we’re at that point. The first thing is, privacy matters. Privacy’s always kind of mattered, and people really pay attention to being able to be private and have those things. For a long time, data has not been one of them.

We have regulations and laws around whether people can go into private residences without consent and things like that. Your data, it’s the same way, and that’s where we’re starting to look at it and say, “Okay, that data, you have rights to it. It’s yours. You created it, so your privacy does matter.” That’s where the regulations are coming from. Also a big thing is, think about how many different data breaches we have.

For a long time, if you follow Troy Hunt, or anybody that’s big in security, you could see at least weekly, they’re talking about a huge data breach that happened. That’s compounded with trying to figure out, “Okay, if you’re collecting this data, how much of a liability is that for you, and then how much of a responsibility is it of yours if the data becomes breached?” Are there certain standards that you should have to follow to be able to better protect that data, so that you can turn around and say, “Hey, we do have some bad actors out there, that have hacked and taken this data, but we went through these steps”?

There’s not really been a standard for what those steps are, and so this is a further implementation of it. There are a couple of reasons, actually, that I think it’s a good thing, right? Not talking politics here, just why I think GDPR’s good.

It gives you back control of your data. It gives you the opportunity to say, “Hey, I would like for you to report what data you have on me. What does my digital footprint look like? What kind of data are you collecting on me?” You have the right to ask for that and to be able to get an answer to that.

Secondly, you can say, “Hey, I want to drop off. I want all my data gone. I don’t want you to collect and hold onto my data.” I think this is a big point, because while I’m on Facebook, and I’ve been a Facebook user for I don’t know how long, just a long time, I’ve heard of other people and other stories around people who’ve gone off Facebook. You’ve probably not seen them. They’ve deleted their profile, only to come back a year or two later, and all their stuff’s still there.

I can’t say that I’ve seen that happen for me, but I’ve heard a ton of stories, where I know that there must be some sort of truth to that. This is an opportunity where, if you do want to get off the grid, so it’s like, “Hey, you know, it’s 2018, I’m going to get off the grid,” this gives you the opportunity. That’s another reason why I think it’s really good. It puts you in control of your data and lets you decide.

Also, it’s going to create a framework for companies to have a standard around how they’re going to protect that data. It’s going to protect companies and organizations that collect data by having a set of standards that we’re able to follow, to say, “Hey, we’re doing as much as we can to be able to protect, and make sure that, your data, when it comes in, is as secure as can be.” This gives us the opportunity to start setting those standards and testing it. Maybe we won’t have as many data breaches in the future.

Maybe we can trust and understand that, while there are bad actors out there, maybe there will be less involvement from the hackers, because it really puts the onus on the people who collect the data. We had some of that before, but a lot of it has been, I would say, public perception. You want your public perception to be okay. How much legal bearing is there really going to be on companies if that data is exposed, or data is breached? Now, this gives us the framework to say, “Hey, there are regulations, and we are saying that, you know, this is something that you need to protect.”

Those were just my thoughts on it. I’d love to hear your thoughts. If anybody has any opposing views or anything like that, put them in the comments section here below or just reach out and ask. Let’s jump on YouTube, and let’s record a video, and maybe talk about it a little bit more. Let me know where I’m wrong, but those are my thoughts. I think, in general, GDPR is good. I think there’s going to be a lot of opportunities around products and around people with that expertise, so if you’re looking to get involved in big data, and you like looking into and following regulations, and putting security measures in place, then I think GDPR is a good place to go.

I think there are going to be a lot of companies that are going to make products. There are going to be products out there that help with GDPR compliance, because May 25th, 2018 is coming. I don’t know that everybody’s going to be ready. Until next time.

Filed Under: Business Tagged With: GDPR

Big Data Impact of GDPR

May 7, 2018 by Thomas Henson

How Does GDPR Impact Data Engineers?

The General Data Protection Regulation (GDPR) goes into effect in May 2018. Many organizations are scrambling to understand how to implement these regulations. In this video we will be discussing the big data impact of GDPR.

Transcript – Big Data Impact of GDPR

Hi, folks! Thomas Henson here, with thomashenson.com, and today is another episode of Big Data Big Questions. Today is a very special episode. We’re going to talk a little bit more about regulation than we’ve probably talked about before.

We’re going to tackle the GDPR and what that means for big data, big data analytics, and why data engineers and even data scientists should understand the regulation and know it at least from a high level. Find out more right after this.

[Sound effects]

Welcome back. Today is a special episode. We’re going to talk about the GDPR, which is the General Data Protection Regulation, and we’re going to talk about what that means for a data engineer, and why you should understand that.

Just to have a high-level overview, this is going to be one of those things where understanding this regulation is really going to help you. You’re going to have meetings about it. This is such a big change for our industry. If we think about it from an IT perspective or a big data perspective, think of changes that have happened in other industries.

Think of what happened in the US with the SEC in accounting, around Enron and some of the other financial accounting problems that happened in the early 2000s, and then also think about healthcare. If you know anything about healthcare regulation, you probably know at least the HIPAA requirement. This GDPR is going to be similar to that. Now, it’s taking place in the EU, but the ramifications are going to happen, I believe, everywhere, because one, data exists everywhere. Most companies are global companies, and the way that we handle and capture that data, whether it be from a user in the EU or a user anywhere else in the world, we’re going to have to have those regulations, and have those systems in place, so that we can comply with that.

Just from a high level, if you’re a data engineer, we focus on the technology and the hardware, but remember we’ve talked before about some of the non-technical careers. We talk about data governance in other places. If you’re interested in that, dive head first into the General Data Protection Regulation. Find out as much as you can, because that’s really going to make you, one, valuable in the meetings, but also if you’re looking to do a career change, maybe you’re already doing some kind of compliance or something like that, and you want to get involved in big data, here’s your opportunity. Become an expert at this, because we’re moving fast to have to comply.

Just to talk a little bit about it. It’s the EU regulation on how data is processed and stored. It’s a replacement for the Data Protection Directive 95/46/EC, so this is more stringent and more all-encompassing. You’re probably like, “Why are we going down this route? Why is a regulation coming out?” If you think about it, a lot of things have been happening over the last few years.

How often do we hear about a data breach? There was a huge one last year, right? Affecting millions and millions of users, people’s credit cards, people’s social security numbers. Our data is constantly under attack. From a big data perspective, we hold onto data so that we can analyze it and make better products, make more efficient products, make better websites, better clickthrough rates on your ads. There are so many different things that we do with this data, but also, there’s so much danger in having it.

We have to make sure that we’re protecting it, and then also, we want to make sure from a privacy perspective, and this is where this is really going to hit, that we’re allowing users to opt in or opt out. Knowing what’s being collected and how long they’re going to have it, and then also giving you the ability to say, “You know what? Let’s get rid of that data. I don’t want you to hold onto it.”
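
To make those user rights a little more concrete for the data engineers watching, here is a minimal sketch in Python. Everything in it (the `UserDataStore` class, its method names, the 30-day retention window) is hypothetical and made up for illustration. It is not from GDPR itself or from any Hadoop tool. It only shows the shape of the idea: collect data only on opt-in, let a user see what you hold, let them ask for it to be deleted, and drop data once it is older than your retention period.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone


@dataclass
class UserRecord:
    user_id: str
    collected_at: datetime
    data: dict = field(default_factory=dict)


class UserDataStore:
    """Hypothetical in-memory store; a real system would sit on a database."""

    def __init__(self, retention):
        self.retention = retention  # how long data may be kept (assumed policy)
        self._records = {}

    def collect(self, user_id, opted_in, data, now=None):
        # Opt in / opt out: nothing is stored unless the user agreed.
        if not opted_in:
            return False
        now = now or datetime.now(timezone.utc)
        self._records[user_id] = UserRecord(user_id, now, dict(data))
        return True

    def export(self, user_id):
        # "What data do you have on me?" Returns a copy of what is held.
        record = self._records.get(user_id)
        return dict(record.data) if record else {}

    def erase(self, user_id):
        # "Let's get rid of that data." True if something was deleted.
        return self._records.pop(user_id, None) is not None

    def purge_expired(self, now=None):
        # Drop everything older than the retention window.
        now = now or datetime.now(timezone.utc)
        expired = [uid for uid, rec in self._records.items()
                   if now - rec.collected_at > self.retention]
        for uid in expired:
            del self._records[uid]
        return expired
```

In a real pipeline the hard part is that a user's data usually lives in many places (raw files, derived tables, backups), so an erase request has to be tracked across all of them. That is where the governance tooling comes in.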

Those are some of the things that you’re going to be tackling with it. Also, just as a note, it was approved on April 14th, 2016, and must be complied with by May 25th, 2018. We’ve got some time, here, and that’s where I’m really encouraging people, even if you’re watching this video after the date, and you’re wanting to get into big data on the governance side, maybe you have non-technical career options, learn this. I’m serious. Just learn this. This is going to be huge. If you follow anything from Hortonworks, or Cloudera, or anybody involved in big data or even IT, you’re getting bombarded with information about it, because it’s such a big deal, and then the compliance on this, like I said, it’s industry-shifting, just like HIPAA was, and just like some of the SEC regulations and accounting regulations that came out in the early 2000s. If you’re looking for more, I’ve got the official site listed here, so you can see where to go from the EU and see it.

Like I said, you’re going to see a ton of blog posts. There’s a ton of resources out there. If you’re on the technical side, you’re wondering, okay, I’ve got to go into a meeting. If somebody’s going to ask me what we’re doing about some of the data governance, and some of the other pieces, where can I focus, or where can I say, “Hey, you know what? Give me a week or two. Let me look at some of the things”? Maybe there are things you weren’t doing, and maybe the way that you’re protecting the data is a little bit different.

Maybe the way that you’re tracking and holding onto the data, so that you can comply by getting rid of users’ data or opting not to track it, or even using a way to mask it, right? Using a way so that you can mask it, so you’re protecting the identities a little bit better. Maybe those are some of the weak points that you are…
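
As a rough illustration of what masking can look like in practice, here is a small Python sketch. The function names and the record layout (`user_id`, `email`, `clicks`) are assumptions made up for this example, not part of any of the tools mentioned in this video. It replaces the user identifier with a keyed hash (HMAC-SHA256 from Python's standard library), so the same user still maps to the same token for analytics, and it hides most of the email address.

```python
import hashlib
import hmac


def pseudonymize(value, secret_key):
    """Replace an identifier with a keyed hash (HMAC-SHA256, hex encoded)."""
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def mask_email(email):
    """Keep the first character and the domain, hide the rest."""
    local, _, domain = email.partition("@")
    if not domain:
        return "***"  # not a recognizable address, so hide all of it
    return local[:1] + "***@" + domain


def mask_record(record, secret_key):
    """Return a masked copy of a record; the input is left unchanged.

    Assumes the hypothetical fields "user_id" and "email" are present.
    """
    masked = dict(record)
    masked["user_id"] = pseudonymize(record["user_id"], secret_key)
    masked["email"] = mask_email(record["email"])
    return masked


if __name__ == "__main__":
    key = b"replace-with-a-secret-from-your-key-store"  # placeholder key
    print(mask_record({"user_id": "u-123", "email": "alice@example.com", "clicks": 7}, key))
```

One caveat worth knowing before that meeting: under GDPR, pseudonymized data like this still counts as personal data as long as someone holding the key could link it back to a person, so masking reduces risk but does not take the data out of scope.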

Look into Apache Atlas, Apache Ranger, and Cloudera Navigator. Depending on the flavor of the Hadoop framework you’re using, or Hadoop package you’re using, whether it be Hortonworks or Cloudera, if you’re using one of those two main ones, look into these three tools right here. This will give you some kind of framework to start with. So, you walk into the meeting, somebody says, “Hey, we’ve got to look at how we’re complying with GDPR, we want to really focus on data governance. What are we doing?” You’re sitting there saying, “I don’t know how to tackle this,” if you’re not doing it.

Go. Know these tools. Understand them from a high level. If you need to implement them, it’s a whole different story, but you can start getting trained up, start implementing those, too. Hope this was very helpful. This is something that I’m sure we will make some more videos on. We’ll be talking about constantly. I predict that this is kind of, like I said, industry-shifting regulation for IT and especially for big data, for all of us. I’m sure there’s going to be follow-on. I’m sure other countries in other areas, they’re starting to look at regulations. I’m sure here in the US, I’m sure Russia, Japan, I’m sure everywhere, they’re starting to look at some of these regulations. It’s not going to be just for the EU. Even if it was, it’s still affecting us. Everything’s global. If you have any questions, make sure you put them in the comments section here below. I will answer them here on Big Data Big Questions. You can go to my website, thomashenson.com. Look for the Big Questions, send me a comment. Also, make sure that you’re subscribing, so that you never miss an episode, and I will see you next time.

Filed Under: Big Data Tagged With: Big Data, GDPR
