Isilon Quick Tips: SyncIQ Deep Dive

December 17, 2016 by Thomas Henson

OneFS offers many options for customizing replication policies with SyncIQ. In this episode of Isilon Quick Tips, we walk through those options in a deep dive into SyncIQ.


Transcript

Hello, and welcome back to another episode of Isilon Quick Tips. Today we're going to talk about SyncIQ, and we're going to go a little deeper than we have in the past. Before, it was all about setting up a one-time replication; now let's talk about some of the options and how we can really customize our SyncIQ jobs. I'm going to swing over to Policies and look at a policy I've already got created, and I'm just going to edit it. It's my home shares policy: all my corporate home directories here in Huntsville, which I'm replicating to my secondary cluster.

The first thing I want to talk about is the difference between copy and synchronize. Copy is when you're simply moving data from one directory to another; you're not tracking whether data has been deleted or moved on the source. Synchronize is different, because it actually keeps your primary cluster and your secondary cluster in sync. If a file has been moved to another directory under the source, it will be replicated that way on your secondary cluster. The same goes for deletes: if a file has been deleted on your primary, then once the sync job completes, it will be deleted on your secondary cluster as well. You'd use copy when you want a backup of your data but don't want to keep it in sync, or when there are certain directories you want to pull out and keep every copy of. In most cases, though, you're going to use synchronize.

Next is the "Run job" setting, which controls when the SyncIQ policy kicks off. You have a few options. The one we used before was manual: every time you push the button, the policy starts. You can also run it on a schedule, which is the most common choice. You set it up however you want: two or three times a day, say six in the morning and six in the evening, or on a weekly, monthly, or yearly basis, and the data replicates on that schedule. Another common way to run these is "whenever the source is modified": any change to the source directory, like moving or deleting a file, triggers the policy. You do have to set a delay around that. Say you modify a file: you can delay the sync by seconds, minutes, hours, or days, so every time something is modified, SyncIQ waits a few minutes and then replicates it over. Finally, you can have the policy run whenever a snapshot of the source directory is taken.

Setting the source directory is very simple: what directory do you want to replicate? In this case my source directory is /ifs, so I'm replicating all my data. The cool thing, and where you can really customize this job, is that beyond setting the source directory you can also include or exclude directories underneath it. You can come in and say that of all the directories under data, you're only going to replicate the Isilon_Support directory, or only the isi_gather directory. Or you can exclude directories: replicate everything in data except those two. They hold administrative things you're not really trying to replicate; there's nothing in there that isn't recoverable, so don't replicate them.
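The include/exclude logic described above can be sketched as a simple path filter. This is an illustrative model, not the OneFS implementation: the precedence rule (excludes always win) and the directory names are assumptions for the sketch.

```python
from pathlib import PurePosixPath

def should_replicate(file_path, source_root, includes=(), excludes=()):
    """Decide whether a file under source_root belongs in the replication set.

    Illustrative model only: excludes always win here, and an empty
    include list means "everything under the source root".
    """
    p = PurePosixPath(file_path)
    root = PurePosixPath(source_root)
    if p != root and root not in p.parents:
        return False                      # outside the source root entirely
    for ex in excludes:
        ex = PurePosixPath(ex)
        if p == ex or ex in p.parents:
            return False                  # under an excluded directory
    if includes:
        # Includes listed: the file must fall under one of them.
        return any(p == PurePosixPath(inc) or PurePosixPath(inc) in p.parents
                   for inc in includes)
    return True                           # no includes: replicate everything else

# Everything under /ifs/data except the administrative directories:
should_replicate("/ifs/data/home/alice/report.txt", "/ifs/data",
                 excludes=("/ifs/data/Isilon_Support", "/ifs/data/isi_gather"))
```

This is the pattern that lets one policy at a high level of the tree cover several datasets, instead of many narrow policies.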
That gives you a lot more control: you can set the source at a high level in the tree and replicate only the items you want within it, rather than setting up fifteen different policies because you've got different datasets. You can still come back and cover the data with just one or two policies.

Then there are the advanced settings. You can set a priority on the policy, taking it from a normal default policy to one that's always high priority, to make sure this job is never starved. You can also set a limit on how long to keep the reports from these jobs, because depending on how often the jobs run, the reports can really pile up.

Let me cancel out of this and show you one more thing. When a policy runs on modify, how much bandwidth it uses depends on how often those files change. If you have performance concerns about how often or how much data is being pushed across the network, you can set a performance rule for these jobs. One of the things I really like is that you can put the rule on a schedule. Say you want to replicate any time your data is modified, but during business hours, or on certain days of the week, you want to throttle back; the rest of the week you can run at full throttle. So you can set a schedule, and you can set bandwidth rules with limits, so the data keeps syncing on modify but only consumes so much bandwidth.

And that's a deeper dive on SyncIQ. You can see how to customize and design sync policies to fit whatever rules you want for replicating data between your Isilon clusters. Thanks for taking the time to watch, and I hope you'll join me for another episode of Isilon Quick Tips.
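The scheduled performance rule at the end of the transcript boils down to a lookup from time of day to a bandwidth cap. A minimal sketch, assuming a made-up business-hours window and a made-up 10,000 kb/s cap (neither is an OneFS default):

```python
from datetime import time

# Hypothetical rule table: (start, end, limit in kb/s).
# Outside every window, no throttle is in force.
BANDWIDTH_RULES = [
    (time(8, 0), time(18, 0), 10_000),   # throttle during business hours
]

def bandwidth_limit(now):
    """Return the kb/s cap in force at 'now', or None for full throttle."""
    for start, end, limit in BANDWIDTH_RULES:
        if start <= now < end:
            return limit
    return None                          # outside every window: no limit
```

Adding more tuples to the table gives per-day or per-window rules, which is the shape of the schedule the UI lets you draw.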


Filed Under: Quick Tip Tagged With: Isilon, Quick Tip
