Hello, today I want to teach you how to schedule your KNIME workflows in batch mode on a Mac, with screenshots and text. Sometimes people call this “automating,” and the words are often used interchangeably; however, it’s important to ask people what they mean by “automate” if they have never done any automation before. It helps to operationalize what words mean to you in your IT role, or in your business role.
In this blog I will cover many important topics for scheduling your KNIME workflows on macOS.
Understand the two methods of scheduling KNIME workflows: workflowDir and workflowFile
One method of automating workflows in KNIME is a little slower, takes more steps to get going, and has fewer limitations.
This is the workflowFile method, and it requires you to export the workflow to a directory.
The other method is faster to test, but it blocks you from practically everything while it runs.
This is the workflowDir method, and it requires you to aim your code at a workflow directory inside your KNIME Analytics Platform workspace. It also requires that you close the workflow (or KNIME itself) first, because the batch run needs the workflow to be available. This mode IMO is more for testing, and it’s what I’m explaining below. NOTE: It’s easy to swap between the two.
If you just copy, paste, and run the code below, you won’t understand this important difference, which is exactly what happened to me at first. Making workflowDir a functional solution is a bit of a hassle, though I understand it’s an option. I’m telling you about it early because it makes a large difference in how you use KNIME, and it’s very important to how you automate your workflows on a schedule on a laptop, server, or whatever you decide to use to kick off KNIME workflows.
Please have a look at workflowDir, covered below, which limits your ability to use KNIME while it runs. workflowDir asks for the location of a workflow inside KNIME Analytics Platform, and when you run it, it locks down the application. You don’t want it to lock down KNIME; you want to run the workflow as many times as you like. This leads me to workflowFile, which lets you schedule workflow files that you exported to a directory. It’s easy to change a workflowDir command into a workflowFile command and remove the residual popups generated by the workflowDir run.
Use the code below for workflowFile:
echo "start1" && /Applications/KNIME\ 4.1.0.app/Contents/MacOS/Knime -reset -nosplash -application org.knime.product.KNIME_BATCH_APPLICATION -workflowFile="/Users/dev3lop/Desktop/Covid19Googlesheets.knwf" && echo "end1"
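Here is a minimal sketch of the same command wrapped in a reusable shell function, assuming the install path and workflow file from the example above. The function name run_knime_file and the KNIME_BIN variable are hypothetical names I made up for illustration; only the KNIME flags come from the command above.

```shell
#!/bin/sh
# Path to the KNIME executable (from the example above; adjust for your version).
KNIME_BIN="${KNIME_BIN:-/Applications/KNIME 4.1.0.app/Contents/MacOS/Knime}"

# run_knime_file is a hypothetical helper: it runs one exported .knwf file
# in batch mode and reports the exit status of the run.
run_knime_file() {
    workflow_file="$1"
    if [ ! -f "$workflow_file" ]; then
        echo "workflow file not found: $workflow_file" >&2
        return 2
    fi
    echo "start: $workflow_file"
    "$KNIME_BIN" -reset -nosplash \
        -application org.knime.product.KNIME_BATCH_APPLICATION \
        -workflowFile="$workflow_file"
    status=$?
    echo "end: $workflow_file (exit status $status)"
    return "$status"
}

# Example call (uncomment and adjust the path):
# run_knime_file "/Users/dev3lop/Desktop/Covid19Googlesheets.knwf"
```

Unlike the one-liner, this version prints the “end” line even when the run fails, so a scheduler log shows the exit status either way.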
Scheduling workflows in KNIME Analytics Platform to automate things can be done easily with task scheduling software.
However, you need to know what code to put into the task scheduling software, which is everything below this task scheduling explanation.
Windows Task Scheduler is a Windows product that lets you kick off KNIME workflows, either inside the KNIME app or from the destination where you exported the workflow. It’s one of the go-to products for IT because it’s already on their computers, servers, etc., and they have experience using it to schedule scripts, tasks, batch processes, stored procedures, and much more.
What software can you use to schedule workflows to run in KNIME Analytics Platform?
- Windows Task Scheduler – requires Windows. Hit the Windows key and type ‘task schedule’. You will need to learn its user experience to get going, and you may need a bit of technical expertise or patience.
- Canopys task scheduler software – has no operating system limitation and does not require a VM. All you need to do is download Canopys, open it, click new task, paste the KNIME code from below, and set your schedule. Click automate in the menu above to check out the app.
- Apache Airflow – requires a lot of code to get even a single schedule off the ground, plus a VM and engineering experience to support and remediate it. Their website has a button that says install; however, you need to use code to install it locally, and to install it on a VM you need the VM and a very good understanding of the code.
Apache Airflow is rather techie; however, a lot of people like to use the tool. We built Canopys to be easier.
These 3 options will give you the ability to schedule or automate your workflows, kick off scripts, workflows, and prep and blend, and effectively schedule your KNIME applications!
When following this blog, also check out the KNIME FAQ at https://www.knime.com/faq#q12. It took me several months of using the product before the FAQ became somewhat relevant for helping me understand how to automate. Eventually it took sitting down with a lot of free time and testing the code over and over until I found one bit of code that worked 100% of the time for me.
There are two paths to automating this process: workflowDir and workflowFile.
To begin the blog, let’s start with the solution that worked for me more than once!
The code to use batch mode on a Mac and begin automating your KNIME apps
The code shared below can be used in the terminal on your MacBook and will help you automatically run your workflows.
You will need to change the directories based on the correct directories for your computer or “environment.”
I’m using a Mac; the first path is where the KNIME executable is located for me, and the last path is my workflow directory.
/Applications/KNIME\ 4.1.0.app/Contents/MacOS/Knime -reset -nosplash -application org.knime.product.KNIME_BATCH_APPLICATION -workflowDir="/Users/itylergarrett/knime-workspace4.1/CoronaVirus2"
Before we dive into the code, let’s tour where /Contents/MacOS/Knime is located.
PRO TIP
Add --launcher.suppressErrors after -nosplash to remove the pesky popup. More on this in the last section of this blog. It would look like this: /Applications/KNIME\ 4.1.0.app/Contents/MacOS/Knime -reset -nosplash --launcher.suppressErrors -application org.knime.product.KNIME_BATCH_APPLICATION -workflowDir="/Users/dev3lop/knime-workspace/CoronaVirus7"
If you notice any weird errors, try typing the quotes and dashes instead of pasting them. I find WordPress sometimes does a bad job with these characters and may give you a curly double quote, or a single long dash where two hyphens belong. I know because I made this mistake.
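If retyping is tedious, here is a small sketch that cleans a pasted command for you. The function name fix_pasted_command is hypothetical, and it only handles the curly double quotes, the double-prime character, and the long dash in front of “launcher”; other typographic characters would need their own rule.

```shell
#!/bin/sh
# fix_pasted_command is a hypothetical helper: it reads a pasted command on
# stdin and replaces typographic characters with the plain ones a shell expects.
fix_pasted_command() {
    sed -e 's/“/"/g' \
        -e 's/”/"/g' \
        -e 's/″/"/g' \
        -e 's/–launcher/--launcher/g'
}

# Example: a fragment pasted from a web page, with curly quotes and an en dash.
printf '%s\n' '–launcher.suppressErrors -workflowDir=”/Users/dev3lop/knime-workspace/CoronaVirus7″' | fix_pasted_command
```

The example prints the fragment with two plain hyphens and straight double quotes, ready to paste into the terminal.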
Finding the Contents folder in your KNIME application
When using a Mac, note that your KNIME executable lives inside the app in your Applications folder. It’s subtly different from Windows and may require this small explanation for beginners. I will break this set of steps into smaller bites so you understand what we are doing to schedule your KNIME workflow.
This blog explains the Mac; Windows is going to be similar, and it’s well documented on the KNIME forums.
Here’s a screenshot of your applications on your Mac. You may want to open your Applications folder this way; however, you can also open Finder and get to the same area. I’m setting this up for those who need this walkthrough to set up their automation.
Navigate to the KNIME you need to utilize. If you have many versions of KNIME, pay attention to how the code differs in that part of the path to reach that KNIME installation.
Here you will find logs, more code, and the KNIME executable.
Expanded in the screenshot below.
You can build your directories accordingly; however, on a Mac yours will likely be identical to the directories in my code, with some variation in version numbers. Knowing where this file is located is important for accessing and launching KNIME.
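If you prefer the terminal to Finder, a sketch like this lists the executable inside each KNIME app it finds. The function name list_knime_executables is hypothetical, and it assumes your installs are named KNIME*.app and follow the same Contents/MacOS/Knime layout as mine.

```shell
#!/bin/sh
# list_knime_executables is a hypothetical helper: it prints the executable
# inside every KNIME*.app bundle found under the given folder.
list_knime_executables() {
    apps_dir="${1:-/Applications}"
    for app in "$apps_dir"/KNIME*.app; do
        exe="$app/Contents/MacOS/Knime"
        # Skip the unexpanded pattern (no match) and bundles without the executable.
        if [ -x "$exe" ]; then
            printf '%s\n' "$exe"
        fi
    done
}

list_knime_executables /Applications
```

Each line it prints is a path you can use at the start of the batch command, as long as you quote it or escape the space.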
How does the KNIME workflow scheduling code work?
The KNIME workflow scheduling code can be broken into 3 chunks, and I’ll do my best to explain everything as if you’re not a developer.
For the sake of simplicity, I will paste the code again.
/Applications/KNIME\ 4.1.0.app/Contents/MacOS/Knime -reset -nosplash -application org.knime.product.KNIME_BATCH_APPLICATION -workflowDir="/Users/itylergarrett/knime-workspace4.1/CoronaVirus2"
The code above can be broken into 3 major parts.
- The location of the KNIME executable.
- The code that makes a KNIME workflow do magic.
- The directory to your workspace & workflow.
If you skipped around and scrolled down, be sure to understand the “workflowFile” aspect by using the find function, CTRL+F or cmd+F (scroll up for more info).
The location of the KNIME executable.
Finding the location of the KNIME executable is only challenging on a Mac, so if you’re a Windows user, have no fear.
/Applications/KNIME\ 4.1.0.app/Contents/MacOS/Knime
This is the default install location of the KNIME app for my 4.1.0 installation. Make changes as you see fit.
The weird part is the backslash before the space in “KNIME 4.1.0”. This is typical when running code like this: the backslash escapes the space so the terminal treats the whole path as one piece instead of splitting it in two.
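A quick sketch you can paste into a terminal to see the escaping in action. Nothing here runs KNIME; it only compares strings, and the variable names are made up for the example.

```shell
#!/bin/sh
# Two equivalent ways to write a path that contains a space.
escaped=/Applications/KNIME\ 4.1.0.app/Contents/MacOS/Knime
quoted="/Applications/KNIME 4.1.0.app/Contents/MacOS/Knime"

if [ "$escaped" = "$quoted" ]; then
    echo "same path"
fi

# Without the backslash or quotes, the shell splits the path at the space
# and sees two separate words instead of one.
set -- /Applications/KNIME 4.1.0.app/Contents/MacOS/Knime
echo "unescaped word count: $#"
```

So the backslash version and the quoted version are interchangeable; use whichever you find easier to read.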
The code that makes a KNIME workflow do magic
Now, let’s discuss the code that makes a workflow do magic.
Knime -reset -nosplash -application org.knime.product.KNIME_BATCH_APPLICATION
From my testing, -reset at the beginning is important. You want to reset the workflow when it starts.
Resetting the workflow is a common request in KNIME. Please keep these settings the same. If you want to dig deeper, check out their FAQ page.
(FROM KNIME FORUM) Use the --launcher.suppressErrors option, and remove the -noexit option, on the original workflow in batch mode. Even if the workflow encounters the Fail In Execution node, the popup error dialog will not appear. This has the downside of suppressing any other errors that may occur, though.
The directory to your workspace & workflow.
The directory to your workspace & workflow is a good thing to understand about your KNIME products.
I’m going to share my code, and how to get the path in your KNIME product with a native feature.
-workflowDir="/Users/itylergarrett/knime-workspace4.1/CoronaVirus2"
Knowing where these workflows are means you can begin automating them on a schedule!
Open your KNIME product, navigate to your workflows, and right click the workflow.
Highlight Copy Location, then click local path.
Paste this into your ‘workflow directory’ and rinse and repeat as necessary.
Knowing this little process will help you get the code for your future apps quickly, and I hope it also shows you different things in the menu. Take some time to read what’s inside these menus to improve your skills while you ramp up.
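Before putting a copied path into the command, you can sanity check it with a sketch like this. The function name check_workflow_dir is hypothetical, and it rests on an assumption: that a KNIME workflow folder contains a file named workflow.knime. Please confirm that against one of your own workflow folders.

```shell
#!/bin/sh
# check_workflow_dir is a hypothetical helper. It assumes a KNIME workflow
# folder contains a file named workflow.knime; verify this on your install.
check_workflow_dir() {
    dir="$1"
    if [ ! -d "$dir" ]; then
        echo "not a directory: $dir" >&2
        return 1
    fi
    if [ ! -f "$dir/workflow.knime" ]; then
        echo "no workflow.knime in: $dir" >&2
        return 1
    fi
    echo "looks like a workflow: $dir"
}

# Example (path from this post; replace with your own Copy Location result):
# check_workflow_dir "/Users/itylergarrett/knime-workspace4.1/CoronaVirus2"
```

A failed check usually means the path points at the workspace or a workflow group rather than the workflow itself.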
Open your terminal and test.
To begin automating with KNIME, test your code to your KNIME application. Make sure it works end to end.
This code is a “request” being sent to your application to start your workflow. “Hey is it cool if I use KNIME?”
Knime will say YES but it also needs to know where your workflow directory is too. “Yes but what and where and how?”
/Applications/KNIME\ 4.1.0.app/Contents/MacOS/Knime -reset -nosplash -application org.knime.product.KNIME_BATCH_APPLICATION -workflowDir="/Users/itylergarrett/knime-workspace4.1/CoronaVirus2"
Now, with the terminal open, get your KNIME automation code ready. This is the code you will later paste into the Canopys task scheduler when you create a new task and schedule it to run nightly, weekly, hourly, or on whatever custom schedule you can think of.
Paste the code into the terminal.
Turn off your KNIME Analytics Platform, or at least close the workspace, or the run won’t start. This is the locking behavior I explained when comparing workflowDir vs workflowFile. You will understand as you test more and see the differences, strengths, and weaknesses. I’m trying to teach you the method that doesn’t let you lock up your computer because you turned on a large ETL job 50 times.
OH RIGHT, back to the tutorial. Hit enter and sit back.
If it works, you will see your tools flashing by in the terminal feedback. If it breaks, it will give you more information to learn from.
If you get errors, troubleshoot the errors. If you keep getting errors, give yourself an easier task, like writing a file to your desktop, and see if that file refreshes when you run your KNIME automation in batch mode.
Don’t try to validate how this code works by using a long-running KNIME process. I tried, and it took a long time; it does not need to be a big process to test how this code works. I encourage you to find an easier path instead of a complicated path. Start with something easy, then move to your complex ETL workflows or data science development.
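Here is a small sketch of the “did my file refresh?” check, assuming your test workflow writes a single output file. The function name was_refreshed and the file names in the comments are hypothetical placeholders.

```shell
#!/bin/sh
# was_refreshed is a hypothetical helper: it succeeds when the output file
# exists and is newer than the marker file created before the run.
was_refreshed() {
    output_file="$1"
    marker_file="$2"
    [ -f "$output_file" ] && [ "$output_file" -nt "$marker_file" ]
}

# Usage sketch (paths are placeholders):
# marker=$(mktemp)            # 1. create a marker just before the run
# ...run your KNIME batch command here...
# if was_refreshed "$HOME/Desktop/test-output.csv" "$marker"; then
#     echo "output refreshed"
# else
#     echo "output NOT refreshed"
# fi
```

Comparing against a marker file avoids reading timestamps by eye, which gets old fast when you are testing the same command over and over.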
Remove the pop up after you automate a task in KNIME batch mode
If you’re dealing with a ton of popups and aren’t entirely sure why, start by posting that information in the KNIME forum. They have helped me a ton, and the users on the forum are very respectful of each other.
I hope some more info from the forum will help you get rid of popups, but do not hesitate to ask questions below if you’re unable to move the needle.
Removing the popup allows you to “leave it and sleep”… Here’s what Scott, one of the KNIME team members, wrote in a forum post on knime.com.
- Use the --launcher.suppressErrors option, and remove the -noexit option, on the original workflow in batch mode. Even if the workflow encounters the Fail In Execution node, the popup error dialog will not appear. This has the downside of suppressing any other errors that may occur, though.
- Create a separate workflow, and use the Call Local Workflow node (see attached example). This node will execute successfully even if it encounters an error in the workflow that it calls.