Kubeflow "Unboxing" - Part 2 - Kubeflow itself
- Derek Ferguson
- Dec 16, 2018
- 2 min read
OK - so 6 hours of open warfare with machinery I haven't touched in 6 months - seems like there might be an hours-to-month ratio there. Anyhow, at this point, I have a nice 3-node cluster running the latest k8s bits, so let's go ahead and try to install Kubeflow itself.
The first thing one notices is that the instructions directly on the Kubeflow homepage discuss how to hydrate a ready-made Kubeflow cluster "from scratch" - k8s and the Kubeflow bits all-in-one. But this is problematic because we already have the k8s cluster we want to use. In fact, even if I could go back to the start of the day and restart from scratch, I wouldn't, because I feel most organizations are going to have pre-created k8s capacity that is under independent (but internal) management - so understanding how to reuse existing capacity is essential.
Let's try the instructions in Step 2 at https://www.katacoda.com/kubeflow/scenarios/deploying-kubeflow-with-ksonnet
First stumbling block is that it requires ksonnet, which I haven't yet installed. A Google search reveals that this is available as a ready-made binary download: https://github.com/ksonnet/ksonnet/releases . I download that onto my k8s master, extract the "ks" binary out of it onto my desktop - then "mv" it to /usr/bin.
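For reference, the install boils down to a few commands. A hedged sketch follows - the version number is an assumption (check the releases page for the current one), and I'm guessing at the Linux tarball naming, so the network-dependent steps are shown as comments:

```shell
# Assumed ksonnet release version - check the releases page for the latest.
KS_VERSION="0.13.1"
KS_TARBALL="ks_${KS_VERSION}_linux_amd64.tar.gz"
KS_URL="https://github.com/ksonnet/ksonnet/releases/download/v${KS_VERSION}/${KS_TARBALL}"
echo "would download: $KS_URL"

# Download, unpack, and put "ks" on the PATH (needs network + sudo):
# curl -LO "$KS_URL"
# tar -xzf "$KS_TARBALL"
# sudo mv "ks_${KS_VERSION}_linux_amd64/ks" /usr/bin/ks
# ks version   # sanity check
```

I used /usr/bin above to match what I actually did, though /usr/local/bin is the more conventional home for hand-installed binaries.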
Next wrinkle - the instructions require a GitHub personal access token. Doesn't seem to be a problem - I just Google "github personal access token", go to the link it provides (https://github.com/settings/tokens), hit "Generate New Token," give it permissions to everything and get back a code. In retrospect, I just wish I had done this from my k8s master running in VirtualBox, because I *still* haven't hooked up the copy/paste integration with my MacBook host. :-(
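As I understand it, ksonnet picks the token up from the GITHUB_TOKEN environment variable (it hits the GitHub API when resolving registries, and the token avoids rate limiting). A minimal sketch, with a placeholder where the real token goes:

```shell
# Placeholder value - paste the token generated at
# https://github.com/settings/tokens in its place.
export GITHUB_TOKEN="replace-with-your-real-token"

# Optionally persist it for future shells:
# echo 'export GITHUB_TOKEN=<your token>' >> ~/.bashrc

if [ -n "$GITHUB_TOKEN" ]; then echo "GITHUB_TOKEN is set"; fi
```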
Another hurdle - at the end of the instructions, it talks about installing some Katacoda extensions, and the instructions after that point go pretty deeply into using files that are only available in the Katacoda environment. Looking back at the other docs from Google, they (somewhat predictably, I guess) want you to use GKE for everything. :-( So, I'm going to try to use the tutorial here (https://codelabs.developers.google.com/codelabs/cloud-kubeflow-e2e-gis/index.html#2) and adapt it for use against a local Kubernetes cluster.
From step 2, I'm going to try only downloading the suggested source code and skipping everything else - as it appears to be setting up connectivity with GKE, which I don't want.
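In practice, that reduces my version of step 2 to just the clone. A sketch of what I mean - the repo path is my assumption of where the codelab's sample code lives, and the network-dependent commands are commented so nothing here touches GKE:

```shell
# Assumed repo for the codelab's sample code:
# git clone https://github.com/kubeflow/examples.git
# cd examples

# Deliberately skipped: all of the gcloud / GKE project setup
# from step 2 - we're reusing the existing local cluster instead.
SKIP_GKE="true"
echo "SKIP_GKE=${SKIP_GKE}"
```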
From step 3, I ran all of the steps from the top down and - in a particularly good sign - the image at the bottom exactly reflects what I have in my cluster when I am finished. It appears that if you simply skip the bits in step 2 above that point everything at Google's Public Cloud offering, it will use your existing kubectl setup by default - which is, if you've been following along, going to be pointing to your own cluster.
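That default behavior makes sense: kubectl (and ksonnet on top of it) reads the current context from your kubeconfig. A quick sanity check - the cluster-touching commands are commented since they only work on the k8s master:

```shell
# With the GKE steps skipped, the tooling falls back to whatever
# kubeconfig kubectl is already using:
KUBECONFIG_IN_EFFECT="${KUBECONFIG:-$HOME/.kube/config}"
echo "kubeconfig in effect: $KUBECONFIG_IN_EFFECT"

# On the cluster itself, confirm the context and the Kubeflow pods:
# kubectl config current-context
# kubectl -n kubeflow get pods
```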
At this point, the above tutorial goes back to trying to exchange data with the Google Public Cloud. As a result, I'm going to leave this here and pick up again with setting up the JupyterHub and running a simple TF test script.