Language Flashcards from Songs and Movies - Part 7: Starting the Move to kNative
- Derek Ferguson
- May 17, 2020
- 7 min read
It's been a long time since my last blog post, and in the intervening time, I let my AWS subscription lapse. Rather than reinstating it, I have had the idea to redo it all "on premises" - or, phrased another way, running on the k8s cluster I've built at my home on a motley assortment of old hardware. :-)
I start by looking back at article 1 in this series. It was written well over 6 months ago, so I think I can be forgiven for not recalling all the details. I am reminded of two things here. First - I was using Lambda to run the lemmatizer that starts all of this, identifying the unique words in a set of Russian text so that we don't end up processing multiple variations of the same root word. Second, I was using Bitbucket to build and deploy all of this onto AWS.
So, I start by checking the status of my Kubernetes cluster.
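For reference, the check itself is just a node listing - something along these lines (the -o wide flag adds IPs and OS details to the output):
kubectl get nodes -o wide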

A few things of interest here. First, 191 days of uptime. That's pretty impressive (or is it shocking... should I be restarting more often? I honestly don't know). Second, I literally forgot I even own that Mac Mini still. It has disappeared onto the equipment shelves - but it is still running... and it is my master node. Wow! Finally, "fritzymac" has lived up to its name and gone offline. I will start by turning it on again (assuming it has just powered itself off).
Honestly, I have to fight with it a bit, but eventually, "fritzymac" comes online. I wanted it back because it is a reasonably powerful MacBook upon which I have installed Ubuntu Linux - so it is one of the more powerful nodes in my cluster. By the way, "mini" is running Ubuntu as well - both without any UI components, generally making optimal use of their processor and memory capacities.
I find a link to this video, and watch the first 15 minutes (I'm an impatient sort). It gives me the commands to install kNative on my cluster so - why not... let's give it a whirl!
kubectl apply \
-f https://github.com/knative/serving/releases/.../serving.yaml \
-f https://github.com/knative/eventing/releases/.../eventing.yaml \
-f https://github.com/knative/monitoring/releases/.../monitoring.yaml
LOL -- ok, that "blowed up good!" :-). I'm guessing those ellipses in the middle weren't meant to be taken literally. :-). Let's go check out the GitHub repo.
kubectl apply \
-f https://github.com/knative/serving/releases/v0.11.2/serving.yaml \
-f https://github.com/knative/eventing/releases/v0.14.2/eventing.yaml
I couldn't find monitoring anymore, so let's try the above, for starters! Nope, that doesn't work, either... let's watch a bit more of the video! :-)
Unfortunately, his first demo starts with it already installed. Let's try this document, which has a couple of specific lines for installation.
kubectl apply --filename https://github.com/knative/serving/releases/download/v0.14.0/serving-crds.yaml
kubectl apply --filename https://github.com/knative/serving/releases/download/v0.14.0/serving-core.yaml
At this point, I know that I have Istio installed from the way that this blog series ended previously, so I will proceed to install the kNative bits that use it, rather than following the instructions to install it from scratch. I have to hope that my previous Istio installation has all the needed bits on it!
kubectl apply --filename https://github.com/knative/net-istio/releases/download/v0.14.0/release.yaml
kubectl --namespace istio-system get service istio-ingressgateway

Hmmm... looking ahead, I can see I'm going to need an EXTERNAL-IP. And yet, I have none. What does this "<pending>" thing mean under "EXTERNAL-IP"? A little reading suggests it simply means there is no load balancer around to hand out an external IP - something you only get for free on a public cloud, which I'm definitely not running on - so... let's try kicking this one down the road a bit and see if it really becomes an issue or not.
kubectl apply --selector knative.dev/crd-install=true \
--filename https://github.com/knative/eventing/releases/download/v0.14.0/eventing.yaml
kubectl apply --filename https://github.com/knative/eventing/releases/download/v0.14.0/eventing.yaml
The next step is a bit tricky, because I know that I eventually want to get to Kafka, but... I don't want to tackle all of that just yet. So... let's start with an in-memory channel.
kubectl apply --filename https://github.com/knative/eventing/releases/download/v0.14.0/in-memory-channel.yaml
kubectl apply --filename https://github.com/knative/eventing/releases/download/v0.14.0/channel-broker.yaml

Looks like kNative is all up-and-running at this point! The video above starts using "kn," though - the kNative command-line client. So, let's install that, also!
curl -L https://github.com/knative/client/releases/download/v0.14.0/kn-linux-amd64 --output kn
chmod +x ./kn
The "-L" flag on the curl download above is particularly important, because github likes to do redirects to point to specific binaries and without that flag, you'll get a bunch of HTML in your "kn" instead of the actual binary download. As it stands, the above worked.

So, I try to run the command shown in the demo and it errors out. I read through the error and determine that the Docker image it is trying to pull down is not publicly available at the moment. However, the source code for it appears to be here. So, let's try pulling down the source code, building the image, pushing it up to our own repo and retrying.
git clone https://github.com/mchmarny/maxprime.git
cd maxprime
docker build . -t dotnetderek/maxprime
docker push dotnetderek/maxprime
./kn service delete jaxsrv --namespace demo
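The re-create is then roughly the same command as in the demo, just pointed at my own copy of the image, and reusing the service name and namespace from the delete above:
./kn service create jaxsrv --image docker.io/dotnetderek/maxprime --namespace demo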

That "Ingress has not yet been reconciled" line looks distinctly problematic to me, but we'll give it a shot. I have a feeling this is where my dismissing of the earlier error about not having an external API is going to come back to bite me. Let's see.
Yeah, that URL has no existence outside the realm of Istio. Thankfully, I'm able to Google around a little and find this gem of an article. It tells me how to run a few commands that help me get to the bottom of accessing this page - at least from curl.
export IP_ADDRESS=$(kubectl get node --output 'jsonpath={.items[0].status.addresses[0].address}'):$(kubectl get svc istio-ingressgateway --namespace istio-system --output 'jsonpath={.spec.ports[?(@.port==80)].nodePort}')
kubectl get services.serving.knative.dev --namespace demo

We take the hostname from the URL above and use it in the Host header below, replacing the value shown if yours is different.
curl -H "Host: jaxsrv.demo.example.com" $IP_ADDRESS
This sends back a scroll of text: the main page of our first application to use kNative.
But, let's go a step further and see this in the browser. I use Chrome, so I install a plugin called ModHeader. Using this plugin, I am able to tell Chrome to send the required Host header on every request going forward.

Now, if I go to the host and port currently saved in the $IP_ADDRESS environment variable above, I get the web app, and I'm able to futz with it as much as I like.

Now that we have the sample kNative app working, let's see if we can apply it to get the lemmatizor running.
I start by reviewing my previous code at https://bitbucket.org/PythonDerek/lemmatizor/src/master/service.py. The primary difference between kNative and Lambda at this point appears to be that kNative expects the developer to provide a Docker image with a listener on a port that is passed in via an environment variable named PORT, whereas Lambda takes care of any network listening upstream and simply invokes the method with the predefined signature inline.
So, seeing that there are unit tests defined for my existing Lemmatizer, and finding an excellent "Hello World" for kNative in Python at this location, I'm thinking - let's start by Docker-izing our existing code and making sure the unit tests still pass, then we'll move on to the next bits.
All code for this will be at https://github.com/JavaDerek/lemmatizor.
We start by adding a Dockerfile, modeled on the sample Dockerfile above, but without an entrypoint to start the listener process yet, because we want to see our tests pass first.
I use the installer script at https://bitbucket.org/PythonDerek/lemmatizor/src/master/bitbucket-pipelines.yml as the basis for the libraries needed to run my unit tests.
I start to copy the complete requirements.txt from our previous build of the lemmatizor, but it is easily 50 large Python libraries, and I doubt we need most of these. So, I'll just start with the line that imports nltk and see how that goes.
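For reference, here's a minimal sketch of what this first cut of the Dockerfile might look like - the base image is my choice, and the entrypoint is deliberately left out for now:
FROM python:3.7-slim
WORKDIR /app
# Install just the Python libraries we think we need to run the tests
COPY requirements.txt .
RUN pip install -r requirements.txt
# Bring in the application code and tests
COPY . .
# No ENTRYPOINT/CMD yet - for the moment we only want a container we can run the unit tests in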
Trying to build the image reminds me I need to move over mystem, so I copy that file manually from the old git repo to the new one. The Docker image builds correctly and I log into it using...
docker run -it dotnetderek/lemmatizor /bin/bash
At which point I realize that I forgot to actually put in any of the Python code or tests. :-). I copy them over "as is" and retry: service.py and the entire "tests" subdirectory (minus a pycache file that was in there). I also follow the advice in step 4 of the Hello World above and create a .dockerignore file at this point.
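Running the tests inside the container then just means invoking the same test runner the old Bitbucket pipeline used - assuming that was pytest, something like:
docker run -it dotnetderek/lemmatizor python -m pytest tests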

So, running the unit tests as I had in the previous repository's Bitbucket pipeline reveals we need that boto3 library. I rerun about 5 more times and find more libraries to copy over from the old requirements.txt file. Then, finally, all the tests run, but at least one fails because I forgot to copy the config.yml file over from the old repository. So, I move that one over, too, and rerun. Now the tests are passing - so I can start hacking with confidence that I have a way to test my changes.
I decide to create a small app.py in the style of the Hello World example - one that will take a POST and use it to invoke the methods that used to be our Lambda entry points, for minimal change. For the first pass, I have it simply return a hard-coded response, so I can work out any additional issues I may find with deployment.
I put the gunicorn entrypoint from the Hello World sample at the end of the Dockerfile and proceed with another build and run. It is important to pass in a port on the command line as an environment variable and to forward that port from the container to the host computer.
docker run --env PORT=8080 -p 8080:8080 dotnetderek/lemmatizor
After running this command, I'm able to open a web browser, go to http://localhost:8080 and see my hard-coded message. So, I'm comfortable that at least this much is working properly. Now to try a POST with some returned lemmas.
Running the code below and hitting it from Postman with some random JSON sends back the hard-coded JSON response shown in the code. So, we seem stable enough to start connecting to the actual natural language processor.
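For reference, a sketch of what that first pass at app.py looks like - a few lines of Flask in the style of the Hello World sample; the route and the field names are just placeholders of mine:

import os

from flask import Flask, jsonify

app = Flask(__name__)

@app.route('/', methods=['GET', 'POST'])
def lemmatize():
    # First pass: ignore whatever was posted and return a canned response,
    # so deployment issues can be separated from application issues.
    return jsonify({'lemmas': ['placeholder']})

if __name__ == '__main__':
    # kNative tells the container which port to listen on via the PORT environment variable.
    app.run(host='0.0.0.0', port=int(os.environ.get('PORT', 8080)))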

Hooking up to the natural language processor requires first extracting a method that performs the tokenization in isolation from the code that writes to SQS, since we won't be using SQS anymore, either. Then, it turns out that once actual Cyrillic is passed on the wire, the JSON serialization shown above winds up turning all the characters into escape codes rather than simple UTF-8, so a slightly different process is required. The code below, when passed a Russian sentence, returns a proper JSON list of the lemmas in that sentence.
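A sketch of the reworked version - assuming pymystem3 as the wrapper around the mystem binary, and using json.dumps(..., ensure_ascii=False) so the Cyrillic comes back as UTF-8 rather than escape codes:

import json
import os

from flask import Flask, Response, request
from pymystem3 import Mystem

app = Flask(__name__)
mystem = Mystem()

@app.route('/', methods=['POST'])
def lemmatize():
    text = request.get_json(force=True).get('text', '')
    # Mystem returns lemmas interleaved with whitespace and punctuation; keep only the words.
    lemmas = [lemma.strip() for lemma in mystem.lemmatize(text) if lemma.strip()]
    # ensure_ascii=False keeps the Cyrillic readable instead of \uXXXX escape codes.
    body = json.dumps(lemmas, ensure_ascii=False)
    return Response(body, content_type='application/json; charset=utf-8')

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=int(os.environ.get('PORT', 8080)))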

Now to use the kn command line tool to deploy this to kNative and try it out there.
docker push dotnetderek/lemmatizor
kubectl create namespace lm
./kn service create lemmatizor --image dotnetderek/lemmatizor --namespace lm
kubectl get services.serving.knative.dev --namespace lm




So, as shown above - to get this final piece working, we have to move the Host header over to Postman and send a Russian message to our service as the "text" parameter of a JSON message. In return, we get a JSON response with all the lemmas that were in that sentence.
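The same call works from curl, assuming the hostname that kn reported for the service follows the usual name.namespace.example.com pattern:
curl -H "Host: lemmatizor.lm.example.com" -H "Content-Type: application/json" -d '{"text": "мама мыла раму"}' $IP_ADDRESS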
In the next section, we'll hook up to Kafka, so the rest of our services can be connected on kNative.
All the code from this blog post is available at https://github.com/JavaDerek/lemmatizor.