How I made a Kate Winslet, Hotdog (and not Hotdog) classifier on a whim

Did we really need this? No. Did I have 7 days left on my semester break? Yes.

It started with one poorly planned text to my (last 3̶2) friends at 11 PM: “GUYS! Want me to make hotdog not hotdog”. The old adage of Silicon Valley came floating back to me at 11:48 at night, obviously.

Yeah. The wrong kind of Rose.

So why a Kate Winslet classifier, you ask? Because there isn’t one, and you need it. Okay, really: I passed a picture of Rose (as in the film character from Titanic) to the TF for Poets flower-labelling script, and it told me, with 99% confidence, that she was a rose. I think it was hugely unjust for a program to do that (it really isn’t, but this entire piece is going to run on the same line of exaggeration).

First, if you’re new to ML (like me), the question is why it performs this poorly, and the simple explanation is that it’s a classifier: its job is to pick a class, and if a program has to do that aggressively, it will, even when its confidence is abysmal or fishy. Furthermore, TF for Poets retrains only the last layer of the network, which makes retraining much easier for a basic user, but also much harder to train methodically on exactly the categories we care about.
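If that sounds abstract, here’s a toy numpy sketch (nothing to do with the actual TF for Poets code, and the logits are made up) of why a flowers-only classifier will happily call Kate Winslet a rose: softmax only ranks the classes it knows about, so whatever scores highest comes out looking confident.

```python
# A toy illustration, not the TF for Poets code: a softmax classifier
# always ranks *some* class first, even for an image of none of them.
import numpy as np

def softmax(logits):
    exps = np.exp(logits - np.max(logits))  # subtract max for stability
    return exps / exps.sum()

labels = ["daisy", "dandelion", "rose", "sunflower", "tulip"]

# Made-up logits for a photo of Kate Winslet: she is no flower, but
# "rose" happens to score highest, and softmax has no "none of the
# above" option to spend probability on.
logits = np.array([0.2, 0.1, 5.0, 1.0, 0.5])
probs = softmax(logits)

print(labels[int(np.argmax(probs))], round(float(probs.max()), 2))
# -> rose 0.96
```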

Problem #1: Kate Winslet

A movie is good only until you have to ffmpeg it frame-by-frame. OR, you can hit up Google Images, and get the first 100, then the next 100, and a couple more, until you’ve kind of exhausted your mobile data.*

With that comes the challenge: how. Doing it by hand is not an option, even for a procrastinator like me (by virtue of being myself, obviously). So there are the obvious choices: running a script to do it, using some app that charges $10/month, or, my pick, finding blatantly feature-less but gets-the-job-done Firefox add-ons. And it turns out there’s never a lack of those.

Oh ho! Not a jpg, you say.

Except it wasn’t the best choice of the lot. So it’s already 12 AM the next day (I procrastinated through the daytime, if you’re wondering why it’s the next day) and I really want to get this done before I sleep, when I realise the issues at hand: 1) this godsend of an add-on wasn’t converting all of the images to a proper format, so yes, the files said .jpg, but were they really JPEGs? Only as much as I am a writer. 2) It simply did not want to download some images at all. Why, you ask? No idea, which obviously meant I had to download more (and the day had to keep getting worse in more ways than one).
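If you hit the same mislabelled-extension mess, a few lines of Pillow will sniff out what each file actually is and re-encode the liars; a minimal sketch, assuming your loot landed in a downloads/ folder (the folder name is made up):

```python
# A minimal sketch: check what each "jpg" really is and re-encode the
# liars. Pillow reads the file header, not the extension.
from pathlib import Path
from PIL import Image

for path in Path("downloads").glob("*.jpg"):  # folder name is made up
    try:
        with Image.open(path) as im:
            if im.format != "JPEG":  # a PNG or WEBP wearing a .jpg name
                im.convert("RGB").save(path, "JPEG")
                print(f"re-encoded {path.name} (was {im.format})")
    except OSError:
        # not an image at all, or truncated; retrain.py would choke on it
        print(f"dropping unreadable file {path.name}")
        path.unlink()
```

Pillow goes by the header bytes rather than the file name, which is also how the training script will see the files, so this catches exactly the ones that would blow up later.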

But even then, we were ready. We were finally ready, after retrain.py was done scolding us with exceedingly verbose errors (if you think I’m complaining: I literally StackOverflowed my way through the entire thing, so yeah, it’s a disingenuous compliment).

* — not recommended
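For the record, once the images behave, the retraining step itself is a single command against the codelab’s retrain.py. Roughly what mine looked like, with stand-in paths (the flags are retrain.py’s own):

```python
# Roughly the retraining step, wrapped in Python; paths are stand-ins.
# retrain.py is the TensorFlow for Poets script doing the scolding above.
import subprocess

subprocess.run([
    "python", "retrain.py",
    "--image_dir", "training_images",        # one sub-folder per label:
                                             # kate_winslet/, hotdog/, not_hotdog/
    "--output_graph", "retrained_graph.pb",  # the frozen model we label with
    "--output_labels", "retrained_labels.txt",
    "--how_many_training_steps", "500",
], check=True)
```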

Problem #2: The Hotdog (or not-Hotdog)

Wait, if I’m writing this article, let’s at least use a good architecture, not our ol’ MobileNet 1.0. That did not work out well, at least until I actually downloaded Inception v3 and it turned out to be (almost) ever so mildly faster than the original configuration. I felt smart.

Interestingly though, Inception processes 299×299 inputs (against MobileNet’s 224×224), though I doubt that was a factor.
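For what it’s worth, swapping the base model is one flag on the same retrain call; a sketch, assuming the same retrain.py and stand-in paths as above:

```python
# Same retrain call as before, with only the base model swapped out.
import subprocess

subprocess.run([
    "python", "retrain.py",
    "--image_dir", "training_images",
    "--architecture", "inception_v3",  # instead of mobilenet_1.0_224
], check=True)
# retrain.py does the resizing itself, so Inception v3's 299x299 input
# mostly shows up as a speed difference, not as extra work for us.
```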

It’s almost like it works.

So they don’t matter? Yes, and no. I’m running a MacBook Pro (no, not rich enough for the 2018 one), and if you use a less or more powerful system, it will affect the results. Also, these are the TF-Slim models (just think of it as Diet Coke if you’re really confused: the taste minus the diabetes, just like this analogy), which means they aren’t quite the full models themselves.

Checkpoint: Where do we stand? How far are we? Do we need to do more of this? Really.

If you’re lucky, you’ll get here without anything else going wrong, and then, when you run the label_image.py script, you’ll be hit with the final boss. So instead we’ll do this hypothesis-and-proof style.

Can StackOverflow correct this KeyError?

Unsurprisingly, yes, and that’s all the QED you need (the shape of the fix is sketched below). With that, we’re really drawing this entire useless affair to a close now. And if you’ve made it this far, that’s a real surprise. Just before we wrap up with the results, a few straight facts: this is your “Hello World!” into TensorFlow, and if you’re running a Pentium, I’m sorry you had to go through this (or, well, sorry anyway).
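For anyone stuck on the same boss fight, the shape of the fix: point label_image.py at the tensor names a retrained Inception v3 graph actually contains. A sketch, using the usual names from that StackOverflow answer ("Mul" in, "final_result" out); yours may differ:

```python
# Labelling an image with the retrained graph; --input_layer and
# --output_layer are what cure the KeyError for an Inception v3 graph.
import subprocess

subprocess.run([
    "python", "label_image.py",
    "--graph", "retrained_graph.pb",
    "--labels", "retrained_labels.txt",
    "--input_layer", "Mul",            # Inception v3's input op, not "input"
    "--output_layer", "final_result",  # the layer retrain.py bolted on
    "--input_height", "299",
    "--input_width", "299",
    "--image", "27.jpg",               # the hotdog from the results below
], check=True)
```

If your graph came from a MobileNet instead, "input" and 224 are the usual values for the input layer and size.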

Problem #3: How much good is too good?

The correct answer is all of it. Now before you ask why, wait for a bit. Let’s dig in.

27.jpg — our hotdog (don’t ask questions); 42.png — our wrong Rose; random.png (reusing the screenshot because that’s the kind of person I am)

And with no further ado (or maybe a bit):

That’s not bad, not bad at all.

That’s a minimum accuracy of 98.8%, and that’s high even by Indian board-exam standards, so you know it’s the real deal.

If we’re going to go into explanations, the 98.8% probably won’t hold up: given that I trained the “not hotdog” class on book covers, flowers, wallpapers, texty stuff, icons and literally anything else I could get my hands on, it simply won’t be a good enough fit for everything that isn’t a hotdog.

Now that I’m done weeping in front of the screen at 2 AM, I think we can observe how exceedingly easy picking up ML is, and I’m saying that because I picked up this particular module yesterday. It is a vast field, with no end in sight, but if you’re going to get into it, there’s no better time.

And if you still think this isn’t too good…

It’s almost kind of sad.
