epilogue (2): ai app

The next step is to understand how to get the AI into an app. I see TensorFlow has a couple of tutorials on that; I will follow one to set it up.

following the TensorFlow Lite guide to complete the two steps circled above, to bring the model I built from the Google Colab notebook -> an app-ready format, in preparation for integration. (Also considered this source, but I am worried about potentially superfluous steps.)
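
As far as I can tell from the guide, the conversion itself is only a few lines in the notebook. Here is a minimal sketch of that step (the model here is just a stand-in for the one I actually trained, and the output file name is a placeholder):

import tensorflow as tf

# Stand-in for the model trained in the Colab notebook (placeholder architecture).
model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(4,))])

# Convert the Keras model into a .tflite file that the app can bundle and load.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()

with open("haiku_model.tflite", "wb") as f:
    f.write(tflite_model)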

This ^ GitHub repo has a good starting package for an iOS app with TensorFlow. It was last updated in June of ’19, fairly recent, so I’m hopeful I won’t hit too many bugs.

extra steps required to build the app (besides the basics of having Xcode, etc.)
found in the README of the examples/lite/speech_commands/ios folder
SUCCESS

unfortunately I had a slight bug after that first run, even though I changed literally one character and then immediately changed it back. I trashed the project, emptied my trash, and was going to repeat the steps, but TensorFlow has a limit on how many downloads you can do; it would not allow me to run a second package for free (and access their files, GPU, etc.). So I decided to move on to actually building my AI and to worry about putting the code in an app afterward.

NEW SOURCE – Jeremy Neiman focuses on building a haiku generator that strictly adheres to the 5-7-5 syllable structure, which, according to his article, posed a problem for the haiku generators of the past

most modern haiku don’t adhere to that structure, which means that a training corpus won’t reflect it

“Generating Haiku with Deep Learning (Part 1)” Jeremy Neiman

this is his generator

here is his github: https://github.com/docmarionum1/haikurnn

note: he uses recurrent neural networks (RNNs) as opposed to convolutional neural networks (CNNs)

according to this Stack Overflow thread, the difference is:

CNN:

  1. CNNs take fixed-size inputs and generate fixed-size outputs.
  2. CNNs are a type of feed-forward artificial neural network – variations of multilayer perceptrons designed to require minimal preprocessing.
  3. CNNs use a connectivity pattern between their neurons inspired by the organization of the animal visual cortex, whose individual neurons are arranged so that they respond to overlapping regions tiling the visual field.
  4. CNNs are ideal for image and video processing.

RNN:

  1. RNNs can handle arbitrary input/output lengths.
  2. RNNs, unlike feedforward neural networks, can use their internal memory to process arbitrary sequences of inputs.
  3. RNNs use time-series information, i.e. what I spoke last will impact what I speak next.
  4. RNNs are ideal for text and speech analysis.
“The model is essentially a character-to-character text generation network with a twist. The number of syllables for each line is provided to the network, passed through a dense layer and then added to the LSTM’s internal state. This means that by changing the three numbers provided, we can alter the behavior of the network. My hope is that this will still allow the network to learn “English” from the whole corpus even though most of the samples are not 5–7–5 haiku, while still allowing us to generate haiku of that length later.”
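
To make that idea concrete, here is a minimal Keras sketch of one way to wire it up (my own illustration, not Neiman’s actual code – the vocabulary size, layer sizes, and variable names are placeholders):

import tensorflow as tf
from tensorflow.keras import layers

vocab_size = 128   # number of distinct characters (placeholder)
lstm_units = 256   # placeholder

chars_in = layers.Input(shape=(None,), dtype="int32", name="characters")
syllables_in = layers.Input(shape=(3,), name="syllable_counts")  # e.g. [5, 7, 5]

x = layers.Embedding(vocab_size, 64)(chars_in)

# Project the three syllable counts into the LSTM's state space...
state = layers.Dense(lstm_units, activation="tanh")(syllables_in)

# ...and use that projection to seed the LSTM's hidden and cell state, so that
# changing the three numbers changes how the network generates characters.
x = layers.LSTM(lstm_units, return_sequences=True)(x, initial_state=[state, state])

next_char = layers.Dense(vocab_size, activation="softmax")(x)

model = tf.keras.Model([chars_in, syllables_in], next_char)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])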

I am using this Google Colab notebook to feed haikus into the training model. I am using the files made available by Neiman (mentioned above) because he has specifically 5-7-5 haiku already compiled, and I am storing them on my personal website: abigailtovastein.com/haiku-source
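
Roughly what that feeding step looks like at the top of the notebook (a sketch – the exact file name under /haiku-source is a placeholder):

import requests

# Pull the compiled 5-7-5 corpus into the Colab runtime
# (the file name stands in for whichever .txt is stored there).
url = "https://abigailtovastein.com/haiku-source/haikus.txt"
haikus = requests.get(url).text.splitlines()

print(len(haikus), "lines")
print(haikus[:3])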

Epoch 1/3
234/234 [==============================] - 5s 20ms/step - loss: 0.0536 - accuracy: 0.9999 - val_loss: 2.5273e-05 - val_accuracy: 1.0000
Epoch 2/3
234/234 [==============================] - 4s 19ms/step - loss: 4.7539e-06 - accuracy: 1.0000 - val_loss: 1.9558e-05 - val_accuracy: 1.0000
Epoch 3/3
234/234 [==============================] - 4s 18ms/step - loss: 4.6387e-06 - accuracy: 1.0000 - val_loss: 1.4532e-05 - val_accuracy: 1.0000
<tensorflow.python.keras.callbacks.History at 0x7f0e67d02588>

Next issue to iron out: as you can see, it is giving itself 100% accuracy, which suggests the task as set up is trivial for the model (or the targets leak into the inputs) rather than that it has genuinely learned anything.

I added more sources. The next .txt training doc I found came from here (GitHub: geoffbass/Haiku-Generator).

Epoch 1/3
258/258 [==============================] - 5s 20ms/step - loss: 0.2487 - accuracy: 0.9357 - val_loss: 0.1500 - val_accuracy: 0.9554
Epoch 2/3
258/258 [==============================] - 5s 18ms/step - loss: 0.1131 - accuracy: 0.9666 - val_loss: 0.1334 - val_accuracy: 0.9592
Epoch 3/3
258/258 [==============================] - 5s 18ms/step - loss: 0.0887 - accuracy: 0.9756 - val_loss: 0.1353 - val_accuracy: 0.9574
<tensorflow.python.keras.callbacks.History at 0x7f642055dba8>

This accuracy is better (no longer a suspicious 100%)…

epilogue: integration

OCR – ABBYY’s code samples for their Cloud OCR SDK:

https://www.ocrsdk.com/documentation/code-samples/

the iOS mobile sample package from ABBYY’s GitHub
this is what the code must be changed to (my ABBYY credentials have been inserted; contact me for the password)…
if replicating, be sure to remove the line above these that says “#error”. It seems Swift lets the programmer hard-code a compile-time error, and the ABBYY developers used this to make extra sure that users remembered to add their own credentials; simply deleting that line solves the issue.

the demo ABBYY iOS package ran on my iPhone XR and came with the default image shown on the left below, but it threw the error (pic on the right) when I tapped the “Take photo” button, I am guessing because it could not access the camera due to privacy restrictions.

Below is the Xcode error screen:

"This app has crashed because it attempted to access privacy-sensitive data without a usage description.  The app's Info.plist must contain an NSCameraUsageDescription key with a string value explaining to the user how the app uses this data."

In order to work within the privacy restrictions, I have to edit Info.plist and add the NSCameraUsageDescription key the error asks for

inserted the highlighted access request

that worked

next error – presents after clicking “Recognize”, once it has loaded, uploaded, and processed the image for text:

NSURLErrorDomain error -1012 (NSURLErrorUserCancelledAuthentication)
this is the output in Xcode
found here: https://forum.ocrsdk.com/search/?Term=inet

I believe I should be editing line 38 of HTTPOperation.m, as it has to do with NSURLConnection; it has a warning stating that it is deprecated and that NSURLSession should be used instead, but I am not sure how to use that… will update when I figure it out

tensor flow & camera app

After reading about machine learning and how convolutional neural networks work, I am ready to begin building a learning system. I start here, in a Google Colab notebook.

First you have to enable the GPU (Edit > Notebook settings > select GPU).
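
A quick sanity check to confirm the runtime actually picked up the GPU is to run something like this in the first cell:

import tensorflow as tf

print(tf.__version__)
# Should list one GPU device if the notebook setting took effect.
print(tf.config.list_physical_devices("GPU"))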

Then the tutorial walks through a way of setting up a learning framework (necessary functions etc).

Here is a link to my Google Colab notebook, which currently looks like this after following the tutorial and getting my bearings:

The next step is to feed in some sort of source of haikus.

I realized that I should not be setting up an image classifier, since I will be working with text (I will be integrating the scanning of text from another source). Thankfully, TensorFlow has training data and a tutorial for that. I will be following that next.
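
For reference, this is roughly the kind of preprocessing those text tutorials start with (a sketch, not the tutorial’s exact code): turning raw strings into sequences of integer ids that a model can train on.

import tensorflow as tf

# Toy corpus standing in for the real haiku files.
haikus = [
    "an old silent pond",
    "a frog jumps into the pond",
    "splash silence again",
]

# char_level=True maps each character (rather than each word) to an integer id.
tokenizer = tf.keras.preprocessing.text.Tokenizer(char_level=True)
tokenizer.fit_on_texts(haikus)
sequences = tokenizer.texts_to_sequences(haikus)

print(sequences[0])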

the output is 2 syllables away from being a haiku right now… (a rough syllable-count check is sketched below)
current error: the train_data type is not consistent between the two declarations… next step is to figure out what the type should be, and see if I can resolve it by making the second declaration match that type
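
On the “2 syllables away” check: here is roughly how a per-line syllable count can be computed, using NLTK’s CMU pronouncing dictionary (an illustration of the idea, not necessarily the exact counter used anywhere in this project):

from nltk.corpus import cmudict  # requires nltk.download("cmudict") once

pronunciations = cmudict.dict()

def count_syllables(word):
    word = word.lower().strip(".,;:!?")
    if word in pronunciations:
        # Vowel phones carry a stress digit (0/1/2); counting them counts syllables.
        return sum(phone[-1].isdigit() for phone in pronunciations[word][0])
    # Crude fallback for words not in the dictionary.
    return max(1, sum(ch in "aeiouy" for ch in word))

line = "an old silent pond"
print(sum(count_syllables(w) for w in line.split()))  # -> 5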

**** SUCCESS – trained a tensorflow model with text ****

the final line

Look at my Google Colab notebook to see it train and run with ~83% accuracy.

To run it, click the Runtime tab > run all.

APP STUFF:

[using: https://developer.apple.com/documentation/avfoundation/cameras_and_media_capture/avcam_building_a_camera_app to learn about how the capture device works]

[overview of managing a capture preview session – might be more relevant since I won’t need to take a picture: https://developer.apple.com/documentation/avfoundation/cameras_and_media_capture/setting_up_a_capture_session]

“An AVCaptureSession is the basis for all media capture in iOS and macOS. It manages your app’s exclusive access to the OS capture infrastructure and capture devices, as well as the flow of data from input devices to media outputs.”

[Figure from Apple’s docs – architecture of an example capture session: separate AVCaptureDeviceInput objects for the camera and microphone connect, through AVCaptureConnection objects managed by the AVCaptureSession, to AVCapturePhotoOutput, AVCaptureMovieFileOutput, and AVCaptureVideoPreviewLayer.]

“All capture sessions need at least one capture input and capture output. Capture inputs (AVCaptureInput subclasses) are media sources—typically recording devices like the cameras and microphone built into an iOS device or Mac. Capture outputs (AVCaptureOutput subclasses) use data provided by capture inputs to produce media, like image and movie files.”

**** SUCCESS – GOT CAMERA TO WORK ****

also includes a camera-flipping function when pressing the button in the top left-hand corner
a little cosmetic development
[this tutorial was very simple and helpful]

app

tutorial source for camera enablement:

📸 Swift Camera — Part 1

https://medium.com/@rizwanm/https-medium-com-rizwanm-swift-camera-part-1-c38b8b773b2

So far I have built a little app that runs in the simulator, but it’s hard to tell if the camera is being accessed from the simulation, so I attempted to run it on my iPhone XR (iOS 13.3). I had an issue with support from Xcode: I had to download support files for iOS 13.3 and place them in the correct dev support folder on my computer. That took care of the issue, but I have now been waiting for this message (screenshot below) to resolve. Online, people also had this problem and reported this message showing for upwards of 40 minutes. I will wait about an hour before shutting down the operation.

^ this is what happens when I try to use the simulator to activate the camera (the iOS Simulator does not provide a camera, so this really has to be tested on a device)

next error to attack (attempting to run on the iPhone)
tried updating Xcode, as suggested by Apple developers in response to this issue, but a new issue arises

upgraded the OS on my computer, then:

success: after all errors were taken care of (by updating my computer, updating Xcode, deleting and then replacing the iOS 13.3 files in the dev support folder for Xcode, and then going into my settings and approving this app on my iPhone), I finally got the app onto my phone

I am considering rebranding to “haiScan”

next error to solve (this is the output when clicking on the camera button)

according to this article, you must now declare access to private data ahead of time (lest the app crash)

This article (from Apple) explains the necessary code more in depth. This is their structure:

switch AVCaptureDevice.authorizationStatus(for: .video) {
    case .authorized: // The user has previously granted access to the camera.
        self.setupCaptureSession()
    
    case .notDetermined: // The user has not yet been asked for camera access.
        AVCaptureDevice.requestAccess(for: .video) { granted in
            if granted {
                self.setupCaptureSession()
            }
        }
    
    case .denied: // The user has previously denied access.
        return

    case .restricted: // The user can't grant access due to restrictions.
        return
}