Thursday, February 15, 2018

Update On RNN's for Predicting Crypto Prices

This is a Recurrent Neural Network diagram from here

Sporadically, I have been working on this little project to both learn more about recurrent neural networks and build something useful to predict future cryptocurrency prices.  As I talked about before, I have been looking into ways of predicting the price on a rolling basis.  As of right now, I am predicting the next day's price from a history of 6 days before.  Let's take a look at what I did.

Recurrent Neural Networks are a good choice for this type of timeseries because they can incorporate new values and keep track of history in order to make new predictions.  I am using Keras to create the network and here is how I built it:

lookback=6
model = Sequential()
batch_size = 1
model.add(LSTM(4, input_shape=(lookback, 1)))
model.add(Dense(1))
model.compile(loss='mse', optimizer='rmsprop', metrics=['mae'])

As you can see, I used 4 Long-Short Term Memory blocks and a lookback of 6 days.  I used "rmsprop" as my optimizer because it is essentially a more advanced gradient descent method which is usually fine for regression tasks.  The Loss Metric chosen was Mean Square Error, which is the classic loss function for regression problems.  I am also keeping track of Mean Absolute Error, just to confirm the results.

The data in this example consists of BTC/USD daily closes from January 2016 to February 2018.  This is the plot of that data.
Before training, I scale the data between 0 and 0.9 to account for higher prices in the future, with a Min-Max Scaler from Sci-kit Learn.  In the future, I may try dividing by 1 million instead, to better account for future prices (I don't see it hitting 1 million any time soon, but it could in the future).   Then I split the data into training and testing datasets with a 67% training split.  During the train, I also check a 20% validation set, just to watch how each iteration of the model performs.  I have plotted these values during the train.  This allows me to see at what point the model begins to over-train. We can see this by looking at the point at which the validation loss (MSE) significantly diverges from the training loss.  This is an image of that plot, with the validation loss filtered to discard noise:


In this example, I have trained to 1000 iterations.  It is kind of tough to see the divergence, but it happens around 125 iterations.  I am curious if I were to leave it training for 10,000 iterations, whether there might be a more clear divergence point.  Anyway, if we train to about 125 iterations, we get a result that looks like the one below.  The green line is the prediction of trained data and the red line is the prediction of the untrained portion of the data.  Although the result is clearly worse, I am pretty happy with how well it did.  


The results are as follows: 
- On Trained data the RMSE is 44.67
- On Test data the RMSE is 1342.08

The question is, how can I improve this result?  My initial thoughts are to experiment with different look-back values, and possibly more LSTM blocks.  However, I suspect that the most practical way to improve the result is to also add in open's, high's and low's as features as well.  This may vastly improve the model because it will be able to see momentum and other patterns at each timestep.  This where I will focus next.


Tuesday, February 13, 2018

Some Computer Vision For Kicks

Over the past few weeks, I have been working on a few different projects to broaden my skills and learn about some new technologies.  One area that I have been jumping into is computer vision.  Recently, I have been working my way through the book "OpenCV with Python Blueprints".  Some of the projects I have done so far include building a Cartoonizer and some Feature Matching.  Let me show you!


This is me just after finishing the Cartoonizer.  As you can see, the program cartoonizes live video and shows the result in real-time.  The process involves:
- Applying a bilateral filter in order to smooth flat areas while keeping edges sharp
- Converting the original to grayscale
- Applying a median blur to reduce image noise
- Using an adaptive threshold to detect edges
- Combining the color image with the edge mask

The result is really great!

Another project I just completed is the Feature Matching project.  The idea here is to find robust features of an image, match them in another image and find the transform that converts between the two images.


Here is an examples of what that looks like in practice.  On the left, is the still image that I took from my computer and on the right is a live image of the scene.  The red lines show where the feature in the first frame is located in the second frame. This seemed to work pretty well for very similar scenes, but had some trouble when I changed the scene significantly.  However, it is not unexpected that it would fail on different scenes, because the features are not very similar at all.

Here is how I did it:
- First I used SURF (Speeded-Up Robust Features) to find some distinctive keypoints.
- Then, I used FLANN (Fast Library for Approximate Nearest Neighbors) to check whether the other frame has similar keypoints and then match them.
- Then, using outlier rejection, I was able to pare down the number of features to only the good ones.
-  Finally, using a perspective transform, I was able to warp the second image so that it matched the first (not shown here).

I am currently in the middle of the 3D scene reconstruction project.  This something I have been meaning to do for a long time and I am currently really enjoying working on it.