A long, long time ago in deep learning time, and about two years ago in human years, I created a convolutional neural net for the Kaggle National Data Science Bowl. In a brave effort I ported a convolutional net for MNIST to the plankton dataset using Python and Theano. More or less a mano in Theano. It worked, and I ended up somewhere halfway down the field.
I remember being somewhat proud and somewhat confused. I spent a lot of time learning deep learning concepts, and a lot of time coding in Theano. There was not really any time or grit left to improve the working model. Deep learning proved to be a lot of work.
Flash forward to today. Within a few hours I am running a better convolutional net on the same dataset. Nothing fancy yet, but it works, and I have good hopes because of VGG16. VGG16 is a one-time winning model for the ImageNet competition. As it turns out, one can create a Franken-VGG16 by chopping off layers and retraining parts of the model specifically for the plankton dataset. Ergo, the feature learning based on approximately 1.5M images is reused. Just like word embeddings are available for download, feature filters will become available for all types of datasets. Progress.
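For the curious, here is a minimal sketch of what such a Franken-VGG16 could look like in Keras. It is not the exact model I ran, just an illustration under some assumptions: the function name `build_franken_vgg16`, the size of the new dense layer, the dropout rate, and the optimizer are all made up for this example. The idea is to load VGG16 without its original classifier, freeze the convolutional layers so the ImageNet features are reused, and train only a small new head on top.

```python
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.applications import VGG16


def build_franken_vgg16(num_classes, input_shape=(224, 224, 3), weights="imagenet"):
    """Hypothetical helper: VGG16 convolutional base plus a new classifier head."""
    # include_top=False chops off the original fully connected ImageNet classifier.
    base = VGG16(weights=weights, include_top=False, input_shape=input_shape)
    # Freeze the convolutional base so the pretrained filters are not retrained.
    base.trainable = False

    inputs = keras.Input(shape=input_shape)
    x = base(inputs, training=False)
    # New head (sizes are assumptions): pool the feature maps, then classify.
    x = layers.GlobalAveragePooling2D()(x)
    x = layers.Dense(256, activation="relu")(x)
    x = layers.Dropout(0.5)(x)
    outputs = layers.Dense(num_classes, activation="softmax")(x)

    model = keras.Model(inputs, outputs)
    model.compile(
        optimizer="adam",
        loss="categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model
```

Calling `build_franken_vgg16(num_classes=121)` would download the ImageNet weights on first use; 121 is the number of plankton classes as I remember it from the contest, so treat it as an assumption. The plankton images are grayscale, while VGG16 expects three channels, so the images would need to be repeated across channels before being fed in. Once the new head has settled, one could unfreeze the last convolutional block and continue training with a low learning rate.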
I added one of the plankton images as an example. The contest is aimed at identifying plankton species to assess the biodiversity in oceans.
As mentioned before, Keras is just kick-ass for creating neural nets. The Keras API lets you create deep neural nets quickly and just get on with it. Very impressive. Also, many thanks to Jeremy Howard for democratizing deep learning.