While most of the media is harping on AI taking over the world, the reality of the field is much more mundane, but no less impactful. We are already seeing many applications in the medical field, with Enlitic and MetaMind diagnosing illnesses and potential dangers far sooner than a doctor could. Lawyers are able to use Legal Robot or Beagle to dig through billions of pages of contracts to find exactly the piece of information they need. However, what the industry still really needs is a simple application of deep learning that transforms how the general public views the technology. And for that, I think personalization engines are the right answer.
Recently VIV debuted its flagship product at TechCrunch Disrupt to great fanfare, but unfortunately the results didn’t quite match up to the hype. Let’s first talk about what they did right. The founders at VIV have spent a lot of time thinking about what the ideal experience for a digital personal assistant might look like and have come up with four main criteria:
For my Stanford Convolutional Neural Networks course, I partnered with a brilliant friend of mine to analyze images from a collection of 40,000 digitized works of art by classifying them according to artist, genre, and location. After some standard pre-processing, we employed a modified VGGNet architecture to achieve better than state-of-the-art results on artist and genre classification. Along the way, though, we hit a number of roadblocks, and saw “Error allocating 86118400 bytes of device memory (out of memory). Driver report 32735232 bytes free and 4294770688 bytes total. Segmentation fault (core dumped)” more times than we would like to remember. In getting our network to run properly, we encountered a number of problems and worked out solutions to them.
So if you’ve started studying RNNs, you’ve probably heard that LSTMs and GRUs are the types of RNN you should use, because vanilla RNNs suffer from the vanishing gradient problem. That makes sense because the hidden state is passed along at each iteration, so when back-propagating, the same Jacobian matrix is multiplied by itself over and over again. If that matrix has a principal (largest-magnitude) eigenvalue less than one, then we have a vanishing gradient. Incidentally, if the matrix has a principal eigenvalue greater than one: exploding gradient.
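To make the eigenvalue argument concrete, here is a minimal sketch in plain Python. It assumes a single fixed 2x2 Jacobian reused at every time step, which is a simplification; the two matrices and the helper name backprop_norms are made up for illustration and do not come from any real network.

```python
import math


def backprop_norms(jacobian, steps, grad=(1.0, 1.0)):
    """Return the gradient norm after each of `steps` backward steps.

    Each backward step multiplies the gradient by the transposed Jacobian,
    so after t steps the gradient has been hit by the same matrix t times.
    """
    transposed = list(zip(*jacobian))
    grad = list(grad)
    norms = []
    for _ in range(steps):
        grad = [sum(j * g for j, g in zip(row, grad)) for row in transposed]
        norms.append(math.sqrt(sum(g * g for g in grad)))
    return norms


# Hypothetical Jacobians; both have eigenvectors (1, 1) and (1, -1).
shrinking = [[0.5, 0.25], [0.25, 0.5]]  # eigenvalues 0.75 and 0.25
growing = [[1.0, 0.5], [0.5, 1.0]]      # eigenvalues 1.5 and 0.5

vanishing = backprop_norms(shrinking, steps=50)
exploding = backprop_norms(growing, steps=50)

print(f"principal eigenvalue 0.75: norm after 50 steps = {vanishing[-1]:.3e}")
print(f"principal eigenvalue 1.50: norm after 50 steps = {exploding[-1]:.3e}")
```

Because the starting gradient lies along the principal eigenvector, its norm is scaled by exactly the principal eigenvalue at every step, so it shrinks or grows geometrically. In a real RNN the Jacobian depends on the hidden state and the activation derivative, so it changes from step to step, but the same compounding effect applies.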