Given the goal of creating an optimal recommendation system, one could consider using a Neural Turing Machine (NTM) with a “programmable” head performing read/write operations on the various inputs in order to generate a prediction. The inputs would be an encoding of the situation (shopping for clothes), encodings of the options (sweater, jacket, T-shirt, dress shirt) along with the features of those options (blue/green, soft, polyester, wool), tons of other previously collected data points, and obviously a vector embedding of the individual for whom we are making the recommendation. The prediction would simply be the output of a softmax classifier with thousands of possibilities, each one representing a cluster of clothing items. (There is no need to recommend just one item, because even a mobile UI can display 7-10 items fairly easily and let the customer make the final decision.)
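To make that a little more concrete, here is a minimal sketch, not tied to any particular framework and not a production design, of the two pieces described above: an NTM-style content-based read over a memory of encoded inputs, followed by a softmax over clothing-item clusters. All of the names, dimensions, and random data below are assumptions made purely for illustration.

```python
# Illustrative sketch only: content-based read over a memory of encodings,
# then a softmax classifier over clothing-item clusters.
import numpy as np

def softmax(x):
    x = x - x.max(axis=-1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)

def content_read(memory, key, beta=1.0):
    """Content-based addressing: cosine similarity between the read key and
    each memory row, sharpened by beta and normalized into read weights."""
    sims = memory @ key / (np.linalg.norm(memory, axis=1) * np.linalg.norm(key) + 1e-8)
    w = softmax(beta * sims)   # read weights over memory rows
    return w @ memory          # weighted read vector

rng = np.random.default_rng(0)
d, n_rows, n_clusters = 64, 128, 5000  # embedding size, memory slots, clothing clusters

# Memory would hold encodings of the situation, the candidate items and their
# features, previously collected data points, and the user embedding.
memory = rng.normal(size=(n_rows, d))
user_embedding = rng.normal(size=d)

read_vec = content_read(memory, key=user_embedding, beta=2.0)

# Classifier head: softmax over thousands of clothing-item clusters; the top
# handful of clusters would be surfaced for the customer to choose from.
W = rng.normal(size=(n_clusters, d)) * 0.01
b = np.zeros(n_clusters)
probs = softmax(W @ read_vec + b)
top10 = np.argsort(probs)[::-1][:10]
print("Top-10 recommended clusters:", top10)
```

A real NTM would also use location-based addressing and a write head that updates memory over time; content-based reading is just enough here to convey the shape of the idea.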
While most of the media is harping on AI taking over the world, the reality of the field is much more mundane, but no less impactful. We are already seeing many applications in the medical field, with Enlitic and MetaMind diagnosing illnesses and potential dangers far sooner and faster than a doctor could. Lawyers can use Legal Robot or Beagle to dig through billions of pages of contracts to find exactly the piece of information they need. However, what the industry still needs is a simple application of deep learning that transforms how the general public views the technology. And for that, I think personalization engines are the right answer.
Recently, VIV debuted its flagship product at TechCrunch Disrupt to great fanfare, but unfortunately the results didn’t quite live up to the hype. Let’s first talk about what they did right. The founders at VIV have spent a lot of time thinking about what the ideal experience for a digital personal assistant might look like and have come up with four main criteria:
For my Stanford Convolutional Neural Networks course, I partnered with a brilliant friend of mine to analyze images from a collection of 40,000 digitized works of art by classifying them according to artist, genre, and location. After some standard pre-processing, we employed a modified VGGNet architecture to achieve better-than-state-of-the-art results on artist and genre classification. Along the way, though, we hit a number of roadblocks and saw “Error allocating 86118400 bytes of device memory (out of memory). Driver report 32735232 bytes free and 4294770688 bytes total.” and “Segmentation fault (core dumped)” more times than we would like to remember. In getting our network to run properly, we encountered a number of problems and found solutions to them.
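As an aside on that out-of-memory error: the sketch below is not our course code, just back-of-the-envelope arithmetic showing why shrinking the batch size is usually the first fix. Activation memory in a VGG-style network scales linearly with batch size, and the layer shape used here (64 feature maps over a 224x224 input, stored as 32-bit floats) is an assumption chosen for illustration.

```python
# Rough, illustrative memory math (not the actual training code): the
# activations of one early VGG-style conv layer scale linearly with batch size.
def activation_bytes(batch, channels, height, width, dtype_bytes=4):
    """Approximate size of a single activation tensor in bytes."""
    return batch * channels * height * width * dtype_bytes

# 64 feature maps over a 224x224 input, at a few candidate batch sizes.
for batch in (128, 64, 32):
    mb = activation_bytes(batch, 64, 224, 224) / 1e6
    print(f"batch={batch:3d}: ~{mb:,.0f} MB for one 64x224x224 activation")
```

The failed 86118400-byte allocation in the error above (roughly 86 MB, against only about 32 MB reported free) is exactly this kind of pressure: many tensors of that size have to coexist during a forward/backward pass, and halving the batch size halves nearly all of them.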