Today's search engines are more than just the dumb keyword matchers they used to be. You can ask a question – say, "How tall is the tower in Paris?" – and they will tell you that the Eiffel Tower is 324 meters tall.
How do they do that? As with so many things these days, they use machine learning. Machine learning algorithms are used to create vectors (basically long lists of numbers) that in some sense represent their input, be it text on a website, images, sound, or videos. Bing captures billions of these vectors for all the different kinds of media it indexes. To search the vectors, Microsoft uses an algorithm it calls SPTAG ("Space Partition Tree and Graph"). An input query is converted to a vector, and SPTAG is used to quickly locate an "approximate nearest neighbor" (ANN): a vector that is similar to the input.
This (with some amount of hand-waving) is how the Eiffel Tower question can be answered: a query for "How tall is the tower in Paris?" will be "close" to pages that talk about towers, Paris, and how tall things are. Such pages will almost certainly be about the Eiffel Tower.
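To make the idea of "closeness" concrete, here is a minimal sketch in Python. It is not SPTAG and not how Bing builds its vectors: the word-count `embed` function, the tiny `VOCABULARY`, and the three `PAGES` are all made up for illustration, and the search is an exact comparison against every page rather than an approximate one.

```python
import math
from collections import Counter

# Hypothetical toy vocabulary; a real system learns its vectors with a model.
VOCABULARY = ["tower", "paris", "tall", "bridge", "london", "recipe", "bread"]

# Hypothetical pages standing in for an index of billions of documents.
PAGES = {
    "eiffel-tower": "The Eiffel Tower in Paris is a tall iron tower.",
    "tower-bridge": "Tower Bridge is a bridge in London.",
    "baguette": "A recipe for bread from Paris.",
}


def embed(text, vocabulary):
    """Toy stand-in for a learned embedding: count vocabulary words in the text."""
    for mark in "?.,":
        text = text.replace(mark, " ")
    counts = Counter(text.lower().split())
    return [float(counts[word]) for word in vocabulary]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest(query_vector, index):
    """Exact nearest neighbor: compare the query against every indexed vector."""
    return max(index, key=lambda name: cosine_similarity(query_vector, index[name]))


INDEX = {name: embed(text, VOCABULARY) for name, text in PAGES.items()}
query = embed("How tall is the tower in Paris?", VOCABULARY)
best = nearest(query, INDEX)
print(best)  # prints "eiffel-tower"
```

Comparing the query against every stored vector is fine for three pages, but it does not scale to billions, which is the gap an approximate nearest neighbor algorithm is meant to close.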
Microsoft today released the SPTAG algorithm as MIT-licensed open source software on GitHub. The code is proven and production-grade, and it is used to answer questions in Bing. Developers can use the algorithm to search their own sets of vectors, and to do so quickly: a single machine can handle 250 million vectors and answer 1,000 queries per second. There are several samples and explanations in Microsoft's AI Lab, and Azure will offer a service using the same algorithms.
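A quick back-of-the-envelope calculation shows why those figures call for an approximate search. The two constants below come from the numbers quoted above; the assumption that queries are handled one at a time is ours, made only to get a per-query time budget.

```python
# Figures quoted for a single machine.
VECTORS = 250_000_000
QUERIES_PER_SECOND = 1_000

# Assumption: queries are served one after another, so each gets an equal slice of a second.
time_budget_ms = 1_000 / QUERIES_PER_SECOND

# An exact search compares every query against every stored vector.
brute_force_comparisons_per_second = VECTORS * QUERIES_PER_SECOND

print(f"Time budget per query: {time_budget_ms} ms")
print(f"Exact search would need {brute_force_comparisons_per_second:,} comparisons per second")
```

Under that assumption, an exact search would need 250 billion vector comparisons every second, so an algorithm that examines only a small part of the index for each query has an obvious appeal.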
Microsoft CEO Satya Nadella has on several occasions talked about his desire to "democratize AI" and make it available to everyone. The aim is to make AI not just a centralized, specialized tool that requires considerable expertise, but something that a wide range of developers tackling a wide range of problems can use as part of their toolkit. SPTAG is an example of how Microsoft is putting these words into practice: the combination of Azure and open source means that developers can start with more limited, easy-to-use services and, as their expertise or requirements grow more complex, use SPTAG to build their own services.