I have been working with LUIS since late December 2016 and have noticed some patterns that I think can be very useful when training your models. My observations are based on models that served different purposes. I’m not going to show you screenshots of the UI since it recently changed dramatically, so I’ll focus on the features and trust that you’ll find your way around the UI yourself.
Entities and phrase list features
IMHO, entities are the cornerstone of a LUIS model. They help LUIS pair intents and entities together while allowing the resulting action to benefit from the captured value(s). To take a concrete example, if a user asks this question:
Where to find documentation on SharePoint
you might want to tag SharePoint as software. Of course, while the LUIS UI lets you tag these entities manually, you would rather have LUIS detect them automatically. There are a few ways to make that happen:
- Establishing an utterance pattern: if you enter the utterances “who is stephane eyskens”, “who is rick van rousselt” and “who is gaston lagaffe”, and you mark “stephane eyskens”, “rick van rousselt” and “gaston lagaffe” as entities of type “person”, the next time you enter “who is dilbert”, “dilbert” will be identified automatically as a person. There is of course a risk that LUIS makes erroneous mappings.
- The above example with persons is a hard one to tackle because the number of possible persons is unbounded. However, when dealing with software products, internal system names, etc., which are in theory limited in number, you should make use of phrase list features. As an example, if you create an entity called software and a phrase list with the same name (it is important that the names match) that contains “sharepoint,yammer,onedrive….”, LUIS will automatically resolve these items as entities of type software. The day you deal with new software, simply update the phrase list feature and retrain your model.
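To make the phrase list idea concrete, here is a minimal Python sketch of the behavior described above: values in a list named after an entity are resolved to that entity type. It is only an illustration of the concept, not how LUIS works internally and not a LUIS API; `PHRASE_LISTS` and `tag_entities` are hypothetical names, and the exact-match lookup is a deliberate simplification.

```python
import re

# Hypothetical phrase list, keyed by the entity name it shares.
PHRASE_LISTS = {
    "software": ["sharepoint", "yammer", "onedrive"],
}


def tag_entities(utterance, phrase_lists=PHRASE_LISTS):
    """Return (value, entity_type) pairs for phrase list values found in the utterance."""
    # Lowercase and split into word tokens so "SharePoint" matches "sharepoint".
    tokens = re.findall(r"\w+", utterance.lower())
    found = []
    for token in tokens:
        for entity_type, values in phrase_lists.items():
            if token in values:
                found.append((token, entity_type))
    return found


print(tag_entities("Where to find documentation on SharePoint"))
# [('sharepoint', 'software')]
```

In this sketch, handling a new product only means adding its name to the `software` list, which mirrors updating the phrase list feature and retraining in LUIS.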
Of course, LUIS comes with a set of prebuilt entities, which you should use first if any of them match your requirements.
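Whichever way an entity gets detected, the point is that the resulting action can use the captured value. The Python sketch below shows one way to pull entity values out of a prediction result. The sample dictionary is modelled on the general shape of a LUIS endpoint response (a `topScoringIntent` plus an `entities` list with `entity` and `type` fields), but treat the field names as an assumption to check against your own app's output; the intent name `FindDocumentation`, the scores and the helper `entities_of_type` are made up for illustration.

```python
def entities_of_type(response, entity_type):
    """Return the captured values of every entity with the given type."""
    return [
        entity["entity"]
        for entity in response.get("entities", [])
        if entity.get("type") == entity_type
    ]


# Hand-written sample, not real service output.
sample_response = {
    "query": "Where to find documentation on SharePoint",
    "topScoringIntent": {"intent": "FindDocumentation", "score": 0.97},
    "entities": [
        {
            "entity": "sharepoint",
            "type": "software",
            "startIndex": 31,
            "endIndex": 40,
            "score": 0.95,
        }
    ],
}

print(entities_of_type(sample_response, "software"))
# ['sharepoint']
```

The action behind the intent can then look up documentation for each returned value instead of parsing the raw utterance again.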
Unlike phrase list features, pattern features cannot be used to directly associate a type of entity with values. They are rather a way to improve the machine learning system by prioritizing some words over others. By defining pattern features, you explicitly tell the system that some words or expressions are meaningful to you and should somehow be matched with entities. To give a concrete example, if you create the following pattern feature:
that should be mapped (not automatically) to an entity named incident, the ML algorithm will make use of that pattern feature when utterances such as this one:
I'd like to get information about INC123456
are sent by end users. Unlike a phrase list feature, it will not immediately detect INC123456 as an entity of type incident, but if you train it a little by manually assigning the entity to the value, it will make the association automatically later on. You’ll see a difference if you enter other alphanumeric strings such as ABCD3445, which won’t be mapped because no pattern feature was created for them.
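For readers who want to experiment, the matching behavior can be reproduced outside LUIS with an ordinary regular expression. The pattern below is an assumption on my side, chosen only because it fits the INC123456 example (the letters INC followed by six digits); your real incident numbers may follow a different format, and `INCIDENT_PATTERN` and `find_incident_candidates` are hypothetical names.

```python
import re

# Assumed format: "INC" followed by exactly six digits.
INCIDENT_PATTERN = re.compile(r"\bINC\d{6}\b")


def find_incident_candidates(utterance):
    """Return every substring of the utterance that matches the assumed incident format."""
    return INCIDENT_PATTERN.findall(utterance)


print(find_incident_candidates("I'd like to get information about INC123456"))
# ['INC123456']
print(find_incident_candidates("I'd like to get information about ABCD3445"))
# []
```

Keep in mind that in LUIS the pattern feature is only a hint to the learning algorithm, so a match like this still has to be labeled a few times before the entity is assigned automatically.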
It might sound obvious, but whenever you add a phrase list feature, a pattern feature or an entity, the first thing you should do right after is retrain your model. Then add a few utterances, map the entity values to their corresponding types and retrain again. A few examples are enough for LUIS to detect entities correctly in new utterances.
Happy LUIS training!