NCModel processing logic is defined as a pipeline and a collection of one or more intents to be matched on. The sections below explain what an intent is, how to define it in your model, and how it works.
The goal of the data model implementation is to take the user input text, pass it through the processing pipeline and match the resulting variants to specific user-defined code that will execute for that input. The mechanism that provides this matching is called an intent.
The intent generally refers to the goal that the end-user had in mind when speaking or typing the input utterance. The intent has a declarative part, or template, written in IDL - Intent Definition Language - that strictly defines a particular form of the user input. An intent is also bound to a callback method that will be executed when that intent, i.e. its template, is detected as the best match for a given input. A typical data model will have multiple intents defined, one for each form of the expected user input that the model wants to react to.
For example, a data model for a banking chatbot or analytics application can have multiple intents for each domain-specific group of input such as opening an account, closing an account, transferring money, getting statements, etc.
Intents can be specific or generic in terms of what input they match. Multiple intents can overlap and NLPCraft will disambiguate such cases to select the intent with the overall best match. In general, the most specific intent match wins.
NLPCraft intents are written in Intent Definition Language (IDL). IDL is a relatively straightforward declarative language. For example, here's a simple intent x with two terms a and b:
/* Intent 'x' definition. */
intent=x
    term(a)~{# == 'my_elm'} // Term 'a'.
    term(b)={has(ent_groups, "my_group")} // Term 'b'.
An IDL intent defines a match between the parsed user input, represented as a collection of entities, and the user-defined callback method. IDL intents are bound to their callbacks via Java annotations and can be located in the same Java annotations or in external *.idl files.
You can review the formal ANTLR4 grammar for IDL, but here are the general properties of IDL:
- Comments can be single-line // Comment. as well as multi-line /* Comment. */.
- String literals can use either single quotes ('text') or double quotes ("text"), simplifying IDL usage in Scala - you don't have to escape double quotes. Both quotes can be escaped in a string, i.e. "text with \" quote" or 'text with \' quote'.
- Built-in literals true, false and null are used for boolean and null values.
- Numeric literals can use the '_' character for separation, as in 200_000.
- Keywords: flow fragment import intent meta options term true false null
- Function calls use mathematical notation: IDL length(trim(" text ")) vs. OOP-style " text ".trim().length().
- Logical expressions are evaluated with short-circuiting; the built-in functions if and or_else also provide similar short-circuit evaluation.
An IDL program consists of intent, fragment, or import statements in any order or combination:
intent statement
An intent is defined as one or more terms. Each term is a predicate over an instance of the NCEntity trait. For an intent to match, all of its terms have to evaluate to true. Intent definition can be informally explained using the following full-featured example:
intent=xa
    flow="^(?:login)(^:logout)*$"
    meta={'enabled': true}
    term(a)={month >= 6 && !# != "z" && meta_intent('enabled') == true}[1,3]
    term(b)~{
        @usrTypes = meta_req('user_types')

        (# == 'order' || # == 'order_cancel') && has_all(@usrTypes, list(1, 2, 3))
    }

intent=xb
    options={
        'ordered': false,
        'unused_free_words': true,
        'unused_entities': false,
        'allow_stm_only': false
    }
    term(a)={length("some text") > 0}
    fragment(frag, {'p1': 25, 'p2': {'a': false}})
NOTES:
- intent=xa (line 1), intent=xb (line 11)
xa and xb are the mandatory intent IDs. Intent ID is any arbitrary unique string matching the following lexer template: (UNI_CHAR|UNDERSCORE|LETTER|DOLLAR)+(UNI_CHAR|DOLLAR|LETTER|[0-9]|COLON|MINUS|UNDERSCORE)*
- options={...} (line 12)
Option | Type | Description | Default Value |
---|---|---|---|
ordered | Boolean | Whether or not this intent is ordered. For an ordered intent the specified order of terms is important for matching this intent. If the intent is unordered its terms can be found in any order in the input text. Note that an ordered intent significantly limits the user input it can match. In most cases the ordered intent is only applicable to processing of a formal grammar (like a programming language) and mostly unsuitable for natural language processing. Note that while the | false |
unused_free_words | Boolean | Whether unused free words, i.e. words not used by intent matching, should be ignored (value true) or should cause the intent match to be rejected (value false). Free words are the words in the user input that were not recognized as any entity. Typically, for natural language comprehension it is safe to ignore free words. For a formal grammar, however, this could make the matching logic too loose. | true |
unused_entities | Boolean | Whether unused entities should be ignored (value true) or should cause the intent match to be rejected (value false). By default, the unused entities are not ignored since it is assumed that the user would define a pipeline entity parser on purpose and construct the intent logic appropriately. | false |
allow_stm_only | Boolean | Whether or not the intent can match when all of the matching entities came from STM. By default, this special case is disabled (value false). However, in specific intents designed for a free-form language comprehension scenario, for example SMS messaging, you may want to enable this option. | false |
- flow="^(?:login)(^:logout)*$" (line 2)
Optional. Dialog flow is a history of previously matched intents to match on. If provided, the intent will first match on the history of the previously matched intents, using regular expressions, before processing its terms. Dialog flow specification is a string with a standard Java regular expression. The history of previously matched intents is presented as a space-separated string of intent IDs that were selected as the best match during the current conversation, in reverse chronological order, i.e. with the most recently matched intent ID being the first element in the string. The dialog flow regular expression will be matched against that string representing intent IDs.
In line 2, the ^(?:login)(^:logout)*$ dialog flow regular expression defines that the intent should only match when the immediately preceding intent was login and no logout intents are in the history. If the history is "login order order", this intent will match. However, for the "login logout" or "order login" history this dialog flow will not match.
Note that if dialog flow is defined and it doesn't match the history, the terms of the intent won't be tested at all.
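The history check described above can be sketched in plain Java. This is only an illustration, not NLPCraft's implementation: the class and method names are hypothetical, and the pattern is an assumed expression (written with a negative lookahead) that encodes the rule "login is the most recent intent and logout appears nowhere in the history". Because the pattern is anchored with ^ and $, it behaves the same under find() and matches().

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of the NLPCraft API.
final class DialogFlowSketch {
    // Assumed pattern: 'login' first (most recent), then any intent IDs except 'logout'.
    static final Pattern FLOW =
        Pattern.compile("^login(?:\\s+(?!logout(?:\\s|$))\\S+)*$");

    // The history is expected most recent first, joined by single spaces.
    static boolean flowMatches(List<String> historyMostRecentFirst) {
        String history = String.join(" ", historyMostRecentFirst);
        return FLOW.matcher(history).find();
    }

    public static void main(String[] args) {
        System.out.println(flowMatches(Arrays.asList("login", "order", "order"))); // true
        System.out.println(flowMatches(Arrays.asList("login", "logout")));         // false
        System.out.println(flowMatches(Arrays.asList("order", "login")));          // false
    }
}
```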
- meta={'enabled': true} (line 3)
Optional. Just like most of the components in NLPCraft, the intent can have its own metadata. Intent metadata is defined as a standard JSON object which will be converted into a java.util.Map instance and can be accessed in the intent's terms via the meta_intent() IDL function. The typical use case for declarative intent metadata is to parameterize its behavior, i.e. the behavior of its terms, with clearly defined properties that are provided inside the intent definition itself.
- term(a)={month >= 6 && !# != "z" && meta_intent('enabled') == true}[1,3] (line 4)
term(b)~{
    @usrTypes = meta_req('user_types')
    (# == 'order' || # == 'order_cancel') && has_all(@usrTypes, list(1, 2, 3))
} (line 5)
term(a)={length("some text") > 0} (line 18)
Term is a building block of the intent. Intent must have at least one term. A term has an optional ID, an entity predicate and optional quantifiers. It supports conversation context if it uses the '~' symbol, or not if it uses the '=' symbol in its definition. For a conversational term the system will search for a match using entities from the current request as well as the entities from conversation STM (short-term memory). For a non-conversational term only entities from the current request will be considered.
A term is matched if its entity predicate returns true. The matched term represents one or more entities, sequential or not, that were detected in the user input. Intent has a list of terms (always at least one) that all have to be matched in the user input for the intent to match. Note that a term can be optional if its min quantifier is zero. Whether the order of the terms is important for matching is governed by the intent's ordered parameter.
Term ID (a and b) is optional. It is only required by the @NCIntentTerm annotation to link the term's entities to a formal parameter of the callback method. Note that term ID follows the same lexical rules as intent ID.
Inside the curly brackets { } you can have an optional list of term variables and the mandatory term expression that must evaluate to a boolean value. A term variable name must start with the @ symbol and be unique within the scope of the current term. All term variables must be defined and initialized before the term expression, which must be the last statement in the term:
term(b)~{
    @a = meta_req('a')
    @lst = list(1, 2, 3, 4)

    has_all(@lst, list(@a, 2))
}
Term variable initialization expressions, as well as the term's expression, follow Java-like expression grammar including precedence rules, brackets and logical combinators, as well as built-in IDL function calls:
term={true} // Special case of 'constant' term.
term={
    // Variable declarations.
    @a = round(1.25)
    @b = meta_req('my_prop')

    // Last expression must evaluate to boolean.
    (@a + 2) * @b > 0
}
term={
    // Variable declarations.
    @c = meta_ent('prop')
    @lst = list(1, 2, 3)

    // Last expression must evaluate to boolean.
    abs(@c) > 1 && size(@lst) != 5
}
NOTE: while term variable initialization expressions can have any type - the term's expression itself, i.e. the last expression in the term's body, must evaluate to a boolean result only. Failure to do so will result in a runtime exception during intent evaluation. Note also that such errors cannot be detected during intent compilation phase.
- ? and [1,3] define an inclusive quantifier for that term, i.e. how many times the match for this term should be found. You can use the following quick abbreviations:
  - * is equal to [0,∞]
  - + is equal to [1,∞]
  - ? is equal to [0,1]
  - No quantifier defaults to [1,1]
As mentioned above, the quantifier is inclusive, i.e. [1,3] means that the term should appear once, two times or three times.
- fragment(frag, {'p1': 25, 'p2': {'a': false}}) (line 19)
A fragment reference inserts the terms defined by that fragment in place of the reference. A fragment reference has a mandatory fragment ID parameter and an optional JSON second parameter. The optional JSON parameter parameterizes the behavior of the inserted terms and is available to the terms via the meta_frag() IDL function.
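The quantifier abbreviations from the notes above can be modeled with a small value class. The sketch below is an illustration under stated assumptions: QuantifierSketch and its methods are hypothetical names, not NLPCraft classes, and Integer.MAX_VALUE stands in for ∞.

```java
// Hypothetical model of a term quantifier; not NLPCraft's internal representation.
final class QuantifierSketch {
    final int min;
    final int max; // Integer.MAX_VALUE stands in for "∞".

    QuantifierSketch(int min, int max) {
        if (min < 0 || max < min)
            throw new IllegalArgumentException("Invalid range: [" + min + "," + max + "]");
        this.min = min;
        this.max = max;
    }

    // Expands the abbreviations; an empty suffix means the default [1,1].
    static QuantifierSketch parse(String suffix) {
        switch (suffix) {
            case "":  return new QuantifierSketch(1, 1);
            case "*": return new QuantifierSketch(0, Integer.MAX_VALUE);
            case "+": return new QuantifierSketch(1, Integer.MAX_VALUE);
            case "?": return new QuantifierSketch(0, 1);
            default:
                if (!suffix.matches("\\[\\d+,\\s*\\d+\\]"))
                    throw new IllegalArgumentException("Unsupported quantifier: " + suffix);
                String[] parts = suffix.substring(1, suffix.length() - 1).split(",");
                return new QuantifierSketch(
                    Integer.parseInt(parts[0].trim()),
                    Integer.parseInt(parts[1].trim()));
        }
    }

    // Both bounds are inclusive.
    boolean accepts(int matchCount) { return matchCount >= min && matchCount <= max; }

    // A term is optional when its minimum is zero.
    boolean isOptional() { return min == 0; }

    public static void main(String[] args) {
        QuantifierSketch q = parse("[1,3]");
        System.out.println(q.accepts(0) + " " + q.accepts(3) + " " + q.accepts(4));
        System.out.println(parse("?").isOptional());
    }
}
```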
fragment statement
Fragments allow you to group and name a set of reusable terms. Such groups can be further parameterized at the place of reference and enable the reuse of one or more terms by multiple intents. For example:
// Fragments.
fragment=buzz term~{# == meta_frag('id')}
fragment=when
    term(nums)~{
        // Term variable.
        @type = meta_ent('num:unittype')
        @iseq = meta_ent('num:isequalcondition')

        # == 'num' && @type == 'datetime' && @iseq == true
    }[0,7]

// Intents.
intent=alarm
    // Insert parameterized terms from fragment 'buzz'.
    fragment(buzz, {"id": "x:alarm"})

    // Insert terms from fragment 'when'.
    fragment(when)
NOTES:
- A fragment statement has a name (buzz and when) and a list of terms.
import statement
The import statement allows you to import IDL declarations from either a local file, a classpath resource or a URL:
// Import using absolute path.
import('/opt/globals.idl')

// Import using classpath resource.
import('org/apache/nlpcraft/examples/alarm/intents.idl')

// Import using URL.
import('ftp://user:password@myhost:22/opt/globals.idl')
NOTES:
- An import statement starts with the import keyword and has a string parameter that indicates the location of the resource to import.
During its initialization, NCModelClient scans the provided model class for intents. All found intents are compiled into an internal representation.
Note that not all intent-related problems can be detected at the compilation phase, and NCModelClient can be initialized with intents not being completely validated. For example, each term in the intent must evaluate to a boolean result. This can only be checked at runtime. Another example is the number and the types of parameters passed into IDL function which is only checked at runtime as well.
Intents are compiled only once during the NCModelClient initialization and cannot be re-compiled. Model logic, however, can affect the intent behavior through NCModel callback methods and metadata all of which can change at runtime and are accessible through IDL functions.
Here are a few intent examples with explanations:
Example 1:
intent=a
    term~{# == 'x:type'}
    term(nums)~{# == 'num' && lowercase(meta_ent('num:unittype')) == 'datetime'}[0,2]
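NOTES:
- Intent has ID a.
- Intent uses default options (for example, unused_free_words is true) and the default order (ordered is false).
- Intent has two conversational terms (~) that have to be found for the intent to match. Note that the second term is optional as it has the [0,2] quantifier.
- First term matches any single entity with type x:type.
- Second term matches an entity with type num and with the num:unittype metadata property equal to the 'datetime' string.
- Note the IDL function lowercase used on the num:unittype metadata property value.
- Since the second term has an ID (nums), it can be referenced by the @NCIntentTerm annotation on the callback formal parameter.
Example 2: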
intent=id2
    flow='id1 id2'
    term={# == 'myent' && signum(get(meta_ent('score'), 'best')) != -1}
    term={has_any(ent_groups, list('actors', 'owners'))}
NOTES:
- Intent has ID id2.
- Intent has the dialog flow pattern 'id1 id2'. It expects the sequence of intents id1 and id2 somewhere in the history of previously matched intents in the course of the current conversation.
- Intent has two non-conversational terms (=). Both terms have to be present only once (their implicit quantifiers are [1,1]).
- First term should be an entity with type myent and have the metadata property score of type map. This map should have a value with the string key 'best'. The signum of this map value should not equal -1. Note that meta_ent(), get() and signum() are all built-in IDL functions.
- Second term should be an entity that belongs to either the actors or the owners group.
IDL provides over 100 built-in functions that can be used in IDL intent definitions. An IDL function call takes the traditional fun_name(p1, p2, ... pk) syntax form. If a function has no parameters, the brackets are optional. IDL functions operate on a stack: parameters are taken from the stack and the result is put back onto the stack, which in turn can become a parameter for the next function call and so on. IDL functions can have zero or more parameters and always have one result value. Some IDL functions support a variable number of parameters.
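The stack discipline described above can be illustrated with a toy evaluator. This is a minimal sketch, not the IDL runtime: StackEvalSketch and its two supported functions are hypothetical, and it simply shows how the result of trim becomes the parameter of length when evaluating length(trim(" text ")).

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy stack evaluator for illustration only; not the actual IDL runtime.
final class StackEvalSketch {
    // Pops the single argument, applies the function and pushes the result back.
    static void call(String fn, Deque<Object> stack) {
        Object arg = stack.pop();
        switch (fn) {
            case "trim":
                stack.push(((String) arg).trim());
                break;
            case "length":
                // Numeric results are kept as Long, mirroring the IDL 'Long' type.
                stack.push(Long.valueOf(((String) arg).length()));
                break;
            default:
                throw new IllegalArgumentException("Unknown function: " + fn);
        }
    }

    // Evaluates length(trim(" text ")) innermost call first.
    static Object eval() {
        Deque<Object> stack = new ArrayDeque<>();
        stack.push(" text ");   // Literal parameter.
        call("trim", stack);    // Stack now holds "text".
        call("length", stack);  // Stack now holds 4.
        return stack.pop();
    }

    public static void main(String[] args) {
        System.out.println(eval()); // 4
    }
}
```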
Special Shorthand #
The frequently used IDL function ent_type() has a special shorthand #. For example, the following expressions are all equal:
ent_type() == 'type'
ent_type == 'type' // Remember - empty parens are optional.
# == 'type'
When chaining function calls IDL uses mathematical notation (a-la Python) rather than object-oriented notation: IDL length(trim(" text ")) vs. OOP-style " text ".trim().length().
IDL functions operate with the following types:
JVM Type | IDL Name | Notes |
---|---|---|
java.lang.String | String | |
java.lang.Long java.lang.Integer java.lang.Short java.lang.Byte | Long | Smaller numerical types will be converted to java.lang.Long . |
java.lang.Double java.lang.Float | Double | java.lang.Float will be converted to java.lang.Double . |
java.lang.Boolean | Boolean | You can use true or false literals. |
java.util.List<T> | List[T] | Use list(...) IDL function to create new list. |
java.util.Map<K,V> | Map[K,V] | |
NCEntity | Entity | |
java.lang.Object | Any | Any of the supported types above. Use null literal for null value. |
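The numeric widening rules in the table can be sketched as a small normalization step. This is an illustration only: IdlTypeSketch.normalize is a hypothetical helper, not an NLPCraft method, and it covers just the conversions named in the table.

```java
// Hypothetical helper showing the numeric conversions listed in the table.
final class IdlTypeSketch {
    // Widens smaller integral types to Long and Float to Double; other values pass through.
    static Object normalize(Object v) {
        if (v instanceof Integer || v instanceof Short || v instanceof Byte)
            return Long.valueOf(((Number) v).longValue());
        if (v instanceof Float)
            return Double.valueOf(((Number) v).doubleValue());
        return v;
    }

    public static void main(String[] args) {
        System.out.println(normalize(42).getClass().getSimpleName());    // Long
        System.out.println(normalize(1.5f).getClass().getSimpleName());  // Double
        System.out.println(normalize("text").getClass().getSimpleName()); // String
    }
}
```

Values placed into metadata containers can be checked the same way before they are used in IDL expressions, since unsupported types only surface at runtime.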
Some IDL functions are polymorphic, i.e. they can accept arguments and return result of multiple types. Encountering unsupported types will result in a runtime error during intent matching. It is especially important to watch out for the types when adding objects to various metadata containers and using that metadata in the IDL expressions.
Unsupported Types
Detection of the unsupported types by IDL functions cannot be done during IDL compilation and can only be done during runtime execution. This means that even though the model compiles IDL intents and NCModelClient starts successfully - it does not guarantee that intents will operate correctly.
All IDL functions are organized into the following groups:
Description:
Returns entity type for the current entity (default) or the provided one by the optional parameter t. Note that this function has a special shorthand #.
Usage:
// Result: 'true' if the current entity type is equal to 'my_type'.
ent_type == 'my_type'
# == 'my_type'
ent_type(ent_this) == 'my_type'
#(ent_this) == 'my_type'

Description:
Gets the list of groups the current entity (default) or the provided one by the optional parameter t belongs to. Note that, by default, if not specified explicitly, an entity always belongs to one group with type equal to the entity type. May return an empty list but never null.
Usage:
// Result: list of groups this entity belongs to.
ent_groups
ent_groups(ent_this)

Description:
Returns current entity.
Usage:
// Result: current entity.
ent_this

Description:
Returns entity's original text. If t is not provided the current entity is assumed.
Usage:
// Result: entity original input text.
ent_text

Description:
Returns entity's index in the original input. Note that this is an index of the entity and not of the character. If t is not provided the current entity is assumed.
Usage:
// Result: 'true' if index of this entity in the original input is equal to 1.
ent_index == 1
ent_index(ent_this) == 1

Description:
Returns true if this entity is the first in the original input. Note that this checks the index of the entity and not of the character. If t is not provided the current entity is assumed.
Usage:
// Result: 'true' if this entity is the first entity in the original input.
ent_is_first
ent_is_first(ent_this)

Description:
Returns true if this entity is the last in the original input. Note that this checks the index of the entity and not of the character. If t is not provided the current entity is assumed.
Usage:
// Result: 'true' if this entity is the last entity in the original input.
ent_is_last
ent_is_last(ent_this)
Description:
Returns true if there is an entity with type type after this entity.
Usage:
// Result: 'true' if there is an entity with type 'a' after this entity.
ent_is_before_type('a')

Description:
Returns true if there is an entity with type type before this entity.
Usage:
// Result: 'true' if there is an entity with type 'a' before this entity.
ent_is_after_type('a')

Description:
Returns true if this entity is located between entities with types type1 and type2.
Usage:
// Result: 'true' if this entity is located after entity with type 'before' and before the entity with type 'after'.
ent_is_between_types('before', 'after')

Description:
Returns true if this entity is located between entities with group IDs grp1 and grp2.
Usage:
// Result: 'true' if this entity is located after entity belonging to the group 'before' and before the entity belonging to the group 'after'.
ent_is_between_groups('before', 'after')

Description:
Returns true if there is an entity that belongs to the group grp after this entity.
Usage:
// Result: 'true' if there is an entity that belongs to the group 'grp' after this entity.
ent_is_before_group('grp')

Description:
Returns true if there is an entity that belongs to the group grp before this entity.
Usage:
// Result: 'true' if there is an entity that belongs to the group 'grp' before this entity.
ent_is_after_group('grp')

Description:
Returns all entities from the original input.
Usage:
// Result: list of all entities for the original input.
ent_all

Description:
Returns number of entities from the original input. It is equivalent to size(ent_all).
Usage:
// Result: number of all entities for the original input.
ent_count

Description:
Returns list of entities from the original input with type type.
Usage:
// Result: list of entities for the original input that have type 'type'.
ent_all_for_type('type')

Description:
Returns list of entities from the original input that belong to the group grp.
Usage:
// Result: list of entities for the original input that belong to the group 'grp'.
ent_all_for_group('grp')
Description:
Returns size or length of the given string, list or map. This function has aliases: size and count.
Usage:
// Result: 9
length("some text")
// Result: 3
@lst = list(1, 2, 3)
size(@lst)
count(@lst)

Description:
Returns true if string s matches Java regular expression rx, false otherwise.
Usage:
regex('textabc', '^text.*$') // Returns 'true'.
regex('_textabc', '^text.*$') // Returns 'false'.

Description:
Calls String.trim() on given parameter p and returns its result. This function has alias: strip.
Usage:
// Result: "text"
trim(" text ")
strip(" text ")

Description:
Calls String.toUpperCase() on given parameter p and returns its result.
Usage:
// Result: "TEXT"
uppercase("text")

Description:
Calls String.toLowerCase() on given parameter p and returns its result.
Usage:
// Result: "text"
lowercase("TeXt")

Description:
Calls Apache Commons StringUtils.isAlpha() on given parameter p and returns its result.
Usage:
// Result: true
is_alpha("text")

Description:
Calls Apache Commons StringUtils.isAlphanumeric() on given parameter p and returns its result.
Usage:
// Result: true
is_alphanum("text123")

Description:
Calls Apache Commons StringUtils.isWhitespace() on given parameter p and returns its result.
Usage:
// Result: false
is_whitespace("text123")
// Result: true
is_whitespace(" ")

Description:
Calls Apache Commons StringUtils.isNumeric() on given parameter p and returns its result.
Usage:
// Result: true
is_num("123")

Description:
Calls Apache Commons StringUtils.isNumericSpace() on given parameter p and returns its result.
Usage:
// Result: true
is_numspace(" 123")

Description:
Calls Apache Commons StringUtils.isAlphaSpace() on given parameter p and returns its result.
Usage:
// Result: true
is_alphaspace(" text ")

Description:
Calls Apache Commons StringUtils.isAlphanumericSpace() on given parameter p and returns its result.
Usage:
// Result: true
is_alphanumspace(" 123 text ")
Description:
Calls p1.split(p2) and returns its result converted to the list.
Usage:
// Result: [ "a", "b", "c" ]
split("a|b|c", "|")

Description:
Calls p1.split(p2) converting the result to the list. Then calls String.strip() on each element.
Usage:
// Result: ["a", "b", "c"]
split_trim("a | b | c", "|")

Description:
Calls p1.startsWith(p2) and returns its result.
Usage:
// Result: true
starts_with("abc", "ab")

Description:
Calls p1.endsWith(p2) and returns its result.
Usage:
// Result: true
ends_with("abc", "bc")

Description:
Calls p1.contains(p2) and returns its result.
Usage:
// Result: true
contains("abc", "bc")

Description:
Calls p1.substring(p2, p3) and returns its result.
Usage:
// Result: "bc"
substr("abc", 1, 3)

Description:
Calls p1.replace(p2, p3) and returns its result.
Usage:
// Result: "aBC"
replace("abc", "bc", "BC")

Description:
Converts given integer or string to double value.
Usage:
// Result: 1.2
to_double("1.2")
// Result: 1.0
to_double(1)

Description:
Converts given double or string to integer value. In case of a double value it will be rounded to the nearest integer value.
Usage:
// Result: 1
to_int("1.2")
to_int(1.2)
Description:
Returns absolute value for parameter x.
Usage:
// Result: 1
abs(-1)
// Result: 1.5
abs(-1.5)

Description:
Returns square of x.
Usage:
// Result: 4
square(2)

Description:
Returns PI constant.
Usage:
// Result: 3.14159265359
pi

Description:
Returns Euler constant.
Usage:
// Result: 0.5772156649
euler

Description:
Returns maximum value for given list. Throws runtime exception if the list is empty. This function uses a natural ordering.
Usage:
// Result: 3
max(list(1, 2, 3))

Description:
Returns minimum value for given list. Throws runtime exception if the list is empty. This function uses a natural ordering.
Usage:
// Result: 1
min(list(1, 2, 3))

Description:
Returns average (mean) value for given list of ints, doubles or strings. Throws runtime exception if the list is empty. If the list contains strings, they have to be convertible to int or double.
Usage:
// Result: 2.0
avg(list(1, 2, 3))
avg(list("1.0", 2, "3"))
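The mixed-type behavior described for avg (numbers and numeric strings in one list, runtime failure on an empty list) can be sketched as follows. AvgSketch is a hypothetical illustration, not the IDL implementation, and the exception types are assumptions.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of averaging a list of numbers and numeric strings.
final class AvgSketch {
    // Converts a single element to double; strings must be parseable as numbers.
    static double toDouble(Object v) {
        if (v instanceof Number) return ((Number) v).doubleValue();
        if (v instanceof String) return Double.parseDouble((String) v);
        throw new IllegalArgumentException("Unsupported element: " + v);
    }

    // Mean of all elements; an empty list is rejected, as in the description above.
    static double avg(List<?> values) {
        if (values.isEmpty())
            throw new IllegalArgumentException("List cannot be empty.");
        double sum = 0;
        for (Object v : values) sum += toDouble(v);
        return sum / values.size();
    }

    public static void main(String[] args) {
        System.out.println(avg(Arrays.asList(1, 2, 3)));       // 2.0
        System.out.println(avg(Arrays.asList("1.0", 2, "3"))); // 2.0
    }
}
```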
Description:
Returns standard deviation value for given list of ints, doubles or strings. Throws runtime exception if the list is empty. If the list contains strings, they have to be convertible to int or double.
Usage:
stdev(list(1, 2, 3))
stdev(list("1.0", 2, "3"))

Description:
Returns new list with given parameters.
Usage:
// Result: []
list
// Result: [1, 2, 3]
list(1, 2, 3)
// Result: ["1", true, 1.25]
list("1", 2 == 2, to_double('1.25'))

Description:
Gets element from either list or map c. For a list, k must be a 0-based integer index in the list.
Usage:
// Result: 1
get(list(1, 2, 3), 0)
// Result: true
get(json('{"a": true}'), "a")

Description:
Calls c.contains(x) and returns its result.
Usage:
// Result: true
has(list("a", "b"), "a")

Description:
Calls c.containsAll(x) and returns its result.
Usage:
// Result: false
has_all(list("a", "b"), "a")

Description:
Checks if list c contains any of the elements from the list x.
Usage:
// Result: true
has_any(list("a", "b"), list("a"))
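For readers mapping has, has_all and has_any to the Java collection calls named in their descriptions, here is a minimal sketch. ContainsSketch and its method names are hypothetical; using Collections.disjoint for the "any" check is an assumption about one reasonable way to express it, not a statement about the IDL internals.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;

// Hypothetical Java counterparts of the three membership checks.
final class ContainsSketch {
    // Single element membership.
    static boolean has(Collection<?> c, Object x) { return c.contains(x); }

    // True only if every element of x is in c.
    static boolean hasAll(Collection<?> c, Collection<?> x) { return c.containsAll(x); }

    // True if c and x share at least one element.
    static boolean hasAny(Collection<?> c, Collection<?> x) { return !Collections.disjoint(c, x); }

    public static void main(String[] args) {
        Collection<String> c = Arrays.asList("a", "b");
        System.out.println(has(c, "a"));                         // true
        System.out.println(hasAll(c, Arrays.asList("a", "z")));  // false
        System.out.println(hasAny(c, Arrays.asList("a", "z")));  // true
    }
}
```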
Description:
Returns first element from the list c or null if the list is empty.
Usage:
// Result: "a"
first(list("a", "b"))

Description:
Returns last element from the list c or null if the list is empty.
Usage:
// Result: "b"
last(list("a", "b"))

Description:
Returns list of keys for map m.
Usage:
// Result: ["a", "b"]
keys(json('{"a": true, "b": 1}'))

Description:
Returns list of values for map m.
Usage:
// Result: [true, 1]
values(json('{"a": true, "b": 1}'))

Description:
Returns reversed list x. This function uses the natural sorting order.
Usage:
// Result: [3, 2, 1]
reverse(list(1, 2, 3))

Description:
Returns sorted list x. This function uses the natural sorting order.
Usage:
// Result: [1, 2, 3]
sort(list(2, 1, 3))

Description:
Checks if given string, list or map x is empty.
Usage:
// Result: false
is_empty("text")
is_empty(list(1))
is_empty(json('{"a": 1}'))

Description:
Checks if given string, list or map x is non-empty.
Usage:
// Result: true
non_empty("text")
non_empty(list(1))
non_empty(json('{"a": 1}'))

Description:
Makes list x distinct.
Usage:
// Result: [1, 2, 3]
distinct(list(1, 2, 2, 3, 1))

Description:
Concatenates lists x1 and x2.
Usage:
// Result: [1, 2, 3, 4]
concat(list(1, 2), list(3, 4))
Description:
Gets entity metadata property p.
Usage:
// Result: 'nlp:token:text' entity metadata property.
meta_ent('nlp:token:text')

Description:
Gets request metadata property p.
Usage:
// Result: 'my:prop' user request data property.
meta_req('my:prop')

Description:
Gets intent metadata property p.
Usage:
// Result: 'my:prop' intent metadata property.
meta_intent('my:prop')

Description:
Gets conversation metadata property p.
Usage:
// Result: 'my:prop' conversation metadata property.
meta_conv('my:prop')

Description:
Gets fragment metadata property p. Fragment metadata can be optionally passed in when referencing the fragment to parameterize it.
Usage:
// Result: 'my:prop' fragment metadata property.
meta_frag('my:prop')

Description:
Gets system property or environment variable p.
Usage:
// Result: 'java.home' system property.
meta_sys('java.home')
// Result: 'HOME' environment variable.
meta_sys('HOME')

Description:
Gets configuration property p.
Usage:
// Result: 'my:prop' configuration property.
meta_cfg('my:prop')
Description:
Returns current year.
Usage:
// Result: 2021
year

Description:
Returns current month: 1 ... 12.
Usage:
// Result: 5
month

Description:
Returns current day of the month: 1 ... 31.
Usage:
// Result: 5
day_of_month

Description:
Returns current day of the week: 1 ... 7.
Usage:
// Result: 5
day_of_week

Description:
Returns current day of the year: 1 ... 366.
Usage:
// Result: 51
day_of_year

Description:
Returns current hour: 0 ... 23.
Usage:
// Result: 11
hour

Description:
Returns current minute: 0 ... 59.
Usage:
// Result: 11
minute

Description:
Returns current second: 0 ... 59.
Usage:
// Result: 11
second

Description:
Returns current week of the month: 1 ... 4.
Usage:
// Result: 2
week_of_month

Description:
Returns current week of the year: 1 ... 56.
Usage:
// Result: 21
week_of_year

Description:
Returns current quarter: 1 ... 4.
Usage:
// Result: 2
quarter

Description:
Returns current time in milliseconds.
Usage:
// Result: 122312341212
now
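To make the value ranges of the date functions above concrete, the sketch below computes the corresponding fields for a fixed date with java.time. That the IDL functions are backed by java.time is an assumption made for illustration; DateFieldsSketch is a hypothetical class.

```java
import java.time.LocalDate;
import java.time.temporal.IsoFields;

// Illustration of the calendar fields using java.time; the mapping to IDL functions is assumed.
final class DateFieldsSketch {
    // Quarter of the year, 1 to 4.
    static int quarter(LocalDate d) { return d.get(IsoFields.QUARTER_OF_YEAR); }

    public static void main(String[] args) {
        LocalDate d = LocalDate.of(2021, 5, 20);

        System.out.println(d.getYear());       // 2021 (compare: year)
        System.out.println(d.getMonthValue()); // 5    (compare: month)
        System.out.println(d.getDayOfMonth()); // 20   (compare: day_of_month)
        System.out.println(d.getDayOfYear());  // 140  (compare: day_of_year)
        System.out.println(quarter(d));        // 2    (compare: quarter)
    }
}
```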
Description:
Gets UTC/GMT timestamp in ms when user input was received.
Usage:
// Result: input receive timestamp in ms.
req_tstamp

Description:
Returns user ID.
Usage:
// Result: user ID.
user_id

Description:
This function provides an 'if-then-else' equivalent as IDL does not provide branching on the language level. This function will evaluate the c parameter and return either the then value if it evaluates to true or the else value if it evaluates to false. Note that evaluation will be short-circuit, i.e. either then or else will actually be computed but not both.
Usage:
// Result:
// - 'list(1, 2, 3)' if 1st parameter is 'true'.
// - 'null' if 1st parameter is 'false'.
if(meta_model('my_prop') == true, list(1, 2, 3), null)

Description:
Converts JSON in the p parameter to a map. Use a single-quoted string to avoid escaping double quotes in JSON.
Usage:
// Result: Map.
json('{"a": 2, "b": [1, 2, 3]}')

Description:
Converts the p parameter to a string. In case of a list this function will convert individual list elements to strings and return the list of strings.
Usage:
// Result: "1.25"
to_string(1.25)
// Result: list("1", "2", "3")
to_string(list(1, 2, 3))

Description:
Returns p if it is not null, a otherwise. Note that evaluation will be short-circuit, i.e. a will be evaluated only if p is null.
Usage:
// Result: 'some_prop' model metadata or 'text' if one is 'null'.
@dflt = 'text'
or_else(meta_model('some_prop'), @dflt)
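The short-circuit behavior described for if and or_else can be illustrated in Java with deferred suppliers. This is a sketch of the evaluation idea only: ShortCircuitSketch and its methods are hypothetical names and say nothing about how the IDL runtime is actually implemented.

```java
import java.util.function.Supplier;

// Hypothetical illustration of short-circuit 'if' and 'or_else' using deferred evaluation.
final class ShortCircuitSketch {
    // Only the selected branch is computed.
    static <T> T ifThenElse(boolean cond, Supplier<T> thenVal, Supplier<T> elseVal) {
        return cond ? thenVal.get() : elseVal.get();
    }

    // The alternative is computed only when the value is null.
    static <T> T orElse(T value, Supplier<T> alternative) {
        return value != null ? value : alternative.get();
    }

    public static void main(String[] args) {
        int[] evaluations = {0}; // Counts how many times the fallback is actually computed.

        String a = orElse("prop", () -> { evaluations[0]++; return "text"; });
        String b = orElse(null, () -> { evaluations[0]++; return "text"; });
        String c = ifThenElse(true, () -> "then", () -> { evaluations[0]++; return "else"; });

        System.out.println(a + " " + b + " " + c); // prop text then
        System.out.println(evaluations[0]);        // 1
    }
}
```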
IDL declarations can be placed in different locations based on user preferences:
- The @NCIntent Java annotation takes a string as its parameter that should be a valid IDL declaration. For example, Scala code snippet:
@NCIntent("import('/opt/myproj/global_fragments.idl')") // Importing.
@NCIntent("intent=act term(act)={has(ent_groups, 'act')} fragment(f1)") // Defining in place.
def onMatch(
    @NCIntentTerm("act") actEnt: NCEntity,
    @NCIntentTerm("loc") locEnts: List[NCEntity]
): NCResult = {
    ...
}
- External *.idl files contain IDL declarations and can be imported in any other places where IDL declarations are allowed. See the import statement explanation above. For example:
/*
 * File 'my_intents.idl'.
 * ======================
 */

import('/opt/globals.idl') // Import global intents and fragments.

// Fragments.
// ----------
fragment=buzz term~{# == 'x:alarm'}
fragment=when
    term(nums)~{
        // Term variables.
        @type = meta_ent('num:unittype')
        @iseq = meta_ent('num:isequalcondition')

        # == 'num' && @type != 'datetime' && @iseq == true
    }[0,7]

// Intents.
// --------
intent=alarm
    fragment(buzz)
    fragment(when)
IDL intents must be bound to their callback methods. This binding is accomplished using the following Java annotations:
Annotation | Target | Description |
---|---|---|
@NCIntent | Callback method or model class | When applied to a method this annotation defines an IDL intent in-place on the method serving as its callback. This annotation can also be applied to a model's class, in which case it will just declare the intent without binding it and the callback method will need to use the @NCIntentRef annotation to bind to it. This approach is ideal for simple intents and quick declaration right in the source code and has all the benefits of having IDL be part of the source code. However, a multi-line IDL declaration can be awkward to add and maintain depending on Scala language support, i.e. multi-line string literals. In such cases it is advisable to move IDL declarations into separate *.idl files. |
@NCIntentRef | Callback method | This annotation references an intent defined elsewhere, such as in an external *.idl file or in other @NCIntent annotations. In real applications, this is the most common way to bind an externally defined intent to its callback method. |
@NCIntentObject | Model class field | Marker annotation that can be applied to a class member of the main model. The objects held in fields marked with this annotation are scanned the same way as the main model. |
@NCIntentTerm | Callback method parameter | This annotation marks a formal callback method parameter to receive term's entities when the intent to which this term belongs is selected as the best match. |
Here are a couple of examples that illustrate the basics of intent declaration and usage.
An intent from the Light Switch example:
@NCIntent("intent=ls term(act)={has(ent_groups, 'act')} term(loc)={# == 'ls:loc'}*")
def onMatch(
    ctx: NCContext,
    im: NCIntentMatch,
    @NCIntentTerm("act") actEnt: NCEntity,
    @NCIntentTerm("loc") locEnts: List[NCEntity]
): NCResult = {
    ...
}
NOTES:
- The intent is defined in place using the @NCIntent annotation.
- The intent has two non-conversational terms, act (mandatory) and loc (zero or more entities), with method onMatch(...) as its callback.
- A term is conversational if it uses '~' and non-conversational if it uses '=' in its definition. If a term is conversational, the matching algorithm will look into the conversation context, i.e. short-term memory (STM), for entities matching this term. Terms that were fully or partially matched using entities from the conversation context contribute a smaller weight to the overall intent matching weight since these terms are less specific. Non-conversational terms are matched using only entities found in the current user input, without looking at the conversation context.
- Method onMatch(...) will be called if and when this intent is selected as the best match.
- Terms have min=1, max=1 quantifiers by default, i.e. one and only one.
- The first term matches an entity that belongs to the group act. Note that model elements can belong to multiple groups.
- Both terms have IDs (act and loc) that are used in onMatch(...) method parameters to automatically assign terms' entities to the formal method parameters using @NCIntentTerm annotations.

In the following Time example the intent is defined on the model class and referenced in code using the @NCIntentRef annotation:
@NCIntent("fragment=city term(city)~{# == 'opennlp:location'}")
@NCIntent("intent=intent2 term~{# == 'x:time'} fragment(city)")
class TimeModel extends NCModel( ...
    @NCIntentRef("intent2")
    private def onRemoteMatch(
        ctx: NCContext,
        im: NCIntentMatch,
        @NCIntentTerm("city") cityEnt: NCEntity
    ): NCResult = ...
NOTES:
- The intent is defined on the model class by the second @NCIntent annotation; the first one declares the fragment city that the intent uses.
- @NCIntentRef("intent2") references this intent, with method onRemoteMatch(...) as its callback.
- A term is conversational if it uses '~' and non-conversational if it uses '=' in its definition. If a term is conversational, the matching algorithm will look into the conversation context, i.e. short-term memory (STM), for entities matching this term. Terms that were fully or partially matched using entities from the conversation context contribute a smaller weight to the overall intent matching weight since these terms are less specific. Non-conversational terms are matched using only entities found in the current user input, without looking at the conversation context.
- Method onRemoteMatch(...) will be called when this intent is the best match detected.
- Terms have min=1, max=1 quantifiers by default.
- The first term matches a single mandatory (min=1, max=1) entity with ID x:time, whose element is defined in the model.
- The second term matches a single mandatory (min=1, max=1) entity with ID opennlp:location.
- Example inputs for this intent:
  - What time is it now in New York City?
  - Show me time of the day in London.
  - Can you please give me the Tokyo's current date and time.
The result of NCPipeline processing is a collection of NCVariant instances. The following example uses NCSemanticEntityParser configured via a JSON file. Let's consider the input text 'A B C D' and the following elements defined in our model:
"elements": [ { "id": "elm1", "synonyms": ["A B"] }, { "id": "elm2", "synonyms": ["B C"] }, { "id": "elm3", "synonyms": ["D"] } ],
All of these elements will be detected, but since two of them overlap (elm1 and elm2) there should be two parsing variants at the output of this step:
1. elm1('A', 'B') freeword('C') elm3('D')
2. freeword('A') elm2('B', 'C') elm3('D')

Note that initially the system cannot determine which of these variants is the best one for matching: there is simply not enough information at this stage. It can only be determined when each variant is matched against the model's intents. So, each parsing variant is matched against each intent. Each matching pair of a variant and an intent produces a match with a certain weight. If there are no matches at all, an error is returned. If matches were found, the match with the biggest weight is selected as the winning match. If multiple matches have the same weight, their respective variants' weights are used to further sort them out. Finally, the intent's callback from the winning match is called.
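The selection step described above can be sketched in a few lines of Scala. This is only an illustration, not NLPCraft's implementation: Variant, IntentMatch and selectWinner are hypothetical names, and the weights are made-up numbers, since the real weight calculation is not covered here.

```scala
// Hypothetical stand-ins for illustration; these are not NLPCraft types.
final case class Variant(id: String, weight: Int)
final case class IntentMatch(intentId: String, variant: Variant, weight: Int)

object MatchSelection {
  /**
   * Picks the match with the biggest weight. When several matches share that
   * weight, the one whose variant has the bigger weight comes first.
   * Returns None when there are no matches at all (the error case).
   */
  def selectWinner(matches: Seq[IntentMatch]): Option[IntentMatch] =
    matches.sortBy(m => (-m.weight, -m.variant.weight)).headOption
}

object MatchSelectionDemo {
  // The two parsing variants for 'A B C D'; the weights are assumed values.
  val v1 = Variant("elm1 freeword elm3", weight = 2)
  val v2 = Variant("freeword elm2 elm3", weight = 3)

  // One entry per matching (variant, intent) pair; the weights are assumed values.
  val matches = Seq(
    IntentMatch("i1", v1, weight = 5),
    IntentMatch("i2", v1, weight = 7),
    IntentMatch("i2", v2, weight = 7)
  )

  // Intent i2 wins on match weight; variant v2 breaks the tie.
  val winner: Option[IntentMatch] = MatchSelection.selectWinner(matches)
}
```

Sorting on a pair keeps the tie-break rule next to the main rule, so the two levels of comparison read in the same order as the prose.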
Although the exact weight calculation algorithm is too complex to detail here, the rules that determine the weight of a match between a parsing variant and an intent coalesce around one principal idea: the more specific match always wins.
Whether the intent is defined directly in the @NCIntent annotation or indirectly via the @NCIntentRef annotation, it is always bound to a callback method, whose parameters receive the matched terms' entities via the @NCIntentTerm annotation.
The @NCIntentTerm annotation marks a callback parameter to receive a term's entities. This annotation can only be used on the parameters of callbacks, i.e. methods that are annotated with @NCIntent or @NCIntentRef. @NCIntentTerm takes a term ID as its only mandatory parameter and should be applied to callback method parameters to get the entities associated with that term (if and when the intent was matched and that callback was invoked).
Depending on the term quantifier the method parameter type can only be one of the following types:
Quantifier | Scala Type |
---|---|
[1,1] | NCEntity |
[0,1] | Option[NCEntity] |
[1,∞] or [0,∞] | List[NCEntity] |
For example:
@NCIntent("intent=id term(termId)~{# == 'my_ent'}?")
private def onMatch(
    ctx: NCContext,
    im: NCIntentMatch,
    @NCIntentTerm("termId") myEnt: Option[NCEntity]
): NCResult = {
    ...
}
NOTES:
- Term termId has the [0,1] quantifier (it's optional).
- The parameter type is Option[NCEntity] because the term's quantifier is [0,1].

NCRejection and NCIntentSkip Exceptions
There are two exceptions that can be used by intent callback logic to control the intent matching process.
When NCRejection exception is thrown by the callback it indicates that user input cannot be processed as is. This exception typically indicates that user has not provided enough information in the input string to have it processed automatically. In most cases this means that the user's input is either too short or too simple, too long or too complex, missing required context, or is unrelated to the requested data model.
NCIntentSkip is a control flow exception to skip current intent. This exception can be thrown by the intent callback to indicate that current intent should be skipped (even though it was matched and its callback was called). If there's more than one intent matched the next best matching intent will be selected and its callback will be called.
This exception becomes useful when it is hard or impossible to encode the entire matching logic using only declarative IDL. In these cases the intent definition can be relaxed and the "last mile" of intent matching can happen inside the intent callback's user logic. If the callback determines that the intent in fact does not match, throwing this exception lets the next best matching intent, if any, be tried.
Note that there's a significant difference between NCIntentSkip exception and model's NCModel#onMatchedIntent callback. Unlike this callback, the exception does not force re-matching of all intents, it simply picks the next best intent from the list of already matched ones. The model's callback can force a full reevaluation of all intents against the user input.
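The fallback behavior of NCIntentSkip described above can be modeled with a small self-contained sketch. It does not use the NLPCraft API: IntentSkip, Rejection, Matched and fire are hypothetical stand-ins that only mirror the described control flow, in which a skipped intent hands over to the next one in the list of already matched intents.

```scala
import scala.annotation.tailrec

// Hypothetical stand-ins; the real NCIntentSkip and NCRejection come from NLPCraft.
final class IntentSkip(msg: String) extends RuntimeException(msg)
final class Rejection(msg: String) extends RuntimeException(msg)

object SkipFallback {
  /** A matched intent: its ID plus the callback bound to it. */
  final case class Matched(intentId: String, callback: () => String)

  /**
   * Walks matches that are already sorted best-first. A callback that throws
   * IntentSkip hands control to the next match; nothing is re-matched.
   * Any other exception, including Rejection, propagates to the caller.
   */
  @tailrec
  def fire(ranked: List[Matched]): Option[String] = ranked match {
    case Nil => None
    case head :: tail =>
      val res = try Some(head.callback()) catch { case _: IntentSkip => None }
      res match {
        case some @ Some(_) => some
        case None => fire(tail)
      }
  }
}

object SkipFallbackDemo {
  import SkipFallback._

  val ranked = List(
    // A broad intent whose callback decides that it does not really match.
    Matched("broad", () => throw new IntentSkip("last-mile check failed")),
    Matched("fallback", () => "handled by fallback")
  )

  val result: Option[String] = fire(ranked)
}
```

In this model a Rejection is deliberately not caught, which reflects the difference between the two exceptions: a skip moves on to the next candidate, while a rejection ends processing of the input.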
IDL Expressiveness
Note that using the NCIntentSkip exception (as well as the model's life-cycle callbacks) is a required technique when you cannot express the desired matching logic with IDL alone. IDL is a high-level declarative language and it does not support complex programmable logic or other types of sophisticated matching algorithms. In such cases, you can define a broad intent and implement the rest of the more complex matching logic in the callback, throwing NCIntentSkip to indicate that the intent does not match and that other intents, if any, have to be tried.
There are many use cases where IDL is not expressive enough. For example, if your intent matching depends on financial market conditions, weather, state from external systems, or details of the current user's geographical location or social network status, you will need to use NCIntentSkip-based logic or the model's callbacks to support that type of matching.
NCContext Trait
The NCContext trait is passed into the intent callback as its first parameter. This trait provides runtime information about the model configuration, the request, extracted tokens and all entity variants, and the conversation control trait NCConversation.
NCIntentMatch Trait
The NCIntentMatch trait is passed into the intent callback as its second parameter. This trait provides runtime information about the intent that was matched (i.e. the intent with which this callback was annotated).
Entity Functions
Text Functions
Math Functions
Collection Functions
Metadata Functions
Datetime Functions
Request Functions
Other Functions