Interface NCToken

  • All Superinterfaces:
    NCMetadata

    public interface NCToken
    extends NCMetadata
    Detected model element. A token is a part of the parsed user input, and a sequence of tokens represents the fully parsed user input (see NCContext.getVariants() method). A single token corresponds to one or more words, sequential or not, in the user sentence.

    Configuring Token Providers
    Token providers (built-in or 3rd party) have to be enabled in the REST server configuration. Data models also have to specify the tokens they expect the REST server and probe to detect. This limits unnecessary processing, since implicitly enabling all token providers and all tokens can significantly slow down processing. The REST server configuration property nlpcraft.server.tokenProvides provides the list of enabled token providers. Data models provide their required tokens in the NCModelView.getEnabledBuiltInTokens() method.

    Read full documentation in Data Model section and review examples.

    See Also:
    NCElement, NCContext.getVariants()
    • Method Summary

      Modifier and Type Method Description
      default List<NCToken> findPartTokens​(String... idOrAlias)
      Gets the list of all part tokens with given IDs or aliases traversing entire part token graph.
      List<String> getAliases()
      Gets optional list of aliases this token is known by.
      List<String> getAncestors()
      Gets the list of all parent IDs from this token up to the root.
      int getEndCharIndex()
      Gets end character index of this token in the original text.
      List<String> getGroups()
      Gets the list of groups this token belongs to.
      String getId()
      If this token represents a user-defined model element, this method returns the ID of that element.
      default int getIndex()
      A shortcut method that gets index of this token in the sentence.
      NCModelView getModel()
      Gets reference to the model this token belongs to.
      default String getOriginalText()
      A shortcut method that gets original user input text for this token.
      String getParentId()
      Gets the optional parent ID of the model element this token represents.
      List<NCToken> getPartTokens()
      Gets the list of tokens this token is composed of.
      String getServerRequestId()
      Gets ID of the server request this token is part of.
      int getStartCharIndex()
      Gets start character index of this token in the original text.
      default String getUnid()
      A shortcut method that gets internal globally unique system ID of the token.
      String getValue()
      Gets the value if this token was detected via element's value (or its synonyms).
      default boolean isChildOf​(String tokId)
      Tests whether this token is a child of given token ID.
      default boolean isFreeWord()
      A shortcut method checking whether or not this token represents a free word.
      default boolean isMemberOf​(String grp)
      Tests whether or not this token belongs to the given group.
      default boolean isOfAlias​(String alias)
      Tests whether or not this token has given alias.
      default boolean isStopWord()
      A shortcut method checking whether or not this token is a stopword.
      default boolean isUserDefined()
      Tests whether or not this token is a user-defined token.
    • Method Detail

      • getModel

        NCModelView getModel()
        Gets reference to the model this token belongs to.
        Returns:
        Model reference.
      • getServerRequestId

        String getServerRequestId()
        Gets ID of the server request this token is part of.
        Returns:
        ID of the server request this token is part of.
      • getId

        String getId()
        If this token represents a user-defined model element, this method returns the ID of that element. Otherwise, it returns the ID of the built-in system token. Note that a sentence can have multiple tokens with the same element ID.
        Returns:
        ID of the element (system or user defined).
        See Also:
        NCElement.getId()
      • getParentId

        String getParentId()
        Gets the optional parent ID of the model element this token represents. This is only available for user-defined model elements (built-in tokens do not have parents).
        Returns:
        ID of the token's element immediate parent or null if not available.
        See Also:
        NCElement.getParentId(), getAncestors()
      • getAncestors

        List<String> getAncestors()
        Gets the list of all parent IDs from this token up to the root. This is only available for user-defined model elements (built-in tokens do not have parents).
        Returns:
        List, potentially empty but never null, of all parent IDs from this token up to the root.
        See Also:
        getParentId()
      • isChildOf

        default boolean isChildOf​(String tokId)
        Tests whether this token is a child of given token ID. It is equivalent to:
             return getAncestors().contains(tokId);
         
        Parameters:
        tokId - Ancestor token ID.
        Returns:
        true if this token is a child of given token ID, false otherwise.
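
        Because the check is made against the full ancestor list, isChildOf matches any ancestor, not only the immediate parent. The sketch below illustrates this with a hypothetical stand-in class (AncestorSketch and the "x:" element IDs are made up for illustration and are not part of the NLPCraft API); only the isChildOf logic mirrors the documented equivalence.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for NCToken: only the ancestor-related methods are mirrored.
class AncestorSketch {
    private final String id;
    private final List<String> ancestors; // parent IDs from the immediate parent up to the root

    AncestorSketch(String id, String... ancestors) {
        this.id = id;
        this.ancestors = Arrays.asList(ancestors);
    }

    String getId() { return id; }

    List<String> getAncestors() { return ancestors; }

    // Same logic as the documented default method.
    boolean isChildOf(String tokId) {
        return getAncestors().contains(tokId);
    }

    public static void main(String[] args) {
        // "x:sedan" has parent "x:car", whose parent is the root "x:vehicle".
        AncestorSketch tok = new AncestorSketch("x:sedan", "x:car", "x:vehicle");

        System.out.println(tok.isChildOf("x:car"));     // true (immediate parent)
        System.out.println(tok.isChildOf("x:vehicle")); // true (transitive ancestor)
        System.out.println(tok.isChildOf("x:sedan"));   // false (own ID is not an ancestor)
    }
}
```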
      • getPartTokens

        List<NCToken> getPartTokens()
        Gets the list of tokens this token is composed of. This method returns only immediate part tokens.
        Returns:
        List of constituent tokens, potentially empty but never null, that this token is composed of.
        See Also:
        findPartTokens(String...)
      • findPartTokens

        default List<NCToken> findPartTokens​(String... idOrAlias)
        Gets the list of all part tokens with given IDs or aliases traversing entire part token graph.
        Parameters:
        idOrAlias - List of token IDs or aliases, potentially empty. If empty, the entire tree of part tokens is returned as a list.
        Returns:
        List of all part tokens with given IDs or aliases. Potentially empty but never null.
        See Also:
        getPartTokens()
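
        The difference between getPartTokens() (immediate parts only) and findPartTokens(String...) (entire part token graph) can be sketched with a small recursive traversal. This is a minimal illustration, not the library's implementation: PartTokenSketch and the "x:" IDs are hypothetical, and the depth-first result order is an assumption, since the documentation does not specify any ordering.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for NCToken with only an ID, aliases and part tokens.
class PartTokenSketch {
    final String id;
    final List<String> aliases;
    final List<PartTokenSketch> parts; // immediate part tokens only

    PartTokenSketch(String id, List<String> aliases, PartTokenSketch... parts) {
        this.id = id;
        this.aliases = aliases;
        this.parts = Arrays.asList(parts);
    }

    // Collects matching part tokens at every nesting level.
    // With no arguments, every part token in the tree is returned.
    List<PartTokenSketch> findPartTokens(String... idOrAlias) {
        List<PartTokenSketch> out = new ArrayList<>();
        collect(this, Arrays.asList(idOrAlias), out);
        return out;
    }

    // Depth-first walk (assumed order); a part matches by ID or by any alias.
    private static void collect(PartTokenSketch tok, List<String> wanted, List<PartTokenSketch> out) {
        for (PartTokenSketch part : tok.parts) {
            boolean matches = wanted.isEmpty()
                || wanted.contains(part.id)
                || part.aliases.stream().anyMatch(wanted::contains);
            if (matches)
                out.add(part);
            collect(part, wanted, out);
        }
    }

    public static void main(String[] args) {
        List<String> none = Collections.emptyList();
        PartTokenSketch num = new PartTokenSketch("x:num", Collections.singletonList("qty"));
        PartTokenSketch unit = new PartTokenSketch("x:unit", none);
        PartTokenSketch measure = new PartTokenSketch("x:measure", none, num, unit);
        PartTokenSketch color = new PartTokenSketch("x:color", none);
        PartTokenSketch root = new PartTokenSketch("x:order", none, measure, color);

        System.out.println(root.parts.size());                 // 2 immediate parts
        System.out.println(root.findPartTokens().size());      // 4 parts in the whole tree
        System.out.println(root.findPartTokens("qty").get(0).id); // x:num, found by alias
    }
}
```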
      • getAliases

        List<String> getAliases()
        Gets optional list of aliases this token is known by. A token can get an alias if it is a part of another composed token and the token DSL expression that was used to match it specified an alias. Note that a token can have zero, one or more aliases.
        Returns:
        List of aliases this token is known by. Can be empty, but never null.
      • isOfAlias

        default boolean isOfAlias​(String alias)
        Tests whether or not this token has given alias. It is equivalent to:
              return getAliases().contains(alias);
         
        Parameters:
        alias - Alias to test.
        Returns:
        True if this token has alias alias, false otherwise.
      • getValue

        String getValue()
        Gets the value if this token was detected via element's value (or its synonyms). Otherwise returns null. Only applicable for user-defined model elements (built-in tokens do not have values).
        Returns:
        Value for the user-defined model element or null, if not available.
        See Also:
        NCElement.getValues()
      • getGroups

        List<String> getGroups()
        Gets the list of groups this token belongs to. By default, if not specified explicitly, the group is the token's ID.
        Returns:
        Token groups list. Never null - but can be empty.
        See Also:
        NCElement.getGroups()
      • isMemberOf

        default boolean isMemberOf​(String grp)
        Tests whether or not this token belongs to the given group. It is equivalent to:
              return getGroups().contains(grp);
         
        Parameters:
        grp - Group to test.
        Returns:
        True if this token belongs to the group grp, false otherwise.
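
        The default group rule can be illustrated with a short sketch. GroupSketch and its IDs are hypothetical stand-ins, not NLPCraft classes. The sketch assumes that explicitly specified groups replace the default group rather than extend it; the documentation above only states the default, so treat that part as an assumption.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for NCToken: only the ID and the group list are mirrored.
class GroupSketch {
    final String id;
    final List<String> groups;

    GroupSketch(String id, String... groups) {
        this.id = id;
        // Documented default: without explicit groups, the group is the token's own ID.
        // Assumption: explicit groups replace this default instead of adding to it.
        this.groups = groups.length == 0
            ? Collections.singletonList(id)
            : Arrays.asList(groups);
    }

    List<String> getGroups() { return groups; }

    // Same logic as the documented default method.
    boolean isMemberOf(String grp) {
        return getGroups().contains(grp);
    }

    public static void main(String[] args) {
        GroupSketch lamp = new GroupSketch("x:lamp");
        GroupSketch fan = new GroupSketch("x:fan", "devices", "cooling");

        System.out.println(lamp.isMemberOf("x:lamp")); // true (default group is the ID)
        System.out.println(fan.isMemberOf("devices")); // true (explicit group)
        System.out.println(lamp.isMemberOf("devices")); // false
    }
}
```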
      • getStartCharIndex

        int getStartCharIndex()
        Gets start character index of this token in the original text.
        Returns:
        Start character index of this token.
      • getEndCharIndex

        int getEndCharIndex()
        Gets end character index of this token in the original text.
        Returns:
        End character index of this token.
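
        Start and end character indexes can be used to map tokens back onto the original text, for example to highlight detected tokens. The sketch below is illustrative only: SpanSketch is a hypothetical holder for the two index values, and it assumes that the end index is exclusive (as in String.substring) and that spans do not overlap. The documentation above does not state whether the end index is inclusive or exclusive, so verify that before relying on this logic.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical holder for a token's start and end character indexes.
class SpanSketch {
    final int start;
    final int end; // assumed exclusive, as in String.substring

    SpanSketch(int start, int end) {
        this.start = start;
        this.end = end;
    }

    // Wraps every span in square brackets. Spans are processed from right to left
    // so that earlier insertions do not shift the indexes of later ones.
    // Assumes spans do not overlap.
    static String highlight(String text, List<SpanSketch> spans) {
        List<SpanSketch> sorted = new ArrayList<>(spans);
        sorted.sort(Comparator.comparingInt((SpanSketch s) -> s.start).reversed());

        StringBuilder sb = new StringBuilder(text);
        for (SpanSketch s : sorted) {
            sb.insert(s.end, ']');
            sb.insert(s.start, '[');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = "turn on the lights";
        List<SpanSketch> spans = Arrays.asList(new SpanSketch(0, 7), new SpanSketch(12, 18));

        System.out.println(highlight(text, spans)); // [turn on] the [lights]
    }
}
```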
      • isStopWord

        default boolean isStopWord()
        A shortcut method checking whether or not this token is a stopword. Stopwords are extremely common words that add little value in understanding user input and are excluded from the processing entirely. For example, words like a, the, can, of, about, over, etc. are typical stopwords in English. NLPCraft has a built-in set of stopwords.

        This method is equivalent to:

             return meta("nlpcraft:nlp:stopword");
         
        Returns:
        Whether or not this token is a stopword.
      • isFreeWord

        default boolean isFreeWord()
        A shortcut method checking whether or not this token represents a free word. A free word is a token that was detected neither as a part of a user-defined token nor as a part of a system token.

        This method is equivalent to:

             return meta("nlpcraft:nlp:freeword");
         
        Returns:
        Whether or not this token is a freeword.
      • getOriginalText

        default String getOriginalText()
        A shortcut method that gets original user input text for this token.

        This method is equivalent to:

             return meta("nlpcraft:nlp:origtext");
         
        Returns:
        Original user input text for this token.
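
        The shortcut methods above are thin wrappers over token metadata. The sketch below shows that pattern and one typical use, dropping stopwords before further processing. MetaTokenSketch, its constructor and the contentWords helper are hypothetical stand-ins written for illustration; only the three metadata keys are taken from the documentation above.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical stand-in for NCToken backed by a plain metadata map.
class MetaTokenSketch {
    private final Map<String, Object> meta = new HashMap<>();

    MetaTokenSketch(String origText, boolean stopWord, boolean freeWord) {
        meta.put("nlpcraft:nlp:origtext", origText);
        meta.put("nlpcraft:nlp:stopword", stopWord);
        meta.put("nlpcraft:nlp:freeword", freeWord);
    }

    // Generic metadata accessor; the caller's target type decides the cast.
    @SuppressWarnings("unchecked")
    <T> T meta(String key) {
        return (T) meta.get(key);
    }

    // Shortcuts mirroring the documented equivalences.
    boolean isStopWord() { return meta("nlpcraft:nlp:stopword"); }
    boolean isFreeWord() { return meta("nlpcraft:nlp:freeword"); }
    String getOriginalText() { return meta("nlpcraft:nlp:origtext"); }

    // Returns the original text of every token that is not a stopword.
    static List<String> contentWords(List<MetaTokenSketch> toks) {
        return toks.stream()
            .filter(t -> !t.isStopWord())
            .map(MetaTokenSketch::getOriginalText)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<MetaTokenSketch> toks = Arrays.asList(
            new MetaTokenSketch("turn", false, false),
            new MetaTokenSketch("the", true, false),
            new MetaTokenSketch("lights", false, false),
            new MetaTokenSketch("please", false, true)
        );

        System.out.println(contentWords(toks)); // [turn, lights, please]
    }
}
```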
      • getIndex

        default int getIndex()
        A shortcut method that gets index of this token in the sentence.

        This method is equivalent to:

             return meta("nlpcraft:nlp:index");
         
        Returns:
        Index of this token in the sentence.
      • getUnid

        default String getUnid()
        A shortcut method that gets internal globally unique system ID of the token.

        This method is equivalent to:

             return meta("nlpcraft:nlp:unid");
         
        Returns:
        Internal globally unique system ID of the token.
      • isUserDefined

        default boolean isUserDefined()
        Tests whether or not this token is a user-defined token.
        Returns:
        true if this token is defined by the model element in the user model, false otherwise.