Synonyms Tool

Overview

The synonym suggester tool takes an existing model, analyses its synonyms and intents, and produces a list of currently missing synonyms that you might want to add to your model.

This tool is accessed via a REST call. It is based on Google's BERT and Facebook's fasttext models. It requires @NCIntentSample or @NCIntentSampleRef annotations to be present on intent callbacks. When invoked, the tool scans the given data model for intents and these annotations and, based on the samples they provide, tries to determine which synonyms are missing from the model.

Single Word Synonyms

The synonym suggester tool analyses only single-word synonyms and ignores any multi-word synonyms. You can often convert a named element with multi-word synonyms into a combination of multiple named elements, each with single-word synonyms, using the Composable NERs technique.

Usage

To use this tool, both the ctxword server and the NLPCraft server must be started, and the NLPCraft server's configuration may need to be updated.

ctxword Server

Python 3.6-3.8

As of this writing (Dec 2020), the ctxword server and its dependencies work only with Python versions 3.6 through 3.8.
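A minimal sketch of how the version constraint above could be checked before installing the ctxword dependencies. The helper name `is_supported_python` and the two constants are hypothetical and are not part of NLPCraft; only the 3.6-3.8 range comes from the note above.

```python
import sys

# Supported range taken from the note above (as of Dec 2020).
MIN_SUPPORTED = (3, 6)
MAX_SUPPORTED = (3, 8)


def is_supported_python(version_info=None):
    """Return True if the (major, minor) version falls within 3.6-3.8."""
    if version_info is None:
        version_info = sys.version_info
    major_minor = tuple(version_info[:2])
    return MIN_SUPPORTED <= major_minor <= MAX_SUPPORTED


if __name__ == "__main__":
    if not is_supported_python():
        # Report the mismatch; the ctxword install is likely to fail here.
        print("Python %d.%d is outside the 3.6-3.8 range" % sys.version_info[:2])
```

Running the script with the interpreter you intend to use for ctxword shows whether that interpreter is in range before any packages are installed.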

The 'ctxword' server is a Python-based module that provides a BERT- and fasttext-based implementation for finding contextually related words for a given word from the input sentence. The NLPCraft server interacts with the 'ctxword' server via an internal REST interface. To configure the NLPCraft server and start the Python-based 'ctxword' server, follow these steps:

  1. Install necessary dependencies by running the following commands from the NLPCraft installation directory:
    NOTE: this step should only be performed once.

    Linux/Unix/MacOS:
        $ cd nlpcraft/src/main/python/ctxword
        $ bin/install_dependencies.sh

    Windows:
        Read src\main\python\ctxword\bin\WINDOWS_SETUP.md file for manual installation instructions.

  2. Optional.
    Configure the nlpcraft.server.ctxword.url property in the nlpcraft.conf file (or your own configuration file). This property comes with a default endpoint, and you only need to change it if you change the 'ctxword' module implementation.
  3. Start the 'ctxword' server by running the following commands from the NLPCraft installation directory:
        $ cd nlpcraft/src/main/python/ctxword
        $ bin/start_server.{sh|cmd}

    1st Start

    Note that on the first start the server will try to load the compressed BERT model, which is not yet available. It will then download and compress it, which takes several minutes and may require 10 GB+ of available memory. Subsequent starts skip this step, and the server starts much faster.

REST Server

The NLPCraft REST server must also be started.

Running

The synonyms tool can be run in two different ways, via the NLPCraft CLI or via a REST call:

NLPCraft CLI

    $ bin/nlpcraft.sh help --cmd=model-sugsyn
    $ bin/nlpcraft.sh model-sugsyn --mdlId=nlpcraft.alarm.ex --minScore=0.5

NOTES:

  • mdlId - Only required if there is more than one model deployed in the connected data probe. If the data probe has only one model you can omit this parameter.
  • minScore - Optional minimum confidence score to include into the result, ranging from 0 to 1, default is 0. minScore of 0 will include all results, and minScore of 1 will include only results with the absolutely highest confidence score. Values between 0.5 and 0.7 are generally suggested.
  • NLPCraft CLI is available as nlpcraft.sh for Linux/Unix/MacOS and nlpcraft.cmd for Windows.
  • Run bin/nlpcraft.sh help --cmd=model-sugsyn to get full help on this command.

REST Call

The REST API accepts only POST HTTP calls and the application/json content type for JSON payloads and responses. When issuing a REST call for this tool you will be using the following URL:

    http://localhost:8081/api/v1/model/sugsyn
                

where:

  • http - Either http or https protocol.
  • localhost:8081 - Host and port on which the REST server is started. localhost:8081 is the default configuration and can be changed.
  • /api/v1 - Mandatory prefix indicating API version.
  • model/sugsyn - Synonym suggester REST call.

The parameters should be passed in as JSON:

        {
            "acsTok": "qweqw9123uqwe",
            "mdlId": "nlpcraft.alarm.ex",
            "minScore": 0.5
        }
                

where:

  • acsTok - Access token obtained via a previous '/signin' call.
  • mdlId - ID of the model to run the synonym suggester on.
  • minScore - Optional minimum confidence score to include into the result, ranging from 0 to 1, default is 0. minScore of 0 will include all results, and minScore of 1 will include only results with the absolutely highest confidence score. Values between 0.5 and 0.7 are generally suggested.
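The call described above could be issued from Python roughly as follows. This is a sketch using only the standard library. The function names `build_sugsyn_request` and `suggest_synonyms` are hypothetical, and the URL assumes the default localhost:8081 configuration with plain http. The access token is assumed to have been obtained already.

```python
import json
import urllib.request

# Assumed default endpoint; adjust protocol, host and port to your setup.
SUGSYN_URL = "http://localhost:8081/api/v1/model/sugsyn"


def build_sugsyn_request(acs_tok, mdl_id, min_score=None, url=SUGSYN_URL):
    """Build (but do not send) a POST request for the synonym suggester."""
    if min_score is not None and not 0 <= min_score <= 1:
        raise ValueError("minScore must be between 0 and 1")
    payload = {"acsTok": acs_tok, "mdlId": mdl_id}
    if min_score is not None:
        # minScore is optional; leave it out to use the server default.
        payload["minScore"] = min_score
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def suggest_synonyms(acs_tok, mdl_id, min_score=None, url=SUGSYN_URL):
    """Send the request and return the decoded JSON response."""
    req = build_sugsyn_request(acs_tok, mdl_id, min_score, url)
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Keeping request construction separate from sending makes it possible to inspect the payload without a running server. A call such as `suggest_synonyms(token, "nlpcraft.alarm.ex", 0.5)` requires both the REST server and the ctxword server to be up.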

Either way the synonym suggester returns the following JSON result (nlpcraft.alarm.ex model from Alarm example):

{
"status": "API_OK",
"result": {
  "modelId": "nlpcraft.alarm.ex",
  "minScore": 0.5,
  "durationMs": 424.0,
  "timestamp": 1.60091239852E12,
  "suggestions": [
    {
      "x:alarm": [
        {
          "score": 1.0,
          "synonym": "ask"
        },
        {
          "score": 0.9477103542042674,
          "synonym": "join"
        },
        {
          "score": 0.8882341083867801,
          "synonym": "get"
        },
        {
          "score": 0.7330826349218547,
          "synonym": "remember"
        },
        {
          "score": 0.6902880910527778,
          "synonym": "contact"
        },
        {
          "score": 0.6014764219771813,
          "synonym": "time"
        },
        {
          "score": 0.5816398376889104,
          "synonym": "follow"
        },
        {
          "score": 0.5640882890681899,
          "synonym": "watch"
        },
        {
          "score": 0.5139855649326083,
          "synonym": "stop"
        },
        {
          "score": 0.5136895804732818,
          "synonym": "kill"
        },
        {
          "score": 0.5001167992233122,
          "synonym": "send"
        }
      ]
    }
  ],
  "warnings": [
    "Model has too few (3) intents samples. It will negatively affect the quality of suggestions. Try to increase overall sample count to at least 20."
  ]
}
}
        

The result is structured as a list of proposed synonyms, with their corresponding scores, for each model element. You should analyse the results for their fitness for your model and its existing synonyms. The tool cannot guarantee that every suggested synonym is appropriate or valid, but it provides a good "courtesy" check for potentially missing synonyms.
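To review the suggestions, it can help to flatten the response into a per-element list and apply a stricter score cut-off on the client side. The sketch below assumes the response shape shown in the sample above. The helper `top_suggestions` is hypothetical, and the `sample` dictionary is a shortened copy of that sample rather than new data.

```python
def top_suggestions(response, min_score=0.0):
    """Map each element ID to its (synonym, score) pairs, best first."""
    if response.get("status") != "API_OK":
        raise ValueError("unexpected status: %r" % response.get("status"))
    by_element = {}
    for group in response["result"]["suggestions"]:
        for element_id, candidates in group.items():
            # Keep only candidates at or above the requested score.
            kept = [
                (c["synonym"], c["score"])
                for c in candidates
                if c["score"] >= min_score
            ]
            by_element.setdefault(element_id, []).extend(kept)
    for pairs in by_element.values():
        # Highest-scoring suggestions first.
        pairs.sort(key=lambda pair: pair[1], reverse=True)
    return by_element


# Shortened copy of the sample response shown above.
sample = {
    "status": "API_OK",
    "result": {
        "modelId": "nlpcraft.alarm.ex",
        "suggestions": [
            {
                "x:alarm": [
                    {"score": 1.0, "synonym": "ask"},
                    {"score": 0.7330826349218547, "synonym": "remember"},
                    {"score": 0.5001167992233122, "synonym": "send"},
                ]
            }
        ],
    },
}

for element_id, pairs in top_suggestions(sample, min_score=0.7).items():
    print(element_id, pairs)
```

Filtering on the client is optional, since the same effect can be had by passing a higher minScore in the request. It is mainly useful when comparing several thresholds against one response.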

Run Periodically

It is a good idea to run this tool periodically if you are actively changing the model. With dozens or hundreds of model elements it is very hard to manually maintain a quality set of synonyms. With a good list of user input samples for each intent, this tool can be indispensable for easy maintenance of the synonyms.

Copyright © 2021 Apache Software Foundation. Docs release: 0.9.0