Apache NLPCraft is a Java-based open source library for adding a natural language interface to any application. It can work with any private or public data source, and has no hardware or software lock-in. You can build models and intents for NLPCraft using any JVM-based language like Java, Scala, Kotlin, Groovy, etc. NLPCraft exposes REST APIs for integration with user applications that can be written in any language or system.
One of the key features of NLPCraft is its use of advanced semantic modeling that is tailor-made for building domain-specific natural language interfaces. It doesn't require the traditional ML approach of model training or corpora development, which leads to a much simpler implementation and shorter development time.
Another key aspect of NLPCraft is its singular focus on processing the English language. Although it may sound counterintuitive, this narrow focus enables NLPCraft to deliver unprecedented ease of use combined with unparalleled comprehension capabilities for English input out of the box. In practice, supporting multiple languages in a single framework tends to lead to either watered-down functionality or overly complicated configuration, training and usage. It's also important to note that English is spoken by more than a billion people on this planet and is the de facto standard global language of business and commerce.
So, how does it work in a nutshell?
When using NLPCraft you will be dealing with three main components: the data model, the data probe, and the REST server.
NLPCraft employs a model-as-a-code approach where the entire data model is an implementation of the NCModel Java interface that can be developed using any JVM programming language like Java, Scala, Kotlin or Groovy. The data model implementation defines how to interpret user input, and how to query or control a particular data source. Model-as-a-code naturally works with any software lifecycle tools and frameworks in the Java ecosystem.
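To make the model-as-a-code idea concrete, here is a minimal, purely illustrative sketch. Note that `Model`, `WeatherModel` and the `ask` method below are hypothetical stand-ins used to show the general shape of the approach; they are not NLPCraft's actual `NCModel` API:

```java
// Hypothetical stand-in types illustrating the model-as-a-code idea.
// These are NOT NLPCraft's actual API -- just the general shape:
// a data model is ordinary Java code implementing an interface.
interface Model {
    // Interpret a user utterance and produce a textual result.
    String ask(String utterance);
}

// Because a model is just a Java class, it lives in source control,
// gets unit-tested, and builds like any other code in the project.
class WeatherModel implements Model {
    @Override
    public String ask(String utterance) {
        // Trivial keyword-based routing standing in for real intent matching.
        if (utterance.toLowerCase().contains("weather"))
            return "Querying the weather data source...";
        return "Sorry, I did not understand you.";
    }
}
```

The point of the sketch is that nothing here is framework-specific infrastructure: the model is plain code, so any Java build, test, or CI tooling applies to it unchanged.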
Typically, the declarative portion of the model is stored in a separate JSON or YAML file for simpler maintenance. There are no practical limitations on how complex or simple a model can be, or what other tools it can use. Data models use intent-based matching provided by NLPCraft out of the box.
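As an illustration, the declarative portion of a model stored in a YAML file might look like the following fragment. The identifiers and field values here are hypothetical examples for a weather model, not a prescribed schema:

```yaml
# Hypothetical example of a model's declarative portion.
id: "weather.model"
name: "Weather Model"
version: "1.0"
elements:
  - id: "city"
    description: "A city the user is asking about."
    synonyms:
      - "{city|town}"
```

Keeping this portion declarative means non-programmers can maintain synonyms and element definitions without touching the Java code that implements the model's logic.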
To be used, a data model has to be deployed into a data probe.
A data probe is a lightweight container application designed to securely deploy and manage data models. Each probe can deploy and manage multiple models, and many probes can be connected to the REST server (or a cluster of REST servers). The main purpose of the data probe is to separate data model hosting from managing REST calls from the clients. While you would typically have just one REST server, you may have multiple data probes deployed in different geo-locations and configured differently.
Data probes can be deployed and run anywhere as long as there is ingress connectivity from the REST server, and are typically deployed in a DMZ or close to your target data sources: on-premise, in the cloud, etc. The data probe uses strong 256-bit encryption and ingress-only connectivity for communicating with the REST server.
The REST server (or a cluster of REST servers behind a load balancer) provides a URL endpoint for user applications to securely query data sources using NLI via the data models deployed in data probes. Its main purpose is to accept REST-over-HTTP calls from user applications, manage connected data probes, and route user requests to and from the requested data probes.
Unlike a data probe, which gets restarted every time its model is changed (i.e., during development), the REST server is a "start-and-forget" component that can be launched once while various data probes continuously reconnect to it. It typically runs as a Docker image, either locally on premise or in the cloud.
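To sketch what a REST-over-HTTP call from a user application might look like, here is a small Java 11+ client using only the standard `java.net.http` package. The endpoint path (`/ask/sync`) and JSON field names (`acsTok`, `mdlId`, `txt`) are illustrative assumptions; consult the NLPCraft REST documentation for the exact API. The request is assembled but deliberately not sent, since that requires a running REST server:

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Sketch of how a user application could talk to the REST server.
// Paths and JSON field names are assumptions, not the verified API.
class NlpCraftClient {
    private final String baseUrl;

    NlpCraftClient(String baseUrl) {
        this.baseUrl = baseUrl;
    }

    // Build the JSON body for a synchronous "ask" call.
    static String askBody(String acsTok, String mdlId, String txt) {
        return String.format(
            "{\"acsTok\": \"%s\", \"mdlId\": \"%s\", \"txt\": \"%s\"}",
            acsTok, mdlId, txt);
    }

    // Assemble (but do not send) the HTTP POST request. Sending it would
    // be a one-liner with java.net.http.HttpClient against a live server.
    HttpRequest askRequest(String acsTok, String mdlId, String txt) {
        return HttpRequest.newBuilder()
            .uri(URI.create(baseUrl + "/ask/sync"))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(askBody(acsTok, mdlId, txt)))
            .build();
    }
}
```

Because the server is language-agnostic REST-over-HTTP, an equivalent client can be written in Python, Go, JavaScript, or any other language with an HTTP library.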