Custom Connector

This article describes when and how a Starmind Custom Connector learns from an on-premise or custom data source. Please reach out to Starmind if you want to set up a Custom Connector. We will support you with the project.

Learning from the source system takes three steps: the data is extracted from the source system, transformed into the format Starmind expects, and uploaded into Starmind.

Starmind provides the endpoints within the application and guidance on the data extraction. The customer is responsible for extracting the data from the source system and transforming it.

General Process to import data

The sequence diagram below shows the steps required to import the data. How the data is extracted depends on the source system and on the version of the source application.

First step

The customer’s script extracts the data from the source system.

Second step

The script uploads the data into Starmind. We will provide all the details for the request. The format of the HTTP request is:

POST /connectors/api/v1/entities-with-action HTTP/1.1
Host: {domain}
APIKEY: {api_key}
Content-Type: application/json

{
    "entity_type":"custom_connector_sharepoint",
    "action_type":"custom_connector_sharepoint_document",
    "action_date":"2020-04-24T11:12:27.000Z",
    "user_email":"{user_email}",
    "text":"Text to learn from",
    "texts":{
        "text_format":"plain|html",
        "descriptor":"default|primary|secondary",
        "text":"Text to learn from"
    }
}

Key	Description
entity_type	An entity represents anything that has one or more topics (e.g., documents, chat messages, or emails). Each entity is linked to its topics.
action_type	An action is an interaction of a user with an entity. It connects the user to the entity and creates an implicit connection of the user with the entity’s topic. Actions have a weight that defines the strength of these resulting connections. One entity can contain multiple actions.
action_date	Date and time of the action, in the format shown in the example request (for instance 2020-04-24T11:12:27.000Z).
user_email	The email address of the user who made the action.
text | texts	The content from which the expertise topics are extracted. The section “Text or Texts” describes which one to use.
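
The request above can be sketched in Python with the standard library only. This is a minimal sketch, not an official client: the function names build_upload_request and upload are hypothetical, and the use of HTTPS is an assumption, since the article only shows the Host header. The real domain and API key are provided by Starmind.

```python
import json
import urllib.request


def build_upload_request(domain: str, api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the article."""
    # Assumption: the endpoint is served over HTTPS on the given domain.
    url = f"https://{domain}/connectors/api/v1/entities-with-action"
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"APIKEY": api_key, "Content-Type": "application/json"},
    )


def upload(request: urllib.request.Request, timeout: float = 30.0) -> int:
    """Send the request and return the HTTP status code.

    urllib raises urllib.error.HTTPError for 4xx/5xx responses, so a failed
    upload surfaces as an exception in the calling script.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Keeping the request construction separate from the network call makes it possible to inspect or log a request before anything is sent to Starmind.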

Text or Texts

The endpoint offers two ways to submit text for topic extraction. Use “text” when the data point contains only one text, for example a chat message. Use “texts” to send multiple related texts, for example a document that has a title and content or a description.

The “texts” option expects the following object:

Key	Description
text_format	The text format can be “plain” for plain text or “html” for HTML text.
descriptor	A text descriptor defines a factor for the given text. Using descriptors instead of explicit factors keeps the values consistent, so you do not have to keep track of which factors were used in other scripts. Starmind can advise you which descriptor to use. The following options exist: default, primary, secondary.
text	The content from which the expertise topics are extracted.
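
The two options can be illustrated with a few small helpers that assemble the request body. This is a sketch under assumptions: all function names are hypothetical, and the “texts” value is built as a single object exactly as in the example request. How several texts are combined in one request is not specified in this article and should be confirmed with Starmind.

```python
TEXT_FORMATS = {"plain", "html"}
DESCRIPTORS = {"default", "primary", "secondary"}


def base_fields(entity_type: str, action_type: str, action_date: str, user_email: str) -> dict:
    """Fields shared by both variants of the request body."""
    return {
        "entity_type": entity_type,
        "action_type": action_type,
        "action_date": action_date,
        "user_email": user_email,
    }


def text_object(text: str, text_format: str = "plain", descriptor: str = "default") -> dict:
    """Build one object for the "texts" key, rejecting unknown option values."""
    if text_format not in TEXT_FORMATS:
        raise ValueError(f"unknown text_format: {text_format!r}")
    if descriptor not in DESCRIPTORS:
        raise ValueError(f"unknown descriptor: {descriptor!r}")
    return {"text_format": text_format, "descriptor": descriptor, "text": text}


def payload_with_text(fields: dict, text: str) -> dict:
    """Variant for a data point with a single text, such as a chat message."""
    return {**fields, "text": text}


def payload_with_texts(fields: dict, texts) -> dict:
    """Variant for related texts, such as a document title and its content."""
    return {**fields, "texts": texts}
```

For example, payload_with_texts(fields, text_object("Quarterly report", descriptor="primary")) yields a body with the same shape as the example request, minus the top-level “text” key.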

Avoiding duplicates

There are two approaches to avoid learning from the same data twice. The first is to handle it in the script, so that it only reads data that has not been processed yet. One solution is a pointer that indicates up to which entry the data has already been imported.
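
A minimal sketch of the pointer approach follows. It assumes that every extracted entry carries a numeric, increasing "id" and that the pointer is kept in a small JSON file. Both are illustrative choices; the function names and the "last_imported_id" key are hypothetical and not part of the Starmind API.

```python
import json
from pathlib import Path


def load_pointer(path: Path) -> int:
    """Return the ID of the last imported entry, or 0 if nothing was imported yet."""
    if not path.exists():
        return 0
    return int(json.loads(path.read_text())["last_imported_id"])


def save_pointer(path: Path, last_imported_id: int) -> None:
    """Persist the pointer so that the next run can resume after it."""
    path.write_text(json.dumps({"last_imported_id": last_imported_id}))


def import_new_entries(entries, pointer_path: Path, upload) -> int:
    """Upload only entries past the pointer and return how many were uploaded.

    `upload` is any callable that sends one entry and raises on failure.
    The pointer advances after each successful upload, so a failed run
    resumes at the first entry that was not uploaded.
    """
    pointer = load_pointer(pointer_path)
    imported = 0
    for entry in sorted(entries, key=lambda e: e["id"]):
        if entry["id"] <= pointer:
            continue  # already imported in an earlier run
        upload(entry)
        pointer = entry["id"]
        save_pointer(pointer_path, pointer)
        imported += 1
    return imported
```

If the source system has no increasing ID, a modification timestamp can play the same role, as long as entries are processed in that order.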

The second approach is to send an alias with each entry. Starmind accepts each alias only once; if an alias is sent again, the endpoint returns an error. The alias is part of the request body and has the following form:

...
    "action_alias":"custom_connector_sharepoint_document_1",
...

Please reach out to Starmind before using the alias. We will advise which form to use.
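
As an illustration of the alias approach, the sketch below derives the alias from the action type and a stable ID taken from the source system, which reproduces the form of the example above. The helper names are hypothetical, and the actual alias scheme should be the one agreed with Starmind.

```python
def make_action_alias(action_type: str, source_id) -> str:
    """Derive a repeatable alias: the same source entry always gets the same alias."""
    return f"{action_type}_{source_id}"


def with_alias(payload: dict, source_id) -> dict:
    """Return a copy of the request body with "action_alias" added."""
    aliased = dict(payload)
    aliased["action_alias"] = make_action_alias(payload["action_type"], source_id)
    return aliased
```

Because the alias is derived only from data in the source system, re-running the script produces the same alias for an entry that was already uploaded, and the endpoint rejects the duplicate.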