Custom Models
Using the API or SDK, you can classify your transactions with a set of custom labels. You provide a set of labeled transactions, which is then used to train a custom model. The custom model builds on Ntropy's advanced base models and adds customization and fine-tuning based on user-provided labeled data.
Usage
A model is identified by a unique name. Each user can have up to 10 models at the same time. After a model is trained, it is automatically kept up to date with the Ntropy system so that it benefits from improvements to our base models.
A model name can be re-used: if you train a model with the same name as a previously trained model, the system overwrites the existing model and you lose access to the previous one.
Train
The training of a custom model requires a set of input transactions with the expected attributes (as described in Enrichment) and one additional label attribute containing the ground-truth label that the model should learn for that transaction. You can consult the API reference for more details.
The models are fine-tuned on a combination of the description, entry_type, amount and iso_currency_code fields. Providing a training dataset with a good diversity of these fields for each class will improve the quality of the final model.
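One rough way to gauge that diversity before training is to count the distinct values of each fine-tuning field per label. The sketch below assumes transactions are held as plain dicts; `field_diversity` and the sample rows are illustrative and not part of the Ntropy SDK.

```python
from collections import defaultdict

# Fields the models are fine-tuned on, as listed above.
FIELDS = ("description", "entry_type", "amount", "iso_currency_code")

def field_diversity(transactions):
    """Hypothetical helper: count distinct values of each field, per label."""
    seen = defaultdict(lambda: {f: set() for f in FIELDS})
    for tx in transactions:
        for f in FIELDS:
            seen[tx["label"]][f].add(tx[f])
    return {
        label: {f: len(values) for f, values in per_field.items()}
        for label, per_field in seen.items()
    }

# Illustrative sample data only.
sample = [
    {"description": "AMAZON WEB SERVICES", "entry_type": "outgoing",
     "amount": 12042.37, "iso_currency_code": "USD", "label": "cloud"},
    {"description": "GOOGLE CLOUD", "entry_type": "outgoing",
     "amount": 310.00, "iso_currency_code": "EUR", "label": "cloud"},
    {"description": "Purchase Return Apple.Com", "entry_type": "incoming",
     "amount": 150.94, "iso_currency_code": "USD", "label": "returns"},
]

summary = field_diversity(sample)
for label, counts in summary.items():
    print(label, counts)
```

A label whose counts are all 1 probably needs more varied examples before it is worth training on.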
With the labeled data, training a new model takes a single API or SDK call, as shown below:
- cURL
- Python SDK
$ curl \
-H "X-API-KEY: <YOUR-API-KEY>" \
-H "Content-Type: application/json" \
-X POST \
--data '{
"transactions": [
{
"description": "AMAZON WEB SERVICES AWS.AMAZON.CO WA Ref5543286P25S Crd15",
"entry_type": "outgoing",
"amount": 12042.37,
"iso_currency_code": "USD",
"date": "2021-11-01",
"transaction_id": "4yp49x3tbj9mD8DB4fM8DDY6Yxbx8YP14g565Xketw3tFmn",
"country": "US",
"account_holder_id": "id-1",
"account_holder_type": "business",
"label": "cloud"
},
{
"description": "Purchase Return 10/22 Apple.Com/US CA Card 5233",
"entry_type": "incoming",
"amount": 150.94,
"iso_currency_code": "USD",
"date": "2021-11-02",
"transaction_id": "tw3tFmn4yp49x3tbj9mD8DB4fM8DDY6Yxbx8YP14g565Xke",
"country": "US",
"account_holder_id": "id-1",
"account_holder_type": "business",
"label": "returns"
},
...
]
}' \
https://api.ntropy.com/v2/models/my-model-name
# Python SDK
from ntropy_sdk import SDK, LabeledTransaction
sdk = SDK("YOUR-API-KEY")
txs = [
LabeledTransaction(
description = "AMAZON WEB SERVICES AWS.AMAZON.CO WA Ref5543286P25S Crd15",
entry_type = "outgoing",
amount = 12042.37,
iso_currency_code = "USD",
date = "2021-11-01",
transaction_id = "4yp49x3tbj9mD8DB4fM8DDY6Yxbx8YP14g565Xketw3tFmn",
country = "US",
account_holder_id = "id-1",
account_holder_type = "business",
label="cloud"
),
LabeledTransaction(
description = "Purchase Return 10/22 Apple.Com/US CA Card 5233",
entry_type = "incoming",
amount = 150.94,
iso_currency_code = "USD",
date = "2021-11-02",
transaction_id = "tw3tFmn4yp49x3tbj9mD8DB4fM8DDY6Yxbx8YP14g565Xke",
country = "US",
account_holder_id = "id-1",
account_holder_type = "business",
label="returns"
),
...
]
model = sdk.train_custom_model(txs, "my-model-name")
Since model training is an asynchronous operation, you can check the status of the submitted model, or wait for it to complete, as follows:
- cURL
- Python SDK
$ curl \
-H "X-API-KEY: <YOUR-API-KEY>" \
-H "Content-Type: application/json" \
-X GET \
https://api.ntropy.com/v2/models/my-model-name
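# Python SDK: `model` is the object returned by sdk.train_custom_model
# non-blocking: returns the current status and progress
_, status, progress = model.poll()
print(status)
# block until training is complete
model.wait()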
When the returned model status is ready, the model can be used for classification.
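If you poll the status yourself, a small loop with a timeout keeps the wait bounded. This is a minimal sketch: `wait_until_ready` and `fetch_status` are hypothetical names, and `fetch_status` stands for whatever you use to read the status string (for example a wrapper around the GET request above). Only the ready status comes from this page; the "pending" value in the simulation is a placeholder.

```python
import time

def wait_until_ready(fetch_status, interval=5.0, timeout=600.0, sleep=time.sleep):
    """Poll fetch_status() until it returns "ready" or the timeout elapses.

    Returns True if the model became ready, False on timeout.
    """
    waited = 0.0
    while True:
        if fetch_status() == "ready":
            return True
        if waited >= timeout:
            return False
        sleep(interval)
        waited += interval

# Simulated status sequence so the sketch runs without network access.
statuses = iter(["pending", "pending", "ready"])
ok = wait_until_ready(lambda: next(statuses), interval=0.0, sleep=lambda s: None)
print(ok)
```

The `sleep` parameter is injected only to make the loop easy to test; in normal use the default `time.sleep` applies.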
Evaluation
Evaluating a trained model is straightforward with the SDK. The model instance returned by sdk.train_custom_model or sdk.get_custom_model can be used to calculate a number of metrics, given a list of LabeledTransaction objects to use as a test set:
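from ntropy_sdk import SDK, LabeledTransaction
sdk = SDK("YOUR-API-KEY")
# load_transactions is a placeholder for your own loader;
# it should return two lists of LabeledTransaction objects
train_txs, test_txs = load_transactions(...)
model = sdk.train_custom_model(train_txs, "my-model-name")
# training is asynchronous, so block until it completes before evaluating
model.wait()
res = model.eval(test_txs)
print(f"accuracy = {res.accuracy()}")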
You can find more information on how each metric is calculated in the SDK reference for the eval() method.
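The evaluation above needs a held-out test set. One common way to build it, sketched here under the assumption that your labeled data is a list of dicts with a label key, is a stratified split that keeps every label represented on both sides. `stratified_split` is an illustrative helper and not an SDK function.

```python
import random
from collections import defaultdict

def stratified_split(transactions, test_fraction=0.2, seed=0):
    """Split labeled transactions into train/test lists, label by label."""
    by_label = defaultdict(list)
    for tx in transactions:
        by_label[tx["label"]].append(tx)

    rng = random.Random(seed)  # fixed seed makes the split reproducible
    train, test = [], []
    for label in sorted(by_label):
        group = by_label[label][:]
        rng.shuffle(group)
        # hold out at least one example per label
        n_test = max(1, int(len(group) * test_fraction))
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test

# Illustrative data: 20 transactions for each of two labels.
data = [
    {"description": f"tx {i}", "label": "cloud" if i % 2 else "returns"}
    for i in range(40)
]
train_txs, test_txs = stratified_split(data)
print(len(train_txs), len(test_txs))
```

The resulting dicts would still need to be converted to LabeledTransaction objects before being passed to the SDK.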
Classify
To obtain the labels of a transaction with a trained custom model, run enrichment on the transaction and provide the model name in the model-name query parameter of the /v2/transactions/sync or /v2/transactions/async API calls (or the model_name argument in SDK calls). See the example below:
- cURL
- Python SDK
$ curl \
-H "X-API-KEY: <YOUR-API-KEY>" \
-H "Content-Type: application/json" \
-X POST \
--data '[
...
]' \
"https://api.ntropy.com/v2/transactions/sync?model-name=my-model-name"
# Python SDK
from ntropy_sdk import SDK, Transaction
sdk = SDK("YOUR-API-KEY")
txs = [
Transaction(
...
),
...
]
enriched = sdk.add_transactions(txs, model_name="my-model-name")
In this case, the returned list of EnrichedTransaction objects will contain labels obtained using the custom-trained model.
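When calling the API directly, the model name has to be encoded into the query string. A minimal sketch using only the standard library follows; `enrichment_url` is a hypothetical helper, and the endpoint paths and the model-name parameter are the ones named in this section.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.ntropy.com/v2/transactions"

def enrichment_url(model_name, sync=True):
    """Build the enrichment URL with the model-name query parameter."""
    endpoint = "sync" if sync else "async"
    query = urlencode({"model-name": model_name})  # escapes unsafe characters
    return f"{BASE_URL}/{endpoint}?{query}"

url = enrichment_url("my-model-name")
print(url)
```

Using `urlencode` rather than string concatenation avoids malformed URLs if a model name ever contains spaces or other reserved characters.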
Constraints
The training process currently has the following constraints:
- The maximum number of transactions to train a model is 50k
- The minimum number of categories (labels) to train a model is 2
- The minimum number of transactions per category is 16
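These limits can be checked locally before submitting a training request. The sketch below encodes the three constraints listed above; `check_training_set` is an illustrative helper rather than an SDK function, and it only looks at the list of labels.

```python
from collections import Counter

MAX_TRANSACTIONS = 50_000  # maximum transactions per training set
MIN_LABELS = 2             # minimum number of categories
MIN_PER_LABEL = 16         # minimum transactions per category

def check_training_set(labels):
    """Return a list of constraint violations (empty if the set is valid)."""
    problems = []
    counts = Counter(labels)
    if len(labels) > MAX_TRANSACTIONS:
        problems.append(f"{len(labels)} transactions exceeds {MAX_TRANSACTIONS}")
    if len(counts) < MIN_LABELS:
        problems.append(f"only {len(counts)} label(s), need at least {MIN_LABELS}")
    for label, n in sorted(counts.items()):
        if n < MIN_PER_LABEL:
            problems.append(f"label '{label}' has {n} transactions, need {MIN_PER_LABEL}")
    return problems

labels = ["cloud"] * 20 + ["returns"] * 3
for problem in check_training_set(labels):
    print(problem)
```

Since the constraints are described as current, the constants may need updating if the limits change.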