Enrichment

The core functionality of the Ntropy API is transaction enrichment. Our system interprets each transaction and returns a structured response containing all the extracted information. The API supports synchronous enrichment of up to 4000 transactions per batch and asynchronous enrichment of up to 24960 transactions per batch.

Transactions

The enrichment process requires a list of transaction objects as input and outputs a list of enriched transaction objects (in the same order) containing additional structured information. The input transaction object contains the following fields:

attribute | type | summary
description | string | Description text of the transaction. Has a maximum length of 1024 characters.
entry_type | enum | Direction of the flow of money from the perspective of the account holder. Possible values are incoming and outgoing.
amount | float | Amount of the transaction. Should always be a positive number. Use entry_type to represent the direction of the flow of money.
iso_currency_code | string | Currency of the transaction in ISO-4217 format. See supported options.
date | date | Date when the transaction was made in ISO-8601 format, i.e. YYYY-MM-DD.
transaction_id | string | Unique identifier of the transaction in your system.
country | string | (optional) The country where the transaction was made in ISO-3166-2 format. See supported options.
account_holder_id | string | (optional) Unique identifier of the account holder in your system.
account_holder_type | enum | (optional) Type of the account holder. Possible values are consumer, business, freelance.

In the transaction model, the amount attribute must always be positive, as the direction of the flow of money is represented only by the entry_type attribute. From the point of view of the account holder, if a transaction represents money leaving the account it should have an entry_type value of outgoing (debit), and if a transaction represents money entering an account it should have an entry_type value of incoming (credit).
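If your own ledger stores signed amounts, a small conversion step can produce the positive amount and entry_type pair described above. The following is a minimal sketch; the helper name split_signed_amount and the convention that a negative value means money leaving the account are assumptions about your data, not part of the API.

```python
from typing import Tuple


def split_signed_amount(signed_amount: float) -> Tuple[float, str]:
    """Convert a signed ledger amount into (amount, entry_type).

    Assumption: in the source ledger, a negative value means money
    left the account (outgoing) and a positive value means money
    entered it (incoming).
    """
    entry_type = "outgoing" if signed_amount < 0 else "incoming"
    return abs(signed_amount), entry_type


print(split_signed_amount(-42.17))  # (42.17, 'outgoing')
print(split_signed_amount(150.94))  # (150.94, 'incoming')
```

A zero amount is passed through as 0.0 here; since amount should be positive, you may prefer to filter such rows out before submitting them.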

caution

Multiple transactions submitted with the same transaction_id are considered the same transaction: the most recent submission replaces any previous ones.
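To avoid accidental overwrites, you may want to deduplicate a batch locally before submitting it. The sketch below mirrors the rule above on the client side; the helper name keep_latest_by_id is hypothetical, and it assumes that "most recent" corresponds to the later position in your list.

```python
from typing import Dict, List


def keep_latest_by_id(transactions: List[dict]) -> List[dict]:
    """Keep one transaction per transaction_id, preferring later entries.

    The output preserves the order in which each id was first seen,
    while the stored value is the last transaction with that id.
    """
    latest: Dict[str, dict] = {}
    for tx in transactions:
        latest[tx["transaction_id"]] = tx
    return list(latest.values())


batch = [
    {"transaction_id": "a", "amount": 1.0},
    {"transaction_id": "b", "amount": 2.0},
    {"transaction_id": "a", "amount": 3.0},  # replaces the first "a"
]
deduplicated = keep_latest_by_id(batch)
print(deduplicated)
```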

While providing account holder information is optional, it is needed for labeling. If neither account_holder_id nor account_holder_type is provided, labeling will be disabled. Check the Account Holder section for more information. The other optional transaction attributes should also be submitted when known, as they improve the quality of the results.
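The input constraints described in this section can be checked locally before a batch is sent. The following is a minimal sketch using only the Python standard library; the function names check_transaction and labeling_enabled are hypothetical, and the checks cover only the rules stated above (the API may apply further validation).

```python
from datetime import date
from typing import List

ENTRY_TYPES = {"incoming", "outgoing"}
ACCOUNT_HOLDER_TYPES = {"consumer", "business", "freelance"}
REQUIRED_FIELDS = (
    "description", "entry_type", "amount",
    "iso_currency_code", "date", "transaction_id",
)


def check_transaction(tx: dict) -> List[str]:
    """Return a list of problems found in one input transaction."""
    problems = []
    for field in REQUIRED_FIELDS:
        if tx.get(field) is None:
            problems.append(f"missing required field: {field}")
    description = tx.get("description")
    if isinstance(description, str) and len(description) > 1024:
        problems.append("description longer than 1024 characters")
    entry_type = tx.get("entry_type")
    if entry_type is not None and entry_type not in ENTRY_TYPES:
        problems.append("entry_type must be incoming or outgoing")
    amount = tx.get("amount")
    if isinstance(amount, (int, float)) and amount <= 0:
        problems.append("amount must be a positive number")
    if isinstance(tx.get("date"), str):
        try:
            date.fromisoformat(tx["date"])
        except ValueError:
            problems.append("date is not a valid ISO-8601 date")
    holder_type = tx.get("account_holder_type")
    if holder_type is not None and holder_type not in ACCOUNT_HOLDER_TYPES:
        problems.append("unknown account_holder_type")
    return problems


def labeling_enabled(tx: dict) -> bool:
    """Labeling needs account_holder_id or account_holder_type."""
    return (tx.get("account_holder_id") is not None
            or tx.get("account_holder_type") is not None)
```

Returning a list of problems rather than raising on the first one makes it easier to report every issue in a large batch at once.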

Enrichment results

The enriched transactions that are returned by the API will contain the set of fields described below.

attribute | type | summary
labels | list(string) | Label from our live hierarchy, depending on the type of account holder (consumer, business, freelance, unknown).
label_group | string | Higher level category that groups together related labels.
recurrence | enum | Indicates whether a transaction is a one-time transfer, e.g. purchasing a mattress (one-off), regularly repeats with personalized pricing, e.g. utilities, mortgage (recurring), regularly repeats with fixed pricing (subscription).
recurrence_group | RecurrenceGroup | If a transaction is recurrent, this is a RecurrenceGroup object with the fields described below (null if the transaction is not recurrent).
location | string | Normalized location of the merchant (if a location is present).
logo | string, url format | Logo of the merchant (if a merchant is present).
merchant | string | Normalized merchant name (if a merchant is present).
merchant_id | string | Unique merchant identifier (if a merchant is present).
person | string | Name of the person in the transaction text (if a person is present).
transaction_id | string | Unique transaction identifier.
website | string | Website of the merchant (if a merchant is present).
mcc | list(int) | Predicted MCC codes, usually containing a single value. Can be multiple values if the merchant can operate with multiple MCCs (if a merchant is present).
intermediaries | list(object) | List of objects containing the properties id, name, website and logo corresponding to all the intermediary merchants of the transaction (e.g. the payment processor).

Most of the output fields listed above can be enabled / disabled for your specific API key, and if disabled they won't be returned by the API. You can check your MSA for details. You can also reach out to Ntropy Support at support@ntropy.com for more information regarding returned fields.

More details on consumer labels can be found here.

Recurrence Group Schema

When a transaction is recurrent, the recurrence group contains the calculated information for that transaction and others within the same recurrence group. The schema of the RecurrenceGroup is as follows.

attribute | type | summary
first_payment_date | date | Date of the first transaction in the recurrence group.
latest_payment_date | date | Date of the latest transaction in the recurrence group.
periodicity_in_days | int | Median number of days between consecutive transactions in the recurrence group.
average_amount | float | Average amount of the transactions in the recurrence group.
other_party | string | Source/merchant of the transactions in the recurrence group.
id | string | Recurrence group identifier.
transaction_ids | list(string) | List of the unique transaction identifiers in the recurrence group.
total_amount | float | Sum of the amounts of the transactions in the recurrence group.
periodicity | enum | Detected periodicity (weekly, bi-weekly, monthly, bi-monthly, quarterly, semi-yearly, yearly, other).
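To make the aggregate fields concrete, the sketch below recomputes them from a list of payments according to the definitions in the schema (median gap in days, average, and sum). It illustrates the field definitions only; how the API computes these values internally is not documented here, and the helper name summarize_group and the rounding to two decimals are assumptions.

```python
from datetime import date
from statistics import median
from typing import List, Tuple


def summarize_group(payments: List[Tuple[str, float]]) -> dict:
    """Summarize (iso_date, amount) pairs of one recurrence group."""
    # ISO dates sort chronologically when sorted as strings.
    payments = sorted(payments)
    dates = [date.fromisoformat(day) for day, _ in payments]
    amounts = [amount for _, amount in payments]
    # Days between each pair of consecutive payments.
    gaps = [(later - earlier).days for earlier, later in zip(dates, dates[1:])]
    total = sum(amounts)
    return {
        "first_payment_date": dates[0].isoformat(),
        "latest_payment_date": dates[-1].isoformat(),
        # int() truncates a fractional median (even number of gaps).
        "periodicity_in_days": int(median(gaps)) if gaps else None,
        "average_amount": round(total / len(amounts), 2),
        "total_amount": round(total, 2),
    }


summary = summarize_group([
    ("2021-08-01", 12042.37),
    ("2021-09-01", 12042.37),
    ("2021-10-01", 12042.37),
    ("2021-11-01", 12042.37),
])
print(summary["periodicity_in_days"])  # 31
print(summary["total_amount"])         # 48169.48
```

The gaps here are 31, 30 and 31 days, so the median is 31 even though one month is shorter.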

In an enriched transaction, some fields may be null if their value could not be determined (or does not make sense in the context of the request). For example, merchant, merchant_id, logo, website and mcc will be set to null if no merchant is detected in the transaction. Some fields must also be enabled for your account on request, through any support communication channel.
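Because a field can be null, or missing entirely when it is disabled for your API key, client code should not assume that every key is present. The following sketch reads an enriched transaction defensively; the helper name merchant_summary and the sample values are made up for illustration.

```python
def merchant_summary(enriched: dict) -> str:
    """Build a short description that tolerates null or absent fields."""
    # dict.get returns None both for null values and for absent keys.
    merchant = enriched.get("merchant")
    if merchant is None:
        return "no merchant detected"
    website = enriched.get("website") or "no website"
    mcc = enriched.get("mcc") or []
    return f"{merchant} ({website}), mcc={mcc}"


print(merchant_summary({"merchant": "Apple", "website": "apple.com", "mcc": [5732]}))
print(merchant_summary({"merchant": None, "person": None}))
```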

Labeling hierarchies

The labels returned by the API are different depending on the account_holder_type of the transaction. You can find the listing containing all the labels for each account_holder_type below:

Synchronous enrichment

When submitting transactions synchronously, the API will block until the enrichment is complete and then return the list of enriched transactions in the same order as the input transactions.

Synchronous enrichment is straightforward, but it is only appropriate for smaller batches (up to 4000 transactions). For larger batches, asynchronous enrichment must be used.
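The batch size limits can be applied in client code before choosing how to submit. The following is a minimal sketch; the function names pick_endpoint and chunked are hypothetical, while the limits and endpoint paths are the ones used in this document.

```python
from typing import Iterator, List

SYNC_LIMIT = 4000    # maximum batch size for synchronous enrichment
ASYNC_LIMIT = 24960  # maximum batch size for asynchronous enrichment


def pick_endpoint(batch_size: int) -> str:
    """Choose the enrichment endpoint path for a batch of this size."""
    if batch_size < 1:
        raise ValueError("a batch needs at least one transaction")
    if batch_size <= SYNC_LIMIT:
        return "/v2/transactions/sync"
    if batch_size <= ASYNC_LIMIT:
        return "/v2/transactions/async"
    raise ValueError("batch too large, split it into smaller batches first")


def chunked(transactions: List[dict], size: int = ASYNC_LIMIT) -> Iterator[List[dict]]:
    """Split a long list into consecutive batches of at most `size` items."""
    for start in range(0, len(transactions), size):
        yield transactions[start:start + size]


print(pick_endpoint(2))     # /v2/transactions/sync
print(pick_endpoint(4001))  # /v2/transactions/async
```

Slicing keeps the input order, so the enriched results of consecutive batches can be concatenated to match the original list.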

You can see how to run a synchronous enrichment for two transactions in the following example:

$ curl \
    -H "X-API-KEY: <YOUR-API-KEY>" \
    -H "Content-Type: application/json" \
    -X POST \
    --data '[
      {
        "description": "AMAZON WEB SERVICES AWS.AMAZON.CO WA Ref5543286P25S Crd15",
        "entry_type": "outgoing",
        "amount": 12042.37,
        "iso_currency_code": "USD",
        "date": "2021-11-01",
        "transaction_id": "4yp49x3tbj9mD8DB4fM8DDY6Yxbx8YP14g565Xketw3tFmn",
        "country": "US",
        "account_holder_id": "id-1",
        "account_holder_type": "business"
      },
      {
        "description": "Purchase Return 10/22 Apple.Com/US CA Card 5233",
        "entry_type": "incoming",
        "amount": 150.94,
        "iso_currency_code": "USD",
        "date": "2021-11-02",
        "transaction_id": "tw3tFmn4yp49x3tbj9mD8DB4fM8DDY6Yxbx8YP14g565Xke",
        "country": "US",
        "account_holder_type": "business"
      }
    ]' \
    https://api.ntropy.com/v2/transactions/sync

You will get back the enriched transactions as a JSON array when using the API, and as a list of EnrichedTransaction objects when using the SDK:

[
  {
    "labels": [
      "infrastructure",
      "cloud operations"
    ],
    "location": "wa",
    "logo": "https://logos.ntropy.com/aws.amazon.com",
    "merchant": "Amazon Web Services",
    "merchant_id": "2c0c799b-d003-30a6-9f53-878f3ebf46aa",
    "person": null,
    "recurrence": "recurring",
    "recurrence_group": {
      "average_amount": 12042.37,
      "first_payment_date": "2021-08-01",
      "latest_payment_date": "2021-11-01",
      "periodicity_in_days": 31,
      "id": "def66e68-37fd-3358-a9c7-2ba3f0f8edc4",
      "other_party": "aws.amazon.com",
      "periodicity": "monthly",
      "total_amount": 48169.48,
      "transaction_ids": [
        "4yp49x3tbj9mD8DB4fM8DDY6Yxbx8YP14g565Xketw3tFzn",
        "4yp49x3tbj9mD8DB4fM8DDY6Yxbx8YP14g565Xketw3tFcn",
        "4yp49x3tbj9mD8DB4fM8DDY6Yxbx8YP14g565Xketw3tFan",
        "4yp49x3tbj9mD8DB4fM8DDY6Yxbx8YP14g565Xketw3tFmn"
      ]
    },
    "transaction_id": "4yp49x3tbj9mD8DB4fM8DDY6Yxbx8YP14g565Xketw3tFmn",
    "transaction_type": "business",
    "website": "aws.amazon.com"
  },
  {
    "labels": [
      "inflows",
      "refunds"
    ],
    "location": "ca",
    "logo": "https://logos.ntropy.com/apple.com",
    "merchant": "Apple",
    "merchant_id": "15569cfd-b93d-3990-9c0a-be6b9a5fab64",
    "person": null,
    "recurrence": "one off",
    "recurrence_group": null,
    "transaction_id": "tw3tFmn4yp49x3tbj9mD8DB4fM8DDY6Yxbx8YP14g565Xke",
    "transaction_type": "unknown",
    "website": "apple.com"
  }
]
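If you prefer not to shell out to curl, the same synchronous request can be assembled with the Python standard library. This is a sketch rather than an official client: the helper name build_sync_request is hypothetical, and the network call is left commented out so the snippet runs without credentials.

```python
import json
import urllib.request
from typing import List

SYNC_URL = "https://api.ntropy.com/v2/transactions/sync"


def build_sync_request(api_key: str, transactions: List[dict]) -> urllib.request.Request:
    """Build (but do not send) a POST request for synchronous enrichment."""
    body = json.dumps(transactions).encode("utf-8")
    return urllib.request.Request(
        SYNC_URL,
        data=body,
        headers={"X-API-KEY": api_key, "Content-Type": "application/json"},
        method="POST",
    )


request = build_sync_request("<YOUR-API-KEY>", [{
    "description": "Purchase Return 10/22 Apple.Com/US CA Card 5233",
    "entry_type": "incoming",
    "amount": 150.94,
    "iso_currency_code": "USD",
    "date": "2021-11-02",
    "transaction_id": "tw3tFmn4yp49x3tbj9mD8DB4fM8DDY6Yxbx8YP14g565Xke",
}])
print(request.get_method(), request.full_url)

# To actually send it (requires a valid API key and network access):
# with urllib.request.urlopen(request) as response:
#     enriched = json.load(response)
```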

In the SDK you can retrieve a serializable representation of the transaction containing all attributes by using the to_dict() method of the EnrichedTransaction:


import json

from ntropy_sdk import SDK, Transaction

sdk = SDK("YOUR-API-KEY")
txs = [
    Transaction(
        description="AMAZON WEB SERVICES AWS.AMAZON.CO WA Ref5543286P25S Crd15",
        entry_type="outgoing",
        amount=12042.37,
        iso_currency_code="USD",
        date="2021-11-01",
        transaction_id="4yp49x3tbj9mD8DB4fM8DDY6Yxbx8YP14g565Xketw3tFmn",
        country="US",
        account_holder_id="id-1",
        account_holder_type="business",
    ),
]

# enrich the batch and take the first (and only) result
enriched = sdk.add_transactions(txs)[0]
print(enriched.to_dict())

# and convert it to JSON
print(json.dumps(enriched.to_dict()))

Asynchronous enrichment

Transactions can also be enriched asynchronously with the async endpoint. Asynchronous enrichment uses the same input format as synchronous enrichment, but can handle larger batches of 1 to 24960 transactions in a single request.

Unlike the synchronous endpoint, the asynchronous endpoint will answer immediately, returning information for the submitted batch. This includes an id which can be used to query for the status of the batch.

The following example shows how to submit a batch asynchronously:

$ curl \
    -H "X-API-KEY: <YOUR-API-KEY>" \
    -H "Content-Type: application/json" \
    -X POST \
    --data '[
      {
        "description": "AMAZON WEB SERVICES AWS.AMAZON.CO WA Ref5543286P25S Crd15",
        "entry_type": "outgoing",
        "amount": 12042.37,
        "iso_currency_code": "USD",
        "date": "2021-11-01",
        "transaction_id": "4yp49x3tbj9mD8DB4fM8DDY6Yxbx8YP14g565Xketw3tFmn",
        "country": "US",
        "account_holder_id": "id-1",
        "account_holder_type": "business"
      },
      {
        "description": "Purchase Return 10/22 Apple.Com/US CA Card 5233",
        "entry_type": "incoming",
        "amount": 150.94,
        "iso_currency_code": "USD",
        "date": "2021-11-02",
        "transaction_id": "tw3tFmn4yp49x3tbj9mD8DB4fM8DDY6Yxbx8YP14g565Xke",
        "country": "US",
        "account_holder_id": "id-1",
        "account_holder_type": "business"
      }
    ]' \
    https://api.ntropy.com/v2/transactions/async

Handling async in the API

The async API endpoint returns a JSON response containing information about the batch, such as id, status, and progress.

{
  "id": "39a316c6-eb48-4dc4-a094-025243221ddc",
  "status": "started",
  "progress": 0,
  "updated_at": "2021-10-28T08:32:55.340976+00:00"
}

The id field can be used in the /v2/transactions/async/{id} endpoint to request the enriched set of transactions. For example, for the previous batch:

$ curl \
-H "X-API-KEY: <YOUR-API-KEY>" \
https://api.ntropy.com/v2/transactions/async/39a316c6-eb48-4dc4-a094-025243221ddc

If the processing is still in progress, you will get a response back similar to the previous request, with the progress field showing the number of transactions from the batch that have finished processing. When batch enrichment is complete, the status field will be set to finished and the object in the response will contain an additional results field that holds the list of enriched transactions, e.g.:

{
  "id": "39a316c6-eb48-4dc4-a094-025243221ddc",
  "progress": 2,
  "results": [ ... ],
  "status": "finished",
  "updated_at": "2021-10-28T08:55:25.294044+00:00"
}
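When calling the API directly, the status endpoint has to be polled until the batch is finished. The sketch below shows one way to structure that loop; wait_for_batch is a hypothetical helper, and fetch_status stands in for whatever function performs the GET request to /v2/transactions/async/{id} and returns the decoded JSON. Only the status values shown in this document (started, finished) are assumed.

```python
import time
from typing import Callable


def wait_for_batch(
    fetch_status: Callable[[], dict],
    poll_interval: float = 1.0,
    max_polls: int = 60,
) -> list:
    """Poll until the batch status is "finished", then return its results."""
    for _ in range(max_polls):
        batch = fetch_status()
        if batch["status"] == "finished":
            return batch["results"]
        # Not done yet: progress counts the transactions processed so far.
        time.sleep(poll_interval)
    raise TimeoutError("batch did not finish within the allowed number of polls")


# Simulated responses so the loop can be exercised without network access.
responses = iter([
    {"status": "started", "progress": 0},
    {"status": "started", "progress": 1},
    {"status": "finished", "progress": 2,
     "results": [{"merchant": "Amazon Web Services"}, {"merchant": "Apple"}]},
])
results = wait_for_batch(lambda: next(responses), poll_interval=0)
print(len(results))  # 2
```

Passing the fetch function in as a parameter keeps the loop independent of the HTTP client and makes it easy to test with canned responses, as done here.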

Handling async in the SDK

The SDK response for async enrichment is encapsulated in a Batch object that lets users poll the API manually or block while continuously polling for a response:

# batch is the Batch object returned by the SDK for an asynchronous submission
response, status = batch.poll()

print("id: %s, status: %s" % (batch.id, status))

# do any operations in the meantime
# ...

# block waiting for the result
result = batch.wait(poll_interval=1)

Additionally, a Batch object can be directly constructed from its id. The following code is equivalent to querying the /v2/transactions/async/{id} endpoint:

batch = Batch(sdk, "39a316c6-eb48-4dc4-a094-025243221ddc")
response, status = batch.poll()

Accessing input transactions

If you are using the SDK, all EnrichedTransaction objects obtained through add_transactions or get_account_holder_transactions contain a reference to the original input transaction, allowing access to all input parameters such as description and amount. You can use it as follows:

sdk = SDK("YOUR-API-KEY")
enriched = sdk.add_transactions(txs)

for e in enriched:
    print(f"{e.parent_tx.description} -> {e.merchant}")

# it can also be used when fetching history
account_holder_history = sdk.get_account_holder_history("YOUR-ACCOUNT-HOLDER")
for e in account_holder_history:
    print(f"{e.parent_tx.description} -> {e.merchant}")

When using the API, the

Enriching tabular data (csv, dataframe)

The SDK can be used to enrich tabular data directly by integrating with the pandas library. All the enrichment operations can receive a pandas.DataFrame and will in turn return one as well. The tabular data should contain columns with the same names as the expected input transaction attributes described above.

An example .csv file would be:

transaction_id,description,entry_type,amount,date,iso_currency_code,country,account_holder_type,account_holder_id
1234,TEST TRANSACTION,outgoing,123.4,2022-01-01,USD,US,business,id-1234

This file can be processed by loading the .csv with pandas and enriching with the SDK. Note that the output will also be a pandas.DataFrame that can be persisted to .csv using the to_csv method:

import pandas as pd
from ntropy_sdk import SDK

tx_df = pd.read_csv("transactions.csv")
sdk = SDK("YOUR-API-KEY")

enriched_df = sdk.add_transactions(tx_df) # output is also a dataframe
enriched_df.to_csv("enriched.csv")
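Since the column names must match the input transaction attributes, it can help to check a file's header before enriching it. The following sketch uses only the standard library csv module; the helper name missing_columns is hypothetical, and the required set is taken from the non-optional attributes listed in the Transactions section.

```python
import csv
import io
from typing import List

REQUIRED_COLUMNS = {
    "description", "entry_type", "amount",
    "iso_currency_code", "date", "transaction_id",
}


def missing_columns(csv_text: str) -> List[str]:
    """Return the required column names absent from the CSV header."""
    reader = csv.DictReader(io.StringIO(csv_text))
    present = set(reader.fieldnames or [])
    return sorted(REQUIRED_COLUMNS - present)


sample = (
    "transaction_id,description,entry_type,amount,date,iso_currency_code\n"
    "1234,TEST TRANSACTION,outgoing,123.4,2022-01-01,USD\n"
)
print(missing_columns(sample))  # []
```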

Enriching bank statements

The SDK also supports enriching bank statements submitted as PDF files of up to 200MB. The file is uploaded and then processed asynchronously by our OCR pipeline.

Once processed, the result is a set of input transactions that can then be fed to the enrichment API as seen previously.

An example code snippet is as follows:

from ntropy_sdk import SDK

sdk = SDK("YOUR-API-KEY")

with open("bank_statement.pdf", "rb") as fh:
    bsr = sdk.add_bank_statement(file=fh, filename="bank_statement.pdf")

# do operations in the meantime
# ...

# block and wait for result
bs = bsr.wait()
enriched_txs = sdk.add_transactions(bs.transactions)