File Enrichment (CSV, PDF)
Enriching tabular data (csv, dataframe)
The SDK can enrich tabular data directly through its integration with the pandas library. All enrichment operations accept a pandas.DataFrame and return one as well.
The tabular data should contain columns with the same names as the expected input transaction attributes described above.
An example .csv file would be:
transaction_id,description,entry_type,amount,date,iso_currency_code,country_code,account_holder_type,account_holder_id
1234,TEST TRANSACTION,outgoing,123.4,2022-01-01,USD,USA,business,id-1234
This file can be processed by loading the .csv with pandas and enriching it with the SDK. Note that the output is also a pandas.DataFrame, which can be persisted to .csv using the to_csv method:
import pandas as pd
from ntropy_sdk import SDK
tx_df = pd.read_csv("transactions.csv")
sdk = SDK("YOUR-API-KEY")
enriched_df = sdk.add_transactions(tx_df) # output is also a dataframe
enriched_df.to_csv("enriched.csv")
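Since the columns must match the expected transaction attribute names, it can help to check the header before loading the file. The following is a minimal sketch using only the standard library. The helper name `missing_columns` is hypothetical and not part of the SDK, and the list of expected columns is simply copied from the example header above; whether every one of them is mandatory is an assumption.

```python
import csv
import io

# Column names copied from the example .csv header; treating all of them
# as required is an assumption of this sketch.
EXPECTED_COLUMNS = [
    "transaction_id", "description", "entry_type", "amount", "date",
    "iso_currency_code", "country_code", "account_holder_type",
    "account_holder_id",
]


def missing_columns(csv_file, expected=EXPECTED_COLUMNS):
    """Return the expected column names absent from the file's header row."""
    # An empty file yields an empty header, so every column is reported missing.
    header = next(csv.reader(csv_file), [])
    present = {name.strip() for name in header}
    return [name for name in expected if name not in present]


# Hypothetical in-memory file mirroring the example above.
sample = io.StringIO(
    "transaction_id,description,entry_type,amount,date,iso_currency_code,"
    "country_code,account_holder_type,account_holder_id\n"
    "1234,TEST TRANSACTION,outgoing,123.4,2022-01-01,USD,USA,business,id-1234\n"
)
print(missing_columns(sample))  # an empty list means the header is complete
```

With a file on disk, the same check could be run as `missing_columns(open("transactions.csv", newline=""))` before calling `pd.read_csv`.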
Enriching PDF files
The SDK also supports enriching PDF files of up to 200MB. The file is submitted and then processed asynchronously by our OCR pipeline.
Once processed, the result is a table containing the enriched transactions that were recognised by OCR.
An example code snippet is as follows:
from ntropy_sdk import SDK
sdk = SDK("YOUR-API-KEY")
with open('bank_statement.pdf', 'rb') as f:
    bsr = sdk.add_bank_statement(file=f)
# do operations in the meantime
# ...
# block and wait for result
df = bsr.wait()
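Because of the 200MB limit, it may be worth checking the file size locally before submitting. The following is a minimal sketch; the helper `check_pdf_size` is hypothetical and not part of the SDK, and reading "200MB" as 200 * 1024 * 1024 bytes is an assumption, since the exact byte threshold is not stated here.

```python
import os

# Assumption: "200MB" is interpreted as 200 * 1024 * 1024 bytes; the exact
# threshold enforced by the service is not stated in this document.
MAX_PDF_BYTES = 200 * 1024 * 1024


def check_pdf_size(path, limit=MAX_PDF_BYTES):
    """Return the file size in bytes, or raise ValueError if it exceeds `limit`."""
    size = os.path.getsize(path)
    if size > limit:
        raise ValueError(f"{path} is {size} bytes, above the {limit}-byte limit")
    return size
```

Calling `check_pdf_size("bank_statement.pdf")` before opening the file would then fail early for oversized statements instead of after an upload attempt.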
You may also specify the account holder with which the underlying transactions will be associated, instead of relying on default values. As with individual transactions, the account holder is created implicitly:
from ntropy_sdk import SDK, AccountHolderType
sdk = SDK("YOUR-API-KEY")
# Creates account holder with default account_type `business`
with open('bank_statement.pdf', 'rb') as f:
    bsr = sdk.add_bank_statement(
        file=f,
        account_holder_id="7b6ce4f1-004a-40f0-a480-562a831809ef",
    )
# Uses `consumer` as account_holder_type
with open('bank_statement.pdf', 'rb') as f:
    bsr = sdk.add_bank_statement(
        file=f,
        account_holder_id="7b6ce4f1-004a-40f0-a480-562a831809ef",
        account_type=AccountHolderType.consumer,
    )