Dedup Algorithms - Concatenate Elements and Dedup with Groovy Script

The script below takes one or more Salesforce Opportunity documents and prepares the data for a NetSuite query that checks whether each customer already exists. The parameter on the NetSuite connector uses the current document data to perform the query, and a decision later in the process can determine whether a new NetSuite customer needs to be created. Multiple documents come into the Data Process shape, and the script transforms them into a flat file with a single row: the AccountId values are deduped and joined with commas. The output is a comma-delimited string, split into batches of up to 500 elements per document. The NetSuite query matches on the external ID in NetSuite. This configuration reduces the number of API calls made to NetSuite and decreases the total time the query takes.

The script will likely need to be modified for your specific use case, but it can be viewed as a common way to dedup when the source data is XML and the endpoint accepts queries with comma-separated values. One option when modifying the script is to read a dynamic document property instead of parsing the XML.
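
As a rough sketch of the dynamic document property option, assume an earlier shape has already stored the account ID in a dynamic document property named DDP_ACCOUNT_ID (a hypothetical name). In Data Process scripts these properties are typically read from the Properties object returned by dataContext.getProperties(i), using the key prefix document.dynamic.userdefined. The helper collectAccountIds below is illustrative only and takes a plain list of Properties so that it can be run outside Boomi.

```groovy
import java.util.Properties

// Hypothetical property name; replace it with the one used in your process.
final String ACCOUNT_ID_KEY = 'document.dynamic.userdefined.DDP_ACCOUNT_ID'

// Collects the unique, non-empty account IDs from a list of per-document Properties.
def collectAccountIds = { List<Properties> docProps, String key ->
    docProps.collect { it.getProperty(key) }
            .findAll { it }   // skip documents where the property is missing or empty
            .unique()         // keep the first occurrence of each ID
}

// Stand-in for dataContext.getProperties(i) on three incoming documents
def docs = ['001A', '001B', '001A'].collect { id ->
    def p = new Properties()
    p.setProperty(ACCOUNT_ID_KEY, id)
    p
}

def ids = collectAccountIds(docs, ACCOUNT_ID_KEY)
println ids.join(',')   // prints 001A,001B
```

Inside the Data Process shape, the list of Properties would be built by calling dataContext.getProperties(i) for each incoming document rather than by hand as shown here.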

// Groovy 2.4
import java.util.Properties
import java.io.InputStream
import com.boomi.execution.ExecutionUtil

/*
*   This script concatenates the unique AccountId values found across multiple
*   Salesforce Opportunity documents. It outputs one or more documents, each
*   containing unique AccountIds separated by commas. The number of AccountIds
*   per document is set with BatchSize. This prevents the queries from being
*   too long and causing an error.
*/

// Input Variable
def BatchSize = 500
logger = ExecutionUtil.getBaseLogger();

// Run if there is 1 or more documents
if (dataContext.getDataCount() > 0) {

    // Create a list to contain the XML elements
    def AccountList = []

    for (int i = 0; i < dataContext.getDataCount(); i++) {
        InputStream is = dataContext.getStream(i)
        String Data = is.getText()
        def slurper = new XmlSlurper(false, true)
        def BaseElement = slurper.parseText(Data)
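        // Skip documents without an AccountId so no empty value ends up in the query string
        def accountId = BaseElement.AccountId.text()
        if (accountId) {
            AccountList.add(accountId)
        }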
    }

    // Make all IDs unique. Remove duplicates.
    AccountList = AccountList.unique()
    def AccountListSize = AccountList.size()
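    // Groovy division of two integers returns a decimal (e.g. 750 / 500 = 1.5),
    // so the loop below also covers a partial final batch.
    def OutDocumentNumber = AccountListSize / BatchSize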

    // Don't output a document if AccountList is empty.
    if (AccountListSize > 0) {
        for (int p = 0; p < OutDocumentNumber; p++) {
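            // Inclusive index range of the current batch
            def lower = p * BatchSize
            def upper = (p * BatchSize) + (BatchSize - 1)
            // Clamp to the last valid index. The comparison must be >= because
            // upper is an inclusive index and AccountListSize itself is out of range.
            if (upper >= AccountListSize) {
                upper = AccountListSize - 1
            }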

            def outData = AccountList[lower..upper].join(',')
            logger.info("List of Salesforce Account Internal IDs: " + outData)
            Properties prop = new Properties()
            InputStream is = new ByteArrayInputStream(outData.toString().getBytes('UTF-8'))
            dataContext.storeStream(is, prop)
        }
    }
}
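
The dedup-and-batch step can also be tried outside Boomi. The sketch below is an alternative formulation, not the script above: it replaces the manual index arithmetic with Groovy's unique() and collate() methods. The helper name toBatches and the sample IDs are made up for illustration, and a batch size of 3 is used only to keep the example short.

```groovy
// Hypothetical helper: dedups the IDs and returns one comma-delimited
// string per batch of at most batchSize IDs.
def toBatches = { List<String> ids, int batchSize ->
    ids.findAll { it }            // drop null or empty IDs (returns a new list)
       .unique()                  // remove duplicates, keeping first occurrences
       .collate(batchSize)        // split into sublists of at most batchSize
       .collect { it.join(',') }  // one comma-delimited string per batch
}

def batches = toBatches(['A1', 'B2', 'A1', 'C3', '', 'D4', 'B2'], 3)
batches.each { println it }
// prints:
// A1,B2,C3
// D4
```

In the Data Process shape, each string returned by such a helper would be written out with dataContext.storeStream, in the same way the script above does for outData.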

Article originally posted at Boomi Community.