Intelligent Document Separation & Migration

The Problem

A mid-sized investment bank had contracted to sell one of its internally developed, but peripheral, businesses. Ninety days before closing, firm leadership realized that the business being divested held more than 5 terabytes of digital assets, and that a significant percentage of those assets were proprietary to the firm and could not legally be transferred to the acquirer. Discussions with the firm's IT outsourcer and multinational consulting firms produced proposals requiring six months to more than a year to segregate and transfer the digital assets, at costs in the millions of dollars. Because the acquirer could not continue to service the clients of the divested business without access to employee emails, financial models, legal documents, and similar material, the deal was at considerable risk.

The bank's outside counsel introduced the bank to CBIZ Technology, a long-term OpStack partner, which brought us in to assess the situation and propose a solution.

The challenges that we needed to overcome were:

Key Success Factors

The OpStack team provided requirements for a full inventory of all the digital assets in scope and, while the client's MSP collected that inventory, began a discussion with the client on the criteria that would determine the future ownership of any given document or message. It was determined that a combination of external email addresses and a specific, limited set of keywords would decide whether any given message or document needed to be retained rather than transferred to the acquirer. In parallel, OpStack experimented with mail and file movement at volume to determine achievable rates given the bandwidth Microsoft provides for data exfiltration.
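To make the retention criteria concrete, here is a minimal Python sketch of an address-plus-keyword rule. It is an illustration only, not the project's actual tooling: the `Item` type, the sample addresses and keywords, and the choice to treat a match on either list as sufficient are all assumptions, since the case study does not say how the two criteria were combined.

```python
from dataclasses import dataclass
from typing import FrozenSet, Sequence


@dataclass
class Item:
    """Hypothetical stand-in for an email message or document."""
    parties: Sequence[str]  # sender and recipient addresses (empty for files)
    text: str               # subject, body, or extracted document text


# Placeholder criteria; the real lists were agreed with the client.
RETAIN_ADDRESSES: FrozenSet[str] = frozenset({"counsel@example.com"})
RETAIN_KEYWORDS: FrozenSet[str] = frozenset({"project falcon"})


def must_retain(
    item: Item,
    addresses: FrozenSet[str] = RETAIN_ADDRESSES,
    keywords: FrozenSet[str] = RETAIN_KEYWORDS,
) -> bool:
    """Return True if the item should stay with the seller.

    Assumption: a hit on EITHER an external address or a keyword is enough.
    """
    if any(p.strip().lower() in addresses for p in item.parties):
        return True
    lowered = item.text.lower()
    return any(k in lowered for k in keywords)


if __name__ == "__main__":
    sample = Item(["analyst@bank.example", "Counsel@Example.com"], "Q3 numbers")
    print(must_retain(sample))
```

Anything for which `must_retain` returns False would be a candidate for transfer to the acquirer. In practice a rule like this would feed a review step, since simple substring matching can over-match.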

With that information in hand the OpStack team determined that the solution required:

The Solution

The OpStack team selected X1 Enterprise as the search and discovery solution. X1's combination of in-place search, centrally maintained indices, user-friendly review interface, and enterprise management console made it the heart of the information segregation process. This was complemented by best-in-class tools from both Microsoft and third-party vendors.

The OpStack team scripted an orchestrated workflow that allowed for the parallel execution of:

Tuning of the solution continued throughout the project, optimizing the efficiency of indexing, search, deletion, and segregation. By the last month of the project, the performance choke point was Microsoft's undocumented, algorithm-triggered throttling of I/O throughput on the client's subscription. Breaking operations into smaller-than-optimal batches and adding wait timers to some activities lessened the impact, in particular minimizing the notice-free killing of jobs.
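The batching and wait-timer approach can be sketched roughly as follows. This is a generic Python illustration under stated assumptions, not the scripts used on the project: `ThrottledError`, `run_in_batches`, the batch size, and the pause lengths are hypothetical, and the retry with a doubling delay is an added assumption that the case study does not describe.

```python
import time
from typing import Callable, Iterator, List, Sequence


class ThrottledError(Exception):
    """Hypothetical signal that the remote service throttled or killed a job."""


def batches(items: Sequence, size: int) -> Iterator[List]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def run_in_batches(
    items: Sequence,
    process_batch: Callable[[List], None],
    batch_size: int = 50,        # deliberately smaller than the fastest setting
    pause_seconds: float = 2.0,  # wait timer between batches
    max_retries: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Process items in small batches, pausing between them.

    A throttled batch is retried after a wait that doubles each time.
    Returns the number of items processed.
    """
    done = 0
    for batch in batches(items, batch_size):
        delay = pause_seconds
        for attempt in range(max_retries + 1):
            try:
                process_batch(batch)
                break
            except ThrottledError:
                if attempt == max_retries:
                    raise
                sleep(delay)
                delay *= 2
        done += len(batch)
        sleep(pause_seconds)  # give the service room before the next batch
    return done
```

The trade-off is the one the text describes: smaller batches and pauses cost raw throughput, but a throttled or killed job then loses only one small batch of work rather than a long-running operation.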

The Result