Transforming financial services (FS) clients to take advantage of advances in digital technologies often demands labor-intensive and time-consuming code migrations. The hype around ChatGPT would have you believe that you can simply upload legacy code and watch it get rewritten before your eyes. But in its work with a large FS client, EY has found life isn’t quite that simple.
The client asked EY to apply generative artificial intelligence (GenAI) to a code conversion from SAS, the statistical analysis language, to PySpark, the Python API for Apache Spark. PySpark lets coders write Spark applications using Python APIs and provides the PySpark shell for analyzing data interactively in distributed environments.
EY found that migrating simple code snippets was easy. But migrating more complex code, and complete code repositories at scale, required custom engineering on top of large language models (LLMs) such as ChatGPT. Without the custom work, the team encountered challenges with consistency, dependencies, and code contamination.
EY solved these challenges with a custom solution that packaged the native code conversion capabilities of GPT-4 with steps that checked the code for context and dependencies and provided a schema for the overall program flow. The pilot converted 4,000 lines of code with more than 85% accuracy and 80% less effort.
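To make the dependency-checking idea concrete, here is a minimal sketch in Python of how a pre-conversion step might work out which SAS steps rely on datasets produced by earlier steps. It is an illustration only, not EY's implementation: the function names (`datasets`, `dependencies`), the regular expressions, and the sample SAS steps are all assumptions, and the patterns cover only the simplest `data`, `set`, `data=`, and `out=` forms.

```python
import re

# Hypothetical patterns (assumptions, not a full SAS grammar):
# a step writes a dataset via "data NAME;" or "out=NAME",
# and reads one via "set NAME" or "data=NAME".
WRITES = re.compile(r"^\s*data\s+([\w.]+)\s*;|\bout\s*=\s*([\w.]+)",
                    re.IGNORECASE | re.MULTILINE)
READS = re.compile(r"\bset\s+([\w.]+)|\bdata\s*=\s*([\w.]+)", re.IGNORECASE)


def datasets(pattern: re.Pattern, chunk: str) -> set[str]:
    """Return the lower-cased dataset names that the pattern finds in a chunk."""
    return {g.lower() for m in pattern.finditer(chunk) for g in m.groups() if g}


def dependencies(chunks: list[str]) -> dict[int, set[int]]:
    """Map each chunk index to the indices of earlier chunks whose output it reads."""
    producers: dict[str, int] = {}  # dataset name -> index of the chunk that last wrote it
    deps: dict[int, set[int]] = {}
    for i, chunk in enumerate(chunks):
        deps[i] = {producers[name] for name in datasets(READS, chunk)
                   if name in producers}
        for name in datasets(WRITES, chunk):
            producers[name] = i
    return deps


# Illustrative SAS steps, invented for this sketch.
steps = [
    "data work.high_value;\n    set work.accounts;\n    where balance > 10000;\nrun;",
    "proc sort data=work.high_value out=work.ranked;\n    by descending balance;\nrun;",
    "proc print data=work.ranked;\nrun;",
]

dep_map = dependencies(steps)
for index, needs in dep_map.items():
    print(index, sorted(needs))
```

A map like this could be passed alongside each snippet so that the model converting step 2 knows which upstream DataFrame it should expect, rather than treating the snippet in isolation.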
EY has since adapted the initial bespoke SAS-to-PySpark version to handle other programming languages. EY’s Generative Solution for Code Repository Migration converts complete code repositories in one go, keeping repository structures, dependencies, and other essential variables intact.
With GenAI tools such as GitHub Copilot, code gets converted as snippets, and context gets lost in the process. The code that GenAI generates may also rely on outdated or no-longer-supported libraries. EY’s solution includes intelligent “chunking” to split large blocks of code logically, code schema identification and mapping to maintain consistency, nested parsing of large code to retain context, and checks that the correct libraries are used to avoid code contamination.
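As a rough illustration of what logical chunking can mean in practice, the sketch below splits SAS source at step boundaries (a `run;` or `quit;` line) instead of at arbitrary character counts, so each chunk sent for conversion is a complete step. This is a simplified assumption about one possible approach, not a description of EY's solution; `chunk_sas`, `STEP_END`, and the sample program are hypothetical, and real SAS code (macros, comments, statements sharing a line) would need a proper parser.

```python
import re

# Assumption for this sketch: a SAS step ends on a line containing only RUN; or QUIT;
STEP_END = re.compile(r"^\s*(run|quit)\s*;\s*$", re.IGNORECASE)


def chunk_sas(source: str) -> list[str]:
    """Split SAS source into chunks that each end on a RUN; or QUIT; line."""
    chunks: list[str] = []
    current: list[str] = []
    for line in source.splitlines():
        if not line.strip():
            continue  # skip blank lines between steps
        current.append(line)
        if STEP_END.match(line):
            chunks.append("\n".join(current))
            current = []
    if current:
        # Trailing statements with no explicit step end become a final chunk.
        chunks.append("\n".join(current))
    return chunks


# Illustrative SAS program, invented for this sketch.
SAMPLE = """\
data work.high_value;
    set work.accounts;
    where balance > 10000;
run;

proc sort data=work.high_value;
    by descending balance;
run;
"""

chunks = chunk_sas(SAMPLE)
for number, chunk in enumerate(chunks, start=1):
    print(f"--- chunk {number} ---")
    print(chunk)
```

Splitting on step boundaries keeps each DATA or PROC step whole, which is the kind of context that is lost when code is pasted into a chat interface a few lines at a time.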
Because anyone can access AI through the interfaces of ChatGPT and its kin, leaders should be wary of the rash of “dangerous experts” this will produce: folks who have dabbled and now know “enough to be dangerous.” For business-critical legacy code, service providers and consultancies retain an essential role in identifying and responding to the risks while delivering the benefits of GenAI.