Highlight Report

EY fixes flaws in GenAI to deliver large-scale code migration for financial services client

Transforming financial services (FS) clients to take advantage of advances in digital technologies often demands labor-intensive and time-consuming code migrations. The hype around ChatGPT would have you believe that you can simply upload legacy code and watch it get rewritten before your eyes. But in its work with a large FS client, EY has found life isn’t quite that simple.

The client asked EY to apply generative artificial intelligence (GenAI) to a code conversion from the statistical analysis language SAS to PySpark, an interface for Apache Spark in Python. PySpark offers the advantages of allowing coders to write applications using Python APIs and providing the PySpark shell to analyze data in distributed environments.

Code snippet migration is easy, but complex complete migrations reveal challenges with consistency and code contamination

EY found that migrating simple code snippets was easy. But migrating more complex code and complete code repositories at scale required custom engineering on top of large language models (LLMs) such as those behind ChatGPT. Without that custom work, the team encountered challenges with consistency, dependencies, and code contamination.

EY solved these challenges with a custom solution that packaged the native code conversion capabilities of GPT-4 with steps that checked the code for context and dependencies and provided a schema for the overall program flow. The pilot converted 4,000 lines of code with more than 85% accuracy and 80% less effort.
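The report does not describe how these context and dependency checks were built. As a rough illustration only, the Python sketch below assumes each unit of legacy code can be tagged with the datasets it reads and writes, then derives an order in which units could be converted so that upstream context is always available. The `Step` class, the `conversion_order` function, and the dataset names are hypothetical and are not taken from EY's solution.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter  # standard library, Python 3.9+


@dataclass
class Step:
    """Hypothetical unit of legacy code with the datasets it reads and writes."""
    name: str
    reads: set = field(default_factory=set)
    writes: set = field(default_factory=set)


def conversion_order(steps):
    """Return step names ordered so every producer precedes its consumers."""
    # Map each dataset to the step that produces it.
    producer = {ds: s.name for s in steps for ds in s.writes}
    # For each step, collect the other steps whose output it consumes.
    graph = {
        s.name: {
            producer[ds]
            for ds in s.reads
            if ds in producer and producer[ds] != s.name
        }
        for s in steps
    }
    # static_order() raises graphlib.CycleError on circular dependencies.
    return list(TopologicalSorter(graph).static_order())


# Illustrative input, deliberately listed out of order.
steps = [
    Step("report", reads={"summary"}),
    Step("summarize", reads={"clean"}, writes={"summary"}),
    Step("load", writes={"raw"}),
    Step("cleanse", reads={"raw"}, writes={"clean"}),
]
order = conversion_order(steps)
print(order)
```

An ordering like this could also serve as a simple schema of the overall program flow, since it records which step feeds which.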

Pilot led the way for a multiple-language solution

Now EY has adapted the initial bespoke SAS-to-PySpark version to handle other programming languages. EY’s Generative Solution for Code Repository Migration converts complete code repositories in one go, maintaining intact repository structures, dependencies, and other essential variables.

When using GenAI tools such as GitHub Copilot, code gets converted as snippets, and context gets lost along the way. The code that GenAI generates may also rely on outdated or no-longer-supported libraries. EY’s solution includes intelligent “chunking” to split large blocks of code logically, code schema identification and mapping to maintain consistency, nested parsing of large code to retain context, and checks that the correct libraries are used to avoid code contamination.
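To make two of these ideas concrete, here is a minimal Python sketch of logical chunking and a library check. It is not EY's implementation: it assumes a SAS step ends at a line containing only `run;` or `quit;`, and it assumes an allowlist of approved top-level modules is available. The names `chunk_sas` and `disallowed_imports` are hypothetical.

```python
import ast
import re

# Assumption: a SAS step ends at a line consisting only of "run;" or "quit;".
STEP_END = re.compile(r"^\s*(run|quit)\s*;\s*$", re.IGNORECASE)


def chunk_sas(source):
    """Split SAS source into chunks that each end at a step boundary."""
    chunks, current = [], []
    for line in source.splitlines():
        if not line.strip() and not current:
            continue  # skip blank lines between steps
        current.append(line)
        if STEP_END.match(line):
            chunks.append("\n".join(current))
            current = []
    if current:  # trailing statements with no closing run;/quit;
        chunks.append("\n".join(current))
    return chunks


def disallowed_imports(python_code, allowed):
    """Return top-level modules imported by the code that are not allowed."""
    found = set()
    for node in ast.walk(ast.parse(python_code)):
        if isinstance(node, ast.Import):
            found.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            found.add(node.module.split(".")[0])
    return sorted(found - set(allowed))


# Illustrative SAS input with two steps.
sas = """data clean;
  set raw;
  if amount > 0;
run;

proc sort data=clean;
  by id;
run;
"""
chunks = chunk_sas(sas)

# Illustrative generated code; only "pyspark" is on the allowlist here.
generated = "from pyspark.sql import functions as F\nimport pandas as pd\n"
flagged = disallowed_imports(generated, allowed={"pyspark"})
print(len(chunks), flagged)
```

Splitting at step boundaries keeps each DATA or PROC step whole, so a model never sees half a step. The import check parses the generated code without executing it, so the flagged libraries do not need to be installed.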

The Bottom Line: It will take more than a ChatGPT Plus license and an enthusiastic amateur to convert your legacy code.

Because anyone can access AI through the interfaces of ChatGPT and its kin, leaders should be wary of the rash of “dangerous experts” this will deliver—folks who have dabbled and now know “enough to be dangerous.” When it comes to business-critical legacy code, it’s clear that service providers and consultancies retain an essential role in identifying and responding to the risks while deploying the benefits of GenAI.
