EY fixes flaws in GenAI to deliver large-scale code migration for financial services client

Transforming financial services (FS) clients to take advantage of advances in digital technologies often demands labor-intensive and time-consuming code migrations. The hype around ChatGPT would have you believe that you can simply upload legacy code and watch it get rewritten before your eyes. But in its work with a large FS client, EY has found life isn’t quite that simple.

The client asked EY to apply generative artificial intelligence (GenAI) to a code conversion from the statistical analysis language SAS to PySpark, the Python API for Apache Spark. PySpark lets developers write Spark applications using Python APIs and provides an interactive shell for analyzing data in distributed environments.

Code snippet migration is easy, but complex complete migrations reveal challenges with consistency and code contamination

EY found that migrating simple code snippets was easy. But more complex code, and complete code repository migration at scale, required custom engineering on top of large language models (LLMs) such as those behind ChatGPT. Without the custom work, the team encountered challenges with consistency, dependencies, and code contamination.

The team solved these challenges with a custom solution that packaged the native code conversion capabilities of GPT-4 with steps that checked the code for context and dependencies and provided a schema for the overall program flow. The pilot converted 4,000 lines of code with 85%+ accuracy and 80% less effort.
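To illustrate what a dependency check ahead of conversion might look like, here is a minimal sketch in Python. It is not EY's implementation, which the article does not describe in detail. It assumes that dependencies between SAS files are expressed with quoted `%include` statements, and the names `INCLUDE_RE`, `find_includes`, `conversion_order`, and the sample file names are all hypothetical. The idea is simply to convert files in an order where each file's dependencies have already been handled.

```python
import re
from graphlib import TopologicalSorter

# Assumption: dependencies appear as quoted includes, e.g.  %include "macros.sas";
INCLUDE_RE = re.compile(r'%include\s+["\']([^"\']+)["\']\s*;', re.IGNORECASE)


def find_includes(source: str) -> set[str]:
    """Return the file names referenced by %include statements in one SAS file."""
    return set(INCLUDE_RE.findall(source))


def conversion_order(repo: dict[str, str]) -> list[str]:
    """Order files so that every file comes after the files it includes."""
    graph = {
        name: {dep for dep in find_includes(src) if dep in repo}
        for name, src in repo.items()
    }
    # TopologicalSorter expects {node: predecessors}; static_order() yields
    # predecessors first and raises CycleError on circular includes.
    return list(TopologicalSorter(graph).static_order())


# Hypothetical three-file repository used only for illustration.
repo = {
    "report.sas": '%include "clean.sas";\nproc print data=clean; run;',
    "clean.sas": '%include "macros.sas";\ndata clean; set raw; run;',
    "macros.sas": "%macro noop; %mend;",
}
order = conversion_order(repo)
print(order)
```

With an ordering like this, a converter could pass the already-converted dependencies as context when prompting the model for each later file, rather than converting every file in isolation.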

Pilot led the way for a multiple-language solution

EY has since adapted the initial bespoke SAS-to-PySpark version to handle other programming languages. EY’s Generative Solution for Code Repository Migration converts complete code repositories in one go, keeping repository structures, dependencies, and other essential variables intact.

When using GenAI tools such as GitHub Copilot, code gets converted as snippets, and context gets lost along the way. The code that GenAI generates may also rely on outdated or no-longer-supported libraries. EY’s solution includes intelligent “chunking” to split large blocks of code logically, code schema identification and mapping to maintain consistency, nested parsing of large code to retain context, and checks that the correct libraries are used to avoid code contamination.
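As a rough illustration of logical chunking, the sketch below splits SAS source at step boundaries and then packs whole steps into chunks under a line budget, so no step is cut in half. This is an assumption-laden toy, not EY's method: it treats any line containing `run;` or `quit;` as the end of a step, ignores comments and string literals, and the names `split_steps`, `chunk_steps`, and `max_lines` are hypothetical.

```python
import re

# Assumption: a SAS DATA or PROC step ends on a line containing `run;` or `quit;`.
STEP_END_RE = re.compile(r"\b(run|quit)\s*;", re.IGNORECASE)


def split_steps(source: str) -> list[str]:
    """Split SAS source into steps, each ending at a run;/quit; line."""
    steps, current = [], []
    for line in source.splitlines():
        if not line.strip() and not current:
            continue  # skip blank lines between steps
        current.append(line)
        if STEP_END_RE.search(line):
            steps.append("\n".join(current))
            current = []
    if current:  # trailing code with no terminator stays together
        steps.append("\n".join(current))
    return steps


def chunk_steps(steps: list[str], max_lines: int) -> list[str]:
    """Pack whole steps into chunks of at most max_lines lines where possible."""
    chunks, current, count = [], [], 0
    for step in steps:
        n = step.count("\n") + 1
        if current and count + n > max_lines:
            chunks.append("\n".join(current))
            current, count = [], 0
        current.append(step)  # an oversized step still becomes its own chunk
        count += n
    if current:
        chunks.append("\n".join(current))
    return chunks


sample = """data clean;
  set raw;
  if amount > 0;
run;

proc sort data=clean;
  by account_id;
run;

proc means data=clean;
  var amount;
run;
"""
steps = split_steps(sample)
chunks = chunk_steps(steps, max_lines=7)
print(len(steps), len(chunks))
```

Splitting on step boundaries rather than at a fixed character count means each chunk sent to the model is syntactically complete, which is one plausible way to limit the context loss the article describes.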

The Bottom Line: It will take more than a ChatGPT Plus license and an enthusiastic amateur to convert your legacy code.

Because anyone can access AI through the interfaces of ChatGPT and its kin, leaders should be wary of the rash of “dangerous experts” this will deliver: folks who have dabbled and now know “enough to be dangerous.” When it comes to business-critical legacy code, it’s clear that service providers and consultancies retain an essential role in identifying and responding to the risks while deploying the benefits of GenAI.

Author: David Cushman, Executive Research Leader
