Summary
When flink-agents-dist.jar is deployed in /opt/flink/lib (which is required), user-defined resource classes (e.g., custom ChatModel implementations) cannot be loaded from user JARs uploaded via the REST API, resulting in ClassNotFoundException.
Error Message
java.lang.ClassNotFoundException: com.example.AzureOpenAIChatModelSetup
at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(Unknown Source)
at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(Unknown Source)
at java.base/java.lang.ClassLoader.loadClass(Unknown Source)
at java.base/java.lang.Class.forName0(Native Method)
at java.base/java.lang.Class.forName(Unknown Source)
at org.apache.flink.agents.plan.resourceprovider.JavaResourceProvider.provide(JavaResourceProvider.java:40)
Root Cause
The framework code in /opt/flink/lib is loaded by the System ClassLoader. User JARs uploaded at runtime are loaded by Flink's User ClassLoader (a child of the System ClassLoader).
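The existing code uses Class.forName(className), which resolves the name against the classloader that defined the calling class (here, the System ClassLoader). Under Java's parent-first delegation model, a classloader delegates lookups up to its parent but never down to its children, so the System ClassLoader cannot see classes loaded by the User ClassLoader.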
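The visibility rule can be reproduced outside Flink. The following is a minimal sketch, not Flink's actual classloader: `ClassLoaderVisibilityDemo`, `UserJarClassLoader`, and `emptyClassBytes` are hypothetical names, and the "user class" is an empty class file generated in memory so the demo needs no extra JAR.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

class ClassLoaderVisibilityDemo {
    // Name taken from the stack trace above; the class exists only in the child loader.
    static final String USER_CLASS = "com.example.AzureOpenAIChatModelSetup";

    /** Builds a minimal class file (no fields, no methods) for the given binary name. */
    static byte[] emptyClassBytes(String binaryName) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        out.writeInt(0xCAFEBABE);          // magic
        out.writeShort(0);                 // minor version
        out.writeShort(52);                // major version (Java 8)
        out.writeShort(5);                 // constant pool count (4 entries + 1)
        out.writeByte(1);                  // #1 Utf8: this class name
        out.writeUTF(binaryName.replace('.', '/'));
        out.writeByte(7);                  // #2 Class -> #1
        out.writeShort(1);
        out.writeByte(1);                  // #3 Utf8: superclass name
        out.writeUTF("java/lang/Object");
        out.writeByte(7);                  // #4 Class -> #3
        out.writeShort(3);
        out.writeShort(0x0021);            // ACC_PUBLIC | ACC_SUPER
        out.writeShort(2);                 // this_class
        out.writeShort(4);                 // super_class
        out.writeShort(0);                 // interfaces
        out.writeShort(0);                 // fields
        out.writeShort(0);                 // methods
        out.writeShort(0);                 // attributes
        return buf.toByteArray();
    }

    /** Hypothetical stand-in for a user classloader: a child that defines USER_CLASS itself. */
    static final class UserJarClassLoader extends ClassLoader {
        UserJarClassLoader(ClassLoader parent) {
            super(parent);
        }

        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            if (!USER_CLASS.equals(name)) {
                throw new ClassNotFoundException(name);
            }
            try {
                byte[] bytes = emptyClassBytes(name);
                return defineClass(name, bytes, 0, bytes.length);
            } catch (IOException e) {
                throw new ClassNotFoundException(name, e);
            }
        }
    }

    /** Returns true if USER_CLASS can be resolved starting from the given loader. */
    static boolean visibleFrom(ClassLoader loader) {
        try {
            Class.forName(USER_CLASS, false, loader);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        ClassLoader parent = ClassLoaderVisibilityDemo.class.getClassLoader();
        ClassLoader child = new UserJarClassLoader(parent);
        System.out.println("parent sees user class: " + visibleFrom(parent));
        System.out.println("child sees user class:  " + visibleFrom(child));
    }
}
```

Run on a classpath that does not contain `com.example.AzureOpenAIChatModelSetup`, the parent lookup should fail while the child lookup succeeds, which mirrors the situation of framework code in /opt/flink/lib looking for a class that lives in an uploaded user JAR.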
Affected locations:
- JavaResourceProvider.java - main resource instantiation
- JavaSerializableResourceProvider.java - serializable resource deserialization
- AgentPlan.java - PythonResourceWrapper class checks
- ActionJsonDeserializer.java - parameter type and config deserialization
- FunctionToolJsonDeserializer.java - parameter type deserialization
- EventLogRecordJsonDeserializer.java - event class deserialization
Solution
Use the Thread Context ClassLoader (TCCL) instead:
Class.forName(className, true, Thread.currentThread().getContextClassLoader())
Flink sets the TCCL to the User ClassLoader before executing user code, making user-defined classes accessible to framework code.
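As a rough illustration of the TCCL lookup, the sketch below wraps the call in a helper and checks that the context classloader is the one actually consulted. The names `ContextClassLoading`, `loadClass`, and `RecordingClassLoader` are hypothetical and not part of flink-agents. The fallback to the defining classloader when no TCCL is set is an assumption added here for robustness; the proposal above uses the TCCL directly.

```java
import java.util.ArrayList;
import java.util.List;

class ContextClassLoading {
    /**
     * Hypothetical helper: resolves a class through the thread context classloader.
     * Falling back to this class's own loader when the TCCL is null is an assumption
     * of this sketch, not something the issue prescribes.
     */
    static Class<?> loadClass(String className) throws ClassNotFoundException {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ContextClassLoading.class.getClassLoader();
        }
        return Class.forName(className, true, loader);
    }

    /** Records every requested class name; stands in for a user classloader in this demo. */
    static final class RecordingClassLoader extends ClassLoader {
        final List<String> requested = new ArrayList<>();

        RecordingClassLoader(ClassLoader parent) {
            super(parent);
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            requested.add(name);
            return super.loadClass(name, resolve);
        }
    }

    /** Installs a recording TCCL, loads the class, restores the TCCL, and reports whether it was consulted. */
    static boolean resolvedThroughContextLoader(String className) {
        Thread thread = Thread.currentThread();
        ClassLoader original = thread.getContextClassLoader();
        RecordingClassLoader userLoader = new RecordingClassLoader(original);
        thread.setContextClassLoader(userLoader);
        try {
            loadClass(className);
            return userLoader.requested.contains(className);
        } catch (ClassNotFoundException e) {
            return false;
        } finally {
            // Always restore the previous TCCL so later code is unaffected.
            thread.setContextClassLoader(original);
        }
    }

    public static void main(String[] args) {
        System.out.println("resolved via TCCL: " + resolvedThroughContextLoader("java.util.ArrayList"));
    }
}
```

Because the lookup starts at whatever loader the current thread carries, the same helper keeps working for classes on the system classpath (the TCCL delegates to its parent) and additionally finds classes that only the user classloader can see.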
Workaround
Place user-defined resource classes in /opt/flink/lib alongside flink-agents-dist.jar. However, this is impractical for deployments where the platform cannot anticipate what users will run (e.g., it would require rebuilding Docker images for each custom resource).
Fix
PR #514