Allow clearing all caches to avoid classloader leaks #4953

Closed
jin-harmoney opened this issue Feb 6, 2025 · 3 comments
Labels
2.19 Issues planned at 2.19 or later

@jin-harmoney
Contributor

jin-harmoney commented Feb 6, 2025

Is your feature request related to a problem? Please describe.

I'm using both hypersistence-utils and Jackson in my Quarkus project, and I see that my application classloader is not garbage collected on live reload.

This seems to be caused by the static ObjectMapper in ObjectMapperWrapper.

The ObjectMapper is not loaded by the application classloader, but it does hold references to classes of my application (custom deserializers etc.). Hence, it blocks garbage collection of the application classloader.
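
The retention mechanism described above can be reproduced with the JDK alone. The following is a minimal sketch, not code from Jackson or hypersistence-utils: `LibraryCache` and `LeakDemo` are hypothetical names, and a `java.lang.reflect.Proxy` class stands in for an application class such as a custom deserializer. It assumes only that a long-lived static cache ends up holding a `Class` defined by a short-lived classloader.

```java
import java.lang.ref.WeakReference;
import java.lang.reflect.Proxy;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a long-lived library object (for example a static
// mapper) that caches per-class data and outlives the application classloader.
final class LibraryCache {
    static final Map<Class<?>, String> CACHE = new ConcurrentHashMap<>();

    private LibraryCache() {}
}

final class LeakDemo {
    private LeakDemo() {}

    // Creates a throwaway "application" classloader, defines one class in it,
    // caches that class in the library, and returns a weak reference to the loader.
    static WeakReference<ClassLoader> loadAndCache() {
        ClassLoader appLoader = new ClassLoader(LeakDemo.class.getClassLoader()) {};
        // The proxy class is defined by appLoader; it plays the role of an
        // application class that the library has seen.
        Object instance = Proxy.newProxyInstance(
                appLoader, new Class<?>[] {Runnable.class}, (proxy, method, args) -> null);
        LibraryCache.CACHE.put(instance.getClass(), "cached metadata");
        return new WeakReference<>(appLoader);
    }

    public static void main(String[] args) {
        WeakReference<ClassLoader> loaderRef = loadAndCache();
        System.gc();
        // Cache entry -> Class -> defining loader: the loader is still strongly reachable.
        System.out.println("pinned while cached: " + (loaderRef.get() != null));

        LibraryCache.CACHE.clear();
        System.gc();
        // Usually true once the cache is cleared, but System.gc() is only a hint.
        System.out.println("collected after clear: " + (loaderRef.get() == null));
    }
}
```

The first check is deterministic because the loader stays strongly reachable through the cached `Class`. The second depends on the garbage collector and is therefore not guaranteed on every run.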

I described a similar case for hibernate-envers here.

Describe the solution you'd like

As mentioned in another issue, a clear() method on ObjectMapper to evict all caches before the application shuts down would probably solve this.

Usage example

Specifically, when using hypersistence-utils, the call would be ObjectMapperWrapper.INSTANCE.getObjectMapper().clear(), placed in a method annotated with @Shutdown.

Additional context

No response

Workaround

// hypersistence-utils uses an ObjectMapper. The ObjectMapper has several caches that keep references to classes and cause classloader memory leaks.
var hyObjectMapper = ObjectMapperWrapper.INSTANCE.getObjectMapper();
hyObjectMapper.getTypeFactory().clearCache();

// ObjectMapper._rootDeserializers is not accessible from application code, so clear it using reflection.
// Looking the field up on ObjectMapper.class also works if the mapper is a subclass of ObjectMapper.
var field = ObjectMapper.class.getDeclaredField("_rootDeserializers");
field.setAccessible(true);
var rootDeserializers = (Map<?, ?>) field.get(hyObjectMapper);
rootDeserializers.clear();

// DeserializationContext._cache is not accessible either, so flush it using reflection
field = DeserializationContext.class.getDeclaredField("_cache");
field.setAccessible(true);
var cache = (DeserializerCache) field.get(hyObjectMapper.getDeserializationContext());
cache.flushCachedDeserializers();

// Flush the cached serializers as well
((DefaultSerializerProvider) hyObjectMapper.getSerializerProvider()).flushCachedSerializers();
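
The reflective part of the workaround above depends on knowing exactly which class in the hierarchy declares each field. A minimal sketch of a more tolerant lookup follows. It walks up the superclass chain until the field is found. `ReflectiveCacheClearer`, `BaseHolder`, and `DerivedHolder` are hypothetical names used only for illustration and are not part of Jackson or hypersistence-utils.

```java
import java.lang.reflect.Field;
import java.util.HashMap;
import java.util.Map;

final class ReflectiveCacheClearer {
    private ReflectiveCacheClearer() {}

    // Finds a declared field by name on the given class or any of its superclasses.
    static Field findField(Class<?> type, String name) {
        for (Class<?> c = type; c != null; c = c.getSuperclass()) {
            try {
                return c.getDeclaredField(name);
            } catch (NoSuchFieldException ignored) {
                // Not declared here; keep walking up the hierarchy.
            }
        }
        throw new IllegalArgumentException(
                "No field '" + name + "' on " + type.getName() + " or its superclasses");
    }

    // Clears a Map-typed field on the target and returns how many entries were removed.
    static int clearMapField(Object target, String fieldName) {
        Field field = findField(target.getClass(), fieldName);
        field.setAccessible(true);
        try {
            Object value = field.get(target);
            if (!(value instanceof Map)) {
                throw new IllegalArgumentException("Field '" + fieldName + "' does not hold a Map");
            }
            Map<?, ?> map = (Map<?, ?>) value;
            int removed = map.size();
            map.clear();
            return removed;
        } catch (IllegalAccessException e) {
            throw new IllegalStateException("Cannot read field '" + fieldName + "'", e);
        }
    }
}

// Hypothetical classes that mimic a protected cache declared in a superclass.
class BaseHolder {
    protected final Map<String, Object> _cache = new HashMap<>();
}

class DerivedHolder extends BaseHolder {
    DerivedHolder() {
        _cache.put("a", 1);
        _cache.put("b", 2);
    }

    int cachedEntries() {
        return _cache.size();
    }
}
```

With these classes, `ReflectiveCacheClearer.clearMapField(new DerivedHolder(), "_cache")` locates the field on `BaseHolder` without the caller needing to count `getSuperclass()` calls. Any approach of this kind still relies on internal field names, which may change between library versions.
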
@jin-harmoney jin-harmoney added the to-evaluate Issue that has been received but not yet evaluated label Feb 6, 2025
@cowtowncoder cowtowncoder added the 2.19 Issues planned at 2.19 or later label Feb 6, 2025
@cowtowncoder
Member

cowtowncoder commented Feb 6, 2025

Sounds like a good idea. A PR would be welcome (as usual, but this might be easy enough to implement).

On naming: instead of clear(), the method needs to be called clearCaches() (or clearAllCaches()), but aside from that the proposal sounds fine.

@cowtowncoder cowtowncoder added pr-welcome Issue for which progress most likely if someone submits a Pull Request and removed to-evaluate Issue that has been received but not yet evaluated labels Feb 6, 2025
@cowtowncoder
Member

One other quick note: usually one should drop the ObjectMapper instance to get rid of references. In practice, though, static instances are probably quite widely used, so we can help such usage by adding this feature.
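
The trade-off in this note can be illustrated with a small model: a shared static instance cannot simply be dropped, but it can expose a method that empties its caches so they are rebuilt lazily. The sketch below is a simplified stand-in and not Jackson's implementation. `CachingMapper` and `SharedMapper` are hypothetical, and the method name `clearCaches()` is borrowed from the naming suggestion in this thread.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical, simplified mapper that caches one "serializer" per class.
final class CachingMapper {
    private final Map<Class<?>, Function<Object, String>> serializers = new ConcurrentHashMap<>();

    // Builds a trivial serializer; the returned function keeps a reference to the class.
    private static Function<Object, String> buildSerializer(Class<?> type) {
        return value -> type.getSimpleName() + "(" + value + ")";
    }

    String write(Object value) {
        // Serializers are created on first use and cached per class.
        return serializers
                .computeIfAbsent(value.getClass(), CachingMapper::buildSerializer)
                .apply(value);
    }

    int cachedTypes() {
        return serializers.size();
    }

    // Drops every cached serializer, and with it every reference to application classes.
    void clearCaches() {
        serializers.clear();
    }
}

// Static holder, mirroring the widely used "one shared mapper" pattern.
final class SharedMapper {
    static final CachingMapper INSTANCE = new CachingMapper();

    private SharedMapper() {}
}
```

After `SharedMapper.INSTANCE.clearCaches()`, the instance remains usable: the next `write` call repopulates the cache for the types it encounters, at the cost of rebuilding those entries.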

@cowtowncoder
Member

Completed for 2.19(.0).

@cowtowncoder cowtowncoder removed the pr-welcome Issue for which progress most likely if someone submits a Pull Request label Feb 8, 2025
dongjoon-hyun pushed a commit to apache/spark that referenced this issue May 19, 2025
…de kubernetes-client to version 7.3.0

### What changes were proposed in this pull request?
The primary objective of this pr is to upgrade Jackson from 2.18.2 to 2.19.0, and simultaneously upgrade the kubernetes-client from 7.2.0 to 7.3.0 to ensure compatibility with Jackson 2.19.0.

### Why are the changes needed?
The new version of Jackson brings several bug fixes:
- FasterXML/jackson-databind#4849
- FasterXML/jackson-databind#4934
- FasterXML/jackson-databind#5052
- FasterXML/jackson-databind#4953

The full release notes are as follows:
- https://github.com/FasterXML/jackson/wiki/Jackson-Release-2.19

The release of kubernetes-client 7.3.0 is solely for the purpose of ensuring compatibility with Jackson 2.19.0:
- https://github.com/fabric8io/kubernetes-client/releases/tag/v7.3.0

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
- Pass GitHub Actions

### Was this patch authored or co-authored using generative AI tooling?
No

Closes #50730 from LuciferYang/SPARK-51927.

Lead-authored-by: yangjie01 <yangjie01@baidu.com>
Co-authored-by: YangJie <yangjie01@baidu.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
LuciferYang added a commit to apache/spark that referenced this issue May 21, 2025
…de kubernetes-client to version 7.3.0

### What changes were proposed in this pull request?
The primary objective of this pr is to upgrade Jackson from 2.18.2 to 2.19.0, and simultaneously upgrade the kubernetes-client from 7.2.0 to 7.3.0 to ensure compatibility with Jackson 2.19.0.

### Why are the changes needed?
The new version of Jackson brings several bug fixes:
- FasterXML/jackson-databind#4849
- FasterXML/jackson-databind#4934
- FasterXML/jackson-databind#5052
- FasterXML/jackson-databind#4953

The full release notes are as follows:
- https://github.com/FasterXML/jackson/wiki/Jackson-Release-2.19

The release of kubernetes-client 7.3.0 is solely for the purpose of ensuring compatibility with Jackson 2.19.0:
- https://github.com/fabric8io/kubernetes-client/releases/tag/v7.3.0

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
- Pass GitHub Actions
- manual check:

```
build/sbt clean package -Phive
python/run-tests --testnames 'pyspark.sql.tests.arrow.test_arrow_python_udf' --python-executables=python3.11
```

```
python/run-tests --testnames 'pyspark.sql.tests.arrow.test_arrow_python_udf' --python-executables=python3.11
Running PySpark tests. Output is in /Users/yangjie01/SourceCode/git/spark-sbt/python/unit-tests.log
Will test against the following Python executables: ['python3.11']
Will test the following Python tests: ['pyspark.sql.tests.arrow.test_arrow_python_udf']
python3.11 python_implementation is CPython
python3.11 version is: Python 3.11.12
Starting test(python3.11): pyspark.sql.tests.arrow.test_arrow_python_udf (temp output: /Users/yangjie01/SourceCode/git/spark-sbt/python/target/83c10dc1-64a2-4b7d-80b4-4977dadd26fa/python3.11__pyspark.sql.tests.arrow.test_arrow_python_udf___gzv2b5u.log)
Finished test(python3.11): pyspark.sql.tests.arrow.test_arrow_python_udf (62s) ... 8 tests were skipped
Tests passed in 62 seconds

Skipped tests in pyspark.sql.tests.arrow.test_arrow_python_udf with python3.11:
      test_broadcast_in_udf (pyspark.sql.tests.arrow.test_arrow_python_udf.ArrowPythonUDFTests.test_broadcast_in_udf) ... skip (0.000s)
      test_datasource_with_udf (pyspark.sql.tests.arrow.test_arrow_python_udf.ArrowPythonUDFTests.test_datasource_with_udf) ... skip (0.001s)
      test_register_java_function (pyspark.sql.tests.arrow.test_arrow_python_udf.ArrowPythonUDFTests.test_register_java_function) ... skip (0.000s)
      test_register_java_udaf (pyspark.sql.tests.arrow.test_arrow_python_udf.ArrowPythonUDFTests.test_register_java_udaf) ... skip (0.000s)
      test_broadcast_in_udf (pyspark.sql.tests.arrow.test_arrow_python_udf.AsyncArrowPythonUDFTests.test_broadcast_in_udf) ... skip (0.000s)
      test_datasource_with_udf (pyspark.sql.tests.arrow.test_arrow_python_udf.AsyncArrowPythonUDFTests.test_datasource_with_udf) ... skip (0.001s)
      test_register_java_function (pyspark.sql.tests.arrow.test_arrow_python_udf.AsyncArrowPythonUDFTests.test_register_java_function) ... skip (0.000s)
      test_register_java_udaf (pyspark.sql.tests.arrow.test_arrow_python_udf.AsyncArrowPythonUDFTests.test_register_java_udaf) ... skip (0.000s)
```

### Was this patch authored or co-authored using generative AI tooling?
No

Closes #50730 from LuciferYang/SPARK-51927.

Lead-authored-by: yangjie01 <yangjie01@baidu.com>
Co-authored-by: YangJie <yangjie01@baidu.com>
Signed-off-by: yangjie01 <yangjie01@baidu.com>