HDFS-17769. Allows client to actively retry to Active NameNode when the Observer NameNode is too far behind client state id. #7602

Open. Wants to merge 1 commit into base: trunk

@gyz-web (Contributor) commented Apr 11, 2025

JIRA: HDFS-17769.

Description of PR

When the Router forwards read requests to an Observer NameNode and the cluster is under a heavy write workload, the Observer may fail to keep pace with edit log synchronization. This can still happen even when the dfs.ha.tail-edits.in-progress parameter is configured.
The Observer then rejects the read with RetriableException: Observer Node is too far behind.
In particular, when the client's ipc.client.ping parameter is set to true, the client keeps waiting and retrying, so the business cannot obtain the desired data in time. In this situation we should consider letting the Active NameNode handle the request.
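
The proposed behavior can be sketched roughly as follows. This is a minimal illustration, not the patch itself: the two exception classes are local stand-ins for the Hadoop types of the same name, and `checkObserverState`, `maxLag`, and `retryActive` are hypothetical names. The real staleness check in the NameNode is assumed to be more involved than a fixed threshold.

```java
import java.io.IOException;

// Local stand-ins for the Hadoop exception types named in this PR. They are
// defined here only so the sketch compiles without Hadoop on the classpath.
class ObserverStaleSketch {
  static class RetriableException extends IOException {
    RetriableException(String msg) { super(msg); }
  }

  static class ObserverRetryOnActiveException extends IOException {
    ObserverRetryOnActiveException(String msg) { super(msg); }
  }

  /**
   * Hypothetical check that rejects a read when the Observer is too stale.
   * retryActive mirrors dfs.namenode.observer.too.stale.retry.active.enable;
   * maxLag is an illustrative threshold, not the real NameNode heuristic.
   */
  static void checkObserverState(long serverStateId, long clientStateId,
      long maxLag, boolean retryActive) throws IOException {
    if (clientStateId - serverStateId <= maxLag) {
      return; // Observer is close enough, so it can serve the read itself.
    }
    String detail = "Observer Node is too far behind: serverStateId = "
        + serverStateId + " clientStateId = " + clientStateId;
    if (retryActive) {
      // New behavior: tell the caller to go to the Active NameNode.
      throw new ObserverRetryOnActiveException(
          "Retrying to Active NameNode, " + detail);
    }
    // Original behavior: the caller retries against the Observer.
    throw new RetriableException(detail);
  }

  public static void main(String[] args) {
    try {
      checkObserverState(5695128653L, 5698292641L, 1_000_000L, true);
    } catch (IOException e) {
      System.out.println(e.getClass().getSimpleName() + ": " + e.getMessage());
    }
  }
}
```

With the flag off, the same call throws the stand-in RetriableException, which corresponds to the retry loop described above.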

Here are the errors we observed and our verification of the fix:

1. The state ID of the Observer is too far behind the Active:

 Tue Apr 15 11:22:41 CST 2025, Active latest txId: 5698245512, Observer latest txId:5695118653,Observer far behind: 3126859, time takes0s 
Tue Apr 15 11:22:43 CST 2025, Active latest txId: 5698253145, Observer latest txId:5695118653,Observer far behind: 3134492, time takes0s 
Tue Apr 15 11:22:45 CST 2025, Active latest txId: 5698260942, Observer latest txId:5695118653,Observer far behind: 3142289, time takes0s 
Tue Apr 15 11:22:47 CST 2025, Active latest txId: 5698268614, Observer latest txId:5695123653,Observer far behind: 3144961, time takes0s 
Tue Apr 15 11:22:49 CST 2025, Active latest txId: 5698276490, Observer latest txId:5695123653,Observer far behind: 3152837, time takes0s 
Tue Apr 15 11:22:51 CST 2025, Active latest txId: 5698284361, Observer latest txId:5695128653,Observer far behind: 3155708, time takes0s 
Tue Apr 15 11:22:54 CST 2025, Active latest txId: 5698292641, Observer latest txId:5695128653,Observer far behind: 3163988, time takes0s 
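
The "Observer far behind" figure in these lines is the Active txId minus the Observer txId. As a small sketch of that arithmetic (the class and method names are hypothetical, and the regular expression assumes the exact line format shown above):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TxIdLagSketch {
  // Matches the two txIds in the monitoring lines shown above.
  private static final Pattern TXIDS = Pattern.compile(
      "Active latest txId:\\s*(\\d+),\\s*Observer latest txId:\\s*(\\d+)");

  /** Returns Active txId minus Observer txId for one monitoring line. */
  static long lag(String line) {
    Matcher m = TXIDS.matcher(line);
    if (!m.find()) {
      throw new IllegalArgumentException("no txIds found in: " + line);
    }
    return Long.parseLong(m.group(1)) - Long.parseLong(m.group(2));
  }

  public static void main(String[] args) {
    String first = "Tue Apr 15 11:22:41 CST 2025, Active latest txId: 5698245512,"
        + " Observer latest txId:5695118653,Observer far behind: 3126859, time takes0s";
    String last = "Tue Apr 15 11:22:54 CST 2025, Active latest txId: 5698292641,"
        + " Observer latest txId:5695128653,Observer far behind: 3163988, time takes0s";
    System.out.println(lag(first)); // prints 3126859
    System.out.println(lag(last));  // prints 3163988
  }
}
```

Between the first and last samples the lag grows rather than shrinks, which is why waiting on the Observer does not help here.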

2. RetriableException:

10:16:53.744 [IPC Client (24555242) connection to routerIp:8888 from hdfs] DEBUG org.apache.hadoop.ipc.Client - IPC Client (24555242) connection to routerIp:8888 from hdfs: stopped, remaining connections 0 
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.RetriableException): Observer Node is too far behind: serverStateId = 5695128653 clientStateId = 5698292641 
at sun.reflect.GeneratedConstructorAccessor49.newInstance(Unknown Source) 
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) 
at java.lang.reflect.Constructor.newInstance(Constructor.java:423) 
at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:121) 
at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:110) 
at org.apache.hadoop.hdfs.server.federation.router.RouterRpcClient.invokeMethod(RouterRpcClient.java:505) 
at org.apache.hadoop.hdfs.server.federation.router.RouterRpcClient.invokeSequential(RouterRpcClient.java:972) 
at org.apache.hadoop.hdfs.server.federation.router.RouterClientProtocol.getFileInfo(RouterClientProtocol.java:981) 
at org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getFileInfo(RouterRpcServer.java:883) 
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getFileInfo(ClientNamenodeProtocolServerSideTranslatorPB.java:1044) 
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) 
at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:621) 
at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:589) 
at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:573) 
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1227) 
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1106) 
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1029) 
at java.security.AccessController.doPrivileged(Native Method) 
at javax.security.auth.Subject.doAs(Subject.java:422) 
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899) 
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:3063) 
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.RetriableException): Observer Node is too far behind: serverStateId = 5632963133 clientStateId = 5635526176 
at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1567) 
at org.apache.hadoop.ipc.Client.call(Client.java:1513) 
at org.apache.hadoop.ipc.Client.call(Client.java:1410) 
at org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:258) 
at org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:139) 
at com.sun.proxy.$Proxy19.getFileInfo(Unknown Source) 
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:966) 
at sun.reflect.GeneratedMethodAccessor25.invoke(Unknown Source) 
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 
at java.lang.reflect.Method.invoke(Method.java:498) 
at org.apache.hadoop.hdfs.server.federation.router.RouterRpcClient.invoke(RouterRpcClient.java:637) 
at org.apache.hadoop.hdfs.server.federation.router.RouterRpcClient.invoke(RouterRpcClient.java:654) 
at org.apache.hadoop.hdfs.server.federation.router.RouterRpcClient.invoke(RouterRpcClient.java:654) 
at org.apache.hadoop.hdfs.server.federation.router.RouterRpcClient.invoke(RouterRpcClient.java:654) 
at org.apache.hadoop.hdfs.server.federation.router.RouterRpcClient.invoke(RouterRpcClient.java:654) 
at org.apache.hadoop.hdfs.server.federation.router.RouterRpcClient.invoke(RouterRpcClient.java:654) 
at org.apache.hadoop.hdfs.server.federation.router.RouterRpcClient.invoke(RouterRpcClient.java:654) 
at org.apache.hadoop.hdfs.server.federation.router.RouterRpcClient.invokeMethod(RouterRpcClient.java:467) 
... 15 more 

at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1584) 
at org.apache.hadoop.ipc.Client.call(Client.java:1529) 
at org.apache.hadoop.ipc.Client.call(Client.java:1426) 
at org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:258) 
at org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:139) 
at com.sun.proxy.$Proxy9.getFileInfo(Unknown Source) 
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.lambda$getFileInfo$41(ClientNamenodeProtocolTranslatorPB.java:820) 
at org.apache.hadoop.ipc.internal.ShadedProtobufHelper.ipc(ShadedProtobufHelper.java:160) 
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:820) 
at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source) 
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 
at java.lang.reflect.Method.invoke(Method.java:498) 
at org.apache.hadoop.hdfs.server.namenode.ha.RouterObserverReadProxyProvider$RouterObserverReadInvocationHandler.invoke(RouterObserverReadProxyProvider.java:216) 
at com.sun.proxy.$Proxy10.getFileInfo(Unknown Source) 
at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source) 
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 
at java.lang.reflect.Method.invoke(Method.java:498) 
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:437) 
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:170) 
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:162) 
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:100) 
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:366) 
at com.sun.proxy.$Proxy10.getFileInfo(Unknown Source) 
at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1770) 
at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1828) 
at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1825) 
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) 
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1840) 
at org.apache.hadoop.fs.FileUtil.checkDest(FileUtil.java:611) 
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:468) 
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:432) 
at org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:2592) 
at org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:2558) 
at org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:2520) 
at hadoop.write_then_observer_read2.main(write_then_observer_read2.java:64) 

3. Verification of the fix:

(1) View the status of the cluster NameNode:

[root@20w ~]# hdfs haadmin -ns hh-rbf-test5 -getAllServiceState 
20w:8020                            active     
21w:8020                            standby    
22w:8020                            observer  

(2) We enable the dfs.namenode.observer.too.stale.retry.active.enable parameter and execute a read command on the 21w machine:

[root@21w ~]# hdfs dfs -cat /t.sh 
/bin/ssh $1

(3) The read RPC requests appear in hdfs-audit.log on the Active NameNode, which shows that the read was forwarded to the Active NameNode:

[root@20w ~]# tail -f /data/disk02/var/log/hadoop/hdfs/hdfs-audit.log|grep t.sh 
2025-04-15 11:24:31,148 INFO FSNamesystem.audit: allowed=true   ugi=root (auth:SIMPLE)  ip=/xx cmd=getfileinfo src=/t.sh       dst=null        perm=null       proto=rpc 
2025-04-15 11:24:31,461 INFO FSNamesystem.audit: allowed=true   ugi=root (auth:SIMPLE)  ip=/xx cmd=open        src=/t.sh       dst=null        perm=null       proto=rpc

(4) The Observer log contains entries for the retries to the Active NameNode:
2025-04-15 11:24:30,148 WARN namenode.FSNamesystem (GlobalStateIdContext.java:receiveRequestState(163)) - Retrying to Active NameNode, Observer Node is too far behind: serverStateId = 5695393653 clientStateId = 5699337672

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 21s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 23m 33s trunk passed
+1 💚 compile 0m 45s trunk passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 compile 0m 38s trunk passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+1 💚 checkstyle 0m 35s trunk passed
+1 💚 mvnsite 0m 41s trunk passed
+1 💚 javadoc 0m 44s trunk passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 javadoc 1m 11s trunk passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+1 💚 spotbugs 1m 46s trunk passed
+1 💚 shadedclient 22m 6s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 34s the patch passed
+1 💚 compile 0m 37s the patch passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 javac 0m 37s the patch passed
+1 💚 compile 0m 33s the patch passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+1 💚 javac 0m 33s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 0m 28s /results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt hadoop-hdfs-project/hadoop-hdfs: The patch generated 2 new + 310 unchanged - 0 fixed = 312 total (was 310)
+1 💚 mvnsite 0m 36s the patch passed
+1 💚 javadoc 0m 33s the patch passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 javadoc 1m 3s the patch passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+1 💚 spotbugs 1m 36s the patch passed
+1 💚 shadedclient 22m 37s patch has no errors when building and testing our client artifacts.
_ Other Tests _
-1 ❌ unit 0m 55s /patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt hadoop-hdfs in the patch passed.
+1 💚 asflicense 0m 21s The patch does not generate ASF License warnings.
81m 46s
Reason Tests
Failed junit tests hadoop.tools.TestHdfsConfigFields
Subsystem Report/Notes
Docker ClientAPI=1.48 ServerAPI=1.48 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7602/1/artifact/out/Dockerfile
GITHUB PR #7602
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux d793b347e830 5.15.0-130-generic #140-Ubuntu SMP Wed Dec 18 17:59:53 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 0d0b22c
Default Java Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7602/1/testReport/
Max. process+thread count 552 (vs. ulimit of 5500)
modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7602/1/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 20s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+0 🆗 xmllint 0m 0s xmllint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 27m 17s trunk passed
+1 💚 compile 0m 45s trunk passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 compile 0m 43s trunk passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+1 💚 checkstyle 0m 38s trunk passed
+1 💚 mvnsite 0m 46s trunk passed
+1 💚 javadoc 0m 44s trunk passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 javadoc 1m 10s trunk passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+1 💚 spotbugs 1m 48s trunk passed
+1 💚 shadedclient 21m 12s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 35s the patch passed
+1 💚 compile 0m 37s the patch passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 javac 0m 37s the patch passed
+1 💚 compile 0m 34s the patch passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+1 💚 javac 0m 34s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 0m 27s /results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt hadoop-hdfs-project/hadoop-hdfs: The patch generated 2 new + 310 unchanged - 0 fixed = 312 total (was 310)
+1 💚 mvnsite 0m 36s the patch passed
+1 💚 javadoc 0m 33s the patch passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 javadoc 1m 0s the patch passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+1 💚 spotbugs 1m 44s the patch passed
+1 💚 shadedclient 22m 2s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 0m 57s hadoop-hdfs in the patch passed.
+1 💚 asflicense 0m 21s The patch does not generate ASF License warnings.
84m 13s
Subsystem Report/Notes
Docker ClientAPI=1.48 ServerAPI=1.48 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7602/2/artifact/out/Dockerfile
GITHUB PR #7602
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint
uname Linux 789088097ae4 5.15.0-130-generic #140-Ubuntu SMP Wed Dec 18 17:59:53 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 25f49ca
Default Java Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7602/2/testReport/
Max. process+thread count 553 (vs. ulimit of 5500)
modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7602/2/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 8m 38s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+0 🆗 xmllint 0m 0s xmllint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 25m 20s trunk passed
+1 💚 compile 0m 45s trunk passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 compile 0m 37s trunk passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+1 💚 checkstyle 0m 36s trunk passed
+1 💚 mvnsite 0m 44s trunk passed
+1 💚 javadoc 0m 42s trunk passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 javadoc 1m 5s trunk passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+1 💚 spotbugs 1m 41s trunk passed
+1 💚 shadedclient 21m 27s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 36s the patch passed
+1 💚 compile 0m 37s the patch passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 javac 0m 37s the patch passed
+1 💚 compile 0m 30s the patch passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+1 💚 javac 0m 30s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 0m 30s /results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt hadoop-hdfs-project/hadoop-hdfs: The patch generated 2 new + 310 unchanged - 0 fixed = 312 total (was 310)
+1 💚 mvnsite 0m 38s the patch passed
+1 💚 javadoc 0m 33s the patch passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 javadoc 0m 58s the patch passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+1 💚 spotbugs 1m 42s the patch passed
+1 💚 shadedclient 21m 7s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 1m 2s hadoop-hdfs in the patch passed.
+1 💚 asflicense 0m 21s The patch does not generate ASF License warnings.
89m 43s
Subsystem Report/Notes
Docker ClientAPI=1.48 ServerAPI=1.48 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7602/3/artifact/out/Dockerfile
GITHUB PR #7602
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint
uname Linux 0772b272dc0e 5.15.0-136-generic #147-Ubuntu SMP Sat Mar 15 15:53:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 2cfaff7
Default Java Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7602/3/testReport/
Max. process+thread count 554 (vs. ulimit of 5500)
modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7602/3/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

throw new RetriableException(
"Observer Node is too far behind: serverStateId = "
+ serverStateId + " clientStateId = " + clientStateId);
if (namesystem.isRetryActive()) {

I have a question. Is it necessary to keep this configuration item? Could we instead throw ObserverRetryOnActiveException by default and combine it with a bounded number of retries?

Like this:

int retryTimes = xxx;
if (retryTimes < max) {
  throw new ObserverRetryOnActiveException(message);
} else {
  throw new RetriableException(message);
}
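
One possible reading of this suggestion, written out so that it compiles on its own. The exception classes are local stand-ins for the Hadoop types, `chooseException` is a hypothetical helper, and where `retryTimes` and `max` would come from is not specified in this thread, so they are plain parameters here.

```java
import java.io.IOException;

class BoundedActiveRetrySketch {
  // Local stand-ins for the Hadoop exception types of the same name.
  static class RetriableException extends IOException {
    RetriableException(String msg) { super(msg); }
  }

  static class ObserverRetryOnActiveException extends IOException {
    ObserverRetryOnActiveException(String msg) { super(msg); }
  }

  /**
   * Picks the exception for one stale-read rejection: redirect to the Active
   * while retryTimes is below max, then fall back to a plain retriable error.
   */
  static IOException chooseException(int retryTimes, int max, String message) {
    if (retryTimes < max) {
      return new ObserverRetryOnActiveException(message);
    }
    return new RetriableException(message);
  }

  public static void main(String[] args) {
    for (int attempt = 0; attempt < 4; attempt++) {
      IOException e = chooseException(attempt, 2, "Observer Node is too far behind");
      System.out.println("attempt " + attempt + " -> " + e.getClass().getSimpleName());
    }
  }
}
```

With `max` set to 2, attempts 0 and 1 are redirected to the Active and later attempts fall back to RetriableException.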

@BsoBird BsoBird Apr 16, 2025

@Hexiaoqiao @slfan1989 @jojochuang Hello, could you help review this? Thanks.

Contributor Author

I think retrying is pointless when the Observer's state ID is too far behind the client's state ID. The Observer may take a long time to catch up with the Active's edit log, and the business would perceive that delay very badly.

I also think the new configuration item is necessary. By default, the original retry logic is kept. The parameter is turned on, and requests are transferred to the Active NameNode, only when the business needs this or when retrying is considered unnecessary. Reusing an existing parameter is not a good fit, because one parameter would then carry two meanings, which couples behaviors that should stay independent.

In addition, is it possible to change it directly to ObserverRetryOnActiveException?

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 20s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+0 🆗 xmllint 0m 0s xmllint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 24m 36s trunk passed
+1 💚 compile 0m 46s trunk passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 compile 0m 40s trunk passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+1 💚 checkstyle 0m 43s trunk passed
+1 💚 mvnsite 0m 47s trunk passed
+1 💚 javadoc 0m 43s trunk passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 javadoc 1m 14s trunk passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+1 💚 spotbugs 1m 53s trunk passed
+1 💚 shadedclient 23m 6s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 39s the patch passed
+1 💚 compile 0m 37s the patch passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 javac 0m 37s the patch passed
+1 💚 compile 0m 33s the patch passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+1 💚 javac 0m 33s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 0m 29s /results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt hadoop-hdfs-project/hadoop-hdfs: The patch generated 4 new + 310 unchanged - 0 fixed = 314 total (was 310)
+1 💚 mvnsite 0m 38s the patch passed
+1 💚 javadoc 0m 31s the patch passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 javadoc 1m 3s the patch passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+1 💚 spotbugs 1m 38s the patch passed
+1 💚 shadedclient 21m 1s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 3m 53s hadoop-hdfs in the patch passed.
+1 💚 asflicense 0m 25s The patch does not generate ASF License warnings.
86m 16s
Subsystem Report/Notes
Docker ClientAPI=1.48 ServerAPI=1.48 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7602/4/artifact/out/Dockerfile
GITHUB PR #7602
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint
uname Linux 9dd9206467c4 5.15.0-136-generic #147-Ubuntu SMP Sat Mar 15 15:53:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / db4453c
Default Java Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7602/4/testReport/
Max. process+thread count 998 (vs. ulimit of 5500)
modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7602/4/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

String message = "Retrying to Active NameNode, Observer Node is too"
    + " far behind: serverStateId = " + serverStateId
    + " clientStateId = " + clientStateId;
FSNamesystem.LOG.warn(message);
@BsoBird BsoBird Apr 16, 2025

Since we already throw an exception, is it necessary to log a separate message here?

Contributor Author

Yes. Because this is an Observer -> Active exception, I think it should be written to the Observer's log so that maintainers can see how requests are being forwarded.

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 20s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+0 🆗 detsecrets 0m 1s detect-secrets was not available.
+0 🆗 xmllint 0m 1s xmllint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 23m 51s trunk passed
+1 💚 compile 0m 40s trunk passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 compile 0m 36s trunk passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+1 💚 checkstyle 0m 35s trunk passed
+1 💚 mvnsite 0m 38s trunk passed
+1 💚 javadoc 0m 38s trunk passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 javadoc 1m 3s trunk passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+1 💚 spotbugs 1m 43s trunk passed
+1 💚 shadedclient 23m 13s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 36s the patch passed
+1 💚 compile 0m 37s the patch passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 javac 0m 37s the patch passed
+1 💚 compile 0m 31s the patch passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+1 💚 javac 0m 31s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 0m 28s the patch passed
+1 💚 mvnsite 0m 35s the patch passed
+1 💚 javadoc 0m 30s the patch passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 javadoc 1m 2s the patch passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+1 💚 spotbugs 1m 48s the patch passed
+1 💚 shadedclient 22m 38s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 3m 38s hadoop-hdfs in the patch passed.
+1 💚 asflicense 0m 24s The patch does not generate ASF License warnings.
85m 17s
Subsystem Report/Notes
Docker ClientAPI=1.48 ServerAPI=1.48 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7602/5/artifact/out/Dockerfile
GITHUB PR #7602
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint
uname Linux 37dbc46b7353 5.15.0-136-generic #147-Ubuntu SMP Sat Mar 15 15:53:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / af2f89c
Default Java Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7602/5/testReport/
Max. process+thread count 1292 (vs. ulimit of 5500)
modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7602/5/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 22s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+0 🆗 xmllint 0m 0s xmllint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 25m 9s trunk passed
+1 💚 compile 0m 44s trunk passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 compile 0m 39s trunk passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+1 💚 checkstyle 0m 36s trunk passed
+1 💚 mvnsite 0m 44s trunk passed
+1 💚 javadoc 0m 44s trunk passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 javadoc 1m 0s trunk passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+1 💚 spotbugs 1m 46s trunk passed
+1 💚 shadedclient 23m 10s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 41s the patch passed
+1 💚 compile 0m 40s the patch passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 javac 0m 40s the patch passed
+1 💚 compile 0m 32s the patch passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+1 💚 javac 0m 32s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 0m 28s the patch passed
+1 💚 mvnsite 0m 38s the patch passed
+1 💚 javadoc 0m 34s the patch passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 javadoc 0m 59s the patch passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+1 💚 spotbugs 1m 41s the patch passed
+1 💚 shadedclient 22m 28s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 3m 34s hadoop-hdfs in the patch passed.
+1 💚 asflicense 0m 21s The patch does not generate ASF License warnings.
87m 4s
Subsystem Report/Notes
Docker ClientAPI=1.48 ServerAPI=1.48 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7602/6/artifact/out/Dockerfile
GITHUB PR #7602
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint
uname Linux 486cb6c2c2a9 5.15.0-136-generic #147-Ubuntu SMP Sat Mar 15 15:53:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / a985ada
Default Java Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7602/6/testReport/
Max. process+thread count 1264 (vs. ulimit of 5500)
modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7602/6/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

…he Observer NameNode is too far behind client state id.
@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 20s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 24m 44s trunk passed
+1 💚 compile 0m 43s trunk passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 compile 0m 42s trunk passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+1 💚 checkstyle 0m 38s trunk passed
+1 💚 mvnsite 0m 44s trunk passed
+1 💚 javadoc 0m 42s trunk passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 javadoc 1m 5s trunk passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+1 💚 spotbugs 1m 43s trunk passed
+1 💚 shadedclient 21m 48s branch has no errors when building and testing our client artifacts.
-0 ⚠️ patch 22m 0s Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 33s the patch passed
+1 💚 compile 0m 36s the patch passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 javac 0m 36s the patch passed
+1 💚 compile 0m 36s the patch passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+1 💚 javac 0m 36s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 0m 30s the patch passed
+1 💚 mvnsite 0m 41s the patch passed
+1 💚 javadoc 0m 31s the patch passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 javadoc 1m 5s the patch passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+1 💚 spotbugs 1m 41s the patch passed
+1 💚 shadedclient 21m 32s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 3m 47s hadoop-hdfs in the patch passed.
+1 💚 asflicense 0m 25s The patch does not generate ASF License warnings.
84m 35s
Subsystem Report/Notes
Docker ClientAPI=1.48 ServerAPI=1.48 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7602/7/artifact/out/Dockerfile
GITHUB PR #7602
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux 87433ea18c08 5.15.0-136-generic #147-Ubuntu SMP Sat Mar 15 15:53:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 8682989
Default Java Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7602/7/testReport/
Max. process+thread count 878 (vs. ulimit of 5500)
modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7602/7/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 32s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 1s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 23m 55s trunk passed
+1 💚 compile 0m 41s trunk passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 compile 0m 39s trunk passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+1 💚 checkstyle 0m 36s trunk passed
+1 💚 mvnsite 0m 43s trunk passed
+1 💚 javadoc 0m 43s trunk passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 javadoc 1m 7s trunk passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+1 💚 spotbugs 1m 37s trunk passed
+1 💚 shadedclient 21m 42s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 35s the patch passed
+1 💚 compile 0m 36s the patch passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 javac 0m 36s the patch passed
+1 💚 compile 0m 33s the patch passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+1 💚 javac 0m 33s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 0m 29s the patch passed
+1 💚 mvnsite 0m 36s the patch passed
+1 💚 javadoc 0m 31s the patch passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 javadoc 0m 59s the patch passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+1 💚 spotbugs 1m 39s the patch passed
+1 💚 shadedclient 22m 47s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 3m 53s hadoop-hdfs in the patch passed.
+1 💚 asflicense 0m 23s The patch does not generate ASF License warnings.
85m 4s
Subsystem Report/Notes
Docker ClientAPI=1.48 ServerAPI=1.48 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7602/8/artifact/out/Dockerfile
GITHUB PR #7602
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux 7a5a56b605f0 5.15.0-136-generic #147-Ubuntu SMP Sat Mar 15 15:53:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / e266a52
Default Java Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7602/8/testReport/
Max. process+thread count 1139 (vs. ulimit of 5500)
modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7602/8/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@gyz-web
Contributor Author

gyz-web commented Apr 16, 2025

@BsoBird Hi, I see that the tests show no problems. Could you help me contact someone with merge permissions to merge this into trunk? Thanks.

@gyz-web
Contributor Author

gyz-web commented Apr 18, 2025

@simbadzina Hello, could you help me review this PR? Thank you very much~

@BsoBird

BsoBird commented Apr 18, 2025

@ayushtkn @Hexiaoqiao @slfan1989 I have preliminarily judged that this modification does not seem to have any significant issues. I have already applied this patch to our production environment, and it appears to be working fine. However, my understanding of HDFS may not be deep enough. Could you help review this code?

@gyz-web
Contributor Author

gyz-web commented Apr 25, 2025

I believe that in a production environment, ensuring that requests can be processed is the top priority. When we use Hive for performance testing, if data is written faster than the Observer can synchronize and the client StateId exceeds the server StateId by more than 160000 transactions, reads at that point keep failing with a RetriableException ("Observer Node is too far behind"). This causes all read requests in the entire subcluster to fail, which may lead to a production failure. Therefore, an ObserverRetryOnActiveException should be thrown instead.
Hi @Hexiaoqiao @ayushtkn @goiri @ZanderXu @simbadzina @slfan1989, the above is my understanding. Could you please help review this PR when you have free time? Thanks a lot~
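
For illustration only, the check described in the comment above could be sketched roughly as follows. This is not the code in the patch: the class `ObserverLagSketch`, the exception `RetryOnActiveSignal`, and both method names are hypothetical stand-ins, and the threshold is simply the 160000 figure quoted in the comment, treated as a fixed constant.

```java
public class ObserverLagSketch {

    // Threshold quoted in the comment above (160000 transactions).
    // Assumption: treated here as a plain constant for the sketch.
    static final long MAX_LAG_TXNS = 160_000L;

    /** Hypothetical stand-in for an exception that asks the client to retry on the Active NameNode. */
    static class RetryOnActiveSignal extends Exception {
        RetryOnActiveSignal(String message) {
            super(message);
        }
    }

    /** Returns true when the Observer trails the client's last seen state by more than the threshold. */
    static boolean isTooFarBehind(long clientStateId, long serverStateId) {
        return clientStateId - serverStateId > MAX_LAG_TXNS;
    }

    /** Raises the redirect signal rather than leaving the client to keep retrying the Observer. */
    static void checkObserverLag(long clientStateId, long serverStateId)
            throws RetryOnActiveSignal {
        if (isTooFarBehind(clientStateId, serverStateId)) {
            throw new RetryOnActiveSignal("Observer Node is too far behind: serverStateId = "
                    + serverStateId + " clientStateId = " + clientStateId);
        }
    }

    public static void main(String[] args) {
        try {
            // A lag of 200000 transactions exceeds the threshold.
            checkObserverLag(1_000_000L, 800_000L);
            System.out.println("served by Observer");
        } catch (RetryOnActiveSignal e) {
            System.out.println("retry on Active: " + e.getMessage());
        }
    }
}
```

The point of the sketch is only the choice of signal: a retriable error keeps the client waiting on the lagging Observer, while a retry-on-Active signal lets the read be served by the Active NameNode.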

3 participants