HDFS-17769. Allows client to actively retry to Active NameNode when the Observer NameNode is too far behind client state id. #7602
base: trunk
Conversation
throw new RetriableException(
    "Observer Node is too far behind: serverStateId = "
    + serverStateId + " clientStateId = " + clientStateId);
if (namesystem.isRetryActive()) {
I have a question: is it necessary to keep this configuration item? Could we instead throw ObserverRetryOnActiveException by default, combined with a bounded number of retries?
Like this:
int retryTimes = xxx;
if (retryTimes < max) {
  throw new ObserverRetryOnActiveException(message);
} else {
  throw new RetriableException(message);
}
@Hexiaoqiao @slfan1989 @jojochuang Hello, could you help check this? Thanks.
I think it is pointless to retry when the observer's stateId is too far behind the client's stateId, because it may take a long time for the observer to catch up with the active's edit log, which is very bad for how the business perceives latency.
In addition, I think adding this configuration item is necessary. The default keeps the original retry logic; only when the business has this demand, or decides that retrying is not worthwhile, is the switch turned on so that the request is redirected to the active NameNode. Reusing an existing parameter is not a good option, because one parameter would then carry two meanings, creating coupling instead of keeping the behaviours independent.
In addition, is it possible to change the exception directly to ObserverRetryOnActiveException?
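For readers following the discussion, here is a minimal, self-contained sketch of the gated behaviour being debated, assembled from the diff fragments quoted in this pull request. The class, method, and the retryActive parameter (standing in for namesystem.isRetryActive()) are illustrative placeholders rather than the actual patch; only RetriableException and ObserverRetryOnActiveException are real classes from org.apache.hadoop.ipc.

import org.apache.hadoop.ipc.ObserverRetryOnActiveException;
import org.apache.hadoop.ipc.RetriableException;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class ObserverStalenessPolicy {
  private static final Logger LOG =
      LoggerFactory.getLogger(ObserverStalenessPolicy.class);

  // Decide how a too-stale observer answers a read request.
  // retryActive mirrors the new switch discussed in this PR.
  static void rejectStaleRead(long serverStateId, long clientStateId,
      boolean retryActive)
      throws RetriableException, ObserverRetryOnActiveException {
    if (retryActive) {
      // New behaviour: redirect the client to the active NameNode
      // instead of letting it spin on the lagging observer.
      String message = "Retrying to Active NameNode, Observer Node is too"
          + " far behind: serverStateId = " + serverStateId
          + " clientStateId = " + clientStateId;
      LOG.warn(message);
      throw new ObserverRetryOnActiveException(message);
    }
    // Default behaviour: keep the original retry-on-observer semantics.
    throw new RetriableException(
        "Observer Node is too far behind: serverStateId = "
            + serverStateId + " clientStateId = " + clientStateId);
  }
}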
String message = "Retrying to Active NameNode, Observer Node is too"
    + " far behind: serverStateId = " + serverStateId
    + " clientStateId = " + clientStateId;
FSNamesystem.LOG.warn(message);
Since we have thrown an exception, is it necessary to log another message here?
Yes. Because this is an observer-to-active redirection exception, I think it should be logged on the observer side so that maintainers can see how requests are being forwarded.
@BsoBird Hi, I see that there is no problem with the testing. Could you help me contact someone with merge permissions to merge this into trunk? Thanks.
@simbadzina Hello, could you help me review this PR? Thank you very much.
@ayushtkn @Hexiaoqiao @slfan1989 My preliminary judgment is that this modification does not introduce any significant issues. I have already applied this patch to our production environment, and it appears to be working fine. However, my understanding of HDFS may not be deep enough, so could you help review this code?
I believe that in a production environment, ensuring that requests can be processed is the top priority. When we used Hive for performance testing, once the rate of writing data exceeded the Observer's synchronization speed and the gap between the client stateId and server stateId exceeded 160,000 transactions, retrying the reads kept raising RetriableException about "Observer Node is too far behind", causing all read requests in the entire subcluster to fail, which could lead to a production outage. Therefore, ObserverRetryOnActiveException should be thrown.
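To make the intended client-side effect concrete, the sketch below shows the general observer-first read pattern that ObserverRetryOnActiveException is meant to drive: reads go to observers first, and when an observer signals this exception the same call falls back to the active NameNode. This is an illustration of the pattern only, not Hadoop's actual ObserverReadProxyProvider code; ReadCall, observerProxies and activeProxy are hypothetical placeholders.

import java.io.IOException;
import java.util.List;
import org.apache.hadoop.ipc.ObserverRetryOnActiveException;

public class ObserverFirstReader {

  // A single read RPC against one NameNode handle (hypothetical interface).
  public interface ReadCall<T, P> {
    T invoke(P proxy) throws IOException;
  }

  // Try each observer in turn; if one asks us to retry on active
  // (or every observer fails), send the same call to the active NameNode.
  public static <T, P> T readObserverFirst(List<P> observerProxies,
      P activeProxy, ReadCall<T, P> call) throws IOException {
    for (P observer : observerProxies) {
      try {
        return call.invoke(observer);
      } catch (ObserverRetryOnActiveException e) {
        // The observer explicitly redirected us to the active NameNode.
        break;
      } catch (IOException e) {
        // Transient failure on this observer; try the next one.
      }
    }
    return call.invoke(activeProxy);
  }
}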
JIRA: HDFS-17769.
Description of PR
When we use the Router to forward read requests to the Observer, heavy write workloads can leave Observer nodes unable to keep pace with edit log synchronization; even with the dfs.ha.tail-edits.in-progress parameter configured, this can still occur.
This triggers "RetriableException: Observer Node is too far behind" errors.
Especially when the client's ipc.client.ping parameter is set to true, the client keeps waiting and retrying, so the business cannot obtain the desired data in time. In that situation the active NameNode should handle the request instead.
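For reference, turning the new behaviour on would look roughly like the hdfs-site.xml fragment below on the Observer NameNode. The property name is taken from the verification steps later in this description; treating false (the original RetriableException behaviour) as the default is an assumption based on this discussion.

<!-- Sketch only: enable redirecting too-stale observer reads to the active NameNode. -->
<property>
  <name>dfs.namenode.observer.too.stale.retry.active.enable</name>
  <value>true</value>
</property>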
Here are some of the errors we observed and the repair verification:
1. The stateId of the observer is too far behind the active:
2. RetriableException:
3. Repair verification:
(1) View the status of the cluster NameNodes:
(2) We enable the dfs.namenode.observer.too.stale.retry.active.enable parameter and execute a read command on the 21w machine:
(3) The read RPC request can be found in hdfs-audit.log on the active NameNode, so the request is forwarded to the active NameNode:
(4) There are logs of retries to active in the observer log:
2025-04-15 11:24:30,148 WARN namenode.FSNamesystem (GlobalStateIdContext.java:receiveRequestState(163)) - Retrying to Active NameNode, Observer Node is too far behind: serverStateId = 5695393653 clientStateId = 5699337672