HDFS-17769. Allows client to actively retry to Active NameNode when the Observer NameNode is too far behind client state id. #7602
base: trunk
Conversation
throw new RetriableException(
    "Observer Node is too far behind: serverStateId = "
    + serverStateId + " clientStateId = " + clientStateId);
if (namesystem.isRetryActive()) {
I have a question: is it necessary to keep this configuration item? Could we instead throw ObserverRetryOnActiveException by default, combined with a bounded number of retries?
Like this:
int retryTimes = xxx;
if (retryTimes < max) {
  throw new ObserverRetryOnActiveException(message);
} else {
  throw new RetriableException(message);
}
@Hexiaoqiao @slfan1989 @jojochuang Hello, could you help check this? Thanks.
I think it is pointless to retry when the observer's stateId is too far behind the client's stateId, because it may take a long time for the observer to catch up with the active's edit log, which is very bad for how the business perceives latency.
In addition, I think adding this configuration item is necessary. The default keeps the original retry logic; only when the business has this demand, or decides that retrying is not worthwhile, is the switch turned on so that the request is redirected to the active NameNode. Reusing an existing parameter is not a good option, because one parameter would then carry two meanings, creating coupling instead of keeping the behaviours independent.
In addition, is it possible to change the exception directly to ObserverRetryOnActiveException?
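For readers following the discussion, here is a minimal, self-contained sketch of the gated behaviour being debated, assembled from the diff fragments quoted in this pull request. The class, method, and the retryActive parameter (standing in for namesystem.isRetryActive()) are illustrative placeholders rather than the actual patch; only RetriableException and ObserverRetryOnActiveException are real classes from org.apache.hadoop.ipc.

import org.apache.hadoop.ipc.ObserverRetryOnActiveException;
import org.apache.hadoop.ipc.RetriableException;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class ObserverStalenessPolicy {
  private static final Logger LOG =
      LoggerFactory.getLogger(ObserverStalenessPolicy.class);

  // Decide how a too-stale observer answers a read request.
  // retryActive mirrors the new switch discussed in this PR.
  static void rejectStaleRead(long serverStateId, long clientStateId,
      boolean retryActive)
      throws RetriableException, ObserverRetryOnActiveException {
    if (retryActive) {
      // New behaviour: redirect the client to the active NameNode
      // instead of letting it spin on the lagging observer.
      String message = "Retrying to Active NameNode, Observer Node is too"
          + " far behind: serverStateId = " + serverStateId
          + " clientStateId = " + clientStateId;
      LOG.warn(message);
      throw new ObserverRetryOnActiveException(message);
    }
    // Default behaviour: keep the original retry-on-observer semantics.
    throw new RetriableException(
        "Observer Node is too far behind: serverStateId = "
            + serverStateId + " clientStateId = " + clientStateId);
  }
}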
String message = "Retrying to Active NameNode, Observer Node is too"
    + " far behind: serverStateId = " + serverStateId
    + " clientStateId = " + clientStateId;
FSNamesystem.LOG.warn(message);
Since we have thrown an exception, is it necessary to log another message here?
Yes. Because this is an observer-to-active redirection exception, I think it should be logged on the observer side so that maintainers can see how requests are being forwarded.
@BsoBird Hi, I see that there is no problem with the testing. Could you help me contact someone with merge permissions to merge this into trunk? Thanks.
@simbadzina Hello, could you help me review this PR? Thank you very much.
@ayushtkn @Hexiaoqiao @slfan1989 My preliminary judgment is that this modification does not introduce any significant issues. I have already applied this patch to our production environment, and it appears to be working fine. However, my understanding of HDFS may not be deep enough, so could you help review this code?
I believe that in a production environment, ensuring that requests can be processed is the top priority. When we used Hive for performance testing, once the rate of writing data exceeded the Observer's synchronization speed and the gap between the client stateId and server stateId exceeded 160,000 transactions, retrying the reads kept raising RetriableException about "Observer Node is too far behind", causing all read requests in the entire subcluster to fail, which could lead to a production outage. Therefore, ObserverRetryOnActiveException should be thrown.
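To make the intended client-side effect concrete, the sketch below shows the general observer-first read pattern that ObserverRetryOnActiveException is meant to drive: reads go to observers first, and when an observer signals this exception the same call falls back to the active NameNode. This is an illustration of the pattern only, not Hadoop's actual ObserverReadProxyProvider code; ReadCall, observerProxies and activeProxy are hypothetical placeholders.

import java.io.IOException;
import java.util.List;
import org.apache.hadoop.ipc.ObserverRetryOnActiveException;

public class ObserverFirstReader {

  // A single read RPC against one NameNode handle (hypothetical interface).
  public interface ReadCall<T, P> {
    T invoke(P proxy) throws IOException;
  }

  // Try each observer in turn; if one asks us to retry on active
  // (or every observer fails), send the same call to the active NameNode.
  public static <T, P> T readObserverFirst(List<P> observerProxies,
      P activeProxy, ReadCall<T, P> call) throws IOException {
    for (P observer : observerProxies) {
      try {
        return call.invoke(observer);
      } catch (ObserverRetryOnActiveException e) {
        // The observer explicitly redirected us to the active NameNode.
        break;
      } catch (IOException e) {
        // Transient failure on this observer; try the next one.
      }
    }
    return call.invoke(activeProxy);
  }
}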
JIRA: HDFS-17769.
Description of PR
When we use the Router to forward read requests to the Observer, heavy write workloads can leave Observer nodes unable to keep pace with edit log synchronization; even with the dfs.ha.tail-edits.in-progress parameter configured, this can still occur.
This triggers "RetriableException: Observer Node is too far behind" errors.
Especially when the client's ipc.client.ping parameter is set to true, the client keeps waiting and retrying, so the business cannot obtain the desired data in time. In that situation the active NameNode should handle the request instead.
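For reference, turning the new behaviour on would look roughly like the hdfs-site.xml fragment below on the Observer NameNode. The property name is taken from the verification steps later in this description; treating false (the original RetriableException behaviour) as the default is an assumption based on this discussion.

<!-- Sketch only: enable redirecting too-stale observer reads to the active NameNode. -->
<property>
  <name>dfs.namenode.observer.too.stale.retry.active.enable</name>
  <value>true</value>
</property>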
Here are some of the errors we observed and the repair verification:
1. The stateId of the observer is too far behind the active:
2. RetriableException:
3. Repair verification:
(1) View the status of the cluster NameNodes:
(2) We enable the dfs.namenode.observer.too.stale.retry.active.enable parameter and execute a read command on the 21w machine:
(3) The read RPC request can be found in hdfs-audit.log on the active NameNode, so the request is forwarded to the active NameNode:
(4) There are logs of retries to active in the observer log:
2025-04-15 11:24:30,148 WARN namenode.FSNamesystem (GlobalStateIdContext.java:receiveRequestState(163)) - Retrying to Active NameNode, Observer Node is too far behind: serverStateId = 5695393653 clientStateId = 5699337672