HDFS-17769. Allows client to actively retry to Active NameNode when the Observer NameNode is too far behind client state id. #7602
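
As a rough illustration of the check this change touches, the sketch below mirrors the staleness test in `receiveRequestState`: when the client state id exceeds the Observer's state id by more than an estimated transaction budget, the server signals that the call should go to the Active NameNode. This is a minimal standalone sketch, not the Hadoop implementation: the class `StaleObserverSketch`, the method `checkObserverState`, and the local `RetryOnActive` exception are hypothetical stand-ins, and the two constant values are assumptions, since the diff shows only their names.

```java
import java.util.concurrent.TimeUnit;

class StaleObserverSketch {
  // Assumed values for illustration; the diff names these constants but does not show them.
  static final long ESTIMATED_TRANSACTIONS_PER_SECOND = 10_000L;
  static final float ESTIMATED_SERVER_TIME_MULTIPLIER = 10.0f;

  /** Hypothetical local stand-in for org.apache.hadoop.ipc.ObserverRetryOnActiveException. */
  static class RetryOnActive extends RuntimeException {
    RetryOnActive(String message) {
      super(message);
    }
  }

  /**
   * Returns clientStateId if the Observer is close enough to the client,
   * otherwise throws so that the caller can retry on the Active NameNode.
   */
  static long checkObserverState(long clientStateId, long serverStateId,
      long clientWaitTimeMs) {
    // Estimated number of transactions the Observer could catch up on
    // while the client is willing to wait.
    float maxLag = ESTIMATED_TRANSACTIONS_PER_SECOND
        * TimeUnit.MILLISECONDS.toSeconds(clientWaitTimeMs)
        * ESTIMATED_SERVER_TIME_MULTIPLIER;
    if (clientStateId - serverStateId > maxLag) {
      throw new RetryOnActive("Observer Node is too far behind: serverStateId = "
          + serverStateId + " clientStateId = " + clientStateId);
    }
    return clientStateId;
  }

  public static void main(String[] args) {
    // Small lag: the Observer may serve the call.
    System.out.println(checkObserverState(100L, 90L, 1000L));
    // Large lag: the call is redirected.
    try {
      checkObserverState(1_000_000L, 10L, 1000L);
    } catch (RetryOnActive e) {
      System.out.println("retry on active: " + e.getMessage());
    }
  }
}
```

Under these assumed values and a one-second client wait time, the budget is 100,000 transactions, so a client state id of 1,000,000 (as used in the new test) would exceed it against a freshly started Observer.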

@@ -30,6 +30,7 @@
 import org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider;
 import org.apache.hadoop.hdfs.server.namenode.ha.ReadOnly;
 import org.apache.hadoop.ipc.AlignmentContext;
+import org.apache.hadoop.ipc.ObserverRetryOnActiveException;
 import org.apache.hadoop.ipc.RetriableException;
 import org.apache.hadoop.ipc.StandbyException;
 import org.apache.hadoop.ipc.protobuf.RpcHeaderProtos.RpcRequestHeaderProto;
@@ -156,9 +157,9 @@ public long receiveRequestState(RpcRequestHeaderProto header,
         ESTIMATED_TRANSACTIONS_PER_SECOND
             * TimeUnit.MILLISECONDS.toSeconds(clientWaitTime)
             * ESTIMATED_SERVER_TIME_MULTIPLIER) {
-      throw new RetriableException(
-          "Observer Node is too far behind: serverStateId = "
-              + serverStateId + " clientStateId = " + clientStateId);
+      throw new ObserverRetryOnActiveException("Retrying to Active NameNode, Observer Node is too"
+          + " far behind: serverStateId = " + serverStateId
+          + " clientStateId = " + clientStateId);
     }
     return clientStateId;
   }
@@ -518,6 +518,31 @@ public void testObserverRetryActiveException() throws Exception {
     assertTrue(thrownRetryException);
   }
 
+  /**
+   * Test that, when the server stateId is too far behind the
+   * client stateId, the request is retried directly on the
+   * Active NameNode, instead of repeatedly retrying.
+   */
+  @Test
+  public void testObserverRetryActiveExceptionWhenStateIdTooStale() throws Exception {
+    dfs.mkdir(testPath, FsPermission.getDefault());
+    assertSentTo(0);
+
+    // Set a large stateId on the client so that the server stateId is too
+    // far behind the client stateId; the request is retried on the active.
+    long realStateId = HATestUtil.setACStateId(dfs, 1000000);
+    FileStatus fileStatus = dfs.getFileStatus(testPath);
+    assertNotNull(fileStatus);
+    assertSentTo(0);
+
+    // StateId restored to normal, request processed by observer.
+    HATestUtil.setACStateId(dfs, realStateId);
+    FileStatus fileStatus2 = dfs.getFileStatus(testPath);
+    assertNotNull(fileStatus2);
+    assertSentTo(2);
+
+  }
+
   /**
    * Test that for open call, if access time update is required,
    * the open call should go to active, instead of observer.