Skip to content

Fixes Ion Binary Reader to fail fast when an IVM type code is encountered in an illegal position #397

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Nov 24, 2021

Conversation

popematt
Copy link
Contributor

@popematt popematt commented Nov 22, 2021

Issue #, if available:
#395

Description of changes:

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@codecov
Copy link

codecov bot commented Nov 22, 2021

Codecov Report

Merging #397 (dad9622) into master (ac594ac) will increase coverage by 0.00%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff            @@
##             master     #397   +/-   ##
=========================================
  Coverage     66.35%   66.35%           
- Complexity     5361     5362    +1     
=========================================
  Files           154      154           
  Lines         22622    22624    +2     
  Branches       4083     4084    +1     
=========================================
+ Hits          15010    15013    +3     
+ Misses         6254     6253    -1     
  Partials       1358     1358           
Impacted Files Coverage Δ
src/com/amazon/ion/impl/IonReaderBinaryRawX.java 80.09% <100.00%> (+0.06%) ⬆️
src/com/amazon/ion/impl/BlockedBuffer.java 51.09% <0.00%> (+0.12%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ac594ac...dad9622. Read the comment docs.

@tgregg
Copy link
Contributor

tgregg commented Nov 22, 2021

It's really hard for me to tell why this fixes the problem. Why is the reader returning a null field name symbol token when it's supposed to be in a struct? Is it not actually in a struct?

Also, for changes to ion-tests, those need to be made directly in the https://github.com/amzn/ion-tests/ repo. Then we can update ion-java's submodule to point to the latest commit on ion-tests.

@popematt
Copy link
Contributor Author

It's really hard for me to tell why this fixes the problem. Why is the reader returning a null field name symbol token when it's supposed to be in a struct? Is it not actually in a struct?

It is actually in a struct... but someone managed to devise an invalid input that tricks the ion reader into thinking there's a symbol with a null field name. Here's the input:

E0 01 00 EA EE 01 F7 81   83 DE 01 F2 86 BE 01 67
A8 9A 11 86 7C 7A 9A 9E   72 73 00 00 91 74 81 83
DE DC 04 3D B5 00 00 00   81 E0 01 00 EA EE 01 F7
81 98 DE 01 F2 83 DE 01

There's a number of issues with this input. The length codes for all of the containers are wrong. It also looks like someone took the first 8 bytes (including the IVM) and repeated them again at position 41, in the middle of a struct.

After looking into it more, it seems like the root cause is actually that IVMs are not handled correctly inside containers. When $ion_1_0 is found by a text reader, the reader will treat it as a symbol if it is not a top-level value. The binary reader is doing the same thing with E0 01 00 EA even though E0 unambiguously indicates it is not a symbol.

https://github.com/amzn/ion-java/blob/ac594aca46bccad84b736348efb5bd8c780deded/src/com/amazon/ion/impl/IonReaderBinaryRawX.java#L285-L290

Here we see that any time E0 01 00 EA occurs, it is interpreted as the marker with a value type of "symbol", and in the load_version_marker() method, it always sets the field name sid to unknown.

https://github.com/amzn/ion-java/blob/ac594aca46bccad84b736348efb5bd8c780deded/src/com/amazon/ion/impl/IonReaderBinaryRawX.java#L339-L360

Also, for changes to ion-tests, those need to be made directly in the https://github.com/amzn/ion-tests/ repo. Then we can update ion-java's submodule to point to the latest commit on ion-tests.

Right. I forgot about that.

@popematt
Copy link
Contributor Author

I managed to come up with a minimal, complete example of the bad data. See amazon-ion/ion-tests#74

@tgregg
Copy link
Contributor

tgregg commented Nov 23, 2021

I suspect the better fix is to throw an error here unless the reader is at depth zero. E.g.

if (getDepth() != 0) {
    throwErrorAt("Encountered illegal type code E0 below the top level.");
}

@popematt
Copy link
Contributor Author

I suspect the better fix is to throw an error here unless the reader is at depth zero. E.g.

if (getDepth() != 0) {
    throwErrorAt("Encountered illegal type code E0 below the top level.");
}

Yes. That's what I realized when I looked into it more. I haven't pushed an updated PR though because I'm waiting to get the tests updated first.

@popematt popematt merged commit 676455a into amazon-ion:master Nov 24, 2021
@popematt popematt changed the title Fixes NullPointerException when reading malformed symbol table import Fixes Ion Binary Reader to fail fast when an IVM type code is encountered in an illegal position Nov 29, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants