CHANGELOG.md (6 additions & 0 deletions)
@@ -5,6 +5,12 @@ All notable changes to this project will be documented in this file.
 
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+## [7.3.0] - 2024-10-11
+
+### Added
+
+- Add ability to (with the right incantations) retrieve the chunks used by an Assistant file search - thanks to [@agamble](https://github.com/agamble) for the addition!
README.md (110 additions & 0 deletions)
@@ -1111,6 +1111,116 @@ end
 
 Note that you have 10 minutes to submit your tool output before the run expires.
 
+#### Exploring chunks used in File Search
+
+Take a deep breath. You might need a drink for this one.
+
+It's possible for OpenAI to share which chunks it used in its internal RAG pipeline to create its file search results.
+
+An example spec that does this can be found [here](https://github.com/alexrudall/ruby-openai/blob/main/spec/openai/client/assistant_file_search_spec.rb), just so you know it's possible.
+
+Here's how to get the chunks used in a file search. In this example I'm using [this file](https://css4.pub/2015/textbook/somatosensory.pdf):
+
+```
+require "openai"
+
+# Make a client
+client = OpenAI::Client.new(
+  access_token: "access_token_goes_here",
+  log_errors: true # Don't do this in production.
+)
+
+# Upload your file(s)
+file_id = client.files.upload(
+  parameters: {
+    file: "path/to/somatosensory.pdf",
+    purpose: "assistants"
+  }
+)["id"]
+
+# Create a vector store to store the vectorised file(s)