Commit 1308eef

Add sections of define/data modes, and collective/independent modes
* add example code fragments for switching between define and data modes
* add example code fragments for switching between collective and independent I/O modes
1 parent 74ee38c commit 1308eef

2 files changed: +219 -3 lines changed

README.md

Lines changed: 2 additions & 2 deletions
@@ -29,8 +29,8 @@ python programs to achieve a scalable I/O performance.
 ["test/"](./test) and ["examples/"](./examples).

 ### Additional Resources
-* [Example programs](./examples#pnetcdf-python-examples) are available in
-folder `./examples`.
+* [Example python programs](./examples#pnetcdf-python-examples) are available in
+folder [./examples](./examples).
 * PnetCDF-python [User Guide](https://pnetcdf-python.readthedocs.io/en/latest)
 * [Data objects](docs/pnetcdf_objects.md) in PnetCDF python programming
 * [Comparison](docs/nc4_vs_pnetcdf.md) of NetCDF4-python and PnetCDF-python

docs/nc4_vs_pnetcdf.md

Lines changed: 217 additions & 1 deletion
@@ -1,7 +1,14 @@
-# Difference between NetCDF4-python and PnetCDF-python
+# Comparison between PnetCDF-Python and NetCDF4-Python
+
+Programming with [NetCDF4-Python](http://unidata.github.io/netcdf4-python/) and
+[PnetCDF-Python](https://pnetcdf-python.readthedocs.io) is very similar.
+The sections below list some of the differences, including file format support
+and operational modes.

 * [Supported File Formats](#supported-file-formats)
 * [Differences in Python Programming](#differences-in-python-programming)
+* [Define Mode and Data Mode](#define-mode-and-data-mode)
+* [Collective and Independent I/O Mode](#collective-and-independent-io-mode)
 * [Blocking vs. Nonblocking APIs](#blocking-vs-nonblocking-apis)

 ---
@@ -59,6 +66,215 @@
| ... ||
| # close file<br>f.close() | ditto NetCDF4 |

---
## Define Mode and Data Mode

In PnetCDF, an open file is in either define mode or data mode. Switching
between the two modes is done by explicitly calling `pnetcdf.File.enddef()` and
`pnetcdf.File.redef()`. NetCDF4-Python has no such mode-switching
requirement. PnetCDF enforces this requirement to keep the metadata consistent
across all MPI processes and to keep the overhead of metadata synchronization
small.

* Define mode
  + When the constructor of the Python class `pnetcdf.File()` is called to
    create a new file, the file is automatically put in define mode. While in
    define mode, the Python program can create new dimensions (instances
    of class `pnetcdf.Dimension`), new variables (instances of class
    `pnetcdf.Variable`), and netCDF attributes. The metadata of these data
    objects can be modified only while the file is in define mode.
  + When an existing file is opened, it is automatically put in
    data mode. To add or modify metadata, a Python program must first call
    `pnetcdf.File.redef()`.

* Data mode
  + Once the creation or modification of metadata is complete, the Python
    program must call `pnetcdf.File.enddef()` to leave define mode and
    enter data mode.
  + While an open file is in data mode, the Python program can make read and
    write requests to the variables that have been created.

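
The mode rules above can be summarized in a small, self-contained sketch. It does not use PnetCDF: the class `ModeModel`, the exception `ModeError`, and the `write()` method are hypothetical stand-ins that only track which mode a file would be in and reject operations that the rules above do not allow in that mode.

```python
class ModeError(RuntimeError):
    """Raised when an operation is attempted in the wrong mode (hypothetical)."""


class ModeModel:
    """Toy model of the define/data mode rules; not part of PnetCDF."""

    def __init__(self, create):
        # A newly created file starts in define mode;
        # an existing file that is opened starts in data mode.
        self.mode = "define" if create else "data"
        self.variables = []

    def def_var(self, name):
        if self.mode != "define":
            raise ModeError("metadata can only be changed in define mode")
        self.variables.append(name)

    def enddef(self):
        if self.mode != "define":
            raise ModeError("enddef() called while not in define mode")
        self.mode = "data"

    def redef(self):
        if self.mode != "data":
            raise ModeError("redef() called while not in data mode")
        self.mode = "define"

    def write(self, name):
        if self.mode != "data":
            raise ModeError("variables can only be accessed in data mode")
        if name not in self.variables:
            raise KeyError(name)
        return f"wrote {name}"


new_file = ModeModel(create=True)     # starts in define mode
new_file.def_var("grid")
new_file.enddef()                     # now in data mode
result = new_file.write("grid")

existing = ModeModel(create=False)    # starts in data mode
try:
    existing.def_var("temperature")   # rejected: redef() was not called first
    rejected = False
except ModeError:
    rejected = True
```

In this model, the attempt to define a variable on the opened file is rejected until `redef()` is called, which mirrors the requirement described for existing files.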
<ul>
  <li> A PnetCDF-Python example that switches between define and data modes
       after creating a new file.</li>
  <li> <details>
  <summary>Example code fragment (click to expand)</summary>

```python
from mpi4py import MPI
import pnetcdf
...
# Create the file; a newly created file starts in define mode
f = pnetcdf.File(filename, mode='w', format="NC_64BIT_DATA", comm=MPI.COMM_WORLD)
...
# Define dimensions
dim_y = f.def_dim("Y", 16)
dim_x = f.def_dim("X", 32)

# Define a 2D variable of integer type
var = f.def_var("grid", pnetcdf.NC_INT, (dim_y, dim_x))

# Add an attribute of string type to the variable
var.str_att_name = "example attribute"

# Exit the define mode
f.enddef()

# Write to a subarray of the variable, var
# (buf is a 4x4 integer array prepared by the caller)
var[4:8, 20:24] = buf

# Re-enter the define mode
f.redef()

# Define a new 2D variable of float type
var_flt = f.def_var("temperature", pnetcdf.NC_FLOAT, (dim_y, dim_x))

# Exit the define mode
f.enddef()

# Write to a subarray of the variable, var_flt
# (buf_flt is a 4x4 float array prepared by the caller)
var_flt[0:4, 16:20] = buf_flt

# Close the file
f.close()
```
</details></li>

  <li> An example that switches between define and data modes after opening an
       existing file.</li>
  <li> <details>
  <summary>Example code fragment (click to expand)</summary>

```python
from mpi4py import MPI
import pnetcdf
...
# Open an existing file; an opened file starts in data mode.
# Mode 'r+' is used because a new variable is defined and written below.
f = pnetcdf.File(filename, mode='r+', comm=MPI.COMM_WORLD)
...
# Get the Python handle of the variable named 'grid', a 2D variable of integer type
var = f.variables['grid']

# Read the variable's attribute named "str_att_name"
str_att = var.str_att_name

# Read a subarray of the variable, var
r_buf = var[4:8, 20:24]

# Look up the existing dimensions (assumes the file has dimensions "Y" and "X",
# as in the file-creation example)
dim_y = f.dimensions['Y']
dim_x = f.dimensions['X']

# Re-enter the define mode
f.redef()

# Define a new 2D variable of double type
var_dbl = f.def_var("precipitation", pnetcdf.NC_DOUBLE, (dim_y, dim_x))

# Add an attribute of string type to the variable
var_dbl.unit = "mm/s"

# Exit the define mode
f.enddef()

# Write to a subarray of the variable, var_dbl
var_dbl[0:4, 16:20] = buf_dbl

# Close the file
f.close()
```
</details></li>
</ul>

---
## Collective and Independent I/O Mode

The terms collective I/O and independent I/O come from the MPI standard. A
collective I/O function call requires all the MPI processes that opened the
same file to participate. An independent I/O function, on the other hand, can
be called by an MPI process independently of the others.

For metadata I/O, both PnetCDF and NetCDF4 require the function calls to be
collective.

* Mode Switch Mechanism
  + PnetCDF-Python -- when a file is in data mode, it can be put into
    either collective or independent I/O mode. The default is collective
    I/O mode. Entering and leaving independent I/O mode is done
    by explicitly calling `pnetcdf.File.begin_indep()` and
    `pnetcdf.File.end_indep()`.

  + NetCDF4-Python -- collective and independent mode switching is done on a
    per-variable basis, by explicitly calling
    `Variable.set_collective()` before accessing the variable.
    For more information, see the
    [NetCDF4-Python User Guide on Parallel I/O](https://unidata.github.io/netcdf4-python/#parallel-io).

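
One way to keep the `begin_indep()` and `end_indep()` calls paired in PnetCDF-Python is a small context manager. The sketch below is an illustration only and is not part of either library: `independent_io` is a hypothetical helper, and `FakeFile` is a stand-in that merely records calls so the snippet runs without MPI. With a real `pnetcdf.File` object, the helper would invoke the `begin_indep()` and `end_indep()` methods named above.

```python
from contextlib import contextmanager


@contextmanager
def independent_io(f):
    """Enter independent I/O mode on f and always leave it again (hypothetical helper)."""
    f.begin_indep()
    try:
        yield f
    finally:
        f.end_indep()


class FakeFile:
    """Stand-in for an open file; records the mode-switch calls it receives."""

    def __init__(self):
        self.calls = []

    def begin_indep(self):
        self.calls.append("begin_indep")

    def end_indep(self):
        self.calls.append("end_indep")


f = FakeFile()
with independent_io(f):
    # Placeholder for independent reads or writes such as var[...] = buf
    f.calls.append("independent write")
```

In PnetCDF the two mode-switch calls are themselves collective, so in a real program every process that opened the file would need to enter and leave such a block; only the reads and writes inside it are independent.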
<ul>
  <li> A PnetCDF-Python example that switches between collective and
       independent I/O modes.</li>
  <li> <details>
  <summary>Example code fragment (click to expand)</summary>

```python
from mpi4py import MPI
import pnetcdf
...
# Create the file
f = pnetcdf.File(filename, mode='w', format="NC_64BIT_DATA", comm=MPI.COMM_WORLD)
...
# Metadata operations to define dimensions and variables
...
# Exit the define mode (by default, in the collective I/O mode)
f.enddef()

# Write to variables collectively
var_flt[start_y:end_y, start_x:end_x] = buf_flt
var_dbl[start_y:end_y, start_x:end_x] = buf_dbl

# Leave collective I/O mode and enter independent I/O mode
f.begin_indep()

# Write to variables independently
var_flt[start_y:end_y, start_x:end_x] = buf_flt
var_dbl[start_y:end_y, start_x:end_x] = buf_dbl

# Leave independent I/O mode and return to collective I/O mode
f.end_indep()

# Close the file
f.close()
```
</details></li>
</ul>

<ul>
  <li> A NetCDF4-Python example that switches between collective and
       independent I/O modes.</li>
  <li> <details>
  <summary>Example code fragment (click to expand)</summary>

```python
from mpi4py import MPI
import netCDF4
...
# Create the file (NetCDF4-Python opens files through the Dataset class)
f = netCDF4.Dataset(filename, 'w', format="NETCDF3_64BIT_DATA",
                    comm=MPI.COMM_WORLD, parallel=True)
...
# Metadata operations to define dimensions and variables
...

# Write to variables collectively
var_flt.set_collective(True)
var_flt[start_y:end_y, start_x:end_x] = buf_flt

var_dbl.set_collective(True)
var_dbl[start_y:end_y, start_x:end_x] = buf_dbl

# Write to variables independently
var_flt.set_collective(False)
var_flt[start_y:end_y, start_x:end_x] = buf_flt

var_dbl.set_collective(False)
var_dbl[start_y:end_y, start_x:end_x] = buf_dbl

# Close the file
f.close()
```
</details></li>
</ul>

---

## Blocking vs Nonblocking APIs
