|
1 |
| -# Difference between NetCDF4-python and PnetCDF-python |
| 1 | +# Comparison between PnetCDF-Python and NetCDF4-Python |
| 2 | + |
| 3 | +Programming with [NetCDF4-Python](http://unidata.github.io/netcdf4-python/) and |
| 4 | +[PnetCDF-Python](https://pnetcdf-python.readthedocs.io) is very similar. |
| 5 | +The sections below list some of the differences, including file format support |
| 6 | +and operational modes. |
2 | 7 |
|
3 | 8 | * [Supported File Formats](#supported-file-formats)
|
4 | 9 | * [Differences in Python Programming](#differences-in-python-programming)
|
| 10 | +* [Define Mode and Data Mode](#define-mode-and-data-mode) |
| 11 | +* [Collective and Independent I/O Mode](#collective-and-independent-io-mode) |
5 | 12 | * [Blocking vs. Nonblocking APIs](#blocking-vs-nonblocking-apis)
|
6 | 13 |
|
7 | 14 | ---
|
|
59 | 66 | | ... ||
|
60 | 67 | | # close file<br>f.close() | ditto NetCDF4 |
|
61 | 68 |
|
| 69 | +--- |
| 70 | +## Define Mode and Data Mode |
| 71 | + |
| 72 | +In PnetCDF, an opened file is in either define mode or data mode. Switching |
| 73 | +between the modes is done by explicitly calling `"pnetcdf.File.enddef()"` and |
| 74 | +`"pnetcdf.File.redef()"`. NetCDF4-Python has no such mode-switching |
| 75 | +requirement. PnetCDF enforces this requirement to keep the metadata |
| 76 | +consistent across all MPI processes and to keep the overhead of metadata |
| 77 | +synchronization small. |
| 78 | + |
| 79 | +* Define mode |
| 80 | + + When the constructor of the Python class `"pnetcdf.File()"` is called to |
| 81 | + create a new file, the file is automatically put in define mode. While in |
| 82 | + define mode, the Python program can create new dimensions, i.e. instances |
| 83 | + of class `"pnetcdf.Dimension"`, new variables, i.e. instances of class |
| 84 | + `"pnetcdf.Variable"`, and netCDF attributes. The metadata of these data |
| 85 | + objects can be modified only while the file is in define mode. |
| 86 | + + When an existing file is opened, it is automatically put in data mode. |
| 87 | + To add or modify metadata, the Python program must first call |
| 88 | + `"pnetcdf.File.redef()"`. |
| 89 | + |
| 90 | +* Data mode |
| 91 | + + Once the creation or modification of metadata is complete, the Python |
| 92 | + program must call `"pnetcdf.File.enddef()"` to leave define mode and |
| 93 | + enter data mode. |
| 94 | + + While an open file is in data mode, the Python program can make read and |
| 95 | + write requests to the variables that have been created. |
| 96 | + |
| 97 | +<ul> |
| 98 | + <li> The following PnetCDF-Python example switches between define and data |
| 99 | + modes after creating a new file.</li> |
| 100 | + <li> <details> |
| 101 | + <summary>Example code fragment (click to expand)</summary> |
| 102 | + |
| 103 | +```python |
| 104 | + import pnetcdf |
| 105 | + ... |
| 106 | + # Create the file |
| 107 | + f = pnetcdf.File(filename, 'w', "NC_64BIT_DATA", MPI.COMM_WORLD) |
| 108 | + ... |
| 109 | + # Define dimensions |
| 110 | + dim_y = f.def_dim("Y", 16) |
| 111 | + dim_x = f.def_dim("X", 32) |
| 112 | + |
| 113 | + # Define a 2D variable of integer type |
| 114 | + var = f.def_var("grid", pnetcdf.NC_INT, (dim_y, dim_x)) |
| 115 | + |
| 116 | + # Add an attribute of string type to the variable |
| 117 | + var.str_att_name = "example attribute" |
| 118 | + |
| 119 | + # Exit the define mode |
| 120 | + f.enddef() |
| 121 | + |
| 122 | + # Write to a subarray of the variable, var |
| 123 | + var[4:8, 20:24] = buf |
| 124 | + |
| 125 | + # Re-enter the define mode |
| 126 | + f.redef() |
| 127 | + |
| 128 | + # Define a new 2D variable of float type |
| 129 | + var_flt = f.def_var("temperature", pnetcdf.NC_FLOAT, (dim_y, dim_x)) |
| 130 | + |
| 131 | + # Exit the define mode |
| 132 | + f.enddef() |
| 133 | + |
| 134 | + # Write to a subarray of the variable, var_flt |
| 135 | + var_flt[0:4, 16:20] = buf_flt |
| 136 | + |
| 137 | + # Close the file |
| 138 | + f.close() |
| 139 | +``` |
| 140 | +</details></li> |
| 141 | + |
| 142 | + <li> The following example switches between define and data modes after opening an existing file. |
| 143 | + </li> |
| 144 | + <li> <details> |
| 145 | + <summary>Example code fragment (click to expand)</summary> |
| 146 | + |
| 147 | +```python |
| 148 | + import pnetcdf |
| 149 | + ... |
| 150 | + # Open an existing file for reading and writing ('r' alone is read-only) |
| 151 | + f = pnetcdf.File(filename, 'r+', comm=MPI.COMM_WORLD) |
| 152 | + ... |
| 153 | + # Get the Python handle of the variable named 'grid', a 2D variable of integer type |
| 154 | + var = f.variables['grid'] |
| 155 | + |
| 156 | + # Read the variable's attribute named "str_att_name" |
| 157 | + str_att = var.str_att_name |
| 158 | + |
| 159 | + # Read a subarray of the variable, var |
| 160 | + r_buf = np.empty((4, 4), var.dtype) |
| 161 | + r_buf = var[4:8, 20:24] |
| 162 | + |
| 163 | + # Re-enter the define mode |
| 164 | + f.redef() |
| 165 | + |
| 166 | + # Define a new 2D variable of double type |
| 167 | + var_dbl = f.def_var("precipitation", pnetcdf.NC_DOUBLE, (dim_y, dim_x)) |
| 168 | + |
| 169 | + # Add an attribute of string type to the variable |
| 170 | + var_dbl.unit = "mm/s" |
| 171 | + |
| 172 | + # Exit the define mode |
| 173 | + f.enddef() |
| 174 | + |
| 175 | + # Write to a subarray of the variable, var_dbl |
| 176 | + var_dbl[0:4, 16:20] = buf_dbl |
| 177 | + |
| 178 | + # Close the file |
| 179 | + f.close() |
| 180 | +``` |
| 181 | +</details></li> |
| 182 | +</ul> |
| 183 | + |
| 184 | + |
| 185 | +--- |
| 186 | +## Collective and Independent I/O Mode |
| 187 | + |
| 188 | +The terms collective and independent I/O come from the MPI standard. A |
| 189 | +collective I/O function call requires all the MPI processes that opened the |
| 190 | +same file to participate. An independent I/O function, on the other hand, can |
| 191 | +be called by an MPI process independently of the others. |
| 192 | + |
| 193 | +For metadata I/O, both PnetCDF and NetCDF4 require the function calls to be |
| 194 | +collective. |
| 195 | + |
| 196 | +* Mode Switch Mechanism |
| 197 | + + PnetCDF-Python -- when a file is in the data mode, it can be put into |
| 198 | + either collective or independent I/O mode. The default mode is collective |
| 199 | + I/O mode. Switching to and exiting from the independent I/O mode is done |
| 200 | + by explicitly calling `"pnetcdf.File.begin_indep()"` and |
| 201 | + `"pnetcdf.File.end_indep()"`. |
| 202 | + |
| 203 | + + NetCDF4-Python -- switching between collective and independent mode is |
| 204 | + done on a per-variable basis, by explicitly calling |
| 205 | + `"Variable.set_collective()"` before accessing the variable. |
| 206 | + For more information, see the |
| 207 | + [NetCDF4-Python User Guide on Parallel I/O](https://unidata.github.io/netcdf4-python/#parallel-io). |
| 208 | + |
| 209 | +<ul> |
| 210 | + <li> The following PnetCDF-Python example switches between collective and |
| 211 | + independent I/O modes.</li> |
| 212 | + <li> <details> |
| 213 | + <summary>Example code fragment (click to expand)</summary> |
| 214 | + |
| 215 | +```python |
| 216 | + import pnetcdf |
| 217 | + ... |
| 218 | + # Create the file |
| 219 | + f = pnetcdf.File(filename, 'w', "NC_64BIT_DATA", MPI.COMM_WORLD) |
| 220 | + ... |
| 221 | + # Metadata operations to define dimensions and variables |
| 222 | + ... |
| 223 | + # Exit the define mode (by default, in the collective I/O mode) |
| 224 | + f.enddef() |
| 225 | + |
| 226 | + # Write to variables collectively |
| 227 | + var_flt[start_y:end_y, start_x:end_x] = buf_flt |
| 228 | + var_dbl[start_y:end_y, start_x:end_x] = buf_dbl |
| 229 | + |
| 230 | + # Leaving collective I/O mode and entering independent I/O mode |
| 231 | + f.begin_indep() |
| 232 | + |
| 233 | + # Write to variables independently |
| 234 | + var_flt[start_y:end_y, start_x:end_x] = buf_flt |
| 235 | + var_dbl[start_y:end_y, start_x:end_x] = buf_dbl |
| 236 | + |
| 237 | + # Close the file |
| 238 | + f.close() |
| 239 | +``` |
| 240 | +</details></li> |
| 241 | +</ul> |
| 242 | + |
| 243 | +<ul> |
| 244 | + <li> The following NetCDF4-Python example switches between collective and |
| 245 | + independent I/O modes.</li> |
| 246 | + <li> <details> |
| 247 | + <summary>Example code fragment (click to expand)</summary> |
| 248 | + |
| 249 | +```python |
| 250 | + import netCDF4 |
| 251 | + ... |
| 252 | + # Create the file |
| 253 | + f = netCDF4.Dataset(filename, 'w', format="NETCDF3_64BIT_DATA", parallel=True, comm=MPI.COMM_WORLD) |
| 254 | + ... |
| 255 | + # Metadata operations to define dimensions and variables |
| 256 | + ... |
| 257 | + |
| 258 | + # Write to variables collectively |
| 259 | + var_flt.set_collective(True) |
| 260 | + var_flt[start_y:end_y, start_x:end_x] = buf_flt |
| 261 | + |
| 262 | + var_dbl.set_collective(True) |
| 263 | + var_dbl[start_y:end_y, start_x:end_x] = buf_dbl |
| 264 | + |
| 265 | + # Write to variables independently |
| 266 | + var_flt.set_collective(False) |
| 267 | + var_flt[start_y:end_y, start_x:end_x] = buf_flt |
| 268 | + |
| 269 | + var_dbl.set_collective(False) |
| 270 | + var_dbl[start_y:end_y, start_x:end_x] = buf_dbl |
| 271 | + |
| 272 | + # Close the file |
| 273 | + f.close() |
| 274 | +``` |
| 275 | +</details></li> |
| 276 | +</ul> |
| 277 | + |
62 | 278 | ---
|
63 | 279 |
|
64 | 280 | ## Blocking vs Nonblocking APIs
|
|