Version 1.1.0
This release integrates many smaller changes made over the past year.
The most significant new features are:
- The NCUObserver, which includes performance metrics from the NVIDIA Nsight Compute profiler (NCU) during tuning
- The TegraObserver, which reads and sets clock frequencies, power and temperature on NVIDIA Jetson GPUs
In addition, a lot of work has gone into several backends, including OpenACC support, the compiler backend, and the HIP backend.
Thanks to everyone who contributed to Kernel Tuner in the past year!
What's Changed
- Add Tegra Observer to control clocks on Jetson devices by @loostrum in #243
- Catch RuntimeError when importing from pyhip by @loostrum in #252
- Bump pillow from 10.2.0 to 10.3.0 by @dependabot in #249
- Read instant power in pwr_usage by @csbnw in #247
- Bump idna from 3.6 to 3.7 by @dependabot in #250
- Register observer & correct clock setting by @fjwillemsen in #242
- Compiler backend uses g++ instead of gcc by @benvanwerkhoven in #254
- Improved OpenACC support by @isazi in #248
- Small improvements to searchspaces and simulation mode by @fjwillemsen in #251
- Simplify contributing info by @benvanwerkhoven in #255
- Support Python 3.12 and drop Python 3.8 by @benvanwerkhoven in #256
- Support Python 3.12 and drop Python 3.8 (2) by @fjwillemsen in #260
- Add NCUObserver by @csbnw in #253
- Update PMTObserver for latest PMT changes by @csbnw in #261
- OpenACC bug fixing by @isazi in #262
- ESiWACE3 hackathon by @isazi in #267
- fix reading of graphics and memory clocks by @benvanwerkhoven in #271
- Directives: summer refactoring by @isazi in #269
- Tegra observer by @MartijnFr in #270
- Tegra observer with continuous observer by @benvanwerkhoven in #275
- base implementation for pmt continuous observer by @benvanwerkhoven in #276
- Add support for float16 to HIP backend by @loostrum in #280
- Fix: out-of-date PMTContinuousObserver readings by @wvbbreu in #283
- Hip local memory error handling by @MiloLurati in #284
- Replacing PyHIP with new official python wrapper of ROCm HIP by @MiloLurati in #285
- update observer to latest python bindings by @benvanwerkhoven in #279
- add support for any case spelling of block size name defaults by @benvanwerkhoven in #277
- update documentation by @benvanwerkhoven in #293
- Updated pyproject to use hip-python from testpypi by @fjwillemsen in #294
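
As an illustration of the case-insensitive block size names from #277, here is a minimal sketch of how such matching could work. It is not Kernel Tuner's actual implementation: the helper names `resolve_block_size_names` and `thread_block_dimensions` are hypothetical, and the fallback of 1 for a missing dimension is an assumption of this sketch. The default names `block_size_x`, `block_size_y` and `block_size_z` are the ones Kernel Tuner uses.

```python
# Hypothetical sketch of case-insensitive matching of block size names.
# This is NOT Kernel Tuner's implementation; helper names are made up.

DEFAULT_BLOCK_SIZE_NAMES = ["block_size_x", "block_size_y", "block_size_z"]


def resolve_block_size_names(tune_params, defaults=DEFAULT_BLOCK_SIZE_NAMES):
    """Return, for each default name, the spelling used in tune_params.

    The comparison ignores case. A default name with no match in
    tune_params is returned unchanged.
    """
    lowered = {name.lower(): name for name in tune_params}
    return [lowered.get(default, default) for default in defaults]


def thread_block_dimensions(config, names):
    """Return (x, y, z) for one configuration.

    Dimensions missing from the configuration fall back to 1
    (an assumption made for this sketch).
    """
    return tuple(int(config.get(name, 1)) for name in names)


# Tunable parameters spelled with arbitrary capitalization.
tune_params = {"BLOCK_SIZE_X": [32, 64, 128], "Block_Size_Y": [1, 2, 4]}
names = resolve_block_size_names(tune_params)

# One configuration picked from the search space.
config = {"BLOCK_SIZE_X": 64, "Block_Size_Y": 2}
dims = thread_block_dimensions(config, names)

print(names)  # ['BLOCK_SIZE_X', 'Block_Size_Y', 'block_size_z']
print(dims)   # (64, 2, 1)
```

With matching of this kind, tunable parameters named `BLOCK_SIZE_X` or `Block_Size_Y` are treated as thread block dimensions without having to pass custom block size names.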
New Contributors
- @MartijnFr made their first contribution in #270
- @wvbbreu made their first contribution in #283
Full Changelog: 1.0...1.1.0