Deep metric learning (DML) learns a generalizable embedding space of a dataset, where semantically similar samples are mapped closer.
Recently, record-breaking methodologies have generally evolved from pairwise-based approaches to proxy-based approaches.
However, many recent works achieve only marginal improvements on the classical benchmark datasets.
Thus, explanation approaches for DML are needed to understand **why the trained model confuses dissimilar samples**.
This question motivates us to design an influence-function-based explanation framework to investigate the existing datasets, consisting of:
- [x] Scalable training-sample attribution:
  - We propose an empirical influence function to identify which training samples contribute to the generalization errors and to quantify how much they contribute.
- [x] Dataset relabelling recommendation:
  - We further identify potentially "buggy" training samples with mistaken labels and generate relabelling recommendations for them.
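As a rough illustration of the first component (not the repository's actual EIF, which lives in `Influence_function/`), a first-order influence score can be sketched as the dot product between a training sample's loss gradient and a test sample's loss gradient. The toy 1-D linear model and all names below are our own assumptions:

```python
# Toy first-order influence sketch (NOT the repository's EIF implementation).
# Model: y ≈ w * x with squared loss 0.5 * (w*x - y)**2.
# A gradient step on training sample i changes w by -lr * g_i, so the test
# loss changes by about -lr * (g_i · g_test): a positive score means the
# sample helps (test loss drops), a negative score means it hurts.

def grad(w, x, y):
    """Gradient of the squared loss 0.5 * (w*x - y)**2 w.r.t. w."""
    return (w * x - y) * x

def influence_scores(w, train, test_point):
    """Score each training sample by grad(train_i) * grad(test)."""
    g_test = grad(w, *test_point)
    return [grad(w, x, y) * g_test for x, y in train]

# Third sample is deliberately "mislabeled" (y should be ~2x).
train = [(1.0, 2.0), (2.0, 4.0), (1.0, -2.0)]
scores = influence_scores(w=1.0, train=train, test_point=(1.0, 2.0))
# scores == [1.0, 4.0, -3.0]: the mislabeled sample gets the only negative score
```

Ranking training samples by such scores (here, most negative first) is the intuition behind attributing generalization errors to specific samples; the repository's method operates on deep embeddings rather than this toy model.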
## Requirements
- Step 1: Install torch and torchvision compatible with your CUDA version, see here: [https://pytorch.org/get-started/previous-versions/](https://pytorch.org/get-started/previous-versions/)
- Step 2: Install faiss compatible with your CUDA version, see here: [https://github.com/facebookresearch/faiss/blob/main/INSTALL.md](https://github.com/facebookresearch/faiss/blob/main/INSTALL.md)
- Step 3: Install the remaining dependencies:
```
pip install -r requirements.txt
```
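Optionally (our own convenience check, not part of the repository), you can verify that the key dependencies are importable before training:

```python
# Quick environment sanity check: report which dependencies are missing.
import importlib.util

def missing_modules(names):
    """Return the subset of module names that cannot be found by the importer."""
    return [n for n in names if importlib.util.find_spec(n) is None]

missing = missing_modules(["torch", "torchvision", "faiss"])
if missing:
    print("Missing dependencies:", ", ".join(missing))
else:
    print("All dependencies found.")
```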
Put them under `mnt/datasets/`
- We use the same hyperparameters specified in [Proxy-NCA++](https://github.com/euwern/proxynca_pp), except that for In-Shop we reduce the batch size to 32 due to limited GPU resources.
## Project Structure
```
|__ config/: training config json files
|__ dataset/: define dataloader
|__ mnt/datasets/
    |__ CARS_196/
    |__ CUB200_2011/
    |__ inshop/
|__ evaluation/: evaluation script for recall@k, NMI etc.
|__ experiments/: scripts for experiments
|__ Influence_function/: implementation of IF and EIF
|__ train.py: normal training script
|__ train_noisy_data.py: noisy data training script
|__ train_sample_reweight.py: re-weighted training script
```
## Instructions
- Training the original models
  - Training the DML models with Proxy-NCA++ loss or with SoftTriple loss
See `Influence_function/influence_function.py`
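As a toy sketch of the relabelling-recommendation idea (our own simplification, not the code in `Influence_function/`): flag a training sample as potentially mislabeled when its embedding lies closer to another class's proxy than to its own class's proxy, and suggest that nearer class as the new label:

```python
# Hypothetical relabelling-recommendation sketch using class proxies.
# A sample whose embedding is nearest to a *different* class's proxy is
# flagged as a candidate labeling "bug" and that class is recommended.

def dist2(a, b):
    """Squared Euclidean distance between two embeddings."""
    return sum((u - v) ** 2 for u, v in zip(a, b))

def recommend_relabel(embedding, label, proxies):
    """proxies: dict class -> proxy embedding.
    Return the suggested class, or None if the current label already matches
    the nearest proxy."""
    nearest = min(proxies, key=lambda c: dist2(embedding, proxies[c]))
    return nearest if nearest != label else None

proxies = {"car": (1.0, 0.0), "bird": (0.0, 1.0)}
print(recommend_relabel((0.9, 0.1), "bird", proxies))  # prints: car  (likely mislabeled)
print(recommend_relabel((0.1, 0.9), "bird", proxies))  # prints: None (label looks fine)
```

The repository's actual pipeline additionally weighs candidates by their (empirical) influence on the generalization error rather than by proxy distance alone.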
## Results
- All trained models: https://drive.google.com/drive/folders/1uzy3J78iwKZMCx_k5yESDLbcLl9RADDb?usp=sharing
- For the detailed statistics of Table 1, please see https://docs.google.com/spreadsheets/d/1f4OXVLO2Mu2CHrBVm72a2ztTHx5nNG92dczTNNw7io4/edit?usp=sharing