Commit 59a9179

authored
Add files via upload
1 parent 53c76e7 commit 59a9179

10 files changed

+1000
-0
lines changed

README.md

Lines changed: 199 additions & 0 deletions
@@ -0,0 +1,199 @@
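### Description:

This project shows how to use CPM (Convolutional Pose Machines) to localize clothing keypoints. It is an attempt at the Alibaba Cloud Tianchi "FashionAI Global Challenge: Clothing Keypoint Localization" competition.

### Principle:

The input is an image and the output is the x and y coordinates of each keypoint, usually normalized to the range 0 to 1, so the task can be treated as a regression problem. Regressing the coordinate values directly, however, tends to produce large errors. A better approach is to output a low-resolution heatmap that gives a high response at the keypoint location and a low response everywhere else.
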
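The heatmap target described above can be sketched as follows. This is a minimal example that mirrors the Gaussian used in data_augmentation.py (heatmap size 32, standard deviation 1); the function name `keypoint_heatmap` is hypothetical and is not part of the repository.

```python
import numpy as np

def keypoint_heatmap(x_norm, y_norm, heatmap_size=32, sigma=1.0):
    """Return a (heatmap_size, heatmap_size) Gaussian response centred on one keypoint.

    x_norm and y_norm are coordinates normalized to the range 0..1.
    """
    x0 = int(x_norm * heatmap_size)
    y0 = int(y_norm * heatmap_size)
    xs = np.arange(heatmap_size, dtype=float)  # column indices, shape (W,)
    ys = xs[:, np.newaxis]                     # row indices, shape (H, 1)
    # Broadcasting (W,) against (H, 1) gives an (H, W) map that peaks at (y0, x0).
    return np.exp(-((xs - x0) ** 2 + (ys - y0) ** 2) / (2.0 * sigma ** 2))

heatmap = keypoint_heatmap(0.5, 0.25)
row, col = np.unravel_index(np.argmax(heatmap), heatmap.shape)
print(row, col)  # 8 16
```

The response is exactly 1 at the keypoint cell and decays quickly with distance, which gives the network a smoother target than a single pair of numbers.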
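![img](https://pic4.zhimg.com/v2-9ef0325b047a1e9357fcfe950bda39c7_r.jpg)

The model used here is therefore CPM (CVPR 2016). Its basic idea is to cascade several stages, each of which contains several CNNs and outputs heatmaps. Minimizing the difference between every stage's heatmaps and the ground truth yields progressively more accurate keypoint locations.
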
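The idea of supervising every stage can be illustrated with a small NumPy sketch. It assumes a mean-squared-error distance per stage, which is an assumption for illustration only: train.py is not shown here, and its actual loss may be scaled differently. The name `multi_stage_loss` is hypothetical.

```python
import numpy as np

def multi_stage_loss(stage_heatmaps, target):
    """Sum of per-stage mean squared errors against the same ground-truth heatmaps."""
    return sum(float(np.mean((pred - target) ** 2)) for pred in stage_heatmaps)

rng = np.random.default_rng(0)
target = rng.random((32, 32, 5))  # heatmap_size x heatmap_size x channels

# Fake predictions that get closer to the target from stage to stage.
stage_outputs = [target + rng.normal(scale=s, size=target.shape) for s in (0.5, 0.3, 0.1)]

per_stage = [float(np.mean((p - target) ** 2)) for p in stage_outputs]
total = multi_stage_loss(stage_outputs, target)
print(per_stage, total)
```

Because every stage contributes its own term, the early stages receive a direct training signal instead of relying only on gradients passed back from the last stage.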
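An open-source implementation of CPM is available on GitHub ([https://github.com/timctho/convolutional-pose-machines-tensorflow](https://link.zhihu.com/?target=https%3A//github.com/timctho/convolutional-pose-machines-tensorflow)).

![img](https://pic3.zhimg.com/v2-d5f9761daca1eb6fbd978afa46e69a56_r.jpg)

### Data:

The data comes from the Tianchi FashionAI Global Challenge: [FashionAI clothing keypoint localization dataset, Alibaba Cloud Tianchi (aliyun.com)](https://tianchi.aliyun.com/dataset/136923)

For the clothing keypoint localization task, the training set contains more than 70,000 images and the test set contains more than 50,000 images.

Every image is labeled with its clothing category. There are five categories: blouse, outwear, dress, skirt, and trousers.

![img](https://pic2.zhimg.com/v2-a634ec65d864383dc125118c31ac75a9_r.jpg)

The training set also provides annotations for 24 keypoints per image. Each annotation holds three values: the x coordinate, the y coordinate, and a visibility flag. Not every clothing category has all 24 keypoints; for details see [FashionAI clothing keypoint localization dataset, Alibaba Cloud Tianchi (aliyun.com)](https://tianchi.aliyun.com/dataset/136923)
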
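As data_preprocess.py in this commit shows, each annotation is stored as one string of three values joined by underscores, and a negative third value is treated as "keypoint absent". The sketch below follows that convention; the helper names `parse_keypoint` and `normalize_keypoint` and the sample strings are hypothetical.

```python
def parse_keypoint(raw):
    """Split an 'x_y_state' annotation string into three floats."""
    x, y, state = (float(v) for v in raw.split('_'))
    return x, y, state

def normalize_keypoint(raw, width, height):
    """Return (x, y) relative to the image size, or (0.0, 0.0) if the keypoint is absent."""
    x, y, state = parse_keypoint(raw)
    if state < 0:
        return 0.0, 0.0
    return x / width, y / height

print(normalize_keypoint('256_128_1', 512, 512))  # (0.5, 0.25)
print(normalize_keypoint('-1_-1_-1', 512, 512))   # (0.0, 0.0)
```

Mapping absent keypoints to (0, 0) keeps every record the same length, at the cost of making (0, 0) ambiguous with a real keypoint in the top-left corner.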


### Installation:

#### Training environment:


| Component | Version |
| :-------------------: | :------------------ |
| Ubuntu | 22.04 |
| GPU | 2080 Ti-11G |
| cuda | 10.0.130 |
| cudnn | 7.6.5.32-1+cuda10.0 |
| python | 3.7.10 |
| tensorflow-gpu | 1.15.5 |
| numpy | 1.18.5 |
| pandas | 1.2.4 |
| scikit-learn | 0.24.2 |
| opencv-contrib-python | 4.5.1.48 |
| matplotlib | 3.4.1 |
| imageio | 2.15.0 |
| tqdm | 4.64.1 |

##### Note:

**tensorflow-gpu 1.x does not support RTX 30-series GPUs. Training on a 30-series card produces nan training and test losses, and even an already trained model localizes keypoints inaccurately when tested on such a card. The code therefore tests on the CPU by default. If you want to use a GPU, a 20-series card is recommended (verified to work).**

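One common way to force CPU execution is to hide all CUDA devices before TensorFlow is imported. The sketch below only illustrates that general technique; how test.py in this repository actually selects the CPU is not shown here, and the helper name `hide_gpus` is hypothetical.

```python
import os

def hide_gpus():
    """Hide every CUDA device from this process and return the mask that was set.

    This has to run before `import tensorflow`, because the device list is
    read when TensorFlow initializes CUDA.
    """
    os.environ["CUDA_VISIBLE_DEVICES"] = "-1"
    return os.environ["CUDA_VISIBLE_DEVICES"]

mask = hide_gpus()
print(mask)  # -1
# import tensorflow as tf  # import only after the mask is set
```

To go back to GPU execution on a supported card, remove the call or set the variable to the index of the desired device, for example "0".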
If you are not yet familiar with **tensorflow-gpu**, **CUDA**, and **cuDNN**, you can follow [this CSDN guide to downloading and installing TensorFlow, CUDA, and cuDNN](https://blog.csdn.net/weixin_45956028/article/details/119419463).



### Training:

Pre-trained models are provided; training took two days in total on a 2080 Ti GPU. To train and tune the models yourself, run the following in the code directory:

`python train.py`

To train on a different clothing category, edit the following part of train.py (shown here set up for the skirt subset):

```python
#train = train[train.image_category == 'dress']  # dress is used as the example to check that the code runs
#train = train[train.image_category == 'blouse']
#train = train[train.image_category == 'outwear']
#train = train[train.image_category == 'trousers']
train = train[train.image_category == 'skirt']
```

Edit the code above to select the clothing category to train on from the training set.

```python
#dress
'''features = [
    'neckline_left', 'neckline_right', 'center_front', 'shoulder_left', 'shoulder_right',
    'armpit_left', 'armpit_right', 'waistline_left', 'waistline_right',
    'cuff_left_in', 'cuff_left_out', 'cuff_right_in', 'cuff_right_out', 'hemline_left', 'hemline_right']#15'''
#blouse
'''features = [
    'neckline_left', 'neckline_right', 'center_front', 'shoulder_left', 'shoulder_right',
    'armpit_left', 'armpit_right', 'cuff_left_in', 'cuff_left_out', 'cuff_right_in',
    'cuff_right_out','top_hem_left','top_hem_right']#13'''
#outwear
'''features = [
    'neckline_left', 'neckline_right', 'shoulder_left', 'shoulder_right','armpit_left',
    'armpit_right', 'waistline_left', 'waistline_right','cuff_left_in','cuff_left_out',
    'cuff_right_in', 'cuff_right_out','top_hem_left','top_hem_right']#14'''
#trousers
'''features = [
    'waistband_left','waistband_right','crotch','bottom_left_in','bottom_left_out',
    'bottom_right_in','bottom_right_out']#7'''
#skirt
features = [
    'waistband_left','waistband_right','hemline_left', 'hemline_right']#4
```

Edit the code above to select the feature set that matches the chosen clothing category. The number after each list is the number of features (keypoints) that category has.

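Instead of commenting and uncommenting lists, the same information could be kept in a single lookup table. This is only a sketch of an alternative layout: the names `FEATURES_BY_CATEGORY` and `select_category` are hypothetical and do not exist in train.py, while the keypoint lists are copied from the listing above.

```python
FEATURES_BY_CATEGORY = {
    'dress': [
        'neckline_left', 'neckline_right', 'center_front', 'shoulder_left', 'shoulder_right',
        'armpit_left', 'armpit_right', 'waistline_left', 'waistline_right',
        'cuff_left_in', 'cuff_left_out', 'cuff_right_in', 'cuff_right_out',
        'hemline_left', 'hemline_right'],
    'blouse': [
        'neckline_left', 'neckline_right', 'center_front', 'shoulder_left', 'shoulder_right',
        'armpit_left', 'armpit_right', 'cuff_left_in', 'cuff_left_out', 'cuff_right_in',
        'cuff_right_out', 'top_hem_left', 'top_hem_right'],
    'outwear': [
        'neckline_left', 'neckline_right', 'shoulder_left', 'shoulder_right', 'armpit_left',
        'armpit_right', 'waistline_left', 'waistline_right', 'cuff_left_in', 'cuff_left_out',
        'cuff_right_in', 'cuff_right_out', 'top_hem_left', 'top_hem_right'],
    'trousers': [
        'waistband_left', 'waistband_right', 'crotch', 'bottom_left_in', 'bottom_left_out',
        'bottom_right_in', 'bottom_right_out'],
    'skirt': ['waistband_left', 'waistband_right', 'hemline_left', 'hemline_right'],
}

def select_category(category):
    """Return (features, y_dim, output_dir) for one clothing category."""
    features = FEATURES_BY_CATEGORY[category]
    return features, len(features), category

features, y_dim, OUTPUT_DIR = select_category('skirt')
print(y_dim, OUTPUT_DIR)  # 4 skirt
```

With this layout the category name is the only value that changes between runs, and the keypoint count used at test time is derived from the list rather than typed by hand.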
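```python
OUTPUT_DIR ='skirt'
```

Set the directory in which the trained model and the sample images produced during training are saved.

![image-20240401120546511](C:\Users\风起\AppData\Roaming\Typora\typora-user-images\image-20240401120546511.png)

The figure above shows one training result for skirt. The three images in the first row are the merged response maps of stages 1, 2, and 3. The three images in the second row are the merged response map of stage 6, the ground truth, and the ground truth overlaid on the original image, which shows that the keypoints are localized accurately.


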
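### Pre-trained models:

Pre-trained models and sample images were generated from the dataset above and are stored in five folders: blouse, skirt, trousers, outwear, and dress. Choose whichever model you need.



### Testing:

To see the keypoint localization results, run the following in the code directory:

`python test.py`

This generates keypoint localization results for 16 random images from the test set, saves them in the folder that corresponds to **the clothing category you are testing**, and prints the keypoint coordinates of the first 5 images.
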
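Keypoint coordinates are typically read from predicted heatmaps by taking the position of the strongest response in each channel. The sketch below shows that general technique; it is an assumption about the approach, since test.py itself is not shown here, and the function name `decode_keypoints` is hypothetical.

```python
import numpy as np

def decode_keypoints(heatmaps, img_width, img_height):
    """Convert (H, W, K) heatmaps into K (x, y) pixel coordinates.

    Each keypoint is placed at the argmax of its channel and scaled
    from heatmap resolution up to the image resolution.
    """
    h, w, k = heatmaps.shape
    coords = []
    for j in range(k):
        row, col = np.unravel_index(np.argmax(heatmaps[:, :, j]), (h, w))
        coords.append((int(col * img_width / w), int(row * img_height / h)))
    return coords

demo = np.zeros((32, 32, 2))
demo[8, 16, 0] = 1.0   # keypoint 0 peaks at row 8, column 16
demo[31, 0, 1] = 1.0   # keypoint 1 peaks at row 31, column 0
print(decode_keypoints(demo, 256, 256))  # [(128, 64), (0, 248)]
```

Because the heatmap is coarser than the image (32 versus 256 in this project's settings), each decoded coordinate is only accurate to within one heatmap cell, here 8 pixels.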
To test a different clothing category, edit the following code in test.py (shown here for the dress data):

```python
test = test[test.image_category == 'dress']
#test = test[test.image_category == 'blouse']
#test = test[test.image_category == 'outwear']
#test = test[test.image_category == 'trousers']
#test = test[test.image_category == 'skirt']
```

The code above selects the clothing data to test.

```python
OUTPUT_DIR = 'dress'
```

Set **OUTPUT_DIR**: the trained model is loaded from this directory, and the test results are saved there as well.

```python
y_dim = 15  # adjust to match the data the model was trained on
```

Set **y_dim** to the number of features for that clothing category, which is the number after the corresponding feature list in **train.py**.

**Note: for convenience, the test code has also been wrapped in a function and saved in test_def.py.**



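### What each file does:

- The data folder holds the training and test data;

- The blouse, skirt, trousers, outwear, and dress folders hold the corresponding pre-trained models and the sample images produced during training;

- cuda_test.py checks whether TensorFlow can use the GPU;

- data_augmentation.py performs data augmentation, applying random rotation, random translation, and random horizontal flipping to the training images to make the model more robust;

- data_preprocess.py preprocesses the dataset downloaded from the Tianchi competition;

- test.py is the test script;

- test_def.py is the test code wrapped in a function;

- train.py trains on the dataset and produces the trained model;

- utils_key.py holds the external dependencies that all the other scripts need at run time.


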
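The horizontal flip in data_augmentation.py needs one extra step beyond mirroring the x coordinate: a keypoint labeled "left" ends up on the right side of the flipped image, so left and right labels have to be swapped as well. The sketch below isolates that step on normalized coordinates; the helper names `swap_side` and `flip_keypoints` are hypothetical.

```python
def swap_side(name):
    """Exchange 'left' and 'right' in a keypoint name; other names are unchanged."""
    if 'left' in name:
        return name.replace('left', 'right')
    if 'right' in name:
        return name.replace('right', 'left')
    return name

def flip_keypoints(points, names):
    """Mirror normalized (x, y) keypoints horizontally, swapping left/right labels."""
    by_name = dict(zip(names, points))
    flipped = []
    for name in names:
        x, y = by_name[swap_side(name)]  # take the coordinates of the opposite side
        flipped.append((1 - x, y))
    return flipped

names = ['waistband_left', 'waistband_right', 'crotch']
points = [(0.25, 0.2), (0.75, 0.2), (0.5, 0.6)]
print(flip_keypoints(points, names))  # [(0.25, 0.2), (0.75, 0.2), (0.5, 0.6)]
```

For a symmetric garment the flipped keypoints equal the originals, as in the example, which makes a convenient sanity check for the label swap.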
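#### Usage notes:

- Linux and Windows use different path separators ( / and \ ), so on a different operating system it is best to run data_preprocess.py first to produce a train_changed.csv that matches your system. The uploaded file is the preprocessed table generated on Linux.

- Training on all keypoints together was tried first, but GPU memory ran out, so each clothing category is trained separately. When training, filter train_changed.csv by clothing category, then select the matching feature set in train.py and create the matching output directory.

- When testing, likewise filter test.csv by clothing category, set the dimension of the keypoint array (y_dim) to the size of that category's feature set, and choose the matching model directory and output directory.


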
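Regenerating train_changed.csv is the approach the notes above recommend. As an alternative illustration, stored relative paths can also be rewritten at load time so that either separator works. This sketch assumes relative paths such as those in the `image_id` column; the helper name `to_native_path` and the sample path are hypothetical.

```python
import os

def to_native_path(path_str):
    """Rebuild a relative path written with '/' or backslashes using the local separator."""
    parts = [p for p in path_str.replace('\\', '/').split('/') if p]
    return os.path.join(*parts)

# The same logical path, written with Linux and Windows separators.
linux_style = 'data/train/Images/skirt/example.jpg'
windows_style = 'data\\train\\Images\\skirt\\example.jpg'
print(to_native_path(linux_style) == to_native_path(windows_style))  # True
```

The helper drops empty components, so it is only suitable for relative paths; an absolute path would lose its leading root.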
### References:

- Convolutional Pose Machines: [https://arxiv.org/abs/1602.00134](https://link.zhihu.com/?target=https%3A//arxiv.org/abs/1602.00134)
- Code repository for Convolutional Pose Machines: [https://github.com/shihenw/convolutional-pose-machines-release](https://link.zhihu.com/?target=https%3A//github.com/shihenw/convolutional-pose-machines-release)
- A small attempt at the Tianchi FashionAI Global Challenge: [https://zhuanlan.zhihu.com/p/34](https://zhuanlan.zhihu.com/p/34928763)
- Clothing keypoint localization: [27 Clothing keypoint localization, Zhihu (zhihu.com)](https://zhuanlan.zhihu.com/p/44188417)
- FashionAI clothing keypoint localization dataset: https://tianchi.aliyun.com/dataset/136923
- TensorFlow, CUDA, and cuDNN installation: [CSDN guide to downloading and installing TensorFlow, CUDA, and cuDNN](https://blog.csdn.net/weixin_45956028/article/details/119419463)

cuda_test.py

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
from tensorflow.python.client import device_lib
print(device_lib.list_local_devices())  # if GPU devices are listed, TensorFlow can use the GPU

data_augmentation.py

Lines changed: 140 additions & 0 deletions
@@ -0,0 +1,140 @@
from utils_key import *  # presumably supplies np, cv2, pd, os, tqdm, imread, plt and train_test_split used below

# Augment the training images with random rotation, random translation, and random horizontal
# flipping while keeping the keypoints aligned, so that the CPM model copes with more conditions.
# Relies on the module-level img_size, heatmap_size, y_dim, and features defined further down.
def transform(X_batch, Y_batch):
    X_data = []
    Y_data = []

    offset = 20
    for i in range(X_batch.shape[0]):
        img = X_batch[i]
        # random rotation
        degree = int(np.random.random() * offset - offset / 2)
        rad = degree / 180 * np.pi
        mat = cv2.getRotationMatrix2D((img_size / 2, img_size / 2), degree, 1)
        img_ = cv2.warpAffine(img, mat, (img_size, img_size), borderValue=(255, 255, 255))
        # random translation
        x0 = int(np.random.random() * offset - offset / 2)
        y0 = int(np.random.random() * offset - offset / 2)
        mat = np.float32([[1, 0, x0], [0, 1, y0]])
        img_ = cv2.warpAffine(img_, mat, (img_size, img_size), borderValue=(255, 255, 255))
        # random flip
        if np.random.random() > 0.5:
            img_ = np.fliplr(img_)
            flip = True
        else:
            flip = False

        X_data.append(img_)

        points = []
        for j in range(y_dim):
            x = Y_batch[i, j, 0] * img_size
            y = Y_batch[i, j, 1] * img_size
            # random rotation (same matrix as cv2.getRotationMatrix2D, applied to the point)
            dx = x - img_size / 2
            dy = y - img_size / 2
            x = int(dx * np.cos(rad) + dy * np.sin(rad) + img_size / 2)
            y = int(-dx * np.sin(rad) + dy * np.cos(rad) + img_size / 2)
            # random translation
            x += x0
            y += y0

            x = x / img_size
            y = y / img_size
            points.append([x, y])
        # random flip: swap left/right keypoints and mirror the x coordinate
        if flip:
            data = {features[j]: points[j] for j in range(y_dim)}
            points = []
            for j in range(y_dim):
                col = features[j]
                if col.find('left') >= 0:
                    col = col.replace('left', 'right')
                elif col.find('right') >= 0:
                    col = col.replace('right', 'left')
                [x, y] = data[col]
                x = 1 - x
                points.append([x, y])

        Y_data.append(points)

    X_data = np.array(X_data)
    Y_data = np.array(Y_data)

    # preprocess: scale pixels to [-1, 1] and build one Gaussian heatmap per keypoint
    X_data = (X_data / 255. - 0.5) * 2
    Y_heatmap = []
    for i in range(Y_data.shape[0]):
        heatmaps = []
        invert_heatmap = np.ones((heatmap_size, heatmap_size))
        for j in range(Y_data.shape[1]):
            x0 = int(Y_data[i, j, 0] * heatmap_size)
            y0 = int(Y_data[i, j, 1] * heatmap_size)
            x = np.arange(0, heatmap_size, 1, float)
            y = x[:, np.newaxis]
            cur_heatmap = np.exp(-((x - x0) ** 2 + (y - y0) ** 2) / (2.0 * 1.0 ** 2))
            heatmaps.append(cur_heatmap)
            invert_heatmap -= cur_heatmap
        # extra background channel: one minus the sum of all keypoint heatmaps
        heatmaps.append(invert_heatmap)
        Y_heatmap.append(heatmaps)
    Y_heatmap = np.array(Y_heatmap)
    Y_heatmap = np.transpose(Y_heatmap, (0, 2, 3, 1))  # batch_size, heatmap_size, heatmap_size, y_dim + 1

    return X_data, Y_data, Y_heatmap

# Load the data
train = pd.read_csv(os.path.join('data', 'train', 'train_changed.csv'))
train = train[train.image_category == 'dress']  # dress is used as the example to check that the code runs
train = train.to_dict('records')
features = [
    'neckline_left', 'neckline_right', 'center_front', 'shoulder_left', 'shoulder_right',
    'armpit_left', 'armpit_right', 'waistline_left', 'waistline_right','cuff_left_in',
    'cuff_left_out', 'cuff_right_in', 'cuff_right_out','top_hem_left','top_hem_right',
    'waistband_left','waistband_right','hemline_left', 'hemline_right','crotch',
    'bottom_left_in','bottom_left_out','bottom_right_in','bottom_right_out']

img_size = 256
batch_size = 16
heatmap_size = 32
stages = 6
# Collect the images and keypoints
X_train = []
Y_train = []
for i in tqdm(range(len(train))):
    record = train[i]
    img = imread(record['image_id'])
    img = cv2.resize(img, (img_size, img_size))

    y = []
    for col in features:
        y.append([record[col + '_x'], record[col + '_y']])

    X_train.append(img)
    Y_train.append(y)

X_train = np.array(X_train)
Y_train = np.array(Y_train)

# Split into training and validation sets; the validation set is 10%
X_train, X_valid, Y_train, Y_valid = train_test_split(X_train, Y_train, test_size=0.1)

y_dim = Y_train.shape[1]

# Check how well the keypoints line up on the augmented training images
X_batch = X_train[:batch_size]
Y_batch = Y_train[:batch_size]
X_data, Y_data, Y_heatmap = transform(X_batch, Y_batch)

n = int(np.sqrt(batch_size))
puzzle = np.ones((img_size * n, img_size * n, 3))
for i in range(batch_size):
    img = (X_data[i] + 1) / 2  # back to the [0, 1] range for display
    for j in range(y_dim):
        # The image is float in [0, 1], so the marker color is given on the same scale
        cv2.circle(img, (int(img_size * Y_data[i, j, 0]), int(img_size * Y_data[i, j, 1])), 3, (120 / 255, 240 / 255, 120 / 255), 2)
    r = i // n
    c = i % n
    puzzle[r * img_size: (r + 1) * img_size, c * img_size: (c + 1) * img_size, :] = img
plt.figure(figsize=(12, 12))
plt.imshow(puzzle)
plt.show()

data_preprocess.py

Lines changed: 45 additions & 0 deletions
@@ -0,0 +1,45 @@
from utils_key import *  # presumably supplies pd, os, tqdm and imread used below

train = pd.read_csv(os.path.join('data', 'train', 'train.csv'))
#print(len(train))
#print(train.head())
train['image_id'] = train['image_id'].apply(lambda x:os.path.join('data','train', x))  # prefix the training image directory
#print(train.head())
#train.to_csv(os.path.join('data', 'train','train_changed.csv'), index=False)  # save

# Split each keypoint column, stored as one 'x_y_s' triple, into three single-valued columns
columns = train.columns
for col in columns:
    if col in ['image_id', 'image_category']:
        continue
    train[col + '_x'] = train[col].apply(lambda x:float(x.split('_')[0]))
    train[col + '_y'] = train[col].apply(lambda x:float(x.split('_')[1]))
    train[col + '_s'] = train[col].apply(lambda x:float(x.split('_')[2]))
    train.drop([col], axis=1, inplace=True)
#print(train.head())

# Normalize the coordinates for all clothing categories to get each keypoint's relative position in the image
features = [
    'neckline_left', 'neckline_right', 'center_front', 'shoulder_left', 'shoulder_right',
    'armpit_left', 'armpit_right', 'waistline_left', 'waistline_right','cuff_left_in',
    'cuff_left_out', 'cuff_right_in', 'cuff_right_out','top_hem_left','top_hem_right',
    'waistband_left','waistband_right','hemline_left', 'hemline_right','crotch',
    'bottom_left_in','bottom_left_out','bottom_right_in','bottom_right_out']

train = train.to_dict('records')
for i in tqdm(range(len(train))):
    record = train[i]
    img = imread(record['image_id'])
    h = img.shape[0]
    w = img.shape[1]
    for col in features:
        if record[col + '_s'] >= 0:
            # keypoint exists: divide by the image width and height
            train[i][col + '_x'] /= w
            train[i][col + '_y'] /= h
        else:
            # keypoint does not exist for this image: place it at the origin
            train[i][col + '_x'] = 0
            train[i][col + '_y'] = 0

train_df = pd.DataFrame(train)
print(train_df.head())
train_df.to_csv(os.path.join('data', 'train','train_changed.csv'), index=False)  # save
