cs-wywang
diff --git a/README.md b/README.md
Lines changed: 199 additions & 0 deletions
diff --git a/cuda_test.py b/cuda_test.py
Lines changed: 2 additions & 0 deletions
diff --git a/data_augmentation.py b/data_augmentation.py
Lines changed: 140 additions & 0 deletions
diff --git a/data_preprocess.py b/data_preprocess.py
Lines changed: 45 additions & 0 deletions
@@ -0,0 +1,199 @@
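+### Overview:
+
+This project shows how to use CPM (Convolutional Pose Machines) to localize clothing keypoints. It is an attempt at the Alibaba Cloud Tianchi competition "FashionAI Global Challenge: Clothing Keypoint Localization".
+
+### Principle:
+
+The input is an image and the output is the x and y coordinates of each keypoint, usually normalized to the range 0 to 1, so the task can be treated as a regression problem. Regressing the coordinates directly, however, leads to large errors. A better approach is to output a low-resolution heatmap that responds strongly at the keypoint location and weakly everywhere else.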
+
+![img](https://pic4.zhimg.com/v2-9ef0325b047a1e9357fcfe950bda39c7_r.jpg)
+
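+This project therefore uses the CPM model (CVPR 2016). Its core idea is to cascade multiple stages, each containing several CNN layers and each outputting heatmaps. Minimizing the difference between every stage's heatmaps and the ground truth yields progressively more accurate keypoint localization.
+
+An open-source implementation of CPM is available on GitHub ([https://github.com/timctho/convolutional-pose-machines-tensorflow](https://link.zhihu.com/?target=https%3A//github.com/timctho/convolutional-pose-machines-tensorflow)).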
+
+![img](https://pic3.zhimg.com/v2-d5f9761daca1eb6fbd978afa46e69a56_r.jpg)
+
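+### Data:
+
+The data comes from the Tianchi FashionAI Global Challenge: [FashionAI clothing keypoint localization dataset, Alibaba Cloud Tianchi (aliyun.com)](https://tianchi.aliyun.com/dataset/136923)
+
+The training set for the clothing keypoint localization task contains more than 70,000 images, and the test set contains more than 50,000.
+
+Each image is labeled with one of five clothing categories: blouse, outwear, dress, skirt, and trousers.
+
+![img](https://pic2.zhimg.com/v2-a634ec65d864383dc125118c31ac75a9_r.jpg)
+
+The training set also provides annotations for 24 keypoints per image, each with an x coordinate, a y coordinate, and a visibility flag. Not every clothing category has all 24 keypoints; see the [FashionAI clothing keypoint localization dataset page (aliyun.com)](https://tianchi.aliyun.com/dataset/136923) for details.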
+
+
+
+### Installation:
+
+#### Training environment:
+
+
+
+|       Component       | Version             |
+| :-------------------: | :------------------ |
+|        Ubuntu         | 22.04               |
+|          GPU          | 2080 Ti-11G         |
+|         cuda          | 10.0.130            |
+|         cudnn         | 7.6.5.32-1+cuda10.0 |
+|        python         | 3.7.10              |
+|    tensorflow-gpu     | 1.15.5              |
+|         numpy         | 1.18.5              |
+|        pandas         | 1.2.4               |
+|     scikit-learn      | 0.24.2              |
+| opencv-contrib-python | 4.5.1.48            |
+|      matplotlib       | 3.4.1               |
+|        imageio        | 2.15.0              |
+|         tqdm          | 4.64.1              |
+
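+##### Note:
+
+**tensorflow-gpu 1.x does not support RTX 30-series GPUs. On a 30-series card the training and test losses become NaN, and even an already trained model produces inaccurate keypoints at test time. The code therefore runs testing on the CPU by default. To use a GPU, a 20-series card is recommended (verified to work).**
+
+If you are not yet familiar with **tensorflow-gpu**, **CUDA**, and **cuDNN**, you can follow [Detailed download and installation guide for TensorFlow, CUDA, and cuDNN (CSDN blog)](https://blog.csdn.net/weixin_45956028/article/details/119419463) to install them.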
+
+
+
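+### Training:
+
+Pretrained models are provided; training them took two days in total on a 2080 Ti GPU. To train and tune the model yourself, run the following in the code directory:
+
+`python train.py`
+
+To train on a different clothing category, edit the following part of train.py (shown here configured for the skirt category):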
+
+```python
+#train = train[train.image_category == 'dress']  # use dress as an example to check that the code runs
+#train = train[train.image_category == 'blouse']
+#train = train[train.image_category == 'outwear']
+#train = train[train.image_category == 'trousers']
+train = train[train.image_category == 'skirt']
+```
+
+Edit the code above to select the clothing category to train on from the training set.
+
+```python
+#dress
+'''features = [
+    'neckline_left', 'neckline_right', 'center_front', 'shoulder_left', 'shoulder_right',
+    'armpit_left', 'armpit_right', 'waistline_left', 'waistline_right',
+    'cuff_left_in', 'cuff_left_out', 'cuff_right_in', 'cuff_right_out', 'hemline_left', 'hemline_right']#15'''
+#blouse
+'''features = [
+    'neckline_left', 'neckline_right', 'center_front', 'shoulder_left', 'shoulder_right',
+    'armpit_left', 'armpit_right', 'cuff_left_in', 'cuff_left_out', 'cuff_right_in',
+    'cuff_right_out','top_hem_left','top_hem_right']#13'''
+#outwear
+'''features = [
+    'neckline_left', 'neckline_right', 'shoulder_left', 'shoulder_right','armpit_left',
+    'armpit_right', 'waistline_left', 'waistline_right','cuff_left_in','cuff_left_out', 
+    'cuff_right_in', 'cuff_right_out','top_hem_left','top_hem_right']#14'''
+#trousers
+'''features = [
+    'waistband_left','waistband_right','crotch','bottom_left_in','bottom_left_out',
+    'bottom_right_in','bottom_right_out']#7'''
+#skirt
+features = [
+    'waistband_left','waistband_right','hemline_left', 'hemline_right']#4
+```
+
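+Edit the code above to select the feature set that matches the chosen category. The number after each list is the number of keypoints that category has.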
+
+```python
+OUTPUT_DIR ='skirt'
+```
+
+This sets the directory where the trained model and the sample images produced during training are saved.
+
+![image-20240401120546511](C:\Users\风起\AppData\Roaming\Typora\typora-user-images\image-20240401120546511.png)
+
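+The figure above shows one training result for skirt. The three images in the first row are the combined response maps of stages 1, 2, and 3. The three images in the second row are the combined response map of stage 6, the ground truth, and the ground truth overlaid on the original image, which shows that the keypoints are captured accurately.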
+
+
+
+### Pretrained models:
+
+Pretrained models and sample images generated from the dataset above are stored in the five folders blouse, skirt, trousers, outwear, and dress. Use whichever model you need.
+
+
+
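+### Testing:
+
+To view the keypoint detection results, run the following in the code directory:
+
+`python test.py`
+
+This generates keypoint predictions for 16 random images from the test set, saves them in the folder for **the clothing category being tested**, and prints the keypoint coordinates of the first 5 images.
+
+To change the clothing category being tested, edit the following code in test.py (shown here for the dress data):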
+
+```python
+test = test[test.image_category == 'dress']
+#test = test[test.image_category == 'blouse']
+#test = test[test.image_category == 'outwear']
+#test = test[test.image_category == 'trousers']
+#test = test[test.image_category == 'skirt']
+```
+
+The code above selects the clothing data to test.
+
+```python
+OUTPUT_DIR = 'dress'
+```
+
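+Set **OUTPUT_DIR** to the directory that holds the trained model. The model is loaded from this directory and the test results are saved there as well.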
+
+```python
+y_dim = 15  # adjust to match the data the model was trained on
+```
+
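+Set **y_dim** to the number of keypoints for the category, i.e. the number after the corresponding feature list in **train.py**.
+
+**Note: for convenience, the test code is also wrapped as a function in test_def.py.**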
+
+
+
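+### What each file does:
+
+- the data folder holds the training and test data;
+
+- the five folders blouse, skirt, trousers, outwear, and dress hold the corresponding pretrained models and sample images from training;
+
+- cuda_test.py checks whether TensorFlow can use the GPU;
+
+- data_augmentation.py performs data augmentation on the training images (random rotation, random translation, random horizontal flip) to make the model more robust;
+
+- data_preprocess.py preprocesses the dataset downloaded from the Tianchi competition;
+
+- test.py is the test script;
+
+- test_def.py is the test code wrapped as a function;
+
+- train.py trains on the dataset and produces the trained model;
+
+- utils_key.py collects the external dependencies that all the scripts import.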
+
+  
+
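+#### Usage notes:
+
+- Linux and Windows use different path separators ( / and \ ), so on a different operating system it is recommended to run data_preprocess.py first to generate a train_changed.csv that matches that system. The uploaded file is the preprocessed table generated on Linux.
+
+- The original plan was to train on all keypoints together, but GPU memory was insufficient, so each clothing category is trained separately. For training, first filter train_changed.csv by clothing category, then select the matching feature set in train.py and create the corresponding output directory.
+
+- For testing, likewise filter test.csv by clothing category, set the dimension of the keypoint array (y_dim) to the size of that category's feature set, and select the matching model directory and output directory.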
+
+  
+
+### References:
+
+- Convolutional Pose Machines: [https://arxiv.org/abs/1602.00134](https://link.zhihu.com/?target=https%3A//arxiv.org/abs/1602.00134)
+- Code repository for Convolutional Pose Machines: [https://github.com/shihenw/convolutional-pose-machines-release](https://link.zhihu.com/?target=https%3A//github.com/shihenw/convolutional-pose-machines-release)
+- A small attempt at the Tianchi FashionAI Global Challenge: [https://zhuanlan.zhihu.com/p/34928763](https://zhuanlan.zhihu.com/p/34928763)
+- Clothing keypoint localization: [27 Clothing Keypoint Localization, Zhihu (zhihu.com)](https://zhuanlan.zhihu.com/p/44188417)
+- FashionAI clothing keypoint localization dataset: https://tianchi.aliyun.com/dataset/136923
+- TensorFlow, CUDA, and cuDNN installation: [Detailed download and installation guide for TensorFlow, CUDA, and cuDNN (CSDN blog)](https://blog.csdn.net/weixin_45956028/article/details/119419463)
+
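The README's training and testing sections ask the user to comment and uncomment lines in train.py and test.py, and to keep `features`, `y_dim`, and `OUTPUT_DIR` consistent by hand. One possible alternative is to collect the per-category feature lists in a single dictionary and derive the other values from the category name. The sketch below is not code from the repository: `CATEGORY_FEATURES` and `config_for` are hypothetical names, the lists are copied from the README, and it assumes each output directory is named after its category, as in the README's examples.

```python
# Hypothetical helper, not part of the repository.
# Keypoint names per clothing category, copied from the README's feature lists.
CATEGORY_FEATURES = {
    'dress': [
        'neckline_left', 'neckline_right', 'center_front', 'shoulder_left', 'shoulder_right',
        'armpit_left', 'armpit_right', 'waistline_left', 'waistline_right',
        'cuff_left_in', 'cuff_left_out', 'cuff_right_in', 'cuff_right_out',
        'hemline_left', 'hemline_right'],
    'blouse': [
        'neckline_left', 'neckline_right', 'center_front', 'shoulder_left', 'shoulder_right',
        'armpit_left', 'armpit_right', 'cuff_left_in', 'cuff_left_out', 'cuff_right_in',
        'cuff_right_out', 'top_hem_left', 'top_hem_right'],
    'outwear': [
        'neckline_left', 'neckline_right', 'shoulder_left', 'shoulder_right', 'armpit_left',
        'armpit_right', 'waistline_left', 'waistline_right', 'cuff_left_in', 'cuff_left_out',
        'cuff_right_in', 'cuff_right_out', 'top_hem_left', 'top_hem_right'],
    'trousers': [
        'waistband_left', 'waistband_right', 'crotch', 'bottom_left_in', 'bottom_left_out',
        'bottom_right_in', 'bottom_right_out'],
    'skirt': [
        'waistband_left', 'waistband_right', 'hemline_left', 'hemline_right'],
}


def config_for(category):
    """Return (features, y_dim, output_dir) for one clothing category."""
    if category not in CATEGORY_FEATURES:
        raise ValueError('unknown clothing category: %r' % (category,))
    features = CATEGORY_FEATURES[category]
    # Assumption: the output directory shares the category name.
    return features, len(features), category


features, y_dim, output_dir = config_for('skirt')
print(y_dim, output_dir)
```

With a lookup like this, switching categories would mean changing one string instead of editing three separate places.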
@@ -0,0 +1,2 @@
+from tensorflow.python.client import device_lib
+print(device_lib.list_local_devices())   # if a GPU device is listed, TensorFlow can use the GPU
@@ -0,0 +1,140 @@
+from utils_key import *
+
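+# Augment the training images with random rotation, random translation, and random horizontal flip
+# while keeping the keypoints aligned, so that the CPM model generalizes to more conditions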
+def transform(X_batch, Y_batch):
+    X_data = []
+    Y_data = []
+
+    offset = 20
+    for i in range(X_batch.shape[0]):
+        img = X_batch[i]
+        # random rotation
+        degree = int(np.random.random() * offset - offset / 2)
+        rad = degree / 180 * np.pi
+        mat = cv2.getRotationMatrix2D((img_size / 2, img_size / 2), degree, 1)
+        img_ = cv2.warpAffine(img, mat, (img_size, img_size), borderValue=(255, 255, 255))
+        # random translation
+        x0 = int(np.random.random() * offset - offset / 2)
+        y0 = int(np.random.random() * offset - offset / 2)
+        mat = np.float32([[1, 0, x0], [0, 1, y0]])
+        img_ = cv2.warpAffine(img_, mat, (img_size, img_size), borderValue=(255, 255, 255))
+        # random flip
+        if np.random.random() > 0.5:
+            img_ = np.fliplr(img_)
+            flip = True
+        else:
+            flip = False
+
+        X_data.append(img_)
+
+        points = []
+        for j in range(y_dim):
+            x = Y_batch[i, j, 0] * img_size
+            y = Y_batch[i, j, 1] * img_size
+            # random rotation
+            dx = x - img_size / 2
+            dy = y - img_size / 2
+            x = int(dx * np.cos(rad) + dy * np.sin(rad) + img_size / 2)
+            y = int(-dx * np.sin(rad) + dy * np.cos(rad) + img_size / 2)
+            # random translation
+            x += x0
+            y += y0
+
+            x = x / img_size
+            y = y / img_size
+            points.append([x, y])
+        # random flip
+        if flip:
+            data = {features[j]: points[j] for j in range(y_dim)}
+            points = []
+            for j in range(y_dim):
+                col = features[j]
+                if col.find('left') >= 0:
+                    col = col.replace('left', 'right')
+                elif col.find('right') >= 0:
+                    col = col.replace('right', 'left')
+                [x, y] = data[col]
+                x = 1 - x
+                points.append([x, y])
+
+        Y_data.append(points)
+
+    X_data = np.array(X_data)
+    Y_data = np.array(Y_data)
+
+    # preprocess
+    X_data = (X_data / 255. - 0.5) * 2
+    Y_heatmap = []
+    for i in range(Y_data.shape[0]):
+        heatmaps = []
+        invert_heatmap = np.ones((heatmap_size, heatmap_size))
+        for j in range(Y_data.shape[1]):
+            x0 = int(Y_data[i, j, 0] * heatmap_size)
+            y0 = int(Y_data[i, j, 1] * heatmap_size)
+            x = np.arange(0, heatmap_size, 1, float)
+            y = x[:, np.newaxis]
+            cur_heatmap = np.exp(-((x - x0) ** 2 + (y - y0) ** 2) / (2.0 * 1.0 ** 2))
+            heatmaps.append(cur_heatmap)
+            invert_heatmap -= cur_heatmap
+        heatmaps.append(invert_heatmap)
+        Y_heatmap.append(heatmaps)
+    Y_heatmap = np.array(Y_heatmap)
+    Y_heatmap = np.transpose(Y_heatmap, (0, 2, 3, 1))  # batch_size, heatmap_size, heatmap_size, y_dim + 1
+
+    return X_data, Y_data, Y_heatmap
+
+# load the data
+train = pd.read_csv(os.path.join('data', 'train', 'train_changed.csv'))
+train = train[train.image_category == 'dress']  # use dress as an example to check that the code runs
+train = train.to_dict('records')
+features = [
+    'neckline_left', 'neckline_right', 'center_front', 'shoulder_left', 'shoulder_right',
+    'armpit_left', 'armpit_right', 'waistline_left', 'waistline_right','cuff_left_in',
+    'cuff_left_out', 'cuff_right_in', 'cuff_right_out','top_hem_left','top_hem_right',
+    'waistband_left','waistband_right','hemline_left', 'hemline_right','crotch',
+    'bottom_left_in','bottom_left_out','bottom_right_in','bottom_right_out']
+
+img_size = 256
+batch_size = 16
+heatmap_size = 32
+stages = 6
+# collect the images and keypoints before splitting into training and validation sets
+X_train = []
+Y_train = []
+for i in tqdm(range(len(train))):
+    record = train[i]
+    img = imread(record['image_id'])
+    img = cv2.resize(img, (img_size, img_size))
+
+    y = []
+    for col in features:
+        y.append([record[col + '_x'], record[col + '_y']])
+
+    X_train.append(img)
+    Y_train.append(y)
+
+X_train = np.array(X_train)
+Y_train = np.array(Y_train)
+
+# split into training and validation sets; the validation set is 10%
+X_train, X_valid, Y_train, Y_valid = train_test_split(X_train, Y_train, test_size=0.1)
+
+y_dim = Y_train.shape[1]
+
+# visualize how well the keypoints line up on the augmented training images
+X_batch = X_train[:batch_size]
+Y_batch = Y_train[:batch_size]
+X_data, Y_data, Y_heatmap = transform(X_batch, Y_batch)
+
+n = int(np.sqrt(batch_size))
+puzzle = np.ones((img_size * n, img_size * n, 3))
+for i in range(batch_size):
+    img = (X_data[i] + 1) / 2
+    for j in range(y_dim):
+        cv2.circle(img, (int(img_size * Y_data[i, j, 0]), int(img_size * Y_data[i, j, 1])), 3, (120, 240, 120), 2)
+    r = i // n
+    c = i % n
+    puzzle[r * img_size: (r + 1) * img_size, c * img_size: (c + 1) * img_size, :] = img
+plt.figure(figsize=(12, 12))
+plt.imshow(puzzle)
+plt.show()
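The heatmap targets built at the end of `transform` follow the idea described in the README: each keypoint becomes a low-resolution Gaussian bump, and a coordinate can be read back from the location of the strongest response. The sketch below illustrates that round trip for a single keypoint. It uses only the standard library so that it runs without NumPy. `make_heatmap` mirrors the Gaussian in `transform` (sigma of 1.0, 32 x 32 grid), while `decode_heatmap` is a hypothetical argmax decoder and is not taken from the repository's test code.

```python
import math


def make_heatmap(x_norm, y_norm, size=32, sigma=1.0):
    """Build a size x size Gaussian heatmap peaked at a normalized keypoint.

    Rows index y and columns index x, matching the layout used in transform().
    """
    x0 = int(x_norm * size)
    y0 = int(y_norm * size)
    return [[math.exp(-((x - x0) ** 2 + (y - y0) ** 2) / (2.0 * sigma ** 2))
             for x in range(size)]
            for y in range(size)]


def decode_heatmap(heatmap):
    """Return the normalized (x, y) of the strongest response (hypothetical decoder)."""
    size = len(heatmap)
    _, best_x, best_y = max(
        (value, x, y)
        for y, row in enumerate(heatmap)
        for x, value in enumerate(row))
    return best_x / size, best_y / size


heatmap = make_heatmap(0.5, 0.25)
print(decode_heatmap(heatmap))  # (0.5, 0.25)
```

Because the peak is snapped to an integer cell, the decoded coordinate is only accurate to one heatmap cell (1/32 of the image here). A normalized coordinate of exactly 1.0 falls just outside the grid, both in this sketch and in `transform`.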
@@ -0,0 +1,45 @@
+from utils_key import *
+
+train = pd.read_csv(os.path.join('data', 'train', 'train.csv'))
+#print(len(train))
+#print(train.head())
+train['image_id'] = train['image_id'].apply(lambda x:os.path.join('data','train', x))  # prepend the training image directory
+#print(train.head())
+#train.to_csv(os.path.join('data', 'train','train_changed.csv'), index=False)  # save
+
+# split each keypoint column, stored as a single 'x_y_visibility' triple, into three scalar columns
+columns = train.columns
+for col in columns:
+    if col in ['image_id', 'image_category']:
+        continue
+    train[col + '_x'] = train[col].apply(lambda x:float(x.split('_')[0]))
+    train[col + '_y'] = train[col].apply(lambda x:float(x.split('_')[1]))
+    train[col + '_s'] = train[col].apply(lambda x:float(x.split('_')[2]))
+    train.drop([col], axis=1, inplace=True)
+#print(train.head())
+
+# normalize the coordinates for all categories to get each keypoint's relative position in the image
+features = [
+    'neckline_left', 'neckline_right', 'center_front', 'shoulder_left', 'shoulder_right',
+    'armpit_left', 'armpit_right', 'waistline_left', 'waistline_right','cuff_left_in',
+    'cuff_left_out', 'cuff_right_in', 'cuff_right_out','top_hem_left','top_hem_right',
+    'waistband_left','waistband_right','hemline_left', 'hemline_right','crotch',
+    'bottom_left_in','bottom_left_out','bottom_right_in','bottom_right_out']
+
+train = train.to_dict('records')
+for i in tqdm(range(len(train))):
+    record = train[i]
+    img = imread(record['image_id'])
+    h = img.shape[0]
+    w = img.shape[1]
+    for col in features:
+        if record[col + '_s'] >= 0:
+            train[i][col + '_x'] /= w
+            train[i][col + '_y'] /= h
+        else:
+            train[i][col + '_x'] = 0
+            train[i][col + '_y'] = 0
+
+train_df = pd.DataFrame(train)
+print(train_df.head())
+train_df.to_csv(os.path.join('data', 'train','train_changed.csv'), index=False)  # save
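The preprocessing above does two things to every keypoint column: it splits the `x_y_visibility` string into three numbers, and it divides the pixel coordinates by the image width and height unless the visibility value is negative, in which case the coordinates are set to 0. The sketch below shows the same logic for one value. `parse_keypoint` is a hypothetical helper, not a function in the repository, and the sample strings and the 512 x 512 image size are made up for illustration.

```python
def parse_keypoint(raw, width, height):
    """Parse an 'x_y_visibility' string and normalize the coordinates to [0, 1].

    A negative visibility value marks a keypoint that does not exist for this
    garment, so its coordinates are zeroed, as in data_preprocess.py.
    """
    x_str, y_str, s_str = raw.split('_')
    x, y, s = float(x_str), float(y_str), float(s_str)
    if s < 0:
        return 0.0, 0.0, s
    return x / width, y / height, s


# Made-up sample values for illustration.
print(parse_keypoint('256_128_1', 512, 512))  # (0.5, 0.25, 1.0)
print(parse_keypoint('-1_-1_-1', 512, 512))   # (0.0, 0.0, -1.0)
```

Normalizing by each image's own width and height is what lets `data_augmentation.py` later multiply by `img_size` after resizing every image to the same square.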