5  CONCLUSIONS 
In this paper, we proposed a cooperative learning method for
semantic segmentation in which the feature maps of the top
network are sent to the other network. We evaluated the
method with two kinds of CNNs and two connection methods,
and experiments on two datasets demonstrated its
effectiveness. Cooperative learning with the same-layer
connection gave good performance for both networks.
However, the improvement from the multiple-layer connection
was small for DANet, which has an attention mechanism. This
suggests that the suitable connection method depends on the
structure of the baseline network. We used two kinds of
connection in this paper, but many other connection methods
can be considered. This is a subject for future work.
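As an illustration only, the same-layer connection summarized above can be sketched with two toy networks. This is a minimal sketch under stated assumptions, not the implementation used in the experiments: the functions `layer`, `forward_top`, and `forward_other` are hypothetical names, each "layer" is a scalar scale followed by ReLU rather than a convolution, and the received feature map is fused by element-wise addition, which this section does not specify.

```python
# Hypothetical sketch of a same-layer connection between two toy networks.
# Assumption: the feature map received from the top network is fused by
# element-wise addition; the actual fusion operator is not stated here.

def layer(x, weight):
    """Toy stand-in for a CNN layer: scale each value, then apply ReLU."""
    return [max(0.0, weight * v) for v in x]

def forward_top(x, weights):
    """Run the top network and keep the feature map of every layer."""
    feats = []
    for w in weights:
        x = layer(x, w)
        feats.append(x)
    return feats

def forward_other(x, weights, top_feats):
    """Run the other network; layer i receives the top network's layer-i features."""
    for w, f in zip(weights, top_feats):
        x = layer(x, w)
        # Same-layer connection: fuse with the matching top-network feature map.
        x = [a + b for a, b in zip(x, f)]
    return x

top_feats = forward_top([1.0, -2.0], [0.5, 2.0])
out = forward_other([1.0, -2.0], [1.0, 1.0], top_feats)
```

A multiple-layer connection would change only which entries of `top_feats` are passed to each layer of the other network, which is why the choice of connection interacts with the baseline architecture.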
ACKNOWLEDGEMENTS 
This  work  is  partially  supported  by  MEXT/JSPS 
KAKENHI Grant Number 18K111382. 
REFERENCES 
Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet
classification with deep convolutional neural networks.
In: Advances in Neural Information Processing Systems.
pp. 1097–1105 (2012)
Szegedy,  C.,  Liu,  W.,  Jia,  Y.,  Sermanet,  P.,  Reed,  S., 
Anguelov, D., Erhan, D., Vanhoucke, V., Rabinovich, 
A.: Going deeper with convolutions. In: Proceedings of 
the IEEE conference on Computer Vision and Pattern 
Recognition. pp. 1–9 (2015)  
Redmon, J., Divvala, S., Girshick, R., Farhadi, A.: You only
look once: Unified, real-time object detection. In:
Proceedings of the IEEE Conference on Computer
Vision and Pattern Recognition. pp. 779–788 (2016)
Cao,  Z.,  Hidalgo,  G.,  Simon,  T.,  Wei,  S.E.,  Sheikh,  Y.: 
Openpose:  realtime  multi-person  2d  pose  estimation 
using  part  affinity  fields.  arXiv  preprint 
arXiv:1812.08008 (2018) 
Isola, P., Zhu, J.Y., Zhou, T., Efros, A.A.: Image-to-image 
translation  with  conditional  adversarial  networks.  In: 
Proceedings  of  the  IEEE  conference  on  Computer 
Vision and Pattern Recognition. pp. 1125–1134 (2017)  
Chen, L.C., Collins, M., Zhu, Y., Papandreou, G., Zoph, B., 
Schroff, F., Adam, H., Shlens, J.: Searching for efficient 
multi-scale architectures for dense image prediction. In: 
Advances  in  Neural  Information  Processing  Systems. 
pp. 8699–8710 (2018)  
Havaei,  M.,  Davy,  A.,  Warde-Farley,  D.,  Biard,  A., 
Courville,  A.,  Bengio,  Y.,  Pal,  C.,  Jodoin,  P.M., 
Larochelle,  H.:  Brain  tumor  segmentation  with  deep 
neural  networks.  Medical  image  analysis  35,  18–31 
(2017)  
Long,  J.,  Shelhamer,  E.,  Darrell,  T.:  Fully  convolutional 
networks for semantic segmentation. In: Proceedings of 
the IEEE Conference on Computer Vision and Pattern 
Recognition. pp. 3431–3440 (2015) 
Ding,  H.,  Jiang,  X.,  Shuai,  B.,  Qun  Liu,  A.,  Wang,  G.: 
Context  contrasted  feature  and  gated  multi-scale 
aggregation for scene segmentation. In: Proceedings of 
the IEEE Conference on Computer Vision and Pattern 
Recognition. pp. 2393–2402 (2018)  
Yang, M., Yu, K., Zhang, C., Li, Z., Yang, K.: Denseaspp 
for  semantic  segmentation  in  street  scenes.  In: 
Proceedings  of  the  IEEE  Conference  on  Computer 
Vision and Pattern Recognition. pp. 3684–3692 (2018)  
Peng, C., Zhang, X., Yu, G., Luo, G., Sun, J.: Large kernel
matters–improve semantic segmentation by global
convolutional network. In: Proceedings of the IEEE
Conference on Computer Vision and Pattern
Recognition. pp. 4353–4361 (2017)
Huang, Z., Wang, X., Huang, L., Huang, C., Wei, Y., Liu, 
W.:  Ccnet:  Criss-cross  attention  for  semantic 
segmentation. In: Proceedings of the IEEE International 
Conference on Computer Vision. pp. 603–612 (2019) 
Fu, J., Liu, J., Tian, H., Li, Y., Bao, Y., Fang, Z., Lu, H.: 
Dual  attention  network  for  scene  segmentation.  In: 
Proceedings  of  the  IEEE  Conference  on  Computer 
Vision and Pattern Recognition. pp. 3146–3154 (2019) 
Chen, L.C., Zhu, Y., Papandreou, G., Schroff, F., Adam, H.:
Encoder-decoder with atrous separable convolution for
semantic image segmentation. In: Proceedings of the
European Conference on Computer Vision.
pp. 801–818 (2018)
Zhang, H., Dana, K., Shi, J., Zhang, Z., Wang, X., Tyagi, 
A.,  Agrawal,  A.:  Context  encoding  for  semantic 
segmentation. In: Proceedings of the IEEE conference 
on  Computer  Vision  and  Pattern  Recognition.  pp. 
7151–7160 (2018) 
Cordts, M., Omran, M., Ramos, S., Rehfeld, T., Enzweiler, 
M., Benenson, R., Franke, U., Roth, S., Schiele, B.: The 
cityscapes  dataset  for  semantic  urban  scene 
understanding. In: Proceedings of the IEEE conference 
on  Computer  Vision  and  Pattern  Recognition.  pp. 
3213–3223 (2016) 
Everingham, M., Van Gool, L., Williams, C.K., Winn, J., 
Zisserman, A.:  The  pascal  visual  object classes  (voc) 
challenge.  International  journal  of  computer  vision 
88(2), 303–338 (2010)  
Badrinarayanan,  V.,  Kendall,  A.,  Cipolla,  R.:  Segnet:  A 
deep  convolutional  encoder-decoder  architecture  for 
image  segmentation.  IEEE  Transactions  on  Pattern 
Analysis and Machine Intelligence 39(12), 2481–2495 
(2017) 
Ronneberger,  O.,  Fischer,  P.,  Brox,  T.:  U-net: 
Convolutional  networks  for  biomedical  image 
segmentation. In: International Conference on Medical 
Image  Computing  and  Computer-Assisted 
Intervention. pp. 234–241. Springer (2015)  
Zhao, H., Shi, J., Qi, X., Wang, X., Jia, J.: Pyramid scene 
parsing  network.  In:  Proceedings  of  the  IEEE