first commit

admin
2026-05-20 15:05:35 +08:00
commit ac09b26253
2048 changed files with 189478 additions and 0 deletions


@@ -0,0 +1,53 @@
# BiSeNetV2
> [Bisenet v2: Bilateral Network with Guided Aggregation for Real-time Semantic Segmentation](https://arxiv.org/abs/2004.02147)
## Introduction
<!-- [ALGORITHM] -->
<a href="">Official Repo</a>
<a href="https://github.com/open-mmlab/mmsegmentation/blob/v0.18.0/mmseg/models/backbones/bisenetv2.py#L545">Code Snippet</a>
## Abstract
<!-- [ABSTRACT] -->
The low-level details and high-level semantics are both essential to the semantic segmentation task. However, to speed up the model inference, current approaches almost always sacrifice the low-level details, which leads to a considerable accuracy decrease. We propose to treat these spatial details and categorical semantics separately to achieve high accuracy and high efficiency for realtime semantic segmentation. To this end, we propose an efficient and effective architecture with a good trade-off between speed and accuracy, termed Bilateral Segmentation Network (BiSeNet V2). This architecture involves: (i) a Detail Branch, with wide channels and shallow layers to capture low-level details and generate high-resolution feature representation; (ii) a Semantic Branch, with narrow channels and deep layers to obtain high-level semantic context. The Semantic Branch is lightweight due to reducing the channel capacity and a fast-downsampling strategy. Furthermore, we design a Guided Aggregation Layer to enhance mutual connections and fuse both types of feature representation. Besides, a booster training strategy is designed to improve the segmentation performance without any extra inference cost. Extensive quantitative and qualitative evaluations demonstrate that the proposed architecture performs favourably against a few state-of-the-art real-time semantic segmentation approaches. Specifically, for a 2,048x1,024 input, we achieve 72.6% Mean IoU on the Cityscapes test set with a speed of 156 FPS on one NVIDIA GeForce GTX 1080 Ti card, which is significantly faster than existing methods, yet we achieve better segmentation accuracy.
<!-- [IMAGE] -->
<div align=center>
<img src="https://user-images.githubusercontent.com/24582831/142898966-ec4a81da-b4b0-41ee-b083-1d964582c18a.png" width="70%"/>
</div>

## Results and models
### Cityscapes
| Method | Backbone | Crop Size | Lr schd | Mem (GB) | Inf time (fps) | Device | mIoU | mIoU(ms+flip) | config | download |
| --------- | ---------------- | --------- | ------: | -------- | -------------- | ------ | ----: | ------------: | --------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| BiSeNetV2 | BiSeNetV2 | 1024x1024 | 160000 | 7.64 | 31.77 | V100 | 73.21 | 75.74 | [config](https://github.com/open-mmlab/mmsegmentation/blob/main/configs/bisenetv2/bisenetv2_fcn_4xb4-160k_cityscapes-1024x1024.py) | [model](https://download.openmmlab.com/mmsegmentation/v0.5/bisenetv2/bisenetv2_fcn_4x4_1024x1024_160k_cityscapes/bisenetv2_fcn_4x4_1024x1024_160k_cityscapes_20210902_015551-bcf10f09.pth) \| [log](https://download.openmmlab.com/mmsegmentation/v0.5/bisenetv2/bisenetv2_fcn_4x4_1024x1024_160k_cityscapes/bisenetv2_fcn_4x4_1024x1024_160k_cityscapes_20210902_015551.log.json) |
| BiSeNetV2 | BiSeNetV2 (OHEM) | 1024x1024 | 160000 | 7.64 | - | V100 | 73.57 | 75.80 | [config](https://github.com/open-mmlab/mmsegmentation/blob/main/configs/bisenetv2/bisenetv2_fcn_4xb4-ohem-160k_cityscapes-1024x1024.py) | [model](https://download.openmmlab.com/mmsegmentation/v0.5/bisenetv2/bisenetv2_fcn_ohem_4x4_1024x1024_160k_cityscapes/bisenetv2_fcn_ohem_4x4_1024x1024_160k_cityscapes_20210902_112947-5f8103b4.pth) \| [log](https://download.openmmlab.com/mmsegmentation/v0.5/bisenetv2/bisenetv2_fcn_ohem_4x4_1024x1024_160k_cityscapes/bisenetv2_fcn_ohem_4x4_1024x1024_160k_cityscapes_20210902_112947.log.json) |
| BiSeNetV2 | BiSeNetV2 (4x8) | 1024x1024 | 160000 | 15.05 | - | V100 | 75.76 | 77.79 | [config](https://github.com/open-mmlab/mmsegmentation/blob/main/configs/bisenetv2/bisenetv2_fcn_4xb8-160k_cityscapes-1024x1024.py) | [model](https://download.openmmlab.com/mmsegmentation/v0.5/bisenetv2/bisenetv2_fcn_4x8_1024x1024_160k_cityscapes/bisenetv2_fcn_4x8_1024x1024_160k_cityscapes_20210903_000032-e1a2eed6.pth) \| [log](https://download.openmmlab.com/mmsegmentation/v0.5/bisenetv2/bisenetv2_fcn_4x8_1024x1024_160k_cityscapes/bisenetv2_fcn_4x8_1024x1024_160k_cityscapes_20210903_000032.log.json) |
| BiSeNetV2 | BiSeNetV2 (FP16) | 1024x1024 | 160000 | 5.77 | 36.65 | V100 | 73.07 | 75.13 | [config](https://github.com/open-mmlab/mmsegmentation/blob/main/configs/bisenetv2/bisenetv2_fcn_4xb4-amp-160k_cityscapes-1024x1024.py) | [model](https://download.openmmlab.com/mmsegmentation/v0.5/bisenetv2/bisenetv2_fcn_fp16_4x4_1024x1024_160k_cityscapes/bisenetv2_fcn_fp16_4x4_1024x1024_160k_cityscapes_20210902_045942-b979777b.pth) \| [log](https://download.openmmlab.com/mmsegmentation/v0.5/bisenetv2/bisenetv2_fcn_fp16_4x4_1024x1024_160k_cityscapes/bisenetv2_fcn_fp16_4x4_1024x1024_160k_cityscapes_20210902_045942.log.json) |

Note:
- `OHEM` means Online Hard Example Mining was used in training.
- `FP16` means mixed-precision (FP16) training was used.
- `4x8` means 4 GPUs with 8 samples per GPU in training.
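
The `OHEM` variant configures a pixel sampler with `thresh=0.7` and `min_kept=10000`. As a rough illustration of what such a sampler does, the sketch below keeps pixels whose predicted probability for the ground-truth class falls below a threshold, and raises that threshold when fewer than `min_kept` pixels would otherwise be kept. It is a simplified, pure-Python approximation under those assumptions, not the `OHEMPixelSampler` implementation itself; the function name `ohem_select` is hypothetical.

```python
def ohem_select(true_class_probs, thresh=0.7, min_kept=10000):
    """Return a 0/1 weight per pixel marking the hard pixels to train on.

    true_class_probs: predicted probability of the ground-truth class for
    each valid (non-ignored) pixel, given as a flat list of floats.
    """
    if not true_class_probs:
        return []
    ordered = sorted(true_class_probs)
    # Probability at position `min_kept` in ascending order
    # (clamped to the last index when there are fewer pixels).
    min_threshold = ordered[min(min_kept, len(ordered) - 1)]
    # Never cut below `thresh`; raise the cut-off if too few pixels qualify.
    threshold = max(min_threshold, thresh)
    return [1.0 if p < threshold else 0.0 for p in true_class_probs]


if __name__ == "__main__":
    probs = [0.95, 0.2, 0.6, 0.99, 0.8]
    # With min_kept=1 the 0.7 threshold alone decides: only 0.2 and 0.6 are hard.
    print(ohem_select(probs, thresh=0.7, min_kept=1))  # [0.0, 1.0, 1.0, 0.0, 0.0]
```

Confident pixels receive weight 0 and drop out of the loss, so gradient updates concentrate on the pixels the model currently gets wrong or is unsure about.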
## Citation
```bibtex
@article{yu2021bisenet,
title={Bisenet v2: Bilateral network with guided aggregation for real-time semantic segmentation},
author={Yu, Changqian and Gao, Changxin and Wang, Jingbo and Yu, Gang and Shen, Chunhua and Sang, Nong},
journal={International Journal of Computer Vision},
pages={1--18},
year={2021},
publisher={Springer}
}
```


@@ -0,0 +1,24 @@
_base_ = [
'../_base_/models/bisenetv2.py',
'../_base_/datasets/cityscapes_1024x1024.py',
'../_base_/default_runtime.py', '../_base_/schedules/schedule_160k.py'
]
crop_size = (1024, 1024)
data_preprocessor = dict(size=crop_size)
model = dict(data_preprocessor=data_preprocessor)
param_scheduler = [
dict(type='LinearLR', by_epoch=False, start_factor=0.1, begin=0, end=1000),
dict(
type='PolyLR',
eta_min=1e-4,
power=0.9,
begin=1000,
end=160000,
by_epoch=False,
)
]
optimizer = dict(type='SGD', _delete_=True, lr=0.05, momentum=0.9, weight_decay=0.0005)
optim_wrapper = dict(type='OptimWrapper', optimizer=optimizer)
train_dataloader = dict(batch_size=4, num_workers=4)
val_dataloader = dict(batch_size=1, num_workers=4)
test_dataloader = val_dataloader


@@ -0,0 +1,6 @@
_base_ = './bisenetv2_fcn_4xb4-160k_cityscapes-1024x1024.py'
optim_wrapper = dict(
_delete_=True,
type='AmpOptimWrapper',
optimizer=dict(type='SGD', lr=0.05, momentum=0.9, weight_decay=0.0005),
loss_scale=512.)


@@ -0,0 +1,83 @@
_base_ = [
'../_base_/models/bisenetv2.py',
'../_base_/datasets/cityscapes_1024x1024.py',
'../_base_/default_runtime.py', '../_base_/schedules/schedule_160k.py'
]
crop_size = (1024, 1024)
data_preprocessor = dict(size=crop_size)
norm_cfg = dict(type='SyncBN', requires_grad=True)
model = dict(
data_preprocessor=data_preprocessor,
decode_head=dict(
sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000)),
auxiliary_head=[
dict(
type='FCNHead',
in_channels=16,
channels=16,
num_convs=2,
num_classes=19,
in_index=1,
norm_cfg=norm_cfg,
concat_input=False,
align_corners=False,
sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000),
loss_decode=dict(
type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
dict(
type='FCNHead',
in_channels=32,
channels=64,
num_convs=2,
num_classes=19,
in_index=2,
norm_cfg=norm_cfg,
concat_input=False,
align_corners=False,
sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000),
loss_decode=dict(
type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
dict(
type='FCNHead',
in_channels=64,
channels=256,
num_convs=2,
num_classes=19,
in_index=3,
norm_cfg=norm_cfg,
concat_input=False,
align_corners=False,
sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000),
loss_decode=dict(
type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
dict(
type='FCNHead',
in_channels=128,
channels=1024,
num_convs=2,
num_classes=19,
in_index=4,
norm_cfg=norm_cfg,
concat_input=False,
align_corners=False,
sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000),
loss_decode=dict(
type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
],
)
param_scheduler = [
dict(type='LinearLR', by_epoch=False, start_factor=0.1, begin=0, end=1000),
dict(
type='PolyLR',
eta_min=1e-4,
power=0.9,
begin=1000,
end=160000,
by_epoch=False,
)
]
optimizer = dict(type='SGD', lr=0.05, momentum=0.9, weight_decay=0.0005)
optim_wrapper = dict(type='OptimWrapper', optimizer=optimizer)
train_dataloader = dict(batch_size=4, num_workers=4)
val_dataloader = dict(batch_size=1, num_workers=4)
test_dataloader = val_dataloader


@@ -0,0 +1,24 @@
_base_ = [
'../_base_/models/bisenetv2.py',
'../_base_/datasets/cityscapes_1024x1024.py',
'../_base_/default_runtime.py', '../_base_/schedules/schedule_160k.py'
]
crop_size = (1024, 1024)
data_preprocessor = dict(size=crop_size)
model = dict(data_preprocessor=data_preprocessor)
param_scheduler = [
dict(type='LinearLR', by_epoch=False, start_factor=0.1, begin=0, end=1000),
dict(
type='PolyLR',
eta_min=1e-4,
power=0.9,
begin=1000,
end=160000,
by_epoch=False,
)
]
optimizer = dict(type='SGD', lr=0.05, momentum=0.9, weight_decay=0.0005)
optim_wrapper = dict(type='OptimWrapper', optimizer=optimizer)
train_dataloader = dict(batch_size=8, num_workers=4)
val_dataloader = dict(batch_size=1, num_workers=4)
test_dataloader = val_dataloader


@@ -0,0 +1,114 @@
Collections:
- Name: BiSeNetV2
  License: Apache License 2.0
  Metadata:
    Training Data:
    - Cityscapes
  Paper:
    Title: 'Bisenet v2: Bilateral Network with Guided Aggregation for Real-time Semantic
      Segmentation'
    URL: https://arxiv.org/abs/2004.02147
  README: configs/bisenetv2/README.md
  Frameworks:
  - PyTorch
Models:
- Name: bisenetv2_fcn_4xb4-160k_cityscapes-1024x1024
  In Collection: BiSeNetV2
  Results:
    Task: Semantic Segmentation
    Dataset: Cityscapes
    Metrics:
      mIoU: 73.21
      mIoU(ms+flip): 75.74
  Config: configs/bisenetv2/bisenetv2_fcn_4xb4-160k_cityscapes-1024x1024.py
  Metadata:
    Training Data: Cityscapes
    Batch Size: 16
    Architecture:
    - BiSeNetV2
    - BiSeNetV2
    Training Resources: 4x V100 GPUS
    Memory (GB): 7.64
  Weights: https://download.openmmlab.com/mmsegmentation/v0.5/bisenetv2/bisenetv2_fcn_4x4_1024x1024_160k_cityscapes/bisenetv2_fcn_4x4_1024x1024_160k_cityscapes_20210902_015551-bcf10f09.pth
  Training log: https://download.openmmlab.com/mmsegmentation/v0.5/bisenetv2/bisenetv2_fcn_4x4_1024x1024_160k_cityscapes/bisenetv2_fcn_4x4_1024x1024_160k_cityscapes_20210902_015551.log.json
  Paper:
    Title: 'Bisenet v2: Bilateral Network with Guided Aggregation for Real-time Semantic
      Segmentation'
    URL: https://arxiv.org/abs/2004.02147
  Code: https://github.com/open-mmlab/mmsegmentation/blob/v0.18.0/mmseg/models/backbones/bisenetv2.py#L545
  Framework: PyTorch
- Name: bisenetv2_fcn_4xb4-ohem-160k_cityscapes-1024x1024
  In Collection: BiSeNetV2
  Results:
    Task: Semantic Segmentation
    Dataset: Cityscapes
    Metrics:
      mIoU: 73.57
      mIoU(ms+flip): 75.8
  Config: configs/bisenetv2/bisenetv2_fcn_4xb4-ohem-160k_cityscapes-1024x1024.py
  Metadata:
    Training Data: Cityscapes
    Batch Size: 16
    Architecture:
    - BiSeNetV2
    - BiSeNetV2
    Training Resources: 4x V100 GPUS
    Memory (GB): 7.64
  Weights: https://download.openmmlab.com/mmsegmentation/v0.5/bisenetv2/bisenetv2_fcn_ohem_4x4_1024x1024_160k_cityscapes/bisenetv2_fcn_ohem_4x4_1024x1024_160k_cityscapes_20210902_112947-5f8103b4.pth
  Training log: https://download.openmmlab.com/mmsegmentation/v0.5/bisenetv2/bisenetv2_fcn_ohem_4x4_1024x1024_160k_cityscapes/bisenetv2_fcn_ohem_4x4_1024x1024_160k_cityscapes_20210902_112947.log.json
  Paper:
    Title: 'Bisenet v2: Bilateral Network with Guided Aggregation for Real-time Semantic
      Segmentation'
    URL: https://arxiv.org/abs/2004.02147
  Code: https://github.com/open-mmlab/mmsegmentation/blob/v0.18.0/mmseg/models/backbones/bisenetv2.py#L545
  Framework: PyTorch
- Name: bisenetv2_fcn_4xb8-160k_cityscapes-1024x1024
  In Collection: BiSeNetV2
  Results:
    Task: Semantic Segmentation
    Dataset: Cityscapes
    Metrics:
      mIoU: 75.76
      mIoU(ms+flip): 77.79
  Config: configs/bisenetv2/bisenetv2_fcn_4xb8-160k_cityscapes-1024x1024.py
  Metadata:
    Training Data: Cityscapes
    Batch Size: 32
    Architecture:
    - BiSeNetV2
    - BiSeNetV2
    Training Resources: 4x V100 GPUS
    Memory (GB): 15.05
  Weights: https://download.openmmlab.com/mmsegmentation/v0.5/bisenetv2/bisenetv2_fcn_4x8_1024x1024_160k_cityscapes/bisenetv2_fcn_4x8_1024x1024_160k_cityscapes_20210903_000032-e1a2eed6.pth
  Training log: https://download.openmmlab.com/mmsegmentation/v0.5/bisenetv2/bisenetv2_fcn_4x8_1024x1024_160k_cityscapes/bisenetv2_fcn_4x8_1024x1024_160k_cityscapes_20210903_000032.log.json
  Paper:
    Title: 'Bisenet v2: Bilateral Network with Guided Aggregation for Real-time Semantic
      Segmentation'
    URL: https://arxiv.org/abs/2004.02147
  Code: https://github.com/open-mmlab/mmsegmentation/blob/v0.18.0/mmseg/models/backbones/bisenetv2.py#L545
  Framework: PyTorch
- Name: bisenetv2_fcn_4xb4-amp-160k_cityscapes-1024x1024
  In Collection: BiSeNetV2
  Results:
    Task: Semantic Segmentation
    Dataset: Cityscapes
    Metrics:
      mIoU: 73.07
      mIoU(ms+flip): 75.13
  Config: configs/bisenetv2/bisenetv2_fcn_4xb4-amp-160k_cityscapes-1024x1024.py
  Metadata:
    Training Data: Cityscapes
    Batch Size: 16
    Architecture:
    - BiSeNetV2
    - BiSeNetV2
    Training Resources: 4x V100 GPUS
    Memory (GB): 5.77
  Weights: https://download.openmmlab.com/mmsegmentation/v0.5/bisenetv2/bisenetv2_fcn_fp16_4x4_1024x1024_160k_cityscapes/bisenetv2_fcn_fp16_4x4_1024x1024_160k_cityscapes_20210902_045942-b979777b.pth
  Training log: https://download.openmmlab.com/mmsegmentation/v0.5/bisenetv2/bisenetv2_fcn_fp16_4x4_1024x1024_160k_cityscapes/bisenetv2_fcn_fp16_4x4_1024x1024_160k_cityscapes_20210902_045942.log.json
  Paper:
    Title: 'Bisenet v2: Bilateral Network with Guided Aggregation for Real-time Semantic
      Segmentation'
    URL: https://arxiv.org/abs/2004.02147
  Code: https://github.com/open-mmlab/mmsegmentation/blob/v0.18.0/mmseg/models/backbones/bisenetv2.py#L545
  Framework: PyTorch


@@ -0,0 +1,155 @@
_base_ = [
'../_base_/models/bisenetv2.py',
'../_base_/datasets/publicdataset_autolaparo.py',
'../_base_/default_runtime.py',
# '../_base_/schedules/schedule_300e_val1_check10.py', # TODO
'../_base_/schedules/schedule_300e_val1_check10_SGD.py',
]
norm_cfg = dict(
type='BN',
)
crop_size = (512, 512)
data_preprocessor = dict(
size=(512, 512),
mean=[
123.62464353460942,
85.34836259209033,
82.31539425671558,
],
std=[
47.172211618459315,
47.08256715323592,
48.135121265163605,
],
bgr_to_rgb=False,
)
model = dict(
data_preprocessor=dict(
size=(512, 512),
mean=[
123.62464353460942,
85.34836259209033,
82.31539425671558,
],
std=[
47.172211618459315,
47.08256715323592,
48.135121265163605,
],
bgr_to_rgb=False,
),
decode_head=dict(
type='FCNHead',
loss_decode=dict(
type='CrossEntropyLoss',
use_sigmoid=False,
loss_weight=1.0,
),
num_classes=10,
),
auxiliary_head=[
dict(
type='FCNHead',
in_channels=16,
channels=16,
num_convs=2,
num_classes=10,
in_index=1,
norm_cfg=dict(
type='BN',
),
concat_input=False,
align_corners=False,
loss_decode=dict(
type='CrossEntropyLoss',
use_sigmoid=False,
loss_weight=0.4,
reduction='none',
),
),
dict(
type='FCNHead',
in_channels=32,
channels=64,
num_convs=2,
num_classes=10,
in_index=2,
norm_cfg=dict(
type='BN',
),
concat_input=False,
align_corners=False,
loss_decode=dict(
type='CrossEntropyLoss',
use_sigmoid=False,
loss_weight=0.4,
reduction='none',
),
),
dict(
type='FCNHead',
in_channels=64,
channels=256,
num_convs=2,
num_classes=10,
in_index=3,
norm_cfg=dict(
type='BN',
),
concat_input=False,
align_corners=False,
loss_decode=dict(
type='CrossEntropyLoss',
use_sigmoid=False,
loss_weight=0.4,
reduction='none',
),
),
dict(
type='FCNHead',
in_channels=128,
channels=1024,
num_convs=2,
num_classes=10,
in_index=4,
norm_cfg=dict(
type='BN',
),
concat_input=False,
align_corners=False,
loss_decode=dict(
type='CrossEntropyLoss',
use_sigmoid=False,
loss_weight=0.4,
reduction='none',
),
),
],
)
# TODO
optimizer = dict(type='SGD', lr=0.05, momentum=0.9, weight_decay=0.0005)
optim_wrapper = dict(type='OptimWrapper', optimizer=optimizer)
param_scheduler = [
dict(
type='LinearLR',
start_factor=1e-06,
by_epoch=True,
begin=0,
end=10,
),
dict(
type='PolyLR',
power=0.9,
begin=10,
end=300,
eta_min=1e-05,
by_epoch=True,
),
]


@@ -0,0 +1,155 @@
_base_ = [
'../_base_/models/bisenetv2.py',
'../_base_/datasets/publicdataset_cholecseg8k.py',
'../_base_/default_runtime.py',
# '../_base_/schedules/schedule_300e_val1_check10.py',# TODO
'../_base_/schedules/schedule_300e_val1_check10_SGD.py',
]
norm_cfg = dict(
type='BN',
)
crop_size = (512, 512)
data_preprocessor = dict(
size=(512, 512),
mean=[
85.65740418979115,
53.99282220050495,
46.074045888534535,
],
std=[
72.24589167201978,
56.76979155397199,
49.056637115061775,
],
bgr_to_rgb=False,
)
model = dict(
data_preprocessor=dict(
size=(512, 512),
mean=[
85.65740418979115,
53.99282220050495,
46.074045888534535,
],
std=[
72.24589167201978,
56.76979155397199,
49.056637115061775,
],
bgr_to_rgb=False,
),
decode_head=dict(
type='FCNHead',
loss_decode=dict(
type='CrossEntropyLoss',
use_sigmoid=False,
loss_weight=1.0,
),
num_classes=13,
),
auxiliary_head=[
dict(
type='FCNHead',
in_channels=16,
channels=16,
num_convs=2,
num_classes=13,
in_index=1,
norm_cfg=dict(
type='BN',
),
concat_input=False,
align_corners=False,
loss_decode=dict(
type='CrossEntropyLoss',
use_sigmoid=False,
loss_weight=0.4,
reduction='none',
),
),
dict(
type='FCNHead',
in_channels=32,
channels=64,
num_convs=2,
num_classes=13,
in_index=2,
norm_cfg=dict(
type='BN',
),
concat_input=False,
align_corners=False,
loss_decode=dict(
type='CrossEntropyLoss',
use_sigmoid=False,
loss_weight=0.4,
reduction='none',
),
),
dict(
type='FCNHead',
in_channels=64,
channels=256,
num_convs=2,
num_classes=13,
in_index=3,
norm_cfg=dict(
type='BN',
),
concat_input=False,
align_corners=False,
loss_decode=dict(
type='CrossEntropyLoss',
use_sigmoid=False,
loss_weight=0.4,
reduction='none',
),
),
dict(
type='FCNHead',
in_channels=128,
channels=1024,
num_convs=2,
num_classes=13,
in_index=4,
norm_cfg=dict(
type='BN',
),
concat_input=False,
align_corners=False,
loss_decode=dict(
type='CrossEntropyLoss',
use_sigmoid=False,
loss_weight=0.4,
reduction='none',
),
),
],
)
# TODO
optimizer = dict(type='SGD', lr=0.05, momentum=0.9, weight_decay=0.0005)
optim_wrapper = dict(type='OptimWrapper', optimizer=optimizer)
param_scheduler = [
dict(
type='LinearLR',
start_factor=1e-06,
by_epoch=True,
begin=0,
end=10,
),
dict(
type='PolyLR',
power=0.9,
begin=10,
end=300,
eta_min=1e-05,
by_epoch=True,
),
]


@@ -0,0 +1,154 @@
_base_ = [
'../_base_/models/bisenetv2.py',
'../_base_/datasets/publicdataset_dresden.py',
'../_base_/default_runtime.py',
# '../_base_/schedules/schedule_300e_val1_check10.py', # TODO
'../_base_/schedules/schedule_300e_val1_check10_SGD.py',
]
norm_cfg = dict(
type='BN',
)
crop_size = (512, 512)
data_preprocessor = dict(
size=(512, 512),
mean=[
103.172638338208,
61.44762740851152,
51.407770213021976,
],
std=[
75.77031253622098,
54.63616729031377,
49.45572239497569,
],
bgr_to_rgb=False,
)
model = dict(
data_preprocessor=dict(
size=(512, 512),
mean=[
103.172638338208,
61.44762740851152,
51.407770213021976,
],
std=[
75.77031253622098,
54.63616729031377,
49.45572239497569,
],
bgr_to_rgb=False,
),
decode_head=dict(
type='FCNHead',
loss_decode=dict(
type='CrossEntropyLoss',
use_sigmoid=False,
loss_weight=1.0,
),
num_classes=11,
),
auxiliary_head=[
dict(
type='FCNHead',
in_channels=16,
channels=16,
num_convs=2,
num_classes=11,
in_index=1,
norm_cfg=dict(
type='BN',
),
concat_input=False,
align_corners=False,
loss_decode=dict(
type='CrossEntropyLoss',
use_sigmoid=False,
loss_weight=0.4,
reduction='none',
),
),
dict(
type='FCNHead',
in_channels=32,
channels=64,
num_convs=2,
num_classes=11,
in_index=2,
norm_cfg=dict(
type='BN',
),
concat_input=False,
align_corners=False,
loss_decode=dict(
type='CrossEntropyLoss',
use_sigmoid=False,
loss_weight=0.4,
reduction='none',
),
),
dict(
type='FCNHead',
in_channels=64,
channels=256,
num_convs=2,
num_classes=11,
in_index=3,
norm_cfg=dict(
type='BN',
),
concat_input=False,
align_corners=False,
loss_decode=dict(
type='CrossEntropyLoss',
use_sigmoid=False,
loss_weight=0.4,
reduction='none',
),
),
dict(
type='FCNHead',
in_channels=128,
channels=1024,
num_convs=2,
num_classes=11,
in_index=4,
norm_cfg=dict(
type='BN',
),
concat_input=False,
align_corners=False,
loss_decode=dict(
type='CrossEntropyLoss',
use_sigmoid=False,
loss_weight=0.4,
reduction='none',
),
),
],
)
# TODO
optimizer = dict(type='SGD', lr=0.05, momentum=0.9, weight_decay=0.0005)
optim_wrapper = dict(type='OptimWrapper', optimizer=optimizer)
param_scheduler = [
dict(
type='LinearLR',
start_factor=1e-06,
by_epoch=True,
begin=0,
end=10,
),
dict(
type='PolyLR',
power=0.9,
begin=10,
end=300,
eta_min=1e-05,
by_epoch=True,
),
]


@@ -0,0 +1,154 @@
_base_ = [
'../_base_/models/bisenetv2.py',
'../_base_/datasets/publicdataset_endovis_2017.py',
'../_base_/default_runtime.py',
# '../_base_/schedules/schedule_300e_val1_check10.py', # TODO
'../_base_/schedules/schedule_300e_val1_check10_SGD.py',
]
norm_cfg = dict(
type='BN',
)
crop_size = (512, 512)
data_preprocessor = dict(
size=(512, 512),
mean=[
122.21429912990676,
77.0821859677977,
87.03836664626716,
],
std=[
50.53335800365262,
42.895340354037465,
47.739426483390446,
],
bgr_to_rgb=False,
)
model = dict(
data_preprocessor=dict(
size=(512, 512),
mean=[
122.21429912990676,
77.0821859677977,
87.03836664626716,
],
std=[
50.53335800365262,
42.895340354037465,
47.739426483390446,
],
bgr_to_rgb=False,
),
decode_head=dict(
type='FCNHead',
loss_decode=dict(
type='CrossEntropyLoss',
use_sigmoid=False,
loss_weight=1.0,
),
num_classes=8,
),
auxiliary_head=[
dict(
type='FCNHead',
in_channels=16,
channels=16,
num_convs=2,
num_classes=8,
in_index=1,
norm_cfg=dict(
type='BN',
),
concat_input=False,
align_corners=False,
loss_decode=dict(
type='CrossEntropyLoss',
use_sigmoid=False,
loss_weight=0.4,
reduction='none',
),
),
dict(
type='FCNHead',
in_channels=32,
channels=64,
num_convs=2,
num_classes=8,
in_index=2,
norm_cfg=dict(
type='BN',
),
concat_input=False,
align_corners=False,
loss_decode=dict(
type='CrossEntropyLoss',
use_sigmoid=False,
loss_weight=0.4,
reduction='none',
),
),
dict(
type='FCNHead',
in_channels=64,
channels=256,
num_convs=2,
num_classes=8,
in_index=3,
norm_cfg=dict(
type='BN',
),
concat_input=False,
align_corners=False,
loss_decode=dict(
type='CrossEntropyLoss',
use_sigmoid=False,
loss_weight=0.4,
reduction='none',
),
),
dict(
type='FCNHead',
in_channels=128,
channels=1024,
num_convs=2,
num_classes=8,
in_index=4,
norm_cfg=dict(
type='BN',
),
concat_input=False,
align_corners=False,
loss_decode=dict(
type='CrossEntropyLoss',
use_sigmoid=False,
loss_weight=0.4,
reduction='none',
),
),
],
)
# TODO
optimizer = dict(type='SGD', lr=0.05, momentum=0.9, weight_decay=0.0005)
optim_wrapper = dict(type='OptimWrapper', optimizer=optimizer)
param_scheduler = [
dict(
type='LinearLR',
start_factor=1e-06,
by_epoch=True,
begin=0,
end=10,
),
dict(
type='PolyLR',
power=0.9,
begin=10,
end=300,
eta_min=1e-05,
by_epoch=True,
),
]


@@ -0,0 +1,154 @@
_base_ = [
'../_base_/models/bisenetv2.py',
'../_base_/datasets/publicdataset_endovis_2018.py',
'../_base_/default_runtime.py',
# '../_base_/schedules/schedule_300e_val1_check10.py', # TODO
'../_base_/schedules/schedule_300e_val1_check10_SGD.py',
]
norm_cfg = dict(
type='BN',
)
crop_size = (512, 512)
data_preprocessor = dict(
size=(512, 512),
mean=[
122.21429912990676,
77.0821859677977,
87.03836664626716,
],
std=[
50.53335800365262,
42.895340354037465,
47.739426483390446,
],
bgr_to_rgb=False,
)
model = dict(
data_preprocessor=dict(
size=(512, 512),
mean=[
122.21429912990676,
77.0821859677977,
87.03836664626716,
],
std=[
50.53335800365262,
42.895340354037465,
47.739426483390446,
],
bgr_to_rgb=False,
),
decode_head=dict(
type='FCNHead',
loss_decode=dict(
type='CrossEntropyLoss',
use_sigmoid=False,
loss_weight=1.0,
),
num_classes=8,
),
auxiliary_head=[
dict(
type='FCNHead',
in_channels=16,
channels=16,
num_convs=2,
num_classes=8,
in_index=1,
norm_cfg=dict(
type='BN',
),
concat_input=False,
align_corners=False,
loss_decode=dict(
type='CrossEntropyLoss',
use_sigmoid=False,
loss_weight=0.4,
reduction='none',
),
),
dict(
type='FCNHead',
in_channels=32,
channels=64,
num_convs=2,
num_classes=8,
in_index=2,
norm_cfg=dict(
type='BN',
),
concat_input=False,
align_corners=False,
loss_decode=dict(
type='CrossEntropyLoss',
use_sigmoid=False,
loss_weight=0.4,
reduction='none',
),
),
dict(
type='FCNHead',
in_channels=64,
channels=256,
num_convs=2,
num_classes=8,
in_index=3,
norm_cfg=dict(
type='BN',
),
concat_input=False,
align_corners=False,
loss_decode=dict(
type='CrossEntropyLoss',
use_sigmoid=False,
loss_weight=0.4,
reduction='none',
),
),
dict(
type='FCNHead',
in_channels=128,
channels=1024,
num_convs=2,
num_classes=8,
in_index=4,
norm_cfg=dict(
type='BN',
),
concat_input=False,
align_corners=False,
loss_decode=dict(
type='CrossEntropyLoss',
use_sigmoid=False,
loss_weight=0.4,
reduction='none',
),
),
],
)
# TODO
optimizer = dict(type='SGD', lr=0.05, momentum=0.9, weight_decay=0.0005)
optim_wrapper = dict(type='OptimWrapper', optimizer=optimizer)
param_scheduler = [
dict(
type='LinearLR',
start_factor=1e-06,
by_epoch=True,
begin=0,
end=10,
),
dict(
type='PolyLR',
power=0.9,
begin=10,
end=300,
eta_min=1e-05,
by_epoch=True,
),
]


@@ -0,0 +1,155 @@
_base_ = [
'../_base_/models/bisenetv2_large.py',
'../_base_/datasets/publicdataset_autolaparo.py',
'../_base_/default_runtime.py',
# '../_base_/schedules/schedule_300e_val1_check10.py', # TODO
'../_base_/schedules/schedule_300e_val1_check10_SGD.py',
]
norm_cfg = dict(
type='BN',
)
crop_size = (512, 512)
data_preprocessor = dict(
size=(512, 512),
mean=[
123.62464353460942,
85.34836259209033,
82.31539425671558,
],
std=[
47.172211618459315,
47.08256715323592,
48.135121265163605,
],
bgr_to_rgb=False,
)
model = dict(
data_preprocessor=dict(
size=(512, 512),
mean=[
123.62464353460942,
85.34836259209033,
82.31539425671558,
],
std=[
47.172211618459315,
47.08256715323592,
48.135121265163605,
],
bgr_to_rgb=False,
),
decode_head=dict(
type='FCNHead',
loss_decode=dict(
type='CrossEntropyLoss',
use_sigmoid=False,
loss_weight=1.0,
),
num_classes=10,
),
auxiliary_head=[
dict(
type='FCNHead',
in_channels=32,
channels=16,
num_convs=2,
num_classes=10,
in_index=1,
norm_cfg=dict(
type='BN',
),
concat_input=False,
align_corners=False,
loss_decode=dict(
type='CrossEntropyLoss',
use_sigmoid=False,
loss_weight=0.4,
reduction='none',
),
),
dict(
type='FCNHead',
in_channels=64,
channels=64,
num_convs=2,
num_classes=10,
in_index=2,
norm_cfg=dict(
type='BN',
),
concat_input=False,
align_corners=False,
loss_decode=dict(
type='CrossEntropyLoss',
use_sigmoid=False,
loss_weight=0.4,
reduction='none',
),
),
dict(
type='FCNHead',
in_channels=128,
channels=256,
num_convs=2,
num_classes=10,
in_index=3,
norm_cfg=dict(
type='BN',
),
concat_input=False,
align_corners=False,
loss_decode=dict(
type='CrossEntropyLoss',
use_sigmoid=False,
loss_weight=0.4,
reduction='none',
),
),
dict(
type='FCNHead',
in_channels=256,
channels=1024,
num_convs=2,
num_classes=10,
in_index=4,
norm_cfg=dict(
type='BN',
),
concat_input=False,
align_corners=False,
loss_decode=dict(
type='CrossEntropyLoss',
use_sigmoid=False,
loss_weight=0.4,
reduction='none',
),
),
],
)
# TODO
optimizer = dict(type='SGD', lr=0.05, momentum=0.9, weight_decay=0.0005)
optim_wrapper = dict(type='OptimWrapper', optimizer=optimizer)
param_scheduler = [
dict(
type='LinearLR',
start_factor=1e-06,
by_epoch=True,
begin=0,
end=10,
),
dict(
type='PolyLR',
power=0.9,
begin=10,
end=300,
eta_min=1e-05,
by_epoch=True,
),
]


@@ -0,0 +1,155 @@
_base_ = [
'../_base_/models/bisenetv2_large.py',
'../_base_/datasets/publicdataset_cholecseg8k.py',
'../_base_/default_runtime.py',
# '../_base_/schedules/schedule_300e_val1_check10.py',# TODO
'../_base_/schedules/schedule_300e_val1_check10_SGD.py',
]
norm_cfg = dict(
type='BN',
)
crop_size = (512, 512)
data_preprocessor = dict(
size=(512, 512),
mean=[
85.65740418979115,
53.99282220050495,
46.074045888534535,
],
std=[
72.24589167201978,
56.76979155397199,
49.056637115061775,
],
bgr_to_rgb=False,
)
model = dict(
data_preprocessor=dict(
size=(512, 512),
mean=[
85.65740418979115,
53.99282220050495,
46.074045888534535,
],
std=[
72.24589167201978,
56.76979155397199,
49.056637115061775,
],
bgr_to_rgb=False,
),
decode_head=dict(
type='FCNHead',
loss_decode=dict(
type='CrossEntropyLoss',
use_sigmoid=False,
loss_weight=1.0,
),
num_classes=13,
),
auxiliary_head=[
dict(
type='FCNHead',
in_channels=32,
channels=16,
num_convs=2,
num_classes=13,
in_index=1,
norm_cfg=dict(
type='BN',
),
concat_input=False,
align_corners=False,
loss_decode=dict(
type='CrossEntropyLoss',
use_sigmoid=False,
loss_weight=0.4,
reduction='none',
),
),
dict(
type='FCNHead',
in_channels=64,
channels=64,
num_convs=2,
num_classes=13,
in_index=2,
norm_cfg=dict(
type='BN',
),
concat_input=False,
align_corners=False,
loss_decode=dict(
type='CrossEntropyLoss',
use_sigmoid=False,
loss_weight=0.4,
reduction='none',
),
),
dict(
type='FCNHead',
in_channels=128,
channels=256,
num_convs=2,
num_classes=13,
in_index=3,
norm_cfg=dict(
type='BN',
),
concat_input=False,
align_corners=False,
loss_decode=dict(
type='CrossEntropyLoss',
use_sigmoid=False,
loss_weight=0.4,
reduction='none',
),
),
dict(
type='FCNHead',
in_channels=256,
channels=1024,
num_convs=2,
num_classes=13,
in_index=4,
norm_cfg=dict(
type='BN',
),
concat_input=False,
align_corners=False,
loss_decode=dict(
type='CrossEntropyLoss',
use_sigmoid=False,
loss_weight=0.4,
reduction='none',
),
),
],
)
# TODO
optimizer = dict(type='SGD', lr=0.05, momentum=0.9, weight_decay=0.0005)
optim_wrapper = dict(type='OptimWrapper', optimizer=optimizer)
param_scheduler = [
dict(
type='LinearLR',
start_factor=1e-06,
by_epoch=True,
begin=0,
end=10,
),
dict(
type='PolyLR',
power=0.9,
begin=10,
end=300,
eta_min=1e-05,
by_epoch=True,
),
]


@@ -0,0 +1,154 @@
_base_ = [
'../_base_/models/bisenetv2_large.py',
'../_base_/datasets/publicdataset_dresden.py',
'../_base_/default_runtime.py',
# '../_base_/schedules/schedule_300e_val1_check10.py', # TODO
'../_base_/schedules/schedule_300e_val1_check10_SGD.py',
]
norm_cfg = dict(
type='BN',
)
crop_size = (512, 512)
data_preprocessor = dict(
size=(512, 512),
mean=[
103.172638338208,
61.44762740851152,
51.407770213021976,
],
std=[
75.77031253622098,
54.63616729031377,
49.45572239497569,
],
bgr_to_rgb=False,
)
model = dict(
data_preprocessor=dict(
size=(512, 512),
mean=[
103.172638338208,
61.44762740851152,
51.407770213021976,
],
std=[
75.77031253622098,
54.63616729031377,
49.45572239497569,
],
bgr_to_rgb=False,
),
decode_head=dict(
type='FCNHead',
loss_decode=dict(
type='CrossEntropyLoss',
use_sigmoid=False,
loss_weight=1.0,
),
num_classes=11,
),
auxiliary_head=[
dict(
type='FCNHead',
in_channels=32,
channels=16,
num_convs=2,
num_classes=11,
in_index=1,
norm_cfg=dict(
type='BN',
),
concat_input=False,
align_corners=False,
loss_decode=dict(
type='CrossEntropyLoss',
use_sigmoid=False,
loss_weight=0.4,
reduction='none',
),
),
dict(
type='FCNHead',
in_channels=64,
channels=64,
num_convs=2,
num_classes=11,
in_index=2,
norm_cfg=dict(
type='BN',
),
concat_input=False,
align_corners=False,
loss_decode=dict(
type='CrossEntropyLoss',
use_sigmoid=False,
loss_weight=0.4,
reduction='none',
),
),
dict(
type='FCNHead',
in_channels=128,
channels=256,
num_convs=2,
num_classes=11,
in_index=3,
norm_cfg=dict(
type='BN',
),
concat_input=False,
align_corners=False,
loss_decode=dict(
type='CrossEntropyLoss',
use_sigmoid=False,
loss_weight=0.4,
reduction='none',
),
),
dict(
type='FCNHead',
in_channels=256,
channels=1024,
num_convs=2,
num_classes=11,
in_index=4,
norm_cfg=dict(
type='BN',
),
concat_input=False,
align_corners=False,
loss_decode=dict(
type='CrossEntropyLoss',
use_sigmoid=False,
loss_weight=0.4,
reduction='none',
),
),
],
)
# TODO: SGD optimizer settings; these override any optimizer settings inherited from _base_.
optimizer = dict(type='SGD', lr=0.05, momentum=0.9, weight_decay=0.0005)
optim_wrapper = dict(type='OptimWrapper', optimizer=optimizer)
param_scheduler = [
dict(
type='LinearLR',
start_factor=1e-06,
by_epoch=True,
begin=0,
end=10,
),
dict(
type='PolyLR',
power=0.9,
begin=10,
end=300,
eta_min=1e-05,
by_epoch=True,
),
]

@@ -0,0 +1,154 @@
_base_ = [
'../_base_/models/bisenetv2_large.py',
'../_base_/datasets/publicdataset_endovis_2017.py',
'../_base_/default_runtime.py',
# '../_base_/schedules/schedule_300e_val1_check10.py',  # TODO: disabled in favor of the SGD schedule below
'../_base_/schedules/schedule_300e_val1_check10_SGD.py',
]
norm_cfg = dict(
type='BN',
)
crop_size = (512, 512)
data_preprocessor = dict(
size=(512, 512),
mean=[
122.21429912990676,
77.0821859677977,
87.03836664626716,
],
std=[
50.53335800365262,
42.895340354037465,
47.739426483390446,
],
bgr_to_rgb=False,
)
model = dict(
data_preprocessor=data_preprocessor,  # reuse the dict defined above
decode_head=dict(
type='FCNHead',
loss_decode=dict(
type='CrossEntropyLoss',
use_sigmoid=False,
loss_weight=1.0,
),
num_classes=8,
),
auxiliary_head=[
dict(
type='FCNHead',
in_channels=32,
channels=16,
num_convs=2,
num_classes=8,
in_index=1,
norm_cfg=dict(
type='BN',
),
concat_input=False,
align_corners=False,
loss_decode=dict(
type='CrossEntropyLoss',
use_sigmoid=False,
loss_weight=0.4,
reduction='none',
),
),
dict(
type='FCNHead',
in_channels=64,
channels=64,
num_convs=2,
num_classes=8,
in_index=2,
norm_cfg=dict(
type='BN',
),
concat_input=False,
align_corners=False,
loss_decode=dict(
type='CrossEntropyLoss',
use_sigmoid=False,
loss_weight=0.4,
reduction='none',
),
),
dict(
type='FCNHead',
in_channels=128,
channels=256,
num_convs=2,
num_classes=8,
in_index=3,
norm_cfg=dict(
type='BN',
),
concat_input=False,
align_corners=False,
loss_decode=dict(
type='CrossEntropyLoss',
use_sigmoid=False,
loss_weight=0.4,
reduction='none',
),
),
dict(
type='FCNHead',
in_channels=256,
channels=1024,
num_convs=2,
num_classes=8,
in_index=4,
norm_cfg=dict(
type='BN',
),
concat_input=False,
align_corners=False,
loss_decode=dict(
type='CrossEntropyLoss',
use_sigmoid=False,
loss_weight=0.4,
reduction='none',
),
),
],
)
# TODO: SGD optimizer settings; these override any optimizer settings inherited from _base_.
optimizer = dict(type='SGD', lr=0.05, momentum=0.9, weight_decay=0.0005)
optim_wrapper = dict(type='OptimWrapper', optimizer=optimizer)
param_scheduler = [
dict(
type='LinearLR',
start_factor=1e-06,
by_epoch=True,
begin=0,
end=10,
),
dict(
type='PolyLR',
power=0.9,
begin=10,
end=300,
eta_min=1e-05,
by_epoch=True,
),
]

@@ -0,0 +1,154 @@
_base_ = [
'../_base_/models/bisenetv2_large.py',
'../_base_/datasets/publicdataset_endovis_2018.py',
'../_base_/default_runtime.py',
# '../_base_/schedules/schedule_300e_val1_check10.py',  # TODO: disabled in favor of the SGD schedule below
'../_base_/schedules/schedule_300e_val1_check10_SGD.py',
]
norm_cfg = dict(
type='BN',
)
crop_size = (512, 512)
data_preprocessor = dict(
size=(512, 512),
mean=[
122.21429912990676,
77.0821859677977,
87.03836664626716,
],
std=[
50.53335800365262,
42.895340354037465,
47.739426483390446,
],
bgr_to_rgb=False,
)
model = dict(
data_preprocessor=data_preprocessor,  # reuse the dict defined above
decode_head=dict(
type='FCNHead',
loss_decode=dict(
type='CrossEntropyLoss',
use_sigmoid=False,
loss_weight=1.0,
),
num_classes=8,
),
auxiliary_head=[
dict(
type='FCNHead',
in_channels=32,
channels=16,
num_convs=2,
num_classes=8,
in_index=1,
norm_cfg=dict(
type='BN',
),
concat_input=False,
align_corners=False,
loss_decode=dict(
type='CrossEntropyLoss',
use_sigmoid=False,
loss_weight=0.4,
reduction='none',
),
),
dict(
type='FCNHead',
in_channels=64,
channels=64,
num_convs=2,
num_classes=8,
in_index=2,
norm_cfg=dict(
type='BN',
),
concat_input=False,
align_corners=False,
loss_decode=dict(
type='CrossEntropyLoss',
use_sigmoid=False,
loss_weight=0.4,
reduction='none',
),
),
dict(
type='FCNHead',
in_channels=128,
channels=256,
num_convs=2,
num_classes=8,
in_index=3,
norm_cfg=dict(
type='BN',
),
concat_input=False,
align_corners=False,
loss_decode=dict(
type='CrossEntropyLoss',
use_sigmoid=False,
loss_weight=0.4,
reduction='none',
),
),
dict(
type='FCNHead',
in_channels=256,
channels=1024,
num_convs=2,
num_classes=8,
in_index=4,
norm_cfg=dict(
type='BN',
),
concat_input=False,
align_corners=False,
loss_decode=dict(
type='CrossEntropyLoss',
use_sigmoid=False,
loss_weight=0.4,
reduction='none',
),
),
],
)
# TODO: SGD optimizer settings; these override any optimizer settings inherited from _base_.
optimizer = dict(type='SGD', lr=0.05, momentum=0.9, weight_decay=0.0005)
optim_wrapper = dict(type='OptimWrapper', optimizer=optimizer)
param_scheduler = [
dict(
type='LinearLR',
start_factor=1e-06,
by_epoch=True,
begin=0,
end=10,
),
dict(
type='PolyLR',
power=0.9,
begin=10,
end=300,
eta_min=1e-05,
by_epoch=True,
),
]

@@ -0,0 +1,164 @@
_base_ = [
'../_base_/models/en_bisenetv2.py',
'../_base_/datasets/publicdataset_autolaparo.py',
'../_base_/default_runtime.py',
'../_base_/schedules/schedule_300e_val1_check10.py',
]
norm_cfg = dict(
type='BN',
)
crop_size = (512, 512)
data_preprocessor = dict(
size=(512, 512),
mean=[
123.62464353460942,
85.34836259209033,
82.31539425671558,
],
std=[
47.172211618459315,
47.08256715323592,
48.135121265163605,
],
bgr_to_rgb=False,
)
model = dict(
data_preprocessor=data_preprocessor,  # reuse the dict defined above
decode_head=dict(
type='FCNHead',
loss_decode=dict(
type='DiceLoss',
use_sigmoid=False,
loss_weight=1.0,
),
num_classes=10,
),
auxiliary_head=[
dict(
type='FCNHead',
in_channels=64,
channels=16,
num_convs=2,
num_classes=10,
in_index=1,
norm_cfg=dict(
type='BN',
),
concat_input=False,
align_corners=False,
loss_decode=dict(
type='DiceLoss',
use_sigmoid=False,
loss_weight=0.4,
reduction='none',
),
),
dict(
type='FCNHead',
in_channels=128,
channels=64,
num_convs=2,
num_classes=10,
in_index=2,
norm_cfg=dict(
type='BN',
),
concat_input=False,
align_corners=False,
loss_decode=dict(
type='DiceLoss',
use_sigmoid=False,
loss_weight=0.4,
reduction='none',
),
),
dict(
type='FCNHead',
in_channels=128,
channels=256,
num_convs=2,
num_classes=10,
in_index=3,
norm_cfg=dict(
type='BN',
),
concat_input=False,
align_corners=False,
loss_decode=dict(
type='DiceLoss',
use_sigmoid=False,
loss_weight=0.4,
reduction='none',
),
),
dict(
type='FCNHead',
in_channels=128,
channels=1024,
num_convs=2,
num_classes=10,
in_index=4,
norm_cfg=dict(
type='BN',
),
concat_input=False,
align_corners=False,
loss_decode=dict(
type='DiceLoss',
use_sigmoid=False,
loss_weight=0.4,
reduction='none',
),
),
],
)
optim_wrapper = dict(
type='OptimWrapper',
_delete_=True,
optimizer=dict(
type='AdamW',
lr=0.0001,
weight_decay=0.0005,
),
clip_grad=dict(
max_norm=1,
norm_type=2,
),
)
param_scheduler = [
dict(
type='LinearLR',
start_factor=1e-06,
by_epoch=True,
begin=0,
end=10,
),
dict(
type='PolyLR',
power=0.9,
begin=10,
end=300,
eta_min=1e-05,
by_epoch=True,
),
]

@@ -0,0 +1,164 @@
_base_ = [
'../_base_/models/en_bisenetv2.py',
'../_base_/datasets/publicdataset_cholecseg8k.py',
'../_base_/default_runtime.py',
'../_base_/schedules/schedule_300e_val1_check10.py',
]
norm_cfg = dict(
type='BN',
)
crop_size = (512, 512)
data_preprocessor = dict(
size=(512, 512),
mean=[
85.65740418979115,
53.99282220050495,
46.074045888534535,
],
std=[
72.24589167201978,
56.76979155397199,
49.056637115061775,
],
bgr_to_rgb=False,
)
model = dict(
data_preprocessor=data_preprocessor,  # reuse the dict defined above
decode_head=dict(
type='FCNHead',
loss_decode=dict(
type='DiceLoss',
use_sigmoid=False,
loss_weight=1.0,
),
num_classes=13,
),
auxiliary_head=[
dict(
type='FCNHead',
in_channels=64,  # matches the channels of semantic_out_s3 (in_index=1); was 16
channels=16,
num_convs=2,
num_classes=13,
in_index=1,
norm_cfg=dict(
type='BN',
),
concat_input=False,
align_corners=False,
loss_decode=dict(
type='DiceLoss',
use_sigmoid=False,
loss_weight=0.4,
reduction='none',
),
),
dict(
type='FCNHead',
in_channels=128,  # matches the channels of semantic_out_s4 (in_index=2); was 32
channels=64,
num_convs=2,
num_classes=13,
in_index=2,
norm_cfg=dict(
type='BN',
),
concat_input=False,
align_corners=False,
loss_decode=dict(
type='DiceLoss',
use_sigmoid=False,
loss_weight=0.4,
reduction='none',
),
),
dict(
type='FCNHead',
in_channels=128,  # matches the channels of semantic_fused (in_index=3); was 64
channels=256,
num_convs=2,
num_classes=13,
in_index=3,
norm_cfg=dict(
type='BN',
),
concat_input=False,
align_corners=False,
loss_decode=dict(
type='DiceLoss',
use_sigmoid=False,
loss_weight=0.4,
reduction='none',
),
),
dict(
type='FCNHead',
in_channels=128,
channels=1024,
num_convs=2,
num_classes=13,
in_index=4,
norm_cfg=dict(
type='BN',
),
concat_input=False,
align_corners=False,
loss_decode=dict(
type='DiceLoss',
use_sigmoid=False,
loss_weight=0.4,
reduction='none',
),
),
],
)
optim_wrapper = dict(
type='OptimWrapper',
_delete_=True,
optimizer=dict(
type='AdamW',
lr=0.0001,
weight_decay=0.0005,
),
clip_grad=dict(
max_norm=1,
norm_type=2,
),
)
param_scheduler = [
dict(
type='LinearLR',
start_factor=1e-06,
by_epoch=True,
begin=0,
end=10,
),
dict(
type='PolyLR',
power=0.9,
begin=10,
end=300,
eta_min=1e-05,
by_epoch=True,
),
]

@@ -0,0 +1,164 @@
_base_ = [
'../_base_/models/en_bisenetv2.py',
'../_base_/datasets/publicdataset_dresden.py',
'../_base_/default_runtime.py',
'../_base_/schedules/schedule_300e_val1_check10.py',
]
norm_cfg = dict(
type='BN',
)
crop_size = (512, 512)
data_preprocessor = dict(
size=(512, 512),
mean=[
103.172638338208,
61.44762740851152,
51.407770213021976,
],
std=[
75.77031253622098,
54.63616729031377,
49.45572239497569,
],
bgr_to_rgb=False,
)
model = dict(
data_preprocessor=data_preprocessor,  # reuse the dict defined above
decode_head=dict(
type='FCNHead',
loss_decode=dict(
type='DiceLoss',
use_sigmoid=False,
loss_weight=1.0,
),
num_classes=11,
),
auxiliary_head=[
dict(
type='FCNHead',
in_channels=64,
channels=16,
num_convs=2,
num_classes=11,
in_index=1,
norm_cfg=dict(
type='BN',
),
concat_input=False,
align_corners=False,
loss_decode=dict(
type='DiceLoss',
use_sigmoid=False,
loss_weight=0.4,
reduction='none',
),
),
dict(
type='FCNHead',
in_channels=128,
channels=64,
num_convs=2,
num_classes=11,
in_index=2,
norm_cfg=dict(
type='BN',
),
concat_input=False,
align_corners=False,
loss_decode=dict(
type='DiceLoss',
use_sigmoid=False,
loss_weight=0.4,
reduction='none',
),
),
dict(
type='FCNHead',
in_channels=128,
channels=256,
num_convs=2,
num_classes=11,
in_index=3,
norm_cfg=dict(
type='BN',
),
concat_input=False,
align_corners=False,
loss_decode=dict(
type='DiceLoss',
use_sigmoid=False,
loss_weight=0.4,
reduction='none',
),
),
dict(
type='FCNHead',
in_channels=128,
channels=1024,
num_convs=2,
num_classes=11,
in_index=4,
norm_cfg=dict(
type='BN',
),
concat_input=False,
align_corners=False,
loss_decode=dict(
type='DiceLoss',
use_sigmoid=False,
loss_weight=0.4,
reduction='none',
),
),
],
)
optim_wrapper = dict(
type='OptimWrapper',
_delete_=True,
optimizer=dict(
type='AdamW',
lr=0.0001,
weight_decay=0.0005,
),
clip_grad=dict(
max_norm=1,
norm_type=2,
),
)
param_scheduler = [
dict(
type='LinearLR',
start_factor=1e-06,
by_epoch=True,
begin=0,
end=10,
),
dict(
type='PolyLR',
power=0.9,
begin=10,
end=300,
eta_min=1e-05,
by_epoch=True,
),
]

@@ -0,0 +1,164 @@
_base_ = [
'../_base_/models/en_bisenetv2.py',
'../_base_/datasets/publicdataset_endovis_2017.py',
'../_base_/default_runtime.py',
'../_base_/schedules/schedule_300e_val1_check10.py',
]
norm_cfg = dict(
type='BN',
)
crop_size = (512, 512)
data_preprocessor = dict(
size=(512, 512),
mean=[
122.21429912990676,
77.0821859677977,
87.03836664626716,
],
std=[
50.53335800365262,
42.895340354037465,
47.739426483390446,
],
bgr_to_rgb=False,
)
model = dict(
data_preprocessor=data_preprocessor,  # reuse the dict defined above
decode_head=dict(
type='FCNHead',
loss_decode=dict(
type='DiceLoss',
use_sigmoid=False,
loss_weight=1.0,
),
num_classes=8,
),
auxiliary_head=[
dict(
type='FCNHead',
in_channels=64,
channels=16,
num_convs=2,
num_classes=8,
in_index=1,
norm_cfg=dict(
type='BN',
),
concat_input=False,
align_corners=False,
loss_decode=dict(
type='DiceLoss',
use_sigmoid=False,
loss_weight=0.4,
reduction='none',
),
),
dict(
type='FCNHead',
in_channels=128,
channels=64,
num_convs=2,
num_classes=8,
in_index=2,
norm_cfg=dict(
type='BN',
),
concat_input=False,
align_corners=False,
loss_decode=dict(
type='DiceLoss',
use_sigmoid=False,
loss_weight=0.4,
reduction='none',
),
),
dict(
type='FCNHead',
in_channels=128,
channels=256,
num_convs=2,
num_classes=8,
in_index=3,
norm_cfg=dict(
type='BN',
),
concat_input=False,
align_corners=False,
loss_decode=dict(
type='DiceLoss',
use_sigmoid=False,
loss_weight=0.4,
reduction='none',
),
),
dict(
type='FCNHead',
in_channels=128,
channels=1024,
num_convs=2,
num_classes=8,
in_index=4,
norm_cfg=dict(
type='BN',
),
concat_input=False,
align_corners=False,
loss_decode=dict(
type='DiceLoss',
use_sigmoid=False,
loss_weight=0.4,
reduction='none',
),
),
],
)
optim_wrapper = dict(
type='OptimWrapper',
_delete_=True,
optimizer=dict(
type='AdamW',
lr=0.0001,
weight_decay=0.0005,
),
clip_grad=dict(
max_norm=1,
norm_type=2,
),
)
param_scheduler = [
dict(
type='LinearLR',
start_factor=1e-06,
by_epoch=True,
begin=0,
end=10,
),
dict(
type='PolyLR',
power=0.9,
begin=10,
end=300,
eta_min=1e-05,
by_epoch=True,
),
]

@@ -0,0 +1,164 @@
_base_ = [
'../_base_/models/en_bisenetv2.py',
'../_base_/datasets/publicdataset_endovis_2018.py',
'../_base_/default_runtime.py',
'../_base_/schedules/schedule_300e_val1_check10.py',
]
norm_cfg = dict(
type='BN',
)
crop_size = (512, 512)
data_preprocessor = dict(
size=(512, 512),
mean=[
122.21429912990676,
77.0821859677977,
87.03836664626716,
],
std=[
50.53335800365262,
42.895340354037465,
47.739426483390446,
],
bgr_to_rgb=False,
)
model = dict(
data_preprocessor=data_preprocessor,  # reuse the dict defined above
decode_head=dict(
type='FCNHead',
loss_decode=dict(
type='DiceLoss',
use_sigmoid=False,
loss_weight=1.0,
),
num_classes=8,
),
auxiliary_head=[
dict(
type='FCNHead',
in_channels=64,
channels=16,
num_convs=2,
num_classes=8,
in_index=1,
norm_cfg=dict(
type='BN',
),
concat_input=False,
align_corners=False,
loss_decode=dict(
type='DiceLoss',
use_sigmoid=False,
loss_weight=0.4,
reduction='none',
),
),
dict(
type='FCNHead',
in_channels=128,
channels=64,
num_convs=2,
num_classes=8,
in_index=2,
norm_cfg=dict(
type='BN',
),
concat_input=False,
align_corners=False,
loss_decode=dict(
type='DiceLoss',
use_sigmoid=False,
loss_weight=0.4,
reduction='none',
),
),
dict(
type='FCNHead',
in_channels=128,
channels=256,
num_convs=2,
num_classes=8,
in_index=3,
norm_cfg=dict(
type='BN',
),
concat_input=False,
align_corners=False,
loss_decode=dict(
type='DiceLoss',
use_sigmoid=False,
loss_weight=0.4,
reduction='none',
),
),
dict(
type='FCNHead',
in_channels=128,
channels=1024,
num_convs=2,
num_classes=8,
in_index=4,
norm_cfg=dict(
type='BN',
),
concat_input=False,
align_corners=False,
loss_decode=dict(
type='DiceLoss',
use_sigmoid=False,
loss_weight=0.4,
reduction='none',
),
),
],
)
optim_wrapper = dict(
type='OptimWrapper',
_delete_=True,
optimizer=dict(
type='AdamW',
lr=0.0001,
weight_decay=0.0005,
),
clip_grad=dict(
max_norm=1,
norm_type=2,
),
)
param_scheduler = [
dict(
type='LinearLR',
start_factor=1e-06,
by_epoch=True,
begin=0,
end=10,
),
dict(
type='PolyLR',
power=0.9,
begin=10,
end=300,
eta_min=1e-05,
by_epoch=True,
),
]