first commit

admin
2026-05-20 15:05:35 +08:00
commit ac09b26253
2048 changed files with 189478 additions and 0 deletions

@@ -0,0 +1,42 @@
# Fast-SCNN
> [Fast-SCNN: Fast Semantic Segmentation Network](https://arxiv.org/abs/1902.04502)
## Introduction
<!-- [ALGORITHM] -->
<a href="">Official Repo</a>
<a href="https://github.com/open-mmlab/mmsegmentation/blob/v0.17.0/mmseg/models/backbones/fast_scnn.py#L272">Code Snippet</a>
## Abstract
<!-- [ABSTRACT] -->
The encoder-decoder framework is state-of-the-art for offline semantic image segmentation. Since the rise in autonomous systems, real-time computation is increasingly desirable. In this paper, we introduce fast segmentation convolutional neural network (Fast-SCNN), an above real-time semantic segmentation model on high resolution image data (1024x2048px) suited to efficient computation on embedded devices with low memory. Building on existing two-branch methods for fast segmentation, we introduce our 'learning to downsample' module which computes low-level features for multiple resolution branches simultaneously. Our network combines spatial detail at high resolution with deep features extracted at lower resolution, yielding an accuracy of 68.0% mean intersection over union at 123.5 frames per second on Cityscapes. We also show that large scale pre-training is unnecessary. We thoroughly validate our metric in experiments with ImageNet pre-training and the coarse labeled data of Cityscapes. Finally, we show even faster computation with competitive results on subsampled inputs, without any network modifications.
<!-- [IMAGE] -->
<div align=center>
<img src="https://user-images.githubusercontent.com/24582831/142901444-705b4ff4-6d1e-409b-899a-37bf3a6b69ce.png" width="80%"/>
</div>
## Results and models
### Cityscapes
| Method | Backbone | Crop Size | Lr schd | Mem (GB) | Inf time (fps) | Device | mIoU | mIoU(ms+flip) | config | download |
| -------- | -------- | --------- | ------: | -------- | -------------- | ------ | ----: | ------------- | ---------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| FastSCNN | FastSCNN | 512x1024 | 160000 | 3.3 | 56.45 | V100 | 70.96 | 72.65 | [config](https://github.com/open-mmlab/mmsegmentation/blob/main/configs/fastscnn/fast_scnn_8xb4-160k_cityscapes-512x1024.py) | [model](https://download.openmmlab.com/mmsegmentation/v0.5/fast_scnn/fast_scnn_lr0.12_8x4_160k_cityscapes/fast_scnn_lr0.12_8x4_160k_cityscapes_20210630_164853-0cec9937.pth) \| [log](https://download.openmmlab.com/mmsegmentation/v0.5/fast_scnn/fast_scnn_lr0.12_8x4_160k_cityscapes/fast_scnn_lr0.12_8x4_160k_cityscapes_20210630_164853.log.json) |
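
The mIoU columns report mean intersection over union: for each class, the number of pixels where prediction and ground truth agree on that class, divided by the number of pixels where either one assigns it, averaged over classes. A minimal sketch of that computation in plain Python, assuming flat lists of integer class labels. The function name `mean_iou` and the toy labels are illustrative and not part of mmsegmentation, whose evaluator also handles ignored labels:

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over the classes that occur in pred or target."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes absent from both label lists
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with three classes and six pixels (made-up labels).
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
print(mean_iou(pred, target, 3))
```

On the toy labels the per-class IoUs are 1/3, 2/3 and 1/2, so the mean is 0.5.
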
## Citation
```bibtex
@article{poudel2019fast,
title={Fast-scnn: Fast semantic segmentation network},
author={Poudel, Rudra PK and Liwicki, Stephan and Cipolla, Roberto},
journal={arXiv preprint arXiv:1902.04502},
year={2019}
}
```

@@ -0,0 +1,15 @@
_base_ = [
    '../_base_/models/fast_scnn.py', '../_base_/datasets/cityscapes.py',
    '../_base_/default_runtime.py', '../_base_/schedules/schedule_160k.py'
]
crop_size = (512, 1024)
data_preprocessor = dict(size=crop_size)
model = dict(data_preprocessor=data_preprocessor)
# Re-config the data sampler.
train_dataloader = dict(batch_size=4, num_workers=4)
val_dataloader = dict(batch_size=1, num_workers=4)
test_dataloader = val_dataloader
# Re-config the optimizer.
optimizer = dict(type='SGD', lr=0.12, momentum=0.9, weight_decay=4e-5)
optim_wrapper = dict(type='OptimWrapper', optimizer=optimizer)

@@ -0,0 +1,37 @@
Collections:
- Name: FastSCNN
  License: Apache License 2.0
  Metadata:
    Training Data:
    - Cityscapes
  Paper:
    Title: 'Fast-SCNN: Fast Semantic Segmentation Network'
    URL: https://arxiv.org/abs/1902.04502
  README: configs/fastscnn/README.md
  Frameworks:
  - PyTorch
Models:
- Name: fast_scnn_8xb4-160k_cityscapes-512x1024
  In Collection: FastSCNN
  Results:
    Task: Semantic Segmentation
    Dataset: Cityscapes
    Metrics:
      mIoU: 70.96
      mIoU(ms+flip): 72.65
  Config: configs/fastscnn/fast_scnn_8xb4-160k_cityscapes-512x1024.py
  Metadata:
    Training Data: Cityscapes
    Batch Size: 32
    Architecture:
    - FastSCNN
    - FastSCNN
    Training Resources: 8x V100 GPUS
    Memory (GB): 3.3
  Weights: https://download.openmmlab.com/mmsegmentation/v0.5/fast_scnn/fast_scnn_lr0.12_8x4_160k_cityscapes/fast_scnn_lr0.12_8x4_160k_cityscapes_20210630_164853-0cec9937.pth
  Training log: https://download.openmmlab.com/mmsegmentation/v0.5/fast_scnn/fast_scnn_lr0.12_8x4_160k_cityscapes/fast_scnn_lr0.12_8x4_160k_cityscapes_20210630_164853.log.json
  Paper:
    Title: 'Fast-SCNN: Fast Semantic Segmentation Network'
    URL: https://arxiv.org/abs/1902.04502
  Code: https://github.com/open-mmlab/mmsegmentation/blob/v0.17.0/mmseg/models/backbones/fast_scnn.py#L272
  Framework: PyTorch
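
The model name `fast_scnn_8xb4-160k_cityscapes-512x1024` encodes the training setup: 8 GPUs with a per-GPU batch of 4 (hence `Batch Size: 32`), a 160k-iteration schedule, the dataset, and the crop size. A small sketch that decodes such a name, assuming it follows this pattern exactly. The helper `parse_config_name` and its regular expression are illustrative, not an mmsegmentation API:

```python
import re

# Hypothetical pattern: <algorithm>_<gpus>xb<per-gpu batch>-<N>k_<dataset>-<H>x<W>
NAME_PATTERN = re.compile(
    r'^(?P<algorithm>.+?)_(?P<gpus>\d+)xb(?P<per_gpu>\d+)-'
    r'(?P<schedule>\d+)k_(?P<dataset>.+)-(?P<height>\d+)x(?P<width>\d+)$')


def parse_config_name(name):
    """Split a config name into its training-setup fields."""
    match = NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f'unrecognized config name: {name}')
    gpus = int(match['gpus'])
    per_gpu = int(match['per_gpu'])
    return {
        'algorithm': match['algorithm'],
        'gpus': gpus,
        'batch_per_gpu': per_gpu,
        'total_batch': gpus * per_gpu,
        'iterations': int(match['schedule']) * 1000,
        'dataset': match['dataset'],
        'crop_size': (int(match['height']), int(match['width'])),
    }


info = parse_config_name('fast_scnn_8xb4-160k_cityscapes-512x1024')
print(info['total_batch'], info['iterations'], info['crop_size'])
```
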

@@ -0,0 +1,122 @@
_base_ = [
    '../_base_/models/fast_scnn.py',
    '../_base_/datasets/publicdataset_autolaparo.py',
    '../_base_/default_runtime.py',
    '../_base_/schedules/schedule_300e_val1_check10.py',
]
norm_cfg = dict(
    type='BN',
)
crop_size = (512, 512)
data_preprocessor = dict(
    size=(512, 512),
    mean=[
        123.62464353460942,
        85.34836259209033,
        82.31539425671558,
    ],
    std=[
        47.172211618459315,
        47.08256715323592,
        48.135121265163605,
    ],
    bgr_to_rgb=False,
)
model = dict(
    data_preprocessor=dict(
        size=(512, 512),
        mean=[
            123.62464353460942,
            85.34836259209033,
            82.31539425671558,
        ],
        std=[
            47.172211618459315,
            47.08256715323592,
            48.135121265163605,
        ],
        bgr_to_rgb=False,
    ),
    decode_head=dict(
        loss_decode=dict(
            type='DiceLoss',
            use_sigmoid=False,
            loss_weight=1.0,
        ),
    ),
    # Two auxiliary FCN heads, each weighted 0.4 in the loss.
    auxiliary_head=[
        dict(
            type='FCNHead',
            in_channels=128,
            channels=32,
            num_convs=1,
            num_classes=10,
            in_index=-2,
            norm_cfg=dict(
                type='BN',
            ),
            concat_input=False,
            align_corners=False,
            loss_decode=dict(
                type='DiceLoss',
                use_sigmoid=False,
                loss_weight=0.4,
            ),
        ),
        dict(
            type='FCNHead',
            in_channels=64,
            channels=32,
            num_convs=1,
            num_classes=10,
            in_index=-3,
            norm_cfg=dict(
                type='BN',
            ),
            concat_input=False,
            align_corners=False,
            loss_decode=dict(
                type='DiceLoss',
                use_sigmoid=False,
                loss_weight=0.4,
            ),
        ),
    ],
)
optim_wrapper = dict(
    type='OptimWrapper',
    # Replace, rather than merge with, any inherited optim_wrapper.
    _delete_=True,
    optimizer=dict(
        type='AdamW',
        lr=0.0001,
        weight_decay=0.0005,
    ),
    # Clip the gradient L2 norm to 1.
    clip_grad=dict(
        max_norm=1,
        norm_type=2,
    ),
)
param_scheduler = [
    # Linear warmup for the first 10 epochs.
    dict(
        type='LinearLR',
        start_factor=1e-06,
        by_epoch=True,
        begin=0,
        end=10,
    ),
    # Polynomial decay until epoch 300.
    dict(
        type='PolyLR',
        power=0.9,
        begin=10,
        end=300,
        eta_min=1e-05,
        by_epoch=True,
    ),
]
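
The `param_scheduler` above chains a linear warmup (`LinearLR`, epochs 0 to 10) with a polynomial decay (`PolyLR`, epochs 10 to 300, `power=0.9`, floor `eta_min=1e-05`) on top of the AdamW base rate of `0.0001`. A rough closed-form sketch of the resulting learning-rate curve follows. The function `lr_at_epoch` is illustrative only: the schedulers in mmengine update the rate step by step and count steps slightly differently, so exact values near the boundaries may differ:

```python
def lr_at_epoch(epoch, base_lr=1e-4, start_factor=1e-6, warmup_end=10,
                total_end=300, power=0.9, eta_min=1e-5):
    """Approximate learning rate for linear warmup followed by poly decay."""
    if epoch < warmup_end:
        # Multiplier ramps linearly from start_factor to 1.0.
        factor = start_factor + (1.0 - start_factor) * epoch / warmup_end
        return base_lr * factor
    # Decay from base_lr towards eta_min over the remaining epochs.
    span = total_end - warmup_end
    progress = min(epoch - warmup_end, span) / span
    return (base_lr - eta_min) * (1.0 - progress) ** power + eta_min


for epoch in (0, 5, 10, 150, 300):
    print(epoch, lr_at_epoch(epoch))
```

Under these assumptions the rate starts near zero, reaches the base rate of `1e-4` at epoch 10, and ends at `eta_min` at epoch 300.
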

@@ -0,0 +1,122 @@
_base_ = [
    '../_base_/models/fast_scnn.py',
    '../_base_/datasets/publicdataset_cholecseg8k.py',
    '../_base_/default_runtime.py',
    '../_base_/schedules/schedule_300e_val1_check10.py',
]
norm_cfg = dict(
    type='BN',
)
crop_size = (512, 512)
data_preprocessor = dict(
    size=(512, 512),
    mean=[
        85.65740418979115,
        53.99282220050495,
        46.074045888534535,
    ],
    std=[
        72.24589167201978,
        56.76979155397199,
        49.056637115061775,
    ],
    bgr_to_rgb=False,
)
model = dict(
    data_preprocessor=dict(
        size=(512, 512),
        mean=[
            85.65740418979115,
            53.99282220050495,
            46.074045888534535,
        ],
        std=[
            72.24589167201978,
            56.76979155397199,
            49.056637115061775,
        ],
        bgr_to_rgb=False,
    ),
    decode_head=dict(
        loss_decode=dict(
            type='DiceLoss',
            use_sigmoid=False,
            loss_weight=1.0,
        ),
    ),
    # Two auxiliary FCN heads, each weighted 0.4 in the loss.
    auxiliary_head=[
        dict(
            type='FCNHead',
            in_channels=128,
            channels=32,
            num_convs=1,
            num_classes=13,
            in_index=-2,
            norm_cfg=dict(
                type='BN',
            ),
            concat_input=False,
            align_corners=False,
            loss_decode=dict(
                type='DiceLoss',
                use_sigmoid=False,
                loss_weight=0.4,
            ),
        ),
        dict(
            type='FCNHead',
            in_channels=64,
            channels=32,
            num_convs=1,
            num_classes=13,
            in_index=-3,
            norm_cfg=dict(
                type='BN',
            ),
            concat_input=False,
            align_corners=False,
            loss_decode=dict(
                type='DiceLoss',
                use_sigmoid=False,
                loss_weight=0.4,
            ),
        ),
    ],
)
optim_wrapper = dict(
    type='OptimWrapper',
    # Replace, rather than merge with, any inherited optim_wrapper.
    _delete_=True,
    optimizer=dict(
        type='AdamW',
        lr=0.0001,
        weight_decay=0.0005,
    ),
    # Clip the gradient L2 norm to 1.
    clip_grad=dict(
        max_norm=1,
        norm_type=2,
    ),
)
param_scheduler = [
    # Linear warmup for the first 10 epochs.
    dict(
        type='LinearLR',
        start_factor=1e-06,
        by_epoch=True,
        begin=0,
        end=10,
    ),
    # Polynomial decay until epoch 300.
    dict(
        type='PolyLR',
        power=0.9,
        begin=10,
        end=300,
        eta_min=1e-05,
        by_epoch=True,
    ),
]
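
The `mean` and `std` lists in `data_preprocessor` above are per-channel dataset statistics used to standardize each input pixel as `(value - mean) / std`. A minimal sketch of that step for a single pixel, assuming the statistics are in 0 to 255 pixel units and listed in the same channel order the images are loaded in (with `bgr_to_rgb=False` no channel swap is applied). The helper `normalize_pixel` is illustrative, not the mmsegmentation implementation:

```python
# Statistics copied from the config above (assumed to be in 0-255 units).
MEAN = [85.65740418979115, 53.99282220050495, 46.074045888534535]
STD = [72.24589167201978, 56.76979155397199, 49.056637115061775]


def normalize_pixel(pixel, mean=MEAN, std=STD):
    """Standardize one 3-channel pixel channel by channel."""
    return [(v - m) / s for v, m, s in zip(pixel, mean, std)]


# A pixel equal to the dataset mean maps to zeros.
print(normalize_pixel(MEAN))
# A pixel one standard deviation above the mean maps to roughly ones.
print(normalize_pixel([m + s for m, s in zip(MEAN, STD)]))
```
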

@@ -0,0 +1,122 @@
_base_ = [
    '../_base_/models/fast_scnn.py',
    '../_base_/datasets/publicdataset_dresden.py',
    '../_base_/default_runtime.py',
    '../_base_/schedules/schedule_300e_val1_check10.py',
]
norm_cfg = dict(
    type='BN',
)
crop_size = (512, 512)
data_preprocessor = dict(
    size=(512, 512),
    mean=[
        103.172638338208,
        61.44762740851152,
        51.407770213021976,
    ],
    std=[
        75.77031253622098,
        54.63616729031377,
        49.45572239497569,
    ],
    bgr_to_rgb=False,
)
model = dict(
    data_preprocessor=dict(
        size=(512, 512),
        mean=[
            103.172638338208,
            61.44762740851152,
            51.407770213021976,
        ],
        std=[
            75.77031253622098,
            54.63616729031377,
            49.45572239497569,
        ],
        bgr_to_rgb=False,
    ),
    decode_head=dict(
        loss_decode=dict(
            type='DiceLoss',
            use_sigmoid=False,
            loss_weight=1.0,
        ),
    ),
    # Two auxiliary FCN heads, each weighted 0.4 in the loss.
    auxiliary_head=[
        dict(
            type='FCNHead',
            in_channels=128,
            channels=32,
            num_convs=1,
            num_classes=11,
            in_index=-2,
            norm_cfg=dict(
                type='BN',
            ),
            concat_input=False,
            align_corners=False,
            loss_decode=dict(
                type='DiceLoss',
                use_sigmoid=False,
                loss_weight=0.4,
            ),
        ),
        dict(
            type='FCNHead',
            in_channels=64,
            channels=32,
            num_convs=1,
            num_classes=11,
            in_index=-3,
            norm_cfg=dict(
                type='BN',
            ),
            concat_input=False,
            align_corners=False,
            loss_decode=dict(
                type='DiceLoss',
                use_sigmoid=False,
                loss_weight=0.4,
            ),
        ),
    ],
)
optim_wrapper = dict(
    type='OptimWrapper',
    # Replace, rather than merge with, any inherited optim_wrapper.
    _delete_=True,
    optimizer=dict(
        type='AdamW',
        lr=0.0001,
        weight_decay=0.0005,
    ),
    # Clip the gradient L2 norm to 1.
    clip_grad=dict(
        max_norm=1,
        norm_type=2,
    ),
)
param_scheduler = [
    # Linear warmup for the first 10 epochs.
    dict(
        type='LinearLR',
        start_factor=1e-06,
        by_epoch=True,
        begin=0,
        end=10,
    ),
    # Polynomial decay until epoch 300.
    dict(
        type='PolyLR',
        power=0.9,
        begin=10,
        end=300,
        eta_min=1e-05,
        by_epoch=True,
    ),
]
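
Both the decode head and the auxiliary heads above train with `DiceLoss` and `use_sigmoid=False`, which is taken here to mean the class scores go through a softmax before the overlap is measured. The sketch below shows the textbook soft Dice loss under that assumption, averaged over classes. The names `softmax` and `dice_loss` and the smoothing constant are illustrative; the mmsegmentation implementation differs in details such as its denominator and handling of ignored labels:

```python
import math


def softmax(logits):
    """Numerically stable softmax over one pixel's class scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def dice_loss(logits_per_pixel, labels, num_classes, eps=1e-3):
    """Mean over classes of 1 - Dice, computed on softmax probabilities."""
    probs = [softmax(scores) for scores in logits_per_pixel]
    losses = []
    for c in range(num_classes):
        p = [pixel[c] for pixel in probs]
        t = [1.0 if y == c else 0.0 for y in labels]
        inter = sum(pi * ti for pi, ti in zip(p, t))
        dice = (2.0 * inter + eps) / (sum(p) + sum(t) + eps)
        losses.append(1.0 - dice)
    return sum(losses) / num_classes


# Two pixels, two classes: confident and correct vs. confident and wrong.
logits = [[20.0, 0.0], [0.0, 20.0]]
print(dice_loss(logits, [0, 1], 2))
print(dice_loss(logits, [1, 0], 2))
```

The first call gives a loss close to 0 and the second a loss close to 1, which is the behaviour that makes Dice loss less sensitive to class imbalance than a per-pixel average.
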

@@ -0,0 +1,122 @@
_base_ = [
    '../_base_/models/fast_scnn.py',
    '../_base_/datasets/publicdataset_endovis_2017.py',
    '../_base_/default_runtime.py',
    '../_base_/schedules/schedule_300e_val1_check10.py',
]
norm_cfg = dict(
    type='BN',
)
crop_size = (512, 512)
data_preprocessor = dict(
    size=(512, 512),
    mean=[
        122.21429912990676,
        77.0821859677977,
        87.03836664626716,
    ],
    std=[
        50.53335800365262,
        42.895340354037465,
        47.739426483390446,
    ],
    bgr_to_rgb=False,
)
model = dict(
    data_preprocessor=dict(
        size=(512, 512),
        mean=[
            122.21429912990676,
            77.0821859677977,
            87.03836664626716,
        ],
        std=[
            50.53335800365262,
            42.895340354037465,
            47.739426483390446,
        ],
        bgr_to_rgb=False,
    ),
    decode_head=dict(
        loss_decode=dict(
            type='DiceLoss',
            use_sigmoid=False,
            loss_weight=1.0,
        ),
    ),
    # Two auxiliary FCN heads, each weighted 0.4 in the loss.
    auxiliary_head=[
        dict(
            type='FCNHead',
            in_channels=128,
            channels=32,
            num_convs=1,
            num_classes=8,
            in_index=-2,
            norm_cfg=dict(
                type='BN',
            ),
            concat_input=False,
            align_corners=False,
            loss_decode=dict(
                type='DiceLoss',
                use_sigmoid=False,
                loss_weight=0.4,
            ),
        ),
        dict(
            type='FCNHead',
            in_channels=64,
            channels=32,
            num_convs=1,
            num_classes=8,
            in_index=-3,
            norm_cfg=dict(
                type='BN',
            ),
            concat_input=False,
            align_corners=False,
            loss_decode=dict(
                type='DiceLoss',
                use_sigmoid=False,
                loss_weight=0.4,
            ),
        ),
    ],
)
optim_wrapper = dict(
    type='OptimWrapper',
    # Replace, rather than merge with, any inherited optim_wrapper.
    _delete_=True,
    optimizer=dict(
        type='AdamW',
        lr=0.0001,
        weight_decay=0.0005,
    ),
    # Clip the gradient L2 norm to 1.
    clip_grad=dict(
        max_norm=1,
        norm_type=2,
    ),
)
param_scheduler = [
    # Linear warmup for the first 10 epochs.
    dict(
        type='LinearLR',
        start_factor=1e-06,
        by_epoch=True,
        begin=0,
        end=10,
    ),
    # Polynomial decay until epoch 300.
    dict(
        type='PolyLR',
        power=0.9,
        begin=10,
        end=300,
        eta_min=1e-05,
        by_epoch=True,
    ),
]
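
The `loss_weight` values above set how much each head contributes to the training objective: 1.0 for the decode head and 0.4 for each of the two auxiliary heads. A tiny sketch of that weighting, assuming the total objective is the plain sum of the weighted per-head losses. The dictionary keys and the function `total_loss` are illustrative names, not identifiers from mmsegmentation:

```python
# Weights taken from the config above; the key names are hypothetical.
LOSS_WEIGHTS = {'decode': 1.0, 'aux_0': 0.4, 'aux_1': 0.4}


def total_loss(raw_losses, weights=LOSS_WEIGHTS):
    """Sum the unweighted per-head losses after applying their weights."""
    return sum(weights[name] * value for name, value in raw_losses.items())


# If every head reports an unweighted Dice loss of 0.5:
print(total_loss({'decode': 0.5, 'aux_0': 0.5, 'aux_1': 0.5}))
```

With equal raw losses the auxiliary heads together contribute 0.8 of the decode head's share, so they act as a regularizer rather than dominating training.
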

@@ -0,0 +1,122 @@
_base_ = [
    '../_base_/models/fast_scnn.py',
    '../_base_/datasets/publicdataset_endovis_2018.py',
    '../_base_/default_runtime.py',
    '../_base_/schedules/schedule_300e_val1_check10.py',
]
norm_cfg = dict(
    type='BN',
)
crop_size = (512, 512)
data_preprocessor = dict(
    size=(512, 512),
    mean=[
        122.21429912990676,
        77.0821859677977,
        87.03836664626716,
    ],
    std=[
        50.53335800365262,
        42.895340354037465,
        47.739426483390446,
    ],
    bgr_to_rgb=False,
)
model = dict(
    data_preprocessor=dict(
        size=(512, 512),
        mean=[
            122.21429912990676,
            77.0821859677977,
            87.03836664626716,
        ],
        std=[
            50.53335800365262,
            42.895340354037465,
            47.739426483390446,
        ],
        bgr_to_rgb=False,
    ),
    decode_head=dict(
        loss_decode=dict(
            type='DiceLoss',
            use_sigmoid=False,
            loss_weight=1.0,
        ),
    ),
    # Two auxiliary FCN heads, each weighted 0.4 in the loss.
    auxiliary_head=[
        dict(
            type='FCNHead',
            in_channels=128,
            channels=32,
            num_convs=1,
            num_classes=8,
            in_index=-2,
            norm_cfg=dict(
                type='BN',
            ),
            concat_input=False,
            align_corners=False,
            loss_decode=dict(
                type='DiceLoss',
                use_sigmoid=False,
                loss_weight=0.4,
            ),
        ),
        dict(
            type='FCNHead',
            in_channels=64,
            channels=32,
            num_convs=1,
            num_classes=8,
            in_index=-3,
            norm_cfg=dict(
                type='BN',
            ),
            concat_input=False,
            align_corners=False,
            loss_decode=dict(
                type='DiceLoss',
                use_sigmoid=False,
                loss_weight=0.4,
            ),
        ),
    ],
)
optim_wrapper = dict(
    type='OptimWrapper',
    # Replace, rather than merge with, any inherited optim_wrapper.
    _delete_=True,
    optimizer=dict(
        type='AdamW',
        lr=0.0001,
        weight_decay=0.0005,
    ),
    # Clip the gradient L2 norm to 1.
    clip_grad=dict(
        max_norm=1,
        norm_type=2,
    ),
)
param_scheduler = [
    # Linear warmup for the first 10 epochs.
    dict(
        type='LinearLR',
        start_factor=1e-06,
        by_epoch=True,
        begin=0,
        end=10,
    ),
    # Polynomial decay until epoch 300.
    dict(
        type='PolyLR',
        power=0.9,
        begin=10,
        end=300,
        eta_min=1e-05,
        by_epoch=True,
    ),
]
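
The `_delete_=True` key in `optim_wrapper` matters because child configs are normally merged key by key into what `_base_` provides. If the base schedule defined an SGD optimizer, a plain merge would leave SGD-only keys such as `momentum` next to `type='AdamW'`. The sketch below imitates that merge rule in plain Python. The function `merge_config` and the base dictionary are hypothetical stand-ins, not mmengine code and not the contents of the actual base schedule:

```python
def merge_config(base, child):
    """Merge child into base recursively; `_delete_=True` drops the base dict."""
    if child.get('_delete_', False):
        return {k: v for k, v in child.items() if k != '_delete_'}
    merged = dict(base)
    for key, value in child.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = value
    return merged


# Hypothetical base schedule with an SGD optimizer.
base = {'optim_wrapper': {
    'type': 'OptimWrapper',
    'optimizer': {'type': 'SGD', 'lr': 0.01, 'momentum': 0.9},
}}
adamw = {'type': 'AdamW', 'lr': 0.0001, 'weight_decay': 0.0005}

with_delete = merge_config(base, {'optim_wrapper': {
    'type': 'OptimWrapper', '_delete_': True, 'optimizer': dict(adamw)}})
without_delete = merge_config(base, {'optim_wrapper': {
    'type': 'OptimWrapper', 'optimizer': dict(adamw)}})

print(with_delete['optim_wrapper']['optimizer'])
print(without_delete['optim_wrapper']['optimizer'])
```

Only the variant with `_delete_` yields a clean AdamW configuration; the other keeps the stray `momentum` entry from the hypothetical base.
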

@@ -0,0 +1,134 @@
_base_ = [
    '../_base_/models/fast_scnn.py',
    '../_base_/datasets/my_dataset_model.py',
    '../_base_/default_runtime.py',
    '../_base_/schedules/schedule_40k_check_4000.py',
]
norm_cfg = dict(
    type='BN',
)
crop_size = (512, 1024)
data_preprocessor = dict(
    size=(512, 1024),
    mean=[
        94.94709810464303,
        61.72942233949928,
        75.93763705236906,
    ],
    std=[
        44.005506081132594,
        42.69595666984776,
        44.99354156225523,
    ],
    bgr_to_rgb=False,
)
model = dict(
    data_preprocessor=dict(
        size=(512, 1024),
        mean=[
            94.94709810464303,
            61.72942233949928,
            75.93763705236906,
        ],
        std=[
            44.005506081132594,
            42.69595666984776,
            44.99354156225523,
        ],
        bgr_to_rgb=False,
    ),
    decode_head=dict(
        loss_decode=dict(
            type='DiceLoss',
            use_sigmoid=False,
            loss_weight=1.0,
        ),
    ),
    # Two auxiliary FCN heads, each weighted 0.4 in the loss.
    auxiliary_head=[
        dict(
            type='FCNHead',
            in_channels=128,
            channels=32,
            num_convs=1,
            num_classes=19,
            in_index=-2,
            norm_cfg=dict(
                type='SyncBN',
                requires_grad=True,
                momentum=0.01,
            ),
            concat_input=False,
            align_corners=False,
            loss_decode=dict(
                type='DiceLoss',
                use_sigmoid=False,
                loss_weight=0.4,
                # num_classes=36,
            ),
        ),
        dict(
            type='FCNHead',
            in_channels=64,
            channels=32,
            num_convs=1,
            num_classes=19,
            in_index=-3,
            norm_cfg=dict(
                type='SyncBN',
                requires_grad=True,
                momentum=0.01,
            ),
            concat_input=False,
            align_corners=False,
            loss_decode=dict(
                type='DiceLoss',
                use_sigmoid=False,
                loss_weight=0.4,
                # num_classes=36,
            ),
        ),
    ],
)
optim_wrapper = dict(
    type='OptimWrapper',
    # Replace, rather than merge with, any inherited optim_wrapper.
    _delete_=True,
    optimizer=dict(
        type='AdamW',
        lr=0.0001,
        weight_decay=0.0005,
    ),
    # Clip the gradient L2 norm to 1.
    clip_grad=dict(
        max_norm=1,
        norm_type=2,
    ),
)
param_scheduler = [
    # Linear warmup for the first 1500 iterations.
    dict(
        type='LinearLR',
        start_factor=1e-06,
        by_epoch=False,
        begin=0,
        end=1500,
    ),
    # Polynomial decay until iteration 40000.
    dict(
        type='PolyLR',
        power=0.9,
        begin=1500,
        end=40000,
        eta_min=1e-05,
        by_epoch=False,
    ),
]