@gpleiss (Contributor) commented Mar 21, 2017

See issue #97.

It's not super memory-optimized (i.e. there's a concatenation at every layer). This is consistent with the original Torch implementation, and it avoids some gross autograd hacks.
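
For readers skimming later, here is a minimal sketch of the pattern being described (module name and simplifications are mine, not the PR's): each layer concatenates its output onto its input, so a naive implementation performs one torch.cat per layer.

```python
import torch
import torch.nn as nn

class DenseLayer(nn.Module):
    """Sketch of a dense layer: BN -> ReLU -> 3x3 conv, with the input
    concatenated onto the output. A hypothetical simplification, not
    the exact module in this PR."""
    def __init__(self, in_channels, growth_rate):
        super(DenseLayer, self).__init__()
        self.norm = nn.BatchNorm2d(in_channels)
        self.relu = nn.ReLU(inplace=True)
        self.conv = nn.Conv2d(in_channels, growth_rate,
                              kernel_size=3, padding=1, bias=False)

    def forward(self, x):
        new_features = self.conv(self.relu(self.norm(x)))
        # One torch.cat per layer: simple and autograd-friendly, but
        # every concatenated intermediate tensor stays alive in memory.
        return torch.cat([x, new_features], 1)
```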

Pretrained models for the model zoo are available here. They were ported from the original LuaTorch implementation.

@soumith (Contributor) commented Mar 21, 2017

Awesome. Do the DenseNet pretrained models expect the same normalization as the rest of the models?

I'm going to upload your models to the pytorch s3 bucket so that you can make them available via pretrained=True.

@soumith (Contributor) commented Mar 21, 2017

You will have to rename the pretrained model files as described here:
http://pytorch.org/docs/model_zoo.html, i.e. filename-<sha256>.ext
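
A hedged sketch of that renaming step (assuming the convention of suffixing the first 8 hex digits of the file's SHA-256, as in densenet121-1e136e00.pth below; the helper name is mine):

```python
import hashlib
import shutil

def rename_for_model_zoo(path):
    """Rename e.g. 'densenet121.pth' to 'densenet121-<hash>.pth', where
    <hash> is the first 8 hex digits of the file's SHA-256 (the suffix
    the model zoo checks the download against)."""
    with open(path, 'rb') as f:
        digest = hashlib.sha256(f.read()).hexdigest()
    base, ext = path.rsplit('.', 1)
    new_path = '{}-{}.{}'.format(base, digest[:8], ext)
    shutil.move(path, new_path)
    return new_path
```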



@gpleiss (Contributor, Author) commented Mar 21, 2017

@soumith - yeah, same normalization as the other ImageNet models. I'll rename the files.
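
(For the record, that is the standard torchvision ImageNet preprocessing; note that transforms.Resize was still called transforms.Scale at the time.)

```python
from torchvision import transforms

# Standard ImageNet normalization shared by the torchvision pretrained models.
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])
```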

@gpleiss (Contributor, Author) commented Mar 21, 2017

@fmasa no lint errors anymore. Sorry for not reading the contributing guidelines earlier!
@soumith just renamed the files: https://drive.google.com/drive/folders/0B0Y2k_mEJpY9R3dSSGQ0YXhfa2c?usp=sharing

@soumith (Contributor) commented Mar 21, 2017

They are now uploaded to the bucket and available via URLs:

https://download.pytorch.org/models/densenet121-1e136e00.pth
https://download.pytorch.org/models/densenet*
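
(These can be fetched and verified with the model zoo helper, e.g.:)

```python
from torch.utils import model_zoo

# Downloads to the local model cache and checks the sha256 suffix.
state_dict = model_zoo.load_url(
    'https://download.pytorch.org/models/densenet121-1e136e00.pth')
```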

@fmasa commented Mar 22, 2017

@gpleiss I think you meant @fmassa 😁

@gpleiss (Contributor, Author) commented Mar 22, 2017

@soumith Sorry, the files I linked to earlier were serialized versions of the full models, not the model state dicts.

Here's a link to the correct files: https://drive.google.com/drive/folders/0B0Y2k_mEJpY9NXFBa1ktRUo3YlU?usp=sharing
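
(The distinction, sketched with hypothetical filenames:)

```python
import torch
from torchvision.models import densenet121

model = densenet121()

# What was linked at first: the whole pickled module object.
torch.save(model, 'densenet121_full.pth')

# What the model zoo expects: just the parameters, i.e. the state dict.
torch.save(model.state_dict(), 'densenet121.pth')

# Loading a state dict means constructing the architecture first.
model = densenet121()
model.load_state_dict(torch.load('densenet121.pth'))
```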

@soumith (Contributor) commented Mar 23, 2017

@gpleiss the new files have been uploaded to the same place i.e. https://download.pytorch.org/models/

@gpleiss (Contributor, Author) commented Mar 23, 2017

pretrained=True is now ready.
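
(Usage, for anyone arriving from search:)

```python
from torchvision import models

# Downloads the pretrained weights on first use, then loads them.
model = models.densenet121(pretrained=True)
model.eval()
```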

soumith merged commit 831ba8c into pytorch:master on Mar 23, 2017
@soumith (Contributor) commented Mar 23, 2017

Thank you! This is good stuff.

@trypag commented Mar 26, 2017

It's weird, I have never seen a DenseNet using this configuration for the first convolution. Checking other implementations from the authors, they used a 3x3 conv with stride=1, padding=1.
Is this design change your choice, or am I missing something?
To be clear, I am referring to this line: https://github.com/pytorch/vision/blob/master/torchvision/models/densenet.py#L128

@gpleiss (Contributor, Author) commented Mar 26, 2017

@trypag the CIFAR models use a 3x3 convolution for the first layer, but the ImageNet models use a 7x7 convolution. The authors' implementation covers only the CIFAR models; however, if you download their pretrained ImageNet models, the first layer is a 7x7 convolution.
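
(The two stems side by side, sketched; num_init_features follows the constructor default of 64:)

```python
import torch.nn as nn

num_init_features = 64

# ImageNet stem, as in this PR: 7x7 conv, stride 2 (followed by a max pool).
imagenet_stem = nn.Conv2d(3, num_init_features, kernel_size=7,
                          stride=2, padding=3, bias=False)

# CIFAR stem, as in the authors' CIFAR code: 3x3 conv, stride 1.
cifar_stem = nn.Conv2d(3, num_init_features, kernel_size=3,
                       stride=1, padding=1, bias=False)
```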

@trypag commented Mar 27, 2017

Alright thanks @gpleiss !

@farleylai commented Dec 6, 2017

The authors address the memory efficiency in a follow-up paper and an updated repo, using shared memory buffers and re-computation on the backward pass. The current PyTorch code just calls torch.cat() directly. Any plan to formalize the technique? DenseNet-based CNNs are likely to prosper in other applications.
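
(A hedged sketch of that idea in PyTorch terms, using torch.utils.checkpoint rather than the authors' shared-memory buffers; the helper name and attribute names are mine:)

```python
import torch
from torch.utils.checkpoint import checkpoint

def bn_function(norm, relu, conv, *inputs):
    """Concat + BN + ReLU + 1x1 conv; cheap to recompute on backward."""
    return conv(relu(norm(torch.cat(inputs, 1))))

# Inside a dense layer's forward: free the concatenated intermediates
# after the forward pass and recompute them during the backward pass.
# out = checkpoint(bn_function, self.norm1, self.relu1, self.conv1,
#                  *prev_features)
```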
