
I'm uploading potentially large files to a web server. Currently I'm doing this:

import urllib2

f = open('somelargefile.zip', 'rb')
request = urllib2.Request(url, f.read())
request.add_header("Content-Type", "application/zip")
response = urllib2.urlopen(request)

However, this reads the entire file's contents into memory before posting it. How can I have it stream the file to the server?


6 Answers


Reading through the mailing list thread linked to by systempuntoout, I found a clue to the solution.

The mmap module allows you to open a file so that it acts like a string. Parts of the file are loaded into memory on demand.

Here's the code I'm using now:

import urllib2
import mmap

# Open the file as a memory-mapped string. It looks like a string,
# but actually reads the file behind the scenes, on demand.
f = open('somelargefile.zip', 'rb')
mmapped_file_as_string = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)

# Do the request
request = urllib2.Request(url, mmapped_file_as_string)
request.add_header("Content-Type", "application/zip")
response = urllib2.urlopen(request)

# Close everything
mmapped_file_as_string.close()
f.close()

1 Comment

Could you please confirm that the line below is correct: request = urllib2.Request(url, mmapped_file_as_string)

The documentation doesn't say you can do this, but the code in urllib2 (and httplib) accepts any object with a read() method as data. So using an open file seems to do the trick.

You'll need to set the Content-Length header yourself. If it's not set, urllib2 will call len() on the data, which file objects don't support.

import os.path
import urllib2

data = open(filename, 'rb')  # binary mode, since this is a raw upload
headers = {'Content-Length': str(os.path.getsize(filename))}
request = urllib2.Request(url, data, headers)
response = urllib2.urlopen(request)

This is the relevant code that handles the data you supply. It's from the HTTPConnection class in httplib.py in Python 2.7:

def send(self, data):
    """Send `data' to the server."""
    if self.sock is None:
        if self.auto_open:
            self.connect()
        else:
            raise NotConnected()

    if self.debuglevel > 0:
        print "send:", repr(data)
    blocksize = 8192
    if hasattr(data,'read') and not isinstance(data, array):
        if self.debuglevel > 0: print "sendIng a read()able"
        datablock = data.read(blocksize)
        while datablock:
            self.sock.sendall(datablock)
            datablock = data.read(blocksize)
    else:
        self.sock.sendall(data)

2 Comments

Note that urllib2.urlopen() doesn't take headers as a parameter, which is why they're passed through a Request object here. I have provided working code in an answer below.
Is this possible with the requests module? I have to send files in chunks (10 MB), but I don't want to read the whole 10 MB into memory; I want to read a few bytes (8192) at a time and hand them to requests until the full 10 MB is sent.
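
On that last question: requests switches to chunked transfer encoding when you pass a generator as the body, so only one block is ever held in memory. Here is a minimal sketch, assuming the server accepts chunked requests; the URL, filename, and block size are illustrative:

import requests

def read_in_chunks(fp, blocksize=8192):
    # Yield successive blocks from an open file until EOF.
    while True:
        block = fp.read(blocksize)
        if not block:
            break
        yield block

with open('somelargefile.zip', 'rb') as f:
    # A generator has no len(), so requests sends the body with
    # Transfer-Encoding: chunked, one block at a time.
    response = requests.post('http://example.com/upload', data=read_in_chunks(f))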

Have you tried Mechanize?

from mechanize import Browser

br = Browser()
br.open(url)
br.select_form(nr=0)  # a form must be selected before add_file() can be used
br.form.add_file(open('largefile.zip', 'rb'), 'application/zip', 'largefile.zip')
br.submit()

or, if you don't want to use multipart/form-data, check this old post.

It suggests two options:

  1. Use mmap, a memory-mapped file object
  2. Patch httplib.HTTPConnection.send (sketched below)
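
For Pythons older than 2.7, option #2 amounts to replacing httplib.HTTPConnection.send with a version that streams read()able objects, mirroring what 2.7 already does (see the send() listing in the answer above). A rough sketch, not a tested drop-in patch:

from array import array
import httplib

def _streaming_send(self, data):
    # Same connection handling as the stock send().
    if self.sock is None:
        if self.auto_open:
            self.connect()
        else:
            raise httplib.NotConnected()
    # Stream anything with a read() method in 8192-byte blocks.
    # array objects also have a read() method, hence the exclusion.
    if hasattr(data, 'read') and not isinstance(data, array):
        blocksize = 8192
        datablock = data.read(blocksize)
        while datablock:
            self.sock.sendall(datablock)
            datablock = data.read(blocksize)
    else:
        self.sock.sendall(data)

httplib.HTTPConnection.send = _streaming_send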

2 Comments

I don't want to send the files encoded as multipart/form-data, which this seems to do. I'm just looking for a raw POST.
On Python 2.7, option #2 has already been patched in, and the block size is 8192. I wonder why. What's the norm/standard for this?

Try pycurl. I don't have anything set up that will accept a large file outside of a multipart/form-data POST, but here's a simple example that reads the file as needed.

import os
import pycurl

class FileReader:
    """Feeds an open file to pycurl one block at a time."""
    def __init__(self, fp):
        self.fp = fp
    def read_callback(self, size):
        return self.fp.read(size)

c = pycurl.Curl()
c.setopt(pycurl.URL, url)
c.setopt(pycurl.UPLOAD, 1)  # upload mode: the body is pulled from READFUNCTION
c.setopt(pycurl.READFUNCTION, FileReader(open(filename, 'rb')).read_callback)
filesize = os.path.getsize(filename)
c.setopt(pycurl.INFILESIZE, filesize)
c.perform()
c.close()

1 Comment

Thanks JimB. I'd have used this, except a few people use this on Windows, and I don't want them to have to install anything else.

Using the requests library you can do

import requests

with open('massive-body', 'rb') as f:
    requests.post('http://some.url/streamed', data=f)

as mentioned here in their docs

1 Comment

The 8K block size still applies, since httplib.py's send() (L#869) is called.

Below is a working example for both Python 2 and Python 3:

import os

try:
    from urllib2 import urlopen, Request          # Python 2
except ImportError:
    from urllib.request import urlopen, Request   # Python 3

headers = {'Content-Length': str(os.path.getsize(filepath))}
with open(filepath, 'rb') as f:
    req = Request(url, data=f, headers=headers)
    result = urlopen(req).read().decode()

The requests module is great, but sometimes you cannot install any extra modules...

