Pickling is the process of converting a Python object (such as a list, a dictionary, or an instance of a class) into a byte stream so that it can be saved to a file or transmitted over a network.
In the example below, a Python dictionary is saved (pickled) to a file and then loaded (unpickled) back, and the restored dictionary is printed.
import pickle
data = {'name': 'Jenny', 'age': 25}
# The with statement closes each file automatically when its block ends.
with open('data.pkl', 'wb') as f:
    pickle.dump(data, f)
with open('data.pkl', 'rb') as f:
    loaded = pickle.load(f)
print(loaded)
Output
{'name': 'Jenny', 'age': 25}
Explanation:
- import pickle: Imports the pickle module, which handles serialization.
- data = {...}: Creates the Python object to be saved.
- with open('data.pkl', 'wb') as f: Opens data.pkl for writing in binary mode ('wb') and closes it when the block ends.
- pickle.dump(data, f): Serializes (pickles) the object and writes it to the file.
- with open('data.pkl', 'rb') as f: Reopens the file for reading in binary mode ('rb').
- loaded = pickle.load(f): Reads the file and deserializes (unpickles) its contents back into a Python object.
Object Serialization and Deserialization
Object serialization converts a Python object into a byte stream for storage or transmission, and deserialization restores an object from that stream. In Python, the pickle module handles both.
The byte stream produced during serialization can be stored in a file, in a database, or in memory. Later, deserialization converts the byte stream back into an object equal to the original.
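The round trip described above can be sketched in a few lines. This is a minimal illustration rather than a definitive pattern; the variable names original, stream, and restored are chosen here for the example. It shows that the serialized form is a bytes value and that deserialization builds a new object with the same contents.

```python
import pickle

original = {'name': 'Jenny', 'age': 25}

# Serialization: the dictionary becomes a bytes value.
stream = pickle.dumps(original)
print(type(stream))            # <class 'bytes'>

# Deserialization: the bytes become a new dictionary.
restored = pickle.loads(stream)
print(restored == original)    # True: same contents
print(restored is original)    # False: a separate object
```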
Example 1: Pickling without a File
In this example, we serialize a dictionary into a bytes object in memory using the pickle.dumps() function. The bytes are then deserialized back into a Python object using the pickle.loads() function.
import pickle
Leo = {'key': 'Leo', 'name': 'Leo Johnson',
       'age': 21, 'pay': 40000}
Harry = {'key': 'Harry', 'name': 'Harry Jenner',
         'age': 50, 'pay': 50000}
db = {}
db['Leo'] = Leo
db['Harry'] = Harry
b = pickle.dumps(db)
myEntry = pickle.loads(b)
print(myEntry)
Output
{'Leo': {'key': 'Leo', 'name': 'Leo Johnson', 'age': 21, 'pay': 40000}, 'Harry': {'key': 'Harry', 'name': 'Harry Jenner', 'age': 50, 'pay': 50000}}
Explanation:
- pickle.dumps(obj): Serializes obj to a bytes object instead of a file.
- pickle.loads(bytes_obj): Deserializes the bytes back into the original Python object.
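The file-based functions pickle.dump() and pickle.load() also accept any binary file-like object, so the in-memory approach can be written with io.BytesIO as well. The following is a small sketch under that assumption; the names record, buffer, and from_buffer are illustrative only.

```python
import io
import pickle

record = {'key': 'Leo', 'age': 21}

# An in-memory binary buffer stands in for a file on disk.
buffer = io.BytesIO()
pickle.dump(record, buffer)

# Rewind to the start of the buffer before reading it back.
buffer.seek(0)
from_buffer = pickle.load(buffer)

print(from_buffer)             # {'key': 'Leo', 'age': 21}
```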
Example 2: Pickling with a File
In this example, we first write the data to a pickle file using the pickle.dump() function. We then read the file back with the pickle.load() function and print its contents as a Python dictionary.
import pickle
Leo = {'key': 'Leo', 'name': 'Leo Johnson', 'age': 21, 'pay': 40000}
Harry = {'key': 'Harry', 'name': 'Harry Jenner', 'age': 50, 'pay': 50000}
db = {}
db['Leo'] = Leo
db['Harry'] = Harry
with open('examplePickle', 'ab') as dbfile:
    pickle.dump(db, dbfile)
with open('examplePickle', 'rb') as dbfile:
    try:
        while True:
            db = pickle.load(dbfile)
            for key in db:
                print(key, '=>', db[key])
    except EOFError:
        pass
Output
Leo => {'key': 'Leo', 'name': 'Leo Johnson', 'age': 21, 'pay': 40000}
Harry => {'key': 'Harry', 'name': 'Harry Jenner', 'age': 50, 'pay': 50000}
Explanation:
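- 'ab' mode: Append binary. Each run adds another pickled object to the end of the file instead of overwriting it, so the output above assumes examplePickle did not exist before the first run.
- 'rb' mode: Read binary.
- EOFError: Raised by pickle.load() when no more data is left in the file, which ends the loop that reads multiple pickled objects.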
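The read-until-EOFError loop can be wrapped in a generator so that callers simply iterate over the stored objects. This is one possible sketch, not the only way to structure it: iter_pickled and the file name records.pkl are hypothetical names introduced for illustration, and a temporary directory is used so the example does not touch existing files.

```python
import os
import pickle
import tempfile

def iter_pickled(path):
    """Yield each pickled object stored in the file at path, in order."""
    with open(path, 'rb') as f:
        while True:
            try:
                yield pickle.load(f)
            except EOFError:
                # No more pickled objects in the file.
                return

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'records.pkl')
    # Append two objects to the same file, one dump per open.
    for record in ({'key': 'Leo'}, {'key': 'Harry'}):
        with open(path, 'ab') as f:
            pickle.dump(record, f)
    records = list(iter_pickled(path))

print(records)                 # [{'key': 'Leo'}, {'key': 'Harry'}]
```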
Advantages
- Supports complex objects including user-defined classes
- Handles recursive objects (objects referencing themselves)
- Preserves shared references between objects
- Often faster than text-based formats like JSON for native Python objects, though the difference depends on the data and the pickle protocol used
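Two of the points above, recursive objects and shared references, can be checked directly. The sketch below uses plain dictionaries and lists to keep it short; the names leo, team, and restored are illustrative.

```python
import pickle

leo = {'name': 'Leo Johnson'}
leo['self'] = leo              # recursive: the dict refers to itself
team = [leo, leo]              # shared: both slots hold the same dict

restored = pickle.loads(pickle.dumps(team))

# Both list slots still point to one dict after the round trip.
print(restored[0] is restored[1])            # True
# The self-reference points at the restored dict, not the old one.
print(restored[0]['self'] is restored[0])    # True
print(restored[0] is leo)                    # False
```

A text format such as JSON would reject the self-referencing dictionary and would write the shared dictionary out twice.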
Disadvantages
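- Python-specific and version-sensitive: Pickle data is not designed to be read by other languages, and data written with a newer protocol (for example, protocol 5, added in Python 3.8) cannot be read by older Python versions.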
- Not human-readable: Pickle data is binary and cannot be easily edited.
- Security risk: Unpickling data from untrusted sources can execute malicious code.
- Not ideal for large datasets: Can be inefficient for very large data compared to other formats.
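One way to reduce the security risk is to subclass pickle.Unpickler and override find_class(), which is called whenever the pickle data asks for a class or function by name. The sketch below refuses every such lookup, so only plain data (numbers, strings, lists, dictionaries, and similar) can be loaded. PlainDataUnpickler and plain_loads are hypothetical names introduced here, and this limits rather than eliminates the risk; data from untrusted sources is still better handled with a format such as JSON.

```python
import io
import pickle

class PlainDataUnpickler(pickle.Unpickler):
    """Unpickler that refuses to look up any class or function."""

    def find_class(self, module, name):
        raise pickle.UnpicklingError(
            f"global '{module}.{name}' is forbidden")

def plain_loads(data):
    """Like pickle.loads(), but only for plain built-in data."""
    return PlainDataUnpickler(io.BytesIO(data)).load()

# Plain data loads normally.
print(plain_loads(pickle.dumps({'a': [1, 2]})))

# A pickle that refers to a function by name is rejected.
try:
    plain_loads(pickle.dumps(print))
    rejected = False
except pickle.UnpicklingError:
    rejected = True
print(rejected)
```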