Reading and Writing LMDB Files in Python

Keywords: Database Python Linux pip

This article describes in detail how to read and write LMDB files in Python.
LMDB stands for Lightning Memory-Mapped Database. Its on-disk structure is simple: a directory containing a data file (data.mdb) and a lock file (lock.mdb).

An LMDB database can be opened by several processes at the same time. Access is fast and simple, and no separate database management process has to run: any code that uses the LMDB library and knows the file path can access the data.
Accessing a large number of small files is expensive for the system. LMDB accesses its file through memory mapping, so addressing data inside the file is cheap and comes down to pointer operations. Keeping everything in a single database file also reduces the overhead of copying or transferring the data.
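To make the memory-mapping idea concrete, here is a minimal sketch that uses Python's standard mmap module rather than LMDB itself. The file name demo.bin and the helpers read_slice_via_mmap and demo are made up for illustration. It only shows that, once a file is mapped, reading a region is a slice at an offset with no explicit seek or read call; LMDB's internal page layout is not modelled here.

```python
import mmap
import os
import tempfile


def read_slice_via_mmap(path, offset, length):
    """Return `length` bytes starting at `offset`, read through a memory map."""
    with open(path, "rb") as f:
        # Length 0 maps the whole file; ACCESS_READ makes the mapping read-only.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # Slicing the map copies the requested bytes out of the mapped region.
            return mm[offset:offset + length]


def demo():
    """Write a small file, then read part of it back through the map."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.bin")  # hypothetical file name
        with open(path, "wb") as f:
            f.write(b"header--payload-bytes")
        return read_slice_via_mmap(path, 8, 7)


if __name__ == "__main__":
    print(demo())  # b'payload'
```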

Using LMDB in Python: on Linux, you can install the lmdb package with the command 'pip install lmdb'.

  1. Generate an empty lmdb database file
# -*- coding: utf-8 -*- 
import lmdb 
  
# If the train folder contains no data.mdb or lock.mdb file, empty ones are created; existing files are not overwritten.
# map_size sets the maximum size of the database in bytes; the value below is 1 TiB (2**40 bytes)
env = lmdb.open("./train",map_size=1099511627776) 
env.close()
  2. Addition, modification and deletion of LMDB data
# -*- coding: utf-8 -*- 
import lmdb 
  
# map_size sets the maximum size of the database in bytes; the value below is 1 TiB (2**40 bytes)
env = lmdb.open("./train", map_size=1099511627776) 
  
txn = env.begin(write=True) 
  
# Add key/value pairs (under Python 3, keys and values must be bytes)
txn.put(key=b'1', value=b'aaa')
txn.put(key=b'2', value=b'bbb')
txn.put(key=b'3', value=b'ccc')

# Delete a record by its key
txn.delete(key=b'1')

# Modify data: putting an existing key overwrites its value
txn.put(key=b'3', value=b'ddd')
  
# Commit changes through the commit() function 
txn.commit() 
env.close()
  3. Query lmdb database content
# -*- coding: utf-8 -*- 
import lmdb 
  
env = lmdb.open("./train") 
  
# write must be set to True for the transaction to allow writes
txn = env.begin(write=True)
############################################Add, modify, delete data

# Add key/value pairs (under Python 3, keys and values must be bytes)
txn.put(key=b'1', value=b'aaa')
txn.put(key=b'2', value=b'bbb')
txn.put(key=b'3', value=b'ccc')

# Delete a record by its key
txn.delete(key=b'1')

# Modify data: putting an existing key overwrites its value
txn.put(key=b'3', value=b'ddd')
  
# Commit changes through the commit() function 
txn.commit() 
############################################Query lmdb data 
txn = env.begin() 
  
# get() looks up a value by its key
print(txn.get(b'2'))

# Iterate over all key/value pairs with cursor()
for key, value in txn.cursor():
    print(key, value)
    
############################################ 
  
env.close()
  4. Read the contents of existing .mdb files
# -*- coding: utf-8 -*- 
import lmdb 
  
env_db = lmdb.Environment('trainC') 
# env_db = lmdb.open("./trainC") 
  
txn = env_db.begin() 
  
# get() looks up a value by its key and returns None if the key has no corresponding data.
print(txn.get(b'200'))

for key, value in txn.cursor():  # iterate over all entries
    print(key, value)
  
env_db.close()

Posted by ecco on Mon, 14 Oct 2019 07:23:57 -0700