This article describes in detail how to read and write LMDB files in Python.
The full name of LMDB is Lightning Memory-Mapped Database. Its on-disk structure is simple: a folder containing a data file (data.mdb) and a lock file (lock.mdb).
LMDB files can be opened by multiple processes at the same time. Data access is fast and simple, and no separate database management process needs to run: any code that uses the LMDB library and knows the file path can access the database.
Accessing a large number of small files is very expensive for the operating system. LMDB accesses its file through memory mapping instead, so locating a record inside the file is very cheap and amounts to a pointer operation. Keeping everything in a single database file also reduces the overhead of copying or transferring the data.
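The idea behind this can be illustrated with Python's standard `mmap` module. The sketch below is not LMDB's implementation; it only shows, under simplified assumptions, how many small records packed into one file can be read back by offset through a memory map. The helper names `pack_records` and `read_record` and the file name `packed.bin` are hypothetical.

```python
import mmap
import os
import tempfile


def pack_records(path, records):
    """Write all payloads back to back into one file.

    Returns a dict mapping each key to its (offset, length) in the file.
    """
    index = {}
    offset = 0
    with open(path, "wb") as f:
        for key, payload in records.items():
            f.write(payload)
            index[key] = (offset, len(payload))
            offset += len(payload)
    return index


def read_record(mapped, index, key):
    """Fetch one record by slicing the memory map at its stored offset."""
    offset, length = index[key]
    return mapped[offset:offset + length]


# Three small "files" stored in a single packed file (illustrative data).
records = {"a.txt": b"alpha", "b.txt": b"bravo!", "c.txt": b"charlie"}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "packed.bin")
    index = pack_records(path, records)
    with open(path, "rb") as f:
        # Map the whole file read-only; no explicit read() calls are needed.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            second = read_record(mapped, index, "b.txt")

print(second)  # prints b'bravo!'
```

Here a lookup is a dictionary access followed by a slice of the mapped region, which is roughly what "addressing by pointer operation" means in practice. LMDB keeps its index inside the database file as well, so no separate in-memory dictionary is needed there.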
Using LMDB in Python: on Linux, you can install the LMDB package with the command 'pip install lmdb'.
- Generate an empty lmdb database file
    # -*- coding: utf-8 -*-
    import lmdb

    # If the train folder does not contain data.mdb and lock.mdb, empty files
    # are created; if they already exist, they are not overwritten.
    # map_size is the maximum database size in bytes; the value below is 1 TiB.
    env = lmdb.open("./train", map_size=1099511627776)
    env.close()
- Addition, modification and deletion of LMDB data
    # -*- coding: utf-8 -*-
    import lmdb

    # map_size is the maximum database size in bytes; the value below is 1 TiB.
    env = lmdb.open("./train", map_size=1099511627776)
    # write=True is required for a transaction that modifies data
    txn = env.begin(write=True)

    # Add key/value pairs (in Python 3, keys and values must be bytes)
    txn.put(key=b'1', value=b'aaa')
    txn.put(key=b'2', value=b'bbb')
    txn.put(key=b'3', value=b'ccc')

    # Delete a record by key
    txn.delete(key=b'1')

    # Modify a record: put() on an existing key overwrites its value
    txn.put(key=b'3', value=b'ddd')

    # Commit the changes with commit()
    txn.commit()
    env.close()
- Query lmdb database content
    # -*- coding: utf-8 -*-
    import lmdb

    env = lmdb.open("./train")

    # ---- Add, modify, delete data ----
    # write must be set to True before data can be written
    txn = env.begin(write=True)
    # Add key/value pairs (in Python 3, keys and values must be bytes)
    txn.put(key=b'1', value=b'aaa')
    txn.put(key=b'2', value=b'bbb')
    txn.put(key=b'3', value=b'ccc')
    # Delete a record by key
    txn.delete(key=b'1')
    # Modify a record
    txn.put(key=b'3', value=b'ddd')
    # Commit the changes with commit()
    txn.commit()

    # ---- Query lmdb data ----
    # A transaction opened without write=True is read-only
    txn = env.begin()
    # get() looks up a record by key
    print(txn.get(b'2'))
    # Traverse all key/value pairs with cursor()
    for key, value in txn.cursor():
        print(key, value)

    env.close()
- Read the contents of an existing .mdb file
    # -*- coding: utf-8 -*-
    import lmdb

    env_db = lmdb.Environment('trainC')
    # Alternatively: env_db = lmdb.open("./trainC")
    txn = env_db.begin()

    # get() looks up a record by key and returns None if no record
    # exists for that key.
    print(txn.get(b'200'))

    # Iterate over all key/value pairs
    for key, value in txn.cursor():
        print(key, value)

    env_db.close()
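One detail worth knowing when traversing with cursor(): by default LMDB orders keys as byte strings, so numeric keys written as plain text (such as b'200') do not come back in numeric order. The sketch below uses only the standard library, with `sorted()` on bytes standing in for LMDB's default byte-wise ordering, to compare three possible key encodings. The helper names and the 8-digit padding width are illustrative assumptions.

```python
import struct


def text_key(n):
    """Plain decimal text, e.g. 10 -> b'10'."""
    return str(n).encode("ascii")


def padded_key(n, width=8):
    """Zero-padded decimal text, e.g. 10 -> b'00000010'."""
    return str(n).zfill(width).encode("ascii")


def packed_key(n):
    """Fixed-width big-endian unsigned 64-bit integer."""
    return struct.pack(">Q", n)


ids = [1, 2, 10, 200]

# Sort the encoded keys byte-wise, then decode them back to integers.
text_order = [int(k.decode("ascii")) for k in sorted(text_key(n) for n in ids)]
padded_order = [int(k.decode("ascii")) for k in sorted(padded_key(n) for n in ids)]
packed_order = [struct.unpack(">Q", k)[0] for k in sorted(packed_key(n) for n in ids)]

print(text_order)    # prints [1, 10, 2, 200]
print(padded_order)  # prints [1, 2, 10, 200]
print(packed_order)  # prints [1, 2, 10, 200]
```

If records need to be read back in numeric order, a fixed-width key encoding such as zero padding or big-endian packing is one way to get that.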