https://blog.csdn.net/littlely_ll/article/details/82706874
1. Import libraries
import datetime

import pandas as pd
import sqlalchemy
from sqlalchemy import create_engine
from sqlalchemy.sql import text, func
from sqlalchemy.orm import sessionmaker, relationship
from sqlalchemy.ext.declarative import declarative_base
# ForeignKey is needed by the Book class in section 4.4
from sqlalchemy import Column, String, Integer, Float, TIMESTAMP, ForeignKey
2. Create database engine
# create_engine('dialect+driver://username:password@host:port/database')
# The string below is a placeholder; replace it with a real Oracle or MySQL URL
engine = create_engine("oracle or mysql", echo=False)
conn = engine.connect()
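The URL format in the comment above can be sketched with the standard library alone. The helper name build_db_url and every credential below are hypothetical, chosen only for illustration; the point is that special characters in a user name or password should be percent-encoded before they go into the URL.

```python
from urllib.parse import quote_plus


def build_db_url(dialect, driver, user, password, host, port, database):
    """Assemble 'dialect+driver://user:password@host:port/database'."""
    # Percent-encode the credentials so characters such as '@' or '/'
    # inside a password are not mistaken for URL separators.
    return (
        f"{dialect}+{driver}://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )


# Hypothetical values, for illustration only.
url = build_db_url("mysql", "pymysql", "app_user", "p@ss/word", "localhost", 3306, "school")
print(url)
```

The resulting string is what would be passed to create_engine in place of the placeholder.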
3. Use native sql statements
# Extract data using the connected engine; wrapping the statement in text() also works on SQLAlchemy 2.0
df_list = conn.execute(text("select * from dual"))
# Get a single row
df_list.fetchone()
# Get all remaining rows as a list of tuple-like rows
rows = df_list.fetchall()
# Convert to a DataFrame; use the fetched rows, because the result cursor is already consumed
df = pd.DataFrame(rows)
# Column names default to sequence numbers, so set them explicitly
df.columns = ['your col_name',...]
# Use pandas to read the data directly
df1 = pd.read_sql("select * from dual", con=engine)
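Instead of typing column names by hand, they can usually be read from the result object. The following is a minimal sketch that assumes SQLAlchemy 1.4 or later and uses an in-memory SQLite database as a stand-in for Oracle or MySQL; the query and its values are made up for illustration.

```python
from sqlalchemy import create_engine, text

# In-memory SQLite stands in for the real database (an assumption for this sketch).
engine = create_engine("sqlite://")

with engine.connect() as conn:
    result = conn.execute(text("select 1 as id, 'lily' as name"))
    # keys() exposes the column names reported by the database
    columns = list(result.keys())
    # fetchall() returns row objects; convert them to plain tuples
    rows = [tuple(row) for row in result.fetchall()]

print(columns, rows)
```

With these two lists, pd.DataFrame(rows, columns=columns) yields a DataFrame whose columns are already named.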
4. Use object relational mapping (ORM)
4.1 Declare the mapping
Base = declarative_base()
DBSession = sessionmaker(bind=engine)
session = DBSession()

# The Student class corresponds to a table created in SQL
class Student(Base):
    __tablename__ = "student"

    id = Column(Integer, primary_key=True)
    name = Column(String(20))
    sex = Column(String(2))
    clas = Column(String(20))
    grade = Column(Float(32))

    def __repr__(self):
        return "<Student(id='%s', name='%s', sex='%s',clas='%s',grade='%s')>" % (
            self.id, self.name, self.sex, self.clas, self.grade)
When you create the Student class with declarative, it must have a __tablename__ attribute and at least one column marked as a primary key; otherwise an error is raised.
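Both failure modes can be reproduced without a database, because the mapping is checked when the class is defined. This is a small sketch assuming SQLAlchemy 1.4 or later (where declarative_base is importable from sqlalchemy.orm); the class names NoTableName and NoPrimaryKey are hypothetical.

```python
from sqlalchemy import Column, Integer, String
from sqlalchemy.exc import SQLAlchemyError
from sqlalchemy.orm import declarative_base

Base = declarative_base()
errors = {}

try:
    # Missing __tablename__: declarative cannot tell which table to map.
    class NoTableName(Base):
        id = Column(Integer, primary_key=True)
except SQLAlchemyError as exc:
    errors["no_tablename"] = type(exc).__name__

try:
    # Missing primary key: the mapper cannot identify rows.
    class NoPrimaryKey(Base):
        __tablename__ = "no_primary_key"
        name = Column(String(20))
except SQLAlchemyError as exc:
    errors["no_primary_key"] = type(exc).__name__

print(errors)
```

Each entry in errors records the exception class raised for the corresponding mistake.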
4.2 Create tables
# Base.metadata.create_all finds all subclasses of Base and creates their tables in the database, which is equivalent to 'create table'
Base.metadata.create_all(engine)
# Similarly, Base.metadata.drop_all(engine) drops those tables
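To check what create_all and drop_all actually did, the table names can be listed through the inspector. The sketch below assumes SQLAlchemy 1.4 or later and uses an in-memory SQLite database in place of Oracle or MySQL; the Student class is trimmed to two columns for brevity.

```python
from sqlalchemy import Column, Integer, String, create_engine, inspect
from sqlalchemy.orm import declarative_base

Base = declarative_base()


class Student(Base):
    __tablename__ = "student"
    id = Column(Integer, primary_key=True)
    name = Column(String(20))


# In-memory SQLite stands in for the real database (an assumption for this sketch).
engine = create_engine("sqlite://")

Base.metadata.create_all(engine)
tables_after_create = inspect(engine).get_table_names()

Base.metadata.drop_all(engine)
tables_after_drop = inspect(engine).get_table_names()

print(tables_after_create, tables_after_drop)
```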
4.3 Insert data
# Insert a single row
new_student = Student(name='lily', sex='F', clas=2, grade=97)
# add() only stages the object; nothing is committed yet, so you can still roll back if something goes wrong
session.add(new_student)
# Commit to the database
session.commit()

# Roll back an insert
new_student2 = Student(name='lily', sex='F', clas=2, grade=97)
session.add(new_student2)
session.rollback()

# Insert a batch of rows, using a pandas DataFrame as an example
df = pd.DataFrame({"name": ['Mike', 'John', 'Mary'], "sex": ['M', 'M', 'F'], 'clas': [4, 7, 1], 'grade': [78, 96, 85]})
# First method: pandas to_sql
# Mind the if_exists parameter: with 'replace', to_sql drops the table, recreates it, and then inserts the data
df.to_sql('student', con=engine, if_exists='append', index=False)
pandas's to_sql is generally quite fast, but inserts can become very slow when the table has a primary key (in a test against Oracle, inserting 1 million rows took several hours).
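# Second method (SQLAlchemy Core insert): build a list of row dictionaries
a = []
for i in df.iterrows():
    a.append(dict(i[1]))
# engine.execute() was removed in SQLAlchemy 2.0; engine.begin() works on old and new versions and commits on exit
with engine.begin() as connection:
    connection.execute(Student.__table__.insert(), a)
# There are other ways as well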
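The add, commit, rollback, and bulk-insert steps of this section can be tried end to end without Oracle or MySQL. The following is a minimal sketch assuming SQLAlchemy 1.4 or later and an in-memory SQLite database; the student named lucy is made up to show a rolled-back row, and plain dictionaries replace the DataFrame so that pandas is not required.

```python
from sqlalchemy import Column, Float, Integer, String, create_engine, text
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()


class Student(Base):
    __tablename__ = "student"
    id = Column(Integer, primary_key=True)
    name = Column(String(20))
    sex = Column(String(2))
    clas = Column(String(20))
    grade = Column(Float)


engine = create_engine("sqlite://")  # stand-in database for this sketch
Base.metadata.create_all(engine)
Session = sessionmaker(bind=engine)

with Session() as session:
    session.add(Student(name="lily", sex="F", clas="2", grade=97))
    session.commit()  # this row is persisted
    session.add(Student(name="lucy", sex="F", clas="3", grade=88))
    session.rollback()  # this row is discarded

# Bulk insert with Core: one statement, a list of row dictionaries
records = [
    {"name": "Mike", "sex": "M", "clas": "4", "grade": 78},
    {"name": "John", "sex": "M", "clas": "7", "grade": 96},
    {"name": "Mary", "sex": "F", "clas": "1", "grade": 85},
]
with engine.begin() as conn:  # commits automatically on success
    conn.execute(Student.__table__.insert(), records)

with engine.connect() as conn:
    total = conn.execute(text("select count(*) from student")).scalar()
    names = [row[0] for row in conn.execute(text("select name from student order by id"))]

print(total, names)
```

Only the committed row and the three bulk-inserted rows remain; the rolled-back row never reaches the table.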
4.4 Create an associated table
class Book(Base):
    __tablename__ = 'book'

    bid = Column(Integer, primary_key=True)
    bname = Column(String(20))
    price = Column(Float(32))
    # Foreign key referencing student.id (ForeignKey must be imported from sqlalchemy)
    student_id = Column(Integer, ForeignKey(Student.id))
    # Tell the ORM to associate the Book class with the Student class
    student = relationship(Student)
    insert_time = Column(TIMESTAMP(timezone=False), nullable=False, server_default=text("NOW()"))

    def __repr__(self):
        return "<Book(bid='%s', bname='%s', price='%s',student_id='%s')>" % (
            self.bid, self.bname, self.price, self.student_id)

Base.metadata.create_all(engine)

books = pd.DataFrame({"bname": ['gone with wind', 'good by', 'game of throne', 'king of ring'], "price": [128, 22, 67, 190], 'student_id': [1, 1, 3, 2]})
books.to_sql('book', engine, if_exists='append', index=False)
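What the relationship buys you can be seen by navigating from a book to its student. This is a sketch under assumptions: SQLAlchemy 1.4 or later, an in-memory SQLite database, trimmed-down classes without the timestamp column, and the string form ForeignKey("student.id"), which names the same target column as the class attribute used above.

```python
from sqlalchemy import Column, Float, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, relationship, sessionmaker

Base = declarative_base()


class Student(Base):
    __tablename__ = "student"
    id = Column(Integer, primary_key=True)
    name = Column(String(20))


class Book(Base):
    __tablename__ = "book"
    bid = Column(Integer, primary_key=True)
    bname = Column(String(20))
    price = Column(Float)
    # "table.column" string form of the foreign key target
    student_id = Column(Integer, ForeignKey("student.id"))
    student = relationship(Student)


engine = create_engine("sqlite://")  # stand-in database for this sketch
Base.metadata.create_all(engine)
Session = sessionmaker(bind=engine)

with Session() as session:
    lily = Student(name="lily")
    # Assigning the related object is enough; the ORM fills in student_id on flush
    session.add(Book(bname="gone with wind", price=128, student=lily))
    session.commit()

    book = session.query(Book).filter_by(bname="gone with wind").one()
    owner_name = book.student.name
    owner_id = book.student_id

print(owner_name, owner_id)
```

Note that rows written with to_sql bypass the relationship, so student_id must then be supplied directly, as in the books DataFrame.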
When defining the insert_time field in this block, note the following:
If you set insert_time = Column(TIMESTAMP(timezone=False), default=func.now()), it does not work: in a MySQL test, the column's default value is still NULL.
If you set insert_time = Column(TIMESTAMP(timezone=False), nullable=False), MySQL automatically adds on update CURRENT_TIMESTAMP to the column's Extra attribute, so the value is refreshed on every update and the field can serve as the data update time.
If you set insert_time = Column(TIMESTAMP(timezone=False), nullable=False, server_default=text("NOW()")), or
insert_time = Column(TIMESTAMP(timezone=False), nullable=False, server_default=text("CURRENT_TIMESTAMP")), or
insert_time = Column(TIMESTAMP(timezone=False), server_default=func.now()), the field is not updated automatically, so it can serve as the data insertion time.
If you need both an update-time field and an insert-time field, define the update-time field before the insert-time field:
update_time = Column(TIMESTAMP(timezone=False), nullable=False), then
insert_time = Column(TIMESTAMP(timezone=False), nullable=False, server_default=text("NOW()")).
In the reverse order, an error occurs. Compare the following two figures:
[Figure: update_time defined before insert_time]
[Figure: update_time defined after insert_time]
When using TIMESTAMP, pay attention to the time zone. Although timezone defaults to False, set timezone=False explicitly if you do not want time zone support: in a test against Oracle, omitting the timezone=False parameter still produced a column with a time zone.
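One way to see why default=func.now() leaves the MySQL column without a default is to render the CREATE TABLE statement and look for a DEFAULT clause. The sketch below compiles the DDL against the MySQL dialect without connecting to a server; the table names book_client and book_server are hypothetical, and the exact DDL text may differ between SQLAlchemy versions.

```python
from sqlalchemy import Column, Integer, MetaData, Table, TIMESTAMP, func, text
from sqlalchemy.dialects import mysql
from sqlalchemy.schema import CreateTable

metadata = MetaData()

# Client-side default: SQLAlchemy supplies the value only in INSERTs it issues.
client_side = Table(
    "book_client", metadata,
    Column("bid", Integer, primary_key=True),
    Column("insert_time", TIMESTAMP(timezone=False), default=func.now()),
)

# Server-side default: the DEFAULT clause becomes part of the table definition.
server_side = Table(
    "book_server", metadata,
    Column("bid", Integer, primary_key=True),
    Column("insert_time", TIMESTAMP(timezone=False), nullable=False,
           server_default=text("NOW()")),
)


def mysql_ddl(table):
    """Render CREATE TABLE for MySQL without connecting to a server."""
    return str(CreateTable(table).compile(dialect=mysql.dialect()))


client_ddl = mysql_ddl(client_side)
server_ddl = mysql_ddl(server_side)
print(client_ddl)
print(server_ddl)
```

Only the second statement mentions NOW(), which is consistent with the MySQL observations listed above.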
5. Setting default values
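If you want the MySQL column itself to have a default value, using the default parameter when declaring the class's fields in SQLAlchemy does not work. The default parameter is a client-side default: SQLAlchemy fills it in when it issues an INSERT for the mapped table, and it is not written into the CREATE TABLE statement. For example: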
class Book(Base):
    __tablename__ = 'book'

    bid = Column(Integer, primary_key=True)
    bname = Column(String(20))
    price = Column(Float(32))
    # Foreign key referencing student.id
    student_id = Column(Integer, ForeignKey(Student.id))
    # Tell the ORM to associate the Book class with the Student class
    student = relationship(Student)
    insert_time = Column(TIMESTAMP(timezone=False), nullable=False, server_default=text("NOW()"))
    flag = Column(String(1), default=text('0'))
Compared with the previous Book class, this class adds a flag field, but using default does not set the default value 0 on the MySQL column, as shown in the following figure:
However, if you change the flag field's default parameter to server_default, the default value is set:
class Book(Base):
    __tablename__ = 'book'

    bid = Column(Integer, primary_key=True)
    bname = Column(String(20))
    price = Column(Float(32))
    # Foreign key referencing student.id
    student_id = Column(Integer, ForeignKey(Student.id))
    # Tell the ORM to associate the Book class with the Student class
    student = relationship(Student)
    insert_time = Column(TIMESTAMP(timezone=False), nullable=False, server_default=text("NOW()"))
    flag = Column(String(1), server_default=text('0'))
The structure of the book table in MySQL is then as follows:
Therefore, use the server_default parameter to set default values.
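The practical difference between the two parameters shows up when a row is written by something other than the mapped class. The sketch below assumes SQLAlchemy 1.4 or later and an in-memory SQLite database rather than MySQL; the column names flag_client and flag_server are hypothetical, and a raw SQL insert stands in for any writer that bypasses the mapped class.

```python
from sqlalchemy import Column, Integer, String, create_engine, text
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()


class Book(Base):
    __tablename__ = "book"
    bid = Column(Integer, primary_key=True)
    bname = Column(String(20))
    # Client-side default: applied by SQLAlchemy when it builds the INSERT
    flag_client = Column(String(1), default="0")
    # Server-side default: stored in the table definition itself
    flag_server = Column(String(1), server_default=text("'0'"))


engine = create_engine("sqlite://")  # stand-in database for this sketch
Base.metadata.create_all(engine)

# Row 1: raw SQL, which knows nothing about the Python-side default
with engine.begin() as conn:
    conn.execute(text("insert into book (bname) values ('raw insert')"))

# Row 2: ORM insert, where SQLAlchemy fills in flag_client itself
Session = sessionmaker(bind=engine)
with Session() as session:
    session.add(Book(bname="orm insert"))
    session.commit()

with engine.connect() as conn:
    result = conn.execute(
        text("select bname, flag_client, flag_server from book order by bid")
    )
    rows = [tuple(row) for row in result]

print(rows)
```

The raw insert leaves flag_client empty while flag_server is filled by the database, which is why server_default is the safer choice when data also arrives through other paths.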