How to import CSV file data into PostgreSQL table?

Keywords: Excel SQL Stored Procedure Database

How to write a stored procedure that imports data from a CSV file and populates tables?

#1 building

You can also use pgAdmin, which provides a GUI to import. Here it is. Show in SO thread . The advantage of using pgAdmin is that it is also applicable to remote databases.

However, much like the previous solution, you will need to have tables in the database. Everyone has their own solutions, but what I usually do is to open CSV in Excel, copy the title, paste special content with transposition characters on different worksheets, put the corresponding data type in the next column, and then copy and paste it into the text editor with the appropriate SQL table to create the query, as follows:

CREATE TABLE my_table (
    /*paste data from Excel here for example ... */
    col_1 bigint,
    col_2 bigint,
    /* ... */
    col_n bigint 
)

#2 building

A quick way is to use the python panda Library (version 0.15 or later works best). This will create columns for you - although obviously the choice of data type may not be what you want. If your requirements are not fully met, you can always use the create table code generated as a template.

This is a simple example:

import pandas as pd
df = pd.read_csv('mypath.csv')
df.columns = [c.lower() for c in df.columns] #postgres doesn't like capitals or spaces

from sqlalchemy import create_engine
engine = create_engine('postgresql://username:password@localhost:5432/dbname')

df.to_sql("my_table_name", engine)

Here are some codes that show you how to set various options:

# Set it so the raw sql output is logged
import logging
logging.basicConfig()
logging.getLogger('sqlalchemy.engine').setLevel(logging.INFO)

df.to_sql("my_table_name2", 
          engine, 
          if_exists="append",  #options are 'fail', 'replace', 'append', default 'fail'
          index=False, #Do not output the index of the dataframe
          dtype={'col1': sqlalchemy.types.NUMERIC,
                 'col2': sqlalchemy.types.String}) #Datatypes should be [sqlalchemy types][1]

#3 building

Take a look at this A short article .

The solution is explained as follows:

Create table:

CREATE TABLE zip_codes 
(ZIP char(5), LATITUDE double precision, LONGITUDE double precision, 
CITY varchar, STATE char(2), COUNTY varchar, ZIP_CLASS varchar);

To copy data from a CSV file to a table:

COPY zip_codes FROM '/path/to/csv/ZIP_CODES.txt' WITH (FORMAT csv);

#4 building

Most of the other solutions here require you to create tables in advance / manually. In some cases (for example, if there are many columns in the target table), this may not be practical. Therefore, the following methods may be useful.

Provides the path and number of columns of the csv file. You can use the following functions to load the table into a temporary table named target table:

The first row is assumed to have a column name.

create or replace function data.load_csv_file
(
    target_table text,
    csv_path text,
    col_count integer
)

returns void as $$

declare

iter integer; -- dummy integer to iterate columns with
col text; -- variable to keep the column name at each iteration
col_first text; -- first column name, e.g., top left corner on a csv file or spreadsheet

begin
    create table temp_table ();

    -- add just enough number of columns
    for iter in 1..col_count
    loop
        execute format('alter table temp_table add column col_%s text;', iter);
    end loop;

    -- copy the data from csv file
    execute format('copy temp_table from %L with delimiter '','' quote ''"'' csv ', csv_path);

    iter := 1;
    col_first := (select col_1 from temp_table limit 1);

    -- update the column names based on the first row which has the column names
    for col in execute format('select unnest(string_to_array(trim(temp_table::text, ''()''), '','')) from temp_table where col_1 = %L', col_first)
    loop
        execute format('alter table temp_table rename column col_%s to %s', iter, col);
        iter := iter + 1;
    end loop;

    -- delete the columns row
    execute format('delete from temp_table where %s = %L', col_first, col_first);

    -- change the temp table name to the name given as parameter, if not blank
    if length(target_table) > 0 then
        execute format('alter table temp_table rename to %I', target_table);
    end if;

end;

$$ language plpgsql;

#5 building

If you do not have permission to use copy (working on the db server), you can use \ \ copy (working on the db client) instead. Use the same example as Bozhidar Batsov:

Create table:

CREATE TABLE zip_codes 
(ZIP char(5), LATITUDE double precision, LONGITUDE double precision, 
CITY varchar, STATE char(2), COUNTY varchar, ZIP_CLASS varchar);

To copy data from a CSV file to a table:

\copy zip_codes FROM '/path/to/csv/ZIP_CODES.txt' DELIMITER ',' CSV

You can also specify which columns to read:

\copy zip_codes(ZIP,CITY,STATE) FROM '/path/to/csv/ZIP_CODES.txt' DELIMITER ',' CSV

Posted by RoBoTTo on Wed, 08 Jan 2020 03:11:35 -0800