SoFunction
Updated on 2025-04-08

Batch inserting data in PostgreSQL using a stored procedure

Refer to the official documentation.

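-- Insert 10 million rows, one INSERT per loop iteration, in a single function call.
CREATE OR REPLACE FUNCTION creatData2() RETURNS boolean AS
$BODY$
BEGIN
  -- The integer FOR loop declares its own loop variable ii,
  -- so no separate DECLARE or initial assignment is needed.
  FOR ii IN 1..10000000 LOOP
    INSERT INTO ipm_model_history_data (res_model, res_id) VALUES (116, ii);
  END LOOP;
  RETURN true;
END;
$BODY$
LANGUAGE plpgsql;

SELECT * FROM creatData2() AS tab;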

Inserting 10 million rows this way took 610 seconds. Note that this was measured on a table with only a few columns.
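For comparison, the same rows can be produced by a single set-based statement instead of a row-by-row loop. The following is only a sketch: the two-column table definition is an assumption (the article does not show the real one), `IF NOT EXISTS` requires PostgreSQL 9.1 or later, and no timing was measured for this variant.

```sql
-- Assumed minimal table; the real ipm_model_history_data definition is not shown in the article.
CREATE TABLE IF NOT EXISTS ipm_model_history_data (
    res_model integer,
    res_id    integer
);

-- generate_series(1, 10000000) yields the integers 1..10000000,
-- so one INSERT ... SELECT writes every row in a single statement.
INSERT INTO ipm_model_history_data (res_model, res_id)
SELECT 116, g
FROM generate_series(1, 10000000) AS g;
```

Whether this is faster than the loop depends on the table's indexes, triggers, and server configuration, so it is worth timing both on the target system.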

Supplement: PostgreSQL stored procedure to update or insert data

The goal is to record a machine's CPU, memory, and disk usage over time. The data is displayed at one-minute granularity, but for accuracy the data source reports a sample every 6 seconds. The aggregation can be done in the application layer, which then inserts one row per minute, or in the database layer, where a stored procedure decides from the timestamp of each incoming sample whether to update the existing row or insert a new one.

The data only needs to be retained for one week, so older rows must be deleted. The deletion can run as a scheduled daily job, or the stored procedure can check for old rows on every call.
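The scheduled-job option could look roughly like the sketch below. The function name `purge_old_host_status` is hypothetical, and the `host_status_byminute` table with its `log_date` column is the one used by the stored procedure in this article; the scheduler that calls the function once a day (cron, for example) is assumed and not shown.

```sql
-- Hypothetical helper: delete every row older than seven days.
-- Assumes the host_status_byminute table with a log_date timestamp column.
CREATE OR REPLACE FUNCTION purge_old_host_status() RETURNS void AS
$BODY$
BEGIN
  DELETE FROM host_status_byminute
  WHERE log_date < localtimestamp - interval '7 days';
END;
$BODY$
LANGUAGE plpgsql;
```

A daily job would then only need to run `SELECT purge_old_host_status();`, which keeps the cleanup cost out of the per-sample insert path.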

Since performance is not a constraint at this point, and the stored-procedure interface is cleaner (the application layer does not need to know how the data is organized), the following implementation was chosen:

PostgreSQL V8.3
CREATE OR REPLACE FUNCTION insert_host_status(_log_date timestamp without time zone, _host inet, _cpu integer, _mem integer, _disk integer)
 RETURNS void AS
$BODY$
DECLARE
  new_start timestamp without time zone;
  current_start timestamp without time zone;
  c_id integer;
  c_log_date timestamp without time zone;
  c_cpu integer;
  c_mem integer;
  c_disk integer;
  c_count integer;
  date_span interval;
BEGIN
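  -- Fetch the most recent row for this host, then decide whether to insert or update.
  SELECT id, log_date, cpu, mem, disk, count INTO c_id, c_log_date, c_cpu, c_mem, c_disk, c_count FROM host_status_byminute WHERE host=_host ORDER BY id DESC LIMIT 1;
  -- Time elapsed between the incoming sample and that row (NULL when no row exists yet).
  date_span := _log_date - c_log_date;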
  IF date_span >= '00:00:60' OR c_id IS NULL THEN
    INSERT INTO host_status_byminute (log_date, host, cpu, mem, disk, count) values (_log_date, _host, _cpu, _mem, _disk, 1);
  ELSIF date_span >= '-00:00:60' THEN
    c_mem := ((c_mem * c_count) + _mem)/(c_count + 1);
    c_cpu := ((c_cpu * c_count) + _cpu)/(c_count + 1);
    c_disk := ((c_disk * c_count) + _disk)/(c_count + 1);
    c_count := c_count + 1;
    UPDATE host_status_byminute SET mem=c_mem, cpu=c_cpu, disk=c_disk, count=c_count WHERE id=c_id;
  END IF;
  -- Delete data older than the retention window.
  new_start := current_date - 6;
  -- Note: without ORDER BY this reads an arbitrary row, so the check can misfire if rows are not stored in date order.
  SELECT date(log_date) INTO current_start FROM host_status_byminute LIMIT 1;
  IF new_start > current_start THEN
    DELETE FROM host_status_byminute where log_date < new_start;
  END IF;
END;
$BODY$
 LANGUAGE 'plpgsql' VOLATILE
 COST 100;
ALTER FUNCTION insert_host_status(timestamp without time zone, inet, integer, integer, integer) OWNER TO dbuser_test;
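To try the function, a table along the following lines is needed. The schema is an assumption inferred from the columns the function reads and writes (the article does not show it), and the host address and sample values are made up for illustration. Current timestamps are used because the cleanup step at the end of the function deletes rows whose dates fall outside the retention window.

```sql
-- Assumed schema, inferred from the columns used by insert_host_status.
CREATE TABLE host_status_byminute (
    id       serial PRIMARY KEY,
    log_date timestamp without time zone NOT NULL,
    host     inet NOT NULL,
    cpu      integer,
    mem      integer,
    disk     integer,
    count    integer NOT NULL DEFAULT 1
);

-- First sample for the host: no previous row exists, so a new row is inserted.
SELECT insert_host_status(localtimestamp, '192.168.0.10', 40, 60, 70);

-- Six seconds later: within the same minute, so the row is updated with
-- running averages: cpu (40 + 50) / 2 = 45, mem (60 + 62) / 2 = 61, count = 2.
SELECT insert_host_status(localtimestamp + interval '6 seconds', '192.168.0.10', 50, 62, 70);

-- At least 60 seconds after the stored row's log_date: a second row is inserted.
SELECT insert_host_status(localtimestamp + interval '60 seconds', '192.168.0.10', 30, 61, 70);

SELECT log_date, cpu, mem, disk, count FROM host_status_byminute ORDER BY id;
```

Because the averaged columns are integers, each update truncates the result of the division, so the stored values are approximate averages rather than exact ones.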

The above is based on personal experience; I hope it serves as a useful reference. If there are mistakes or anything I have overlooked, corrections are welcome.