SoFunction
Updated on 2025-03-03

Postgresql: greenplum string deduplication method

postgresql greenplum string deduplication

pg and gp implement collect_set effect:

array_agg(DISTINCT $columnName))
array(SELECT DISTINCT   $columnName  from $tableName )

String stitching:

array_to_string(array_agg(DISTINCT ), '|') 

Postgresql string processing method

About finding strings

select position('hello' in 'test_sql') 
output:0
select position('test' in 'test_sql') 
output:1

Therefore, position(str_1 in str_2) != 0 can be used to determine whether str_1 is a substring of str_2.

Conversion of strings and numbers

String -> Number

to_number(block_id,'999999') and cast(block_id as numeric), but the previous sentence will limit the numeric size to 10 to the power of 5, that is, an excessively large number will have problems, while the next sentence will not.

Numbers -> Strings

cast(block_id as varchar) converts numerical type to string type.

Summarize

The above is personal experience. I hope you can give you a reference and I hope you can support me more.