Go back PostgreSQL

PostgreSQL begins within the list

Fast way how to find addresses matching  loooong list of postal codes,  and the actual address postal code may be in short version of the long version 

eg values in DB may be:
  •  addresses.postal_code == "12345" 
  •  addresses.postal_code == "12345-8767"
  • addresses.postal_code == "N17 9GD"  

TL;DR: The prettiest, easiest  and fastest  to do this  is to use SIMILAR TO regexp


SELECT * FROM addresses WHERE postal_code SIMILAR TO '(12758|60937|N17 9GD)%") // notice there is no % at start, just at the end
                                                           ^^^^^^^^^^^^ imagine this list to be ten thousand postal codes 
or if you want case insensitive:

SELECT * FROM addresses WHERE lower(postal_code) SIMILAR TO '(12758|60937|N17 9GD)%");


Don't use LIKE or ILIKE


ILIKE & LIKE will be slower (any of the following examoles will work but will be slower than proposed solution)


SELECT * FROM addresses WHERE postal_code ILIKE '80013%' OR  addresses.postal_code ILIKE '80014%' OR  addresses.postal_code ILIKE '80015%'
// slow, long and ugly once you place here 10 000 postcodes

there is also  this format of ILIKE:

SELECT * FROM addresses WHERE postal_code  LIKE  any (array['80013%', '80014%', '80015%']);
SELECT * FROM addresses WHERE postal_code ILIKE  any (array['80013%', '80014%', '80015%']);

... it will still be slower than proposed solution

Also be aware this will suck too:  

SELECT * FROM addresses WHERE postal_code lower(value) ~~ ANY('{80013%,80014%,80015%}');

Sources




Match reqexp with ~ 


Another option is  POSIX Regular Expressions  
 
  • text ~ text → matches regular expression, case sensitively. (  'thomas' ~ 't.*ma' → t  )
  • text ~* text →  matches regular expression, case insensitively ( 'thomas' ~* 'T.*ma' → t )
  • text !~ text →  does not match regular expression, case sensitively ( 'thomas' !~ 't.*max' → t )
  • text !~* text →  does not match regular expression, case insensitively ( 'thomas' !~* 'T.*ma' → f )

`~` will use all regex operators, so we could use `^`  operator, which will not make a full match of the string.

note that ~ uses a seqscan even when it's substantially more expensive (artifically so due to enable_seqscan) because it has no alternative, while LIKE uses the index. However, a corrected ~ with a left anchor uses the index too
source

So  theoretically this should be the fastest way to do it:

SELECT * FROM addresses WHERE postal_code !~ '^(22222|33333)'
SELECT * FROM addresses WHERE postal_code  ~ '^(22222|33333)'

However I've done some tests  and when it comes to 

SELECT * FROM addresses WHERE postal_code SIMILAR TO '(12758|60937|02218)%")

...and

SELECT * FROM addresses WHERE postal_code ~ '^(12758|60937|02218)

...are same speed ( tiny tiny difference ). Same was for negative versions `postal_code NOT SIMILAR TO` and `postal_code !~ '^(80013|...` .

Now here is the important part: Yes ~   was same speed  thanks to ^ BUT if you ever decide you want to remove the ^  then "SIMILAR TO '(123|456)%'" is faster 🫨