Ceci est une version archivée de
PlPythonU à 2010-02-05 11:39:06.
Ok, you are dreaming about a "normalization" function in SQL, to remove accents, uppercase symbols, etc. ?
Here is (one of many ways) to do it.
1. Install support for Python in your postgresql engine :
jpetazzo@illithid:~$ sudo apt-get install postgresql-plpython-8.3
(Replace with your actual version ; if you are using a non-Debian distribution, adapt!)
2. Assume we have a database named
foobase ; install Python support inside this database :
jpetazzo@illithid:~$ createlang plpythonu foobase
This will basically tell your postgrsql engine "okay, the foobase can use Python support from now". In case you're wondering, all postgresql languages begin with PL ; hence the name "plpythonu". The U at the end means "untrusted" (but don't ask me what it really means deep inside).
NB : the command should be run as a privileged postgresql user (e.g.,
sudo -u postgres createlang ...).
3. Now, connect to the database :
jpetazzo@illithid:~$ psql foobase
4. Create a function :
foobase=# create or replace function
normalize(t text) returns text as $$
import unicodedata
without_accents = unicodedata.normalize('NFKD',t.decode('utf-8')).encode('ascii','ignore')
return without_accents.lower()
$$ language plpythonu;
Note : using "create or replace" allows you to modify the query and rerun it without having to bother with dropping the function beforehand.
5. Use the function :
foobase=# select normalize('YéhaÉhaö');
normalize
- - - - -
yehaehao
(1 row)
6. Rejoice and have a beer.