PostgreSQL saved my life (once more...)
May 2017
Sun Mon Tue Wed Thu Fri Sat
  1 2 3 4 5 6
7 8 9 10 11 12 13
14 15 16 17 18 19 20
21 22 23 24 25 26 27
28 29 30 31      
About
This site is an effort to share some of the base knowledge I have gathered through all this years working with Linux, FreeBSD, OpenBSD, Python or Zope, among others. So, take a look around and I hope you will find the contents useful.
Recent Entries
Recent Comments
Recent Trackbacks
Categories
OpenBSD (9 items)
BSD (0 items)
FreeBSD (19 items)
Linux (3 items)
Security (3 items)
Python (22 items)
Zope (13 items)
Daily (144 items)
e-shell (9 items)
Hacks (14 items)
PostgreSQL (3 items)
OSX (8 items)
Nintendo DS (0 items)
enlightenment (0 items)
Apache (3 items)
Nintendo Wii (1 items)
Django (24 items)
Music (12 items)
Plone (7 items)
Varnish (0 items)
Lugo (2 items)
Sendmail (0 items)
europython (7 items)
Cherokee (1 items)
self (1 items)
Nature (1 items)
Hiking (0 items)
uwsgi (0 items)
nginx (0 items)
cycling (9 items)
Networking (1 items)
DNS (0 items)
Archives

Syndicate this site (XML)

RSS/RDF 0.91

31 enero
2008

PostgreSQL saved my life (once more...)

or how things could be so easy...

Yes, once more, PostgreSQL saved my life. Let me show you how.

Imagine you have a database, there you have a table and inside that table you have a date field. You have populated the database and now you have more than 1000 records inside that table, something like:

FMJJ_test=# select id, apodo, nombre, fecha_nacimiento from beyle_autor where fecha_nacimiento ilike '%01/01%' order by fecha_nacimiento;
 id  | apodo |                         nombre                         | fecha_nacimiento
-----+-------+--------------------------------------------------------+------------------
 259 |       | XXXXXXXXXXXXXXXXXX                                     | 01/01/1602
 296 |       | XXXXXXXXXXXXXXXXXX                                     | 01/01/1807
 375 |       | XXXXXXXXXXXXXXXXXX                                     | 01/01/1838
 355 |       | XXXXXXXXXXXXXXXXXX                                     | 01/01/1849
 409 |       | XXXXXXXXXXXXXXXXXX                                     | 01/01/1849

Now imagine that, for some reasons, that fecha_nacimiento field has to change into a varchar field and that all those date entries should appear with only their year value (without month and day).

Probably the first thing you thought when reading the text above was:

"hey, that's pretty easy, just write a simple python/perl/php/whatever script that connects to the database, then get the records from the table, then parse them, the put them back into the table"

Well, you were right, and it would take a small ammount of time to write that script (taking care that you will need the language, the posgtresql connector for that given language and the knowledge to write it so quickly), but there is an even quicker way to do that, using PostgreSQL functions directly!

PostgreSQL have some nice already-included functions that will allow us to do some formatting as well as play with date and datetime objects.

So, let's do it, first use psql to connect to the database:

snowball:~ Wu$ psql test
Welcome to psql 8.1.5, the PostgreSQL interactive terminal.

Type:  \copyright for distribution terms
       \h for help with SQL commands
       \? for help with psql commands
       \g or terminate with semicolon to execute query
       \q to quit

test=#

NOTE: I'm using PostgreSQL 8.1.x in this example, but it will work with any other 8.x version too.

Now, we can use EXTRACT combined with to_date to do something like:

test=# update beyle_autor set fecha_nacimiento = extract(YEAR FROM to_date(fecha_nacimiento, 'DD MM YYYY'))  where fecha_nacimiento ilike '%01/01%';

There, we are telling PostgreSQL that it has to update the beyle_autor table, setting fecha_nacimiento as the YEAR part from a datetime object. We need to use to_date because the value we get from the database is already a varchar (aka character varying) and we need to convert it to a date/datetime value before passing it to EXTRACT.

Using that single query, all the records inside the table that match the given query (in that example I needed to keep complete dates that do not begin with 01/01, first of january), will be truncated and saved as only the year value. Easy, isn't it?

Let's check that it have work:

FMJJ_test=# select id, apodo, nombre, fecha_nacimiento from beyle_autor order by fecha_nacimiento;
 id  | apodo |                         nombre                         | fecha_nacimiento
-----+-------+--------------------------------------------------------+------------------
 259 |       | XXXXXXXXXXXXXXXXXX                                      | 1602
 296 |       | XXXXXXXXXXXXXXXXXX                                      | 1807
 375 |       | XXXXXXXXXXXXXXXXXX                                      | 1838
 355 |       | XXXXXXXXXXXXXXXXXX                                      | 1849
 409 |       | XXXXXXXXXXXXXXXXXX                                      | 1849

You could play a little bit with the functions separately, for example, you can get a date value from a varchar field:

test=# select to_date(fecha_nacimiento, 'DD MM YYYY') from beyle_autor where fecha_nacimiento ilike '%01/01%';
  to_date
------------
 1951-01-01
 1956-01-01
 1907-01-01
 1958-01-01
 1927-01-01
 1965-01-01

or you can get the YEAR value in a single query:

test=# select extract(YEAR FROM to_date(fecha_nacimiento, 'DD MM YYYY')) from beyle_autor where fecha_nacimiento ilike '%01/01%';
 date_part
-----------
      1951
      1956
      1907
      1958
      1927
      1965
      1923

Of course, this is only an example of the wonderful things you can get using internal PostgreSQL functions. Sometimes a little dig could avoid you endless time of no-so-useful scripting...

Posted by wu at 18:23 | Comments (0) | Trackbacks (0)
<< Alt.End | Main | Back from the mountains >>
Comments
Re: PostgreSQL saved my life (once more...)

Hey! All database servers nowaday have internal functions to manage dates...

Postgre saved your life, but I've seen it about to kill a project because of its *problems*.

Let's say pgpool hell or the fact it's performance it's very bad when you manage (let's say) some million of rows :(

Believe me man, I thought Postgres was a nice relational database server until I had to do some serious things with it :( When you have a 10.000EUR cluster and the bottleneck it's a select, it's kinda frustrating.

Posted by: Juanjo at enero 31,2008 21:50
Re: PostgreSQL saved my life (once more...)

> Hey! All database servers nowaday have internal > functions to manage dates...

Sure they have, but i'm like a little child with a new toy when I find those kinds of things... :D

> Postgre saved your life, but I've seen it about > to kill a project because of its *problems*.

Sure too, as any other database system in the world (imho)

> Let's say pgpool hell or the fact it's
> performance it's very bad when you manage
> (let's say) some million of rows :(

well, perhaps pgpool is not what you need... I mean, I didn't work with such setup (multiple postgresql servers sharing same content), but there are some other tools out there, like slony (http://slony.info), for example. (anyway, it is difficult to speak about something you haven't see by yourself, like that environment of yours).

> Believe me man, I thought Postgres was a nice
> relational database server until I had to do
> some serious things with it :( When you have a
> 10.000EUR cluster and the bottleneck it's a
> select, it's kinda frustrating.

mmm, I know it can be frustrating. I've had similar problems in a given setup, with some tables with some thousand of records in them, using some complex queries. That damm thing was really slow... until I began to play with some tools to solve that problem.

In my experience with postgresql, it is very important to have the databases clean (VACUUM), the indexes properly built (VACUUM FULL ANALYZE) and, of course, it is very important to have optimized queries (you can use ANALYZE and some other tools to get some information about bottlenecks on complex queries, and get better ones)

You can play a little bit with some things in postgresql.conf too, like geqo (the generic query optimizer) and more...

Of course, I don't know nor your environment neither your set up, but I'll bet there is a workaround for that bottleneck problem (IMHO).

Posted by: Wu at febrero 01,2008 10:57
Re: PostgreSQL saved my life (once more...)

Oyes! A ver lo que me haces con la bdd de la FMJJ que te veo!

Posted by: clayton at febrero 09,2008 16:13
Re: PostgreSQL saved my life (once more...)

Jajajajaja. tranqui tranqui, que esto es en la maquina en la que se está desarrollando Beyle, y no he puesto datos "sensibles"... ;D

Por cierto, mola que te animes a comentar de vez en cuando...

Posted by: Wu at febrero 09,2008 16:21
Trackbacks
Please send trackback to:http://blog.e-shell.org/38/tbping
There are no trackbacks.