Horizontal scaling of PostgreSQL using PL/Proxy.

It is very hard to start writing an article, that is, to come up with the opening remarks. I want to tell everything at once :) But no, let's take it step by step.
Let's start with the fact that Highload++ 2008 took place just recently, and I was able to attend it.
I have to say the event was organized superbly; there were many talks and all of them were very interesting.
One of the most memorable presentations was Asko Oja's talk on the database server infrastructure at Skype. The talk was largely about the various tools that make such server performance possible.
According to Asko, Skype's databases would survive even if everyone on Earth wanted to connect to Skype at the same moment.

When I got home, I really wanted to try all of it live, and that is what I am going to describe here. Note that the database structure used for the test is taken from the example on the developers' own website and, of course, has nothing to do with any real production load.
The article may read as if load distribution is something you do only after everything is on fire and the database is falling over, but that is not the point. With this article I simply want to get both novice and experienced developers thinking about the fact that the ability to distribute load across servers has to be provided for already when the system is being designed. And that will not count as the "premature optimization" that so much has been written about and that everyone is so afraid of.

UPD: As habrauser descentspb correctly pointed out, the article contains an annoying mistake. Through my own negligence I decided that PgBouncer should be placed between the proxy and the client. But as it turned out, the problem I was solving with PgBouncer is not solved by setting it up that way. The correct approach is to install the bouncer between the nodes and the proxy, which is exactly what the official manual on the PL/Proxy website recommends.
In any case, using PgBouncer as shown in my diagram will also give a performance boost (it offloads the proxy).




1. Who is to blame?


So, if you are a developer building something large and fairly high-load, sooner or later you will face the fact that the database can no longer withstand the load. There are a lot of queries, and the hardware simply cannot cope with them.
The ways of solving this problem have been discussed many times, so I will just list the ones I consider most effective.

— Optimize code.
— Scale up the server hardware.
— Caching (see the articles tagged memcache).
— Distribute the load between the servers.

Let's dwell on the last point.

2. What to do?


So, the code is optimized, there are no beefier servers to buy, the entire database sits in cache, and it still falls over under the same queries. It is time for horizontal scaling.
Oh yes, I have not yet mentioned that this article is about PostgreSQL. You are still using MySQL? Then we are coming to you :)
In my humble opinion, if the project is really serious, then the database should be a little more serious than MySQL. All the more so because Postgres has such wonderful scaling tools. (Maybe MySQL has them too? I am waiting for a response article :) ).

3. And what do you eat it with?


PL/Proxy is a procedural language for remotely calling functions and partitioning data between PostgreSQL database servers.
The scheme of operation is shown in the picture below. I will talk about PgBouncer a bit later.
[image: PL/Proxy operation scheme]

Usually your application simply sends queries to the database. In our case the application also sends an ordinary query to the database, except that it is not plain SQL but a call to a function written in advance.
The proxy database then determines which of the nodes holds the required data and forwards the request to the appropriate server.
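For example, instead of a raw INSERT the application issues an ordinary-looking function call (this uses the insert_user function that is defined later, in section 4.2):

SELECT plproxy.insert_user('Sven', 'sven@somewhere.com');

From the application's point of view this is just another query; the routing happens entirely inside the proxy database.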

Everything seems fine, but with a large number of queries PL/Proxy opens a large number of connections to the nodes, and every connection spawns a new Postgres process (fork), which is not good for performance. This is where PgBouncer comes in.
PgBouncer is... um... how to put this without messing it up... a connection multiplexer (pooler). To a client it looks like an ordinary Postgres server, but internally it manages a queue of queries, which allows it to speed up the server significantly. Out of the thousands of connections arriving at PgBouncer, only a few dozen will reach the database.
To appreciate the benefit of this wonderful product, it is enough to look at the graph of database server load for two sites before and after switching to PgBouncer. The picture is taken from Nikolay Samokhvalov's presentation "Postgres Performance".
[image: database server load before and after switching to PgBouncer]

4. Give me some of these soft French rolls


4.1. Installing PgBouncer

The installation process is nothing original:
Download the package (at the time of writing the latest version was 1.2.3):
pgfoundry.org/frs/?group_id=1000258
Extract it:
# tar -xzvf pgbouncer-1.2.3.tgz
Compile and install:
# cd pgbouncer-1.2.3
# ./configure
# make
# make install


Create a config file:
/etc/pgbouncer/pgbouncer.ini

[databases]
testdb = host=localhost port=5432 dbname=testdb

[pgbouncer]
listen_port = 6543
listen_addr = 127.0.0.1
auth_type = md5
auth_file = users.txt
logfile = /var/log/pgbouncer.log
pidfile = /var/run/pgbouncer/pgbouncer.pid
; if you do not plan to use PL/Proxy, do not set this line
pool_mode = statement
admin_users = root

Create the authentication file:
/etc/pgbouncer/users.txt
"testdb_user" "testdb_user_password"

Run it:
/usr/local/bin/pgbouncer -d /etc/pgbouncer/pgbouncer.ini -u postgres
The -d switch tells pgbouncer to run as a daemon, and the -u switch specifies the user the pgbouncer process should run as.
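To make sure the bouncer is alive, you can connect to its built-in admin console and look at the pools (a standard PgBouncer feature; note that the admin user from admin_users, root in the config above, must also be listed in users.txt for md5 authentication to succeed):

# psql -h 127.0.0.1 -p 6543 -U root pgbouncer
pgbouncer=# SHOW POOLS;
pgbouncer=# SHOW STATS;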

For Gentoo users, a small bonus: a ready-made init script:
/etc/init.d/pgbouncer
#!/sbin/runscript

depend() {
    need postgresql
    use pgsql
}

start() {
    ebegin "Starting Pgbouncer"
    start-stop-daemon --start --background --exec /usr/local/bin/pgbouncer --chdir /etc/pgbouncer/ -- -d pgbouncer.ini -u postgres
    eend $? "Failed to start Pgbouncer"
}

stop() {
    ebegin "Stopping Pgbouncer"
    start-stop-daemon --pidfile /var/run/pgbouncer/pgbouncer.pid --stop
    eend $? "Failed to stop Pgbouncer"
}


Now, in the DSN in your application, you only need to change the port used to connect to the database from 5432 to 6543, and you can start comparing server utilization before and after.
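For example, a libpq-style DSN changes only in the port (host, database and user here are simply the values from the test config above):

before: host=127.0.0.1 port=5432 dbname=testdb user=testdb_user
after:  host=127.0.0.1 port=6543 dbname=testdb user=testdb_user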

4.2. Installing PL/Proxy

For this experiment we will need three servers with Postgres.
One of them, let's call it proxy, will proxy requests to the other two.
Let's call them node1 and node2.
For PL/Proxy to work correctly, the number of nodes must be a power of two (see the example below).
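The reason is that PL/Proxy chooses a partition by masking the hash of the key with (number of partitions - 1), which only divides the key space evenly when the partition count is a power of two. A purely illustrative query that mimics this choice for our two nodes:

SELECT hashtext('Sven') & (2 - 1) AS partition_number; -- returns 0 or 1, i.e. node1 or node2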
I assume that Postgres itself is already installed.
Install PL/Proxy on the proxy server.
Download the latest version of pl/proxy: pgfoundry.org/frs/?group_id=1000207
As usual:
# ./configure
# make
# make install
At this point Postgres itself needs to be restarted.
And now the fun begins.

For the test, create a new database proxytest on each node:
CREATE DATABASE proxytest
WITH OWNER = postgres
ENCODING = 'UTF8';


And inside that database create a schema called plproxy. This step was not in the official instructions, but for some reason all the functions I called were being resolved like this: plproxy.functionname().
CREATE SCHEMA plproxy
AUTHORIZATION postgres;
GRANT ALL ON SCHEMA plproxy TO postgres;
GRANT ALL ON SCHEMA plproxy TO public;



And add one small table:
CREATE TABLE plproxy.users
(
  user_id bigint NOT NULL DEFAULT nextval('plproxy.user_id_seq'::regclass),
  username character varying(255),
  email character varying(255),
  CONSTRAINT users_pkey PRIMARY KEY (user_id)
)
WITH (OIDS=FALSE);
ALTER TABLE plproxy.users OWNER TO postgres;



Now create a function for adding data to this table:
CREATE OR REPLACE FUNCTION plproxy.insert_user(i_username text, i_emailaddress text)
RETURNS integer AS
$BODY$
INSERT INTO plproxy.users (username, email) VALUES ($1,$2);
SELECT 1;
$BODY$
LANGUAGE 'sql' VOLATILE;
ALTER FUNCTION plproxy.insert_user(text, text) OWNER TO postgres;



That's it for the nodes. Let's move on to configuring the proxy server.
Just like the nodes, the main (proxy) server must have the database:
CREATE DATABASE proxytest
WITH OWNER = postgres
ENCODING = 'UTF8';



And the corresponding schema:
CREATE SCHEMA plproxy
AUTHORIZATION postgres;
GRANT ALL ON SCHEMA plproxy TO postgres;
GRANT ALL ON SCHEMA plproxy TO public;



Now we need to tell the server that this database is managed with the help of PL/Proxy:
CREATE OR REPLACE FUNCTION plproxy.plproxy_call_handler()
RETURNS language_handler AS
'$libdir/plproxy' 'plproxy_call_handler'
LANGUAGE 'c' VOLATILE
COST 1;
ALTER FUNCTION plproxy.plproxy_call_handler() OWNER TO postgres;
-- language
CREATE LANGUAGE plproxy HANDLER plproxy_call_handler;



Also, so that the server knows where its nodes are and what they are, we have to create three service functions that PL/Proxy will use in its work:
CREATE OR REPLACE FUNCTION plproxy.get_cluster_config(IN cluster_name text, OUT "key" text, OUT val text)
RETURNS SETOF record AS
$BODY$
BEGIN
-- lets use same config for all clusters
key := 'connection_lifetime';
val := 30*60; -- 30m
RETURN NEXT;
RETURN;
END;
$BODY$
LANGUAGE 'plpgsql' VOLATILE
COST 100
ROWS 1000;
ALTER FUNCTION plproxy.get_cluster_config(text) OWNER TO postgres;



An important function whose code you will need to edit. In it you specify the DSNs of the nodes:
CREATE OR REPLACE FUNCTION plproxy.get_cluster_partitions(cluster_name text)
RETURNS SETOF text AS
$BODY$
BEGIN
IF cluster_name = 'clustertest' THEN
RETURN NEXT 'dbname=proxytest host=node1 user=postgres';
RETURN NEXT 'dbname=proxytest host=node2 user=postgres';
RETURN;
END IF;
RAISE EXCEPTION 'Unknown cluster';
END;
$BODY$
LANGUAGE 'plpgsql' VOLATILE
COST 100
ROWS 1000;


And the last one:
CREATE OR REPLACE FUNCTION plproxy.get_cluster_version(cluster_name text)
RETURNS integer AS
$BODY$
BEGIN
IF cluster_name = 'clustertest' THEN
RETURN 1;
END IF;
RAISE EXCEPTION 'Unknown cluster';
END;
$BODY$
LANGUAGE 'plpgsql' VOLATILE
COST 100;
ALTER FUNCTION plproxy.get_cluster_version(text) OWNER TO postgres;
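Before moving on, you can sanity-check the three service functions right on the proxy with ordinary calls (nothing PL/Proxy-specific is involved yet):

SELECT * FROM plproxy.get_cluster_config('clustertest');
SELECT * FROM plproxy.get_cluster_partitions('clustertest');
SELECT plproxy.get_cluster_version('clustertest');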



And now the main function, the one that will be called directly from the application:
CREATE OR REPLACE FUNCTION plproxy.insert_user(i_username text, i_emailaddress text)
RETURNS integer AS
$BODY$
CLUSTER 'clustertest';
RUN ON hashtext(i_username);
$BODY$
LANGUAGE 'plproxy' VOLATILE
COST 100;
ALTER FUNCTION plproxy.insert_user(text, text) OWNER TO postgres;



Questions about the function code are welcome in the comments, but keep in mind that I am not a Postgres guru, only an apprentice.

And now, attention! :)

Connect to the proxy server on port 6543 (we will work through PgBouncer right away).
And write some data into the database:
SELECT plproxy.insert_user('Sven', 'sven@somewhere.com');
SELECT plproxy.insert_user('Marko', 'marko@somewhere.com');
SELECT plproxy.insert_user('Steve', 'steve@somewhere.com');



Now you can connect to each of the nodes, and if you did everything correctly and without errors, you will see the first two records on node1 and the third one on node2.
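A minimal way to check is to run the same plain SELECT on each node:

SELECT user_id, username, email FROM plproxy.users ORDER BY user_id;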

Let's try to extract the data.
For that we will write a new function on the proxy server:
CREATE OR REPLACE FUNCTION plproxy.get_user_email(i_username text)
RETURNS SETOF text AS
$BODY$
CLUSTER 'clustertest';
RUN ON hashtext(i_username) ;
SELECT email FROM plproxy.users WHERE username = i_username;
$BODY$
LANGUAGE 'plproxy' VOLATILE
COST 100
ROWS 1000;
ALTER FUNCTION plproxy.get_user_email(text) OWNER TO postgres;



And try calling it:
select plproxy.get_user_email('Steve');


OK, it worked for me.

5. If you are so smart, why are you so poor?


As you can see from the example, there is nothing difficult about working with PL/Proxy. But I think everyone who has managed to read this far realizes that in real life things are not that simple.
Imagine that you have 16 nodes. The function code has to be kept in sync across them. And if a bug creeps in, how do you fix it everywhere?
This question was asked at the conference; Asko answered that the appropriate tools have been built inside Skype, but they are not yet quite ready to be handed over to the judgment of the open-source community.
The second problem that, God forbid, you may run into while developing such systems is redistributing the data when you want to add another node to the cluster.
This large-scale operation has to be planned very carefully: all the servers have to be prepared in advance, the data migrated, and then, at a single moment, the code of get_cluster_partitions replaced.
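Purely as an illustration, here is roughly what the edited get_cluster_partitions might look like after growing from two to four nodes (node3 and node4 are hypothetical hosts, not part of the setup above); get_cluster_version has to start returning a new value as well, otherwise the proxy will keep using the cached partition list:

CREATE OR REPLACE FUNCTION plproxy.get_cluster_partitions(cluster_name text)
RETURNS SETOF text AS
$BODY$
BEGIN
    IF cluster_name = 'clustertest' THEN
        RETURN NEXT 'dbname=proxytest host=node1 user=postgres';
        RETURN NEXT 'dbname=proxytest host=node2 user=postgres';
        RETURN NEXT 'dbname=proxytest host=node3 user=postgres';
        RETURN NEXT 'dbname=proxytest host=node4 user=postgres';
        RETURN;
    END IF;
    RAISE EXCEPTION 'Unknown cluster';
END;
$BODY$
LANGUAGE 'plpgsql' VOLATILE;

CREATE OR REPLACE FUNCTION plproxy.get_cluster_version(cluster_name text)
RETURNS integer AS
$BODY$
BEGIN
    IF cluster_name = 'clustertest' THEN
        RETURN 2; -- bumped so that PL/Proxy re-reads the partition list
    END IF;
    RAISE EXCEPTION 'Unknown cluster';
END;
$BODY$
LANGUAGE 'plpgsql' VOLATILE;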

6. Additional materials


The PL/Proxy and PgBouncer projects on the Skype developers' website.

Asko Oja's presentation for Highload++
Nikolay Samokhvalov's "Postgres Performance" (Postgresmen)

7. A bonus for the attentive


After publishing the article I found one mistake and one flaw.
I will describe them here, since it is hard to edit an article that has already been written.
1) The table uses a sequence named user_id_seq, but its SQL code is not given anywhere. So if someone simply copy-pastes the code, it will not work (the sequence has to be created before the table, because the table's default references it). Here is the fix:
CREATE SEQUENCE plproxy.user_id_seq
INCREMENT 1
MINVALUE 0
MAXVALUE 9223372036854775807
START 1
CACHE 1;
ALTER TABLE plproxy.user_id_seq OWNER TO postgres;


2) When data is inserted, the value of the user_id field is generated from a sequence. But there are two such sequences, and each one lives on its own node. This will inevitably lead to two different users getting the same user_id.
Accordingly, the insert_user function should be corrected so that a new user_id is taken from a sequence hosted on the proxy server rather than on the nodes. That way duplicates in the user_id field can be avoided (one possible sketch is shown below).
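One possible sketch of that correction, assuming a sequence created on the proxy and a node-side insert_user extended to accept the id explicitly (the names global_user_id_seq and insert_user_remote are illustrative, they are not from the original setup):

-- on the proxy: a single global sequence for user ids
CREATE SEQUENCE plproxy.global_user_id_seq;

-- on each node: insert_user now receives the id instead of generating it
CREATE OR REPLACE FUNCTION plproxy.insert_user(i_user_id bigint, i_username text, i_emailaddress text)
RETURNS integer AS
$BODY$
INSERT INTO plproxy.users (user_id, username, email) VALUES ($1, $2, $3);
SELECT 1;
$BODY$
LANGUAGE 'sql' VOLATILE;

-- on the proxy: the PL/Proxy part still routes by the username hash
-- and calls the three-argument function on the chosen node
CREATE OR REPLACE FUNCTION plproxy.insert_user_remote(i_user_id bigint, i_username text, i_emailaddress text)
RETURNS integer AS
$BODY$
CLUSTER 'clustertest';
RUN ON hashtext(i_username);
SELECT plproxy.insert_user(i_user_id, i_username, i_emailaddress);
$BODY$
LANGUAGE 'plproxy' VOLATILE;

-- on the proxy: a plain SQL wrapper that replaces the earlier two-argument
-- version and draws the id from the proxy-side sequence
CREATE OR REPLACE FUNCTION plproxy.insert_user(i_username text, i_emailaddress text)
RETURNS integer AS
$BODY$
SELECT plproxy.insert_user_remote(nextval('plproxy.global_user_id_seq'), $1, $2);
$BODY$
LANGUAGE 'sql' VOLATILE;

With such a scheme the nodes never generate ids themselves, so user_id stays unique across the whole cluster.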

PS: All the SQL code was highlighted with Source Code Highlighter.
Article based on information from habrahabr.ru
