Social Friend Relationship System Practice in PostgreSQL: Accelerating Queries of Positive and Negative Relationships
See a tutorial on accelerating queries of positive and negative relationships.
Join the DZone community and get the full member experience.
Join For FreeBackground
A relationship system is very common in various design applications. For example, query who your friends are and to whom you are a friend.
Another example is who I liked and who liked me.
How can we speed up the query of positive and negative relationships?
The efficiency difference between a multi-row query and a single-row query in databases is very significant because a multi-row query needs to run more code. Of course, a multi-row query has its advantages, for example, it performs better than a single-row query when returning a small number of records.
To speed up returning all relations of an ID, we can compress multiple rows into one single row.
1. Generate Positive and Negative Relationships in a Real-Time and Streaming Manner
Apply rules to automatically convert detail data into multiple rows of data.
The advantage of this method is that the data is real-time.
The disadvantage is that the large row of the merged table needs to be updated every time, producing large amounts of trash in a multi-version implementation. In addition, it is very likely to experience performance bottlenecks on popular users. (For example, users who are followed by many people have a very large relation network.)
Implementation example:
(1) Create an original relationship table.
create table t(uid1 int, uid2 int, primary key(uid1,uid2));
(2) Create positive and negative relationship tables and use arrays to represent one-to-many relationships.
create table t1(uid int primary key, fid int[]);
create table t2(uid int primary key, fid int[]);
(3) Create an intarray plug-in to remove duplicates from arrays.
create extension intarray;
(4) Create rules that automatically merge relationships into the positive and negative relationship tables when data is written into the original relationship table.
create rule r1 as on insert to t do also insert into t1 values (NEW.uid1, array[NEW.uid2]) on conflict (uid) do update set fid=uniq(sort(t1.fid||NEW.uid2));
create rule r2 as on insert to t do also insert into t2 values (NEW.uid2, array[NEW.uid1]) on conflict (uid) do update set fid=uniq(sort(t2.fid||NEW.uid1));
(5) Create a function for stress testing, that is, to write relationships among users.
create or replace function ins_rel( int, int ) returns void as
$$
declare
begin
insert into t values ($1,$2);
return;
exception when others then
return;
end;
$$
language plpgsql strict;
(6) Perform stress testing
vi test.sql
\set uid1 random(1,100000)
\set uid2 random(1,2000)
select ins_rel(:uid1,:uid2);
pgbench -M prepared -n -r -P 1 -f ./test.sql -c 32 -j 32 -T 120
(7) View examples of positive and negative relationships.
postgres=# select * from t1 limit 3;
uid | fid
-------+--------------------------------------------------------------------------------------------------------------------------------
27627 | {60,77,258,301,412,574,649,651,709,780,805,811,998,1031,1066,1151,1173,1251,1284,1392,1586,1602,1616,1659,1708,1879,1932,1982}
78532 | {106,288,416,474,566,765,853,1003,1154,1166,1352,1657,1689,1707,1722}
81317 | {14,109,258,275,293,499,586,593,633,716,720,732,766,807,927,1052,1508,1518,1633,1662,1791,1848,1885,1946,1951}
(3 rows)
postgres=# select * from t2 limit 1;
uid | 19
fid | {25,119,224,291,317,421,470,574,637,683,857,905,1131,1451,1474,1573,1606,1950,2039,2077,2297,2464,2693,2716,2722,2832,2871,3062,3158,3167,3217,3265,3288,3428,3602,3644,3660,3799,3850,3899,3906,3966,4134,4177,4268,4308,4411,4477,4509,4517,4522,4709,4736,4781,4892,5142,5214,5235,5363,5429,5473,5485,5491,5535,5752,5778,5828,5851,5950,5965,5988,6042,6071,6190,6238,6254,6343,6483,6555,6743,6820,6917,6980,7006,7032,7075,7205,7289,7410,7434,7618,7936,7987,7995,8102,8281,8370,8439,8463,8509,8622,8731,8786,8866,8899,9020,9033,9082,9142,9308,9406,9413,9829,9876,9976,10116,10360,10396,10428,10680,10710,10950,11170,11187,11197,11200,11243,11370,11395,11500,11506,11594,11695,11878,11963,12055,12077,12196,12211,12261,12332,12333,12341,12486,12627,12843,12864,12976,13037,13206,13267,13353,13354,13359,13383,13411,13633,13637,13778,13837,13860,13928,13968,13976,14036,14058,14150,14196,14284,14435,14724,14743,14774,14831,15086,15138,15367,15376,15407,15692,15755,15820,15870,15925,16057,16129,16288,16292,16388,16397,16426,16482,16812,17006,17013,17025,17159,17165,17402,17489,17494,17539,17883,17977,17991,18173,18180,18249,18253,18306,18487,18557,18730,18777,18881,18914,19102,19125,19245,19273,19448,19477,19534,19578,19751,19951,19973,20037,20043,20182,20244,20248,20388,20423,20588,20625,20727,20851,20884,20911,21300,21345,21401,21416,21442,21456,21465,21564,21747,21826,21841,21844,22051,22154,22168,22208,22215,22315,22322,22374,22400,22476,22617,22621,22626,22666,22786,22853,23014,23142,23539,23565,23580,23586,23620,23658,23976,24013,24014,24046,24059,24377,24546,24688,24723,24905,24951,25129,25192,25255,25445,25592,25723,25779,25814,25863,25880,25886,25963,26006,26020,26117,26273,26298,26318,26344,26347,26378,26399,26494,26561,26600,26607,26700,26721,26754,26755,26841,26849,26863,26884,26954,27071,27073,27140,27176,27237,27246,27256,27270,27315,27323,27434,27469,27510,27547,27565,27799,27918,27921,28045,28055,28168,28181,28310,28425,28531,28628,28769,28796,28924,28993,29009,29098,29145,29180,29238,29301,29542,29581,29625,29879,29953,29958,29994,30005,30029,30203,30276,30358,30423,30445,30472,30518,30581,30587,30593,30800,30855,30860,30885,30927,30946,31033,31256,31329,31416,31445,31554,31663,31762,31768,31807,31831,31845,31890,31891,32257,32329,32384,32864,32866,32942,33161,33299,33395,33437,33492,33828,33870,33876,33936,33989,34114,34236,34386,34752,34835,34847,35125,35402,35479,35492,35503,35543,35575,35756,35924,35987,36044,36191,36210,36212,36235,36313,36315,36341,36508,36568,36640,36676,36705,36800,36911,37002,37074,37075,37080,37084,37127,37149,37237,37248,37424,37445,37486,37547,37573,37625,37699,38083,38150,38254,38255,38297,38339,38436,38486,38500,38507,38563,38616,38750,38768,38790,38931,38946,38948,39079,39117,39141,39271,39400,39788,39802,39813,39853,40096,40122,40162,40225,40245,40256,40298,40329,40430,40452,40496,40556,40604,40631,40770,40794,40913,41017,41051,41435,41490,41523,41596,41699,41841,41844,41870,41969,42020,42056,42152,42190,42214,42279,42356,42376,42432,42623,42731,42810,43055,43107,43167,43221,43276,43389,43498,43505,43542,43549,43651,43802,43979,44033,44057,44167,44326,44378,44536,44611,44646,44816,44895,44901,44996,45076,45114,45148,45208,45227,45470,45482,45521,45565,45609,45728,45811,45894,45955,45989,46106,46424,46468,46500,46564,46577,46745,46792,46840,46945,46963,46965,47057,47172,47236,47503,47510,47541,47726,47752,47792,47856,47864,47966,48003,48169,48267,48282,48409,48436,48455,48505,48507,48964,49111,49221,49235,49364,49577,49701,50183,50270,50591,50727,50949,51014,51174,51284,51287,51350,51358,51426,51435,51533,51534,51606,51647,51757,51823,51830,51939,51949,51970,52262,52668,52684,52766,52926,52968,52982,53009,53054,53262,53370,53420,53526,53530,53573,53612,53687,53757,53854,53993,54047,54079,54093,54234,54236,54255,54259,54379,54397,54434,54545,54583,54595,54612,54636,54643,54678,54887,55034,55345,55378,55381,55428,55442,55510,55547,55750,55937,55947,56030,56067,56098,56172,56174,56330,56337,56438,56503,56610,56622,56825,56844,56881,56883,56976,57098,57292,57353,57551,57587,57588,57636,57662,57672,57685,57693,57787,57794,58064,58069,58122,58314,58318,58588,58663,58912,58990,59090,59095,59162,59311,59342,59431,59534,59536,59579,59760,59819,59865,59868,60099,60229,60242,60338,60342,60354,60373,60374,60415,60512,60513,60518,60531,60589,60757,60803,61104,61197,61201,61215,61253,61439,61446,61513,61553,61686,61696,61874,62062,62274,62348,62449,62471,62490,62584,62654,62671,62763,62766,62808,63018,63044,63142,63282,63295,63363,63375,63419,63427,63431,63577,63590,63636,63642,63646,63678,63706,63816,63862,63915,63978,64039,64104,64131,64143,64323,64365,64394,64397,64449,64502,64535,64671,64849,64855,64971,65004,65039,65165,65228,65239,65330,65407,65446,65567,65691,65709,65741,65768,65874,65876,65882,65955,66068,66178,66410,66423,66425,66488,66493,66540,66770,67017,67284,67360,67509,67537,67579,67663,67691,67750,67821,67855,67932,67964,68023,68031,68076,68195,68301,68525,68618,68684,68779,68789,68815,68833,68876,68955,69075,69145,69160,69187,69244,69500,69541,69585,69599,69691,69698,69747,69766,69782,69864,70039,70105,70151,70225,70238,70248,70256,70325,70378,70392,70495,70540,70699,70913,70919,70957,71047,71093,71149,71349,71489,71583,71639,71702,71841,71893,71922,71970,71971,71983,72071,72074,72184,72229,72343,72404,72429,72446,72536,72660,72662,72675,72714,72718,72858,72903,72986,73042,73069,73097,73102,73160,73174,73313,73417,73426,73464,73601,73778,73867,73966,73978,74057,74189,74253,74257,74270,74357,74441,74450,74484,74574,74590,74863,74896,74903,74970,75024,75048,75076,75161,75251,75285,75287,75427,75488,75638,75704,75999,76077,76117,76169,76252,76456,76745,76810,77058,77109,77110,77194,77401,77411,77427,77530,77565,77610,77634,77663,77765,77776,77852,77861,77929,77954,78007,78320,78326,78348,78365,78419,78441,78588,78663,78733,78745,78838,78887,78896,78964,79140,79142,79569,79772,79834,79846,79855,80089,80169,80297,80521,80535,80655,80753,80793,80890,81039,81065,81114,81225,81397,81408,81461,81463,81560,81575,81607,81644,81673,81686,81702,81756,81762,81772,81866,82090,82200,82247,82273,82320,82432,82589,82615,82659,82680,83078,83100,83127,83249,83578,83607,83609,83714,83792,83817,83894,83952,84016,84086,84088,84182,84299,84469,84487,84529,84544,84574,84607,84701,84721,84764,84775,84832,85044,85098,85334,85547,85574,85740,85809,85824,85875,86070,86078,86277,86342,86445,86536,86628,86651,86775,86801,86961,86967,87021,87242,87316,87318,87480,87498,87543,87683,87700,87759,87780,87839,87883,88053,88187,88198,88322,88533,88535,88752,88765,89034,89053,89115,89168,89191,89401,89624,89692,89771,89851,89896,90051,90056,90072,90077,90079,90180,90187,90499,90500,90616,90627,90717,91002,91075,91279,91295,91314,91343,91365,91498,91502,91550,91561,91587,91671,91857,91912,92085,92106,92342,92372,92501,92717,92967,93063,93084,93092,93113,93244,93320,93433,93541,93680,93684,93719,93809,94049,94077,94139,94221,94277,94309,94530,94555,94686,94690,94920,94936,94996,95219,95258,95333,95465,95614,95640,95753,95755,95779,95906,96158,96183,96225,96339,96347,96577,96648,96870,96894,97030,97121,97152,97162,97174,97237,97542,97599,97607,97686,97705,97708,97718,97728,97817,97990,97997,98172,98246,98313,98452,98521,98581,98636,98831,98870,98926,98962,98976,99055,99112,99260,99419,99462,99470,99546,99573,99720,99777,99906}
(8) Check if positive and negative relationships are consistent with the relationships in the original relationship table. To do this, simply compare the hash values. The relationships are completely consistent in this example.
postgres=# select sum(hashtext(t.*::text)) from t;
sum
---------------
-137085631634
(1 row)
postgres=# select sum(hashtext) from (select hashtext((t1.uid, unnest(t1.fid))::text) from t1) t;
sum
---------------
-137085631634
(1 row)
postgres=# select sum(hashtext) from (select hashtext((unnest(t2.fid), t2.uid)::text) from t2) t;
sum
---------------
-137085631634
(1 row)
2. Batch Generate Positive and Negative Relationships in an Asynchronous and Streaming Manner
The defects of the first method may easily cause increased I/O and increased RT of writing relationships in the case of very popular users.
To avoid these problems, we can use the "batch merge" method, which is similar to the method used in the cellphone user profile system of a cellphone manufacturer: Paste, delete, and modify tags, generate data as a log, perform async batch, consume with atomicity on feed data, and then merge data into the tag table.
Query Performance Comparison Between Positive and Negative Relationships
Compare query performance among the original table, the negative relationship table, and the negative relationship table.
pgbench -M prepared -n -r -P 1 -f ./test.sql -c 56 -j 56 -T 120
(1) Add a single-field index to the original table
postgres=# create index idx_t_1 on t(uid1);
CREATE INDEX
postgres=# create index idx_t_2 on t(uid2);
CREATE INDEX
(2) The query performance of the positive relationships in the original table
\set uid1 random(1,100000)
\set uid2 random(1,2000)
select * from t where uid1=:uid1;
transaction type: ./test.sql
scaling factor: 1
query mode: prepared
number of clients: 56
number of threads: 56
duration: 120 s
number of transactions actually processed: 73916179
latency average = 0.091 ms
latency stddev = 0.014 ms
tps = 615905.000776 (including connections establishing)
tps = 615979.071532 (excluding connections establishing)
script statistics:
- statement latencies in milliseconds:
0.002 \set uid1 random(1,100000)
0.001 \set uid2 random(1,2000)
0.089 select * from t where uid1=:uid1;
(3) The query performance of the negative relationships in the original table
\set uid1 random(1,100000)
\set uid2 random(1,2000)
select * from t where uid2=:uid2;
transaction type: ./test.sql
scaling factor: 1
query mode: prepared
number of clients: 56
number of threads: 56
duration: 120 s
number of transactions actually processed: 2784841
latency average = 2.413 ms
latency stddev = 0.116 ms
tps = 23204.361967 (including connections establishing)
tps = 23207.284520 (excluding connections establishing)
script statistics:
- statement latencies in milliseconds:
0.002 \set uid1 random(1,100000)
0.001 \set uid2 random(1,2000)
2.411 select * from t where uid2=:uid2;
(4) The query performance of the positive relationship table
\set uid1 random(1,100000)
\set uid2 random(1,2000)
select * from t1 where uid=:uid1;
transaction type: ./test.sql
scaling factor: 1
query mode: prepared
number of clients: 56
number of threads: 56
duration: 120 s
number of transactions actually processed: 85533010
latency average = 0.079 ms
latency stddev = 0.012 ms
tps = 712676.961185 (including connections establishing)
tps = 712758.158291 (excluding connections establishing)
script statistics:
- statement latencies in milliseconds:
0.002 \set uid1 random(1,100000)
0.001 \set uid2 random(1,2000)
0.077 select * from t1 where uid=:uid1;
(5) The query performance of the negative relationship table
\set uid1 random(1,100000)
\set uid2 random(1,2000)
select * from t2 where uid=:uid2;
transaction type: ./test.sql
scaling factor: 1
query mode: prepared
number of clients: 56
number of threads: 56
duration: 120 s
number of transactions actually processed: 25649777
latency average = 0.262 ms
latency stddev = 0.037 ms
tps = 213695.286456 (including connections establishing)
tps = 213719.115995 (excluding connections establishing)
script statistics:
- statement latencies in milliseconds:
0.002 \set uid1 random(1,100000)
0.001 \set uid2 random(1,2000)
0.260 select * from t2 where uid=:uid2;
Comparison test | TPS for positive relationships | TPS for negative relationships |
Original method | 616,000 | 23,000 |
Array query in the positive and negative tables | 713,000 | 214,000 |
The more relationship records there are, the more significantly the performance is improved by using arrays to store positive and negative relationships.
Summary
A relationship system is very common in various design applications. For example, query who your friends are and to whom you are a friend.
Directly querying original relationships requires the batch extraction of all related relationships. This may cause relatively high RT and low TPS.
Real-time stream computing and batch traffic computing allow relationships to be divided into positive and negative relationships. Each user is a record and arrays are used to store one-to-many relationships. This method brings significant performance improvement.
Published at DZone with permission of Di goal. See the original article here.
Opinions expressed by DZone contributors are their own.
Comments