Apesar do fato de que agora existem muitos dados em quase todos os lugares, os bancos de dados analĂticos ainda sĂŁo bastante exĂłticos. Eles sĂŁo pouco conhecidos e, pior ainda, sabem como usĂĄ-los com eficĂĄcia. Muitos continuam a "comer um cacto" com MySQL ou PostgreSQL, que sĂŁo projetados para diferentes cenĂĄrios, lutam com NoSQL ou pagam a mais por soluçÔes comerciais. ClickHouse Ă© uma virada de jogo e reduz significativamente o limite para entrar no mundo do DBMS analĂtico.
RelatĂłrio da BackEnd Conf 2018 e publicado com a permissĂŁo do palestrante.
Quem sou eu e por que estou falando sobre a ClickHouse? Sou o Diretor de Desenvolvimento da LifeStreet, que usa a ClickHouse. Além disso, sou o fundador da Altinity. Este é um parceiro Yandex que promove a ClickHouse e ajuda a Yandex a tornar a ClickHouse mais bem-sucedida. Também estou pronto para compartilhar meu conhecimento sobre a ClickHouse.
. . , .
« », ClickHouse:
- ,
- ,
- .
, .
, , ClickHouse, .
, ClickHouse , ClickHouse , .
, ClickHouse . , .
: « ClickHouse?». , , .
- â . ClickHouse . ClickHouse . , - .
- â . . , Vertica â . , . , . . ClickHouse .
- â . - . RedShift â . RedShift . , , Amazon, . Google BigQuery . - , , .
ClickHouse .
ClickHouse ? ClickHouse .
- -, . . use case, .
- AdTech ClickHouse.
- , .
- ClickHouse . ClickHouse, , .
- , . . ClickHouse.
- CloudFlare. - ClickHouse , . community. ClickHouse-. , Kafka Engine ClickHouse.
- . ClickHouse proof on concept, production.
- ClickHouse . , , 2 000 . â .
- -. , Bloxy.info. ethereum-. ClickHouse.
. , . . .
, :
- : 500+ , 25 .
- LifeStreet: 60 , 75 . , , .
- CloudFlare: 36 , 200 . .
- Bloomberg: 102 , . .
â . heatmap, ClickHouse . , , . . 4 .
, . , Altinity, . , , , . . , . . . , . , ClickHouse â , , .
? , ClickHouse . , . , , , .
ClickHouse .
- â : Vertica ClickHouse. , Vertica .
- â ClickHouse. . , ClickHouse , . , . , .
- â ClickHouse. , ClickHouse Hadoop . , ClickHouse - map reduce , . , .
- LifeStreet â Ad Tech , , .
- , programmatic bidding.
- : 10 . .
- , , â , programmatic bidding.
. HighLoad. LifeStreet MySQL ( Oracle) Vertica. .
, , Vertica â . . . proof of concept performance testing , 13- 16- . HighLoad.
â Vertica , . . , . , , - , , . , , .
? , , , . , Vertica, . , .
, open source , , . . . , . .
, , , open source, â .
, , , ClickHouse. , : «?», .
2016- , ClickHouse. , Vertica. . , . . (join), ClickHouse Vertica .
. : ClickHouse Vertica, .
(join), . ClickHouse Vertica . - , . . .
, , LifeStreet ClickHouse.
16- , . , , . , . .
, , .
â :
- .
- . 10 , , LifeStreet 75 3 . , . SQL- , .
- , ClickHouse , Vertica, , SAS-. ClickHouse SATA. ? Vertica insert . , , , , . . . ClickHouse insert . , , , ClickHouse , . . SATA, RAID , .
- , . . 3 60 (20 â ) 6 . Vertica .
.
- â . .
- â SQL.
OLAP- â select. group by, . where, . group by . .
star-, , .
, , , . , . , . . - .
ClickHouse . :
- â ClickHouse (join), . . (join) , . .
- â , . , star-, - . , , . .
ClickHouse . :
- â . External Dictionaries â , 99 % star-, .
- â . (join) .
- (join).
- . 2018- ( ) , . . , . â .
- , (join) , , , , , , .
- (join).
- 1 .
- , . - .
. , .
, . :
- . - .
- key-value . - .
- , - .
. - , (join).
ClickHouse , string key-value.
â , , . , .
. has
, , . , , .
subid - . , , , . .
, , , , -, . , -, , .
. , ID. . arrayMap
. -. -. ID .
. , .
.
, , .
- ClickHouse . .
- . ?
- (join), . , , .
- â . , , . (join), .
- .
. , -5 . 2,5 , -. , - . , , (int). . . 2,5 1,5 . .
. . 5 . , , - , , .
. , , , , , . .
- .
- , .
- - (join), , , , (join) .
- .
, , . .
. . ?
:
- - .
- .
?
, , 20 , . - , , . , , - .
: « , ?» « , ?». . . , , , ?
, , - , . . - . DWH. , .
â , , - , , , , . , , , - , . , . .
.
â :
- Last Interaction, interaction â , .
- First Interaction, . . , .
- â .
- .
- .
? Runtime Cassandra. Cassandra transaction storage, . . . - Runtime, , - - , Cassandra â . , . .
, transaction id, . . . .
, . , , 10 , 300 , . Cassandra , , Runtime , 10-15 .
, . ? , 30 . , , 30 . , - . 500 , , , Runtime , - . , .
ClickHouse. ClickHouse? , .
- , , . . mutable, ClickHouse mutable-.
- , , visit id. point query, ClickHouse . ClickHouse âŠ, . .
- , json, , json , , - . .
. . .
, .
? ClickHouse, , . attributed , ClickHouse . visit id , , . . , . , , . . ClickHouse, . . . , DWH, .
. ClickHouse , visit id, 1 000-2 000 visit id 1 000-2 000 . .
ClickHouse, 3 , .
, , .
. materialized view , attributed , . . . materialized view . . . materialized view , .
SQL. .
â ClickHouse json , . . . ClickHouse json. - .
visitParamExtractInt json , . . . transaction id visit id. .
â materialized . ? , , . . , . ClickHouse . json , .
materialized view â . . ? -, , . . visit id, .
â index_granularity. MergeTree, 8 192 index_granularity. ? . ClickHouse , . 8 192. , , , , overhead. index granularity, overhead. , . .
ClickHouse.
-, AggregatingMergeTree. AggregatingMergeTree argMax, . . , timestamp. . . ClickHouse. argMax .
- «» Runtime.
- 3 . , Cassandra, . . .
- 25 ClickHouse. 5 . , Cassandra , click based , impression based. . . , 30 , .
â Y, .
:
- 5 000 .
- 100 .
- 10 . , , .
- 100 .
.
. , , , . . , . , , , , . . , .
.
. ? 100 , : , . 100 runningDifference . RunningDifference â ClickHouse, .
, . 5 000 12,5 . , . . 12,5 .
- , Íx Íy â . . . . , . - 12,5 , . . 60 . .
-, - , , ClickHouse.
Hadoop , Spark, Greenplum. . . . - , .
ClickHouse .
, , . , , .
? . . . logReturn, .
, . , . . . . . , . . .
. «r» , . . . â .
12,5 , , . . . 2 500 5 000 . ClickHouse-. , .
, . : , , . logReturn, . . , logReturn. , . . groupArray . . ClickHouse, . .
proof of concept â , . . . .
: Log_return .
- 50 . 50 â , . . , 70 .
, , . . . . .
- â . â ClickHouse.
- Summing/AggregatingMergeTrees â , state . .
- Materialized Views . , , , , attribute , . . , , . , . Materialized Views , , ClickHouse.
- .
- , . , , .
, , ClickHouse , open source , . . . . , , , ClickHouse, InfiniDB. , , , SQL. !
- ! ! - Apache Phoenix?
-, , - . ClickHouse . - ClickHouse, . .
( ) Apache Phoenix â SQL- Hbase. Hbase key-value. . Hbase, Cassandra. . , , ClickHouse.
-
! , . ClickHouse, , ClickHouse , mutable. - , ClickHouse, , ? , . , ?
. . , , mutable, , . . . ClickHouse Oracle, - . ClickHouse , , ClickHouse . . . , , . , - ClickHouse , .
! . . , -. , ? , , , - , ? . . - , ?
, ClickHouse . Hadoop. Hadoop â . Hadoop . â , ClickHouse , , . . , - , .
. , 50 . , ?
-.
, .
3- .
! ! . , ClickHouse . . . -, ? ClickHouse? , ? , , ClickHouse , - .
, . ClickHouse . , . ? , . - 2012-2013- ClickHouse. . . - , , . . ClickHouse â . . . .
! , . ? -. , , , - .
, , . . , Materialized View, , . . . ClickHouse, . . extract - , primary key, , Materialized View. , . Materialized View, . .
.