I was comparing the following SQL engines:
- MySQL, embedded
- Hive against data in different formats: text files, Parquet and ORC
- Big SQL on Hive tables, Parquet and ORC
- Spark SQL
- Phoenix, SQL engine for HBase.
This is not a benchmark of any kind; the purpose is not to prove the superiority of one SQL engine over another. I also have not done any tuning or reconfiguration to speed things up. The goal was just to run a simple check after installation and have a few numbers at hand.
The test description and several results are here.
Although I do not claim any ultimate authority here, I can provide several conclusions.
- Big SQL is the winner, particularly compared to Hive. Very important: Big SQL runs on the same physical data as Hive; the only difference is the computational model. It even beats MySQL, although MySQL will of course get the upper hand for OLTP requests.
- Hive behaves much better when paired with Tez. On the other hand, its execution time is very unstable and can change drastically from one run to the next.
- Spark SQL is in a class of its own, since it is hard to outmatch in-memory execution.
- Phoenix SQL is at the end of the race, but the execution time is very stable.
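The remarks about stable versus fluctuating execution times can be put into numbers by running the same query several times and looking at the spread. Below is a minimal sketch of such a check; it is not the harness used for the results above. The names `time_runs`, `summarize` and `fake_query` are hypothetical, and `fake_query` is only a stand-in for a real call to whichever engine is being checked.

```python
import statistics
import time


def time_runs(run_query, repeats=5):
    """Call run_query() `repeats` times and return each wall-clock duration in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Return the mean, sample standard deviation and coefficient of variation."""
    mean = statistics.mean(durations)
    # A single run has no spread to measure, so report 0.0 in that case.
    stdev = statistics.stdev(durations) if len(durations) > 1 else 0.0
    cv = stdev / mean if mean else 0.0
    return {"mean": mean, "stdev": stdev, "cv": cv}


if __name__ == "__main__":
    # Hypothetical stand-in: replace with a function that submits the query
    # to the engine under test and waits for the full result.
    def fake_query():
        sum(i * i for i in range(100_000))

    print(summarize(time_runs(fake_query)))
```

A low coefficient of variation (`cv`) would correspond to the "very stable" behaviour noted for Phoenix, a high one to the run-to-run swings seen with Hive on Tez.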