tags: Open source community publicity Apache IoTDB Open source
Time series database - Give wings to the Internet of Things - HBase/Spark/BigData with attitude
Apache IoTDB is a time series database that originated at the School of Software at Tsinghua University. Its main usage scenarios are Internet of Things related industries, such as the Internet of Vehicles, wind power, subways, and aircraft monitoring; specific application cases and company details are collected in "IoTDB usage information collection in actual companies". It uses columnar storage, data encoding, pre-computation, and indexing techniques, offers a SQL-like interface, can support writing millions of data points per second per node, and can return query results over more than a trillion data points within seconds. It can also be easily integrated with Apache Hadoop, MapReduce, and Apache Spark for analysis.
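To make the link between columnar storage and data encoding concrete, here is a minimal sketch of delta encoding applied to one column of timestamps. It is only an illustration of the general idea, not IoTDB's actual encoder; the function names and the sample timestamps are made up for this example.

```python
from typing import List


def delta_encode(values: List[int]) -> List[int]:
    """Keep the first value, then store each value as its difference from the previous one."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


def delta_decode(deltas: List[int]) -> List[int]:
    """Rebuild the original values by accumulating the stored differences."""
    out: List[int] = []
    total = 0
    for d in deltas:
        total += d
        out.append(total)
    return out


# Hypothetical timestamps (ms) sampled once per second for a single sensor column.
timestamps = [1_600_000_000_000 + 1000 * i for i in range(5)]
encoded = delta_encode(timestamps)
print(encoded)
```

Because a sensor usually reports at a regular interval, the differences in one column are small and repetitive, which is what makes a column of time series data cheap to compress.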
The Internet of Things is characterized by the existence of one or more devices, which are organized together in various forms to observe or record data generated by the same environment at the same time.
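As a rough sketch of what this looks like as data, the snippet below groups points from several devices into rows that share a timestamp. The device names, measurement names, and values are all hypothetical, and the layout is only an illustration, not IoTDB's storage format.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (device, measurement, timestamp, value): every name and value here is made up.
Point = Tuple[str, str, int, float]

points: List[Point] = [
    ("turbine1", "speed", 1, 12.5),
    ("turbine1", "temperature", 1, 60.1),
    ("turbine1", "speed", 2, 12.7),
    ("turbine2", "speed", 1, 11.9),
]


def align_by_time(points: List[Point], device: str) -> Dict[int, Dict[str, float]]:
    """Group one device's points into rows keyed by timestamp."""
    rows: Dict[int, Dict[str, float]] = defaultdict(dict)
    for dev, measurement, ts, value in points:
        if dev == device:
            rows[ts][measurement] = value
    return dict(sorted(rows.items()))


print(align_by_time(points, "turbine1"))
```

Measurements that a device records at the same moment end up in the same row, while a measurement that is missing at a given timestamp is simply absent from that row.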
In 2012, Sany Heavy Industry's production business had 200,000 pieces of equipment with 3 years of stored data, and this TB-level data was more than Oracle could bear. The key problem was not just the large volume of existing data, but that new data was still growing at a very fast rate. The company later contacted the first batch of IoTDB developers, although the solution at that time was still based on Cassandra. A cluster of 5 machines was planned, and its performance was just sufficient, but as time went by the total number of devices kept increasing, and so did the number of query requests from the business systems.

In 2015, we began to develop a distributed time series database based on Cassandra, but because it was not completely self-developed, we were somewhat constrained. Moreover, after a lot of effort, we found that modifying Cassandra might require large-scale restructuring of its data-handling code. In the end, we decided to design a new storage method to provide efficient writes, low-latency reads, and high-compression-ratio persistence for time series data in Internet of Things scenarios. This is how the IoTDB project was born and set out on the road of independent research and development. The project was later donated to the Apache Software Foundation for incubation, and after graduating it became the current Apache IoTDB.
The IoTDB suite consists of several components that together cover the chain "data collection - data writing - data storage - data query - data visualization - data analysis". The following figure shows the overall application architecture formed when all components of the IoTDB suite are used. Here, the IoTDB suite refers to all of the components together, while IoTDB refers specifically to the time series database component within it.
In the above figure, users can import time series data into a local or remote IoTDB through JDBC. This includes data collected from sensors on devices, system status data such as server load, CPU, and memory, time series data in message queues, application time series data, and time series data from other databases. Users can also write this data directly to a TsFile file locally (or on HDFS).
TsFile files can be written to HDFS, so that data processing tasks such as anomaly detection and machine learning can run on the Hadoop or Spark data processing platforms.
For TsFile files written to HDFS or to local storage, the TsFile-Hadoop or TsFile-Spark connector lets Hadoop or Spark process the data.
The analysis results can be written back into TsFile files.
IoTDB and TsFile also provide client tools in SQL, script, and graphical forms, so that users can view and write data.
The test tool used for the comparisons below was developed by the Big Data Laboratory of Tsinghua University.
1. Comparison of write performance

| Dataset 2 | Clients | Storage groups | Devices | Variables | Batch size | Loop | Data volume | Write speed (points/s) | Hard disk data size |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| IoTDB | 10 | 10 | 10 | 10 | 1000 | 1000000 | 1.00E+11 | 24750321.93 | 38306092 |
| InfluxDB | 10 | 10 | 10 | 10 | 1000 | 1000000 | 1.00E+11 | 304682932 | |
| TimescaleDB | 10 | 10 | 10 | 10 | 1000 | 1000000 | 1.00E+11 | 737689.22 | 1610219064 |

| Dataset 1 | Clients | Storage groups | Devices | Variables | Batch size | Loop | Data volume | Write speed (points/s) | Hard disk data size |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| IoTDB | 10 | 10 | 10 | 10 | 1000 | 100000 | 10000000000 | 20706345.15 | 3599732 |
| InfluxDB | 10 | 10 | 10 | 10 | 1000 | 100000 | 10000000000 | 1729907.81 | 30546560 |
| TimescaleDB | 10 | 10 | 10 | 10 | 1000 | 100000 | 10000000000 | 715857 | 161026468 |
| KairosDB | 10 | 10 | 10 | 10 | 10000 | 10000 | 10000000000 | 24924.97 | 76263380 |

The above data shows that IoTDB's write performance is more than 10 times that of comparable databases, with single-machine write speed reaching 20 million points per second. IoTDB also occupies the least disk space; in online businesses with relatively large data volumes, this can mean a difference of 1 to 2 hard disks per month.
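The comparison can be checked with a few lines of arithmetic. The sketch below copies the Dataset 1 write speeds and disk sizes from the table above and computes each database's ratio against IoTDB; the function name is made up for this example, and the disk sizes are treated as plain numbers because the table does not state their unit.

```python
# Dataset 1 figures copied from the table above: (write speed in points/s, hard disk data size).
dataset1 = {
    "IoTDB": (20706345.15, 3599732),
    "InfluxDB": (1729907.81, 30546560),
    "TimescaleDB": (715857, 161026468),
    "KairosDB": (24924.97, 76263380),
}


def ratios_vs_iotdb(results):
    """Return (IoTDB speed / other speed, other disk size / IoTDB disk size) per database."""
    base_speed, base_disk = results["IoTDB"]
    return {
        name: (base_speed / speed, disk / base_disk)
        for name, (speed, disk) in results.items()
        if name != "IoTDB"
    }


for name, (speedup, disk_ratio) in ratios_vs_iotdb(dataset1).items():
    print(f"{name}: IoTDB writes {speedup:.1f}x faster; {name} uses {disk_ratio:.1f}x the disk space")
```

Both ratios are oriented so that a value above 1 favors IoTDB.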
Raw data query

| Database | Clients | Storage groups | Devices | Sequence data volume | Variables | Points per query | Loop | Speed (points/s) | AVG | MIN |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| IoTDB | 10 | 10 | 10 | 1.00E+09 | 1 | 1000000 | 100 | 12942984.85 | 740.27 | 457.04 |
| InfluxDB | 10 | 10 | 10 | 1.00E+09 | 1 | 1000000 | 100 | 1779606.4 | 5591 | 4666.39 |
| TimescaleDB | 10 | 10 | 10 | 1.00E+09 | 1 | 1000000 | 100 | 3781467.52 | 2345.69 | 1193.78 |

Aggregated data query

| Database | Clients | Storage groups | Devices | Sequence data volume | Variables | Loop | Scope | Speed (points/s) | AVG | MIN |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| IoTDB-1 | 10 | 10 | 10 | 1.00E+09 | 1 | 100 | 0.0001 | 49.75 | 27.87 | 18.03 |
| IoTDB-2 | 10 | 10 | 10 | 1.00E+09 | 1 | 100 | 0.001 | 49.75 | 49.14 | 19.87 |
| IoTDB-3 | 10 | 10 | 10 | 1.00E+09 | 1 | 100 | 0.01 | 49.76 | 48.69 | 22.32 |
| IoTDB-4 | 10 | 10 | 10 | 1.00E+09 | 1 | 100 | 0.1 | 48.68 | 99.14 | 25.56 |
| IoTDB-5 | 10 | 10 | 10 | 1.00E+09 | 1 | 100 | 1 | 14 | 595.61 | 45.54 |
| InfluxDB-1 | 10 | 10 | 10 | 1.00E+09 | 1 | 100 | 0.0001 | 234.32 | 40.28 | 21.63 |
| InfluxDB-2 | 10 | 10 | 10 | 1.00E+09 | 1 | 100 | 0.001 | 28.88 | 341.9 | 238.1 |
| InfluxDB-3 | 10 | 10 | 10 | 1.00E+09 | 1 | 100 | 0.01 | 3.07 | 3226.87 | 2664.86 |
| TimescaleDB-1 | 10 | 10 | 10 | 1.00E+09 | 1 | 100 | 0.0001 | 42.39 | 220.57 | 120.5 |
| TimescaleDB-2 | 10 | 10 | 10 | 1.00E+09 | 1 | 100 | 0.001 | 5.8 | 1502.9 | 754.15 |
| TimescaleDB-3 | 10 | 10 | 10 | 1.00E+09 | 1 | 100 | 0.01 | 1.02 | 9711.55 | 7148.69 |
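
To see how the gap changes as the queried range grows, the sketch below compares the AVG column of the aggregated-query table at the three scopes that all three databases were tested with. It assumes that AVG is an average per-query latency, so that lower is better; the table does not state this or give a unit, and the function name is made up for this example.

```python
# AVG values copied from the aggregated-query table, keyed by the "scope" column.
# Assumption: AVG is an average per-query latency, so a lower value is better.
avg = {
    "IoTDB": {0.0001: 27.87, 0.001: 49.14, 0.01: 48.69},
    "InfluxDB": {0.0001: 40.28, 0.001: 341.9, 0.01: 3226.87},
    "TimescaleDB": {0.0001: 220.57, 0.001: 1502.9, 0.01: 9711.55},
}


def avg_ratio(database, scope):
    """How many times larger a database's AVG is than IoTDB's at the same scope."""
    return avg[database][scope] / avg["IoTDB"][scope]


for scope in (0.0001, 0.001, 0.01):
    parts = ", ".join(
        f"{db} {avg_ratio(db, scope):.1f}x" for db in ("InfluxDB", "TimescaleDB")
    )
    print(f"scope {scope}: {parts}")
```

Under that assumption, the ratio against IoTDB grows with the scope for both InfluxDB and TimescaleDB.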

Overall, whether in writing, raw data queries, or aggregation queries, IoTDB delivers almost 10 times the performance of competing databases, and it occupies about 10 times less disk space than comparable databases.