English | 简体中文 | 繁體中文 | Русский язык | Français | Español | Português | Deutsch | 日本語 | 한국어 | Italiano | بالعربية
Preface
The compressed list (ziplist) is composed of a series of specially encoded memory blocks and plays a very important role in the data storage optimization of Redis. This article summarizes the compressed list ziplist, which is frequently used in Redis. It is not an exaggeration to say that this data structure is everywhere in Redis, except for the list, many other data structures also use it as a transition, such as the SortedSet mentioned in the previous article. There is no need to say more, let's take a detailed look at the introduction.
1. Brief introduction to the data structure of compressed list ziplist
First, let's take a look at the structure of ziplist as a whole, as shown in the following figure:
Structure diagram of compressed list ziplist
It can be seen that there are many fields and different byte sizes, but this is exactly the essence of the compressed list. Let's summarize them one by one.
field | meaning |
---|---|
zlbytes | This field is the first field of the compressed list, which is an unsigned integer, occupying4bytes. It is used to indicate the number of bytes occupied by the entire compressed list (including itself). |
zltail | An unsigned integer, occupying4bytes. It is used to store the offset from the head of the compressed list to the last entry (not the tail element zlend), which is used in scenarios where fast jumping to the end of the list is required. |
zllen | An unsigned integer, occupying2bytes. It is used to store the total number of entries contained in the compressed list. |
zlend | Special entry used to represent the end of a compressed list. It occupies one byte and its value is always255。 |
Summarized as the head and tail of ziplist, let's summarize the entry field, which is of great importance.
Generally, an entry consists of prevlen, encoding, and entry-The data field consists of three fields, but when the entry is a very small integer, it will omit the entry according to the encoding.-The data field. The following is a summary in order:
The first is the field prevlen:represents the length of the previous entry, which has two encoding methods.
Then comes the field encoding:it will use different encoding methods according to the content of the current element, as follows:
1、if the element content is a string, the values of encoding are:
00xx xxxx :00开头表示 the length of the string uses6bits represent.
01xx xxxx | xxxx xxxx :01The starting bit indicates that the length of the string is determined by14bit represents, this14bits are used for big-endian storage.
1000 0000 | xxxx xxxx | xxxx xxxx | xxxx xxxx | xxxx xxxx :10The starting bit indicates that the next four bytes are the length of the string, this32bits are used for big-endian storage.
2、if the element content is a number, the values of encoding are:
1100 0000:represents the number of bytes occupied after2bytes.
1101 0000:represents the number of bytes occupied after4bytes.
1110 0000:represents the number of bytes occupied after8bytes.
1111 0000 :represents the number of bytes occupied after3bytes.
1111 1110 :represents the number of bytes occupied after1bytes.
1111 1111 :represents the last element in the compressed linked list (special encoding).
1111 xxxx :represents using the last4bit represents 0~12integer, since 0000,1110and1111Three have been occupied, that is to say, the four digits of xxxx here can only represent 0001~1101,converted to decimal is the number1~13,but Redis stipulates that it is used to represent 0~12,therefore when encountering this encoding, we need to take out the last four digits and subtract1to get the correct value.
The last is the field entry-data:If the value of the element is a string, then save the value of the element itself. If the value of the element is a very small number (by the encoding rules above, that is 0~12),then there is no such field.
The encoding of the compressed linked list is very complex, but this is also the essence of this data structure. Let's look at an example together:
Note:This example is mentioned in the Redis source code
//consists of elements2,5composing the compressed linked list [0f 00 00 00] [0c 00 00 00] [02 00] [00 f3] [02 f6] [ff] | | | | | | zlbytes zltail entries "2" "5end //The encoded content of the string "Hello World" [02] [0b] [48 65 6c 6c 6f 20 57 6f 72 6c 64]
The above is a segment represented in hexadecimal2,5a compressed linked list composed of two elements.
Next, let's insert a string element 'Hello World' at the end of this compressed linked list. First, let's see how to encode the string. According to the agreed encoding rules, the length of the previous element must be represented by a byte first. Here, the previous element is5It takes up two bytes, so the first byte is used to represent the length of the previous element 02The next is the encoding of the string. The length of the string to be added is11(spaces are also counted), converted to binary is1011According to the encoding rules of the string, using 0000 1011That is, converted to hexadecimal is 0b, and then added to the ASCII code of our string itself, combining to get the encoding of the string: [02] [0b] [48 65 6c 6c 6f 20 57 6f 72 6c 64]。
At this point, the entire compressed linked list becomes:
[0f 00 00 00] [0c 00 00 00] [02 00] [00 f3] [02 f6] [02 0b 48 65 6c 6c 6f 20 57 6f 72 6c 64] [ff] | | | | | | | zlbytes zltail entries "2" "5" "Hello World" end
Part Two: Analysis of ziplist Command Source Code
After understanding the encoding rules above, let's take a look at the source code of some operations on the compressed linked list ziplist. This article summarizes the basic principles of the compressed linked list through the creation, insertion, deletion, and search of elements.
Firstly, let's create:
//Define the header size of the compressed linked list composed of zlbytes, zltail, and zllen #define ZIPLIST_HEADER_SIZE (sizeof(uint32_t)*2+sizeof(uint16_t)) //Create a compressed linked list and return a pointer to the list unsigned char *ziplistNew(void) { //Here is why+1This is because the tail element occupies one byte, which is also the minimum size of a compressed list unsigned int bytes = ZIPLIST_HEADER_SIZE+1; //Allocate memory unsigned char *zl = zmalloc(bytes); //Set the size of the list ZIPLIST_BYTES(zl) = intrev32if be(bytes); //Set the offset of the last element ZIPLIST_TAIL_OFFSET(zl) = intrev32if be(ZIPLIST_HEADER_SIZE); //Set the number of elements ZIPLIST_LENGTH(zl) = 0; //Set the tail element (the above is just to apply for space) zl[bytes-1] = ZIP_END; return zl; }
The logic for creating a compressed list is very simple, which is to apply for fixed space containing the header and tail nodes, and then initialize the list context.
Compared to creation, the source code for adding elements is very long. To facilitate understanding, let's sort out the logic of adding elements before looking at the source code.
The above three steps are the core steps, and the rest include operations such as updating the tail node offset, modifying the number of elements in the list, etc. Of course, since the compressed list is implemented based on an array, memory copying is indispensable when inserting or deleting elements.
After summarizing the above steps, we start to analyze the source code step by step, which is quite long, let's take a look slowly:
//The four parameters in order are: compressed list, insertion position (the new element is inserted after the p element), element value, and element length unsigned char *__ziplistInsert(unsigned char *zl, unsigned char *p, unsigned char *s, unsigned int slen) { //Here we save the length of the current list size_t curlen = intrev32if be(ZIPLIST_BYTES(zl)), reqlen; unsigned int prevlensize, prevlen = 0; size_t offset; int nextdiff = 0; unsigned char encoding = 0; long long value = 123456789; zlentry tail; //1. The purpose of this logic is to obtain the length of the preceding element if (p[0] != ZIP_END) { //If the element at the insertion position is not the last element, then get the length of the element //This is split for convenience later, prevlensize saves the length of the encoding field, and prevlen saves the length of the element itself ZIP_DECODE_PREVLEN(p, prevlensize, prevlen); } else { //If the element at the insertion position is the end element, then the new element needs to be inserted at the end of the linked list //Get the last element of the linked list (note: the last element is not equal to the end element) unsigned char *ptail = ZIPLIST_ENTRY_TAIL(zl); if (ptail[0] != ZIP_END) { //If the last element is not the end element, then this element is the preceding element of the new element, and the length of this element is obtained prevlen = zipRawEntryLength(ptail); } //Otherwise, it means that the linked list has no elements at all, that is, the length of the preceding element of the new element is 0 } //2. Encode the new element to get the total size of the new element if (zipTryEncoding(s,slen,&value,&encoding)) { //If it is a number, then encode it as a number reqlen = zipIntSize(encoding); } else { //The element length is the length of the string reqlen = slen; } //The total length of the new element is the length of the value plus the length of the preceding element and the encoding element reqlen += zipStorePrevEntryLength(NULL,prevlen); reqlen += zipStoreEntryEncoding(NULL,encoding,slen); //If the insertion position is not at the end of the linked list, then the subsequent elements of the new element's prevlen field need to be judged //According to the encoding rules above, this field may need to be expanded int forcelarge = 0; nextdiff = (p[0] != ZIP_END) &63; zipPrevLenByteDiff(p,reqlen) : 0; if (nextdiff == -4 && reqlen < 4) {}} nextdiff = 0; forcelarge = 1; } //Expand the array to the new calculated size, since the address of the new array may change, therefore, the old offset is recorded here offset = p-zl; zl = ziplistResize(zl,curlen+reqlen+nextdiff); //Calculate the insertion position on the new array p = zl+offset; //If the newly inserted element is not at the end of the linked list if (p[0] != ZIP_END) { //copy the new element and its subsequent elements to the new array,-1for the last element memmove(p+reqlen,p-nextdiff,curlen-offset-1+nextdiff); //The prevlen field of the subsequent element of the new element if (forcelarge) zipStorePrevEntryLengthLarge(p+reqlen,reqlen); else zipStorePrevEntryLength(p+reqlen,reqlen); //Update the offset of the last element ZIPLIST_TAIL_OFFSET(zl) = intrev32ifbe(intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))+reqlen); //When the subsequent element of the new element is not only one, the new tail element offset needs to be added with nextdiff zipEntry(p+reqlen, &tail); if (p[reqlen+tail.headersize+tail.len] != ZIP_END) { ZIPLIST_TAIL_OFFSET(zl) = intrev32ifbe(intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))+nextdiff); } } else { //The new element is inserted at the end of the list, and the offset of the tail is updated ZIPLIST_TAIL_OFFSET(zl) = intrev32ifbe(p-zl); } //nextdiff != 0 indicates that the length of the subsequent element has changed, so we need to cascade update the subsequent elements of the subsequent elements if (nextdiff != 0) { offset = p-zl; zl = __ziplistCascadeUpdate(zl,p+reqlen); p = zl+offset; } //Write the new element into the list p += zipStorePrevEntryLength(p,prevlen); p += zipStoreEntryEncoding(p,encoding,slen); if (ZIP_IS_STR(encoding)) { memcpy(p,s,slen); } else { zipSaveInteger(p,value,encoding); } //The number of elements stored in the compressed ziplist+1 ZIPLIST_INCR_LENGTH(zl,1; return zl; }
After analyzing the logic of inserting elements, take a deep breath, it's really long, and there are many details.
Next, let's take a look at the process of deleting elements. Compared to adding, deletion is relatively simple. After clearing the current element, subsequent elements need to be copied one by one (this is the difference between array and linked list data structures), and then pay attention to whether cascading updates are needed. See the code above:
//The parameters are in order: compressed ziplist, the starting position of the element to be deleted, and the number of elements to be deleted unsigned char *__ziplistDelete(unsigned char *zl, unsigned char *p, unsigned int num) { unsigned int i, totlen, deleted = 0; size_t offset; int nextdiff = 0; zlentry first, tail; //Read the element pointed to by p and store it in first zipEntry(p, &first); for (i = 0; p[0] != ZIP_END && i < num; i++) {}} //Count the total length deleted p += zipRawEntryLength(p); //Count the number of elements actually deleted deleted++; } //The number of bytes to be deleted totlen = p-first.p; if (totlen > 0) { if (p[0] != ZIP_END) { //Determine if the size of the element has changed nextdiff = zipPrevLenByteDiff(p, first.prevrawlen); //Modify the prevlen field of the element after the deleted element p -= nextdiff; zipStorePrevEntryLength(p, first.prevrawlen); //Update the offset of the end element ZIPLIST_TAIL_OFFSET(zl) = intrev32ifbe(intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))-totlen); //When the successor element of the deleted element is not just one, the offset of the new end element needs to be added by nextdiff zipEntry(p, &tail); if (p[tail.headersize+tail.len] != ZIP_END) { ZIPLIST_TAIL_OFFSET(zl) = intrev32ifbe(intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))+nextdiff); } //Move the remaining elements to the front memmove(first.p, p, intrev32ifbe(ZIPLIST_BYTES(zl))-(p-zl)-1; } else { //Directly delete to the end of the list, so no memory copy is required, just modify the offset of the last element ZIPLIST_TAIL_OFFSET(zl) = intrev32ifbe((first.p-zl)-first.prevrawlen); } //Resize the array size offset = first.p-zl; zl = ziplistResize(zl, intrev32ifbe(ZIPLIST_BYTES(zl))-totlen+nextdiff); //Modify the number of elements in the list ZIPLIST_INCR_LENGTH(zl,-deleted); p = zl+offset; //nextdiff != 0 indicates that the size of the element has changed, and a cascading update is required if (nextdiff != 0) zl = __ziplistCascadeUpdate(zl, p); } return zl; }
Let's take a look at the element search operation:
//The parameters are in order: compressed list, value of the element to be found, length of the element to be found, number of elements skipped between comparisons unsigned char *ziplistFind(unsigned char *p, unsigned char *vstr, unsigned int vlen, unsigned int skip) { int skipcnt = 0; unsigned char vencoding = 0; long long vll = 0; //Keep looping as long as it's not the end element while (p[0] != ZIP_END) { unsigned int prevlensize, encoding, lensize, len; unsigned char *q; //Query the prevlen field of the current element in the linked list ZIP_DECODE_PREVLENSIZE(p, prevlensize); //Query the encoding field of the current element in the linked list ZIP_DECODE_LENGTH(p + prevlensize, encoding, lensize, len); q = p + prevlensize + lensize; //Reached the position of the element to be compared if (skipcnt == 0) { //If the current element in the linked list is a string if (ZIP_IS_STR(encoding)) { //Compare with the string to be searched if (len == vlen && memcmp(q, vstr, vlen) == 0) { //Matching success, then the pointer of the element to be searched return p; } } else { //If the current element is a number and vencoding is 0 if (vencoding == 0) { //Attempt to encode the value to be searched as a number if (!zipTryEncoding(vstr, vlen, &vll, &vencoding)) { //If the encoding fails, it means the element to be searched is not a number at all //Then set vencoding to the maximum value, serving as a marker //That is, there is no need to try to encode the value to be searched as a number later vencoding = UCHAR_MAX; } assert(vencoding); } //If vencoding is not UCHAR_MAX, it means the element to be searched is successfully encoded as a number if (vencoding != UCHAR_MAX) { //Extract the current element from the linked list by number long long ll = zipLoadInteger(q, encoding); if (ll == vll) { //If two numbers are equal, return the element pointer return p; } } } //Reset the number of elements to be skipped skipcnt = skip; } else { //Skip elements skipcnt--; } //Traverse the next element p = q + len; } //Traversed the entire list, but did not find the element return NULL; }
Here we have summarized the principles of the four basic operations of creating, adding, deleting, and searching compressed lists.
Summary of Compressed List ziplist Data Structure
The ziplist compression in redis is very widely used, and it is considered to be one of the most characteristic data structures in redis. The essence of this data structure actually lies in the coding rules summarized in the first part of the article. After understanding this part, you can simply read the source code to deepen your understanding.
It has to be said that the source code part is indeed a bit long, and it really requires patience. I was also very confused in the process of reading it. If you are interested in the source code, I suggest that you first sort out what needs to be done for a certain operation (such as the insertion element mentioned above), and then look at the code, which may be easier to understand.
Finally, leave a small question. Since the ziplist in redis has such a high memory utilization, why does it still provide the ordinary linked list structure for users to use?
There is no standard answer to this question. It depends on different people's opinions.
Summary
That's all for this article. I hope the content of this article has certain reference value for everyone's learning or work. If you have any questions, you can leave a message for communication. Thank you for your support of the呐喊 tutorial.
Declaration: The content of this article is from the Internet, and the copyright belongs to the original author. The content is contributed and uploaded by Internet users spontaneously. This website does not own the copyright, does not edit the content manually, and does not assume any relevant legal liability. If you find any content suspected of copyright infringement, please send an email to: notice#oldtoolbag.com (Please replace # with @ when sending an email for reporting. Provide relevant evidence, and once verified, this site will immediately delete the infringing content.)