cbor C. Amsüss Internet-Draft 29 July 2024 Intended status: Standards Track Expires: 30 January 2025 Packed CBOR: Table permutations draft-amsuess-cbor-packed-shuffle-00 Abstract Packed CBOR is a compression mechanism for Concise Binary Object Representation (CBOR) that can be used without a decompression step. This document introduces a means for altering configured tables to optimize the use of efficient encoding points by shuffling around entries in the packing tables. About This Document This note is to be removed before publishing as an RFC. The latest revision of this draft can be found at https://chrysn.codeberg.page/packed-shuffle/draft-amsuess-cbor- packed-by-reference.html. Status information for this document may be found at https://datatracker.ietf.org/doc/draft-amsuess-cbor- packed-shuffle/. Discussion of this document takes place on the CBOR Working Group mailing list (mailto:cbor@ietf.org), which is archived at https://mailarchive.ietf.org/arch/browse/cbor/. Subscribe at https://www.ietf.org/mailman/listinfo/cbor/. Source for this draft and an issue tracker can be found at https://codeberg.org/chrysn/packed-shuffle. Status of This Memo This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet- Drafts is at https://datatracker.ietf.org/drafts/current/. Amsüss Expires 30 January 2025 [Page 1] Internet-Draft Packed CBOR: Table permutations July 2024 Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." This Internet-Draft will expire on 30 January 2025. Copyright Notice Copyright (c) 2024 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/ license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License. Table of Contents 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 2. Setting up the tables by reference . . . . . . . . . . . . . 3 2.1. Permutation semantics . . . . . . . . . . . . . . . . . . 3 2.2. Examples . . . . . . . . . . . . . . . . . . . . . . . . 4 3. Security Considerations . . . . . . . . . . . . . . . . . . . 5 4. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 5 4.1. CBOR Tags Registry . . . . . . . . . . . . . . . . . . . 5 5. References . . . . . . . . . . . . . . . . . . . . . . . . . 6 5.1. Normative References . . . . . . . . . . . . . . . . . . 6 5.2. Informative References . . . . . . . . . . . . . . . . . 6 Appendix A. Example implementation . . . . . . . . . . . . . . . 6 Appendix B. Open issues . . . . . . . . . . . . . . . . . . . . 7 Author's Address . . . . . . . . . . . . . . . . . . . . . . . . 8 1. Introduction [ See abstract. ] This document uses the "CPA" convention of [I-D.bormann-cbor-draft-numbers]. If it reaches the RFC editing stage, as IANA processes the IANA Considerations, this note is to be removed, and all occurrences of "CPA" will have been replaced with allocated numbers. Amsüss Expires 30 January 2025 [Page 2] Internet-Draft Packed CBOR: Table permutations July 2024 2. Setting up the tables by reference CBOR tag CPA115 describes a permutation of the CBOR packing tables around a rump. Unlike tags CPA113 and most other table setup tags, it does not add items; instead, it modifies the table such that items with very short encoded lengths (most efficient are the first 16 items of the shared table, and the first item of the argument items in straight use, and the first 8 items of the inverted items) can be used even when they were not originally assigned to those positions. Without excluding others, there are two groups of use cases envisioned: * Rearranging preconfigured tables. If a media type comes with a pre-configured table, and/or when one ore more external tables are referenced (e.g. by using [I-D.amsuess-cbor-packed-by-reference]), a single tag CPA115 can be used around the main rump of the document to pull the most frequently used entries into the prominent positions. * Local optimization. If some table entries are frequent in a sub-tree of the document, using tag CPA115 around that subtree can be beneficiall to overall compactness even when the table was set up locally (using tag CPA113) with an ordering suitable for the whole document. Packed-Shuffle = #6.([ shuffle-shared, ?shuffle-argument, rump]) rump = any shuffle-shared = shuffle shuffle-argument = shuffle shuffle = [+(offset, ?length)] offset = uint length = nint cpa115 = 115 ; preliminary value, see IANA considerations 2.1. Permutation semantics Inside the rump, each table has a permutation applied compared to the outer table, described in the suffle-shared value for the shared table, and in the shuffle-argument value for the argument table (which defaults to the identical permutation). Amsüss Expires 30 January 2025 [Page 3] Internet-Draft Packed CBOR: Table permutations July 2024 The applied permutations are described in inverse (so that a table index used in the rump gets mapped to a table index outside this tag): The item shuffle consecutively lists the lowest items of the inner table by indicating their postions in the outer table. A negative number indicates that a group of in total (1 - length) items are included consecutively starting with the item before the negative number. This creates a kind of run-length encoding. At the end of the shuffle array, all items that were not referenced are appended in their original sequence from the outer table. An inner index can be calculated in a way suitable to constrained systems by applying the algorighm in Appendix A. 2.2. Examples Outer table: A B C D E F G H Permutation CBOR item: [1 / pick the "B" /, 4 / pick the "E" /, -2 / "3 items" /] Inner table: B E F G A C D H Figure 1: Single permutation illustrated Amsüss Expires 30 January 2025 [Page 4] Internet-Draft Packed CBOR: Table permutations July 2024 1113([ ["tick.", "tock.", "tickety."], ["The clock goes ", "And it goes "], [ [ 6(simple(0)), / "The clock goes tick." / 6(simple(1)), / "The clock goes tock." / 224(simple(2)), / "And it goes tickety." / 6(simple(1)), / "The clock goes tock." / ] 115([[], [1], [ / The argument table is now: ["And it goes ", "The clock goes "] / 6(simple(0)), / "And it goes tick." / 6(simple(1)), / "And it goes tock." / 6(simple(2)), / "And it goes tickety." / 6(simple(1)), / "And it goes tock." / ]]), [ 6(simple(1)), / "The clock goes tock." / ] ] ]) Figure 2: Using permutations to make the 6() straight argument available. Note that in the tick/tock example, using 6() over 224() saves just 1 byte, whereas the setup around tag CPA115 costs 6 bytes. This particular use would break even at 6 uses of tag 6. 3. Security Considerations [ I don't think there's anything to add? ] The security considerations of [I-D.ietf-cbor-packed] apply. 4. IANA Considerations 4.1. CBOR Tags Registry In the registry "CBOR Tags", IANA is requested to allocate one tag: * Tag: CPA115 * Data item: Array [shuffle-shared, ?shuffle-argument, rump] Amsüss Expires 30 January 2025 [Page 5] Internet-Draft Packed CBOR: Table permutations July 2024 * Semantics: "Packed CBOR: table permutation" * Reference: This document 5. References 5.1. Normative References [I-D.bormann-cbor-draft-numbers] Bormann, C., "Managing CBOR numbers in Internet-Drafts", Work in Progress, Internet-Draft, draft-bormann-cbor- draft-numbers-03, 29 February 2024, . [I-D.ietf-cbor-packed] Bormann, C. and M. Gütschow, "Packed CBOR", Work in Progress, Internet-Draft, draft-ietf-cbor-packed-12, 2 March 2024, . 5.2. Informative References [I-D.amsuess-cbor-packed-by-reference] Amsüss, C., "Packed CBOR: Table set up by reference", Work in Progress, Internet-Draft, draft-amsuess-cbor-packed-by- reference-02, 4 March 2024, . Appendix A. Example implementation The following algorithm illustrates how table lookup can be implemented without the need for variable size storage per compression table. An implementation in fixed memory is expected to accommodate up to a fixed size of nested table setup tags. When parsing the shuffle item, if may calculate early_item_count and store it for later use, along with a reference to the position of the shuffle table, through which it iterates again for every item to be unpacked. Amsüss Expires 30 January 2025 [Page 6] Internet-Draft Packed CBOR: Table permutations July 2024 def early_item_count(shuffle): count = 0 for (offset, length) in shuffle: length = 1 if length is None else 1 - length count += length return count def lookup(index, shuffle, outertable): shuffle_early_items = early_item_count(shuffle) if index < shuffle_early_items: for (offset, length) in shuffle: length = 1 if length is None else 1 - length if index < length: return outertable[offset + index] index -= length else: outer_index = index - shuffle_early_items for (offset, length) in shuffle: length = 1 if length is None else 1 - length if offset <= outer_index: outer_index += length return outertable[outer_index] Appendix B. Open issues * For this to progress it will need some real use cases, not just synthetic examples. * An unoptimized version of this can be achieved with tag CPA113 from [I-D.ietf-cbor-packed], without the numeric encoding of items and the special encoding for consecutive items. Using CPA113 this way is relatively easy for shared items; for argument items, it requires referencing them from the shared table or performing a no-op concatenation with an empty item. * Is skipping of used values worth it? If we accepted that this tag produced not a permutation but an extended copy of the outer table, we could do without the double iteration / pre-computation of the number of items. This would be a simplification, and more similar to using CPA113. The advantage of the permutation construction is that the non-optimized entries are not pushed back more than necessary, saving a bit (in some cases, literally) in the encoded size of those points. Amsüss Expires 30 January 2025 [Page 7] Internet-Draft Packed CBOR: Table permutations July 2024 Is there maybe a different expression of the permutation that has the same nice properties added complexity? (Stating the number of items in front of the shuffle item saves the double processing, but adds redundant information and does not help with the branching in the implementation). Whether this document should progress on from the initial version should depend on whether good use cases are identified and whether it outperforms alternative solutions by a sufficient margin. Author's Address Christian Amsüss Austria Email: christian@amsuess.com Amsüss Expires 30 January 2025 [Page 8]