Distributed Aggregation Protocol for Privacy Preserving Measurement

Internet-Draft	DAP	November 2024
Geoghegan, et al.	Expires 7 May 2025	[Page]

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶

This Internet-Draft will expire on 7 May 2025.¶

Copyright Notice

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶

1. Introduction

This document describes the Distributed Aggregation Protocol (DAP) for privacy preserving measurement. The protocol is executed by a large set of clients and two aggregation servers. The aggregators' goal is to compute some aggregate statistic over measurements generated by clients without learning the measurements themselves. This is made possible by distributing the computation among the aggregators in such a way that, as long as at least one of them executes the protocol honestly, no measurement is ever seen in the clear by any aggregator.¶

1.1. Change Log

(RFC EDITOR: Remove this section.)¶

(*) Indicates a change that breaks wire compatibility with the previous draft.¶

13:¶

Bump draft-irtf-cfrg-vdaf-12 to 13 [VDAF] and adopt the streaming aggregation interface. Accordingly, clarify that DAP is only compatible with VDAFs for which aggregation is order insensitive.¶
Add public extensions to report metadata. (*)¶
Improve extension points for batch modes. (*)¶
During the upload interaction, allow the Leader to indicate to the Client which set of report extensions it doesn't support.¶
Add a start time to task parameters and require rejection of reports outside of the time validity window. Incidentally, replace the task end time with a task duration parameter.¶
Clarify underspecified behavior around aggregation skew recovery.¶
Improve IANA considerations and add guidelines for extending DAP.¶
Rename "upload extension" to "report extension", and "prepare error" to "report error", to better align the names of these types with their functionality.¶
Bump version tag from "dap-12" to "dap-13". (*)¶

12:¶

Bump draft-irtf-cfrg-vdaf-08 to 12 [VDAF], and specify the newly-defined application context string to be a concatenation of the DAP version in use with the task ID. (*)¶
Add support for "asynchronous" aggregation, based on the Leader polling the Helper for the result of each step of aggregation. (*)¶
Update collection semantics to match the new aggregation semantics introduced in support of asynchronous aggregation. (*)¶
Clarify the requirements around report replay protection, defining when and how report IDs must be checked and stored in order to correctly detect replays.¶
Remove support for per-task HPKE configurations. (*)¶
Rename "query type" to "batch mode", to align the name of this configuration value with its functionality.¶
Rename the "fixed-size" batch mode to "leader-selected", to align the name with the behavior of this query type.¶
Remove the max_batch_size parameter of the "fixed-size" batch mode.¶
Restore the part_batch_selector field of the Collection structure, which was removed in draft 11, as it is required to decrypt collection results in some cases. (*)¶
Update PrepareError allocations in order to remove an unused value and to reserve the zero value for testing. (*)¶
Document distributed-system and synchronization concerns in the operational considerations section.¶
Document additional storage and runtime requirements in the operational considerations section.¶
Document deviations from the presentation language of Section 3 of [RFC8446] for structures described in this specification.¶
Clarify that differential privacy mitigations can help with privacy, rather than robustness, in the operational considerations section.¶
Bump version tag from "dap-11" to "dap-12". (*)¶

11:¶

Remove support for multi-collection of batches, as well as the fixed-size query type's by_batch_id query. (*)¶
Clarify purpose of report ID uniqueness.¶
Bump version tag from "dap-10" to "dap-11". (*)¶

10:¶

Editorial changes from httpdir early review.¶
Poll collection jobs with HTTP GET instead of POST. (*)¶
Upload reports with HTTP POST instead of PUT. (*)¶
Clarify requirements for problem documents.¶
Provide guidance on batch sizes when running VDAFs with non-trivial aggregation parameters.¶
Bump version tag from "dap-09" to "dap-10". (*)¶

09:¶

Fixed-size queries: make the maximum batch size optional.¶
Fixed-size queries: require current-batch queries to return distinct batches.¶
Clarify requirements for compatible VDAFs.¶
Clarify rules around creating and abandoning aggregation jobs.¶
Recommend that all task parameters are visible to all parties.¶
Revise security considerations section.¶
Bump draft-irtf-cfrg-vdaf-07 to 08 [VDAF]. (*)¶
Bump version tag from "dap-07" to "dap-09". (*)¶

08:¶

Clarify requirements for initializing aggregation jobs.¶
Add more considerations for Sybil attacks.¶
Expand guidance around choosing the VDAF verification key.¶
Add an error type registry for the aggregation sub-protocol.¶

07:¶

Bump version tag from "dap-06" to "dap-07". This is a bug-fix revision: the editors overlooked some changes we intended to pick up in the previous version. (*)¶

06:¶

Bump draft-irtf-cfrg-vdaf-06 to 07 [VDAF]. (*)¶
Overhaul security considerations (#488).¶
Adopt revised ping-pong interface in draft-irtf-cfrg-vdaf-07 (#494).¶
Add aggregation parameter to AggregateShareAad (#498). (*)¶
Bump version tag from "dap-05" to "dap-06". (*)¶

05:¶

Bump draft-irtf-cfrg-vdaf-05 to 06 [VDAF]. (*)¶
Specialize the protocol for two-party VDAFs (i.e., one Leader and One Helper). Accordingly, update the aggregation sub-protocol to use the new "ping-pong" interface for two-party VDAFs introduced in draft-irtf-cfrg-vdaf-06. (*)¶
Allow the following actions to be safely retried: aggregation job creation, collection job creation, and requesting the Helper's aggregate share.¶
Merge error types that are related.¶
Drop recommendation to generate IDs using a cryptographically secure pseudorandom number generator wherever pseudorandomness is not required.¶
Require HPKE config identifiers to be unique.¶
Bump version tag from "dap-04" to "dap-05". (*)¶

04:¶

Introduce resource oriented HTTP API. (#278, #398, #400) (*)¶
Clarify security requirements for choosing VDAF verify key. (#407, #411)¶
Require Clients to provide nonce and random input when sharding inputs. (#394, #425) (*)¶
Add interval of time spanned by constituent reports to Collection message. (#397, #403) (*)¶
Update share validation requirements based on latest security analysis. (#408, #410)¶
Bump draft-irtf-cfrg-vdaf-03 to 05 [VDAF]. (#429) (*)¶
Bump version tag from "dap-03" to "dap-04". (#424) (*)¶

03:¶

Enrich the "fixed_size" query type to allow the Collector to request a recently aggregated batch without knowing the batch ID in advance. ID discovery was previously done out-of-band. (*)¶
Allow Aggregators to advertise multiple HPKE configurations. (*)¶
Clarify requirements for enforcing anti-replay. Namely, while it is sufficient to detect repeated report IDs, it is also enough to detect repeated IDs and timestamps.¶
Remove the extensions from the Report and add extensions to the plaintext payload of each ReportShare. (*)¶
Clarify that extensions are mandatory to implement: If an Aggregator does not recognize a ReportShare's extension, it must reject it.¶
Clarify that Aggregators must reject any ReportShare with repeated extension types.¶
Specify explicitly how to serialize the Additional Authenticated Data (AAD) string for HPKE encryption. This clarifies an ambiguity in the previous version. (*)¶
Change the length tag for the aggregation parameter to 32 bits. (*)¶
Use the same prefix ("application") for all media types. (*)¶
Make input share validation more explicit, including adding a new ReportShareError variant, "report_too_early", for handling reports too far in the future. (*)¶
Improve alignment of problem details usage with [RFC7807]. Replace "reportTooLate" problem document type with "repjortRejected" and clarify handling of rejected reports in the upload sub-protocol. (*)¶
Bump version tag from "dap-02" to "dap-03". (*)¶

02:¶

Define a new task configuration parameter, called the "query type", that allows tasks to partition reports into batches in different ways. In the current draft, the Collector specifies a "query", which the Aggregators use to guide selection of the batch. Two query types are defined: the "time_interval" type captures the semantics of draft 01; and the "fixed_size" type allows the Leader to partition the reports arbitrarily, subject to the constraint that each batch is roughly the same size. (*)¶
Define a new task configuration parameter, called the task "expiration", that defines the lifetime of a given task.¶
Specify requirements for HTTP request authentication rather than a concrete scheme. (Draft 01 required the use of the DAP-Auth-Token header; this is now optional.)¶
Make "task_id" an optional parameter of the "/hpke_config" endpoint.¶
Add report count to CollectResp message. (*)¶
Increase message payload sizes to accommodate VDAFs with input and aggregate shares larger than 2^16-1 bytes. (*)¶
Bump draft-irtf-cfrg-vdaf-01 to 03 [VDAF]. (*)¶
Bump version tag from "dap-01" to "dap-02". (*)¶
Rename the report nonce to the "report ID" and move it to the top of the structure. (*)¶
Clarify when it is safe for an Aggregator to evict various data artifacts from long-term storage.¶

1.2. Conventions and Definitions

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.¶

1.2.1. Glossary of Terms

Aggregate result:: The output of the aggregation function. As defined in [VDAF].¶
Aggregate share:: A secret share of the aggregate result computed by each Aggregator and transmitted to the Collector. As defined in [VDAF].¶
Aggregation function:: The function computed over the measurements generated by the Clients and the aggregation parameter selected by the Collector. As defined in [VDAF].¶
Aggregation parameter:: Parameter selected by the Collector used to prepare a batch of measurements for aggregation. As defined in [VDAF].¶
Aggregator:: A server that receives report shares from Clients and validates and aggregates them with the help of the other Aggregator. As defined in [VDAF].¶
Batch:: A set of reports (i.e., measurements) that are aggregated into an aggregate result. As defined in [VDAF].¶
Batch bucket:: State associated with a given batch allowing the aggregators to perform incremental aggregation. Depending on the batch mode, there may be many batch buckets tracking the state of a single batch.¶
Batch duration:: The time difference between the oldest and newest report in a batch.¶
Batch interval:: A parameter of a query issued by the Collector that specifies the time range of the reports in the batch.¶
Client:: The DAP protocol role identifying a party that generates a measurement and uploads a report. Note the distinction between a DAP Client (distinguished in this document by the capital "C") and an HTTP client (distinguished in this document by the phrase "HTTP client"), as the DAP Client is not the only role that sometimes acts as an HTTP client.¶
Collector:: The party that selects the aggregation parameter and computes the aggregate result.¶
Helper:: The Aggregator that executes the aggregation and collection interactions initiated by the Leader.¶
Input share:: An Aggregator's share of a measurement. The input shares are output by the VDAF sharding algorithm. As defined in [VDAF].¶
Output share:: An Aggregator's share of the refined measurement resulting from successful execution of VDAF preparation. Many output shares are combined into an aggregate share during VDAF aggregation. As defined in [VDAF].¶
Leader:: The Aggregator that coordinates aggregation and collection with the Helper.¶
Measurement:: A plaintext input emitted by a Client (e.g., a count, summand, or string), before any encryption or secret sharing is applied. Depending on the VDAF in use, multiple values may be grouped into a single measurement. As defined in [VDAF].¶
Minimum batch size:: The minimum number of reports that must be aggregated before a batch can be collected.¶
Public share:: The output of the VDAF sharding algorithm transmitted to each of the Aggregators. As defined in [VDAF].¶
Report:: A cryptographically protected measurement uploaded to the Leader by a Client. Includes a pair of report shares, one for each Aggregator.¶
Report share:: An input share encrypted under the HPKE public key of an Aggregator [HPKE]. The report share also includes some associated data used for processing the report.¶

1.3. Representation Language

With some exceptions, we use the presentation language defined in [RFC8446], Section 3 to define messages in the DAP protocol. Encoding and decoding of these messages as byte strings also follows [RFC8446]. We enumerate the exceptions below.¶

Section 3.7 of [RFC8446] defines a syntax for structure fields whose values are constants. In this document, we do not use that notation, but use something similar to describe specific variants of structures containing enumerated types, described in [RFC8446], Section 3.8.¶

For example, suppose we have an enumeration and a structure defined as follows:¶

enum {
  number(0),
  string(1),
  (255)
} ExampleEnum;

struct {
  uint32 always_present;
  ExampleEnum type;
  select (ExampleStruct.type) {
    case number: uint32 a_number;
    case string: opaque a_string<0..10>;
  };
} ExampleStruct;

Then we describe the specific variant of ExampleStruct where type == number with a variant block as follows:¶

variant {
  /* Field exists regardless of variant */
  uint32 always_present;
  ExampleEnum type = number;
  /* Only fields included in the `type == number`
     variant is described */
  uint32 a_number;
} ExampleStruct;

The protocol text accompanying this would explain how implementations should handle the always_present and a_number fields, but not the type field. This does not mean that the type field of ExampleStruct can only ever have value number; it means only that it has this type in this instance.¶

This notation can also be used in structures where the enum field does not affect what fields are or are not present in the structure. For example:¶

enum {
  something(0),
  something_else(1),
  (255)
} FailureReason;

struct {
  FailureReason failure_reason;
  opaque another_field<0..256>;
} FailedOperation;

The protocol text might include a description like:¶

variant {
  FailureReason failure_reason = something;
  opaque another_field<0..256>;
} FailedOperation;

Finally, by convention we do not specify the lower length limit of variable-length vectors. Rather, the lower limit is always set to 0.¶

2. Overview

The protocol is executed by a large set of Clients and a pair of servers referred to as "Aggregators". Each Client's input to the protocol is its measurement (or set of measurements, e.g., counts of some user behavior). Given the input set of measurements meas_1, ..., meas_N held by the Clients, and an "aggregation parameter" agg_param shared by the Aggregators, the goal of DAP is to compute agg_result = F(agg_param, meas_1, ..., meas_N) for some function F while revealing nothing else about the measurements. We call F the "aggregation function" and agg_result the "aggregate result".¶

DAP is extensible in that it allows for the addition of new cryptographic schemes that compute different aggregation functions. In particular, the aggregation function is determined by the Verifiable Distributed Aggregation Function, or VDAF [VDAF], used to securely compute it.¶

VDAFs rely on secret sharing to protect the privacy of the measurements. Rather than sending its measurement in the clear, each Client shards its measurement into a pair of "input shares" and sends an input share to each of the Aggregators. This scheme has two important properties:¶

Given only one of the input shares, it is impossible to deduce the plaintext measurement from which it was generated.¶
Aggregators can compute secret shares of the aggregate result by aggregating their shares locally into "aggregate shares", which may later be combined into the aggregate result.¶

Note that some VDAFs allow measurements to be aggregated multiple times, each time with a different aggregation parameter; however, DAP only allows each measurement to be aggregated once. Similarly, some VDAFs produce aggregate values which depend on the order in which the measurementas are aggregated; however, DAP only supports VDAFs whose aggregation results are independent of the order in which measurements are aggregated (see Section 4.4.1 of [VDAF]).¶

2.1. System Architecture

The overall system architecture is shown in Figure 1.¶

+--------+
|        |
| Client +----+
|        |    |
+--------+    |
              |
+--------+    |     +------------+         +-----------+
|        |    +----->            |         |           |
| Client +---------->   Leader   <---------| Collector |
|        |    +----->            |         |           |
+--------+    |     +------------+         +-----------+
              |           |
+--------+    |           |
|        |    |           |
| Client +----+     +-----V------+
|        |          |            |
+--------+          |   Helper   |
                    |            |
                    +------------+

Figure 1: DAP architecture (who sends requests to whom).

The main participants in the protocol are as follows:¶

Collector:: The party which wants to obtain the aggregate result for the measurements generated by the Clients. Any given measurement task will have a single Collector.¶
Clients:: The parties which directly take the measurement(s) and report them to the DAP protocol. In order to provide reasonable levels of privacy, there must be a large number of Clients.¶
Leader:: The Aggregator responsible for coordinating the protocol. It receives the reports from Clients and aggregates them with the assistance of the Helper; and it orchestrates the process of computing the aggregate result as requested by the Collector.¶
Helper:: The Aggregator assisting the Leader with the computation. The protocol is designed so that the Helper is relatively lightweight, with most of the operational burden borne by the Leader.¶

The basic unit of DAP is the "task" which represents a single measurement process (though potentially aggregating multiple, non-overlapping batches of measurements). The definition of a task includes the following parameters:¶

The VDAF which determines the type of measurements and the aggregation function (e.g., mean, standard deviation, etc.).¶
The identities of the Leader and Helper. These are URLs that implement the APIs for uploading, aggregation, and collection specified in Section 4.¶
Cryptographic assets involved in the computation (i.e., key material).¶
Various parameters used to determine how reports are partitioned into batches and the size of each batch.¶

These parameters are distributed to the Clients, Aggregators, and Collector before the task begins. (Note that not all parameters are distributed to all parties.) This document does not specify a distribution mechanism, but it is important that all protocol participants agree on the task's configuration. Each task is identified by a unique 32-byte ID which is used to refer to it in protocol messages.¶

During the lifetime of a task, each Client records its own measurement value(s), packages them up into a report, and sends them to the Leader. Each share is separately encrypted for each Aggregator so that even though they pass through the Leader, the Leader is unable to see or modify them. Depending on the task, the Client may only send one report or may send many reports over time.¶

The Leader distributes the shares to the Helper and orchestrates the process of verifying them (see Section 2.2) and assembling them into a final aggregate result for the Collector. Depending on the VDAF, it may be possible to incrementally process each report as it is uploaded, or it may be necessary to wait until the Collector transmits a collection request before processing can begin.¶

2.2. Validating Measurements

An essential task of any data collection pipeline is ensuring that the data being aggregated is "valid". For example, if each measurement is expected to be an number between 0 and 10, then the aggregator should reject measurements larger than 10 or smaller than 0. In DAP, input validation is complicated by the fact that none of the entities other than the Client ever sees that Client's plaintext measurement. To an Aggregator, a secret share of a valid measurement is indistinguishable from a secret share of an invalid measurement.¶

In DAP, input validation is accomplished by an interactive computation between the Leader and Helper. At the beginning of this computation, each Aggregator is in possession of an input share uploaded by the Client. At the end of the computation, each Aggregator is in possession of either an "output share" that is ready to be aggregated or an indication that the underlying data is invalid, in which case the report is rejected.¶

This process is known as "preparation" and is specified by the VDAF itself (Section 5.2 of [VDAF]). To facilitate preparation, the report generated by the Client include information used by the Aggregators. For example, Prio3 (Section 7 of [VDAF]) includes a zero-knowledge proof of the measurement's validity (Section 7.1 of [VDAF]); the process of verifying this proof reveals nothing about the underlying measurement beyond its validity.¶

The specific properties attested to by the proof vary depending on the measurement being taken. For instance, to measure the time the user took performing a given task the proof might demonstrate that the value reported was within a certain range (e.g., between 0 and 60 seconds). By contrast, to report which of a set of N options the user select, the report might contain N integers and the proof would demonstrate that N-1 were 0 and the other was 1.¶

It is important to recognize that "validity" is distinct from "correctness". For instance, the user might have spent 30 seconds on a task but the Client might report 60 seconds. This is a problem with any measurement system and DAP does not attempt to address it; it merely ensures that the data is within acceptable limits, so the Client could not report 10^6 or -20 seconds.¶

2.3. Replay Protection

Another goal of DAP is to mitigate replay attacks in which a report is aggregated multiple times within a batch or across multiple batches. This would allow the attacker to learn more information about the underlying measurement than it would otherwise.¶

When a Client generates a report, it also generates a random nonce, called the "report ID". Each Aggregator is responsible for storing the IDs of reports it has aggregated for a given task. To check whether a report has been replayed, it checks whether the report's ID is in the set of stored IDs.¶

Note that IDs do not need to be stored indefinitely. The protocol allows Aggregators to enforce replay only for a sliding time window (e.g., within the last two weeks of the current time) and reject reports that fall outside of the replay window. This allows implementation to save resources by forgetting old report IDs.¶

3. Message Transport

Communications between DAP participants are carried over HTTP [RFC9110]. Use of HTTPS is REQUIRED to provide server authentication and confidentiality.¶

3.1. HTTPS Request Authentication

DAP is made up of several interactions in which different subsets of the protocol's participants interact with each other.¶

In those cases where a channel between two participants is tunneled through another protocol participant, DAP mandates the use of public-key encryption using [HPKE] to ensure that only the intended recipient can see a message in the clear.¶

In other cases, DAP requires HTTP client authentication as well as server authentication. Any authentication scheme that is composable with HTTP is allowed. For example:¶

[OAuth2] credentials are presented in an Authorization HTTP header, which can be added to any DAP protocol message.¶
TLS client certificates can be used to authenticate the underlying transport.¶
The DAP-Auth-Token HTTP header described in [I-D.draft-dcook-ppm-dap-interop-test-design-04].¶

This flexibility allows organizations deploying DAP to use existing well-known HTTP authentication mechanisms that they already support. Discovering what authentication mechanisms are supported by a DAP participant is outside of this document's scope.¶

3.2. Errors

Errors can be reported in DAP both as HTTP status codes and as problem detail objects [RFC9457] in the response body. For example, if the HTTP client sends a request using a method not allowed in this document, then the server MAY return HTTP status 405 Method Not Allowed.¶

When the server responds with an error status code, it SHOULD provide additional information using a problem detail object. If the response body does consist of a problem detail object, the HTTP status code MUST indicate a client or server error (the 4xx or 5xx classes, respectively, from Section 15 of [RFC9110]).¶

To facilitate automatic response to errors, this document defines the following standard tokens for use in the "type" field:¶

Table 1
Type	Description
invalidMessage	A message received by a protocol participant could not be parsed or otherwise was invalid.
unrecognizedTask	A server received a message with an unknown task ID.
unrecognizedAggregationJob	A server received a message with an unknown aggregation job ID.
outdatedConfig	The message was generated using an outdated configuration.
reportRejected	Report could not be processed for an unspecified reason.
reportTooEarly	Report could not be processed because its timestamp is too far in the future.
batchInvalid	The batch boundary check for Collector's query failed.
invalidBatchSize	There are an invalid number of reports in the batch.
batchQueriedMultipleTimes	A batch was queried with multiple distinct aggregation parameters.
batchMismatch	Aggregators disagree on the report shares that were aggregated in a batch.
unauthorizedRequest	Authentication of an HTTP request failed (see Section 3.1).
stepMismatch	The Aggregators disagree on the current step of the DAP aggregation protocol.
batchOverlap	A request's query includes reports that were previously collected in a different batch.
unsupportedExtension	An upload request's extensions list includes an unknown extension.

This list is not exhaustive. The server MAY return errors set to a URI other than those defined above. Servers MUST NOT use the DAP URN namespace for errors not listed in the appropriate IANA registry (see Section 9.3). The "detail" member of the Problem Details document includes additional diagnostic information.¶

When the task ID is known (see Section 4.3), the problem document SHOULD include an additional "taskid" member containing the ID encoded in Base 64 using the URL and filename safe alphabet with no padding defined in Sections 5 and 3.2 of [RFC4648].¶

In the remainder of this document, the tokens in the table above are used to refer to error types, rather than the full URNs. For example, an "error of type 'invalidMessage'" refers to an error document with "type" value "urn:ietf:params:ppm:dap:error:invalidMessage".¶

This document uses the verbs "abort" and "alert with [some error message]" to describe how protocol participants react to various error conditions. This implies HTTP status code 400 Bad Request unless explicitly specified otherwise.¶

4. Protocol Definition

DAP has three major interactions which need to be defined:¶

Uploading reports from the Client to the Aggregators, specified in Section 4.5 ¶
Computing the results for a given measurement task, specified in Section 4.6 ¶
Collecting aggregated results, specified in Section 4.7 ¶

Each of these interactions is defined in terms of "resources". In this section we define these resources and the messages used to act on them.¶

4.1. Basic Type Definitions

The following are some basic type definitions used in other messages:¶

/* ASCII encoded URL. e.g., "https://example.com" */
opaque Url<0..2^16-1>;

uint64 Duration; /* Number of seconds elapsed between two instants */

uint64 Time; /* seconds elapsed since start of UNIX epoch */

/* An interval of time of length duration, where start is included
  and (start + duration) is excluded. */
struct {
  Time start;
  Duration duration;
} Interval;

/* An ID used to uniquely identify a report in the context of a
   DAP task. */
opaque ReportID[16];

/* The various roles in the DAP protocol. */
enum {
  collector(0),
  client(1),
  leader(2),
  helper(3),
  (255)
} Role;

/* Identifier for a server's HPKE configuration */
uint8 HpkeConfigId;

/* An HPKE ciphertext. */
struct {
  HpkeConfigId config_id;    /* config ID */
  opaque enc<0..2^16-1>;     /* encapsulated HPKE key */
  opaque payload<0..2^32-1>; /* ciphertext */
} HpkeCiphertext;

/* Represent a zero-length byte string. */
struct {} Empty;

DAP uses the 16-byte ReportID as the nonce parameter for the VDAF shard and prep_init methods (see [VDAF], Section 5). Additionally, DAP includes messages defined in the VDAF specification encoded as opaque byte strings within various DAP messages. Thus, for a VDAF to be compatible with DAP, it MUST specify a NONCE_SIZE of 16 bytes, and MUST specify encodings for the following VDAF types:¶

PublicShare¶
InputShare¶
AggParam¶
AggShare¶
PrepShare¶
PrepMessage¶

4.2. Batch Modes, Batches, and Queries

An aggregate result is computed from a set of reports, called a "batch". The Collector requests the aggregate result by making a "query"; the Aggregators use this query to select a batch for aggregation.¶

Each measurement task has a preconfigured "batch mode". The batch mode defines both how reports may be partitioned into batches, as well as how these batches are addressed and the semantics of the query used for collection.¶

This document defines two batch modes:¶

time-interval (Section 5.1) in which the query specifies a time interval, and the batch consists of all reports with a timestamp in that interval.¶
leader-selected (Section 5.2) in which the Leader assigns reports to batches itself, and the Collector's query simply requests the aggregate result for the "next" batch of reports.¶

Future documents may define additional batch modes for DAP; see Section 10. Implementations are free to implement only a subset of the available batch modes.¶

Batch modes are identified with a single byte in serialized messages, as follows:¶

enum {
  reserved(0),
  (255)
} BatchMode;

The query is issued to the Leader by the Collector during the collection interaction (Section 4.7). In addition, information used to guide batch selection is conveyed from the Leader to the Helper when initializing aggregation (Section 4.6) and finalizing the aggregate shares.¶

4.3. Task Configuration

Prior to the start of execution of the protocol, each participant must agree on the configuration for each task. A task is uniquely identified by its task ID:¶

opaque TaskID[32];

The task ID value MUST be a globally unique sequence of bytes. Each task has the following parameters associated with it:¶

leader_aggregator_url: A URL relative to which the Leader's API resources can be found.¶
helper_aggregator_url: A URL relative to which the Helper's API resources can be found.¶
The batch mode for this task (see Section 4.2). This determines how reports are grouped into batches and the properties that all batches for this task must have. The party MUST NOT configure the task if it does not recognize the batch mode.¶
task_start: The time from which the Clients will start uploading reports to a task. Aggregators MUST reject reports with timestamps earlier than task_start.¶
task_duration: The duration of a task. The task is considered completed after the end time task_start + task_duration. Aggregators MUST reject reports that have timestamps later than the end time, and MAY choose to opt out of the task if task_duration is too long.¶
time_precision: Clients use this value to truncate their report timestamps; see Section 4.5. Additional semantics may apply, depending on the batch mode. (See Section 4.7.5 for details.)¶
A unique identifier for the VDAF in use for the task, e.g., one of the VDAFs defined in Section 10 of [VDAF].¶
min_batch_size: The smallest number of reports the batch is allowed to include. In a sense, this parameter controls the degree of privacy that will be obtained: the larger the minimum batch size, the higher degree of privacy. However, this ultimately depends on the application and the nature of the measurements and aggregation function.¶

Note that the leader_aggregator_url and helper_aggregator_url values may include arbitrary path components.¶

In addition, in order to facilitate the aggregation and collection interactions, each of the Aggregators is configured with following parameters:¶

collector_hpke_config: The [HPKE] configuration of the Collector (described in Section 4.5.1); see Section 7 for information about the HPKE configuration algorithms.¶
vdaf_verify_key: The VDAF verification key shared by the Aggregators. This key is used in the aggregation interaction (Section 4.6). The security requirements are described in Section 8.6.1.¶

Finally, the Collector is configured with the HPKE secret key corresponding to collector_hpke_config.¶

A task's parameters are immutable for the lifetime of that task. The only way to change parameters or to rotate secret values like collector HPKE configuration or the VDAF verification key is to configure a new task.¶

4.4. Resource URIs

DAP is defined in terms of "resources", such as reports (Section 4.5), aggregation jobs (Section 4.6), and collection jobs (Section 4.7). Each resource has an associated URI. Resource URIs are specified by a sequence of string literals and variables. Variables are expanded into strings according to the following rules:¶

Variables {leader} and {helper} are replaced with the base URL of the Leader and Helper respectively (the base URL is defined in Section 4.3).¶
Variables {task-id}, {aggregation-job-id}, and {collection-job-id} are replaced with the task ID (Section 4.3), aggregation job ID (Section 4.6.1), and collection job ID (Section 4.7.1) respectively. The value MUST be encoded in its URL-safe, unpadded Base 64 representation as specified in Sections 5 and 3.2 of [RFC4648].¶

For example, given a helper URL "https://example.com/api/dap", task ID "f0 16 34 47 36 4c cf 1b c0 e3 af fc ca 68 73 c9 c3 81 f6 4a cd f9 02 06 62 f8 3f 46 c0 72 19 e7" and an aggregation job ID "95 ce da 51 e1 a9 75 23 68 b0 d9 61 f9 46 61 28" (32 and 16 byte octet strings, represented in hexadecimal), resource URI {helper}/tasks/{task-id}/aggregation_jobs/{aggregation-job-id} would be expanded into https://example.com/api/dap/tasks/8BY0RzZMzxvA46_8ymhzycOB9krN-QIGYvg_RsByGec/aggregation_jobs/lc7aUeGpdSNosNlh-UZhKA.¶

4.5. Uploading Reports

Clients periodically upload reports to the Leader. Each report contains two "report shares", one for the Leader and another for the Helper. The Helper's report share is transmitted by the Leader during the aggregation interaction (see Section 4.6).¶

4.5.1. HPKE Configuration Request

Before the Client can upload its report to the Leader, it must know the HPKE configuration of each Aggregator. See Section 7 for information on HPKE algorithm choices.¶

Clients retrieve the HPKE configuration from each Aggregator by sending an HTTP GET request to {aggregator}/hpke_config.¶

An Aggregator responds to well-formed requests with HTTP status code 200 OK and an HpkeConfigList value, with media type "application/dap-hpke-config-list". The HpkeConfigList structure contains one or more HpkeConfig structures in decreasing order of preference. This allows an Aggregator to support multiple HPKE configurations simultaneously.¶

HpkeConfig HpkeConfigList<0..2^16-1>;

struct {
  HpkeConfigId id;
  HpkeKemId kem_id;
  HpkeKdfId kdf_id;
  HpkeAeadId aead_id;
  HpkePublicKey public_key;
} HpkeConfig;

opaque HpkePublicKey<0..2^16-1>;
uint16 HpkeAeadId; /* Defined in [HPKE] */
uint16 HpkeKemId;  /* Defined in [HPKE] */
uint16 HpkeKdfId;  /* Defined in [HPKE] */

Aggregators MUST allocate distinct id values for each HpkeConfig in an HpkeConfigList.¶

The Client MUST abort if any of the following happen for any HPKE config request:¶

the GET request did not return a valid HPKE config list;¶
the HPKE config list is empty; or¶
no HPKE config advertised by the Aggregator specifies a supported a KEM, KDF, or AEAD algorithm triple.¶

Aggregators SHOULD use HTTP caching to permit client-side caching of this resource [RFC5861]. Aggregators SHOULD favor long cache lifetimes to avoid frequent cache revalidation, e.g., on the order of days. Aggregators can control this cached lifetime with the Cache-Control header, as in this example:¶

  Cache-Control: max-age=86400

Servers should choose a max-age value appropriate to the lifetime of their keys. Clients SHOULD follow the usual HTTP caching [RFC9111] semantics for HPKE configurations.¶

Note: Long cache lifetimes may result in Clients using stale HPKE configurations; Aggregators SHOULD continue to accept reports with old keys for at least twice the cache lifetime in order to avoid rejecting reports.¶

4.5.2. Upload Request

Clients upload reports by using an HTTP POST to {leader}/tasks/{task-id}/reports. The payload is a Report, with media type "application/dap-report", structured as follows:¶

struct {
  ReportID report_id;
  Time time;
  Extension public_extensions<0..2^16-1>;
} ReportMetadata;

struct {
  ReportMetadata report_metadata;
  opaque public_share<0..2^32-1>;
  HpkeCiphertext leader_encrypted_input_share;
  HpkeCiphertext helper_encrypted_input_share;
} Report;

report_metadata is public metadata describing the report.¶
- report_id is used by the Aggregators to ensure the report is not replayed (Figure 2). The Client MUST generate this by generating 16 random bytes using a cryptographically secure random number generator.¶
- time is the time at which the report was generated. The Client SHOULD round this value down to the nearest multiple of the task's time_precision in order to ensure that that the timestamp cannot be used to link a report back to the Client that generated it.¶
- public_extensions is the list of public report extensions; see Section 4.5.3.¶
public_share is the public share output by the VDAF sharding algorithm. Note that the public share might be empty, depending on the VDAF.¶
leader_encrypted_input_share is the Leader's encrypted input share.¶
helper_encrypted_input_share is the Helper's encrypted input share.¶

Aggregators MAY require Clients to authenticate when uploading reports (see Section 8.3). If it is used, HTTP client authentication MUST use a scheme that meets the requirements in Section 3.1.¶

The handling of the upload request by the Leader MUST be idempotent as discussed in Section 9.2.2 of [RFC9110].¶

To generate a report, the Client begins by sharding its measurement into input shares and the public share using the VDAF's sharding algorithm (Section 5.1 of [VDAF]), using the report ID as the nonce:¶

(public_share, input_shares) = Vdaf.shard(
    "dap-13" || task_id,
    measurement, /* plaintext measurement */
    report_id,   /* nonce */
    rand,        /* randomness for sharding algorithm */
)

where task_id is the task ID. The last input comprises the randomness consumed by the sharding algorithm. The sharding randomness is a random byte string of length specified by the VDAF. The Client MUST generate this using a cryptographically secure random number generator.¶

The sharding algorithm will return two input shares. The first input share returned from the sharding algorithm is considered to be the Leader's input share, and the second input share is considered to be the Helper's input share.¶

The Client then wraps each input share in the following structure:¶

struct {
  Extension private_extensions<0..2^16-1>;
  opaque payload<0..2^32-1>;
} PlaintextInputShare;

Field private_extensions is set to the list of private report extensions intended to be consumed by the given Aggregator. (See Section 4.5.3.) Field payload is set to the Aggregator's input share output by the VDAF sharding algorithm.¶

Next, the Client encrypts each PlaintextInputShare plaintext_input_share as follows:¶

enc, payload = SealBase(pk,
  "dap-13 input share" || 0x01 || server_role,
  input_share_aad, plaintext_input_share)

where pk is the Aggregator's public key; 0x01 represents the Role of the sender (always the Client); server_role is the Role of the intended recipient (0x02 for the Leader and 0x03 for the Helper), plaintext_input_share is the Aggregator's PlaintextInputShare, and input_share_aad is an encoded message of type InputShareAad defined below, constructed from the same values as the corresponding fields in the report. The SealBase() function is as specified in [HPKE], Section 6.1 for the ciphersuite indicated by the HPKE configuration.¶

struct {
  TaskID task_id;
  ReportMetadata report_metadata;
  opaque public_share<0..2^32-1>;
} InputShareAad;

The Leader responds to well-formed requests with HTTP status code 201 Created. Malformed requests are handled as described in Section 3.2. Clients SHOULD NOT upload the same measurement value in more than one report if the Leader responds with HTTP status code 201 Created.¶

If the Leader does not recognize the task ID, then it MUST abort with error unrecognizedTask.¶

The Leader responds to requests whose Leader encrypted input share uses an out-of-date or unknown HpkeConfig.id value, indicated by HpkeCiphertext.config_id, with error of type 'outdatedConfig'. When the Client receives an 'outdatedConfig' error, it SHOULD invalidate any cached HpkeConfigList and retry with a freshly generated Report. If this retried upload does not succeed, the Client SHOULD abort and discontinue retrying.¶

If a report's ID matches that of a previously uploaded report, the Leader MUST ignore it. In addition, it MAY alert the Client with error reportRejected.¶

The Leader MUST ignore any report pertaining to a batch that has already been collected (see Section 4.6.1.4 for details). Otherwise, comparing the aggregate result to the previous aggregate result may result in a privacy violation. Note that this is also enforced by the Helper during the aggregation interaction. The Leader MAY also abort the upload interaction and alert the Client with error reportRejected.¶

The Leader MUST ignore any report whose timestamp is before the task's task_start, or is past the end time task_start + task_duration. When it does so, it SHOULD also abort the upload interaction and alert the Client with error reportRejected. Clients MUST NOT upload a report if its timestamp would be earlier than task_start or later than task_start + task_duration.¶

The Leader may need to buffer reports while waiting to aggregate them (e.g., while waiting for an aggregation parameter from the Collector; see Section 4.7). The Leader SHOULD NOT accept reports whose timestamps are too far in the future. Implementors MAY provide for some small leeway, usually no more than a few minutes, to account for clock skew. If the Leader rejects a report for this reason, it SHOULD abort the upload interaction and alert the Client with error reportTooEarly. In this situation, the Client MAY re-upload the report later on.¶

If the report contains an unrecognized public report extension, or if the Leader's input share contains an unrecognized private report extension, then the Leader MAY abort the upload request with error "unsupportedExtension." If the Leader does abort for this reason, it MAY indicate the unsupported extensions in the resulting problem document via an extension member (Section 3.2 of [RFC9457]) "unsupported_extensions" on the problem document; this member MUST contain an array of numeric values indicating the extension code points which were not recognized. For example, if the report upload contained two unsupported extensions with code points 23 and 42, the "unsupported_extensions" member would contain the value [23, 42].¶

If the same extension type appears more than once among the public extensions and the private extensions in the Leader's input share, then the Leader MAY abort the upload request with error "invalidMessage".¶

(Note that validation of extensions is not mandatory here because it requires the Leader to decrypt its input share. The Leader also cannot validate the Helper's extensions because it cannot decrypt the Helper's input share. Mandatory validation of extensions will occur during aggregation.)¶

4.5.3. Report Extensions

Each ReportMetadata contains a list of extensions public to both aggregators, and each PlaintextInputShare contains a list of extensions private to the relevant Aggregator. Clients use these extensions to convey additional information to the Aggregators. Some extensions might be intended for both Aggregators; others may only be intended for a specific Aggregator. (For example, a DAP deployment might use some out-of-band mechanism for an Aggregator to verify that reports come from authenticated Clients. It will likely be useful to bind the extension to the input share via HPKE encryption.)¶

Each extension is a tag-length encoded value of the following form:¶

struct {
  ExtensionType extension_type;
  opaque extension_data<0..2^16-1>;
} Extension;

enum {
  reserved(0),
  (65535)
} ExtensionType;

Field "extension_type" indicates the type of extension, and "extension_data" contains information specific to the extension.¶

Extensions are mandatory-to-implement: If an Aggregator receives a report containing an extension it does not recognize, then it MUST reject the report. (See Section 4.6.1.4 for details.)¶

4.6. Verifying and Aggregating Reports

Once a set of Clients have uploaded their reports to the Leader, the Leader can begin the process of validating and aggregating them with the Helper. To enable the system to handle large batches of reports, this process can be parallelized across many "aggregation jobs" in which small subsets of the reports are processed independently. Each aggregation job is associated with exactly one DAP task, but a task can have many aggregation jobs.¶

The primary objective of an aggregation job is to run the VDAF preparation process described in [VDAF], Section 5.2 for each report in the job. Preparation has two purposes:¶

To "refine" the input shares into "output shares" that have the desired aggregatable form. For some VDAFs, like Prio3, the mapping from input to output shares is a fixed operation depending only on the input share, but in general the mapping involves an aggregation parameter chosen by the Collector.¶
To verify that the output shares, when combined, correspond to a valid, refined measurement, where validity is determined by the VDAF itself. For example, the Prio3Sum variant of Prio3 (Section 7.4.2 of [VDAF]) requires that the output shares sum up to an integer in a specific range, while the Prio3Histogram variant (Section 7.4.4 of [VDAF]) proves that output shares sum up to a one-hot vector representing a contribution to a single bucket of the histogram.¶

In general, refinement and verification are not distinct computations, since for some VDAFs, verification may only be achieved implicitly as a result of the refinement process. We instead think of these as properties of the output shares themselves: if preparation succeeds, then the resulting output shares are guaranteed to combine into a valid, refined measurement.¶

VDAF preparation is mapped onto an aggregation job as illustrated in Figure 2. The protocol is comprised of a sequence of HTTP requests from the Leader to the Helper, the first of which includes the aggregation parameter, the Helper's report share for each report in the job, and for each report the initialization step for preparation. The Helper's response, along with each subsequent request and response, carries the remaining messages exchanged during preparation.¶

  report, agg_param
   |
   v
+--------+                                         +--------+
| Leader |                                         | Helper |
+--------+                                         +--------+
   | AggregationJobInitReq:                              |
   |   agg_param, prep_init                              |
   |---------------------------------------------------->|
   |                                 AggregationJobResp: |
   |                               prep_resp(continue)   |
   |<----------------------------------------------------|
   | AggregationJobContinueReq:                          |
   |   prep_continue                                     |
   |---------------------------------------------------->|
   |                                 AggregationJobResp: |
   |                               prep_resp(continue)   |
   |<----------------------------------------------------|
   |                                                     |
  ...                                                   ...
   |                                                     |
   | AggregationJobContinueReq:                          |
   |   prep_continue                                     |
   |---------------------------------------------------->|
   |                                 AggregationJobResp: |
   |                      prep_resp(continue|finished)   |
   |<----------------------------------------------------|
   |                                                     |
   v                                                     v
  leader_out_share                         helper_out_share

Figure 2: Overview of the DAP aggregation interaction.

The number of steps, and the type of the responses, depends on the VDAF. The message structures and processing rules are specified in the following subsections.¶

Normally, the Helper processes each step synchronously, meaning it computes each step of the computation before producing a response to the Leader's HTTP request. The Helper can optionally instead process each step asynchronously, meaning the Helper responds to requests immediately, while deferring processing to a background worker. To continue, the Leader polls the Helper until it responds with the next step. This choice allows a Helper implementation flexibility in choosing a request model that best supports its architecture and use case. For instance, resource-intensive use cases, such as replay checks across vast numbers of reports and preparation of large histograms, may be better suited for the asynchronous model. For use cases where datastore performance is a concern, the synchronous model may be better suited.¶

In general, aggregation cannot begin until the Collector specifies a query and an aggregation parameter. However, depending on the VDAF and batch mode in use, it is often possible to begin aggregation as soon as reports arrive. For example, Prio3 has just one valid aggregation parameter (the empty string), and thus allows for eager aggregation; and both the time-interval (Section 5.1) and leader-selected (Section 5.2) batch modes defined in this document allow for eager aggregation.¶

An aggregation job can be thought of as having three phases, which are described in the remaining subsections:¶

Initialization: Begin the aggregation flow by disseminating report shares and initializing the VDAF prep state for each report.¶
Continuation: Continue the aggregation flow by exchanging prep shares and messages until preparation completes or an error occurs.¶
Completion: Finish the aggregation flow, yielding an output share corresponding to each report share in the aggregation job, and update the appropriate aggregate shares with these output shares.¶

After an aggregation job is completed, each Aggregator updates running-total aggregate shares and other values for each "batch bucket" associated with a recovered output share, as described in Section 4.6.2.3. These values are stored until the batch is collected as described in Section 4.7.¶

Apart from VDAF preparation, another important task of the aggregation interaction is to provide replay protection (Section 2.3). Along with the output shares, each Aggregator records the IDs of all reports it is has aggregated for a given task: before committing to an output share, it checks whether the corresponding report ID is in the set of stored IDs.¶

4.6.1. Aggregate Initialization

The Leader begins an aggregation job by choosing a set of candidate reports that pertain to the same DAP task and a job ID which MUST be unique within the scope of the task. The job ID is a 16-byte value, structured as follows:¶

opaque AggregationJobID[16];

The Leader can run this process for many sets of candidate reports in parallel as needed. After choosing a set of candidates, the Leader begins aggregation by splitting each report into report shares, one for each Aggregator. The Leader and Helper then run the aggregate initialization flow to accomplish two tasks:¶

Recover and determine which input report shares are valid.¶
For each valid report share, initialize the VDAF preparation process (see Section 5.2 of [VDAF]).¶

The Leader and Helper initialization behavior is detailed below.¶

Implementation note: the Leader will generally want to associate each report with a single aggregation job, as otherwise the duplicated reports will eventually be discarded as a replay. However, it is likely not appropriate to directly use the used-ID storage used for replay protection to determine which reports can be added to an aggregation job: certain errors (e.g. report_too_early) allow the report to be added to another aggregation job in the future; but storage into the used-ID storage is permanent.¶

4.6.1.1. Leader Initialization

The Leader begins the aggregate initialization by sampling a fresh AggregationJobID.¶

Next, for each report in the candidate set, it checks if the report ID corresponds to a report ID it has previously stored for this task. If so, it marks the report as invalid and removes it from the candidate set.¶

Next, the Leader decrypts each of its report shares as described in Section 4.6.1.3, then checks input share validity as described in Section 4.6.1.4. If either step invalidates the report, the Leader rejects the report and removes it from the set of candidate reports.¶

Next, for each report the Leader executes the following procedure:¶

(state, outbound) = Vdaf.ping_pong_leader_init(
    vdaf_verify_key,
    "dap-13" || task_id,
    agg_param,
    report_id,
    public_share,
    plaintext_input_share.payload)

where:¶

vdaf_verify_key is the VDAF verification key for the task¶
task_id is the task ID¶
agg_param is the VDAF aggregation parameter provided by the Collector (see Section 4.7)¶
report_id is the report ID, used as the nonce for VDAF sharding¶
public_share is the report's public share¶
plaintext_input_share is the Leader's PlaintextInputShare¶

The methods are defined in Section 5.8 of [VDAF]. This process determines the initial per-report state, as well as the initial outbound message to send to the Helper. If state is of type Rejected, then the report is rejected and removed from the set of candidate reports, and no message is sent to the Helper.¶

If state is of type Continued, then the Leader constructs a PrepareInit message structured as follows:¶

struct {
  ReportMetadata report_metadata;
  opaque public_share<0..2^32-1>;
  HpkeCiphertext encrypted_input_share;
} ReportShare;

struct {
  ReportShare report_share;
  opaque payload<0..2^32-1>;
} PrepareInit;

Each of these messages is constructed as follows:¶

report_share.report_metadata is the report's metadata.¶
report_share.public_share is the report's public share.¶
report_share.encrypted_input_share is the Helper's encrypted input share.¶
payload is set to the outbound message computed by the previous step.¶

It is not possible for state to be of type Finished during Leader initialization.¶

Once all the report shares have been initialized, the Leader creates an AggregationJobInitReq message structured as follows:¶

opaque BatchID[32];

struct {
  BatchMode batch_mode;
  opaque config<0..2^16-1>;
} PartialBatchSelector;

struct {
  opaque agg_param<0..2^32-1>;
  PartialBatchSelector part_batch_selector;
  PrepareInit prepare_inits<0..2^32-1>;
} AggregationJobInitReq;

This message consists of:¶

agg_param: The VDAF aggregation parameter.¶
part_batch_selector: The "partial batch selector" used by the Aggregators to determine how to aggregate each report. Its contents depends on the indicated batch mode: for time-interval mode, the config field is empty (Section 5.1); for leader-selected tasks, the Leader specifies a "batch ID" that determines the batch to which each report for this aggregation job belongs (Section 5.2).¶

Documents that define batch modes MUST specify the content this field; see Section 10 for details.¶

The indicated batch mode MUST match the task's batch mode. Otherwise, the Helper MUST abort with error invalidMessage.¶

This field is called the "partial" batch selector because, depending on the batch mode, it may not determine a batch. In particular, if the batch mode is time_interval, the batch is not determined until the Collector's query is issued (see Section 4.2).¶
prepare_inits: the sequence of PrepareInit messages constructed in the previous step.¶

Finally, the Leader sends an HTTP PUT request to {helper}/tasks/{task-id}/aggregation_jobs/{aggregation-job-id} with a media type of "application/dap-aggregation-job-init-req" and a body containing the AggregationJobInitReq.¶

The Leader MUST authenticate its requests to the Helper using a scheme that meets the requirements in Section 3.1.¶

The Helper responds with HTTP status 201 Created with a body containing an AggregationJobResp (see Section 4.6.1.2). If the status field is ready, the Leader proceeds onward. Otherwise, if the status field is processing, the Leader polls the aggregation job by sending GET requests to the URI indicated in the Location header field, until the status is ready. The Helper's response when processing SHOULD include a Retry-After header to suggest a polling interval to the Leader.¶

The AggregationJobResp.prepare_resps field must include exactly the same report IDs in the same order as the Leader's AggregationJobInitReq. Otherwise, the Leader MUST abort the aggregation job.¶

Otherwise, the Leader proceeds as follows with each report:¶

If the inbound prep response has type "continue", then the Leader computes¶
```
(state, outbound) = Vdaf.ping_pong_leader_continued(
    "dap-13" || task_id,
    agg_param,
    prev_state,
    inbound,
)
```
¶
where:¶
- task_id is the task ID¶
- agg_param is the VDAF aggregation parameter provided by the Collector (see Section 4.7)¶
- prev_state is the state computed earlier by calling Vdaf.ping_pong_leader_init or Vdaf.ping_pong_leader_continued¶
- inbound is the message payload in the PrepareResp¶
If outbound != None, then the Leader stores state and outbound and proceeds to Section 4.6.2.1. If outbound == None, then the preparation process is complete: either state == Rejected(), in which case the Leader rejects the report and removes it from the candidate set; or state == Finished(out_share), in which case preparation is complete and the Leader updates the relevant batch bucket with out_share as described in Section 4.6.2.3.¶
Else if the type is "reject", then the Leader rejects the report and removes it from the candidate set. The Leader MUST NOT include the report in a subsequent aggregation job, unless the error is report_too_early, in which case the Leader MAY include the report in a subsequent aggregation job.¶
Else the type is invalid, in which case the Leader MUST abort the aggregation job.¶

When the Leader updates a batch bucket based on out_share, it MUST also store the report ID for replay protection.¶

(Note: Since VDAF preparation completes in a constant number of rounds, it will never be the case that some reports are completed and others are not.)¶

If the Leader fails to process the response from the Helper, for example because of a transient failure such as a network connection failure or process crash, the Leader SHOULD re-send the original request unmodified in order to attempt recovery (see Section 4.6.2.4).¶

4.6.1.2. Helper Initialization

The Helper begins an aggregation job when it receives an AggregationJobInitReq message from the Leader. For each PrepareInit conveyed by this message, the Helper attempts to initialize VDAF preparation (see Section 5.1 of [VDAF]) just as the Leader does. If successful, it includes the result in its response that the Leader will use to continue preparing the report.¶

Upon receipt of an AggregationJobInitReq, the Helper checks if it recognizes the task ID. If not, then it MUST abort with error unrecognizedTask.¶

Next, the Helper checks that the report IDs in AggregationJobInitReq.prepare_inits are all distinct. If two preparation initialization messages have the same report ID, then the Helper MUST abort with error invalidMessage.¶

To process the aggregation job, the Helper computes an outbound prepare step for each report share. This includes the following structures:¶

enum {
  continue(0),
  finished(1)
  reject(2),
  (255)
} PrepareRespState;

enum {
  reserved(0),
  batch_collected(1),
  report_replayed(2),
  report_dropped(3),
  hpke_unknown_config_id(4),
  hpke_decrypt_error(5),
  vdaf_prep_error(6),
  task_expired(7),
  invalid_message(8),
  report_too_early(9),
  task_not_started(10),
  (255)
} ReportError;

struct {
  ReportID report_id;
  PrepareRespState prepare_resp_state;
  select (PrepareResp.prepare_resp_state) {
    case continue: opaque payload<0..2^32-1>;
    case finished: Empty;
    case reject:   ReportError report_error;
  };
} PrepareResp;

First, for each report in the request, the Helper MAY check if the report ID corresponds to a report ID it has previously stored for this task. If so, it rejects the report by setting the outbound preparation response to¶

variant {
  ReportID report_id;
  PrepareRespState prepare_resp_state = reject;
  ReportError report_error = report_replayed;
} PrepareResp;

where report_id is the report ID. Note that the Helper must perform this check before completing the aggregation job.¶

Next the Helper decrypts each of its remaining report shares as described in Section 4.6.1.3, then checks input share validity as described in Section 4.6.1.4. For any report that was rejected, the Helper sets the outbound preparation response to¶

variant {
  ReportID report_id;
  PrepareRespState prepare_resp_state = reject;
  ReportError report_error;
} PrepareResp;

where report_id is the report ID and report_error is the indicated error. For all other reports it initializes the VDAF prep state as follows (let inbound denote the payload of the prep step sent by the Leader):¶

(state, outbound) = Vdaf.ping_pong_helper_init(
    vdaf_verify_key,
    "dap-13" || task_id,
    agg_param,
    report_id,
    public_share,
    plaintext_input_share.payload)

where:¶

vdaf_verify_key is the VDAF verification key for the task¶
task_id is the task ID¶
agg_param is the VDAF aggregation parameter sent in the AggregationJobInitReq¶
report_id is the report ID¶
public_share is the report's public share¶
plaintext_input_share is the Helper's PlaintextInputShare¶

This procedure determines the initial per-report state, as well as the initial outbound message to send in response to the Leader. If state is of type Rejected, then the Helper responds with¶

variant {
  ReportID report_id;
  PrepareRespState prepare_resp_state = reject;
  ReportError report_error = vdaf_prep_error;
} PrepareResp;

Otherwise the Helper responds with¶

variant {
  ReportID report_id;
  PrepareRespState prepare_resp_state = continue;
  opaque payload<0..2^32-1> = outbound;
} PrepareResp;

If state == Continued(prep_state), then the Helper stores state to prepare for the next continuation step (Section 4.6.2.2).¶

If state == Finished(out_share), the Helper MUST resolve replay of the report. It does so by checking if the report ID was previously stored for this task. If so, it responds with¶

variant {
  ReportID report_id;
  PrepareRespState prepare_resp_state = reject;
  Reporterror report_error = report_replayed;
} PrepareResp;

Otherwise it stores the report ID for replay protection and updates the relevant batch bucket with out_share as described in Section 4.6.2.3.¶

Finally, the Helper creates an AggregationJobResp to send to the Leader. This message is structured as follows:¶

enum {
  processing(0),
  ready(1),
} AggregationJobStatus;

struct {
  AggregationJobStatus status;
  select (AggregationJobResp.status) {
    case processing: Empty;
    case ready:      PrepareResp prepare_resps<0..2^32-1>;
  };
} AggregationJobResp;

where prepare_resps are the outbound prep steps computed in the previous step. The order MUST match AggregationJobInitReq.prepare_inits.¶

The Helper responds to the Leader with HTTP status 201 Created, a body consisting of the AggregationJobResp, and the media type "application/dap-aggregation-job-resp".¶

Depending on the task parameters, processing an aggregation job may take some time, so the Helper MAY defer computation to a background process. It does so by responding with the field status set to processing and a Location header field set to the relative reference /tasks/{task-id}/aggregation_jobs/{aggregation-job-id}?step=0. The Leader then polls the Helper by making HTTP GET requests to the aforementioned Location. The Helper responds to GET requests with HTTP status 200 and the status field reflecting the current state of the job. When the aggregation job is processing, the response SHOULD include a Retry-After header field to suggest a polling interval to the Leader.¶

Changing an aggregation job's parameters is illegal, so further HTTP PUT requests to /tasks/{task-id}/aggregation_jobs/{aggregation-job-id} for the same aggregation-job-id but with a different AggregationJobInitReq in the body MUST fail with an HTTP client error status code. For further requests with the same AggregationJobInitReq in the body, the Helper SHOULD respond as it did for the original AggregationJobInitReq, or otherwise fail with an HTTP client error status code.¶

Additionally, it is not possible to rewind or reset the state of an aggregation job. Once an aggregation job has been continued at least once (see Section 4.6.2), further requests to initialize that aggregation job MUST fail with an HTTP client error status code.¶

Validating an input share will either succeed or fail. In the case of failure, the input share is marked as invalid with a corresponding report error.¶

Before beginning the preparation step, Aggregators are required to perform the following checks:¶

Check that the input share can be decoded as specified by the VDAF. If not, the input share MUST be marked as invalid with the error invalid_message.¶
Check if the report is too far into the future. Implementors can provide for some small leeway, usually no more than a few minutes, to account for clock skew. If a report is rejected for this reason, the Aggregator SHOULD mark the input share as invalid with the error report_too_early.¶
Check if the report's timestamp is before the task's task_start time. If so, the Aggregator MUST mark the input share as invalid with the error task_not_started.¶
Check if the report's timestamp is past the task's end time, given by task_start + task_duration. If so, the Aggregator MUST mark the input share as invalid with the error task_expired.¶
Check if the public or private report extensions contain any unrecognized report extension types. If so, the Aggregator MUST mark the input share as invalid with error invalid_message.¶
Check if any two public or private report extensions have the same extension type. (For the purposes of this check, a private report extension can conflict with a public report extension.) If so, the Aggregator MUST mark the input share as invalid with error invalid_message.¶
If the report pertains to a batch that was previously collected, then the input share MUST be marked as invalid with error batch_collected.¶
- Implementation note: The Leader considers a batch to be collected once it has completed a collection job for a CollectionJobReq message from the Collector; the Helper considers a batch to be collected once it has responded to an AggregateShareReq message from the Leader. A batch is determined by query (Section 4.2) conveyed in these messages. Queries must satisfy the criteria covered in Section 4.7.5. These criteria are meant to restrict queries in a way that makes it easy to determine whether a report pertains to a batch that was collected. See Section 6.4.2 for more information.¶
Finally, if an Aggregator cannot determine if an input share is valid, it MUST mark the input share as invalid with error report_dropped. For example, if the Aggregator has evicted the state required to perform the check from long-term storage. (See Section 6.4.1 for details.)¶

If all of the above checks succeed, the input share is not marked as invalid.¶

4.6.2. Aggregate Continuation

In the continuation phase, the Leader drives the VDAF preparation of each report in the candidate report set until the underlying VDAF moves into a terminal state, yielding an output share for both Aggregators or a rejection.¶

Whether this phase is reached depends on the VDAF: for 1-round VDAFs, like Prio3, processing has already completed. Continuation is required for VDAFs that require more than one round.¶

4.6.2.1. Leader Continuation

The Leader begins each step of aggregation continuation with a prep state object state and an outbound message outbound for each report in the candidate set.¶

The Leader advances its aggregation job to the next step (step 1 if this is the first continuation after initialization). Then it instructs the Helper to advance the aggregation job to the step the Leader has just reached. For each report the Leader constructs a preparation continuation message:¶

struct {
  ReportID report_id;
  opaque payload<0..2^32-1>;
} PrepareContinue;

where report_id is the report ID associated with state and outbound, and payload is set to the outbound message.¶

Next, the Leader sends a POST request to {helper}/tasks/{task-id}/aggregation_jobs/{aggregation-job-id} with media type "application/dap-aggregation-job-continue-req" and body structured as:¶

struct {
  uint16 step;
  PrepareContinue prepare_continues<0..2^32-1>;
} AggregationJobContinueReq;

The step field is the step of DAP aggregation that the Leader just reached and wants the Helper to advance to. The prepare_continues field is the sequence of preparation continuation messages constructed in the previous step. The PrepareContinues MUST be in the same order as the previous aggregate request.¶

The Leader MUST authenticate its requests to the Helper using a scheme that meets the requirements in Section 3.1.¶

The Helper responds with HTTP status 202 Accepted with a body containing an AggregationJobResp (see Section 4.6.1.2). If the status field is ready, the Leader proceeds onward. Otherwise, if the status field is processing, the Leader polls the aggregation job by sending GET requests to the URI indicated in the Location header field, until the status is ready. The Helper's response when processing SHOULD include a Retry-After header to suggest a polling interval to the Leader.¶

The response's prepare_resps MUST include exactly the same report IDs in the same order as the Leader's AggregationJobContinueReq. Otherwise, the Leader MUST abort the aggregation job.¶

Otherwise, the Leader proceeds as follows with each report:¶

If the inbound prep response type is "continue" and the state is Continued(prep_state), then the Leader computes¶
```
(state, outbound) = Vdaf.ping_pong_leader_continued(
    "dap-13" || task_id,
    agg_param,
    state,
    inbound,
)
```
¶
where task_id is the task ID and inbound is the message payload. If outbound != None, then the Leader stores state and outbound and proceeds to another continuation step. If outbound == None, then the preparation process is complete: either state == Rejected(), in which case the Leader rejects the report and removes it from the candidate set; or state == Finished(out_share), in which case preparation is complete and the Leader updates the relevant batch bucket with out_share as described in Section 4.6.2.3.¶
Else if the type is "finished" and state == Finished(out_share), then preparation is complete and the Leader stores the output share for use in the collection interaction (Section 4.7).¶
Else if the type is "reject", then the Leader rejects the report and removes it from the candidate set.¶
Else the type is invalid, in which case the Leader MUST abort the aggregation job.¶

When the Leader stores the out_share, it MUST also store the report ID for replay protection.¶

4.6.2.2. Helper Continuation

The Helper begins each step of continuation with a sequence of state objects, which will be Continued(prep_state), one for each report in the candidate set.¶

The Helper awaits an HTTP POST request to {helper}/tasks/{task-id}/aggregation_jobs/{aggregation-job-id} from the Leader, the body of which is an AggregationJobContinueReq as specified in Section 4.6.2.1.¶

Next, it checks that it recognizes the task ID. If not, then it MUST abort with error unrecognizedTask.¶

Next, it checks if it recognizes the indicated aggregation job ID. If not, it MUST abort with error unrecognizedAggregationJob.¶

Next, the Helper checks that:¶

the report IDs are all distinct¶
each report ID corresponds to one of the state objects¶
AggregationJobContinueReq.step is not equal to 0¶

If any of these checks fail, then the Helper MUST abort with error invalidMessage. Additionally, if any prep step appears out of order relative to the previous request, then the Helper MAY abort with error invalidMessage. (Note that a report may be missing, in which case the Helper should assume the Leader rejected it.)¶

Next, the Helper checks if the continuation step indicated by the request is correct. (For the first AggregationJobContinueReq the value should be 1; for the second the value should be 2; and so on.) If the Leader is one step behind (e.g., the Leader has resent the previous HTTP request), then the Helper MAY attempt to recover by sending the same response as it did for the previous AggregationJobContinueReq, without performing any additional work on the aggregation job. In this case it SHOULD verify that the contents of the AggregationJobContinueReq are identical to the previous message (see Section 4.6.2.4). Otherwise, if the step is incorrect or if the Helper does not wish to attempt recovery, the Helper MUST abort with error stepMismatch.¶

Let inbound denote the payload of the prep step. For each report, the Helper computes the following:¶

(state, outbound) = Vdaf.ping_pong_helper_continued(
    "dap-13" || task_id,
    agg_param,
    state,
    inbound,
)

where task_id is the task ID. If state == Rejected(), then the Helper's response is¶

variant {
  ReportID report_id;
  PrepareRespState prepare_resp_state = reject;
  ReportError report_error = vdaf_prep_error;
} PrepareResp;

If state == Continued(prep_state), then the Helper stores state to prepare for the next continuation step (Section 4.6.2.2).¶

If state == Finished(out_share), the Helper MUST resolve replay of the report. It does so by checking if the report ID was previously stored for this task. If so, it responds with¶

variant {
  ReportID report_id;
  PrepareRespState prepare_resp_state = reject;
  ReportError report_error = report_replayed;
} PrepareResp;

Otherwise it stores the report ID for replay protection and updates the relevant batch bucket with out_share as described in Section 4.6.2.3.¶

The Helper's response depends on the value of outbound. If outbound != None, then the Helper's response is¶

variant {
  ReportID report_id;
  PrepareRespState prepare_resp_state = continue;
  opaque payload<0..2^32-1> = outbound;
} PrepareResp;

Otherwise, if outbound == None, then the Helper's response is¶

variant {
  ReportID report_id;
  PrepareRespState prepare_resp_state = finished;
} PrepareResp;

The Helper constructs an AggregationJobResp message (see Section 4.6.1.2) with each prep step. The order of the prep steps MUST match the Leader's AggregationJobContinueReq.¶

The Helper responds to the Leader with HTTP status 200 OK, a body consisting of the AggregationJobResp, and the media type "application/dap-aggregation-job-resp".¶

Depending on the task parameters, processing an aggregation job may take some time, so the Helper MAY defer computation to a background process by responding with the field status set to processing and Location header field set to the relative reference /tasks/{task-id}/aggregation_jobs/{aggregation-job-id}?step={step}, where step is the step indicated in the AggregationJobContinueReq. If so, the Leader polls the Helper by making HTTP GET requests to the aforementioned Location. The Helper responds to GET requests with HTTP status 200 and the status field reflecting the current state of the job. When the aggregation job is processing, the response SHOULD include a Retry-After header field to suggest a polling interval to the Leader.¶

If for whatever reason the Leader must abandon the aggregation job, it SHOULD send an HTTP DELETE request to {helper}/tasks/{task-id}/aggregation_jobs/{aggregation-job-id} so that the Helper knows it can clean up its state.¶

4.6.2.3. Batch Buckets

When aggregation recovers an output share, it must be stored into an appropriate "batch bucket", which is defined in this section. The data stored in a batch bucket is kept for eventual use in the Section 4.7.¶

Batch buckets are indexed by a "batch bucket identifier" as as specified by the task's batch mode:¶

For the time-interval batch mode (Section 5.1, the batch bucket identifier is an interval of time and is determined by the report's timestamp.¶
For the leader-selected batch mode (Section 5.2), the batch bucket identifier is the batch ID and indicated in the aggregation job.¶

A few different pieces of information are associated with each batch bucket:¶

A VDAF-specific aggregate share, as defined in [VDAF].¶
A count of a number of reports included in the batch bucket.¶
A 32-byte checksum value, as defined below.¶

When processing an output share out_share, the following procedure is used to update the batch buckets:¶

Look up the existing batch bucket for the batch bucket identifier associated with the aggregation job and output share.¶
- If there is no existing batch bucket, initialize a new one. The initial aggregate share value is computed as Vdaf.agg_init(agg_param), where agg_param is the aggregation parameter associated with the aggregation job (see [VDAF], Section 4.4); the initial count is 0; and the initial checksum is 32 zero bytes.¶
Update the aggregate share agg_share to Vdaf.agg_update(agg_param, agg_share, out_share).¶
Increment the count by 1.¶
Update the checksum value to the bitwise-XOR of the checksum value with the SHA256 [SHS] hash of the report ID associated with the output share.¶

Implementation note: this section describes a single set of values associated with each batch bucket. However, implementations are free to shard the aggregate share/count/checksum values associated with the batch bucket, combining them back into a single set of values when reading the batch bucket. This may be useful to avoid write contention. The aggregate share values are combined using Vdaf.merge(agg_param, agg_shares) (see [VDAF], Section 4.4), the count values are combined by summing, and the checksum values are combined by bitwise XOR.¶

4.6.2.4. Recovering from Aggregation Step Skew

AggregationJobContinueReq messages contain a step field, allowing Aggregators to ensure that their peer is on an expected step of DAP aggregation. In particular, the intent is to allow recovery from a scenario where the Helper successfully advances from step n to n+1, but its AggregationJobResp response to the Leader gets dropped due to something like a transient network failure. The Leader could then resend the request to have the Helper advance to step n+1 and the Helper should be able to retransmit the AggregationJobResp that was previously dropped. To make that kind of recovery possible, Aggregator implementations SHOULD checkpoint the most recent step's prep state and messages to durable storage such that the Leader can re-construct continuation requests and the Helper can re-construct continuation responses as needed.¶

When implementing an aggregation step skew recovery strategy, the Helper SHOULD ensure that the Leader's AggregationJobContinueReq message did not change when it was re-sent (i.e., the two messages must be identical). This prevents the Leader from re-winding an aggregation job and re-running an aggregation step with different parameters.¶

One way the Helper could address this would be to store a digest of the Leader's request, indexed by aggregation job ID and step, and refuse to service a request for a given aggregation step unless it matches the previously seen request (if any).¶

4.7. Collecting Results

In this phase, the Collector requests aggregate shares from each Aggregator and then locally combines them to yield a single aggregate result. In particular, the Collector issues a query to the Leader (Section 4.2), which the Aggregators use to select a batch of reports to aggregate. Each Aggregator emits an aggregate share encrypted to the Collector so that it can decrypt and combine them to yield the aggregate result. This entire process is composed of two interactions:¶

Collect request and response between the Collector and Leader, specified in Section 4.7.1 ¶
Aggregate share request and response between the Leader and the Helper, specified in Section 4.7.2 ¶

Once complete, the Collector computes the final aggregate result as specified in Section 4.7.3.¶

This overall process is referred to as a "collection job".¶

4.7.1. Collection Job Initialization

First, the Collector chooses a collection job ID:¶

opaque CollectionJobID[16];

This ID value MUST be unique within the scope of the corresponding DAP task.¶

To initiate the collection job, the collector issues a PUT request to {leader}/tasks/{task-id}/collection_jobs/{collection-job-id}. The body of the request has media type "application/dap-collection-job-req", and it is structured as follows:¶

struct {
  BatchMode batch_mode;
  opaque config<0..2^16-1>;
} Query;

struct {
  Query query;
  opaque agg_param<0..2^32-1>; /* VDAF aggregation parameter */
} CollectionJobReq;

The named parameters are:¶

query, the Collector's query. The content of this field depends on the indicated batch mode: for time-interval mode, the config field contains the requested time interval (Section 5.1); for leader-selected mode, the config field is empty, and the Collector gets the aggregate result for the next batch selected by the Leader (Section 5.2).¶

Batch modes defined by future documents MUST specify the content of this field; see Section 10 for details.¶

The indicated batch mode MUST match the task's batch mode. Otherwise, the Leader MUST abort with error "invalidMessage".¶
agg_param, an aggregation parameter for the VDAF being executed. This is the same value as in AggregationJobInitReq (see Section 4.6.1.1).¶

Collectors MUST authenticate their requests to Leaders using a scheme that meets the requirements in Section 3.1.¶

Depending on the VDAF scheme and how the Leader is configured, the Leader and Helper may already have prepared a sufficient number of reports satisfying the query and be ready to return the aggregate shares right away. However, this is not always the case. In fact, for some VDAFs, it is not be possible to begin running aggregation jobs (Section 4.6) until the Collector initiates a collection job. This is because, in general, the aggregation parameter is not known until this point. In certain situations it is possible to predict the aggregation parameter in advance. For example, for Prio3 the only valid aggregation parameter is the empty string. For these reasons, the collection job is handled asynchronously. If aggregation is performed eagerly, the Leader MUST validate that the aggregation parameter received in the CollectionJobReq matches the aggregation parameter used in aggregations.¶

Upon receipt of a CollectionJobReq, the Leader begins by checking that it recognizes the task ID in the request path. If not, it MUST abort with error unrecognizedTask.¶

The Leader MAY further validate the request according to the requirements in Section 4.7.5 and abort with the indicated error, though some conditions such as the number of valid reports may not be verifiable while handling the CollectionJobReq message, and the batch will have to be re-validated later on regardless.¶

Changing a collection job's parameters is illegal, so further requests to PUT /tasks/{task-id}/collection_jobs/{collection-job-id} for the same collection-job-id but with a different CollectionJobReq in the body MUST fail with an HTTP client error status code.¶

The Leader responds to CollectionJobReqs with a CollectionJobResp, which is structured as follows:¶

enum {
  processing(0),
  ready(1),
} CollectionJobStatus;

struct {
  PartialBatchSelector part_batch_selector;
  uint64 report_count;
  Interval interval;
  HpkeCiphertext leader_encrypted_agg_share;
  HpkeCiphertext helper_encrypted_agg_share;
} Collection;

struct {
  CollectionJobStatus status;
  select (CollectionJob.status) {
    case processing: Empty;
    case ready:      Collection;
  }
} CollectionJobResp;

The body's media type is "application/dap-collection-job-resp". The Collection structure includes the following:¶

part_batch_selector: Information used to bind the aggregate result to the query. For leader-selected tasks, this includes the batch ID assigned to the batch by the Leader. The indicated batch mode MUST match the task's batch mode.¶
report_count: The number of reports included in the batch.¶
interval: The smallest interval of time that contains the timestamps of all reports included in the batch, such that the interval's start and duration are both multiples of the task's time_precision parameter. Note that in the case of a time-interval query (Section 5.1), this interval can be smaller than the one in the corresponding CollectionJobReq.query.¶
leader_encrypted_agg_share: The Leader's aggregate share, encrypted to the Collector (see Section 4.7.4).¶
helper_encrypted_agg_share: The Helper's aggregate share, encrypted to the Collector (see Section 4.7.4).¶

If the Leader finds the CollectionJobReq to be valid, it immediately responds with HTTP status 201 Created with a body containing a CollectionJobResp with the status field set to processing. The Leader SHOULD include a Retry-After header field to suggest a polling interval to the Collector.¶

After receiving the response to its CollectionJobReq, the Collector periodically makes HTTP GET requests /tasks/{task-id}/collection_jobs/{collection-job-id} to check on the status of the collect job and eventually obtain the result. The Leader responds to GET requests with HTTP status 200 and the status field reflecting the current state of the job. When the collection job is processing, the response SHOULD include a Retry-After header field to suggest a polling interval to the Collector.¶

The Leader then begins working with the Helper to aggregate the reports satisfying the query (or continues this process, depending on the VDAF) as described in Section 4.6.¶

The Leader first checks whether it can construct a batch for the collection job by applying the requirements in Section 4.7.5. If so, then the Leader obtains the Helper's aggregate share following the aggregate-share request flow described in Section 4.7.2. If not, it either aborts the collection job or tries again later, depending on which requirement in Section 4.7.5 was not met. If the Leader has a pending aggregation job that overlaps with the batch and aggregation parameter for the collection job, the Leader MUST first complete the aggregation job before proceeding and requesting an aggregate share from the Helper. This avoids a race condition between aggregation and collection jobs that can yield trivial batch mismatch errors.¶

Once both aggregate shares are successfully obtained, the Leader responds to subsequent HTTP GET requests with the status field set to ready and the Collection field populated with the encrypted aggregate shares. The Collector stops polling once receiving this final request.¶

If obtaining aggregate shares fails, then the Leader responds to subsequent HTTP GET requests to the collection job with an HTTP error status and a problem document as described in Section 3.2.¶

The Leader MAY respond with HTTP status 204 No Content to requests to a collection job if the results have been deleted.¶

The Collector can send an HTTP DELETE request to the collection job, which indicates to the Leader that it can abandon the collection job and discard all state related to it.¶

4.7.2. Obtaining Aggregate Shares

The Leader must compute its own aggregate share, as well as obtaining the Helper's encrypted aggregate share, before it can complete a collection job.¶

First, the Leader retrieves all batch buckets (Section 4.6.2.3) associated with this collection job. The batch buckets to retrieve depend on the batch mode of this task:¶

For time-interval (Section 5.1), this is all batch buckets whose batch bucket identifiers are contained within the batch interval specified in the CollectionJobReq's query.¶
For leader-selected (Section 5.2), this is the batch bucket associated with the batch ID the Leader has chosen for this collection job.¶

The Leader then combines the values inside the batch bucket as follows:¶

Aggregate shares are combined via Vdaf.merge(agg_param, agg_shares) ((see [VDAF], Section 4.4)), where agg_param is the aggregation parameter provided in the CollectionJobReq, and agg_shares are the (partial) aggregate shares in the batch buckets. The result is the final aggregate share for this collection job.¶
Report counts are combined via summing.¶
Checksums are combined via bitwise XOR.¶

Then the Leader sends a POST request to {helper}/tasks/{task-id}/aggregate_shares with the following message:¶

struct {
  BatchMode batch_mode;
  opaque config<0..2^16-1>;
} BatchSelector;

struct {
  BatchSelector batch_selector;
  opaque agg_param<0..2^32-1>;
  uint64 report_count;
  opaque checksum[32];
} AggregateShareReq;

The media type of the request is "application/dap-aggregate-share-req". The message contains the following parameters:¶

batch_selector: The "batch selector", the contents of which depends on the indicated batch mode: for time-interval mode, the config field contains the time interval selected by the Collector (Section 5.1); in leader-selected mode, the field contains the batch ID selected by the Leader (Section 5.2).¶

Future documents that new batch modes MUST specify the contents of the config field; see Section 10 for details.¶

The indicated batch mode MUST match the task's batch mode. Otherwise, the Helper MUST abort with "invalidMessage".¶
agg_param: The opaque aggregation parameter for the VDAF being executed. This value MUST match the AggregationJobInitReq message for each aggregation job used to compute the aggregate shares (see Section 4.6.1.1) and the aggregation parameter indicated by the Collector in the CollectionJobReq message (see Section 4.7.1).¶
report_count: The number number of reports included in the batch, as computed above.¶
checksum: The batch checksum, as computed above.¶

Leaders MUST authenticate their requests to Helpers using a scheme that meets the requirements in Section 3.1.¶

To handle the Leader's request, the Helper first ensures that it recognizes the task ID in the request path. If not, it MUST abort with error unrecognizedTask. The Helper then verifies that the request meets the requirements for batch parameters following the procedure in Section 4.7.5. The Helper MUST validate that the aggregation parameter matches the aggregation parameter used during aggregation of this batch; otherwise, it aborts with "invalidMessage".¶

Next, the Helper retrieves and combines the batch buckets associated with the request using the same process used by the Leader (as described at the beginning of this section Section 4.7.2), arriving at its final aggregate share, report count, and checksum values. If the Helper's computed report count and checksum values do not match the values provided in the AggregateShareReq, it MUST abort with an error of type "batchMismatch".¶

The Helper then encrypts agg_share under the Collector's HPKE public key as described in Section 4.7.4, yielding encrypted_agg_share. Encryption prevents the Leader from learning the actual result, as it only has its own aggregate share and cannot compute the Helper's.¶

The Helper responds to the Leader with HTTP status code 200 OK and a body consisting of an AggregateShare, with media type "application/dap-aggregate-share":¶

struct {
  HpkeCiphertext encrypted_aggregate_share;
} AggregateShare;

encrypted_aggregate_share.config_id is set to the Collector's HPKE config ID. encrypted_aggregate_share.enc is set to the encapsulated HPKE context enc computed above and encrypted_aggregate_share.ciphertext is the ciphertext encrypted_agg_share computed above.¶

The Helper's handling of this request MUST be idempotent. That is, if multiple identical, valid AggregateShareReqs are received, they should all yield the same response.¶

After receiving the Helper's response, the Leader uses the HpkeCiphertext to finalize a collection job (see Section 4.7.3).¶

Once an AggregateShareReq has been issued for the batch determined by a given query, it is an error for the Leader to issue any more aggregation jobs for additional reports that satisfy the query. These reports will be rejected by the Helper as described in Section 4.6.1.4.¶

Before completing the collection job, the Leader encrypts its aggregate share under the Collector's HPKE public key as described in Section 4.7.4.¶

4.7.3. Collection Job Finalization

Once the Collector has received a collection job from the Leader, it can decrypt the aggregate shares and produce an aggregate result. The Collector decrypts each aggregate share as described in Section 4.7.4. Once the Collector successfully decrypts all aggregate shares, it unshards the aggregate shares into an aggregate result using the VDAF's unshard algorithm. In particular, let leader_agg_share denote the Leader's aggregate share, helper_agg_share denote the Helper's aggregate share, let report_count denote the report count sent by the Leader, and let agg_param be the opaque aggregation parameter. The final aggregate result is computed as follows:¶

agg_result = Vdaf.unshard(agg_param,
                          [leader_agg_share, helper_agg_share],
                          report_count)

4.7.5. Batch Validation

When the Leader receives a Query in the request from the Collector during collection initialization (Section 4.7.1), it must first check that the batch determined by the query can be collected. It does so as described here. The Helper performs the same check when it receives a BatchSelector in the request from the Leader for its aggregate share (Section 4.7.2).¶

First, the Aggregator checks if the request (the CollectionJobReq for the Leader and the AggregateShareReq for the Helper) identifies a valid set of batch buckets (Section 4.6.2.3). If not, it MUST abort the request with "batchInvalid".¶

Next, the Aggregator checks that batch contains a valid number of reports. The Aggregator checks that len(X) >= min_batch_size, where X is the set of reports successfully aggregated into the batch and min_batch_size is the minimum batch size for the task. If this check fails, then Helpers MUST abort with an error of type "invalidBatchSize". Leaders SHOULD wait for more reports to be validated and try the collection job again later.¶

Next, the Aggregator checks that the batch has not been queried with multiple distinct aggregation parameters. If the batch has been queried with more than one distinct aggregation parameter, the Aggregator MUST abort with error of type "batchQueriedMultipleTimes".¶

Next, the Aggregator checks if the set of batch buckets identified by the request overlaps with the batch buckets that have already been collected. If so, it MUST abort with "batchOverlap".¶

Finally, the batch mode may define additional batch validation rules.¶

5. Batch Modes

This section defines an initial set of batch modes for DAP. New batch modes may be defined by future documents following the guidelines in Section 10.¶

Each batch mode specifies the following:¶

The value of the config field of Query, PartialBatchSelector, and BatchSelector¶
Batch buckets (Section 4.6.2.3): how reports are assigned to batch buckets; how each bucket is identified; and how batch buckets are mapped to batches and how batch buckets¶

5.1. Time Interval

enum {
  time_interval(1),
  (255)
} BatchMode;

The time-interval batch mode is designed to support applications in which reports are collected into batches grouped by an interval of time. The Collector specifies a "batch interval" that determines the time range for reports included in the batch. For each report in the batch, the time at which that report was generated (see Section 4.5) MUST fall within the batch interval specified by the Collector.¶

Typically the Collector issues queries for which the batch intervals are continuous, monotonically increasing, and have the same duration. For example, the sequence of batch intervals (1659544000, 1000), (1659545000, 1000), (1659546000, 1000), (1659547000, 1000) satisfies these conditions. (The first element of the pair denotes the start of the batch interval and the second denotes the duration.) However, this is not a requirement--the Collector may decide to issue queries out-of-order. In addition, the Collector may need to vary the duration to adjust to changing report upload rates.¶

5.1.1. Query Configuration

The payload of Query.config is¶

struct {
  Interval batch_interval;
} TimeIntervalQueryConfig;

where batch_interval is the batch interval requested by the Collector.¶

5.1.2. Partial Batch Selector Configuration

The payload of PartialBatchSelector.config is empty.¶

5.1.3. Batch Selector Configuration

The payload of BatchSelector.config is¶

struct {
  Interval batch_interval;
} TimeIntervalBatchSelectorConfig;

where batch_interval is the batch interval requested by the Collector.¶

5.1.4. Batch Buckets

Each batch bucket is identified by an interval of time whose beginning is a multiple of the task's time_precision and whose duration is equal to the task's time_precision. The identifier associated with a given report is the unique such interval containing the timestamp of the report; for example, if the task's time_precision is 1000 seconds and the report's timestamp is 1729629081, the relevant batch bucket identifier is (1729629000, 1000). The first element of the pair denotes the start of the batch interval and the second denotes the duration.¶

The Query received by the Leader determines a valid set of batch bucket identifiers if the following conditions hold (likewise for the BatchSelector received by the Helper):¶

batch_interval.duration >= time_precision (this field determines, effectively, the minimum batch duration)¶
both batch_interval.start and batch_interval.duration are divisible by time_precision¶

A batch consists of a sequence of contiguous batch buckets. That is, the set of batch buckets identifiers for the batch interval is (batch_interval.start, batch_interval.start + time_precision), (batch_interval.start + time_precision, batch_interval.start + 2*time_precision), ..., (batch_interval.start + batch_interval.duration - time_precision) + batch_interval.start + batch_interval.duration).¶

5.2. Leader-selected Batch Mode

enum {
  leader_selected(2),
  (255)
} BatchMode;

The leader-selected batch mode is used to support applications in which it is acceptable for reports to be batched in an arbitrary fashion by the Leader. Each batch of reports is identified by an opaque "batch ID". Both the reports included in each batch and the ID for each batch are allocated in an arbitrary fashion by the Leader.¶

The Collector will not know the set of batch IDs available for collection. To get the aggregate of a batch, the Collector issues a query, which does not include any information specifying a particular batch (see Section 4.2). The Leader selects a recent batch to aggregate. The Leader MUST select a batch that has not yet been associated with a collection job.¶

The Aggregators can output batches of any size that is larger than or equal to min_batch_size. The target batch size, if any, is implementation-specific, and may be equal to or greater than the minimum batch size. Deciding how soon batches should be output is also implementation-specific. Exactly sizing batches may be challenging for Leader deployments in which multiple, independent nodes running the aggregate interaction (see Section 4.6) need to be coordinated.¶

5.2.1. Query Configuration

They payload of Query.config is empty; the request merely indicates the Collector would like the next batch selected by the Leader.¶

5.2.2. Partial Batch Selector Configuration

The payload of PartialBatchSelector.config is:¶

struct {
  BatchID batch_id;
} LeaderSelectedPartialBatchSelectorConfig;

where batch_id is the batch ID selected by the Leader.¶

5.2.3. Batch Selector Configuration

The payload of BatchSelector.config is:¶

struct {
  BatchID batch_id;
} LeaderSelectedBatchSelectorConfig;

where batch_id is the batch ID selected by the Leader.¶

5.2.4. Batch Buckets

Each batch consists of a single bucket and is identified by the batch ID selected by the Leader. A report is assigned to the batch indicated by the PartialBatchSelector during aggregation.¶

6. Operational Considerations

The DAP protocol has inherent constraints derived from the tradeoff between privacy guarantees and computational complexity. These tradeoffs influence how applications may choose to utilize services implementing the specification.¶

6.1. Protocol Participant Capabilities

The design in this document has different assumptions and requirements for different protocol participants, including Clients, Aggregators, and Collectors. This section describes these capabilities in more detail.¶

6.1.1. Client Capabilities

Clients have limited capabilities and requirements. Their only inputs to the protocol are (1) the parameters configured out of band and (2) a measurement. Clients are not expected to store any state across any upload flows, nor are they required to implement any sort of report upload retry mechanism. By design, the protocol in this document is robust against individual Client upload failures since the protocol output is an aggregate over all inputs.¶

6.1.2. Aggregator Capabilities

Leaders and Helpers have different operational requirements. The design in this document assumes an operationally competent Leader, i.e., one that has no storage or computation limitations or constraints, but only a modestly provisioned Helper, i.e., one that has computation, bandwidth, and storage constraints. By design, Leaders must be at least as capable as Helpers, where Helpers are generally required to:¶

Support the aggregate interaction, which includes validating and aggregating reports; and¶
Publish and manage an HPKE configuration that can be used for the upload interaction.¶

In addition, for each DAP task, the Helper is required to:¶

Implement some form of batch-to-report index, as well as inter- and intra-batch replay mitigation storage, which includes some way of tracking batch report size. Some of this state may be used for replay attack mitigation. The replay mitigation strategy is described in Section 4.6.1.4.¶

Beyond the minimal capabilities required of Helpers, Leaders are generally required to:¶

Support the upload interaction and store reports; and¶
Track batch report size during each collect flow and request encrypted output shares from Helpers.¶

In addition, for each DAP task, the Leader is required to:¶

Implement and store state for the form of inter- and intra-batch replay mitigation in Figure 2. This requires storing the report IDs of all reports processed for a given task. Implementations may find it helpful to track additional information, like the timestamp, so that the storage used for anti-replay can be sharded efficiently.¶

6.1.3. Collector Capabilities

Collectors statefully interact with Aggregators to produce an aggregate output. Their input to the protocol is the task parameters, configured out of band, which include the corresponding batch window and size. For each collect invocation, Collectors are required to keep state from the start of the protocol to the end as needed to produce the final aggregate output.¶

Collectors must also maintain state for the lifetime of each task, which includes key material associated with the HPKE key configuration.¶

6.2. VDAFs and Compute Requirements

The choice of VDAF can impact the computation and storage required for a DAP task:¶

The runtime of VDAF sharding and preparation is related to the "size" of the underlying measurements. For example, the Prio3SumVec VDAF defined in Section 7 of [VDAF] requires each measurement to be a vector of the same length, which all parties need to agree on prior to VDAF execution. The computation required for such tasks increases linearly as a function of the chosen length, as each vector element must be processed in turn.¶
The runtime of VDAF preparation is related to the size of the aggregation parameter. For example for Poplar1 defined in Section 8 of [VDAF], preparation takes as input a sequence of so-called "candidate prefixes", and the amount of computation is linear in the number of prefixes.¶
The storage requirements for aggregate shares vary depending on the size of the measurements and/or the aggregation parameter.¶

To account for these factors, care must be taken that a DAP deployment can handle VDAF execution of all possible configurations for any tasks which the deployment may be configured for. Otherwise, an attacker may deny service by uploading many expensive reports to a suitably-configured VDAF.¶

The varying cost of VDAF computation means that Aggregators should negotiate reasonable limits for each VDAF configuration, out of band with the protocol. For example, Aggregators may agree on a maximum size for an aggregation job or on a maximum rate of incoming reports.¶

Applications which require computationally-expensive VDAFs can mitigate the computation cost of aggregation in a few ways, such as producing aggregates over a sample of the data or choosing a representation of the data permitting a simpler aggregation scheme.¶

6.3. Aggregation Utility and Soft Batch Deadlines

A soft real-time system should produce a response within a deadline to be useful. This constraint may be relevant when the value of an aggregate decreases over time. A missed deadline can reduce an aggregate's utility but not necessarily cause failure in the system.¶

An example of a soft real-time constraint is the expectation that input data can be verified and aggregated in a period equal to data collection, given some computational budget. Meeting these deadlines will require efficient implementations of the VDAF. Applications might batch requests or utilize more efficient serialization to improve throughput.¶

Some applications may be constrained by the time that it takes to reach a privacy threshold defined by a minimum number of reports. One possible solution is to increase the reporting period so more samples can be collected, balanced against the urgency of responding to a soft deadline.¶

6.4. Protocol-specific Optimizations

Not all DAP tasks have the same operational requirements, so the protocol is designed to allow implementations to reduce operational costs in certain cases.¶

6.4.1. Reducing Storage Requirements

In general, the Aggregators are required to keep state for tasks and all valid reports for as long as collection requests can be made for them. However, it is not necessary to store the complete reports. Each Aggregator only needs to store an aggregate share for each possible batch interval (for time-interval) or batch ID (for leader-selected), along with a flag indicating whether the aggregate share has been collected. This is due to the requirement that in the time-interval case, the batch interval respect the boundaries defined by the DAP parameters; and that in leader-selected case, a batch is determined entirely by a batch ID. (See Section 4.7.5.)¶

However, Aggregators are also required to implement several per-report checks that require retaining a number of data artifacts. For example, to detect replay attacks, it is necessary for each Aggregator to retain the set of report IDs of reports that have been aggregated for the task so far. Depending on the task lifetime and report upload rate, this can result in high storage costs. To alleviate this burden, DAP allows Aggregators to drop this state as needed, so long as reports are dropped properly as described in Section 4.6.1.4. Aggregators SHOULD take steps to mitigate the risk of dropping reports (e.g., by evicting the oldest data first).¶

Furthermore, the Aggregators must store data related to a task as long as the current time has not passed this task's end time task_start + task_duration. Aggregator MAY delete the task and all data pertaining to this task after the end time. Implementors SHOULD provide for some leeway so the Collector can collect the batch after some delay.¶

6.4.2. Distributed-systems and Synchronization Concerns

Various parts of a DAP implementation will need to synchronize in order to ensure correctness during concurrent operation. This section describes the relevant concerns and makes suggestions as to potential implementation tradeoffs.¶

The upload interaction requires the Leader to ignore uploaded reports with a duplicated ID, including concurrently-uploaded reports. This might be implemented by synchronization or via an eventually-consistent process. If the Leader wishes to alert the Client with a reportRejected error, synchronization will be necessary to ensure all but one concurrent request receive the error.¶
The Leader is responsible for generating aggregation jobs, and will generally want to place each report in exactly one aggregation job. (The only event in which a Leader will want to place a report in multiple aggregation jobs is if the Helper rejects the report with report_too_early, in which case the Leader can place the report into a later aggregation job.) This may require synchronization between different components of the system which are generating aggregation jobs. Note that placing a report into more than one aggregation job will result in a loss of throughput, rather than a loss of correctness, privacy, or robustness, so it is acceptable for implementations to use an eventually-consistent scheme which may rarely place a report into multiple aggregation jobs.¶
Aggregation is implemented as a sequence of aggregation steps by both the Leader and the Helper. The Leader must ensure that each aggregation job is only processed once concurrently, which may require synchronization between the components responsible for performing aggregation. The Helper must ensure that concurrent requests against the same aggregation job are handled appropriately, which requires synchronization between the components handling aggregation requests.¶
Aggregation requires checking and updating used-report storage as part of implementing replay protection. This must be done while processing the aggregation job, though which steps the checks are performed at is up to the implementation. The checks and storage require synchronization, so that if two aggregation jobs contianing the same report are processed, at most one instance of the report will be aggregated. However, the interaction with the used-report storage does not necessarily have to be synchronized with the processing and storage for the remainder of the aggregation process. For example, used-report storage could be implemented in a separate datastore than is used for the remainder of data storage, without any transactionality between updates to the two datastores.¶
The aggregation and collection interactions require synchronization to avoid modifying the aggregate of a batch after it has already been collected. Any reports being aggregated which pertain to a batch which has already been collected must fail with a batch_collected error; correctly determining this requires synchronizing aggregation with the completion of collection jobs (for the Leader) or aggregate share requests (for the Helper). Also, the Leader must complete all outstanding aggregation jobs for a batch before requesting aggregate shares from the Helper, again requiring synchronization between the Leader's collection and aggregation interactions. Further, the Helper must determine the aggregated report count and checksum of aggregated report IDs before responding to an aggregate share request, requiring synchronization between the Helper's collection and aggregation interactions.¶

7. Compliance Requirements

In the absence of an application or deployment-specific profile specifying otherwise, a compliant DAP application MUST implement the following HPKE cipher suite:¶

KEM: DHKEM(X25519, HKDF-SHA256) (see [HPKE], Section 7.1)¶
KDF: HKDF-SHA256 (see [HPKE], Section 7.2)¶
AEAD: AES-128-GCM (see [HPKE], Section 7.3)¶

8. Security Considerations

DAP aims to achieve the privacy and robustness security goals defined in Section 9 of [VDAF], even in the presence of an active attacker. It is assumed that the attacker may control the network and have the ability to control a subset of of Clients, one of the Aggregators, and the Collector for a given task.¶

In the presence of this adversary, there are some threats DAP does not defend against and which are considered outside of DAP's threat model. These are enumerated below, along with potential mitigations.¶

Attacks on robustness:¶

Aggregators can defeat robustness by emitting incorrect aggregate shares, by omitting reports from the aggregation process, or by manipulating the VDAF preparation process for a single report. DAP follows VDAF in providing robustness only if both Aggregators honestly follow the protocol.¶
Clients may affect the quality of aggregate results by reporting false measurements. A VDAF can only verify that a submitted measurement is valid, not that it is true.¶
An attacker can impersonate multiple Clients, or a single malicious Client can upload an unexpectedly-large number of reports, in order to skew aggregate results or to reduce the number of measurements from honest Clients in a batch below the minimum batch size. See Section 8.1 for discussion and potential mitigations.¶

Attacks on privacy:¶

Clients can intentionally leak their own measurements and compromise their own privacy.¶
Both Aggregators together can, purposefully or accidentally, share unencrypted input shares in order to defeat the privacy of individual reports. DAP follows VDAF in providing privacy only if at least one Aggregator honestly follows the protocol.¶

Attacks on other properties of the system:¶

Both Aggregators together can, purposefully or accidentally, share unencrypted aggregate shares in order to reveal the aggregation result for a given batch.¶
Aggregators, or a passive network attacker between the Clients and the Leader, can examine metadata such as HTTP client IP in order to infer which Clients are submitting reports. Depending on the particulars of the deployment, this may be used to infer sensitive information about the Client. This can be mitigated for the Aggregator by deploying an anonymizing proxy (see Section 8.4), or in general by requiring Clients to submit reports at regular intervals independently of the measurement value such that the existence of a report does not imply the occurrence of a sensitive event.¶
Aggregators can deny service by refusing to respond to collection requests or aggregate share requests.¶
Some VDAFs could leak information to either Aggregator or the Collector beyond what the protocol intended to learn. It may be possible to mitigate such leakages using differential privacy (Section 8.5).¶

8.1. Sybil Attacks

Several attacks on privacy or robustness involve malicious Clients uploading reports that are valid under the chosen VDAF but incorrect.¶

For example, a DAP deployment might be measuring the heights of a human population and configure a variant of Prio3 to prove that measurements are values in the range of 80-250 cm. A malicious Client would not be able to claim a height of 400 cm, but they could submit multiple bogus reports inside the acceptable range, which would yield incorrect averages. More generally, DAP deployments are susceptible to Sybil attacks [Dou02], especially when carried out by the Leader.¶

In this type of attack, the adversary adds to a batch a number of reports that skew the aggregate result in its favor. For example, sending known measurements to the Aggregators can allow a Collector to shrink the effective anonymity set by subtracting the known measurements from the aggregate result. The result may reveal additional information about the honest measurements, leading to a privacy violation; or the result may have some property that is desirable to the adversary ("stats poisoning").¶

Depending on the deployment and the specific threat being mitigated, there are different ways to address Sybil attacks, such as:¶

Implementing Client authentication, as described in Section 8.3, likely paired with rate-limiting uploads from individual Clients.¶
Removing Client-specific metadata on individual reports, such as through the use of anonymizing proxies in the upload flow, as described in Section 8.4.¶
Some mechanisms for differential privacy (Section 8.5) can help mitigate Sybil attacks against privacy to some extent.¶

8.2. Batch-selection Attacks

Depending on the batch mode, the privacy of an individual Client may be infringed upon by selection of the batch. For example, in the leader-selected batch mode, the Leader is free to select the reports that compose a given batch almost arbitrarily; a malicious Leader might choose a batch composed of reports arriving from a single client. The aggregate derived from this batch might then reveal information about that Client.¶

The mitigations for this attack are similar to those used for Sybil attacks (Section 8.1):¶

Implementing Client authentication, as described in Section 8.3, and having each aggregator verify that each batch contains reports from a suitable number of distinct clients.¶
Disassociating each report from the Client which generated it, via the use of anonymizing proxies (Section 8.4) or similar techniques.¶
Differential privacy (Section 8.5) can help mitigate the impact of this attack.¶
Deployment-specific mitigations may also be possible: for example, if every Client is sending reports at a given rate, it may be possible for aggregators to bound the accepted age of reports such that the number of aggregatable reports from a given Client is small enough to effectively mitigate this attack.¶

8.3. Client Authentication

In settings where it is practical for each Client to have an identity provisioned (e.g., a user logged into a backend service or a hardware device programmed with an identity), Client authentication can help Aggregators (or an authenticating proxy deployed between Clients and the Aggregators; see Section 8.4) ensure that all reports come from authentic Clients. Note that because the Helper never handles messages directly from the Clients, reports would need to include an extension (Section 4.5.3) to convey authentication information to the Helper. For example, a deployment might include a Privacy Pass token ([I-D.draft-ietf-privacypass-architecture-16]) in a report extension to allow both Aggregators to independently verify the Client's identity.¶

However, in some deployments, it will not be practical to require Clients to authenticate, so Client authentication is not mandatory in DAP. For example, a widely distributed application that does not require its users to log in to any service has no obvious way to authenticate its report uploads.¶

8.4. Anonymizing Proxies

Client reports can contain auxiliary information such as source IP, HTTP user agent, or Client authentication information (in deployments which use it, see Section 8.3). This metadata can be used by Aggregators to identify participating Clients or permit some attacks on robustness. This auxiliary information can be removed by having Clients submit reports to an anonymizing proxy server which would then use Oblivious HTTP [RFC9458] to forward reports to the DAP Leader. In this scenario, Client authentication would be performed by the proxy rather than any of the participants in the DAP protocol.¶

8.5. Differential Privacy

DAP deployments can choose to ensure their aggregate results achieve differential privacy ([Vad16]). A simple approach would require the Aggregators to add two-sided noise (e.g. sampled from a two-sided geometric distribution) to aggregate shares. Since each Aggregator is adding noise independently, privacy can be guaranteed even if all but one of the Aggregators is malicious. Differential privacy is a strong privacy definition, and protects users in extreme circumstances: even if an adversary has prior knowledge of every measurement in a batch except for one, that one measurement is still formally protected.¶

8.6. Task Parameters

Distribution of DAP task parameters is out of band from DAP itself and thus not discussed in this document. This section examines the security tradeoffs involved in the selection of the DAP task parameters. Generally, attacks involving crafted DAP task parameters can be mitigated by having the Aggregators refuse shared parameters that are trivially insecure (e.g., a minimum batch size of 1 report).¶

8.6.1. VDAF Verification Key Requirements

Knowledge of the verification key would allow a Client to forge a report with invalid values that will nevertheless pass verification. Therefore, the verification key must be kept secret from Clients.¶

Furthermore, for a given report, it may be possible to craft a verification key which leaks information about that report's measurement during VDAF preparation. Therefore, the verification key for a task SHOULD be chosen before any reports are generated. Moreover, it SHOULD be fixed for the lifetime of the task and not be rotated. One way to ensure that the verification key is generated independently from any given report is to derive the key based on the task ID and some previously agreed upon secret (verify_key_seed) between Aggregators, as follows:¶

verify_key = HKDF-Expand(
    HKDF-Extract(
        "verify_key",    # salt
        verify_key_seed, # IKM
    ),
    task_id,             # info
    VERIFY_KEY_SIZE,     # L
)

Here, VERIFY_KEY_SIZE is the length of the verification key, and HKDF-Extract and HKDF-Expand are as defined in [RFC5869].¶

This requirement comes from current security analysis for existing VDAFs. In particular, the security proofs for Prio3 require that the verification key is chosen independently of the generated reports.¶

8.6.2. Batch Parameters

An important parameter of a DAP deployment is the minimum batch size. If a batch includes too few reports, then the aggregate result can reveal information about individual measurements. Aggregators enforce the agreed-upon minimum batch size during collection, but implementations SHOULD also opt out of participating in a DAP task if the minimum batch size is too small. This document does not specify how to choose an appropriate minimum batch size, but an appropriate value may be determined from the differential privacy (Section 8.5) parameters in use, if any.¶

8.6.3. Task Configuration Agreement and Consistency

In order to execute a DAP task, it is necessary for all parties to ensure they agree on the configuration of the task. However, it is possible for a party to participate in the execution of DAP without knowing all of the task's parameters. For example, a Client can upload a report (Section 4.5) without knowing the minimum batch size that is enforced by the Aggregators during collection (Section 4.7).¶

Depending on the deployment model, agreement can require that task parameters are visible to all parties such that each party can choose whether to participate based on the value of any parameter. This includes the parameters enumerated in Section 4.3 and any additional parameters implied by report extensions Section 4.5.3 used by the task. Since meaningful privacy requires that multiple Clients contribute to a task, they should also share a consistent view of the task configuration.¶

8.7. Infrastructure Diversity

DAP deployments should ensure that Aggregators do not have common dependencies that would enable a single vendor to reassemble measurements. For example, if all participating Aggregators stored unencrypted input shares on the same cloud object storage service, then that cloud vendor would be able to reassemble all the input shares and defeat privacy.¶

9. IANA Considerations

This document requests registry of new media types (Section 9.1), creation of new codepoint registries (Section 9.2), and registration of an IETF URN sub-namespace (Section 9.3).¶

(RFC EDITOR: In the remainder of this section, replace "RFC XXXX" with the RFC number assigned to this document.)¶

9.1. Protocol Message Media Types

This specification defines the following protocol messages, along with their corresponding media types types:¶

HpkeConfigList Section 4.5.1: "application/dap-hpke-config-list"¶
Report Section 4.5.2: "application/dap-report"¶
AggregationJobInitReq Section 4.6.1.1: "application/dap-aggregation-job-init-req"¶
AggregationJobResp Section 4.6.1.2: "application/dap-aggregation-job-resp"¶
AggregationJobContinueReq Section 4.6.2.1: "application/dap-aggregation-job-continue-req"¶
AggregateShareReq Section 4.7.2: "application/dap-aggregate-share-req"¶
AggregateShare Section 4.7.2: "application/dap-aggregate-share"¶
CollectionJobReq Section 4.7.1: "application/dap-collection-job-req"¶
CollectionJobResp Section 4.7.1: "application/dap-collection-job-resp"¶

Protocol message format evolution is supported through the definition of new formats that are identified by new media types. The messages above are specific to this specification. When a new major enhancement is proposed that results in newer IETF specification for DAP, a new set of media types will be defined. In other words, newer versions of DAP will not be backward compatible with this version of DAP.¶

(RFC EDITOR: Remove this paragraph.) HTTP requests with DAP media types MAY express an optional parameter 'version', following Section 8.3 of [RFC9110]. Value of this parameter indicates current draft version of the protocol the component is using. This MAY be used as a hint by the receiver of the request to do compatibility checks between client and server. For example, A report submission to leader from a client that supports draft-ietf-ppm-dap-09 could have the header Media-Type: application/dap-report;version=09.¶

The "Media Types" registry at https://www.iana.org/assignments/media-types will be (RFC EDITOR: replace "will be" with "has been") updated to include each of these media types. The information required for each media type is listed in the remaining subsections.¶

9.1.1. "application/dap-hpke-config-list" media type

Type name:

application¶

Subtype name:

dap-hpke-config-list¶

Required parameters:

N/A¶

Optional parameters:

None¶

Encoding considerations:

only "8bit" or "binary" is permitted¶

Security considerations:

see Section 4.5 of the published specification¶

Interoperability considerations:

N/A¶

Published specification:

RFC XXXX¶

Applications that use this media type:

N/A¶

Fragment identifier considerations:

N/A¶

Additional information:

Magic number(s):: N/A¶
Deprecated alias names for this type:: N/A¶
File extension(s):: N/A¶
Macintosh file type code(s):: N/A¶

Person and email address to contact for further information:

see Authors' Addresses section of the published specification¶

Intended usage:

COMMON¶

Restrictions on usage:

N/A¶

Author:

see Authors' Addresses section of the published specification¶

Change controller:

IESG¶

9.1.2. "application/dap-report" media type

Type name:

application¶

Subtype name:

dap-report¶

Required parameters:

N/A¶

Optional parameters:

None¶

Encoding considerations:

only "8bit" or "binary" is permitted¶

Security considerations:

see Section 4.5 of the published specification¶

Interoperability considerations:

N/A¶

Published specification:

RFC XXXX¶

Applications that use this media type:

N/A¶

Fragment identifier considerations:

N/A¶

Additional information:

Magic number(s):: N/A¶
Deprecated alias names for this type:: N/A¶
File extension(s):: N/A¶
Macintosh file type code(s):: N/A¶

Person and email address to contact for further information:

see Authors' Addresses section of the published specification¶

Intended usage:

COMMON¶

Restrictions on usage:

N/A¶

Author:

see Authors' Addresses section of the published specification¶

Change controller:

IESG¶

9.1.3. "application/dap-aggregation-job-init-req" media type

Type name:

application¶

Subtype name:

dap-aggregation-job-init-req¶

Required parameters:

N/A¶

Optional parameters:

None¶

Encoding considerations:

only "8bit" or "binary" is permitted¶

Security considerations:

see Section 4.6 of the published specification¶

Interoperability considerations:

N/A¶

Published specification:

RFC XXXX¶

Applications that use this media type:

N/A¶

Fragment identifier considerations:

N/A¶

Additional information:

Magic number(s):: N/A¶
Deprecated alias names for this type:: N/A¶
File extension(s):: N/A¶
Macintosh file type code(s):: N/A¶

Person and email address to contact for further information:

see Authors' Addresses section of the published specification¶

Intended usage:

COMMON¶

Restrictions on usage:

N/A¶

Author:

see Authors' Addresses section of the published specification¶

Change controller:

IESG¶

9.1.4. "application/dap-aggregation-job-resp" media type

Type name:

application¶

Subtype name:

dap-aggregation-job-resp¶

Required parameters:

N/A¶

Optional parameters:

None¶

Encoding considerations:

only "8bit" or "binary" is permitted¶

Security considerations:

see Section 4.6 of the published specification¶

Interoperability considerations:

N/A¶

Published specification:

RFC XXXX¶

Applications that use this media type:

N/A¶

Fragment identifier considerations:

N/A¶

Additional information:

Magic number(s):: N/A¶
Deprecated alias names for this type:: N/A¶
File extension(s):: N/A¶
Macintosh file type code(s):: N/A¶

Person and email address to contact for further information:

see Authors' Addresses section of the published specification¶

Intended usage:

COMMON¶

Restrictions on usage:

N/A¶

Author:

see Authors' Addresses section of the published specification¶

Change controller:

IESG¶

9.1.5. "application/dap-aggregation-job-continue-req" media type

Type name:

application¶

Subtype name:

dap-aggregation-job-continue-req¶

Required parameters:

N/A¶

Optional parameters:

None¶

Encoding considerations:

only "8bit" or "binary" is permitted¶

Security considerations:

see Section 4.6 of the published specification¶

Interoperability considerations:

N/A¶

Published specification:

RFC XXXX¶

Applications that use this media type:

N/A¶

Fragment identifier considerations:

N/A¶

Additional information:

Magic number(s):: N/A¶
Deprecated alias names for this type:: N/A¶
File extension(s):: N/A¶
Macintosh file type code(s):: N/A¶

Person and email address to contact for further information:

see Authors' Addresses section of the published specification¶

Intended usage:

COMMON¶

Restrictions on usage:

N/A¶

Author:

see Authors' Addresses section of the published specification¶

Change controller:

IESG¶

9.1.8. "application/dap-collection-job-req" media type

Type name:

application¶

Subtype name:

dap-collection-job-req¶

Required parameters:

N/A¶

Optional parameters:

None¶

Encoding considerations:

only "8bit" or "binary" is permitted¶

Security considerations:

see Section 4.7 of the published specification¶

Interoperability considerations:

N/A¶

Published specification:

RFC XXXX¶

Applications that use this media type:

N/A¶

Fragment identifier considerations:

N/A¶

Additional information:

Magic number(s):: N/A¶
Deprecated alias names for this type:: N/A¶
File extension(s):: N/A¶
Macintosh file type code(s):: N/A¶

Person and email address to contact for further information:

see Authors' Addresses section of the published specification¶

Intended usage:

COMMON¶

Restrictions on usage:

N/A¶

Author:

see Authors' Addresses section of the published specification¶

Change controller:

IESG¶

9.1.9. "application/dap-collection-job-resp" media type

Type name:

application¶

Subtype name:

dap-collection-job-resp¶

Required parameters:

N/A¶

Optional parameters:

None¶

Encoding considerations:

only "8bit" or "binary" is permitted¶

Security considerations:

see Section 4.7 of the published specification¶

Interoperability considerations:

N/A¶

Published specification:

RFC XXXX¶

Applications that use this media type:

N/A¶

Fragment identifier considerations:

N/A¶

Additional information:

Magic number(s):: N/A¶
Deprecated alias names for this type:: N/A¶
File extension(s):: N/A¶
Macintosh file type code(s):: N/A¶

Person and email address to contact for further information:

see Authors' Addresses section of the published specification¶

Intended usage:

COMMON¶

Restrictions on usage:

N/A¶

Author:

see Authors' Addresses section of the published specification¶

Change controller:

IESG¶

9.2. DAP Type Registries

This document also requests creation of a new "Distributed Aggregation Protocol (DAP)" page. This page will contain several new registries, described in the following sections. All registries are administered under the Specification Required policy [RFC8126].¶

9.2.1. Batch Modes Registry

A new registry will be (RFC EDITOR: change "will be" to "has been") created called "Batch Mode Identifiers" for DAP batch modes (Section 4.2). This registry should contain the following columns:¶

Value:: The one-byte identifier for the batch mode¶
Name:: The name of the batch mode¶
Reference:: Where the batch mode is defined¶

The initial contents of this registry listed in Table 2.¶

Table 2: Initial contents of the Batch Mode Identifiers registry.
Value	Name	Reference
`0x00`	`reserved`	Section 4.2 of RFC XXXX
`0x01`	`time_interval`	Section 5.1 of RFC XXXX
`0x02`	`leader_selected`	Section 5.2 of RFC XXXX

9.2.2. Report Extension Registry

A new registry will be (RFC EDITOR: change "will be" to "has been") created called "Report Extension Identifiers" for extensions to the upload interaction (Section 4.5). This registry should contain the following columns:¶

Value:: The two-byte identifier for the upload extension¶
Name:: The name of the upload extension¶
Reference:: Where the upload extension is defined¶

The initial contents of this registry are listed in Table 3.¶

Table 3: Initial contents of the Report Extension Identifiers registry.
Value	Name	Reference
`0x0000`	`reserved`	RFC XXXX

9.2.3. Report Error Registry

A new registry will be (RFC EDITOR: change "will be" to "has been") created called "Report Error Identifiers" for reasons for rejecting reports during the aggregation interaction (Section 4.6.1.2).¶

Value:: The one-byte identifier of the report error¶
Name:: The name of the report error¶
Reference:: Where the report error is defined¶

The initial contents of this registry are listed below in Table 4.¶

Table 4: Initial contents of the Report Error Identifiers registry.
Value	Name	Reference
`0x00`	`reserved`	Section 4.6.1.2 of RFX XXXX
`0x01`	`batch_collected`	Section 4.6.1.2 of RFX XXXX
`0x02`	`report_replayed`	Section 4.6.1.2 of RFX XXXX
`0x03`	`report_dropped`	Section 4.6.1.2 of RFX XXXX
`0x04`	`hpke_unknown_config_id`	Section 4.6.1.2 of RFX XXXX
`0x05`	`hpke_decrypt_error`	Section 4.6.1.2 of RFX XXXX
`0x06`	`vdaf_prep_error`	Section 4.6.1.2 of RFX XXXX
`0x07`	`task_expired`	Section 4.6.1.2 of RFX XXXX
`0x08`	`invalid_message`	Section 4.6.1.2 of RFX XXXX
`0x09`	`report_too_early`	Section 4.6.1.2 of RFX XXXX
`0x10`	`task_not_started`	Section 4.6.1.2 of RFX XXXX

9.3. URN Sub-namespace for DAP (urn:ietf:params:ppm:dap)

The following value will be (RFC EDITOR: change "will be" to "has been") registered in the "IETF URN Sub-namespace for Registered Protocol Parameter Identifiers" registry, following the template in [RFC3553]:¶

Registry name:  dap

Specification:  RFC XXXX

Repository:  http://www.iana.org/assignments/dap

Index value:  No transformation needed.

The initial contents of this namespace are the types and descriptions in Table 1, with the Reference field set to RFC XXXX.¶

10. Extending this Document

The behavior of DAP may be extended or modified by future documents defining one or more of the following:¶

a new batch mode (Section 4.2)¶
a new report extension (Section 4.5.3)¶
a new report error (Section 4.6.1.2)¶
a new entry in the URN sub-namespace for DAP (Table 1)¶

Each of these requires registration of a codepoint or other value; see Section 9. No other considerations are required except in the following cases:¶

When a document defines a new batch mode, it MUST include a section titled "DAP Batch Mode Considerations" specifying the following:¶
- The value of the config field of Query, PartialBatchSelector, and BatchSelector¶
- Batch buckets (Section 4.6.2.3): how reports are assigned to batch buckets; how each bucket is identified; and how batch buckets are mapped to batches.¶
See Section 5 for examples.¶
When a document defines a new report extension, it SHOULD include in its "Security Considerations" section some discussion of how the extension impacts the security of DAP with respect to the threat model in Section 8.¶

11. References

11.1. Normative References

[HPKE]: Barnes, R., Bhargavan, K., Lipp, B., and C. Wood, "Hybrid Public Key Encryption", RFC 9180, DOI 10.17487/RFC9180, February 2022, <https://www.rfc-editor.org/rfc/rfc9180>.
[OAuth2]: Hardt, D., Ed., "The OAuth 2.0 Authorization Framework", RFC 6749, DOI 10.17487/RFC6749, October 2012, <https://www.rfc-editor.org/rfc/rfc6749>.
[RFC2119]: Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997, <https://www.rfc-editor.org/rfc/rfc2119>.
[RFC3553]: Mealling, M., Masinter, L., Hardie, T., and G. Klyne, "An IETF URN Sub-namespace for Registered Protocol Parameters", BCP 73, RFC 3553, DOI 10.17487/RFC3553, June 2003, <https://www.rfc-editor.org/rfc/rfc3553>.
[RFC4648]: Josefsson, S., "The Base16, Base32, and Base64 Data Encodings", RFC 4648, DOI 10.17487/RFC4648, October 2006, <https://www.rfc-editor.org/rfc/rfc4648>.
[RFC5861]: Nottingham, M., "HTTP Cache-Control Extensions for Stale Content", RFC 5861, DOI 10.17487/RFC5861, May 2010, <https://www.rfc-editor.org/rfc/rfc5861>.
[RFC7807]: Nottingham, M. and E. Wilde, "Problem Details for HTTP APIs", RFC 7807, DOI 10.17487/RFC7807, March 2016, <https://www.rfc-editor.org/rfc/rfc7807>.
[RFC8126]: Cotton, M., Leiba, B., and T. Narten, "Guidelines for Writing an IANA Considerations Section in RFCs", BCP 26, RFC 8126, DOI 10.17487/RFC8126, June 2017, <https://www.rfc-editor.org/rfc/rfc8126>.
[RFC8174]: Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, May 2017, <https://www.rfc-editor.org/rfc/rfc8174>.
[RFC8446]: Rescorla, E., "The Transport Layer Security (TLS) Protocol Version 1.3", RFC 8446, DOI 10.17487/RFC8446, August 2018, <https://www.rfc-editor.org/rfc/rfc8446>.
[RFC9110]: Fielding, R., Ed., Nottingham, M., Ed., and J. Reschke, Ed., "HTTP Semantics", STD 97, RFC 9110, DOI 10.17487/RFC9110, June 2022, <https://www.rfc-editor.org/rfc/rfc9110>.
[RFC9111]: Fielding, R., Ed., Nottingham, M., Ed., and J. Reschke, Ed., "HTTP Caching", STD 98, RFC 9111, DOI 10.17487/RFC9111, June 2022, <https://www.rfc-editor.org/rfc/rfc9111>.
[RFC9457]: Nottingham, M., Wilde, E., and S. Dalal, "Problem Details for HTTP APIs", RFC 9457, DOI 10.17487/RFC9457, July 2023, <https://www.rfc-editor.org/rfc/rfc9457>.
[RFC9458]: Thomson, M. and C. A. Wood, "Oblivious HTTP", RFC 9458, DOI 10.17487/RFC9458, January 2024, <https://www.rfc-editor.org/rfc/rfc9458>.
[SHS]: "Secure hash standard", National Institute of Standards and Technology (U.S.), DOI 10.6028/nist.fips.180-4, 2015, <https://doi.org/10.6028/nist.fips.180-4>.
[VDAF]: Barnes, R., Cook, D., Patton, C., and P. Schoppmann, "Verifiable Distributed Aggregation Functions", Work in Progress, Internet-Draft, draft-irtf-cfrg-vdaf-13, 2 November 2024, <https://datatracker.ietf.org/doc/html/draft-irtf-cfrg-vdaf-13>.

11.2. Informative References

[Dou02]: Douceur, J. R., "The Sybil Attack", International Workshop on Peer-to-Peer Systems (IPTPS), 2002, <https://doi.org/10.1007/3-540-45748-8_24>.
[I-D.draft-dcook-ppm-dap-interop-test-design-04]: Cook, D., "DAP Interoperation Test Design", Work in Progress, Internet-Draft, draft-dcook-ppm-dap-interop-test-design-04, 14 June 2023, <https://datatracker.ietf.org/doc/html/draft-dcook-ppm-dap-interop-test-design-04>.
[I-D.draft-ietf-privacypass-architecture-16]: Davidson, A., Iyengar, J., and C. A. Wood, "The Privacy Pass Architecture", Work in Progress, Internet-Draft, draft-ietf-privacypass-architecture-16, 25 September 2023, <https://datatracker.ietf.org/doc/html/draft-ietf-privacypass-architecture-16>.
[RFC5869]: Krawczyk, H. and P. Eronen, "HMAC-based Extract-and-Expand Key Derivation Function (HKDF)", RFC 5869, DOI 10.17487/RFC5869, May 2010, <https://www.rfc-editor.org/rfc/rfc5869>.
[Vad16]: Vadhan, S., "The Complexity of Differential Privacy", 2016, <https://privacytools.seas.harvard.edu/files/privacytools/files/complexityprivacy_1.pdf>.