The following limitations apply to Atlas Stream Processing:
The combined state.stateSize of all stream processing instances can't exceed 80% of the RAM available for a worker in the same SPI tier. For example, the maximum size of a stream processor in the SP30 tier, which has 8 GB of RAM per worker, is 6.4 GB. If the state.stateSize of any of your stream processors is approaching 80% of the RAM available for a worker in the same SPI tier, move up to the next SPI tier.
When the 80% RAM threshold has been crossed, all stream processors fail with a stream processing instance out of memory error. You can view the state.stateSize value of each stream processor with the sp.processor.stats() command. See View Statistics of a Stream Processor to learn more.
A stream processing instance can use only clusters in the same project as sources or sinks.
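The 80% memory threshold described above is simple arithmetic, and a small helper can make the remaining headroom explicit. The following is a minimal sketch, not an Atlas API: the function names and the report keys are hypothetical, and it assumes you have already collected each processor's state.stateSize and converted it to gigabytes.

```python
from typing import Dict, Iterable

# 80% of worker RAM, as stated in the limitation above.
STATE_SIZE_LIMIT_FRACTION = 0.8


def state_size_limit_gb(worker_ram_gb: float) -> float:
    """Return the largest combined state size (in GB) allowed for one worker."""
    return worker_ram_gb * STATE_SIZE_LIMIT_FRACTION


def headroom_report(state_sizes_gb: Iterable[float], worker_ram_gb: float) -> Dict[str, object]:
    """Compare the combined state size against the 80% limit.

    state_sizes_gb holds one value per stream processor (hypothetical input,
    gathered separately, for example from sp.processor.stats()).
    """
    combined = sum(state_sizes_gb)
    limit = state_size_limit_gb(worker_ram_gb)
    return {
        "combined_gb": combined,
        "limit_gb": limit,
        "fraction_of_limit": combined / limit,
        "over_limit": combined > limit,
    }


# SP30 example from the text: 8 GB of RAM per worker gives a 6.4 GB limit.
report = headroom_report([2.5, 3.0], worker_ram_gb=8)
print(report)
```

A report whose fraction_of_limit is close to 1 suggests moving up to the next SPI tier before processors begin to fail.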
An Atlas Stream Processing pipeline definition cannot exceed 16 MB.
Only users with the Project Owner or Atlas admin roles can use Atlas Stream Processing.
Atlas Stream Processing currently supports only the following connection types:
Connection Type      Usage
Apache Kafka         Source or Sink
Atlas Database       Source or Sink
Sample Connection    Source Only
If an Apache Kafka topic acting as a $source for a running processor adds a partition, Atlas Stream Processing continues running without reading the new partition. The processor fails when it detects the new partition, either after you restore it from a checkpoint following a failure or after you restart it after stopping it. You must recreate the processors that read from topics with newly added partitions.
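Because a processor keeps running without reading a newly added partition, it can help to track partition counts yourself. The sketch below assumes you record each topic's partition count when you create a processor and can obtain current counts from your own Kafka tooling. The function, its parameters, and the example topic and processor names are all hypothetical and are not part of Atlas Stream Processing.

```python
from typing import Dict, List


def processors_to_recreate(
    partitions_at_creation: Dict[str, int],
    current_partitions: Dict[str, int],
    processor_topics: Dict[str, List[str]],
) -> List[str]:
    """Return the names of processors whose source topics have gained partitions."""
    # A topic has "grown" if it now has more partitions than were recorded.
    # Topics with no recorded count are ignored rather than guessed at.
    grown_topics = {
        topic
        for topic, count in current_partitions.items()
        if count > partitions_at_creation.get(topic, count)
    }
    return sorted(
        name
        for name, topics in processor_topics.items()
        if grown_topics.intersection(topics)
    )


# Illustrative data only: the "orders" topic grew from 3 to 4 partitions.
recorded = {"orders": 3, "clicks": 6}
current = {"orders": 4, "clicks": 6}
readers = {"orderTotals": ["orders"], "clickCounts": ["clicks"]}
print(processors_to_recreate(recorded, current, readers))
```

Any processor the function returns reads from a topic with new partitions and, per the limitation above, needs to be recreated.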
Atlas Stream Processing currently supports only JSON-formatted data. It does not currently support alternative serializations such as Avro or Protocol Buffers.
For Apache Kafka connections, Atlas Stream Processing currently supports only the following security protocols:
SASL_PLAINTEXT
SASL_SSL
SSL
For SASL, Atlas Stream Processing supports the following mechanisms:
PLAIN
SCRAM-SHA-256
SCRAM-SHA-512
OAUTHBEARER
For SSL, you must provide the following assets for mutual TLS authentication between your Apache Kafka system and Atlas Stream Processing:
a Certificate Authority (if you are using one other than the default Apache Kafka CA)
a client TLS certificate
a TLS keyfile, used to sign your TLS certificate
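The Apache Kafka security rules above can be checked before you attempt to create a connection. The following is a minimal sketch under stated assumptions: the configuration keys ("protocol", "mechanism", "client_certificate", "client_key") are hypothetical names chosen for illustration, not the Atlas connection schema, and the SASL mechanism list is assumed to apply to both SASL_PLAINTEXT and SASL_SSL. The Certificate Authority is not checked because the text requires it only when you use a non-default CA.

```python
from typing import Dict, List

SUPPORTED_PROTOCOLS = {"SASL_PLAINTEXT", "SASL_SSL", "SSL"}
SUPPORTED_SASL_MECHANISMS = {"PLAIN", "SCRAM-SHA-256", "SCRAM-SHA-512", "OAUTHBEARER"}


def validate_kafka_security(config: Dict[str, str]) -> List[str]:
    """Return a list of problems; an empty list means the settings look supported.

    The dictionary keys are hypothetical names used only in this sketch.
    """
    problems: List[str] = []
    protocol = config.get("protocol")
    if protocol not in SUPPORTED_PROTOCOLS:
        problems.append(f"unsupported security protocol: {protocol!r}")
        return problems

    if protocol.startswith("SASL"):
        # Assumption: the mechanism list covers SASL_PLAINTEXT and SASL_SSL alike.
        mechanism = config.get("mechanism")
        if mechanism not in SUPPORTED_SASL_MECHANISMS:
            problems.append(f"unsupported SASL mechanism: {mechanism!r}")
    else:
        # protocol is "SSL": the client certificate and keyfile are required.
        for key in ("client_certificate", "client_key"):
            if not config.get(key):
                problems.append(f"missing required SSL asset: {key}")
    return problems


print(validate_kafka_security({"protocol": "SASL_SSL", "mechanism": "SCRAM-SHA-512"}))
print(validate_kafka_security({"protocol": "SSL"}))
```

Returning a list of problems rather than raising on the first one lets a caller report every missing SSL asset at once.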
Atlas Stream Processing doesn't support $function JavaScript UDFs.
Atlas Stream Processing supports a subset of the Aggregation Pipeline Stages available in Atlas, allowing you to perform many of the same operations on streaming data that you can perform on data-at-rest. For a full list of supported Aggregation Pipeline Stages, see the Stream Aggregation documentation.
Atlas Stream Processing doesn't support the aggregation variables $$NOW, $$CLUSTER_TIME, $$USER_ROLES, and $$SEARCH_META.
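A pipeline written for data at rest may use the unsupported $function operator or one of the unsupported aggregation variables listed above. The following is a minimal sketch of a pre-flight check, assuming the pipeline is available as plain Python dictionaries and lists. The function name is hypothetical, the variables are written with the $$ prefix that aggregation system variables use, and the check covers only the items named in this section, not the full list of unsupported stages.

```python
from typing import Any, Set

UNSUPPORTED_VARIABLES = {"$$NOW", "$$CLUSTER_TIME", "$$USER_ROLES", "$$SEARCH_META"}
UNSUPPORTED_OPERATORS = {"$function"}


def find_unsupported(node: Any) -> Set[str]:
    """Recursively collect unsupported operators and variables in a pipeline."""
    found: Set[str] = set()
    if isinstance(node, dict):
        for key, value in node.items():
            if key in UNSUPPORTED_OPERATORS:
                found.add(key)
            found |= find_unsupported(value)
    elif isinstance(node, list):
        for item in node:
            found |= find_unsupported(item)
    elif isinstance(node, str):
        # Match "$$NOW" itself and paths into a variable such as "$$USER_ROLES.role".
        root = node.split(".", 1)[0]
        if root in UNSUPPORTED_VARIABLES:
            found.add(root)
    return found


# Illustrative pipeline; the connection and field names are examples only.
pipeline = [
    {"$source": {"connectionName": "sample_stream_solar"}},
    {"$addFields": {"processedAt": "$$NOW"}},
    {"$match": {"$expr": {"$function": {
        "body": "function(x) { return x > 0; }",
        "args": ["$watts"],
        "lang": "js",
    }}}},
]
print(sorted(find_unsupported(pipeline)))
```

An empty result does not guarantee that a pipeline is valid for Atlas Stream Processing. It only rules out the specific constructs checked here.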