Allow users to pub/sub to existing streams using a pub/sub operator

Within Streams, Publish/Subscribe is a simplification and enhancement of Import/Export that is designed to work across all programming models.
Thus (for example) an SPL application can publish a stream and a Python or Java application can subscribe to that stream to process the tuples.
A published stream has a topic, e.g. streamsx/transportation/nextbus/sfmuni/locations and a type which is one of:
    - Structured (SPL) schema, e.g. tuple<rstring busid, float64 latitude, float64 longitude, float64 speed>
    - String - each tuple is a string
    - JSON - each tuple is a JSON object
    - Binary - each tuple is binary data
    - XML - each tuple is an XML document
    - Java object (Java/Scala apps only)
    - Python object (Python apps only)
An application subscribes to published streams using a topic filter and a type, so it could be:
    streamsx/transportation/nextbus/sfmuni/locations  to select the above topic, or:
    streamsx/transportation/nextbus/+/locations  to select any published stream from the different bus agencies, e.g. matching streamsx/transportation/nextbus/sfmuni/locations
(topics are based upon the MQTT topic scheme).
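Since topics follow the MQTT scheme, matching a subscription filter against a concrete topic is straightforward. A minimal sketch in plain Python (illustrative only, not the Streams implementation; the function name is made up):

```python
def topic_matches(filter_str, topic):
    """Check an MQTT-style topic filter against a concrete topic.

    '+' matches exactly one level; '#' matches all remaining levels.
    """
    f_levels = filter_str.split("/")
    t_levels = topic.split("/")
    for i, f in enumerate(f_levels):
        if f == "#":
            return True          # '#' swallows the rest of the topic
        if i >= len(t_levels):
            return False         # filter is longer than the topic
        if f != "+" and f != t_levels[i]:
            return False         # a literal level must match exactly
    return len(f_levels) == len(t_levels)

# The '+' wildcard selects the locations stream of any bus agency:
print(topic_matches("streamsx/transportation/nextbus/+/locations",
                    "streamsx/transportation/nextbus/sfmuni/locations"))  # True
```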
Adding Publish/Subscribe to Streams Designer would provide seamless integration with existing SPL applications and let Streams Designer fill the role of ad-hoc analytics integrating with "mission-critical" applications (the USG pattern described by Brian Williams in Austin).
From an implementation point of view:
    - Publish/Subscribe is simply invoking SPL operators
    - Available published streams can be determined from the Streams REST API - we have Python code that does this (for use in DSX), so a user can see the available streams.
Thus I imagine that the editor for subscribe could show a list of available topics (similar to MessageHub), maybe presenting them as "live feeds"? The schema type is predetermined by the published stream. It could also suggest wildcard topics, based upon the available streams.
For publish, the editor would allow the topic to be entered along with the published type, e.g. the current schema, or conversion to JSON.
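For the JSON case, publishing a tuple of the structured schema shown above amounts to serializing each tuple as a JSON object. A sketch using only the standard library (the field values are invented for illustration):

```python
import json

# One tuple of the schema tuple<rstring busid, float64 latitude,
# float64 longitude, float64 speed>, represented as a Python dict.
location = {"busid": "5104", "latitude": 37.7793,
            "longitude": -122.4193, "speed": 11.5}

# Publishing with type JSON would send each tuple as a JSON object:
payload = json.dumps(location, sort_keys=True)
print(payload)
```

A JSON-type subscriber (in any language) would then see one JSON object per tuple, without needing the SPL schema.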
  • Jacob Shelley
  • Dec 14 2018
  • Delivered
  • Jacob Shelley commented
    14 Dec, 2018 05:59pm

    Linked with STREAMPIP-94.

  • Daniel Debrunner commented
    14 Dec, 2018 05:59pm

    and one more:

    • No preconfiguration needed for topics (MH requires the topic be defined).
  • Dakshi Agrawal commented
    14 Dec, 2018 05:59pm

    On the downside - when Streams applications are connected using pub/sub operators, there are no guarantees of exactly-once or at-least-once processing; it falls back to at-most-once processing, which is unacceptable in most cases.

  • Daniel Debrunner commented
    14 Dec, 2018 05:59pm

    Some differences to message hub:

    • It's built in to Streams - no need for a separate service
    • Direct TCP connection between applications with automatic reconnect
    • The schema of the stream is maintained and known, rather than MH being "just bytes"
    • Wildcards are supported for subscribing to multiple topics
    • It is "lossy" in that a connection failure will lead to dropped tuples. There is no synchronization between publishers and subscribers; a subscriber gets tuples from the moment it connects, and any tuples published earlier are not seen.
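    The "no replay" behaviour can be illustrated with a toy in-memory broker (illustrative only, not how Streams implements pub/sub): a subscriber only receives tuples published after it connects.

```python
class ToyBroker:
    """In-memory pub/sub with Streams-like delivery semantics:
    no buffering and no replay - tuples are delivered only to
    subscribers connected at the time of publishing."""

    def __init__(self):
        self.subscribers = []

    def subscribe(self):
        inbox = []
        self.subscribers.append(inbox)
        return inbox

    def publish(self, tup):
        # Tuples published while no subscriber is connected are dropped.
        for inbox in self.subscribers:
            inbox.append(tup)

broker = ToyBroker()
broker.publish("t1")      # dropped: nobody is connected yet
late = broker.subscribe()
broker.publish("t2")
print(late)               # ['t2'] - t1 was never seen
```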
  • Jacob Shelley commented
    14 Dec, 2018 05:59pm

    Looking into this. I'd like to figure out how this compares to our ability to connect streams via MessageHub. I know this offers a lot in terms of efficiency. Reaching out to some folks now to see how this would compare to MH if they were trying to set up a connected pipeline with SD.

  • +1