Qbix Streams as a Graph Database

Qbix Streams: A Searchable, Distributed Graph Database Hiding in Plain Sight

When most people think of graph databases, they picture Neo4j and its Cypher queries; for flexible document storage and lookup, they reach for Firebase/Firestore. Graph databases give you nodes, edges, and the ability to traverse relationships.

But what if you could get all of that — plus history, access control, federation, and SQL-speed search — without leaving the comfort of relational databases?

That’s exactly what the Qbix Streams architecture provides. And it turns out, it’s not just “another storage layer” — it’s a searchable, distributed graph database, deeply integrated with how communities, apps, and services actually evolve.


1. Streams as Nodes

Every unit of information in the system is a stream, identified globally by the pair (publisherId, streamName). Example stream names:

  • Users/user/123
  • Assets/product/456
  • Places/user/location/95.90.209.105

Unlike a raw document or row in SQL, each stream:

  • Has type, attributes, messages, and history
  • Supports forking and merging, for moderation or customization
  • Is governed by access control lists (ACLs)
  • Can be hosted anywhere, but remains globally addressable

In graph terms: every stream is a node.
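
As a rough illustration of the node model, here is a minimal Python sketch. The class and field names (StreamKey, Stream, publisher_id, and so on) and the publisher "communityA" are assumptions made for this example, not the actual Qbix schema; the only detail taken from the text is that a stream is addressed by the pair (publisherId, streamName) and carries a type, attributes, and a message history.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass(frozen=True)
class StreamKey:
    """Global identity of a stream: the (publisherId, streamName) pair."""
    publisher_id: str
    stream_name: str


@dataclass
class Stream:
    """Illustrative node record; field names are assumptions, not the Qbix schema."""
    key: StreamKey
    type: str
    attributes: dict[str, Any] = field(default_factory=dict)
    messages: list[dict[str, Any]] = field(default_factory=list)  # history log


# Nodes are looked up by their composite key rather than a single numeric id.
graph: dict[StreamKey, Stream] = {}

# "communityA" is a hypothetical publisher used only for this example.
product = Stream(StreamKey("communityA", "Assets/product/456"), type="Assets/product")
product.attributes["Assets/color"] = "Black"
graph[product.key] = product
```

Making the key a frozen dataclass keeps it hashable, so the same pair always resolves to the same node no matter where the stream is hosted.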


2. Relations as Edges

Streams don’t live in isolation. Their attributes automatically generate relations, which are directed edges of the graph, stored in normalized SQL tables:

  • streams_related_from (outgoing edges)
  • streams_related_to (incoming edges, pre-indexed for search, with weights)

Whenever a stream changes — say a user updates their location or a product gets tagged — the system runs syncRelations(), which recomputes edges. That means:

  • Edges are always consistent with node state
  • You never worry about stale indexes
  • Every relation is automatically queryable

Example:

  • Attribute: Places/geohash = u33dc
  • Relation: attribute/Places/geohash=u33dc
  • Target stream: Streams/search/all

That edge says: “this stream belongs to the search bucket, under this geohash label.”

In graph theory terms: relations are labeled directed edges.
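
The internals of syncRelations() are not shown here, so the following is only a sketch of one way such a recompute could work: derive the set of edge labels implied by the current attributes, then diff it against what is stored. The function names relation_types and sync_relations are hypothetical.

```python
def relation_types(attributes):
    """Turn an attribute dict into the set of edge labels it implies."""
    return {f"attribute/{name}={value}" for name, value in attributes.items()}


def sync_relations(existing, attributes):
    """Return (to_add, to_remove) so stored edges match the node's current attributes."""
    desired = relation_types(attributes)
    return desired - existing, existing - desired


# The stream moved from geohash u33dc to u33db: one edge is added, one removed.
existing = {"attribute/Places/geohash=u33dc"}
to_add, to_remove = sync_relations(existing, {"Places/geohash": "u33db"})
```

Diffing against the stored set means an unchanged attribute produces no writes, which is one plausible way to keep edges consistent with node state without rebuilding them on every update.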


3. SQL as the Query Engine

Most graph databases invent a new query language (Cypher, Gremlin). Streams doesn’t have to. Relations are stored in plain SQL tables with strong indexes. That means you can query them directly with SELECT, GROUP BY, and HAVING.

For example, here’s a search for all streams that are both in a specific geohash and in a credits category:

SELECT fromPublisherId, fromStreamName
FROM streams_related_to
WHERE toPublisherId = 'Streams'
  AND toStreamName = 'Streams/search/all'
  AND type IN ('attribute/Places/geohash=u33dc',
               'attribute/Assets/category=credits')
GROUP BY fromPublisherId, fromStreamName
HAVING COUNT(*) = 2;

That’s it. You just found all streams that match both conditions.
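
To try the query end to end, the sketch below runs it against an in-memory SQLite database from Python. SQLite stands in for MySQL or MariaDB here, the table is reduced to the five columns the query touches (the real table has more), and the publisher "communityA" and the product stream names are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified stand-in for the real table: only the columns the query needs.
conn.execute("""
    CREATE TABLE streams_related_to (
        toPublisherId TEXT, toStreamName TEXT, type TEXT,
        fromPublisherId TEXT, fromStreamName TEXT,
        PRIMARY KEY (toPublisherId, toStreamName, type,
                     fromPublisherId, fromStreamName)
    )
""")
rows = [
    ("Streams", "Streams/search/all", "attribute/Places/geohash=u33dc",
     "communityA", "Assets/product/1"),
    ("Streams", "Streams/search/all", "attribute/Assets/category=credits",
     "communityA", "Assets/product/1"),
    ("Streams", "Streams/search/all", "attribute/Places/geohash=u33dc",
     "communityA", "Assets/product/2"),  # matches only one of the two labels
]
conn.executemany("INSERT INTO streams_related_to VALUES (?, ?, ?, ?, ?)", rows)

matches = conn.execute("""
    SELECT fromPublisherId, fromStreamName
    FROM streams_related_to
    WHERE toPublisherId = 'Streams'
      AND toStreamName = 'Streams/search/all'
      AND type IN ('attribute/Places/geohash=u33dc',
                   'attribute/Assets/category=credits')
    GROUP BY fromPublisherId, fromStreamName
    HAVING COUNT(*) = 2
""").fetchall()
```

Only Assets/product/1 carries both labels, so it is the only row returned. The primary key in this sketch guarantees one row per (target, label, source); if duplicates were possible, COUNT(DISTINCT type) would be the safer test.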


Faceted Search (Amazon-Style)

The same pattern powers e-commerce product filters. Every attribute (brand, color, price range, etc.) is stored as a relation. To get facet counts for all attributes at once:

SELECT type, COUNT(*) AS matches
FROM streams_related_to
WHERE toPublisherId = 'Streams'
  AND toStreamName = 'Streams/search/all'
  AND type LIKE 'attribute/Assets/%'
GROUP BY type;

Results might look like:

  • attribute/Assets/brand=Sony → 152 products
  • attribute/Assets/color=Black → 879 products
  • attribute/Assets/priceRange=200-500 → 65 products
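
The facet query can be exercised the same way. This is a sketch on an in-memory SQLite database standing in for MySQL or MariaDB, with a reduced table and invented product streams; the helper relate() is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE streams_related_to ("
    "toPublisherId TEXT, toStreamName TEXT, type TEXT, "
    "fromPublisherId TEXT, fromStreamName TEXT)"
)


def relate(stream_name, label):
    """Insert one edge from a hypothetical product stream into the search bucket."""
    conn.execute(
        "INSERT INTO streams_related_to VALUES (?, ?, ?, ?, ?)",
        ("Streams", "Streams/search/all", label, "communityA", stream_name),
    )


relate("Assets/product/1", "attribute/Assets/brand=Sony")
relate("Assets/product/1", "attribute/Assets/color=Black")
relate("Assets/product/2", "attribute/Assets/brand=Sony")
relate("Assets/product/3", "attribute/Assets/color=Black")
relate("Assets/product/3", "attribute/Places/geohash=u33dc")  # not an Assets facet

# One pass over the index yields a count per facet value.
facets = dict(conn.execute("""
    SELECT type, COUNT(*) AS matches
    FROM streams_related_to
    WHERE toPublisherId = 'Streams'
      AND toStreamName = 'Streams/search/all'
      AND type LIKE 'attribute/Assets/%'
    GROUP BY type
""").fetchall())
```

The geohash edge is excluded by the LIKE prefix, so facets holds only the two Assets labels, each with a count of 2.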

When a user selects a filter (“Sony”), you simply add it to the IN (…) clause:

... AND type IN ('attribute/Assets/brand=Sony')
HAVING COUNT(*) = 1;

Selecting multiple filters (“Sony + Black”):

... AND type IN ('attribute/Assets/brand=Sony',
                 'attribute/Assets/color=Black')
HAVING COUNT(*) = 2;

The HAVING COUNT(*) = 2 ensures that only products matching both criteria are returned.
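
In application code, the IN list and the HAVING count would normally be generated from whatever filters the user has selected. The sketch below shows one way to do that with bound parameters; build_filter_query is a hypothetical helper, and SQLite again stands in for MySQL or MariaDB with a reduced table and invented data.

```python
import sqlite3


def build_filter_query(filters):
    """Return (sql, params) matching streams that carry every label in `filters`."""
    if not filters:
        raise ValueError("at least one filter is required")
    unique = sorted(set(filters))  # a repeated filter must not inflate the count
    placeholders = ", ".join("?" for _ in unique)
    sql = (
        "SELECT fromPublisherId, fromStreamName "
        "FROM streams_related_to "
        "WHERE toPublisherId = ? AND toStreamName = ? "
        f"AND type IN ({placeholders}) "
        "GROUP BY fromPublisherId, fromStreamName "
        "HAVING COUNT(DISTINCT type) = ?"
    )
    params = ["Streams", "Streams/search/all", *unique, len(unique)]
    return sql, params


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE streams_related_to ("
    "toPublisherId TEXT, toStreamName TEXT, type TEXT, "
    "fromPublisherId TEXT, fromStreamName TEXT)"
)
edges = [
    ("attribute/Assets/brand=Sony", "Assets/product/1"),
    ("attribute/Assets/color=Black", "Assets/product/1"),
    ("attribute/Assets/brand=Sony", "Assets/product/2"),
]
conn.executemany(
    "INSERT INTO streams_related_to "
    "VALUES ('Streams', 'Streams/search/all', ?, 'communityA', ?)",
    edges,
)

sql, params = build_filter_query(
    ["attribute/Assets/brand=Sony", "attribute/Assets/color=Black"]
)
both = conn.execute(sql, params).fetchall()
```

Binding the labels as parameters avoids building SQL out of user input, and the count is tied to the number of distinct filters so the two can never drift apart.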


Sorting by Relevance

Sometimes you don’t just want matches — you want the best matches ranked by how many filters they satisfy. That’s one SQL tweak away:

SELECT fromPublisherId, fromStreamName, COUNT(*) as relevance
FROM streams_related_to
WHERE toPublisherId = 'Streams'
  AND toStreamName = 'Streams/search/all'
  AND type IN ('attribute/Assets/brand=Sony',
               'attribute/Assets/color=Black',
               'attribute/Assets/priceRange=200-500')
GROUP BY fromPublisherId, fromStreamName
ORDER BY relevance DESC;
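
Here is a small runnable sketch of relevance ranking, using SQLite in place of MySQL or MariaDB. The table is reduced to the columns used, the three product streams are invented, and a secondary sort on stream name is added only to make the order of ties deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE streams_related_to ("
    "toPublisherId TEXT, toStreamName TEXT, type TEXT, "
    "fromPublisherId TEXT, fromStreamName TEXT)"
)
# Hypothetical products matching three, two, and one of the selected filters.
edges = {
    "Assets/product/1": ["attribute/Assets/brand=Sony",
                         "attribute/Assets/color=Black",
                         "attribute/Assets/priceRange=200-500"],
    "Assets/product/2": ["attribute/Assets/brand=Sony",
                         "attribute/Assets/color=Black"],
    "Assets/product/3": ["attribute/Assets/color=Black"],
}
for name, labels in edges.items():
    for label in labels:
        conn.execute(
            "INSERT INTO streams_related_to "
            "VALUES ('Streams', 'Streams/search/all', ?, 'communityA', ?)",
            (label, name),
        )

# No HAVING clause: partial matches stay in, ranked by how many filters they hit.
ranked = conn.execute("""
    SELECT fromStreamName, COUNT(*) AS relevance
    FROM streams_related_to
    WHERE toPublisherId = 'Streams'
      AND toStreamName = 'Streams/search/all'
      AND type IN ('attribute/Assets/brand=Sony',
                   'attribute/Assets/color=Black',
                   'attribute/Assets/priceRange=200-500')
    GROUP BY fromPublisherId, fromStreamName
    ORDER BY relevance DESC, fromStreamName
""").fetchall()
```

The product matching all three filters comes first, followed by the partial matches in descending order of relevance.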

Multi-Valued Attributes

Some attributes are arrays — for example, a product can have multiple colors (["Black","Red"]) or a person can speak multiple languages (["English","Spanish"]). Each value is stored as a separate relation row:

  • attribute/Assets/color=Black
  • attribute/Assets/color=Red

This means queries work the same way: if you filter by color=Black, all products with Black in their array of colors will be included.
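
The expansion of an array attribute into one relation per value can be sketched in a few lines of Python. The helper expand_attribute is hypothetical; it only illustrates the row-per-value idea described above.

```python
def expand_attribute(name, value):
    """Return one relation label per value; a scalar is treated as a single value."""
    values = value if isinstance(value, (list, tuple)) else [value]
    return [f"attribute/{name}={v}" for v in values]


colors = expand_attribute("Assets/color", ["Black", "Red"])
brand = expand_attribute("Assets/brand", "Sony")
```

Because scalars and arrays end up as the same kind of row, the search queries never need to know which attributes are multi-valued.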


People Search (Dating-App Style)

In a dating app, user profiles publish attributes like:

  • attribute/Profile/height=5'10"
  • attribute/Profile/ageRange=25-30
  • attribute/Profile/religion=Jewish
  • attribute/Profile/languages=Spanish

A search for “Spanish-speaking, age 25-30” is the same as before:

... AND type IN ('attribute/Profile/languages=Spanish',
                 'attribute/Profile/ageRange=25-30')
HAVING COUNT(*) = 2;

Every filter is just another relation type. Adding or removing criteria is as simple as adjusting the IN (…) list, while HAVING COUNT(*) = n guarantees the match covers all selected filters.


Because relations are flattened into rows and indexed, you get:

  • Fast lookups (comparable to document stores)
  • Grouped aggregations (COUNT and SUM with GROUP BY and HAVING), which Firebase/Firestore queries don’t offer natively
  • Support for scalar and array attributes without extra schema work
  • No custom runtime — just MySQL or MariaDB

In short: graph search at SQL speed, whether you’re finding people by height and age, or products by brand and color.


4. Distribution by Design

Unlike Neo4j or Firebase, which are usually run as one centrally managed database, Streams was built for federation.

  • Every stream belongs to a publisherId (community, app, or user).
  • Publishers can host their streams anywhere.
  • Relations between publishers work seamlessly because the schema itself is global.

That makes Streams a multi-tenant graph database, where each community can govern its own data — but search still works across the network.


5. Beyond Graph: Streams Bring History & ACLs

Here’s where Streams goes beyond typical graph databases:

  • History: every change to a stream is logged as a message, so you can replay or audit evolution.
  • Forks: communities can fork streams (e.g., moderate user content) while preserving lineage.
  • ACLs: streams enforce read/write/admin levels out of the box.

Try getting all three in Neo4j or Firebase: you’d have to bolt per-node history and forking on yourself.
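
To make the history point concrete, here is a sketch of replaying a message log to rebuild a stream's attributes at any point in time. The message shape (a dict with a "changes" mapping, where None marks a removed attribute) is an assumption made for this example and not the actual Streams message format.

```python
def replay(messages, upto=None):
    """Rebuild attribute state by applying logged changes in order.

    `upto` limits the replay to the first N messages (None replays everything).
    """
    state = {}
    for ordinal, message in enumerate(messages, start=1):
        if upto is not None and ordinal > upto:
            break
        for name, value in message["changes"].items():
            if value is None:
                state.pop(name, None)  # None marks a removed attribute
            else:
                state[name] = value
    return state


# Hypothetical log for a product stream.
log = [
    {"changes": {"Assets/color": "Black"}},
    {"changes": {"Assets/brand": "Sony"}},
    {"changes": {"Assets/color": "Red"}},
]
current = replay(log)
as_of_second_message = replay(log, upto=2)
```

Replaying the full log gives the current state, while stopping early gives the state as of an earlier message, which is the basis for auditing how a stream evolved.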


6. Example Queries That Just Work

  • Find people near me:
    Attribute relations: attribute/Places/geohash=u33d* → search by prefix (in SQL, type LIKE 'attribute/Places/geohash=u33d%')

  • Find products and services in the same area:
    Both Assets/product/* and Assets/service/* streams relate to Streams/search/all by geohash

  • Find users with >100 credits who wrote an article:
    Two relations: attribute/Assets/credits>100, attribute/Assets/articleAuthor=true

  • Find dogs and people weighing 100–200 lbs:
    attribute/Animals/weight and attribute/Users/weight are just different edge labels in the same search bucket.

No joins across 20 different tables — just SQL on one relation index.
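
The "people near me" case can be sketched as a prefix match on the relation label. As before, SQLite stands in for MySQL or MariaDB, the table is reduced to the columns used, and the user streams, the publisher "communityA", and the helper streams_near are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE streams_related_to ("
    "toPublisherId TEXT, toStreamName TEXT, type TEXT, "
    "fromPublisherId TEXT, fromStreamName TEXT)"
)
# Hypothetical users and the geohash cell each one is currently in.
users = {"Users/user/1": "u33dc", "Users/user/2": "u33db", "Users/user/3": "u34aa"}
for name, geohash in users.items():
    conn.execute(
        "INSERT INTO streams_related_to "
        "VALUES ('Streams', 'Streams/search/all', ?, 'communityA', ?)",
        (f"attribute/Places/geohash={geohash}", name),
    )


def streams_near(geohash_prefix):
    """Return stream names whose geohash label starts with the given prefix."""
    pattern = f"attribute/Places/geohash={geohash_prefix}%"
    rows = conn.execute(
        "SELECT fromStreamName FROM streams_related_to "
        "WHERE toPublisherId = 'Streams' AND toStreamName = 'Streams/search/all' "
        "AND type LIKE ? ORDER BY fromStreamName",
        (pattern,),
    )
    return [name for (name,) in rows]


nearby = streams_near("u33d")
```

A shorter prefix covers a larger area, so widening the search is just a matter of dropping trailing characters. Geohash characters are plain alphanumerics, so the prefix needs no escaping for the LIKE wildcards % and _.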


7. Comparison to Other Systems

Feature          | Streams                      | Neo4j           | Firebase/Firestore
-----------------+------------------------------+-----------------+---------------------
Nodes/edges      | Streams/relations            | Yes             | No (documents only)
Query language   | SQL (portable)               | Cypher (custom) | Proprietary API
Federation       | Built-in (publisherId)       | No (single DB)  | No
History & forks  | Yes                          | No              | No
ACLs             | Yes                          | No              | Limited
Aggregations     | Native (COUNT, SUM, HAVING)  | Complex         | Weak
Hosting model    | Federated, self-hostable     | Centralized     | Cloud-only

Streams is not just another graph DB — it’s a graph + event log + ACL system, built on relational DBs, and distributed by design.


Conclusion

What started as a way to organize user data has evolved into something bigger: a graph database you didn’t know you had.

By treating streams as nodes and relations as edges, with automatic updates whenever streams change, the system creates a searchable, distributed graph on top of SQL. And because every stream also has history, forking, and ACLs, this isn’t just graph storage — it’s the backbone for decentralized, trustworthy apps.

If Neo4j is a graph DB, and Firebase is a document store, then Streams is a graph database for communities — one that was hiding in plain sight all along.