Qbix Streams: A Searchable, Distributed Graph Database Hiding in Plain Sight
When most people think of graph databases, they imagine Neo4j with its Cypher queries, or Firebase/Firestore for document search. These tools give you nodes, edges, and the ability to traverse relationships.
But what if you could get all of that — plus history, access control, federation, and SQL-speed search — without leaving the comfort of relational databases?
That’s exactly what the Qbix Streams architecture provides. And it turns out, it’s not just “another storage layer” — it’s a searchable, distributed graph database, deeply integrated with how communities, apps, and services actually evolve.
1. Streams as Nodes
Every unit of information in the system is a stream, identified globally by (publisherId, streamName). Examples:
Users/user/123
Assets/product/456
Places/user/location/95.90.209.105
Unlike a raw document or row in SQL, each stream:
- Has type, attributes, messages, and history
- Supports forking and merging, for moderation or customization
- Is governed by access control lists (ACLs)
- Can be hosted anywhere, but remains globally addressable
In graph terms: every stream is a node.
2. Relations as Edges
Streams don’t live in isolation. Their attributes automatically generate relations, which are directed edges of the graph, stored in normalized SQL tables:
- streams_related_from (outgoing edges)
- streams_related_to (incoming edges, pre-indexed for search, with weights)
Whenever a stream changes (say, a user updates their location or a product gets tagged), the system runs syncRelations(), which recomputes edges. That means:
- Edges are always consistent with node state
- You never worry about stale indexes
- Every relation is automatically queryable
Example:
- Attribute: Places/geohash = u33dc
- Relation: attribute/Places/geohash=u33dc
- Target stream: Streams/search/all
That edge says: “this stream belongs to the search bucket, under this geohash label.”
In graph theory terms: relations are labeled directed edges.
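The recompute step described above can be sketched as a diff between the edges a stream's attributes imply and the edges already stored. This is a minimal illustration, not the actual syncRelations() implementation: the function name sync_relations, the simplified table schema, the publisher community1, and the use of SQLite in place of MySQL/MariaDB are all assumptions made for the example.

```python
import sqlite3

# Simplified, assumed schema: only the columns this article's queries use.
SCHEMA = """
CREATE TABLE streams_related_to (
    toPublisherId TEXT, toStreamName TEXT, type TEXT,
    fromPublisherId TEXT, fromStreamName TEXT, weight REAL DEFAULT 1,
    PRIMARY KEY (toPublisherId, toStreamName, type, fromPublisherId, fromStreamName)
)
"""
BUCKET = ("Streams", "Streams/search/all")  # target stream from the example


def sync_relations(db, publisher_id, stream_name, attributes):
    """Hypothetical stand-in for syncRelations(): make stored edges match attributes."""
    wanted = {f"attribute/{name}={value}" for name, value in attributes.items()}
    stored = {
        row[0]
        for row in db.execute(
            "SELECT type FROM streams_related_to WHERE toPublisherId = ? "
            "AND toStreamName = ? AND fromPublisherId = ? AND fromStreamName = ?",
            (*BUCKET, publisher_id, stream_name),
        )
    }
    for label in stored - wanted:  # edges whose attribute value no longer holds
        db.execute(
            "DELETE FROM streams_related_to WHERE toPublisherId = ? "
            "AND toStreamName = ? AND type = ? "
            "AND fromPublisherId = ? AND fromStreamName = ?",
            (*BUCKET, label, publisher_id, stream_name),
        )
    for label in wanted - stored:  # edges for newly set attribute values
        db.execute(
            "INSERT INTO streams_related_to (toPublisherId, toStreamName, type, "
            "fromPublisherId, fromStreamName) VALUES (?, ?, ?, ?, ?)",
            (*BUCKET, label, publisher_id, stream_name),
        )
    db.commit()


db = sqlite3.connect(":memory:")
db.execute(SCHEMA)
sync_relations(db, "community1", "Users/user/123", {"Places/geohash": "u33dc"})
sync_relations(db, "community1", "Users/user/123", {"Places/geohash": "u33dd"})  # user moved
labels = [row[0] for row in db.execute("SELECT type FROM streams_related_to")]
print(labels)
```

After the second call only the edge for the new geohash remains, which is the sense in which edges stay consistent with node state.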
3. SQL as the Query Engine
Most graph databases invent a new query language (Cypher, Gremlin). Streams doesn’t have to. Relations are stored in plain SQL tables with strong indexes. That means you can query them directly with SELECT, GROUP BY, and HAVING.
For example, here’s a search for all streams that are both in a specific geohash and in a credits category:
SELECT fromPublisherId, fromStreamName
FROM streams_related_to
WHERE toPublisherId = 'Streams'
AND toStreamName = 'Streams/search/all'
AND type IN ('attribute/Places/geohash=u33dc',
'attribute/Assets/category=credits')
GROUP BY fromPublisherId, fromStreamName
HAVING COUNT(*) = 2;
That’s it. You just found all streams that match both conditions.
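To see the query run end to end, here is a small sketch that loads a few relation rows into an in-memory SQLite database and executes the same statement. SQLite stands in for MySQL/MariaDB, the table holds only the columns the query uses, and the publisher community1 and the extra product streams are made up for illustration.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE streams_related_to (toPublisherId TEXT, toStreamName TEXT, "
    "type TEXT, fromPublisherId TEXT, fromStreamName TEXT)"
)
# (stream name, relation label) pairs; only product/456 carries both labels.
rows = [
    ("Assets/product/456", "attribute/Places/geohash=u33dc"),
    ("Assets/product/456", "attribute/Assets/category=credits"),
    ("Assets/product/789", "attribute/Places/geohash=u33dc"),
    ("Assets/product/790", "attribute/Assets/category=credits"),
]
db.executemany(
    "INSERT INTO streams_related_to "
    "VALUES ('Streams', 'Streams/search/all', ?, 'community1', ?)",
    [(label, stream_name) for stream_name, label in rows],
)
matches = db.execute("""
    SELECT fromPublisherId, fromStreamName
    FROM streams_related_to
    WHERE toPublisherId = 'Streams'
      AND toStreamName = 'Streams/search/all'
      AND type IN ('attribute/Places/geohash=u33dc',
                   'attribute/Assets/category=credits')
    GROUP BY fromPublisherId, fromStreamName
    HAVING COUNT(*) = 2
""").fetchall()
print(matches)
```

The COUNT(*) = 2 test relies on each stream having at most one row per relation label. If duplicate rows were possible, COUNT(DISTINCT type) = 2 would be the safer condition.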
Faceted Search (Amazon-Style)
The same pattern powers e-commerce product filters. Every attribute (brand, color, price range, etc.) is stored as a relation. To get facet counts for all attributes at once:
SELECT type, COUNT(*) AS matches
FROM streams_related_to
WHERE toPublisherId = 'Streams'
AND toStreamName = 'Streams/search/all'
AND type LIKE 'attribute/Assets/%'
GROUP BY type;
Results might look like:
- attribute/Assets/brand=Sony → 152 products
- attribute/Assets/color=Black → 879 products
- attribute/Assets/priceRange=200-500 → 65 products
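The facet query can be tried on a toy catalog. The sketch below assumes a simplified table, a hypothetical publisher community1, and three made-up products, with SQLite standing in for MySQL/MariaDB; the counts it produces are for this toy data only.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE streams_related_to (toPublisherId TEXT, toStreamName TEXT, "
    "type TEXT, fromPublisherId TEXT, fromStreamName TEXT)"
)
# Made-up catalog: each product stream and the facet values it carries.
products = {
    "Assets/product/1": ["brand=Sony", "color=Black"],
    "Assets/product/2": ["brand=Sony", "color=Red"],
    "Assets/product/3": ["brand=Acme", "color=Black"],
}
for stream_name, facets in products.items():
    for facet in facets:
        db.execute(
            "INSERT INTO streams_related_to "
            "VALUES ('Streams', 'Streams/search/all', ?, 'community1', ?)",
            (f"attribute/Assets/{facet}", stream_name),
        )
# One pass over the relation index yields a count for every facet value.
facet_counts = dict(db.execute("""
    SELECT type, COUNT(*) AS matches
    FROM streams_related_to
    WHERE toPublisherId = 'Streams'
      AND toStreamName = 'Streams/search/all'
      AND type LIKE 'attribute/Assets/%'
    GROUP BY type
"""))
print(facet_counts)
```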
When a user selects a filter (“Sony”), you simply add it to the IN (…) clause:
... AND type IN ('attribute/Assets/brand=Sony')
HAVING COUNT(*) = 1;
Selecting multiple filters (“Sony + Black”):
... AND type IN ('attribute/Assets/brand=Sony',
'attribute/Assets/color=Black')
HAVING COUNT(*) = 2;
The HAVING COUNT(*) = 2 ensures that only products matching both criteria are returned.
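Because the only things that change between these queries are the IN list and the number in HAVING, the query can be built from whatever filters the user has selected. The helper below is one possible sketch, not part of Qbix: the function name search, the simplified table, and the sample data are assumptions, and SQLite stands in for MySQL/MariaDB. It uses bound parameters rather than string concatenation, and COUNT(DISTINCT type) so that a duplicated row cannot fake a match.

```python
import sqlite3


def search(db, filters):
    """Return streams that carry every relation label in `filters` (AND semantics)."""
    labels = sorted(set(filters))  # a repeated filter must not inflate n
    if not labels:
        return []
    placeholders = ", ".join(["?"] * len(labels))
    sql = f"""
        SELECT fromPublisherId, fromStreamName
        FROM streams_related_to
        WHERE toPublisherId = 'Streams'
          AND toStreamName = 'Streams/search/all'
          AND type IN ({placeholders})
        GROUP BY fromPublisherId, fromStreamName
        HAVING COUNT(DISTINCT type) = ?
        ORDER BY fromStreamName
    """
    return db.execute(sql, (*labels, len(labels))).fetchall()


db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE streams_related_to (toPublisherId TEXT, toStreamName TEXT, "
    "type TEXT, fromPublisherId TEXT, fromStreamName TEXT)"
)
SONY = "attribute/Assets/brand=Sony"
BLACK = "attribute/Assets/color=Black"
RED = "attribute/Assets/color=Red"
for stream_name, label in [
    ("Assets/product/1", SONY), ("Assets/product/1", BLACK),
    ("Assets/product/2", SONY), ("Assets/product/2", RED),
    ("Assets/product/3", BLACK),
]:
    db.execute(
        "INSERT INTO streams_related_to "
        "VALUES ('Streams', 'Streams/search/all', ?, 'community1', ?)",
        (label, stream_name),
    )

sony = search(db, [SONY])
sony_black = search(db, [SONY, BLACK])
print(sony, sony_black)
```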
Sorting by Relevance
Sometimes you don’t just want matches — you want the best matches ranked by how many filters they satisfy. That’s one SQL tweak away:
SELECT fromPublisherId, fromStreamName, COUNT(*) as relevance
FROM streams_related_to
WHERE toPublisherId = 'Streams'
AND toStreamName = 'Streams/search/all'
AND type IN ('attribute/Assets/brand=Sony', 'attribute/Assets/color=Black', 'attribute/Assets/priceRange=200-500')
GROUP BY fromPublisherId, fromStreamName
ORDER BY relevance DESC;
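A quick way to check the ranking behavior is to run the same shape of query on a few made-up products that satisfy three, two, and one of the selected filters. The labels, the publisher community1, and the simplified table are illustrative assumptions, and SQLite stands in for MySQL/MariaDB.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE streams_related_to (toPublisherId TEXT, toStreamName TEXT, "
    "type TEXT, fromPublisherId TEXT, fromStreamName TEXT)"
)
SONY = "attribute/Assets/brand=Sony"
BLACK = "attribute/Assets/color=Black"
MID_PRICE = "attribute/Assets/priceRange=200-500"
catalog = {
    "Assets/product/1": [SONY, BLACK, MID_PRICE],  # satisfies all three filters
    "Assets/product/2": [SONY, BLACK],             # satisfies two
    "Assets/product/3": [BLACK],                   # satisfies one
    "Assets/product/4": ["attribute/Assets/brand=Acme"],  # satisfies none
}
for stream_name, labels in catalog.items():
    for label in labels:
        db.execute(
            "INSERT INTO streams_related_to "
            "VALUES ('Streams', 'Streams/search/all', ?, 'community1', ?)",
            (label, stream_name),
        )
# No HAVING clause: partial matches stay in the result, ranked by match count.
ranked = db.execute(
    """
    SELECT fromStreamName, COUNT(*) AS relevance
    FROM streams_related_to
    WHERE toPublisherId = 'Streams'
      AND toStreamName = 'Streams/search/all'
      AND type IN (?, ?, ?)
    GROUP BY fromPublisherId, fromStreamName
    ORDER BY relevance DESC
    """,
    (SONY, BLACK, MID_PRICE),
).fetchall()
print(ranked)
```

Streams with the same relevance come back in no guaranteed order, so a real listing would likely add a secondary sort key.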
Multi-Valued Attributes
Some attributes are arrays. For example, a product can have multiple colors (["Black","Red"]) or a person can speak multiple languages (["English","Spanish"]). Each value is stored as a separate relation row:
attribute/Assets/color=Black
attribute/Assets/color=Red
This means queries work the same way: if you filter by color=Black, all products with Black in their array of colors will be included.
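The expansion from attributes to relation rows can be sketched as a small function that emits one label per value, treating a scalar as a one-element array. The function name relation_labels and its prefix argument are hypothetical; the label format follows the examples in this article.

```python
def relation_labels(prefix, attributes):
    """Emit one relation label per attribute value (scalars count as one value)."""
    labels = []
    for name, value in attributes.items():
        values = value if isinstance(value, (list, tuple)) else [value]
        labels.extend(f"attribute/{prefix}/{name}={v}" for v in values)
    return labels


labels = relation_labels("Assets", {"color": ["Black", "Red"], "brand": "Sony"})
print(labels)
```

Since every value becomes its own row, a filter on a single value needs no special handling for arrays.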
People Search (Dating-App Style)
In a dating app, user profiles publish attributes like:
attribute/Profile/height=5'10"
attribute/Profile/ageRange=25-30
attribute/Profile/religion=Jewish
attribute/Profile/languages=Spanish
A search for “Spanish-speaking, age 25-30” is the same as before:
... AND type IN ('attribute/Profile/languages=Spanish',
'attribute/Profile/ageRange=25-30')
HAVING COUNT(*) = 2;
Every filter is just another relation type. Adding or removing criteria is as simple as adjusting the IN (…) list, while HAVING COUNT(*) = n guarantees the match covers all selected filters.
Because relations are flattened into rows and indexed, you get:
- Fast lookups (comparable to document stores)
- Grouped aggregations (GROUP BY with COUNT, SUM, and HAVING), which Firebase/Firestore queries do not support natively
- Support for scalar and array attributes without extra schema work
- No custom runtime — just MySQL or MariaDB
In short: graph search at SQL speed, whether you’re finding people by height and age, or products by brand and color.
4. Distribution by Design
Unlike Neo4j or Firebase, which assume a single global DB, Streams was built for federation.
- Every stream belongs to a publisherId (community, app, or user).
- Publishers can host their streams anywhere.
- Relations between publishers work seamlessly because the schema itself is global.
That makes Streams a multi-tenant graph database, where each community can govern its own data — but search still works across the network.
5. Beyond Graph: Streams Bring History & ACLs
Here’s where Streams leaps beyond graph databases:
- History: every change to a stream is logged as a message, so you can replay or audit evolution.
- Forks: communities can fork streams (e.g., moderate user content) while preserving lineage.
- ACLs: streams enforce read/write/admin levels out of the box.
Try doing that in Neo4j or Firebase — you’d have to bolt it on manually.
6. Example Queries That Just Work
- Find people near me: attribute relations such as attribute/Places/geohash=u33d* → search by prefix
- Find products and services in the same area: both Assets/product/* and Assets/service/* streams relate to Streams/search/all by geohash
- Find users with >100 credits who wrote an article: two relations, attribute/Assets/credits>100 and attribute/Assets/articleAuthor=true
- Find dogs and people weighing 100–200 lbs: attribute/Animals/weight and attribute/Users/weight are just different edge labels in the same search bucket.
No joins across 20 different tables — just SQL on one relation index.
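The prefix search from the first example maps naturally onto SQL's LIKE operator, since geohashes that share a prefix fall in the same area. The sketch below assumes a simplified table, a hypothetical publisher community1, and three made-up users, with SQLite standing in for MySQL/MariaDB.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE streams_related_to (toPublisherId TEXT, toStreamName TEXT, "
    "type TEXT, fromPublisherId TEXT, fromStreamName TEXT)"
)
for stream_name, geohash in [
    ("Users/user/1", "u33dc"),
    ("Users/user/2", "u33d8"),
    ("Users/user/3", "u33e1"),  # different cell, should not match
]:
    db.execute(
        "INSERT INTO streams_related_to "
        "VALUES ('Streams', 'Streams/search/all', ?, 'community1', ?)",
        (f"attribute/Places/geohash={geohash}", stream_name),
    )

prefix = "u33d"  # geohash characters are base32, so no LIKE wildcards to escape
nearby = [
    row[0]
    for row in db.execute(
        """
        SELECT DISTINCT fromStreamName
        FROM streams_related_to
        WHERE toPublisherId = 'Streams'
          AND toStreamName = 'Streams/search/all'
          AND type LIKE ?
        ORDER BY fromStreamName
        """,
        (f"attribute/Places/geohash={prefix}%",),
    )
]
print(nearby)
```

A pattern anchored at the start like this one can typically use an index on the type column, which is part of why prefix search stays cheap.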
7. Comparison to Other Systems
Feature | Streams | Neo4j | Firebase/Firestore |
---|---|---|---|
Nodes/edges | Streams/relations | Yes | No (documents only) |
Query language | SQL (portable) | Cypher (custom) | Proprietary API |
Federation | Built-in (publisherId) | No (single DB) | No |
History & forks | Yes | No | No |
ACLs | Yes | No | Limited |
Aggregations | Native (COUNT, SUM, HAVING) | Complex | Weak |
Hosting model | Federated, self-hostable | Centralized | Cloud-only |
Streams is not just another graph DB — it’s a graph + event log + ACL system, built on relational DBs, and distributed by design.
Conclusion
What started as a way to organize user data has evolved into something bigger: a graph database you didn’t know you had.
By treating streams as nodes and relations as edges, with automatic updates whenever streams change, the system creates a searchable, distributed graph on top of SQL. And because every stream also has history, forking, and ACLs, this isn’t just graph storage — it’s the backbone for decentralized, trustworthy apps.
If Neo4j is a graph DB, and Firebase is a document store, then Streams is a graph database for communities — one that was hiding in plain sight all along.