Complex Event Processing with Triceps CEP v2.0

Developer's Guide

Sergey A. Babkin

All rights reserved.

This manual is a part of the Triceps project. It is covered by the same Triceps version of the LGPL v3 license as Triceps itself.

The author can be contacted by e-mail at <babkin@users.sf.net> or <sab123@hotmail.com>.

Many of the designations used by the manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this manual, and the author was aware of a trademark claim, the designations have been printed in caps or initial caps.

While every precaution has been taken in the preparation of this manual, the author assumes no responsibility for errors or omissions, or for damages resulting from the use of the information contained herein.


Table of Contents

Preface
1. About the manual
2. Some concepts
1. The field of CEP
1.1. What is CEP?
1.2. The uses of CEP
1.3. Surveying the CEP landscape
1.4. We're not in the 1950s any more, or are we?
2. Enter Triceps
2.1. What led to it
2.2. Hello, world!
3. Building Triceps
3.1. Downloading Triceps
3.2. The reference environment
3.3. The basic build
3.4. Building the documentation
3.5. Running the examples and simple programs
3.6. Locale dependency
3.7. Installation of the Perl library
3.8. Installation of the C++ library
3.9. Disambiguation of the C++ library
3.10. Build configuration settings
4. API Fundamentals
4.1. Languages and layers
4.2. Errors, deaths and confessions
4.3. Memory management fundamentals
4.4. Code references and snippets
4.5. Triceps constants
4.6. Printing the object contents
4.7. The Hungarian notation
4.8. The Perl libraries and examples
5. Rows
5.1. Simple types
5.2. Row types
5.3. Row types equivalence
5.4. Rows
6. Labels and Row Operations
6.1. Labels basics
6.2. Label construction
6.3. Other label methods
6.4. Row operations
6.5. Opcodes
7. Scheduling
7.1. Introduction to the scheduling
7.2. Comparative scheduling in the various CEP systems
7.3. Execution unit basics
7.4. Trays
7.5. Error handling during the execution
7.6. No bundling
7.7. Topological loops
7.8. The main loop
7.9. Main loop with a socket
7.10. Tracing the execution
7.11. The gritty details of Triceps scheduling
7.12. The gritty details of Triceps loop scheduling
7.13. Recursion control
8. Memory Management
8.1. Reference cycles
8.2. Clearing of the labels
8.3. The clearing labels
9. Tables
9.1. Hello, tables!
9.2. Tables and labels
9.3. Basic iteration through the table
9.4. Deleting a row
9.5. A closer look at the RowHandles
9.6. A window is a FIFO
9.7. Secondary indexes
9.8. Sorted index
9.9. Ordered index
9.10. The index tree
9.11. Table and index type introspection
9.12. The copy tray
9.13. Table wrap-up
10. Templates
10.1. Comparative modularity
10.2. Template variety
10.3. Simple wrapper templates
10.4. Templates of interconnected components
10.5. Template options
10.6. Code generation in the templates
10.7. Result projection in the templates
11. Aggregation
11.1. The ubiquitous VWAP
11.2. Manual aggregation
11.3. Introducing the proper aggregation
11.4. Tricks with aggregation on a sliding window
11.5. Optimized DELETEs
11.6. Additive aggregation
11.7. Computation function arguments
11.8. Using multiple indexes
11.9. SimpleAggregator
11.10. The guts of SimpleAggregator
12. Joins
12.1. Joins variety
12.2. Hello, joins!
12.3. The lookup join, done manually
12.4. The LookupJoin template
12.5. Manual iteration with LookupJoin
12.6. The key fields of LookupJoin
12.7. A peek inside LookupJoin
12.8. JoinTwo joins two tables
12.9. The key field duplication in JoinTwo
12.10. The override options in JoinTwo
12.11. JoinTwo input event filtering
12.12. Self-join done with JoinTwo
12.13. Self-join done manually
12.14. Self-join done with a LookupJoin
12.15. A glimpse inside JoinTwo and the hidden options of LookupJoin
13. Time processing
13.1. Time-limited propagation
13.2. Periodic updates
13.3. The general issues of time processing
14. The other templates and solutions
14.1. The dreaded diamond
14.2. Collapsed updates
14.3. Large deletes in small chunks
15. Streaming functions
15.1. Introduction to streaming functions
15.2. Streaming functions by example, another version of Collapse
15.3. Collapse with grouping by key with streaming functions
15.4. Table-based translation with streaming functions
15.5. Streaming functions and loops
15.6. Streaming functions and pipelines
15.7. Streaming functions and tables
15.8. Streaming functions and template results
15.9. Streaming functions and recursion
15.10. Streaming functions and more recursion
15.11. Streaming functions and unit boundaries
15.12. The ways to call a streaming function
15.13. The gritty details of streaming functions scheduling
16. Multithreading
16.1. Triceps multithreading concepts
16.2. The Triead lifecycle
16.3. Multithreaded pipeline
16.4. Object passing between threads
16.5. Threads and file descriptors
16.6. Dynamic threads and fragments in a socket server
16.7. ThreadedServer implementation, and the details of thread harvesting
16.8. ThreadedClient, a Triceps Expect
16.9. Thread main loop and timeouts in the guts of ThreadedClient
16.10. The threaded dreaded diamond and data reordering
17. TQL, Triceps Trivial Query Language
17.1. Introduction to TQL
17.2. TQL syntax
17.3. TQL commands
17.4. TQL in a single-threaded server
17.5. TQL in a multi-threaded server
17.6. Internals of a TQL join
18. Performance
19. Triceps Perl API Reference
19.1. Unit and FrameMark reference
19.2. TableType reference
19.3. IndexType reference
19.4. AggregatorType reference
19.5. SimpleAggregator reference
19.6. Table reference
19.7. RowHandle reference
19.8. AggregatorContext reference
19.9. Opt reference
19.10. Fields reference
19.11. LookupJoin reference
19.12. JoinTwo reference
19.13. Collapse reference
19.14. Braced reference
19.15. FnReturn reference
19.16. FnBinding reference
19.17. AutoFnBind reference
19.18. App reference
19.18.1. App instance management
19.18.2. App resolution
19.18.3. App introspection
19.18.4. App harvester control
19.18.5. App state management
19.18.6. App drain control
19.18.7. App start timeout
19.18.8. File descriptor transfer through an App
19.18.9. App build
19.19. Triead reference
19.20. TrieadOwner reference
19.20.1. TrieadOwner construction
19.20.2. TrieadOwner general methods
19.20.3. TrieadOwner drains
19.20.4. TrieadOwner file interruption
19.20.5. TrackedFile
19.21. Nexus reference
19.22. Facet reference
19.23. AutoDrain reference
20. Triceps C++ API Reference
20.1. C++ API Introduction
20.2. The const-ness in C++
20.3. Memory management in the C++ API and the Autoref reference
20.4. The many ways to do a copy
20.5. String utilities
20.6. Perl wrapping for the C++ objects
20.7. Error reporting and Errors reference
20.8. Exception reference
20.9. Initialization templates
20.10. Types reference
20.11. Simple types reference
20.12. RowType reference
20.13. Row and Rowref reference
20.14. TableType reference
20.15. NameSet reference
20.16. IndexType reference
20.17. Index reference
20.18. FifoIndexType reference
20.19. HashedIndexType reference
20.20. SortedIndexType reference
20.21. Gadget reference
20.22. Table reference
20.22.1. Data dump
20.22.2. Sticky errors
20.23. RowHandle and Rhref reference
20.24. Aggregator classes reference
20.24.1. AggregatorType reference
20.24.2. AggregatorGadget reference
20.24.3. Aggregator reference
20.24.4. BasicAggregatorType reference
20.24.5. Aggregator example
20.25. Unit reference
20.26. Unit Tracer reference
20.27. Label reference
20.28. Rowop reference
20.29. Tray reference
20.30. FrameMark reference
20.31. RowSetType reference
20.32. FnReturn reference
20.33. FnBinding reference
20.34. ScopeFnBind and AutoFnBind reference
20.35. App reference
20.36. Triead reference
20.37. TrieadOwner reference
20.38. Nexus reference
20.39. Facet reference
20.40. AutoDrain reference
20.41. Sigusr2 reference
20.42. TrieadJoin reference
20.43. FileInterrupt reference
20.44. BasicPthread reference
21. Release Notes
21.1. Release 2.0.0
21.2. Release 1.0.1
21.3. Release 1.0.0
21.4. Release 0.99
Bibliography
Index

List of Figures

6.1. Stateful elements with chained labels.
7.1. Labels forming a topological loop.
7.2. Proper calls in a loop.
9.1. Drawings legend.
9.2. One index type.
9.3. Straight nesting.
9.4. begin(), beginIdx($itA) and beginIdx($itB) work the same for this table.
9.5. findIdx($itA, $rh) goes through A and then switches to the beginIdx() logic.
9.6. firstOfGroupIdx($itB, $rh).
9.7. nextGroupIdx($itB, $rh).
9.8. Two top-level index types.
9.9. A primary and secondary index type.
9.10. Two index types nested under one.
14.1. The diamond topology.
15.1. The difference between the function and macro calls.
15.2. The query patterns and streaming functions.
16.1. Triceps multithreaded application.
16.2. Chat server internal structure.
17.1. Multithreaded TQL application structure.
19.1. The use of immediate import.

Preface

1. About the manual

Before starting on the subject of the Triceps CEP itself, I want to say a few things about the organization of this manual.

It had grown quite large, and if it were printed on paper, I would have divided it into at least three volumes. But in the electronic form it's more convenient as a single document: this way the cross-references between any parts of it work seamlessly.

The manual keeps living and growing together with Triceps itself. As things change in Triceps, they change in the manual, but sometimes it's difficult to track down and update all the mentions of the changed subject. I've been spending a huge effort on tracking all such instances down but sometimes things slip through. Keep this in mind and don't be too scared when some paragraph says something contradictory.

A known issue with this manual is that it tends to describe the subjects in the bottom-up fashion, starting from the low-level details and then building up to the high-level concepts. This is partially because the manual has been growing together with Triceps, which is being built from the ground up. And partially it's because I like the details. When I read about a product, I want to understand how exactly it works. When I write, I want to convey this information. I rewrote some of the chapters to put the high-level descriptions up front. But it's a huge work that will take some time to complete for the whole manual. In the meantime, I'd rather not delay the releases for it; they've already been slowed a lot by the documentation work. So it will get better with time, and in the meantime, if you feel that some details are too much for you, feel free to skip over them.

There are a great many other improvements that can be done to the manual, and they will eventually be done. But my take on it is that it's better to have an imperfect manual now than a perfect one in some distant future. It had already been too long in the works; writing the manual for version 2.0 had taken a whole year.

2. Some concepts

When talking about the CEP programs, I often use the term model. What is a model? It's basically a CEP program. More about the models, and about what CEP itself is, can be found in Chapter 1.

Many of the examples are built around the world of stock trading. In the modern times almost everyone is probably familiar with the basics of this area. But in case you're not, let me explain the most fundamental thing needed for understanding the examples: what a symbol is.

When the stock shares of some company are traded on an exchange, this company gets assigned a short identifier. This identifier is known as the stock symbol for this company. This word is also often used to mean not just the identifier but also the shares denoted by it. If a company has multiple classes of shares, each class would have its own symbol. And if a company is traded on multiple exchanges, each exchange may have its own identifier for its shares. The options and other derivative financial products also have their own symbols.

Chapter 1. The field of CEP

1.1. What is CEP?

CEP stands for Complex Event Processing. If you look at Wikipedia, it has separate articles for Event Stream Processing and for Complex Event Processing. In reality it's all the same thing, with the naming driven by the marketing. I would not be surprised if someone invents yet another name, and everyone will start jumping on that bandwagon too.

In general a CEP system can be thought of as a black box, where the input events come in, propagate in some way through that black box, and come out as the processed output events. There is also an idea that the processing should happen fast, though the definitions of fast vary widely.

If we open the lid on the box, there are at least three ways to think of its contents:

  • a spreadsheet on steroids
  • a data flow machine
  • a database driven by triggers

Hopefully you've seen a spreadsheet before. The cells in it are tied together by formulas. You change one cell, and the machine goes and recalculates everything that depends on it. So does a CEP system. If we look closer, we can discern the CEP engine (which is like the spreadsheet software), the CEP model (like the formulas in the spreadsheet) and the state (like the current values in the spreadsheet). An incoming event is like a change in an input cell, and the outgoing events are the updates of the values in the spreadsheet.

Only a typical CEP system is bigger: it can handle some very complicated formulas and many millions of records. There actually are products that connect the Excel spreadsheets with the behind-the-curtain computations in a CEP system, with the results coming back to the spreadsheet cells. Pretty much every commercial CEP provider has a product that does that through the Excel RT interface. The way these models are written is not exactly pretty, but the results are, combining the nice presentation of spreadsheets with the speed and power of CEP.

A data flow machine, where the processing elements are exchanging messages, is your typical academic look at CEP. The events represented as data rows are the messages, and the CEP model describes the connections between the processing elements and their internal logic. This approach naturally maps to the multiprocessing, with each processing element becoming a separate thread. The hiccup is that the research in the dataflow machines tends to prefer the non-looped topologies. The loops in the connections complicate things.

And many real-world relational databases already work very similarly to the CEP systems. They have the constraints and the triggers propagating these constraints. A trigger propagates an update on one table to an update on another table. It's like a formula in a spreadsheet or a logical connection in a dataflow graph. Yet the databases usually miss two things: the propagation of the output events and the notion of being fast.

The lack of propagation of the output events is totally baffling to me: the RDBMS engines already write the output event stream as the redo log. Why not also send it out in some generalized format, XML or something? Then people realize that yes, they do want to get the output events and start writing some strange add-ons and aftermarket solutions like the log scrubbers. This has been a mystery to me for some 15 years. I mean, how much more obvious can it be? But nobody budges. Well, with the CEP systems gaining popularity and the need to connect them to the databases, I think it will eventually grow on the database vendors that a decent event feed is a competitive advantage, and I think it will happen sometime soon.

The feeling of fast or lack thereof has to do with the databases being stored on disks. The growth of CEP has coincided with the growth in RAM sizes, and the data is usually kept completely in memory. People who deploy CEP tend to want the performance not of hundreds or thousands but of hundreds of thousands of events per second. The second part of fast is connected with the transactions. In a traditional RDBMS a single event with all its downstream effects is one transaction. Which is safe but may cause lots of conflicts. The CEP systems usually allow breaking up the logic into multiple loosely-dependent layers, thus cutting down on the overhead.

1.2. The uses of CEP

Despite what Wikipedia says (and honestly, the Wikipedia articles on CEP and ESP are not exactly connected with reality), the pattern detection is not your typical usage, by a wide, wide margin. The typical usage is for the data aggregation: lots and lots of individual events come in, and you want to aggregate them to keep a concise and consistent picture for the decision-making. The actual decision making can be done by humans or again by the CEP systems. It may involve some pattern recognition but usually even when it does, it doesn't look like patterns, it looks like conditions and joins on the historical chains of events.

The uses in the cases I know of include the ad-click aggregation, the decisions to make a market trade, watching whether the bank's end-of-day balance falls within the regulations, and choosing the APR for lending.

A related use would be for the general alert consoles. The data aggregation is what they do too. The last time I worked with them up close (around 2006), the processing in BMC Patrol and Nagios was just plain inadequate for anything useful, and I had to hand-code the data collection and console logic. I've touched this area again recently at Google, and apparently nothing has changed much since then. All the real monitoring is done with the systems developed in-house.

But the CEP would have been just the ticket. I think the only reason why it has not become widespread yet is that the commercial CEP licenses have cost a lot. But with the all-you-can-eat pricing of Sybase, and with the Open Source systems, this is gradually changing.

Well, and there is also the pattern matching. It has been lagging behind the aggregation but growing too.

1.3. Surveying the CEP landscape

What do we have in the CEP area now? The scene is pretty much dominated by Sybase (combining the former competitors Aleri and Coral8) and StreamBase.

There seem to be two major approaches to the execution model. One was used by Aleri, another by Coral8 and StreamBase. I'm not hugely familiar with StreamBase, but that's how it seems to me. Since I'm much more familiar with Coral8, I'll be calling the second model the Coral8 model. If you find StreamBase substantially different, let me know.

The Aleri idea is to collect and keep all the data. The relational operators get applied on the data, producing the derived data ("materialized views") and eventually the results. So, even though the Aleri models were usually expressed in XML (though an SQL compiler was also available), fundamentally it's a very relational and SQLy approach.

This creates a few nice properties. All the steps of execution can be pipelined and executed in parallel. For persistence, it's fundamentally enough to keep only the input data (what has been called BaseStreams and then SourceStreams), and all the derived computations can be easily reprocessed on restart (it's funny but it turns out that often it's faster to read a small state from the disk and recalculate the rest from scratch in memory than to load a large state from the disk).

It also has issues. It doesn't allow loops, and the procedural calculations aren't always easy to express. And keeping all the state requires more memory. The issues of loops and procedural computations have been addressed in Aleri by FlexStreams: modules that would perform the procedural computations instead of relational operations, written in SPLASH — a vaguely C-ish or Java-ish language. However this tends to break the relational properties: once you add a FlexStream, usually you do it for the reasons that prevent the derived calculations from being re-done, creating issues with saving and restoring the state. Mind you, you can write a FlexStream that doesn't break any of them, but then it would probably be doing something that can be expressed without it in the first place.

Coral8 has grown from the opposite direction: the idea has been to process the incoming data while keeping a minimal state in the variables and short-term windows (limited sliding recordings of the incoming data). The language (CCL) is very SQL-like. It relies on the state of variables and windows being pretty much global (module-wide), and allows the statements to be connected in loops. Which means that the execution order matters a lot. Which means that there are some quite extensive rules, determining this order. The logic ends up being very much procedural, but written in the peculiar way of SQL statements and connecting streams.

The good thing is that all this allows controlling the execution order very closely and writing things that are very difficult to express in the pure unordered relational operators. This in turn allows aggregating the data early and creatively, keeping less data in memory.

The bad news is that it limits the execution to a single thread. If you want a separate thread, you must explicitly make a separate module and program the communications between the modules, which is not exactly easy to get right. There are lots of people who do it the easy way and then wonder why they get the occasional data corruption. Also, the ordering rules for execution inside a module are quite tricky. Even for some fairly simple logic, it requires writing a lot of code, some of which is just bulky (try enumerating 90 fields in each statement), and some of which is tricky to get right.

The summary is that everything is not what it seems: the Aleri models aren't usually written in SQL but are very declarative in their meaning, while the Coral8/StreamBase models are written in an SQL-like language but in reality are totally procedural.

Sybase is also striving for a middle ground, combining the features inherited from Aleri and Coral8 in its CEP R5 and later: use the CCL language but relax the execution order rules to the Aleri level, except for the explicit single-threaded sections where the order is important, and include the SPLASH fragments for the places where the outright procedural logic is easier to use. Even though it sounds like a hodge-podge, it actually came together pretty nicely. Forgive me for saying so myself, since I've done a fair amount of the design and the execution logic implementation for it before I left Sybase.

Still, not everything is perfect in this merged world. The SQLy syntax still requires you to drag around all your 90 fields in nearly every statement. The single-threaded order of execution is still non-obvious. It's possible to write the procedural code directly in SPLASH but the boundary where the data passes between the SQLy and C-ish code still has a whole lot of its own kinks (fewer than in Aleri but still a lot). And worst of all, there is still no modular programming. Yeah, there are modules but they are not really reusable. They are tied too tightly to the schema of the data. What is needed is something more like the C++ templates. Only preferably something more flexible and less difficult to debug than the C++ templates.

Let me elaborate a little on the point of dragging around all your fields. Here is a typical example: you have a stream of data and you want to pass through only the rows that find a match in some reference table. Which is reasonable to do with something like:

insert into filtered_data
select
  incoming_data.*
from
  incoming_data as d left join reference_table as r
  on d.key_field = r.key_field;

Only you can't write incoming_data.* in their syntax; you have to list every single field of it explicitly. If the data has 90 fields, that becomes quite a drag.

StreamBase does have modules with parameterizable arguments ("capture fields"), somewhat like the C++ templates. The limitation is that you can carry any additional fields through unchanged but can't really specify subsets of fields for a particular usage (and use these fields as a key). Or at least that's my understanding: I haven't used it in practice and don't know StreamBase too well.

1.4. We're not in the 1950s any more, or are we?

Part of the complexity with CCL programming is that the CCL programs tend to feel very broken-up, with the flow of the logic jumping all over the place.

Consider a simple example: some incoming financial information may identify the securities by either RIC (Reuters identifier) or SEDOL or ISIN, and before processing it further we want to convert them all to ISIN (since the fundamentally same security may be identified in multiple ways when it's traded in multiple countries, ISIN is the common denominator).

This can be expressed in CCL approximately like this (no guarantees about the correctness of this code, since I don't have a compiler to try it out):

// the incoming data
create schema s_incoming (
  id_type string, // identifier type: RIC, SEDOL or ISIN
  id_value string, // the value of the identifier
  // add another 90 fields of payload...
);

// the normalized data
create schema s_normalized (
  isin string, // the identity is normalized to ISIN
  // add another 90 fields of payload...
);

// schema for the identifier translation tables
create schema s_translation (
  from string, // external id value (RIC or SEDOL)
  isin string, // the translation to ISIN
);

// the windows defining the translations from RIC and SEDOL to ISIN
create window w_trans_ric schema s_translation
  keep last per from;
create window w_trans_sedol schema s_translation
  keep last per from;

create input stream i_incoming schema s_incoming;
create stream incoming_ric  schema s_incoming;
create stream incoming_sedol  schema s_incoming;
create stream incoming_isin  schema s_incoming;
create output stream o_normalized schema s_normalized;

insert
  when id_type = 'RIC' then incoming_ric
  when id_type = 'SEDOL' then incoming_sedol
  when id_type = 'ISIN' then incoming_isin
select *
from i_incoming;

insert into o_normalized
select
  w.isin,
  i. ... // the other 90 fields
from
  incoming_ric as i join w_trans_ric as w
    on i.id_value = w.from;

insert into o_normalized
select
  w.isin,
  i. ... // the other 90 fields
from
  incoming_sedol as i join w_trans_sedol as w
    on i.id_value = w.from;

insert into o_normalized
select
  i.id_value,
  i. ... // the other 90 fields
from
  incoming_isin as i;

Not exactly easy, is it, even with the copying of payload data skipped? You may notice that what it does could also be expressed as procedural pseudo-code:

// the incoming data
struct s_incoming (
  string id_type, // identifier type: RIC, SEDOL or ISIN
  string id_value, // the value of the identifier
  // add another 90 fields of payload...
);

// schema for the identifier translation tables
struct s_translation (
  string from, // external id value (RIC or SEDOL)
  string isin, // the translation to ISIN
);

// the windows defining the translations from RIC and SEDOL to ISIN
table s_translation w_trans_ric
  key from;
table s_translation w_trans_sedol
  key from;

s_incoming i_incoming;
string isin;

if (i_incoming.id_type == 'RIC') {
  isin = lookup(w_trans_ric,
    w_trans_ric.from == i_incoming.id_value
  ).isin;
} elsif (i_incoming.id_type == 'SEDOL') {
  isin = lookup(w_trans_sedol,
    w_trans_sedol.from == i_incoming.id_value
  ).isin;
} elsif (i_incoming.id_type == 'ISIN') {
  isin = i_incoming.id_value;
}

if (isin != NULL) {
  output o_normalized(isin,
    i_incoming.(* except (id_type, id_value))
  );
}

Basically, writing in CCL feels like programming in Fortran in the 50s: lots of labels, lots of GOTOs. Each stream is essentially a label, when looking from the procedural standpoint. It's actually worse than Fortran, since all the labels have to be pre-defined (with types!). And there isn't even the normal sequential flow, each statement must be followed by a GOTO, like on those machines with magnetic-drum main memory.

This is very much like the example in my book [Babkin10], in Section 6.4, "Queues as the sole synchronization mechanism". You can look at the draft text online at http://web.newsguy.com/sab123/tpopp/06odata.txt. This similarity is not accidental: the CCL streams are queues, and they are the only communication mechanism in CCL.

The SQL statement structure also adds to the confusion: each statement has the destination followed by the source of the data, so each statement reads like it flows backwards.

Chapter 2. Enter Triceps

2.1. What led to it

It had happened that I've worked for a while on and with the Complex Event Processing (CEP) systems. I've worked for a few years on the internals of the Aleri CEP engine, then after Aleri acquired Coral8, some on the Coral8 engine, and then after Sybase gobbled them both up, I've designed and done the early implementation of a fair bit of the Sybase CEP R5. After that I've moved on to Deutsche Bank and got the experience from the other side: using the CEP systems, primarily the former Coral8, now known as Sybase CEP R4.

This made me feel that writing the CEP models is unnecessarily difficult. Even the essentially simple things take too much effort. I've had this feeling before as well, but it's one thing to have it in the abstract, and another to grind against it every day.

Which in turn led me to thinking about making my own Open Source CEP system, where I could try out the ideas I get, and make the streaming models easier to write. I aim to do better than the 1950s style, to bring the advances of the structured programming into the CEP world.

Thus the Triceps project was born. For a while it was called Biceps, until I've learned of the existence of a research project called BiCEP. It's spelled differently, and is in a substantially different area of CEP work, but it's easier to avoid confusion, so I went one better and renamed mine Triceps.

Since then I've moved on from DB, and I'm currently not using any CEP at work (though you never know what would happen), but Triceps has already gained momentum by itself.

The Triceps development has been largely shaped by two considerations:

  • It has to be different from the Sybase products on which I worked. This is helpful from both the legal and the marketing standpoints: Sybase and StreamBase already have similar products that compete head to head. There is no use getting into the same fray without some major resources.
  • It has to be small. I can't spend the same amount of effort on Triceps as a large company, or even as a small one. Not only does this save time, but it also allows the modifications to be easy and fast. The point of Triceps is to experiment with the CEP language to make it easy to use: try out the ideas, make sure that they work well, or replace them with other ideas. The companies with a large established product can't really afford the radical changes: they have invested much effort into the product, and are stuck with supporting it and providing compatibility into the future.

Both of these considerations point into the same direction: an embeddable CEP system. Adapting an integrated system for an embedded usage is not easy, so it's a good open niche. Yeah, this niche is not empty either. There already is Esper. But from a cursory look, it seems to have the same issues as Coral8/StreamBase. It's also Java-centric, and Triceps is aimed for embeddability into different languages.

And an embeddable system saves on a lot of components.

For starters, no IDE. Anyway, I find the IDEs pretty useless for development in general, and especially for the CEP development. Though one comes in handy once in a while for the analysis of the code and for debugging.

No new language, no need to develop compilers, virtual machines, function libraries, external callout APIs. Well, the major goal of Triceps actually is the development of a new and better language. But it's one of these paradoxes: Aleri does the relational logic looking like procedural, Coral8 and StreamBase do the procedural logic looking like relational, and Triceps is a design of a language without a language. Eventually there probably will be a language, to be mixed with the parent one. But for now a lot can be done by simply using the Triceps library in an existing scripting language. The existing scripting languages are already powerful, fast, and also support the dynamic compilation.
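
To give a concrete feel for the dynamic compilation point, here is a small sketch of what it buys an embedded CEP library: a piece of processing logic, specialized for a particular field and value, can be generated and compiled at run time. The sketch is in Python purely for illustration (in Perl, the actual Triceps host language, the equivalent tools are closures and the string eval), and the names in it are made up:

```python
# Build a row-filtering function at run time, specialized for one
# field/value pair. This is generated code being compiled on the fly,
# which is what "dynamic compilation" refers to.
def make_filter(field, value):
    src = "def f(row):\n    return row[%r] == %r\n" % (field, value)
    ns = {}
    exec(src, ns)  # compile and run the generated source
    return ns["f"]

only_ric = make_filter("id_type", "RIC")
print(only_ric({"id_type": "RIC", "id_value": "IBM.N"}))      # True
print(only_ric({"id_type": "SEDOL", "id_value": "2005973"}))  # False
```

The same ability is what makes templates practical: the template code can examine the row schema and generate the specialized logic for it, instead of requiring every field to be spelled out by hand.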

No separate server executable, no need to control it, and no custom network protocols: the users can put the code directly into their executables and devise any protocols they please. Well, it's not a really good answer for the protocols, since it means that everyone who wants to communicate the streaming data for Triceps over the network has to implement these protocols from scratch. So eventually Triceps will provide a default implementation. But it doesn't have to be done right away.

No data persistence for now either. It's a nice feature, and I have some ideas about it too, but it requires a large amount of work, and doesn't really affect the API.

The language used to implement Triceps is C++, and the scripting language is Perl. Nothing really prevents embedding Triceps into other languages, but it's not going to happen any time soon. The reason being that extra code adds weight and makes the changes more difficult.

The multithreading support has been a major consideration from the start. All the C++ code has been written with the multithreading in mind. However, for the first release the multithreading has not propagated into the Perl API yet.

Even though Triceps is a system aimed at quick experimentation, that does not imply that it's of a toy quality. The code is written in production quality to start with, with a full array of unit tests. In fact, the only way you can do the quick experimentation is by setting up the proper testing from scratch. The idea of move fast and break things is complete rubbish.

2.2. Hello, world!

Let's finally get to business: write a simple Hello, world! program with Triceps. Since Triceps is an embeddable library, naturally, the smallest Hello, world! program would be in the host language without Triceps, but it would not be interesting. So here is a somewhat contrived but more interesting Perl program that passes some data through the Triceps machinery:

use strict;
use warnings;
use Triceps;

# An execution unit keeps the context for one logical thread.
my $hwunit = Triceps::Unit->new("hwunit");
# The type of the rows that will be passed around: two string fields.
my $hw_rt = Triceps::RowType->new(
  greeting => "string",
  address => "string",
);

# A label with a Perl handler function that prints the row contents.
my $print_greeting = $hwunit->makeLabel($hw_rt, "print_greeting", undef, sub {
  my ($label, $rowop) = @_;
  printf("%s!\n", join(', ', $rowop->getRow()->toArray()));
} );

# Build a row operation (an INSERT of a row constructed from
# name-value pairs) and call the label with it.
$hwunit->call($print_greeting->makeRowop(&Triceps::OP_INSERT,
  $hw_rt->makeRowHash(
    greeting => "Hello",
    address => "world",
  )
));

What happens there? First, we import the Triceps module. Then we create a Triceps execution unit. An execution unit keeps the Triceps context and controls the execution for one logical thread.

The argument of the constructor is the name of the unit, which can be used in printing messages about it. It doesn't have to be the same as the name of the variable that keeps the reference to the unit, but it's a convenient convention that makes the debugging easier. This is a common idiom of Triceps: when you create something, you give it a name. If any errors occur later with this object, the name will be present in the error message, and you'll be able to easily find which object has the issue and where it was created.

If something goes wrong, the Triceps methods will confess. To be precise, they call Carp::confess, which is like Perl's die but also prints the stack trace. Triceps also includes its own high-level call stack into this trace.
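As an aside on the mechanics, a confession can be caught with eval just like an ordinary die. This is plain Perl with the Carp module, not Triceps-specific code, and the function risky() is purely hypothetical:

```perl
use strict;
use warnings;
use Carp;

# A hypothetical function that reports a fatal error with a stack trace,
# the way the Triceps methods do on incorrect use.
sub risky {
    my ($arg) = @_;
    confess "bad argument '$arg'" if $arg < 0;
    return $arg * 2;
}

# The confession can be caught like any die, with the message
# (including the stack trace) left in $@.
my $result = eval { risky(-1) };
if ($@) {
    print "caught: $@";
}
print risky(21), "\n";  # prints 42
```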

The next statement creates the type for rows. For the simplest example, one row type is enough. It contains two string fields. A row type does not belong to an execution unit. It may be used in parallel by multiple threads. Once a row type is created, it's immutable, and that's the story for pretty much all the Triceps objects that can be shared between multiple threads: they are created, they become immutable, and then they can be shared. (Of course, the containers that facilitate the passing of data between the threads would have to be an exception to this rule).

Then we create a label. The label is the Triceps term for the same kind of stream processing elements as in the other CEP systems. The Coral8 term for the same concept is stream. The SQLy vs procedural example in Section 1.4: “We're not in 1950s any more, or are we?” shows why these elements are analogs of labels in the procedural programming, and Triceps generally follows the procedural terminology.

Of course, now, in the days of the structured programming, we don't create labels for GOTOs all over the place. But we still use labels. The function names are essentially labels, the loops in Perl may have labels. So a Triceps label can often be seen kind of like a function definition, but only kind of. It takes a data row as a parameter and does something with it. But unlike a proper function it has no way to return the processed data back to the caller. It has to either pass the processed data to other labels or collect it in some hardcoded data structure, from which the caller can later extract it back. Thus a Triceps label is still much more like a GOTO label.

Triceps has the streaming functions too, where the caller does provide the way to return the result. These are more than the ordinary labels.

A basic label takes a row type for the rows it accepts, a name (again, purely for the ease of debugging) and a reference to a Perl function that will be handling the data. Extra arguments for the function can be specified as well, but there is no use for them in this example.

Here it's a simple unnamed Perl function. Though of course a reference to a named function can be used instead, and the same function may be reused for multiple labels. Whenever the label gets a row operation to process, its function gets called with the reference to the label object, the row operation object, and whatever extra arguments were specified at the label creation (none in this example). The example just prints a message combined from the data in the row.
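To make the calling convention concrete, here is a tiny plain-Perl sketch that mimics how a label invokes its handler with the label object, the rowop, and the extra arguments. The ToyLabel package is a hypothetical stand-in for illustration only, not the real Triceps label (which is a C++ object):

```perl
use strict;
use warnings;

# A toy stand-in for a label: remembers a handler function and its
# extra arguments, and passes them along on every call.
package ToyLabel;
sub new {
    my ($class, $name, $handler, @args) = @_;
    return bless { name => $name, handler => $handler, args => \@args }, $class;
}
sub call {
    my ($self, $rowop) = @_;
    # Same convention as Triceps: label object, rowop, then extra args.
    $self->{handler}->($self, $rowop, @{ $self->{args} });
}

package main;
my @log;
# The extra argument "LOG" is fixed at creation time and shows up
# in the handler on every call.
my $greet = ToyLabel->new("greet", sub {
    my ($label, $rowop, $prefix) = @_;
    push @log, "$prefix: $rowop";
}, "LOG");
$greet->call("hello row");
print "@log\n";  # prints "LOG: hello row"
```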

Note that the label's handler function doesn't just get a row as an argument. It gets a row operation (rowop as it's called throughout the code). It's an important distinction. A row just stores some data. As the row gets passed around, it gets referenced and unreferenced, but it just stays the same until the last reference to it disappears, and then it gets destroyed. It doesn't know what happens with the data, it just stores them. A row may be shared between multiple threads. On the other hand, a row operation says take these data and do such and such a thing with them. A row operation is a combination of a row of data, an operation code, and a label that has to carry out the operation. Since the row operation object is also immutable, a reference to a row operation may be kept and reused again and again.

Triceps has the explicit operation codes, very much like Aleri/Sybase R5 (only Aleri doesn't differentiate between a row and row operation, every row there has an opcode in it). It might be just my background, but let me tell you: the CEP systems without the explicit opcodes are a pain. The visible opcodes make life a lot easier. However unlike Aleri, there is no UPDATE opcode. The available opcodes are INSERT, DELETE and NOP (no-operation). If you want to update something, you send two operations: first DELETE for the old value, then INSERT for the new value. All this will be described in more detail later.
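To see why the two-opcode update works, here is a tiny plain-Perl model of a table keyed by one field. This only illustrates the opcode semantics and is not the Triceps table API:

```perl
use strict;
use warnings;

my %table;  # key => value, a stand-in for a keyed table

# Apply one row operation: an opcode plus the key-value data.
sub apply {
    my ($op, $key, $value) = @_;
    if ($op eq 'INSERT') {
        $table{$key} = $value;
    } elsif ($op eq 'DELETE') {
        delete $table{$key};
    }  # NOP does nothing
}

apply('INSERT', 'AAA', 10);
# An update of AAA from 10 to 15 is expressed as a DELETE-INSERT pair:
apply('DELETE', 'AAA', 10);
apply('INSERT', 'AAA', 15);
print $table{AAA}, "\n";  # prints 15
```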

For this simple example, the opcode doesn't really matter, so the label handler function quietly ignores it. It gets the row from the row operation and extracts the data from it into the Perl representation, then prints them. The Triceps row data may be represented in Perl in two ways: an array and a hash. In the array format, the array contains the values of the fields in the order they are defined in the row type. The hash format consists of name-value pairs, which may be stored either in an actual hash or in an array. The conversion from a row to a hash actually returns an array of values which becomes a real hash if it gets stored into a hash variable.

As a side note, this also suggests how the systems without explicit opcodes came to be: they've been initially built on the simple stateless examples. And when the more complex examples have turned up, they were already stuck on this path, and could not afford too deep a retrofit.
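Returning to the data formats: the mechanics of the hash format are plain Perl. A function returns a flat list of name-value pairs, and that list becomes a real hash only when stored into a hash variable. Here is an ordinary-Perl illustration, with a hypothetical to_hash() standing in for the row's conversion method:

```perl
use strict;
use warnings;

# A stand-in for a row's hash-format conversion: it returns a flat
# list of name-value pairs, not a hash reference.
sub to_hash {
    return (greeting => "Hello", address => "world");
}

# Stored into an array, it stays a flat list of pairs...
my @pairs = to_hash();
print scalar(@pairs), "\n";  # 4 elements: two names, two values

# ...but stored into a hash variable, it becomes a real hash.
my %row = to_hash();
print $row{greeting}, ", ", $row{address}, "!\n";  # prints "Hello, world!"
```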

The final part of the example is the creation of a row operation for our label, with an INSERT opcode and a row created from the hash-formatted Perl data, and calling it through the execution unit. The row type provides a method to construct the rows, and the label provides a method to construct the row operations for it. The call() method of the execution unit does exactly what its name implies: it evaluates the label function right now, and returns after all its processing is done.

This is a very simple example, so it does only one call. The real Triceps programs get a stream of incoming data, and do the calls to handle each row of it.

Chapter 3. Building Triceps

3.1. Downloading Triceps

The official Triceps site is located at SourceForge.

http://triceps.sf.net is the high-level page.

http://sf.net/projects/triceps is the SourceForge project page.

The official releases of Triceps can be downloaded from SourceForge and CPAN. The CPAN location is:

http://search.cpan.org/~babkin/triceps/

The Developer's Guide can also be found in the Kindle format on the Amazon web site, for Amazon's minimal price of $1.

The release policy of Triceps is aimed towards the ease of development. As the new features are added (or sometimes removed), they are checked into the SVN repository and documented in the blog form at http://babkin-cep.blogspot.com/. Periodically the documentation updates are collected from the blog into this manual, and the official releases are produced.

If you want to try out the most bleeding-edge features that have been described on the blog but not officially released yet, you can get the most recent code directly from the SVN repository. The SVN code can be checked out with

svn co http://svn.code.sf.net/p/triceps/code/trunk

You don't need any login for the check-out. You can keep it current with the latest changes by periodically running svn update. After you've checked out the trunk, you can build it as usual. If you do have a login and an SSH key, you can use them as well:

svn co svn+ssh://your_username@svn.code.sf.net/p/triceps/code/trunk

3.2. The reference environment

The tested reference build environment is where I do the Triceps development, and currently it is Linux Fedora 11. The build should work automatically on the other Linux systems as well, and the testing reports from CPAN show that it usually works.

The build should work on the other Unix environments too but may require some manual configuration for the available libraries. The test reports from CPAN show that the BSD varieties (FreeBSD, OpenBSD, MidnightBSD) usually do well.

Currently you must use the GNU Linux toolchain: GNU make, GNU C++ compiler (version 4.4.1 has been tested), glibc, valgrind. You can build without valgrind by running only the non-valgrind tests.

The older GNU compiler 4.1 and the newer compiler versions have been reported to work as well. But if you build the trunk code checked out from SVN (or otherwise in the directory named trunk), there is a catch with the warning flags. This kind of build treats almost all warnings as errors, and this causes varying results with the different compiler versions. The version 4.1 doesn't have the option -Wno-sign-conversion and will fail on it. The newer compiler versions may have some extra warnings that will be treated as errors (and since my reference compiler doesn't check for them, the code may trigger them). The fix for this situation is to edit cpp/Makefile.inc and change the variable CFLAGS_WARNINGS, or just clear it altogether. In the release form this is not an issue, in the release directory the warnings are not treated as errors and no warning options are used.

GCC 4.1 is also known to have complaints about the construct sizeof(field). I've modified the reported occurrences but more might creep up in the future. If this stops your build, change them to sizeof(TypeOfField).

The tested Perl versions are 5.10.0 and 5.19.0, and Triceps should work on any recent version as well. With the earlier versions your luck may vary. The Makefile.PL has been configured to require at least 5.8.0. The older versions have a different threading module and definitely won't work.

The threads support in the Perl interpreter is needed to run the multithreaded API. If your Perl is built without threads, the single-threaded part is still usable but all the tests related to multithreading will fail. The last version of Triceps with no threads support at all is 1.0.1, and it's the last resort if you want to run without threads.

I am interested in hearing the reports about builds in various environments.

The normal build expectation is for the 64-bit machines. The 32-bit machines should work (and the code even includes the special cases for them) but are untested at the moment. Some of the tests might fail on the 32-bit and/or big-endian machines due to the different computation of the hash values, and thus producing a different row order in the result.

3.3. The basic build

If everything works, the basic build is simple, go to the Triceps directory and run:

make all
make test

That would build and test both the C++ and Perl portions of Triceps. The C++ libraries will be created under cpp/build. The Perl libraries will be created under perl/Triceps/blib.

The tests are normally run with valgrind for the C++ part, without valgrind for the Perl part. The reason is that Perl produces lots of false positives, and the suppressions depend on particular Perl versions and are not exactly reliable.

If your system differs substantially, you may need to adjust the configurable settings manually, since there is no ./configure script in the Triceps build yet. More information about them is in the Section 3.10: “Build configuration settings” .

The other interesting make targets are:

clean
Remove all the built files.
clobber
Remove the object files, forcing the libraries to be rebuilt next time.
vtest
Run the unit tests with valgrind, checking for leaks and memory corruption.
qtest
Run the unit tests quickly, without valgrind.
release
Export from SVN a clean copy of the code and create a release package. The package name will be triceps-version.tgz, where the version is taken from the SVN directory name, from where the current directory is checked out. This includes the build of the documentation.

3.4. Building the documentation

If you have downloaded the release package of Triceps, the documentation is already included in it in the built form. The PDF and HTML versions are available in doc/pdf and doc/html. It is also available online from http://triceps.sf.net.

The documentation is formatted in DocBook, which produces the PDF and HTML outputs. If you check out the source from SVN and want to build the documentation, you need to download the DocBook tools needed to build it. I hate the dependency situations, when to build something you need to locate, download and build dozens of other packages first, and then the versions turn out to be updated, and don't want to work together, and all kinds of hell break loose. To make things easier, I've collected the set of packages that I've used for the build and that are known to work. They've been collected at http://downloads.sourceforge.net/project/triceps/docbook-for-1.0/. The DocBook packages come originally from http://docbook.sf.net, plus a few extra packages that by now I forgot where I've got from. An excellent book on the DocBook tools and their configuration is [Stayton07]. And if you're interested, the text formatting in DocBook is described in [Walsh99].

DocBook is great in the way it takes care of a great many things automatically, but configuring it is plainly a bitch. Fortunately, it's all already taken care of. I've reused for Triceps the infrastructure I've built for my book [Babkin10]. Though some elements got dropped and some added.

Downloading and extraction of the DocBook tools gets taken care of by running

make -C doc/dbtools

These tools are written in Java, and the packages are already the compiled binaries, so they don't need to be built. As long as you have the Java runtime environment, they just run. However like many Java packages, they are sloppy and often don't return the correct return codes on errors. So the results of the build have to be checked visually afterwards.

The build also uses Ghostscript for converting the figures from the EPS format. The luck with the Ghostscript versions also varies. The version 8.70 works for me. I've seen some versions crash on this conversion. Fortunately, it was crashing after the conversion actually succeeded, so a workaround was to ignore the exit code from Ghostscript.

After the tools have been extracted, the build is done by

make -C doc/src

The temporary files are cleaned with

make -C doc/src cleanwork

The results will be in doc/pdf and doc/html.

If like me you plan to use the DocBook tools repeatedly to build the docs for different versions of Triceps, you can download and extract them once in some other directory and then set the exported variable TRICEPS_TOOLS_BASE to point to it.

3.5. Running the examples and simple programs

Overall, the examples live together with unit tests. The primary target language for Triceps is Perl, so the examples from the manual are the Perl examples located in perl/Triceps/t. The files with names starting with x contain the examples as such, like xWindow.t. Usually there are multiple related examples in the same file.

The examples as shown in the manual usually read the inputs from stdin and print their results on stdout. The actual examples in perl/Triceps/t are not quite exactly the same because they are plugged into the unit test infrastructure. The difference is limited to the input/output functions: rather than reading and writing on the stdin and stdout, they take the inputs from variables, put the results into variables, and have the results checked for correctness. This way the examples stay working and do not experience the bit rot when something changes.

Speaking of the examples outputs, the common convention in this manual is to show the lines entered from stdin as bold and the lines printed on stdout as regular font. This way they can be easily told apart, and the effects can be connected to their causes. Like this:

OP_INSERT,1,AAA,10,10
Contents:
  id="1" symbol="AAA" price="10" size="10"
lbAverage OP_INSERT symbol="AAA" id="1" price="10"
OP_INSERT,3,AAA,20,20
Contents:
  id="1" symbol="AAA" price="10" size="10"
  id="3" symbol="AAA" price="20" size="20"
lbAverage OP_INSERT symbol="AAA" id="3" price="15"

The other unit tests in the .t files are interesting too, since they contain absolutely all the possible usages of everything, and can be used as a reference. However they tend to be much more messy and hard to read, precisely because they contain lots of tiny snippets that do everything.

The easiest way to start trying out your own small programs is to place them into the same directory perl/Triceps/t and run them from there. Just name them with the suffix .pl, so that they would not be picked up by the Perl unit test infrastructure (or if you do want to run them as a part of the unit tests, use the suffix .t).

To make your programs find the Triceps modules, start them with

use ExtUtils::testlib;
use Triceps;
use Carp;

The module ExtUtils::testlib takes care of setting the include paths to find Triceps. You can run them from the parent directory, like:

perl t/xWindow.t

The parent directory is the only choice, since ExtUtils::testlib can not set up the include paths properly from the other directories.

3.6. Locale dependency

Some of the Perl tests depend on the locale. They expect the English text in some of the error strings received from the OS and Perl, so if you try to run them in a non-English locale, these tests fail.

To work around this issue, I've added LANG=C in the top-level Makefile, and when the tests run from there, they use this English locale.

However if you run make test directly in the perl/Triceps subdirectory, it has no such override (because the Makefile there is built by Perl). If you run the test from there and use a non-English locale, you'd have to set the locale for the command explicitly:

LANG=C make test

Some of these expected messages might also change between different OSes and between different versions of Perl. They seem pretty stable overall, but you never know when something might change somewhere, and that would lead to the spurious failures that can be ignored. I'd be interested to learn of them, to support all known forms of the messages in the future.

3.7. Installation of the Perl library

If you have the root permissions on the machine and want to install Triceps in the central location, just run

make -C perl/Triceps install

If you don't, there are multiple options. One is to create your private Perl hierarchy in the home directory. If you decide to put it into $HOME/inst, the installation there becomes

mkdir -p $HOME/inst
cp -Rf perl/Triceps/blib/* $HOME/inst/

You can then set the environment variable

export PERL5LIB=$HOME/inst/lib:$HOME/inst/arch

to have your private hierarchy prepended to the Perl's standard library path. You can then insert use Triceps; and the Triceps module will be found. If you want to have the man pages from that directory working too, set

export MANPATH=$HOME/inst:$MANPATH

Not that Triceps has any usable man pages at the moment.

However if you're building a package that uses Triceps and will be shipped to the customer and/or deployed to a production machine, placing the libraries into the home directory is still not the best idea. Not only do you not want to pollute random home directories, you also want to make sure that your libraries get picked up, and not the ones that might happen to be installed on the machine from some other sources (because they may be of different versions, or completely different libraries that accidentally have the same name).

The best idea then is to copy Triceps and all the other libraries into your distribution package, and have the binaries (including the scripts) find them by a relative path.

Suppose you build the package prototype in the $PKGDIR, with the binaries and scripts located in the subdirectory bin, and the Triceps library located in the subdirectory blib. When you build your package, you install the Triceps library in that prototype by

cp -Rf perl/Triceps/blib $PKGDIR/

Then this package gets archived, sent to the destination machine and unarchived. Whatever the package type, tar, cpio or rpm, doesn't matter. The relative paths under it stay the same. For example, if it gets installed under /opt/my_package, the directory hierarchy would look like this:

/opt/my_package
     +- bin
     |  +- my_program.pl
     +- blib
        +- ... Triceps stuff ...

The script my_program.pl can then use the following code at the top to load the Triceps package:

#!/usr/bin/perl

use File::Basename;

# This is the magic sequence that adds the relative include paths.
BEGIN {
  my $mypath = dirname($0);
  unshift @INC, "${mypath}/../blib/lib", "${mypath}/../blib/arch";
}

use Triceps;

It finds its own path from $0, by taking its directory name. Then it adds the relative directories for the Perl modules and XS shared libraries to the include path. And finally it loads Triceps using the modified include path. Of course, more paths for more packages can be added as well. The script can also use that directory (if saved into a global instead of a my variable) to run the other programs later, find the configuration files and so on.

3.8. Installation of the C++ library

There are no special install scripts for the C++ libraries and includes. To build your C++ code with Triceps, simply specify the location of Triceps sources and built libraries with options -I and -L. For example, if you have built Triceps in $HOME/srcs/triceps-1.0.0, you can add the following to your Makefile:

TRICEPSBASE=$(HOME)/srcs/triceps-1.0.0
CFLAGS += -I$(TRICEPSBASE)/cpp -DTRICEPS_NSPR4
LDFLAGS += -L$(TRICEPSBASE)/cpp/build -ltriceps -lnspr4 -pthread

The Triceps include files expect that the Triceps C++ subdirectory is directly in the include path as shown.

The exact set of -D flags and extra -l libraries may vary with the Triceps configuration. To get the exact ones used in the configuration, run the special configuration make targets:

make --quiet -f cpp/Makefile.inc getconf
make --quiet -f cpp/Makefile.inc getxlib

The additions to CFLAGS are returned by getconf. The additional external libraries for LDFLAGS are returned by getxlib. It's important to use the same settings in the build of Triceps itself and of the user programs. The differing settings may cause the program to crash.

If you build your code with the dynamic library, the best packaging practice is to copy the libtriceps.so to the same directory where your binary is located and specify its location with the build flags (for GCC, the flags of other compilers may vary):

LDFLAGS += "-Wl,-rpath='$$ORIGIN/.'"

Or any relative path would do. For example, if your binary package contains the binaries in the subdirectory bin and the libraries in the subdirectory lib, the setting for the path of the libraries relative to the binaries will be:

LDFLAGS += "-Wl,-rpath='$$ORIGIN/../lib'"

But locating the shared libraries by a relative path from the binaries won't work if Triceps and your program ever get ported to Windows: Windows searches for the DLLs only in the same directory as the binary.

Or it might be easier to build your code with the static library: just instead of -ltriceps, link explicitly with $(TRICEPSBASE)/cpp/build/libtriceps.a and the libraries it requires:

LDFLAGS += $(TRICEPSBASE)/cpp/build/libtriceps.a -lpthread -lnspr4

3.9. Disambiguation of the C++ library

A problem with the shared libraries is that you never know which exact library will end up linked at run time. The system library path takes priority over the one specified in -rpath. So if someone has installed a Triceps shared library system-wide, it would be found and used instead of yours. And it might be of a completely different version. Or some other package might have messed with LD_LIBRARY_PATH in the user's .profile, and inserted its path with its own version of Triceps.

Messing with LD_LIBRARY_PATH is bad. The good solution is to give your libraries some unique name, so that it would not get confused. Instead of libtriceps.so, name it something like libtriceps_my_corp_my_project_v_123.so.

Triceps can build the libraries with such names directly. To change the name, edit cpp/Makefile.inc and change

LIBRARY := triceps

to

LIBRARY := triceps_my_corp_my_project_v_123

and it will produce the custom-named library. The Perl part of the build detects this name change automatically and still works (though for the Perl build it doesn't change much, the static C++ Triceps library gets linked into the XS-produced shared library).

There is also a special make target to get back the base name of the Triceps library:

make --quiet -f cpp/Makefile.inc getlib

The other potential naming conflict could happen with both the shared and the static libraries. It appears when you want to link two different versions of the library into the same binary. This is needed rarely, but still needed. If nothing special is done, the symbol names in the two libraries clash and nothing works. Triceps provides a way around it: an opportunity to rename the C++ namespace from the default namespace Triceps. It can be done again by editing cpp/Makefile.inc and modifying the setting TRICEPS_CONF:

TRICEPS_CONF += -DTRICEPS_NS=TricepsMyVersion

Suppose that you have two Triceps versions that you want both to use in the same binary. Suppose that you are building them in $(HOME)/srcs/triceps-1.0.0 and $(HOME)/srcs/triceps-2.0.0.

Then you edit $(HOME)/srcs/triceps-1.0.0/cpp/Makefile.inc and put in there

TRICEPS_CONF += -DTRICEPS_NS=Triceps1

And in $(HOME)/srcs/triceps-2.0.0/cpp/Makefile.inc put

TRICEPS_CONF += -DTRICEPS_NS=Triceps2

If you use the shared libraries, you need to disambiguate their names too, as described above, but for the static libraries you don't have to.

Almost there, but you need to have your code use the different namespaces for different versions too. The good practice is to include in your files

#include <common/Conf.h>

and then use everywhere the Triceps namespace TRICEPS_NS instead of Triceps. Then as long as one source file deals with only one version of Triceps, which version it uses can easily be controlled by providing that version in the include path. And you get your program to work with two versions of Triceps by linking the object files produced from these source files together into one binary. Then you just build some of your files with -I$(HOME)/srcs/triceps-1.0.0/cpp and some with -I$(HOME)/srcs/triceps-2.0.0/cpp and avoid any conflicts or code changes.

At the link time, you will need to link with the libraries from both versions.

3.10. Build configuration settings

Since Triceps has only a very limited autoconfiguration yet, it may need to be configured manually for the target operating system. The same method is used for the build options.

The configuration options are set in the file cpp/Makefile.inc. The extra defines are added in TRICEPS_CONF, the extra library dependencies in TRICEPS_XLIB.

So far the only such configurable library dependency is the NSPR4 library. It's used for its implementation of the atomic integers and pointers. Normally the build attempts to auto-detect the location and name of the library and includes, or otherwise builds without it. Without it the code still works but uses a less efficient implementation of an integer or pointer protected by a mutex. If your system has a version of NSPR4 that doesn't get auto-detected, you can still enable it by changing the settings manually. For example, for Fedora Linux the auto-detected version amounts to the following settings:

TRICEPS_CONF += -DTRICEPS_NSPR -I/usr/include/nspr4
TRICEPS_XLIB += -lnspr4

-DTRICEPS_NSPR tells the code to compile with NSPR support enabled, and the other settings give the location of the includes and of the library.

The other build options require only the -D settings.

TRICEPS_CONF += -DTRICEPS_NS=TricepsMyVersion

Changes the namespace of Triceps.

TRICEPS_CONF += -DTRICEPS_BACKTRACE=false

Disables the use of the glibc stack backtrace library (it's a standard part of glibc nowadays but if you use a non-GNU libc, you might have to disable it). This library is used to make the messages on fatal errors more readable, and let you find the location of the error easier.

Chapter 4. API Fundamentals

4.1. Languages and layers

As mentioned before, at the moment Triceps provides the APIs in C++ and Perl. They are similar but not quite the same, because the nature of the compiled and scripted languages is different. The C++ API is more direct and expects discipline from the programmer: if some incorrect arguments are passed, everything might crash. The Perl API should never crash. It should detect any incorrect use and report an orderly error. Besides, the idioms of the scripted languages are different from the compiled languages, and different usages become convenient.

So far only the Perl API is documented in this manual. It is considered the primary one for the end users, and is also richer and easier to use. The C++ API will be documented as well, it just didn't make the cut for the version 1.0. If you're interested in the C++ API, read the Perl documentation first, to understand the ideas of Triceps, and then look in the source code. The C++ classes have very extensive comments in the header files.

The Perl API is implemented in XS. Some people may wonder: why not SWIG? SWIG would automatically export the API into many languages, not just Perl. The problem with SWIG is that it just maps the API one-to-one. And this doesn't work well: it makes for some very ugly APIs with the ability to crash from the user code, which then have to be wrapped into more scripting code before they become usable. So why bother with SWIG, when it's easier to just use the scripting language's native extension methods? Another benefit of the native XS support is the access to the correct memory management.

In general, I've tried to avoid the premature optimization. The idea is to get it working at all first, and then bother about working fast. Except for the cases when the need for optimization looked obvious, and the logic intertwined with the general design strongly enough that if done one way, it would be difficult to change in the future. We'll see whether these obvious cases really turn out to be the obvious wins, or whether they become a premature-optimization mess.

There is usually more than one way to do something in Triceps. It has been written in layers: there is the C++ API layer at the bottom, then the Perl layer that closely parallels it, then more of the niceties built in Perl. There is more than one way to organize the manual, structuring it by features or by layers. Eventually I went in the order of the major features, discussing each one of them at various layers.

I've also tried to show how these layers are built on top of each other and connected. Which might be too much detail for the first reading. If you feel that something is going over your head, just skim over it. It could be marked more clearly but I don't like this kind of marking. I hate the side-panels in the magazines. I like the text to flow smoothly and sequentially. I don't like the simplifications that distort the real meaning and add all kinds of confusion. I like having all the details I can get, and then I can skip over the ones that look too complicated (and read them again when they start making sense).

Also, a major goal of Triceps is the extendability. And the best way to learn how to extend it, is by looking up close at how it has already been extended.

4.2. Errors, deaths and confessions

When the Perl API of Triceps detects an error, it makes the interpreter die with an error message. Unless of course you catch it with eval. The message includes the call stack, as the method Carp::confess() would. confess() is a very useful method that helps a lot with finding the source of the problem; it's much better than the plain die(). Triceps internally uses the methods from Carp to build the stack trace in the message. But it also does one better: it includes the stack of the Triceps label calls into the trace.

You are welcome to use confess directly as well, it's typically done in the following pattern:

&someFunction() or confess "Error message";
&someFunction() or confess "Error message: $!";

This is what the Triceps methods implemented in Perl do. The variable $! contains the error messages from the methods that deal with the system errors. To require the package with confess, do:

use Carp;

The full description of Carp is available at http://perldoc.perl.org/Carp.html. It has more functions, however I find the full stack trace the most helpful thing in any case.

There also are modules to make all the cases of die work like confess, Devel::SimpleTrace and Carp::Always. They work by intercepting the pseudo-signals __WARN__ and __DIE__. The logic of Carp::Always is pretty simple, see http://cpansearch.perl.org/src/FERREIRA/Carp-Always-0.11/lib/Carp/Always.pm, so if you're not feeling like installing the module, you can easily do the same directly in your code.
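A minimal version of that trick can look like this (a simplified sketch of the idea behind Carp::Always, not its exact code):

```perl
use Carp;

# Make every plain die() carry a full stack trace, like confess().
$SIG{__DIE__} = sub {
  my $err = shift;
  die $err if ref $err;        # leave the exception objects alone
  die Carp::longmess($err);    # re-die with the stack trace appended
};
```

Put this near the top of your program, and every uncaught die() will then report the full call stack.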

If you want to intercept the error to add more information to the message, use eval:

eval { $self->{unit}->call($rowop) }
  or confess "Bad rowop argument:\n$@";

I have some better ideas about reporting the errors in the nested templates but they are yet to be implemented and tried out.

A known problem with confess in a threaded program is that it leaks the scalars, apparently by leaving garbage on the Perl stack, even when intercepted with eval. It's actually not a problem when the confession is not intercepted, since then the program exits anyway. But if confessing frequently and catching these confessions, the leak can accumulate to something noticeable.

The problem seems to be in the line

package DB;

in the middle of one of its internal functions. Perhaps changing the package in the middle of a function is not such a great idea, leaving some garbage on the stack. The most interesting part is that this line can be removed altogether, with no adverse effects, and then the leak stops. So be warned and don't be surprised. Maybe it will get fixed.

Now let's look at how the C++ parts of Triceps interact with confessions. When the Perl code inside a label or tracer or aggregator or index sorting handler dies, the C++ infrastructure around it catches the error. It unrolls the stack trace through the C++ code and passes the die request to the Perl code that called it. If that Perl code was called through another Triceps C++ code, that C++ code will catch the error and continue unrolling the stack and reporting back to Perl. When one Perl label calls another Perl label that calls the third Perl label, the call sequence goes in layers of Perl—C++—Perl—C++—Perl—C++—Perl. If that last label has its Perl code die and there are no evals in between, the stack will be correctly unwound back through all these layers and reported in the error message. The C++ code will include the reports of all the chained label calls as well. If one of the intermediate Perl layers wraps the call in eval, it will receive the error message with the stack trace up to that point.

More of the error handling details will be discussed later in Section 7.5: “Error handling during the execution” .

4.3. Memory management fundamentals

The memory is managed in Triceps using the reference counters. Each Triceps object has a reference counter in it. In C++ this is done explicitly, in Perl it gets mostly hidden behind the Perl memory management that also uses the reference counters. Mostly.

In C++ the Autoref template is used to produce the reference objects. The memory management at the C++ level is described in more detail in Section 20.3: “Memory management in the C++ API and the Autoref reference” . As the references are copied around between these objects, the reference counts in the target objects are automatically adjusted. When the reference count drops to 0, the target object gets destroyed. While there are live references, the object can't get destroyed from under them. All nice and well and simple, however still possible to get wrong.

The major problem with the reference counters is the reference cycles. If object A has a reference to object B, and object B has a reference (possibly indirect) to object A, then neither of them will ever be destroyed. Many of these cases can be resolved by keeping a reference in one direction and a plain pointer in the other. This of course introduces the problem of hanging pointers, so extra care has to be taken to not reference them. There also are the unpleasant situations when there is absolutely no way around the reference cycles. For example, a Triceps label may keep a reference to the next label, to which it sends its processed results. If the labels are connected into a loop (a perfectly normal occurrence), this would cause a reference cycle. Here the way around is to know when all the labels are no longer used (before the thread exit), and explicitly tell them to clear their references to the other labels. This breaks up the cycle, and then the bits and pieces can be collected by the reference count logic.

The reference cycle problem can be seen all the way up into the Perl level. However Triceps provides the ready solutions for its typical occurrences. To explain them, more about the Triceps operation has to be explained first, so they are described in detail later in Chapter 8: “Memory Management”.

The reference counting may be single-threaded or multi-threaded. If an object may only be used inside one thread, the references to it use the faster single-threaded counting. In C++ it's really important to not access and not reference the single-threaded objects from multiple threads. In Perl, when a new thread is created, only the multithreaded objects from the parent thread become accessible for it, the rest become undefined, so the issue gets handled automatically (as of version 1.0 even the potentially multithreaded objects are still exported to Perl as single-threaded, with no connection between threads yet).

The C++ objects are exported into Perl through wrappers. The wrappers perform the adaptation between Perl reference counting and Triceps reference counting, and sometimes more of the helper functions. Perl sees them as blessed objects, from which you can inherit and otherwise treat like normal objects.

When we say that a Perl variable $label contains a Triceps label object, it really means that it contains a reference to a label object. When it gets copied like $label2 = $label, this copies the reference, and now both variables refer to the same label object (more exactly, even to the same wrapper object). Any changes to the object's state done through one reference will also be visible through the other reference.

When the Perl references are copied between the variables, this increases the Perl reference count to the same wrapper object. However if an object goes into the C++ land, and then is extracted back (such as creating a Rowop from a Row, and then extracting the Row from that Rowop), a brand new wrapper gets created. It's the same underlying C++ object but with multiple wrappers. You can't tell that it's the same object by comparing the Perl references, because they may be pointing to different wrappers. However Triceps provides the method same() that compares the data inside the wrappers. It can be used as

$row1->same($row2)

and if it returns true, then both $row1 and $row2 point to the same underlying row.

Note also that if you inherit from the Triceps objects and add some extra data to them, none of that data, nor even your derived class's identity, will be preserved when a new wrapper is created from the underlying C++ object.

4.4. Code references and snippets

Many of the Triceps Perl API objects accept the Perl code arguments, to be executed as needed. This code can be specified as either a function reference or a string containing the source code snippet. The major reason to accept the arguments in the source code format is the ability to pass them through between the threads, which cannot be done with the compiled code. See more information on that in Section 16.4: “Object passing between threads” .

Only a few of the classes can be exported between the threads but for consistency all the classes support the code arguments in either format. This feature is built into the general way the Triceps XS methods handle the code references.

The following examples are equivalent, one using a function reference, another using a source code snippet. Of course, if you know that the created object will be exported to another thread, you must use the source code format. Otherwise you can take your pick.

$it = Triceps::IndexType->newPerlSorted("b_c", undef,
  sub {
    my $res = ($_[0]->get("b") <=> $_[1]->get("b")
      || $_[0]->get("c") <=> $_[1]->get("c"));
    return $res;
  }
);

$it = Triceps::IndexType->newPerlSorted("b_c", undef,
  '
    my $res = ($_[0]->get("b") <=> $_[1]->get("b")
      || $_[0]->get("c") <=> $_[1]->get("c"));
    return $res;
  '
);

As you can see, when specifying the handler as source code, you must specify only the function body, and the sub { ... } will be wrapped around it implicitly. Including the sub would be an error.

There are other differences between the code references and the source code format:

When you compile a function, it carries its lexical context with it. So you can make closures that refer to the my variables in their lexical scope. With the source code snippets you can't do this. The source code gets compiled in the context of the main package, and that's all it can see. In some cases it might not even be compiled immediately: if an object has an explicit initialization, the code snippets get compiled at the initialization time. And if the object is exported to another thread, the code snippets will be re-compiled when the object's copy is created and initialized in that other thread. Remember also that the global variables are not shared between the threads, so if you refer to a global variable in the code snippet and rely on a value in that variable, it won't be present in the other threads (unless the other threads are direct descendants and the value was set before their creation).
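For instance, a closure can capture a lexical variable, while the equivalent source snippet has to get the same value from somewhere global. A hypothetical sketch against the newPerlSorted() API shown earlier (the variable names are made up for the illustration):

```perl
my $scale = -1;  # sort in the descending order

# The closure sees the lexical $scale from its scope.
$it = Triceps::IndexType->newPerlSorted("by_b", undef,
  sub { $scale * ($_[0]->get("b") <=> $_[1]->get("b")); }
);

# The source snippet can't: it compiles in the main package,
# so it has to use a package-level variable instead.
our $SCALE = -1;
$it = Triceps::IndexType->newPerlSorted("by_b", undef,
  '$main::SCALE * ($_[0]->get("b") <=> $_[1]->get("b"));'
);
```

And if the snippet version gets re-compiled in another thread, $main::SCALE has to be set in that thread too.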

The code written in Perl can make use of the source code snippets as well. If it just passes these code arguments to the XS methods, it will get this support automatically. But if it wants to call these snippets directly from the Perl code, Triceps provides a convenience method that would accept the code in either format and compile it if needed:

$code = Triceps::Code::compile($code_ref_or_source);

It takes either a code reference or a source code string as an argument and returns the reference to the compiled code. If the argument was a code reference, it just passes through unchanged. If it was a source code snippet, it gets compiled (and the rules are the same, the text gets the sub { ... } wrapper added around it implicitly).

If the argument was an undef, it also passes through unchanged. This is convenient in case the code is optional. But if it isn't optional, the caller should check for undef.

If the compilation fails, the method confesses, and includes the error and the source code into the message, in the same way as the XS methods do.

The optional second argument can be used to provide information about the meaning of the code for the error messages. If it's undefined then the default is "Code snippet":

$code = Triceps::Code::compile($code_ref_or_source, $description);

For example, if the code represents an error handler, the call can be done as follows:

$code = Triceps::Code::compile($code, "Error handler");

4.5. Triceps constants

Triceps has a number of symbolic constants, grouped into what are essentially enums. The constants themselves will be introduced with the classes that use them, but here is the general description common to them all.

In Perl they all are placed into the same namespace. Each group of constants (that can be thought of as an enum) gets its name prefix. For example, the operation codes are all prefixed with OP_, the enqueueing modes with EM_, and so on.

The underlying constants are all integer. The way to give symbolic names to constants in Perl is to define a function without arguments that would return the value. Each constant has such a function defined for it. For example, the opcode for the insert operation is the result of function Triceps::OP_INSERT.

Most methods that take constants as arguments are also smart enough to recognise the constant names as strings, and automatically convert them to integers. For example, the following calls are equivalent:

$label->makeRowop(&Triceps::OP_INSERT, ...);
$label->makeRowop("OP_INSERT", ...);

For a while I've thought that the version with Triceps::OP_INSERT would be more efficient and might check for correctness of the name at compile time. But as it turns out, no, on both counts. The look-up of the function by name happens at run time, so there is no compile-time check. And that look-up happens to be a little slower than the one done by the Triceps C++ code, so there is no win there either. The string version is not only shorter but also more efficient. The only win with the function is if you call it once, remember the result in a variable and then reuse that variable. Unless you're chasing the last few percent of performance in a tight loop, it's not worth the trouble. Perhaps in the future the functions will be replaced with the module-level variables: that would be both faster and allow the compile-time checking with use strict.
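The caching pattern mentioned above looks like this (a sketch; @rows, $label and $unit are assumed to be defined elsewhere):

```perl
# Look up the constant once, outside the loop...
my $OP_INSERT = &Triceps::OP_INSERT;

foreach my $row (@rows) {
  # ...and reuse the cached value on every iteration, avoiding
  # both the run-time function look-up and the string conversion.
  my $rop = $label->makeRowop($OP_INSERT, $row);
  $unit->call($rop);
}
```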

What if you need to print out a constant in a message? Triceps provides the conversion functions for each group of constants. They generally are named Triceps::somethingString. For example,

print &Triceps::opcodeString(&Triceps::OP_INSERT);

would print OP_INSERT. If the argument is out of range of the valid enums, it would confess. There is also a version of these functions ending with Safe:

print &Triceps::opcodeStringSafe(&Triceps::OP_INSERT);

The difference is that it returns undef if the input value is out of range, thus being safe from confessions.

There also are functions to convert from strings to constant values. They generally are named Triceps::stringSomething. For example,

&Triceps::stringOpcode("OP_INSERT")
&Triceps::stringOpcodeSafe("OP_INSERT")

would return the integer value of Triceps::OP_INSERT. If the string name is not valid for this kind of constant, the function would also either confess (without Safe in the name) or return undef (with it).

4.6. Printing the object contents

When debugging the programs, it's important to find out from the error messages what is going on and what kinds of objects are involved. Because of this, many of the Triceps objects provide a way to print out their contents into a string. This is done with the method print(). The simplest use is as follows:

$message = "Error in object " . $object->print();

Most of the objects tend to have a pretty complicated internal structure and are printed on multiple lines. They look better when the components are appropriately indented. The default call prints as if the basic message is un-indented, and indents every extra level by 2 spaces.

This can be changed with extra arguments. The general format of print() is:

$object->print([$indent, [$subindent] ])

where $indent is the initial indentation, and $subindent is the additional indentation for every level. The default print() is equivalent to print("", "  ").

A special case is

$object->print(undef)

It prints the object in a single line, without line breaks.

Here is an example of how a row type object would get printed. The details of the row types will be described later, for now just assume that a row type is defined as:

$rt1 = Triceps::RowType->new(
  a => "uint8",
  b => "int32",
  c => "int64",
  d => "float64",
  e => "string",
);

Then $rt1->print() produces:

row {
  uint8 a,
  int32 b,
  int64 c,
  float64 d,
  string e,
}

With extra arguments $rt1->print("++", "--"):

row {
++--uint8 a,
++--int32 b,
++--int64 c,
++--float64 d,
++--string e,
++}

The first line doesn't have a ++ because the assumption is that the text gets appended to some other text already on this line, so any prefixes are used only for the following lines.

And finally with an undef argument $rt1->print(undef):

row { uint8 a, int32 b, int64 c, float64 d, string e, }

The Rows and Rowops do not have the print() method. That's largely because the C++ code does not deal with printing the actual data, this is left to the Perl code. So instead they have the method printP() that does a similar job, only simpler, without any of the indenting niceties: it always prints the data in a single line. The P in printP stands for Perl, and the name is also different because of this lack of indenting niceties. See more about it in Section 5.4: “Rows”.

4.7. The Hungarian notation

The Hungarian notation is the idea that the name of each variable should be prefixed with some abbreviation of its type. It has probably become most widely known from the Microsoft operating systems.

Overall it's a complete abomination and brain damage. But I'm using it widely in the examples in this manual. Why? The problem is that there are usually too many components for one logical purpose. For a table, there would be a row type, a table type, and the table itself. Rather than inventing separate names for them, it's easier to have a common name and a uniform prefix. Eventually something better would have to be done but for now I've fallen back on the Hungarian notation. One possibility is to just not give names to the intermediate entities: say, just have a named table, and then there would be the type of the table and the row type of the table.

Among the CEP systems, Triceps is not unique in the Hungarian notation department. Coral8/Sybase CCL has this mess of lots of schemas, input streams, windows and output streams, with the same naming problems. The uniform naming prefixes or suffixes help make this mess more navigable. I haven't actually used StreamBase but from reading the documentation I get the feeling that the Hungarian notation is probably useful for its SQL as well.

4.8. The Perl libraries and examples

The official Triceps classes are collected in the Triceps package (and its subpackages).

However when writing tests and examples I've found that there are also some repeating elements. Initially I've been handling the situation by either combining all examples using such an element into a single file or by copying it around. Then I've collected all such fragments under the package Triceps::X. X can be thought of as a mark of eXperimental, eXample, eXtraneous code.

While the code in the official part of the library is extensively tested, the X-code is tested only in its most important functionality and not in the details. This code is not exactly of production quality but is good enough for the examples, and can be used as a starting point for development of better code. Quite a few fragments of Triceps went this way: the joins have been done as an example first and then solidified for the main code base, and so did the aggregation.

One of these modules is Triceps::X::TestFeed. It's a small infrastructure to run the examples, pretending that it gets the input from stdin and sends output to stdout, while actually doing it all in memory. All of the more complicated examples have been written to use it. When you look in the code of the actual running examples and compare it to the code snippets in the manual, you can see the differences: a &readLine shows up instead of <STDIN>, and a &send instead of print (and for the manual, I have a script that does the reverse substitutions automatically when I insert the code examples into it).

Chapter 5. Rows

In Triceps the relational data is stored and passed around as rows (once in a while I call them records, which is the same thing here). Each row belongs to a certain type that defines the types of its fields. Each field may belong to one of the simple types.

5.1. Simple types

The simple values in Triceps belong to one of the simple types:

  • uint8
  • int32
  • int64
  • float64
  • string

I like the explicit specification of the data size, so it's not some mysterious double but an explicit float64.

When the data is stored in the rows, it's stored in the strongly-typed binary format. When it's extracted from the rows for the Perl code to access, it gets converted into the Perl values. And the other way around, when stored into the rows, the conversion is done from the Perl values.

uint8 is the type intended to represent the raw bytes. So, for example, when they are compared, they should be compared as raw bytes, not according to the locale. Since Perl stores the raw bytes in strings, and its pack() and unpack() functions operate on strings, the Perl side of Triceps extracts the uint8 values from records into Perl strings, and the other way around.

The string type is intended to represent a text string in whatever current locale (at some point it may become always UTF-8, this question is open for now).

Perl on the 32-bit machines has an issue with int64: it has no type to represent it directly. Because of that, when the int64 values are passed to Perl on the 32-bit machines, they are converted into the floating-point numbers. This gives only 54 bits (including sign) of precision, but that's close enough. Anyway, the 32-bit machines are obsolete by now, and Triceps is targeted towards the 64-bit machines.

On the 64-bit machines both int32 and int64 translate to the Perl 64-bit integers.

Note that there is no special type for timestamps. As of version 1.0 there is no time-based processing inside Triceps, but that does not prevent you from passing around timestamps as data and using them in your logic. Just store the timestamps as integers (or, if you prefer, as floating point numbers). When the time-based processing is added to Perl, the plan is to still use the int64 to store the number of microseconds since the Unix epoch. My experience with the time types in the other CEP systems is that they cause nothing but confusion. In the meantime, the time-based processing is still possible by driving the notion of time explicitly. It's described in Chapter 13: “Time processing”.
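For example, a microsecond timestamp suitable for an int64 field can be produced with the standard Time::HiRes module (a sketch; the field name ts is made up for the illustration):

```perl
use Time::HiRes;

# Number of microseconds since the Unix epoch, fits in an int64 field.
my $ts = int(Time::HiRes::time() * 1_000_000);

# It can then be stored in a row like any other int64 value:
# $row = $rowType->makeRowHash(ts => $ts, ...);
```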

5.2. Row types

A row type is created from a sequence of (field-name, field-type) string pairs, for example:

$rt1 = Triceps::RowType->new(
  a => "uint8",
  b => "int32",
  c => "int64",
  d => "float64",
  e => "string",
);

Even though the pairs look like a hash, don't use an actual hash to create row types! The order of pairs in a hash is unpredictable, while the order of fields in a row type usually matters.

In an actual row the field may have a value or be NULL. The NULLs are represented in Perl as undef.

The real-world records tend to be pretty wide and contain repetitive data. Hundreds of fields are not unusual, and I know of a case when an Aleri customer wanted to have records of two thousand fields (and succeeded). This just begs for arrays. So the Triceps rows allow the array fields. They are specified by adding [] at the end of the field type. The arrays may only be made up of fixed-width data, so no arrays of strings.

$rt2 = Triceps::RowType->new(
  a => "uint8[]",
  b => "int32[]",
  c => "int64[]",
  d => "float64[]",
  e => "string", # no arrays of strings!
);

The arrays are of variable length, whatever array data passed when a row is created determines its length. The individual elements in the array may not be NULL (and if undefs are passed in the array used to construct the row, they will be replaced with 0s). The whole array field may be NULL, and this situation is equivalent to an empty array.

The type uint8 is typically used in arrays, and uint8[] is the Triceps way to define a blob field. In Perl the uint8[] is represented as a string value, same as a simple uint8.

The rest of the array values are represented in Perl as references to Perl arrays containing the actual values.

The row type objects provide a way for introspection:

$rt->getdef()

returns back the array of pairs used to create this type. It can be used among other things for the schema inheritance. For example, the multi-part messages with daily unique ids can be defined as:

$rtMsgKey = Triceps::RowType->new(
  date => "string",
  id => "int32",
);

$rtMsg = Triceps::RowType->new(
  $rtMsgKey->getdef(),
  from => "string",
  to => "string",
  subject => "string",
);

$rtMsgPart = Triceps::RowType->new(
  $rtMsgKey->getdef(),
  type => "string",
  payload => "string",
);

The meaning here is the same as in the CCL example:

create schema rtMsgKey (
  string date,
  integer id
);
create schema rtMsg inherits from rtMsgKey (
  string from,
  string to,
  string subject
);
create schema rtMsgPart inherits from rtMsgKey (
  string type,
  string payload
);

The grand plan is to provide some better ways of defining the commonality of fields between row types. It should include the ability to rename fields, to avoid conflicts, and to remember this equivalence to be reused in the further joins without the need to write it over and over again. But it has not come to the implementation stage yet.

The other methods are:

$rt->getFieldNames()

returns the array of field names only.

$rt->getFieldTypes()

returns the array of field types only.

$rt->getFieldMapping()

returns the array of pairs that map the field names to their indexes in the field definitions. It can be stored into a hash and used for name-to-index translation. It's used mostly in the templates, to generate code that accesses data in the rows by field index (which is more efficient than access by name). For example, for rtMsgKey defined above it would return (date => 0, id => 1).
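A sketch of how that mapping is typically used, with the rtMsgKey type defined above (assuming the row method toArray() that unpacks a row into the array of its field values, in the field order):

```perl
# Build the name-to-index hash once.
my %idx = $rtMsgKey->getFieldMapping();   # (date => 0, id => 1)

# Then the generated code can access fields by index,
# which is more efficient than access by name.
my @values = $row->toArray();
my $id = $values[$idx{id}];
```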

5.3. Row types equivalence

The Triceps objects are usually strongly typed. A label handles rows of a certain type. A table stores rows of a certain type.

However there may be multiple ways to check whether a row fits for a certain type:

  • It may be a row of the exact same type, created with the same RowType object.
  • It may be a row of another type but one with the exact same definition.
  • It may be a row of another type that has the same number of fields and field types but different field names. The field names (and everything else in Triceps) are case-sensitive.

The row types may be compared for these conditions using the methods:

$rt1->same($rt2)
$rt1->equals($rt2)
$rt1->match($rt2)

The comparisons are hierarchical: if two type references are the same, they would also be equal and matching; two equal types are also matching.

Most of the objects would accept the rows of any matching type (this may change or become adjustable in the future). However if the rows are not of the same type, this check involves a performance penalty. If the types are the same, the comparison is limited to comparing the pointers. But if not, then the whole type definition has to be compared. So every time a row of a different type is passed, it would involve the overhead of type comparison.

For example:

my @schema = (
  a => "int32",
  b => "string"
);

my $rt1 = Triceps::RowType->new(@schema);
# $rt2 is equal to $rt1: same field names and field types
my $rt2 = Triceps::RowType->new(@schema);
# $rt3  matches $rt1 and $rt2: same field types but different names
my $rt3 = Triceps::RowType->new(
  A => "int32",
  B => "string"
);

my $lab = $unit->makeDummyLabel($rt1, "lab");
# same type, efficient
my $rop1 = $lab->makeRowop(&Triceps::OP_INSERT,
  $rt1->makeRowArray(1, "x"));
# different row type, involves a comparison overhead
my $rop2 = $lab->makeRowop(&Triceps::OP_INSERT,
  $rt2->makeRowArray(1, "x"));
# different row type, involves a comparison overhead
my $rop3 = $lab->makeRowop(&Triceps::OP_INSERT,
  $rt3->makeRowArray(1, "x"));

A dummy label used here is a label that does nothing (its usefulness will be explained later).
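Following the rules above, the row types defined in this example compare as (a sketch):

```perl
# $rt1 and $rt2 were created separately, so they are different
# type objects, but their definitions are identical.
$rt1->same($rt1);     # true: the very same type object
$rt1->same($rt2);     # false: a different object
$rt1->equals($rt2);   # true: same field names and field types
$rt1->match($rt2);    # true: equal types always match
$rt1->equals($rt3);   # false: the field names differ (case matters)
$rt1->match($rt3);    # true: same field types in the same order
```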

Once the Rowop is constructed, no further penalty is involved: the row in the Rowop is re-typed to the type of the label from now on. It's physically still the same row with another reference to it, but when you get it back from the Rowop, it will have the label's type. It's all a part of the interesting interaction between C++ and Perl. All the type checking is done in the Perl XS layer. The C++ code just expects that the data is always right and doesn't carry the types around. When the Perl code wants to get the row back from the Rowop, it wants to know the type of the row. The only way to get it is to look, what is the label of this Rowop, and get the row type from the label. This is also the reason why the types have to be checked when the Rowop is constructed: if a wrong row is placed into the Rowop, there will be no later opportunity to check it for correctness, and bad data may cause a crash.

5.4. Rows

The rows in Triceps always belong to some row type, and are always immutable. Once a row is created, it can not be changed. This allows it to be referenced from multiple places, instead of copying the whole row value. Naturally, a row may be passed and shared between multiple threads.

The row type provides the constructor methods for the rows:

$row = $rowType->makeRowArray(@fieldValues);
$row = $rowType->makeRowHash($fieldName => $fieldValue, ...);

Here $row is a reference to the resulting row. As usual, in case of error it will confess.

In the array form, the values for the fields go in the same order as they are specified in the row type (if there are too few values, the rest will be considered NULL, having too many values is an error).

The Perl value of undef is treated as NULL.

In the hash form, the fields are specified as name-value pairs. If the same field is specified multiple times, the last value will overwrite all the previous ones. The unspecified fields will be left as NULL. Again, the arguments of the function actually are an array, but if you pass a hash, its contents will be converted to an array on the call stack.
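This last-value-wins behavior follows directly from how Perl flattens name-value pairs into a list and then collects them into a hash. A short pure-Perl sketch (not using Triceps itself) demonstrates the same semantics:

```perl
use strict;
use warnings;

# Name-value pairs flatten into a plain list; when collected into a
# hash, a repeated name keeps only the last value, just like the
# arguments of makeRowHash().
my @pairs = (a => 1, b => "x", a => 2);
my %fields = @pairs;

print $fields{a}, "\n";  # 2: the later value wins
print $fields{b}, "\n";  # x
```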

If performance is important, the array form is more efficient, since the hash form has to translate the field names to indexes internally.

The row itself and its type don't have any concept of keys in general and of the primary key in particular. So any fields may be left as NULL. There is no NOT NULL constraint.

Some examples:

$row  = $rowType->makeRowArray(@fields);
$row  = $rowType->makeRowArray($a, $b, $c);
$row  = $rowType->makeRowHash(%fields);
$row  = $rowType->makeRowHash(a => $a, b => $b);

The usual Perl conversions are applied to the values. So for example, if you pass an integer 1 for a string field, it will be converted to the string "1". Or if you pass a non-numeric string for an integer field, it will be converted to 0.
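These are just the standard Perl string-number conversions, which can be seen in plain Perl without any Triceps involvement:

```perl
use strict;
use warnings;
no warnings 'numeric';  # numifying a non-numeric string warns otherwise

# The same conversions Perl applies everywhere: numbers stringify,
# strings numify. A leading numeric part is kept, the rest is dropped,
# and a fully non-numeric string becomes 0.
my $as_string = "" . 1;        # "1"
my $as_number = 0 + "hello";   # 0
my $partial   = 0 + "12abc";   # 12: only the leading numeric part

print "$as_string $as_number $partial\n";  # 1 0 12
```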

If a field is an array (as always, except for uint8[] which is represented as a Perl string), its value is a Perl array reference (or undef). For example:

$rt1 = Triceps::RowType->new(
  a => "uint8[]",
  b => "int32[]",
);
$row = $rt1->makeRowArray("abcd", [1, 2, 3]);

An empty array will become a NULL value. So the following two are equivalent:

$row = $rt1->makeRowArray("abcd", []);
$row = $rt1->makeRowArray("abcd", undef);

Remember that an array field may not contain NULL values. Any undefs in the array fields will be silently converted to zeroes (since arrays are supported only for the numeric types, a zero value would always be available for all of them). The following two are equivalent:

$row = $rt1->makeRowArray("abcd", [undef, undef]);
$row = $rt1->makeRowArray("abcd", [0, 0]);

The row also provides a way to copy itself, modifying the values of selected fields:

$row2 = $row1->copymod($fieldName => $fieldValue, ...);

The fields that are not explicitly specified will be left unchanged. Since the rows are immutable, this is the closest thing to a field assignment. copymod() is generally more efficient than extracting the row into an array or hash, replacing a few of the fields with new values, and constructing a new row. It bypasses the binary-to-Perl-to-binary conversions for the unchanged fields.
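The semantics of copymod() can be sketched in pure Perl on a hash stand-in for a row (the real copymod() of course works on the binary row format and skips the Perl conversions for the unchanged fields; copymod_sketch here is a hypothetical helper, not part of the Triceps API):

```perl
use strict;
use warnings;

# A hypothetical stand-in: a "row" as a hash reference, copied with
# some fields replaced, the original left untouched.
sub copymod_sketch {
    my ($row, %changes) = @_;
    my %copy = (%$row, %changes);  # later values override earlier ones
    return \%copy;
}

my $row1 = { a => 1, b => "x", c => "y" };
my $row2 = copymod_sketch($row1, b => "z");

print $row2->{b}, "\n";  # z: the modified field
print $row1->{b}, "\n";  # x: the original stays immutable
```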

The row knows its type, which can be obtained with

$row->getType()

Note that this will create a new Perl wrapper to the underlying type object. So if you do:

$rt1 = ...;
$row = $rt1->makeRow...;
$rt2 = $row->getType();

then $rt1 will not be equal to $rt2 by the direct Perl comparison ($rt1 != $rt2). However both $rt1 and $rt2 will refer to the same row type object, so $rt1->same($rt2) will be true.

The row references can also be compared for sameness:

$row1->same($row2)

The row contents can be extracted back into Perl representation as

@adata = $row->toArray();
%hdata = $row->toHash();

Again, the NULL fields will become undefs, and the array fields (unless they are NULL) will become Perl array references. Since the empty array fields are equivalent to NULL array fields, on extraction back they will be treated the same as NULL fields, and become undefs.

There is also a convenience function to get one field from a row at a time by name:

$value = $row->get("fieldName");

If you need to access only a few fields from a big row, get() is more efficient (and easier to write) than extracting the whole row with toHash() or even with toArray(). But don't forget that every time you call get(), it creates a new Perl value, which may be pretty involved if the value is an array. So for the values that get reused many times, the most efficient way is to call get() once, remember the result in a Perl variable, and then reuse that variable.

There is also a way to conveniently print a row's contents, usually for the debugging purposes:

$result = $row->printP();

The name printP is an artifact of implementation: it shows that this method is implemented in Perl and uses the default Perl conversions of values to strings. The uint8[] arrays are printed directly as strings. The result is a sequence of name="value" or name=["value", "value", "value"] for all the non-NULL fields. The backslashes and double quotes inside the values are escaped by backslashes in Perl style. For example, reusing the row type above,

$row = $rt1->makeRowArray('ab\ "cd"', [0, 0]);
print $row->printP(), "\n";

will produce

a="ab\\ \"cd\"" b=["0", "0"]

It's possible to check quickly if all the fields of a row are NULL:

$result = $row->isEmpty();

It returns 1 if all the fields are NULL and 0 otherwise.

Finally, there is a deep debugging method:

$result = $row->hexdump()

That dumps the raw bytes of the row's binary format, and is useful only for debugging the more obscure issues.

Chapter 6. Labels and Row Operations

6.1. Labels basics

In each CEP engine there are two kinds of logic: One is to get some request, look up some state, maybe update some state, and return the result. The other has to do with the maintenance of the state: make sure that when one part of the state is changed, the change propagates consistently through the rest of it. If we take a common RDBMS for an analog, the first kind would be like the ad-hoc queries, the second kind would be like the triggers. The CEP engines are very much like database engines driven by triggers, so the second kind tends to account for a lot of code.

The first kind of logic is often very nicely accommodated by the procedural logic. The second kind often (but not always) can benefit from a more relational, SQLy definition. However the SQLy definitions don't stay SQLy for long. When every SQL statement executes, it gets compiled first into the procedural form, and only then executes as the procedural code.

The Triceps approach is tilted toward the procedural execution. That is, the procedural definitions come out of the box, and then the high-level relational logic can be defined on top of them with the templates and code generators.

These bits of code, especially where the first and second kind connect, need some way to pass the data and operations between them. In Triceps these connection points are called Labels.

The streaming data rows enter the procedural logic through a label. Each row causes one call on the label. From the functional standpoint they are the same as Coral8 Streams, as has been shown in Section 1.4: “We're not in 1950s any more, or are we?” . Except that in Triceps the labels receive not just rows but operations on rows, as in Aleri: a combination of a row and an operation code.

They are named labels because Triceps has been built around the more procedural ideas, and when looked at from that side, the labels are targets of calls and GOTOs.

If the streaming model is defined as a data flow graph, each arrow in the graph is essentially a GOTO operation, and each node is a label.

A Triceps label is not quite a GOTO label, since the actual procedural control always returns back after executing the label's code. It can be thought of as a label of a function or procedure. But if the caller does nothing but immediately return after getting the control back, it works very much like a GOTO label.

Each label accepts operations on rows of a certain type.

Each label belongs to a certain execution unit, so a label can be used only strictly inside one thread and can not be shared between threads.

Each label may have some code to execute when it receives a row operation. The labels without code can be useful too.

A Triceps model contains the straightforward code and the more complex stateful elements, such as tables, aggregators, joiners (which may be implemented in C++ or in Perl, or created as user templates). These stateful elements would have some input labels, where the actions may be sent to them (and the actions may also be done as direct method calls), and output labels, where they would produce the indications of the changed state and/or responses to the queries. This is shown in the diagram in Figure 6.1. The output labels are typically the ones without code (dummy labels). They do nothing by themselves, but can pass the data to the other labels. This passing of data is achieved by chaining the labels: when a label is called, it will first execute its own code (if it has any), and then call the same operation on whatever labels are chained from it. Which may have more labels chained from them in turn. So, to pass the data, chain the input label of the following element to the output label of the previous element.

Stateful elements with chained labels.

Figure 6.1. Stateful elements with chained labels.


To make things clear, a label doesn't have to be a part of a stateful element. The labels absolutely can exist by themselves. It's just that the stateful elements can use the labels as their endpoints.
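The call-first-then-chain order can be modeled with plain Perl code references. This is a conceptual sketch of the chaining behavior described above, not the actual Triceps implementation (which lives in C++ and uses real Label objects):

```perl
use strict;
use warnings;

# A toy "label": a name, optional own code, and a list of chained
# labels. Calling it runs its own code first (if any), then calls the
# chained labels in order, which may recurse into their own chains.
sub make_label {
    my ($name, $code, $log) = @_;
    return { name => $name, code => $code, chain => [], log => $log };
}
sub call_label {
    my ($label, $row) = @_;
    push @{ $label->{log} }, $label->{name};  # trace the call order
    $label->{code}->($row) if defined $label->{code};
    call_label($_, $row) for @{ $label->{chain} };
}

my @log;
my $out = make_label("out", undef, \@log);   # a "dummy" output label
my $in  = make_label("in", sub { }, \@log);  # next element's input
push @{ $out->{chain} }, $in;                # chain "in" off "out"
call_label($out, { a => 1 });
print "@log\n";  # out in
```

In real Triceps code the chaining step would be done with $label1->chain($label2), as described in the next sections.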

6.2. Label construction

The execution unit provides methods to construct labels. A dummy label is constructed as:

$label = $unit->makeDummyLabel($rowType, "name");

It takes as arguments the type of rows that the label will accept and the symbolic name of the label. As usual, the name can be anything, but for ease of debugging it's better to give it the same name as the label variable.

The label with Perl code is constructed as follows:

$label = $unit->makeLabel($rowType, "name", $clearSub,
  $execSub, @args);

The row type and name arguments are the same as for the dummy label. The following two arguments provide the references to the Perl functions that perform the actions. They can be specified as a function reference or a source code string, see Section 4.4: “Code references and snippets” . $execSub is the function that executes to handle the incoming rows. It gets the arguments:

&$execSub($label, $rowop, @args)

Here $label is this label, $rowop is the row operation, and @args are the same as extra arguments specified at the label creation.

The row operation actually contains the label reference, so why pass it the second time? The reason lies in the chaining. The current label may be chained, possibly through multiple levels, to some original label, and the rowop will refer to that original label. The extra argument lets the code find the current label.

$clearSub is the function that clears the label. It will be explained in the Section 8.2: “Clearing of the labels”. Either of $execSub and $clearSub can be specified as undef. Though an undefined $execSub makes the label useless for anything other than clearing: on an attempt to send data to it, it will complain that the label has been cleared. An undefined $clearSub causes the function Triceps::clearArgs() to be used as the default, which provides the correct reaction for most situations.

There is a special convenience constructor for the labels that are used only for clearing an object (their usefulness is discussed in Section 8.2: “Clearing of the labels” ).

$lb = $unit->makeClearingLabel("name", @args);

The arguments would be the references to the objects that need clearing, usually the object's $self. They will be cleared with Triceps::clearArgs() when the label clearing gets called.

6.3. Other label methods

The chaining of labels is done with the method:

$label1->chain($label2);

$label2 becomes chained to $label1. A label can not be chained to itself, neither directly nor through other intermediate labels. The row types of the chained labels must be equal (this is more strict than for queueing up the row operations for labels, and might change one way or the other in the future).

When $label1 executes, its chained labels will normally be executed in the order they were chained. However sometimes it's necessary to add a label to the chain later but have it called first. This is done with the method:

$label1->chainFront($label2);

It chains $label2 at the start of the chain. Of course, if more labels are chained at the front afterwards, $label2 will be called only after them. But usually there is a need for only one such label, and it's usually connected to the FnReturn and Facet objects. For an example, see Section 16.3: “Multithreaded pipeline”.

A label's chainings can be cleared with

$label1->clearChained();

It returns nothing, and clears the chainings from this label. There is no way to unchain only some selected labels.

To check if there are any labels chained from this one, use:

$result = $label->hasChained();

The same check can be done with

@chain = $label->getChain();

if ($#chain >= 0) { ... }

but hasChained() is more efficient since it doesn't have to construct that intermediate array.

There is also a convenience method that creates a new label by chaining it from an existing label:

$label2 = $label1->makeChained($name, $subClear, $subExec, @args);

The arguments are very much the same as in Unit::makeLabel(), only there is no need to specify the row type for the new label (nor obviously the unit), these are taken from the original label. It's really a wrapper that finds the unit and row type from $label1, makes a new label, and then chains it off $label1.

The whole label can be cleared with

$label->clear();

This is fully equivalent to what happens when an execution unit clears the labels: it calls the clear function (if any) and clears the chainings. Note that the labels that used to be chained from this one do not get cleared themselves, they're only unchained from this one. To check whether the label has already been cleared, use:

$result = $label->isCleared();

Labels have the usual way of comparing the references:

$label1->same($label2)

returns true if both references point to the same label object.

The label introspection can be done with the methods:

$rowType = $label->getType();
$rowType = $label->getRowType();
$unit = $label->getUnit();
$name = $label->getName();
@chainedLabels = $label->getChain();
$execSubRef = $label->getCode();

The methods getType() and getRowType() are the same, they both return the row type of the label. getType() is shorter, which looked convenient for a while, but getRowType() has the name consistent with the rest of the classes. This consistency comes useful when passing the objects of various types to the same methods, using the Perl's name-based polymorphism. For now both of them are present, but getType() will likely be deprecated in the future.

If the label has been cleared, getUnit() will return an undef. getChain() returns an array of references to the chained labels. getCode() is actually half-done because it returns just the Perl function reference to the execution handler but not its arguments, nor a reference to the clearing function. It will be changed in the future to fix these issues. getCode() is not applicable to the dummy labels, and would return an undef for them.

The labels actually exist in multiple varieties. The underlying common denominator is the C++ class Label. This class may be extended and the resulting labels embedded into the C++ objects. These labels can be accessed and controlled from Perl but their logic is hardcoded in their objects and is not directly visible from Perl. The dummy labels are a subclass of labels in general, and can be constructed directly from Perl. Another subclass is the labels with the Perl handlers. They can be constructed from Perl, and really only from Perl. The C++ code can access and control them, in a symmetrical relation. The method getCode() has meaning only on these Perl labels. Finally, the clearing labels also get created from Perl, and fundamentally are Perl labels with many settings hardcoded in the constructor. getCode() can be used on them too but since they have no handler code, it would always return undef.

There is also a way to change a label's name:

$label->setName($name);

It returns nothing, and there is probably no reason to call it. It will likely be removed in the future.

The label also provides the constructor methods for the row operations, which are described below.

And for completeness I'll mention the methods used to mark the label as non-reentrant and to read this mark back. They will be described in detail in Section 7.13: “Recursion control” .

$label->setNonReentrant();
$val = $label->isNonReentrant();

6.4. Row operations

A row operation (also known as rowop) in Triceps is a unit of work for a label. It's always destined for a particular label (which could also pass the rowop to its chained labels), and has a row to process and an opcode. The opcodes will be described momentarily in the Section 6.5: “Opcodes”.

A row operation is constructed as:

$rowop = $label->makeRowop($opcode, $row);

The opcode may be specified as an integer or as a string. Historically, there is also an optional extra argument for the enqueuing mode but it's already obsolete, so I don't show it here.

Since the labels are single-threaded, the rowops are single-threaded too. The rowops are immutable, just as the rows are. It's possible to keep a rowop around and call it over and over again.

A rowop can be created from a bunch of fields in an array or hash form in two steps:

$rowop = $label->makeRowop($opcode, $rt->makeRowHash(
  $fieldName => $fieldValue, ...));
$rowop = $label->makeRowop($opcode, $rt->makeRowArray(@fields));

Since this kind of creation happens fairly often, writing out these calls every time becomes tedious. The Label provides the combined constructors to make life easier:

$rowop = $label->makeRowopHash($opcode, $fieldName => $fieldValue, ...);
$rowop = $label->makeRowopArray($opcode, @fields);

Note that they don't need the row type argument any more, because the label knows the row type and provides it. Internally these methods are currently implemented in Perl, and just wrap the two calls into one. In the future they will be rewritten in C++ for greater efficiency.

There also are the methods that create a rowop and immediately call it. They will be described with the execution unit.

A copy of a rowop (not just another reference but an honest separate copied object) can be created with:

$rowop2 = $rowop1->copy();

However, since the rowops are immutable, a reference is just as good as a copy. This method is historic and will likely be removed or modified.

A more interesting operation is the rowop adoption: it is a way to pass the row and opcode from one rowop to another new one, with a different label.

$rowop2 = $label->adopt($rowop1);

It is very convenient for building the label handlers that pass the rowops to the other labels unchanged. For example, a label that filters the data and passes it to the next label, can be implemented as follows:

my $lab1 = $unit->makeLabel($rt1, "lab1", undef, sub {
  my ($label, $rowop) = @_;
  if ($rowop->getRow()->get("a") > 10) {
    $unit->call($lab2->adopt($rowop));
  }
});

This code doesn't even look at the opcode in the rowop, it just passes it through and lets the next label worry about it. The functionality of adopt() also can be implemented with

$rowop2 = $label->makeRowop($rowop1->getOpcode(), $rowop1->getRow());

But adopt() is easier to call and also more efficient, because less of the intermediate data surfaces from the C++ level to the Perl level.

The references to rowops can be compared as usual:

$rowop1->same($rowop2)

returns true if both point to the same rowop object.

The rowop data can be extracted back:

$label = $rowop->getLabel();
$opcode = $rowop->getOpcode();
$row = $rowop->getRow();

A Rowop can be printed (usually for debugging purposes) with

$string = $rowop->printP();
$string = $rowop->printP($name);

Just as with a row, the method printP() is implemented in Perl. In the future a print() done right in C++ may be added, but for now I try to keep all the interpretation of the data on the Perl side. Even though printP() is implemented in Perl, it can print the rowops for any kinds of labels. The following example gives an idea of the format in which the rowops get printed:

$lb = $unit->makeDummyLabel($rt, "lb");
$rowop = $lb->makeRowop(&Triceps::OP_INSERT, $row);
print $rowop->printP(), "\n";

would produce

lb OP_INSERT a="123" b="456" c="3000000000000000" d="3.14" e="text"

The row contents are printed through Row::printP(), so they have the same format.

The optional argument allows overriding the name of the label printed. For example, if in the example above the last line were replaced with

print $rowop->printP("OtherLabel"), "\n";

the result will become:

OtherLabel OP_INSERT a="123" b="456" c="3000000000000000" d="3.14" e="text"

It makes the printing of rowops in the chained labels more convenient. A chained label's execution handler receives the original unchanged rowop that refers to the first label in the chain. So when it gets printed, it will print the name of the first label in the chain, which might be very surprising. The explicit argument allows overriding it with the name of the chained label (or any other value).

6.5. Opcodes

The defined opcodes are:

  • &Triceps::OP_NOP or "OP_NOP"
  • &Triceps::OP_INSERT or "OP_INSERT"
  • &Triceps::OP_DELETE or "OP_DELETE"

The meaning is straightforward: NOP does nothing, INSERT inserts a row, DELETE deletes a row. There is no opcode to replace or update a row. The updates are done as two separate operations: first DELETE the old value, then INSERT the new value. The order is important: the old value has to be deleted before inserting the new one. But there is no requirement that these operations must go one right after another. If you want to update ten rows, you can first delete all ten and then insert the new ten. In the normal processing the end result will be the same, even though it might go through some different intermediate states. It's a good idea to write your models to follow the same principle.
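The claim that the end result comes out the same can be checked on a toy model of a table keyed by the field "a" (a pure-Perl sketch for illustration, not the Triceps Table API):

```perl
use strict;
use warnings;

# A toy "table": a hash keyed by field "a". OP_DELETE removes the key,
# OP_INSERT stores the row under it.
sub apply_ops {
    my (@ops) = @_;
    my %table;
    for my $op (@ops) {
        my ($code, $row) = @$op;
        if    ($code eq "OP_DELETE") { delete $table{ $row->{a} }; }
        elsif ($code eq "OP_INSERT") { $table{ $row->{a} } = $row; }
    }
    return \%table;
}

my $old1 = { a => 1, b => "old1" }; my $new1 = { a => 1, b => "new1" };
my $old2 = { a => 2, b => "old2" }; my $new2 = { a => 2, b => "new2" };

# Interleaved: a delete-insert pair per updated row.
my $t1 = apply_ops(["OP_DELETE", $old1], ["OP_INSERT", $new1],
                   ["OP_DELETE", $old2], ["OP_INSERT", $new2]);
# Batched: all the deletes first, then all the inserts.
my $t2 = apply_ops(["OP_DELETE", $old1], ["OP_DELETE", $old2],
                   ["OP_INSERT", $new1], ["OP_INSERT", $new2]);

print $t1->{1}{b}, " ", $t2->{1}{b}, "\n";  # new1 new1
```

Both orderings leave the table with the new values, even though the batched version goes through an intermediate state where both rows are absent.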

Internally an opcode is always represented as an integer constant. The same constant value can be obtained by calling the functions &Triceps::OP_*. However when constructing the rowops, you can also use the string literals "OP_*" with the same result, they will be automatically translated to the integers. In fact, the string literal form is slightly faster (unless you save the result of the function in a variable and then use the integer value from that variable for the repeated construction).

But when you get the opcodes back from rowops, they are always returned as integers. Triceps provides functions that convert the opcodes between the integer and string constants:

$opcode = &Triceps::stringOpcode($opcodeName);
$opcodeName = &Triceps::opcodeString($opcode);

They come handy for all kinds of print-outs. If you pass the invalid values, the conversion to integers will return an undef.

The conversion of the invalid integers to strings is more interesting. And by the way, you can pass the invalid integer opcodes to the rowop construction too, and they won't be caught. The way they will be processed is a bit of a lottery. The proper integer values are actually bitmasks, and they are nicely formatted to make sense. The invalid values would make some random bitmasks, and they will get processed in some unpredictable way. When converting an invalid integer to a string, opcodeString tries to predict and show this processing as a set of letters I and D in square brackets, for the INSERT and DELETE flags. If both are present, usually the INSERT flag wins over the DELETE in the processing. If none are present, it's a NOP.
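The flag logic can be sketched in pure Perl. The flag values below are made up for illustration only, they are not the real Triceps constants:

```perl
use strict;
use warnings;

# Hypothetical flag values, for illustration only.
my $FLAG_INSERT = 0x1;
my $FLAG_DELETE = 0x2;

my %OP = (
    OP_NOP    => 0,
    OP_INSERT => $FLAG_INSERT,
    OP_DELETE => $FLAG_DELETE,
);

# The checks look only at the flag bits, so any integer can be
# classified; a value with neither flag set acts as a NOP.
sub is_insert { ($_[0] & $FLAG_INSERT) != 0 }
sub is_delete { ($_[0] & $FLAG_DELETE) != 0 }
sub is_nop    { ($_[0] & ($FLAG_INSERT | $FLAG_DELETE)) == 0 }

print is_insert($OP{OP_INSERT}), "\n";  # 1
print is_nop(0x100), "\n";  # 1: an invalid value with no flags is a NOP
```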

In the normal processing you don't read the opcode and then compare it with the different values. Instead you check the meaning of the opcode (which internally is a bitmask) directly with the rowop methods:

$rowop->isNop()
$rowop->isInsert()
$rowop->isDelete()

The typical idiom for the label's handler function is:

if ($rowop->isInsert()) {
  # handle the insert logic ...
} elsif($rowop->isDelete()) {
  # handle the delete logic...
}

The NOPs get silently ignored in this idiom, as they should be. Generally there is no point in creating the rowops with the OP_NOP opcode, unless you want to use them for some weird logic.

The main Triceps package also provides functions to check the integer opcode values directly:

Triceps::isNop($opcode)
Triceps::isInsert($opcode)
Triceps::isDelete($opcode)

The same-named methods of Rowop are just the more convenient and efficient way to say

Triceps::isNop($rowop->getOpcode())
Triceps::isInsert($rowop->getOpcode())
Triceps::isDelete($rowop->getOpcode())

They handle the whole logic directly in C++ without an extra Perl conversion of the values.

Chapter 7. Scheduling

7.1. Introduction to the scheduling

The scheduling determines in which order the row operations are processed. If there are multiple operations available, which one should be processed first? The scheduler keeps a queue of the operations and selects which one to execute next. This has a major effect on the logic of a CEP model.

The Triceps approach to scheduling varied over time. Initially it looked like the purely procedural execution would be enough, with the order determined by the order of the procedural execution, and no explicit scheduling would be needed. This proved to have its own limitations, and thus the labels and their scheduling were born. Then it turned out that the most typical thing to do with a label is to call it, again in the purely procedural order.

So for the most part you don't need to think about scheduling in Triceps. It just works as expected: when you call a label with a rowop, the call returns after the label's work is all done. You can pretty much skip over the section with the low-level details altogether, just read the high-level sections. The only important exception is the topological loops, where the rowops go repeatedly through a closed loop of the labels. But even for them the Perl API provides the high-level methods that take care of the details under the hood. And there is another way to deal with the loops by using the streaming functions and procedural loops.

If you want to understand the loop scheduling better, skim over the sections with the details. You'd also need to do this if you plan to write the Triceps models in C++, since as of version 2.0 the C++ API does not provide the high-level methods for building the loops yet.

Only if you are a serious CEP aficionado and want to understand how everything really works do you need to read all the details seriously.

7.2. Comparative scheduling in the various CEP systems

There are multiple approaches to scheduling employed by different CEP systems. The classic Aleri CEP essentially didn't have any, except for the flow control between threads, because each of its elements is a separate thread. Coral8 had an intricate scheduling algorithm. Sybase R5.1 has the same logic as Coral8 inside each thread. StreamBase presumably also has some.

The scheduling logic in Triceps is different from the other CEP systems. The Coral8 logic looks at first like the only reasonable way to go, but could not be used in Triceps for three reasons: First, it's a trade secret, so it can't be simply reused. If I'd never seen it, that would not be an issue but I've worked on it and implemented its version for R5.1. Second, it relies on the properties that the compiler computes from the model graph analysis. Triceps has no compiler, and could not do this. Third, in reality it simply doesn't work that well. There are quite a few cases when the Coral8 scheduler comes up with a strange and troublesome execution order.

7.3. Execution unit basics

An execution unit (often called simply unit) keeps the state of the Triceps execution for one thread. Each thread running Triceps must have its own execution unit.

It's perfectly possible to have multiple execution units in the same thread. This is typically done when there is some permanent model plus some small intermittent sub-models created on demand to handle the user requests. These small sub-models would be created in the separate units, to be destroyed when their work is done. But this is a somewhat advanced usage, more examples will be shown in Section 15.11: “Streaming functions and unit boundaries”. The TQL implementation also does this, as described in Chapter 17: “TQL, Triceps Trivial Query Language”.

This section describes the basic methods of the units, the most often used ones. The more advanced ones are described in the following sections, and the full reference is located in Section 19.1: “Unit and FrameMark reference” .

A unit is created with:

$myUnit = Triceps::Unit->new("name");

The name argument will be used in the error messages, making it easier to find which exact part of the model is having troubles. By convention the name should be the same as the name of the unit variable (myUnit in this case).

The name can be read back:

$name = $myUnit->getName();

Also, as usual, the variable $myUnit here contains a reference to the actual unit object, and two references can be compared for whether they refer to the same object:

$result = $unit1->same($unit2);

A unit also keeps an empty row type (one with no fields), primarily for the creation of the clearing labels (discussed in Section 8.2: “Clearing of the labels” and Section 6.2: “Label construction” ), but you can use it for any other purposes too. You can get it with the method:

$rt = $unit->getEmptyRowType();

Each unit has its own instance of an empty row type. It's purely for the convenience of memory management; they are all equivalent.

The labels are called with:

$unit->call($rowop, ...);

The identity of the label being called is embedded in the row operation. The ... shows that multiple rowops may be passed as arguments. So the real signature of this method is:

$unit->call(@rowops);

But this way it looks more confusing. A call with multiple arguments produces the same result as doing multiple calls with one argument at a time. Not only rowops but also trays (to be discussed later) of rowops can be used as arguments.

There also are the convenience methods that create the rowops from the field values and immediately call them:

$unit->makeHashCall($label, $opcode,
  $fieldName => $fieldValue, ...);
$unit->makeArrayCall($label, $opcode, @fieldValues);

The methods for creation of labels have been already discussed in Section 6.2: “Label construction” . Here is their recap along with the similar methods for creation of tables and trays that will be discussed later:

$label = $unit->makeDummyLabel($rowType, "name");

$label = $unit->makeLabel($rowType, "name",
  $clearSub, $execSub, @args);

$label = $unit->makeClearingLabel("name", @args);

$table = $unit->makeTable($tableType, "name");

$tray = $unit->makeTray(@rowops);

A special thing about the labels is that when a unit creates a label, it keeps a reference to it, for clearing. A label keeps a pointer back to the unit but not a reference (if you call getUnit() on a label, the returned value becomes a reference). For a table or a tray, the unit doesn't keep a reference to them. Instead, they keep a reference to the unit. The references are at the C++ level, not Perl level.

With the tables, the references can get pretty involved: A table has labels associated with it. When a table is created, it also creates these labels. The unit keeps references of these labels. The table also keeps references of these labels. The table keeps a reference of the unit. The labels have pointers to the unit and the table but not references, to avoid the reference cycles.
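The same technique of breaking reference cycles with plain back-pointers is available in Perl itself through weak references. A sketch of the analogous arrangement (plain Perl data structures standing in for the unit and label, not Triceps code):

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken);

# A "unit" holds hard references to its "labels" (for clearing); each
# label points back at the unit with a weakened reference, so no
# reference cycle keeps the pair alive forever.
my $unit  = { name => "u1", labels => [] };
my $label = { name => "lb", unit => $unit };
weaken($label->{unit});               # back-pointer, not a reference
push @{ $unit->{labels} }, $label;    # hard reference, for clearing

print $label->{unit}{name}, "\n";  # u1: the back-pointer works
undef $unit;  # drop the only hard reference to the unit
print defined $label->{unit} ? "leaked" : "freed", "\n";  # freed
```

When the last hard reference to the unit goes away, the weakened back-pointer in the label turns into undef instead of keeping the unit alive, which is the same effect the C++ plain pointers achieve.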

See more on the memory management and label clearing in Chapter 8: “Memory Management”.

7.4. Trays

The easiest way to store a sequence of rowops is to put them into the Perl arrays, like:

my @ops = ($rowop1, $rowop2);
push @ops, $rowop3;

However the C++ internals of Triceps do not know about the Perl arrays. And some of them can work directly with the sequences of rowops. So Triceps defines an internal sort-of-equivalent of Perl array for rowops, called a Tray.

The trays were first used to catch the side effects of operations on the stateful elements, so the name “tray” came from the metaphor of putting a tray under it to catch the drippings. The newer and better approach catches the results of the streaming functions in a tray.

The trays get created as:

$tray = $unit->makeTray(@rowops);

A tray always stores rowops for only one unit. It can only be used in one thread. A tray can be used in all the calling/enqueueing methods, just like the direct rowops (the details of the enqueueing methods will be described later in Section 7.11: “The gritty details of Triceps scheduling” and in Section 19.1: “Unit and FrameMark reference”).

$unit->call($tray);
$unit->fork($tray);
$unit->schedule($tray);
$unit->enqueue($mode, $tray);
$unit->loopAt($mark, $tray);

Moreover, multiple trays may be passed, and the loose rowops and trays can be mixed in the arguments of these functions, for example:

$unit->call($rowopStartPkg, $tray, $rowopEndPkg);

A tray may contain the rowops of any types mixed in any order. This is by design, and it's an important feature that allows building the protocol blocks out of rowops and performing an orderly data exchange. This feature is an absolute necessity for proper inter-process and inter-thread communication.

The ability to send the rows of multiple types through the same channel in order is a must, and its lack makes the communication with some other CEP systems exceedingly difficult. Coral8 supports only one stream per connection. Aleri (and I believe Sybase R5) allows sending multiple streams through the same connection but has no guarantees of order between them. I don't know about the others, check for yourself.

To iterate on a tray in the Perl code, it can be converted to a Perl array:

@array = $tray->toArray();

The size of the tray (the count of rowops in it) can be found directly without a conversion, and the unit can be read back too:

$size = $tray->size();
$traysUnit = $tray->getUnit();

Another way to create a tray is by copying an existing one:

$tray2 = $tray1->copy();

This copies the contents (that is, the references to the rowops) and does not create any ties between the trays. The copying is really just a more efficient way to do an equivalent of:

$tray2 = $tray1->getUnit()->makeTray($tray1->toArray());

The tray references can be compared for whether they point to the same tray object:

$result = $tray1->same($tray2);

The contents of a tray may be cleared, which is more convenient and more efficient than discarding a tray and creating another one:

$tray->clear();

The data may be added to the back of a tray:

$tray->push(@rowops);

Multiple rowops can be pushed in a single call. There are no other Perl-like operations on a tray: it's either create from a set of rowops, push, or convert to a Perl array.

Note that the trays are mutable, unlike the rows and rowops. Multiple references to a tray will see the same contents. If a tray is changed through one reference, the others will see the changes too.

7.5. Error handling during the execution

The basics of error handling have been described in Section 4.2: “Errors, deaths and confessions” . Now let's look more in-depth. When the labels execute, they may produce errors in one of two ways:

  • The Perl code in the label might die.
  • The call topology might violate the rules.

The rules are basically that by default you can't make the recursive calls. A label may not make calls directly or through other labels to itself. The idea is to catch the call sequences that are likely to go into the deep recursion and overflow the stack. It catches them early, on the first attempt of recursion. If you need to do the recursion, the best way is to instead use schedule() or loopAt() or the streaming functions with trays. That way you avoid overrunning the stack.

It's also possible to relax the recursion checks by specifying higher limits for the recursion count and stack depth. How to do it is described in Section 7.13: “Recursion control”. It comes in useful in some special cases, as described in Section 15.9: “Streaming functions and recursion”. However, such higher limits are best avoided unless really needed.

What particular stack is meant here? The execution of Triceps in Perl has three stacks:

  • The system stack used by the underlying Triceps C++ code and by the internal functions of the Perl interpreter.
  • The Perl call stack, keeping the call history of the Perl code.
  • The Triceps call stack, keeping the call history of the Triceps labels in a Unit.

The answer is all three of these stacks. As the calls are made, frames are pushed onto all these stacks, logically intermingling.

Whichever way the error is detected, it causes the stacks to be unwound, undoing the intermingling in the opposite order. The Perl error messages from die or confess and the Triceps tracing (in the C++ code) of the rowop calls and label chainings get combined into a common stack trace as the stacks are being unwound. When the code gets back to Perl, the XS code triggers a confess with the message containing the unwound stack trace up to this point. If that happens to be in the handler of another label, it continues the hybrid stack unwinding. If not caught by eval, it keeps going to the topmost Triceps Unit call() or drainFrame() and causes the whole program to die, printing the stack trace. In a multithreaded Triceps model there is also a step of interrupting all the threads in the model, but in the end it still ends up dying and printing the stack trace along with the information about which thread caused it. This is a reasonable reaction most of the time.

Remember, the root cause is a serious error that is likely to leave the model in an inconsistent state, and it should usually be considered fatal.

If you want to catch the errors, nip them in the bud by wrapping your Perl code in eval. Then you can handle the errors before they have a chance to propagate.
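As a plain-Perl sketch of this pattern (no Triceps involved, the handler sub is purely hypothetical), a confession from deep inside the handler code gets caught by the surrounding eval, with the stack trace available in $@:

```perl
use strict;
use warnings;
use Carp;

# A hypothetical handler that confesses on bad data, standing in for the
# Perl code of a label handler.
sub handler {
  my $row = shift;
  confess "negative value in row" if $row->{v} < 0;
  return $row->{v} * 2;
}

# Wrapping the call in eval nips the error in the bud: $result stays
# undefined and $@ contains the message with the stack trace.
my $result = eval { handler({ v => -1 }) };
if ($@) {
  print "caught: $@";
  # here one could log the error and go on to the next rowop
}
```

The same shape works around a Triceps $unit->call(), where $@ will contain the combined hybrid stack trace described above.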

If the program runs multiple models (multiple Units, or multiple multithreaded Apps), it can also wrap the outermost call in eval, and discard just the one erroneous model while leaving the other models running. If the erroneous units get properly cleared, they will free their memory and cause no leaks.

What happens to the rowops that were enqueued in the Triceps stack frames when the stack gets unwound? They get thrown away. The memory gets collected thanks to the reference counting, but the rowops and their sequence order get thrown out of the stack. The reason is basically that there may be no catching of the errors until unwinding to the outermost call. The choice is to either throw away everything after the first error or keep trying to execute the following rowops, collecting the errors. And that might become a lot of errors. I've taken the choice of stopping as early as possible, because the state of the model will probably be corrupted anyway and nothing but garbage would be coming out (if anything would be coming at all and not be stuck in an endless loop).

7.6. No bundling

The most important principle of Triceps scheduling is: No Bundling. Every rowop is for itself.

I've seen the most damage done by bundling in the Coral8/Sybase R4 scheduling, so I'll refer to it when explaining the dangers of bundling.

What is a bundle? It's a set of records that go through the execution together. If you have a model consisting of two functional elements F1 and F2 connected in a sequential fashion

F1->F2

and a few loose records R1, R2, R3, the normal execution order without bundling will be:

F1(R1), F2(R1), F1(R2), F2(R2), F1(R3), F2(R3)

Each row goes through the whole model (a real simple one in this case) before the next one is touched. This allows F2 to take into account the state of F1 exactly as it was right after processing the same record, without any interventions in between.

Even though the trays in Triceps store multiple rowops, they are not bundles. When a tray is called, it works exactly as if every rowop from it were called separately in order. The first rowop fully propagates, then the second one, and so on. The ordered storage in the trays only provides the order for that future execution or for a manual iteration over the rowops.

If the same records are placed in a bundle (R1, R2, R3), the execution order will be different:

F1(R1), F1(R2), F1(R3), F2(R1), F2(R2), F2(R3)

The whole bundle goes through F1 before the rows go to F2.
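The two orders can be sketched in plain Perl (a toy model, not Triceps itself), with F1 and F2 as subs that log their invocations:

```perl
use strict;
use warnings;

# A toy model of the two execution orders: F1 feeds into F2, and both
# log their invocations so the order can be compared.
my @log;
my $F2 = sub { push @log, "F2($_[0])"; };
my $F1 = sub { push @log, "F1($_[0])"; $F2->($_[0]); };

# Without bundling: each record fully propagates through F1 and F2
# before the next one starts.
@log = ();
$F1->($_) for qw(R1 R2 R3);
print join(", ", @log), "\n";
# F1(R1), F2(R1), F1(R2), F2(R2), F1(R3), F2(R3)

# With bundling: the whole bundle passes through F1 first, then the
# accumulated results pass through F2.
@log = ();
my @bundle = qw(R1 R2 R3);
push @log, "F1($_)" for @bundle;
push @log, "F2($_)" for @bundle;
print join(", ", @log), "\n";
# F1(R1), F1(R2), F1(R3), F2(R1), F2(R2), F2(R3)
```

In the bundled order F2 sees the state of F1 only after the whole bundle has passed through, which is exactly what breaks the logic that depends on the per-record state.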

That would not always be a problem, and even could be occasionally useful, if the bundles were always created explicitly. In the reality of Coral8/Sybase R4 scheduling, every time a statement produces multiple rows from a single one (think of a join that picks multiple rows from another side), it creates a bundle and messes up all the logic after it. Some logic gets affected so badly that a few statements in CCL (the Sybase modeling language), such as ON UPDATE, had to be designated to always ignore the bundles, otherwise they would not work at all. At my past work I wrote a CCL pattern for breaking up the bundles. It's rather heavyweight and thus could not be used all over the place but provides a generic solution for the most unpleasant cases.

Worse yet, the bundles may get created in Coral8 absolutely accidentally: if two rows happen to have the same timestamp, for all practical purposes they would act as a bundle. In the models that were designed without the appropriate guards, this leads to the time-based bugs that are hard to catch and debug. Writing these guards correctly is hard, and testing them is even harder.

Another issue with bundles is that they make the large queries slower. Suppose you do a query from a window that returns a million rows. All of them will be collected in a bundle, then the bundle will be sent to the interface gateway that would build one huge protocol packet, which will then be sent to the client, which will receive the whole packet and then finally iterate on the rows in it. Assuming that nothing runs out of memory along the way, it will be a long time until the client sees the first row. Very, very annoying.

The Aleri CEP also had its own version of bundles, called transactions, but a smarter one. Aleri always relied on the primary keys. The condition for a transaction is that it must never contain multiple modifications for the same primary key. Since there are no execution order guarantees between the functional elements, in this respect the transactions work in the same way as loose records, only with a more efficient communication between threads. Still, if the primary key changes in an element (say, an aggregator), the condition does not propagate through it. Such elements have to internally collapse the outgoing transactions along the new key, adding overhead.

7.7. Topological loops

The easiest and most efficient way to schedule the loops is to do it procedurally, something like this:

foreach my $row (@rowset) {
  $unit->call($lbA->makeRowop(&Triceps::OP_INSERT, $row));
}

However it requires that all the rowops to loop over are known in advance. In some situations this might not be true, but instead the rowop entering a loop iteration gets produced by the previous iteration. These situations are better served by the topological loops, formed by connecting the labels in a loop as shown in Figure 7.1.


Figure 7.1. Labels forming a topological loop.


However if the labels are simplemindedly doing the calls through a topology like this, the loop becomes a recursion: each label ends up indirectly calling itself for the next iteration of the loop, which repeats the same thing again and again. This arrangement would quickly use up the stack and crash, so Triceps normally prohibits the recursive calls.

There are two ways to get around that problem. The first one is to use the trays and streaming functions as described in Section 15.5: “Streaming functions and loops” . It might be the more powerful alternative of the two, however the concept of streaming functions takes a fair amount of explaining and thus is placed later in the manual. The second way is to use the more advanced scheduling capabilities of the Triceps units, which is described here.

The detailed explanation of how it all works is somewhat complicated and is split into a separate section, Section 7.12: “The gritty details of Triceps loop scheduling”, for those interested. But there are easy methods that cover up all the complexity.

The first part is done by creating the first label of the loop (such as the label A in Figure 7.1) through a special wrapper. This can be done in one of two ways:

my ($lbFirst, $mark) = $unit->makeLoopHead($rowType, "name", $clearSub,
  $execSub, @args);
my ($lbFirst, $mark) = $unit->makeLoopAround("name", $lbToWrap);

makeLoopHead() is the way to use if you're creating a new Perl label to be the first one in the loop. It has the exact same arguments as makeLabel(), which is described in Section 6.2: “Label construction”. It will put an appropriate wrapper directly into the Perl code that does all the required magic before your code executes.

makeLoopAround() is the way to use if you want to start the loop with some existing label (such as an input label of a table). It will create a new label that does the necessary magic, then chain its argument label from the new one. Nothing really stops you from creating a Perl label manually and then wrapping it in makeLoopAround() but makeLoopHead() produces a slightly more efficient code.

Either way, two values are returned: the newly created label and a special FrameMark object.

When you send the rows into the loop, you absolutely must send them to this newly created label, not directly to the underlying wrapped label! Otherwise the magic won't work.

The FrameMark is a special opaque object that is used to remember the state of the Triceps call stack at the start of the loop, to get back to it on the next iterations. It will be used when sending the rowops to the next iteration of the loop. Naturally, this object must be made accessible in the label handlers that do this sending.

The name argument will become the name of the created label. The FrameMark object also has a name, useful for diagnostics, that gets created by adding a suffix to the argument: name.mark.

The second part: whenever you need to send a rowop back to the start of the loop, such as in the label C in Figure 7.1, don't call it but use a special method:

$unit->loopAt($mark, @rowops_or_trays);

This will remember this rowop for the future. When the processing of the current iteration is all done, the scheduler in the unit will pick up the next remembered looped rowop and will feed it into the next iteration, until there are no more remembered rowops. Only after that will the first call of the first label in the loop return to its caller. In Figure 7.1 the said caller will be the label X.

The rowops sent back must always be for the label $lbFirst, returned by the makeLoop*().

It's perfectly fine to send multiple rowops back from a single iteration of the loop; each of these rowops will be processed in its own iteration in the order they were sent.

It's also perfectly fine to have the nested loops, as long as each loop uses its own frame mark object and starts from a separate label (add an empty label if needed).

There also are the convenience methods that create a rowop and loop it back in one go, just like makeHashCall()/makeArrayCall():

$unit->makeHashLoopAt($mark, $lbFirst, $opcode,
  $fieldName => $fieldValue, ...);
$unit->makeArrayLoopAt($mark, $lbFirst, $opcode, @fieldValues);

Now with all this knowledge let's write an example. It will compute the Fibonacci numbers. It's a really overcomplicated and perverse way of calculating the Fibonacci numbers. But it also is a great fit for the type of problems that get solved with the topological loops, if a simple one.

First, a quick reminder of what a Fibonacci number is. Historically it's a solution to the problem of breeding the spherical rabbits in a vacuum. But in the mathematical reality it's the sequence of numbers where each number is the sum of the two previous ones. The two initial elements are defined to be equal to 1, and it goes from there:

F(i) = F(i-1) + F(i-2)

F(1) = 1; F(2) = 1

The Fibonacci numbers are often used as an example of recursive computations in the beginner's books on programming. The computation of the n-th Fibonacci number is usually shown like this:

sub fib1 # ($n)
{
  my $n = shift;
  if ($n <= 2) {
    return 1;
  } else {
    return &fib1($n-1) + &fib1($n-2);
  }
}

However that's not a good way to compute it in the real world. When a function calls itself recursively once, its complexity is linear, O(n). When a function calls itself twice or more, its complexity becomes exponential, O(e^n). At first you might think that it's only quadratic, O(n^2), because it forks two ways on each step. But these two ways keep forking and forking on each step, and it compounds to exponential. Which is a really bad thing.
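The blow-up is easy to see by instrumenting fib1() with a call counter (a small self-contained sketch; the counter is the only addition):

```perl
use strict;
use warnings;

# fib1() from above, instrumented with a counter of invocations to show
# the exponential growth of the recursion.
my $calls;
sub fib1c {
  my $n = shift;
  $calls++;
  return $n <= 2 ? 1 : fib1c($n - 1) + fib1c($n - 2);
}

for my $n (10, 20) {
  $calls = 0;
  my $f = fib1c($n);
  print "fib($n) = $f in $calls calls\n";
}
# The call count itself grows like a Fibonacci number: it works out to
# 2*fib(n) - 1, so fib(20) = 6765 already takes 13529 calls.
```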

To think of it, it's a huge waste, since the (n-2)-th number is calculated anyway for the (n-1)-th number. Why calculate it separately the second time? We could as well have saved and reused it. The Lisp people have figured this out a long time ago, and the Lisp books (if you can read Finnish or Russian, [Hyvonen86] is a classical one) are full of examples that do exactly that. However I'm too lazy to explain how they work, so we're going to skip it together with the conversion of a tail recursion into a loop and get directly to the loop version. I find the loop version more natural and easier to write than a recursion anyway.
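As a brief aside before moving on to the loop, the save-and-reuse idea takes only a couple of lines in Perl, using a hash as the cache (a sketch, not part of the Triceps example):

```perl
use strict;
use warnings;

# Memoized Fibonacci: every computed value is saved in the cache, so
# each number is calculated only once and the recursion becomes linear.
my %cache = (1 => 1, 2 => 1);
sub fibMemo {
  my $n = shift;
  $cache{$n} //= fibMemo($n - 1) + fibMemo($n - 2);
  return $cache{$n};
}

print fibMemo(30), "\n"; # 832040
```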

sub fibStep2 # ($prev, $preprev)
{
  return ($_[0] + $_[1], $_[0]);
}

sub fib2 # ($n)
{
  my $n = shift;
  my @prev = (1, 0); # n and n-1

  while ($n > 1) {
    @prev = &fibStep2(@prev);
    $n--;
  }
  return $prev[0];
}

The split into two functions is not mandatory for the loop version, it just does the clean separation of the loop counter logic and of the computation of the next step of the function. (But for the recursive version it would be mandatory.)

I'm going to take this procedural loop version and transform it into a topological loop. It actually happens to be a really good match for the topological loop. In a topological loop a record keeps traveling through it and being transformed until it satisfies the loop exit condition. Here @prev is the record contents, and the iteration count will be added to it to keep track of the exit condition.

$uFib = Triceps::Unit->new("uFib");

my $rtFib = Triceps::RowType->new(
  iter => "int32", # iteration number
  cur => "int64", # current number
  prev => "int64", # previous number
);

my $lbPrint = $uFib->makeLabel($rtFib, "Print", undef, sub {
  print($_[1]->getRow()->get("cur"));
});

my $lbCompute; # will fill in later

my ($lbNext, $markFib) = $uFib->makeLoopHead(
  $rtFib, "Fib", undef, sub {
    my $iter = $_[1]->getRow()->get("iter");
    if ($iter <= 1) {
      $uFib->call($lbPrint->adopt($_[1]));
    } else {
      $uFib->call($lbCompute->adopt($_[1]));
    }
  }
);

$lbCompute = $uFib->makeLabel($rtFib, "Compute", undef, sub {
  my $row = $_[1]->getRow();
  my $cur = $row->get("cur");
  $uFib->makeHashLoopAt($markFib, $lbNext, $_[1]->getOpcode(),
    iter => $row->get("iter") - 1,
    cur => $cur + $row->get("prev"),
    prev => $cur,
  );
});

my $lbMain = $uFib->makeLabel($rtFib, "Main", undef, sub {
  my $row = $_[1]->getRow();
  $uFib->makeHashCall($lbNext, $_[1]->getOpcode(),
    iter => $row->get("iter"),
    cur => 1,
    prev => 0,
  );
  print(" is a Fibonacci number ", $row->get("iter"), "\n");
});

while(<STDIN>) {
  chomp;
  my @data = split(/,/);
  $uFib->makeArrayCall($lbMain, @data);
  $uFib->drainFrame(); # just in case, for completeness
}

You can see that it has grown quite a bit. That's why the procedural loops are generally a better idea. However if the computation involves a lot of the SQLy logic, the topological loops are still beneficial.

The main loop reads the CSV lines with opcodes (which aren't really used here, just passed through and then thrown away before printing) and calls $lbMain. Here is an example of an input and output as they would intermix if the input was typed from the keyboard. As in the rest of this manual, the input lines are shown in bold.

OP_INSERT,1
1 is a Fibonacci number 1
OP_DELETE,2
1 is a Fibonacci number 2
OP_INSERT,5
5 is a Fibonacci number 5
OP_INSERT,6
8 is a Fibonacci number 6

The input lines contain the values only for the field iter, which intentionally happens to be the first field in the row type. The other fields will be reset anyway in $lbMain, so they are left as NULL.

The point of $lbMain is to call the loop begin label $lbNext and then print the message about which Fibonacci number was requested. The value of the computed number is printed at the end of the loop, so when the words “is a Fibonacci number” are printed after it, that demonstrates that the execution of $lbMain continues only after the loop is completed.

Just to rub it in a bit more, $lbMain itself doesn't get back the result of the computation, because the Triceps call() has no way to return any results. The intermediate states circle through the loop until the computation is completed, and the results are forwarded out of the loop to $lbPrint(). All this time $lbMain sits and waits for its call to complete. After the execution gets back to $lbMain, it knows that $lbPrint() already ran and printed the result, so it prints more detail after it. Another option would be for the loop result label to put the result value into some static variable, letting $lbMain read it and print the whole message in one statement.

The loop logic is split into two labels $lbNext and $lbCompute purely to show that it can be split like this. $lbNext handles the loop termination condition, and $lbCompute does essentially the work of fibStep2(). After the loop terminates, it passes the result row to $lbPrint for the printing of the value.

When the code for $lbNext is created, it contains the call of $lbCompute. However, the label $lbCompute has not been created yet at that time! Not a problem: creating the empty variable $lbCompute in advance is enough. The closure in $lbNext will keep a reference to that variable, and the variable will be filled with the reference to the label later (but before the main loop executes).
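The forward-reference trick works because the closure captures the variable itself, not its value at creation time. In miniature (a standalone sketch, unrelated to the Triceps labels):

```perl
use strict;
use warnings;

# The closure captures the variable $later, which is still empty when
# the closure is created. Filling it in before the first invocation of
# the closure is enough.
my $later;                          # declared empty, filled below
my $closure = sub { return $later->(5); };
$later = sub { return $_[0] * 2; }; # filled in before the first use

print $closure->(), "\n"; # 10
```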

And here is the version with makeLoopAround():

my ($lbNext, $markFib); # will fill in later

$lbCompute = $uFib->makeLabel($rtFib, "Compute", undef, sub {
  my $row = $_[1]->getRow();
  my $cur = $row->get("cur");
  my $iter = $row->get("iter");
  if ($iter <= 1) {
    $uFib->call($lbPrint->adopt($_[1]));
  } else {
    $uFib->makeHashLoopAt($markFib, $lbNext, $_[1]->getOpcode(),
      iter => $row->get("iter") - 1,
      cur => $cur + $row->get("prev"),
      prev => $cur,
    );
  }
});

($lbNext, $markFib) = $uFib->makeLoopAround(
  "Fib", $lbCompute
);

The unit, row type, $lbPrint, $lbMain and the main loop have stayed the same, so they are omitted from this example. The whole loop logic, both the termination condition and the computation step, have been collected into one label $lbCompute, to show that it can be done this way too. Then the loop head is created around $lbCompute.

Since both $lbNext and $markFib need to be accessible inside $lbCompute, they are created in advance and become visible in the closure scope. But the values are placed into these variables only after $lbCompute is already defined (since $lbCompute is an argument to build these values).

For the more curious, let's dig a little into what happens inside the makeLoop*() methods. The same effect can be (and in the C++ API has to be) achieved by calling the slightly lower-level methods.

The frame mark is created as follows:

my $mark = Triceps::FrameMark->new("markName");

It has to be remembered and then used in the first label of the loop to remember the state of the Triceps call stack:

$unit->setMark($mark);

This is normally the first thing done in the first label's handler. Yes, it will be remembered on every iteration of the loop. However the trick of the arrangement is that the call stack will be returned to the same state before each iteration, so on the second and following iterations this call will become a no-op.

The makeLoop*() methods just do this for you, their implementation is fairly simple:

sub makeLoopHead # ($self, $rt, $name, $clearSub, $execSub, @args)
{
  my ($self, $rt, $name, $clear, $exec, @args) = @_;

  my $mark = Triceps::FrameMark->new($name . ".mark");

  my $label = $self->makeLabel($rt, $name, $clear, sub {
    $self->setMark($mark);
    &$exec(@_);
  }, @args);

  return ($label, $mark);
}

sub makeLoopAround # ($self, $name, $lbFirst)
{
  my ($self, $name, $lbFirst) = @_;
  my $rt = $lbFirst->getRowType();

  my $mark = Triceps::FrameMark->new($name . ".mark");

  my $lbWrap = $self->makeLabel($rt, $name, undef, sub {
    $self->setMark($mark);
  });
  $lbWrap->chain($lbFirst);

  return ($lbWrap, $mark);
}

7.8. The main loop

The examples above have already shown the main loop, now let's look at it up close and discuss what it does and why. The point of the main loop is to get the execution of the model going: accept some rowops from the outside world, shovel them into the Triceps model and process them, sending some result rowops back into the outside world. The sending back is done from inside the label handlers, so as long as the model runs, nothing else is needed for them.

By the time the program enters the main loop, the model should be all constructed and ready to run. The simplest main loop may look like this:

while ($rowop = &readRowop()) { # reads with some user-defined function
  $unit->call($rowop);
}

This loop will read the incoming rowops as long as they're available, and call them. When $unit->call() returns, the processing of the rowop in the model is done, including all the nested calls it caused.

However there is also a way to request the post-processing. It's somewhat similar to the Tcl concept of idletasks. An example of post-processing might be the flushing of the output buffer: the normal processing may collect a number of the output rowops in the buffer, and after everything is done, the buffer would be serialized and sent out. This post-processing needs to happen after the initial call returns.

The rowops are scheduled for post-processing with the method:

$unit->schedule(@rowops_or_trays);

The model keeps a queue of the post-processing requests, and schedule() adds to this queue.

However the simplest main loop shown above won't run the post-processing. The queue would just keep growing. The post-processing is done by the method

$unit->drainFrame();

It calls all the collected post-processing rowops in order. Their handling may keep scheduling more rowops, and the draining won't stop until all of them are processed. So they should not keep scheduling more rowops forever, or the draining will never end. To handle the post-processing properly, the main loop should be:

while ($rowop = &readRowop()) { # reads with some user-defined function
  $unit->call($rowop);
  $unit->drainFrame();
}

You can even write it in a slightly different form:

while ($rowop = &readRowop()) { # reads with some user-defined function
  $unit->schedule($rowop);
  $unit->drainFrame();
}

In this version the incoming rowop gets added to the queue, and then drainFrame() calls it and any of its after-effects. Historically, this has been the intended way, but then it turned out that there is no point in first placing the incoming rowop onto the queue and then reading it from the queue, so calling it directly is slightly more efficient.

What if you decide in some label handler deep in the call tree that now is the good time to run the scheduled rowops, similar to Tcl's update idletasks, and call drainFrame()? First of all, this is a very bad idea. The CEP models are usually very sensitive to the particular execution order, and inserting some random rowops in the middle tends to break things. Second, it won't work. It might execute some rowops (which ones exactly is a long story, described in Section 7.11: “The gritty details of Triceps scheduling” ) but none of the scheduled ones. In short, there is a reason why the method is called drainFrame(): the queue is organized in frames that are pushed stack-wise as the labels are called, and popped after the calls complete. DrainFrame() drains the current frame. Schedule() puts the rowops onto the outermost frame that becomes accessible for draining only when the model is idle.
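The frame semantics can be sketched with a pure-Perl toy (not the real Triceps scheduler, just a model of the rules described above): call() pushes a new frame for the duration of the label, schedule() always appends to the outermost frame, and drainFrame() drains only the current one.

```perl
use strict;
use warnings;

# A toy frame stack: the outermost frame is at index 0, the current
# frame is the last element.
my @stack = ([]);
my @trace;

sub call_label {          # "call" a label: push a frame, run, pop it
  my ($name, $code) = @_;
  push @stack, [];
  push @trace, "exec $name";
  $code->() if $code;
  pop @stack;
}
sub schedule_op { push @{$stack[0]}, $_[0]; }  # always the outermost frame
sub drain_frame {
  my $frame = $stack[-1];  # only the current frame is drained
  while (my $op = shift @$frame) { call_label(@$op); }
}

call_label("A", sub {
  schedule_op(["B"]);
  drain_frame();           # drains A's own (empty) frame: B stays queued
});
drain_frame();             # back at the outermost level, B finally runs
print join(", ", @trace), "\n"; # exec A, exec B
```

The drain_frame() inside the handler of A finds nothing, because the scheduled rowop sits on the outermost frame, which only becomes the current frame after A's call completes.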

It is possible to find out whether there are the post-processing rowops scheduled and to run them one by one:

while ($rowop = &readRowop()) { # reads with some user-defined function
  $unit->call($rowop);
  while (!$unit->empty()) {
    $unit->callNext();
  }
}

But of course a Perl loop is less efficient than the C++ loop in drainFrame().

Another straightforward idea is to read and execute the input as it comes in but delay the post-processing until the input becomes idle, exactly like the Tcl idletasks do. Somewhat like this:

while (1) {
  if (!$unit->empty()) {
    $rowop = &readRowopNoWait();
    if ($rowop) {
      $unit->call($rowop);
    } else {
      $unit->callNext();
    }
  } else {
    $rowop = &readRowop();
    last if (!$rowop); # no more input
    $unit->call($rowop);
  }
}

It might even be useful sometimes but most of the time this turns out to be nothing but pain. The problem is that the exact order of execution becomes dependent on the timing of the data arrival, and the repeatable testing becomes next to impossible. It's another case of the bundling problem.

If the data arrives bundled with multiple rowops per packet, you have a choice whether to drain the frame after each rowop or after each packet. Which approach is better depends on the needs of the application and on whether the bundling of the rowops into packets is predictable and repeatable. If there are no defined boundaries between packets but the grouping is done simply by timeout or buffer size, such bundles are much better off being broken up into the individual rowops.

Now let's look at yet another aspect: the main loop may need to exit not only when there is no more input available but also after processing some requests. This can be done by adding a global stop flag, with label handlers setting it when they need to request the exit:

$stop = 0;
while (!$stop && ($rowop = &readRowop())) {
  $unit->call($rowop);
  $unit->drainFrame();
}

The examples in this manual tend to read the input data as plain text lines, convert them to rowops and execute. They are simple-minded, so they don't do any error checking; they would just fail randomly on the incorrect input. Their main loop usually goes along the following lines (with variations, to fit the examples, and as the main loop was refined over time):

while(<STDIN>) {
  chomp;
  my @data = split(/,/); # starts with a command, then string opcode
  my $type = shift @data;
  if ($type eq "lbCur") {
    $unit->makeArrayCall($lbCur, @data);
  } elsif ($type eq "lbPos") {
    $unit->makeArrayCall($lbPos, @data);
  }
  $unit->drainFrame();
}

It reads the CSV (Comma-Separated Values) data from stdin, with the label name in the first column, the opcode in the second, and the data fields in the rest. Then dispatches according to the label.

Many variations are possible. It can be generalized to look up the labels from the hash:

while(<STDIN>) {
  chomp;
  my @data = split(/,/); # starts with a command, then string opcode
  my $type = shift @data;
  $unit->makeArrayCall($labels{$type}, @data);
  $unit->drainFrame();
}

Or call the procedural functions for some types:

while(<STDIN>) {
  chomp;
  my @data = split(/,/); # starts with a command, then string opcode
  my $type = shift @data;
  if ($type eq "lbCur") {
    $unit->makeArrayCall($lbCur, @data);
  } elsif ($type eq "lbPos") {
    $unit->makeArrayCall($lbPos, @data);
  } elsif ($type eq "clear") { # clear the previous day
    &clearByDate($tPosition, @data);
  }
  $unit->drainFrame();
}

Once again, none of these small examples are production-ready. They have no error handling, and their parsing of the CSV data is primitive: it can't handle the quoting properly and can't parse the data with commas in it. A better ready-made way to parse the data will be provided in the future. For now, make your own.
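One way to make your own, sketched here as an assumption rather than a part of Triceps, is to use the CPAN module Text::CSV, which handles the quoting and the embedded commas properly:

```perl
use Text::CSV;

my $csv = Text::CSV->new({ binary => 1 })
  or die "Text::CSV init failed: " . Text::CSV->error_diag();

while (<STDIN>) {
  chomp;
  # parse() handles quoted fields and embedded commas,
  # unlike the naive split(/,/)
  $csv->parse($_) or do {
    warn "bad CSV line: " . $csv->error_input();
    next;
  };
  my @data = $csv->fields(); # starts with a command, then string opcode
  my $type = shift @data;
  $unit->makeArrayCall($labels{$type}, @data);
  $unit->drainFrame();
}
```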

The multithreaded models have their own special needs for the main loops. These will be discussed in Section 16.6: “Dynamic threads and fragments in a socket server”.

7.9. Main loop with a socket

A fairly typical situation is when a CEP model has to run in a daemon process, receiving and sending data through the network sockets. Here is an example that does this. It's not production-ready, it's only of an example quality, and thus is located in an X-package. It still has the issue with the parsing of the CSV data, its handling of the errors is not well-tested, and it makes a few simplifying assumptions about the buffering (more on this below). Other than that, it's a decent starting point. You can import this package as Triceps::X::SimpleServer, with its source code found in lib/Triceps/X/SimpleServer.pm.

package Triceps::X::SimpleServer;

sub CLONE_SKIP { 1; }

our $VERSION = 'v2.0.0';

use Carp;
use Errno qw(EINTR EAGAIN);
use IO::Poll qw(POLLIN POLLOUT POLLHUP);
use IO::Socket;
use IO::Socket::INET;

our @ISA = qw(Exporter);

our %EXPORT_TAGS = ( 'all' => [ qw(
  outBuf outCurBuf mainLoop startServer makeExitLabel makeServerOutLabel
) ] );

our @EXPORT_OK = ( @{ $EXPORT_TAGS{'all'} } );

# For whatever reason, Linux signals SIGPIPE when writing on a closed
# socket (and it's not a pipe). So intercept it.
sub interceptSigPipe
{
  if (!$SIG{PIPE}) {
    $SIG{PIPE} = sub {};
  }
}

# and intercept SIGPIPE by default on import
&interceptSigPipe();

The package starts with the usual imports and exports. The CLONE_SKIP is required to make sure that the package interacts properly with the multithreading (any objects of this package won't be cloned into the new threads, and since the cloning tends to not work right anyway, I'm not sure why it's not the default).

Then it intercepts and ignores the SIGPIPE signal for the reasons described in the comment. It's very inconvenient to have your server die on a signal when the other side decides to drop the connection. Any server dealing with sockets on Linux must intercept SIGPIPE. Intercepting it with an empty handler looks like a better idea than ignoring it altogether, to make extra-sure that the writer won't be stuck in that write forever, but perhaps ignoring it would be just as good. The interception is placed into a function which gets called on the package import and can be called again later in case something else resets the handler to default.

# the socket and buffering control for the main loop;
# they are all indexed by a unique id
our %clients; # client sockets
our %inbufs; # input buffers, collecting the whole lines
our %outbufs; # output buffers
our $poll; # the poll object
our $cur_cli; # the id of the current client being processed
our $srv_exit; # exit when all the client connections are closed

# Writing to the output buffers. Will also trigger the polling to
# actually send the output data to the client's socket.
#
# @param id - the client id, as generated on the client connection
#        (if the client already disconnected, this call will
#        have no effect)
# @param string - the string to write
sub outBuf # ($id, $string)
{
  my $id = shift;
  my $line = shift;
  if (exists $clients{$id}) {
    $outbufs{$id} .= $line;
    # If there is anything to write on a buffer, stop reading from it.
    $poll->mask($clients{$id} => POLLOUT);
  }
}

# Write to the output buffer of the current client (as set in $cur_cli
# by the main loop).
#
# @param string - the string to write
sub outCurBuf # ($string)
{
  outBuf($cur_cli, @_);
}

# Close the client connection. This doesn't flush the output buffer,
# so it must be called only after the flush is done, or if the flush
# can not be done (such as, if the client has dropped the connection).
# It does delete all the client-related data.
#
# @param id - the client id, as generated on the client connection
# @param h - the socket handle of the client
sub _closeClient # ($id, $h)
{
  my $id = shift;
  my $h = shift;
  $poll->mask($h, 0);
  $h->close();
  delete $clients{$id}; # OK per Perl manual even when iterating
  delete $inbufs{$id};
  delete $outbufs{$id};
}

# The server main loop. Runs with the specified server socket.
# Accepts the connections from it, then polls the connections for
# input, reads the data in CSV and dispatches it using the labels hash.
#
# XXX Caveats:
# The way this works, if there is no '\n' before EOF,
# the last line won't be processed.
# Also, the whole output for all the input will be buffered
# before it can be sent.
#
# @param srvsock - the server socket handle
# @param labels - Reference to the label hash, that contains the
#        mappings used to dispatch the input, in either of formats:
#          name => label_object
#          name => code_reference
#        The input from the clients is parsed as CSV with the 1st field
#        containing the label name.  Then if the looked up dispatch is an
#        actual label, the rest of CSV fields are: the 2nd the opcode, and the rest
#        the data fields in the order of the label's row type. If the
#        looked up dispatch is a Perl sub reference, just the whole input
#        line is passed to it as an argument.
sub mainLoop # ($srvsock, $%labels)
{
  my $srvsock = shift;
  my $labels = shift;

  my $client_id = 0; # unique strings
  our $poll = IO::Poll->new();

  $srvsock->blocking(0);
  $poll->mask($srvsock => POLLIN);
  $srv_exit = 0;

  while(!$srv_exit || keys %clients != 0) {
    my $r = $poll->poll();
    confess "poll failed: $!" if ($r < 0 && ! $!{EAGAIN} && ! $!{EINTR});

    if ($poll->events($srvsock)) {
      while(1) {
        my $client = $srvsock->accept();
        if (defined $client) {
          $client->blocking(0);
          $clients{++$client_id} = $client;
          # print("Accepted client $client_id\n");
          $poll->mask($client => (POLLIN|POLLHUP));
        } elsif($!{EAGAIN} || $!{EINTR}) {
          last;
        } else {
          confess "accept failed: $!";
        }
      }
    }

    my ($id, $h, $mask, $n, $s);
    while (($id, $h) = each %clients) {
      no warnings; # or in tests prints a lot of warnings about undefs

      $cur_cli = $id;
      $mask = $poll->events($h);
      if (($mask & POLLHUP) && !defined $outbufs{$id}) {
        # print("Lost client $id\n");
        _closeClient($id, $h);
        next;
      }
      if ($mask & POLLOUT) {
        $s = $outbufs{$id};
        $n = $h->syswrite($s);
        if (defined $n) {
          if ($n >= length($s)) {
            delete $outbufs{$id};
            # now can accept more input
            $poll->mask($h => (POLLIN|POLLHUP));
          } else {
            substr($outbufs{$id}, 0, $n) = '';
          }
        } elsif(! $!{EAGAIN} && ! $!{EINTR}) {
          warn "write to client $id failed: $!";
          _closeClient($id, $h);
          next;
        }
      }
      if ($mask & POLLIN) {
        $n = $h->sysread($s, 10000);
        if ($n == 0) {
          # print("Lost client $id\n");
          _closeClient($id, $h);
          next;
        } elsif ($n > 0) {
          $inbufs{$id} .= $s;
        } elsif(! $!{EAGAIN} && ! $!{EINTR}) {
          warn "read from client $id failed: $!";
          _closeClient($id, $h);
          next;
        }
      }
      # The way this works, if there is no '\n' before EOF,
      # the last line won't be processed.
      # Also, the whole output for all the input will be buffered
      # before it can be sent.
      while($inbufs{$id} =~ s/^(.*)\n//) {
        my $line = $1;
        chomp $line;
        {
          local $/ = "\r"; # take care of a possible CR-LF in this block
          chomp $line;
        }
        my @data = split(/,/, $line);
        my $lname = shift @data;
        my $label = $labels->{$lname};
        if (defined $label) {
          if (ref($label) eq 'CODE') {
            &$label($line);
          } else {
            my $unit = $label->getUnit();
            confess "label '$lname' received from client $id has been cleared"
              unless defined $unit;
            eval {
              $unit->makeArrayCall($label, @data);
              $unit->drainFrame();
            };
            warn "input data error: $@\nfrom data: $line\n" if $@;
          }
        } else {
          warn "unknown label '$lname' received from client $id: $line";
        }
      }
    }
  }
}

The general outline follows the single-threaded multiplexing server described in [Babkin10]. mainLoop() gets the server socket and a dispatch table of labels or functions as its arguments. It then proceeds with waiting for connections.

Once a connection is received, it gets added to the set of active connections, to be included in the polling for the input data. The input data is read as simplified CSV (no commas in the middle of values, and no way to represent the NULL values other than by omitting them at the end of the line). It's expected to have the format:

name,opcode,data...

Such as:

window,OP_INSERT,5,AAA,30,30
window.query,OP_INSERT
exit,OP_NOP

The name part is then used to find a label in the dispatch table. The rest of the data is used to create a rowop for that label and execute it. As you can see, a row must contain at least the label name and opcode, or the execution will print an error message on the server's standard error and return no response to the client in the socket.

If the dispatch table contains not a label but a simple function reference for some name, the rest of the row is not even parsed: the whole input line is passed to the function as its only argument. If the exit is implemented as a function in the dispatch table, the following would also work:

exit

The data is sent back to the client through buffering. To send some data to a client, use

&outBuf($id, $text);

The $id is the unique id of the client. How do you find the id of the client you want to send the data to? When an input line is processed, the main loop knows from what client it was received. It puts the id of that client in the global variable $Triceps::X::SimpleServer::cur_cli. You can take it from there and remember it. If you want to reply to the current client, you don't need to bother yourself with the id at all, just call

&outCurBuf($text);

If you remember an id for future use, and the client disconnects before you call outBuf(), the call will have no effect. In any case, if a client has disconnected, the further processing of its requests should usually be stopped, and thus checking if the client is still connected is a good idea anyway:

if (exists $clients{$id}) {
  # ... prepare the data for it ...
  &outBuf($id, $text);
} else {
  # ... stop sending the data to this client ...
}

The client ids are not reused, so this check is always safe.
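For instance, a hypothetical subscription service could remember the ids of the interested clients and push the data to all of them later, dropping the ids of the disconnected clients as it goes (the hash %subscribers and the function pushToSubscribers are made up for this sketch):

```perl
use Triceps::X::SimpleServer qw(:all);

my %subscribers; # client id => 1

# A label handler for a "subscribe" request would remember the client:
#   $subscribers{$Triceps::X::SimpleServer::cur_cli} = 1;

# Later, when a row needs to be pushed to all the subscribers:
sub pushToSubscribers # ($text)
{
  my $text = shift;
  foreach my $id (keys %subscribers) {
    if (exists $Triceps::X::SimpleServer::clients{$id}) {
      &outBuf($id, $text);
    } else {
      delete $subscribers{$id}; # the client is gone, forget it
    }
  }
}
```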

Once some output is buffered to send to a client, the further input from that client stops being accepted until the output buffer drains. But the processing in the Triceps unit scheduler keeps running until it runs out of things to do before it returns to the main loop. All this time the output buffer keeps collecting data without sending it to the client. Also, the input buffer might happen to already contain multiple lines. Then all these lines will be processed before the data from the output buffer starts being sent to the client. If a request produces a large amount of data, all this data will be buffered first. It's a simplification, but really the commercial CEP systems aren't doing a whole lot better: when asked for the contents of a table/window/materialized view, Coral8 and Aleri and Sybase (don't know about StreamBase but it might be not different either) would make a copy of it first before sending the data. In some cases the copy is more efficient because it references the rows rather than copying the whole byte data, but in the grand scheme of things it's all the same.

Internally the information about the client sockets and their buffers is kept in the global hashes %clients, %inbufs, %outbufs. It could be done as a single hash of objects but this was simpler.

The loop exits when the global variable $Triceps::X::SimpleServer::srv_exit gets set (synchronously, i.e. by one of the label handlers) to 1 and all the clients disconnect. The requirement for disconnection of all the clients makes sure that all the output buffers get flushed before exit, and that was the easiest way to achieve this goal.

mainLoop() relies on the listening socket being already created, bound and given to it as a parameter. The creation of the socket and forking of a separate server process is wrapped in another function:

# The server start function that creates the server socket,
# remembers its port number, then forks and
# starts the main loop in the child process. The parent
# process then returns the pair (port number, child PID).
#
# @param port - the port number to use; 0 will cause a unique free
#        port number to be auto-assigned
# @param labels - reference to the label hash, to be passed to mainLoop()
# @return - pair (port number, child PID) that can then be used to connect
#        to and control the server in the child process
sub startServer # ($port, $%labels)
{
  my $port = shift;
  my $labels = shift;

  my $srvsock = IO::Socket::INET->new(
    Proto => "tcp",
    LocalPort => $port,
    Listen => 10,
  ) or confess "socket failed: $!";
  # Read back the port, since the port 0 will cause a free port
  # to be auto-assigned.
  $port = $srvsock->sockport() or confess "sockport failed: $!";
  my $pid = fork();
  confess "fork failed: $!" unless defined $pid;
  if ($pid) {
    # parent
    $srvsock->close();
  } else {
    # child
    &mainLoop($srvsock, $labels);
    exit(0);
  }
  return ($port, $pid);
}

You can specify the server port 0 to request that the OS bind it to a random unused port. The port number is then read back with sockport(). The pair of the port number and the server's child process id is then returned as the result. The process where the server runs is in this case just a child process, it's not properly daemonized.

For a simple complete example, let's make an echo server that would print back the rows it receives, as found in t/xQuery.t:

our $rtTrade = Triceps::RowType->new(
  id => "int32", # trade unique id
  symbol => "string", # symbol traded
  price => "float64",
  size => "float64", # number of shares traded
);

use Triceps::X::SimpleServer qw(:all);

my $uEcho = Triceps::Unit->new("uEcho");
my $lbEcho = $uEcho->makeLabel($rtTrade, "echo", undef, sub {
  &outCurBuf($_[1]->printP() . "\n");
});
my $lbEcho2 = $uEcho->makeLabel($rtTrade, "echo2", undef, sub {
  &outCurBuf(join(",", "echo", &Triceps::opcodeString($_[1]->getOpcode()),
    $_[1]->getRow()->toArray()) . "\n");
});
my $lbExit = $uEcho->makeLabel($rtTrade, "exit", undef, sub {
  $Triceps::X::SimpleServer::srv_exit = 1;
});

my %dispatch;
$dispatch{"echo"} = $lbEcho;
$dispatch{"echo2"} = $lbEcho2;
$dispatch{"exit"} = $lbExit;

my ($port, $pid) = &Triceps::X::SimpleServer::startServer(0, \%dispatch);
print STDERR "port=$port pid=$pid\n";
waitpid($pid, 0);
exit(0);

It starts the server and waits for it to exit. waitpid() is used here in a simplified way too; it should properly be done in a loop until it succeeds or an error other than EINTR is returned.
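A more careful version of that wait would look like this sketch:

```perl
use Errno qw(EINTR);

# Retry waitpid() if it gets interrupted by a signal; any other
# error (or success) breaks out of the loop.
my $res;
do {
  $res = waitpid($pid, 0);
} while ($res == -1 && $!{EINTR});
```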

$rtTrade is the row type for the expected data. Two labels, echo and echo2, differ in the way they print the data back: echo prints it in the symbolic form while echo2 prints it in CSV. The label exit sets the exit flag. Here is a small session log from the client side (46651 is the port that got picked at random and printed by the server on the start):

$ telnet localhost 46651
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
echo,OP_INSERT,1,a,2,3.4
echo OP_INSERT id="1" symbol="a" price="2" size="3.4"
echo2,OP_INSERT,1,a,2,3.4
echo,OP_INSERT,1,a,2,3.4
exit,OP_NOP
^]
telnet> q
Connection closed.

The names in the dispatch table don't have to be the same as the names of the labels. It's often convenient to have them the same but not mandatory.

The exit label was created manually in this example but SimpleServer also provides the functions that create an exit label or an exit function, either of which can be placed into a dispatch table:

# A dispatch function, sending anything to which will exit the server.
# The server will not flush the outputs before exit.
#
# Use like:
#   $dispatch{"exit"} = \&Triceps::X::SimpleServer::exitFunc;
#
# In this way the input line doesn't have to contain the opcode.
# The alternative way is through makeExitLabel().
sub exitFunc # ($line)
{
    $srv_exit = 1;
}

# Create a label, sending anything to which will exit the server.
# The server will not flush the outputs before exit.
#
# Use like:
#   $dispatch{"exit"} = &Triceps::X::SimpleServer::makeExitLabel($uTrades, "exit");
#
# In this way the input line has to contain at least the opcode.
# The alternative way is through exitFunc().
#
# @param unit - the unit in which to create the label
# @param name - the label name
# @return - the newly created label object
sub makeExitLabel # ($unit, $name)
{
  my $unit = shift;
  my $name = shift;
  return $unit->makeLabel($unit->getEmptyRowType(), $name, undef, sub {
    $srv_exit = 1;
  });
}

makeExitLabel() is quite simple, it creates a label with a hardcoded function of setting the flag $srv_exit. Even its row type is hardcoded to the empty rows. exitFunc() sets the same flag directly.

There is also a function for making the labels that output their rows to the client in CSV format (as usual, no commas in the values, the same as is expected by the socket server):

# Create a label that will print the data in CSV format to the server output
# (to the current client).
#
# @param fromLabel - the new label will be chained to this one and get the
#        data from it
# @param printName - if present, overrides the label name printed
# @return - the newly created label object
sub makeServerOutLabel # ($fromLabel [, $printName])
{
  no warnings; # or in tests prints a lot of warnings about undefs
  my $fromLabel = shift;
  my $printName = shift;
  my $unit = $fromLabel->getUnit();
  my $fromName = $fromLabel->getName();
  if (!$printName) {
    $printName = $fromName;
  }
  my $lbOut = $unit->makeLabel($fromLabel->getType(),
    $fromName . ".serverOut", undef, sub {
      &outCurBuf(join(",", $printName,
        &Triceps::opcodeString($_[1]->getOpcode()),
        $_[1]->getRow()->toArray()) . "\n");
    });
  $fromLabel->chain($lbOut);
  return $lbOut;
}

makeServerOutLabel() finds the unit and row type from the parent label, creates the printing label and chains it off the parent label. The newly created label is returned. The return value can be kept in a variable or immediately discarded; since the created label is already chained, it won't disappear. The name of the new label is produced from the name of the parent label by appending .serverOut to it.
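For example, to send the output of a table to the current client, it's enough to create the label and forget it (here $tWindow is a hypothetical table, assumed to have been created earlier in the same unit):

```perl
# The result can be discarded: the new label stays chained
# off the table's output label and keeps printing to the client.
Triceps::X::SimpleServer::makeServerOutLabel($tWindow->getOutputLabel());
```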

Running the automated tests of the servers requires the clients to be started automatically too, feed the input, receive the results, and then compare them to the expected results. The package Triceps::X::DumbClient from lib/Triceps/X/DumbClient.pm does exactly that. The server code gets created as usual, only instead of starting the server, the dispatch table is given to the DumbClient method that takes care of starting the server, feeding the input, collecting the results, and waiting for the server to stop.

For example, the same echo example is run like this with DumbClient:

use Triceps::X::SimpleServer qw(:all);

my $uEcho = Triceps::Unit->new("uEcho");
my $lbEcho = $uEcho->makeLabel($rtTrade, "echo", undef, sub {
  &outCurBuf($_[1]->printP() . "\n");
});
my $lbEcho2 = $uEcho->makeLabel($rtTrade, "echo2", undef, sub {
  &outCurBuf(join(",", "echo", &Triceps::opcodeString($_[1]->getOpcode()),
    $_[1]->getRow()->toArray()) . "\n");
});
my $lbExit = $uEcho->makeLabel($rtTrade, "exit", undef, sub {
  $Triceps::X::SimpleServer::srv_exit = 1;
});

my %dispatch;
$dispatch{"echo"} = $lbEcho;
$dispatch{"echo2"} = $lbEcho2;
$dispatch{"exit"} = $lbExit;

my @inputQuery = (
"echo,OP_INSERT,1,a,2,3.4\n",
"echo2,OP_INSERT,1,a,2,3.4\n",
);
my $expectQuery =
'> echo,OP_INSERT,1,a,2,3.4
> echo2,OP_INSERT,1,a,2,3.4
echo OP_INSERT id="1" symbol="a" price="2" size="3.4"
echo,OP_INSERT,1,a,2,3.4
';

Triceps::X::TestFeed::setInputLines(@inputQuery);
Triceps::X::DumbClient::run(\%dispatch);

ok(&Triceps::X::TestFeed::getResultLines(), $expectQuery);

DumbClient works in symbiosis with the TestFeed module that handles the recorded inputs and outputs. Note that the exit line is not there; DumbClient adds it implicitly at the end of the input.

The input lines are also included by TestFeed in the output with the > prepended to them. DumbClient feeds all the inputs first and then reads all the results, relying on the TCP buffering to avoid deadlocking on the flow control. This works only for the small amounts of input but is good enough for the small tests.

And the implementation of DumbClient is fairly small, there is only one method:

sub run # ($labels)
{
  my $labels = shift;

  my ($port, $pid) = Triceps::X::SimpleServer::startServer(0, $labels);
  my $sock = IO::Socket::INET->new(
    Proto => "tcp",
    PeerAddr => "localhost",
    PeerPort => $port,
  ) or confess "socket failed: $!";
  while(&readLine) {
    $sock->print($_);
    $sock->flush();
  }
  $sock->print("exit,OP_INSERT\n");
  $sock->flush();
  $sock->shutdown(1); # SHUT_WR
  while(<$sock>) {
    &send($_);
  }
  waitpid($pid, 0);
}

As mentioned before in Section 4.8: “The Perl libraries and examples”, the methods readLine and send are imported from the TestFeed module.

7.10. Tracing the execution

When developing the CEP models, there always comes the question: WTF had just happened? How did it manage to get this result? Followed by subscribing to many intermediate results and trying to piece together the execution order.

Triceps provides two solutions for this situation: First, the procedural approach should make the logic much easier to follow. Second, it has a ready way to trace the execution and then read the trace in one piece. It can also be used to analyze any variables on the fly, and possibly stop the execution and enter some manual mode.

The idea here is simple: provide the Unit with a method that will be called:

  • before a label executes,
  • before the chained labels execute,
  • after the chained labels execute,
  • after the label executes,
  • before the label's frame is drained (and thus the forked rowops execute, see the details of that in Section 7.11: “The gritty details of Triceps scheduling”),
  • after the frame is drained.

The calls around the chaining and around the draining are done only if there are the chained labels to call or forked rowops to drain accordingly. Otherwise these pairs are skipped.

The tracing calls happen in the order shown above. The call after the label executes goes after the chained calls (if any), enveloping them. However the draining calls happen after that (and no matter how many rowops were forked onto that frame, there will be only one after-draining call per frame, still referring to the original label).

For the simple tracing, a small simple tracer is provided. It actually executes directly as compiled in C++ so it's quite efficient:

$tracer = Triceps::UnitTracerStringName->new(option => $value, ...);

The arguments are specified as the option name-value pairs.

The only option supported is verbose, which may be 0 (default) or non-0. If it's 0 (false), the tracer will record a message only before executing each label. If true, it will record a message on each stage. The class is named UnitTracerStringName because it records the execution trace in the string format, including the names of the labels. The tracer is set into the unit:

$unit->setTracer($tracer);

The unit's current tracer can also be read back:

$oldTracer = $unit->getTracer();

If no tracer was previously set, getTracer() will return undef. And undef can also be used as an argument of setTracer(), to cancel any previously set tracing.
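Together, these methods allow tracing just a fragment of the execution by saving and restoring whatever tracer was set before:

```perl
my $oldTracer = $unit->getTracer(); # may be undef
$unit->setTracer($tracer);
# ... run the fragment of interest ...
$unit->setTracer($oldTracer); # undef here cancels the tracing
```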

The tracer references can be compared for whether they refer to the same underlying object:

$result = $tracer1->same($tracer2);

There are multiple kinds of tracer objects, and same() can be called safely for either kind of tracer, including mixing them together. Of course, the tracers of different kinds definitely would not be the same tracer object.

As the unit runs, the tracing information gets collected in the tracer object. It can be extracted back with:

$data = $tracer->print();

This does not reset the trace. To reset it, use:

$tracer->clearBuffer();
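A typical pattern is to dump and reset the trace after processing each unit of input, which keeps the traces of the separate inputs from running together. For example, in a main loop:

```perl
while (<STDIN>) {
  chomp;
  # ... convert the line to a rowop and call it ...
  $unit->drainFrame();
  print STDERR $tracer->print(); # dump the trace of this line
  $tracer->clearBuffer(); # start the next line's trace from scratch
}
```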

Here is a code sequence designed to produce a fairly involved trace:

$sntr = Triceps::UnitTracerStringName->new(verbose => 1);
$u1->setTracer($sntr);

$c_lab1 = $u1->makeDummyLabel($rt1, "lab1");
$c_lab2 = $u1->makeDummyLabel($rt1, "lab2");
$c_lab3 = $u1->makeDummyLabel($rt1, "lab3");

$c_op1 = $c_lab1->makeRowop(&Triceps::OP_INSERT, $row1);
$c_op2 = $c_lab1->makeRowop(&Triceps::OP_DELETE, $row1);

$c_lab1->chain($c_lab2);
$c_lab1->chain($c_lab3);
$c_lab2->chain($c_lab3);

$u1->schedule($c_op1);
$u1->schedule($c_op2);

$u1->drainFrame();

The trace is:

unit 'u1' before label 'lab1' op OP_INSERT {
unit 'u1' before-chained label 'lab1' op OP_INSERT {
unit 'u1' before label 'lab2' (chain 'lab1') op OP_INSERT {
unit 'u1' before-chained label 'lab2' (chain 'lab1') op OP_INSERT {
unit 'u1' before label 'lab3' (chain 'lab2') op OP_INSERT {
unit 'u1' after label 'lab3' (chain 'lab2') op OP_INSERT }
unit 'u1' after-chained label 'lab2' (chain 'lab1') op OP_INSERT }
unit 'u1' after label 'lab2' (chain 'lab1') op OP_INSERT }
unit 'u1' before label 'lab3' (chain 'lab1') op OP_INSERT {
unit 'u1' after label 'lab3' (chain 'lab1') op OP_INSERT }
unit 'u1' after-chained label 'lab1' op OP_INSERT }
unit 'u1' after label 'lab1' op OP_INSERT }
unit 'u1' before label 'lab1' op OP_DELETE {
unit 'u1' before-chained label 'lab1' op OP_DELETE {
unit 'u1' before label 'lab2' (chain 'lab1') op OP_DELETE {
unit 'u1' before-chained label 'lab2' (chain 'lab1') op OP_DELETE {
unit 'u1' before label 'lab3' (chain 'lab2') op OP_DELETE {
unit 'u1' after label 'lab3' (chain 'lab2') op OP_DELETE }
unit 'u1' after-chained label 'lab2' (chain 'lab1') op OP_DELETE }
unit 'u1' after label 'lab2' (chain 'lab1') op OP_DELETE }
unit 'u1' before label 'lab3' (chain 'lab1') op OP_DELETE {
unit 'u1' after label 'lab3' (chain 'lab1') op OP_DELETE }
unit 'u1' after-chained label 'lab1' op OP_DELETE }
unit 'u1' after label 'lab1' op OP_DELETE }

The print-out is not indented because the execution of real models tends to involve some quite long call chains, which would result in some extremely wide indenting. Instead the curly braces at the end of each line help to find the matching pair. You can always use the vi command % to jump to the matching brace, or a similar feature in the other editors.

In non-verbose mode the same trace would be:

unit 'u1' before label 'lab1' op OP_INSERT
unit 'u1' before label 'lab2' (chain 'lab1') op OP_INSERT
unit 'u1' before label 'lab3' (chain 'lab2') op OP_INSERT
unit 'u1' before label 'lab3' (chain 'lab1') op OP_INSERT
unit 'u1' before label 'lab1' op OP_DELETE
unit 'u1' before label 'lab2' (chain 'lab1') op OP_DELETE
unit 'u1' before label 'lab3' (chain 'lab2') op OP_DELETE
unit 'u1' before label 'lab3' (chain 'lab1') op OP_DELETE

The non-verbose trace doesn't have the curly braces because there are no matching pairs of lines.

The actual contents of the rows are not printed in either case. This is basically because the tracer is implemented in C++, and I've been trying to keep the knowledge of the meaning of the simple data types out of the C++ code as much as possible for now. But it can be implemented with a Perl tracer.

A Perl tracer is created with:

$tracer = Triceps::UnitTracerPerl->new($sub, @args);

The arguments are a reference to a function, and optionally arguments for it. The resulting tracer can be used in the unit's setTracer() as usual. A source code string may be used instead of the function reference, see Section 4.4: “Code references and snippets”.

The function of the Perl tracer gets called as:

&$sub($unit, $label, $fromLabel, $rowop, $when, @args)

The arguments are:

  • $unit is the usual unit reference.
  • $label is the current label being traced.
  • $fromLabel is the parent label in the chaining (would be undef if the current label is called directly, without chaining from anything).
  • $rowop is the current row operation.
  • $when is an integer constant showing the point when the tracer is being called. Its value may be one of &Triceps::TW_BEFORE, &Triceps::TW_AFTER, &Triceps::TW_BEFORE_DRAIN, &Triceps::TW_AFTER_DRAIN, &Triceps::TW_BEFORE_CHAINED, &Triceps::TW_AFTER_CHAINED; the prefix TW stands for tracer when.
  • @args are the extra arguments passed from the tracer creation.

The TW_* constants can as usual be converted to and from strings with the calls

$string = &Triceps::tracerWhenString($value);
$value = &Triceps::stringTracerWhen($string);
$string = &Triceps::tracerWhenStringSafe($value);
$value = &Triceps::stringTracerWhenSafe($string);

There also are the conversion functions with strings more suitable for the human-readable messages: before, after, before-chained, after-chained, before-drain, after-drain. These are actually the conversions used in the UnitTracerStringName. The functions for them are:

$string = &Triceps::tracerWhenHumanString($value);
$value = &Triceps::humanStringTracerWhen($string);
$string = &Triceps::tracerWhenHumanStringSafe($value);
$value = &Triceps::humanStringTracerWhenSafe($string);
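For example, assuming the usual Triceps convention that the non-Safe versions confess on an unknown value while the Safe versions return undef:

```perl
$string = &Triceps::tracerWhenHumanString(&Triceps::TW_BEFORE); # "before"
$value = &Triceps::humanStringTracerWhen("before"); # &Triceps::TW_BEFORE
$string = &Triceps::tracerWhenHumanStringSafe(999); # undef, not an error
```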

Now that the constants have been mentioned, the order of tracing calls for a single executing rowop on a single label is:

TW_BEFORE
TW_BEFORE_CHAINED
TW_AFTER_CHAINED
TW_AFTER
TW_BEFORE_DRAIN
TW_AFTER_DRAIN

There is also a general way to find whether the $when refers to a before or after situation:

$result = &Triceps::tracerWhenIsBefore($when);
$result = &Triceps::tracerWhenIsAfter($when);

Their typical usage in a trace function, to append an opening or closing brace, looks like:

if (Triceps::tracerWhenIsBefore($when)) {
  $msg .= " {";
} elsif (Triceps::tracerWhenIsAfter($when)) {
  $msg .= " }";
}

More trace points that are neither before nor after could get added in the future, so a good practice is to use an elsif with both conditions rather than a simple if/else with one condition.

The Perl tracers allow executing any arbitrary actions when tracing. They can act as breakpoints by looking for certain conditions and opening a debugging session when those are met.
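Here is a sketch of such a breakpoint: a tracer function that stops the model in the Perl debugger when a rowop arrives at a particular label. The label name "tWindow.out" is made up for this example, and running under "perl -d" is assumed for $DB::single to take effect:

```perl
sub breakpointCb # ($unit, $label, $fromLabel, $rowop, $when, @args)
{
  my ($unit, $label, $fromLabel, $rowop, $when) = @_;
  if ($when == &Triceps::TW_BEFORE
  && $label->getName() eq "tWindow.out") {
    print STDERR "Breakpoint: ", $rowop->printP(), "\n";
    $DB::single = 1; # stop in the debugger, if running under perl -d
  }
}

$unit->setTracer(Triceps::UnitTracerPerl->new(\&breakpointCb));
```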

For an example of a Perl tracer, let's start with a tracer function that works like UnitTracerStringName:

sub tracerCb() # unit, label, fromLabel, rop, when, extra
{
  my ($unit, $label, $fromLabel, $rop, $when, @extra) = @_;
  our $history;

  my $msg = "unit '" . $unit->getName() . "' "
    . Triceps::tracerWhenHumanString($when) . " label '"
    . $label->getName() . "' ";
  if (defined $fromLabel) {
    $msg .= "(chain '" . $fromLabel->getName() . "') ";
  }
  $msg .= "op " . Triceps::opcodeString($rop->getOpcode());
  if (Triceps::tracerWhenIsBefore($when)) {
    $msg .= " {";
  } elsif (Triceps::tracerWhenIsAfter($when)) {
    $msg .= " }";
  }
  $msg .= "\n";
  $history .= $msg;
}

undef $history;
$ptr = Triceps::UnitTracerPerl->new(\&tracerCb);
$u1->setTracer($ptr);

It differs slightly, in that it always produces the verbose trace, and that it collects the trace in the global variable $history. But the resulting text is the same as with UnitTracerStringName.

Now let's improve on it by printing the whole rowop contents too. In a proper way this advanced tracer would be defined as a class constructing the tracer objects. But to reduce the amount of code let's just make it a standalone function to be used with the Perl tracer constructor.

And for something different let's make the result indented, with two spaces per indenting level. As mentioned before, the indenting is actually not such a great idea. But for the small short examples it works well. The function would take 3 extra arguments:

  • Verbosity, a boolean value.
  • Reference to an array variable where to append the text of the trace. This is more flexible than the fixed $history. The array will contain the lines of the trace as its elements. And appending to an array should be more efficient than appending to the end of a potentially very long string.
  • Reference to a scalar variable that would be used to keep the indenting level. The value of that variable will be updated as the tracing happens. Its initial value will determine the initial indenting level.

sub traceStringRowop
{
  my ($unit, $label, $fromLabel, $rowop, $when,
    $verbose, $rlog, $rnest) = @_;

  if ($verbose) {
    ${$rnest}-- if (Triceps::tracerWhenIsAfter($when));
  } else {
    return if ($when != &Triceps::TW_BEFORE);
  }

  my $msg =  "unit '" . $unit->getName() . "' "
    . Triceps::tracerWhenHumanString($when) . " label '"
    . $label->getName() . "' ";
  if (defined $fromLabel) {
    $msg .= "(chain '" . $fromLabel->getName() . "') ";
  }
  my $tail = "";
  if (Triceps::tracerWhenIsBefore($when)) {
    $tail = " {";
  } elsif (Triceps::tracerWhenIsAfter($when)) {
    $tail = " }";
  }
  push (@{$rlog}, ("  " x ${$rnest}) . $msg . "op "
    . $rowop->printP() . $tail);

  if ($verbose) {
    ${$rnest}++ if (Triceps::tracerWhenIsBefore($when));
  }
}

undef @history;
my $tnest =  0; # keeps track of the tracing nesting level
$ptr = Triceps::UnitTracerPerl->new(\&traceStringRowop, 1, \@history, \$tnest);
$u1->setTracer($ptr);

For the same call sequence as before, the output will be as follows (I've tried to wrap the long lines in a logically consistent way but it still spoils the effect of indenting a bit):

unit 'u1' before label 'lab1' op lab1 OP_INSERT a="123" b="456"
    c="789" d="3.14" e="text"  {
  unit 'u1' before-chained label 'lab1' op lab1 OP_INSERT a="123"
      b="456" c="789" d="3.14" e="text"  {
    unit 'u1' before label 'lab2' (chain 'lab1') op lab1 OP_INSERT
        a="123" b="456" c="789" d="3.14" e="text"  {
      unit 'u1' before-chained label 'lab2' (chain 'lab1') op lab1
          OP_INSERT a="123" b="456" c="789" d="3.14" e="text"  {
        unit 'u1' before label 'lab3' (chain 'lab2') op lab1 OP_INSERT
            a="123" b="456" c="789" d="3.14" e="text"  {
        unit 'u1' after label 'lab3' (chain 'lab2') op lab1 OP_INSERT
            a="123" b="456" c="789" d="3.14" e="text"  }
      unit 'u1' after-chained label 'lab2' (chain 'lab1') op lab1
          OP_INSERT a="123" b="456" c="789" d="3.14" e="text"  }
    unit 'u1' after label 'lab2' (chain 'lab1') op lab1 OP_INSERT
        a="123" b="456" c="789" d="3.14" e="text"  }
    unit 'u1' before label 'lab3' (chain 'lab1') op lab1 OP_INSERT
        a="123" b="456" c="789" d="3.14" e="text"  {
    unit 'u1' after label 'lab3' (chain 'lab1') op lab1 OP_INSERT
        a="123" b="456" c="789" d="3.14" e="text"  }
  unit 'u1' after-chained label 'lab1' op lab1 OP_INSERT a="123"
      b="456" c="789" d="3.14" e="text"  }
unit 'u1' after label 'lab1' op lab1 OP_INSERT a="123" b="456" c="789"
    d="3.14" e="text"  }
unit 'u1' before label 'lab1' op lab1 OP_DELETE a="123" b="456"
    c="789" d="3.14" e="text"  {
  unit 'u1' before-chained label 'lab1' op lab1 OP_DELETE a="123"
      b="456" c="789" d="3.14" e="text"  {
    unit 'u1' before label 'lab2' (chain 'lab1') op lab1 OP_DELETE
        a="123" b="456" c="789" d="3.14" e="text"  {
      unit 'u1' before-chained label 'lab2' (chain 'lab1') op lab1
          OP_DELETE a="123" b="456" c="789" d="3.14" e="text"  {
        unit 'u1' before label 'lab3' (chain 'lab2') op lab1 OP_DELETE
            a="123" b="456" c="789" d="3.14" e="text"  {
        unit 'u1' after label 'lab3' (chain 'lab2') op lab1 OP_DELETE
            a="123" b="456" c="789" d="3.14" e="text"  }
      unit 'u1' after-chained label 'lab2' (chain 'lab1') op lab1
          OP_DELETE a="123" b="456" c="789" d="3.14" e="text"  }
    unit 'u1' after label 'lab2' (chain 'lab1') op lab1 OP_DELETE
        a="123" b="456" c="789" d="3.14" e="text"  }
    unit 'u1' before label 'lab3' (chain 'lab1') op lab1 OP_DELETE
        a="123" b="456" c="789" d="3.14" e="text"  {
    unit 'u1' after label 'lab3' (chain 'lab1') op lab1 OP_DELETE
        a="123" b="456" c="789" d="3.14" e="text"  }
  unit 'u1' after-chained label 'lab1' op lab1 OP_DELETE a="123"
      b="456" c="789" d="3.14" e="text"  }
unit 'u1' after label 'lab1' op lab1 OP_DELETE a="123" b="456" c="789"
    d="3.14" e="text"  }

As mentioned before, each label produces two levels of indenting: one for everything after its before point, another one for the nested labels.

Eventually this tracing should become another standard class in Triceps.

7.11. The gritty details of Triceps scheduling

There are four ways of executing a rowop in Triceps:

Call:

Execute the label right now, including all the nested calls. When the call returns, the execution is completed. This is the most typical way, and the only one described in detail so far.

Schedule:

Execute the label after everything else is done.

Fork:

Execute the label after the current label returns but before its caller gets the control back or anything else is done. Obviously, if multiple labels are forked, they will execute in the order they were forked. The forked labels can be seen as little siblings of the current label. Forking is currently not used much, other than for the special case of looping.

Loop:

Execute the label as the start of the next iteration of the topological loop, after the current iteration is fully completed. This is a special case of fork, essentially forking at the level of the loop's first label.

The common term encompassing all of them is enqueue. Enqueue is an ugly word but since I've already used the word schedule for a specific purpose, I needed another word to name all these operations together. Hence enqueue.

The meaning is kind of intuitively straightforward but the details might sometimes be a bit surprising. So let us look in detail at how it works inside on an example of a fairly convoluted scheduling sequence.

A scheduler in the execution unit keeps not just a single queue but a stack of queues that contain the rowops to be executed. The rowops get into the queues when they are forked or looped or scheduled. Each queue is essentially a stack frame, so I'll be using the terms queue and frame interchangeably. The stack always contains at least one queue, which is called the outermost stack frame.

When the new rowops arrive from the outside world, they can be added with the method schedule() to that stack frame. That's what schedule() does: always adds rowops to the outermost stack frame, no matter how many frames might be pushed on top of it. If rowops 1, 2 and 3 are added, the stack looks like this (the brackets denote a stack frame):

[1, 2, 3]

The unit method drainFrame() is then used to run the scheduler and process the rowops. It makes the unit call each rowop on the innermost frame (which is initially the same as outermost frame, since there is only one frame) in order.

First it calls the rowop 1. It's removed from the queue, then a new frame is pushed onto the stack:

[ ] ~1
[2, 3]

This new frame is the rowop 1's frame, which is marked on the diagram by ~1. The diagram shows the most recently pushed, innermost, frame on top, and the oldest, outermost frame on the bottom. The concepts of innermost and outermost come from the nested calls: the most recent call is nested the deepest in the middle and is the innermost one.

Then the rowop 1 executes. If it calls rowop 4, another frame is pushed onto the stack for it:

[ ] ~4
[ ] ~1
[2, 3]

Then the rowop 4 executes. The rowop 4 never gets onto any of the queues. The call just pushes a new frame and executes the rowop right away. The identity of the rowop being processed is kept in the call context. A call also involves a direct C++ call on the thread stack, and if any Perl code is involved, a Perl call too. Because of this, if you nest the calls too deeply, you may run out of thread stack space and crash.

After the rowop 4 is finished (not calling any other rowops), the innermost empty frame is popped before the execution of rowop 1 continues. The queue stack reverts to the previous state.

[ ] ~1
[2, 3]

Suppose then rowop 1 forks rowops 5 and 6 by calling the Unit method fork(). They are appended to the innermost frame in the order they are forked.

[5, 6] ~1
[2, 3]

If rowop 1 then calls rowop 7, again a frame is pushed onto the stack before it executes:

[ ] ~7
[5, 6] ~1
[2, 3]

The rowops 5 and 6 still don't execute; they keep sitting on the queue until rowop 1 returns. After the call of rowop 7 completes, the scheduler stack returns to the previous state.

Suppose now the execution of rowop 1 completes. But its stack frame can not be popped yet, because it is not empty. Now is the time to execute the rowops from it. This is also called frame draining, but it works somewhat differently for the forked rowops. The first rowop gets picked from the frame and called, but in a special way: it doesn't get its own frame. Instead, it takes over the frame of its parent rowop. The frame that was marked ~1 now changes its marking to ~5 because of that take-over:

[6] ~5
[2, 3]

If the rowop 5 forks rowop 8, the stack becomes:

[6, 8] ~5
[2, 3]

Since the frame was inherited from the parent rowop 1, the rowop 8 just gets appended to the end of it after rowop 6. The rowops forked in the same frame are executed in the order they were forked. Unlike the calls, there is no nesting involved in forking.

When the execution of rowop 5 returns, the execution of the forked rowops from the innermost frame continues. The rowop 6 gets picked from the front of the frame and takes over the frame ownership:

[8] ~6
[2, 3]

Suppose the rowop 6 doesn't call or fork anything else and returns. Then the rowop 8 starts executing and takes over the frame:

[ ] ~8
[2, 3]

Suppose rowop 8 calls schedule() of rowop 9. Rowop 9 is then added to the outermost queue:

[ ] ~8
[2, 3, 9]

Rowop 8 then returns, its queue is empty, so it's popped and its call completes.

[2, 3, 9]

The method drainFrame() keeps running on the outermost frame, now taking the rowop 2 and executing it, and so on, until the outermost queue becomes empty, and drainFrame() returns.

An interesting question is, what happens with the chained labels? Where do they fit in the order of execution? They turn out to be similar to a fork(). The presence of chaining gets checked after the original label completes its execution but before executing any of the forked labels from its frame. If any chained labels are found, they are called one by one. They take over the frame of the parent, just like the forked labels. Any of the chained labels may also call fork(), adding more labels to the frame. The next forked label (if any) gets executed only after all the labels chained from the current one are done.

What would happen if drainFrame() is called not from outside the model but from inside some label handler? It will drain the innermost frame. Suppose that the queue stack was in the following state, with rowop 5 executing:

[6, 8] ~5
[2, 3]

If the label handler of the rowop 5 calls drainFrame() now, drainFrame() will do its usual job: pick the rowops one by one from the innermost frame, create the nested frames for them and execute. So first it will pick up the rowop 6:

[ ] ~6
[8] ~5
[2, 3]

After the rowop 6 completes, its frame gets popped:

[8] ~5
[2, 3]

But drainFrame() continues running, and now picks the rowop 8:

[ ] ~8
[ ] ~5
[2, 3]

After the rowop 8 completes, its frame gets also popped:

[ ] ~5
[2, 3]

At this point the innermost frame becomes empty and drainFrame() returns. The label handler of rowop 5 continues its execution.

If you haven't forked anything, the innermost frame will be empty, and drainFrame() will do nothing. If you did fork some rowops, drainFrame() looks like a convenient way to call them now and then continue. However, note that in this case the semantics differs from the normal forking: the rowops from the frame will be called in nested frames, not taking over the original frame. So if, say, rowop 6 refers to the same label as rowop 5, this nested execution will be considered a recursive call of the same label. Thus drainFrame() is best used only with the outermost frame.

What if the rowop 1 weren't scheduled and then drained but was just directly called? The outermost frame will remain empty, while a new frame will be pushed for the rowop 1 as usual:

[ ] ~1
[ ]

If the rowop 1 executed the same code as before, after the call it will leave the rowop 9 scheduled on the outermost frame:

[9]

To execute the rowop 9, call drainFrame(), or it will be stuck there forever.

Note that the execution order differs depending on whether the incoming rowops were scheduled or directly called, and on when the drainFrame() is called. If the three rowops were scheduled and then drained, the execution order will be 1, 2, 3, 9. If they were called directly with draining the frame after each one, the order will be 1, 9, 2, 3. And if they were called directly but with draining only after the last one, it would be again 1, 2, 3, 9.
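The mechanics described above can be made concrete with a toy plain-Perl simulation of the frame stack. This is not the Triceps API: the names toy_call(), toy_fork(), toy_schedule() and toy_drain() are made up for illustration, and the "rowops" are just hash references with a name and an optional body. Still, it reproduces both execution orders from the previous paragraph:

```perl
use strict;
use warnings;

my @order;          # records the names of the rowops in execution order
my @stack = ([]);   # the frame stack; $stack[0] is the outermost frame

# call(): push a fresh frame, run the body, then drain the forked
# rowops by frame take-over, then pop the now-empty frame.
sub toy_call {
    my ($rowop) = @_;
    push @stack, [];
    push @order, $rowop->{name};
    $rowop->{body}->() if $rowop->{body};
    while (my $next = shift @{ $stack[-1] }) {   # forked rowops take over the frame
        push @order, $next->{name};
        $next->{body}->() if $next->{body};
    }
    pop @stack;
}

sub toy_fork     { push @{ $stack[-1] }, $_[0] }   # append to the innermost frame
sub toy_schedule { push @{ $stack[0]  }, $_[0] }   # append to the outermost frame

sub toy_drain {   # like drainFrame() on the outermost frame
    while (my $next = shift @{ $stack[0] }) {
        toy_call($next);
    }
}

my $r9 = { name => 9 };
my $r1 = { name => 1, body => sub { toy_schedule($r9) } };
my ($r2, $r3) = ({ name => 2 }, { name => 3 });

# Scheduled and then drained: 9 runs after 2 and 3.
toy_schedule($_) for ($r1, $r2, $r3);
toy_drain();
print "scheduled: @order\n";    # scheduled: 1 2 3 9

# Called directly, draining after each call: 9 runs right after 1.
@order = ();
for my $r ($r1, $r2, $r3) { toy_call($r); toy_drain(); }
print "called:    @order\n";    # called:    1 9 2 3
```

Playing with this kind of model is a cheap way to build an intuition for the scheduler before tracing a real unit.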

The loop scheduling is a whole big separate subject that will be discussed in the next section.

7.12. The gritty details of Triceps loop scheduling

Now it's time to look at what is really going on when a topological loop gets executed. Let's continue looking at the loop example that was already shown in Figure 7.1.

If the loop were handled simple-mindedly, with all the execution done by calls, it could use a lot of stack space. Suppose some rowop X1 is scheduled for label X, and causes the loop to be executed twice, with rowops X1, A2, B3, C4, A5, B6, C7, Y8. If each operation is done as a call(), the stack grows like this: It starts with X1 called, creating its own execution frame (marked as such for clarity):

[ ] ~X1
[ ]

Which then calls A2:

[ ] ~A2
[ ] ~X1
[ ]

Which then continues the calls in sequence. By the time the execution comes to Y8, the stack looks like this:

[ ] ~Y8
[ ] ~C7
[ ] ~B6
[ ] ~A5
[ ] ~C4
[ ] ~B3
[ ] ~A2
[ ] ~X1
[ ]

The loop has been converted into recursion, and the whole length of execution is the depth of the recursion. If the loop executes a million times, the stack will be three million levels deep. Worse yet, it's not just the Triceps scheduler stack that grows, it's also the process (C++ and Perl) stack.

Which is why this kind of recursive call is forbidden by default in Triceps. If you try it, the execution will die with an error on the first recursive call. You can enable the recursion, but that only lets the stack grow and doesn't prevent the growth.

Would things be better with fork() instead of call() used throughout the loop? It starts the same way:

[X1]

Then X1 executes, gets its own frame and forks A2:

[A2] ~X1
[ ]

Then A2 inherits the stack frame and executes, forks B3:

[B3] ~A2
[ ]

On each step the frame will be inherited by the next label, and if Y8 is also eventually forked, at the end the stack will be:

[ ] ~Y8
[ ]

Problem solved, no matter how many iterations were done by the loop, the stack will stay limited.

The catch though is that every operation inside the loop must be done with a fork(). If there is even one call() occurring in the loop, the stack will grow by a frame for each call() and may become quite deep again. The problem is that call() is hardcoded in many primitives, such as Tables, and is fairly typically used in the templates as well. The historic solution was to specify for each table how it should handle its results: call them, fork them, or even schedule them. And the templates could use a similar approach.

Practice quickly showed that not only is all this explicit choice cumbersome and easy to miss, but the semantics of fork() also differs from call() in a very annoying way. If some label wants to do something, call some other label, and then do something more using the result of the call, doing it with call() is simple: just execute all of this procedurally in sequence. After call() returns, its work is guaranteed to be done and any global state updated. Not so with fork(), which just puts the rowop onto a queue; there simply isn't any way to get the second half of the original label's code to execute only after all the effects from the forked rowop have propagated. (Historically, fork() worked differently in Triceps 1.0 and did allow reproducing the call semantics through some minor contortions, but then it kept growing the stack on every fork, just as the calls do.)

The solution, even back in the version 1.0 days, was to add a special method for the loop scheduling.

It starts with the concept of the frame mark. A frame mark is a token object, completely opaque to the program. It can be used only in two operations:

  • setMark() remembers the position in the frame stack, just outside the current frame.
  • loopAt() enqueues a rowop at the marked frame.

Then the loop would have its mark object M. The label A will execute setMark(M), and the label C will execute loopAt(M, rowop(A)). The rest of the execution can as well use call(), as shown in Figure 7.2.

Figure 7.2. Proper calls in a loop.


When the label A executes the rowop A2, the first thing it does is call setMark(M). After that the stack will look like this:

[ ] ~A2, mark M
[ ] ~X1
[ ]

The mark M remembers the current frame. The stack at the end of C4, after it has called loopAt(M, A5), is:

[ ] ~C4
[ ] ~B3
[A5] ~A2, mark M
[ ] ~X1
[ ]

The stack then unwinds until A5 starts its execution:

[ ] ~A5, mark M
[ ] ~X1
[ ]

When A5 inherits the stack frame from A2, the mark M stays put. The label A would normally call setMark(M) again anyway, but it will just put the mark onto the same frame, so effectively it's a no-operation.

Thus each iteration starts with a fresh stack, and the stack depth is limited to one iteration. The nested loops can also be properly executed.
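The bounded-depth property can be demonstrated with another toy plain-Perl simulation, following the diagrams above. Here the mark is simply a saved reference to A's frame, B and C execute as plain nested calls, and C enqueues the next iteration at the marked frame; the names iteration(), run_loop() and the depth bookkeeping are made up for illustration, not Triceps API:

```perl
use strict;
use warnings;

my @stack = ([]);   # the frame stack; $stack[0] is the outermost frame
my $max_depth = 1;  # the deepest the stack ever gets

sub depth_check { $max_depth = scalar(@stack) if @stack > $max_depth }

# One loop iteration: B and C run as plain nested calls, and C enqueues
# the next iteration's rowop at the marked frame, like loopAt(M, ...).
sub iteration {
    my ($n, $limit, $mark_frame) = @_;
    push @stack, []; depth_check();             # ~B
    push @stack, []; depth_check();             # ~C
    push @$mark_frame, $n + 1 if $n < $limit;   # loopAt: queue at the mark
    pop @stack;                                 # C returns
    pop @stack;                                 # B returns
}

sub run_loop {
    my ($limit) = @_;
    push @stack, []; depth_check();   # ~A, the first iteration's frame
    my $mark_frame = $stack[-1];      # setMark(M): remember this frame
    iteration(1, $limit, $mark_frame);
    # Drain the marked frame: each queued iteration takes it over,
    # so no new frame is pushed between the iterations.
    while (defined(my $n = shift @$mark_frame)) {
        iteration($n, $limit, $mark_frame);
    }
    pop @stack;
}

run_loop(1000);
print "max depth: $max_depth\n";   # stays at 4 (outermost + A + B + C)
```

No matter whether run_loop() is given 3 or a million iterations, the maximal stack depth stays the same, which is exactly the point of the mark.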

After Y8 completes, the stack will unroll back, and X1 can continue its execution:

[ ] ~X1
[ ]

To reiterate, when the control returns back to X1, the whole loop is done.

What happens after the stack unwinds past the mark? The mark gets unset. When someone calls loopAt() with an unset mark, the rowop is enqueued in the outermost frame, having the same effect as schedule().

It's possible to use this handling of an unset mark for some creative effects. It allows the loops to take a pause in the middle. Suppose the label B finds that it can't process the rowop B3 until some other data has arrived. What it can do then is remember B3 somewhere in the thread state and return. The loop has not completed but it can't progress either, so the call stack unwinds until it becomes empty. In this case the code of label X must be prepared to find that the loop hasn't completed yet after the call of A2 returns. Since the frame of X1 is popped off the stack, the mark M gets unset. The knowledge that the loop needs to be continued stays remembered in the state.

After some time that awaited data arrives, as some other rowop. When that rowop gets processed, it will find the remembered state with B3 and will make it continue, maybe by calling call(B3) again. So now the logic in B finds all the data it needs and continues with the loop, calling C4. C4 will do its job and call loopAt(M, A5). But the mark M was unset a while ago! Scheduling A5 at the outermost frame seems to be a logical thing to do at this point. Then the current processing will complete and unwind, and the loop will continue after it. When the rowop A5 gets executed, the label A will call setMark(M) again, thus setting the mark on its new frame, and making the loop run as far as it can before executing any other scheduled rowops.

Overall, pausing and then restarting a loop like this is not such a good idea. The caller of the loop normally expects that it can wait for the loop to complete, and that when the loop returns, it's all done. If a loop may decide to bail out now and continue later, the effects may be quite unexpected.

7.13. Recursion control

Historically, the recursive calls (when a label calls itself, directly or indirectly) have been forbidden in Triceps. Mind you, recursive calling could still be done even then with the help of trays and forking, and that's probably the best way from the standpoint of correctness. However, it's not the most straightforward way, and the real recursion still comes in handy once in a while.

Now the recursion is allowed in its direct way. Better yet, it doesn't have to be all-or-nothing; it can be done in a piecemeal and controlled fashion.

It's controlled per-unit. Each unit has two adjustable limits:

Maximal stack depth:

Limits the total depth of the unit's call stack. That's the maximal length of the call chain, whether it goes straight or in loops.

Maximal recursion depth:

Limits the number of times each particular label may appear on the call stack. So if you have a recursive code fragment (a simple-minded loop or a recursive streaming function), this is the limit on its recursive reentrances.

Both these limits accept the 0 and negative values to mean unlimited.

The default is as it has been before: unlimited stack depth, recursion depth of 1 (which means that each label may be called once but it may not call itself). But now you can change them with the calls:

$unit->setMaxStackDepth($n);
$unit->setMaxRecursionDepth($n);

You can change them at any time, even when the unit is running (but they will be enforced only on the next attempt to execute a rowop).

You can also read the current values:

$n = $unit->maxStackDepth();
$n = $unit->maxRecursionDepth();

Another thing about the limits: even if you set them to unlimited or to some very large values, there still are the system limits. The calls use the C++ process (or thread) stack and the Perl stack, and if you make too many of them, the stack will overflow and the whole process will crash, possibly dumping core. Keeping the call depths within reason is still a good idea.
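The semantics of the two limits can be sketched in plain Perl. This is not the Triceps implementation; ToyUnit and its method names are made up for illustration. The toy unit counts the total call depth and the per-label occurrences on the call stack, and dies (as the real unit would confess) when a limit is exceeded:

```perl
use strict;
use warnings;

# A toy "unit" that mimics the two limits; 0 or negative means unlimited.
package ToyUnit;

sub new {
    my ($class) = @_;
    return bless {
        max_stack     => 0,   # unlimited by default
        max_recursion => 1,   # each label may be on the stack only once
        stack_depth   => 0,
        on_stack      => {},  # label name => times currently on the call stack
    }, $class;
}

sub set_max_stack_depth     { $_[0]{max_stack}     = $_[1] }
sub set_max_recursion_depth { $_[0]{max_recursion} = $_[1] }

# Call a label: enforce both limits, run the handler, then unwind.
sub call {
    my ($self, $name, $handler) = @_;
    $self->{stack_depth}++;
    $self->{on_stack}{$name}++;
    eval {
        die "unit: exceeded stack depth $self->{max_stack}\n"
            if $self->{max_stack} > 0
               && $self->{stack_depth} > $self->{max_stack};
        die "unit: label '$name' exceeded recursion depth $self->{max_recursion}\n"
            if $self->{max_recursion} > 0
               && $self->{on_stack}{$name} > $self->{max_recursion};
        $handler->();
    };
    my $err = $@;
    $self->{stack_depth}--;
    $self->{on_stack}{$name}--;
    die $err if $err;
}

package main;

my $unit  = ToyUnit->new();
my $depth = 0;
my $handler; $handler = sub {
    $depth++;
    $unit->call("lab", $handler);   # the label calls itself recursively
};

# With the default recursion depth of 1, the second call dies.
eval { $unit->call("lab", $handler) };
print "died after depth $depth: $@";   # died after depth 1

# Raise the limit: now 5 nested occurrences of the label are allowed.
$unit->set_max_recursion_depth(5);
$depth = 0;
eval { $unit->call("lab", $handler) };
print "died after depth $depth: $@";   # died after depth 5
```

Note how the limit of 1 reproduces the historic behavior: the first call of a label succeeds, any recursive reentrance fails.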

Now you can do the direct recursion. However as with the procedural code, not all the labels are reentrant. Some of them may work with the static data structures that can't be modified in a nested fashion. Think for example of a table: when you modify a table, it sends rowops to its pre and out labels. You can connect the other labels there, and react to the table modifications. However these labels can't attempt to modify the same table, because the table is already in the middle of a modification, and it's not reentrant.

The table still has its own separate logic to check for non-reentrance, and no matter what the unit's general recursion depth limit is, for the table it always stays at 1. Moreover, the table enforces it across both the input label interface and the procedural interface.

If you make your own non-reentrant labels, Triceps can make this check for you. Just mark the first label of the non-reentrant sequence with

$label->setNonReentrant();

It will have its own private recursion limit of 1. Any attempt to execute it recursively will make it confess. There is no way to unset this flag: when a label is known to be non-reentrant, it can not suddenly become reentrant until its code is rewritten.

You can read this flag with

$val = $label->isNonReentrant();

Chapter 8. Memory Management

8.1. Reference cycles

Remember that the Triceps memory management uses reference counting, which does not handle reference cycles, as has been mentioned in Section 4.3: “Memory management fundamentals”. The reference cycles cause the objects to never be freed. It's no big deal if the data structures exist until the program exits anyway, but it becomes a memory leak if they keep being created and deleted dynamically.

The problems come not with the data that goes through the models but with the models themselves. The data gets reference-counted without any issues. The reference cycles can get formed only between the elements of the models: labels, tables etc. If you don't need them destroyed until the program exits (or more exactly, until the Perl interpreter instance exits), there is no problem. The leaks could happen only if the model elements get created and destroyed as the program runs, such as if you use them to parse and process the short-lived ad-hoc queries.

These leaks are pretty hard to diagnose. There are some packages, like Devel::Cycle, but they won't detect the loops that involve a reference at the C++ level. And when the Perl interpreter exits, it clears up all the variables used, even the ones involved in the loops, so if you run it under valgrind, valgrind doesn't show any leaks. There is a package Devel::LeakTrace that should be able to detect all these left-over variables. However, I can't tell for sure yet; so far I haven't had enough patience to build all the dependencies for it.

One possibility is to use weak references (with the module Scalar::Util). But the problem is that you must not forget to weaken the references manually. Too much work, too much attention, too easy to forget.
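For illustration, here is a small self-contained example of how weakening breaks a cycle at the pure-Perl level (the Node class is made up for this sketch; it just counts its destructions):

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken);

my $destroyed = 0;

package Node;
sub new     { return bless { peer => undef }, shift }
sub DESTROY { $destroyed++ }

package main;

# A two-object reference cycle: neither object can ever be freed.
{
    my $x = Node->new();
    my $y = Node->new();
    $x->{peer} = $y;
    $y->{peer} = $x;
}
print "after the cycle: $destroyed destroyed\n";   # 0: the cycle leaks

# The same structure with the back-reference weakened: the cycle is open.
{
    my $x = Node->new();
    my $y = Node->new();
    $x->{peer} = $y;
    $y->{peer} = $x;
    weaken($y->{peer});   # this link no longer counts for refcounting
}
print "after weaken: $destroyed destroyed\n";      # 2: both freed
```

The leaked pair from the first block does eventually get destroyed, but only during the interpreter's global destruction at exit, which is exactly why valgrind sees nothing.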

The mechanism used in Triceps works by breaking up the reference cycles when the data needs to be cleared. The execution unit keeps track of all its labels, and when it gets destroyed, clears them up, breaking the cycles. It's also possible to clear the labels individually, by a manual call.

The clearing of a label clears all its chainings. The chained labels get cleared too in their turn, and eventually the whole chain clears up. This removes the links in the forward direction, and if any cycles were present, they become open. More on the details of label clearing in Section 8.2: “Clearing of the labels”.

Another potential for reference cycles is between the execution unit and the labels. A unit keeps a reference to all its labels. So the labels can not keep a reference to the unit. And they don't. Internally they have a plain C++ pointer to the unit. However the Perl level may present a problem.

In many cases the labels have a Perl reference to the template object where they belong. And that object is likely to have a Perl reference to the unit. It's one more opportunity for the reference cycle. This code usually looks like this:

package MyTemplate;

sub new # ($class, $unit, $name, $rowType, ...)
{
  my $class = shift;
  my $unit = shift;
  my $name = shift;
  my $rowType = shift;
  my $self = {};

  ...

  $self->{unit} = $unit;
  $self->{inputLabel} = $unit->makeLabel($rowType, $name . ".in",
    sub { ... }, sub { ... }, $self);

  ...

  bless $self, $class;
  return $self;
}

So the unit refers to the label at the C++ level, the label has a $self reference to the Perl object that owns it, and the object's $self->{unit} refers back to the unit. Once the label clearing happens, the link from the unit disappears and the cycle unrolls. But the clearing would not happen by itself, because the unit can't get automatically dereferenced and destroyed.

Because of this, the unit provides an explicit way to trigger the clearing:

$unit->clearLabels();

If you want to get rid of an execution unit with all its components without exiting the whole program, use this call. It will start the chain reaction of destruction. Of course, don't forget to undefine all the other references in your program to these objects being destroyed.

There is also a way to trigger this chain reaction automatically. It's done with a helper object that is created as follows:

my $clearUnit = $unit->makeClearingTrigger();

When the reference to $clearUnit gets destroyed, it will call $unit->clearLabels() and trigger the destruction of the whole unit. Obviously, don't copy the $clearUnit variable, keep it in one place.

If you put it into a block variable, the unit will get destroyed on exiting the block. If you put it into a global variable in a thread, the unit will get destroyed when the thread exits (though I'm a bit hazy on the Perl memory management with threads yet; it might get all cleared by itself without any special tricks too).
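The trigger pattern itself is plain Perl and easy to see in isolation. Here is a hedged sketch with a mock unit standing in for the real one (MockUnit and ClearingTrigger are made up for this illustration; the real trigger is created by the Triceps library):

```perl
use strict;
use warnings;

# A mock execution unit that only counts the clearLabels() calls;
# makeClearingTrigger() here mimics the real call's behavior.
package MockUnit;
sub new         { return bless { cleared => 0 }, shift }
sub clearLabels { $_[0]{cleared}++ }
sub makeClearingTrigger { return ClearingTrigger->new($_[0]) }

# The guard: when its last reference goes away, it clears the unit.
package ClearingTrigger;
sub new     { my ($class, $unit) = @_; return bless { unit => $unit }, $class }
sub DESTROY { $_[0]{unit}->clearLabels() }

package main;

my $unit = MockUnit->new();
{
    my $clearUnit = $unit->makeClearingTrigger();
    # ... the model would run here ...
}   # leaving the block destroys $clearUnit, which clears the unit
print "cleared $unit->{cleared} time(s)\n";   # cleared 1 time(s)
```

This is the classic RAII-style guard: the clean-up is tied to the scope of one variable instead of being a call you might forget.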

8.2. Clearing of the labels

As a reminder, a label that executes Perl code is created with:

$label = $unit->makeLabel($rowType, "name", \&clearSub,
  \&execSub, @args);

The function clearSub deals with the destruction.

The clearing of a label drops all the references to execSub, clearSub and arguments, and clears all the chainings. And of course the chained labels get cleared too. But before anything else is done, clearSub gets a chance to execute and clear any application-level data. It gets as its arguments all the arguments from the label constructor, same as execSub:

clearSub($label, @args)

A typical case is to keep the state of a stateful element in a hash:

package MyTemplate;

sub new # ($class, $unit, $name, $rowType, ...)
{
  my $class = shift;
  my $unit = shift;
  my $name = shift;
  my $rowType = shift;
  my $self = {};

  ...

  $self->{unit} = $unit;
  $self->{inputLabel} = $unit->makeLabel($rowType, $name . ".in",
    \&clear, \&handle, $self);

  ...

  bless $self, $class;
  return $self;
}

These elements may end up pointing to other elements. It's fairly common to keep pointers to the other elements (especially tables) that provide inputs to this one. In general, these references up should be safe, because the clearing of the labels destroys the references down and opens the cycles. But the way things get connected in the heat of the moment, you never know. It's better to be safe than sorry: the clearing function can wipe out the whole state of the element by undefining its hash:

sub clear # ($label, $self)
{
  my ($label, $self) = @_;
  undef %$self;
}

The whole contents of the hash become lost, and all the references from it disappear. And if you use this approach in every object, complete destruction reigns and everything is nicely laid to waste.

Writing these clearing functions for each class quickly becomes tedious and easy to forget. Triceps is a step ahead: it provides a ready function, Triceps::clearArgs(), that does all this destruction. It can undefine the contents of various things passed as its arguments, and then also undefines the arguments themselves. Just reuse it:

$self->{inputLabel} = $unit->makeLabel($rowType, $name . ".in",
  \&Triceps::clearArgs, \&handle, $self);

But that's not all. Triceps is actually two steps ahead: if the clearSub is specified as undef, Triceps automatically treats it as Triceps::clearArgs(). The last snippet and the following one are equivalent:

$self->{inputLabel} = $unit->makeLabel($rowType, $name . ".in",
  undef, \&handle, $self);

No need to think, the default will do the right thing for you. Of course, if for some reason you don't want this destruction to happen, you'd have to override the default with an empty function, sub {}.
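To illustrate the clearing semantics, here is a plain-Perl sketch, not the actual Triceps implementation, of what a clearArgs-style function conceptually does. The function name my_clear_args() is made up for this illustration; the real Triceps::clearArgs() additionally undefines the arguments themselves.

```perl
use strict;
use warnings;

# A sketch of a clearArgs-style clearing function: undefine the
# contents of every hash or array reference passed to it.
sub my_clear_args {
  for my $a (@_) {
    if (ref $a eq "HASH") {
      undef %$a;
    } elsif (ref $a eq "ARRAY") {
      undef @$a;
    }
  }
}

my $self = { unit => "u1", inputLabel => "lb1" };
my_clear_args("some_label", $self); # the label argument has nothing to clear
print scalar(keys %$self), "\n"; # prints 0: the state is wiped out
```

Once the hash is emptied, the references it held to the other elements disappear, and the reference cycles open up.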

8.3. The clearing labels

Some templates don't have their own input labels; instead they just combine and tie together a few internal objects, and use the input labels of some of these internal objects as their inputs. Among the templates included with Triceps, JoinTwo is one of them: it just combines two LookupJoins. Without an input label of its own, there would be no clearing, and the template object would never get undefined.

This can be solved by creating an artificial label that is not connected anywhere and has no code to execute. Its only purpose in life would be to clear the object when told so. To make life easier, rather than abusing makeLabel(), there is a way to create the special clearing-only labels:

$lb = $unit->makeClearingLabel("name", @args);

The arguments would be the references to the objects that need clearing, usually $self. For a concrete usage example, here is how JoinTwo uses it:

$self->{clearingLabel} = $self->{unit}->makeClearingLabel(
  $self->{name} . ".clear", $self);

This call should never fail, so on any error it confesses; there is no need to check the result. The result can be saved in a variable or simply ignored. If you throw away the result, you won't be able to access that label from the Perl code, but it won't be lost: it will still be referenced from the unit, until the unit gets cleared.

Note how the clearing label doesn't have a row type. In reality every label does have a row type, it would just be silly to pick random row types for creating the clearing-only labels. Because of this, the clearing labels are created with a special empty row type that has no fields in it. If you ever want to use this row type for any other purposes, you can get it with the method

$rt = $unit->getEmptyRowType();

Under the hood, the clearing label is the same as a normal label with Perl code, only with the special default values used for its construction. The normal Perl label methods would work on it like on a normal label.

Chapter 9. Tables

9.1. Hello, tables!

The tables are the fundamental elements of state-keeping in Triceps. Let's start with a basic example:

my $hwunit = Triceps::Unit->new("hwunit");
my $rtCount = Triceps::RowType->new(
  address => "string",
  count => "int32",
);

my $ttCount = Triceps::TableType->new($rtCount)
  ->addSubIndex("byAddress",
    Triceps::IndexType->newHashed(key => [ "address" ])
  )
;
$ttCount->initialize();

my $tCount = $hwunit->makeTable($ttCount, "tCount");

while(<STDIN>) {
  chomp;
  my @data = split(/\W+/);

  # the common part: find if there already is a count for this address
  my $rhFound = $tCount->findBy(
    address => $data[1]
  );
  my $cnt = 0;
  if (!$rhFound->isNull()) {
    $cnt = $rhFound->getRow()->get("count");
  }

  if ($data[0] =~ /^hello$/i) {
    my $new = $rtCount->makeRowHash(
      address => $data[1],
      count => $cnt+1,
    );
    $tCount->insert($new);
  } elsif ($data[0] =~ /^count$/i) {
    print("Received '", $data[1], "' ", $cnt + 0, " times\n");
  } else {
    print("Unknown command '$data[0]'\n");
  }
}

What happens here? The main loop reads the lines from the standard input, splits them into words, and uses the first word as a command and the second word as a key. Note that it's not the CSV format: it's words, with the non-alphanumeric characters separating them. Hello, table!, hello world, count world are examples of the valid inputs. For something different, the commands are compared with their case ignored (but the case matters for the key).

The example counts how many times each key has been hello-ed, and prints this count back on the command count. Here is a sample run, with the input lines printed in bold:

Hello, table!
Hello, world!
Hello, table!
count world
Received 'world' 1 times
Count table
Received 'table' 2 times

In this example the table is read and modified using the direct procedural calls. As you can see, there isn't even any need for unit scheduling and such. There is a scheduler-based interface to the tables too, and it will be shown soon, but in many cases the direct access is easier. Indeed, this particular example could have been implemented with the plain Perl hashes, and nothing would be wrong with that either. The Triceps tables provide many more interesting ways of indexing the data, but if you don't need them, they don't matter. And at some future point the tables will support the on-disk persistence, but there is no reason to bother much about that now: things are likely to change a dozen times yet before that happens. Feel free to just use the Perl data structures if they make the code easier.
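For comparison, here is a sketch of the same counting logic done with a plain Perl hash in place of the Triceps table. The helper process_line() is made up for this illustration; it returns the text to print instead of printing directly, to keep it easy to check.

```perl
use strict;
use warnings;

# %count plays the role of the table, with the address as the key.
my %count;

sub process_line {
  my ($line) = @_;
  my @data = split(/\W+/, $line);
  if ($data[0] =~ /^hello$/i) {
    $count{$data[1]}++; # same as inserting a row with the new count
    return "";
  } elsif ($data[0] =~ /^count$/i) {
    return "Received '$data[1]' " . ($count{$data[1]} // 0) . " times\n";
  }
  return "Unknown command '$data[0]'\n";
}

process_line("Hello, table!");
process_line("Hello, world!");
process_line("Hello, table!");
print process_line("count world"); # Received 'world' 1 times
print process_line("Count table"); # Received 'table' 2 times
```

The hash lookup replaces findBy(), and the increment replaces the makeRowHash()/insert() pair; what the hash can't give you is the richer indexing and the rowop propagation described below.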

A table is created from a table type. This allows you to stamp out multiple tables of the same type, which will come in handy when the multithreading gets added. A table is local to a thread; a table type can be shared between threads. To look up something in another thread's table, you'd either have to ask it through a request-reply protocol or keep a local copy of the table. Such a copy can be easily made by creating a copy table from the same type.

In reality, right now all the business with table types separated from the tables is more pain than gain. It not only adds extra steps but also makes it difficult to define a template that acts on a table by defining extra features on it. Something will be done about it; I have a few ideas.

The table type gets first created and configured, and then initialized. After a table type is initialized, it can not be changed any more. That's the point of the initialization call: to tell the type that all the configuration has been done, and it can go immutable now. Fundamentally, configuring a table type just makes it collect bits and pieces; nothing but the most gross errors can be detected at that point. At initialization time everything comes together and gets checked for consistency. A table type must be fully initialized in one thread before it can be shared with the other threads. The historic reason for this API is that it mirrors the C++ API, which has turned out not to look that good in Perl. It's another candidate for a change.

A table type gets the row type and at least one index. Here it's a hashed index by the key field address. "Hashed" means that you can look up the rows by the key value but there are no promises about any specific row order. And the hashing is used to make the key comparisons more efficient. The key of a hashed index may consist of multiple fields.

The table is then created from the table type, and given a name.

The rows can then be inserted into the table (and removed too, not shown in this example yet). The default behavior of the hashed index is to replace the old row if a new row with the same key is inserted.

The search in the table is done by the method findBy() with the key fields of the index, which returns a RowHandle object. A RowHandle is essentially an iterator in the table. Even if the row is not found, a RowHandle will still be returned, but it will be NULL, which is checked for by $rh->isNull().

No matter which command is used, it's always useful to look up the previous row for the key: its contents will either be printed or provide the previous value for the increase. So the model does the lookup first and gets the count from the found row. If the row is not found, the count is set to 0.

Then it looks at the command and does what it's been told. Updating the count amounts to creating a new row with the new values and inserting it into the table. It replaces the previous one.

This is just the tip of the iceberg. The tables in Triceps have a lot more features.

9.2. Tables and labels

A table does not have to be operated in a procedural way. It can be plugged into the scheduler machinery. Whenever a table is created, three labels are created with it.

  • The input label is for sending the modification rowops to the table. The table provides the handler for it that applies the incoming rowops to the table.
  • The output label propagates the modifications done to the table. It is a dummy label that does nothing by itself; it's there for chaining the other labels to it. The output rowops come in quite handy to propagate the table's modifications to the rest of the model.
  • The pre-modification label is also a dummy label, for chaining the other labels to it. It sends the rowops right before they are applied to the table. This comes in very handy for the elements that need to act depending on the previous state of the table, such as the joins. The pre-modification label doesn't simply mirror the input label: the rows received on the input label may trigger the automatic changes to the table, such as an old row being deleted when a new row with the same key is inserted. All these modifications, be they automatic or explicit, will be reported to the pre-modification label. Since the pre-modification label is used relatively rarely, it contains a special optimization: if no label is chained to it, no rowop will be sent to it in the first place. Don't be surprised if you enable the tracing and don't see it in the trace.

Again, the rowops coming through these labels aren't necessarily the same. If a DELETE rowop comes to the input label, referring to a row that is not in the table, it will not propagate anywhere. If an INSERT rowop comes in and causes another row to be replaced, the replaced row will be sent to the pre-modification and output labels as a DELETE rowop first.

And of course the table may be modified through the procedural interface. These modifications also produce the rowops on the pre-modification and output labels.

The labels of the table have names, produced by adding the suffixes to the table name: "tablename.in", "tablename.pre" and "tablename.out".

In the no-bundling spirit, a rowop is sent to the pre-modification label right before it's applied to the table, and to the output label right after it's applied. If the labels executed from there need to read the table, they can, and will find the table in the exact state with no intervening modifications. However, they can modify the table neither directly nor by calling its input label. When these labels are called, the table is in the middle of a modification and it can't accept another one. Such attempts are treated as forbidden recursive modifications, and the program will die on them. If you need to modify the table, use schedule() or loopAt() to have the next modification done later. However, there are no guarantees about the other modifications getting done in between. When the looped rowop executes, it might need to check the state of the table again and decide whether its operation still makes sense.

So, let's make a version of the Hello, table example that passes the modification requests as rowops through the labels. It will print the information about the updates to the table as they happen, so there is no more use in having a separate command for that. But for another demonstration, let's add a command that clears the counter of hellos. Here is its code:

my $hwunit = Triceps::Unit->new("hwunit");
my $rtCount = Triceps::RowType->new(
  address => "string",
  count => "int32",
);

my $ttCount = Triceps::TableType->new($rtCount)
  ->addSubIndex("byAddress",
    Triceps::IndexType->newHashed(key => [ "address" ])
  )
;
$ttCount->initialize();

my $tCount = $hwunit->makeTable($ttCount, "tCount");

my $lbPrintCount = $hwunit->makeLabel($tCount->getRowType(),
  "lbPrintCount", undef, sub { # (label, rowop)
    my ($label, $rowop) = @_;
    my $row = $rowop->getRow();
    print(&Triceps::opcodeString($rowop->getOpcode), " '",
      $row->get("address"), "', count ", $row->get("count"), "\n");
  } );
$tCount->getOutputLabel()->chain($lbPrintCount);

# the updates will be sent here, for the tables to process
my $lbTableInput = $tCount->getInputLabel();

while(<STDIN>) {
  chomp;
  my @data = split(/\W+/);

  # the common part: find if there already is a count for this address
  my $rhFound = $tCount->findBy(
    address => $data[1]
  );
  my $cnt = 0;
  if (!$rhFound->isNull()) {
    $cnt = $rhFound->getRow()->get("count");
  }

  if ($data[0] =~ /^hello$/i) {
    $hwunit->makeHashSchedule($lbTableInput, "OP_INSERT",
      address => $data[1],
      count => $cnt+1,
    );
  } elsif ($data[0] =~ /^clear$/i) {
    $hwunit->makeHashSchedule($lbTableInput, "OP_DELETE",
      address => $data[1]
    );
  } else {
    print("Unknown command '$data[0]'\n");
  }
  $hwunit->drainFrame();
}

The table creation is the same as last time. The row finding in the table is also the same.

The printing of the modifications to the table is done with $lbPrintCount, which is connected to the table's output label. It prints the opcode, the address of the greeting, and the count of greetings. It will show us what is happening to the table as soon as it happens. A unit trace could be used instead, but a custom printout contains less noise. The pre-modification label is of no interest here, so it's not used.

The references to the labels of a table are obtained with:

$label = $table->getInputLabel();
$label = $table->getPreLabel();
$label = $table->getOutputLabel();

The deletion does not require an exact row to be sent in. All it needs is a row with the key fields for the deletion; the rest of the fields in it are ignored. So the clear command fills in only the key field.

Here is an example of input (in bold) and output:

Hello, table!
OP_INSERT 'table', count 1
Hello, world!
OP_INSERT 'world', count 1
Hello, table!
OP_DELETE 'table', count 1
OP_INSERT 'table', count 2
clear, table
OP_DELETE 'table', count 2
Hello, table!
OP_INSERT 'table', count 1

An interesting thing happens after the second Hello, table!: the code sends only an OP_INSERT, but the output shows an OP_DELETE and an OP_INSERT. The OP_DELETE for the old row gets automatically generated when a row with a repeated key is inserted. Now, depending on what you want, sending the consecutive inserts of the rows with the same keys and relying on the table's internal consistency to turn them into updates might or might not be a good thing. Overall it's a dirty way to write, but sometimes it comes in convenient. The clean way is to send the explicit deletes first. Either way, when the data goes through the table, it gets automatically cleaned. The subscribers to the table's output and pre-modification labels get the clean and consistent picture: a row never gets simply replaced, they always see an OP_DELETE first and only then an OP_INSERT.

9.3. Basic iteration through the table

Let's add a dump of the table contents to the "Hello, table" example, either version of it. For that, the code needs to go through every record in the table:

  elsif ($data[0] =~ /^dump$/i) {
    for (my $rhi = $tCount->begin(); !$rhi->isNull(); $rhi = $rhi->next()) {
      print($rhi->getRow->printP(), "\n");
    }
  }

As you can see, the row handle works kind of like an STL iterator, only the end of the iteration is detected by receiving a NULL row handle. Calling next() on a NULL row handle is OK, but it would just return another NULL handle. And there is no decrementing of the iterator: you can only go forward with next(). The backwards iteration is in the plans but not implemented yet.

An example of this fragment's output would be:

Hello, table!
Hello, world!
Hello, table!
count world
Received 'world' 1 times
Count table
Received 'table' 2 times
dump
address="world" count="1"
address="table" count="2"

The order of the rows in the printout is the same as the order of the rows in the table's index, which is no particular order, since it's a hashed index. As long as you stay on the same 64-bit AMD64 architecture (with the LSB-first byte order), it will stay the same on the consecutive runs. But switching to a 32-bit machine or to an MSB-first byte order (such as a SPARC, if you can still find one) will change the hash calculation, and with it the resulting row order. There are ordered indexes as well; they will be described later.

9.4. Deleting a row

Deleting a row from a table through the input label is simple: send a rowop with OP_DELETE, it will find the row with the matching key and delete it, as was shown above. In the procedural way the same can be done with the method deleteRow(). The added row deletion code for the main loop of Hello, table (either version, but particularly relevant for the one from Section 9.1: “Hello, tables!” ) is:

  elsif ($data[0] =~ /^delete$/i) {
    my $res = $tCount->deleteRow($rtCount->makeRowHash(
      address => $data[1],
    ));
    print("Address '", $data[1], "' is not found\n") unless $res;
  }

The result allows you to differentiate between the situations when the row was found and deleted, and when the row was not found. On any error the call confesses.

However, the example already finds the row handle in advance, in $rhFound. For this case a more efficient form is available; it can be added to the example as:

    elsif ($data[0] =~ /^remove$/i) {
      if (!$rhFound->isNull()) {
        $tCount->remove($rhFound);
      } else {
        print("Address '", $data[1], "' is not found\n");
      }
    }

It removes a specific row handle from the table. In whichever way you find it, you can remove it. An attempt to remove a NULL handle would be an error and cause a confession.

The reason why remove() is more efficient than deleteRow() is that deleteRow() amounts to finding the row handle by key and then removing it. And the OP_DELETE rowop sent to the input label calls deleteRow().

deleteRow() never deletes more than one row, even if multiple rows match (yes, the indexes don't have to be unique). There isn't any method to delete multiple rows at once; every row has to be deleted by itself. As an example, here is the implementation of the command clear for Hello, table that clears all the table contents by iterating through it:

  elsif ($data[0] =~ /^clear$/i) {
    my $rhi = $tCount->begin();
    while (!$rhi->isNull()) {
      my $rhnext = $rhi->next();
      $tCount->remove($rhi);
      $rhi = $rhnext;
    }
  }

After a handle is removed from the table, it continues to exist, as long as there are references to it. It could even be inserted back into the table. However, until (and unless) it's inserted back, it can not be used for the iteration any more. Calling next() on a handle that is not in the table would just return a NULL handle. So the next row has to be found before removing the current one.

9.5. A closer look at the RowHandles

A few uses of the RowHandles have been shown by now. So, what is a RowHandle? As Captain Obvious would say, RowHandle is a class (or package, in Perl terms) implementing a row handle.

A row handle keeps a table's service information (including the index data) for a single data row, including of course a reference to the row itself. Each row is stored in the table through its handle. The row handle is also an iterator in the table, and a special one: it's an iterator for all the table's indexes at once. For you SQLy people, an iterator is essentially a cursor on an index. For you Java people, an iterator can be used to do more than step sequentially through the rows. So far only the table types with one index have been shown, but in reality multiple indexes are supported, potentially with quite complicated arrangements. More on the indexes later; for now just keep it in mind. A row handle can be found through one index and then used to iterate through another one. Or you can iterate through one index, find a certain row handle, and continue iterating through another index starting from that handle. If you remember a reference to a particular row handle, you can always continue the iteration from that point later (unless the row handle gets removed from the table).

A RowHandle always belongs to a particular table. The RowHandles can not be shared or moved between two tables, even if the tables are of the same type. Since the tables are single-threaded, obviously the RowHandles may not be shared between the threads either.

However a RowHandle may exist without being inserted into a table. In this case it still has a spiritual connection to that table but is not included in the index (the iteration attempts with it would just return end of the index), and will be destroyed as soon as all the references to it disappear.

The insertion of a row into a table actually happens in two steps:

  1. A RowHandle is created for a row.
  2. This new handle is inserted into the table.

This is done with the following code:

$rh = $table->makeRowHandle($row);
$table->insert($rh);

Only it just so happens that, to make life easier, the method insert() has been made to accept either a row handle or directly a row. If it finds a row, it makes a handle for it behind the curtains and then proceeds with the insertion of that handle. Passing a row directly is also more efficient (unless you already have a handle created for it for some other reason), because then the row handle creation happens entirely in the C++ code, without surfacing into Perl.

A handle can be created for any row of a type matching the table's row type. For a while it was accepting only equal types but that was not consistent with what the labels are doing, so I've changed it.

The method insert() has a return value. It's often ignored but occasionally comes in handy: 1 means that the row has been inserted successfully, and 0 means that the row has been rejected. On the errors it confesses. An attempt to insert a NULL handle or a handle that is already in the table will cause a rejection, not an error. Also the table's index may reject a row with a duplicate key (though right now this option is not implemented, and the hashed index silently replaces the old row with the new one).

There is a method to find out if a row handle is in the table or not:

$result = $rh->isInTable();

Though it's used mostly for debugging, when some strange things start going on.

The searching for the rows in the table by key has been previously shown with the method findBy(), which happens to be a wrapper over the more general method find(): it constructs a row from its argument fields and then calls find() with that row as a sample of the data to find. The method find() is similar to insert() in the handling of its arguments: the proper way is to give it a row handle argument, but the more efficient way is to give it a row argument, and it will create the handle for it as needed before performing a search.

Now you might wonder: huh, find() takes a row handle and returns a row handle? What's the point? Why not just use the first row handle? Well, those are different handles:

  • The argument handle is normally not in the table. It's created brand new from a row that contains the keys that you want to find, just for the purpose of searching.
  • The returned handle is always in the table (of course, unless it's NULL). It can be further used to extract back the row data, and/or for iteration.

Though nothing really prevents you from searching for a handle that is already in the table. You'll just get back the same handle, after gratuitously spending some CPU time. (There are exceptions to this, with the more complex indexes that will be described later).

Why do you need to create a new row handle just for the search? Due to the internal mechanics of the implementation. A handle stores the helper information for the index. For example, the hashed index calculates the hash value of all the row's key fields once and stores it in the row handle. Despite it being called a hash index, it really stores the data in a tree, with the hash value used to speed up the comparisons for the tree order. It's much easier to make both insert() and find() work with the hash value and row reference stored in the same way in a handle than to implement them differently. Because of this, find() uses the exact same row handle argument format as insert().
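The idea of precomputing the key hash in the handle can be sketched in plain Perl. This is a toy illustration, not the Triceps internals: the function names make_handle() and handles_match() are made up, and the checksum stands in for the real hash function.

```perl
use strict;
use warnings;

# Wrap a row into a handle-like structure: the key string and its hash
# are computed once, so later comparisons can check the cheap hash
# value first and the full key second.
sub make_handle {
  my ($row, @keyFields) = @_;
  my $key = join("\0", map { $row->{$_} } @keyFields);
  my $hash = unpack("%32C*", $key); # a toy stand-in for the real hash
  return { row => $row, key => $key, hash => $hash };
}

sub handles_match {
  my ($a, $b) = @_;
  return $a->{hash} == $b->{hash} && $a->{key} eq $b->{key};
}

# the handle stored in the "table" carries the full row
my $stored = make_handle({ address => "world", count => 1 }, "address");
# the search handle is made from a row that contains only the key
my $sample = make_handle({ address => "world" }, "address");
print(handles_match($stored, $sample) ? "found\n" : "not found\n");
```

Since both the stored and the search handles carry the key and hash in the same format, one comparison routine serves both the insertion and the search, which is the point of find() taking the same argument format as insert().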

Can you create multiple row handles referring to the same row? Sure, knock yourself out. From the table's perspective it's the same thing as multiple row handles for multiple copies of the row with the same values in them, only using less memory.

There is more to the row handles than has been touched upon yet. It will all be revealed when more of the table features are described. The internal structure of the row handles will be described in Section 9.10: “The index tree” .

9.6. A window is a FIFO

A fairly typical situation in the CEP world is when a model needs to keep a limited history of events. For a simple example, let's discuss how to remember the last two trades per stock symbol. The size of two has been chosen to keep the sample input and output small.

This is normally called a window logic, with a sliding window. You can think of it in a mechanical analogy: as the trades become available, they get printed on a long tape. However the tape is covered with a masking plate. The plate has a window cut in it that lets you see only the last two trades.

Some CEP systems have the special data structures that implement this logic, called windows. Triceps instead has a feature on the tables that makes a table work as a window. It's not unique in this department: for example, Coral8 does the opposite and calls everything a window, even if some windows are really tables in every regard but name.

Here is a Triceps example of keeping the window for the last two trades and iteration over it:

our $uTrades = Triceps::Unit->new("uTrades");
our $rtTrade = Triceps::RowType->new(
  id => "int32", # trade unique id
  symbol => "string", # symbol traded
  price => "float64",
  size => "float64", # number of shares traded
);

our $ttWindow = Triceps::TableType->new($rtTrade)
  ->addSubIndex("bySymbol",
    Triceps::IndexType->newHashed(key => [ "symbol" ])
      ->addSubIndex("last2",
        Triceps::IndexType->newFifo(limit => 2)
      )
  )
;
$ttWindow->initialize();
our $tWindow = $uTrades->makeTable($ttWindow, "tWindow");

# remember the index type by symbol, for searching on it
our $itSymbol = $ttWindow->findSubIndex("bySymbol");
# remember the FIFO index, for finding the start of the group
our $itLast2 = $itSymbol->findSubIndex("last2");

# print out the changes to the table as they happen
our $lbWindowPrint = $uTrades->makeLabel($rtTrade, "lbWindowPrint",
  undef, sub { # (label, rowop)
    print($_[1]->printP(), "\n"); # print the change
  });
$tWindow->getOutputLabel()->chain($lbWindowPrint);

while(<STDIN>) {
  chomp;
  my $rTrade = $rtTrade->makeRowArray(split(/,/));
  my $rhTrade = $tWindow->makeRowHandle($rTrade);
  $tWindow->insert($rhTrade);
  # There are two ways to find the first record for this
  # symbol. Use one way for the symbol AAA and the other for the rest.
  my $rhFirst;
  if ($rTrade->get("symbol") eq "AAA") {
    $rhFirst = $tWindow->findIdx($itSymbol, $rTrade);
  } else  {
    # $rhTrade is now in the table but it's the last record
    $rhFirst = $rhTrade->firstOfGroupIdx($itLast2);
  }
  my $rhEnd = $rhFirst->nextGroupIdx($itLast2);
  print("New contents:\n");
  for (my $rhi = $rhFirst;
      !$rhi->same($rhEnd); $rhi = $rhi->nextIdx($itLast2)) {
    print("  ", $rhi->getRow()->printP(), "\n");
  }
}

This example reads the trade records in CSV format, inserts them into the table, and then prints the actual modifications reported by the table and the new state of the window for this symbol. And here is a sample log, with the input lines in bold:

1,AAA,10,10
tWindow.out OP_INSERT id="1" symbol="AAA" price="10" size="10"
New contents:
  id="1" symbol="AAA" price="10" size="10"
2,BBB,100,100
tWindow.out OP_INSERT id="2" symbol="BBB" price="100" size="100"
New contents:
  id="2" symbol="BBB" price="100" size="100"
3,AAA,20,20
tWindow.out OP_INSERT id="3" symbol="AAA" price="20" size="20"
New contents:
  id="1" symbol="AAA" price="10" size="10"
  id="3" symbol="AAA" price="20" size="20"
4,BBB,200,200
tWindow.out OP_INSERT id="4" symbol="BBB" price="200" size="200"
New contents:
  id="2" symbol="BBB" price="100" size="100"
  id="4" symbol="BBB" price="200" size="200"
5,AAA,30,30
tWindow.out OP_DELETE id="1" symbol="AAA" price="10" size="10"
tWindow.out OP_INSERT id="5" symbol="AAA" price="30" size="30"
New contents:
  id="3" symbol="AAA" price="20" size="20"
  id="5" symbol="AAA" price="30" size="30"
6,BBB,300,300
tWindow.out OP_DELETE id="2" symbol="BBB" price="100" size="100"
tWindow.out OP_INSERT id="6" symbol="BBB" price="300" size="300"
New contents:
  id="4" symbol="BBB" price="200" size="200"
  id="6" symbol="BBB" price="300" size="300"

You can see that the window logic works: at no time is there more than two rows in each group. As more rows are inserted, the oldest rows get deleted.

Now let's dig into the code. The first thing to notice is that the table type has two indexes (strictly speaking, index types, but most of the time they can be called indexes without creating confusion) in it. Unlike your typical database, the indexes in this example are nested.

TableType
+-IndexType Hash "bySymbol"
  +-IndexType Fifo "last2"

If you follow the nesting, you can see that the first call of addSubIndex() adds an index type to the table type, while the textually second addSubIndex() adds an index type to the previous index type.

The same can also be written out in multiple separate calls, with the intermediate results stored in the variables:

$itLast2 = Triceps::IndexType->newFifo(limit => 2);
$itSymbol = Triceps::IndexType->newHashed(key => [ "symbol" ]);
$itSymbol->addSubIndex("last2", $itLast2);
$ttWindow = Triceps::TableType->new($rtTrade);
$ttWindow->addSubIndex("bySymbol", $itSymbol);

I'm not perfectly happy with the way the table types are constructed with the index types right now, since the parenthesis levels have turned out a bit hard to track. This is another example of following the C++ API in Perl that didn't work out too well, and it will change in the future. But for now please bear with it.

The index nesting is kind of intuitively clear, but the details may take some time to wrap your head around. You can think of it as the inner index type creating the miniature tables that hold the rows, and the outer index holding not the individual rows but those miniature tables. So, to find the rows in the table you go through two levels of indexes: first through the outer index, and then through the inner one. The table takes care of these details and makes them transparent, unless you want to stop your search at an intermediate level: for example, to find all the transactions with a given symbol, you need to do a search in the outer index, and then from that point iterate through all the rows in the found inner index. For this you obviously have to tell the table where you want to stop in the search.
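The miniature-tables picture can be sketched with plain Perl data structures. This is only a conceptual sketch, not how Triceps stores the data: the outer hashed index becomes a hash, and each inner FIFO index becomes an array kept in the insertion order.

```perl
use strict;
use warnings;

# outer index: symbol => reference to a "miniature table" (an array of
# rows in the FIFO order)
my %bySymbol;

for my $row ({ symbol => "AAA", id => 1 },
             { symbol => "BBB", id => 2 },
             { symbol => "AAA", id => 3 }) {
  # storing a row goes through both levels: find the inner mini-table
  # by the outer key, then append to it in the FIFO order
  push @{ $bySymbol{ $row->{symbol} } }, $row;
}

# finding all the rows for one symbol: one look-up in the outer index,
# then iterate through the whole inner index
my @ids = map { $_->{id} } @{ $bySymbol{"AAA"} };
print "@ids\n"; # prints: 1 3
```

The "stop at an intermediate level" operation corresponds to the single outer look-up here: it yields the whole inner group, through which you then iterate.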

The outer index is the hash index that we've seen before, the inner index is a FIFO index. A FIFO index doesn't have any key, it just keeps the rows in the order they were inserted. You can search in a FIFO index but most of the time it's not the best idea: since it has no keys, it searches linearly through all its rows until it finds an exact match (or runs out of rows). It's a reasonable last-resort way but it's not fast and in many cases not what you want. This also sends a few ripples through the row deletion. Remember that the method deleteRow() and sending the OP_DELETE to the table's input label invoke find(), which would cause the linear search on the FIFO indexes. So when you use a FIFO index, it's usually better to find the row handle you want to delete in some other way and then call remove() on it, or use another approach that will be shown later. Or just keep inserting the rows and never delete them, like this example does.

A FIFO index may contain multiple copies of an exact same row. It doesn't care, it just keeps whatever rows were given to it in whatever order they were given.

By default a FIFO index just keeps whatever rows come to it. However it may have a few options. Setting the option limit caps the number of rows stored in the index (not in the whole table but in each of those miniature tables). When you try to insert one row too many, the oldest row gets thrown out, and the limit stays unbroken. That's what creates the window behavior: keep the most recent N rows.
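The windowing effect of the limit option can be illustrated in plain Perl, with a hash of arrays standing in for the per-key miniature tables (this is only an analogy, not how Triceps implements it):

```perl
use strict;
use warnings;

# A hash of arrays stands in for a hashed index with a nested
# FIFO index of limit => 2: the last 2 rows are kept per symbol.
my %window; # symbol => rows, oldest first
my $limit = 2;

sub insert_row {
    my ($row) = @_;
    my $group = $window{ $row->{symbol} } //= [];
    push @$group, $row;
    shift @$group if @$group > $limit; # the oldest row gets thrown out
}

insert_row({ id => $_->[0], symbol => $_->[1], price => $_->[2] })
    for ([1, "AAA", 10], [3, "AAA", 20], [5, "AAA", 30]);

# Only the two most recent AAA rows survive.
print join(",", map { $_->{id} } @{ $window{AAA} }), "\n"; # prints "3,5"
```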

If you look at the sample output, you can see that inserting the rows with ids 1-4 generates only the insert events on the table. But the rows 5 and 6 start overflowing their FIFO indexes, and cause the oldest row to be automatically deleted before completing the insert of the new one.

A FIFO index doesn't have to be nested inside a hash index. If you put a FIFO index at the top level, it will control the whole table. Then it would keep not the last two records per key but the last two records inserted into the whole table.

Continuing with the example, the table gets created, and then the index types get extracted back from the table type. Now, why not just write out the table type creation with intermediate variables as shown above and remember the index references? At some point in the past this actually would have worked but not any more. It has to do with the way the table type and its index types are connected. It's occasionally convenient to create one index type and then reuse it in multiple table types. However for the whole thing to work, the index type must be tied to its particular table type. This tying together happens when the table type is initialized. If you put the same index type into two table types, then when the first table type is initialized, the index type will get tied to it. The second table type would then fail to initialize because an index in it is already tied elsewhere. To get around this dilemma, now when you call addSubIndex(), it doesn't connect the original index type, instead it makes a copy of it. That copy then gets tied with the table type and later gets returned back with findSubIndex().

The table methods that take an index type argument absolutely require that the index type be tied to that table's type. If you try to pass a seemingly identical index type that has not been tied, or that has been tied to a different table type, that is an error.

One last note on this subject: there is no interdependency between the methods makeTable() and findSubIndex(); they can be called in either order.

The example output comes from two sources. The running updates on the table's modifications (the lines with OP_INSERT and OP_DELETE) are printed from the label $lbWindowPrint. The new window contents are printed from the main loop.

The main loop reads the trade records in the simple CSV format without the opcode, and for simplicity inserts them directly into the table with the procedural API, bypassing the scheduler. After a row is inserted, the contents of its index group (that miniature table) get printed. The insertion could as well have been done by passing the row reference directly, without explicitly creating a handle. But that handle will be used to demonstrate an interesting point.

To print the contents of an index group, we need to find its boundaries. In Triceps these boundaries are expressed as the first row handle of the group, and as the row handle right after the group. There is an internal logic to that, and it will be explained later, but for now just take it on faith.
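The half-open convention (the first row of the group plus the first row after the group) can be pictured with plain array indices; this is purely an analogy, not the Triceps internals:

```perl
use strict;
use warnings;

# All rows of a table laid out in index order; the AAA group
# occupies a contiguous stretch of them.
my @all = qw(AAA:1 AAA:3 BBB:2 BBB:4);
my ($first, $end) = (0, 2); # first row of the group, first row after it

# Iterating exactly over the group, in the half-open style.
my @group;
for (my $i = $first; $i != $end; $i++) {
    push @group, $all[$i];
}
print "@group\n"; # prints "AAA:1 AAA:3"
```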

With the information we have, there are two ways to find the first row of the group:

  • With the table's method findIdx(). It's very much like find(), only it has an extra argument of a specific index type. If the index type given has no further nesting in it, findIdx() works exactly like find(). In fact, find() is exactly such a special case of findIdx() with an automatically chosen index type. If you use an index type with further nesting under it, findIdx() will return the handle of the first row in the group under it (or the usual NULL row handle if not found).
  • If we create the row handle explicitly before inserting it into the table, as was done in the example, that will be the exact row handle inserted into the table. Not a copy or anything but this particular row handle. After a row handle gets inserted into the table, it knows its position in the indexes. It knows which group it is in. And we still have a reference to it. So then we can use this knowledge to navigate within the group, jumping to the first row handle in the group with firstOfGroupIdx(). It also takes an index type but in this case it's the type that controls the group, the FIFO index in our case.

The example shows both ways. As a demonstration, it uses the first way if the symbol is AAA and the second way for all the other symbols.

The end boundary is found by calling nextGroupIdx() on the first row's handle. The handle of the newly inserted row could have also been used for nextGroupIdx(), or any other handle in the group. For any handle belonging to the same group, the result is exactly the same.

And finally, after the iteration boundaries have been found, the iteration on the group can run. The end condition comparison is done with same(), to compare the row handle references and not just their Perl-level wrappers. The stepping is done with nextIdx(), which is exactly like next() but goes according to a particular index, the FIFO one. This has actually been done purely to show off this method. In this particular case the result produced by next(), nextIdx() on the FIFO index type and nextIdx() on the outer hash index type is exactly the same. We'll get to the reasons for that later.

Looking forward, as you iterate through the group, you could do some manual aggregation along the way. For example, find the average price of the last two trades, and then do something useful with it.

There is also a piece of information that you can find without iteration: the size of the group.

$size = $table->groupSizeIdx($idxType, $row_or_rh);

This information is important for the joins, and iterating every time through the group is inefficient if all you want to get is the group size. Since when you need this data you usually have the row and not the row handle, this operation accepts either and implicitly performs a findIdx() on the row to find the row handle. Moreover, even if it receives a row handle that is not in the table, it will automatically perform a findIdx() on it as well (though calling it with a row handle that is in the table is more efficient because the group would not need to be looked up first).

If there is no such group in the table, the result will be 0.
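In plain-Perl terms the semantics amount to taking the size of the group's array, with a missing group counting as 0 (a conceptual sketch, not the actual implementation):

```perl
use strict;
use warnings;

my %groups = ( AAA => [ { id => 1 }, { id => 3 } ] );

# The plain-Perl analogue of groupSizeIdx(): the size of the
# group's array, or 0 when the group does not exist.
sub group_size {
    my ($key) = @_;
    return scalar @{ $groups{$key} // [] };
}

print group_size("AAA"), "\n"; # prints 2
print group_size("CCC"), "\n"; # prints 0
```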

The $idxType argument is the non-leaf parent index of the group. (Using a leaf index type is not an error but it always returns 0, because there are no groups under it). It's basically the same index type as you would use in findIdx() to find the first row of the group or in firstOfGroupIdx() or nextGroupIdx() to find the boundaries of the group. Remember, a non-leaf index type defines the groups, and the nested index types under it define the order in those groups (and possibly further break them down into sub-groups).

It's a bit confusing, so let's recap with another example. If you have a table type defined as:

our $ttPosition = Triceps::TableType->new($rtPosition)
  ->addSubIndex("primary",
    Triceps::IndexType->newHashed(key => [ "date", "customer", "symbol" ])
  )
  ->addSubIndex("currencyLookup", # for joining with currency conversion
    Triceps::IndexType->newHashed(key => [ "date", "currency" ])
    ->addSubIndex("grouping", Triceps::IndexType->newFifo())
  )
  ->addSubIndex("byDate", # for cleaning by date
    Triceps::SimpleOrderedIndex->new(date => "ASC")
    ->addSubIndex("grouping", Triceps::IndexType->newFifo())
  )
;

then it would make sense to call groupSizeIdx(), firstOfGroupIdx() and nextGroupIdx() with the indexes currencyLookup or byDate but not with primary, currencyLookup/grouping nor byDate/grouping. You can call findIdx() with any index, but for currencyLookup or byDate it would return the first row of the group while for primary, currencyLookup/grouping or byDate/grouping it would return the only matching row. On the other hand, for iteration in a group, it makes sense to call nextIdx() only on primary, currencyLookup/grouping or byDate/grouping. Calling nextIdx() on the non-leaf index types is not an error but it would in effect resolve to the same thing as using their first leaf sub-indexes.

9.7. Secondary indexes

The last example dealt only with the row inserts, because it could not handle the deletions that well. What if the trades may get cancelled and have to be removed from the table? There is a solution to this problem: add one more index. Only this time not nested but in parallel. The indexes in the table type become tree-formed:

TableType
+-IndexType Hash "byId" (id)
+-IndexType Hash "bySymbol" (symbol)
  +-IndexType Fifo "last2"

It's very much like the common relational databases where you can define multiple indexes on the same table. Both indexes byId and bySymbol (together with its nested sub-index) refer to the same set of rows stored in the table. Only byId allows to easily find the records by the unique id, while bySymbol is responsible for keeping them grouped by the symbol, in FIFO order. It could be said that byId is the primary index (since it has a unique key) and bySymbol is a secondary one (since it does the grouping) but from the standpoint of Triceps they are pretty much equal and parallel to each other.

To illustrate the point, here is a modified version of the previous example. Not only does it manage the deletes but also computes the average price of the collected transactions as it iterates through the group, thus performing a manual aggregation.

our $uTrades = Triceps::Unit->new("uTrades");
our $rtTrade = Triceps::RowType->new(
  id => "int32", # trade unique id
  symbol => "string", # symbol traded
  price => "float64",
  size => "float64", # number of shares traded
);

our $ttWindow = Triceps::TableType->new($rtTrade)
  ->addSubIndex("byId",
    Triceps::IndexType->newHashed(key => [ "id" ])
  )
  ->addSubIndex("bySymbol",
    Triceps::IndexType->newHashed(key => [ "symbol" ])
      ->addSubIndex("last2",
        Triceps::IndexType->newFifo(limit => 2)
      )
  )
;
$ttWindow->initialize();
our $tWindow = $uTrades->makeTable($ttWindow, "tWindow");

# remember the index type by symbol, for searching on it
our $itSymbol = $ttWindow->findSubIndex("bySymbol");
# remember the FIFO index, for finding the start of the group
our $itLast2 = $itSymbol->findSubIndex("last2");

# remember, which was the last row modified
our $rLastMod;
our $lbRememberLastMod = $uTrades->makeLabel($rtTrade, "lbRememberLastMod",
  undef, sub { # (label, rowop)
    $rLastMod = $_[1]->getRow();
  });
$tWindow->getOutputLabel()->chain($lbRememberLastMod);

# Print the average price of the symbol in the last modified row
sub printAverage # (row)
{
  return unless defined $rLastMod;
  my $rhFirst = $tWindow->findIdx($itSymbol, $rLastMod);
  my $rhEnd = $rhFirst->nextGroupIdx($itLast2);
  print("Contents:\n");
  my $avg;
  my ($sum, $count);
  for (my $rhi = $rhFirst;
      !$rhi->same($rhEnd); $rhi = $rhi->nextIdx($itLast2)) {
    print("  ", $rhi->getRow()->printP(), "\n");
    $count++;
    $sum += $rhi->getRow()->get("price");
  }
  if ($count) {
    $avg = $sum/$count;
  }
  print("Average price: ", (defined $avg? $avg: "Undefined"), "\n");
}

while(<STDIN>) {
  chomp;
  my @data = split(/,/);
  $uTrades->makeArrayCall($tWindow->getInputLabel(), @data);
  &printAverage();
  undef $rLastMod; # clear for the next iteration
  $uTrades->drainFrame(); # just in case, for completeness
}

And an example of its work, with the input lines shown in bold:

OP_INSERT,1,AAA,10,10
Contents:
  id="1" symbol="AAA" price="10" size="10"
Average price: 10
OP_INSERT,2,BBB,100,100
Contents:
  id="2" symbol="BBB" price="100" size="100"
Average price: 100
OP_INSERT,3,AAA,20,20
Contents:
  id="1" symbol="AAA" price="10" size="10"
  id="3" symbol="AAA" price="20" size="20"
Average price: 15
OP_INSERT,4,BBB,200,200
Contents:
  id="2" symbol="BBB" price="100" size="100"
  id="4" symbol="BBB" price="200" size="200"
Average price: 150
OP_INSERT,5,AAA,30,30
Contents:
  id="3" symbol="AAA" price="20" size="20"
  id="5" symbol="AAA" price="30" size="30"
Average price: 25
OP_INSERT,6,BBB,300,300
Contents:
  id="4" symbol="BBB" price="200" size="200"
  id="6" symbol="BBB" price="300" size="300"
Average price: 250
OP_DELETE,3
Contents:
  id="5" symbol="AAA" price="30" size="30"
Average price: 30
OP_DELETE,5
Contents:
Average price: Undefined

The input has changed: now an extra column is prepended to it, containing the opcode for the row. The updates to the table are not printed any more, but the calculated average price is printed after the new contents of the group.

In the code, the first obvious addition is the extra index in the table type. The label that used to print the updates is gone, and replaced with another one, that remembers the last modified row in a global variable.

That last modified row is then used in the function printAverage() to find the group for iteration. Why? Couldn't we just remember the symbol from the input data? Not always. As you can see from the last two input rows with OP_DELETE, the trade id is the only field required to find and delete a row using the index byId. So these trade cancellation rows take a shortcut and only provide the trade id, not the rest of the fields. If we tried to remember the symbol field from them, we'd remember an undef. Can we just look up the row by id after the incoming rowop has been processed? Not after a deletion. If we try to find the symbol by looking up the row after the deletion, we will find nothing, because the row will already be deleted. We could look up the row in the table before the deletion, remember it, and afterwards do the look-up of the group by it. But since on deletion the row will come to the table's output label anyway, we can just ride the wave and remember it instead of doing the manual look-up. And this also spares the need of creating a row with the last symbol for searching: we get a ready pre-made row with the right symbol in it.

Note that in this example, unlike the previous one, there are no two ways of finding the group any more: after deletion the row handle will not be in the table any more, and could not be used to jump directly to the beginning of its group. findIdx() has to be used to find the group.

By the time printAverage() executes, it could happen that all the rows with that symbol are gone, and the group has disappeared. This situation is handled nicely in an automatic way: findIdx() will return a NULL row handle, for which nextGroupIdx() will also return a NULL row handle. The for-loop will immediately satisfy the end condition $rhi->same($rhEnd), make no iterations, and leave $count and $avg undefined. As a result no rows will be printed and the average value will be printed as Undefined, as you can see in the reaction to the last input row in the sample output.

The main loop becomes reduced to reading the input, splitting the line, separating the opcode, calling the table's input label, and printing the average. The auto-conversion from the opcode name is used when constructing the rowop. Normally it's not a good practice, since the program will die if it finds a bad rowop in the input, but good enough for a small example. The direct use of $uTrades->call() guarantees that by the time it returns, the last modified row will be remembered in $rLastMod, available for printAverage() to use.

After the average is calculated, $rLastMod is reset to prevent it from accidentally affecting the next row. If the next row is an attempt to delete a trade id that is not in the table any more, the DELETE operation will have no effect on the table, and nothing will be sent from the table's output label. $rLastMod will stay undefined, and printAverage() will check it and immediately return. An attempt to pass an undef argument to findIdx() would be an error.

The final $uTrades->drainFrame() is there purely for completeness. In this case we know that nothing will be scheduled by the labels downstream from the table, and there will be nothing to drain.

Now, an interesting question is: how does the table know, that to delete a row, it has to find it using the field id? Or, since the deletion internally uses find(), the more precise question is: how does find() know that it has to use the index byId? It doesn't use any magic. It simply goes by the first index defined in the table. That's why the index byId has been very carefully placed before bySymbol. The same principle applies to all the other functions like next(), that use an index but don't receive one as an argument: the first index is always the default index. There is a bit more detail to it, but that's the rough principle.

9.8. Sorted index

The hashed index provides a way to store rows indexed by a key. It is fast but it has a price to pay for that speed: when iterating through it, the records come in an unpredictable (though repeatable, within a particular machine architecture) order determined by the hash function. If the order doesn't matter, that's fine. But often the order does matter, and is worth having even at the cost of reduced performance.

The sorted index provides a solution for this problem. It is created with:

$it = Triceps::IndexType->newPerlSorted($sortName,
  $initFunc, $compareFunc, @args);

The Perl in newPerlSorted refers to the fact that the sorting order is specified as a Perl comparison function.

$sortName is just a symbolic name for printouts. It's used when you call $it->print() (directly or as a recursive call from the table type print) to let you know what kind of index type it is, since it can't print the compiled comparison function. It is also used in the error messages if something dies inside the comparison function: the comparison is executed from deep inside the C++ code, and by that time the $sortName is the only way to identify the source of the problems. It's not the same name as used to connect the index type into the table type hierarchy with addSubIndex(). As usual, an index type may be reused in multiple hierarchies, with different names, but in all cases it will also keep the same $sortName. This may be easier to show with an example:

$rt1 = Triceps::RowType->new(
  a => "int32",
  b => "string",
);

$it1 = Triceps::IndexType->newPerlSorted("basic", undef, \&compBasic);

$tt1 = Triceps::TableType->new($rt1)
  ->addSubIndex("primary", $it1)
;

$tt2 = Triceps::TableType->new($rt1)
  ->addSubIndex("first", $it1)
;

print $tt1->print(), "\n";
print $tt2->print(), "\n";

The print calls in it will produce:

table (
  row {
    int32 a,
    string b,
  }
) {
  index PerlSortedIndex(basic) primary,
}
table (
  row {
    int32 a,
    string b,
  }
) {
  index PerlSortedIndex(basic) first,
}

Both the name of the index type in the table type and the name of the sorted index type are printed, but in different spots.

The $initFunc and/or $compareFunc function references (or, as usual, they may be specified as source code strings) specify the sorting order. One of them may be left undefined but not both. @args are the optional arguments that will be passed to both functions.

The easiest but least flexible way is to just use the $compareFunc. It gets two Rows (not RowHandles!) as arguments, plus whatever is specified in @args. It returns the usual Perl-style <=> result. For example:

sub compBasic # ($row1, $row2)
{
  return $_[0]->get("a") <=> $_[1]->get("a");
}

Don't forget to use <=> for the numbers and cmp for the strings. The typical Perl idiom for sorting by more than one field is to connect them by ||.
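For instance, here is the idiom in action with the core sort builtin on plain hash-reference rows (the field names are made up for the illustration):

```perl
use strict;
use warnings;

my @rows = (
    { num => 2, str => "x" },
    { num => 1, str => "z" },
    { num => 1, str => "y" },
);

# Sort by the numeric field ascending, then by the string field
# ascending, chaining the comparisons with ||.
my @sorted = sort {
    $a->{num} <=> $b->{num}
    || $a->{str} cmp $b->{str}
} @rows;

print join(" ", map { "$_->{num}$_->{str}" } @sorted), "\n"; # prints "1y 1z 2x"
```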

Or, if we want to specify the field names as arguments, we could define a sort function that sorts first by a numeric field in ascending order, then by a string field in descending order:

sub compAscDesc # ($row1, $row2, $numFldAsc, $strFldDesc)
{
  my ($row1, $row2, $numf, $strf) = @_;
  return $row1->get($numf) <=> $row2->get($numf)
    || $row2->get($strf) cmp $row1->get($strf); # backwards for descending
}

my $sit = Triceps::IndexType->newPerlSorted("by_a_b", undef,
  \&compAscDesc, "a", "b");

This assumes that the row type will have a numeric field a and a string field b. If it doesn't, this will not be discovered until you create a table and try to insert some rows into it, which will finally call the comparison function. At that point the attempt to get a non-existing field will confess; this error will be caught by the table and set as the sticky error in it. The insert operation will confess with this error, and any future operations on the table will also confess with it (that is the meaning of sticky): the table becomes unusable.

The $initFunc provides a way to do that check, and more, up front. It is called at the table type initialization time. By this time all this extra information is known, and it gets the references to the table type, the index type (itself, but with the class stripped back to Triceps::IndexType), the row type, and whatever extra arguments were passed. It can do all the checks once.

The init function's return value is kind of backwards to everything else: on success it returns undef, on error it returns the error message. It could die too, but simply returning an error message is somewhat nicer. The returned error messages may contain multiple lines separated by \n, so it should try to collect all the error information it can.

The init function that would check the arguments for the last example can be defined as:

sub initNumStr # ($tabt, $idxt, $rowt, @args)
{
  my ($tabt, $idxt, $rowt, @args) = @_;
  my %def = $rowt->getdef(); # the field definition
  my $errors; # collect as many errors as possible
  my $t;

  if ($#args != 1) {
    $errors .= "Received " . ($#args + 1) . " arguments, must be 2.\n"
  } else {
    $t = $def{$args[0]};
    if ($t !~ /int32$|int64$|float64$/) {
      $errors .= "Field '" . $args[0] . "' is not of numeric type.\n"
    }
    $t = $def{$args[1]};
    if ($t !~ /string$|uint8/) {
      $errors .= "Field '" . $args[1] . "' is not of string type.\n"
    }
  }

  if (defined $errors) {
    # help with diagnostics, append the row type to the error listing
    $errors .= "the row type is:\n";
    $errors .= $rowt->print();
  }
  return $errors;
}

my $sit = Triceps::IndexType->newPerlSorted("by_a_b", \&initNumStr,
  \&compAscDesc, "a", "b");

The init function can do even better: it can create and set the comparison function. It's done with:

$idxt->setComparator($compareFunc);

When the init function sets the comparator, the compare function argument in newPerlSorted() can be left undefined, because setComparator() would override it anyway. But one way or the other, the compare function must be set, or the index type initialization and with it the table type initialization will fail.

By the way, the sorted index type init function is not of the same kind as the aggregator type init function. The aggregator type could use an init function of this kind too, but at the time it looked like too much extra complexity. It probably will be added in the future. But more about aggregators later.

A fancier example of the init function will be shown in the next section.

Internally the implementation of the sorted index shares much with the hashed index. They both are implemented as trees but they compare the rows in different ways. The hashed index is aimed at speed, the sorted index at flexibility. The common implementation means that they share certain traits. Both kinds have unique keys: there cannot be two rows with the same key in an index of either kind. And both kinds allow other indexes to be nested in them.

The handling of the fatal errors (as in die()) in the initialization and especially comparison functions is an interesting subject. The errors propagate properly through the table, and the table operations confess with the Perl handler's error message. But since an error in the comparison function means that things are going very, very wrong, after that the table becomes inoperative and will die on all the subsequent operations as well. You need to be very careful in writing these functions.

9.9. Ordered index

To specify the sorting order in a more SQL-like fashion, Triceps has the class SimpleOrderedIndex. It's implemented entirely in Perl, on top of the sorted index. Besides being useful by itself, it shows off two concepts: the initialization function of the sorted index, and the template with code generation on the fly.

First, how to create an ordered index:

$it = Triceps::SimpleOrderedIndex->new($fieldName => $order, ...);

The arguments are the key fields. $order is one of "ASC" for ascending and "DESC" for descending. Here is an example of a table with this index:

my $tabType = Triceps::TableType->new($rowType)
  ->addSubIndex("sorted",
    Triceps::SimpleOrderedIndex->new(
      a => "ASC",
      b => "DESC",
    )
  );

When it gets translated into a sorted index, the comparison function gets generated automatically. It's smart enough to generate the string comparisons for the string and uint8 fields, and the numeric comparisons for the numeric fields. It's not smart enough to do the locale-specific comparisons for the strings and locale-agnostic for the uint8, it just uses whatever you have set up in cmp for both. It treats the NULL field values as numeric 0 or empty strings. It doesn't handle the array fields at all but can at least detect such attempts and flag them as errors.

A weird artifact of the boundary between C++ and Perl is that when you get the index type back from the table type like

$sortIdx = $tabType->findSubIndex("sorted");

the reference stored in $sortIdx will be of the base type Triceps::IndexType. That's because the C++ internals of the TableType object know nothing about any derived Perl types. But it's no big deal, since there are no other useful methods for SimpleOrderedIndex anyway. I have an idea for a workaround, but it will have to wait for a future version.

If you call $sortIdx->print(), it will give you an idea of how it was constructed:

PerlSortedIndex(SimpleOrder a ASC, b DESC, )

The contents of the parentheses are the sort name from the sorted index's standpoint. It's an arbitrary string. But when the ordered index prepares this string to pass to the sorted index, it puts its arguments into it.

Now the interesting part, I want to show the implementation of the ordered index. It's not too big and it shows the flexibility and the extensibility of Triceps:

package Triceps::SimpleOrderedIndex;

our @ISA = qw(Triceps::IndexType);

# Create a new ordered index. The order is specified
# as pairs of (fieldName, direction) where direction is a string
# "ASC" or "DESC".
sub new # ($class, $fieldName => $direction...)
{
  my $class = shift;
  my @args = @_; # save a copy

  # build a descriptive sortName
  my $sortName = 'SimpleOrder ';
  while ($#_ >= 0) {
    my $fld = shift;
    my $dir = shift;
    $sortName .= quotemeta($fld) . ' ' . quotemeta($dir) . ', ';
  }

  my $self = Triceps::IndexType->newPerlSorted(
    $sortName, '&Triceps::SimpleOrderedIndex::init(@_)', undef, @args
  );
  bless $self, $class;
  return $self;
}

# The initialization function that actually parses the args.
sub init # ($tabt, $idxt, $rowt, @args)
{
  my ($tabt, $idxt, $rowt, @args) = @_;
  my %def = $rowt->getdef(); # the field definition
  my $errors; # collect as many errors as possible
  my $compare = ""; # the generated comparison function
  my $connector = "return"; # what goes between the comparison operators

  while ($#args >= 0) {
    my $f = shift @args;
    my $dir = uc(shift @args);

    my ($left, $right); # order the operands depending on sorting direction
    if ($dir eq "ASC") {
      $left = 0; $right = 1;
    } elsif ($dir eq "DESC") {
      $left = 1; $right = 0;
    } else {
      $errors .= "unknown direction '$dir' for field '$f', use 'ASC' or 'DESC'\n";
      # keep going, may find more errors
    }

    my $type = $def{$f};
    if (!defined $type) {
      $errors .= "no field '$f' in the row type\n";
      next;
    }

    my $cmp = "<=>"; # the comparison operator
    if ($type eq "string"
    || $type =~ /^uint8.*/) {
      $cmp = "cmp"; # string version
    } elsif($type =~ /\]$/) {
      $errors .= "can not order by the field '$f', it has an array type '$type', not supported yet\n";
      next;
    }

    my $getter = "->get(\"" . quotemeta($f) . "\")";

    $compare .= "  $connector \$_[$left]$getter $cmp \$_[$right]$getter\n";

    $connector = "||";
  }

  $compare .= "  ;\n";

  if (defined $errors) {
    # help with diagnostics, append the row type to the error listing
    $errors .= "the row type is:\n";
    $errors .= $rowt->print();
  } else {
    # set the comparison as source code
    #print STDERR "DEBUG Triceps::SimpleOrderedIndex::init: comparison function:\n$compare\n";
    $idxt->setComparator($compare);
  }
  return $errors;
}

The class constructor simply builds the sort name from the arguments and offloads the rest of the logic to the init function. It can't really do much more: when the index type object is constructed, it doesn't know yet where it will be used and what row type it will get. It tries to nicely quote the weird characters in the arguments when they go into the sort name. Not that much use comes from it at the moment: the C++ code that prints the table type information doesn't do the same, so there still is a chance of misbalanced quotes in the result. But perhaps the C++ code will be fixed at some point too.

The init function is called at the table type initialization time with all the needed information. It goes through all the arguments, looks up the fields in the row type, and checks them for correctness. It tries to collect as much of the error information as possible. The returned error messages may contain multiple lines separated by \n, and the ordered index makes use of it. The error messages get propagated back to the table type level, nicely indented and returned from the table initialization. If the init function finds any errors, it appends the printout of the row type too, to make finding what went wrong easier. A result of a particularly bad call to a table type initialization may look like this:

index error:
  nested index 1 'sorted':
    unknown direction 'XASC' for field 'z', use 'ASC' or 'DESC'
    no field 'z' in the row type
    can not order by the field 'd', it has an array type 'float64[]', not supported yet
    the row type is:
    row {
      uint8 a,
      uint8[] b,
      int64 c,
      float64[] d,
      string e,
    }

Also as the init goes through the arguments, it constructs the text of the compare function in the variable $compare. Here the use of quotemeta() for the user-supplied strings is important to avoid the syntax errors in the generated code. If no errors are found in the arguments, the compare function gets compiled with eval. There should not be any errors, but it's always better to check. Finally the compiled compare function is set in the sorted index with

$idxt->setComparator($cmpfunc)

If you uncomment the debugging printout line (and run make, and maybe make install afterwards), you can see the auto-generated code printed on stderr when you use the simple ordered index. It will look somewhat like this:

sub {
  return $_[0]->get("a") cmp $_[1]->get("a")
  || $_[1]->get("c") <=> $_[0]->get("c")
  || $_[0]->get("b") cmp $_[1]->get("b")
  ;
}

That's it! An entirely new piece of functionality added in a smallish Perl snippet. This is your typical Triceps template: collect the arguments, use them to build Perl code, and compile it. Of course, if you don't want to deal with the code generation and compilation, you can just call your class methods and whatnot to interpret the arguments. But if the code will be reused, the compilation is more efficient.
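To make the pattern concrete, here is a minimal self-contained sketch of the same code-generation idea, using plain Perl hashes in place of the Triceps rows (makeComparator and its argument format are made up for this illustration, they are not part of the Triceps API):

```perl
use strict;
use warnings;

# Build the text of a compare function from (field => direction) pairs,
# then compile it with eval, just like the sorted index template does.
sub makeComparator {
    my @spec = @_;
    my @terms;
    while (my ($field, $dir) = splice(@spec, 0, 2)) {
        my $f = quotemeta($field); # guard against syntax errors in the generated code
        push @terms, ($dir eq "DESC")
            ? "\$_[1]->{\"$f\"} <=> \$_[0]->{\"$f\"}"
            : "\$_[0]->{\"$f\"} <=> \$_[1]->{\"$f\"}";
    }
    my $text = "sub {\n  return " . join("\n  || ", @terms) . ";\n}";
    my $cmp = eval $text;
    die "comparator compilation error: $@" if $@;
    return $cmp;
}

my $cmp = makeComparator(a => "ASC", c => "DESC");
my @rows = ({a => 1, c => 5}, {a => 1, c => 9}, {a => 0, c => 2});
my @sorted = sort { $cmp->($a, $b) } @rows;
# @sorted is now ({a=>0, c=>2}, {a=>1, c=>9}, {a=>1, c=>5})
```

The compilation happens once, when the template is instantiated, and the compiled function is then called for every comparison, which is what makes this approach more efficient than interpreting the arguments on every call.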

9.10. The index tree

The index types in a table type can form a pretty much arbitrary tree. Following the common tree terminology, the index types that have no other index types nested in them, are called the leaf index types. Since there seems to be no good one-word naming for the index types that have more index types nested in them ("inner"? "nested" is too confusing), I simply call them non-leaf.

At the moment the Hashed, Sorted and Ordered index types can be used in both leaf and non-leaf positions. The FIFO index types must always be in the leaf position, they don't allow any further nesting.

Now is the time to look deeper into what is going on inside a table. Note that I've been very carefully talking about index types and not indexes. In this section the difference matters. The index types are in the table type, the indexes are in the table. One index type may generate multiple indexes.

This will become clearer after you see the illustrations. First, the legend in the Figure 9.1.

Drawings legend.

Figure 9.1. Drawings legend.


The nodes belonging to the table type are shown in red, the nodes belonging to the table are shown in blue, and the contents of the RowHandle are shown separately in yellow. The lines on the drawings represent not exactly the pointers as such but rather the logical connections that may be more complicated than simple pointers.

The lines in the RowHandle don't mean anything at all, they just show that the parts go together. In reality a RowHandle is a chunk of memory, with the various elements placed in that memory. As far as the indexes are concerned, the RowHandle contains an iterator for every index where it belongs. This lets it know its position in the table, iterate along every index, and, most importantly, be removed quickly from every index. A RowHandle belongs to one index of each index type, and contains the matching number of iterators in it.

The table type is shown as a normal flat tree. But the table itself is more complex and becomes 3-dimensional. Its view from above matches the table type's tree but the data grows up in the third dimension.

Let's start with the simplest case: a table type with only one index type. Whether the index type is hash or FIFO, doesn't matter here.

TableType
+-IndexType "A"

Figure 9.2 shows the table structure.

One index type.

Figure 9.2. One index type.


The table here always contains exactly one index, matching the one defined index type, and the root index. The root index is very dumb, its only purpose is to tie together the multiple top-level indexes into a tree.

The only index of type A provides an ordering of the records, and this ordering is used for the iteration on the table.

For the next example let's look at the straight nesting in Figure 9.3.

TableType
+-IndexType "A"
  +-IndexType "B"
Straight nesting.

Figure 9.3. Straight nesting.


The stack of row references is shown visually divided to match the indexing, but in reality there is no special division. This was done purely to make the picture easier to read.

There is still only one index of type A. And this is always the case with the top-level indexes, there is only one of them. This index divides the rows into 3 groups. Just like the rows in a leaf index, the groups in a non-leaf index are ordered in some index-specific way.

Each group then has its own second-level index of type B. Which then defines an order for the rows in it. To reiterate: the index of type A splits the rows by groups, then the group's index of type B defines the order of the rows in the group.

So what happens when we iterate through the table and ask for the next row handle? The current row handle contains the iterators in the indexes of types A and B. The easy thing is to advance the iterator of type B. Yeah, but in which index? The Figure 9.3 shows three indexes of type B, let's call them B1, B2 and B3. The iterator of type B in the row handle tells the relative position in the index, but it doesn't tell, which index it is. We need to step back and look at the index type A. It's the top-level index type, so there is always only one index for it. Then we take the iterator of type A and find this row's group in the index A. The group contains the index of type B, say B1. We can then take this index B1, take the iterator of type B from the row handle, and advance this iterator in this index. If the advance succeeded, then great, we've got the next row handle. But if the current row was the last row in B1, we need to step back to the index A again, advance the current row handle's iterator of type A there, find its index B2, and pick the first row handle of B2.

This process is what happens when we use $rh->nextIdx($itB). The iteration goes by the leaf index type B, however it relies on all the index types in the path from the table type to B. If we do $rh->next(), the result is the same because the first leaf index type is used as the default index type for the iteration.
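In code, such an iteration might be sketched like this (a sketch, assuming that $table and the index type reference $itB already exist, and using the methods described in this chapter):

```perl
# iterate the whole table in the order of the index type B;
# $table->begin() with $rh->next() would produce the same order here,
# since B is the first leaf index type
for (my $rh = $table->beginIdx($itB); !$rh->isNull(); $rh = $rh->nextIdx($itB)) {
    my $row = $rh->getRow();
    # ... process $row ...
}
```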

If we do $rh->next($itA), the semantics is still the same: return the next row handle (not the next group). There is no way to get to the row handle without going all the way through a leaf index. So when a non-leaf index type is used for the iteration, it gets implicitly extended to its first nested leaf index type.

What would happen if a new row gets inserted, and the index type A determines that it does not belong to any of the existing groups? A new group will be created and inserted in the appropriate position in A's order. This group will have a new index of type B created, and the new row inserted in that index.

What would happen if both rows in B1 are removed? B1 will become empty and will be collapsed. The index A will delete the B1's group and B1 itself, and will remain with only two groups. The effect propagates upwards: if all the rows are removed, the last index of type B will collapse, then the index A will become empty and also collapse and be deleted. The only thing left will be the root index that stays in existence no matter what.

When a table is first created, it has only the root index. The rest of the indexes pop into existence as the rows get inserted. If you wonder, yes, this does apply to a table type with only one index type as well. Just this point has not been brought up until now.

Among all this froth of creation and collapse the iterators stay stable. Once a row is inserted, the indexes leading to it are not going anywhere (at least until that row gets removed). But since other rows and groups may be inserted around it, the notion of which row is next will change over time.

Let's go through how the other index-related operations work.

The iteration through the whole table starts with begin() or beginIdx(), the first being a form of the second that always uses the first leaf index type. beginIdx() is fairly straightforward: it just follows the path from the root to the leaf, picking the first position in each index along the way, until it hits the RowHandle, as is shown in Figure 9.4. That found RowHandle becomes its result. If the table is empty, it returns the NULL row handle.

begin(), beginIdx($itA) and beginIdx($itB) work the same for this table.

Figure 9.4. begin(), beginIdx($itA) and beginIdx($itB) work the same for this table.


The next pair is find() and findIdx() (and findBy() and findIdxBy() are wrappers around those). As usual, find() is the same thing as findIdx() on the table's first leaf index type. It also follows the path from the root to the target index type. On each step it tries to find a matching position in the current index. If the position could not be found, the search fails and a NULL row handle is returned. If found, it is used to progress to the next index.

As has been mentioned in Section 9.5: “A closer look at the RowHandles” , the search always works internally on a RowHandle argument. If a plain Row is used as an argument, a new temporary RowHandle will be created for it, searched, and then freed after the search. This works well for two reasons. First, the indexes already have the functions for comparing two row handles to build their ordering. The same functions are reused for the search. Second, the row handles contain not only the index iterators but also the cached information from the rows, to make the comparisons faster. The exact kind of cached information varies by the index type. The FIFO, Sorted and Ordered indexes use none. The Hashed indexes calculate a hash of the key field values, that will be used as a quick differentiator for the search. This information gets created when the row handle gets created. Whether the row handle is then used to insert into the table or to search in it, the hash is then used in the same way, to speed up the comparisons.
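For illustration, both ways of searching can be sketched like this (a sketch, assuming that $table, $itA and the row $row already exist):

```perl
# search with a plain row: a temporary row handle is created
# internally, used for the search, and then freed
my $rh = $table->findIdx($itA, $row);

# or create the row handle explicitly and reuse it: the cached
# key information is then computed only once
my $rhArg = $table->makeRowHandle($row);
my $rh2 = $table->findIdx($itA, $rhArg);
```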

In findIdx(), the non-leaf index type arguments behave differently than the leaf ones: up to and including the index of the target type, the search works as usual. But then at the next level the logic switches to the same as in beginIdx(), going for the first row handle of the first leaf sub-index. This lets you find the first row handle of the matching group under the target index type.

If you use $table->findIdx($itA, $rh), on Figure 9.5 it will go through the root index to the index A. There it will try to find the matching position. If none is found, the search ends and returns a NULL row handle. If the position is found, the search progresses towards the first leaf sub-index type. Which is the index type B, and which conveniently sits in this case right under A. The position in the index A determines, which index of type B will be used for the next step. Suppose it's the second position, so the second index of type B is used. Since we're now past the target index A, the logic used is the same as for beginIdx(), and the first position in B2 is picked. Which then leads to the first row handle of the second sub-stack of handles.

findIdx($itA, $rh) goes through A and then switches to the beginIdx() logic.

Figure 9.5. findIdx($itA, $rh) goes through A and then switches to the beginIdx() logic.


The method firstOfGroupIdx() provides the navigation within a group: it lets you jump from some row somewhere in the group to the first one, and then iterate through the group from there. The example in Section 9.6: “A window is a FIFO” made use of it.

The Figure 9.6 shows an example of $table->firstOfGroupIdx($itB, $rh), where $rh is pointing to the third record in B2. What it needs to do is go back to B2, and then execute the begin() logic from there on. However, remember, the row handle does not have a pointer to the indexes in the path, it only has the iterators. So, to find B2, the method does not really back up from the original row. It has to start all the way back from the root and follow the path to B2 using the iterators in $rh. Since it uses the ready iterators, this works fast and requires no row comparisons. But logically it's equivalent to backing up by one level, and I'll continue calling it that for simplicity. Once B2 (an index of type B) is reached, the begin() logic goes for the first row in there.

firstOfGroupIdx() works on both leaf and non-leaf index type arguments in the same way: it backs up from the reference row to the index of that type and executes the begin() logic from there. Obviously, if you use it on a non-leaf index type, the begin()-like part will follow its first leaf index type.

firstOfGroupIdx($itB, $rh).

Figure 9.6. firstOfGroupIdx($itB, $rh).


The method nextGroupIdx() jumps to the first row of the next group, according to the argument index type. To do that, it has to retrace one level higher than firstOfGroupIdx(). Figure 9.7 shows that $table->nextGroupIdx($itB, $rh) that starts from the same row handle as in Figure 9.6, has to logically back up to the index A, go to the next iterator there, and then follow to the first row of B3.

nextGroupIdx($itB, $rh).

Figure 9.7. nextGroupIdx($itB, $rh).


As before, in reality there is no backing up, just the path is retraced from the root using the iterators in the row handle. Once the parent of index type B is reached (which is the index of type A), the path follows not the iterator from the row handle but the next one (yes, copied from the row handle, increased, followed). This gives the index of type B that contains the next group. And from there the same begin()-like logic finds its first row.

Same as firstOfGroupIdx(), nextGroupIdx() may be used on both the leaf and non-leaf indexes, with the same logic.
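Together, these two methods give a convenient way to iterate through a single group, sketched here under the assumption that $rhSample points somewhere inside the group of interest:

```perl
# the first row of the next group serves as the end marker;
# it will be the NULL handle if this is the last group
my $rhEnd = $table->nextGroupIdx($itB, $rhSample);
for (my $rh = $table->firstOfGroupIdx($itB, $rhSample);
        !$rh->same($rhEnd); $rh = $rh->nextIdx($itB)) {
    # ... process $rh ...
}
```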

It's kind of annoying that firstOfGroupIdx() and nextGroupIdx() take the index type inside the group while findIdx() takes the parent index type to act on the same group. But as you can see, each of them follows its own internal logic, and I'm not sure if they can be reconciled to be more consistent.

At the moment the only navigation is forward. There is no matching last(), prev() or lastGroupIdx() or prevGroupIdx(). They are in the plan, but so far they are the victims of corner-cutting. Though there is a version of last() in the AggregatorContext, since it happens to be particularly important for the aggregation.

Continuing our excursion into the index nesting topologies, the next example is of two parallel leaf index types:

TableType
+-IndexType A
+-IndexType B

The resulting internal arrangement is shown in Figure 9.8.

Two top-level index types.

Figure 9.8. Two top-level index types.


Each index type produces exactly one index under the root (since the top-level index types always produce one index). Both indexes contain the same number of rows, and exactly the same rows. When a row is added to the table, it's added to all the leaf index types (one actual index of each type). When a row is deleted from the table, it's deleted from all the leaf index types. So the total is always the same. However the order of rows in the indexes may differ. The drawing shows the row references stacked in the same order as the index A because the index A is of the first leaf index type, and as such is the default one for the iteration.

The row handle contains the iterators for both paths, A and B. It's pretty normal to find a row through one index type and then iterate from there using the other index type.
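A sketch of this mixed usage, with the table, the index type references and the row handle argument assumed to exist:

```perl
# find the row through the index type A ...
my $rh = $table->findIdx($itA, $rhArg);
if (!$rh->isNull()) {
    # ... then step to its neighbor in the order of the index type B
    my $rhNext = $rh->nextIdx($itB);
}
```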

The next example in Figure 9.9 has a primary index with a unique key and a secondary index that groups the records:

TableType
+-IndexType A
+-IndexType B
  +-IndexType C
A primary and secondary index type.

Figure 9.9. A primary and secondary index type.


The index type A still produces one index and references all the rows directly. The index of type B produces the groups, with each group getting an index of type C. The total set of rows reachable through A and through B is still the same, but through B they are split into multiple groups.

And Figure 9.10 shows two leaf index types nested under one non-leaf.

TableType
+-IndexType A
  +-IndexType B
  +-IndexType C
Two index types nested under one.

Figure 9.10. Two index types nested under one.


As usual, there is only one index of type A, and it splits the rows into groups. The new item in this picture is that each group has two indexes in it: one of type B and one of type C. Both indexes in a group contain the same rows. They don't decide which rows they get: the index A decides which rows go into which group. Then if the group 1 contains two rows, the indexes B1 and C1 would each contain the same two rows. The stack of row references has been visually split by groups to make this point more clear.

This happens to be a pretty useful arrangement: for example, B might be a hashed or sorted index type, allowing the records to be found by the key (and for the sorted index, iterated in the order of the keys), while C might be a FIFO index, keeping the insertion order, and maybe keeping the window size limited.

That's pretty much it for the basic index topologies. Some much more complex index trees can be created, but they would be the combinations of the examples shown. Also, don't forget that every extra index type adds overhead in both memory and CPU time, so avoid adding indexes that are not needed.

One more fine point has to do with the replacement policies. Consider that we have a table that contains the rows with a single field:

id int32

And the table type has two indexes:

TableType
+-IndexType "A" HashIndex key=(id)
+-IndexType "B" FifoIndex limit=3

And we send there the rowops:

INSERT id=1
INSERT id=2
INSERT id=3
INSERT id=2

The last rowop that inserts the row with id=2 for the second time triggers the replacement policy in both index types. In the index A it is a duplicate key and will cause the removal of the previous row with id=2. In the index B it overflows the limit and pushes out the oldest row, the one with id=1. If both records get deleted, the resulting table contents will be 2 rows (shown in FIFO order):

id=3
id=2

Which is probably not the best outcome. It might be tolerable with a FIFO index and a hashed index but gets even more annoying if there are two FIFO index types in the table: one top-level limiting the total number of rows, another one nested under a hashed index, limiting the number of rows per group, and they start conflicting this way with each other.

The Triceps FIFO index is actually smart enough to avoid such problems: it looks at what the preceding indexes have decided to remove, checks if any of these rows belong to its group, and adjusts its calculation accordingly. In this example the index B will find out that the row with id=2 is already displaced by the index A. That leaves only 2 rows in the index B, so adding a new one will need no displacement. The resulting table contents will be

id=1
id=3
id=2

However here the order of index types is important. If the table were to be defined as

TableType
+-IndexType "B" FifoIndex limit=3
+-IndexType "A" HashIndex key=(id)

then the replacement policy of the index type B would run first, find that nothing has been displaced yet, and displace the row id=1. After that the replacement policy of the index type A will run, and being a hashed index, it doesn't have a choice, it has to replace the row id=2. And both rows end up displaced.

If situations with the automatic replacement of rows by the keyed indexes can arise, always make sure to put the keyed leaf index types before the FIFO leaf index types. However if you always diligently send a DELETE before the INSERT of the new version of the record, this problem won't occur and the order of the index types will not matter.
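In Triceps code, the safe ordering from this example would be defined like this (a sketch, with $rt assumed to be the row type containing the field id):

```perl
# the keyed index type goes first, so that the FIFO replacement
# policy can take into account what the hashed index has displaced
my $tt = Triceps::TableType->new($rt)
    ->addSubIndex("A", Triceps::IndexType->newHashed(key => [ "id" ]))
    ->addSubIndex("B", Triceps::IndexType->newFifo(limit => 3))
;
```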

9.11. Table and index type introspection

A lot of information about a table type and the index types in it can be read back from them.

$result = $tabType->isInitialized();
$result = $idxType->isInitialized();

return whether a table or index type has been initialized. The index type gets initialized when the table type where it belongs gets initialized. After a table or index type has been initialized, it can not be changed any more, and any methods that change it will return an error. When an index type becomes initialized, it becomes tied to a particular table type. This table type can be read with:

$tabType = $idxType->getTabtype();
$tabType = $idxType->getTabtypeSafe();

The difference between these methods is what happens if the index type was not set into a table type yet. getTabtype() would confess while getTabtypeSafe() would return an undef. Which method to use, depends on the circumstances: if this situation is valid and you're ready to check for it and handle it, use getTabtypeSafe(), otherwise use getTabtype().

Even though an initialized index type can't be tied to another table type, when you add it to another table type or index type, a deep copy of it with all its sub-indexes will be made automatically, and that copy will be uninitialized. So it will be able to get initialized and tied to the new table type. However if you want to add more sub-indexes to it, make a manual copy first:

$idxTypeCopy = $idxType->copy();

The information about the nested indexes can be found with:

$itSub = $tabType->findSubIndex("indexName");
$itSub = $tabType->findSubIndexSafe("indexName");
@itSubs = $tabType->getSubIndexes();

$itSub = $idxType->findSubIndex("indexName");
$itSub = $idxType->findSubIndexSafe("indexName");
@itSubs = $idxType->getSubIndexes();

The findSubIndex() has already been shown in Section 9.7: “Secondary indexes” . It finds the index types on the next level of nesting, starting down from the table type, and going recursively into the sub-indexes. The Safe versions return undef if the index is not found, instead of confessing. getSubIndexes() returns the information about all the index types of the next level at once, as name => value pairs. The result array can be placed into a hash but that would lose the order of the sub-indexes, and the order is important for the logic.
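For example, the sub-indexes can be walked in their proper order like this (a sketch):

```perl
my @subs = $tabType->getSubIndexes(); # (name1 => it1, name2 => it2, ...)
while (my ($name, $it) = splice(@subs, 0, 2)) {
    # ... examine the index type $it under the name $name ...
}
```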

This finds the index types step by step. An easier way to find an index type in a table type by the path of the index is with

$idxType = $tabType->findIndexPath(\@idxNames);

The arguments in the array form a path of names in the index type tree. If the path is not found, the function would confess. An empty path is also illegal and would cause the same result. Yes, the argument is not an array but a reference to an array. This array is used essentially as a path object. For example the index from Section 9.7: “Secondary indexes” could be found as:

$itLast2 = $ttWindow->findIndexPath([ "bySymbol", "last2" ]);

The key (the set of fields that uniquely identify the rows) of the index type can be found with

@keys = $it->getKey();

It can be used on any kind of index type but actually returns the data only for the Hashed index types. On the other index types it returns an empty array, though better support will become available for the Sorted and Ordered indexes in the future.

A fairly common need is to find an index by its name path, along with all the key fields that are used by all the indexes in this path. It's used for such purposes as joins, and it allows a nested index to be treated pretty much as a composition of all the indexes in its path. The method

($idxType, @keys) = $tabType->findIndexKeyPath(\@path);

solves this problem and finds by path an index type that allows the direct look-up by key fields. It requires that every index type in the path returns a non-empty array of fields in getKey(). In practice it means that every index in the path must be a Hashed index. Otherwise the method confesses. When the Sorted and maybe other index types start supporting getKey(), they will become usable with this method too.

Besides checking that each index type in the path works by keys, this method builds and returns the list of all the key fields required for a look-up in this index. Note that @keys is an actual array and not a reference to array. The return protocol of this method is a little weird: it returns an array of values, with the first value being the reference to the index type, and the rest of them the names of the key fields. If the table type were defined as

$tt = Triceps::TableType->new($rt)
  ->addSubIndex("byCcy1",
    Triceps::IndexType->newHashed(key => [ "ccy1" ])
    ->addSubIndex("byCcy12",
      Triceps::IndexType->newHashed(key => [ "ccy2" ])
    )
  )
  ->addSubIndex("byCcy2",
    Triceps::IndexType->newHashed(key => [ "ccy2" ])
    ->addSubIndex("grouping", Triceps::IndexType->newFifo())
  )
;

then $tt->findIndexKeyPath([ "byCcy1", "byCcy12" ]) would return ($ixtref, "ccy1", "ccy2"), where $ixtref is the reference to the index type. When assigned to ($ixt, @keys), $ixtref would go into $ixt, and ("ccy1", "ccy2") would go into @keys.
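Restating that example in code:

```perl
my ($ixt, @keys) = $tt->findIndexKeyPath([ "byCcy1", "byCcy12" ]);
# $ixt is the reference to the index type "byCcy12",
# @keys is ("ccy1", "ccy2")
```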

The key field names in the result go in the order they occurred in the definition, from the outermost to the innermost index. The key fields must not duplicate. It's possible to define the index types where the key fields duplicate in the path, say:

$tt = Triceps::TableType->new($rt)
  ->addSubIndex("byCcy1",
    Triceps::IndexType->newHashed(key => [ "ccy1" ])
    ->addSubIndex("byCcy12",
      Triceps::IndexType->newHashed(key => [ "ccy2", "ccy1" ])
    )
  )
;

And they would even work fine, with just a little extra overhead from duplication. But findIndexKeyPath() will refuse such indexes and confess.

Yet another way to find an index is by the keys. Think of an SQL query: having a WHERE condition, you would want to find out if there is an index on the fields in the condition, allowing the records to be found quickly. Triceps is not quite up to this level of automatic query planning yet but it does some of it for the joins. If you know by which fields you want to join, it's nice to find the correct index automatically. The finding of an index by key is done with the method:

@idxPath = $tableType->findIndexPathForKeys(@keyFields);

It returns the array that represents the path to an index type that matches these key fields. And then having the path you can find the index type as such. The index type and all the types in the path still have to be of the Hashed variety. If the correct index cannot be found, an empty array is returned. If you specify fields that aren't present in the row type in the first place, this is simply treated the same as being unable to find an index for these fields. If more than one index would match, the first one found in the direct order of the index tree walk is returned.
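The two path-related methods combine naturally (a sketch):

```perl
# find an index suitable for a look-up by these key fields
my @path = $tableType->findIndexPathForKeys("ccy1", "ccy2");
if (@path) {
    my $ixt = $tableType->findIndexPath(\@path);
    # ... use $ixt for the look-ups ...
} else {
    # no suitable index in this table type
}
```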

The kind of the index type is also known as the type id. It can be found for an index type with

$id = $idxType->getIndexId();

It's an integer constant, matching one of the values:

  • &Triceps::IT_HASHED
  • &Triceps::IT_FIFO
  • &Triceps::IT_SORTED

There is no different id for the ordered index, because it's built on top of the sorted index, and would return &Triceps::IT_SORTED.

The conversion between the strings and constants for index type ids is done with

$intId = &Triceps::stringIndexId($stringId);
$stringId = &Triceps::indexIdString($intId);

If an invalid value is supplied, the conversion functions will return undef.

There is also a way to find the first index type of a particular kind. It's called somewhat confusingly

$itSub = $idxType->findSubIndexById($indexTypeId);

where $indexTypeId is either one of the Triceps constants or the matching strings "IT_HASHED", "IT_FIFO", "IT_SORTED".

Technically, there is also IT_ROOT but it's of little use for this situation since it's the root of the index type tree hidden inside the table type, and would never be a sub-index type. It's possible to iterate through all the possible index type ids as

for ($i = 0; $i < &Triceps::IT_LAST; $i++) { ... }

The first leaf sub-index type, which is the default one for iteration, can be found explicitly as

$itSub = $tabType->getFirstLeaf();
$itSub = $idxType->getFirstLeaf();

If an index is already a leaf, getFirstLeaf() on it will return itself. The leaf-ness of an index type can be found with:

$result = $idxType->isLeaf();

The usual reference comparison methods are:

$result = $tabType1->same($tabType2);
$result = $tabType1->equals($tabType2);
$result = $tabType1->match($tabType2);

$result = $idxType1->same($idxType2);
$result = $idxType1->equals($idxType2);
$result = $idxType1->match($idxType2);

Two table types are considered equal when they have the equal row types, and exactly the same set of index types, with the same names.

Two table types are considered matching when they have the matching row types, and matching set of index types, although the names of the index types may be different.

Two index types are considered equal when they are of the same kind (type id), their type-specific parameters are equal, they have the same number of sub-indexes, with the same names, and equal pair-wise. They must also have the equal aggregators, which will be described in detail in Chapter 11: “Aggregation” .

Two index types are considered matching when they are of the same kind, have matching type-specific parameters, they have the same number of sub-indexes, which are matching pair-wise, and the matching aggregators. The names of the sub-indexes may differ. As far as the type-specific parameters are concerned, it depends on the kind of the index type. The FIFO type considers any parameters matching. For a Hashed index the key fields must be the same. For a Sorted index the sorted condition must also be the same, and by extension this means the same condition for the Ordered index.

9.12. The copy tray

The table methods insert(), remove() and deleteRow() have an extra optional argument: the copy tray.

If used, it will put a copy of all the rowops produced during the operation (including the output of the aggregators, which will be described in Chapter 11: “Aggregation” ) into that tray. The idea here is to use it in cases when you don't want to connect the output labels of the table directly, but instead collect and process the rows from the tray manually afterwards. Like this:

$ctr = $unit->makeTray();
$table->insert($row, $ctr);
foreach my $rop ($ctr->toArray()) {
  ...
}

However in reality it didn't work out so well. The processing loop would have to have the lengthy if-else sequences to branch first by the label (if there are any aggregators) and then by the opcode. It looks too cumbersome. Well, it could work in the simple situations but not for anything more than that.
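For reference, that branching would look approximately like this (a sketch; the label name "t.out" is made up for the illustration):

```perl
foreach my $rop ($ctr->toArray()) {
    my $lbname = $rop->getLabel()->getName();
    if ($lbname eq "t.out") { # the table's output label
        if ($rop->isInsert()) {
            # ... handle the inserted row ...
        } elsif ($rop->isDelete()) {
            # ... handle the deleted row ...
        }
    } else {
        # ... a similar branching for each aggregator's label ...
    }
}
```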

In the future this feature will likely be deprecated unless it proves itself useful, and I already have a better idea. Because of this, I see no point in going into the more extended examples.

9.13. Table wrap-up

Not all of the table's features have been shown yet. The table class is the cornerstone of Triceps, and everything is connected to it. The aggregators work with the tables and are a whole separate big subject with their own Chapter 11: “Aggregation” . The features that take advantage of the streaming functions are described in Section 15.7: “Streaming functions and tables” . There also are many more options and small methods that haven't been touched upon yet. They are enumerated in the reference chapter, please refer there.

Chapter 10. Templates

10.1. Comparative modularity

The templates are the Triceps term for the reusable program modules. I've adopted the term from C++ because that was my inspiration for flexibility, but the Triceps templates are much more flexible yet. The problem with the C++ templates is that you have to write them like in a functional language, substituting the loops with recursion and the branching with perverse nested calls, and the result is quite hard to diagnose. Triceps uses Perl's compilation on the fly to make things easier and more powerful.

Triceps is not unique in its desire for modularity. The other CEP systems have it too, but they tend to make it even more rigid than the C++ templates. Let me show it on a simple example.

Coral8 doesn't provide a way to query the windows directly, especially when the CCL is compiled without debugging. So you're expected to make your own. People at a company where I've worked have developed a nice pattern that goes approximately like this:

// some window that we want to make queryable
create window w_my schema s_my
keep last per key_a per key_b
keep 1 week;

// the stream to send the query requests
// (the schema can be shared by all simple queries)
create schema s_query (
  qqq_id string // unique id of the query
);
create input stream query_my schema s_query;

// the stream to return the results
// (all result streams will inherit a partial schema)
create schema s_result (
  qqq_id string, // returns back the id received in the query
  qqq_end boolean, // will be TRUE in the special end indicator record
);
create output stream result_my schema inherits from s_result, s_my;

// now process the query
insert into result_my
select q.qqq_id, NULL, w.*
from s_query as q, w_my as w;

// the end marker
insert into result_my (qqq_id, qqq_end)
select qqq_id, TRUE
from s_query;

To query the window, a program would select a unique query id, subscribe to result_my with a filter (qqq_id = unique_id) and send a record of (unique_id) into query_my. Then it would sit and collect the result rows. Finally it would get a row with qqq_end = TRUE and disconnect.

This is a fairly large amount of code to be repeated for every window. What I would like to do instead is to just write:

create window w_my schema s_my
keep last per key_a per key_b
keep 1 week;

make_queryable(w_my);

and have the template make_queryable expand into the rest of the code (obviously, the schema definitions would not need to be expanded repeatedly, they would go into an include file).

To make things more interesting, it would be nice to have the query filter the results by some field values. Nothing as fancy as SQL, just by equality to some fields. Suppose, s_my includes the fields field_c and field_d, and we want to be able to filter by them. Then the query can be done as:

create input stream query_my schema inherits from s_query (
  field_c integer,
  field_d string
);

// result_my is the same as before...

// query with filtering (in a rather inefficient way)
insert into result_my
select q.qqq_id, NULL, w.*
from s_query as q, w_my as w
where
  (q.field_c is null or q.field_c = w.field_c)
  and (q.field_d is null or q.field_d = w.field_d);

// the end marker is as before
insert into result_my (qqq_id, qqq_end)
select qqq_id, TRUE
from s_query;

It would be nice then to create this kind of query as a template instantiation:

make_query(w_my, (field_c, field_d));

Or even better, have the template determine the non-NULL fields in the query record and compile the right query on the fly.

But neither the Coral8 modules nor the later Sybase CEP R5 modules are flexible enough to do any of this. A CCL module requires a fixed schema for all its interfaces. The StreamBase language is more flexible: it achieves a part of this goal through the capture fields, where the logically unimportant fields are carried through the module as one combined payload field. But it doesn't allow the variable lists of fields as parameters either, nor the generation of different model topologies depending on the parameters.

10.2. Template variety

A template in Triceps is generally a function or class that creates a fragment of the model based on its arguments. It provides the access points used to connect this fragment to the rest of the model.

There are different ways to do this. They can be broadly classified, in the order of increasing complexity, as:

  • A function that creates a single Triceps object and returns it. The benefit is that the function would automatically choose some complex object parameters based on the function parameters, thus turning a complex creation into a simple one.
  • A class that similarly creates multiple fixed objects and interconnects them properly. It would also provide the accessor methods to export the access points of this sub-model. Since the Perl functions may return multiple values, this functionality sometimes can be conveniently done with a function as well, returning the access points in the return array.
  • A class or function that creates multiple objects, with their number and connections dependent on the parameters. For a simple example, a template might receive multiple functions/closures as arguments and then create a pipeline of computational labels, each of them computing one function (of course, this really makes sense only when each label runs in a separate thread).
  • A class or function that automatically generates the Perl code that will be used in the created objects. For a simple example, given the pairs of field names and values, a template can generate the code for a filter label that would pass only the rows where these fields have these values. The same effect can often be achieved by the interpretation as well: keep the arguments until the evaluation needs to be done, and then interpret them. But the early code generation with compilation improves the efficiency of the computation. It's the same idea as in the C++ templates: do more of the hard work at the compile time and then run faster.
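
The last approach, the early code generation, can be sketched in plain Perl even without Triceps: build the source code of the filter function as a string, then compile it with the string eval. Here make_filter() is a made-up illustration of the idea, not a part of Triceps:

```perl
use strict;
use warnings;

# Generate and compile a function that filters a hash-based "row"
# by equality of the given fields. The comparisons are baked right
# into the compiled code, with no per-row interpretation overhead.
sub make_filter # (%fieldValues)
{
  my %fv = @_;
  my $src = "sub {\n  my \$row = shift;\n";
  foreach my $f (sort keys %fv) {
    # quotemeta keeps the names and values from breaking
    # out of the generated string literals
    $src .= "  return 0 unless \$row->{'" . quotemeta($f)
      . "'} eq '" . quotemeta($fv{$f}) . "';\n";
  }
  $src .= "  return 1;\n}";
  my $code = eval $src
    or die "filter compilation error: $@\nsource:\n$src\n";
  return $code;
}

my $flt = make_filter(symbol => "AAA", size => "20");
print $flt->({symbol => "AAA", size => "20", id => 1}), "\n"; # prints "1"
print $flt->({symbol => "BBB", size => "20", id => 2}), "\n"; # prints "0"
```

The same principle, only with the Triceps row objects instead of the plain hashes, will be shown later in this chapter.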

The more complex and flexible the template, the more difficult it generally is to write and debug, but then it just works, encapsulating a complex problem behind a simpler interface. There is also the problem of user errors: when the user gives an incorrect argument to a complex template, understanding what exactly went wrong when the error manifests itself may be quite difficult. The C++ templates are a good example of this. However the use of Perl, a general programming language, as the template language in Triceps provides a good solution for this problem: just check the arguments early in the template and produce the meaningful error messages. It may be a bit cumbersome to write but then it's easy to use. I also have plans for improving the automatic error reports, to make tracking through the layers of templates easier with minimal code additions in the templates.

I will show examples of all these template types by implementing the table querying, the same as I have shown in CCL in Section 10.1: “Comparative modularity” , only now in Triceps.

10.3. Simple wrapper templates

The SimpleServer package described in Section 7.9: “Main loop with a socket” contains templates for the repeating tasks. makeExitLabel() creates a label that will request the server to exit, makeServerOutLabel() creates a label that will send the rows from some label back into the socket.

Rather than copying the code here again, please refer to the description in Section 7.9: “Main loop with a socket” .

Another similar template that is used throughout the following chapters creates a label that prints the rowop contents. It's located in the package that wraps the input (e.g. feeding) and output of the tests:

package Triceps::X::TestFeed;

# a template to make a label that prints the data passing through another label
sub makePrintLabel($$) # ($print_label_name, $parent_label)
{
  my $name = shift;
  my $lbParent = shift;
  my $lb = $lbParent->getUnit()->makeLabel($lbParent->getType(), $name,
    undef, sub { # (label, rowop)
      print($_[1]->printP(), "\n");
    });
  $lbParent->chain($lb);
  return $lb;
}

It works very much the same as makeServerOutLabel(), only prints to a different destination.

10.4. Templates of interconnected components

Let's move on to the query template. It will work a little differently than the CCL version. First, the socket main loop allows sending the response directly to the same client who issued the request. So there is no need to add the request id field to the response, nor for the client to filter by it. Second, the Triceps rowops carry the opcode, which can be used to signal the end of the response. For example, the data rows can be sent with the opcode INSERT, and the end of the response can be indicated by a row with the opcode NOP and all fields NULL. The query template can then be made as follows:

package Query1;

sub new # ($class, $table, $name)
{
  my $class = shift;
  my $table = shift;
  my $name = shift;

  my $unit = $table->getUnit();
  my $rt = $table->getRowType();

  my $self = {};
  $self->{unit} = $unit;
  $self->{name} = $name;
  $self->{table} = $table;
  $self->{inLabel} = $unit->makeLabel($rt, $name . ".in", undef, sub {
    # This version ignores the row contents, just dumps the table.
    my ($label, $rop, $self) = @_;
    my $rh = $self->{table}->begin();
    for (; !$rh->isNull(); $rh = $rh->next()) {
      $self->{unit}->call(
        $self->{outLabel}->makeRowop("OP_INSERT", $rh->getRow()));
    }
    # The end is signaled by OP_NOP with empty fields.
    $self->{unit}->makeArrayCall($self->{outLabel}, "OP_NOP");
  }, $self);
  $self->{outLabel} = $unit->makeDummyLabel($rt, $name . ".out");

  bless $self, $class;
  return $self;
}

sub getInputLabel # ($self)
{
  my $self = shift;
  return $self->{inLabel};
}

sub getOutputLabel # ($self)
{
  my $self = shift;
  return $self->{outLabel};
}

sub getName # ($self)
{
  my $self = shift;
  return $self->{name};
}

It creates the input label that does the work, and the dummy output label that is used to send the result. The logic is easy: whenever a rowop is received on the input label, iterate through the table and send its contents to the output label. The contents of the received rowop doesn't even matter. The getter methods provide the access to the endpoints.

Now this example can be used in a program. Most of it is the example infrastructure: the function to start the server in background and connect a client to it, the creation of the row type and table type to query, and then finally near the end the interesting part: the usage of the query template. The general running is enclosed in the package Triceps::X::DumbClient:

package Triceps::X::DumbClient;

sub run # ($labels)
{
  my $labels = shift;

  my ($port, $pid) = Triceps::X::SimpleServer::startServer(0, $labels);
  my $sock = IO::Socket::INET->new(
    Proto => "tcp",
    PeerAddr => "localhost",
    PeerPort => $port,
  ) or confess "socket failed: $!";
  while(&readLine) {
    $sock->print($_);
    $sock->flush();
  }
  $sock->print("exit,OP_INSERT\n");
  $sock->flush();
  $sock->shutdown(1); # SHUT_WR
  while(<$sock>) {
    &send($_);
  }
  waitpid($pid, 0);
}

The function run() takes care of making the example easier to run: it starts the server in the background, reads the input data and sends it to the server, then reads the responses and prints them back, and finally waits for the server process to exit. It also takes care of sending the exit request to the server when the input reaches EOF. The approach of first sending all the data there and then reading all the responses back is not very good. It works only if either the data gets sent without any responses, or a small amount of data (not enough to overflow the TCP buffers along the way) gets sent and then all the responses come back. But it's simple, and it works well enough for the small examples. And actually many of the commercial CEP interfaces work exactly like this: they either publish the data to the model or send a small subscription request and print the data received from the subscription.

Then the actual example makes use of this function:

# The basic table type to be used as template argument.
our $rtTrade = Triceps::RowType->new(
  id => "int32", # trade unique id
  symbol => "string", # symbol traded
  price => "float64",
  size => "float64", # number of shares traded
);

our $ttWindow = Triceps::TableType->new($rtTrade)
  ->addSubIndex("bySymbol",
    Triceps::SimpleOrderedIndex->new(symbol => "ASC")
      ->addSubIndex("last2",
        Triceps::IndexType->newFifo(limit => 2)
      )
  )
;
$ttWindow->initialize();

my $uTrades = Triceps::Unit->new("uTrades");
my $tWindow = $uTrades->makeTable($ttWindow, "tWindow");
my $query = Query1->new($tWindow, "qWindow");
my $srvout = &Triceps::X::SimpleServer::makeServerOutLabel($query->getOutputLabel());

my %dispatch;
$dispatch{$tWindow->getName()} = $tWindow->getInputLabel();
$dispatch{$query->getName()} = $query->getInputLabel();
$dispatch{"exit"} = &Triceps::X::SimpleServer::makeExitLabel($uTrades, "exit");

Triceps::X::DumbClient::run(\%dispatch);

The row type and table type have been simply copied from another example. There is no particular meaning to why these fields were selected or why the table has these indexes, they have been chosen semi-randomly. The only tricky thing that affects the result is that this table implements a window with a limit of 2 rows per symbol.

After the table is created, the template instantiation is a single call, Query1->new(). Then the output label of the query template gets connected to a label that sends the output back to the client, and that's it.

Here is an example of a run, with the input rows printed as always in bold.

tWindow,OP_INSERT,1,AAA,10,10
tWindow,OP_INSERT,3,AAA,20,20
qWindow,OP_INSERT
tWindow,OP_INSERT,5,AAA,30,30
qWindow,OP_INSERT
qWindow.out,OP_INSERT,1,AAA,10,10
qWindow.out,OP_INSERT,3,AAA,20,20
qWindow.out,OP_NOP,,,,
qWindow.out,OP_INSERT,3,AAA,20,20
qWindow.out,OP_INSERT,5,AAA,30,30
qWindow.out,OP_NOP,,,,

Because of the way run() works, all the input rows are printed before the output ones. If it were smarter and knew when to expect the responses before sending more input, the output would have been:

tWindow,OP_INSERT,1,AAA,10,10
tWindow,OP_INSERT,3,AAA,20,20
qWindow,OP_INSERT
qWindow.out,OP_INSERT,1,AAA,10,10
qWindow.out,OP_INSERT,3,AAA,20,20
qWindow.out,OP_NOP,,,,
tWindow,OP_INSERT,5,AAA,30,30
qWindow,OP_INSERT
qWindow.out,OP_INSERT,3,AAA,20,20
qWindow.out,OP_INSERT,5,AAA,30,30
qWindow.out,OP_NOP,,,,

Two rows get inserted into the table, then a query is done, then one more row is inserted, then another query is sent. When the third row is inserted, the first row gets pushed out by the window limit, so the second query also returns two rows, albeit different ones than the first query.
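
The window semantics seen here, keeping the last 2 rows per symbol, can be modeled in plain Perl. This is only a sketch of the behavior, with none of the real table's machinery:

```perl
use strict;
use warnings;

my %window; # maps a symbol to the array of its rows, newest last
my $limit = 2;

sub window_insert # ($row)
{
  my $row = shift;
  my $q = $window{$row->{symbol}} ||= [];
  push @$q, $row;
  # the FIFO index pushes out the oldest row over the limit
  shift @$q if (@$q > $limit);
}

window_insert({id => 1, symbol => "AAA", price => 10});
window_insert({id => 3, symbol => "AAA", price => 20});
window_insert({id => 5, symbol => "AAA", price => 30});
# the row with id 1 has been pushed out by the limit
print join(",", map { $_->{id} } @{$window{AAA}}), "\n"; # prints "3,5"
```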

It is possible to fold the table and the client send label creation into the template as well. It will then be used as follows:

my $window = $uTrades->makeTableQuery2($ttWindow, "window");

my %dispatch;
$dispatch{$window->getName()} = $window->getInputLabel();
$dispatch{$window->getQueryLabel()->getName()} = $window->getQueryLabel();
$dispatch{"exit"} = &ServerHelpers::makeExitLabel($uTrades, "exit");

The rest of the infrastructure would stay unchanged. Just to show how it can be done, I've even added a factory method Unit::makeTableQuery2(). The implementation of this template is:

package TableQuery2;
use Carp;

sub CLONE_SKIP { 1; }

sub new # ($class, $unit, $tabType, $name)
{
  my $class = shift;
  my $unit = shift;
  my $tabType = shift;
  my $name = shift;

  my $table = $unit->makeTable($tabType, $name);
  my $rt = $table->getRowType();

  my $self = {};
  $self->{unit} = $unit;
  $self->{name} = $name;
  $self->{table} = $table;
  $self->{qLabel} = $unit->makeLabel($rt, $name . ".query", undef, sub {
    # This version ignores the row contents, just dumps the table.
    my ($label, $rop, $self) = @_;
    my $rh = $self->{table}->begin();
    for (; !$rh->isNull(); $rh = $rh->next()) {
      $self->{unit}->call(
        $self->{resLabel}->makeRowop("OP_INSERT", $rh->getRow()));
    }
    # The end is signaled by OP_NOP with empty fields.
    $self->{unit}->makeArrayCall($self->{resLabel}, "OP_NOP");
  }, $self);
  $self->{resLabel} = $unit->makeDummyLabel($rt, $name . ".response");

  $self->{sendLabel} = &Triceps::X::SimpleServer::makeServerOutLabel($self->{resLabel});

  bless $self, $class;
  return $self;
}

sub getName # ($self)
{
  my $self = shift;
  return $self->{name};
}

sub getQueryLabel # ($self)
{
  my $self = shift;
  return $self->{qLabel};
}

sub getResponseLabel # ($self)
{
  my $self = shift;
  return $self->{resLabel};
}

sub getSendLabel # ($self)
{
  my $self = shift;
  return $self->{sendLabel};
}

sub getTable # ($self)
{
  my $self = shift;
  return $self->{table};
}

sub getInputLabel # ($self)
{
  my $self = shift;
  return $self->{table}->getInputLabel();
}

sub getOutputLabel # ($self)
{
  my $self = shift;
  return $self->{table}->getOutputLabel();
}

sub getPreLabel # ($self)
{
  my $self = shift;
  return $self->{table}->getPreLabel();
}

# add a factory to the Unit type
package Triceps::Unit;

sub makeTableQuery2 # ($self, $tabType, $name)
{
  return TableQuery2->new(@_);
}

The meat of the logic stays the same. The creation of the table and of the client sending label are added around it, as well as a bunch of getter methods to get access to the components.

The output of this example is the same, with the only difference that it expects and sends different label names:

window,OP_INSERT,1,AAA,10,10
window,OP_INSERT,3,AAA,20,20
window.query,OP_INSERT
window,OP_INSERT,5,AAA,30,30
window.query,OP_INSERT
window.response,OP_INSERT,1,AAA,10,10
window.response,OP_INSERT,3,AAA,20,20
window.response,OP_NOP,,,,
window.response,OP_INSERT,3,AAA,20,20
window.response,OP_INSERT,5,AAA,30,30
window.response,OP_NOP,,,,

10.5. Template options

Often the arguments of the template constructor become more convenient to organize as the option name-value pairs. This is particularly useful when there are many arguments and/or when some of them are really optional. For our little query template this is not the case but it can be written with options nevertheless (a modification of the original version, without the table in it):

package Query3;

sub new # ($class, $optionName => $optionValue ...)
{
  my $class = shift;
  my $self = {};

  &Triceps::Opt::parse($class, $self, {
    name => [ undef, \&Triceps::Opt::ck_mandatory ],
    table => [ undef, sub { &Triceps::Opt::ck_mandatory(@_); &Triceps::Opt::ck_ref(@_, "Triceps::Table") } ],
  }, @_);

  my $name = $self->{name};

  my $table = $self->{table};
  my $unit = $table->getUnit();
  my $rt = $table->getRowType();

  $self->{unit} = $unit;
  $self->{name} = $name;
  $self->{inLabel} = $unit->makeLabel($rt, $name . ".in", undef, sub {
    # This version ignores the row contents, just dumps the table.
    my ($label, $rop, $self) = @_;
    my $rh = $self->{table}->begin();
    for (; !$rh->isNull(); $rh = $rh->next()) {
      $self->{unit}->call(
        $self->{outLabel}->makeRowop("OP_INSERT", $rh->getRow()));
    }
    # The end is signaled by OP_NOP with empty fields.
    $self->{unit}->makeArrayCall($self->{outLabel}, "OP_NOP");
  }, $self);
  $self->{outLabel} = $unit->makeDummyLabel($rt, $name . ".out");

  bless $self, $class;
  return $self;
}

The getter methods stayed the same, so I've skipped them here. The call has changed:

my $query = Query3->new(table => $tWindow, name => "qWindow");

The output stayed the same.

The class Triceps::Opt is used to parse the arguments formatted as options. There is actually a similar option parser on CPAN but it didn't do everything I wanted, and considering how tiny it is, it was easier to write a new one from scratch than to extend that one. I also like to avoid the extra dependencies.

The heart of it is the method Triceps::Opt::parse(). It's normally called from a class constructor to parse the constructor's options, but can be called from the other functions as well. It does the following:

  • Checks that all the options are known.
  • Checks that the values are acceptable.
  • Copies the values into the instance hash of the calling class.
  • Provides the default values for the unspecified options.

If anything goes wrong, it confesses with a reasonable message. The arguments tell the class name for the messages (since, remember, it is normally called from the class constructor), the reference to the object instance hash where to copy the options, the descriptions of the supported options, and the actual key-value pairs.

At the end of it, if all went well, the query's $self will have the values at keys name and table.
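
The logic of such a parser is not complicated. Here is a stripped-down plain-Perl sketch of what a function like Triceps::Opt::parse() amounts to; my_parse() is a made-up name, much simplified from the real implementation:

```perl
use strict;
use warnings;
use Carp;

# A stripped-down option parser: $optdef maps each option name
# to [$default, $checkFunc], with either element possibly undef.
sub my_parse # ($class, $instance, $optdef, @opts)
{
  my ($class, $instance, $optdef, @opts) = @_;
  # start with the defaults
  while (my ($name, $def) = each %$optdef) {
    $instance->{$name} = $def->[0];
  }
  # overwrite with the specified values, checking that they are known
  while (@opts) {
    my $name = shift @opts;
    my $value = shift @opts;
    confess "$class: unknown option '$name'"
      unless exists $optdef->{$name};
    $instance->{$name} = $value;
  }
  # run the checking functions on the final values
  while (my ($name, $def) = each %$optdef) {
    $def->[1]->($instance->{$name}, $name, $class, $instance)
      if defined $def->[1];
  }
}

my $self = {};
my_parse("Query", $self, {
  name => [ undef, sub { confess "option '$_[1]' must be defined" unless defined $_[0]; } ],
  limit => [ 10, undef ],
}, name => "qWindow");
# now $self->{name} is "qWindow" and $self->{limit} is 10
```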

The option descriptions go in pairs of an option name and an array reference with the description. The array contains the default value and the checking function, either of which may be undef. The checking function simply returns if everything went fine, or confesses on any errors. To be able to die with a proper message, it gets not only the value to check but more, altogether:

  • The value to check.
  • The name of the option.
  • The name of the class, for error messages.
  • The object instance ($self), just in case.

If you want to do multiple checks, just make a closure and call all the checks in sequence, passing @_ to them all, as shown here for the option table. If more arguments need to be passed to the checking function, just add them after @_ (or, if you prefer, before it, if you write your checking function that way).

You can create any checking functions, but a few ready ones are provided:

  • Triceps::Opt::ck_mandatory checks that the value is defined.
  • Triceps::Opt::ck_ref checks that the value is a reference to a particular class, or a class derived from it. Just give the class name as the extra argument. Or, to check that the reference is to array or hash, make the argument "ARRAY" or "HASH". Or an empty string "" to check that it's not a reference at all. For the arrays and hashes it can also check the values contained in them for being references to the correct types: give that type as the second extra argument. But it doesn't go deeper than that, just one nesting level. It might be extended later, but for now one nesting level has been enough.
  • Triceps::Opt::ck_refscalar checks that the value is a reference to a scalar. This is designed to check the arguments which are used to return data back to the caller, and it would accept any previous value in that scalar: an actual scalar value, an undef or a reference, since it's about to be overwritten anyway.

The ck_ref() and ck_refscalar() allow the value to be undefined, so they can safely be used on the truly optional options. When I come up with more useful check functions, I'll add them.
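
The shape of such a checking function can be seen in a simplified plain-Perl sketch. Here my_ck_ref() is a made-up stand-in for ck_ref(), without the nested-type checking:

```perl
use strict;
use warnings;
use Carp;
use Scalar::Util;

# Check that the value is undef or a reference of the given kind:
# a class name (or its subclass), "ARRAY", "HASH", or "" for
# a plain non-reference value.
sub my_ck_ref # ($val, $optName, $class, $instance, $refKind)
{
  my ($val, $optName, $class, $instance, $refKind) = @_;
  return unless defined $val; # the undefined values are accepted
  my $r = ref $val;
  if ($refKind eq "" || $refKind eq "ARRAY" || $refKind eq "HASH") {
    confess "${class}: option '$optName' must be a '$refKind' reference, got '$r'"
      unless $r eq $refKind;
  } else {
    # a blessed object of the given class or its subclass
    confess "${class}: option '$optName' must be a reference to '$refKind', got '$r'"
      unless Scalar::Util::blessed($val) && $val->isa($refKind);
  }
}

my_ck_ref([1, 2], "fields", "Query", {}, "ARRAY"); # passes quietly
```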

Triceps::Opt provides more helper functions to deal with the options after they have been parsed. One of them is handleUnitTypeLabel() that handles a very specific but frequently occurring case. Depending on the usage, sometimes it's more convenient to give the template the input row type and unit, and later chain its input to another label; and sometimes it's more convenient to give it another ready label, have the template find out the row type and unit from that label, and chain its input to it automatically, like ServerHelpers::makeServerOutLabel() was shown doing in Section 10.3: “Simple wrapper templates” . This becomes possible if the unit, row type and source label are made the optional options.

Triceps::Opt::handleUnitTypeLabel() takes care of sorting out what information is available, that enough of it is available, that exactly one of row type or source label options is specified, and fills in the unit and row type values from the source label (specifying the unit option along with the source label is OK as long as the unit is the same). To show it off, I re-wrote the ServerHelpers::makeServerOutLabel() as a class with options:

package ServerOutput;
use Carp;

sub CLONE_SKIP { 1; }

# Sending of rows to the server output.
sub new # ($class, $option => $value, ...)
{
  no warnings;

  my $class = shift;
  my $self = {};

  &Triceps::Opt::parse($class, $self, {
    name => [ undef, undef ],
    unit => [ undef, sub { &Triceps::Opt::ck_ref(@_, "Triceps::Unit") } ],
    rowType => [ undef, sub { &Triceps::Opt::ck_ref(@_, "Triceps::RowType") } ],
    fromLabel => [ undef, sub { &Triceps::Opt::ck_ref(@_, "Triceps::Label") } ],
  }, @_);

  &Triceps::Opt::handleUnitTypeLabel("${class}::new",
    unit => \$self->{unit},
    rowType => \$self->{rowType},
    fromLabel => \$self->{fromLabel}
  );
  my $fromLabel = $self->{fromLabel};

  if (!defined $self->{name}) {
    confess "${class}::new: must specify at least one of the options name and fromLabel"
      unless (defined $self->{fromLabel});
    $self->{name} = $fromLabel->getName() . ".serverOut";
  }

  my $lb = $self->{unit}->makeLabel($self->{rowType},
    $self->{name}, undef, sub {
      &Triceps::X::SimpleServer::outCurBuf(join(",",
        $fromLabel? $fromLabel->getName() : $self->{name},
        &Triceps::opcodeString($_[1]->getOpcode()),
        $_[1]->getRow()->toArray()) . "\n");
    }, $self # $self is not used in the function but used for cleaning
  );
  $self->{inLabel} = $lb;
  if (defined $fromLabel) {
    $fromLabel->chain($lb);
  }

  bless $self, $class;
  return $self;
}

sub getInputLabel() # ($self)
{
  my $self = shift;
  return $self->{inLabel};
}

The arguments to Triceps::Opt::handleUnitTypeLabel() are the caller function name for the error messages, and the pairs of option name and reference to the option value for the unit, row type and the source label.

The new class also has the optional option name. If it's not specified and fromLabel is specified, the name is generated by appending a suffix to the name of the source label. The new class can be used in one of two ways, either

my $srvout = ServerOutput->new(fromLabel => $query->getOutputLabel());

or

my $srvout = ServerOutput->new(
  name => "out",
  unit => $uTrades,
  rowType => $tWindow->getRowType(),
);
$query->getOutputLabel()->chain($srvout->getInputLabel());

The second form comes handy if you want to create it before creating the query.

The other helper function is Triceps::Opt::checkMutuallyExclusive(). It checks that no more than one option from the list is specified. The joins use it to allow multiple ways of specifying the join condition. For now I'll show a somewhat contrived example, rewriting the last example of ServerOutput with it:

package ServerOutput2;
use Carp;

sub CLONE_SKIP { 1; }

# Sending of rows to the server output.
sub new # ($class, $option => $value, ...)
{
  no warnings;

  my $class = shift;
  my $self = {};

  &Triceps::Opt::parse($class, $self, {
    name => [ undef, undef ],
    unit => [ undef, sub { &Triceps::Opt::ck_mandatory; &Triceps::Opt::ck_ref(@_, "Triceps::Unit") } ],
    rowType => [ undef, sub { &Triceps::Opt::ck_ref(@_, "Triceps::RowType") } ],
    fromLabel => [ undef, sub { &Triceps::Opt::ck_ref(@_, "Triceps::Label") } ],
  }, @_);

  my $fromLabel = $self->{fromLabel};
  if (&Triceps::Opt::checkMutuallyExclusive("${class}::new", 1,
      rowType => $self->{rowType},
      fromLabel => $self->{fromLabel}
    ) eq "fromLabel"
  ) {
    $self->{rowType} = $fromLabel->getRowType();
  }

  if (!defined $self->{name}) {
    confess "${class}::new: must specify at least one of the options name and fromLabel"
      unless (defined $self->{fromLabel});
    $self->{name} = $fromLabel->getName() . ".serverOut";
  }

  my $lb = $self->{unit}->makeLabel($self->{rowType},
    $self->{name}, undef, sub {
      &Triceps::X::SimpleServer::outCurBuf(join(",",
        $fromLabel? $fromLabel->getName() : $self->{name},
        &Triceps::opcodeString($_[1]->getOpcode()),
        $_[1]->getRow()->toArray()) . "\n");
    }, $self # $self is not used in the function but used for cleaning
  );
  $self->{inLabel} = $lb;
  if (defined $fromLabel) {
    $fromLabel->chain($lb);
  }

  bless $self, $class;
  return $self;
}

sub getInputLabel() # ($self)
{
  my $self = shift;
  return $self->{inLabel};
}

The arguments of Triceps::Opt::checkMutuallyExclusive() are the caller name for the error messages, the flag of whether one of the mutually exclusive options must be specified, and the pairs of the option names and values (this time not references, just the values). It returns the name of the only option specified by the user, or undef if none were. If more than one option was used, or if none were used while the mandatory flag is set, the function confesses.
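
This logic can be sketched in a few lines of plain Perl; my_check_exclusive() here is a made-up simplified stand-in, not the real code:

```perl
use strict;
use warnings;
use Carp;

# A simplified stand-in for checkMutuallyExclusive(): collect the
# names of the defined values, confess on a conflict, return the name.
sub my_check_exclusive # ($caller, $mandatory, $optName => $optValue, ...)
{
  my ($caller, $mandatory, @opts) = @_;
  my @used;
  while (@opts) {
    my $name = shift @opts;
    my $value = shift @opts;
    push @used, $name if defined $value;
  }
  confess "$caller: the options " . join(", ", @used) . " are mutually exclusive"
    if (@used > 1);
  confess "$caller: one of the mutually exclusive options must be specified"
    if ($mandatory && !@used);
  return $used[0]; # undef if none were used
}

my $which = my_check_exclusive("ServerOutput2::new", 1,
  rowType => undef, fromLabel => "label object");
print $which, "\n"; # prints "fromLabel"
```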

The way this version of the code works, the option unit must be specified in any case, so the use case with the source label becomes:

my $srvout = ServerOutput2->new(
  unit => $uTrades,
  fromLabel => $query->getOutputLabel()
);

The use case with the independent creation is the same as with the previous version of the ServerOutput.

10.6. Code generation in the templates

Suppose we want to filter the result of the query by the equality to the fields in the query request row. The list of fields to filter by would be given to the query template. The query code would check that these fields are not NULL (and since the simplistic CSV parsing is not good enough to tell a NULL from an empty value, not an empty value either), and pass only the rows that match. Here we go (skipping the methods that are the same as before):

package Query4;
use Carp;

sub CLONE_SKIP { 1; }

sub new # ($class, $optionName => $optionValue ...)
{
  my $class = shift;
  my $self = {};

  &Triceps::Opt::parse($class, $self, {
    name => [ undef, \&Triceps::Opt::ck_mandatory ],
    table => [ undef, sub { &Triceps::Opt::ck_mandatory(@_); &Triceps::Opt::ck_ref(@_, "Triceps::Table") } ],
    fields => [ undef, sub { &Triceps::Opt::ck_ref(@_, "ARRAY") } ],
  }, @_);

  my $name = $self->{name};

  my $table = $self->{table};
  my $unit = $table->getUnit();
  my $rt = $table->getRowType();

  my $fields = $self->{fields};
  if (defined $fields) {
    my %rtdef = $rt->getdef();
    foreach my $f (@$fields) {
      my $t = $rtdef{$f};
      confess "${class}::new: unknown field '$f', the row type is:\n"
          . $rt->print() . " "
        unless defined $t;
    }
  }

  $self->{unit} = $unit;
  $self->{name} = $name;
  $self->{inLabel} = $unit->makeLabel($rt, $name . ".in", undef, sub {
    my ($label, $rop, $self) = @_;
    my $query = $rop->getRow();
    my $cmp = $self->{compare};
    my $rh = $self->{table}->begin();
    ITER: for (; !$rh->isNull(); $rh = $rh->next()) {
      if (defined $self->{fields}) {
        my $data = $rh->getRow();
        my %rtdef = $self->{table}->getRowType()->getdef();
        foreach my $f (@{$self->{fields}}) {
          my $v = $query->get($f);
          # Since the simplified CSV parsing in the mainLoop() provides
          # no easy way to send NULLs, consider any empty or 0 value
          # in the query row equivalent to NULLs.
          if ($v
          && (&Triceps::Fields::isStringType($rtdef{$f})
            ? $query->get($f) ne $data->get($f)
            : $query->get($f) != $data->get($f)
            )
          ) {
            next ITER;
          }
        }
      }
      $self->{unit}->call(
        $self->{outLabel}->makeRowop("OP_INSERT", $rh->getRow()));
    }
    # The end is signaled by OP_NOP with empty fields.
    $self->{unit}->makeArrayCall($self->{outLabel}, "OP_NOP");
  }, $self);
  $self->{outLabel} = $unit->makeDummyLabel($rt, $name . ".out");

  bless $self, $class;
  return $self;
}

Used as:

my $query = Query4->new(table => $tWindow, name => "qWindow",
  fields => ["symbol", "price"]);

The field names get checked up front for correctness. Then at run time the code iterates through them and does the checking. Since the comparisons have to be done differently for the string and numeric values, Triceps::Fields::isStringType() is used to check the type of each field. Triceps::Fields is a collection of functions that help with handling the fields in the templates. Another similar function is Triceps::Fields::isArrayType().
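
As far as the comparison logic is concerned, such a check boils down to looking at the simple type name. Here is a rough plain-Perl approximation of the idea; my_is_string_type() is an illustration, not the real implementation, and the exact set of types it recognizes is my guess:

```perl
use strict;
use warnings;

# An illustration of the idea behind isStringType(): the string and
# uint8 values are represented as Perl strings and must be compared
# with "eq", everything else numerically with "==". (A guess at the
# semantics, not the real code.)
sub my_is_string_type # ($simpleTypeName)
{
  my $t = shift;
  return $t =~ /^(string|uint8)/;
}

print my_is_string_type("string") ? "eq" : "==", "\n";  # prints "eq"
print my_is_string_type("float64") ? "eq" : "==", "\n"; # prints "=="
```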

If the option fields is not specified, it would work the same as before and produce the same result. For the filtering by symbol and price, a sample output is:

tWindow,OP_INSERT,1,AAA,10,10
tWindow,OP_INSERT,3,AAA,20,20
tWindow,OP_INSERT,4,BBB,20,20
qWindow,OP_INSERT
tWindow,OP_INSERT,5,AAA,30,30
qWindow,OP_INSERT,5,AAA,0,0
qWindow,OP_INSERT,0,,20,0
qWindow.out,OP_INSERT,1,AAA,10,10
qWindow.out,OP_INSERT,3,AAA,20,20
qWindow.out,OP_INSERT,4,BBB,20,20
qWindow.out,OP_NOP,,,,
qWindow.out,OP_INSERT,3,AAA,20,20
qWindow.out,OP_INSERT,5,AAA,30,30
qWindow.out,OP_NOP,,,,
qWindow.out,OP_INSERT,3,AAA,20,20
qWindow.out,OP_INSERT,4,BBB,20,20
qWindow.out,OP_NOP,,,,

The table data now has one more row of data added to it, with the symbol BBB. The first query has no values to filter in it, so it just dumps the whole table as before. The second query filters by the symbol AAA. The field for price is 0, so it gets treated as empty and excluded from the comparison. The fields for id and size are not in the fields option, so they get ignored even if the value of id is 5. The third query filters by the price equal to 20. The symbol field is empty in the query, so it does not participate in the filtering.

Looking at the query execution code, now there is a lot more going on in it. And quite a bit of it is static and could be computed once, at the time the query object is created. The next version does exactly that, building and compiling the comparator function in advance:

package Query5;
use Carp;

sub CLONE_SKIP { 1; }

sub new # ($class, $optionName => $optionValue ...)
{
  my $class = shift;
  my $self = {};

  &Triceps::Opt::parse($class, $self, {
    name => [ undef, \&Triceps::Opt::ck_mandatory ],
    table => [ undef, sub { &Triceps::Opt::ck_mandatory(@_); &Triceps::Opt::ck_ref(@_, "Triceps::Table") } ],
    fields => [ undef, sub { &Triceps::Opt::ck_ref(@_, "ARRAY") } ],
    saveCodeTo => [ undef, \&Triceps::Opt::ck_refscalar ],
  }, @_);

  my $name = $self->{name};

  my $table = $self->{table};
  my $unit = $table->getUnit();
  my $rt = $table->getRowType();

  my $fields = $self->{fields};
  if (defined $fields) {
    my %rtdef = $rt->getdef();

    # Generate the code of the comparison function by the fields.
    # Since the simplified CSV parsing in the mainLoop() provides
    # no easy way to send NULLs, consider any empty or 0 value
    # in the query row equivalent to NULLs.
    my $gencmp = '
      sub # ($query, $data)
      {
        use strict;
        my ($query, $data) = @_;
        my $v;';
    foreach my $f (@$fields) {
      my $t = $rtdef{$f};
      confess "$class::new: unknown field '$f', the row type is:\n"
          . $rt->print() . " "
        unless defined $t;
      $gencmp .= '
        $v = $query->get("' . quotemeta($f) . '");
        if ($v) {';
      if (&Triceps::Fields::isStringType($t)) {
        $gencmp .= '
          return 0 if ($v ne $data->get("' . quotemeta($f) . '"));';
      } else {
        $gencmp .= '
          return 0 if ($v != $data->get("' . quotemeta($f) . '"));';
      }
      $gencmp .= '
        }';
    }
    $gencmp .= '
        return 1; # all succeeded
      }';

    ${$self->{saveCodeTo}} = $gencmp if (defined($self->{saveCodeTo}));
    $self->{compare} = eval $gencmp;
    confess("Internal error: $class failed to compile the comparator:\n$@\nfunction text:\n$gencmp ")
      if $@;
  }

  $self->{unit} = $unit;
  $self->{name} = $name;
  $self->{inLabel} = $unit->makeLabel($rt, $name . ".in", undef, sub {
    my ($label, $rop, $self) = @_;
    my $query = $rop->getRow();
    my $cmp = $self->{compare};
    my $rh = $self->{table}->begin();
    for (; !$rh->isNull(); $rh = $rh->next()) {
      if (!defined $cmp || &$cmp($query, $rh->getRow())) {
        $self->{unit}->call(
          $self->{outLabel}->makeRowop("OP_INSERT", $rh->getRow()));
      }
    }
    # The end is signaled by OP_NOP with empty fields.
    $self->{unit}->makeArrayCall($self->{outLabel}, "OP_NOP");
  }, $self);
  $self->{outLabel} = $unit->makeDummyLabel($rt, $name . ".out");

  bless $self, $class;
  return $self;
}

The code of the anonymous comparison function gets generated in $gencmp and then compiled with eval. eval returns the reference to the compiled function, which is then used at run time. The generation uses the same logic as before to decide between the string and numeric comparisons, and also effectively unrolls the loop. When embedding the user-supplied values as string constants in the generated code, it's important to escape them with quotemeta(). Even though we're talking about field names, they could still contain some funny characters. The option saveCodeTo can be used to get the source code of the comparator: after the code gets generated, it's saved at the supplied reference.
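To show the generate-and-eval technique standing on its own, here is a minimal sketch that uses plain Perl hashes in place of the Triceps rows. The function gen_comparator() is invented for this illustration and is not part of Triceps, and for simplicity it compares all fields as strings:

```perl
use strict;
use warnings;

# Generate and compile a comparator for the given field names, with
# plain hashes standing in for rows; quotemeta() protects the generated
# code against any special characters in the names.
sub gen_comparator {
  my @fields = @_;
  my $src = 'sub { my ($query, $data) = @_;';
  foreach my $f (@fields) {
    my $q = quotemeta($f);
    $src .= ' { my $v = $query->{"' . $q . '"};'
      . ' return 0 if ($v && $v ne $data->{"' . $q . '"}); }';
  }
  $src .= ' return 1; }';
  my $cmp = eval $src
    or die "failed to compile the comparator: $@\n$src";
  return $cmp; # eval returns the reference to the compiled function
}

my $cmp = gen_comparator("symbol", "price");
print $cmp->({ symbol => "AAA" }, { symbol => "AAA", price => 20 }), "\n"; # prints 1
print $cmp->({ symbol => "BBB" }, { symbol => "AAA", price => 20 }), "\n"; # prints 0
```

As in the Query5 example, an empty or 0 value in the query hash is treated as "no filtering on this field".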

If the filter field option is not used, the comparator remains undefined.

The use of this version is the same as of the previous one, but to show the source code of the comparator, I've added its printout:

my $cmpcode;
my $query = Query5->new(table => $tWindow, name => "qWindow",
  fields => ["symbol", "price"], saveCodeTo => \$cmpcode );
# as a demonstration
print("Code:\n$cmpcode\n");

This produces the result:

Code:

      sub # ($query, $data)
      {
        use strict;
        my ($query, $data) = @_;
        my $v;
        $v = $query->get("symbol");
        if ($v) {
          return 0 if ($v ne $data->get("symbol"));
        }
        $v = $query->get("price");
        if ($v) {
          return 0 if ($v != $data->get("price"));
        }
        return 1; # all succeeded
      }
tWindow,OP_INSERT,1,AAA,10,10
tWindow,OP_INSERT,3,AAA,20,20
tWindow,OP_INSERT,4,BBB,20,20
qWindow,OP_INSERT
tWindow,OP_INSERT,5,AAA,30,30
qWindow,OP_INSERT,5,AAA,0,0
qWindow,OP_INSERT,0,,20,0
qWindow.out,OP_INSERT,1,AAA,10,10
qWindow.out,OP_INSERT,3,AAA,20,20
qWindow.out,OP_INSERT,4,BBB,20,20
qWindow.out,OP_NOP,,,,
qWindow.out,OP_INSERT,3,AAA,20,20
qWindow.out,OP_INSERT,5,AAA,30,30
qWindow.out,OP_NOP,,,,
qWindow.out,OP_INSERT,3,AAA,20,20
qWindow.out,OP_INSERT,4,BBB,20,20
qWindow.out,OP_NOP,,,,

Besides the code printout, the result is the same as last time.

Now, why list the fields in an option? Why not just take them all? After all, if the user doesn't want filtering on some field, he can always simply not set it in the query row. If the efficiency is a concern, with possibly hundreds of fields in the row and only a few of them used for filtering, we can do better: we can generate and compile the comparison function only after we see the query row. Here goes the next version that does all this:

package Query6;
use Carp;

sub CLONE_SKIP { 1; }

sub new # ($class, $optionName => $optionValue ...)
{
  my $class = shift;
  my $self = {};

  &Triceps::Opt::parse($class, $self, {
    name => [ undef, \&Triceps::Opt::ck_mandatory ],
    table => [ undef, sub { &Triceps::Opt::ck_mandatory(@_); &Triceps::Opt::ck_ref(@_, "Triceps::Table") } ],
  }, @_);

  my $name = $self->{name};

  my $table = $self->{table};
  my $unit = $table->getUnit();
  my $rt = $table->getRowType();

  $self->{unit} = $unit;
  $self->{name} = $name;
  $self->{inLabel} = $unit->makeLabel($rt, $name . ".in", undef, sub {
    my ($label, $rop, $self) = @_;
    my $query = $rop->getRow();
    my $cmp = $self->genComparison($query);
    my $rh = $self->{table}->begin();
    for (; !$rh->isNull(); $rh = $rh->next()) {
      if (&$cmp($query, $rh->getRow())) {
        $self->{unit}->call(
          $self->{outLabel}->makeRowop("OP_INSERT", $rh->getRow()));
      }
    }
    # The end is signaled by OP_NOP with empty fields.
    $self->{unit}->makeArrayCall($self->{outLabel}, "OP_NOP");
  }, $self);
  $self->{outLabel} = $unit->makeDummyLabel($rt, $name . ".out");

  bless $self, $class;
  return $self;
}

# Generate the comparison function on the fly from the fields in the
# query row.
# Since the simplified CSV parsing in the mainLoop() provides
# no easy way to send NULLs, consider any empty or 0 value
# in the query row equivalent to NULLs.
sub genComparison # ($self, $query)
{
  my $self = shift;
  my $query = shift;

  my %qhash = $query->toHash();
  my %rtdef = $self->{table}->getRowType()->getdef();
  my ($f, $v);

  my $gencmp = '
      sub # ($query, $data)
      {
        use strict;';

  # the sorting keeps the key order predictable for the tests;
  # this can also be done with Hash::Util::hash_traversal_mask()
  # but would not be backwards-compatible
  foreach $f (sort keys %qhash) {
    $v = $qhash{$f};
    next unless($v);
    my $t = $rtdef{$f};

    if (&Triceps::Fields::isStringType($t)) {
      $gencmp .= '
        return 0 if ($_[0]->get("' . quotemeta($f) . '")
          ne $_[1]->get("' . quotemeta($f) . '"));';
    } else {
      $gencmp .= '
        return 0 if ($_[0]->get("' . quotemeta($f) . '")
          != $_[1]->get("' . quotemeta($f) . '"));';
    }
  }
  $gencmp .= '
        return 1; # all succeeded
      }';

  my $compare = eval $gencmp;
  confess("Internal error: Query '" . $self->{name}
      . "' failed to compile the comparator:\n$@\nfunction text:\n$gencmp ")
    if $@;

  # for debugging
  &Triceps::X::SimpleServer::outCurBuf("Compiled comparator:\n$gencmp\n");

  return $compare;
}

The option fields is gone, and the code generation has moved into the method genComparison(), which gets called for each query. I've inserted the sending back of the comparator source code at the end of it, to make the example easier to understand. Obviously, if this code were used in production, that would have to be commented out, and maybe some better option added for debugging. An example of the output is:

tWindow,OP_INSERT,1,AAA,10,10
tWindow,OP_INSERT,3,AAA,20,20
tWindow,OP_INSERT,4,BBB,20,20
qWindow,OP_INSERT
tWindow,OP_INSERT,5,AAA,30,30
qWindow,OP_INSERT,5,AAA,0,0
qWindow,OP_INSERT,0,,20,0
Compiled comparator:

      sub # ($query, $data)
      {
        use strict;
        return 1; # all succeeded
      }
qWindow.out,OP_INSERT,1,AAA,10,10
qWindow.out,OP_INSERT,3,AAA,20,20
qWindow.out,OP_INSERT,4,BBB,20,20
qWindow.out,OP_NOP,,,,
Compiled comparator:

      sub # ($query, $data)
      {
        use strict;
        return 0 if ($_[0]->get("symbol")
          ne $_[1]->get("symbol"));
        return 0 if ($_[0]->get("id")
          != $_[1]->get("id"));
        return 1; # all succeeded
      }
qWindow.out,OP_INSERT,5,AAA,30,30
qWindow.out,OP_NOP,,,,
Compiled comparator:

      sub # ($query, $data)
      {
        use strict;
        return 0 if ($_[0]->get("price")
          != $_[1]->get("price"));
        return 1; # all succeeded
      }
qWindow.out,OP_INSERT,3,AAA,20,20
qWindow.out,OP_INSERT,4,BBB,20,20
qWindow.out,OP_NOP,,,,

The first query contains no filter fields, so the function compiles to the constant 1. The second query has the fields id and symbol not empty, so the filtering goes by them. The third query has only the price field, and it is used for filtering.

The code generation on the fly is a powerful tool and is used throughout Triceps.

10.7. Result projection in the templates

The other functionality provided by the Triceps::Fields is the filtering of the fields in the result row type, also known as projection. You can select which fields you want and which you don't want, and rename the fields.

To show how it's done, I took the Query3 example from Section 10.5: “Template options” and added the result field filtering to it. I've also changed the format in which it returns the results to printP(), to show the field names and make the effects of the field renaming visible.

package Query7;

sub CLONE_SKIP { 1; }

sub new # ($class, $optionName => $optionValue ...)
{
  my $class = shift;
  my $self = {};

  &Triceps::Opt::parse($class, $self, {
    name => [ undef, \&Triceps::Opt::ck_mandatory ],
    table => [ undef, sub { &Triceps::Opt::ck_mandatory(@_); &Triceps::Opt::ck_ref(@_, "Triceps::Table") } ],
    resultFields => [ undef, sub { &Triceps::Opt::ck_ref(@_, "ARRAY", ""); } ],
  }, @_);

  my $name = $self->{name};

  my $table = $self->{table};
  my $unit = $table->getUnit();
  my $rtIn = $table->getRowType();
  my $rtOut = $rtIn;

  if (defined $self->{resultFields}) {
    my @inFields = $rtIn->getFieldNames();
    my @pairs =  &Triceps::Fields::filterToPairs($class, \@inFields, $self->{resultFields});
    ($rtOut, $self->{projectFunc}) = &Triceps::Fields::makeTranslation(
      rowTypes => [ $rtIn ],
      filterPairs => [ \@pairs ],
    );
  } else {
    $self->{projectFunc} = sub {
      return $_[0];
    }
  }

  $self->{unit} = $unit;
  $self->{name} = $name;
  $self->{inLabel} = $unit->makeLabel($rtIn, $name . ".in", undef, sub {
    # This version ignores the row contents, just dumps the table.
    my ($label, $rop, $self) = @_;
    my $rh = $self->{table}->begin();
    for (; !$rh->isNull(); $rh = $rh->next()) {
      $self->{unit}->call(
        $self->{outLabel}->makeRowop("OP_INSERT",
          &{$self->{projectFunc}}($rh->getRow())));
    }
    # The end is signaled by OP_NOP with empty fields.
    $self->{unit}->makeArrayCall($self->{outLabel}, "OP_NOP");
  }, $self);
  $self->{outLabel} = $unit->makeDummyLabel($rtOut, $name . ".out");

  bless $self, $class;
  return $self;
}

sub getInputLabel # ($self)
{
  my $self = shift;
  return $self->{inLabel};
}

sub getOutputLabel # ($self)
{
  my $self = shift;
  return $self->{outLabel};
}

sub getName # ($self)
{
  my $self = shift;
  return $self->{name};
}

package main;

my $uTrades = Triceps::Unit->new("uTrades");
my $tWindow = $uTrades->makeTable($ttWindow, "tWindow");
my $query = Query7->new(table => $tWindow, name => "qWindow",
  resultFields => [ '!id', 'size/lot_$&', '.*' ],
);
# print in the tokenized format
my $srvout = $uTrades->makeLabel($query->getOutputLabel()->getType(),
  $query->getOutputLabel()->getName() . ".serverOut", undef, sub {
    &Triceps::X::SimpleServer::outCurBuf($_[1]->printP() . "\n");
  });
$query->getOutputLabel()->chain($srvout);

my %dispatch;
$dispatch{$tWindow->getName()} = $tWindow->getInputLabel();
$dispatch{$query->getName()} = $query->getInputLabel();
$dispatch{"exit"} = &Triceps::X::SimpleServer::makeExitLabel($uTrades, "exit");

Triceps::X::DumbClient::run(\%dispatch);

The query now has the new option resultFields that defines the projection. That option accepts a reference to an array of pattern strings. If present, it gives the patterns of the fields to let through. The patterns may be either the explicit field names or regular expressions implicitly anchored at both front and back. There is also a bit of extra modification possible:

!pattern
Skip the fields matching the pattern.
pattern/substitution
Pass the matching fields and rename them according to the substitution.

So in this example [ '!id', 'size/lot_$&', '.*' ] means: skip the field id, rename the field size by prepending lot_ to it, and pass through the rest of the fields. In the renaming pattern, $& is the reference to the whole original field name. If you use the parenthesised groups, they are referred to as $1, $2 and so on. But if you use any of those, don't forget to put the pattern into single quotes to prevent the unwanted expansion in the double quotes before the projection gets a chance to see it.

For an example of why the parenthesised groups can be useful, suppose that the row type has multiple account-related elements that all start with acct: acctsrc, acctinternal, acctexternal. Suppose we want to insert an underscore after acct. This can be achieved with the pattern 'acct(.*)/acct_$1'. As usual in the Perl regexps, the parenthesised groups are numbered left to right, starting with $1.
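The pattern semantics can be sketched in plain Perl, independently of Triceps. Here project_names() is a hypothetical, simplified stand-in for Triceps::Fields::filterToPairs(); it cuts corners (for example, a substitution containing a double quote character would break it), but it shows the first-match-wins logic and the expansion of $& and the numbered groups:

```perl
use strict;
use warnings;

# For each field, find the first matching pattern and decide its fate:
# skip it, rename it by the substitution, or pass it through unchanged.
# Returns the (old name, new name) pairs, like filterToPairs() does.
sub project_names {
  my ($fields, $patterns) = @_;
  my @pairs;
  FIELD: foreach my $f (@$fields) {
    foreach my $p (@$patterns) {
      if ($p =~ /^!(.*)/) { # negative pattern: skip the matching field
        next FIELD if ($f =~ /^$1$/);
      } elsif ($p =~ m;^([^/]+)/(.*);) { # rename by the substitution
        my ($pat, $subst) = ($1, $2);
        if ($f =~ /^$pat$/) {
          my $to = eval qq{"$subst"}; # expands $&, $1 and so on
          push @pairs, $f, $to;
          next FIELD;
        }
      } elsif ($f =~ /^$p$/) { # plain pattern: pass the field through
        push @pairs, $f, $f;
        next FIELD;
      }
    }
    # no pattern matched: the field gets ignored
  }
  return @pairs;
}

my @pairs = project_names(
  [ qw(id symbol price size) ],
  [ '!id', 'size/lot_$&', '.*' ]);
print join(", ", @pairs), "\n";
# prints: symbol, symbol, price, price, size, lot_size
```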

If a specification element refers to a literal field, like here id and size, the projection checks that the field is actually present in the original row type, catching the typos. For the general regular expressions it doesn't check whether the pattern matched anything. It's not difficult to check, but that would preclude the reuse of the same patterns on the varying row types, and I'm not sure yet which is more important.

The way this whole thing works is that each field gets tested against each pattern in order. The first pattern that matches determines what happens to the field. If none of the patterns matches, the field gets ignored. An important consequence regarding the skipping patterns is that they don't automatically pass through the non-matching fields. You need to add an explicit positive pattern at the end of the list to pass the remaining fields through. '.*' serves this purpose in the example.

A consequence is that the order of the fields can't be changed by the projection. They are tested in the order they appear in the original row type, and are inserted into the projected row type in the same order.

Another important point is that the field names in the result must not duplicate; that would be an error. Be careful with the substitution syntax to avoid creating duplicate names.

A run example from this version, with the same input as before:

tWindow,OP_INSERT,1,AAA,10,10
tWindow,OP_INSERT,3,AAA,20,20
qWindow,OP_INSERT
tWindow,OP_INSERT,5,AAA,30,30
qWindow,OP_INSERT
qWindow.out OP_INSERT symbol="AAA" price="10" lot_size="10"
qWindow.out OP_INSERT symbol="AAA" price="20" lot_size="20"
qWindow.out OP_NOP
qWindow.out OP_INSERT symbol="AAA" price="20" lot_size="20"
qWindow.out OP_INSERT symbol="AAA" price="30" lot_size="30"
qWindow.out OP_NOP

The rows returned are the same, but projected and printed in the printP() format.

Inside the template the projection works in three steps:

  • Triceps::Fields::filterToPairs() does the projection of the field names and returns its result as an array of names. The names in the array go in pairs: the old name and the new name in each pair. The fields that got skipped do not get included in the list. In this example the array would be ( "symbol", "symbol", "price", "price", "size", "lot_size" ).
  • Triceps::Fields::makeTranslation() then takes this array along with the original row type and produces the result row type and a function reference that does the projection by converting an original row into the projected one.
  • The template execution then calls this projection function for the result rows.

The split of work between filterToPairs() and makeTranslation() has been done partially for historical reasons and partially because sometimes you may want to just get the array of name pairs and then use them on your own instead of calling makeTranslation(). There is one more function that you may find useful if you do the handling on your own: filter(). It takes the same arguments and does the same thing as filterToPairs() but returns the result in a different format. It's still an array of strings but it contains only the translated field names instead of the pairs, in the order matching the order of the original fields. For the fields that have been skipped it contains an undef. For this example it would return ( undef, "symbol", "price", "lot_size" ).

The calls are:

@fields = &Triceps::Fields::filter(
  $caller, \@inFields, \@translation);
@pairs = &Triceps::Fields::filterToPairs(
  $caller, \@inFields, \@translation);
($rowType, $projectFunc) = &Triceps::Fields::makeTranslation(
  $optName => $optValue, ...);

All of them confess on errors, and the argument $caller is used for building the error messages. The options of makeTranslation() are:

rowTypes
A reference to an array of the original row types.
filterPairs
A reference to an array of filter pair arrays, one per row type.

Both of these options are mandatory. And that's right, makeTranslation() can accept and merge more than one original row type, with a separate projection specification for each of them. It's not quite as flexible as I'd want it to be, not allowing the reordering and mixing of the fields from different originals (for now the fields go in sequence: from the first original, from the second original, and so on), but it's a decent start. When you combine multiple original row types, you need to be particularly careful to avoid the duplicate field names in the result.

The option saveCodeTo also allows saving the source code of the generated function, same as in the Query5 example in Section 10.6: "Code generation in the templates".

The general call form of makeTranslation() is:

($rowType, $projectFunc) = &Triceps::Fields::makeTranslation(
  rowTypes => [ $rt1, $rt2, ..., $rtN ],
  filterPairs => [ \@pairs1, \@pairs2, ..., \@pairsN ],
  saveCodeTo => \$codeVar,
);

Either the result row type or the projection function reference could have been returned at a place pointed to by an option, like saveCodeTo, but since Perl supports returning multiple values from a function, doing it this way looks simpler and cleaner.

The projection function is then called:

$row = &$projectFunc($origRow1, $origRow2, ..., $origRowN);

Naturally, makeTranslation() is a template itself. Let's look at its source code; it shows a new trick.

package Triceps::Fields;

use Carp;

use strict;

sub makeTranslation # (optName => optValue, ...)
{
  my $opts = {}; # the parsed options
  my $myname = "Triceps::Fields::makeTranslation";

  &Triceps::Opt::parse("Triceps::Fields", $opts, {
      rowTypes => [ undef, sub { &Triceps::Opt::ck_mandatory(@_); &Triceps::Opt::ck_ref(@_, "ARRAY", "Triceps::RowType") } ],
      filterPairs => [ undef, sub { &Triceps::Opt::ck_mandatory(@_); &Triceps::Opt::ck_ref(@_, "ARRAY", "ARRAY") } ],
      saveCodeTo => [ undef, sub { &Triceps::Opt::ck_refscalar(@_) } ],
    }, @_);

  # reset the saved source code
  ${$opts->{saveCodeTo}} = undef if (defined($opts->{saveCodeTo}));

  my $rts = $opts->{rowTypes};
  my $fps = $opts->{filterPairs};

  confess "$myname: the arrays of row types and filter pairs must be of the same size, got " . ($#{$rts}+1) . " and " . ($#{$fps}+1) . " elements"
    unless ($#{$rts} == $#{$fps});

  my $gencode = '
    sub { # (@rows)
      use strict;
      use Carp;
      confess "template internal error in ' . $myname  . ': result translation expected ' . ($#{$rts}+1) . ' row args, received " . ($#_+1)
        unless ($#_ == ' . $#{$rts} . ');
      # $result_rt comes at compile time from Triceps::Fields::makeTranslation
      return $result_rt->makeRowArray(';

  my @rowdef; # of the result row type
  for (my $i = 0; $i <= $#{$rts}; $i++) {
    my %origdef = $rts->[$i]->getdef();
    my @fp = @{$fps->[$i]}; # copy the array, because it will be shifted
    while ($#fp >= 0) {
      my $from = shift @fp;
      my $to = shift @fp;
      my $type = $origdef{$from};
      confess "$myname: unknown original field '$from' in the original row type $i:\n" . $rts->[$i]->print() . " "
        unless (defined $type);
      push(@rowdef, $to, $type);
      $gencode .= '
        $_[' . $i . ']->get("' . quotemeta($from) . '"),';
    }
  }

  $gencode .= '
      );
    }';

  my $result_rt = Triceps::RowType->new(@rowdef);
    # XXX extended error "$myname: Invalid result row type specification: $! ";

  ${$opts->{saveCodeTo}} = $gencode if (defined($opts->{saveCodeTo}));

  # compile the translation function
  my $func = eval $gencode
    or confess "$myname: error in compilation of the function:\n  $@\nfunction text:\n$gencode ";

  return ($result_rt, $func);
}

By now almost all the parts of the implementation should look familiar to you. It builds the result row definition and the projection function code in parallel, by iterating through the originals. An interesting trick is done with passing the result row type into the projection function. The function needs it to create the result rows, but it can't be easily placed into the function's source code. So the closure property of the generated function is used: whatever outside my variables are visible to the function at the time when it's compiled get captured in the closure, along with their values. So the my variable $result_rt is set to the result row type, and then the projection function gets compiled. The projection function refers to $result_rt, which gets picked up from the enclosing scope and remembered in the closure.
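Here is the closure trick in isolation, with a plain string standing in for the row type (the variable names are made up for the illustration):

```perl
use strict;
use warnings;

# $result_tag stands in for $result_rt: the generated source refers to a
# variable that is defined only in the scope where eval() runs, and the
# compiled closure keeps referring to it afterwards.
my $result_tag = "projected";
my $func = eval 'sub { return "$result_tag: $_[0]"; }'
  or die "compilation error: $@";

print $func->("row data"), "\n"; # prints "projected: row data"
```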

Chapter 11. Aggregation

11.1. The ubiquitous VWAP

Every CEP supplier loves an example of VWAP calculation: it's small, it's about that quintessential CEP activity: aggregation, and it sounds like something from the real world.

A quick sidebar: what is the VWAP? It's the Volume-Weighted Average Price: the average price of the shares traded during some period of time, usually a day. If you take the price of every share traded during the day and calculate the average, you get the VWAP. What is the volume-weighted part? The shares don't usually get sold one by one. They're sold in variable-sized lots. If you think in the terms of lots and not individual shares, you have to weight the trade prices (not to be confused with the costs) for the lots proportionally to the number of shares in them.
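The arithmetic amounts to a few lines of plain Perl (vwap() here is an ad-hoc function for the illustration, not a part of Triceps):

```perl
use strict;
use warnings;

# VWAP of a set of lots: the total money traded divided by the total
# number of shares traded.
sub vwap {
  my ($sum, $count) = (0, 0);
  foreach my $lot (@_) { # each lot is a pair [size, price]
    my ($size, $price) = @$lot;
    $sum += $size * $price;
    $count += $size;
  }
  return $count == 0 ? undef : $sum / $count;
}

# 100 shares at 123, then 300 shares at 125:
print vwap([100, 123], [300, 125]), "\n"; # prints 124.5
```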

I've been using VWAP for trying out the different approaches to the aggregation. There are multiple ways to do it, from fully manual, to the aggregator infrastructure with manual computation of the aggregations, to the simple aggregation functions. The cutest version of VWAP so far is implemented as a user-defined aggregation function for the SimpleAggregator. Here is how it goes:

# VWAP function definition
my $myAggFunctions = {
  myvwap => {
    vars => { sum => 0, count => 0, size => 0, price => 0 },
    step => '($%size, $%price) = @$%argiter; '
      . 'if (defined $%size && defined $%price) '
        . '{$%count += $%size; $%sum += $%size * $%price;}',
    result => '($%count == 0? undef : $%sum / $%count)',
  },
};

my $uTrades = Triceps::Unit->new("uTrades");

# the input data
my $rtTrade = Triceps::RowType->new(
  id => "int32", # trade unique id
  symbol => "string", # symbol traded
  price => "float64",
  size => "float64", # number of shares traded
);

my $ttWindow = Triceps::TableType->new($rtTrade)
  ->addSubIndex("byId",
    Triceps::IndexType->newHashed(key => [ "id" ])
  )
  ->addSubIndex("bySymbol",
    Triceps::IndexType->newHashed(key => [ "symbol" ])
    ->addSubIndex("fifo", Triceps::IndexType->newFifo())
  )
;

# the aggregation result
my $rtVwap;
my $compText; # for debugging

Triceps::SimpleAggregator::make(
  tabType => $ttWindow,
  name => "aggrVwap",
  idxPath => [ "bySymbol", "fifo" ],
  result => [
    symbol => "string", "last", sub {$_[0]->get("symbol");},
    id => "int32", "last", sub {$_[0]->get("id");},
    volume => "float64", "sum", sub {$_[0]->get("size");},
    vwap => "float64", "myvwap", sub { [$_[0]->get("size"), $_[0]->get("price")];},
  ],
  functions => $myAggFunctions,
  saveRowTypeTo => \$rtVwap,
  saveComputeTo => \$compText,
);

$ttWindow->initialize();
my $tWindow = $uTrades->makeTable($ttWindow, "tWindow");

# label to print the result of aggregation
my $lbPrint = $uTrades->makeLabel($rtVwap, "lbPrint",
  undef, sub { # (label, rowop)
    print($_[1]->printP(), "\n");
  });
$tWindow->getAggregatorLabel("aggrVwap")->chain($lbPrint);

while(<STDIN>) {
  chomp;
  my @data = split(/,/); # starts with a string opcode
  $uTrades->makeArrayCall($tWindow->getInputLabel(), @data);
  $uTrades->drainFrame(); # just in case, for completeness
}

The aggregators get defined as parts of the table type. Triceps::SimpleAggregator::make() is a kind of template that adds an aggregator definition to the table type specified in the option tabType. An aggregator doesn't live in a vacuum, it always works as a part of a table type. As the table gets modified, the aggregator re-computes its aggregation results. The fine distinction is that the aggregator definition is a part of the table type, and is common for all the tables of this type, but each table stores its own aggregation state, and when an aggregator runs on a table, it uses and modifies that state.

The name of the aggregator is how you can find its result later in the table: each aggregator has an output label created for it, that can be found with $table->getAggregatorLabel(). The option idxPath defines both the grouping of the rows for this aggregator and their order in the group. The index type at the path determines the order and its parent defines the groups. In this case the grouping happens by symbol, and the rows in the groups go in the FIFO order. This means that the aggregation function last will be selecting the row that has been inserted last, in the FIFO order.

The option result defines both the row type of the result and the rules for its computation. Each field is defined there with four elements: name, type, aggregation function name, and the function reference to select the value to be aggregated from the row. Triceps provides a bunch of pre-defined aggregation functions like first, last, sum, count, avg and so on. But VWAP is not one of them (well, maybe now it should be, but then this example would be less interesting). Not to worry, the user can add custom aggregation functions, and that's what this example does.

The option functions contains the definitions of such user-defined aggregation functions. Here it defines the function myvwap. It defines the state variables that will be used to keep the intermediate values for a group, a step computation, and the result computation. Whenever the group changes, the aggregator will reset the state variables to the default values and iterate through the new contents of the group. It will perform the step computation for each row and collect the data in the intermediate variables. After the iteration it will perform the result computation and produce the final value.

The VWAP computation is a weird one, taking two fields as arguments. These two fields get packed into an array reference by

sub { [$_[0]->get("size"), $_[0]->get("price")];}

and then the step computation unpacks and handles them. In the aggregator computations the syntax $%name refers to the intermediate variables and also to a few pre-defined ones. $%argiter is the value extracted from the current row during the iteration.
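To show what the $%name substitution amounts to, here is a toy stand-alone expansion of the myvwap definition into a plain Perl function, roughly following the spirit of what SimpleAggregator does (the expansion code is a simplification invented for the illustration, not the actual SimpleAggregator source):

```perl
use strict;
use warnings;

# The user-defined aggregation function, as in the example above.
my %def = (
  vars => { sum => 0, count => 0, size => 0, price => 0 },
  step => '($%size, $%price) = @$%argiter; '
    . 'if (defined $%size && defined $%price) '
      . '{$%count += $%size; $%sum += $%size * $%price;}',
  result => '($%count == 0? undef : $%sum / $%count)',
);

# Expand each $%name to a real variable $v_name, then wrap the step
# into a loop over the group and compile the whole thing with eval.
my $vars = join("",
  map { "my \$v_$_ = $def{vars}{$_};\n" } sort keys %{$def{vars}});
my ($step, $result) = ($def{step}, $def{result});
s/\$%(\w+)/\$v_$1/g foreach ($step, $result);

my $src = "sub {\n$vars"
  . "foreach my \$v_argiter (\@_) { $step }\n"
  . "return $result;\n}";
my $compute = eval $src or die "compilation error: $@\n$src";

# Each argument plays the role of $%argiter for one row of the group.
print $compute->([100, 123], [300, 125]), "\n"; # prints 124.5
```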

And that's pretty much it: as the rows get sent to the table, the aggregator state gets updated to match the table contents, and the aggregator computes the results and sends them on. For example:

OP_INSERT,11,abc,123,100
tWindow.aggrVwap OP_INSERT symbol="abc" id="11" volume="100"
    vwap="123"
OP_INSERT,12,abc,125,300
tWindow.aggrVwap OP_DELETE symbol="abc" id="11" volume="100"
    vwap="123"
tWindow.aggrVwap OP_INSERT symbol="abc" id="12" volume="400"
    vwap="124.5"
OP_INSERT,13,def,200,100
tWindow.aggrVwap OP_INSERT symbol="def" id="13" volume="100"
    vwap="200"
OP_INSERT,14,fgh,1000,100
tWindow.aggrVwap OP_INSERT symbol="fgh" id="14" volume="100"
    vwap="1000"
OP_INSERT,15,abc,128,300
tWindow.aggrVwap OP_DELETE symbol="abc" id="12" volume="400"
    vwap="124.5"
tWindow.aggrVwap OP_INSERT symbol="abc" id="15" volume="700"
    vwap="126"
OP_INSERT,16,fgh,1100,25
tWindow.aggrVwap OP_DELETE symbol="fgh" id="14" volume="100"
    vwap="1000"
tWindow.aggrVwap OP_INSERT symbol="fgh" id="16" volume="125"
    vwap="1020"
OP_INSERT,17,def,202,100
tWindow.aggrVwap OP_DELETE symbol="def" id="13" volume="100"
    vwap="200"
tWindow.aggrVwap OP_INSERT symbol="def" id="17" volume="200"
    vwap="201"
OP_INSERT,18,def,192,1000
tWindow.aggrVwap OP_DELETE symbol="def" id="17" volume="200"
    vwap="201"
tWindow.aggrVwap OP_INSERT symbol="def" id="18" volume="1200"
    vwap="193.5"

When a group gets modified, the aggregator first sends a DELETE of the old contents, then an INSERT of the new contents. But when the first row gets inserted in a group, there is nothing to delete, and only INSERT is sent. And the opposite, when the last row is deleted from a group, only the DELETE is sent.

After this highlight, let's look at the aggregators from the bottom up.

11.2. Manual aggregation

The table example in Section 9.7: "Secondary indexes" prints the aggregated information (the average price of the last two trades). This can be fairly easily changed to put that information into rows and send them on through a label. The function printAverage() has morphed into computeAverage(), while the rest of the example stayed the same and is omitted here:

our $rtAvgPrice = Triceps::RowType->new(
  symbol => "string", # symbol traded
  id => "int32", # last trade's id
  price => "float64", # avg price of the last 2 trades
);

# place to send the average: could be a dummy label, but to keep the
# code smaller also print the rows here, instead of in a separate label
our $lbAverage = $uTrades->makeLabel($rtAvgPrice, "lbAverage",
  undef, sub { # (label, rowop)
    print($_[1]->printP(), "\n");
  });

# Send the average price of the symbol in the last modified row
sub computeAverage # (row)
{
  return unless defined $rLastMod;
  my $rhFirst = $tWindow->findIdx($itSymbol, $rLastMod);
  my $rhEnd = $rhFirst->nextGroupIdx($itLast2);
  print("Contents:\n");
  my $avg = 0;
  my ($sum, $count);
  my $rhLast;
  for (my $rhi = $rhFirst;
      !$rhi->same($rhEnd); $rhi = $rhi->nextIdx($itLast2)) {
    print("  ", $rhi->getRow()->printP(), "\n");
    $rhLast = $rhi;
    $count++;
    $sum += $rhi->getRow()->get("price");
  }
  if ($count) {
    $avg = $sum/$count;
    $uTrades->call($lbAverage->makeRowop(&Triceps::OP_INSERT,
      $rtAvgPrice->makeRowHash(
        symbol => $rhLast->getRow()->get("symbol"),
        id => $rhLast->getRow()->get("id"),
        price => $avg
      )
    ));
  }
}

while(<STDIN>) {
  chomp;
  my @data = split(/,/);
  $uTrades->makeArrayCall($tWindow->getInputLabel(), @data);
  &computeAverage();
  undef $rLastMod; # clear for the next iteration
  $uTrades->drainFrame(); # just in case, for completeness
}

For the demonstration, the aggregated rows sent to $lbAverage get printed. The rows being aggregated are printed during the iteration too, indented after Contents:. And here is a sample run's result, with the input records shown in bold:

OP_INSERT,1,AAA,10,10
Contents:
  id="1" symbol="AAA" price="10" size="10"
lbAverage OP_INSERT symbol="AAA" id="1" price="10"
OP_INSERT,3,AAA,20,20
Contents:
  id="1" symbol="AAA" price="10" size="10"
  id="3" symbol="AAA" price="20" size="20"
lbAverage OP_INSERT symbol="AAA" id="3" price="15"
OP_INSERT,5,AAA,30,30
Contents:
  id="3" symbol="AAA" price="20" size="20"
  id="5" symbol="AAA" price="30" size="30"
lbAverage OP_INSERT symbol="AAA" id="5" price="25"
OP_DELETE,3
Contents:
  id="5" symbol="AAA" price="30" size="30"
lbAverage OP_INSERT symbol="AAA" id="5" price="30"
OP_DELETE,5
Contents:

There are a couple of things to notice about it: it produces only the INSERT rowops, no DELETEs, and when the last record of the group is removed, that event produces nothing.

The first item is mildly problematic because the processing downstream from here might not be able to handle the updates properly without the DELETE rowops. It can be worked around fairly easily by connecting another table to store the aggregation results, with the same primary key as the aggregation key. That table would automatically transform the repeated INSERTs on the same key to a DELETE-INSERT sequence.
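The idea of that work-around can be sketched in Python (the names here are made up for illustration; this is not the Triceps API): a table keyed by the aggregation key translates a repeated INSERT on the same key into a DELETE-INSERT pair.

```python
# Illustrative sketch (made-up names, not the Triceps API): a helper "table"
# keyed by the aggregation key that turns a stream of bare INSERTs into
# proper DELETE-INSERT pairs for the downstream consumers.
def table_insert(table, row, out):
    """Insert by the primary key 'symbol'; emit the implied rowops into out."""
    key = row["symbol"]
    if key in table:
        out.append(("OP_DELETE", table[key]))  # replacement: delete the old row first
    table[key] = row
    out.append(("OP_INSERT", row))

table, out = {}, []
table_insert(table, {"symbol": "AAA", "id": 1, "price": 10.0}, out)
table_insert(table, {"symbol": "AAA", "id": 3, "price": 15.0}, out)
# the second INSERT on the same key came out as a DELETE-INSERT pair
print([op for op, _ in out])  # ['OP_INSERT', 'OP_DELETE', 'OP_INSERT']
```

This is exactly the transformation that the helper table in the example below performs automatically.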

The second item is actually pretty bad because it means that the last record deleted gets stuck in the aggregation results. The Coral8 solution for this situation is to send a row with all non-key fields set to NULL, to reset them (interestingly, it's a relatively recent addition, that bug took Coral8 years to notice). But with the opcodes available, we can as well send a DELETE rowop with the key fields filled, the helper table will fill in the rest of the fields, and produce a clean DELETE.

All this can be done by the following changes. Add the table, and remember its input label in $lbAvgPriceHelper. It will be used to send the aggregated rows instead of $lbAverage. Then still use $lbAverage to print the records coming out, but now connect it after the helper table. And in computeAverage() change the destination label and add the case for when the group becomes empty ($count == 0). The rest of the example stays the same.

our $rtAvgPrice = Triceps::RowType->new(
  symbol => "string", # symbol traded
  id => "int32", # last trade's id
  price => "float64", # avg price of the last 2 trades
);

our $ttAvgPrice = Triceps::TableType->new($rtAvgPrice)
  ->addSubIndex("bySymbol",
    Triceps::IndexType->newHashed(key => [ "symbol" ])
  )
;
$ttAvgPrice->initialize();
our $tAvgPrice = $uTrades->makeTable($ttAvgPrice, "tAvgPrice");
our $lbAvgPriceHelper = $tAvgPrice->getInputLabel();

# place to send the average: could be a dummy label, but to keep the
# code smaller also print the rows here, instead of in a separate label
our $lbAverage = makePrintLabel("lbAverage", $tAvgPrice->getOutputLabel());

# Send the average price of the symbol in the last modified row
sub computeAverage2 # (row)
{
  return unless defined $rLastMod;
  my $rhFirst = $tWindow->findIdx($itSymbol, $rLastMod);
  my $rhEnd = $rhFirst->nextGroupIdx($itLast2);
  print("Contents:\n");
  my $avg = 0;
  my ($sum, $count) = (0, 0);
  my $rhLast;
  for (my $rhi = $rhFirst;
      !$rhi->same($rhEnd); $rhi = $rhi->nextIdx($itLast2)) {
    print("  ", $rhi->getRow()->printP(), "\n");
    $rhLast = $rhi;
    $count++;
    $sum += $rhi->getRow()->get("price");
  }
  if ($count) {
    $avg = $sum/$count;
    $uTrades->makeHashCall($lbAvgPriceHelper, &Triceps::OP_INSERT,
      symbol => $rhLast->getRow()->get("symbol"),
      id => $rhLast->getRow()->get("id"),
      price => $avg
    );
  } else {
    $uTrades->makeHashCall($lbAvgPriceHelper, &Triceps::OP_DELETE,
      symbol => $rLastMod->get("symbol"),
    );
  }
}

The change is straightforward. The label $lbAverage now reverts to just printing the rowops going through it, so it can be created with the template makePrintLabel() described in Section 10.3: “Simple wrapper templates”.

Then the output for the same input becomes:

OP_INSERT,1,AAA,10,10
Contents:
  id="1" symbol="AAA" price="10" size="10"
tAvgPrice.out OP_INSERT symbol="AAA" id="1" price="10"
OP_INSERT,3,AAA,20,20
Contents:
  id="1" symbol="AAA" price="10" size="10"
  id="3" symbol="AAA" price="20" size="20"
tAvgPrice.out OP_DELETE symbol="AAA" id="1" price="10"
tAvgPrice.out OP_INSERT symbol="AAA" id="3" price="15"
OP_INSERT,5,AAA,30,30
Contents:
  id="3" symbol="AAA" price="20" size="20"
  id="5" symbol="AAA" price="30" size="30"
tAvgPrice.out OP_DELETE symbol="AAA" id="3" price="15"
tAvgPrice.out OP_INSERT symbol="AAA" id="5" price="25"
OP_DELETE,3
Contents:
  id="5" symbol="AAA" price="30" size="30"
tAvgPrice.out OP_DELETE symbol="AAA" id="5" price="25"
tAvgPrice.out OP_INSERT symbol="AAA" id="5" price="30"
OP_DELETE,5
Contents:
tAvgPrice.out OP_DELETE symbol="AAA" id="5" price="30"

All is fixed, the proper DELETEs are coming out. The last lines show that even though the group in the window table becomes empty, the DELETE row still comes out.

Why should we worry so much about the DELETEs? Because without them, relying on just INSERTs for updates, it's easy to create bugs. The last example still has an issue with handling the row replacement by INSERTs. Can you spot it from reading the code?

Here is an example run that highlights the issue (as usual, the input lines are in bold):

OP_INSERT,1,AAA,10,10
Contents:
  id="1" symbol="AAA" price="10" size="10"
tAvgPrice.out OP_INSERT symbol="AAA" id="1" price="10"
OP_INSERT,3,AAA,20,20
Contents:
  id="1" symbol="AAA" price="10" size="10"
  id="3" symbol="AAA" price="20" size="20"
tAvgPrice.out OP_DELETE symbol="AAA" id="1" price="10"
tAvgPrice.out OP_INSERT symbol="AAA" id="3" price="15"
OP_INSERT,5,AAA,30,30
Contents:
  id="3" symbol="AAA" price="20" size="20"
  id="5" symbol="AAA" price="30" size="30"
tAvgPrice.out OP_DELETE symbol="AAA" id="3" price="15"
tAvgPrice.out OP_INSERT symbol="AAA" id="5" price="25"
OP_INSERT,5,BBB,30,30
Contents:
  id="5" symbol="BBB" price="30" size="30"
tAvgPrice.out OP_INSERT symbol="BBB" id="5" price="30"
OP_INSERT,7,AAA,40,40
Contents:
  id="3" symbol="AAA" price="20" size="20"
  id="7" symbol="AAA" price="40" size="40"
tAvgPrice.out OP_DELETE symbol="AAA" id="5" price="25"
tAvgPrice.out OP_INSERT symbol="AAA" id="7" price="30"

The row with id=5 has been replaced to change the symbol from AAA to BBB. This act changes both the group of AAA and the group of BBB, removing the row from the first one and inserting it into the second one. Yet only the output for BBB came out. The printout of the next row with id=7 and symbol=AAA shows that the row with id=5 has indeed been removed from the group AAA. It even corrects the result. But until that row came in, the average for the symbol AAA remained unchanged and incorrect.

There are multiple ways to fix this issue, but first it has to be noticed, which requires a lot of attention to detail. It's much better to avoid these bugs in the first place by sending the clean and nice input.
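The nature of this bug can be condensed into a small Python sketch (purely illustrative, not the Triceps API): the published results get updated only for the group that happened to emit an INSERT, so a row moving between groups leaves the old group's published value stale.

```python
# Illustrative sketch of the bug: with INSERT-only updates, a row moving
# between groups leaves the old group's published average stale.
published = {}

def publish(symbol, rows):
    """Recompute and publish the average price for one group."""
    if rows:
        published[symbol] = sum(r["price"] for r in rows) / len(rows)

groups = {"AAA": [{"id": 3, "price": 20.0}, {"id": 5, "price": 30.0}]}
publish("AAA", groups["AAA"])            # published AAA average = 25.0

# the row with id=5 changes its symbol from AAA to BBB...
moved = groups["AAA"].pop()
groups["BBB"] = [moved]
publish("BBB", groups["BBB"])            # ...but only BBB gets republished

print(published["AAA"])  # 25.0 - stale, the correct value would be 20.0
```

Until something else touches the AAA group and forces a republish, the consumers keep seeing the wrong average.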

11.3. Introducing the proper aggregation

Since the manual aggregation is error-prone, Triceps can manage it for you and do it right. The only thing you need to write is the actual iteration and computation. Here is the rewrite of the same example with a Triceps aggregator:

my $uTrades = Triceps::Unit->new("uTrades");

# the input data
my $rtTrade = Triceps::RowType->new(
  id => "int32", # trade unique id
  symbol => "string", # symbol traded
  price => "float64",
  size => "float64", # number of shares traded
);

# the aggregation result
my $rtAvgPrice = Triceps::RowType->new(
  symbol => "string", # symbol traded
  id => "int32", # last trade's id
  price => "float64", # avg price of the last 2 trades
);

# aggregation handler: recalculate the average each time the easy way
sub computeAverage1 # (table, context, aggop, opcode, rh, state, args...)
{
  my ($table, $context, $aggop, $opcode, $rh, $state, @args) = @_;

  # don't send the NULL record after the group becomes empty
  return if ($context->groupSize()==0
    || $opcode == &Triceps::OP_NOP);

  my $sum = 0;
  my $count = 0;
  for (my $rhi = $context->begin(); !$rhi->isNull();
      $rhi = $context->next($rhi)) {
    $count++;
    $sum += $rhi->getRow()->get("price");
  }
  my $rLast = $context->last()->getRow();
  my $avg = $sum/$count;

  my $res = $context->resultType()->makeRowHash(
    symbol => $rLast->get("symbol"),
    id => $rLast->get("id"),
    price => $avg
  );
  $context->send($opcode, $res);
}

my $ttWindow = Triceps::TableType->new($rtTrade)
  ->addSubIndex("byId",
    Triceps::IndexType->newHashed(key => [ "id" ])
  )
  ->addSubIndex("bySymbol",
    Triceps::IndexType->newHashed(key => [ "symbol" ])
    ->addSubIndex("last2",
      Triceps::IndexType->newFifo(limit => 2)
      ->setAggregator(Triceps::AggregatorType->new(
        $rtAvgPrice, "aggrAvgPrice", undef, \&computeAverage1)
      )
    )
  )
;
$ttWindow->initialize();
my $tWindow = $uTrades->makeTable($ttWindow, "tWindow");

# label to print the result of aggregation
my $lbAverage = makePrintLabel("lbAverage",
  $tWindow->getAggregatorLabel("aggrAvgPrice"));

while(<STDIN>) {
  chomp;
  my @data = split(/,/); # starts with a string opcode
  $uTrades->makeArrayCall($tWindow->getInputLabel(), @data);
  $uTrades->drainFrame(); # just in case, for completeness
}

What has changed in this code? The things got rearranged a bit. The aggregator is now defined as a part of the table type, so the aggregation result row type and its computation function had to be moved up.

The AggregatorType object holds the information about the aggregator. In the table type, the aggregator type gets attached to an index type with setAggregator(), in this case to the FIFO index type. The parent of that index type determines the aggregation groups: the grouping happens by its combined key fields (that is, all the key fields of all the indexes in the path starting from the root). Whether the getKey() method works on those index types doesn't matter for aggregation, so any of the Hashed, Ordered and Sorted index types can be used. The index type where the aggregator type is attached determines the order of the rows in the groups. If you use FIFO, the rows will be in the order of arrival. If you use Ordered or Sorted, the rows will be in the sort order. If you use Hashed, the rows will be in some random order, which is not particularly useful.

At present an index type may have no more than one aggregator type attached to it. There is no particular reason for that, other than that it was slightly easier to implement, and that I can't think yet of a real-world situation where multiple aggregators on the same index would be needed. If this situation ever occurs, this support can be added. However a table type may have multiple aggregator types in it, on different indexes. You can save a reference to an aggregator type in a variable and reuse it in different table types too (though not multiple times in the same table, since that would cause a naming conflict).

The aggregator type is created with the arguments of

  • result row type,
  • aggregator name,
  • group initialization Perl function (which may be undef, as in this example),
  • group computation Perl function or source code snippet,
  • the optional arguments for the functions.

Note that there is a difference in naming between the aggregator types and index types: an aggregator type knows its name, while an index type does not. An index type is given a name only in its hierarchy inside the table type, but it does not know its name.

When a table is created, it finds all the aggregator types in it, and creates an output label for each of them. The names of the aggregator types are used as suffixes to the table name. In this example the aggregator will have its output label named tWindow.aggrAvgPrice. This puts all the aggregator types in the table into the same namespace, so make sure to give them different names in the same table type. Also avoid the names in, out and pre because these are already taken by the table's own labels. The aggregator labels in the table can be found with

$aggLabel = $table->getAggregatorLabel("aggName");

The aggregator types are theoretically multithreaded, but the way the Perl threads work, the Perl code has to be recompiled from the source code in each thread. So for a table type with aggregators to be exportable to the other threads, the aggregators must have their logic specified as Perl source code, not a compiled Perl function.

After the logic is moved into a managed aggregator, the main loop becomes simpler.

The computation function gets a lot more arguments than it used to. The most interesting and most basic ones are $context, $opcode, and $rh. The rest are useful in the more complex cases only.

The aggregator type is exactly that: a type. It doesn't know on which table or index, or even index type, it will be used. And indeed, it might be used on multiple tables and index types. But to do the iteration on the rows, the computation function needs to get this information somehow. And it does, in the form of the aggregator context. The manual aggregation used the last table output row to find on which exact group to iterate. The managed aggregator gets the last modified row handle as the argument $rh. But our simple aggregator doesn't even need to consult $rh because the context takes care of finding the group too: it knows the exact group and exact index that needs to be aggregated (look at the index tree drawings in Section 9.10: “The index tree” for the difference between an index type and an index).

The context provides its own begin() and next() methods. They are actually slightly more efficient than the usual table iteration methods because they take advantage of that exact known index. Most importantly, they work differently:

$rhi = $context->next($rhi);

returns a NULL row handle when it reaches the end of the group. Do not, I repeat, DO NOT use the $rhi->next() in the aggregators, or you'll get some very wrong results.

The context also has a bit more of its own magic.

$rh = $context->last();

returns the last row handle in the group. This comes in very handy because in most cases you want the data from the last row to fill the fields that haven't been aggregated as such. This is like the SQL function LAST(). Using the fields from the argument $rh, unless they are the key fields for this group, is generally not a good idea because it adds an extra dependency on the order of modifications to the table. The FIRST() or LAST() (i.e. the context's begin() or last()) are much better and not any more expensive.

$size = $context->groupSize();

returns the number of rows in the group. It's your value of COUNT(*) in SQL terms, and if that's all you need, you don't need to iterate.

$context->send($opcode, $row);

constructs a result rowop and sends it to the aggregator's output label. Remember, the aggregator type as such knows nothing about this label, so the path through the context is the only path. Note also that it takes a row and not a rowop, because a label is needed to construct the rowop in the first place.

$rt = $context->resultType();

provides the result row type needed to construct the result row. There also are a couple of convenience methods that combine the row construction and sending, that can be used instead:

$context->makeHashSend ($opcode, $fieldName => $fieldValue, ...);
$context->makeArraySend($opcode, @fieldValues);

The final thing about the aggregator context: it works only inside the aggregator computation function. Once the function returns, all its methods start returning undef. So there is no point in trying to save it for later in a global variable or such, don't do that.

As you can see, computeAverage1() has the same logic as before, only now it uses the aggregation context. And I've removed the debugging printout of the rows in the group.

The last unexplained piece is the opcode handling and that comparison to OP_NOP. Basically, the table calls the aggregator computation every time something changes in its index. It describes the reason for the call in the argument $aggop (aggregation operation). Depending on how clever an aggregator wants to be, it may do something useful on all of these occasions, or only on some of them. The simple aggregator that doesn't try any smart optimizations but just goes and iterates through the rows every time only needs to react in some of the cases. To make its life easier, Triceps pre-computes the opcode that should be used for the result and puts it into the argument $opcode. So to ignore the non-interesting calls, the simple aggregator computation can just return if it sees the opcode OP_NOP.

Why does it also check for the group size being 0? Again, Triceps provides flexibility in the aggregators. Among other things, it allows you to implement the Coral8-style logic, where on deletion of the last row in the group the aggregator would send a row with all non-key fields set to NULL (it can take the key fields from the argument $rh). So for this specific purpose the computation function gets called with all rows deleted from the group, and $opcode set to OP_INSERT. And, by the way, a true Coral8-styled aggregator would ignore all the calls where the $opcode is not OP_INSERT. But the normal aggregators need to avoid doing this kind of crap, so they have to ignore the calls where $context->groupSize()==0.

And here is an example of the output from that code (as usual, the input lines are in bold):

OP_INSERT,1,AAA,10,10
tWindow.aggrAvgPrice OP_INSERT symbol="AAA" id="1" price="10"
OP_INSERT,3,AAA,20,20
tWindow.aggrAvgPrice OP_DELETE symbol="AAA" id="1" price="10"
tWindow.aggrAvgPrice OP_INSERT symbol="AAA" id="3" price="15"
OP_INSERT,5,AAA,30,30
tWindow.aggrAvgPrice OP_DELETE symbol="AAA" id="3" price="15"
tWindow.aggrAvgPrice OP_INSERT symbol="AAA" id="5" price="25"
OP_DELETE,3
tWindow.aggrAvgPrice OP_DELETE symbol="AAA" id="5" price="25"
tWindow.aggrAvgPrice OP_INSERT symbol="AAA" id="5" price="30"
OP_DELETE,5
tWindow.aggrAvgPrice OP_DELETE symbol="AAA" id="5" price="30"

As you can see, it's exactly the same as from the manual aggregation example with the helper table, minus the debugging printout of the group contents. However here it's done without the helper table: instead the aggregation function is called before and after each update.

This presents a memory vs CPU compromise: a helper table uses more memory but requires less CPU for the aggregation computations (presumably, the insertion of the row into the table is less computationally intensive than the iteration through the original records).

The managed aggregators can be made to work with a helper table too: just chain a helper table to the aggregator's label, and in the aggregator computation add

return if ($opcode == &Triceps::OP_DELETE
  && $context->groupSize() != 1);

This would skip all the DELETEs except for the last one, before the group collapses.

There is also a way to optimize this logic right inside the aggregator: remember the last INSERT row sent, and on DELETE just resend the same row, as will be shown in Section 11.5: “Optimized DELETEs”. This remembered last state can also be used for the other interesting optimizations that will be shown in Section 11.6: “Additive aggregation”.

Which approach is better depends on the particular case. If you need to store the results of aggregation in a table for the future look-ups anyway, then that table is no extra overhead. That's what the Aleri system does internally: since each element in its model keeps a primary-indexed table (materialized view) of the result, that table is used whenever possible to generate the DELETEs without involving any logic. Or the extra optimization inside the aggregator can seriously improve the performance on the large groups. Sometimes you may want both.

Now let's look at the run with the same input that went wrong with the manual aggregation:

OP_INSERT,1,AAA,10,10
tWindow.aggrAvgPrice OP_INSERT symbol="AAA" id="1" price="10"
OP_INSERT,3,AAA,20,20
tWindow.aggrAvgPrice OP_DELETE symbol="AAA" id="1" price="10"
tWindow.aggrAvgPrice OP_INSERT symbol="AAA" id="3" price="15"
OP_INSERT,5,AAA,30,30
tWindow.aggrAvgPrice OP_DELETE symbol="AAA" id="3" price="15"
tWindow.aggrAvgPrice OP_INSERT symbol="AAA" id="5" price="25"
OP_INSERT,5,BBB,30,30
tWindow.aggrAvgPrice OP_DELETE symbol="AAA" id="5" price="25"
tWindow.aggrAvgPrice OP_INSERT symbol="AAA" id="3" price="20"
tWindow.aggrAvgPrice OP_INSERT symbol="BBB" id="5" price="30"
OP_INSERT,7,AAA,40,40
tWindow.aggrAvgPrice OP_DELETE symbol="AAA" id="3" price="20"
tWindow.aggrAvgPrice OP_INSERT symbol="AAA" id="7" price="30"

Here it goes right. Triceps recognizes that the second INSERT with id=5 moves the row to another group. So it performs the aggregation logic for both groups. First for the group where the row gets removed, it updates the aggregator result with a DELETE and INSERT (note that id became 3, since it's now the last row left in that group). Then for the group where the row gets added, and since there was nothing in that group before, it generates only an INSERT.

The handling of the fatal errors (as in die()) in the aggregator functions is an interesting subject. The errors propagate properly through the table, and the table operations confess with the Perl handler's error message. But since an error in the aggregator function means that things are going very, very wrong, after that the table becomes inoperative and will die on all the subsequent operations as well. You need to be very careful in writing these functions.

11.4. Tricks with aggregation on a sliding window

Now it all works as it should, but there is still some room for improvement, related to the way the sliding window limits are handled.

Let's look again at the sample aggregation output with row deletion, copied here for convenience:

OP_INSERT,1,AAA,10,10
tWindow.aggrAvgPrice OP_INSERT symbol="AAA" id="1" price="10"
OP_INSERT,3,AAA,20,20
tWindow.aggrAvgPrice OP_DELETE symbol="AAA" id="1" price="10"
tWindow.aggrAvgPrice OP_INSERT symbol="AAA" id="3" price="15"
OP_INSERT,5,AAA,30,30
tWindow.aggrAvgPrice OP_DELETE symbol="AAA" id="3" price="15"
tWindow.aggrAvgPrice OP_INSERT symbol="AAA" id="5" price="25"
OP_DELETE,3
tWindow.aggrAvgPrice OP_DELETE symbol="AAA" id="5" price="25"
tWindow.aggrAvgPrice OP_INSERT symbol="AAA" id="5" price="30"
OP_DELETE,5
tWindow.aggrAvgPrice OP_DELETE symbol="AAA" id="5" price="30"

When the row with id=3 is deleted, the average price reverts to 30, which is the price of the trade with id=5, not the average of trades with id 1 and 5.

This is because the table is actually a sliding window, with the FIFO index having a limit of 2 rows:

my $ttWindow = Triceps::TableType->new($rtTrade)
  ->addSubIndex("byId",
    Triceps::IndexType->newHashed(key => [ "id" ])
  )
  ->addSubIndex("bySymbol",
    Triceps::IndexType->newHashed(key => [ "symbol" ])
    ->addSubIndex("last2",
      Triceps::IndexType->newFifo(limit => 2)
      ->setAggregator(Triceps::AggregatorType->new(
        $rtAvgPrice, "aggrAvgPrice", undef, \&computeAverage1)
      )
    )
  )
;

When the row with id=5 was inserted, it pushed out the row with id=1. Deleting the record with id=3 does not put that row with id=1 back. You can see the group contents in an even earlier printout with the manual aggregation, also copied here for convenience:

OP_INSERT,1,AAA,10,10
Contents:
  id="1" symbol="AAA" price="10" size="10"
lbAverage OP_INSERT symbol="AAA" id="1" price="10"
OP_INSERT,3,AAA,20,20
Contents:
  id="1" symbol="AAA" price="10" size="10"
  id="3" symbol="AAA" price="20" size="20"
lbAverage OP_INSERT symbol="AAA" id="3" price="15"
OP_INSERT,5,AAA,30,30
Contents:
  id="3" symbol="AAA" price="20" size="20"
  id="5" symbol="AAA" price="30" size="30"
lbAverage OP_INSERT symbol="AAA" id="5" price="25"
OP_DELETE,3
Contents:
  id="5" symbol="AAA" price="30" size="30"
lbAverage OP_INSERT symbol="AAA" id="5" price="30"
OP_DELETE,5
Contents:

Like the toothpaste, once out of the tube, it's not easy to put back. But for this particular kind of toothpaste there is a trick: keep more rows in the group just in case but use only the last few for the actual aggregation. To allow an occasional deletion of a single row, we can keep 3 rows instead of 2.
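The trick can be sketched in Python (illustrative only, not the Triceps API): keep up to 3 rows in the window but average only the last 2, so that deleting one row brings an older row back into play.

```python
# Illustrative sketch of the trick: keep one extra row in the window
# (3 here) but aggregate only over the last 2, so that a single deletion
# can "uncover" the older row again.
from collections import deque

def avg_last2(window):
    """Average the price over the last 2 rows, skipping any extras in front."""
    rows = list(window)[-2:]
    return sum(r["price"] for r in rows) / len(rows)

window = deque(maxlen=3)  # a FIFO with a limit of 3
for row in ({"id": 1, "price": 10.0}, {"id": 3, "price": 20.0},
            {"id": 5, "price": 30.0}):
    window.append(row)
print(avg_last2(window))  # 25.0 - rows 3 and 5

# delete the row with id=3: row 1 is still in the window and comes back
window = deque((r for r in window if r["id"] != 3), maxlen=3)
print(avg_last2(window))  # 20.0 - rows 1 and 5, as desired
```

The skip-then-aggregate loop in the Perl function below does the same thing while iterating through the group once.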

So, change the table definition:

...
      Triceps::IndexType->newFifo(limit => 3)
...

and modify the aggregator function to use only the last 2 rows from the group, even if more are available:

sub computeAverage2 # (table, context, aggop, opcode, rh, state, args...)
{
  my ($table, $context, $aggop, $opcode, $rh, $state, @args) = @_;

  # don't send the NULL record after the group becomes empty
  return if ($context->groupSize()==0
    || $opcode == &Triceps::OP_NOP);

  my $skip = $context->groupSize()-2;
  my $sum = 0;
  my $count = 0;
  for (my $rhi = $context->begin(); !$rhi->isNull();
      $rhi = $context->next($rhi)) {
    if ($skip > 0) {
      $skip--;
      next;
    }
    $count++;
    $sum += $rhi->getRow()->get("price");
  }
  my $rLast = $context->last()->getRow();
  my $avg = $sum/$count;

  my $res = $context->resultType()->makeRowHash(
    symbol => $rLast->get("symbol"),
    id => $rLast->get("id"),
    price => $avg
  );
  $context->send($opcode, $res);
}

The output from this version becomes:

OP_INSERT,1,AAA,10,10
tWindow.aggrAvgPrice OP_INSERT symbol="AAA" id="1" price="10"
OP_INSERT,3,AAA,20,20
tWindow.aggrAvgPrice OP_DELETE symbol="AAA" id="1" price="10"
tWindow.aggrAvgPrice OP_INSERT symbol="AAA" id="3" price="15"
OP_INSERT,5,AAA,30,30
tWindow.aggrAvgPrice OP_DELETE symbol="AAA" id="3" price="15"
tWindow.aggrAvgPrice OP_INSERT symbol="AAA" id="5" price="25"
OP_DELETE,3
tWindow.aggrAvgPrice OP_DELETE symbol="AAA" id="5" price="25"
tWindow.aggrAvgPrice OP_INSERT symbol="AAA" id="5" price="20"
OP_DELETE,5
tWindow.aggrAvgPrice OP_DELETE symbol="AAA" id="5" price="20"
tWindow.aggrAvgPrice OP_INSERT symbol="AAA" id="1" price="10"

Now after OP_DELETE,3 the average price becomes 20, the average of 10 and 30, because the row with id=1 comes into play again. Can you repeat that in the SQLy languages?

This version stores one extra row and thus can handle only one deletion (until the deleted row's spot gets pushed out of the window naturally, then it can handle another). It cannot handle arbitrary modifications properly. If you insert another row with id=3 for the same symbol AAA, the new version will be placed again at the end of the window. If it was the last row anyway, that is fine. But if it was not the last, as in this example, that would be an incorrect order that will produce incorrect results.

But just change the table type definition to aggregate on a sorted index instead of FIFO and it becomes able to handle the updates while keeping the rows in the order of their ids:

my $ttWindow = Triceps::TableType->new($rtTrade)
  ->addSubIndex("byId",
    Triceps::IndexType->newHashed(key => [ "id" ])
  )
  ->addSubIndex("bySymbol",
    Triceps::IndexType->newHashed(key => [ "symbol" ])
    ->addSubIndex("orderById",
      Triceps::SimpleOrderedIndex->new(id => "ASC",)
      ->setAggregator(Triceps::AggregatorType->new(
        $rtAvgPrice, "aggrAvgPrice", undef, \&computeAverage3)
      )
    )
    ->addSubIndex("last3",
      Triceps::IndexType->newFifo(limit => 3))
  )
;

The FIFO index is still there, in parallel, but it doesn't determine the order of rows for aggregation any more. Here is a sample of this version's work:

OP_INSERT,1,AAA,10,10
tWindow.aggrAvgPrice OP_INSERT symbol="AAA" id="1" price="10"
OP_INSERT,3,AAA,20,20
tWindow.aggrAvgPrice OP_DELETE symbol="AAA" id="1" price="10"
tWindow.aggrAvgPrice OP_INSERT symbol="AAA" id="3" price="15"
OP_INSERT,5,AAA,30,30
tWindow.aggrAvgPrice OP_DELETE symbol="AAA" id="3" price="15"
tWindow.aggrAvgPrice OP_INSERT symbol="AAA" id="5" price="25"
OP_DELETE,3
tWindow.aggrAvgPrice OP_DELETE symbol="AAA" id="5" price="25"
tWindow.aggrAvgPrice OP_INSERT symbol="AAA" id="5" price="20"
OP_INSERT,3,AAA,20,20
tWindow.aggrAvgPrice OP_DELETE symbol="AAA" id="5" price="20"
tWindow.aggrAvgPrice OP_INSERT symbol="AAA" id="5" price="25"
OP_INSERT,7,AAA,40,40
tWindow.aggrAvgPrice OP_DELETE symbol="AAA" id="5" price="25"
tWindow.aggrAvgPrice OP_INSERT symbol="AAA" id="7" price="35"

When the row with id=3 gets deleted, the average reverts to the rows 1 and 5. When the row 3 gets inserted back, the average works on rows 3 and 5 again. Then when the row 7 is inserted, the aggregation moves up to the rows 5 and 7.

The row expiration is still controlled by the FIFO index. So after the row 3 is inserted back, the order of rows in the FIFO becomes

1, 5, 3

Then when the row 7 is inserted, it advances to

5, 3, 7

At this point, until the row 3 gets naturally popped out of the FIFO, it's best not to have other deletions nor updates, or the group contents may become incorrect.

The FIFO and Ordered index types work in parallel on the same group, and the Ordered index always keeps the right order:

1, 3, 5
3, 5, 7

As long as the records with the two highest ids are in the group at all, the Ordered index will keep them in the right position at the end.

In this case we could even make a bit of optimization: turn the sorting order around, and have the Ordered index arrange the rows in the descending order. Then instead of skipping the rows until the last two, just take the first two rows of the reverse order. They'll be iterated in the opposite direction but for the averaging it doesn't matter. And instead of the last row take the first row of the opposite order. This is a simple modification and is left as an exercise for the reader.

Thinking further, the sensitivity to the ordering comes largely from the FIFO index. If the replacement policy could be done directly on the Ordered index, it would become easier; that would be a good thing to add in the future. Also, if you keep all the day's trades anyway, you might not need a replacement policy at all: just pick the last 2 records for the aggregation. There is currently no way to iterate back from the end (another thing to add in the future) but the same trick with the opposite order would work.

On a different subject, this table type indexes by id twice: once as a primary index, another time as a nested one. Are both of them really necessary or would just the nested one be good enough? That depends on your input data. If you get the DELETEs like OP_DELETE,3 with all the other fields as NULL, then a separate primary index is definitely needed. But if the DELETEs come exactly as the same records that were inserted, only with a different opcode, like OP_DELETE,3,AAA,20,20 then the primary index can be skipped because the nested sorted index will be able to find the rows correctly and handle them. The bottom line is, the fully correct DELETE records are good.

11.5. Optimized DELETEs

I've already mentioned that the DELETEs coming out of an aggregator do not have to be recalculated every time. Instead the rows can be remembered from the insert time, and simply re-sent with the new opcode. That trades extra memory for CPU time. Of course, this works best when there are many rows per aggregation group: then more CPU time is saved by not iterating through them. How many is many? It depends on the particular case. You'd have to measure. Anyway, here is how it's done:

sub computeAverage4 # (table, context, aggop, opcode, rh, state, args...)
{
  my ($table, $context, $aggop, $opcode, $rh, $state, @args) = @_;

  # don't send the NULL record after the group becomes empty
  return if ($context->groupSize()==0
    || $opcode == &Triceps::OP_NOP);
  if ($opcode == &Triceps::OP_DELETE) {
    $context->send($opcode, $$state);
    return;
  }

  my $sum = 0;
  my $count = 0;
  for (my $rhi = $context->begin(); !$rhi->isNull();
      $rhi = $context->next($rhi)) {
    $count++;
    $sum += $rhi->getRow()->get("price");
  }
  my $rLast = $context->last()->getRow();
  my $avg = $sum/$count;

  my $res = $context->resultType()->makeRowHash(
    symbol => $rLast->get("symbol"),
    id => $rLast->get("id"),
    price => $avg
  );
  ${$state} = $res;
  $context->send($opcode, $res);
}

sub initRememberLast #  (@args)
{
  my $refvar;
  return \$refvar;
}

my $ttWindow = Triceps::TableType->new($rtTrade)
  ->addSubIndex("byId",
    Triceps::IndexType->newHashed(key => [ "id" ])
  )
  ->addSubIndex("bySymbol",
    Triceps::IndexType->newHashed(key => [ "symbol" ])
    ->addSubIndex("last2",
      Triceps::IndexType->newFifo(limit => 2)
      ->setAggregator(Triceps::AggregatorType->new(
        $rtAvgPrice, "aggrAvgPrice", \&initRememberLast, \&computeAverage4)
      )
    )
  )
;

The rest of the example stays the same, so it's not shown. Even in the part that is shown, very little has changed.

The aggregator type now has an initialization function. (This function is not of the same kind as for the sorted index!) This function gets called every time a new aggregation group gets created, before the first row is inserted into it. It initializes the aggregator group's Perl state by creating and returning the state value (the state is per aggregator type, so if there are two parallel index types, each with an aggregator, each aggregator will have its own group state).

The state is stored in the group as a single Perl value, so it usually is a reference to a more complex object. In this case the value returned is a reference to a variable that will contain a Row reference. (Ironically, this simplest case looks a bit more confusing than if it were a reference to an array or hash.) Returning a reference to a my variable is a way to create a reference to an anonymous value: each time my executes, it creates a new value, which is then kept alive by the returned reference after the initialization function returns. The next time the function executes, my will create another new value.
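
This behavior of my can be seen in isolation, in plain Perl without any Triceps:

```perl
sub makeRef
{
  my $var; # a fresh anonymous value on each call
  return \$var;
}

my $r1 = makeRef();
my $r2 = makeRef();
$$r1 = "first";
$$r2 = "second";
print "$$r1 $$r2\n"; # prints "first second": two independent values
```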

The computation function now receives that state as an argument and makes use of it, with two small additions. Before sending a new result row, it remembers that row in the state reference. And before doing any computation it checks whether the requested opcode is DELETE, and if so, simply re-sends the last result with the new opcode. Remember, the rows are not copied but reference-counted, so this is fairly cheap.

The extra level of referencing is used because simply assigning to $state would only change the local variable and not the value kept in the group.

However, if you change the argument of the function directly, that does change the value kept in the group (similar to changing the loop variable in a foreach loop). So you can save a bit of overhead by eliminating the extra indirection. The modified version will be:

sub computeAverage5 # (table, context, aggop, opcode, rh, state, args...)
{
  my ($table, $context, $aggop, $opcode, $rh, $state, @args) = @_;

  # don't send the NULL record after the group becomes empty
  return if ($context->groupSize()==0
    || $opcode == &Triceps::OP_NOP);
  if ($opcode == &Triceps::OP_DELETE) {
    $context->send($opcode, $state);
    return;
  }

  my $sum = 0;
  my $count = 0;
  for (my $rhi = $context->begin(); !$rhi->isNull();
      $rhi = $context->next($rhi)) {
    $count++;
    $sum += $rhi->getRow()->get("price");
  }
  my $rLast = $context->last()->getRow();
  my $avg = $sum/$count;

  my $res = $context->resultType()->makeRowHash(
    symbol => $rLast->get("symbol"),
    id => $rLast->get("id"),
    price => $avg
  );
  $_[5] = $res;
  $context->send($opcode, $res);
}

sub initRememberLast5 #  (@args)
{
  return undef;
}

Even though the initialization function returns undef, it still must be present. If it's not present, the state argument of the computation function will contain a special hardcoded and unmodifiable undef constant, and nothing could be remembered.

And here is an example of its work:

OP_INSERT,1,AAA,10,10
tWindow.aggrAvgPrice OP_INSERT symbol="AAA" id="1" price="10"
OP_INSERT,2,BBB,100,100
tWindow.aggrAvgPrice OP_INSERT symbol="BBB" id="2" price="100"
OP_INSERT,3,AAA,20,20
tWindow.aggrAvgPrice OP_DELETE symbol="AAA" id="1" price="10"
tWindow.aggrAvgPrice OP_INSERT symbol="AAA" id="3" price="15"
OP_INSERT,4,BBB,200,200
tWindow.aggrAvgPrice OP_DELETE symbol="BBB" id="2" price="100"
tWindow.aggrAvgPrice OP_INSERT symbol="BBB" id="4" price="150"
OP_INSERT,5,AAA,30,30
tWindow.aggrAvgPrice OP_DELETE symbol="AAA" id="3" price="15"
tWindow.aggrAvgPrice OP_INSERT symbol="AAA" id="5" price="25"
OP_DELETE,3
tWindow.aggrAvgPrice OP_DELETE symbol="AAA" id="5" price="25"
tWindow.aggrAvgPrice OP_INSERT symbol="AAA" id="5" price="30"
OP_DELETE,5
tWindow.aggrAvgPrice OP_DELETE symbol="AAA" id="5" price="30"

Since the rows are grouped by the symbol, the symbols AAA and BBB will have separate aggregation states.

11.6. Additive aggregation

In some cases the aggregation values don't have to be calculated by going through all the rows from scratch every time. If you do a sum of a field, you can as well add the value of the field when a row is inserted and subtract when a row is deleted. Not surprisingly, this is called an additive aggregation.

The averaging can also be done as an additive aggregation: it amounts to a sum divided by a count. The sum can obviously be done additively. The count is potentially additive too, but even better, we have the shortcut of $context->groupSize(). Well, at least for the same definition of count that has been used previously in the non-additive example. The SQL definition of count (and of average) includes only the non-NULL values, but in the next example we will go with the Perl approach where a NULL is taken to have the same meaning as 0. The proper SQL count could not use that shortcut but would still be additive.

Triceps provides a way to implement the additive aggregation too. It calls the aggregation computation function for each changed row, giving it an opportunity to react. The argument $aggop indicates what has happened. Here is the same example from Section 11.3: “Introducing the proper aggregation” rewritten in an additive way:

sub computeAverage7 # (table, context, aggop, opcode, rh, state, args...)
{
  my ($table, $context, $aggop, $opcode, $rh, $state, @args) = @_;
  my $rowchg;

  if ($aggop == &Triceps::AO_BEFORE_MOD) {
    $context->send($opcode, $state->{lastrow});
    return;
  } elsif ($aggop == &Triceps::AO_AFTER_DELETE) {
    $rowchg = -1;
  } elsif ($aggop == &Triceps::AO_AFTER_INSERT) {
    $rowchg = 1;
  } else { # AO_COLLAPSE, also has opcode OP_DELETE
    return
  }

  $state->{price_sum} += $rowchg * $rh->getRow()->get("price");

  return if ($context->groupSize()==0
    || $opcode == &Triceps::OP_NOP);

  my $rLast = $context->last()->getRow();
  my $count = $context->groupSize();
  my $avg = $state->{price_sum}/$count;
  my $res = $context->resultType()->makeRowHash(
    symbol => $rLast->get("symbol"),
    id => $rLast->get("id"),
    price => $avg
  );
  $state->{lastrow} = $res;

  $context->send($opcode, $res);
}

sub initAverage7 #  (@args)
{
  return { lastrow => undef, price_sum => 0 };
}

The tricks of keeping an extra row from Section 11.4: “Tricks with aggregation on a sliding window” could not be used with the additive aggregation. An additive aggregation relies on Triceps to tell it which rows are deleted and which inserted, so it can not easily do any extra skipping. The index for the aggregation has to be defined with the correct limits. If we want an average of the last 2 rows, we set the limit to 2:

my $ttWindow = Triceps::TableType->new($rtTrade)
  ->addSubIndex("byId",
    Triceps::IndexType->newHashed(key => [ "id" ])
  )
  ->addSubIndex("bySymbol",
    Triceps::IndexType->newHashed(key => [ "symbol" ])
    ->addSubIndex("last2",
      Triceps::IndexType->newFifo(limit => 2)
      ->setAggregator(Triceps::AggregatorType->new(
        $rtAvgPrice, "aggrAvgPrice", \&initAverage7, \&computeAverage7)
      )
    )
  )
;

The aggregation state has grown: now it includes not only the last sent row but also the sum of the price, which is used for the aggregation, kept together in a hash. The last sent row doesn't really have to be kept, and I'll show another example without it, but for now let's look at how things are done when it is kept.

The argument $aggop describes why the computation is being called. Note that Triceps doesn't know whether the aggregation is additive or not; it makes the same calls in every case. It's just that in the previous examples we weren't interested in this information and didn't look at it. $aggop contains one of the constant values:

  • &Triceps::AO_BEFORE_MOD: the group is about to be modified, need to send a DELETE of the old aggregated row. The argument $opcode will always be OP_DELETE.
  • &Triceps::AO_AFTER_DELETE: the group has been modified by deleting a row from it. The argument $rh will refer to the row handle being deleted. The $opcode may be either OP_NOP or OP_INSERT. A single operation on a table may affect multiple rows: an insert may trigger the replacement policy in the indexes and cause one or more rows to be deleted. If there are multiple rows deleted or inserted in a group, the additive aggregator needs to know about all of them to keep its state correct, but does not need to (and even must not) send a new result until the last one of them has been processed. The call for the last modification will have the opcode of OP_INSERT. The preceding intermediate ones will have the opcode of OP_NOP. An important point: even though a row is being deleted from the group, the aggregator opcode is OP_INSERT, because it inserts the new aggregator state!
  • &Triceps::AO_AFTER_INSERT: the group has been modified by inserting a row into it. Same as for AO_AFTER_DELETE, $rh will refer to the row handle being inserted, and $opcode will be OP_NOP or OP_INSERT.
  • &Triceps::AO_COLLAPSE: called after the last row is deleted from the group, just before the whole group is collapsed and deleted. This allows the aggregator to destroy its state properly. For most of the aggregators there is nothing special to be done. The only case when you want to do something is if your state causes some circular references. Perl doesn't free the circular references until the whole interpreter exits, and so you'd have to break the circle to let them be freed immediately. The aggregator should not produce any results on this call. The $opcode will be OP_NOP.

The computation reacts accordingly: for the before-modification it re-sends the old result with the new opcode, for the collapse it does nothing, and for the after-modification calls it calculates the sign that tells whether the value from $rh needs to be added to or subtracted from the sum. I'm actually thinking, maybe this sign should be passed as a separate argument too, and then both the aggregation operation constants AO_AFTER_* can be merged into one. We'll see, maybe it will be changed in the future.

Then the addition/subtraction is done and the state updated.

After that, if the row does not need to be sent (opcode is OP_NOP or group size is 0), the function can as well return here without constructing the new row.

If the row needs to be produced, continue with the same logic as the non-additive aggregator, only without iteration through the group. The id field in the result is produced by essentially the SQL LAST() operator. LAST() and FIRST() are not additive, they refer to the values in the last or first row in the group's order, and simply can not be calculated from looking at which rows are being inserted and deleted without knowing their order in the group. But they are fast as they are, and do not require iteration. The same goes for the row count (as long as we don't care about excluding NULLs, violating the SQL semantics). And for averaging there is the last step to do after the additive part is done: divide the sum by the count.

All these non-additive steps are done in this last section, then the result row is constructed, remembered and sent.

Not all the aggregation operations can be expressed in an additive way. It may even vary with the data. For MAX(), the insertion of a row can always be done additively: just compare the new value with the remembered maximum, and replace it if the new value is greater. The deletion can also compare the deleted value with the remembered maximum. If the deleted value is less, the maximum is unchanged. But if the deleted value is equal to the maximum, MAX() has to iterate through all the values and find the new maximum.
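
To illustrate, here is a sketch of how such a conditionally-additive MAX() might look. It's invented for the illustration, not taken from the Triceps library; the field name "price" and the state layout are made up, and the construction of the result row is elided. The re-scan is deferred to the final call (when the opcode is OP_INSERT), since by then the group contents are settled:

```perl
sub initMax # (@args)
{
  return { max => undef, rescan => 0 };
}

sub computeMax # (table, context, aggop, opcode, rh, state, args...)
{
  my ($table, $context, $aggop, $opcode, $rh, $state, @args) = @_;

  if ($aggop == &Triceps::AO_AFTER_INSERT) {
    # an insert is always additive
    my $v = $rh->getRow()->get("price");
    $state->{max} = $v
      if (!defined $state->{max} || $v > $state->{max});
  } elsif ($aggop == &Triceps::AO_AFTER_DELETE) {
    # deleting the current maximum invalidates the remembered value
    $state->{rescan} = 1
      if (defined $state->{max}
        && $rh->getRow()->get("price") >= $state->{max});
  }

  return if ($context->groupSize()==0
    || $opcode == &Triceps::OP_NOP);

  if ($state->{rescan}) {
    # iterate only on the final call, when the group is consistent
    undef $state->{max};
    for (my $rhi = $context->begin(); !$rhi->isNull();
        $rhi = $context->next($rhi)) {
      my $v = $rhi->getRow()->get("price");
      $state->{max} = $v
        if (!defined $state->{max} || $v > $state->{max});
    }
    $state->{rescan} = 0;
  }

  # ... construct and send the result row with $state->{max} here ...
}
```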

There is also an issue with the floating point precision in the additive aggregation. It's not such a big issue if the rows are only added and never deleted from the group, but can get much worse with the deletion. Let me show it with a sample run of the additive code:

OP_INSERT,1,AAA,1,10
tWindow.aggrAvgPrice OP_INSERT symbol="AAA" id="1" price="1"
OP_INSERT,2,AAA,1e20,20
tWindow.aggrAvgPrice OP_DELETE symbol="AAA" id="1" price="1"
tWindow.aggrAvgPrice OP_INSERT symbol="AAA" id="2" price="5e+19"
OP_INSERT,3,AAA,2,10
tWindow.aggrAvgPrice OP_DELETE symbol="AAA" id="2" price="5e+19"
tWindow.aggrAvgPrice OP_INSERT symbol="AAA" id="3" price="5e+19"
OP_INSERT,4,AAA,3,10
tWindow.aggrAvgPrice OP_DELETE symbol="AAA" id="3" price="5e+19"
tWindow.aggrAvgPrice OP_INSERT symbol="AAA" id="4" price="1.5"

Why is the last result 1.5 when it should have been (2+3)/2 = 2.5? Because adding together 1e20 and 2 had pushed the 2 beyond the precision of a floating-point number: 1e20+2 = 1e20. So when the row with 1e20 was deleted from the group and subtracted from the sum, that left 0 instead of 2. Adding the 3 and averaging over the two rows then produced (0+3)/2 = 1.5.
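
The effect is easy to reproduce in plain Perl, outside of any aggregation:

```perl
my $sum = 0;
$sum += 1e20;
$sum += 2;    # the 2 falls beyond the 53-bit precision: $sum is still 1e20
$sum -= 1e20; # subtracting back leaves 0, not 2
print "$sum\n"; # prints 0
```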

Of course, with the real stock prices there won't be that much variation. But the subtler errors will still accumulate over time, and you have to expect them and plan accordingly.

Switching to a different subject, the additive aggregation contains enough information in its state to generate the result rows quickly without an iteration. This means that keeping the saved result row for DELETEs doesn't give a whole lot of advantage and adds at least a little memory overhead. We can change the code and avoid keeping it:

sub computeAverage8 # (table, context, aggop, opcode, rh, state, args...)
{
  my ($table, $context, $aggop, $opcode, $rh, $state, @args) = @_;
  my $rowchg;

  if ($aggop == &Triceps::AO_COLLAPSE) {
    return
  } elsif ($aggop == &Triceps::AO_AFTER_DELETE) {
    $state->{price_sum} -= $rh->getRow()->get("price");
  } elsif ($aggop == &Triceps::AO_AFTER_INSERT) {
    $state->{price_sum} += $rh->getRow()->get("price");
  }
  # on AO_BEFORE_MOD do nothing

  return if ($context->groupSize()==0
    || $opcode == &Triceps::OP_NOP);

  my $rLast = $context->last()->getRow();
  my $count = $context->groupSize();

  $context->makeHashSend($opcode,
    symbol => $rLast->get("symbol"),
    id => $rLast->get("id"),
    price => $state->{price_sum}/$count,
  );
}

sub initAverage8 #  (@args)
{
  return { price_sum => 0 };
}

On AO_BEFORE_MOD it doesn't do any change to the additive state but then produces the result row from that state as usual, using the supplied $opcode value of OP_DELETE. The other change in this example is that the sum gets directly added or subtracted in AO_AFTER_* instead of computing the sign first. It's all pretty much self-explanatory.

11.7. Computation function arguments

Let's look closely at the calls made to the aggregation computation function. Just make a computation that prints its call arguments:

sub computeAverage9 # (table, context, aggop, opcode, rh, state, args...)
{
  my ($table, $context, $aggop, $opcode, $rh, $state, @args) = @_;

  print(&Triceps::aggOpString($aggop), " ", &Triceps::opcodeString($opcode), " ", $context->groupSize(), " ", (!$rh->isNull()? $rh->getRow()->printP(): "NULL"), "\n");
}

It prints the aggregation operation, the result opcode, row count in the group, and the argument row (or NULL). The aggregation is done as before, on the same FIFO index with the size limit of 2.

To show the order of the aggregator calls relative to the table label calls, I've added the labels that print the updates from the table:

my $lbPre = makePrintLabel("lbPre", $tWindow->getPreLabel());
my $lbOut = makePrintLabel("lbOut", $tWindow->getOutputLabel());

To make keeping track of the printout easier, I broke up the sequence into multiple fragments, with a description after each fragment:

OP_INSERT,1,AAA,10,10
tWindow.pre OP_INSERT id="1" symbol="AAA" price="10" size="10"
tWindow.out OP_INSERT id="1" symbol="AAA" price="10" size="10"
AO_AFTER_INSERT OP_INSERT 1 id="1" symbol="AAA" price="10" size="10"
OP_INSERT,2,BBB,100,100
tWindow.pre OP_INSERT id="2" symbol="BBB" price="100" size="100"
tWindow.out OP_INSERT id="2" symbol="BBB" price="100" size="100"
AO_AFTER_INSERT OP_INSERT 1 id="2" symbol="BBB" price="100" size="100"

The INSERT of the first row in each group causes only one call. There is no previous value to delete, only a new one to insert. The call happens after the row has been inserted into the group.

OP_INSERT,3,AAA,20,20
AO_BEFORE_MOD OP_DELETE 1 NULL
tWindow.pre OP_INSERT id="3" symbol="AAA" price="20" size="20"
tWindow.out OP_INSERT id="3" symbol="AAA" price="20" size="20"
AO_AFTER_INSERT OP_INSERT 2 id="3" symbol="AAA" price="20" size="20"

Adding the second record in a group means that the aggregation result for this group is modified. So first the aggregator is called to delete the old result, then the new row gets inserted, and the aggregator is called the second time to produce its new result.

OP_INSERT,5,AAA,30,30
AO_BEFORE_MOD OP_DELETE 2 NULL
tWindow.pre OP_DELETE id="1" symbol="AAA" price="10" size="10"
tWindow.out OP_DELETE id="1" symbol="AAA" price="10" size="10"
tWindow.pre OP_INSERT id="5" symbol="AAA" price="30" size="30"
tWindow.out OP_INSERT id="5" symbol="AAA" price="30" size="30"
AO_AFTER_DELETE OP_NOP 2 id="1" symbol="AAA" price="10" size="10"
AO_AFTER_INSERT OP_INSERT 2 id="5" symbol="AAA" price="30" size="30"

The insertion of the third row in a group triggers the replacement policy in the FIFO index. The replacement policy causes the row with id=1 to be deleted before the row with id=5 is inserted. For the aggregator result it's still a single delete-insert pair: first, before the modification, the old aggregation result is deleted. Then the contents of the group gets modified with both the delete and the insert. And then the aggregator gets told what has been modified. The deletion of the row with id=1 is not the last step, so that call gets the opcode of OP_NOP. Note that the group size with it is 2, not 1. That's because the aggregator gets notified only after all the modifications are already done. So the additive part of the computation must never read the group size or do any kind of iteration through the group, because that would often produce an incorrect result: it has no way to tell what other modifications have already been done to the group. The last AO_AFTER_INSERT gets the opcode of OP_INSERT, which tells the computation to send the new result of the aggregation. When the opcode is OP_INSERT, reading the group size and the other group information becomes safe, because by this time all the modifications are guaranteed to be done, and the additive notifications have caught up with all the changes.

OP_INSERT,3,BBB,20,20
AO_BEFORE_MOD OP_DELETE 2 NULL
AO_BEFORE_MOD OP_DELETE 1 NULL
tWindow.pre OP_DELETE id="3" symbol="AAA" price="20" size="20"
tWindow.out OP_DELETE id="3" symbol="AAA" price="20" size="20"
tWindow.pre OP_INSERT id="3" symbol="BBB" price="20" size="20"
tWindow.out OP_INSERT id="3" symbol="BBB" price="20" size="20"
AO_AFTER_DELETE OP_INSERT 1 id="3" symbol="AAA" price="20" size="20"
AO_AFTER_INSERT OP_INSERT 2 id="3" symbol="BBB" price="20" size="20"

This insert is of a dirty kind, the one that replaces the row using the replacement policy of the hashed primary index, without deleting its old state first. It also moves the row from one aggregation group to another. So the table logic calls AO_BEFORE_MOD for each of the modified groups, then modifies the contents of the groups, then tells both groups about the modifications. In this case both calls with AO_AFTER_* have the opcode of OP_INSERT because each of them is the last and only change to a separate aggregation group.

OP_DELETE,5
AO_BEFORE_MOD OP_DELETE 1 NULL
tWindow.pre OP_DELETE id="5" symbol="AAA" price="30" size="30"
tWindow.out OP_DELETE id="5" symbol="AAA" price="30" size="30"
AO_AFTER_DELETE OP_INSERT 0 id="5" symbol="AAA" price="30" size="30"
AO_COLLAPSE OP_NOP 0 NULL

This operation removes the last row in a group. It starts as usual with the deletion of the old state. The following AO_AFTER_DELETE with OP_INSERT is intended for the Coral8-style aggregators that produce only the rows with the INSERT opcodes, never DELETEs, to let them insert the NULL (or zero) values in all the non-key fields. For the normal aggregators all the work is done after OP_DELETE. That's why all the shown examples were checking for $context->groupSize() == 0 and returning if so. The group size will be zero in absolutely no other case than after the deletion of the last row. Finally, AO_COLLAPSE lets the aggregator clean up its group state if it needs any cleaning. It has the opcode OP_NOP because no rows need to be sent.

To recap, the high-level order of the table operation processing is:

  1. Execute the replacement policies on all the indexes to find all the rows that need to be deleted first.
  2. If any of the index policies forbid the modification, return 0.
  3. Call all the aggregators with AO_BEFORE_MOD on all the affected rows.
  4. Send these aggregator results.
  5. For each affected row:

    1. Call the "pre" label (if it has any labels chained to it).
    2. Modify the row in the table.
    3. Call the "out" label.
  6. Call all the aggregators with AO_AFTER_*, on all the affected rows.
  7. Send these aggregator results.

11.8. Using multiple indexes

I've mentioned before that the floating-point numbers are tricky to handle. Even without the additive aggregation the result depends on the rounding, which in turn depends on the order in which the operations are done. Let's look at a version of the aggregation code that highlights this issue.

sub computeAverage10 # (table, context, aggop, opcode, rh, state, args...)
{
  my ($table, $context, $aggop, $opcode, $rh, $state, @args) = @_;

  # don't send the NULL record after the group becomes empty
  return if ($context->groupSize()==0
    || $opcode != &Triceps::OP_INSERT);

  my $sum = 0;
  my $count = 0;
  for (my $rhi = $context->begin(); !$rhi->isNull();
      $rhi = $context->next($rhi)) {
    $count++;
    $sum += $rhi->getRow()->get("price");
  }
  my $rLast = $context->last()->getRow();
  my $avg = $sum/$count;

  my $res = $context->resultType()->makeRowHash(
    symbol => $rLast->get("symbol"),
    id => $rLast->get("id"),
    price => $avg
  );
  $context->send($opcode, $res);
}

my $ttWindow = Triceps::TableType->new($rtTrade)
  ->addSubIndex("byId",
    Triceps::IndexType->newHashed(key => [ "id" ])
  )
  ->addSubIndex("bySymbol",
    Triceps::IndexType->newHashed(key => [ "symbol" ])
    ->addSubIndex("last4",
      Triceps::IndexType->newFifo(limit => 4)
      ->setAggregator(Triceps::AggregatorType->new(
        $rtAvgPrice, "aggrAvgPrice", undef, \&computeAverage10)
      )
    )
  )
;
$ttWindow->initialize();
my $tWindow = $uTrades->makeTable($ttWindow, "tWindow");

# label to print the result of aggregation
my $lbAverage = $uTrades->makeLabel($rtAvgPrice, "lbAverage",
  undef, sub { # (label, rowop)
    printf("%.17g\n", $_[1]->getRow()->get("price"));
  });
$tWindow->getAggregatorLabel("aggrAvgPrice")->chain($lbAverage);

The differences from the previously shown basic aggregation are:

  • the FIFO limit has been increased to 4;
  • the only result value printed by the $lbAverage handler is the price, and it's printed with a higher precision to make the difference visible;
  • the aggregator computation only does the inserts, to reduce the clutter in the results and highlight the issue.

And here is an example of how the order of computation matters:

OP_INSERT,1,AAA,1,10
1
OP_INSERT,2,AAA,1,10
1
OP_INSERT,3,AAA,1,10
1
OP_INSERT,4,AAA,1e16,10
2500000000000001
OP_INSERT,5,BBB,1e16,10
10000000000000000
OP_INSERT,6,BBB,1,10
5000000000000000
OP_INSERT,7,BBB,1,10
3333333333333333.5
OP_INSERT,8,BBB,1,10
2500000000000000

Of course, the real prices won't vary so wildly. But the other values could. This example is specially stacked to demonstrate the point. The final results for AAA and BBB should be the same but aren't. Why? The precision of the 64-bit floating-point numbers is such that adding 1 to 1e16 makes this 1 fall beyond the precision, and the result is still 1e16. On the other hand, adding 3 to 1e16 makes at least a part of it stick: the 1 still falls off, but the other 2 of the 3 stick on. Now look at the data sets: adding 1e16+1+1+1 means adding 1 to 1e16, repeated three times, and the result is still the unchanged 1e16. But adding 1+1+1+1e16 means adding 3+1e16, and now the result is different and more correct. When the averages get computed from these different sums by dividing them by 4, the results are also different.
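
The difference can be reproduced in plain Perl with the same values:

```perl
my @prices = (1e16, 1, 1, 1); # the arrival order for BBB
my $straight = 0;
$straight += $_ for (@prices); # 1e16+1 three times: each 1 is lost
my $sorted = 0;
$sorted += $_ for (sort { $a <=> $b } @prices); # 1+1+1 first, then 1e16
printf("%.17g\n%.17g\n", $straight, $sorted);
# the straight sum stays at 1e16, the sorted one keeps a part of the 3
```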

Overall the rule of thumb for adding the floating point numbers is this: add them up in the order from the smallest to the largest. (What if the numbers can be negative too? I don't know, that goes beyond my knowledge of floating point calculations. My guess is that you still arrange them in the ascending order, only by the absolute value.) So let's do it in the aggregator.

our $idxByPrice;

# aggregation handler: sum in proper order
sub computeAverage11 # (table, context, aggop, opcode, rh, state, args...)
{
  my ($table, $context, $aggop, $opcode, $rh, $state, @args) = @_;
  our $idxByPrice;

  # don't send the NULL record after the group becomes empty
  return if ($context->groupSize()==0
    || $opcode != &Triceps::OP_INSERT);

  my $sum = 0;
  my $count = 0;
  my $end = $context->endIdx($idxByPrice);
  for (my $rhi = $context->beginIdx($idxByPrice); !$rhi->same($end);
      $rhi = $rhi->nextIdx($idxByPrice)) {
    $count++;
    $sum += $rhi->getRow()->get("price");
  }
  my $rLast = $context->last()->getRow();
  my $avg = $sum/$count;

  my $res = $context->resultType()->makeRowHash(
    symbol => $rLast->get("symbol"),
    id => $rLast->get("id"),
    price => $avg
  );
  $context->send($opcode, $res);
}

my $ttWindow = Triceps::TableType->new($rtTrade)
  ->addSubIndex("byId",
    Triceps::IndexType->newHashed(key => [ "id" ])
  )
  ->addSubIndex("bySymbol",
    Triceps::IndexType->newHashed(key => [ "symbol" ])
    ->addSubIndex("last4",
      Triceps::IndexType->newFifo(limit => 4)
      ->setAggregator(Triceps::AggregatorType->new(
        $rtAvgPrice, "aggrAvgPrice", undef, \&computeAverage11)
      )
    )
    ->addSubIndex("byPrice",
      Triceps::SimpleOrderedIndex->new(price => "ASC",)
      ->addSubIndex("multi", Triceps::IndexType->newFifo())
    )
  )
;
$ttWindow->initialize();
my $tWindow = $uTrades->makeTable($ttWindow, "tWindow");

$idxByPrice = $ttWindow->findIndexPath("bySymbol", "byPrice");

Here another index type is added, ordered by price. It has to be non-leaf, with a FIFO index type nested in it, to allow for multiple rows having the same price in them. That would work out more efficiently if the ordered index could have a multimap mode, but that is not supported yet.

When the compute function does its iteration, it now goes by that index. The aggregator can't be simply moved to that new index type, because it still needs to get the last trade id in the order in which the rows are inserted into the group. Instead it has to work with two index types: the one on which the aggregator is defined, and the additional one. The calls for iteration on an additional index are different. $context->beginIdx() is similar to $context->begin() but the end condition and the next step are done differently. When $rhi->nextIdx() reaches the end of the group, it returns not a NULL row handle but a handle value that has to be found in advance with $context->endIdx(). Perhaps the consistency in this department can be improved in the future.

And finally, the reference to that additional index type has to make it somehow into the compute function. It can't be given as an argument because it's not known yet at the time when the aggregator is constructed (and no, reordering the index types won't help because the index types are copied when connected to their parents, and we need the exact index type that ends up in the assembled table type). So a global variable $idxByPrice is used. The index type reference is found and placed there, and later when the compute function runs, it takes the reference from the global variable.

The printout from this version on the same input is:

OP_INSERT,1,AAA,1,10
1
OP_INSERT,2,AAA,1,10
1
OP_INSERT,3,AAA,1,10
1
OP_INSERT,4,AAA,1e16,10
2500000000000001
OP_INSERT,5,BBB,1e16,10
10000000000000000
OP_INSERT,6,BBB,1,10
5000000000000000
OP_INSERT,7,BBB,1,10
3333333333333334
OP_INSERT,8,BBB,1,10
2500000000000001

Now no matter what the order of the row arrival, the prices get added up in the same order, from the smallest to the largest, and produce the same correct (inasmuch as the floating-point precision allows) result.

Which index type the aggregator is placed on doesn't matter a whole lot. The computation can be turned around, with the ordered index used as the main one, and the last value in the insertion order obtained from the FIFO index with $context->lastIdx():

our $idxByOrder;

# aggregation handler: sum in proper order
sub computeAverage12 # (table, context, aggop, opcode, rh, state, args...)
{
  my ($table, $context, $aggop, $opcode, $rh, $state, @args) = @_;
  our $idxByOrder;

  # don't send the NULL record after the group becomes empty
  return if ($context->groupSize()==0
    || $opcode != &Triceps::OP_INSERT);

  my $sum = 0;
  my $count = 0;
  for (my $rhi = $context->begin(); !$rhi->isNull();
      $rhi = $context->next($rhi)) {
    $count++;
    $sum += $rhi->getRow()->get("price");
  }
  my $rLast = $context->lastIdx($idxByOrder)->getRow();
  my $avg = $sum/$count;

  my $res = $context->resultType()->makeRowHash(
    symbol => $rLast->get("symbol"),
    id => $rLast->get("id"),
    price => $avg
  );
  $context->send($opcode, $res);
}

my $ttWindow = Triceps::TableType->new($rtTrade)
  ->addSubIndex("byId",
    Triceps::IndexType->newHashed(key => [ "id" ])
  )
  ->addSubIndex("bySymbol",
    Triceps::IndexType->newHashed(key => [ "symbol" ])
    ->addSubIndex("last4",
      Triceps::IndexType->newFifo(limit => 4)
    )
    ->addSubIndex("byPrice",
      Triceps::SimpleOrderedIndex->new(price => "ASC",)
      ->addSubIndex("multi", Triceps::IndexType->newFifo())
      ->setAggregator(Triceps::AggregatorType->new(
        $rtAvgPrice, "aggrAvgPrice", undef, \&computeAverage12)
      )
    )
  )
;
$ttWindow->initialize();
my $tWindow = $uTrades->makeTable($ttWindow, "tWindow");

$idxByOrder = $ttWindow->findIndexPath("bySymbol", "last4");

The last important note: when aggregating with multiple indexes, always use the sibling index types forming the same group, or their nested sub-indexes (since the actual order is defined by the first leaf sub-index anyway). Don't use random unrelated index types: if you do, the context will return unexpected values for them, and you may end up with endless loops.

11.9. SimpleAggregator

Even though writing the aggregation computation functions manually gives the full flexibility, it's too much work for the simple cases. The SimpleAggregator template takes care of most of that work and allows you to specify the aggregation in a way similar to SQL. It has already been shown in the VWAP example, and here is the trade aggregation example from Section 11.3: “Introducing the proper aggregation” rewritten with SimpleAggregator:

my $ttWindow = Triceps::TableType->new($rtTrade)
  ->addSubIndex("byId",
    Triceps::IndexType->newHashed(key => [ "id" ])
  )
  ->addSubIndex("bySymbol",
    Triceps::IndexType->newHashed(key => [ "symbol" ])
    ->addSubIndex("last2",
      Triceps::IndexType->newFifo(limit => 2)
    )
  )
;

# the aggregation result
my $rtAvgPrice;
my $compText; # for debugging

Triceps::SimpleAggregator::make(
  tabType => $ttWindow,
  name => "aggrAvgPrice",
  idxPath => [ "bySymbol", "last2" ],
  result => [
    symbol => "string", "last", sub {$_[0]->get("symbol");},
    id => "int32", "last", sub {$_[0]->get("id");},
    price => "float64", "avg", sub {$_[0]->get("price");},
  ],
  saveRowTypeTo => \$rtAvgPrice,
  saveComputeTo => \$compText,
);

$ttWindow->initialize();
my $tWindow = $uTrades->makeTable($ttWindow, "tWindow");

# label to print the result of aggregation
my $lbAverage = makePrintLabel("lbAverage",
  $tWindow->getAggregatorLabel("aggrAvgPrice"));

The main loop and the printing is the same as before. The result produced is also exactly the same as before.

But the aggregator is created with Triceps::SimpleAggregator::make(). Its arguments are in the option format: the option name-value pairs, in any order.

$tabType = Triceps::SimpleAggregator::make($optName => $optValue, ...);

It returns the table type that it received as an option. But most of the time there is not a whole lot of use to that return value, and it gets simply ignored. Most of the options are actually mandatory. The aggregator type is connected to the table type with the options:

tabType
The table type to put the aggregator on. It must not be initialized yet.
idxPath
A reference to an array of index names, forming the path to the index where the aggregator type will be set.
name
The aggregator type name.

The result row type and computation is defined with the option result: each group of four values in that array defines one result field:

  • The field name.
  • The field type.
  • The aggregation function name used to compute the field. There is no way to combine multiple aggregation functions or even an aggregation function and any arithmetics in a field computation. The workaround is to compute each function in a separate field, and then send the result rows to a computational label that would arithmetically combine these fields into one.
  • A closure that extracts the aggregation function argument from the row (well, it can be any function reference, doesn't have to be an anonymous closure). That closure gets the row as the argument $_[0] and returns the extracted value to run the aggregation on.

The field name is by convention separated from its definition fields by =>. Remember, it's just a convention: for Perl, a => is just as good as a comma.
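As a plain-Perl aside (independent of Triceps), the two spellings really do produce identical lists:

```perl
use strict;
use warnings;

# => is just a "fat comma": it quotes the bareword on its left
my @with_arrow = ( symbol => "string", "last" );
my @plain      = ( "symbol", "string", "last" );

print "@with_arrow" eq "@plain" ? "same\n" : "different\n";    # prints "same"
```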

SimpleAggregator::make() automatically generates the result row type and aggregation function, creates an aggregator type from them, and sets it on the index type. The information about the aggregation result can be found by traversing through the index type tree, or by constructing a table and getting the row type from the aggregator result label. However it's often easier to save it during construction, and the option (this time an optional one!) saveRowTypeTo allows you to do this. Give it a reference to a variable, and the row type will be placed into that variable.

Most of the time the things would just work. However if they don't and something dies in the aggregator, you will need the source code of the compute function to make sense of these errors. The option saveComputeTo gives a variable to save that source code for future perusal and other entertainment. Here is the compute function that gets produced by the example above (it gets implicitly wrapped in a sub { ... }, like any other source code argument):

  use strict;
  my ($table, $context, $aggop, $opcode, $rh, $state, @args) = @_;
  return if ($context->groupSize()==0 || $opcode == &Triceps::OP_NOP);
  my $v2_count = 0;
  my $v2_sum = 0;
  my $npos = 0;
  for (my $rhi = $context->begin(); !$rhi->isNull(); $rhi = $context->next($rhi)) {
    my $row = $rhi->getRow();
    # field price=avg
    my $a2 = $args[2]($row);
    { if (defined $a2) { $v2_sum += $a2; $v2_count++; }; }
    $npos++;
  }
  my $rowLast = $context->last()->getRow();
  my $l0 = $args[0]($rowLast);
  my $l1 = $args[1]($rowLast);
  $context->makeArraySend($opcode,
    ($l0), # symbol
    ($l1), # id
    (($v2_count == 0? undef : $v2_sum / $v2_count)), # price
  );

At the moment the compute function is quite straightforward and just does the aggregation from scratch every time. It doesn't support the additive aggregation nor the DELETE optimization. It's only smart enough to skip the iteration if the result consists of nothing but the aggregation functions first, last and count_star. It receives the closures for the argument extraction in @args; SimpleAggregator arranges these arguments when it creates the aggregator.

The aggregation functions available at the moment are:

first
Value from the first row in the group.
last
Value from the last row in the group.
count_star
Number of rows in the group, like SQL COUNT(*). Since there is no argument for this function, use undef instead of the argument closure.
sum
Sum of the values.
max
The maximal value.
min
The minimal value.
avg
The average of all the non-NULL values.
avg_perl
The average of all the values, with the NULL values treated in the Perl fashion as zeroes. So, technically, when the example above used avg, it works the same as the previous versions only as long as there are no NULL values in the field. To be really the same, it should have used avg_perl.
nth_simple
The Nth value from the start of the group. This is a tricky function because it needs two arguments: the value of N and the field selector. Multiple direct arguments will be supported in the future but right now it works through a workaround: the argument closure must return not just the extracted field but a reference to an array of two values, the N and the field. For example, sub { [1, $_[0]->get("id")];}. The N is counted starting from 0, so the value of 1 will return the second record. This function works in a fairly simple-minded and inefficient way at the moment.

As usual in Triceps and Perl, the case of the aggregation function name matters. The names have to be used in lowercase as shown. There will be more functions to come, and you can even already add your own, as has been shown in Section 11.1: “The ubiquitous VWAP” .
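The difference between avg and avg_perl can be shown with a plain-Perl sketch, independent of Triceps, over an assumed group of three values with one NULL (the `// 0` stands in for Perl's treatment of undef as 0 in addition, without the warning):

```perl
use strict;
use warnings;

# an assumed group of extracted values, one of them NULL (undef)
my @vals = (10, undef, 20);

# avg: skip the NULLs, divide by the count of the non-NULL values
my ($sum, $count) = (0, 0);
for my $v (@vals) {
    if (defined $v) { $sum += $v; $count++; }
}
my $avg = $count == 0 ? undef : $sum / $count;

# avg_perl: treat the NULLs as zeroes, divide by the group size
my $sum2 = 0;
$sum2 += ($_ // 0) for @vals;
my $avg_perl = $sum2 / scalar(@vals);

print "avg=$avg avg_perl=$avg_perl\n";    # prints "avg=15 avg_perl=10"
```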

The user-defined aggregation functions are defined with the option functions. Let's take another look at the code from the VWAP example:

# VWAP function definition
my $myAggFunctions = {
  myvwap => {
    vars => { sum => 0, count => 0, size => 0, price => 0 },
    step => '($%size, $%price) = @$%argiter; '
      . 'if (defined $%size && defined $%price) '
        . '{$%count += $%size; $%sum += $%size * $%price;}',
    result => '($%count == 0? undef : $%sum / $%count)',
  },
};

...

Triceps::SimpleAggregator::make(
  functions => $myAggFunctions,
);

The definition of the functions is a reference to a hash, keyed by the function name. Each function definition is in turn a hash of options, keyed by the option name. When the SimpleAggregator builds the common computation function, it assembles the code by tying together the code fragments from these options. Whenever the group changes, the aggregator will reset the function state variables to the default values and iterate through the new contents of the group. It will perform the step computation for each row and collect the data in the intermediate variables. After the iteration it will perform the result computation of all the functions and produce the final value.

The expected format of the values of these options varies with the option. The option result is mandatory, the rest can be skipped if not needed. The supported options are:

argcount
Integer. Defines the number of arguments of the function, which may currently be 0 or 1, with 1 being the default. If this option is 0, SimpleAggregator will check that the argument closure is undef. If the aggregation function needs more arguments than one, they have to be packed into an array or hash, and then its reference used as a single argument. The standard function nth_simple and the VWAP function provide the examples of how to do this.
vars
Reference to a hash. Defines the variables used to keep the context of this function during the iteration (the hash keys are the variable names) and their initial values (specified as the values in the hash).
step

String. The code fragment to compute a single step of iteration. It can refer to the variables defined in vars and to a few of the pre-defined values using the syntax $%name (which has been chosen because it's illegal in the normal Perl variable syntax). When SimpleAggregator generates the code, it creates the actual scope variables for everything defined in vars, then substitutes them for the $% syntax in the string and inserts the result into its group iteration code.

If this option is not defined, SimpleAggregator assumes that this function doesn't need it. If no functions in the aggregation define the step, the iteration does not get included into the generated code at all.

The defined special values are:

  • $%argiter - The function's argument extracted from the current row.
  • $%niter - The number of the current row in the group, starting from 0.
  • $%groupsize - The size of the group ($context->groupSize()).
result

String. The code fragment to compute the result of the function. This option is mandatory. Works in the same way as step, only gets executed once per call of the computation function, and the defined special values are different:

  • $%argfirst - The function's argument extracted from the first row.
  • $%arglast - The function's argument extracted from the last row.
  • $%groupsize - The size of the group ($context->groupSize()).
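The $%name expansion is plain text substitution over these code fragments. Here is a self-contained plain-Perl sketch of it, with an assumed field id of 2 (the names $a2, $npos and the groupSize() call match the style of the generated code shown earlier):

```perl
use strict;
use warnings;

# a step fragment, as it would appear in an aggregation function definition
my $step = 'if (defined $%argiter) { $%sum += $%argiter; $%count++; }';

my $id   = 2;                      # assumed numeric id of the result field
my %vars = (sum => 0, count => 0); # the function's state variables

# expand each $%name into the generated variable name
$step =~ s/\$\%(\w+)/
      $1 eq 'argiter'   ? "\$a${id}"
    : $1 eq 'niter'     ? "\$npos"
    : $1 eq 'groupsize' ? "\$context->groupSize()"
    : exists $vars{$1}  ? "\$v${id}_$1"
    : die "unknown variable \$\%$1"
/gex;

print "$step\n";
# if (defined $a2) { $v2_sum += $a2; $v2_count++; }
```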

I can think of many ways the SimpleAggregator can be improved, but for now they have been pushed into the future to keep it simple.

11.10. The guts of SimpleAggregator

The implementation of the SimpleAggregator has turned out to be surprisingly small. Not quite tiny but still small. I've liked it so much that I've even saved the original small version in the file xSimpleAggregator.t. As more features are added, the official version of the SimpleAggregator will grow (and already has) but that example file will stay small and simple.

It's a nice example of yet another kind of template that I want to present. I'm going to go through it, interlacing the code with the commentary.

package MySimpleAggregator;
use Carp;

use strict;

our $FUNCTIONS = {
  first => {
    result => '$%argfirst',
  },
  last => {
    result => '$%arglast',
  },
  count_star => {
    argcount => 0,
    result => '$%groupsize',
  },
  count => {
    vars => { count => 0 },
    step => '$%count++ if (defined $%argiter);',
    result => '$%count',
  },
  sum => {
    vars => { sum => 0 },
    step => '$%sum += $%argiter;',
    result => '$%sum',
  },
  max => {
    vars => { max => 'undef' },
    step => '$%max = $%argiter if (!defined $%max || $%argiter > $%max);',
    result => '$%max',
  },
  min => {
    vars => { min => 'undef' },
    step => '$%min = $%argiter if (!defined $%min || $%argiter < $%min);',
    result => '$%min',
  },
  avg => {
    vars => { sum => 0, count => 0 },
    step => 'if (defined $%argiter) { $%sum += $%argiter; $%count++; }',
    result => '($%count == 0? undef : $%sum / $%count)',
  },
  avg_perl => { # Perl-like treat the NULLs as 0s
    vars => { sum => 0 },
    step => '$%sum += $%argiter;',
    result => '$%sum / $%groupsize',
  },
  nth_simple => { # inefficient, need proper multi-args for better efficiency
    vars => { n => 'undef', tmp => 'undef', val => 'undef' },
    step => '($%n, $%tmp) = @$%argiter; if ($%n == $%niter) { $%val = $%tmp; }',
    result => '$%val',
  },
};

The package name of this saved simple version is MySimpleAggregator, to avoid confusion with the official SimpleAggregator class. First goes the definition of the aggregation functions. They are defined in exactly the same way as the vwap function has been shown before. They are fairly straightforward. You can use them as the starting point for adding your own.

sub make # (optName => optValue, ...)
{
  my $opts = {}; # the parsed options
  my $myname = "MySimpleAggregator::make";

  &Triceps::Opt::parse("MySimpleAggregator", $opts, {
      tabType => [ undef, sub { &Triceps::Opt::ck_mandatory(@_); &Triceps::Opt::ck_ref(@_, "Triceps::TableType") } ],
      name => [ undef, \&Triceps::Opt::ck_mandatory ],
      idxPath => [ undef, sub { &Triceps::Opt::ck_mandatory(@_); &Triceps::Opt::ck_ref(@_, "ARRAY", "") } ],
      result => [ undef, sub { &Triceps::Opt::ck_mandatory(@_); &Triceps::Opt::ck_ref(@_, "ARRAY") } ],
      saveRowTypeTo => [ undef, sub { &Triceps::Opt::ck_refscalar(@_) } ],
      saveInitTo => [ undef, sub { &Triceps::Opt::ck_refscalar(@_) } ],
      saveComputeTo => [ undef, sub { &Triceps::Opt::ck_refscalar(@_) } ],
    }, @_);

The options get parsed. Since it's not a proper object constructor but a factory, it uses the hash $opts instead of $self to save the processed copy of the options. This early version doesn't have an option for the user-supplied aggregation function definitions.

  # reset the saved source code
  ${$opts->{saveInitTo}} = undef if (defined($opts->{saveInitTo}));
  ${$opts->{saveComputeTo}} = undef if (defined($opts->{saveComputeTo}));
  ${$opts->{saveRowTypeTo}} = undef if (defined($opts->{saveRowTypeTo}));

The generated source code will not be placed into the save* references until the table type gets initialized, so for the meantime they get filled with undefs.

  # find the index type, on which to build the aggregator
  my $idx = $opts->{tabType}->findIndexPath(@{$opts->{idxPath}});
  confess "$myname: the index type is already initialized, can not add an aggregator on it"
    if ($idx->isInitialized());

Since the SimpleAggregator uses an existing table with an existing index, it doesn't require a separate aggregation key: it just takes an index that forms the group, and whatever key leads to this index becomes the aggregation key.

  # check the result definition and build the result row type and code snippets for the computation
  my $rtRes;
  my $needIter = 0; # flag: some of the functions require iteration
  my $needfirst = 0; # the result needs the first row of the group
  my $needlast = 0; # the result needs the last row of the group
  my $codeInit = ''; # code for function initialization
  my $codeStep = ''; # code for iteration
  my $codeResult = ''; # code to compute the intermediate values for the result
  my $codeBuild = ''; # code to build the result row
  my @compArgs; # the field functions are passed as args to the computation
  {
    my $grpstep = 4; # definition grouped by 4 items per result field
    my @resopt = @{$opts->{result}};
    my @rtdefRes; # field definition for the result
    my $id = 0; # numeric id of the field

    while ($#resopt >= 0) {
      confess "$myname: the values in the result definition must go in groups of 4"
        unless ($#resopt >= 3);
      my $fld = shift @resopt;
      my $type = shift @resopt;
      my $func = shift @resopt;
      my $funcarg = shift @resopt;

      confess("$myname: the result field name must be a string, got a " . ref($fld) . " ")
        unless (ref($fld) eq '');
      confess("$myname: the result field type must be a string, got a " . ref($type) . " for field '$fld'")
        unless (ref($type) eq '');
      confess("$myname: the result field function must be a string, got a " . ref($func) . " for field '$fld'")
        unless (ref($func) eq '');

This starts the loop that goes over the result fields and builds the code to create them. The code will be built in multiple snippets that will eventually be combined to produce the compute function. Since the arguments go in groups of 4, it becomes fairly easy to miss one element somewhere, and then everything gets really confusing. So the code attempts to check the types of the arguments, in hopes of catching these off-by-ones as early as possible. The variable $id will be used to produce the unique prefixes for the function's variables.

      my $funcDef = $FUNCTIONS->{$func}
        or confess("$myname: function '" . $func . "' is unknown");

      my $argCount = $funcDef->{argcount};
      $argCount = 1 # 1 is the default value
        unless defined($argCount);
      confess("$myname: in field '$fld' function '$func' requires an argument computation that must be a Perl sub reference")
        unless ($argCount == 0 || ref $funcarg eq 'CODE');
      confess("$myname: in field '$fld' function '$func' requires no argument, use undef as a placeholder")
        unless ($argCount != 0 || !defined $funcarg);

      push(@rtdefRes, $fld, $type);

      push(@compArgs, $funcarg)
        if (defined $funcarg);

The function definition for a field gets pulled out by name, and the arguments of the field are checked for correctness. The types of the fields get collected for the row definition, and the aggregation argument computation closures (or, technically, functions) get also collected, to pass later as the arguments of the compute function.

      # add to the code snippets

      ### initialization
      my $vars = $funcDef->{vars};
      if (defined $vars) {
        foreach my $v (keys %$vars) {
          # the variable names are given a unique prefix;
          # the initialization values are constants, no substitutions
          $codeInit .= "  my \$v${id}_${v} = " . $vars->{$v} . ";\n";
        }
      } else {
        $vars = { }; # a dummy
      }

The initialization fragment gets processed if defined. The unique names for variables are generated from the $id and the variable name in the definition, so that there would be no interference between the result fields. And the initialization snippets are collected in $codeInit. The initialization values are not quoted because they are expected to be strings suitable for such use. That's why the undefined values in the function definitions are not undef but 'undef'. If you wanted to initialize a variable as a string "x", you'd use it as '"x"'. For the numbers it doesn't really matter, the numbers just get converted to strings as needed, so the zeroes are simply 0s without quoting.

Another possibility would be to have the actual values as-is in the hash and then either put these values into the argument array passed to the computation function or use the closure trick from Triceps::Fields::makeTranslation() described in Section 10.7: “Result projection in the templates” .
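Here is a small plain-Perl sketch of why these values must be source-code strings: they are pasted into the generated code verbatim, without any quoting added (the field id and variable names here are assumed for illustration):

```perl
use strict;
use warnings;

my $id = 0;
# the values are strings that get pasted into the generated code as-is
my %vars = ( max => 'undef', label => '"x"', count => 0 );

my $codeInit = '';
foreach my $v (sort keys %vars) {
    $codeInit .= "  my \$v${id}_${v} = $vars{$v};\n";
}
print $codeInit;
# my $v0_count = 0;
# my $v0_label = "x";
# my $v0_max = undef;
```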

      ### iteration
      my $step = $funcDef->{step};
      if (defined $step) {
        $needIter = 1;
        $codeStep .= "    # field $fld=$func\n";
        if (defined $funcarg) {
          # compute the function argument from the current row
          $codeStep .= "    my \$a${id} = \$args[" . $#compArgs ."](\$row);\n";
        }
        # substitute the variables in $step
        $step =~ s/\$\%(\w+)/&replaceStep($1, $func, $vars, $id, $argCount)/ge;
        $codeStep .= "    { $step; }\n";
      }

Then the iteration fragment gets processed. The logic remembers in $needIter if any of the functions involved needs iteration. Before the iteration snippet gets collected, it has the $% names substituted, and gets placed into a block, just in case it wants to define some local variables. An extra ; is added just in case: it doesn't hurt, and helps if it was forgotten in the function definition.

      ### result building
      my $result = $funcDef->{result};
      confess "MySimpleAggregator: internal error in definition of aggregation function '$func', missing result computation"
        unless (defined $result);
      # substitute the variables in $result
      if ($result =~ /\$\%argfirst/) {
        $needfirst = 1;
        $codeResult .= "  my \$f${id} = \$args[" . $#compArgs ."](\$rowFirst);\n";
      }
      if ($result =~ /\$\%arglast/) {
        $needlast = 1;
        $codeResult .= "  my \$l${id} = \$args[" . $#compArgs ."](\$rowLast);\n";
      }
      $result =~ s/\$\%(\w+)/&replaceResult($1, $func, $vars, $id, $argCount)/ge;
      $codeBuild .= "    ($result), # $fld\n";

      $id++;
    }
    $rtRes = Triceps::RowType->new(@rtdefRes);
  }
  ${$opts->{saveRowTypeTo}} = $rtRes if (defined($opts->{saveRowTypeTo}));

In the same way the result computation is created, and remembers if any function wanted the fields from the first or last row. And eventually after all the functions have been processed, the result row type is created. If it was asked to save, it gets saved.

  # build the computation function
  my $compText = "sub {\n";
  $compText .= "  use strict;\n";
  $compText .= "  my (\$table, \$context, \$aggop, \$opcode, \$rh, \$state, \@args) = \@_;\n";
  $compText .= "  return if (\$context->groupSize()==0 || \$opcode == &Triceps::OP_NOP);\n";
  $compText .= $codeInit;
  if ($needIter) {
    $compText .= "  my \$npos = 0;\n";
    $compText .= "  for (my \$rhi = \$context->begin(); !\$rhi->isNull(); \$rhi = \$context->next(\$rhi)) {\n";
    $compText .= "    my \$row = \$rhi->getRow();\n";
    $compText .= $codeStep;
    $compText .= "    \$npos++;\n";
    $compText .= "  }\n";
  }
  if ($needfirst) {
    $compText .= "  my \$rowFirst = \$context->begin()->getRow();\n";
  }
  if ($needlast) {
    $compText .= "  my \$rowLast = \$context->last()->getRow();\n";
  }
  $compText .= $codeResult;
  $compText .= "  \$context->makeArraySend(\$opcode,\n";
  $compText .= $codeBuild;
  $compText .= "  );\n";
  $compText .= "}\n";

  ${$opts->{saveComputeTo}} = $compText if (defined($opts->{saveComputeTo}));

The compute function gets assembled from the collected fragments. The optional parts get included only if some of the functions needed them.

  # compile the computation function
  my $compFun = eval $compText
    or confess "$myname: error in compilation of the aggregation computation:\n  $@\nfunction text:\n$compText ";

  # build and add the aggregator
  my $agg = Triceps::AggregatorType->new($rtRes, $opts->{name}, undef, $compFun, @compArgs);

  $idx->setAggregator($agg);

  return $opts->{tabType};
}

Then the compute function is compiled. If the compilation fails, the error message will include both the compilation error and the text of the auto-generated function. Otherwise there would be no way to know what exactly went wrong. Well, since no user code is included into the auto-generated function, it should never fail. Except if there is some bad code in the aggregation function definitions. The compiled function and collected closures are then used to create the aggregator, which should also never fail.
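The compile-or-die pattern can be tried in isolation with any generated source; here is a plain-Perl sketch with a hypothetical doubling function standing in for the real computation:

```perl
use strict;
use warnings;

# a hypothetical auto-generated function source
my $compText = "sub {\n  my (\$x) = \@_;\n  return \$x * 2;\n}\n";

# compile it; on failure report both the error and the source
my $compFun = eval $compText
    or die "error in compilation of the generated code:\n  $@\nfunction text:\n$compText";

print $compFun->(21), "\n";    # prints 42
```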

The functions that translate the $%variable names are built after the same pattern but have different built-in variables:

sub replaceStep # ($varname, $func, $vars, $id, $argCount)
{
  my ($varname, $func, $vars, $id, $argCount) = @_;

  if ($varname eq 'argiter') {
    confess "MySimpleAggregator: internal error in definition of aggregation function '$func', step computation refers to 'argiter' but the function declares no arguments"
      unless ($argCount > 0);
    return "\$a${id}";
  } elsif ($varname eq 'niter') {
    return "\$npos";
  } elsif ($varname eq 'groupsize') {
    return "\$context->groupSize()";
  } elsif (exists $vars->{$varname}) {
    return "\$v${id}_${varname}";
  } else {
    confess "MySimpleAggregator: internal error in definition of aggregation function '$func', step computation refers to an unknown variable '$varname'"
  }
}

sub replaceResult # ($varname, $func, $vars, $id, $argCount)
{
  my ($varname, $func, $vars, $id, $argCount) = @_;

  if ($varname eq 'argfirst') {
    confess "MySimpleAggregator: internal error in definition of aggregation function '$func', result computation refers to '$varname' but the function declares no arguments"
      unless ($argCount > 0);
    return "\$f${id}";
  } elsif ($varname eq 'arglast') {
    confess "MySimpleAggregator: internal error in definition of aggregation function '$func', result computation refers to '$varname' but the function declares no arguments"
      unless ($argCount > 0);
    return "\$l${id}";
  } elsif ($varname eq 'groupsize') {
    return "\$context->groupSize()";
  } elsif (exists $vars->{$varname}) {
    return "\$v${id}_${varname}";
  } else {
    confess "MySimpleAggregator: internal error in definition of aggregation function '$func', result computation refers to an unknown variable '$varname'"
  }
}

They check for the references to the undefined variables and confess if any are found. That's it, the whole aggregator generation.

Now let's look back at the printout of the generated computation function that has been shown above. The aggregation results were:

  result => [
    symbol => "string", "last", sub {$_[0]->get("symbol");},
    id => "int32", "last", sub {$_[0]->get("id");},
    price => "float64", "avg", sub {$_[0]->get("price");},
  ],

Which produced the function body:

  use strict;
  my ($table, $context, $aggop, $opcode, $rh, $state, @args) = @_;
  return if ($context->groupSize()==0 || $opcode == &Triceps::OP_NOP);
  my $v2_count = 0;
  my $v2_sum = 0;
  my $npos = 0;
  for (my $rhi = $context->begin(); !$rhi->isNull(); $rhi = $context->next($rhi)) {
    my $row = $rhi->getRow();
    # field price=avg
    my $a2 = $args[2]($row);
    { if (defined $a2) { $v2_sum += $a2; $v2_count++; }; }
    $npos++;
  }
  my $rowLast = $context->last()->getRow();
  my $l0 = $args[0]($rowLast);
  my $l1 = $args[1]($rowLast);
  $context->makeArraySend($opcode,
    ($l0), # symbol
    ($l1), # id
    (($v2_count == 0? undef : $v2_sum / $v2_count)), # price
  );

The fields get assigned the ids 0, 1 and 2. avg for the price field is the only function here that requires the iteration, and its variables are defined with the prefix $v2_. In the loop the function argument closure is called from $args[2], and its result is stored in $a2 (again, 2 here is the id of this field). Then the step computation for avg is copied into a block, with the variables substituted: $%argiter becomes $a2, $%sum becomes $v2_sum, $%count becomes $v2_count. Then the loop ends.

The functions make use of the last row, so $rowLast is computed. The values for the $%arglast fields 0 and 1 are calculated in $l0 and $l1. Then the result row is created and sent from an array of substituted result snippets from all the fields. That's how it all works together.

Chapter 12. Joins

12.1. Joins variety

The joins are quite important for the relational data processing, and come in many varieties. And the CEP systems have their own specifics. Basically, in CEP you want the joins to be processed fast. The CEP systems deal with the changing model state, and have to process these changes incrementally.

A small change should be handled fast. It has to use the indexes to find and update all the related result rows. Even though you can make it just go sequentially through all the rows and find the relevant ones, like in a common database, that's not what you normally want. When something like this happens, the usual reaction is “wtf is my model suddenly so slow?”, followed by an annoyingly long investigation into the reasons of the slowness, and then rewriting the model to make it work faster. It's better to just prevent the slowness in the first place and make sure that the joins always use an index. And since you don't have to deal much with the ad-hoc queries when you write a CEP model, you can provide all the needed indexes in advance very easily.

A particularly interesting kind of joins in this regard is the equi-joins: ones that join the rows by the equality of the fields in them. They allow a very efficient index look-up. Because of this, they are popular in the CEP world. Some systems, like Aleri, support only the equi-joins to start with. The other systems are much more efficient on the equi-joins than on the other kinds of joins. At the moment Triceps follows the fashion of having the advanced support only for the equi-joins. Even though the Sorted/Ordered indexes in Triceps should allow the range-based comparisons to be efficient too, at the moment there are no table methods for the look-up of ranges, they are left for the future work. Of course, nothing stops you from copying an equi-join template and modifying it to work by a dumb iteration. Just it would be slow, and I didn't see much point in it.

There also are three common patterns of the join usage.

In the first pattern the rows sort of go by and get enriched by looking up some information from a table and tacking it onto these rows. Sometimes not even tacking it on but maybe just filtering the data: passing through some of the rows and throwing away the rest, or directing the rows into the different kinds of processing, based on the looked-up data. For a reference, in the Coral8 CCL this situation is called stream-to-window joins. In Triceps there are no streams and no windows, so I just call them the lookup joins.

In the second pattern multiple stateful tables are joined together. Whenever any of the tables changes, the join result also changes, and the updates get propagated through. This can be done through lookups, but in reality manually defining the lookups for every possible table change becomes tedious pretty quickly. This has to be addressed by the automation.

In the third pattern the same table gets joined recursively, essentially traversing a representation of a tree stored in that table. This actually doesn't work well with the classic SQL unless the recursion depth is strictly limited. There are SQL extensions for the recursive self-joins in the modern databases but I haven't seen them in the CEP systems yet. Anyway, the procedural approach tends to work for this situation much better than the SQLy one, so the templates tend to be of not much help. I'll show a templated and a manual example of this kind for comparison.

12.2. Hello, joins!

As usual, let me show a couple of little teasers before starting the long bottom-up discussion. We'll eventually get to the same examples by the long way, so here I'll show only some very short code snippets and basic explanations.

our $join = Triceps::LookupJoin->new(
  name => "join",
  leftFromLabel => $lbTrans,
  rightTable => $tAccounts,
  leftFields => [ "!acct.*", ".*" ],
  rightFields => [ "internal/acct" ],
  by => [ "acctSrc" => "source", "acctXtrId" => "external" ],
);

This is a lookup join that gets the incoming rows with the transaction data from the label $lbTrans, finds the account translation in the table $tAccounts, and translates the external account representation to the internal one on its output. The join condition is an equivalent of the SQLy

on
  lbTrans.acctSrc = tAccounts.source
  and lbTrans.acctXtrId = tAccounts.external

The condition looks up the rows in $tAccounts using the index that has the key fields source and external. Such an index must already be defined, or the join will refuse to compile.

The result fields will contain all the fields from $lbTrans except those starting with acct, plus the field internal from $tAccounts, which gets renamed to acct.

Next goes a table join:

our $join = Triceps::JoinTwo->new(
  name => "join",
  leftTable => $tPosition,
  rightTable => $tToUsd,
  byLeft => [ "date", "currency" ],
  type => "inner",
);

It joins the tables $tPosition and $tToUsd, with the inner join logic. The join condition is on the fields date and currency being equal in rows in both tables.

12.3. The lookup join, done manually

First let's look at a lookup done manually. It would also establish the baseline for the further joins.

For the background of the model, let's consider the trade information coming in from multiple sources. Each source system has its own designation of the accounts on which the trades happen but ultimately they are the same accounts. So there is a table that contains the translation from the account designations of various external systems to our system's own internal account identifier. This gets described with the row types:

our $rtInTrans = Triceps::RowType->new( # a transaction received
  id => "int32", # the transaction id
  acctSrc => "string", # external system that sent us a transaction
  acctXtrId => "string", # its name of the account of the transaction
  amount => "int32", # the amount of transaction (int is easier to check)
);

our $rtAccounts = Triceps::RowType->new( # account translation map
  source => "string", # external system that sent us a transaction
  external => "string", # its name of the account in the transaction
  internal => "int32", # our internal account id
);

Other than those basics, the rest of the information is kept minimal, to make the examples smaller. Even the trade ids are expected to be global and not per source system (which is not realistic but saves another little bit of work).

The accounts table can be indexed in multiple ways for multiple purposes, say:

our $ttAccounts = Triceps::TableType->new($rtAccounts)
  ->addSubIndex("lookupSrcExt", # quick look-up by source and external id
    Triceps::IndexType->newHashed(key => [ "source", "external" ])
  )
  ->addSubIndex("iterateSrc", # for iteration in order grouped by source
    Triceps::IndexType->newHashed(key => [ "source" ])
    ->addSubIndex("iterateSrcExt",
      Triceps::IndexType->newHashed(key => [ "external" ])
    )
  )
  ->addSubIndex("lookupIntGroup", # quick look-up by internal id (to multiple externals)
    Triceps::IndexType->newHashed(key => [ "internal" ])
    ->addSubIndex("lookupInt", Triceps::IndexType->newFifo())
  )
;
$ttAccounts->initialize();

For our purpose of joining, the first index, the primary key, is the way to go. Using the primary key also has the advantage of making sure that there is no more than one row for each key value.

The first manual lookup example will just do the filtering: find whether there is a match in the translation table, and if so, pass the row through. The example goes as follows:

our $uJoin = Triceps::Unit->new("uJoin");

our $tAccounts = $uJoin->makeTable($ttAccounts, "tAccounts");

my $lbFilterResult = $uJoin->makeDummyLabel($rtInTrans, "lbFilterResult");
my $lbFilter = $uJoin->makeLabel($rtInTrans, "lbFilter", undef, sub {
  my ($label, $rowop) = @_;
  my $row = $rowop->getRow();
  my $rh = $tAccounts->findBy(
    source => $row->get("acctSrc"),
    external => $row->get("acctXtrId"),
  );
  if (!$rh->isNull()) {
    $uJoin->call($lbFilterResult->adopt($rowop));
  }
});

# label to print the changes to the detailed stats
makePrintLabel("lbPrint", $lbFilterResult);

while(<STDIN>) {
  chomp;
  my @data = split(/,/); # starts with a command, then string opcode
  my $type = shift @data;
  if ($type eq "acct") {
    $uJoin->makeArrayCall($tAccounts->getInputLabel(), @data);
  } elsif ($type eq "trans") {
    $uJoin->makeArrayCall($lbFilter, @data);
  }
  $uJoin->drainFrame(); # just in case, for completeness
}

The findBy() is where the join actually happens: the lookup of the data in a table by the values from another row. It's very similar to what the basic window example in Section 9.1: “Hello, tables!” was doing before. It's findBy(), without the need for findByIdx(), because in this case the index type used in the accounts table is its first leaf index, to which findBy() defaults. After that, the fact of a successful or unsuccessful lookup is used to pass the original row through or throw it away. If the found row were used to pick some fields from it and stick them into the result, that would be a more complete join, more like what you usually expect to see.

And here is an example of the input processing:

acct,OP_INSERT,source1,999,1
acct,OP_INSERT,source1,2011,2
acct,OP_INSERT,source2,ABCD,1
trans,OP_INSERT,1,source1,999,100
lbFilterResult OP_INSERT id="1" acctSrc="source1" acctXtrId="999"
    amount="100"
trans,OP_INSERT,2,source2,ABCD,200
lbFilterResult OP_INSERT id="2" acctSrc="source2" acctXtrId="ABCD"
    amount="200"
trans,OP_INSERT,3,source2,QWERTY,200
acct,OP_INSERT,source2,QWERTY,2
trans,OP_DELETE,3,source2,QWERTY,200
lbFilterResult OP_DELETE id="3" acctSrc="source2" acctXtrId="QWERTY"
    amount="200"
acct,OP_DELETE,source1,999,1

It starts with populating the accounts table. Then the transactions that find a match pass through, and those that don't get filtered out. If more of the account translations get added later, the transactions for them start passing, but as you can see, the result might be slightly unexpected: you may get a DELETE that had no matching previous INSERT, as happened for the row with id=3. This happens because the lookup join keeps no history on its left side and can't react properly to the changes in the table on its right side. Because of this, the lookup joins work best when the reference table gets pre-populated in advance and then stays stable.

12.4. The LookupJoin template

When a join has to produce the new rows, with the data from both the incoming row and the ones looked up in the reference table, this can also be done manually but is often more convenient to do with the LookupJoin template. The translation of the accounts to the internal ids can be done like this:

our $join = Triceps::LookupJoin->new(
  unit => $uJoin,
  name => "join",
  leftRowType => $rtInTrans,
  rightTable => $tAccounts,
  rightIdxPath => ["lookupSrcExt"],
  rightFields => [ "internal/acct" ],
  by => [ "acctSrc" => "source", "acctXtrId" => "external" ],
  isLeft => 1,
); # would confess by itself on an error

# label to print the changes to the detailed stats
makePrintLabel("lbPrint", $join->getOutputLabel());

while(<STDIN>) {
  chomp;
  my @data = split(/,/); # starts with a command, then string opcode
  my $type = shift @data;
  if ($type eq "acct") {
    $uJoin->makeArrayCall($tAccounts->getInputLabel(), @data);
  } elsif ($type eq "trans") {
    $uJoin->makeArrayCall($join->getInputLabel(), @data);
  }
  $uJoin->drainFrame(); # just in case, for completeness
}

The join gets defined in the option name-value format. The options unit and name are as usual.

The incoming rows are always on the left side, the table on the right. LookupJoin can do either the inner join or the left outer join (since it does not react to the changes of the right table and has no access to the past data from the left side, the full and right outer joins are not available). In this case the option isLeft => 1 selects the left outer join. The left outer join also happens to be the default if this option is not used.

The left side is described by the option leftRowType, and causes the join's input label of this row type to be created. The input label can be found with $join->getInputLabel().

The right side is a table, specified in the option rightTable. The lookups in the table are done using a combination of an index and the field pairing. The option by provides the field pairing. It contains the pairs of field names, one from the left and one from the right, for the fields that must be equal. They can be separated by "," too, but "=>" feels more idiomatic to me. These fields from the left are translated to the right ones and used for the lookup through the index. The index is specified with the path in the option rightIdxPath. This option is optional: if it's missing, the template will automatically find an index that matches the key fields. The index must exist, though: LookupJoin can't create it, and can't work without it either. The index must be a Hashed index.

There is no particular reason why it couldn't be a Sorted/Ordered index, other than that the getKey() call does not work for these indexes yet, and that's what LookupJoin uses to check that the right-side index key matches the join key in by. The order of the fields in the option by and in the index may vary but the set of the fields must be the same.

The index may be either a leaf (as in this example) or non-leaf. If it's a leaf, it could look up no more than one row per key, and LookupJoin uses this internally for a little optimization. Otherwise LookupJoin is capable of producing multiple result rows for one input row.

Finally, there is the result row. It is built out of the two original rows by picking the fields according to the options leftFields and rightFields. If either option is missing, that means take all the fields. The format of these options is from Triceps::Fields::filterToPairs() that has been described in Section 10.7: “Result projection in the templates” . So in this example [ "internal/acct" ] means: pass the field internal but rename it to acct.

Remember that the field names in the result must not duplicate. It would be an error. If the duplications happen, the most general solution is to use the substitution syntax to rename some of the fields.

A fairly common usage in joins is to just give the unique prefixes to the left-side and right-side fields. This can be achieved with:

  leftFields => [ '.*/left_$&' ],
  rightFields => [ '.*/right_$&' ],

The $& in the substitution gets replaced with the whole matched field name. There is also another way to solve a special case of duplication that will be shown in a moment.
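
Since these specifications are ordinary Perl regexp substitutions under the hood, the renaming can be tried out with plain Perl, outside of any join. Here is a little sketch (not using Triceps at all) that applies the spec ".*/left_$&" to a single field name by hand:

```perl
use strict;
use warnings;

# The spec ".*/left_$&" splits into a match pattern (".*") and a
# substitution ("left_$&"); $& stands for the whole matched name.
my $field = "amount";
(my $renamed = $field) =~ s/^.*$/left_$&/;
print "$renamed\n";  # prints "left_amount"
```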

The option fieldsLeftFirst determines which side will go first in the result. By default it's set to 1 (as in this example), and the left side goes first. If set to 0, the right side goes first.

This setup for the result row types is somewhat clumsy but it's a reasonable first attempt.

Now, having gone through the description, an example of how it works:

acct,OP_INSERT,source1,999,1
acct,OP_INSERT,source1,2011,2
acct,OP_INSERT,source2,ABCD,1
trans,OP_INSERT,1,source1,999,100
join.out OP_INSERT id="1" acctSrc="source1" acctXtrId="999"
    amount="100" acct="1"
trans,OP_INSERT,2,source2,ABCD,200
join.out OP_INSERT id="2" acctSrc="source2" acctXtrId="ABCD"
    amount="200" acct="1"
trans,OP_INSERT,3,source2,QWERTY,200
join.out OP_INSERT id="3" acctSrc="source2" acctXtrId="QWERTY"
    amount="200"
acct,OP_INSERT,source2,QWERTY,2
trans,OP_DELETE,3,source2,QWERTY,200
join.out OP_DELETE id="3" acctSrc="source2" acctXtrId="QWERTY"
    amount="200" acct="2"
acct,OP_DELETE,source1,999,1

Same as before, first the accounts table gets populated, then the transactions are sent. If an account is not found, this left outer join still passes through the original fields from the left side. Adding an account later doesn't help the rowops that already went through but the new rowops will see it. The same goes for deleting an account, it doesn't affect the past rowops either.

Now let's take another look at the field duplication problem. The most typical case of duplication is in the key fields, when the key fields on both sides are named the same. If all the fields from both left and right sides were to be included (which is the default), the key fields would be included twice, with the same names, and cause a conflict. LookupJoin provides a solution for this special case, shown in the following example:

our $rtTrans = Triceps::RowType->new( # a transaction received
  id => "int32", # the transaction id
  source => "string", # external system that sent us a transaction
  external => "string", # its name of the account of the transaction
  amount => "int32", # the amount of transaction (int is easier to check)
);

our $tAccounts = $uJoin->makeTable($ttAccounts, "tAccounts");

our $join = Triceps::LookupJoin->new(
  unit => $uJoin,
  name => "join",
  leftRowType => $rtTrans,
  rightTable => $tAccounts,
  byLeft => [ "source", "external" ],
  fieldsDropRightKey => 1,
  isLeft => 1,
); # would confess by itself on an error

The example does the exact same thing as the last one, only here the fields in the incoming rows have been named the same as in the table. This made the option byLeft the more convenient way to specify the join condition. And this time there are no explicit options to select the result fields, all of them are included. But that would have included the fields source and external twice, which is illegal. The option fieldsDropRightKey set to 1 takes care of that: it automatically removes the key fields on the right side from the result. This example produces the output that is the same as the last one, only with the different field names:

acct,OP_INSERT,source1,999,1
acct,OP_INSERT,source1,2011,2
acct,OP_INSERT,source2,ABCD,1
trans,OP_INSERT,1,source1,999,100
join.out OP_INSERT id="1" source="source1" external="999" amount="100"
    internal="1"
trans,OP_INSERT,2,source2,ABCD,200
join.out OP_INSERT id="2" source="source2" external="ABCD"
    amount="200" internal="1"
trans,OP_INSERT,3,source2,QWERTY,200
join.out OP_INSERT id="3" source="source2" external="QWERTY"
    amount="200"
acct,OP_INSERT,source2,QWERTY,2
trans,OP_DELETE,3,source2,QWERTY,200
join.out OP_DELETE id="3" source="source2" external="QWERTY"
    amount="200" internal="2"
acct,OP_DELETE,source1,999,1

The left-side data can also be specified in another way: the option leftFromLabel provides a label which in turn provides both the input row type and the unit. You can still specify the unit option as well but it must match the one in the label. This is driven internally by Triceps::Opt::handleUnitTypeLabel(), described in Section 10.5: “Template options”, so it follows the same rules. The join still has its own input label but it gets automatically chained to the one in the option. For an example of such a join:

our $lbTrans = $uJoin->makeDummyLabel($rtInTrans, "lbTrans");

our $join = Triceps::LookupJoin->new(
  name => "join",
  leftFromLabel => $lbTrans,
  rightTable => $tAccounts,
  leftFields => [ "id", "amount" ],
  fieldsLeftFirst => 0,
  rightFields => [ "internal/acct" ],
  by => [ "acctSrc" => "source", "acctXtrId" => "external" ],
  isLeft => 0,
); # would confess by itself on an error

# label to print the changes to the detailed stats
makePrintLabel("lbPrint", $join->getOutputLabel());

while(<STDIN>) {
  chomp;
  my @data = split(/,/); # starts with a command, then string opcode
  my $type = shift @data;
  if ($type eq "acct") {
    $uJoin->makeArrayCall($tAccounts->getInputLabel(), @data);
  } elsif ($type eq "trans") {
    $uJoin->makeArrayCall($lbTrans, @data);
  }
  $uJoin->drainFrame(); # just in case, for completeness
}

The other options demonstrate the possibilities described above. This time it's an inner join, the result has the right-side fields going first, and the left-side fields are filtered in the result by an explicit list of fields to pass. The right-side index is found automatically.

Another way to achieve the same filtering of the left-side fields would be by throwing away everything starting with acct and passing through the rest:

  leftFields => [ "!acct.*", ".*" ],

And here is an example of a run:

acct,OP_INSERT,source1,999,1
acct,OP_INSERT,source1,2011,2
acct,OP_INSERT,source2,ABCD,1
trans,OP_INSERT,1,source1,999,100
join.out OP_INSERT acct="1" id="1" amount="100"
trans,OP_INSERT,2,source2,ABCD,200
join.out OP_INSERT acct="1" id="2" amount="200"
trans,OP_INSERT,3,source2,QWERTY,200
acct,OP_INSERT,source2,QWERTY,2
trans,OP_DELETE,3,source2,QWERTY,200
join.out OP_DELETE acct="2" id="3" amount="200"
acct,OP_DELETE,source1,999,1

The input data is the same as the last time, but the result is different. Since it's an inner join, the rows that don't find a match don't pass through. And of course the fields are ordered and subsetted differently in the result.

The next example loses all connection with reality; it just serves to demonstrate another ability of LookupJoin: matching multiple rows on the right side for an incoming row. The situation itself is obviously useful and normal, it's just not what normally happens with the account id translation, and I was too lazy to invent another realistically-looking example.

our $ttAccounts2 = Triceps::TableType->new($rtAccounts)
  ->addSubIndex("iterateSrc", # for iteration in order grouped by source
    Triceps::IndexType->newHashed(key => [ "source" ])
    ->addSubIndex("lookupSrcExt",
      Triceps::IndexType->newHashed(key => [ "external" ])
      ->addSubIndex("grouping", Triceps::IndexType->newFifo())
    )
  )
;
$ttAccounts2->initialize();

our $tAccounts = $uJoin->makeTable($ttAccounts2, "tAccounts");

our $join = Triceps::LookupJoin->new(
  unit => $uJoin,
  name => "join",
  leftRowType => $rtInTrans,
  rightTable => $tAccounts,
  rightIdxPath => [ "iterateSrc", "lookupSrcExt" ],
  rightFields => [ "internal/acct" ],
  by => [ "acctSrc" => "source", "acctXtrId" => "external" ],
  #saveJoinerTo => \$code,
); # would confess by itself on an error

The main loop is unchanged from the first LookupJoin example, so I won't copy it here. Just for something different, the join index here is nested, and its path consists of two elements. It's not a leaf index either, with one FIFO level under it. It could also have been found automatically. And since isLeft is not specified explicitly, it defaults to 1, making this a left join.

The example of a run uses a slightly different input, highlighting the ability to match multiple rows:

acct,OP_INSERT,source1,999,1
acct,OP_INSERT,source1,2011,2
acct,OP_INSERT,source2,ABCD,1
acct,OP_INSERT,source2,ABCD,10
acct,OP_INSERT,source2,ABCD,100
trans,OP_INSERT,1,source1,999,100
join.out OP_INSERT id="1" acctSrc="source1" acctXtrId="999"
    amount="100" acct="1"
trans,OP_INSERT,2,source2,ABCD,200
join.out OP_INSERT id="2" acctSrc="source2" acctXtrId="ABCD"
    amount="200" acct="1"
join.out OP_INSERT id="2" acctSrc="source2" acctXtrId="ABCD"
    amount="200" acct="10"
join.out OP_INSERT id="2" acctSrc="source2" acctXtrId="ABCD"
    amount="200" acct="100"
trans,OP_INSERT,3,source2,QWERTY,200
join.out OP_INSERT id="3" acctSrc="source2" acctXtrId="QWERTY"
    amount="200"
acct,OP_INSERT,source2,QWERTY,2
trans,OP_DELETE,3,source2,QWERTY,200
join.out OP_DELETE id="3" acctSrc="source2" acctXtrId="QWERTY"
    amount="200" acct="2"
acct,OP_DELETE,source1,999,1

When a row matches multiple rows in the table, it gets multiplied. The join function iterates through the whole matching row group, and for each found row creates a result row and calls the output label with it.

Now, what if you don't want to get multiple rows back even if they are found? Of course, the best way is to just use a leaf index. But once in a while you get into situations with the denormalized data in the lookup table. You might know in advance that for each row in an index group a certain field would be the same. Or you might not care what exact value you get, as long as it's from the right group. But you might really not want the input rows to multiply when they go through the join. LookupJoin has a solution:

our $join = Triceps::LookupJoin->new(
  unit => $uJoin,
  name => "join",
  leftRowType => $rtInTrans,
  rightTable => $tAccounts,
  rightIdxPath => [ "iterateSrc", "lookupSrcExt" ],
  rightFields => [ "internal/acct" ],
  by => [ "acctSrc" => "source", "acctXtrId" => "external" ],
  limitOne => 1,
); # would confess by itself on an error

The option limitOne changes the processing logic to pick only the first matching row. It also optimizes the join function. If limitOne is not specified explicitly, the join constructor deduces it magically by looking at whether the join index is a leaf or not. Actually, for a leaf index it would always override limitOne to 1, even if you explicitly set it to 0.

With the limit, the same input produces a different output:

acct,OP_INSERT,source1,999,1
acct,OP_INSERT,source1,2011,2
acct,OP_INSERT,source2,ABCD,1
acct,OP_INSERT,source2,ABCD,10
acct,OP_INSERT,source2,ABCD,100
trans,OP_INSERT,1,source1,999,100
join.out OP_INSERT id="1" acctSrc="source1" acctXtrId="999"
    amount="100" acct="1"
trans,OP_INSERT,2,source2,ABCD,200
join.out OP_INSERT id="2" acctSrc="source2" acctXtrId="ABCD"
    amount="200" acct="1"
trans,OP_INSERT,3,source2,QWERTY,200
join.out OP_INSERT id="3" acctSrc="source2" acctXtrId="QWERTY"
    amount="200"
acct,OP_INSERT,source2,QWERTY,2
trans,OP_DELETE,3,source2,QWERTY,200
join.out OP_DELETE id="3" acctSrc="source2" acctXtrId="QWERTY"
    amount="200" acct="2"
acct,OP_DELETE,source1,999,1

Now it just picks the first matching row instead of multiplying the rows.

12.5. Manual iteration with LookupJoin

Sometimes you might want to just get the list of the resulting rows from LookupJoin and iterate over them by yourself, rather than have it call the labels. To be honest, this looked kind of important when I first wrote LookupJoin, but by now I don't see a whole lot of use in it. Nowadays, if you want to do a manual iteration, calling findBy() and then iterating looks like the more useful option. But at the time there was no findBy(), and this feature came to exist. Here is an example:

our $join = Triceps::LookupJoin->new(
  unit => $uJoin,
  name => "join",
  leftRowType => $rtInTrans,
  rightTable => $tAccounts,
  rightFields => [ "internal/acct" ],
  by => [ "acctSrc" => "source", "acctXtrId" => "external" ],
  automatic => 0,
); # would confess by itself on an error

# label to print the changes to the detailed stats
my $lbPrint = makePrintLabel("lbPrint", $join->getOutputLabel());

while(<STDIN>) {
  chomp;
  my @data = split(/,/); # starts with a command, then string opcode
  my $type = shift @data;
  if ($type eq "acct") {
    $uJoin->makeArrayCall($tAccounts->getInputLabel(), @data);
  } elsif ($type eq "trans") {
    my $op = shift @data; # drop the opcode field
    my $trans = $rtInTrans->makeRowArray(@data);
    my @rows = $join->lookup($trans);
    foreach my $r (@rows) {
      $uJoin->call($lbPrint->makeRowop($op, $r));
    }
  }
  $uJoin->drainFrame(); # just in case, for completeness
}

It copies the first LookupJoin example, only now with a manual iteration. Once the option automatic is set to 0 for the join, the method $join->lookup() becomes available to perform the lookup and return the result rows in an array (the data sent to the input label keeps working as usual, sending the result rows to the output label). This involves the extra overhead of keeping all the result rows (and there might be lots of them) in an array, so by default the join is compiled in an automatic-only mode.

Since lookup() returns rows, not rowops, and knows nothing about the opcodes, those had to be handled separately around the lookup. There is a way to achieve a similar result using the streaming functions that return the rowops. It will be described in Section 15.8: “Streaming functions and template results”.

The result is the same as for the first example, only the name of the result label differs:

acct,OP_INSERT,source1,999,1
acct,OP_INSERT,source1,2011,2
acct,OP_INSERT,source2,ABCD,1
trans,OP_INSERT,1,source1,999,100
lbPrint OP_INSERT id="1" acctSrc="source1" acctXtrId="999"
    amount="100" acct="1"
trans,OP_INSERT,2,source2,ABCD,200
lbPrint OP_INSERT id="2" acctSrc="source2" acctXtrId="ABCD"
    amount="200" acct="1"
trans,OP_INSERT,3,source2,QWERTY,200
lbPrint OP_INSERT id="3" acctSrc="source2" acctXtrId="QWERTY"
    amount="200"
acct,OP_INSERT,source2,QWERTY,2
trans,OP_DELETE,3,source2,QWERTY,200
lbPrint OP_DELETE id="3" acctSrc="source2" acctXtrId="QWERTY"
    amount="200" acct="2"
acct,OP_DELETE,source1,999,1

The print label is still connected to the output label of the LookupJoin, but it's done purely for the convenience of its creation. Since no rowops get sent to the LookupJoin's input, none get to its output, and none get from there to the output label. Instead the main loop creates and sends the rowops directly to the output label when it iterates through the lookup results. Because of this the label name in the output is the name of the output label.

12.6. The key fields of LookupJoin

The key fields are the ones that participate in the join condition. I use these terms interchangeably because by the definition of LookupJoin, these fields must be the key fields in the join index in the right-side table. LookupJoin has a few more facilities for their handling that haven't been shown yet.

First, the join condition can be specified as the Triceps::Fields::filterToPairs() patterns in the option byLeft. The options by and byLeft are mutually exclusive and one of them must be present. The condition

by => [ "acctSrc" => "source", "acctXtrId" => "external" ],

can be also specified as:

byLeft => [ "acctSrc/source", "acctXtrId/external" ],

The option name byLeft says that the pattern specification is for the fields on the left side (there is no symmetric byRight). The substitutions produce the matching field names for the right side. Unlike the result pattern, here the fields that do not find a match do not get included in the key. It's as if an implicit "!.*" gets added at the end. In fact, "!.*" really does get added implicitly at the end.

Of course, for the example above there isn't much difference between the two options. The difference starts to matter when the key fields follow a pattern. For example, if the key fields on both sides have the names acctSrc and acctXtrId, the specification with byLeft becomes a little simpler:

byLeft => [ "acctSrc", "acctXtrId" ],

Even more so if the key is long, common on both sides, and all the fields have a common prefix. Such as:

k_AccountSystem
k_AccountId
k_InstrumentSystem
k_InstrumentId
k_TransactionDate
k_SettlementDate

Then the join condition can be specified simply as:

byLeft => [ "k_.*" ],

If say the settlement date doesn't matter for a particular join, it can be excluded:

byLeft => [ "!k_SettlementDate", "k_.*" ],
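
The way these patterns combine follows the first-match-wins logic of the field filtering: each field is tried against the specs in order, a "!" spec drops it, and a field that matches nothing falls off the implicit "!.*" at the end. Here is a toy re-implementation of just that logic in plain Perl (a sketch, not the actual Triceps code), showing which fields the spec above would select:

```perl
use strict;
use warnings;

# Toy version of the filtering: for each field, the first matching
# spec wins; a "keep" flag of 0 models the "!" exclusion specs.
my @specs = (
  [ qr/^k_SettlementDate$/, 0 ],  # "!k_SettlementDate"
  [ qr/^k_.*$/,             1 ],  # "k_.*"
);
my @fields = qw(k_AccountId k_TransactionDate k_SettlementDate amount);
my @key;
FIELD: for my $f (@fields) {
  for my $spec (@specs) {
    if ($f =~ $spec->[0]) {
      push @key, $f if $spec->[1];
      next FIELD;
    }
  }
  # no spec matched: dropped by the implicit "!.*"
}
print "@key\n";  # prints "k_AccountId k_TransactionDate"
```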

If the right side represents a swap of securities, it might have two parts to it, each describing its half with its key:

BorrowAccountSystem
BorrowAccountId
BorrowInstrumentSystem
BorrowInstrumentId
BorrowTransactionDate
BorrowSettlementDate
LoanAccountSystem
LoanAccountId
LoanInstrumentSystem
LoanInstrumentId
LoanTransactionDate
LoanSettlementDate

Then the join of the one-sided rows with the borrow part condition can be done using:

byLeft => [ 'k_(.*)/Borrow$1' ],

The key patterns make the long keys easier to drag around.
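
The capture group works through the normal Perl regexp machinery: $1 receives whatever "(.*)" matched after the "k_" prefix. A plain-Perl sketch (again without Triceps) of how 'k_(.*)/Borrow$1' pairs up the field names:

```perl
use strict;
use warnings;

# Apply the "k_(.*)/Borrow$1" spec to a few left-side key fields:
# the part captured by (.*) is reused in the right-side name.
for my $left (qw(k_AccountSystem k_InstrumentId k_SettlementDate)) {
  (my $right = $left) =~ s/^k_(.*)$/Borrow$1/;
  print "$left => $right\n";  # e.g. "k_AccountSystem => BorrowAccountSystem"
}
```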

Second, key fields of LookupJoin don't have to be of the same type on the left and on the right side. Since the key building for lookup is done through Perl, the key values get automatically converted as needed.

A caveat is that the conversion might be not exactly direct. If a string gets converted to a number, then any string values that do not look like numbers will be converted to 0. A conversion between a string and a floating-point number, in either direction, is likely to lose precision. A conversion between int64 and int32 may cause the upper bits to be truncated. So what gets looked up may be not what you expect.
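
The string-to-number effects can be seen with plain Perl arithmetic, which is what performs the conversions during the key build (the values here are made up purely for illustration):

```perl
use strict;
no warnings 'numeric';  # silence the "isn't numeric" warnings

# What Perl's automatic string-to-number conversion does to key
# values that don't look like clean numbers:
print 0 + "QWERTY", "\n";  # prints 0: no numeric prefix at all
print 0 + "12abc", "\n";   # prints 12: only the numeric prefix counts
# A long decimal string loses its trailing digits when forced
# into a floating-point number:
print 0 + "3.14159265358979323846", "\n";
```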

I'm not sure yet if I should add the requirement for the types to be exactly the same. The automatic conversions seem to be convenient, just use them with care. I suppose, when the joins eventually get implemented in the C++ code, this freedom will go away, because it's much easier and more efficient in C++ to copy the field values as-is than to convert them.

The only thing currently checked is whether a field is represented in Perl as a scalar or an array, and that must match on the left and on the right. Note that the array uint8[] gets represented in Perl as a scalar string, so a uint8[] field can be matched with the other scalars but not with the other arrays.

Third, the key fields have the problem of duplication. The LookupJoin is by definition an equi-join: it joins together the rows that have the same values in a set of key fields. If all the fields from both sides are to be included in the result, the key values will be present in it twice, once from the left side and once from the right side. This is not what is usually wanted, and the good practice is to let these fields through from one side and filter them out from the other side.

Letting these fields through on the left side is usually the better choice. For the inner joins it doesn't really matter, but for the left outer joins it works much better than letting the fields through from the right side. The reason is that when the join doesn't find a match on the right side, all the right-side fields will be NULL. If you pass through the key fields only from the right side, they will contain NULLs, and this is probably not what you want.

However if for some reason, be it the order of the fields or the better field types on the right side, you really want to pass the key fields only from the right side, you can. LookupJoin provides a special magic act enabled by the option

  fieldsMirrorKey => 1

Then if the row is not found on the right side, a special right-side row will be created with the key fields copied from the left side, and it will be used to produce the result row. With fieldsMirrorKey you are guaranteed to always have the key values present on the right side.

12.7. A peek inside LookupJoin

I won't be describing the internals of LookupJoin in detail. They seem a bit too big and complicated. Partially it's because the code is of an older origin, and doesn't use all the newer calls. Partially it's because when I wrote it, I've tried to optimize by translating the rows to an array format instead of referring to the fields by names, and that made the code more tricky. Partially, the code has grown more complex due to all the added options. And partially the functionality is just a little tricky by itself.

But, for debugging purposes, the LookupJoin constructor can return the auto-generated code of the joiner function. It's done with the option saveJoinerTo:

  saveJoinerTo => \$code,

This will cause the auto-generated code to be placed into the variable $code. I've collected a few such examples in this section. They provide a glimpse into the internal workings of the joiner. It's definitely a quite advanced topic, but it's helpful if you want to know what is really going on in there.

The joiner code from the example

our $join = Triceps::LookupJoin->new(
  unit => $uJoin,
  name => "join",
  leftRowType => $rtInTrans,
  rightTable => $tAccounts,
  rightIdxPath => ["lookupSrcExt"],
  rightFields => [ "internal/acct" ],
  by => [ "acctSrc" => "source", "acctXtrId" => "external" ],
  isLeft => 1,
); # would confess by itself on an error

that was shown first in the Section 12.4: “The LookupJoin template” is this:

sub # ($inLabel, $rowop, $self)
{
  my ($inLabel, $rowop, $self) = @_;
  #print STDERR "DEBUGX LookupJoin " . $self->{name} . " in: ", $rowop->printP(), "\n";

  my $opcode = $rowop->getOpcode(); # pass the opcode
  my $row = $rowop->getRow();

  my @leftdata = $row->toArray();

  my $resRowType = $self->{resultRowType};
  my $resLabel = $self->{outputLabel};

  my $lookuprow = $self->{rightRowType}->makeRowHash(
    "source" => $leftdata[1],
    "external" => $leftdata[2],
    );

  #print STDERR "DEBUGX " . $self->{name} . " lookup: ", $lookuprow->printP(), "\n";
  my $rh = $self->{rightTable}->findIdx($self->{rightIdxType}, $lookuprow);

  my @rightdata; # fields from the right side, defaults to all-undef, if no data found
  my @result; # the result rows will be collected here

  if (!$rh->isNull()) {
    #print STDERR "DEBUGX " . $self->{name} . " found data: " . $rh->getRow()->printP() . "\n";
    @rightdata = $rh->getRow()->toArray();
  }

    my @resdata = ($leftdata[0],
    $leftdata[1],
    $leftdata[2],
    $leftdata[3],
    $rightdata[2],
    );
    my $resrowop = $resLabel->makeRowop($opcode, $resRowType->makeRowArray(@resdata));
    #print STDERR "DEBUGX " . $self->{name} . " +out: ", $resrowop->printP(), "\n";
    $resLabel->getUnit()->call($resrowop);

}

From the example with the manual iteration:

our $join = Triceps::LookupJoin->new(
  unit => $uJoin,
  name => "join",
  leftRowType => $rtInTrans,
  rightTable => $tAccounts,
  rightIdxPath => ["lookupSrcExt"],
  rightFields => [ "internal/acct" ],
  by => [ "acctSrc" => "source", "acctXtrId" => "external" ],
  automatic => 0,
); # would confess by itself on an error

comes this code:

sub  # ($self, $row)
{
  my ($self, $row) = @_;

  #print STDERR "DEBUGX LookupJoin " . $self->{name} . " in: ", $row->printP(), "\n";

  my @leftdata = $row->toArray();

  my $lookuprow = $self->{rightRowType}->makeRowHash(
    "source" => $leftdata[1],
    "external" => $leftdata[2],
    );

  #print STDERR "DEBUGX " . $self->{name} . " lookup: ", $lookuprow->printP(), "\n";
  my $rh = $self->{rightTable}->findIdx($self->{rightIdxType}, $lookuprow);

  my @rightdata; # fields from the right side, defaults to all-undef, if no data found
  my @result; # the result rows will be collected here

  if (!$rh->isNull()) {
    #print STDERR "DEBUGX " . $self->{name} . " found data: " . $rh->getRow()->printP() . "\n";
    @rightdata = $rh->getRow()->toArray();
  }

    my @resdata = ($leftdata[0],
    $leftdata[1],
    $leftdata[2],
    $leftdata[3],
    $rightdata[2],
    );
    push @result, $self->{resultRowType}->makeRowArray(@resdata);
    #print STDERR "DEBUGX " . $self->{name} . " +out: ", $result[$#result]->printP(), "\n";
  return @result;
}

It takes different arguments because now it's not an input label handler but a common function that gets called from both the label handler and the lookup() method. And it collects the rows in an array to be returned instead of immediately passing them on.

From the example with multiple rows matching on the right side

our $join = Triceps::LookupJoin->new(
  unit => $uJoin,
  name => "join",
  leftRowType => $rtInTrans,
  rightTable => $tAccounts,
  rightIdxPath => [ "iterateSrc", "lookupSrcExt" ],
  rightFields => [ "internal/acct" ],
  by => [ "acctSrc" => "source", "acctXtrId" => "external" ],
); # would confess by itself on an error

comes this code:

sub # ($inLabel, $rowop, $self)
{
  my ($inLabel, $rowop, $self) = @_;
  #print STDERR "DEBUGX LookupJoin " . $self->{name} . " in: ", $rowop->printP(), "\n";

  my $opcode = $rowop->getOpcode(); # pass the opcode
  my $row = $rowop->getRow();

  my @leftdata = $row->toArray();

  my $resRowType = $self->{resultRowType};
  my $resLabel = $self->{outputLabel};

  my $lookuprow = $self->{rightRowType}->makeRowHash(
    "source" => $leftdata[1],
    "external" => $leftdata[2],
    );

  #print STDERR "DEBUGX " . $self->{name} . " lookup: ", $lookuprow->printP(), "\n";
  my $rh = $self->{rightTable}->findIdx($self->{rightIdxType}, $lookuprow);

  my @rightdata; # fields from the right side, defaults to all-undef, if no data found
  my @result; # the result rows will be collected here

  if ($rh->isNull()) {
    #print STDERR "DEBUGX " . $self->{name} . " found NULL\n";

    my @resdata = ($leftdata[0],
    $leftdata[1],
    $leftdata[2],
    $leftdata[3],
    $rightdata[2],
    );
    my $resrowop = $resLabel->makeRowop($opcode, $resRowType->makeRowArray(@resdata));
    #print STDERR "DEBUGX " . $self->{name} . " +out: ", $resrowop->printP(), "\n";
    $resLabel->getUnit()->call($resrowop);

  } else {
    #print STDERR "DEBUGX " . $self->{name} . " found data: " . $rh->getRow()->printP() . "\n";
    my $endrh = $self->{rightTable}->nextGroupIdx($self->{iterIdxType}, $rh);
    for (; !$rh->same($endrh); $rh = $self->{rightTable}->nextIdx($self->{rightIdxType}, $rh)) {
      @rightdata = $rh->getRow()->toArray();
    my @resdata = ($leftdata[0],
    $leftdata[1],
    $leftdata[2],
    $leftdata[3],
    $rightdata[2],
    );
    my $resrowop = $resLabel->makeRowop($opcode, $resRowType->makeRowArray(@resdata));
    #print STDERR "DEBUGX " . $self->{name} . " +out: ", $resrowop->printP(), "\n";
    $resLabel->getUnit()->call($resrowop);

    }
  }
}

It's more complicated in two ways: If a match is found, it has to iterate through the whole matching group. And if the match is not found, it still has to produce a result row for the left join with a separate code fragment.

12.8. JoinTwo joins two tables

Fundamentally, joining two tables is kind of like two symmetrical copies of LookupJoin, each of them reacting to the changes in one table and doing look-ups in the other table. As far as I can tell, the CEP systems with the insert-only stream model tend to start with the assumption that the LookupJoin (or whatever they call it) is good enough. Then it turns out that manually writing the join twice where it can be done once is a pain. So the table-to-table join gets added. Then the interesting nuances crop up, since a correct table-to-table join has more to it than just two stream-to-table joins. Then it turns out that it would be real convenient to propagate the deletes through the join, and that gets added as a special feature behind the scenes.

In Triceps, JoinTwo is the template for joining the tables. And indeed it is translated under the hood into two LookupJoins, but it has more on top of them.

In a common database a join query causes a join plan to be created: on what table to iterate, and in which to look up next. A CEP system deals with the changing data, and a join has to react to the data changes on each of its input tables. It must have multiple plans, one for starting from each of the tables. And essentially a LookupJoin embodies such a plan, and JoinTwo makes two of them.

Why only two? Because it's the minimal usable number. The join logic is tricky, so it's better to work out the kinks on something simpler first. And it still can be scaled to many tables by joining them in stages. It's not quite as efficient as a direct join of multiple tables, because the result of each stage has to be put into a table, but it does the job.
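As a sketch of such staging (all the names here are hypothetical: the tables $tA, $tB, $tC belong to the unit $u, and each has a Hashed index "primary" that includes the common key field "k"; the index key for the intermediate table is also an assumption, use the real primary key of your intermediate result):

```perl
# Hypothetical sketch: join three tables in two stages.
# First stage: join $tA and $tB.
our $joinAB = Triceps::JoinTwo->new(
  name => "joinAB",
  leftTable => $tA,
  rightTable => $tB,
  byLeft => [ "k" ],
  type => "inner",
);

# The intermediate result has to be materialized in a table of its own,
# since JoinTwo requires tables on both sides.
our $ttAB = Triceps::TableType->new($joinAB->getResultRowType())
  ->addSubIndex("primary",
    Triceps::IndexType->newHashed(key => [ "k" ])
  )
;
$ttAB->initialize();
our $tAB = $u->makeTable($ttAB, "tAB");
$joinAB->getOutputLabel()->chain($tAB->getInputLabel());

# Second stage: join the intermediate table with the third one.
our $joinABC = Triceps::JoinTwo->new(
  name => "joinABC",
  leftTable => $tAB,
  rightTable => $tC,
  byLeft => [ "k" ],
  type => "inner",
);
```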

I'll be doing the demonstrations of the table joins on an application example from the area of stock lending. Think of a large multinational broker that wants to keep track of its lending activities. It has many customers to whom the stock can be loaned or from whom it can be borrowed. This information comes as the records of positions, of how many shares are loaned or borrowed for each customer, and at what contractual price. And since the clients are from all around the world, the prices may be in different currencies. A simplified and much shortened version of the position information may look like this:

our $rtPosition = Triceps::RowType->new( # a customer account position
  date => "int32", # as of which date, in format YYYYMMDD
  customer => "string", # customer account id
  symbol => "string", # stock symbol
  quantity => "float64", # number of shares
  price => "float64", # share price in local currency
  currency => "string", # currency code of the price
);

Then we want to aggregate these data in different ways, getting the broker-wide summaries by the symbol, by customer etc. The aggregation is updated as the business day goes on. At the end of the business day the state of the day freezes, and the new day's initial data is loaded. That's why the business date is part of the schema. If you wonder, the next day's initial data is usually the same as at the end of the previous day, except where some contractual conditions change. The detailed position data is thrown away after a few days, or even right at the end of the day, but the aggregation results from the end of the day are kept for a longer history.

There is a problem with summing up the monetary values: they come in different currencies and can not be added up directly. If we want to get this kind of summaries, we have to translate all of them to a single reference currency. That's what the sample joins will be doing: finding the translation rates to the US dollars. The currency rates come in the translation schema:

our $rtToUsd = Triceps::RowType->new( # a currency conversion to USD
  date => "int32", # as of which date, in format YYYYMMDD
  currency => "string", # currency code
  toUsd => "float64", # multiplier to convert this currency to USD
);

Since the currency rates change all the time, to make sense of a previous day's position, the previous day's rates need to be kept around, and so the rates are also marked with a date.

Having the mood set, here is the first example of a model with an inner join:

# exchange rates, to convert all currencies to USD
our $ttToUsd = Triceps::TableType->new($rtToUsd)
  ->addSubIndex("primary",
    Triceps::IndexType->newHashed(key => [ "date", "currency" ])
  )
  ->addSubIndex("byDate", # for cleaning by date
    Triceps::SimpleOrderedIndex->new(date => "ASC")
    ->addSubIndex("grouping", Triceps::IndexType->newFifo())
  )
;
$ttToUsd->initialize();

# the positions in the original currency
our $ttPosition = Triceps::TableType->new($rtPosition)
  ->addSubIndex("primary",
    Triceps::IndexType->newHashed(key => [ "date", "customer", "symbol" ])
  )
  ->addSubIndex("currencyLookup", # for joining with currency conversion
    Triceps::IndexType->newHashed(key => [ "date", "currency" ])
    ->addSubIndex("grouping", Triceps::IndexType->newFifo())
  )
  ->addSubIndex("byDate", # for cleaning by date
    Triceps::SimpleOrderedIndex->new(date => "ASC")
    ->addSubIndex("grouping", Triceps::IndexType->newFifo())
  )
;
$ttPosition->initialize();

our $uJoin = Triceps::Unit->new("uJoin");

our $tToUsd = $uJoin->makeTable($ttToUsd, "tToUsd");
our $tPosition = $uJoin->makeTable($ttPosition, "tPosition");

our $join = Triceps::JoinTwo->new(
  name => "join",
  leftTable => $tPosition,
  rightTable => $tToUsd,
  byLeft => [ "date", "currency" ],
  type => "inner",
); # would confess by itself on an error

# label to print the changes to the detailed stats
makePrintLabel("lbPrint", $join->getOutputLabel());

while(<STDIN>) {
  chomp;
  my @data = split(/,/); # starts with a command, then string opcode
  my $type = shift @data;
  if ($type eq "cur") {
    $uJoin->makeArrayCall($tToUsd->getInputLabel(), @data);
  } elsif ($type eq "pos") {
    $uJoin->makeArrayCall($tPosition->getInputLabel(), @data);
  }
  $uJoin->drainFrame(); # just in case, for completeness
}

The example just does the joining, leaving the aggregation to the imagination of the reader. The result of a JoinTwo is not stored in a table. It is a stream of ephemeral updates, same as for LookupJoin. If you want to keep them, you can put them into a table yourself (and maybe do the aggregation in the same table).
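A sketch of such a collection table for this example (the index key fields here are my assumption of a reasonable primary key for the result; adjust them to your data):

```perl
# Hypothetical sketch: materialize the JoinTwo results in a table.
our $ttJoined = Triceps::TableType->new($join->getResultRowType())
  ->addSubIndex("primary",
    Triceps::IndexType->newHashed(key => [ "date", "customer", "symbol" ])
  )
;
$ttJoined->initialize();
our $tJoined = $uJoin->makeTable($ttJoined, "tJoined");
# feed the join output into the table
$join->getOutputLabel()->chain($tJoined->getInputLabel());
```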

Both of the joined tables must provide a Hashed index for the efficient joining. In this case it will be currencyLookup on the left and primary on the right, found automatically by the key fields. The index may be leaf (selecting one row per key) or non-leaf (containing multiple rows per key) but it must be there. This makes sure that the joins are always efficient and you don't have to hunt for why your model is suddenly so slow.

There are two ways to provide the join condition: either specify it explicitly in the option by or byLeft, or specify the indexes in both tables and have the key fields in them paired together. Or you can specify both, as long as the information stays consistent. For example, the join in this example could also be written as:

our $join = Triceps::JoinTwo->new(
  name => "join",
  leftTable => $tPosition,
  leftIdxPath => [ "currencyLookup" ],
  rightTable => $tToUsd,
  rightIdxPath => [ "primary" ],
  type => "inner",
); # would confess by itself on an error

When the key fields in the indexes are paired up together, it's done in the order they go in the index specifications. Once again, the fields are paired not by name but by order. If the indexes are nested, the outer indexes precede in the order. For example, the $ttToUsd could have the same index done in a nested way and it would work just as well:

  ->addSubIndex("byDate",
    Triceps::IndexType->newHashed(key => [ "date" ])
    ->addSubIndex("primary",
      Triceps::IndexType->newHashed(key => [ "currency" ])
    )
  )

Same as with LookupJoin, currently only the Hashed indexes are supported, and they must go all the way through the path. The outer index byDate here can not be a Sorted/Ordered index; that would be an error, and the join would refuse to accept it.

If the order of key fields in the $ttToUsd index were changed to be different from $ttPosition, like this

  ->addSubIndex("primary",
    Triceps::IndexType->newHashed(key => [ "currency", "date" ])
  )

then it would be a mess for the automatic pairing by index. The wrong fields would be matched up in the join condition, which would become (tPosition.date == tToUsd.currency && tPosition.currency == tToUsd.date), and everything would go horribly wrong. It would be no problem at all for selecting the pairing explicitly with the by options and letting the join find the index, which is the recommended way.

JoinTwo is much less lenient than LookupJoin as far as the key field types go. It requires the types of the matching fields to be exactly the same. Partially, for the reason of catching the wrong field pairing by order, partially for the sake of the result consistency. JoinTwo does the look-ups in both directions. Think about what happens if a string field and an int32 field get matched up, and then non-numeric strings turn up in the string field, containing things like "abc" and "qwerty". Those strings on the left side will match the rows with the numeric 0 on the right side. But then if the row with 0 on the right side changes, it would look for the string "0" on the left, which would find neither "abc" nor "qwerty". The state of the join would become a mess. So no automatic key type conversions here.

By the way, even though JoinTwo doesn't refuse to have the float64 key fields, using them is a bad idea. The floating-point values are subject to non-obvious rounding. And if you have two floating-point values that print the same, this doesn't mean that they are internally the same down to the last bit (because the printing involves the conversion to decimal that involves rounding). The joining requires that the values are exactly equal. Because of this the joining on a floating-point field is rife with unpleasant surprises. Better don't do it. A possible solution is to round values by converting them to integers (scaled by multiplying by a fixed factor to get essentially a fixed-point value). You can even convert them back from fixed-point to floating-point and still join on these floating-point values, because the same values would always be produced from integers in exactly the same way, and will be exactly the same.
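For illustration, here is the fixed-point idea in plain Perl (this is my own sketch, not a part of Triceps; the helper to_fixed() and the scale factor of 100 are made up for the example):

```perl
# Convert a floating-point value to a scaled integer (fixed-point),
# rounding to the nearest. Two values that are the same for business
# purposes then compare exactly equal and are safe to use as join keys.
sub to_fixed {
  my ($v, $scale) = @_;
  return int($v * $scale + ($v >= 0 ? 0.5 : -0.5));
}

my $x = 0.1 + 0.2; # actually 0.30000000000000004, not exactly 0.3
my $y = 0.3;
# as raw floats the two values print the same but don't compare equal,
# so they would never match as join keys; after the conversion they do
my $fx = to_fixed($x, 100); # 30
my $fy = to_fixed($y, 100); # 30
# the floats recreated from the fixed-point values are exactly equal too
my $rx = $fx / 100;
my $ry = $fy / 100;
```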

More of the JoinTwo options closely parallel those in LookupJoin. Obviously, name, rightTable and rightIdxPath are the same, with the added symmetrical leftTable and leftIdxPath. There is no unit option though, the unit is always taken from the tables (which must belong to the same unit). The option to save the source code of the generated joiner has been split in two: leftSaveJoinerTo and rightSaveJoinerTo. Since JoinTwo has to react to the updates from both sides, it has to have two handlers. And since internally it uses two LookupJoins for this purpose, these happen to be the joiner functions of the left and right LookupJoin.

The option type selects the join mode. The inner join is the default, and would have been used even if this option was not specified.

The options controlling the result are also the same as in LookupJoin: leftFields, rightFields, fieldsLeftFirst. There is no option fieldsDropRightKey, JoinTwo always excludes the duplicate key fields automatically. The results in this example include all the fields from both sides by default.

The joins are currently not equipped to actually compute the translated prices directly. They can only look up the information for it, and the computation can be done later, before or during the aggregation.

That's enough explanations for now, let's look at the result. The input rows are shown as usual in bold, and to make keeping track easier, I broke up the output into short snippets with commentary after each one.

cur,OP_INSERT,20120310,USD,1
cur,OP_INSERT,20120310,GBP,2
cur,OP_INSERT,20120310,EUR,1.5

Inserting the reference currencies produces no result, since it's an inner join and they have no matching positions yet.

pos,OP_INSERT,20120310,one,AAA,100,15,USD
join.leftLookup.out OP_INSERT date="20120310" customer="one"
    symbol="AAA" quantity="100" price="15" currency="USD" toUsd="1"
pos,OP_INSERT,20120310,two,AAA,100,8,GBP
join.leftLookup.out OP_INSERT date="20120310" customer="two"
    symbol="AAA" quantity="100" price="8" currency="GBP" toUsd="2"

Now the positions arrive and find the matching translations to USD. The label names on the output are an interesting artifact of all the chained labels receiving the original rowop that refers to the first label in the chain. Which happens to be the output label of a LookupJoin inside JoinTwo. It works conveniently for the demonstrational purposes, since the name of that LookupJoin shows whether the row that triggered the result came from the left or right side of the JoinTwo.

pos,OP_INSERT,20120310,three,AAA,100,300,RUR

This position is out of luck: no translation for its currency. The inner join is actually not a good choice here. If a row does not pass through because of the lack of translation, it gets excluded even from the aggregations that do not require the translation, such as those that total up the quantity of a particular symbol across all the customers. A left outer join would have been better suited.

pos,OP_INSERT,20120310,three,BBB,200,80,GBP
join.leftLookup.out OP_INSERT date="20120310" customer="three"
    symbol="BBB" quantity="200" price="80" currency="GBP" toUsd="2"

Another position arrives, same as before.

cur,OP_INSERT,20120310,RUR,0.04
join.rightLookup.out OP_INSERT date="20120310" customer="three"
    symbol="AAA" quantity="100" price="300" currency="RUR"
    toUsd="0.04"

The translation for RUR finally comes in. The position in RUR can now find its match and propagate through.

cur,OP_DELETE,20120310,GBP,2
join.rightLookup.out OP_DELETE date="20120310" customer="two"
    symbol="AAA" quantity="100" price="8" currency="GBP" toUsd="2"
join.rightLookup.out OP_DELETE date="20120310" customer="three"
    symbol="BBB" quantity="200" price="80" currency="GBP" toUsd="2"
cur,OP_INSERT,20120310,GBP,2.2
join.rightLookup.out OP_INSERT date="20120310" customer="two"
    symbol="AAA" quantity="100" price="8" currency="GBP" toUsd="2.2"
join.rightLookup.out OP_INSERT date="20120310" customer="three"
    symbol="BBB" quantity="200" price="80" currency="GBP" toUsd="2.2"

An exchange rate update for GBP arrives. It amounts to deleting the old translation and then inserting a new one. Each of these operations updates the state of the join: the disappearing translation causes all the GBP positions to be deleted from the result, and the new translation inserts them back, with the new value of toUsd. Which is the correct behavior: to make an update to the result positions, they have to be deleted and then inserted with the new values.

pos,OP_DELETE,20120310,one,AAA,100,15,USD
join.leftLookup.out OP_DELETE date="20120310" customer="one"
    symbol="AAA" quantity="100" price="15" currency="USD" toUsd="1"
pos,OP_INSERT,20120310,one,AAA,200,16,USD
join.leftLookup.out OP_INSERT date="20120310" customer="one"
    symbol="AAA" quantity="200" price="16" currency="USD" toUsd="1"

A position update arrives. Again, it's a delete-and-insert, and propagates through the join as such.

That's the end of the first example. The commentary said that the left outer join would have been better for the logic, so let's make one for the left outer join. All we need to change is the join type option:

our $join = Triceps::JoinTwo->new(
  name => "join",
  leftTable => $tPosition,
  rightTable => $tToUsd,
  byLeft => [ "date", "currency" ],
  type => "left",
); # would confess by itself on an error

Now the positions would pass through even if the currency translation is not available. The same input now produces a different result:

cur,OP_INSERT,20120310,USD,1
cur,OP_INSERT,20120310,GBP,2
cur,OP_INSERT,20120310,EUR,1.5
pos,OP_INSERT,20120310,one,AAA,100,15,USD
join.leftLookup.out OP_INSERT date="20120310" customer="one"
    symbol="AAA" quantity="100" price="15" currency="USD" toUsd="1"
pos,OP_INSERT,20120310,two,AAA,100,8,GBP
join.leftLookup.out OP_INSERT date="20120310" customer="two"
    symbol="AAA" quantity="100" price="8" currency="GBP" toUsd="2"

So far things are going the same as for the inner join.

pos,OP_INSERT,20120310,three,AAA,100,300,RUR
join.leftLookup.out OP_INSERT date="20120310" customer="three"
    symbol="AAA" quantity="100" price="300" currency="RUR"

The first difference: even though there is no translation for RUR, the row still passes through (with the field toUsd being NULL).

pos,OP_INSERT,20120310,three,BBB,200,80,GBP
join.leftLookup.out OP_INSERT date="20120310" customer="three"
    symbol="BBB" quantity="200" price="80" currency="GBP" toUsd="2"

This is also unchanged.

cur,OP_INSERT,20120310,RUR,0.04
join.rightLookup.out OP_DELETE date="20120310" customer="three"
    symbol="AAA" quantity="100" price="300" currency="RUR"
join.rightLookup.out OP_INSERT date="20120310" customer="three"
    symbol="AAA" quantity="100" price="300" currency="RUR"
    toUsd="0.04"

The second difference: since this row from the left side has already passed through, just sending another INSERT for it would make the data inconsistent. The original result without the translation must be deleted first, and then a new one, with translation, inserted. JoinTwo is smart enough to figure it out all by itself.

cur,OP_DELETE,20120310,GBP,2
join.rightLookup.out OP_DELETE date="20120310" customer="two"
    symbol="AAA" quantity="100" price="8" currency="GBP" toUsd="2"
join.rightLookup.out OP_INSERT date="20120310" customer="two"
    symbol="AAA" quantity="100" price="8" currency="GBP"
join.rightLookup.out OP_DELETE date="20120310" customer="three"
    symbol="BBB" quantity="200" price="80" currency="GBP" toUsd="2"
join.rightLookup.out OP_INSERT date="20120310" customer="three"
    symbol="BBB" quantity="200" price="80" currency="GBP"

The same logic works for the deletes, only backwards: when the translation for GBP is deleted, the result rows that used it change to lose the translation.

cur,OP_INSERT,20120310,GBP,2.2
join.rightLookup.out OP_DELETE date="20120310" customer="two"
    symbol="AAA" quantity="100" price="8" currency="GBP"
join.rightLookup.out OP_INSERT date="20120310" customer="two"
    symbol="AAA" quantity="100" price="8" currency="GBP" toUsd="2.2"
join.rightLookup.out OP_DELETE date="20120310" customer="three"
    symbol="BBB" quantity="200" price="80" currency="GBP"
join.rightLookup.out OP_INSERT date="20120310" customer="three"
    symbol="BBB" quantity="200" price="80" currency="GBP" toUsd="2.2"

And again, when the new translation for GBP comes in, the DELETE-INSERT sequence is done for each of the rows. As you can see, the update of the GBP translation in the last two snippets did not work in the most efficient way. Fundamentally, if we knew that a DELETE of GBP will be immediately followed by an INSERT, we could skip inserting and then deleting the rows with the NULL in toUsd. But we don't know it, and in Triceps there is no way to know it.

If you really, really want to avoid the propagation of these intermediate changes, insert after the join a Collapse template described in Section 14.2: “Collapsed updates”, and flush it only after the whole update has been processed. There will be more overhead in the Collapse itself, but all the logic below it will skip the intermediate changes. If this logic below is heavy-weight, that might be an overall win. A caveat though: a Collapse requires that the data has a primary key, a JoinTwo doesn't require its result (nor its inputs) to have a primary key. Because of this, the collapse might not work right with every possible join, you'd have to limit yourself to the joins that produce the data with a primary key.
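A sketch of how such a Collapse might be attached (the key fields here are my assumption; the join result must actually have such a primary key for this to work correctly, see Section 14.2 for the Collapse details):

```perl
# Hypothetical sketch: filter the intermediate changes out of the join
# output through a Collapse.
our $collapse = Triceps::Collapse->new(
  unit => $uJoin,
  name => "collapse",
  data => [
    name => "jres",
    rowType => $join->getResultRowType(),
    key => [ "date", "customer", "symbol" ], # an assumed primary key
  ],
);
$join->getOutputLabel()->chain($collapse->getInputLabel("jres"));
# ... the downstream logic connects to $collapse->getOutputLabel("jres"),
# and after each complete update has been processed:
$collapse->flush();
```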

pos,OP_DELETE,20120310,one,AAA,100,15,USD
join.leftLookup.out OP_DELETE date="20120310" customer="one"
    symbol="AAA" quantity="100" price="15" currency="USD" toUsd="1"
pos,OP_INSERT,20120310,one,AAA,200,16,USD
join.leftLookup.out OP_INSERT date="20120310" customer="one"
    symbol="AAA" quantity="200" price="16" currency="USD" toUsd="1"

And the rest is again the same as with an inner join.

JoinTwo can do a right outer join too, just use the type right. It works in exactly the same way as the left outer join, just with a different table. So much the same that it's not even worth a separate example.

Now, the full outer join. The full outer joins usually get used with a variation of the fork-join topology described in the Section 14.1: “The dreaded diamond”. In it the processing of a row can be forked into multiple parallel paths, each path doing an optional part of the computation and either providing a result row or not, with all the parts eventually merged back together into one row. The full outer join is a convenient way to do this merge: the paths that didn't produce a result get quietly ignored, and the results that were produced get merged back into a single row. The row in such situations is usually identified by a primary key, so the partial results can find each other. This scheme makes the most sense when the paths are executed in the parallel threads, or when the processing on some paths may get delayed and then continued later. If the processing is single-threaded and fast, Triceps provides a more convenient procedural way of getting the same result: just call every path in order and merge the results from them procedurally, and you won't have to keep the intermediate results in their tables forever, nor delete them manually.

Even though that use is typical, it has only the 1:1 record matching and does not highlight all the abilities of the JoinTwo. So, let's come up with another example that does.

The positions-and-currencies example does not lend itself easily to a full outer join but we'll make it do. Suppose that you want to get the total count of positions (per symbol, or altogether), or maybe the total value, for every currency. Including the currencies for which we have the exchange rates but no positions; for them the count should simply be 0 (or maybe NULL). And also the positions for which there are no exchange rate translations. This is a job for a full outer join, followed by an aggregation. The join has the type outer and looks like this:

our $join = Triceps::JoinTwo->new(
  name => "join",
  leftTable => $tPosition,
  rightTable => $tToUsd,
  byLeft => [ "date", "currency" ],
  type => "outer",
); # would confess by itself on an error

As before, the aggregation part will be left to the imagination of the reader. This join has the many-to-one (M:1) row matching, since there might be multiple positions on the left matching one currency rate translation on the right. This will create interesting effects in the output, let's look at it:

cur,OP_INSERT,20120310,GBP,2
join.rightLookup.out OP_INSERT date="20120310" currency="GBP"
    toUsd="2"

The first translation gets through, even though there is no position for it yet.

pos,OP_INSERT,20120310,two,AAA,100,8,GBP
join.leftLookup.out OP_DELETE date="20120310" currency="GBP" toUsd="2"
join.leftLookup.out OP_INSERT date="20120310" customer="two"
    symbol="AAA" quantity="100" price="8" currency="GBP" toUsd="2"

The first position for an existing translation comes in. Now the GBP row has a match, so the unmatched row gets deleted and a matched one gets inserted instead.

pos,OP_INSERT,20120310,three,BBB,200,80,GBP
join.leftLookup.out OP_INSERT date="20120310" customer="three"
    symbol="BBB" quantity="200" price="80" currency="GBP" toUsd="2"

The second position for GBP works differently: since there is no unmatched row any more (it was taken care of by the first position), there is nothing to delete. Just the second matched row gets inserted.

pos,OP_INSERT,20120310,three,AAA,100,300,RUR
join.leftLookup.out OP_INSERT date="20120310" customer="three"
    symbol="AAA" quantity="100" price="300" currency="RUR"

The position without a matching currency gets through as well.

cur,OP_INSERT,20120310,RUR,0.04
join.rightLookup.out OP_DELETE date="20120310" customer="three"
    symbol="AAA" quantity="100" price="300" currency="RUR"
join.rightLookup.out OP_INSERT date="20120310" customer="three"
    symbol="AAA" quantity="100" price="300" currency="RUR"
    toUsd="0.04"

Now the RUR translation becomes available and it has to do the same things as we've seen before, only on the other side: delete the unmatched record and replace it with the matched one.

cur,OP_DELETE,20120310,GBP,2
join.rightLookup.out OP_DELETE date="20120310" customer="two"
    symbol="AAA" quantity="100" price="8" currency="GBP" toUsd="2"
join.rightLookup.out OP_INSERT date="20120310" customer="two"
    symbol="AAA" quantity="100" price="8" currency="GBP"
join.rightLookup.out OP_DELETE date="20120310" customer="three"
    symbol="BBB" quantity="200" price="80" currency="GBP" toUsd="2"
join.rightLookup.out OP_INSERT date="20120310" customer="three"
    symbol="BBB" quantity="200" price="80" currency="GBP"
cur,OP_INSERT,20120310,GBP,2.2
join.rightLookup.out OP_DELETE date="20120310" customer="two"
    symbol="AAA" quantity="100" price="8" currency="GBP"
join.rightLookup.out OP_INSERT date="20120310" customer="two"
    symbol="AAA" quantity="100" price="8" currency="GBP" toUsd="2.2"
join.rightLookup.out OP_DELETE date="20120310" customer="three"
    symbol="BBB" quantity="200" price="80" currency="GBP"
join.rightLookup.out OP_INSERT date="20120310" customer="three"
    symbol="BBB" quantity="200" price="80" currency="GBP" toUsd="2.2"

Then the GBP translation gets updated. First the old translation gets deleted and then the new one inserted. When the translation gets deleted, all the positions in GBP lose their match. So the matched rows get deleted and replaced with the unmatched ones. When the new GBP translation is inserted, the replacement goes in the other direction.

pos,OP_DELETE,20120310,three,BBB,200,80,GBP
join.leftLookup.out OP_DELETE date="20120310" customer="three"
    symbol="BBB" quantity="200" price="80" currency="GBP" toUsd="2.2"

When this position goes away, the row gets deleted from the result as well. However it was not the only position in GBP, so there is no need to insert an unmatched record for GBP.

pos,OP_DELETE,20120310,three,AAA,100,300,RUR
join.leftLookup.out OP_DELETE date="20120310" customer="three"
    symbol="AAA" quantity="100" price="300" currency="RUR"
    toUsd="0.04"
join.leftLookup.out OP_INSERT date="20120310" currency="RUR"
    toUsd="0.04"

This position was the last one in RUR. So when it gets deleted, the RUR translation has no match any more. That means, after deleting the matched row from the results, the unmatched row has to be inserted to keep the balance right.

This business with keeping track of the unmatched rows is not unique to the full outer joins. Remember, it showed up in the left outer joins too, and the right outer joins are no exception either. When the first matching row gets inserted or the last matching row gets deleted on the side that is opposite to the "outer side", the unmatched rows have to be handled in the result. (That would be the right side for the left outer joins, the left side for the right outer joins, and either side for the full outer joins). The special thing about the M:1 (and 1:M and M:M) joins is that there may be more than one matching row. On insertion, the second and following matching rows produce a different effect than the first one. On deletion, the opposite: all the rows but the last work differently from the last one. It's not limited to the full outer joins. M:1 or M:M with a right outer join, and 1:M or M:M with a left outer join will do it too.

If you're like me, by now you'd be wondering: how does it work? If the opposite side is of the one variety (:1 or 1:), which can be known from its use of a leaf index for the join, then every insert is the first insert of a matching row for this key, and every delete is the delete of the last row for this key. Which means: do the empty-match business every time.

If the opposite side is of the many variety (:M or M:), with a non-leaf index, then things get more complicated. The join works by processing the rowops coming out of the argument tables. When it gets a rowop in such a situation, it goes to the table and checks whether that was the first (or last) row for this key, and then uses this knowledge to act accordingly.
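
To make the logic more tangible, here is a plain-Perl sketch of this decision (it is not the Triceps implementation, just the idea of tracking the group sizes for the opposite-side key):

```perl
#!/usr/bin/perl
# Sketch: track the number of matching rows per key, and report whether
# an insert is the first row of its group, or a delete removes the last one.
use strict;
use warnings;

my %groupSize; # key => number of rows currently in the group

sub insertRow { # returns 1 if this was the first matching row for the key
  my ($key) = @_;
  return ++$groupSize{$key} == 1 ? 1 : 0;
}

sub deleteRow { # returns 1 if this was the last matching row for the key
  my ($key) = @_;
  my $left = --$groupSize{$key};
  delete $groupSize{$key} if $left == 0;
  return $left == 0 ? 1 : 0;
}
```

In the real join the group size comes from the table's index rather than from a separate hash, but the "first or last?" question being answered is the same.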

12.9. The key field duplication in JoinTwo

JoinTwo in its raw form has the same problem of the key field duplication as LookupJoin (described in Section 12.6: “The key fields of LookupJoin” ). But it's a higher-level template, so it solves this problem automatically, removing the duplicate fields from the result by default.

But the problem in JoinTwo is even worse, because the table-to-table outer joins must work with the updates from either side. If a row finds no match in an outer join, all the fields from the other side will be NULL. If the key fields pass into the result only from that other side, the result will contain NULLs in the key fields, which would be very wrong. Thus JoinTwo has even more magic built into it: it knows how to have the key fields copied into the result from whichever side happens to be present for a particular row, and does this by default. Another way to think about it is that it makes these fields always available on both sides.

The default behavior is good enough for most situations. But if you want more control, it's available through the option fieldsUniqKey. The default value of this option is first. It means: enable the magic for copying the fields from the non-NULL side to the NULL side; look at the option fieldsLeftFirst and figure out which side goes first in the result; let the key fields pass on that side unchanged (though the user can still block or rename them on that side manually, it's the user's choice); on the other side, automatically generate the blocking specs for the key fields and prepend them to that side's result specification. It's smart enough to know that an undefined leftFields or rightFields means the same as .*, so an undefined result spec is replaced by the blocking specs followed by .*. If you later call the methods

$fspec = $join->getLeftFields();
$fspec = $join->getRightFields();

then you will actually get back the modified field specs.

If you want the key fields to be present in a different location in the result, you can set fieldsUniqKey to left or right. That will make them pass through on the selected side, and the blocking would be automatically added on the other side.

For more control yet, set this option to manual. The magic for making the key fields available on both sides will still be enabled, but no automatic blocking. You can pick and choose the result fields manually, exactly as you want. Remember though that there can't be multiple fields with the same name in the result, so if both sides have these fields named the same, you've got to block or rename one of the two copies.

The final choice is none: it simply disables the key field magic.
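
For illustration, here is a hypothetical sketch (not from the example code) of the manual arrangement for the left join used earlier in this chapter, with the blocking specs ("!name") written out by hand:

```
our $join = Triceps::JoinTwo->new(
  name => "join",
  leftTable => $tPosition,
  rightTable => $tToUsd,
  byLeft => [ "date", "currency" ],
  type => "left",
  fieldsUniqKey => "manual",
  # let the key fields pass through on the left side...
  leftFields => [ ".*" ],
  # ...and block their duplicates on the right side by hand
  rightFields => [ "!date", "!currency", ".*" ],
);
```

It produces the same result as the default first, only now the choice is spelled out explicitly.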

12.10. The override options in JoinTwo

Normally JoinTwo tries to work in a consistent manner, refusing to do the unsafe things that might corrupt the data. But if you really, really want, and are really sure of what you're doing, there are options to override these restrictions.

If you set

  overrideSimpleMinded => 1,

then the logic that produces the DELETE-INSERT sequences for the outer joins gets disabled. The only reason I can think of to use this option is if you want to simulate a CEP system that has no concept of opcodes. So if your data is INSERT-only and you want to produce the INSERT-only data too, and want the dumbed-down logic, this option is your solution.

The option

  overrideKeyTypes => 1,

disables the check for the exact match of the key field types. This might come in handy, for example, if you have an int32 field on one side and an int64 field on the other side, and you know that in reality the values would always stay within the int32 range. Or if you have an integer on one side and a string that always contains an integer on the other side. Since you know that the type conversions can always be done with no loss, you can safely override the type check and still get the correct result.

12.11. JoinTwo input event filtering

Let's look at how the business day logic interacts with the joins. It's typical for the business applications to keep the full data for the current day, or a few recent days, then clear the data that became old and maybe keep it only in an aggregated form.

So, let's add the business day logic to the left join example. It uses the indexes by date to find the rows that have become old:

# exchange rates, to convert all currencies to USD
our $ttToUsd = Triceps::TableType->new($rtToUsd)
  ->addSubIndex("primary",
    Triceps::IndexType->newHashed(key => [ "date", "currency" ])
  )
  ->addSubIndex("byDate", # for cleaning by date
    Triceps::SimpleOrderedIndex->new(date => "ASC")
    ->addSubIndex("grouping", Triceps::IndexType->newFifo())
  )
;
$ttToUsd->initialize();

# the positions in the original currency
our $ttPosition = Triceps::TableType->new($rtPosition)
  ->addSubIndex("primary",
    Triceps::IndexType->newHashed(key => [ "date", "customer", "symbol" ])
  )
  ->addSubIndex("currencyLookup", # for joining with currency conversion
    Triceps::IndexType->newHashed(key => [ "date", "currency" ])
    ->addSubIndex("grouping", Triceps::IndexType->newFifo())
  )
  ->addSubIndex("byDate", # for cleaning by date
    Triceps::SimpleOrderedIndex->new(date => "ASC")
    ->addSubIndex("grouping", Triceps::IndexType->newFifo())
  )
;
$ttPosition->initialize();

# remember the indexes for the future use
our $ixtToUsdByDate = $ttToUsd->findSubIndex("byDate");
our $ixtPositionByDate = $ttPosition->findSubIndex("byDate");

# Go through the table and clear all the rows where the field "date"
# is less than the date argument. The index type orders the table by date.
sub clearByDate($$$) # ($table, $ixt, $date)
{
  my ($table, $ixt, $date) = @_;

  my $next;
  for (my $rhit = $table->beginIdx($ixt); !$rhit->isNull(); $rhit = $next) {
    last if (($rhit->getRow()->get("date")) >= $date);
    $next = $rhit->nextIdx($ixt); # advance before removal
    $table->remove($rhit);
  }
}

The table types are the same as have already been shown before; they've been copied here for convenience. clearByDate() is a universal function that can clear the contents of any table by date, provided that the date is in the field date, and the index type that orders the rows of this table by date is given as an argument. The index with ordering by date must not be just a leaf Ordered index but have a FIFO index nested in it. Without that FIFO index, the Ordered index would allow only one row for each date.

The main loop gets extended with a few more commands:

our $businessDay = undef;

our $join = Triceps::JoinTwo->new(
  name => "join",
  leftTable => $tPosition,
  rightTable => $tToUsd,
  byLeft => [ "date", "currency" ],
  type => "left",
); # would confess by itself on an error

# label to print the changes to the detailed stats
makePrintLabel("lbPrint", $join->getOutputLabel());

while(<STDIN>) {
  chomp;
  my @data = split(/,/); # starts with a command, then string opcode
  my $type = shift @data;
  if ($type eq "cur") {
    $uJoin->makeArrayCall($tToUsd->getInputLabel(), @data);
  } elsif ($type eq "pos") {
    $uJoin->makeArrayCall($tPosition->getInputLabel(), @data);
  } elsif ($type eq "day") { # set the business day
    $businessDay = $data[0] + 0; # convert to an int
  } elsif ($type eq "clear") { # clear the previous day
    # flush the left side first, because it's an outer join
    &clearByDate($tPosition, $ixtPositionByDate, $businessDay);
    &clearByDate($tToUsd, $ixtToUsdByDate, $businessDay);
  }
  $uJoin->drainFrame(); # just in case, for completeness
}

The roll-over to the next business day (after the input data previously shown with the left join example) then looks like this:

day,20120311
clear
join.leftLookup.out OP_DELETE date="20120310" customer="two"
    symbol="AAA" quantity="100" price="8" currency="GBP" toUsd="2.2"
join.leftLookup.out OP_DELETE date="20120310" customer="three"
    symbol="AAA" quantity="100" price="300" currency="RUR"
    toUsd="0.04"
join.leftLookup.out OP_DELETE date="20120310" customer="three"
    symbol="BBB" quantity="200" price="80" currency="GBP" toUsd="2.2"
join.leftLookup.out OP_DELETE date="20120310" customer="one"
    symbol="AAA" quantity="200" price="16" currency="USD" toUsd="1"

Clearing the left-side table before the right-side one is more efficient than the other way around, since this is a left outer join, and since it's an M:1 join. If the right-side table were cleared first, it would first update all the result records to change all the right-side fields in them to NULL, and then the clearing of the left-side table would finally delete these rows. Clearing the left side first removes this churn: it deletes all the rows from the result right away, and then when the right side is cleared, it still tries to look up the matching rows but finds nothing and produces no result. For an inner join the order would not matter: either one would produce the same amount of churn. For a full outer join, the M:1 consideration would come into play, and removing the rows from the left side first would still be more efficient. This way when it removes multiple position rows that match the same currency, all of them but one generate the simple DELETEs, and only the last one would follow up with an INSERT that has only the right-side data in it. That row with the right-side data will get deleted when the currency row gets deleted from the right side. If the right side were deleted first, deleting each row on the right side would cause an output of a DELETE-INSERT result pair for each of its matching position rows from the left side, and would produce more churn. For the 1:1 or M:M full outer joins, the order would not matter.
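
The difference can be counted on the fingers. For $n positions matching a single currency row in this M:1 left join, a plain-Perl tally of the result rowops (just the arithmetic from the paragraph above, not the Triceps logic):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Count the result rowops produced while clearing $n positions that all
# match one currency row, depending on which table is cleared first.
sub clearingChurn {
  my ($n, $leftFirst) = @_;
  if ($leftFirst) {
    # each position delete produces one DELETE of a matched result row;
    # the later currency delete finds no matches and produces nothing
    return $n;
  } else {
    # the currency delete turns each matched result row into an unmatched
    # one (a DELETE-INSERT pair per position), and then each position
    # delete produces one more DELETE
    return 2 * $n + $n;
  }
}
```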

If you don't want these deletions to propagate through the rest of your model, you can just put a filtering logic after the join, to throw away all the modifications for the previous days. Though don't forget that you would then have to delete the previous-day data from the rest of the model's tables manually.

If you want to keep only the aggregated data, you may want to pass the join output to the aggregator without filtering and then filter the aggregator's output, thus stopping the updates to the aggregation results. You may even have a special logic in the aggregator that would ignore the groups of the previous days. Such optimization of the aggregation filtering will be shown in Section 13.1: “Time-limited propagation” . And it isn't any less efficient than filtering on the output of the join: if you filter after the join, you'd still have to remove the rows from the aggregation table, and you'd still have to filter after the aggregation too.

Now, suppose that you want to be extra optimal and don't want any join look-ups to happen at all when you delete the old data. JoinTwo has a feature that lets you do that. You can make it receive the events not directly from the tables but after filtering, using the options leftFromLabel and rightFromLabel:

our $lbPositionCurrent = $uJoin->makeDummyLabel(
  $tPosition->getRowType, "lbPositionCurrent");
our $lbPositionFilter = $uJoin->makeLabel($tPosition->getRowType,
  "lbPositionFilter", undef, sub {
    if ($_[1]->getRow()->get("date") >= $businessDay) {
      $uJoin->call($lbPositionCurrent->adopt($_[1]));
    }
  });
$tPosition->getOutputLabel()->chain($lbPositionFilter);

our $lbToUsdCurrent = $uJoin->makeDummyLabel(
  $tToUsd->getRowType, "lbToUsdCurrent");
our $lbToUsdFilter = $uJoin->makeLabel($tToUsd->getRowType,
  "lbToUsdFilter", undef, sub {
    if ($_[1]->getRow()->get("date") >= $businessDay) {
      $uJoin->call($lbToUsdCurrent->adopt($_[1]));
    }
  });
$tToUsd->getOutputLabel()->chain($lbToUsdFilter);

our $join = Triceps::JoinTwo->new(
  name => "join",
  leftTable => $tPosition,
  leftFromLabel => $lbPositionCurrent,
  rightTable => $tToUsd,
  rightFromLabel => $lbToUsdCurrent,
  byLeft => [ "date", "currency" ],
  type => "left",
); # would confess by itself on an error

The same clearing now looks like this:

day,20120311
clear

No output is coming from the join whatsoever. It all gets cut off before it reaches the join. It's not such a great gain though. Remember that if you want to keep the aggregated data, you would still have to delete the original rows manually from the aggregation table afterwards. And the filtering logic will add overhead, not only during the clearing but all the time.

If you're not careful with the filtering conditions, it's also easy to make the results of the join inconsistent. This example filters both input tables on the same key field, with the same condition, so the output will stay always consistent. But if any of these elements were missing, it becomes possible to produce inconsistent output that has the DELETEs of different rows than INSERTs, and deletions of the rows that haven't been inserted in the first place. The reason is that even though the input events are filtered, the table look-ups done by JoinTwo aren't. If some row comes from the right side and gets thrown away by the filter, and then another row comes on the left side, passes the filter, and then finds a match in that thrown-away right-side row, it will use that row in the result. And the join would think that the right-side row has already been seen, and would produce an incorrect update.

So these options don't provide much of a win but open a major opportunity for a mess, and probably should never be used. They will probably be deleted in the future, unless someone finds a good use for them. They were added because at the time they provided a roundabout way to do a self-join. But the later fixes to the Table logic have made the self-joins possible without this kind of perversion.

12.12. Self-join done with JoinTwo

The self-joins happen when a table is joined to itself. For an example of a model with self-joins, let's look at the Forex trading. People exchange the currencies in every possible direction in multiple markets. The Forex exchange rates are quoted for every pair of currencies, in every direction.

Naturally, if you exchange one currency into another and then back into the first one, you normally end up with less money than you've started with. The rest becomes the transaction cost and lines the pockets of the brokers, market makers and exchanges.

However once in a while some interesting things happen. If the exchange rates between the different currencies become unbalanced, you may be able to exchange currency A for currency B, then for currency C, and back for currency A, and end up with more money than you've started with. (You don't have to do it in sequence, you would normally do all three transactions in parallel.) However it's a short-lived opportunity: as you perform the transactions, you'll be moving the involved exchange rates towards the balance, and you won't be the only one exploiting this opportunity, so you'd better act fast. This activity of bringing the market into balance while simultaneously extracting profit is called arbitration.

So let's make a model that will detect such arbitration opportunities, for the subsequent automated execution. Mind you, it's all grossly simplified, but it shows the gist of it. And most importantly, it uses the self-joins. Here we go:

our $rtRate = Triceps::RowType->new( # an exchange rate between two currencies
  ccy1 => "string", # currency code
  ccy2 => "string", # currency code
  rate => "float64", # multiplier when exchanging ccy1 to ccy2
);

# all exchange rates
our $ttRate = Triceps::TableType->new($rtRate)
  ->addSubIndex("byCcy1",
    Triceps::IndexType->newHashed(key => [ "ccy1" ])
    ->addSubIndex("byCcy12",
      Triceps::IndexType->newHashed(key => [ "ccy2" ])
    )
  )
  ->addSubIndex("byCcy2",
    Triceps::IndexType->newHashed(key => [ "ccy2" ])
    ->addSubIndex("grouping", Triceps::IndexType->newFifo())
  )
;
$ttRate->initialize();

our $uArb = Triceps::Unit->new("uArb");

our $tRate = $uArb->makeTable($ttRate, "tRate");

our $join1 = Triceps::JoinTwo->new(
  name => "join1",
  leftTable => $tRate,
  leftIdxPath => [ "byCcy2" ],
  leftFields => [ "ccy1", "ccy2", "rate/rate1" ],
  rightTable => $tRate,
  rightIdxPath => [ "byCcy1" ],
  rightFields => [ "ccy2/ccy3", "rate/rate2" ],
); # would die by itself on an error
our $ttJoin1 = Triceps::TableType->new($join1->getResultRowType())
  ->addSubIndex("byCcy123",
    Triceps::IndexType->newHashed(key => [ "ccy1", "ccy2", "ccy3" ])
  )
  ->addSubIndex("byCcy31",
    Triceps::IndexType->newHashed(key => [ "ccy3", "ccy1" ])
    ->addSubIndex("grouping", Triceps::IndexType->newFifo())
  )
;
$ttJoin1->initialize();
our $tJoin1 = $uArb->makeTable($ttJoin1, "tJoin1");
$join1->getOutputLabel()->chain($tJoin1->getInputLabel());

our $join2 = Triceps::JoinTwo->new(
  name => "join2",
  leftTable => $tJoin1,
  leftIdxPath => [ "byCcy31" ],
  rightTable => $tRate,
  rightIdxPath => [ "byCcy1", "byCcy12" ],
  rightFields => [ "rate/rate3" ],
  # the field ordering in the indexes is already right, but
  # for clarity add an explicit join condition too
  byLeft => [ "ccy3/ccy1", "ccy1/ccy2" ],
); # would die by itself on an error

# now compute the resulting circular rate and filter the profitable loops
our $rtResult = Triceps::RowType->new(
  $join2->getResultRowType()->getdef(),
  looprate => "float64",
);
my $lbResult = $uArb->makeDummyLabel($rtResult, "lbResult");
my $lbCompute = $uArb->makeLabel($join2->getResultRowType(), "lbCompute", undef, sub {
  my ($label, $rowop) = @_;
  my $row = $rowop->getRow();
  my $looprate = $row->get("rate1") * $row->get("rate2") * $row->get("rate3");

  if ($looprate > 1) {
    $uArb->makeHashCall($lbResult, $rowop->getOpcode(),
      $row->toHash(),
      looprate => $looprate,
    );
  } else {
      print("__", $rowop->printP(), "looprate=$looprate \n"); # for debugging
  }
});
$join2->getOutputLabel()->chain($lbCompute);

# label to print the changes to the detailed stats
makePrintLabel("lbPrint", $lbResult);
#makePrintLabel("lbPrintJoin1", $join1->getOutputLabel());
#makePrintLabel("lbPrintJoin2", $join2->getOutputLabel());

while(<STDIN>) {
  chomp;
  my @data = split(/,/); # starts with a command, then string opcode
  my $type = shift @data;
  if ($type eq "rate") {
    $uArb->makeArrayCall($tRate->getInputLabel(), @data);
  }
  $uArb->drainFrame(); # just in case, for completeness
}

The rate quotes will be coming into $tRate. The indexes are provided both to work with the self-joins and to have a primary index as the first leaf.

There are no special options for the self-join in JoinTwo: just use the same table for both the left and right side. The first join represents two exchange transactions, so it's done by matching the second currency of the first quote to the first currency of the second quote. The result contains three currency names and two rate multipliers. The second join adds one more rate multiplier, returning back to the first currency. Now to learn the effect of the circular conversion we only need to multiply all the multipliers. If it comes out below 1, the cycling transaction would return a loss, if above 1, a profit.
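
The multiplication itself is trivial; here it is on one set of rates from the example run below (USD to EUR to GBP and back to USD):

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $rate1 = 0.78; # USD -> EUR
my $rate2 = 0.74; # EUR -> GBP
my $rate3 = 1.98; # GBP -> USD
my $looprate = $rate1 * $rate2 * $rate3; # above 1 means a profit
```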

The label $lbCompute with a Perl handler performs the multiplication, and if the result is over 1, passes it to the next label $lbResult, from which the data then gets printed. I've also added a debugging printout in case the row doesn't get through. That printout starts with __ and helps to see what goes on inside when no result is coming out.

Finally, the main loop reads the data and puts it into the rates table, thus driving the logic.

Now let's take a look at an example of a run, with interspersed commentary.

rate,OP_INSERT,EUR,USD,1.48
rate,OP_INSERT,USD,EUR,0.65
rate,OP_INSERT,GBP,USD,1.98
rate,OP_INSERT,USD,GBP,0.49

The rate quotes start coming in. Note that the rates are separate for each direction of exchange. So far nothing happens because there aren't enough quotes to complete a loop of three steps.

rate,OP_INSERT,EUR,GBP,0.74
__join2.leftLookup.out OP_INSERT ccy1="EUR" ccy2="GBP" rate1="0.74"
    ccy3="USD" rate2="1.98" rate3="0.65" looprate=0.95238
__join2.leftLookup.out OP_INSERT ccy1="USD" ccy2="EUR" rate1="0.65"
    ccy3="GBP" rate2="0.74" rate3="1.98" looprate=0.95238
__join2.rightLookup.out OP_INSERT ccy1="GBP" ccy2="USD" rate1="1.98"
    ccy3="EUR" rate2="0.65" rate3="0.74" looprate=0.95238
rate,OP_INSERT,GBP,EUR,1.30
__join2.leftLookup.out OP_INSERT ccy1="GBP" ccy2="EUR" rate1="1.3"
    ccy3="USD" rate2="1.48" rate3="0.49" looprate=0.94276
__join2.leftLookup.out OP_INSERT ccy1="USD" ccy2="GBP" rate1="0.49"
    ccy3="EUR" rate2="1.3" rate3="1.48" looprate=0.94276
__join2.rightLookup.out OP_INSERT ccy1="EUR" ccy2="USD" rate1="1.48"
    ccy3="GBP" rate2="0.49" rate3="1.3" looprate=0.94276

Now there are enough currencies in play to complete the loop. None of them get the loop rate over 1 though, so the only printouts are from the debugging logic. There are only two loops, but each of them is printed three times. Why? It's a loop, so you can start from each of its elements and come back to the same element. One row for each starting point. And the joins find all of them.

To find and eliminate the duplicates, the order of currencies in the rows can be rotated to put the alphabetically lowest currency code first. Note that they can't be just sorted because the relative order matters. Trading in the order GBP-USD-EUR will give a different result than GBP-EUR-USD. The relative order has to be preserved. I didn't put any such elimination into the example to keep it smaller.
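
The rotation itself can be sketched in plain Perl (a hypothetical helper, not part of the example code):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Rotate a triple of currency codes so that the lexically smallest one
# goes first, preserving the cyclic order (a rotation, not a sort).
sub rotateLoop {
  my @ccy = @_; # three currency codes in trade order
  my $min = 0;  # index of the smallest code
  for my $i (1, 2) {
    $min = $i if $ccy[$i] lt $ccy[$min];
  }
  return @ccy[$min .. 2], @ccy[0 .. $min - 1];
}
```

All three rotations of the same loop then map to the same canonical triple, while the opposite-direction loop stays distinct.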

rate,OP_DELETE,EUR,USD,1.48
__join2.leftLookup.out OP_DELETE ccy1="EUR" ccy2="USD" rate1="1.48"
    ccy3="GBP" rate2="0.49" rate3="1.3" looprate=0.94276
__join2.leftLookup.out OP_DELETE ccy1="GBP" ccy2="EUR" rate1="1.3"
    ccy3="USD" rate2="1.48" rate3="0.49" looprate=0.94276
__join2.rightLookup.out OP_DELETE ccy1="USD" ccy2="GBP" rate1="0.49"
    ccy3="EUR" rate2="1.3" rate3="1.48" looprate=0.94276
rate,OP_INSERT,EUR,USD,1.28
__join2.leftLookup.out OP_INSERT ccy1="EUR" ccy2="USD" rate1="1.28"
    ccy3="GBP" rate2="0.49" rate3="1.3" looprate=0.81536
__join2.leftLookup.out OP_INSERT ccy1="GBP" ccy2="EUR" rate1="1.3"
    ccy3="USD" rate2="1.28" rate3="0.49" looprate=0.81536
__join2.rightLookup.out OP_INSERT ccy1="USD" ccy2="GBP" rate1="0.49"
    ccy3="EUR" rate2="1.3" rate3="1.28" looprate=0.81536

Someone starts changing lots of euros for dollars, and the rate moves. No good news for us yet though.

rate,OP_DELETE,USD,EUR,0.65
__join2.leftLookup.out OP_DELETE ccy1="USD" ccy2="EUR" rate1="0.65"
    ccy3="GBP" rate2="0.74" rate3="1.98" looprate=0.95238
__join2.leftLookup.out OP_DELETE ccy1="GBP" ccy2="USD" rate1="1.98"
    ccy3="EUR" rate2="0.65" rate3="0.74" looprate=0.95238
__join2.rightLookup.out OP_DELETE ccy1="EUR" ccy2="GBP" rate1="0.74"
    ccy3="USD" rate2="1.98" rate3="0.65" looprate=0.95238
rate,OP_INSERT,USD,EUR,0.78
lbResult OP_INSERT ccy1="USD" ccy2="EUR" rate1="0.78" ccy3="GBP"
    rate2="0.74" rate3="1.98" looprate="1.142856"
lbResult OP_INSERT ccy1="GBP" ccy2="USD" rate1="1.98" ccy3="EUR"
    rate2="0.78" rate3="0.74" looprate="1.142856"
lbResult OP_INSERT ccy1="EUR" ccy2="GBP" rate1="0.74" ccy3="USD"
    rate2="1.98" rate3="0.78" looprate="1.142856"

The rate for dollars-to-euros follows its opposite. This creates an arbitration opportunity! Step two: trade in the direction USD-EUR-GBP-USD, step three: PROFIT!!!

rate,OP_DELETE,EUR,GBP,0.74
lbResult OP_DELETE ccy1="EUR" ccy2="GBP" rate1="0.74" ccy3="USD"
    rate2="1.98" rate3="0.78" looprate="1.142856"
lbResult OP_DELETE ccy1="USD" ccy2="EUR" rate1="0.78" ccy3="GBP"
    rate2="0.74" rate3="1.98" looprate="1.142856"
lbResult OP_DELETE ccy1="GBP" ccy2="USD" rate1="1.98" ccy3="EUR"
    rate2="0.78" rate3="0.74" looprate="1.142856"
rate,OP_INSERT,EUR,GBP,0.64
__join2.leftLookup.out OP_INSERT ccy1="EUR" ccy2="GBP" rate1="0.64"
    ccy3="USD" rate2="1.98" rate3="0.78" looprate=0.988416
__join2.leftLookup.out OP_INSERT ccy1="USD" ccy2="EUR" rate1="0.78"
    ccy3="GBP" rate2="0.64" rate3="1.98" looprate=0.988416
__join2.rightLookup.out OP_INSERT ccy1="GBP" ccy2="USD" rate1="1.98"
    ccy3="EUR" rate2="0.78" rate3="0.64" looprate=0.988416

Our trading (and perhaps other people's trading too) moves the exchange rate of euros to pounds. And with that the balance of currencies is restored, and the arbitration opportunity disappears.

Now let's have a look inside JoinTwo. What is so special about the self-join? Normally the join works on two separate tables. They get updated one at a time. Even if some common reason causes both tables to be updated, the update arrives from one table first. The join sees this incoming update, looks in the unchanged second table, produces an updated result. Then the update from the second table comes to the join, which takes it, looks in the already modified first table, and produces another updated result.

If both inputs are from the same table, this logic breaks. Two copies of the updates will arrive, but by the time the first one arrives, the contents of the table have already been changed. When the join looks in the table, it gets the unexpected results and creates a mess.

But JoinTwo has a fix for this. It makes use of the Pre label of the table for its left-side update (the right side would have worked just as well, it's just an arbitrary choice):

  my $selfJoin = $self->{leftTable}->same($self->{rightTable});
  if ($selfJoin && !defined $self->{leftFromLabel}) {
    # one side must be fed from Pre label (but still let the user override)
    $self->{leftFromLabel} = $self->{leftTable}->getPreLabel();
  }

This way when the join sees the first update, the table hasn't changed yet. And then the second copy of that update comes through the normal output label, after the table has been modified. Everything just works out as normal and the self-joins produce the correct result.

Normally you don't need to concern yourself with this, except if you're trying to filter the data coming to the join. Then remember that for leftFromLabel you have to receive the data from the table's getPreLabel(), not getOutputLabel().
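
For example, a filter for the left side of the arbitration self-join could be sketched like this (a hypothetical variation of the filtering example shown earlier, with the filtering condition left out):

```
our $lbRateCurrent = $uArb->makeDummyLabel(
  $tRate->getRowType, "lbRateCurrent");
our $lbRateFilter = $uArb->makeLabel($tRate->getRowType,
  "lbRateFilter", undef, sub {
    # pass through only the rows of interest (the condition goes here)
    $uArb->call($lbRateCurrent->adopt($_[1]));
  });
# chain from the Pre label, NOT from getOutputLabel()
$tRate->getPreLabel()->chain($lbRateFilter);
```

And then $lbRateCurrent would be passed to JoinTwo in the option leftFromLabel.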

12.13. Self-join done manually

In many cases the self-joins are better done by manually looping through the data. This is especially true if the table represents a tree, linked by the parent-child node ids, and the processing has to navigate through the tree. Indeed, if the tree may be of an arbitrary depth, there is no way to handle it with the common joins: you would need as many joins as the depth of the tree (though there are some SQL extensions for the recursive self-joins).

The arbitration example can also be conveniently rewritten through the manual loops. The input row type, table type, table, unit, and the main loop do not change, so I won't copy them the second time. The rest of the code is:

our $rtResult = Triceps::RowType->new(
  ccy1 => "string", # currency code
  ccy2 => "string", # currency code
  ccy3 => "string", # currency code
  rate1 => "float64",
  rate2 => "float64",
  rate3 => "float64",
  looprate => "float64",
);
my $ixtCcy1 = $ttRate->findSubIndex("byCcy1");
my $ixtCcy12 = $ixtCcy1->findSubIndex("byCcy12");

my $lbResult = $uArb->makeDummyLabel($rtResult, "lbResult");
my $lbCompute = $uArb->makeLabel($rtRate, "lbCompute", undef, sub {
  my ($label, $rowop) = @_;
  my $row = $rowop->getRow();
  my $ccy1 = $row->get("ccy1");
  my $ccy2 = $row->get("ccy2");
  my $rate1 = $row->get("rate");

  my $rhi = $tRate->findIdxBy($ixtCcy1, ccy1 => $ccy2);
  my $rhiEnd = $rhi->nextGroupIdx($ixtCcy12);
  for (; !$rhi->same($rhiEnd); $rhi = $rhi->nextIdx($ixtCcy12)) {
    my $row2 = $rhi->getRow();
    my $ccy3 = $row2->get("ccy2");
    my $rate2 = $row2->get("rate");

    my $rhj = $tRate->findIdxBy($ixtCcy12, ccy1 => $ccy3, ccy2 => $ccy1);
    # it's a leaf primary index, so there may be no more than one match
    next
      if ($rhj->isNull());
    my $row3 = $rhj->getRow();
    my $rate3 = $row3->get("rate");
    my $looprate = $rate1 * $rate2 * $rate3;

    # now build the row in normalized order of currencies
    print("____Order before: $ccy1, $ccy2, $ccy3\n");
    my $result;
    if ($ccy2 lt $ccy3) {
      if ($ccy2 lt $ccy1) { # rotate left
        $result = $lbResult->makeRowopHash($rowop->getOpcode(),
          ccy1 => $ccy2,
          ccy2 => $ccy3,
          ccy3 => $ccy1,
          rate1 => $rate2,
          rate2 => $rate3,
          rate3 => $rate1,
          looprate => $looprate,
        );
      }
    } else {
      if ($ccy3 lt $ccy1) { # rotate right
        $result = $lbResult->makeRowopHash($rowop->getOpcode(),
          ccy1 => $ccy3,
          ccy2 => $ccy1,
          ccy3 => $ccy2,
          rate1 => $rate3,
          rate2 => $rate1,
          rate3 => $rate2,
          looprate => $looprate,
        );
      }
    }
    if (!defined $result) { # use the straight order
      $result = $lbResult->makeRowopHash($rowop->getOpcode(),
        ccy1 => $ccy1,
        ccy2 => $ccy2,
        ccy3 => $ccy3,
        rate1 => $rate1,
        rate2 => $rate2,
        rate3 => $rate3,
        looprate => $looprate,
      );
    }
    if ($looprate > 1) {
      $uArb->call($result);
    } else {
      print("__", $result->printP(), "\n"); # for debugging
    }
  }
});
$tRate->getOutputLabel()->chain($lbCompute);
makePrintLabel("lbPrint", $lbResult);

Whenever a new rowop is processed in the table, it goes to the label $lbCompute. The row in this rowop is the first leg of the triangle. The loop then finds all the possible second legs that can be connected to the first leg. And then for each second leg it checks whether it can make the third leg back to the original currency. If it can, good, we've found a candidate for a result row.

The way the loops work, this time there is no triplication. But the same triangle can still be found starting from any of its three currencies. This means that to keep the data consistent, no matter what was the first currency in a particular run, it must still produce the exact same result row. To achieve that, the currencies get rotated as explained in the previous section, making sure that the first currency has the lexically smallest name. These if-else statements do that by selecting the direction of rotation (if any) and building the result record in one of three ways.

Finally, it compares the combined rate to 1, and if it's greater, sends the result. If not, a debugging printout starting with __ prints the row, so that it can be seen. Another debugging printout prints the original order of the currencies, letting us check that the rotation was performed correctly.

On feeding the same input data this code produces the result:

rate,OP_INSERT,EUR,USD,1.48
rate,OP_INSERT,USD,EUR,0.65
rate,OP_INSERT,GBP,USD,1.98
rate,OP_INSERT,USD,GBP,0.49
rate,OP_INSERT,EUR,GBP,0.74
____Order before: EUR, GBP, USD
__lbResult OP_INSERT ccy1="EUR" ccy2="GBP" ccy3="USD" rate1="0.74"
    rate2="1.98" rate3="0.65" looprate="0.95238"
rate,OP_INSERT,GBP,EUR,1.30
____Order before: GBP, EUR, USD
__lbResult OP_INSERT ccy1="EUR" ccy2="USD" ccy3="GBP" rate1="1.48"
    rate2="0.49" rate3="1.3" looprate="0.94276"
rate,OP_DELETE,EUR,USD,1.48
____Order before: EUR, USD, GBP
__lbResult OP_DELETE ccy1="EUR" ccy2="USD" ccy3="GBP" rate1="1.48"
    rate2="0.49" rate3="1.3" looprate="0.94276"
rate,OP_INSERT,EUR,USD,1.28
____Order before: EUR, USD, GBP
__lbResult OP_INSERT ccy1="EUR" ccy2="USD" ccy3="GBP" rate1="1.28"
    rate2="0.49" rate3="1.3" looprate="0.81536"
rate,OP_DELETE,USD,EUR,0.65
____Order before: USD, EUR, GBP
__lbResult OP_DELETE ccy1="EUR" ccy2="GBP" ccy3="USD" rate1="0.74"
    rate2="1.98" rate3="0.65" looprate="0.95238"
rate,OP_INSERT,USD,EUR,0.78
____Order before: USD, EUR, GBP
lbResult OP_INSERT ccy1="EUR" ccy2="GBP" ccy3="USD" rate1="0.74"
    rate2="1.98" rate3="0.78" looprate="1.142856"
rate,OP_DELETE,EUR,GBP,0.74
____Order before: EUR, GBP, USD
lbResult OP_DELETE ccy1="EUR" ccy2="GBP" ccy3="USD" rate1="0.74"
    rate2="1.98" rate3="0.78" looprate="1.142856"
rate,OP_INSERT,EUR,GBP,0.64
____Order before: EUR, GBP, USD
__lbResult OP_INSERT ccy1="EUR" ccy2="GBP" ccy3="USD" rate1="0.64"
    rate2="1.98" rate3="0.78" looprate="0.988416"

It's the same result as before, only without the triplicates. And you can see that the rotation logic works right. The manual self-joining has produced the result without triplicates, without an intermediate table, and for me writing and understanding its logic is much easier than with the proper joins. I'd say that the manual self-join is a winner in every respect.

An interesting thing is that this manual logic produces the same result independently of whether it's connected to the Output or Pre label of the table. Try changing it; it works the same. This is because the original row is taken directly from the input rowop and never participates in the join again; it's never read from the table by any of the loops. If it were read again from the table by the loops, the table connection would matter. And the correct one would be fairly weird: the INSERT rowops would have to be processed coming from the Output label, the DELETE rowops coming from the Pre label.

This is because the row has to be in the table to be found. And for an INSERT the row gets there only after it goes through the table and comes out on the Output label. But for a DELETE the row would already be deleted from the table by that time. Instead it has to be handled before that, on the Pre label, when the table is only preparing to delete it.

If you look at the version with JoinTwo, that's also how an inner self-join works. Since it's an inner join, the rows on both sides must be present to produce a result. An INSERT first arrives from the Pre label on the left side, doesn't find itself in the table, and produces no result (again, we're talking here about the situation when a row has to get joined to itself; it might well find the other pairs for itself and produce a result for them, but not for itself joined with itself). Then it arrives the second time from the Output label on the right side. Now it looks in the table, finds itself, and produces the result (an INSERT coming from the join). A DELETE also first arrives from the Pre label on the left side. It finds its copy in the table and produces the result (a DELETE coming from the join). When the second copy of the row arrives from the Output label on the right side, it doesn't find its copy in the table any more, and produces nothing. In the end it's the same thing: an INSERT comes out of the join triggered by the table's Output label, a DELETE comes out of the join triggered by the table's Pre label. It's not a whimsy, it's caused by the requirements of correctness. The manual self-join would have to mimic this order to produce the correct result. In such a situation perhaps JoinTwo would be easier to use than doing things manually.
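A manual self-join that re-reads the table could mimic that order with wiring along these lines (a sketch only, with the actual join computation elided into a hypothetical closure $workJoin):

```perl
# Sketch: feed the DELETEs from the Pre label (the row is still in the
# table at that point) and the INSERTs from the Output label (the row is
# already in the table at that point). $workJoin is a hypothetical
# closure containing the join logic that re-reads the table.
my $lbPre = $uArb->makeLabel($tRate->getRowType(), "lbPre", undef, sub {
  my ($label, $rowop) = @_;
  &$workJoin($rowop) if (&Triceps::isDelete($rowop->getOpcode()));
});
my $lbOut = $uArb->makeLabel($tRate->getRowType(), "lbOut", undef, sub {
  my ($label, $rowop) = @_;
  &$workJoin($rowop) if (&Triceps::isInsert($rowop->getOpcode()));
});
$tRate->getPreLabel()->chain($lbPre);
$tRate->getOutputLabel()->chain($lbOut);
```

Each opcode then reaches the join logic exactly once, at the moment when the table contents are correct for it.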

12.14. Self-join done with a LookupJoin

The experience with the manual join has made me think about using a similar approach to avoid the triplication of data in the version with the join templates. And after some false starts, I've realized that what that version needs is LookupJoins. They replace the loops. So, one more version is:

our $join1 = Triceps::LookupJoin->new(
  name => "join1",
  leftFromLabel => $tRate->getOutputLabel(),
  leftFields => [ "ccy1", "ccy2", "rate/rate1" ],
  rightTable => $tRate,
  rightIdxPath => [ "byCcy1" ],
  rightFields => [ "ccy2/ccy3", "rate/rate2" ],
  byLeft => [ "ccy2/ccy1" ],
  isLeft => 0,
); # would die by itself on an error

our $join2 = Triceps::LookupJoin->new(
  name => "join2",
  leftFromLabel => $join1->getOutputLabel(),
  rightTable => $tRate,
  rightIdxPath => [ "byCcy1", "byCcy12" ],
  rightFields => [ "rate/rate3" ],
  byLeft => [ "ccy3/ccy1", "ccy1/ccy2" ],
  isLeft => 0,
); # would die by itself on an error

# now compute the resulting circular rate and filter the profitable loops
our $rtResult = Triceps::RowType->new(
  $join2->getResultRowType()->getdef(),
  looprate => "float64",
);
my $lbResult = $uArb->makeDummyLabel($rtResult, "lbResult");
my $lbCompute = $uArb->makeLabel($join2->getResultRowType(), "lbCompute", undef, sub {
  my ($label, $rowop) = @_;
  my $row = $rowop->getRow();

  my $ccy1 = $row->get("ccy1");
  my $ccy2 = $row->get("ccy2");
  my $ccy3 = $row->get("ccy3");
  my $rate1 = $row->get("rate1");
  my $rate2 = $row->get("rate2");
  my $rate3 = $row->get("rate3");
  my $looprate = $rate1 * $rate2 * $rate3;

  # now build the row in normalized order of currencies
  print("____Order before: $ccy1, $ccy2, $ccy3\n");
  my $result;
  if ($ccy2 lt $ccy3) {
    if ($ccy2 lt $ccy1) { # rotate left
      $result = $lbResult->makeRowopHash($rowop->getOpcode(),
        ccy1 => $ccy2,
        ccy2 => $ccy3,
        ccy3 => $ccy1,
        rate1 => $rate2,
        rate2 => $rate3,
        rate3 => $rate1,
        looprate => $looprate,
      );
    }
  } else {
    if ($ccy3 lt $ccy1) { # rotate right
      $result = $lbResult->makeRowopHash($rowop->getOpcode(),
        ccy1 => $ccy3,
        ccy2 => $ccy1,
        ccy3 => $ccy2,
        rate1 => $rate3,
        rate2 => $rate1,
        rate3 => $rate2,
        looprate => $looprate,
      );
    }
  }
  if (!defined $result) { # use the straight order
    $result = $lbResult->makeRowopHash($rowop->getOpcode(),
      ccy1 => $ccy1,
      ccy2 => $ccy2,
      ccy3 => $ccy3,
      rate1 => $rate1,
      rate2 => $rate2,
      rate3 => $rate3,
      looprate => $looprate,
    );
  }
  if ($looprate > 1) {
    $uArb->call($result);
  } else {
    print("__", $result->printP(), "\n"); # for debugging
  }
});
$join2->getOutputLabel()->chain($lbCompute);

It produces the exact same result as the version with the manual loops, the only minor difference being the field order in the result rows.

And, in retrospect, I should probably have made a function for the row rotation, so that I would not have to copy that code here.
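Such a function could look like this (a plain-Perl sketch, not tied to Triceps; it returns the values instead of building the rowop, so that the callers stay free to build it as they like):

```perl
# Hypothetical helper factoring out the rotation: returns the currencies
# and rates rotated so that the lexically smallest currency name comes
# first, preserving the cyclic order of the loop.
sub rotateLoop # ($ccy1, $ccy2, $ccy3, $rate1, $rate2, $rate3)
{
  my ($ccy1, $ccy2, $ccy3, $rate1, $rate2, $rate3) = @_;
  if ($ccy2 lt $ccy3) {
    if ($ccy2 lt $ccy1) { # rotate left
      return ($ccy2, $ccy3, $ccy1, $rate2, $rate3, $rate1);
    }
  } else {
    if ($ccy3 lt $ccy1) { # rotate right
      return ($ccy3, $ccy1, $ccy2, $rate3, $rate1, $rate2);
    }
  }
  return ($ccy1, $ccy2, $ccy3, $rate1, $rate2, $rate3); # already normalized
}
```

For example, rotateLoop("USD", "EUR", "GBP", 0.78, 0.74, 1.98) returns ("EUR", "GBP", "USD", 0.74, 1.98, 0.78), matching the rotation seen in the sample output above.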

Well, it works the same as the version with the loops and maybe even looks a little bit neater, but in practice it's much harder to write, debug and understand. The caveat for the situation where the incoming row might participate in the join a second time applies to this version of the code as well. The same thing about the Pre and Output labels would have to be done, resulting in four LookupJoins instead of two. Each pair would become a manually-built analog of JoinTwo, and it's probably easier to use a JoinTwo to start with.

12.15. A glimpse inside JoinTwo and the hidden options of LookupJoin

The internals of JoinTwo provide an interesting example of a template that builds upon another template (LookupJoin). For a while JoinTwo was compact, straightforward, and easy to demonstrate. Then it grew all these extra features, options and error checks, and became quite complicated. So I'll show only selected portions of the JoinTwo constructor, with the gist of its functionality:

...
  my $selfJoin = $self->{leftTable}->same($self->{rightTable});
  if ($selfJoin && !defined $self->{leftFromLabel}) {
    # one side must be fed from Pre label (but still let the user override)
    $self->{leftFromLabel} = $self->{leftTable}->getPreLabel();
  }
...

  my ($leftLeft, $rightLeft);
  if ($self->{type} eq "inner") {
    $leftLeft = 0;
    $rightLeft = 0;
  } elsif ($self->{type} eq "left") {
    $leftLeft = 1;
    $rightLeft = 0;
  } elsif ($self->{type} eq "right") {
    $leftLeft = 0;
    $rightLeft = 1;
  } elsif ($self->{type} eq "outer") {
    $leftLeft = 1;
    $rightLeft = 1;
  } else {
    Carp::confess("Unknown value '" . $self->{type} . "' of option 'type', must be one of inner|left|right|outer");
  }

  $self->{leftRowType} = $self->{leftTable}->getRowType();
  $self->{rightRowType} = $self->{rightTable}->getRowType();
...

  for my $side ( ("left", "right") ) {
    if (defined $self->{"${side}FromLabel"}) {
...
    } else {
      $self->{"${side}FromLabel"} = $self->{"${side}Table"}->getOutputLabel();
    }

    my @keys;
    ($self->{"${side}IdxType"}, @keys) = $self->{"${side}Table"}->getType()->findIndexKeyPath(@{$self->{"${side}IdxPath"}});
    # would already confess if the index is not found

    if (!$self->{overrideSimpleMinded}) {
      if (!$self->{"${side}IdxType"}->isLeaf()
          && ($self->{type} ne "inner" && $self->{type} ne $side) ) {
        my $table = $self->{"${side}Table"};
        my $ixt = $self->{"${side}IdxType"};
        if ($selfJoin && $side eq "left") {
          # the special case, reading from the table's Pre label;
          # must adjust the count for what will happen after the row gets processed
          $self->{"${side}GroupSizeCode"} = sub { # (opcode, row)
            if (&Triceps::isInsert($_[0])) {
              $table->groupSizeIdx($ixt, $_[1])+1;
            } else {
              $table->groupSizeIdx($ixt, $_[1])-1;
            }
          };
        } else {
          $self->{"${side}GroupSizeCode"} = sub { # (opcode, row)
            $table->groupSizeIdx($ixt, $_[1]);
          };
        }
      }
    }

...
  my $fieldsMirrorKey = 1;
  my $uniq = $self->{fieldsUniqKey};
  if ($uniq eq "first") {
    $uniq = $self->{fieldsLeftFirst} ? "left" : "right";
  }
  if ($uniq eq "none") {
    $fieldsMirrorKey = 0;
  } elsif ($uniq eq "manual") {
    # nothing to do
  } elsif ($uniq =~ /^(left|right)$/) {
    my($side, @keys);
    if ($uniq eq "left") {
      $side = "right";
      @keys = @rightkeys;
    } else {
      $side = "left";
      @keys = @leftkeys;
    }
    if (!defined $self->{"${side}Fields"}) {
      $self->{"${side}Fields"} = [ ".*" ]; # the implicit pass-all
    }
    unshift(@{$self->{"${side}Fields"}}, map("!$_", @keys) );
  } else {
    Carp::confess("Unknown value '" . $self->{fieldsUniqKey} . "' of option 'fieldsUniqKey', must be one of none|manual|left|right|first");
  }

  # now create the LookupJoins
  $self->{leftLookup} = Triceps::LookupJoin->new(
    unit => $self->{unit},
    name => $self->{name} . ".leftLookup",
    leftRowType => $self->{leftRowType},
    rightTable => $self->{rightTable},
    rightIdxPath => $self->{rightIdxPath},
    leftFields => $self->{leftFields},
    rightFields => $self->{rightFields},
    fieldsLeftFirst => $self->{fieldsLeftFirst},
    fieldsMirrorKey => $fieldsMirrorKey,
    by => \@leftby,
    isLeft => $leftLeft,
    automatic => 1,
    oppositeOuter => ($rightLeft && !$self->{overrideSimpleMinded}),
    groupSizeCode => $self->{leftGroupSizeCode},
    saveJoinerTo => $self->{leftSaveJoinerTo},
  );
  $self->{rightLookup} = Triceps::LookupJoin->new(
    unit => $self->{unit},
    name => $self->{name} . ".rightLookup",
    leftRowType => $self->{rightRowType},
    rightTable => $self->{leftTable},
    rightIdxPath => $self->{leftIdxPath},
    leftFields => $self->{rightFields},
    rightFields => $self->{leftFields},
    fieldsLeftFirst => !$self->{fieldsLeftFirst},
    fieldsMirrorKey => $fieldsMirrorKey,
    by => \@rightby,
    isLeft => $rightLeft,
    automatic => 1,
    oppositeOuter => ($leftLeft && !$self->{overrideSimpleMinded}),
    groupSizeCode => $self->{rightGroupSizeCode},
    saveJoinerTo => $self->{rightSaveJoinerTo},
  );

  # create the output label
  $self->{outputLabel} = $self->{unit}->makeDummyLabel($self->{leftLookup}->getResultRowType(), $self->{name} . ".out");

  # and connect them together
  $self->{leftFromLabel}->chain($self->{leftLookup}->getInputLabel());
  $self->{rightFromLabel}->chain($self->{rightLookup}->getInputLabel());
  $self->{leftLookup}->getOutputLabel()->chain($self->{outputLabel});
  $self->{rightLookup}->getOutputLabel()->chain($self->{outputLabel});

In the end it boils down to two LookupJoins, with the options computed from the JoinTwo's options. But you might notice that there are a few LookupJoin options that haven't been described before.

Despite the title of the section, these options aren't really hidden, it's just that they aren't particularly useful unless you want to use a LookupJoin as a part of a multi-sided join, like JoinTwo does. It's even hard to explain what they do without explaining the JoinTwo first. If you're not interested in such details, you can just as well skip them.

So, setting

  oppositeOuter => 1,

tells that this LookupJoin is a part of an outer join, with the opposite side (the right side, for this LookupJoin) being the outer one (well, this side might be outer too if isLeft => 1, but that's a whole separate question). This enables the logic that checks whether the row inserted here is the first one that matches a row in the right-side table, and whether the row deleted here was the last one that matches. If the condition is satisfied, instead of a simple INSERT or DELETE rowop, a correct DELETE-INSERT pair is produced that replaces the old state with the new one. It has been described in detail in Section 12.8: “JoinTwo joins two tables”.

But how does it know whether the current row is the first one, the last one, or neither? After all, LookupJoin doesn't have any access to the left-side table.

It has two ways to know. First, by default it simply assumes that it's a one-to-something (1:1 or 1:M) join. Then there may be no more than one matching row on this side, every row inserted is the first one, and every row deleted is the last one. Then it does the DELETE-INSERT trick every time.

Second, the option

  groupSizeCode => \&groupSizeComputation,

can be used to compute the current group size for the current row. It provides a function that does the computation and gets called as

$gsz = &{$self->{groupSizeCode}}($opcode, $row);

Note that it doesn't get the table reference nor the index type reference as arguments, so it has to be a closure with the references compiled into it. JoinTwo does it with the definition

sub { # (opcode, row)
  $table->groupSizeIdx($ixt, $_[1]);
}

Why not just pass the table and index type references to JoinTwo and let it do the same computation without the mess of the closure references? Because the group size computation may need to be different. When JoinTwo does a self-join, it feeds the left side from the table's Pre label, and the normal group size computation would be incorrect because the rowop hasn't been applied to the table yet. Instead it has to predict what will happen when the rowop gets applied:

sub { # (opcode, row)
  if (&Triceps::isInsert($_[0])) {
    $table->groupSizeIdx($ixt, $_[1])+1;
  } else {
    $table->groupSizeIdx($ixt, $_[1])-1;
  }
}

Setting the option groupSizeCode to undef (which is its default value) triggers the one-to-something behavior.

The option

  fieldsMirrorKey => 1,

has already been described. It enables another magic behavior: mirroring the values of the key fields to both sides before they are used to produce the result row. This is the heavy machinery that underlies JoinTwo's high-level option fieldsUniqKey. But what hasn't been described yet is that the mirroring goes both ways: if this is a left join and no matching row is found on the right, the values of the key fields will be copied from the left to the right. If the option oppositeOuter is set and causes a row with the empty left side to be produced as a part of a DELETE-INSERT pair, the key fields will be copied from the right to the left.

Chapter 13. Time processing

13.1. Time-limited propagation

When aggregating data, often the results of the aggregation stay relevant longer than the original data.

For example, in the financials the data gets collected and aggregated for the current business day. After the day is closed, the day's detailed data are not interesting any more, and can be deleted in preparation for the next day. However the daily results stay interesting for a long time, and may even be archived for years.

This is not limited to the financials. A long time ago, in the times of slow and expensive Internet connections, I've done a traffic accounting system. It did the same: as the time went by, less and less detail was kept about the traffic usage. The modern accounting of click-through advertisements also works in a similar way.

An easy way to achieve this is to put a filter on the way of the aggregation results. It would compare the current idea of time with the time in the rows going by, and throw away the rows that are too old. This can be done as a label that gets the data from the aggregator and then forwards or doesn't forward it to the real destination, as has already been shown. This solves the propagation problem, but as the obsolete original data gets deleted, the aggregator will still be churning and producing the updates, only to have them thrown away at the filter. A more efficient way is to stop the churn by placing the filter right into the aggregator.
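Such a filtering label can be sketched like this (it borrows the names $uTraffic, $rtHourly, hourStamp() and $currentHour from the example that follows; $lbDest stands for a hypothetical real destination label):

```perl
# Sketch only: a pass-through label that drops the aggregation results
# that have become too old, forwarding the rest to $lbDest.
my $lbFilter = $uTraffic->makeLabel($rtHourly, "lbFilter", undef, sub {
  my ($label, $rowop) = @_;
  # forward only the rows that are still in the current hour
  $uTraffic->call($lbDest->adopt($rowop))
    if (&hourStamp($rowop->getRow()->get("time")) >= $currentHour);
});
```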

The next example demonstrates such an aggregator, in a simplified version of that traffic accounting system that I've once done. The example is actually about more than just stopping the data propagation. That stopping accounts for about three lines in it. But I also want to show a simple example of traffic accounting as such. And to show that the lack of the direct time support in Triceps does not stop you from doing any time-based processing. Because of this I'll show the whole example and not just snippets from it. But since the example is biggish, I'll paste it into the text in pieces with commentaries for each piece.

our $uTraffic = Triceps::Unit->new("uTraffic");

# one packet's header
our $rtPacket = Triceps::RowType->new(
  time => "int64", # packet's timestamp, microseconds
  local_ip => "string", # string to make easier to read
  remote_ip => "string", # string to make easier to read
  local_port => "int32",
  remote_port => "int32",
  bytes => "int32", # size of the packet
);

# an hourly summary
our $rtHourly = Triceps::RowType->new(
  time => "int64", # hour's timestamp, microseconds
  local_ip => "string", # string to make easier to read
  remote_ip => "string", # string to make easier to read
  bytes => "int64", # bytes sent in an hour
);

The router to the ISP forwards us the packet header information from all the packets that go through the outside link. The local_ip is always the address of a machine on our network, the remote_ip is outside our network, no matter in which direction the packet went. With a slow and expensive connection, we want to know two things: first, that the provider's billing at the end of the month is correct; second, who the high-traffic users are and when the traffic was used, so that we can then look at the remote addresses and decide whether that traffic was used for business purposes. This example goes up to the aggregation of the hourly summaries and then stops, since the further aggregation by days and months is straightforward to do.

If there is no traffic for a while, the router is expected to periodically communicate its changing idea of time as the same kind of records but with the non-timestamp fields as NULLs. That, by the way, is the right way to communicate the time-based information between two machines: do not rely on any local synchronization and timeouts but have the master send periodic time updates to the slave even if it has no data to send. The logic is then driven by the time reported by the master. A nice side effect is that the logic can also easily be replayed later, using these timestamps and without any concern about the real time. If there are multiple masters, the slave would have to order the data coming from them according to the timestamps, thus synchronizing them together.

The hourly data drops the port information, and sums up the traffic between two addresses in the hour. It still has the timestamp but now this timestamp is rounded to the start of the hour:

# compute an hour-rounded timestamp
sub hourStamp # (time)
{
  return $_[0]  - ($_[0] % (1000*1000*3600));
}
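As a quick sanity check of the arithmetic (a standalone snippet repeating the function): the first timestamp of the sample run further below falls 2011 seconds into its hour and rounds down to the hour boundary.

```perl
# round a microsecond timestamp down to the start of its hour
sub hourStamp # (time)
{
  return $_[0] - ($_[0] % (1000*1000*3600)); # an hour in microseconds
}

print hourStamp(1330886011000000), "\n"; # prints 1330884000000000
```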

Next, on to the aggregation. The SimpleAggregator has no provision for filtering, so the aggregation has to be done raw.

# the current hour stamp that keeps being updated
our $currentHour;

# aggregation handler: recalculate the summary for the last hour
sub computeHourly # (table, context, aggop, opcode, rh, state, args...)
{
  my ($table, $context, $aggop, $opcode, $rh, $state, @args) = @_;
  our $currentHour;

  # don't send the NULL record after the group becomes empty
  return if ($context->groupSize()==0
    || $opcode == &Triceps::OP_NOP);

  my $rhFirst = $context->begin();
  my $rFirst = $rhFirst->getRow();
  my $hourstamp = &hourStamp($rFirst->get("time"));

  return if ($hourstamp < $currentHour);

  if ($opcode == &Triceps::OP_DELETE) {
    $context->send($opcode, $$state);
    return;
  }

  my $bytes = 0;
  for (my $rhi = $rhFirst; !$rhi->isNull();
      $rhi = $context->next($rhi)) {
    $bytes += $rhi->getRow()->get("bytes");
  }

  my $res = $context->resultType()->makeRowHash(
    time => $hourstamp,
    local_ip => $rFirst->get("local_ip"),
    remote_ip => $rFirst->get("remote_ip"),
    bytes => $bytes,
  );
  ${$state} = $res;
  $context->send($opcode, $res);
}

sub initHourly #  (@args)
{
  my $refvar;
  return \$refvar;
}

The aggregation doesn't try to optimize by being additive, to keep the example simpler. The model keeps the notion of the current hour. As soon as the hour stops being current, the aggregation for it stops. The result of that aggregation will then be kept unchanged in the hourly result table, no matter what happens to the original data.

The tables are defined and connected thusly:

# the full stats for the recent time
our $ttPackets = Triceps::TableType->new($rtPacket)
  ->addSubIndex("byHour",
    Triceps::IndexType->newPerlSorted("byHour", undef, sub {
      return &hourStamp($_[0]->get("time")) <=> &hourStamp($_[1]->get("time"));
    })
    ->addSubIndex("byIP",
      Triceps::IndexType->newHashed(key => [ "local_ip", "remote_ip" ])
      ->addSubIndex("group",
        Triceps::IndexType->newFifo()
        ->setAggregator(Triceps::AggregatorType->new(
          $rtHourly, "aggrHourly", \&initHourly, \&computeHourly)
        )
      )
    )
  )
;

$ttPackets->initialize();
our $tPackets = $uTraffic->makeTable($ttPackets, "tPackets");

# the aggregated hourly stats, kept longer
our $ttHourly = Triceps::TableType->new($rtHourly)
  ->addSubIndex("byAggr",
    Triceps::SimpleOrderedIndex->new(
      time => "ASC", local_ip => "ASC", remote_ip => "ASC")
  )
;

$ttHourly->initialize();
our $tHourly = $uTraffic->makeTable($ttHourly, "tHourly");

# connect the tables
$tPackets->getAggregatorLabel("aggrHourly")->chain($tHourly->getInputLabel());

The table of incoming packets has a 3-level index: it starts with being sorted by the hour part of the timestamp, then goes by the ip addresses to complete the aggregation key, and then a FIFO for each aggregation group. Arguably, it might have been better to include the ip addresses straight into the top-level sorting index; I don't know, and it doesn't seem worth measuring. The top-level ordering by the hour is important: it will be used to delete the rows that have become old.

The table of the hourly aggregated stats uses the same kind of index, only now there is no need for a FIFO because there is only one row per key. And the timestamp is already rounded to the hour right in the rows, so a SimpleOrderedIndex can be used without writing a manual comparison function, and the ip fields have been merged into it too.

The output of the aggregator on the packets table is connected to the input of the hourly table.

# label to print the changes to the detailed stats
makePrintLabel("lbPrintPackets", $tPackets->getOutputLabel());
# label to print the changes to the hourly stats
makePrintLabel("lbPrintHourly", $tHourly->getOutputLabel());

# dump a table's contents
sub dumpTable # ($table)
{
  my $table = shift;
  for (my $rhit = $table->begin(); !$rhit->isNull(); $rhit = $rhit->next()) {
    print($rhit->getRow()->printP(), "\n");
  }
}

# how long to keep the detailed data, hours
our $keepHours = 2;

# flush the data older than $keepHours from $tPackets
sub flushOldPackets
{
  my $earliest = $currentHour - $keepHours * (1000*1000*3600);
  my $next;
  # the default iteration of $tPackets goes in the hour stamp order
  for (my $rhit = $tPackets->begin(); !$rhit->isNull(); $rhit = $next) {
    last if (&hourStamp($rhit->getRow()->get("time")) >= $earliest);
    $next = $rhit->next(); # advance before removal
    $tPackets->remove($rhit);
  }
}

The print labels generate the debugging output that shows what is going on with both tables. Next go a couple of helper functions.

The dumpTable() is a straightforward iteration through a table, printing each row. It can be used on any table; printP() takes care of any differences.

The flushing goes through the packets table and deletes the rows whose hour stamp is more than $keepHours hours before the current one. For this to work right, the rows must go in the order of the hour stamps, which the outer index byHour takes care of.

All the time-related logic expects that the time never goes backwards. This is a simplification to make the example shorter; production code cannot assume this.

while(<STDIN>) {
  chomp;
  my @data = split(/,/); # starts with a command, then string opcode
  my $type = shift @data;
  if ($type eq "new") {
    my $rowop = $tPackets->getInputLabel()->makeRowopArray(@data);
    # update the current notion of time (simplistic)
    $currentHour = &hourStamp($rowop->getRow()->get("time"));
    if (defined($rowop->getRow()->get("local_ip"))) {
      $uTraffic->call($rowop);
    }
    &flushOldPackets(); # flush the packets
    $uTraffic->drainFrame(); # just in case, for completeness
  } elsif ($type eq "dumpPackets") {
    &dumpTable($tPackets);
  } elsif ($type eq "dumpHourly") {
    &dumpTable($tHourly);
  }
}

The final part is the main loop. The input comes in the CSV form as a command followed by more data. If the command is new, then the data is the opcode and the data fields, as they would be sent by the router. The commands dumpPackets and dumpHourly are used to print the contents of the tables, to see what is going on in them.

In an honest implementation there would be a separate label that would differentiate between a reported packet and just a time update from the router. Here for simplicity this logic is placed right into the main loop. On each input record it updates the model's idea of the current timestamp; then, if there is packet data, it gets processed; and finally the rows that have become too old for the new timestamp get flushed.

Now a run of the model. Its printout is also broken up into the separately commented pieces. Of course, it's not like a real run, it just contains one or two packets per hour to show how things work.

new,OP_INSERT,1330886011000000,1.2.3.4,5.6.7.8,2000,80,100
tPackets.out OP_INSERT time="1330886011000000" local_ip="1.2.3.4"
    remote_ip="5.6.7.8" local_port="2000" remote_port="80" bytes="100"
tHourly.out OP_INSERT time="1330884000000000" local_ip="1.2.3.4"
    remote_ip="5.6.7.8" bytes="100"
new,OP_INSERT,1330886012000000,1.2.3.4,5.6.7.8,2000,80,50
tHourly.out OP_DELETE time="1330884000000000" local_ip="1.2.3.4"
    remote_ip="5.6.7.8" bytes="100"
tPackets.out OP_INSERT time="1330886012000000" local_ip="1.2.3.4"
    remote_ip="5.6.7.8" local_port="2000" remote_port="80" bytes="50"
tHourly.out OP_INSERT time="1330884000000000" local_ip="1.2.3.4"
    remote_ip="5.6.7.8" bytes="150"

The two input rows in the first hour refer to the same connection, so they go into the same group and get aggregated together in the hourly table. The rows for the current hour in the hourly table get updated immediately as more data comes in. The tHourly.out OP_DELETE comes out even before tPackets.out OP_INSERT because it's driven by the output of the aggregator on $tPackets, and the operation AO_BEFORE_MOD on the aggregator that drives the deletion is executed before $tPackets gets modified.

new,OP_INSERT,1330889811000000,1.2.3.4,5.6.7.8,2000,80,300
tPackets.out OP_INSERT time="1330889811000000" local_ip="1.2.3.4"
    remote_ip="5.6.7.8" local_port="2000" remote_port="80" bytes="300"
tHourly.out OP_INSERT time="1330887600000000" local_ip="1.2.3.4"
    remote_ip="5.6.7.8" bytes="300"

Only one packet arrives in the next hour.

new,OP_INSERT,1330894211000000,1.2.3.5,5.6.7.9,3000,80,200
tPackets.out OP_INSERT time="1330894211000000" local_ip="1.2.3.5"
    remote_ip="5.6.7.9" local_port="3000" remote_port="80" bytes="200"
tHourly.out OP_INSERT time="1330891200000000" local_ip="1.2.3.5"
    remote_ip="5.6.7.9" bytes="200"
new,OP_INSERT,1330894211000000,1.2.3.4,5.6.7.8,2000,80,500
tPackets.out OP_INSERT time="1330894211000000" local_ip="1.2.3.4"
    remote_ip="5.6.7.8" local_port="2000" remote_port="80" bytes="500"
tHourly.out OP_INSERT time="1330891200000000" local_ip="1.2.3.4"
    remote_ip="5.6.7.8" bytes="500"

And two more packets in the next hour. They are for the different connections, so they do not get summed together in the aggregation. When the hour changes again, the old data will start being deleted (because of $keepHours = 2, which ends up keeping the current hour and two before it), so let's take a snapshot of the tables' contents.

dumpPackets
time="1330886011000000" local_ip="1.2.3.4" remote_ip="5.6.7.8"
    local_port="2000" remote_port="80" bytes="100"
time="1330886012000000" local_ip="1.2.3.4" remote_ip="5.6.7.8"
    local_port="2000" remote_port="80" bytes="50"
time="1330889811000000" local_ip="1.2.3.4" remote_ip="5.6.7.8"
    local_port="2000" remote_port="80" bytes="300"
time="1330894211000000" local_ip="1.2.3.4" remote_ip="5.6.7.8"
    local_port="2000" remote_port="80" bytes="500"
time="1330894211000000" local_ip="1.2.3.5" remote_ip="5.6.7.9"
    local_port="3000" remote_port="80" bytes="200"
dumpHourly
time="1330884000000000" local_ip="1.2.3.4" remote_ip="5.6.7.8"
    bytes="150"
time="1330887600000000" local_ip="1.2.3.4" remote_ip="5.6.7.8"
    bytes="300"
time="1330891200000000" local_ip="1.2.3.4" remote_ip="5.6.7.8"
    bytes="500"
time="1330891200000000" local_ip="1.2.3.5" remote_ip="5.6.7.9"
    bytes="200"

The packets table shows all the 5 packets received so far, and the hourly aggregation results for all 3 hours (with two separate aggregation groups in the same last hour, for different ip pairs).

new,OP_INSERT,1330896811000000,1.2.3.5,5.6.7.9,3000,80,10
tPackets.out OP_INSERT time="1330896811000000" local_ip="1.2.3.5"
    remote_ip="5.6.7.9" local_port="3000" remote_port="80" bytes="10"
tHourly.out OP_INSERT time="1330894800000000" local_ip="1.2.3.5"
    remote_ip="5.6.7.9" bytes="10"
tPackets.out OP_DELETE time="1330886011000000" local_ip="1.2.3.4"
    remote_ip="5.6.7.8" local_port="2000" remote_port="80" bytes="100"
tPackets.out OP_DELETE time="1330886012000000" local_ip="1.2.3.4"
    remote_ip="5.6.7.8" local_port="2000" remote_port="80" bytes="50"

When the next hour's packet arrives, it gets processed as usual, but then the removal logic finds the packet rows that have become too old to keep. It kicks in and deletes them. But notice that the deletions affect only the packets table, the aggregator ignores this activity as too old and does not propagate it to the hourly table.

new,OP_INSERT,1330900411000000,1.2.3.4,5.6.7.8,2000,80,40
tPackets.out OP_INSERT time="1330900411000000" local_ip="1.2.3.4"
    remote_ip="5.6.7.8" local_port="2000" remote_port="80" bytes="40"
tHourly.out OP_INSERT time="1330898400000000" local_ip="1.2.3.4"
    remote_ip="5.6.7.8" bytes="40"
tPackets.out OP_DELETE time="1330889811000000" local_ip="1.2.3.4"
    remote_ip="5.6.7.8" local_port="2000" remote_port="80" bytes="300"

One more hour's packet flushes out the data for another hour.

new,OP_INSERT,1330904011000000
tPackets.out OP_DELETE time="1330894211000000" local_ip="1.2.3.4"
    remote_ip="5.6.7.8" local_port="2000" remote_port="80" bytes="500"
tPackets.out OP_DELETE time="1330894211000000" local_ip="1.2.3.5"
    remote_ip="5.6.7.9" local_port="3000" remote_port="80" bytes="200"

And just a time update for another hour, when no packets have been received. The removal logic still kicks in and works the same way, deleting raw data for one more hour. After all this activity let's dump the tables again:

dumpPackets
time="1330896811000000" local_ip="1.2.3.5" remote_ip="5.6.7.9"
    local_port="3000" remote_port="80" bytes="10"
time="1330900411000000" local_ip="1.2.3.4" remote_ip="5.6.7.8"
    local_port="2000" remote_port="80" bytes="40"
dumpHourly
time="1330884000000000" local_ip="1.2.3.4" remote_ip="5.6.7.8"
    bytes="150"
time="1330887600000000" local_ip="1.2.3.4" remote_ip="5.6.7.8"
    bytes="300"
time="1330891200000000" local_ip="1.2.3.4" remote_ip="5.6.7.8"
    bytes="500"
time="1330891200000000" local_ip="1.2.3.5" remote_ip="5.6.7.9"
    bytes="200"
time="1330894800000000" local_ip="1.2.3.5" remote_ip="5.6.7.9"
    bytes="10"
time="1330898400000000" local_ip="1.2.3.4" remote_ip="5.6.7.8"
    bytes="40"

The packets table only has the data for the last 3 hours (there are no rows for the last hour because none have arrived). But the hourly table contains all the history; no rows have been deleted from it.

13.2. Periodic updates

In the previous example, if we keep aggregating the data from hours to days and from days to months, the arrival of each new packet will update the whole chain. Sometimes that's what we want, sometimes it isn't. The daily stats might be fed into some complicated computation, with nobody looking at the results until the next day. In this situation each packet would trigger these complicated computations for no good reason, since nobody cares about them until the day is closed.

These unnecessary computations can be prevented by disconnecting the daily data from the hourly data, and performing the manual aggregation only when the day changes. Then these complicated computations would happen only once a day, not many times per second.

Here is how the last example gets amended to produce the once-a-day daily summaries of all the traffic (as before, in multiple snippets, this time showing only the added or changed code):

# an hourly summary, now with the day extracted
our $rtHourly = Triceps::RowType->new(
  time => "int64", # hour's timestamp, microseconds
  day => "string", # in YYYYMMDD
  local_ip => "string", # string to make it easier to read
  remote_ip => "string", # string to make it easier to read
  bytes => "int64", # bytes sent in an hour
);

# a daily summary: just all traffic for that day
our $rtDaily = Triceps::RowType->new(
  day => "string", # in YYYYMMDD
  bytes => "int64", # bytes sent in a day
);

The hourly rows get an extra field, for convenient aggregation by day. And the daily rows are introduced. The notion of the day is calculated as:

# compute the date of a timestamp, a string YYYYMMDD
sub dateStamp # (time)
{
  my @ts = gmtime($_[0]/1000000); # microseconds to seconds
  return sprintf("%04d%02d%02d", $ts[5]+1900, $ts[4]+1, $ts[3]);
}

# the current hour stamp that keeps being updated
our $currentHour = undef;
# the current day stamp that keeps being updated
our $currentDay = undef;

The calculation is done in GMT, so that the code produces the same result all around the world. If you're doing this kind of project for real, you may want to use the local time zone instead (but be careful with the changing daylight saving time).

And the model keeps a global notion of the current day in addition to the current hour.
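For a quick sanity check, dateStamp() can be exercised on the timestamps that will appear in the sample input below (the function is repeated here to keep the snippet self-contained; the expected values follow from the GMT conversion):

```perl
use strict;
use warnings;

# same logic as dateStamp() above, repeated to be self-contained
sub dateStamp # (time)
{
  my @ts = gmtime($_[0]/1000000); # microseconds to seconds
  return sprintf("%04d%02d%02d", $ts[5]+1900, $ts[4]+1, $ts[3]);
}

# timestamps taken from the sample input of this example
print dateStamp(1330886011000000), "\n"; # prints 20120304
print dateStamp(1330972411000000), "\n"; # exactly a day later: 20120305
```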

# aggregation handler: recalculate the summary for the last hour
sub computeHourlywDay # (table, context, aggop, opcode, rh, state, args...)
{
...
  my $res = $context->resultType()->makeRowHash(
    time => $hourstamp,
    day => &dateStamp($hourstamp),
    local_ip => $rFirst->get("local_ip"),
    remote_ip => $rFirst->get("remote_ip"),
    bytes => $bytes,
  );
  ${$state} = $res;
  $context->send($opcode, $res);
}

The packets-to-hour aggregation function now populates this extra field; the rest of it stays the same.

# the aggregated hourly stats, kept longer
our $ttHourly = Triceps::TableType->new($rtHourly)
  ->addSubIndex("byAggr",
    Triceps::SimpleOrderedIndex->new(
      time => "ASC", local_ip => "ASC", remote_ip => "ASC")
  )
  ->addSubIndex("byDay",
    Triceps::IndexType->newHashed(key => [ "day" ])
    ->addSubIndex("group",
      Triceps::IndexType->newFifo()
    )
  )
;

$ttHourly->initialize();
our $tHourly = $uTraffic->makeTable($ttHourly, "tHourly");

# remember the daily secondary index type
our $idxHourlyByDay = $ttHourly->findSubIndex("byDay");
our $idxHourlyByDayGroup = $idxHourlyByDay->findSubIndex("group");

The hourly table type grows an extra secondary index for the manual aggregation into the daily data.

# the aggregated daily stats, kept even longer
our $ttDaily = Triceps::TableType->new($rtDaily)
  ->addSubIndex("byDay",
    Triceps::IndexType->newHashed(key => [ "day" ])
  )
;

$ttDaily->initialize();
our $tDaily = $uTraffic->makeTable($ttDaily, "tDaily");

# label to print the changes to the daily stats
makePrintLabel("lbPrintDaily", $tDaily->getOutputLabel());

And a table for the daily data is created but not connected to any other tables.

Instead it gets updated by the function that performs the manual aggregation of the hourly data:

# the manual aggregation of a day's data
sub computeDay # ($dateStamp)
{
  our $uTraffic;
  my $bytes = 0;

  my $rhFirst = $tHourly->findIdxBy($idxHourlyByDay, day => $_[0]);
  my $rhEnd = $rhFirst->nextGroupIdx($idxHourlyByDayGroup);
  for (my $rhi = $rhFirst;
      !$rhi->same($rhEnd); $rhi = $rhi->nextIdx($idxHourlyByDay)) {
    $bytes += $rhi->getRow()->get("bytes");
  }
  $uTraffic->makeHashCall($tDaily->getInputLabel(), "OP_INSERT",
    day => $_[0],
    bytes => $bytes,
  );
}

This logic doesn't check whether any data for that day existed. If none did, it would just produce a row with the traffic of 0 bytes anyway. This is different from the normal aggregation, but here it may actually be desirable: it shows for sure that yes, the aggregation for that day really did happen.
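On plain data the summation in computeDay() amounts to a simple filtered sum. Here is a sketch over an ordinary array of hashes standing in for the hourly rows (not the Triceps index iteration), including the zero-byte case just described:

```perl
use strict;
use warnings;

# hourly rows as plain hashes, standing in for the rows found via byDay
my @hourly = (
  { day => "20120304", bytes => 150 },
  { day => "20120304", bytes => 300 },
  { day => "20120305", bytes => 200 },
);

sub computeDayBytes # ($day): sum of the hourly bytes, 0 if no rows match
{
  my $day = shift;
  my $bytes = 0;
  $bytes += $_->{bytes} for grep { $_->{day} eq $day } @hourly;
  return $bytes;
}

print computeDayBytes("20120304"), "\n"; # prints 450
print computeDayBytes("20120306"), "\n"; # no data for that day: prints 0
```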

while(<STDIN>) {
  chomp;
  my @data = split(/,/); # starts with a command, then string opcode
  my $type = shift @data;
  if ($type eq "new") {
    my $rowop = $tPackets->getInputLabel()->makeRowopArray(@data);
    # update the current notion of time (simplistic)
    $currentHour = &hourStamp($rowop->getRow()->get("time"));
    my $lastDay = $currentDay;
    $currentDay = &dateStamp($currentHour);
    if (defined($rowop->getRow()->get("local_ip"))) {
      $uTraffic->call($rowop);
    }
    &flushOldPackets(); # flush the packets
    if (defined $lastDay && $lastDay ne $currentDay) {
      &computeDay($lastDay); # manual aggregation
    }
    $uTraffic->drainFrame(); # just in case, for completeness
  } elsif ($type eq "dumpPackets") {
    &dumpTable($tPackets);
  } elsif ($type eq "dumpHourly") {
    &dumpTable($tHourly);
  } elsif ($type eq "dumpDaily") {
    &dumpTable($tDaily);
  }
}

The main loop gets extended with the day-keeping logic and with the extra command to dump the daily data. It now maintains the current day, and after the packet computation is done, checks whether the day has changed. If it did, it calls the manual aggregation of the last day.
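The day-change detection itself is plain Perl and can be modeled separately from Triceps. This sketch feeds the timestamps from the example run and collects the days that would get manually aggregated (dateStamp() is repeated to keep it self-contained):

```perl
use strict;
use warnings;

sub dateStamp # (time in microseconds)
{
  my @ts = gmtime($_[0]/1000000);
  return sprintf("%04d%02d%02d", $ts[5]+1900, $ts[4]+1, $ts[3]);
}

my $currentDay;
my @closedDays; # the days that would be passed to computeDay()

for my $t (1330886011000000, 1330889811000000,  # two packets on 20120304
           1330972411000000,                    # first packet of 20120305
           1331058811000000) {                  # time update on 20120306
  my $lastDay = $currentDay;
  $currentDay = dateStamp($t);
  if (defined $lastDay && $lastDay ne $currentDay) {
    push @closedDays, $lastDay; # here computeDay($lastDay) would be called
  }
}
print join(",", @closedDays), "\n"; # prints 20120304,20120305
```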

And here is an example of its work:

new,OP_INSERT,1330886011000000,1.2.3.4,5.6.7.8,2000,80,100
tPackets.out OP_INSERT time="1330886011000000" local_ip="1.2.3.4"
    remote_ip="5.6.7.8" local_port="2000" remote_port="80" bytes="100"
tHourly.out OP_INSERT time="1330884000000000" day="20120304"
    local_ip="1.2.3.4" remote_ip="5.6.7.8" bytes="100"
new,OP_INSERT,1330886012000000,1.2.3.4,5.6.7.8,2000,80,50
tHourly.out OP_DELETE time="1330884000000000" day="20120304"
    local_ip="1.2.3.4" remote_ip="5.6.7.8" bytes="100"
tPackets.out OP_INSERT time="1330886012000000" local_ip="1.2.3.4"
    remote_ip="5.6.7.8" local_port="2000" remote_port="80" bytes="50"
tHourly.out OP_INSERT time="1330884000000000" day="20120304"
    local_ip="1.2.3.4" remote_ip="5.6.7.8" bytes="150"
new,OP_INSERT,1330889811000000,1.2.3.4,5.6.7.8,2000,80,300
tPackets.out OP_INSERT time="1330889811000000" local_ip="1.2.3.4"
    remote_ip="5.6.7.8" local_port="2000" remote_port="80" bytes="300"
tHourly.out OP_INSERT time="1330887600000000" day="20120304"
    local_ip="1.2.3.4" remote_ip="5.6.7.8" bytes="300"

So far all three packets are for the same day, and nothing new has happened.

new,OP_INSERT,1330972411000000,1.2.3.5,5.6.7.9,3000,80,200
tPackets.out OP_INSERT time="1330972411000000" local_ip="1.2.3.5"
    remote_ip="5.6.7.9" local_port="3000" remote_port="80" bytes="200"
tHourly.out OP_INSERT time="1330970400000000" day="20120305"
    local_ip="1.2.3.5" remote_ip="5.6.7.9" bytes="200"
tPackets.out OP_DELETE time="1330886011000000" local_ip="1.2.3.4"
    remote_ip="5.6.7.8" local_port="2000" remote_port="80" bytes="100"
tPackets.out OP_DELETE time="1330886012000000" local_ip="1.2.3.4"
    remote_ip="5.6.7.8" local_port="2000" remote_port="80" bytes="50"
tPackets.out OP_DELETE time="1330889811000000" local_ip="1.2.3.4"
    remote_ip="5.6.7.8" local_port="2000" remote_port="80" bytes="300"
tDaily.out OP_INSERT day="20120304" bytes="450"

When a packet for the next day arrives, it has three effects:

  1. inserts the packet data as usual,
  2. finds that the previous packet data is obsolete and flushes it (without upsetting the hourly summaries), and
  3. finds that the day has changed and performs the manual aggregation of last day's hourly data into daily.

new,OP_INSERT,1331058811000000
tPackets.out OP_DELETE time="1330972411000000" local_ip="1.2.3.5"
    remote_ip="5.6.7.9" local_port="3000" remote_port="80" bytes="200"
tDaily.out OP_INSERT day="20120305" bytes="200"

A time update for the yet next day flushes out the previous day's detailed packets and again builds the daily summary of that day.

new,OP_INSERT,1331145211000000
tDaily.out OP_INSERT day="20120306" bytes="0"

Yet another day's time roll now has no old data to delete (since none arrived in the previous day) but still produces the daily summary of 0 bytes.

dumpDaily
day="20120305" bytes="200"
day="20120304" bytes="450"
day="20120306" bytes="0"

This shows the eventual contents of the daily summaries. The order of the rows is fairly random, because of the hashed index. Note that the hourly summaries weren't flushed either; they are all still there too. If you want them eventually deleted after some time, you would need to provide more of the manual logic for that.

13.3. The general issues of time processing

After a couple of examples, it's time to do some generalizations. What these examples did manually, with the data expiration by time, the more mature CEP systems do internally, using the statements for the time-based work.

Which isn't always better though. The typical issues are with:

  • fast replay of data,
  • order of execution,
  • synchronization between modules.

The problem with the fast replay is that those time-based statements use the real time and not the timestamps from the incoming rows. Sure, in Coral8 you can use the incoming row timestamps, but they are still expected to have the time generally synchronized with the local clock (they are an attempt to solve the inter-module synchronization problem, not fast replay). You can't run them fast. And considering the Coral8 fashion of dropping the data when the input buffer overflows, you don't want to feed the data into it too fast to start with. In the Aleri system you can accelerate the time, but only by a fixed factor. You can run the logical time there, say, 10 times faster and feed the data 10 times faster, but there are no timestamps in the input rows, and you simply can't feed the data precisely enough to reproduce the exact timing. And 10 times faster is not the same thing as just as fast as possible. I don't know for sure what StreamBase does; it seems to have the time acceleration by a fixed rate too. Esper apparently allows the full control over timing, but I don't know much about it.

Your typical problem with fast replay in Coral8/CCL is this: you create a time limited window

create window ... keep 3 hours;

and then feed the data for a couple of days in say 20 minutes. Provided that you don't feed it too fast and none of it gets dropped, all of the data ends up in the window and none of it expires, since the window goes by the physical time, and the physical time was only 20 minutes. The first issue is that you may not have enough memory to store the data for two days, and everything would run out of memory and crash. The second issue is that if you want to do some time-based aggregation relying on the window expiration, you're out of luck.

Why would you want to feed the data so fast in the first place? Two reasons:

  1. Testing. When you test your time-based logic, you don't want your unit test to take 3 hours, let alone multiple days. You also want your unit tests to be fully repeatable, without any fuzz.
  2. State restoration after a planned shutdown or crash. No matter what everyone says, the built-in persistence features work right only for a small subset of the simple models. Getting the persistence to work for the more complex models is difficult, and for all I know nobody has bothered to get it working right. The best approach in reality is to preserve a subset of the state, and get the rest of it by replaying the recent input data after restart. The faster you re-feed the data, the faster your model comes back online. (Incidentally, that's what Aleri does with the persistent source streams, only losing all the timing information of the rows and having the same above-mentioned issue as CCL.)

Next issue, the execution order. The last example was relying on $currentHour being updated before flushOldPackets() runs. Otherwise the deletions would propagate through the aggregator where they should not. In a system like Aleri, with each element running in its own thread, there is no way to ensure any particular timing between the threads. In a system with single-threaded logic, like Coral8/Sybase or StreamBase, there is a way, but getting the order right is tricky: it depends on what the compiler and scheduler decide, and may take a few attempts. Well, technically, Aleri can control the time too: you can run in the artificial time, setting and stopping it. So you can stop the time, set it to the record's timestamp, feed the record, wait for the processing to complete, advance the time, wait for any time-based processing to complete, and so on. I'm not sure if it made it into Sybase R5, but it definitely worked in Aleri. However there was no tool that did it for you easily, and all these synchronous calls present a pretty high overhead.

The procedural execution makes things much more straightforward.

Now, the synchronization between modules. When the data is passed between multiple threads or processes, there is always some jitter in the way the data goes through the inter-process communications, and even more so through the network. Relying on the timing of the data after it arrives is usually a bad idea if you want any repeatability and precision. Instead the data has to be timestamped by the sender, and then these timestamps used by the receiver instead of the real time.

And Coral8 allows you to do so. But what if there is no data coming? What do you do with the time-based processing? The Coral8 approach is to allow some delay and then proceed at the rate of the local clock. Note that the logical time is not exactly the same as the local clock; it generally gets behind the local clock by no more than the delay amount, or might go faster if the sender's clock goes faster. The question is, what delay amount do you choose? If you make it too short, the small hiccups in the data flow throw the timing off: the local clock runs ahead, and then the incoming data gets thrown away because it's too old. If you make it too long, you potentially add a large amount of latency. As it turns out, no reasonable amount of delay works well with Coral8. To get things working at least sort of reliably, you need horrendous delays, on the order of 10 seconds or more. Even then the sender may get hit by a long-running request, and the connection would go haywire anyway.

The only reliable solution is to drive the time completely by the sender. Even if there is no data to send, it must still send the periodic time updates, and the receiver must use the incoming timestamps for its time-based processing. Sending one or even ten time-update packets per second is not a whole lot of overhead, and sure works much better than the 10-second delays. And along the way it gives the perfect repeatability and fast replay for the unit testing. So unless your CEP system can be controlled in this way, getting any decent distributed timing control requires doing it manually. The reality is that Aleri can't, Coral8 can't, the Sybase R4/R5 descended from them can't, and I could not find anything related to the time control in the StreamBase documentation, so my guess is that it can't either.

And if you have to control the time-based processing manually, doing it in the procedural way is at least easier.

An interesting side subject is the relation of the logical time to the real time. If the input data arrives faster than the CEP model can process it, the logical time will get behind the real time. Or if the data is fed at an artificially accelerated rate, the logical time will get ahead of the real time. There could even be a combination thereof: making the "real" time also artificial (driven by the sender) and artificially making the data get behind it for the testing purposes. The getting-behind can be detected and used to change the algorithm. For example, if we aggregate the traffic data in multiple stages, to the hour, to the day and to the month, the whole chain does not have to be updated on every packet. Just update the first level on every packet, and then propagate further when the traffic burst subsides and gives the model a breather.
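Such lag detection can be sketched in a few lines: compare the row timestamp to the (possibly also artificial) wall-clock time, and defer the propagation of the higher aggregation levels when the model is too far behind. The 2-second threshold here is an arbitrary assumption:

```perl
use strict;
use warnings;

my $LAG_LIMIT = 2_000_000; # 2 seconds in microseconds, an arbitrary choice

# decide whether to propagate past the first aggregation stage right now
sub shouldPropagate # ($rowTime, $realTime), both in microseconds
{
  my ($rowTime, $realTime) = @_;
  return ($realTime - $rowTime) <= $LAG_LIMIT;
}

# caught up: only half a second behind
print shouldPropagate(1330886011000000, 1330886011500000) ? "now" : "defer", "\n";
# in a burst: 10 seconds behind, let the burst subside first
print shouldPropagate(1330886011000000, 1330886021000000) ? "now" : "defer", "\n";
```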

So far the major CEP systems don't seem to have a whole lot of direct support for it. There are ways to reduce the load by reducing the update frequency to a fixed period (like the OUTPUT EVERY statement in CCL, or periodic subscription in Aleri), but not much of the load-based kind. If the system provides ways to get both the real time and logical time of the row, the logic can be implemented manually. But the optimizations of the time-reading, like in Coral8, might make it unstable.

The way to do it in Triceps is by handling it in the Perl (or C++) code of the main event loop. When it has no data to read, it can create an idle row that would push through the results as a more efficient batch.

Chapter 14. The other templates and solutions

14.1. The dreaded diamond

The diamond is a particular topology of the data flow, when the computation separates based on some condition and then merges again. Like in Figure 14.1. It is also known as fork-join (the join here has nothing to do with the SQL join, it just means that the arrows merge to the same block).

The diamond topology.

Figure 14.1. The diamond topology.


This topology is a known source of two problems. The first problem is about the execution order. To make things easier to see, let's consider a simple example. Suppose the rows come into the block A with the schema:

key => string,
value => int32,

And they come out of the blocks B and C into D with the schema:

key => string,
value => int32,
negative => int32,

With the logic in the blocks being:

A:
  if value < 0 then B else C
B:
  negative = 1
C:
  negative = 0

Yes, this is a very dumb example that can usually be handled by a conditional expression in a single block. But that's to keep it small and simple. A real example would often include some SQL joins, with different joins done on condition.

Suppose A then gets the input, in CSV form:

INSERT,key1,10
DELETE,key1,10
INSERT,key1,20
DELETE,key1,20
INSERT,key1,-1

What arrives at D should be

INSERT,key1,10,0
DELETE,key1,10,0
INSERT,key1,20,0
DELETE,key1,20,0
INSERT,key1,-1,1

And with the first four rows this is not a problem: they follow the same path and are queued sequentially, so the order is preserved. But the last row follows a different path. And the last two rows logically represent a single update and would likely arrive closely together. The last row might happen to overtake the one before it, and D would see the incorrect result:

INSERT,key1,10,0
DELETE,key1,10,0
INSERT,key1,20,0
INSERT,key1,-1,1
DELETE,key1,20,0

If all these input rows arrive closely one after another, the last row might overtake even more of them and produce an even more disturbing result like

INSERT,key1,-1,1
INSERT,key1,10,0
DELETE,key1,10,0
INSERT,key1,20,0
DELETE,key1,20,0

Such misorderings may also happen between the rows with different keys. Those are usually less of a problem: if D keeps a table, the rows with different keys may usually be updated in any order without losing the meaning. But if D keeps a FIFO index (say, for a window based on a row count), and the two keys fall into the same FIFO bucket, their misordering would also affect the logic.

The reasons for this can be subdivided further into two classes:

  • asynchronous execution,
  • incorrect scheduling in the synchronous execution.

If each block executes asynchronously in its own thread, there is no way to predict in which order they will actually execute. If some data is sent to B and C at about the same time, it becomes a race between them. One of the paths might also be longer than the other, making one alternative always win the race. This kind of problem is fairly common for the Aleri system, which is highly multithreaded. But it is a problem for absolutely any CEP engine if you split the execution across multiple threads or processes.

But the single-threaded execution is not necessarily a cure either. Then the order of execution is up to the scheduler. And if the scheduler gets all these rows close together, and then decides to process all the input of A, then all the input of B, of C and of D, then D will receive the rows in the order:

INSERT,key1,-1,1
INSERT,key1,10,0
DELETE,key1,10,0
INSERT,key1,20,0
DELETE,key1,20,0

Which is typical for, say, Coral8 if all the input rows arrive in a single bundle (see also Section 7.6: “No bundling”).

The multithreaded case in Triceps will be discussed separately in Section 16.10: “The threaded dreaded diamond and data reordering”.

When the single-threaded scheduling is concerned, Triceps provides two answers.

First, the conditional logic can often be expressed procedurally:

if ($a->get("value") < 0) {
  D($rtD->makeRowHash($a->toHash(), negative => 1));
} else {
  D($rtD->makeRowHash($a->toHash(), negative => 0));
}

The procedural if-else logic can easily handle not only the simple expressions but things like look-ups and modifications in the tables.

Second, if the logic is broken into the separate labels, the label call semantics provides the same ordering as well:

$lbA = $unit->makeLabel($rtA, "A", undef, sub {
  my $rop = $_[1];
  my $op = $rop->getOpcode(); my $a = $rop->getRow();
  if ($a->get("value") < 0) {
    $unit->call($lbB->makeRowop($op, $a));
  } else {
    $unit->call($lbC->makeRowop($op, $a));
  }
});

$lbB = $unit->makeLabel($rtA, "B", undef, sub {
  my $rop = $_[1];
  my $op = $rop->getOpcode(); my $a = $rop->getRow();
  $unit->makeHashCall($lbD, $op, $a->toHash(), negative => 1);
});

$lbC = $unit->makeLabel($rtA, "C", undef, sub {
  my $rop = $_[1];
  my $op = $rop->getOpcode(); my $a = $rop->getRow();
  $unit->makeHashCall($lbD, $op, $a->toHash(), negative => 0);
});

When the label A calls the label B or C, which calls the label D, A does not get to see its next input row until the whole chain of calls to D and beyond completes. B and C may be replaced with the label chains of arbitrary complexity, including loops, without disturbing the logic.
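The reason why the call semantics preserves the order can be modeled in plain Perl, with ordinary subroutines standing in for the labels (this is a model of the execution order, not the Triceps API): each call runs the whole chain to D before A sees its next row, so D receives the rows exactly in the input order.

```perl
use strict;
use warnings;

my @arrivedAtD; # what D receives, in order

sub D { push @arrivedAtD, $_[0]; }
sub B { D("$_[0],1"); } # negative = 1
sub C { D("$_[0],0"); } # negative = 0
sub A {
  my ($op, $key, $value) = split(/,/, $_[0]);
  $value < 0 ? B($_[0]) : C($_[0]);
}

# the same input as in the text above
A($_) for ("INSERT,key1,10", "DELETE,key1,10",
           "INSERT,key1,20", "DELETE,key1,20", "INSERT,key1,-1");
print "$_\n" for @arrivedAtD; # same order as the input, negative flag added
```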

The second problem with the diamond topology happens when the blocks B and C keep the state, and the input data gets updated by simply re-sending a record with the same key. This kind of update is typical for the systems that do not have the concept of opcodes.

Consider a CCL example (approximate, since I can't test it) that gets the reports about borrowing and loaning securities, using the sign of the quantity to differentiate between borrows (-) and loans (+). It then sums up the borrows and loans separately:

create schema s_A (
  id integer,
  symbol string,
  quantity long
);
create input stream i_A schema s_A;

create schema s_D (
  symbol string,
  borrowed boolean, // flag: loaned or borrowed
  quantity long
);
// aggregated data
create public window w_D schema s_D
keep last per symbol, borrowed;

// collection of borrows
create public window w_B schema s_A keep last per id;
// collection of loans
create public window w_C schema s_A keep last per id;

insert when quantity < 0
  then w_B
  else w_C
select * from i_A;

// borrows aggregation
insert into w_D
select
  symbol,
  true,
  sum(quantity)
group by symbol
from w_B;

// loans aggregation
insert into w_D
select
  symbol,
  false,
  sum(quantity)
group by symbol
from w_C;

It works OK until a row with the same id gets updated to a different sign of quantity:

1,AAA,100
....
1,AAA,-100

If the quantity kept the same sign, the new row would simply replace the old one in w_B or w_C, and the aggregation result would be right again. But when the sign changes, the new row goes into a different direction than the previous one. Now it ends up with both w_B and w_C having rows with the same id: one old and one new!
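The stale-row effect is easy to reproduce in plain Perl, with two hashes standing in for the windows w_B and w_C and their “keep last per id” policy:

```perl
use strict;
use warnings;

my (%w_B, %w_C); # borrows and loans, keyed by id ("keep last per id")

sub insertRow # ($id, $symbol, $quantity): the insert-when routing
{
  my ($id, $symbol, $quantity) = @_;
  if ($quantity < 0) { $w_B{$id} = [ $symbol, $quantity ]; }
  else               { $w_C{$id} = [ $symbol, $quantity ]; }
}

insertRow(1, "AAA", 100);  # goes into w_C
insertRow(1, "AAA", -100); # sign changed: goes into w_B instead
# now BOTH windows hold a row with id 1, and the one in w_C is stale
printf "w_B: %d row(s), w_C: %d row(s)\n",
  scalar(keys %w_B), scalar(keys %w_C);
```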

In this case the problem is really at the fork part of the diamond; the merging part is just along for the ride, carrying the incorrect results.

This problem does not happen in the systems that have both inserts and deletes. Then the data sequence becomes

INSERT,1,AAA,100
....
DELETE,1,AAA,100
INSERT,1,AAA,-100

The DELETE goes along the same branch as the first insert and undoes its effect, then the second INSERT goes into the other branch.

Since Triceps has both INSERT and DELETE opcodes, it's immune to this problem, as long as the input data has the correct DELETEs in it.

If you wonder, the CCL example can be fixed too but in a more round-about way, by adding a couple of statements before the insert-when statement:

on i_A
delete from w_B
  where i_A.id = w_B.id;

on i_A
delete from w_C
  where i_A.id = w_C.id;

This generates the matching DELETEs. Of course, if you want, you can use this way with Triceps too.

14.2. Collapsed updates

First, a note: the collapse described here has nothing to do with the collapsing of the aggregation groups. It's just the same word reused for a different purpose.

Sometimes the exact sequence of how a row at a particular key was updated does not matter, the only interesting part is the end result. Like the OUTPUT EVERY statement in CCL or the pulsed subscription in Aleri. It doesn't have to be time-driven either: if the data comes in as batches, it makes sense to collapse the modifications from the whole batch into one, and send it at the end of the batch.

To do this in Triceps, I've made a template. Here is an example of its use with interspersed commentary:

our $rtData = Triceps::RowType->new(
  # mostly copied from the traffic aggregation example
  local_ip => "string",
  remote_ip => "string",
  bytes => "int64",
);

The meaning of the rows is not particularly important for this example. It just uses a pair of the IP addresses as the collapse key. The collapse absolutely needs a primary key, since it has to track and collapse multiple updates to the same row.

my $unit = Triceps::Unit->new("unit");

my $collapse = Triceps::Collapse->new(
  unit => $unit,
  name => "collapse",
  data => [
    name => "idata",
    rowType => $rtData,
    key => [ "local_ip", "remote_ip" ],
  ],
);

Most of the options are self-explanatory. The dataset is defined with nested options to make the API extensible, to allow multiple datasets to be defined in the future. But at the moment only one is allowed. A dataset collapses the data at one label: an input label and an output label get defined for it, just as for a table. The data arrives at the input label, gets collapsed by the primary key, and then stays in the Collapse until the flush. When the Collapse gets flushed, the data is sent out of its output label. After the flush, the Collapse has no data in it, and starts collecting the updates again from scratch. The labels get named by connecting the names of the Collapse element, of the dataset, and in or out. For this Collapse, the label names will be collapse.idata.in and collapse.idata.out.

Note that the dataset options are specified in a referenced array, not a hash! If you try to use a hash, it will fail. When specifying the dataset options, put the name first. It's used in the error messages about any issues in the dataset, and the code really expects the name to go first.

Like with the other shown templates, if something goes wrong, Collapse will confess.

my $lbPrint = makePrintLabel("print", $collapse->getOutputLabel("idata"));

The print label gets connected to the Collapse's output label. The method to get the Collapse's output label is very much like the table's, only it takes the dataset name as an argument.

sub mainloop($$$) # ($unit, $datalabel, $collapse)
{
  my $unit = shift;
  my $datalabel = shift;
  my $collapse = shift;
  while(<STDIN>) {
    chomp;
    my @data = split(/,/); # starts with a command, then string opcode
    my $type = shift @data;
    if ($type eq "data") {
      my $rowop = $datalabel->makeRowopArray(@data);
      $unit->call($rowop);
      $unit->drainFrame(); # just in case, for completeness
    } elsif ($type eq "flush") {
      $collapse->flush();
    }
  }
}

&mainloop($unit, $collapse->getInputLabel($collapse->getDatasets()), $collapse);

There will be a second example, so I've placed the main loop into a function. It works in the same way as in the examples before: extracts the data from the CSV format and sends it to a label. The first column contains the command: data sends the data, and flush performs the flush from the Collapse. The flush marks the end of the batch. Here is an example of a run, with the input lines shown as usual in bold:

data,OP_INSERT,1.2.3.4,5.6.7.8,100
data,OP_INSERT,1.2.3.4,6.7.8.9,1000
data,OP_DELETE,1.2.3.4,6.7.8.9,1000
flush
collapse.idata.out OP_INSERT local_ip="1.2.3.4" remote_ip="5.6.7.8"
    bytes="100"

The row for (1.2.3.4, 5.6.7.8) gets plainly inserted, and goes through on the flush. The row for (1.2.3.4, 6.7.8.9) gets first inserted and then deleted, so by the flush time it becomes a no-operation.

data,OP_DELETE,1.2.3.4,5.6.7.8,100
data,OP_INSERT,1.2.3.4,5.6.7.8,200
data,OP_INSERT,1.2.3.4,6.7.8.9,2000
flush
collapse.idata.out OP_DELETE local_ip="1.2.3.4" remote_ip="5.6.7.8"
    bytes="100"
collapse.idata.out OP_INSERT local_ip="1.2.3.4" remote_ip="5.6.7.8"
    bytes="200"
collapse.idata.out OP_INSERT local_ip="1.2.3.4" remote_ip="6.7.8.9"
    bytes="2000"

The original row for (1.2.3.4, 5.6.7.8) gets modified, and the modification goes through. The new row for (1.2.3.4, 6.7.8.9) gets inserted now, and also goes through.

data,OP_DELETE,1.2.3.4,6.7.8.9,2000
data,OP_INSERT,1.2.3.4,6.7.8.9,3000
data,OP_DELETE,1.2.3.4,6.7.8.9,3000
data,OP_INSERT,1.2.3.4,6.7.8.9,4000
data,OP_DELETE,1.2.3.4,6.7.8.9,4000
flush
collapse.idata.out OP_DELETE local_ip="1.2.3.4" remote_ip="6.7.8.9"
    bytes="2000"

The row for (1.2.3.4, 6.7.8.9) now gets modified twice, and after that deleted. After collapse it becomes the deletion of the original row, the one that was inserted before the previous flush.

The Collapse also allows specifying the row type and the input connection for a dataset in a different way:

my $lbInput = $unit->makeDummyLabel($rtData, "lbInput");

my $collapse = Triceps::Collapse->new(
  name => "collapse",
  data => [
    name => "idata",
    fromLabel => $lbInput,
    key => [ "local_ip", "remote_ip" ],
  ],
);

&mainloop($unit, $lbInput, $collapse);

Normally $lbInput would not be a dummy label but the output label of some element. The dataset option fromLabel tells the Collapse that the dataset input will be coming from that label. This lets the Collapse automatically copy that label's row type for the dataset, chain the dataset's input label to that label, and also deduce the option unit at the main level, so that option can be skipped. It's a pure convenience that saves the manual steps. In the future a Collapse dataset should probably take a whole list of source labels and chain itself to all of them, but for now it takes only one.

This example produces exactly the same output as the previous one, so there is no use in copying it again.

Another item that hasn't been shown yet: you can get the list of dataset names (well, currently only one name):

@names = $collapse->getDatasets();

The Collapse implementation is reasonably small, and is another worthy example to show. It's a common template, with no code generation whatsoever, just a combination of ready components. As with SimpleAggregator, the current Collapse is quite simple and will grow more features over time, so I've copied the original simple version into t/xCollapse.t to stay there unchanged.

The most notable thing about Collapse is that it took just about an hour to write the first version of it and another three or so hours to test it. Which is a lot less than the similar code took in the Aleri or Coral8 code base. The reason for this is that Triceps provides fairly flexible base data structures that can be combined easily, directly in a scripting language. There is no need to re-do a lot from scratch every time, just take something and add a little bit on top.

So here it is, with the interspersed commentary.

sub new # ($class, $optName => $optValue, ...)
{
  my $class = shift;
  my $self = {};

  &Triceps::Opt::parse($class, $self, {
    unit => [ undef, sub { &Triceps::Opt::ck_ref(@_, "Triceps::Unit") } ],
    name => [ undef, \&Triceps::Opt::ck_mandatory ],
    data => [ undef, sub { &Triceps::Opt::ck_mandatory(@_); &Triceps::Opt::ck_ref(@_, "ARRAY") } ],
  }, @_);

  # Keeps the names of the datasets in the order they have been defined
  # (since the hash loses the order).
  $self->{dsetnames} = [];

  # parse the data element
  my %data_unparsed = @{$self->{data}};
  my $dataset = {};
  &Triceps::Opt::parse("$class data set (" . ($data_unparsed{name} or 'UNKNOWN') . ")", $dataset, {
    name => [ undef, \&Triceps::Opt::ck_mandatory ],
    key => [ undef, sub { &Triceps::Opt::ck_mandatory(@_); &Triceps::Opt::ck_ref(@_, "ARRAY", "") } ],
    rowType => [ undef, sub { &Triceps::Opt::ck_ref(@_, "Triceps::RowType"); } ],
    fromLabel => [ undef, sub { &Triceps::Opt::ck_ref(@_, "Triceps::Label"); } ],
  }, @{$self->{data}});

The options parsing goes as usual. The option data is parsed again for the options inside it, and those are placed into the hash %$dataset.

  # save the dataset for the future
  push @{$self->{dsetnames}}, $dataset->{name};
  $self->{datasets}{$dataset->{name}} = $dataset;
  # check the options
  &Triceps::Opt::handleUnitTypeLabel("Triceps::Collapse data set (". $dataset->{name} . ")",
    "unit at the main level", \$self->{unit},
    "rowType", \$dataset->{rowType},
    "fromLabel", \$dataset->{fromLabel});
  my $lbFrom = $dataset->{fromLabel};

If fromLabel is used, the row type and possibly unit are found from it by Triceps::Opt::handleUnitTypeLabel(). Or if the unit was specified explicitly, it gets checked for consistency with the label's unit. See Section 10.5: “Template options” for more detail. The early version of Collapse in t/xCollapse.t actually pre-dates Triceps::Opt::handleUnitTypeLabel(), and there the similar functionality is done manually.

  # create the tables
  $dataset->{tt} = Triceps::TableType->new($dataset->{rowType})
    ->addSubIndex("primary",
      Triceps::IndexType->newHashed(key => $dataset->{key})
    );
  $dataset->{tt}->initialize();

  $dataset->{tbInsert} = $self->{unit}->makeTable($dataset->{tt}, $self->{name} . "." . $dataset->{name} . ".tbInsert");
  $dataset->{tbDelete} = $self->{unit}->makeTable($dataset->{tt}, $self->{name} . "." . $dataset->{name} . ".tbDelete");

The state is kept in two tables. The reason for their existence is that after collapsing, the Collapse may send for each key one of:

  • a single INSERT rowop, if the row was not there before and became inserted,
  • a DELETE rowop if the row was there before and then became deleted,
  • a DELETE followed by an INSERT if the row was there but then changed its value,
  • or nothing if the row was not there before, and then was inserted and deleted, or if there was no change to the row.

Accordingly, this state is kept in two tables: one contains the DELETE part, another the INSERT part for each key, and either part may be empty (or both, if the row at that key has not been changed). After each flush both tables become empty, and then start collecting the modifications again.

  # create the labels
  $dataset->{lbIn} = $self->{unit}->makeLabel($dataset->{rowType}, $self->{name} . "." . $dataset->{name} . ".in",
    undef, \&_handleInput, $self, $dataset);
  $dataset->{lbOut} = $self->{unit}->makeDummyLabel($dataset->{rowType}, $self->{name} . "." . $dataset->{name} . ".out");

The input and output labels get created. The input label has the function with the processing logic set as its handler. The output label is just a dummy. Note that the tables don't get connected anywhere, they are just used as storage, without any immediate reactions to their modifications.

  # chain the input label, if any
  if (defined $lbFrom) {
    $lbFrom->chain($dataset->{lbIn});
    delete $dataset->{fromLabel}; # no need to keep the reference any more, avoid a reference cycle
  }

And if the fromLabel was used, the Collapse gets connected to it. After that there is no good reason to keep a separate reference to that label, especially considering that it creates a reference loop that would not be cleaned until the input label gets cleaned by the unit. So it gets deleted early instead.

  bless $self, $class;
  return $self;
}

The final blessing is boilerplate. The constructor creates the data structures but doesn't implement any logic. The logic goes next:

# (protected)
# handle one incoming row on a dataset's input label
sub _handleInput # ($label, $rop, $self, $dataset)
{
  my $label = shift;
  my $rop = shift;
  my $self = shift;
  my $dataset = shift;

  if ($rop->isInsert()) {
    # Simply add to the insert table: the effect is the same, independently of
    # whether the row was previously deleted or not. This also handles correctly
    # multiple inserts without a delete between them, even though this kind of
    # input is not really expected.
    $dataset->{tbInsert}->insert($rop->getRow());

The Collapse object knows nothing about the data that went through it before. After each flush it starts again from scratch. It expects that the stream of rows is self-consistent, and makes the conclusions about the previous data based on the new data it sees. An INSERT rowop may mean one of two things: either there was no previous record with this key, or there was a previous record with this key and then it got deleted. The Delete table can be used to differentiate between these situations: if there was a row that was then deleted, the Delete table would contain that row. But for the INSERT it doesn't matter: in either case it just inserts the new row into the Insert table. If there was no such row before, it would be the new INSERT. If there was such a row before, it would be an INSERT following a DELETE.

  } elsif($rop->isDelete()) {
    # If there was a row in the insert table, delete that row (undoing the previous insert).
    # Otherwise it means that there was no previous insert seen in this round, so this must be a
    # deletion of a row inserted in the previous round, so insert it into the delete table.
    if (! $dataset->{tbInsert}->deleteRow($rop->getRow())) {
      $dataset->{tbDelete}->insert($rop->getRow());
    }
  }
}

The DELETE case is more interesting. If we see a DELETE rowop, this means that either there was an INSERT sent before the last flush and now that INSERT becomes undone, or that there was an INSERT after the flush, which also becomes undone. The actions for these cases are different: if the INSERT was before the flush, this row should go into the Delete table, and eventually propagate as a DELETE during the next flush. If the last INSERT was after the flush, then its row would be stored in the Insert table, and now we just need to delete that row and pretend that it has never been.

That's what the logic does: first it tries to remove the row from the Insert table. If that succeeds, then it was an INSERT after the flush that has now become undone, and there is nothing more to do. If there was no row to delete, the INSERT must have happened before the last flush, and we need to remember this row in the Delete table and pass it on in the next flush.

This logic is not resistant to incorrect data sequences. If there ever are two DELETEs in a row for the same key (which should never happen in a correct sequence), the second DELETE will end up in the Delete table.

# Unlatch and flush the collected data, then latch again.
sub flush # ($self)
{
  my $self = shift;
  my $unit = $self->{unit};
  my $OP_INSERT = &Triceps::OP_INSERT;
  my $OP_DELETE = &Triceps::OP_DELETE;
  foreach my $dataset (values %{$self->{datasets}}) {
    my $tbIns = $dataset->{tbInsert};
    my $tbDel = $dataset->{tbDelete};
    my $lbOut = $dataset->{lbOut};
    my $next;
    # send the deletes always before the inserts
    for (my $rh = $tbDel->begin(); !$rh->isNull(); $rh = $next) {
      $next = $rh->next(); # advance the iterator before removing
      $tbDel->remove($rh);
      $unit->call($lbOut->makeRowop($OP_DELETE, $rh->getRow()));
    }
    for (my $rh = $tbIns->begin(); !$rh->isNull(); $rh = $next) {
      $next = $rh->next(); # advance the iterator before removing
      $tbIns->remove($rh);
      $unit->call($lbOut->makeRowop($OP_INSERT, $rh->getRow()));
    }
  }
}

The flushing is fairly straightforward: first it sends on all the DELETEs, then all the INSERTs, clearing the tables along the way. At first I've thought of matching the DELETEs and INSERTs together, sending them next to each other in case both are available for some key. It's not that difficult to do. But then I've realized that it doesn't matter and just did it the simple way.
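
Stripped of the Triceps machinery, the whole Collapse logic fits in a few lines. Here is a toy Python model (all names invented for illustration, not the Triceps API) with the two state tables as dictionaries keyed by the key fields; it reproduces the per-key outcomes listed earlier:

```python
# A toy model of the Collapse state machine, not the Triceps API:
# the two state tables become plain dicts keyed by the key fields.
class MiniCollapse:
    def __init__(self):
        self.ins = {}   # the Insert table: rows inserted this round
        self.dels = {}  # the Delete table: rows deleted this round

    def insert(self, key, row):
        # An INSERT always lands in the Insert table, whether or not
        # a DELETE for this key was seen earlier in the round.
        self.ins[key] = row

    def delete(self, key, row):
        # Undo an insert from this round if there was one; otherwise
        # the row pre-dates the last flush, so remember the DELETE.
        if key in self.ins:
            del self.ins[key]
        else:
            self.dels[key] = row

    def flush(self):
        # The deletes always go before the inserts; both tables
        # become empty afterwards.
        out = [("DELETE", r) for r in self.dels.values()]
        out += [("INSERT", r) for r in self.ins.values()]
        self.dels.clear()
        self.ins.clear()
        return out

c = MiniCollapse()
c.insert("k1", "a")    # plain insert: survives as INSERT
c.insert("k2", "b")
c.delete("k2", "b")    # inserted then deleted: a no-operation
c.delete("k3", "old")
c.insert("k3", "new")  # modified row: DELETE then INSERT
print(c.flush())  # [('DELETE', 'old'), ('INSERT', 'a'), ('INSERT', 'new')]
```

Note how the key "k2" produces nothing at all, exactly as in the first example run.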

# Get the input label of a dataset.
# Confesses on error.
sub getInputLabel($$) # ($self, $dsetname)
{
  my ($self, $dsetname) = @_;
  confess "Unknown dataset '$dsetname'"
    unless exists $self->{datasets}{$dsetname};
  return $self->{datasets}{$dsetname}{lbIn};
}

# Get the output label of a dataset.
# Confesses on error.
sub getOutputLabel($$) # ($self, $dsetname)
{
  my ($self, $dsetname) = @_;
  confess "Unknown dataset '$dsetname'"
    unless exists $self->{datasets}{$dsetname};
  return $self->{datasets}{$dsetname}{lbOut};
}

# Get the list of dataset names (currently only one).
sub getDatasets($) # ($self)
{
  my $self = shift;
  return @{$self->{dsetnames}};
}

The getter functions are fairly simple. The only catch is that the code has to check for exists before it reads the value of $self->{datasets}{$dsetname}{lbOut}. Otherwise, if an incorrect $dsetname is used, the reading would return an undef but along the way would create an unpopulated $self->{datasets}{$dsetname}. Which would then cause a crash when flush() tries to iterate through it and finds the dataset options missing.
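
The same trap exists outside of Perl: Python's collections.defaultdict autovivifies in exactly the same way. A small sketch of the hazard, with invented names:

```python
from collections import defaultdict

# A model of the autovivification hazard: a getter that doesn't
# check membership first silently plants an empty dataset entry.
datasets = defaultdict(dict)
datasets["idata"] = {"lbOut": "the real output label"}

def get_output_unsafe(name):
    # merely reading datasets[name] creates an empty dict for a
    # misspelled name instead of failing
    return datasets[name].get("lbOut")

get_output_unsafe("idta")  # typo: returns None...
print(sorted(datasets))    # ['idata', 'idta'] -- the empty entry
                           # would later break a loop over datasets

def get_output_safe(name):
    # the equivalent of the exists() check in the Perl code
    if name not in datasets:
        raise KeyError("Unknown dataset '%s'" % name)
    return datasets[name]["lbOut"]
```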

That's it, Collapse in a nutshell! Another way to do the collapse will be shown in Section 15.2: “Streaming functions by example, another version of Collapse”. And one more piece to it is shown in Section 15.8: “Streaming functions and template results”.

14.3. Large deletes in small chunks

If you have worked with Coral8 and similar CEP systems, you should be familiar with the situation when you ask it to delete a million rows from the table and the model goes into self-contemplation for half an hour, not reacting to any requests. It starts responding again only when the deletes are finished. That's because the execution is single-threaded, and deleting a million rows takes time.

Triceps is susceptible to the same issue. So, how to avoid it? Even better, how to make the deletes work in the background, at a low priority, kicking in only when there are no other pending requests?

The solution is to do it in smaller chunks. Delete a few rows (say, a thousand or so), then check if there are any other requests. Keep processing these other requests until the model becomes idle. Then continue with deleting the next chunk of rows.

Let's make a small example of it. First, let's make a table.

our $uChunks = Triceps::Unit->new("uChunks");

# data is just some dumb easily-generated filler
our $rtData = Triceps::RowType->new(
  s => "string",
  i => "int32",
);

# the data is auto-generated by a sequence
our $seq = 0;

our $ttData = Triceps::TableType->new($rtData)
  ->addSubIndex("fifo", Triceps::IndexType->newFifo())
;
$ttData->initialize();
our $tData = $uChunks->makeTable($ttData, "tJoin1");
makePrintLabel("lbPrintData", $tData->getOutputLabel());

The data in the table is completely silly, just something to put in there. Even the index is a simple FIFO, just something to keep the table together.

Next, the clearing logic.

# notifications about the clearing
our $rtNote = Triceps::RowType->new(
  text => "string",
);

# rowops to run when the model is otherwise idle
our $trayIdle = $uChunks->makeTray();

our $lbReportNote = $uChunks->makeDummyLabel($rtNote, "lbReportNote"
);
makePrintLabel("lbPrintNote", $lbReportNote);

# code that clears the table in small chunks
our $lbClear = $uChunks->makeLabel($rtNote, "lbClear", undef, sub {
  $tData->clear(2); # no more than 2 rows deleted per run
  if ($tData->size() > 0) {
    $trayIdle->push($_[0]->adopt($_[1]));
  } else {
    $uChunks->makeHashCall($lbReportNote, "OP_INSERT",
      text => "done clearing",
    );
  }
});

We want to get a notification when the clearing is done. This notification will be sent as a rowop with row type $rtNote to the label $lbReportNote. Which then just gets printed, so that we can see it. In a production system it would be sent back to the requestor.

The clearing is initiated by sending a row (of the same type $rtNote) to the label $lbClear. Which does the job and then sends the notification of completion. In the real world not the whole table would probably be erased but only the old data, from before a certain date, as was shown in Section 12.11: “JoinTwo input event filtering”. Here for simplicity all the data gets wiped out.

But the method clear() stops after the number of deleted rows reaches the limit. Since it's really inconvenient to play with a million rows, we'll play with just a few rows. And so the chunk size limit is also set smaller, to just two rows instead of a thousand. When the limit is reached and there are still rows left in the table, the code pushes the command row into the idle tray for later rescheduling and returns. The adoption part is not strictly necessary, and this small example would work fine without it. But it's a safeguard for the more complicated programs that may have the labels chained, with our clearing label being just one link in a chain. If the incoming rowop gets rescheduled as is, the whole chain will get executed again, which might not be desirable. Re-adopting it to our label will cause only our label (okay, and everything chained from it) to be executed.

How would the rowops in the idle tray get executed? In the real world, the main loop logic would be like this pseudocode:

while(1) {
  if (idle tray is empty)
    timeout = infinity;
  else
    timeout = 0;
  poll(file descriptors, timeout);
  if (poll timed out)
    run the idle tray;
  else
    process the incoming data;
}
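
The same loop can be modeled without real file descriptors by abstracting the poll into a callback. A Python sketch with all the names invented for illustration (a real version would use select or poll on actual descriptors):

```python
from collections import deque

def run_loop(poll, handle_input):
    # pending models the idle tray: work to run when input goes quiet
    pending = deque()
    while True:
        timeout = 0 if pending else None  # None means wait forever
        item = poll(timeout)
        if item == "EOF":
            break
        if item is None and pending:
            pending.popleft()()           # poll timed out: one idle job
        elif item is not None:
            handle_input(item, pending)

# A scripted run: two inputs arrive, then the input goes quiet
# (None stands for a poll timeout), then the connection closes.
script = deque(["data1", "data2", None, None, "EOF"])
log = []

def poll(timeout):
    return script.popleft()

def handle(item, pending):
    log.append("input " + item)
    # each input queues one chunk of background work
    pending.append(lambda: log.append("idle chunk"))

run_loop(poll, handle)
print(log)  # ['input data1', 'input data2', 'idle chunk', 'idle chunk']
```

Note that the pending work runs only after the ready input has been drained, which is exactly the low-priority behavior we're after.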

The example from Section 7.9: “Main loop with a socket” can be extended to work like this. But it's hugely inconvenient for a toy demonstration: getting the timing right would be a major pain. So instead let's just add the command idle to the main loop, to trigger the idle logic at will. The main loop of the example is:

while(<STDIN>) {
  chomp;
  my @data = split(/,/); # starts with a command, then string opcode
  my $type = shift @data;
  if ($type eq "data") {
    my $count = shift @data;
    for (; $count > 0; $count--) {
      ++$seq;
      $uChunks->makeHashCall($tData->getInputLabel(), "OP_INSERT",
        s => ("data_" . $seq),
        i => $seq,
      );
    }
  } elsif ($type eq "dump") {
    for (my $rhit = $tData->begin(); !$rhit->isNull(); $rhit = $rhit->next()) {
      print("dump: ", $rhit->getRow()->printP(), "\n");
    }
    for my $r ($trayIdle->toArray()) {
      print("when idle: ", $r->printP(), "\n");
    }
  } elsif ($type eq "clear") {
    $uChunks->makeHashCall($lbClear, "OP_INSERT",
      text => "clear",
    );
  } elsif ($type eq "idle") {
    $uChunks->schedule($trayIdle);
    $trayIdle->clear();
  }
  $uChunks->drainFrame(); # just in case, for completeness
}

The data is put into the table by the main loop in a silly way: when we send a command like data,3, the main loop will insert 3 new rows into the table. The contents are generated from sequential numbers, so the rows can be told apart. As the table gets changed, the updates get printed by the label lbPrintData.

The command dump dumps the contents of both the table and of the idle tray.

The command clear issues a clearing request by calling the label $lbClear. The first chunk gets cleared right away but then the control returns back to the main loop. If not all the data were cleared, an idle rowop will be placed into the idle tray.

The command idle that simulates the input idleness will then pick up that rowop from the idle tray and reschedule it.

All the pieces have been put together, let's run the code. The commentary is interspersed, and as usual, the input lines are shown in bold:

data,1
tJoin1.out OP_INSERT s="data_1" i="1"
clear
tJoin1.out OP_DELETE s="data_1" i="1"
lbReportNote OP_INSERT text="done clearing"

This is pretty much a dry run: put in one row (less than the chunk size), see it deleted on clearing. And see the completion reported afterwards.

data,5
tJoin1.out OP_INSERT s="data_2" i="2"
tJoin1.out OP_INSERT s="data_3" i="3"
tJoin1.out OP_INSERT s="data_4" i="4"
tJoin1.out OP_INSERT s="data_5" i="5"
tJoin1.out OP_INSERT s="data_6" i="6"

Add more data, which will be enough for three chunks.

clear
tJoin1.out OP_DELETE s="data_2" i="2"
tJoin1.out OP_DELETE s="data_3" i="3"

Now the clearing does one chunk and stops, waiting for the idle condition.

dump
dump: s="data_4" i="4"
dump: s="data_5" i="5"
dump: s="data_6" i="6"
when idle: lbClear OP_INSERT text="clear"

See what's inside: the remaining 3 rows, and a row in the idle tray saying that the clearing is in progress.

idle
tJoin1.out OP_DELETE s="data_4" i="4"
tJoin1.out OP_DELETE s="data_5" i="5"

The model goes idle once more, and one more chunk of two rows gets deleted.

data,1
tJoin1.out OP_INSERT s="data_7" i="7"
dump
dump: s="data_6" i="6"
dump: s="data_7" i="7"
when idle: lbClear OP_INSERT text="clear"

What will happen if we add more data in between the chunks of clearing? Let's see, let's add one more row. It shows up in the table as usual.

idle
tJoin1.out OP_DELETE s="data_6" i="6"
tJoin1.out OP_DELETE s="data_7" i="7"
lbReportNote OP_INSERT text="done clearing"
dump
idle

On the next idle condition the clearing picks up whatever was in the table for the next chunk. Since there were only two rows left, it's the last chunk, and the clearing reports a successful completion. And a dump shows that there is nothing left in the table nor in the idle tray. The next idle condition does nothing, because the idle tray is empty.

The deletion could also be interrupted and cancelled by removing the row from the idle tray. That would involve converting the tray to an array, finding and deleting the right rowop, and converting the array back into a tray. Overall it's fairly straightforward. The search in the array is linear, but there should not be that many idle requests, so it should be quick enough.
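
With the tray represented as a plain array, the cancellation amounts to a filter. A tiny sketch (the request format here is made up):

```python
# The idle tray modeled as a list of (label, payload) requests.
idle_tray = [("lbClear", "clear"), ("lbRefresh", "r1")]

def cancel(tray, label):
    # keep everything except the requests aimed at the given label
    return [req for req in tray if req[0] != label]

idle_tray = cancel(idle_tray, "lbClear")
print(idle_tray)  # [('lbRefresh', 'r1')]
```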

The delete-by-chunks logic can be made into a template, but I'm not sure yet what the best way to do it is. It would have to have a lot of configurable parts.

On another subject, scheduling the things to be done on idle adds an element of unpredictability to the model. It's impossible to predict the exact timing of the incoming requests, and the idle work may get inserted between any of them. Presumably it's OK because the data being deleted should not be participating in any logic at this time any more. For repeatability in the unit tests, make the chunk size adjustable and adjust it to a size larger than the biggest amount of data used in the unit tests.

A similar logic can also be used in querying the data. But it's more difficult. For deletion the continuation is easy: just take the first row in the index, and it will be the place to continue (because the index is ordered correctly, and because the previous rows are getting deleted). For querying you would have to remember the next row handle and continue from it. Which is OK if it cannot get deleted in the meantime. But if it can get deleted, you'll have to keep track of that too, and advance to the next row handle when this happens. And if you want to receive a full snapshot with the following subscription to all updates, you'd have to check whether the modified rows are before or after the marked handle, and pass them through if they are before it, letting the user see the updates to the data already received. And since the data is being sent to the user, filling up the output buffer and stopping would stop the whole model too, and it would not restart until the user reads the buffered data. So there has to be a flow control logic that would stop the query when the output buffer fills up, return to the normal operation, and then reschedule the idle job for the query only when the output buffer drains down. I've kind of started on an example of the chunked query too, but then because of all these complications decided to leave it for later.
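
One way around the fragile row-handle marker is to resume by key rather than by handle. A hedged Python sketch of a chunked query over a table modeled as a dict (names invented, not the Triceps API): deletions between chunks cannot invalidate a key-based marker, since the next chunk simply starts at the first key greater than it.

```python
def query_chunk(table, after_key, limit):
    # Resume from the first key greater than the marker, so a row
    # deleted between chunks doesn't break the continuation.
    sent = []
    for k in sorted(table):
        if after_key is not None and k <= after_key:
            continue
        sent.append((k, table[k]))
        if len(sent) >= limit:
            break
    marker = sent[-1][0] if sent else None
    return sent, marker

table = {1: "a", 2: "b", 3: "c", 4: "d", 5: "e"}
chunk, marker = query_chunk(table, None, 2)  # [(1, 'a'), (2, 'b')]
del table[3]              # a row disappears between the chunks
chunk, marker = query_chunk(table, marker, 2)
print(chunk)  # [(4, 'd'), (5, 'e')]
```

This sidesteps only the marker problem; the flow control for the output buffer remains a separate issue, as described above.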

Chapter 15. Streaming functions

15.1. Introduction to streaming functions

The streaming functions are a cool and advanced concept. I've never seen it anywhere before, and for all I know I have invented it.

First let's look at the differences between the common functions and macros (or templates and such), shown in Figure 15.1.

The difference between the function and macro calls.

Figure 15.1. The difference between the function and macro calls.


What happens during a function call? Some code (marked with the light bluish color) is happily zooming along when it decides to call a function. It prepares some arguments and jumps to the function code (reddish). The function executes, computes its result and jumps back to the point right after it has been called from. Then the original code continues from there (the slightly darker bluish color).

What happens during a macro (or template) invocation? It starts with some code zooming along in the same way, however when the macro call time comes, it prepares the arguments and then does nothing. It gets away with it because the compiler has done the work: it has placed the macro code right where it's called, so there is no need for jumps. After the macro is done, again it does nothing: the compiler has placed the next code to execute right after it, so it just continues on its way.

So far it's pretty equivalent. An interesting difference happens when the function or macro is called from more than one place. With a macro, another copy of the macro is created, inserted between its call and return points. That's why in the figure the macro is shown twice. But with the function, the same function code is executed every time, and then returns back to the caller. That's why in the figure there are two function callers with their paths through the same function. But how does the function know, where should it jump on return? The caller tells it by pushing the return address onto the stack. When the function is done, it pops this address from the stack and jumps there.

Still, it looks all the same. A macro call is a bit more efficient, except when a large complex macro is called from many places; then it would be more efficient as a function. However there is another difference if the function or macro holds some context (say, a static variable): each invocation of the macro will get its own context but all the function calls will share the same context. The only way to share the context with a macro is to pass some global context as its argument (or you can use a separately defined global variable if you're willing to dispense with some strict modularity).
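
The context difference can be shown with a toy Python model using closures: instantiating a fresh closure per call site behaves like a macro, while sharing one closure behaves like a function.

```python
def make_counter():
    # each instantiation gets its own "static variable"
    count = 0
    def bump():
        nonlocal count
        count += 1
        return count
    return bump

# function-like: one shared body with one shared context
shared = make_counter()
print(shared(), shared())       # 1 2

# macro-like: each call site gets its own expanded copy and context
site_a, site_b = make_counter(), make_counter()
print(site_a(), site_b())       # 1 1
```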

Now let's switch to the CEP world. The Sybase or StreamBase modules are essentially macros, and so are the Triceps templates. When such a macro gets instantiated, a whole new copy of it gets created with its tables/windows and streams/labels. Its input and output streams/labels get all connected in a fixed way. The limitation is that if the macro contains any tables, each instantiation gets another copy of them. Well, in Triceps you can use a table as an argument to a template. In the other systems I think you still can't, so if you want to work with a common table in a module, you have to make up the query-response patterns, like the one described in Section 10.1: “Comparative modularity” .

In a query-response pattern there is some common sub-model, with a stream (in Triceps terms, a label, but here we're talking about the other systems) for the queries to come in and a stream for the results to come out (either side might have not just one but multiple streams). There are multiple inputs connected, from all the request sources, and the outputs are connected back to all the request sources. All the request sources (i.e. callers) get back the whole output of the pattern, so they need to identify which output came from their input and ignore the rest. They do this by adding unique ids to their queries and filtering the results. In the end, it looks almost like a function but with much pain involved.

To make it look quite like a function, one thing is needed: the selective connection of the result streams (or, returning to the Triceps terminology, labels) to the caller. Connect the output labels, send some input, have it processed and send the result through the connection, disconnect the output labels. And what you get is a streaming function. It's very much like a common function but working on the streaming data arguments and results.

The Figure 15.2 highlights the similarity and differences between the query patterns and the streaming functions.

The query patterns and streaming functions.

Figure 15.2. The query patterns and streaming functions.


The thick lines show where the data goes during one concrete call. The thin lines show the connections that do exist but without the data going through them at the moment (they will be used during the other calls, from these other callers). The dashed thin line shows the connection that doesn't exist at the moment. It will be created when needed (and at that time the thick arrow from the streaming function to what is now the current return would disappear).

The particular beauty of the streaming functions for Triceps is that the other callers don't even need to exist yet. They can be created and connected dynamically, do their job, call the function, use its result, and then be disposed of. The calling side in Triceps doesn't have to be streaming either: it could as well be procedural.

15.2. Streaming functions by example, another version of Collapse

The streaming functions have proved quite useful in Triceps, in particular the inter-thread communications use an interface derived from them. But the ironic part is that coming up with the good examples of the streaming function usage in Triceps is surprisingly difficult. The flexibility of Triceps is the problem. If all you have is SQL, the streaming functions become pretty much a must. But if you can write the procedural code, most things are easier that way, with the normal procedural functions. For a streaming function to become beneficial, it has to be written in SQLy primitives (such as tables, joins) and not be easily reducible to the procedural code. The streaming function examples that aren't big enough for their own file are collected in t/xFn.t.

The most distilled example I've come up with is for the implementation of Collapse. The original implementation of Collapse is described in Section 14.2: “Collapsed updates”. The flush() there goes in a loop, deleting all the rows from the state tables and sending them as rowops to the output.

The deletion of all the rows can nowadays be done more easily with the Table method clear(). However by itself it doesn't solve the problem of sending the output. It sends the deleted rows to the table's output label, but we can't just connect the output of the state tables to the Collapse output: then it would also pick up all the intermediate changes! The data needs to be picked up from the tables' output selectively, only in flush().

This makes it a good streaming function: the body of the function consists of running clear() on the state tables, and its result is whatever comes on the output labels of the tables.

Since most of the logic remains unchanged, I've implemented this new version of Collapse in t/xFn.t as a subclass that extends and replaces some of the code with its own:

package FnCollapse;

sub CLONE_SKIP { 1; }

our @ISA=qw(Triceps::Collapse);

sub new # ($class, $optName => $optValue, ...)
{
  my $class = shift;
  my $self = $class->SUPER::new(@_);
  # Now add an FnReturn to the output of the dataset's tables.
  # One return is enough for both.
  # Also create the bindings for sending the data.
  foreach my $dataset (values %{$self->{datasets}}) {
    my $fret = Triceps::FnReturn->new(
      name => $self->{name} . "." . $dataset->{name} . ".retTbl",
      labels => [
        del => $dataset->{tbDelete}->getOutputLabel(),
        ins => $dataset->{tbInsert}->getOutputLabel(),
      ],
    );
    $dataset->{fret} = $fret;

    # these variables will be compiled into the binding snippets
    my $lbOut = $dataset->{lbOut};
    my $unit = $self->{unit};
    my $OP_INSERT = &Triceps::OP_INSERT;
    my $OP_DELETE = &Triceps::OP_DELETE;

    my $fbind = Triceps::FnBinding->new(
      name => $self->{name} . "." . $dataset->{name} . ".bndTbl",
      on => $fret,
      unit => $unit,
      labels => [
        del => sub {
          if ($_[1]->isDelete()) {
            $unit->call($lbOut->adopt($_[1]));
          }
        },
        ins => sub {
          if ($_[1]->isDelete()) {
            $unit->call($lbOut->makeRowop($OP_INSERT, $_[1]->getRow()));
          }
        },
      ],
    );
    $dataset->{fbind} = $fbind;
  }
  bless $self, $class;
  return $self;
}

# Override the base-class flush with a different implementation.
sub flush # ($self)
{
  my $self = shift;
  foreach my $dataset (values %{$self->{datasets}}) {
    # The binding takes care of producing and directing
    # the output. AutoFnBind will unbind when the block ends.
    my $ab = Triceps::AutoFnBind->new(
      $dataset->{fret} => $dataset->{fbind}
    );
    $dataset->{tbDelete}->clear();
    $dataset->{tbInsert}->clear();
  }
}

new() adds the streaming function elements in each data set. They consist of two parts: FnReturn defines the return value of a streaming function (there is no formal definition of the body or the entry point since they are quite flexible), and FnBinding defines a call of the streaming function. In this case the function is called in only one place, so one FnBinding is defined. If called from multiple places, there would be multiple FnBindings.

When a normal procedural function is called, the return address provides the connection to get the result back from it to the caller. In a streaming function, the FnBinding connects the result labels to the caller's further processing of the returned data. Unlike the procedural functions, the data is not returned in one step (run the function, compute the value, return it). Instead the return value of a streaming function is a stream of rowops. As each of them is sent to a return label, it goes through the binding and to the caller's further processing. Then the streaming function continues, producing the next rowop, and so on.

If this sounds complicated, please realize that here we're dealing with the assembly language equivalent for streaming functions. I expect that over time the more high-level primitives will be developed and it will become easier.

The second source of complexity is that the arguments of a streaming function are not computed in one step either. You don't normally have a full set of rows to send to a streaming function in one go. Instead you set up the streaming call to bind the result, then you pump the argument rowops to the function's input, creating them in whatever way you wish.

Getting back to the definition of a streaming function, FnReturn defines a set of labels, each with a logical name. In this case the names are del and ins. The labels inside FnReturn are a special variety of dummy labels, but they are chained to some real labels that send the result of the function. The snippet

del => $dataset->{tbDelete}->getOutputLabel(),

says to create a return label named del and chain it from the tbDelete's output label. The FnReturn normally does its chaining with chainFront(), unless the option chainFront => 0 tells it otherwise. But in this particular case the chaining order wouldn't matter. There are more details to the naming and label creation but let's not get bogged down in them now.
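For illustration, here is a sketch of the same FnReturn construction with the chaining order changed, using the chainFront option mentioned above:

```perl
# A sketch of an FnReturn that chains at the back of the source
# labels instead of the front (front-chaining is the default).
my $fret = Triceps::FnReturn->new(
  name => $self->{name} . "." . $dataset->{name} . ".retTbl",
  chainFront => 0, # chain after any labels already chained there
  labels => [
    del => $dataset->{tbDelete}->getOutputLabel(),
    ins => $dataset->{tbInsert}->getOutputLabel(),
  ],
);
```

As said above, in this particular case the order makes no difference, since nothing else is chained to the tables' output labels.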

The FnBinding defines a matching set of labels, with the same logical names. It's like a receptacle and a plug: you put the plug into the receptacle and get the data flowing, you unplug it and the data flow stops. The Perl version of FnBinding provides a convenience: when it gets a code reference instead of a label, it automatically creates a label with that code for its handler.

In this case both binding labels forward the data to the Collapse's output label. Only the one for the Insert table has to change the opcodes to OP_INSERT. The check

if ($_[1]->isDelete()) ...

is really redundant and is there just to be on the safe side: we know that when the data flows, all of it will be coming from the table clearing and will have the opcode OP_DELETE.

The actual call happens in flush(): Triceps::AutoFnBind is a constructor of the scope object that does the plug-into-receptacle thing, with automatic unplugging when the object returned by it gets destroyed on leaving the block scope. If you want to do things manually, FnReturn has the methods push() and pop(), but the scoped binding is safer and easier. Once the binding is done, the data is sent through the function by calling clear() on both tables. And then the block ends, $ab gets destroyed, the AutoFnBind destructor undoes the binding, and thus the streaming function call completes.
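For comparison, here is a hand-managed sketch of the same scope with push() and pop(); this assumes that pop() accepts the binding for verification, and shows why the scoped AutoFnBind is safer, since it handles the error cases automatically:

```perl
# A manual equivalent of the AutoFnBind scope in flush().
# push() plugs the binding in, pop() unplugs it; the eval
# makes sure that pop() still happens even if clear() dies.
$dataset->{fret}->push($dataset->{fbind});
eval {
  $dataset->{tbDelete}->clear();
  $dataset->{tbInsert}->clear();
};
$dataset->{fret}->pop($dataset->{fbind});
die $@ if $@; # re-throw any error after unbinding
```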

The result produced by this version of Collapse is exactly the same as by the original version. And even when we get down to the fine detail, it's produced with the exact same logical sequence: the rows are sent out as they are deleted from the state tables. But it's structured differently: instead of the procedural deletion and sending of the rows, the internal machinery of the tables gets invoked, and the results of that machinery are then converted to the form suitable for the Collapse results and propagated to the output.

Philosophically, it could be argued what the body of this function really is. Is it just the internal logic of the table deletion that gets triggered by clear() in the caller? Or are the clear() calls also a part of the function body? In practice it just doesn't matter.

15.3. Collapse with grouping by key with streaming functions

The Collapse as shown before sends all the collected deletes before all the collected inserts. For example, if it has collected the updates for four rows, the output will be (assuming that the Collapse element is named collapse and the data set in it is named idata):

collapse.idata.out OP_DELETE local_ip="3.3.3.3" remote_ip="7.7.7.7"
    bytes="100"
collapse.idata.out OP_DELETE local_ip="2.2.2.2" remote_ip="6.6.6.6"
    bytes="100"
collapse.idata.out OP_DELETE local_ip="4.4.4.4" remote_ip="8.8.8.8"
    bytes="100"
collapse.idata.out OP_DELETE local_ip="1.1.1.1" remote_ip="5.5.5.5"
    bytes="100"
collapse.idata.out OP_INSERT local_ip="3.3.3.3" remote_ip="7.7.7.7"
    bytes="300"
collapse.idata.out OP_INSERT local_ip="2.2.2.2" remote_ip="6.6.6.6"
    bytes="300"
collapse.idata.out OP_INSERT local_ip="4.4.4.4" remote_ip="8.8.8.8"
    bytes="300"
collapse.idata.out OP_INSERT local_ip="1.1.1.1" remote_ip="5.5.5.5"
    bytes="300"

What if you want the updates produced as deletes immediately followed by the matching inserts with the same key? Like this:

collapse.idata.out OP_DELETE local_ip="3.3.3.3" remote_ip="7.7.7.7"
    bytes="100"
collapse.idata.out OP_INSERT local_ip="3.3.3.3" remote_ip="7.7.7.7"
    bytes="300"
collapse.idata.out OP_DELETE local_ip="2.2.2.2" remote_ip="6.6.6.6"
    bytes="100"
collapse.idata.out OP_INSERT local_ip="2.2.2.2" remote_ip="6.6.6.6"
    bytes="300"
collapse.idata.out OP_DELETE local_ip="4.4.4.4" remote_ip="8.8.8.8"
    bytes="100"
collapse.idata.out OP_INSERT local_ip="4.4.4.4" remote_ip="8.8.8.8"
    bytes="300"
collapse.idata.out OP_DELETE local_ip="1.1.1.1" remote_ip="5.5.5.5"
    bytes="100"
collapse.idata.out OP_INSERT local_ip="1.1.1.1" remote_ip="5.5.5.5"
    bytes="300"

With the procedural version, this would have required a look-up in the Insert table after processing each row of the Delete table, and handling of the matching row if found. So I've left it out to avoid complicating that example. But in the streaming function form it becomes easy, just change the binding of the del label a little bit:

    my $lbInsInput = $dataset->{tbInsert}->getInputLabel();

    my $fbind = Triceps::FnBinding->new(
      name => $self->{name} . "." . $dataset->{name} . ".bndTbl",
      on => $fret,
      unit => $unit,
      labels => [
        del => sub {
          if ($_[1]->isDelete()) {
            $unit->call($lbOut->adopt($_[1]));
            # If the INSERT is available after this DELETE, this
            # will produce it.
            $unit->call($lbInsInput->adopt($_[1]));
          }
        },
        ins => sub {
          if ($_[1]->isDelete()) {
            $unit->call($lbOut->makeRowop($OP_INSERT, $_[1]->getRow()));
          }
        },
      ],
    );

The del binding first sends the result out as usual and then forwards the DELETE rowop to the Insert table's input. Which then causes the INSERT rowop to be sent if a match is found. Mind you, the look-up and conditional processing still happens. But now it all happens inside the table machinery, all you need to do is add one more line to invoke it.

Let's walk through, in a little more detail, what happens when the clearing of the Delete table deletes the row with (local_ip="3.3.3.3" remote_ip="7.7.7.7").

  1. The Delete table sends a rowop with this row and OP_DELETE to its output label collapse.idata.tbDelete.out.
  2. Which then gets forwarded to a chained label in the FnReturn, collapse.idata.retTbl.del.
  3. FnReturn has an FnBinding pushed into it, so the rowop passes to the matching label in the binding, collapse.idata.bndTbl.del.
  4. The Perl handler of that label gets called, first forwards the rowop to the Collapse output label collapse.idata.out, and then to the Insert table's input label collapse.idata.tbInsert.in.
  5. The Insert table looks up the row by the key, finds it, removes it from the table, and sends an OP_DELETE rowop to its output label collapse.idata.tbInsert.out.
  6. Which then gets forwarded to a chained label in the FnReturn, collapse.idata.retTbl.ins.
  7. FnReturn has an FnBinding pushed into it, so the rowop passes to the matching label in the binding, collapse.idata.bndTbl.ins.
  8. The Perl handler of that label gets called and sends the rowop with the opcode changed to OP_INSERT to the Collapse output label collapse.idata.out.

It's a fairly complicated sequence but all you needed to do was to add one line of code. The downside of course is that if something doesn't go the way you expected, you'd have to trace and understand the whole long sequence (that's the typical trouble with the SQL-based systems).

When the INSERTs are sent after DELETEs, their rows are removed from the Insert table too, so the following clear() of the Insert table won't find them any more and won't send any duplicates; it will send only the inserts for which there were no matching deletes.

And of course if there is only a DELETE collected for a certain key, not an update, there will be no matching row in the Insert table, so the forwarded DELETE request will have no effect and produce no output from the Insert table.

You may notice that the code in the del handler only forwards the rows around, and that can be replaced by a chaining:

    my $lbDel = $unit->makeDummyLabel(
      $dataset->{tbDelete}->getOutputLabel()->getRowType(),
      $self->{name} . "." . $dataset->{name} . ".lbDel");
    $lbDel->chain($lbOut);
    $lbDel->chain($lbInsInput);

    my $fbind = Triceps::FnBinding->new(
      name => $self->{name} . "." . $dataset->{name} . ".bndTbl",
      on => $fret,
      unit => $unit,
      labels => [
        del => $lbDel,
        ins => sub {
          $unit->call($lbOut->makeRowop($OP_INSERT, $_[1]->getRow()));
        },
      ],
    );

This shows another way of label definition in FnBinding: an actual label is created first and then given to the FnBinding, instead of letting it automatically create a label from the code. The condition if ($_[1]->isDelete()) has been removed from the ins part, since it's really redundant and the del part with its chaining doesn't do this check anyway.

This code works just as well and even more efficiently than the previous version, since no Perl code needs to be invoked for del, it all propagates internally through the chaining. However the price is that the DELETE rowops coming out of the output label will have the head-of-the-chain label in them:

collapse.idata.lbDel OP_DELETE local_ip="3.3.3.3" remote_ip="7.7.7.7"
    bytes="100"
collapse.idata.out OP_INSERT local_ip="3.3.3.3" remote_ip="7.7.7.7"
    bytes="300"
collapse.idata.lbDel OP_DELETE local_ip="2.2.2.2" remote_ip="6.6.6.6"
    bytes="100"
collapse.idata.out OP_INSERT local_ip="2.2.2.2" remote_ip="6.6.6.6"
    bytes="300"
collapse.idata.lbDel OP_DELETE local_ip="4.4.4.4" remote_ip="8.8.8.8"
    bytes="100"
collapse.idata.out OP_INSERT local_ip="4.4.4.4" remote_ip="8.8.8.8"
    bytes="300"
collapse.idata.lbDel OP_DELETE local_ip="1.1.1.1" remote_ip="5.5.5.5"
    bytes="100"
collapse.idata.out OP_INSERT local_ip="1.1.1.1" remote_ip="5.5.5.5"
    bytes="300"

The ins side can't be handled just by chaining because it has to replace the opcode in the rowops. Another potential way to handle this would be to define various preprogrammed label types in C++ for many primitive operations, like replacing the opcode, and then build the models by combining them.

The final item is that the code shown in this section involved a recursive call of the streaming function. Its output from the del label got fed back to the function, producing more output on the ins label. This worked because it invoked a different code path in the streaming function than the one that produced the del data. If it were to form a topological loop back to the same path with the same labels, that would have been an error. The more advanced use of recursion is possible and will be discussed in more detail later.

15.4. Table-based translation with streaming functions

Next I want to show an example that is in its essence kind of dumb. The same thing is easier to do in Triceps with templates. And the whole premise is not exactly great either. But it provides an opportunity to show more of the streaming functions, in a set-up that is closer to the SQL-based systems.

The background is as follows: There happen to be multiple ways to identify the securities (stock shares and such). RIC is the identifier used by Reuters (and quite often by the other data suppliers too), consisting of the ticker symbol on an exchange, a dot, and the coded name of the exchange (such as L for the London stock exchange or N for the New York stock exchange). ISIN is the international standard alphanumeric identifier. A security (and some of its creative equivalents) might happen to be listed on multiple exchanges, each listing having its own RIC. And if you wonder, the ticker names are allocated separately by each exchange and may differ. But all of these RICs refer to the same security, thus translating to the same ISIN (there might be multiple ISINs too but that's another story). A large financial company would want to track a security all around the world. To aggregate the data on the security worldwide, it has to identify it by ISIN, but the data feed might be coming in as RIC only. The translation of RIC to ISIN is then done by the table during processing. The RIC is not thrown away either, it shows the detail of what and where had happened. But ISIN is added for the aggregation on it.

The data might be coming from multiple feeds, and there are multiple kinds of data: trades, quotes, lending quotes and so on, each with its own schema and its own aggregations. However the step of RIC-to-ISIN translation is the same for all of them, is done by the same table, and can be done in one place. Of course, multithreading can add more twists here but for now we're talking about a simple single-threaded example.

An extra complexity is that in the real world the translation table might be incomplete. However some feeds might provide both RICs and ISINs in their records, so the pairs that aren't in the reference table yet can be inserted there and used for the following translations. This is actually not such a great idea, because it means that there might be previous records that went through before the translation became available. A much better way would be to do the translation as a join, where the update to a reference table would update any previous records as well. But then there would not be much use for a streaming function in it. As I've said before, it's a rather dumb example.

The streaming function will work like this: It will get an argument pair of (RIC, ISIN) from an incoming record. Either component of this pair might be empty. Since the rest of the record is wildly different for different feeds, the rest of the record is left off at this point, and the uniform argument of (RIC, ISIN) is given to the function. The function will consult its table, see if it can add more information from there, or add more information from the argument into the table, and return to the caller the hopefully enriched pair (RIC, ISIN), with an empty ISIN field replaced by the right value.

The function is defined like this:

my $rtIsin = Triceps::RowType->new(
  ric => "string",
  isin => "string",
);

my $ttIsin = Triceps::TableType->new($rtIsin)
  ->addSubIndex("byRic", Triceps::IndexType->newHashed(key => [ "ric" ])
);
$ttIsin->initialize();

my $tIsin = $unit->makeTable($ttIsin, "tIsin");

# the results will come from here
my $fretLookupIsin = Triceps::FnReturn->new(
  name => "fretLookupIsin",
  unit => $unit,
  labels => [
    result => $rtIsin,
  ],
);

# The function argument: the input data will be sent here.
my $lbLookupIsin = $unit->makeLabel($rtIsin, "lbLookupIsin", undef, sub {
  my $row = $_[1]->getRow();
  if ($row->get("ric")) {
    my $argrh = $tIsin->makeRowHandle($row);
    my $rh = $tIsin->find($argrh);
    if ($rh->isNull()) {
      if ($row->get("isin")) {
        $tIsin->insert($argrh);
      }
    } else {
      $row = $rh->getRow();
    }
  }
  $unit->call($fretLookupIsin->getLabel("result")->makeRowop("OP_INSERT", $row));
});

The $fretLookupIsin is the function result, $lbLookupIsin is the function input. In this example the result label in FnReturn is defined differently than in the previous ones: not by a source label but by a row type. This label doesn't get chained to anything, instead the procedural code in the function finds it as $fretLookupIsin->getLabel("result") and calls it directly.

Then the ISIN translation code for some trades feed would look as follows (remember, supposedly there would be many feeds, each one with its own schema, but for the example I show only one):

my $rtTrade = Triceps::RowType->new(
  ric => "string",
  isin => "string",
  size => "float64",
  price => "float64",
);

my $lbTradeEnriched = $unit->makeDummyLabel($rtTrade, "lbTradeEnriched");
my $lbTrade = $unit->makeLabel($rtTrade, "lbTrade", undef, sub {
  my $rowop = $_[1];
  my $row = $rowop->getRow();
  Triceps::FnBinding::call(
    name => "callTradeLookupIsin",
    on => $fretLookupIsin,
    unit => $unit,
    rowop => $lbLookupIsin->makeRowopHash("OP_INSERT",
      ric => $row->get("ric"),
      isin => $row->get("isin"),
    ),
    labels => [
      result => sub { # a label will be created from this sub
        $unit->call($lbTradeEnriched->makeRowop($rowop->getOpcode(),
          $row->copymod(
            isin => $_[1]->getRow()->get("isin")
          )
        ));
      },
    ],
  );
});

The label $lbTrade receives the incoming trades, calls the streaming function to enrich them with the ISIN data, and forwards the enriched data to the label $lbTradeEnriched. The function call is done differently in this example. Rather than create a FnBinding object and then use it with a scoped AutoFnBind, it uses the convenience function FnBinding::call() that wraps all that logic. It's simpler to use, without all these extra objects, but the price is the efficiency: it ends up creating a new FnBinding object for every call. That's where a compiler would be very useful: it could take a call like this, translate it to the internal objects once, and then keep reusing them.

The FnBinding::call() option name gives a name that is used for the error messages and also to produce the names of the temporary objects it creates. The option on tells which streaming function is being called (by specifying its FnReturn). The option rowop gives the arguments of the streaming function. There are multiple ways to do that: the option rowop for a single rowop, rowops for an array of rowops, tray for a tray, and code for a procedural code snippet that would send the inputs to the streaming function. And labels as usual connects the results of the function, either to the existing labels, or by creating labels automatically from the snippets of code.
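As an illustration of the other argument-passing options, here is a sketch of how the same call inside $lbTrade could send its input through the code option instead of rowop (the surrounding $row, $rowop and the other variables are assumed to be the same as in the $lbTrade handler above):

```perl
# A sketch: the same call as in $lbTrade, but with the "code"
# option, which runs a procedural snippet to send the arguments
# into the streaming function instead of a pre-built rowop.
Triceps::FnBinding::call(
  name => "callTradeLookupIsin",
  on => $fretLookupIsin,
  unit => $unit,
  code => sub {
    # send the (RIC, ISIN) pair as the function argument
    $unit->makeHashCall($lbLookupIsin, "OP_INSERT",
      ric => $row->get("ric"),
      isin => $row->get("isin"),
    );
  },
  labels => [
    result => sub {
      $unit->call($lbTradeEnriched->makeRowop($rowop->getOpcode(),
        $row->copymod(isin => $_[1]->getRow()->get("isin"))));
    },
  ],
);
```

The code option becomes more interesting when the arguments are multiple rowops computed on the fly; for a single pre-computed rowop the rowop option is the simpler choice.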

The result handling in this example demonstrates the technique that I call the implicit join: the function gets a portion of data from an original row, does some transformation and returns the data back. This data is then joined with the original row. The code knows what this original row was: it gets remembered in the variable $row. The semantics of the call guarantees that nothing else happens during the function call, and that $row is still the current row. Then the function result gets joined with $row, and the produced data is sent further on its way. The variable $row could be either a global one, or, as shown here, a scoped variable that gets embedded into a closure.

The rest of the example, the dispatcher part, is:

# print what is going on
my $lbPrintIsin = makePrintLabel("printIsin", $tIsin->getOutputLabel());
my $lbPrintTrade = makePrintLabel("printTrade", $lbTradeEnriched);

# the main loop
my %dispatch = (
  isin => $tIsin->getInputLabel(),
  trade => $lbTrade,
);

while(<STDIN>) {
  chomp;
  my @data = split(/,/); # starts with a command, then string opcode
  my $type = shift @data;
  my $lb = $dispatch{$type};
  my $rowop = $lb->makeRowopArray(@data);
  $unit->call($rowop);
  $unit->drainFrame(); # just in case, for completeness
}

And an example of running, with the input lines shown in bold:

isin,OP_INSERT,ABC.L,US0000012345
tIsin.out OP_INSERT ric="ABC.L" isin="US0000012345"
isin,OP_INSERT,ABC.N,US0000012345
tIsin.out OP_INSERT ric="ABC.N" isin="US0000012345"
isin,OP_INSERT,DEF.N,US0000054321
tIsin.out OP_INSERT ric="DEF.N" isin="US0000054321"
trade,OP_INSERT,ABC.L,,100,10.5
lbTradeEnriched OP_INSERT ric="ABC.L" isin="US0000012345" size="100"
    price="10.5"
trade,OP_DELETE,ABC.N,,200,10.5
lbTradeEnriched OP_DELETE ric="ABC.N" isin="US0000012345" size="200"
    price="10.5"
trade,OP_INSERT,GHI.N,,300,10.5
lbTradeEnriched OP_INSERT ric="GHI.N" isin="" size="300" price="10.5"
trade,OP_INSERT,,XX0000012345,400,10.5
lbTradeEnriched OP_INSERT ric="" isin="XX0000012345" size="400"
    price="10.5"
trade,OP_INSERT,GHI.N,XX0000012345,500,10.5
tIsin.out OP_INSERT ric="GHI.N" isin="XX0000012345"
lbTradeEnriched OP_INSERT ric="GHI.N" isin="XX0000012345" size="500"
    price="10.5"
trade,OP_INSERT,GHI.N,,600,10.5
lbTradeEnriched OP_INSERT ric="GHI.N" isin="XX0000012345" size="600"
    price="10.5"

The table gets pre-populated with a few translations, and the first few trades use them. Then goes the example of a non-existing translation, which gets eventually added from the incoming data (see that the trade with (GHI.N, XX0000012345) both updates the ISIN table and sends through the trade record), and the following trades can then use this newly added translation but obviously the older ones do not get updated.

15.5. Streaming functions and loops

The streaming functions can be used to replace the topological loops (where the connections between the labels go in circles) with the procedural ones. Just make the body of the loop into a streaming function and connect its output with its own input (and of course also to the loop results). Then call this function in a procedural while-loop until the data stop circulating.

The way the streaming functions have been described so far, there is a catch, even two of them: First, with such a connection, the output of the streaming function would immediately circulate to its input, and would try to keep circulating until the loop is done, with no need for a while-loop. Second, as soon as it attempts to circulate, the scheduler will detect a recursive call and die (unless you change the recursion settings, however this is not a good reason to change them).

But there is also a solution that has not been described yet: an FnBinding can collect the incoming rowops in a tray instead of immediately forwarding them. This tray can be called later, after the original function call completes. This way the iteration has its data collected, the function completes, and then the next iteration of the while-loop starts, sending the data from the previous iteration. When there is nothing to send any more, the loop completes.

Using this logic, let's rewrite the Fibonacci example with the streaming function loops. Its original version and description of the logic can be found in Section 7.7: “Topological loops”.

The new version is:

my $uFib = Triceps::Unit->new("uFib");

###
# A streaming function that computes one step of a
# Fibonacci number, will be called repeatedly.

# Type of its input and output.
my $rtFib = Triceps::RowType->new(
  iter => "int32", # number of iterations left to do
  cur => "int64", # current number
  prev => "int64", # previous number
);

# Input:
#   $lbFibCompute: request to do a step. iter will be decremented,
#     cur moved to prev, new value of cur computed.
# Output (by FnReturn labels):
#   "next": data to send to the next step, if the iteration
#     is not finished yet (iter in the produced row is >0).
#   "result": the result data if the iteration is finished
#     (iter in the produced row is 0).
# The opcode is preserved through the computation.

my $frFib = Triceps::FnReturn->new(
  name => "Fib",
  unit => $uFib,
  labels => [
    next => $rtFib,
    result => $rtFib,
  ],
);

my $lbFibCompute = $uFib->makeLabel($rtFib, "FibCompute", undef, sub {
  my $row = $_[1]->getRow();
  my $prev = $row->get("cur");
  my $cur = $prev + $row->get("prev");
  my $iter = $row->get("iter") - 1;
  $uFib->makeHashCall($frFib->getLabel($iter > 0? "next" : "result"), $_[1]->getOpcode(),
    iter => $iter,
    cur => $cur,
    prev => $prev,
  );
});

# End of streaming function
###

my $lbPrint = $uFib->makeLabel($rtFib, "Print", undef, sub {
  print($_[1]->getRow()->get("cur"));
});

# binding to run the Triceps steps in a loop
my $fbFibLoop = Triceps::FnBinding->new(
  name => "FibLoop",
  on => $frFib,
  withTray => 1,
  labels => [
    next => $lbFibCompute,
    result => $lbPrint,
  ],
);

my $lbMain = $uFib->makeLabel($rtFib, "Main", undef, sub {
  my $row = $_[1]->getRow();
  {
    my $ab = Triceps::AutoFnBind->new($frFib, $fbFibLoop);

    # send the request into the loop
    $uFib->makeHashCall($lbFibCompute, $_[1]->getOpcode(),
      iter => $row->get("iter"),
      cur => 0, # the "0-th" number
      prev => 1,
    );

    # now keep cycling the loop until it's all done
    while (!$fbFibLoop->trayEmpty()) {
      $fbFibLoop->callTray();
    }
  }
  print(" is Fibonacci number ", $row->get("iter"), "\n");
});

while(<STDIN>) {
  chomp;
  my @data = split(/,/);
  $uFib->makeArrayCall($lbMain, @data);
  $uFib->drainFrame(); # just in case, for completeness
}

It produces the same output as before (as usual, the lines in bold are the input lines):

OP_INSERT,1
1 is Fibonacci number 1
OP_DELETE,2
1 is Fibonacci number 2
OP_INSERT,5
5 is Fibonacci number 5
OP_INSERT,6
8 is Fibonacci number 6

The option withTray of FnBinding is what makes it collect the rowops in a tray. The rowops are not the original incoming ones but already translated to call the FnBinding's output labels. The method callTray() swaps the tray with a fresh one and then calls the rowops collected in the original tray. There are more methods for the tray control: swapTray() swaps the tray with a fresh one and returns the original one, which can then be read or called; traySize() returns not just the emptiness condition but the whole size of the tray.
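For example, the while-loop in $lbMain could be driven with the explicit tray methods instead of callTray(). This is only a sketch, assuming that the returned tray can be called through Unit::call() as the description above implies:

```perl
# An equivalent of the callTray() loop: take the collected tray
# out with swapTray(), then call its rowops, which refills the
# fresh tray for the next iteration of the loop.
while ($fbFibLoop->traySize() > 0) {
  my $tray = $fbFibLoop->swapTray(); # take the collected rowops
  $uFib->call($tray); # run them, possibly collecting more
}
```

The explicit form would let the loop examine or modify the collected rowops between the iterations, which callTray() doesn't allow.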

The whole loop runs in one binding scope, because it doesn't change with the iterations. The first row primes the loop, and then it continues while there is anything to circulate.

This example sent both the next iteration rows and the result rows through the binding. But for the result rows it doesn't have to. They can be sent directly out of the loop:

my $lbFibCompute = $uFib->makeLabel($rtFib, "FibCompute", undef, sub {
  my $row = $_[1]->getRow();
  my $prev = $row->get("cur");
  my $cur = $prev + $row->get("prev");
  my $iter = $row->get("iter") - 1;
  $uFib->makeHashCall($iter > 0? $frFib->getLabel("next") : $lbPrint, $_[1]->getOpcode(),
    iter => $iter,
    cur => $cur,
    prev => $prev,
  );
});

The printed result is exactly the same as in the previous example.

15.6. Streaming functions and pipelines

The streaming functions can be arranged into a pipeline by binding the result of one function to the input of another one. Fundamentally, the pipelines in the world of streaming functions are analogs of the nested calls with the common functions. For example, a pipeline (written for shortness in the Unix way)

a | b | c

is an analog of the common function calls

c(b(a()))

Of course, if the pipeline is fixed, it can as well be connected directly with the label chaining and then stay like this. A more interesting case is when the pipeline needs to be reconfigured dynamically based on the user requests. An interesting example of pipeline usage comes from data security. A client may connect to a CEP model element in a clear-text or encrypted way. In the encrypted way the data received from the client needs to be decrypted, then processed, and then the results encrypted before sending them back:

receive | decrypt | process | encrypt | send

In the clear-text mode the pipeline becomes shorter:

receive | process | send

Let's make an example around this idea: To highlight the flexibility, the configuration will be selectable for each input line. If the input starts with a +, it will be considered encrypted, otherwise clear-text. Since the actual security is not important for the example, it will be simulated by encoding the text in hex (each byte of data becomes two hexadecimal digits). The real encryption, such as SSL, would of course require the key negotiation, but this little example just skips over this part, since it has no key. First, define the input and output (receive and send) endpoints:

# All the input and output gets converted through an intermediate
# format of a row with one string field.
my $rtString = Triceps::RowType->new(
  s => "string"
);

# All the input gets sent here.
my $lbReceive = $unit->makeDummyLabel($rtString, "lbReceive");
my $retReceive = Triceps::FnReturn->new(
  name => "retReceive",
  labels => [
    data => $lbReceive,
  ],
);

# The binding that actually prints the output.
my $bindSend = Triceps::FnBinding->new(
  name => "bindSend",
  on => $retReceive, # any matching return will do
  unit => $unit,
  labels => [
    data => sub {
      print($_[1]->getRow()->get("s"), "\n");
    },
  ],
);

The same row type $rtString will be used for the whole pipeline, sending through the arbitrary strings of text. The binding $bindSend is defined on $retReceive, so they can actually be short-circuited together. But they don't have to: $bindSend can be bound to any matching return. A matching return is defined as having the same number of labels in it, with matching row types. The names of the labels don't matter but their order does. It's a bit tricky: when a binding is created, the labels in it get connected by name to the return on which it's defined. But at this point each of them gets assigned a number, in the order the labels went in that original return. After that only this number matters: if this binding gets connected to another matching return, it will get the data from the return's label with the same number, not the same name.
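To demonstrate the point about matching returns, $bindSend could just as well be pushed onto another return with one string label. A sketch, assuming that a matching $retOutput has been defined as shown further below:

```perl
# $bindSend was created with on => $retReceive but matches any
# return with one label of the row type $rtString, so it can be
# pushed onto $retOutput to print whatever arrives at $lbOutput.
$retOutput->push($bindSend);
$unit->makeArrayCall($lbOutput, "OP_INSERT", "hello");
$retOutput->pop($bindSend);
```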

Next step, define the endpoints for the processing: the dispatcher and the output label. All of them use the same row type and matching returns. The actual processing will eventually be hard-connected between these endpoints.

my %dispatch; # the dispatch table will be set here

# The binding that dispatches the input data
my $bindDispatch = Triceps::FnBinding->new(
  name => "bindDispatch",
  on => $retReceive,
  unit => $unit,
  labels => [
    data => sub {
      my @data = split(/,/, $_[1]->getRow()->get("s")); # starts with a command, then string opcode
      my $type = shift @data;
      my $lb = $dispatch{$type};
      my $rowop = $lb->makeRowopArray(@data);
      $unit->call($rowop);
    },
  ],
);

# All the output gets converted to rtString and sent here.
my $lbOutput = $unit->makeDummyLabel($rtString, "lbOutput");
my $retOutput = Triceps::FnReturn->new(
  name => "retOutput",
  labels => [
    data => $lbOutput,
  ],
);

And now the filters for encryption and decryption. Each of them has a binding for its input and a return for its output. The actual pseudo-encryption transformation is done with Perl functions unpack() and pack().

# The encryption pipeline element.
my $retEncrypt = Triceps::FnReturn->new(
  name => "retEncrypt",
  unit => $unit,
  labels => [
    data => $rtString,
  ],
);
my $lbEncrypt = $retEncrypt->getLabel("data");
my $bindEncrypt = Triceps::FnBinding->new(
  name => "bindEncrypt",
  on => $retReceive,
  unit => $unit,
  labels => [
    data => sub {
      my $s = $_[1]->getRow()->get("s");
      $unit->makeArrayCall($lbEncrypt, "OP_INSERT", unpack("H*", $s));
    },
  ],
);

# The decryption pipeline element.
my $retDecrypt = Triceps::FnReturn->new(
  name => "retDecrypt",
  unit => $unit,
  labels => [
    data => $rtString,
  ],
);
my $lbDecrypt = $retDecrypt->getLabel("data");
my $bindDecrypt = Triceps::FnBinding->new(
  name => "bindDecrypt",
  on => $retReceive,
  unit => $unit,
  labels => [
    data => sub {
      my $s = $_[1]->getRow()->get("s");
      $unit->makeArrayCall($lbDecrypt, "OP_INSERT", pack("H*", $s));
    },
  ],
);

Then goes the body of the model. It defines the actual row type for the data that gets parsed from the strings, and the business logic (which is pretty simple: incrementing an integer field). The dispatch table connects the dispatcher with the business logic, and the conversion from the data rows to the plain text rows is done with the template makePipePrintLabel(). This template is very similar to the template makePrintLabel() that was shown in Section 10.3: "Simple wrapper templates".

sub makePipePrintLabel($$$) # ($print_label_name, $parent_label, $out_label)
{
  my $name = shift;
  my $lbParent = shift;
  my $lbOutput = shift;
  my $unit = $lbOutput->getUnit();
  my $lb = $lbParent->getUnit()->makeLabel($lbParent->getType(), $name,
    undef, sub { # (label, rowop)
      $unit->makeArrayCall(
        $lbOutput, "OP_INSERT", $_[1]->printP());
    });
  $lbParent->chain($lb);
  return $lb;
}

# The body of the model: pass through the name, increase the count.
my $rtData = Triceps::RowType->new(
  name => "string",
  count => "int32",
);

my $lbIncResult = $unit->makeDummyLabel($rtData, "result");
my $lbInc = $unit->makeLabel($rtData, "inc", undef, sub {
  my $row = $_[1]->getRow();
  $unit->makeHashCall($lbIncResult, $_[1]->getOpcode(),
    name  => $row->get("name"),
    count => $row->get("count") + 1,
  );
});
makePipePrintLabel("printResult", $lbIncResult, $lbOutput);

%dispatch = (
  inc => $lbInc,
);

Finally, the main loop. It will check the input lines for the leading + and construct one or the other pipeline for processing. Of course, the pipelines don't have to be constructed in the main loop. They could have been constructed in the handler of $lbReceive just as well (then it would need a separate label to send its result to, and to include into $retReceive).

while(<STDIN>) {
  my $ab;
  chomp;
  if (/^\+/) {
    $ab = Triceps::AutoFnBind->new(
      $retReceive => $bindDecrypt,
      $retDecrypt => $bindDispatch,
      $retOutput => $bindEncrypt,
      $retEncrypt => $bindSend,
    );
    $_ = substr($_, 1);
  } else {
    $ab = Triceps::AutoFnBind->new(
      $retReceive => $bindDispatch,
      $retOutput => $bindSend,
    );
  };
  $unit->makeArrayCall($lbReceive, "OP_INSERT", $_);
  $unit->drainFrame();
}

The constructor of AutoFnBind can accept multiple return-binding pairs. It will bind them all, and unbind them back on its object destruction. It's the same thing as creating multiple AutoFnBind objects, one for each pair, only more efficient. And here is an example of a run (as usual the input lines are in bold, and the long lines get wrapped):

inc,OP_INSERT,abc,1
result OP_INSERT name="abc" count="2"
inc,OP_DELETE,def,100
result OP_DELETE name="def" count="101"
+696e632c4f505f494e534552542c6162632c32
726573756c74204f505f494e53455254206e616d653d226162632220636f756e743d2
  2332220
+696e632c4f505f44454c4554452c6465662c313031
726573756c74204f505f44454c455445206e616d653d226465662220636f756e743d2
  23130322220

What is in the encrypted data? The input lines have been produced by running a Perl expression manually:

$ perl -e 'print((unpack "H*", "inc,OP_INSERT,abc,2"), "\n");'
696e632c4f505f494e534552542c6162632c32
$ perl -e 'print((unpack "H*", "inc,OP_DELETE,def,101"), "\n");'
696e632c4f505f44454c4554452c6465662c313031

They and their results can be decoded by running another Perl expression:

$ perl -e 'print((pack "H*", "726573756c74204f505f494e53455254206e616
  d653d226162632220636f756e743d22332220"), "\n");'
result OP_INSERT name="abc" count="3"
$ perl -e 'print((pack "H*", "726573756c74204f505f44454c455445206e616
  d653d226465662220636f756e743d223130322220"), "\n");'
result OP_DELETE name="def" count="102"

15.7. Streaming functions and tables

Sometimes you might want to collect a table's reaction to an operation on it and process it manually afterwards. Triceps 1.0 had a special feature called the copy tray to support that, but starting with version 2.0 the streaming functions solve this problem much better, replacing the copy trays.

If you connect the table's output to a FnReturn and then push a binding with a tray onto it, the table's output will be collected on that tray. There is even a Table method that creates this FnReturn:

$fret = $table->fnReturn();

The return contains the labels pre, out, dump (more on that one below), and the named labels for all aggregators. The FnReturn object is created on the first call of this method and is kept in the table; all the following calls return the same object. This has an interesting consequence for the pre label: normally the rowop for the pre label doesn't get created at all if nothing is chained from that label. But when the FnReturn gets created, one of its labels gets chained from the pre label. Which means that once you call $table->fnReturn() for the first time, you will see the table's pre label called in all the traces. It's not a huge extra overhead, but still something to keep in mind, so don't be surprised when calling fnReturn() changes all your traces.

The following code demonstrates the use of an FnReturn to collect the changes done to a table on insert:

my $fret1 = $t1->fnReturn();
my $fbind1 = Triceps::FnBinding->new(
    unit => $unit,
    name => "fbind1",
    on => $fret1,
    withTray => 1,
    labels => [
        out => sub { }, # another way to make a dummy
    ],
);

$fret1->push($fbind1);
$t1->insert($row1);
$fret1->pop($fbind1);

# $tray contains the rowops produced by the update
my $tray = $fbind1->swapTray(); # get the updates on an insert
my @rowops = $tray->toArray();

And then you could, for example, check whether any rowop has the DELETE opcode, which would mean that an old row was displaced by this insert. Of course, this is not the most efficient way; placing the check into the label handler would be a better approach. And you don't even have to collect the rowops in a tray, you can just as well compute the result on the fly:

my $seenDelete;

my $fret1 = $t1->fnReturn();
my $fbind1 = Triceps::FnBinding->new(
    unit => $unit,
    name => "fbind1",
    on => $fret1,
    labels => [
        out => sub {
            $seenDelete = 1 if ($_[1]->isDelete());
        },
    ],
);

$fret1->push($fbind1);
$seenDelete = 0;
$t1->insert($row1);
$fret1->pop($fbind1);

if ($seenDelete) {
  # there was a displacement
}

The variable $seenDelete is remembered in the closure function that handles the out label and sets it accordingly.

In both examples the binding doesn't have to be created from scratch each time. Creating it once and then reusing as needed would be more efficient.
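For instance, the first example could be refactored to reuse its binding through a small helper (the function name insertCollect is made up for this sketch; it relies on the $t1, $fret1 and $fbind1 defined above):

sub insertCollect # ($row) - insert into $t1, return the tray of changes
{
  my $row = shift;
  $fret1->push($fbind1);
  $t1->insert($row);
  $fret1->pop($fbind1);
  return $fbind1->swapTray(); # leaves a fresh tray for the next call
}

my $tray = insertCollect($row1);

The binding and its return get created only once, and each call pays only for the push, the pop and the tray swap.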

And of course the use of an FnReturn doesn't preclude you from connecting the table outputs as usual.

Another feature where the tables and the streaming functions intersect is the table dumping. It allows iterating over a table in a functional manner.

The label dump is present in the table and its FnReturn. Whenever the method Table::dumpAll() is called, it sends the whole contents of the table to that label. Then you can set a binding on the table's FnReturn, call dumpAll(), and the binding will iterate through the whole table's contents.

If you want to get the dump label explicitly, you can do it with

my $dlab = $table->getDumpLabel();

Normally the only reason to do that would be to add it to another FnReturn (besides the table's FnReturn). Chaining anything else directly to this label would not make much sense, because the dump of the table can be called from many places, and the directly chained label will receive data every time the dump is called.
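Such a combined return can be sketched as follows (the name "combined" is made up, and the sketch assumes the usual Table method getOutputLabel() for the table's output):

# A return that carries both the dump label and the table's normal
# output, so one binding can handle both the iteration and the updates.
my $fretCombined = Triceps::FnReturn->new(
  name => "combined",
  labels => [
    dump => $table->getDumpLabel(),
    out => $table->getOutputLabel(),
  ],
);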

The grand plan is also to add dumping by a condition that selects a sub-index, but it's not implemented yet. You can select an index for an alternative ordering, but all the rows get dumped in any case.

The method dumpAllIdx() is the one that sends the rows in the order of a chosen index, rather than the default first leaf index:

$table->dumpAll();
$table->dumpAllIdx($indexType);

As usual, the index type must belong to the exact type of this table (the second argument here is the explicit opcode, described below). For example:

$table->dumpAllIdx($table->getType()->findIndexPath("cb"), "OP_NOP");

The typical usage looks like this:

Triceps::FnBinding::call(
  name => "iterate",
  on => $table->fnReturn(),
  unit => $unit,
  labels => [
    dump => sub { ... },
  ],
  code => sub {
    $table->dumpAll();
  },
);

It's less efficient than the normal iteration but sometimes comes handy.

Normally the rowops are sent with the opcode OP_INSERT. But the opcode can also be specified explicitly:

$table->dumpAll($opcode);
$table->dumpAllIdx($indexType, $opcode);

And some more interesting examples will be forthcoming in Section 15.11: “Streaming functions and unit boundaries” and Section 17.6: “Internals of a TQL join” .

15.8. Streaming functions and template results

In the same way as the FnReturns can be used to get back the direct results of the operations on the tables, they can also be used on the templates in general. Indeed, it's a good idea to have a method that would create an FnReturn in all the templates. So I went ahead and added it to the LookupJoin, JoinTwo and Collapse.

For the joins, the resulting FnReturn has one label out. It's created similarly to the table's:

my $fret = $join->fnReturn();

And then it can be used as usual. The implementation of this method is fairly simple:

sub fnReturn # (self)
{
  my $self = shift;
  if (!defined $self->{fret}) {
    $self->{fret} = Triceps::FnReturn->new(
      name => $self->{name} . ".fret",
      labels => [
        out => $self->{outputLabel},
      ],
    );
  }
  return $self->{fret};
}

All this makes the method lookup() of LookupJoin essentially redundant: pretty much all the same can now be done with the streaming function API, and even better, because it provides the opcodes on the rowops, can handle the full processing, and calls the rowops one by one without necessarily creating an array. But lookup() may still have some more convenient uses, so I didn't remove it yet.
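A lookup through the streaming function API can be sketched like this (the sketch assumes a LookupJoin object $join with the usual getInputLabel() method, and a query row $queryRow; the collecting of the rowops into an array is optional, they could also be processed right in the closure):

my @results;
Triceps::FnBinding::call(
  name => "lookupCall",
  on => $join->fnReturn(),
  unit => $unit,
  labels => [
    out => sub {
      push @results, $_[1]; # collect the result rowops
    },
  ],
  rowop => $join->getInputLabel()->makeRowop("OP_INSERT", $queryRow),
);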

For Collapse the interface is a little more complicated: the FnReturn contains a label for each data set, named the same as the data set. The order of labels follows the order of the data set definitions (though right now it's kind of moot, because only one data set is supported). The implementation is:

sub fnReturn # (self)
{
  my $self = shift;
  if (!defined $self->{fret}) {
    my @labels;
    for my $n (@{$self->{dsetnames}}) {
      push @labels, $n, $self->{datasets}{$n}{lbOut};
    }
    $self->{fret} = Triceps::FnReturn->new(
      name => $self->{name} . ".fret",
      labels => \@labels,
    );
  }
  return $self->{fret};
}

Use these examples to write the fnReturn() in your templates.

15.9. Streaming functions and recursion

Let's look again at the pipeline example. Suppose we want to do the encryption twice (you know, maybe we have a secure channel to a semi-trusted intermediary who can read the envelopes and forward the encrypted messages, which he can't read, to the final destination). The pipeline becomes

decrypt | decrypt | process | encrypt | encrypt

Or if you want to think about it in a more function-like notation, rather than a pipeline, the logic can also be expressed as:

encrypt(encrypt(process(decrypt(decrypt(data)))))

However this would not work directly: a decrypt function has only one output, and it cannot have two bindings at the same time; it would not know which one to use at any particular moment.

Instead you can make decrypt into a template, instantiate it twice, and connect into a pipeline. It's very much like what the Unix shell does: it instantiates a new process for each part of its pipeline.
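A minimal sketch of such a template, reusing $rtString and $retReceive from the example above (the function name makeDecrypt is made up):

# Create one decryption pipeline element: a binding for its input
# and a return for its output.
sub makeDecrypt # ($unit, $name) - returns ($binding, $return)
{
  my ($unit, $name) = @_;
  my $ret = Triceps::FnReturn->new(
    name => "$name.ret",
    unit => $unit,
    labels => [
      data => $rtString,
    ],
  );
  my $lbOut = $ret->getLabel("data");
  my $bind = Triceps::FnBinding->new(
    name => "$name.bind",
    on => $retReceive, # any matching return will do
    unit => $unit,
    labels => [
      data => sub {
        my $s = $_[1]->getRow()->get("s");
        $unit->makeArrayCall($lbOut, "OP_INSERT", pack("H*", $s));
      },
    ],
  );
  return ($bind, $ret);
}

my ($bindDecrypt1, $retDecrypt1) = makeDecrypt($unit, "decrypt1");
my ($bindDecrypt2, $retDecrypt2) = makeDecrypt($unit, "decrypt2");

The two instances then get chained through the AutoFnBind pairs as usual, $retDecrypt1 feeding $bindDecrypt2.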

But there is also another possibility: instead of assembling the whole pipeline in advance, do it in steps.

Start by adding this option in every binding:

withTray => 1,

This will make all the bindings collect the result on a tray instead of sending it on immediately. Then modify the main loop:

while(<STDIN>) {
  chomp;

  # receive
  my $abReceive = Triceps::AutoFnBind->new(
    $retReceive => $bindDecrypt,
  );
  $unit->makeArrayCall($lbReceive, "OP_INSERT", $_);

  # 1st decrypt
  my $abDecrypt1 = Triceps::AutoFnBind->new(
    $retDecrypt => $bindDecrypt,
  );
  $bindDecrypt->callTray();

  # 2nd decrypt
  my $abDecrypt2 = Triceps::AutoFnBind->new(
    $retDecrypt => $bindDispatch,
  );
  $bindDecrypt->callTray();

  # processing
  my $abProcess = Triceps::AutoFnBind->new(
    $retOutput => $bindEncrypt,
  );
  $bindDispatch->callTray();

  # 1st encrypt
  my $abEncrypt1 = Triceps::AutoFnBind->new(
    $retEncrypt => $bindEncrypt,
  );
  $bindEncrypt->callTray();

  # 2nd encrypt
  my $abEncrypt2 = Triceps::AutoFnBind->new(
    $retEncrypt => $bindSend,
  );
  $bindEncrypt->callTray();

  # send
  $bindSend->callTray();
}

Here I've dropped the encrypted-or-unencrypted choice to save space; the data is always encrypted twice. The drainFrame() call has been dropped because, with the way the function calls work here, there is no chance that it could be useful. The rest of the code stays the same.

The bindings have been split into stages. The next binding is set in each stage, and the data from the previous binding gets sent into it. The binding method callTray() replaces the tray in the binding with an empty one, and then calls all the rowops collected on the old tray (and if you wonder what happens to the old tray afterwards, it gets discarded). Because of this the first decryption stage with the binding

my $abDecrypt1 = Triceps::AutoFnBind->new(
  $retDecrypt => $bindDecrypt,
);

doesn't send the data circling forever. It just does one pass through the decryption and prepares for the second pass.

Every time AutoFnBind->new() runs, it doesn't replace the binding of the FnReturn but pushes a new binding onto the FnReturn's stack. Each FnReturn has its own stack of bindings (this way it's easier to manage than a single stack). When an AutoFnBind gets destroyed, it pops the binding from the return's stack. And yes, if you specify multiple bindings in one AutoFnBind, all of them get pushed on construction and popped on destruction. In this case all the auto-binds are in the same block, so they will all be destroyed at the end of the block in the opposite order. Which means that in effect the code is equivalent to the nested blocks. And the version with explicit nested blocks might be easier to think of:

while(<STDIN>) {
  chomp;

  # receive
  my $abReceive = Triceps::AutoFnBind->new(
    $retReceive => $bindDecrypt,
  );
  $unit->makeArrayCall($lbReceive, "OP_INSERT", $_);

  {
    # 1st decrypt
    my $abDecrypt1 = Triceps::AutoFnBind->new(
      $retDecrypt => $bindDecrypt,
    );
    $bindDecrypt->callTray();

    {
      # 2nd decrypt
      my $abDecrypt1 = Triceps::AutoFnBind->new(
        $retDecrypt => $bindDispatch,
      );
      $bindDecrypt->callTray();

      {
        # processing
        my $abProcess = Triceps::AutoFnBind->new(
          $retOutput => $bindEncrypt,
        );
        $bindDispatch->callTray();

        {
          # 1st encrypt
          my $abEncrypt1 = Triceps::AutoFnBind->new(
            $retEncrypt => $bindEncrypt,
          );
          $bindEncrypt->callTray();

          {
            # 2nd encrypt
            my $abEncrypt1 = Triceps::AutoFnBind->new(
              $retEncrypt => $bindSend,
            );
            $bindEncrypt->callTray();

            # send
            $bindSend->callTray();
          }
        }
      }
    }
  }
}

An interesting consequence of all this nesting, pushing and popping is that you can put the inner calls into the procedural loops if you wish. For example, if you want to process every input line thrice:

while(<STDIN>) {
  chomp;

  # receive
  my $abReceive = Triceps::AutoFnBind->new(
    $retReceive => $bindDecrypt,
  );

  for (my $i = 0; $i < 3; $i++) {
    $unit->makeArrayCall($lbReceive, "OP_INSERT", $_);

    {
      # 1st decrypt
      my $abDecrypt1 = Triceps::AutoFnBind->new(
        $retDecrypt => $bindDecrypt,
      );
      $bindDecrypt->callTray();

      {
        # 2nd decrypt
        my $abDecrypt1 = Triceps::AutoFnBind->new(
          $retDecrypt => $bindDispatch,
        );
        $bindDecrypt->callTray();

        {
          # processing
          my $abProcess = Triceps::AutoFnBind->new(
            $retOutput => $bindEncrypt,
          );
          $bindDispatch->callTray();

          {
            # 1st encrypt
            my $abEncrypt1 = Triceps::AutoFnBind->new(
              $retEncrypt => $bindEncrypt,
            );
            $bindEncrypt->callTray();

            {
              # 2nd encrypt
              my $abEncrypt1 = Triceps::AutoFnBind->new(
                $retEncrypt => $bindSend,
              );
              $bindEncrypt->callTray();

              # send
              $bindSend->callTray();
            }
          }
        }
      }
    }
  }
}

This code will run the whole pipeline three times for each input line, and print out three output lines. The following example of the output has both the input and the output lines wrapped, since they are hugely long:

363936653633326334663530356634393465353334353532353432633631363236
  3332633332
373236353733373536633734323034663530356634393465353334353532353432
  3036653631366436353364323236313632363332323230363336663735366537
  3433643232333332323230
373236353733373536633734323034663530356634393465353334353532353432
  3036653631366436353364323236313632363332323230363336663735366537
  3433643232333332323230
373236353733373536633734323034663530356634393465353334353532353432
  3036653631366436353364323236313632363332323230363336663735366537
  3433643232333332323230

If you wonder what the meaning of these lines is, they are the same as before. The input is:

inc,OP_INSERT,abc,2

And each line of output is:

result OP_INSERT name="abc" count="3"

I suppose it would be more entertaining if the processing weren't just incrementing a value in the input data but incrementing some static counter; then the three output lines would be different.

However this is not the only way to do the block nesting. The contents of the FnBinding's tray are not affected in any way by the binding being pushed or popped. The tray stays there throughout, until it's explicitly flushed by callTray(). So the blocks can also be formed in a more pipeline-like fashion (as opposed to the more function-call-like fashion shown before):

while(<STDIN>) {
  chomp;

  # receive
  {
    my $abReceive = Triceps::AutoFnBind->new(
      $retReceive => $bindDecrypt,
    );
    $unit->makeArrayCall($lbReceive, "OP_INSERT", $_);
  }

  # 1st decrypt
  {
    my $abDecrypt1 = Triceps::AutoFnBind->new(
      $retDecrypt => $bindDecrypt,
    );
    $bindDecrypt->callTray();
  }

  # 2nd decrypt
  {
    my $abDecrypt1 = Triceps::AutoFnBind->new(
      $retDecrypt => $bindDispatch,
    );
    $bindDecrypt->callTray();
  }

  # processing
  {
    my $abProcess = Triceps::AutoFnBind->new(
      $retOutput => $bindEncrypt,
    );
    $bindDispatch->callTray();
  }

  # 1st encrypt
  {
    my $abEncrypt1 = Triceps::AutoFnBind->new(
      $retEncrypt => $bindEncrypt,
    );
    $bindEncrypt->callTray();
  }

  # 2nd encrypt
  {
    my $abEncrypt1 = Triceps::AutoFnBind->new(
      $retEncrypt => $bindSend,
    );
    $bindEncrypt->callTray();
  }

  # send
  $bindSend->callTray();
}

After each stage, its binding is popped but the tray is carried through to the next stage.

Which way of blocking is better? I'd say they're pretty equivalent in functionality, and the choice depends on which style you prefer to express.

15.10. Streaming functions and more recursion

There are a great many slightly different ways to use recursion with the streaming functions. This section goes through them, with examples computing the Fibonacci numbers in all these ways. You can skip over this section if you're not particularly interested in the details of the recursive execution.

All the examples from this section (and most of the others from this chapter) are located in t/xFn.t. The first example uses the dumb recursive calls. It's a really dumb recursive way, with two recursive calls and thus the exponential execution time, just to show how they can be done. This simplest and most straightforward way goes as follows:

my $uFib = Triceps::Unit->new("uFib");
$uFib->setMaxRecursionDepth(100);

# Type the data going into the function
my $rtFibArg = Triceps::RowType->new(
  idx => "int32", # the index of Fibonacci number to generate
);

# Type of the function result
my $rtFibRes = Triceps::RowType->new(
  idx => "int32", # the index of Fibonacci number
  fib => "int64", # the generated Fibonacci number
);

###
# A streaming function that computes a Fibonacci number.

# Input:
#   $lbFibCompute: request to compute the number.
# Output (by FnReturn labels):
#   "result": the computed value.
# The opcode is preserved through the computation.

my $frFib = Triceps::FnReturn->new(
  name => "Fib",
  unit => $uFib,
  labels => [
    result => $rtFibRes,
  ],
);

my $lbFibResult = $frFib->getLabel("result");

my $lbFibCompute; # must be defined before assignment, for recursion
$lbFibCompute = $uFib->makeLabel($rtFibArg, "FibCompute", undef, sub {
  my $row = $_[1]->getRow();
  my $op = $_[1]->getOpcode();
  my $idx = $row->get("idx");
  my $res;

  if ($idx < 1) {
    $res = 0;
  } elsif($idx == 1) {
    $res = 1;
  } else {
    my ($prev1, $prev2);
    Triceps::FnBinding::call(
      name => "FibCompute.call1",
      on => $frFib,
      unit => $uFib,
      labels => [
        result => sub {
          $prev1 = $_[1]->getRow()->get("fib");
        }
      ],
      rowop => $lbFibCompute->makeRowopHash($op,
        idx => $idx - 1,
      ),
    );
    Triceps::FnBinding::call(
      name => "FibCompute.call2",
      on => $frFib,
      unit => $uFib,
      labels => [
        result => sub {
          $prev2 = $_[1]->getRow()->get("fib");
        }
      ],
      rowop => $lbFibCompute->makeRowopHash($op,
        idx => $idx - 2,
      ),
    );
    $res = $prev1 + $prev2;
  }
  $uFib->makeHashCall($frFib->getLabel("result"), $op,
    idx => $idx,
    fib => $res,
  );
});

# End of streaming function
###

# binding to call the Fibonacci function and print the result
my $fbFibCall = Triceps::FnBinding->new(
  name => "FibCall",
  on => $frFib,
  unit => $uFib,
  labels => [
    result => sub {
      my $row = $_[1]->getRow();
      print($row->get("fib"), " is Fibonacci number ", $row->get("idx"), "\n");
    }
  ],
);

while(<STDIN>) {
  chomp;
  my @data = split(/,/);
  $uFib->callBound(
    $lbFibCompute->makeRowopArray(@data),
    $frFib => $fbFibCall,
  );
  $uFib->drainFrame(); # just in case, for completeness
}

The calling sequence has become different from the looping version but the produced result is exactly the same. The streaming function now receives an argument row and produces a result row. The unit's recursion depth limit had to be adjusted to permit the recursion.

The recursive calls are done through the FnBinding::call(), with a closure for the result handling label. That closure can access the scope of its creator and place the result into its local variable. After both intermediate results are computed, the final result computation takes place and sends out the result row.

The FnBinding::call() creates a brand new binding for each call. So no matter how deep the recursion goes, each function call will get a separate binding that knows how to put the results into the correct place.

If the streaming function were to return more than one rowop, the closure would have to collect them all into a variable. The further processing cannot be done until the function completes. The bindings with trays cannot be used because FnBinding::call() disposes of the binding before it returns, so there is no chance to extract the tray from the binding. Perhaps this can be improved in the future. But there is another way to use trays that will be shown below.

And just to show yet another technique, the main loop is also different: instead of creating an AutoFnBind manually, it uses the Unit's method callBound(), which is more compact to write and slightly more efficient. It's a great method if you have all the rowops for the call available upfront. Its first argument is a rowop, a tray, or a reference to an array of rowops. The rest are the pairs of FnReturns and FnBindings. The bindings are pushed onto the FnReturns, then the rowops are called, then the bindings are popped. It replaces a whole block that would contain an AutoFnBind and the calls.
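For comparison, the callBound() in the main loop above is equivalent to this explicit block with an AutoFnBind:

{
  my $ab = Triceps::AutoFnBind->new(
    $frFib => $fbFibCall,
  );
  $uFib->call($lbFibCompute->makeRowopArray(@data));
}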

FnBinding::call() with closures is easy to use, but it creates a closure and an FnBinding object on each run. Can things be rearranged to reuse the same objects? With some effort, they can:

###
# A streaming function that computes a Fibonacci number.

# Input:
#   $lbFibCompute: request to compute the number.
# Output (by FnReturn labels):
#   "result": the computed value.
# The opcode is preserved through the computation.

my @stackFib; # stack of the function states
my $stateFib; # The current state

my $frFib = Triceps::FnReturn->new(
  name => "Fib",
  unit => $uFib,
  labels => [
    result => $rtFibRes,
  ],
  onPush => sub { push @stackFib, $stateFib; $stateFib = { }; },
  onPop => sub { $stateFib = pop @stackFib; },
);

my $lbFibResult = $frFib->getLabel("result");

# Declare the label & binding variables in advance, to define them sequentially.
my ($lbFibCompute, $fbFibPrev1, $fbFibPrev2);
$lbFibCompute = $uFib->makeLabel($rtFibArg, "FibCompute", undef, sub {
  my $row = $_[1]->getRow();
  my $op = $_[1]->getOpcode();
  my $idx = $row->get("idx");

  if ($idx <= 1) {
    $uFib->makeHashCall($frFib->getLabel("result"), $op,
      idx => $idx,
      fib => $idx < 1 ? 0 : 1,
    );
  } else {
    $stateFib->{op} = $op;
    $stateFib->{idx} = $idx;

    $frFib->push($fbFibPrev1);
    $uFib->makeHashCall($lbFibCompute, $op,
      idx => $idx - 1,
    );
  }
});
$fbFibPrev1 = Triceps::FnBinding->new(
  unit => $uFib,
  name => "FibPrev1",
  on => $frFib,
  labels => [
    result => sub {
      $frFib->pop($fbFibPrev1);

      $stateFib->{prev1} = $_[1]->getRow()->get("fib");

      # must prepare before pushing new state and with it new $stateFib
      my $rop = $lbFibCompute->makeRowopHash($stateFib->{op},
        idx => $stateFib->{idx} - 2,
      );

      $frFib->push($fbFibPrev2);
      $uFib->call($rop);
    },
  ],
);
$fbFibPrev2 = Triceps::FnBinding->new(
  unit => $uFib,
  on => $frFib,
  name => "FibPrev2",
  labels => [
    result => sub {
      $frFib->pop($fbFibPrev2);

      $stateFib->{prev2} = $_[1]->getRow()->get("fib");
      $uFib->makeHashCall($frFib->getLabel("result"), $stateFib->{op},
        idx => $stateFib->{idx},
        fib => $stateFib->{prev1} + $stateFib->{prev2},
      );
    },
  ],
);

# End of streaming function
###

The rest of the code stays the same, so I won't copy it here.

The computation still needs to keep the intermediate results of two recursive calls. With no closures, these results have to be kept in a global object $stateFib (which refers to a hash that keeps multiple values).

But it can't just be a single object! The recursive calls would overwrite it. So it has to be built into a stack of objects, a new one pushed for each call and popped after it. This pushing and popping can be tied to the pushing and popping of the bindings on an FnReturn. When the FnReturn is defined, the options onPush and onPop define the custom Perl code to execute, which is used here for the management of the state stack.

The whole logic is then split into the sections around the calls:

  • before the first call;
  • between the first and second call;
  • after the second call.

The first section goes as a normal label and the rest are done as bindings.

A tricky moment is that a simple scoped AutoFnBind can't be used here. The pushing of the binding happens in the calling label (such as FibCompute) but then the result is processed in another label (such as FibPrev1.result). The procedural control won't return to FibCompute until after FibPrev1.result has been completed. But FibPrev1.result needs the state popped before it can do its work! So the pushing and popping of the binding is done explicitly in two split steps: push() called in FibCompute and pop() called in FibPrev1.result. And of course then after FibPrev1.result saves the result, it pushes the next binding, which then gets popped in FibPrev2.result.

The popping can also be done without arguments, as simply pop(), but if it's given an argument, it will check that the binding being popped is the same as its argument. This is helpful for detecting the call stack corruptions.
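A small sketch of this check in action (assuming that a mismatched pop() confesses, which can be caught with eval as usual):

$frFib->push($fbFibPrev1);
eval { $frFib->pop($fbFibPrev2); }; # the wrong binding, dies
print "caught: $@" if $@;
$frFib->pop($fbFibPrev1); # the correct binding pops fine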

Now, can you guess what depth of the unit call stack is required to compute and print the 2nd Fibonacci number? It's 7. If the tracing is enabled, it will produce this trace:

unit 'uFib' before label 'FibCompute' op OP_DELETE {
unit 'uFib' before label 'FibCompute' op OP_DELETE {
unit 'uFib' before label 'Fib.result' op OP_DELETE {
unit 'uFib' before label 'FibPrev1.result' (chain 'Fib.result') op
    OP_DELETE {
unit 'uFib' before label 'FibCompute' op OP_DELETE {
unit 'uFib' before label 'Fib.result' op OP_DELETE {
unit 'uFib' before label 'FibPrev2.result' (chain 'Fib.result') op
    OP_DELETE {
unit 'uFib' before label 'Fib.result' op OP_DELETE {
unit 'uFib' before label 'FibCall.result' (chain 'Fib.result') op
    OP_DELETE {
unit 'uFib' after label 'FibCall.result' (chain 'Fib.result') op
    OP_DELETE }
unit 'uFib' after label 'Fib.result' op OP_DELETE }
unit 'uFib' after label 'FibPrev2.result' (chain 'Fib.result') op
    OP_DELETE }
unit 'uFib' after label 'Fib.result' op OP_DELETE }
unit 'uFib' after label 'FibCompute' op OP_DELETE }
unit 'uFib' after label 'FibPrev1.result' (chain 'Fib.result') op
    OP_DELETE }
unit 'uFib' after label 'Fib.result' op OP_DELETE }
unit 'uFib' after label 'FibCompute' op OP_DELETE }
unit 'uFib' after label 'FibCompute' op OP_DELETE }

9 labels get called in a sequence, all the way from the initial call to the result printing, and only then the whole sequence unrolls back. 3 of them are chained through the bindings, so they don't push frames onto the call stack, and there is always the outermost stack frame, giving the resulting stack depth of 9-3+1 = 7. This number grows fast: for the 6th number the label count becomes 75 and the frame count 51.

This happens because all the calls get unrolled into a single sequence, the very thing I've warned against in Section 7.7: “Topological loops”. The function return does unroll its FnReturn stack but doesn't unroll the unit call stack; it goes even deeper by calling the label that processes the result.

There are ways to improve it. The simplest one is to use the FnBinding with a tray, and call this tray after the function completely returns. This works out quite conveniently in two other ways too: first, AutoFnBind with its scoped approach can be used again; and second, it makes it possible to handle the situations where a function returns not just one row but multiple of them. That will be the next example:

###
# A streaming function that computes a Fibonacci number.

# Input:
#   $lbFibCompute: request to compute the number.
# Output (by FnReturn labels):
#   "result": the computed value.
# The opcode is preserved through the computation.

my @stackFib; # stack of the function states
my $stateFib; # The current state

my $frFib = Triceps::FnReturn->new(
  name => "Fib",
  unit => $uFib,
  labels => [
    result => $rtFibRes,
  ],
  onPush => sub { push @stackFib, $stateFib; $stateFib = { }; },
  onPop => sub { $stateFib = pop @stackFib; },
);

my $lbFibResult = $frFib->getLabel("result");

# Declare the label & binding variables in advance, to define them sequentially.
my ($lbFibCompute, $fbFibPrev1, $fbFibPrev2);
$lbFibCompute = $uFib->makeLabel($rtFibArg, "FibCompute", undef, sub {
  my $row = $_[1]->getRow();
  my $op = $_[1]->getOpcode();
  my $idx = $row->get("idx");

  if ($idx <= 1) {
    $uFib->makeHashCall($frFib->getLabel("result"), $op,
      idx => $idx,
      fib => $idx < 1 ? 0 : 1,
    );
  } else {
    $stateFib->{op} = $op;
    $stateFib->{idx} = $idx;

    {
      my $ab = Triceps::AutoFnBind->new(
        $frFib => $fbFibPrev1
      );
      $uFib->makeHashCall($lbFibCompute, $op,
        idx => $idx - 1,
      );
    }
    $fbFibPrev1->callTray();
  }
});
$fbFibPrev1 = Triceps::FnBinding->new(
  unit => $uFib,
  name => "FibPrev1",
  on => $frFib,
  withTray => 1,
  labels => [
    result => sub {
      $stateFib->{prev1} = $_[1]->getRow()->get("fib");

      # must prepare before pushing new state and with it new $stateFib
      my $rop = $lbFibCompute->makeRowopHash($stateFib->{op},
        idx => $stateFib->{idx} - 2,
      );

      {
        my $ab = Triceps::AutoFnBind->new(
          $frFib => $fbFibPrev2
        );
        $uFib->call($rop);
      }
      $fbFibPrev2->callTray();
    },
  ],
);
$fbFibPrev2 = Triceps::FnBinding->new(
  unit => $uFib,
  on => $frFib,
  name => "FibPrev2",
  withTray => 1,
  labels => [
    result => sub {
      $stateFib->{prev2} = $_[1]->getRow()->get("fib");
      $uFib->makeHashCall($frFib->getLabel("result"), $stateFib->{op},
        idx => $stateFib->{idx},
        fib => $stateFib->{prev1} + $stateFib->{prev2},
      );
    },
  ],
);

# End of streaming function
###

The stack depth is now greatly reduced because the unit stack pops the frames before pushing more of them. For the 2nd Fibonacci number the trace is:

unit 'uFib' before label 'FibCompute' op OP_DELETE {
unit 'uFib' before label 'FibCompute' op OP_DELETE {
unit 'uFib' before label 'Fib.result' op OP_DELETE {
unit 'uFib' after label 'Fib.result' op OP_DELETE }
unit 'uFib' after label 'FibCompute' op OP_DELETE }
unit 'uFib' before label 'FibPrev1.result' op OP_DELETE {
unit 'uFib' before label 'FibCompute' op OP_DELETE {
unit 'uFib' before label 'Fib.result' op OP_DELETE {
unit 'uFib' after label 'Fib.result' op OP_DELETE }
unit 'uFib' after label 'FibCompute' op OP_DELETE }
unit 'uFib' before label 'FibPrev2.result' op OP_DELETE {
unit 'uFib' before label 'Fib.result' op OP_DELETE {
unit 'uFib' before label 'FibCall.result' (chain 'Fib.result') op
    OP_DELETE {
unit 'uFib' after label 'FibCall.result' (chain 'Fib.result') op
    OP_DELETE }
unit 'uFib' after label 'Fib.result' op OP_DELETE }
unit 'uFib' after label 'FibPrev2.result' op OP_DELETE }
unit 'uFib' after label 'FibPrev1.result' op OP_DELETE }
unit 'uFib' after label 'FibCompute' op OP_DELETE }

The maximal call stack depth is reduced to 5. For the 6th number the maximal required stack depth is now only 9 instead of 51.
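The same kind of recurrence confirms the improvement. In the tray version the frames pop before new ones get pushed, so only the deepest branch matters: relative to FibCompute(n)'s own frame, the recursive call for n-1 runs 1 level down, the call for n-2 runs 2 levels down (inside FibPrev1.result), and the final result delivery happens 3 levels down (inside FibPrev2.result). Again a sketch of the arithmetic, not Triceps code:

```python
def extra(n):
    # the maximal depth above FibCompute(n)'s own frame
    if n <= 1:
        return 1  # just the call of Fib.result
    return max(1 + extra(n - 1), 2 + extra(n - 2), 3)

def depth(n):
    # plus FibCompute's own frame and the outermost frame
    return extra(n) + 2

print(depth(2))  # 5, matching the trace above
print(depth(6))  # 9 instead of 51
```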

And there is also a way to run the recursive calls without even increasing the recursion depth limit. It can be left at the default of 1, without any setMaxRecursionDepth() calls. The secret is to fork the argument rowops to the functions instead of calling them.

###
# A streaming function that computes a Fibonacci number.

# Input:
#   $lbFibCompute: request to compute the number.
# Output (by FnReturn labels):
#   "result": the computed value.
# The opcode is preserved through the computation.

my @stackFib; # stack of the function states
my $stateFib; # The current state

my $frFib = Triceps::FnReturn->new(
  name => "Fib",
  unit => $uFib,
  labels => [
    result => $rtFibRes,
  ],
  onPush => sub { push @stackFib, $stateFib; $stateFib = { }; },
  onPop => sub { $stateFib = pop @stackFib; },
);

my $lbFibResult = $frFib->getLabel("result");

# Declare the label & binding variables in advance, to define them sequentially.
my ($lbFibCompute, $fbFibPrev1, $fbFibPrev2);
$lbFibCompute = $uFib->makeLabel($rtFibArg, "FibCompute", undef, sub {
  my $row = $_[1]->getRow();
  my $op = $_[1]->getOpcode();
  my $idx = $row->get("idx");

  if ($idx <= 1) {
    $uFib->fork($frFib->getLabel("result")->makeRowopHash($op,
      idx => $idx,
      fib => $idx < 1 ? 0 : 1,
    ));
  } else {
    $stateFib->{op} = $op;
    $stateFib->{idx} = $idx;

    $frFib->push($fbFibPrev1);
    $uFib->fork($lbFibCompute->makeRowopHash($op,
      idx => $idx - 1,
    ));
  }
});
$fbFibPrev1 = Triceps::FnBinding->new(
  unit => $uFib,
  name => "FibPrev1",
  on => $frFib,
  labels => [
    result => sub {
      $frFib->pop($fbFibPrev1);

      $stateFib->{prev1} = $_[1]->getRow()->get("fib");

      # must prepare before pushing new state and with it new $stateFib
      my $rop = $lbFibCompute->makeRowopHash($stateFib->{op},
        idx => $stateFib->{idx} - 2,
      );

      $frFib->push($fbFibPrev2);
      $uFib->fork($rop);
    },
  ],
);
$fbFibPrev2 = Triceps::FnBinding->new(
  unit => $uFib,
  on => $frFib,
  name => "FibPrev2",
  labels => [
    result => sub {
      $frFib->pop($fbFibPrev2);

      $stateFib->{prev2} = $_[1]->getRow()->get("fib");
      $uFib->fork($frFib->getLabel("result")->makeRowopHash($stateFib->{op},
        idx => $stateFib->{idx},
        fib => $stateFib->{prev1} + $stateFib->{prev2},
      ));
    },
  ],
);

# End of streaming function
###

This is a variation of the example before last, with the split push and pop. The split is required for the fork to work: by the time the forked rowop executes, the calling label has already returned, so obviously the scoped approach won't work.

In this version the unit stack depth required to compute the 6th (and any) Fibonacci number reduces to 2: it's really only one level on top of the outermost frame.
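To make the mechanics concrete, here is a plain-Python model of this fork-based version. All the names are made up: a deque stands in for the unit's outermost frame, plain lists for the binding stack of the FnReturn and for the state stack kept by the onPush/onPop handlers. It's a model of the logic above, not the Triceps API:

```python
from collections import deque

queue = deque()   # the outermost unit frame: forked rowops execute in FIFO order
bindings = []     # the stack of bindings pushed onto the FnReturn
state_stack = []  # the saved states, as kept by the onPush handler
state = {}        # the state of the current function call

def push(binding):  # like $frFib->push(): saves the state, as in onPush
    global state
    bindings.append(binding); state_stack.append(state); state = {}

def pop():          # like $frFib->pop(): restores the state, as in onPop
    global state
    bindings.pop(); state = state_stack.pop()

def deliver(fib):   # the "result" label: routes the row to the top binding
    bindings[-1](fib)

def compute(idx):   # the FibCompute label
    if idx <= 1:
        queue.append(lambda: deliver(0 if idx < 1 else 1))
    else:
        state["idx"] = idx
        push(prev1)
        queue.append(lambda: compute(idx - 1))

def prev1(fib):     # FibPrev1.result
    pop()
    state["prev1"] = fib
    idx2 = state["idx"] - 2
    push(prev2)
    queue.append(lambda: compute(idx2))

def prev2(fib):     # FibPrev2.result
    pop()
    queue.append(lambda f=state["prev1"] + fib: deliver(f))

result = []
push(lambda fib: result.append(fib))  # the caller's binding, like FibCall
queue.append(lambda: compute(6))
while queue:        # drain the frame; each rowop runs only one call deep
    queue.popleft()()
print(result)       # [8], the 6th Fibonacci number
```

Every step of the computation runs as a single shallow call from the drain loop, which is exactly why the unit stack depth stays at 2 no matter how large the argument gets.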

If you were to attempt combining the techniques from the last two examples (the one with the trays and the one with the forks), that combination just won't work right.

The problem is that the example with the trays relies on the recursive function completing before the tray gets called. But if the recursive functions are forked, things break. Looking at why they break provides another insight into the workings of recursion. The example would look approximately like this in pseudo-code:

Compute:
  if (idx <= 1) {
    call FrFib Result;
  } else {
    push FibPrev1 to FrFib;
    fork Compute for n-1;
    fork Followup1;
  }

Followup1:
  fork tray of FibPrev1;

FibPrev1.result:
  pop FibPrev1 from FrFib;
  push FibPrev2 to FrFib;
  fork Compute for n-2;
  fork Followup2;

Followup2:
  fork tray of FibPrev2;

FibPrev2.result:
  pop FibPrev2 from FrFib;
  call FrFib Result;

The Followup labels are required because the trays with the intermediate results won't call themselves; they need to be called (or in this case, forked) by something else. The FnBinding has no method forkTray() but the same effect can be achieved manually, by first swapping the tray out and then forking its contents. The result label of the FnReturn has to be called, not forked, so that it would immediately deposit the result rowop into the bound tray.

If there were only one recursive call, it would still work because the execution frame after the label Compute(n) returns would then look like this:

Compute(n-1)
Followup1(n)

The rowop Compute(n-1) would be the argument of the recursive function call, and Followup1(n) would be the follow-up rowop. When the execution time comes, the rowop Compute(n-1) executes and places its result into the tray. Then the rowop Followup1(n) executes and forks the tray, with the next rowop FibPrev1.result(n) then executing in order. So far so good.

Now let's trace the recursion to the depth of two. The first level starts the same:

Compute(n-1)
Followup1(n)

Then Compute(n-1) executes and forks the second level of recursion, the frame becoming:

Followup1(n)
Compute(n-1-1)
Followup1(n-1)

Do you see what went wrong? The unit execution frames are FIFO. So the second level of recursion got queued after the follow-up of the first level. That rowop Followup1(n) executes next, finds no return values in the tray yet, and everything goes downhill from there.
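This failure is easy to reproduce with a toy FIFO frame. The following Python sketch models only the queuing order of the broken combination (not Triceps code, and the data flow is omitted, only the execution order matters):

```python
from collections import deque

frame = deque()  # a unit frame executes its forked rowops in FIFO order
log = []

def compute(n):  # the pseudo-code Compute, forking the recursion and follow-up
    log.append(f"Compute({n})")
    if n > 1:
        frame.append(lambda: compute(n - 1))               # the recursive call
        frame.append(lambda: log.append(f"Followup1({n})"))  # the follow-up

compute(3)
while frame:
    frame.popleft()()
print(log)
# ['Compute(3)', 'Compute(2)', 'Followup1(3)', 'Compute(1)', 'Followup1(2)']
```

Followup1(3) runs before the deeper recursion has delivered anything, which is precisely the corruption described above.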

15.11. Streaming functions and unit boundaries

One of the examples-as-future-standard-modules I've come up with is TQL: the Triceps Trivial Query Language (or should that be TTQL?), along with a server to execute it. A TQL server is kind of like the Sybase or StreamBase CEP server in that it encapsulates the CEP logic, handles the client network connections with inputs and outputs, and also lets the clients define the ad-hoc queries against the tables, of both the one-time and streaming varieties. The ad-hoc capabilities of TQL are probably better than those of Sybase and StreamBase, at least compared to the last time I've looked at them up close. TQL will be described in detail in Chapter 17: “TQL, Triceps Trivial Query Language” but right now I want to look at only one aspect of its implementation.

When you're building the execution model of an ad-hoc query, you'd obviously need to take it apart after its work is done. The easy way to do so is by building it in its own unit. Then this unit can be disposed of as, well, a unit, with a guarantee that nothing will leak. By the way, that is the answer to the question of why someone would want to use multiple units in the same thread: for the modular disposal. So far so good, but it means that the data sources in the main unit need to be connected with the processing labels in the units of the ad-hoc queries.

But the labels in the main unit and the query unit can't be directly connected. A direct connection would create stable references, and the disposal wouldn't work. That's where the streaming function interface comes to the rescue: it provides a temporary connection. Build the query unit, build a binding for it, push the binding onto the FnReturn of the main unit, run the query, pop the binding, dispose of the query unit.

And the special capacity (or if you will, superpower) of the streaming functions that allows all that is that the FnReturn and FnBinding don't have to be of the same unit. They may belong to different units and will still work together fine.

TQL was really developed as a showcase of this feature but it has gained a life of its own, and I don't want to go into all the details here. Instead let's have a high-level overview and then dive straight into the part that uses the streaming functions.

To start a TQL server, you build a Triceps model as usual, and then create an object of class Triceps::X::Tql using the endpoints of that model (inputs, outputs, queryable tables) as arguments. After that the object runs the server and handles the clients until it's asked to stop.

The TQL queries are pipelines. You read the data from a table, then select, project, join (in any order, and possibly repeatedly) and eventually print the result (that is, send it back to the client over the socket).

The reading from a table is done through its dump label. When the Tql object is created, it builds an FnReturn with the dump labels of all the tables given to it. When an ad-hoc query is created, its head of the pipeline gets a matching FnBinding that is then pushed onto the FnReturn, the table gets dumped and flows through the binding into the query.

Now let's take a look at the code. I'll be skipping over the less interesting parts; as always, you can find the full version in the source code, in lib/Triceps/X/Tql.pm. The constructor is one of these things to be skipped. The initialization part is more interesting. I've cut out the part that supports the multithreaded logic, and the remaining single-threaded version goes as follows:

sub initialize # ($self)
{
  my $myname = "Triceps::X::Tql::initialize";
  my $self = shift;

  return if ($self->{initialized});

  my $owner = $self->{trieadOwner};
  if (defined $owner) {
    # ... multithreaded version ...
  } else {
    my %dispatch;
    my @labels;
    for (my $i = 0; $i <= $#{$self->{tables}}; $i++) {
      my $name = $self->{tableNames}[$i];
      my $table = $self->{tables}[$i];

      confess "$myname: found a duplicate table name '$name', all names are: "
          . join(", ", @{$self->{tableNames}})
        if (exists $dispatch{$name});

      $dispatch{$name} = $table;
      push @labels, $name, $table->getDumpLabel();
    }

    $self->{dispatch} = \%dispatch;
    $self->{fret} = Triceps::FnReturn->new(
      name => $self->{name} . ".fret",
      labels => \@labels,
    );
  }

  $self->{initialized} = 1;
}

It creates a dispatch hash of name-to-table and also an FnReturn that contains the dump labels of all the tables.

The method compileQuery() then handles the creation of the separate unit with its contents (facet is a term from the multithreading support, just ignore it for now):

# The common query compilation for the single-threaded and multi-threaded versions.
#
# The options are:
#
# qid => $id
# (optional) The query id that will be used to report any service information
# such as errors, end of dump portion and such.
# Default: ''.
#
# qname => $name
# The query name that will be used as a label name for all the
# produced data, and for the service information too.
#
# nxprefix => $name
# (optional) Prefix for the created unit name.
# Default: ''.
#
# text => $query_text
# Text of the query, in the braced format.
#
# subError => \&error($id, $qname, $msg, $error_code, $error_val)
# The function that will handle the error reporting. The args are:
#   $id and $qname as received in the options
#   $msg - the full human-readable message
#   $error_code - the string identifying the error
#   $error_val - the particular value that caused the error
#
# tables => { $name => $table, ... }
# The tables list for the single-threaded version.
# Not used with the multithreaded version.
#
# fretDumps => $fnReturn
# The FnReturn object for dumps in the single-threaded version.
# Not used with the multithreaded version.
#
# faOut => $facet
# The facet used to send the data to the Tql thread.
# Not used with the single-threaded version.
#
# faRqDump => $facet
# The facet used to send the table dump requests back to the app core.
# Not used with the single-threaded version.
#
# subPrint => \&print($text)
# The function that prints the text back to the socket.
# Not used with the single-threaded version.
#
# @return - undef on error, the compiled context object on success
#           (see the definition of its contents inside the function)
sub compileQuery # (@opts)
{
  my $myname = "Triceps::X::Tql::compileQuery";
  my $opts = {};
  &Triceps::Opt::parse("compileQuery", $opts, {
    qid => [ '', undef ],
    qname => [ undef, \&Triceps::Opt::ck_mandatory ],
    nxprefix => [ '', undef ],
    text => [ undef, \&Triceps::Opt::ck_mandatory ],
    subError => [ undef, sub { &Triceps::Opt::ck_mandatory; &Triceps::Opt::ck_ref(@_, "CODE"); } ],
    tables => [ undef, sub { &Triceps::Opt::ck_ref(@_, "HASH", "Triceps::Table"); } ],
    fretDumps => [ undef, sub { &Triceps::Opt::ck_ref(@_, "Triceps::FnReturn"); } ],
    faOut => [ undef, sub { &Triceps::Opt::ck_ref(@_, "Triceps::Facet"); } ],
    faRqDump => [ undef, sub { &Triceps::Opt::ck_ref(@_, "Triceps::Facet"); } ],
    subPrint => [ undef, sub { &Triceps::Opt::ck_ref(@_, "CODE"); } ],
  }, @_);

  my $q = $opts->{qname}; # the name of the query itself

  my @cmds = split_braced($opts->{text});
  if ($opts->{text} ne '') {
    &{$opts->{subError}}($opts->{qid}, $q, "mismatched braces in the trailing " . $opts->{text},
      'query_syntax', $opts->{text});
    return undef;
  }

  # The context for the commands to build up an execution of a query.
  # Unlike $self, the context is created afresh for every query.
  my $ctx = {};
  $ctx->{qid} = $opts->{qid};
  $ctx->{qname} = $opts->{qname};

  $ctx->{tables} = $opts->{tables};
  $ctx->{fretDumps} = $opts->{fretDumps};
  $ctx->{actions} = []; # code that will run the pipeline

  $ctx->{faOut} = $opts->{faOut};
  $ctx->{faRqDump} = $opts->{faRqDump};
  $ctx->{subPrint} = $opts->{subPrint};
  $ctx->{requests} = []; # dump and subscribe requests that will run the pipeline
  $ctx->{copyTables} = []; # the tables created in this query
    # (have to keep references to the tables or they will disappear)

  # The query will be built in a separate unit
  $ctx->{u} = Triceps::Unit->new($opts->{nxprefix} . "${q}.unit");
  $ctx->{prev} = undef; # will contain the output of the previous command in the pipeline
  $ctx->{id} = 0; # a unique id for auto-generated objects
  # deletion of the context will cause the unit in it to be cleared
  $ctx->{cleaner} = $ctx->{u}->makeClearingTrigger();

  if (! eval {
    foreach my $cmd (@cmds) {
      my @args = split_braced($cmd);
      my $argv0 = bunescape(shift @args);
      # The rest of @args do not get unquoted here!
      die "No such TQL command '$argv0'\n" unless exists $tqlDispatch{$argv0};
      # do something better with the errors, show the failing command...
      $ctx->{id}++;
      &{$tqlDispatch{$argv0}}($ctx, @args);
      # Each command must set its result label (even if an undef) into
      # $ctx->{next}.
      die "Internal error in the command $argv0: missing result definition\n"
        unless (exists $ctx->{next});
      $ctx->{prev} = $ctx->{next};
      delete $ctx->{next};
    }
    if (defined $ctx->{prev}) {
      # implicitly print the result of the pipeline, no options
      &{$tqlDispatch{"print"}}($ctx);
    }

    1; # means that everything went OK
  }) {
    &{$opts->{subError}}($opts->{qid}, $q, "query error: $@", 'bad_query', '');
    return undef;
  }

  return $ctx;
}

Each TQL command is defined as its own method, all of them collected in %tqlDispatch. compileQuery() splits the pipeline and then lets each command build its part of the query, connecting the parts through $ctx. A command may also register an action to be run later. After everything is built, the actions run and produce the result.
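The shape of this compile-then-run pattern can be sketched outside of Triceps too. In this made-up Python miniature every stage registers a deferred action, to keep things short (in the real Tql only read registers an action, while the other stages react to the flowing rowops):

```python
def cmd_read(ctx, table):
    # starts the pipeline and defers the data pumping
    rows = ctx["tables"][table]
    sink = ctx["next"] = []
    ctx["actions"].append(lambda: sink.extend(rows))

def cmd_filter(ctx, pred):
    # a stand-in for the other pipeline stages
    src, out = ctx["prev"], []
    ctx["next"] = out
    ctx["actions"].append(lambda: out.extend(r for r in src if pred(r)))

dispatch = {"read": cmd_read, "filter": cmd_filter}  # like %tqlDispatch

def compile_query(ctx, cmds):
    ctx.update(prev=None, actions=[])
    for name, arg in cmds:
        dispatch[name](ctx, arg)       # each command builds its stage...
        ctx["prev"] = ctx.pop("next")  # ...and wires it to the next one
    for action in ctx["actions"]:      # everything is built, now run it
        action()
    return ctx["prev"]

ctx = {"tables": {"t": [1, 2, 3, 4]}}
print(compile_query(ctx, [("read", "t"), ("filter", lambda r: r % 2 == 0)]))
# [2, 4]
```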

The TQL syntax uses braces for the grouping in the pipeline. The functions split_braced() and bunescape() are imported from the package Triceps::Braced that handles the parsing of the braced nested lists. They are described in detail in Section 19.14: “Braced reference” .
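To give the flavor of the braced format, here is a toy splitter in Python. It is not the Triceps::Braced code, which also handles the backslash escapes, but it shows the idea: split on whitespace at the top level while tracking the brace depth, keeping the braced groups whole:

```python
def split_braced(text):
    # split "a {b c} {d {e}}" into ["a", "{b c}", "{d {e}}"]
    out, cur, depth = [], "", 0
    for ch in text:
        if ch == "{":
            depth += 1; cur += ch
        elif ch == "}":
            depth -= 1; cur += ch
        elif ch.isspace() and depth == 0:
            if cur:
                out.append(cur); cur = ""
        else:
            cur += ch
    if cur:
        out.append(cur)
    return out

def bunescape(s):
    # a poor man's unquoting: strip one level of braces
    return s[1:-1] if s.startswith("{") and s.endswith("}") else s

print(split_braced("read {table t}"))  # ['read', '{table t}']
print(split_braced("a {b {c d}} e"))   # ['a', '{b {c d}}', 'e']
print(bunescape("{table t}"))          # table t
```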

The option subError defines a function that reports the errors back to the user. Since everything is returned as a stream, the errors are reported as rowops on the special label +ERROR.

And the final part of the puzzle, here is the read command handler that creates the head of the query pipeline:

# "read" command. Defines a table to read from and starts the command pipeline.
# Options:
# table - name of the table to read from.
sub _tqlRead # ($ctx, @args)
{
  my $ctx = shift;
  die "The read command may not be used in the middle of a pipeline.\n"
    if (defined($ctx->{prev}));
  my $opts = {};
  &Triceps::Opt::parse("read", $opts, {
    table => [ undef, \&Triceps::Opt::ck_mandatory ],
  }, @_);

  my $tabname = bunescape($opts->{table});
  my $unit = $ctx->{u};

  if ($ctx->{faOut}) {
    # ... multithreaded version ...
  } else {
    my $fret = $ctx->{fretDumps};

    die ("Read found no such table '$tabname'\n")
      unless (exists $ctx->{tables}{$tabname});
    my $table = $ctx->{tables}{$tabname};
    my $lab = $unit->makeDummyLabel($table->getRowType(), "lb" . $ctx->{id} . "read");
    $ctx->{next} = $lab;

    my $code = sub {
      Triceps::FnBinding::call(
        name => "bind" . $ctx->{id} . "read",
        unit => $unit,
        on => $fret,
        labels => [
          $tabname => $lab,
        ],
        code => sub {
          $table->dumpAll();
        },
      );
    };
    push @{$ctx->{actions}}, $code;
  }
}

It's the only command that registers an action, which sends the data into the query unit. The rest of the commands just add more handlers to the pipeline in the unit, and get the data that flows from read. The action sets up a binding and calls the table dump, to send the data into that binding.

The reading of the tables could have also been done without the bindings, and without the need to bind the units at all: just iterate through the table procedurally in the action. But this whole example has been built largely to showcase that the bindings can be used in this way, so naturally it uses bindings.

The bindings become more useful when the query logic has to react to the normal logic of the main unit, such as in the subscriptions: set up the query, read its initial state, and then keep reading as the state gets updated. But guess what, the subscriptions can't be done with the FnReturns as shown, because an FnReturn sends its data only to the last binding pushed onto it. This means that if multiple subscriptions get set up, only the last one will be getting the data. This problem gets solved only in the multithreaded implementation of Tql that will be discussed in Section 17.6: “Internals of a TQL join”. There each client runs in its own thread, and each of its queries runs in its own unit; the inter-thread communications are used to subscribe to the updates.

15.12. The ways to call a streaming function

The examples in this chapter have shown many ways to call a streaming function. Here is a recap of them all:

  • Manually push the FnBinding onto the FnReturn, send the argument rowops to the streaming function, pop the FnBinding.
  • Use an AutoFnBind to handle the pushing and popping in a scoped fashion. A single AutoFnBind can control multiple pairs of FnBinding and FnReturn, so it can build in one go not only a single streaming function call but even a whole pipeline. In the C++ API a more low-level object ScopedFnBind can also be used in a similar way.
  • Use Unit::callBound() that takes care of creating both the AutoFnBind object and a scope around it in a more efficient way.
  • Use FnBinding::call() to create an FnBinding object dynamically, do a call with it, and dispose of it.

The most convenient way depends on the situation.

15.13. The gritty details of streaming functions scheduling

If you've read carefully about all the gritty details of scheduling, you might wonder: what exactly happens when a label in an FnBinding gets called through an FnReturn? The answer is, they are executed like the chained labels, reusing the frame of the parent label (that is, of the matching label on the FnReturn side). They even show up in the traces as the chained labels. This lets a bound label easily fork a rowop to the frame of its parent.

The only exception is when the FnReturn and FnBinding belong to different units. Then the bound label is properly called with its own frame, in the unit where it belongs.

And of course the rowops collected in a tray are another exception, since they are not called in the binding, they are only collected. When the tray gets called, they get properly called with their own frames, just as when calling any other tray.

Chapter 16. Multithreading

16.1. Triceps multithreading concepts

When running the CEP models, naturally the threads have to be connected by the queues for the data exchange. The use of queues is extremely popular but also notoriously bug-prone.

The idea of the multithreading support in Triceps is to make writing the multithreaded models easier: to make writing the good code easy and writing the bad code hard. But of course you don't have to use it; if it feels too constraining, you can always make your own.

The diagram in Figure 16.1 shows all the main elements of a multithreaded Triceps application.

Triceps multithreaded application.

Figure 16.1. Triceps multithreaded application.


The Triceps application is embodied in the class App. It's possible to have multiple Apps in one program.

Each thread has multiple parts to it. First, of course, there is the OS-level (or, technically, library-level, or Perl-level) thread where the code executes. And then there is a class that represents this thread and its place in the App. To avoid a naming conflict, this class is creatively named Triead (still pronounced thread). In the discussion I use the word thread for both concepts, the OS-level thread and the Triead, and it's usually clear from the context which one I mean. But sometimes it's particularly important to make the distinction, and then I name one or the other explicitly.

The class Triead itself is largely opaque, allowing only a few methods for introspection. But there is a control interface to it, called TrieadOwner. The Triead is visible from the outside; the TrieadOwner object is visible only in the OS thread that owns the Triead. The TrieadOwner manages the thread state and acts as the intermediary in the thread's communications with the App.

The data is passed between the threads through the Nexuses. A Nexus is unidirectional, with the data going only one way; however, it may have multiple writers and multiple readers. All the readers see the exact same data, with the rowops going in the exact same order (well, there will be other policies in the future as well, but for now there is only one).
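The fan-out guarantee can be modeled by giving each reader its own queue and having every write append to all of them. This is a sketch of the idea in Python, not of the actual Nexus implementation (which also has to deal with the inter-thread synchronization):

```python
from collections import deque

class Nexus:
    """A one-way pipe: any number of writers and readers,
    every reader sees every rowop in the same order."""
    def __init__(self):
        self.readers = []
    def add_reader(self):
        q = deque()
        self.readers.append(q)
        return q
    def write(self, rowop):
        # any writer appends to every reader's queue
        for q in self.readers:
            q.append(rowop)

nx = Nexus()
r1, r2 = nx.add_reader(), nx.add_reader()
nx.write("a"); nx.write("b")
print(list(r1), list(r2))  # ['a', 'b'] ['a', 'b']
```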

A Nexus passes through the data for multiple labels, very much like an FnReturn does (and indeed there is a special connection between them). A Nexus also allows exporting the row types and table types from one thread to another.

A Nexus is created by one thread, and then the other threads connect to it. The thread that creates the Nexus determines what labels it will contain, and what row types and table types to export.

A Nexus gets connected to the Trieads through the Facets (in the diagram, the Facets are shown as flat spots on the round Nexuses). A Facet is a connection point between the Nexus and the Triead. Each Facet is for either reading or writing. And there may be only one Facet between a given Nexus and a given Triead; you can't make multiple connections between them. As a consequence, a thread can't both write to and read from the same Nexus, it can do only one or the other. This might actually be an overly restrictive limitation and might change in the future, but that's how things work now.

Each Nexus also has a direction: either direct (downwards) or reverse (upwards). How does it know which direction is down and which is up? It doesn't. You tell it, by designating a Nexus one way or the other. And yes, the reverse Nexuses allow building the models with loops. However the loops consisting of only the direct Nexuses are not allowed, nor of only the reverse Nexuses: they would mess up the flow control. The proper loops must contain a mix of direct and reverse Nexuses.

The direct Nexuses have a limited queue size and stop the writers when the queue fills up, until the data gets consumed, thus providing the flow control. The reverse Nexuses have an unlimited queue size, which avoids the circular deadlocks. The reverse Nexuses also have a higher priority: if a thread is reading from a direct Nexus and a reverse one, with both having data available, it will read the data from the reverse Nexus first. This prevents the unlimited queues in the reverse Nexuses from the truly unlimited growth.
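The reader-side priority rule is tiny when written out. A sketch in Python with illustrative names (the real reading also involves blocking and waking up, which is omitted here):

```python
from collections import deque

direct = deque([1, 2])       # limited queue, provides the flow control
reverse = deque(["a", "b"])  # unlimited queue, gets the higher priority

def next_rowop():
    if reverse:              # the reverse nexus always gets drained first
        return reverse.popleft()
    if direct:
        return direct.popleft()
    return None

order = [next_rowop() for _ in range(4)]
print(order)  # ['a', 'b', 1, 2]
```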

Normally an App is built once and keeps running in this configuration until it stops. But there is a strong need to have the threads dynamically added and deleted too. For example, if the App is running as a server and the clients connect to it, each client needs to have its thread(s) added when it connects and deleted when it disconnects. This is handled through the concept of fragments. There is no Fragment class but when you create a Triead, you can specify a fragment name for it. Then it becomes possible to shut down and dispose of the threads in a fragment after the fragment's work is done.

16.2. The Triead lifecycle

Each Triead goes through a few stages in its life:

  • declared
  • defined
  • constructed
  • ready
  • waited ready
  • requested dead
  • dead

Note by the way that these are the stages of the Triead object. The OS-level thread as such doesn't know much about them, even though these stages do have some connection to its state.

These stages always go in order and cannot be skipped. However, for convenience you can request a move directly to a further stage; this will automatically pass through all the intermediate stages. Although, well, there is one exception: the stages waited ready and requested dead can get skipped on the way to dead. Other than that, there is always the sequence, so if you find out that a Triead is dead, you can be sure that it's also declared, defined, constructed and ready. The attempts to go to a previous stage are silently ignored.
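The progression rule is essentially a one-way state machine. A sketch in Python (illustrative names, not the Triead API, and it ignores the one exception above for simplicity):

```python
STAGES = ["declared", "defined", "constructed", "ready",
          "waited ready", "requested dead", "dead"]

class Triead:
    def __init__(self):
        self.stage = 0  # starts out declared
        self.log = []
    def advance_to(self, name):
        target = STAGES.index(name)
        while self.stage < target:  # pass through every stage in between
            self.stage += 1
            self.log.append(STAGES[self.stage])
        # a request for the current or an earlier stage is silently ignored

t = Triead()
t.advance_to("constructed")
t.advance_to("defined")  # an attempt to go back: ignored
t.advance_to("ready")
print(t.log)  # ['defined', 'constructed', 'ready']
```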

Now, what do these stages mean?

Declared:

The App knows the name of the thread and that this thread will eventually exist. When the App is asked to find the resources from this thread (such as the Nexuses; and by the way, the Nexuses are associated with the threads that created them), it will know to wait until this thread becomes constructed, and then look for the resources. This closes an important race condition: the code that defines the Triead normally runs in a new OS thread, but there is no way to tell when exactly it will run and do its work. If you had spawned a new thread and then attempted to get a nexus from it before it actually runs, the App would tell you that there is no such thread, and fail. To get around this, you declare the thread first and then start it. Most of the time there is no need to declare explicitly, the library code that wraps the thread creation does it for you.

Defined:

The Triead object has been created and connected to the App. Since this is normally done from the new OS thread, it also implies that the thread is running and is busy constructing its nexuses and whatever other internal resources it needs.

Constructed:

The Triead has constructed and exported all the nexuses that it planned to. This means that now these nexuses can be imported by the other threads (i.e. connected to the other threads). After this point the thread cannot construct any more nexuses; however it can keep importing the nexuses from the other threads. It's actually a good idea to do all your exports, mark the thread constructed, and only then start importing. This order guarantees the absence of the initialization deadlocks (which would be detected and cause the App to be aborted). There are some special cases when you need to import a nexus from a thread that is not fully constructed yet; it's possible, but requires more attention and a special override of the immediate import. This is described in more detail in Section 19.20: “TrieadOwner reference”, with the method importNexus().

Ready:

The thread has imported all the nexuses it wanted and has fully initialized all its internals (for example, if it needs to load data from a file, it might do that before telling that it's ready). After this point no more nexuses can be imported. A fine point is that the other threads may still be created, and they may do their exporting and importing, but once a thread is marked as ready, it's cast in bronze. In the simple cases you don't need to worry about separating the constructed and ready stages, just initialize everything and mark the thread as ready.

Waited ready:

Before proceeding further, the thread has to wait for all the threads in the App to be ready, or it would lose data when it tries to communicate with them. It's essentially a barrier. Normally both stages, ready and waited ready, are advanced with a single call readyReady(). With it the thread says “I'm ready, let me continue when everyone else is ready”. After that the actual work can begin. It's still possible to create more threads after that (normally, parts of the transient fragments), and until they all become ready, the App may temporarily become unready again, but that's a whole separate advanced topic that will be discussed in Section 16.6: “Dynamic threads and fragments in a socket server”.

Requested dead:

This is the way to request a thread to exit. Normally some control thread will decide that the App needs to exit and will request all its threads to die. The threads will get these requests, perform their last rites and exit. The threads don't have to wait for this request to exit, they can always decide to exit on their own. When a thread is requested to die, all the data communication with it stops. No more data will get to it through the nexuses and any data it sends will be discarded. It might churn a little bit through the data in its input buffers, but any results produced will be discarded. The good practice is to make sure that all the data is drained before requesting a thread to die. Note that the nexuses created by this thread aren't affected at all, they keep working as usual. It's the data connections between this thread and any nexuses that get broken.

Dead:

The thread has completed its execution and exited. Normally you don't need to mark this explicitly: when the thread's main function returns, the library will do it for you. Marking the thread dead also drives the harvesting of the OS threads: the harvesting logic will perform a join() (not to be confused with SQL join) of the thread and thus free the OS resources. The dead Trieads are still visible in the App (except for some special cases with the fragments), and their nexuses continue working as usual (even including the special cases with the fragments); the other threads can keep communicating through them for as long as they want.
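
The stages form a strict forward progression: a Triead only ever advances, never goes back. As an illustration only (this is not the Triceps API, just a plain-Perl sketch of the ordering), the progression can be modeled like this:

```perl
#!/usr/bin/perl
use strict; use warnings;

# Hypothetical sketch: the Triead lifecycle stages, in order.
my @stages = qw(declared defined constructed ready waited_ready
    requested_dead dead);
my %order; @order{@stages} = (0..$#stages);

# A stage may only advance forward, never backward.
sub advance { # ($current, $next) -> the new stage, or dies
    my ($cur, $next) = @_;
    die "unknown stage '$next'\n" unless exists $order{$next};
    die "cannot go back from '$cur' to '$next'\n"
        if $order{$next} < $order{$cur};
    return $next;
}

my $st = "declared";
$st = advance($st, $_) for qw(defined constructed ready);
print "$st\n"; # ready
```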

16.3. Multithreaded pipeline

The multithreaded models are well suited for running the pipelines, so that is going to be the first example of the threads. The full text of the example can be found in t/xTrafficAggMt.t in the class Traffic1. It's a variation of an already shown example, the traffic data aggregation from Section 13.2: “Periodic updates”. The short recap is that it gets the data for each network packet going through and keeps it for some time, aggregates the data by the hour and keeps it for a longer time, and aggregates it by the day and keeps it for a longer time yet. This multi-stage computation naturally matches the pipeline approach.

Since this new example highlights different features than the original one, I've changed its logic a little: it updates both the hourly and daily summaries on every packet received. And I didn't bother to implement the part with the automatic cleaning of the old data, it doesn't add anything interesting to the pipeline workings.

The pipeline topologies are quite convenient for working with the threads. The parallel computations create a possibility of things happening in an unpredictable order and producing unpredictable results. The pipeline topology allows the parallelism and at the same time also keeps the data in the same predictable order, with no possibility of rows overtaking each other.

The computation in this example is split into the following threads:

  • Read the input, parse and send the data into the model.
  • Store the recent data and aggregate it by the hour.
  • Store the hourly data and aggregate it by the day.
  • Store the daily data.
  • Get the data at the end of the pipeline and print it.

The result of each aggregation gets stored in a table in the next thread, which then uses the same table for the next stage of aggregation.

Technically, each stage only needs the data from the previous stage, but to get the updates to the printing stage (since we want to print the original data along with the hourly and daily summaries), they all go all the way through.
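
This pass-through idea can be sketched in plain Perl (an illustration only, with hypothetical names, not Triceps code): each stage adds its own output but forwards everything it received, so the original rows and all the intermediate summaries reach the end of the pipeline.

```perl
#!/usr/bin/perl
use strict; use warnings;

# Illustration only: a pipeline where each stage handles its own kind
# of row and passes everything through to the next stage.
sub make_stage { # ($name, $next) -> stage coderef
    my ($name, $next) = @_;
    return sub {
        my (@rows) = @_;
        push @rows, "summary from $name";  # add this stage's output
        $next ? $next->(@rows) : @rows;    # pass everything through
    };
}

my $last  = make_stage("day", undef);
my $first = make_stage("hour", $last);
my @out = $first->("raw row");
print join(", ", @out), "\n"; # raw row, summary from hour, summary from day
```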

Dumping the contents of the tables also requires some special support. Each table is local to its thread and can't be accessed from the other threads. To dump its contents, the dump request needs to be sent to its thread, which would extract the data and send it through. There are multiple ways to deal with the dump results. One is to have a special label for each table's dump and propagate it to the last stage to print. If all that is needed is text, another way is to have a single label that carries strings: all the dumps can send their data converted to text into it, and it would go all the way through the pipeline. For this example I've picked the latter approach.

And now it's time to show some code. The main part goes like this:

Triceps::Triead::startHere(
  app => "traffic",
  thread => "print",
  main => \&printT,
);

The startHere() creates an App and starts a Triead in the current OS thread. The “Here” in the method name stands for “in the current OS thread”. traffic is the app name, print is the thread name. This thread will be the end of the pipeline, and it will create the rest of the threads. This is a convenient pattern when the results of the model need to be fed back to the current thread, and it works out very conveniently for the unit tests. printT() is the body function of this printing thread:

sub printT # (@opts)
{
  my $opts = {};
  Triceps::Opt::parse("traffic main", $opts, {@Triceps::Triead::opts}, @_);
  my $owner = $opts->{owner};
  my $unit = $owner->unit();

  Triceps::Triead::start(
    app => $opts->{app},
    thread => "read",
    main => \&readerT,
  );
  Triceps::Triead::start(
    app => $opts->{app},
    thread => "raw_hour",
    main => \&rawToHourlyT,
    from => "read/data",
  );
  Triceps::Triead::start(
    app => $opts->{app},
    thread => "hour_day",
    main => \&hourlyToDailyT,
    from => "raw_hour/data",
  );
  Triceps::Triead::start(
    app => $opts->{app},
    thread => "day",
    main => \&storeDailyT,
    from => "hour_day/data",
  );

  my $faIn = $owner->importNexus(
    from => "day/data",
    as => "input",
    import => "reader",
  );

  $faIn->getLabel("print")->makeChained("print", undef, sub {
    print($_[1]->getRow()->get("text"));
  });
  for my $tag ("packet", "hourly", "daily") {
    makePrintLabel($tag, $faIn->getLabel($tag));
  }

  $owner->readyReady();
  $owner->mainLoop(); # all driven by the reader
}

startHere() accepts a number of fixed options plus arbitrary options that it doesn't care about by itself but passes through to the thread's main function, which is then responsible for parsing them. To reiterate, the main function gets all the options from the call of startHere(), both those that startHere() parses and those that it simply passes through. startHere() also adds one more option of its own: owner, containing the TrieadOwner object that the thread uses to communicate with the rest of the App.

In this case printT() doesn't have any extra options on its own, it's just happy to get startHere()'s standard set that it takes all together from @Triceps::Triead::opts.
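
The general shape of this option handling can be sketched in plain Perl. This is a deliberately simplified, hypothetical re-implementation, not the real Triceps::Opt::parse() (which also supports per-option check functions): fill a target hash with defaults, then override from the name-value pairs, rejecting unknown names.

```perl
#!/usr/bin/perl
use strict; use warnings;

# Hypothetical simplified sketch of the option-parsing pattern:
# apply the defaults from the spec, then the caller's name-value
# pairs, dying on any option name not in the spec.
sub parse_opts { # (\%target, \%spec, @nameValuePairs)
    my ($target, $spec, @pairs) = @_;
    %$target = map { $_ => $spec->{$_} } keys %$spec; # defaults first
    while (@pairs) {
        my ($name, $value) = splice(@pairs, 0, 2);
        die "unknown option '$name'\n" unless exists $spec->{$name};
        $target->{$name} = $value;
    }
}

my %opts;
parse_opts(\%opts, { app => undef, thread => undef, from => "none" },
    app => "traffic", thread => "print");
print "$opts{app} $opts{thread} $opts{from}\n"; # traffic print none
```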

It gets the TrieadOwner object $owner from the option appended by startHere(). Each TrieadOwner is created with its own Unit, so the unit is obtained from it to create the thread's model in it. Incidentally, the TrieadOwner also acts as a clearing trigger object for the Unit, so when the TrieadOwner is destroyed, it properly clears the Unit.

Then it goes and creates all the threads of the pipeline. The start() works very much like startHere(), only it actually creates a new OS thread and starts the main function in it. The main function can be the same whether it runs through start() or startHere(). The special catch is that the options to start() must contain only the plain Perl values, not Triceps objects. It has to do with how Perl works with threads: it makes a copy of every value for the new thread, and it can't copy the XS objects, so they simply become undefined in the new thread.

All but the first thread in the pipeline have the extra option from: it specifies the input nexus for this thread, and each thread creates an output nexus data. A nexus is named relative to the thread that created it, so when the option from says day/data, it means the nexus data created by the thread day.

So, the pipeline gets all connected sequentially until eventually printT() imports the nexus at its tail. importNexus() returns a Facet, which is the thread's API to the nexus. A facet looks very much like an FnReturn for most purposes, with a few additions. It even has a real FnReturn in it, and you work with the labels of that FnReturn to get the data out of the nexus (or to send data into the nexus). You could potentially use an FnBinding with that FnReturn but the typical pattern for reading from a facet is different: just get its labels and chain the handling labels directly to them.

The option as of importNexus() gives the name to the facet and to its same-named FnReturn (without it the facet would be named the same as the short name of the nexus, in this case data). The option import tells whether this thread will be reading or writing the nexus, and in this case it's reading.

By the time the pipeline gets to the last stage, it has a few labels in its facet:

  • print - carries the direct text lines to print in its field text, and its contents get printed.
  • dumprq - carries the dump requests to the tables, and the printing thread doesn't care about it.
  • packet - carries the raw data about the packets.
  • hourly - carries the hourly summaries.
  • daily - carries the daily summaries.

The last three also get printed, but this time as whole rows.

And after everything is connected, the thread both tells that it's ready and waits for all the other threads to become ready by calling readyReady(). Then it's run time, and mainLoop() takes care of it: it keeps reading data from the nexus and processing it until it's told to shut down. The shutdown will be controlled by the file reading thread at the start of the pipeline. The processing is done by getting the rowops from the nexus and calling them on the appropriate label in the facet, which then calls the labels chained from it, and that gets all the rest of the thread's model running.
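
The dispatch idea behind that loop can be sketched in plain Perl (an illustration with made-up data, not the real mainLoop() code): each incoming rowop names a label, and the handler chained to that label gets called with the row.

```perl
#!/usr/bin/perl
use strict; use warnings;

# Illustration only: dispatch incoming (label, row) pairs to the
# handlers registered for each label, the way the facet's labels fan
# the data out to the labels chained from them.
my @printed;
my %handlers = (
    print  => sub { push @printed, "text: $_[0]" },
    hourly => sub { push @printed, "hourly row: $_[0]" },
);

my @incoming = ([ print => "hello" ], [ hourly => "bytes=100" ]);
for my $rowop (@incoming) {
    my ($label, $row) = @$rowop;
    $handlers{$label}->($row) if exists $handlers{$label};
}
print join("; ", @printed), "\n"; # text: hello; hourly row: bytes=100
```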

The reader thread drives the pipeline:

sub readerT # (@opts)
{
  my $opts = {};
  Triceps::Opt::parse("traffic main", $opts, {@Triceps::Triead::opts}, @_);
  my $owner = $opts->{owner};
  my $unit = $owner->unit();

  my $rtPacket = Triceps::RowType->new(
    time => "int64", # packet's timestamp, microseconds
    local_ip => "string", # string to make easier to read
    remote_ip => "string", # string to make easier to read
    local_port => "int32",
    remote_port => "int32",
    bytes => "int32", # size of the packet
  );

  my $rtPrint = Triceps::RowType->new(
    text => "string", # the text to print (including \n)
  );

  my $rtDumprq = Triceps::RowType->new(
    what => "string", # identifies, what to dump
  );

  my $faOut = $owner->makeNexus(
    name => "data",
    labels => [
      packet => $rtPacket,
      print => $rtPrint,
      dumprq => $rtDumprq,
    ],
    import => "writer",
  );

  my $lbPacket = $faOut->getLabel("packet");
  my $lbPrint = $faOut->getLabel("print");
  my $lbDumprq = $faOut->getLabel("dumprq");

  $owner->readyReady();

  while(<STDIN>) {
    chomp;
    # print the input line, as a debugging exercise
    $unit->makeArrayCall($lbPrint, "OP_INSERT", "> $_\n");

    my @data = split(/,/); # starts with a command, then string opcode
    my $type = shift @data;
    if ($type eq "new") {
      $unit->makeArrayCall($lbPacket, @data);
    } elsif ($type eq "dump") {
      $unit->makeArrayCall($lbDumprq, "OP_INSERT", $data[0]);
    } else {
      $unit->makeArrayCall($lbPrint, "OP_INSERT", "Unknown command '$type'\n");
    }
    $owner->flushWriters();
  }

  {
    # drain the pipeline before shutting down
    my $ad = Triceps::AutoDrain::makeShared($owner);
    $owner->app()->shutdown();
  }
}

It starts by creating the nexus with the initial set of the labels: for the data about the network packets, for the lines to be printed at the end of the pipeline and for the dump requests to the tables in the other threads. It gets exported for the other threads to import, and also imported right back into this thread, for writing. And then the setup is done, readyReady() is called, and the processing starts.

It reads the CSV lines, splits them, decides whether each is a data line or a dump request, and one way or the other sends it into the nexus. The data sent to a facet doesn't get immediately forwarded to the nexus. It's collected internally in a tray, and then flushWriters() sends it on. The mainLoop() shown in printT() calls flushWriters() automatically after every tray it processes from the input. But when reading from a file you've got to do it yourself. Of course, it's more efficient to send through multiple rows at once, so a smarter implementation would check if multiple lines are available from the file and send them in larger bundles.
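
The tray buffering can be sketched in plain Perl (an illustration of the batching idea only, not the Triceps internals): writes accumulate in a tray, and the flush ships the whole tray into the nexus as one unit.

```perl
#!/usr/bin/perl
use strict; use warnings;

# Illustration only: writes to a facet are buffered in a tray and sent
# into the nexus as one batch on flush, roughly like flushWriters().
my @tray;   # rows collected since the last flush
my @nexus;  # what the downstream readers would see, in batches

sub write_row { push @tray, $_[0] }
sub flush_writers {
    return unless @tray;
    push @nexus, [ @tray ];  # the whole tray travels as one unit
    @tray = ();
}

write_row("row1");
write_row("row2");
flush_writers();   # first batch of two rows
write_row("row3");
flush_writers();   # second batch of one row
print scalar(@nexus), " batches\n"; # 2 batches
```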

The last part is the shutdown. After the end of file is reached, it's time to shut down the application. You can't just shut it down right away because there still might be data in the pipeline, and if you did, that data would be lost. The right way is to drain the pipeline first, and then do the shutdown when the app is drained. AutoDrain::makeShared() creates a scoped drain: the drain request for all the threads is started when this object is created, and the object construction completes when the drain succeeds. When the object is destroyed, that releases the drain. So in this case the drain succeeds and then the app gets shut down.
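
The scoped-object idiom itself is plain Perl: the constructor acquires, DESTROY releases when the variable goes out of scope. A minimal sketch in the spirit of AutoDrain (with a hypothetical ScopedDrain class, not the real AutoDrain implementation):

```perl
#!/usr/bin/perl
use strict; use warnings;

# Illustration only: a scoped guard object. The constructor "drains"
# and DESTROY releases the drain when the object goes out of scope.
package ScopedDrain;
sub new {
    my ($class, $state) = @_;
    $state->{drained} = 1;  # stand-in for waiting until the drain succeeds
    return bless { state => $state }, $class;
}
sub DESTROY { $_[0]{state}{drained} = 0 }  # release the drain

package main;
my %state = (drained => 0);
{
    my $drain = ScopedDrain->new(\%state);
    print "during scope: $state{drained}\n"; # during scope: 1
    # ... the shutdown would happen here, while drained ...
}
print "after scope: $state{drained}\n"; # after scope: 0
```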

The shutdown causes the mainLoop() calls in all the other threads to return, and the threads to exit. Then startHere() in the first thread has the special logic in it that joins all the started threads after its own main function returns and before it completes. After that the script continues on its way and is free to exit.

The rest of this example might be easier to understand by looking at an example of a run first. The lines in bold are the copies of the input lines that readerT() reads from the input and sends into the pipeline, and printT() faithfully prints.

input.packet are the rows that reach the printT on the packet label (remember, input is the name with which it imports its input nexus). input.hourly is the data aggregated by the hour intervals (and also by the IP addresses, dropping the port information), and input.daily further aggregates it per day (and again per the IP addresses). The timestamps in the hourly and daily rows are truncated to the start of the hour or day.

And the lines without any prefixes are the dumps of the table contents that again reach the printT() through the print label:

new,OP_INSERT,1330886011000000,1.2.3.4,5.6.7.8,2000,80,100
input.packet OP_INSERT time="1330886011000000" local_ip="1.2.3.4"
    remote_ip="5.6.7.8" local_port="2000" remote_port="80" bytes="100"
input.hourly OP_INSERT time="1330884000000000" local_ip="1.2.3.4"
    remote_ip="5.6.7.8" bytes="100"
input.daily OP_INSERT time="1330819200000000" local_ip="1.2.3.4"
    remote_ip="5.6.7.8" bytes="100"
new,OP_INSERT,1330886012000000,1.2.3.4,5.6.7.8,2000,80,50
input.packet OP_INSERT time="1330886012000000" local_ip="1.2.3.4"
    remote_ip="5.6.7.8" local_port="2000" remote_port="80" bytes="50"
input.hourly OP_DELETE time="1330884000000000" local_ip="1.2.3.4"
    remote_ip="5.6.7.8" bytes="100"
input.daily OP_DELETE time="1330819200000000" local_ip="1.2.3.4"
    remote_ip="5.6.7.8" bytes="100"
input.hourly OP_INSERT time="1330884000000000" local_ip="1.2.3.4"
    remote_ip="5.6.7.8" bytes="150"
input.daily OP_INSERT time="1330819200000000" local_ip="1.2.3.4"
    remote_ip="5.6.7.8" bytes="150"
new,OP_INSERT,1330889612000000,1.2.3.4,5.6.7.8,2000,80,150
input.packet OP_INSERT time="1330889612000000" local_ip="1.2.3.4"
    remote_ip="5.6.7.8" local_port="2000" remote_port="80" bytes="150"
input.hourly OP_INSERT time="1330887600000000" local_ip="1.2.3.4"
    remote_ip="5.6.7.8" bytes="150"
input.daily OP_DELETE time="1330819200000000" local_ip="1.2.3.4"
    remote_ip="5.6.7.8" bytes="150"
input.daily OP_INSERT time="1330819200000000" local_ip="1.2.3.4"
    remote_ip="5.6.7.8" bytes="300"
new,OP_INSERT,1330889811000000,1.2.3.4,5.6.7.8,2000,80,300
input.packet OP_INSERT time="1330889811000000" local_ip="1.2.3.4"
    remote_ip="5.6.7.8" local_port="2000" remote_port="80" bytes="300"
input.hourly OP_DELETE time="1330887600000000" local_ip="1.2.3.4"
    remote_ip="5.6.7.8" bytes="150"
input.daily OP_DELETE time="1330819200000000" local_ip="1.2.3.4"
    remote_ip="5.6.7.8" bytes="300"
input.daily OP_INSERT time="1330819200000000" local_ip="1.2.3.4"
    remote_ip="5.6.7.8" bytes="150"
input.hourly OP_INSERT time="1330887600000000" local_ip="1.2.3.4"
    remote_ip="5.6.7.8" bytes="450"
input.daily OP_DELETE time="1330819200000000" local_ip="1.2.3.4"
    remote_ip="5.6.7.8" bytes="150"
input.daily OP_INSERT time="1330819200000000" local_ip="1.2.3.4"
    remote_ip="5.6.7.8" bytes="600"
new,OP_INSERT,1330972411000000,1.2.3.5,5.6.7.9,3000,80,200
input.packet OP_INSERT time="1330972411000000" local_ip="1.2.3.5"
    remote_ip="5.6.7.9" local_port="3000" remote_port="80" bytes="200"
input.hourly OP_INSERT time="1330970400000000" local_ip="1.2.3.5"
    remote_ip="5.6.7.9" bytes="200"
input.daily OP_INSERT time="1330905600000000" local_ip="1.2.3.5"
    remote_ip="5.6.7.9" bytes="200"
new,OP_INSERT,1331058811000000
input.packet OP_INSERT time="1331058811000000"
new,OP_INSERT,1331145211000000
input.packet OP_INSERT time="1331145211000000"
dump,packets
time="1330886011000000" local_ip="1.2.3.4" remote_ip="5.6.7.8"
    local_port="2000" remote_port="80" bytes="100"
time="1330886012000000" local_ip="1.2.3.4" remote_ip="5.6.7.8"
    local_port="2000" remote_port="80" bytes="50"
time="1330889612000000" local_ip="1.2.3.4" remote_ip="5.6.7.8"
    local_port="2000" remote_port="80" bytes="150"
time="1330889811000000" local_ip="1.2.3.4" remote_ip="5.6.7.8"
    local_port="2000" remote_port="80" bytes="300"
time="1330972411000000" local_ip="1.2.3.5" remote_ip="5.6.7.9"
    local_port="3000" remote_port="80" bytes="200"
dump,hourly
time="1330884000000000" local_ip="1.2.3.4" remote_ip="5.6.7.8"
    bytes="150"
time="1330887600000000" local_ip="1.2.3.4" remote_ip="5.6.7.8"
    bytes="450"
time="1330970400000000" local_ip="1.2.3.5" remote_ip="5.6.7.9"
    bytes="200"
dump,daily
time="1330819200000000" local_ip="1.2.3.4" remote_ip="5.6.7.8"
    bytes="600"
time="1330905600000000" local_ip="1.2.3.5" remote_ip="5.6.7.9"
    bytes="200"
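
The truncated timestamps in this output come from plain integer arithmetic on the microsecond values, the same rounding that hourStamp() below performs (day_stamp() here is my analogous illustration for the daily rounding):

```perl
#!/usr/bin/perl
use strict; use warnings;

# Truncate a microsecond timestamp to the start of its hour or day.
sub hour_stamp { $_[0] - ($_[0] % (1000*1000*3600)) }
sub day_stamp  { $_[0] - ($_[0] % (1000*1000*3600*24)) }

my $t = 1330886011000000; # a timestamp from the sample run above
print hour_stamp($t), "\n"; # 1330884000000000
print day_stamp($t), "\n";  # 1330819200000000
```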

Note that the order of the lines is completely predictable, nothing goes out of order. Each nexus preserves the order of the rows put into it, and the fact that there is only one writer per nexus and every thread is fed from only one nexus avoids the races.

Let's look at the thread that performs the aggregation by the hour:

# compute an hour-rounded timestamp (in microseconds)
sub hourStamp # (time)
{
  return $_[0]  - ($_[0] % (1000*1000*3600));
}

sub rawToHourlyT # (@opts)
{
  my $opts = {};
  Triceps::Opt::parse("traffic main", $opts, {
    @Triceps::Triead::opts,
    from => [ undef, \&Triceps::Opt::ck_mandatory ],
  }, @_);
  my $owner = $opts->{owner};
  my $unit = $owner->unit();

  # The current hour stamp that keeps being updated;
  # any aggregated data will be propagated when it is in the
  # current hour (to avoid the propagation of the aggregator clearing).
  my $currentHour;

  my $faIn = $owner->importNexus(
    from => $opts->{from},
    as => "input",
    import => "reader",
  );

  # the full stats for the recent time
  my $ttPackets = Triceps::TableType->new($faIn->getLabel("packet")->getRowType())
    ->addSubIndex("byHour",
      Tric