Thursday, November 15, 2012

Event Sourcing Yow Night with Greg Young




·         Current state:
·         Is awful
·         Requires large amounts of versioning
·         1st level derivative of facts that have happened
·         Look at systems from perspective of no current state
·         Banking, insurance, gambling, etc
·         We don’t have current state, we have a series of facts
·         Driving point is from business perspective
·         E.g.
·         Purchase order
·         Line items (n)
·         Shipping information
·         Models represent our current state
·         Document stores are awesome - until you need to change your schema
·         Problem is we want to go and change our previous representations of data
·         E.g. Cart created -> 3 items added -> shipping information added
·         At any time can replay 3 events to get data model
·         Events: append only model
·         How do you scale immutable data? Copy it
·         Immutable data is awesome
·         Once “Cart created” is created it will never change
·         Append-only model, with everything immutable, what about updates/deletes?
·         Update/delete = lost valuable data
·         Code with a magic 8-ball to predict what business is going to want in 2 years?
·         Strategic design with DDD
·         Don’t apply ES globally
·         ES/CQRS is not an architecture
·         Small things you apply within a service/component
·         Not losing information is valuable
·         2 sets of use cases in different orders that end up with same ending state?
·         Lost info
·         Hash collision – non-perfect – lost info coming into system
·         One rule: we don’t lose any data – generating 100Gb per day
·         How do you predict value of data?
·         Humans have history of making bad predictions about future
·         Bigger the expert = worse predictive analysis
·         Can only say: “I cannot price this option”
·         Therefore I should keep it
·         When the business asks for unexpected data, you can say yes
·         Could be something that makes or breaks company – competitive advantage
·         Accounting is not done with a pencil
·         If you make a mistake, do a reversal
·         Partial reversal: $10,000 entered instead of $1,000, so post a correction of -$9,000
·         Accountants don’t like doing partial reversals – too complicated across 8 accounts
·         Do a full reversal instead and then redo the entry
·         E.g. Cart created -> 3 items added -> 1 item removed -> shipping information added
·         Same as 2 items added?
·         As a series of facts, very different from each other
·         Want to know about how many items removed?
·         Most businesses are not just create, read, update, delete…. Many verbs
·         ES gives semantics associated back down to verbs
·         Business value comes from fact that we’re not losing information
·         E.g. Large POS, Amazon
·         Items removed from a cart are more likely to be purchased in the future – customers still want them but can’t afford them
·         Old model
·         Add RemovedLineItems object or flag & date on line items
·         Query, subquery – time correlation – 3 nested subqueries
·         (Try using a Stream database instead)
·         ES model
·         Write projection with state inside
·         If item found in carts
·         A business person can go back to any point in the past and see things as they were at that time, deterministically, using the perspective we have today
·         Huge win for business
·         Useful for predicting the future - “back testing” in finance
·         BI teams reverse-engineer CRUD databases into events (imperfectly)
·         Temporal data model
·         Smoke testing
·         Rerun commands since day 1 every Friday and compare results from last time
·         Won’t protect you from black swans
·         Append-only good for hard drives (even SSDs that burn out rewriting)
·         E.g. Secure system
·         Gambling
·         Chris Harn – edited his bets on hard drive
·         How to prevent a super user attack
·         E.g. Pick 6 tickets
·         CSU/DSU
·         Prevent by putting log on “write-once” media – physically can’t modify data
·         Easier to physically secure a machine than to secure software
·         200 partitions within logs
·         Every aggregate has its own stream
·         Partition
·         Rolling snapshot
·         20,000 requests per sec if all in memory
·         Events represent functions
·         Current state = left fold
·         Snapshots = memoisation
·         ES = functional way of storing data
·         Pattern match functions to events
·         ES = FP
·         Balance of bank account not a column in db but a function of account history
·         Provable
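
The "current state = left fold" and "snapshots = memoisation" points above can be sketched in a few lines of Python. This is only an illustration, not code from the talk: the event classes, the `Cart` state and the `apply`/`replay` functions are hypothetical names chosen to match the cart example in the notes.

```python
from dataclasses import dataclass, replace
from functools import reduce
from typing import Tuple

# Hypothetical immutable events, named after the cart example in the notes.
@dataclass(frozen=True)
class CartCreated:
    cart_id: str

@dataclass(frozen=True)
class ItemAdded:
    sku: str

@dataclass(frozen=True)
class ItemRemoved:
    sku: str

@dataclass(frozen=True)
class ShippingInfoAdded:
    address: str

# Hypothetical current-state model; it is derived, never stored as the source of truth.
@dataclass(frozen=True)
class Cart:
    cart_id: str = ""
    items: Tuple[str, ...] = ()
    address: str = ""

def apply(state: Cart, event) -> Cart:
    """Return the next state for one event (one branch per event type)."""
    if isinstance(event, CartCreated):
        return Cart(cart_id=event.cart_id)
    if isinstance(event, ItemAdded):
        return replace(state, items=state.items + (event.sku,))
    if isinstance(event, ItemRemoved):
        items = list(state.items)
        if event.sku in items:
            items.remove(event.sku)  # drop the first matching item only
        return replace(state, items=tuple(items))
    if isinstance(event, ShippingInfoAdded):
        return replace(state, address=event.address)
    return state  # events this model does not care about are ignored

def replay(events, snapshot: Cart = Cart()) -> Cart:
    """Current state is a left fold of the events over a starting state."""
    return reduce(apply, events, snapshot)

history_a = [CartCreated("c1"), ItemAdded("a"), ItemAdded("b"), ItemAdded("c"),
             ItemRemoved("b"), ShippingInfoAdded("1 Main St")]
history_b = [CartCreated("c1"), ItemAdded("a"), ItemAdded("c"),
             ShippingInfoAdded("1 Main St")]

# Two different histories fold to the same current state.
same_state = replay(history_a) == replay(history_b)

# A snapshot is a memoised fold result; replay only the events after it.
snapshot = replay(history_a[:4])
resumed = replay(history_a[4:], snapshot)
```

Here `same_state` is true even though `history_a` contains a removal that `history_b` does not, which is exactly the information a current-state-only model loses. `resumed` equals a full replay of `history_a`, so a snapshot changes only how much work the fold does, not its result.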

·         Natural fits for ES
·         Accounting
·         Pubsub
·         Don't have to build your own Event Store
·         Cassandra - stream per column
·         Scales well
·         Medical system
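
Accounting is listed as a natural fit, and the notes describe two ideas that are easy to show together: a balance is a function of the account history, and mistakes are corrected by appending a full reversal rather than editing an entry. A minimal sketch, assuming a hypothetical `Posted` entry type and whole-dollar integer amounts:

```python
from dataclasses import dataclass

# Hypothetical ledger entry; amounts are whole dollars, positive = credit.
@dataclass(frozen=True)
class Posted:
    account: str
    amount: int
    memo: str

def balance(ledger, account: str) -> int:
    """The balance is computed from history, not stored in a column."""
    return sum(entry.amount for entry in ledger if entry.account == account)

def reversal_of(entry: Posted) -> Posted:
    """Build the entry that fully cancels an earlier one."""
    return Posted(entry.account, -entry.amount, "reversal of: " + entry.memo)

ledger = []  # append-only: entries are never updated or deleted

mistake = Posted("acct-1", 10_000, "deposit")  # should have been 1,000
ledger.append(mistake)
ledger.append(reversal_of(mistake))                 # full reversal
ledger.append(Posted("acct-1", 1_000, "deposit"))   # redo correctly

current = balance(ledger, "acct-1")
```

After these three appends `current` is 1,000, and the ledger still records that the mistake happened and how it was corrected.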

Questions
·         How to justify the cost of storing everything when you don’t know what you will need?
·         Cost of data is low - 5 GB for the price of a can of Coke
·         Hard to justify not storing data
·         What is it not used for?
·         Lots of things
·         Things outside of core domain
·         Events represent use cases
·         Some use cases might not be high value
·         E.g. claims more valuable than sales
·         Only used for competitive advantage – requires analysis
·         Pitfalls?
·         ES architecture
·         Monolithic - systems of systems instead
·         Expensive to do analysis
·         Does every projection read every event?
·         Projection pattern match, function
·         Only look at events interested in
·         Map reduce
·         I asked which databases other than Cassandra are a good fit for ES
·         Consistency is important
·         Need CA for writes, AP for reads
·         Hard to find system that can be tuned like that
·         Riak works but is slow because of quorum writes
·         Event Store has BSD license
·         SQL server for small systems
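
The answer to "does every projection read every event?" can be illustrated with a small projection that holds its own state and matches only the event types it is interested in. This is a sketch under assumptions, not Event Store's projection API: the event classes and `RemovedItemsProjection` are hypothetical, and the use case is the "how many items were removed?" question from the notes.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical events from a shopping cart stream.
@dataclass(frozen=True)
class ItemAdded:
    cart_id: str
    sku: str

@dataclass(frozen=True)
class ItemRemoved:
    cart_id: str
    sku: str

@dataclass(frozen=True)
class CheckedOut:
    cart_id: str

class RemovedItemsProjection:
    """Counts removals per SKU; the state lives inside the projection."""

    def __init__(self):
        self.removed = Counter()
        # Map event types to handlers; anything not listed is skipped.
        self.handlers = {ItemRemoved: self.on_item_removed}

    def handle(self, event) -> None:
        handler = self.handlers.get(type(event))
        if handler is not None:
            handler(event)

    def on_item_removed(self, event: ItemRemoved) -> None:
        self.removed[event.sku] += 1

stream = [
    ItemAdded("c1", "book"), ItemAdded("c1", "lamp"), ItemRemoved("c1", "book"),
    CheckedOut("c1"),
    ItemAdded("c2", "book"), ItemRemoved("c2", "book"), ItemRemoved("c2", "lamp"),
]

projection = RemovedItemsProjection()
for event in stream:
    projection.handle(event)
```

In this sketch the projection is still handed every event and simply ignores the ones it has no handler for; a real event store could filter by type before delivery. Because the stream is kept, the same projection can be written later and run over history from day one.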