bi-temporal database – first lesson

Milestoning in PWM is similar, though there I used only the validFrom/validTill columns.

I think this is all about modifying the historical record in an accountable way.

snapshot: For a given observation date like 31/12/1999, “where TxStart <= observDate and observDate < TxEnd” would return a snapshot as of 31/12/1999. The window is half-open, so a row whose TxStart falls exactly on the observation date is included.

A “Tx” or “Transaction” means a data entry. TxStart/TxEnd mark the observation window, or period of belief. If we observe the price on a given day, i.e. take a snapshot on that day, then every row whose window does not contain that day is filtered out.

Indexing for performance — an index on (TxStart, TxEnd) would speed up such “snapshot” queries.
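A minimal sketch of such a snapshot query, using Python's built-in sqlite3 as a stand-in for the real database. The table name `price`, the column names, the sample rows and the '9999-12-31' high date are all my assumptions for illustration. The sketch uses a half-open window (tx_start <= date < tx_end) so that a row starting exactly on the observation date is included.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE price (
        product  TEXT NOT NULL,
        price    REAL NOT NULL,
        tx_start TEXT NOT NULL,  -- ISO dates compare correctly as strings
        tx_end   TEXT NOT NULL   -- '9999-12-31' marks the still-current row
    )
""")
# The index suggested in the text, on both ends of the window.
conn.execute("CREATE INDEX ix_price_tx ON price (tx_start, tx_end)")
conn.executemany(
    "INSERT INTO price VALUES (?, ?, ?, ?)",
    [
        ("widget", 10.0, "1999-01-01", "1999-07-01"),
        ("widget", 12.0, "1999-07-01", "2000-03-01"),
        ("widget", 15.0, "2000-03-01", "9999-12-31"),
    ],
)


def snapshot(conn, observ_date):
    """Return the rows believed true on observ_date (half-open window)."""
    return conn.execute(
        "SELECT product, price FROM price "
        "WHERE tx_start <= ? AND ? < tx_end ORDER BY product",
        (observ_date, observ_date),
    ).fetchall()


print(snapshot(conn, "1999-12-31"))
```

Only the middle row survives the filter for 31/12/1999, because its window is the only one that contains that date.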

Data error 1: If a wrong price was entered but quickly corrected, and never affected any customer, then I doubt bi-temporal design is meant to cover it. Therefore the “never-delete” rule is not so strict.

Data error 2: However, if a customer was charged the wrong price, then the “never-delete” principle holds. The wrong price was in force at that time, so it is valid history (though not valid by company policy). Valid history is never deleted and must be milestoned.
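Milestoning a correction could look roughly like the sketch below, again with sqlite3 as a stand-in. The `price` table, the `correct_price` helper and the '9999-12-31' high date are hypothetical names of mine. The wrong row is never deleted: its window is closed and a corrected row is appended.

```python
import sqlite3

HIGH_DATE = "9999-12-31"  # assumed marker for "still believed true"

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE price (product TEXT, price REAL, tx_start TEXT, tx_end TEXT)"
)
conn.execute(
    "INSERT INTO price VALUES ('widget', 99.0, '1999-01-01', ?)", (HIGH_DATE,)
)


def correct_price(conn, product, new_price, as_of):
    """Milestone the current row and append the corrected one; no DELETE."""
    with conn:  # both statements commit together or not at all
        conn.execute(
            "UPDATE price SET tx_end = ? WHERE product = ? AND tx_end = ?",
            (as_of, product, HIGH_DATE),
        )
        conn.execute(
            "INSERT INTO price VALUES (?, ?, ?, ?)",
            (product, new_price, as_of, HIGH_DATE),
        )


correct_price(conn, "widget", 9.9, "1999-06-01")
history = conn.execute(
    "SELECT price, tx_start, tx_end FROM price ORDER BY tx_start"
).fetchall()
print(history)
```

After the correction the table holds both rows, so a snapshot before 1999-06-01 still shows the price that was actually charged.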

Example —

c#Reuters database IV

Database Question

Table E
int id
string empType
string email
string firstName
string lastName

CLUSTERED (id, empType)
UNIQUE (empType, id, firstName)
DUPLICATE (empType, firstName)

Table E has: 10,000,000,000 rows

Table E has: one row with firstName 'beetlejuice'

Table E has: 1,000,000,000 rows with empType = ‘fulltime’

SELECT * FROM E WHERE email = '';

SELECT * FROM E WHERE id = 100 AND firstName = 'beetlejuice';

SELECT * FROM E WHERE empType = 'fulltime' AND firstName = 'beetlejuice';

SELECT * FROM E WHERE empType = 'fulltime' AND firstName = 'beetlejuice';

How many pages for leaf nodes does the above query read?

INSERT: Writes how many pages?
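One way to explore which index each query can use is to ask a query planner. The sketch below does this in SQLite, which is only a stand-in: the interview question targets a different engine, so plans there may differ, and nothing here answers the page-count questions. I am assuming a WITHOUT ROWID primary key as an approximation of the CLUSTERED index, and the index names `ix_unique` and `ix_dup` are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE E (
        id INTEGER, empType TEXT, email TEXT, firstName TEXT, lastName TEXT,
        PRIMARY KEY (id, empType)          -- stands in for CLUSTERED
    ) WITHOUT ROWID;
    CREATE UNIQUE INDEX ix_unique ON E (empType, id, firstName);
    CREATE INDEX ix_dup ON E (empType, firstName);
""")


def plan(sql):
    """Return the planner's description of how it would run sql."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return "; ".join(row[-1] for row in rows)  # last column is the detail text


q_email = "SELECT * FROM E WHERE email = ''"
q_id = "SELECT * FROM E WHERE id = 100 AND firstName = 'beetlejuice'"
q_emp = "SELECT * FROM E WHERE empType = 'fulltime' AND firstName = 'beetlejuice'"

for q in (q_email, q_id, q_emp):
    print(q, "->", plan(q))
```

No index starts with email, so the first query has to scan the whole table, while the other two filter on a leading column of some index and can seek instead.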

oracle tablespace striping

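If an audit table or a trade capture table gets a lot of concurrent writes, you may want to stripe it across disks to increase parallelism and write performance. One way, according to a friend, is to spread the table over multiple tablespaces, each backed by a datafile on a different physical disk. In Oracle a non-partitioned table lives in a single tablespace, so this means partitioning the table and placing the partitions in different tablespaces.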

The same stripe technique also speeds up reading.


basic oracle instance tuning tips #Mithun

You can turn on tracing for a given session. All queries will be traced. You can also trace across all sessions.

Oracle provides dynamic performance stats in the form of so-called V$ views, but I think the trace files are more detailed.

Another common technique is to record a timestamp at each hop:

– client log – issuing query
– server log – receiving query
– server log – returning data
– client log – receiving data
– client log – sending out data to requester, which could be in the same or another process.

Latency is additive. Find the largest component.
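A small sketch of that breakdown, assuming the five timestamps have already been pulled out of the logs and that the client and server clocks are synchronized. The timestamps and event labels below are invented for illustration.

```python
from datetime import datetime

# Hypothetical timestamps for one request, in the order listed above.
stamps = [
    ("client issues query",   "2024-01-01 09:00:00.000"),
    ("server receives query", "2024-01-01 09:00:00.020"),
    ("server returns data",   "2024-01-01 09:00:00.350"),
    ("client receives data",  "2024-01-01 09:00:00.380"),
    ("client sends data on",  "2024-01-01 09:00:00.400"),
]


def hop_latencies(stamps):
    """Return (from_event, to_event, milliseconds) for each consecutive pair."""
    fmt = "%Y-%m-%d %H:%M:%S.%f"
    times = [(name, datetime.strptime(ts, fmt)) for name, ts in stamps]
    return [
        (a_name, b_name, round((b - a).total_seconds() * 1000))
        for (a_name, a), (b_name, b) in zip(times, times[1:])
    ]


hops = hop_latencies(stamps)
slowest = max(hops, key=lambda hop: hop[2])
print(slowest)
```

With these made-up numbers the hops add up to the 400 ms end-to-end latency, and the server-side step dominates, so that is where tuning effort should go first.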

simple snoop table to monitor DB access

create table snoop (
 charp1 VARCHAR(16384) DEFAULT  ''  NULL,
 charv1 VARCHAR(16384) DEFAULT  ''  NULL,
 charp2 VARCHAR(16384) DEFAULT  ''  NULL,
 charv2 VARCHAR(16384) DEFAULT  ''  NULL,
 nump1 VARCHAR(99)     DEFAULT  ''  NULL,
 numv1 FLOAT           DEFAULT  0   NULL,
 datep1 VARCHAR(99)    DEFAULT  ''  NULL,
 datev1 DATETIME                    NULL, /* used by the insert below; the DATETIME type is assumed */
 sproc VARCHAR(99)     DEFAULT  ''  NULL,
 remark VARCHAR(99)    DEFAULT  ''  NULL
)

/* one way to use this table is to save multiple params when calling a proc
insert snoop(charp1,charv1,nump1,numv1,datep1,datev1,sproc)
      values('param1',?, 'param2',?,  'param3',?, 'myProc')
*/

SQL code generation – ETL^ORM

Hibernate generates SQL, and the generated SQL is often sub-optimal. I now think ETL tools probably avoid SQL code generation, for efficiency.

Suppose your legacy app has business logic in a query or sproc. ETL tools often emulate the same business logic outside the database, and often at much higher throughput.

[practically]proc to return 0-row, null or default value

We often write lookup procedures to return a single joined record.
It is better to distinguish between the scenarios below. The same stored proc can
– return 0 rows
– return a special value to indicate 0 rows
– return a null value for a field
– return a default value
If possible, I generally avoid returning null values, because they
require extra parsing in java. Besides, a null value can be a consequence
of many scenarios, so it is ambiguous.
If the one and only select in the proc simply selects a bunch of
variables, then 0-row won’t happen. So how do you indicate 0-row? This is a
very common scenario. I often use a @rowct variable that is updated by the
earlier table selects. In this context, we can also put special values
into other fields to indicate 0-row.
If you want the caller to tell 0-row, null and default value apart when
all three scenarios are possible,
– then choosing a default value can be tricky
– null can be tricky because in the @rowct case, a lot of fields of the
last select might be null.
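The @rowct idea can be sketched on the caller side as below. This is not a stored proc: it is a Python emulation over sqlite3, and the `emp` table, the `LookupResult` type and the `lookup_email` function are hypothetical. The lookup always hands back exactly one result, and a separate row count tells the caller whether a null field means "no row" or "row found, column is null".

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass
class LookupResult:
    rowct: int            # 0 means no matching row
    email: Optional[str]  # None with rowct == 1 means the column is NULL


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?)", [(1, "a@example.com"), (2, None)]
)


def lookup_email(conn, emp_id):
    """Always return one result; rowct tells the caller what happened."""
    row = conn.execute(
        "SELECT email FROM emp WHERE id = ?", (emp_id,)
    ).fetchone()
    if row is None:
        return LookupResult(rowct=0, email=None)
    return LookupResult(rowct=1, email=row[0])


for emp_id in (1, 2, 3):
    print(emp_id, lookup_email(conn, emp_id))
```

Employee 2 and the missing employee 3 both come back with a null email, and only rowct separates the two cases, which is the ambiguity described above.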

Oracle article on exclusive/shared locks, row/table level locks, non-repeatable-read

Here is a small sample of the knowledge pearls:

– If a transaction obtains a row lock for a row, the transaction also acquires a table lock for the corresponding table. The table lock prevents conflicting DDL operations (like alter table).

– A detailed comparison of read-committed vs serializable isolation levels in Oracle

To my surprise,

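* “Readers of data do not wait for writers of the same data rows”. Under read-committed, the reader thread simply goes ahead and reads the last committed image of the row, which is the BEFORE image if the writer has not yet committed. If isolation is at Serializable, there is still no wait: every query in the transaction sees the data as of the transaction's start, so a re-read returns the same values and a non-repeatable read cannot happen. The error appears only if the serializable transaction then tries to update a row that another transaction changed and committed after it began (ORA-08177, “can't serialize access for this transaction”).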