(A personal blog) We discussed enterprise reporting on a database with millions of new records added each day. Some reflections…
One of my tables had about 10G data and more than 50 million rows. (100 million is kind of minimum to qualify as a large table.) This is the base table for most of our important online reports. Every user hits this table (or its derivative summary tables) one way or another. We used more than 10 special summary tables and Business Objects and the performance was good enough.
With the aid of a summary table, users can modify specific rows in main table. You can easily join. You can update main table using complex SELECT. The most complex reporting logic can often be implemented by pure SQL (joins, case, grouping…) without java. None of these is available to gigaspace or hibernate users. These tools simply get in the way of my queries, esp. when I do something fancy.
In all the production support (RTB) teams I have seen on wall street, investigating and updating DB is the most useful technique at firefighting time. If the reporting system is based on tables without cache, prod support will feel more comfortable. Better control, better visibility. The fastest cars never use automatic gear.
Really need to limit disk I/O? Then throw enough memory in the DB.