
Credibility


JL Computer Consultancy

Credibility without Credentials?

Mar 2005


In a recent thread on AskTom (http://asktom.oracle.com/pls/ask/f?p=4950:8:::::F4950_P8_DISPLAYID:35336203098853), Tom was asked what he thought about an article by Don Burleson that offered readers the opportunity to enter a competition to write a program that could reliably predict tables and indexes that could measurably benefit from reorganization.

The responses to this article raised a number of interesting points. One such point came from Don Burleson himself, who wrote an article for his website titled “Are all Oracle Scientists created equal?” In this article, Don states the following:

Knowing the qualifications of those who proclaim to be Oracle “scientists” is very important.  In this world of fakers and posers, all Oracle professionals need to have strong “BS” radar and a quick Google search can tell you how much weight to give to the assertions of any Oracle “scientist”. Personally, I check for these qualifications:

Experience at Oracle Corporation - Was this person intimately involved with the internal machinations of Oracle at Redwood?  Nobody knows Oracle like the folks who built and maintain it (especially if they have source code!).

Computer Science Background - What is their academic CS background?  Were they good enough to get into a respected university science program?  Were they able to compete effectively for entrance into a competitive graduate school? Does the Oracle “scientist” have a Masters or Doctorate degree in science?

Computer Science Research – I quickly check the ACM and IEEE archives to locate all research in any scientific journals, and use Google to find all presentations and whitepapers at computer science conventions.

And follows it up with:

Now, I’m sure that they are just being modest, but it makes me wonder? Is a publication by an Oracle Scientist the pontification of a Harvard PhD or the ruminations of a high-school dropout?  Frankly, I’d like to know.

I would like to suggest that such checks are unnecessary because information that is presented properly tends to carry its own credentials. Either it is good information, in which case the credentials of the supplier do not matter; or it is bad information, in which case top-class credentials won’t rescue it.

However, it has to be recognized that many people go to the Internet to acquire information because they are currently “uninformed”. So how can they determine whether the “information” they find is good or bad when they don’t know what it is they don’t know?  (See the article “The Rumsfeld Box” – 24K pdf).

The first step is to eliminate the bad as quickly as possible – and so I would like to demonstrate that you don’t need to be an expert in a topic to recognize when an article on that topic should be discarded (unless, perhaps, you have plenty of time to spare winnowing out the grains of truth from the pile of chaff). The sample I have chosen to demonstrate this point is an article published in January 2004 by DBAZine with the title "Inside Oracle Indexing", written by Don Burleson.


In paragraph 3 (Bitmap Indexes) we are told that "bitmap indexes are used where an index column has a relatively small number of distinct values". It isn't clear whether this is intended as a piece of advice, or is simply an observation of the way things are often done. If it's advice, it is advice that is far from complete and there are articles on the Internet explaining why (e.g. Understanding Bitmap Indexes – also published by DBAZine). If it's simply an observation, then Don Burleson should have included some comment on the advisability of the practice.
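
As a quick illustration of why the cardinality rule on its own is not enough, here is a minimal sketch (the sales table, its status column and the index name are hypothetical):

    -- Hypothetical example: a classic low-cardinality column.
    create bitmap index sales_status_bix on sales(status);

    -- A bitmap index like this can be a good choice for DSS-style queries that
    -- combine several such indexes, but DML against the table locks whole
    -- ranges of rows covered by each bitmap chunk, so a table subject to
    -- concurrent inserts, updates or deletes can suffer severe contention no
    -- matter how few distinct values the column has.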

 

In paragraph 8 (Index fast full scans), we see the statement that "this method does not read the index nodes."  There are two problems with this statement: first, it introduces the term "index nodes" with no explanation – what is the difference, if any, between index blocks and index nodes? Secondly, if we assume that Don Burleson is using the term "index nodes" in place of the term "branch blocks", then it's wrong. Oracle can't "not read" a branch block until it knows it's a branch block – so it's got to read it to discover that it shouldn't have read it. The meaning of the term "index nodes" is not made any clearer when we see the term "deleted leaf nodes" later on in the article.
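
To see what a fast full scan really does, here is a minimal sketch (table, index and hint are hypothetical; the hint is only there to make the point reproducible on a small table):

    -- Hypothetical example: count(*) answered entirely from an index on a
    -- NOT NULL column, typically via an INDEX FAST FULL SCAN.
    create table t (id number not null, padding varchar2(100));
    create index t_id_idx on t(id);

    select /*+ index_ffs(t t_id_idx) */ count(*) from t;

    -- The fast full scan reads the whole index segment in physical block order
    -- using multiblock reads; branch blocks are read along with the leaf blocks
    -- and simply discarded - they are not "not read".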

 

In paragraph 9 (Star Joins), we are told of an index type called the "Star Index". But there's no such thing. There are star joins, which make use of multi-column B-tree indexes, and there are star transformation joins, which make use of multiple bitmap indexes. The difference between Star Joins and Star Transformation Joins was probably the idea Don Burleson was trying to express when he said that a “star index” used to be a ‘single-concatenated index’ and then changed to a ‘bitmap index’ implementation.


Reproducible Code

So far, we’ve seen a couple of things that you might know to be misleading or incorrect. This is the sort of thing that could suggest a "confidence rating" for the article, but only if you happened to know some of the material already. You need a more concrete demonstration of quality (or lack thereof) to help. Sample code can be very revealing.

 

The first major chunk of code in this article creates a table with some 70 columns, and then populates about 50 of those columns from an insert/select statement.

 

You may argue about the benefit of switching to "create table as select" followed by "alter table add {columns}" for this piece of code. It might help to reduce the risk of error, future-proof the code, and take up 120 fewer lines. But that's a highly subjective opinion about style, and Don Burleson may have written the code for reasons of simplicity and clarity rather than anything else. Be very careful about distinguishing between style and content when assessing other people's work.

 

The first surprising anomaly in the code is the presence of the clause:

 

    storage (initial 5K next 5K maxextents unlimited).

 

Whoever wrote the script seems to be unaware of locally managed tablespaces and their relevance to storage clauses. Moreover, the choice of extent size is very unusual.
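
For comparison, a brief sketch of the locally managed alternative (tablespace, datafile and table names are hypothetical) that makes such storage clauses largely redundant:

    -- Hypothetical example: a locally managed tablespace with uniform extents.
    create tablespace test_lmt
        datafile 'test_lmt01.dbf' size 100m
        extent management local uniform size 1m;

    -- The 5K requests below are quietly rounded up: every extent allocated in
    -- this tablespace will be 1MB, whatever the storage clause says.
    create table demo_lmt (n number)
        tablespace test_lmt
        storage (initial 5k next 5k maxextents unlimited);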

 

But even more surprising, consider the update loop that follows the initial insert. Every row in the index_details table is going to have 20 or so columns updated from null. Don Burleson has defined a table with some very specific (and strange) values for the extent sizes, but not considered setting the table's pctfree to something other than the default of 10 in order to avoid row migration.
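
A sketch of the sort of adjustment being suggested (the column list is abbreviated, the figure of 30 is purely illustrative, and the tablespace name is hypothetical):

    -- Hypothetical example: if some 20 columns of every row will be filled in
    -- by a later update, leave more free space in each block at insert time so
    -- that the grown rows do not migrate to other blocks.
    create table index_details (
        owner_name      varchar2(30),
        index_name      varchar2(30)
        -- ... the remaining columns ...
    )
    pctfree 30          -- default is 10
    tablespace users;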

 

You might also note that there is no index on the index_details table, so the pl/sql loop that updates the table is going to have to do a full tablescan for each row updated. Production systems often have a very large number of indexes, so the index_details table could be quite large. Given Don Burleson’s inclusion of all those lines of column names in the create table statement, I doubt if the omission of the appropriate create index statement was simply to save a bit of space in the article – it seems more likely that this piece of code has not been run on a large-scale production system, so Don Burleson has not noticed the potential performance problem. (Of course, I am stating an opinion here, not making an objective statement of fact – were it the only little omission or error in this article, I might be far more generous in my assessment).
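
The omission could be repaired with something like the following (the column names are my guess at the obvious join key, since the original script is not reproduced here):

    -- Hypothetical example: give the PL/SQL loop an access path, so that each
    -- single-row update is an indexed lookup rather than a full tablescan.
    create index index_details_idx
        on index_details(owner_name, index_name);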

 

However, the code does attempt to analyze index…validate structure***  on most of the indexes in the system (and Don Burleson could have mentioned that this is likely to be very resource-intensive). So perhaps the extra cost of migrated rows in, and tablescans of, the index_details table may be an insignificant fraction of the performance overhead of running the code. In passing, a call to the analyze index…validate structure*** command tries to acquire a share lock (mode 4) on the underlying table. If anyone happens to be modifying data in a table as the loop gets to it, the loop will crash with error ORA-00054: resource busy and acquire with NOWAIT specified. Conversely, no-one will be able to modify the data in a table while you are busy validating the structure of any of its indexes. Both these points deserve a mention.
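
For reference, a sketch of the command in question and the behaviour just described (the index name is hypothetical):

    -- Hypothetical example: validate structure walks the whole index and
    -- populates the session-private view index_stats.
    analyze index scott.emp_pk validate structure;

    select name, height, lf_rows, del_lf_rows, used_space, pct_used
    from   index_stats;

    -- While the analyze runs it holds a share (mode 4) lock on the underlying
    -- table, blocking concurrent DML; and if a session already has an
    -- uncommitted change on the table, the analyze fails at once with
    -- ORA-00054: resource busy and acquire with NOWAIT specified.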

 

There is another anomaly in the table creation statement that might catch your eye. The statement creates a table that is supposed to hold one row for every index in the system but one of the column names in that table is partition_name. What's going to end up in this table if we have any (composite) partitioned indexes? The answer is that the index_details table will capture the index_stats figures for the last (sub-) partition examined in each (composite) partitioned index, even though all the (sub-) partitions will be examined in turn by the analyze index…validate structure*** command. The code does not produce the intended result set.
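
The underlying reason is that index_stats never holds more than one row - the figures from the most recent validation. A sketch (the index name is hypothetical):

    -- Hypothetical example: on a partitioned index, validate structure works
    -- through the (sub-)partitions in turn, but index_stats only ever reports
    -- the most recently validated one.
    analyze index scott.sales_idx validate structure;

    select name, partition_name, lf_rows, used_space
    from   index_stats;
    -- returns a single row, describing just the last (sub-)partition examined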

 

At this point, you might like to try testing the code on a database that uses partitioned tables, IOTs, LOBs, nested tables and domain indexes – perhaps leaving a few uncommitted updates in place on some of the tables whilst you run the code. It would be instructive to see how many different features of real systems could result in unmentioned side effects.

 

Moving on to the next piece of SQL – we have a statement that updates a column (num_keys) that doesn't exist in the table, and the subsequent view-creation statement selects two columns (num_keys, sum_key_len) that don't exist in the table. Clearly, there is an alter table add {columns} statement missing. But if you add the columns that meet the needs of the view, one of them will be left blank anyway because the update statement does not reference it.
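
Presumably something like the following was meant to appear between the two statements (the datatypes are my guess):

    -- Hypothetical correction: the columns that the later update and view
    -- depend on have to exist before either statement can work.
    alter table index_details add (
        num_keys        number,
        sum_key_len     number
    );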

 

There is a further error in the update statement. It matches an index owner (a.owner_name) in the index_details table against a table owner (b.table_owner) in the dba_ind_columns view. This probably won't cause a problem for most people, as it is fairly common practice for the indexes on a table to be owned by the owner of the table. But indexes owned by a different schema do happen.
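
The fix is probably just to join index owner to index owner. A sketch of the likely intention (the body of the statement is invented for illustration; only the join columns matter here):

    -- Hypothetical sketch: dba_ind_columns has both index_owner and
    -- table_owner; the index_details row should be matched on the former.
    update index_details a
    set    a.num_keys = (
               select count(*)
               from   dba_ind_columns b
               where  b.index_owner = a.owner_name   -- not b.table_owner
               and    b.index_name  = a.index_name
           );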

 

Is it reasonable to complain that the subsequent drop view and create view statements use different view names? If this were the only error, I think it would be most unreasonable to make any comment about such a trivial typographical error. But how many trivial errors do you allow before you start thinking that perhaps Don Burleson hasn't really created and tested the original code?


Explanations

So far I've pointed out items that fall into two areas – errors which you may be able to identify because you already know the answers, and objective errors in the code.  If you happen to spot a couple of the former or several of the latter, you would be right to treat an article a little cautiously. But errors of detail and code can be forgiven if an article still manages to make some useful comments on strategy. So let's continue the review looking for cogent arguments or good explanations.

 

The paragraph headed Arguments for Index Rebuilding says: "Many Oracle shops schedule periodic index rebuilding, and report measurable improvements …". It would have been nice if Don Burleson had persuaded one such shop to make a few comments about the type of work their system did, what fraction of their indexes they rebuilt, how frequently they rebuilt them, and how they measured the improvement.

 

We are also told that 10g will "include a tool to detect and rebuild indexes that need rebuilding", but in this case we have a reference to an article (a marketing presentation by Oracle) so we can examine the background for this comment by reading the original text. Moreover, the extract quoted shows you exactly the boundaries of the claim made by Oracle, viz “AWR provides the Oracle Database 10g a very good "knowledge" of how it is being used.” [My emphasis]

 

The paragraph headed Arguments against Index Rebuilding says: "Some Oracle in-house experts maintain that Oracle indexes are super-efficient at space re-use and … a b-tree index rarely needs rebuilding". It would have been nice if Don Burleson could have quoted some of the background, or referenced a couple of URLs that explained the assumptions underlying the claim that these experts were making.

 

So, Don Burleson’s quoted arguments about rebuilding indexes are:

 

FOR – some people find that they can get some benefit from rebuilding (some?) indexes and Oracle Corp. is attempting to produce a tool to identify by their pattern and cost of usage those indexes (does that mean only those indexes) that need rebuilding.

 

AGAINST – sometimes you can get performance benefits by rebuilding b-tree indexes. (I am assuming here that the phrase 'an index rarely needs rebuilding' is equivalent to 'some indexes may need rebuilding occasionally'.)

 

It’s not easy to tell the difference between the “for” and the “against” here. So what is Don Burleson really trying to say - what is the strategic content of this document?

 

Don Burleson then gives us some explanations of a few of the statistics Oracle holds on an index:

 

Clustering_factor – the general tenor of this definition is correct (for b-trees) and is close enough to give the right idea.

 

Height – This contains a statement that: ’Once the index nodes have split to a pre-determined maximum level the index will “spawn” into a new level.’  There is no explanation of the word "spawn" and no indication about the way in which the "maximum" is "pre-determined" or what it might be, and the word “level” seems to have two different meanings in one sentence. Moreover, your intuitive response to the word "spawn" may be that Don Burleson is suggesting that leaf blocks have “children” – which is wrong. (If anything, it would be more accurate to say that leaf blocks put themselves up for adoption and change their parents in critical cases.)

 

Blocks – The article suggests this is the number of blocks allocated to the segment – it’s not, it’s the number of blocks in use. (Don Burleson made the same mistake when defining clustering_factor.)  There is a comment here about creating indexes in very large block sizes, with a reference to back it up. Unfortunately the link is to an advertisement for a book published by Don Burleson’s publishing company, rather than the original paper, which I think can be found at: www.embarcadero.com/resources/tech_papers/ResolvingOracleSpace_6_26.pdf . In the original paper, we see only the statistics relating to a particular query, and not the construction of the data or even the query text. The query is described as a range scan on a 1 million row table, although the statistics look like an index fast full scan that never visits the table and does a “no-sort” aggregate. (In fact, Don Burleson recreates the experiment for SearchOracle.com and on his own website http://www.dba-oracle.com/art_so_blocksize.htm, describing the operation as a range scan even though the execution plan shows it to be an index fast full scan.)

 

pct_used – The comparative description is incorrect. Dba_indexes doesn't have a pct_used column, it's dba_tables that has this column; and it is the "link" threshold, not the "unlink" threshold – table blocks go ON to a transaction freelist when they fall below this percentage use.
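
You can check the point straight from the dictionary with a couple of one-line queries (a quick sketch):

    -- dba_tables has a pct_used column; dba_indexes does not.
    select pct_free, pct_used from dba_tables  where rownum = 1;
    select pct_free           from dba_indexes where rownum = 1;
    -- adding pct_used to the second query fails with ORA-00904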


Applied Science

The section entitled "Is there a Criterion for index rebuilding ?" sounds promising. Perhaps here we will get some sort of strategic advice.

 

Alas, no. The section includes a sample report to “analyze” your collected index statistics, some criteria (unrelated to the report) that another DBA (but not necessarily Don Burleson who offers no suggestions of his own in this article) thinks are really good as guidelines for rebuilding indexes, and a couple of questions to address.

 

But look at question 2, which says: "assume I have an index on the last_name column of a 1,000,000 row table and the clustering_factor is 1,000,000 indicating that the rows are in the same sequence as the index".  This contradicts the description of clustering_factor that Don Burleson gave in an earlier section. (Note – the earlier definition was broadly correct, the error is in this section).

 

The paragraph then goes on to suggest that the clustering_factor has an effect on the reusability of space in the index if you do "a bulk delete of all people whose last_name begins with the letter “K”". But if you delete all the people whose last name begins with the letter K, you empty out all the 'K-blocks' in the index irrespective of whether the 'K-people' are densely packed or widely scattered in the table. The reusability of the index leaf blocks is not dictated by the clustering_factor; it’s a consequence of the decision to clear all the leaf blocks in the ‘K’ section of the index.
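
If you want to see this for yourself, here is a sketch of a test you could run (the table, index, data volume and padding are entirely hypothetical):

    -- Hypothetical experiment: whatever the clustering_factor, deleting every
    -- 'K%' row empties the same set of leaf blocks in the last_name index.
    create table people (id number, last_name varchar2(30), padding varchar2(200));

    insert into people
    select rownum, dbms_random.string('U', 6), rpad('x', 200, 'x')
    from   all_objects
    where  rownum <= 10000;
    commit;

    create index people_ln_idx on people(last_name);

    delete from people where last_name like 'K%';
    commit;

    analyze index people_ln_idx validate structure;
    select lf_rows, del_lf_rows from index_stats;
    -- del_lf_rows shows the emptied 'K' section; the space freed in the index
    -- does not depend on how the 'K' rows were arranged in the table.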

 

So let’s look at the report, and see if it matches up with any of the comments that precede it. I don't really think it does. But there is one significant thought you can take from these real-life figures. Look at the 'average height = 4' line from the report, and assume that this actually does get close to representing one of the indexes in the system.

 

It uses 538,000,000 bytes in its leaf blocks, and 113,628 leaf blocks at an 8K block size (the 8K is more or less implied by the average leaf block length). Based on these figures, the average used space in a leaf block is 4,740 bytes, or about 60% efficiency.
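
If the report was derived from index_stats, the same efficiency figure can be approximated directly (a sketch; I am guessing at the columns the original report used, and the 8,000 is just the rough usable space per 8K block assumed in the arithmetic below):

    -- Hypothetical sketch: average bytes of leaf data per leaf block, and a
    -- rough packing percentage, from the figures left behind by a
    -- validate structure call.
    select  lf_blks,
            lf_rows_len,
            round(lf_rows_len / lf_blks)                    avg_leaf_bytes_used,
            round(100 * lf_rows_len / (lf_blks * 8000), 1)  approx_pct_packed
    from    index_stats;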

 

Rebuild this index at 100% packing, and you will need

    113,638 * 4,740 / 8,000 leaf blocks = 67,330 leaf blocks

 

Based on the leaf row count, each entry is about 43 bytes, so we can get about 190 entries per block. Let's track that number up through the branch blocks:

 

    67,330 leaf blocks need               67,330 / 190 = 355 branch blocks in the level above

       355 branch blocks (Level 1) need      355 / 190 =   2 branch blocks in the level above

         2 branch blocks (Level 2) need        2 / 190 =   1 branch block in the level above (the root).

 

After we've packed the index to 100% it will still have height > 3 which means it will have to be rebuilt all over again immediately (according to the criteria that Don Burleson quoted, but doesn’t necessarily think are good. His own data shows the flaw in the argument, but he doesn’t mention it.)

 

Even though the report tells us nothing particularly useful about the set of indexes it was run against, it is interesting to note that one set of hard facts, with a description of how they were obtained, has led to a useful observation. This system probably does have some indexes where the advice to "rebuild any index with a height greater than 3" is definitely suspect. Not only that, the index selected by the bad guideline would be the most expensive index in the system to rebuild.

 

The only solid strategic advice you can get from this section of the article is something you can work out only if you understand how indexes work already; and that piece of advice contradicts a guideline which is in the same section of the article.


Conclusion

Your purpose in reading any article about Oracle is to acquire useful, trustworthy information. You need a process to help you assess the probable quality of the content when you aren't familiar with the topic. This article demonstrates such a process:

 

Use whatever prior knowledge you do have as a benchmark.

Assess the degree of care and thought implied by any code fragments supplied.

Assess whether the quality of explanation tells you anything about the author's coverage of the issue.

Watch out for arguments that substitute opinions or guesses for facts – without marking them as such.

 

Every article you read about Oracle should be put through exactly the same critical process (including this one). A dozen good articles do not make an author infallible. Conversely, one bad article shouldn't make you condemn everything an author produces.

 

Don’t trust information because of the credentials of the source - good information carries its own credentials, bad information usually advertises its own lack of credibility.

 

Finally, if something doesn’t fall very clearly into either camp, then you have to ask yourself whether the inherent value of the ‘possibly sound’ information is sufficient for you to expend some time checking it.


Footnote

The points made in this article were first written for an article I offered to DBAZine in February 2004. The article was rejected. Since Don Burleson was apparently the Senior Technical Editor for DBAZine at the time, and the article listed several errors in his work, I had assumed that he would have seen the article. However, I made the same point in a previous review of Don Burleson's work, and in his reply (see A Review) he stated that he had not been shown my article. He may say the same about the above – which would explain why the article has not been corrected in the last 12 months.

Footnote 2

I have notified Don Burleson by email that this article is on my website, and offered him the opportunity to comment, either on this website or by giving me a URL to a response on his own website.

*** 15th March 2005: analyze index … validate structure;

A note of thanks to Matthew Hayward for pointing out the need to clarify my text. When I first mentioned the locking issue associated with the ‘analyze index … validate structure’, I referred only to the fact that the supplied code sample used the ‘analyze’ command, rather than mentioning the validate option explicitly. It is the validate option of analyze that locks the table. If you are using analyze to collect statistics, this does not lock the table.


