Friday, August 26, 2005

Hit by Lucene Hits

java.io.IOException: Bad file descriptor
 	at java.io.RandomAccessFile.seek(Native Method)
 	at org.apache.lucene.store.FSInputStream.readInternal(FSDirectory.java:415)
 	at org.apache.lucene.store.InputStream.refill(InputStream.java:158)
 	at org.apache.lucene.store.InputStream.readByte(InputStream.java:43)
 	at org.apache.lucene.store.InputStream.readBytes(InputStream.java:57)
 	at org.apache.lucene.index.CompoundFileReader$CSInputStream.readInternal(CompoundFileReader.java:220)
 	at org.apache.lucene.store.InputStream.refill(InputStream.java:158)
 	at org.apache.lucene.store.InputStream.readByte(InputStream.java:43)
 	at org.apache.lucene.store.InputStream.readInt(InputStream.java:73)
 	at org.apache.lucene.store.InputStream.readLong(InputStream.java:96)
 	at org.apache.lucene.index.FieldsReader.doc(FieldsReader.java:59)
 	at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:237)
 	at org.apache.lucene.search.IndexSearcher.doc(IndexSearcher.java:74)
 	at org.apache.lucene.search.Hits.doc(Hits.java:101)

Hits should only be accessed while its corresponding IndexSearcher is open. Accessing Hits after the searcher has been closed may result in the above exception.
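The stack trace shows why: Hits.doc() goes back through IndexSearcher.doc() to the index file on every call, so the results are only as alive as the searcher behind them. The sketch below reproduces that failure mode with the JDK alone. LazyResults is a hypothetical stand-in for Hits and a RandomAccessFile stands in for the searcher; none of this is Lucene code, and the exact exception message depends on the JVM and platform.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

final class LazyResultsDemo {

    /** Hypothetical stand-in for Hits: keeps its reader and fetches values on demand. */
    static final class LazyResults {
        private final RandomAccessFile reader;
        private final int count;

        LazyResults(RandomAccessFile reader, int count) {
            this.reader = reader;
            this.count = count;
        }

        int length() {
            return count;
        }

        // Like Hits.doc(i), every call touches the underlying file.
        int get(int i) throws IOException {
            reader.seek(4L * i); // each int occupies 4 bytes
            return reader.readInt();
        }
    }

    /** Writes the given ints to a temporary file that plays the role of the index. */
    static File writeSample(int... values) {
        try {
            File file = File.createTempFile("lazy-results", ".bin");
            file.deleteOnExit();
            try (RandomAccessFile out = new RandomAccessFile(file, "rw")) {
                for (int value : values) {
                    out.writeInt(value);
                }
            }
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Safe pattern: copy out everything that is needed while the reader is still open. */
    static List<Integer> copyWhileOpen(File file, int count) {
        try (RandomAccessFile reader = new RandomAccessFile(file, "r")) {
            LazyResults results = new LazyResults(reader, count);
            List<Integer> copy = new ArrayList<>();
            for (int i = 0; i < results.length(); i++) {
                copy.add(results.get(i));
            }
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Unsafe pattern: returns true if reading after close() fails with an IOException. */
    static boolean readAfterCloseFails(File file, int count) {
        LazyResults results;
        try {
            RandomAccessFile reader = new RandomAccessFile(file, "r");
            results = new LazyResults(reader, count);
            reader.close(); // analogous to closing the IndexSearcher too early
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        try {
            results.get(0);
            return false;
        } catch (IOException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        File file = writeSample(10, 20, 30);
        System.out.println(copyWhileOpen(file, 3));
        System.out.println(readAfterCloseFails(file, 3));
    }
}
```

The same idea applies to Lucene: either keep the searcher open for as long as the Hits are in use, or copy the needed fields out of each document before closing it.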

The most disturbing thing is that this is mentioned neither in the Lucene Javadoc nor in the "Lucene in Action" book. That guarantees any newcomer will be hit by the Hits issue! (pun intended)


Wednesday, August 24, 2005

Why using Boolean to represent ternary states is a bad idea

The intention of my previous entry was not to advocate using Boolean to represent ternary state, but to document a usage I found while reading some open source code.

Using null to represent a state is technically valid and common practice. Nullable columns are often used in database design, and many APIs return null.

However, people seem to really hate this idea. Why?

I believe the reason is that this usage violates the principle of least surprise. As programmers, we are hardwired to think of a boolean as a binary state. The usage in this case, although valid, goes against human psychology.
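One concrete form the surprise can take in Java 5 and later is auto-unboxing: a caller who treats the Boolean as an ordinary two-state flag gets a NullPointerException when the third state shows up. A minimal sketch, with hypothetical names:

```java
final class BooleanSurpriseDemo {

    /** Written by someone who assumes the flag can only be true or false. */
    static String naiveLabel(Boolean flag) {
        // Auto-unboxing happens here; a null flag throws NullPointerException.
        if (flag) {
            return "yes";
        }
        return "no";
    }

    /** Returns true if the naive caller blows up on the given flag. */
    static boolean naiveCallerFails(Boolean flag) {
        try {
            naiveLabel(flag);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(naiveCallerFails(Boolean.TRUE));
        System.out.println(naiveCallerFails(null));
    }
}
```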

Use Boolean to return ternary state

By using Boolean as the return type, it is possible to represent a ternary state with Boolean.TRUE, Boolean.FALSE, and null.
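A small sketch of the idea follows. The preference map and the method names are made up for illustration; the open source code I was reading is not reproduced here.

```java
import java.util.HashMap;
import java.util.Map;

final class TernaryBooleanDemo {

    /**
     * Hypothetical lookup: Boolean.TRUE means opted in, Boolean.FALSE means
     * opted out, and null means the user was never asked.
     */
    static Boolean optInStatus(Map<String, Boolean> preferences, String user) {
        return preferences.get(user); // Map.get returns null for a missing key
    }

    /** The caller has to handle all three states explicitly. */
    static String describe(Boolean status) {
        if (status == null) {
            return "unknown";
        }
        return status.booleanValue() ? "yes" : "no";
    }

    public static void main(String[] args) {
        Map<String, Boolean> preferences = new HashMap<>();
        preferences.put("alice", Boolean.TRUE);
        preferences.put("bob", Boolean.FALSE);

        for (String user : new String[] {"alice", "bob", "carol"}) {
            System.out.println(user + ": " + describe(optInStatus(preferences, user)));
        }
    }
}
```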


Monday, August 22, 2005

Be wary of "IN" clause

In my current project, some queries share the pattern "where category.id in (:categoryids)". To be "DRY", I implemented a routine that extracts the set of category ids and then passes it to Hibernate's setParameterList(). Nice and clean... and WRONG.

The problem is that the database has a limited buffer for parsing a SQL query, which limits the length of the "IN" clause. The above approach works with a small data set but bombs out when the data gets larger.

The solution is to place the query that extracts the category ids directly in the "IN" clause as a subquery. The result is messy, with lots of string concatenation and special-case handling. But CORRECT.

An example of a leaky abstraction, and a reminder to test with large data sets early on.
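To make the difference concrete, here is a rough sketch of the two query shapes as plain strings. The entity and property names (Item, Category, parent) and the :parentId parameter are invented for the example and are not the project's real mapping.

```java
final class InClauseDemo {

    // Hypothetical base query; the real queries in the project differ.
    static final String BASE_QUERY = "from Item item join item.category category";

    // Hypothetical query that selects the category ids inside the database.
    static final String CATEGORY_IDS_SUBQUERY =
            "select c.id from Category c where c.parent.id = :parentId";

    /** First approach: ids are fetched up front and bound as one long parameter list. */
    static String withParameterList() {
        return BASE_QUERY + " where category.id in (:categoryids)";
    }

    /** Second approach: the id query is embedded, so the id list never leaves the database. */
    static String withSubquery() {
        return BASE_QUERY + " where category.id in (" + CATEGORY_IDS_SUBQUERY + ")";
    }

    public static void main(String[] args) {
        System.out.println(withParameterList());
        System.out.println(withSubquery());
    }
}
```

With the first form, the size of the expanded statement grows with the number of ids bound through setParameterList(). With the second, the statement text stays the same size no matter how many categories match.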


Wednesday, August 17, 2005

com.octo.captcha.service.CaptchaServiceException: no captcha for specified id is found

com.octo.captcha.service.CaptchaServiceException: no captcha for specified id is found
  at com.octo.captcha.service.EhcacheManageableCaptchaService$EhcacheStore.getCaptcha(
    EhcacheManageableCaptchaService.java:909)
  at com.octo.captcha.service.AbstractCaptchaService.getChallengeForID(
    AbstractCaptchaService.java:534)
  at com.octo.captcha.service.image.EhcacheManageableImageCaptchaService.getImageChallengeForID(
    EhcacheManageableImageCaptchaService.java:505)

This is due to a bug in the jcaptcha implementation, as detailed here.

Spending an hour getting Maven running so that I can build from source and fix a bug known since June is not exactly fun at midnight, in "death march" mode.


Sunday, August 07, 2005

No Fluff Just Stuff (PHX) 2005


I went to NFJS in Phoenix last weekend. Just as expected, the conference was well organized and the speakers were great.

What I did not expect was the strong sense of comradeship I felt towards fellow attendees whom I had never met before. When Larry pulled "Core Mac OS X and Unix Programming" out of his bag and raved about bignerdranch.com, I felt connected, as I was planning to start learning Cocoa programming and "Cocoa Programming for Mac OS X" by the same author is on my Amazon wish list. When I noticed that Tim uses a Fisher pen and a Moleskine notebook, I knew instantly that he is a fellow Hipster PDA user, for I have the same setup myself.

Although we are strangers, we are alike. We are all "geeks".

Saturday, August 06, 2005

[Book] JBoss: A Developer's Notebook

JBoss: A Developer's Notebook is a little book that packs in a surprisingly good amount of information.

I find the last chapter, on how to harden a JBoss instance, alone worth the book's price. Do you know that an out-of-the-box JBoss installation exposes

  • the JMX console and web console without protection, so that anybody can remotely shut down the JBoss server?
  • the remote class downloading service, so that anybody can remotely download any file?

If the answer is "No", you probably owe it to yourself and your client to check out the book. Flipping through it for 10 minutes at your local bookstore might save you from having to deal with a security break-in.


Wednesday, August 03, 2005

org.jboss.mq.SpyJMSException

org.jboss.mq.SpyJMSException: Could not store message: 2 msg=1 hard 
NOT_STORED PERSISTENT queue=QUEUE.teema_email_queue priority=4 
lateClone=false hashCode=28390332; - nested throwable: 
(java.sql.SQLException: Io exception: Connection reset)

The above exception is thrown when trying to enqueue an object into a JBoss JMS queue that is backed by an Oracle database.

The cryptic error message is not much help, and neither is Google. Unable to find a stock answer, I started to lay down the facts in order to piece the puzzle together. The first clue is that the error appears only in one particular use case, not in the others. What is the difference, I wondered. One noticeable difference is that the object being enqueued in the failing use case is quite large compared to those in the other use cases. Maybe the size of the object is the deciding factor here. A quick test case confirmed my suspicion.

As I know JBoss saves a JMS message as a blob, this looked like a problem in handling blob data. Googling "oracle blob size" turned up both an explanation and a solution. Apparently the older Oracle thin JDBC drivers require a non-JDBC-compliant way of handling blob data properly. In other words, calling setBlob() doesn't work as expected, and that is exactly what the JBoss JMS implementation appears to be doing. When the data is less than 4000 bytes, Oracle stores it inline, directly in the column, so setBlob() works for small data. For larger data, Oracle stores it as a LOB, and that is when setBlob() breaks and our problem begins. Upgrading to the Oracle driver for 10g, which is compatible with an Oracle 9 database, fixes the problem. According to its release notes, the 10g driver adds direct support for lob/clob/blob at the JDBC level.
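For a quick check of whether a payload is likely to cross that threshold, one option is to measure its Java-serialized size. This is only a rough sketch: it assumes the stored blob is roughly as large as the serialized object, it ignores whatever envelope JBoss adds around the message, and the class and method names are made up for the example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

final class PayloadSizeDemo {

    // Threshold quoted above: data under 4000 bytes is stored inline.
    static final int INLINE_LIMIT_BYTES = 4000;

    /** Returns the number of bytes produced by standard Java serialization. */
    static int serializedSize(Serializable payload) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(payload);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.size();
    }

    /** True if the payload is too large to be stored inline (assumption: size estimate only). */
    static boolean exceedsInlineLimit(Serializable payload) {
        return serializedSize(payload) >= INLINE_LIMIT_BYTES;
    }

    public static void main(String[] args) {
        System.out.println(exceedsInlineLimit("a short message"));
        System.out.println(exceedsInlineLimit(new byte[8000]));
    }
}
```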
