Wednesday, November 5, 2014

Is it a bad practice to catch ‘Exception’?

I recently encountered an organization that religiously enforces the Checkstyle IllegalCatch rule (reference here) and mandates following it for all production code.  This rule seeks to ensure that developers don’t catch Throwable, Exception, or RuntimeException directly. The assertion is that it is always a better practice to catch more specific exceptions.  This rule seems a bit draconian to me, so I did a little surfing to see what others thought.  It seems the question has been debated at length, sometimes heatedly.  In fact, like discussion involving religion or politics, discussion of this issue is sometimes more heated than the strength of the supporting arguments provided.

Common reasons that people answer ‘yes’ to this question and assert that catching Throwable, Exception or RuntimeException is universally a bad practice seem to distill to the following:

  • Developers can’t expect to handle all kinds of exceptions, which is what catching Exception implies (i.e. your code is not worthy).
  • It’s possible that classes up the stack will have a better ability to handle the unexpected exception thrown.
  • Detailed exceptions can often provide additional detail about the problem.
  • Developers routinely handle exceptions poorly, losing information needed to diagnose reported bugs or environmental issues.

It is interesting that a common “exception” (pun intended) to this rule given by the skeptical is that application entry points need to have a global catch.  Presumably the reasoning is that logging the exception to standard error (which is what the JVM does with uncaught exceptions by default) isn’t what’s wanted in most cases.

I am one of the ‘skeptics’ and disagree with the global catch prohibition, at least for Exception and RuntimeException, for several reasons.  Let's treat Throwable as a separate case that I'll address later in the entry.

There is no reliable way to identify the list of specific exceptions a developer can ‘expect’ and code in catch clauses.  Most of us leverage third-party products in our code very frequently.  Unless the products you use throw checked exceptions or take the trouble to document the exceptions they throw in the throws specification for all methods, there’s no way to get a specific list of the exceptions you can reasonably expect from product code.  If you don’t have a list of specific exceptions you can ‘expect’, there’s no reasonable way to code specific catches for those exceptions.

Specific catches violate DRY most of the time.  Most of the time, our response to all types of exceptions is identical.   Either we log the exception or convert it to some type of RuntimeException and re-throw it.  If our response to the most specific exceptions will be identical, coding multiple catches with identical logic just to adhere to the illegal catch rule just creates code bloat.  Yes, as of JDK 1.7, there’s a syntax to help eliminate the code bloat (as illustrated below).  Many of us aren’t on 1.7 in production yet.
JDK 1.7 catch example:
catch (IOException|SQLException ex) {
    logger.log(ex);
    throw new RuntimeException(ex);
}
Pre-JDK 1.7 catch example with DRY violation:
catch (IOException ex) {
    handleException(ex);
}
catch (SQLException ex) {
    handleException(ex);
}

The JDK itself is replete with instances of catching Exception.  In looking for best practice guidance for general coding questions such as this, I consult the JDK source itself.  Presumably the people who write the JDK are an authority on how the language was intended to be used.  If general catches were a bad practice, why are JDK authors doing it routinely? Don’t take my word for it.  Look for references to Exception in Eclipse and start auditing the search results.  Instances of global catches are too numerous to list here.  To save you time, MessageFormat (and many other java.text classes) and CompletedFuture (and many other java.util.concurrent classes) have global catch code.  Instances of catching RuntimeException, Error, and Throwable exist but appear to occur less often.

The JDK itself provides a way to supply handling logic for uncaught exceptions (as of JDK 1.5).  Classes implementing Thread.UncaughtExceptionHandler can be associated with any Thread or ThreadGroup.  Should a thread experience an uncaught exception, the JVM will automatically invoke the handler for the thread (or its thread group) if one was defined.  This feature can be used to supply logic to log exceptions (to somewhere other than the default standard error) or notify administrators in some other way.  Apart from the syntax difference, this really is a type of global catch.  Why provide this feature if it's a 'bad practice' to use it?  Unfortunately, exceptions logged at this level will have little knowledge of the context surrounding why the exception was generated.
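
To make the mechanism concrete, here is a minimal sketch of a per-thread handler.  The class name UncaughtHandlerSketch and the lastUncaught field are assumptions made for illustration; a real handler would call a logging framework or notify administrators rather than store the exception.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Sketch: route a thread's uncaught exceptions to custom logic instead of standard error. */
class UncaughtHandlerSketch {

    /** Last uncaught exception seen by the handler (a stand-in for real logging or alerting). */
    static final AtomicReference<Throwable> lastUncaught = new AtomicReference<>();

    /** Runs the task on a new thread with a handler attached and waits for the thread to end. */
    static void runWithHandler(Runnable task) {
        Thread worker = new Thread(task, "worker-1");
        worker.setUncaughtExceptionHandler((thread, error) -> {
            // Replace with a call to your logging framework or an admin notification.
            lastUncaught.set(error);
        });
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for callers
        }
    }

    public static void main(String[] args) {
        runWithHandler(() -> {
            throw new IllegalStateException("simulated failure");
        });
        System.out.println("Handler received: " + lastUncaught.get());
    }
}
```

Note that the handler only receives the Throwable and the Thread; none of the local state that caused the failure is available at that point.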

Incidentally, it's not uncommon to farm global catches out to products that execute that global catch for us.  I'm thinking of Spring interceptors and handler adapters that are often configured to effectively issue global catches on our behalf and escort users to a standard error page.  Isn't this type of practice effectively a global catch, just with a different syntax? Isn't it hypocritical to configure products to issue global catches, but deliberately prohibit global catches directly?

Global catches combined with adding runtime context information has saved me boat-loads of time. In many cases, handling an exception either means logging that exception or recasting the exception in some way and including a meaningful context that might be useful to developers fixing the issue.  I routinely use ContextedRuntimeException from the Apache Commons Lang product for this purpose.  I’ve provided an illustration of this concept below (from the Apache Doc).

   try {
     ...
   } catch (Exception e) {
     throw new ContextedRuntimeException("Error posting account transaction", e)
          .addContextValue("Account Number", accountNumber)
          .addContextValue("Amount Posted", amountPosted)
          .addContextValue("Previous Balance", previousBalance);
   }

The stack trace output for the ContextedRuntimeException will report context information as well as the root cause.  For most exceptions reported using this technique, I have enough information to diagnose the bug or at least replicate it.  This technique *only* works when the exception and context are caught close to where the exception was generated. Also, it's important to record the original exception as a cause as it may have additional information useful to solving the issue.  Letting the JVM log the exception to standard error by default will not capture the information about the problem being captured here. Yes, it is possible that the additional information captured might not be valuable.  It is better to 'have' and not 'need' than the reverse.
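
For readers without Commons Lang on the classpath, the same idea can be approximated with the JDK alone.  The sketch below is an assumption-laden stand-in, not the Commons Lang implementation: the class ContextWrapSketch and its methods are hypothetical, and the context is simply folded into the message while the original exception is kept as the cause.

```java
/** Sketch: catch close to the failure, add runtime context, keep the original as the cause. */
class ContextWrapSketch {

    /** Hypothetical operation that always fails; stands in for real posting logic. */
    static void post(String accountNumber, long amountCents) {
        throw new ArithmeticException("balance overflow");
    }

    /** Wraps any failure with the values a developer would need to reproduce it. */
    static void postWithContext(String accountNumber, long amountCents) {
        try {
            post(accountNumber, amountCents);
        } catch (Exception e) {
            String context = "Error posting account transaction"
                    + " [accountNumber=" + accountNumber
                    + ", amountCents=" + amountCents + "]";
            throw new RuntimeException(context, e); // original exception preserved as the cause
        }
    }

    public static void main(String[] args) {
        try {
            postWithContext("12345", 2500L);
        } catch (RuntimeException e) {
            System.out.println(e.getMessage());
            System.out.println("Caused by: " + e.getCause());
        }
    }
}
```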

The assertion that developers sometimes handle exception logic (code within a catch block) poorly and swallow information needed to diagnose problems is a fair point.  However, developers can insert poor exception handling logic for specific exceptions as well as for general exceptions.  Prohibiting global catches in all cases does not solve poor exception handling coding issues.  Best practices for exception handling logic is a topic that deserves more in-depth treatment, perhaps in another blog entry.

There are reasons to catch specific exceptions.  Sometimes specific exceptions have additional information that should be captured.  Some exceptions represent business process errors or user errors and not unintended application exceptions. I'm not trying to advise that developers should catch Exception exclusively.  I do believe it should be an option and not outright prohibited.

I would like to see two JDK improvements with exception handling.  Java needs a syntax to list “exceptions” for a global catch.  For instance, if I could easily code a catch that specified that I would like to catch all exceptions except instances of VirtualMachineError (out of memory conditions, JVM internal errors, etc.), the illegal catch rule would be easier to adhere to.  For example, I'd like to be able to specify a catch clause like:
catch all except(VirtualMachineError|LinkageError ex) {
    handleException(ex);
}
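
Until such a syntax exists, the intent can be approximated with an explicit re-throw.  This is only a sketch under assumptions: the class CatchAllExceptSketch and the method runGuarded are hypothetical names, and the returned string stands in for whatever handleException(ex) would do.

```java
/** Sketch: approximate "catch all except VirtualMachineError|LinkageError" with today's syntax. */
class CatchAllExceptSketch {

    /** Runs the task; returns a description of the handled failure, or null if none occurred. */
    static String runGuarded(Runnable task) {
        try {
            task.run();
            return null;
        } catch (VirtualMachineError | LinkageError fatal) {
            throw fatal; // let JVM-level failures propagate untouched
        } catch (Throwable other) {
            return "handled: " + other; // stand-in for handleException(ex)
        }
    }

    public static void main(String[] args) {
        System.out.println(runGuarded(() -> {
            throw new IllegalArgumentException("bad input");
        }));
    }
}
```

The extra catch clause is exactly the boilerplate the proposed syntax would remove.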

It would also help if it were possible to specify a default Thread.UncaughtExceptionHandler at JVM startup and supply notification logic for memory conditions and low-level virtual machine errors.  The default behavior of logging to standard error isn't wanted in many cases.  Note that it is possible to set the default exception handler for the entire JVM via Thread.setDefaultUncaughtExceptionHandler().

Wednesday, May 28, 2014

Handling External Processes in Java made Easy

Handling external processes in Java has never been easy.  The JDK provides ways to start and manage external processes, and they do work, but they are awkward.  If the process hangs for some reason, getting at the input and output streams to figure out what’s going wrong is a pain.
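
For contrast, here is roughly what the JDK-only approach involves.  This is a sketch under assumptions, not code from my project: JdkProcessSketch, Result, and run are hypothetical names, output is redirected to a temporary file so that a hung process cannot block the caller while reading, and the example launches the current JVM's own java binary with -version as a harmless command.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.concurrent.TimeUnit;

/** Sketch: start an external process with the plain JDK, capture its output, enforce a timeout. */
class JdkProcessSketch {

    /** Exit code plus everything the process wrote to standard out and standard error. */
    static final class Result {
        final int exitCode;
        final String output;

        Result(int exitCode, String output) {
            this.exitCode = exitCode;
            this.output = output;
        }
    }

    static Result run(long timeoutSeconds, String... command) {
        Path log = null;
        try {
            log = Files.createTempFile("process-output", ".log");
            Process process = new ProcessBuilder(command)
                    .redirectErrorStream(true)     // merge standard error into standard out
                    .redirectOutput(log.toFile())  // write to a file so waiting never blocks on a pipe
                    .start();
            if (!process.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
                process.destroyForcibly().waitFor();
                throw new IllegalStateException("Timed out; output so far:\n" + read(log));
            }
            return new Result(process.exitValue(), read(log));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for process", e);
        } finally {
            if (log != null) {
                log.toFile().delete(); // best-effort cleanup of the temporary file
            }
        }
    }

    private static String read(Path file) throws IOException {
        return new String(Files.readAllBytes(file), Charset.defaultCharset());
    }

    public static void main(String[] args) {
        String javaBin = Paths.get(System.getProperty("java.home"), "bin", "java").toString();
        Result result = run(30, javaBin, "-version");
        System.out.println("exit=" + result.exitCode);
        System.out.print(result.output);
    }
}
```

Even this small sketch has to deal with stream redirection, timeouts, interruption, and cleanup by hand.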

I ran into this issue recently with my work on the Transform4J project, an open source Java ETL Transformation API, where my test cases need to start and stop databases such as Cassandra and MongoDB.  Getting these test cases to work as far as starting up and shutting down these databases was very irritating and took more time than I would like to admit.  This prompted me to look for an open source product that makes process management easier.  I found one.

The Apache Commons Exec product makes external process management from Java very, very easy.  To illustrate with a simple example, let’s start up a Cassandra database for a series of unit tests.  

Executor executor = new DefaultExecutor();
executor.execute(new CommandLine(cassandraStartupCommand));

In my case, I was having trouble getting Cassandra to start up properly at one point, so I added a StreamHandler to capture any output so I could more easily debug the issue.  By default, this output goes to standard out and standard error (this is configurable).  I just added one line before execution:

executor.setStreamHandler(new PumpStreamHandler());

As it happens, I need to shut the database down after unit tests have completed.  It turns out that this is easy to do.  Cassandra initiates a graceful shutdown when the process is terminated.  To accomplish this with Commons Exec, you need an ExecuteWatchdog.  Adding one to the execution  is relatively simple:

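// The constructor argument is a timeout in milliseconds; the watchdog terminates the process
// once it elapses.  ExecuteWatchdog.INFINITE_TIMEOUT disables the timeout.
ExecuteWatchdog watchdog = new ExecuteWatchdog(5000);
executor.setWatchdog(watchdog);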

When the unit tests are complete, terminating the process is simple:
watchdog.destroyProcess();

Note that there are additional features in this API that look useful (but that I didn’t happen to need for my unit tests).  With an ExecuteResultHandler, you can optionally throw an exception should the sub-process fail; I may incorporate this into my testing process.  You can very easily associate a ProcessDestroyer with the execution to terminate the executing process via shutdown hook when your JVM terminates.

I wish I had found this product sooner; it’s been around for five years or so.  Those looking for a more in depth introduction should check out the tutorial here.

Thursday, March 20, 2014

A JSON Library Evaluation for Java EE applications

I need to enhance a set of Java EE applications to support mobile development projects.   As mobile developers seem to prefer JSON formats to XML for passing data to/from mobile devices, I needed to evaluate current Java JSON libraries for use by the supporting Java EE applications.  The client prefers open source products to vended products.  The most prevalent open source product choices seem to be the following:

  • Gson
  • Jackson
  • JSR 353 reference implementation
  • Json.org

This evaluation was conducted in March, 2014.  The evaluation was conducted using the methodology described in chapter 13 of the Java EE Architect's Handbook.

Evaluation Criteria

As with all product evaluations, it’s important to establish the criteria by which product choices will be graded and on which a product decision will be made.  The criteria used in such a product choice would obviously vary per project and organization.  I’ve settled on the following criteria:

  • Level of community activity - measured by the average number of releases per year.  Active projects are more likely to be enhanced in future.  Should the product not be maintained and become obsolete, it may need to be replaced in future and may incur additional costs of ownership.
  • Market share – measured by the number of downloads as a proxy.  I would prefer direct market share information instead of a proxy, but that isn’t usually available for open source products. Solutions for problems and issues with the product are more likely to be posted on the web for more popular products; such increased posting speeds development and indirectly lowers cost of ownership.  
  • License – The license for the product must conform to a specified list of legal requirements; most common open source licenses are acceptable.
  • Ease of Use – subjective measurement based on code samples and the quality of documentation.   Ease of use features greatly speed development and indirectly lower the cost of ownership.  
  • Performance – measured by speed and required memory footprint.  Lower footprints lower hardware requirements and lower cost of ownership.   The same test case was used for all products so that performance can be more easily compared.
  • Versioning Support – I expect that as features are added, existing JSON formats will be changed and enhanced.  With a mobile application distributed to the public, forcing upgrades to accommodate JSON format changes isn’t feasible.  A JSON library that provides support for legacy formats is preferred.  It’s possible that a versioning solution will involve additional products and not solely reside within the JSON library itself.
Not all criteria are necessarily weighted equally; weights given to each item will vary per project and enterprise.  It’s also possible that individual projects might have additional criteria not listed here.

Evaluation Results and Product Ranking

The evaluation criteria were prioritized by the needs of my upcoming project.  It's possible that your priorities might be different.  All criteria were assigned a numerical score from one to ten with ten being the best.

Scores for the criteria were weighted by priority.  High priority scores were multiplied by 2.  Medium priority scores were multiplied by 1.5.  Low priority scores were left alone.  Weighted scores for all criteria were then added together for the overall ranking.  Reasons for the ratings given are discussed in more detail in the product observations section below.
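
The weighting arithmetic can be sketched as follows.  The class WeightedScoreSketch and the sample scores in main are hypothetical and are not the ratings from this evaluation; only the multipliers (2, 1.5, and 1) come from the description above.

```java
/** Sketch: weight 1-10 criterion scores by priority and sum them into an overall ranking score. */
class WeightedScoreSketch {

    enum Priority {
        HIGH(2.0), MEDIUM(1.5), LOW(1.0);

        final double weight;

        Priority(double weight) {
            this.weight = weight;
        }
    }

    /** Applies the priority multiplier to a single criterion score. */
    static double weighted(int score, Priority priority) {
        if (score < 1 || score > 10) {
            throw new IllegalArgumentException("Score must be between 1 and 10: " + score);
        }
        return score * priority.weight;
    }

    /** Sums the weighted scores; scores[i] is rated at priorities[i]. */
    static double total(int[] scores, Priority[] priorities) {
        if (scores.length != priorities.length) {
            throw new IllegalArgumentException("Each score needs exactly one priority");
        }
        double sum = 0.0;
        for (int i = 0; i < scores.length; i++) {
            sum += weighted(scores[i], priorities[i]);
        }
        return sum;
    }

    public static void main(String[] args) {
        // Hypothetical product: 8 on a high, 6 on a medium, 10 on a low priority criterion.
        int[] scores = {8, 6, 10};
        Priority[] priorities = {Priority.HIGH, Priority.MEDIUM, Priority.LOW};
        System.out.println(total(scores, priorities)); // 16.0 + 9.0 + 10.0 = 35.0
    }
}
```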


Product Observations


Ease of Use

I view this as the most important of the criteria as it most directly lowers costs of ownership.  With both Gson and Jackson, creating and reading JSON formatted data were two-liners.  I’ve provided the source for my prototypes in the reference section below.  With my prototype, there was no need to write custom serialization/deserialization logic; both products do have robust options to do this should it be needed for your projects.  Both products by default handled escaping special characters like quotes and carriage returns.  The prototype source code and all dependencies can be downloaded from here.

One reason Gson slightly edges out Jackson is because it has better documentation.  The Gson documentation is better organized and more concise.  As a result of that documentation, constructing the Gson prototype took a little less time than writing the Jackson prototype.
It’s also notable that Jackson has a clumsy distribution: They need a one-zip download that contains binary jars, source jars, Javadoc jars, license and documentation.  Instead, all of these are downloaded separately and the product is structured so that it’s not obvious which jars you need for your particular project.

Note that coding using the classes provided with the JSR 353 spec takes much longer and it’s much easier to create bugs.  My prototype (I re-coded the same prototype with each of the products) required over 100 lines of code using the JSR 353 construct as opposed to the handful required by either Gson or Jackson.

I did not code a prototype for the Json.org product as it’s not a real product.  There’s no formal distribution bundle.  For distribution, you download class source individually and build it.  There are no unit tests, so you can’t validate your build very easily.  Essentially, product source becomes additional source in your project that you must maintain.  It became clear that this product isn’t a valid option, despite the fact it appeared near the top of internet searches for JSON libraries.

Level of Community Activity

Community activity is important for open source projects as products will become obsolete without care and feeding.  Being part of the Java EE specification means that the JSR 353 specification has or will have support from all the Java EE implementers, both commercial and open source.  It’s hard to compete with that.

It should be noted that both Gson and Jackson typically have multiple releases per year.  Both products are used in several open source products.  Both Gson and Jackson have ample community support and will no doubt be enhanced for some time to come.

Market Share

The number of downloads was only available for one of the four products:  Gson had 143,904 downloads as of March 14, 2014.  Somebody built the binaries for the Java classes on Json.org; that unofficial distribution had 10,821 downloads as of March 14, 2014.  Download statistics for Jackson aren’t published as far as I can find.  For JSR 353, it’s not possible to identify download statistics; it’s not clear that everybody who downloads a Java EE application server with JSR 353 support in it will use that section of the container for JSON processing; they could use either Gson, Jackson, or some other library.

Performance

The performance test was identical for all products.  The test consisted of starting with a complex value object with data and using that object to produce JSON-formatted data.  That JSON formatted data was also read and marshaled into the value objects.  This is a very common usage scenario.  Source for the performance evaluation can be downloaded from here.

Memory was measured by noting consumed memory (total memory less free memory) both before and after the test.  The test was run for 100,000 iterations to make time and memory usage more noticeable.  Results are in the table below.
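
The measurement approach can be sketched as follows.  This is an illustration of the technique described above and not the downloadable test source; FootprintSketch, Measurement, and measure are hypothetical names, and the workload in main is a placeholder for the actual serialization call.

```java
/** Sketch: time a workload and compare consumed memory (total less free) before and after. */
class FootprintSketch {

    /** Elapsed time and change in consumed memory for one measured run. */
    static final class Measurement {
        final long elapsedMillis;
        final long memoryDeltaBytes;

        Measurement(long elapsedMillis, long memoryDeltaBytes) {
            this.elapsedMillis = elapsedMillis;
            this.memoryDeltaBytes = memoryDeltaBytes;
        }
    }

    /** Consumed memory as total memory less free memory. */
    static long usedMemory() {
        Runtime runtime = Runtime.getRuntime();
        return runtime.totalMemory() - runtime.freeMemory();
    }

    /** Runs the workload the given number of times and reports time and memory change. */
    static Measurement measure(Runnable work, int iterations) {
        long memoryBefore = usedMemory();
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            work.run();
        }
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
        // The delta is understated (possibly negative) if a garbage collection ran during the loop.
        return new Measurement(elapsedMillis, usedMemory() - memoryBefore);
    }

    public static void main(String[] args) {
        StringBuilder sink = new StringBuilder();
        Measurement m = measure(() -> sink.append('x'), 100_000); // placeholder workload
        System.out.println(m.elapsedMillis + " ms, " + m.memoryDeltaBytes + " bytes");
    }
}
```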

The JSR353 reference implementation is at the extreme; it was a lot slower than the other two libraries, but appears to have a much smaller footprint.  It’s possible that the JSR test took so long that there was a garbage collection during the test, which means that the memory measurement is understated.

Comparing Gson and Jackson, Gson was faster and had a smaller memory footprint for reading Json data and marshalling that data into value objects.  Jackson was faster and had a smaller memory footprint producing Json data from value objects.

Versioning Support

Gson versioning support uses annotations.   When coding value objects that will correspond to the JSON data read or produced, you use the @Since annotation to record the version in which the field appeared.  When you instantiate Gson to read or produce JSON data, you have the option to specify the version of JSON that will be used.  Any fields from later versions will be ignored.
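
To illustrate the concept without requiring Gson on the classpath, here is a JDK-only sketch of annotation-based field versioning.  It is not Gson's implementation: the @SinceVersion annotation, the Customer class, and fieldsForVersion are hypothetical names invented for this example.  In Gson itself, the corresponding pieces are the @Since annotation and GsonBuilder.setVersion().

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Sketch: select the fields that belong to a given format version using an annotation. */
class FieldVersioningSketch {

    /** Hypothetical marker recording the version in which a field first appeared. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface SinceVersion {
        double value();
    }

    /** Hypothetical value object whose format grew over three versions. */
    static class Customer {
        String name;                            // unannotated: present in every version
        @SinceVersion(1.1) String email;        // added in version 1.1
        @SinceVersion(2.0) String loyaltyTier;  // added in version 2.0
    }

    /** Returns the sorted names of fields visible at the requested version. */
    static List<String> fieldsForVersion(Class<?> type, double version) {
        List<String> names = new ArrayList<>();
        for (Field field : type.getDeclaredFields()) {
            if (field.isSynthetic()) {
                continue; // skip compiler- or tool-generated fields
            }
            SinceVersion since = field.getAnnotation(SinceVersion.class);
            if (since == null || since.value() <= version) {
                names.add(field.getName());
            }
        }
        Collections.sort(names); // reflection does not guarantee field order
        return names;
    }

    public static void main(String[] args) {
        System.out.println(fieldsForVersion(Customer.class, 1.1)); // [email, name]
    }
}
```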

None of the other products appear to have versioning support.

License

All three viable product options have licenses that are acceptable to most organizations.  Gson uses Apache 2.0.  Jackson’s license appears to have changed over time; Jackson was LGPL up to version 2.1 and an Apache license thereafter.  Jackson does not specify clearly which version of the Apache license it uses.  Furthermore, its distribution method doesn’t place a copy of the license in what you download.

Concluding Thoughts

Either Gson or Jackson is a reasonable choice.  Both products are very easy to use for common use cases and have options for customized serialization/deserialization logic if needed; for most uses custom logic won’t be needed.

My choice for my upcoming project is Gson.  I’m lured by the superior documentation and versioning support.

I hope you’ve found this entry helpful.  Thanks for taking time to read it.

References


Saturday, October 6, 2012

Are Commercial J2EE Application Servers worth their cost?


If the market is any indication, the answer is a resounding no! Check out market share research conducted earlier this year and published on Silicon Angle. The two largest commercial application servers, WebSphere and WebLogic, have a whopping 2.17% market share between them. The lion's share of the market is going to open source application servers such as Tomcat and JBoss.
As an architect and developer, I've always thought open source application servers easier to support. In most organizations, access to software vendor support staff is tightly controlled and requires a bureaucratic effort to utilize. Because of this, developers are often left attempting to resolve issues on their own anyway. 
Self serve resources for commercial application servers such as documentation, web postings for similar problems, and searchable bug lists are often not much better than what you find with the open source alternatives. Furthermore, getting to knowledgeable support staff often takes time I don't usually have. Often, my support calls need to get routed from first-level support to second or third level support.
Most problems that start out being blamed as "application server" issues are usually application code defects. Rarely are outage or production defects resolved at the application server level. This makes sense as application server code is usually better and more thoroughly tested than application code.
As a manager, I've never found the "security" of having vendor resources available particularly comforting. It certainly doesn't assuage clients who are experiencing some type of outage or defect they need help with for very long.
Centralize and standardize use of sophisticated software (whether it's commercial or open source) throughout the enterprise. Standardize the use of application server software to the point where all the typical issues are solved once and do not need to be continually revisited. For example, for application servers, I standardize all build and deployment scripts and container configurations. With the exception of memory allocation and port assignments, I standardize other feature usage (e.g. management console usage) so that they are the same for all deployed applications.
These choices are not usually revisited by each application developer or team for each application. When the need for changes arises, these changes are centrally evaluated and those configuration standards changed. The change is then deployed on a planned basis throughout the enterprise. This may seem a bit excessive, but I'd rather developers spend time adding needed software features to applications and better supporting the business rather than low level application server configuration concerns. As a result of this standardization, mysterious problems occurring in some environments and not others rarely happen. I've written more about the benefits of this type of standardization here.
Some organizations see liability benefits to commercial software. For instance, it's another firm to possibly shift blame to should problems and issues arise. Maybe it's where I've worked in the past, but I've never seen blame shifting strategies of this type work over the long term.
Another inference from this market share study is that use of Enterprise Java Beans (EJBs) has largely disappeared. However, that should be a separate discussion.

Thursday, March 15, 2012

Design Tips for Integrating Your Java/J2EE Applications with 3rd Party Software Products


Those of us writing Java/J2EE applications are commonly asked to interface with other applications we don't control. Sometimes, these are other custom applications written and managed by other teams. Sometimes, these are vended applications. Often, these applications are on a different platform (e.g. .Net) and sometimes not even designed to be integrated easily with custom applications. I refer to these types of interfaces as external [application] interfaces. External interfaces like these are usually an unpleasant source of support issues. There are ways Java architects can design external interfaces so that they minimize these support headaches and the resources needed to support them.
The key to minimizing support for external interfaces is insulating your Java applications from them. That is, limit and contain the number of direct dependencies between your Java applications and 3rd party applications. The insulation strategy I usually use is depicted in the graphic below. As an example 3rd party application, let's use a document management system (DMS). This type of product is frequently purchased (or provided open source) and not custom built. Furthermore, there are several DMS vendors, and an organization may at any point want to upgrade the DMS product or change DMS vendors.
Figure 1

Establish a generic operational data store for needed external application data. This data store will be the source of external application data for all your custom Java applications. This means that your Java applications do not need to understand internals of the external application. Your Java applications will not be affected if the external application is upgraded or enhanced. Consider a DMS as an example. DMS product upgrades happen no matter which vendor you choose. Using this strategy, your Java applications will not be affected by product upgrades; only the extracts populating the data store might be.
The operational data store must be vendor neutral. That is, your operational data store should not contain vendor-specific tables or fields. You should be able to populate the operational data store from a different product without changing it. You should be able to upgrade the external product without changing this data store. In the case of a DMS, you might have Document and Document_Type tables. However, no fields or tables should be specific to the DMS you are using.
Only populate data needed by your custom applications. The only purpose of the operational data store is to insulate your custom applications from external product changes and upgrades. To copy data not needed by your applications is just making work for no benefit. You can always enhance the data store if new requirements arrive.
Establish a generic Java API to process actions and information updates. The classes and methods in this API must be vendor neutral so that your Java applications are not affected by product upgrades and changes. Of course, the code supporting the API will need to adapt to changes in the underlying external application. Using a DMS as an example, you might have a generic Document interface with methods like “addDocument()” and the like. This API should be product-neutral.
Record all actions and information updates for the interface API. For example, if the DMS product exposes functionality via web services, I'll record the SOAP request and response texts for each API call. Should a defect be reported, support developers will have information they need to contact the vendor right away.
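
The last two tips can be sketched in code.  Everything named here is hypothetical and invented for illustration: DocumentStore is the vendor-neutral API, InMemoryDocumentStore stands in for a vendor-specific adapter, and RecordingDocumentStore records each call and its outcome so support developers have the details at hand.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: a vendor-neutral document API with a wrapper that records every call. */
class DocumentApiSketch {

    /** Vendor-neutral API that custom applications code against. */
    interface DocumentStore {
        String addDocument(String documentType, byte[] content);
    }

    /** Stand-in for a vendor-specific adapter; a real one would call the DMS product. */
    static class InMemoryDocumentStore implements DocumentStore {
        private int nextId = 1;

        @Override
        public String addDocument(String documentType, byte[] content) {
            return "DOC-" + nextId++;
        }
    }

    /** Records the request and the outcome of each API call before returning to the caller. */
    static class RecordingDocumentStore implements DocumentStore {
        private final DocumentStore delegate;
        private final List<String> callLog = new ArrayList<>();

        RecordingDocumentStore(DocumentStore delegate) {
            this.delegate = delegate;
        }

        @Override
        public String addDocument(String documentType, byte[] content) {
            String request = "addDocument(type=" + documentType + ", bytes=" + content.length + ")";
            try {
                String id = delegate.addDocument(documentType, content);
                callLog.add(request + " -> " + id);
                return id;
            } catch (RuntimeException e) {
                callLog.add(request + " -> FAILED: " + e);
                throw e;
            }
        }

        /** Returns a copy of the recorded calls. */
        List<String> callLog() {
            return new ArrayList<>(callLog);
        }
    }

    public static void main(String[] args) {
        RecordingDocumentStore store = new RecordingDocumentStore(new InMemoryDocumentStore());
        store.addDocument("invoice", new byte[] {1, 2, 3});
        System.out.println(store.callLog());
    }
}
```

Swapping DMS vendors then means writing a new adapter behind DocumentStore; callers and the recording wrapper stay unchanged.
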
I hope you find this strategy useful. As always, your input is welcome.

Sunday, February 19, 2012

Four Tips for Reducing J2EE Application Costs

Much has been made of J2EE application complexity and what managers perceive as high development and support costs. I don't want to spark a religious war over choosing J2EE vs. .Net or LAMP. But, there are ways that managers can minimize J2EE application development and support costs (or decrease them over time if you have a large J2EE investment currently).
Adopt one J2EE web framework and standardize its use for all J2EE applications throughout the enterprise. Web framework product choices are confusing and many. Product choices include Java Server Faces, Spring MVC, and Struts to name a few. There are many articles and blogs that compare and contrast the different framework products; I don't intend to get into this debate. I merely assert that you can decrease costs by standardizing on one web framework choice across the enterprise no matter which framework you choose.
Web frameworks are complex and typically have a high learning curve. Letting web framework choice vary by application causes the following problems:
  • Development staff must become proficient in multiple web frameworks.
  • Managers incur larger burn-in time when re-assigning developers between applications.
  • Managers have a more difficult time finding developers in the labor pool that already know all web frameworks in use.
In addition, when starting new applications, developers typically rehash the arguments as to which web development framework is best. As a manager, you can save money on new developments by taking the choice of web framework products off the table. Similar points can be made for other aspects of J2EE development.
Adopt one persistence framework and standardize its use for all J2EE applications throughout the enterprise. Object-Relational Mapping (ORM) products (e.g. Hibernate, IBATIS) are every bit as complex as web frameworks and present the same issues for the same reasons; I won't repeat the points already made. Even if you don't adopt an ORM product and use native JDBC instead, there are companion products (e.g. Apache Commons DBUtils) that when consistently used throughout the enterprise can greatly speed up development.
Adopt a common technical stack and standardize its use for all J2EE applications throughout the enterprise. I go further than standardizing the web framework and ORM product choices; standardize the entire technical stack and manage it via source code control. This allows economies of scale for supporting processes such as build management and deployment management. One build script and deployment script can be used for all applications. Improvements in the technical stack can be more easily leveraged across the enterprise.
If all applications have a common technical stack, common code can be developed that speeds development and support for all J2EE applications. For instance, it's not uncommon to have a base set of classes to manage database transactions (e.g. commits and rollbacks). Common utilities can also be developed or adopted to provide Ajax capabilities or perform common UI tasks such as error handling.
While I recommend implementing a common technical stack, it does need to evolve over time. I version it (e.g. 1.0, 1.1, 1.2, etc.). If you decide to upgrade to the next version of Hibernate or your web framework product, create a new version for that work. Upgrades to the version of the common technical stack used can be decided and scheduled individually for each application.
Adopt a common instrumentation and error reporting protocol and standardize its use for all J2EE applications throughout the enterprise. J2EE application support developers have common concerns, such as obtaining alerts for exceptions and memory issues, obtaining runtime performance metrics or managing log levels at runtime to investigate reported defects. I typically leverage the open source product Admin4J for this purpose. This provides economies of scale for application support staff as the alerts and capabilities available to support developers are identical for all production J2EE applications.
The underlying principle for all these recommendations is that consistency provides more value than minor incremental improvements one product may provide when compared to another.
Managers be forewarned! Some developers will resist these ideas. All developers have personal preferences with regard to technical product choices. The odds that some developers will not agree with the specific product choices made are high. Developers also may perceive that this standardization limits their freedom. It does; no doubt about it. But it also makes support activities easier and transition to other applications within the same enterprise easier. Developers can still be creative, but that creativity is applied when new business needs arise that aren't provided for in the existing technical stack.
As always, I'm interested in your thoughts on the topic.

Tuesday, January 31, 2012

How to Reduce External Dependencies for your Java Libraries

For people who write or contribute to Java open source products, external dependencies are a blessing and a curse. They are a blessing in that these external dependencies provide needed functionality that shortens development. I couldn't imagine writing code without the benefit of Apache Commons Lang, Commons Collections, Commons IO, Commons BeanUtils, and many more. They shorten development a tremendous amount, but for open source libraries, they also present problems.

The first problem is that external dependencies can cause class conflicts. For example, it's possible that the library you release works perfectly fine under Commons Lang 2.6, but doesn't run properly with Commons Lang 2.1. Yes, you can run your unit tests using previous versions of your dependent products, but there's no guarantee that this will catch everything. Furthermore, it takes time and effort that is often better spent adding new features or enhancements to your library.

The second problem is that new releases of your external dependencies can cause runtime problems in the future. There's no way you can test against unreleased versions of these products. Just because you work fine with Commons Lang 3.1 doesn't mean that you will run properly with upcoming releases. This is also a problem for the users of your library. Typically, web applications have a vast assortment of libraries they depend on, each with its own dependency list. It's possible for these dependency lists to conflict. Yes, there are tools to help you identify these conflicts. Yes, we try to choose dependencies wisely and choose products with a good history of maintaining backward compatibility. But these measures aren't going to completely keep users out of trouble.

With an open source product I'm involved with, Admin4J, we took a different approach. Yes, we leverage other products, but we do so differently. We repackage most of the products we use. That is, we slightly refactor their underlying source to have a unique package structure. For example, Apache Commons Lang's main package is org.apache.commons.lang3. We refactor that so that the package Admin4J relies upon is net.admin4j.deps.commons.lang3. We make no other changes.
The advantages of this approach are the following:
  • We benefit from the functionality provided by these other products.
  • We have a more consistent runtime environment; we don't need to worry about dependency version differences with the versions we develop and test with.
  • Our users don't have to be concerned that our dependency list conflicts with the list of one of their other dependent products.
The disadvantages are the following:
  • We consume additional memory (PermGen space) for additional copies of classes that might already be in the user's classpath.
  • Some products don't work well with this strategy. It didn't work well with Freemarker and Slf4J, so we don't repackage them; we still list these two products as external dependencies.
To give credit where credit is due, we borrowed this technique from Tomcat, which uses it quite successfully. Tomcat uses this technique to utilize Apache Commons Logging and Commons DBCP. The secret sauce to accomplish this refactoring is the replace Ant task. We use Ant to perform the package refactoring, compile the resulting code, and package it either for development or as part of our deployed runtime jar. An excerpt from our build script to illustrate follows:

<!-- Perform package refactoring -->
<replace dir="${temp.src.dir}/net/admin4j/deps/commons" >
     <replacefilter token="org.apache.commons.lang3"
           value="net.admin4j.deps.commons.lang3" />
     <replacefilter token="org.apache.commons.mail"
           value="net.admin4j.deps.commons.mail" />
     <replacefilter token="org.apache.commons.fileupload"
           value="net.admin4j.deps.commons.fileupload" />
     <replacefilter token="org.apache.commons.io"
           value="net.admin4j.deps.commons.io" />
     <replacefilter token="org.apache.commons.dbutils"
           value="net.admin4j.deps.commons.dbutils" />
     <replacefilter token="org.apache.commons.beanutils"
           value="net.admin4j.deps.commons.beanutils" />
     <replacefilter token="org.apache.commons.collections"
           value="net.admin4j.deps.commons.collections" />
     <replacefilter token="org.apache.commons.logging"
           value="net.admin4j.deps.commons.logging" />
</replace>
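
For readers who think in Java rather than Ant, the excerpt above amounts to literal token substitution over the copied source text. The following is only a rough sketch of that idea, not how Ant or Admin4J implements it; the class PackageRelocator and its methods are hypothetical names made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of literal package-token replacement.
class PackageRelocator {

    // Original package prefix -> repackaged prefix, in insertion order.
    private final Map<String, String> relocations = new LinkedHashMap<>();

    PackageRelocator relocate(String fromPackage, String toPackage) {
        relocations.put(fromPackage, toPackage);
        return this;
    }

    // Applies every replacement to the text of one source file.
    String apply(String source) {
        String result = source;
        for (Map.Entry<String, String> entry : relocations.entrySet()) {
            result = result.replace(entry.getKey(), entry.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        PackageRelocator relocator = new PackageRelocator()
                .relocate("org.apache.commons.lang3", "net.admin4j.deps.commons.lang3")
                .relocate("org.apache.commons.io", "net.admin4j.deps.commons.io");

        String original = "import org.apache.commons.lang3.StringUtils;";
        System.out.println(relocator.apply(original));
    }
}
```

Because the substitution is purely textual, it rewrites package declarations, imports, and fully qualified references alike. That is also why the technique can break down for products that look up their own classes by name in ways a text replacement does not cover.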

For those of you who write or contribute to open source libraries, I'm interested in any other strategies you might have encountered and how they worked out.