The Cloud Architect's Blog: 2012

Saturday, October 6, 2012

Are Commercial J2EE Application Servers worth their cost?

If the market is any indication, the answer is a resounding no! Check out market share research conducted earlier this year and published on Silicon Angle. The two largest commercial application servers, WebSphere and WebLogic, have a whopping 2.17% market share between them. The lion's share of the market is going to open source application servers such as Tomcat and JBoss.

As an architect and developer, I've always thought open source application servers easier to support. In most organizations, access to software vendor support staff is tightly controlled and requires a bureaucratic effort to utilize. Because of this, developers are often left attempting to resolve issues on their own anyway.

Self-serve resources for commercial application servers, such as documentation, web postings for similar problems, and searchable bug lists, are often not much better than what you find with the open source alternatives. Furthermore, getting to knowledgeable support staff often takes time I don't usually have. Often, my support calls need to get routed from first-level support to second- or third-level support.

Most problems that start out being blamed on the application server turn out to be application code defects. Rarely are outages or production defects resolved at the application server level. This makes sense, as application server code is usually better and more thoroughly tested than application code.

As a manager, I've never found the "security" of having vendor resources available particularly comforting. It certainly doesn't placate clients for very long when they are experiencing an outage or defect they need help with.

Centralize and standardize use of sophisticated software (whether it's commercial or open source) throughout the enterprise. Standardize the use of application server software to the point where all the typical issues are solved once and do not need to be continually revisited. For example, for application servers, I standardize all build and deployment scripts and container configurations. With the exception of memory allocation and port assignments, I standardize other feature usage (e.g. management console usage) so that it is the same for all deployed applications.

These choices are not usually revisited by each application developer or team for each application. When the need for changes arises, these changes are centrally evaluated and the configuration standards changed. The change is then deployed on a planned basis throughout the enterprise. This may seem a bit excessive, but I'd rather developers spend time adding needed software features to applications and better supporting the business than on low-level application server configuration concerns. As a result of this standardization, mysterious problems occurring in some environments and not others rarely happen. I've written more about the benefits of this type of standardization here.

Some organizations see liability benefits to commercial software. For instance, it's another firm to possibly shift blame to should problems and issues arise. Maybe it's where I've worked in the past, but I've never seen blame-shifting strategies of this type work over the long term.

Another inference from this market share study is that use of Enterprise Java Beans (EJBs) has largely disappeared. However, that should be a separate discussion.

Thursday, March 15, 2012

Design Tips for Integrating Your Java/J2EE Applications with 3rd Party Software Products

Those of us writing Java/J2EE applications are commonly asked to interface with other applications we don't control. Sometimes, these are other custom applications written and managed by other teams. Sometimes, these are vended applications. Often, these applications are on a different platform (e.g. .Net) and sometimes not even designed to be integrated easily with custom applications. I refer to these types of interfaces as external [application] interfaces. External interfaces like these are usually an unpleasant source of support issues. There are ways Java architects can design external interfaces so that they minimize these support headaches and the resources needed to support them.

The key to minimizing support for external interfaces is insulating your Java applications from them. That is, limit and contain the number of direct dependencies between your Java applications and 3rd party applications. The insulation strategy I usually use is depicted in the graphic below. As an example 3rd party application, let's use a document management system (DMS). This type of product is frequently purchased (or provided open source) and not custom built. Furthermore, there are several DMS vendors, and there is always a possibility that an organization may want to upgrade the DMS product or change DMS vendors.

Figure 1

Establish a generic operational data store for needed external application data. This data store will be the source of external application data for all your custom Java applications. This means that your Java applications do not need to understand internals of the external application. Your Java applications will not be affected if the external application is upgraded or enhanced. Consider a DMS as an example. DMS product upgrades happen no matter which vendor you choose. Using this strategy, your Java applications will not be affected by product upgrades; only the extracts populating the data store might be.

The operational data store must be vendor neutral. That is, your operational data store should not contain vendor-specific tables or fields. You should be able to populate the operational data store from a different product without changing it. You should be able to upgrade the external product without changing this data store. In the case of a DMS, you might have Document and Document_Type tables. However, no fields or tables should be specific to the DMS you are using.

Only populate data needed by your custom applications. The only purpose of the operational data store is to insulate your custom applications from external product changes and upgrades. To copy data not needed by your applications is just making work for no benefit. You can always enhance the data store if new requirements arrive.
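As a rough illustration of the data store tips above, the vendor-neutral rows and the extract that populates them might be modeled as follows. Every name here (DocumentRow, DocumentExtract, AcmeDmsExtract, the "Acme" product and its column names) is hypothetical and exists only for this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** One row of a vendor-neutral Document table; no vendor-specific fields. (Hypothetical shape.) */
final class DocumentRow {
    final String documentId;
    final String documentType;
    final String title;

    DocumentRow(String documentId, String documentType, String title) {
        this.documentId = documentId;
        this.documentType = documentType;
        this.title = title;
    }
}

/** An extract turns one product's records into neutral rows; only extracts know the vendor. */
interface DocumentExtract {
    List<DocumentRow> extract(List<Map<String, String>> vendorRecords);
}

/** Extract for an imaginary "Acme" DMS whose export uses its own column names. */
final class AcmeDmsExtract implements DocumentExtract {
    @Override
    public List<DocumentRow> extract(List<Map<String, String>> vendorRecords) {
        List<DocumentRow> rows = new ArrayList<>();
        for (Map<String, String> record : vendorRecords) {
            // Copy only the fields the custom applications need; ignore everything else.
            rows.add(new DocumentRow(
                    record.get("ACME_DOC_NO"),
                    record.get("ACME_CLASS"),
                    record.get("ACME_NAME")));
        }
        return rows;
    }
}
```

Changing DMS vendors would then mean writing a new DocumentExtract implementation; DocumentRow and the applications reading it would stay as they are.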

Establish a generic Java API to process actions and information updates. The classes and methods in this API must be vendor neutral so that your Java applications are not affected by product upgrades and changes. Of course, the code supporting the API will need to adapt to changes in the underlying external application. Using a DMS as an example, you might have a generic Document interface with methods like “addDocument()” and the like. This API should be product-neutral.
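A minimal sketch of what such a product-neutral API could look like is below. Only addDocument() comes from the text above; the DocumentService name, the getContent() method, the parameter lists, and the in-memory stand-in are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Vendor-neutral document API; callers never see product classes. (Hypothetical.) */
interface DocumentService {
    /** Stores a document and returns a product-neutral identifier. */
    String addDocument(String documentType, String title, byte[] content);

    /** Returns the stored content, or null if the id is unknown. */
    byte[] getContent(String documentId);
}

/** Stand-in implementation; a real one would call the DMS product behind this class. */
final class InMemoryDocumentService implements DocumentService {
    private final Map<String, byte[]> contentById = new HashMap<>();
    private int nextId = 1;

    @Override
    public String addDocument(String documentType, String title, byte[] content) {
        String id = "DOC-" + nextId++;
        contentById.put(id, content.clone());
        return id;
    }

    @Override
    public byte[] getContent(String documentId) {
        byte[] content = contentById.get(documentId);
        return content == null ? null : content.clone();
    }
}
```

Application code depends only on DocumentService, so a product upgrade or vendor change is confined to the implementing class.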

Record all actions and information updates for the interface API. For example, if the DMS product exposes functionality via web services, I'll record the SOAP request and response texts for each API call. Should a defect be reported, support developers will have information they need to contact the vendor right away.
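One way to record every API call without touching each implementation is a JDK dynamic proxy wrapped around the interface. This is a sketch under assumptions, not how I capture SOAP texts in practice: the DocumentApi interface and the CallRecorder class are hypothetical, and the "log" is just an in-memory list of strings.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.Arrays;
import java.util.List;

/** Hypothetical vendor-neutral API used only for this illustration. */
interface DocumentApi {
    String addDocument(String title);
}

/** Wraps any interface-based API and records each call with its arguments and outcome. */
final class CallRecorder implements InvocationHandler {
    private final Object target;
    private final List<String> log;

    private CallRecorder(Object target, List<String> log) {
        this.target = target;
        this.log = log;
    }

    /** Returns a proxy that behaves like target but appends one log entry per call. */
    @SuppressWarnings("unchecked")
    static <T> T wrap(Class<T> api, T target, List<String> log) {
        return (T) Proxy.newProxyInstance(
                api.getClassLoader(), new Class<?>[] {api}, new CallRecorder(target, log));
    }

    @Override
    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
        String call = method.getName() + Arrays.toString(args);
        try {
            Object result = method.invoke(target, args);
            log.add(call + " -> " + result);
            return result;
        } catch (InvocationTargetException e) {
            // Record the failure, then rethrow the original exception to the caller.
            log.add(call + " !! " + e.getCause());
            throw e.getCause();
        }
    }
}
```

A caller would obtain the recorded version with CallRecorder.wrap(DocumentApi.class, realApi, log) and use it exactly like the real implementation.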

I hope you find this strategy useful. As always, your input is welcome.

Sunday, February 19, 2012

Four Tips for Reducing J2EE Application Costs

Much has been made of J2EE application complexity and what managers perceive as high development and support costs. I don't want to spark a religious war over choosing J2EE vs. .Net or LAMP. But, there are ways that managers can minimize J2EE application development and support costs (or decrease them over time if you have a large J2EE investment currently).

Adopt one J2EE web framework and standardize its use for all J2EE applications throughout the enterprise. Web framework product choices are confusing and many. Product choices include Java Server Faces, Spring MVC, and Struts, to name a few. There are many articles and blogs that compare and contrast the different framework products; I don't intend to get into this debate. I merely assert that you can decrease costs by standardizing on one web framework choice across the enterprise, no matter which framework you choose.

Web frameworks are complex and typically have a steep learning curve. Letting web framework choice vary by application causes the following problems:

Development staff must become proficient in multiple web frameworks.
Managers incur larger burn-in time when re-assigning developers between applications.
Managers have a more difficult time finding developers in the labor pool that already know all web frameworks in use.

In addition, when starting new applications, developers typically rehash the arguments as to which web development framework is best. As a manager, you can save money on new developments by taking the choice of web framework products off the table. Similar points can be made for other aspects of J2EE development.

Adopt one persistence framework and standardize its use for all J2EE applications throughout the enterprise. Object-Relational Mapping (ORM) products (e.g. Hibernate, iBATIS) are every bit as complex as web frameworks and present the same issues for the same reasons; I won't repeat the points already made. Even if you don't adopt an ORM product and use native JDBC instead, there are companion products (e.g. Apache Commons DbUtils) that, when consistently used throughout the enterprise, can greatly speed up development.

Adopt a common technical stack and standardize its use for all J2EE applications throughout the enterprise. I go further than standardizing the web framework and ORM product choices; standardize the entire technical stack and manage it via source code control. This allows economies of scale for supporting processes such as build management and deployment management. One build script and deployment script can be used for all applications. Improvements in the technical stack can be more easily leveraged across the enterprise.

If all applications have a common technical stack, common code can be developed that speeds development and support for all J2EE applications. For instance, it's not uncommon to have a base set of classes to manage database transactions (e.g. commits and rollbacks). Common utilities can also be developed or adopted to provide Ajax capabilities or perform common UI tasks such as error handling.
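To make the transaction example concrete, a shared base class of this kind might look like the sketch below. The TransactionTemplate and TransactionalWork names are hypothetical, and the sketch assumes plain JDBC with one Connection per unit of work.

```java
import java.sql.Connection;
import java.sql.SQLException;

/** Work to run inside one database transaction. */
interface TransactionalWork<T> {
    T execute(Connection connection) throws SQLException;
}

/** Hypothetical shared helper: commits on success, rolls back on any failure. */
final class TransactionTemplate {
    private TransactionTemplate() {
    }

    static <T> T run(Connection connection, TransactionalWork<T> work) throws SQLException {
        boolean previousAutoCommit = connection.getAutoCommit();
        connection.setAutoCommit(false);
        try {
            T result = work.execute(connection);
            connection.commit();
            return result;
        } catch (SQLException | RuntimeException e) {
            // Undo partial work, then let the caller see the original failure.
            connection.rollback();
            throw e;
        } finally {
            // Restore the setting the caller handed us.
            connection.setAutoCommit(previousAutoCommit);
        }
    }
}
```

Application code then supplies only the work itself, and the commit and rollback handling is written once for every application on the stack.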

While I recommend implementing a common technical stack, it does need to evolve over time. I version it (e.g. 1.0, 1.1, 1.2, etc.). If you decide to upgrade to the next version of Hibernate or your web framework product, create a new version for that work. Upgrades to the version of the common technical stack used can be decided and scheduled individually for each application.

Adopt a common instrumentation and error reporting protocol and standardize its use for all J2EE applications throughout the enterprise. J2EE application support developers have common concerns, such as obtaining alerts for exceptions and memory issues, obtaining runtime performance metrics or managing log levels at runtime to investigate reported defects. I typically leverage the open source product Admin4J for this purpose. This provides economies of scale for application support staff as the alerts and capabilities available to support developers are identical for all production J2EE applications.
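As one small example of such a shared capability, changing log levels at runtime can be sketched with the JDK's own java.util.logging. This is not Admin4J's API; the RuntimeLogLevels class and the logger names are hypothetical, and a real deployment would expose this through a secured admin page or JMX.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical helper for changing java.util.logging levels while an application runs. */
final class RuntimeLogLevels {
    // Strong references: the JDK holds loggers weakly, so an unreferenced logger
    // (and the level set on it) could otherwise be garbage collected.
    private static final ConcurrentMap<String, Logger> LOGGERS = new ConcurrentHashMap<>();

    private RuntimeLogLevels() {
    }

    /** Sets the level for one logger name, e.g. ("com.example.billing", "FINE"). */
    static void setLevel(String loggerName, String levelName) {
        Level level = Level.parse(levelName); // throws IllegalArgumentException if unknown
        LOGGERS.computeIfAbsent(loggerName, Logger::getLogger).setLevel(level);
    }

    /** Returns the level in effect for a logger, walking up to parent loggers if unset. */
    static Level effectiveLevel(String loggerName) {
        Level level = null;
        Logger logger = LOGGERS.computeIfAbsent(loggerName, Logger::getLogger);
        while (logger != null && level == null) {
            level = logger.getLevel();
            logger = logger.getParent();
        }
        return level == null ? Level.INFO : level;
    }
}
```

Because every application shares the same helper, a support developer raises verbosity for one package the same way everywhere.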

The underlying principle for all these recommendations is that consistency provides more value than minor incremental improvements one product may provide when compared to another.

Managers, be forewarned! Some developers will resist these ideas. All developers have personal preferences with regard to technical product choices. The odds that some developers will not agree with the specific product choices made are high. Developers also may perceive that this standardization limits their freedom. It does; no doubt about it. But it also makes support activities easier and transition to other applications within the same enterprise easier. Developers still have creativity, but it's applied when new business needs arise that aren't provided for in the existing technical stack.

As always, I'm interested in your thoughts on the topic.

Tuesday, January 31, 2012

How to Reduce External Dependencies for your Java Libraries

For people who write or contribute to Java open source products, external dependencies are a blessing and a curse. They are a blessing in that these external dependencies provide needed functionality that shortens development. I couldn't imagine writing code without the benefit of Apache Commons Lang, Commons Collections, Commons IO, Commons BeanUtils, and many more. They shorten development a tremendous amount, but for open source libraries, they also present problems.

The first problem is that external dependencies can cause class conflicts. For example, it's possible that the library you release works perfectly fine under Commons Lang 2.6, but doesn't run properly with Commons Lang 2.1. Yes, you can run your unit tests using previous versions of your dependent products, but there's no guarantee that this will catch everything. Furthermore, it takes time and effort that is often better spent adding new features or enhancements to your library.

The second problem is that new releases of your external dependencies can cause runtime problems in the future. There's no way you can test against unreleased versions of these products. Just because you work fine with Commons Lang 3.1 doesn't mean that you will run properly with upcoming releases. This is also a problem for the users of your library. Typically, web applications have a vast assortment of libraries they depend on, each with its own dependency list. It's possible for these dependency lists to conflict. Yes, there are tools to help you identify these conflicts. Yes, we try to choose dependencies wisely and choose products with a good history of maintaining backward compatibility. But these aren't going to completely keep users out of trouble.

With an open source product I'm involved with, Admin4J, we took a different approach. Yes, we leverage other products, but we do so differently. We repackage most of the products we use. That is, we slightly refactor their underlying source to have a unique package structure. For example, Apache Commons Lang's main package is org.apache.commons.lang3. We refactor that so that the package Admin4J relies upon is net.admin4j.deps.commons.lang3. We make no other changes.

The advantages of this approach are the following:

We benefit from the functionality provided by these other products.
We have a more consistent runtime environment; we don't need to worry about dependency version differences with the versions we develop and test with.
Our users don't have to be concerned that our dependency list conflicts with the list of one of their other dependent products.

The disadvantages are the following:

We consume additional memory (PermGen space) for additional copies of classes that might already be in the user's classpath.
Some products don't work well with this strategy; it didn't work well with Freemarker and Slf4J, so we don't use it for them. We still list these two products as external dependencies.

To give credit where credit is due, we borrowed this technique from Tomcat, which uses it quite successfully. Tomcat uses this technique to utilize Apache Commons Logging and Commons DBCP. The secret sauce to accomplish this refactoring is the replace Ant task. We use Ant to perform the package refactoring, compile the resulting code, and package it either for development or as part of our deployed runtime jar. An excerpt from our build script to illustrate follows:

<replacefilter token="org.apache.commons.lang3"
               value="net.admin4j.deps.commons.lang3" />
<replacefilter token="org.apache.commons.mail"
               value="net.admin4j.deps.commons.mail" />
<replacefilter token="org.apache.commons.fileupload"
               value="net.admin4j.deps.commons.fileupload" />
<replacefilter token="org.apache.commons.io"
               value="net.admin4j.deps.commons.io" />
<replacefilter token="org.apache.commons.dbutils"
               value="net.admin4j.deps.commons.dbutils" />
<replacefilter token="org.apache.commons.beanutils"
               value="net.admin4j.deps.commons.beanutils" />
<replacefilter token="org.apache.commons.collections"
               value="net.admin4j.deps.commons.collections" />
<replacefilter token="org.apache.commons.logging"
               value="net.admin4j.deps.commons.logging" />
</replace>
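For readers less familiar with Ant, the effect of those replace filters on a source file can be approximated in a few lines of Java. This is only an illustration of the substitution, not part of the Admin4J build; the PackageRelocator class is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrates the token replacement that the Ant excerpt performs on source text. */
final class PackageRelocator {
    private final Map<String, String> relocations = new LinkedHashMap<>();

    /** Registers one package rename; returns this so calls can be chained. */
    PackageRelocator relocate(String fromPackage, String toPackage) {
        relocations.put(fromPackage, toPackage);
        return this;
    }

    /** Applies every relocation, in registration order, as a plain substring replacement. */
    String apply(String source) {
        String result = source;
        for (Map.Entry<String, String> entry : relocations.entrySet()) {
            result = result.replace(entry.getKey(), entry.getValue());
        }
        return result;
    }
}
```

Note that this is a plain text substitution, so a token that happens to be a prefix of an unrelated package name would be rewritten too; the tokens in the excerpt above don't overlap that way.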

For those of you who write or contribute to open source libraries, I'm interested in any other strategies you might have encountered and how they worked out.

Monday, January 16, 2012

The Benefits of a Standardized Application Architecture

There is much literature on software architecture and design. Most of that literature focuses on coding patterns and best practices. That is, the literature focuses on an application's internal structure and improving quality at a code level, usually with a single application as the intended scope. In fact, most application architectures are created and deployed for a small number of applications. It's time we looked at the larger picture and considered the benefits of deploying a standardized software architecture across multiple custom applications across the enterprise. Over the past several years, I've had an opportunity to do just that, and observed several benefits that are worth documenting.

Let's start by defining terms so we're all on the same page. I define application architecture as the internal structure of an application. That is, an application architecture details the software products it uses internally and the programming patterns employed. For example, application architecture details whether the application uses a layered architecture or one of the other architecture patterns, the exception, logging, and transaction management strategies used, etc. This is different from “applications architecture” (with an “s”) that is used in some EA circles to describe what business data is produced and consumed by each application in business terms.

Most organizations will standardize the hardware platform and operating system to be used. Most organizations will also standardize the significant vendors involved, such as Microsoft (in the case of .Net) or the J2EE container vendor and relational database to be used. Some organizations will provide base services and methods for securing, deploying and monitoring applications. Many organizations won't standardize much more than that, leaving the application architecture up to individual teams.

Over the past several years, I've had the opportunity to guide the development of a standard application architecture in the Java/J2EE space that is used for over a dozen applications. The application architecture defines the entire technical stack including the web framework and ORM used. The architecture specifies a package structure and a specific software layering paradigm along with coding conventions. This architecture provides a standard method for instrumenting applications for performance, memory, logging, and exception management purposes. It also provides base build and deployment procedures.

This standardization of application architecture provides a consistency between the applications that allows for several benefits:

All applications can easily consume architectural improvements. For example, we developed strategies (with several open source products) to measure performance of each application, provide run-time log-level management, provide memory shortage alerts and low-watermark history, and much more. All custom applications in the enterprise were easily configured to consume these features. This also lowers the price tag should it be necessary to replace one of the products used in the tech stack; the solution and migration procedures are developed once and merely reused for all applications.

It is much easier to switch developers between applications. In most organizations, J2EE applications are different enough that there is a significant burn-in time for new developers or for developers moving from one application to another. As these applications are written in very similar ways, having experience with one application equates to having experience with all of them.

Developer time is optimized in several ways. First, a standard technical stack puts limits on the list of products a developer needs to be current in. With most organizations, the combined technical stack for all applications is normally much larger. Second, development speed for new applications is improved as all basic technical decisions have already been made.

The common architecture leads to a significant base of code that is commonly shared between applications. This implements DRY and keeps developers from having to re-invent the wheel.

The benefits are numerous and have definitely resulted in lower support costs. While I can't divulge exact numbers publicly, the number of support personnel needed for this collection of applications is far lower than at other organizations developing J2EE applications that I'm aware of. It's clear to me that the benefits received stem from the consistency between the applications, not from any specific technical advantages of the individual products used.

There are some challenges to this approach. We've become very careful about what code makes it into the 'common' code base that's shared across the applications. Once it's published and used, the impact of change can be wide-spread. As a result, nothing is published to the 'common' code base until more than one application needs it.

We're a bit slower to consume newly released versions of the open source products used. Any new product release carries the potential for breaking something. We've mitigated this risk by effectively “versioning” the technical stack and common code base so that applications can consume new versions of the architecture at different times.

Some developers consider themselves artists. This type of developer won't like the idea of standardizing the application architecture as it limits their creativity. In this world, creativity (or artistry) comes into play when a new type of technical problem or need surfaces that hasn't already been addressed by the standard application architecture. After several years, the rate at which new technical problems and needs surface has decreased substantially.

While my primary experience with deploying a standard architecture of this type is in the J2EE space, there's every reason to believe that other development platforms would see similar benefits to standardizing the application architecture and deploying it across multiple applications.

I'm curious about your thoughts and experiences. I'm particularly interested if you've experienced benefits or costs to an approach like this that aren't already listed.

Monday, January 2, 2012

With Java: 'private' does not mean private.

It turns out, you can get access to fields and methods in JDK classes declared as 'private' through reflection. I don't recommend this and prefer not to use what I'm about to describe. In fact, I consider what I'm about to describe as an option of last resort.

Recently, I ran into bug 6526376 which describes a memory leak in the JDK that impacts products that use java.io.File.deleteOnExit(). This method tracks the file that will be deleted in a statically defined Set in class java.io.DeleteOnExitHook. This set grows, but never shrinks during the life of the JVM, creating a memory leak.

While this bug is marked as fixed in the latest release of the JDK, upgrading to the latest release wasn't an option. Furthermore, this Set is defined as 'private' with no way to get access to its contents and clear the Set (after doing its work), or so I thought. Upon reading a post from Cedric's weblog, I realized that there was a way to access the files listed in this private Set and perform its work through a batch job.

The secret ingredient for this solution is the following four lines:

import java.lang.reflect.Field;
import java.util.LinkedHashSet;

Class<?> exitHookClass = Class.forName("java.io.DeleteOnExitHook");
Field privateField = exitHookClass.getDeclaredField("files");
privateField.setAccessible(true); // bypass the 'private' access check
LinkedHashSet<String> fileSet = (LinkedHashSet<String>) privateField.get(null); // null because the field is static
// You then have full access to the privately defined Set and can modify its content.

With this particular solution, you do want to take care to synchronize access to the Set and not to delete files that are currently in use. I haven't tried this trick to execute private methods yet, but I see no reason why it wouldn't work. Furthermore, these four lines could easily be abstracted and consolidated into a utility class somewhere and re-used in other situations.
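A sketch of what such a utility class might look like follows, including the private-method case that I haven't tried against the JDK. The PrivateAccess and Vault names are hypothetical, and Vault is a stand-in of my own rather than a JDK class: newer JDKs with the module system may refuse setAccessible on JDK internals unless the package is explicitly opened (for example with --add-opens).

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;

/** Hypothetical utility that consolidates the reflection calls shown above. */
final class PrivateAccess {
    private PrivateAccess() {
    }

    /** Reads a private static field; pass the declaring class and field name. */
    static Object readStaticField(Class<?> owner, String fieldName)
            throws ReflectiveOperationException {
        Field field = owner.getDeclaredField(fieldName);
        field.setAccessible(true);
        return field.get(null); // null receiver because the field is static
    }

    /** Invokes a private static method that takes no arguments. */
    static Object invokeStaticMethod(Class<?> owner, String methodName)
            throws ReflectiveOperationException {
        Method method = owner.getDeclaredMethod(methodName);
        method.setAccessible(true);
        return method.invoke(null);
    }
}

/** Stand-in for a class with private members; not a JDK class. */
final class Vault {
    private static final String secret = "s3cret";

    private static int answer() {
        return 42;
    }
}
```

With this in place, PrivateAccess.readStaticField(Vault.class, "secret") hands back the private value, and the same two calls would work against any class the runtime lets you open.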

I've mixed feelings about this solution. I'm glad that I was able to work around the memory leak with a short amount of code. We would have been stuck recycling our containers periodically, otherwise. I've run into bugs with open source products where this solution could conceivably be used as well.

On the other hand, this breaks encapsulation and leaves developers no way to really guard against unauthorized access to private fields and methods. An architecture principle I try to follow is 'Swim with the stream'. That is, use products as they are intended to be used. This hack definitely breaks this principle. Should we really be able to do this?

There is no question that this is a dangerous tactic. Private methods and fields are effectively unpublished from the perspective of the developers who wrote the code in question. Those developers can and should feel completely free to refactor (e.g. change or even remove) those private methods and fields with the understanding that they aren't directly affecting calling code. These private fields/methods may disappear in future releases of the code being accessed this way or their intended use may be drastically changed. My solution is fragile and will likely break with a future release of the JDK.

It's possible that developers using this tactic to break into sections of code they weren't given access to might not understand the full context of the fields and methods being used; there may be unintended runtime consequences to accessing and changing private data that haven't been considered. In my case, how can I really tell if a File listed in the shutdown hook isn't still being used? I elected to address this issue by not deleting any file until an hour after last update or later, but this isn't fool-proof. It's possible that by deleting these files early and removing the set entry, I've created a derivative bug somewhere that will likely be much harder now to find and fix.
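The one-hour rule itself is simple to express. The sketch below shows only the age check, not my batch job; the StaleFilePolicy name is hypothetical, and treating a missing file as "leave it alone" is an assumption of the sketch.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical guard implementing the "untouched for at least an hour" rule. */
final class StaleFilePolicy {
    static final long MIN_IDLE_MILLIS = TimeUnit.HOURS.toMillis(1);

    private StaleFilePolicy() {
    }

    /**
     * @param lastModifiedMillis value of File.lastModified(); 0 means the file
     *                           does not exist or could not be read
     * @param nowMillis current time, passed in so the rule is easy to test
     */
    static boolean isOldEnoughToDelete(long lastModifiedMillis, long nowMillis) {
        if (lastModifiedMillis <= 0) {
            return false; // unknown or missing file: leave the entry alone
        }
        return nowMillis - lastModifiedMillis >= MIN_IDLE_MILLIS;
    }
}
```

A caller would pass file.lastModified() and System.currentTimeMillis(). As noted above, passing this check doesn't prove the file is no longer in use.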

On further consideration, the creators of the JDK were wise to allow this. It gives us options when dealing with bugs like this. If I've a developer on staff that routinely uses fragile solutions like this, I've got larger issues. Changing the JDK to prevent this type of unauthorized access to private fields/methods won't save me from the other damage a developer like that can cause.
