Saturday, August 27, 2016

A Pleasurable Journey into Text Translation using ANTLR4

For one of my clients, I needed to spec out a REST service API for developers. Although they had standardized on the API Blueprint product, I feel compelled to look for alternatives for speccing out future REST APIs. The input needed for API Blueprint is just too verbose and requires far too much time. If I expected to spend less time speccing out REST APIs, this might not be an issue, or at least not as much of one. I also checked out Swagger and RAML and found the same verbosity issue; I just don't have time for that on an ongoing basis. There's a good review of these products that I found particularly useful here.

My thoughts turned to a syntax for speccing REST APIs that would be far more streamlined and concise. Within an hour, I had a draft of a syntax (example below) that is far less verbose and should increase my productivity significantly per API. The problem would be writing something that could interpret specifications in this format and then generate the verbose XML, Markdown, or YAML syntax for one of the other API designer products mentioned above. It turns out that there are products that specialize in text translation. ANTLR is the most popular of these products right now.

The way ANTLR works is that you specify a grammar that describes the text you want to translate. ANTLR uses that grammar to generate Java code that can take text in that format and interpret it for you in the terms that you specified in the grammar. For instance, I defined a REST Resource block in my grammar and told it about the syntax for different operations and the JSON arguments they accept as input or emit as output. ANTLR-generated code will analyze the specifications I write and convert them to a Java object format that I can more easily read, interpret through code, and use to generate translated output with the help of a templating technology such as Freemarker. An example grammar for the Java language can be found here.

It turns out that specifying a grammar wasn't as easy as I thought it would be going in. What is, in my mind, a simple syntax is actually quite complex when you break it down into constructs that a product like ANTLR can understand. Essentially, you need to specify all whitespace (characters to ignore) and comments if the syntax is to support them. All special keywords, and the rules that govern when those keywords are expected to be used, also need to be specified. At this point, I should have just backed down from this idea and suffered through one of the more verbose solutions. However, I'm now far too interested in how structured text gets specified, and in what's possible by interpreting it through code, to stop.

I'm part way through this project and will open source it once complete.  For those taking similar text translating journeys with ANTLR, I have ferreted out some techniques that helped me immensely.

Write and test the Lexer portion of the grammar first.

ANTLR breaks up grammars into two pieces: a "Lexer" and a "Parser". A "Lexer" identifies which characters and keywords are important for what you're doing and skips any unneeded whitespace. It also packages those characters/keywords internally as "Tokens" so that they can be used for more sophisticated translation later on. A "Parser" applies rules to those tokens to interpret context. For example, a REST resource definition doesn't make sense in the data type structure section of my proposed REST API specification syntax.

As the parser uses lexer output, it's important to make sure the lexer portion of your grammar tests out first. Any testing of the parser before that point is premature. Your lexer tests should assert the following:
  • Make sure all characters and keywords are recognized.
  • Make sure that the lexer identifies characters and keywords correctly. For instance, I had a bug early on where the keyword 'Resource' was recognized as a string literal. In my syntax, 'Resource' has a special context and meaning.

You can test the lexer generated from your grammar by iterating through the Tokens generated. Any unrecognized tokens should cause a test failure. If the lexer doesn't recognize your special characters and keywords (e.g. it doesn't identify the correct number of 'Resource' keywords in your test sample), that should also cause a test failure.
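The two lexer assertions described above can be sketched in plain Java. This is only an illustration: it uses a small hand-rolled, regex-based tokenizer as a stand-in for the lexer ANTLR would generate, and the class name, token types, and keyword list are all hypothetical. With ANTLR itself you would iterate over the tokens produced by the generated lexer and make the same two checks.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hand-rolled stand-in for a generated lexer; every name here is hypothetical. */
final class SpecLexerSketch {

    enum Type { KEYWORD, IDENT, SYMBOL, UNKNOWN }

    static final class Token {
        final Type type;
        final String text;

        Token(Type type, String text) {
            this.type = type;
            this.text = text;
        }
    }

    // Alternatives are tried in order, so keywords must come before identifiers.
    private static final Pattern TOKEN = Pattern.compile(
            "(?<ws>\\s+|//[^\\n]*)"
            + "|(?<kw>\\b(?:Resource|Operation|Types)\\b)"
            + "|(?<id>[A-Za-z_][A-Za-z0-9_]*)"
            + "|(?<sym>[{}:,/\\-])"
            + "|(?<bad>.)",
            Pattern.DOTALL);

    static List<Token> tokenize(String input) {
        List<Token> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            if (m.group("ws") != null) {
                continue; // whitespace and comments are skipped, not tokenized
            }
            Type type = m.group("kw") != null ? Type.KEYWORD
                    : m.group("id") != null ? Type.IDENT
                    : m.group("sym") != null ? Type.SYMBOL
                    : Type.UNKNOWN;
            tokens.add(new Token(type, m.group()));
        }
        return tokens;
    }

    /** Number of tokens of the given type; a test fails if UNKNOWN is non-zero. */
    static long count(List<Token> tokens, Type type) {
        return tokens.stream().filter(t -> t.type == type).count();
    }

    /** Number of times a keyword was recognized as a keyword (not as an identifier). */
    static long countKeyword(List<Token> tokens, String keyword) {
        return tokens.stream()
                .filter(t -> t.type == Type.KEYWORD && t.text.equals(keyword))
                .count();
    }
}
```

A lexer test built on this would tokenize a sample specification, fail if `count(tokens, Type.UNKNOWN)` is non-zero, and fail if `countKeyword(tokens, "Resource")` differs from the number of resources in the sample.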

Write Parser rules iteratively from general rules to more specific rules.

Parser rules apply context to the tokens identified by the Lexer. I found it much easier to start with very general parser rules and get those working. For example, my syntax has two main sections: a bounded context section that describes resources and operations and a Types section that describes all data types used by the API. My first iteration of the parser rules just identified the two sections. That isn't enough to do what I need, but I didn't leave it there. Over time, I specified the portions of both sections and progressively described them in more detail.

In other words, parser rules describe a section of your input text. The first test for parser rules can be simple; just test the start line/column position and end line/column position for each parser rule. If those are correct, then you can describe more specific rules that carve up the larger sections identified in the first iteration. Each parser rule you write has a value object specifically generated for it. That value object has the starting and ending token for the section it covers (you can get the starting and ending positions from those tokens).
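As a rough illustration of that first-iteration test, the sketch below records only where each top-level section starts and ends. It is a hand-rolled brace matcher standing in for a generated parser, so the class and field names are hypothetical. It assumes 1-based lines and 0-based columns, and it does not handle comments or braces inside strings. With ANTLR, the same positions would come from the start and stop tokens of each rule's context object.

```java
import java.util.ArrayList;
import java.util.List;

/** Stand-in for first-iteration parser rules: finds top-level "{ ... }" sections. */
final class SectionSpanSketch {

    static final class Span {
        final String header;
        final int startLine, startCol, endLine, endCol;

        Span(String header, int startLine, int startCol, int endLine, int endCol) {
            this.header = header;
            this.startLine = startLine;
            this.startCol = startCol;
            this.endLine = endLine;
            this.endCol = endCol;
        }
    }

    static List<Span> topLevelSections(String text) {
        List<Span> spans = new ArrayList<>();
        StringBuilder header = new StringBuilder();
        int line = 1, col = 0, depth = 0;
        int startLine = -1, startCol = -1; // -1 means "no section in progress"
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '{') {
                if (depth == 0 && startLine == -1) { // section without a header
                    startLine = line;
                    startCol = col;
                }
                depth++;
            } else if (c == '}') {
                depth--;
                if (depth == 0) { // closing brace of a top-level section
                    spans.add(new Span(header.toString().trim(),
                            startLine, startCol, line, col));
                    header.setLength(0);
                    startLine = -1;
                }
            } else if (depth == 0 && (startLine != -1 || !Character.isWhitespace(c))) {
                if (startLine == -1) { // first visible character of the header
                    startLine = line;
                    startCol = col;
                }
                header.append(c);
            }
            if (c == '\n') {
                line++;
                col = 0;
            } else {
                col++;
            }
        }
        return spans;
    }
}
```

A test for this iteration only needs to assert the number of sections and the start and end positions of each one; finer-grained rules and their tests can follow once those pass.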

There are a few points that aren't obvious about the ANTLR product to remember.
  • Lexer rule names begin with an upper-case letter (UPPER_CASE by convention). Parser rule names begin with a lower-case letter.
  • At least one parser rule should apply to the entire document (minus skipped whitespace).
I'll post additional reports and publish the resulting work via Github when complete.  I'm still midway through this effort.

An Example REST API Specification Syntax

#
#student.spec - Student REST API
#
Bounded Context: Student Information {
Resource: Student // Everything about current and past students
Operation: /student - POST, json, student //Creates a student
httpStatus: 201,400
return: json, studentId
Operation: /student/{studentId} - PATCH, json, student //Update student (only those attributes provided)
httpStatus: 200,400,404,405
Operation: /student/{studentId} - DELETE //Delete student
httpStatus: 200,400,404,405
Operation: /student - GET //Finds a list of students by status
parms:
status - string[] // Status values to search by
httpStatus: 200,400
return: json, student
Operation: /student/{studentId} - GET //Finds a student by their id
httpStatus: 200,400,404
return: json, student
}
Types: {
student {
studentId - required, string // student identifier that uniquely identifies a student
firstName - required, string $$Bill
middleName - string
lastName - required, string $$Williamson
title - enum{Mr, Ms, Mrs}
birthDate - required, date
primaryAddress - required, address
schoolAddress - address
primaryPhone - required, phone
cellPhone - phone
emailAddress
status - enum{Applied, Accepted, Active, NonActive}
}
address {
streetAddress1 - string $$123 Testing Lane
streetAddress2 - string
City - string $$Somewhere
StateCode - string(2) $$IL
zipCode - int(5)
zipCodeExt - int(4)
}
phone {
countryCode - int(2)
areaCode - int
prefix - int
line - int
extension - int
}
classSection {
title - string
discipline - string
courseNbr - int
building - string
room - string
time - string
}
}

Wednesday, July 20, 2016

A Cute Trick with Generics to Avoid Unwanted Casts

I ran into a cute trick while coding a unit test on a Spring application the other day that I'd like to share.

Consider these lines of Java code.
The first line requires a cast. The reason is that the method returns an Object. While the cast isn't the most labor-intensive construct, it makes code less clear and is inconvenient to developers.

Note that the second line does *not* require the cast, making it much more convenient for developers. The secret is that Spring's ReflectionTestUtils uses generics. Take a peek at how invokeMethod is defined.


In effect, Java infers the type of the value returned from the declared type of the variable it is assigned to. In this case, it's a String.
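The trick can be reproduced without Spring. The sketch below is a hypothetical stand-in, not Spring's code: `invokeRaw` returns Object and forces a cast, while `invoke` declares a type parameter `<T>` as its return type, so the compiler infers it from the assignment. The `Greeter` class and both helper names are made up for illustration.

```java
import java.lang.reflect.Method;

final class GenericReturnSketch {

    /** Hypothetical target with a private method, used only for illustration. */
    static final class Greeter {
        private String greet() {
            return "hello";
        }
    }

    /** Non-generic variant: returns Object, so callers must cast. */
    static Object invokeRaw(Object target, String name) {
        try {
            Method method = target.getClass().getDeclaredMethod(name);
            method.setAccessible(true);
            return method.invoke(target);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Generic variant: T is inferred from the assignment at the call site. */
    @SuppressWarnings("unchecked")
    static <T> T invoke(Object target, String name) {
        return (T) invokeRaw(target, name);
    }

    static String demo() {
        Greeter greeter = new Greeter();
        String viaCast = (String) invokeRaw(greeter, "greet"); // cast required
        String inferred = invoke(greeter, "greet");            // no cast needed
        return viaCast + " " + inferred;
    }
}
```

The unchecked cast inside `invoke` is what moves the burden from every caller into one place; the cost is that the compiler can no longer verify the type.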

This is convenient for developers in that there's less to type, and the resulting code is also more easily read.

You should note that the developer is required to know what type of value is returned. The following statement will generate a ClassCastException.
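A minimal, self-contained sketch of that failure mode follows. The `lookup` helper is hypothetical and simply returns its argument through a generic return type. The assignment compiles because T is inferred as Integer, but the value is really a String, so the cast the compiler inserts at the call site fails at runtime.

```java
final class UncheckedCastSketch {

    /** Hypothetical helper: hands back the value typed as whatever the caller asks for. */
    @SuppressWarnings("unchecked")
    static <T> T lookup(Object value) {
        return (T) value;
    }

    static boolean failsWithClassCastException() {
        try {
            Integer wrong = lookup("not a number"); // compiles; T inferred as Integer
            return wrong == null;                   // never reached
        } catch (ClassCastException expected) {
            return true;
        }
    }
}
```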


This might seem like a disadvantage to using generics in this way. I don't think so. In either case, the developer needs to understand the data type that will actually be returned.


Just thought I'd pass this tidbit along. I'll certainly think more about using generics in this way in APIs I write.

Monday, June 13, 2016

Active Conference Presentation Abstracts

Speaking at conferences has become a regular activity for me. To save time for both me and conference organizers, I maintain this list of presentations I'm actively submitting to conferences and user groups. My presentations are updated frequently to reflect current industry changes and events. If you're a conference organizer, I'm always willing to discuss enhancements that will allow one of my presentations to more closely fit your upcoming event.  As always, I'm interested in suggestions for new presentation topics.  

If you're interested in more detailed information about any of these presentations, I always post presentations on my Slideshare. Keep in mind that slide decks are frequently updated, so a deck you see on a future presentation may be different than what's currently posted.


Upcoming Presentations

Note: Presentations currently in development; no slide decks yet available.

Refactoring into Microservices

(60 minutes)
Microservices architecture has become a widely popular topic. Most of us are aware of what microservices are and the problems they are meant to solve. Most microservice implementations were originally monolithic applications that grew too large and complex to support. However, refactoring into microservices is much easier said than done.   
This presentation will provide you with guidance for refactoring a monolithic application into microservices. I'll provide an overview of the entire process along with best practices and common mistakes along the way. This presentation is meant to be platform-generic; you can use these concepts on applications written in any programming language. This presentation is targeted for senior developers and tech leads.

AWS Lambda Deployments:  Best Practices and Common Mistakes

(60 to 75 minutes)
"Serverless" architectures, such as AWS Lambda, Google Cloud Functions, or Azure's serverless compute service, relieve you of hardware and scaling set-up concerns. Organizations such as Netflix, Cmp.LY, and VidRoll are introducing serverless technologies into their technical stacks. This presentation concentrates on the AWS Lambda product as it was the first cloud-based serverless architecture and is leading the trend. However, comparison and contrast with the Google and Azure offerings will also be included.
This presentation will provide you with an overview of AWS Lambda and the product's advantages and disadvantages. I'll include an overview of how to create and deploy Lambdas, providing examples along the way. I'll also include a discussion of best practices and when use of Lambda is appropriate. This presentation is targeted for senior developers and architects.

Microservices Presentations

Microservices for Architects

(60 to 75 minutes)

Given published success stories from Netflix and Amazon, many companies are adopting microservice architecture. This session will provide you with an understanding of what microservices are and what benefits they provide. You should also be made aware of pitfalls and best practices for adopting this approach. I will provide guidance on effective microservice contract design and survey the most important design patterns. I will survey cross-cutting support concerns that all microservices share and point out common mistakes along the way.

This session is targeted at architects and team leads. This session is intended to be platform-generic.

Microservices for Java Architects 

(60 to 90 minutes)

Given published success stories from Netflix and Amazon, many companies are adopting microservice architectures. This session will provide you with an understanding of what microservice architectures are and what benefits they provide. You should also be made aware of pitfalls and best practices for adopting this approach. This session is targeted at Java developers and architects; coding examples will be Java. This session will also provide an overview of tooling that supports microservice architectures such as Docker, Spring Boot, Dropwizard, and a few more.

Writing Microservices in Java: Best Practices and Common Mistakes 

(60 to 90 minutes)

Given published success stories from Netflix and Amazon, many companies are adopting microservice architectures. For organizations that are heavily invested in Java technologies, writing microservices using Java is a natural progression. This session concentrates on best practices for coding microservices using the JVM. An overview of useful coding and deployment patterns will be included that make microservices more resilient and supportable. Tooling useful with implementing these patterns will be highlighted. Along the way, I'll also note common mistakes.
This session is targeted at Java developers and architects; all examples will be Java. I will provide a short definition section about what microservices are to level set.

Cloud Technology Presentations

AWS Lambda for Architects

(60 to 75 minutes)

Lambda is a "serverless" architecture that relieves you of hardware and scaling set-up concerns. Amazon AWS introduced Lambda at AWS re:Invent 2014. Adoption of Lambda has grown exponentially ever since. Netflix, Cmp.LY, VidRoll, and other organizations are introducing Lambda into their technical stacks. In addition, Amazon competitors are working aggressively to introduce competing offerings. Lambda has many uses within applications as well as in a DevOps world.
This presentation will provide you with an overview of AWS Lambda and the product's advantages and disadvantages. I'll include an overview of how to create and deploy Lambdas, providing examples along the way. I'll also include a discussion of best practices and when use of Lambda is appropriate. This presentation is targeted for senior developers and architects.

AWS Lambda for Java Architects

(60 to 75 minutes)

Lambda is a "serverless" architecture that relieves you of hardware and scaling set-up concerns. Amazon AWS introduced Lambda at AWS re:Invent 2014. Adoption of Lambda has grown exponentially ever since. Netflix, Cmp.LY, VidRoll, and other organizations are introducing Lambda into their technical stacks. In addition, Amazon competitors are working aggressively to introduce competing offerings. Lambda has many uses within applications as well as in a DevOps world.
This presentation will provide you with an overview of AWS Lambda and the product's advantages and disadvantages. I'll include an overview of how to create and deploy Lambdas, providing Java examples along the way. I'll also include a discussion of best practices and when use of Lambda is appropriate. This presentation is targeted for senior Java developers and architects. Please note that AWS Lambda is not to be confused with Java 8 Lambda expressions; they are different subjects entirely.


Sunday, June 5, 2016

Making Unit Testing Private Methods Easier

Earlier this year I blogged about testing private methods (here). I noted that the FieldUtils utility from the Apache Commons Lang product has one-line utilities to access private fields, and argued that Lang's MethodUtils utility should have a similar one-line utility to access private methods so that we can more easily test them directly. I explained that this is preferable to over-exposing those methods (e.g. making them protected) merely so they can be tested.

As a result I filed LANG-1195 and contributed an enhancement for MethodUtils so that it can easily invoke private methods for testing. I'm happy to announce that my enhancement has been accepted and committed (Pull 141). Expect to see it in Commons Lang 3.5 when it's released.

Method Additions
As with FieldUtils, when the boolean forceAccess is true, private methods can be seen and executed through reflection. There is no need to set forceAccess to true for methods already visible (e.g. protected or public).

These methods allow you to directly test private methods in unit tests. There's no longer a need for the common practice of making methods protected for the sole purpose of accessing them in unit test code. 

These methods really should only be used in unit test code. From an architectural perspective, production code should not invoke methods that were never designed to be executed outside the context of their class. Some argue that private methods should not be tested directly as they are not exposed. Those of us with clients that have mandated 100% line, branch, and mutation coverage know that this is nearly impossible when private methods can only be tested through the public or protected methods that use them.

Another argument against testing private methods is that it creates a coupling between classes (the test class and the tested class) that was never meant to be. My response is that yes, it does create coupling. However, unit tests are very tightly coupled to their targets anyway. Secondly, refusing to test private methods directly means that some of their code goes untested, as testing it indirectly is too tedious and difficult.

These methods do not benefit from compile time checks. One of the reasons I don't advocate using these methods in production code is that they can't be checked by the compiler. Direct method execution can be checked as to syntax, accessibility, and correct parameter type usage. These methods bypass all of those checks.

Specifically, methods added follow. If you use these, please let me know if you see issues or problems.

Invoke Private Method without Arguments
This is also patterned after the existing method invocation without the forceAccess option.

Method signature
invokeMethod(Object object, boolean forceAccess, String methodName)

This is used to invoke a private method in a class. Yes, the accessibility attribute on the method is temporarily set to true so that the invocation works, but it's set back to its original value after execution. A usage example is below.

Example 1
String result = (String)MethodUtils.invokeMethod(testBean, true, "myPrivateMethod");
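For readers curious about the mechanism, the temporary accessibility change described above can be sketched with plain reflection. This is an illustration of the idea only, not the Commons Lang implementation; `TestBean` and `invokeForced` are hypothetical names, and the sketch handles only no-argument methods.

```java
import java.lang.reflect.Method;

final class ForceAccessSketch {

    /** Hypothetical bean with a private method to invoke. */
    static final class TestBean {
        private String myPrivateMethod() {
            return "secret";
        }
    }

    /** Invokes a no-argument method by name, private or not. */
    static Object invokeForced(Object target, String methodName) {
        try {
            // A freshly looked-up Method object starts out inaccessible.
            Method method = target.getClass().getDeclaredMethod(methodName);
            method.setAccessible(true);
            try {
                return method.invoke(target);
            } finally {
                method.setAccessible(false); // restore the flag after the call
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Note that a misspelled method name here only fails at runtime, which is the lack of compile-time checking discussed earlier.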

Invoke Private Method with Arguments
This is also patterned after the existing method invocation without the forceAccess option.

Method signature
invokeMethod(Object object, boolean forceAccess, String methodName, Object... args)

Example 2

String result = (String)MethodUtils.invokeMethod(
    testBean, true, "privateStringStuff", "Hi There", 5, new Date());



Sunday, April 24, 2016

AWS Lambda Reading List

I'm hearing a lot about AWS Lambda these days and "serverless" architectures. By "serverless", I mean the concept, not the product (here). Basically, AWS Lambda is computing power without management. You provide your code, AWS Lambda runs it. You have minimal setup and no responsibility for maintaining servers on which to run that code.

As with all new technology fads, there's a lot of buzz and a flurry of unorganized content. My intent with this entry is to keep a current list of relevant Lambda articles and categorize them. Hopefully, this will streamline your research if you're looking at using AWS Lambda at your organization.

I intend to update this list as new material comes to my attention. If you see articles that are worthy of mention, please add a comment providing a link. I'll take a look.

Getting Started
These articles provide overview material describing what AWS Lambda is.

  • "It’s Amazon’s way of delivering a microservices framework far ahead of its competitors."
A Walk in the Cloud with AWS Lambda (Slideshare)
  • Provides detailed overview and possible use cases.


What are the Business Benefits?
These articles highlight business benefits provided by AWS Lambda

  • Somewhat biased as it's an Amazon executive describing how they use Lambda internally.
  • “No more glue code. No more servers. Just run your code.”

Flies in the Ointment
These articles highlight limitations and issues with AWS Lambda.


  • "Lambda is a building block, not a tool"
  • "Lambda is not well documented"
  • "Lambda is terrible at error handling"
  • Pay attention to the discussion below. There's some resistance to the error handling points.
Vendor Lock-in
Many fear becoming reliant on AWS Lambda as AWS can raise prices with just a simple edit. The following products propose solutions to mitigate that risk.


Case Studies
This section details use of AWS Lambda in practice.

  • Embedded case study for acloud.guru, an AWS education company.
  • Also has material on the strategic reasons to consider AWS Lambda
AWS Case Study: VidRoll (VidRoll Blog)


Implementation Issues
This section provides technical assistance for specific issues I've had implementing AWS Lambdas.

Running Python with Compiled Code on AWS Lambda (PerryGeo Blog)

  • Getting Python Lambdas up and running is a completely aggravating experience.  This blog helped me quite a lot.
  • In addition to Mathew's wise guidance, make sure you install/compile any binaries on an instance using the officially supported Lambda AMIs (listed here).  Note that you will likely need to install a compiler (e.g. yum install gcc)