

Code snippets and IT and Java - 12 Jan 2008 09:54 am

Acegi – the Java security framework

I suppose you know what Spring Security (the former Acegi project) is, or at least you are aware of how much it can do to secure a Java web application. Besides the many features Acegi offers out of the box, there are plenty of configurable options. This time, my task was to restrict access to method invocations based not only on the roles granted to the logged-in user, but also on the parameters passed to the method. Let’s see the result.

@Secured annotations

Assume that you are developing a pretty common web application with a standard security model: users, granted roles, etc. Of course, the main goal of the security infrastructure is to allow or deny user access to different parts of your application based on the granted access rights. Acegi offers two main ways of controlling access: a URL-based one, which analyzes the client’s HTTP requests to your application, and an AOP-based one, which controls method invocations.

With the URL-based method you can achieve rough security coverage very simply. For instance, if you have an /admin part, you just deny access to URLs matching /admin/** for all users who don’t have the ROLE_ADMIN role. The /store part you can make available only to users with the ROLE_CUSTOMER role, etc. You can cover all your URLs with such rules, but unfortunately, if you want more precise control, you have to write lots of code in your Controllers/Managers which performs additional checks in the method bodies.

Let’s consider a simplified example (no real DAOs, no Hibernate, etc.) consisting of:

  • a controller ShoppingCartAddItemController which is mapped to the /shop/card/addItem.jsp URL,
  • a service interface IShoppingCartManager which provides the business logic,
  • ROLE_USER which is granted to the user Scott:
public class ShoppingCartAddItemController extends AbstractController {

    private IShoppingCartManager shoppingCartManager;

    @Override
    protected ModelAndView handleRequestInternal(HttpServletRequest request,
        HttpServletResponse response) throws Exception {
        Integer customerId = ServletRequestUtils.getIntParameter(request, "customerId");
        Integer itemId = ServletRequestUtils.getIntParameter(request, "itemId");
        Integer amount = ServletRequestUtils.getIntParameter(request, "amount");

        shoppingCartManager.addItem(customerId, itemId, amount);

        return new ModelAndView("redirect:/shop/item/congratulations.jsp");
    }

    public void setShoppingCartManager(IShoppingCartManager shoppingCartManager) {
        this.shoppingCartManager = shoppingCartManager;
    }
}

public interface IShoppingCartManager {

    @Secured({"ROLE_USER" })
    void addItem(Integer customerId, Integer itemId, Integer amount);

    @Secured({"ROLE_USER" })
    void deleteItem(Integer customerId, Integer itemId);

}

public class ShoppingCartManager implements IShoppingCartManager {

    public void addItem(Integer customerId, Integer itemId, Integer amount) {
        // use DAO
    }

    public void deleteItem(Integer customerId, Integer itemId) {
        // use DAO
    }

}

ROLE_USER guarantees that the secured method can be invoked from the controller only if the logged-in user has been granted this role. But in this case, being logged in, I can change the customerId URL parameter to another value and Acegi won’t check it. To prevent this, we need to paste something like the following into every Manager method (or a Controller, or a Filter for every URL which matches the /shop/**/*?customerId=… pattern):

    public void addItem(Integer customerId, Integer itemId, Integer amount) {
         Integer loggedCustomerId = securityManager.getLoggedCustomerId();
         // compare by value: != on Integer objects would compare references
         if (loggedCustomerId == null || !loggedCustomerId.equals(customerId)) {
              throw new AccessDeniedException("access denied");
         }
         // business logic
    }
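
One detail matters when writing such a check by hand: customerId is an Integer object, so == and != compare references rather than numeric values. Below is a minimal sketch of a null-safe comparison by value. CustomerCheck and isSameCustomer are illustrative names of my own, not part of Acegi.

```java
import java.util.Objects;

// Hypothetical helper (not part of Acegi): compares two customer ids by value.
final class CustomerCheck {

    private CustomerCheck() {
    }

    // Objects.equals compares the wrapped values and tolerates nulls,
    // while == and != on Integer objects compare references.
    // A missing logged-in id is treated as "not the same customer".
    static boolean isSameCustomer(Integer loggedCustomerId, Integer requestedCustomerId) {
        return loggedCustomerId != null && Objects.equals(loggedCustomerId, requestedCustomerId);
    }

    public static void main(String[] args) {
        System.out.println(isSameCustomer(1000, 1000));
        System.out.println(isSameCustomer(1000, 1001));
    }
}
```

Treating a null logged-in id as a mismatch is a design choice for this sketch: when in doubt, access is denied.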

The main question is: how can we minimize coding if we know that lots of methods require the same type of access check? The answer is pretty obvious: we need an AOP aspect which would do this routine for us. On the other hand, it would be great if we could reuse the @Secured annotation facilities that are already in place.

Conditional Roles

After brief googling, I caught the first clue. It was a similar question on the Spring framework forum, and eventually I came to the topic which led to the Conditional Roles patch, SEC-273. It was an implementation by Usama Rashwan of an idea by Karl Moore.

The idea is elegant and straightforward. All tricks with method invocations in Acegi can be done by implementing your own voter along the lines of RoleVoter, where you can put your own access decision logic. But writing your own voter for every scenario violates the DRY principle. So it was suggested to write one RoleVoter and to use a scripting language like OGNL or MVEL to write conditions right in the role names, after the special delimiter “::”.

In our case, the new @Secured annotation changes to:

    @Secured({"ROLE_USER::authentication.principal.customerId == arg0" })
    void addItem(Integer customerId, Integer itemId, Integer amount);

where authentication is the SecurityContextHolder.getContext().getAuthentication() object, which usually holds all details of the logged-in user. This object is put into the MVEL context by AbstractInvocationChecker. Besides this object, you can work with the arguments passed to the secured method; here arg0 is the first parameter passed to the method. The MVEL scripting language is very flexible: it can cope with JavaBeans, operations on collections, etc. Of course, you can’t do rocket science computations there, but for our needs it is enough.
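
To make the “::” convention concrete, here is a small sketch of how such an attribute string could be split into the plain role name and the condition script. This is my own illustration and is not taken from the SEC-273 patch; the ConditionalRole class and its fields are hypothetical names, and the actual evaluation of the condition (by MVEL or OGNL) is left out.

```java
// Hypothetical value class, not taken from the SEC-273 patch.
final class ConditionalRole {

    static final String DELIMITER = "::";

    final String role;
    final String condition; // null when the attribute carries no condition

    private ConditionalRole(String role, String condition) {
        this.role = role;
        this.condition = condition;
    }

    // Splits "ROLE_X::expression" at the first delimiter;
    // a plain "ROLE_X" yields a null condition.
    static ConditionalRole parse(String attribute) {
        int idx = attribute.indexOf(DELIMITER);
        if (idx < 0) {
            return new ConditionalRole(attribute.trim(), null);
        }
        String role = attribute.substring(0, idx).trim();
        String condition = attribute.substring(idx + DELIMITER.length()).trim();
        return new ConditionalRole(role, condition);
    }

    public static void main(String[] args) {
        ConditionalRole parsed = parse("ROLE_USER::authentication.principal.customerId == arg0");
        System.out.println(parsed.role);
        System.out.println(parsed.condition);
    }
}
```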

I suggest the following way of usage:

  • Extend the UserDetails class with the information required at the decision point when the secured method is invoked. It could be the ids of the objects which the logged-in user is allowed to manipulate.
  • Populate UserDetails with all the needed information in the UserDetailsService while the user is logging in. This object is available as “authentication.principal” in the MVEL context.
  • Write conditional roles based on these objects, mixed with the passed arguments.
  • Update SecurityContextHolder.getContext().getAuthentication() during the user’s activity if necessary.

Buzz

If you are interested in this topic – just download the patch, it has a comprehensive example of usage and clear source code. The JIRA issue stated that this patch goes to 2.0.0 M2 which is going to be released soon.

Java and Programming - 29 Nov 2007 11:45 am

Minification over obfuscation and other uglifier techniques

Every mature web application enters a phase when you need to optimize overall performance. Unfortunately, choosing the right DB, tuning your SQL queries and using sophisticated load balancers are not enough if a user’s browser has to wait 5 seconds to load your overloaded AJAX-based grid. When that happens, you have to think a lot and try many client-side optimization solutions. Nowadays developers have a great analysis tool, YSlow, which can give you dozens of improvement tips. Actually, you can do all this work just by analyzing response headers and content with another excellent tool, Firebug.

So, when our application had been optimized against almost all possible bottlenecks, it was decided to compress the resource files (js, css) based on their structure, in addition to gzip, which was already employed.

After a little comparison and talks with DHTML gurus, I chose minification over obfuscation, because the latter can be a real source of evil bugs. At the moment, the best supported minifier I have found is YUI Compressor.

YUI Compressor

I chose YUI Compressor because it is a free, open-source, Java-based tool, which means that I can always investigate why something goes wrong, and probably contribute something valuable. This tool can be used as a standalone command-line application or from your own code by instantiating the necessary classes. It can process JavaScript files as well as CSS. There is also a wide range of parameters (e.g. whether to shorten local variables, whether to preserve semicolons).

Ant script

A good overview of using the tool from an Ant script is here. However, it just calls the compressor as a command-line program, which is verbose and sometimes not appropriate. Usually there are custom Ant tasks for such tools, and since the sources of the tool are available, it was obvious that an Ant task should exist for YUI Compressor too. After some googling, I found what I needed: the semi-abandoned page Minifying JS/CSS. Many thanks to the author, Philippe Mouawad, well done! I would like to see this patch in the release and a reference to it on the official site.

Simple example of usage:

    <path id="yuicompressor.classpath">
        <fileset dir="${yuicompressor.dir}">
            <include name="**/yuicompressor-2.2.5.jar"/>
            <include name="**/YUIAnt.jar"/>
            <!-- include name="**/rhino*.jar"/ -->
        </fileset>
    </path> 

    <target name="minify-js-css" description="Minify a set of files">
        <taskdef name="yuicompress" classname="com.yahoo.platform.yui.compressor.YUICompressTask">
            <classpath>
                <path refid="yuicompressor.classpath"/>
            </classpath>
        </taskdef>

        <delete dir="${js-min.dir}"/>
        <mkdir dir="${js-min.dir}" />
        <yuicompress linebreak="300" warn="false" munge="yes" preserveallsemicolons="true"
            outputfolder="${js-min.dir}" >

            <fileset dir="${js.dir}" >
                <include name="**/*.js" />
            </fileset>
        </yuicompress>
    </target>
 

Encountered problems

Everything works fine except for one nasty problem that I struggled with for a day. If Rhino (which is stated as a dependency library) was included in the task classpath, every “\n” escape sequence in a string literal was written out as an actual line break, and a JS error occurred.

var a = "aaa\nbbb";
// turns after minification into:
var a = "aaa
bbb";
 

This bug was eliminated by removing Rhino jar from classpath.
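
A problem like this is easy to miss until the page breaks in a browser, so a cheap sanity check on the minified output can help. The sketch below is my own addition, not part of YUI Compressor or the Ant task: it scans JavaScript text and reports whether a raw line break appears inside a quoted string literal. It is deliberately simplified and ignores comments and regular expression literals, so treat it as a heuristic. StringLiteralCheck is a hypothetical name.

```java
// Hypothetical heuristic check, not part of YUI Compressor.
final class StringLiteralCheck {

    private StringLiteralCheck() {
    }

    // Returns true if a raw line break occurs inside a '...' or "..." literal.
    // Simplification: comments and regex literals are not recognized.
    static boolean hasLineBreakInsideString(String js) {
        char quote = 0;          // 0 means "currently outside of a string literal"
        boolean escaped = false; // true right after a backslash inside a literal
        for (int i = 0; i < js.length(); i++) {
            char c = js.charAt(i);
            if (quote == 0) {
                if (c == '"' || c == '\'') {
                    quote = c;
                }
            } else if (escaped) {
                escaped = false; // the escaped character is skipped as-is
            } else if (c == '\\') {
                escaped = true;
            } else if (c == quote) {
                quote = 0;
            } else if (c == '\n' || c == '\r') {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(hasLineBreakInsideString("var a = \"aaa\\nbbb\";"));
        System.out.println(hasLineBreakInsideString("var a = \"aaa\nbbb\";"));
    }
}
```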

Java - 12 Sep 2007 12:42 pm

In all my web-based Java applications I use Eclipse as an IDE and the Sysdeo Tomcat launcher for starting and stopping the web container. Every project has its own configuration of the Tomcat launch process. And if you have many workspaces (e.g. a couple of ongoing branches: the release in production, the development version, etc.), you must configure this plugin each time you create a new workspace. If you’re using the standard configuration, the only thing you have to do is specify the Tomcat home directory. But if you need to allocate more memory or enable JMX ports, you have to enter every single parameter manually in the JVM Settings dialog. There is no Copy All & Paste All functionality, and you can’t even copy the configuration to send it to a co-worker or put it on the project wiki. The only (IMHO, meaningless) action you can do is dump the current configuration to the Eclipse log. However, the output is messy and hard to read.


Today I decided to find out where this information is stored, to know where to paste it next time. The search led me here:
workspace\.metadata\.plugins\org.eclipse.debug.core\.launches\Tomcat 5.x.launch. This is the configuration file which stores everything you can edit through that dialog. Next time, I will definitely edit the properties directly there.
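
Since the launch file is plain XML, its settings can also be listed with a few lines of Java, for example to paste them into a project wiki. The sketch below assumes that the file stores settings as stringAttribute elements with key and value attributes, and the sample key in main is also an assumption; please compare with your own .launch file before relying on it. LaunchConfigReader is a hypothetical name.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper: lists key/value pairs found in launch configuration XML.
final class LaunchConfigReader {

    private LaunchConfigReader() {
    }

    // Assumption: settings are stored as <stringAttribute key="..." value="..."/>.
    static Map<String, String> readStringAttributes(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = doc.getElementsByTagName("stringAttribute");
            Map<String, String> result = new LinkedHashMap<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Element element = (Element) nodes.item(i);
                result.put(element.getAttribute("key"), element.getAttribute("value"));
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("Cannot parse launch configuration", e);
        }
    }

    public static void main(String[] args) {
        // Sample content with an assumed key name, for illustration only.
        String sample = "<launchConfiguration type=\"org.eclipse.jdt.launching.localJavaApplication\">"
                + "<stringAttribute key=\"org.eclipse.jdt.launching.VM_ARGUMENTS\" value=\"-Xms256m -Xmx512m\"/>"
                + "</launchConfiguration>";
        readStringAttributes(sample).forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

For a real file, the content could be read first with java.nio.file.Files and passed in as a string.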


I also finally defeated the Hot Deploy problem that appeared after I moved to Eclipse Europa. Since then, any code modification had caused Tomcat to crash, even if the signatures of classes/methods weren’t changed. I had to restart it every time and it became tedious. The problem went away after I removed the reloadable="true" parameter from the web application context file.

Code snippets and Java - 16 Jul 2007 08:06 am

Introduction

Nowadays, every modern web development environment offers a way to manipulate URLs on the fly using rule-based configuration rather than hard-coded program logic. Probably every such approach has its origins in the famous Apache mod_rewrite module. In the Java world, the de facto standard is the wonderful Url Rewrite Filter. It can do everything you expect from such a tool and has some delicious toppings as a bonus:

  • analyze URL against a pattern (regexp or wildcard)
  • analyze all possible HTTP data like cookies, request parameters, host, remote info, etc
  • change data like cookie, session, request attributes
  • redirect or forward to a static or dynamically formed (based on the analysis data) URL
  • run your own Java code (e.g. logging, statistics)

During the last few days I got more familiar with this powerful tool, and I want to show the value it can bring to any Java-based web application. I won’t describe the syntax of the configuration file because there is a comprehensive manual which outlines all the options. I’m also not going to describe the simplest use cases; let’s start with something interesting, like the first step of integration with an affiliate partner.
For instance, when integrating with the Commission Junction affiliate service provider, your application has to set a cookie which identifies a user that came from an affiliate partner, and then redirect the user to the page given as a request parameter.
So, these are the steps I want to implement with the UrlRewrite library:

1. A user comes to your site by clicking on a banner with the link http://example.com/?CJURL=http%3a%2f%2fexample.com%2fregister%2fnew.html (the parameter is the encoded string http://example.com/register/new.html). Tip: to URL-encode test data you can use a simple online encoding tool.
2. Your application recognizes the CJURL request parameter and sets a cookie (expiry time >= 24h) to remember that the user came from the CJ affiliate program.
3. The application redirects the user to the landing page given in the CJURL parameter.
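
If you prefer not to depend on an online tool for the encoding in step 1, the JDK can produce and reverse the same encoding. A minimal sketch follows; CjUrlEncoding is a hypothetical helper name. Note that URLEncoder emits uppercase hex digits (%3A) while the banner link above uses lowercase (%3a); URLDecoder accepts both.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical helper around the JDK URL encoding classes.
final class CjUrlEncoding {

    private CjUrlEncoding() {
    }

    static String encode(String plain) {
        try {
            return URLEncoder.encode(plain, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a Java platform
            throw new IllegalStateException(e);
        }
    }

    static String decode(String encoded) {
        try {
            return URLDecoder.decode(encoded, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String landingPage = "http://example.com/register/new.html";
        System.out.println("http://example.com/?CJURL=" + encode(landingPage));
        System.out.println(decode("http%3a%2f%2fexample.com%2fregister%2fnew.html"));
    }
}
```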

Setting cookies

This snippet sets a cookie if a user comes from our affiliate partner. Although the UrlRewrite manual states that the expiry time has to be set in minutes, it is actually in seconds. Probably it is just a typo in the manual, but be careful and always test the real expiry date with your browser. It is possible to set all the parameters (the value can be in the format “[value][:domain[:lifetime[:path]]]”), although we use only two of them.

<rule>
    <condition type="query-string">^CJURL=(.*)$</condition>
        <from>^(.*)$</from>
        <!-- Affiliate service provider -->
        <!-- expire time in seconds -->
        <set type="cookie" name="asp">cj::86400</set>
</rule>
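
To make the “[value][:domain[:lifetime[:path]]]” format concrete, here is a sketch of how a string such as cj::86400 breaks down into its parts. This is only my reading of the format for illustration, not UrlRewrite’s own parser; CookieSpec and its fields are hypothetical names, and the lifetime is treated as seconds, as observed above.

```java
// Hypothetical holder for the parts of a UrlRewrite cookie value string.
final class CookieSpec {

    final String value;
    final String domain;       // null when omitted
    final int lifetimeSeconds; // -1 when omitted
    final String path;         // null when omitted

    private CookieSpec(String value, String domain, int lifetimeSeconds, String path) {
        this.value = value;
        this.domain = domain;
        this.lifetimeSeconds = lifetimeSeconds;
        this.path = path;
    }

    static CookieSpec parse(String spec) {
        // At most four parts: value, domain, lifetime, path.
        String[] parts = spec.split(":", 4);
        String lifetime = part(parts, 2);
        return new CookieSpec(parts[0], part(parts, 1),
                lifetime == null ? -1 : Integer.parseInt(lifetime), part(parts, 3));
    }

    // Returns null for parts that are missing or left empty (as in "cj::86400").
    private static String part(String[] parts, int index) {
        return index < parts.length && !parts[index].isEmpty() ? parts[index] : null;
    }

    public static void main(String[] args) {
        CookieSpec spec = parse("cj::86400");
        System.out.println(spec.value + ", domain=" + spec.domain + ", lifetime=" + spec.lifetimeSeconds);
    }
}
```

In cj::86400 the domain is left empty between the two colons, which is why only two parameters are effectively set.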
 

Redirecting to a request parameter value

It seems to be a simple task to get a value from the request query and redirect to the URL built from that parameter. But you can’t do it directly in UrlRewrite: using request parameters in <to> is not allowed. There is a technique to parse the query string and use a regexp back reference as the value, but the result will be an encoded value. We need a real request parameter, because in that case the value will be decoded.
To make it clear I’ll give some illustrations:
1. The ideal solution, if UrlRewrite supported it, would be:

<rule>
    <condition type="query-string">^CJURL=(.*)$</condition>
        <from>^(.*)$</from>
        <to type="redirect">%{parameter:CJURL}</to>
</rule>
 

But as I stated earlier, unfortunately we can’t use a request parameter value in the <to> clause.

2. If we had a simple parameter matching [a-zA-Z0-9], without the special symbols used in a regular URL, we could parse it from the query string. This approach can be used for beautifying URLs in REST style:

<rule match-type="wildcard">
        <from>/*/*/*</from>
        <to type="forward">/$1.do?$2=$3</to>
</rule>
 

But we need to decode these strings, because we can’t redirect to http%3a%2f%2fexample.com before we convert it to http://example.com. This conversion is done by the container when you ask for a request parameter value. Therefore this solution WON’T work either:

<rule>
    <condition type="query-string">^CJURL=(.*)$</condition>
        <from>^(.*)?CJURL=(.*)$</from>
        <to type="redirect">$2</to>
</rule>
 

3. Luckily, we can use a request attribute in the <to> clause. The only problem we have to solve is to write the request parameter value to a request attribute somehow, and then read this attribute in the <to> clause. I didn’t investigate deeply whether this is possible using UrlRewrite’s declarative facilities, and employed another great option: calling my own Java code. The final version is something like:
urlrewrite.xml:

<!-- Affiliates tracking -->
<rule>
    <note>Commission Junction affiliate program</note>
    <condition type="query-string">^CJURL=(.*)$</condition>
        <from>^(.*)?CJURL=(.*)$</from>
        <run class="com.example.web.util.RequestToAttributeSetter">
            <init-param>
                <param-name>parameterName</param-name>
                <param-value>CJURL</param-value>
            </init-param>
        </run>
</rule>
<rule>
    <condition type="query-string">^CJURL=(.*)$</condition>
        <from>^(.*)?CJURL=(.*)$</from>
        <to type="redirect">%{attribute:CJURL}</to>
        <!-- Affiliate service provider -->
        <!-- expire time in seconds -->
        <set type="cookie" name="asp">cj::86400</set>
</rule>
 

Java code:

package com.example.web.util;

import javax.servlet.ServletConfig;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletRequest;

/**
 * Workaround for the UrlRewrite filter restriction which doesn’t allow using a request parameter as
 * a redirect target in the <to> tag. <br/>
 * Similar solution: http://sujitpal.blogspot.com/2006/09/search-and-replace-with.html
 * @author Andrey Gomilko
 */

public class RequestToAttributeSetter {

    private String parameterName;

    public void run(ServletRequest request, ServletResponse response) {
        if (parameterName != null) {
            HttpServletRequest req = (HttpServletRequest) request;
            Object value = req.getParameter(parameterName);
            req.setAttribute(parameterName, value);
        }
    }

    public void init(ServletConfig config) {
        this.parameterName = config.getInitParameter("parameterName");
    }

    public void destroy() {
    }

}

 

Unit testing of UrlRewrite configuration

Let’s assume you have already written your own urlrewrite.xml file with the filter configuration and want to test it. If you decide to test the rules on a running server (e.g. Tomcat), you’ll have to restart it every time you change something. A more convenient way is to use JUnit tests for this purpose. Fortunately, UrlRewrite has all the necessary internal classes and request/response mock classes to dry-run the filter without a container and analyze the result. A very good example which covers the basic use cases is A JUnit test for UrlRewriteFilter. To test the rules of the CJ example we need to check cookies in addition to the standard checks. Since our UrlRewrite rule is based on the host HTTP header value, you have to set the header values on the MockRequest yourself, because it doesn’t parse the passed URL. To test the CJ example, I took the test snippet from A JUnit test for UrlRewriteFilter as a base and added some specific methods:

    private void assertRedirect(String fromUrl, String toUrl, Cookie cookie) throws Exception {
        UrlRewriter rewriter = new UrlRewriter(conf);
        MockRequest request = new MockRequest(fromUrl);
        try{
            URL url = new URL(fromUrl);
            request.addHeader("host", url.getHost());
           
            if (url.getQuery() != null) {
                request.setQueryString(url.getQuery());

                StringTokenizer st = new StringTokenizer(url.getQuery(), "&");
                while(st.hasMoreElements()) {
                    String token = st.nextToken();
                    // limit 2 keeps "=" inside the value; a bare name gets an empty value
                    String[] parts = token.split("=", 2);
                    String name = parts[0];
                    String value = parts.length > 1 ? parts[1] : "";
                   
                    request.addParameter(name, java.net.URLDecoder.decode(value, "UTF-8"));
                }
            }
           
        } catch(MalformedURLException me){
             // do nothing
        }
       
        MockResponse response = new MockResponse();
        RewrittenUrl rewrittenUrl = rewriter.processRequest(request, response);
        assertNotNull("Could not redirect URL from:" + fromUrl + " to:" + toUrl, rewrittenUrl);

        assertEquals("Redirect from:" + fromUrl + " to:" + toUrl + " did not succeed", toUrl
                , rewrittenUrl.getTarget());
       
        assertTrue(rewrittenUrl.isRedirect());
       

        List cookies = response.getCookies();
        assertEquals(1, cookies.size());
        Cookie setCookie = (Cookie)cookies.get(0);
       
        assertEquals(cookie.getMaxAge(), setCookie.getMaxAge());
        assertEquals(cookie.getName(), setCookie.getName());
        assertEquals(cookie.getValue(), setCookie.getValue());
    }

    public void testCJRewriteDecoded() throws Exception {
        String fromUrl = "http://example.com/?CJURL=http%3a%2f%2fexample.com%2fregister%2fnew.html";
        String toUrl = "http://example.com/register/new.html";
       
        Cookie cookie = new Cookie("asp", "cj");
        cookie.setMaxAge(86400); // 24h
        assertRedirect(fromUrl, toUrl, cookie);
    }
 

Conclusion

The main idea of such tools is to reduce coding and move as much application logic as possible to well-proven tools which can be easily tuned through configuration files. This approach reduces the number of possible bugs and keeps related logic in one place. So, before writing your own filter, think once more about UrlRewrite: it will probably fit your needs!

And stay tuned!

Java and Perl and Programming - 10 Jul 2007 02:22 pm

Europa release

With a short delay after the Europa release announcement, I’ve moved to it. Thanks to the fact that Eclipse doesn’t require an installation process and can be just unzipped and run, the last few times I got an already tuned Eclipse from my co-workers. Those distributions were well tuned for J2EE development, with all the necessary plugins, project checkstyles, attached sources and so on. Since this time nobody around me offered me such a favor, I did everything on my own.
Actually, to develop a J2EE project, you need to install a dozen plugins (depending on which frameworks are employed) in addition to the bare distribution, so it’s better to have a cheat sheet every time you start this Eclipse tuning campaign. Since I didn’t have such a plan, I used my current Eclipse 3.1 working set as an example.
This time, I wrote some short notes during the installation process, and I’d like to share them here, to have something to stick to in the future.

Download

First, you need to download an appropriate Eclipse distribution. Although there is already a bundle for J2EE developers, I was interested in installing everything I want from scratch. Therefore I went to http://www.eclipse.org/downloads/ and downloaded the Eclipse Classic distribution for Windows (140 MB). I extracted it to the C:\Java\eclipse3.3 directory, where I store all Java stuff like IDEs, JDKs, etc. Then, as usual, I wanted to create a custom shortcut with extended memory allocation for Eclipse, but suddenly noticed the eclipse.ini file. To allow Eclipse to allocate more memory, just edit the eclipse.ini file (increase the Xms and Xmx values):

-showsplash
org.eclipse.platform
--launcher.XXMaxPermSize
256m
-vmargs
-Xms256m
-Xmx512m
 

Web tools

Some general notes. All plugins can be installed in two common ways: through the cute Help->Software Updates->Find and Install… wizard, or by copying all the unzipped stuff to the ECLIPSE_HOME/plugins directory. Of course I prefer the first option, and I’ll install every plugin through this wizard, except for Sysdeo.
By “default plugins” I mean plugins which can be installed through the “Europa Discovery Site” (run the mentioned wizard and find it to get accustomed, if you haven’t yet).

From past experience I knew that I had been using the WTP (Web Tools Platform) plugin for web development. The problem was that there was no WTP plugin on the “Europa Discovery Site”. I carried out a brief investigation and dug out that the main part of that plugin is the WST (Web Standard Tools) subproject, which can be installed through Help->Software Updates->Find and Install…
When I checked the WST check box, the wizard gave a tip to select all the things required by WST too. After I installed WST, the link to the WTP update site suddenly appeared (I guess it was due to WST), and I decided to install it to have the full stack.

Subversion integration

When I had installed all the useful tools from the Europa repository, I moved on to the things whose update sites have to be added manually. The process is almost the same as for the standard components; the only difference is the update site where the plugin is stored. We need to add http://subclipse.tigris.org/update_1.2.x as a New Remote Site…

Tomcat launcher

The simplest and the most valuable Eclipse plugin I have ever used is the Sysdeo Tomcat launcher. Download the archive and install it according to the steps in its Installation section. Here, the unzipped archive should be placed manually into the ECLIPSE_HOME/plugins directory.

Useful tools

As you now know how to install plugins, I’ll mention only the links to the update sites:

  1. http://eclipse-cs.sourceforge.net/update – checkstyle plugin, helps to maintain your coding style according to different available code conventions (Sun, Eclipse, your own)
  2. http://springide.org/updatesite/SpringIDE – Spring related things like bean definitions, xml configuration auto completion
  3. http://e-p-i-c.sf.net/updates – Perl IDE, just for fun to play with RegExps or to write some admin tools
  4. http://www.fabioz.com/pydev/updates – PyDev (Python IDE), not to be restricted to the Java world only and to be more broadminded
  5. http://eclipse-tools.sourceforge.net/implementors/ – nice Alt+F3 shortcut, allows quick jumps from interfaces to implementations.

Stay tuned!
P.S. I know that saying this phrase is almost as popular as mentioning iPhone in the blog post :)

Code snippets and Java and Perl - 20 Jun 2007 02:08 am

Introduction

Almost every business application requires a country list as dictionary data. Even a simple registration form might contain a country input field. And if you’re going to store billing or shipping information for more than one country, you definitely have to have this important dictionary in your system. First, you should decide which countries will be in your list. Depending on your goals, it might be a short list of the largest countries in the world or a full list with all islands, territories and even Antarctica. Let’s review the common sources for filling your country list dictionary.

Existing country lists

If you have billing or shipping integration with 3rd parties like PayPal, you can get this list from the HTML source code of their registration page. Open a page with a registration form on any trusted web portal, then find the country list data in the source code (e.g. “View -> Page Source” in FF):

<option value="AL">Albania</option>
<option value="DZ">Algeria</option>
<option value="AD">Andorra</option>
<option value="AO">Angola</option>
<option value="AI">Anguilla</option>
………
 

Now you can copy and paste this information and then parse/use it in any convenient way. As you can see, the value in this list is a two-character identifier called the ‘ISO 3166-1 2 Letter Code’. It is very useful as a unique identifier and can work as a primary key in the countries DB table.
Although there are lots of sites which already have well-proven lists, I’d suggest taking the list directly from the original source: the official list published by the United Nations, which is republished in many places like List of countries and territories or ISO country codes.

Generating SQL for country table

Let’s take one of those lists and paste everything into an Excel sheet, then save it in CSV format. The result should be similar to ISOCodes.csv and contain data separated by commas:

ABW,AW,Aruba
AFG,AF,Afghanistan
AGO,AO,Angola
AIA,AI,Anguilla
ALA,AX,Aland Islands
………
 

Once we have the data stored in a Perl-readable format, we can parse it and generate the SQL code:

#!/usr/bin/perl

# PERL MODULE
use Text::CSV::Simple;
   
# This script generates SQL code for the country table

print <<SQL_END;
DROP TABLE IF EXISTS country;
CREATE TABLE country (
  short_code varchar(2) NOT NULL COMMENT 'ISO 3166-1 2 Letter Code',
  name varchar(100) NOT NULL,
  PRIMARY KEY  (short_code),
  UNIQUE KEY UNQ_COUNTRY_NAME (name)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
SQL_END

my $tplSQL = "INSERT INTO country (short_code, name) VALUES (\"%s\",\"%s\");\n";
my $tplXML = "<country short_code=\"%s\" name=\"%s\" />\n";
my $csvFile="H:/workspace/update/data/ISOCodes.csv";

my $parser = Text::CSV::Simple->new();
my @data = $parser->read_file($csvFile);

print "-- SQL DUMP\n";
foreach (@data) {
        # each row is an array reference: [3-letter code, 2-letter code, name]
        printf $tplSQL, $_->[1], $_->[2];
}
print "-- Data for DBUnit tests\n";
foreach (@data) {
        printf $tplXML, $_->[1], $_->[2];
}
 

I prefer not to populate the data directly into the DB, but to generate a script which can be edited and invoked many times. This script doesn’t take command-line parameters and uses a hard-coded path, because I’m not a Perl programmer developing a multi-purpose tool :) I’m just a Java developer who needed a valid country list. After executing this script you’ll get SQL code and DBUnit XML snippets.
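
For readers who would rather stay in Java, the same generation step can be sketched in a few lines. This is my own rough equivalent of the Perl loop, not a translation the script depends on: CountrySqlGenerator is a hypothetical name, the CSV handling is naive (no quoted fields), and unlike the Perl template it uses single-quoted SQL strings with quote doubling so that names containing an apostrophe stay valid.

```java
// Hypothetical Java counterpart of the Perl loop above (naive CSV handling).
final class CountrySqlGenerator {

    private CountrySqlGenerator() {
    }

    // Expects lines like "ABW,AW,Aruba": 3-letter code, 2-letter code, name.
    static String toInsert(String csvLine) {
        String[] cols = csvLine.split(",", 3);
        if (cols.length < 3) {
            throw new IllegalArgumentException("Expected 3 columns: " + csvLine);
        }
        String shortCode = cols[1].trim();
        String name = cols[2].trim();
        return "INSERT INTO country (short_code, name) VALUES ('"
                + escape(shortCode) + "','" + escape(name) + "');";
    }

    // Doubles single quotes so that the value is a valid SQL string literal.
    private static String escape(String value) {
        return value.replace("'", "''");
    }

    public static void main(String[] args) {
        String[] lines = {"ABW,AW,Aruba", "AFG,AF,Afghanistan"};
        for (String line : lines) {
            System.out.println(toInsert(line));
        }
    }
}
```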

For those who are curious

Unfortunately, there is no single common country list; however, there are some official lists maintained by different organizations like the UN, banking communities, post offices and phone companies. When you’re choosing a list, take into account countries which have disappeared, like the USSR, or were divided, like Serbia and Montenegro. Choose a primary key which satisfies your needs from the large variety of existing ones (ISO 2- or 3-character codes, numeric codes, phone codes, …). If you have a unique key, you can get other useful information about a chosen country from other sources afterwards. To get answers to these questions, I’d recommend skimming through How many countries are there in the world? and then googling the keywords you are interested in.

This is a part of the Perl for Java Programmer series. The previous post was Perl for Java programmer: Installation. Stay tuned!

Java and Perl - 18 Jun 2007 12:42 pm

Introduction

Let’s assume that you are a Java server-side programmer. Your current web project involves working with server-side data, such as lots of media files kept in file storage, DB data, XML files, configuration files, etc. All files are stored under proper sub-directories, and your objects are beautifully mapped to the DB using your preferred ORM like Hibernate. Another bunch of data is serialized to XML using something like xstream. Everything works like a charm until the data storage format or convention changes. For example, you decide to move all your pictures to another sub-folder. If you have references to these files in your DB, you have to modify them too. Or, let’s say, in the next release of your product the textual data stored in the XML files has to be updated to conform to new POJOs which have new properties. What I’m trying to say here is that if you develop an application which works with lots of heterogeneous data, you have to have an approach for migrating it from one version to another.
I think that the best solution is a scripting language which supports string parsing, SQL execution and file I/O. I chose Perl because of its well-known excellent work with regexps and its good reputation among system administrators.

Installing/Configuring Perl

If you are a lucky user of a *nix OS, I believe you don’t need any additional installation of Perl, because it is already included in your distribution. For folks who are using Windows, I would recommend installing ActivePerl, a Perl binary distribution for the Win32 platform. Basically, I don’t know any other, and this one seems to be the best. Installing it is the well-known “Next-Next-Next” joy.
My current IDE for Java is Eclipse, and one of the great advantages of this famous IDE is its plugin system. There are lots of plugins developed by different people for almost every purpose. The first plugin I found was EPIC, an open source Perl IDE. With this plugin you get simple but powerful facilities for Perl programming: code highlighting and code completion while typing, and running a program by right-clicking a Perl source file and choosing “Run as perl”. The output of the program is directed to the Eclipse console. And of course, you can debug the program if the output doesn’t help.
Finally, I’d like to emphasize a very useful tool which is supplied with ActivePerl: the Perl Package Manager. Once you need an additional Perl package, you can easily install it on your system from the global repository.
For instance, you write a simple script which uses a DB connection to a MySQL instance. But there is an error:

install_driver(mysql) failed: Can't locate DBD/mysql.pm in @INC
Perhaps the DBD::mysql perl module hasn't been fully installed,
or perhaps the capitalisation of 'mysql' isn't right.

The only thing you have to do is open the Perl Package Manager, find the DBD::mysql package, which is missing from the standard configuration, and install it.

Roadmap of this guide

Today I decided that writing less but more frequently is easier, and with this approach I don’t need to keep this important post in drafts. I’d rather ship it part by part every day than wait another month to complete it.
Therefore, I’m planning to cover these topics in the coming weeks:

  • String manipulation
  • Files manipulation
  • DB work
  • XML work
  • SQL generating

Stay tuned!

Code snippets and Java and Programming31 May 2007 06:29 am

File names default sort order

If you are developing an application which works with files, one day you’ll need to get a list of the names of all files in a particular directory. It could look like this simplified snippet:

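    // Needs: java.io.File, java.io.FileFilter, java.util.ArrayList, java.util.List
    public List<String> getAllImageNames() {
        List<String> names = new ArrayList<String>();
        File imagesDir = new File(IMAGE_BASE_DIRECTORY);
        // listFiles() returns null if the path does not exist, is not a directory or cannot be read
        File[] images = imagesDir.listFiles(COMMON_IMG_FILTER);
        if (images == null) {
            return names;
        }
        for (File image : images) {
            names.add(image.getName());
        }
        return names;
    }

    // Accepts regular files only and skips hidden "dot" files
    private static final FileFilter COMMON_IMG_FILTER = new FileFilter() {
        public boolean accept(File pathname) {
            return pathname.isFile() && (!pathname.getName().startsWith("."));
        }
    };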
 

Let’s assume that you have some image files with the same name prefix and an index number as the trailing part:

System order | Natural order
IMG_1.jpg    | IMG_1.jpg
IMG_2.jpg    | IMG_2.jpg
IMG_21.jpg   | IMG_3.jpg
IMG_22.jpg   | IMG_21.jpg
IMG_3.jpg    | IMG_22.jpg

Everything seems to be all right, but the returned list typically comes back in the order in which Java sorts strings by default, that is, character by character (strictly speaking, File.listFiles() does not guarantee any particular order). While this result may completely satisfy your program’s API, such lists are not very convenient for a human to work with. Moreover, such names are very common in real life: for example, many digital cameras store pictures under names like these.
Let’s make the problem more general. We have a list of strings which may contain digits, and we want to sort it into natural order. You’d say it is a piece of cake, because Java has good facilities for sorting arrays and lists: Collections.sort() and Comparator. Therefore, all we need is an appropriate Comparator implementation.
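To make the goal concrete, here is a minimal sketch of such a Comparator. It is not one of the implementations reviewed below: the class name SimpleNaturalComparator is invented for this example, it only understands ASCII digits, and it ignores locale and case. Runs of digits are compared as numbers, everything else character by character.

```java
import java.util.Comparator;

/** Hypothetical ASCII-only natural order comparator, for illustration. */
class SimpleNaturalComparator implements Comparator<String> {

    @Override
    public int compare(String a, String b) {
        int i = 0;
        int j = 0;
        while (i < a.length() && j < b.length()) {
            char ca = a.charAt(i);
            char cb = b.charAt(j);
            if (isAsciiDigit(ca) && isAsciiDigit(cb)) {
                int endA = endOfDigits(a, i);
                int endB = endOfDigits(b, j);
                // Ignore leading zeros so that "007" and "7" are the same number.
                int startA = skipZeros(a, i, endA);
                int startB = skipZeros(b, j, endB);
                int lenA = endA - startA;
                int lenB = endB - startB;
                // Without leading zeros, the longer digit run is the larger number.
                if (lenA != lenB) {
                    return lenA < lenB ? -1 : 1;
                }
                // Same length: the first differing digit decides.
                for (int k = 0; k < lenA; k++) {
                    int diff = a.charAt(startA + k) - b.charAt(startB + k);
                    if (diff != 0) {
                        return diff;
                    }
                }
                i = endA;
                j = endB;
            } else {
                if (ca != cb) {
                    return ca - cb;
                }
                i++;
                j++;
            }
        }
        // The string with characters left over sorts last.
        int rest = (a.length() - i) - (b.length() - j);
        // Tie-break (e.g. "a007" vs "a7") so that only equal strings compare as 0.
        return rest != 0 ? rest : a.compareTo(b);
    }

    private static boolean isAsciiDigit(char c) {
        return c >= '0' && c <= '9';
    }

    private static int endOfDigits(String s, int from) {
        while (from < s.length() && isAsciiDigit(s.charAt(from))) {
            from++;
        }
        return from;
    }

    private static int skipZeros(String s, int from, int end) {
        while (from < end && s.charAt(from) == '0') {
            from++;
        }
        return from;
    }
}
```

With this in place, Collections.sort(names, new SimpleNaturalComparator()) puts IMG_3.jpg before IMG_21.jpg. Negative numbers, decimal fractions and non-ASCII digits are deliberately left out of this sketch.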

Known solutions for natural order sorting

Unfortunately, there is no well-known library like Jakarta Commons that performs this kind of sort. If you try to google it, you will find only posts like this one, with home-grown solutions and grumbling about the lack of an existing proven solution. The most important thing is to know the right keywords to search for: most articles on this topic talk about Natural order.

I found some valuable resources:

  • Natural String Order at Stephen Friedrich’s Blog – an article by a Java professional with working Java source and even JUnit tests. The implemented algorithm is very comprehensive and can work with different sets of sorting rules (ASCII, case insensitive and locale specific).
  • HumaneStringComparator: Sorting Strings for People – another Comparator implementation, by Tim Fennell.
  • Implementation by Pierre-Luc Paour.
  • Natural Order String Comparison – a C implementation with links to versions in other languages.
  • Natural Order Numerical Sorting – an overview of the problem with links to other articles and solutions, but almost all of the links are broken or not about Java. It seems the author was keen on the idea, registered a special domain and gathered the essential information, and then left the site floating without any maintenance.

Stress test

One of the main rules of software development states: don’t reinvent the wheel unless the problem is your core business. As string sorting was only a utility matter for me, I decided to test the solutions I had already found before implementing my own. The test was easy and straightforward: every Comparator had to sort the same number of randomly generated and shuffled strings. The strings were generated from a fixed pattern: [a-z][0..1000].jpg. The input data was a large array (100000 elements) of such strings:
..............
jqazrrveqy113.jpg
wzedsgzmvo912.jpg
aayexqpfdu311.jpg
zvpzxjwkml354.jpg
nelacribtl964.jpg
rehsgmzugb244.jpg
eoptzxybtz459.jpg
ukbeogpmhe157.jpg
zgvnrzohwc176.jpg
..............
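Input like this could be produced roughly as follows. This is a sketch under assumptions, not the original test code: the class name TestDataGenerator is invented, the ten-letter prefix is read off the sample above, the upper bound 1000 is taken as inclusive, and a fixed seed is used so that every comparator can be given the same data.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Hypothetical generator for names such as "jqazrrveqy113.jpg". */
class TestDataGenerator {

    // Assumption: the samples in the post show ten lowercase letters.
    private static final int LETTERS = 10;

    /** Builds count random names and shuffles them; the same seed gives the same list. */
    static List<String> generate(int count, long seed) {
        Random random = new Random(seed);
        List<String> names = new ArrayList<String>(count);
        for (int n = 0; n < count; n++) {
            StringBuilder sb = new StringBuilder();
            for (int k = 0; k < LETTERS; k++) {
                sb.append((char) ('a' + random.nextInt(26)));
            }
            sb.append(random.nextInt(1001)); // 0..1000 inclusive (assumed)
            sb.append(".jpg");
            names.add(sb.toString());
        }
        Collections.shuffle(names, random);
        return names;
    }
}
```

Calling TestDataGenerator.generate(100000, 42L) would give an array of the size used in the test; the seed value is arbitrary.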

I was interested in only one parameter of algorithm efficiency: the time elapsed during a full sort. As usual, the elapsed time by itself doesn’t tell much about efficiency. To have a minimum achievable value to compare with, I took the result of a comparator based on the standard Java String.compareTo() method, which should be the fastest because it doesn’t parse the strings.

        Comparator<String> baseComparator = new Comparator<String>() {
            public int compare(String strA, String strB) {
                // Assuming that strA always != null
                return strA.compareTo(strB);
            }
        };
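The measurement itself can be sketched like this. The class and method names are invented for the example and this is not the original test harness. Each comparator sorts its own copy of the input, so all of them start from the same order.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

/** Hypothetical timing helper for comparing Comparator implementations. */
class SortTimer {

    /** Sorts a private copy of the input and returns the elapsed time in milliseconds. */
    static long timeSortMillis(List<String> input, Comparator<String> comparator) {
        // Copy first: the caller's list stays shuffled for the next comparator.
        List<String> copy = new ArrayList<String>(input);
        long start = System.nanoTime();
        Collections.sort(copy, comparator);
        return (System.nanoTime() - start) / 1000000L;
    }
}
```

A single run like this is only a rough number: JIT warm-up and garbage collection can distort it, so it is worth repeating each measurement a few times before comparing.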
 

The test results are shown in a table:

Author | Version | Sort Time | Pros | Cons
Pierre-Luc Paour | NaturalOrderComparator | 453ms | Quite fast | Hard to read C-style algorithm
Stephen Friedrich | NaturalComparator | 4828ms | Locale specific | Too slow
Stephen Friedrich | NaturalComparatorAscii | 360ms | The fastest one | Only ASCII
Stephen Friedrich | NaturalComparatorIgnoreCaseAscii | 500ms | Fast enough, case insensitive | Only ASCII
Tim Fennell | HumaneStringComparator | 4797ms | Very good example of OOP style, brief understandable algorithm, uses Java facilities extensively | Too slow and inefficient
Java native | String.compareTo() | 235ms | standard | standard
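The difference between "Locale specific" and "Only ASCII" in the table can be illustrated with the JDK alone. The sketch below uses java.text.Collator, the standard JDK facility for locale-aware comparison. Whether the tested packages work this way internally is not checked here, and the class name LocaleSortDemo is invented for the example.

```java
import java.text.Collator;
import java.util.Arrays;
import java.util.Locale;

/** Hypothetical demo: code-unit order versus locale-aware order. */
class LocaleSortDemo {

    /** Sorts a copy by raw UTF-16 code units, as String.compareTo() does. */
    static String[] sortByCodeUnits(String[] words) {
        String[] copy = words.clone();
        Arrays.sort(copy);
        return copy;
    }

    /** Sorts a copy with the JDK's collation rules for the given locale. */
    static String[] sortByLocale(String[] words, Locale locale) {
        String[] copy = words.clone();
        Arrays.sort(copy, Collator.getInstance(locale));
        return copy;
    }
}
```

For the words Zebra, Äpfel and Apfel, the code-unit sort puts Äpfel last because 'Ä' has a larger code than 'Z', while the German collator puts it right after Apfel. Collation does more work per comparison than a plain code-unit comparison, so some slowdown is to be expected.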

Conclusion

As you can see from the results, the most powerful and fastest is Stephen Friedrich’s package. It is written in good Java style, well commented and supplied with JUnit tests. It provides different kinds of sorting depending on what you are going to sort: simple ASCII strings like file names, or locale-specific strings containing special characters.
Of course, one day I’d like to see a well-proven solution in the Jakarta Commons project and simply add it as a jar.

General and Java23 Apr 2007 07:04 am

http://www.wrigley.com/wrigley/products/products_eclipse.asp
A must-chew for every Java programmer! Moreover, the last one on the page is branded in the same colors as the Eclipse IDE. I’m eager to taste it.

Java24 Jan 2007 08:39 am

You can figure out the main difference between the two main concepts of modern software development here. The only thing omitted there is the bugafeature paradigm.
