Monday, December 29, 2008

PDFBox Lucene java.lang.NoSuchMethodError

PDFBox version 0.7.3 is currently not compatible with the latest Lucene releases (for example Lucene v 2.4.0).

If you try to build an application using these two conflicting libraries, you will most likely get a java.lang.NoSuchMethodError at runtime.


One potential solution to the problem above is to recompile the PDFBox library (you will need Ant installed on your machine).

If you download PDFBox and extract it in a $PDFBox dir of your choice, you will notice that in the sub-directory:

$PDFBox/external

Lucene v2.0.0 is used to build the library. Rename or delete both the lucene-core-2.0.0.jar and lucene-demos-2.0.0.jar files and copy the newer versions into that directory (for example lucene-core-2.4.0.jar and lucene-demos-2.4.0.jar).

Go to the top-level $PDFBox directory and simply run "ant"; the project will be recompiled and a new library should be created (in my case, PDFBox-0.7.3-dev.jar in $PDFBox/lib).

Note that if you get another exception concerning org.fontbox.afm.AFMParser, you will also need to add the FontBox library to your project (for example, in Netbeans add FontBox-0.1.0.jar to the project libraries).
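
Both errors above usually come down to the wrong jar (or no jar) being on the classpath. As a rough diagnostic sketch, you can ask the JVM whether a class is loadable and where it was loaded from. The ClasspathCheck class and its method names are my own invention, not part of PDFBox, FontBox or Lucene, and the Lucene class name is just an example of a core class:

```java
import java.security.CodeSource;

// Hypothetical helper for illustration; not part of PDFBox, FontBox or Lucene.
final class ClasspathCheck {

    // Returns true if the named class can be located by the current class loader.
    static boolean isOnClasspath(String className) {
        try {
            // 'false' skips static initializers; we only want to locate the class.
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError ex) {
            return false;
        }
    }

    // Returns the jar or directory a class was loaded from, or "(unknown)"
    // when the JVM does not report a location (typically for JDK classes).
    static String loadedFrom(String className) throws ClassNotFoundException {
        Class<?> c = Class.forName(className, false, ClasspathCheck.class.getClassLoader());
        CodeSource src = c.getProtectionDomain().getCodeSource();
        return (src == null || src.getLocation() == null)
                ? "(unknown)" : src.getLocation().toString();
    }

    public static void main(String[] args) throws Exception {
        String[] names = { "org.fontbox.afm.AFMParser", "org.apache.lucene.document.Document" };
        for (String name : names) {
            boolean found = isOnClasspath(name);
            System.out.println(name + " on classpath: " + found);
            if (found) {
                System.out.println("  loaded from: " + loadedFrom(name));
            }
        }
    }
}
```

If the reported location is still one of the 2.0.0 jars, the old Lucene is shadowing the new one.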

Sunday, December 14, 2008

Glassfish + Jackrabbit + PostgreSQL (2)

RMI Access

It is now time to create our first Netbeans 6.5 project to access our repository. First of all, in Netbeans create a New Project->Java Web->Web Application project named testrepositoryweb.


Choose Glassfish V2 (or V3) as the server and JavaEE 5 as the Java EE version.


Leave the defaults for the remaining settings. After your project has been created, we can add a JUnit test to check an RMI connection with the testrepository.

Before doing so, add all the libraries discussed in the Glassfish set up tutorial to the project Libraries folder. Additionally, you have to add the jackrabbit-jcr-rmi-1.5.0.jar library.

Go to Test Packages, right-click and select JUnit Test (if it is not in the list, choose New->Other...->JUnit) and create a test case named RMIConnectionTest.


You can choose the default package or, better, create your own package (for this tutorial it does not matter which you choose).

The source code for the class is:
import java.net.MalformedURLException;
import org.junit.After;
import org.junit.AfterClass;
import org.junit.Before;
import org.junit.BeforeClass;
import org.junit.Test;
import static org.junit.Assert.*;

import javax.jcr.Repository;
import javax.jcr.Session;
import javax.jcr.SimpleCredentials;
import org.apache.jackrabbit.rmi.repository.URLRemoteRepository;

public class RMIConnectionTest
{

    public RMIConnectionTest(){}

    @BeforeClass
    public static void setUpClass() throws Exception {}

    @AfterClass
    public static void tearDownClass() throws Exception {}

    @Before
    public void setUp() {}

    @After
    public void tearDown() {}

    @Test
    public void testConnection()
    {
        Session session = null;
        try
        {
            // Connect to the repository exposed over RMI by the jackrabbit web app.
            Repository repository = new URLRemoteRepository(
                "http://localhost:8080/jackrabbit/rmi");

            session = repository.login(new SimpleCredentials(
                "anonymous", "anonymous".toCharArray()));

            System.out.println("Successfully logged in as user: "
                + session.getUserID());
        }
        catch(MalformedURLException ex)
        {
            fail(ex.getMessage());
        }
        catch(javax.jcr.LoginException ex)
        {
            fail(ex.getMessage());
        }
        catch(javax.jcr.RepositoryException ex)
        {
            fail(ex.getMessage());
        }
        finally
        {
            // Always release the session, even if the test failed.
            if(null != session)
                session.logout();
        }
    }
}
At this stage, since we have not set up a username and password for our repository, it will accept any username and password.

If you right click on the RMIConnectionTest.java and choose Run File you should get the test passing with a final message:
Successfully logged in as user: anonymous

JNDI Access


RMI access is of course slow, since every repository call is a remote call. Depending on your repository deployment model, you can use a JNDI connection instead.

Note that a Jackrabbit repository directory contains a lock file that prevents it from being accessed simultaneously by multiple processes. You will see repository startup exceptions caused by the lock file if you fail to properly close all sessions or otherwise shut down the repository before exiting the process that accesses it. This behaviour may create some issues with Glassfish if you choose the Shared J2EE Resource deployment model. My advice is to simply have one distinct in-process repository for each web application. In other words, remove (undeploy in the Glassfish admin console) any other application, including the jackrabbit web app previously installed, and only leave the testrepositoryweb application.
More specifically, if you get the following exception:

javax.naming.CommunicationException: serial context communication ex
[Root exception is javax.jcr.RepositoryException: The repository home
/home/emanuele/testrepository appears to be in use since the file named .lock is
already locked by the current process.]

while trying to access a repository, you have almost certainly been hit by the problem described above.
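
To see why the message says "already locked by the current process", here is a small sketch of the general java.nio file locking behaviour: a second lock attempt on the same file from the same JVM is rejected. This is not Jackrabbit's actual code (I am only assuming its .lock file relies on a similar mechanism), and the LockFileDemo class name is made up:

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;

// Illustration only: NOT Jackrabbit code, just plain java.nio file locking.
final class LockFileDemo {

    // Locks the file, then tries to lock it again from the same JVM.
    // Returns true if the second attempt is rejected.
    static boolean secondLockRejected(File lockFile) throws IOException {
        try (RandomAccessFile first = new RandomAccessFile(lockFile, "rw");
             RandomAccessFile second = new RandomAccessFile(lockFile, "rw")) {
            FileLock held = first.getChannel().tryLock();
            if (held == null) {
                // Another process already holds the lock; nothing to demonstrate.
                return false;
            }
            try {
                FileLock again = second.getChannel().tryLock();
                if (again != null) {
                    again.release();
                }
                return false;
            } catch (OverlappingFileLockException ex) {
                // The JVM refuses a second overlapping lock on the same file.
                return true;
            } finally {
                held.release();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("repository", ".lock");
        f.deleteOnExit();
        System.out.println("second lock rejected: " + secondLockRejected(f));
    }
}
```

Two web applications in the same Glassfish instance share one JVM, so the second one opening the same repository home runs into exactly this kind of conflict.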

OK, time to set up a JNDI resource for our testrepositoryweb app. We can do this easily from the Glassfish (V2) admin console: go to Resources->JNDI->Custom Resources and create a new Custom Resource with:
  • JNDI Name = jcr/testrepository
  • Resource Type = javax.jcr.Repository
  • Factory Class = org.apache.jackrabbit.core.jndi.BindableRepositoryFactory
  • repHomeDir =
  • configFilePath = /repository.xml
The JNDI name is pretty much free form.

We can now create a Servlet that will log in to our testrepository (in a real application we would instead use a listener implementing ServletContextListener, as I will show in a later tutorial). In Netbeans, right-click on testrepositoryweb, choose New->Servlet and name the servlet JNDIRepositoryLogin.


The source code for the newly created JNDIRepositoryLogin servlet is (we are only implementing the doGet method for the sake of this example):
import java.io.IOException;
import java.io.PrintWriter;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import javax.naming.*;
import javax.jcr.Repository;
import javax.jcr.RepositoryException;
import javax.jcr.Session;
import javax.jcr.SimpleCredentials;

public class JNDIRepositoryLogin extends HttpServlet
{

    // Left over from the Netbeans servlet template; only doPost uses it.
    protected void processRequest(HttpServletRequest request, HttpServletResponse response)
    throws ServletException, IOException
    {
        response.setContentType("text/html;charset=UTF-8");
        PrintWriter out = response.getWriter();
        try
        {}
        finally { out.close(); }
    }

    @Override
    protected void doGet(HttpServletRequest request, HttpServletResponse response)
    throws ServletException, IOException
    {
        response.setContentType("text/html;charset=UTF-8");
        PrintWriter writer = response.getWriter();
        Session session = null;

        try
        {
            // Look up the repository registered as a Custom Resource in Glassfish.
            InitialContext ctx = new InitialContext();
            Repository repository = (Repository) ctx.lookup("jcr/testrepository");

            session = repository.login(new SimpleCredentials(
                "anonymous", "anonymous".toCharArray()));

            writer.println("<h1>Success</h1>");
            writer.println("logged in as user: " + session.getUserID());
            ctx.close();
        }
        catch(NamingException ex)
        {
            ex.printStackTrace(writer);
        }
        catch(javax.jcr.LoginException ex)
        {
            ex.printStackTrace(writer);
        }
        catch(RepositoryException ex)
        {
            ex.printStackTrace(writer);
        }
        finally
        {
            // Always release the session, even if an exception was thrown.
            if(null != session)
                session.logout();
        }
    }

    @Override
    protected void doPost(HttpServletRequest request, HttpServletResponse response)
    throws ServletException, IOException {
        processRequest(request, response);
    }

    @Override
    public String getServletInfo() {
        return "Short description";
    }
}


You can now launch your application and enter the URL:
http://localhost:8080/testrepositoryweb/JNDIRepositoryLogin
You should see a web page displaying a "Success" heading followed by the name of the logged-in user.




Datastore

The last feature we will set up in this tutorial is the Datastore. Setting up a Datastore is trivial: simply add the following entry after the opening Repository tag in your repository.xml configuration file:
<DataStore class="org.apache.jackrabbit.core.data.db.DbDataStore">
   <param name="url" value="jdbc:postgresql:testrepository"/>
   <param name="user" value="postgres"/>
   <param name="password" value=""/>
   <param name="databaseType" value="postgresql"/>
   <param name="driver" value="org.postgresql.Driver"/>
   <param name="minRecordLength" value="1024"/>
   <param name="maxConnections" value="3"/>
   <param name="copyWhenReading" value="true"/>
   <param name="tablePrefix" value=""/>
</DataStore>

Don't forget to set the username and password to match your DB configuration.

Saturday, December 13, 2008

Glassfish + Jackrabbit + PostgreSQL

This tutorial explains how to set up a Jackrabbit repository with PostgreSQL on Glassfish V3 and V2. I am using Ubuntu 8.10.

I am going to use the jackrabbit-webapp-1.5.0.war web application shipped with Jackrabbit 1.5.0 to create a repository inside Glassfish.

In the next tutorial, a simple web application will be created using Netbeans 6.5 to show how to programmatically access the repository (insert and search) via JNDI from a servlet.

Repository set up in Glassfish

First of all, you need to copy the following libraries into this directory for Glassfish V3 Prelude:
 $GLASSFISH/glassfish/domains/$DOMAINNAME/lib/ext/
and for Glassfish V2 in:
 $GLASSFISH/domains/$DOMAINNAME/lib/ext/
where $GLASSFISH is your Glassfish installation directory (e.g. glassfish-v3-prelude) and $DOMAINNAME is your domain (domain1 is the default).
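
The two layouts differ only by the extra glassfish directory in V3 Prelude. A tiny sketch of that rule, in case you want to script the copy; the GlassfishPaths class and the /opt/... installation paths are made up for illustration:

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper; class name and example paths are assumptions.
final class GlassfishPaths {

    // Builds the lib/ext directory of a domain for V3 Prelude or V2.
    static Path libExtDir(String glassfishHome, String domainName, boolean v3Prelude) {
        Path base = Paths.get(glassfishHome);
        if (v3Prelude) {
            // V3 Prelude nests the domains under an extra "glassfish" directory.
            base = base.resolve("glassfish");
        }
        return base.resolve("domains").resolve(domainName).resolve("lib").resolve("ext");
    }

    public static void main(String[] args) {
        System.out.println(libExtDir("/opt/glassfish-v3-prelude", "domain1", true));
        System.out.println(libExtDir("/opt/glassfish-v2", "domain1", false));
    }
}
```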

The libraries are:
I suggest you compile the Jackrabbit source using Maven to get the needed Jackrabbit libraries. The PDFBox and POI libraries are for document text parsing and we will use them later.

Restart your domain and deploy the jackrabbit-webapp-1.5.0.war. You can use the Glassfish admin web tool (usually at http://localhost:4848/) to deploy it. I will assume you have deployed with the Context Root set to /jackrabbit on port 8080.

We can now test our installation by navigating to http://localhost:8080/jackrabbit/ in a web browser, where you should see the Jackrabbit web application page.




Note that the web application uses a configuration file, bootstrap.properties, located in:
   $GLASSFISH/glassfish/domains/$DOMAINNAME/config/jackrabbit
You have to modify this file if you want to change the repository or any other setting. If you want to re-deploy starting with a fresh configuration, you can delete the folder.
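
Since bootstrap.properties is an ordinary Java properties file, you can inspect it with java.util.Properties before editing it. A minimal sketch follows; the key names repository.home and repository.config in the sample text are placeholders of mine, so check your own file for the real keys:

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical helper; the sample keys below are assumptions, not verified names.
final class BootstrapProps {

    // Loads key=value pairs from any Reader (use a FileReader for the real file).
    static Properties load(Reader in) throws IOException {
        Properties props = new Properties();
        props.load(in);
        return props;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the contents of bootstrap.properties.
        String sample = "repository.home=/home/user/testrepository\n"
                      + "repository.config=/home/user/testrepository/repository.xml\n";
        Properties props = load(new StringReader(sample));
        for (String key : props.stringPropertyNames()) {
            System.out.println(key + " = " + props.getProperty(key));
        }
    }
}
```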

If you choose the default jackrabbit repository, it will be created in the same config directory we have just discussed. For V3 Prelude:
   $GLASSFISH/glassfish/domains/$DOMAINNAME/config/jackrabbit/repository
for V2:
 $GLASSFISH/domains/$DOMAINNAME/config/jackrabbit/repository
Alternatively, you can select any directory for which your user has read/write permission. From now onwards I will assume the repository has been created in
   $HOME/testrepository

PostgreSQL Configuration

It is now time to configure the repository (which by default uses the built-in Derby database) to use PostgreSQL instead.

First of all, create a database named testrepository. I will create one using PgAdmin with user postgres and template postgres.


Then add the PostgreSQL JDBC driver postgresql-8.3-604.jdbc4.jar, again in the lib/ext directory (V3 Prelude):
   $GLASSFISH/glassfish/domains/$DOMAINNAME/lib/ext/ 
while for V2 in:
  $GLASSFISH/domains/$DOMAINNAME/lib/ext/
There are several PersistenceManager implementations you can use; I prefer the Bundle Database PM.

In order to switch from Derby to PostgreSQL, edit the configuration file
   $HOME/testrepository/repository.xml
look for the two entries under Workspace and Versioning:
   <PersistenceManager class="org.apache.jackrabbit.core.persistence.bundle.DerbyPersistenceManager">
      <param name="url" value="jdbc:derby:${rep.home}/version/db;create=true"/>
      <param name="schemaObjectPrefix" value="version_"/>
   </PersistenceManager>
and substitute them with:
   <PersistenceManager class="org.apache.jackrabbit.core.persistence.bundle.PostgreSQLPersistenceManager">
      <param name="driver" value="org.postgresql.Driver"/>
      <param name="url" value="jdbc:postgresql://localhost:5432/testrepository"/>
      <param name="schema" value="postgresql"/>
      <param name="user" value="postgres"/>
      <param name="password" value=""/>
      <param name="schemaObjectPrefix" value="public"/>
      <param name="externalBLOBs" value="false"/>
   </PersistenceManager>
A few things to be aware of: the schema entry is not really a DB schema but rather the "type of DB". The schemaObjectPrefix, on the other hand, is a prefix that Jackrabbit prepends to the names of the tables it creates (it is not a PostgreSQL schema name); in this example I am using public. Of course, enter your postgres user password in the password entry.
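
If you want to double-check what you typed, here is a small sketch that reads the param entries of a PersistenceManager element with the JDK's DOM parser. The PersistenceManagerParams class is hypothetical and it only parses a fragment held in a string; a full repository.xml may declare a DOCTYPE that a default parser will try to resolve, which this sketch does not handle:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper for illustration; not part of Jackrabbit.
final class PersistenceManagerParams {

    // Collects the class attribute and the name/value pairs of all <param>
    // children of the first <PersistenceManager> element in the given XML.
    static Map<String, String> read(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        Map<String, String> params = new LinkedHashMap<>();
        NodeList managers = doc.getElementsByTagName("PersistenceManager");
        if (managers.getLength() == 0) {
            return params;
        }
        Element pm = (Element) managers.item(0);
        params.put("class", pm.getAttribute("class"));
        NodeList list = pm.getElementsByTagName("param");
        for (int i = 0; i < list.getLength(); i++) {
            Element param = (Element) list.item(i);
            params.put(param.getAttribute("name"), param.getAttribute("value"));
        }
        return params;
    }

    public static void main(String[] args) throws Exception {
        String fragment =
            "<PersistenceManager class=\"org.apache.jackrabbit.core.persistence.bundle.PostgreSQLPersistenceManager\">"
          + "<param name=\"url\" value=\"jdbc:postgresql://localhost:5432/testrepository\"/>"
          + "<param name=\"schema\" value=\"postgresql\"/>"
          + "</PersistenceManager>";
        for (Map.Entry<String, String> e : read(fragment).entrySet()) {
            System.out.println(e.getKey() + " = " + e.getValue());
        }
    }
}
```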

Restart the Glassfish application server and if you use PgAdmin you should see the following tables being created: