Tuesday, December 23, 2008

Customizing DSpace v1.5.1

Prior to version 1.5.1 of DSpace, customization was done using a separate project called Manakin. The name persists within DSpace, including the DSpace documentation, but it is all integrated now: customization is done in the dspace tree, not in a separate project.

Details on how to customize DSpace are very thin on the ground, and most of what exists describes how to do it when Manakin was a separate project. I have managed to find something that works for me. Here's how I did it:

  • Stop tomcat.
  • Delete the tomcat dir webapps/xmlui.
  • Go to the dir in which you unpacked dspace. My dir is called dspace-1.5.1-src-release.
  • Go to the sub-dir where the themes are stored. The documentation I found leads you to the wrong place. I found I had to go to dspace-xmlui/dspace-xmlui-webapp/src/main/webapp/themes.
  • Copy the dir that is closest to the theme you want. Copy the whole dir to a new target theme name. The one I copy is Reference.
  • This next step is CRUCIAL - edit sitemap.xmap to make it refer to the new theme name. If you don't then it will still refer to Reference! You need to go to the map:components section near the start. There you will see a section that looks like this:
    <map:component-configurations>
        <global-variables>
            <theme-path>Reference</theme-path>
            <theme-name>put your theme description 
                here</theme-name>
        </global-variables>
    </map:component-configurations>
      
    Change Reference to the name of your new theme dir and update the theme-name description.
  • Edit dspace/config/xmlui.xconf. Go to the end, where you will see the path defined for the default theme. It is Reference/. Note the trailing slash; it is important. Replace Reference with your theme name.
  • Change something in your theme dir so that the change will be visually obvious. For example you can change the style sheet to refer to a different background image.
  • Rebuild dspace with these theme changes. Use the command 'mvn package'. This builds to [dspace-src]/dspace/target.
  • Update the dspace installation (which is in c:\dspace for me). Use the command 'ant update'.
  • Copy the dir dspace/webapps/xmlui to the tomcat webapps dir.
  • Restart tomcat.
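
The sitemap.xmap edit is the step that is easiest to forget, so it is a good candidate for scripting. Here is a minimal Python sketch, assuming the file contains exactly one theme-path and one theme-name element as in the snippet above. The function name retarget_theme and the theme name MyTheme are made up for illustration.

```python
import re


def retarget_theme(sitemap_xml: str, theme_dir: str, description: str) -> str:
    """Return sitemap.xmap text with <theme-path> and <theme-name> rewritten."""
    # Replace whatever sits between the <theme-path> tags with the new dir name.
    out = re.sub(
        r"(<theme-path>).*?(</theme-path>)",
        lambda m: m.group(1) + theme_dir + m.group(2),
        sitemap_xml,
        count=1,
        flags=re.DOTALL,
    )
    # The description may span several lines, hence DOTALL here as well.
    out = re.sub(
        r"(<theme-name>).*?(</theme-name>)",
        lambda m: m.group(1) + description + m.group(2),
        out,
        count=1,
        flags=re.DOTALL,
    )
    return out


SAMPLE = """<map:component-configurations>
    <global-variables>
        <theme-path>Reference</theme-path>
        <theme-name>put your theme description
            here</theme-name>
    </global-variables>
</map:component-configurations>"""

updated = retarget_theme(SAMPLE, "MyTheme", "My custom theme")
print(updated)
```

To use it for real you would read sitemap.xmap from the copied theme dir, pass the text through retarget_theme and write the result back.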

Wednesday, December 17, 2008

importing into DSpace

I finally managed to import a text file into DSpace today. It was "Decline of Science in England" by Charles Babbage. Here's how I did it:

  • Create an import directory. I called mine test-import.
  • Create a directory below the import directory. I named it decline, after the book.
  • Put the text file into the directory 'decline'.
  • Create the file dublin_core.xml, in the 'decline' directory. The file looked like this:
    <dublin_core>
        <dcvalue element="title" qualifier="none">
            Decline of Science in England
        </dcvalue>
    
        <dcvalue element="subject" qualifier="none">
           science
        </dcvalue>
    </dublin_core>
    
  • Create a file called contents in the 'decline' directory. The file contains a list of filenames, one per line. In this case I had body (which was the name of my text file) and preface, which was a separate preface file I created.

To import the item I then created a community and a collection under it, and gave this command:

bin\dsrun org.dspace.app.itemimport.ItemImport -a -e emailAddress
-m mapfile.out -c 125050010/1469 -s test-import
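
The directory layout is simple enough to generate from a script. Here is a rough Python sketch that builds one item directory with its dublin_core.xml, the files themselves and the list file. It assumes the list file is named contents, which is the name used in the DSpace simple archive format documentation. The function name build_item_dir and the file bodies are invented for the example.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def build_item_dir(import_root, item_name, metadata, files):
    """Create import_root/item_name holding dublin_core.xml, the files and a list file.

    metadata: list of (element, qualifier, value) tuples
    files:    dict mapping filename -> text content
    """
    item_dir = Path(import_root) / item_name
    item_dir.mkdir(parents=True, exist_ok=True)

    # One <dcvalue> per metadata entry, all under a <dublin_core> root.
    root = ET.Element("dublin_core")
    for element, qualifier, value in metadata:
        dcvalue = ET.SubElement(root, "dcvalue", element=element, qualifier=qualifier)
        dcvalue.text = value
    ET.ElementTree(root).write(
        str(item_dir / "dublin_core.xml"), encoding="utf-8", xml_declaration=True
    )

    for name, text in files.items():
        (item_dir / name).write_text(text, encoding="utf-8")

    # The list file: one filename per line, in the order given.
    (item_dir / "contents").write_text("\n".join(files) + "\n", encoding="utf-8")
    return item_dir


with tempfile.TemporaryDirectory() as tmp:
    item = build_item_dir(
        tmp,
        "decline",
        [("title", "none", "Decline of Science in England"),
         ("subject", "none", "science")],
        {"body": "main text goes here", "preface": "preface text goes here"},
    )
    listing = sorted(p.name for p in item.iterdir())
    contents_text = (item / "contents").read_text(encoding="utf-8")
    titles = [d.text for d in ET.parse(str(item / "dublin_core.xml")).getroot()]
    print(listing)
```

The example writes into a temporary directory; point import_root at test-import to build a directory the importer can read.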

Wednesday, November 26, 2008

setting up DSpace and tomcat on windoze

My attention has turned to digital repositories. I looked at Greenstone but had enough problems with even a simple setup (admittedly trying to build from source) that I gave up. So I am now looking at DSpace. I had terrible trouble getting it to work on Windoze with Tomcat. I got it working eventually. Here is the tale of my voyage of discovery.

Prerequisites

Postgres.

Go to http://www.postgresql.org/download/ and click on the link for windoze. This takes you to a one-click installer for 8.3, which runs a nice setup wizard. I used database superuser = postgres,1gandalf, listening port = 5432 (the default) and locale = English, United Kingdom. I got the popup "A non-fatal error occured during the cluster initialisation. Please check the installation log in /tmp for details." There did not seem to be a log in either c:\temp or c:\windows\temp. The wizard then went on to completion. The service says it is starting for quite a while, then stops. This seems wrong. The windoze event log has a timeout waiting for server startup, preceded by a fatal error "could not create lock file 'postmaster.pid', permission denied". I went into c:/Program Files/PostgreSQL/8.3 and did a recursive chmod 777, then restarted the service. This fixed it.

tomcat.

Download the binary distro for tomcat 5.5 core, windows installer. Connector port = 8030 (default is 8080). u,pw=admin,1gandalf. The service fails to start. Turn on debug in the tomcat configuration panel, then restart. The catalina logfile shows a bind error, address already in use. This was because there was a leftover process that interfered. I found the process using tcpview. I need to put tcpview on my web pages; it is much better than netstat. After the kill, tomcat restarted just fine.

maven.

It does not seem to come with anything ready for windoze. I unpacked the binary zip and found it is an installation directory. mvn can be run from the bin directory. Add the full pathname of the bin dir to the PATH using the control panel.

postgres (continued)

Run PG Admin III (start->PostgreSQL 8.3->pg Admin III). Double-click on localhost to connect to the db and enter the password for the postgres user. Double-click on loginRoles then rightMouse. Select 'New login role...'. u,pw=dspace,dspace; check 'can create database objects' and 'can create roles'. Double-click on databases then rightMouse and select 'new database...'. name=dspace, owner=dspace, UTF8 encoding. Under privilege properties, add user/group public with connect rights.

dspace.

Edit dspace/config/dspace.cfg.
Change pathnames so they work for Windoze, e.g. /dspace => c:/dspace
Change the port number to 8030, i.e. the one used by tomcat.
db.name = postgres
db.driver = org.postgresql.Driver
db.username = dspace
db.password = dspace
mail.server=mail.company.co.uk
mail.from.address = amarlow@company.com
feedback.recipient = amarlow@company.com
handle.prefix = 123400009
authentication.password.domain.valid = company.com
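
The pathname change is a mechanical search and replace, so here is a small Python sketch of it. It assumes every path to be changed is a property value that starts with /dspace. The function name windowsify_paths is made up, and log.dir in the sample is only an illustrative property name.

```python
import re


def windowsify_paths(cfg_text: str, unix_root: str = "/dspace",
                     win_root: str = "c:/dspace") -> str:
    """Rewrite property values that start with unix_root so they start with win_root."""
    pattern = re.compile(
        # group 1 is "key = "; the lookahead stops /dspace matching e.g. /dspacefoo
        r"^([ \t]*[\w.\-]+[ \t]*=[ \t]*)" + re.escape(unix_root) + r"(?=/|[ \t]*$)",
        re.MULTILINE,
    )
    return pattern.sub(lambda m: m.group(1) + win_root, cfg_text)


SAMPLE = """# comment lines are left alone: /dspace
dspace.dir = /dspace
log.dir = /dspace/log
db.username = dspace
"""

converted = windowsify_paths(SAMPLE)
print(converted)
```

Only values that begin with the old root are touched, so db.username = dspace and the comment line come through unchanged.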

Create c:\dspace. cd to where dspace was unpacked and run mvn package. This starts by doing lots of downloading and takes a *long* time: over 77 minutes! Then cd dspace/target/dspace-1.5.1-build.dir and run ant fresh_install.

using a DOS cmd window:
cd c:\dspace
bin\dsrun org.dspace.administer.CreateAdministrator
email: amarlow@company.com
first name: andrew
last name: marlow
password: 1gandalf

Shut down tomcat. cd to the tomcat conf dir and edit context.xml, putting in the dspace stuff. The context file is new to tomcat v5. Restart tomcat. http://localhost:8030/jspui does not work! I could not get this to work. I tried copying webapps/jspui to the tomcat webapps dir but this also failed, for a different reason: it gave an error in the tomcat log because it couldn't find ${dspace.dir}\config\dspace.cfg. I also get an NPE trying to use log4j when it is not properly configured. The problem seems to be to do with the config directory not being copied over. But where to put it?

See http://mailman.mit.edu/pipermail/dspace-general/2008-July/002103.html for someone else who had exactly the same problem as me.

There seems to be a servlet, LoadDSpaceConfig, that is supposed to manage this and presumably report if there are any problems. But it seems that this is not working. I found this little nugget in the DSpace manual on page 111, way after the installation instructions:

The org.dspace.app.webui.servlet.LoadDSpaceConfig servlet is always loaded first. This is a very simple servlet that checks the dspace-config context parameter from the DSpace deployment descriptor, and uses it to locate dspace.cfg. It also loads up the Log4j configuration. It's important that this servlet is loaded first, since if another servlet is loaded up, it will cause the system to try and load DSpace and Log4j configurations, neither of which would be found.

I removed the context stuff from context.xml. This is to pursue a working configuration using the jspui that is copied into the tomcat webapps dir.

I edited webapps/jspui/WEB-INF/web.xml, replacing the dspace.dir param value with the DOS path (C:\dspace.....). This did have an effect. It gave the following error:

log4j:WARN No appenders could be found for logger (org.apache.commons.digester.Digester.sax).
log4j:WARN Please initialize the log4j system properly.
INFO: Loading provided config file: c:\dspace\config\dspace.cfg
INFO: Using dspace provided log configuration (log.init.config)
INFO: Loading: c:/dspace/config/log4j.properties
log4j:ERROR setFile(null,true) call failed.
java.io.FileNotFoundException: c:\dspace\log\dspace.log (Access is denied)
        at java.io.FileOutputStream.openAppend(Native Method)
        at java.io.FileOutputStream.<init>(Unknown Source)
        at java.io.FileOutputStream.<init>(Unknown Source)
        at org.apache.log4j.FileAppender.setFile(FileAppender.java:289)
        at org.apache.log4j.RollingFileAppender.setFile(RollingFileAppender.java:167)

To fix this I did cd dspace;chmod -R 777 .

This gave

log4j:WARN No appenders could be found for logger (org.apache.commons.digester.Digester.sax).
log4j:WARN Please initialize the log4j system properly.
INFO: Loading provided config file: c:\dspace\config\dspace.cfg
INFO: Using dspace provided log configuration (log.init.config)
INFO: Loading: c:/dspace/config/log4j.properties

This looks a lot healthier. Now when I visit http://localhost:8030/jspui/ I get the page I expect. Wow!

Saturday, November 22, 2008

ant, ivy, archiva, maven

I have been looking into how to use ivy to manage dependencies during an ant build. And this caused me to look at apache archiva. It is possible to use ant and ivy without archiva, but unless one has a local maven mirror this will involve ivy going to an internet maven repo, which can consume a lot of network bandwidth. I think this is the point of archiva: to have it as a local maven mirror when ivy finds the jar is not in its cache. Apparently, it can all be configured so that archiva is dynamically updated with missing components by taking them from a maven repo.

I tried and tried to get archiva working but I just could not do it. I've no idea what I am doing wrong. I got so cheesed off that in desperation I set up a jar+pom repo on my own workstation's web pages. Ivy has no problem reaching those. This seems to make archiva redundant. Given that once the jar has been located ivy can be configured to put it in a cache, this does make me wonder what the point of archiva is.

Another weirdness I found, which has got me really cheesed off with archiva and maven, is the state of the poms in the maven mirrors that I found. The mirrors seem to be a complete mess, with most things in the wrong places. For example xstream from thoughtworks is at the top level, and in org/codehaus and com/thoughtworks. Also the pom files always have a group name with dots instead of slashes. This tends to cause the download to give a flat directory structure. I manually set up structures for my web pages that keep the package structure in the jars. This meant I had to hack the poms to fix the group names. Of course this would not work with archiva since these changes would require a recalculation of the sha1 values. I'm glad I don't have to do that.

So, I've got something that works for me. It does involve a lot of manual work to find the right jar file in a maven repo, given all the redundant copies in other parts of the tree. And I have to edit practically every pom file I download to avoid getting a completely flat directory structure. But this is the price I am willing to pay to have a web space structure that follows the java package structure exactly for all the jars I have.
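
For reference, the standard Maven 2 repository layout turns a dotted group id into a directory path by replacing the dots with slashes. Here is a minimal Python sketch of that mapping, together with the SHA-1 calculation that an edited pom would need if checksums were being verified. The version number 1.3 is just an illustrative assumption, and the function names are made up.

```python
import hashlib


def artifact_path(group_id: str, artifact_id: str, version: str, ext: str = "jar") -> str:
    """Relative path of an artifact in a Maven 2 style repository.

    The dotted group id becomes nested directories, followed by the
    artifact id, the version, and finally the file itself.
    """
    group_dir = group_id.replace(".", "/")
    return f"{group_dir}/{artifact_id}/{version}/{artifact_id}-{version}.{ext}"


def sha1_hex(data: bytes) -> str:
    """Hex SHA-1 digest of the given bytes."""
    return hashlib.sha1(data).hexdigest()


jar = artifact_path("com.thoughtworks.xstream", "xstream", "1.3")
pom = artifact_path("com.thoughtworks.xstream", "xstream", "1.3", ext="pom")
print(jar)
print(pom)
```

Feeding the bytes of an edited pom to sha1_hex gives the value that would go in the matching .sha1 file.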

Friday, November 14, 2008

how to mount an ISO image on Windoze

Every now and then I need to mount an ISO image on Windoze. This saves having to burn a CD/DVD. Here's how to do it:

Go to http://www.tech-recipes.com/rx/62/xp_small_free_way_to_use_and_mount_images_iso_files_without_burning_them

and click on the download link, winxpvirtualcdcontrolpanel. This is a ZIP that you need to 'run'. It is a self-extracting archive. It contains readme.txt, VCdControlTool.exe and VCdRom.sys.

Installation instructions
=========================
1. Copy VCdRom.sys to your %systemroot%\system32\drivers folder.
2. Execute VCdControlTool.exe
3. Click “Driver control”
4. If the “Install Driver” button is available, click it. Navigate to 
   the %systemroot%\system32\drivers folder, select VCdRom.sys, and click Open.
5. Click “Start”
6. Click OK
7. Click “Add Drive” to add a drive to the drive list. Ensure that the drive added is not a local drive. If it is, continue to click “Add Drive” until an unused drive letter is available.
8. Select an unused drive letter from the drive list and click “Mount”.
9. Navigate to the image file, select it, and click “OK”. UNC naming conventions should not be used, however mapped network drives should be OK.

Thursday, November 13, 2008

Using ActiveMQ and JNDI

I am using JMS again, this time via ActiveMQ. The difference is that this time there is no admin department to set up the queues for me and tell me what property names are needed for the various attributes, such as the queue connection factory. Also I need to set up JNDI myself. On Windoze I wanted to set up the ActiveMQ server as a service. This was not documented but luckily it comes with a simple batch service installer. After a bit of digging around I found that once AMQ is running you can connect to a web-based admin facility via http://localhost:8161/admin. This includes the ability to browse the queues. Hermes may not be needed after all! There does appear to be some JNDI stuff inside ActiveMQ. Some help is given in http://activemq.apache.org/jndi-support.html but it is very short on detail. Dynamic queues let us test things easily. Another little gotcha I found was that the binary install of ActiveMQ does not install the XStream jar, but that jar is needed to examine the contents of queue messages via the web-based queue browser.
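
As far as I can tell from the JNDI support page, the client-side JNDI setup boils down to a jndi.properties file on the classpath. Here is a small Python sketch that generates one. The two java.naming property names and the queue.<name> convention come from that page. The broker URL tcp://localhost:61616 assumes the default port, and the queue names are the page's own placeholder examples. Treat it as a starting point, not a tested configuration.

```python
def jndi_properties(broker_url, queues):
    """Render jndi.properties text for an ActiveMQ client.

    queues maps the JNDI lookup name to the physical queue name.
    """
    lines = [
        "java.naming.factory.initial = "
        "org.apache.activemq.jndi.ActiveMQInitialContextFactory",
        f"java.naming.provider.url = {broker_url}",
    ]
    # Each queue gets one "queue.<jndi name> = <physical name>" entry.
    for jndi_name, physical_name in queues.items():
        lines.append(f"queue.{jndi_name} = {physical_name}")
    return "\n".join(lines) + "\n"


text = jndi_properties("tcp://localhost:61616", {"MyQueue": "example.MyQueue"})
print(text)
```

With dynamic queues no queue entry is needed at all: the same page describes looking up names of the form dynamicQueues/FOO.BAR directly.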

Thursday, November 06, 2008

I am learning how to drive eclipse at last!

At long last I have taken the plunge! I am also using ivy in my ant build so I found I need to tell eclipse about this. The Ivy and IvyDE eclipse plugins need to be installed. But I wasn't sure how to proceed from that point. I found a tutorial at http://jira.red5.org/confluence/display/docs/Ivy+setup+with+Eclipse which helps.

Wednesday, August 06, 2008

xemacs and key mapping problems

For some time now I have got a warning when starting xemacs. The warning causes the window to be split into two, with the warning in the lower window. The warning is something to do with the keyboard mapping having mod1 and mod3 defined to the same key. I googled for answers but they all said the same - fix it using xmodmap. I tried it and it didn't work. So in the end I added the following to the end of my .emacs startup file:
(when (featurep 'xemacs)
  ;; Don't warn me about nonsensical X11 modifier issues. I don't care
  ;; if Alt is simultaneously Mod2 and Mod5 on this machine; I don't
  ;; press it anyway. Yay separate Meta keys. :)
  (setq display-warning-minimum-level 'error)
  (setq log-warning-minimum-level 'info))

Monday, May 12, 2008

Windoze, php and mySQL

As of PHP 5 the windoze installation no longer enables mySQL by default. Here's what you have to do: edit php.ini, adding the following lines:
extension_dir = c:\Program Files\PHP\ext
extension=php_mysql.dll
extension=php_pdo_mysql.dll
Now restart Apache. A simple php page with the phpinfo command should reveal that the mySQL extension is now present.

Sunday, April 13, 2008

How to fix corrupt Windoze registry permissions

For some unknown reason my Windoze machine got the registry all screwed up. Hey, it's Windoze, every Windoze user has this done to them eventually. The effect was that I could not reinstall Micro$oft Office coz it gave a fatal registry error during the de-install. All the Micro$oft knowledge base articles were as useless as each other. Shareware registry cleanup utilities didn't clean it up either. Eventually I found out how to do this. You need a tool called subInAcl. Download it from http://www.microsoft.com/downloads/details.aspx?FamilyID=e8ba3e56-d8fe-4a91-93cf-ed6985e3927b&displaylang=en Then create a DOS batch file, let's call it fix-office-perms.cmd, containing the following:
cd /d "%programfiles%\Windows Resource Kits\Tools"
subinacl /subkeyreg HKEY_LOCAL_MACHINE /grant=administrators=f /grant=system=f
subinacl /subkeyreg HKEY_CURRENT_USER /grant=administrators=f /grant=system=f
subinacl /subkeyreg HKEY_CLASSES_ROOT /grant=administrators=f /grant=system=f
subinacl /subdirectories %SystemDrive% /grant=administrators=f /grant=system=f
subinacl /subdirectories %windir%\*.* /grant=administrators=f /grant=system=f
Start a DOS command window, cd to where you put this script, then run it. Watch it repair the permissions. It takes several minutes to run. Strewth, what a hassle! But at least there is a way out. The corrupt registry seemed to be triggered by an error that happened during an attempted install of Visio. Towards the end of the install it said:
Could not open key: UNKNOWN\MsoHelp.HtmlHelp\CLSID.
Verify that you have sufficient access to that key,
or contact your support personnel.

Wednesday, March 26, 2008

Apache and skype conflict over port 80

I added apache to my Windoze setup recently. My windoze machine also has skype. This is when I discovered a nasty interaction. I found it on a forum at http://forums.devarticles.com/programming-tools-11/os-10048t-11921.html. Skype uses port 80! Yes, it even uses 443 (the https port) as well. Luckily these are the default settings and can be changed. And change them you must, otherwise apache mysteriously fails to start. The Windoze event log reveals nothing. I eventually found it was a port conflict by trying to start httpd from the DOS prompt.
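
A quick way to spot this kind of conflict before starting a server is to see whether anything already accepts connections on the port. Here is a minimal Python sketch. It only checks TCP listeners on the local machine, and the function names are invented for the example.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(1.0)
        # connect_ex returns 0 on success instead of raising an exception.
        return s.connect_ex((host, port)) == 0


def busy_ports(ports, host="127.0.0.1"):
    """Return the subset of ports that already have a listener."""
    return [p for p in ports if port_in_use(p, host)]


if __name__ == "__main__":
    # The http and https ports that apache wants.
    print(busy_ports([80, 443]))
```

It does not say which program holds the port; for that you still need something like tcpview or netstat.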