Difference between revisions of "FITS"

From CURATEcamp

Revision as of 21:22, 15 November 2012

This page is for notes on how to optimize FITS.

Thread parallelism and memory consumption

FITS runs all its tools in parallel threads. This can result in heavy memory consumption, particularly if a big file is being processed. Harvard optimized FITS for DRS ingest, which can afford a lot of memory. In other environments this might cause thrashing.

One approach might be to add a command line or config parameter to limit the number of simultaneous threads.
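As a rough illustration of that idea, the sketch below caps concurrency with a fixed-size thread pool. It is not FITS code: the class `BoundedToolRunner`, the method `runAll`, and the system property `fits.maxThreads` are all hypothetical names, and each "tool" is just a stand-in `Runnable`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; none of these names exist in FITS.
class BoundedToolRunner {

    /**
     * Runs every tool, with at most maxThreads executing at once.
     * Returns the highest number of tools observed running simultaneously.
     */
    static int runAll(List<Runnable> tools, int maxThreads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(maxThreads);
        AtomicInteger active = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        List<Future<?>> pending = new ArrayList<>();
        try {
            for (Runnable tool : tools) {
                pending.add(pool.submit(() -> {
                    // Track how many tools are running right now.
                    int now = active.incrementAndGet();
                    peak.accumulateAndGet(now, Math::max);
                    try {
                        tool.run();
                    } finally {
                        active.decrementAndGet();
                    }
                }));
            }
            // Wait for every tool to finish before returning.
            for (Future<?> f : pending) {
                f.get();
            }
        } finally {
            pool.shutdown();
        }
        return peak.get();
    }

    public static void main(String[] args) throws Exception {
        // "fits.maxThreads" is a made-up property, e.g. java -Dfits.maxThreads=2 ...
        int maxThreads = Integer.getInteger("fits.maxThreads", 2);
        List<Runnable> tools = new ArrayList<>();
        for (int i = 0; i < 5; i++) {
            final int id = i;
            tools.add(() -> System.out.println("stand-in tool " + id + " finished"));
        }
        System.out.println("peak concurrent tools: " + runAll(tools, maxThreads));
    }
}
```

Tools beyond the limit simply wait in the pool's queue, so peak memory is bounded by the heaviest `maxThreads` tools rather than by all of them at once. The trade-off is longer wall-clock time per file.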

FITS on GitHub

A fork of FITS (https://github.com/gmcgath/fits) is now up on GitHub. Let me know if you want to be added as a contributor.

The fork now includes the JHOVE 1.8 jars.

Optimization tip

The HTML module in JHOVE is very slow and not very useful unless you're rejecting the 90% of HTML files that aren't strictly valid. If you don't need it, edit xml/jhove/jhove.conf and remove the module element that refers to edu.harvard.hul.ois.jhove.module.HtmlModule.
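Editing the file by hand is the simplest route. For anyone who wants to script the change, here is a minimal sketch using the JDK's DOM API. It assumes each module entry in jhove.conf is a `module` element containing a `class` child that holds the class name; check your copy of the file before relying on that. The class `JhoveModuleRemover` and its methods are hypothetical names, not part of FITS or JHOVE.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper; assumes <module><class>...</class></module> entries.
class JhoveModuleRemover {

    /** Parses a configuration document from a string. */
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    /** Removes every module element whose class child equals className; returns the count removed. */
    static int removeModule(Document doc, String className) {
        NodeList modules = doc.getElementsByTagName("module");
        int removed = 0;
        // Walk backwards: the NodeList is live and shrinks as nodes are removed.
        for (int i = modules.getLength() - 1; i >= 0; i--) {
            Element module = (Element) modules.item(i);
            NodeList classes = module.getElementsByTagName("class");
            for (int j = 0; j < classes.getLength(); j++) {
                if (className.equals(classes.item(j).getTextContent().trim())) {
                    module.getParentNode().removeChild(module);
                    removed++;
                    break;
                }
            }
        }
        return removed;
    }
}
```

The sketch only modifies the in-memory document. Writing it back to xml/jhove/jhove.conf is left out, and keeping a backup of the original file first would be prudent.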