Thursday, January 21, 2010

PDFTOHTML ON UBUNTU

pdftohtml is a tool that converts PDF files into HTML or XML files.

http://pdftohtml.sourceforge.net/

The tool can be installed on Ubuntu with the following commands (run them as root, or prefix each with sudo):

apt-get install poppler-utils

apt-get install pdftohtml

Once done, type pdftohtml to check whether it is installed properly. If it is, the help information is displayed.

Cheers!

Wednesday, January 13, 2010

Use of BigDecimal

Working with floating-point numbers can be fun. Typically, when working with amounts, you automatically think of using a double type, unless the value is a whole number, in which case an int type is usually sufficient. A float or long can also work out, depending upon the size of a value. When dealing with money, though, the floating-point types are absolutely the worst thing you can use, as they don't necessarily give you the right value, only the closest value that can be stored in a binary number format. Here is a short example that shows the perils of using a double for calculating a total, taking into account a discount, and adding in sales tax.

The Calc program starts with an amount of $100.05, then gives the user a 10% discount before adding back 5% sales tax. Your sales tax percentage may vary, but this example will use 5%. To display the results as currency, the program formats them with the NumberFormat class.

import java.text.NumberFormat;

public class Calc {
    public static void main(String[] args) {
        double amount = 100.05;
        double discount = amount * 0.10;
        double total = amount - discount;
        double tax = total * 0.05;
        double taxedTotal = tax + total;
        NumberFormat money = NumberFormat.getCurrencyInstance();
        System.out.println("Subtotal : " + money.format(amount));
        System.out.println("Discount : " + money.format(discount));
        System.out.println("Total : " + money.format(total));
        System.out.println("Tax : " + money.format(tax));
        System.out.println("Tax+Total: " + money.format(taxedTotal));
    }
}
Using a double type for all the internal calculations produces the following results:

Subtotal : $100.05
Discount : $10.00
Total : $90.04
Tax : $4.50
Tax+Total: $94.55
The Total value in the middle is what you might expect, but the Discount and Tax+Total values are off. The discount should be $10.01 to give you that $90.04 amount, and the displayed Tax+Total of $94.55 is a penny more than the $90.04 and $4.50 shown above it add up to. The tax office won't appreciate that. The problem is rounding error, and each calculation builds on the rounding errors before it. Here are the unformatted values:

Subtotal : 100.05
Discount : 10.005
Total : 90.045
Tax : 4.50225
Tax+Total: 94.54725
Looking at the unformatted values, the first question you might ask is why 90.045 rounds down to 90.04 and not up to 90.05 as you might expect (or why 10.005 rounds to 10.00). This is controlled by what is called the RoundingMode, an enumeration added to java.math in Java SE 5. NumberFormat gained the ability to get and set its rounding mode in Java SE 6; prior releases gave you no control over it. The acquired NumberFormat for currencies has a default rounding mode of HALF_EVEN. This means that when the value being rounded is equidistant from its two neighbors, it is rounded towards the even neighbor. According to the Java platform documentation for the enumeration, this statistically minimizes cumulative error when applied repeatedly over a sequence of calculations.
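
To see HALF_EVEN in isolation, here is a minimal sketch that rounds exact decimal strings with BigDecimal rather than formatting doubles, so binary representation does not get in the way. The class name HalfEvenDemo and the helper roundHalfEven are made up for this illustration.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class HalfEvenDemo {
    // Rounds an exact decimal string to two places with HALF_EVEN.
    static BigDecimal roundHalfEven(String value) {
        return new BigDecimal(value).setScale(2, RoundingMode.HALF_EVEN);
    }

    public static void main(String[] args) {
        // A tie goes to whichever neighbor has an even last digit.
        System.out.println(roundHalfEven("90.045")); // 90.04 (4 is even, so it stays)
        System.out.println(roundHalfEven("90.055")); // 90.06 (5 is odd, so it moves up)
        System.out.println(roundHalfEven("10.005")); // 10.00 (0 is even, so it stays)
        // Anything that is not an exact tie goes to the nearest neighbor.
        System.out.println(roundHalfEven("90.046")); // 90.05
    }
}
```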

The other available modes in the RoundingMode enumeration are:

CEILING which always rounds towards positive infinity
DOWN which always rounds towards zero
FLOOR which always rounds towards negative infinity
UP which always rounds away from zero
HALF_DOWN which always rounds towards the nearest neighbor, unless both neighbors are equidistant, in which case it rounds down (towards zero)
HALF_UP which always rounds towards the nearest neighbor, unless both neighbors are equidistant, in which case it rounds up (away from zero)
UNNECESSARY which asserts exact result, with no rounding necessary
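
The difference between the modes is easiest to see on a tie and its negative. The following sketch rounds 2.5 and -2.5 to a whole number under each mode; the class name RoundingModes and the helper round are assumptions made for this example.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class RoundingModes {
    // Rounds an exact decimal string to a whole number with the given mode.
    static BigDecimal round(String value, RoundingMode mode) {
        return new BigDecimal(value).setScale(0, mode);
    }

    public static void main(String[] args) {
        for (RoundingMode mode : RoundingMode.values()) {
            if (mode == RoundingMode.UNNECESSARY) {
                continue; // would throw ArithmeticException, because 2.5 needs rounding
            }
            System.out.println(mode + ": 2.5 -> " + round("2.5", mode)
                    + ", -2.5 -> " + round("-2.5", mode));
        }
        // UNNECESSARY succeeds only when no digits have to be discarded.
        System.out.println("UNNECESSARY: 2.0 -> " + round("2.0", RoundingMode.UNNECESSARY));
    }
}
```

Note how CEILING and FLOOR depend on the sign of the value, while UP and DOWN depend only on its magnitude.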
Before looking into how to correct the problem, let us look at a slightly modified result, starting with a value of 70 cents, and offering no discount.

Total : $0.70
Tax : $0.03
Tax+Total: $0.74
In the case of the 70 cent transaction, it isn't just a rounding problem. Looking at the values without formatting, here's the output:

Total : 0.7
Tax : 0.034999999999999996
Tax+Total: 0.735
The sales tax value 0.035 can't be stored exactly as a double, because it has no finite representation in binary. The double holds the nearest value that can be represented instead.
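
One way to check whether a double holds exactly the decimal value you wrote is to pass it to the BigDecimal(double) constructor, which preserves the stored binary value, and compare that with a BigDecimal built from the string. This is only a sketch; the class name ExactDouble and the helper isExact are hypothetical.

```java
import java.math.BigDecimal;

public class ExactDouble {
    // True when the double stores exactly the decimal value written in 'text'.
    static boolean isExact(double value, String text) {
        return new BigDecimal(value).compareTo(new BigDecimal(text)) == 0;
    }

    public static void main(String[] args) {
        System.out.println(isExact(0.035, "0.035")); // false: no finite binary form
        System.out.println(isExact(0.5, "0.5"));     // true: 0.5 is 1/2
        System.out.println(isExact(0.25, "0.25"));   // true: 0.25 is 1/4
        // Prints every digit of the value the double really holds.
        System.out.println(new BigDecimal(0.035));
    }
}
```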

The BigDecimal class helps solve some of the problems of doing floating-point operations with float and double. It stores decimal numbers with practically unlimited precision. To manipulate the data, you call the add(value), subtract(value), multiply(value), or divide(value, scale, roundingMode) methods.
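
As a small sketch of those four methods, the example below works with values built from strings. The class name ArithmeticDemo and the helper divideToCents are invented for this illustration.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class ArithmeticDemo {
    // Divides and rounds the quotient to two decimal places with HALF_UP.
    static BigDecimal divideToCents(BigDecimal dividend, BigDecimal divisor) {
        return dividend.divide(divisor, 2, RoundingMode.HALF_UP);
    }

    public static void main(String[] args) {
        BigDecimal a = new BigDecimal("100.05");
        BigDecimal b = new BigDecimal("0.10");

        System.out.println(a.add(b));      // 100.15
        System.out.println(a.subtract(b)); // 99.95
        // The scale of a product is the sum of the operands' scales (2 + 2).
        System.out.println(a.multiply(b)); // 10.0050

        // 10 / 3 has no exact decimal expansion, so a scale and a rounding
        // mode are supplied; divide(divisor) alone would throw ArithmeticException.
        System.out.println(divideToCents(new BigDecimal("10"), new BigDecimal("3"))); // 3.33
    }
}
```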

To output BigDecimal values, set the scale and rounding mode with setScale(scale, roundingMode), or use either the toString() or toPlainString() methods. The toString() method may use scientific notation while toPlainString() never will.
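
The difference between the two string methods shows up with very large or very small scales. A brief sketch follows (the class name ToStringDemo is made up for the example):

```java
import java.math.BigDecimal;

public class ToStringDemo {
    public static void main(String[] args) {
        BigDecimal big = new BigDecimal("1E+3");
        System.out.println(big.toString());      // 1E+3
        System.out.println(big.toPlainString()); // 1000

        BigDecimal tiny = new BigDecimal("0.0000001");
        System.out.println(tiny.toString());      // 1E-7
        System.out.println(tiny.toPlainString()); // 0.0000001

        // Ordinary money-sized values print the same either way.
        BigDecimal price = new BigDecimal("94.54");
        System.out.println(price.toString());      // 94.54
        System.out.println(price.toPlainString()); // 94.54
    }
}
```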

Before converting the program to use BigDecimal, it is important to point out how to create one. There are 16 constructors for the class. Since you can't necessarily store the value of a BigDecimal in a primitive type like a double, it is best to create your BigDecimal objects from a String. To demonstrate the problem, here's a simple example:

double dd = .35;
BigDecimal d = new BigDecimal(dd);
System.out.println(".35 = " + d);
The output is not what you might have expected:

.35 = 0.34999999999999997779553950749686919152736663818359375
Instead, what you should do is create the BigDecimal directly with the string ".35" as shown here:

BigDecimal d = new BigDecimal(".35");
resulting in the following output:

.35 = 0.35
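
If the value only exists as a double, the static BigDecimal.valueOf(double) method is another option, because it converts through the double's string form instead of its exact binary value. The sketch below compares the approaches; the class name ConstructDemo and its helpers are hypothetical, and the String constructor remains the safest choice when the original text is available.

```java
import java.math.BigDecimal;

public class ConstructDemo {
    // Builds from text: the result is exactly the decimal that was written.
    static BigDecimal fromString(String text) {
        return new BigDecimal(text);
    }

    // Builds from a double by way of Double.toString(double).
    static BigDecimal fromDouble(double value) {
        return BigDecimal.valueOf(value);
    }

    public static void main(String[] args) {
        System.out.println(new BigDecimal(0.35)); // the long binary expansion shown above
        System.out.println(fromString("0.35"));   // 0.35
        System.out.println(fromDouble(0.35));     // 0.35
    }
}
```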
After creating the value, you can explicitly set the scale of the number and its rounding mode with setScale(). Like the wrapper classes such as Integer and Double, BigDecimal is immutable, so if you call setScale(), you must "save" the return value:

d = d.setScale(2, RoundingMode.HALF_UP);
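
A short sketch of what happens when the return value is not saved (the class name ImmutableDemo and the helper roundToCents are made up for the example):

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class ImmutableDemo {
    // Returns a new BigDecimal rounded to two places; the argument is untouched.
    static BigDecimal roundToCents(BigDecimal value) {
        return value.setScale(2, RoundingMode.HALF_UP);
    }

    public static void main(String[] args) {
        BigDecimal d = new BigDecimal("10.005");

        d.setScale(2, RoundingMode.HALF_UP); // result discarded, d is unchanged
        System.out.println(d);               // 10.005

        d = roundToCents(d);                 // result saved
        System.out.println(d);               // 10.01
    }
}
```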
The modified program using BigDecimal is shown here. Each calculation requires working with another BigDecimal and setting its scale to ensure the math operations work for dollars and cents. If you want to deal with partial pennies, you can certainly go to three decimal places in the scale, but it isn't necessary.

import java.math.BigDecimal;
import java.math.RoundingMode;

public class Calc2 {

    public static void main(String[] args) {
        BigDecimal amount = new BigDecimal("100.05");
        BigDecimal discountPercent = new BigDecimal("0.10");
        BigDecimal discount = amount.multiply(discountPercent);
        discount = discount.setScale(2, RoundingMode.HALF_UP);
        BigDecimal total = amount.subtract(discount);
        total = total.setScale(2, RoundingMode.HALF_UP);
        BigDecimal taxPercent = new BigDecimal("0.05");
        BigDecimal tax = total.multiply(taxPercent);
        tax = tax.setScale(2, RoundingMode.HALF_UP);
        BigDecimal taxedTotal = total.add(tax);
        taxedTotal = taxedTotal.setScale(2, RoundingMode.HALF_UP);
        System.out.println("Subtotal : " + amount);
        System.out.println("Discount : " + discount);
        System.out.println("Total : " + total);
        System.out.println("Tax : " + tax);
        System.out.println("Tax+Total: " + taxedTotal);
    }
}
Notice that NumberFormat isn't used here, though you can add it back if you'd like to show the currency symbol.
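
Adding the currency symbol back could look something like this. The sketch pins the locale to Locale.US so that the output does not depend on the machine's default locale; the class name MoneyFormat and the helper formatUsd are assumptions for this example.

```java
import java.math.BigDecimal;
import java.text.NumberFormat;
import java.util.Locale;

public class MoneyFormat {
    // Formats a BigDecimal as US currency. NumberFormat.format(Object)
    // accepts a BigDecimal directly, so no conversion to double is needed.
    static String formatUsd(BigDecimal value) {
        NumberFormat money = NumberFormat.getCurrencyInstance(Locale.US);
        return money.format(value);
    }

    public static void main(String[] args) {
        System.out.println("Tax+Total: " + formatUsd(new BigDecimal("94.54"))); // Tax+Total: $94.54
    }
}
```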

Now, when you run the program, the calculations look a whole lot better:

Subtotal : 100.05
Discount : 10.01
Total : 90.04
Tax : 4.50
Tax+Total: 94.54
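
The same approach can be applied to the 70 cent transaction from earlier. This is a sketch only, with a made-up class name Calc70 and helper taxOn, using the same 5% rate and HALF_UP rounding as Calc2.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class Calc70 {
    // Computes the tax on a total and rounds it to cents with HALF_UP.
    static BigDecimal taxOn(BigDecimal total, BigDecimal rate) {
        return total.multiply(rate).setScale(2, RoundingMode.HALF_UP);
    }

    public static void main(String[] args) {
        BigDecimal total = new BigDecimal("0.70");
        BigDecimal tax = taxOn(total, new BigDecimal("0.05")); // 0.0350 rounds to 0.04
        BigDecimal taxedTotal = total.add(tax);

        System.out.println("Total : " + total);          // Total : 0.70
        System.out.println("Tax : " + tax);              // Tax : 0.04
        System.out.println("Tax+Total: " + taxedTotal);  // Tax+Total: 0.74
    }
}
```

Here the tax is exactly 0.035 before rounding, so the displayed tax and the displayed total now agree with each other.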
BigDecimal offers more functionality than what these examples show. There is also a BigInteger class for when you need unlimited precision using whole numbers. The Java platform documentation for the two classes offers more details, including information on scales, the MathContext class, sorting, and equality.
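
On the equality point, one detail worth knowing is that equals() takes the scale into account while compareTo() compares only the numeric value. A minimal sketch, with a hypothetical class name EqualityDemo and helper sameValue:

```java
import java.math.BigDecimal;

public class EqualityDemo {
    // Compares two BigDecimal values numerically, ignoring scale.
    static boolean sameValue(BigDecimal a, BigDecimal b) {
        return a.compareTo(b) == 0;
    }

    public static void main(String[] args) {
        BigDecimal a = new BigDecimal("2.0");
        BigDecimal b = new BigDecimal("2.00");

        System.out.println(a.equals(b));     // false: the scales differ (1 vs 2)
        System.out.println(sameValue(a, b)); // true: same numeric value
    }
}
```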