Monday, November 3, 2008

Java Data Types

Data Types

 

This chapter examines three of Java’s most fundamental elements: data types, variables, and arrays. As with all modern programming languages, Java supports several types of data. You may use these types to declare variables and to create arrays. As you will see, Java’s approach to these items is clean, efficient, and cohesive.

 

Java Is a Strongly Typed Language

It is important to state at the outset that Java is a strongly typed language. Indeed, part of Java’s safety and robustness comes from this fact. Let’s see what this means. First, every variable has a type, every expression has a type, and every type is strictly defined. Second, all assignments, whether explicit or via parameter passing in method calls, are checked for type compatibility. There are no automatic coercions or conversions of conflicting types as in some languages. The Java compiler checks all expressions and parameters to ensure that the types are compatible. Any type mismatches are errors that must be corrected before the compiler will finish compiling the class.

 

If you come from a C or C++ background, keep in mind that Java is more strictly typed than either language. For example, in C/C++ you can assign a floating-point value to an integer. In Java, you cannot do so without an explicit cast. Also, in C there is not necessarily strong type-checking between a parameter and an argument. In Java, there is. You might find Java’s strong type-checking a bit tedious at first. But remember, in the long run it will help reduce the possibility of errors in your code.
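As a minimal sketch of this rule, the following program shows the assignment the compiler rejects (left as a comment) next to the explicit cast it accepts. The class name StrictTyping and the helper truncate( ) are made up for this illustration.

```java
// Sketch: narrowing from double to int must be requested with a cast.
class StrictTyping {
    // Hypothetical helper: the cast discards the fractional part.
    static int truncate(double value) {
        return (int) value;
    }

    public static void main(String args[]) {
        double d = 3.99;
        // int i = d;        // would not compile: double cannot be assigned to int
        int i = truncate(d); // explicit cast is accepted
        System.out.println("d = " + d + ", i = " + i); // prints: d = 3.99, i = 3
    }
}
```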

 

The Simple Types

Java defines eight simple (or elemental) types of data: byte, short, int, long, char, float, double, and boolean.

 

These can be put in four groups:

 

•      Integers: This group includes byte, short, int, and long, which are for whole-valued signed numbers.

•      Floating-point numbers: This group includes float and double, which represent numbers with fractional precision.

•      Characters: This group includes char, which represents symbols in a character set, like letters and numbers.

•      Boolean: This group includes boolean, which is a special type for representing true/false values.

 

You can use these types as-is, or to construct arrays or your own class types. Thus, they form the basis for all other types of data that you can create. The simple types represent single values, not complex objects. Although Java is otherwise completely object-oriented, the simple types are not. They are analogous to the simple types found in most other non–object-oriented languages. The reason for this is efficiency. Making the simple types into objects would have degraded performance too much.

 

The simple types are defined to have an explicit range and mathematical behavior.  Languages such as C and C++ allow the size of an integer to vary based upon the dictates of the execution environment. However, Java is different. Because of Java’s portability requirement, all data types have a strictly defined range. For example, an int is always 32 bits, regardless of the particular platform. This allows programs to be written that are guaranteed to run without porting on any machine architecture. While strictly specifying the size of an integer may cause a small loss of performance in some environments, it is necessary in order to achieve portability. Let’s look at each type of data in turn.
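One way to see these fixed widths is to print the SIZE, MIN_VALUE, and MAX_VALUE constants of the standard wrapper classes. The sketch below does that. The class name TypeWidths and the helper describe( ) are invented for the example.

```java
// Sketch: the integer widths and ranges are the same on every platform.
class TypeWidths {
    // Hypothetical helper that formats one line of the report.
    static String describe(String name, int bits, long min, long max) {
        return name + ": " + bits + " bits, " + min + " to " + max;
    }

    public static void main(String args[]) {
        System.out.println(describe("byte", Byte.SIZE, Byte.MIN_VALUE, Byte.MAX_VALUE));
        System.out.println(describe("short", Short.SIZE, Short.MIN_VALUE, Short.MAX_VALUE));
        System.out.println(describe("int", Integer.SIZE, Integer.MIN_VALUE, Integer.MAX_VALUE));
        System.out.println(describe("long", Long.SIZE, Long.MIN_VALUE, Long.MAX_VALUE));
    }
}
```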

 

Integers

Java defines four integer types: byte, short, int, and long. All of these are signed types that hold both positive and negative values. Java does not support unsigned, positive-only integer types. Many other computer languages, including C/C++, support both signed and unsigned integers. However, Java’s designers felt that unsigned integers were unnecessary. Specifically, they felt that the concept of unsigned was used mostly to specify the behavior of the high-order bit, which defines the sign of an integer when it is expressed as a number. As you will see in Chapter 4, Java manages the meaning of the high-order bit differently, by adding a special “unsigned right shift” operator. Thus, the need for an unsigned integer type was eliminated.
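The difference between the ordinary right shift (>>) and the unsigned right shift (>>>) can be sketched as follows. The class and method names are made up for the illustration.

```java
// Sketch: >> copies the sign bit into the vacated positions, >>> fills them with zeros.
class ShiftDemo {
    static int signedShift(int value, int count) {
        return value >> count;
    }

    static int unsignedShift(int value, int count) {
        return value >>> count;
    }

    public static void main(String args[]) {
        int value = -1; // all 32 bits set
        System.out.println("-1 >> 28  is " + signedShift(value, 28));   // prints: -1 >> 28  is -1
        System.out.println("-1 >>> 28 is " + unsignedShift(value, 28)); // prints: -1 >>> 28 is 15
    }
}
```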

The width of an integer type should not be thought of as the amount of storage it consumes, but rather as the behavior it defines for variables and expressions of that type. The Java run-time environment is free to use whatever size it wants, as long as the types behave as you declared them. In fact, at least one implementation stores bytes and shorts as 32-bit (rather than 8- and 16-bit) values to improve performance, because that is the word size of most computers currently in use.

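The width and ranges of these integer types vary widely, as shown in this table:

Name     Width (bits)    Range
long     64              –9,223,372,036,854,775,808 to 9,223,372,036,854,775,807
int      32              –2,147,483,648 to 2,147,483,647
short    16              –32,768 to 32,767
byte     8               –128 to 127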

byte

The smallest integer type is byte. This is a signed 8-bit type that has a range from –128 to 127. Variables of type byte are especially useful when you’re working with a stream of data from a network or file. They are also useful when you’re working with raw binary data that may not be directly compatible with Java’s other built-in types. Byte variables are declared by use of the byte keyword. For example, the following declares two byte variables called b and c:

byte b, c;
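When raw binary data is read into byte variables, values above 127 appear as negative numbers because byte is signed. A common idiom, sketched here, is to mask the value with 0xFF to recover the 0 to 255 reading. The class name ByteDemo, the helper toUnsigned( ), and the sample bytes are assumptions made for the example.

```java
// Sketch: reading raw bytes as unsigned values.
class ByteDemo {
    // Hypothetical helper: promote to int, then keep only the low 8 bits.
    static int toUnsigned(byte b) {
        return b & 0xFF;
    }

    public static void main(String args[]) {
        byte[] raw = { 0x41, (byte) 0xC8 }; // sample data, as if read from a stream
        for (int i = 0; i < raw.length; i++) {
            System.out.println("stored as " + raw[i] + ", unsigned reading " + toUnsigned(raw[i]));
        }
        // prints: stored as 65, unsigned reading 65
        // prints: stored as -56, unsigned reading 200
    }
}
```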

 

short

short is a signed 16-bit type. It has a range from –32,768 to 32,767. It is probably the least-used Java type, since it is defined as having its high byte first (called big-endian format). This type is mostly applicable to 16-bit computers, which are becoming increasingly scarce.

 

Here are some examples of short variable declarations:

short s;

short t;

 

“Endianness” describes how multibyte data types, such as short, int, and long, are stored in memory. If it takes 2 bytes to represent a short, then which one comes first, the most significant or the least significant? To say that a machine is big-endian means that the most significant byte is first, followed by the least significant one. Machines such as the SPARC and PowerPC are big-endian, while the Intel x86 series is little-endian.

 

int

The most commonly used integer type is int. It is a signed 32-bit type that has a range from –2,147,483,648 to 2,147,483,647. In addition to other uses, variables of type int are commonly employed to control loops and to index arrays. Any time you have an integer expression involving bytes, shorts, ints, and literal numbers, the entire expression is promoted to int before the calculation is done.

            The int type is the most versatile and efficient type, and it should be used most of the time when you want to create a number for counting or indexing arrays or doing integer math. It may seem that using short or byte will save space, but there is no guarantee that Java won’t promote those types to int internally anyway. Remember, type determines behavior, not size. (The only exception is arrays, where byte is guaranteed to use only one byte per array element, short will use two bytes, and int will use four.)
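The promotion to int described above has a visible consequence: the result of byte arithmetic cannot be stored back into a byte without a cast. The following sketch illustrates this. The names PromoteDemo and doubled( ) are invented for the example.

```java
// Sketch: byte operands are promoted to int before the arithmetic is done.
class PromoteDemo {
    // Hypothetical helper: the cast narrows the int result back to byte.
    static byte doubled(byte b) {
        return (byte) (b * 2);
    }

    public static void main(String args[]) {
        byte b = 50;
        // b = b * 2;     // would not compile: b * 2 has type int
        int wide = b * 2; // fine, the result is already an int
        System.out.println("wide is " + wide);                              // prints: wide is 100
        System.out.println("doubled(50) is " + doubled(b));                 // prints: doubled(50) is 100
        System.out.println("doubled(100) is " + doubled((byte) 100));       // prints: doubled(100) is -56
    }
}
```

The last line shows that the cast silently wraps when the int result (200) does not fit in the range of byte.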

 

long

long is a signed 64-bit type and is useful for those occasions where an int type is not large enough to hold the desired value. The range of a long is quite large. This makes it useful when big, whole numbers are needed. For example, here is a program that computes the number of miles that light will travel in a specified number of days.

 

// Compute distance light travels using long variables.
class Light {
    public static void main(String args[]) {
        int lightspeed;
        long days;
        long seconds;
        long distance;

        // approximate speed of light in miles per second
        lightspeed = 186000;
        days = 1000; // specify number of days here
        seconds = days * 24 * 60 * 60; // convert to seconds
        distance = lightspeed * seconds; // compute distance

        System.out.print("In " + days);
        System.out.print(" days light will travel about ");
        System.out.println(distance + " miles.");
    }
}

 

This program generates the following output:

In 1000 days light will travel about 16070400000000 miles.

 

Clearly, the result could not have been held in an int variable.
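To make the point concrete, here is a sketch that performs the same multiplication once in int arithmetic and once in long arithmetic. The class and method names are made up for the illustration. The int version overflows silently and yields a wrapped value.

```java
// Sketch: the same product computed with int and with long arithmetic.
class Overflow {
    // Both operands are int, so the multiplication is done in 32 bits and wraps.
    static int intProduct(int a, int b) {
        return a * b;
    }

    // One operand is long, so the multiplication is done in 64 bits.
    static long longProduct(int a, long b) {
        return a * b;
    }

    public static void main(String args[]) {
        int lightspeed = 186000;
        int seconds = 1000 * 24 * 60 * 60; // 86,400,000 still fits in an int

        System.out.println("long result: " + longProduct(lightspeed, seconds));
        System.out.println("int result:  " + intProduct(lightspeed, seconds) + " (overflowed)");
    }
}
```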

 

Floating-Point Types

Floating-point numbers, also known as real numbers, are used when evaluating expressions that require fractional precision. For example, calculations such as square root, or transcendentals such as sine and cosine, result in a value whose precision requires a floating-point type. Java implements the standard (IEEE–754) set of floating-point types and operators. There are two kinds of floating-point types, float and double, which represent single- and double-precision numbers, respectively. Their width and ranges are shown here:

Name      Width (bits)    Approximate range
double    64              4.9e–324 to 1.8e+308
float     32              1.4e–045 to 3.4e+038

Each of these floating-point types is examined next.

 

float

The type float specifies a single-precision value that uses 32 bits of storage. Single precision is faster on some processors and takes half as much space as double precision, but will become imprecise when the values are either very large or very small. Variables of type float are useful when you need a fractional component, but don’t require a large degree of precision. For example, float can be useful when representing dollars and cents.

Here are some example float variable declarations:

float hightemp, lowtemp;
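The limited precision of float can be sketched with a value at the edge of what its 24-bit significand can represent. At 16,777,216 (2 to the 24th power), adding 1 to a float has no effect, while a double still registers the change. The class name FloatPrecision and the helper absorbsOne( ) are invented for this example.

```java
// Sketch: float runs out of precision long before double does.
class FloatPrecision {
    // True if adding 1 leaves the float value unchanged.
    static boolean absorbsOne(float f) {
        return f + 1f == f;
    }

    // Same check carried out in double precision.
    static boolean absorbsOne(double d) {
        return d + 1.0 == d;
    }

    public static void main(String args[]) {
        float f = 16777216f;  // 2 to the 24th power
        double d = 16777216.0;
        System.out.println("float absorbs +1:  " + absorbsOne(f)); // prints: float absorbs +1:  true
        System.out.println("double absorbs +1: " + absorbsOne(d)); // prints: double absorbs +1: false
    }
}
```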

 

double

Double precision, as denoted by the double keyword, uses 64 bits to store a value. Double precision is actually faster than single precision on some modern processors that have been optimized for high-speed mathematical calculations. All transcendental math functions, such as sin( ), cos( ), and sqrt( ), return double values. When you need to maintain accuracy over many iterative calculations, or are manipulating large-valued numbers, double is the best choice.
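Because the standard math functions return double, their results need a cast before they can be stored in a float. A minimal sketch, with a made-up class name MathReturn and helper hypotenuse( ):

```java
// Sketch: Math.sqrt( ) returns a double.
class MathReturn {
    // Hypothetical helper: length of the hypotenuse of a right triangle.
    static double hypotenuse(double a, double b) {
        return Math.sqrt(a * a + b * b);
    }

    public static void main(String args[]) {
        double h = hypotenuse(3.0, 4.0);
        System.out.println("hypotenuse is " + h); // prints: hypotenuse is 5.0

        // float root = Math.sqrt(2.0);       // would not compile: result is a double
        float root = (float) Math.sqrt(2.0);  // explicit narrowing is required
        System.out.println("root as float is " + root);
    }
}
```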

 

Here is a short program that uses double variables to compute the area of a circle:

// Compute the area of a circle.
class Area {
    public static void main(String args[]) {
        double pi, r, a;

        r = 10.8; // radius of circle
        pi = 3.1416; // pi, approximately
        a = pi * r * r; // compute area

        System.out.println("Area of circle is " + a);
    }
}

 

Characters

In Java, the data type used to store characters is char. However, C/C++ programmers beware: char in Java is not the same as char in C or C++. In C/C++, char is an integer type that is 8 bits wide. This is not the case in Java. Instead, Java uses Unicode to represent characters. Unicode defines a fully international character set that can represent all of the characters found in all human languages. It is a unification of dozens of character sets, such as Latin, Greek, Arabic, Cyrillic, Hebrew, Katakana, Hangul, and many more. For this purpose, it requires 16 bits. Thus, in Java char is a 16-bit type. The range of a char is 0 to 65,535. There are no negative chars. The standard set of characters known as ASCII still ranges from 0 to 127 as always, and the extended 8-bit character set, ISO-Latin-1, ranges from 0 to 255. Since Java is designed to allow applets to be written for worldwide use, it makes sense that it would use Unicode to represent characters. Of course, the use of Unicode is somewhat inefficient for languages such as English, German, Spanish, or French, whose characters can easily be contained within 8 bits. But such is the price that must be paid for global portability.
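The width and range of char can be checked with the constants in the standard Character class, as in this sketch. The class name CharRange and the helper code( ) are made up for the example.

```java
// Sketch: char is an unsigned 16-bit type.
class CharRange {
    // Hypothetical helper: widen a char to its numeric code.
    static int code(char c) {
        return c;
    }

    public static void main(String args[]) {
        System.out.println("bits:    " + Character.SIZE);            // prints: bits:    16
        System.out.println("minimum: " + code(Character.MIN_VALUE)); // prints: minimum: 0
        System.out.println("maximum: " + code(Character.MAX_VALUE)); // prints: maximum: 65535
    }
}
```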

 

More information about Unicode can be found at http://www.unicode.org.

 

Here is a program that demonstrates char variables:

// Demonstrate char data type.
class CharDemo {
    public static void main(String args[]) {
        char ch1, ch2;

        ch1 = 88; // code for X
        ch2 = 'Y';

        System.out.print("ch1 and ch2: ");
        System.out.println(ch1 + " " + ch2);
    }
}

 

This program displays the following output:

 

ch1 and ch2: X Y

 

Notice that ch1 is assigned the value 88, which is the ASCII (and Unicode) value that corresponds to the letter X. As mentioned, the ASCII character set occupies the first 128 values (0 to 127) in the Unicode character set. For this reason, all the “old tricks” that you have used with characters in the past will work in Java, too.

 

Even though chars are not integers, in many cases you can operate on them as if they were integers. This allows you to add two characters together, or to increment the value of a character variable. For example, consider the following program:

 

// char variables behave like integers.
class CharDemo2 {
    public static void main(String args[]) {
        char ch1;

        ch1 = 'X';

        System.out.println("ch1 contains " + ch1);
        ch1++; // increment ch1
        System.out.println("ch1 is now " + ch1);
    }
}

 

The output generated by this program is shown here:

ch1 contains X

ch1 is now Y

 

In the program, ch1 is first given the value X. Next, ch1 is incremented. This results in ch1 containing Y, the next character in the ASCII (and Unicode) sequence.
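One detail worth sketching: ch1++ works directly on a char, but an expression such as ch1 + 1 is promoted to int, so storing it back into a char requires a cast. The class name CharMath and the helper next( ) are assumptions made for this illustration.

```java
// Sketch: char arithmetic is carried out in int.
class CharMath {
    // Hypothetical helper: the character that follows c.
    static char next(char c) {
        return (char) (c + 1); // c + 1 is an int, so a cast is needed
    }

    public static void main(String args[]) {
        char ch = 'X';
        int sum = ch + 1; // 'X' is 88, so sum is 89
        // char bad = ch + 1; // would not compile: ch + 1 has type int
        System.out.println("sum is " + sum);       // prints: sum is 89
        System.out.println("next is " + next(ch)); // prints: next is Y
    }
}
```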

 

Booleans

Java has a simple type, called boolean, for logical values. It can have only one of two possible values, true or false. This is the type returned by all relational operators, such as a < b. boolean is also the type required by the conditional expressions that govern the control statements such as if and for.

 

Here is a program that demonstrates the boolean type:

// Demonstrate boolean values.
class BoolTest {
    public static void main(String args[]) {
        boolean b;

        b = false;
        System.out.println("b is " + b);
        b = true;
        System.out.println("b is " + b);

        // a boolean value can control the if statement
        if(b) System.out.println("This is executed.");

        b = false;
        if(b) System.out.println("This is not executed.");

        // outcome of a relational operator is a boolean value
        System.out.println("10 > 9 is " + (10 > 9));
    }
}

 

The output generated by this program is shown here:

b is false

b is true

This is executed.

10 > 9 is true

 

There are three interesting things to notice about this program. First, as you can see, when a boolean value is output by println( ), “true” or “false” is displayed. Second, the value of a boolean variable is sufficient, by itself, to control the if statement. There is no need to write an if statement like this:

 

if(b == true) ...

Third, the outcome of a relational operator, such as <, is a boolean value. This is why the expression 10 > 9 displays the value “true.” Further, the extra set of parentheses around 10 > 9 is necessary because the + operator has a higher precedence than the >.
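The following sketch gathers two consequences of these rules. An int cannot stand in for a boolean in a condition, and leaving out the parentheses around a relational expression in a string concatenation does not compile. The class name BoolRules and the helper isPositive( ) are made up for the example.

```java
// Sketch: conditions must be boolean, and + binds tighter than >.
class BoolRules {
    // Hypothetical helper: a relational operator yields a boolean.
    static boolean isPositive(int n) {
        return n > 0;
    }

    public static void main(String args[]) {
        int count = 1;
        // if(count) ...  // would not compile: an int is not a boolean
        if(count != 0) System.out.println("count is nonzero");

        // System.out.println("10 > 9 is " + 10 > 9); // would not compile: compares a String with an int
        System.out.println("10 > 9 is " + (10 > 9));  // prints: 10 > 9 is true

        System.out.println("isPositive(-5) is " + isPositive(-5)); // prints: isPositive(-5) is false
    }
}
```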

 
