[cp-patches] Minor refactoring of PrintStream

Andrew John Hughes ahughes at redhat.com
Wed Jun 16 19:44:27 UTC 2010


2010/6/16 Ivan Maidanski <ivmai at mail.ru>:
> Hi!
>
> Tue, 15 Jun 2010 21:58:02 +0200 Mario Torre <neugens at limasoftware.net>:
>
>> Il giorno mar, 15/06/2010 alle 22.59 +0400, Ivan Maidanski ha scritto:
>> > Hi!
>> >
>> > How fast!
>> >
>> > Tue, 15 Jun 2010 19:07:18 +0100 Andrew John Hughes <ahughes at redhat.com>:
>> >
>> > > 2010/6/15 Ivan Maidanski <ivmai at mail.ru>:
>> > > > Hi!
>> > > >
>> > > > Here, the proposed changes of PrintStream are:
>> > > > 1. to be closer to the RI behavior (see changes for line_separator, out.flush(), append());
>> > >
>> > > While some of these are sensible changes, how are you discovering the
>> > > 'RI behaviour' in the cases where it wouldn't be visible from simple
>> > > runtime testing?
>> >
>> > There are a lot of places in the Classpath saying that while the JDK
>> > docs say nothing (or, even, the opposite) about some feature, the RI
>> > behavior is followed. (For the samples, search for "RI" word in the
>> > Classpath source files.)
>> >
>> > Which one can't be visible, you think, from a test?
>>
>> Hi Ivan,
>>
>> We can't look at the OpenJDK code and then easily make the Classpath one
>> match unless where obvious fixes are in place (if RI is the closed one,
>> we can *never* make this even of obvious fixes), can you please provide
>> us the tests you wrote to detect the faulty functionality?
>
> Let me explain... I ran some application on a JDK and ran it on a VM equipped with the Classpath, and observed some difference in the behavior in that app. I don't think exploiting obscure implementation details of JDK is good but JDK is somewhat the industry standard, so I chose the way of reflecting JDK behavior. I don't look into the proprietary sources, I'll supply you with the tests (I assume comparing the two products using tests is legal).

That's how we usually do it as explained below.

What do you mean by 'JDK'?  Oracle's proprietary JDK? OpenJDK?
Something else?  If it's just being used as a binary, it doesn't
matter as much but it would be good to know.

>
>>
>> > > > 2. some code optimizations (like removal of writeChars() pos/len parameters).
>> > > >
>> > > > ChangeLog entries:
>> > > >        * java/io/PrintStream.java (line_separator): Convert static field to
>> > > >        an instance one (to match the RI functionality).
>> > >
>> > > I don't see why you would want to make this an instance field.
>> >
>> > only to match the RI functionality ;) - test how BufferedWriter is working in the RI.
>>
>> I can't see why you want this change. This field is private, and I think
>> its value never changes during the lifetime of the VM, Actually, even in
>> the JDK the path separator is a field in the File class that is cached
>> at class initialisation time, and finally, the field is static anyway. I
>> can't really see how this impacts the functionality, can you please
>> provide a test?
>
> See below.
>
>> > > >        * java/io/PrintStream.java (error_occurred): Remove an unnecessary
>> > > >        initialization to false.
>> > >
>> > > I think having it explicit does make things clearer.
>> >
>> > Ok. (Although, I don't think that explicitly setting it to false after it was implicitly set to the same value makes things clearer.)
>>
>> I don't completely share Andrew's point of view on this one, and
>> although in this specific case the line of code is harmless, I believe
>> that we should always trust the VM to initialise class and instance
>> fields for us to the default values (which is mandatory anyway), so your
>> change is OK for me.
>> > >
>> > > >        * java/io/PrintStream.java (PrintStream): Use
>> > > >        "new FileOutputStream(fileName)" instead of
>> > > >        "new FileOutputStream(new File(fileName))".
>> > >
>> > > Ok (there are two instances of this, your ChangeLog only lists one)
>> >
>> > I know that. What's the preferred way of listing such things? specify with the arguments? I'll change the log.
>>
>> Not sure about this, I guess we use a couple of different ways, I would
>> add in parenthesis the arguments of the two constructors.
>>
>> > > >        * java/io/PrintStream.java (PrintStream): Throw NPE if out is null.
>> > >
>> > > Where is this specified? The Javadoc for this don't specify an NPE is thrown.
>> > > If OpenJDK does, then the documentation needs updating.  We should include
>> > > a message saying what the problem was.
>> >
>> > the undocumented RI behavior - throw NPE for a null stream (instead of UnsupportedEncodingException one).
>>
>> I would add a comment so (and, again, a test in Mauve).
>
> Adding a comment is ok but I don't think adding a test for an poorly or undocumented behavior is good. Only just to discover/show the difference...
>
>> > >
>> > > >        * java/io/PrintStream.java (print): Call out.flush() only
>> > if needed
>> > > >        (to match the RI).
>> > >
>> > > Where are these extra conditions coming from?
>> >
>> > Again, the RI behaves differently (the RI flushes the underlying
>> > stream only in case of '\n' while the Classpath flushes self (not the
>> > underlying stream directly) every time (provided auto_flush is on).
>>
>> Can you provide some minimal functional test to prove that different
>> behaviour affects user code (sometime those things do affect
>> functionality, but at times not)?
>
> Ok.
>
> 1. test case for flush() -> out.flush() replacement:
>
> import java.io.*;
> class MyPrintStream extends PrintStream {
>  MyPrintStream(OutputStream out) {
>    super(out, true);
>  }
>
>  public void flush() {
>    throw new Error("flush shouldn't be called");
>  }
> }
>
> class TestCase1 {
>
>  public static void main(String[] args) {
>    (new MyPrintStream(System.out)).println();
>    System.out.println("PASSED");
>  }
> }
>
> 2. test case for flushing only if the data contains '\n':
>
> import java.io.*;
> class MyPrintStream extends PrintStream {
>  MyPrintStream(OutputStream out) {
>    super(out);
>  }
>
>  public void flush() {
>    throw new Error("shouldn't be called");
>  }
> }
>
> class TestCase2 {
>
>  public static void main(String[] args) {
>    PrintStream out = new PrintStream(new MyPrintStream(System.out), true);
>    out.print("");
>    out.print(new char[] {});
>  }
> }
>
> I've just looked into OpenJDK source and found its write() implementation is not optimal - see line 476 in http://hg.openjdk.java.net/nio/nio/jdk/file/a11b4063024f/src/share/classes/java/io/PrintStream.java
>
> You can see that a break statement should be placed after out.flush().
>
> IMHO, my code is a bit cleaner here. Or not? :
> +        if (auto_flush
> +            && (println || chars == line_separator
> +                || lastIndexOfNewLine(chars) >= 0))
> +          out.flush();
>
>>
>> > > The append change looks good as do the flush() changes.  I don't see
>> > > the reason for changing the ordering of the && statement as checking
>> > > auto_flush is the cheaper check.
>> >
>> > Ok. Reordering here is not a big deal for performance (IMHO, local var fetch should be faster than global memory access).
>>
>> Well, this depends...
>
> Depends on, but not limited to, the probabilities... ;)
> Ok.
>
>>
>> Cheers,
>> Mario
>
> Tue, 15 Jun 2010 22:09:28 +0200 Mario Torre <neugens at limasoftware.net>:
>
>> Il giorno mar, 15/06/2010 alle 21.58 +0200, Mario Torre ha scritto:
>>
>> > Actually, even in the JDK the path separator is a field in the File class that is cached
>>
>> Ops, forget about this, this is not path, but line separator...
>>
>> Anyway, the rest of the argument stays, I don't see how this can affect
>> the behaviour, and I don't think a line separator changes in any system
>> I have ever seen at runtime.
>
> Agreed. I'd like someone to tell the OpenJDK group about this. ;)
>
> The test case (for Unix):
>
>  new PrintStream(System.out).println(); // prints "\n"
>  System.setProperty("line.separator", "\r\n");
>  new PrintStream(System.out).println(); // prints "\r\n"
>
>>
>> Cheers,
>> Mario
>
> Wed, 16 Jun 2010 13:37:45 +0100 Andrew John Hughes <ahughes at redhat.com>:
>
>> 2010/6/15 Ivan Maidanski <ivmai at mail.ru>:
>> > Hi!
>> >
>> > How fast!
>> >
>> > Tue, 15 Jun 2010 19:07:18 +0100 Andrew John Hughes <ahughes at redhat.com>:
>> >
>> >> 2010/6/15 Ivan Maidanski <ivmai at mail.ru>:
>> >> > Hi!
>> >> >
>> >> > Here, the proposed changes of PrintStream are:
>> >> > 1. to be closer to the RI behavior (see changes for line_separator, out.flush(), append());
>> >>
>> >> While some of these are sensible changes, how are you discovering the
>> >> 'RI behaviour' in the cases where it wouldn't be visible from simple
>> >> runtime testing?
>> >
>> > There are a lot of places in the Classpath saying that while the JDK docs say nothing (or, even, the opposite) about some feature, the RI behavior is followed. (For the samples, search for "RI" word in the Classpath source files.)
>> >
>> > Which one can't be visible, you think, from a test?
>>
>> I'm well aware of our general guidelines on implementation
>> differences, though, as with anything, things differ on a case-by-case
>> basis.  This doesn't mean that Classpath becomes a copy of the 'RI' or
>> that the RI's way of doing things is sacrosanct; it means that if you
>> have a testcase for some behaviour which is unspecified in the
>> Javadocs and this is noticeably different for users, then it should
>> change to the reference implementation behaviour.
>
> Ok.
>
>>
>> Mario touched on my main concern which was about what you are
>> referring to as the RI and how you are testing it.  Looking at
>> proprietary source code is not allowed:
>> http://www.gnu.org/software/classpath/docs/cp-hacking.html#SEC2.  The
>> situation is now a little different than it was in previous years, as
>> there are now effectively two reference implementations: the
>> proprietary JDK binaries provided by Oracle (the 'old RI') and
>> OpenJDK, which is under a number of FOSS licenses.  It's still
>> preferable not to consult the source code if possible, but it is
>> possible to consult the OpenJDK source code and still work on
>> Classpath.  That doesn't mean chunks should be copied verbatim for
>> OpenJDK.  I believe it does mean you can get the gist of how something
>> undocumented works in OpenJDK and then reimplement the ideas in GNU
>> Classpath, something which wasn't possible before.  I am not a lawyer,
>> and any situation where a substantial amount of OpenJDK code was
>> referenced would probably involve contacting FSF legal to be sure.
>
> Ok. Let me not consult the OpenJDK sources.
>
>>
>> There's obviously no problem with running a test against OpenjDK and
>> GNU Classpath+VM to determine the difference.  If this is what has
>> been done, we need to see the tests and preferably have them
>> integrated into Mauve for future reference.
>
> Ok.
>
>>
>> I'm sorry if this is a little draconian, but we have to be careful of
>> what sources are being consulted for work on GNU Classpath to avoid
>> getting sued.
>>
>> >
>> >>
>> >> > 2. some code optimizations (like removal of writeChars() pos/len parameters).
>> >> >
>> >> > ChangeLog entries:
>> >> >        * java/io/PrintStream.java (line_separator): Convert static field to
>> >> >        an instance one (to match the RI functionality).
>> >>
>> >> I don't see why you would want to make this an instance field.
>> >
>> > only to match the RI functionality ;) - test how BufferedWriter is working in the RI.
>>
>> Can you provide such a test?  I can't immediately see why this would
>> make a difference given the field is final.  It seems like someone
>> just forgot to make the field static in 'the RI'.  It's also (more
>> worryingly) not clear how you determined this about the RI.  I guess
>> reflection would work, but still seems dubious unless there is good
>> reason.
>
> No, no reflection is needed (just use System.setProperty()). See the test above.
>
> I could agree there's no need to change this thing. (Hopefully, someone will change the OpenJDK behavior.)
>

Ok, for some reason I was reading this as it was just setting the
value to '\n' not reading a property.
The change does make sense.  If you change the property, you'd expect
PrintStreams created after to reflect that.

>>
>> >
>> >>
>> >> >        * java/io/PrintStream.java (error_occurred): Remove an unnecessary
>> >> >        initialization to false.
>> >>
>> >> I think having it explicit does make things clearer.
>> >
>> > Ok. (Although, I don't think that explicitly setting it to false after it was implicitly set to the same value makes things clearer.)
>> >
>>
>> It's clearer in the sense that you otherwise have to remember that
>> booleans default to being false in Java.
>> I would happily let through the line without the initialiser in a new
>> patch.  I just don't see any reason to change this from how it already
>> is.
>
> Ok. Not a big deal.
>
>>
>> >>
>> >> >        * java/io/PrintStream.java (PrintStream): Use
>> >> >        "new FileOutputStream(fileName)" instead of
>> >> >        "new FileOutputStream(new File(fileName))".
>> >>
>> >> Ok (there are two instances of this, your ChangeLog only lists one)
>> >
>> > I know that. What's the preferred way of listing such things? specify with the arguments? I'll change the log.
>> >
>>
>> I don't understand what Mario is referring to with us using 'a couple
>> of different ways'.  The format is specified in
>> http://www.gnu.org/software/classpath/docs/cp-hacking.html#SEC13 and
>> there are many examples in the existing ChangeLogs in GNU Classpath.
>> You list each file only once, and under it you list each method in
>> brackets, with a brief description of the change made.  With the one
>> you listed, there seem to be mismatches between the ChangeLog and the
>> patch, with me seeing patch changes that weren't explained, and
>> over-repetition of the filename.
>>
>> e.g.
>>
>> 2009-02-05  Mark Wielaard  <mark at klomp.org>
>>
>>         PR classpath/38912:
>>       * gnu/xml/stream/XMLParser.java:
>>         (getLocalName()): Respect stringInterning.
>>       (getName()): Likewise.
>>         (getPrefix()): Likewise.
>>
>> (deliberately chose a good one that wasn't mine ;-) )
>
> Ok. One question: is it possible to use "(getName(), getPrifix()): Likewise." for the above sample?
>

I think it's clearer to write them separately as in the example.
emacs has a nice mode for editing ChangeLogs with correct syntax
highlighting.

>> >> >        * java/io/PrintStream.java (PrintStream): Throw NPE if out is null.
>> >>
>> >> Where is this specified? The Javadoc for this don't specify an NPE is thrown.
>> >> If OpenJDK does, then the documentation needs updating.  We should include
>> >> a message saying what the problem was.
>> >
>> > the undocumented RI behavior - throw NPE for a null stream (instead of UnsupportedEncodingException one).
>>
>> Again, this needs a testcase.  When I mentioned the documentation, I
>> was thinking more that OpenJDK probably needs a fix here too if the
>> NPE isn't mentioned.
>
> Agreed.
>
> The test case code:
>
>  boolean passed = false;
>  try {
>    new PrintStream(null);
>  } catch (NullPointerException e) {
>    passed = true;
>  }
>  System.out.println((passed ? "PASSED" : "FAILED")
>        + ": new PrintStream(null)");
>
>>
>> >
>> >>
>> >> >        * java/io/PrintStream.java (writeChars, print, println): Don't pass
>> >> >        "pos" and "len" parameters to writeChars() (as they are always set
>> >> >        to 0 and buf.length, respectively).
>> >>
>> >> I think the original version should stay.  It's clearer and avoids a
>> >> method call.
>> >
>> > Why is it clearer? (Is writeChars(char[] buf, int offset, int count) clearer than writeChars(char[] buf) (equipped with the appropriate doc)?
>> > And what's the method call is avoided?
>>
>> If those are the default values, then I presume writeChars(char[])
>> just results in the same call to writeChars(char[], 0, len) in another
>> method.
>> It's clearer in the sense you know the arguments being used rather
>> than them being implicit.
>
> Still not clear. writeChars(char[] buf) writes all characters in buf. Are there any implicit arguments?
>

You are saying by your change that writeChars(buf) == writeChars(buf,
0, buf.length); 0 and buf.length would be implicit in the first call.
I didn't realise writeChars was local to this class, but some other
method implemented as:

writeChars(char[] buf)
{
writeChars(buf, 0, buf.length);
}

which would introduce another method call.  As these are private
methods and all calls pass 0 and length as arguments, yes you can do
the suggested change.  I guess pos and length were used in the past
and the method was never simplified when this changed.

You could actually now make writeChars(char[]) call writeChars(String)
to simplify further.

Didn't spot this before:

-  public synchronized void print (char ch)
+  public void print (char ch)

Why is synchronized being dropped?

Also:

-        out.write (oneByte & 0xff);
+        out.write (oneByte);

seems wrong; oneByte is an int so the higher bits need masking AFAICS.
 Was there a reason for this change?

>>
>> >
>> > IMHO, it's arguable that it would be good to still keep for the private functions additional unnecessary arguments (and process them - e.g. call substring(offset, offset+count)).
>> >
>> >>
>> >> This change occurs twice but is listed here only once.
>> >
>> > Ok. I'll change.
>> >
>> >>
>> >> >        * java/io/PrintStream.java (print): Call out.flush() only if needed
>> >> >        (to match the RI).
>> >>
>> >> Where are these extra conditions coming from?
>> >
>> > Again, the RI behaves differently (the RI flushes the underlying stream only in case of '\n' while the Classpath flushes self (not the underlying stream directly) every time (provided auto_flush is on).
>> >
>>
>> Again, a testcase would be appreciated.  At least with this and the
>> NPE, I can see how a test would reveal them.  With the removal of
>> static, I don't see that.
>
> See above (for '\n' and for line_separator).
>
>>
>> >>
>> >> >        * java/io/PrintStream.java (lastIndexOfNewLine): New private static
>> >> >        method (called from print() only).
>> >>
>> >> >        * java/io/PrintStream.java (write): Remove unnecessary "&" operation.
>> >> >        * java/io/PrintStream.java (print, write): Directly call out.flush()
>> >> >        instead of flush() (the same as in the RI).
>> >> >        * java/io/PrintStream.java (append): Call subSequence() also for
>> >> >        "null" string (the same as in the RI).
>> >> >
>> >> >
>> >>
>> >> The append change looks good as do the flush() changes.  I don't see
>> >> the reason for changing the ordering of the && statement as checking
>> >> auto_flush is the cheaper check.
>> >
>> > Ok. Reordering here is not a big deal for performance (IMHO, local var fetch should be faster than global memory access).
>> >
>>
>> Depends on the VM.  It's not just a variable fetch, but also casting
>> the '\n' to an integer and then doing the comparison.
>
> No cast is needed (since '\n' is an int const according to JVM spec).
>

There's still a comparison between two integers.

> (Even with a cast it might be faster on modern CPUs.)
>
>> But either way it's splitting heirs and depends on the VM; there seems
>> to be no good reason to change it.
>
> Agreed.
>
>>
>> >> Additional arguments should stay.
>> >
>> > Still, I don't agree. See above.
>> >
>> >>
>> >> The ChangeLog needs some work as it was hard to follow and doesn't
>> >> match up to the patch.
>> >
>> > Why hard? due to changes grouping? And, please point me the places where it doesn't match.
>> >
>>
>> Answered above.  Parts of the patch were missing and the overuse of
>> the filename made it noisy.
>
> Ok.
>
>>
>> > Regards. Thanks for reviewing the code (and for being the opponent).
>> >
>> > I'll resubmit the modified/final patch/changelog upon we agree the changes.
>> >
>> >> --
>> >> Andrew :-)
>> >>
>> >> Free Java Software Engineer
>> >> Red Hat, Inc. (http://www.redhat.com)
>> >
>> >
>> >
>>
>> Thanks,
>> --
>> Andrew :-)
>
> Regards.
>
>



-- 
Andrew :-)

Free Java Software Engineer
Red Hat, Inc. (http://www.redhat.com)

Support Free Java!
Contribute to GNU Classpath and the OpenJDK
http://www.gnu.org/software/classpath
http://openjdk.java.net

PGP Key: 94EFD9D8 (http://subkeys.pgp.net)
Fingerprint: F8EF F1EA 401E 2E60 15FA  7927 142C 2591 94EF D9D8



More information about the Classpath-patches mailing list