[cp-patches] RFC: gnu.regexp.* rewritten

Mark Wielaard mark at klomp.org
Fri Mar 3 11:20:18 UTC 2006


Hi Ito,

On Thu, 2006-03-02 at 00:35 +0900, Ito Kazumitsu wrote:
> The important points of this change are:
> 
>   (1) A new method REToken#matchThis. This method tries to match
>       the input string against the REToken itself and does not
>       try to match the next RETokens chained to it. The currently
>       used REToken#match method should be defined using REToken#matchThis.
>       This is useful for (3).
> 
>   (2) A new method REToken#findMatch. This is almost the same as
>       the current REToken#match but returns a resulting REMatch
>       instead of a boolean value.  This is useful for the depth-first
>       search with backtracking.
> 
>   (3) New methods REToken#returnsFixedLengthMatches and
>       REToken#findFixedLengthMatches. These will speed up the
>       search for repeated matches if the matched string is
>       supposed to have a fixed length.
> 
>   (4) RETokenOneOf and RETokenRepeated perform a depth-first
>       search with backtracking.
> 
> After this change, the test attached below shows 400% improved
> performance compared with the current CVS version.  The improved
> performance comes mainly from change (3).  To my regret,
> change (4) had a negative effect on performance.

Impressive performance improvement! Interesting that (4) didn't give us
a big win. I would have guessed that was one of the main issues.
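
For anyone reading along without the patch, here is a rough sketch of how
points (1), (3) and (4) could fit together. It is an illustration only:
Token, CharToken and RepeatedToken are hypothetical, simplified stand-ins
for REToken, RETokenChar and RETokenRepeated, plain int indexes stand in
for REMatch objects, and none of it is taken from the actual patch.

```java
// Hypothetical, simplified stand-ins for the gnu.regexp classes.
abstract class Token {
    Token next; // next token in the chain, or null at the end

    // (1) Match only this token at index; return the end index, or -1.
    abstract int matchThis(CharSequence input, int index);

    // match is defined in terms of matchThis plus the rest of the chain.
    int match(CharSequence input, int index) {
        int end = matchThis(input, index);
        if (end < 0) return -1;
        return next == null ? end : next.match(input, end);
    }

    // (3) True when every match of this token has the same length.
    boolean returnsFixedLengthMatches() { return false; }

    // (3) Count consecutive matches of this token alone, up to max.
    int findFixedLengthMatches(CharSequence input, int index, int max) {
        int count = 0;
        int pos = index;
        while (count < max) {
            int end = matchThis(input, pos);
            if (end <= pos) break; // no match, or no progress
            pos = end;
            count++;
        }
        return count;
    }
}

final class CharToken extends Token {
    private final char c;
    CharToken(char c) { this.c = c; }

    @Override
    int matchThis(CharSequence input, int index) {
        return index < input.length() && input.charAt(index) == c ? index + 1 : -1;
    }

    @Override
    boolean returnsFixedLengthMatches() { return true; }
}

final class RepeatedToken extends Token {
    private final Token body; // only body.matchThis is used; body.next is ignored
    private final int min;
    private final int max;

    RepeatedToken(Token body, int min, int max) {
        this.body = body;
        this.min = min;
        this.max = max;
    }

    // Greedy match of the repetition alone, ignoring the rest of the chain.
    @Override
    int matchThis(CharSequence input, int index) {
        int count = body.findFixedLengthMatches(input, index, max);
        int pos = index;
        for (int i = 0; i < count; i++) pos = body.matchThis(input, pos);
        return count >= min ? pos : -1;
    }

    @Override
    int match(CharSequence input, int index) {
        if (!body.returnsFixedLengthMatches()) return depthFirst(input, index, 0);
        // Fast path: count the repetitions once, then back off one at a time.
        int count = body.findFixedLengthMatches(input, index, max);
        int width = count == 0 ? 0 : body.matchThis(input, index) - index;
        for (int n = count; n >= min; n--) {
            int end = index + n * width;
            int result = next == null ? end : next.match(input, end);
            if (result >= 0) return result;
        }
        return -1;
    }

    // (4) Depth-first search: try one more repetition first, and fall back
    // to continuing with the rest of the chain if that fails.
    private int depthFirst(CharSequence input, int index, int count) {
        if (count < max) {
            int end = body.matchThis(input, index);
            if (end > index) {
                int result = depthFirst(input, end, count + 1);
                if (result >= 0) return result;
            }
        }
        if (count < min) return -1;
        return next == null ? index : next.match(input, index);
    }
}
```

With a chain built as a*, then 'a', then 'b', matching "aaab" takes the
fast path: the three leading 'a's are counted once, and the repetition
gives one back so that the trailing "ab" can still match.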

> ChangeLog
> 2006-03-01  Ito Kazumitsu  <kaz at maczuka.gcd.org>
> 
> 	* gnu/regexp/BacktrackStack.java: New file.
> 	* gnu/regexp/RE.java(findMatch): New method.
> 	* gnu/regexp/REMatch.java(next,matchFlags,MF_FIND_ALL,
> 	REMatchList): Removed. (backtrackStack): New field.
> 	* gnu/regexp/REToken.java(match): Changed from an abstract
> 	method to an ordinary method defined with the new method
> 	matchThis. (matchThis, getNext, findMatch, returnsFixedLengthMatches,
> 	findFixedLengthMatches, backtrack, toString): New methods.
> 	* gnu/regexp/RETokenAny.java: Implemented new methods of REToken.
> 	* gnu/regexp/RETokenBackRef.java: Likewise.
> 	* gnu/regexp/RETokenChar.java: Likewise.
> 	* gnu/regexp/RETokenEnd.java: Likewise.
> 	* gnu/regexp/RETokenEndSub.java: Likewise.
> 	* gnu/regexp/RETokenIndependent.java: Likewise.
> 	* gnu/regexp/RETokenLookAhead.java: Likewise.
> 	* gnu/regexp/RETokenLookBehind.java: Likewise.
> 	* gnu/regexp/RETokenNamedProperty.java: Likewise.
> 	* gnu/regexp/RETokenPOSIX.java: Likewise.
> 	* gnu/regexp/RETokenRange.java: Likewise.
> 	* gnu/regexp/RETokenStart.java: Likewise.
> 	* gnu/regexp/RETokenWordBoundary.java: Likewise.
> 	* gnu/regexp/RETokenOneOf.java: Rewritten.
> 	* gnu/regexp/RETokenRepeated.java: Rewritten.

You attached a patch for:

> --- gnu/java/nio/charset/iconv/IconvProvider.java.orig	Sat Jul 16 00:12:48 2005
> +++ gnu/java/nio/charset/iconv/IconvProvider.java	Thu Oct 20 23:41:57 2005

Could you send the patch for the above ChangeLog entry?

Cheers,

Mark