r1 - 04 Mar 2006 - NelsonFerraz
- NAME
- DESCRIPTION
- Highlights In 5.8.0
- Incompatible Changes
- Binary Incompatibility
- 64-bit platforms and malloc
- AIX Dynaloading
- Attributes for my variables now handled at run-time
- Socket Extension Dynamic in VMS
- IEEE-format Floating Point Default on OpenVMS Alpha
- New Unicode Semantics (no more use utf8, almost)
- New Unicode Properties
- REF(...) Instead Of SCALAR(...)
- pack/unpack D/F recycled
- glob() now returns filenames in alphabetical order
- Deprecations
- Core Enhancements
- Unicode Overhaul
- PerlIO is Now The Default
- ithreads
- Restricted Hashes
- Safe Signals
- Understanding of Numbers
- Arrays now always interpolate into double-quoted strings [561]
- Miscellaneous Changes
- Modules and Pragmata
- Utility Changes
- New Documentation
- Performance Enhancements
- Installation and Configuration Improvements
- Selected Bug Fixes
- New or Changed Diagnostics
- Changed Internals
- Security Vulnerability Closed [561]
- New Tests
- Known Problems
- The Compiler Suite Is Still Very Experimental
- Localising Tied Arrays and Hashes Is Broken
- Building Extensions Can Fail Because Of Largefiles
- Modifying $_ Inside for(..)
- mod_perl 1.26 Doesn't Build With Threaded Perl
- lib/ftmp-security tests warn 'system possibly insecure'
- libwww-perl (LWP) fails base/date #51
- PDL failing some tests
- Perl_get_sv
- Self-tying Problems
- ext/threads/t/libc
- Failure of Thread (5.005-style) tests
- Timing problems
- Tied/Magical Array/Hash Elements Do Not Autovivify
- Unicode in package/class and subroutine names does not work
- Platform Specific Problems
- AIX
- Alpha systems with old gccs fail several tests
- AmigaOS
- BeOS
- Cygwin ``unable to remap''
- Cygwin ndbm tests fail on FAT
- DJGPP Failures
- FreeBSD built with ithreads coredumps reading large directories
- FreeBSD Failing locale Test 117 For ISO 8859-15 Locales
- IRIX fails ext/List/Util/t/shuffle.t or Digest::MD5
- HP-UX lib/posix Subtest 9 Fails When LP64-Configured
- Linux with glibc 2.2.5 fails t/op/int subtest #6 with -Duse64bitint
- Linux With Sfio Fails op/misc Test 48
- Mac OS X
- Mac OS X dyld undefined symbols
- OS/2 Test Failures
- op/sprintf tests 91, 129, and 130
- SCO
- Solaris 2.5
- Solaris x86 Fails Tests With -Duse64bitint
- SUPER-UX (NEC SX)
- Term::ReadKey not working on Win32
- UNICOS/mk
- UTS
- VOS (Stratus)
- VMS
- Win32
- XML::Parser not working
- z/OS (OS/390)
- Unicode Support on EBCDIC Still Spotty
- Seen In Perl 5.7 But Gone Now
- Reporting Bugs
- SEE ALSO
- HISTORY
NAME
perl58delta - what is new for perl v5.8.0
DESCRIPTION
This document describes differences between the 5.6.0 release and the 5.8.0 release. Many of the bug fixes in 5.8.0 were already seen in the 5.6.1 maintenance release since the two releases were kept closely coordinated (while 5.8.0 was still called 5.7.something). Changes that were integrated into the 5.6.1 release are marked [561].
Many of these changes have been further developed since 5.6.1 was released;
those are marked [561+].
You can see the list of changes in the 5.6.1 release (both from the
5.005_03 release and the 5.6.0 release) by reading the perl561delta manpage.
Highlights In 5.8.0
- Better Unicode support
- New IO Implementation
- New Thread Implementation
- Better Numeric Accuracy
- Safe Signals
- Many New Modules
- More Extensive Regression Testing
Incompatible Changes
Binary Incompatibility
Perl 5.8 is not binary compatible with earlier releases of Perl. You have to recompile your XS modules. (Pure Perl modules should continue to work.) The major reason for the discontinuity is the new IO architecture called PerlIO. PerlIO is the default configuration because without it many new features of Perl 5.8 cannot be used. In other words: you just have to recompile your modules containing XS code, sorry about that. In future releases of Perl, non-PerlIO aware XS modules may become completely unsupported. This shouldn't be too difficult for module authors, however: PerlIO has been designed as a drop-in replacement (at the source code level) for the stdio interface. Depending on your platform, there are also other reasons why we decided to break binary compatibility; please read on.
64-bit platforms and malloc
If your pointers are 64 bits wide, the Perl malloc is no longer used because it does not work well with 8-byte pointers. Also, the system mallocs on such platforms are usually much better optimized for such large memory models than the Perl malloc. Some memory-hungry Perl applications like the PDL don't work well with Perl's malloc. Finally, applications other than Perl (such as mod_perl) tend to prefer the system malloc. Such platforms include Alpha and 64-bit HPPA, MIPS, PPC, and Sparc.
AIX Dynaloading
On AIX releases 4.3 and newer, dynaloading now uses the native dlopen interface of AIX instead of the old emulated interface. This change will probably break backward compatibility with compiled modules. The change was made to make Perl more compliant with other applications like mod_perl which are using the AIX native interface.
Attributes for my variables now handled at run-time
The my EXPR : ATTRS syntax now applies variable attributes at
run-time. (Subroutine and our variables still get attributes applied
at compile-time.) In particular, this allows variable attributes to be
useful for tie interfaces, which was a deficiency of earlier releases.
See the attributes manpage for additional details. Note that the new
semantics do not work with the Attribute::Handlers module (as of
version 0.76).
Socket Extension Dynamic in VMS
The Socket extension is now dynamically loaded instead of being statically built in. This may or may not be a problem with ancient TCP/IP stacks of VMS: we do not know since we weren't able to test Perl in such configurations.
IEEE-format Floating Point Default on OpenVMS Alpha
Perl now uses IEEE format (T_FLOAT) as the default internal floating point format on OpenVMS Alpha, potentially breaking binary compatibility with external libraries or existing data. G_FLOAT is still available as a configuration option. The default on VAX (D_FLOAT) has not changed.
New Unicode Semantics (no more use utf8, almost)
Previously in Perl 5.6 to use Unicode one would say ``use utf8'' and
then the operations (like string concatenation) were Unicode-aware
in that lexical scope.
This was found to be an inconvenient interface, and in Perl 5.8 the
Unicode model has completely changed: now the ``Unicodeness'' is bound
to the data itself, and for most of the time ``use utf8'' is not needed
at all. The only remaining use of ``use utf8'' is when the Perl script
itself has been written in the UTF-8 encoding of Unicode. (UTF-8 has
not been made the default since there are many Perl scripts out there
that are using various national eight-bit character sets, which would
be illegal in UTF-8.)
See the perluniintro manpage for the explanation of the current model,
and the utf8 manpage for the current use of the utf8 pragma.
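As a minimal sketch of the data-bound model described above (the variable names are illustrative only), the following builds a Unicode string at run time without ``use utf8'' and compares its length in characters with its length once encoded as UTF-8 bytes. It assumes the Encode module, which ships with 5.8.

```perl
use strict;
use warnings;
use Encode qw(encode);

# No "use utf8" is needed: the string itself carries its Unicode-ness.
my $smiley = chr(0x263A);             # WHITE SMILING FACE
my $text   = "face: " . $smiley;      # concatenation stays character-based
my $bytes  = encode('UTF-8', $smiley);

print length($smiley), "\n";          # 1 (one character)
print length($text), "\n";            # 7 (six ASCII characters plus one)
print length($bytes), "\n";           # 3 (the UTF-8 encoding is three bytes)
```

``use utf8'' would only be needed if the smiley were typed literally into a UTF-8 encoded source file.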
New Unicode Properties
Unicode scripts are now supported. Scripts are similar to (and superior to) Unicode blocks. The difference between scripts and blocks is that scripts are the glyphs used by a language or a group of languages, while the blocks are more artificial groupings of (mostly) 256 characters based on the Unicode numbering. In general, scripts are more inclusive, but not universally so. For example, while the script Latin includes all the Latin characters and
their various diacritic-adorned versions, it does not include the various
punctuation or digits (since they are not solely Latin).
A number of other properties are now supported, including \p{L&},
\p{Any}, \p{Assigned}, \p{Unassigned}, \p{Blank} [561] and
\p{SpacePerl} [561] (along with their \P{...} versions, of course).
See the perlunicode manpage for details, and more additions.
The In or Is prefix to names used with the \p{...} and \P{...}
is now almost always optional. The only exception is that an In prefix
is required to signify a Unicode block when a block name conflicts with a
script name. For example, \p{Tibetan} refers to the script, while
\p{InTibetan} refers to the block. When there is no name conflict, you
can omit the In from the block name (e.g. \p{BraillePatterns}), but
to be safe, it's probably best to always use the In.
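The script and block properties can be tried out with a few matches. This is only a sketch; the sample characters are chosen for illustration and the results assume a reasonably recent Unicode database.

```perl
use strict;
use warnings;

my $accented = chr(0x00E9);   # LATIN SMALL LETTER E WITH ACUTE
my $digit    = "7";           # digits are shared between scripts
my $tibetan  = chr(0x0F40);   # TIBETAN LETTER KA

my $latin_letter  = $accented =~ /^\p{Latin}$/   ? 1 : 0;  # script match
my $digit_latin   = $digit    =~ /\p{Latin}/     ? 1 : 0;  # not solely Latin
my $in_script     = $tibetan  =~ /\p{Tibetan}/   ? 1 : 0;  # the script
my $in_block      = $tibetan  =~ /\p{InTibetan}/ ? 1 : 0;  # the block
my $not_tibetan   = $accented =~ /^\P{Tibetan}$/ ? 1 : 0;  # negated form

print "$latin_letter $digit_latin $in_script $in_block $not_tibetan\n";
# 1 0 1 1 1
```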
REF(...) Instead Of SCALAR(...)
A reference to a reference now stringifies as ``REF(0x81485ec)'' instead
of ``SCALAR(0x81485ec)'' in order to be more consistent with the return
value of ref().
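A short sketch of the difference (the hexadecimal address will vary from run to run):

```perl
use strict;
use warnings;

my $scalar  = 42;
my $ref     = \$scalar;       # reference to a scalar
my $ref_ref = \$ref;          # reference to a reference

print ref($ref), "\n";        # SCALAR
print ref($ref_ref), "\n";    # REF
print "$ref_ref\n";           # something like REF(0x81485ec)
```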
pack/unpack D/F recycled
The undocumented pack/unpack template letters D/F have been recycled for better use: now they stand for long double (if supported by the platform) and NV (Perl internal floating point type). (They used to be aliases for d/f, but you never knew that.)
glob() now returns filenames in alphabetical order
The list of filenames from glob() (or <...>) is now by default sorted
alphabetically to be csh-compliant (which is what happened before
in most UNIX platforms). (bsd_glob() does still sort platform
natively, ASCII or EBCDIC, unless GLOB_ALPHASORT is specified.) [561]
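The sorted result can be observed with a throwaway directory. This sketch uses File::Temp and File::Basename; the file names are made up, and it assumes the temporary directory path contains no whitespace (glob() splits its argument on whitespace).

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Basename qw(basename);

# Create three empty files in deliberately unsorted order.
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(pear.txt apple.txt mango.txt)) {
    open(my $fh, '>', "$dir/$name") or die "cannot create $name: $!";
    close($fh);
}

# glob() hands the names back alphabetically, whatever the directory order.
my @names = map { basename($_) } glob("$dir/*.txt");
print "@names\n";    # apple.txt mango.txt pear.txt
```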
Deprecations
- The semantics of bless(REF, REF) were unclear and until someone proves
  it to make some sense, it is forbidden.
- The obsolete chat2 library that should never have been allowed
  to escape the laboratory has been decommissioned.
- Using chdir("") or chdir(undef) instead of explicit chdir() is
  doubtful. A failure (think chdir(some_function())) can lead to an
  unintended chdir() to the home directory, therefore this behaviour
  is deprecated.
- The builtin dump() function has probably outlived most of its
  usefulness. The core-dumping functionality will remain available
  as an explicit call to CORE::dump(), but in future
  releases the behaviour of an unqualified dump() call may change.
- The very dusty examples in the eg/ directory have been removed.
  Suggestions for new shiny examples are welcome, but the main issue is
  that the examples need to be documented, tested and (most importantly)
  maintained.
- The (bogus) escape sequences \8 and \9 now give an optional warning
  (``Unrecognized escape passed through''). There is no need to \-escape
  any \w character.
- The *glob{FILEHANDLE} is deprecated, use *glob{IO} instead.
- The package; syntax (package without an argument) has been
  deprecated. Its semantics were never that clear and its
  implementation even less so. If you have used that feature to
  disallow all but fully qualified variables, use strict; instead.
- The unimplemented POSIX regex features [[.cc.]] and [[=c=]] are still
  recognised but now cause fatal errors. The previous behaviour of
  ignoring them by default and warning if requested was unacceptable
  since it, in a way, falsely promised that the features could be used.
- In future releases, non-PerlIO aware XS modules may become completely
  unsupported. Since PerlIO is a drop-in replacement for stdio at the
  source code level, this shouldn't be that drastic a change.
- Previous versions of perl and some readings of some sections of Camel
  III implied that the :raw ``discipline'' was the inverse of :crlf.
  Turning off ``crlfness'' is no longer enough to make a stream truly
  binary. So the PerlIO :raw layer (or ``discipline'', to use the Camel
  book's older terminology) is now formally defined as being equivalent
  to binmode(FH) - which is in turn defined as doing whatever is
  necessary to pass each byte as-is without any translation. In
  particular binmode(FH) - and hence :raw - will now turn off both
  CRLF and UTF-8 translation and remove other layers (e.g. :encoding())
  which would modify the byte stream.
- The current user-visible implementation of pseudo-hashes (the weird
  use of the first array element) is deprecated starting from Perl 5.8.0
  and will be removed in Perl 5.10.0, and the feature will be
  implemented differently. Not only is the current interface rather
  ugly, but the current implementation slows down normal array and hash
  use quite noticeably. The fields pragma interface will remain
  available. The restricted hashes interface is expected to
  be the replacement interface (see the Hash::Util manpage). If your
  existing programs depend on the underlying implementation, consider
  using the Class::PseudoHash manpage from CPAN.
- The syntaxes @a->[...] and %h->{...} have now been deprecated.
- After years of trying, suidperl is considered to be too complex to
  ever be considered truly secure. The suidperl functionality is likely
  to be removed in a future release.
- The 5.005 threads model (module Thread) is deprecated and expected
  to be removed in Perl 5.10. Multithreaded code should be migrated to
  the new ithreads model (see the threads manpage, the threads::shared
  manpage and the perlthrtut manpage).
- The long deprecated uppercase aliases for the string comparison
  operators (EQ, NE, LT, LE, GE, GT) have now been removed.
- The tr///C and tr///U features have been removed and will not return;
  the interface was a mistake. Sorry about that. For similar
  functionality, see pack('U0', ...) and pack('C0', ...). [561]
- Earlier Perls treated ``sub foo (@bar)'' as equivalent to
  ``sub foo (@)''. The prototypes are now checked better at compile-time
  for invalid syntax. An optional warning is generated (``Illegal
  character in prototype...'') but this may be upgraded to a fatal error
  in a future release.
- The exec LIST and system LIST operations now produce warnings on
  tainted data and in some future release they will produce fatal errors.
- The existing behaviour when localising tied arrays and hashes is wrong,
  and will be changed in a future release, so do not rely on the existing
  behaviour. See Localising Tied Arrays and Hashes Is Broken.
Core Enhancements
Unicode Overhaul
Unicode in general should be now much more usable than in Perl 5.6.0 (or even in 5.6.1). Unicode can be used in hash keys, Unicode in regular expressions should work now, Unicode in tr/// should work now, Unicode in I/O should work now. See the perluniintro manpage for an introduction and the perlunicode manpage for details.
- The Unicode Character Database coming with Perl has been upgraded
  to Unicode 3.2.0. For more information, see http://www.unicode.org/ .
  [561+] (5.6.1 has UCD 3.0.1.)
- For developers interested in enhancing Perl's Unicode capabilities:
  almost all the UCD files are included with the Perl distribution in
  the lib/unicore subdirectory. The most notable omission, for space
  considerations, is the Unihan database.
- The properties \p{Blank} and \p{SpacePerl} have been added. ``Blank''
  is like C isblank(), that is, it contains only ``horizontal
  whitespace'' (the space character is, the newline isn't), and
  ``SpacePerl'' is the Unicode equivalent of \s (\p{Space} isn't, since
  that includes the vertical tabulator character, whereas \s doesn't.)
  See ``New Unicode Properties'' earlier in this document for additional
  information on changes with Unicode properties.
PerlIO is Now The Default
- IO is now by default done via PerlIO rather than system's ``stdio''.
  PerlIO allows ``layers'' to be ``pushed'' onto a file handle to alter
  the handle's behaviour. Layers can be specified at open time via 3-arg
  form of open:
      open($fh,'>:crlf :utf8', $path) || ...
  or on already opened handles via extended binmode:
      binmode($fh,':encoding(iso-8859-7)');
  The built-in layers are: unix (low level read/write), stdio (as in
  previous Perls), perlio (re-implementation of stdio buffering in a
  portable manner), crlf (does CRLF <=> ``\n'' translation as on Win32,
  but available on any platform). A mmap layer may be available if the
  platform supports it (mostly UNIXes).
  Layers to be applied by default may be specified via the 'open' pragma.
  See Installation and Configuration Improvements for the effects of
  PerlIO on your architecture name.
- If your platform supports fork(), you can use the list form of open
  for pipes. For example:
      open KID_PS, "-|", "ps", "aux" or die $!;
  forks the ps(1) command (without spawning a shell, as there are more
  than three arguments to open()), and reads its standard output via the
  KID_PS filehandle. See the perlipc manpage.
- File handles can be marked as accepting Perl's internal encoding of
  Unicode (UTF-8 or UTF-EBCDIC depending on platform) by a pseudo layer
  ``:utf8'':
      open($fh,">:utf8","Uni.txt");
  Note for EBCDIC users: the pseudo layer ``:utf8'' is erroneously named
  for you since what you will be getting is not UTF-8 but UTF-EBCDIC.
  See the perlunicode manpage, the utf8 manpage, and
  http://www.unicode.org/unicode/reports/tr16/ for more information.
  In future releases this naming may change. See the perluniintro manpage
  for more information about UTF-8.
- If your environment variables (LC_ALL, LC_CTYPE, LANG) look like you
  want to use UTF-8 (any of the variables match /utf-?8/i), your
  STDIN, STDOUT, STDERR handles and the default open layer (see the open
  manpage) are marked as UTF-8. (This feature, like other new features
  that combine Unicode and I/O, works only if you are using PerlIO, but
  that's the default.)
  Note that after this Perl really does assume that everything is UTF-8:
  for example if some input handle is not, Perl will probably very soon
  complain about the input data like this ``Malformed UTF-8 ...'' since
  any old eight-bit data is not legal UTF-8.
  Note for code authors: if you want to enable your users to use UTF-8
  as their default encoding but in your code still have eight-bit I/O
  streams (such as images or zip files), you need to explicitly open() or
  binmode() with :bytes (see open in the perlfunc manpage and binmode in
  the perlfunc manpage), or you can just use binmode(FH) (nice for
  pre-5.8.0 backward compatibility).
- File handles can translate character encodings from/to Perl's internal
  Unicode form on read/write via the ``:encoding()'' layer.
- File handles can be opened to ``in memory'' files held in Perl scalars
  via:
      open($fh,'>', \$variable) || ...
- Anonymous temporary files are available without need to
  'use FileHandle' or other module via
      open($fh,"+>", undef) || ...
  That is a literal undef, not an undefined value.
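The layer and in-memory features above combine naturally. As a hedged sketch (the variable names are illustrative), the following writes one character through a ``:utf8'' layer into a scalar, then reads it back through an ``:encoding()'' layer:

```perl
use strict;
use warnings;

# Write UTF-8 encoded text into an in-memory "file" held in $buffer.
my $buffer = '';
open(my $out, '>:utf8', \$buffer) or die "cannot open in-memory file: $!";
print {$out} chr(0x263A), "\n";
close($out);
print length($buffer), "\n";    # 4: three UTF-8 bytes plus the newline

# Read the bytes back, decoding them through an :encoding() layer.
open(my $in, '<:encoding(UTF-8)', \$buffer) or die "cannot reopen: $!";
my $line = <$in>;
close($in);
chomp $line;
print length($line), "\n";      # 1: a single character again
```

No file on disk is involved, which makes this arrangement handy for testing code that expects a filehandle.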
ithreads
The new interpreter threads (``ithreads'' for short) implementation of multithreading, by Arthur Bergman, replaces the old ``5.005 threads'' implementation. In the ithreads model any data sharing between threads must be explicit, as opposed to the old model where data sharing was implicit. See the threads manpage, the threads::shared manpage, and the perlthrtut manpage. As a part of the ithreads implementation Perl will also use any necessary and detectable reentrant libc interfaces.
Restricted Hashes
A restricted hash is restricted to a certain set of keys; no keys outside the set can be added. Also individual keys can be restricted so that the key cannot be deleted and the value cannot be changed. No new syntax is involved: the Hash::Util module is the interface.
Safe Signals
Perl used to be fragile in that signals arriving at inopportune moments could corrupt Perl's internal state. Now Perl postpones handling of signals until it's safe (between opcodes). This change may have surprising side effects because signals no longer interrupt Perl instantly. Perl will now first finish whatever it was doing, like finishing an internal operation (like sort()) or an
external operation (like an I/O operation), and only then look at any
arrived signals (and before starting the next operation). No more corrupt
internal state since the current operation is always finished first,
but the signal may take more time to get heard. Note that breaking
out from potentially blocking operations should still work, though.
Understanding of Numbers
In general a lot of fixing has happened in the area of Perl's understanding of numbers, both integer and floating point. Since in many systems the standard number parsing functions like strtoul()
and atof() seem to have bugs, Perl tries to work around their
deficiencies. This results hopefully in more accurate numbers.
Perl now tries internally to use integer values in numeric conversions
and basic arithmetics (+ - * /) if the arguments are integers, and
tries also to keep the results stored internally as integers.
This change leads to often slightly faster and always less lossy
arithmetics. (Previously Perl always preferred floating point numbers
in its math.)
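The effect of integer preservation is easiest to see just past the point where a double can no longer represent every integer. The sketch below is platform-dependent: it assumes a Perl built with 64-bit integers and checks $Config{ivsize} before claiming an exact result.

```perl
use strict;
use warnings;
use Config;

my $big = 9007199254740992;    # 2**53, the edge of exactly representable doubles
my $sum = $big + 1;            # stays an integer if IVs are wide enough

if ($Config{ivsize} >= 8) {
    print "$sum\n";            # 9007199254740993 with 64-bit integers
} else {
    print "IVs are $Config{ivsize} bytes wide; the sum is computed in floating point\n";
}
```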
Arrays now always interpolate into double-quoted strings [561]
In double-quoted strings, arrays now interpolate, no matter what. The behavior in earlier versions of perl 5 was that arrays would interpolate into strings if the array had been mentioned before the string was compiled, and otherwise Perl would raise a fatal compile-time error. In versions 5.000 through 5.003, the error was
    Literal @example now requires backslash
In versions 5.004_01 through 5.6.0, the error was
    In string, @example now must be written as \@example
The idea here was to get people into the habit of writing
"fred\@example.com" when they wanted a literal @ sign, just as
they have always written "Give me back my \$5" when they wanted a
literal $ sign.
Starting with 5.6.1, when Perl sees an @ sign in a
double-quoted string, it always attempts to interpolate an array,
regardless of whether or not the array has been used or declared
already. The fatal error has been downgraded to an optional warning:
    Possible unintended interpolation of @example in string
This warns you that "fred@example.com" is going to turn into
fred.com if you don't backslash the @.
See http://www.plover.com/~mjd/perl/at-error.html for more details
about the history here.
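A small sketch of the three cases. The array is declared (and left empty) here so that the example also compiles under use strict; without a declaration the optional warning quoted above would apply.

```perl
use strict;
use warnings;

our @example = ();                        # an empty array named @example

my $interpolated = "fred@example.com";    # @example interpolates as nothing
my $escaped      = "fred\@example.com";   # the backslash keeps a literal @
my $single       = 'fred@example.com';    # single quotes never interpolate

print "$interpolated\n";    # fred.com
print "$escaped\n";         # fred@example.com
print "$single\n";          # fred@example.com
```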
Miscellaneous Changes
- AUTOLOAD is now lvaluable, meaning that you can add the :lvalue
  attribute to AUTOLOAD subroutines and you can assign to the AUTOLOAD
  return value.
- The $Config{byteorder} (and corresponding BYTEORDER in config.h) was
  previously wrong on platforms where sizeof(long) was 4 but sizeof(IV)
  was 8. The byteorder was only sizeof(long) bytes long (1234 or 4321),
  but now it is correctly sizeof(IV) bytes long (12345678 or 87654321).
  (This problem didn't affect Windows platforms.)
  Also, $Config{byteorder} is now computed dynamically--this is more
  robust with ``fat binaries'' where an executable image contains binaries
  for more than one binary platform, and when cross-compiling.
- perl -d:Module=arg,arg,arg now works (previously one couldn't pass
  in multiple arguments.)
- do followed by a bareword now ensures that this bareword isn't
  a keyword (to avoid a bug where do q(foo.pl) tried to call a
  subroutine called q). This means that for example instead of
  do format() you must write do &format().
- The builtin dump() now gives an optional warning
  ``dump() better written as CORE::dump()'',
  meaning that by default dump(...) is resolved as the builtin
  dump() which dumps core and aborts, not as (possibly) user-defined
  sub dump. To call the latter, qualify the call as &dump(...).
  (The whole dump() feature is to be considered deprecated, and possibly
  removed/changed in future releases.)
- chomp() and chop() are now overridable. Note, however, that their
  prototype (as given by prototype("CORE::chomp")) is undefined,
  because it cannot be expressed and therefore one cannot really write
  replacements to override these builtins.
- END blocks are now run even if you exit/die in a BEGIN block.
  Internally, the execution of END blocks is now controlled by
  PL_exit_flags & PERL_EXIT_DESTRUCT_END. This enables the new
  behaviour for Perl embedders. This will become the default in 5.10.
  See the perlembed manpage.
- Formats now support zero-padded decimal fields.
- Although ``you shouldn't do that'', it was possible to write code that
  depends on Perl's hashed key order (Data::Dumper does this). The new
  algorithm ``One-at-a-Time'' produces a different hashed key order.
  More details are in Performance Enhancements.
- lstat(FILEHANDLE) now gives a warning because the operation makes no
  sense. In future releases this may become a fatal error.
- Spurious syntax errors generated in certain situations, when glob()
  caused File::Glob to be loaded for the first time, have been fixed.
  [561]
- Lvalue subroutines can now return undef in list context. However,
  the lvalue subroutine feature still remains experimental. [561+]
- A lost warning ``Can't declare ... dereference in my'' has been
  restored (Perl had it earlier but it became lost in later releases.)
- A new special regular expression variable has been introduced:
  $^N, which contains the most-recently closed group (submatch).
- no Module; does not produce an error even if Module does not have an
  unimport() method. This parallels the behavior of use vis-a-vis
  import. [561]
- The numerical comparison operators return undef if either operand
  is a NaN. Previously the behaviour was unspecified.
- our can now have an experimental optional attribute unique that
  affects how global variables are shared among multiple interpreters,
  see our in the perlfunc manpage.
- The following builtin functions are now overridable: each(), keys(),
  pop(), push(), shift(), splice(), unshift(). [561]
- pack() / unpack() can now group template letters with () and then
  apply repetition/count modifiers on the groups.
- pack() / unpack() can now process the Perl internal numeric types:
  IVs, UVs, NVs-- and also long doubles, if supported by the platform.
  The template letters are j, J, F, and D.
- pack('U0a*', ...) can now be used to force a string to UTF-8.
- my PACKAGE $obj now works. [561]
- POSIX::sleep() now returns the number of unslept seconds
  (as the POSIX standard says), as opposed to CORE::sleep() which
  returns the number of slept seconds.
- printf() and sprintf() now support parameter reordering using the
  %\d+\$ and *\d+\$ syntaxes. For example
      printf "%2\$s %1\$s\n", "foo", "bar";
  will print ``bar foo\n''. This feature helps in writing
  internationalised software, and in general when the order
  of the parameters can vary.
- The (\&) prototype now works properly. [561]
- prototype(\[$@%&]) is now available to implicitly create references
  (useful for example if you want to emulate the tie() interface).
- A new command-line option, -t, is available. It is the
  little brother of -T: instead of dying on taint violations,
  lexical warnings are given. This is only meant as a temporary
  debugging aid while securing the code of old legacy applications.
  This is not a substitute for -T.
- In other taint news, the exec LIST and system LIST have now been
  considered too risky (think exec @ARGV: it can start any program
  with any arguments), and now the said forms cause a warning under
  lexical warnings. You should carefully launder the arguments to
  guarantee their validity. In future releases of Perl the forms will
  become fatal errors so consider starting laundering now.
- Tied hash interfaces are now required to have the EXISTS and DELETE
  methods (either own or inherited).
- If tr/// is just counting characters, it doesn't attempt to
  modify its target.
- untie() will now call an UNTIE() hook if it exists. See the perltie
  manpage for details. [561]
- utime now supports utime undef, undef, @files to change the
  file timestamps to the current time.
- The rules for allowing underscores (underbars) in numeric constants
  have been relaxed and simplified: now you can have an underscore
  simply between digits.
- Rather than relying on C's argv[0] (which may not contain a full
  pathname), where possible $^X is now set by asking the operating
  system (e.g. by reading /proc/self/exe on Linux, /proc/curproc/file
  on FreeBSD).
- A new variable, ${^TAINT}, indicates whether taint mode is enabled.
- You can now override the readline() builtin, and this overrides also
  the <FILEHANDLE> angle bracket operator.
- The command-line options -s and -F are now recognized on the shebang
  (#!) line.
- Use of the /c match modifier without an accompanying /g modifier
  elicits a new warning: ``Use of /c modifier is meaningless without /g''.
  Use of /c in substitutions, even with /g, elicits
  ``Use of /c modifier is meaningless in s///''.
  Use of /g with split elicits ``Use of /g modifier is meaningless
  in split''.
- Support for the CLONE special subroutine has been added.
  With ithreads, when a new thread is created, all Perl data is cloned,
  however non-Perl data cannot be cloned automatically. In CLONE you
  can do whatever you need to do, like for example handle the cloning of
  non-Perl data, if necessary. CLONE will be executed once for every
  package that has it defined or inherited. It will be called in the
  context of the new thread, so all modifications are made in the new
  area. See the perlmod manpage.
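Several of the smaller changes listed above can be exercised in a few lines. The following is only a sketch with made-up sample data; it touches $^N, pack()/unpack() groups, underscores in numeric constants, and sprintf() parameter reordering.

```perl
use strict;
use warnings;

# $^N: text matched by the most recently closed capture group.
my $day;
if ("2002-07-18" =~ /(\d+)-(\d+)-(\d+)/) {
    $day = $^N;                            # group 3 was closed last
}
print "$day\n";                            # 18

# pack()/unpack() groups: repeat the two-character field "A2".
my $packed = pack("(A2)3", "ab", "cd", "ef");
my @fields = unpack("(A2)*", $packed);
print "$packed\n";                         # abcdef
print scalar(@fields), "\n";               # 3

# Underscores are allowed between digits in numeric constants.
my $million = 1_000_000;
print $million == 1000000 ? "same\n" : "different\n";    # same

# Parameter reordering in sprintf().
my $swapped = sprintf('%2$s %1$s', "foo", "bar");
print "$swapped\n";                        # bar foo
```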
Modules and Pragmata
New Modules and Pragmata
Attribute::Handlers, originally by Damian Conway and now maintained
by Arthur Bergman, allows a class to define attribute handlers.
package MyPack;
use Attribute::Handlers;
sub Wolf :ATTR(SCALAR) { print "howl!\n" }
# later, in some package using or inheriting from MyPack...
my MyPack $Fluffy : Wolf; # the attribute handler Wolf will be called
Both variables and routines can have attribute handlers. Handlers can
be specific to type (SCALAR, ARRAY, HASH, or CODE), or specific to the
exact compilation phase (BEGIN, CHECK, INIT, or END).
See the Attribute::Handlers manpage.
B::Concise, by Stephen McCamant? , is a new compiler backend for
walking the Perl syntax tree, printing concise info about ops.
The output is highly customisable. See the B::Concise manpage. [561+]
The new bignum, bigint, and bigrat pragmas, by Tels, implement
transparent bignum support (using the Math::BigInt, Math::BigFloat,
and Math::BigRat backends).
Class::ISA, by Sean Burke, is a module for reporting the search
path for a class's ISA tree. See the Class::ISA manpage.
Cwd now has a split personality: if possible, an XS extension is
used, (this will hopefully be faster, more secure, and more robust)
but if not possible, the familiar Perl implementation is used.
Devel::PPPort, originally by Kenneth Albanowski and now
maintained by Paul Marquess, has been added. It is primarily used
by h2xs to enhance portability of XS modules between different
versions of Perl. See the Devel::PPPort manpage.
Digest, frontend module for calculating digests (checksums), from
Gisle Aas, has been added. See the Digest manpage.
Digest::MD5 for calculating MD5 digests (checksums) as defined in
RFC 1321, from Gisle Aas, has been added. See the Digest::MD5 manpage.
use Digest::MD5 'md5_hex';
$digest = md5_hex("Thirsty Camel");
print $digest, "\n"; # 01d19d9d2045e005c3f1b80e8b164de1
NOTE: the MD5 backward compatibility module is deliberately not
included since its further use is discouraged.
See also the PerlIO? ::via::QuotedPrint manpage.
Encode, originally by Nick Ing-Simmons and now maintained by Dan
Kogai, provides a mechanism to translate between different character
encodings. Support for Unicode, ISO-8859-1, and ASCII are compiled in
to the module. Several other encodings (like the rest of the
ISO-8859, CP*/Win*, Mac, KOI8-R, three variants EBCDIC, Chinese,
Japanese, and Korean encodings) are included and can be loaded at
runtime. (For space considerations, the largest Chinese encodings
have been separated into their own CPAN module, Encode::HanExtra,
which Encode will use if available). See the Encode manpage.
Any encoding supported by Encode module is also available to the
``:encoding()'' layer if PerlIO? is used.
Hash::Util is the interface to the new restricted hashes
feature. (Implemented by Jeffrey Friedl, Nick Ing-Simmons, and
Michael Schwern.) See the Hash::Util manpage.
I18N::Langinfo can be used to query locale information.
See the I18N? ::Langinfo manpage.
I18N::LangTags, by Sean Burke, has functions for dealing with
RFC3066-style language tags. See the I18N? ::LangTags manpage.
ExtUtils::Constant, by Nicholas Clark, is a new tool for extension
writers for generating XS code to import C header constants.
See the ExtUtils? ::Constant manpage.
Filter::Simple, by Damian Conway, is an easy-to-use frontend to
Filter::Util::Call. See the Filter::Simple manpage.
# in MyFilter.pm:
package MyFilter;
use Filter::Simple sub {
while (my ($from, $to) = splice @_, 0, 2) {
s/$from/$to/g;
}
};
1;
# in user's code:
use MyFilter qr/red/ => 'green';
print "red\n"; # this code is filtered, will print "green\n"
print "bored\n"; # this code is filtered, will print "bogreen\n"
no MyFilter;
print "red\n"; # this code is not filtered, will print "red\n"
File::Temp, by Tim Jenness, allows one to create temporary files
and directories in an easy, portable, and secure way. See the File::Temp manpage.
[561+]
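For illustration, a minimal sketch of creating a scratch directory and a uniquely named file inside it. The template name is an arbitrary choice.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile tempdir);

# A scratch directory that is removed when the program exits.
my $dir = tempdir(CLEANUP => 1);

# A uniquely named file inside it; returns an open handle and the name.
my ($fh, $filename) = tempfile('scratchXXXXXX', DIR => $dir);
print {$fh} "temporary data\n";
close($fh) or die "close failed: $!";

my $size = -s $filename;
print "wrote $size bytes to $filename\n";
```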
Filter::Util::Call, by Paul Marquess, provides you with the
framework to write source filters in Perl. For most uses, the
frontend Filter::Simple is to be preferred. See the Filter::Util::Call manpage.
if, by Ilya Zakharevich, is a new pragma for conditional inclusion
of modules.
libnet, by Graham Barr, is a collection of perl5 modules related
to network programming. See the Net::FTP manpage, the Net::NNTP manpage, the Net::Ping manpage
(not part of libnet, but related), the Net::POP3 manpage, the Net::SMTP manpage,
and the Net::Time manpage.
Perl installation leaves libnet unconfigured; use libnetcfg
to configure it.
List::Util, by Graham Barr, is a selection of general-utility
list subroutines, such as sum(), min(), first(), and shuffle().
See the List::Util manpage.
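The subroutines named above can be tried out as follows; the input list is arbitrary.

```perl
use strict;
use warnings;
use List::Util qw(sum min max first shuffle);

my @numbers = (3, 1, 4, 1, 5, 9);

my $total     = sum(@numbers);               # 23
my $smallest  = min(@numbers);               # 1
my $largest   = max(@numbers);               # 9
my $first_big = first { $_ > 3 } @numbers;   # 4, the first element above 3
my @shuffled  = shuffle(@numbers);           # same elements, random order

print "sum=$total min=$smallest max=$largest first>3=$first_big\n";
```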
Locale::Constants, Locale::Country, Locale::Currency,
Locale::Language, and Locale::Script, by Neil Bowers, have
been added. They provide the codes for various locale standards, such
as ``fr'' for France, ``usd'' for US Dollar, and ``ja'' for Japanese.
use Locale::Country;
$country = code2country('jp'); # $country gets 'Japan'
$code = country2code('Norway'); # $code gets 'no'
See the Locale::Constants manpage, the Locale::Country manpage, the Locale::Currency manpage,
and the Locale::Language manpage.
Locale::Maketext, by Sean Burke, is a localization framework. See
the Locale::Maketext manpage, and the Locale::Maketext::TPJ13 manpage. The latter is an
article about software localization, originally published in The Perl
Journal #13, and republished here with kind permission.
Math::BigRat for big rational numbers, to accompany Math::BigInt and
Math::BigFloat, from Tels. See the Math::BigRat manpage.
Memoize can make your functions faster by trading space for time,
from Mark-Jason Dominus. See the Memoize manpage.
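A small sketch of the space-for-time trade: a naive recursive Fibonacci function, with a call counter added purely for illustration, is memoized so that each distinct argument is computed only once.

```perl
use strict;
use warnings;
use Memoize;

my $calls = 0;    # counts how often the real function body runs

sub fib {
    my $n = shift;
    $calls++;
    return $n if $n < 2;
    return fib($n - 1) + fib($n - 2);
}

# Replace fib() with a caching wrapper; recursive calls use the cache too.
memoize('fib');

my $result = fib(25);
print "fib(25) = $result after $calls real calls\n";
```

Without the memoize() call the function body would run hundreds of thousands of times for the same argument.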
MIME::Base64, by Gisle Aas, allows you to encode data in base64,
as defined in RFC 2045 - MIME (Multipurpose Internet Mail
Extensions).
use MIME::Base64;
$encoded = encode_base64('Aladdin:open sesame');
$decoded = decode_base64($encoded);
print $encoded, "\n"; # "QWxhZGRpbjpvcGVuIHNlc2FtZQ=="
See the MIME::Base64 manpage.
MIME::QuotedPrint, by Gisle Aas, allows you to encode data
in quoted-printable encoding, as defined in RFC 2045 - MIME
(Multipurpose Internet Mail Extensions).
use MIME::QuotedPrint;
$encoded = encode_qp("\xDE\xAD\xBE\xEF");
$decoded = decode_qp($encoded);
print $encoded, "\n"; # "=DE=AD=BE=EF\n"
print $decoded, "\n"; # "\xDE\xAD\xBE\xEF\n"
See also the PerlIO::via::QuotedPrint manpage.
NEXT, by Damian Conway, is a pseudo-class for method redispatch.
See the NEXT manpage.
open is a new pragma for setting the default I/O layers
for open().
PerlIO::scalar, by Nick Ing-Simmons, provides the implementation
of IO to ``in memory'' Perl scalars as discussed above. It also serves
as an example of a loadable PerlIO layer. Other future possibilities
include PerlIO::Array and PerlIO::Code. See the PerlIO::scalar manpage.
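In-memory IO is reached by passing a scalar reference to open(). A minimal sketch, with made-up content:

```perl
use strict;
use warnings;

# Writing: the "file" is the scalar $buffer.
my $buffer = '';
open(my $out, '>', \$buffer) or die "cannot open in-memory file: $!";
print {$out} "first line\n";
print {$out} "second line\n";
close($out);

# Reading: the same scalar is opened as an input file.
open(my $in, '<', \$buffer) or die "cannot reopen in-memory file: $!";
my @lines = <$in>;
close($in);

print "read ", scalar(@lines), " lines from memory\n";
```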
PerlIO::via, by Nick Ing-Simmons, acts as a PerlIO layer and wraps
PerlIO layer functionality provided by a class (typically implemented
in Perl code).
PerlIO::via::QuotedPrint, by Elizabeth Mattijsen, is an example
of a PerlIO::via class:
use PerlIO::via::QuotedPrint;
open($fh,">:via(QuotedPrint)",$path);
This will automatically convert everything output to $fh to
Quoted-Printable. See the PerlIO::via manpage and the PerlIO::via::QuotedPrint manpage.
Pod::ParseLink, by Russ Allbery, has been added,
to parse L<> links in pods as described in the new
perlpodspec.
Pod::Text::Overstrike, by Joe Smith, has been added.
It converts POD data to formatted overstrike text.
See the Pod::Text::Overstrike manpage. [561+]
Scalar::Util is a selection of general-utility scalar subroutines,
such as blessed(), reftype(), and tainted(). See the Scalar::Util manpage.
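A brief sketch of blessed() and reftype(); the class name Camel is hypothetical.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed reftype);

my $object = bless { humps => 2 }, 'Camel';
my $plain  = [1, 2, 3];

my $class       = blessed($object);   # 'Camel'
my $type        = reftype($object);   # 'HASH', the underlying data type
my $plain_class = blessed($plain);    # undef, since it is not an object
my $plain_type  = reftype($plain);    # 'ARRAY'

print "object is a $class built on a $type\n";
```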
sort is a new pragma for controlling the behaviour of sort().
Storable gives persistence to Perl data structures by allowing the
storage and retrieval of Perl data to and from files in a fast and
compact binary format. Because in effect Storable does serialisation
of Perl data structures, with it you can also clone deep, hierarchical
datastructures. Storable was originally created by Raphael Manfredi,
but it is now maintained by Abhijit Menon-Sen. Storable has been
enhanced to understand the two new hash features, Unicode keys and
restricted hashes. See the Storable manpage.
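The deep-cloning and serialisation uses mentioned above might look like this in a minimal sketch; the data structure is invented for illustration.

```perl
use strict;
use warnings;
use Storable qw(dclone freeze thaw);

my $original = { name => 'camel', humps => [1, 2] };

# dclone() makes a deep copy: nested structures are duplicated, not shared.
my $copy = dclone($original);
push @{ $copy->{humps} }, 3;

# freeze()/thaw() serialise to and from an in-memory binary string.
my $frozen   = freeze($original);
my $restored = thaw($frozen);

printf "original has %d humps, copy has %d\n",
    scalar @{ $original->{humps} }, scalar @{ $copy->{humps} };
```

For files, store() and retrieve() play the same roles as freeze() and thaw().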
Switch, by Damian Conway, has been added. Just by saying
use Switch;
you have switch and case available in Perl.
use Switch;
switch ($val) {
case 1 { print "number 1" }
case "a" { print "string a" }
case [1..10,42] { print "number in list" }
case (@array) { print "number in list" }
case /\w+/ { print "pattern" }
case qr/\w+/ { print "pattern" }
case (%hash) { print "entry in hash" }
case (\%hash) { print "entry in hash" }
case (\&sub) { print "arg to subroutine" }
else { print "previous case not true" }
}
See the Switch manpage.
Test::More, by Michael Schwern, is yet another framework for writing
test scripts, more extensive than Test::Simple. See the Test::More manpage.
Test::Simple, by Michael Schwern, has basic utilities for writing
tests. See the Test::Simple manpage.
Text::Balanced, by Damian Conway, has been added, for extracting
delimited text sequences from strings.
use Text::Balanced 'extract_delimited';
($a, $b) = extract_delimited("'never say never', he never said", "'", '');
$a will be ``'never say never''', $b will be ', he never said'.
In addition to extract_delimited(), there are also extract_bracketed(),
extract_quotelike(), extract_codeblock(), extract_variable(),
extract_tagged(), extract_multiple(), gen_delimited_pat(), and
gen_extract_tagged(). With these, you can implement rather advanced
parsing algorithms. See the Text::Balanced manpage.
threads, by Arthur Bergman, is an interface to interpreter threads.
Interpreter threads (ithreads) are the new thread model, introduced in
Perl 5.6 but until now available only as an internal interface for extension
writers (and for Win32 Perl for fork() emulation). See the threads manpage,
the threads::shared manpage, and the perlthrtut manpage.
threads::shared, by Arthur Bergman, allows data sharing for
interpreter threads. See the threads::shared manpage.
Tie::File, by Mark-Jason Dominus, associates a Perl array with the
lines of a file. See the Tie::File manpage.
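A minimal sketch of editing a file through a tied array. It uses File::Temp only to obtain a scratch file; the file contents are made up.

```perl
use strict;
use warnings;
use Tie::File;
use File::Temp qw(tempfile);

# Create a small scratch file to work on.
my ($fh, $path) = tempfile(UNLINK => 1);
print {$fh} "alpha\nbeta\ngamma\n";
close($fh) or die "close failed: $!";

# Each array element now corresponds to one line of the file.
tie(my @lines, 'Tie::File', $path) or die "cannot tie $path: $!";
$lines[1] = 'BETA';       # rewrite the second line in place
push @lines, 'delta';     # append a new line
my $count = @lines;
untie @lines;

# Read the file back the ordinary way to see the result.
open(my $in, '<', $path) or die "cannot read $path: $!";
my $content = do { local $/; <$in> };
close($in);
print $content;
```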
Tie::Memoize, by Ilya Zakharevich, provides on-demand loaded hashes.
See the Tie::Memoize manpage.
Tie::RefHash::Nestable, by Edward Avis, allows storing hash
references (unlike the standard Tie::RefHash). The module is contained
within Tie::RefHash. See the Tie::RefHash manpage.
Time::HiRes, by Douglas E. Wegscheid, provides high resolution
timing (ualarm, usleep, and gettimeofday). See the Time::HiRes manpage.
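As a rough sketch, sub-second timing with gettimeofday() and usleep() could look like this; tv_interval() is a helper from the same module, and the 50 ms pause is arbitrary.

```perl
use strict;
use warnings;
use Time::HiRes qw(gettimeofday tv_interval usleep);

# Record a start time with microsecond resolution.
my $start = [gettimeofday];

# Sleep for 50,000 microseconds (50 ms).
usleep(50_000);

# Elapsed time in (fractional) seconds since $start.
my $elapsed = tv_interval($start);
printf "slept for about %.3f seconds\n", $elapsed;
```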
Unicode::UCD offers a querying interface to the Unicode Character
Database. See the Unicode::UCD manpage.
Unicode::Collate, by SADAHIRO Tomoyuki, implements the UCA
(Unicode Collation Algorithm) for sorting Unicode strings.
See the Unicode::Collate manpage.
Unicode::Normalize, by SADAHIRO Tomoyuki, implements the various
Unicode normalization forms. See the Unicode::Normalize manpage.
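A minimal sketch of two of the normalization forms, using a single accented character as sample data.

```perl
use strict;
use warnings;
use Unicode::Normalize qw(NFC NFD);

my $composed = "\x{e9}";              # precomposed e with acute accent

# NFD splits it into a base letter plus a combining accent (U+0301).
my $decomposed = NFD($composed);

# NFC puts the pieces back together.
my $recomposed = NFC($decomposed);

printf "NFD length: %d, NFC length: %d\n",
    length($decomposed), length($recomposed);
```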
XS::APItest, by Tim Jenness, is a test extension that exercises XS
APIs. Currently only printf() is tested: how to output various
basic data types from XS.
XS::Typemap, by Tim Jenness, is a test extension that exercises
XS typemaps. Nothing gets installed, but the code is worth studying
for extension writers.
Updated And Improved Modules and Pragmata
The following independently supported modules have been updated to the
newest versions from CPAN: CGI, CPAN, DB_File, File::Spec, File::Temp,
Getopt::Long, Math::BigFloat, Math::BigInt, the podlators bundle
(Pod::Man, Pod::Text), Pod::LaTeX [561+], Pod::Parser, Storable,
Term::ANSIColor, Test, Text-Tabs+Wrap.
attributes::reftype() now works on tied arguments.
AutoLoader can now be disabled with no AutoLoader;.
B::Deparse has been significantly enhanced by Robin Houston. It can
now deparse almost all of the standard test suite (so that the tests
still succeed). There is a make target ``test.deparse'' for trying this
out.
Carp now has better interface documentation, and the @CARP_NOT
interface has been added to get optional control over where errors
are reported independently of @ISA, by Ben Tilly.
Class::Struct can now define the classes at compile time.
Class::Struct now assigns the array/hash element if the accessor
is called with an array/hash element as the sole argument.
The return value of Cwd::fastcwd() is now tainted.
Data::Dumper now has an option to sort hashes.
Data::Dumper now has an option to dump code references
using B::Deparse.
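Both Data::Dumper options are switched on through package variables. A sketch with an invented data structure:

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Sortkeys = 1;   # dump hash keys in sorted order
$Data::Dumper::Deparse  = 1;   # dump code references as source via B::Deparse
$Data::Dumper::Indent   = 1;   # compact, fixed indentation

my %data = (
    zebra  => 3,
    apple  => 1,
    mango  => 2,
    double => sub { return 2 * shift },
);

my $dump = Dumper(\%data);
print $dump;
```

Without Deparse set, the code reference is dumped as a placeholder sub rather than as its source.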
DB_File now supports newer Berkeley DB versions, among
other improvements.
Devel::Peek now has an interface for the Perl memory statistics
(this works only if you are using perl's malloc, and if you have
compiled with debugging).
The English module can now be used without the infamous performance
hit by saying
use English '-no_match_vars';
(Assuming, of course, that you don't need the troublesome variables
$`, $&, or $'.) Also introduced are the @LAST_MATCH_START and
@LAST_MATCH_END English aliases for @- and @+.
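The aliases can be used like this, in a small sketch with a made-up string and pattern:

```perl
use strict;
use warnings;
use English qw(-no_match_vars);

my $string = 'hello world';
my ($whole_start, $whole_end, $group_start, $group_end);

if ($string =~ /(wor)ld/) {
    # Element 0 describes the whole match, element 1 the first group.
    ($whole_start, $whole_end) = ($LAST_MATCH_START[0], $LAST_MATCH_END[0]);
    ($group_start, $group_end) = ($LAST_MATCH_START[1], $LAST_MATCH_END[1]);
}

print "match spans $whole_start..$whole_end, ",
      "group spans $group_start..$group_end\n";
```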
ExtUtils::MakeMaker has been significantly cleaned up and fixed.
The enhanced version has also been backported to earlier releases
of Perl and submitted to CPAN so that the earlier releases can
enjoy the fixes.
The arguments of WriteMakefile() in Makefile.PL are now checked
for sanity much more carefully than before. This may cause new
warnings when modules are being installed. See the ExtUtils::MakeMaker manpage
for more details.
ExtUtils::MakeMaker now uses File::Spec internally, which hopefully
leads to better portability.
Fcntl, Socket, and Sys::Syslog have been rewritten by Nicholas Clark
to use the new-style constant dispatch section (see the ExtUtils::Constant manpage).
This means that they will be more robust and hopefully faster.
File::Find now chdir()s correctly when chasing symbolic links. [561]
File::Find now has pre- and post-processing callbacks. It also
correctly changes directories when chasing symbolic links. Callbacks
(naughtily) exiting with ``next;'' instead of ``return;'' now work.
File::Find is now (again) reentrant. It also has been made
more portable.
The warnings issued by File::Find now belong to their own category.
You can enable/disable them with use/no warnings 'File::Find';.
File::Glob::glob() has been renamed to File::Glob::bsd_glob()
because the name clashes with the builtin glob(). The older
name is still available for compatibility, but is deprecated. [561]
File::Glob now supports GLOB_LIMIT constant to limit the size of
the returned list of filenames.
IPC::Open3 now allows the use of numeric file descriptors.
IO::Socket now has an atmark() method, which returns true if the socket
is positioned at the out-of-band mark. The method is also exportable
as a sockatmark() function.
IO::Socket::INET failed to open the specified port if the service name
was not known. It now correctly uses the supplied port number as is. [561]
IO::Socket::INET has support for the ReusePort option (if your
platform supports it). The Reuse option now has an alias, ReuseAddr.
For clarity, you may want to prefer ReuseAddr.
IO::Socket::INET now supports a value of zero for LocalPort
(usually meaning that the operating system will make one up.)
'use lib' now works identically to @INC. Removing directories
with 'no lib' now works.
Math::BigFloat and Math::BigInt have undergone a full rewrite by Tels.
They are now magnitudes faster, and they support various bignum
libraries such as GMP and PARI as their backends.
Math::Complex handles inf, NaN, etc. better.
Net::Ping has been considerably enhanced by Rob Brown: multihoming is
now supported, Win32 functionality is better, there is now time
measuring functionality (optionally high-resolution using
Time::HiRes), and there is now an ``external'' protocol which uses the
Net::Ping::External module, which runs your external ping utility and
parses the output. A version of Net::Ping::External is available in
CPAN.
Note that some of the Net::Ping tests are disabled when running
under the Perl distribution since one cannot assume one or more
of the following: enabled echo port at localhost, full Internet
connectivity, or sympathetic firewalls. You can set the environment
variable PERL_TEST_Net_Ping to ``1'' (one) before running the Perl test
suite to enable all the Net::Ping tests.
POSIX::sigaction() is now much more flexible and robust.
You can now install coderef handlers as well as 'DEFAULT' and 'IGNORE'
handlers. Previously, installing new handlers was not atomic.
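A sketch of installing a coderef handler and then an 'IGNORE' handler, assuming a Unix-like platform where SIGUSR1 exists. The counter variable is illustrative only.

```perl
use strict;
use warnings;
use POSIX qw(sigaction SIGUSR1);

my $received = 0;

# Install a code reference as the handler for SIGUSR1.
my $counting = POSIX::SigAction->new(sub { $received++ });
sigaction(SIGUSR1, $counting) or die "sigaction failed: $!";
kill 'USR1', $$;    # send the signal to this process

# Switch to the 'IGNORE' handler; further signals are discarded.
my $ignoring = POSIX::SigAction->new('IGNORE');
sigaction(SIGUSR1, $ignoring) or die "sigaction failed: $!";
kill 'USR1', $$;

print "handler ran $received time(s)\n";
```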
In Safe, %INC is now localised in a Safe compartment so that
use/require work.
In SDBM_File on dosish platforms, some keys went missing because of
lack of support for files with ``holes''. A workaround for the problem
has been added.
In Search::Dict one can now have a pre-processing hook for the
lines being searched.
The Shell module now has an OO interface.
In Sys::Syslog there is now a failover mechanism that will go
through alternative connection mechanisms until the message
is successfully logged.
The Test module has been significantly enhanced.
Time::Local::timelocal() does not handle fractional seconds anymore.
The rationale is that neither does localtime(), and timelocal() and
localtime() are supposed to be inverses of each other.
The vars pragma now supports declaring fully qualified variables.
(Something that our() does not and will not support.)
The utf8:: name space (as in the pragma) provides various
Perl-callable functions to provide low level access to Perl's
internal Unicode representation. At the moment only length()
has been implemented.
Utility Changes
Emacs perl mode (emacs/cperl-mode.el) has been updated to version
4.31.
emacs/e2ctags.pl is now much faster.
enc2xs is a tool for people adding their own encodings to the
Encode module.
h2ph now supports C trigraphs.
h2xs now produces a template README.
h2xs now uses Devel::PPPort for better portability between
different versions of Perl.
h2xs uses the new ExtUtils::Constant module
which will affect newly created extensions that define constants.
Since the new code is more correct (if you have two constants where the
first one is a prefix of the second one, the first constant never
got defined), less lossy (it uses integers for integer constant,
as opposed to the old code that used floating point numbers even for
integer constants), and slightly faster, you might want to consider
regenerating your extension code (the new scheme makes regenerating
easy). h2xs now also supports C trigraphs.
libnetcfg has been added to configure libnet.
perlbug is now much more robust. It also sends the bug report to
perl.org, not perl.com.
perlcc has been rewritten and its user interface (that is,
command line) is much more like that of the UNIX C compiler, cc.
(The perlbc tool has been removed. Use perlcc -B instead.)
Note that perlcc is still considered very experimental and
unsupported. [561]
perlivp is a new Installation Verification Procedure utility
for running any time after installing Perl.
piconv is an implementation of the character conversion utility
iconv, demonstrating the new Encode module.
pod2html now allows specifying a cache directory.
pod2html now produces XHTML 1.0.
pod2html now understands POD written using different line endings
(PC-like CRLF versus UNIX-like LF versus MacClassic-like CR).
s2p has been completely rewritten in Perl. (It is in fact a full
implementation of sed in Perl: you can use the sed functionality by
using the psed utility.)
xsubpp now understands POD documentation embedded in the *.xs
files. [561]
xsubpp now supports the OUT keyword.
New Documentation
perl56delta details the changes between the 5.005 release and the
5.6.0 release.
perlclib documents the internal replacements for standard C library
functions. (Interesting only for extension writers and Perl core
hackers.) [561+]
perldebtut is a Perl debugging tutorial. [561+]
perlebcdic contains considerations for running Perl on EBCDIC
platforms. [561+]
perlintro is a gentle introduction to Perl.
perliol documents the internals of PerlIO with layers.
perlmodstyle is a style guide for writing modules.
perlnewmod tells about writing and submitting a new module. [561+]
perlpacktut is a pack() tutorial.
perlpod has been rewritten to be clearer and to record the best
practices gathered over the years.
perlpodspec is a more formal specification of the pod format,
mainly of interest for writers of pod applications, not to
people writing in pod.
perlretut is a regular expression tutorial. [561+]
perlrequick is a regular expressions quick-start guide.
Yes, much quicker than perlretut. [561]
perltodo has been updated.
perltootc has been renamed as perltooc (so as not to conflict
with perltoot in filesystems restricted to ``8.3'' names).
perluniintro is an introduction to using Unicode in Perl.
(perlunicode is more of a detailed reference and background
information.)
perlutil explains the command line utilities packaged with the Perl
distribution. [561+]
perlaix perlamiga perlapollo perlbeos perlbs2000
perlce perlcygwin perldgux perldos perlepoc perlfreebsd perlhpux
perlhurd perlirix perlmachten perlmacos perlmint perlmpeix
perlnetware perlos2 perlos390 perlplan9 perlqnx perlsolaris
perltru64 perluts perlvmesa perlvms perlvos perlwin32
These documents usually detail one or more of the following subjects:
configuring, building, testing, installing, and sometimes also using
Perl on the said platform.
Eastern Asian Perl users are now welcomed in their own languages:
README.jp (Japanese), README.ko (Korean), README.cn (simplified
Chinese) and README.tw (traditional Chinese), which are written in
normal pod but encoded in EUC-JP, EUC-KR, EUC-CN and Big5. These
will get installed as
perljp perlko perlcn perltw
The documentation for the POSIX-BC platform is called ``BS2000'', to avoid
confusion with the Perl POSIX module.
The documentation for the WinCE platform is called perlce (README.ce
in the source code kit), to avoid confusion with the perlwin32
documentation on 8.3-restricted filesystems.
Performance Enhancements
map() could get pathologically slow when the result list it generates
is larger than the source list. The performance has been improved for
common scenarios. [561]
sort() is also fully reentrant, in the sense that the sort function
can itself call sort(). This did not work reliably in previous
releases. [561]
sort() has been changed to use primarily mergesort internally as
opposed to the earlier quicksort. For very small lists this may
result in slightly slower sorting times, but in general the speedup
should be at least 20%. Additional bonuses are that the worst case
behaviour of sort() is now better (in computer science terms it now
runs in time O(N log N), as opposed to quicksort's Theta(N**2)
worst-case run time behaviour), and that sort() is now stable
(meaning that elements with identical keys will stay ordered as they
were before the sort). See the sort pragma for information.
The story in more detail: suppose you want to serve yourself a little
slice of Pi.
@digits = ( 3,1,4,1,5,9 );
A numerical sort of the digits will yield (1,1,3,4,5,9), as expected.
Which 1 comes first is hard to know, since one 1 looks pretty
much like any other. You can regard this as totally trivial,
or somewhat profound. However, if you just want to sort the even
digits ahead of the odd ones, then what will
sort { ($a % 2) <=> ($b % 2) } @digits;
yield? The only even digit, 4, will come first. But how about
the odd numbers, which all compare equal? With the quicksort algorithm
used to implement Perl 5.6 and earlier, the order of ties is left up
to the sort. So, as you add more and more digits of Pi, the order
in which the sorted even and odd digits appear will change.
And, for sufficiently large slices of Pi, the quicksort algorithm
in Perl 5.8 won't return the same results even if reinvoked with the
same input. The justification for this rests with quicksort's
worst case behavior. If you run
sort { $a <=> $b } ( 1 .. $N , 1 .. $N );
(something you might approximate if you wanted to merge two sorted
arrays using sort), doubling $N doesn't just double the quicksort time,
it quadruples it. Quicksort has a worst case run time that can
grow like N**2, so-called quadratic behaviour, and it can happen
on patterns that may well arise in normal use. You won't notice this
for small arrays, but you will notice it with larger arrays,
and you may not live long enough for the sort to complete on arrays
of a million elements. So the 5.8 quicksort scrambles large arrays
before sorting them, as a statistical defence against quadratic behaviour.
But that means if you sort the same large array twice, ties may be
broken in different ways.
Because of the unpredictability of tie-breaking order, and the quadratic
worst-case behaviour, quicksort was almost replaced completely with
a stable mergesort. Stable means that ties are broken to preserve
the original order of appearance in the input array. So
sort { ($a % 2) <=> ($b % 2) } (3,1,4,1,5,9);
will yield (4,3,1,1,5,9), guaranteed. The even and odd numbers
appear in the output in the same order they appeared in the input.
Mergesort has worst case O(N log N) behaviour, the best value
attainable. And, ironically, this mergesort does particularly
well where quicksort goes quadratic: mergesort sorts (1..$N, 1..$N)
in O(N) time. But quicksort was rescued at the last moment because
it is faster than mergesort on certain inputs and platforms.
For example, if you really don't care about the order of even
and odd digits, quicksort will run in O(N) time; it's very good
at sorting many repetitions of a small number of distinct elements.
The quicksort divide and conquer strategy works well on platforms
with relatively small, very fast, caches. Eventually, the problem gets
whittled down to one that fits in the cache, from which point it
benefits from the increased memory speed.
Quicksort was rescued by implementing a sort pragma to control aspects
of the sort. The stable subpragma forces stable behaviour,
regardless of algorithm. The _quicksort and _mergesort
subpragmas are heavy-handed ways to select the underlying implementation.
The leading _ is a reminder that these subpragmas may not survive
beyond 5.8. More appropriate mechanisms for selecting the implementation
exist, but they wouldn't have arrived in time to save quicksort.
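The stability guarantee can be seen in a short sketch. The first sort is the parity example from the text; the second uses made-up records to show ties keeping their input order.

```perl
use strict;
use warnings;
use sort 'stable';    # request stable behaviour regardless of algorithm

# Even digits first; the odd digits all compare equal and keep their order.
my @digits    = (3, 1, 4, 1, 5, 9);
my @by_parity = sort { ($a % 2) <=> ($b % 2) } @digits;
print "@by_parity\n";    # 4 3 1 1 5 9

# Records sorted by count only; names with equal counts stay in input order.
my @records  = (['pear', 2], ['apple', 1], ['fig', 2], ['kiwi', 1]);
my @by_count = map { $_->[0] } sort { $a->[1] <=> $b->[1] } @records;
print "@by_count\n";     # apple kiwi pear fig
```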
Hashes now use Bob Jenkins' ``One-at-a-Time'' hashing key algorithm
( http://burtleburtle.net/bob/hash/doobs.html ). This algorithm is
reasonably fast while producing a much better spread of values than
the old hashing algorithm (originally by Chris Torek, later tweaked by
Ilya Zakharevich). Hash values output from the algorithm on a hash of
all 3-char printable ASCII keys come much closer to passing the
DIEHARD random number generation tests. According to perlbench, this
change has not affected the overall speed of Perl.
unshift() should now be noticeably faster.
Installation and Configuration Improvements
Generic Improvements
INSTALL now explains how you can configure Perl to use 64-bit
integers even on non-64-bit platforms.
Policy.sh policy change: if you are reusing a Policy.sh file
(see INSTALL) and you use Configure -Dprefix=/foo/bar and in the old
Policy $prefix eq $siteprefix and $prefix eq $vendorprefix, all of
them will now be changed to the new prefix, /foo/bar. (Previously
only $prefix changed.) If you do not like this new behaviour,
specify prefix, siteprefix, and vendorprefix explicitly.
A new optional location for Perl libraries, otherlibdirs, is available.
It can be used for example for vendor add-ons without disturbing Perl's
own library directories.
In many platforms, the vendor-supplied 'cc' is too stripped-down to
build Perl (basically, 'cc' doesn't do ANSI C). If this seems
to be the case and 'cc' does not seem to be the GNU C compiler
'gcc', an automatic attempt is made to find and use 'gcc' instead.
gcc needs to closely track the operating system release to avoid
build problems. If Configure finds that gcc was built for a different
operating system release than is running, it now gives a clearly visible
warning that there may be trouble ahead.
Since Perl 5.8 is not binary-compatible with previous releases
of Perl, Configure no longer suggests including the 5.005
modules in @INC.
Configure -S can now run non-interactively. [561]
Configure support for pdp11-style memory models has been removed due
to obsolescence. [561]
configure.gnu now works with options with whitespace in them.
installperl now outputs everything to STDERR.
Because PerlIO is now the default on most platforms, ``-perlio'' doesn't
get appended to the $Config{archname} anymore.
Instead, if you explicitly choose not to use perlio (Configure command
line option -Uuseperlio), you will get ``-stdio'' appended.
Another change related to the architecture name is that ``-64all''
(-Duse64bitall, or ``maximally 64-bit'') is appended only if your
pointers are 64 bits wide. (To be exact, the use64bitall is ignored.)
In AFS installations, one can configure the root of the AFS to be
somewhere else than the default /afs by using the Configure
parameter -Dafsroot=/some/where/else.
APPLLIB_EXP, a lesser-known configuration-time definition, has been
documented. It can be used to prepend site-specific directories
to Perl's default search path (@INC); see INSTALL for information.
The version of Berkeley DB used when Perl (and, presumably, the
DB_File extension) was built is now available as
@Config{qw(db_version_major db_version_minor db_version_patch)}
from Perl and as DB_VERSION_MAJOR_CFG, DB_VERSION_MINOR_CFG, and
DB_VERSION_PATCH_CFG from C.
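From Perl, the three values can be read with a hash slice on %Config. In this sketch, the '?' placeholder for missing values is an assumption for display purposes, since the entries may be empty when Perl was built without Berkeley DB.

```perl
use strict;
use warnings;
use Config;

my @keys = qw(db_version_major db_version_minor db_version_patch);

# Substitute '?' for any component that is undefined or empty.
my @parts = map { defined $_ && length $_ ? $_ : '?' } @Config{@keys};

my $db_version = join '.', @parts;
print "Berkeley DB version at build time: $db_version\n";
```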
Building Berkeley DB3 for compatibility modes for DB, NDBM, and ODBM
has been documented in INSTALL.
If you have CPAN access (either network or a local copy such as a
CD-ROM) you can specify extra modules for Configure to build and
install with Perl using the -Dextras=... option. See INSTALL for
more details.
In addition to config.over, a new override file, config.arch, is
available. This file is supposed to be used by hints file writers
for architecture-wide changes (as opposed to config.over which is
for site-wide changes).
If your file system supports symbolic links, you can build Perl outside
of the source directory by
mkdir perl/build/directory
cd perl/build/directory
sh /path/to/perl/source/Configure -Dmksymlinks ...
This will create in perl/build/directory a tree of symbolic links
pointing to files in /path/to/perl/source. The original files are left
unaffected. After Configure has finished, you can just say
make all test
and Perl will be built and tested, all in perl/build/directory.
[561]
For Perl developers, several new make targets for profiling
and debugging have been added; see the perlhack manpage.
Use of the gprof tool to profile Perl has been documented in
the perlhack manpage. There is a make target called ``perl.gprof'' for
generating a gprofiled Perl executable.
If you have GCC 3, there is a make target called ``perl.gcov'' for
creating a gcoved Perl executable for coverage analysis. See
the perlhack manpage.
If you are on IRIX or Tru64 platforms, new profiling/debugging options
have been added; see the perlhack manpage for more information about pixie and
Third Degree.
The Thread extension is now not built at all under ithreads
(Configure -Duseithreads) because it wouldn't work anyway (the
Thread extension requires being Configured with -Duse5005threads).
Note that the 5.005 threads are unsupported and deprecated: if you
have code written for the old threads you should migrate it to the
new ithreads model.
The Gconvert macro ($Config{d_Gconvert}) used by perl for stringifying
floating-point numbers is now more picky about using sprintf %.*g
rules for the conversion. Some platforms that used to use gcvt may
now resort to the slower sprintf.
The obsolete method of making a special (e.g., debugging) flavor
of perl by saying
make LIBPERL=libperld.a
has been removed. Use -DDEBUGGING instead.
New Or Improved Platforms
For the list of platforms known to support Perl, see Supported Platforms in the perlport manpage.
AIX dynamic loading should be now better supported.
AIX should now work better with gcc, threads, and 64-bitness. Also the
long doubles support in AIX should be better now. See the perlaix manpage.
AtheOS ( http://www.atheos.cx/ ) is a new platform.
BeOS has been reclaimed.
The DG/UX platform now supports 5.005-style threads.
See the perldgux manpage.
The DYNIX/ptx platform (also known as dynixptx) is supported at or
near osvers 4.5.2.
EBCDIC platforms (z/OS (also known as OS/390), POSIX-BC, and VM/ESA)
have been regained. Many test suite tests still fail and the
co-existence of Unicode and EBCDIC isn't quite settled, but the
situation is much better than with Perl 5.6. See the perlos390 manpage,
the perlbs2000 manpage (for POSIX-BC), and the perlvmesa manpage for more information.
Building perl with -Duseithreads or -Duse5005threads now works under
HP-UX 10.20 (previously it only worked under 10.30 or later). You will
need a thread library package installed. See README.hpux. [561]
Mac OS Classic is now supported in the mainstream source package
(MacPerl has of course been available since perl 5.004 but now the
source code bases of standard Perl and MacPerl have been synchronised)
[561]
Mac OS X (or Darwin) should now be able to build Perl even on HFS+
filesystems. (The case-insensitivity used to confuse the Perl build
process.)
NCR MP-RAS is now supported. [561]
All the NetBSD specific patches (except for the installation
specific ones) have been merged back to the main distribution.
NetWare from Novell is now supported. See the perlnetware manpage.
NonStop-UX is now supported. [561]
NEC SUPER-UX is now supported.
All the OpenBSD specific patches (except for the installation
specific ones) have been merged back to the main distribution.
Perl has been tested with the GNU pth userlevel thread package
( http://www.gnu.org/software/pth/pth.html ). All thread tests
of Perl now work, but not without adding some yield()s to the tests,
so while pth (and other userlevel thread implementations) can be
considered to be ``working'' with Perl ithreads, keep in mind the
possible non-preemptability of the underlying thread implementation.
Stratus VOS is now supported using Perl's native build method
(Configure). This is the recommended method to build Perl on
VOS. The older methods, which build miniperl, are still
available. See the perlvos manpage. [561+]
The Amdahl UTS UNIX mainframe platform is now supported. [561]
WinCE is now supported. See the perlce manpage.
z/OS (formerly known as OS/390, formerly known as MVS OE) now has
support for dynamic loading. This is not selected by default,
however; you must specify -Dusedl in the arguments of Configure. [561]
Selected Bug Fixes
Numerous memory leaks and uninitialized memory accesses have been hunted down. Most importantly, anonymous subs used to leak quite a bit. [561]
The autouse pragma didn't work for Multi::Part::Function::Names.
caller() could cause core dumps in certain situations. Carp was
sometimes affected by this problem. In particular, caller() now
returns a subroutine name of (unknown) for subroutines that have
been removed from the symbol table.
chop(@list) in list context returned the characters chopped in
reverse order. This has been reversed to be in the right order. [561]
Configure no longer includes the DBM libraries (dbm, gdbm, db, ndbm)
when building the Perl binary. The only exception to this is SunOS 4.x,
which needs them. [561]
The behaviour of non-decimal but numeric string constants such as
``0x23'' was platform-dependent: in some platforms that was seen as 35,
in some as 0, in some as a floating point number (don't ask). This
was caused by Perl's using the operating system libraries in a situation
where the result of the string to number conversion is undefined: now
Perl consistently handles such strings as zero in numeric contexts.
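A minimal sketch of the behaviour described above, assuming a perl at or after this release. The variable names are only illustrative; hex() and oct() are the standard built-ins for converting such strings explicitly.

```perl
use strict;
use warnings;

my $str = "0x23";

# In numeric context only the leading decimal digits count, so the
# string numifies to 0. The "numeric" warning is silenced here on purpose.
my $implicit = do { no warnings 'numeric'; $str + 0 };

# hex() and oct() both understand the "0x" prefix.
my $via_hex = hex($str);
my $via_oct = oct($str);

print "implicit: $implicit\n";
print "hex():    $via_hex\n";
print "oct():    $via_oct\n";
```

When the input may be hexadecimal, octal, or binary with a prefix, oct() is the more general of the two conversions.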
Several debugger fixes: exit code now reflects the script exit code,
condition "0" now treated correctly, the d command now checks
line number, $. no longer gets corrupted, and all debugger output
now goes correctly to the socket if RemotePort is set. [561]
The debugger (perl5db.pl) has been modified to present a more
consistent commands interface, via (CommandSet=580). perl5db.t was
also added to test the changes, and as a placeholder for further tests.
See the perldebug manpage.
The debugger has a new dumpDepth option to control the maximum
depth to which nested structures are dumped. The x command has
been extended so that x N EXPR dumps out the value of EXPR to a
depth of at most N levels.
The debugger can now show lexical variables if you have the CPAN
module PadWalker installed.
The order of DESTROYs has been made more predictable.
Perl 5.6.0 could emit spurious warnings about redefinition of
dl_error() when statically building extensions into perl.
This has been corrected. [561]
dprofpp -R didn't work.
*foo{FORMAT} now works.
Infinity is now recognized as a number.
UNIVERSAL::isa no longer caches methods incorrectly. (This broke
the Tk extension with 5.6.0.) [561]
Lexicals I: lexicals outside an eval "" weren't resolved
correctly inside a subroutine definition inside the eval "" if they
were not already referenced in the top level of the eval ""ed code.
Lexicals II: lexicals leaked at file scope into subroutines that
were declared before the lexicals.
Lexical warnings now propagate correctly between scopes
and into eval "...".
use warnings qw(FATAL all) did not work as intended. This has been
corrected. [561]
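As a rough illustration of what FATAL warnings are meant to do, the sketch below turns an "uninitialized" warning into an exception and catches it with eval. The variable names are hypothetical.

```perl
use strict;

my $ok = eval {
    use warnings qw(FATAL all);   # lexically scoped to this block
    my $undef;
    my $sum = $undef + 1;         # the warning is raised as an exception here
    1;
};
my $error = $@;

print $ok ? "survived\n" : "died: $error";
```

Because the pragma is lexically scoped, code outside the eval block keeps its ordinary warning behaviour.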
warnings::enabled() now reports the state of $^W correctly if the caller
isn't using lexical warnings. [561]
Line renumbering with eval and #line now works. [561]
Fixed numerous memory leaks, especially in eval "".
Localised tied variables no longer leak memory:
    use Tie::Hash;
    tie my %tied_hash => 'Tie::StdHash';
    ...
    # Used to leak memory every time local() was called;
    # in a loop, this added up.
    local($tied_hash{Foo}) = 1;
Localised hash elements (and %ENV) are correctly unlocalised to not
exist, if they didn't before they were localised.
    use Tie::Hash;
    tie my %tied_hash => 'Tie::StdHash';
    ...
    # Nothing has set the FOO element so far
    { local $tied_hash{FOO} = 'Bar' }
    # This used to print, but not now.
    print "exists!\n" if exists $tied_hash{FOO};
As a side effect of this fix, tied hash interfaces must define
the EXISTS and DELETE methods.
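For a hand-written tie class, the requirement might look like the sketch below. My::TinyHash is a hypothetical class invented for this illustration, and it implements only the methods the example exercises; a complete class would also provide CLEAR, FIRSTKEY and NEXTKEY.

```perl
use strict;
use warnings;

# Hypothetical class name, used only for this illustration.
package My::TinyHash;

sub TIEHASH { my ($class) = @_; return bless { data => {} }, $class }
sub FETCH   { my ($self, $key) = @_; return $self->{data}{$key} }
sub STORE   { my ($self, $key, $value) = @_; $self->{data}{$key} = $value }

# local() needs these two to restore a "did not exist" state.
sub EXISTS  { my ($self, $key) = @_; return exists $self->{data}{$key} }
sub DELETE  { my ($self, $key) = @_; return delete $self->{data}{$key} }

package main;

tie my %h, 'My::TinyHash';
{
    local $h{FOO} = 'Bar';
    print "inside:  ", (exists $h{FOO} ? "exists" : "missing"), "\n";
}
print "outside: ", (exists $h{FOO} ? "exists" : "missing"), "\n";
```

Classes that inherit from Tie::StdHash, as in the example above, already get both methods.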
mkdir() now ignores trailing slashes in the directory name,
as mandated by POSIX.
Some versions of glibc have a broken modfl(). This affects builds
with -Duselongdouble. This version of Perl detects this brokenness
and has a workaround for it. The glibc release 2.2.2 is known to have
fixed the modfl() bug.
Modulus of unsigned numbers now works (4063328477 % 65535 used to
return 27406, instead of 27407). [561]
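The arithmetic can be checked directly. The sketch below computes the remainder and then rebuilds the original number from quotient and remainder; since 62002 * 65535 + 27407 = 4063328477, a correct perl yields 27407. The variable names are illustrative.

```perl
use strict;
use warnings;

my $n = 4063328477;   # above 2**31 - 1, so it does not fit a signed 32-bit integer
my $m = 65535;

my $remainder = $n % $m;
my $quotient  = ($n - $remainder) / $m;

print "$n % $m = $remainder\n";
print "check: $quotient * $m + $remainder = ", $quotient * $m + $remainder, "\n";
```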
Some ``not a number'' warnings introduced in 5.6.0 have been eliminated to be
more compatible with 5.005. Infinity is now recognised as a number. [561]
Numeric conversions did not recognize changes in the string value
properly in certain circumstances. [561]
Attributes (such as :shared) didn't work with our().
our() variables will not cause bogus ``Variable will not stay shared''
warnings. [561]
``our'' variables of the same name declared in two sibling blocks
resulted in bogus warnings about ``redeclaration'' of the variables.
The problem has been corrected. [561]
pack "Z" now correctly terminates the string with "\0".
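A short sketch of the "Z" template as documented in perlfunc, showing where the terminating NUL ends up for a starred, a padded, and a truncating count. The variable names are illustrative.

```perl
use strict;
use warnings;

my $star  = pack("Z*", "abc");   # one NUL is appended
my $fixed = pack("Z5", "abc");   # padded with NULs up to 5 bytes
my $tight = pack("Z3", "abc");   # truncated so the last byte is a NUL

# Show each result as hex so the NUL bytes are visible.
for my $packed ($star, $fixed, $tight) {
    printf "%d bytes: %s\n", length($packed), unpack("H*", $packed);
}
```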
Fix password routines which in some shadow password platforms
(e.g. HP-UX) caused getpwent() to return every other entry.
The PERL5OPT environment variable (for passing command line arguments
to Perl) didn't work for more than a single group of options. [561]
PERL5OPT with embedded spaces didn't work.
printf() no longer resets the numeric locale to ``C''.
qw(a\\b) now parses correctly as 'a\\b': that is, as three
characters, not four. [561]
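The parse can be confirmed by looking at the length and the character codes of the resulting word. This is only a sketch with an illustrative variable name.

```perl
use strict;
use warnings;

# Inside qw(), a doubled backslash stands for one literal backslash,
# just as in a single-quoted string.
my ($word) = qw(a\\b);

print "length: ", length($word), "\n";
print "codes:  ", join(",", map { ord } split //, $word), "\n";
```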
pos() did not return the correct value within s///ge in earlier
versions. This is now handled correctly. [561]
Printing quads (64-bit integers) with printf/sprintf now works
without the q, L, or ll prefixes (assuming you are on a quad-capable platform).
Regular expressions on references and overloaded scalars now work. [561+]
Right-hand side magic (GMAGIC) could in many cases such as string
concatenation be invoked too many times.
scalar() now forces scalar context even when used in void context.
SOCKS support is now much more robust.
sort() arguments are now compiled in the right wantarray context
(they were accidentally using the context of the sort() itself).
The comparison block is now run in scalar context, and the arguments
to be sorted are always provided list context. [561]
Changed the POSIX character class [[:space:]] to include the (very
rarely used) vertical tab character. Added a new POSIX-ish character
class [[:blank:]] which stands for horizontal whitespace
(currently, the space and the tab).
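A small sketch that probes both classes with single characters. The variable names are illustrative, and each match is anchored so that exactly one character is tested.

```perl
use strict;
use warnings;

# Vertical tab is part of the POSIX space class.
my $vt_is_space   = "\x0B" =~ /\A[[:space:]]\z/ ? 1 : 0;

# The blank class covers horizontal whitespace only.
my $tab_is_blank  = "\t" =~ /\A[[:blank:]]\z/ ? 1 : 0;
my $sp_is_blank   = " "  =~ /\A[[:blank:]]\z/ ? 1 : 0;
my $nl_is_blank   = "\n" =~ /\A[[:blank:]]\z/ ? 1 : 0;

print "VT in space:  $vt_is_space\n";
print "TAB in blank: $tab_is_blank\n";
print "SP in blank:  $sp_is_blank\n";
print "NL in blank:  $nl_is_blank\n";
```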
The tainting behaviour of sprintf() has been rationalized. It does
not taint the result of floating point formats anymore, making the
behaviour consistent with that of string interpolation. [561]
Some cases of inconsistent taint propagation (such as within hash
values) have been fixed.
The RE engine found in Perl 5.6.0 accidentally pessimised certain kinds
of simple pattern matches. These are now handled better. [561]
Regular expression debug output (whether through use re 'debug'
or via -Dr) now looks better. [561]
Multi-line matches like "a\nxb\n" =~ /(?!\A)x/m were flawed. The
bug has been fixed. [561]
Use of $& could trigger a core dump under some situations. This
is now avoided. [561]
The regular expression captured submatches ($1, $2, ...) are now
more consistently unset if the match fails, instead of leaving false
data lying around in them. [561]
readline() on files opened in ``slurp'' mode could return an extra
"" (blank line) at the end in certain situations. This has been
corrected. [561]
Autovivification of symbolic references of special variables described
in the perlvar manpage (as in ${$num}) was accidentally disabled. This works
again now. [561]
Sys::Syslog ignored the LOG_AUTH constant.
$AUTOLOAD, sort(), lock(), and spawning subprocesses
in multiple threads simultaneously are now thread-safe.
Tie::Array's SPLICE method was broken.
Allow a read-only string on the left-hand side of a non-modifying tr///.
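A tr/// with an empty replacement list and no /d modifier only counts matching characters, so it has no need to write to its operand. The sketch below applies it to a string literal, which is read-only; the variable names are illustrative.

```perl
use strict;
use warnings;

# Count occurrences without modifying the (read-only) literal.
my $o_count      = ("hello world" =~ tr/o//);
my $letter_count = ("hello world" =~ tr/a-z//);

print "o count:      $o_count\n";
print "letter count: $letter_count\n";
```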
If STDERR is tied, warnings caused by warn and die now
correctly pass to it.
Several Unicode fixes.
BOMs (byte order marks) at the beginning of Perl files
(scripts, modules) should now be transparently skipped.
UTF-16 and UCS-2 encoded Perl files should now be read correctly.
The character tables have been updated to Unicode 3.2.0.
Comparing with utf8 data does not magically upgrade non-utf8 data
into utf8. (This was a problem for example if you were mixing data
from I/O and Unicode data: your output might have got magically encoded
as UTF-8.)
Generating illegal Unicode code points such as U+FFFE, or the UTF-16
surrogates, now also generates an optional warning.
IsAlnum, IsAlpha, and IsWord now match titlecase.
Concatenation with the . operator or via variable interpolation,
eq, substr, reverse, quotemeta, the x operator,
substitution with s///, single-quoted UTF-8, should now work.
The tr/// operator now works. Note that the tr///CU
functionality has been removed (but see pack('U0', ...)).
eval "v200" now works.
Perl 5.6.0 parsed m/\x{ab}/ incorrectly, leading to spurious warnings.
This has been corrected. [561]
Zero entries were missing from the Unicode classes such as IsDigit.
Platform Specific Changes and Fixes
BSDI 4.*
Perl now works on post-4.0 BSD/OSes.
All BSDs
Setting $0 now works (as much as possible; see the perlvar manpage for details).
