NAME

perl5100delta - what is new for perl 5.10.0

DESCRIPTION

This document describes the differences between the 5.8.8 release and the 5.10.0 release.

Many of the bug fixes in 5.10.0 were already seen in the 5.8.X maintenance releases; they are not duplicated here and are documented in the set of man pages named perl58[1-8]?delta.

Core Enhancements

The "feature" pragma

The "feature" pragma is used to enable new syntax that would break Perl's backwards-compatibility with older releases of the language. It's a lexical pragma, like "strict" or "warnings".

Currently the following new features are available: "switch" (adds a switch statement), "say" (adds a "say" built-in function), and "state" (adds a "state" keyword for declaring "static" variables). Those features are described in their own sections of this document.

The "feature" pragma is also implicitly loaded when you require a minimal perl version (with the "use VERSION" construct) greater than, or equal to, 5.9.5. See feature for details.

New -E command-line switch

-E is equivalent to -e, but it implicitly enables all optional features (like "use feature ":5.10"").

Defined-or operator

A new operator "//" (defined-or) has been implemented. The following expression:

    $a // $b

is merely equivalent to

    defined $a ? $a : $b

and the statement

    $c //= $d;

can now be used instead of

    $c = $d unless defined $c;

The "//" operator has the same precedence and associativity as "||". Special care has been taken to ensure that this operator Does What You Mean while not breaking old code, but some edge cases involving the empty regular expression may now parse differently. See perlop for details.

Switch and Smart Match operator

Perl 5 now has a switch statement. It's available when "use feature 'switch'" is in effect.
This feature introduces three new keywords, "given", "when", and "default":

    given ($foo) {
        when (/^abc/) { $abc = 1; }
        when (/^def/) { $def = 1; }
        when (/^xyz/) { $xyz = 1; }
        default { $nothing = 1; }
    }

A more complete description of how Perl matches the switch variable against the "when" conditions is given in "Switch statements" in perlsyn.

This kind of match is called smart match, and it's also possible to use it outside of switch statements, via the new "~~" operator. See "Smart matching in detail" in perlsyn.

This feature was contributed by Robin Houston.

Regular expressions
"say()"say() is a new built-in, only available when "use feature 'say'" is in effect, that is similar to print(), but that implicitly appends a newline to the printed string. See "say" in perlfunc. (Robin Houston)Lexical $_The default variable $_ can now be lexicalized, by declaring it like any other lexical variable, with a simplemy $_; The operations that default on $_ will use the lexically-scoped version of $_ when it exists, instead of the global $_. In a "map" or a "grep" block, if $_ was previously my'ed, then the $_ inside the block is lexical as well (and scoped to the block). In a scope where $_ has been lexicalized, you can still have access to the global version of $_ by using $::_, or, more simply, by overriding the lexical declaration with "our $_". (Rafael Garcia-Suarez) The "_" prototypeA new prototype character has been added. "_" is equivalent to "$" but defaults to $_ if the corresponding argument isn't supplied (both "$" and "_" denote a scalar). Due to the optional nature of the argument, you can only use it at the end of a prototype, or before a semicolon.This has a small incompatible consequence: the prototype() function has been adjusted to return "_" for some built-ins in appropriate cases (for example, "prototype('CORE::rmdir')"). (Rafael Garcia-Suarez) UNITCHECK blocks"UNITCHECK", a new special code block has been introduced, in addition to "BEGIN", "CHECK", "INIT" and "END"."CHECK" and "INIT" blocks, while useful for some specialized purposes, are always executed at the transition between the compilation and the execution of the main program, and thus are useless whenever code is loaded at runtime. On the other hand, "UNITCHECK" blocks are executed just after the unit which defined them has been compiled. See perlmod for more information. (Alex Gough) New Pragma, "mro"A new pragma, "mro" (for Method Resolution Order) has been added. 
It lets you switch, on a per-class basis, the algorithm that perl uses to find inherited methods in case of a multiple inheritance hierarchy. The default MRO hasn't changed (DFS, for Depth First Search). Another MRO is available: the C3 algorithm. See mro for more information. (Brandon Black)

Note that, due to changes in the implementation of class hierarchy search, code that used to undef the *ISA glob will most probably break. Anyway, undef'ing *ISA had the side-effect of removing the magic on the @ISA array and should not have been done in the first place. Also, the cache *::ISA::CACHE:: no longer exists; to force reset the @ISA cache, you now need to use the "mro" API, or more simply to assign to @ISA (e.g. with "@ISA = @ISA").

readdir() may return a "short filename" on Windows

The readdir() function may return a "short filename" when the long filename contains characters outside the ANSI codepage. Similarly Cwd::cwd() may return a short directory name, and glob() may return short names as well. On the NTFS file system these short names can always be represented in the ANSI codepage. This will not be true for all other file system drivers; e.g. the FAT filesystem stores short filenames in the OEM codepage, so some files on FAT volumes remain inaccessible through the ANSI APIs.

Similarly, $^X, @INC, and $ENV{PATH} are preprocessed at startup to make sure all paths are valid in the ANSI codepage (if possible).

The Win32::GetLongPathName() function now returns the UTF-8 encoded correct long file name instead of using replacement characters to force the name into the ANSI codepage. The new Win32::GetANSIPathName() function can be used to turn a long pathname into a short one only if the long one cannot be represented in the ANSI codepage.

Many other functions in the "Win32" module have been improved to accept UTF-8 encoded arguments. Please see Win32 for details.

readpipe() is now overridable

The built-in function readpipe() is now overridable.
Overriding it also makes it possible to override its operator counterpart, "qx//" (a.k.a. "``"). Moreover, it now defaults to $_ if no argument is provided. (Rafael Garcia-Suarez)

Default argument for readline()

readline() now defaults to *ARGV if no argument is provided. (Rafael Garcia-Suarez)

state() variables

A new class of variables has been introduced. State variables are similar to "my" variables, but are declared with the "state" keyword in place of "my". They're visible only in their lexical scope, but their value is persistent: unlike "my" variables, they're not undefined at scope entry, but retain their previous value. (Rafael Garcia-Suarez, Nicholas Clark)

To use state variables, one needs to enable them by using

    use feature 'state';

or by using the "-E" command-line switch in one-liners. See "Persistent Private Variables" in perlsub.

Stacked filetest operators

As a new form of syntactic sugar, it's now possible to stack up filetest operators. You can now write "-f -w -x $file" in a row to mean "-x $file && -w _ && -f _". See "-X" in perlfunc.

UNIVERSAL::DOES()

The "UNIVERSAL" class has a new method, "DOES()". It has been added to solve semantic problems with the "isa()" method. "isa()" checks for inheritance, while "DOES()" has been designed to be overridden when module authors use other types of relations between classes (in addition to inheritance). (chromatic)

See "$obj->DOES( ROLE )" in UNIVERSAL.

Formats

Formats were improved in several ways. A new field, "^*", can be used for variable-width, one-line-at-a-time text. Null characters are now handled correctly in picture lines. Using "@#" and "~~" together will now produce a compile-time error, as those format fields are incompatible.
perlform has been improved, and miscellaneous bugs fixed.

Byte-order modifiers for pack() and unpack()

There are two new byte-order modifiers, ">" (big-endian) and "<" (little-endian), that can be appended to most pack() and unpack() template characters and groups to force a certain byte-order for that type or group. See "pack" in perlfunc and perlpacktut for details.

"no VERSION"

You can now use "no" followed by a version number to specify that you want to use a version of perl older than the specified one.

"chdir", "chmod" and "chown" on filehandles

"chdir", "chmod" and "chown" can now work on filehandles as well as filenames, if the system supports "fchdir", "fchmod" and "fchown" respectively, thanks to a patch provided by Gisle Aas.

OS groups

$( and $) now return groups in the order in which the OS returns them, thanks to Gisle Aas. This wasn't previously the case.

Recursive sort subs

You can now use recursive subroutines with sort(), thanks to Robin Houston.

Exceptions in constant folding

The constant folding routine is now wrapped in an exception handler, and if folding throws an exception (such as attempting to evaluate 0/0), perl now retains the current optree, rather than aborting the whole program. Without this change, programs would not compile if they had expressions that happened to generate exceptions, even though those expressions were in code that could never be reached at runtime. (Nicholas Clark, Dave Mitchell)

Source filters in @INC

It's possible to enhance the mechanism of subroutine hooks in @INC by adding a source filter on top of the filehandle opened and returned by the hook. This feature was planned a long time ago, but wasn't quite working until now. See "require" in perlfunc for details. (Nicholas Clark)

New internal variables

Miscellaneous

"unpack()" now defaults to unpacking the $_ variable.

"mkdir()" without arguments now defaults to $_.

The internal dump output has been improved, so that non-printable characters such as newline and backspace are output in "\x" notation, rather than octal.

The -C option can no longer be used on the "#!" line. It wasn't working there anyway, since the standard streams are already set up at this point in the execution of the perl interpreter. You can use binmode() instead to get the desired behaviour.

UCD 5.0.0

The copy of the Unicode Character Database included in Perl 5 has been updated to version 5.0.0.

MAD

MAD, which stands for Miscellaneous Attribute Decoration, is a still-in-development work leading to a Perl 5 to Perl 6 converter. To enable it, it's necessary to pass the argument "-Dmad" to Configure. The obtained perl isn't binary compatible with a regular perl 5.10, and has space and speed penalties; moreover not all regression tests still pass with it. (Larry Wall, Nicholas Clark)

kill() on Windows

On Windows platforms, "kill(-9, $pid)" now kills a process tree. (On Unix, this delivers the signal to all processes in the same process group.)

Incompatible Changes

Packing and UTF-8 strings

The semantics of pack() and unpack() regarding UTF-8-encoded data has been changed. Processing is now by default character per character instead of byte per byte on the underlying encoding. Notably, code that used things like "pack("a*", $string)" to see through the encoding of string will now simply get back the original $string. Packed strings can also get upgraded during processing when you store upgraded characters. You can get the old behaviour by using "use bytes".

To be consistent with pack(), the "C0" in unpack() templates indicates that the data is to be processed in character mode, i.e. character by character; on the contrary, "U0" in unpack() indicates UTF-8 mode, where the packed string is processed in its UTF-8-encoded Unicode form on a byte by byte basis.
This is reversed with regard to perl 5.8.X, but now consistent between pack() and unpack().

Moreover, "C0" and "U0" can also be used in pack() templates to specify respectively character and byte modes.

"C0" and "U0" in the middle of a pack or unpack format now switch to the specified encoding mode, honoring parens grouping. Previously, parens were ignored.

Also, there is a new pack() character format, "W", which is intended to replace the old "C". "C" is kept for unsigned chars coded as bytes in the string's internal representation. "W" represents unsigned (logical) character values, which can be greater than 255. It is therefore more robust when dealing with potentially UTF-8-encoded data (as "C" will wrap values outside the range 0..255, and not respect the string encoding).

In practice, that means that pack formats are now encoding-neutral, except "C".

For consistency, "A" in unpack() format now trims all Unicode whitespace from the end of the string. Before perl 5.9.2, it used to strip only the classical ASCII space characters.

Byte/character count feature in unpack()

A new unpack() template character, ".", returns the number of bytes or characters (depending on the selected encoding mode, see above) read so far.

The $* and $# variables have been removed

$*, which was deprecated in favor of the "/s" and "/m" regexp modifiers, has been removed.

The deprecated $# variable (output format for numbers) has been removed.

Two new severe warnings, "$#/$* is no longer supported", have been added.

substr() lvalues are no longer fixed-length

The lvalues returned by the three argument form of substr() used to be a "fixed length window" on the original string. In some cases this could cause surprising action at distance or other undefined behaviour. Now the length of the window adjusts itself to the length of the string assigned to it.

Parsing of "-f _"

The identifier "_" is now forced to be a bareword after a filetest operator.
This solves a number of misparsing issues when a global "_" subroutine is defined.

":unique"

The ":unique" attribute has been made a no-op, since its current implementation was fundamentally flawed and not threadsafe.

Effect of pragmas in eval

The compile-time value of the "%^H" hint variable can now propagate into eval("")uated code. This makes it more useful to implement lexical pragmas.

As a side-effect of this, the overloaded-ness of constants now propagates into eval("").

chdir FOO

A bareword argument to chdir() is now recognized as a file handle. Earlier releases interpreted the bareword as a directory name. (Gisle Aas)

Handling of .pmc files

An old feature of perl was that before "require" or "use" look for a file with a .pm extension, they will first look for a similar filename with a .pmc extension. If this file is found, it will be loaded in place of any potentially existing file ending in a .pm extension.

Previously, .pmc files were loaded only if more recent than the matching .pm file. Starting with 5.9.4, they will always be loaded if they exist.

$^V is now a "version" object instead of a v-string

$^V can still be used with the %vd format in printf, but any character-level operations will now access the string representation of the "version" object and not the ordinals of a v-string. Expressions like "substr($^V, 0, 2)" or "split //, $^V" no longer work and must be rewritten.

@- and @+ in patterns

The special arrays "@-" and "@+" are no longer interpolated in regular expressions. (Sadahiro Tomoyuki)

$AUTOLOAD can now be tainted

If you call a subroutine by a tainted name, and if it defers to an AUTOLOAD function, then $AUTOLOAD will be (correctly) tainted. (Rick Delaney)

Tainting and printf

When perl is run under taint mode, "printf()" and "sprintf()" will now reject any tainted format argument. (Rafael Garcia-Suarez)

undef and signal handlers

Undefining or deleting a signal handler via "undef $SIG{FOO}" is now equivalent to setting it to 'DEFAULT'.
(Rafael Garcia-Suarez)

strictures and dereferencing in defined()

"use strict 'refs'" was ignoring taking a hard reference in an argument to defined(), as in:

    use strict 'refs';
    my $x = 'foo';
    if (defined $$x) {...}

This now correctly produces the run-time error "Can't use string as a SCALAR ref while "strict refs" in use".

"defined @$foo" and "defined %$bar" are now also subject to "strict 'refs'" (that is, $foo and $bar shall be proper references there.) ("defined(@foo)" and "defined(%bar)" are discouraged constructs anyway.) (Nicholas Clark)

"(?p{})" has been removed

The regular expression construct "(?p{})", which was deprecated in perl 5.8, has been removed. Use "(??{})" instead. (Rafael Garcia-Suarez)

Pseudo-hashes have been removed

Support for pseudo-hashes has been removed from Perl 5.9. (The "fields" pragma remains here, but uses an alternate implementation.)

Removal of the bytecode compiler and of perlcc

"perlcc", the byteloader and the supporting modules (B::C, B::CC, B::Bytecode, etc.) are no longer distributed with the perl sources. Those experimental tools have never worked reliably, and, due to the lack of volunteers to keep them in line with the perl interpreter developments, it was decided to remove them instead of shipping a broken version of those. The last version of those modules can be found with perl 5.9.4.

However the B compiler framework stays supported in the perl core, as do the more useful modules it has permitted (among others, B::Deparse and B::Concise).

Removal of the JPL

The JPL (Java-Perl Lingo) has been removed from the perl sources tarball.

Recursive inheritance detected earlier

Perl will now immediately throw an exception if you modify any package's @ISA in such a way that it would cause recursive inheritance.

Previously, the exception would not occur until Perl attempted to make use of the recursive inheritance while resolving a method or doing a "$foo->isa($bar)" lookup.
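
The earlier detection described above can be observed with a short script. This is only a minimal sketch, assuming perl 5.10.0 or later; Demo::Parent and Demo::Child are hypothetical package names invented for the illustration, and the exact wording of the exception is not relied upon.

```perl
use strict;
use warnings;

# Demo::Parent and Demo::Child are hypothetical packages, used only here.
@Demo::Child::ISA = ('Demo::Parent');    # fine: Child inherits from Parent

# Closing the loop. On 5.10.0 and later the assignment itself is expected
# to throw, so the eval block never reaches its final "1".
my $ok    = eval { @Demo::Parent::ISA = ('Demo::Child'); 1 };
my $error = $ok ? '' : $@;

print $ok ? "no exception raised\n" : "caught at assignment: $error";
```

On a 5.8.x perl the same assignment would be expected to succeed silently, with the failure surfacing only at a later method call or isa() lookup.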

warnings::enabled and warnings::warnif changed to favor users of modules

The behaviour in 5.10.x favors the person using the module; the behaviour in 5.8.x favors the module writer.

Assume the following code:

    main calls Foo::Bar::baz()
    Foo::Bar inherits from Foo::Base
    Foo::Bar::baz() calls Foo::Base::_bazbaz()
    Foo::Base::_bazbaz() calls: warnings::warnif('substr', 'some warning message');

On 5.8.x, the code warns when Foo::Bar contains "use warnings;". It does not matter whether Foo::Base or main have warnings enabled; to disable the warning one has to modify Foo::Bar.

On 5.10.0 and newer, the code warns when main contains "use warnings;". It does not matter whether Foo::Base or Foo::Bar have warnings enabled; to disable the warning one has to modify main.

Modules and Pragmata

Upgrading individual core modules

Even more core modules are now also available separately through the CPAN. If you wish to update one of these modules, you don't need to wait for a new perl release. From within the cpan shell, running the 'r' command will report on modules with upgrades available. See "perldoc CPAN" for more information.

Pragmata Changes
New modules
Selected Changes to Core Modules
Utility Changes

New Documentation

The perlpragma manpage documents how to write one's own lexical pragmas in pure Perl (something that is possible starting with 5.9.4).

The new perlglossary manpage is a glossary of terms used in the Perl documentation, technical and otherwise, kindly provided by O'Reilly Media, Inc.

The perlreguts manpage, courtesy of Yves Orton, describes internals of the Perl regular expression engine.

The perlreapi manpage describes the interface to the perl interpreter used to write pluggable regular expression engines (by Ævar Arnfjörð Bjarmason).

The perlunitut manpage is a tutorial for programming with Unicode and string encodings in Perl, courtesy of Juerd Waalboer.

A new manual page, perlunifaq (the Perl Unicode FAQ), has been added (Juerd Waalboer).

The perlcommunity manpage gives a description of the Perl community on the Internet and in real life. (Edgar "Trizor" Bering)

The CORE manual page documents the "CORE::" namespace. (Tels)

The long-existing feature of "/(?{...})/" regexps setting $_ and pos() is now documented.

Performance Enhancements

In-place sorting

Sorting arrays in place ("@a = sort @a") is now optimized to avoid making a temporary copy of the array.

Likewise, "reverse sort ..." is now optimized to sort in reverse, avoiding the generation of a temporary intermediate list.

Lexical array access

Access to elements of lexical arrays via a numeric constant between 0 and 255 is now faster. (This used to be only the case for global arrays.)

XS-assisted SWASHGET

Some pure-perl code that perl was using to retrieve Unicode properties and transliteration mappings has been reimplemented in XS.

Constant subroutines

The interpreter internals now support a far more memory efficient form of inlineable constants. Storing a reference to a constant value in a symbol table is equivalent to a full typeglob referencing a constant subroutine, but using about 400 bytes less memory.
This proxy constant subroutine is automatically upgraded to a real typeglob with subroutine if necessary. The approach taken is analogous to the existing space optimisation for subroutine stub declarations, which are stored as plain scalars in place of the full typeglob.

Several of the core modules have been converted to use this feature for their system dependent constants - as a result "use POSIX;" now takes about 200K less memory.

"PERL_DONT_CREATE_GVSV"

The new compilation flag "PERL_DONT_CREATE_GVSV", introduced as an option in perl 5.8.8, is turned on by default in perl 5.9.3. It prevents perl from creating an empty scalar with every new typeglob. See perl589delta for details.

Weak references are cheaper

Weak reference creation is now O(1) rather than O(n), courtesy of Nicholas Clark. Weak reference deletion remains O(n), but if deletion only happens at program exit, it may be skipped completely.

sort() enhancements

Salvador Fandiño provided improvements to reduce the memory usage of "sort" and to speed up some cases.

Memory optimisations

Several internal data structures (typeglobs, GVs, CVs, formats) have been restructured to use less memory. (Nicholas Clark)

UTF-8 cache optimisation

The UTF-8 caching code is now more efficient, and used more often. (Nicholas Clark)

Sloppy stat on Windows

On Windows, perl's stat() function normally opens the file to determine the link count and update attributes that may have been changed through hard links. Setting ${^WIN32_SLOPPY_STAT} to a true value speeds up stat() by not performing this operation. (Jan Dubois)

Regular expressions optimisations

Installation and Configuration Improvements

Configuration improvements
Compilation improvements
Installation improvements

New Or Improved Platforms

Perl has been reported to work on Symbian OS. See perlsymbian for more information.

Many improvements have been made towards making Perl work correctly on z/OS.

Perl has been reported to work on DragonFlyBSD and MidnightBSD.

Perl has also been reported to work on NexentaOS ( http://www.gnusolaris.org/ ).

The VMS port has been improved. See perlvms.

Support for Cray XT4 Catamount/Qk has been added. See hints/catamount.sh in the source code distribution for more information.

Vendor patches have been merged for RedHat and Gentoo.

DynaLoader::dl_unload_file() now works on Windows.

Selected Bug Fixes
New or Changed Diagnostics

Changed Internals

In general, the source code of perl has been refactored, tidied up, and optimized in many places. Also, memory management and allocation have been improved in several places.

When compiling the perl core with gcc, as many gcc warning flags are turned on as is possible on the platform. (This quest for cleanliness doesn't extend to XS code because we cannot guarantee the tidiness of code we didn't write.) Similar strictness flags have been added or tightened for various other C compilers.

Reordering of SVt_* constants

The relative ordering of constants that define the various types of "SV" has changed; in particular, "SVt_PVGV" has been moved before "SVt_PVLV", "SVt_PVAV", "SVt_PVHV" and "SVt_PVCV". This is unlikely to make any difference unless you have code that explicitly makes assumptions about that ordering. (The inheritance hierarchy of "B::*" objects has been changed to reflect this.)

Elimination of SVt_PVBM

Related to this, the internal type "SVt_PVBM" has been removed. This dedicated type of "SV" was used by the "index" operator and parts of the regexp engine to facilitate fast Boyer-Moore matches. Its use internally has been replaced by "SV"s of type "SVt_PVGV".

New type SVt_BIND

A new type "SVt_BIND" has been added, in readiness for the project to implement Perl 6 on 5. There deliberately is no implementation yet, and they cannot yet be created or destroyed.

Removal of CPP symbols

The C preprocessor symbols "PERL_PM_APIVERSION" and "PERL_XS_APIVERSION", which were supposed to give the version number of the oldest perl binary-compatible (resp. source-compatible) with the present one, were not used, and sometimes had misleading values. They have been removed.

Less space is used by ops

The "BASEOP" structure now uses less space. The "op_seq" field has been removed and replaced by a single bit bit-field "op_opt". "op_type" is now 9 bits long.
(Consequently, the "B::OP" class doesn't provide an "seq" method anymore.)

New parser

perl's parser is now generated by bison (it used to be generated by byacc.) As a result, it seems to be a bit more robust.

Also, Dave Mitchell improved the lexer debugging output under "-DT".

Use of "const"

Andy Lester supplied many improvements to determine which function parameters and local variables could actually be declared "const" to the C compiler. Steve Peters provided new *_set macros and reworked the core to use these rather than assigning to macros in LVALUE context.

Mathoms

A new file, mathoms.c, has been added. It contains functions that are no longer used in the perl core, but that remain available for binary or source compatibility reasons. However, those functions will not be compiled in if you add "-DNO_MATHOMS" in the compiler flags.

"AvFLAGS" has been removed

The "AvFLAGS" macro has been removed.

"av_*" changes

The "av_*()" functions, used to manipulate arrays, no longer accept null "AV*" parameters.

$^H and %^H

The implementation of the special variables $^H and %^H has changed, to allow implementing lexical pragmas in pure Perl.

B:: modules inheritance changed

The inheritance hierarchy of "B::" modules has changed; "B::NV" now inherits from "B::SV" (it used to inherit from "B::IV").

Anonymous hash and array constructors

The anonymous hash and array constructors now take 1 op in the optree instead of 3, now that pp_anonhash and pp_anonlist return a reference to a hash/array when the op is flagged with OPf_SPECIAL. (Nicholas Clark)

Known Problems

There's still a remaining problem in the implementation of the lexical $_: it doesn't work inside "/(?{...})/" blocks. (See the TODO test in t/op/mydef.t.)

Stacked filetest operators won't work when the "filetest" pragma is in effect, because they rely on the stat() buffer "_" being populated, and filetest bypasses stat().
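
For reference, the dependence on the "_" stat buffer mentioned above can be seen by writing a stacked test next to its expanded form. This is a minimal sketch, assuming perl 5.10.0 or later without the "filetest" pragma in effect; the scratch file and the variable names are assumptions made for the illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# A scratch file created only for this illustration.
my ($fh, $file) = tempfile(UNLINK => 1);
close $fh;

# Stacked form: the tests apply right to left, starting with -r on $file.
my $stacked = (-f -w -r $file) ? 1 : 0;

# Expanded form: one stat() for -r, then -w and -f reuse the "_" buffer.
my $expanded = (-r $file && -w _ && -f _) ? 1 : 0;

print "stacked=$stacked expanded=$expanded\n";
```

Both forms consult the result of a single stat() call, which is why a pragma that bypasses stat() leaves the later tests in the stack with nothing to read.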

UTF-8 problems

The handling of Unicode still is unclean in several places, where it's dependent on whether a string is internally flagged as UTF-8. This will be made more consistent in perl 5.12, but that won't be possible without a certain amount of backwards incompatibility.

Platform Specific Problems

When compiled with g++ and thread support on Linux, it's reported that $! stops working correctly. This is related to the fact that glibc provides two strerror_r(3) implementations, and perl selects the wrong one.

Reporting Bugs

If you find what you think is a bug, you might check the articles recently posted to the comp.lang.perl.misc newsgroup and the perl bug database at http://rt.perl.org/rt3/ . There may also be information at http://www.perl.org/ , the Perl Home Page.

If you believe you have an unreported bug, please run the perlbug program included with your release. Be sure to trim your bug down to a tiny but sufficient test case. Your bug report, along with the output of "perl -V", will be sent off to perlbug@perl.org to be analysed by the Perl porting team.

SEE ALSO

The Changes file and the perl590delta to perl595delta man pages for exhaustive details on what changed.

The INSTALL file for how to build Perl.

The README file for general stuff.

The Artistic and Copying files for copyright information.