This Week on p5p 1999/10/17
- Introduction
- New Development Release 5.005_62
- Unicode Character Classes
- Module Bundling and the proposed
import
pragma use fields
allows overlapping member namesPREPARE
functions andmy Class $foo
declarationsgoto
Out of Conditional Bug- Regex Range Bug
use lib
change- More discussion about documentation for
pack
- MicroPerl
- Various
Introduction
This is intended to be a short summary of the major and important threads of discussion on the perl5-porters mailing list this week.
I am trying to add value to the summary by omitting threads that I perceive as uninteresting. For example, you often see threads of this type:
Person A
Here is a problem with Perl when you try to do X. I propose solution S.
Person B
Solution S is unnecessary because you can do Y instead.
Person A
No, that does not work, because of Z.
Person B
Oh, I totally misunderstood what you were doing.
Person C
But solution S will break existing code.
Person A
No, it won’t.
Person C
Oh, you’re right. I thought you meant something else.
Person A
Here is a patch to implement solution S.
I am going to try to omit these blind alleys. I will also try to omit some other types of messages:
Person D
Why are submitting a patch for S when we know it will break old code and anyway you could be doing Y instead?
Person E
Fortran has adopted solution S. Perl is not Fortran; therefore solution S is no good. If you want Fortran, you know where to find it.
Person F
You shut up.
Person G
No, you shut up.
Person H
Your formatting style offends me profoundly. Can’t you get a real mailer?
Person I
How can I develop CGI scripts offline on my W98 machine?
Person J
P5P is not a help desk.
Person K
I am writing to report a bug in the localtime
function. It returns the wrong month.
Person L
WORK FROM HOME!!!!
A lot of the stuff I omitted is on topic and has real value, but isn’t particularly interesting. For example I have omitted a bunch of cases where someone submitted a minor patch that was accepted with no discsusion. I omitted some discussions that did not seem to be of general interest. For example, this week, Brad Appleton and Ilya Zakharevich had an exchange about the pod-formatting features of cperl-mode.
It is hard to keep track of everything, and I may occasionally omit something you think is important, or I might misunderstand some important issue. Your additions and emendations are welcome. Please send any corrections, suggestions, additions, or embellishments to mjd-perl-thisweek-YYYYMM@plover.com
where YYYYMM
is the current year and month. For a more complete view of perl5-porters, either subscribe to the mailing list or check the archive.
I wanted to include hot links to the relevant messages in the archive, but the archive was down and I could not get the URLs. This will be corrected in a future issue, even if we have to start our own archive.
New Development Release 5.005_62
Sarathy announced the release of development version 5.005_62. It is available from CPAN.
Changes since 5.005_61 include:
- more 64-bit enhancements: 64-bit
vec()
, lfs fixes (Jarkko Hietaniemi) our
declarations (Larry Wall)sub foo : attrs
,use attrs
deprecated (Spider Boardman)- Configure: new flag
-Accflags=stuff
(Jarkko Hietaniemi) - lvalue subroutines breaks
\(Foo->bar())
(Ilya Zakharevich and Tuomas Lukka) - added
File::Glob
(wasFile::BSDGlob
from Greg Bacon) - building perl with
-DPERL_INTERNAL_GLOB
will do implicituse File::Glob 'globally'
use strict
generates true errors, not fatal warnings- new
-DPERL_Y2KWARN
option for"19$yr"
andsprintf "19%d", $yr
(Ulrich Pfeifer) exists()
works better on pseudo-hashes (Michael Schwern)- warnings on invalid escapes in char ranges (Jarkko Hietaniemi)
- comments in
pack
templates allowed (Ilya Zakharevich) - subroutine context could be popped too soon, leading to Tk coredumps (Russel O’Connor)
- 64bit-safe macros for
int->ptr
andptr->int
typecasts (Robin Barker) - fix for
POPSTACK
panics - fixes for memory leaks when is
@_
modified %@
“leaks”, goneEND
blocks not run under-c
switchdie
/warn
go toSTDERR
, notPerlIO_stderr()
\C{foo}
has been renamed to\N{foo}
pack
templateZ
always packs a null bytePOSIX::strftime()
bugs (Spider Boardman and Jan Dubois)xsubpp
understands function pointers, undocumented (Ilya Zakharevich)h2xs
more selective, and works with opaque types (Ilya Zakharevich)use lib
doesn’t keep dupsMakeMaker
support for uninstalled perl, undocumented (Ilya Zakharevich)Benchmark
enhancements (Barrie Slaymaker)- warning on
join(/foo/,...)
(Mark-Jason Dominus) perlcc
supports other backends (Tom Hughes)- more descriptive diagnostics about opcodes (Michael Schwern)
- Unicode database updated to 3.0-ish (unicode.org)
- EPOC updates (Olaf Flebbe)
- DOS updates (Laszlo Molnar)
- VMS updates (Charles Bailey and others)
- Win32 updates: Windows 95 support, faster
opendir()
,use Shell
support (Benjamin Stuhl, Jochen Wiedmann, Jenda Krynicky and others) - cygwin updates (Eric Fifer and others)
- JPL updates (Brian Jepson)
- Compiler updates (Vishal Bhatia)
DB_File
update to 1.71 (Paul Marquess)PodParser
update to 1.085 (Brad Appleton)pod2text
,pod2man
replaced bypodlators-0.08
(Russ Allbery)Getopt::Long
2.20 (Johan Vromans)perlcompile.pod
,perlhack.pod
(Nathan Torkington)- lots of pod adjustments
- lots of other bug fixes
Sarathy also said that he was moving into beta mode for the upcoming version 5.6.
Unicode Character Classes
Jarkko realized that since Perl was moving into beta mode for 5.6, this was his last chance to propose a new feature. He wants Perl regexes to support the `equivalence class’ feature of POSIX 1003.2. What this means is that certain characters in the character set may be deemed `equivalent’, and the notation [=c=]
denotes a character class containing all the characters equivalent to c
.
How do you decide which characters are considered `equivalent’? Unicode provides a definition that allows you to understand a character like é
; as an e
with an acute accent; we could have Perl understand this to be equivalent to a plain e
. Then the notation [=e=]
would match any of e
, é
, è
, ë
, or ê
. This might be useful.
Jarkko also wants people to be able to define their own equivalence relations, and he wants the m//
and s///
operators to support a new option, analogous to /i
, which would say to ignore all diacritical marks, again using the Unicode tables to decide what a diacritical mark is.
People brought up a number of potential problems; for example, in Danish, the character å
(U00E5) is considered to be an entirely different letter from a
and not equivalent to it at all. But according to Unicode, å
is indeed an a
with a diacritical mark. (Jarkko: ``Ha! I am just expecting some Danes … to jump up here and wave frantically their hands… “ )
Discussion continues; I will deliver an update next week if there is anything to report. By the way, the Unicode code charts are great browsing. The databases that say which characters are composed from which others are interesting too.
Module Bundling and the proposed import pragma
Tim Bunce forwarded a message that Michael King had sent to the modules@perl.com
mailing list. Michael had written a new module, which he named import
. The idea of import
is this: Suppose you have a bunch of modules that are for internal use at your company only. You are worried about namespace collisions with CPAN modules. You can name your internal-use modules with names that all begin with com::yourcompany
and then use
use import 'com::yourcompany';
This does two things. First, it locates all modules in the com::yourcompany
space on the local machine and imports them all. But second, it imports each com::yourcompany::Foo
module with the com::yourcompany
part stripped off. This means that if, for example, you have a com::yourcompany::Template
module, you can now call Template->new()
instead of com::yourcompany::Template->new()
.
Michael says that this is like the import
keyword in Java.
This touched off a number of interesting discussions:
Andreas König suggested that Perl should support a
package "www.foo.org";
directive which would be equivalent to
package org::foo::www;
. But Chip Salzenberg said that this was usually understood to have been a bad idea, because organizations often change URLs for various reasons. For example,foo.com
andbar.com
might merge to becomefoobar.com
. Says Chip: One of the best things about CPAN is that it is the de facto root namespace for all shared Perl modules. Let’s not throw that away!This led to a general discussion about how entire module namespaces might be reserved.
Damian Conway pointed out that we could simply establish the convention that if you have a PAUSE id, that module space is reserved for you. For example, Damian’s PAUSE id is DCONWAY; under this convention, all modules beginning with
DCONWAY::*
would belong to him.Problems with this scheme: Uri Guttman would now own the
URI::*
space, including theURI::URL
module. Nick Ing-Simmons’s PAUSE id isNI-S
, which is not a valid package identifier.John Redford suggested a module that lets you bundle many modules into one. He says:
Then you could write a bundling module, like this:
package MyCompany::CGIBundle; use MyCompany::CGI::BobsCode::Foo; use MyCompany::CGI::Test::Bar; use MyCompany::CGI::BobsBetterCode::Foo2; .... use NameSpace::Transitive;
And then people could just write:
use MyCompany::CGIBundle;
to get all the symbols that were exported into
MyCompany::CGIBundle
re-exported into their own namespace.I had written a module something like this back in February, so I decided that put it on CPAN. It is now available as ModuleBundle. Nick Ing-Simmons also pointed out that his
Tk::widgets
module does something similar:use TK::widgets qw(Text Entry Canvas)
is equivalent to:use Tk::Text (); use Tk::Entry (); use Tk::Canvas();
A couple of people wanted the two functions of this module to be separated. They liked the idea of being able to alias namespaces, so that the objects in
com::yourcompany::Template
could be referred to as if they were inTemplate
, but they were worried about the other function, which is to locate and import a whole lot of stuff indiscriminately. There was some discussion of a namespace aliasing pragma, or of adding this functionality to the existingAlias
module.
There were also a number of uninteresting discussions: Someone wanted to know what would happen if you said use import 'CGI'
. Michael’s answer to that was that that was not what import
was for and that whoever did that would get the bizarre behavior that they deserved. That did not stop a lot of people from making a big fuss about it, however. One person even said ``If you want that functionality, why don’t you write a module to do that?” apparently having forgotten that the way the discussion started was that Michael had written a module to do that, notified the modules list, and then Tim Bunce forwarded his note to P5P.
It appears that Michael is now pursuing a name in the Import
namespace, and may make some changes to the module’s calling interface to better seprate the two functions of his module.
use fields allows overlapping member names
Tuomas Lukka discovered a gotcha in field.pm
. The gotcha is this: Suppose you have a base and a derived class, and both contain a field with the same name, say f
. Suppose the method m
is defined in the base class and inherited by the derived class. Now create an object of the derived class, and call m
on the object. Suppose m
contains code to modify field f
. There are two fields named f
. Which one will be modified?
You would expect that, because m
is defined in the base class, it should modify the base class’s f
. And so it does, if m
is written correctly:
package Base;
sub m {
my Base $self = shift;
$self->{f} = 'newvalue'; # Modifies base class field f
}
But if you accidentally write my $self
instead of my Base $self
, it modifies the wrong member:
package Base;
sub m {
my $self = shift;
$self->{f} = 'newvalue'; # Modifies DERIVED class field f
}
Tuomas points out that the declaration and use of $self
might be very far apart, and that a mistake in a far-away declaration could introduce a bug in the program that was difficult to find. Worse, suppose the object was stored inside some other object, so that instead of $self->{f}
you had $object->{subobject}->{f}
; in this case no declaration at all applies and you get the same problem.
Tuomas’ solution is to simply forbid conflicting field names. fields.pm
and base.pm
will detect this and throw a fatal exception if they detect that a derived class is using a member with the same name as one of its parent classes.
Tuomas’s rationale: It is very dangerous at present, and is `action at a distance’, which means that two apparently unrelated declarations far away, even in entirely different files, might drastially alter the behavior of a subroutine in a third location. The prohibition can be lifted later if a way is found to make it safer, and nobody appears to be using it now. (Someone thought they were, but realized they were mistaken.)
Sarathy appeared to be persuaded that the situation required at least a warning. Tuomas is pushing for a fatal compile-time error.
PREPARE functions and my Class $foo declarations
Ilya had submitted a patch which would have made my Class $foo;
behave as if you had also written Class->PREPARE($foo)
, if there was such a method. If you had my Class $foo = 'bar'
instead, the assignment would occur after the PREPARE
call.
Sarathy did not like the implementation, for reasons I did not completely understand. Sarathy did not like that an AUTOLOAD
call would be made at compile time if PREPARE
was not found where it was supposed to be. Ilya did this by analogy with DESTROY
, but the compile-time call to AUTOLOAD
is peculiar. (Sarathy: ``Yikes!“) Sarathy also objected to the way that the check for PREPARE
was done at compile time, rather than at run time; Ilya said he did it that way for efficiency so that the check would not have to be done every time the declaration was executed.
Sarathy also complained that even though most uses of my Class $foo
would not involve PREPARE
, the compiler would have to make an AUTOLOAD
call for each one of them. Ilya appeared to agree that this was a problem. Sarathy suggested a pragma that declares the PREPARE
method for a class. Ilya pointed out that if the autloading part was removed from his patch, then the definition of the PREPARE
subroutine itself would serve as exactly such a pragma.
An interesting sidetrack developed: Chip said it would be simpler in Topaz if the argument to PREPARE
were \$foo
rather than just $foo
. Ilya said he did not want to do that because constructing a reference costs as much as 21 `simple’ operations. What this means is: Perl takes a certain amount of time to dispatch each operation. For some `simple’ operations, such as performing a scalar assignment, the time to actually perform the operation is dominated by the opcode dispatch time, so they all take about the same amount of time. Constructing a scalar reference, according to Ilya, takes 21 times as long as one of these `simple’ operations, and constructing an array reference takes 50 times as long. Chip was surprised, and so am I.
goto Out of Conditional Bug
Damien Neil reported a bug in 5.005_03 that is triggered when you use goto
out of a conditional block. He also supplied a patch. Watch out for Damien—newcomers who provide core patches are people to pay attention to.
Sarathy said that this was already fixed in the development version. His solution is a little different. Damien’s patch wraps up each branch of the if
as a separate block; Sarathy’s wraps up the entire if
as one block. Sarathy wants to keep his patch because it makes the op tree smaller and so the code is faster at run time. But he notes that Damien’s approach might be better if the peephole optimizer could be instructed to remove the extra instructions for blocks that do not contain goto
.
Regex Range Bug
Faisal Nasim had trouble with a regex that included [\w-]
and [\w-.]
. Different versions of Perl behaved differently for this, depending on whether the -
was seen as indicating a range or not. 5.005_03 interpreted it as if you had written [\w\-.]
; 5.005_62 generated a syntax error. (``Invalid range”.) This was Jarkko’s doing. Opinions varied about what behavior was best, especially since the documentation seemed to support the latter view—but the changed behavior broke old code, as Faisal pointed out.
Larry suggested making it a warning. Jarkko thought this was peachy and put in the patch. While on the subject, he put in a warning for use of \A
etc. in character classes.
use lib change
Tod Irwin wants use lib 'foo'
to append foo
to the front of @INC
(which it does already) and to also remove any other appearances of foo
that happen to be in @INC
already. His motivation: mod_perl
scripts that have use lib
have @INC
lists that get longer and longer and longer.
There were some objections, but the change is in.
Nick Ing-Simmons: Modules which inject things into
@INC
are highly suspect beasts - its like lacing the fruit juice with vodka.
More discussion about documentation for pack
A discussion about the documentation for pack
finished up. I wouldn’t have mentioned this, except:
Sarathy: What would you suggest [the format specifier]
@*
should mean?Ilya: Call the cops.
MicroPerl
This almost slipped by me, but I wrote to ask Ilya what it was about. Here’s what he said:
Build Perl without Configure support. First you build a bastardized version (nanoperl), use it to build a correctly working version with some functionality missing (microperl), then use this to run
Configure.PL
which will make an analogue ofconfig.sh
, then you continue with miniperl and perl as before.The supplied target crazyperl is very close to become nanoperl of the above classification. Minor changes to supplied
micro0/config.sh
should (when bugs are ironed out) produce a microperl.Currently crazyperl passes a lot of tests. This should be improved yet more (apparently the code is there to support the situation when no non-portable services are found, but it has some bugs).
Aren’t you glad I asked? I sure am.
Various
Also a large collection of bug reports, bug fixes, questions, answers, and a small amount of flamage and spam.
Until next week I remain, your humble and obedient servant,
Tags
Feedback
Something wrong with this article? Help us out by opening an issue or pull request on GitHub