This Week on p5p 2000/06/25
- Notes
- Method Lookup Speedup
- tr///CU and tr///UC Removed
- is_utf8_string
- Byte-Order Marks Return
- pack("U")
- Lexical variables and eval()
- FILEGV
- perlhacktut
- perlutil.pod
- Missing Methods
- Suppress prototype mismatch warnings
- Autoloaded Constants not Inlined
- lib.pm
- use English
- Numeric opens in IPC::Open3
- Regex Bug
- Foo isa Foo
- README.hpux
- my __PACKAGE__ $obj …
- asdgasdfasd
- Various
Notes
You can subscribe to an email version of this summary by sending an empty message to p5p-digest-subscribe@plover.com.
Please send corrections and additions to mjd-perl-thisweek-YYYYMM@plover.com, where YYYYMM is the current year and month.
This week’s report is a little late because I went to San Diego Usenix, and then I went to YAPC in Pittsburgh (probably the only person on the continent stupid enough to try to do both) and then I went back to Philadelphia and was driven to Washington DC for a party and came back on the train.
I was going to say it was a quiet week on the list. But it wasn’t. It was merely a low-traffic week. It wasn’t quiet at all; all sorts of useful and interesting stuff was posted, and there was an unusually high signal-to-noise ratio.
This week has been named ‘Doug MacEachern and Simon Cozens’ week. Thank you Doug and Simon, and also everyone else who contributed to the unusually high signal-to-noise ratio this week.
Method Lookup Speedup
More discussion of Doug’s patch of last week.
Last week, some people pointed out that it would fail in the presence of code that modifies @ISA at runtime; Sarathy suggested a pragma that would promise that this would not happen. Nick suggested that use base could do that.
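To make the @ISA concern concrete, here is a minimal sketch of code that changes its inheritance tree at run time. The class names Loud, Quiet, and Speaker are hypothetical and have nothing to do with Doug's patch; the point is only that any cached method lookup has to be invalidated when @ISA changes, or the second call would go to the wrong method.

```perl
use strict;
use warnings;

# Hypothetical classes, for illustration only.
package Loud;
sub speak { return "LOUD" }

package Quiet;
sub speak { return "quiet" }

package Speaker;
our @ISA = ('Loud');

package main;

my $first = Speaker->speak;     # resolved through Loud
@Speaker::ISA = ('Quiet');      # change the inheritance tree at run time
my $second = Speaker->speak;    # must now be resolved through Quiet

print "$first then $second\n";
```

A pragma of the kind Sarathy suggested would amount to a promise that the assignment to @Speaker::ISA never happens.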
Doug submitted an updated patch.
For your delectation, Simon Cozens wrote up an extensive explanation of the patch and how it works, including many details about the Perl internals. If you are interested in the Perl internals (and you should be) then this is strongly recommended reading.
I would like very much to run other articles of the same type in the future. This should be construed as a request for people to contribute them. They don’t have to be as complete or detailed as Simon’s.
Thank you very much, Simon.
tr///CU and tr///UC Removed
Simon, who has been working on the line discipline feature, got rid of the nasty tr///CU feature, which Larry had already decided was a bad idea and should be eliminated.
is_utf8_string
Simon also added a function named is_utf8_string that checks a string to make sure it is valid UTF8. The plan is that if Perl is reading a putatively UTF8 file, it can check the input before setting the UTF8 flag on the resulting scalar.
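is_utf8_string is a C function inside the interpreter, but the same check-before-flagging idea can be sketched at the Perl level. The helper name looks_like_utf8 below is hypothetical; it relies on the built-in utf8::decode, which is documented to return false when its argument is not well-formed UTF-8.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $bytes is well-formed UTF-8.
sub looks_like_utf8 {
    my ($bytes) = @_;
    my $copy = $bytes;    # utf8::decode works in place, so decode a copy
    return utf8::decode($copy) ? 1 : 0;
}

my $good = "caf\xC3\xA9";    # "cafe" with e-acute, encoded as UTF-8
my $bad  = "caf\xE9";        # a lone Latin-1 byte, not valid UTF-8

my $good_ok = looks_like_utf8($good);
my $bad_ok  = looks_like_utf8($bad);
print "good: $good_ok, bad: $bad_ok\n";
```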
Byte-Order Marks Return
Simon submitted an improved patch for this. This one just has the lexer use tell() to see if the putative byte-order mark is at the very beginning of the file.
pack("U")
A few weeks ago there was discussion of what this should do.
Simon submitted a patch that implemented an idea of Larry’s: a U at the beginning of the pack template indicates that the result of pack will be a UTF8 string; anything else indicates a byte string. This means (for example) that you can put U0 at the beginning of any pattern to force it to produce UTF8; if you want to start with U but have the result be bytes, add a do-nothing C0 at the beginning instead.
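As a small illustration of the leading-U versus leading-C0 distinction, the sketch below packs the same code point both ways. It shows the behaviour documented for pack in current perls, which is an assumption as far as this week's patch goes; the details of the patch as submitted may have differed.

```perl
use strict;
use warnings;

# Template starts with U: the result is a one-character string.
my $chars = pack("U", 0x263A);

# Leading C0: the same code point comes out as its UTF-8 encoded bytes.
my $bytes = pack("C0U", 0x263A);

printf "%d character, %d bytes\n", length($chars), length($bytes);
```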
Lexical variables and eval()
Yitzchak Scott-Thoennes reported on a number of puzzles related to the interaction of these features, including:
{ my $x; sub incx { eval '++$x' } }
Here incx apparently increments the lexical variable; he expected it to increment the global variable. (Rationale: The lexical variable should be optimized away.)
Rick Delaney referred to a relevant article by Ilya in clp.misc.
Yitzchak says that code in a subroutine should not be able to alter lexical variables in a more outer scope, unless it is a closure, which incx here is not. Rick presents the following counterexample:
my $Pseudo_global = 2;
sub double {
my ($x) = @_;
eval '$x * $Pseudo_global';
}
Discussion seemed inconclusive. No patches were offered.
I said that I had done some research a while back about what Scheme and Common Lisp do in this sort of case, and that I would report back with a summary, but I have not done so.
FILEGV
There was some discussion about the FILEGV macro. When Perl compiles the op tree, the line and file information is stored in a GV. Or rather, it used to be so; now, if you compile with ithreads, it just uses strings. There were some macros, *FILEGV, to access this GV, but according to Sarathy, they were mostly used to get at the filename, and there is a more straightforward macro family, *FILE, which gets the filename directly. Doug MacEachern wanted to use the original macro in B::Graph, although I was not sure why; Sarathy said that probably B::Graph needed to be fixed.
perlhacktut
Simon contributed the first half of a document titled perlhacktut, a tutorial on hacking the Perl core. It talks about how to get started and what to read, provides an overview of Perl’s large subsystems, and has the beginning of a discussion of Perl’s basic data types and op trees.
If you are interested in the Perl internals (and you should be) then this is strongly recommended reading. (Gosh, that sounds familiar.)
perlutil.pod
Simon also contributed a document describing the utility programs that come packaged with Perl, such as perldoc, pod2html, roffitall, and a2p.
Quite a busy week for Simon.
Missing Methods
Martyn Pearce pointed out that if you have code like this:
Foo->new('...');
it might fail for two reasons: because the Foo class does not define that method, or because you forgot to put use Foo in your program. In both cases the message is
Can't locate object method "new" via package "Foo" ...
Martyn suggested that in the second case, it could add a remark like
(perhaps you forgot to load module "Foo"?)
However, he did not provide a patch.
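The two failure modes can be reproduced side by side with a short sketch. The package Bar is hypothetical: it stands in for a class that is loaded but lacks the method, while Foo is never loaded at all. Only the common prefix of the message is relied on here; newer perls may append a hint of the kind Martyn proposed.

```perl
use strict;
use warnings;

# Hypothetical class: loaded, but it defines no new() method.
package Bar;
sub other { return 1 }

package main;

my %error;
for my $class ('Foo', 'Bar') {    # Foo is never loaded
    eval { $class->new('...'); 1 } or $error{$class} = $@;
}

print $error{$_} for sort keys %error;
```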
I also wonder why it says ‘object method’ when it is clearly a class method. I did not provide a patch either. This would be an excellent first patch for someone who wanted to get started patching. Write to me if you are interested in looking into it but do not know where to begin.
Suppress prototype mismatch warnings
Doug MacEachern discovered lots and lots of subroutine declarations in Socket.pm that were there only to predeclare a bunch of autoloaded constants like AF_INET. The only purpose for the declarations was to prevent ‘prototype mismatch’ warnings from occurring when the constants were actually autoloaded at run time. He then put in a patch to suppress the warning, if it appears that the subroutine will be autoloaded later, and removed the 20K of constant sub declarations in Socket.pm.
Autoloaded Constants not Inlined
Doug also discovered that these autoloaded constants’ values are not inlined, because the code that uses them is compiled before the subroutine is loaded. Doug produced a patch to Exporter.pm that lets you specify a name with a leading + sign in the use line to indicate that the subroutine should be invoked once (and hence autoloaded) immediately, when the module is loaded, so that its value can be inlined into the following code.
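The reason inlining cannot happen is easier to see in a sketch of an autoloaded constant. Everything below is hypothetical (the package My::Const, the constant FAKE_AF_INET, and its value); it is not the Socket.pm code. The calling code in main is compiled while the subroutine does not yet exist, so there is nothing for the compiler to inline.

```perl
use strict;
use warnings;

package My::Const;    # hypothetical module, for illustration only

our $AUTOLOAD;
my %VALUE = (FAKE_AF_INET => 2);    # made-up constant table

sub AUTOLOAD {
    my $name = $AUTOLOAD;
    $name =~ s/.*:://;              # strip the package qualifier
    return if $name eq 'DESTROY';
    die "No such constant $name" unless exists $VALUE{$name};
    my $value = $VALUE{$name};
    no strict 'refs';
    # Install a real subroutine so later calls skip AUTOLOAD.
    *{"My::Const::$name"} = sub { return $value };
    return $value;
}

package main;

my $before = defined &My::Const::FAKE_AF_INET ? 1 : 0;    # not defined yet
my $got    = My::Const::FAKE_AF_INET();                   # triggers AUTOLOAD
my $after  = defined &My::Const::FAKE_AF_INET ? 1 : 0;    # now installed

print "before=$before value=$got after=$after\n";
```

Calling the constant once at load time, as Doug's leading + is meant to arrange, moves the installation step ahead of the compilation of the code that uses it.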
lib.pm
Doug MacEachern decided that it was a shame that lib.pm has to pull in all of Config.pm, so he recast lib.pm as a script, lib.pm.PL, which generates the real lib.pm at install time, inserting the appropriate values of $CONFIG variables inline.
(Many other utilities, such as perlcc and pod2html, are generated this way at present. Do ls */*.PL in the source directory to see a list.)
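The generate-at-install-time technique can be sketched in a few lines. This is not Doug's lib.pm.PL; the package name My::GeneratedLib and the helper generate_module_source are hypothetical. The generator loads Config once and writes the values as literals, so the generated module never needs to load Config itself.

```perl
use strict;
use warnings;
use Config;

# Hypothetical generator: returns module source with values baked in.
sub generate_module_source {
    my ($archname, $version) = @_;
    return <<"END_SOURCE";
package My::GeneratedLib;
our \$archname = '$archname';
our \$version  = '$version';
1;
END_SOURCE
}

my $source = generate_module_source($Config{archname}, $Config{version});
print $source;    # a real .PL script would write this to a file instead
```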
use English
Barrie Slaymaker contributed a patch so that you can now say
use English '-no_match_english';
and it will import all the usual long names for the punctuation variables, except for $`, $&, and $', which slow down your regexes. If you don’t supply this flag, then those variables are separately aliased via an eval statement.
This has been a long time coming—I thought it had been done already.
There was a long sidetrack having to do with some unimportant style issue, which should have been carried out in private email, or not at all.
Numeric opens in IPC::Open3
Frank Tobin submitted a patch that allows the user of IPC::Open3 to request that any of the ‘files’ to be opened be an already open file descriptor, analogous to the way open FH, "<&=3" works with regular open.
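For readers who have not met the "<&=" form of the regular open, here is a minimal sketch of it. It shows only the built-in open, not the new IPC::Open3 interface, and it uses a temporary file rather than a hard-coded descriptor number such as 3, since which descriptors are open depends on the environment.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a small temporary file to read back.
my ($out, $path) = tempfile(UNLINK => 1);
print {$out} "hello\n";
close $out or die "close: $!";

open(my $orig, '<', $path) or die "open: $!";
my $fd = fileno($orig);

# "<&=" makes a new Perl filehandle on the SAME descriptor (like fdopen),
# rather than duplicating it as "<&" would.
open(my $alias, '<&=', $fd) or die "fdopen: $!";

my $same_fd = fileno($alias) == $fd ? 1 : 0;
my $line    = <$alias>;
print "same descriptor: $same_fd, read: $line";
```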
Regex Bug
Ian Flanigan found a very upsetting bug in the regex engine.
Foo isa Foo
Johan Vromans complained that
my $r = "Foo";
UNIVERSAL::isa($r, "Foo::");
returns true. Johan does not like that $r (which is a string) is reported to be a member of class Foo. It was pointed out that the manual explicitly says that UNIVERSAL::isa may be called as a class method, to determine whether one class is a subclass of another, in which case it could be invoked as
Foo->isa('Foo')
which is essentially the same as Johan’s example, and which returns true because the class Foo is (trivially) a subclass of itself.
Johan said ‘Yuck.’
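A short sketch of the behaviour under discussion, using hypothetical classes Foo and Foo::Child. It uses the plain class name "Foo" as the second argument, which is an assumption that differs slightly from Johan's literal example.

```perl
use strict;
use warnings;

# Hypothetical classes, for illustration only.
package Foo;
sub new { return bless {}, shift }

package Foo::Child;
our @ISA = ('Foo');

package main;

my $r = "Foo";    # a plain string, not an object

my $string_isa = UNIVERSAL::isa($r, "Foo") ? 1 : 0;    # string treated as a class name
my $class_isa  = Foo->isa('Foo')           ? 1 : 0;    # a class is a subclass of itself
my $child_isa  = Foo::Child->isa('Foo')    ? 1 : 0;    # a genuine subclass
my $parent_isa = Foo->isa('Foo::Child')    ? 1 : 0;    # but not the other way round

print "$string_isa $class_isa $child_isa $parent_isa\n";
```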
README.hpux
Jeff Okamoto updated it again.
my __PACKAGE__ $obj …
Doug MacEachern submitted a patch to enable this. The patch came in just barely before the end-of-the-week cutoff, and there has already been a lot of discussion of it in the past two days, so I am going to defer talking about it any more until my next report.
Should you want to look at it before then, here it is.
asdgasdfasd
Some anonymous person running as root submitted a bug report (with perlbug) that only said ‘asdgasdfasd’. Martyn Pearce replied that it was not a bug, but a feature.
Various
A large collection of bug reports, bug fixes, non-bug reports, questions, answers, and a very small amount of spam. No serious flamage however.
This is the end of the month, so I will summarize: I filed 97 messages in the junk folder, 311 in the misc folder, and 329 messages in 45 various other folders pertaining to particular topics.
Until next week I remain, your humble and obedient servant,